Leaderboard

Current rankings of code generation models on SlopCodeBench

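| Agent/Model | Thinking | % Solved | % Checkpoints Solved | $ / Checkpoint | Erosion Mean | Verbosity Mean |
| --- | --- | --- | --- | --- | --- | --- |
| Claude Code/Opus 4.5 | High | 0 | 12.5 | 7.18 | 0.387 | 0.968 |
| Claude Code/Opus 4.5 | Low | 0 | 11.25 | 6.47 | 0.423 | 0.877 |
| Claude Code/Opus 4.5 | None | 0 | 11.25 | 8.07 | 0.411 | 0.889 |
| Codex/GPT 5.1-Codex-Max | Low | 0 | 10 | 1.30 | 0.408 | 0.958 |
| Codex/GPT 5.1-Codex-Max | None | 0 | 10 | 1.89 | 0.399 | 0.921 |
| Codex/GPT 5.1-Codex-Max | High | 0 | 8.75 | 2.74 | 0.383 | 0.855 |
| Codex/GPT 5.2 | High | 0 | 7.5 | 4.11 | 0.470 | 0.829 |
| Codex/GPT 5.2 | Low | 0 | 7.5 | 1.79 | 0.408 | 0.828 |
| Codex/GPT 5.2 | None | 0 | 6.25 | 1.49 | 0.490 | 0.884 |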

* Last updated: December 17, 2025