Leaderboard
Current rankings of code generation models on SlopCodeBench
Showing 11 of 25 runs
Showing best version by % Checkpoints, sorted by Iso Solve (descending).
| Model | Harness | Version | Strict Solve | Iso Solve | $/CKPT | Erosion | Verbosity |
|---|---|---|---|---|---|---|---|
| GPT 5.3-Codex (High) | Codex | 0.98.0 | 9.68 | 23.66 | $3.14 | 0.676 | 0.356 |
| Opus 4.6 (High) | Claude Code | 2.1.32 | 17.20 | 21.51 | $3.47 | 0.774 | 0.346 |
| GPT 5.4 (High) | Codex | 0.110.0 | 11.83 | 20.43 | $3.27 | 0.515 | 0.286 |
| GPT 5.2 (High) | Codex | 0.71.0 | 10.75 | 19.35 | $4.55 | 0.711 | 0.358 |
| Sonnet 4.6 (High) | Claude Code | 2.1.44 | 8.54 | 18.29 | $1.92 | 0.703 | 0.313 |
| GPT 5.2-Codex (High) | Codex | 0.80.0 | 9.68 | 18.28 | $2.89 | 0.689 | 0.388 |
| Opus 4.5 (High) | Claude Code | 2.0.51 | 10.87 | 17.39 | $2.64 | 0.710 | 0.287 |
| GPT 5.1-Codex-Max (High) | Codex | 0.65.0 | 10.75 | 17.20 | $2.86 | 0.642 | 0.331 |
| Sonnet 4.5 (High) | Claude Code | 2.0.65 | 5.38 | 16.13 | $1.49 | 0.682 | 0.293 |
| GPT 5.3-Codex-Spark (High) | Codex | 0.100.0 | 5.38 | 12.90 | $0.91 | 0.575 | 0.352 |
| GLM 4.7 (High) | Claude Code | 2.0.76 | 4.30 | 9.68 | $1.61 | 0.664 | 0.305 |
Metric Definitions
% Solved — Percentage of problems solved
CKPT Solved — The checkpoint and all prior checkpoints are solved.
Isolated Solved — Percentage of checkpoints whose own tests pass, considered in isolation.
Core Solved — The checkpoint passes its core tests.
$/CKPT — Average USD cost per checkpoint
Erosion — Fraction of total complexity mass in high-complexity functions (cyclomatic complexity, CC > 10), where mass(f) = CC(f) × √SLOC(f). 0 = no high-complexity functions; 1 = all mass in high-CC functions.
Verbosity — Union of AST-Grep flagged lines and clone lines divided by LOC. Bounded [0, 1].
% AST-Grep — Percentage of lines flagged by AST-Grep rules for wasteful code patterns.
% Cloned — Percentage of lines that are structural duplicates (clone lines / LOC).
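The Erosion and Verbosity definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the benchmark's actual implementation: `FunctionStats`, `erosion`, and `verbosity` are hypothetical names, per-function CC and SLOC values are assumed to be measured already, and flagged and clone lines are assumed to arrive as sets of line numbers within the file.

```python
import math
from typing import Iterable, NamedTuple, Set


class FunctionStats(NamedTuple):
    """Per-function measurements (hypothetical container for this sketch)."""
    cc: int    # cyclomatic complexity of the function
    sloc: int  # source lines of code in the function


def mass(f: FunctionStats) -> float:
    # mass(f) = CC(f) * sqrt(SLOC(f)), as stated in the Erosion definition.
    return f.cc * math.sqrt(f.sloc)


def erosion(functions: Iterable[FunctionStats], cc_threshold: int = 10) -> float:
    """Share of total complexity mass held by functions with CC > cc_threshold."""
    funcs = list(functions)
    total = sum(mass(f) for f in funcs)
    if total == 0:
        # Assumption: with no measurable mass, report no erosion.
        return 0.0
    high = sum(mass(f) for f in funcs if f.cc > cc_threshold)
    return high / total


def verbosity(flagged_lines: Set[int], clone_lines: Set[int], loc: int) -> float:
    """Union of flagged and clone lines divided by LOC."""
    if loc <= 0:
        # Assumption: an empty file has zero verbosity.
        return 0.0
    # The set union counts a line once even if it is both flagged and cloned,
    # which keeps the ratio within [0, 1] when all line numbers fall in 1..loc.
    return len(flagged_lines | clone_lines) / loc


# One high-CC function (mass 12 * 5 = 60) and two low-CC ones (mass 5 * 4 = 20 each).
sample = [FunctionStats(12, 25), FunctionStats(5, 16), FunctionStats(5, 16)]
erosion_score = erosion(sample)                      # 60 / 100
verbosity_score = verbosity({3, 4, 10}, {4, 5}, 20)  # 4 distinct lines / 20
```

Taking the union rather than the sum of the two line sets is what makes the bound hold: a line that is both flagged and cloned is counted only once.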
* Last updated: March 27, 2026