BridgeBench

The World's #1 Vibe Coding Benchmark

See how leading AI coding models stack up across UI generation, algorithms, debugging, refactoring, reasoning, security, and speed. Each card provides a snapshot of the top performers in that category.

Benchmarks

Refactoring

Apr 16 · 0h ago
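| Rank | Model | Score | Intent |
|---|---|---|---|
| 1 | Claude Opus 4.7 | 75.2 | 85.3% |
| 2 | Qwen 3.6 Plus | 74.8 | 85.7% |
| 3 | Gemini 3.1 Pro | 70.0 | 82.2% |
| 4 | Claude Opus 4.6 | 69.5 | 82.0% |
| 5 | Claude Sonnet 4.6 | 69.4 | 82.4% |
| 6 | Grok 4.20 Reasoning | 68.1 | 81.9% |
| 7 | Grok 4.20 (Non-Reasoning) | 67.6 | 80.7% |
| 8 | GPT-5.4 | 63.4 | 78.0% |
| 9 | GPT-5.4 Mini | 62.3 | 76.5% |
| 10 | GLM 5V Turbo | 61.0 | 78.1% |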

Hallucination

Apr 16 · 0h ago
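| Rank | Model | Score | Fab % |
|---|---|---|---|
| 1 | Gemini 3.1 Pro | 79.1 | 26.7% |
| 2 | Qwen 3.6 Plus | 79.1 | 27.0% |
| 3 | Qwen3.5 Plus 2026-02-15 | 77.3 | 29.0% |
| 4 | Claude Opus 4.7 | 77.1 | 27.5% |
| 5 | Claude Opus 4.5 | 76.9 | 27.9% |
| 6 | Claude Opus 4.6 (April 14) | 76.9 | 29.1% |
| 7 | Claude Sonnet 4.6 | 76.6 | 28.9% |
| 8 | Grok 4.20 (Non-Reasoning) | 76.1 | 29.7% |
| 9 | Grok 4.20 Reasoning | 76.0 | 29.7% |
| 10 | Gemini 3 Pro | 75.9 | 30.0% |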

Reasoning

Apr 16 · 0h ago
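| Rank | Model | Score | Accuracy |
|---|---|---|---|
| 1 | Grok 4.20 Reasoning | 41.8 | 10.0% |
| 2 | GPT-5.4 | 40.6 | 10.0% |
| 3 | Claude Opus 4.7 | 40.3 | 6.7% |
| 4 | Grok 4.20 (Non-Reasoning) | 40.0 | 6.7% |
| 5 | Claude Opus 4.6 | 39.6 | 10.0% |
| 6 | MiniMax M2.5 | 38.1 | 10.0% |
| 7 | Qwen 3.6 Plus | 38.0 | 6.7% |
| 8 | Kimi K2.5 | 37.8 | 6.7% |
| 9 | MiniMax M2.7 | 37.5 | 6.7% |
| 10 | Claude Sonnet 4.6 | 37.2 | 3.3% |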

30 tasks · hard benchmark · grounded reasoning over mixed artifacts
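With only 30 tasks, each task is worth about 3.3 percentage points, so the Accuracy values above sit on a coarse grid. The sketch below converts a listed percentage back to a whole number of tasks. It assumes Accuracy is the share of the 30 tasks counted as fully correct, which the page does not state. The helper name `solved_tasks` is hypothetical.

```python
TOTAL_TASKS = 30  # from the card footer: "30 tasks"


def solved_tasks(accuracy_percent: float, total: int = TOTAL_TASKS) -> int:
    """Nearest whole number of tasks implied by a rounded accuracy percentage.

    Assumption: accuracy = fully correct tasks / total tasks.
    """
    return round(accuracy_percent / 100 * total)


# The three distinct Accuracy values that appear in the Reasoning table.
for pct in (10.0, 6.7, 3.3):
    print(f"{pct}% of {TOTAL_TASKS} tasks is about {solved_tasks(pct)} task(s)")
```

Under that assumption, the top ten models are separated by at most two tasks on Accuracy, so the Score column carries most of the ranking signal here.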

Debugging

Apr 16 · 0h ago
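| Rank | Model | Score | Diagnose |
|---|---|---|---|
| 1 | Claude Opus 4.6 | 87.0 | 25.0% |
| 2 | Claude Sonnet 4.6 | 86.6 | 22.9% |
| 3 | Grok 4.20 (Non-Reasoning) | 86.3 | 19.3% |
| 4 | Claude Opus 4.7 | 86.2 | 21.5% |
| 5 | Gemini 3.1 Pro | 85.9 | 14.0% |
| 6 | MiMo-V2-Pro | 85.8 | 17.0% |
| 7 | GPT-5.4 | 85.6 | 21.5% |
| 8 | o4-mini | 85.6 | 21.0% |
| 9 | Grok 4.20 Reasoning | 85.3 | 11.5% |
| 10 | Qwen 3.6 Plus | 85.1 | 26.5% |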

Speed

Apr 16 · 0h ago
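| Rank | Model | tok/s | TTFT |
|---|---|---|---|
| 1 | Grok 4.20 (Non-Reasoning) | 243.3 | 1999 ms |
| 2 | Grok 4.20 Reasoning | 237.7 | 1497 ms |
| 3 | GPT-5.4 Mini | 236.4 | 233 ms |
| 4 | GPT-5.4 Nano | 227.8 | 941 ms |
| 5 | GLM 5V Turbo | 221.2 | 5444 ms |
| 6 | Qwen 3.6 Plus | 158 | 11520 ms |
| 7 | Gemini 3.1 Pro | 122.2 | 7608 ms |
| 8 | Claude Opus 4.7 | 116.4 | 852 ms |
| 9 | Claude Sonnet 4.6 | 95.3 | 1207 ms |
| 10 | Qwen3.5 Plus 2026-02-15 | 94.6 | 14952 ms |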

DGX Spark Bench

Local
Apr 9 · 7d ago
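| Rank | Model | Size | Pass % | tok/s | TTFT |
|---|---|---|---|---|---|
| 1 | Qwen 3.5 27B | 27B | 76.3% | 11.1 | 361 ms |
| 2 | GPT-OSS 120B | 120B | 74.0% | 41.9 | 498 ms |
| 3 | Mistral Small 4 | 23.6B | 69.0% | 4.7 | 2910 ms |
| 4 | Gemma 4 31B | 31B | 64.0% | 16.5 | 10153 ms |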
Coming Soon

Overall

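| Rank | Model | Score |
|---|---|---|
| 1 | GPT-5.4 | 95.5 |
| 2 | GPT-5.4 Mini | 94.8 |
| 3 | GPT-5.4 Nano | 92.9 |
| 4 | GPT-4.1 | 91.8 |
| 5 | Qwen 3.5 35B-A3B | 91.7 |
| 6 | Claude Sonnet 4.5 | 90.7 |
| 7 | Qwen 3.5 122B-A10B | 90.0 |
| 8 | o3-mini | 89.6 |
| 9 | Qwen 3.5 27B | 89.5 |
| 10 | Gemini 2.5 Pro | 88.9 |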
Coming Soon

Generation

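| Rank | Model | Score |
|---|---|---|
| 1 | GPT-5.4 | 97.0 |
| 2 | GPT-5.4 Mini | 94.4 |
| 3 | Qwen 3.5 35B-A3B | 93.5 |
| 4 | Qwen 3.5 122B-A10B | 92.5 |
| 5 | GPT-4.1 | 92.4 |
| 6 | Qwen 3.5 27B | 92.2 |
| 7 | Qwen 3.5 Flash (02-23) | 90.8 |
| 8 | Claude Sonnet 4.5 | 90.4 |
| 9 | GPT-5.4 Nano | 90.1 |
| 10 | Gemini 2.5 Pro | 89.3 |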