Live AI intelligence
Choose the right AI model faster.
Start with the benchmark-weighted leaderboard, check the frontier line, then scan what changed today.
Ranked LLMs: 42
Weighted benchmarks: 17
Digest stories: 20
Latest full refresh: 11 Apr 2026
Latest news
Open the news desk
Google and Intel deepen AI infrastructure partnership
96GB Vram. What to run in 2026?
Running a non-profit that needs to OCR 64 million pages. Where can I apply for free or subsidized compute to run a local model?
is Agentic Commerce just the next buzzword for let’s automate your bank account?
Qwen3.5-122B at 198 tok/s on 2x RTX PRO 6000 Blackwell — Budget build, verified results
Homepage ranking
The benchmark-weighted composite AI leaderboard.
Latest tracked release: Gemma 4 31B on 2 Apr 2026. It stays out of the scored set until public benchmark and quality coverage are strong enough to rank it honestly.
| # | Model | Tags | Composite | Bench | Tracks | Coverage (weighted) | Price |
|---|---|---|---|---|---|---|---|
| 01 | o3 | API, Vision | 86.9 | 86.4 | 11 | 68% | $2.00 / $8.00 |
| 02 | Gemini 2.5 Pro Preview 06-05 | API, Vision, Audio | 82.1 | 81.7 | 11 | 68% | $1.25 / $10.00 |
| 03 | GPT-5.2 | API, Vision | 79.8 | 77.9 | 8 | 53% | $1.75 / $14.00 |
| 04 | R1 | Open, API | 78.2 | 75.5 | 11 | 68% | $0.70 / $2.50 |
| 05 | Grok 4 | API, Vision | 74.6 | 77.4 | 7 | 46% | $3.00 / $15.00 |
| 06 | Grok 3 Beta | API | 73.2 | 85.0 | 6 | 38% | $3.00 / $15.00 |
| 07 | Claude Opus 4.6 | API, Vision | 72.5 | 74.8 | 7 | 45% | $15.00 / $75.00 |
| 08 | Claude Opus 4 | API, Vision | 72.1 | 83.0 | 6 | 39% | $15.00 / $75.00 |
| 09 | Claude Sonnet 4 | API, Vision | 70.3 | 77.9 | 7 | 43% | $3.00 / $15.00 |
| 10 | o4 Mini | API, Vision | 69.5 | 91.0 | 4 | 26% | $1.10 / $4.40 |
Top evaluated model
o3
OpenAI's o3 currently tops the evaluated benchmark set with a composite score of 86.9.
- Benchmark score: 86.4
- Coverage: 68%
- Best for: General use
Newer tracked launch: Gemma 4 31B. Release coverage is live before it becomes rankable.
Best open model
R1
The strongest open-weight entry on the weighted ranking right now, with benchmark coverage baked into the score.
Open source shortlist
Best value
Mistral Nemo
Strongest quality-per-cost ratio in the current leaderboard, useful when performance still has to fit a budget.
Full value ranking
Frontier signal
The AGI progress view is a compact frontier chart, not just a link.
May 2024: GPT-4o (extended)
Dec 2024: DeepSeek V3
Mar 2025: Gemini 2.5 Pro Preview 06-05
Apr 2025: o3
Aug 2025: GPT-5
Frontier leader
GPT-5
At 84.2, this is the highest published AGI-style frontier score in the current benchmark set.
12 month gain
+19.7
Change in the frontier signal between the current leader and the last comparable point roughly one year earlier.
Strongest benchmark
Chatbot Arena ELO
The current frontier leader’s strongest normalized result is 92.5 on this track.
Breaking news / daily digest
The current brief.
9 Apr 2026 digest with 20 stories from 677 sources.
Meta Releases Muse Spark - A Natively Multimodal Reasoning model
Muse Spark is a natively multimodal reasoning model with support for tool-use, visual chain of thought, and multi-agent orchestration.
Mamba 1 & 2 to Mamba 3 Architectural Upgrade
This repository contains the methodology and scripts to bypass training from scratch by structurally transplanting weights from the Mamba-1/Mamba-2 architectures directly into Mamba-3 gates.
Finally Abliterated Sarvam 30B and 105B!
I abliterated Sarvam-30B and 105B - India's first multilingual MoE reasoning models - and found something interesting along the way!
Turbo-OCR for high-volume image and PDF processing
I recently had to process ~940,000 PDFs. I started with the standard OCR tools, but the bottlenecking was frustrating. Even on an RTX 5090, I was seeing low speed.
[AutoBe] Qwen 3.5-27B Just Built Complete Backends from Scratch — 100% Compilation, 25x Cheaper
We benchmarked Qwen 3.5-27B against 10 other models on backend generation — including Claude Opus 4.6 and GPT-5.4. The outputs were nearly identical. 25x cheaper.
Updated data
Pipeline freshness.
Composite leaderboard
17 benchmark tracks weighted into the ranking layer.
Pricing & value
Official provider pricing and routed API cost references.
Speed measurements
Latency and tokens-per-second snapshots for tracked models.
Jobs market
1,094 live roles across tracked company boards.
Daily digest
20 stories from 677 sources in the latest brief.
Today in AI
The launch birthdays and lab dates that matter.
No exact anniversary lands today. The next one is the Llama 3 release, in 7 days.
Llama 3 released
Llama 3 strengthened Meta’s position in open-weight models and raised the bar for broadly available open releases.
Google DeepMind formed
Google merged DeepMind and Google Brain into one lab, concentrating one of the largest frontier AI teams under a single brand.
GPT-4o introduced
GPT-4o fused text, image and audio into a single flagship model and reset the baseline for mainstream multimodal products.
Latest activities
The site changelog, in live form.
Recomputed benchmark-weighted quality scores
Refreshed the model quality layer that feeds ranking and comparison pages.
Updated speed measurements
Refreshed output speed and latency references for tracked models.
Synced Chatbot Arena benchmark track
Updated the frontier conversation signal used in leaderboard weighting.
Validated official pricing snapshots
Rechecked provider pricing pages against the comparison database.
Pulled latest OpenRouter price index
Updated comparison data for providers and routed model endpoints.
Jobs market snapshot refreshed
1,094 open roles across 10 tracked companies.
Mamba 1 & 2 to Mamba 3 Architectural Upgrade
Reddit r/LocalLLaMA featured in the latest daily brief.
Published the 2026-04-09 daily digest
20 stories captured from 677 sources.
The homepage composite score is a coverage-aware blend of benchmark-normalized results and the existing quality layer. The AGI panel is a derived frontier signal built from ARC-AGI, GPQA Diamond, Humanity’s Last Exam, MMLU-Pro, SWE-bench Verified, and Chatbot Arena. Read the methodology before treating any ranking as gospel.
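A minimal Python sketch of what a coverage-aware blend could look like follows. The actual formula is in the methodology and is not reproduced here; the linear weighting below, the `ModelScores` type, the `composite` function, and the quality-layer values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ModelScores:
    name: str
    bench: float     # benchmark-normalized score, 0 to 100
    coverage: float  # share of weighted benchmark tracks covered, 0 to 1
    quality: float   # existing quality-layer score, 0 to 100 (hypothetical)


def composite(m: ModelScores) -> float:
    """Assumed blend: trust the benchmark score in proportion to coverage,
    and fall back on the quality layer for the uncovered share."""
    if not 0.0 <= m.coverage <= 1.0:
        raise ValueError("coverage must be between 0 and 1")
    return m.coverage * m.bench + (1.0 - m.coverage) * m.quality


# Hypothetical inputs: a high benchmark score with thin coverage is pulled
# toward the quality layer instead of dominating the ranking.
thin = ModelScores("thin-coverage model", bench=91.0, coverage=0.26, quality=60.0)
broad = ModelScores("broad-coverage model", bench=86.0, coverage=0.68, quality=88.0)

for m in sorted([thin, broad], key=composite, reverse=True):
    print(f"{m.name}: {composite(m):.1f}")
```

Under this assumed weighting, a model with a 91.0 benchmark score on 26% coverage can rank below one with a lower benchmark score on 68% coverage, which is the qualitative pattern visible in the leaderboard.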