Google released a major update this week (link below):
Google blog
The updated Deep Think mode continues to push the frontiers of intelligence, reaching new heights across the most rigorous academic benchmarks, including:
- Setting a new standard (48.4%, without tools) on Humanity's Last Exam, a benchmark designed to test the limits of modern frontier models
- Achieving an unprecedented 84.6% on ARC-AGI-2, verified by the ARC Prize Foundation
- Attaining a staggering Elo of 3455 on Codeforces, a benchmark consisting of competitive programming challenges
- Reaching gold-medal level performance on the International Math Olympiad 2025

Todor
------------------------------
Todor Kostov
Director
------------------------------