Google has announced that its Gemini Deep Think system, working through an agent called Aletheia, is being used for open-ended research in mathematics rather than only for fixed-answer contest problems. Aletheia drafts, checks, and repairs mathematical proofs, and scores approximately 90% on the IMO-ProofBench Advanced test. The announcement follows Gemini Deep Think’s recognition in July 2025 for meeting the IMO gold-medal standard. When a natural-language check finds a flaw in a proof, Aletheia either corrects the proof or restarts it, and it uses web browsing to ground its claims in published work. Reported applications include solving complex Erdős-style problems and aiding work in various theoretical fields, although the announcement emphasizes that strict human review remains necessary to ensure accuracy.
Gemini Deep Think reaches about 90% on the IMO-ProofBench Advanced test
