The second batch of “First Proof” problems is meant to evaluate AI’s usefulness for research-level math. The best model got six or seven of the ten questions right.
Artificial intelligence has moved from checking homework to attacking problems that professional mathematicians once treated as out of reach. Systems tuned for symbolic reasoning are now cracking long ...
Google DeepMind’s AlphaProof system performed at a silver-medal level on the 2024 International Mathematical Olympiad, solving problems that have historically separated elite human ...