Yesterday, Google announced that DeepMind's latest AI system, AlphaGeometry2, has for the first time outperformed the average human gold medalist on a large benchmark of geometry problems from the International Mathematical Olympiad (IMO).
<img src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/c9f9683f-fbec-41a6-936a-6dffe655d208.png?x-oss-process=image/auto-orient,1/interlace,1/resize,w_1440,h_1440/quality,q_95/format,jpg" alt="" original-src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/c9f9683f-fbec-41a6-936a-6dffe655d208.png"/>
The research team selected 45 geometry problems from the IMO competitions held from 2000 to 2024 and converted them into 50 standardized problems. AlphaGeometry2 solved 42 of the 50, ahead of the average gold medalist's score of 40.9.
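The figures above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the variable names are made up for the example, and it assumes the 40.9 gold-medalist average is measured on the same 50-problem scale.

```python
def solve_rate(solved: int, total: int) -> float:
    """Fraction of benchmark problems solved."""
    return solved / total

# Figures reported in the article.
AG2_SOLVED = 42        # problems AlphaGeometry2 solved
TOTAL_PROBLEMS = 50    # standardized problems in the benchmark
GOLD_AVERAGE = 40.9    # average gold medalist score (assumed to be on the same scale)

rate = solve_rate(AG2_SOLVED, TOTAL_PROBLEMS)
margin = AG2_SOLVED - GOLD_AVERAGE

print(f"solve rate: {rate:.0%}")
print(f"margin over the average gold medalist: {margin:.1f} problems")
```

On these numbers the system solves 84% of the benchmark and leads the average gold medalist by about one problem.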
DeepMind sees the result as significant beyond geometry. The research team believes that the reasoning ability and strategic choice needed to solve hard geometry problems, Euclidean ones in particular, are key elements in building the next generation of general artificial intelligence.
One online commenter called the result &#34;close to perfect.&#34;
<img src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/853b7880-ab51-444a-90a2-48e4aadd13ea.png?x-oss-process=image/auto-orient,1/interlace,1/resize,w_1440,h_1440/quality,q_95/format,jpg" alt="" original-src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/853b7880-ab51-444a-90a2-48e4aadd13ea.png"/>
<h2>AlphaGeometry2 Surpasses IMO Gold Medalists</h2>
DeepMind cares about this high school mathematics competition because it believes that the ability to solve Euclidean geometry problems may be key to building more capable AI systems.
Proving mathematical theorems requires both reasoning ability and the capacity to make choices among multiple possible steps, and these problem-solving skills may become an important component of future general AI models.
In fact, in a demonstration in the summer of 2024, DeepMind combined AlphaGeometry2 with AlphaProof, its AI model for formal mathematical reasoning, to solve 4 of the 6 problems from that year's IMO.
Technically, AlphaGeometry2 employs a hybrid approach that combines Google's Gemini series language models with a specialized symbolic computation engine.
During problem solving, the Gemini model predicts the geometric constructions that may be needed (such as auxiliary points, lines, or circles), while the symbolic engine derives conclusions by applying strict mathematical rules. The two modules work together through parallel search algorithms and store useful findings in a shared knowledge base. A problem is considered solved when the system can combine the Gemini model's suggestions with the symbolic engine's known principles to arrive at a complete proof.
<img src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/7589fbc5-f364-4d13-b114-5f4ffa11b4a8.png?x-oss-process=image/auto-orient,1/interlace,1/resize,w_1440,h_1440/quality,q_95/format,jpg" alt="" original-src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/7589fbc5-f364-4d13-b114-5f4ffa11b4a8.png"/>
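The division of labor described above can be sketched as a toy loop. This is a minimal illustration, not DeepMind's implementation: the rules, the fact strings, and the functions `deduce`, `propose_constructions`, and `solve` are all hypothetical. A hard-coded generator stands in for the Gemini model, and the parallel search and shared knowledge base are left out.

```python
from itertools import islice

# Toy rules: if every premise is known, the conclusion follows.
RULES = [
    ({"AB = AC"}, "angle ABC = angle ACB"),
    ({"AB = AC", "M is midpoint of BC"}, "AM perpendicular to BC"),
]

def deduce(facts, rules):
    """Symbolic engine stand-in: forward-chain until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

def propose_constructions(known):
    """Language-model stand-in: yield candidate auxiliary constructions."""
    yield "D is an arbitrary point on AB"  # unhelpful suggestion
    yield "M is midpoint of BC"            # helpful suggestion

def solve(premises, goal, rules, max_proposals=8):
    """Return the constructions needed to reach the goal, or None if unsolved."""
    known = deduce(premises, rules)
    if goal in known:
        return []  # pure deduction was enough
    for construction in islice(propose_constructions(known), max_proposals):
        if goal in deduce(known | {construction}, rules):
            return [construction]
    return None

print(solve({"AB = AC"}, "AM perpendicular to BC", RULES))
```

In this sketch the goal is out of reach for deduction alone and becomes provable only after the proposer adds the midpoint, which mirrors the role the article assigns to the language model.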
To overcome the shortage of geometric training data, the research team generated its own synthetic data: over 300 million theorems and proofs of varying complexity. This large-scale synthetic data approach offers a template for AI breakthroughs in other specialized fields.
However, AlphaGeometry2's capabilities still have clear limits. It cannot handle problems that involve a variable number of points, nonlinear equations, or inequalities. On a set of 29 more challenging IMO candidate problems selected by the research team, the system solved only 20.
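The article does not describe how the synthetic theorems and proofs were produced. One plausible way to picture the idea, offered purely as an assumption, is to sample random premises, let a rule engine derive everything it can, and record each derived fact together with the steps that led to it. The abstract symbols, rules, and function names below are hypothetical.

```python
import random

# Hypothetical rules over abstract facts: (premises, conclusion).
RULES = [
    (("a", "b"), "c"),
    (("c",), "d"),
    (("d", "e"), "f"),
]
SYMBOLS = ["a", "b", "e"]

def closure_with_proofs(premises, rules):
    """Forward-chain, remembering which premises produced each derived fact."""
    proofs = {p: None for p in premises}  # None marks a given premise
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in proofs and all(b in proofs for b in body):
                proofs[head] = body
                changed = True
    return proofs

def trace(fact, proofs):
    """Return the ordered derivation steps that establish one fact."""
    steps = []

    def visit(f):
        body = proofs[f]
        if body is None:
            return
        for b in body:
            visit(b)
        if (body, f) not in steps:
            steps.append((body, f))

    visit(fact)
    return steps

def generate_examples(n_samples, seed=0):
    """Sample premise sets and emit (premises, theorem, proof) triples."""
    rng = random.Random(seed)
    examples = []
    for _ in range(n_samples):
        k = rng.randint(1, len(SYMBOLS))
        premises = tuple(sorted(rng.sample(SYMBOLS, k)))
        proofs = closure_with_proofs(premises, RULES)
        for fact, body in proofs.items():
            if body is not None:
                examples.append((premises, fact, trace(fact, proofs)))
    return examples
```

Proof lengths differ from fact to fact (one step for `c`, three for `f`), which gives a small-scale picture of training examples "of varying complexity".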
The result has also revived debate about the development path of AI. Traditionally, the field has had two main approaches: symbolic methods, which represent knowledge and manipulate it through explicit rules, and neural networks, which are loosely modeled on the human brain.
AlphaGeometry2 adopts a hybrid architecture: its Gemini model is a neural network, while the symbolic engine is rule-based. According to DeepMind's paper, the OpenAI o1 model, which is also built on a neural network architecture, was unable in tests to solve any of the IMO problems that AlphaGeometry2 answered successfully.
Vince Conitzer, an AI expert at Carnegie Mellon University, stated:
<blockquote>
&#34;It is striking that, while these benchmarks show astonishing progress, language models, including the latest 'reasoning' models, still struggle with some simple common-sense questions.
This is not all smoke and mirrors, but it shows that we still cannot reliably predict how the next system will behave. Given the significant impact these systems may have, we urgently need to better understand them and their potential risks.&#34;
</blockquote>
However, the need for a hybrid design may not last forever. In the paper, the DeepMind team reports preliminary evidence that the language model component of AlphaGeometry2 can generate partial solutions without the help of a symbolic engine.
Nevertheless, the research team emphasized that until the model's computational speed is fundamentally improved and the &#34;hallucination&#34; problem is completely resolved, external tools such as symbolic engines will remain indispensable in mathematical applications.

Google DeepMind's latest AI system has, for the first time, outperformed the average gold medalist on a large-scale test of geometry problems from the International Mathematical Olympiad, solving 84% of them. The research team believes that geometric reasoning ability is key to building general artificial intelligence, and that this result opens up new pathways for AI development.

- Google announced that DeepMind's AI system, AlphaGeometry2, has surpassed human gold medalists in the International Mathematical Olympiad (IMO) geometry problems.  
- The system successfully solved 42 out of 50 selected geometry problems, exceeding the average score of gold medalists.  
- This breakthrough highlights the importance of reasoning and strategy in developing next-generation general AI, although AlphaGeometry2 still faces limitations with certain complex problems.