AI reaches silver-medal level at this year's Math Olympiad

DeepMind’s artificial intelligence programs were able to solve a total of four out of six problems

(Web Desk) - During the 2024 International Mathematical Olympiad, Google DeepMind debuted an AI program that can generate complex mathematical proofs.

While Paris was preparing to host the 33rd Olympic Games, more than 600 students from nearly 110 countries came together in the idyllic English town of Bath in July for the International Mathematical Olympiad (IMO).

They had two sessions of four and a half hours each to answer six problems from various mathematical disciplines. Chinese student Haojia Shi took first place in the individual rankings with a perfect score.

In the rankings by country, the team from the U.S. came out on top. The most noteworthy results at the event, however, were those achieved by two machines from Google DeepMind that entered the competition.

DeepMind’s artificial intelligence programs were able to solve a total of four out of six problems, which would correspond to the level of a silver medalist.

Together, the two programs scored 28 out of a possible 42 points, earning the full seven points on each of the four problems they solved. Only around 60 students scored better, wrote mathematician and Fields Medalist Timothy Gowers, a previous gold medalist in the competition, in a thread on X (formerly Twitter).

To achieve this impressive result, the DeepMind team used two different AI programs: AlphaProof and AlphaGeometry 2. The former works in a similar way to the algorithms that mastered chess, shogi and Go.

Using what is called reinforcement learning, AlphaProof repeatedly competes against itself and improves step-by-step. This method can be implemented quite easily for board games.

The AI executes several moves; if these do not lead to a win, it is penalized and learns to pursue other strategies.
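
The win-or-penalty loop described above can be illustrated with a toy self-play learner. The sketch below is not DeepMind's code. It assumes a simple stone-taking game (players alternately remove one to three stones, and whoever takes the last stone wins), and the names `train`, `choose_move` and `q_table`, along with the learning rate and exploration rate, are invented for illustration.

```python
import random
from collections import defaultdict


def legal_moves(stones):
    """A player may take 1, 2 or 3 stones, but never more than remain."""
    return [m for m in (1, 2, 3) if m <= stones]


def choose_move(q, stones, epsilon, rng):
    """Mostly pick the best-rated move, sometimes explore a random one."""
    moves = legal_moves(stones)
    if rng.random() < epsilon:
        return rng.choice(moves)
    return max(moves, key=lambda m: q[(stones, m)])


def train(games=5000, start=10, alpha=0.1, epsilon=0.2, seed=0):
    """Self-play: both sides share one rating table q[(stones, move)]."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(games):
        stones, player = start, 0
        history = {0: [], 1: []}  # (stones, move) pairs made by each player
        while stones > 0:
            move = choose_move(q, stones, epsilon, rng)
            history[player].append((stones, move))
            stones -= move
            winner = player  # whoever moved last took the final stone
            player = 1 - player
        for p in (0, 1):
            reward = 1.0 if p == winner else -1.0  # the loser is penalized
            for key in history[p]:
                # Nudge the rating of each move toward the game's outcome.
                q[key] += alpha * (reward - q[key])
    return q


q_table = train()
```

After training, `max(legal_moves(n), key=lambda m: q_table[(n, m)])` gives the move the learner currently rates highest with `n` stones left. How good that choice is depends on the number of games played and on the random seed.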

To do the same for mathematical problems, however, a program must be able not only to check that it has solved the problem but also to verify that the reasoning steps it took to arrive at the solution were correct.

To accomplish this, AlphaProof uses so-called proof assistants: algorithms that go through a logical argument step by step to check whether answers to the problems posed are correct.

Although proof assistants have been around for several decades, their use in machine learning has been constrained by the very limited amount of mathematical data available in a formal language, such as Lean, that the computer can understand.
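
For readers unfamiliar with formal languages, here is a small illustration of what a machine-checkable proof looks like in Lean 4. It is a textbook-style example chosen for this article, not one of the Olympiad problems and not taken from DeepMind's data; the theorem name `and_swap_example` is made up.

```lean
-- Claim: if both p and q hold, then both q and p hold.
theorem and_swap_example (p q : Prop) (h : p ∧ q) : q ∧ p := by
  constructor        -- split the goal "q ∧ p" into two goals: q, then p
  · exact h.right    -- q is the right-hand part of the hypothesis h
  · exact h.left     -- p is the left-hand part of the hypothesis h
```

Lean accepts the theorem only if every step closes its goal. If a step is wrong or missing, the whole proof is rejected.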

Solutions to math problems that are written in natural language, on the other hand, are available in abundance.

There are numerous problems on the Internet that humans have solved step-by-step.

The DeepMind team therefore trained a large language model called Gemini to translate a million such problems into the Lean programming language so that AlphaProof could train on them.
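
To give a flavour of what such a translation involves, the sketch below pairs an informal statement with one possible Lean 4 formalization. It is an illustrative example only, not output from Gemini. The theorem name `even_add_even` and the choice to express "even" as "equal to two times some number" are assumptions made here, and a short proof is included simply to show that the formal statement can be checked.

```lean
-- Informal statement: "The sum of two even natural numbers is even."
theorem even_add_even (a b : Nat)
    (ha : ∃ k, a = 2 * k) (hb : ∃ m, b = 2 * m) :
    ∃ n, a + b = 2 * n := by
  cases ha with
  | intro k hk =>
    cases hb with
    | intro m hm =>
      -- Witness: n = k + m, since 2 * k + 2 * m = 2 * (k + m).
      exact ⟨k + m, by rw [hk, hm, Nat.mul_add]⟩
```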

“When presented with a problem, AlphaProof generates solution candidates and then proves or disproves them by searching over possible proof steps in Lean,” the developers wrote on DeepMind’s website.

By doing so, AlphaProof gradually learns which proof steps are useful and which are not, enhancing its ability to solve more complex problems.
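
The idea of searching over possible proof steps and then checking the result can be sketched in a few lines. The toy below is an analogy, not AlphaProof: it assumes a made-up set of inference rules (`RULES`), uses plain breadth-first search instead of a learned strategy, and replaces Lean with a small checker function. All names are hypothetical.

```python
from collections import deque

# Hypothetical toy rules: (premises, conclusion).
RULES = [
    (frozenset({"A"}), "B"),
    (frozenset({"B"}), "C"),
    (frozenset({"A", "C"}), "D"),
]


def check_proof(axioms, steps, goal, rules=RULES):
    """Replay each step; accept only if every premise is already known."""
    known = set(axioms)
    for premises, conclusion in steps:
        if (premises, conclusion) not in rules or not premises <= known:
            return False
        known.add(conclusion)
    return goal in known


def search_proof(axioms, goal, rules=RULES):
    """Breadth-first search over rule applications; returns steps or None."""
    start = frozenset(axioms)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        known, steps = queue.popleft()
        if goal in known:
            return steps
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                nxt = known | {conclusion}
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [(premises, conclusion)]))
    return None  # no derivation exists under these rules


proof = search_proof({"A"}, "D")
```

Here `check_proof({"A"}, proof, "D")` replays the steps that were found and confirms each one independently, which is the role the article attributes to the proof assistant. A system that learns, as AlphaProof is described as doing, would additionally use the outcomes of such searches to decide which steps to try first.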