Machine Learning and Neural AI
Machine learning involves having the computer learn a model that can be used to carry out inference. The model can be something simple, such as a line that provides the best fit to data, or as complex as a neural network with billions of parameters. In supervised learning, a model is trained on input/output pairs to learn the relationship between the two and produce a suitable output for new, unseen inputs. In reinforcement learning, a computational agent explores an environment by choosing actions from a given action space and is rewarded or punished accordingly. Over time, the agent learns to choose actions that maximize its cumulative reward.
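As a concrete illustration of the supervised setting, here is a minimal sketch in Python of the simplest case mentioned above: fitting a line to noisy input/output pairs by least squares. The data, slope, and intercept are invented for the example.

```python
import numpy as np

# Synthetic input/output pairs: y is (roughly) a linear function of x plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=x.shape)

# "Training" the model here is just solving a least-squares problem:
# find the slope and intercept that minimize the squared prediction error.
A = np.column_stack([x, np.ones_like(x)])
(slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)

# Inference: apply the learned model to a new, unseen input.
x_new = 12.0
print(f"fitted line: y = {slope:.2f} * x + {intercept:.2f}")
print(f"prediction at x = {x_new}: {slope * x_new + intercept:.2f}")
```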
Both automated reasoning and machine learning have been part of AI from the very beginning. Machine learning became more prominent with the advent of big data in the 1990s, and neural networks took off after the success of AlexNet in the ImageNet image classification challenge in 2012. Advances in reinforcement learning led to the stunning success of Google DeepMind’s AlphaGo in defeating reigning Go champion Lee Sedol, winning four games in a five-game match. Advances in generative AI and large language models led to the release of OpenAI’s ChatGPT in 2022.
Given that doing mathematics requires studying the literature, detecting patterns, and learning from experience, it is not surprising that machine learning can be used in mathematics in several ways. We describe some of them here.
Discovering patterns
Working with Google DeepMind, András Juhász and Marc Lackenby trained a neural network on hyperbolic and algebraic invariants of knots, and found that it could predict an algebraic invariant from the hyperbolic invariants. A sensitivity analysis reduced the dependency to three quantities, and further experimentation led to a novel conjecture that they were able to prove by hand. The result was described in an article in Nature. The same article described a similar application of neural networks used by Geordie Williamson to obtain a result in representation theory: a formula for computing the Kazhdan-Lusztig polynomials of a pair of permutations from their unlabelled Bruhat interval, a graph-theoretic object.
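The following sketch conveys the general shape of such an experiment rather than the actual pipeline used in the work just described: a small feedforward network, written here in PyTorch, is trained to predict a single target invariant from a vector of other invariants, and input gradients serve as a crude sensitivity analysis. The data, dimensions, and variable names are placeholders.

```python
import torch
import torch.nn as nn

# Placeholder data: each row of X stands in for the hyperbolic invariants of a knot,
# and y stands in for the algebraic invariant we hope to predict.
n_knots, n_features = 10_000, 12
X = torch.randn(n_knots, n_features)
y = torch.randn(n_knots, 1)

model = nn.Sequential(
    nn.Linear(n_features, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()

# Crude sensitivity analysis: average magnitude of the gradient of the
# prediction with respect to each input feature.
X_probe = X[:1000].clone().requires_grad_(True)
model(X_probe).sum().backward()
sensitivity = X_probe.grad.abs().mean(dim=0)
print("feature sensitivities:", sensitivity)
```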
Conventional data mining techniques can be used to detect other patterns in mathematical data. In 2020, Yang-Hui He, Kyu-Hwan Lee, and Thomas Oliver used machine learning algorithms on data associated with elliptic curves over finite fields. They determined that a simple logistic regression model could predict the rank of a curve from certain quantities known as Frobenius traces, a_p = p + 1 − #E(F_p), where #E(F_p) is the number of rational points on the curve over the field with p elements. In 2022, Lee asked an undergraduate student, Alexey Pozdnyakov, to study the data. Pozdnyakov used elementary data analysis methods and discovered an oscillating pattern, visible when the traces are averaged over many curves, that separates rank 0 and rank 1 curves as a function of p. The phenomenon was called “murmurations,” and it opened a new line of research in number theory. The initial experiments can be reproduced using a remarkable Python notebook by Pozdnyakov.
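A hedged sketch of this kind of experiment, using scikit-learn, is given below. It assumes a dataset in which each row holds the Frobenius traces a_p of one curve for a range of primes together with its rank; the file name and column layout are hypothetical and do not correspond to the original papers or to Pozdnyakov's notebook.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical dataset: one row per elliptic curve, columns "a_2", "a_3", ...
# holding Frobenius traces, plus a "rank" column (restricted here to 0 or 1).
curves = pd.read_csv("elliptic_curves.csv")
curves = curves[curves["rank"].isin([0, 1])]

trace_cols = [c for c in curves.columns if c.startswith("a_")]
X = curves[trace_cols].to_numpy(dtype=float)
y = curves["rank"].to_numpy()

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# A plain logistic regression on the traces as a predictor of rank.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))

# Averaging traces over many curves of fixed rank, as a function of p,
# is what reveals the oscillating "murmuration" patterns.
mean_rank0 = X[y == 0].mean(axis=0)
mean_rank1 = X[y == 1].mean(axis=0)
print("first few averaged traces, rank 0:", np.round(mean_rank0[:5], 3))
print("first few averaged traces, rank 1:", np.round(mean_rank1[:5], 3))
```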
Neural computation
Physical data is often constrained to satisfy partial differential equations. Physics-informed neural networks (PINNs) build such constraints into the training objective, so they can learn from and make predictions about data that must obey physical laws. They represent an important new direction in scientific computation.
PINNs can also be used to approximate numerical solutions to PDEs and to find potential singularities that are then confirmed by conventional methods. They are becoming a central tool in the study of PDEs, as the following articles illustrate; a minimal sketch of the approach appears after the list.
- Latest Neural Nets Solve World’s Hardest Equations Faster Than Ever Before
- Deep Learning Poised to ‘Blow Up’ Famed Fluid Equations
- Discovering new solutions to century-old problems in fluid dynamics
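As a rough illustration of the idea, and not of the architectures used in the work linked above, the following PyTorch sketch trains a small network u(x, t) to satisfy the one-dimensional heat equation u_t = u_xx together with an initial condition, by penalizing the PDE residual at randomly sampled points. Boundary conditions are omitted to keep the sketch short.

```python
import torch
import torch.nn as nn

# Network representing the unknown solution u(x, t) on [0, 1] x [0, 1].
net = nn.Sequential(
    nn.Linear(2, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1),
)
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)

def pde_residual(x, t):
    """Residual of the heat equation u_t - u_xx at the points (x, t)."""
    x = x.clone().requires_grad_(True)
    t = t.clone().requires_grad_(True)
    u = net(torch.cat([x, t], dim=1))
    u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x.sum(), x, create_graph=True)[0]
    return u_t - u_xx

for step in range(5000):
    optimizer.zero_grad()
    # Interior points: penalize violation of the PDE.
    x = torch.rand(256, 1)
    t = torch.rand(256, 1)
    loss_pde = pde_residual(x, t).pow(2).mean()
    # Initial condition u(x, 0) = sin(pi * x).
    x0 = torch.rand(256, 1)
    u0 = net(torch.cat([x0, torch.zeros_like(x0)], dim=1))
    loss_ic = (u0 - torch.sin(torch.pi * x0)).pow(2).mean()
    (loss_pde + loss_ic).backward()
    optimizer.step()
```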
Finding mathematical objects
One can use supervised learning with synthetic data to find expressions with verifiable properties, such as antiderivatives and Lyapunov functions. The challenge is to generate training data whose distribution is representative of the problems one actually wants to solve.
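One way to build such synthetic data, sketched here as a general recipe rather than any particular paper's pipeline, is to sample random expressions, differentiate them symbolically, and use the resulting (derivative, original expression) pairs as training examples for a model that proposes antiderivatives; any proposed answer can then be checked simply by differentiating it.

```python
import random
import sympy as sp

x = sp.symbols("x")
ATOMS = [x, x**2, sp.sin(x), sp.cos(x), sp.exp(x), sp.log(1 + x**2)]

def random_expression(depth=2):
    """Build a small random expression by combining atoms with + and *."""
    if depth == 0:
        return random.choice(ATOMS) * random.randint(1, 5)
    op = random.choice([sp.Add, sp.Mul])
    return op(random_expression(depth - 1), random_expression(depth - 1))

def make_pair():
    """Return a (derivative, antiderivative) training pair as strings."""
    f = random_expression()
    return sp.sstr(sp.simplify(sp.diff(f, x))), sp.sstr(f)

# Each input is guaranteed to have the corresponding output as an antiderivative,
# and a model's prediction can be verified by differentiating it.
dataset = [make_pair() for _ in range(1000)]
print(dataset[0])
```

As noted above, the hard part in practice is making the distribution of generated expressions resemble the problems one actually cares about.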
One of DeepMind’s first striking successes with reinforcement learning was to train a system to play old Atari video games. In 2021, Adam Zsolt Wagner found a clever way to gamify the search for combinatorial objects, enabling a system to construct graphs with properties of interest. The methods have been refined in the years since. In a recent paper, Bogdan Georgiev, Javier Gómez-Serrano, Terence Tao, and Wagner used AlphaEvolve to find solutions to several challenging problems. There are open-source variations on that system, such as OpenEvolve and ShinkaEvolve.
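To give a flavor of the gamification idea, here is a simplified sketch in the spirit of Wagner's approach, not his actual code, reward function, or policy network: treat the upper-triangular adjacency entries of a graph as a sequence of 0/1 choices, score each completed graph, and run a cross-entropy-style loop that repeatedly samples graphs and nudges the sampling probabilities toward the best scorers. Wagner's setup generates the choices with a neural network; the toy reward below merely favors graphs with many edges and no triangles.

```python
import numpy as np

N = 10                      # number of vertices
M = N * (N - 1) // 2        # number of potential edges (upper triangle)
iu = np.triu_indices(N, k=1)

def score(bits):
    """Toy reward: number of edges, heavily penalizing triangles."""
    A = np.zeros((N, N), dtype=int)
    A[iu] = bits
    A += A.T
    triangles = np.trace(A @ A @ A) // 6
    return bits.sum() - 10 * triangles

# Cross-entropy-style search: sample graphs edge by edge from independent
# Bernoulli probabilities, keep the best samples, and move the probabilities
# toward the elite set.
rng = np.random.default_rng(0)
probs = np.full(M, 0.5)
for iteration in range(200):
    samples = (rng.random((500, M)) < probs).astype(int)
    scores = np.array([score(s) for s in samples])
    elite = samples[np.argsort(scores)[-50:]]          # top 10% of samples
    probs = 0.9 * probs + 0.1 * elite.mean(axis=0)

best = samples[np.argmax(scores)]
print("best score found:", score(best))
```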