Deepmind's AI Learns The Art Of Coding

In the field of computer science, there is perhaps no more fundamental task than sorting. Bubbling, stacking, merging: take your pick. The methods of rearranging data in a computer have been theorized to death, served as practice exercises for millions of novices, and optimized over decades by expert developers. Type a sort() function in any programming language and code you can rely on. Do not touch it. It’s already working great.

But last year, an AI system developed by engineers at Google’s Deepmin improved just enough to matter. The system, which Deepmind calls AlphaDev, was tasked with devising a new way to sort short sequences into numbers in C++, the popular coding language. It meant going under the hood and letting the AI build new algorithms into assembly code – the instructions that bridge the gap between programming languages like C++ and computer hardware. When a C++ developer tells the computer to “sort,” those commands are converted into machine-readable code that tells a computer’s memory and processor exactly what to do: where to move data and how to change it. It’s where bits meet the metal.

The experiment worked. Since April last year, C++ has been running slightly faster, thanks to a new set of AI-devised sorting algorithms. But according to AlphaDev’s engineers, who described the work today in Nature, that’s just the first step. “We want to optimize the entire computing stack,” said Daniel Mankowitz, a staff researcher at Deepmind who led the sorting project. Mankowitz says AlphaDev has already improved the algorithms not only for sorting, but also for other basic tasks like hashing.

“I find this work incredibly exciting,” said Armando Solar-Lezama, an expert in program synthesis at MIT, who was not involved in the study. It is useful to let AI come up with a new sorting algorithm; it’s a much bigger deal to build an AI that can learn how to write state-of-the-art code for a variety of tasks, he says. That means AlphaDev has come to learn something more fundamental about the art of coding itself.

This naturally entails considerable limitations. “These are tiny programs,” he adds, totaling no more than a few dozen instructions in assembly code. But those little programs often pose major bottlenecks to computer performance because they’re optimized as far as people can push them. Overall, AlphaDev’s new C++ sorting algorithms are 1.7 percent more efficient than previous methods when sorting long strings of numbers, and up to 70 percent faster for strings of five items. At scale, these improvements add up, says Mankowitz. Since the AI-written code was submitted to Libc++, a major open source library for C++, he estimates that the algorithms have been used trillions of times per day.

Those improvements are thanks to a technique called Reinforcement Learning, the same approach used to help Deepmind’s AI master games like chess and Go. This type of AI learns by doing. It works by treating a given task, such as writing an editing program, as a game, in which the AI receives rewards for making smart moves that increase the program’s efficiency. Over time, the system tries to maximize this reward, resulting in a winning Go strategy or a faster editing program. This differs from the kind of AI found in large language models like GPT-4, which rely on massive amounts of data to learn how to write words or code. That’s great for producing writing that reflects the tone of the internet or producing common code segments. But it’s not great at producing new, cutting-edge solutions to coding challenges that the AI has never seen before.

Deepmind’s AI learns the art of coding