OpenAI today announced an improved version of its most capable artificial intelligence model yet – one that takes even more time to think about questions – just a day after Google announced its first model of this type.
OpenAI's new model, called o3, replaces o1, which the company introduced in September. Like o1, the new model spends time thinking about a problem to provide better answers to questions that require step-by-step logical reasoning. (OpenAI has chosen to skip the “o2” moniker, as it is already the name of a mobile carrier in Britain.)
“We see this as the beginning of the next phase of AI,” Sam Altman, CEO of OpenAI, said during a livestream on Friday. “Where you can use these models to perform increasingly complex tasks that require a lot of reasoning.”
The o3 model scores much higher on several metrics than its predecessor, OpenAI says, including metrics that measure complex coding-related skills and advanced math and science skills. It is three times better than o1 at answering questions from ARC-AGI, a benchmark designed to test the ability of AI models to reason about extremely difficult mathematical and logical problems they encounter for the first time.
Google is following a similar line of inquiry. Noam Shazeer, a Google researcher, revealed in a post on X yesterday that the company has developed its own reasoning model called Gemini 2.0 Flash Thinking. Google's CEO, Sundar Pichai, called it “our most thoughtful model yet” in his own post. Google's new model achieved a high score on SWE-Bench, a test that measures a model's agentic skills.
However, OpenAI's new o3 model scores 20 percent better than o1 on the same test. “o3 blew it out of the water,” says Ofir Press, a postdoctoral researcher at Princeton University who helped develop SWE-Bench. “Very surprising increase, not sure how they did it.”
The two dueling models show that the competition between OpenAI and Google is fiercer than ever. It's critical for OpenAI to demonstrate that it can continue to make progress as it tries to attract more investment and build a profitable business. Google, meanwhile, is desperate to show that it remains at the forefront of AI research.
The new models also show how AI companies are increasingly looking beyond simply scaling up AI models as a way to squeeze more intelligence out of them.