The CEO of OpenAI says the era of giant AI models is already over

    The amazing possibilities of ChatGPT, the chatbot from startup OpenAI, have sparked a wave of new interest and investment in artificial intelligence. But late last week, OpenAI’s CEO warned that the research strategy that underpinned the bot has played out. It is unclear exactly where future advances will come from.

    OpenAI has made a series of impressive advances in AI that works with language in recent years by taking existing machine learning algorithms and scaling them up to a previously unimagined size. GPT-4, the newest of those projects, was probably trained using trillions of words of text and many thousands of powerful computer chips. The process cost more than $100 million.

    But the company’s CEO, Sam Altman, says further progress won’t come from making models bigger. “I think we’re at the end of the era where it’s going to be these giant, giant models,” he told an audience at an event held at MIT late last week. “We’ll make them better in other ways.”

    Altman’s statement suggests an unexpected turn in the race to develop and deploy new AI algorithms. Since OpenAI launched ChatGPT in November, Microsoft has used the underlying technology to add a chatbot to its Bing search engine, and Google has launched a rival chatbot called Bard. Many people are rushing to experiment with the new type of chatbot to help with work or personal tasks.

    Meanwhile, numerous well-funded startups, including Anthropic, AI21, Cohere, and Character.AI, are pouring huge resources into building ever-larger algorithms in an effort to catch up with OpenAI’s technology. The first version of ChatGPT was based on a slightly improved version of GPT-3, but users now also have access to a version powered by the more capable GPT-4.

    Altman’s statement suggests that GPT-4 could be the last major advance to emerge from OpenAI’s strategy of making the models bigger and giving them more data. He did not say what kind of research strategies or techniques might take its place. In the paper describing GPT-4, OpenAI says its estimates point to diminishing returns as model size scales. Altman said there are also physical limits to how many data centers the company can build and how quickly it can build them.

    Nick Frosst, a co-founder at Cohere who previously worked on AI at Google, says Altman’s sense that getting bigger won’t work indefinitely rings true. He, too, believes that progress in transformers, the type of machine learning model at the heart of GPT-4 and its rivals, lies beyond scaling. “There are many ways to make transformers much, much better and more useful, and many of them don’t require you to add parameters to the model,” he says. Frosst says new AI model designs, or architectures, and further tuning based on human feedback are promising directions that many researchers are already exploring.

    Each version of OpenAI’s influential family of language algorithms consists of an artificial neural network, software loosely inspired by the way neurons work together, that is trained to predict the words that should follow a given string of text.
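
    To make that training objective concrete, here is a minimal, illustrative sketch of next-word prediction in PyTorch. This is not OpenAI’s code: the training text, vocabulary, and model are toy placeholders, and in this sketch each prediction conditions only on the current word, whereas GPT-style models use transformer attention over the whole preceding sequence and operate on subword tokens. The point is only to show the objective the article describes: learn to assign high probability to the word that actually comes next.

```python
# Illustrative sketch only (assumed toy setup, not OpenAI's code): a tiny neural
# network trained on the next-word-prediction objective described above.
import torch
import torch.nn as nn
import torch.nn.functional as F

text = "the cat sat on the mat the cat sat on the rug"
vocab = sorted(set(text.split()))
stoi = {w: i for i, w in enumerate(vocab)}            # word -> integer id
ids = torch.tensor([stoi[w] for w in text.split()])   # the text as a sequence of ids

class TinyNextWordModel(nn.Module):
    def __init__(self, vocab_size, dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)    # token id -> vector
        self.out = nn.Linear(dim, vocab_size)         # vector -> score for every word

    def forward(self, x):
        return self.out(self.embed(x))

model = TinyNextWordModel(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

# Training pairs: each word in the text, paired with the word that actually follows it.
inputs, targets = ids[:-1], ids[1:]
for step in range(200):
    logits = model(inputs)                            # predicted scores for the next word
    loss = F.cross_entropy(logits, targets)           # penalty for wrong next-word guesses
    opt.zero_grad()
    loss.backward()
    opt.step()
```

    Scaling this idea up, with vastly more text, parameters, and compute, is the strategy the article says OpenAI has pursued so far, and the one Altman now suggests has run its course.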