ChatGPT’s most charming trick is also its biggest flaw

    Like many others, Bindu Reddy fell under the spell of ChatGPT last week, a free chatbot that can answer all kinds of questions with amazing and unprecedented expressiveness.

    Reddy, CEO of Abacus.AI, which develops tools for programmers using artificial intelligence, liked ChatGPT’s ability to answer requests for definitions of love or creative new cocktail recipes. Her company is already exploring how to use ChatGPT to write technical documents. “We tested it and it works great,” she says.

    ChatGPT, created by startup OpenAI, has become the internet’s darling since its release last week. Early adopters have enthusiastically posted screenshots of their experiments, amazed at its ability to generate short essays on just about any theme, produce literary parodies, answer complex coding questions, and much more. It has led to predictions that the service will make conventional search engines and homework assignments obsolete.

    Still, the AI at the core of ChatGPT is actually not very new. It’s a version of an AI model called GPT-3 that generates text based on patterns it learned from massive amounts of text collected from the internet. That model, which is available as a commercial API for programmers, has already shown that it can sometimes answer questions very well and generate text. But getting the service to respond in a particular way required crafting the right prompt to feed into the software.

    ChatGPT stands out because it can answer a naturally worded question using a new variant of GPT-3 called GPT-3.5. This tweak has unlocked a new capacity to respond to all kinds of questions, giving the powerful AI model an attractive new interface that virtually anyone can use. The fact that OpenAI opened the service for free, and that its glitches can be a lot of fun, also contributed to the chatbot’s viral debut, much as some AI-powered image-generation tools have proved ideal for making memes.

    OpenAI hasn’t released full details on how it gave its text generation software a naturalistic new interface, but the company shared some information in a blog post. It says the team fed human-written answers to GPT-3.5 as training data, then used a form of simulated reward and punishment known as reinforcement learning to push the model to provide better answers to sample questions.
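    The training loop the blog post describes can be pictured in miniature. The toy Python sketch below is purely illustrative (the candidate answers, reward values, and update rule are invented for this example, not OpenAI’s actual setup): a model repeatedly picks among candidate answers, a reward signal standing in for human preference ratings scores each pick, and the model’s preferences are nudged up for well-rewarded answers and down for poorly rewarded ones.

    ```python
    import random

    # Candidate answers the "model" can give, and a stand-in reward signal
    # representing human preference ratings (all values are illustrative).
    CANDIDATES = ["curt reply", "helpful detailed reply", "off-topic reply"]
    REWARD = {"curt reply": 0.2, "helpful detailed reply": 1.0, "off-topic reply": 0.0}

    def train(steps=2000, lr=0.1, seed=0):
        rng = random.Random(seed)
        weights = {c: 1.0 for c in CANDIDATES}  # the model's preference scores
        for _ in range(steps):
            total = sum(weights.values())
            probs = [weights[c] / total for c in CANDIDATES]
            choice = rng.choices(CANDIDATES, weights=probs)[0]
            # Reinforcement step: grow the weight when reward beats the
            # 0.5 baseline, shrink it otherwise (the "punishment").
            weights[choice] *= 1 + lr * (REWARD[choice] - 0.5)
            weights[choice] = max(weights[choice], 1e-6)
        return weights

    weights = train()
    best = max(weights, key=weights.get)  # the answer the model now prefers
    ```

    After enough iterations the weight mass concentrates on the answer the reward signal favors, which is the basic dynamic, at vastly larger scale, behind tuning a language model with human feedback.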

    Christopher Potts, a professor at Stanford University, says the method used to help ChatGPT answer questions, which OpenAI has demonstrated before, appears to be an important step forward in helping AI interact with language in ways that feel more relatable. “It’s extremely impressive,” Potts says of the technique, despite thinking it could complicate his job. “It got me thinking about what I’m going to do in my courses that require short answers to assignments,” he says.

    Jacob Andreas, an assistant professor who works on AI and language at MIT, says the system is likely to increase the pool of people who can use AI language tools. “Here you’re presented with something in a familiar interface that makes you apply a mental model that you’re used to applying to other agents — people — you interact with,” he says.