
How ChatGPT and other LLMs work – and where they could go next

    AI-powered chatbots like ChatGPT and Google Bard are definitely having a moment – the next generation of conversational software tools promises to do everything from taking over our web searches to producing an endless supply of creative literature to memorizing all the world’s knowledge so that we don’t have to.

    ChatGPT, Google Bard, and other bots like them are examples of large language models, or LLMs, and it’s worth exploring how they work. Knowing how they work means you can make better use of them and better assess what they’re good at (and what you shouldn’t trust them with).

    Like many artificial intelligence systems, such as those designed to recognize your voice or generate cat pictures, LLMs are trained on massive amounts of data. The companies behind them have been rather cautious when it comes to revealing exactly where that data comes from, but there are certain clues we can look at.

    For example, the research paper introducing the LaMDA (Language Model for Dialogue Applications) model, on which Bard is built, mentions Wikipedia, “public forums,” and “code documents from sites related to programming, such as Q&A sites, tutorials, etc.” Meanwhile, Reddit wants to start charging for access to its 18 years of text conversations, and Stack Overflow just announced plans to start charging as well. The implication here is that LLMs have so far made extensive use of both sites as resources, completely free of charge and on the backs of the people who built and used those resources. Clearly, much of what is publicly available on the internet has been scraped and analyzed by LLMs.

    LLMs use a combination of machine learning and human input.


    All this text data, wherever it comes from, is processed through a neural network, a commonly used type of AI engine made up of multiple nodes and layers. These networks continually adjust the way they interpret and understand data based on a variety of factors, including the results of previous trial and error. Most LLMs use a specific neural network architecture called a transformer, which has some tricks particularly suited to language processing. (The GPT in ChatGPT stands for Generative Pretrained Transformer.)
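    For a rough sense of what “nodes and layers” means, here is a minimal, hypothetical sketch in Python: a tiny two-layer network doing a forward pass and a single trial-and-error weight adjustment. The sizes, inputs, and target here are invented purely for illustration; real LLMs have billions of weights and far more elaborate training procedures.

```python
import numpy as np

# A toy two-layer network with made-up sizes, just to illustrate
# "nodes and layers". Real LLMs have billions of weights.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 8))   # layer 1: 4 inputs feed 8 hidden nodes
W2 = rng.normal(size=(8, 2))   # layer 2: 8 hidden nodes feed 2 outputs

def forward(x):
    hidden = np.maximum(0, x @ W1)   # each hidden node sums its inputs, then applies ReLU
    return hidden @ W2               # the output layer combines the hidden nodes

x = rng.normal(size=4)               # an invented input vector
target = np.array([1.0, 0.0])        # an invented "correct" answer

# One round of trial and error: nudge the output weights to shrink the error.
hidden = np.maximum(0, x @ W1)
output = hidden @ W2
error = output - target
W2 -= 0.01 * np.outer(hidden, error)  # a simplified gradient step on the output layer only
```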

    Specifically, a transformer can read huge amounts of text, discover patterns in how words and phrases relate to each other, and then make predictions about which words should come next. You may have heard LLMs compared to supercharged autocorrect engines, and that’s not too far off the mark really: ChatGPT and Bard don’t really “know” anything, but they are very good at figuring out which word follows another, which begins to look like real thinking and creativity when it reaches a sufficiently advanced stage.
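    The “supercharged autocorrect” framing can be made concrete with a toy example. The sketch below builds the crudest possible next-word predictor, simply counting which word follows which in a tiny invented corpus. Actual LLMs predict over enormous vocabularies using learned transformer weights rather than raw counts, but the underlying task – guess the next word – is the same.

```python
from collections import Counter, defaultdict

# A toy "autocomplete": count which word tends to follow which in a
# small invented corpus, then predict the most likely next word.
corpus = "the cat sat on the mat so the cat slept".split()

next_word_counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    next_word_counts[current][nxt] += 1

def predict_next(word):
    counts = next_word_counts.get(word)
    if not counts:
        return None
    return counts.most_common(1)[0][0]

print(predict_next("the"))   # -> "cat", because it followed "the" most often
print(predict_next("cat"))   # -> "sat" (a tie with "slept"; the first one seen wins)
```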

    One of the main innovations of these transformers is the self-attention mechanism. It’s hard to explain in a paragraph, but essentially it means that words in a sentence are considered not in isolation, but also in relation to each other in a variety of sophisticated ways. It allows for a higher level of understanding than would otherwise be possible.
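    For readers who want a little more than a paragraph, here is a bare-bones sketch of scaled dot-product self-attention, the core of that mechanism. The word vectors and weight matrices below are random placeholders – in a real transformer they are learned, and many attention heads are stacked across many layers – so treat this as an illustration of the shape of the computation, not a working model.

```python
import numpy as np

# A bare-bones sketch of scaled dot-product self-attention.
# Embeddings and weights are random placeholders for illustration.
rng = np.random.default_rng(0)
seq_len, d = 5, 16                    # 5 words, 16-dimensional vectors (invented sizes)
x = rng.normal(size=(seq_len, d))     # one vector per word in the sentence

Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = x @ Wq, x @ Wk, x @ Wv      # queries, keys, and values for every word

scores = Q @ K.T / np.sqrt(d)         # how strongly each word relates to every other word
weights = np.exp(scores)
weights /= weights.sum(axis=-1, keepdims=True)   # softmax: each row sums to 1

attended = weights @ V                # each word's new vector mixes in the words it attends to
print(weights.round(2))               # row i shows how word i weighs the whole sentence
```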

    There is some randomness and variation built into the code, which is why you won’t get the same answer from a transformer chatbot every time. This autocorrect idea also explains how errors can creep in. On a fundamental level, ChatGPT and Google Bard don’t know what’s right and what’s not. They’re looking for answers that seem plausible and natural, and that match the data they’ve been trained on.
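    That built-in randomness usually comes from sampling the next word from a probability distribution rather than always taking the single most likely one. The sketch below shows one common (though not universal) version of that idea, with a “temperature” knob that makes the output more or less predictable; the candidate words and scores are made up.

```python
import numpy as np

# Instead of always picking the top-scoring next word, sample from the
# model's probability distribution. The temperature flattens or sharpens it.
rng = np.random.default_rng()

words = ["cat", "dog", "keyboard", "moon"]
logits = np.array([2.5, 2.3, 0.4, 0.1])   # invented raw scores for each candidate word

def sample(logits, temperature=1.0):
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())  # softmax, shifted for numerical stability
    probs /= probs.sum()
    return rng.choice(words, p=probs)

print(sample(logits, temperature=0.7))  # usually "cat" or "dog"
print(sample(logits, temperature=1.5))  # more likely to surprise you
```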