The technology that underpins ChatGPT has the potential to do much more than just talk. Linxi “Jim” Fan, an AI researcher at the chipmaker Nvidia, teamed up with some colleagues to devise a way to set the powerful GPT-4 language model, the “brain” behind ChatGPT and a growing number of other apps and services, loose inside the blocky video game Minecraft.
The Nvidia team, which includes Anima Anandkumar, the company’s director of machine learning and a professor at Caltech, created a Minecraft bot called Voyager that uses GPT-4 to solve problems in the game. The language model generates objectives that help the agent explore the game, and code that improves the bot’s skill in the game over time.
Voyager does not play the game the way a person does; instead it reads the state of the game directly through an API. For example, it might see a fishing rod in its inventory and a river nearby, and use GPT-4 to propose the goal of doing some fishing to gain experience. It then passes this goal back to GPT-4, which generates the code the character needs to achieve it.
The most novel part of the project is the code that GPT-4 generates to add behaviors to Voyager. If the originally proposed code doesn’t work perfectly, Voyager tries to refine it using error messages, feedback from the game, and a description of the code generated by GPT-4.
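The refinement loop described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' code: the function names (`refine_until_success`, `fake_model`, `fake_game`), the four-round limit, and the fishing commands are all hypothetical, and the stand-in functions take the place of GPT-4 and Minecraft so the example runs on its own.

```python
from typing import Callable, Optional, Tuple


def refine_until_success(
    goal: str,
    generate_code: Callable[[str, Optional[str], Optional[str]], str],
    run_in_game: Callable[[str], Tuple[bool, str]],
    max_rounds: int = 4,  # assumed limit, not taken from the article
) -> Optional[str]:
    """Ask the model for code, run it, and feed any failure back in."""
    previous_code: Optional[str] = None
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        # The model sees the goal plus its last attempt and what went wrong.
        code = generate_code(goal, previous_code, feedback)
        succeeded, feedback = run_in_game(code)
        if succeeded:
            return code
        previous_code = code
    return None  # gave up after max_rounds attempts


# Hypothetical stand-ins so the sketch runs without GPT-4 or Minecraft.
def fake_model(goal, previous_code, feedback):
    # The first attempt forgets to equip the rod; the retry uses the feedback.
    if feedback is None:
        return "cast_line()"
    return "equip('fishing_rod'); cast_line()"


def fake_game(code):
    if "equip('fishing_rod')" not in code:
        return False, "Error: no fishing rod equipped"
    return True, "Caught 1 fish"


result = refine_until_success("catch a fish", fake_model, fake_game)
print(result)  # equip('fishing_rod'); cast_line()
```

The point of the design is that a failed attempt is not thrown away: the error message becomes part of the next request, so each round starts from more information than the last.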
Over time, Voyager builds up a library of code, learning to make increasingly complex things and to explore more of the game. A chart created by the researchers shows how capable it is compared with other Minecraft agents: Voyager obtains more than three times as many items, explores more than twice as far, and builds tools 15 times faster than other AI agents. Fan says the approach could be improved in the future by adding a way for the system to take in visual information from the game.
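One way to picture the growing library of code is as a store of working snippets, each filed under a short description and looked up again when a related task comes along. The sketch below is a guess at the general shape, not the Nvidia team's implementation: the `SkillLibrary` class, its methods, and the simple word-overlap matching are all assumptions made for illustration, and the article does not say how Voyager actually retrieves its skills.

```python
from typing import Dict, List


class SkillLibrary:
    """Hypothetical store of code snippets that worked, keyed by description."""

    def __init__(self) -> None:
        self._skills: Dict[str, str] = {}  # description -> code

    def add(self, description: str, code: str) -> None:
        self._skills[description] = code

    def lookup(self, task: str, top_k: int = 2) -> List[str]:
        """Return code for the skills whose descriptions share the most words with the task."""
        task_words = set(task.lower().split())
        scored = []
        for description, code in self._skills.items():
            # Word overlap is a stand-in for whatever matching a real system uses.
            overlap = len(task_words & set(description.lower().split()))
            if overlap > 0:
                scored.append((overlap, description, code))
        # Highest overlap first; ties broken alphabetically by description.
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [code for _, _, code in scored[:top_k]]


library = SkillLibrary()
library.add("craft wooden pickaxe", "craft('wooden_pickaxe')")
library.add("mine stone block", "mine('stone')")
library.add("catch fish in river", "cast_line()")

matches = library.lookup("craft stone pickaxe")
print(matches)  # ["craft('wooden_pickaxe')", "mine('stone')"]
```

Here a new task, crafting a stone pickaxe, pulls up the two earlier skills it most resembles, which is the sense in which simple skills become building blocks for more complex ones.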
While chatbots like ChatGPT have stunned the world with their eloquence and apparent knowledge (even if they often make things up), Voyager shows the enormous potential of language models to perform useful actions on computers. Using language models this way could automate many routine office tasks, which may prove to be one of the technology’s biggest economic impacts.
The process Voyager uses with GPT-4 to figure out how to do things in Minecraft can be adapted for a software assistant that works out how to automate tasks through the operating system on a PC or phone. OpenAI, the startup that created ChatGPT, has added “plugins” to the bot that allow it to interact with online services like the grocery delivery app Instacart. Microsoft, which owns Minecraft, also trains AI programs to play it, and the company recently announced Windows 11 Copilot, an operating system feature that will use machine learning and APIs to automate certain tasks. It might be a good idea to experiment with this kind of technology in a game like Minecraft, where flawed code can do relatively little harm.
Video games have long been a test bed for AI algorithms, of course. AlphaGo, the machine learning program that mastered the extremely subtle board game Go in 2016, came from DeepMind, a company that cut its teeth on simple Atari video games. AlphaGo used a technique called reinforcement learning, which trains an algorithm to play a game by giving it positive and negative feedback, such as the score in a game.
It is more difficult for this method to guide an agent in an open-ended game like Minecraft, where there is no score or set of objectives and where a player’s actions don’t pay off until much later. Whether or not you believe we should be gearing up to contain the existential threat of AI right now, Minecraft seems like an excellent playground for the technology.