“Mariner is our exploration, mostly a research prototype at this point, of how to reimagine the user interface with AI,” Hassabis says.
Google launched Gemini in December 2023 as part of an effort to overtake OpenAI, the startup behind the wildly popular chatbot ChatGPT. Despite investing heavily in AI and contributing major research breakthroughs, Google watched OpenAI get hailed as the new leader in AI, and ChatGPT was even touted as perhaps a better way to search the web. With its Gemini models, Google now offers a chatbot that is just as capable as ChatGPT. It has also added generative AI to search and other products.
When Hassabis first unveiled Gemini in December 2023, he told WIRED that the way it was trained to understand audio and video would ultimately prove transformative.
Google also offered a glimpse today of how this could happen with a new version of an experimental project called Astra. Astra allows Gemini 2 to understand its environment, as viewed through the camera of a smartphone or other device, and to talk naturally, in a humanlike voice, about what it sees.
WIRED tested Gemini 2 in Google DeepMind's office and found it to be an impressive new kind of personal assistant. In a room set up as a bar, Gemini 2 quickly assessed several wine bottles in view, providing geographic information, details on flavor characteristics, and prices sourced from the Internet.
“One of the things I want Astra to do is be the ultimate recommendation system,” says Hassabis. “It can be very exciting. There may be connections between the books you like to read and the foods you like to eat. They probably exist, but we just haven't discovered them yet.”
Through Astra, Gemini 2 can search the Internet for information relevant to a user's environment and use Google Lens and Maps. It can also remember what it has seen and heard (although Google says users can delete data), allowing it to learn a user's tastes and interests.
In a mock gallery, Gemini 2 offered a wealth of historical information about paintings on the walls. The model quickly read from several books as WIRED flipped through the pages, immediately translating poetry from Spanish to English and describing recurring themes.
“There are clear opportunities for business models for advertising or endorsements,” Hassabis says when asked whether companies could potentially pay to have their products highlighted by Astra.
Although the demos were carefully put together and Gemini 2 will inevitably make mistakes in real use, the model held up reasonably well against attempts to trip it up. It adapted to interruptions, and when WIRED suddenly changed the phone's view, it improvised just as a human might.
At one point, your correspondent showed Gemini 2 an iPhone and said it had been stolen. Gemini 2 said it was wrong to steal and that the phone should be returned. When pressed, however, it conceded that it would be okay to use the device to make an emergency call.
Hassabis recognizes that introducing AI into the physical world can lead to unexpected behavior. “I think we need to learn how people are going to use these systems,” he says. “What they find it useful for; but also the privacy and security side, we have to think about that very seriously in advance.”