I'm leaving ChatGPT's Advanced Voice Mode on as an ambient AI companion while writing this article. Every now and then I ask it for a synonym for a frequently used word, or some encouragement. About half an hour later, the chatbot breaks our silence and starts talking to me, unprompted, in Spanish. I giggle a little and ask what’s going on. “A little change of pace? Gotta keep things interesting,” ChatGPT says, now back in English.
Testing out Advanced Voice Mode as part of the early alpha, I found my interactions with ChatGPT's new audio feature to be entertaining, messy, and surprisingly varied, though it's worth noting that the features I had access to were only half of what OpenAI demonstrated when it launched the GPT-4o model back in May. The vision aspect we saw in the livestreamed demo is now slated for a later release, and the enhanced Sky voice, which Her actor Scarlett Johansson pushed back on, has been removed from Advanced Voice Mode and is no longer an option for users.
So what’s the current vibe? Right now, Advanced Voice Mode feels reminiscent of when the original text-based ChatGPT came out, way back in late 2022. Sometimes it leads to unimpressive dead ends or devolves into empty AI platitudes. But other times, the low-latency conversations click in a way that Apple’s Siri or Amazon’s Alexa never did for me, and I feel compelled to keep chatting for the sheer enjoyment of it. It’s the kind of AI tool you show your family members for laughs over the holidays.
OpenAI gave a few WIRED reporters access to the feature a week after it was first announced, but pulled it the next morning, citing safety concerns. Two months later, OpenAI soft-launched Advanced Voice Mode to a small group of users and released the GPT-4o system card, a technical document that outlines the company's red-teaming efforts, what it considers to be safety risks, and the mitigation steps it has taken to limit potential harm.
Curious to try it out for yourself? Here’s what you need to know about the wider rollout of Advanced Voice Mode, as well as my first impressions of ChatGPT’s new voice feature.
When is the full rollout?
OpenAI rolled out an audio-only Advanced Voice Mode to some ChatGPT Plus users in late July, and the alpha group still appears to be relatively small. The company plans to enable it for all subscribers sometime this fall. Niko Felix, an OpenAI spokesperson, didn’t share additional details when asked about the release timeline.
Screen and video sharing were a core part of the original demo, but they’re not available in this alpha test. OpenAI plans to add those aspects eventually, but it’s not clear when that will happen either.
If you are a ChatGPT Plus subscriber, you will receive an email from OpenAI when Advanced Voice Mode is available to you. Once it's enabled on your account, you can toggle between Standard and Advanced at the top of the app screen when ChatGPT's voice mode is open. I was able to test the alpha version on an iPhone and a Galaxy Fold.
My first impressions of ChatGPT's Advanced Voice Mode
Within my first hour of using it, I learned that I love interrupting ChatGPT. It’s not how you would talk to a human, but being able to cut the chatbot off mid-sentence and ask for a different version of the output feels like a genuine improvement and a standout feature.
Early adopters who were excited by the original demos may be frustrated to find that the version of Advanced Voice Mode they can access comes with more restrictions than they expected. For example, while generative AI singing was a key component of the launch demos, featuring whispered lullabies and multiple voices attempting to harmonize, AI serenades are absent from the alpha.