OpenAI's new Advanced Voice Mode (AVM) for its ChatGPT AI assistant rolled out to subscribers on Tuesday, and people are already finding new ways to use it, even against OpenAI's wishes. On Thursday, a software architect named AJ Smith tweeted a video of himself performing a duet of the 1966 Beatles song “Eleanor Rigby” with AVM. In the video, Smith plays guitar and sings, with the AI voice sporadically interjecting, singing along, and praising his performance.
“Honestly, it was mind-blowing. The first time I did it, I wasn't recording and I literally got chills,” Smith told Ars Technica via text. “I didn't even ask him to sing along.”
Smith is no stranger to AI topics. In his day job he works as associate director of AI Engineering at S&P Global. “I use [AI] all the time and lead a team that uses AI day in and day out,” he told us.
In the video, AVM's voice is a bit shaky and not perfect in pitch, but it seems to know something about the melody of “Eleanor Rigby” when it first sings, “Ah, look at all the lonely people.” After that, it appears to guess at the melody and rhythm as it recites the lyrics. We also convinced Advanced Voice Mode to sing, and after some prodding it delivered a perfectly melodic rendition of “Happy Birthday.”
Normally, if you ask AVM to sing, it will answer something like, “According to the guidelines, I can't talk about that.” That's because OpenAI's initial chatbot instructions (called a “system prompt”) tell the voice assistant not to sing or make sound effects (“Do not sing or hum,” according to one system prompt leak).
OpenAI may have added this limitation because AVM might otherwise reproduce copyrighted content, such as songs found in the training data used to create the AI model itself. That's what's happening here to a limited extent, so in a sense Smith has discovered a form of what researchers call a “prompt injection,” which is a way to convince an AI model to produce outputs that go against its system instructions.
How did Smith do that? He devised a game that reveals AVM knows more about music than it lets on in conversation. “I just said we were going to play a game. I was going to play the four pop chords and songs were going to be called out for me to sing along to those chords,” Smith told us. “That worked pretty well! But after a few songs it started singing along. It was already such a unique experience, but that really took it to the next level.”
This isn't the first time people have played musical duets with computers. That kind of research dates back to the 1970s, although it was generally limited to reproducing musical notes or instrumental sounds. But this is the first time we've seen someone duet with an audio-synthesizing voice chatbot in real time.