Academic researchers have devised a new exploit that controls Amazon Echo’s smart speakers, forcing them to unlock doors, make phone calls and make unauthorized purchases, and control ovens, microwaves and other smart devices.
The attack works by using the device’s speaker to issue voice commands. As long as the speech contains the device’s activation word (usually “Alexa” or “Echo”) followed by an allowed command, the Echo will execute it, researchers at Royal Holloway University in London and Italy’s University of Catania have found. Even when devices require verbal confirmation before executing sensitive commands, it’s trivial to get around the yardstick by adding the word “yes” about six seconds after the command is given. Attackers can also exploit what the researchers call the “FVV” or full voice vulnerability, which allows Echos to issue self-issued commands without temporarily reducing device volume.
Alexa, go hack yourself
Because the hack uses Alexa functionality to force devices to create self-issued commands, the researchers dubbed it “AvA,” short for Alexa vs. Alexa. It only requires a few seconds of proximity to a vulnerable device while it’s powered on so that an attacker can give a voice command to instruct it to pair with an attacker’s Bluetooth device. As long as the device remains within radio range of the Echo, the attacker can issue commands.
The attack “is the first to exploit the vulnerability of self-issuing arbitrary commands on Echo devices, allowing an attacker to control them for extended periods of time,” the researchers wrote in a paper published two weeks ago. “With this work, we eliminate the need for an external speaker near the target device, increasing the overall likelihood of an attack.”
A variant of the attack uses an evil radio station to generate the self-published commands. That attack is no longer possible in the way shown in the newspaper after security patches that Echo maker Amazon released in response to the investigation. The researchers have confirmed that the attacks work against 3rd and 4th generation Echo Dot devices.
AvA begins when a vulnerable Echo device connects to the attacker’s device via Bluetooth (and for unpatched Echos, when they play the malicious radio station). From then on, the attacker can use a text-to-speech app or other means to stream voice commands. Here’s a video of AGM in action. All variations of the attack remain viable, with the exception of what is shown between 1:40 and 2:14:
The researchers found that they could use AvA to force devices to perform a variety of commands, many of which have serious privacy or security implications. Possible malicious actions include:
- Control other smart devices, such as turning off lights, turning on a smart microwave, setting the heating to an unsafe temperature or unlocking smart door locks. As noted earlier, when Echos needs confirmation, the opponent only needs to add a “yes” to the command about six seconds after the request.
- Call any phone number, including one controlled by the attacker, so that nearby noises can be eavesdropped. Although Echos use a light to indicate that they are on a call, devices are not always visible to users and less experienced users may not know what the light means.
- Making unauthorized purchases with the victim’s Amazon account. Although Amazon sends an email to notify the victim of the purchase, the email may be missed or the user may lose trust in Amazon. Alternatively, attackers can also delete items already in the account’s shopping cart.
- Tamper with a user’s previously linked calendar to add, move, delete, or change events.
- Imitate skills or start an ability of the attacker’s choice. This in turn allows attackers to obtain passwords and personal information.
- Retrieve all utterances from the victim. Using what the researchers call a “mask attack,” an adversary can intercept commands and store them in a database. This could allow the adversary to extract private data, gather information about skills used and infer user habits.
The researchers wrote:
With these tests, we showed that AvA can be used to issue arbitrary commands of any type and length, with optimal results – in particular, an attacker can control smart bulbs with a 93% success rate, successfully purchase unwanted items on Amazon 100% of the times, and sabotage [with] a linked calendar with an 88% success rate. Complex commands that must be correctly recognized in their entirety in order to succeed, such as calling a phone number, have a near-optimal success rate, in this case 73%. In addition, the results in Table 7 show that the attacker can successfully mount a Voice Masquerading Attack via our Mask Attack skill without being detected, and that all issued utterances can be retrieved and stored in the attacker’s database, namely 41 in our case.