Amazon inadvertently sent 1,700 audio files containing one customer’s Alexa recordings to a random person – and after a newspaper investigation exposed the snafu, characterized it as a “mishap” that came down to one employee’s mistake.
In August, an Amazon customer in Germany (going by the alias “Martin Schneider” for purposes of the report) made use of his rights under the recently passed EU General Data Protection Regulation (GDPR) to ask for copies of the personal data Amazon has on file about him.
Amazon complied, sending Schneider a 100MB ZIP file which, among other things, contained about 1,700 Alexa audio files along with transcripts of Alexa voice commands. There was just one problem – Schneider doesn’t use Alexa. After listening to a few of the files, he found that they were clearly recordings of someone else speaking, so he concluded that Amazon had sent him the data in error. But Amazon didn’t respond to his efforts to contact the company about the problem, he said, so he contacted Heise Media’s c’t publication in mid-November.
The shocking part of the story is how quickly the investigative reporters were able to identify the victim. From the recordings, which cover the entire month of May 2018, they were able to determine that he has a Fire TV and an Echo box, and that he uses Alexa to control a smart home thermostat as well as his phone. A female voice speaking to Alexa indicates that he also has a female companion. They were also able to hear the man in the shower while he was issuing certain commands. The files also included alarms, Spotify commands, and public transport and weather inquiries.
“We were able to navigate around a complete stranger’s private life without his knowledge, and the immoral, almost voyeuristic nature of what we were doing got our hair standing on end,” the investigators noted in their report, published on Thursday.
They were further able to identify and track down the victim via Twitter.
“Using these files, it was fairly easy to identify the person involved and his female companion; weather queries, first names, and even someone’s last name enabled us to quickly zero in on his circle of friends,” according to the report. “Public data from Facebook and Twitter rounded out the picture.”
Needless to say, the victim was shocked. He said that he too had filed an information request – clearly there was a mix-up. The investigators notified Amazon of the data breach, to which it responded that the situation was an “unfortunate mishap” and a one-time error. Amazon called both of the impacted customers as well.
A spokesperson for the tech giant told us: “This was an unfortunate case of human error and an isolated incident. We have resolved the issue with the two customers involved and have taken steps to further improve our processes. We were also in touch on a precautionary basis with the relevant regulatory authorities.”
This isn’t the first time Amazon Alexa has presented privacy issues. Earlier this year, a family in Portland, Ore., said their Echo device recorded their conversation and sent it to a random person on their contact list. And researchers have uncovered more than one way to hack Alexa devices in order to eavesdrop on people. This, however, may be the first publicized instance of a manual human error resulting in an issue for the voice-recognition technology.
“This news isn’t a big surprise to me,” Boris Cipot, senior sales engineer at Synopsys, said via email. “If I recall correctly, Amazon also uses voice data for learning purposes to make its voice AI better. The Alexa App used to also show you the transcript of all the questions you have asked it, where you could also give your feedback as to whether it was handled correctly or not. As this data is then stored somewhere for ‘learning’ purposes, this then also poses the risk that if the data is not handled correctly it could have bad consequences.”
Recording Alexa interactions and keeping them in the cloud is indeed necessary to improve the platform over time, according to Amazon; the company also allows users to review and delete voice recordings, according to its data privacy FAQ.
“The inner workings of Amazon are a mystery,” Cipot added. “Even if they would like to have more transparency, they also need to keep some secrets for the sake of security and also company secrets of how they do things. Every company (be it Amazon, Google, Apple…) has those secrets, and every smart device (be it Echo, a Smartphone or even a TV) have the same functionalities that we don’t know all the inner workings of, along with the data they collect and store.”
This story was updated at 12:06 p.m. EST with a comment from Amazon.