Google Home smart speakers and the Google Assistant virtual assistant have been caught eavesdropping without permission, capturing and recording highly personal audio of domestic violence, confidential business calls, and even some users asking their smart speakers to play porn on their connected mobile devices.
In a Wednesday report, Belgian news outlet VRT NWS said it obtained more than one thousand recordings from a Dutch subcontractor who was hired as a “language reviewer” to transcribe recorded audio collected by Google Home and Google Assistant, and help Google better understand the accents used in the language. Out of those one thousand recordings, 153 of the conversations should never have been recorded, as the wake-up command “OK Google” was clearly not given, the report said.
“VRT NWS was able to listen to more than a thousand excerpts recorded via Google Assistant,” according to the report. “In these recordings we could clearly hear addresses and other sensitive information. This made it easy for us to find the people involved and confront them with the audio recordings.”
Google for its part on Thursday acknowledged that VRT NWS had obtained authentic recordings, but argued that its language experts only review around 0.2 percent of all audio snippets.
“As part of our work to develop speech technology for more languages, we partner with language experts around the world who understand the nuances and accents of a specific language,” according to David Monsees, product manager at Google Search. “These language experts review and transcribe a small set of queries to help us better understand those languages. This is a critical part of the process of building speech technology, and is necessary to creating products like the Google Assistant.”
Google did acknowledge that some audio may be recorded by Google Home or Google Assistant without the wake-up word being said, due to error.
“Rarely, devices that have the Google Assistant built in may experience what we call a ‘false accept,’” said Monsees. “This means that there was some noise or words in the background that our software interpreted to be the hotword (like ‘OK Google’). We have a number of protections in place to prevent false accepts from occurring in your home.”
Google also argued that audio snippets are not associated with user accounts in the review process. Despite that, VRT NWS said, “it doesn’t take a rocket scientist to recover someone’s identity; you simply have to listen carefully to what is being said.”
While the incident shows that audio is being collected when users expect the devices to be dormant, it also highlights concerns around third-party security and Google’s data retention and sharing policies, given that a subcontractor leaked these recordings to a news outlet.
Monsees said that the subcontractor is currently under investigation.
“We just learned that one of these language reviewers has violated our data security policies by leaking confidential Dutch audio data,” he said. “Our security and privacy response teams have been activated on this issue, are investigating, and we will take action. We are conducting a full review of our safeguards in this space to prevent misconduct like this from happening again.”
Voice Assistant Data Privacy?
The incident comes as voice assistants such as Amazon Alexa and Google Home are coming under increased scrutiny about how much data is being collected, what that data is, how long it’s being retained and who accesses it.
In April, Amazon was thrust into the spotlight for a similar reason, after a report revealed the company employs thousands of auditors to listen to Echo users’ voice recordings. The report found that Amazon reviewers sift through up to 1,000 Alexa audio clips per shift – listening in on everything from mundane conversations to people singing in the shower, and even recordings that are upsetting and potentially criminal, like a child screaming for help or a sexual assault.
Voice assistants are increasingly being criticized for how they handle private data. In July, Amazon came under fire again after acknowledging that it retains the voice recordings and transcripts of customers’ interactions with its Alexa voice assistant indefinitely – raising questions about how long companies should be able to save highly personal data collected from voice-assistant devices. And last year, Amazon inadvertently sent 1,700 audio files containing recordings of one customer’s Alexa interactions to a random person, and later characterized it as a “mishap” that came down to one employee’s mistake.
Security experts for their part said that voice assistants will soon be a big focus for regulators in light of laws like the General Data Protection Regulation (GDPR).
“I definitely think that we’re going to have, let’s say within the next 15 months, a GDPR ruling on the data-collection policy of home-automation devices,” Tim Mackey, principal security strategist at the cybersecurity research center at Synopsys, told Threatpost. “Voice assistants will probably be high on the list, as would things like video doorbells. And effectively it’s going to be a case of what was disclosed and how was the information processed.”