No ‘Silver Bullet’ Fix for Alexa, Google Smart Speaker Hacks

Karsten Nohl, who was behind this week’s research that outlined new eavesdropping hacks for Alexa and Google Home, says that privacy for smart home assistants still has a ways to go.

Researchers this week disclosed new ways that attackers can exploit Alexa and Google Home smart speakers to spy on users. The hacks, which rely on the abuse of “skills,” or apps for voice assistants, allow bad actors to eavesdrop on users and trick them into telling them their passwords over the smart assistant devices.

Unfortunately, when it comes to smart speakers, “there’s no silver bullet” for protecting the privacy and security of data, said Karsten Nohl, managing director at Security Research Labs. Nohl, a cryptography expert and hacker, has been behind several high-profile research projects, including the 2014 BadUSB hack.

“I think it’s important to flag this technology as a convenience-enhancing technology,” Nohl told Threatpost. “So if you wanted to read the Daily News or weather or even horoscope, I think that’s fine, but be aware that this is a technology that should not be trusted with credit card numbers, medical information, or any other information that goes beyond convenience and actually intrudes your privacy. That of course, also applies to the placement of these devices, they probably shouldn’t be sitting in boardrooms or hospitals, on trading floors of large companies. They are a convenience enhancing technology that is probably better placed in more leisure environments right.”

Listen to Threatpost’s full interview with Nohl, below, or download direct here.

Below find a lightly-edited transcript of the Threatpost podcast.

Lindsey O’Donnell: Welcome back to the Threatpost podcast. This is Lindsey O’Donnell with Threatpost and I’m joined today with Karsten Nohl with Security Research Labs. Karsten, thanks so much for joining us today.

Karsten Nohl: Thanks for having me Lindsey.

LO: Well, just as a quick intro for our listeners, Karsten is a German cryptography expert and hacker and his areas of research include GSM security, RFID security, privacy protection, and you may know him best if you were at Black Hat 2014, where he presented his research on the ‘BadUSB’ hack, but today we’re going to talk about some new research from Karsten that Security Research Labs drop this week. And that’s about smart home assistants, specifically Alexa and Google Home. So I don’t know about you Karsten, but I actually have an Alexa. So that makes this discussion a little more creepy on my end, I’m hoping that I’m not driven to go throw it in the trash by the end of the podcast.

KN: Let’s see. Yeah, probably no reason for concern yet. But definitely a maturing technology that needs a lot more attention from security researchers than it has gotten in the past.

LO: For sure, well, looking at your research, it was specifically about how developer interfaces could be used to essentially turn digital home assistance into”smart spies.” So the hack that you guys discovered enabled you to request and collect personal data, including user passwords, and eavesdrop on users after they believe that smart speakers had stopped listening. So just to start, can you talk about the background of this research? I mean, just from the get go, what made you decide that you want to do focus on smart speakers for this research project?

KN: Yeah, that’s actually an interesting story all in itself, because we didn’t come up with this research idea. This was a community submission. We run an event every year, we call it “Hack the Beach.” We fly people out to a beach for months and mentor them through a research journey. And this was this idea was submitted by Louisa, a math student in Berlin here who specifically said I can’t program and I still want to hack something. So I have to pick a target that non-programmers can interact with in a meaningful way. So it was her idea to try to explore the boundaries of the programming interfaces of both Alexa and Google Home. And then within that one month on the beach, she came up with all of these exploits and ultimately, showed the world now what is possible. So the background wasn’t a follow up on any of our previous research, but almost the opposite approach, somebody saying, “I have no hacking experience whatsoever, and yet, I can have a hacker mindset, testing the boundaries of a technology and trying to find corner cases that lead to unexpected behavior.”

LO: Right. That’s really, really interesting. So talk about what specifically you guys found, what was the feature that was in focus that these different attacks kind of stemmed from?

KN: So we found a couple of things that both undermined kind of the trust model that smart speakers suggest you can rely on. And let me first kind of go over the trust model before saying how we breached it. So the trust model seems to be intuitively that as long as you’re in a dialogue with a smart speaker, you say something, the speaker says something back, you might answer questions, you get a response, as long as that dialogue is upheld, you assume that you’re being listened in on, because you’re talking to the device. But as soon as the conversation stops, there’s either a stop command or something else that suggests that the dialogue has stopped. You will be right to assume that you’re not being listened in on anymore, and what we found are a couple of things now to breach that trust model.

First of all, we noticed that unpronounceable symbols, basically things you can type on a keyboard, but that you cannot pronounce in any language, those are perfectly valid statements that you can send to an Alexa to speak out loud. Only this will result in silence, right? So we as the programmers, we can feed the Alexa or the Google Home with these unpronounceable letter sequences. And the Alexa or Google Home will not say anything, however, it’s upholding the dialogue. So you’re sitting in your room long after the application has stopped (at least you think it has stopped). And you’re saying something, Alexa will respond to it in silence, you say something, Alexa response, and so forth, creating the possibility to listen in on people, long after they expected those home speakers to have stopped listening. But it’s the first thing that we found. And that gives us basically a spying capability.

The second thing is just a slight addition to that; after a while, say 30 minutes, we could go back to actually saying something other than just silence. For instance, “Here’s a security update available for your Alexa device. Please tell us your password to enable the security update.” So the user who might still be in reach of Alexa has long assumed that the app has stopped working and that this new interaction is now initiated by Alexa or by Google Home. And they will in many cases give us the password. What they don’t assume is that this is the same app still running from 30 minutes ago, from an hour ago, that has been listening in on them the entire time. And now even stealing that password. So those were the little tweaks, little inconsistencies, in the way Alexa and Google Home used language that allowed us to breach that trust model.

LO: And what’s freaky too is that even beyond asking for passwords, it sounds like you guys could also ask for credit card information, social security numbers, all kinds of malicious applications that you could use to kind of leverage this hack. So even taking it a step further, it certainly sounds like it could be launched in a number of different ways there.

KN: That’s correct. Yeah. Basically, we made those speakers into, into remote controllable spies, right as long as the user launches any of our voice apps called “skills” or “actions” in the different platforms, so little extensions that give those voice assistants features that are not coming from Amazon or from Google, but from third party developers like us.

Now, in that sense, these echoes are very similar to smartphones. But also we had to learn over the years that you don’t just install every third-party app, there’s a certain trust involved with installing something, it might look at your contacts, it might steal your location. So you have to trust applications that you install on your phone at least to some degree. Now with voice speakers in situation is a little bit more complicated because you never actually install anything. There isn’t a set of applications that you specifically selected to be used. Instead, each of those applications has certain trigger words. One of the ones we created, for instance, says, “Hey, Alexa, please read me my daily horoscope.” So anybody who said that anywhere on any Alexa and the world automatically launched our skill in the background, not knowing whether this actually came from Amazon or from us, it’s very blurry.

So it’s unclear where people breach their trust boundaries and install something that’s coming from a third party, because it is so indistinguishable what they use in the first place. And I think in that sense, the smart speakers have gone one step further – and perhaps one step too far – towards convenience and away from controllable security and privacy.

LO: Right. I mean, that brings up a really interesting point too about the security reviews for these different skills or actions. And I know, you know, looking at your report, when you were talking about the password request change hack, one of the issues was that you were able to create a seemingly innocuous skill or action. So probably similar to your example about the horoscope skill. And then Amazon and Google would then review the security of that skill and publish it, but then you were able to change the functionality afterwards. And then that did not prompt another review from them. Can you kind of break down that and how the security review process is in terms of these different skills?

KN: Absolutely. So just like with smartphone apps, your third-party contributions need to be reviewed. And that usually takes a day or so. So the assumption is that there’s actually a manual review that’s happening on these apps. And that happens only once, when the app is originally submitted. And after that, you can’t change everything about an app. But you can change enough details to go from a completely benign app to an app that shows this malicious behavior of spying and phishing for passwords and other information that are described. So the review process seems to be somewhat inconsistent in that you approve something, but then still allow it to be changed in a meaningful and significant way. Both Amazon and Google have started looking within the app store equivalents in these voice assistants for malicious behavior. And the apps that we had put in the stores a couple of weeks ago have since been removed, but it is not a pre-approval before these apps are being used, it’s rather an anomaly detection after the usage that then gets you banned, which still leaves a window of opportunities for criminals, because they can apparently upload in any number of skills and then hope that at least some of them have a longer lifetime in the store and can be used, for instance, for password phishing during that time.

LO: Right. Well, that kind of ties into another question I had, which was around disclosure of the issue. How did Amazon and Google react when you guys notified them of it? And you know, on that note, it does sound a little bit like they’re responding and trying to take action around this issue, but like you said, it may not be a full solution at this point.

KN: Both Google and Amazon have mature vulnerability disclosure processes, and we submitted with, you know, the typical three plus months that any company expects to have to fix issues, the time. Now companies don’t necessarily give much insights into what they’re fixing, how and when, to the researcher. And they don’t have to, as long as they’re fixing it. We were surprised that despite the three plus months of the time, that the issues hadn’t been fixed when we went live and that now apps are being removed, that should have been kicked out weeks or months earlier. And that still seems to be a reactive rather than a proactive process. But I can’t really comment on what happens inside those companies, because they don’t give us many insights.

I should note that that Google did give us a reward even a bug bounty to reward which we donated to a homeless shelter here in Berlin.

LO: That’s nice. Okay. I’m curious too how did the hacks differ when you look at those two different hacks around both eavesdropping and then asking for passwords using kind of that voice phishing method. Do they differ at all between Google Home and Alexa? And if so, how?

KN:  They do. And we’re somewhat lucky to have found similar outcomes for both of these platforms, showing that the both of these platforms are vulnerable to the same attacks. And that’s important for us to show so that we’re not criticizing a company in particular, but rather a technology where several companies seem to be making similar mistakes, but the actual in detail mistakes was slightly different. So around eavesdropping in particular, for Amazon, we have to go through this loop of interaction of always outputting some silence and listening to the user outputting some silence, based on a programming construct that relies on predefined words. So basically, we can only eavesdrop on you, as long as you are actually speaking, let’s say within 30 seconds, you’re saying at least one of the trigger words put into the application. So that eavesdropping capability stops, either when you stop talking, or when you’re using none of a few hundred predefined words in the app. Whereas with Google, that constraint isn’t there, or at least wasn’t there when we last checked. So here we can really indefinitely listen in on the user, even if there’s nothing that would trigger a specific command. So actual flaws differed slightly. But also, we don’t claim to have found all the flaws and all the nuances of these flaws. We just found a couple and with that was able to show that both of these platforms had some structural issues.

LO: Right, really interesting. I feel like there have been a string of concerning privacy incidents regarding smart speakers. You know, for example, if you remember earlier in July, Amazon came under fire for acknowledging that it retains the voice recordings and transcripts of customers interactions with Alexa indefinitely. And then it’s not just Alexa it was Google Home and even Apple around Siri had some similar issues. So I think that really raises kind of questions about how highly-personal data is being collected and saved from voice assistant devices, and most importantly, how it’s being shared. What do you think about the state of privacy in smart speaker devices overall, I mean, especially on the heels of research, like yours, like previous research around different skills and actions and things like that?

KN: From my perspective, being a privacy advocate myself, I think these smart speakers take the convenience aspects a little bit too serious, and privacy and security implications are neglected as a result of choosing in a trade off between convenience and security, more convenience. And I think that’s almost by design. What we learned through computers and then later smartphones is that giving feedback to users is an integral part to helping them stay secure.

Silly things like some security seals or some virus scanner that blinks green, to more significant, visual feedback, where people see that now they’re on HTTPS, for instance, instead of HTTP, things like that. None of that is possible in a technology where feedback is constrained to just voice, and that voice on top of everything is controlled by a third-party developer. So you can’t really trust anything in particular that’s coming out of an Alexa or Google Home with skills and actions, and there’s nothing beyond that voice output that would give you an indication of whether you’re secure or private right now. I’ll give you one example to illustrate that and this is coming out of another line of research from last year, somebody created an application for Alexa called “Capital Won,” spelled W-O-N. And obviously when pronounce it sounds very much like a large American bank. So if you now trigger a “Capital Won” voice skill, Alexa had to decide which word you just said. Was it ONE or was it WON? And, in many cases, had to guess, and guessed wrong many times, right?

And because there was no feedback to the users to what app was not chosen, there was absolutely no way for them to know that they just fell into a trap where now somebody might, with very good reason, ask for their credit card number and receive it, because they think they’re talking to their own bank.

LO: That’s really interesting and pretty disturbing too I mean, the red flags that we are kind of trained at this point to look out for in phishing emails or online are not there when you are talking to a device and using your voice instead of how you see things. So on that note, I mean, what can consumers do to look out for these types of hacks and to also ensure that they keep as little data from being collected by their smart speakers as possible?

KN: There’s no silver bullet here. I think it’s it’s important to  flag this technology as a convenience-enhancing technology. So if you wanted to read the Daily News or weather or even horoscope, I think that’s fine, but be aware that this is a technology that should not be trusted with credit card numbers, medical information, or any other information that goes beyond convenience and actually intrudes your privacy. That of course, also applies to the placement of these devices, they probably shouldn’t be sitting in boardrooms or hospitals, on trading floors of large companies. They are a convenience enhancing technology that is probably better placed in more leisure environments right.

Now, over time, I would imagine that those red flags that you mentioned, will be introduced somehow, that there will be a green light if you are now speaking to a pre-approved, vetted company and a red light if it’s an app from some third party that have no such credentials. But we are at the very beginning of this journey, of adding security on top of all the convenience that people are currently enjoying.

LO: Hopefully in the future, we will start to see more of those solutions built in, or at least have a better idea of how to fix these privacy concerns. But Karsten, thank you again for coming on to the Threatpost Podcast and talking a little bit more about your discoveries around Alexa and Google Home.

Karsten: It’s been great. Thank you very much.

LO: And catch us next week on the Threatpost podcast.

Suggested articles

Hey Alexa, Who Am I Messaging?

Research shows that microphones on digital assistants are sensitive enough to record what someone is typing on a smartphone to steal PINs and other sensitive info.