Skyping and Typing the Latest Threat to Privacy

A research paper explains how attackers can use recordings of keystroke sounds captured in a Skype conversation to guess what’s being typed.

Multitasking while on a work-related Skype call may be good for productivity, but perhaps not so much for privacy.

Typing while on Skype or another Voice over Internet Protocol (VoIP) service gives an attacker an opportunity to record the conversation, separate out the acoustic emanations from the typing, and use previous work in this field to analyze the sounds and accurately guess what’s being typed.

The research was presented in a paper published this week called “Don’t Skype & Type! Acoustic Eavesdropping in Voice-Over-IP,” written by Alberto Compagno of the University of Rome, Mauro Conti and Daniele Lain of the University of Padua, and Gene Tsudik of the University of California, Irvine.

Tsudik told Threatpost that previous similar attacks require an adversary to be physically close to the target, precisely profile their typing style and physical keyboard model, and have access to typed information and corresponding sounds. All of this adds up to a generally impractical attack, unlike the one described in the paper.

Because this new take on an old attack is carried out over a VoIP service (Skype in this case), it would be much easier to pull off, and the attacker can apply existing research and analytical techniques through machine-learning tools.

“The main idea is the same; most physical keyboards make distinct sounds, like musical instruments,” Tsudik said. “Each key on the same keyboard sounds a little differently and produces sufficiently different sounds to map them to different keys.”
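
The idea Tsudik describes, that each key has its own acoustic signature which can be mapped back to the key, can be illustrated with a toy sketch. This is not the researchers' method, which the article does not detail. The key frequencies, sample rate, and noise level below are invented for illustration, the "keystrokes" are synthetic damped tones rather than real recordings, and a simple nearest-centroid classifier stands in for the machine-learning tools mentioned above.

```python
import math
import random

SAMPLE_RATE = 8000   # Hz (assumed for this toy example)
N_SAMPLES = 256      # samples per synthetic keystroke

# Hypothetical resonant frequency for each key; real keyboards are far messier.
KEY_FREQS = {"a": 600.0, "s": 900.0, "d": 1200.0, "f": 1500.0}
# Frequencies at which signal energy is measured: 500 Hz to 1600 Hz.
PROBE_FREQS = [500.0 + 100.0 * i for i in range(12)]


def synth_keystroke(key, rng, noise=0.2):
    """Toy keystroke: a damped sinusoid near the key's frequency, plus noise."""
    freq = KEY_FREQS[key] * (1.0 + rng.uniform(-0.01, 0.01))
    return [
        math.exp(-t / 80.0) * math.sin(2 * math.pi * freq * t / SAMPLE_RATE)
        + rng.gauss(0.0, noise)
        for t in range(N_SAMPLES)
    ]


def features(signal):
    """Normalized energy at each probe frequency (a crude spectrum)."""
    energies = []
    for freq in PROBE_FREQS:
        w = 2 * math.pi * freq / SAMPLE_RATE
        re = sum(x * math.cos(w * t) for t, x in enumerate(signal))
        im = sum(x * math.sin(w * t) for t, x in enumerate(signal))
        energies.append(re * re + im * im)
    total = sum(energies) or 1.0
    return [e / total for e in energies]


def train(samples):
    """Average the feature vectors of labeled (key, signal) pairs per key."""
    sums, counts = {}, {}
    for key, signal in samples:
        feat = features(signal)
        acc = sums.setdefault(key, [0.0] * len(feat))
        for i, value in enumerate(feat):
            acc[i] += value
        counts[key] = counts.get(key, 0) + 1
    return {k: [v / counts[k] for v in acc] for k, acc in sums.items()}


def classify(centroids, signal):
    """Return the key whose average feature vector is closest to the signal's."""
    feat = features(signal)
    return min(
        centroids,
        key=lambda k: sum((a - b) ** 2 for a, b in zip(feat, centroids[k])),
    )


def demo(seed=0):
    """Train on 10 synthetic presses per key, then score 10 fresh presses per key."""
    rng = random.Random(seed)
    train_set = [(k, synth_keystroke(k, rng)) for k in KEY_FREQS for _ in range(10)]
    test_set = [(k, synth_keystroke(k, rng)) for k in KEY_FREQS for _ in range(10)]
    centroids = train(train_set)
    correct = sum(classify(centroids, sig) == k for k, sig in test_set)
    return correct / len(test_set)


if __name__ == "__main__":
    print(f"accuracy on synthetic keystrokes: {demo():.2f}")
```

The labeled training set here plays the role of an attacker who already has sound samples from the target's keyboard; without that prior knowledge the mapping becomes much harder, which fits the lower accuracy the paper reports for the unprofiled case.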

The accuracy of the technique is fairly high, even if the attacker is starting from scratch with little knowledge of the target’s typing inclinations. According to the paper, if the attacker has some knowledge of the target’s typing style and knows the keyboard model, a random key is guessed correctly 92 percent of the time. If the attacker knows neither of these attributes, the accuracy is still a whopping 42 percent.

“I was surprised at the accuracy,” Tsudik said, admitting the analysis was done in a fairly “clinical” setting with a typical hunt-and-peck style of typing. He said that analysis from a VoIP recording may be more of a challenge if a skilled typist were at the keyboard, or if multiple parties on a call were typing at the same time.

Regardless, the work is a proof of concept showing that an attacker could glean sensitive secrets from a Skype session, such as banking passwords or the content of an email, by applying this type of analysis.

“I personally wind up in teleconferences over Skype where the parties are not necessarily friends,” Tsudik said. “Because I’m having a Skype call doesn’t mean I trust them, even though I probably know them.”

A determined attacker, meanwhile, could have scouted their target in advance and may know the type of laptop or the brand of physical keyboard the target uses, giving them an advantage for the attack.

Tsudik pointed out as well that touchscreen keyboards or projection keyboards are immune to this type of surveillance.

Moving forward, the researchers are expected to expand their work to include Google Hangouts and other VoIP services.

“We are starting to analyze Hangouts as a sanity check to see if it’s not just a Skype thing,” he said. “We will broaden the attack to include more keyboards. My guess is that it will get easier and faster with external keyboards and with noise in the background where there’s talking over typing. My guess is there would be still a possibility of high accuracy, but we need proof.”
