Tor Project leaders are trying to rein in concerns about an academic paper describing an end-to-end traffic correlation attack that could be used by a well-funded attacker such as a nation state to de-anonymize traffic on Tor.
Executive director Roger Dingledine points out that the researchers behind the paper, “On the Effectiveness of Traffic Analysis Against Anonymity Networks Using Flow Records,” concede a relatively high false positive rate of 6.4 percent in their own experiments.
“That high false positive rate is not at all surprising, since he is trying to capture only a summary of the flows at each side and then do the correlation using only those summaries,” Dingledine wrote in a blog post. “It would be neat (in a theoretical sense) to learn that it works, but it seems to me that there’s a lot of work left here in showing that it would work in practice.”
The researchers postulate they could use NetFlow data produced by Cisco routers to analyze the effects that small “perturbations” injected into traffic by an attacker-owned server would have on the client side. By applying statistical correlation to those changes, the paper says, they can effectively match traffic patterns as they enter and exit the anonymity network.
“To alleviate the uncertainty due to the coarse-grained nature of NetFlow data, our attack relies on a server under the control of the adversary that introduces deterministic perturbations to the traffic of anonymous visitors,” wrote the team of Sambuddho Chakravarty, Marco V. Barbera, Georgios Portokalidis, Michalis Polychronakis, and Angelos D. Keromytis.
“We assume a powerful adversary, capable enough of observing traffic entering and leaving the Tor network nodes at various points,” they wrote. “Such an adversary might be a powerful nation state or a group of colluding nation states that can collaborate to monitor network entering and leaving various [autonomous systems] simultaneously.”
The researchers analyzed flow data using the open source tools ipt_netflow and flow-tools, as well as NetFlow data from an in-house Cisco router. In the lab, relying on data from the open source tools, they were able to unmask the source of anonymous traffic flows 100 percent of the time. Evaluating real-world Tor traffic, however, they were able to detect the source 81 percent of the time; they also reported 12.2 percent false negatives and 6.4 percent false positives.
“We developed an active attack that involves modulating victim traffic at a colluding server and observing its effects in the network using NetFlow,” the paper said. “We rely on statistical correlation (Pearson’s correlation coefficient) to identify the closest matching flow.”
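The matching step the paper describes can be illustrated with a minimal sketch. It assumes the attacker already has per-interval byte counts for the pattern injected at the server and for each candidate client-side flow; the flow names, sample values, and the helper functions pearson and closest_flow are hypothetical and are not taken from the paper, which also deals with binning and alignment of real NetFlow records that this sketch ignores.

```python
import math

def pearson(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no pattern to correlate
    return cov / math.sqrt(var_x * var_y)

def closest_flow(server_pattern, candidate_flows):
    """Return (flow_id, r) for the candidate that best tracks the pattern."""
    scores = {fid: pearson(server_pattern, series)
              for fid, series in candidate_flows.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical byte counts per interval: the server alternates between
# low and high throughput to stamp a recognizable shape on the flow.
server_pattern = [100, 900, 100, 900, 100, 900, 100, 900]

# Hypothetical client-side flows as a flow exporter might summarize them.
candidate_flows = {
    "client_a": [510, 480, 530, 495, 505, 520, 470, 515],  # unrelated, flat
    "client_b": [95, 730, 70, 705, 110, 745, 60, 690],     # tracks the pattern
    "client_c": [300, 310, 650, 640, 305, 315, 655, 645],  # different rhythm
}

best, r = closest_flow(server_pattern, candidate_flows)
print(best, round(r, 3))
```

In this toy data only client_b follows the injected rhythm, so it scores highest. With coarse real-world summaries, an unrelated flow can also score well by chance, which is the false positive concern Dingledine raises.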
Other academics have tried their hand at traffic correlation attacks, with varying degrees of success. These types of attacks are a take on side-channel attacks, in which an adversary measures changes in computer or traffic behavior to learn something about a target rather than trying to exploit a vulnerability. The researchers here assume the attacker owns a number of “network vantage points” and can log and analyze changes in network bandwidth.
“Due to the network traffic perturbations induced by the server, bandwidth monitoring could reveal the relays part of the circuit being attacked, eventually leading to the source,” the researchers said. “Our efforts achieved modest success in confirming the identity of the source of anonymous traffic. On average, about 42% of the connections between victim Tor clients and entry nodes preserved the traffic fluctuations induced by the server.”
The paper describes an attack model in which the victim is lured to the attacker’s server through Tor. The attacker, meanwhile, grabs NetFlow data for the traffic moving between the exit node and the attacker-controlled server, and between the victim’s Tor client and its entry node.
“The adversary has control of the particular server (and potentially many others, which victims may visit), and thus knows which exit node the victim traffic originates from,” the paper says.
Dingledine, meanwhile, tried to put the findings in perspective, pointing to a plethora of similar research and the hurdles such attacks must overcome.
“I should also emphasize that whether this attack can be performed at all has to do with how much of the Internet the adversary is able to measure or control,” he said. “This diversity question is a large and important one, with lots of attention already.”