When the first NSA surveillance story broke in June, detailing the agency’s collection of phone metadata from Verizon, most people had likely never heard the word metadata. Even some security and privacy experts weren’t sure what the term encompassed. Now a group of security researchers at Stanford has started a project to collect data from Android users to see exactly how much information can be drawn from the logs of phone calls and texts.
The project, dubbed Metaphone, is soliciting volunteers who agree to allow various kinds of metadata to be collected from their phones and sent automatically to Stanford’s researchers. The Stanford Security Lab, which is running the project, wants to show that the collection of metadata amounts to surveillance, something that NSA leaders and Congress have said is not the case.
“Phone metadata is inherently revealing. We want to rigorously prove it—for the public, for Congress, and for the courts,” Jonathan Mayer, a PhD student at Stanford and a junior affiliate scholar at the Security Lab, wrote in an explanation of the project.
People interested in participating in the program can download the Metaphone app from Google Play. As part of the project, Metaphone will collect and transmit a variety of information to the researchers. The data will be destroyed at the end of the study.
“In the course of the study, your mobile phone will transmit device logs and social network information to researchers at Stanford University. Device data will include records about your recent calls and text messages. Social network data will include your profile, connections, and recent activity. The data will be stored and analyzed at Stanford, then deleted at the end of the study. Research staff will take reasonable precautions to secure the data in transit, storage, analysis, and destruction,” the researchers said.
In an email interview, Mayer said he hopes the study will provide clear answers on what metadata is and how invasive its collection can be for users.
“We intend to report preliminary results as soon as we have enough crowdsourced data. Phone records are plainly a hot-button issue: Congress is considering intelligence reform legislation, courts are hearing litigation challenges, and many in the public aren’t sure who’s telling the truth. Our aim is to provide rigorous answers about the sensitivity of phone metadata,” Mayer said.
“It is difficult to estimate the amount of data that we need because the quality, not just quantity, of the data coming back will affect how well our learning algorithms work. The general principle, though, is that more data is better and if we want to make a strong claim about metadata we would like to have as much data as possible. However, the analysis can be a continuous process so we can get started once we have some participants and then refine our approach as more data comes in,” said Patrick Mutchler, also of the Security Lab.
This story was updated on Nov. 13 to clarify that the data collection will not be anonymous.
Image from Flickr photos of Harshlight.