The practice of history sniffing, which has been regarded as out of bounds and a serious privacy violation for the better part of a decade, is still being used by some ad networks, researchers have found. A recently completed study by researchers at Stanford University’s Center for Internet and Society found that at least one ad network is apparently still using the technique to gather information about which links users have clicked and which sites they’ve visited.
History sniffing by ad networks and others first came to the attention of the general public late last year when researchers at the University of California at San Diego conducted a large-scale experiment on thousands of popular sites to see whether they performed history stealing and other kinds of behavior tracking. It turned out that the practice was fairly widespread and some sites said after the research went public that they’d stop using the technique.
The Stanford researchers recently conducted an experiment involving a Web testing platform they’d developed. During the test they found that one marketing company, Epic Marketplace, was using a script to test thousands of URLs per second against a user’s history to see whether the user had been to the sites. Epic was using an invisible iFrame to test the URLs, so there was no visible cue to the user that the script was running.
Jonathan Mayer, a graduate student at Stanford who helped conduct the research, said that it became obvious rather quickly what was going on during the experiment.
“There were only a couple of sites making ten thousand calls per second. Usually it’s some hacky coding that causes that, but going back through the results, we saw a ton of calls coming from trafficmp.com [a URL owned by Epic],” Mayer said. “This was so easy to spot because it was egregious. I think there’s a lot of consensus about history stealing. It was never intended that browsers would share their history with the pages they visit.”
In its test, the Stanford team found that what the script returns to Epic is a deduplicated list of so-called interest segments that show what the general topic of the page the user visited is. It doesn’t return specific URLs. Among the pages that Mayer’s team observed the script checking were pages on fertility treatments, pages about debt relief at the FTC and IRS and pages regarding other health information at the National Institutes of Health and the University of Maryland.
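The reduction from visited pages to a deduplicated list of interest segments can be illustrated with a short sketch. This is not Epic's script: the domain names, the segment labels and the `segments_for` function below are all hypothetical, chosen only to show how specific URLs could be collapsed into general topics before anything is sent back.

```python
from urllib.parse import urlparse

# Hypothetical rules mapping a domain to an interest segment. The actual
# list used by the script is not given in the research described here.
SEGMENT_RULES = [
    ("example-fertility-clinic.test", "fertility"),
    ("example-debt-help.test", "debt relief"),
    ("example-health-info.test", "health"),
]


def segments_for(visited_urls):
    """Return the interest segments matched by the URLs, deduplicated,
    in the order each segment was first seen."""
    segments = []
    for url in visited_urls:
        host = urlparse(url).hostname or ""
        for domain, segment in SEGMENT_RULES:
            # Match the domain itself or any of its subdomains.
            matches = host == domain or host.endswith("." + domain)
            if matches and segment not in segments:
                segments.append(segment)
    return segments
```

In this sketch, two different pages on the same hypothetical fertility site yield a single "fertility" entry, so the output names topics rather than the URLs that produced them.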
At its most basic level, history sniffing is used by some Web sites and ad networks to determine which other sites a visitor has been to, and the results can be used to display targeted ads. It’s often accomplished with a script that checks the color of links on a page, since visited links are rendered in a different color than unclicked ones. In some scenarios it can also be used to steal Web cookies or track users.
The tactic is enabled by a flaw in older browsers that allowed a page to probe the browsing history of the users who visited it. All of the major browsers have fixed the bug, but users who still run outdated versions of Firefox, Internet Explorer or other browsers are vulnerable to history sniffing.
As the UCSD researchers put it, “The attack uses the fact that browsers display links differently depending on whether or not their target has [been visited ...] URL in a hidden part of the page, and then uses the browser’s DOM [to] inspect how the link is displayed. If the link is displayed as a visited link, the target URL is in the user’s history.”
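The logic of that check can be shown with a toy model. The sketch below is written in Python rather than browser JavaScript and is not the script any ad network used: `ToyBrowser`, `computed_link_color` and the two color values are hypothetical stand-ins for an older, unpatched browser that reveals a link's visited styling to the page.

```python
VISITED_COLOR = "purple"    # assumed styling for visited links
UNVISITED_COLOR = "blue"    # assumed styling for unvisited links


class ToyBrowser:
    """Hypothetical stand-in for an unpatched browser that styles links
    according to the user's history and lets page scripts read the style."""

    def __init__(self, history):
        self._history = set(history)

    def computed_link_color(self, url):
        # In the toy model, this is what a page script reads back after
        # placing a hidden link to `url` on the page.
        return VISITED_COLOR if url in self._history else UNVISITED_COLOR


def sniff(browser, candidate_urls):
    """Return the candidate URLs that the toy browser renders as visited."""
    return [
        url for url in candidate_urls
        if browser.computed_link_color(url) == VISITED_COLOR
    ]
```

The page never reads the history directly. It can only ask about one candidate URL at a time, which is consistent with the very high call volume the Stanford team observed.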
Privacy advocates, security experts and others have denounced the practice as unfair to users, and a large percentage of Web users probably have no idea that it goes on.
In a blog post responding to the Stanford research, Epic officials disputed the researchers’ claims and said that what the script is doing is called “segment verification.”
“The practice described in the blog, better labeled as ‘segment verification’ (indeed, as admitted in the blog, no URLs or URL history is actually collected) provides companies with a way to measure the accuracy of the data that a company purchases from data vendors without compromising consumer privacy. NO data obtained from segment verification is personally identifiable information (PII), nor is that data ever merged with other data points that are, or may be, personally identifiable,” the blog post says. Epic did not respond to a request for further comment.
However, Mayer and others said that what’s going on is still history sniffing, regardless of the label that’s used.
“It doesn’t really matter whether Epic calls it something else, it’s still history sniffing. Segment verification sounds good and boring and no one will care about it,” said Chris Soghoian, a privacy advocate and researcher. “There are going to be hundreds of millions of consumers out there with unpatched browsers for the foreseeable future. They won’t get the savviest consumers, but there’s no prohibition against it, so there’s no reason for ad networks not to do it. They can pay an engineer to spend twenty hours writing something to do this and there’s nothing stopping them.”
The Stanford research also found that even after a user opted out of tracking by Epic, the tracking cookie persisted and the history sniffing continued.