Facebook Exposed Dataset Debacle: Who’s Really To Blame?


After two databases were discovered leaking Facebook data, researchers say the onus lies on all parties involved as data collection continues to grow.

UPDATE

The discovery of millions of Facebook records leaked from publicly-exposed AWS storage buckets has left researchers wondering where the responsibility lies.

The two separate datasets, disclosed Wednesday by researchers at UpGuard, were held by two app developers, Cultura Colectiva and At the Pool. The source of the records in these databases (such as account names and personal data) was Facebook. Throwing another wrench into the mix is the fact that the data lived on publicly accessible Amazon Web Services S3 buckets.

While both exposed databases have since been secured, the incident has left the security community scratching their heads when it comes to pinning the blame. And the incident is indicative of a larger issue as more and more troves of data are collected by massive companies and their third-party partners.

“Today, the situation is far from perfect,” researcher Bob Diachenko told Threatpost. “I think that in the nearest future big companies should be at least partially responsible for all datasets they share with their partners and require the same security standards they apply to themselves. If you can prove you can handle the data, then you can take it and use it. But when it is leaked, onus should be distributed between the source of the data and its holder.”

The first publicly-exposed dataset originates from a Mexico-based media company, Cultura Colectiva, and contains over 540 million records including comments, likes, reactions, account names and more.

Cultura Colectiva collected data on user responses to their Facebook posts, enabling them to tune an algorithm for predicting which future content will generate the most traffic.


In a statement sent to Threatpost, the company said that “all the publicly available data provided to us by Facebook, gathered from the fanpages we manage as a publisher, is public, not sensitive, and available to all users who have access to Facebook.”

The second publicly exposed dataset, a backup from a Facebook-integrated app called At the Pool, exposed plaintext passwords for 22,000 users and other data via an Amazon S3 bucket.

At the Pool, which launched in 2011, was an app integrated into Facebook’s platform that served as a way of introducing users to potential new friends. The “At the Pool” webpage ceased operation in 2014.

Researchers notified Facebook about the Cultura Colectiva data on Jan. 10, and again on Jan. 14. There was no response, researchers said. Because the data was stored in Amazon’s S3 cloud storage, researchers then notified Amazon Web Services on Jan. 28. AWS acknowledged the report but, researchers said, also took no action.

A Facebook spokesperson told Threatpost that the leaky data servers are under investigation, but that the exposed databases are a “violation of policy” on the part of the app developers.

“Facebook’s policies prohibit storing Facebook information in a public database,” the spokesperson told Threatpost. “Once alerted to the issue, we worked with Amazon to take down the databases. We are committed to working with the developers on our platform to protect people’s data.”

Indeed, one of Facebook’s developer policy rules states: “Protect the information you receive from us against unauthorized access, use, or disclosure. For example, don’t use data obtained from us to provide tools that are used for surveillance.” In addition, the policy also mandates that apps that stop using Facebook’s platform “promptly delete all user data you have received.”

Amazon also plays a part, as the data lived on publicly accessible Amazon Web Services S3 buckets. However, AWS customers are ultimately responsible for ensuring that their S3 buckets are encrypted and not publicly accessible.

“AWS customers own and fully control their data,” an Amazon spokesperson told Threatpost. “When we receive an abuse report concerning content that is not clearly illegal or otherwise prohibited, we notify the customer in question and ask that they take appropriate action, which is what happened here. While Amazon S3 is secure by default, we offer the flexibility to change our default configurations to suit the many use cases in which broader access is required, such as building a website or hosting publicly downloadable content. As is the case on premises or anywhere else, application builders must ensure that changes they make to access configurations are protecting access as intended.”

Amazon, for its part, has worked to help users better secure their AWS S3 buckets, adding features such as security warning notifications. Amazon did not initially respond to multiple requests for comment from Threatpost.
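To illustrate the kind of check Amazon is describing, here is a minimal sketch of how a developer might spot a bucket access control list (ACL) that grants access to everyone. It assumes the ACL has the dictionary shape that the AWS SDK for Python returns when asked for a bucket's ACL; the function name and the sample ACL are hypothetical, made up for this example. It does not inspect bucket policies, which can also make data public, so it should not be read as a complete audit.

```python
# Group URIs that S3 uses for its predefined "everyone" grantees.
PUBLIC_GROUP_URIS = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}


def public_grants(acl):
    """Return the permissions that an ACL dict grants to public groups.

    Hypothetical helper for illustration; `acl` is assumed to look like
    {"Grants": [{"Grantee": {...}, "Permission": "..."}]}.
    """
    found = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        if grantee.get("Type") == "Group" and grantee.get("URI") in PUBLIC_GROUP_URIS:
            found.append(grant.get("Permission"))
    return found


# Made-up ACL: the owner has full control, and anyone on the internet can read.
sample_acl = {
    "Grants": [
        {
            "Grantee": {"Type": "CanonicalUser", "ID": "example-owner-id"},
            "Permission": "FULL_CONTROL",
        },
        {
            "Grantee": {
                "Type": "Group",
                "URI": "http://acs.amazonaws.com/groups/global/AllUsers",
            },
            "Permission": "READ",
        },
    ]
}

print(public_grants(sample_acl))
```

An empty result means the ACL itself grants nothing to the public groups; any entry in the list is a grant that someone chose to add on top of the defaults.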

Diachenko, for his part, said he thinks some of the responsibility does lie on the shoulders of Facebook and Amazon. In Facebook’s case, as the source of the data, that data could at least have been encrypted, he said. Meanwhile, “Amazon should act quicker when it comes to a responsible disclosure report,” he said.

Todd Shollenbarger, chief global strategist at Veridium, agreed that both the source of the data and its partners should share the onus.

“In the case of Facebook, this very basic security violation exposes two things: Facebook’s biggest selling point to potential advertisers (that it knows to whom it is advertising and can provide genuine insight into the digital journeys of its users) is not necessarily true, and hundreds of Relying Party enterprises mistakenly, if not negligently, relied on Facebook’s assumed good security practices,” he said.

March 2018’s Facebook and Cambridge Analytica data debacle raised similar concerns about who owns data. In that incident, data was harvested by app developers rather than accidentally exposed. In both cases, however, Facebook said that app developers were violating its policies. The situation raised concerns about how Facebook can successfully enforce its app developer policies.

Moving forward, massive data-chugging companies like Facebook need to keep firm tabs on who is using their consumers’ data, where that data is going, and how it’s being secured, researchers agreed.

“Amid the growing number of data privacy laws, all companies must pay more attention to their digital third parties’ data security and privacy practices—not doing so will come at a heavy price,” said Mike Bittner, digital security and operations manager at The Media Trust. “In the near future, companies that protect consumer data will be those that earn consumers’ trust as well as their business.”

This article was updated on April 5 at 8am ET with statements from Cultura Colectiva and Amazon.
