Medical Data Leaked on GitHub Due to Developer Errors

Up to 200,000 patient records from Office 365 and Google G Suite exposed by hardcoded credentials and other improper access controls.

Developer error caused the leak of 150,000 to 200,000 patient health records stored in productivity apps from Microsoft and Google that were recently found on GitHub.

Dutch researcher Jelle Ursem discovered nine separate files of highly sensitive personal health information (PHI) from apps such as Office 365 and Google G Suite from nine separate health organizations. He had difficulty reaching the companies whose data had been leaked and so eventually reported the breach to DataBreaches.net, which worked with him to publish a collaborative paper, “No Hack When It’s Leaking,” on the findings.

The title refers to the discovery that the information was exposed not through an attack or unauthorized entry into the health systems, but because of developers’ improper configuration of access controls and hardcoded credentials in the storing of the information, according to the paper.

The errors developers made included embedding hard-coded login credentials in code instead of making them a configuration option on the server the code runs on; using public repositories instead of private ones; failing to use two-factor or multifactor authentication for email accounts; and abandoning repositories instead of deleting them when no longer needed, the researchers wrote.
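As an illustration of the first point, here is a minimal Python sketch of reading credentials from the server's environment rather than embedding them in source code. It is not taken from the paper; the variable names SFTP_HOST, SFTP_USER and SFTP_PASSWORD and the function name are hypothetical and chosen only for this example.

```python
import os


class MissingCredentialError(RuntimeError):
    """Raised when a required setting is absent from the environment."""


def load_sftp_credentials(env=None):
    """Read connection settings from the environment instead of the code.

    The variable names below are hypothetical; a real deployment would use
    whatever names its configuration system defines.
    """
    env = os.environ if env is None else env
    try:
        return {
            "host": env["SFTP_HOST"],
            "username": env["SFTP_USER"],
            "password": env["SFTP_PASSWORD"],
        }
    except KeyError as exc:
        # Fail loudly and name the missing setting, without echoing any secret.
        raise MissingCredentialError(
            f"missing required setting: {exc.args[0]}"
        ) from None
```

Because the values live on the server and not in the repository, pushing this file to a public repository would expose the names of the settings but not the credentials themselves.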

Ursem, the self-appointed “lamest hacker you know,” found the leaked info with a simple search to see if someone “is actually stupid enough to upload medical customer data to GitHub,” he told DataBreaches.net. GitHub is an online developer platform for hosting software development and version control.

It took him less than 10 minutes to find the exposed data by using variations on simple search phrases such as “medicaid password FTP” to find “potentially vulnerable hard-coded login usernames and passwords for systems,” researchers wrote. Ursem could easily access the health systems and the files by using the exposed credentials, they said.
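The kind of pattern such searches turn up can also be checked for before code is published. The following is a rough Python sketch, not the researchers' method: it flags lines where a variable with a credential-like name is assigned a quoted literal. The keyword list and the function name are assumptions made for this example, and a simple regular expression like this will miss many cases and raise false positives.

```python
import re

# Hypothetical keyword list; real secret scanners use far richer rules.
SECRET_ASSIGNMENT = re.compile(
    r"(\w*(?:password|passwd|pwd|secret|api_key|token)\w*)"
    r"\s*[:=]\s*[\"']([^\"']{4,})[\"']",
    re.IGNORECASE,
)


def find_hardcoded_secrets(source):
    """Return (line_number, variable_name) pairs for suspicious assignments.

    The matched value is deliberately not returned, so that the report
    itself does not leak the credential.
    """
    findings = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        match = SECRET_ASSIGNMENT.search(line)
        if match:
            findings.append((line_number, match.group(1)))
    return findings
```

Run against a file's text before each commit, a check like this would flag an assignment such as db_password = "..." while ignoring a line that reads the same value from the environment.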

“It doesn’t matter if the credentials Ursem finds relate to a database, an Office365 or Gmail account or a Secure File Transfer host,” according to the paper. “You just point the right software at it and hit ‘connect,’” Ursem told DataBreaches.net. “It really is that simple.”

Moreover, not only was the patient data exposed by common errors, but that exposure also went undetected for months because of negligent security policies at the companies in charge of the data, researchers said.

Companies failed to take simple security measures, such as auditing their developers’ security practices and compliance with policies or providing a monitored account for researchers to report security concerns, which allowed the data to remain exposed without anyone at the company knowing.

The companies also failed to respond to DataBreaches.net’s attempts at responsible disclosure for fear that the notifications they received were social-engineering attacks, further showing weak policies when it comes to securing their data, researchers wrote.

Patient data found in the files was from health organizations Xybion, MedPro Billing, Texas Physician House Calls, VirMedica, MaineCare, Waystar, Shields Health Care Group, AccQData, and one other company that is described in the report but not named.

The report describes one errant developer referred to as the “Typhoid Mary of Data Leaks” because of the multiple errors and repetition of these errors in his use of GitHub in relation to not just storage and management of medical data, but other files as well.

“It seemed that if there was any way this developer could do something wrong or mess something up, he would,” researchers wrote. “And he seemed to be surprisingly unaware that everything he was doing was visible to others.”

GitHub even hit the developer with a DMCA Takedown request for an e-book he improperly shared back in 2018, yet he continued to expose the data and platforms he was working with on the site, researchers found.

“If that takedown notice wasn’t a wake-up call that others could see all his work, we don’t know what would be,” they wrote.

Overall, the report once again demonstrates how common it is for data to be exposed in the cloud when proper security protections and controls aren’t put in place.

Developers working on platforms such as GitHub need to be extra vigilant when dealing with sensitive client data, and companies, too, need to put proper measures in place to know who has access to their data and where and on what platforms it is being used, researchers said.

Indeed, the fact that a company does not use GitHub, or does not even know it exists, does not mean its data is not exposed on the site, researchers warned. “One of your vendors or business associates may have an employee using it, as one provider discovered the hard way,” they wrote.
