Researcher Troy Hunt announced a major revamp of his Pwned Passwords tool that includes more passwords, added features and tightened privacy for organizations that want to check if their in-use passwords can easily be cracked.
In V2 of Pwned Passwords, launched last week, Hunt expanded his password data set from 320 million passwords to 501 million, with the additions pulled from almost 3,000 breaches over the past year.
These new data sources come courtesy of the Onliner Spambot Dump breach from August that collected 711 million email credentials and server login data. Also included in V2 of Pwned Passwords are the 1.4 billion clear text credentials discovered in December.
“There’s also a heap of other separate sources there where passwords were available in plain text. As with V1, I’m not going to name them here, suffice to say it’s a broad collection from many more breaches than I used in the original version,” Hunt stated in a blog post.
“It’s taken a heap of effort to parse through these but it’s helped build that list up to beyond the half billion mark which is a significant amount of data. From a defensive standpoint, this is good – more data means more ability to block risky passwords,” he said.
The idea behind Pwned Passwords, which was initially launched by Hunt in August, is to help organizations avoid using passwords that have previously appeared in a data breach or have been otherwise compromised in the past.
Pwned Passwords is part of Hunt’s site, Have I Been Pwned, which was first set up in 2013 to help organizations discover if they have been the victim of a security breach. The Pwned Passwords application programming interface relies on SHA-1 hashing, not encryption, and looks only at the first five characters of the hash of a submitted password.
2,844 new (alleged) breaches: a collection of almost 3K individual files with 80M unique email addresses was discovered online. They each contained email addresses and plain text passwords. 63% were already in @haveibeenpwned. Read more: https://t.co/HJfz6OTpNT
— Have I Been Pwned (@haveibeenpwned) February 26, 2018
To allow individuals or companies to securely compare their passwords with those in Hunt’s database, he teamed up with Junade Ali from Cloudflare. Hunt explains that, through the partnership, V2 gains a feature that uses a mathematical property known as “k-anonymity” to help users stay anonymous when checking whether their in-use passwords have been pwned.
“It works like this,” Hunt explains. “Imagine if you wanted to check whether the password ‘P@ssw0rd’ exists in the data set. (Incidentally, the hackers have worked out people do stuff like this. I know, it sucks. They’re onto us.) The SHA-1 hash of that string is ‘21BD12DC183F740EE76F27B78EB39C8AD972A757’ so what we’re going to do is take just the first 5 characters, in this case that means ‘21BD1’. That gets sent to the Pwned Passwords API and it responds with 475 hash suffixes (that is everything after ‘21BD1’) and a count of how many times the original password has been seen.”
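The range query Hunt describes can be sketched in a few lines of Python. This is a minimal illustration rather than Hunt’s or Cloudflare’s own code: the endpoint URL, the “SUFFIX:COUNT” line format of the response and the function names are assumptions not stated in this article. Only the five-character prefix leaves the machine, and the suffix comparison happens locally.

```python
import hashlib
import urllib.request
from typing import Tuple

# Assumed endpoint: the article does not give a URL for the range API.
RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_hash(password: str) -> Tuple[str, str]:
    """Return (first 5 hex chars, remaining 35) of the upper-case SHA-1 hash."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_response(suffix: str, body: str) -> int:
    """Find our suffix among the returned suffixes.

    Assumes one 'SUFFIX:COUNT' pair per line; returns 0 if the suffix is absent.
    """
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


def pwned_count(password: str) -> int:
    """Query the (assumed) range endpoint, sending only the 5-character prefix."""
    prefix, suffix = split_hash(password)
    request = urllib.request.Request(
        RANGE_URL + prefix,
        headers={"User-Agent": "pwned-passwords-sketch"},  # hypothetical client name
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        body = response.read().decode("utf-8")
    return count_in_response(suffix, body)
```

Calling pwned_count("P@ssw0rd") would send only “21BD1” over the network. Because the service cannot tell which of the hundreds of hashes sharing that prefix the caller holds, the full password hash is never disclosed.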
V2 of Pwned Passwords also delivers a count next to submitted passwords, meaning that users can see how many times their passwords appeared in Hunt’s data sources. “What this means is that next to ‘abc123’ you’ll see 2,670,319 – that’s how many times it appeared in my data sources,” Hunt explained. “Obviously with a number that high, it appeared many times over in the same sources because many people chose the same password.”
As with V1, Hunt said, V2 comes with a big archive that users can pull down either from the Pwned Passwords site or via torrent.
“As with V1, the torrent file is served directly from HIBP’s Blob Storage and you’ll find a SHA-1 hash of the Pwned Passwords file next to it so you can check integrity if you’re so inclined,” said Hunt.
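For readers who do want to check integrity, the comparison amounts to hashing the downloaded file and matching the result against the published value. The sketch below is one way to do that in Python; the function names are hypothetical, and the file path and published hash must be supplied by the reader since neither appears in this article.

```python
import hashlib


def sha1_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Compute the SHA-1 hex digest of a file, reading in chunks so a
    multi-gigabyte archive never has to fit in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare the computed digest with the published one, ignoring case
    and surrounding whitespace."""
    return sha1_of_file(path) == published_hex.strip().lower()
```

A mismatch would suggest a corrupted or altered download, in which case fetching the archive again is the safer course.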