Metadata Analysis Draws its Own Conclusions on WannaCry Authors

Researchers at Telefonica’s cybersecurity unit ElevenPaths conducted an analysis of WannaCry metadata.

The most intriguing mystery that remains about WannaCry is the identity of the attacker. The theory with the best legs is that North Korea’s Lazarus APT is behind the worldwide ransomware outbreak, given the discovery of code in the malware that is shared with samples from older Lazarus attacks.

That, however, doesn’t definitively dispel other contentions that point the finger at cybercriminals or other nation-state operatives. For example, a linguistics analysis of the 28 ransom notes embedded in the malware moved away from the Lazarus theory and concluded that the author was a native Chinese or English speaker. It also stated that the Korean version of the ransom note was among the most poorly written, or translated. To take that conclusion as gospel, however, seems premature as well given that native speakers could easily write in broken versions of their language.

Researchers at ElevenPaths, the cybersecurity unit of Telefonica, Spain’s largest telecommunications provider and one of WannaCry’s victims, weren’t satisfied. They attacked the attribution question from a metadata perspective and used it to dissect the author’s actions in the days leading up to the May 12 attack.

Sergio de los Santos, the labs leader at ElevenPaths, said the author overlooked how much the metadata buried in the malware files could give away about their identity. For example, the author and operator fields in the RTF files used in the ransom note are set to Messi, a reference to footballer Lionel Messi. This reference was also found in the first version of WannaCry that surfaced in March. The metadata also shows that Korean was the default language set in the EMEA version of Word that was used to create the RTF files. This lends weight to the theory that the broken Korean may have been intentional, de los Santos said.

“The broken Korean, for me, means he was trying to hide himself. That’s a hint he was from Korea,” de los Santos said. “He forgot to change his [default language in] Word. If you’re trying to hide yourself from being English and write in broken English, anyone could do that. But I would forget to change my default language. He made the mistake of forgetting to change the default language. He tried to force a broken Korean.”
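The kind of fields de los Santos describes can be read straight out of an RTF file's source. What follows is a minimal sketch in Python, not the researchers' Metashield tooling. It assumes the standard RTF control words `\author` and `\operator` (inside the `\info` group) and `\deflang` and `\deflangfe` (default language and default East Asian language, stored as Windows locale IDs, where 1042 is Korean). The sample string is hand-written for illustration and is not taken from the malware; the function names are hypothetical.

```python
import re

# A few Windows locale IDs (LCIDs); 1042 is Korean, 1033 is US English.
LCID_NAMES = {1033: "English (United States)", 1042: "Korean"}

# Hand-written sample for illustration only; NOT an actual WannaCry file.
SAMPLE_RTF = (
    r"{\rtf1\ansi\deflang1033\deflangfe1042"
    r"{\info{\author Messi}{\operator Messi}}"
    r"\pard Example text\par}"
)


def rtf_metadata(rtf_text):
    """Return author, operator and default-language fields found in RTF source."""

    def info_field(name):
        # Matches a group such as {\author Messi} and captures its text.
        match = re.search(r"\{\\" + name + r" ?([^{}]*)\}", rtf_text)
        return match.group(1).strip() if match else None

    def control_number(name):
        # Matches a control word followed by digits, such as \deflang1033.
        match = re.search(r"\\" + name + r"(\d+)", rtf_text)
        return int(match.group(1)) if match else None

    return {
        "author": info_field("author"),
        "operator": info_field("operator"),
        "deflang": control_number("deflang"),
        "deflangfe": control_number("deflangfe"),
    }


def describe_lcid(lcid):
    """Translate a locale ID into a readable name when it is in the small table above."""
    return LCID_NAMES.get(lcid, f"LCID {lcid}")
```

Calling `rtf_metadata(SAMPLE_RTF)` returns a dictionary with the four fields, and `describe_lcid` turns a locale ID into a language name. As the commenters on this story point out, every one of these values can be set deliberately, so the output is a hint and not proof.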

De los Santos’ team was also able to deduce a timeline of the days leading up to the first detections of WannaCry on May 12 based on metadata, which they extracted with an in-house tool called Metashield Clean-up.

The scheme started April 27, it appears, when the author created a bitmap image for the desktop background and the r.wnry readme file. On May 9, a .zip file was created that included tools to connect to Tor for ransom payments. That .zip was dropped onto the attacker’s filesystem May 9 at 16:57, the attacker’s local time. One day later, another file containing .onion domains and a Bitcoin wallet was created. On May 11, edits were made to the bitmap, readme, and .onion domains. The files were added to a password-protected .zip file on the same day. On May 12, at 2:22 a.m. local to the attacker, the executables were added to the .zip file and the payload was ready.

“This is the ‘magic’ hour as it is established as the time of infection in every affected system,” ElevenPaths researchers wrote in a report on the research.

The researchers were able to glean from the metadata how long the attacker spent editing each individual language file and that the default language was always Korean. The use of .zip files was another giveaway about the attacker’s location because they store the last access time and create dates obtained from the attacker’s disk, as well as his local timezone.
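The article does not spell out how a .zip archive betrays a timezone, so the following Python sketch shows one plausible mechanism and is not a description of the ElevenPaths method. It relies on two facts about the ZIP format: the standard per-file timestamp is wall-clock time with no timezone attached, and archives built on Windows may also carry an optional NTFS extra field (header ID 0x000a) holding modification, access and creation times in UTC. Subtracting one from the other yields the creator's UTC offset. Whether the WannaCry archives carry this field is an assumption here, and the function names and the demo archive are hypothetical.

```python
import io
import struct
import zipfile
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def ntfs_times(extra):
    """Return (mtime, atime, ctime) in UTC from an NTFS extra field, or None."""
    pos = 0
    while pos + 4 <= len(extra):
        header_id, size = struct.unpack_from("<HH", extra, pos)
        body = extra[pos + 4 : pos + 4 + size]
        pos += 4 + size
        if header_id != 0x000A:
            continue
        attr = 4  # skip the 4 reserved bytes at the start of the field
        while attr + 4 <= len(body):
            tag, attr_size = struct.unpack_from("<HH", body, attr)
            if tag == 0x0001 and attr_size >= 24:
                # Three 64-bit FILETIMEs: 100 ns ticks since 1601-01-01 UTC.
                ticks = struct.unpack_from("<QQQ", body, attr + 4)
                return tuple(
                    FILETIME_EPOCH + timedelta(microseconds=t // 10) for t in ticks
                )
            attr += 4 + attr_size
    return None


def utc_offset_hours(info):
    """Estimate the creator's UTC offset for one archive entry, or None."""
    times = ntfs_times(info.extra)
    if times is None:
        return None
    # The DOS timestamp is local wall-clock time; UTC is attached only so
    # the two values can be subtracted.
    local = datetime(*info.date_time, tzinfo=timezone.utc)
    hours = (local - times[0]).total_seconds() / 3600
    return round(hours * 4) / 4  # snap to the nearest quarter hour


def demo_entry():
    """Build an in-memory archive with made-up times and return its first entry."""
    utc_mtime = datetime(2017, 5, 11, 17, 22, tzinfo=timezone.utc)
    ticks = int((utc_mtime - FILETIME_EPOCH).total_seconds()) * 10_000_000
    info = zipfile.ZipInfo("example.bin", date_time=(2017, 5, 12, 2, 22, 0))
    info.extra = struct.pack("<HHIHHQQQ", 0x000A, 32, 0, 0x0001, 24, ticks, ticks, ticks)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(info, b"x")
    with zipfile.ZipFile(buffer) as archive:
        return archive.infolist()[0]
```

In the demo, the local timestamp is set to 2:22 a.m. on May 12 and the UTC time is invented to sit nine hours earlier, so `utc_offset_hours(demo_entry())` illustrates how an offset of +9 would surface. As with any metadata, both timestamps come from the creator's machine and can be forged.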

Going off the final .zip create date and time of May 12 at 2:22 a.m. and the first detections of WannaCry in Taiwan and throughout Asia, the researchers speculate that the attacker was in the UTC +9 timezone, and certainly between UTC +2 and +12.

“He should have used .txt files which stores no metadata at all,” de los Santos said, adding that .txt files cannot be used to create files that include the colors and different point sizes that were prevalent throughout the ransom note. A rich text format file could, however. “You can write rich text format with lots of programs, but with Word, we have discovered that the default language is kept in the file and the name of the user is kept in the file. That was a mistake, and I think he didn’t notice.”

  • ceretullis on

    "He made the mistake of forgetting to change the default language." de los Santos and his team are making the rather bad assumption the attacker "forgot" to change the default language setting away from Korean. It's quite possible they intentionally crafted metadata as a red herring. I.e. they may have intentionally set the default to Korean.
  • Haru on

    What if the default language had been set to Korean to enforce the theory of the native writing in broken Korean. Hopefully they were sloppy but maybe they were bigger masterminds than it would appear...
  • Ssantos on

    @ceretullis. Maybe, who knows. Please, read the original post. Attribution is risky. Anyway, Korean is set as default since WannaCry version 1.0 in March, which was a "regular" and even unpopular ransomware back then, and with a note written only in English.
    • ceretullis on

      @Ssantos I read the original post. With the evidence available to you, it's possible to draw some tantalizing deductions. For example, assuming the creator is not very skilled since there were no analysis countermeasures. This leads you to believe he's making mistakes when leaving metadata. Which leads you to believe you can draw reasonable conclusions from the metadata. Attribution based on this kind of metadata is not simply "risky" it is reckless. Metadata can be crafted to paint whatever picture an attacker wants. Unless you can corroborate the metadata with other reliable intelligence... you really have nothing but reckless speculation.
  • Ssantos on

    I agree. In the original post, that is what we try to do: corroborate with other intelligence... Anyway I fully agree again. Since it is a reckless exercise and speculation about intentional or mistakes... we are all just as right or wrong in the same way :). The important fact is that the language was set, and times in files and zips were those... if it helps to get to any other conclusion is up to every one of us. We will definitely need more hints.
