PDFs are a widely used business file format, which makes them a common target for malware attacks. On the surface, PDFs appear secure, but because they have so many “features,” hackers have learned how to hide attacks deep under the surface.
By using a number of utilities, we are able to reverse engineer the techniques in malicious PDFs, providing insight that we can ultimately use to better protect our systems.
PDF as Text
Opening the PDF file in a text editor shows that some of its objects are encoded. The first circle, object 11, is a command to execute the JavaScript in object 12. The second and third circles mark a command for object 12 to filter the JavaScript with ASCIIHexDecode. The main reason for this filter is to hide malicious code inside the PDF and avoid anti-virus detection. This is our first red flag.
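The same first pass can be done without a text editor. The following is a minimal sketch, not the tooling used in this analysis: it counts a few PDF name tokens that usually justify a closer look. The function name and the token list are illustrative assumptions, and the check only sees names written in plain form (PDF allows characters in names to be hex-escaped, which this sketch does not handle).

```python
import re

# Name tokens worth a closer look; this list is an illustrative assumption,
# not an exhaustive set of indicators.
SUSPICIOUS_NAMES = ["/JS", "/JavaScript", "/ASCIIHexDecode"]


def count_suspicious_names(pdf_bytes: bytes) -> dict:
    """Count occurrences of each suspicious name token in raw PDF bytes."""
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # The negative lookahead stops a short name from matching the
        # beginning of a longer one.
        pattern = re.compile(re.escape(name.encode("ascii")) + rb"(?![A-Za-z0-9])")
        counts[name] = len(pattern.findall(pdf_bytes))
    return counts
```

To try it, read the file in binary mode (`open(path, "rb").read()`) and pass the bytes in. A non-zero count is only a prompt for manual inspection, since many legitimate PDFs also contain JavaScript.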
Decoding the Hex
This second image shows how the stream is decoded, but additional analysis is required to make sense of it. Again, we will open this code with a text editor to understand its purpose.
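For readers without the screenshot, the decoding step can be sketched in a few lines of Python. This assumes the stream bytes have already been cut out of the PDF by hand. The function name is hypothetical. The rules it follows (whitespace is ignored, `>` marks the end of the data, and a trailing odd digit is padded with `0`) are how the PDF format defines this filter.

```python
import binascii


def ascii_hex_decode(stream: bytes) -> bytes:
    """Decode the body of a PDF stream that uses the ASCIIHexDecode filter."""
    # '>' is the end-of-data marker; anything after it is not part of the data.
    end = stream.find(b">")
    if end != -1:
        stream = stream[:end]
    # Drop spaces and line breaks between the hex digits.
    hex_digits = b"".join(stream.split())
    # An odd number of digits means the last digit is followed by an implied 0.
    if len(hex_digits) % 2:
        hex_digits += b"0"
    return binascii.unhexlify(hex_digits)
```

For example, `ascii_hex_decode(b"61 6C 65 72 74>")` returns `b"alert"`.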
Hex as Text
When we open this code as text, the circled portion shows that it is JavaScript, which is another red flag. We will now work to determine its intent.
Malzilla Analysis of JavaScript
By using a utility called Malzilla, we can analyze the JavaScript. We paste the JavaScript into the top box and decode it with the circled button. A closer look at the second circle indicates that this JavaScript contains shellcode, yet another red flag.
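The script itself is not reproduced here, so the following is only a sketch of what such a decoding step often involves. It assumes the shellcode is stored in the common `unescape("%uXXXX...")` form, where each `%uXXXX` is a 16-bit value that ends up in memory low byte first. The function name is hypothetical, and this is not a description of how Malzilla works internally.

```python
import re
import struct


def percent_u_to_bytes(js_string: str) -> bytes:
    """Convert %uXXXX sequences into the raw bytes they represent in memory."""
    out = bytearray()
    for match in re.finditer(r"%u([0-9a-fA-F]{4})", js_string):
        # Each %uXXXX is one 16-bit unit, stored little-endian on x86.
        out += struct.pack("<H", int(match.group(1), 16))
    return bytes(out)
```

For example, `percent_u_to_bytes("%u9090%u4142")` yields the bytes `90 90 42 41`. Anything in the string that is not a `%uXXXX` sequence is ignored by this sketch.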
A Closer Look at the Shellcode
This is a closer view of the shellcode. Shellcode is a small piece of machine code delivered as the payload of an exploit, usually written to avoid detection. It earned its name because it traditionally launches a command shell that the attacker can control.
Shellcode as Exe
Again, we run a utility, this time to convert the shellcode into an executable file, which we save so that we can take an even closer look at its function.
Exe through IDA
Here, we run yet another utility, IDA, which enables us to disassemble and debug the executable file. As we have highlighted, this file contains multiple NOP slides (also called NOP sleds), which are used in shellcode attacks because the exact location of the shellcode is not known in advance: execution that lands anywhere in the run of no-op instructions slides forward into the shellcode. This raises another red flag. From here, we should see if there are any interesting binary strings.
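A rough version of this check can also be scripted. The sketch below assumes x86 code, where the single-byte NOP instruction is `0x90`, and simply reports long runs of that byte. The function name and the default threshold of 16 bytes are arbitrary choices for illustration. Slides built from other do-nothing instructions would not be caught.

```python
import re


def find_nop_runs(code: bytes, min_length: int = 16) -> list:
    """Return (offset, length) for every run of at least min_length 0x90 bytes."""
    pattern = re.compile(rb"\x90{%d,}" % min_length)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(code)]
```

Running it over the decoded shellcode returns a list of offsets and lengths. An empty list means no run reached the threshold, not that the sample is clean.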
Binary Strings
Here we have circled multiple binary strings that should raise concern. One of the circled items, URLDownloadToFileA, is a Windows API function that downloads a file from a remote server and saves it on the user’s PC. In this infected PDF, the shellcode uses it to point the PC to an infection point, which is the IP address we have circled (by the way, don’t visit that IP address). Once the infected file is downloaded, the shellcode will execute it, infecting the computer.
There you have it! Like “Inception,” you have to go deeper to find what is truly at the heart of this infected PDF. Hackers are clever about hiding an executable download behind shellcode, encoding it, and burying it in JavaScript within PDF files, but by reverse engineering their techniques, we gain a better understanding of our vulnerabilities and can work to strengthen our security posture.
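For readers who want to repeat the binary-strings step without a disassembler, here is a minimal sketch in the spirit of the Unix `strings` tool. Both function names and the minimum length of 5 are assumptions made for illustration. It only finds strings stored as plain ASCII, which is how URLDownloadToFileA appeared in this sample.

```python
import re


def extract_strings(data: bytes, min_length: int = 5) -> list:
    """Return runs of printable ASCII characters found in binary data."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def flag_indicators(strings: list) -> list:
    """Keep strings that mention the download API or look like an HTTP URL."""
    url_pattern = re.compile(r"https?://\S+")
    return [s for s in strings if "URLDownloadToFile" in s or url_pattern.search(s)]
```

Calling `flag_indicators(extract_strings(shellcode_bytes))` on the saved shellcode gives a short list to review by hand. As with the circled IP address, treat any URL it prints as hostile and do not visit it.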
Tomer Bitton is a security researcher at Imperva.