Intro to javascript malware analysis

I am by no means an expert on this stuff. A few weeks ago, I ran across some suspicious links in spam and decided to see where they led. Some of them claimed to be from financial institutions that I have absolutely no connection to, and claimed that some transaction had failed to occur. Others were variants of shipping confirmation scams, pharmacy junk, etc. I wish I could say that I have no idea how people fall for these, but the fact is that some people will literally click on anything that shows up in their inbox, open any attachment and follow any link, no matter how blatantly fake we professionals think these scams are.

What lay at the tail end of all the script="http://some-site/whatever.js" includes and document.location redirects? A webpage that'd been owned, filled with a huge pile of nonsensical jibberish that could barely pass as javascript, which happened to be part of the Blackhole Exploit Kit.

I've done my share of picking apart obfuscated javascript before, but it had been a while. I gave a presentation of this same thing at KC2600 a few weeks ago. Then, this week, a colleague of mine who missed the meeting ran into the same thing in the wild. I passed on what I'd learned, and decided it might be time to write it up with a little more detail than I did a few weeks back. He made this quick video that covers how he was able to de-obfuscate this particular sample:

By now, I've seen several different obfuscation schemes for BlackHole, but once it's decoded, it all looks about the same.

The introductory basics are simple. Minimize the potential of infection by using a non-privileged account (and perhaps an OS other than Windows) and/or minimize the impact of a successful infection by running a virtual machine that you can blow away or revert to a snapshot of a known clean state. For the malware I'm using in this example, either (or both) of the above criteria will be ample to keep things from getting out of control. Other malware may be more insidious or may target non-Windows platforms. I have a few friends that have unwittingly infected their own workstations while trying to analyze things. Play safe.

Once you have a safe lab environment, your goal is to examine a suspicious link and dissect it.

In my case, I was able to find a few links to malware in my personal mail's spam folder. For the demo at KC2600, I used Malware Domain List to find some Blackhole samples.

In the wild, there may be any number of redirects ahead of the malware. You may see a shortened URL (through goo.gl, tinyurl, etc) which goes to a sparse HTML page with several calls to javascripts hosted on various sites, and those javascripts may simply be a document.location pointing to the malware. I usually stick with curl or wget to pull down suspicious links, and then I keep looking at the content and following the redirects until I strike gold.


The javascript itself is ugly once you get to it. Sometimes, the byte array is only a few (really long) lines. Other times, like this sample, each byte of the obfuscated data is on a new line, like this:

You'll see a few interesting things. There's an "e=eval;" line near the bottom, and then "e(c);" after that. It doesn't take a coding genius to realize that this is a way to call eval(c) without triggering some IDS signatures that look for "eval(". Many samples I saw weren't quite this obvious. In fact, the script in the video has the eval alias in a different part of the script and varies in several other ways if you look closely.

To turn this cryptic payload into something that resembles actual javascript, there's a post on SANS ISC from several years ago covering a few methods. I went with the so-called Tom Liston Method, essentially trying to wrangle the decoded stuff that was destined for the eval function into a document.write within a textarea box instead. Note: I ran into one sample of BlackHole that has a /textarea tag near the beginning, which would keep someone from using this trick to easily view the code with this trick, but I don't think it will eval the stuff behind it since it's been changed to a document.write.

In the above example (and in the video), the content that is destined for eval is stored in variable "c", so you simply replace "e(c);" with:

document.write("<textarea cols="150" rows="100">" + c + "</textarea>");

But obviously, you need to use some brain power here to figure out what trickery they're using to call eval, and what the variable is that needs to be wrapped up in the above document.write command. You may also wish to mess with the rows and columns on the textarea. I know on my netbook, that textarea size is far too unweildy. On my desktop, it's almost perfect.

Make sure the file is renamed as a .html, then load it up in your safe lab environment's browser, just in case something goes wrong.


Voila. If you scroll through recent versions of BlackHole Exploit Kit, you'll see that it tries to load an embedded java applet and a PDF, both of which are designed to exploit recent vulnerabilities in JRE and Adobe Reader. Since I don't have Windows running in a VM environment (and I'm not keen on actually infecting any of my Windows boxes) I'm not entirely sure what gets loaded from there. I'm guessing the carberp trojan, given most of what I've read lately. If that's the case, a successful infection would likely block access to anti-malware sites, try to sabotage existing security software, and start gathering sensitive data such as card numbers and online banking credentials.

blog comments powered by Disqus