Earlier this week, Microsoft released an announcement about the disruption of a dangerous botnet that was responsible for spam messages, theft of sensitive financial information, pump-and-dump stock scams and distributed denial-of-service attacks.

Kaspersky Lab played a critical role in this botnet takedown initiative, leading the way to reverse-engineer the bot malware, crack the communication protocol and develop tools to attack the peer-to-peer infrastructure. We worked closely with Microsoft’s Digital Crimes Unit (DCU), sharing the relevant information and providing them with access to our live botnet tracking system.

A key part of this effort is the sinkholing of the botnet. It’s important to understand that the botnet still exists – but it’s being controlled by Kaspersky Lab. In tandem with Microsoft’s move to the U.S. court system to disable the domains, we started to sinkhole the botnet. Right now we have 3,000 hosts connecting to our sinkhole every minute. This post describes the inner workings of the botnet and the work we did to prevent it from further operation.

Let’s start with some technical background: Kelihos is Microsoft’s name for what Kaspersky calls Hlux. Hlux is a peer-to-peer botnet with an architecture similar to the one used for the Waledac botnet. It consists of layers of different kinds of nodes: controllers, routers and workers. Controllers are machines presumably operated by the gang behind the botnet. They distribute commands to the bots and supervise the peer-to-peer network’s dynamic structure. Routers are infected machines with public IP addresses. They run the bot in router mode, host proxy services, participate in a fast-flux collective, and so on. Finally, workers are, simply put, infected machines that do not run in router mode. They are used for sending out spam, collecting email addresses, sniffing user credentials from the network stream, etc. A sketch of the layered architecture is shown below with a top tier of four controllers and worker nodes displayed in green.

Figure 1: Architecture of the Hlux botnet

Worker Nodes

Many computers that can be infected with malware do not have a direct connection to the Internet. They are hidden behind gateways, proxies or devices that perform network address translation. Consequently, these machines cannot be accessed from the outside unless special technical measures are taken. This is a problem for bots that organize infected machines in peer-to-peer networks as that requires hosting services that other computers can connect to. On the other hand, these machines provide a lot of computing power and network bandwidth. A machine that runs the Hlux bot would check if it can be reached from the outside and if not, put itself in the worker mode of operation. Workers maintain a list of peers (other infected machines with public IP addresses) and request jobs from them. A job contains things like instructions to send out spam or to participate in denial-of-service attacks. It may also tell the bot to download an update and replace itself with the new version.

Router Nodes

Routers form a kind of backbone layer in the Hlux botnet. Each router maintains a peer list that contains information about other peers, just like worker nodes. At the same time, each router acts as an HTTP proxy that tunnels incoming connections to one of the controllers. Routers may also execute jobs, but their main purpose is to provide the proxy layer in front of the controllers.

Controllers

The controller nodes are the top visible layer of the botnet. Controllers host an nginx HTTP server and serve job messages. They do not take part in the peer-to-peer network and thus never show up in the peer lists. There are usually six of them, spread pairwise over different IP ranges in different countries. The two IP addresses of each pair share an SSH RSA key, so it is likely that there is really only one box behind each address pair. From time to time some of the controllers are replaced with new ones. Right before the botnet was taken out, the list contained the following entries:

193.105.134.189
193.105.134.190
195.88.191.55
195.88.191.57
89.46.251.158
89.46.251.160

The Peer-to-Peer Networks

Every bot keeps up to 500 peer records in a local peer list. This list is stored in the Windows registry under HKEY_CURRENT_USER\Software\Google together with other configuration details. When a bot starts on a freshly infected machine for the first time, it initializes its peer list with some hard-coded addresses contained in the executable. The latest bot version came with a total of 176 entries. The local peer list is updated with peer information received from other hosts. Whenever a bot connects to a router node, it sends up to 250 entries from its current peer list, and the remote peer sends 250 of its entries back. By exchanging peer lists, the addresses of currently active router nodes are propagated throughout the botnet. A peer record stores the information shown in the following example:

m_ip: 41.212.81.2
m_live_time: 22639 seconds
m_last_active_time: 2011-09-08 11:24:26 GMT
m_listening_port: 80
m_client_id: cbd47c00-f240-4c2b-9131-ceea5f4b7f67

The peer-to-peer architecture implemented by Hlux has the advantage of being very resilient against takedown attempts. The dynamic structure allows for fast reactions if irregularities are observed. When a bot wants to request jobs, it never connects directly to a controller, no matter if it is running in worker or router mode. A job request is always sent through another router node. So, even if all controller nodes go off-line, the peer-to-peer layer remains alive and provides a means to announce and propagate a new set of controllers.
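
To make the peer list exchange described in this section more concrete, here is a minimal Python sketch of a peer record and of a list merge that respects the limits given above (500 stored records, up to 250 sent per exchange). The field names mirror the example record. Everything else is an assumption for illustration: the post does not say how the bot picks which entries to send or which to drop, so the "most recently active first" policy, the de-duplication by client ID, and the function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime

MAX_PEERS = 500      # local peer list cap, as stated in the post
EXCHANGE_SIZE = 250  # maximum entries sent per exchange, as stated in the post


@dataclass(frozen=True)
class PeerRecord:
    """One peer list entry; fields follow the example record in the post."""
    ip: str
    live_time: int              # seconds
    last_active_time: datetime
    listening_port: int
    client_id: str


def select_for_exchange(peers):
    """Pick up to EXCHANGE_SIZE records to send.

    Assumed policy: most recently active peers first.
    """
    ranked = sorted(peers, key=lambda p: p.last_active_time, reverse=True)
    return ranked[:EXCHANGE_SIZE]


def merge_peer_lists(local, received):
    """Merge received records into the local list.

    Assumed policy: keep one record per client ID (the freshest one),
    then keep the MAX_PEERS most recently active records.
    """
    newest = {}
    for peer in list(local) + list(received):
        known = newest.get(peer.client_id)
        if known is None or peer.last_active_time > known.last_active_time:
            newest[peer.client_id] = peer
    ranked = sorted(newest.values(), key=lambda p: p.last_active_time, reverse=True)
    return ranked[:MAX_PEERS]
```

A tracking tool built along these lines would call merge_peer_lists after every exchange and select_for_exchange before the next one; the real bot may well use a different selection rule.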

The Fast-Flux Service Network

The Hlux botnet also serves several fast-flux domains that are announced in the domain name system with a TTL value of 0 in order to prevent caching. A query for one of the domains returns a single IP address that belongs to an infected machine. The fast-flux domains provide a fall-back channel that can be used by bots to regain access to the botnet if all peers in their local list are unreachable. Each bot version contains an individual hard-coded fall-back domain. Microsoft unregistered these domains and effectively decommissioned the fall-back channel. Here is the set of DNS names that were active before the takedown – in case you want to keep an eye on your DNS resolver. If you see machines asking for one of them, they are likely infected with Hlux and should be taken care of.

hellohello123.com
magdali.com
restonal.com
editial.com
gratima.com
partric.com
wargalo.com
wormetal.com
bevvyky.com
earplat.com
metapli.com
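
If you want to check resolver logs against the list above, a short Python sketch like the following may help. The domain set is taken from this post; the log format (one query per line, client address first, queried name second) and the function names are assumptions, so adapt the parsing to whatever your resolver actually writes.

```python
# Fall-back domains listed in the post.
FALLBACK_DOMAINS = frozenset({
    "hellohello123.com", "magdali.com", "restonal.com", "editial.com",
    "gratima.com", "partric.com", "wargalo.com", "wormetal.com",
    "bevvyky.com", "earplat.com", "metapli.com",
})


def normalize(name):
    """Lower-case a DNS name and drop a trailing root dot."""
    return name.strip().rstrip(".").lower()


def is_fallback_domain(name):
    """True if name is one of the fall-back domains or a subdomain of one."""
    name = normalize(name)
    return any(name == d or name.endswith("." + d) for d in FALLBACK_DOMAINS)


def suspicious_clients(log_lines):
    """Map each client to the fall-back names it queried.

    Assumed line format: "<client_ip> <queried_name>" (hypothetical).
    Lines with fewer than two fields are skipped.
    """
    hits = {}
    for line in log_lines:
        parts = line.split()
        if len(parts) < 2:
            continue
        client, name = parts[0], parts[1]
        if is_fallback_domain(name):
            hits.setdefault(client, set()).add(normalize(name))
    return hits
```

Any client that shows up in the result is worth a closer look, since a clean machine has no reason to resolve these names.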

The botnet further used hundreds of sub-domains of ce.ms and cz.cc that can be registered without a fee. But these were only used to distribute updates and not as a backup link to the botnet.

Counteractions

A bot that can join the peer-to-peer network won’t ever resolve any of the fall-back domains – it does not have to. In fact, our botnet monitor has not logged a single attempt to access the backup channel during the seven months it was operated, as at least one other peer has always been reachable.

The communication for bootstrapping and receiving commands uses a special custom protocol that implements a structured message format, encryption, compression and serialization. The bot code includes a protocol dispatcher to route incoming messages (bootstrap messages, jobs, SOCKS communication) to the appropriate functions while serving everything on a single port. We reverse engineered this protocol and created some tools for decoding botnet traffic. Being able to track bootstrapping and job messages for an intentionally infected machine showed us what was happening with the botnet: when updates were distributed, what architectural changes were made and, to some extent, how many infected machines participated in the botnet.

Figure 2: Hits on the sinkhole per minute

This Monday, we started to propagate a special peer address. Very soon, this address became the most prevalent one in the botnet, resulting in the bots talking to our machine, and to our machine only. Experts call such an action sinkholing – bots communicate with a sinkhole instead of their real controllers. At the same time, we distributed a specially crafted list of job servers to replace the original one with the addresses mentioned before and prevent the bots from requesting commands. From this point on, the botnet could not be commanded anymore. And since we have the bots communicating with our machine now, we can do some data mining and track infections per country, for example. So far, we have counted 49,007 different IP addresses. Kaspersky works with Internet service providers to inform the network owners about the infections.

Figure 3: Sinkholed IP addresses per country
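
The kind of data mining mentioned above (hits per minute, unique addresses, infections per country) can be sketched in a few lines of Python. This is an illustration only, not the tooling we run: the input format (pairs of minute and source IP) is assumed, and country_of stands for a hypothetical IP-to-country lookup that you would back with a GeoIP database of your choice.

```python
from collections import Counter


def summarize_hits(hits, country_of):
    """Summarize sinkhole hits.

    hits:       iterable of (minute, ip) pairs (assumed input format)
    country_of: callable mapping an IP string to a country code
                (hypothetical lookup, e.g. backed by a GeoIP database)

    Returns (unique_ips, hits_per_minute, unique_ips_per_country).
    """
    unique_ips = set()
    per_minute = Counter()
    for minute, ip in hits:
        per_minute[minute] += 1
        unique_ips.add(ip)
    # Count each address once per country, however often it connects.
    per_country = Counter(country_of(ip) for ip in unique_ips)
    return unique_ips, per_minute, per_country
```

Note that unique IP addresses only approximate the number of infected machines: several machines can sit behind one address, and one machine can change addresses over time.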

What now?

The main question is now: what is next? We obviously cannot sinkhole Hlux forever. The current measures are a temporary solution, but they do not ultimately solve the problem, because the only real solution would be a cleanup of the infected machines. We expect that the number of machines hitting our sinkhole will slowly decline over time as computers get cleaned and reinstalled. Microsoft said their Malware Protection Center has added the bot to their Malicious Software Removal Tool. Given the spread of their tool this should have an immediate impact on infection numbers. However, in the last 16 hours we have still observed 22,693 unique IP addresses. We hope that this number is going to be much lower soon.

Interestingly, there is one other theoretical option to ultimately get rid of Hlux: we know how the bot’s update process works. We could use this knowledge and issue our own update that removes the infections and terminates itself. However, this would be illegal in most countries and will thus remain theory.

Tillmann Werner is a senior malware analyst at Kaspersky Lab.

Categories: Malware

Comments (26)

  1. madmod

    Great work!  I applaud the efforts of the many workers. 

    Too bad that Microsoft doesn’t credit your work properly.

    Cheers to Natalia and Eugene!

    Best regards,

    Dave

  2. Anonymous

    Could you issue an update that sets all the IP addresses used by the botnet to 127.0.0.1? Not ideal, but it then wouldn’t be able to contact anything.

  3. Anonymous

    Awesome summary, and readable so a person can understand it without a ton of tech knowledge. Good work!

  4. Anonymous

    Unauthorised access to a computer system, regardless of motivation, is generally an offence. Publishing an anti-botnet patch using the botnet’s own channels without informed consent of the system owner, while well intentioned, is unauthorised access. It might never come to anything with Joe Average’s machine, but if you do an en masse update of machines in a large litigious company then watch out.

  5. Joe

    What a conundrum:  You can sinkhole the botnet by publishing misleading information to make it behave the way you want, but you can’t actually take it down by publishing an update that causes it to uninstall itself.

    Isn’t this just a question of threshold?  You’ve already commanded these machines to do something different than they would have by giving them a 4-byte string (an IP address), wrapped in whatever protocol was necessary.  If you gave them a larger string (an uninstaller’s executable code), in many ways it’s rather similar.

    I guess the difference is that in the first case, you’re directing existing software to do something within the behaviors defined by its instructions (and just happens to make the botnet inert for now), and in the second case, you’re sending completely new instructions.  That’s a technical distinction.  On a philosophical level, you’re exercising control over the botnet, which means you’re issuing instructions to the computers that are part of the botnet.  The technical distinction is whether you’re doing it with “data” (something only accessed by the CPU’s load/store units) or “program” (actual CPU opcodes).

    So it comes to the legal question of what constitutes “unauthorized access.”  Even just by sinkholing the botnet, you’re controlling the behavior of these machines without the explicit knowledge or consent of their owners.  Is the legal distinction truly equivalent to the technical distinction of downloading new program code, versus directing existing program code to your wishes?

    And, if you downloaded an interpreted program (i.e. something that had no CPU opcodes, but rather was a purely “data” description of the job to perform) that happened to uninstall the botnet, where would that put you in the continuum?

  6. Anonymous

    Tricky problem.

    Ask yourself this.

    If you got an email suggesting that you run some supplied code to clear your machine of malware, would you run it?

    If an outside agency chose to run something like that on your machine without your knowledge and consent, would you be happy?

    Now if you can make a virus like they do in films that will make the Controller PC’s erupt in a big fireball…..

    maybe just hand the problem over to the Stuxnet guys….

  7. Anonymous

    You may consider notifying the owner of the IP address that it is infected. This is only so-so effective, but at least it would help a bit in the cleanup.

    Another thing to consider is that most countries have provisions which allow you to do otherwise illegal things in order to avert crime. Under such provisions it would be fine to do a minimum modification to the bot software in order to render it harmless and to notify the owner that the machine is infected. Attempting a full removal and cleanup would of course be the neatest solution from a technical perspective, but it may actually increase your risk. Microsoft’s legal department should be able to tell you exactly how much you can do in each jurisdiction.

    A third option is to report the infected machines to the police authorities in each country. This is the formally correct way of handling the situation, but it will probably be totally ineffective. However, it may result in the police understanding that they need the infrastructure to find the computer holding a specific IP number at short notice.

  8. Dennis Fisher

    If you’re having trouble seeing the images, you may need to disable NoScript, if you have that running. In Firefox with NS disabled, the images appear fine for me.

  9. Anonymous

    Fix the images; I don’t want to change my computer configuration just for the images on your article….

  10. AlphaCentauri

    Do you have a color-coded map that shows number of infections in proportion to the number of internet customers in each country? We’ve seen previous malware that avoided infecting computers within certain countries or using certain keyboards. It might provide some clue as to the author’s nationality.

  11. Anonymous

    Suggestion:
    1) Prepare the necessary cleanup code, or the essential pieces of it, easily configured into an active version.
    2) Provide the results to a government entity that can’t be sued and would like to bask in the glory of eradicating the threat.
    3) Support further progress toward deploying systems that are compliant with the object-capability model of computation, at both the language and operating-system levels.

  12. Anonymous2

    I love your suggestion nr (2): “Provide the results to a government entity…” 😀

    – And what do you think that government “entity” would do with their new free botnet ??

  13. Anonymous

    It is interesting that India, Thailand, Vietnam and Taiwan show up red (many infections) while Australia, Europe and China are green and the US is orange.

    Does this indicate better virus protection at the ISP level, more sophisticated end-users, or just that the infection vector appeals to some audiences and not others?
