Botnets

While I did mention DDoS and spam as reasons for infection already, what I left out so far was the infrastructure of hundreds or thousands of compromised machines, which is usually called a botnet. Once a worm has infected lots of systems, an attacker needs some way to control his zombies. Most often the nodes are made to connect to an IRC server and join a (password protected) secret channel. Depending on the malware in use, the attacker can usually command single or all nodes sitting on the channel to, for example, DDoS a host into oblivion, look for game CD keys and dump those into the channel, install additional software on the infected machines, or do a whole lot of other operations. While such an approach may be quite effective, it has several shortcomings.

IRC is a plaintext protocol.
Unless every node builds an SSL tunnel to an SSL-capable IRCD, everything that goes on in the channel is sent unencrypted from the IRCD to all connected nodes, which means that someone sniffing from an infected honeypot can see everything going on in the channel, including the commands and passwords used to control the botnet. Such a weakness allows botnets to be stolen or destroyed (for example, by issuing a command that makes the nodes connect to a new IRCD at IP 127.0.0.1).
It's a single point of failure.
What if the IRCD goes down because some victim contacted the admin of the IRC server? On top of this, an IRC Op (an IRC administrator) could render the channel inaccessible. If an attacker is left without a way to communicate with the zombie hosts, they become useless.

A way around this dilemma is to make use of dynamic DNS sites like www.dyndns.org. Instead of making the zombies connect to irc.somehost.com, the attacker can install a dyndns client which then allows drones to reference a hostname that can be directed to a new address by the attacker. This allows the attacker to migrate zombies from one IRC server to the next without issue. Though this solves the problem of reliability, IRC should not be considered secure enough to operate a botnet successfully.

The question, then, is what is a better solution? It seems the author of the trojan Phatbot already tried to find a way around this problem. His approach was to include peer to peer functionality in his code. He ripped the code of the P2P project ``Waste'' and incorporated it into his creation. The problem was, though, that Waste itself didn't include an easy way to exchange cryptographic keys that are required to successfully operate the network, and, as such, neither did Phatbot. The author is not aware of any case where Phatbot's P2P functionality was actually used. Then again, considering people won't run around telling everyone about it (well, not all of them at least), it's possible that such a case is just not publicly known.

Keeping a botnet up and running requires reliability, authentication, secrecy, encryption and scalability. How can all of those goals be achieved? What would the basic functionality of a perfect botnet require? Consider the following points:

An easy way to quickly send commands to all nodes
Untraceability of the source IP address of a command
Impossible to judge from an intercepted command packet which node it was addressed to
Authentication schemes to make sure only authorized personnel operate the zombie network
Encryption to conceal communication
Safe software upgrade mechanisms to allow for functionality enhancements
Containment; so that a single compromised node doesn't endanger the entire network
Reliability; to make sure the network is still up and running when most of its nodes have gone
Stealthiness on the infected host as well as on the network

At this point one should distinguish between unlinked and linked, or passive, botnets. Unlinked means each node is on its own. The nodes poll some central resource for information. This information can include commands to download software updates, to execute a program at a certain time, or to DDoS a given target machine. A linked botnet means the nodes don't do anything by themselves but wait for command packets instead. Both approaches have advantages and disadvantages. While a linked botnet can react faster and may be stealthier, since it doesn't open periodic network connections to look for commands, it won't work for infected nodes sitting behind firewalls. Those nodes may be able to reach a website to look for commands, which means an unlinked approach would work for them, but command packets as used in the linked approach won't reach them, as the firewall will filter those out. Also, consider the case of trying to build up a botnet with the next Windows worm. Infected Windows machines generally belong to home users with dynamic IP addresses. End-user machines change IPs regularly or are turned off because the owner is at work or away on a hunting weekend. Good luck trying to keep an up-to-date list of infected IPs. So basically, depending on the purpose of the botnet, one needs to decide which approach to use. A combination of both might be best. The nodes could, for example, poll a resource of information once a day, where commands that don't require immediate attention are waiting for them. On the other hand, if there's something urgent, sending command packets to certain nodes could still be an option.

Imagine a sort of unlinked botnet. No node knows about any other node, nor does it ever contact one of its brothers, which perfectly achieves our goal of containment. These nodes periodically contact what the author has labeled a resource of information to retrieve their latest orders. What could such a resource look like?

The following attributes are desirable:

It shouldn't be a single point of failure, like a single host that makes the whole system break down once it's removed.
It should be highly anonymous, meaning that connecting to it shouldn't look like suspicious activity. On the contrary, the more people requesting information from it the better. This way the nodes' connections would vanish in the masses.
The system shouldn't be owned by the botnet master. Anonymity is one of the botnet's primary goals after all.
It should be easy to post messages there, so that commands to the botnet can be sent easily.

There are several options to achieve these goals. It could be:

Usenet: Messages posted to a large newsgroup which contain steganographically hidden commands that are cryptographically signed achieve all of the above-mentioned goals.
P2P networks: The nodes link to a server once in a while and, like hundreds of thousands of other people, search for a certain term (``xxx''), and find command files. File size could be an indicator for the nodes that a certain file may be a command file.
The Web itself: This one would potentially be slow, but of course it's also possible to set up a website that includes commands, and register that site with a search engine. To find said site, the zombies would connect to the search engine and submit a keyword. A special title of the website would make it possible to identify the right page among thousands of other hits on the keyword, without visiting each of them.

Using those methods, it would be possible to administer even large botnets without having to know the IP addresses of the nodes. The ``distance'' between botnet owner and botnet drone would be as large as possible since there would be no direct connection between the two. These approaches also face several problems, though:

How would the botnet master determine the number of infected hosts that are up and running? Only in the case of the website would it be possible to estimate the number of nodes by inspecting the access logs, and only if logging were enabled. In the case of the Usenet approach, a command of ``DDoS Ebay/Yahoo/Amazon/CNN'' might just reach the last 5 remaining hosts, and the attacker would only be left with the knowledge that it somehow didn't work, because he would not know in advance how many zombies would actually take part in the attack. The same problem occurs with the type and location of the infected hosts. Some might be high profile, such as those connecting from big corporations, game developers, or financial institutions. The attacker might be interested in abusing those for something other than spam and DDoS, if he knew about them in particular. If the attacker wants to bounce his connections over 5 of his compromised nodes to make sure he can't be traced, he must be able to communicate with those 5 nodes only and must know their addresses. If the attacker doesn't have a clue which IP addresses his nodes have, how can he tell 5 of them where to connect to? Besides that, there is the obvious problem of timing. If the nodes poll for a new command file once every 24 hours, he'd have to wait 24 hours in the worst case until the last node finds out it's supposed to bind a port and forward the connection to somewhere else.
