How I learned to stop worrying and love malware DGAs....

Published: 2014-12-23
Last Updated: 2014-12-23 23:50:02 UTC
by John Bambenek (Version: 1)

The growth of malware families using algorithms to generate domains in 2014 has been somewhat substantial.  For instance, P2P Gameover Zeus, Post-Tovar Zeus and Cryptolocker all used DGAs (domain generation algorithms).  The idea is that code generates domains (usually, but not always) by taking the date and running it through some magic math to come up with a list of many domains per day.  This lets the attacker avoid static lists of callback domains in their code and gives them additional flexibility to make takedowns a little more difficult.  Instead of getting one domain suspended, now you have to get thousands suspended.  And if you think the "good guys" are on to you, you can change your encryption seed and get a new list of domains.
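To make the "date plus seed plus magic math" idea concrete, here is a toy sketch in Python. It is not the algorithm of any real malware family: the function name `toy_dga`, the use of SHA-256, the label length and the reserved `.example` TLD are all assumptions chosen purely for illustration.

```python
import hashlib
from datetime import date


def toy_dga(day: date, seed: str, count: int = 5, tld: str = ".example") -> list[str]:
    """Hypothetical DGA: derive `count` domains from a seed and a date.

    Illustrative only; real families use their own (often custom) math.
    """
    domains = []
    for i in range(count):
        # The same seed, date and index always produce the same bytes,
        # so the malware and its operator can compute the same list.
        material = f"{seed}:{day.isoformat()}:{i}".encode()
        digest = hashlib.sha256(material).digest()
        # Map the first 12 digest bytes onto a-z to form the domain label.
        label = "".join(chr(ord("a") + b % 26) for b in digest[:12])
        domains.append(label + tld)
    return domains
```

Calling `toy_dga(date(2014, 12, 23), "seed-one")` always returns the same five names for that day, while changing the seed string produces an unrelated list, which mirrors the seed swap described above.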

That said, it's also a double-edged sword.  If you can get the algorithm, you can proactively block an entire family in one fell swoop.  Take, for instance, Hesperbot.  Garage4Hackers has a nice write-up on how they reverse engineered the DGA, with a helpful script at the end.

This particular DGA doesn't generate many domains, but it provides a good example.  From the word go, you can simply dump the list of domains into RPZ or another DNS blocking technology.  That's nice, but what if you wanted to do some threat intelligence ninjitsu instead?
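As a rough sketch of the "dump it into RPZ" step, the snippet below turns a list of domains into RPZ-style records, where `CNAME .` is the policy action that returns NXDOMAIN. The helper name `rpz_records` is hypothetical, and the zone's SOA/NS header and the resolver configuration that loads the zone are assumed to exist elsewhere.

```python
def rpz_records(domains):
    """Return RPZ record lines that answer NXDOMAIN for each domain.

    Names are written relative to the policy zone (no trailing dot),
    with a wildcard record so subdomains are covered as well.
    """
    # Normalize case and trailing dots, and drop blanks and duplicates.
    cleaned = {d.strip().lower().rstrip(".") for d in domains if d.strip()}
    lines = []
    for name in sorted(cleaned):
        lines.append(f"{name} CNAME .")
        lines.append(f"*.{name} CNAME .")
    return lines
```

The returned lines would be appended below the zone header of whatever policy zone your resolver is configured to use.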

You can take that list of domains, attempt to resolve them and then dump the active IPs and domains into a feed.  Now you have data you can pivot off of, throw into CIF, or make available as OSINT to get mad love from your peers.
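A minimal sketch of that resolve-and-dump step might look like the following. The function names and the two-column CSV layout are assumptions for illustration, not the format of any particular feed, and the resolver is passed in as a parameter so it can be swapped out.

```python
import csv
import io
import socket


def resolve_ipv4(domain):
    """Return the sorted IPv4 addresses for a domain, or [] if it does not resolve."""
    try:
        infos = socket.getaddrinfo(domain, None, socket.AF_INET, socket.SOCK_STREAM)
    except (socket.gaierror, UnicodeError):
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def build_feed(domains, resolver=resolve_ipv4):
    """Keep only domains that currently resolve, as (domain, ip) rows."""
    rows = []
    for domain in domains:
        for ip in resolver(domain):
            rows.append((domain, ip))
    return rows


def feed_to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["domain", "ip"])
    writer.writerows(rows)
    return buf.getvalue()
```

Domains that do not resolve simply drop out, so what is left is the "active" slice of the DGA list.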

Currently I track 11 families this way and process about 200,000 domains every 10 minutes to generate feeds (my New Year's goal is to increase that tenfold).  That brings an interesting scalability problem to the fore... how to look up that many hosts in parallel instead of serially.  For that I use two Linux tools: parallel (self-explanatory) and adns-tools.  Adns-tools is a suite that allows for asynchronous DNS lookups across many hostnames.  As long as you have a friendly DNS resolver that doesn't mind your unmitigated complete assault on its sensibilities, you're good to go.
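If you would rather stay inside Python than shell out to parallel and adns-tools, a thread pool is one simple way to overlap the waiting time of many blocking lookups. To be clear, this is a swapped-in technique and not the setup described above; the function names and the default of 50 workers are assumptions, and at hundreds of thousands of names a truly asynchronous resolver will likely scale better.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def lookup(domain):
    """Blocking lookup that returns (domain, list_of_ipv4_addresses)."""
    try:
        # gethostbyname_ex returns (hostname, aliaslist, ipaddrlist).
        return domain, socket.gethostbyname_ex(domain)[2]
    except (socket.gaierror, socket.herror, UnicodeError):
        return domain, []


def resolve_many(domains, resolver=lookup, workers=50):
    """Run the resolver over all domains using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps results in input order; dict() keys them by domain.
        return dict(pool.map(resolver, domains))
```

The `workers` value is the knob that decides how hard you lean on your resolver, so start small.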

Doing this allows patterns to emerge pretty quickly... usually it is the same IP addresses involved, typically they have a dedicated domain that does authoritative DNS for all the DGA-ized domains, and you can assess what nationality the actors are by what holidays they take from registering domains. :)
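One quick way to surface the "same IP addresses" pattern is to group the resolved pairs by IP and sort by how many domains point at each. This is only a sketch: it assumes the input is a list of `(domain, ip)` tuples, and the helper name `domains_per_ip` is made up for this example.

```python
from collections import defaultdict


def domains_per_ip(rows):
    """Group (domain, ip) rows by IP, busiest IPs first."""
    by_ip = defaultdict(set)
    for domain, ip in rows:
        by_ip[ip].add(domain)
    grouped = [(ip, sorted(names)) for ip, names in by_ip.items()]
    # Sort by descending domain count, then by IP string for stable output.
    grouped.sort(key=lambda item: (-len(item[1]), item[0]))
    return grouped
```

An IP that sits at the top of this list run after run is a good candidate to pivot on.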

All for the price of learning a little bit of Python, you can set up a homebrew malware surveillance system.

John Bambenek
bambenek \at\ gmail /dot/ com
Bambenek Consulting

2 comment(s)


Maybe my brain is workin slow today, but what is a DGA?
Domain Generation Algorithm
