A Poor Man's DNS Anomaly Detection Script
I still think DNS logs are one of the most overlooked resources for intrusion and malware detection. Command and control servers will frequently use specific top-level domains or host names, and because these names tend to have short TTL values, infected hosts will query DNS servers for them again and again.
Additionally, DNS servers are overlooked choke points: for collecting network-wide data, they are just as valuable as the firewalls and routers connecting the network to the internet.
In this diary, I would like to introduce a simple shell script that answers one question which, in my opinion, is quite useful for detecting anomalous DNS queries: What are the top 10 new host names that we looked up today?
First, you need DNS query logs. There are two ways to collect them: you can either enable query logging in your DNS server, or you can just use tcpdump on the DNS server to collect the queries. Query logging works fine for me, but it can put too much strain on a very busy name server. Running tcpdump on the name server, or on a sensor monitoring the name server, may work better. We do not have to capture every single query for this technique to work.
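For example, with BIND (the query log format shown below is BIND's), recent versions let you toggle query logging on the fly with rndc; alternatively, tcpdump can capture the traffic. A minimal sketch (the interface name eth0 is an assumption):

rndc querylog on
tcpdump -i eth0 -nn 'udp port 53' -w dns-queries.pcap

Note that tcpdump's output format differs from BIND's query log, so the parsing commands below would have to be adjusted accordingly.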
To start, we need to summarize past queries. In my case, the query logs are rotated hourly and saved in files with names like "query.log.*" (where * is a number). A sample line from my query logs:
16-Aug-2012 21:42:00.260 queries: info: client 10.5.0.210#54481: query: a1406.g.akamai.net IN A + (192.0.2.1)
To extract and summarize the host names, I use the following commands:
cat query.log.* | sed -e 's/.*query: //' | cut -f 1 -d' ' | sort | uniq -c | sort -k2 > oldlog
This will sort the output by hostname (sort -k2 sorts by the second column), which becomes important later.
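The resulting file has a count followed by the host name, one pair per line, sorted alphabetically by host name. For example (hypothetical counts):

     12 a1406.g.akamai.net
      3 www.example.com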
Next, I apply the same procedure to the current log:
cat query.log | sed -e 's/.*query: //' | cut -f 1 -d' ' | sort | uniq -c | sort -k2 > newlog
Now, we need to find all entries in "newlog" that are not included in "oldlog". To do so, we use the Unix command "join", which works pretty much like the SQL JOIN command, but takes two text files as input. It is important that the join column (the host name) is sorted, which was the reason for the sort -k2 earlier.
join -1 2 -2 2 -a 2 oldlog newlog > combined
The -a 2 option also includes records from newlog that are not found in oldlog. "combined" now contains the lines found in both files, as well as the lines found only in "newlog". We need to remove the lines found in both files, which are easy to spot because they end in two numbers:
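For example, if example.com appears in both files and evil.example.net only in newlog, "combined" might look like this (hypothetical counts; join moves the join field to the front of each line):

example.com 12 15
evil.example.net 3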
cat combined | egrep -v '.* [0-9]+ [0-9]+$' | sort -nr -k2 | head -10
In the end, we sort the host names by frequency, and return the top 10.
To summarize, here is the whole script again for simple copy/paste. (In the walkthrough above, I broke some of the lines up to make them easier to read.)
oldlogs=query.log.*
newlog=query.log
tmpdir=/tmp
cat $oldlogs | sed -e 's/.*query: //' | cut -f 1 -d' ' | sort | uniq -c | sort -k2 > $tmpdir/oldlog
cat $newlog | sed -e 's/.*query: //' | cut -f 1 -d' ' | sort | uniq -c | sort -k2 > $tmpdir/newlog
join -1 2 -2 2 -a 2 $tmpdir/oldlog $tmpdir/newlog | egrep -v '.* [0-9]+ [0-9]+$' | sort -nr -k2 | head -10 > $tmpdir/suspects
The file "suspects" will now include the top 10 suspect domains. For added credit: add the ability to keep a whitelist.
------
Johannes B. Ullrich, Ph.D.
SANS Technology Institute