Converting PCAP Web Traffic to Apache Log

Published: 2018-06-06
Last Updated: 2018-06-06 06:26:38 UTC
by Xavier Mertens (Version: 1)

PCAP data can be really useful when you must investigate an incident, but when the PCAP files to analyze add up to gigabytes, they quickly become tricky to handle. Often, the first protocol to be analyzed is HTTP because it remains a classic infection or communication vector used by malware. What if you could analyze HTTP connections like an Apache access log? This kind of log can be easily indexed and processed by many tools.

Haka[1] isn’t a new tool (the first version was released in 2013) but it remains below the radar for many people. Haka is defined as "an open source security-oriented language which allows to describe protocols and apply security policies on (live) captured traffic". Based on the Lua[2] programming language, it is extremely powerful for extracting information from network flows, and also for altering them on the fly (playing a man-in-the-middle role).

I had to analyze a lot of HTTP requests from big PCAP files and I decided to automate this boring task. On the Haka blog, I found an article[3] that explained how to generate an Apache access log from a PCAP file. Unfortunately, it no longer worked, probably due to the evolution of the language. So, I jumped into the code to fix it (with some Google support of course).

Let’s start a Docker container based on Ubuntu and install the latest Haka package:

$ docker run -it --name haka --hostname haka ubuntu
root@haka:~# apt-get update && apt-get upgrade
root@haka:~# apt-get install libpcap0.8 curl # libpcap0.8 is required by Haka; curl is not in the base image
root@haka:~# curl -L -O http://github.com/haka-security/haka/releases/download/v0.3.0/haka_0.3.0_amd64.deb
root@haka:~# dpkg -i haka_0.3.0_amd64.deb
root@haka:~# hakapcap -h
Usage: hakapcap [options] <config> <pcapfile>
Options:
    -h,--help:              Display this information
    --version:              Display version information
    -d,--debug:             Display debug output
    -l,--loglevel <level>:  Set the log level
                              (debug, info, warning, error or fatal)
    -a,--alert-to <file>:   Redirect alerts to given file
    --debug-lua:            Activate lua debugging
    --dump-dissector-graph: Dump dissector internals (grammar and state machine) in file <name>.dot
    --no-pass-through, --pass-through:
                            Select pass-through mode (default: true)
    -o <output>:            Save result in a pcap file

Ready!

Basically, Haka works with hooks that are called when a condition is matched. In our example, we collect traffic from interesting ports:

local http = require('protocol/http') -- load the HTTP dissector
http.install_tcp_rule(80)
http.install_tcp_rule(3128)
http.install_tcp_rule(8080)

Then we create a rule whose hook is triggered for each HTTP response detected in the PCAP file:

haka.rule{
    hook = http.events.response,
    eval = function (http, response)
        -- ... your code here ...
    end
}

The hook extracts information from the HTTP response to build an Apache log entry:

<clientip> - - [<date>] "<request> HTTP/<version>" <response> <size> "<referer>" "<useragent>"
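To make the layout concrete, here is a minimal Python sketch (not the Haka script itself) that assembles one such line from its fields. The function name `apache_combined` and its parameters are hypothetical, and the identity and user fields are fixed to "-" as in the format above.

```python
from datetime import datetime, timezone

def apache_combined(client_ip, epoch, method, uri, version, status, size,
                    referer="-", user_agent="-"):
    """Format one HTTP transaction as an Apache combined log line."""
    # Apache timestamps look like 05/Jun/2018:18:34:13 +0000
    when = datetime.fromtimestamp(epoch, tz=timezone.utc)
    stamp = when.strftime("%d/%b/%Y:%H:%M:%S %z")
    # The identity and user fields are not known from the capture, so use "-"
    return (f'{client_ip} - - [{stamp}] "{method} {uri} HTTP/{version}" '
            f'{status} {size} "{referer}" "{user_agent}"')

line = apache_combined("192.168.254.222", 1528223653, "GET",
                       "/connecttest.txt", "1.1", 200, 10,
                       user_agent="Microsoft NCSI")
print(line)
```

The timestamp is rendered in UTC here; a real converter would take it from the packet capture time.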

Let’s try it with a PCAP file generated on a network:

$ docker cp test.pcap haka:/tmp
$ docker exec -it haka bash
root@haka:~# hakapcap http-dissector.lua /tmp/test.pcap | grep "GET /"
192.168.254.222 - - [05/Jun/2018:18:34:13 +0000] "GET /connecttest.txt HTTP/1.1" 200 10 "-" "Microsoft NCSI"
192.168.254.215 - - [05/Jun/2018:18:34:14 +0000] "GET /session/...HTTP/1.1" 200 10 "-" "AppleCoreMedia/1.0.0.15E216 (iPad; U; CPU OS 11_3 like Mac OS X; en_us)"
192.168.254.215 - - [05/Jun/2018:18:34:19 +0000] "GET /session/...m3u8 HTTP/1.1" 200 10 "-" "AppleCoreMedia/1.0.0.15E216 (iPad; U; CPU OS 11_3 like Mac OS X; en_us)"
192.168.254.66 - - [05/Jun/2018:18:34:21 +0000] "GET / HTTP/1.1" 200 0 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
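Once the traffic is in this format, any log tooling can consume it. As a rough illustration, the following Python sketch parses such lines with a regular expression and counts requests per client. The names `LOG_RE`, `parse_line` and `SAMPLE` are hypothetical, and the regular expression assumes that no field contains an embedded double quote.

```python
import re
from collections import Counter

# Combined format: ip ident user [date] "request" status size "referer" "agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<date>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = LOG_RE.match(line.strip())
    return m.groupdict() if m else None

# Two lines taken from the output above
SAMPLE = [
    '192.168.254.222 - - [05/Jun/2018:18:34:13 +0000] '
    '"GET /connecttest.txt HTTP/1.1" 200 10 "-" "Microsoft NCSI"',
    '192.168.254.66 - - [05/Jun/2018:18:34:21 +0000] '
    '"GET / HTTP/1.1" 200 0 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"',
]

entries = [e for e in map(parse_line, SAMPLE) if e]
clients = Counter(e["ip"] for e in entries)
for ip, count in clients.most_common():
    print(ip, count)
```

In practice the lines would be read from the file produced by hakapcap rather than from a hardcoded list.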

For now, the script reports a size of '10'. This value is hardcoded, like the user names (which default to "- -"). I’m still looking for a way to get the number of bytes per HTTP transaction. Also, you get only the client IP address and not the destination one. If you have ideas for improvement, let me know!

My script, compatible with Haka 0.3.0, is available on github.com[4].
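Since captures often come as many files, a small wrapper can prepare one hakapcap invocation per file. This is a minimal Python sketch, assuming the script is saved as http-dissector.lua and the captures sit in one directory with a .pcap extension. The helper name `hakapcap_commands` is hypothetical, and the command line follows the usage text shown earlier (`hakapcap <config> <pcapfile>`).

```python
import shlex
from pathlib import Path

def hakapcap_commands(script, pcap_dir):
    """Return one hakapcap argument list per .pcap file, sorted by name."""
    return [["hakapcap", str(script), str(pcap)]
            for pcap in sorted(Path(pcap_dir).glob("*.pcap"))]

if __name__ == "__main__":
    # Print the commands; nothing is executed here
    for cmd in hakapcap_commands("http-dissector.lua", "/tmp"):
        print(shlex.join(cmd))
```

The printed commands can be reviewed first and then run with their output appended to a single access log.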

[1] http://www.haka-security.org/
[2] https://www.lua.org/
[3] http://www.haka-security.org/blog/2014/03/18/transform-a-pcap-to-an-apache-log-file.html
[4] https://github.com/xme/toolbox/blob/master/haka_http_log.lua

Xavier Mertens (@xme)
ISC Handler - Freelance Security Consultant