Filter JSON Data by Value with Linux jq
Since JSON has become more prevalent as a data service, unfortunately, it isn't at all BASH friendly and manipulating JSON data at the command line with REGEX (i.e. sed, grep, etc.) is cumbersome and difficult to get the output I want.
So, there is a Linux tool I use for this, jq is a tool specifically written to manipulate and filter the data I want (i.e. like scripting and extract the output I need) from large JSON file in an output format I can easily read and manipulate.
The most common form of logs I work with are JSON arrays (start and end []). For example, using a basic example like this to demonstrate how to iterate over an array:
echo '["a","b","c"]' | jq '.[]'
which will result to this using the object value iterator operator .[] will print each item in the array on a separate line:
"a"
"b"
"c"
In this next example, I take the data from the bot_ip.json file, parse the list of IP addresses and which site they came from. Before parsing this file, here is how the raw output of the file starts:
cat bot_ip.json | jq '.objects[].ip + ": " + .objects[].source' | sort | uniq
The output looks like this:
"212.39.114.139: Botscout BOT IPs"
"216.131.104.82: Botscout BOT IPs"
"2607:90:6628:470:0:4:0:801: Botscout BOT IPs"
Since this file contains objects before the open [, I use it as an anchor to start parsing the data I want to see. I added the column (:) separator between the IP and the data source.
This second example is with mal_url.json which contain know malware URL location. Before parsing this file, here is how the file starts:
cat mal_url.json | jq '.objects[].value + ": " + .objects[].source + ": " + .objects[].threat_type' | sort | uniq
"http://103.82.81.37:42595/Mozi.a: URLHaus: malware"
"http://103.84.240.226:45940/Mozi.m: URLHaus: malware"
"http://110.178.73.97:34004/Mozi.m: URLHaus: malware"
Using this test file available here, it contains several records that can be used to manipulate JSON data. Using wget, download the file to a Linux workstation [2] and ensure that jq is already installed (i.e. CentOS: yum -y install jq). Next take a quick look at the raw file using a Linux command of your choice (less, more, cat, etc) before parsing some of the data with jq. To view the data properly formatted and readable, use this command:
cat large-file.json | jq | more
Manipulate the data to get a list of actors with the current information, run this command:
cat large-file.json | jq '.[].actor' | more
To get just the list of actor login information, add .login to .actor:
cat large-file.json | jq '.[].actor.login' | more
What are some of your favorite tools to manipulate JSON data?
[1] https://stedolan.github.io/jq/manual/
[2] https://github.com/json-iterator/test-data/raw/master/large-file.json
[3] https://gchq.github.io/CyberChef/
[4] http://iplists.firehol.org/?ipset=botscout
-----------
Guy Bruneau IPSS Inc.
My Handler Page
Twitter: GuyBruneau
gbruneau at isc dot sans dot edu
Comments
www
Nov 17th 2022
6 months ago
EEW
Nov 17th 2022
6 months ago
qwq
Nov 17th 2022
6 months ago
mashood
Nov 17th 2022
6 months ago
isc.sans.edu
Nov 23rd 2022
6 months ago
isc.sans.edu
Nov 23rd 2022
6 months ago
isc.sans.edu
Dec 3rd 2022
5 months ago
isc.sans.edu
Dec 3rd 2022
5 months ago
<a hreaf="https://technolytical.com/">the social network</a> is described as follows because they respect your privacy and keep your data secure. The social networks are not interested in collecting data about you. They don't care about what you're doing, or what you like. They don't want to know who you talk to, or where you go.
<a hreaf="https://technolytical.com/">the social network</a> is not interested in collecting data about you. They don't care about what you're doing, or what you like. They don't want to know who you talk to, or where you go. The social networks only collect the minimum amount of information required for the service that they provide. Your personal information is kept private, and is never shared with other companies without your permission
isc.sans.edu
Dec 26th 2022
5 months ago
isc.sans.edu
Dec 26th 2022
5 months ago