This is a lightning talk I gave at the CPLUG November 2020 meeting on how to use jq with bash to deal with JSON in modern tooling.
The Problem
More and more programs are using JSON as a data interchange format.
For example, see docker inspect
and the GitHub API.
JSON is not easily parsed using grep, sed, and awk in shell pipelines.
JSON is easily consumed via many programming languages, and this makes it
very popular as an interchange format.
Shell pipelines are great for explorations of certain APIs and running programs.
Composable, Fast, Parallel by default.
The Solution: jq
jq can be thought of as a stream editor
(sed) for JSON data. It can slice, filter, map, and transform this
data very easily.
Slice
Extract elements out of an array.
Filter
Extract elements that match a boolean expression.
Map
Apply some operation to the resulting value. Ex. add, subtract, concatenate.
Transform
Take one JSON input, and move the fields around to become a different JSON object.
Real World Usages
Parsing out health check information. Many services have a health check endpoint.
Not every business has set up proper monitoring for these
endpoints.
Extracting fields from your security IDS.
You find your company had a data breach over the VPN. You have a
JSON log of where everyone logged in from. You need to find the
anomalous login.
Working with object stores on the command line.
Ex. Pumping information out of MongoDB, and working on it with the
shell.
I initially heard about jq through Hacker News, and learned it to
compete in a capture the flag event.
Usages of jq
Direct file input:
From a pipe:
Sending the output to another operation:
I know this is a useless use of cat, but it’s for example purposes.
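The three usages above can be sketched as follows. This is a minimal example, assuming a made-up sample file (created with mktemp) that holds a single object; the file contents and variable names are hypothetical, not the ones from the talk.

```shell
# Hypothetical sample file, created only for this sketch.
tmp=$(mktemp)
echo '{"name":"Alice","age":30}' > "$tmp"

# Direct file input: jq reads the file named after the filter.
from_file=$(jq '.name' "$tmp")

# From a pipe: jq reads standard input (cat is redundant here, kept to show the pipe).
from_pipe=$(cat "$tmp" | jq '.name')

# Sending the output to another operation: -r prints the raw string without JSON quotes.
upper=$(jq -r '.name' "$tmp" | tr '[:lower:]' '[:upper:]')

echo "$from_file"   # "Alice"
echo "$from_pipe"   # "Alice"
echo "$upper"       # ALICE

rm -f "$tmp"
```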
Examples of jq operations
The Identity Transformation
An identity transformation passes the input straight through to the output. jq pretty-prints the result so it is easier for a human to read.
Input
Output
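A minimal sketch of the identity filter, assuming a made-up one-line sample object as input:

```shell
# Assumed sample input; the filter '.' returns it unchanged, pretty-printed.
identity=$(echo '{"name":"Alice","age":30}' | jq '.')
echo "$identity"
# {
#   "name": "Alice",
#   "age": 30
# }
```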
Get the value of a key
We want to extract a value by following a path through the JSON object’s keys.
Input
Output
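As an illustration, assuming a hypothetical nested object, a dotted path walks down through the keys:

```shell
# Assumed sample input with nested objects.
json='{"user":{"name":"Alice","address":{"city":"Springfield"}}}'

# Each .key step descends one level into the object.
city=$(echo "$json" | jq '.user.address.city')
echo "$city"   # "Springfield"
```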
Get the value of multiple keys
Sometimes we want to get more than one key out at a time. Separate the key expressions with commas to pull several at once.
Input
Output
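A short sketch of the comma operator, assuming a made-up sample object. Each expression produces its own output line:

```shell
# Assumed sample input.
json='{"name":"Alice","age":30,"city":"Springfield"}'

# The comma runs both filters against the same input and emits both results.
fields=$(echo "$json" | jq '.name, .age')
echo "$fields"
# "Alice"
# 30
```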
Extracting a key from an array
To access the objects in an array, they must first be unwrapped with .[]; we can then pipe them to the key expressions.
Input
Output
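For instance, assuming a hypothetical array of people, the unwrap-then-pipe pattern looks like this:

```shell
# Assumed sample input: an array of objects.
people='[{"name":"Alice","age":30},{"name":"Bob","age":25}]'

# .[] emits each array element; the pipe feeds each one to .name.
names=$(echo "$people" | jq '.[] | .name')
echo "$names"
# "Alice"
# "Bob"
```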
Indexing an Array
jq uses the standard array syntax to pull out single elements. In this case, the last object is grabbed.
Input
Output
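One way to grab the last object, sketched with an assumed sample array: jq accepts negative indices, which count back from the end of the array.

```shell
# Assumed sample input.
people='[{"name":"Alice","age":30},{"name":"Bob","age":25},{"name":"Carol","age":41}]'

# .[-1] is the last element; .[0] would be the first. -c keeps the result on one line.
last=$(echo "$people" | jq -c '.[-1]')
echo "$last"   # {"name":"Carol","age":41}
```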
Slicing a range of elements from an array
It might be necessary to pull a contiguous series of elements. That is a slicing operation. A range follows the pattern start:end_exclusive. In this example we get the middle 2 elements.
Input
Output
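A minimal sketch of a slice, assuming a made-up four-element array. The range 1:3 covers indexes 1 and 2, since the end index is excluded:

```shell
# Assumed sample input with four elements.
names='["Alice","Bob","Carol","Dave"]'

# .[1:3] returns a new array holding the elements at indexes 1 and 2.
middle=$(echo "$names" | jq -c '.[1:3]')
echo "$middle"   # ["Bob","Carol"]
```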
Selecting specific elements from an array
In this example, the first and last elements in our example are pulled by index.
Input
Output
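As a sketch, assuming the same kind of made-up array, two index expressions joined by a comma pull the first and last elements:

```shell
# Assumed sample input.
names='["Alice","Bob","Carol","Dave"]'

# .[0] is the first element and .[-1] the last; the comma emits both.
ends=$(echo "$names" | jq '.[0], .[-1]')
echo "$ends"
# "Alice"
# "Dave"
```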
Transforming a series of objects
Sometimes we want to change how a JSON object is formatted, such as creating an array whose first element is the name and whose remaining elements are that person’s hobbies. The first operation unpacks the input array. The second builds a new array from the name followed by the splatted hobbies array.
Input
Output
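The transformation described above might look like the following. The field names name and hobbies come from the description; the sample values are assumptions.

```shell
# Assumed sample input.
people='[{"name":"Alice","hobbies":["chess","hiking"]},{"name":"Bob","hobbies":["golf"]}]'

# .[] unpacks the outer array; [.name, .hobbies[]] builds a new array per person,
# with .hobbies[] splatting each hobby in as its own element.
rows=$(echo "$people" | jq -c '.[] | [.name, .hobbies[]]')
echo "$rows"
# ["Alice","chess","hiking"]
# ["Bob","golf"]
```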
Mapping to a new value
Arithmetic and basic string operations can be applied using jq. It prints out the modified object. In this case, we are doubling every person’s age.
Input
Output
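One possible way to double the ages, assuming a made-up sample array, is jq's arithmetic update operator *=, which rewrites the field in place and emits the whole object:

```shell
# Assumed sample input.
people='[{"name":"Alice","age":30},{"name":"Bob","age":25}]'

# .age *= 2 is shorthand for .age = (.age * 2); the rest of the object is untouched.
doubled=$(echo "$people" | jq -c '.[] | .age *= 2')
echo "$doubled"
# {"name":"Alice","age":60}
# {"name":"Bob","age":50}
```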
Filtering JSON
Sometimes we want to keep only some of the elements. jq’s select function does this: it takes a boolean expression and passes through only the inputs for which that expression is true.
Input
Output
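A minimal sketch of select, assuming a made-up sample array and an arbitrary age threshold of 28:

```shell
# Assumed sample input.
people='[{"name":"Alice","age":30},{"name":"Bob","age":25},{"name":"Carol","age":41}]'

# select(...) drops every element for which the expression is false.
older=$(echo "$people" | jq '.[] | select(.age > 28) | .name')
echo "$older"
# "Alice"
# "Carol"
```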
Building a histogram
A histogram is a very useful tool for determining the frequency of values. jq can do the counting itself with group_by or reduce, but ordinary shell commands get us there just as easily. In this example we use the last 5 commits from the jq repository to determine who committed the most in that time period. Awk could be used as an alternative to the sort and uniq pattern.
Input
Output
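The pipeline can be sketched offline as follows. The five commits here are made-up data, loosely shaped like a commit listing with the author name nested under commit.author.name; that shape and the author names are assumptions for illustration, not real jq repository history.

```shell
# Assumed sample input: five fake commits with nested author names.
commits='[
  {"commit":{"author":{"name":"alice"}}},
  {"commit":{"author":{"name":"bob"}}},
  {"commit":{"author":{"name":"alice"}}},
  {"commit":{"author":{"name":"carol"}}},
  {"commit":{"author":{"name":"alice"}}}
]'

# jq -r emits one raw name per line; sort groups identical names together,
# uniq -c counts each group, and sort -rn puts the highest count first.
histogram=$(echo "$commits" | jq -r '.[].commit.author.name' | sort | uniq -c | sort -rn)
echo "$histogram"

# The first line holds the most frequent author: count in column 1, name in column 2.
top_count=$(echo "$histogram" | awk 'NR == 1 { print $1 }')
top_author=$(echo "$histogram" | awk 'NR == 1 { print $2 }')
echo "$top_author committed $top_count times"
```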
Adding a Downloaded Field to a JSON Object
It can be useful to timestamp a downloaded JSON object. For example, suppose a response pulled from some endpoint is cached locally on the web server, and the end user needs to be told when the file was last updated. In this example, the updated_at field is added to an object stored in a file.
Input
Output
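One way to add the field, sketched with an assumed sample file: pass the current time in with --arg and merge it into the object with +. The file contents and the timestamp format are assumptions.

```shell
# Hypothetical cached object, created only for this sketch.
tmp=$(mktemp)
echo '{"name":"Alice"}' > "$tmp"

# --arg binds the shell's UTC timestamp to $now as a string;
# . + {updated_at: $now} returns the object with the extra field.
stamped=$(jq -c --arg now "$(date -u +%Y-%m-%dT%H:%M:%SZ)" '. + {updated_at: $now}' "$tmp")
echo "$stamped"

rm -f "$tmp"
```

Note that jq only prints the result; writing it back means redirecting to a new file and moving that over the old one.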
Useful jq Command Flags
--compact-output/-c
Cut down on the whitespace by turning off pretty printing. Pretty printing
is somewhat expensive because of the extra bytes it produces. Useful when
chaining jq calls.
--unbuffered
Are you reading from a slow source? This flushes each JSON value to the
next pipe or output as soon as it is ready.
--arg name value
jq can use variables in your expression. If you call it with jq
--arg foo 123, the value "123" will be bound to $foo in your
expressions.
--sort-keys/-S
Sort the fields in each output object by the keys.
--monochrome-output/-M
Do NOT output any color. By default jq will colorize output when writing to a terminal.
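A short sketch of a few of these flags together, assuming a made-up sample object with its keys deliberately out of order:

```shell
# Assumed sample input.
json='{"b":2,"a":1}'

# -c prints on one line, -S sorts the keys, -M disables color.
compact_sorted=$(echo "$json" | jq -c -S -M '.')
echo "$compact_sorted"   # {"a":1,"b":2}

# --arg foo 123 binds the string "123" (not the number 123) to $foo.
with_arg=$(echo "$json" | jq -c --arg foo 123 '. + {foo: $foo}')
echo "$with_arg"         # {"b":2,"a":1,"foo":"123"}
```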