Advanced AWS CLI JMESPath Query Tricks

Joseph Lawson, July 27, 2015

JMESPath Query in the AWS CLI

Introduction

The AWS Command Line Interface (AWS CLI) is a great tool for exploring and querying your Amazon Web Services (AWS) infrastructure. AWS provides the AWS Command Line Interface Documentation to give you a good idea of how to use the tool, but some of the nuances of the advanced options are left for the user to discover. This post focuses on the --query option, which filters your command results with JMESPath query expressions. Query expressions let operators skip some, if not all, of the painful exercise of extracting and manipulating the JSON with custom software.

If you haven’t explored it before, take a moment to familiarize yourself with the methods by which you can Control Command Output from the AWS Command Line Interface. For this article specifically, we are going to focus on the techniques highlighted in the section How to Filter the Output with the --query Option.

Looking up Amazon Machine Images

Nearly everyone using AWS has at one time or another had to find an Amazon Machine Image (AMI) ID and code it into their software or launch an instance using it. The most painful part of using these images is that their IDs are very unfriendly for people to remember. Furthermore, the image ID changes every time an update is rolled out, leaving you with an outdated and perhaps soon-to-be-deleted AMI. Let’s take a look at how the AWS CLI, and a few pipe commands, can filter out the extra info so you find the latest and greatest AMI.

If you just run a simple query to see what AMIs Amazon owns, you will be quickly overwhelmed. Below we count how many AMIs AWS itself offers. The default output is JSON, so we switch to --output text, which returns each IMAGES record on a single line that grep can count. In a moment, when we are trying to select JSON within the results, we will use the default --output json.

$ aws ec2 describe-images --owner amazon --output text | grep -c ami
977

977 AMIs! Well there is quite the selection. For some introductory filtering just get the last AMI listed.

What if I just wanted the name and AMI ID of all the images? The AWS CLI has a query option (--query) to filter results. Note that --query takes a JMESPath expression. In the query below, Images[] selects every element of the Images list, so we get all the images owned by AWS. The [ImageId,Name] part is a MultiSelect List. The documentation explains, “A multi-select-list with N expressions will result in a list of length N”, so each image yields a two-element list. The period (.) between Images[] and [ImageId,Name] signifies that we want JMESPath to perform a SubExpression evaluation. A subexpression takes the result of the expression on the left and then evaluates the expression on the right. So the full query says: for every image, return its ImageId and Name. Let’s see the first five results of this query.

$ aws ec2 describe-images --owner amazon --query 'Images[].[ImageId,Name]' --output text | grep -m5 "ami-"
ami-0048c968    .NET Beanstalk Cfn Container v2.0.2.1 on Windows 2012
ami-005daf69    ElasticBeanstalk-Tomcat6-64bit-20110322-2041
ami-0078da69    amzn-ami-pv-2012.03.2.x86_64-s3
ami-00c17768    aws-elasticbeanstalk-amzn-2014.09.0.x86_64-php55-gpu-201409291824
ami-013aca6a    aws-elasticbeanstalk-amzn-2015.03.0.x86_64-tomcat7java6-pv-201506152121

[Errno 32] Broken pipe

So, ignoring the error (grep -m5 exits after its fifth match while the CLI is still writing to the pipe), our first five results have some elasticbeanstalk AMIs, which isn’t too surprising considering that AWS Elastic Beanstalk is one of the original offerings by AWS.

You may be asking yourself, “Hey, Joe just evaluated two expressions, ImageId and Name, but how did he know which to work with?” Sometimes you have to RTFM. The AWS CLI uses botocore, and the botocore documentation gives the response syntax for the botocore ec2 describe_images function call. Each key in the response structure can be evaluated with a SubExpression to get its value. Note that botocore returns a CamelCase structure that mirrors what the EC2 DescribeImages API XML response returns. You can use the AWS API and botocore API references to figure out what is available and then chop away to what your script may need. Yay!
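If you would rather ask the CLI than the manual, JMESPath's built-in keys function can list the keys of a single image. The sketch below is only an illustration: list_image_keys is a hypothetical helper name, and it assumes the aws command is installed and configured with credentials.

```shell
# Hypothetical helper (not part of the AWS CLI): print the top-level keys
# of one image description. Assumes `aws` is installed and configured.
list_image_keys() {
    # Images[0] picks the first (only) image; keys(@) is a JMESPath built-in
    # that returns the key names of the current object.
    aws ec2 describe-images --image-ids "$1" \
        --query 'Images[0] | keys(@)' --output text
}
```

Calling list_image_keys ami-013aca6a should print key names such as ImageId and Name, which are the identifiers you can then use in a query.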

Just to confirm our suspicions, try querying the tomcat7java6 Elastic Beanstalk AMI.

$ aws ec2 describe-images --image-ids ami-013aca6a
{
    "Images": [
        {
            "State": "available",
            "Architecture": "x86_64",
            "OwnerId": "102837901569",
            "KernelId": "aki-919dcaf8",
            "RootDeviceName": "/dev/sda1",
            "RootDeviceType": "ebs",
            "CreationDate": "2015-06-15T21:35:00.000Z",
            "ImageId": "ami-013aca6a",
            "ImageLocation": "amazon/aws-elasticbeanstalk-amzn-2015.03.0.x86_64-tomcat7java6-pv-201506152121",
            "VirtualizationType": "paravirtual",
            "Hypervisor": "xen",
            "Public": true,
            "BlockDeviceMappings": [
                {
                    "Ebs": {
                        "Encrypted": false,
                        "VolumeSize": 8,
                        "DeleteOnTermination": true,
                        "SnapshotId": "snap-521f631c",
                        "VolumeType": "standard"
                    },
                    "DeviceName": "/dev/sda1"
                }
            ],
            "ImageType": "machine",
            "Name": "aws-elasticbeanstalk-amzn-2015.03.0.x86_64-tomcat7java6-pv-201506152121",
            "ImageOwnerAlias": "amazon"
        }
    ]
}

As you can see, both Name and ImageId are present. This concludes our quick overview of JMESPath queries for the AWS CLI.

Advanced JMESPath Queries

As powerful as the simple JMESPath queries are, there is a lot more expression left in the specification! Recall the brackets after Images earlier: they can take an optional expression. There are a number of “bracket-specifier” expressions that allow you to access an array of results. An index expression, i.e. Images[0], returns the result at the specified position within a list. A slice expression uses a start and stop index along with an optional step (the default step value is 1); the stop index itself is excluded. For example, Images[0:5:2] returns three elements, those at indexes 0, 2 and 4, from the range 0 to 5 with a step of 2. Finally, there are Filter Expressions which allow for the comparison operators ==, !=, <, <=, >, and >=. Filter expressions always start with a question mark (?).

JMESPath also includes Function Expressions. When combining Filter Expressions with Function Expressions you really start to feel the power. There are a number of Built-in Functions, all Function Expressions, which allow you to filter down what you see. These functions generally operate on the various Data Types offered by JMESPath, i.e. number, string, boolean, array, object, null. Built-in Functions that work with strings, or with arrays containing them, include contains, ends_with, join, length, reverse, sort, sort_by, and starts_with. You can chain expressions together using Pipe Expressions and Or Expressions. Since it can take a while to learn how to put these queries together, I wanted to step through some advanced examples. Once you get the hang of them, you can really make efficient scripts and queries for your infrastructure.

Previously we used a grep expression to just get the first five results of a query. Let’s replace that grep limit with a slice expression.

$ aws ec2 describe-images --owner amazon --query 'Images[0:5].[ImageId,Name]' --output text
aki-0251b36b    None
aki-0a4aa863    None
aki-12f0127b    None
aki-1a946e73    None
aki-1c669375    vmlinuz-2.6.21.7-2.ec2.v1.3.fc8xen.manifest.xml

Hrm, not quite what was expected. Notice that previously we grepped for the string “ami-”. If you are curious, the prefix aki refers to an Amazon Kernel Image, which all older generation paravirtual (PV) guest machines made use of. When you launched a PV AMI, you were actually choosing a kernel image and a boot disk image. Because these are both disks, AWS stores them all in the images API. If you look where we described ami-013aca6a you can see that it is also associated with "KernelId": "aki-919dcaf8". You can read more about this topic in the EC2 Documentation under the section Linux AMI Virtualization Types.

Okay, let’s try to eliminate the second part of our grep expression using a JMESPath query. The starts_with function looks like a good candidate. So we want to get the first five images whose ImageId starts with ami-. We will use the pipe, |, to chain our Images filters together. Because we need to take the first five that start with ami-, we must make [?starts_with(ImageId, `ami-`) == `true`] the first filter in the chain.

$ aws ec2 describe-images --owner amazon --query 'Images[?starts_with(ImageId, `ami-`) == `true`]|[0:5].[ImageId,Name]' --output text
ami-0048c968    .NET Beanstalk Cfn Container v2.0.2.1 on Windows 2012
ami-005daf69    ElasticBeanstalk-Tomcat6-64bit-20110322-2041
ami-0078da69    amzn-ami-pv-2012.03.2.x86_64-s3
ami-00c17768    aws-elasticbeanstalk-amzn-2014.09.0.x86_64-php55-gpu-201409291824
ami-013aca6a    aws-elasticbeanstalk-amzn-2015.03.0.x86_64-tomcat7java6-pv-201506152121

As you can see, the result is exactly the same as the piped expression we did in the introduction, but it uses pure JMESPath expressions. Neat!

Okay, so we have results coming back that give us AMIs only, but what if we want to find all AMIs whose name starts with aws-elasticbeanstalk? This should be simple enough: we’ll just query for all images whose Name starts with aws-elasticbeanstalk. Right?

$ aws ec2 describe-images --owner amazon --query 'Images[?starts_with(Name, `aws-elasticbeanstalk`) == `true`][0:5].[ImageId,Name]' --output text

In function starts_with(), invalid type for value: None, expected one of: ['string'], received: "null"

Oh no! We’ve run into an edge case with using the starts_with function! starts_with expects only strings to evaluate but it appears that some Name values are null which causes an exception. Because starts_with demands to only get strings we must first find all Images whose names are not equal to null. Comparison Operators and Pipe Expressions to the rescue!

$ aws ec2 describe-images --owner amazon --query 'Images[?Name!=`null`]|[?starts_with(Name, `aws-elasticbeanstalk`) == `true`]|[0:5].[ImageId,Name]' --output text
ami-00c17768    aws-elasticbeanstalk-amzn-2014.09.0.x86_64-php55-gpu-201409291824
ami-013aca6a    aws-elasticbeanstalk-amzn-2015.03.0.x86_64-tomcat7java6-pv-201506152121
ami-033aca68    aws-elasticbeanstalk-amzn-2015.03.0.x86_64-tomcat7java6-hvm-201506152121
ami-08302e60    aws-elasticbeanstalk-amzn-2015.03.0.x86_64-python26-pv-201505182010
ami-08566660    aws-elasticbeanstalk-amzn-2015.03.0.x86_64-python26-hvm-201504032050

Fantastic! We now can discover all the Elastic Beanstalk AMIs using just JMESPath queries.
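As a variation, the null check and the starts_with test can probably be folded into a single filter with the && operator. This is a sketch under assumptions: it needs a CLI whose bundled JMESPath supports && (older releases may not), beanstalk_amis is a hypothetical helper name, and it uses --owners, the documented spelling of the option. Since starts_with already returns a boolean, the == `true` comparison is left out.

```shell
# Hypothetical helper: first five Elastic Beanstalk AMIs, using one filter.
# Assumes `aws` is installed and configured, and that its JMESPath
# implementation supports the && operator.
beanstalk_amis() {
    # && short-circuits, so starts_with is only evaluated for images
    # whose Name is not null.
    aws ec2 describe-images --owners amazon \
        --query 'Images[?Name != `null` && starts_with(Name, `aws-elasticbeanstalk`)] | [0:5].[ImageId,Name]' \
        --output text
}
```

The pipe before [0:5] still matters: it makes the slice apply to the filtered list as a whole rather than to each image.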

What if we just want tomcat7 AMIs? Let’s try out the contains built-in function.

$ aws ec2 describe-images --owner amazon --query 'Images[?Name!=`null`]|[?starts_with(Name, `aws-elasticbeanstalk`) == `true`]|[?contains(Name, `tomcat7`) == `true`]|[0:5].[ImageId,Name]' --output text
ami-013aca6a    aws-elasticbeanstalk-amzn-2015.03.0.x86_64-tomcat7java6-pv-201506152121
ami-033aca68    aws-elasticbeanstalk-amzn-2015.03.0.x86_64-tomcat7java6-hvm-201506152121
ami-143b257c    aws-elasticbeanstalk-amzn-2015.03.0.x86_64-tomcat7java6-pv-201505181913
ami-248dff4c    aws-elasticbeanstalk-amzn-2014.09.1.i386-tomcat7java7-pv-201501140057
ami-34c1775c    aws-elasticbeanstalk-amzn-2014.09.0.x86_64-tomcat7java7-gpu-201409291824

Spot on. Okay we are starting to really get the hang of this. Let’s try one final example which is really to illustrate that sometimes a little Unix pipe command is your friend.

What if you want to retrieve only the latest tomcat7java6-pv AMI? You know there is a CreationDate field, but JMESPath has no date type; it sorts only strings and numbers. Bummer. Don’t fret yet! GNU’s sort has your back, and because CreationDate is an ISO 8601 timestamp in a uniform format, plain text ordering is also chronological ordering. Okay, we want to get all the Elastic Beanstalk AMIs which run Tomcat7 and Java 6 on a paravirtual platform but really only want the latest build. Let’s see how far JMESPath gets us.

$ aws ec2 describe-images --owner amazon --query 'Images[?Name!=`null`]|[?starts_with(Name, `aws-elasticbeanstalk`) == `true`]|[?contains(Name, `tomcat7java6-pv`) == `true`].[CreationDate,ImageId,Name]' --output text
2015-06-15T21:35:00.000Z        ami-013aca6a    aws-elasticbeanstalk-amzn-2015.03.0.x86_64-tomcat7java6-pv-201506152121
2015-05-18T19:18:26.000Z        ami-143b257c    aws-elasticbeanstalk-amzn-2015.03.0.x86_64-tomcat7java6-pv-201505181913
2015-04-20T17:45:22.000Z        ami-58959230    aws-elasticbeanstalk-amzn-2015.03.0.x86_64-tomcat7java6-pv-201504201742
2015-04-03T20:07:00.000Z        ami-6252620a    aws-elasticbeanstalk-amzn-2015.03.0.x86_64-tomcat7java6-pv-201504032003
2014-10-15T22:26:52.000Z        ami-721aa21a    aws-elasticbeanstalk-amzn-2014.09.0.x86_64-tomcat7java6-pv-201410152224
2015-01-27T23:14:12.000Z        ami-a06d29c8    aws-elasticbeanstalk-amzn-2014.09.1.x86_64-tomcat7java6-pv-201501272310
2015-01-14T01:12:09.000Z        ami-a8f280c0    aws-elasticbeanstalk-amzn-2014.09.1.i386-tomcat7java6-pv-201501140107
2014-09-29T18:32:37.000Z        ami-bec177d6    aws-elasticbeanstalk-amzn-2014.09.0.x86_64-tomcat7java6-pv-201409291829

Not bad, but it appears that there are both x86_64 and i386 versions. Let’s filter to only x86_64 versions and then use GNU’s sort to sort on the first column of results.

$ aws ec2 describe-images --owner amazon --query 'Images[?Name!=`null`]|[?starts_with(Name, `aws-elasticbeanstalk`) == `true`]|[?contains(Name, `x86_64-tomcat7java6-pv`) == `true`].[CreationDate,ImageId,Name]' --output text | sort -k1
2014-09-29T18:32:37.000Z        ami-bec177d6    aws-elasticbeanstalk-amzn-2014.09.0.x86_64-tomcat7java6-pv-201409291829
2014-10-15T22:26:52.000Z        ami-721aa21a    aws-elasticbeanstalk-amzn-2014.09.0.x86_64-tomcat7java6-pv-201410152224
2015-01-27T23:14:12.000Z        ami-a06d29c8    aws-elasticbeanstalk-amzn-2014.09.1.x86_64-tomcat7java6-pv-201501272310
2015-04-03T20:07:00.000Z        ami-6252620a    aws-elasticbeanstalk-amzn-2015.03.0.x86_64-tomcat7java6-pv-201504032003
2015-04-20T17:45:22.000Z        ami-58959230    aws-elasticbeanstalk-amzn-2015.03.0.x86_64-tomcat7java6-pv-201504201742
2015-05-18T19:18:26.000Z        ami-143b257c    aws-elasticbeanstalk-amzn-2015.03.0.x86_64-tomcat7java6-pv-201505181913
2015-06-15T21:35:00.000Z        ami-013aca6a    aws-elasticbeanstalk-amzn-2015.03.0.x86_64-tomcat7java6-pv-201506152121

Alright, we are getting close to our goal of the latest and greatest AMI! Since sort defaults to ascending order, the newest image ends up on the last line, so we can take the last line with tail and then print out its second column using GNU’s awk (gawk).

$ aws ec2 describe-images --owner amazon --query 'Images[?Name!=`null`]|[?starts_with(Name, `aws-elasticbeanstalk`) == `true`]|[?contains(Name, `x86_64-tomcat7java6-pv`) == `true`].[CreationDate,ImageId,Name]' --output text | sort -k1 | tail -n1 | gawk '{print $2}'
ami-013aca6a
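The sort, tail and awk stages can be wrapped in a small shell function so the pipeline is easier to reuse. This is a minimal sketch: latest_ami_id is a hypothetical name, plain awk stands in for gawk, and the demo feeds it three rows copied from the output above so it runs without calling AWS.

```shell
# Hypothetical helper: read "CreationDate<TAB>ImageId<TAB>Name" rows on stdin
# and print the ImageId of the row with the greatest CreationDate.
latest_ami_id() {
    # LC_ALL=C forces byte-wise ordering, which is chronological for
    # uniformly formatted ISO 8601 timestamps.
    LC_ALL=C sort -k1,1 | tail -n 1 | awk '{print $2}'
}

# Demo on three rows from the article's output (deliberately out of order).
printf '%s\t%s\t%s\n' \
    2015-05-18T19:18:26.000Z ami-143b257c aws-elasticbeanstalk-amzn-2015.03.0.x86_64-tomcat7java6-pv-201505181913 \
    2015-06-15T21:35:00.000Z ami-013aca6a aws-elasticbeanstalk-amzn-2015.03.0.x86_64-tomcat7java6-pv-201506152121 \
    2014-09-29T18:32:37.000Z ami-bec177d6 aws-elasticbeanstalk-amzn-2014.09.0.x86_64-tomcat7java6-pv-201409291829 |
    latest_ami_id
# prints ami-013aca6a
```

In real use you would pipe the describe-images command with --output text into latest_ami_id instead of the printf demo.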

Fabulous. We maximized our use of JMESPath queries to reduce our resulting set to something very manageable. Let’s use one more GNU tool, xargs from the Findutils suite, to take that AMI ID and make one more ec2 describe-images call to see all the properties of the resulting AMI.

$ aws ec2 describe-images --owner amazon --query 'Images[?Name!=`null`]|[?starts_with(Name, `aws-elasticbeanstalk`) == `true`]|[?contains(Name, `x86_64-tomcat7java6-pv`) == `true`].[CreationDate,ImageId,Name]' --output text | sort -k1 | tail -n1 | gawk '{print $2}' | xargs aws ec2 describe-images --image-ids
{
    "Images": [
        {
            "Hypervisor": "xen",
            "Public": true,
            "RootDeviceType": "ebs",
            "KernelId": "aki-919dcaf8",
            "State": "available",
            "ImageLocation": "amazon/aws-elasticbeanstalk-amzn-2015.03.0.x86_64-tomcat7java6-pv-201506152121",
            "ImageOwnerAlias": "amazon",
            "Name": "aws-elasticbeanstalk-amzn-2015.03.0.x86_64-tomcat7java6-pv-201506152121",
            "VirtualizationType": "paravirtual",
            "Architecture": "x86_64",
            "CreationDate": "2015-06-15T21:35:00.000Z",
            "ImageId": "ami-013aca6a",
            "BlockDeviceMappings": [
                {
                    "DeviceName": "/dev/sda1",
                    "Ebs": {
                        "Encrypted": false,
                        "DeleteOnTermination": true,
                        "VolumeType": "standard",
                        "VolumeSize": 8,
                        "SnapshotId": "snap-521f631c"
                    }
                }
            ],
            "RootDeviceName": "/dev/sda1",
            "ImageType": "machine",
            "OwnerId": "102837901569"
        }
    ]
}

There you have it! The latest and greatest Elastic Beanstalk Tomcat7 with Java6 x86_64 paravirtual AMI without searching on Google. You could make a script which supplies this ID to a CloudFormation template and never have to record AMI IDs in your template again. Later on I’ll explore doing AMI lookups via CloudFormation using Lambda, but I think this is enough for today! Keep in mind that these scripts are very dependent on naming conventions, but AWS and other public AMI owners seem to be fairly consistent in their naming.
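For comparison, here is a sketch of the same lookup done entirely inside the query. It leans on JMESPath's sort_by function treating CreationDate as a string, which only gives chronological order because the timestamps are uniformly formatted ISO 8601. The helper name latest_beanstalk_ami is hypothetical, and the sketch assumes the aws command is installed and configured; it uses --owners, the documented spelling of the option.

```shell
# Hypothetical helper: newest x86_64 tomcat7java6-pv Elastic Beanstalk AMI ID,
# selected without sort/tail/awk. Assumes `aws` is installed and configured.
latest_beanstalk_ami() {
    # sort_by orders the filtered images by the CreationDate string;
    # [-1] then takes the last (newest) one and .ImageId extracts its ID.
    aws ec2 describe-images --owners amazon \
        --query 'Images[?Name!=`null`]|[?starts_with(Name, `aws-elasticbeanstalk`)]|[?contains(Name, `x86_64-tomcat7java6-pv`)]|sort_by(@, &CreationDate)|[-1].ImageId' \
        --output text
}
```

Whether you prefer this or the Unix pipeline is mostly a matter of taste; the pipeline is easier to debug one stage at a time.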

Conclusion

We have explored in depth how JMESPath expressions can really help you narrow down what you are looking for from the AWS CLI. You should now have a better context to work with the more advanced features of JMESPath queries using the AWS CLI. I hope you find this exploration of expressions useful. I’m looking forward to writing more about how search and cloud work together. Please feel free to get in touch!



