# Tutorial: Automate and Integrate AWS Services with Lambda and Python (with code examples)

Image credit: Automation by Nick Youngson CC BY-SA 3.0 Pix4free.org

## Intro

The central idea is to combine virtually any AWS services to form a continuous workflow, automatically triggered by an event of your choice, thus creating a managed or (depending on the services used) even serverless backend: one that is easily scalable and produces the desired output.

In this post we'll look at how to automate workflows in AWS using lambda functions with Python and the Boto3 module, thus connecting otherwise separate AWS services. Lambda is the medium that stitches the services together.

## Objectives

By the end of this post you should be able to:

1. Explain what lambda and the Boto3 module are.

2. Describe how lambda can be used to automate workflows in AWS.

3. Implement two automated workflows, with example Python code.

4. Be well-equipped to keep learning autonomously and start creating your own automated workflows.

## Intro to the Cloud and AWS as a Cloud Provider

A cloud provider is a business that provides servers and services in the cloud, i.e. in remote data centers. These services are accessed over the internet, are available on-demand – on a pay-as-you-go basis – and scale easily and massively. The services can be mixed and matched to build applications that are hosted in the cloud, as opposed to on-premise servers.

The advent of the cloud revolutionized computing because it gives businesses, big and small, huge operational advantages:

1. The cloud allows companies to focus on their core business and value-adding tasks. By outsourcing their infrastructure, i.e. moving servers and storage to the cloud, the undifferentiated heavy lifting of provisioning and maintaining servers is eliminated or drastically reduced.

2. By moving to the cloud companies avoid costly investments in hardware. They can instead opt for on-demand, pay-per-use services.

3. The cloud deals efficiently with fluctuating demand. Demand peaks and valleys cease being a problem. It is possible to scale up or down quickly and automatically, in small increments. Because capacity meets demand much more closely, businesses avoid over or under-provisioning hardware.

4. The cloud is cost-efficient because it takes advantage of gigantic economies of scale, that no single company could possibly achieve.

AWS, which stands for Amazon Web Services, is the biggest cloud provider and was the first to launch, in 2006. As of March 2021, they offer something in the vicinity of a whopping 200 services. Their physical presence is arranged chiefly in terms of regions, of which there are presently 25 distributed around the world. Regions are physically and functionally isolated from each other, by design. Each region contains several zones, typically three to six. Each zone consists of one or more discrete data centers with redundant power, networking and connectivity. Zones in a region are separated by a significant distance, many kilometers, and interconnected by high-throughput networking.

Regions and zones accomplish two main goals:

1. Low latency: by placing your backend resources in close proximity to your user base, you can serve low latency applications.

2. High availability or fault tolerance: these are achieved by having redundant, segregated resources. Regions and zones do just that, to different degrees.

There are other big cloud providers out there, as well as a number of niche cloud providers for niche workloads.

## Intro to the AWS Services Accessory to this Tutorial

### IAM

IAM (Identity and Access Management) is AWS' authentication and authorization service. There are three types of identities supported by IAM:

- Users: users are people, and are given long-term credentials such as a username and password and/or access keys.

- Roles: roles can be people, AWS services, or apps. Roles are meant to be assumed for a set, short period of time and are therefore given short-term credentials, aka security tokens. Roles are typically used to give AWS services access to other services, e.g. give a lambda function access to S3 buckets.

- Groups: groups are collections of users. The primary function of groups is to aggregate users who have the same permission needs. In this way they can be managed in batch, as a single entity. Groups will typically be teams, e.g. engineering, admins, devops, developers, design, etc.

Users, roles and groups are given permissions by means of policies. Policies are JSON documents that define what an identity can or cannot do; they are essentially configuration files that can be attached to identities. By using policies, it is possible to be very granular about which permissions are assigned to any given identity.
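To make that concrete, here's a minimal sketch (using Boto3, which we'll meet later) of what a policy document looks like and how one could be created programmatically. The bucket name and policy name are hypothetical; in this tutorial we'll create our policies through the console instead.

```python
import json
import boto3

# hypothetical example: a minimal policy allowing read-only access to one bucket
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/*"
        }
    ]
}

iam = boto3.client('iam')

# create a managed policy from the JSON document (name is made up)
response = iam.create_policy(
    PolicyName='example-s3-read-only',
    PolicyDocument=json.dumps(policy_document)
)
print(response['Policy']['Arn'])
```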

There is no charge for using IAM, or for creating users, roles, groups or policies.

### S3

S3, or Simple Storage Service, is probably the most popular AWS service. S3 is object storage in the cloud. If you want to store unstructured data, S3 is most likely the solution. S3 is great for images, audio, video, static website files, backups, really any kind of object you may need to store. It is important to bear in mind that objects can only be added, deleted or replaced, but they can't be modified. S3 is not block storage.

Files in S3 are kept in storage units called buckets, as key-value pairs. Keys are usually long and resemble a directory structure, as in `mybucket/prefix/path/to/myobject`, even though at the hardware level it is really flat storage. The values are the files themselves.

S3 is infinitely scalable, fast and inexpensive. It is also very reliable because objects are automatically replicated across at least three zones, giving you multiple segregated copies of your data.

Among other features, with S3 you can encrypt your objects and control access to them using permissions.
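To get a feel for the bucket/key model from Python, here is a short sketch using Boto3 (covered in detail later). The bucket and file names are made up.

```python
import boto3

# a minimal sketch; "example-bucket" and the file names are hypothetical
s3 = boto3.resource('s3')
bucket = s3.Bucket('example-bucket')

# store a local file under a directory-like key
bucket.upload_file('photo.jpg', 'images/2021/photo.jpg')

# list every key in the bucket
for obj in bucket.objects.all():
    print(obj.key, obj.size)

# download it again; remember objects are replaced, never edited in place
bucket.download_file('images/2021/photo.jpg', 'photo-copy.jpg')
```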

### EC2

EC2 is one of the most widely used AWS offerings. It is one of the services responsible for providing cloud compute, in the form of virtual machine rentals. The EC2 pricing model is pay-as-you-go (by the second or by the hour) depending on the machine and plan chosen. It gets more complex because there are many plans and options available, but EC2 is essentially a pay-as-you-go service.

In EC2, virtual machines are called *instances*. There are various CPU, memory, storage and networking configurations available to assemble an instance, so it's relatively easy to match your demand and workload with decent accuracy. There are also many preconfigured machine images available, which include the operating system and, in some cases, specialized software.

Your EC2 instances will be protected by a firewall, in AWS parlance a *security group*. With a security group you can specify the protocols, types of traffic, ports and IP ranges that are allowed to access your instance. Also on the security front, you'll be able to generate an access key pair (public and private keys) to protect login to your instance.

EC2 instances can also be fitted with an *elastic IP*, which is a static IPv4 address associated with your account and therefore fixed. The main uses for an elastic IP are whitelisting it when your instance needs to access public internet resources, and masking failures, because you can fail over to a second machine and keep the IP. Ideally those instances should be in different zones.

You have the option to attach permanent storage to your instance that is not deleted even if the instance hibernates, shuts down or is terminated. This permanent storage comes in the form of *EBS volumes*, presented next.
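As a taste of what's coming in the code examples, here is a hedged Boto3 sketch that lists your instances; the commented-out lines show how instances with a hypothetical "env: dev" tag could be stopped in code.

```python
import boto3

# a minimal sketch of inspecting instances with Boto3; no particular instances assumed
ec2 = boto3.resource('ec2')

for instance in ec2.instances.all():
    # print the ID, type, state and public IP (None if the instance has no public IP)
    print(instance.id, instance.instance_type,
          instance.state['Name'], instance.public_ip_address)

# instances can also be stopped or started in code, which is handy for automation
# for instance in ec2.instances.filter(Filters=[{'Name': 'tag:env', 'Values': ['dev']}]):
#     instance.stop()
```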

### EBS

Elastic Block Store (EBS) is, as the name implies, cloud block storage, meaning that you can modify your files in place, down to the byte level.

EBS is persistent network storage, and as such can be disconnected from one EC2 instance and connected to another. It works much like an external hard drive that you can connect to any computer. The data in an EBS volume is automatically replicated within its zone, to avoid data loss in case of component failure.

EBS comes in two basic flavors: HDD and SSD, priced accordingly. SSD is the faster and more costly option, generally better for random access and if you need really high IOPS. An HDD contains moving parts and is therefore slower, but cheaper. It is generally better for sequential access such as streaming, logs, big data. An HDD is also good for colder, infrequently accessed data.

One of the nice features of EBS volumes is that their configuration can be adjusted on-the-fly, with a live, production app without downtime. So you can change volume type, size, or IOPS, anytime.
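A sketch of what such an on-the-fly adjustment could look like with Boto3; the volume ID is hypothetical.

```python
import boto3

ec2_client = boto3.client('ec2')

# a minimal sketch, assuming a made-up volume ID:
# grow the volume to 200 GiB and switch it to gp3 with higher IOPS, with no downtime
ec2_client.modify_volume(
    VolumeId='vol-0123456789abcdef0',
    Size=200,
    VolumeType='gp3',
    Iops=4000
)
```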

You can also enable EBS encryption. It encrypts data at rest on the volume, data in transit between the instance and the volume, snapshots taken from the volume, and volumes created from those snapshots.

### Eventbridge (formerly CloudWatch Events)

Eventbridge is an event bus service that uses the pub/sub model, and delivers events in near real-time to/from other AWS services. Think of Eventbridge as an all-seeing eye that is always watching your AWS account, noting and delivering events as per your request.

Events are actions or state changes. They are represented by small JSON documents that contain information about the change. Events are calls-to-action that trigger other services, called targets. Targets can be lambda functions, EC2 instances, SNS topics, SQS queues, Cloudwatch logs, pipelines in CodePipeline, and many others. Events can be as simple as the arrival of a certain date and time. In this way it's possible to schedule tasks in the same fashion as a Linux cron job, e.g. turning off EC2 instances during weekends or periods of low demand, so you avoid paying for underutilized resources.
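For illustration, this is roughly what one of those small JSON documents looks like (an EC2 state-change event, trimmed, with made-up IDs), shown here as the Python dictionary a lambda handler would receive. The "detail" section varies per event type.

```python
# roughly what an Eventbridge event looks like by the time it reaches a lambda handler
sample_event = {
    "version": "0",
    "id": "12345678-1234-1234-1234-123456789012",
    "detail-type": "EC2 Instance State-change Notification",
    "source": "aws.ec2",
    "account": "123456789012",
    "time": "2021-03-01T12:00:00Z",
    "region": "us-west-2",
    "resources": ["arn:aws:ec2:us-west-2:123456789012:instance/i-0123456789abcdef0"],
    "detail": {
        "instance-id": "i-0123456789abcdef0",
        "state": "stopped"
    }
}
```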

Eventbridge supports not only in-account events, but also events between AWS accounts and a number of third-party partner events. Some of the best known are Shopify, Salesforce, Auth0, MongoDB, Symantec, Zendesk.

As you have probably surmised by now, Eventbridge is great for **automation**.

## Serverless Architecture

Serverless is a relatively new computing paradigm in which, from the standpoint of an application and the people developing and maintaining it, servers are abstracted away. Of course that doesn't mean there really aren't any servers, only that you, as the cloud provider's customer, don't have to patch, secure, harden, provision, or maintain servers in any way. As a matter of fact, you won't even see them in your cloud provider's account.

Serverless started as FaaS (function as a service) with AWS lambda, but has since expanded to BaaS (backend as a service), meaning any cloud offering in which servers have been abstracted away. A few examples: API Gateway for a serverless API service, Cognito and Auth0 for authentication, Dynamo DB and Firebase for databases, S3 for storage, SQS for queueing, SNS for notifications.

These are the main benefits of serverless:

1. You never have to provision servers, maintain them, or manage capacity. You get to focus on your application.

2. Never pay for idle time. With serverless you pay per request, not for server time. You only pay for what you use. Contrast that with EC2, in which you pay by provisioned instance and by the hour.

3. Serverless is highly available. You don't have to think about deploying redundancies, about avoiding downtime.

4. Serverless scales automatically, in or out.

Serverless ideally entails the use of an event-driven architecture, using FaaS to provide the backend logic. A FaaS product is only called when an event happens.

## Lambda functions

Lambda is AWS' FaaS offering. With lambda you can run your code in the cloud with zero server administration. You pay per request, there's no charge when your code is not running.

Main lambda features:

1. Lambda is stateless, meaning each run starts from a clean slate.

2. Maximum execution time is 15 min.

3. You get to pick the memory size for your function, to optimize execution time.

4. Lambda is highly available.

5. Lambda scales automatically, up to hundreds of thousands of requests per second.

### Lambda is event-driven

Lambda functions are invoked by events. Some of the most common services that can generate events and trigger lambda directly are API Gateway, Cloudwatch Logs, Eventbridge, Dynamo DB, Kinesis, S3, SNS and SQS. These are only the typical lambda triggers: conveniently, lambda is really well integrated with the whole suite of AWS services and can be triggered by literally hundreds of them. Some of these services can't trigger lambda directly, but they can do so through Eventbridge rules, and the end result is the same.
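Whatever the trigger, the contract is always the same: lambda calls a handler function and passes the event in as a dictionary. A minimal handler, useful for inspecting what a given trigger actually sends, could look like this:

```python
import json

# the smallest possible lambda: log whatever event triggered it
# every trigger listed above ends up as the "event" dictionary passed in here
def lambda_handler(event, context):
    print(json.dumps(event))
    return {"status": "ok"}
```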

### Lambda as the Glue between AWS Services

So far we've seen that lambda can be triggered by hundreds of other AWS services, effectively creating integration between those services and lambda. But that's only *upstream* of our lambda functions. This raises the question: can lambda be used to trigger services *downstream* in the workflow? The answer is a resounding **Yes**. That can be achieved by employing lambda destinations.

A lambda destination can be another lambda, an SNS topic, an SQS queue, or an Eventbridge event bus. By picking a destination, you enable your lambda function to forward execution records (like events, in the form of JSON objects) to any of the aforementioned services, thereby creating integration downstream of the lambda function. And because Eventbridge rules can trigger hundreds of services, destinations really fan out to enable AWS-wide integration.

Destinations also allow us to simplify our apps by leveraging a microservice architecture, for instance by writing two lambdas and connecting them with a destination. By keeping our concerns separate, each in its own lambda, our code becomes more maintainable and scalable.
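Destinations can be configured in the console or in code. Here's a hedged Boto3 sketch of the latter; the function name matches the one we'll build later, but the SNS and SQS ARNs are made up, and note that destinations apply to asynchronous invocations.

```python
import boto3

lambda_client = boto3.client('lambda')

# a sketch of wiring up destinations in code (hypothetical ARNs)
lambda_client.put_function_event_invoke_config(
    FunctionName='makeThumbnails',
    DestinationConfig={
        'OnSuccess': {'Destination': 'arn:aws:sns:us-west-2:123456789012:thumbs-ok'},
        'OnFailure': {'Destination': 'arn:aws:sqs:us-west-2:123456789012:thumbs-dlq'}
    }
)
```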

With triggers and destinations, in a sense lambda is the glue that keeps AWS services together.

### Using lambda for automation

So here we have a summary of how lambdas automate workflows in AWS:

- They can be automatically triggered directly by some services, or indirectly via Eventbridge rules. They can also be automatically triggered by Cloudwatch Alarms – an alarm is just another form of event.

- Lambdas can themselves automatically trigger other lambdas or other services by using destinations.

- The code you write into your lambdas can also perform tasks involving any other services, by using client or resource objects representing an instance of a service. We'll see how all of that works with two code examples. In these examples our lambda code will cause S3, EC2 and EBS to perform actions automatically.

### Supported Languages

Lambda natively supports Python, Node.js, Ruby, Java, Go and .NET core (C#/Powershell).

## The Boto3 Module for Python

### What is Boto3?

Boto3 is the Python SDK for AWS. Boto3 is used when coding against AWS services such as EC2 or S3. It gives you convenient object-oriented representations of AWS services, so you can invoke properties and methods to work with those services, in the same way you would with any object-oriented programming.

The Boto3 library is built on top of Botocore. Botocore is a lower level Python library also used by the AWS CLI (command line interface) itself. Botocore is the module that provides client objects to both the AWS CLI and Boto3.

In our code examples we'll import Boto3, so we can create resources and clients.

### Resources

To put it succinctly, resources are objects that can be instantiated with Boto3. Resources are high-level interfaces that give us references to AWS services within our code.

I tend to favor the use of resources over clients because resources enable high-level object-oriented programming. Code with resources is simpler, more compact and easier on the eyes. However resources don't offer 100% coverage of AWS services, so sometimes it's necessary to fall back on clients.

### Clients

A client is much the same as a resource: an object generated with Boto3 that also gives you programmatic access to AWS services. The difference is that clients are lower-level interfaces. Client methods are snake-cased versions of the underlying API operations and return plain dictionaries, which is not the most beautiful syntax. There are advantages though:

1. Clients offer 100% coverage of AWS services.

2. Client methods usually match AWS CLI commands.

Keep clients in mind and use them if you need more specific functionality.
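To make the distinction concrete, here is the same trivial task (listing your buckets) done both ways. This is a minimal sketch that only assumes your AWS credentials are configured.

```python
import boto3

# resource: high-level, object-oriented
s3_resource = boto3.resource('s3')
for bucket in s3_resource.buckets.all():
    print(bucket.name)

# client: lower-level, mirrors the API/CLI, returns plain dictionaries
s3_client = boto3.client('s3')
response = s3_client.list_buckets()
for bucket in response['Buckets']:
    print(bucket['Name'])
```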

## Code example 1, Automatically Resizing S3-uploaded Images to Thumbnails

### Project Overview

For this project we'll have two S3 buckets: the *inbucket* and the *outbucket*. The inbucket and outbucket are named from the final user's standpoint. The inbucket is where our users will upload their original images to. The outbucket is where the thumbnails will be stored, and available to be downloaded by the users.

An image upload to the inbucket will be the event that triggers lambda to automatically process the image. Once processed, our code will store the thumbnails in the outbucket.

The lambda function will need an IAM role, so it has the permissions to read from and write to our S3 buckets.

### Libraries Used

Libraries we'll be importing in our code:

- Boto3: because we have to deal with AWS services.

- PIL: the Python Imaging Library. Instead of using a (paid) AWS service to process our images, we'll be using Pillow, a friendly fork of PIL. Pillow is free to download and free to use, is a fairly capable image processing library, and can be installed directly from the PyPI repository.

- The *os* library: *os* stands for operating system. It allows us to interact with the underlying operating system. In this example we'll be using it to create and modify directories and paths, and also to get hold of environment variables.

- The *tempfile* library: we'll be using it to create a temporary working directory in which to store the original images acquired from S3 (previously uploaded to S3 by the user), and also to temporarily store the generated thumbnails before sending them back to S3. A short local sketch using Pillow and tempfile follows this list.
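If you want to try the resize logic on your machine before touching lambda, a minimal local sketch looks like this. It assumes a hypothetical photo.jpg in the working directory and Pillow installed locally.

```python
from PIL import Image
import os
import tempfile

# a quick local dry run of the resize logic before wiring it into lambda
size = (150, 150)

with tempfile.TemporaryDirectory() as tempdir:
    thumb_path = os.path.join(tempdir, 'thumb-photo.jpg')
    with Image.open('photo.jpg') as image:   # photo.jpg is a made-up example file
        image.thumbnail(size)                # resize in place, preserving aspect ratio
        image.save(thumb_path)
    print('thumbnail written to', thumb_path, os.path.getsize(thumb_path), 'bytes')
```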

### Create the S3 buckets

The first step is to create the two buckets:

1. Head to your AWS console and type *S3* in the search box. Select the S3 service.

2. Now that you're in the S3 console, click the big orange "Create bucket" button on the top right.

3. Give your bucket a unique name, some variant of inbucket. Mine is called *inbucket-8*.

4. Pick your region, whatever is closest to you. Leave all remaining settings as defaults, scroll down and create the bucket.

5. Do the same for the outbucket. Your final result should look like this:

[Screenshot: the S3 console listing both buckets, the inbucket and the outbucket]

### Create the Permissions Policy

1. Type *iam* in the aws search box to go to the IAM console.

2. Click "Policies" under "Access Management" on the left-hand side of the screen.

3. Now click the blue "Create policy" button on the top left.

4. Click the "JSON" tab. Delete everything that's there and paste the following policy:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::<your-inbucket-name-here>/*"
        },
        {
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::<your-outbucket-name-here>/*"
        },
        {
            "Effect": "Allow",
            "Action": "logs:CreateLogGroup",
            "Resource": "arn:aws:logs:<your-region-here>:<your-account-number-here>:*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": [
                "arn:aws:logs:<your-region-here>:<your-account-number-here>:log-group:/aws/lambda/<your-lambda-here>:*"
            ]
        }
    ]
}
```

1. Replace the appropriate placeholders with your buckets' names. Without quotation marks.

2. Replace the appropriate placeholders with your region's name, e.g. 'us-west-2'. Without the quotation marks.

3. Replace the appropriate placeholders with your 12-digit account number.

4. Replace the appropriate placeholder with your lambda's name, e.g. 'makeThumbnails'. Without the quotation marks.

5. Click the "Next" button, then "Next" again. There's no need to change anything.

6. Give your policy a name and then hit "Create policy". Memorize this name, you'll need it in a minute.

### Create the Lambda Role and Attach Permissions

Now we have to create the role and attach our newly-created policy to the role.

1. Navigate back to the IAM console.

2. Click "roles" under "Identity and Access Management" on the left-hand panel. Now click the "Create role" blue button.

3. Select "AWS service" and then "Lambda". Click the "Next: Permissions" blue button.

4. Find the policy that you just created using the search box. Select it and click "Next: Tags". Now click "Next: Review".

5. Give your role a descriptive name and click "Create role".

Your role should look something like this:

[Screenshot: the IAM role summary showing the attached policy]

Great! Now the role is ready with permissions to access the buckets.

### Create the Lambda function

1. Type *lambda* in the services search box to go to the Lambda console.

2. On the AWS menu bar, choose the same region that you chose for the S3 buckets. Click the "Create function" button.

3. Choose "Author from scratch". Fill in the function name, I'll go with **makeThumbnails* Lambda*. Pick the latest Python runtime.

4. Open the "Change default execution role" drop down and pick "Use an existing role". Pick the role you created in the previous step.

5. Hit "create function".

And the Lambda's been successfully created!

### Add the S3 trigger

1. Navigate to the Lambda console, and click "functions" on the left-hand panel. Now find your Lambda and click it.

2. Click the "Add trigger" button.

3. Now select "S3", your inbucket, and "All object create events" for the event type. No need to create a prefix or a suffix. Click "Add".

### Increase the execution time

1. While you're at the Lambda overview, click the "Configuration" tab.

2. Under "General configuration", click "Edit".

3. Increase the execution time to 10s under "Timeout".

Why? Because depending on the size and complexity of the image, the default 3s may not be enough.

### Set up environment variables

Our *makeThumbnails* Lambda function needs to access one environment variable, the outbucket name. So we have to set it in our Lambda configuration.

1. Click the "Configuration" tab, and then "Environment variables" on the left-hand panel.

2. Click "Edit", and then "Add environment variable".

3. Fill in the "key" and the "value" fields. It should look like this:

[Screenshot: an environment variable with key outbucket and your outbucket's name as the value]

Make sure to replace the value with the name of your own outbucket.

### Create the Code File

Time to add our Python code! Pull up the text editor of your choice and paste in the code below. Name the file *lambda_function.py*; that's lambda's default.

The code is extensively commented, so it's easy to understand. All comments refer to the next line.

```python
import boto3
from PIL import Image
import os
import tempfile

# the chosen size for the thumbnail
size = (150, 150)
# acquire the outbucket name from the environment
outbucket = os.environ['outbucket']
# the s3 resource created by boto3
s3 = boto3.resource('s3')


# helper function, the resizing engine
def makeThumbnails(tempDownloadPath, tempUploadPath):
    # Image is the class that opens the image file as an image object
    with Image.open(tempDownloadPath) as image:
        # this is where the magic happens
        image.thumbnail(size)
        # save the thumbnail to a temp directory
        image.save(tempUploadPath)


# the main function
# lambda_handler is the entry point lambda expects
def lambda_handler(event, context):
    # the for-loop makes the function robust,
    # in case there are multiple images per event
    # the "event" object is the event json
    for record in event['Records']:
        # acquire inbucket name
        inbucket = record['s3']['bucket']['name']
        # acquire inbucket-uploaded file name
        fileName = record['s3']['object']['key']
        # concat the "thumb-" prefix
        thumbFileName = 'thumb-' + fileName
        # have the OS create a temp work directory
        with tempfile.TemporaryDirectory() as tempdir:
            # basically concat the file name to tempdir
            tempDownloadPath = os.path.join(tempdir, fileName)
            # same for the thumb file
            tempUploadPath = os.path.join(tempdir, thumbFileName)
            # download the image from the inbucket
            s3.Bucket(inbucket).download_file(fileName, tempDownloadPath)
            # call helper function
            makeThumbnails(tempDownloadPath, tempUploadPath)
            # upload thumb file to the outbucket
            s3.Bucket(outbucket).upload_file(tempUploadPath, thumbFileName)
        # log results to Cloudwatch logs
        print(f"Thumbnail saved in bucket {outbucket} as {thumbFileName}.")
```

### Download Pillow and Create the Lambda Zip

Because Pillow is not part of the standard library, it's not normally available with lambda. We have to download it from the repo, create a zip package with our code plus Pillow, and upload the zip to lambda.

1. Head to the Pillow project in the PyPI repo (we'll use version 8.1.2).

2. Click "Download files".

3. Now select the correct wheel. This is it:

[Screenshot: the Pillow-8.1.2-cp38-cp38-manylinux1_x86_64.whl file highlighted in the PyPI downloads list]

And here's why:

- "cp38" to match our chosen Python runtime, 3.8.

- "manylinux" to match lambda's underlying OS.

- "x86_64" to match lambda's underlying hardware architecture, x86 64-bit.

4. Unzip the wheel in the same directory where lambda_function.py is:

```sh
unzip Pillow-8.1.2-cp38-cp38-manylinux1_x86_64.whl
```

5. Remove the "Pillow-8.1.2.dist-info" directory, it's not really code:

```sh
rm -rf Pillow-8.1.2.dist-info
```

6. Create the zip. Don't use *tar* or *gzip*, as lambda only accepts the zip format:

```sh
zip -r9 lambda.zip lambda_function.py PIL Pillow.libs
```

7. Go to the lambda console, open the "Code" tab, and click "Upload from". Upload the zip. Your final result should look like this (comments removed for clarity):

[Screenshot: the Lambda code editor showing lambda_function.py alongside the PIL and Pillow.libs folders]

There you have it! You're now ready to upload an image to the inbucket and get a thumbnail in the outbucket.

### Test the Workflow

1. Head to your AWS console and type *S3* in the search box. Select the S3 service.

2. Now that you're in the S3 console, click your inbucket's name.

3. Click the big orange button, "Upload".

4. Click the "Add files" button and choose any image you may have. Click "Upload".

5. Open the left-hand side panel, and click "Buckets".

6. Now click your outbucket's name and see that you have a thumbnail there!

A word of caution: make sure your image filename does not contain spaces. If it does, AWS will replace the spaces with '+' in the event json, and the lambda will fail.
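If you do want to handle such filenames, a common fix (not part of the code above) is to URL-decode the key before using it. A minimal illustration:

```python
from urllib.parse import unquote_plus

# S3 URL-encodes object keys in the event json, so "my photo.jpg" arrives as "my+photo.jpg"
# inside the handler's loop you could decode the key before downloading, e.g.:
#     fileName = unquote_plus(record['s3']['object']['key'])
print(unquote_plus('my+photo.jpg'))   # -> my photo.jpg
```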

## Code Example 2, Automatically Taking Daily EBS Snapshots

### Project Overview

This project will create snapshots of our EBS volumes every day. We'll use a Cloudwatch rule to create the schedule that will trigger our lambda every 24 hours.

The code will iterate through every AWS region looking for EC2 instances tagged "snapshots: true". The tagging is optional: if you'd like to back up every EC2 instance, just delete the `.filter(...)` call and leave `instances = ec2.instances` in the code. Or tag your instances differently if you'd like, and change the code accordingly.

Additionally, for every EC2 instance found, the code will search for EBS volumes, because each instance can have multiple volumes attached. Once an EBS volume is found, a snapshot is created.

As always, we'll need a lambda role with adequate permissions.

### Libraries Used

- Boto3

- Datetime: datetime is a standard-library module that provides classes for dealing with dates and times in Python, such as date, time, datetime, and timezone objects.

### Create the Permissions Policy

There are a number of permissions that we must add to be able to create EBS snapshots.

1. Type *iam* in the aws search box and go there.

2. Click "Policies" under "Access Management" on the left-hand side of the screen.

3. Now click the blue button on the top left, "Create policy".

4. Click the "json" tab. Delete everything that's there and paste the following policy:


```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:CreateSnapshot",
                "ec2:DeleteSnapshot",
                "ec2:ModifySnapshotAttribute",
                "ec2:ResetSnapshotAttribute",
                "ec2:Describe*",
                "ec2:CreateTags"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": "logs:CreateLogGroup",
            "Resource": "arn:aws:logs:<your-region-here>:<your-account-number-here>:*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": [
                "arn:aws:logs:<your-region-here>:<your-account-number-here>:log-group:/aws/lambda/<your-lambda-here>:*"
            ]
        }
    ]
}
```

1. Replace the appropriate placeholders with your region's name, e.g. 'us-west-2'. Without the quotation marks.

2. Replace the appropriate placeholders with your 12-digit account number.

3. Replace the appropriate placeholder with your lambda's name, e.g. 'makeSnapshots'. Without the quotation marks.

4. Click the "Next" button, then "Next" again. There's no need to change anything.

5. Give your policy a name and then hit "Create policy". Memorize this name, you'll need it in a minute.

### Create the Lambda Role and Attach Permissions

Now we have to create the role and attach our newly-created policy to the role. So back to the IAM console.

1. Navigate back to the IAM console.

2. Click "roles" under "Identity and Access Management" on the left-hand panel. Now click the "Create role" blue button.

3. Select "AWS service" and then "Lambda". Click the "Next: Permissions" blue button.

4. Find the policy that you just created using the search box. Select it and click "Next: Tags". Now click "Next: Review".

5. Give your role a descriptive name and click "Create role".

At the end your role should look something like this:

[Screenshot: the IAM role summary showing the attached snapshots policy]

Great! Now the role is ready with permissions to create and manage EBS snapshots.

### Create the Lambda function

1. Head to the Lambda console and hit "Create function".

2. Choose "Author from scratch". Fill in the function name, I'll go with *makeSnapshots*. Pick the latest Python runtime.

3. Open the "Change default execution role" drop down and pick "Use an existing role". Pick the role you created in the previous step.

4. Hit "create function".

Our Lambda contains no code yet. We'll add it shortly. But the Lambda's been successfully created!

### Increase the execution time

1. In your Lambda's page, click the "Configuration" tab.

2. Under "General configuration", click "Edit".

3. Increase the execution time to 30s.

The makeSnapshots Lambda takes about 17s to run, likely because there are many regions to search through and quite a bit of networking involved.

### Create the rule to trigger snapshots

Time to create the rule that will trigger snapshots on a schedule, that is, every 24 hours. If you want your snapshots to happen at a particular time of day, you'd have to use a cron expression instead. For the sake of simplicity, we won't be doing that in this tutorial.
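For reference, this is roughly what scheduling the rule in code would look like, including the cron alternative we're skipping. The rule name is hypothetical, and in this tutorial we'll create the rule and attach the lambda target in the console instead (which also wires up the invoke permission for you).

```python
import boto3

events = boto3.client('events')

# fixed-rate schedule, the equivalent of what we'll configure in the console below
events.put_rule(
    Name='daily-snapshots',
    ScheduleExpression='rate(24 hours)',
    State='ENABLED'
)

# cron alternative: every day at 03:00 UTC
# events.put_rule(
#     Name='daily-snapshots',
#     ScheduleExpression='cron(0 3 * * ? *)',
#     State='ENABLED'
# )
```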

1. Type *cloudwatch* in the services search box to go to the Cloudwatch console.

2. On the left-hand panel, under "Events" click "Rules".

3. Now click "Create rule".

4. Pick the "Schedule" radio button.

5. Choose "Fixed rate" and enter 24 hours.

6. Click "Add target" and select your Lambda.

7. Leave the defaults as are, and click "Configure details".

8. Name the rule, make sure the checkbox "Enabled" is checked, and click "Create rule".

Now if you click your rule, it should look like this:

[Screenshot: the rule details showing a fixed-rate 24-hour schedule with the Lambda as target]

And this is how the Lambda overview should be after adding the trigger:

[Screenshot: the Lambda overview showing the Eventbridge (CloudWatch Events) trigger]

### Add the Code

Let's add the code to our lambda function. Again the code is thoroughly commented. Under your lambda's "Code source" panel, double click the file "lambda_function.py". Delete everything that's there, grab the code below and paste it directly in:

```python
import boto3
from datetime import datetime


def lambda_handler(event, context):
    # an ec2 resource does not contain a "describe_regions()" method
    # so we need a client
    # instantiate ec2 client
    ec2_client = boto3.client('ec2')

    # this list comprehension will create an array
    # with all regions that work with EC2
    regions = [region['RegionName']
               for region in ec2_client.describe_regions()['Regions']]

    for region in regions:
        # return an EC2 resource for the region
        # the resource object contains the instances for that region
        ec2 = boto3.resource('ec2', region_name=region)

        # return the instances with the tag "snapshots: true"
        instances = ec2.instances.filter(
            Filters=[
                {'Name': 'tag:snapshots', 'Values': ['true']}
            ]
        )

        # iterate through all the instances of "region"
        for instance in instances.all():
            # iterate through all the EBS volumes of "instance"
            for ebs in instance.volumes.all():
                # return a string representing the date and time
                # "YYYY-MM-DD HH:MM:SS"
                # if you need ISO 8601 format, append ".isoformat()" to the below line
                timestamp = datetime.utcnow().replace(microsecond=0)
                # create description
                description = (f"instance: {instance.id}, "
                               f"EBS volume: {ebs.id}, created at {timestamp}")
                # create snapshot
                snapshot = ebs.create_snapshot(Description=description)

                # log results
                print(f"snapshot: {snapshot.id}, instance: {instance.id}, "
                      f"EBS volume: {ebs.id}, created at {timestamp}")
```
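As an optional tweak, the policy above already allows ec2:CreateTags, so you could also tag each snapshot, which makes them easier to find and filter later. A standalone sketch of the idea, assuming a hypothetical snapshot ID (inside the lambda you would call create_tags on the snapshot object right after ebs.create_snapshot):

```python
import boto3

ec2 = boto3.resource('ec2')

# a sketch with a made-up snapshot ID: give the snapshot a couple of tags
snapshot = ec2.Snapshot('snap-0123456789abcdef0')
snapshot.create_tags(Tags=[
    {'Key': 'Name', 'Value': 'daily-backup'},
    {'Key': 'created-by', 'Value': 'makeSnapshots'}
])
```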

### Test the Workflow

There you have it. Now go test the automation and your Lambda. Make sure you have a couple of EC2 instances with EBS volumes attached, and don't forget to tag your instances.

1. Head to your AWS console and type *Eventbridge* in the search box. Select the Eventbridge service.

2. Click "Rules" on the left-hand panel, and click the rule you created.

3. Reduce the period in the scheduled event to two minutes.

4. Now type *EC2* in the AWS search box. Select the EC2 service.

5. Search for "Snapshots" in the left-hand panel and click it.

6. Now watch your lambda create snapshots every two minutes!

## Conclusion

In this tutorial we saw how it is possible to integrate and automate AWS workflows using lambda functions. Lambdas come with a good number of upsides: they are cheap and you don't pay for idle time, they scale automatically and to a great degree, they offer concurrency, and they're also highly available.

Lambdas are very customizable in that you can use a variety of languages and runtimes, and even add additional language support by uploading custom runtimes. Lambdas also offer a myriad of configuration settings, such as memory, timeout, triggers, destinations, environment variables, secrets, async invocations, retries, etc.

They're not perfect though. There are limitations to consider, namely:

1. Lambdas don't lend themselves to long-running services, because their execution time is capped at 15 minutes.

2. Lambdas alone are really well-suited for microservices, but not for an entire app, because they're stateless.

3. There are other specialized services that are a better fit for particular integration use cases.

So regarding automated integration of AWS workflows, let's now turn our attention to alternatives and complements to lambda.

### Step Functions

Step functions are not so much an alternative as they are a complement to lambda. The Step Functions service provides state machines, which orchestrate and coordinate the components (steps) of a distributed app or workflow. They employ simple logic to coordinate steps, such as serial and parallel execution, conditionals, branching, timeouts, and error handling.

Step functions can orchestrate lambdas, EC2 instances, containers, Dynamo DB, SQS, SNS, many other AWS services and even on-premise servers. Step functions are a true integration tool.

As the name implies, state machines can maintain state, thus overcoming lambda's shortcoming #2 above.

State machines also support long execution times: think weeks, even months. Thus overcoming lambda's shortcoming #1.

### SQS

SQS is AWS' simple queue service. SQS provides serverless message queues for messaging between AWS services. Its main use is to create loosely coupled architectures: a frontend pool adds messages to a queue, and the messages are retrieved by a worker pool through polling.

SQS can integrate lambdas, EC2 machines, ECS containers, S3 buckets, Dynamo DB, SNS, and others. It is therefore also a great integration and automation tool and an alternative to lambda.
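A minimal sketch of that producer/consumer pattern with Boto3; the queue name is hypothetical and the queue is assumed to already exist.

```python
import boto3

sqs = boto3.resource('sqs')
queue = sqs.get_queue_by_name(QueueName='example-queue')

# producer side: drop a message onto the queue
queue.send_message(MessageBody='resize photo.jpg')

# worker side: long-poll for messages, process them, then delete them
for message in queue.receive_messages(MaxNumberOfMessages=1, WaitTimeSeconds=10):
    print('processing:', message.body)
    message.delete()
```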

SQS is a better, more specific service than lambda for integration via queues. It's also a serverless offering and boasts a number of features dedicated to message queuing:

- Standard, FIFO, and delay queues.

- Visibility timeout.

- Dead letter queue.

- Short and long polling.

- Adjustable message retention period.

- Auto deletion of expired messages.

- And others.

Thus overcoming lambda's shortcoming #3, regarding queues.

### SNS

SNS stands for Simple Notification Service. Also used for app integration, SNS is the de facto notification service in AWS. It employs the pub/sub architecture: you create a topic, and there can be many publishers and many subscribers. Messages are published to the topic by any of the publishers and sent to all subscribers.

SNS integrates with a great number of services. Publishers can be, for instance, the public internet (using an http/s endpoint), S3 buckets, EC2, Cloudwatch alarms, Auto-scaling groups, Cloudformation stacks, and others. Subscribers can be the public internet, email addresses, SQS queues, lambda, mobile push notifications, etc.
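And a minimal publish sketch with Boto3; the topic ARN is made up, and every subscriber on the topic would receive the message.

```python
import boto3

sns = boto3.client('sns')

# publish one message; SNS fans it out to every subscriber on the topic
sns.publish(
    TopicArn='arn:aws:sns:us-west-2:123456789012:example-alerts',
    Subject='Nightly snapshots finished',
    Message='All tagged EBS volumes were snapshotted successfully.'
)
```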

SNS is a better tool than lambda for notifications. Again, because it's a specialized tool with special features for notifications, e.g. message filtering for different kinds of subscribers.

Thus overcoming lambda's shortcoming #3, regarding notifications.

*If this tutorial helped you, please like and leave a comment.*

Happy automating and integrating!

P.S. Don't forget to delete the resources you created for this tutorial, lest you incur AWS charges.
