I just started playing around a lot with different parts of AWS. Before, whenever I wanted to run a one-off job like an RDS snapshot, I would put a cron job on a server I knew I would never have more than one of.

So, for example, I would have a cron entry on my salt master server that looked like this:

0 * * * * /opt/aws-scripts/api-rds-snapshot.py

That would take an hourly snapshot of an RDS instance. There are a few things “wrong” with this solution:

  1. You need to grant the instance access to take the snapshot. This might be through an IAM role or an API key/secret in the script.
  2. If the instance is down you might lose a backup.

Number 2 is pretty straightforward, but number 1 raises numerous issues.

For example, if you grant access through an API key/secret or an IAM role, someone who shouldn’t have access to those snapshots could get on the instance and delete or download them. If you’ve ever dealt with compliance, you understand why this is a problem.

Lambda fixes a lot of this, because it can run a Python function that takes the snapshot in response to an event, with no server to log in to. In this case we will use a Scheduled Event to run the function once an hour.
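As a rough sketch of the scheduling side, the hourly trigger could also be created from code rather than the console. This assumes boto3's CloudWatch Events client (`boto3.client("events")`); the helper name `schedule_hourly`, the rule name, and the function ARN are placeholders of mine, not something from the original setup.

```python
# Schedule expression for "minute 0 of every hour" (UTC),
# the equivalent of the crontab entry "0 * * * *".
HOURLY = "cron(0 * * * ? *)"


def schedule_hourly(events, rule_name, function_arn):
    """Create (or update) an hourly rule and point it at a Lambda function.

    `events` is expected to be boto3.client("events"); the rule name and
    function ARN are whatever you chose for your own setup (assumptions here).
    """
    rule = events.put_rule(
        Name=rule_name,
        ScheduleExpression=HOURLY,
        State="ENABLED",
    )
    events.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "1", "Arn": function_arn}],
    )
    return rule["RuleArn"]
```

When the rule is created this way instead of through the Lambda console, the function also needs a permission that allows `events.amazonaws.com` to invoke it; the console adds that for you.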

Python Lambda functions don’t have access to a lot of the modules you might normally use. boto v2 isn’t available, so you have to use boto3.

Here is the script.

The script looks for an instance named web-platform-slave and takes a snapshot of it. Then it pulls all snapshots for that instance and deletes anything older than 1 week.
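For illustration, here is a minimal sketch of a handler that follows the steps described above. It is not necessarily identical to the script I used: the snapshot naming scheme, the helper names `snapshot_id` and `expired`, and the optional `rds` argument (there so the logic can be tried without AWS credentials) are assumptions.

```python
import datetime as dt

INSTANCE_ID = "web-platform-slave"   # instance name from the post
RETENTION = dt.timedelta(weeks=1)    # delete snapshots older than this


def snapshot_id(instance_id, now):
    """Build a per-run snapshot name (naming scheme is an assumption)."""
    return "{}-{}".format(instance_id, now.strftime("%Y-%m-%d-%H-%M"))


def expired(snapshots, now, retention=RETENTION):
    """Return identifiers of finished snapshots created before now - retention."""
    cutoff = now - retention
    return [
        s["DBSnapshotIdentifier"]
        for s in snapshots
        # SnapshotCreateTime is missing while a snapshot is still being created
        if "SnapshotCreateTime" in s and s["SnapshotCreateTime"] < cutoff
    ]


def lambda_handler(event, context, rds=None):
    if rds is None:
        import boto3  # available in the Lambda Python runtime
        rds = boto3.client("rds")
    now = dt.datetime.now(dt.timezone.utc)

    # Step 1: take a new snapshot of the instance.
    rds.create_db_snapshot(
        DBSnapshotIdentifier=snapshot_id(INSTANCE_ID, now),
        DBInstanceIdentifier=INSTANCE_ID,
    )

    # Step 2: list the manual snapshots for the instance
    # (automated backups cannot be removed with delete_db_snapshot).
    paginator = rds.get_paginator("describe_db_snapshots")
    pages = paginator.paginate(
        DBInstanceIdentifier=INSTANCE_ID, SnapshotType="manual"
    )
    snapshots = [s for page in pages for s in page["DBSnapshots"]]

    # Step 3: delete everything older than the retention window.
    deleted = expired(snapshots, now)
    for identifier in deleted:
        rds.delete_db_snapshot(DBSnapshotIdentifier=identifier)
    return {"deleted": deleted}
```

Note that this removes every manual snapshot of the instance older than a week, including ones taken by hand, so keep long-lived snapshots under a different instance or copy them first. The function's execution role needs the `rds:CreateDBSnapshot`, `rds:DescribeDBSnapshots` and `rds:DeleteDBSnapshot` actions, and nothing on the salt master needs them anymore.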