Solved: Automating AWS EC2 Snapshots with Lambda & CloudWatch Events
These articles are AI-generated summaries. Please check the original sources for full details.
Manual AWS EC2 snapshot management is prone to errors and costly, potentially leading to significant data loss and operational disruption. This tutorial provides a robust, cost-effective solution to automate EC2 snapshot creation using AWS Lambda and CloudWatch Events, ensuring critical data backup without manual overhead.
Why This Matters
Backup policies that depend on manual steps tend to be applied inconsistently: snapshots get skipped, forgotten, or taken at irregular intervals. When a volume fails without a recent snapshot, the cost can range from hours of recovery time to significant lost revenue, depending on the impacted system. Automating the process removes the manual step and keeps backups consistent and reliable.
Key Insights
- A dedicated IAM role for the Lambda function requires ec2:DescribeInstances, ec2:DescribeVolumes, ec2:CreateSnapshot, ec2:CreateTags, and logs:PutLogEvents permissions.
- The Python Lambda function, using boto3, iterates through running EC2 instances, identifies EBS volumes, and creates snapshots with descriptive tags.
- Amazon CloudWatch Events (EventBridge) are configured with a cron schedule to trigger the Lambda function periodically, automating the snapshot process.
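The permissions in the first insight can be written down as an IAM policy document. The following is a minimal sketch, not the original author's policy: the function name `build_snapshot_policy`, the `Sid` values, and the resource scopes are assumptions made for illustration. It also adds `logs:CreateLogGroup` and `logs:CreateLogStream`, which are not in the list above but which a Lambda function normally needs before it can write log events.

```python
import json

# EC2 actions named in the article.
EC2_ACTIONS = [
    "ec2:DescribeInstances",
    "ec2:DescribeVolumes",
    "ec2:CreateSnapshot",
    "ec2:CreateTags",
]

# logs:PutLogEvents comes from the article; the two Create* actions are an
# addition (assumption) so the function can create its own log group and stream.
LOG_ACTIONS = [
    "logs:CreateLogGroup",
    "logs:CreateLogStream",
    "logs:PutLogEvents",
]


def build_snapshot_policy():
    """Return an IAM policy document (as a dict) for the snapshot Lambda role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "SnapshotEc2Volumes",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": EC2_ACTIONS,
                # Describe* calls need "*"; the write actions could be narrowed.
                "Resource": "*",
            },
            {
                "Sid": "WriteLambdaLogs",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": LOG_ACTIONS,
                "Resource": "arn:aws:logs:*:*:*",
            },
        ],
    }


if __name__ == "__main__":
    # Print JSON that can be pasted into the IAM console or passed to a CLI/SDK call.
    print(json.dumps(build_snapshot_policy(), indent=2))
```

Keeping the policy as data makes it easy to review in version control and to tighten the `Resource` entries later.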
Working Example
import datetime
import os

import boto3


def lambda_handler(event, context):
    # Lambda sets AWS_REGION itself; the fallback only matters when run elsewhere.
    ec2 = boto3.client('ec2', region_name=os.environ.get('AWS_REGION', 'us-east-1'))
    # Take one UTC timestamp so every snapshot from this run carries the same label.
    now = datetime.datetime.now(datetime.timezone.utc)
    try:
        # Paginate so accounts with many running instances are fully covered.
        paginator = ec2.get_paginator('describe_instances')
        pages = paginator.paginate(
            Filters=[
                {'Name': 'instance-state-name', 'Values': ['running']}
            ]
        )
        for page in pages:
            for reservation in page['Reservations']:
                for instance in reservation['Instances']:
                    instance_id = instance['InstanceId']
                    # Use the Name tag when present, otherwise a placeholder.
                    instance_name = 'No-Name'
                    for tag in instance.get('Tags', []):
                        if tag['Key'] == 'Name':
                            instance_name = tag['Value']
                            break
                    print(f"Processing instance: {instance_id} ({instance_name})")
                    for block_device_mapping in instance.get('BlockDeviceMappings', []):
                        # Only EBS-backed devices can be snapshotted.
                        if 'Ebs' not in block_device_mapping:
                            continue
                        volume_id = block_device_mapping['Ebs']['VolumeId']
                        description = (
                            f"Automated snapshot of {volume_id} "
                            f"attached to {instance_id} ({instance_name}) "
                            f"created on {now.strftime('%Y-%m-%d %H:%M:%S')} UTC."
                        )
                        print(f"Creating snapshot for volume: {volume_id}")
                        snapshot = ec2.create_snapshot(
                            VolumeId=volume_id,
                            Description=description,
                            TagSpecifications=[
                                {
                                    'ResourceType': 'snapshot',
                                    'Tags': [
                                        {'Key': 'CreatedBy', 'Value': 'Lambda'},
                                        {'Key': 'Automation', 'Value': 'EC2SnapshotTool'},
                                        {'Key': 'InstanceId', 'Value': instance_id},
                                        {'Key': 'InstanceName', 'Value': instance_name},
                                        {'Key': 'VolumeId', 'Value': volume_id},
                                        {'Key': 'Name', 'Value': f"{instance_name}-{volume_id}-snapshot-{now.strftime('%Y%m%d%H%M')}"}
                                    ]
                                }
                            ]
                        )
                        print(f"Snapshot created: {snapshot['SnapshotId']}")
    except Exception as e:
        print(f"Error creating snapshots: {e}")
        # Re-raise so the invocation is recorded as failed.
        raise
    return {
        'statusCode': 200,
        'body': 'EC2 snapshots created successfully!'
    }
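The handler still needs a trigger. The article configures the cron schedule in CloudWatch Events (EventBridge) but does not show the calls, so the following is a sketch under stated assumptions: the helper name `schedule_snapshot_lambda`, the rule name, the target ID, and the `cron(0 2 * * ? *)` expression (02:00 UTC daily) are all illustrative choices. The clients are passed in as arguments; in practice they would be `boto3.client('events')` and `boto3.client('lambda')`.

```python
def schedule_snapshot_lambda(events_client, lambda_client, function_name, function_arn,
                             rule_name="daily-ec2-snapshots",   # hypothetical rule name
                             schedule="cron(0 2 * * ? *)"):     # assumed: 02:00 UTC daily
    """Create a scheduled rule and point it at the snapshot Lambda.

    events_client and lambda_client are expected to behave like the boto3
    'events' and 'lambda' clients.
    """
    # 1. Create (or update) the scheduled rule.
    rule = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=schedule,
        State="ENABLED",
        Description="Triggers the EC2 snapshot Lambda on a schedule",
    )
    rule_arn = rule["RuleArn"]

    # 2. Allow the rule to invoke the function.
    lambda_client.add_permission(
        FunctionName=function_name,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )

    # 3. Register the function as the rule's target.
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "snapshot-lambda", "Arn": function_arn}],
    )
    return rule_arn
```

Note that `add_permission` rejects a `StatementId` that already exists on the function, so a second run of this helper would need to handle that error or use a different ID.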
Practical Applications
- TechResolve: Automates daily snapshots of production EC2 instances to ensure rapid recovery in case of failure.
- Pitfall: Relying on the default Lambda timeout (3 seconds) can cut the function off before it has processed every volume when there are many of them, leaving some volumes without a snapshot and the backup set inconsistent.
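Besides raising the function's timeout (Lambda allows up to 900 seconds), the handler can watch its own deadline. The sketch below is one possible guard, not part of the original function: `snapshot_within_deadline`, its `create_snapshot` callback, and the 10-second safety margin are assumptions for illustration. It relies on `context.get_remaining_time_in_millis()`, which the Lambda Python runtime provides on the context object.

```python
def snapshot_within_deadline(volume_ids, create_snapshot, context, safety_margin_ms=10_000):
    """Snapshot volumes until the invocation is close to timing out.

    create_snapshot is a callable taking a volume ID and returning a snapshot ID
    (for example, a thin wrapper around ec2.create_snapshot).
    Returns (created_snapshot_ids, skipped_volume_ids).
    """
    created, skipped = [], []
    for volume_id in volume_ids:
        # Stop starting new snapshots once less than the safety margin remains.
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            skipped.append(volume_id)
            continue
        created.append(create_snapshot(volume_id))
    return created, skipped
```

Logging or returning the skipped volume IDs makes an incomplete run visible, so it can be retried or alerted on instead of silently leaving gaps.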