
Clean Up Unused AWS Resources — EBS Volumes, EIPs, Old AMIs with Cost Report

Python script to find and delete unattached EBS volumes, unassociated Elastic IPs, and old AMIs, then produce a JSON cost-savings report.

January 20, 2025 · 15 min read · ~20 min to complete · DB
The Situation

Cost governance: idle resources accumulate silently and keep billing. An unattached 100 GB gp2 volume costs about $10/month (about $8 on gp3), so 50 forgotten volumes waste roughly $500/month. This script finds and removes them safely, with a dry-run mode that is on by default.

6 Steps
2 Services Used
~20 min Duration
Intermediate Difficulty

Resource Cost Overview

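| Resource | Approx. Cost | When it wastes money |
| --- | --- | --- |
| EBS gp3 volume | $0.08/GB/month | When not attached to any instance |
| Elastic IP | $0.005/hour (~$3.60/month) | When allocated but not associated with a running instance (since February 2024 AWS bills in-use public IPv4 addresses at the same rate) |
| AMI snapshot | $0.05/GB/month | When older than N generations (usually keep last 3) |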
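
The arithmetic behind the table can be sketched in a few lines. The rates are the approximate figures from the table (real prices vary by region), and `monthly_waste` is a hypothetical helper name used only for illustration, not part of the script below.

```python
# Approximate rates copied from the table above; real prices vary by region.
EBS_GP3_PER_GB_MONTH = 0.08
EIP_PER_HOUR = 0.005
SNAPSHOT_PER_GB_MONTH = 0.05


def monthly_waste(idle_gp3_gb: int, idle_eips: int, old_snapshot_gb: int) -> float:
    """Estimate monthly spend on idle resources, assuming a 30-day month."""
    ebs = idle_gp3_gb * EBS_GP3_PER_GB_MONTH
    eip = idle_eips * EIP_PER_HOUR * 24 * 30
    snapshots = old_snapshot_gb * SNAPSHOT_PER_GB_MONTH
    return round(ebs + eip + snapshots, 2)


# Example inventory: 50 idle 100 GB gp3 volumes, 4 idle EIPs,
# and 600 GB of snapshots behind outdated AMIs.
total = monthly_waste(idle_gp3_gb=50 * 100, idle_eips=4, old_snapshot_gb=600)
print(f"${total:.2f}/month")
```

In this made-up inventory the volumes dominate the bill, which is typical: storage is charged per GB, while an idle address costs only a few dollars a month.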

Complete Script

"""
aws_resource_cleanup.py

Find and optionally delete unused AWS resources to reduce costs.
Run with DRY_RUN=true (default) to see what would be deleted first.
"""

import boto3
import json
import logging
import os
import sys
from datetime import datetime, timezone
from dataclasses import dataclass, field
from collections import defaultdict

logger = logging.getLogger(__name__)
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s"
)


@dataclass
class CleanupReport:
    """Accumulates findings and savings estimates across all cleanup operations."""
    # Lists of resources found/deleted
    ebs_volumes:  list = field(default_factory=list)
    eips:         list = field(default_factory=list)
    amis:         list = field(default_factory=list)

    # Cost estimates
    ebs_savings_per_month:  float = 0.0
    eip_savings_per_month:  float = 0.0
    ami_savings_per_month:  float = 0.0

    @property
    def total_savings_per_month(self) -> float:
        return self.ebs_savings_per_month + self.eip_savings_per_month + self.ami_savings_per_month


class AWSResourceCleaner:
    """
    Finds and removes unused AWS resources in a single region.

    All destructive operations are gated by dry_run=True.
    Always run with dry_run=True first to review what will be deleted.
    """

    # Rough pricing — varies by region; adjust if needed
    EBS_PRICE_PER_GB_MONTH = {
        "gp2": 0.10,
        "gp3": 0.08,
        "io1": 0.125,
        "io2": 0.125,
        "st1": 0.045,
        "sc1": 0.015,
        "standard": 0.05,
    }
    EIP_PRICE_PER_HOUR      = 0.005   # When not associated
    SNAPSHOT_PRICE_PER_GB   = 0.05    # Per GB per month

    def __init__(self, region: str = "us-east-1", dry_run: bool = True):
        """
        region:   AWS region to audit (one region at a time).
        dry_run:  When True, reports findings but does NOT delete anything.
                  Set to False only after reviewing the dry-run output.
        """
        self.ec2     = boto3.client("ec2", region_name=region)
        self.region  = region
        self.dry_run = dry_run
        self.report  = CleanupReport()

        if dry_run:
            logger.info("Running in DRY-RUN mode — no resources will be deleted")
        else:
            logger.warning("Running in LIVE mode — resources WILL be deleted!")

    # ══════════════════════════════════════════════════════════════════
    # Part 1: Unattached EBS Volumes
    # ══════════════════════════════════════════════════════════════════
    def clean_unattached_ebs_volumes(self) -> list[dict]:
        """
        Find EBS volumes in 'available' state — these are not attached
        to any instance and accumulating charges.

        Volume lifecycle states:
          creating  → available (not attached) → in-use (attached) → deleting → deleted
          also: error, recovering

        We paginate describe_volumes with Filters so that only
        'available' volumes are returned — no need to filter client-side.

        For each volume we estimate the monthly waste and log it.
        """
        logger.info(f"[EBS] Scanning for unattached volumes in {self.region}")
        found = []

        paginator = self.ec2.get_paginator("describe_volumes")
        for page in paginator.paginate(
            Filters=[{"Name": "status", "Values": ["available"]}]
        ):
            for vol in page["Volumes"]:
                volume_id   = vol["VolumeId"]
                size_gb     = vol["Size"]
                vol_type    = vol["VolumeType"]
                create_time = vol["CreateTime"]
                tags        = {t["Key"]: t["Value"] for t in vol.get("Tags", [])}
                name        = tags.get("Name", "")

                # Calculate days idle (how long since creation with no attachment)
                # Note: real idle time requires checking attachment history, which
                # is not in the EC2 API. We use creation time as a proxy.
                days_old = (datetime.now(timezone.utc) - create_time).days

                # Estimate monthly cost
                price_per_gb = self.EBS_PRICE_PER_GB_MONTH.get(vol_type, 0.08)
                monthly_cost = size_gb * price_per_gb

                entry = {
                    "volume_id":    volume_id,
                    "name":         name,
                    "size_gb":      size_gb,
                    "type":         vol_type,
                    "days_old":     days_old,
                    "monthly_cost": round(monthly_cost, 2),
                    "deleted":      False,
                }

                logger.info(
                    f"[EBS] Found unattached: {volume_id} ({name}) "
                    f"{size_gb} GB {vol_type} — ${monthly_cost:.2f}/mo"
                )

                if not self.dry_run:
                    try:
                        # delete_volume() permanently destroys the EBS volume.
                        # This is IRREVERSIBLE — data is gone.
                        # The volume must be in 'available' state (not attached).
                        self.ec2.delete_volume(VolumeId=volume_id)
                        entry["deleted"] = True
                        logger.info(f"[EBS] Deleted: {volume_id}")
                        self.report.ebs_savings_per_month += monthly_cost
                    except self.ec2.exceptions.ClientError as e:
                        logger.error(f"[EBS] Failed to delete {volume_id}: {e}")
                else:
                    # In dry-run mode, still accumulate savings estimate
                    self.report.ebs_savings_per_month += monthly_cost

                found.append(entry)

        self.report.ebs_volumes = found
        logger.info(
            f"[EBS] Found {len(found)} unattached volumes "
            f"(${self.report.ebs_savings_per_month:.2f}/mo potential savings)"
        )
        return found

    # ══════════════════════════════════════════════════════════════════
    # Part 2: Unassociated Elastic IPs
    # ══════════════════════════════════════════════════════════════════
    def clean_unassociated_eips(self) -> list[dict]:
        """
        Find Elastic IPs not associated with any running instance or
        network interface. AWS charges $0.005/hour for idle EIPs.

        describe_addresses() returns ALL EIPs in the region.
        An EIP is unassociated if it has no AssociationId field.

        Two allocation domains:
          vpc:      EIP allocated for use in a VPC (AllocationId exists)
          standard: Legacy EC2-Classic (almost extinct, treat same way)

        release_address() returns the EIP to the AWS pool.
        You can no longer use that specific IP after this call.
        """
        logger.info(f"[EIP] Scanning for unassociated Elastic IPs in {self.region}")
        found = []

        # describe_addresses() is NOT paginated — returns all at once
        response = self.ec2.describe_addresses()

        for addr in response["Addresses"]:
            # Skip EIPs that are associated (in use)
            if "AssociationId" in addr:
                continue

            allocation_id = addr.get("AllocationId", "")
            public_ip     = addr["PublicIp"]
            tags          = {t["Key"]: t["Value"] for t in addr.get("Tags", [])}
            name          = tags.get("Name", "")

            # Cost: $0.005/hour × 24h × 30.5 days ≈ $3.66/month
            monthly_cost  = self.EIP_PRICE_PER_HOUR * 24 * 30.5

            entry = {
                "allocation_id": allocation_id,
                "public_ip":     public_ip,
                "name":          name,
                "monthly_cost":  round(monthly_cost, 2),
                "released":      False,
            }

            logger.info(
                f"[EIP] Found idle: {public_ip} ({name}) "
                f"— ${monthly_cost:.2f}/mo"
            )

            if not self.dry_run and allocation_id:
                try:
                    # release_address() returns this EIP to the AWS pool.
                    # Use AllocationId for VPC EIPs (not PublicIp).
                    self.ec2.release_address(AllocationId=allocation_id)
                    entry["released"] = True
                    logger.info(f"[EIP] Released: {public_ip} ({allocation_id})")
                    self.report.eip_savings_per_month += monthly_cost
                except self.ec2.exceptions.ClientError as e:
                    logger.error(f"[EIP] Failed to release {public_ip}: {e}")
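            elif self.dry_run:
                # Count the estimate only in dry-run mode. In live mode an
                # address without an AllocationId is skipped, so it must not
                # be reported as saved.
                self.report.eip_savings_per_month += monthly_cost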

            found.append(entry)

        self.report.eips = found
        logger.info(
            f"[EIP] Found {len(found)} unassociated EIPs "
            f"(${self.report.eip_savings_per_month:.2f}/mo potential savings)"
        )
        return found

    # ══════════════════════════════════════════════════════════════════
    # Part 3: Old AMIs — Keep Only N Most Recent Per Name Prefix
    # ══════════════════════════════════════════════════════════════════
    def clean_old_amis(self, keep_count: int = 3, name_prefix: str = "") -> list[dict]:
        """
        Delete AMIs older than the N most recent for each name prefix group.

        Strategy:
          - Group AMIs by name prefix (e.g., "app-server-*")
          - Sort each group by creation date, newest first
          - Keep the first `keep_count` AMIs
          - Deregister the rest, then delete their backing snapshots

        Two-step deletion:
          1. deregister_image(ImageId)   — removes the AMI registration
             (cannot launch new instances from it, but snapshots still exist)
          2. delete_snapshot(SnapshotId) — deletes the actual EBS snapshot
             (this is what actually frees storage and stops billing)

        Skipping step 2 leaves orphaned snapshots that continue to bill you
        even after the AMI is deregistered.

        OwnerIds=["self"] limits results to AMIs owned by THIS account.
        Without this, describe_images() could return AWS marketplace AMIs.
        """
        logger.info(
            f"[AMI] Scanning for old AMIs in {self.region} "
            f"(keep latest {keep_count} per name prefix)"
        )
        found = []

        # Fetch all AMIs owned by this account
        filters = [{"Name": "state", "Values": ["available"]}]
        if name_prefix:
            filters.append({"Name": "name", "Values": [f"{name_prefix}*"]})

        response = self.ec2.describe_images(
            OwnerIds=["self"],
            Filters=filters,
        )
        all_amis = response["Images"]

        # Group AMIs by the first two dash-separated parts of the name
        # e.g., "app-server-20250120" → group key "app-server"
        groups: dict[str, list] = defaultdict(list)
        for ami in all_amis:
            ami_name  = ami.get("Name", "")
            # Normalise underscores to dashes, then split on dashes only
            # (spaces and other separators are not handled)
            parts = ami_name.replace("_", "-").split("-")
            # Use first 2 dash-parts as the grouping key (customise as needed)
            group_key = "-".join(parts[:2]) if len(parts) >= 2 else ami_name
            groups[group_key].append(ami)

        for group_key, amis_in_group in groups.items():
            # Sort newest first by creation date
            # CreationDate format: "2025-01-20T14:30:00.000Z" — lexicographic sort works
            amis_sorted = sorted(
                amis_in_group, key=lambda a: a["CreationDate"], reverse=True
            )
            to_delete = amis_sorted[keep_count:]   # Everything after the N most recent

            for ami in to_delete:
                image_id      = ami["ImageId"]
                ami_name      = ami.get("Name", "")
                creation_date = ami["CreationDate"]

                # Collect snapshot IDs from the AMI's block device mappings
                # Each AMI has one or more EBS snapshots backing its volumes
                snapshot_ids = [
                    bdm["Ebs"]["SnapshotId"]
                    for bdm in ami.get("BlockDeviceMappings", [])
                    if "Ebs" in bdm and "SnapshotId" in bdm["Ebs"]
                ]

                # Estimate snapshot storage size
                total_size_gb = sum(
                    bdm["Ebs"].get("VolumeSize", 0)
                    for bdm in ami.get("BlockDeviceMappings", [])
                    if "Ebs" in bdm
                )
                monthly_cost = total_size_gb * self.SNAPSHOT_PRICE_PER_GB

                entry = {
                    "image_id":     image_id,
                    "name":         ami_name,
                    "group":        group_key,
                    "created":      creation_date,
                    "snapshots":    snapshot_ids,
                    "size_gb":      total_size_gb,
                    "monthly_cost": round(monthly_cost, 2),
                    "deregistered": False,
                    "snapshots_deleted": [],
                }

                logger.info(
                    f"[AMI] Old AMI: {image_id} ({ami_name}) "
                    f"created {creation_date[:10]} "
                    f"— ${monthly_cost:.2f}/mo in snapshots"
                )

                if not self.dry_run:
                    try:
                        # Step 1: Deregister the AMI
                        # After this, you cannot launch new instances from this AMI.
                        # Existing running instances are NOT affected.
                        self.ec2.deregister_image(ImageId=image_id)
                        entry["deregistered"] = True
                        logger.info(f"[AMI] Deregistered: {image_id}")

                        # Step 2: Delete each backing snapshot
                        for snap_id in snapshot_ids:
                            try:
                                self.ec2.delete_snapshot(SnapshotId=snap_id)
                                entry["snapshots_deleted"].append(snap_id)
                                logger.info(f"[AMI] Deleted snapshot: {snap_id}")
                            except self.ec2.exceptions.ClientError as e:
                                logger.warning(
                                    f"[AMI] Could not delete snapshot {snap_id}: {e}"
                                )

                        self.report.ami_savings_per_month += monthly_cost

                    except self.ec2.exceptions.ClientError as e:
                        logger.error(f"[AMI] Failed to deregister {image_id}: {e}")
                else:
                    self.report.ami_savings_per_month += monthly_cost

                found.append(entry)

        self.report.amis = found
        logger.info(
            f"[AMI] Found {len(found)} old AMIs "
            f"(${self.report.ami_savings_per_month:.2f}/mo potential savings)"
        )
        return found

    # ══════════════════════════════════════════════════════════════════
    # Part 4: Generate Cost-Savings Report
    # ══════════════════════════════════════════════════════════════════
    def generate_report(self, output_file: str = "cleanup_report.json") -> dict:
        """
        Write a JSON cost-savings report with all findings.
        The report is machine-readable (can be sent to Slack, stored in S3,
        or imported into a spreadsheet).
        """
        report_data = {
            "generated_at":  datetime.now(timezone.utc).isoformat(),
            "region":        self.region,
            "dry_run":       self.dry_run,
            "summary": {
                "ebs_volumes_found":     len(self.report.ebs_volumes),
                "eips_found":            len(self.report.eips),
                "amis_found":            len(self.report.amis),
                "ebs_savings_per_month": round(self.report.ebs_savings_per_month, 2),
                "eip_savings_per_month": round(self.report.eip_savings_per_month, 2),
                "ami_savings_per_month": round(self.report.ami_savings_per_month, 2),
                "total_savings_per_month": round(self.report.total_savings_per_month, 2),
                "total_savings_per_year":  round(self.report.total_savings_per_month * 12, 2),
            },
            "ebs_volumes": self.report.ebs_volumes,
            "eips":        self.report.eips,
            "amis":        self.report.amis,
        }

        # json.dump with default=str handles datetime objects
        with open(output_file, "w") as f:
            json.dump(report_data, f, indent=2, default=str)

        self._print_summary(report_data["summary"])
        logger.info(f"Full report written to {output_file}")
        return report_data

    def _print_summary(self, summary: dict) -> None:
        """Print a human-readable cost summary to console."""
        mode = "DRY-RUN estimate" if self.dry_run else "Actual savings"
        print(f"\n{'='*60}")
        print(f"AWS RESOURCE CLEANUP REPORT — {self.region.upper()}")
        print(f"Mode: {mode}")
        print(f"{'='*60}")
        print(f"  Unattached EBS volumes: {summary['ebs_volumes_found']:>4}  ${summary['ebs_savings_per_month']:>8.2f}/mo")
        print(f"  Unassociated EIPs:      {summary['eips_found']:>4}  ${summary['eip_savings_per_month']:>8.2f}/mo")
        print(f"  Old AMIs:               {summary['amis_found']:>4}  ${summary['ami_savings_per_month']:>8.2f}/mo")
        print(f"{'─'*60}")
        print(f"  TOTAL POTENTIAL SAVINGS:      ${summary['total_savings_per_month']:>8.2f}/mo")
        print(f"  ANNUALIZED:                   ${summary['total_savings_per_year']:>8.2f}/yr")
        print(f"{'='*60}\n")


# ── Entry point ───────────────────────────────────────────────────
def main() -> int:
    region  = os.environ.get("AWS_DEFAULT_REGION", "us-east-1")
    dry_run = os.environ.get("DRY_RUN", "true").lower() != "false"

    cleaner = AWSResourceCleaner(region=region, dry_run=dry_run)

    # Run all three cleanup operations
    cleaner.clean_unattached_ebs_volumes()
    cleaner.clean_unassociated_eips()
    cleaner.clean_old_amis(
        keep_count=3,
        name_prefix=os.environ.get("AMI_PREFIX", ""),  # e.g., "app-server"
    )

    # Write the consolidated report
    cleaner.generate_report("cleanup_report.json")

    # Exit 0 in both modes: finding or deleting idle resources is not a failure
    return 0


if __name__ == "__main__":
    sys.exit(main())

Safe Execution Workflow

# Step 1: Always dry-run first — see what WOULD be deleted
DRY_RUN=true AWS_DEFAULT_REGION=ap-south-1 python aws_resource_cleanup.py

# Step 2: Review cleanup_report.json
python -m json.tool cleanup_report.json | grep -E '"name"|"size_gb"|"monthly_cost"'

# Step 3: If the report looks correct, run for real
DRY_RUN=false AWS_DEFAULT_REGION=ap-south-1 python aws_resource_cleanup.py

# Limit AMI cleanup to a specific name prefix
AMI_PREFIX=app-server DRY_RUN=true python aws_resource_cleanup.py
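
For step 2, a short Python helper can rank the findings by cost instead of grepping the raw JSON. This is a sketch that assumes the report layout produced by generate_report() above; `costliest` and the sample data are hypothetical and exist only for this illustration.

```python
def costliest(report: dict, n: int = 5) -> list:
    """Return the n most expensive findings as (identifier, monthly_cost)."""
    rows = []
    for vol in report.get("ebs_volumes", []):
        rows.append((vol["volume_id"], vol["monthly_cost"]))
    for eip in report.get("eips", []):
        rows.append((eip["public_ip"], eip["monthly_cost"]))
    for ami in report.get("amis", []):
        rows.append((ami["image_id"], ami["monthly_cost"]))
    return sorted(rows, key=lambda row: row[1], reverse=True)[:n]


# Made-up sample in the same shape as cleanup_report.json
sample_report = {
    "ebs_volumes": [
        {"volume_id": "vol-0111", "monthly_cost": 8.0},
        {"volume_id": "vol-0222", "monthly_cost": 40.0},
    ],
    "eips": [{"public_ip": "203.0.113.11", "monthly_cost": 3.66}],
    "amis": [{"image_id": "ami-0abc", "monthly_cost": 0.4}],
}

for identifier, cost in costliest(sample_report, n=3):
    print(f"{identifier:<16} ${cost:>7.2f}/mo")
```

To run it against a real dry-run result, load the file with json.load() and pass the resulting dict to the helper.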

Key Commands Explained

| Command | What it does |
| --- | --- |
| describe_volumes(Filters=[{"Name":"status","Values":["available"]}]) | Lists only unattached EBS volumes. Server-side filter, no extra client logic |
| get_paginator("describe_volumes") | Handles pagination for accounts with many volumes |
| delete_volume(VolumeId=id) | Permanently destroys the EBS volume. Irreversible |
| describe_addresses() | Returns all Elastic IPs (not paginated, returns all at once) |
| "AssociationId" in addr | True means the EIP is in use; False means it is idle and billing |
| release_address(AllocationId=id) | Returns the EIP to the AWS pool. You lose that IP permanently |
| describe_images(OwnerIds=["self"]) | Lists only AMIs you own (not public or marketplace AMIs) |
| deregister_image(ImageId=id) | Removes the AMI. You cannot launch from it, but its snapshots still exist |
| delete_snapshot(SnapshotId=id) | Actually frees the storage and stops billing |
| json.dump(..., default=str) | Serialises values JSON cannot handle natively (such as datetime objects) by calling str() on them |

Common Gotchas

EBS volumes with snapshots: delete_volume() does NOT delete snapshots of that volume. Snapshots outlive the volume and keep billing. To find them, call describe_snapshots(OwnerIds=["self"]) with the volume-id filter, or compare each snapshot's VolumeId field against the volumes that still exist.

AMI deregister before snapshot delete: Deregister the AMI before deleting its snapshots. Deleting a snapshot that still backs a registered AMI raises InvalidSnapshot.InUse.

EIP in EC2-Classic domain: Very old accounts may have EIPs with Domain=standard (not vpc). These are released with PublicIp=addr["PublicIp"] instead of AllocationId. The script as written skips them in live mode because they have no AllocationId.
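
Two of these gotchas can be sketched as pure helpers that work on the dictionaries boto3 returns. Both function names are hypothetical, the sample data is made up, and neither helper is part of the script above. Treat the snapshot check as a starting point only: a snapshot whose source volume is gone may still back a registered AMI.

```python
def orphaned_snapshots(snapshots: list, existing_volume_ids: set) -> list:
    """Return IDs of snapshots whose source volume no longer exists."""
    return [
        snap["SnapshotId"]
        for snap in snapshots
        if snap.get("VolumeId") not in existing_volume_ids
    ]


def release_kwargs(addr: dict) -> dict:
    """Pick the release_address() argument for one describe_addresses() entry."""
    if addr.get("AllocationId"):
        return {"AllocationId": addr["AllocationId"]}   # VPC address
    return {"PublicIp": addr["PublicIp"]}               # legacy EC2-Classic address


# Sample data shaped like describe_snapshots(OwnerIds=["self"])["Snapshots"]
snapshots = [
    {"SnapshotId": "snap-0aaa", "VolumeId": "vol-0111", "VolumeSize": 100},
    {"SnapshotId": "snap-0bbb", "VolumeId": "vol-0222", "VolumeSize": 50},
]
print(orphaned_snapshots(snapshots, existing_volume_ids={"vol-0111"}))

# Sample entries shaped like describe_addresses()["Addresses"]
vpc_addr = {"PublicIp": "203.0.113.11", "AllocationId": "eipalloc-0bbb", "Domain": "vpc"}
classic_addr = {"PublicIp": "198.51.100.7", "Domain": "standard"}
print(release_kwargs(vpc_addr))
print(release_kwargs(classic_addr))
```

With a real client the second helper would be used as ec2.release_address(**release_kwargs(addr)). AWS has since retired EC2-Classic, so the PublicIp branch should rarely be reached.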


🔍 Line-by-Line Code Walkthrough

Imports

| Line | Why It's Used |
| --- | --- |
| from dataclasses import dataclass, field | @dataclass auto-generates __init__ and __repr__. field(default_factory=list) safely initializes mutable list fields |
| from collections import defaultdict | defaultdict(list) auto-creates an empty list when a new key is first accessed. Used to group AMIs by name prefix |

CleanupReport Dataclass

@dataclass
class CleanupReport:
    ebs_volumes: list = field(default_factory=list)
    eips:        list = field(default_factory=list)
    amis:        list = field(default_factory=list)
    ebs_savings_per_month: float = 0.0
    eip_savings_per_month: float = 0.0
    ami_savings_per_month: float = 0.0

    @property
    def total_savings_per_month(self) -> float:
        return self.ebs_savings_per_month + self.eip_savings_per_month + self.ami_savings_per_month

| Line | Explanation |
| --- | --- |
| field(default_factory=list) | Mutable defaults (lists, dicts) cannot be written as ebs_volumes: list = [] in a dataclass. The decorator rejects that with a ValueError, because a single list would otherwise be shared across ALL instances. default_factory=list creates a new empty list per instance |
| ebs_savings_per_month: float = 0.0 | Immutable default (float) is safe to use directly. No field() wrapper needed |
| @property total_savings_per_month | Computed on read. Sums the three savings fields. No storage needed, because it is always derived from the three source fields |
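
The mutable-default rule is easy to check in isolation. The sketch below uses two throwaway classes (`Broken` and `Report`, both hypothetical) to show that the dataclass decorator refuses a bare list default, while default_factory gives each instance its own list.

```python
from dataclasses import dataclass, field

try:
    @dataclass
    class Broken:
        items: list = []          # rejected when the class is defined
except ValueError as exc:
    error = str(exc)
    print(f"ValueError: {error}")


@dataclass
class Report:
    items: list = field(default_factory=list)   # fresh list per instance


a, b = Report(), Report()
a.items.append("vol-0111")
print(a.items, b.items)   # only a's list was modified
```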

AWSResourceCleaner.__init__

def __init__(self, region: str = "us-east-1", dry_run: bool = True):
    self.ec2     = boto3.client("ec2", region_name=region)
    self.region  = region
    self.dry_run = dry_run
    self.report  = CleanupReport()
    if dry_run:
        logger.info("Running in DRY-RUN mode — no resources will be deleted")
    else:
        logger.warning("Running in LIVE mode — resources WILL be deleted!")

| Line | Explanation |
| --- | --- |
| dry_run: bool = True | Default is True, so deletion must be an explicit opt-in. Prevents accidents |
| boto3.client("ec2", ...) | All three cleanup operations (EBS, EIP, AMI) use the EC2 API, so one client covers all of them |
| self.report = CleanupReport() | Creates a fresh report object. As cleanup runs, results are accumulated into this object |
| logger.warning(...) for live mode | Uses WARNING (not INFO) for live mode, so it is visually distinct in log output and still appears if the logging level is raised to WARNING |
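
The opt-in rule also applies to how main() reads the DRY_RUN environment variable. The sketch below isolates that comparison in a hypothetical `parse_dry_run` helper (the script does this inline) to show that only the literal string "false", in any letter case, switches to live mode.

```python
def parse_dry_run(value):
    """Mirror the script's rule: only the literal 'false' disables dry-run."""
    if value is None:          # variable not set: fall back to the safe default
        value = "true"
    return value.lower() != "false"


for raw in (None, "true", "FALSE", "0", "no", ""):
    mode = "dry-run" if parse_dry_run(raw) else "LIVE"
    print(f"{raw!r:>8} -> {mode}")
```

Values such as "0" or "no" keep the script in dry-run mode. That fails safe, but it can surprise anyone who expects them to enable deletion.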

Part 1 — clean_unattached_ebs_volumes()

paginator = self.ec2.get_paginator("describe_volumes")
for page in paginator.paginate(
    Filters=[{"Name": "status", "Values": ["available"]}]
):
    for vol in page["Volumes"]:

| Line | Explanation |
| --- | --- |
| get_paginator("describe_volumes") | Returns a Paginator that automatically handles NextToken. Essential for accounts with hundreds of volumes |
| Filters=[{"Name": "status", "Values": ["available"]}] | Server-side filter: only volumes in "available" state are returned. "available" = not attached to any instance. "in-use" = attached. The filter runs on the AWS side, saving network transfer and CPU |
| for vol in page["Volumes"] | Each page has a "Volumes" key containing a list of volume dicts |

days_old = (datetime.now(timezone.utc) - create_time).days
price_per_gb = self.EBS_PRICE_PER_GB_MONTH.get(vol_type, 0.08)
monthly_cost = size_gb * price_per_gb

| Line | Explanation |
| --- | --- |
| datetime.now(timezone.utc) - create_time | Returns a timedelta. Both sides must be timezone-aware. vol["CreateTime"] is already UTC-aware (boto3 returns aware datetimes) |
| .days | Extracts only the integer number of days from the timedelta |
| self.EBS_PRICE_PER_GB_MONTH.get(vol_type, 0.08) | Dict lookup with default. If the volume type isn't in our price table (e.g., a future type), default to gp3 pricing |
| size_gb * price_per_gb | Simple multiplication. Not exact (AWS prorates storage charges and prices vary by region), but close enough for waste estimates |
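
The timezone requirement is worth seeing fail once. The sketch below uses fixed, made-up timestamps and a hypothetical `volume_age_days` helper: subtracting two aware datetimes works, while mixing in a naive one raises TypeError.

```python
from datetime import datetime, timezone


def volume_age_days(create_time: datetime, now: datetime) -> int:
    """Whole days between creation and now; the partial day is dropped."""
    return (now - create_time).days


created = datetime(2024, 12, 1, 9, 0, tzinfo=timezone.utc)   # like vol["CreateTime"]
now = datetime(2025, 1, 20, 8, 0, tzinfo=timezone.utc)
age = volume_age_days(created, now)
print(age)

try:
    # A naive datetime (no tzinfo) cannot be subtracted from an aware one
    volume_age_days(created, datetime(2025, 1, 20, 8, 0))
except TypeError as exc:
    mixed_error = str(exc)
    print(f"TypeError: {mixed_error}")
```

This is why the script calls datetime.now(timezone.utc) rather than a bare datetime.now().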
if not self.dry_run:
    self.ec2.delete_volume(VolumeId=volume_id)
    entry["deleted"] = True
    self.report.ebs_savings_per_month += monthly_cost
else:
    self.report.ebs_savings_per_month += monthly_cost

| Line | Explanation |
| --- | --- |
| self.ec2.delete_volume(VolumeId=volume_id) | Permanently destroys the EBS volume. The volume must be in "available" state; attached volumes raise VolumeInUse. This is irreversible |
| entry["deleted"] = True | Marks the entry so the report distinguishes dry-run (would delete) from actual deletions |
| Accumulate savings in both branches | Whether we deleted or just found, we accumulate the savings estimate. In dry-run, this is the potential savings; in live mode, this is actual savings |

Part 2 — clean_unassociated_eips()

response = self.ec2.describe_addresses()
for addr in response["Addresses"]:
    if "AssociationId" in addr:
        continue

| Line | Explanation |
| --- | --- |
| describe_addresses() | Returns all Elastic IPs in the region. Not paginated: it always returns all at once (AWS accounts typically have a small limit, like 5 per region by default) |
| response["Addresses"] | The list of EIP objects in the response |
| if "AssociationId" in addr: continue | EIPs associated with an instance or network interface have "AssociationId". If present, the EIP is in use, so skip it. If the EIP is idle, the key doesn't exist at all in the dict (it is not set to None) |
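
The key-presence test can be tried on a hand-written response without touching AWS. The sample below is made up but follows the shape of describe_addresses() output; `idle_addresses` is a hypothetical helper name.

```python
def idle_addresses(response: dict) -> list:
    """Keep only addresses that have no AssociationId key at all."""
    return [addr for addr in response["Addresses"] if "AssociationId" not in addr]


sample = {
    "Addresses": [
        {   # attached to a network interface, so it is in use
            "PublicIp": "203.0.113.10",
            "AllocationId": "eipalloc-0aaa",
            "AssociationId": "eipassoc-0111",
            "Domain": "vpc",
        },
        {   # allocated but never associated: the key is simply missing
            "PublicIp": "203.0.113.11",
            "AllocationId": "eipalloc-0bbb",
            "Domain": "vpc",
        },
    ]
}

idle = idle_addresses(sample)
monthly = len(idle) * 0.005 * 24 * 30.5   # same formula as the script
print([addr["PublicIp"] for addr in idle], f"${monthly:.2f}/mo")
```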
monthly_cost = self.EIP_PRICE_PER_HOUR * 24 * 30.5
self.ec2.release_address(AllocationId=allocation_id)

| Line | Explanation |
| --- | --- |
| EIP_PRICE_PER_HOUR * 24 * 30.5 | $0.005 × 24 hours × 30.5 days ≈ $3.66/month. Uses 30.5 as the average month length |
| release_address(AllocationId=...) | Returns this EIP to the AWS pool. You permanently lose that specific IP address. Use AllocationId for VPC EIPs (modern). Legacy EC2-Classic EIPs use PublicIp |
| if not self.dry_run and allocation_id: | Double check: only release if not dry-run AND the allocation ID exists (EC2-Classic IPs may lack one) |

Part 3 — clean_old_amis(keep_count, name_prefix)

response = self.ec2.describe_images(OwnerIds=["self"], Filters=filters)
all_amis = response["Images"]

| Line | Explanation |
| --- | --- |
| OwnerIds=["self"] | Critical: restricts results to AMIs owned by your account. Without this, describe_images() returns AWS marketplace AMIs and public AMIs, thousands of results you can't delete |
| "self" is a special alias | AWS resolves "self" to your current account ID |
| response["Images"] | The list of AMI objects. The script does not paginate this call; without MaxResults the images come back in one response |

groups: dict[str, list] = defaultdict(list)
for ami in all_amis:
    parts = ami_name.replace("_", "-").split("-")
    group_key = "-".join(parts[:2]) if len(parts) >= 2 else ami_name
    groups[group_key].append(ami)

| Line | Explanation |
| --- | --- |
| defaultdict(list) | When groups["app-server"] is accessed for the first time, it automatically creates []. Without this, you'd need groups.setdefault(key, []).append(ami) |
| ami_name.replace("_", "-") | Normalizes separators. AMI names like "app_server-20250120" and "app-server-20250120" become the same prefix |
| .split("-") and parts[:2] | Splits "app-server-20250120" into ["app", "server", "20250120"]. Taking [:2] gives ["app", "server"], joined as "app-server" |
| groups[group_key].append(ami) | Groups all AMIs with the same prefix into the same list for sorting and pruning |
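
The grouping rule is worth testing against your own naming scheme before a live run. The sketch below lifts the key computation into a hypothetical `group_key` function and feeds it made-up AMI names. It shows one limitation: a name with only one word before the date never shares a group with its siblings.

```python
from collections import defaultdict


def group_key(ami_name: str) -> str:
    """Same rule as the script: first two dash-separated parts of the name."""
    parts = ami_name.replace("_", "-").split("-")
    return "-".join(parts[:2]) if len(parts) >= 2 else ami_name


names = [
    "app-server-20250118",
    "app_server-20250119",   # underscore is normalised to a dash
    "app-server-20250120",
    "web-20250119",          # only one word before the date
    "web-20250120",
]

groups = defaultdict(list)
for name in names:
    groups[group_key(name)].append(name)

for key, members in groups.items():
    print(f"{key}: {members}")
```

Each "web-<date>" image ends up alone in its own group, so it always counts as one of the newest and is never selected for deletion. If your names look like that, the grouping key needs to be customised.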
amis_sorted = sorted(amis_in_group, key=lambda a: a["CreationDate"], reverse=True)
to_delete = amis_sorted[keep_count:]

| Line | Explanation |
| --- | --- |
| key=lambda a: a["CreationDate"] | Sorts by the "CreationDate" string ("2025-01-20T14:30:00.000Z"). ISO 8601 dates sort correctly as strings (lexicographic order = chronological order) |
| reverse=True | Newest first. Index 0 is the most recent AMI |
| amis_sorted[keep_count:] | Python slice: skips the first keep_count items (the ones we keep) and returns the rest (the ones to delete) |
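
The sort-and-slice step can be exercised on sample data. `amis_to_delete` is a hypothetical wrapper around the two lines above; the guard against a negative keep_count is an addition for this sketch and is not in the original script, where a negative value would silently change what the slice returns.

```python
def amis_to_delete(amis: list, keep_count: int = 3) -> list:
    """Return every AMI except the keep_count newest ones."""
    if keep_count < 0:
        # Added for this sketch: a negative slice start would select the wrong items
        raise ValueError("keep_count must be zero or positive")
    newest_first = sorted(amis, key=lambda a: a["CreationDate"], reverse=True)
    return newest_first[keep_count:]


# Made-up images in one name group, deliberately out of order
amis = [
    {"ImageId": "ami-03", "CreationDate": "2025-01-18T10:00:00.000Z"},
    {"ImageId": "ami-01", "CreationDate": "2025-01-16T10:00:00.000Z"},
    {"ImageId": "ami-05", "CreationDate": "2025-01-20T10:00:00.000Z"},
    {"ImageId": "ami-02", "CreationDate": "2025-01-17T10:00:00.000Z"},
    {"ImageId": "ami-04", "CreationDate": "2025-01-19T10:00:00.000Z"},
]

doomed = [ami["ImageId"] for ami in amis_to_delete(amis, keep_count=3)]
print(doomed)   # the two oldest images
```

Note that keep_count=0 selects every image in the group, so treat that value with the same care as the live-mode switch.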
snapshot_ids = [
    bdm["Ebs"]["SnapshotId"]
    for bdm in ami.get("BlockDeviceMappings", [])
    if "Ebs" in bdm and "SnapshotId" in bdm["Ebs"]
]
self.ec2.deregister_image(ImageId=image_id)
for snap_id in snapshot_ids:
    self.ec2.delete_snapshot(SnapshotId=snap_id)

| Line | Explanation |
| --- | --- |
| ami.get("BlockDeviceMappings", []) | List of block device mappings, one for each volume the AMI includes. The [] default prevents errors on AMIs with no mappings |
| if "Ebs" in bdm and "SnapshotId" in bdm["Ebs"] | Some mappings are ephemeral (instance store), not EBS, and have no "Ebs" key. The double check guards against KeyError |
| deregister_image(ImageId=...) | Step 1: removes the AMI from the catalog. After this, no new instances can be launched from it. Running instances are not affected |
| delete_snapshot(SnapshotId=...) | Step 2: deletes the EBS snapshot. This is what actually frees storage and stops billing. You MUST deregister the AMI first, otherwise delete_snapshot raises InvalidSnapshot.InUse |
| Why two steps? | AMI registration and snapshot storage are separate resources. A snapshot can exist without any AMI (for example as a plain volume backup), so deregistering an image does not delete snapshot data for you |
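
The double guard in the list comprehension can be checked against a hand-built AMI record. The sample below is made up but follows the describe_images() shape; `backing_snapshots` is a hypothetical name for the comprehension used in the script.

```python
def backing_snapshots(ami: dict) -> list:
    """Collect snapshot IDs, skipping mappings that have no EBS snapshot."""
    return [
        bdm["Ebs"]["SnapshotId"]
        for bdm in ami.get("BlockDeviceMappings", [])
        if "Ebs" in bdm and "SnapshotId" in bdm["Ebs"]
    ]


ami = {
    "ImageId": "ami-0abc",
    "BlockDeviceMappings": [
        # Root volume backed by a snapshot
        {"DeviceName": "/dev/xvda", "Ebs": {"SnapshotId": "snap-0aaa", "VolumeSize": 8}},
        # Instance-store mapping: no "Ebs" key at all
        {"DeviceName": "/dev/sdb", "VirtualName": "ephemeral0"},
        # Empty EBS volume created at launch: "Ebs" present, but no snapshot
        {"DeviceName": "/dev/sdf", "Ebs": {"VolumeSize": 100}},
    ],
}

snapshot_ids = backing_snapshots(ami)
size_gb = sum(
    bdm["Ebs"].get("VolumeSize", 0)
    for bdm in ami["BlockDeviceMappings"]
    if "Ebs" in bdm
)
print(snapshot_ids, size_gb)
```

The size total uses the same rule as the script and counts the 100 GB volume even though no snapshot backs it. Since snapshots also store only the blocks that were written, the reported monthly cost is best read as an upper bound.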

generate_report(output_file) — JSON Output

report_data = {
    "generated_at": datetime.now(timezone.utc).isoformat(),
    ...
    "total_savings_per_year": round(self.report.total_savings_per_month * 12, 2),
}
with open(output_file, "w") as f:
    json.dump(report_data, f, indent=2, default=str)

| Line | Explanation |
| --- | --- |
| datetime.now(timezone.utc).isoformat() | Produces a string such as "2025-01-20T14:30:00.123456+00:00" (microseconds included), a standard ISO 8601 timestamp. It tells you exactly when this audit ran |
| round(..., 2) | Rounds to 2 decimal places (cents). Avoids floating-point noise like $3.6499999999 |
| total_savings_per_month * 12 | Annualizes the monthly estimate. Useful for presenting cost savings to management |
| json.dump(..., default=str) | default=str is a fallback serializer. When json.dump encounters a type it can't serialize (like datetime objects), it calls str() on the value instead of raising TypeError: Object of type datetime is not JSON serializable. Note that str() on a datetime puts a space, not "T", between date and time |
| indent=2 | Pretty-prints the JSON with 2-space indentation, human-readable for manual review |
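
The effect of default=str is easiest to see side by side with the alternatives. The sketch below uses a fixed, made-up timestamp and json.dumps (the string-returning sibling of json.dump) so the output can be printed directly.

```python
import json
from datetime import datetime, timezone

stamp = datetime(2025, 1, 20, 14, 30, tzinfo=timezone.utc)

try:
    json.dumps({"created": stamp})              # no fallback: datetime is rejected
except TypeError as exc:
    plain_error = str(exc)
    print(f"TypeError: {plain_error}")

with_fallback = json.dumps({"created": stamp}, default=str)
explicit_iso = json.dumps({"created": stamp.isoformat()})

print(with_fallback)   # str(): space between date and time
print(explicit_iso)    # isoformat(): "T" between date and time
```

If downstream tools expect strict ISO 8601 with a "T" separator, converting with isoformat() before serializing is the safer choice; default=str is mainly a safety net against a crash.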

Services Used
  • EC2
  • boto3
Prerequisites
  • Python 3.9+ (the script uses built-in generic annotations such as list[dict])
  • boto3
  • IAM: ec2:DescribeVolumes, ec2:DeleteVolume, ec2:DescribeAddresses, ec2:ReleaseAddress, ec2:DescribeImages, ec2:DeregisterImage, ec2:DeleteSnapshot
What You Learned
  • EBS volume lifecycle states
  • Elastic IP allocation vs association
  • AMI deregistration and snapshot cleanup
  • Cost calculation from volume size and pricing
  • Safe dry-run pattern for destructive scripts
