Migrate from ECS Fargate to EKS With Zero Downtime
Walk through a phased migration strategy from ECS Fargate to Amazon EKS using the Strangler Fig pattern, Route 53 weighted routing, and parallel validation.
Your 15-service application runs on ECS Fargate. The engineering org has standardized on Kubernetes for new projects, and the platform team needs to migrate existing services. Business requirement: no service outages. Technical constraint: services share a database, so you can't do a big-bang migration.
The Problem
A big-bang cutover from ECS to EKS is too risky: by the time EKS shows problems, ECS is already disconnected and there is nothing to fall back to. The Strangler Fig pattern lets you migrate incrementally instead. Run both platforms side by side and shift traffic gradually until ECS handles 0%.
Migration Timeline
Phase 1: Foundation & Setup (Week 1-2)
├── ECS: 100% traffic
└── EKS: 0% traffic (cluster setup, networking, first service deployed)
Phase 2: Canary (Week 3-4)
├── ECS: 90% traffic
└── EKS: 10% traffic (via Route 53 weighted routing)
Phase 3: Progressive Shift (Week 5-6)
├── ECS: 50% → 20% → 5%
└── EKS: 50% → 80% → 95%
Phase 4: Full Cutover (Week 7)
└── EKS: 100% — ECS tasks drained and decommissioned
Step 1: Translate ECS Task Definitions to Kubernetes Manifests
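If the service already has a docker-compose.yml, use kompose to generate a starting point, then refine. kompose reads Compose files, not ECS task definitions, so a service defined only in ECS needs the manual translation shown further down: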
# Convert docker-compose.yml to Kubernetes YAML
kompose convert -f docker-compose.yml -o k8s/
Manual translation example:
ECS Task Definition:
{
  "family": "payment-service",
  "cpu": "512",
  "memory": "1024",
  "containerDefinitions": [{
    "name": "payment-service",
    "image": "123456789.dkr.ecr.us-east-1.amazonaws.com/payment-service:latest",
    "portMappings": [{"containerPort": 8080}],
    "secrets": [{
      "name": "DB_PASSWORD",
      "valueFrom": "arn:aws:secretsmanager:us-east-1:...:secret:prod/db/password"
    }]
  }]
}
Kubernetes Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: payment-service
  template:
    metadata:
      labels:
        app: payment-service
    spec:
      containers:
        - name: payment-service
          image: 123456789.dkr.ecr.us-east-1.amazonaws.com/payment-service:latest
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "250m"
              memory: "512Mi"
            limits:
              cpu: "500m"
              memory: "1Gi"
          env:
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: payment-service-secrets
                  key: DB_PASSWORD
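The manual translation above follows a mechanical mapping, which can be sketched in Python. This is a minimal illustration rather than a complete converter: the function names are hypothetical, it handles only single-container tasks, and it assumes the Kubernetes Secret is named `<container>-secrets` as in the manifest above. It sets only limits, since the right request values are a judgment call per service.

```python
import json

# Task definition copied from the example above (elided ARN kept as-is).
TASK_DEF = {
    "family": "payment-service",
    "cpu": "512",
    "memory": "1024",
    "containerDefinitions": [{
        "name": "payment-service",
        "image": "123456789.dkr.ecr.us-east-1.amazonaws.com/payment-service:latest",
        "portMappings": [{"containerPort": 8080}],
        "secrets": [{
            "name": "DB_PASSWORD",
            "valueFrom": "arn:aws:secretsmanager:us-east-1:...:secret:prod/db/password",
        }],
    }],
}


def ecs_cpu_to_millicores(cpu_units: str) -> str:
    """ECS CPU units: 1024 units = 1 vCPU = 1000 Kubernetes millicores."""
    return f"{int(cpu_units) * 1000 // 1024}m"


def task_def_to_deployment(task_def: dict, namespace: str, replicas: int) -> dict:
    """Build a Deployment dict from a single-container ECS task definition."""
    container = task_def["containerDefinitions"][0]
    name = container["name"]
    # Each ECS secret becomes an env var read from a Kubernetes Secret.
    env = [
        {
            "name": s["name"],
            "valueFrom": {"secretKeyRef": {"name": f"{name}-secrets", "key": s["name"]}},
        }
        for s in container.get("secrets", [])
    ]
    pod_container = {
        "name": name,
        "image": container["image"],
        "ports": [{"containerPort": p["containerPort"]}
                  for p in container.get("portMappings", [])],
        # Task-level cpu/memory become limits; ECS memory is already in MiB.
        "resources": {"limits": {"cpu": ecs_cpu_to_millicores(task_def["cpu"]),
                                 "memory": f"{task_def['memory']}Mi"}},
        "env": env,
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {"containers": [pod_container]},
            },
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON as well as YAML, so the output can be piped to `kubectl apply -f -`.
    print(json.dumps(task_def_to_deployment(TASK_DEF, "production", 3), indent=2))
```

ECS has no equivalent of `replicas` in the task definition (it lives on the ECS service as the desired count), so the sketch takes it as an argument.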
Step 2: Set Up EKS Cluster With VPC CNI
Create the EKS cluster in the same VPC as ECS so both can reach the shared RDS database:
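module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.0"

  cluster_name    = "prod-cluster"
  cluster_version = "1.29"

  vpc_id     = data.terraform_remote_state.networking.outputs.vpc_id
  subnet_ids = data.terraform_remote_state.networking.outputs.private_subnet_ids

  # Let nodes (and pods that use the node security group) open connections to RDS
  node_security_group_additional_rules = {
    egress_rds = {
      description              = "Nodes to RDS (PostgreSQL)"
      type                     = "egress"
      from_port                = 5432
      to_port                  = 5432
      protocol                 = "tcp"
      source_security_group_id = aws_security_group.rds.id
    }
  }
}

# RDS must also accept the connection: allow ingress from the node security group
resource "aws_security_group_rule" "rds_from_eks_nodes" {
  type                     = "ingress"
  from_port                = 5432
  to_port                  = 5432
  protocol                 = "tcp"
  security_group_id        = aws_security_group.rds.id
  source_security_group_id = module.eks.node_security_group_id
}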
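Once the cluster is up, it is worth confirming from inside EKS that the shared database is actually reachable before deploying any service. A minimal sketch of such a check follows; `can_reach` and the `RDS_HOST` environment variable are hypothetical names, and the check only proves that a TCP connection opens on port 5432, not that authentication works.

```python
import os
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and DNS failures
        return False


if __name__ == "__main__":
    # RDS_HOST is a hypothetical environment variable holding the RDS endpoint.
    host = os.environ.get("RDS_HOST")
    if host:
        print(f"{host}:5432 reachable: {can_reach(host, 5432)}")
```

Run it from a pod in the cluster rather than from a workstation, so the check exercises the same security group path the migrated services will use.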
Step 3: Migrate Secrets — External Secrets Operator
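ECS reads secrets directly from Secrets Manager via the task definition. In Kubernetes, use External Secrets Operator to sync Secrets Manager entries into Kubernetes Secrets. The ExternalSecret in this step assumes that a ClusterSecretStore named aws-secrets-manager already exists and that each secret is stored as JSON containing the referenced property: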
helm repo add external-secrets https://charts.external-secrets.io
helm install external-secrets external-secrets/external-secrets -n external-secrets --create-namespace
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: payment-service-secrets
  namespace: production
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: payment-service-secrets
    creationPolicy: Owner
  data:
    - secretKey: DB_PASSWORD
      remoteRef:
        key: prod/db/password
        property: password
    - secretKey: STRIPE_API_KEY
      remoteRef:
        key: prod/payment/stripe
        property: api_key
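With 15 services, writing the `data` entries by hand gets tedious. The sketch below derives one entry from an ECS `secrets` item. It relies on the documented ECS `valueFrom` layout for Secrets Manager (`arn:aws:secretsmanager:region:account:secret:secret-name:json-key:version-stage:version-id`); the function name and the account ID in the usage example are hypothetical. It passes the secret name through exactly as written, so if your task definitions use full ARNs with the random suffix, verify that your secret store resolves that form.

```python
def external_secret_entry(env_name: str, value_from: str) -> dict:
    """Map one ECS secret (name + valueFrom ARN) to an ExternalSecret data entry."""
    parts = value_from.split(":")
    # parts[0:6] are: arn, aws, secretsmanager, region, account, "secret"
    if len(parts) < 7 or parts[2] != "secretsmanager":
        raise ValueError(f"not a Secrets Manager reference: {value_from}")
    remote_ref = {"key": parts[6]}
    # An optional JSON key follows the secret name; it maps to `property`.
    if len(parts) > 7 and parts[7]:
        remote_ref["property"] = parts[7]
    return {"secretKey": env_name, "remoteRef": remote_ref}


if __name__ == "__main__":
    # 111122223333 is a placeholder account ID.
    arn = "arn:aws:secretsmanager:us-east-1:111122223333:secret:prod/db/password:password::"
    print(external_secret_entry("DB_PASSWORD", arn))
```

A reference without a JSON key yields an entry with no `property`, which matches how ECS injects the whole secret value in that case.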
Step 4: Install AWS Load Balancer Controller
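The AWS Load Balancer Controller (formerly the ALB Ingress Controller) creates Application Load Balancers from Kubernetes Ingress resources. With serviceAccount.create=false, the chart expects a service account named aws-load-balancer-controller, bound to an IAM role that carries the controller's policy, to exist in kube-system already: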
helm repo add eks https://aws.github.io/eks-charts
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
-n kube-system \
--set clusterName=prod-cluster \
--set serviceAccount.create=false \
--set serviceAccount.name=aws-load-balancer-controller
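# Ingress that provisions an internal ALB for the EKS service
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: payment-service
  namespace: production
  annotations:
    alb.ingress.kubernetes.io/scheme: internal
    alb.ingress.kubernetes.io/target-type: ip
spec:
  ingressClassName: alb  # replaces the deprecated kubernetes.io/ingress.class annotation
  rules:
    # Must match the Host header clients send, i.e. the Route 53 record name used in Step 6
    - host: payment-service.prod.internal
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: payment-service
                port:
                  number: 80  # Service port; the Service must forward to containerPort 8080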
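The Ingress points at a Service named payment-service on port 80, while the container listens on 8080, so a Service that bridges the two has to exist. A minimal sketch that generates one is below; the function name is hypothetical, and the `app` selector is assumed to match the Deployment labels from Step 1. A ClusterIP Service is enough here because the ALB uses `target-type: ip` and registers pod IPs directly.

```python
import json


def service_manifest(name: str, namespace: str, port: int, target_port: int) -> dict:
    """Build a ClusterIP Service that forwards `port` to the pods' `target_port`."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "type": "ClusterIP",
            "selector": {"app": name},  # assumed to match the Deployment's pod labels
            "ports": [{"port": port, "targetPort": target_port, "protocol": "TCP"}],
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON, so this output can be piped to `kubectl apply -f -`.
    print(json.dumps(service_manifest("payment-service", "production", 80, 8080), indent=2))
```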
Step 5: Parallel Validation — Run Both and Compare
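Before shifting traffic, validate that EKS matches ECS on status codes, error rate, and latency:
# Run a k6 load test against the ECS ALB and the EKS ALB in the same run.
# Tagging each request by platform lets k6 report latency and error rate per platform.
k6 run - <<'EOF'
import http from 'k6/http';
import { check } from 'k6';

export const options = {
  vus: 100,
  duration: '5m',
  // Example budgets only; tune them to your own baseline
  thresholds: {
    'http_req_duration{platform:ecs}': ['p(95)<500'],
    'http_req_duration{platform:eks}': ['p(95)<500'],
    'http_req_failed{platform:eks}': ['rate<0.01'],
  },
};

export default function () {
  // Same request against each platform
  const ecs = http.get('http://ecs-alb.internal/api/payments', { tags: { platform: 'ecs' } });
  check(ecs, { 'ECS 200': (r) => r.status === 200 });

  const eks = http.get('http://eks-alb.internal/api/payments', { tags: { platform: 'eks' } });
  check(eks, { 'EKS 200': (r) => r.status === 200 });
}
EOF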
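A load test only helps if there is an explicit pass/fail rule for "EKS matches ECS". One possible rule, sketched below, is that the EKS p95 latency stays within a tolerance of the ECS p95 and EKS produces no more errors than ECS. The function names and the 10% default tolerance are assumptions for illustration, and how the samples are collected (k6 output, CloudWatch) is left open.

```python
def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def eks_at_parity(ecs_ms, eks_ms, ecs_errors, eks_errors, latency_tolerance=0.10):
    """True if EKS p95 is within the tolerance of ECS p95 and EKS has no more errors."""
    latency_ok = p95(eks_ms) <= p95(ecs_ms) * (1 + latency_tolerance)
    errors_ok = eks_errors <= ecs_errors
    return latency_ok and errors_ok


if __name__ == "__main__":
    ecs_latencies = list(range(1, 101))            # synthetic samples, in ms
    eks_latencies = [ms + 5 for ms in ecs_latencies]
    print(eks_at_parity(ecs_latencies, eks_latencies, ecs_errors=2, eks_errors=1))
```

Comparing EKS against the live ECS baseline, rather than against a fixed number, keeps the gate meaningful even when the shared database is having a slow day.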
Step 6: Traffic Shifting via Route 53 Weighted Routing
# Start: 90% ECS, 10% EKS
aws route53 change-resource-record-sets \
  --hosted-zone-id Z1234567890 \
  --change-batch '{
    "Changes": [
      {
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "payment-service.prod.internal",
          "Type": "CNAME",
          "SetIdentifier": "ECS",
          "Weight": 90,
          "TTL": 30,
          "ResourceRecords": [{"Value": "ecs-alb-dns.us-east-1.elb.amazonaws.com"}]
        }
      },
      {
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "payment-service.prod.internal",
          "Type": "CNAME",
          "SetIdentifier": "EKS",
          "Weight": 10,
          "TTL": 30,
          "ResourceRecords": [{"Value": "eks-alb-dns.us-east-1.elb.amazonaws.com"}]
        }
      }
    ]
  }'
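Monitor CloudWatch error rates and latency for both platforms. If EKS metrics match ECS, shift the ECS/EKS weights to 50/50, then 20/80, then 5/95, and finally 0/100. Weighted DNS is approximate: resolvers cache answers for the TTL, so expect the observed split to lag each change. To roll back, restore the previous weights.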
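Since the same change batch is submitted at every step with only the weights changed, it can help to generate it instead of hand-editing JSON. The sketch below builds the batch for any ECS/EKS split; `SCHEDULE` and `weighted_change_batch` are hypothetical names, and the record name and ALB DNS names are the placeholders used in the command above.

```python
import json

# (ECS weight, EKS weight) for each step of the migration timeline.
SCHEDULE = [(90, 10), (50, 50), (20, 80), (5, 95), (0, 100)]


def weighted_change_batch(record_name, ecs_dns, eks_dns, ecs_weight, eks_weight, ttl=30):
    """Build a Route 53 change batch that UPSERTs both weighted CNAME records."""
    def record(set_id, weight, target):
        return {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "CNAME",
                "SetIdentifier": set_id,
                "Weight": weight,
                "TTL": ttl,
                "ResourceRecords": [{"Value": target}],
            },
        }
    return {"Changes": [record("ECS", ecs_weight, ecs_dns),
                        record("EKS", eks_weight, eks_dns)]}


if __name__ == "__main__":
    ecs_weight, eks_weight = SCHEDULE[1]  # the 50/50 step
    batch = weighted_change_batch(
        "payment-service.prod.internal",
        "ecs-alb-dns.us-east-1.elb.amazonaws.com",
        "eks-alb-dns.us-east-1.elb.amazonaws.com",
        ecs_weight, eks_weight,
    )
    print(json.dumps(batch, indent=2))
```

Write the output to a file and pass it with `--change-batch file://batch.json`. Rolling back is the same call with the previous weights, or with 100 and 0 to return all traffic to ECS.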
Step 7: Drain and Decommission ECS
Once EKS handles 100% of traffic:
# Scale ECS service to 0 (stop tasks but keep service definition)
aws ecs update-service \
--cluster prod-cluster \
--service payment-service \
--desired-count 0
# Monitor for 48 hours, then delete
aws ecs delete-service \
--cluster prod-cluster \
--service payment-service \
--force
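The 48-hour wait is easier to enforce when the delete is gated on data rather than on memory. A minimal sketch of such a gate is below. It assumes you have hourly request counts for the ECS load balancer, oldest first (for example from the ALB `RequestCount` metric in CloudWatch); the function name is hypothetical.

```python
def safe_to_delete(hourly_request_counts, quiet_hours=48):
    """True when at least `quiet_hours` trailing samples exist and all are zero."""
    if len(hourly_request_counts) < quiet_hours:
        return False  # not enough history yet
    return all(count == 0 for count in hourly_request_counts[-quiet_hours:])


if __name__ == "__main__":
    history = [120, 40, 3] + [0] * 48  # synthetic data: traffic tails off, then 48 quiet hours
    print(safe_to_delete(history))
```

Any non-zero hour inside the window means some client is still resolving or caching the ECS endpoint, which is worth tracing before the service is removed.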
Migration Checklist
| Item | Status |
|---|---|
| EKS cluster provisioned in same VPC | ✅ |
| All secrets migrated to External Secrets Operator | ✅ |
| AWS Load Balancer Controller installed | ✅ |
| Observability: CloudWatch Container Insights enabled | ✅ |
| Parallel load test validates EKS parity with ECS | ✅ |
| Route 53 TTL reduced to 30s before cutover | ✅ |
| Rollback plan documented (flip Route 53 weights back) | ✅ |
What You'll Learn
- The Strangler Fig migration pattern applied to container platforms
- How to translate ECS task definitions to Kubernetes manifests
- How to migrate secrets from ECS task definitions to External Secrets Operator
- How to use Route 53 weighted routing for gradual traffic shifting
- How to validate parallel EKS vs ECS performance before cutting over