Migrate a Java Monolith From On-Premises to AWS With Minimal Downtime
Plan and execute a phased migration of an on-premises Java monolith to AWS using a re-platform strategy, AWS DMS for database migration, and a 30-minute cutover window.
Your company runs a 10-year-old Java monolith on bare-metal servers in a data center. The data center contract expires in 6 months. You need to migrate to AWS with minimal downtime — the application processes orders 24/7 and the business allows a maximum 30-minute maintenance window. The application uses Oracle DB (200GB), serves 500 concurrent users, and has 15 external integrations.
The Problem
A 6-month deadline forces choices. You don’t have time to re-architect the monolith into microservices. You need to move the application to AWS, keep it working, and modernize incrementally over the following 12 months.
Step 1: Discovery and Assessment (Weeks 1-2)
Before writing a line of Terraform, understand what you’re moving:
# Deploy AWS Application Discovery Service agents on all on-prem servers
# (done via console: Migration Hub → Discover → Data Collectors)
# After 2 weeks of collection, export the collected data and check the export status
aws discovery start-export-task
aws discovery describe-export-tasks
# List the discovered servers (CPU and RAM details are part of the export)
aws discovery list-configurations \
  --configuration-type SERVER \
  --query 'configurations[*].{Name:"server.hostName",OS:"server.osName",Version:"server.osVersion"}'
Key questions the discovery answers:
- Which servers communicate with each other? (Dependencies to migrate together)
- What external IPs does the app reach? (Firewall rules to replicate)
- What are the peak CPU/memory hours? (Right-size EC2 instances)
- Which processes run as scheduled jobs? (Cron → EventBridge / Lambda)
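The first question (which servers must move together) amounts to finding connected components in the observed connection graph. A minimal sketch, assuming the discovery export has already been reduced to (source, destination) pairs; the server names are hypothetical:

```python
from collections import defaultdict

def move_groups(connections):
    """Group servers that talk to each other, directly or transitively."""
    # Build an undirected adjacency map from the observed connections.
    graph = defaultdict(set)
    for src, dst in connections:
        graph[src].add(dst)
        graph[dst].add(src)

    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        # Iterative depth-first search collects one connected component.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        groups.append(sorted(component))
    return groups

# Hypothetical connection pairs, for illustration only.
observed = [
    ("app-01", "oracle-01"),
    ("app-01", "files-01"),
    ("batch-01", "oracle-01"),
    ("ad-01", "ad-02"),
]
for group in move_groups(observed):
    print(group)
```

Each printed list is a candidate migration wave: everything in it shares a dependency chain, so moving only part of the group would leave some traffic crossing the data center boundary.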
Apply the 6 Rs framework:
| Service | 6R Decision | Rationale |
|---|---|---|
| Java App Server | Re-platform → EC2 (then ECS/EKS) | Same app, managed infra, faster migration |
| Oracle DB | Re-platform → RDS Oracle → Aurora | Managed now; convert to Aurora PostgreSQL after cutover (see Step 6) |
| File Server | Re-host → EFS | Lift and shift shared file system |
| Active Directory | Re-platform → AWS Managed AD | Keep AD for auth, eliminate on-prem AD |
| Batch Jobs (cron) | Re-architect → EventBridge + Lambda | Easy win during migration |
Step 2: Foundation (Weeks 3-6)
# Set up Landing Zone via Control Tower
# Minimum 3 accounts:
# - Management (billing)
# - Production (app workload)
# - Shared Services (ECR, logging)
# Deploy network foundation with Terraform
# VPC with 3-tier architecture
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"
  name    = "prod-vpc"
  cidr    = "10.0.0.0/16"
  azs     = ["us-east-1a", "us-east-1b", "us-east-1c"]
  private_subnets  = ["10.0.10.0/24", "10.0.11.0/24", "10.0.12.0/24"]
  public_subnets   = ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24"]
  database_subnets = ["10.0.20.0/24", "10.0.21.0/24", "10.0.22.0/24"]
  enable_nat_gateway = true
  single_nat_gateway = false # HA: one NAT per AZ
}
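Because the VPC will be routed to the data center, its address plan has to be internally consistent and must not overlap the on-prem range. A small sanity check, written as a sketch: the subnet lists are copied from the module above, while the on-prem CIDR is an assumption inferred only from the Oracle host address 10.100.0.50 used later.

```python
import ipaddress

VPC_CIDR = "10.0.0.0/16"
ON_PREM_CIDR = "10.100.0.0/16"  # assumption: the real data center range may differ
SUBNETS = [
    "10.0.10.0/24", "10.0.11.0/24", "10.0.12.0/24",  # private
    "10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24",     # public
    "10.0.20.0/24", "10.0.21.0/24", "10.0.22.0/24",  # database
]

def check_network_plan(vpc_cidr, subnets, on_prem_cidr):
    """Return a list of problems; an empty list means the plan is consistent."""
    vpc = ipaddress.ip_network(vpc_cidr)
    on_prem = ipaddress.ip_network(on_prem_cidr)
    nets = [ipaddress.ip_network(s) for s in subnets]
    problems = []
    # Overlapping ranges cannot be routed across the private link.
    if vpc.overlaps(on_prem):
        problems.append(f"VPC {vpc} overlaps on-prem range {on_prem}")
    # Every subnet must sit inside the VPC block.
    for net in nets:
        if not net.subnet_of(vpc):
            problems.append(f"{net} is outside VPC {vpc}")
    # No two subnets may share addresses.
    for i, a in enumerate(nets):
        for b in nets[i + 1:]:
            if a.overlaps(b):
                problems.append(f"{a} overlaps {b}")
    return problems

print(check_network_plan(VPC_CIDR, SUBNETS, ON_PREM_CIDR) or "network plan is consistent")
```

Running a check like this before `terraform apply` is cheaper than discovering an overlap after the private link is live.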
# Set up AWS Direct Connect (private 1Gbps path for data migration)
# - Avoids internet bandwidth limits during 200GB database transfer
# - Gives continuous DMS replication a stable private path (a site-to-site VPN can serve as a fallback)
# Provisioning takes 4-8 weeks, so start this FIRST
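The bandwidth argument can be checked with simple arithmetic. The sketch below estimates the raw transfer time for the 200 GB database; the 70% link efficiency and the 100 Mbps internet uplink are assumptions for comparison, and a real DMS full load is usually limited by source reads and LOB handling rather than by the link alone.

```python
def transfer_hours(size_gb, link_gbps, efficiency=1.0):
    """Estimate hours to move size_gb (decimal gigabytes) over a link of link_gbps gigabits/s."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    # 1 gigabyte = 8 gigabits; efficiency discounts protocol overhead and contention.
    seconds = (size_gb * 8) / (link_gbps * efficiency)
    return seconds / 3600

# 200 GB comes from the scenario; the 100 Mbps uplink is hypothetical.
for label, gbps in [("Direct Connect 1 Gbps", 1.0), ("internet 100 Mbps", 0.1)]:
    print(f"{label}: {transfer_hours(200, gbps, efficiency=0.7):.2f} h")
```

Under these assumptions the private 1 Gbps path moves the data in well under an hour, while the slower uplink needs several hours before CDC can even begin catching up.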
Step 3: Database Migration With AWS DMS (Weeks 7-10)
DMS keeps the target in sync through continuous replication, so database downtime shrinks to the cutover window:
Phase 1: Full Load
On-Prem Oracle ────────────────────────────► RDS Oracle
(takes hours, app still writes to on-prem)
Phase 2: Ongoing Replication (CDC)
On-Prem Oracle ──► Capture changes (CDC) ──► RDS Oracle
(lag should stay in the low seconds when replication is healthy; runs for weeks before cutover)
Phase 3: Cutover (30-minute window)
1. Stop writes to on-prem (maintenance mode)
2. Wait for DMS lag to reach 0
3. Validate row counts and checksums
4. Redirect app to RDS
5. Verify app works on AWS
6. Done
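Step 2 of the cutover ("wait for DMS lag to reach 0") is a bounded polling loop. A minimal sketch, assuming a caller-supplied `read_lag_seconds` function (hypothetical; in practice it could read the `CDCLatencyTarget` CloudWatch metric that DMS publishes):

```python
import time

def wait_for_zero_lag(read_lag_seconds, timeout_s=600, poll_s=10,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll read_lag_seconds() until it returns 0. True on success, False on timeout."""
    deadline = clock() + timeout_s
    while True:
        if read_lag_seconds() == 0:
            return True
        if clock() >= deadline:
            return False  # lag never drained: abort the cutover instead of waiting forever
        sleep(poll_s)

# Demo with a fake reader whose lag drains 4 -> 2 -> 0; sleeping is stubbed out.
samples = iter([4, 2, 0])
print(wait_for_zero_lag(lambda: next(samples), sleep=lambda s: None))
```

The timeout matters as much as the loop: with only 30 minutes available, a lag that refuses to drain should trigger the rollback path rather than consume the window.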
# Create DMS replication instance
aws dms create-replication-instance \
--replication-instance-identifier prod-migration-dms \
--replication-instance-class dms.r5.2xlarge \
--allocated-storage 500 \
--multi-az \
--vpc-security-group-ids sg-dms
# Create source endpoint (on-prem Oracle via Direct Connect)
aws dms create-endpoint \
--endpoint-identifier source-oracle-onprem \
--endpoint-type source \
--engine-name oracle \
--server-name 10.100.0.50 \
--port 1521 \
--database-name PRODDB \
--username dms_user \
--password "$ORACLE_PASSWORD"
# Create target endpoint (RDS Oracle)
aws dms create-endpoint \
--endpoint-identifier target-rds-oracle \
--endpoint-type target \
--engine-name oracle \
--server-name prod-oracle.us-east-1.rds.amazonaws.com \
--port 1521 \
--database-name PRODDB \
--username admin \
--password "$RDS_PASSWORD"
# Create replication task (full load + CDC)
aws dms create-replication-task \
--replication-task-identifier prod-oracle-migration \
--source-endpoint-arn $SOURCE_ARN \
--target-endpoint-arn $TARGET_ARN \
--replication-instance-arn $REPLICATION_INSTANCE_ARN \
--migration-type full-load-and-cdc \
--table-mappings '{
"rules": [{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "include-all",
"object-locator": {
"schema-name": "PRODSCHEMA",
"table-name": "%"
},
"rule-action": "include"
}]
}'
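Hand-written JSON inside a shell string gets fragile once more than one schema is involved. The sketch below generates the same kind of selection rules programmatically; `PRODSCHEMA` comes from the task above, while `AUDITSCHEMA` is a hypothetical second schema used only to show how rule IDs and names stay unique.

```python
import json

def selection_rules(schemas, table_pattern="%"):
    """Build a DMS table-mappings document that includes matching tables in each schema."""
    rules = []
    for index, schema in enumerate(schemas, start=1):
        rules.append({
            "rule-type": "selection",
            "rule-id": str(index),                  # must be unique per rule
            "rule-name": f"include-{schema.lower()}",  # must also be unique
            "object-locator": {"schema-name": schema, "table-name": table_pattern},
            "rule-action": "include",
        })
    return {"rules": rules}

mappings = selection_rules(["PRODSCHEMA", "AUDITSCHEMA"])
print(json.dumps(mappings, indent=2))
```

The output can be saved to a file and passed to the CLI as `--table-mappings file://mappings.json`, which avoids shell quoting problems.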
Validate data consistency:
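-- On-prem Oracle
SELECT COUNT(*) FROM orders; -- e.g., 4,823,419
-- RDS Oracle (should match once writes are paused and CDC has caught up)
SELECT COUNT(*) FROM orders; -- 4,823,419 ✓
-- More thorough: checksum validation (run on both sides and compare the results)
-- The '|' delimiter keeps (1, 23) and (12, 3) from producing the same input string
SELECT SUM(ORA_HASH(order_id || '|' || total_amount)) FROM orders;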
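With many tables, comparing counts and checksums by eye does not scale. A minimal sketch of the comparison step, assuming the query results from both databases have already been collected into dictionaries of `table -> (row_count, checksum)`; the checksum and customer figures are made up for illustration:

```python
def find_mismatches(source_stats, target_stats):
    """Compare per-table (row_count, checksum) pairs and describe every difference."""
    mismatches = []
    for table in sorted(set(source_stats) | set(target_stats)):
        src = source_stats.get(table)
        dst = target_stats.get(table)
        if src is None or dst is None:
            side = "source" if src is None else "target"
            mismatches.append(f"{table}: missing on {side}")
        elif src[0] != dst[0]:
            mismatches.append(f"{table}: row count {src[0]} != {dst[0]}")
        elif src[1] != dst[1]:
            mismatches.append(f"{table}: checksum differs with equal row counts")
    return mismatches

# Hypothetical results; the orders row count reuses the example figure above.
source = {"orders": (4823419, 9182736455), "customers": (120000, 5551212)}
target = {"orders": (4823419, 9182736455), "customers": (119998, 5551200)}
print(find_mismatches(source, target))
```

An empty list is the go signal for step 3 of the cutover; anything else names the table to investigate.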
Step 4: Application Migration (Weeks 7-14, Parallel)
While DMS runs, set up the EC2 environment:
# Launch an EC2 instance sized from on-prem utilization data
# - ami-java-app-server: custom AMI with Java + Tomcat pre-installed
# - m5.4xlarge: sized from Application Discovery Service data (Migration Hub EC2 recommendations)
aws ec2 run-instances \
  --image-id ami-java-app-server \
  --instance-type m5.4xlarge \
  --subnet-id subnet-private-az1 \
  --security-group-ids sg-app-server \
  --iam-instance-profile Name=app-server-role \
  --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=prod-app-1}]'
# Deploy application on EC2 using existing Ansible playbooks
# (adapt on-prem playbooks to use RDS connection strings)
ansible-playbook deploy-app.yml \
-i aws-inventory.ini \
-e "db_host=prod-oracle.us-east-1.rds.amazonaws.com" \
-e "environment=aws-staging"
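Parallel validation: Run the AWS instance in "shadow mode": it receives a mirrored copy of real traffic, but its responses are discarded rather than returned to users, so you can compare its behavior with on-prem. Restrict the shadow instance to read-only requests or point it at a separate database copy, so mirrored writes do not collide with the data DMS is replicating.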
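Comparing shadow behavior means diffing two responses to the same request while ignoring fields that legitimately differ. A minimal sketch, assuming responses have been captured as dictionaries; the field names and values are hypothetical:

```python
def normalize(response, volatile=("timestamp", "request_id")):
    """Drop fields that are expected to differ between two environments."""
    return {k: v for k, v in response.items() if k not in volatile}

def diff_responses(primary, shadow):
    """Return {field: (primary_value, shadow_value)} for every field that differs."""
    a, b = normalize(primary), normalize(shadow)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }

# Hypothetical response bodies for the same mirrored request.
on_prem = {"status": 200, "order_total": "149.90", "timestamp": "2024-01-01T10:00:00Z"}
aws = {"status": 200, "order_total": "149.9", "timestamp": "2024-01-01T10:00:01Z"}
print(diff_responses(on_prem, aws))
```

In this made-up case the only difference is number formatting, the kind of subtle drift (driver versions, locale settings) that shadow traffic tends to surface before real users do.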
Step 5: Cutover (30-Minute Maintenance Window)
Prerequisites before starting the window:
- DMS replication lag < 5 seconds (CDC is caught up)
- AWS environment validated with load tests
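- Rollback plan ready (DNS flip back to on-prem, plus a way to carry writes made on RDS back to on-prem Oracle if you roll back after traffic has moved)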
- All stakeholders notified
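The prerequisites work best as an explicit go/no-go gate rather than a mental checklist. A minimal sketch, assuming the four checks have been gathered into one record; the class and field names are illustrative, and the 5-second threshold is the one listed above:

```python
from dataclasses import dataclass

@dataclass
class Readiness:
    cdc_lag_seconds: float
    load_test_passed: bool
    rollback_plan_ready: bool
    stakeholders_notified: bool

def blocking_issues(state, max_lag_seconds=5.0):
    """Return the unmet prerequisites; an empty list means the window can open."""
    issues = []
    if state.cdc_lag_seconds >= max_lag_seconds:
        issues.append(f"CDC lag {state.cdc_lag_seconds}s is not below {max_lag_seconds}s")
    if not state.load_test_passed:
        issues.append("load tests have not passed on AWS")
    if not state.rollback_plan_ready:
        issues.append("rollback plan is not ready")
    if not state.stakeholders_notified:
        issues.append("stakeholders have not been notified")
    return issues

# Hypothetical readiness snapshot taken shortly before the window.
snapshot = Readiness(cdc_lag_seconds=1.2, load_test_passed=True,
                     rollback_plan_ready=False, stakeholders_notified=True)
print(blocking_issues(snapshot) or "GO")
```

Returning the full list of blockers, instead of stopping at the first one, lets the team see everything that still needs attention in a single pass.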
# T-0: Enable maintenance mode on the on-prem load balancer
# (returns HTTP 503 with "Scheduled maintenance" page)
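# T+2: Verify DMS lag has reached 0
# CDCLatencyTarget is published as a CloudWatch metric in the AWS/DMS namespace
# $TASK_RESOURCE_ID (assumed variable): the task's resource ID, the final segment of $TASK_ARN
aws cloudwatch get-metric-statistics \
  --namespace AWS/DMS --metric-name CDCLatencyTarget \
  --dimensions Name=ReplicationInstanceIdentifier,Value=prod-migration-dms Name=ReplicationTaskIdentifier,Value=$TASK_RESOURCE_ID \
  --start-time "$(date -u -d '5 minutes ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  --period 60 --statistics Maximum
# Must show: 0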
# T+5: Run final validation query (the target is Oracle, so use sqlplus)
echo "SELECT COUNT(*) FROM orders;" | sqlplus -S admin/"$RDS_PASSWORD"@//prod-oracle.us-east-1.rds.amazonaws.com:1521/PRODDB
# Must match on-prem count
# T+7: Stop DMS replication task
aws dms stop-replication-task --replication-task-arn $TASK_ARN
# T+10: Update application config to point to RDS (deploy via SSM Parameter Store)
aws ssm put-parameter \
--name /prod/app/db-host \
--value prod-oracle.us-east-1.rds.amazonaws.com \
--type SecureString \
--overwrite
# T+12: Restart app servers to pick up new config
# (tag targets match exact values, so list each instance's Name tag)
aws ssm send-command \
  --document-name AWS-RunShellScript \
  --targets "Key=tag:Name,Values=prod-app-1" \
  --parameters 'commands=["sudo systemctl restart tomcat"]'
# T+15: Update DNS to point to AWS ALB
# (lower the existing record's TTL to 60 well before the window so cached answers expire quickly)
aws route53 change-resource-record-sets \
--hosted-zone-id $ZONE_ID \
--change-batch '{
"Changes": [{
"Action": "UPSERT",
"ResourceRecordSet": {
"Name": "app.company.com",
"Type": "CNAME",
"TTL": 60,
"ResourceRecords": [{"Value": "prod-alb.us-east-1.elb.amazonaws.com"}]
}
}]
}'
# T+18: Verify health checks green on AWS ALB
aws elbv2 describe-target-health --target-group-arn $TG_ARN
# T+25: Traffic now flows to AWS. Keep on-prem in maintenance mode until cached DNS entries expire, so late clients cannot write to the old database
# T+30: Monitor error rates, latency, and database performance
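Monitoring after the window is more useful with a pre-agreed rollback trigger. A minimal sketch of one possible rule, comparing the observed error rate against the on-prem baseline; the tolerance, the minimum sample size, and all numbers are assumptions to tune for your own traffic:

```python
def should_roll_back(errors, requests, baseline_rate, tolerance=2.0, min_requests=100):
    """Flag a rollback when the observed error rate exceeds tolerance x baseline."""
    if requests < min_requests:
        return False  # too little traffic to judge either way
    return (errors / requests) > baseline_rate * tolerance

# Hypothetical: a 0.5% baseline error rate measured on-prem before the cutover.
print(should_roll_back(errors=4, requests=1000, baseline_rate=0.005))
print(should_roll_back(errors=25, requests=1000, baseline_rate=0.005))
```

Agreeing on the threshold before the window removes the temptation to debate it while the error graph is climbing.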
Step 6: Post-Migration Optimization (Months 4-6)
After the migration, modernize incrementally:
| Month | Action | Benefit |
|---|---|---|
| Month 4 | Rightsize EC2 with Compute Optimizer | 20-30% cost reduction |
| Month 4 | Convert Oracle → Aurora PostgreSQL (Schema Conversion Tool) | 60-70% license cost saving |
| Month 5 | Extract stateless batch jobs → Lambda + EventBridge | Eliminate EC2 for scheduled tasks |
| Month 6 | Containerize app → ECS Fargate | Eliminate EC2 management |
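# Use AWS Schema Conversion Tool (SCT) for Oracle → PostgreSQL
# SCT is a standalone desktop application, not an AWS CLI subcommand:
# 1. Install SCT plus the Oracle and PostgreSQL JDBC drivers
# 2. Create a project with RDS Oracle as source and Aurora PostgreSQL as target
# 3. Run the assessment report to identify incompatible SQL constructs
# 4. Convert the schema, fix flagged objects by hand, then apply it to the target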
Key Takeaways
- How to apply the 6 Rs framework to choose the right migration strategy
- How to use AWS DMS for continuous replication during migration
- The difference between a migration cutover and a migration go-live
- How to use Application Discovery Service to map dependencies
- How Direct Connect provides a private high-bandwidth path during migration