AWS • Terraform • ECS • CI/CD
Automated Blue-Green Deployment Platform on AWS ECS
Designed and implemented a production-style deployment platform using AWS ECS Fargate, Terraform, GitLab CI/CD, and Application Load Balancers to achieve near-zero-downtime releases (cutover in under one second) with rollback in under two seconds.
Deployment Downtime: < 1s
Rollback Time: < 2s
Terraform Drift: 0 changes
Deployment Strategy: Blue-Green
The Problem
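Traditional rolling deployments can temporarily reduce serving capacity while tasks are replaced, and they complicate rollback because reverting means rolling the previous version back out instead of redirecting traffic.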
Many Terraform setups also tightly couple service configuration with infrastructure code, which slows onboarding and increases operational complexity.
I wanted to build a deployment system that could:
- Achieve near zero-downtime deployments
- Enable instant rollback
- Scale across multiple services
- Avoid Terraform code changes for new services
- Support production-style CI/CD workflows
Architecture
GitLab CI/CD
↓
Terraform Apply
↓
AWS ECS Fargate
↓
Blue ECS Service ─────┐
                      │
                      ├── Application Load Balancer
                      │
Green ECS Service ────┘
↓
Traffic Switching via ALB Listener Updates
Both blue and green environments run simultaneously. Traffic switching occurs by modifying ALB listener rules, enabling near-instant cutovers and rollbacks.
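To make the cutover idea concrete, here is a minimal Python sketch of the listener action that sends all traffic to one color. It assumes a weighted forward action with both target groups attached. The write-up does not say which action shape the platform uses, and the ARNs and the `forward_action` helper are hypothetical.

```python
from typing import Any


def forward_action(blue_tg_arn: str, green_tg_arn: str, active: str) -> list[dict[str, Any]]:
    """Build ALB listener actions that send 100% of traffic to the active color."""
    if active not in ("blue", "green"):
        raise ValueError(f"unknown color: {active!r}")
    return [{
        "Type": "forward",
        "ForwardConfig": {
            "TargetGroups": [
                {"TargetGroupArn": blue_tg_arn, "Weight": 100 if active == "blue" else 0},
                {"TargetGroupArn": green_tg_arn, "Weight": 100 if active == "green" else 0},
            ]
        },
    }]


# Hypothetical placeholder ARNs, for illustration only.
BLUE = "arn:aws:elasticloadbalancing:region:account:targetgroup/example-blue/1"
GREEN = "arn:aws:elasticloadbalancing:region:account:targetgroup/example-green/2"

cutover = forward_action(BLUE, GREEN, active="green")   # release: green goes live
rollback = forward_action(BLUE, GREEN, active="blue")   # rollback: same call, other color
```

Because both environments keep running, a rollback is the same listener update with the colors reversed. A payload of this shape is what the `DefaultActions` parameter of the ELBv2 `modify_listener` call accepts.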
Key Engineering Decisions
JSON-Driven Infrastructure
Terraform dynamically reads service definitions from JSON files using fileset() and jsondecode(), enabling zero-code onboarding of new services.
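The pattern can be illustrated outside Terraform. The sketch below is a Python analogue of the fileset() plus jsondecode() approach, not the project's HCL: it discovers every JSON file in a directory and builds a map keyed by file name. The directory layout, the service names, and the fields inside each definition are assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def load_services(config_dir: Path) -> dict[str, dict[str, Any]]:
    """Read every *.json file in config_dir into a map keyed by file stem."""
    services: dict[str, dict[str, Any]] = {}
    for path in sorted(config_dir.glob("*.json")):
        services[path.stem] = json.loads(path.read_text(encoding="utf-8"))
    return services


# Demo with two hypothetical service definitions written to a temp directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "orders.json").write_text(json.dumps({"cpu": 256, "port": 8080}))
    (root / "billing.json").write_text(json.dumps({"cpu": 512, "port": 9090}))
    SERVICES = load_services(root)
```

Onboarding a service then means dropping one more JSON file into the directory; the loader, like the Terraform code it mirrors, does not change.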
Stable Terraform Graph
Autoscaling resources use color-stable keys to avoid unintended Terraform recreations during traffic switches.
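The effect of key choice can be sketched with a toy model of how a for_each-style map is diffed: a key that disappears is destroyed and a new key is created. The key formats below are assumptions, since the write-up does not spell out the exact scheme, and `plan` is a deliberately simplified stand-in for Terraform's planner.

```python
COLORS = ("blue", "green")


def stable_targets(services: list[str], active: str) -> dict[str, dict[str, str]]:
    """Keys cover both colors and never depend on which one is live.

    The `active` argument is intentionally unused.
    """
    return {f"{s}-{c}": {"service": s, "color": c} for s in services for c in COLORS}


def unstable_targets(services: list[str], active: str) -> dict[str, dict[str, str]]:
    """Anti-pattern: the key follows the live color, so it changes on every switch."""
    return {f"{s}-{active}": {"service": s, "color": active} for s in services}


def plan(old: dict, new: dict) -> dict[str, list[str]]:
    """Simplified diff: changed keys mean destroy plus create, not update."""
    return {
        "create": sorted(new.keys() - old.keys()),
        "destroy": sorted(old.keys() - new.keys()),
        "update": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


services = ["orders"]
stable_plan = plan(stable_targets(services, "blue"), stable_targets(services, "green"))
unstable_plan = plan(unstable_targets(services, "blue"), unstable_targets(services, "green"))
```

With stable keys the switch from blue to green yields an empty plan; with color-following keys the same switch destroys `orders-blue` and creates `orders-green`.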
Native S3 State Locking
Terraform 1.10's native S3 state locking (the S3 backend's use_lockfile option) removed the need for a DynamoDB lock table and simplified infrastructure bootstrapping.
ALB-Based Traffic Switching
Traffic cutover is handled using AWS CLI listener updates triggered from Terraform null_resource provisioners.
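As a rough illustration, the command such a provisioner might run can be assembled as below. This sketch assumes the listener's default action is replaced with a plain forward to one target group; if the platform edits a listener rule instead, `aws elbv2 modify-rule` would be the analogous command. The ARNs and the helper name are hypothetical.

```python
import shlex


def modify_listener_command(listener_arn: str, target_group_arn: str) -> list[str]:
    """Build the AWS CLI argv that points a listener at one target group."""
    return [
        "aws", "elbv2", "modify-listener",
        "--listener-arn", listener_arn,
        "--default-actions", f"Type=forward,TargetGroupArn={target_group_arn}",
    ]


# Hypothetical placeholder ARNs, for illustration only.
LISTENER = "arn:aws:elasticloadbalancing:region:account:listener/app/example/1/2"
GREEN_TG = "arn:aws:elasticloadbalancing:region:account:targetgroup/example-green/2"

cmd = modify_listener_command(LISTENER, GREEN_TG)
print(shlex.join(cmd))  # shell-safe rendering of the command, for logs or dry runs
```

Building the argv as a list keeps quoting out of the picture; the list could be handed to `subprocess.run(cmd, check=True)` to execute it.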
CI/CD Automation
The deployment pipeline was implemented using GitLab CI/CD with dynamic child pipelines and Python automation scripts.
- Automatic ECR repository provisioning
- Configuration synchronization before Terraform apply
- Dynamic child pipeline generation
- Per-service deployment isolation
- Automated traffic switching workflows
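Dynamic child pipeline generation can be sketched as a small script that writes one deploy job per service. The job names, the single `deploy` stage, and the `service` Terraform variable below are assumptions for illustration and are not taken from the project's pipeline.

```python
def render_child_pipeline(services: list[str]) -> str:
    """Render a GitLab child pipeline with one isolated deploy job per service."""
    if not services:
        raise ValueError("at least one service is required")
    lines = ["stages:", "  - deploy", ""]
    for name in services:
        lines += [
            f"deploy-{name}:",
            "  stage: deploy",
            "  script:",
            # Hypothetical variable name; the real pipeline may pass config differently.
            f'    - terraform apply -auto-approve -var "service={name}"',
            "",
        ]
    return "\n".join(lines)


PIPELINE = render_child_pipeline(["orders", "billing"])
print(PIPELINE)
```

In GitLab, a parent job would typically save the generated file as an artifact, and a trigger job would then include it with `trigger:include:artifact`, so each service gets its own job and its own failure domain.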
Validation & Results
Multiple deployment cycles were tested successfully, including production cutovers and rollback scenarios.
- Traffic cutover completed in under one second
- Rollback completed in under two seconds
- Repeated Terraform applies reported no changes (zero infrastructure drift)
- CPU-based autoscaling validated under synthetic load
- CloudWatch logging verified across deployment cycles
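One way the zero-drift result could be checked automatically is with `terraform plan -detailed-exitcode`, which exits with 0 when there are no changes, 2 when changes are pending, and 1 on error. The write-up does not say how drift was measured, so the sketch below is an assumption; the helper names are hypothetical.

```python
import subprocess


def classify_plan_exit_code(code: int) -> str:
    """Map the exit code of `terraform plan -detailed-exitcode` to a drift status."""
    if code == 0:
        return "no-changes"
    if code == 2:
        return "changes-pending"
    return "error"


def check_drift(workdir: str) -> str:
    """Run a plan in workdir and report whether the infrastructure has drifted."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_plan_exit_code(result.returncode)
```

A pipeline job could call `check_drift` after each traffic switch and fail the run on anything other than `no-changes`.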