
Who this is for: DevOps engineers or platform architects deploying n8n on Amazon ECS who need reliable, production‑grade auto scaling. We cover this in detail in the n8n Performance & Scaling Guide.
Quick Diagnosis
Problem: Your n8n workflow engine runs in an ECS service, but traffic spikes cause CPU throttling and request time‑outs.
Solution: Create a service‑level Auto Scaling configuration that (1) defines a task definition with appropriate cpu/memory reservations, (2) attaches a target‑tracking scaling policy based on CPUUtilization, and (3) adds CloudWatch alarms for visibility, while the scalable target's min/max capacity keeps the desired task count within safe bounds.
Apply the checklist below, redeploy the service, and the desired task count will automatically rise when CPU > 70 % and fall when it drops below 30 %.
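To build intuition for what the target‑tracking policy does, the sketch below approximates how a new desired count is derived: scale the current count proportionally to the ratio of observed CPU to the target, then clamp to the scalable target's min/max. This is an illustrative simplification of AWS's behavior, not the exact service algorithm; the function name and defaults (70 % target, 2–10 tasks) mirror the values used in this guide.

```python
import math

def target_tracking_desired(current_tasks: int, cpu_avg: float,
                            target: float = 70.0,
                            min_cap: int = 2, max_cap: int = 10) -> int:
    """Approximate target tracking: propose a count proportional to the
    ratio of observed CPU to the target, then clamp to [min_cap, max_cap]."""
    proposed = math.ceil(current_tasks * cpu_avg / target)
    return max(min_cap, min(max_cap, proposed))
```

For example, 2 tasks averaging 90 % CPU propose 3 tasks, while 4 tasks idling at 20 % shrink back toward the 2‑task floor.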
1. Prerequisites & IAM Permissions
1.1 Required tools & roles
| Requirement | Why it matters | How to verify |
|---|---|---|
| AWS CLI ≥ 2.7 | Needed for `ecs` and `application-autoscaling` commands | `aws --version` |
| Role `ecsTaskExecutionRole` with `AmazonECSTaskExecutionRolePolicy` | Allows the task to pull the n8n image and write logs to CloudWatch | IAM → Roles |
| Role `ecsServiceAutoScalingRole` with `AWSApplicationAutoScalingECSServicePolicy` | Grants Application Auto Scaling permission to modify the service | IAM → Roles |
| VPC with ≥ 2 subnets (public or private) | Ensures tasks have network connectivity for external webhooks | VPC → Subnets |
| n8n Docker image (e.g., `n8nio/n8n:latest`) | Container that runs the workflow engine | `docker pull n8nio/n8n:latest` (optional) |
EEFA tip – Never grant AdministratorAccess to these roles in production. Scope policies to the specific ECS cluster and service ARNs.
2. Crafting the ECS Task Definition for n8n
Below are small, focused JSON snippets that together form the complete task definition. Insert each snippet into n8n-task-def.json in the order shown.
2.1 Core task metadata
```json
{
  "family": "n8n-ecs-task",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "1024",
  "memory": "2048",
```

1024 CPU units equal 1 vCPU. JSON does not allow inline comments, so the annotation lives here rather than in the file.
2.2 Execution & task roles
```json
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::123456789012:role/n8nTaskRole",
```
2.3 Container definition (core)
```json
  "containerDefinitions": [
    {
      "name": "n8n",
      "image": "n8nio/n8n:latest",
      "portMappings": [{ "containerPort": 5678, "protocol": "tcp" }],
```
2.4 Environment variables
```json
      "environment": [
        { "name": "GENERIC_TIMEZONE", "value": "UTC" },
        { "name": "N8N_BASIC_AUTH_ACTIVE", "value": "true" },
        { "name": "N8N_BASIC_AUTH_USER", "value": "admin" },
        { "name": "N8N_BASIC_AUTH_PASSWORD", "value": "SuperSecret" }
      ],
```
2.5 Log configuration & closing braces
```json
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/n8n",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      }
    }
  ]
}
```
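Before registering, you can sanity‑check the assembled file locally. The helper below is an illustrative sketch (function and message strings are our own, not an AWS tool): it confirms the file parses as JSON and that the Fargate‑required top‑level keys from the snippets above are present.

```python
import json

REQUIRED_TOP_LEVEL = ["family", "networkMode", "requiresCompatibilities",
                      "cpu", "memory", "containerDefinitions"]

def validate_task_def(raw: str) -> list:
    """Return a list of problems found in a task-definition JSON string;
    an empty list means the basic Fargate requirements are satisfied."""
    try:
        doc = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    problems = [f"missing top-level key: {key}"
                for key in REQUIRED_TOP_LEVEL if key not in doc]
    # Fargate only supports the awsvpc network mode.
    if "FARGATE" in doc.get("requiresCompatibilities", []) \
            and doc.get("networkMode") != "awsvpc":
        problems.append("Fargate tasks must use networkMode awsvpc")
    return problems
```

Run it against `n8n-task-def.json` before calling `register-task-definition`; an empty result means the basics are in place.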
Register the definition
```bash
aws ecs register-task-definition \
  --cli-input-json file://n8n-task-def.json
```
EEFA tip – With Fargate, cpu and memory must be set at the task level (not just the container level); registration fails without them, and undersized values leave the auto‑scaler no headroom.
3. Creating the ECS Service
Deploy the service on an existing cluster (n8n-cluster). The command below launches two tasks for high‑availability.
```bash
aws ecs create-service \
  --cluster n8n-cluster \
  --service-name n8n-service \
  --task-definition n8n-ecs-task \
  --desired-count 2 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-abc123,subnet-def456],securityGroups=[sg-0123abcd],assignPublicIp=ENABLED}"
```
Why 2 tasks?
Two tasks guarantee AZ‑level redundancy and give the auto‑scaler headroom to add capacity without a cold‑start penalty.
4. Configuring Service Auto Scaling
4.1 Register a scalable target
```bash
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --resource-id service/n8n-cluster/n8n-service \
  --scalable-dimension ecs:service:DesiredCount \
  --min-capacity 2 \
  --max-capacity 10
```
4.2 Target‑tracking scaling policy (CPU‑based)
Create cpu-policy.json with the following content:
```json
{
  "TargetValue": 70.0,
  "PredefinedMetricSpecification": {
    "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
  },
  "ScaleOutCooldown": 60,
  "ScaleInCooldown": 120
}
```
Apply the policy:
```bash
aws application-autoscaling put-scaling-policy \
  --policy-name n8n-cpu-target-tracking \
  --service-namespace ecs \
  --resource-id service/n8n-cluster/n8n-service \
  --scalable-dimension ecs:service:DesiredCount \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration file://cpu-policy.json
```
EEFA note – ScaleInCooldown should be twice ScaleOutCooldown to prevent rapid oscillation when traffic drops.
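The effect of the two cooldowns can be sketched as a simple gate: a scaling action is only permitted once the cooldown for its direction has elapsed since the last action in that direction. This is a simplified model of AWS's cooldown behavior (the real service tracks scaling activities per policy); the function name and defaults (60 s out, 120 s in) are ours, matching cpu-policy.json above.

```python
def scaling_allowed(now: float, last_scale_out: float, last_scale_in: float,
                    direction: str,
                    scale_out_cooldown: float = 60.0,
                    scale_in_cooldown: float = 120.0) -> bool:
    """Return True when a scaling action in `direction` ('out' or 'in')
    is permitted, i.e. its cooldown since the last such action elapsed.
    Times are seconds on any monotonic clock."""
    if direction == "out":
        return now - last_scale_out >= scale_out_cooldown
    return now - last_scale_in >= scale_in_cooldown
```

With the longer scale‑in cooldown, a brief dip in CPU shortly after a scale‑out cannot immediately remove the capacity that was just added.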
4.3 Optional step‑scaling policy (memory spikes)
Create memory-step.json:
```json
{
  "AdjustmentType": "ChangeInCapacity",
  "Cooldown": 90,
  "MetricAggregationType": "Average",
  "StepAdjustments": [
    {
      "MetricIntervalLowerBound": 0,
      "MetricIntervalUpperBound": 30,
      "ScalingAdjustment": 1
    },
    {
      "MetricIntervalLowerBound": 30,
      "ScalingAdjustment": 2
    }
  ]
}
```

Unlike a target‑tracking configuration, a step‑scaling policy does not embed a metric specification. The `MetricIntervalLowerBound`/`MetricIntervalUpperBound` values are offsets from the threshold of the CloudWatch alarm that triggers the policy, so after creating the policy attach its ARN as an alarm action on a memory alarm (e.g., `MemoryUtilization` in the `AWS/ECS` namespace).
Apply the step‑scaling policy:
```bash
aws application-autoscaling put-scaling-policy \
  --policy-name n8n-memory-step \
  --service-namespace ecs \
  --resource-id service/n8n-cluster/n8n-service \
  --scalable-dimension ecs:service:DesiredCount \
  --policy-type StepScaling \
  --step-scaling-policy-configuration file://memory-step.json
```
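Step selection works on the breach amount, i.e. how far the metric sits above the triggering alarm's threshold. The sketch below illustrates that lookup for the step adjustments in memory-step.json; the function name and the assumed 70 % alarm threshold are ours, chosen only for illustration.

```python
STEP_ADJUSTMENTS = [
    {"MetricIntervalLowerBound": 0, "MetricIntervalUpperBound": 30,
     "ScalingAdjustment": 1},
    {"MetricIntervalLowerBound": 30, "ScalingAdjustment": 2},
]

def pick_adjustment(metric: float, alarm_threshold: float,
                    steps=STEP_ADJUSTMENTS) -> int:
    """Intervals are offsets from the alarm threshold: choose the step
    whose [lower, upper) range contains (metric - threshold)."""
    delta = metric - alarm_threshold
    for step in steps:
        lower = step.get("MetricIntervalLowerBound", float("-inf"))
        upper = step.get("MetricIntervalUpperBound", float("inf"))
        if lower <= delta < upper:
            return step["ScalingAdjustment"]
    return 0  # metric at or below threshold: no action
```

Assuming a 70 % memory alarm, 80 % utilization falls in the first band (+1 task) and 105 % in the second (+2 tasks).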
5. Setting Up CloudWatch Metrics & Alarms
5.1 Alarm matrix
| Alarm | Metric | Threshold | Action |
|---|---|---|---|
| High‑CPU | `CPUUtilization` in `AWS/ECS` (Average) | > 85 % for 2 min | SNS alert + optional manual scale‑out |
| Low‑CPU | Same as above | < 30 % for 5 min | SNS alert – useful for capacity planning |
| Task‑Failure | `RunningTaskCount` in `ECS/ContainerInsights` | < DesiredCount for 3 min | Lambda that restarts the service |
5.2 Create the High‑CPU alarm
```bash
aws cloudwatch put-metric-alarm \
  --alarm-name n8n-HighCPU \
  --metric-name CPUUtilization \
  --namespace AWS/ECS \
  --statistic Average \
  --period 60 \
  --evaluation-periods 2 \
  --threshold 85 \
  --comparison-operator GreaterThanThreshold \
  --dimensions Name=ClusterName,Value=n8n-cluster Name=ServiceName,Value=n8n-service \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:n8n-alerts
```
EEFA insight – Do **not** set TreatMissingData to ignore. In production a missing metric usually means the task stopped reporting and should be treated as **breaching** to trigger rapid remediation.
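The alarm logic above (2 evaluation periods of 60 s, missing data treated as breaching) can be sketched as follows. This is an illustrative model of CloudWatch's evaluation, not its exact implementation; the function name and the `None`-for-missing convention are ours.

```python
def alarm_breaching(datapoints, threshold=85.0, evaluation_periods=2,
                    missing_is_breaching=True):
    """Datapoints are ordered oldest to newest; return True when the
    newest `evaluation_periods` points all breach the threshold.
    `None` models a missing datapoint and, per the note above, is
    treated as breaching when missing_is_breaching is True."""
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return False  # not enough history to evaluate yet
    def breaches(dp):
        return missing_is_breaching if dp is None else dp > threshold
    return all(breaches(dp) for dp in recent)
```

Note how a task that simply stops reporting (`None` datapoints) still trips the alarm, which is exactly the behavior the tip above argues for.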
6. Validation & Troubleshooting Checklist
| Step | What to verify | CLI / Console command |
|---|---|---|
| Task definition registered | `family` appears in ECS console | `aws ecs list-task-definitions --family-prefix n8n-ecs-task` |
| Service running | Desired = Running = 2 (or more) | `aws ecs describe-services --cluster n8n-cluster --services n8n-service` |
| Scalable target set | Min = 2, Max = 10 | `aws application-autoscaling describe-scalable-targets --service-namespace ecs --resource-id service/n8n-cluster/n8n-service` |
| Target‑tracking policy active | Policy ARN listed | `aws application-autoscaling describe-scaling-policies --service-namespace ecs --resource-id service/n8n-cluster/n8n-service` |
| CloudWatch alarm OK | State = OK after calm period | `aws cloudwatch describe-alarms --alarm-names n8n-HighCPU` |
| Logs streaming | `/ecs/n8n` log group shows recent entries | `aws logs tail /ecs/n8n --follow` |
| Network connectivity | Webhook URLs reachable | `curl -s -o /dev/null -w "%{http_code}" http://<ALB-DNS>:5678/healthz` |
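The "Service running" check can be automated by inspecting the `describe-services` JSON. The helper below is a sketch against the documented response shape (`services[].desiredCount` / `runningCount`, plus a `failures` list); the function name is ours.

```python
def service_healthy(describe_services_response: dict) -> bool:
    """Check a single-service describe-services payload: healthy when no
    failures were reported and desiredCount equals runningCount."""
    if describe_services_response.get("failures"):
        return False
    services = describe_services_response.get("services", [])
    if len(services) != 1:
        return False  # expected exactly one service in the response
    svc = services[0]
    return svc.get("desiredCount") == svc.get("runningCount")
```

Feed it the parsed output of `aws ecs describe-services --cluster n8n-cluster --services n8n-service` (for example via `json.loads`) as part of a post‑deploy smoke test.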
Common pitfalls
| Symptom | Likely cause | Fix |
|---|---|---|
| Desired count never exceeds 2 | `max-capacity` set to 2 or missing scalable target | Increase `--max-capacity` |
| Scale‑out takes > 5 min | `ScaleOutCooldown` too high | Reduce to 30–60 seconds (ensure downstream DB can handle burst) |
| Tasks restart repeatedly | Execution role can't pull the image or write logs, or failing health checks | Check the stopped‑task reason; verify `ecsTaskExecutionRole` has `AmazonECSTaskExecutionRolePolicy` |
| CPU metric stays at 0 % | No `AWS/ECS` datapoints yet (service just created, or task‑level `cpu` missing) | Ensure task‑level `cpu` is set and allow a few minutes for metrics to flow |
7. Production‑Ready EEFA Recommendations
- Separate monitoring service – Run a dedicated ECS task with the CloudWatch Agent (`containerInsights`) to isolate metric collection from the n8n workload.
- Graceful shutdown hook – Add `"stopTimeout": 30` to the container definition so in‑flight n8n executions can finish before termination during scale‑in.
- Secure secrets – Store `N8N_BASIC_AUTH_PASSWORD` in AWS Secrets Manager and reference it via the `secrets` block instead of plain environment variables.
- Capacity buffer – Target CPU at **70 %** (instead of 80 %) to keep ~30 % headroom for sudden traffic bursts.
- Blue/Green deployments – Enable the ECS deployment circuit breaker (`--deployment-configuration "deploymentCircuitBreaker={enable=true,rollback=true}"`) to auto‑rollback if new tasks fail health checks.
Conclusion
By defining a properly sized Fargate task, registering a scalable target, and attaching a target‑tracking policy with sensible cooldowns, n8n can automatically grow to meet CPU demand and shrink during idle periods. Complementary CloudWatch alarms and EEFA‑focused hardening (least‑privilege roles, secret management, graceful shutdown) ensure the solution remains robust in production. Apply the checklist, verify each step, and your n8n workflow engine will stay responsive under real‑world traffic spikes without manual intervention.



