Who this is for: Ops engineers and DevOps teams who run n8n in production and need a cost‑effective, performant instance‑sizing strategy. We cover this in detail in the n8n Cost, Scaling & Infrastructure Economics Guide.
Quick Diagnosis
Your n8n server is either throttling under load or you’re paying for idle capacity. Match the concurrency, execution time, and data volume of your workflows to the right cloud‑instance family (CPU‑optimized, memory‑optimized, or burstable). Use the checklist below to size, provision, and monitor the instance. Then enable auto‑scaling for cost‑efficiency.
In production, you’ll usually notice the problem when a webhook surge pushes the CPU past its limits.
1. Profile the n8n Workload
| Metric | How to Measure | Typical n8n Range | Impact on Instance Choice |
|---|---|---|---|
| Concurrent executions | `n8n metrics` → `activeExecutions` | 1 – 200+ | More vCPU / network bandwidth |
| Average execution time | Add a Start → End timestamp in a workflow | < 1 s – > 30 s | Longer jobs need more RAM to avoid swapping |
| Payload size per execution | Log request.body size or use *Workflow Statistics* | < 100 KB – > 10 MB | Large payloads → higher memory + fast storage (NVMe) |
| External I/O (DB, S3, APIs) | Count outbound HTTP calls, DB queries per workflow | 0 – 50 per execution | Choose instances with high network throughput & local SSD |
| Scheduled vs. Event‑driven | Cron vs. webhook frequency | Continuous – bursty | Burstable (t3/t4g) OK for low‑traffic; provisioned for constant traffic |
EEFA note: n8n runs on Node.js; the V8 engine gains more from single‑thread speed than from many cores. Pick high‑clock CPUs (e.g., AWS c6i, GCP c2) when latency matters: a single 3 GHz core often beats four 2 GHz cores for latency‑sensitive steps.
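The table above can be condensed into a simple decision helper. The sketch below is illustrative, not official n8n tooling; the thresholds mirror the "Typical n8n Range" column and should be tuned against your own measurements.

```python
# Heuristic mapper from measured workload metrics to an instance family.
# Thresholds are illustrative, taken from the profiling table above.

def recommend_family(concurrent: int, avg_exec_s: float,
                     payload_mb: float, io_calls: int) -> str:
    """Return a coarse instance-family recommendation."""
    if payload_mb > 10 or concurrent > 100:
        return "memory-optimized"      # r6i / m2 / Esv3
    if io_calls > 20:
        return "io-optimized"          # i3 / local-SSD n2 / Lsv2
    if concurrent < 10 and avg_exec_s < 1:
        return "burstable"             # t4g / e2 / B-series
    if avg_exec_s > 30:
        return "cpu-optimized"         # c6i / c2 / Fsv2
    return "general-purpose"           # m6i / n2 / Dsv4
```

For example, a workflow pool with 5 concurrent sub‑second executions and small payloads lands in the burstable tier, while anything pushing 10 MB payloads jumps straight to memory‑optimized.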
2. Map Workload Profiles to Instance Families
AWS Recommendations
| Workload Profile | Recommended AWS Instance |
|---|---|
| Light‑weight, bursty (< 10 concurrent, < 1 s exec) | `t4g.micro` – `t4g.large` (burstable) |
| CPU‑intensive (JS calculations, heavy transforms) | `c6i.large` – `c6i.xlarge` (≥ 3 GHz) |
| Memory‑heavy (large payloads, many parallel executions) | `r6i.large` – `r6i.xlarge` (≥ 16 GiB) |
| I/O‑bound (frequent DB writes, S3 uploads) | `i3.large` – `i3.xlarge` (NVMe SSD) |
| Hybrid (balanced CPU, RAM, I/O) | `m6i.large` – `m6i.xlarge` (balanced) |
GCP Recommendations
| Workload Profile | Recommended GCP Instance |
|---|---|
| Light‑weight, bursty | `e2-micro` – `e2-standard-2` (burstable) |
| CPU‑intensive | `c2-standard-4` – `c2-standard-8` |
| Memory‑heavy | `m2-standard-4` – `m2-standard-8` |
| I/O‑bound | `n2-standard-4` – `n2-standard-8` with local SSD |
| Hybrid | `n2-standard-4` – `n2-standard-8` |
Azure Recommendations
| Workload Profile | Recommended Azure Instance |
|---|---|
| Light‑weight, bursty | `B1s` – `B2s` (burstable) |
| CPU‑intensive | `Fsv2-Standard-2` – `Fsv2-Standard-4` |
| Memory‑heavy | `Esv3-Standard-2` – `Esv3-Standard-4` |
| I/O‑bound | `Lsv2-Standard-4` – `Lsv2-Standard-8` |
| Hybrid | `Dsv4-Standard-2` – `Dsv4-Standard-4` |
EEFA warning: Burstable instances can exhaust their CPU credits under sustained load, causing sudden throttling. Watch the `CPUCreditBalance` metric (AWS) or its equivalent and switch to a provisioned type before credits hit zero; once the balance is exhausted, the instance is pinned to its baseline and throughput drops sharply.
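You can estimate how long a burstable instance will survive a sustained load before throttling. One CPU credit equals one vCPU at 100 % for one minute, and credits are earned at the baseline rate; the function below applies that arithmetic. The baseline percentage varies by instance type, so check your type's documented value before trusting the numbers.

```python
import math

# Estimate hours until a burstable instance's credit balance hits zero.
# burn  = credits consumed per hour at the observed utilization
# earn  = credits earned per hour at the instance's baseline rate

def hours_until_throttled(balance: float, vcpus: int,
                          util: float, baseline: float) -> float:
    """Return hours until throttling, or inf if the load is sustainable."""
    burn = vcpus * util * 60        # 1 credit = 1 vCPU-minute at 100 %
    earn = vcpus * baseline * 60
    if burn <= earn:
        return math.inf             # at or below baseline: never throttles
    return balance / (burn - earn)
```

A 2‑vCPU instance with a 288‑credit balance, running at 40 % CPU against a 20 % baseline, burns net 24 credits per hour and throttles after 12 hours; that is the window you have to migrate to a provisioned type.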
3. Cost‑Optimization Checklist
- Right‑size before launch – Pick the smallest family that satisfies all three dimensions (CPU, RAM, I/O).
- Enable Spot / Preemptible – For dev, CI, or non‑critical pipelines, run n8n on Spot (AWS) / Preemptible (GCP) with a fallback on‑demand instance. Spot pricing often runs well below on‑demand, but you need a graceful shutdown path for interruptions.
- Reserved Instances / Savings Plans – Lock in a 1‑ or 3‑year term if you expect ≥ 6 months of steady traffic (up to 72 % savings).
- Use Autoscaling Groups – Target ~60 % CPU utilization; let the group add/remove nodes. Pair with an Elastic Load Balancer for zero‑downtime.
- Offload state to managed services – Store workflow data in RDS/Aurora or Redis to reduce RAM pressure on the VM.
- Watch network egress – High outbound traffic can dominate cost; use VPC endpoints or Private Service Connect to reduce per‑GB charges.
EEFA tip: When using Spot, add a **capacity‑rebalancing** lifecycle hook so n8n can finish in‑flight workflows before termination.
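The "target ~60 % CPU" item from the checklist is worth making concrete. Target‑tracking autoscaling effectively scales capacity proportionally to how far the current metric sits from the target; the sketch below shows that math with min/max bounds, so you can sanity‑check what your autoscaling group will do before enabling it.

```python
import math

# Target-tracking capacity math: new desired capacity is roughly
# current capacity scaled by (current CPU / target CPU), rounded up,
# then clamped to the group's min/max. The 60 % target leaves headroom
# for webhook bursts while keeping nodes reasonably utilized.

def desired_capacity(current_nodes: int, current_cpu: float,
                     target_cpu: float = 60.0,
                     min_nodes: int = 1, max_nodes: int = 10) -> int:
    want = math.ceil(current_nodes * current_cpu / target_cpu)
    return max(min_nodes, min(max_nodes, want))
```

Three nodes averaging 90 % CPU against a 60 % target scale out to five; four nodes idling at 30 % scale in to two.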
4. Terraform Provision: Production‑Grade n8n (AWS)
4.1 Variables
```hcl
variable "instance_type" {
  description = "Chosen instance type based on workload profiling"
  type        = string
  default     = "c6i.large"
}

variable "region" {
  default = "us-east-1"
}
```
4.2 Provider & Security Group
```hcl
provider "aws" {
  region = var.region
}

resource "aws_security_group" "n8n_sg" {
  name        = "n8n-sg"
  description = "Allow HTTPS and internal traffic"

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```
4.3 EC2 Instance (core definition)
```hcl
resource "aws_instance" "n8n" {
  ami                    = data.aws_ami.ubuntu.id
  instance_type          = var.instance_type
  key_name               = aws_key_pair.n8n.key_name
  vpc_security_group_ids = [aws_security_group.n8n_sg.id]

  root_block_device {
    volume_type = "gp3"
    volume_size = 30
    iops        = 3000
    throughput  = 125
  }

  tags = {
    Name = "n8n-${var.instance_type}"
  }
}
```
EEFA note: The `gp3` volume with high IOPS approximates an NVMe SSD for less money. If you’re truly I/O‑bound, switch to an `i3` instance with instance‑store NVMe. Usually bumping the instance size is quicker than hunting down a memory leak.
4.4 Bootstrap Script (Docker + n8n)
```hcl
# The user_data below belongs inside the aws_instance.n8n block from
# section 4.3 -- Terraform allows only one resource block per address,
# so extend that block rather than declaring a second one.
resource "aws_instance" "n8n" {
  # ... (attributes from 4.3 unchanged)

  user_data = <<-EOF
    #!/bin/bash
    set -e
    apt-get update && apt-get install -y docker.io
    systemctl enable --now docker
    # Resolve the public IP from instance metadata at boot; referencing
    # aws_instance.n8n.public_ip from its own user_data would create a
    # dependency cycle.
    PUBLIC_IP=$(curl -s http://169.254.169.254/latest/meta-data/public-ipv4)
    docker run -d \
      --name n8n \
      -p 443:5678 \
      -e N8N_HOST="$PUBLIC_IP" \
      -e N8N_PROTOCOL="https" \
      -e N8N_BASIC_AUTH_ACTIVE=true \
      -e N8N_BASIC_AUTH_USER="admin" \
      -e N8N_BASIC_AUTH_PASSWORD="${random_password.n8n.result}" \
      -v /home/ubuntu/.n8n:/root/.n8n \
      n8nio/n8n:latest
  EOF
}
```
4.5 Random Password (for basic auth)
```hcl
resource "random_password" "n8n" {
  length  = 16
  special = false
}
```
EEFA tip: In production, store the generated password in **AWS Secrets Manager** and reference it via an IAM role instead of embedding it in user data.
5. Monitoring, Alerts, and Troubleshooting
| Metric | Recommended Tool | Alert Threshold | Why It Matters |
|---|---|---|---|
| CPU Utilization | CloudWatch (AWS) / Cloud Monitoring (GCP) | > 80 % for > 5 min | Indicates under‑provisioned CPU; may need larger instance or more nodes. |
| Memory Pressure | CloudWatch custom `MemoryUtilization` | > 75 % | V8 GC pauses increase; risk of OOM kills. |
| Disk IOPS/Throughput | CloudWatch `DiskReadOps/WriteOps` | > 90 % of provisioned IOPS | I/O bottleneck; switch to NVMe or raise gp3 IOPS. |
| Network In/Out | CloudWatch `NetworkIn/Out` | > 80 % of bandwidth limit | External API throttling may appear as latency spikes. |
| n8n Queue Length | Prometheus `n8n_active_executions` | > 50 | Back‑log builds up; consider scaling out workers. |
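The queue‑length alert assumes you can read `n8n_active_executions` from n8n's Prometheus endpoint (exact metric names can vary with the configured metrics prefix and n8n version, so verify against your own /metrics output). A real setup scrapes over HTTP; the sketch below just parses an already‑fetched exposition body.

```python
# Minimal parser for Prometheus text exposition: pull a single gauge
# value (e.g. n8n_active_executions) out of a scraped /metrics body.

def gauge_value(metrics_text, name):
    """Return the first sample value for `name`, or None if absent."""
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blanks and HELP/TYPE lines
        parts = line.split()
        if parts[0].split("{")[0] == name and len(parts) >= 2:
            return float(parts[1])
    return None

SAMPLE = """\
# HELP n8n_active_executions Active executions
# TYPE n8n_active_executions gauge
n8n_active_executions 57
"""

if gauge_value(SAMPLE, "n8n_active_executions") > 50:
    print("queue length above threshold - consider scaling out")
```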
The troubleshooting flow below maps common symptom patterns to fixes.
EEFA Troubleshooting Flow
- CPU spike (CPU > 80 %, Memory < 60 %) → Upgrade to next CPU‑optimized size.
- Memory‑bound (CPU < 50 %, Memory > 80 %) → Move to a memory‑optimized instance.
- I/O‑bound (Disk latency > 5 ms) → Switch to `i3` or increase gp3 IOPS.
- Network throttling (high outbound bytes + API timeouts) → Enable VPC endpoints or PrivateLink to the SaaS provider.
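The flow above is mechanical enough to encode directly, which is useful in an alerting pipeline. The function below mirrors the bullets one‑for‑one and returns the first matching remediation; the thresholds are the ones stated above.

```python
# Direct encoding of the EEFA troubleshooting flow: first matching
# rule wins, falling through to "no action needed".

def diagnose(cpu_pct: float, mem_pct: float, disk_latency_ms: float,
             api_timeouts: bool = False) -> str:
    if cpu_pct > 80 and mem_pct < 60:
        return "upgrade to next CPU-optimized size"
    if cpu_pct < 50 and mem_pct > 80:
        return "move to a memory-optimized instance"
    if disk_latency_ms > 5:
        return "switch to i3 or raise gp3 IOPS"
    if api_timeouts:
        return "enable VPC endpoints / PrivateLink"
    return "no action needed"
```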
6. Scaling Strategies for Variable n8n Traffic
| Strategy | When to Use | Implementation Steps |
|---|---|---|
| Horizontal scaling with Docker Swarm / Kubernetes | > 50 concurrent executions, need zero‑downtime upgrades | Deploy n8n as a **stateless service** behind a Load Balancer; store data in external PostgreSQL/Redis. |
| Vertical scaling (instance resize) | Predictable, slowly growing load | Use cloud provider’s “Resize Instance” API; schedule a maintenance window to avoid IP change. |
| Hybrid (burstable + on‑demand) | Mostly idle, occasional bursts (e.g., nightly imports) | Run a baseline t4g instance; trigger an AWS Lambda that launches a c6i Spot instance when queue length > 30. |
| Serverless n8n (AWS Fargate / Cloud Run) | Event‑driven, pay‑per‑execution model | Containerize n8n, set **max concurrency** = 1 per task to avoid state conflicts; use DynamoDB for workflow storage. |
EEFA caution: n8n’s default local SQLite store isn’t shared across nodes. When scaling horizontally, set `DB_TYPE=postgresdb` and point all instances at a shared PostgreSQL database; otherwise you’ll lose workflow history and hit race conditions. In our experience, the horizontal‑scaling path pays off once you cross the 50‑concurrent mark.
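The hybrid row's "launch a Spot instance when queue length > 30" trigger works best with hysteresis, so a brief spike doesn't flap the burst instance up and down. The sketch below separates the launch and terminate thresholds; the lower mark of 10 is an assumed value, not something from the table.

```python
# Queue-driven burst trigger with hysteresis: launch above the
# high-water mark, terminate only below a lower mark, otherwise hold.
# high=30 follows the table above; low=10 is an assumed tear-down mark.

def burst_decision(queue_len: int, burst_running: bool,
                   high: int = 30, low: int = 10) -> str:
    if not burst_running and queue_len > high:
        return "launch"
    if burst_running and queue_len < low:
        return "terminate"
    return "hold"
```

In practice the decision runs on each metrics poll: a queue of 35 launches the burst worker, and the worker stays up until the backlog drains below 10, even while the queue hovers at 20.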
Conclusion
Pick the smallest instance family that meets CPU, memory, and I/O requirements derived from your measured workload. Start with that baseline, enable auto‑scaling, and continuously monitor the key metrics above. This disciplined approach keeps your automations fast, reliable, and cost‑effective in real‑world production.



