Who this is for: DevOps engineers, SREs, or anyone running n8n in the cloud who needs to shrink their monthly bill without losing reliability. We cover this in detail in the n8n Cost, Scaling & Infrastructure Economics Guide.
In production, the bill often spikes after a new workflow that spawns many workers lands in the repo.
Quick Diagnosis & Actionable Fix
| Symptom | Root Cause | One‑Line Fix |
|---|---|---|
| Monthly bill spikes > 30 % | Unlimited EXECUTIONS_PROCESS workers on a small VM | Set EXECUTIONS_PROCESS=main and cap MAX_CONCURRENT_EXECUTIONS to the VM’s CPU count. |
| Idle containers keep running | Default Docker‑Compose restart: always with no health checks | Add restart: unless-stopped + healthcheck + stop_grace_period to shut down idle workers. |
| Storage costs exploding | Unlimited execution history stored in SQLite/MySQL | Enable EXECUTIONS_DELETE_AFTER_DAYS=30 and rotate logs to S3 Glacier. |
| CPU throttling → failed workflows | Over‑provisioned workers on a low‑end instance | Use n8n_node_auto_scale (K8s) or docker‑compose scale to match workers to CPU cores. |
Apply the fixes above first; they usually shave 15‑40 % off the monthly spend with zero functional loss.
1. Map the Real Cost Drivers in an n8n Deployment
| Component | Typical Cost % (AWS) | Why It Grows | Monitoring Metric |
|---|---|---|---|
| Compute (EC2 / Fargate) | 45 % | Unlimited workers, high‑CPU loops | CPUUtilization, RunningContainers |
| Data Store (RDS / Aurora) | 25 % | Execution history, large JSON payloads | DBConnections, DiskUsage |
| Object Storage (S3) | 12 % | Binary data (files, PDFs) kept indefinitely | BucketSizeBytes, NumberOfObjects |
| Network (Data Transfer) | 8 % | Large payloads between nodes, webhook callbacks | BytesOut, BytesIn |
| Auxiliary (CloudWatch, Secrets Manager) | 10 % | Over‑logging, unused secrets | LogEvents, SecretsCount |
EEFA note: Over‑provisioning compute is the most common hidden expense. Scaling down without a proper autoscaling policy will cause silent CPU throttling, breaking time‑critical automations.
2. Trim Compute Costs – Right‑Size Workers & Autoscaling
Micro‑summary: Limit how many workers run, and let the platform add or remove them automatically based on load.
2.1. Cap Workers with Environment Variables
Put the core env vars into your docker‑compose.yml. They limit parallel executions to the VM’s capacity.
```yaml
services:
  n8n:
    image: n8nio/n8n:latest
    environment:
      - EXECUTIONS_PROCESS=main          # single-process mode
      - MAX_CONCURRENT_EXECUTIONS=4      # match vCPU count
      - WORKER_TIMEOUT=600               # kill idle workers after 10 min
      - EXECUTIONS_DELETE_AFTER_DAYS=30  # purge old executions
```
- `EXECUTIONS_PROCESS=main` disables the default per‑execution forked workers.
- `MAX_CONCURRENT_EXECUTIONS` should never exceed the number of vCPUs; otherwise the OS will swap and waste CPU credits.
- Most teams discover this mismatch only after a few weeks of steady traffic.
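One way to keep the cap aligned with the host is to derive it at launch time. This is a sketch of a launch wrapper; the export pattern is an assumption about how you pass env vars into your compose file:

```shell
# Derive the concurrency cap from the actual vCPU count at launch time
VCPUS=$(nproc)
export MAX_CONCURRENT_EXECUTIONS="$VCPUS"
echo "Capping parallel executions at $MAX_CONCURRENT_EXECUTIONS"
```

With `MAX_CONCURRENT_EXECUTIONS=${MAX_CONCURRENT_EXECUTIONS}` in the compose `environment:` block, resizing the VM automatically resizes the worker cap.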
2.2. Docker‑Compose Autoscaling (non‑K8s)
Run a lightweight script every five minutes via cron. It reads CPU usage and tweaks the worker replica count.
```shell
# Gather current state
CURRENT=$(docker ps -q -f "name=n8n_worker" | wc -l)
CPU=$(docker stats --no-stream --format "{{.CPUPerc}}" n8n | awk -F% '{print $1}')

# Scale up if CPU > 70 % and we have room; scale down if < 30 %
if (( $(echo "$CPU > 70" | bc -l) )) && [ "$CURRENT" -lt 8 ]; then
  docker-compose up -d --scale n8n_worker=$((CURRENT+2))
elif (( $(echo "$CPU < 30" | bc -l) )) && [ "$CURRENT" -gt 2 ]; then
  docker-compose up -d --scale n8n_worker=$((CURRENT-2))
fi
```
EEFA warning: Abruptly stopping workers can interrupt active jobs. Add `stop_grace_period: 30s` to the service definition so in‑flight executions finish gracefully.
In practice, the script’s five‑minute interval provides a good balance between responsiveness and stability.
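The idle-container fix from the quick-diagnosis table can be sketched as a compose fragment. The service name mirrors the worker service used in this section; the `/healthz` path assumes n8n’s standard health endpoint:

```yaml
services:
  n8n_worker:
    image: n8nio/n8n:latest
    restart: unless-stopped        # don't resurrect deliberately stopped workers
    stop_grace_period: 30s         # let in-flight executions finish
    healthcheck:
      test: ["CMD-SHELL", "wget -qO- http://localhost:5678/healthz || exit 1"]
      interval: 30s
      timeout: 5s
      retries: 3
```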
2.3. Kubernetes‑Native Autoscaling (EKS/GKE)
Deploy an HPA that watches CPU and leaves a buffer for bursty webhook traffic.
```yaml
apiVersion: autoscaling/v2   # v2beta2 was removed in Kubernetes 1.26
kind: HorizontalPodAutoscaler
metadata:
  name: n8n-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: n8n
  minReplicas: 2
  maxReplicas: 12
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 65
```
Set averageUtilization between 55 %–70 % to keep a cushion for sudden webhook spikes. Most operators find 65 % a sweet spot that avoids both over‑scaling and throttling.
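To see why 65 % leaves headroom, it helps to run the HPA’s own scaling formula — `desired = ceil(current × currentUtilization / target)` — on a hypothetical burst; the replica and utilization figures below are illustrative:

```shell
# HPA replica math: desired = ceil(current * currentUtil / targetUtil)
CURRENT_REPLICAS=4   # hypothetical current state
CURRENT_UTIL=90      # observed average CPU % during a webhook burst
TARGET_UTIL=65       # the target from the manifest above
DESIRED=$(awk -v r="$CURRENT_REPLICAS" -v c="$CURRENT_UTIL" -v t="$TARGET_UTIL" \
  'BEGIN { d = r * c / t; if (d > int(d)) d = int(d) + 1; print d }')
echo "HPA would scale to $DESIRED replicas"
```

A burst to 90 % CPU on 4 replicas resolves to 6 replicas, which absorbs the spike without pinning any single pod.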
3. Shrink Data‑Store Expenses
Micro‑summary: Delete old execution records, pick a cost‑effective database, and index only what you need.
3.1. Execution History Pruning
Run the built‑in delete command nightly. It removes executions older than the configured retention period.
```shell
docker exec -t n8n n8n execution:delete --older-than=30d
```
Why: Each completed execution stores a full JSON payload; a busy instance can exceed 10 GB of history in a month.
If you forget this step, storage costs can balloon before you notice.
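One way to schedule the nightly prune (assuming the container is named `n8n`, as in the command above) is a root crontab entry:

```
# Prune executions older than 30 days at 03:15 every night
15 3 * * * docker exec -t n8n n8n execution:delete --older-than=30d >> /var/log/n8n-prune.log 2>&1
```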
3.2. Switch to a Cost‑Effective DB Engine
| Engine | Monthly Cost (t2.micro) | Pros | Cons |
|---|---|---|---|
| SQLite (file‑based) | $0 (included in EC2) | Zero‑cost, simple backup | Not HA, limited concurrent writes |
| PostgreSQL (RDS‑Free Tier) | $0‑$15 | ACID, scalable reads | Slightly higher storage cost |
| Aurora Serverless v2 | $0‑$30 (pay‑per‑use) | Auto‑scales, pay‑only for active seconds | Cold‑start latency |
Recommendation: Above 5 k executions/day, move to Aurora Serverless v2 with `maxCapacity: 2`; it keeps you under **$20/mo** while still auto‑scaling.
3.3. Index‑Only Queries for Large Payloads
Create a partial index that covers only the columns you filter on, avoiding bloat from JSON fields.
```sql
CREATE INDEX idx_execution_status ON execution (status) WHERE status = 'success';
```
EEFA note: Full indexes on JSON columns increase storage size; partial indexes stay lightweight.
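A query that filters on the same predicate can then use the partial index. This sketch assumes the Postgres `execution` table from the statement above:

```sql
-- The WHERE clause matches the index predicate, so the planner can use idx_execution_status
EXPLAIN SELECT id FROM execution WHERE status = 'success' ORDER BY id DESC LIMIT 50;
```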
4. Optimize Object Storage & Bandwidth
Micro‑summary: Compress files before upload, move stale objects to Glacier, and use caching where possible.
| Action | Cost Impact | Implementation |
|---|---|---|
| Compress binary payloads (gzip) before S3 upload | ↓ ≈ 40 % | Add a pre‑node: {{ $json["file"] | gzip }} |
| Lifecycle rule → Glacier for > 30 days data | ↓ ≈ 70 % | S3 console → Management → Lifecycle |
| Enable Transfer Acceleration only for external APIs | ↓ ≈ 15 % | aws s3api put-bucket-accelerate-configuration … |
| Cache webhook responses with CloudFront | ↓ ≈ 20 % | Set Cache-Control: max-age=300 in webhook node |
EEFA warning: Deleting objects before a running workflow finishes will raise “File not found” errors. Use a reference‑counter node that only deletes after the last dependent workflow completes.
A simple “last‑step” clean‑up node prevents most of these hiccups.
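The ≈ 40 % figure for gzip depends heavily on payload shape; repetitive JSON compresses far better than that. A quick local check with a synthetic payload (no S3 involved):

```shell
# Build a ~84 KB synthetic JSON payload, then measure the gzip saving
TMP=$(mktemp -d)
for i in $(seq 1 2000); do
  echo '{"event":"invoice.created","amount":100}'
done > "$TMP/payload.json"
gzip -k "$TMP/payload.json"             # -k keeps the original for comparison
RAW=$(wc -c < "$TMP/payload.json")
GZ=$(wc -c < "$TMP/payload.json.gz")
echo "raw=${RAW} bytes, gzipped=${GZ} bytes"
```

Run the same comparison on a sample of your real payloads before committing to a compression step in every workflow.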
5. Logging & Monitoring – Pay‑Only for What You Need
Micro‑summary: Lower log volume, expose Prometheus metrics, and trigger cost‑saving actions automatically.
5.1. Reduce CloudWatch Log Volume
Set the log level to warn and log only to the console. A side‑car fluent‑bit can filter out debug entries before they hit CloudWatch.
```yaml
environment:
  - N8N_LOG_LEVEL=warn      # default is info
  - N8N_LOG_OUTPUT=console  # avoid duplicate file logs
```
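The side‑car filter mentioned above could be a fluent‑bit `grep` filter in classic config syntax; the `Match` tag and the downstream CloudWatch output are assumptions about your pipeline:

```
[FILTER]
    Name     grep
    Match    n8n.*
    Exclude  log (debug|DEBUG)
```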
5.2. Centralized Metrics Dashboard (Grafana + Prometheus)
Expose n8n’s built‑in metrics endpoint (set `N8N_METRICS=true`) and let Prometheus scrape it.
```yaml
scrape_configs:
  - job_name: 'n8n'
    static_configs:
      - targets: ['n8n:5678']  # n8n exposes /metrics
```
Track n8n_execution_duration_seconds and alert when the 95th percentile exceeds 30 s – a sign of CPU starvation.
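That alert can be expressed as a Prometheus rule; the `_bucket` series name assumes the metric is exported as a histogram:

```yaml
groups:
  - name: n8n-latency
    rules:
      - alert: N8nSlowExecutions
        expr: histogram_quantile(0.95, sum(rate(n8n_execution_duration_seconds_bucket[5m])) by (le)) > 30
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "n8n p95 execution time above 30 s – possible CPU starvation"
```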
5.3. Alert‑Driven Cost Controls
| Alert | Threshold | Automated Action |
|---|---|---|
| CPU > 80 % for 5 min | 80 % | Scale up HPA by 2 replicas |
| DB storage > 80 % | 80 % | Trigger execution:delete --older-than=7d |
| S3 bucket size > 5 GB | 5 GB | Apply lifecycle rule to Glacier |
6. Checklist – “Is My n8n Stack Cost‑Optimized?”
- ☐ Worker limits – `MAX_CONCURRENT_EXECUTIONS` ≤ vCPU count.
- ☐ Process mode – `EXECUTIONS_PROCESS=main` (unless heavy parallelism is required).
- ☐ Execution retention – `EXECUTIONS_DELETE_AFTER_DAYS` ≤ 30 days.
- ☐ Database choice – Aurora Serverless v2 with `maxCapacity` tuned to traffic.
- ☐ S3 lifecycle – Move objects older than 30 days to Glacier.
- ☐ Log level – `N8N_LOG_LEVEL=warn` in production.
- ☐ Autoscaling – HPA configured, CPU target 65 %.
- ☐ Health checks – Docker `healthcheck` defined; containers restart only on failure.
- ☐ Cost alerts – CloudWatch alarms for CPU, DB storage, S3 size.
7. Real‑World Production Tips (EEFA)
- Cold‑Start Mitigation – Keep Aurora Serverless at a minimum of 0.5 ACU to avoid > 5 s cold starts after idle periods.
- Zero‑Downtime Deploys – Use a blue‑green Docker strategy: launch a new container with the updated config, route traffic via an Nginx upstream, then gracefully shut down the old container once no executions are active (`docker exec n8n n8n execution:list --status=running`).
- Secrets Management – Store DB credentials and API keys in AWS Secrets Manager and inject them at container start. This removes hard‑coded secrets and narrows IAM policy scope (lower compliance cost).
- Network Egress Savings – If most webhooks are internal, place n8n in a private subnet and use VPC endpoints for S3. This eliminates NAT‑gateway egress charges.
Conclusion
By capping worker concurrency, pruning execution history, moving to a pay‑per‑use database, and tightening autoscaling, logging, and storage policies, you can shave 15‑40 % off your n8n infrastructure spend without compromising reliability. Follow the checklist, monitor the alerts, and iterate each month to keep costs predictable and under control.



