Who this is for: teams that need to decide how to run n8n in production, from a solo developer to an enterprise automation group. We cover the underlying architecture in detail in the Production‑Grade n8n Architecture guide.
Quick Diagnosis
| Situation | Recommended n8n deployment |
|---|---|
| < 100 daily workflow runs, single‑user or small team, brief downtime acceptable | Single‑instance (Docker or binary) – simplest, cheapest, fastest start‑up |
| ≥ 100 daily runs or burst traffic, need HA, zero‑downtime upgrades, or geographic distribution | Multi‑instance (clustered) – Kubernetes, Docker Swarm, or PM2‑managed fleet with a shared DB & queue |
| Must guarantee no data loss during node failure | Multi‑instance with PostgreSQL + Redis + health‑checks |
| Budget‑constrained, can tolerate brief downtime for upgrades | Single‑instance with scheduled maintenance windows |
*Need horizontal scaling, automatic fail‑over, or per‑team isolation? Go multi‑instance. Otherwise a single‑instance is fine for most small‑to‑medium automations.*
Real‑world note – The “< 100 runs” rule usually holds until a new integration spikes traffic; then the cluster becomes attractive.
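The rules of thumb above can be condensed into a tiny shell helper. This is purely illustrative – the function name and inputs are ours, not part of n8n:

```shell
#!/bin/sh
# Illustrative helper applying the quick-diagnosis rules of thumb above.
# Inputs: expected daily executions, and "yes"/"no" for whether HA or
# zero-downtime upgrades are required.
recommend_deployment() {
  daily_runs=$1
  need_ha=$2
  if [ "$daily_runs" -ge 100 ] || [ "$need_ha" = "yes" ]; then
    echo "multi-instance"
  else
    echo "single-instance"
  fi
}

recommend_deployment 50 no    # low volume, no HA -> single-instance
recommend_deployment 500 no   # high volume -> multi-instance
```

Treat the output as a starting point, not a verdict – the checklist in section 4 covers the softer factors (team size, compliance, budget).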
1. Core Architectural Differences
| Aspect | Single‑instance n8n | Multi‑instance n8n (cluster) |
|---|---|---|
| Process model | One Node.js process runs all workflows | Multiple Node.js processes (pods, containers, or PM2 workers) share the same DB/queue |
| State storage | SQLite (default) or a local PostgreSQL instance | Centralized PostgreSQL (or MySQL) + optional Redis for the job queue |
| Scalability | Vertical only (more CPU/RAM on the same host) | Horizontal – add/remove workers on demand |
| High‑availability | None – single point of failure | Built‑in HA via DB replication & load‑balancer health checks |
| Deployment complexity | Low – `docker run` or binary | Medium‑high – K8s manifests, Docker Compose with multiple services, or PM2 ecosystem |
| Cost | One VM/container (≈ $5‑$15 /mo) | Multiple nodes + managed DB/Redis (≈ $30‑$150 /mo) |
| Typical use‑case | Prototyping, personal automations, small teams | Enterprise automations, CI/CD pipelines, SaaS integrations, regulated environments |
EEFA Note – A single instance defaults to SQLite. SQLite cannot be safely accessed by multiple processes; running two n8n containers against the same SQLite file will corrupt the database. Switch to PostgreSQL before scaling out.
2. When to Choose Single‑instance?
*Good for low throughput, tight budgets, and rapid prototyping.*
2.1 Deploy a production‑ready single‑instance with PostgreSQL
Step 1 – Spin up a dedicated PostgreSQL container
```shell
docker run -d \
  --name pg-n8n \
  -e POSTGRES_USER=n8n \
  -e POSTGRES_PASSWORD=StrongPass123 \
  -e POSTGRES_DB=n8n \
  -p 5432:5432 \
  postgres:15-alpine
```
Step 2 – Launch n8n pointing at the external DB
```shell
docker run -d \
  --name n8n-single \
  -p 5678:5678 \
  -e DB_TYPE=postgresdb \
  -e DB_POSTGRESDB_HOST=host.docker.internal \
  -e DB_POSTGRESDB_PORT=5432 \
  -e DB_POSTGRESDB_DATABASE=n8n \
  -e DB_POSTGRESDB_USER=n8n \
  -e DB_POSTGRESDB_PASSWORD=StrongPass123 \
  -e N8N_BASIC_AUTH_ACTIVE=true \
  -e N8N_BASIC_AUTH_USER=admin \
  -e N8N_BASIC_AUTH_PASSWORD=SuperSecret! \
  n8nio/n8n:latest
```
EEFA – Common errors & fixes
| Symptom | Fix |
|---|---|
| `ECONNREFUSED` when n8n connects to PostgreSQL | Verify DB host/port and that the container can reach it (use `host.docker.internal` on macOS/Windows, or `--network=host` on Linux). |
| No TLS on DB in production | Add DB_POSTGRESDB_SSL=true and configure PostgreSQL with a certificate. |
> *Quick tip*: If you already have a Docker‑Compose stack, just add the PostgreSQL service and reuse the same network – it shaves a few lines of config.
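As a sketch, adding the database to an existing Compose file could look like the fragment below. Service and volume names are placeholders; in a real deployment pin image versions and move the password into a secret:

```yaml
services:
  postgres:
    image: postgres:15-alpine
    environment:
      POSTGRES_USER: n8n
      POSTGRES_PASSWORD: StrongPass123
      POSTGRES_DB: n8n
    volumes:
      - pg-data:/var/lib/postgresql/data

  n8n:
    image: n8nio/n8n:latest
    depends_on:
      - postgres
    environment:
      DB_TYPE: postgresdb
      DB_POSTGRESDB_HOST: postgres   # resolves over the shared Compose network
      DB_POSTGRESDB_DATABASE: n8n
      DB_POSTGRESDB_USER: n8n
      DB_POSTGRESDB_PASSWORD: StrongPass123
    ports:
      - "5678:5678"

volumes:
  pg-data:
```

Because both services sit on the default Compose network, n8n can reach the database by service name – no `host.docker.internal` workaround needed.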
3. When to Choose Multi‑instance (Cluster)?
*Ideal for horizontal scaling, zero‑downtime upgrades, and compliance‑driven reliability.*
3.1 Architecture Blueprint (Kubernetes)
ConfigMap – shared environment variables
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: n8n-config
data:
  DB_TYPE: "postgresdb"
  DB_POSTGRESDB_HOST: "postgres-n8n"
  DB_POSTGRESDB_DATABASE: "n8n"
  DB_POSTGRESDB_USER: "n8n"
  DB_POSTGRESDB_SSL: "true"
  N8N_LOG_LEVEL: "info"
```
Deployment – three replicas (pair with a horizontal pod autoscaler for elastic scaling)
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: n8n
spec:
  replicas: 3
  selector:
    matchLabels:
      app: n8n
  template:
    metadata:
      labels:
        app: n8n
    spec:
      containers:
        - name: n8n
          image: n8nio/n8n:latest
          ports:
            - containerPort: 5678
          envFrom:
            - configMapRef:
                name: n8n-config
          readinessProbe:
            httpGet:
              path: /healthz
              port: 5678
            initialDelaySeconds: 5
            periodSeconds: 10
```
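A Deployment like this pairs naturally with a HorizontalPodAutoscaler. The sketch below uses the stable `autoscaling/v2` API; the replica bounds and CPU target are illustrative, not n8n recommendations:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: n8n
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: n8n          # must match the Deployment name above
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80   # scale out above 80 % average CPU
```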
Service – external load balancer
```yaml
apiVersion: v1
kind: Service
metadata:
  name: n8n
spec:
  type: LoadBalancer
  selector:
    app: n8n
  ports:
    - port: 80
      targetPort: 5678
```
The readiness probe stops traffic to a pod that’s still starting up – a small detail that saves a lot of flaky runs.
EEFA – Race‑condition fix
If you see `Error: Unable to acquire lock on workflow`, switch to Redis‑backed queue mode and set:
```yaml
env:
  - name: EXECUTIONS_MODE
    value: "queue"
  - name: QUEUE_BULL_REDIS_HOST
    value: "redis"
  - name: QUEUE_BULL_REDIS_PORT
    value: "6379"
```
In queue mode the main instance only enqueues executions; start at least one worker process (`n8n worker`) to actually run them.
3.2 Docker‑Compose Example with Redis Queue
PostgreSQL service
```yaml
postgres:
  image: postgres:15-alpine
  environment:
    POSTGRES_USER: n8n
    POSTGRES_PASSWORD: StrongPass123
    POSTGRES_DB: n8n
  volumes:
    - pg-data:/var/lib/postgresql/data
```
Redis service (persistent, AOF enabled)
```yaml
redis:
  image: redis:7-alpine
  command: ["redis-server", "--appendonly", "yes"]
  volumes:
    - redis-data:/data
```
n8n service – shared DB & queue
```yaml
n8n:
  image: n8nio/n8n:latest
  depends_on:
    - postgres
    - redis
  environment:
    DB_TYPE: postgresdb
    DB_POSTGRESDB_HOST: postgres
    DB_POSTGRESDB_DATABASE: n8n
    DB_POSTGRESDB_USER: n8n
    DB_POSTGRESDB_PASSWORD: StrongPass123
    EXECUTIONS_MODE: queue
    QUEUE_BULL_REDIS_HOST: redis
    QUEUE_BULL_REDIS_PORT: 6379
  ports:
    - "5678:5678"
  deploy:
    mode: replicated
    replicas: 4
    resources:
      limits:
        memory: 512M
```
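In queue mode the main service only handles the UI, webhooks, and scheduling; executions are picked up by separate worker processes. A sketch of a dedicated worker service for the same stack (names mirror the services above):

```yaml
n8n-worker:
  image: n8nio/n8n:latest
  command: worker          # image entrypoint starts a queue-mode worker
  depends_on:
    - postgres
    - redis
  environment:
    DB_TYPE: postgresdb
    DB_POSTGRESDB_HOST: postgres
    DB_POSTGRESDB_DATABASE: n8n
    DB_POSTGRESDB_USER: n8n
    DB_POSTGRESDB_PASSWORD: StrongPass123
    EXECUTIONS_MODE: queue
    QUEUE_BULL_REDIS_HOST: redis
    QUEUE_BULL_REDIS_PORT: 6379
  deploy:
    replicas: 4
```

Scale the worker replicas, not the main service, when execution throughput is the bottleneck.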
EEFA – Scaling Redis
When you run more than four replicas, bump Redis maxmemory (e.g., maxmemory 2gb) or enable clustering. In practice we hit this ceiling after a sudden influx of webhook events, so monitoring early helps.
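Whatever the memory ceiling, eviction must stay disabled for a job queue – with any eviction policy Redis can silently drop queued executions under memory pressure. The relevant `redis.conf` lines (the 2 GB figure is the example from above, not a recommendation):

```
maxmemory 2gb
maxmemory-policy noeviction
```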
4. Decision Checklist – Choose the Right Deployment
| Checklist Item | Single‑instance | Multi‑instance |
|---|---|---|
| Expected daily executions | ≤ 100 | > 100 |
| Need zero‑downtime upgrades | No | Yes |
| Geographic latency requirements | Local only | Multi‑region |
| Team size / role segregation | ≤ 5 users, no RBAC needed | > 5 users, separate credential sets |
| Budget for managed DB/Redis | Low (optional) | Medium‑high (managed PostgreSQL + Redis) |
| Compliance / audit log retention | Basic file logs | Centralized DB + immutable backups |
| Operational expertise | Basic Docker/CLI | Kubernetes, Helm, or Docker‑Swarm orchestration |
Quick Action: If any of your answers land in the “Multi‑instance” column, start planning a clustered deployment; otherwise spin up a single‑instance today.
5. Migration Path – From Single to Multi‑instance
| Phase | Tasks | Command / Tool |
|---|---|---|
| Export data | Export workflows & credentials from the SQLite instance | `n8n export:workflow --backup --output=backup/` and `n8n export:credentials --backup --output=backup/`, then the matching `n8n import:*` commands against the new DB |
| Provision shared services | Deploy PostgreSQL + Redis (managed or self‑hosted) | Helm chart bitnami/postgresql and bitnami/redis |
| Update n8n config | Set `DB_TYPE=postgresdb`, `EXECUTIONS_MODE=queue`, `QUEUE_BULL_REDIS_HOST=redis` | Edit ConfigMap or .env |
| Deploy first worker | Run one n8n pod/container, verify workflow execution | kubectl rollout restart deployment/n8n |
| Scale out | Increase replica count, monitor CPU/Memory | `kubectl scale deployment n8n --replicas=5` |
| Enable health checks | Add liveness/readiness probes, configure load‑balancer health checks | See K8s manifest above |
| Cutover | Point DNS to new LoadBalancer, decommission old single‑instance | Update DNS A record, stop old container |
EEFA – Migration pitfalls
| Symptom | Fix |
|---|---|
| Error: Cannot read property ‘id’ of undefined after migration | Re‑import via the n8n CLI (`n8n export:workflow --backup` on the old instance, `n8n import:workflow` on the new one) instead of copying raw database rows. |
| Workflows fail silently | Keep the old single‑instance running for 24 h as a fallback while you validate the cluster. |
*Pro tip*: Run the migration during a low‑traffic window; debugging is far less stressful.
6. Monitoring & Alerting for Multi‑instance n8n
| Metric | Recommended Tool | Alert Threshold |
|---|---|---|
| Workflow execution latency | Prometheus + Grafana (`n8n_execution_time_seconds`) | > 5 s over a 5‑minute window |
| Node health | K8s readinessProbe failures | > 2 consecutive failures |
| Redis queue length | Redis Exporter (`redis_queue_length`) | > 10 000 pending jobs |
| PostgreSQL connection errors | pgBouncer stats or `pg_stat_activity` | > 5 % connection errors |
| CPU / Memory per pod | K8s HPA metrics | CPU > 80 % for 3 min → auto‑scale |
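The queue-length threshold from the table can be wired into a Prometheus alerting rule. This is a sketch – the metric name is taken from the table above, and the rule/label names are illustrative:

```yaml
groups:
  - name: n8n-queue
    rules:
      - alert: N8nQueueBacklog
        # Fires when the pending-job count stays above 10k for 5 minutes
        expr: redis_queue_length > 10000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "n8n job queue backlog exceeds 10 000 pending jobs"
```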
**EEFA tip** – Enable structured logging (N8N_LOG_OUTPUT=json) and ship logs to a centralized system (e.g., Loki) to correlate workflow failures with infrastructure events.
7. Real‑World Example – Scaling a Ticket‑Automation Pipeline
Scenario: A SaaS support team processes ≈ 2 000 tickets per hour via n8n → Zendesk → Slack notifications.
| Step | Implementation |
|---|---|
| DB | Managed PostgreSQL (AWS RDS) with a read replica for reporting |
| Queue | Redis (AWS ElastiCache) with maxmemory 4gb |
| Workers | 6‑node Kubernetes deployment (2 CPU, 1 GiB each) |
| Ingress | Nginx Ingress with sticky sessions disabled (stateless) |
| Autoscaling | HPA target CPU 65 % → scales 2‑12 pods based on load |
| Result | 99.99 % SLA, zero‑downtime releases, cost ≈ $120 / month |
Bottom Line
- Single‑instance gets n8n up fast and cheap for low‑volume, low‑risk automations.
- Multi‑instance (cluster) is required when you need horizontal scaling, high availability, or regulatory‑grade reliability.
- Follow the checklist, run the migration steps, and put proper monitoring in place so your n8n deployment grows with the business without surprise outages.



