Who this is for: Platform engineers and DevOps teams that need to run n8n reliably at scale. We cover this in detail in the n8n Production Readiness & Scalability Risks Guide.
Quick Diagnosis
Problem: n8n works in development but fails under production load, leaks credentials, or loses workflow state.
Fast‑track fix: Run the checklist below, apply every Critical item, and redeploy. This resolves the most common production‑grade failures.
*In production, this usually shows up when the DB connection drops or a webhook payload exceeds the default size limit.*
Core Infrastructure Requirements
If you encounter any of the common n8n architecture mistakes, resolve them before continuing with the setup.
These items are the ones we watch when a fresh n8n install starts misbehaving under load.
| Item | Recommended Setting | Why It Matters |
|---|---|---|
| Persistent Storage | Mount /home/node/.n8n to a durable volume (Docker: -v n8n-data:/home/node/.n8n) | Guarantees workflow definitions and execution data survive container restarts. |
| Dedicated DB | External PostgreSQL ≥ 13, not SQLite for > 10k executions/month | SQLite is a single file and prone to corruption under concurrency. |
| CPU / Memory | 2 vCPU + 4 GiB RAM baseline; add 0.5 vCPU per 1 k concurrent executions | Prevents OOM kills and CPU throttling during peak workflow runs. |
| Network Isolation | Deploy n8n in its own Docker network or Kubernetes namespace | Limits blast radius if a compromised workflow tries lateral movement. |
| TLS Termination | Reverse proxy (Traefik, NGINX) with Let’s Encrypt certificates | Encrypts API traffic and protects webhook payloads. |
How to verify: check each row directly on the host, e.g. docker exec <container> ls /home/node/.n8n to confirm the data volume is mounted, or psql -U $POSTGRES_USER -d $POSTGRES_DB -c "\dt" to confirm the external database is reachable.
Note: Never expose port 5678 directly to the internet; always place a TLS‑terminating proxy in front.
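As a sketch of that TLS‑terminating proxy, an NGINX server block in front of n8n might look like the following; the hostname, certificate paths, and upstream name are assumptions to adapt to your environment:

```nginx
# Terminate TLS at the edge and forward to the internal n8n container.
server {
    listen 443 ssl;
    server_name n8n.example.com;                                      # assumed hostname
    ssl_certificate     /etc/letsencrypt/live/n8n.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/n8n.example.com/privkey.pem;

    location / {
        proxy_pass http://n8n:5678;                                   # assumed upstream
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto https;
        # WebSocket upgrade headers so the n8n editor UI keeps working
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}

# Force HTTP to HTTPS
server {
    listen 80;
    server_name n8n.example.com;
    return 301 https://$host$request_uri;
}
```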
Security Hardening Checklist
If you encounter any of the hidden costs of cheap n8n hosting, resolve them before continuing with the setup.
| Item | Recommended Value | Reason |
|---|---|---|
| N8N_BASIC_AUTH_ACTIVE | true | Disables anonymous UI access. |
| N8N_BASIC_AUTH_USER / N8N_BASIC_AUTH_PASSWORD | Random 16‑plus‑char strings | Prevents credential stuffing. |
| N8N_ENCRYPTION_KEY | 32‑byte Base64 secret (openssl rand -base64 32) | Encrypts stored credentials & secrets. |
| N8N_DISABLE_PRODUCTION_WARNINGS | false | Keeps safety warnings visible. |
| WEBHOOK_TUNNEL_URL | Never set in production | Stops accidental exposure of local tunnels. |
| Secret Management | Store all env vars in a secret manager (AWS Secrets Manager, Vault) | Avoids plaintext secrets in Dockerfiles or git. |
Warning: Changing N8N_ENCRYPTION_KEY after credentials are stored will render those credentials unusable. Migrate data before rotating the key.
Most teams don't think about the encryption key until they hit a credential‑related error.
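Minting the key is a one‑liner; the sketch below also sanity‑checks its length before you hand it to your secret manager (the variable name and the upload step are illustrative):

```bash
# Generate a 32-byte random key and Base64-encode it, as n8n expects
# for N8N_ENCRYPTION_KEY.
ENC_KEY=$(openssl rand -base64 32)

# 32 raw bytes always Base64-encode to exactly 44 characters; fail fast
# if something truncated the value along the way.
if [ "${#ENC_KEY}" -ne 44 ]; then
  echo "unexpected key length: ${#ENC_KEY}" >&2
  exit 1
fi
echo "key length OK"

# Then store it, e.g. (assumed tooling):
#   aws secretsmanager create-secret --name n8n/enc-key --secret-string "$ENC_KEY"
```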
High Availability & Scaling
If you encounter the n8n execution‑history time bomb, resolve it before continuing with the setup.
1. Horizontal Scaling with Docker‑Compose (Swarm)
Docker‑Compose makes it easy to spin up a replicated stack, but remember that the volume must be shared across replicas.
Deploy a replicated stack – the snippet below defines a three‑replica service with resource limits.
```yaml
version: "3.8"
services:
  n8n:
    image: n8nio/n8n:latest
    deploy:
      mode: replicated
      replicas: 3
      resources:
        limits:
          cpus: "1.0"
          memory: 2G
    environment:
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_DATABASE=n8n
      - DB_POSTGRESDB_USER=${POSTGRES_USER}
      - DB_POSTGRESDB_PASSWORD=${POSTGRES_PASSWORD}
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=${BASIC_USER}
      - N8N_BASIC_AUTH_PASSWORD=${BASIC_PASS}
      - N8N_ENCRYPTION_KEY=${ENC_KEY}
    volumes:
      - n8n-data:/home/node/.n8n
    ports:
      - "5678:5678"
    depends_on:
      - postgres
  postgres:
    image: postgres:13-alpine
    environment:
      POSTGRES_DB: n8n
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - pg-data:/var/lib/postgresql/data

volumes:
  n8n-data:
  pg-data:
```
Deploy with:
```bash
docker stack deploy -c docker-compose.yml n8n
```
2. Kubernetes (StatefulSet + Service)
Kubernetes will schedule each pod on a different node if resources allow, which removes a single point of failure.
StatefulSet definition – each replica receives its own persistent volume claim.
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: n8n
spec:
  serviceName: "n8n"
  replicas: 3
  selector:
    matchLabels:
      app: n8n
  template:
    metadata:
      labels:
        app: n8n
    spec:
      containers:
        - name: n8n
          image: n8nio/n8n:latest
          envFrom:
            - secretRef:
                name: n8n-secrets
          ports:
            - containerPort: 5678
          volumeMounts:
            - name: n8n-data
              mountPath: /home/node/.n8n
  # Each replica gets its own PVC stamped out from this template.
  volumeClaimTemplates:
    - metadata:
        name: n8n-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 5Gi   # size is an example; adjust to your retention needs
---
apiVersion: v1
kind: Service
metadata:
  name: n8n
spec:
  selector:
    app: n8n
  ports:
    - protocol: TCP
      port: 80
      targetPort: 5678
  type: ClusterIP
```
Tip: Use a ReadWriteMany PVC (e.g., NFS, Ceph) only if you need shared storage across pods. Otherwise each replica gets its own copy, preventing divergent workflow states.
3. Load‑Balancing Webhooks
- Ingress – route /webhook/* to the n8n service.
- Sticky sessions – enable sessionAffinity: ClientIP only when you have a single‑node DB; otherwise let the database handle state. Sticky sessions are rarely required in practice.
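As a sketch, assuming an NGINX ingress controller and the ClusterIP service defined earlier, the webhook route might look like this (the hostname is an assumption):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: n8n
spec:
  ingressClassName: nginx
  rules:
    - host: n8n.example.com      # assumed hostname
      http:
        paths:
          # Route webhook traffic to the n8n service; add a catch-all
          # path for the editor UI if it is served from the same host.
          - path: /webhook
            pathType: Prefix
            backend:
              service:
                name: n8n
                port:
                  number: 80
```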
Monitoring, Logging, & Alerting
| Component | Recommended Tool | Config Snippet |
|---|---|---|
| Metrics | Prometheus + node‑exporter | Set N8N_METRICS=true to expose /metrics. |
| Logs | Loki + Grafana | Ship container logs with the Loki Docker log driver: docker run -d --log-driver=loki --log-opt loki-url=http://loki:3100/loki/api/v1/push n8nio/n8n |
| Health Checks | K8s liveness/readiness probes | httpGet: path: /healthz, port: 5678 |
| Alerting | Alertmanager | CPU > 80% for 5 min, DB errors; expr: sum(rate(container_cpu_usage_seconds_total{container="n8n"}[1m])) by (instance) > 0.8 |
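Wired into Prometheus, the CPU expression from the Alerting row becomes a rule file; the `for` duration, severity label, and summary text here are assumptions beyond the expression itself:

```yaml
groups:
  - name: n8n-alerts
    rules:
      - alert: HighCpuUsage
        # Fires when the n8n container averages > 80% of one CPU core
        expr: sum(rate(container_cpu_usage_seconds_total{container="n8n"}[1m])) by (instance) > 0.8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "n8n CPU above 80% for 5 minutes on {{ $labels.instance }}"
```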
Note: Do not rely solely on the UI “Workflow Execution History” for production diagnostics; it truncates after 100 entries. Centralized logging is mandatory for forensic analysis.
In the field, we’ve seen alerts go silent if the metrics endpoint isn’t exposed.
Backup, Restore, & Disaster Recovery
- Database dump (PostgreSQL)

```bash
pg_dump -U $POSTGRES_USER -Fc $POSTGRES_DB > n8n_$(date +%F).dump
```

- Workflow export via API

```bash
curl -X GET "https://n8n.example.com/rest/workflows" \
  -H "Authorization: Bearer $API_TOKEN" \
  -o workflows_$(date +%F).json
```

- Automated snapshot – schedule a daily cron job (or K8s CronJob) that runs both steps and pushes the artifacts to an off‑site object store (AWS S3, GCS).
- Restore workflow

```bash
pg_restore -U $POSTGRES_USER -d $POSTGRES_DB n8n_2024-12-01.dump
curl -X POST "https://n8n.example.com/rest/workflows" \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  --data @workflows_2024-12-01.json
```
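The automated-snapshot step can be sketched as a Kubernetes CronJob; the schedule, secret name, and PVC are assumptions, and the upload-to-S3 step is left out for brevity:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: n8n-backup
spec:
  schedule: "0 2 * * *"              # nightly at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: pg-dump
              image: postgres:13-alpine
              envFrom:
                - secretRef:
                    name: n8n-secrets   # assumed to hold POSTGRES_* and PGPASSWORD
              command: ["/bin/sh", "-c"]
              args:
                - pg_dump -h postgres -U "$POSTGRES_USER" -Fc "$POSTGRES_DB" > /backup/n8n_$(date +%F).dump
              volumeMounts:
                - name: backup
                  mountPath: /backup
          volumes:
            - name: backup
              persistentVolumeClaim:
                claimName: n8n-backup-pvc   # assumed pre-provisioned
```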
Warning: Restoring a dump that contains encrypted credentials will fail if the N8N_ENCRYPTION_KEY has changed. Keep the key version‑controlled alongside your backup policy, and make sure the backup job runs with the same N8N_ENCRYPTION_KEY as the live instance.
CI/CD & Automated Deployments
| Step | Tool | Example |
|---|---|---|
| Lint & Test | n8n-cli (n8n lint) + Jest for custom nodes | npm run lint && npm test |
| Container Build | GitHub Actions → Docker Buildx | Workflow snippet below |
| Deploy | Argo CD (K8s) or Docker Swarm stack update | docker stack deploy -c docker-compose.yml n8n |
| Smoke Test | Post‑deployment curl to /healthz | curl -f https://n8n.example.com/healthz |
| Rollback | Keep the previous image tag | docker service update --image n8nio/n8n:1.24.0 n8n_n8n |

Build‑and‑push workflow for the Container Build step:

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Build & push
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ghcr.io/yourorg/n8n:${{ github.sha }}
```
Tip: A quick smoke test after each deploy catches most misconfigurations before they hit users.
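A slightly more forgiving smoke test retries before failing the deploy. This is a sketch; the `smoke_test` name, URL, and retry counts are illustrative:

```bash
# Poll an endpoint until it answers successfully or retries are exhausted.
# curl -f exits non-zero on HTTP errors, so the if-branch only succeeds
# on a healthy response.
smoke_test() {
  url=$1
  retries=${2:-5}
  i=0
  while [ "$i" -lt "$retries" ]; do
    if curl -fsS "$url" >/dev/null 2>&1; then
      echo "healthy"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "unhealthy"
  return 1
}

# Example: smoke_test "https://n8n.example.com/healthz" 10 || trigger_rollback
```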
Freeze the N8N_ENCRYPTION_KEY in a secret manager and reference it via ${{ secrets.N8N_ENCRYPTION_KEY }}. Changing the key mid‑pipeline breaks all stored credentials.
Final Verification & Ongoing Audits
| Checklist Item | Pass/Fail | Evidence |
|---|---|---|
| All critical env vars sourced from a secret manager | | |
| TLS terminates at the edge; HTTP → HTTPS redirect enforced | | curl -I http://n8n.example.com → 301 to https |
| Backup succeeded for the last 24 h | | S3 object list shows today's dump |
| Prometheus metrics scraped without errors | | up{job="n8n"} == 1 in Grafana |
| No default credentials exist | | docker exec n8n grep -i admin .env returns nothing |
| Load test ≥ 500 req/s with < 200 ms latency | | k6 script results attached |
Run this verification after every major version upgrade. Document any deviation and create a JIRA ticket for remediation.
Running the checklist after each upgrade is a habit that saves a lot of firefighting later.
By systematically ticking each row in the tables above, you turn a bare‑bones n8n instance into a production‑grade automation engine that meets reliability, security, and compliance expectations.



