Common n8n Architecture Mistakes Seen in Real Companies

Step by Step Guide to solve common n8n architecture mistakes 


Who this is for:  Ops engineers, platform architects, and senior developers who run n8n in production and need a reliable, secure setup.
In production, the SQLite issue often surfaces after a few hundred runs. We cover this in detail in the n8n Production Readiness & Scalability Risks Guide.


Quick Diagnosis & Fix

 

| Mistake | Symptom | One‑Line Fix |
| --- | --- | --- |
| Default SQLite DB | Crashes after a few hundred runs | Switch to PostgreSQL (DB_TYPE=postgresdb) |
| Single‑process n8n (no queue) | Latency spikes, missed webhooks | Enable queue mode with Redis (EXECUTIONS_MODE=queue) |
| Plain‑text credentials in the UI | Credential leaks, GDPR risk | Set N8N_ENCRYPTION_KEY + use an external secret store |
| Unprotected webhooks | Unauthorized POSTs | Add HMAC signing or basic auth |
| No dev/prod separation | Accidental data loss, config drift | Deploy isolated Docker‑Compose stacks per environment |

*These steps are often missed in initial setups.*
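The core remediation comes down to three actions: (1) replace SQLite with PostgreSQL, (2) enable queue mode, (3) harden credentials and webhooks. Environment separation (Section 5) keeps those fixes from drifting between dev and prod.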


1. Mistake #1 – Relying on SQLite

If you encounter any of the issues covered in the n8n production readiness checklist, resolve them before continuing with the setup.

The default image creates an SQLite database inside the container. That is suitable for a quick demo but insufficient for production traffic.

Why SQLite breaks under load

  • Each write locks the whole file → bottleneck when many webhooks fire together.
  • No replication or point‑in‑time recovery → if the database file is not on a persistent volume, recreating the container wipes all data.
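
To see why a file‑level write lock becomes a bottleneck, here is a back‑of‑the‑envelope model in JavaScript. It is a sketch, not a benchmark: `drainTimeMs` is a hypothetical helper, and the 5 ms write time and the burst size are assumed numbers chosen only for illustration.

```javascript
// Rough model: time needed to drain a burst of pending writes.
// With a single writer (file-level lock) writes run one after another;
// with N concurrent writers, up to N writes proceed in parallel.
function drainTimeMs(pendingWrites, writeMs, concurrentWriters) {
  const rounds = Math.ceil(pendingWrites / concurrentWriters);
  return rounds * writeMs;
}

// Assumed numbers for illustration only: 500 webhook writes, 5 ms each.
const singleWriter = drainTimeMs(500, 5, 1);  // 2500 ms
const tenWriters = drainTimeMs(500, 5, 10);   // 250 ms
console.log({ singleWriter, tenWriters });
```

The model ignores lock contention overhead and retries, so real single‑writer behavior under a burst is likely worse than the linear estimate.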

Better choice: PostgreSQL

| Feature | SQLite | PostgreSQL |
| --- | --- | --- |
| Concurrency | Single writer | MVCC, many writers |
| Scaling | Requires shared FS | Native read replicas |
| Backup | Manual file copy | pg_dump & PITR |
| Security | Plain file | Role‑based, TLS |

Docker‑Compose snippet – PostgreSQL service

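services:
  postgres:
    image: postgres:15-alpine
    environment:
      - POSTGRES_USER=n8n_user
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      # Must match DB_POSTGRESDB_DATABASE in the n8n service
      - POSTGRES_DB=n8n
    volumes:
      - pgdata:/var/lib/postgresql/data

# Top-level key (place at the end of the compose file): named volume so data survives container re-creation
volumes:
  pgdata: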

Docker‑Compose snippet – n8n service (DB vars only)

  n8n:
    image: n8nio/n8n:latest
    environment:
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_DATABASE=n8n
      - DB_POSTGRESDB_USER=n8n_user
      - DB_POSTGRESDB_PASSWORD=${POSTGRES_PASSWORD}
      - N8N_ENCRYPTION_KEY=${N8N_ENCRYPTION_KEY}
    ports: ["5678:5678"]
    depends_on: [postgres]

Swapping the database is typically faster than attempting to patch SQLite. EEFA note: Do not expose the PostgreSQL port to the internet; keep it on the internal Docker network or a VPC‑restricted endpoint.


2. Mistake #2 – Running n8n Without a Queue

If you encounter any of the issues covered in “hidden cost of cheap n8n hosting”, resolve them before continuing with the setup.

CPU spikes on the n8n container while the rest of the stack remains idle usually point to a missing queue.

What will be observed

  • Webhook latency > 2 s under load.
  • “Execution failed” messages despite trigger arrival.
  • CPU spikes on the n8n container while other services stay idle.

Queue mode solves it

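| Aspect | Single‑process | Queue (Redis) |
| --- | --- | --- |
| Concurrency | All executions compete inside one process | Executions spread across worker processes |
| Fault tolerance | Crash = lost jobs | Queued jobs survive in Redis |
| Horizontal scaling | Not possible | Add more worker containers |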

Docker‑Compose snippet – Redis service

services:
  redis:
    image: redis:7-alpine
    command: ["redis-server", "--appendonly", "yes"]

Docker‑Compose snippet – n8n with queue vars

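  n8n:
    image: n8nio/n8n:latest
    environment:
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
      - QUEUE_BULL_REDIS_PORT=6379
      - DB_TYPE=postgresdb
    ports: ["5678:5678"]
    depends_on: [redis, postgres]

  # Worker: executes queued jobs; needs the same DB and N8N_ENCRYPTION_KEY settings as the main service
  n8n-worker:
    image: n8nio/n8n:latest
    command: worker
    environment:
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
      - QUEUE_BULL_REDIS_PORT=6379
      - DB_TYPE=postgresdb
    depends_on: [redis, postgres]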

Enabling queue mode provides a cost‑effective path to horizontal scaling. EEFA note: Enable Redis authentication (`--requirepass` on the Redis server, `QUEUE_BULL_REDIS_PASSWORD` on the n8n side) and restrict network access to the n8n containers only.
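
The split between a main instance and workers can be illustrated without Redis. The sketch below is an in‑memory stand‑in, not n8n's implementation: `runWithWorkers` and the handler are hypothetical, and a real deployment relies on a Redis‑backed queue and separate worker processes.

```javascript
// In-memory illustration of the queue pattern: jobs wait in a list and a
// fixed pool of workers pulls them one at a time until the list is empty.
async function runWithWorkers(jobs, workerCount, handler) {
  const queue = [...jobs];
  const results = [];

  async function worker(id) {
    while (queue.length > 0) {
      const job = queue.shift(); // take the next waiting job
      const output = await handler(job);
      results.push({ job, worker: id, output });
    }
  }

  // Start all workers and wait until the queue has drained.
  await Promise.all(Array.from({ length: workerCount }, (_, i) => worker(i)));
  return results;
}

// Usage: six simulated executions spread over three workers.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
runWithWorkers([1, 2, 3, 4, 5, 6], 3, async (n) => {
  await sleep(10); // stand-in for real workflow work
  return n * 2;
}).then((results) => console.log(`processed ${results.length} jobs`));
```

Adding capacity means raising `workerCount`; in queue mode the equivalent step is starting more worker containers.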


3. Mistake #3 – Storing Plain‑Text Credentials

If you encounter any of the issues covered in “n8n execution history time bomb”, resolve them before continuing with the setup.

Plain‑text credential problems typically surface after a few weeks of use, not on day 1.

Risks

  • Former staff can export the JSON and reuse API keys.
  • DB dumps flagged by secret‑scanning tools become a compliance incident.

Secure options

| Option | Pros | Cons |
| --- | --- | --- |
| Built‑in AES (N8N_ENCRYPTION_KEY) | Quick, works out‑of‑the‑box | Key rotation is manual |
| External secret manager (AWS Secrets Manager, Vault) | Auditing, auto‑rotation | Extra latency & cost |
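
For the built‑in option, N8N_ENCRYPTION_KEY should be a long random value rather than a memorable string. A minimal sketch using Node's built‑in crypto module; `generateEncryptionKey` is a hypothetical helper, and the 32‑byte length and hex encoding are assumptions, not n8n requirements.

```javascript
const crypto = require('crypto');

// Generate a random key; 32 random bytes become 64 hex characters.
function generateEncryptionKey(bytes = 32) {
  return crypto.randomBytes(bytes).toString('hex');
}

// Print a line that can be pasted into an .env file kept out of version control.
console.log(`N8N_ENCRYPTION_KEY=${generateEncryptionKey()}`);
```

Store the generated value once and reuse it: credentials encrypted with one key cannot be decrypted with another.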

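Example: Pulling secrets from HashiCorp Vault

services:
  n8n:
    image: n8nio/n8n:latest
    environment:
      # Illustrative names: these VAULT_* variables are not documented n8n settings;
      # an entrypoint script or custom code has to read them and fetch the secrets.
      - VAULT_ENDPOINT=https://vault.mycorp.com
      - VAULT_TOKEN=${VAULT_TOKEN}
      - VAULT_PATH=n8n/credentials
      - N8N_ENCRYPTION_KEY=${N8N_ENCRYPTION_KEY}

Credentials in the UI should then reference the secret at runtime instead of storing the raw value. n8n's External Secrets feature supports HashiCorp Vault for this; check the current n8n documentation for plan availability and setup.
The built‑in encryption is adequate for small teams; larger organizations should plan for a secret manager. EEFA note: Never commit VAULT_TOKEN to source control. Use Docker secrets or Kubernetes Secret objects instead.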


4. Mistake #4 – Exposing Webhooks Without Authentication

Unauthenticated webhooks are often abused to flood an endpoint.

Typical attack vector

An attacker POSTs arbitrary payloads to /webhook/…, causing unwanted data creation or exhausting API quotas.

Hardened pattern

  1. Enable HMAC signing – the sender signs the request body with a shared secret, and the workflow verifies the signature header.
  2. Restrict source IPs – firewall or API gateway (Kong, AWS API Gateway).
  3. Rate‑limit – e.g., 10 req/s per IP.
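
Rate limiting is normally enforced at the firewall or API gateway, but the idea behind a per‑IP limit can be sketched in a few lines. This is a minimal token‑bucket illustration, not a gateway configuration: the `RateLimiter` class is hypothetical, and the limit of 10 requests per second mirrors the example figure above.

```javascript
// Token bucket per client IP: each IP may burst up to `capacity` requests
// and regains `refillPerSec` tokens for every second that passes.
class RateLimiter {
  constructor(capacity = 10, refillPerSec = 10) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.buckets = new Map(); // ip -> { tokens, last }
  }

  // Returns true if the request is allowed; `nowMs` is injectable for testing.
  allow(ip, nowMs = Date.now()) {
    const bucket = this.buckets.get(ip) || { tokens: this.capacity, last: nowMs };
    const elapsedSec = (nowMs - bucket.last) / 1000;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsedSec * this.refillPerSec);
    bucket.last = nowMs;
    const allowed = bucket.tokens >= 1;
    if (allowed) bucket.tokens -= 1;
    this.buckets.set(ip, bucket);
    return allowed;
  }
}

// Usage: the 11th request from the same IP within the same instant is rejected.
const limiter = new RateLimiter();
const decisions = Array.from({ length: 11 }, () => limiter.allow('203.0.113.7', 0));
console.log(decisions.filter(Boolean).length); // 10
```

An in‑process limiter like this does not protect the host from the traffic itself, which is why the gateway remains the preferred place for it.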

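HMAC verification – Code node (Function node in older n8n versions), placed directly after the Webhook trigger

// Code node, "Run Once for All Items"; self-hosted n8n needs NODE_FUNCTION_ALLOW_BUILTIN=crypto
const crypto = require('crypto');
const secret = $env.WEBHOOK_HMAC_SECRET;
const { headers, body } = $input.first().json; // Webhook node output
// Caveat: re-serialized JSON can differ from the raw bytes the sender signed
const payload = JSON.stringify(body);
const expected = crypto.createHmac('sha256', secret).update(payload).digest('hex');
const received = String(headers['x-n8n-signature'] || '');

// Constant-time comparison; timingSafeEqual requires equal lengths
const a = Buffer.from(received);
const b = Buffer.from(expected);
if (a.length !== b.length || !crypto.timingSafeEqual(a, b)) {
  throw new Error('Invalid webhook signature');
}
return $input.all();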

A simple HMAC check stops most accidental noise. EEFA note: Rotate WEBHOOK_HMAC_SECRET quarterly and keep it out of logs (N8N_LOG_LEVEL=error in production).
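
The check only works if the sender signs exactly the bytes it transmits. A minimal sketch of both sides in plain Node.js; `signBody` and `verifySignature` are hypothetical helpers, the secret is a placeholder, and the `x-n8n-signature` header name follows the example above.

```javascript
const crypto = require('crypto');

// Sender side: sign the exact string that will be sent as the request body.
function signBody(rawBody, secret) {
  return crypto.createHmac('sha256', secret).update(rawBody).digest('hex');
}

// Receiver side: recompute the signature and compare in constant time.
function verifySignature(rawBody, secret, receivedHex) {
  const expected = Buffer.from(signBody(rawBody, secret));
  const received = Buffer.from(String(receivedHex));
  return expected.length === received.length && crypto.timingSafeEqual(expected, received);
}

const body = JSON.stringify({ orderId: 42 });
const signature = signBody(body, 'example-secret'); // sent as the x-n8n-signature header
console.log(verifySignature(body, 'example-secret', signature)); // true
```

Signing the serialized string once and sending that same string avoids mismatches caused by key order or whitespace differences between JSON serializers.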


5. Mistake #5 – No Environment Separation

Mixing dev and prod configurations is a common source of accidental data loss.

Why mixing dev and prod is dangerous

  • A workflow under test can unintentionally hit production systems and data.
  • Schema changes in dev may break live workflows.

Recommended stack layout

| Component | Dev | Prod |
| --- | --- | --- |
| DB | n8n_dev (Postgres) | n8n_prod (Postgres) |
| Redis | redis_dev | redis_prod |
| Secrets | .env.dev (local) | Vault / AWS Secrets Manager |
| Compose file | docker-compose.dev.yml | docker-compose.prod.yml (with restart: unless-stopped) |

Separate .env files

# .env.dev
POSTGRES_PASSWORD=devSecret123
N8N_ENCRYPTION_KEY=devEncKey
WEBHOOK_HMAC_SECRET=devWebhookKey

# .env.prod (never checked into VCS)
POSTGRES_PASSWORD=••••••••••••
N8N_ENCRYPTION_KEY=••••••••••••
WEBHOOK_HMAC_SECRET=••••••••••••

Run the production stack

docker compose -f docker-compose.prod.yml --env-file .env.prod up -d

Keeping separate compose files pays off quickly. EEFA note: Add a Git hook that blocks committing any .env.* containing production secrets.
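
The filtering logic of such a hook can be sketched as a small Node.js function. This is an illustration under stated assumptions: `findBlockedEnvFiles` is a hypothetical helper, it blocks by file name only (it does not inspect contents), and it assumes `.env.example` holds placeholders and may be committed.

```javascript
// Returns the staged paths that look like environment files, e.g. ".env.prod".
// ".env.example" is allowed on the assumption that it contains placeholders only.
function findBlockedEnvFiles(stagedPaths) {
  return stagedPaths.filter((path) => {
    const name = path.split('/').pop();
    return /^\.env(\..+)?$/.test(name) && name !== '.env.example';
  });
}

// Usage with a hard-coded list; a real hook would pass the staged file names.
const blocked = findBlockedEnvFiles(['src/index.js', '.env.prod', 'deploy/.env.dev', '.env.example']);
console.log(blocked); // [ '.env.prod', 'deploy/.env.dev' ]
```

In a pre‑commit hook, the input list would typically come from `git diff --cached --name-only`, and a non‑empty result would abort the commit with a non‑zero exit code.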


6. Checklist: Audit Your n8n Architecture

*Before going live, run through this quick audit.*

| Item | How to Verify |
| --- | --- |
| Database – PostgreSQL/MySQL in use | SELECT version(); inside the DB container |
| Queue – Execution mode = queue | docker exec n8n printenv EXECUTIONS_MODE |
| Credential encryption – N8N_ENCRYPTION_KEY set | docker exec n8n sh -c 'test -n "$N8N_ENCRYPTION_KEY" && echo set' (avoids printing the key) |
| Webhook auth – HMAC or Basic Auth present | Inspect the first node after each public webhook trigger |
| Env isolation – Separate DB/Redis for dev & prod | Diff the docker-compose.*.yml files |
| Backup – Daily pg_dump with retention | Check cron job or managed backup service |
| Monitoring – Logs shipped to ELK/Datadog | Verify N8N_LOG_LEVEL and log forwarder config |
| Scaling – ≥ 2 worker replicas in prod | docker service ls or kubectl get pods |

Skipping any of these items leaves the system exposed.
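
The configuration‑only items of the audit can be automated. The sketch below is a hypothetical helper (`auditN8nEnv`), not an n8n tool: it checks only for PostgreSQL, covers only what is visible in environment variables, and assumes WEBHOOK_HMAC_SECRET is the variable name used for the webhook secret as in the examples above.

```javascript
// Checks environment variables against the checklist items that can be
// verified from configuration alone. Returns a list of readable problems.
function auditN8nEnv(env) {
  const problems = [];
  if (env.DB_TYPE !== 'postgresdb') problems.push('DB_TYPE is not postgresdb');
  if (env.EXECUTIONS_MODE !== 'queue') problems.push('EXECUTIONS_MODE is not queue');
  if (!env.N8N_ENCRYPTION_KEY) problems.push('N8N_ENCRYPTION_KEY is not set');
  if (!env.WEBHOOK_HMAC_SECRET) problems.push('WEBHOOK_HMAC_SECRET is not set');
  return problems;
}

// Usage: run inside the n8n container or against a parsed .env file.
const problems = auditN8nEnv(process.env);
console.log(problems.length === 0 ? 'config checks passed' : problems.join('\n'));
```

Backups, monitoring, and worker replica counts still need the manual checks from the table, since they are not visible in the n8n environment.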


7. Real‑World Example: Fixing a “Stuck Webhook” Incident

Scenario – A SaaS company missed inbound orders for 30 minutes after a spike of ~5 k requests/min.

Root cause – n8n ran in single‑process mode with SQLite; the DB lock queue filled until the process timed out.

Resolution

  1. Swapped SQLite for PostgreSQL (Section 1).
  2. Enabled Redis queue and added three worker containers (Section 2).
  3. Added HMAC verification to the order webhook (Section 4).
  4. Deployed using the production .env and isolated stack (Section 5).

Result – Order webhook latency dropped from ~3 s to < 200 ms, and no executions were lost during a subsequent load test of 10 k req/min.


8. Frequently Asked Questions

| Question | Short Answer |
| --- | --- |
| Can SQLite be kept for a small team? | Only for development or proof‑of‑concept. Production requires a concurrent RDBMS. |
| Are both Redis and a DB needed? | Yes. Redis handles execution queuing; the DB stores workflow definitions and results. |
| What’s the minimum n8n version for queue mode? | v0.215.0 introduced stable queue support – upgrade to the latest stable release. |
| Is Vault mandatory for credential security? | No, but it provides audit trails and automated rotation, which align with EEFA best practices. |
| How is queue backlog monitored? | Export Redis metrics to Prometheus and alert when the waiting‑jobs list grows beyond 100 entries. |

Conclusion

The most common architectural missteps (using SQLite, skipping queue mode, mishandling credentials, exposing unsecured webhooks, and mixing environments) are resolved with a handful of concrete configuration changes. Applying the checklist and snippets moves n8n from a fragile prototype to a production‑grade automation engine.
