
Step-by-Step Guide to Scaling Redis for High n8n Load
Who this is for: Engineers running self‑hosted n8n in production who need Redis to handle thousands of operations per second without latency spikes or data loss. For a complete overview of Redis usage, errors, performance tuning, and scaling in n8n, check out our detailed guide on Redis for n8n Workflows.
Quick Diagnosis
- Deploy a Redis Cluster with at least 3 master nodes; the 16384 hash slots are split across them (about 5,461 per master).
- Set `cluster-require-full-coverage no` to allow seamless slot migration.
- Point `REDIS_HOST` in n8n to the cluster DNS (e.g., `redis-cluster.my-domain.com:6379`).
- Add one read replica per master and set `REDIS_READ_REPLICA` so `$cache.get` hits the replica.
- Export `INFO` and `LATENCY` metrics, and auto-scale when CPU > 80 % by adding a node.
Result: eliminates single‑point‑of‑failure latency spikes and sustains > 10 k ops/s for n8n workloads.
1. Why n8n Puts Unique Pressure on Redis
| n8n Pattern | Redis Interaction | Typical Load Impact |
|---|---|---|
| Workflow state caching (`$cache.set`) | Frequent `SET`/`GET` of small JSON blobs | High QPS, low latency required |
| Trigger queues (webhook, cron, poll) | `LPUSH`/`BRPOP` on list keys | Bursty writes, blocking reads |
| Execution locks (`SETNX`) | Short-lived keys with TTL | Many lock-acquire/release cycles |
| Large payloads (file metadata) | `HMSET`/`HGETALL` on hash maps | Increased memory & network I/O |
EEFA Note – A typical n8n host runs dozens of workers; each worker can fire 5‑10 concurrent Redis calls, so effective QPS can be 10‑20× the visible workflow count.
2. Choosing the Right Scaling Model
2.1 Redis Cluster (native sharding)
| When to Use | Pros | Cons |
|---|---|---|
| > 5 k ops/s, data > 8 GB, need horizontal scaling | Automatic sharding, fault‑tolerant, linear scaling | Requires slot management, client must be cluster‑aware |
n8n‑specific – The `ioredis` driver used by n8n supports cluster mode out of the box. Before moving on, make sure Redis health monitoring for n8n is in place; finish that setup first, then continue reading.
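Cluster mode changes key routing: every key maps to one of 16384 hash slots via CRC16, and multi-key commands only succeed when all keys share a slot. The hash-tag rule can be reproduced in a few lines of plain Node (a sketch mirroring the logic of `CLUSTER KEYSLOT`; the function names are ours):

```javascript
// CRC-16/XMODEM, the variant Redis Cluster uses (polynomial 0x1021).
function crc16(buf) {
  let crc = 0;
  for (const byte of buf) {
    crc ^= byte << 8;
    for (let bit = 0; bit < 8; bit++) {
      crc = crc & 0x8000 ? ((crc << 1) ^ 0x1021) & 0xffff : (crc << 1) & 0xffff;
    }
  }
  return crc;
}

// Mirror of CLUSTER KEYSLOT: if the key contains a non-empty {tag},
// only the tag is hashed, so related keys can be pinned to one slot.
function keySlot(key) {
  const open = key.indexOf('{');
  if (open !== -1) {
    const close = key.indexOf('}', open + 1);
    if (close > open + 1) key = key.slice(open + 1, close);
  }
  return crc16(Buffer.from(key)) % 16384;
}
```

Because `{workflow:42}:state` and `{workflow:42}:lock` hash to the same slot, a transaction or `MSET` touching both still works in cluster mode; keys without a shared tag generally do not.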
2.2 Manual Sharding (multiple independent instances)
| When to Use | Pros | Cons |
|---|---|---|
| Legacy setups, need granular control over key placement | Simple to understand, can mix instance types | No cross‑node atomic ops, manual key‑routing logic required |
n8n‑specific – Add a thin Node.js router (redis-shard-router) to resolve keys to the proper instance.
2.3 Read Replicas (master‑replica)
| When to Use | Pros | Cons |
|---|---|---|
| Workloads are read-heavy (many `$cache.get`) | Near-zero read latency, offloads the master | Writes still bottleneck the master; replication lag possible |
n8n‑specific – Set REDIS_READ_REPLICA to route cache reads only.
Recommendation – For most production n8n deployments, Redis Cluster + read replicas delivers the best mix of scalability and simplicity.
3. Deploying a Production‑Ready Redis Cluster for n8n
3.1 Infrastructure Blueprint
┌─────────────────────┐ ┌─────────────────────┐
│ Redis Master #1 │ │ Redis Master #2 │
│ (3 replicas) │ ↔ │ (3 replicas) │
│ 6379 (cluster) │ │ 6379 (cluster) │
└───────┬─────────────┘ └───────┬─────────────┘
│ │
▼ ▼
┌─────────────┐ ┌─────────────┐
│ Load‑Balancer│ (DNS) │ Load‑Balancer│ (DNS)
└───────┬───────┘ └───────┬───────┘
│ │
▼ ▼
n8n workers (any size) ←→ n8n workers
* Minimum **3 master nodes** (odd number for quorum).
* **3 replicas per master** (12 nodes in total) to survive two simultaneous failures per shard.
* Use a **TCP load balancer** or DNS round‑robin that resolves redis‑cluster.my‑domain.com to all master IPs.
3.2 Kubernetes Deployment (Helm) – Part 1
cluster:
  enabled: true
  nodes: 3               # masters
  replicasPerMaster: 3   # read replicas
  shardCount: 16384      # default slots
3.3 Kubernetes Deployment – Part 2
resources:
limits:
cpu: "2000m"
memory: "4Gi"
requests:
cpu: "1000m"
memory: "2Gi"
persistence:
enabled: true
size: 50Gi
service:
type: ClusterIP
port: 6379
3.4 Kubernetes Deployment – Part 3
extraFlags:
  - --cluster-require-full-coverage no
  - --appendonly yes
  - --maxmemory-policy allkeys-lru
Deploy with the Bitnami chart:
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install n8n-redis bitnami/redis-cluster -f values.yaml
EEFA Note – `cluster-require-full-coverage no` keeps the cluster serving requests for covered slots even while some slots are being migrated, so you can add or remove nodes without taking the service offline during scaling events.
3.5 n8n Environment Variables
REDIS_HOST=redis-cluster.my-domain.com
REDIS_PORT=6379
REDIS_READ_REPLICA=redis-replica.my-domain.com  # optional, for read-only traffic
If you used the Bitnami chart, redis-cluster.my-domain.com can be a **headless service** that returns all master pod IPs.
4. Manual Sharding (When Cluster Isn’t an Option)
4.1 Router – Part 1 (Hash to Shard)
const crypto = require('crypto');
const shards = {
'0': process.env.REDIS_01,
'1': process.env.REDIS_02,
};
4.2 Router – Part 2 (Slot Selection)
function getShard(key) {
  // md5 spreads keys uniformly; the first byte (two hex chars) is enough
  const hash = crypto.createHash('md5').update(key).digest('hex');
  const slot = parseInt(hash.slice(0, 2), 16) % Object.keys(shards).length;
  return shards[slot]; // numeric index coerces to the string keys above
}
module.exports = { getShard };
4.3 Using the Router in n8n
const { getShard } = require('./redisShardRouter');
const Redis = require('ioredis');
const clients = new Map(); // reuse one connection per shard instead of opening one per call

async function setCache(key, value) {
  const shard = getShard(key);
  if (!clients.has(shard)) clients.set(shard, new Redis(shard));
  await clients.get(shard).set(key, JSON.stringify(value));
}
EEFA Warning – Manual sharding loses atomic multi-key operations (e.g., `MSET` across shards). Use it only when related keys are guaranteed to stay within a single shard. If Redis becomes unreachable mid-execution, apply fallback strategies for when Redis is down in n8n, then continue the setup.
5. Adding Read Replicas for Cache‑Heavy Workflows
5.1 Replica Deployment (Helm)
replica:
replicaCount: 3
resources:
limits:
cpu: "1000m"
memory: "2Gi"
The chart creates redis-cluster-replicas services automatically.
5.2 n8n Read‑Replica Switch (Code Snippet 1) – Master Client
const Redis = require('ioredis');
const master = new Redis({
host: process.env.REDIS_HOST,
port: process.env.REDIS_PORT,
});
5.3 Read‑Replica Switch (Code Snippet 2) – Proxy GET Calls
let cacheClient = master; // default: everything goes to the master
if (process.env.REDIS_READ_REPLICA) {
  const replica = new Redis({
    host: process.env.REDIS_READ_REPLICA,
    port: process.env.REDIS_PORT,
    readOnly: true,
  });
  // Proxy GET-only calls to the replica. Note this mutates the shared
  // master client, so every importer of cacheClient reads from the replica.
  cacheClient.get = replica.get.bind(replica);
}
module.exports = { cacheClient };
All $cache.get calls now hit the replica, while $cache.set stays on the master.
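Rebinding `cacheClient.get` works, but it mutates the shared master client for every consumer. A less invasive alternative is a `Proxy` that routes a whitelist of read commands to the replica. The sketch below uses stub objects in place of real `ioredis` clients, and the command list is an assumption you should tailor to your workload:

```javascript
// Commands considered safe to serve from a replica (illustrative list).
const READ_COMMANDS = new Set(['get', 'mget', 'hget', 'hgetall', 'exists', 'ttl']);

// Route read commands to the replica, everything else to the master,
// without mutating either client object.
function makeCacheClient(master, replica) {
  if (!replica) return master;
  return new Proxy(master, {
    get(target, prop) {
      const source = READ_COMMANDS.has(prop) ? replica : target;
      const value = source[prop];
      return typeof value === 'function' ? value.bind(source) : value;
    },
  });
}

// Stub clients standing in for ioredis instances (illustration only).
const master = { get: async () => 'from-master', set: async () => 'OK-master' };
const replica = { get: async () => 'from-replica' };
const cache = makeCacheClient(master, replica);
```

With real clients, `cache.get(...)` hits the replica while `cache.set(...)` still goes to the master, and neither underlying client is modified.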
6. Monitoring, Alerting & Auto‑Scaling
| Metric | Threshold | Action |
|---|---|---|
| CPU usage | > 80 % (5 min avg) | Add a new master node (scale‑out) |
| Replication lag | > 200 ms | Investigate network, increase replica count |
| Slot migration time | > 30 s | Pause new deployments, verify slot balance |
| Evicted keys | > 0 | Increase maxmemory or adjust maxmemory-policy |
6.1 Prometheus Scrape Configuration
scrape_configs:
- job_name: 'redis'
static_configs:
- targets:
- 'redis-cluster-0:9121'
- 'redis-cluster-1:9121'
- 'redis-cluster-2:9121'
6.2 Grafana Alert Rule (CPU)
alert: RedisHighCPU
expr: avg(rate(redis_cpu_user_seconds_total[1m])) by (instance) > 0.8
for: 5m
labels:
severity: critical
annotations:
summary: "Redis node {{ $labels.instance }} CPU > 80%"
description: "High CPU may cause latency for n8n workflows."
6.3 Kubernetes HPA for the Cluster
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: redis-cluster-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: StatefulSet
name: n8n-redis
minReplicas: 3
maxReplicas: 9
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 75
7. Troubleshooting Common Scaling Pitfalls
| Symptom | Likely Cause | Fix |
|---|---|---|
| `MOVED` errors on `GET` | Client not cluster-aware | Use `ioredis.Cluster` or enable cluster mode in n8n's Redis driver |
| High write latency | Master saturated, no replicas | Add another master and rebalance slots (`redis-cli --cluster rebalance`) |
| Replica lag > 1 s | Network jitter or heavy write load | Set `replica-serve-stale-data yes` temporarily, then add more replicas |
| Slot migration stalls | Ports 16379–16398 blocked | Open intra-cluster ports, keep `cluster-require-full-coverage no` |
| `OOM command not allowed` | `maxmemory` reached, wrong eviction policy | Raise `maxmemory`, switch to `allkeys-lru` or `volatile-lru` |
EEFA Tip – After any topology change, run `redis-cli --cluster check <any-node>:6379`. The command reports slot distribution, unreachable nodes, and any `MOVED`/`ASK` inconsistencies.
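A cluster-aware client such as `ioredis.Cluster` follows `MOVED`/`ASK` redirects on its own, but brief transient failures can still surface during migrations and failovers. One hedge is a small exponential-backoff wrapper around Redis calls (a sketch; `withRetry` and its option names are ours, not an n8n or ioredis API):

```javascript
// Retry an async Redis call with exponential backoff. Useful around
// $cache-style helpers during slot migrations or failovers.
async function withRetry(fn, { retries = 3, baseDelayMs = 50 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries) throw err; // give up after the last retry
      const delay = baseDelayMs * 2 ** attempt; // 50 ms, 100 ms, 200 ms, ...
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

Usage: `await withRetry(() => client.get('workflow:state'), { retries: 5 })`. Keep the retry budget short so a genuinely down Redis fails fast and your n8n fallback strategy can take over.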
8. Best‑Practice Checklist for n8n‑Scale‑Ready Redis
- Deploy ≥ 3 master nodes with ≥ 3 replicas each.
- Enable cluster mode (`redis-cli --cluster create …`).
- Set `maxmemory-policy` to `allkeys-lru` (or `volatile-lru` if you rely on TTLs).
- Configure n8n env vars: `REDIS_HOST`, `REDIS_PORT`, `REDIS_READ_REPLICA`.
- Open intra-cluster ports 6379 and 16379–16398.
- Install the Prometheus exporter (`bitnami/redis-exporter`) and add alerts for CPU, latency, and replication lag.
- Verify failover: `redis-cli -c -h <master> shutdown nosave` → ensure a replica is promoted.
- Run a load test (e.g., `hey -c 200 -n 50000 http://n8n.my-domain.com/webhook/…`) and confirm 99th-percentile latency < 50 ms.
Next Steps
- Deploying Redis Sentinel for HA when clustering isn’t possible.
- Using Redis Streams as an n8n queue alternative to list‑based triggers.
- Securing Redis with TLS and ACLs for multi‑tenant n8n installations.
All recommendations are production‑grade, tested on Kubernetes 1.28+ with n8n 0.236. Adjust node sizes and replica counts to match your specific workload.



