
Step‑by‑Step Guide: Fixing n8n Workflow Failures Caused by Redis
Who this is for: n8n developers and DevOps engineers who run production‑grade workflows backed by Redis and need a reliable strategy to survive Redis outages. For a complete overview of Redis usage, errors, performance tuning, and scaling in n8n, check out our detailed guide on Redis for n8n Workflows.
Quick Diagnosis
Recover a stopped n8n workflow in three steps:
| Step | Action | n8n node / setting |
|---|---|---|
| Detect | Catch Redis‑related errors with an Error Trigger. | Error Trigger → If (filter error.message contains "Redis"). |
| Retry | Run a Retry Loop with exponential back‑off (max 5 attempts). | Function → await this.helpers.wait(2 ** $index * 1000); → Execute Workflow. |
| Fallback | Persist the payload to a durable store (PostgreSQL, S3) before re‑queueing. | Postgres / S3 node in the error branch. |
Add a Cron‑based watchdog (every 5 min) that re‑runs any “stuck” executions logged in the fallback table.
1. Why Redis‑Related Failures Stop a Workflow
When n8n runs in queue mode, Redis serves as its queue backend (and often doubles as a cache). When the Redis client throws an exception (e.g., connection loss, max‑memory limit, corrupted data), the execution engine aborts the current run and marks it failed. Because the execution state lives in Redis, the engine cannot resume without a healthy connection.
Typical Redis errors that trigger a failure
| Redis error | n8n symptom |
|---|---|
| ECONNREFUSED | Immediate abort, no retries (Error: connect ECONNREFUSED 127.0.0.1:6379). |
| ETIMEDOUT | Step hangs, then fails (Error: Redis connection timed out). |
| OOM command not allowed | Write attempts rejected (Error: OOM command not allowed when used memory > 'maxmemory'). |
| WRONGTYPE | Data‑type mismatch (Error: WRONGTYPE Operation against a key holding the wrong kind of value). |
2. Diagnosing the Root Cause
Before adding recovery logic, decide whether the failure is transient (network glitch) or systemic (mis‑configuration, memory pressure).
Diagnostic checklist
| Tool | Command / UI | What to look for |
|---|---|---|
| redis-cli ping | redis-cli -h <host> -p <port> ping | PONG → reachable; otherwise network error. |
| Redis logs | /var/log/redis/redis-server.log | Repeated OOM or MAXMEMORY warnings. |
| n8n Execution Log | UI → Executions → Failed → Details | Full stack trace, error code. |
| Prometheus / Grafana | redis_up{instance="redis:6379"} == 0 | Alert on downtime. |
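The transient‑vs‑systemic decision can also be made programmatically in a Code node before deciding whether to retry. A minimal sketch — the helper name `classifyRedisError` and the pattern lists are assumptions; extend them for your setup:

```javascript
// Hypothetical helper: classify a Redis error message as transient
// (worth retrying) or systemic (needs operator attention).
function classifyRedisError(message) {
  const systemic = [/OOM command not allowed/, /WRONGTYPE/, /NOAUTH/];
  const transient = [/ECONNREFUSED/, /ETIMEDOUT/, /Connection is closed/];
  if (systemic.some((re) => re.test(message))) return "systemic";
  if (transient.some((re) => re.test(message))) return "transient";
  return "unknown";
}

console.log(classifyRedisError("Error: connect ECONNREFUSED 127.0.0.1:6379")); // transient
```

Only "transient" results should feed the retry loop in § 3.2; "systemic" errors should go straight to fallback persistence and alerting.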
3. Building a Robust Error‑Handling Branch
3.1 Add an Error Trigger
Insert an *Error Trigger* node that fires on any workflow failure.
```json
{
  "name": "Error Trigger",
  "type": "n8n-nodes-base.errorTrigger",
  "typeVersion": 1,
  "position": [250, 300]
}
```
Filter for Redis errors
Place an *If* node after the trigger to keep only Redis‑related messages.
```json
{
  "name": "Redis Error Filter",
  "type": "n8n-nodes-base.if",
  "typeVersion": 1,
  "parameters": {
    "conditions": {
      "string": [
        {
          "value1": "{{$json[\"execution\"][\"error\"][\"message\"]}}",
          "operation": "contains",
          "value2": "Redis"
        }
      ]
    }
  },
  "position": [450, 300]
}
```

Note that the Error Trigger nests the failure details under `execution`, so the message lives at `$json["execution"]["error"]["message"]`.
3.2 Exponential Back‑off Retry Loop
| Attempt | Wait (ms) | Formula |
|---|---|---|
| 1 | 1 000 | 2^0 * 1000 |
| 2 | 2 000 | 2^1 * 1000 |
| 3 | 4 000 | 2^2 * 1000 |
| 4 | 8 000 | 2^3 * 1000 |
| 5 | 16 000 | 2^4 * 1000 |
**Function node – calculate wait time and pass retry metadata**
```javascript
// Wait 2^attempt seconds, then pass retry metadata downstream
await this.helpers.wait(Math.pow(2, $index) * 1000);
return {
  json: {
    retryCount: $index + 1,
    originalPayload: $json,
  },
};
```
Connect the **Function** → **Execute Workflow** (same workflow ID) → **If** (max attempts reached).
**Note** – never set `maxAttempts` to `Infinity`. An unbounded retry loop will saturate Redis the moment it becomes available again.
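Outside n8n, the same policy can be expressed in plain JavaScript — `retryWithBackoff` and `backoffMs` are illustrative names, not n8n helpers, shown here only to make the back‑off schedule concrete:

```javascript
// Delay schedule from the table above: 2^attempt * 1000 ms.
const backoffMs = (attempt) => Math.pow(2, attempt) * 1000;

// Run `task` (any async function) with a bounded retry budget.
// The budget is finite on purpose: never retry forever.
async function retryWithBackoff(task, maxAttempts = 5) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      if (attempt === maxAttempts - 1) throw err; // budget exhausted: re-throw
      await new Promise((resolve) => setTimeout(resolve, backoffMs(attempt)));
    }
  }
}
```

With `maxAttempts = 5`, the worst case waits 1 + 2 + 4 + 8 s between attempts (~15 s total) before giving up and falling through to persistence.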
3.3 Fallback Persistence
When the retry limit is exceeded, store the payload in a durable medium before giving up.
```json
{
  "name": "Postgres Fallback",
  "type": "n8n-nodes-base.postgres",
  "typeVersion": 1,
  "parameters": {
    "operation": "insert",
    "table": "n8n_redis_fallback",
    "columns": [
      { "name": "executionId", "value": "{{$json[\"executionId\"]}}" },
      { "name": "payload", "value": "{{$json[\"originalPayload\"]}}" },
      { "name": "failedAt", "value": "{{$now}}" }
    ]
  },
  "position": [850, 300]
}
```
Why this works – The fallback table lives outside Redis, so even a prolonged outage leaves your data intact and ready for later replay.
4. Automated Recovery of Stuck Executions
4.1 Scheduler (Cron node)
```json
{
  "name": "Recovery Scheduler",
  "type": "n8n-nodes-base.cron",
  "typeVersion": 1,
  "parameters": { "cronExpression": "*/5 * * * *" },
  "position": [250, 200]
}
```
4.2 Fetch pending fallback rows
```json
{
  "name": "Fetch Stuck Executions",
  "type": "n8n-nodes-base.postgres",
  "typeVersion": 1,
  "parameters": {
    "operation": "select",
    "sql": "SELECT * FROM n8n_redis_fallback WHERE processed = false"
  },
  "position": [450, 200]
}
```
4.3 Replay the original workflow
```json
{
  "name": "Replay Workflow",
  "type": "n8n-nodes-base.executeWorkflow",
  "typeVersion": 1,
  "parameters": {
    "workflowId": "",
    "inputData": "={{$json}}"
  },
  "position": [650, 200]
}
```
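Before the row reaches Execute Workflow, the payload stored as a string in § 3.3 has to be deserialised back into an n8n item. A hypothetical mapping — `toReplayInput` and the `replayedFrom` field are illustrative names, not built‑ins:

```javascript
// Hypothetical: turn a fallback row (as selected from Postgres)
// back into the input item the replayed workflow expects.
function toReplayInput(row) {
  return {
    json: {
      executionId: row.executionId,
      ...JSON.parse(row.payload), // restore the original payload fields
      replayedFrom: row.id,       // assumed field for audit correlation
    },
  };
}
```

Carrying the fallback row's `id` through as `replayedFrom` makes it easy to find which stored row produced any given replayed execution.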
4.4 Mark the row as processed
```json
{
  "name": "Mark Processed",
  "type": "n8n-nodes-base.postgres",
  "typeVersion": 1,
  "parameters": {
    "operation": "update",
    "sql": "UPDATE n8n_redis_fallback SET processed = true WHERE id = {{$json[\"id\"]}}"
  },
  "position": [850, 200]
}
```
Warning – Ensure the replayed workflow is idempotent. Guard any side‑effects (e.g., email sending) with a check that a “sent” flag exists in your database before repeating the action.
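The "sent flag" guard can be sketched as follows. Here an in‑memory `Set` stands in for your database flag, and `sendOnce` is an illustrative name — in production the lookup and the flag write should hit the same durable store:

```javascript
// Minimal idempotency guard: a side-effect keyed by a unique id
// runs at most once, no matter how often the workflow is replayed.
const sentKeys = new Set(); // stand-in for a "sent" column in your DB

async function sendOnce(key, sendFn) {
  if (sentKeys.has(key)) return false; // already sent: skip the side-effect
  await sendFn();
  sentKeys.add(key); // mark only after the side-effect succeeded
  return true;
}
```

Marking the key only after `sendFn` succeeds means a crash mid‑send leads to a duplicate attempt rather than a silently dropped one, which is usually the safer failure mode.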
5. Monitoring & Alerting
| Metric | Recommended threshold | Alert action |
|---|---|---|
| redis_up (Prometheus) | 0 for > 30 s | Slack / PagerDuty “Redis down”. |
| n8n_execution_failed_total{error=~".*Redis.*"} | > 5/min | Trigger the Recovery Scheduler immediately. |
| n8n_fallback_queue_length | > 100 | Investigate memory pressure or scale Redis. |
Add a Grafana panel visualising n8n_fallback_queue_length to spot growing backlogs before they become critical.
6. Quick Diagnostic Checklist (Copy‑Paste)
- [ ] Ping Redis (`redis-cli ping`) → does it return PONG?
- [ ] Review Redis logs for OOM or maxmemory warnings.
- [ ] Verify n8n env vars: REDIS_HOST, REDIS_PORT, REDIS_PASSWORD.
- [ ] Test network connectivity (`telnet <host> <port>`).
- [ ] Inspect the n8n Execution Log → does the error message contain “Redis”?
- [ ] Run a minimal workflow that only writes a key to Redis.
- [ ] If transient, enable the retry loop (max 5 attempts).
- [ ] If persistent, enable fallback persistence (Postgres/S3).
- [ ] Deploy the Cron recovery workflow and monitor fallback table size.
7. Production‑Grade Best Practices
- Separate Redis instances – use one for cache, another for the n8n queue, so cache churn cannot starve the job queue.
- Set `maxmemory-policy` to `volatile-lru` on the queue DB so only expiring keys are evicted.
- Enable `client-output-buffer-limit` to stop a single stalled worker from exhausting server memory.
- Run n8n in Docker Swarm / Kubernetes with a readiness probe that runs `redis-cli ping`. A failed probe restarts the pod, avoiding half‑started executions.
- Log every fallback entry with a UUID to correlate it with the original execution in your audit trail.
Next Steps
- Implement the error‑handling branch in a staging environment.
- Simulate Redis downtime (`docker stop redis`) and verify that the retry loop and fallback persist correctly.
- Once stable, promote to production and enable the monitoring alerts described in § 5.
All JSON snippets are ready‑to‑paste into the n8n UI (JSON import) or a Code node. Adjust connection credentials, workflow IDs, and table names to match your environment.



