Monitoring Redis Health for n8n: Complete Guide

*(Figure: n8n cache architecture and data flow diagram)*

Who this is for: Site reliability engineers, DevOps specialists, or n8n operators who need production‑grade visibility and alerts on the Redis instance that backs n8n workflows. For a complete overview of Redis usage, errors, performance tuning, and scaling in n8n, check out our detailed guide on Redis for n8n Workflows.


Quick Diagnosis

  1. Deploy redis_exporter (Docker or Helm).
  2. Add a redis-n8n scrape job to prometheus.yml.
  3. Import the n8n Redis Health Grafana dashboard (JSON below).
  4. Create Alertmanager rules for redis_up, memory pressure, and client spikes.

Result: real‑time graphs and alerts for memory usage, client connections, and instance availability—critical for keeping n8n operational.


1. Why n8n Needs Dedicated Redis Monitoring

n8n relies on Redis for workflow state, credential caches, and queue data. A single breached threshold can cascade into workflow failures, credential-cache loss, or dead‑lettered jobs. If you are planning to scale Redis for high n8n load, complete that scaling work first, then continue with the monitoring setup below.

| n8n Concern | Key Redis Metric |
| --- | --- |
| Workflow continuity | `connected_clients` |
| Credential cache reliability | `used_memory` / `maxmemory` |
| Queue processing speed | `instantaneous_ops_per_sec` |
| Instance availability | `redis_up` |

Impact: If any of these metrics cross their thresholds, n8n jobs stall or data is lost.
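The thresholds behind this table can be spot‑checked from a shell before any dashboards exist. A minimal sketch follows — the heredoc is a canned sample standing in for live `redis-cli INFO` output, and the 85 % / 800 cutoffs mirror the alert thresholds used later in section 6:

```shell
#!/usr/bin/env bash
# Canned sample of `redis-cli INFO` fields; in production, replace the
# heredoc with real output from the Redis host n8n uses.
info=$(cat <<'EOF'
connected_clients:42
used_memory:104857600
maxmemory:134217728
instantaneous_ops_per_sec:150
EOF
)

# Extract a single INFO field by key.
get() { echo "$info" | awk -F: -v k="$1" '$1==k {print $2}'; }

used=$(get used_memory)
max=$(get maxmemory)
clients=$(get connected_clients)

# Integer percentage of maxmemory in use.
ratio=$(( used * 100 / max ))
echo "memory: ${ratio}% of maxmemory"
if [ "$ratio" -gt 85 ]; then echo "WARN: memory pressure"; fi
if [ "$clients" -gt 800 ]; then echo "WARN: client spike"; fi
```

With the sample values (100 MiB used of a 128 MiB limit) this prints `memory: 78% of maxmemory` and raises no warnings.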


2. Prerequisites

| Requirement | Detail |
| --- | --- |
| n8n version | ≥ 0.221 (external Redis) |
| Redis version | 5 – 7 (supported by exporter) |
| Prometheus | ≥ 2.30 |
| Grafana | ≥ 9.0 |
| Network | Prometheus → exporter :9121 and exporter → Redis :6379 reachable |
| Optional | Alertmanager for automated alerts |

3. Deploy the Redis Exporter

3.1 Docker – quick start

docker run -d \
  --name redis_exporter \
  -p 9121:9121 \
  -e REDIS_ADDR=redis://<REDIS_HOST>:6379 \
  oliver006/redis_exporter:latest

*Replace <REDIS_HOST> with the host n8n uses.*

3.2 Kubernetes – Helm chart

helm repo add prometheus-community \
  https://prometheus-community.github.io/helm-charts
helm repo update
helm install redis-exporter prometheus-community/prometheus-redis-exporter \
  --set redis.address=redis://<REDIS_HOST>:6379 \
  --set service.port=9121

EEFA Note – In production, set REDIS_PASSWORD (or --redis.password) so the exporter can authenticate against a password‑protected Redis instance; without it, the exporter reports redis_up as 0.
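Whichever deployment path you choose, confirm that the exporter actually reaches Redis before wiring up Prometheus. A minimal sketch — the heredoc stands in for the live response of `curl -s http://<EXPORTER_HOST>:9121/metrics`:

```shell
#!/usr/bin/env bash
# Sample of the exporter's Prometheus exposition output; swap the heredoc
# for a real `curl -s http://<EXPORTER_HOST>:9121/metrics` call.
metrics=$(cat <<'EOF'
# HELP redis_up Information about the Redis instance
# TYPE redis_up gauge
redis_up 1
redis_connected_clients 17
EOF
)

# redis_up is 1 when the exporter can reach and authenticate to Redis.
up=$(echo "$metrics" | awk '$1 == "redis_up" {print $2}')
if [ "$up" = "1" ]; then
  echo "exporter reaches Redis: OK"
else
  echo "exporter cannot reach Redis (redis_up=$up)" >&2
fi
```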


4. Configure Prometheus to Scrape the Exporter

Add a new job to prometheus.yml:

scrape_configs:
  - job_name: 'redis-n8n'
    static_configs:
      - targets: ['<EXPORTER_HOST>:9121']

Add a relabel rule that tags metrics with the n8n instance name:

    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
        replacement: n8n-redis

Reload Prometheus (requires the --web.enable-lifecycle flag): curl -X POST http://localhost:9090/-/reload
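For reference, the target and relabel snippets above belong to the same job entry. A combined sketch — `<EXPORTER_HOST>` remains a placeholder, and the 30 s interval is a suggestion that matches the flap‑avoidance advice in section 8:

```yaml
scrape_configs:
  - job_name: 'redis-n8n'
    scrape_interval: 30s          # keeps pace with short alert `for:` windows
    static_configs:
      - targets: ['<EXPORTER_HOST>:9121']
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
        replacement: n8n-redis    # tag metrics with the n8n instance name
```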


5. n8n‑Specific Grafana Dashboard

Import the JSON below (Grafana > Dashboard > Import). It visualizes the four health pillars: availability, memory, client load, and throughput.

{
  "dashboard": {
    "title": "n8n Redis Health",
    "panels": [
      {
        "type": "stat",
        "title": "Redis Up",
        "targets": [{ "expr": "redis_up{instance=\"n8n-redis\"}" }],
        "colorMode": "value",
        "thresholds": "0,1"
      },
      {
        "type": "graph",
        "title": "Memory Usage vs Maxmemory",
        "targets": [
          { "expr": "redis_memory_used_bytes{instance=\"n8n-redis\"}", "legendFormat": "Used" },
          { "expr": "redis_memory_max_bytes{instance=\"n8n-redis\"}", "legendFormat": "Max" }
        ],
        "yaxes": [{ "format": "bytes" }, {}]
      },
      {
        "type": "graph",
        "title": "Connected Clients",
        "targets": [{ "expr": "redis_connected_clients{instance=\"n8n-redis\"}" }],
        "yaxes": [{ "format": "short" }, {}]
      },
      {
        "type": "graph",
        "title": "Ops per Second (Instantaneous)",
        "targets": [{ "expr": "redis_instantaneous_ops_per_sec{instance=\"n8n-redis\"}" }],
        "yaxes": [{ "format": "ops" }, {}]
      }
    ],
    "templating": {
      "list": [
        {
          "type": "query",
          "name": "instance",
          "datasource": "Prometheus",
          "query": "label_values(redis_up, instance)",
          "refresh": 1,
          "includeAll": false
        }
      ]
    }
  }
}

Dashboard Customization Checklist

  • Set Refresh to 30 s (captures n8n queue spikes).
  • Add Annotations for n8n deployment rollouts.
  • Enable Panel Links to the n8n workflow editor for quick navigation.
  • When an outage forces a fallback, see: fallback strategies when Redis is down in n8n

6. Alerting – Detecting Critical Redis Conditions

Create redis_n8n_alerts.yml and reference it from your Prometheus alerting_rules.yml.

6.1 Instance‑down alert

- alert: RedisDown
  expr: redis_up{instance="n8n-redis"} == 0
  for: 2m
  labels:
    severity: critical
  annotations:
    summary: "Redis instance for n8n is unreachable"
    description: "No metrics received from {{ $labels.instance }} for >2 minutes."

6.2 Memory‑pressure alert

- alert: RedisMemoryPressure
  expr: (redis_memory_used_bytes{instance="n8n-redis"} /
         redis_memory_max_bytes{instance="n8n-redis"}) > 0.85
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Redis memory usage high on n8n"
    description: "Memory usage is {{ $value | humanizePercentage }} of maxmemory."

6.3 Client‑spike alert

- alert: RedisClientSpikes
  expr: redis_connected_clients{instance="n8n-redis"} > 800
  for: 3m
  labels:
    severity: warning
  annotations:
    summary: "High number of Redis clients for n8n"
    description: "{{ $value }} clients connected (threshold 800)."
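Note that Prometheus only loads rule files wrapped in a `groups:` block, so the assembled redis_n8n_alerts.yml looks roughly like this (the two warning alerts are elided for brevity):

```yaml
groups:
  - name: redis-n8n
    rules:
      - alert: RedisDown
        expr: redis_up{instance="n8n-redis"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Redis instance for n8n is unreachable"
      # RedisMemoryPressure and RedisClientSpikes follow the same shape
```

You can validate the file with `promtool check rules redis_n8n_alerts.yml` before reloading Prometheus.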

EEFA Note – In a Kubernetes StatefulSet with maxmemory-policy allkeys-lru, a memory‑pressure alert should trigger a **scale‑out** or **cache‑purge** job rather than a simple restart.


7. Pulling n8n‑Specific Insights from INFO

Some n8n‑relevant fields are only exposed via INFO. Run the helper script on the Redis host (or via redis-cli from n8n) to surface them:

#!/usr/bin/env bash
# n8n‑Redis health helper – prints only n8n‑relevant sections
redis-cli INFO memory | grep -E 'used_memory|maxmemory|used_memory_peak'
redis-cli INFO clients | grep connected_clients
redis-cli INFO stats   | grep instantaneous_ops_per_sec

Interpretation Table

| INFO line | n8n relevance | Action if abnormal |
| --- | --- | --- |
| `used_memory` | Current memory footprint of workflow state | Investigate large payloads or raise `maxmemory`. |
| `maxmemory` | Configured limit (if set) | Increase the limit or enable LRU eviction. |
| `connected_clients` | Number of n8n workers + API callers | Scale n8n workers or adjust client pooling. |
| `instantaneous_ops_per_sec` | Throughput of queue pushes/pops | Check for bottlenecks in heavy‑load workflows. |

8. Troubleshooting Common Monitoring Pitfalls

| Symptom | Likely Cause | Fix |
| --- | --- | --- |
| No metrics appear in Grafana | Exporter not reachable (firewall, wrong `REDIS_ADDR`) | Verify: `curl http://<EXPORTER_HOST>:9121/metrics` |
| `redis_up` always 0 | Exporter started without proper auth | Set the `REDIS_PASSWORD` env var; restart the exporter |
| Alert flaps repeatedly | `for:` clause too short relative to scrape interval | Increase `scrape_interval` to 30s and `for:` to 5m |
| Memory chart flat at 0 | `maxmemory` not set, so the exporter reports 0 | Define `maxmemory` in redis.conf or rely on `redis_memory_used_bytes` alone |
| High `connected_clients` but low latency | Stale connections from old n8n pods | Set a client `timeout` in redis.conf or run `CLIENT KILL TYPE normal` post‑deploy |
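For the stale‑connection row above, `CLIENT LIST` shows per‑connection idle times. A hedged sketch that flags clients idle longer than 300 s — the heredoc mimics `redis-cli CLIENT LIST` output, and the 300 s cutoff is an arbitrary example:

```shell
#!/usr/bin/env bash
# Canned sample of `redis-cli CLIENT LIST`; replace with live output.
clients=$(cat <<'EOF'
id=3 addr=10.0.0.5:51234 name= idle=2 flags=N
id=4 addr=10.0.0.9:40112 name= idle=1800 flags=N
EOF
)

# Each CLIENT LIST line is space-separated key=value pairs; pull out idle=
# and report any connection idle past the cutoff.
stale=$(echo "$clients" | awk '
{
  for (i = 1; i <= NF; i++) if ($i ~ /^idle=/) { split($i, a, "="); idle = a[2] }
  if (idle + 0 > 300) print "stale client: " $2 " (idle " idle "s)"
}')
echo "$stale"
```

With the sample data, only the second connection is reported as stale.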

9. Production‑Grade Enhancements

  1. TLS‑Secured Exporter – Place the exporter behind an NGINX sidecar with mTLS; use --tls-client-ca and --tls-client-cert.
  2. Multi‑Tenant Labels – For multiple n8n instances, add instance="n8n-prod" / instance="n8n-staging" to both exporter env and Prometheus job.
  3. Auto‑Scaling Hook – Wire an Alertmanager webhook to a Kubernetes HPA that adds n8n workers when connected_clients > 800.
  4. Long‑Term Retention – Configure Prometheus remote‑write to Thanos or Cortex to keep Redis health history for compliance audits.
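Item 4 above can be sketched as a `remote_write` fragment in prometheus.yml. The endpoint below assumes a Thanos Receive target (Cortex uses a different push path), and the relabel rule keeps only Redis metrics to limit egress:

```yaml
remote_write:
  - url: http://<THANOS_RECEIVE_HOST>:19291/api/v1/receive
    write_relabel_configs:
      - source_labels: [__name__]
        regex: 'redis_.*'
        action: keep      # forward only the exporter's redis_* series
```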

Conclusion

By instrumenting Redis with redis_exporter, scraping it via Prometheus, visualizing the core n8n metrics in a purpose‑built Grafana dashboard, and wiring sensible Alertmanager rules, you gain immediate visibility into the health factors that directly affect n8n workflow execution. The approach is lightweight, production‑ready, and scales from a single‑node deployment to multi‑tenant, HA Redis clusters—ensuring n8n remains reliable under real‑world load.
