Fix 5 n8n Monitoring Dashboard Setup Issues Fast

A step‑by‑step guide to setting up a production monitoring dashboard for n8n


Who this is for: DevOps engineers, SREs, and n8n administrators who need production‑grade observability for workflow latency, error rates, and resource usage. We cover this in detail in the n8n Performance & Scaling Guide.


Quick Diagnosis

  • Problem: Without real‑time metrics you can’t troubleshoot n8n performance or plan capacity.
  • Solution: Enable n8n’s built‑in Prometheus metrics endpoint, scrape /metrics with Prometheus, and connect a Grafana data source to visualize the key metrics on a ready‑made dashboard.

1. Prerequisites & Environment Checklist

| Item | Description | Recommended Version |
|------|-------------|---------------------|
| n8n instance | Running (Docker, Kubernetes, or binary) | ≥ 0.230 |
| Prometheus server | Collector for metrics | 2.45+ |
| Grafana UI | Dashboard visualizer | 10.2+ |
| Network access | n8n /metrics reachable from Prometheus | — |
| Alertmanager (optional) | Alerts on SLA breaches | 0.27+ |

Note – In production, isolate the metrics endpoint behind a firewall or protect it with basic auth so internal metrics are never exposed publicly.


2. Enable the n8n Prometheus Exporter

2.1 Docker Compose (most common)

Add the exporter configuration to your docker‑compose.yml:

version: "3.8"
services:
  n8n:
    image: n8nio/n8n:latest
    ports:
      - "5678:5678"
    environment:
      - N8N_METRICS=true

Once N8N_METRICS=true is set, n8n serves Prometheus data at /metrics on its main port (5678); there is no separate metrics port to expose.

Warning – Do not expose the /metrics endpoint to the public internet. Keep it on an internal Docker network or front it with a reverse proxy that enforces authentication.
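One way to enforce that authentication is a reverse proxy in front of the metrics endpoint. A minimal nginx sketch, assuming an htpasswd file already exists and that n8n serves /metrics on port 5678 (the listen port, upstream host, and file path are all assumptions to adjust):

```nginx
# Sketch only: listen port, upstream host/port, and htpasswd path are assumptions.
server {
    listen 9464;
    location /metrics {
        auth_basic           "n8n metrics";
        auth_basic_user_file /etc/nginx/.htpasswd;   # created with: htpasswd -c /etc/nginx/.htpasswd prometheus
        proxy_pass           http://n8n:5678/metrics;
    }
}
```

Prometheus then scrapes the proxy with basic_auth credentials in its scrape job instead of reaching n8n directly.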

2.2 Kubernetes (Helm)

Enable the exporter via Helm values:

helm upgrade --install n8n n8n/n8n \
  --set metrics.enabled=true \
  --set metrics.port=9464 \
  --set service.annotations."prometheus\.io/scrape"="true" \
  --set service.annotations."prometheus\.io/port"="9464"

The chart adds the Prometheus scrape annotations shown above; exact value names vary between community charts, so confirm them against your chart’s values.yaml.
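If your cluster runs the Prometheus Operator rather than annotation‑based discovery, a ServiceMonitor is the usual alternative. A sketch, assuming your n8n Service carries the label app.kubernetes.io/name: n8n and names its metrics port metrics (both assumptions to match against your deployment):

```yaml
# Hypothetical ServiceMonitor sketch for the Prometheus Operator.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: n8n
  labels:
    release: prometheus            # must match your Prometheus Operator's selector
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: n8n  # must match your n8n Service labels
  endpoints:
    - port: metrics                # the named port on the Service
      path: /metrics
      interval: 30s
```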


3. Configure Prometheus to Scrape n8n

3.1 Add a scrape job

Insert the following into prometheus.yml (scrape targets are configured in this file; Prometheus has no UI for adding them):

scrape_configs:
  - job_name: 'n8n'
    static_configs:
      - targets: ['n8n:5678']   # n8n's main port; adjust if your metrics are exposed elsewhere
    metrics_path: /metrics
    scheme: http

3.2 Reload Prometheus

docker exec prometheus kill -HUP 1

Tip – Use metric_relabel_configs to drop internal series you never query; this reduces storage bloat.
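As an illustration of that tip, a hedged sketch that drops Node.js runtime internals at scrape time; the regex and metric names are examples, so verify them against your actual /metrics output first:

```yaml
scrape_configs:
  - job_name: 'n8n'
    static_configs:
      - targets: ['n8n:5678']   # adjust to where your instance serves /metrics
    metric_relabel_configs:
      # Drop series you never chart. metric_relabel_configs runs after the
      # scrape, so matching series are discarded before being written to storage.
      - source_labels: [__name__]
        regex: 'nodejs_(gc|eventloop)_.*'   # example pattern only
        action: drop
```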

3.3 Verify the scrape

A quick curl against the Prometheus API confirms health:

curl http://localhost:9090/api/v1/targets \
  | jq '.data.activeTargets[] | select(.scrapeUrl|contains("n8n"))'

You should see "health":"up" and a recent scrape timestamp.


4. Build the Grafana Dashboard

4.1 Add Prometheus as a data source

  1. Configuration → Data Sources → Add data source
  2. Choose Prometheus
  3. URL: http://prometheus:9090 (adjust to your network)
  4. Click Save & Test – you should see *Data source is working*.
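Instead of clicking through the UI, the data source can also be provisioned as code, which is more reproducible across environments. A sketch using Grafana’s standard provisioning format (the file path and URL are assumptions for a typical Docker setup):

```yaml
# Place under /etc/grafana/provisioning/datasources/ and restart Grafana.
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090   # adjust to your network
    isDefault: true
```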

4.2 Import the JSON dashboard (split for readability)

4.2.1 Dashboard metadata & templating

{
  "dashboard": {
    "title": "n8n Monitoring",
    "uid": "n8n-monitoring",
    "templating": {
      "list": [
        {
          "name": "job",
          "type": "query",
          "datasource": "Prometheus",
          "query": "label_values(n8n_workflow_executions_total, job)",
          "refresh": 1,
          "includeAll": false
        }
      ]
    },

4.2.2 Core panels

    "panels": [
      {
        "type": "timeseries",
        "title": "Workflow Execution Rate (per min)",
        "datasource": "Prometheus",
        "targets": [{ "expr": "rate(n8n_workflow_executions_total[1m])", "legendFormat": "Exec/min" }],
        "gridPos": {"x":0,"y":0,"w":12,"h":8}
      },
      {
        "type": "stat",
        "title": "Active Workers",
        "datasource": "Prometheus",
        "targets": [{ "expr": "n8n_worker_active", "legendFormat": "Workers" }],
        "gridPos": {"x":12,"y":0,"w":6,"h":4}
      },
      {
        "type": "timeseries",
        "title": "CPU Usage (%)",
        "datasource": "Prometheus",
        "targets": [{ "expr": "rate(process_cpu_seconds_total{job=\"n8n\"}[30s]) * 100", "legendFormat": "CPU %" }],
        "gridPos": {"x":12,"y":4,"w":12,"h":8}
      },
      {
        "type": "timeseries",
        "title": "Memory RSS (bytes)",
        "datasource": "Prometheus",
        "targets": [{ "expr": "process_resident_memory_bytes{job=\"n8n\"}", "legendFormat": "RSS" }],
        "gridPos": {"x":0,"y":8,"w":12,"h":8}
      },
      {
        "type": "table",
        "title": "Top 5 Slowest Workflows (last 5 min)",
        "datasource": "Prometheus",
        "targets": [{
          "expr": "topk(5, avg_over_time(n8n_workflow_execution_duration_seconds[5m]))",
          "format": "table",
          "legendFormat": "{{workflow_id}}"
        }],
        "gridPos": {"x":12,"y":12,"w":12,"h":8}
      }
    ]
  },
  "overwrite": true
}

Copy the full JSON (metadata + panels) into **Dashboards → Import → Upload JSON file**.
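The same import can be scripted against Grafana’s HTTP API, which is handy for CI pipelines. A sketch assuming the JSON above is saved as n8n-dashboard.json and that $GRAFANA_TOKEN holds a service‑account token with dashboard write access (both assumptions):

```shell
# POST the dashboard JSON (it already carries the required
# {"dashboard": ..., "overwrite": true} wrapper) to the import endpoint.
curl -s -X POST http://grafana:3000/api/dashboards/db \
  -H "Authorization: Bearer $GRAFANA_TOKEN" \
  -H "Content-Type: application/json" \
  -d @n8n-dashboard.json
```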

4.3 Customising panels

| Panel | Core metric | Recommended alert |
|-------|-------------|-------------------|
| Workflow Execution Rate | rate(n8n_workflow_executions_total[1m]) | < 5 exec/min → Warning |
| Active Workers | n8n_worker_active | = 0 → Critical |
| CPU Usage | rate(process_cpu_seconds_total[30s]) * 100 | > 80 % for 5 min → Warning |
| Memory RSS | process_resident_memory_bytes | > 80 % of container limit → Critical |
| Slowest Workflows | avg_over_time(n8n_workflow_execution_duration_seconds[5m]) | > 30 s → Info |

Note – In Kubernetes, use container_cpu_usage_seconds_total and container_memory_working_set_bytes instead of the generic process_* metrics to avoid double counting across pods.
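For reference, the CPU and memory panels translate roughly as follows in a cluster using those cAdvisor metrics; the namespace and pod selectors are placeholders:

```promql
# CPU % across n8n pods
sum(rate(container_cpu_usage_seconds_total{namespace="n8n", pod=~"n8n-.*"}[2m])) * 100

# Working-set memory across n8n pods
sum(container_memory_working_set_bytes{namespace="n8n", pod=~"n8n-.*"})
```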


5. Alerting with Prometheus Alertmanager

5.1 Define alert rules

groups:
  - name: n8n.alerts
    rules:
      - alert: n8nHighCPU
        expr: rate(process_cpu_seconds_total{job="n8n"}[2m]) * 100 > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on n8n"
          description: "CPU usage has been above 80 % for the last 5 minutes."
      - alert: n8nNoWorkers
        expr: n8n_worker_active == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "No active n8n workers"
          description: "All worker processes are down – workflows will not execute."

5.2 Wire the rule file into Prometheus

rule_files:
  - "/etc/prometheus/alert.rules.yml"
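Before reloading, it is worth validating both files with promtool, which ships with Prometheus (the paths assume the snippets above):

```shell
# Exits non-zero and prints the offending line if a rule or the config is invalid.
promtool check rules /etc/prometheus/alert.rules.yml
promtool check config /etc/prometheus/prometheus.yml
```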

5.3 Configure Alertmanager (example: Slack webhook)

receivers:
  - name: slack
    slack_configs:
      - api_url: https://hooks.slack.com/services/XXXXX/XXXXX/XXXXX
        channel: "#alerts"
route:
  receiver: slack
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 12h

Caution – Create a maintenance silence for scheduled deployments; otherwise you’ll generate alert fatigue.
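Silences can be created from the CLI with amtool, which ships with Alertmanager. A sketch for a one‑hour deployment window; the Alertmanager URL and the alertname pattern are assumptions:

```shell
# Regex matcher silences every alert whose name starts with "n8n".
amtool silence add 'alertname=~"n8n.*"' \
  --alertmanager.url=http://alertmanager:9093 \
  --comment="scheduled deploy" \
  --duration=1h
```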


6. Advanced Troubleshooting & Performance Tuning

| Symptom | Likely cause (metric) | Quick fix |
|---------|-----------------------|-----------|
| Spike in n8n_workflow_execution_duration_seconds | DB latency (pg_stat_activity high) | Increase DB pool (N8N_DB_MAX_CONNECTIONS) |
| n8n_worker_active drops to 0 | OOM kill of worker container | Raise memory limit or enable swap (if allowed) |
| Prometheus scrape errors | 401 Unauthorized on /metrics | Verify exporter auth; add basic_auth to the scrape job |
| Grafana panel shows “NaN” | Metric name typo | Check the metric list at the /metrics endpoint |

Tip – Keep a “Metrics health” dashboard that only shows up{job="n8n"} and scrape_duration_seconds{job="n8n"}. If those go red, the monitoring stack itself needs attention before investigating downstream symptoms.


Conclusion

By exposing n8n’s built‑in Prometheus metrics endpoint, configuring Prometheus to scrape it, and importing a purpose‑built Grafana dashboard, you gain real‑time visibility into workflow throughput, worker health, CPU, and memory consumption. Coupled with concise Alertmanager rules, this stack provides early warning of performance regressions and keeps your automation pipelines reliable in production. Implement the steps above, tailor the alert thresholds to your SLAs, and you’ll have production‑grade observability without unnecessary complexity.
