n8n Workflows Slow After Weeks in Production: Execution Table Bloat and Other Fixes

A step‑by‑step guide to diagnosing and fixing the root causes of gradual n8n workflow slowdown.


Who this is for: Developers and DevOps engineers running n8n in a production environment who need to diagnose and prevent gradual workflow slowdown. We cover this in detail in the n8n Performance Degradation & Stability Issues Guide.


Quick Diagnosis

| Step | Action |
| --- | --- |
| 1 | Open Execution List → “Last 24 h” and note the average execution time. |
| 2 | Run the workflow in Debug mode with “Run from start” and enable “Show execution details”. |
| 3 | If any node shows durationMs > 2 s, open its Advanced settings and enable Cache results or adjust Batch size. |
| 4 | Restart the n8n worker (pm2 restart n8n or docker restart n8n). |
| 5 | If execution time stays > 500 ms, proceed to the full root‑cause analysis below. |

1. Core Causes of Degrading Performance


1.1 Memory‑Related Issues

| Trigger | Manifestation |
| --- | --- |
| Custom JavaScript that keeps references in global or long‑living closures | Node.js heap grows → GC pauses → 5‑30 s spikes. |

1.2 Data‑Store Issues

| Trigger | Manifestation |
| --- | --- |
| executions table never purged; large JSON blobs in workflow_execution | Full‑table scans on every run, slowing all workflows. |

1.3 Queue & Concurrency Problems

| Trigger | Manifestation |
| --- | --- |
| Execute Workflow node with maxConcurrency = 0 (unlimited) | Backlog builds; each new execution waits for the previous ones. |

1.4 External API Throttling

| Trigger | Manifestation |
| --- | --- |
| Rate‑limited services (Salesforce, HubSpot, etc.) | Retries with exponential back‑off add seconds per node. |

1.5 Inefficient Node Logic

| Trigger | Manifestation |
| --- | --- |
| Large CSV parsing, nested loops in a “Function” node | CPU spikes, especially on low‑end containers. |

1.6 Version Drift

| Trigger | Manifestation |
| --- | --- |
| Upgrading n8n core without revisiting custom nodes | New defaults (e.g., stricter validation) cause hidden re‑processing. |

EEFA Note: In production, the silent killer is almost always a memory leak from user‑supplied JavaScript. Even a single stray global.someArray = [] that keeps growing will stall the worker process.


2. Step‑by‑Step Root‑Cause Investigation

2.1 Capture Baseline Metrics

Export recent execution stats (last 7 days):

```bash
curl -X GET "http://localhost:5678/rest/executions?filter[finished]=true&filter[createdAt][gte]=$(date -d '-7 days' +%s)" \
  -H "Authorization: Bearer $API_TOKEN" > execs.json
```

Calculate the average duration (requires jq):

```bash
jq '[.data[].durationMs] | add / length / 1000' execs.json
```

*If the average > 1 s, a systemic issue exists.*

2.2 Profile Memory Usage

Attach the Node.js inspector (default port 9229) and take a heap snapshot after ~30 executions:

```bash
# --inspect (not --inspect-brk) so n8n starts immediately instead of pausing
node --inspect $(which n8n) &
```

Open Chrome → chrome://inspect → “Open dedicated DevTools for Node”. Compare the snapshot with one taken from a fresh start, looking for growing arrays or detached objects; memory that rises steadily day after day is the signature of a leak.

2.3 Audit the Database

Count rows and size of the execution table (PostgreSQL example):

```sql
SELECT COUNT(*) FROM execution;
SELECT pg_total_relation_size('execution')/1024/1024 AS mb;
```

EEFA Warning: Deleting rows directly (DELETE FROM execution;) bypasses n8n’s cleanup logic. Use the built‑in purge endpoint or CLI instead.

CLI purge of old executions:

```bash
n8n execution:clear --older-than 30d
```

2.4 Identify Hot Nodes

  1. Open Workflow → Settings → Execution → Show execution details.
  2. Sort the “Node execution time (ms)” column.
  3. Flag any node whose execution time exceeds 2000 ms.

Typical hot spots: Function, HTTP Request, Spreadsheet File nodes.
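The UI sort above is the quickest route, but you can also flag hot nodes from exported timing data. A minimal sketch in plain JavaScript — the `{ node, durationMs }` record shape is an assumption, so map your n8n export onto it first:

```javascript
// Sketch: aggregate per-node timings and flag nodes over a threshold.
// The input record shape ({ node, durationMs }) is assumed, not an n8n API.
function findHotNodes(timings, thresholdMs = 2000) {
  const totals = new Map();
  for (const { node, durationMs } of timings) {
    totals.set(node, (totals.get(node) || 0) + durationMs);
  }
  return [...totals.entries()]
    .filter(([, ms]) => ms > thresholdMs)
    .sort((a, b) => b[1] - a[1])
    .map(([node, ms]) => ({ node, ms }));
}

// Example with made-up timings:
const hot = findHotNodes([
  { node: 'HTTP Request', durationMs: 1500 },
  { node: 'HTTP Request', durationMs: 1200 },
  { node: 'Set', durationMs: 40 },
]);
// 'HTTP Request' totals 2700 ms and is flagged; 'Set' is not.
```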

2.5 Review Concurrency Settings

Set a realistic limit based on CPU cores:

```yaml
# n8n environment configuration
EXECUTIONS_PROCESS=main
MAX_EXECUTIONS=5         # limit concurrent runs
EXECUTIONS_TIMEOUT=60000 # 60 s timeout
```

3. Targeted Fixes for Each Root Cause

3.1 Memory Leak in Custom JavaScript

Problematic pattern (leaky):

```js
// BAD – uses global mutable state
global.cache = global.cache || [];
global.cache.push(item);
return global.cache;
```

Stateless alternative (safe):

```js
// GOOD – local variable only
const cache = [];
cache.push(item);
return cache;
```

*Scope variables locally and avoid mutating global.*
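If a Function node genuinely needs a cache, bound its size so it cannot grow without limit. A minimal sketch in plain JavaScript (not an n8n API; adapt it to wherever you keep state):

```javascript
// Sketch: a size-capped cache that evicts the oldest entry when full,
// so memory use stays constant no matter how many executions run.
function makeBoundedCache(maxEntries = 100) {
  const cache = new Map(); // Map preserves insertion order
  return {
    set(key, value) {
      if (cache.has(key)) cache.delete(key); // refresh position on update
      cache.set(key, value);
      if (cache.size > maxEntries) {
        cache.delete(cache.keys().next().value); // evict the oldest entry
      }
    },
    get: (key) => cache.get(key),
    size: () => cache.size,
  };
}

const cache = makeBoundedCache(2);
cache.set('a', 1);
cache.set('b', 2);
cache.set('c', 3); // 'a' is evicted; size stays at 2
```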

3.2 Database Bloat

Enable automatic retention and schedule regular purges:

```bash
export EXECUTIONS_DATA_RETENTION_DAYS=30   # keep 30 days of data
n8n start
```

Add a cron job (runs daily at 02:00 am):

```cron
0 2 * * * /usr/local/bin/n8n execution:clear --older-than 30d >> /var/log/n8n-purge.log 2>&1
```

3.3 Unbounded Queues

Set maxConcurrency on the **Execute Workflow** node (UI → Advanced) or globally via MAX_EXECUTIONS as shown earlier.

3.4 External API Throttling

Add a **Retry** node with custom back‑off parameters:

```json
{
  "retry": {
    "maxAttempts": 5,
    "delay": 2000
  }
}
```

Cache frequent responses where possible.
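To see why throttling dominates node time, it helps to compute the delays those retry settings imply. A sketch assuming a doubling back-off schedule (the actual schedule depends on the retry implementation in use):

```javascript
// Sketch: exponential back-off delays of baseDelayMs * 2^(attempt - 1).
function backoffDelays(maxAttempts, baseDelayMs) {
  return Array.from({ length: maxAttempts }, (_, i) => baseDelayMs * 2 ** i);
}

const delays = backoffDelays(5, 2000);
// delays = [2000, 4000, 8000, 16000, 32000] — roughly a minute of waiting
// in the worst case, which is why throttled APIs slow whole workflows.
```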

3.5 Inefficient Parsing

Stream CSV data instead of loading the whole file:

```js
const fs = require('fs');
const csv = require('csv-parser'); // npm install csv-parser

fs.createReadStream('orders.csv') // path is illustrative
  .pipe(csv())
  .on('data', (row) => {
    // process each row here — memory stays flat regardless of file size
  });
```

3.6 Version Drift

After any n8n upgrade, validate all workflows:

```bash
n8n workflow:validate --all
```

Fix any reported breaking changes before traffic resumes.

EEFA Insight: Always restart the worker after applying a fix to clear stale memory and reset DB connection pools. In Docker, docker compose restart n8n is the safest approach.


4. Preventive Maintenance Checklist

| Item | Verification Method | Frequency |
| --- | --- | --- |
| Enable execution retention (EXECUTIONS_DATA_RETENTION_DAYS) | echo $EXECUTIONS_DATA_RETENTION_DAYS returns a non‑zero number | Once (deployment) |
| Schedule DB purge (cron) | crontab -l shows the n8n execution:clear command | Daily |
| Monitor heap size (Prometheus nodejs_heap_size_used_bytes) | Grafana panel stays < 70 % of limit | Continuous |
| Cap concurrent runs (MAX_EXECUTIONS) | n8n config:get MAX_EXECUTIONS matches CPU_COUNT * 2 | After scaling |
| Audit custom JS (static analysis) | Run eslint with a no‑global rule on the repo | PR review |
| Validate all workflows after upgrades | CLI exits with status 0 | Post‑upgrade |
| Alert on slow node execution (> 2 s) | Alert on n8n_node_execution_time_seconds > 2 | Continuous |
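The slow-node alert in the last row could be expressed as a Prometheus alerting rule along these lines — a sketch that assumes your setup exports the n8n_node_execution_time_seconds metric named in the table:

```yaml
groups:
  - name: n8n
    rules:
      - alert: N8nSlowNodeExecution
        expr: n8n_node_execution_time_seconds > 2
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "n8n node execution exceeded 2 s"
```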

5. Real‑World Example: Fixing a 3‑Week Slowdown

Scenario: An e‑commerce integration workflow (order-sync) started at 200 ms per run and rose to 8 s after four weeks.

Root causes discovered

  1. A Function node built a global.orderCache array that never cleared.
  2. The execution table grew to 2.3 M rows (~12 GB).

Fixes applied

Removed the global cache:

```js
// Old leaky code
global.orderCache = global.orderCache || [];
global.orderCache.push(newOrder);
return global.orderCache;
```

```js
// New stateless code
return [newOrder]; // only return the current payload
```

Purged old executions and restarted the worker:

```bash
n8n execution:clear --older-than 14d
pm2 restart n8n
```

Outcome: Execution time dropped back to 180 ms and stayed stable for the next 90 days.

EEFA Note: The leak was invisible because the Function node didn’t log its global usage. Always audit custom code for side‑effects before deploying.


Fix Checklist

  1. Check recent avg execution time (> 1 s → investigate).
  2. Run workflow in Debug → note any node > 2 s.
  3. If a Function node, ensure no global or long‑living objects.
  4. Purge old executions: n8n execution:clear --older-than 30d.
  5. Restart n8n worker (pm2 restart n8n or Docker restart).
  6. Set MAX_EXECUTIONS to a safe limit (e.g., CPU_COUNT * 2).

If performance still degrades, follow the full root‑cause analysis steps above.

All commands assume a Unix‑like environment and that the n8n CLI is in $PATH. Adjust paths, tokens, and environment variables to match your deployment.
