
Who this is for: Platform engineers and DevOps practitioners running n8n in production who need reliable, long‑running queue workers. We cover this in detail in the n8n Queue Mode Errors Guide.
Quick Diagnosis
Symptom: Queue workers crash after 30‑60 min with “JavaScript heap out of memory” or a SIGKILL.
Quick fix
- Increase the V8 heap: set `NODE_OPTIONS="--max-old-space-size=4096"` in the worker container.
- Enable the inspector (`--inspect`) and capture a heap snapshot with `heapdump`.
- Pinpoint the offending node (usually a custom node or a huge JSON payload) and free its memory (`delete`, `clear`, or stream the data).
- Add a health‑check that restarts any worker that runs longer than 45 min.
What’s happening?
| Symptom | Typical Log Message | Immediate Impact |
|---|---|---|
| Worker exits with code 137 | SIGKILL: Out of memory or JavaScript heap out of memory | Jobs stall; queue length spikes |
| CPU spikes to 100 % before crash | node: internal/process/promises: … | Host becomes unresponsive |
| RSS grows linearly (500 MiB → 4 GiB) | No error until OOM killer fires | Crash is delayed, hard to reproduce |
Fast check – Run docker stats (or kubectl top pod) on the worker pods. A steady RSS climb without plateau signals a leak.
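The same trend can be captured from inside the worker when `docker stats` isn't available, e.g. in a custom node or startup hook. A minimal sketch (the 30 s interval and field names are assumptions, not n8n APIs):

```javascript
// memory-sampler.js – log RSS and heap periodically so a leak shows up
// as a steadily climbing series in the worker logs.
const INTERVAL_MS = 30_000;

function sampleMemory() {
  const { rss, heapUsed, heapTotal } = process.memoryUsage();
  const mib = (bytes) => (bytes / 1024 / 1024).toFixed(1);
  return { rssMiB: mib(rss), heapUsedMiB: mib(heapUsed), heapTotalMiB: mib(heapTotal) };
}

// unref() lets the process exit normally once all real work is done.
const timer = setInterval(() => {
  console.log(JSON.stringify({ ts: new Date().toISOString(), ...sampleMemory() }));
}, INTERVAL_MS);
timer.unref();
```

A healthy worker plateaus after warm-up; a leaking one shows `rssMiB` climbing run after run with no plateau.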
1️⃣ Enable Precise Memory Monitoring in Workers
1.1 Add Node‑level diagnostics (docker‑compose)
# docker‑compose.yml – worker service
services:
  n8n-worker:
    environment:
      - NODE_OPTIONS=--max-old-space-size=4096 --inspect=0.0.0.0:9229
    ports:
      - "9229:9229" # V8 inspector – keep internal, never expose publicly
Warning: Bind the inspector to the internal network or tunnel via SSH; never expose it publicly.
1.2 Capture a heap snapshot on demand
Step 1 – Install heapdump in a custom node or pre‑execution hook
npm install heapdump --save
Step 2 – Add a signal handler that writes a snapshot
const heapdump = require('heapdump');

process.on('SIGUSR2', () => {
  const file = `/tmp/heap-${Date.now()}.heapsnapshot`;
  heapdump.writeSnapshot(file, (err) => {
    if (err) console.error(err);
    else console.log('Heap snapshot written to', file);
  });
});
Step 3 – Trigger the snapshot on demand (e.g., when monitoring shows memory above 75 %)
docker kill --signal=SIGUSR2 n8n-worker
2️⃣ Identify the Leak Source
2.1 Analyze the snapshot
- Open the `.heapsnapshot` in Chrome DevTools → Memory → Comparison.
- Look for objects (e.g., `Array`, `Object`) retaining > 200 MiB.
- Drill into the *constructor name* to see which node created the allocation (`LargeJsonNode`, `CsvParseNode`, etc.).
2.2 Common culprits in n8n
| Node / Feature | Typical Leak Pattern | Fix Hint |
|---|---|---|
| Custom JavaScript node | Large item arrays stored in a closure or global | Return only needed data; delete large fields after use |
| HTTP Request (big response) | Buffer kept in memory because binary isn’t cleared | Stream to a temp file and delete binary after processing |
| Webhook with long‑running connections | Event listeners never removed on restart | Remove your own listeners (e.g., `emitter.removeAllListeners(event)`) in an onClose hook |
| CSV/JSON parser on huge payloads | Entire dataset held in a single array | Use csv-parser or JSONStream to process line‑by‑line |
2.3 Verify with a minimal reproduction workflow
Node 1 – Download a large JSON file
{
  "type": "n8n-nodes-base.httpRequest",
  "parameters": {
    "url": "https://example.com/large-file.json",
    "responseFormat": "json",
    "options": { "jsonParse": false }
  },
  "name": "Download Large JSON"
}
Node 2 – Pass‑through function (no processing)
{
  "type": "n8n-nodes-base.function",
  "parameters": {
    "functionCode": "return items;"
  },
  "name": "Pass‑Through"
}
Connect the two nodes and run the workflow repeatedly (e.g., via a cron). If memory climbs, the leak originates from the HTTP request handling.
3️⃣ Apply Production‑Grade Fixes
3.1 Enforce memory limits and graceful restarts
# docker‑compose.yml – memory limit, restart policy, and a health check
# that fails before the OS OOM killer would fire
services:
  n8n-worker:
    mem_limit: 4g
    restart: on-failure
    healthcheck:
      # Read the worker's own RSS from /proc (PID 1 inside the container);
      # VmRSS is reported in kB, so 3500000 ≈ 3.5 GiB
      test: ["CMD-SHELL", "awk '/^VmRSS/ { exit ($2 > 3500000) ? 1 : 0 }' /proc/1/status"]
      interval: 5m
      retries: 2
Note: The check must measure the worker process itself – a plain node -e one-liner would only measure the heap of the freshly spawned check process. A failing health check marks the container unhealthy so an orchestrator (Kubernetes liveness probes, or a watcher such as autoheal with plain Docker) can recycle it before the OS OOM killer intervenes, preserving queue continuity.
3.2 Stream large payloads instead of buffering
// Stream a CSV download to a temporary file instead of buffering it in memory
const fs = require('fs');
const https = require('https');
const { pipeline } = require('stream/promises');
const tmp = require('tmp');

async function streamCsv(url) {
  const tmpFile = tmp.fileSync({ postfix: '.csv' });
  await new Promise((resolve, reject) => {
    https.get(url, (res) => {
      if (res.statusCode !== 200) {
        res.resume(); // drain the response so the socket is released
        return reject(new Error(`HTTP ${res.statusCode}`));
      }
      // pipeline handles backpressure and closes the file stream on error,
      // and only resolves once the file has been fully flushed
      pipeline(res, fs.createWriteStream(tmpFile.name)).then(resolve, reject);
    }).on('error', reject);
  });
  return tmpFile.name;
}
After processing, delete the temp file:
fs.unlinkSync(tmpFile.name);
3.3 Explicitly free large objects in custom nodes
// Example: Clean up after heavy computation
items = items.map(item => {
  const result = heavyComputation(item.json);
  delete item.json.largePayload; // free original data
  return { json: result };
});
global.gc && global.gc(); // trigger GC if --expose-gc is set
return items;
Enable --expose-gc only in staging environments to avoid production overhead.
3.4 Isolate memory‑intensive jobs with a dedicated worker pool
Add a second pool that only runs “heavy” jobs:
# .env for heavy‑worker pool
QUEUE_MODE=memory
WORKER_CONCURRENCY=2
WORKER_LABELS=heavy
In the workflow, set Execute on Worker Label → heavy. This isolates leaks to a subset of pods that can be cycled more aggressively.
4️⃣ Validation & Ongoing Monitoring
| Tool | Metric | Alert Threshold |
|---|---|---|
| Prometheus | node_memory_Active_bytes (worker RSS) | > 3.5 GiB (when limit = 4 GiB) |
| Grafana (Heap Used panel) | process_resident_memory_bytes | > 80 % of max-old-space-size |
| Sentry (error fingerprint) | JavaScript heap out of memory | Immediate ticket |
| Datadog (container restart count) | container_restart_total | > 2 per hour |
Create a dashboard that shows **memory trend per worker** and **queue backlog**. Correlate spikes with workflow IDs (available via process.env.N8N_WORKFLOW_ID in logs).
5️⃣ Preventive Best Practices (Checklist)
- Never store full payloads in `workflowData` or `global` longer than the node execution.
- Stream files > 10 MiB; use `/tmp` and delete after use.
- Limit concurrency (`WORKER_CONCURRENCY`) for jobs that parse large data sets.
- Pin Node.js to LTS (e.g., v20.x) to benefit from V8 memory‑leak fixes.
- Run `npm audit` on custom node packages; upgrade any that depend on outdated `request` or `xml2js`.
- Enable health‑check auto‑restart (see §3.1).
- Document any custom node that allocates > 50 MiB and add a “cleanup” note in its description.
Bottom line
Memory leaks in n8n’s queue workers are almost always caused by unreleased large objects (payloads, custom node state) or buffered network responses. By instrumenting workers with --inspect and heapdump, streaming heavy data, enforcing strict memory limits, and using health‑check‑driven restarts, you can detect, isolate, and permanently resolve the leak—keeping your queue healthy and your automations running 24/7.



