
Who this is for: developers and DevOps engineers who run n8n in production and need to pinpoint why a step misbehaves (missing data, time‑outs, permission errors, etc.). We cover this in detail in the n8n Architectural Failure Modes Guide.
Quick Diagnosis
If a workflow behaves oddly, first locate where the offending code runs—Main Process, Worker Process, or Sandbox—and adjust node settings, credentials, or deployment limits for that context; the problem often disappears.
In production this often shows up as a missing field or a timeout after a few minutes of execution.
In short, n8n routes each workflow step to one of three runtimes:
- Main Process – UI, webhook listeners, and lightweight nodes.
- Worker Process – CPU‑intensive or long‑running nodes in a separate child process.
- Sandbox – Isolated execution of custom JavaScript and credential decryption.
The engine decides based on node type, sync/async mode, and available resources.
1. High‑Level Execution Flow
If you run into issues with n8n execution ordering guarantees, resolve them before continuing with the setup.
Below are the major phases and their runtimes.
| Phase | Component | Runs In | Typical Tasks |
|---|---|---|---|
| Trigger Reception | Webhook / Cron / Manual | Main Process | Accept HTTP request, schedule cron, start manual run |
| Node Scheduling | `WorkflowRunner` | Main Process | Parse workflow JSON, build execution graph, decide node placement |
| Heavy‑Lift Execution | `WorkerPool` (child processes) | Worker Process | Run loops, API calls with retries, large data transforms |
| Isolated Code | `Sandbox` (VM2) | Sandbox | Execute custom JavaScript, credential decryption |
| Result Aggregation | `WorkflowRunner` | Main Process | Collect node outputs, resolve expressions, write to DB |
| Cleanup | `ProcessManager` | Main Process | Terminate workers, clear temp files, update status |
The table maps each logical step to its runtime, helping locate the source of a failure.
EEFA notes for each phase
| Phase | EEFA Note |
|---|---|
| Trigger Reception | Ensure the main process has a high enough ulimit for open file descriptors; otherwise webhooks may be dropped. |
| Node Scheduling | Mis‑ordered dependencies cause “Node not found” errors – verify nextNode IDs. |
| Heavy‑Lift Execution | Worker crashes are logged to worker.log; restart the pool if you see EWORKEREXIT. |
| Isolated Code | Sandbox memory defaults to 150 MB. Increase via N8N_SANDBOX_MEMORY_LIMIT only after a security review. |
| Result Aggregation | Failures often point to malformed JSON from a node; enable N8N_DEBUG_OUTPUT to see raw payloads. |
| Cleanup | Stale temp files can fill /tmp; set N8N_TMP_DIR to a larger volume on production. |
Execution Path (text diagram)
    [Webhook/Trigger] → Main Process → Scheduler → ├─► Worker (if heavy)
                                                   │
                                                   └─► Sandbox (if custom code)
                                                            ↓
                                                   Result Aggregator → DB / Response
2. The Main Process – “Control Plane”
If you run into n8n webhook backpressure issues, resolve them before continuing with the setup.
The Main Process orchestrates everything that doesn’t need heavy lifting or isolation.
2.1 What lives here?
- HTTP server (`express`) – receives webhooks and UI requests.
- Workflow registry – loads workflow JSON from the database.
- Execution graph builder – creates a DAG (directed acyclic graph).
- Lightweight nodes – e.g., `Set`, `Merge`, `IF`, `Switch` (no external I/O).
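The execution graph builder can be pictured as a topological sort over the workflow's connections. The sketch below uses Kahn's algorithm for illustration; the `nodes` and `connections` shapes and the `buildExecutionOrder` name are assumptions made for this example, not n8n's internal format.

```javascript
// Hypothetical workflow shape: node names plus [from, to] connection pairs.
function buildExecutionOrder(nodes, connections) {
  const inDegree = new Map(nodes.map((name) => [name, 0]));
  const successors = new Map(nodes.map((name) => [name, []]));

  for (const [from, to] of connections) {
    if (!inDegree.has(from) || !inDegree.has(to)) {
      throw new Error(`Connection references unknown node: ${from} -> ${to}`);
    }
    successors.get(from).push(to);
    inDegree.set(to, inDegree.get(to) + 1);
  }

  // Kahn's algorithm: repeatedly emit nodes with no unmet dependencies.
  const ready = nodes.filter((name) => inDegree.get(name) === 0);
  const order = [];
  while (ready.length > 0) {
    const current = ready.shift();
    order.push(current);
    for (const next of successors.get(current)) {
      inDegree.set(next, inDegree.get(next) - 1);
      if (inDegree.get(next) === 0) ready.push(next);
    }
  }

  // Leftover nodes mean a cycle, so the graph is not a DAG.
  if (order.length !== nodes.length) {
    throw new Error('Workflow contains a cycle');
  }
  return order;
}

const exampleOrder = buildExecutionOrder(
  ['Webhook', 'Set', 'HTTP Request', 'Merge'],
  [['Webhook', 'Set'], ['Webhook', 'HTTP Request'], ['Set', 'Merge'], ['HTTP Request', 'Merge']]
);
console.log(exampleOrder.join(' -> '));
```

A cycle check like the one at the end is what lets a builder reject a workflow up front instead of looping at run time.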
2.2 Configuration knobs
| Env Variable | Default | Effect |
|---|---|---|
| N8N_MAX_EXECUTION_TIME | 3600 s | Max wall‑clock time for a workflow before the main process aborts it. |
| N8N_WORKER_CONCURRENCY | 5 | Max concurrent workers; increase on multi‑core servers. |
| N8N_DISABLE_WEBHOOKS | false | Disables webhook listener (useful for batch‑only installations). |
On a VM with limited RAM, keep N8N_MAX_EXECUTION_TIME low (e.g., 300) to avoid runaway loops exhausting main process memory.
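To make the wall‑clock limit concrete, here is a minimal sketch of how such a cap could be enforced around an asynchronous run. `withTimeLimit` is a hypothetical helper, and reading the variable as seconds is an assumption taken from the table above; this is not n8n's actual implementation.

```javascript
// Assumption: the limit is expressed in seconds, as in the table above.
const maxSeconds = Number(process.env.N8N_MAX_EXECUTION_TIME) || 3600;

// Reject if `task` has not settled within `limitMs` milliseconds.
function withTimeLimit(task, limitMs) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Execution exceeded ${limitMs} ms`)),
      limitMs
    );
  });
  // Whichever promise settles first wins; the timer is always cleared afterwards.
  return Promise.race([task, timeout]).finally(() => clearTimeout(timer));
}

// Usage with a stand-in for a workflow run.
withTimeLimit(Promise.resolve('finished'), maxSeconds * 1000).then(console.log);
```

Note that `Promise.race` only stops waiting; it does not cancel the underlying work, which is one reason a real engine would pair the deadline with an abort or process kill.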
3. Worker Processes: The “Data Plane”
If you run into issues with n8n state handling between nodes, resolve them before continuing with the setup.
Workers handle anything that would otherwise block the main process.
3.1 When does n8n spawn a worker?
| Condition | Example Nodes |
|---|---|
| Node declares `executionMode: "main"` → stays in main. | Set, IF |
| Node declares `executionMode: "worker"` → off‑loaded. | HTTP Request, Google Sheets, AWS S3, Code (custom JS) |
| Node has `runInBackground: true` (e.g., long polling). | Webhook, Trigger with polling mode |
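The placement rules in the table can be expressed as a small decision function. This is a sketch only: `resolveRuntime` and the node descriptor shape are hypothetical, and the fallback to the main process for undeclared nodes is an assumption.

```javascript
// Hypothetical node descriptor: { name, executionMode?, runInBackground? }
function resolveRuntime(node) {
  // Background nodes (e.g., long polling) are off-loaded regardless of mode.
  if (node.runInBackground === true) return 'worker';
  if (node.executionMode === 'worker') return 'worker';
  // Assumption: nodes without an explicit declaration stay in the main process.
  return 'main';
}

const exampleNodes = [
  { name: 'Set', executionMode: 'main' },
  { name: 'HTTP Request', executionMode: 'worker' },
  { name: 'Webhook', runInBackground: true },
];
for (const node of exampleNodes) {
  console.log(`${node.name} -> ${resolveRuntime(node)}`);
}
```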
3.2 Worker lifecycle
- Fork a new Node.js child process (`worker.js`).
- Bootstrap with shared config (`worker-config.json`).
- Execute node’s `execute()` method.
- Return result via IPC (inter‑process communication).
- Terminate after `N8N_WORKER_TIMEOUT` (default 600 s) or on error.
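The lifecycle above can be sketched with Node's `child_process.fork` and its built‑in IPC channel. Everything here is illustrative: the temporary script stands in for `worker.js`, the doubling logic is a placeholder for a node's `execute()`, and `runInWorker` is a hypothetical name.

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Stand-in for worker.js: runs a trivial "node" and replies over IPC.
const workerSource = `
process.on('message', (job) => {
  const result = job.items.map((n) => n * 2); // placeholder for execute()
  process.send({ ok: true, result });
});
`;

function runInWorker(job, timeoutMs = 600000) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'worker-sketch-'));
  const script = path.join(dir, 'worker.js');
  fs.writeFileSync(script, workerSource);

  return new Promise((resolve, reject) => {
    const child = fork(script); // fork a new child process

    // Terminate the worker if it exceeds the timeout.
    const timer = setTimeout(() => {
      child.kill();
      reject(new Error('ETIMEOUT'));
    }, timeoutMs);

    // Receive the result via IPC, then shut the worker down.
    child.once('message', (msg) => {
      clearTimeout(timer);
      child.kill();
      resolve(msg.result);
    });
    child.once('error', (err) => {
      clearTimeout(timer);
      reject(err);
    });
    // An exit before any message means the worker crashed (no-op if already settled).
    child.once('exit', (code) => {
      clearTimeout(timer);
      reject(new Error(`EWORKEREXIT (code ${code})`));
    });

    child.send(job); // hand the job to the worker
  }).finally(() => fs.rmSync(dir, { recursive: true, force: true }));
}
```

Calling `runInWorker({ items: [1, 2, 3] })` resolves with the doubled items; a crash or timeout rejects with an error named after the codes used elsewhere in this guide.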
3.3 Debugging worker failures
    # View live worker logs
    tail -f ~/.n8n/worker.log

    # Restart the worker pool after a code update
    docker exec -it n8n n8n restart-worker-pool
| Symptom | Likely Cause | Fix |
|---|---|---|
| EWORKEREXIT | Worker crashed (segfault, OOM) | Increase N8N_WORKER_MEMORY_LIMIT or shrink payload size. |
| ETIMEOUT | Node exceeded `N8N_WORKER_TIMEOUT` | Optimize API calls, enable pagination, or raise timeout. |
| ENOTFOUND (credential) | Credential not loaded in sandbox | Verify credential scope (global vs workflow) and re‑authenticate. |
Restarting the worker pool is usually faster than hunting for the exact node that crashed.
EEFA note – In containers, ensure the cgroup memory limit exceeds N8N_WORKER_MEMORY_LIMIT; otherwise the kernel will OOM‑kill the worker silently.
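A start‑up check along these lines can catch the mismatch before the kernel does. It is a sketch under stated assumptions: the path is the cgroup v2 location, `N8N_WORKER_MEMORY_LIMIT` is treated as megabytes, the 512 fallback is a placeholder, and both function names are hypothetical.

```javascript
const fs = require('fs');

// Read the cgroup v2 memory limit; returns null when unlimited or unavailable.
function readCgroupLimitBytes(file = '/sys/fs/cgroup/memory.max') {
  try {
    const raw = fs.readFileSync(file, 'utf8').trim();
    return raw === 'max' ? null : Number(raw);
  } catch {
    return null; // not in a cgroup v2 container, or file not readable
  }
}

// Assumption: the worker limit is expressed in MB.
function hasHeadroom(cgroupLimitBytes, workerLimitMb) {
  if (cgroupLimitBytes === null) return true; // no container limit detected
  return cgroupLimitBytes > workerLimitMb * 1024 * 1024;
}

// 512 is a placeholder default chosen for this example only.
const workerLimitMb = Number(process.env.N8N_WORKER_MEMORY_LIMIT) || 512;
if (!hasHeadroom(readCgroupLimitBytes(), workerLimitMb)) {
  console.warn('cgroup memory limit does not exceed the worker memory limit');
}
```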
4. Sandbox (VM2) – Secure Isolation
The sandbox protects the host process from arbitrary user code.
4.1 Purpose
- Isolate user‑provided JavaScript (`Code` node) and credential decryption.
- Prevent malicious code from touching the file system, network (unless whitelisted), or environment variables.
4.2 How it works – VM2 setup
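    const { NodeVM } = require('vm2');

    // Placeholder for the node data the engine would inject.
    const $node = { input: [] };

    const vm = new NodeVM({
      console: 'inherit',            // forward console output to the host
      sandbox: { $node },            // globals visible to the user script
      require: {
        external: true,              // allow external modules resolved under `root`
        builtin: ['fs', 'path'],     // whitelist only if needed; 'fs' grants file access
        root: './node_modules'
      },
      // vm2 documents `timeout` as effective for VM only, not NodeVM, so the
      // 30 s cap has to be enforced by the host (e.g., by killing the worker).
      timeout: 30000,
      // Limit in MB; not a documented vm2 option, so the host must enforce it too.
      memoryLimit: Number(process.env.N8N_SANDBOX_MEMORY_LIMIT) || 150
    });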
4.3 Running a user script
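    // The script exports an async function; `$node` comes from the sandbox globals.
    const userScript = `
      module.exports = async function () {
        return $node["input"];
      };
    `;

    // vm.run() returns the script's module.exports, i.e. the async function above.
    module.exports = vm.run(userScript);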
4.4 Common pitfalls
| Issue | Symptoms | Resolution |
|---|---|---|
| Memory exceeded | `RangeError: Invalid array length` inside Code node | Raise N8N_SANDBOX_MEMORY_LIMIT (cautiously) or refactor to stream data. |
| Missing module | `Error: Cannot find module 'axios'` | Add the module to the **allowed external list** in VM2 config or install it where the sandbox resolves modules (`npm i axios`). |
| Infinite loop | Execution hangs > 30 s, worker killed | Reduce timeout to a lower value for safety, or break loop into chunked iterations. |
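To illustrate the infinite‑loop row, the sketch below enforces a hard time limit on synchronous code. It deliberately swaps in Node's built‑in `vm` module rather than VM2, because its `timeout` option is easy to demonstrate; `node:vm` is not a security boundary and should not be used to run untrusted code. `runWithTimeout` is a hypothetical helper.

```javascript
const vm = require('vm');

// Run synchronous code with a hard time limit using Node's built-in vm module.
// Note: node:vm is NOT a security mechanism; it only demonstrates the timeout.
function runWithTimeout(code, sandbox, timeoutMs) {
  const context = vm.createContext({ ...sandbox });
  return vm.runInContext(code, context, { timeout: timeoutMs });
}

console.log(runWithTimeout('items.length * 2', { items: [1, 2, 3] }, 100)); // 6

try {
  runWithTimeout('while (true) {}', {}, 50);
} catch (err) {
  console.log(err.code); // ERR_SCRIPT_EXECUTION_TIMEOUT
}
```

The timeout covers synchronous execution only; asynchronous work scheduled by the script needs a separate deadline, for example at the worker level.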
Do not disable sandboxing (N8N_DISABLE_SANDBOX=true) in production; it opens the host to arbitrary code execution.
5. Scaling the Execution Engine
Two strategies handle growth: add instances (horizontal) or add resources to a single instance (vertical).
5.1 Horizontal scaling (multiple n8n instances)
| Component | Scaling Strategy |
|---|---|
| Webhook listener | Use a load balancer (NGINX, Traefik) with sticky sessions; set N8N_ENDPOINT_WEBHOOK to a shared domain. |
| Worker pool | Deploy a **Redis‑backed queue** (bullmq) via N8N_WORKER_QUEUE=redis://redis:6379 to share jobs across instances. |
| Database | Centralize on PostgreSQL or MySQL; configure DB_TYPE, DB_POSTGRESDB_DATABASE, etc. |
| Cache | Enable N8N_CACHE=true with Redis to avoid duplicate credential decryption. |
The table shows which pieces benefit from load‑balanced or shared resources.
5.2 Vertical scaling (single instance)
| Resource | Recommended Setting |
|---|---|
| CPU cores | N8N_WORKER_CONCURRENCY = #cores * 2 (e.g., 8 cores → 16 workers). |
| RAM | Allocate at least **2 GB per 5 concurrent workers**. |
| Disk I/O | Use SSD for /tmp and DB storage; set N8N_TMP_DIR to a high‑throughput mount. |
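The two rules of thumb in the table reduce to simple arithmetic. The helper below is a hypothetical convenience for working it out per machine; the returned RAM figure is the minimum implied by "2 GB per 5 concurrent workers".

```javascript
const os = require('os');

// Rules of thumb from the table: 2 workers per core, 2 GB RAM per 5 workers.
function recommendSizing(cores) {
  const workerConcurrency = cores * 2;
  const minRamGb = (workerConcurrency / 5) * 2;
  return { workerConcurrency, minRamGb };
}

console.log(recommendSizing(8)); // { workerConcurrency: 16, minRamGb: 6.4 }
console.log(recommendSizing(os.cpus().length)); // sizing for the current host
```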
**EEFA production checklist**
- `N8N_LOG_LEVEL=error` (avoid verbose logs).
- Enable process monitoring (PM2, systemd) with auto‑restart on crash.
- Set resource limits (`ulimit -n 4096`, `--max-old-space-size=1024`).
- Rotate `worker.log` daily (`logrotate` config).
- Harden sandbox: whitelist only needed built‑ins, disable `eval`.
6. Troubleshooting “What Runs Where” Issues
Targeted steps for common pain points.
6.1 Symptom: Data missing after a Code node
- Open the execution log and locate the node ID (e.g., `Node 12`).
- Check sandbox logs (`~/.n8n/sandbox.log`).
- If you see `RangeError: Invalid array length`, the script exceeded the sandbox memory limit.
Fix – increase the memory limit and restart n8n:
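    # A container only picks this up if it is set in the container's own
    # environment (compose file or `docker run -e`), not via a host-shell export.
    export N8N_SANDBOX_MEMORY_LIMIT=300   # 300 MB
    docker restart n8n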
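Raising the limit is not the only option; the pitfalls table in section 4.4 also suggests refactoring so that less data is held at once. A minimal sketch of that idea, using a hypothetical `inChunks` generator inside a Code node style script:

```javascript
// Yield fixed-size slices so only one chunk is materialized at a time.
function* inChunks(items, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  for (let start = 0; start < items.length; start += size) {
    yield items.slice(start, start + size);
  }
}

// Example: aggregate per chunk instead of building one large intermediate array.
let total = 0;
for (const chunk of inChunks([1, 2, 3, 4, 5], 2)) {
  total += chunk.reduce((sum, n) => sum + n, 0);
}
console.log(total); // 15
```

Chunking helps when the spike comes from intermediate arrays; if the input itself is too large, the payload has to shrink upstream, for example through pagination.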
6.2 Symptom: “Webhook not received” on cloud deployment
| Likely place | Why |
|---|---|
| Main Process | Listener bound to localhost instead of public IP (N8N_HOST=0.0.0.0). |
| Load balancer | Health‑check path not whitelisted, causing 502. |
| Worker | Worker crashed before the webhook could be acknowledged (EWORKEREXIT). |
**Resolution** – ensure the host binds to all interfaces and verify the LB health check:
    export N8N_HOST=0.0.0.0
    curl -I http://your-n8n.example.com/webhook-test
6.3 Symptom: “Credential not found” in a Google Sheets node
| Check | Action |
|---|---|
| Credential scope | Confirm the credential is global or attached to the workflow (Credentials → Scope). |
| Sandbox isolation | If a Code node fetches a token, ensure you call $credentials["googleSheets"] inside the sandbox. |
| Worker cache | Stale cache can cause misses; restart the worker pool. |
Restart the worker pool:
docker exec -it n8n n8n restart-worker-pool
Conclusion
- Main Process – triggers, lightweight nodes, orchestration.
- Worker Process – heavy I/O or CPU‑intensive nodes in child processes.
- Sandbox – isolates custom JavaScript and credential decryption via VM2.
- Identify the problematic context, tweak the relevant environment variables, and restart the affected component (main, worker, or sandbox).
All examples assume a Unix‑like environment; adapt paths and service names for Windows or Docker‑Compose setups.



