Who this is for: Developers and DevOps engineers running n8n in production who need to understand, tune, or troubleshoot webhook concurrency. We cover this in detail in the n8n Production Readiness & Scalability Risks Guide.
Quick Diagnosis
n8n receives every HTTP request on a single Webhook Server (Express). The request is placed in WorkflowExecutionQueue. A worker pool (default = MAX_CONCURRENT_EXECUTIONS = 10) pulls jobs, locks the target workflow in the database, runs the workflow in an isolated process (or Docker container), and finally releases the lock. Concurrency limits, retry policies, and scaling options are controlled via environment variables (EXECUTIONS_PROCESS, MAX_CONCURRENT_EXECUTIONS, WEBHOOK_CONCURRENCY_LIMIT, WORKFLOW_EXECUTION_TIMEOUT).
Rule of thumb – “Webhook queue full” or “Workflow execution timed out” in the logs means the concurrency settings are too low for the traffic. Increase `MAX_CONCURRENT_EXECUTIONS` or add more n8n instances behind a load balancer.
In production we often see the queue fill up the moment a burst of traffic hits, so the first thing to check is the queue length.
1. Request Lifecycle: From HTTP to Workflow Execution
| Stage | Component | What Happens |
|---|---|---|
| 1️⃣ Receive | WebhookServer (Express) | Listens on /:webhookId, validates optional signature. |
| 2️⃣ Queue | WorkflowExecutionQueue (BullMQ) | Pushes a job { workflowId, eventData, requestMeta }. |
| 3️⃣ Acquire Lock | WorkflowExecutionLock (DB) | Row‑level lock prevents parallel runs of the same workflow. |
| 4️⃣ Worker Pull | ExecutionWorker (child process or Docker) | Dequeues job, spawns an isolated process. |
| 5️⃣ Run | WorkflowRunner | Executes nodes in topological order. |
| 6️⃣ Cleanup | ExecutionWorker | Sends result back, releases DB lock, acknowledges the queue. |
The lock works across clustered databases (e.g., Aurora), guaranteeing single‑run semantics even under high load.
1.1. Express Hook Registration – Part 1 (Setup)
```js
import express from 'express';
import { addToQueue } from './WorkflowExecutionQueue';

const app = express();
app.use(express.json());
```
Creates the Express app and enables JSON parsing.
1.2. Express Hook Registration – Part 2 (Route Handler)
```js
app.post('/webhook/:id', async (req, res) => {
  const webhookId = req.params.id;
```
Defines the POST endpoint that receives webhook payloads.
1.3. Signature Validation (optional)
```js
  if (process.env.WEBHOOK_SIGNATURE_CHECK) {
    const valid = verifySignature(req);
    if (!valid) return res.status(401).send('Invalid signature');
  }
```
If signature checking is enabled, the request is rejected early on failure.
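The `verifySignature` helper itself is not shown above. A minimal sketch, assuming the provider sends an HMAC-SHA256 hex digest of the JSON body in an `x-webhook-signature` header (the header name and `WEBHOOK_SECRET` variable are illustrative, not n8n built-ins):

```js
import crypto from 'node:crypto';

// Hypothetical verifier: recompute the HMAC of the body and compare it
// to the signature the provider sent along with the request.
export function verifySignature(req, secret = process.env.WEBHOOK_SECRET) {
  const received = String(req.headers['x-webhook-signature'] || '');
  const expected = crypto
    .createHmac('sha256', secret)
    .update(JSON.stringify(req.body))
    .digest('hex');
  const a = Buffer.from(received);
  const b = Buffer.from(expected);
  // timingSafeEqual requires equal lengths and avoids timing leaks.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}
```

Comparing with `timingSafeEqual` rather than `===` prevents an attacker from guessing the digest byte-by-byte via response timing.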
1.4. Queue the Payload
```js
  const queued = await addToQueue({
    webhookId,
    payload: req.body,
    headers: req.headers,
    ip: req.ip,
  });
```
The payload and metadata are handed off to the internal queue.
1.5. Immediate Acknowledgement
```js
  res.status(200).json({ queued });
});
```
Responds quickly so webhook providers see a successful delivery.
2. Queue Implementation (BullMQ) – Overview
2.1. Queue Creation
```js
import { Queue } from 'bullmq';
import IORedis from 'ioredis';

const connection = new IORedis(process.env.REDIS_URL);
```
Establishes a Redis connection for the queue.
2.2. Queue Configuration
```js
export const webhookQueue = new Queue('webhook-execution', {
  connection,
  defaultJobOptions: {
    attempts: 3,
    backoff: { type: 'exponential', delay: 2000 },
    removeOnComplete: true,
  },
});
```
Sets retry behavior and ensures completed jobs are removed.
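With these options a failed job is retried up to three times. Assuming BullMQ's exponential strategy doubles the base delay on each retry (a sketch of the configured behavior, not an authoritative statement of BullMQ internals), the schedule implied by `{ attempts: 3, delay: 2000 }` looks like this:

```js
// Retry delays implied by { attempts, backoff: { type: 'exponential', delay } },
// assuming the delay doubles per retry: ~2 s, then ~4 s, then the job fails for good.
export function retryDelays(attempts, baseDelayMs) {
  const delays = [];
  for (let attempt = 1; attempt < attempts; attempt++) {
    delays.push(baseDelayMs * 2 ** (attempt - 1));
  }
  return delays;
}

console.log(retryDelays(3, 2000)); // [ 2000, 4000 ]
```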
2.3. Adding a Job with Concurrency Guard
```js
export async function addToQueue(jobData) {
  const limit = Number(process.env.WEBHOOK_CONCURRENCY_LIMIT) || 1000;
  const waiting = await webhookQueue.getWaitingCount();
  if (waiting >= limit) throw new Error('Webhook queue full');
  return webhookQueue.add('execute', jobData);
}
```
Rejects new webhooks when the waiting queue exceeds the configured limit.
3. Configuring Concurrency – Environment Variables
| Variable | Default | Typical Production |
|---|---|---|
| MAX_CONCURRENT_EXECUTIONS | 10 | 30–50 for medium traffic, 100+ on autoscaling |
| WEBHOOK_CONCURRENCY_LIMIT | 1000 | Adjust to inbound spikes, e.g., 5000 |
| EXECUTIONS_PROCESS | main | docker for SaaS, queue for on‑prem |
| WORKFLOW_EXECUTION_TIMEOUT | 300000 ms (5 min) | 60000 ms for fast APIs |
| WEBHOOK_TTL | 86400 s (24 h) | Keep default unless short‑lived URLs are needed |
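Pulled together, the “Typical Production” column above might look like this in `.env` (medium-traffic suggestions from the table, not universal defaults):

```
MAX_CONCURRENT_EXECUTIONS=50
WEBHOOK_CONCURRENCY_LIMIT=5000
EXECUTIONS_PROCESS=queue
WORKFLOW_EXECUTION_TIMEOUT=60000
WEBHOOK_TTL=86400
```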
3.1. Scaling Checklist
- Measure baseline – Use `docker stats` or `pm2 status`.
- Raise `MAX_CONCURRENT_EXECUTIONS` – Edit `.env` and restart. Usually bumping this first gives the biggest win.
- Tune BullMQ concurrency – Adjust the `concurrency` option in `ExecutionWorker`.
- Add Redis nodes – Clustered Redis reduces latency.
- Deploy replicas – Put multiple n8n instances behind NGINX/Traefik; they share DB/Redis for true horizontal scaling.
- Enable health-checks – `GET /healthz` should return `200` only when queue length < `WEBHOOK_CONCURRENCY_LIMIT`.
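The health-check rule from the last bullet can be factored into a small pure function (names are illustrative) and wired into the Express app from section 1:

```js
// Healthy only while the waiting queue is below WEBHOOK_CONCURRENCY_LIMIT.
export function healthStatus(waiting, limit) {
  return waiting < limit
    ? { code: 200, body: { status: 'ok', waiting } }
    : { code: 503, body: { status: 'saturated', waiting } };
}

// Example wiring (sketch):
// app.get('/healthz', async (_req, res) => {
//   const limit = Number(process.env.WEBHOOK_CONCURRENCY_LIMIT) || 1000;
//   const { code, body } = healthStatus(await webhookQueue.getWaitingCount(), limit);
//   res.status(code).json(body);
// });
```

Returning 503 when saturated lets a load balancer stop routing traffic to an instance whose queue is already full.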
4. Exactly‑One Execution per Workflow
n8n uses a row‑level lock to avoid race conditions when many webhook calls target the same workflow.
4.1. Acquire Lock (PostgreSQL) – Part 1
```js
export async function acquireLock(workflowId, client) {
  const result = await client.query(
    `SELECT id FROM workflow_execution WHERE workflow_id = $1 FOR UPDATE SKIP LOCKED`,
    [workflowId],
  );
```
Attempts to lock the workflow row; SKIP LOCKED returns immediately if another transaction holds the lock.
4.2. Acquire Lock – Part 2
```js
  if (result.rowCount === 0) throw new Error('Workflow already in execution');
  // Lock acquired – proceed
}
```
If the lock is unavailable, the worker re‑queues the job with a short back‑off.
Warning – MySQL < 8.0 lacks `SKIP LOCKED`. Use `GET_LOCK` or upgrade to PostgreSQL for reliable locking.
The lock is taken per workflow ID, not per individual node, so the whole workflow is serialized.
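The re-queue-with-back-off path described in 4.2 can be sketched with injected dependencies: `acquireLock` stands in for the helper above, `requeue` for `webhookQueue.add`, and the 500 ms delay is illustrative, not an n8n default.

```js
// On lock contention, put the job back with a short delay instead of failing it.
export async function runWithLock(job, { acquireLock, requeue }) {
  try {
    await acquireLock(job.data.workflowId);
  } catch (err) {
    if (err.message === 'Workflow already in execution') {
      // Another worker holds the row lock; retry shortly.
      return requeue(job.data, { delay: 500 });
    }
    throw err; // unrelated failure: let the queue's retry policy handle it
  }
  return 'locked'; // caller runs the workflow, then commits to release the lock
}
```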
4.3. When “Run Once” Is Disabled
If the workflow’s Run Once flag is unchecked, n8n skips the lock step, allowing parallel runs. This is fine for stateless ingestion pipelines but can cause duplicate side‑effects (e.g., double emails). Verify that downstream nodes are idempotent before disabling the lock.
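With the lock disabled, duplicate deliveries must be absorbed downstream. A minimal in-process sketch of such an idempotency guard (a single `Set`; production would use Redis `SETNX` or a DB unique constraint on the event ID instead):

```js
// Skip side-effects for event IDs we have already handled in this process.
const seen = new Set();

export function processOnce(eventId, handler) {
  if (seen.has(eventId)) return false; // duplicate delivery, skip
  seen.add(eventId);
  handler();
  return true;
}
```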
5. Troubleshooting Common Concurrency Issues
| Symptom | Likely Cause | Diagnostic | Fix |
|---|---|---|---|
| “Webhook queue full” | Low `WEBHOOK_CONCURRENCY_LIMIT` or Redis latency | `redis-cli LLEN bull:webhook-execution:wait` | Raise the limit, add Redis replicas, enable `maxStalledCount`. |
| “Workflow execution timed out” | Timeout too short; heavy node (HTTP request) | `grep "Execution timed out" /var/log/n8n.log` | Increase `WORKFLOW_EXECUTION_TIMEOUT` or offload heavy work. |
| Duplicate records | Run Once disabled; non-idempotent downstream API | Search DB for duplicate `eventId` rows | Enable the lock or make the downstream API idempotent (`Idempotency-Key`). |
| Memory OOM | Worker leaks (large CSV parsing) | `pm2 monit` or `docker stats` | Switch `EXECUTIONS_PROCESS` to `docker`; limit Node memory (`--max-old-space-size=256`). |
| Deadlocked DB | MySQL < 8.0 lacking `SKIP LOCKED` | `SHOW ENGINE INNODB STATUS\G` | Upgrade MySQL or migrate to PostgreSQL. |
When a problem shows up, the queue length is usually the quickest thing to look at.
5.1. Auto‑Scale Workers Script
```bash
#!/usr/bin/env bash
THRESHOLD=200    # jobs waiting before spawning a new worker
MAX_WORKERS=8    # cap
```
Sets the scaling thresholds.
```bash
while true; do
  WAITING=$(redis-cli LLEN bull:webhook-execution:wait)
  CURRENT=$(docker ps --filter "name=n8n_worker" --format "{{.Names}}" | wc -l)
```
Fetches queue length and current worker count.
```bash
  if (( WAITING > THRESHOLD && CURRENT < MAX_WORKERS )); then
    echo "$(date) – Queue $WAITING, launching extra worker..."
    docker-compose up -d --scale n8n_worker=$((CURRENT+1))
  fi
  sleep 30
done
```
Spins up an extra worker when the queue exceeds the threshold.
6. Production‑Grade Recommendations
- Isolate Execution – Set `EXECUTIONS_PROCESS=docker` with a lightweight image (`node:18-alpine`). Prevents a runaway workflow from crashing the main process.
- Persist Queue State – Enable Redis `appendonly yes` so in-flight jobs survive host restarts.
- Rate-Limit Incoming Webhooks – Use NGINX `limit_req zone=webhook burst=20 nodelay;` to protect n8n from abusive spikes.
- Observability – Export BullMQ metrics to Prometheus (`bull-board` + `prom-client`). Track `queue_waiting`, `queue_active`, `worker_errors`.
- Graceful Shutdown – On SIGTERM, call `webhookQueue.close()` and `workerPool.close()` so jobs finish before the container stops.
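The graceful-shutdown step above can be sketched as a small helper that closes resources in sequence; `webhookQueue` and `workerPool` are the objects from earlier sections, and registration would look like `process.on('SIGTERM', () => shutdown([webhookQueue, workerPool]))`:

```js
// Close each resource in order so in-flight jobs drain before the process exits.
export async function shutdown(closables) {
  for (const c of closables) {
    await c.close(); // BullMQ Queue and Worker both expose close()
  }
  return closables.length; // number of resources closed, handy for logging
}
```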
Bottom Line
n8n’s concurrent webhook handling follows a clear pipeline: Express → BullMQ queue → DB lock → worker pool → isolated process. By tuning environment variables, scaling the queue/worker layer, and respecting the row‑level lock, you can safely process thousands of webhook calls per second without race conditions or data loss.



