
Step-by-Step Guide to Benchmarking n8n with Load-Testing Tools
Who this is for: Developers and SREs who need a reproducible way to measure n8n workflow throughput and latency in a CI‑ready, production‑like environment. We cover this in detail in the n8n Performance & Scaling Guide.
Quick Diagnosis
Problem: You need a reproducible way to measure n8n workflow throughput and latency.
Solution: Deploy a lightweight load generator (k6 or Locust), target the n8n REST API (/webhook/... or /executions), run a scripted scenario that mimics real‑world payloads, and capture VU‑seconds, response‑time percentiles, and error rates. Export the results to CSV/JSON for analysis or CI integration.
1. Choosing the Right Load Generator for n8n
| Feature | k6 | Locust | When to Prefer |
|---|---|---|---|
| Language | JavaScript (ES6) | Python | Use k6 if your team is JS‑centric; Locust if you need complex stateful user flows. |
| CLI‑only vs UI | CLI only (HTML report optional) | Web UI + CLI | Locust’s UI is handy for exploratory testing. |
| Distributed Load | Built‑in cloud/remote workers | Master‑worker architecture | For > 10k VUs, both scale, but k6’s cloud service offers managed scaling. |
| Integration | InfluxDB, Grafana, CI pipelines | Prometheus, Grafana, CI pipelines | Choose based on existing observability stack. |
| License | Open‑source (MIT) + commercial cloud | Open‑source (MIT) | Both free; pick the one that matches your tech stack. |
Note: Both tools generate HTTP traffic only. To benchmark n8n’s internal queue processing, combine the load test with a background worker monitor (see “Measuring Queue Drain Rate” later).
2. Preparing n8n for Benchmarking
Micro‑summary: Set up an isolated n8n instance with a minimal echo workflow and deterministic logging to ensure the benchmark measures HTTP handling, not background processing.
2.1 Spin up a dedicated n8n instance
# docker-compose.yml – n8n service
services:
  n8n:
    image: n8nio/n8n:latest
    environment:
      - LOG_LEVEL=debug
      - EXECUTIONS_PROCESS=main   # run executions in the main process for deterministic measurements
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=bench
      - N8N_BASIC_AUTH_PASSWORD=bench123
    ports:
      - "5678:5678"
2.2 Create a simple “echo” webhook workflow
- Add a Webhook node listening on /benchmark.
- Connect it directly to a Set node that returns {{ $json }}.
- Deploy the workflow.
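Before generating load, it is worth a one-shot smoke test to confirm the webhook answers. A minimal sketch, assuming the docker-compose instance and bench/bench123 credentials above (the filename smoke_test.js is ours; run with k6 run --vus 1 --iterations 1 smoke_test.js):

```javascript
// smoke_test.js – one-shot check that the /benchmark webhook responds with 200.
// Assumes the local instance and bench/bench123 credentials from section 2.
import http from 'k6/http';
import { check } from 'k6';
import encoding from 'k6/encoding';

export default function () {
  const res = http.post(
    'http://localhost:5678/webhook/benchmark',
    JSON.stringify({ message: 'smoke' }),
    {
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Basic ${encoding.b64encode('bench:bench123')}`,
      },
    }
  );
  check(res, { 'webhook reachable (200)': (r) => r.status === 200 });
}
```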
Warning: Do not run benchmarks on a multi‑node cluster without synchronizing EXECUTIONS_PROCESS across pods; otherwise, results will reflect load‑balancer jitter rather than true per‑node capacity.
3. k6 Benchmark Script for n8n Webhook
Micro‑summary: Install k6, write a short script that posts JSON payloads, and capture custom latency and error metrics.
3.1 Install k6
# macOS
brew install k6

# Linux (Debian/Ubuntu)
curl -s https://dl.k6.io/public/release.key | sudo apt-key add -
echo "deb https://dl.k6.io/deb stable main" | sudo tee /etc/apt/sources.list.d/k6.list
sudo apt-get update && sudo apt-get install k6
3.2 Imports, base URL, and custom metrics
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Trend, Rate } from 'k6/metrics';

const BASE_URL = __ENV.N8N_URL || 'http://localhost:5678';
export const latency = new Trend('latency_ms');
export const errors = new Rate('error_rate');
3.3 Set test options
export const options = {
  stages: [
    { duration: '30s', target: 50 }, // ramp-up
    { duration: '2m', target: 50 },  // hold
    { duration: '30s', target: 0 },  // ramp-down
  ],
  thresholds: {
    latency_ms: ['p(95)<500'],
    error_rate: ['rate<0.01'],
  },
};
3.4 Build the request payload
import encoding from 'k6/encoding'; // k6 has no btoa; use the built-in encoding module

const payload = JSON.stringify({ message: 'benchmark' });
const params = {
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Basic ${encoding.b64encode(`${__ENV.N8N_USER}:${__ENV.N8N_PASS}`)}`,
  },
  timeout: '60s',
};
3.5 Execute the request and record metrics
export default function () {
  const res = http.post(`${BASE_URL}/webhook/benchmark`, payload, params);
  latency.add(res.timings.duration);
  errors.add(!check(res, { 'status is 200': (r) => r.status === 200 }));
  sleep(0.2);
}
3.6 Run the test
export N8N_URL=http://localhost:5678
export N8N_USER=bench
export N8N_PASS=bench123
k6 run n8n_k6_test.js --summary-export=results.json
3.7 Interpreting k6 output
| Metric | Meaning | Typical Target |
|---|---|---|
| http_req_duration (p95) | 95th percentile of total request time | ≤ 500 ms for a simple webhook |
| latency_ms (custom) | End‑to‑end latency per request | ≤ 400 ms |
| error_rate | Fraction of failed requests | <1 % |
| vus | Peak virtual users | Matches stage target |
Tip: Correlate k6 latency with n8n’s internal request_time_ms log field to verify that network overhead isn’t the bottleneck.
4. Locust Benchmark Script for Stateful Scenarios
Micro‑summary: Install Locust, create a user class that posts to the webhook and optionally fetches execution details, then run in headless or distributed mode.
4.1 Install Locust
python3 -m venv locust-env
source locust-env/bin/activate
pip install locust
4.2 Define the user class and authentication header
import base64
from locust import HttpUser, between, events, task

class N8nUser(HttpUser):
    host = "http://localhost:5678"
    wait_time = between(0.1, 0.5)

    def on_start(self):
        user, pwd = "bench", "bench123"
        token = base64.b64encode(f"{user}:{pwd}".encode()).decode()
        self.headers = {
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        }
    # the @task methods in 4.3 and 4.4 belong to this class
4.3 Post to the webhook
@task(5)
def post_webhook(self):
payload = {"message": "locust-bench"}
with self.client.post(
"/webhook/benchmark",
json=payload,
headers=self.headers,
catch_response=True,
) as response:
if response.status_code != 200:
response.failure(f"Bad status: {response.status_code}")
else:
response.success()
4.4 Optional execution fetch
@task(1)
def fetch_execution(self):
resp = self.client.get("/executions", headers=self.headers)
if resp.status_code == 200 and resp.json():
exec_id = resp.json()[0]["id"]
self.client.get(f"/executions/{exec_id}", headers=self.headers)
4.5 Export a CSV summary on test stop
@events.test_stop.add_listener
def on_test_stop(environment, **kwargs):
    with open("locust_summary.csv", "w") as f:
        f.write("Name,Requests,Failures,Median,95th\n")
        # stats.entries is keyed by (name, method) tuples
        for (name, method), stats in environment.stats.entries.items():
            f.write(
                f"{name},{stats.num_requests},{stats.num_failures},"
                f"{stats.median_response_time},{stats.get_response_time_percentile(0.95)}\n"
            )
4.6 Run Locust (headless example)
locust -f locustfile.py --headless -u 100 -r 10 --run-time 5m --csv=run
4.7 Distributed mode (optional)
# Master
locust -f locustfile.py --master --expect-workers=3

# Workers (on separate hosts)
locust -f locustfile.py --worker --master-host=MASTER_IP
Note: Locust does not throttle request rate by default; in production‑like environments, cap the user count and spawn rate (e.g., --headless -u 200 -r 20) to avoid saturating the network interface before n8n does.
5. Advanced Benchmarking Patterns
| Pattern | Use‑Case | k6 Implementation |
|---|---|---|
| Burst Load | Spike handling (e.g., webhook surge) | Add a rapid ramp stage (target: 200 over 10s). |
| Steady State Throughput | Sustained processing capacity | Hold VUs for 5‑10 min (target: 100). |
| Mixed Payloads | Different workflow payload sizes | Parameterize payloads from a shared data file (see the SharedArray sketch after this table). |
| End‑to‑End Queue Drain | Measure how fast n8n processes queued jobs after a load burst | Combine k6 with a post‑run script that polls /executions until the queue is empty. |
| CI/CD Gate | Fail build if latency > SLA | k6 run … --summary-export=ci.json && node check-sla.js (sketch in section 7.1) |
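A minimal sketch of the Mixed Payloads row, assuming a hypothetical payloads.json fixture (an array of JSON bodies of varying size; auth headers omitted for brevity):

```javascript
// mixed_payloads.js – pick a random body per iteration from a fixture file.
// payloads.json is an assumed fixture, e.g. [{"message":"small"}, {"message":"<8 KB blob>"}].
import http from 'k6/http';
import { SharedArray } from 'k6/data';

// SharedArray loads the file once and shares it across all VUs
const payloads = new SharedArray('payloads', () => JSON.parse(open('./payloads.json')));

export default function () {
  const body = payloads[Math.floor(Math.random() * payloads.length)];
  http.post(`${__ENV.N8N_URL}/webhook/benchmark`, JSON.stringify(body), {
    headers: { 'Content-Type': 'application/json' },
  });
}
```

The Locust equivalents of these patterns: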
| Pattern | Locust Implementation |
|---|---|
| Burst Load | Set a high spawn_rate (-r 100) for a short duration. |
| Steady State Throughput | -u 100 -t 10m in headless mode. |
| Mixed Payloads | Use weighted @task methods and generate random payloads in the task body. |
| End‑to‑End Queue Drain | Add a post‑load step that repeatedly polls /executions until empty, then record elapsed time (see the drain‑timer sketch after this table). |
| CI/CD Gate | Run locust -f locustfile.py --headless -u 50 -r 10 --run-time 2m --csv=ci and parse the CSV for SLA checks. |
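For the End‑to‑End Queue Drain pattern, a rough Node sketch; the /executions?status=running endpoint and response shape are assumptions, so adjust both to your n8n version and auth setup:

```javascript
// drain_timer.js – after a load burst, poll n8n until no executions are
// still running, then report the drain time. Requires Node 18+ (global fetch).
const BASE = process.env.N8N_URL || 'http://localhost:5678';
const AUTH =
  'Basic ' +
  Buffer.from(`${process.env.N8N_USER}:${process.env.N8N_PASS}`).toString('base64');

// Assumed endpoint and response shape: a filterable executions list.
async function runningCount() {
  const res = await fetch(`${BASE}/executions?status=running`, {
    headers: { Authorization: AUTH },
  });
  const body = await res.json();
  return (Array.isArray(body) ? body : body.data || []).length;
}

(async () => {
  const start = Date.now();
  while ((await runningCount()) > 0) {
    await new Promise((r) => setTimeout(r, 1000)); // poll once per second
  }
  console.log(`Queue drained in ${((Date.now() - start) / 1000).toFixed(1)}s`);
})();
```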
Checklist before CI integration
- Mirror the production DB and Node.js version in the test environment.
- Disable n8n auto‑scaling (if on Kubernetes) to keep node count constant.
- Pin k6/Locust versions in package.json or requirements.txt.
- Store authentication secrets in the CI vault, not in the repo.
- Add a timeout guard (e.g., Locust’s --run-time or a CI job timeout) to prevent runaway jobs.
6. Troubleshooting Common Errors
| Symptom | Likely Cause | Fix |
|---|---|---|
| 401 Unauthorized from k6/Locust | Missing or wrong Basic Auth header | Verify N8N_USER/N8N_PASS env vars; base64‑encode them correctly. |
| socket hang up / ECONNRESET | n8n container hitting the open‑file limit under load | Raise the nofile limit in Docker/K8s (e.g., ulimit -n 65535). |
| High latency spikes only in CI | Shared CI runner network contention | Isolate the load generator on a dedicated VM or use k6 Cloud. |
| Metrics show 0 % errors but logs contain Execution timed out | n8n internal timeout not reflected in the HTTP status (still 200) | Set EXECUTIONS_TIMEOUT (in seconds) so timed‑out executions surface as errors, or add a custom response check (see the sketch below). |
| Locust UI crashes after many workers | Insufficient RAM on the master node | Allocate more memory or limit --expect-workers to a realistic count. |
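A sketch of that custom response check, dropped into the k6 default function from section 3.5. The executionFinished field is an assumption about your workflow’s response body, not a guaranteed n8n field:

```javascript
// Fail the request in k6 metrics when the body signals an unfinished execution.
// `executionFinished` is a hypothetical field your workflow would need to return.
errors.add(!check(res, {
  'status is 200': (r) => r.status === 200,
  'execution finished': (r) => r.json('executionFinished') !== false,
}));
```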
7. Exporting Benchmark Results for Stakeholder Reporting
7.1 Convert the k6 summary JSON to CSV
This assumes results.json came from --summary-export (section 3.6), where each metric exposes flat summary fields such as avg and p(95).
jq -r '.metrics | to_entries[] | "\(.key),\(.value.avg)"' results.json > k6_summary.csv
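The same summary JSON can drive the CI/CD gate from section 5. A minimal check-sla.js sketch, assuming the --summary-export field layout and the thresholds from section 3.3:

```javascript
// check-sla.js – exit non-zero when the k6 summary breaches the SLA,
// so the CI pipeline fails the build. Field names assume `--summary-export`.
const fs = require('fs');

const summary = JSON.parse(fs.readFileSync(process.argv[2] || 'ci.json', 'utf8'));
const p95 = summary.metrics['latency_ms']['p(95)'];
const errRate = summary.metrics['error_rate'].value;

if (p95 > 500 || errRate > 0.01) {
  console.error(`SLA breach: p95=${p95} ms, error_rate=${errRate}`);
  process.exit(1);
}
console.log(`SLA ok: p95=${p95} ms, error_rate=${errRate}`);
```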
7.2 Locust already writes CSV (see section 4.5)
The file locust_summary.csv contains:
Name,Requests,Failures,Median,95th
/webhook/benchmark,12000,12,312,498
...
7.3 Sample Markdown report
k6 Results
| Tool | Peak VUs | Avg Latency (ms) | p95 Latency (ms) | Error Rate |
|---|---|---|---|---|
| k6 | 50 | 312 | 498 | 0.2 % |
Locust Results
| Tool | Peak VUs | Avg Latency (ms) | p95 Latency (ms) | Error Rate |
|---|---|---|---|---|
| Locust | 100 | 280 | 470 | 0.1 % |
Observations
- Latency stays under the 500 ms SLA up to 100 concurrent webhooks.
- Queue‑drain time after a 2‑minute burst is ~45 s (see *End‑to‑End Queue Drain*).
Next Steps
- Switch to queue mode (EXECUTIONS_MODE=queue) with additional worker processes for higher concurrency.
- Add a Redis cache for credential lookups (see the sibling guide on *Docker Performance Tuning*).
All commands assume a Unix‑like shell. Adjust paths for Windows PowerShell as needed.



