n8n Benchmarking Tools to Measure n8n Performance

A step-by-step guide to benchmarking n8n with k6 and Locust


Who this is for: Developers and SREs who need a reproducible way to measure n8n workflow throughput and latency in a CI‑ready, production‑like environment. We cover this in detail in the n8n Performance & Scaling Guide.


Quick Diagnosis

Problem: You need a reproducible way to measure n8n workflow throughput and latency.

Solution: Deploy a lightweight load generator (k6 or Locust), target the n8n endpoints (/webhook/... or /executions), run a scripted scenario that mimics real-world payloads, and capture virtual-user counts, response-time percentiles, and error rates. Export the results to CSV/JSON for analysis or CI integration.


1. Choosing the Right Load Generator for n8n

| Feature | k6 | Locust | When to prefer |
|---|---|---|---|
| Language | JavaScript (ES6) | Python | k6 if your team is JS-centric; Locust if you need complex stateful user flows |
| Interface | CLI only (optional HTML report) | Web UI + CLI | Locust's UI is handy for exploratory testing |
| Distributed load | Built-in cloud/remote workers | Master-worker architecture | Both scale past 10k VUs; k6's cloud service offers managed scaling |
| Integrations | InfluxDB, Grafana, CI pipelines | Prometheus, Grafana, CI pipelines | Choose based on your existing observability stack |
| License | Open-source (AGPL-3.0) + commercial cloud | Open-source (MIT) | Both free; pick the one that matches your tech stack |

Note: Both tools generate HTTP traffic only. To benchmark n8n's internal queue processing, combine the load test with a background worker monitor (see the End-to-End Queue Drain pattern in section 5).


2. Preparing n8n for Benchmarking

Micro‑summary: Set up an isolated n8n instance with a minimal echo workflow and deterministic logging to ensure the benchmark measures HTTP handling, not background processing.

2.1 Spin up a dedicated n8n instance

# docker-compose.yml – n8n service
services:
  n8n:
    image: n8nio/n8n:latest
    environment:
      - N8N_LOG_LEVEL=debug
      - EXECUTIONS_PROCESS=main
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=bench
      - N8N_BASIC_AUTH_PASSWORD=bench123
    ports:
      - "5678:5678"

2.2 Create a simple “echo” webhook workflow

  1. Add a Webhook node listening on /benchmark.
  2. Connect it directly to a Set node that returns {{ $json }}.
  3. Deploy the workflow.
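Before load-testing, it is worth a quick smoke test that the webhook echoes correctly. A minimal standard-library sketch (the URL and the bench/bench123 credentials match the docker-compose values above; adjust to your setup):

```python
import base64
import json
import urllib.request

def make_benchmark_request(base_url: str, user: str, pwd: str) -> urllib.request.Request:
    """Build an authenticated POST against the /benchmark webhook."""
    token = base64.b64encode(f"{user}:{pwd}".encode()).decode()
    return urllib.request.Request(
        f"{base_url}/webhook/benchmark",
        data=json.dumps({"message": "smoke-test"}).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )

# Against a live instance, expect a 200 and the echoed payload:
#   req = make_benchmark_request("http://localhost:5678", "bench", "bench123")
#   with urllib.request.urlopen(req, timeout=10) as resp:
#       print(resp.status, resp.read().decode())
```

If this round-trip fails, fix authentication or the workflow before spending time on load scripts.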

Warning: Do not run benchmarks on a multi-node cluster without synchronizing EXECUTIONS_PROCESS across pods; otherwise, results will reflect load-balancer jitter rather than true per-node capacity.


3. k6 Benchmark Script for n8n Webhook

Micro‑summary: Install k6, write a short script that posts JSON payloads, and capture custom latency and error metrics.

3.1 Install k6

# macOS
brew install k6
# Linux (Debian/Ubuntu) – apt-key is deprecated; use a signed-by keyring instead
sudo gpg --no-default-keyring --keyring /usr/share/keyrings/k6-archive-keyring.gpg \
  --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69
echo "deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] https://dl.k6.io/deb stable main" | sudo tee /etc/apt/sources.list.d/k6.list
sudo apt-get update && sudo apt-get install k6

3.2 Imports, custom metrics, and base URL

import http from 'k6/http';
import { check, sleep } from 'k6';
import { Trend, Rate } from 'k6/metrics';

export const latency = new Trend('latency_ms');
export const errors = new Rate('error_rate');

const BASE_URL = __ENV.N8N_URL || 'http://localhost:5678';

3.3 Set test options

export const options = {
  stages: [
    { duration: '30s', target: 50 },   // ramp‑up
    { duration: '2m', target: 50 },   // hold
    { duration: '30s', target: 0 },   // ramp‑down
  ],
  thresholds: {
    latency_ms: ['p(95)<500'],
    error_rate: ['rate<0.01'],
  },
};

3.4 Build the request payload

const payload = JSON.stringify({ message: 'benchmark' });
const params = {
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Basic ${btoa(`${__ENV.N8N_USER}:${__ENV.N8N_PASS}`)}`,
  },
  timeout: '60s',
};

3.5 Execute the request and record metrics

export default function () {
  const res = http.post(`${BASE_URL}/webhook/benchmark`, payload, params);
  latency.add(res.timings.duration);
  errors.add(!check(res, { 'status is 200': (r) => r.status === 200 }));
  sleep(0.2);
}

3.6 Run the test

export N8N_URL=http://localhost:5678
export N8N_USER=bench
export N8N_PASS=bench123
k6 run n8n_k6_test.js --out json=results.json --summary-export=summary.json

3.7 Interpreting k6 output

| Metric | Meaning | Typical target |
|---|---|---|
| http_req_duration (p95) | 95th percentile of total request time | ≤ 500 ms for a simple webhook |
| latency_ms (custom) | End-to-end latency per request | ≤ 400 ms |
| error_rate | Fraction of failed requests | < 1 % |
| vus | Peak virtual users | Matches stage target |

Tip: Correlate k6 latency with request timing in n8n's own logs to verify that network overhead isn't the bottleneck.
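To make the p(95) threshold concrete, here is how a 95th-percentile figure can be recomputed from raw latency samples. This sketch uses the simple nearest-rank method; k6's own estimator may differ slightly on small sample sizes:

```python
import math

def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))  # 1-based rank
    return ordered[rank - 1]

latencies_ms = [120, 180, 210, 250, 300, 310, 320, 350, 420, 900]
print(percentile(latencies_ms, 95))  # 900 – a single slow outlier dominates the tail
```

This is also why p95 is a better SLA gate than the mean: one slow request out of ten pushes p95 to 900 ms while the mean stays near 336 ms.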


4. Locust Benchmark Script for Stateful Scenarios

Micro‑summary: Install Locust, create a user class that posts to the webhook and optionally fetches execution details, then run in headless or distributed mode.

4.1 Install Locust

python3 -m venv locust-env
source locust-env/bin/activate
pip install locust

4.2 Define the user class and authentication header

import base64
from locust import HttpUser, task, between

class N8nUser(HttpUser):
    wait_time = between(0.1, 0.5)

    def on_start(self):
        user, pwd = "bench", "bench123"
        token = base64.b64encode(f"{user}:{pwd}".encode()).decode()
        self.headers = {
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        }

The @task methods in sections 4.3 and 4.4 go inside this class.

4.3 Post to the webhook

@task(5)
def post_webhook(self):
    payload = {"message": "locust-bench"}
    with self.client.post(
        "/webhook/benchmark",
        json=payload,
        headers=self.headers,
        catch_response=True,
    ) as response:
        if response.status_code != 200:
            response.failure(f"Bad status: {response.status_code}")
        else:
            response.success()

4.4 Optional execution fetch

@task(1)
def fetch_execution(self):
    resp = self.client.get("/executions", headers=self.headers)
    if resp.status_code == 200 and resp.json():
        exec_id = resp.json()[0]["id"]
        self.client.get(f"/executions/{exec_id}", headers=self.headers)

4.5 Export a CSV summary on test stop

from locust import events

@events.test_stop.add_listener
def on_test_stop(environment, **kwargs):
    with open("locust_summary.csv", "w") as f:
        f.write("Name,Requests,Failures,Median,95th\n")
        for (name, method), stats in environment.stats.entries.items():
            f.write(
                f"{name},{stats.num_requests},{stats.num_failures},"
                f"{stats.median_response_time},{stats.get_response_time_percentile(95)}\n"
            )

4.6 Run Locust (headless example)

locust -f locustfile.py --headless -u 100 -r 10 --run-time 5m --csv=run

4.7 Distributed mode (optional)

# Master
locust -f locustfile.py --master --expect-workers=3

# Workers (on separate hosts)
locust -f locustfile.py --worker --master-host=MASTER_IP

Note: Locust does not throttle request rate by default; in production-like environments, cap concurrency and spawn rate (--headless -u 200 -r 20) to avoid saturating the network interface before n8n does.


5. Advanced Benchmarking Patterns

| Pattern | Use case | k6 implementation |
|---|---|---|
| Burst load | Spike handling (e.g., webhook surge) | Add a rapid ramp stage (target: 200 over 10s). |
| Steady-state throughput | Sustained processing capacity | Hold VUs for 5–10 min (target: 100). |
| Mixed payloads | Different workflow payload sizes | Parameterize payloads with CSV data via options.scenarios. |
| End-to-end queue drain | How fast n8n processes queued jobs after a load burst | Combine k6 with a post-run script that polls /executions until the queue is empty. |
| CI/CD gate | Fail the build if latency exceeds the SLA | k6 run … --out json=ci.json && node check-sla.js |

| Pattern | Locust implementation |
|---|---|
| Burst load | Set a high spawn rate (-r 100) for a short duration. |
| Steady-state throughput | -u 100 -t 10m in headless mode. |
| Mixed payloads | Use weighted @task methods and generate random payloads in the task body. |
| End-to-end queue drain | Add a post-load @task that repeatedly polls /executions until empty, then record the elapsed time. |
| CI/CD gate | Run locust -f locustfile.py --headless -u 50 -r 10 --run-time 2m --csv=ci and parse the CSV for SLA checks. |
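The queue-drain pattern from both tables can be sketched as a small poller. Here fetch_pending is an injected callable that returns the number of still-pending executions; in practice it would call GET /executions with a running/waiting filter (the exact endpoint and filter depend on your n8n version and are assumptions here):

```python
import time

def measure_queue_drain(fetch_pending, interval_s: float = 1.0, timeout_s: float = 600.0) -> float:
    """Poll until no executions are pending; return the elapsed drain time in seconds."""
    start = time.monotonic()
    while time.monotonic() - start < timeout_s:
        if fetch_pending() == 0:
            return time.monotonic() - start
        time.sleep(interval_s)
    raise TimeoutError("queue did not drain within the timeout")

# Usage with a stubbed poller (replace the lambda with a real HTTP call):
pending = iter([5, 3, 1, 0])
drain_time = measure_queue_drain(lambda: next(pending), interval_s=0.01)
print(f"queue drained in {drain_time:.2f}s")
```

Run this immediately after the load burst ends and report the drain time alongside latency percentiles; a long drain with low HTTP latency indicates the bottleneck is in background processing, not request handling.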

Checklist before CI integration

  • Mirror the production database and Node.js version in the test environment.
  • Disable n8n auto-scaling (if on Kubernetes) to keep the node count constant.
  • Pin k6/Locust versions in package.json or requirements.txt.
  • Store authentication secrets in a CI vault, not in the repo.
  • Add a timeout guard (Locust's --run-time, or a CI job timeout) to prevent runaway jobs.

6. Troubleshooting Common Errors

| Symptom | Likely cause | Fix |
|---|---|---|
| 401 Unauthorized from k6/Locust | Missing or wrong Basic Auth header | Verify the N8N_USER/N8N_PASS env vars; base64-encode user:password correctly. |
| socket hang up / ECONNRESET | n8n container hitting the open-file limit under load | Raise the nofile limit in Docker/K8s (e.g. ulimit -n 65535). |
| High latency spikes only in CI | Shared CI runner network contention | Isolate the load generator on a dedicated VM or use k6 Cloud. |
| Metrics show 0 % errors but logs contain "Execution timed out" | n8n internal timeout not reflected in the HTTP status (still 200) | Set EXECUTIONS_TIMEOUT=300 (seconds) so timeouts surface as errors, or add a custom response check for an executionFinished flag. |
| Locust UI crashes with many workers | Insufficient RAM on the master node | Allocate more memory or limit --expect-workers to a realistic count. |

7. Exporting Benchmark Results for Stakeholder Reporting

7.1 Convert the k6 summary to CSV

With summary.json produced by adding --summary-export=summary.json to the k6 run (field names such as avg may vary slightly between k6 versions):

jq -r '.metrics | to_entries[] | "\(.key),\(.value.avg)"' summary.json > k6_summary.csv
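If you prefer not to depend on jq, the same aggregation can be done in Python against the per-request output file. This sketch assumes k6's --out json line format, where each line is a JSON object and "Point" entries carry the metric name and a data.value field:

```python
import json
from collections import defaultdict

def summarize_k6_points(lines):
    """Aggregate the mean value per metric from k6 NDJSON output lines."""
    sums, counts = defaultdict(float), defaultdict(int)
    for line in lines:
        if not line.strip():
            continue
        entry = json.loads(line)
        if entry.get("type") != "Point":  # skip Metric definition lines
            continue
        sums[entry["metric"]] += entry["data"]["value"]
        counts[entry["metric"]] += 1
    return {metric: sums[metric] / counts[metric] for metric in sums}

# Usage:
#   with open("results.json") as f:
#       print(summarize_k6_points(f))
```

Extending this to percentiles instead of means only requires collecting the raw values per metric and sorting them before reporting.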

7.2 Locust already writes CSV (see section 4.5)

The file locust_summary.csv contains:

Name,Requests,Failures,Median,95th
/webhook/benchmark,12000,12,312,498
...
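The CI/CD-gate row in section 5 suggests parsing this CSV for SLA checks; a minimal sketch against the column layout written in section 4.5 (the 500 ms / 1 % thresholds mirror the k6 thresholds from section 3.3):

```python
import csv
import io

def check_sla(csv_text: str, p95_limit_ms: float = 500.0, max_failure_ratio: float = 0.01) -> list[str]:
    """Return a list of SLA violations found in the benchmark summary CSV."""
    violations = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        p95 = float(row["95th"])
        requests, failures = int(row["Requests"]), int(row["Failures"])
        if p95 > p95_limit_ms:
            violations.append(f"{row['Name']}: p95 {p95} ms exceeds {p95_limit_ms} ms")
        if requests and failures / requests > max_failure_ratio:
            violations.append(f"{row['Name']}: failure ratio {failures / requests:.2%} too high")
    return violations

# In CI, exit non-zero when the returned list is non-empty so the build fails.
```

The sample row above (12 failures in 12,000 requests, p95 of 498 ms) passes both thresholds, so the gate stays green.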

7.3 Sample Markdown report

k6 Results

| Tool | Peak VUs | Avg latency (ms) | p95 latency (ms) | Error rate |
|---|---|---|---|---|
| k6 | 50 | 312 | 498 | 0.2 % |

Locust Results

| Tool | Peak VUs | Avg latency (ms) | p95 latency (ms) | Error rate |
|---|---|---|---|---|
| Locust | 100 | 280 | 470 | 0.1 % |

Observations
– Latency stays under the 500 ms SLA up to 100 concurrent webhooks.
– Queue‑drain time after a 2‑minute burst is ~45 s (see *End‑to‑End Queue Drain*).

Next Steps
– Switch to queue mode (EXECUTIONS_MODE=queue) and add worker processes for higher concurrency.
– Add Redis cache for credential lookups (see the sibling guide on *Docker Performance Tuning*).

All commands assume a Unix‑like shell. Adjust paths for Windows PowerShell as needed.
