Zero-Downtime n8n Upgrades


Step-by-step guide to zero-downtime n8n upgrades

Who this is for: Platform engineers or DevOps specialists who run n8n in production and need to upgrade without interrupting active workflows. We cover the surrounding setup in detail in the Production‑Grade n8n Architecture guide.

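In the field, the backup step is the one that trips people up most: a file‑level copy of a SQLite database taken while n8n is still writing can be inconsistent, and even a consistent `pg_dump` snapshot misses anything written after the dump starts.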

Quick Diagnosis

Problem – You need to upgrade a live n8n instance without aborting or corrupting running workflows.

Solution – Deploy the new version alongside the current one using a blue‑green or rolling‑update strategy (Docker‑Compose, Docker‑Swarm, or Kubernetes). Pair the deployment with a pre‑upgrade backup checklist and the required DB migration scripts. Typical times: ~5 min for Docker‑Compose, <30 min for a Kubernetes cluster.

1. Prerequisites & Safety Checklist

If you have unresolved single‑ vs multi‑instance n8n issues, resolve them before continuing with the setup.

Item	Why It Matters	How to Verify
Database backup (PostgreSQL/MySQL/SQLite)	Prevents data loss if migration fails	pg_dump -U $POSTGRES_USER -Fc $POSTGRES_DB > backup_$(date +%F).dump
Workflow export (optional)	Guarantees a recoverable state of custom workflows	n8n export:workflow --all -o workflows.json
Staging clone	Test the target version with real data before production	Deploy a copy using the same `docker-compose.yml` on a different port, pointed at a restored copy of the database
Custom node compatibility	Community nodes may need recompilation after major releases	Run `npm rebuild` inside the custom‑node container
Health‑check endpoint (`/healthz`) reachable	Allows orchestrators to detect a ready pod before traffic switch	The main n8n process serves `/healthz` by default; verify that `curl -f http://localhost:5678/healthz` exits with status 0 (HTTP 200)
Version pinning	Guarantees you know exactly which image/tag you’re deploying	Use `n8nio/n8n:0.236.0` instead of `latest`

EEFA Note – Store backups off‑site (e.g., S3 with versioning) and keep at least three snapshots before any major upgrade.
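The backup row above can be wrapped in a small helper that also confirms the archive is readable before the upgrade starts. This is a minimal sketch, assuming a PostgreSQL backend; the function names `backup_name` and `backup_and_verify` are hypothetical, while `pg_dump -Fc` and `pg_restore --list` are standard PostgreSQL client commands.

```shell
#!/usr/bin/env bash
# Sketch: take a custom-format PostgreSQL dump and confirm it can be read back.

# Build a timestamped file name from a database name and a timestamp string.
backup_name() {
  local db="$1" stamp="$2"
  printf 'backup_%s_%s.dump\n' "$db" "$stamp"
}

# Dump the database, then list the archive contents.
# pg_restore --list fails on a truncated or corrupt archive.
backup_and_verify() {
  local db="$1" user="$2" out
  out="$(backup_name "$db" "$(date +%F_%H%M)")"
  pg_dump -U "$user" -Fc "$db" > "$out" || return 1
  pg_restore --list "$out" > /dev/null || return 1
  echo "$out"
}

# Usage (not run automatically):
#   backup_and_verify "$POSTGRES_DB" "$POSTGRES_USER"
```

Listing the archive is a cheap sanity check, not a substitute for a full restore into the staging clone.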

2. Blue‑Green Upgrade with Docker‑Compose

Summary – Spin up a parallel “green” instance, verify it, then switch traffic and retire the old “blue” instance.

2.1. Define the Blue Service

services:
  n8n-blue:
    image: n8nio/n8n:0.236.0
    container_name: n8n-blue
    restart: unless-stopped
    ports:
      - "5678:5678"

Runs the current production version on the standard port.

2.2. Define the Green Service

  n8n-green:
    image: n8nio/n8n:0.237.0
    container_name: n8n-green
    restart: unless-stopped
    ports:
      - "5679:5678"   # alternate host port

Starts the target version on a different host port for isolated testing.

2.3. Add Health‑Checks (shared for both services)

    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:5678/healthz"]
      interval: 10s
      timeout: 5s
      retries: 3

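Lets Docker know when each container is ready to receive traffic.
Docker marks a container healthy only after the command succeeds, so the container reports `starting` for the first few seconds. The check assumes `curl` is present in the image; if it is not, substitute an equivalent command such as `wget`. If you have unresolved questions about n8n high‑availability patterns, resolve them before continuing with the setup.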

2.4. Bring Up the Green Instance

docker compose up -d n8n-green

After the command, check the logs for a startup line such as Server ready on http://0.0.0.0:5678 and confirm the health‑check passes.
If neither appears, inspect the container logs for startup or migration errors before going any further.
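Waiting for the health check can be scripted rather than watched by hand. The following is a sketch, assuming the container defines a `healthcheck` as in section 2.3; `wait_for_healthy` is a hypothetical helper name, and `docker inspect --format '{{.State.Health.Status}}'` is the standard way to read Docker's health state.

```shell
# Poll Docker's health status for a container until it reports "healthy".
# Arguments: container name, max attempts (default 30), seconds between attempts (default 2).
wait_for_healthy() {
  local name="$1" attempts="${2:-30}" delay="${3:-2}" i=1 status
  while [ "$i" -le "$attempts" ]; do
    status="$(docker inspect --format '{{.State.Health.Status}}' "$name" 2>/dev/null || true)"
    if [ "$status" = "healthy" ]; then
      echo "$name is healthy after $i check(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "$name did not become healthy (last status: ${status:-unknown})" >&2
  return 1
}

# Usage (not run automatically):
#   docker compose up -d n8n-green && wait_for_healthy n8n-green 30 2
```

A non-zero exit makes it easy to stop a deployment script before any traffic is switched.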

2.5. Smoke‑Test the Green Instance

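curl -X POST "http://localhost:5679/webhook/<path>"

Replace `<path>` with the production webhook path of an active smoke‑test workflow; `/webhook-test/...` URLs respond only while the editor is listening for a test event. A 2xx response and a successful entry in the execution list mean the new version is handling requests.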
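A smoke test is easier to automate when it checks the HTTP status code instead of relying on someone reading the response. Here is a minimal sketch; `is_2xx` and `smoke_test` are hypothetical helper names, and the URL in the usage line is a placeholder for your own webhook.

```shell
# Return success only when the HTTP status code is 2xx.
is_2xx() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# POST to a webhook URL and report whether the response status was 2xx.
smoke_test() {
  local url="$1" code
  # -w prints only the status code; connection failures are treated as 000.
  code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 30 -X POST "$url")" || code="000"
  if is_2xx "$code"; then
    echo "smoke test passed ($code)"
  else
    echo "smoke test failed ($code)" >&2
    return 1
  fi
}

# Usage (not run automatically; the path is a placeholder):
#   smoke_test "http://localhost:5679/webhook/<path>"
```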

2.6. Switch Traffic via Reverse Proxy

Update your proxy configuration to point to the green port, then reload.

upstream n8n {
    server 127.0.0.1:5679;   # green instance
}

nginx -s reload

Swapping the proxy is usually faster than trying to hot‑swap ports.
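The proxy switch can be scripted so the same helper serves both cut‑over and rollback. This is a sketch under the assumption that the upstream is written exactly as `server 127.0.0.1:<port>;` in a file you control; `switch_upstream_port` and `reload_nginx` are hypothetical names, while `nginx -t` and `nginx -s reload` are standard nginx commands.

```shell
# Rewrite the upstream port in an nginx config file.
# Arguments: config file, old port, new port.
switch_upstream_port() {
  local conf="$1" old="$2" new="$3" tmp
  tmp="$(mktemp)" || return 1
  sed "s/server 127\.0\.0\.1:${old};/server 127.0.0.1:${new};/" "$conf" > "$tmp" || return 1
  # Refuse to continue if nothing changed (wrong file or wrong port).
  if cmp -s "$conf" "$tmp"; then
    rm -f "$tmp"
    echo "no upstream on port $old found in $conf" >&2
    return 1
  fi
  # Overwrite in place so the file keeps its owner and permissions.
  cat "$tmp" > "$conf" && rm -f "$tmp"
}

# Validate the whole configuration first, so a typo never takes the proxy down.
reload_nginx() {
  nginx -t && nginx -s reload
}

# Usage (not run automatically; the config path is an assumption):
#   switch_upstream_port /etc/nginx/conf.d/n8n.conf 5678 5679 && reload_nginx
```

Swapping the two port arguments gives the rollback path.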

2.7. Decommission the Blue Instance

docker compose stop n8n-blue && docker compose rm -f n8n-blue

EEFA – Wait at least 30 minutes after cut‑over before running the command above. While the blue container is still running, rolling back only requires pointing the proxy upstream back at port 5678 and reloading.

3. Rolling Update in Docker‑Swarm

Summary – Let Swarm replace each replica one‑by‑one, ensuring the new container passes health checks before the old one stops.

3.1. Service Definition (excerpt)

services:
  n8n:
    image: n8nio/n8n:${N8N_VERSION:-0.236.0}
    deploy:
      mode: replicated
      replicas: 2

3.2. Rolling‑Update Settings

      update_config:
        parallelism: 1
        delay: 15s
        order: start-first

`start-first` ensures the new container starts before the old one stops, preserving traffic.

3.3. Restart Policy & Health‑Check

      restart_policy:
        condition: on-failure
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:5678/healthz"]
      interval: 10s
      timeout: 5s
      retries: 3

3.4. Trigger the Upgrade

export N8N_VERSION=0.237.0          # target version
docker stack deploy -c stack.yml n8n_stack

Swarm updates each replica sequentially, waiting for the health‑check to succeed before moving on.
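Progress of the rolling update can be followed from a script by polling the service's update status. The sketch below assumes the stack and service names used in this section, which Swarm combines into `n8n_stack_n8n`; `wait_for_update` is a hypothetical helper, and `.UpdateStatus.State` is the field Docker reports for service updates.

```shell
# Poll a Swarm service until its rolling update completes, pauses, or rolls back.
# Arguments: service name, max attempts (default 60), seconds between attempts (default 5).
wait_for_update() {
  local service="$1" attempts="${2:-60}" delay="${3:-5}" i=1 state
  while [ "$i" -le "$attempts" ]; do
    state="$(docker service inspect --format '{{.UpdateStatus.State}}' "$service" 2>/dev/null || true)"
    case "$state" in
      completed)
        echo "update completed"
        return 0 ;;
      paused|rollback_*)
        echo "update stopped: $state" >&2
        return 1 ;;
    esac
    sleep "$delay"
    i=$((i + 1))
  done
  echo "timed out waiting for update (last state: ${state:-unknown})" >&2
  return 1
}

# Usage (not run automatically):
#   docker stack deploy -c stack.yml n8n_stack && wait_for_update n8n_stack_n8n
```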

EEFA – For PostgreSQL‑backed n8n, raise max_connections on the DB service so the temporary extra replica doesn’t hit connection limits. Most teams run into this on the first swap, not on day one.
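A quick estimate shows how much connection headroom a `start-first` update needs. This is a rough sketch: `peak_connections` is a hypothetical helper, and the per‑instance connection count is an assumption you should measure on your own database rather than take from here.

```shell
# Estimate peak PostgreSQL connections during a start-first rolling update.
# Arguments: replicas, parallelism (extra tasks started at once),
#            connections per n8n instance, headroom for other clients (default 0).
peak_connections() {
  local replicas="$1" parallelism="$2" per_instance="$3" headroom="${4:-0}"
  echo $(( (replicas + parallelism) * per_instance + headroom ))
}

# Usage (not run automatically): 2 replicas, parallelism 1,
# an assumed 10 connections per instance, 5 reserved for admin tools.
#   peak_connections 2 1 10 5
```

Compare the result with the value returned by `SHOW max_connections;` before triggering the update.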

4. Zero‑Downtime Upgrade on Kubernetes (Helm Chart)

Summary – Use the Deployment’s rolling‑update strategy (set through the Helm values) with maxSurge: 1 and maxUnavailable: 0 so every existing pod keeps serving traffic while one new pod is added at a time.

4.1. Helm Values – Image & Strategy

image:
  repository: n8nio/n8n
  tag: "0.236.0"
  pullPolicy: IfNotPresent

strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 0

4.2. Service & Probe Configuration

service:
  type: ClusterIP
  port: 5678

readinessProbe:
  httpGet:
    path: /healthz
    port: http
  initialDelaySeconds: 10
  periodSeconds: 5
  failureThreshold: 3

4.3. Perform the Upgrade

helm upgrade n8n-release n8n/n8n -f values.yaml \
  --set image.tag=0.237.0

If you’re already using Helm, the extra --set flag is the quickest way to bump the tag.

4.4. Verify Rollout

kubectl rollout status deployment/n8n-release-n8n

The command returns when all pods are ready with the new image.
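Without a timeout, `kubectl rollout status` can wait a long time on a pod that never becomes ready. The sketch below adds a timeout and falls back to `helm rollback`, which returns to the previous revision when no revision number is given; `verify_rollout` is a hypothetical helper, and the release and deployment names are the ones used in this section.

```shell
# Wait for a rollout to finish; roll the Helm release back if it does not.
# Arguments: Helm release, deployment name, timeout (default 300s).
verify_rollout() {
  local release="$1" deployment="$2" timeout="${3:-300s}"
  if kubectl rollout status "deployment/$deployment" --timeout="$timeout"; then
    echo "rollout of $deployment succeeded"
  else
    echo "rollout of $deployment failed, rolling back release $release" >&2
    helm rollback "$release"
    return 1
  fi
}

# Usage (not run automatically):
#   verify_rollout n8n-release n8n-release-n8n 300s
```

Rolling back through Helm rather than `kubectl rollout undo` keeps the release history in step with what is actually running.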

4.5. Optional Post‑Upgrade DB Migration

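# The label selector depends on your chart; adjust app=n8n if needed
POD=$(kubectl get pod -l app=n8n -o jsonpath="{.items[0].metadata.name}")
kubectl exec "$POD" -- n8n migration:run

n8n normally applies pending database migrations on its own when the new version starts, so this step is rarely needed. `migration:run` does not appear among the documented n8n CLI commands as far as we know; confirm it exists in your version with `n8n --help` before relying on it.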

4.6. Canary Validation (Advanced)

1. Deploy a single‑replica canary with a node selector.
2. Expose it via a temporary Ingress.
3. Run a representative workflow.
4. If successful, scale the main deployment to full size.

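EEFA – A `PodDisruptionBudget` accepts either `maxUnavailable` or `minAvailable`, not both, and it limits voluntary evictions such as node drains rather than the rolling update itself, so the extra pod created by `maxSurge` never violates it. For a 2‑replica set, `minAvailable: 1` keeps one pod serving during a drain; `maxUnavailable: 0` (or `minAvailable: 2`) blocks drains entirely. If you have unresolved n8n data‑consistency issues, resolve them before continuing with the setup.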

5. Post‑Upgrade Validation Checklist

Step	Command / Action	Success Indicator
Workflow health	`curl -X POST "http://localhost:5678/webhook/<path>"` (production URL of an active smoke‑test workflow)	2xx response and the execution finishes with status `success`
DB schema version	`SELECT name FROM migrations ORDER BY id DESC LIMIT 1;` (assumes the default, empty table prefix)	Returns the name of the newest migration shipped with the target version
Custom node loading	Inspect container logs for `Loading custom nodes`	No “module not found” errors
Metrics endpoint	`curl http://localhost:5678/metrics` (requires `N8N_METRICS=true`)	Prometheus metrics are returned without 5xx
Backup integrity	Restore a random workflow from `workflows.json`	Workflow appears unchanged in the UI

EEFA – If any step fails, roll back immediately:

# Helm (omitting the revision rolls back to the previous release)
helm rollback n8n-release

# Docker‑Compose (blue‑green)
docker compose up -d n8n-blue && \
docker compose stop n8n-green && \
docker compose rm -f n8n-green
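The HTTP checks from the table can be chained so the first failure stops the run and rollback can start right away. This is a minimal sketch: `run_checks`, `check_health`, and `check_metrics` are hypothetical names, and the URLs assume the default port used throughout this guide.

```shell
# Run each named check in order; stop at the first failure.
run_checks() {
  local check
  for check in "$@"; do
    if "$check"; then
      echo "PASS $check"
    else
      echo "FAIL $check" >&2
      return 1
    fi
  done
}

# Individual checks: -f makes curl exit non-zero on HTTP errors (4xx/5xx).
check_health()  { curl -fsS --max-time 10 http://localhost:5678/healthz > /dev/null; }
check_metrics() { curl -fsS --max-time 10 http://localhost:5678/metrics > /dev/null; }

# Usage (not run automatically):
#   run_checks check_health check_metrics || echo "start rollback"
```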

6. Frequently Asked “Zero‑Downtime” Scenarios

Scenario	Root Cause	Fix (Zero‑Downtime)
Long‑running workflow aborts during upgrade	Container receives SIGTERM, then SIGKILL once the grace period expires	Set `terminationGracePeriodSeconds: 300` in the pod spec so n8n has time to finish in‑flight executions before exiting; executions that outlast the grace period are still killed
DB migration blocks new connections	Migration script holds exclusive locks	Run migration as a pre‑upgrade Job on a separate pod, then scale the app back up
Custom node binary incompatibility	New n8n release upgrades Node.js version	Re‑build custom nodes with the same Node.js version (`node:18-alpine`) before upgrade; validate in staging
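For the custom‑node row, comparing Node.js major versions between the build image and the target n8n image catches the mismatch before deployment. This is a sketch; `node_major` and `same_node_major` are hypothetical helpers, and overriding the image entrypoint in the usage line is an assumption about how the image is started.

```shell
# Extract the major version from a string such as "v18.17.0" or "18.17.0".
node_major() {
  local v="${1#v}"
  echo "${v%%.*}"
}

# Succeed only when both version strings share the same major version.
same_node_major() {
  [ "$(node_major "$1")" = "$(node_major "$2")" ]
}

# Usage (not run automatically):
#   target="$(docker run --rm --entrypoint node n8nio/n8n:0.237.0 --version)"
#   build="$(docker run --rm node:18-alpine node --version)"
#   same_node_major "$target" "$build" || echo "rebuild custom nodes first"
```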

Zero‑downtime n8n upgrade checklist

1. Backup DB & workflows.
2. Deploy the new version alongside the old one (blue‑green) or run a rolling update (Docker‑Swarm/K8s).
3. Verify health via /healthz.
4. Switch traffic to the new instance (proxy reload or Kubernetes rollout).
5. Keep the old instance for 30 min, then retire it.

Follow the detailed steps above for Docker‑Compose, Docker‑Swarm, or Kubernetes to ensure a seamless, production‑grade upgrade.
