Stop 5 Kubernetes Deployment Errors in n8n Queue Mode

A step-by-step guide to solving n8n queue-mode Kubernetes deployment errors

 


Who this is for: Kubernetes operators and DevOps engineers who run n8n in production and need a reliable, zero-downtime queue-mode deployment.


Quick Diagnosis

| Symptom | Most-likely cause | One-line fix |
| --- | --- | --- |
| Pods stay `Pending` or `CrashLoopBackOff` | `N8N_QUEUE_MODE=true` but `EXECUTIONS_PROCESS=main` | Set `EXECUTIONS_PROCESS=queue` and add a dedicated worker deployment. |
| Workers never pick up jobs | Redis service name/port mismatch | Align `N8N_REDIS_HOST` and `N8N_REDIS_PORT` with the actual Redis Service. |
| Workers are OOM-killed | CPU/memory limits too low for queue processing | Raise `resources.limits` to ≥ 500 MiB memory and ≥ 250 m CPU (adjust per load). |
| Liveness probe fails repeatedly | Probe timeout shorter than job start-up time | Increase `initialDelaySeconds` to 30–45 s and `periodSeconds` to 15 s. |
| RBAC errors in logs (`Forbidden…`) | ServiceAccount missing `get`, `list` on ConfigMaps/Secrets | Add a Role/RoleBinding that grants `configmaps` and `secrets` access. |

Apply the step‑by‑step remediation workflow below to resolve any of the above errors in a production‑grade Kubernetes cluster.


1. Why a Separate Worker Deployment Matters


When N8N_QUEUE_MODE=true, the web server only enqueues execution payloads. A worker pod (or a set of workers) pulls jobs from Redis and runs them. Combining both roles in a single pod works for tiny workloads but fails under load.

| Issue when combined | Impact |
| --- | --- |
| Resource contention | Web-server memory spikes kill workers |
| Pod restarts affect all traffic | A worker crash restarts the web container too |
| Horizontal scaling | `replicas` affects both roles simultaneously |

Best practice: deploy n8n-web and n8n-worker as independent Deployments. Run at least **two worker replicas** behind a PodDisruptionBudget for zero-downtime processing.
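A minimal PodDisruptionBudget for the workers might look like this (a sketch; it assumes the worker pods carry the label `app: n8n-worker`, as in the worker Deployment later in this guide):

```yaml
# Keep at least one n8n worker alive during voluntary
# disruptions (node drains, cluster upgrades).
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: n8n-worker-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: n8n-worker
```

With two replicas and `minAvailable: 1`, a node drain can evict only one worker at a time, so the queue is always being drained.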


2. Misconfiguration #1 – Wrong EXECUTIONS_PROCESS Value

What happens?

If EXECUTIONS_PROCESS stays main while N8N_QUEUE_MODE=true, the web pod still tries to execute jobs locally, causing duplicate execution errors and rapid OOM kills.

Required environment variables

| Variable | Value |
| --- | --- |
| `N8N_QUEUE_MODE` | `true` |
| `EXECUTIONS_PROCESS` | `queue` |
| `EXECUTIONS_WORKER_COUNT` | `1` (or higher) |
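The shared settings can be kept in one place with a ConfigMap and `envFrom` (a sketch; the name `n8n-queue-env` is illustrative, and `EXECUTIONS_PROCESS` stays a per-Deployment `env` entry because web and worker need different values):

```yaml
# Shared queue-mode settings for both Deployments.
apiVersion: v1
kind: ConfigMap
metadata:
  name: n8n-queue-env        # illustrative name
data:
  N8N_QUEUE_MODE: "true"
  N8N_REDIS_HOST: "n8n-redis"
  N8N_REDIS_PORT: "6379"
# In each container spec, load the map and set the role explicitly;
# explicit env entries take precedence over envFrom:
#   envFrom:
#     - configMapRef:
#         name: n8n-queue-env
#   env:
#     - name: EXECUTIONS_PROCESS
#       value: "queue"        # "worker" in the worker Deployment
```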

Web deployment – part 1 (metadata & selector)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: n8n-web
spec:
  replicas: 1
  selector:
    matchLabels:
      app: n8n-web

Web deployment – part 2 (pod template)

  template:
    metadata:
      labels:
        app: n8n-web
    spec:
      serviceAccountName: n8n-sa

Web deployment – part 3 (container & env)

      containers:
        - name: n8n
          image: n8nio/n8n:latest
          env:
            - name: N8N_QUEUE_MODE
              value: "true"
            - name: EXECUTIONS_PROCESS
              value: "queue"
            - name: N8N_REDIS_HOST
              value: "n8n-redis"
            - name: N8N_REDIS_PORT
              value: "6379"

Web deployment – part 4 (ports & resources)

          ports:
            - containerPort: 5678
          resources:
            limits:
              memory: "512Mi"
              cpu: "250m"

EEFA warning: Never set EXECUTIONS_PROCESS=main in a pod where N8N_QUEUE_MODE=true. The conflict generates “Execution already in progress” errors that are hard to debug.


3. Misconfiguration #2 – Redis Service Not Reachable

Typical symptom

[2023-10-01 12:34:56] Error: connect ECONNREFUSED 10.96.0.12:6379

Common root causes

| Cause | Typical mistake |
| --- | --- |
| Service name typo | `N8N_REDIS_HOST=n8n-redis-svc` while the Service is `n8n-redis` |
| Port mismatch | Redis runs on 6380 (TLS) but the pod uses the default 6379 |
| Namespace mismatch | Redis Service lives in the `infra` namespace, the web pod in `default` |

Create a ClusterIP Redis Service (same namespace)

apiVersion: v1
kind: Service
metadata:
  name: n8n-redis
spec:
  selector:
    app: n8n-redis
  ports:
    - port: 6379
      targetPort: 6379
      protocol: TCP

Cross‑namespace reference (if Redis is elsewhere)

- name: N8N_REDIS_HOST
  value: "n8n-redis.infra.svc.cluster.local"
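Alternatively, an ExternalName Service in the pods' namespace can alias the remote Redis Service, so `N8N_REDIS_HOST` can keep the short name (a sketch, assuming Redis lives in the `infra` namespace and the n8n pods in `default`):

```yaml
# DNS alias: resolving n8n-redis in this namespace returns a
# CNAME to the Redis Service in the infra namespace.
apiVersion: v1
kind: Service
metadata:
  name: n8n-redis            # short name the n8n pods resolve
  namespace: default         # namespace of the n8n pods
spec:
  type: ExternalName
  externalName: n8n-redis.infra.svc.cluster.local
```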

EEFA tip: Enable Redis TLS in production (N8N_REDIS_TLS=true) and mount the CA cert as a secret. Reference it with N8N_REDIS_TLS_CA_CERT.


4. Misconfiguration #3 – Resource Limits Trigger OOM Kills

Why it matters

Queue workers often need more memory than the web container because they load external APIs, run heavy transformations, and keep large payloads in memory.

Recommended CPU resources

| Container | CPU request | CPU limit |
| --- | --- | --- |
| n8n-web | 100m | 250m |
| n8n-worker | 250m | 500m |

Recommended Memory resources

| Container | Memory request | Memory limit |
| --- | --- | --- |
| n8n-web | 256Mi | 512Mi |
| n8n-worker | 512Mi | 1Gi |

Worker deployment – part 1 (metadata & replicas)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: n8n-worker
spec:
  replicas: 2
  selector:
    matchLabels:
      app: n8n-worker

Worker deployment – part 2 (pod template)

  template:
    metadata:
      labels:
        app: n8n-worker
    spec:
      serviceAccountName: n8n-sa

Worker deployment – part 3 (container, env & resources)

      containers:
        - name: n8n
          image: n8nio/n8n:latest
          env:
            - name: N8N_QUEUE_MODE
              value: "true"
            - name: EXECUTIONS_PROCESS
              value: "worker"
            - name: N8N_REDIS_HOST
              value: "n8n-redis"
          resources:
            requests:
              cpu: "250m"
              memory: "512Mi"
            limits:
              cpu: "500m"
              memory: "1Gi"

HorizontalPodAutoscaler for workers

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: n8n-worker-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: n8n-worker
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

EEFA caution: A memory limit lower than the worker’s peak usage triggers OOMKilled events that appear as “CrashLoopBackOff”. Monitor container_memory_working_set_bytes in Prometheus and raise the limit before hitting the threshold.
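With the Prometheus Operator installed, an alert along these lines can fire before the limit is reached (a sketch; the 90 % threshold and the label selectors are illustrative and should match your deployment):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: n8n-worker-memory
spec:
  groups:
    - name: n8n-worker
      rules:
        - alert: N8nWorkerMemoryHigh
          # Working-set memory above 90% of the 1Gi limit for 5 minutes.
          expr: |
            container_memory_working_set_bytes{pod=~"n8n-worker-.*", container="n8n"}
              > 0.9 * 1073741824
          for: 5m
          labels:
            severity: warning
```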


5. Misconfiguration #4 – Probes Too Aggressive

Typical failure

Readiness probe failed: Get http://10.244.1.5:5678/healthz: net/http: request canceled (Client.Timeout exceeded while awaiting headers)

Reason

Queue workers need a few seconds to bootstrap the Redis client and load the execution queue. A probe that starts at 5s with a 2s timeout kills the pod before it is ready.

Proven probe configuration

livenessProbe:
  httpGet:
    path: /healthz
    port: 5678
  initialDelaySeconds: 30
  periodSeconds: 15
  timeoutSeconds: 5
readinessProbe:
  httpGet:
    path: /healthz
    port: 5678
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 5

EEFA note: If your worker image does not expose /healthz, replace the HTTP probe with a TCP check on the Redis port or an exec probe that runs pgrep -f "worker".
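Those two alternatives could look like this (sketches; adjust the host, port, and process pattern to your image):

```yaml
# Option 1: TCP probe. host defaults to the pod's own IP, so it
# is pointed at the Redis Service to verify queue connectivity.
livenessProbe:
  tcpSocket:
    host: n8n-redis
    port: 6379
  initialDelaySeconds: 30
  periodSeconds: 15

# Option 2: exec probe. Succeeds while a worker process is running.
livenessProbe:
  exec:
    command: ["pgrep", "-f", "worker"]
  initialDelaySeconds: 30
  periodSeconds: 15
```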


6. Misconfiguration #5 – Insufficient RBAC for ConfigMaps & Secrets

Error snippet

secrets "n8n-redis-secret" is forbidden: User "system:serviceaccount:default:n8n-sa" cannot get resource "secrets" in API group "" in the namespace "default"

The underlying cause is often the worker’s ServiceAccount lacking permission to read the Redis credentials stored in a Secret.

Minimal RBAC objects (split for readability)

ServiceAccount

apiVersion: v1
kind: ServiceAccount
metadata:
  name: n8n-sa

Role (granting ConfigMap & Secret read)

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: n8n-role
rules:
  - apiGroups: [""]
    resources: ["configmaps", "secrets"]
    verbs: ["get", "list"]

RoleBinding (attach role to ServiceAccount)

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: n8n-rb
subjects:
  - kind: ServiceAccount
    name: n8n-sa
roleRef:
  kind: Role
  name: n8n-role
  apiGroup: rbac.authorization.k8s.io

Attach serviceAccountName: n8n-sa to both the web and worker pods.

EEFA reminder: For clusters using PodSecurityPolicies or OPA Gatekeeper, ensure the ServiceAccount is allowed to run as UID 1000 (the default n8n UID).
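Pinning the UID explicitly in both pod specs makes this visible to such policy engines (a sketch; 1000 is the default n8n user):

```yaml
spec:
  securityContext:
    runAsUser: 1000        # default n8n UID
    runAsNonRoot: true
  containers:
    - name: n8n
      image: n8nio/n8n:latest
```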


7. Diagnostic Checklist – Quick Copy‑Paste for On‑Call Engineers

| Check | Command / Manifest | Expected result |
| --- | --- | --- |
| Queue-mode env vars set correctly | `kubectl exec -ti <web-pod> -- printenv \| grep -E 'N8N_\|EXECUTIONS_'` | `N8N_QUEUE_MODE=true` and `EXECUTIONS_PROCESS=queue` |
| Redis reachable from pod | `kubectl exec -ti <worker-pod> -- nc -zv n8n-redis 6379` | `Connection to n8n-redis 6379 port [tcp/*] succeeded!` |
| Worker pod has enough memory | `kubectl top pod <worker-pod>` | Reported memory stays below `resources.limits.memory` |
| Probes are not failing | `kubectl describe pod <worker-pod>` | No liveness/readiness failures in events |
| ServiceAccount can read the secret | `kubectl auth can-i get secret/n8n-redis-secret --as=system:serviceaccount:default:n8n-sa` | `yes` |

8. Step‑by‑Step Remediation Workflow

  1. Validate environment variables – Run the env‑check command above. Edit the Deployment if any variable is missing or wrong, then kubectl apply -f <file>.yaml.
  2. Test Redis connectivity – Use nc or redis-cli. If unreachable, verify Service name, namespace, and DNS (nslookup n8n-redis).
  3. Adjust resources – Increase resources.limits in the worker Deployment, then kubectl rollout restart deployment/n8n-worker.
  4. Tune probes – Apply the probe snippet, then kubectl apply -f <probe-file>.yaml. Wait for the pod to become Ready.
  5. Apply RBAC – Deploy the ServiceAccount, Role, and RoleBinding. Re‑attach the ServiceAccount to the pods if not already.
  6. Scale workers – Once the pod is stable, use the HPA or manually increase replicas. Verify jobs are processed via the n8n UI (Executions → Queue).

Conclusion

The most common Kubernetes deployment errors in n8n queue mode stem from misaligned environment variables, unreachable Redis, insufficient resources, over‑eager health probes, and missing RBAC. By separating the web and worker roles, configuring the correct EXECUTIONS_PROCESS, ensuring Redis connectivity, provisioning adequate CPU/memory, tuning probes, and granting the proper permissions, you create a resilient, horizontally‑scalable queue‑mode deployment that handles production workloads without unexpected restarts. Apply the checklist and remediation workflow above, monitor the pods, and the n8n queue will run smoothly in any Kubernetes environment.
