
Building a Production-Grade N8N Infrastructure on 16GB RAM

📅 December 2025 🏷️ N8N • Infrastructure 👤 Muhammad Nawaz

The Challenge

N8N is a powerful workflow automation platform, but scaling it to handle heavy production workloads while maintaining a reasonable infrastructure budget is challenging. This is the story of how I architected a high-performance N8N deployment that handles complex Python and JavaScript workloads within the constraints of a single 16GB server.

Architecture Overview

The infrastructure consists of multiple specialized components working together in a queue-based architecture:

┌──────────────────────────────────────────────────────────┐
│                    Client Requests                       │
└──────────────────────┬───────────────────────────────────┘
                       │
                       ▼
               ┌─────────────────┐
               │  Caddy (Proxy)  │
               │   SSL/TLS       │
               └────────┬────────┘
                        │
           ┌────────────┼────────────┐
           │            │            │
           ▼            ▼            ▼
     ┌─────────┐  ┌──────────┐  ┌──────────┐
     │ N8N Main│  │ Webhook  │  │ Webhook  │
     │ UI/API  │  │ Handler  │  │ Handler  │
     └────┬────┘  └──────────┘  └──────────┘
          │
          │ Publishes jobs to
          ▼
     ┌──────────────────┐
     │  Redis Queue     │
     │  (Bull Queue)    │
     └────────┬─────────┘
              │
        ┌─────┴─────┐
        │           │
        ▼           ▼
    ┌────────┐  ┌────────┐
    │Worker-1│  │Worker-2│
    │  15cc  │  │  15cc  │
    └───┬────┘  └───┬────┘
        │           │
        │ Delegates │
        ▼           ▼
    ┌────────┐  ┌────────┐
    │Task    │  │Task    │
    │Runners │  │Runners │
    │Python  │  │Python  │
    │& JS    │  │& JS    │
    └────────┘  └────────┘

Key Design Decisions

1. External Task Runners for Heavy Workloads

One of the most critical architectural decisions was implementing external task runners. N8N supports executing Python and JavaScript code, but running these in the main Node.js process poses security and performance risks.

The Solution: run user code in dedicated task-runner containers that connect back to each N8N process over the task broker port (5679), instead of executing it inside the main Node.js event loop.

Benefits:

- Process isolation: a crashing or runaway script cannot take down the N8N process itself
- Each runner gets its own memory limit and can be restarted independently
- Python dependencies (Pandas, yt-dlp, etc.) live in the runner image, keeping the N8N image lean

2. Dedicated Task Runners Per Worker

Instead of sharing a pool of task runners, each worker has dedicated task runners. This was a deliberate choice for heavy workload scenarios.

worker-1 (2.5GB)
  ├── task-runner-worker-1-1 (1.5GB)
  └── task-runner-worker-1-2 (1.5GB)

worker-2 (2.5GB)
  ├── task-runner-worker-2-1 (1.5GB)
  └── task-runner-worker-2-2 (1.5GB)

Why This Matters:

- No contention: one worker's heavy job cannot starve the other worker's runners
- Fault isolation: an OOM-killed runner affects only its own worker
- Easier debugging: logs and metrics map one-to-one from worker to runner

3. Queue-Based Architecture

The infrastructure uses Bull Queue (Redis-backed) for job distribution:

Manual Execution (UI) → Main Process → task-runner-main
Production Jobs → Redis Queue → Workers → task-runner-worker-X
Webhooks → Webhook Process → Redis Queue → Workers
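
This split can be sketched in docker-compose roughly as follows. Service and image names are illustrative; the environment variables are n8n's standard queue-mode settings, and `worker`/`webhook` are the commands the n8n image accepts for the respective process types:

```yaml
services:
  n8n-main:
    image: n8n-ffmpeg:latest           # custom image (illustrative name)
    environment:
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis    # Bull queue backend
  worker-1:
    image: n8n-ffmpeg:latest
    command: worker                    # consumes jobs from the queue
    environment:
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
  webhook-1:
    image: n8n-ffmpeg:latest
    command: webhook                   # accepts webhooks, enqueues jobs
    environment:
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
  redis:
    image: redis:7-alpine
```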

This separation ensures:

- The UI stays responsive even when workers are saturated
- Webhooks are acknowledged quickly and queued rather than executed inline
- Workers can be scaled independently of the main process

Memory Optimization Strategy

Fitting a production-grade setup into 16GB required careful resource allocation:

Component               Count  Memory/Instance  Total   Strategy
N8N Main                1      4GB              4GB     UI/API - needs headroom for workflow editing
Workers                 2      2.5GB            5GB     Queue consumers - right-sized for 15 concurrent jobs
Task Runners (Main)     2      1.5GB            3GB     Manual executions - lower load
Task Runners (Workers)  4      1.5GB            6GB     Production - most critical
Webhooks                2      1GB              2GB     Stateless handlers
Caddy                   1      256MB            256MB   Lightweight proxy
Prometheus              1      512MB            512MB   Monitoring with 7-day retention
Total                   13     -                ~21GB   With soft limits & reservations

How it fits in 16GB:

- The ~21GB figure is the sum of limits, not reservations; components rarely peak at the same time
- Memory reservations guarantee each service a floor, while unused headroom stays available to others
- Idle task runners shut down automatically (see Lessons Learned), returning their memory to the pool
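
In compose, the limit/reservation pair for one worker might look like this. The 2.5GB limit comes from the allocation table; the 1GB reservation is an illustrative value:

```yaml
worker-1:
  deploy:
    resources:
      limits:
        memory: 2.5G     # hard ceiling - prevents host-wide OOM
      reservations:
        memory: 1G       # guaranteed floor (illustrative value)
```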

Docker Compose Configuration Highlights

Custom N8N Image with FFmpeg

FROM n8nio/n8n:latest

USER root
RUN apk add --no-cache ffmpeg
USER node

Simple and effective - adds video processing capabilities without bloat.

Task Runner Configuration

The task runners use a custom configuration file (n8n-task-runners.json) that defines:

{
  "task-runners": [
    {
      "runner-type": "javascript",
      "command": "/usr/local/bin/node",
      "args": ["--disallow-code-generation-from-strings"],
      "env-overrides": {
        "NODE_FUNCTION_ALLOW_BUILTIN": "",
        "NODE_FUNCTION_ALLOW_EXTERNAL": ""
      }
    },
    {
      "runner-type": "python",
      "command": "/opt/runners/task-runner-python/.venv/bin/python",
      "env-overrides": {
        "N8N_RUNNERS_EXTERNAL_ALLOW": "numpy,pandas,requests,yt-dlp"
      }
    }
  ]
}

This ensures:

- JavaScript runs with code generation from strings (eval, new Function) disabled, and with empty builtin/external module allow-lists by default
- Python executes in a dedicated virtualenv and can only import the explicitly allow-listed packages (numpy, pandas, requests, yt-dlp)

Worker Configuration with Task Runner Support

Critical environment variables that enable workers to use task runners:

worker-1:
  environment:
    - EXECUTIONS_MODE=queue
    - N8N_RUNNERS_ENABLED=true
    - N8N_RUNNERS_MODE=external
    - N8N_RUNNERS_BROKER_LISTEN_ADDRESS=0.0.0.0
    - N8N_RUNNERS_MAX_CONCURRENCY=5

Without these, workers would fail when encountering Python/JS code nodes.
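
The matching runner side is a separate container pointed at the worker's broker. Service and image names below are illustrative; the broker URI and auth token variables are the ones used elsewhere in this setup:

```yaml
task-runner-worker-1-1:
  image: n8n-task-runner:latest                        # custom runner image (illustrative name)
  environment:
    - N8N_RUNNERS_TASK_BROKER_URI=http://worker-1:5679 # must match the worker's broker address
    - N8N_RUNNERS_AUTH_TOKEN=${N8N_RUNNERS_AUTH_TOKEN} # shared secret from .env
  depends_on:
    - worker-1
```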

Performance Characteristics

Throughput

With two workers each sized for 15 concurrent executions, the system can run up to 30 production jobs in parallel and processes thousands of workflow executions daily.

Load Distribution

Manual/UI Load → task-runner-main (sporadic, low volume)
Production Load → task-runner-worker-1 & worker-2 (continuous, high volume)
Webhook Traffic → webhook processes → Queue → Workers

Resource Utilization Under Load

Because limits exceed typical usage and component peaks are staggered, the ~21GB of nominal allocations coexist on 16GB of physical RAM; actual usage stays within bounds as long as heavy Pandas jobs and video downloads do not all spike simultaneously.

Monitoring and Observability

Prometheus Integration

Each N8N process exposes Prometheus metrics on its own endpoint, and giving every process a unique metric prefix keeps main, worker, and webhook series distinguishable in queries and dashboards.
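
A prometheus.yml scrape configuration along these lines collects all of them. Job names and targets are illustrative, and assume metrics are served on each process's main port:

```yaml
scrape_configs:
  - job_name: n8n-main
    static_configs:
      - targets: ['n8n-main:5678']
  - job_name: n8n-workers
    static_configs:
      - targets: ['worker-1:5678', 'worker-2:5678']
  - job_name: n8n-webhooks
    static_configs:
      - targets: ['webhook-1:5678', 'webhook-2:5678']
```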

Health Checks

Every critical component has health checks:

healthcheck:
  test: ["CMD", "wget", "--spider", "http://localhost:5678/healthz"]
  interval: 30s
  timeout: 5s
  retries: 3

This ensures:

- A failed component is detected within roughly 90 seconds (30s interval × 3 retries)
- Docker restarts unhealthy containers automatically
- Dependent services can gate their startup on healthy upstreams
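
Health checks also let dependent services wait for their upstream before starting; for example, a task runner can wait for its worker (sketch, service names illustrative):

```yaml
task-runner-worker-1-1:
  depends_on:
    worker-1:
      condition: service_healthy   # start only after the worker passes its healthcheck
```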

Common Pitfalls Avoided

1. Task Runner Connection Mismatch

Problem: With docker-compose replicas, containers come up as worker_1, worker_2, etc. behind a single round-robin service name, so a dedicated task runner cannot be pinned to one specific worker's broker. Defining worker-1 and worker-2 as explicit services gives each runner a stable hostname.

Wrong:

task-runner-worker:
  environment:
    - N8N_RUNNERS_TASK_BROKER_URI=http://worker:5679  # Fails!

Correct:

task-runner-worker-1:
  environment:
    - N8N_RUNNERS_TASK_BROKER_URI=http://worker-1:5679  # Works!

2. Workers Without Task Runner Configuration

Workers must explicitly enable task runners:

- N8N_RUNNERS_ENABLED=true
- N8N_RUNNERS_MODE=external

Without these, Python/JS nodes will fail silently.

3. Insufficient Memory for Task Runners

Initial allocation of 512MB per task runner caused OOM kills during Pandas operations. Increasing it to 1.5GB provides headroom for:

- Large Pandas DataFrames held in memory during transformations
- yt-dlp download and muxing buffers
- Python interpreter and imported-library overhead

Deployment and Scaling

Initial Deployment

# Build custom images
docker-compose build --no-cache

# Start infrastructure
docker-compose up -d

# Verify all services
docker-compose ps
docker stats

Horizontal Scaling (Future)

The architecture supports scaling to multiple servers:

  1. Move Redis to dedicated server
  2. Add worker servers pointing to central Redis
  3. Scale task runners based on worker load
  4. Use external load balancer instead of Caddy
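
Step 2 amounts to repointing each worker server's queue settings at the central Redis host. Hostname and password variable below are illustrative:

```yaml
worker-1:
  environment:
    - EXECUTIONS_MODE=queue
    - QUEUE_BULL_REDIS_HOST=redis.internal.example   # central Redis server (illustrative)
    - QUEUE_BULL_REDIS_PORT=6379
    - QUEUE_BULL_REDIS_PASSWORD=${REDIS_PASSWORD}    # secret kept in .env
```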

Vertical Scaling

If upgrading to 32GB RAM:

- Add a third and fourth worker, each with its own dedicated task runners
- Raise per-runner memory limits for heavier Pandas workloads
- Extend Prometheus retention beyond 7 days

Security Considerations

Network Isolation

All services communicate over internal Docker network. Only Caddy exposes ports externally.

Code Execution Sandboxing

User code never runs inside an N8N process. JavaScript runners start Node with --disallow-code-generation-from-strings and empty module allow-lists; Python runners are confined to a dedicated virtualenv with an explicit package allow-list. A misbehaving script is contained within its own memory-limited container.

Secrets Management

Sensitive values stored in .env file:

N8N_ENCRYPTION_KEY=your-encryption-key
N8N_RUNNERS_AUTH_TOKEN=your-auth-token

Never committed to version control.
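
In compose, the file is pulled in per service with env_file, and a .gitignore entry keeps it out of the repo (sketch):

```yaml
# docker-compose.yml (fragment)
n8n-main:
  env_file:
    - .env    # holds N8N_ENCRYPTION_KEY, N8N_RUNNERS_AUTH_TOKEN, etc.
```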

Lessons Learned

1. Memory Limits vs Reservations

Using both provides the best of both worlds:

- Reservations guarantee each service a minimum, so critical components never starve
- Limits cap runaway processes before they can trigger a host-wide OOM
- Unreserved memory stays available to whichever service needs it at the moment

2. Log Levels Matter

Changing from info to warn reduced disk I/O by ~40% and saved ~200MB memory.

3. Health Check Intervals

Initial 10s intervals created unnecessary overhead. 30s is sufficient for production.

4. Task Runner Auto-Shutdown

Setting N8N_RUNNERS_AUTO_SHUTDOWN_TIMEOUT=300 allows idle runners to shut down, freeing memory.

Conclusion

Building a production-grade N8N infrastructure on limited resources requires careful architectural decisions and resource optimization. The key takeaways:

  1. External task runners are essential for heavy Python/JS workloads
  2. Dedicated task runners per worker provide better isolation and debugging
  3. Queue-based architecture enables scaling and resilience
  4. Memory optimization through limits, reservations, and right-sizing
  5. Comprehensive monitoring is critical for production stability

This infrastructure successfully handles production workloads processing thousands of workflow executions daily, with complex data transformations using Pandas, video downloads with yt-dlp, and custom JavaScript logic, all within a 16GB constraint.

The architecture is battle-tested, cost-effective, and ready to scale when needed.


Resources

Full configuration available on request.