
Scaling n8n for Production: Performance Optimization [2025]

House of Loops Team | October 10, 2025 | 13 min read

Moving n8n from development to production is a significant step. This guide takes you beyond basic deployment to show you how to build a high-performance n8n infrastructure that handles millions of executions, maintains sub-second response times, and scales horizontally to meet growing demand.

Understanding n8n Performance Characteristics

Before optimizing, understand how n8n performs under load:

Performance Metrics That Matter

  1. Throughput: Workflows executed per minute
  2. Latency: Time from trigger to completion
  3. Queue depth: Number of pending executions
  4. Resource utilization: CPU, memory, and I/O usage
  5. Error rate: Failed executions per total executions
  6. Worker efficiency: Successful completions per worker
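
Several of these metrics can be derived from plain execution records. The following is a minimal sketch, assuming each record carries `startedAt` and `stoppedAt` timestamps in milliseconds and a `status` field; that record shape and the function name are illustrative, not n8n's API.

```javascript
// Summarize a window of execution records into throughput, error rate and latency.
// The record shape ({ startedAt, stoppedAt, status }) is an assumption for this sketch.
function summarizeExecutions(executions, windowMinutes) {
  const durations = executions
    .map(e => e.stoppedAt - e.startedAt)
    .sort((a, b) => a - b);
  const failed = executions.filter(e => e.status === 'error').length;

  // Nearest-rank percentile: smallest sample with at least p% of samples at or below it
  const percentile = p => {
    if (durations.length === 0) return null;
    const rank = Math.ceil((p / 100) * durations.length);
    return durations[Math.max(rank, 1) - 1];
  };

  return {
    throughputPerMinute: executions.length / windowMinutes,
    errorRate: executions.length === 0 ? 0 : failed / executions.length,
    p50LatencyMs: percentile(50),
    p95LatencyMs: percentile(95),
  };
}
```

Tracking a high percentile alongside the median matters because a handful of slow workflows can hide behind a healthy average.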

Common Performance Bottlenecks

┌─────────────────────────────────────────────────────────┐
│              Performance Bottleneck Analysis             │
│                                                          │
│  Database                                                │
│  ████████████████████░░░░░  65%  ← Primary bottleneck  │
│                                                          │
│  Redis Queue                                             │
│  ██████████░░░░░░░░░░░░░░░  35%                         │
│                                                          │
│  Worker CPU                                              │
│  ███████░░░░░░░░░░░░░░░░░░  25%                         │
│                                                          │
│  Network I/O                                             │
│  █████░░░░░░░░░░░░░░░░░░░░  20%                         │
│                                                          │
│  Memory                                                  │
│  ███░░░░░░░░░░░░░░░░░░░░░░  15%                         │
└─────────────────────────────────────────────────────────┘

Architecture for Scale

Horizontal Scaling Strategy

# docker-compose.production.yml
version: '3.8'

services:
  # Main instances for handling webhooks and UI
  n8n-main:
    image: n8nio/n8n:latest
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: '2'
          memory: 4G
        reservations:
          cpus: '1'
          memory: 2G
      placement:
        constraints:
          - node.role == worker
        preferences:
          - spread: node.labels.zone
    environment:
      - N8N_ENCRYPTION_KEY=${N8N_ENCRYPTION_KEY}
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres-primary.cluster.local
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_DATABASE=${POSTGRES_DB}
      - DB_POSTGRESDB_USER=${POSTGRES_USER}
      - DB_POSTGRESDB_PASSWORD=${POSTGRES_PASSWORD}
      - DB_POSTGRESDB_POOL_SIZE=50
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis-cluster.default.svc.cluster.local
      - QUEUE_BULL_REDIS_PORT=6379
      - QUEUE_BULL_REDIS_DB=0
      - QUEUE_HEALTH_CHECK_ACTIVE=true
      - N8N_METRICS=true
      - N8N_LOG_LEVEL=warn
      - N8N_LOG_OUTPUT=console
      - N8N_LOG_FORMAT=json
      - NODE_OPTIONS=--max-old-space-size=3072
      - WEBHOOK_URL=https://automation.company.com
    healthcheck:
      test: ['CMD', 'wget', '--spider', '-q', 'http://localhost:5678/healthz']
      interval: 30s
      timeout: 10s
      retries: 3

  # Worker pool for executing workflows
  n8n-worker-standard:
    image: n8nio/n8n:latest
    deploy:
      replicas: 20
      resources:
        limits:
          cpus: '2'
          memory: 4G
      update_config:
        parallelism: 5
        delay: 10s
    environment:
      - N8N_ENCRYPTION_KEY=${N8N_ENCRYPTION_KEY}
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres-primary.cluster.local
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_DATABASE=${POSTGRES_DB}
      - DB_POSTGRESDB_USER=${POSTGRES_USER}
      - DB_POSTGRESDB_PASSWORD=${POSTGRES_PASSWORD}
      - DB_POSTGRESDB_POOL_SIZE=20
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis-cluster.default.svc.cluster.local
      - QUEUE_BULL_REDIS_PORT=6379
      - QUEUE_WORKER_ID=${HOSTNAME}
      - NODE_OPTIONS=--max-old-space-size=3072
      - EXECUTIONS_DATA_SAVE_ON_SUCCESS=none
      - EXECUTIONS_DATA_SAVE_ON_ERROR=all
      - EXECUTIONS_DATA_MAX_AGE=168 # 7 days
    command: worker

  # High-priority worker pool
  n8n-worker-priority:
    image: n8nio/n8n:latest
    deploy:
      replicas: 5
      resources:
        limits:
          cpus: '4'
          memory: 8G
    environment:
      - N8N_ENCRYPTION_KEY=${N8N_ENCRYPTION_KEY}
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres-primary.cluster.local
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_DATABASE=${POSTGRES_DB}
      - DB_POSTGRESDB_USER=${POSTGRES_USER}
      - DB_POSTGRESDB_PASSWORD=${POSTGRES_PASSWORD}
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis-cluster.default.svc.cluster.local
      - QUEUE_BULL_REDIS_PORT=6379
      - QUEUE_WORKER_ID=${HOSTNAME}-priority
      - NODE_OPTIONS=--max-old-space-size=6144
    command: worker

  # PostgreSQL with replication
  postgres-primary:
    image: postgres:15-alpine
    environment:
      - POSTGRES_DB=${POSTGRES_DB}
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      # Memory settings come from postgresql-tuning.conf (mounted below); initdb in PostgreSQL 15 does not accept -c
    volumes:
      - postgres-data:/var/lib/postgresql/data
      - ./postgresql-tuning.conf:/etc/postgresql/postgresql.conf
    command: postgres -c config_file=/etc/postgresql/postgresql.conf
    deploy:
      placement:
        constraints:
          - node.labels.disk == ssd

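  # Redis for queuing
  # A single instance: independent replicas of a standalone redis-server do not
  # share data, so load-balancing across them would split the queue. For high
  # availability, use Redis replication with Sentinel or a real Redis Cluster.
  redis-cluster:
    image: redis:7-alpine
    command: >
      redis-server
      --appendonly yes
      --maxmemory 4gb
      --maxmemory-policy noeviction
      --tcp-backlog 511
      --timeout 0
      --tcp-keepalive 300
    deploy:
      replicas: 1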
      placement:
        preferences:
          - spread: node.labels.zone
    volumes:
      - redis-data:/data

volumes:
  postgres-data:
  redis-data:

Database Optimization

PostgreSQL Performance Tuning

# postgresql-tuning.conf
# Optimized for 32GB RAM, SSD storage

# Connection Settings
max_connections = 200
superuser_reserved_connections = 3

# Memory Settings
shared_buffers = 8GB                    # 25% of RAM
effective_cache_size = 24GB             # 75% of RAM
maintenance_work_mem = 2GB
work_mem = 64MB                         # Per sort/hash operation, not per connection; lower it if memory runs short
wal_buffers = 16MB
random_page_cost = 1.1                  # SSD optimized

# Checkpoint Settings
checkpoint_completion_target = 0.9
wal_compression = on
min_wal_size = 1GB
max_wal_size = 4GB

# Query Planner
default_statistics_target = 500
effective_io_concurrency = 200          # SSD can handle many concurrent I/O

# Parallel Query Execution
max_worker_processes = 16
max_parallel_workers_per_gather = 4
max_parallel_maintenance_workers = 4
max_parallel_workers = 16

# Write Ahead Log
wal_level = replica
max_wal_senders = 10
wal_keep_size = 2GB
synchronous_commit = off                # Faster commits; a crash can lose the last few transactions (no corruption)

# Background Writer
bgwriter_delay = 200ms
bgwriter_lru_maxpages = 100
bgwriter_lru_multiplier = 2.0

# Autovacuum (critical for n8n performance)
autovacuum = on
autovacuum_max_workers = 6
autovacuum_naptime = 10s
autovacuum_vacuum_threshold = 50
autovacuum_vacuum_scale_factor = 0.02   # More aggressive
autovacuum_analyze_threshold = 50
autovacuum_analyze_scale_factor = 0.01

# Logging for performance analysis
log_min_duration_statement = 500        # Log slow queries
log_checkpoints = on
log_connections = off
log_disconnections = off
log_lock_waits = on
log_temp_files = 0                      # Log temp file usage

Critical Indexes

-- Column names below follow n8n's camelCase PostgreSQL schema and must be
-- double-quoted. Confirm them for your n8n version with \d execution_entity,
-- and check \di first: n8n already ships several indexes of its own.
-- CREATE INDEX CONCURRENTLY cannot run inside a transaction block.

-- Execution queries optimization
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_execution_workflow_started
  ON execution_entity("workflowId", "startedAt" DESC NULLS LAST)
  WHERE "stoppedAt" IS NOT NULL;

CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_execution_status_waiting
  ON execution_entity(status, "waitTill")
  WHERE status = 'waiting';

CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_execution_mode_started
  ON execution_entity(mode, "startedAt" DESC)
  WHERE mode = 'webhook';

-- Webhook lookup optimization
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_webhook_path_method
  ON webhook_entity USING hash ("webhookPath");

CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_webhook_workflow
  ON webhook_entity("workflowId", "webhookPath", method);

-- Credentials optimization
-- (No partial predicate: "type" holds the credential type, so a filter on
-- type != 'deleted' would never match the queries n8n issues.)
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_credentials_name
  ON credentials_entity(name);

-- Workflow optimization
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_workflow_active
  ON workflow_entity(active, id) WHERE active = true;

-- Partial index for error tracking
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_execution_errors
  ON execution_entity("workflowId", "stoppedAt" DESC)
  WHERE status = 'error';

Table Partitioning for Execution History

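-- Convert execution_entity to partitioned table
-- This improves query performance and enables efficient archiving.
-- Caution: n8n's migrations and the foreign keys that reference
-- execution_entity(id) are unaware of this table; rehearse on a copy first.

-- Create partitioned table
-- INCLUDING ALL would copy the primary key on (id) alone, which PostgreSQL
-- rejects here: a unique constraint on a partitioned table must include the
-- partition key. Copy defaults only and declare a composite key instead.
CREATE TABLE execution_entity_partitioned (
  LIKE execution_entity INCLUDING DEFAULTS,
  PRIMARY KEY (id, "startedAt")
) PARTITION BY RANGE ("startedAt");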

-- Create monthly partitions
CREATE TABLE execution_entity_2025_01 PARTITION OF execution_entity_partitioned
  FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');

CREATE TABLE execution_entity_2025_02 PARTITION OF execution_entity_partitioned
  FOR VALUES FROM ('2025-02-01') TO ('2025-03-01');

-- Automated partition management function
CREATE OR REPLACE FUNCTION create_monthly_partition()
RETURNS void AS $$
DECLARE
  partition_date DATE;
  partition_name TEXT;
  start_date TEXT;
  end_date TEXT;
BEGIN
  -- Create partition for next month
  partition_date := DATE_TRUNC('month', CURRENT_DATE + INTERVAL '1 month');
  partition_name := 'execution_entity_' || TO_CHAR(partition_date, 'YYYY_MM');
  start_date := TO_CHAR(partition_date, 'YYYY-MM-DD');
  end_date := TO_CHAR(partition_date + INTERVAL '1 month', 'YYYY-MM-DD');

  EXECUTE format(
    'CREATE TABLE IF NOT EXISTS %I PARTITION OF execution_entity_partitioned FOR VALUES FROM (%L) TO (%L)',
    partition_name, start_date, end_date
  );

  -- Create indexes on new partition
  EXECUTE format(
    'CREATE INDEX IF NOT EXISTS %I ON %I ("workflowId", "startedAt" DESC)',
    partition_name || '_workflow_started', partition_name
  );
END;
$$ LANGUAGE plpgsql;

-- Schedule monthly partition creation (cron.schedule requires the pg_cron extension)
SELECT cron.schedule('create-partitions', '0 0 1 * *', 'SELECT create_monthly_partition()');

-- Archive old partitions
CREATE OR REPLACE FUNCTION archive_old_partitions()
RETURNS void AS $$
DECLARE
  partition_name TEXT;
  partition_date DATE;
BEGIN
  FOR partition_name IN
    SELECT tablename FROM pg_tables
    WHERE schemaname = 'public'
      AND tablename LIKE 'execution_entity_2%'
  LOOP
    -- Extract date from partition name
    partition_date := TO_DATE(SUBSTRING(partition_name FROM 18), 'YYYY_MM');

    -- Archive partitions older than 6 months
    IF partition_date < CURRENT_DATE - INTERVAL '6 months' THEN
      -- Export to S3 or archive storage
      EXECUTE format('COPY %I TO PROGRAM ''aws s3 cp - s3://n8n-archives/%s.csv'' WITH CSV HEADER',
        partition_name, partition_name);

      -- Detach partition
      EXECUTE format('ALTER TABLE execution_entity_partitioned DETACH PARTITION %I', partition_name);

      -- Drop old partition
      EXECUTE format('DROP TABLE %I', partition_name);

      RAISE NOTICE 'Archived and dropped partition: %', partition_name;
    END IF;
  END LOOP;
END;
$$ LANGUAGE plpgsql;

-- Schedule monthly archiving
SELECT cron.schedule('archive-partitions', '0 2 1 * *', 'SELECT archive_old_partitions()');
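
The date arithmetic in `create_monthly_partition()` is easy to get wrong around year boundaries. As a sanity check, here is a small sketch that computes the same partition name and bounds outside the database. The function name `nextMonthPartition` is hypothetical, and it works in UTC, which may differ from your database's time zone.

```javascript
// Compute the name and bounds of the partition for the month after `now`,
// mirroring create_monthly_partition(). Works in UTC to avoid local-time drift.
function nextMonthPartition(now = new Date()) {
  // Date.UTC normalizes month overflow, so December + 1 becomes January of the next year
  const start = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth() + 1, 1));
  const end = new Date(Date.UTC(start.getUTCFullYear(), start.getUTCMonth() + 1, 1));

  const isoDate = d => d.toISOString().slice(0, 10);
  const month = String(start.getUTCMonth() + 1).padStart(2, '0');

  return {
    name: `execution_entity_${start.getUTCFullYear()}_${month}`,
    from: isoDate(start), // Inclusive lower bound
    to: isoDate(end), // Exclusive upper bound
  };
}

console.log(nextMonthPartition(new Date('2025-12-15T00:00:00Z')));
```

For a date in December 2025 this yields the partition `execution_entity_2026_01` covering `2026-01-01` up to (but not including) `2026-02-01`.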

Connection Pooling

; PgBouncer configuration for connection pooling
; pgbouncer.ini
[databases]
n8n = host=postgres-primary.cluster.local port=5432 dbname=n8n

[pgbouncer]
listen_port = 6432
listen_addr = *
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 50
min_pool_size = 10
reserve_pool_size = 10
reserve_pool_timeout = 5
max_db_connections = 100
max_user_connections = 100
server_lifetime = 3600
server_idle_timeout = 600
server_connect_timeout = 15
query_timeout = 0
client_idle_timeout = 0
idle_transaction_timeout = 0
log_connections = 0
log_disconnections = 0
log_pooler_errors = 1
stats_period = 60

# Update n8n to use PgBouncer (environment variables, not part of pgbouncer.ini)
DB_POSTGRESDB_HOST=pgbouncer
DB_POSTGRESDB_PORT=6432
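
To see why a pooler is needed here, it helps to add up the connections the compose file can open and compare the total with `max_connections = 200`. The sketch below does that arithmetic. The replica counts and pool sizes are taken from the configuration above, except the pool size of 2 for the priority workers, which is an assumption: that service sets no `DB_POSTGRESDB_POOL_SIZE`, so check the default for your n8n version.

```javascript
// Replica counts and pool sizes as configured in docker-compose.production.yml.
// poolSize: 2 for the priority workers is an assumed default, not a value from the file.
const services = [
  { name: 'n8n-main', replicas: 3, poolSize: 50 },
  { name: 'n8n-worker-standard', replicas: 20, poolSize: 20 },
  { name: 'n8n-worker-priority', replicas: 5, poolSize: 2 },
];

// Compare worst-case connection demand with what PostgreSQL will accept
function connectionBudget(services, maxConnections, reservedConnections) {
  const demand = services.reduce((sum, s) => sum + s.replicas * s.poolSize, 0);
  const available = maxConnections - reservedConnections;
  return { demand, available, fits: demand <= available };
}

const direct = connectionBudget(services, 200, 3);
console.log(direct); // { demand: 560, available: 197, fits: false }

// With PgBouncer, clients connect to the pooler and only max_db_connections reach PostgreSQL
const pgbouncer = { maxClientConn: 1000, maxDbConnections: 100 };
const pooledFits =
  direct.demand <= pgbouncer.maxClientConn && pgbouncer.maxDbConnections <= direct.available;
console.log('fits behind PgBouncer:', pooledFits);
```

Connecting directly, the fleet could ask for far more connections than PostgreSQL allows; behind PgBouncer the same clients share at most 100 server connections.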

Redis Queue Optimization

Redis Configuration

# redis.conf - Optimized for n8n queue

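# Memory
maxmemory 8gb
# Bull keeps job state in Redis; an eviction policy such as allkeys-lru could
# silently drop queued jobs under memory pressure, so disable eviction
maxmemory-policy noeviction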

# Persistence
appendonly yes
appendfsync everysec
no-appendfsync-on-rewrite yes
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

# Replication
repl-diskless-sync yes
repl-diskless-sync-delay 5

# Performance
tcp-backlog 511
timeout 0
tcp-keepalive 300
hz 10

# Slow log
slowlog-log-slower-than 10000
slowlog-max-len 128

# Client output buffer limits
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit replica 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60

# Threading (Redis 6+)
io-threads 4
io-threads-do-reads yes

Advanced Queue Management

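// Custom queue configuration for different workflow types
// Note: n8n creates and manages its own Bull queue in queue mode. This is a
// standalone sketch for a custom dispatcher, not an n8n configuration file.
const Bull = require('bull');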

// Create separate queues for different priorities
const queues = {
  critical: new Bull('n8n-critical', {
    redis: {
      host: process.env.REDIS_HOST,
      port: process.env.REDIS_PORT,
      maxRetriesPerRequest: null,
      enableReadyCheck: false,
    },
    defaultJobOptions: {
      attempts: 5,
      backoff: {
        type: 'exponential',
        delay: 2000,
      },
      timeout: 30000,
      removeOnComplete: {
        age: 3600, // Keep for 1 hour
        count: 1000,
      },
      removeOnFail: {
        age: 86400, // Keep for 24 hours
        count: 5000,
      },
    },
  }),

  standard: new Bull('n8n-standard', {
    redis: {
      host: process.env.REDIS_HOST,
      port: process.env.REDIS_PORT,
    },
    defaultJobOptions: {
      attempts: 3,
      backoff: {
        type: 'exponential',
        delay: 5000,
      },
      timeout: 300000,
      removeOnComplete: {
        age: 300,
        count: 100,
      },
    },
  }),

  batch: new Bull('n8n-batch', {
    redis: {
      host: process.env.REDIS_HOST,
      port: process.env.REDIS_PORT,
    },
    defaultJobOptions: {
      attempts: 2,
      backoff: {
        type: 'fixed',
        delay: 60000,
      },
      timeout: 3600000,
      removeOnComplete: true,
    },
  }),
};

// Dynamic queue selection
const getQueue = workflow => {
  const tags = workflow.tags || [];

  if (tags.includes('critical') || tags.includes('realtime')) {
    return queues.critical;
  }

  if (tags.includes('batch') || tags.includes('bulk')) {
    return queues.batch;
  }

  return queues.standard;
};

// Queue monitoring and auto-scaling
const monitorQueues = async () => {
  for (const [name, queue] of Object.entries(queues)) {
    const waiting = await queue.getWaitingCount();
    const active = await queue.getActiveCount();
    const delayed = await queue.getDelayedCount();

    console.log(`Queue ${name}:`, {
      waiting,
      active,
      delayed,
      total: waiting + active + delayed,
    });

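    // Auto-scale workers based on queue depth.
    // Only the standard queue drives the standard pool; otherwise a quiet
    // critical or batch queue would scale it down while it is still busy.
    // scaleWorkers() is a placeholder for your orchestrator's scaling call.
    if (name === 'standard') {
      if (waiting > 1000) {
        await scaleWorkers('n8n-worker-standard', 'up');
      } else if (waiting < 100 && active < 50) {
        await scaleWorkers('n8n-worker-standard', 'down');
      }
    }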
  }
};

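// Rate limiting per workflow
// Bull does not export a RateLimiter class; its built-in limiter is a per-queue
// option (`limiter: { max, duration }`). This in-process fixed-window counter
// is a simple per-workflow alternative (state is not shared across processes).
const rateLimiters = new Map();

const getRateLimiter = workflowId => {
  if (!rateLimiters.has(workflowId)) {
    rateLimiters.set(workflowId, {
      max: 100, // Max 100 executions
      duration: 60000, // Per minute
      windowStart: Date.now(),
      count: 0,
    });
  }
  return rateLimiters.get(workflowId);
};

// Returns true if the workflow may run now, false if it is over its limit
const tryAcquire = workflowId => {
  const limiter = getRateLimiter(workflowId);
  const now = Date.now();
  if (now - limiter.windowStart >= limiter.duration) {
    limiter.windowStart = now;
    limiter.count = 0;
  }
  if (limiter.count >= limiter.max) return false;
  limiter.count++;
  return true;
};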
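
Scaling decisions are easier to test when they are separated from the code that talks to Redis and the orchestrator. A minimal sketch of that separation follows, using the same thresholds as the monitor above. The replica bounds and step size are assumptions, and `decideScaling` is a hypothetical helper, not part of Bull or n8n.

```javascript
// Pure scaling decision: given queue stats and the current replica count,
// return the replica count to aim for. Bounds and step size are assumed values.
function decideScaling(stats, currentReplicas, options = {}) {
  const {
    scaleUpAt = 1000, // Waiting jobs above this trigger a scale-up
    scaleDownAt = 100, // Waiting jobs below this allow a scale-down...
    idleActive = 50, // ...but only if few jobs are still running
    min = 5,
    max = 50,
    step = 5,
  } = options;

  if (stats.waiting > scaleUpAt) {
    return Math.min(currentReplicas + step, max);
  }
  if (stats.waiting < scaleDownAt && stats.active < idleActive) {
    return Math.max(currentReplicas - step, min);
  }
  return currentReplicas; // Between the thresholds: leave the pool alone
}
```

The gap between the scale-up and scale-down thresholds acts as a dead band, which keeps the pool from flapping when queue depth hovers around a single cut-off.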

Worker Optimization

Node.js Performance Tuning

# Environment variables for worker processes
export NODE_ENV=production

# Memory allocation
export NODE_OPTIONS="--max-old-space-size=3072 --max-semi-space-size=64"

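# V8 optimization flags
# --optimize-for-size is not on the NODE_OPTIONS allow-list (Node.js refuses to
# start with disallowed flags there). If you want it, pass it on the command
# line instead: node --optimize-for-size <script>

# Enable Node.js performance profiling
# Writes /tmp/perf-<pid>.map files for Linux perf; enable only while profiling
export NODE_OPTIONS="$NODE_OPTIONS --perf-basic-prof --perf-basic-prof-only-functions"

# Garbage collection
# --gc-interval is a V8 debugging flag that forces very frequent collections
# and is also rejected in NODE_OPTIONS; leave GC scheduling to V8.
# --expose-gc is only needed if your code calls global.gc().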
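
It is worth confirming that the memory flags actually reached the process, since a typo in `NODE_OPTIONS` is easy to miss. The sketch below reads the effective heap limit through Node's built-in `v8` module and parses the requested value from the environment. The helper names are hypothetical.

```javascript
const v8 = require('v8');

// Report the V8 heap limit the current process is actually running with
function heapLimitMB() {
  return Math.round(v8.getHeapStatistics().heap_size_limit / 1024 / 1024);
}

// Parse --max-old-space-size from a NODE_OPTIONS-style string; null if absent
function requestedOldSpaceMB(nodeOptions = process.env.NODE_OPTIONS || '') {
  const match = nodeOptions.match(/--max[-_]old[-_]space[-_]size=(\d+)/);
  return match ? Number(match[1]) : null;
}

console.log(`heap limit: ${heapLimitMB()} MB, requested old space: ${requestedOldSpaceMB()} MB`);
```

The reported limit is usually somewhat larger than the requested old-space size because it also covers the young generation, so compare the two loosely rather than expecting an exact match.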

Worker Process Management

// worker-manager.js
const cluster = require('cluster');
const os = require('os');

if (cluster.isPrimary) { // Node.js 16+; older releases call this cluster.isMaster
  const numWorkers = parseInt(process.env.WORKER_PROCESSES) || os.cpus().length;

  console.log(`Master cluster setting up ${numWorkers} workers...`);

  for (let i = 0; i < numWorkers; i++) {
    cluster.fork();
  }

  cluster.on('online', worker => {
    console.log(`Worker ${worker.process.pid} is online`);
  });

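  let shuttingDown = false;

  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} died with code: ${code}, and signal: ${signal}`);
    // Do not respawn workers that exit as part of a shutdown
    if (shuttingDown) return;
    console.log('Starting a new worker');
    cluster.fork();
  });

  // Workers ask for a restart when their heap grows too large (see memory-check)
  cluster.on('message', (worker, msg) => {
    if (msg && msg.cmd === 'restart-requested') {
      console.log(`Restarting worker ${worker.process.pid} on request`);
      worker.send('shutdown'); // The exit handler above forks the replacement
    }
  });

  // Graceful shutdown
  process.on('SIGTERM', () => {
    console.log('SIGTERM received, shutting down gracefully');
    shuttingDown = true;
    for (const id in cluster.workers) {
      cluster.workers[id].send('shutdown');
    }

    setTimeout(() => {
      console.log('Forcing shutdown');
      process.exit(0);
    }, 30000);
  });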

  // Memory monitoring
  setInterval(() => {
    for (const id in cluster.workers) {
      cluster.workers[id].send('memory-check');
    }
  }, 60000);
} else {
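  // Worker process
  // Assumption: `require('n8n')` with a start() method is illustrative only.
  // n8n's documented way to run a worker is the CLI: `n8n worker`.
  const n8n = require('n8n');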

  // Handle shutdown signal
  process.on('message', msg => {
    if (msg === 'shutdown') {
      console.log(`Worker ${process.pid} shutting down`);
      // Finish current jobs
      setTimeout(() => {
        process.exit(0);
      }, 10000);
    }

    if (msg === 'memory-check') {
      const memUsage = process.memoryUsage();
      const heapUsedMB = Math.round(memUsage.heapUsed / 1024 / 1024);
      const heapTotalMB = Math.round(memUsage.heapTotal / 1024 / 1024);

      if (heapUsedMB > 2800) {
        // 2.8GB threshold
        console.log(`Worker ${process.pid} memory high: ${heapUsedMB}MB/${heapTotalMB}MB`);
        // Request graceful restart
        process.send({ cmd: 'restart-requested', pid: process.pid });
      }
    }
  });

  // Start n8n worker
  n8n.start();
}
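
The worker above waits a fixed ten seconds before exiting, which is either too long when it is idle or too short when a job is still running. One possible refinement is to track in-flight jobs and exit as soon as they finish, with a timeout as a backstop. This is a sketch under the assumption that you control how jobs are invoked; `createJobTracker` is a hypothetical helper.

```javascript
// Track in-flight jobs so shutdown can wait for them instead of sleeping a fixed time
function createJobTracker() {
  let inFlight = 0;
  let onIdle = null;

  return {
    // Wrap a job so the tracker knows when it starts and when it settles
    async run(job) {
      inFlight++;
      try {
        return await job();
      } finally {
        inFlight--;
        if (inFlight === 0 && onIdle) onIdle();
      }
    },

    // Resolves true once all jobs have finished, or false if timeoutMs elapses first
    drain(timeoutMs) {
      if (inFlight === 0) return Promise.resolve(true);
      return new Promise(resolve => {
        const timer = setTimeout(() => resolve(false), timeoutMs);
        onIdle = () => {
          clearTimeout(timer);
          resolve(true);
        };
      });
    },

    get inFlight() {
      return inFlight;
    },
  };
}
```

In a shutdown handler this would look like `tracker.drain(10000).then(() => process.exit(0))`, so an idle worker exits immediately and a busy one gets up to ten seconds.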

Workflow Optimization Patterns

Efficient Data Handling

// Bad: Loading entire dataset into memory
const allUsers = await db.query('SELECT * FROM users'); // Could be millions of rows
for (const user of allUsers) {
  await processUser(user);
}

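// Good: Batching with keyset pagination
// Filtering on id > lastId keeps every page an index range scan, whereas
// OFFSET makes the database re-read and discard all skipped rows on each page.
// `db.query` is assumed to resolve to an array of rows, as in the example above,
// and ids are assumed to be positive numbers.
const batchSize = 100;
let lastId = 0;
let hasMore = true;

while (hasMore) {
  const batch = await db.query('SELECT * FROM users WHERE id > $1 ORDER BY id LIMIT $2', [
    lastId,
    batchSize,
  ]);

  if (batch.length === 0) {
    hasMore = false;
    break;
  }

  // Process batch in parallel
  await Promise.all(batch.map(user => processUser(user)));

  lastId = batch[batch.length - 1].id;

  // Each batch becomes unreachable after its iteration, so V8 can reclaim it
  // without help; forcing global.gc() here only adds pauses.
}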
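
The batching loop can also be wrapped in an async generator so that the paging logic lives in one place and callers simply iterate. The sketch below pages by key (`id > lastId`) rather than by OFFSET. `fetchPage` is a hypothetical callback standing in for the real query, and the in-memory array is only there to make the example runnable.

```javascript
// Yield rows batch by batch using keyset pagination (WHERE id > lastId).
// `fetchPage(lastId, limit)` is a hypothetical callback that runs the real query.
async function* iterateBatches(fetchPage, batchSize = 100) {
  let lastId = 0;
  while (true) {
    const rows = await fetchPage(lastId, batchSize);
    if (rows.length === 0) return;
    yield rows;
    lastId = rows[rows.length - 1].id;
    if (rows.length < batchSize) return; // A short page means we reached the end
  }
}

// In-memory stand-in for: SELECT * FROM users WHERE id > $1 ORDER BY id LIMIT $2
const users = Array.from({ length: 250 }, (_, i) => ({ id: i + 1 }));
const fakeFetchPage = async (lastId, limit) =>
  users.filter(u => u.id > lastId).slice(0, limit);

// Example consumer: only one batch is held in memory at a time
async function collectBatchSizes() {
  const sizes = [];
  for await (const batch of iterateBatches(fakeFetchPage, 100)) {
    sizes.push(batch.length); // Process the batch here
  }
  return sizes;
}
```

With 250 rows and a batch size of 100, the consumer sees three batches of 100, 100 and 50 rows.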

Parallel Processing

// Bad: Sequential API calls
for (const item of items) {
  const result = await callAPI(item);
  results.push(result);
}

// Good: Parallel with concurrency limit
const pLimit = require('p-limit'); // p-limit v3; v4+ is ESM-only and must be imported
const limit = pLimit(10); // Max 10 concurrent requests

const results = await Promise.all(items.map(item => limit(() => callAPI(item))));

// Better: Batched parallel processing
const chunk = (arr, size) =>
  Array.from({ length: Math.ceil(arr.length / size) }, (v, i) =>
    arr.slice(i * size, i * size + size)
  );

const batches = chunk(items, 100);
const allResults = [];

for (const batch of batches) {
  const batchResults = await Promise.all(batch.map(item => callAPI(item)));
  allResults.push(...batchResults);

  // Rate limiting between batches
  await new Promise(resolve => setTimeout(resolve, 1000));
}
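
If you would rather not add a dependency, a concurrency limit is short enough to write by hand. The following sketch runs a fixed number of "runners" that each pull the next item until the list is empty. `mapWithConcurrency` is a hypothetical helper name, and error handling is left at the `Promise.all` default (the first rejection rejects the whole call).

```javascript
// Run `worker` over `items` with at most `limit` calls in flight, preserving result order
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner claims the next unprocessed index until the list is exhausted
  const runner = async () => {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  };

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

Unlike fixed batches, this keeps the pipeline full: a new call starts as soon as any earlier one finishes, instead of waiting for the slowest item in a batch.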

Caching Strategies

// Multi-level caching implementation
class WorkflowCache {
  constructor() {
    this.memoryCache = new Map();
    // createRedisClient() is assumed to return an ioredis client (get, setex, del)
    this.redisClient = createRedisClient();
  }

  async get(key, fetcher, options = {}) {
    const { ttl = 3600, refresh = false } = options;

    // L1: Memory cache (fastest)
    if (!refresh && this.memoryCache.has(key)) {
      const cached = this.memoryCache.get(key);
      if (Date.now() - cached.timestamp < ttl * 1000) {
        return cached.value;
      }
    }

    // L2: Redis cache (fast)
    const redisValue = await this.redisClient.get(key);
    if (!refresh && redisValue) {
      const parsed = JSON.parse(redisValue);
      this.memoryCache.set(key, {
        value: parsed,
        timestamp: Date.now(),
      });
      return parsed;
    }

    // L3: Fetch from source (slow)
    const value = await fetcher();

    // Store in all cache levels
    await this.redisClient.setex(key, ttl, JSON.stringify(value));
    this.memoryCache.set(key, {
      value,
      timestamp: Date.now(),
    });

    return value;
  }

  async invalidate(key) {
    this.memoryCache.delete(key);
    await this.redisClient.del(key);
  }

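  // `pattern` is a Redis glob such as 'user:*' (only * and ? are translated here)
  async invalidatePattern(pattern) {
    // Clear memory cache: translate the glob to a RegExp so both levels match the same keys
    const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, '\\$&');
    const regex = new RegExp('^' + escaped.replace(/\*/g, '.*').replace(/\?/g, '.') + '$');
    for (const key of this.memoryCache.keys()) {
      if (regex.test(key)) {
        this.memoryCache.delete(key);
      }
    }

    // Clear Redis cache with SCAN; KEYS blocks the server while it walks the whole keyspace
    // (scanStream is the ioredis helper; adapt this if you use another client)
    const stream = this.redisClient.scanStream({ match: pattern, count: 100 });
    for await (const keys of stream) {
      if (keys.length > 0) {
        await this.redisClient.del(...keys);
      }
    }
  }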
}

// Usage in workflow
const cache = new WorkflowCache();

const userData = await cache.get(
  `user:${userId}`,
  () => fetchUserFromDatabase(userId),
  { ttl: 300 } // 5 minutes
);
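
One thing to watch with an in-process cache level is that a plain `Map` grows without bound in a long-running worker. A possible remedy is a small cache that enforces both a TTL and a maximum size. The sketch below is an illustration under that assumption; `BoundedTtlCache` is a hypothetical class, and the injectable `now` function exists only to make it testable.

```javascript
// In-memory cache with a TTL and a size cap; least recently used entries are evicted first
class BoundedTtlCache {
  constructor({ maxEntries = 1000, ttlMs = 60000, now = Date.now } = {}) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map(); // Map keeps insertion order, used here as recency order
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() - entry.storedAt >= this.ttlMs) {
      this.entries.delete(key); // Expired
      return undefined;
    }
    // Re-insert to mark the entry as most recently used
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key, value) {
    this.entries.delete(key);
    this.entries.set(key, { value, storedAt: this.now() });
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used
      this.entries.delete(this.entries.keys().next().value);
    }
  }

  get size() {
    return this.entries.size;
  }
}
```

Reading an entry refreshes its recency but not its age, so a hot key still expires on schedule and is refetched from the slower level.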

Monitoring and Observability

Custom Metrics Export

// prometheus-exporter.js
const prometheus = require('prom-client');
const express = require('express');

// Register default metrics
prometheus.collectDefaultMetrics({
  prefix: 'n8n_', // Recent prom-client versions collect on scrape and ignore a `timeout` option
});

// Custom metrics
const workflowExecutionDuration = new prometheus.Histogram({
  name: 'n8n_workflow_execution_duration_seconds',
  help: 'Workflow execution duration in seconds',
  labelNames: ['workflow_id', 'workflow_name', 'status'],
  buckets: [0.1, 0.5, 1, 2, 5, 10, 30, 60, 120, 300],
});

const workflowExecutionTotal = new prometheus.Counter({
  name: 'n8n_workflow_execution_total',
  help: 'Total number of workflow executions',
  labelNames: ['workflow_id', 'workflow_name', 'status'],
});

const queueDepth = new prometheus.Gauge({
  name: 'n8n_queue_depth',
  help: 'Number of jobs in queue',
  labelNames: ['queue_name', 'status'],
});

const nodeExecutionDuration = new prometheus.Histogram({
  name: 'n8n_node_execution_duration_seconds',
  help: 'Node execution duration in seconds',
  labelNames: ['node_type', 'workflow_id'],
  buckets: [0.01, 0.05, 0.1, 0.5, 1, 2, 5, 10],
});

const databaseQueryDuration = new prometheus.Histogram({
  name: 'n8n_database_query_duration_seconds',
  help: 'Database query duration in seconds',
  labelNames: ['query_type'],
  buckets: [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1],
});

const activeWorkers = new prometheus.Gauge({
  name: 'n8n_active_workers',
  help: 'Number of active workers',
});

// Metrics server
const app = express();

app.get('/metrics', async (req, res) => {
  res.set('Content-Type', prometheus.register.contentType);
  res.end(await prometheus.register.metrics());
});

app.listen(9090, () => {
  console.log('Metrics server listening on port 9090');
});

// Export metrics functions
module.exports = {
  recordWorkflowExecution: (workflow, duration, status) => {
    workflowExecutionDuration.labels(workflow.id, workflow.name, status).observe(duration);

    workflowExecutionTotal.labels(workflow.id, workflow.name, status).inc();
  },

  updateQueueDepth: async queues => {
    for (const [name, queue] of Object.entries(queues)) {
      const waiting = await queue.getWaitingCount();
      const active = await queue.getActiveCount();
      const delayed = await queue.getDelayedCount();

      queueDepth.labels(name, 'waiting').set(waiting);
      queueDepth.labels(name, 'active').set(active);
      queueDepth.labels(name, 'delayed').set(delayed);
    }
  },

  recordNodeExecution: (nodeType, workflowId, duration) => {
    nodeExecutionDuration.labels(nodeType, workflowId).observe(duration);
  },

  recordDatabaseQuery: (queryType, duration) => {
    databaseQueryDuration.labels(queryType).observe(duration);
  },

  setActiveWorkers: count => {
    activeWorkers.set(count);
  },
};
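
The recording functions expect a duration in seconds, so callers need a consistent way to measure it. One way to do that, sketched here, is a small wrapper that times any async function and reports the duration and outcome. `timed` is a hypothetical helper, and `record` stands in for a callback that forwards to something like `recordWorkflowExecution`.

```javascript
// Time an async function and pass the duration (in seconds) and outcome to `record`.
// `record` is a placeholder for a callback that forwards to the metrics module.
async function timed(fn, record) {
  const start = process.hrtime.bigint(); // Monotonic clock, unaffected by wall-clock changes
  let status = 'success';
  try {
    return await fn();
  } catch (err) {
    status = 'error';
    throw err; // Re-throw so callers still see the failure
  } finally {
    const seconds = Number(process.hrtime.bigint() - start) / 1e9;
    record(seconds, status);
  }
}
```

A call might look like `timed(() => runWorkflow(wf), (s, status) => metrics.recordWorkflowExecution(wf, s, status))`, where `runWorkflow` and `metrics` are your own names.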

Performance Dashboard

{
  "dashboard": {
    "title": "n8n Performance Dashboard",
    "panels": [
      {
        "title": "Workflow Execution Rate",
        "targets": [
          {
            "expr": "rate(n8n_workflow_execution_total[5m])"
          }
        ]
      },
      {
        "title": "P95 Execution Duration",
        "targets": [
          {
            "expr": "histogram_quantile(0.95, sum by (le) (rate(n8n_workflow_execution_duration_seconds_bucket[5m])))"
          }
        ]
      },
      {
        "title": "Queue Depth",
        "targets": [
          {
            "expr": "n8n_queue_depth"
          }
        ]
      },
      {
        "title": "Success Rate",
        "targets": [
          {
            "expr": "sum(rate(n8n_workflow_execution_total{status='success'}[5m])) / sum(rate(n8n_workflow_execution_total[5m]))"
          }
        ]
      },
      {
        "title": "Database Query Performance",
        "targets": [
          {
            "expr": "histogram_quantile(0.95, sum by (le) (rate(n8n_database_query_duration_seconds_bucket[5m])))"
          }
        ]
      },
      {
        "title": "Worker Utilization",
        "targets": [
          {
            "expr": "n8n_active_workers"
          }
        ]
      }
    ]
  }
}
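
The P95 panels rely on `histogram_quantile`, which estimates a quantile from bucket counts rather than from raw samples. To make that estimate less of a black box, here is a simplified sketch of the linear interpolation it performs. It assumes cumulative buckets sorted by upper bound with a final `Infinity` bucket, and it skips the edge cases the real function handles.

```javascript
// Estimate a quantile from cumulative histogram buckets by linear interpolation.
// `buckets` is [{ le, count }] sorted by `le`, with cumulative counts and a last le: Infinity.
function histogramQuantile(q, buckets) {
  const total = buckets[buckets.length - 1].count;
  if (total === 0) return NaN;
  const rank = q * total;

  // First bucket whose cumulative count reaches the target rank
  const i = buckets.findIndex(b => b.count >= rank);

  // Beyond the last finite bucket: the best available answer is its upper bound
  if (buckets[i].le === Infinity) return buckets[buckets.length - 2].le;

  const lowerBound = i === 0 ? 0 : buckets[i - 1].le;
  const countBelow = i === 0 ? 0 : buckets[i - 1].count;
  const inBucket = buckets[i].count - countBelow;

  // Assume observations are spread evenly inside the bucket
  return lowerBound + (buckets[i].le - lowerBound) * ((rank - countBelow) / inBucket);
}

const example = [
  { le: 0.5, count: 50 },
  { le: 1, count: 90 },
  { le: 2, count: 100 },
  { le: Infinity, count: 100 },
];
console.log(histogramQuantile(0.95, example));
```

In this example the 95th percentile falls halfway through the 1s to 2s bucket, so the estimate is about 1.5 seconds. The result can never be more precise than the bucket boundaries, which is why the bucket lists in the exporter should bracket the latencies you care about.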

Auto-Scaling Configuration

Kubernetes HPA (Horizontal Pod Autoscaler)

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: n8n-worker-hpa
  namespace: n8n
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: n8n-worker
  minReplicas: 5
  maxReplicas: 50
  metrics:
    # CPU utilization
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

    # Memory utilization
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80

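    # Custom metric: Queue depth
    # Queue depth is a property of the queue, not of any one pod, so it is
    # exposed as an External metric (this needs a metrics adapter such as
    # prometheus-adapter or KEDA). AverageValue divides it by the replica
    # count: the target below is 100 waiting jobs per worker.
    - type: External
      external:
        metric:
          name: n8n_queue_depth
          selector:
            matchLabels:
              status: waiting
        target:
          type: AverageValue
          averageValue: '100'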

  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
        - type: Pods
          value: 5
          periodSeconds: 60
      selectPolicy: Max

    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
        - type: Pods
          value: 2
          periodSeconds: 60
      selectPolicy: Min
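
It helps to know roughly how the autoscaler turns these targets into a replica count. For each metric it computes `ceil(currentReplicas * currentValue / targetValue)`, takes the largest proposal, and clamps it to the configured bounds. The sketch below reproduces that core calculation with the min/max from the manifest above. It deliberately ignores the `behavior` policies and stabilization windows, and the 10% tolerance is the Kubernetes default rather than something set in this manifest.

```javascript
// Core HPA calculation: one proposal per metric, take the largest, clamp to [min, max].
// Ignores scale-up/scale-down policies and stabilization windows.
function desiredReplicas(currentReplicas, metrics, { min = 5, max = 50, tolerance = 0.1 } = {}) {
  const proposals = metrics.map(({ current, target }) => {
    const ratio = current / target;
    // Inside the tolerance band the replica count is left unchanged
    if (Math.abs(ratio - 1) <= tolerance) return currentReplicas;
    return Math.ceil(currentReplicas * ratio);
  });
  const wanted = Math.max(...proposals);
  return Math.min(Math.max(wanted, min), max);
}

// 10 workers at 90% CPU (target 70) and 40% memory (target 80)
console.log(desiredReplicas(10, [
  { current: 90, target: 70 },
  { current: 40, target: 80 },
]));
```

Here CPU asks for 13 replicas and memory for 5, so the autoscaler aims for 13. Because the largest proposal wins, a single hot metric is enough to keep the pool scaled up.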

Performance Testing

Load Testing Script

// loadtest.js
const autocannon = require('autocannon');

const runLoadTest = () => {
  const instance = autocannon({
    url: 'https://automation.company.com/webhook/test',
    connections: 100,
    duration: 60,
    pipelining: 10,
    method: 'POST',
    headers: {
      'content-type': 'application/json',
    },
    body: JSON.stringify({
      test: 'data',
      timestamp: Date.now(),
    }),
  });

  autocannon.track(instance, { renderProgressBar: true });

  instance.on('done', results => {
    console.log('Load test results:');
    console.log('Requests/sec:', results.requests.average);
    console.log('Latency p50:', results.latency.p50);
    console.log('Latency p97.5:', results.latency.p97_5); // autocannon reports p97_5, not p95
    console.log('Latency p99:', results.latency.p99);
    console.log('Throughput:', results.throughput.average);
    console.log('Errors:', results.errors);
  });
};

runLoadTest();

Performance Checklist

  • Database indexes optimized for common queries
  • Table partitioning implemented for large tables
  • Connection pooling configured (PgBouncer)
  • Redis persistence and memory limits set
  • Worker processes tuned for available resources
  • Horizontal pod autoscaling configured
  • Monitoring and alerting in place
  • Load testing completed
  • Caching strategy implemented
  • Workflow execution data retention policy set
  • Rate limiting configured
  • Queue priorities configured
  • Metrics exported to monitoring system

Join the Community

Scaling automation infrastructure is an ongoing process. The House of Loops community includes platform engineers and DevOps professionals who share their production configurations, optimization techniques, and scaling strategies.

Join us to:

  • Access production-ready configuration templates
  • Get performance optimization advice
  • Share your scaling experiences
  • Participate in performance tuning workshops
  • Connect with other engineers running high-scale n8n deployments

Join House of Loops Today and build automation systems that scale with your business.


Need help optimizing your n8n deployment? Our community has experts who've scaled to millions of executions!


House of Loops Team

House of Loops is a technology-focused community for learning and implementing advanced automation workflows using n8n, Strapi, AI/LLM, and DevSecOps tools.
