
# Scaling

This guide covers horizontal and vertical scaling strategies for Chive.

## Scaling principles

### Stateless services

API, web, and worker services are stateless and can scale horizontally:

| Service | Scalable? | Notes |
| --- | --- | --- |
| API | Yes | Load-balanced; any replica handles any request |
| Web | Yes | Static assets via CDN; SSR scales horizontally |
| Worker | Yes | Jobs distributed via BullMQ |
| Indexer | Limited | Single consumer for event ordering (see below) |

### Stateful services

Databases require different scaling strategies:

| Database | Scaling approach |
| --- | --- |
| PostgreSQL | Read replicas, connection pooling |
| Elasticsearch | Add nodes, increase shards |
| Neo4j | Single instance or causal cluster |
| Redis | Sentinel for HA, Cluster for scale |

## Horizontal scaling

### API service

```yaml
# Kubernetes HPA
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: chive-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: chive-api
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```

### Worker service

Scale workers based on queue depth:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: chive-worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: chive-worker
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: External
      external:
        metric:
          name: bullmq_queue_depth
          selector:
            matchLabels:
              queue: indexing
        target:
          type: AverageValue
          averageValue: 100
```
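
External metrics are not built into the HPA; they must come from a metrics pipeline such as Prometheus plus prometheus-adapter. A minimal sketch of an exporter that publishes `bullmq_queue_depth` — the queue name, Redis connection, and port here are assumptions to match to your deployment:

```typescript
import http from 'node:http';
import { Queue } from 'bullmq';
import { Gauge, register } from 'prom-client';

// Assumed queue name and Redis connection.
const queue = new Queue('indexing', { connection: { host: 'redis', port: 6379 } });

const depth = new Gauge({
  name: 'bullmq_queue_depth',
  help: 'Number of jobs waiting in a BullMQ queue',
  labelNames: ['queue'],
});

// Sample the queue depth every 5 seconds.
setInterval(async () => {
  depth.set({ queue: 'indexing' }, await queue.getWaitingCount());
}, 5_000);

// Expose /metrics for Prometheus to scrape.
http
  .createServer(async (_req, res) => {
    res.setHeader('Content-Type', register.contentType);
    res.end(await register.metrics());
  })
  .listen(9100);
```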

### Indexer scaling

The indexer must remain a single instance to preserve event ordering. Scale by:

1. **Faster processing**: Optimize event handlers.
2. **Parallel non-ordered work**: Offload to workers via the queue (see the sketch below).
3. **Multiple firehose subscriptions**: Partition by collection (advanced).
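
A minimal sketch of option 2: the indexer performs the ordered write inline, then hands order-independent work (enrichment, search indexing) to the worker queue. The event shape and helper names are hypothetical:

```typescript
import { Queue } from 'bullmq';

// Assumed queue shared with the worker service.
const enrichment = new Queue('indexing', { connection: { host: 'redis', port: 6379 } });

// Hypothetical event shape for illustration.
interface PreprintEvent {
  uri: string;
  cid: string;
}

// Hypothetical helper: the indexer's ordered database write.
declare function applyToDatabase(event: PreprintEvent): Promise<void>;

async function handlePreprintEvent(event: PreprintEvent): Promise<void> {
  // Ordered work stays in the single indexer: apply the event
  // before acknowledging it, so replays remain consistent.
  await applyToDatabase(event);

  // Order-independent work goes to the queue; any worker replica can run it.
  await enrichment.add('enrich-preprint', { uri: event.uri, cid: event.cid });
}
```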

## Vertical scaling

### Resource recommendations

| Service | CPU | Memory | Notes |
| --- | --- | --- | --- |
| API | 1-2 cores | 1-2 GB | Scale horizontally first |
| Indexer | 0.5-1 core | 512 MB | Memory for event buffering |
| Worker | 0.5-1 core | 512 MB | CPU-bound for enrichment |
| Web | 0.5-1 core | 512 MB | Memory for SSR cache |

### Database sizing

#### PostgreSQL

| Metric | Small | Medium | Large |
| --- | --- | --- | --- |
| Preprints | 100K | 1M | 10M |
| Storage | 50 GB | 200 GB | 1 TB |
| RAM | 4 GB | 16 GB | 64 GB |
| CPUs | 2 | 8 | 32 |

#### Elasticsearch

| Metric | Small | Medium | Large |
| --- | --- | --- | --- |
| Documents | 100K | 1M | 10M |
| Storage | 20 GB | 100 GB | 500 GB |
| Nodes | 1 | 3 | 5+ |
| RAM per node | 4 GB | 16 GB | 32 GB |

#### Neo4j

| Metric | Small | Medium | Large |
| --- | --- | --- | --- |
| Nodes | 100K | 1M | 10M |
| Storage | 10 GB | 50 GB | 200 GB |
| RAM | 4 GB | 16 GB | 64 GB |

## Connection pooling

### PgBouncer

Use PgBouncer for PostgreSQL connection pooling:

```ini
# pgbouncer.ini
[databases]
chive = host=postgres port=5432 dbname=chive

[pgbouncer]
listen_port = 6432
listen_addr = 0.0.0.0
auth_type = md5
; Transaction pooling reuses server connections between transactions;
; session state (prepared statements, SET) does not survive across queries.
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 50
min_pool_size = 10
```
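
With PgBouncer in place, the application connects to it rather than to Postgres directly; only the host and port change. The service name below is an assumption:

```typescript
import { Pool } from 'pg';

// Hypothetical: route the application pool through PgBouncer.
const pool = new Pool({
  host: 'pgbouncer', // PgBouncer service, not postgres
  port: 6432,        // listen_port from pgbouncer.ini
  // ...remaining options as in the application-level pooling example below
});
```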

### Application-level pooling

```typescript
// src/storage/postgresql/connection.ts
import { Pool } from 'pg';

const pool = new Pool({
  host: process.env.POSTGRES_HOST,
  port: parseInt(process.env.POSTGRES_PORT || '5432', 10),
  database: process.env.POSTGRES_DB,
  user: process.env.POSTGRES_USER,
  password: process.env.POSTGRES_PASSWORD,
  min: 5, // keep warm connections for steady traffic
  max: 20, // per-replica cap; total = max × replica count
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 10000,
});
```

## Caching strategies

### Multi-tier caching

```text
Request → L1 (Memory) → L2 (Redis) → L3 (Database)
              ~1 ms         ~5 ms        ~50 ms
```
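
A minimal read-through sketch of the tiers, with an in-process map standing in for L1; key names, TTLs, and the Redis connection are assumptions:

```typescript
import Redis from 'ioredis';

const redis = new Redis({ host: 'redis', port: 6379 }); // assumed connection
const l1 = new Map<string, { value: unknown; expires: number }>();

async function cachedGet<T>(
  key: string,
  loadFromDb: () => Promise<T>, // L3 fallback, e.g. a Postgres query
  ttlSeconds = 600,
): Promise<T> {
  // L1: in-process memory (~1 ms)
  const hit = l1.get(key);
  if (hit && hit.expires > Date.now()) return hit.value as T;

  // L2: Redis (~5 ms)
  const cached = await redis.get(key);
  if (cached !== null) {
    const value = JSON.parse(cached) as T;
    l1.set(key, { value, expires: Date.now() + ttlSeconds * 1000 });
    return value;
  }

  // L3: database (~50 ms); populate both cache tiers on the way back
  const value = await loadFromDb();
  await redis.set(key, JSON.stringify(value), 'EX', ttlSeconds);
  l1.set(key, { value, expires: Date.now() + ttlSeconds * 1000 });
  return value;
}
```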

### Cache warming

Pre-populate caches on startup:

```typescript
async function warmCache(): Promise<void> {
  // Warm trending preprints
  const trending = await db.getTrendingPreprints(100);
  await Promise.all(
    trending.map((p) => cache.set(`preprint:${p.uri}`, p, 600))
  );

  // Warm field taxonomy
  const fields = await neo4j.getAllFields();
  await cache.set('fields:all', fields, 3600);
}
```

### Cache invalidation

Event-driven invalidation via Redis Pub/Sub:

```typescript
// On preprint update
await redis.publish('cache:invalidate', JSON.stringify({
  type: 'preprint',
  uri: preprintUri,
}));

// Subscriber: each replica drops its local L1 entry
subscriber.on('message', (channel, message) => {
  const { type, uri } = JSON.parse(message);
  cache.del(`${type}:${uri}`);
});
```

## Load balancing

### Kubernetes Ingress

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: chive
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "50m"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
spec:
  ingressClassName: nginx
  rules:
    - host: api.chive.pub
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: chive-api
                port:
                  number: 3000
```

### Session affinity

For WebSocket connections (notifications):

```yaml
annotations:
  nginx.ingress.kubernetes.io/affinity: "cookie"
  nginx.ingress.kubernetes.io/session-cookie-name: "chive-affinity"
  nginx.ingress.kubernetes.io/session-cookie-max-age: "3600"
```

## CDN configuration

### Static assets

Serve frontend assets via CDN:

```yaml
# Cloudflare page rules
- url: "chive.pub/_next/static/*"
  cache_level: "Cache Everything"
  edge_cache_ttl: 2592000 # 30 days

- url: "chive.pub/api/*"
  cache_level: "Bypass"
```

### Blob caching

PDF and image blobs:

- url: "blobs.chive.pub/*"
cache_level: "Cache Everything"
edge_cache_ttl: 86400 # 24 hours

## Performance tuning

### Node.js options

```bash
# Increase memory for large datasets (keep below the container memory limit)
NODE_OPTIONS="--max-old-space-size=4096"

# Enable clustering (handled by K8s replicas, not needed)
# PM2_INSTANCES=4
```

### Database tuning

#### PostgreSQL

```sql
-- Increase shared buffers (requires a server restart)
ALTER SYSTEM SET shared_buffers = '4GB';

-- Increase per-operation work memory for complex queries
ALTER SYSTEM SET work_mem = '256MB';

-- Increase max connections (requires a server restart)
ALTER SYSTEM SET max_connections = 200;
```

#### Elasticsearch

```yaml
# elasticsearch.yml
indices.memory.index_buffer_size: 30%
thread_pool.write.queue_size: 1000
```

## Capacity planning

### Request rate estimates

| Scenario | RPS | API replicas |
| --- | --- | --- |
| Low | 100 | 2 |
| Medium | 500 | 5 |
| High | 2000 | 15 |
| Peak | 5000 | 30 |
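
These figures work out to roughly 100-170 RPS per replica. A quick sizing check for other loads; the per-replica throughput and headroom defaults are assumptions you should replace with measured values:

```typescript
// Hypothetical sizing helper: replicas needed for a target RPS,
// assuming a measured per-replica throughput and ~30% headroom.
function replicasFor(rps: number, perReplicaRps = 150, headroom = 0.3): number {
  return Math.max(2, Math.ceil(rps / (perReplicaRps * (1 - headroom))));
}

replicasFor(500); // 5 with the defaults, matching the Medium row above
```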

### Storage growth

| Metric | Growth rate |
| --- | --- |
| Preprints | ~1000/day |
| Reviews | ~500/day |
| Blob cache | ~10 GB/day |
| Logs | ~5 GB/day |
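
Projected disk need follows directly from these rates; the 90-day retention window below is an assumption to adjust for your eviction and log-rotation policy:

```typescript
// Hypothetical projection: disk needed for blobs and logs at the rates above.
const GROWTH_GB_PER_DAY = { blobCache: 10, logs: 5 };
const RETENTION_DAYS = 90; // assumed retention window

const requiredGb =
  (GROWTH_GB_PER_DAY.blobCache + GROWTH_GB_PER_DAY.logs) * RETENTION_DAYS;
console.log(`~${requiredGb} GB`); // ~1350 GB before compression/eviction
```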