Deployment Guide

Guide for deploying HybridInference in production.

Quick Start (Docker)

# 1. Clone and configure
git clone https://github.com/HarvardMadSys/hybridInference.git
cd hybridInference
cp .env.example .env
# Edit .env — fill in DB_PASSWORD, JWT_SECRET_KEY, API_KEY_SECRET, and provider API keys

# 2. Start all services
make up

# 3. Verify
make ps
curl http://localhost:8080/health

This starts five containers: backend (FastAPI), frontend (Next.js), PostgreSQL, Alertmanager, and alert-logger. The backend, PostgreSQL, and Alertmanager bind to 127.0.0.1 by default. The frontend binds to 0.0.0.0 so that Nginx on the host can reach it; set the FRONTEND_HOST env var to override this. pgAdmin is also available, but only when the stack is started with the admin profile (see the Database section).

Prerequisites

  • Docker Engine 24+ and Docker Compose v2+

  • Your user in the docker group (sudo usermod -aG docker $USER, then log out and back in so the new group applies)

  • Nginx on the host for SSL termination (not containerized)

Service Architecture

Client ──▶ Cloudflare (CDN + DDoS) ──▶ Nginx (:443) ──┬──▶ backend  (:8080)
                                                        └──▶ frontend (:3001)

Docker internal network:
  backend ──▶ postgres (:5432)
  alertmanager (:9093) ──▶ alert-logger (:5001)
  backend ──▶ host.docker.internal (GPU SSH tunnels on host)

Common Operations

All commands run from the project root via make:

make up                  # Start all services
make down                # Stop all services
make restart             # Restart all services
make restart s=backend   # Restart a single service
make ps                  # Show running services and health status
make logs                # Tail logs (all services)
make logs s=backend      # Tail logs for one service
make build               # Rebuild images and restart
make build s=frontend    # Rebuild one service

Configuration

Environment Variables

All secrets and configuration live in .env at the project root. See .env.example for the full list with comments. Key variables:

Variable                        Required  Description
------------------------------  --------  ------------------------------------
DB_NAME, DB_USER, DB_PASSWORD   Yes       PostgreSQL credentials
JWT_SECRET_KEY                  Yes       JWT signing key (generate with python -c "import secrets; print(secrets.token_urlsafe(32))")
API_KEY_SECRET                  Yes       HMAC key for API key hashing
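
The variables listed above include a generator command only for JWT_SECRET_KEY. The following is a minimal sketch that produces values for both secret keys the same way. Using secrets.token_urlsafe for API_KEY_SECRET is an assumption (the guide does not state a required format), and generate_secrets and as_env_lines are hypothetical helper names, not part of the project.

```python
import secrets

# Variable names come from the list above. The 32-byte length mirrors the
# JWT_SECRET_KEY one-liner; reusing it for API_KEY_SECRET is an assumption.
SECRET_VARS = ("JWT_SECRET_KEY", "API_KEY_SECRET")


def generate_secrets(names=SECRET_VARS, nbytes=32):
    """Return a dict mapping each variable name to a fresh URL-safe token."""
    return {name: secrets.token_urlsafe(nbytes) for name in names}


def as_env_lines(values):
    """Format the values as KEY=value lines suitable for pasting into .env."""
    return "\n".join(f"{key}={value}" for key, value in values.items())


if __name__ == "__main__":
    # Prints two KEY=value lines; copy them into .env by hand.
    print(as_env_lines(generate_secrets()))
```

The script only prints the values and does not write to .env, so it cannot overwrite an existing configuration.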

Local GPU Endpoints

If you run local inference servers (sglang, vLLM) on the host or via SSH tunnels, config/models.yaml references them as host.docker.internal:<port>. This DNS name resolves to the host machine from inside Docker containers.

For bare-metal development without Docker, replace host.docker.internal with localhost.
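
For bare-metal use, the substitution can be sketched as a small text rewrite. This is an illustration only: it treats config/models.yaml as raw text because the file's schema is not described here, and to_bare_metal is a hypothetical helper name.

```python
import re

# Match host.docker.internal only as a complete hostname. The lookarounds
# leave names such as "myhost.docker.internal" or
# "host.docker.internal.example" untouched.
_DOCKER_HOST = re.compile(r"(?<![\w.-])host\.docker\.internal(?![\w.-])")


def to_bare_metal(config_text):
    """Replace host.docker.internal with localhost in raw config text."""
    return _DOCKER_HOST.sub("localhost", config_text)
```

Applying to_bare_metal to the contents of config/models.yaml and writing the result to a separate file keeps the Docker version of the config intact.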

Nginx and HTTPS

Nginx runs on the host (not in Docker) to terminate TLS. An example configuration is at deploy/nginx/freeinference.conf.

sudo cp deploy/nginx/freeinference.conf /etc/nginx/sites-available/
sudo ln -s /etc/nginx/sites-available/freeinference.conf /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx

This assumes:

  • Backend: 127.0.0.1:8080, Frontend: 127.0.0.1:3001

  • HTTPS certificates from Let’s Encrypt
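
Before reloading Nginx, it can help to confirm that both upstreams are actually accepting connections. The following is a minimal sketch that uses the addresses listed above. It checks only TCP reachability, not that the services respond correctly, and is_listening and check_upstreams are hypothetical helper names.

```python
import socket

# Upstream addresses taken from the assumptions listed above.
UPSTREAMS = {"backend": ("127.0.0.1", 8080), "frontend": ("127.0.0.1", 3001)}


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_upstreams(upstreams=UPSTREAMS):
    """Return a dict mapping each upstream name to its reachability."""
    return {name: is_listening(host, port) for name, (host, port) in upstreams.items()}
```

Calling check_upstreams() with no arguments returns a dict such as {"backend": ..., "frontend": ...}, where a False value points at a service that Nginx would fail to proxy to.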

Cloudflare

FreeInference runs behind Cloudflare. Key settings:

  • SSL/TLS mode: Full (strict)

  • Caching: Disabled for API paths (/v1/*)

Monitoring

Health Checks

curl http://localhost:8080/health
# {"status":"healthy","routes_configured":17,"database_connected":true}
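
The same check can be scripted, for example to wait for the backend after make up. The following is a minimal sketch using only the Python standard library. Treating "status" equal to "healthy" together with "database_connected" equal to true as the success condition is an assumption based on the sample response above, and is_healthy and wait_for_healthy are hypothetical helper names.

```python
import http.client
import json
import time
import urllib.request


def is_healthy(body):
    """Return True if a /health body reports a healthy, DB-connected backend."""
    try:
        data = json.loads(body)
    except (TypeError, ValueError):
        return False
    if not isinstance(data, dict):
        return False
    return data.get("status") == "healthy" and data.get("database_connected") is True


def wait_for_healthy(url="http://localhost:8080/health", attempts=30, delay=2.0):
    """Poll url until is_healthy() passes; return False once attempts run out."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if is_healthy(resp.read().decode("utf-8")):
                    return True
        except (OSError, ValueError, http.client.HTTPException):
            pass  # not accepting connections yet, or an error response
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

With the defaults, wait_for_healthy() gives the stack roughly a minute to come up before returning False.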

Alerting

Alertmanager and the local alert-logger run as part of the stack. Both are kept for future use, but because Prometheus has been removed, nothing currently sends them alerts. To use them again, re-enable a metrics pipeline or wire in a different alert source.

Database

PostgreSQL runs in Docker with data persisted to a named volume (hybridinference_postgres_data).

To access the database directly:

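set -a; . ./.env; set +a   # export DB_USER and DB_NAME from .env (assumes plain KEY=value lines)
docker exec -it hybridinference-postgres psql -U "$DB_USER" -d "$DB_NAME"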

For pgAdmin (optional):

# Start with admin profile
docker compose -f deploy/docker/docker-compose.yml --env-file .env --profile admin up -d
# Access at http://localhost:5050

See Database for schema details.

Troubleshooting

Service won’t start

make logs s=backend      # Check service-specific logs
make ps                  # Check health status

Common issues:

  • Missing required env vars in .env → Docker Compose exits with an error such as "variable X is missing a value"

  • Port already in use → find the process holding it with ss -tlnp | grep <port>

  • Database connection failed → confirm that postgres reports healthy in make ps
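
The first issue can be caught before starting the stack. The following is a minimal sketch that scans .env text for the required variables named in the Configuration section. It handles only simple KEY=value lines, the provider API keys are left out because their names are not listed in this guide, and parse_env and missing_vars are hypothetical helper names.

```python
# Required names from the Environment Variables section of this guide.
REQUIRED_VARS = ("DB_NAME", "DB_USER", "DB_PASSWORD", "JWT_SECRET_KEY", "API_KEY_SECRET")


def parse_env(text):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        # Drop surrounding single or double quotes from the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_vars(text, required=REQUIRED_VARS):
    """Return the required names that are absent or empty in the .env text."""
    values = parse_env(text)
    return [name for name in required if not values.get(name)]
```

Reading .env from the project root and passing its contents to missing_vars returns an empty list when every required variable has a value.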

Rebuild after code changes

make build               # Rebuild all images
make build s=backend     # Rebuild just backend

Full reset (preserves data)

make down && make up

Full reset (destroy data)

docker compose -f deploy/docker/docker-compose.yml --env-file .env down -v
make up

Warning: -v deletes the named volumes declared in the Compose file, including the database volume (hybridinference_postgres_data). This cannot be undone.