MultiLLM Gateway

Open-source, multi-tenant LLM gateway. Route one API to 16+ backends, start it with docker compose up, and own your data.

Why MultiLLM

Self-hostable in one command. docker compose up brings up the whole gateway. No vendor account, no per-seat pricing, no telemetry that leaves your network.

Multi-tenant from day one. API key issuance, per-tenant budgets, and quota tracking are built in (Phase 2b lands the full auth surface; the wizard provisions the first admin today).

Built for multi-LLM workflows. Cross-LLM shared memory (SQLite FTS5), council mode for parallel queries, side-by-side compare, and LLM-as-judge for ranking answers are first-class capabilities, not bolt-ons.
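To make "council mode for parallel queries" concrete, here is a minimal client-side sketch of the idea: one prompt fanned out to several models at once, with the replies collected per model. It is not the gateway's own council API, which this README does not show. The function names `council` and `fake_ask` and the model name `ollama/mistral` are hypothetical, chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def council(prompt: str, models: List[str], ask: Callable[[str, str], str]) -> Dict[str, str]:
    """Send one prompt to several models in parallel and collect replies by model.

    `ask(model, prompt)` is any callable that returns the model's reply as text,
    for example a wrapper around a POST to the gateway's /v1/messages endpoint.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(models))) as pool:
        # Submit every query first so they run concurrently, then gather results.
        futures = {model: pool.submit(ask, model, prompt) for model in models}
        return {model: future.result() for model, future in futures.items()}


def fake_ask(model: str, prompt: str) -> str:
    # Stand-in for a real gateway call so the sketch runs offline.
    return f"[{model}] {prompt}"


if __name__ == "__main__":
    # "ollama/mistral" is a hypothetical second model, used only for illustration.
    answers = council("Say hi", ["ollama/llama3.2", "ollama/mistral"], fake_ask)
    for model, text in answers.items():
        print(model, "->", text)
```

Passing `ask` in as a parameter keeps the fan-out logic independent of how each request is actually sent, so the same function works with a real HTTP call or a stub.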

Quickstart (5 minutes)

The shortest path from git clone to a working /v1/messages request, using a local Ollama backend.

Prerequisites: Docker (with docker compose) and Ollama already running locally (ollama serve and ollama pull llama3.2).

Clone

git clone https://github.com/adibirzu/multillm.git
cd multillm

Configure

cp .env.example .env

Start the gateway

docker compose up -d

Open the setup wizard

open http://localhost:8080/setup

Walk through the wizard. Create the admin account. On the backends pane, paste http://host.docker.internal:11434 as OLLAMA_URL and skip the other backends. Finish.

Send your first request

curl -X POST http://localhost:8080/v1/messages \
  -H 'Content-Type: application/json' \
  -d '{"model":"ollama/llama3.2","messages":[{"role":"user","content":"Say hi"}]}'

You should get back an Anthropic-format response containing the model's reply.

If you don't have Ollama installed, follow the same flow with any cloud backend by pasting its API key in the /setup wizard's backends pane.
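The curl request above can also be sent from Python. The sketch below uses only the standard library and mirrors the same URL, header, and JSON body. The helper names (`build_payload`, `extract_text`, `send_message`) are hypothetical. The response parsing assumes the Anthropic Messages shape, where the reply is a `content` list of blocks and text blocks carry `"type": "text"`; adjust it if your gateway version returns something different.

```python
import json
import urllib.request
from typing import Any, Dict

# Endpoint from the Quickstart; change host/port if your deployment differs.
GATEWAY_URL = "http://localhost:8080/v1/messages"


def build_payload(model: str, prompt: str) -> Dict[str, Any]:
    """Build the same JSON body as the curl example."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}


def extract_text(response: Dict[str, Any]) -> str:
    """Join all text blocks of an Anthropic-format response (assumed shape)."""
    blocks = response.get("content", [])
    return "".join(b.get("text", "") for b in blocks if b.get("type") == "text")


def send_message(model: str, prompt: str, url: str = GATEWAY_URL, timeout: float = 60.0) -> str:
    """POST one user message to the gateway and return the reply text."""
    body = json.dumps(build_payload(model, prompt)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return extract_text(json.load(resp))


if __name__ == "__main__":
    # Requires the gateway from the Quickstart to be running locally.
    print(send_message("ollama/llama3.2", "Say hi"))
```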

Architecture

Claude Code / OpenAI SDK / curl
            │
            ▼
   ┌────────────────────┐
   │  MultiLLM :8080    │  FastAPI + httpx (HTTP/2 pooling)
   │  ─ routing         │
   │  ─ streaming (SSE) │
   │  ─ tracking        │
   │  ─ resilience      │
   │  ─ shared memory   │
   └────────┬───────────┘
            │
   ┌────────┴───────────┐
   │  16 backends       │
   │  Ollama / LM Studio│
   │  OpenAI / Anthropic│
   │  Gemini / Groq …   │
   └────────────────────┘

Data lives in MULTILLM_HOME (defaults to ~/.multillm/ or the compose-mounted ./.multillm/): SQLite tracking, FTS5 shared memory, automatic pre-migration backups. For production deployment recipes (Docker Compose, systemd, Kubernetes) see docs/operations/deployment.md.
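If a script needs to find the data directory (for backups or inspection), the lookup described above can be sketched as follows. This is an illustration of the documented default, not the gateway's actual code; `resolve_home` is a hypothetical helper. It covers only the environment variable and the `~/.multillm/` fallback. Under Docker Compose the data is in the mounted `./.multillm/` directory next to the compose file instead.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_home(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the MultiLLM data directory: MULTILLM_HOME if set, else ~/.multillm."""
    env = os.environ if env is None else env
    override = env.get("MULTILLM_HOME")
    if override:
        # Expand a leading "~" so values like "~/data/multillm" work too.
        return Path(override).expanduser()
    return Path.home() / ".multillm"


if __name__ == "__main__":
    print(resolve_home())
```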

Backends

| Backend | Type | Auth mode | Streaming |
|---|---|---|---|
| Ollama | Local | None | ✓ (SSE) |
| LM Studio | Local | None | ✓ (SSE) |
| Codex CLI | Local | Local CLI | ✓ |
| Gemini CLI | Local | Local CLI | ✓ |
| OpenAI | Cloud | API key | ✓ (SSE) |
| Anthropic | Cloud | API key | ✓ (SSE) |
| Gemini | Cloud | API key | ✓ (SSE) |
| OpenRouter | Cloud | API key | ✓ (SSE) |
| Groq | Cloud | API key | ✓ (SSE) |
| DeepSeek | Cloud | API key | ✓ (SSE) |
| Mistral | Cloud | API key | ✓ (SSE) |
| Together | Cloud | API key | ✓ (SSE) |
| xAI (Grok) | Cloud | API key | ✓ (SSE) |
| Fireworks | Cloud | API key | ✓ (SSE) |
| Azure OpenAI | Cloud | API key | ✓ (SSE) |
| AWS Bedrock | Cloud | Cloud IAM | ✓ (SSE) |
| OCA | Enterprise | OAuth (PKCE) | ✓ (SSE) |
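The Quickstart request selects its backend through the model string `ollama/llama3.2`. A client that wants to validate or log such strings can split them as sketched below. This README only shows the Ollama form, so treating every backend as `<backend>/<model>` is an assumption, and `split_model` is a hypothetical helper, not part of the gateway.

```python
from typing import Tuple


def split_model(model: str) -> Tuple[str, str]:
    """Split '<backend>/<model>' into (backend, model).

    Only the first '/' separates the backend, so a model name that itself
    contains slashes stays intact.
    """
    backend, sep, name = model.partition("/")
    if not sep or not backend or not name:
        raise ValueError(f"expected '<backend>/<model>', got {model!r}")
    return backend, name


if __name__ == "__main__":
    print(split_model("ollama/llama3.2"))
```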

Plugin / Slash Commands
