AsyncAPI Knowledge MCP
AI-powered documentation search for AsyncAPI, built with OpenCrane and hosted on Vercel.
Run it locally (full stack)
One command starts the whole chat — UI and agent function — and opens it in
your browser:
./scripts/run-local.sh
Default (offline) — no API keys, no Vercel account.
Prerequisites: uv (for uvx), Ollama,
and Node 20+. Everything is served by a small local dev server (scripts/dev-server.mjs) —
the vercel CLI is only needed for deployment work, never for
local development.
Before the first run, pull the default model:
ollama pull mistral-small3.2
The first offline run generates local embeddings from .opencrane/chunks.json
using OpenCrane (uvx opencrane embed) — this takes a few minutes and downloads the
embedding model (~500 MB) once. After that it is instant.
The script picks ports automatically (chat defaults to 7180, local MCP to 7181; override with
CHAT_PORT=... / MCP_PORT=...). To use a different Ollama model:
OLLAMA_MODEL=llama3.2 ./scripts/run-local.sh
Hosted mode — production parity: the Vercel AI Gateway for inference and the public
Hugging Face Space MCP for search (override via SEARCH_MCP_URL). Requires
AI_GATEWAY_API_KEY in your environment or .env.local (copy .env.example as a
starting point):
./scripts/run-local.sh --hosted
Stop a detached stack with:
./scripts/stop-local.sh
Architecture
The core product is the documentation MCP server (the RAG); the public chat website is an
optional layer on top. See docs/architecture.md for the two flows
as Mermaid diagrams — the public chat (we host gpt-5-nano + the agent loop) and your own
agent (any MCP client / the plugin / uvx asyncapi-knowledge-mcp, where you bring the model).
Knowledge pipeline (OpenCrane)
Sources:
- asyncapi-website — https://www.asyncapi.com/docs (concepts, tutorials, reference, guides, community docs — fetched from the
asyncapi/website GitHub repo, markdown/docs/ path)
- asyncapi-json-schema — AsyncAPI 3.1.0 JSON Schema, downloaded weekly from
asyncapi/website/config/3.1.0.json into .opencrane/sources/asyncapi-json-schema/3.1.0.json via scripts/download-schema.mjs. The .opencrane/sources/asyncapi-json-schema/asyncapi-schema-index.md file references it with a ```json-schema ``` fence so OpenCrane includes the schema content in the chunks. It is a manual (non-fetched) source, so it lives under .opencrane/sources/ alongside the fetched docs rather than being a separate local tree at the repo root.
Rebuild the index locally (note: fetch needs a non-empty GITHUB_TOKEN even for GitHub sources):
GITHUB_TOKEN=x uvx opencrane fetch
uvx opencrane llms
uvx opencrane chunk
uvx opencrane embed # local embedding model — no API keys
uvx opencrane index
Artifact strategy: .opencrane/sources/ + .opencrane/chunks.json + .opencrane/llmstxt/
are committed build inputs (small, human-readable). .opencrane/embeddings.json and
.opencrane/milvus.db are committed via Git LFS — regenerated by the weekly refresh ONLY when
chunks.json actually changed, and consumed by the PyPI package and the Hugging Face Space.
Nothing on Vercel needs vector artifacts: the chat searches through the Space's MCP endpoint.
Hosted MCP server
The public MCP server runs as a Hugging Face Docker Space
(OpenCrane + Milvus Lite + a local embedding model — no API keys anywhere):
MCP endpoint: https://derberg-asyncapi-knowledge-mcp.hf.space/http
Health check: https://derberg-asyncapi-knowledge-mcp.hf.space/health
It exposes search_docs, get_yaml_definition, get_list_members, and get_metadata_schema
over MCP Streamable HTTP. The free hardware sleeps when idle — the first request after a
quiet spell cold-starts in ~10–30 s.
Prefer running it locally? The knowledge base is also published to PyPI as a
standalone MCP package (built with opencrane pack, refreshed weekly):
claude mcp add asyncapi-knowledge -- uvx asyncapi-knowledge-mcp==0.0.4
or in any MCP client's config:
{
"mcpServers": {
"asyncapi-knowledge": { "type": "stdio", "command": "uvx", "args": ["asyncapi-knowledge-mcp==0.0.4"] }
}
}
Always pin a specific version (check PyPI
for the current release) — never rely on an implicit "latest"; unpinned installs are a
supply-chain risk.
Claude plugin (asyncapi-knowledge)
Use the AsyncAPI knowledge base from Claude Code via the asyncapi-knowledge plugin (an
asyncapi-researcher agent + the hosted MCP server):