AsyncAPI Knowledge MCP

AI-powered documentation search for AsyncAPI, built with OpenCrane and hosted on Vercel.

Run it locally (full stack)

One command starts the whole chat — UI and agent function — and opens it in your browser:

./scripts/run-local.sh

Default (offline) — no API keys, no Vercel account. Prerequisites: uv (for uvx), Ollama, and Node 20+. Everything is served by a small local dev server (scripts/dev-server.mjs) — the vercel CLI is only needed for deployment work, never for local development.

Before the first run, pull the default model:

ollama pull mistral-small3.2

The first offline run generates local embeddings from .opencrane/chunks.json using OpenCrane (uvx opencrane embed) — this takes a few minutes and downloads the embedding model (~500 MB) once. After that it is instant.

The script picks ports automatically (chat defaults to 7180, local MCP to 7181; override with CHAT_PORT=... / MCP_PORT=...). To use a different Ollama model:

OLLAMA_MODEL=llama3.2 ./scripts/run-local.sh

Hosted mode — production parity: the Vercel AI Gateway for inference and the public Hugging Face Space MCP for search (override via SEARCH_MCP_URL). Requires AI_GATEWAY_API_KEY in your environment or .env.local (copy .env.example as a starting point):

./scripts/run-local.sh --hosted

Stop a detached stack with:

./scripts/stop-local.sh

Architecture

The core product is the documentation MCP server (the RAG); the public chat website is an optional layer on top. See docs/architecture.md for the two flows as Mermaid diagrams — the public chat (we host gpt-5-nano + the agent loop) and your own agent (any MCP client / the plugin / uvx asyncapi-knowledge-mcp, where you bring the model).

Knowledge pipeline (OpenCrane)

Sources:

asyncapi-website — https://www.asyncapi.com/docs (concepts, tutorials, reference, guides, community docs — fetched from the asyncapi/website GitHub repo, markdown/docs/ path)
asyncapi-json-schema — AsyncAPI 3.1.0 JSON Schema, downloaded weekly from asyncapi/website/config/3.1.0.json into .opencrane/sources/asyncapi-json-schema/3.1.0.json via scripts/download-schema.mjs. The .opencrane/sources/asyncapi-json-schema/asyncapi-schema-index.md file references it with a ```json-schema ``` fence so OpenCrane includes the schema content in the chunks. It is a manual (non-fetched) source, so it lives under .opencrane/sources/ alongside the fetched docs rather than being a separate local tree at the repo root.

Rebuild the index locally (note: fetch needs a non-empty GITHUB_TOKEN even for GitHub sources):

GITHUB_TOKEN=x uvx opencrane fetch
uvx opencrane llms
uvx opencrane chunk
uvx opencrane embed   # local embedding model — no API keys
uvx opencrane index

Artifact strategy: .opencrane/sources/ + .opencrane/chunks.json + .opencrane/llmstxt/ are committed build inputs (small, human-readable). .opencrane/embeddings.json and .opencrane/milvus.db are committed via Git LFS — regenerated by the weekly refresh ONLY when chunks.json actually changed, and consumed by the PyPI package and the Hugging Face Space. Nothing on Vercel needs vector artifacts: the chat searches through the Space's MCP endpoint.

Hosted MCP server

The public MCP server runs as a Hugging Face Docker Space (OpenCrane + Milvus Lite + a local embedding model — no API keys anywhere):

MCP endpoint:  https://derberg-asyncapi-knowledge-mcp.hf.space/http
Health check:  https://derberg-asyncapi-knowledge-mcp.hf.space/health

It exposes search_docs, get_yaml_definition, get_list_members, and get_metadata_schema over MCP Streamable HTTP. The free hardware sleeps when idle — the first request after a quiet spell cold-starts in ~10–30 s.

Prefer running it locally? The knowledge base is also published to PyPI as a standalone MCP package (built with opencrane pack, refreshed weekly):

claude mcp add asyncapi-knowledge -- uvx asyncapi-knowledge-mcp==0.0.4

or in any MCP client's config:

{
  "mcpServers": {
    "asyncapi-knowledge": { "type": "stdio", "command": "uvx", "args": ["asyncapi-knowledge-mcp==0.0.4"] }
  }
}

Always pin a specific version (check PyPI for the current release) — never rely on an implicit "latest"; unpinned installs are a supply-chain risk.

Claude plugin (asyncapi-knowledge)

Use the AsyncAPI knowledge base from Claude Code via the asyncapi-knowledge plugin (an asyncapi-researcher agent + the hosted MCP server):

asyncapi-knowledge

Popularity

What's Inside

README