Dolphin semantic code search plugin for Claude Code
npx claudepluginhub plasticbeachllc/dolphin
Modern search across indexed repositories using Dolphin. Provides hybrid vector + keyword retrieval for large codebases via CLI.
Hybrid search across all your repositories.
Dolphin indexes your repositories and lets you perform hybrid (semantic + keyword) search across them.
# Install
uv pip install pb-dolphin
# Set your OpenAI key (used for embeddings)
export OPENAI_API_KEY="sk-..."
# Initialize, add a repo, and search
dolphin init
dolphin add-repo my-project /path/to/project
dolphin index my-project
dolphin search "database connection pooling"
Dolphin indexes your code with language-aware chunking, embeds it, and returns ranked results.
Want live re-indexing as you edit files? Start the server:
dolphin serve
A small companion MCP server can be launched with bunx dolphin-mcp. Add this to your AI app's MCP config:
{
  "mcpServers": {
    "dolphin": {
      "command": "bunx",
      "args": ["dolphin-mcp"]
    }
  }
}
Make sure dolphin serve is running. Your agent can then search, retrieve chunks, and read files from your indexed repos.
Additionally, a Claude skill is available in this repo's marketplace as a personal Plugin.
               You / Agent
                    |
                    v
┌───────────────────────────────────────┐
│                Dolphin                │
│                                       │
│    CLI ─── REST API ─── MCP Bridge    │
│               |                       │
│        ┌──────┴──────┐                │
│        v             v                │
│     LanceDB        SQLite             │
│    (vectors) (metadata + BM25)        │
└───────────────────────────────────────┘
Indexing: Your code is scanned, split into semantic chunks using language-aware AST parsers, embedded via OpenAI, and stored in LanceDB (vectors) and SQLite (metadata + full-text).
Searching: Your query is embedded and matched against both vector similarity and BM25 keyword relevance. Results are fused with Reciprocal Rank Fusion, optionally reranked with a cross-encoder, and returned as structured snippets with file paths, line numbers, and scores.
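The fusion step can be illustrated with a minimal sketch of Reciprocal Rank Fusion. This is not Dolphin's implementation: the function name, the chunk IDs, and the constant k=60 (a commonly used default) are assumptions made for illustration only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into one ranking.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. k=60 is an assumed, conventional default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by chunk ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from the vector index and the BM25 index.
vector_hits = ["db/pool.py:10", "db/conn.py:42", "api/server.py:7"]
bm25_hits = ["db/conn.py:42", "config/settings.py:3", "db/pool.py:10"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
for chunk_id, score in fused:
    print(f"{score:.4f}  {chunk_id}")
```

Chunks that rank well in both lists rise to the top, while chunks found by only one retriever still appear further down. Because only ranks are used, the vector distances and BM25 scores never need to be put on a common scale.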
Intelligent hybrid search
Language-aware indexing
- Respects .gitignore and an optional repo-specific Dolphin config (dolphin init --repo)
Live sync
- Run dolphin serve so edits are re-indexed automatically
Multiple interfaces
- dolphin CLI with compact, verbose, and JSON output modes
- MCP server via bunx dolphin-mcp

Dolphin uses language-aware AST chunkers for the languages marked AST in the table below. Files in other recognized languages fall back to token-window chunking, and completely unknown extensions use a generic text chunker.
| Language | Extensions | Chunker |
|---|---|---|
| Python | .py, .pyw, .pyi | AST |
| TypeScript | .ts, .tsx | AST |
| JavaScript | .js, .jsx, .mjs, .cjs | AST |
| Markdown | .md, .markdown | AST |
| SQL | .sql | AST |
| Svelte | .svelte | AST |
| Go, Rust, Java, C/C++ | .go, .rs, .java, .c, .cpp | Token-window |
| Ruby, PHP, C#, Swift, Kotlin | .rb, .php, .cs, .swift, .kt | Token-window |
| Shell | .sh, .bash, .zsh | Token-window |
| Config (JSON, YAML, TOML, XML) | .json, .yaml, .yml, .toml, .xml | Token-window |
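
For readers unfamiliar with the Token-window fallback in the table, here is a minimal sketch of the general technique. It is not Dolphin's chunker: it assumes whitespace-separated tokens rather than a model tokenizer, and the function name, window size, and overlap are hypothetical.

```python
def token_window_chunks(text, window=200, overlap=40):
    """Split text into overlapping windows of whitespace-separated tokens.

    `window` and `overlap` are illustrative values, not Dolphin defaults.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must be non-negative and smaller than window")
    tokens = text.split()
    step = window - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + window]))
        if start + window >= len(tokens):
            break  # this window already reached the end of the text
    return chunks


# Ten tokens, windows of four tokens, one token shared between neighbours.
sample = " ".join(f"tok{i}" for i in range(10))
chunks = token_window_chunks(sample, window=4, overlap=1)
for chunk in chunks:
    print(chunk)
```

The overlap means a statement that straddles a window boundary still appears whole in at least one chunk, at the cost of some duplicated text in the index.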
You can customize extension mappings in ~/.dolphin/config.toml under the [languages] section.
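
The fallback order described above (AST, then token-window, then generic text) can be sketched as a lookup on the file extension. The extension sets are copied from the table. The function name, the returned labels, the case-insensitive matching, and the `overrides` argument are assumptions for illustration; `overrides` stands in for user-supplied mappings and does not reflect the actual format of the [languages] section.

```python
from pathlib import PurePath

# Extension sets copied from the table above.
AST_EXTENSIONS = {
    ".py", ".pyw", ".pyi", ".ts", ".tsx", ".js", ".jsx", ".mjs", ".cjs",
    ".md", ".markdown", ".sql", ".svelte",
}
TOKEN_WINDOW_EXTENSIONS = {
    ".go", ".rs", ".java", ".c", ".cpp",
    ".rb", ".php", ".cs", ".swift", ".kt",
    ".sh", ".bash", ".zsh",
    ".json", ".yaml", ".yml", ".toml", ".xml",
}


def pick_chunker(path, overrides=None):
    """Return 'ast', 'token-window', or 'generic-text' for a file path.

    `overrides` is a hypothetical dict mapping an extension to a chunker
    label; it takes precedence over the built-in sets.
    """
    ext = PurePath(path).suffix.lower()  # lowercasing is an assumption
    if overrides and ext in overrides:
        return overrides[ext]
    if ext in AST_EXTENSIONS:
        return "ast"
    if ext in TOKEN_WINDOW_EXTENSIONS:
        return "token-window"
    return "generic-text"


for name in ["src/app.tsx", "cmd/main.go", "notes.xyz", "Makefile"]:
    print(name, "->", pick_chunker(name))
```

Passing `overrides={".go": "ast"}` would route Go files to the AST label in this sketch, which mirrors the idea of remapping an extension in the config without claiming that Dolphin ships an AST chunker for Go.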