Semantic Search

Semantic search over markdown files. Find related notes by meaning, not just keywords. Detect duplicates before creating new notes.
Supports two server transports:
- stdio MCP — For Claude Code integration (one process per session)
- HTTP — Combined MCP-over-HTTP + REST on one port; one warm process shared by all clients
Features
- Semantic search using sentence-transformers
- Duplicate/similar note detection
- Auto-updating index with file watcher
- Multi-directory support
- Inline tag extraction (
#tag-name)
Install
CPU-only install — recommended for macOS (any Mac, Apple Silicon or Intel) and Linux/Windows without an NVIDIA GPU. Saves ~5GB of CUDA binaries. On macOS, Apple GPU (MPS) is still auto-detected and used via PyTorch's built-in MPS backend — the "CPU" label refers only to the absence of CUDA, not to the compute device at runtime.
uv tool install --index https://download.pytorch.org/whl/cpu \
git+https://github.com/bborbe/semantic-search
CUDA install — only for Linux/Windows with a dedicated NVIDIA GPU. Not applicable to macOS (NVIDIA CUDA is not supported on Mac).
uv tool install git+https://github.com/bborbe/semantic-search
Upgrade
uv tool upgrade semantic-search
Server Modes
stdio MCP (per-session Claude Code)
Spawns one process per Claude Code session. Simple, but each session loads its own ~400 MB–1 GB model copy.
claude mcp add -s project semantic-search \
--env CONTENT_PATH=/path/to/vault \
-- \
uvx --from git+https://github.com/bborbe/semantic-search semantic-search-mcp serve
Tools available:
search_related(query, top_k=5) — Find semantically related notes
get_content(path, snippet, query, context_lines) — Retrieve file content from indexed vaults
check_duplicates(file_path) — Detect duplicate/similar notes
HTTP (shared across all clients)
Single long-running process serves MCP-over-HTTP at /mcp plus REST at /search, /duplicates, /health, /reindex. All Claude Code sessions and REST clients share one warm indexer.
CONTENT_PATH=/path/to/vault semantic-search-http --host 127.0.0.1 --port 8321
Point Claude Code at it via MCP config:
{
"mcpServers": {
"semantic-search": {
"type": "http",
"url": "http://127.0.0.1:8321/mcp"
}
}
}
REST endpoints:
| Endpoint | Method | Description |
|---|
/mcp | POST | MCP-over-HTTP (Claude Code) |
/search?q=...&top_k=5 | GET | Semantic search |
/duplicates?file=...&threshold=0.85 | GET | Find duplicate notes |
/content?path=...&snippet=...&query=...&context_lines=... | GET | Retrieve file content |
/health | GET | Health check with index stats |
/reindex | GET/POST | Force index rebuild |
Example queries:
# Search
curl 'http://127.0.0.1:8321/search?q=kubernetes+deployment'
# Find duplicates
curl 'http://127.0.0.1:8321/duplicates?file=notes/my-note.md'
# Health check
curl 'http://127.0.0.1:8321/health'
Two-Step Flow
Search for related notes, then fetch the full content of any result:
# Step 1: Search for related notes
curl 'http://127.0.0.1:8321/search?q=kubernetes+deployment'
# Returns: [{"path": "notes/k8s-guide.md", "score": 0.92}, ...]
# Step 2: Fetch the content of a result
curl 'http://127.0.0.1:8321/content?path=notes/k8s-guide.md'
# Returns: {"path": "/full/resolved/path.md", "content": "# Kubernetes Guide\n...", "mode": "full"}
Snippet Mode
Retrieve a focused snippet around a specific query term within a file:
# Get a focused snippet around "service mesh" in the file
curl 'http://127.0.0.1:8321/content?path=notes/k8s-guide.md&snippet=true&query=service+mesh&context_lines=10'
# Returns: {"path": "...", "content": "...\n## Service Mesh\n...", "mode": "snippet"}
Remote Deployment
get_content and GET /content enable remote deployment of semantic-search clients. Callers no longer need filesystem access to the vault directory — all content retrieval happens over HTTP/MCP from any network location. The server enforces path validation: files outside indexed roots are never served.
Claude Code Plugin
This repo also ships as a Claude Code marketplace plugin with commands for setup, search, and research.
Install
claude plugin marketplace add bborbe/semantic-search
claude plugin install semantic-search
Update
claude plugin marketplace update semantic-search
claude plugin update semantic-search@semantic-search
Quick Start