mcp-recall

Your context window is finite. MCP tool outputs aren't. mcp-recall bridges the gap.
MCP tool outputs such as Playwright snapshots, GitHub issues, and file reads can consume tens of kilobytes of context per call. A 200K-token context window fills up in ~30 minutes of active MCP use. mcp-recall intercepts those outputs, stores them in full locally, and delivers compressed summaries to Claude instead. When Claude needs more detail, it retrieves exactly what it needs via full-text search (FTS), without re-running the tool.
Sessions that used to hit context limits in 30 minutes routinely run for 3+ hours.

The full context stack
Context pressure builds at four distinct layers. mcp-recall targets the two that nothing else handles.
flowchart TD
A(["Claude session begins"]) --> B
B["① Tool definitions loaded into context\n~500 tokens × every connected tool"]
B -->|"Claude Code Tool Search · Switchboard\ndefer unused schemas"| C
C["② Claude calls tools in sequence"]
C -->|"Code Mode · FastMCP 3.1\nrun script in sandbox, no intermediate results"| D
D["③ Tool returns large output\n50–85 KB per call"]
D -->|"mcp-recall\ncompresses to ~300 B, stores in SQLite"| E
E["④ Session ends"]
E -->|"mcp-recall\npersists across sessions via FTS index"| F(["Next session: clean context"])
| Layer | Problem | Solution |
|---|---|---|
| ① Tool definitions | Every connected MCP loads its full schema upfront (~500 tokens/tool) | Claude Code Tool Search (built-in) · Switchboard |
| ② Intermediate results | Multi-step workflows pass each result back through context | Code Mode · FastMCP 3.1 |
| ③ Single-tool outputs | One snapshot or API response dumps 50–85 KB | mcp-recall |
| ④ Cross-session memory | Useful context disappears when the session ends | mcp-recall |
Layers ① and ② have solid first-party and community solutions. mcp-recall focuses on ③ and ④ — the outputs that do land in context, and the knowledge that shouldn't vanish when the session ends. All four layers stack: run them together for maximum efficiency.
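To make the size figures above concrete, here is a rough back-of-the-envelope sketch. It assumes about 4 bytes per token, which is a common rule of thumb and not a number taken from mcp-recall; real tokenizers vary with content, so treat the results as order-of-magnitude estimates only.

```python
# Rough context-budget arithmetic.
# BYTES_PER_TOKEN is an assumption for illustration, not an mcp-recall figure.
BYTES_PER_TOKEN = 4
CONTEXT_WINDOW_TOKENS = 200_000


def tokens_for(size_bytes: int) -> int:
    """Estimate how many tokens a payload of size_bytes occupies."""
    return size_bytes // BYTES_PER_TOKEN


def calls_until_full(output_bytes: int) -> int:
    """Number of tool outputs of this size that fit in the window."""
    return CONTEXT_WINDOW_TOKENS // max(tokens_for(output_bytes), 1)


raw = 56 * 1024   # a 56 KB snapshot, as in the diagrams
summary = 300     # a ~300 B summary

print(tokens_for(raw), calls_until_full(raw))          # 14336 13
print(tokens_for(summary), calls_until_full(summary))  # 75 2666
```

Under that assumption, a little over a dozen raw snapshots would fill the window on their own, while summaries of the same calls would barely register.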
How it works
flowchart LR
A["MCP tool output\n(e.g. 56 KB snapshot)"] -->|"PostToolUse hook"| B(["mcp-recall"])
B -->|"~300 B summary"| C["Claude's context"]
B -->|"full content + FTS index"| D[("SQLite")]
D <-->|"recall__retrieve · recall__search"| C
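The store-and-retrieve flow in the diagram can be sketched with Python's built-in `sqlite3` module and SQLite's FTS5 extension. This is a minimal illustration, not mcp-recall's actual schema: the table name, column names, `[recall:…]` header format, and truncation-based summary are all hypothetical, and it assumes the local SQLite build has FTS5 enabled.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a store with one FTS5 table (hypothetical schema)."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS outputs "
        "USING fts5(item_id UNINDEXED, tool_name, content)"
    )
    return db


def store(db: sqlite3.Connection, item_id: str, tool_name: str,
          content: str, summary_len: int = 300) -> str:
    """Keep the full output locally; return a short summary for the context."""
    db.execute(
        "INSERT INTO outputs (item_id, tool_name, content) VALUES (?, ?, ?)",
        (item_id, tool_name, content),
    )
    db.commit()
    # Naive truncation stands in for a real compression handler.
    return f"[recall:{item_id}] {content[:summary_len]}"


def search(db: sqlite3.Connection, query: str, limit: int = 5) -> list:
    """Full-text search; returns (item_id, snippet) pairs, best match first."""
    rows = db.execute(
        "SELECT item_id, snippet(outputs, 2, '[', ']', '...', 8) "
        "FROM outputs WHERE outputs MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


def retrieve(db: sqlite3.Connection, item_id: str):
    """Fetch the full stored output for one item, or None if unknown."""
    row = db.execute(
        "SELECT content FROM outputs WHERE item_id = ?", (item_id,)
    ).fetchone()
    return row[0] if row else None
```

A search returns only matching snippets, so the model can decide whether the full output is worth pulling back into context. Note that FTS5 `MATCH` has its own query syntax, so raw user input may need quoting before it is passed in.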
Detailed pipeline
flowchart TD
A["MCP tool response<br/>(e.g. 56 KB snapshot)"] --> B[PostToolUse hook]
subgraph SEC["Security checks"]
DENY[denylist match?]
SCRT[secret detected?]
end
B --> DENY
DENY -- yes --> P1([skip: passes through unchanged])
DENY -- no --> SCRT
SCRT -- yes --> P2([skip + warn: passes through unchanged])
SCRT -- no --> DEDUP_N
subgraph DEDUP["Dedup check"]
DEDUP_N["sha256(name+input)"]
end
DEDUP_N -- "cache hit" --> CACHED(["[cached] header"])
DEDUP_N -- miss --> HAND_N
subgraph HANDLER["Compression handler (TOML profile first)"]
HAND_N["Playwright · GitHub · GitLab · Shell<br/>Linear · Slack · Tavily · Database<br/>Sentry · Filesystem · CSV · JSON · Text"]
end
HAND_N --> CTX["Context<br/>299 B summary + recall header"]
HAND_N --> DB_N
subgraph DB["SQLite store"]
DB_N["full content (56 KB) · summary (299 B)<br/>FTS index · access tracking · session days"]
end
DB_N --> TOOLS["recall__* tools<br/>retrieve · search · pin · note<br/>stats · session_summary · list · forget · export · context"]
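The decision order in the pipeline above (denylist, secret check, dedup, then compression) can be sketched as a single function. Everything named here is hypothetical: the denylist prefixes, the secret patterns, the in-memory cache, and the truncation used in place of a real handler are stand-ins chosen for illustration, and how mcp-recall actually serializes the tool input before hashing is not specified in this document.

```python
import hashlib
import json
import re

# Hypothetical configuration, for illustration only.
DENYLIST = ("secrets__", "vault__")  # assumed tool-name prefixes to skip
SECRET_RE = re.compile(
    r"AKIA[0-9A-Z]{16}|-----BEGIN [A-Z ]*PRIVATE KEY-----"
)

_seen: dict[str, str] = {}  # dedup key -> previously stored summary


def dedup_key(tool_name: str, tool_input: dict) -> str:
    """sha256 over tool name plus canonicalized input (key order ignored)."""
    payload = tool_name + json.dumps(tool_input, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def process(tool_name: str, tool_input: dict, output: str) -> tuple[str, str]:
    """Return (action, text that would reach the context)."""
    if tool_name.startswith(DENYLIST):
        return ("skip", output)        # passes through unchanged
    if SECRET_RE.search(output):
        return ("skip+warn", output)   # never stored; passes through unchanged
    key = dedup_key(tool_name, tool_input)
    if key in _seen:
        return ("cached", "[cached] " + _seen[key])
    summary = output[:300]             # stand-in for a compression handler
    _seen[key] = summary
    return ("stored", summary)
```

Running the security checks before anything is hashed or written means denied or secret-bearing outputs never touch the store, which matches the "passes through unchanged" branches in the diagram.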
Two hooks, one MCP server.
- SessionStart hook: records each active day, prunes expired items, and injects a compact context snapshot before the first message.
- PostToolUse hook: intercepts MCP tool outputs and the output of native Bash commands, deduplicates identical calls, then compresses, stores, and returns a summary.
- recall MCP server: exposes ten tools for retrieval, search, memory, and management.
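As a rough illustration of the three SessionStart steps (record the day, prune, inject a snapshot), here is a sketch using `sqlite3`. The schema, the `max_session_days` parameter, the rule that expiry is counted in active days rather than calendar days, and the assumption that pinned items survive pruning are all guesses made for this example, not documented mcp-recall behavior.

```python
import sqlite3
from datetime import date


def session_start(db: sqlite3.Connection, today: date,
                  max_session_days: int = 7) -> str:
    """Record today as active, prune old items, return a context snapshot."""
    # Hypothetical schema: one row per active day, one row per stored item.
    db.execute("CREATE TABLE IF NOT EXISTS session_days (day TEXT PRIMARY KEY)")
    db.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        "id TEXT PRIMARY KEY, summary TEXT, stored_day TEXT, "
        "pinned INTEGER DEFAULT 0)"
    )
    # 1. Record the active day (idempotent within a day).
    db.execute(
        "INSERT OR IGNORE INTO session_days (day) VALUES (?)",
        (today.isoformat(),),
    )
    # 2. Prune unpinned items once enough active days have passed since
    #    they were stored. ISO dates compare correctly as strings.
    db.execute(
        "DELETE FROM items WHERE pinned = 0 AND "
        "(SELECT COUNT(*) FROM session_days "
        " WHERE day > items.stored_day) >= ?",
        (max_session_days,),
    )
    db.commit()
    # 3. Build a compact snapshot from the most recent summaries.
    rows = db.execute(
        "SELECT id, summary FROM items ORDER BY stored_day DESC LIMIT 5"
    ).fetchall()
    return "\n".join(f"- {item_id}: {summary}" for item_id, summary in rows)
```

Counting active days instead of calendar days would mean a project left untouched for a month does not lose its stored context during the gap.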