By Madhan230205
Optimizes Claude Code interactions by reducing token usage and costs through hybrid retrieval (BM25 + vector search) and summarization, compressing large codebases into compact context packets while preserving key information for reasoning.
Compresses top retrieval chunks into citation-rich summary packets that preserve intent while cutting token usage.
Runs hybrid retrieval with strict FTS-first policy, BM25 lexical ranking, vector merge, and top 3-5 reranking for token-efficient context selection.
Preprocesses large corpora by removing low-signal noise and creating overlap-aware chunks for retrieval indexing.
Delegates software engineering operations to focused subagents using compressed context packets to keep the main conversation lean.
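The hybrid retrieval step described above (BM25 lexical ranking merged with vector similarity, then cut to a handful of top chunks) can be illustrated with a small, dependency-free sketch. This is not Token Reducer's actual code: the function names are hypothetical, character-trigram cosine stands in for a real embedding backend, and reciprocal rank fusion is an assumed merge strategy. The FTS-first policy and the reranking stage are left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens used for lexical scoring."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    avg_len = sum(len(t) for t in tokenized) / max(n, 1)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    query_terms = set(tokenize(query))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def trigram_vector(text):
    """Sparse character-trigram counts; a stand-in for a real embedding."""
    lowered = text.lower()
    return Counter(lowered[i:i + 3] for i in range(len(lowered) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_top_k(query, docs, k=3, rrf_k=60):
    """Merge lexical and vector rankings with reciprocal rank fusion (assumed)."""
    lexical = bm25_scores(query, docs)
    query_vec = trigram_vector(query)
    semantic = [cosine(query_vec, trigram_vector(d)) for d in docs]
    fused = [0.0] * len(docs)
    for scores in (lexical, semantic):
        order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
        for rank, i in enumerate(order):
            fused[i] += 1.0 / (rrf_k + rank + 1)
    best = sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)
    return best[:k]


docs = [
    "def login(user, password): check auth token",
    "render the html template for the home page",
    "parse csv rows into records",
]
print(hybrid_top_k("auth login", docs, k=1))
```

Rank fusion is used here because BM25 scores and cosine similarities live on different scales, so combining ranks avoids having to normalize raw scores.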
Requires secrets
Needs API keys or credentials to function
The open-source alternative to expensive context management tools.
Every time you use Claude with a large codebase, you're paying for thousands of tokens that aren't relevant to your query. Most context management tools either:
Token Reducer is a local-first, intelligent context compression pipeline that:
┌─────────────────┐     ┌───────────────┐     ┌──────────────────┐
│  Your Codebase  │────▶│ Token Reducer │────▶│    Compressed    │
│ (50,000 tokens) │     │   Pipeline    │     │  Context (500t)  │
└─────────────────┘     └───────────────┘     └──────────────────┘
                                │
                      ┌─────────┴─────────┐
                      │ - AST Chunking    │
                      │ - BM25 + Vector   │
                      │ - TextRank        │
                      │ - Import Graph    │
                      │ - 2-Hop Symbols   │
                      └───────────────────┘
/plugin Command (Recommended)
Step 1: Register the marketplace (one-time setup):
/plugin marketplace add Madhan230205/token-reducer
This registers the marketplace as Madhan230205-token-reducer.
Step 2: Install:
/plugin install token-reducer@Madhan230205-token-reducer
For project-scoped install:
/plugin install token-reducer@Madhan230205-token-reducer --scope project
Already ran Step 1 before? Just run
/plugin install token-reducer@Madhan230205-token-reducer
There is no need to add the marketplace again.
# 1. Clone into your Claude plugins folder
git clone https://github.com/Madhan230205/token-reducer.git ~/.claude/plugins/token-reducer
# 2. Install dependencies (optional but recommended for best results)
pip install -r ~/.claude/plugins/token-reducer/requirements-optional.txt
Windows users: Replace ~/.claude/plugins/ with %USERPROFILE%\.claude\plugins\
Then open ~/.claude/settings.json and add:
{
"plugins": ["~/.claude/plugins/token-reducer"]
}
Restart Claude Code. Done.
What requirements-optional.txt installs:
| Package | Purpose |
|---|---|
| sentence-transformers | Neural embeddings for smarter retrieval |
| hnswlib / faiss-cpu | Fast approximate nearest-neighbor search |
| tree-sitter + language grammars | AST-based code chunking (Python, JS, TS, Go, Rust, Java, C/C++, Ruby) |
If you skip this step, Token Reducer still works using hash embeddings and regex chunking — no ML libraries required.
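The regex chunker itself is not shown here. As a rough illustration of what overlap-aware chunking without ML libraries can look like, the sketch below splits text into fixed-size line windows that share a few lines with their neighbors. The function name, window size, and overlap are all assumptions for illustration, and a real fallback would likely also split on code boundaries such as function definitions.

```python
def chunk_lines(text, size=40, overlap=8):
    """Split text into windows of `size` lines; neighbors share `overlap` lines."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    lines = text.splitlines()
    step = size - overlap
    chunks = []
    for start in range(0, len(lines), step):
        window = lines[start:start + size]
        chunks.append({
            "start_line": start + 1,          # 1-based, inclusive
            "end_line": start + len(window),  # 1-based, inclusive
            "text": "\n".join(window),
        })
        if start + size >= len(lines):
            break  # the last window already reaches the end of the text
    return chunks


sample = "\n".join(f"line {i}" for i in range(1, 11))
for chunk in chunk_lines(sample, size=4, overlap=1):
    print(chunk["start_line"], chunk["end_line"])
```

Keeping line ranges alongside each chunk is what makes citation-style summaries possible later, since a compressed packet can point back to the exact source lines.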
No pip, no ML libs — runs immediately after cloning:
git clone https://github.com/Madhan230205/token-reducer.git
cd token-reducer
python scripts/context_pipeline.py run \
--inputs ./src \
--query "Find auth logic" \
--embedding-backend hash \
--db .cache/index.db
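The hash embedding backend selected by --embedding-backend hash is not documented in detail here. One common way to build embeddings without any ML library is signed feature hashing, sketched below under that assumption. The function names and the 512-dimension default are hypothetical and may not match what context_pipeline.py does.

```python
import hashlib
import math
import re


def hash_embed(text, dim=512):
    """Map text to a fixed-size unit vector using signed feature hashing."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9_]+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dim  # bucket for this token
        sign = 1.0 if digest[4] % 2 == 0 else -1.0       # sign reduces collision bias
        vec[index] += sign
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def similarity(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


query_vec = hash_embed("Find auth logic")
chunk_vec = hash_embed("def check_auth(token): ...")
print(round(similarity(query_vec, chunk_vec), 3))
```

Vectors like these capture only token overlap, not meaning, which is consistent with the neural backend being recommended for best results.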
npx claudepluginhub madhan230205/token-reducer --plugin claude-token-reducer