token-reduce

token-reduce is an orchestration layer for agent context spending.
It provides built-in routing/enforcement, installs and wires downstream tools, and tells Claude/Codex when to use each tool path.
Agents: start with llms.txt for compact install/usage instructions.
What It Is
token-reduce is not just a prompt style or a single helper script.
It is a high-level token orchestration kit that:
- enforces low-cost discovery first
- auto-routes between path/snippet/structural tiers
- wires tool hooks so wasteful calls are blocked before execution
- integrates a dependency suite and operational benchmarking/review gates
Dependency Suite (Feature)
token-reduce treats dependency integration as a first-class feature.
Instead of manually managing tool-by-tool setup, it installs/wires defaults and exposes conditional tiers explicitly.
Core orchestrated stack
- QMD: BM25 path/snippet retrieval
- RTK: output compression for executed commands
- token-reduce helpers/hooks: path/snippet/adaptive routing + enforcement
Conditional companions
- AXI (
gh-axi, chrome-devtools-axi) for GitHub/browser-heavy execution
delegate-skill router for planner-first delegation — routes to devin (browser/sandbox), kimi (cheap research/review), grok (large codebase), or spark (local Codex write-mode) (standalone repo integration)
token-savior for exact symbol / impact acceleration
context-mode for output-heavy payload sessions
headroom as an optional pilot proxy/MCP layer for large tool-result and long-session context pressure; token-reduce remains the master router
code-review-graph for large-repo structural review tasks
caveman for optional terse output and memory-file compression
Not in default routing
token-optimizer-mcp, token-optimizer, and infra-coupled claude-context are intentionally excluded from default routing.
- Legacy graphify-first orchestration is not part of active routing.
Install And Activation
Quick setup (core stack + hooks + wrappers):
git clone https://github.com/chimera-defi/token-reduce-skill tools/token-reduce-skill
./tools/token-reduce-skill/scripts/setup.sh
What setup.sh does automatically:
- installs/configures core tools (
qmd, rtk) when possible
- runs initial QMD indexing for docs+code (
**/*.{md,txt,rst,py,sh,...}), which can take longer on first run
- wires Claude hooks for prompt steering + pre-tool enforcement
- links global wrappers (
token-reduce-adaptive, token-reduce-paths, token-reduce-snippet, token-reduce-manage)
- links the Codex skill and companion skills when present
Optional QMD scope overrides:
TOKEN_REDUCE_QMD_MASK: explicit glob mask passed to qmd collection add
TOKEN_REDUCE_QMD_EXTENSIONS: comma-separated extension list used to build the default mask
TOKEN_REDUCE_QMD_REFRESH_TTL_SECONDS: controls how often collection fingerprints are recomputed before refresh checks (default runtime: 900)
TOKEN_REDUCE_QMD_SEARCH_TIMEOUT_SECONDS: cap runtime qmd search latency before falling back to scoped rg (default: 8 in runtime, 0 in benchmark/test contexts)
Standalone companion integration (delegate-skill router — devin / kimi / grok / spark):
git clone https://github.com/chimera-defi/delegate-skill "$HOME/.claude/skills/delegate-skill"
./tools/token-reduce-skill/scripts/integrate-delegate-skill.sh
One-command measured activation (core-only default + validate):
git clone https://github.com/chimera-defi/token-reduce-skill tools/token-reduce-skill
./tools/token-reduce-skill/scripts/token-reduce-manage.sh activate-stack
Enable conditional companion install during activation:
TOKEN_REDUCE_ACTIVATE_EXTENDED_STACK=1 ./tools/token-reduce-skill/scripts/token-reduce-manage.sh activate-stack
Direct optional companion install surface:
TOKEN_REDUCE_INSTALL_EXTENDED_STACK=1 ./tools/token-reduce-skill/scripts/setup.sh
Host Support
Claude Code
git clone https://github.com/chimera-defi/token-reduce-skill tools/token-reduce-skill
./tools/token-reduce-skill/scripts/setup.sh
The skill is then available as /token-reduce in any Claude Code session rooted in the repo.
Codex