🗜 claude-context-compressor

Reduce Claude Code input token usage by 10-25% offline (zero cost), or by 60-75% with the optional API mode. The plugin runs offline by default, so no API key is needed. Works on Windows, Mac, Linux.

Companion to caveman (output compression). This compresses input: session memory, CLAUDE.md, auto-memory, tool outputs.

Install

claude plugin install github:rico03/claude-context-compressor

That's it. No configuration needed. Hooks activate automatically on the next session.

Requirements: Python 3.9 or newer (the interpreter is auto-detected on all platforms)

What it does

Every Claude Code session re-injects your full CLAUDE.md and auto-memory. After 10+ sessions that's thousands of redundant tokens on every message. This plugin intercepts and compresses them.

Hook	Event	What it compresses
`SessionStart`	Session open	CLAUDE.md + auto-memory → compressed
`UserPromptSubmit`	Each message	facts store → only relevant facts injected
`PostToolUse`	After Bash/Read/Grep	large outputs → truncated
`Stop`	Session end	extracts key facts → merges into facts store
`PreCompact`	Before /compact	backs up critical facts
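
To make the `UserPromptSubmit` row more concrete, here is a minimal sketch of how "only relevant facts" could be picked offline. It is an illustration, not the plugin's actual code: the function name `select_relevant_facts`, the `max_facts` parameter, and the word-overlap scoring are all assumptions.

```python
import re


def select_relevant_facts(prompt: str, facts: list[str], max_facts: int = 3) -> list[str]:
    """Return up to max_facts facts that share the most words with the prompt.

    Hypothetical helper: the real plugin may rank facts differently.
    """
    def words(text: str) -> set[str]:
        # Lowercase alphanumeric tokens; punctuation is ignored.
        return set(re.findall(r"[a-z0-9_]+", text.lower()))

    prompt_words = words(prompt)
    scored = [(len(prompt_words & words(fact)), fact) for fact in facts]
    # Drop facts with no overlap, then sort by overlap (ties keep input order).
    relevant = sorted((item for item in scored if item[0] > 0), key=lambda item: -item[0])
    return [fact for _, fact in relevant[:max_facts]]


facts = [
    "database is PostgreSQL 15",
    "user prefers tabs over spaces",
    "auth bug fixed in login handler",
]
print(select_relevant_facts("why does the login handler fail?", facts))
```

A plain word overlap like this also matches on common words such as "the", so a real implementation would likely filter stop words or weight rarer terms.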

Facts store merge logic:

current_state, preferences, bugs → latest value overrides old
decisions, architecture → accumulate forever (never lost)
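
The two merge rules above can be sketched in a few lines of Python. This is a hedged illustration only: the function name `merge_facts` and the data shape (a single value for override categories, a list for accumulating ones) are assumptions, and the handling of unknown categories is a guess.

```python
OVERRIDE_KEYS = {"current_state", "preferences", "bugs"}
ACCUMULATE_KEYS = {"decisions", "architecture"}


def merge_facts(store: dict, new: dict) -> dict:
    """Merge newly extracted facts into the store without mutating either input."""
    merged = dict(store)
    for key, value in new.items():
        if key in ACCUMULATE_KEYS:
            # Accumulate: keep every old entry, append only unseen new ones.
            combined = list(merged.get(key, []))
            for item in value:
                if item not in combined:
                    combined.append(item)
            merged[key] = combined
        else:
            # Override: the latest value wins. Unknown categories are
            # treated the same way here, which is an assumption.
            merged[key] = value
    return merged


store = {"current_state": "refactoring auth", "decisions": ["use PostgreSQL"]}
new = {"current_state": "writing tests", "decisions": ["use PostgreSQL", "adopt pytest"]}
print(merge_facts(store, new))
```

Deduplicating on append keeps the accumulated lists from growing when the same decision is extracted in several sessions.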

Configuration

Optional. Create .claude/compress.config.json in your project to override defaults:

{
  "level": "input-only"
}

Level	Input	Output	Notes
`off`	❌	❌	Disable everything
`input-only`	✅ aggressive	❌	Default
`balanced`	✅ light	caveman lite	Pair with caveman
`max`	✅ aggressive	caveman ultra	Maximum savings

Enable API mode (optional)

By default the plugin runs fully offline using heuristic compression (no API key required, zero cost). To unlock higher compression via the Claude API, set the following in your config:

{
  "use_api": true,
  "model": "claude-haiku-4-5-20251001"
}

API mode adds: semantic memory compression, smart fact selection per prompt, intelligent tool output summarization, and session fact extraction.

Manual override per project

{
  "input": {
    "enabled": true,
    "memory_compression": "aggressive",
    "tool_output_limit": 200,
    "min_tokens_to_compress": 100
  }
}
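
As a rough picture of how the two numeric settings might interact, here is a sketch of threshold-gated truncation. The units are not documented above, so this assumes `tool_output_limit` is a maximum line count and `min_tokens_to_compress` is an estimated token count. The helper names and the four-characters-per-token estimate are also assumptions.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic (assumption): about four characters per token.
    return len(text) // 4


def truncate_output(text: str, tool_output_limit: int = 200,
                    min_tokens_to_compress: int = 100) -> str:
    """Keep the head and tail of a large tool output, dropping the middle."""
    if estimate_tokens(text) < min_tokens_to_compress:
        return text  # Too small to be worth compressing.
    lines = text.splitlines()
    if len(lines) <= tool_output_limit:
        return text
    head = tool_output_limit // 2
    tail = tool_output_limit - head
    omitted = len(lines) - tool_output_limit
    marker = f"... [{omitted} lines omitted] ..."
    # Slice from an explicit index so tail == 0 yields an empty list.
    return "\n".join(lines[:head] + [marker] + lines[len(lines) - tail:])


big = "\n".join(f"line {i}" for i in range(1000))
print(truncate_output(big, tool_output_limit=10))
```

Keeping both ends preserves the command's first lines and its final status or error, which are usually the most informative parts of a long output.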

Commands

After install, use /compress in Claude Code:

/compress status — current config + tokens saved this session
/compress level max — switch preset
/compress log — last 10 compression events
/compress facts — view current facts store

View your savings

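python -c "
import json
from pathlib import Path

# Each line of the log is one JSON record written by a compression hook.
log = Path('.claude/token_log.jsonl')
text = log.read_text(encoding='utf-8') if log.exists() else ''
entries = [json.loads(line) for line in text.splitlines() if line.strip()]
total = sum(entry['saved'] for entry in entries)
# Show the five most recent events, then the overall total.
for entry in entries[-5:]:
    print(f'  [{entry[\"hook\"]:25}] {entry[\"before\"]:>5}→{entry[\"after\"]:>5}  ({entry[\"reduction_pct\"]:>5.1f}%)')
print(f'\n  Total saved: {total:,} tokens')
"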

Combine with caveman

claude plugin install github:JuliusBrussee/caveman
claude plugin install github:rico03/claude-context-compressor

Then set "level": "max" in .claude/compress.config.json. Together, the two plugins give roughly 70% total token reduction.

License

MIT
