By arturoyo
Cut Claude Code costs 20-55%. Smart model routing (Sonnet default, Opus only when needed) + terse output compression. Open source.
Force all requests to use Claude Haiku (cheap tier)
Force all requests to use Claude Opus (premium tier)
Force all requests to use Claude Sonnet (mid tier)
Clear tier override and return to auto-routing
Show optym-lite proxy status (running/stopped, mode, port)
Own this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimOwn this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimBased on adoption, maintenance, documentation, and repository signals. Not a security audit or endorsement.
Stop burning expensive AI quota on cheap tasks. Smart model routing for Claude Code and Codex CLI.
Claude Code and Codex charge per request — or burn through quota fast. Most tasks don't need your most expensive model.
Without optym-code:
"hello" → Opus / o3 (expensive, wasted)
"read this file" → Opus / o3 (expensive, wasted)
"add try-catch" → Opus / o3 (expensive, wasted)
"design architecture" → Opus / o3 (actually needed)
Result: quota gone in 2 hours
With optym-code:
"hello" → Haiku / gpt-4.1-mini (20x cheaper)
"read this file" → Haiku / gpt-4.1-mini (20x cheaper)
"add try-catch" → Sonnet / gpt-4.1 (5x cheaper)
"design architecture" → Opus / o3 (escalated)
Result: same work, 30-55% less cost
Works as a local proxy — invisible to your CLI. No config beyond install, no data leaves your machine.
curl -s https://raw.githubusercontent.com/arturoyo/optym-code/master/install.sh | bash
Restart Claude Code. Done.
optym-code setup codex
optym-code start
Writes ~/.codex/config.toml pointing Codex at the proxy. Requires OPENAI_API_KEY set in your environment.
A local proxy runs on localhost:8088. Your CLI sends requests through it.
Pro mode upgrades classification to ML (92% accuracy, 45-55% savings) via api.optym.pro. Your LLM traffic never goes through optym servers — only the classification request does.
| CLI | Supported | Setup |
|---|---|---|
| Claude Code (subscription) | Yes | install.sh |
| Claude Code (API key) | Yes | install.sh + set ANTHROPIC_BASE_URL |
| Codex CLI | Yes | optym-code setup codex |
| Aider, Cursor, any Anthropic client | Yes | Set ANTHROPIC_BASE_URL=http://localhost:8088 |
Real numbers from production usage:
| Mode | Routing accuracy | Typical savings |
|---|---|---|
| Free (local rules) | ~70% | 20-35% |
| Pro (ML cloud) | 92% | 45-55% |
| Command | What |
|---|---|
/optym | Activate terse mode (cuts output tokens ~65%) |
/savings | Show savings dashboard |
/force-opus | Force Opus for all requests |
/force-haiku | Force Haiku for all requests |
/force-sonnet | Force Sonnet for all requests |
/optym-reset | Return to auto-routing |
stop optym | Deactivate terse mode |
optym-code start # start proxy (auto-started on Claude Code session)
optym-code stop # stop proxy
optym-code status # show proxy status and savings
optym-code setup codex # configure Codex CLI
Free tier: static regex rules, 100% local, 20-35% savings.
Optym Pro — ML classification, 92% accuracy, 45-55% savings. $9/month.
export OPTYM_PRO_KEY=optym_your_key
Your LLM traffic stays direct to Anthropic/OpenAI. Only prompt text is sent to classify.
api.optym.pro/v1/classify for ML routing. Not stored.{"telemetry": false} in ~/.optym-lite/config.jsonMIT
npx claudepluginhub arturoyo/optym-code --plugin optym-codeUltra-compressed communication mode. Cuts ~75% of tokens while keeping full technical accuracy by speaking like a caveman.
Memory compression system for Claude Code - persist context across sessions
Multi-model consensus engine integrating OpenAI Codex CLI, Gemini CLI, and Claude CLI for collaborative code review and problem-solving.
Curate auto-memory, promote learnings to CLAUDE.md and rules, extract proven patterns into reusable skills.