Name: Token-Efficiency Coach
Author: jimmynycu

Token-Efficiency Coach — find wasted tokens, keep the good stuff

A Claude Code plugin that shows exactly where your AI coding session wasted tokens — and how to fix it.

It flags waste, never quality. It runs on your machine. It speaks tokens, not dollars.

−45% tokens · −44% turns · 4/6 → 6/6 correct
Measured in a live A/B: 6 sessions per arm, 12 total (see the method below).
Runs as an external local engine, with zero added context overhead in your own session.

Install · What it catches · Numbers · The story · How it works · FAQ

Install

About a minute. Two commands. The first adds the jimmy-tools marketplace from this repo (Jimmynycu/token-efficiency); the second installs the plugin from it. The @jimmy-tools handle is the marketplace name declared in .claude-plugin/marketplace.json — copy both lines exactly.

# 1) add the marketplace (registers as: jimmy-tools)
claude plugin marketplace add Jimmynycu/token-efficiency

# 2) install the plugin from that marketplace
claude plugin install token-efficiency-coach@jimmy-tools

Prefer the interactive UI? Run /plugin marketplace add Jimmynycu/token-efficiency, then /plugin install token-efficiency-coach@jimmy-tools — or just /plugin and pick it from Discover.

[!NOTE] What "runs locally" means. The coach does all its analysis on your machine, and no session content ever leaves it. The install step above is the one exception: marketplace add fetches this repo over the network and may ask you to confirm that you trust the marketplace before it registers. Once installed, nothing else phones home.

[!TIP] What you get, immediately: a live token-first statusline (ctx tokens · cache% · ⚠), an automatic non-blocking coach that summarizes waste when a session ends, and an on-demand /token-efficiency-coach:coach you can run any time.

That's it. No API keys, no account. The coach reads your local session and talks back.

What it catches

The coach scans a session for waste — the tokens you paid for and didn't need. It does not grade whether your code or answers were "good." Eight things it flags:

Context bloat — the window quietly ballooning turn over turn
Uncached context — paying full freight for context that could have been cached
Oversized tool output — a single tool dumping a wall of tokens into the window
Output-heavy turns — generation spend that ran hotter than the task needed
Failed tool calls — errors you paid to send and paid to read
Wrong-tier routing — a heavyweight model doing featherweight work
Redundant reads — re-reading the same file the model already has
Very long single thread — one sprawling session that never compacted or split

[!NOTE] Every flag is a token flag. The coach never says "your code is bad." It says "this part cost more than it had to, here's the cheaper path."
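
To make one of these flags concrete, here is a minimal sketch of how "redundant reads" could be spotted. It is an illustration only, not the coach's actual detection logic: the input format (one file path per line, in the order a session read them) and the function name redundant_reads are assumptions made for this example.

```shell
# Hypothetical sketch, NOT the coach's real implementation or input format.
# Input on stdin: one file path per line, one line per read in a session.
# Output: every path that was read more than once, with its read count.
# Limitation: paths containing spaces are not handled by this simple version.
redundant_reads() {
  sort | uniq -c | awk '$1 > 1 { print $2 " read " $1 " times" }'
}

# Example: src/app.py is read three times, README.md once.
printf '%s\n' src/app.py README.md src/app.py src/app.py | redundant_reads
```

Each repeat is a candidate for waste in the sense used above: the file's contents were already in the window, so reading it again spends tokens on nothing new.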

Numbers

We ran a live A/B: 6 sessions per arm, 12 real headless claude sessions total, one fixed task, the actual work held constant. Coached vs. baseline:

Metric	Baseline	Coached	Change
Mean tokens / session	133,146	72,834	−45%
Mean turns / session	4.5	2.5	−44%
Task correctness	4 / 6	6 / 6	+2 solved (4/6 → 6/6)
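
The Change column can be re-derived from the two means as (coached − baseline) / baseline. The snippet below is a small check of that arithmetic using only the figures in the table; the helper name pct_change is made up for this example and is not part of the plugin.

```shell
# Hypothetical helper: percent change from a baseline value to a coached value,
# rounded to a whole percent. Prints an ASCII "-" rather than the "−" in the table.
pct_change() {
  awk -v b="$1" -v c="$2" 'BEGIN { printf "%.0f%%\n", (c - b) / b * 100 }'
}

pct_change 133146 72834   # mean tokens per session
pct_change 4.5 2.5        # mean turns per session
```

The two calls print -45% and -44%, which match the table.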
