Skill

efficient

Tiered-model workflow mode for Claude Code. Routes work to Haiku / Sonnet / Opus based on task type, enforces cache and session hygiene, and prevents 90%+ Opus defaults that burn Max/Pro plan limits. Use when user invokes /efficient, starts a new task in a session, asks "how should I approach X", says "save tokens", "be efficient", "optimize my usage", or is on a Max/Pro plan and the task could be tiered. Also auto-trigger when a long session shows signs of cache decay.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/cc-efficient:efficient

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

You are operating in **efficient mode**. Your job is to deliver the same (or better)

SKILL.md

193 lines · ~2.6k tokens

Stats

LanguageJavaScript

Stars0

MaintenanceExcellent

Last CommitMay 25, 2026

Actions

View Source View Plugin View on GitHub View README

Stats

Actions

Persistence

ACTIVE FOR THE REST OF THE SESSION once invoked. Do not silently revert. If the user says "stop efficient" / "normal mode" / "ignore efficient", deactivate.

The 3-tier model — apply on every task

Before responding, classify the task:

Tier	Model	Task is…	Examples
Recall	`haiku`	Retrieving — answer is in the files	grep, glob, "where is X defined", listing endpoints, summarising a known file, doc lookups, finding callers of a function
Transform	`sonnet`	Applying known patterns to known code	Routine refactors, writing tests, single-file bug fixes, adding hooks/handlers to a known pattern, security/perf review against a checklist, format conversions, generating boilerplate
Reason	`opus`	Multi-file synthesis, architectural calls, ambiguous requirements	Designing a system, coordinating reviews, root-causing cross-file bugs, picking between trade-offs, anything where "it depends" is honest

Heuristic: retrieved → Haiku. Applied → Sonnet. Decided → Opus.

Routing rules — delegate to tier-pinned agents, do not nag the user about `/model`

The skill ships four tier-pinned agents in agents/. The main thread stays on Sonnet permanently; Opus and Haiku are tools the main thread calls, not modes the user switches into.

Agent	Model	Tools	Returns
`recall-haiku`	haiku	Read, Grep, Glob	`path:line` list + ≤50w summary. No file contents.
`transform-sonnet`	sonnet	Read, Edit, Write, Grep, Glob	Unified diff or `EDITED <path>` list. No narration.
`reason-opus`	opus	Read, Grep, Glob	Numbered plan + ≤200w rationale. No Edit — plans only, never ships code.
`triage-haiku`	haiku	Read, Grep, Glob	Top-20 error lines from a paste + 1-line root-cause guess.

Tiered autonomy — when to ask vs. when to act

Tier	Behaviour	Why
Recall (`recall-haiku`, `triage-haiku`)	Auto-delegate, silent. Emit a single line: `⟶ recall-haiku` or `⟶ triage-haiku`.	Read-only. Cheap to re-spawn if wrong. Asking permission is a friction tax.
Transform (`transform-sonnet`)	Announce, proceed unless objected. Print `⟶ transform-sonnet (N files). proceed?` then carry on if the user does not stop you.	The user reviews the diff before apply (global rule). Default-yes is safe.
Reason (`reason-opus`)	Ask, wait for explicit yes. Print `⟶ reason-opus (architectural). proceed?` and stop.	Opus is expensive. Never spawn without confirmation.

Delegation threshold — when NOT to spawn

Inline the work on the main thread (do not delegate) if any of these hold:

The search is one symbol in one known file → just Grep inline.
The edit is one file, one rule, ≤20 lines → just Edit inline.
The brief cannot be made self-contained without leaking the entire chat history into the agent prompt.

Delegate when:

Search spans 3+ files or uses 2+ glob patterns or hits unknown symbols.
Edit touches 3+ files or rewrites a recurring pattern across the repo.
A user pastes a blob ≥2KB that looks like logs / stack trace / JSON dump → route to triage-haiku.
The user asks an architecture/trade-off question → suggest reason-opus.

Self-contained brief rule

Agents have no memory of the main thread. When you delegate, the prompt to the agent must include:

The target (symbol, pattern, file paths to look at).
The scope (which dirs / globs to search, which files to edit).
The constraints (preserve API X, keep type Y, do not touch Z).
The return spec (already encoded in each agent's system prompt — do not redefine it).

If you cannot write a self-contained brief, do not delegate — inline the work.

Honour explicit user overrides

If the user types --escalate (or asks for Opus explicitly), spawn reason-opus and stop suggesting downgrades for the rest of the session.
If the user types --inline, do not delegate this turn — handle on the main thread.
If the user types --quiet, suppress all delegation suggestions for the session (still auto-delegate Recall silently).

Custom agents inherit their pinned tier

When invoking an existing agent (e.g. wp-code-review, review-security), trust the agent's model: frontmatter. Do not override unless the user asks.

Effort levels

For routine fixes/refactors, suggest /effort low once. For hard cross-file debugging or architecture, /effort high is fine.

Cache and session hygiene

When you observe these patterns, flag them once (not repeatedly):

Long session, slow responses → suggest /compact
User switches to a different file/feature → suggest /clear or a fresh session
User pastes a large log/diff into chat → auto-route to triage-haiku (returns top 20 error lines + 1-line root-cause). Also suggest saving to a file for future reference — chat content doesn't cache; file reads do.
Long tool output (>50 lines of build/test/lint) → do not absorb the raw blob into Opus context. Either pipe through triage-haiku or apply triage logic inline: keep top ERROR/FATAL/warning lines + 1-line summary. Reason on the compressed summary, not the full output.
Session resume / "catch me up" turn → if the obvious next action is to read 2+ docs (BUILD_LOG, DESIGN.md, journals, changelogs), brief recall-haiku instead: target = the docs, scope = "summarize last session + open TODOs + current state", return spec is the agent's default. Don't pour multi-hundred-line docs directly into the main thread.
User edits CLAUDE.md mid-session → note that it invalidates the prefix cache and suggest deferring the edit to between sessions
No .claudeignore in the project → suggest adding one (template at the bottom of this skill)
CLAUDE.md over ~200 lines → suggest extracting domain instructions to a skill (loads on demand) instead of CLAUDE.md (loads always)

Prompt-tier signals you should recognise

If the user's prompt contains any of these phrases, treat it as the matching tier and route to the matching agent (subject to the delegation threshold above):

"quick lookup", "just find", "list", "where is", "how many", "find every caller", "grep for", "catch me up", "recap", "what changed", "where did we leave off", "status of the session/branch" → Recall → recall-haiku
"fix", "add", "update", "refactor", "rename", "write tests", "convert", "migrate", "verify", "smoke test", "check that X works", "draft a BUILD_LOG entry", "update the changelog", "fill the template" → Transform → transform-sonnet
"design", "trade-offs", "approach", "should we", "walk me through", "why", "root cause" → Reason → reason-opus
Large paste with ERROR/WARN/Traceback markers, or JSON >2KB → Triage → triage-haiku
Long tool/build/test output captured into the main thread (>50 lines) → Triage → triage-haiku or inline triage

How to delegate without being annoying

Recall is silent. Spawn recall-haiku and emit one line: ⟶ recall-haiku. Do not explain why.
Transform announces and proceeds. One line: ⟶ transform-sonnet (<N> files). proceed?. Then carry on if the user does not stop you in the same turn.
Reason asks once and waits. One line: ⟶ reason-opus (architectural). proceed?. Stop until the user says yes.
Each suggestion appears at most once per session per topic. If the user rejects a suggestion for a pattern, do not re-suggest the same agent for similar prompts that session.
Honour explicit tier picks. If the user said "use Opus" this session, stop suggesting downgrades.
Don't mention the skill. Just behave the way the skill instructs. The user knows it's on.

Custom-agent design (when the user asks)

If the user wants to write a new agent, enforce these rules:

Single responsibility — one checklist or one task type per agent
Restrict tools to what the agent needs (Read/Grep/Glob for reviewers; add Edit only for builders)
Pin the model in frontmatter (model: sonnet or model: haiku) — never leave model unset for custom agents
Reference checklist files (~/.claude/reviews/*.md) instead of embedding the checklist — easier to maintain
Default to Sonnet. Only use Opus if the agent does multi-file synthesis or coordination

.claudeignore template

Suggest adding this to any project root that doesn't have one. Single highest-ROI change in this skill — affects every context load forever. The baseline below covers most stacks; tell the user to add framework-specific patterns for their project (e.g. Python __pycache__/, Rails tmp/, Laravel bootstrap/cache/, WordPress wp-content/uploads/, iOS Pods/):

# Dependencies
node_modules/
vendor/
.venv/
venv/
__pycache__/
.bundle/

# Build output
dist/
build/
out/
target/
.next/
.nuxt/
.svelte-kit/

# Caches
.cache/
.turbo/
.parcel-cache/
coverage/

# Lockfiles & logs
*.lock
*.log

# Minified assets
*.min.js
*.min.css

# OS / VCS
.DS_Store
.git/

Anti-patterns to call out

If the user is doing any of these, mention it once:

Running everything on Opus when the task fits Recall or Transform
One marathon session for the whole day instead of one session per task
Pasting logs/diffs into chat instead of saving to a file
Editing CLAUDE.md mid-session
No .claudeignore in the project
Custom agents without model: pinned
Using --resume on Claude Code older than v2.1.90 (full cache miss every resume)

Deactivation

If the user says "stop efficient", "normal mode", "ignore efficient", or "off":

"Efficient mode off. Reverting to default routing."

Then behave normally for the rest of the session.

efficient

Invocation

Context Preview

SKILL.md

efficient

Invocation

Context Preview

SKILL.md

Persistence

The 3-tier model — apply on every task

Routing rules — delegate to tier-pinned agents, do not nag the user about /model

Tiered autonomy — when to ask vs. when to act

Delegation threshold — when NOT to spawn

Self-contained brief rule

Honour explicit user overrides

Custom agents inherit their pinned tier

Effort levels

Cache and session hygiene

Prompt-tier signals you should recognise

How to delegate without being annoying

Custom-agent design (when the user asks)

.claudeignore template

Anti-patterns to call out

Deactivation

Similar Skills

Persistence

The 3-tier model — apply on every task

Routing rules — delegate to tier-pinned agents, do not nag the user about /model

Tiered autonomy — when to ask vs. when to act

Delegation threshold — when NOT to spawn

Self-contained brief rule

Honour explicit user overrides

Custom agents inherit their pinned tier

Effort levels

Cache and session hygiene

Prompt-tier signals you should recognise

How to delegate without being annoying

Custom-agent design (when the user asks)

.claudeignore template

Anti-patterns to call out

Deactivation

Similar Skills

Routing rules — delegate to tier-pinned agents, do not nag the user about `/model`

Routing rules — delegate to tier-pinned agents, do not nag the user about `/model`