Skill

create-lean-tool

Author tools for the personal lean-tools CLI toolkit (stored in ~/.claude/tools/ globally or in a project's .claude/tools/ for repo-specific use). Use when the user asks to create a tool or script, when you notice yourself running the same multi-step command pattern 3+ times within or across sessions, when optimizing recurring bash invocations, when the user mentions lean-tools, the CLI toolkit, the manifest, or wants to reduce Claude Code token consumption from repeated commands, or when deciding whether to add a flag to an existing tool versus creating a new one. Covers the --help contract, parameter inference, structured exit codes, output thresholding, and the consolidation rule that governs when to create a new tool versus extending an existing one.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/lean-tools:create-lean-tool



SKILL.md

324 lines · ~4.6k tokens

Stats

Language: Shell

Parent stars: 0

Maintenance: Excellent

Last Commit: May 8, 2026


create-lean-tool

This skill tells you how to build a tool for the user's personal CLI toolkit. The toolkit replaces recurring token-wasteful bash patterns (raw gh pr view, unbounded git diff, full rushx build:typecheck output) with single-purpose scripts that return optimized output.

Trigger conditions

Use this skill when:

Explicit request — the user says "create a tool for X", "let's make a script", "add this to the toolkit".
Proactive pattern detection — you've run the same multi-step bash pattern three or more times in the current session, OR you recall running it in prior sessions (check memory/past chats). Suggest creating a tool before the user has to ask.
Recurring output waste — a command keeps dumping large output that Claude has to filter with head/tail/grep/jq. The filtering pattern is a tool in disguise.

When proactively suggesting a tool, do not build it unprompted. Surface the observation: "I've run gh pr checks <pr> with --json state --jq four times this session. This looks like a candidate for a github-pr-checks tool. Want me to create one?" Wait for confirmation before writing.

Principles

Five ideas underlie every script in the toolkit. Internalise these before writing — they decide most micro-questions on their own.

Single responsibility per script. One question, one tool. If a new tool would answer the same question from a slightly different angle, it's a flag on an existing tool, not a new tool. Different output shapes split into different tools; different modes of the same shape are flags.
Name expresses intention, not implementation. github-pr-status tells the agent when to use it; gh-pr-view-wrapper doesn't. Name what the tool answers, not how it computes the answer.
Smart defaults, explicit args when needed. No-arg invocation should resolve from context (current branch, cwd). Arguments override the default; they don't replace the existence of one. A tool that requires args for the most common case is too sharp.
Fail loudly, never guess silently. When inferring a parameter (PR number from branch, toolchain from cwd, run-id from latest), log the inference to stderr (# inferred: …). Silent resolution is a bug — the agent can't tell what happened when the resolution is wrong.
Exit codes are structured. 0 success, 1 not-found (the question has a definite "no" answer), 2 bad-input (the user / agent passed something invalid), 3 env-issue (missing binary, not authenticated, wrong cwd). Don't collapse failure modes into exit 1 — Claude Code's Bash tool surfaces the differentiation and the agent reacts to it.

The patterns that follow (consolidation rule, output thresholding, parameter inference) are direct expressions of these principles.

Decision flow

Work through these in order before writing any code:

1. Does an existing tool already answer this question?

Run lean-tools (or list ~/.claude/tools/ and the project's .claude/tools/ if present) to check. If an existing tool covers the need, use it. If it almost covers the need, go to step 2.

2. Would a flag on an existing tool work?

Prefer a flag to a new tool. A new tool is justified only when the output shape for this use case differs meaningfully from the existing tool's shape. Example: github-pr-status (overview) and github-pr-checks (polling-optimized) have different shapes, so they're separate tools. github-pr-review --with-comments would not justify splitting into github-pr-comments.

3. Which scope? (plugin / user / project)

Default to user (~/.claude/tools/) for anything personal or experimental.

Choose plugin (a PR to the toolkit repo's bin/) only when the tool is generic enough that every teammate benefits from it by default.

Choose project (<project>/.claude/tools/) only when:

The tool needs project-specific knowledge (monorepo layout, non-standard task runners)
The tool's behavior would be wrong or confusing in other projects
An existing plugin or user tool needs to be shadowed with a specialised version inside this repo

4. What is the minimum output needed?

Draft the output shape before writing the script. The shape drives the implementation, not vice-versa. Aim for the smallest output that answers the question. If the output is sometimes small and sometimes huge, apply the thresholding pattern (see below).

5. What parameter inference is appropriate?

Resolve from the most common invocation. A github-pr-* tool should default to the current branch's PR. A toolchain-aware tool should detect the toolchain from cwd. If inference happens, log it to stderr.
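As an illustration of cwd-based inference, here is a minimal sketch of a toolchain detector. The function name `detect_toolchain` and the marker-file mapping are assumptions made for this example, not part of the toolkit. It walks up from a start directory, returns 1 when nothing is found, 3 when the start directory is unusable, and logs the inference to stderr.

```shell
# Hypothetical helper: report the toolchain implied by the nearest marker file.
# The marker-to-toolchain mapping below is illustrative, not the toolkit's own.
detect_toolchain() {
  local dir found=""
  # Resolve to an absolute path so the upward walk always terminates at "/".
  dir="$(cd "${1:-$PWD}" 2>/dev/null && pwd)" || return 3
  while :; do
    if   [[ -f "$dir/pnpm-lock.yaml" ]];    then found=pnpm
    elif [[ -f "$dir/yarn.lock" ]];         then found=yarn
    elif [[ -f "$dir/package-lock.json" ]]; then found=npm
    elif [[ -f "$dir/Cargo.toml" ]];        then found=cargo
    elif [[ -f "$dir/go.mod" ]];            then found=go
    fi
    [[ -n "$found" ]] && break
    [[ "$dir" == "/" ]] && return 1   # reached the root: a definite "no"
    dir="$(dirname "$dir")"
  done
  # The inference goes to stderr so stdout stays clean for the caller.
  echo "# inferred: toolchain=${found} from ${dir}" >&2
  echo "$found"
}
```

A tool would call it as `toolchain=$(detect_toolchain) || die "no toolchain marker found" 1`, keeping the stderr line visible to whoever ran the tool.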

Interactive checkpoints

When you're going to write a tool, pause for explicit user confirmation at three points. Use AskUserQuestion rather than deciding silently: asking is cheap, while getting these wrong creates real friction (a wrong name doesn't trigger; a wrong scope has to be re-created or migrated; a single tool that should have been two creates flag-bifurcated output).

Complexity — one tool or split?

If the request involves more than one distinct question (different output shapes, different inputs, different invocation contexts), the consolidation rule probably says to split. Surface the choice via AskUserQuestion:

"This involves <X> and <Y>, which return different output shapes. Two tools fit the toolkit's name-based trigger matching better than one tool with a --mode flag. Which do you want?"

Two tools (recommended for distinct shapes)

One tool with a flag

Apply the consolidation rule (later in this doc) to inform your recommendation, then defer to the user. If a single tool genuinely covers everything (e.g., adding a flag to widen scope of an existing tool), there's no question to ask — just go.

Naming — self-explanatory?

Because the SessionStart manifest shows only name <args> (descriptions stripped), the name is the agent's only signal at trigger time. Names must answer "should I use this tool here?" without --help.

Workflow:

Draft 2–3 candidate names matching existing patterns (github-* for GitHub API, git-* for git operations, <domain>-<noun> like json-shape, <domain>-<verb> like port-busy).
For each, ask: does the name alone tell the agent when to reach for it?
If one is clearly best, use it. If candidates are close — or none feel obviously self-explanatory — confirm via AskUserQuestion:

"Pick a name for the new tool. The chosen name must tell the agent when to use it without reading --help."

<candidate-1>: <rationale>

<candidate-2>: <rationale>

<candidate-3>: <rationale>

Reference patterns from the existing toolkit:

json-shape (returns the shape) beats json-summary (vague).
git-blame-line (one line of blame) beats git-blame-quick (vague — quick what?).
github-pr-status (status, distinct from "checks") beats github-pr-info (everything-and-anything).
port-busy (the question being answered: is the port busy?) beats port-check (vague).

The pattern: name the output, the question answered, or the artifact returned — never the implementation (don't name a tool gh-pr-view-wrapper).

Scope — project or global?

At creation time, ask explicitly:

"Where should this tool live?"

~/.claude/tools/ — global (available in every repo)

<project>/.claude/tools/ — project-only (available only inside this repo)

Recommend based on what the tool depends on:

Global when the tool encodes generic patterns: PR triage, JSON inspection, port-listener queries, unfamiliar-codebase orientation, anything that would be equally useful in any repo.
Project when the tool depends on this-repo-specific knowledge: monorepo layout, custom task runners (rushx, nx, turbo with custom config), internal naming conventions, company-specific workflow steps.

Don't skip this checkpoint even when the answer feels obvious. Project tools that should be global have to be re-created in every repo; global tools that should be project-scoped pollute the user's personal collection and mislead the agent in unrelated repos. Make the choice deliberate.

The contract every tool must satisfy

These are non-negotiable. The lean-tools --verify command checks them.

Shebang and strict mode. Bash unless you have a specific reason otherwise.

#!/usr/bin/env bash
set -euo pipefail

First --help line is the manifest entry. Format: name <args> — description. The em-dash (—, U+2014) separator is required by lean-tools --verify.

# github-pr-status [pr|url|branch] — PR overview: title, state, review, mergeable, checks

What appears where:

SessionStart manifest — lean-tools strips the description for token efficiency. The agent sees only name <args> at trigger time. The description is invisible until the agent commits to running the tool with <tool> --help.
<tool> --help output — the full first line, including description. This is where the description lands for the agent.
Terminal users — see the full first line whenever they run <tool> --help.

The split has a consequence: names must be self-explanatory. The agent decides whether to use a tool based on name <args> alone. The description is a tiebreaker, not a primary signal. See "Naming — self-explanatory?" above.

The comment on line 2 of the script (without its leading "# ") should match the first line of --help output.

--help short-circuits early. Before any work:

[[ "${1:-}" == "--help" ]] && { usage; exit 0; }

The usage function must print the manifest line as its first output line, followed by full documentation.
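The two manifest-line requirements can be checked mechanically. The sketch below is a hypothetical helper named `check_manifest_line`, not the actual `lean-tools --verify` implementation, whose internals are not described here. It assumes the tool is a Bash script and runs it with `bash` rather than relying on the executable bit.

```shell
# Hypothetical check: line 2 of the script (minus "# ") must equal the first
# line of --help, and that line must use the " — " (U+2014) separator.
check_manifest_line() {
  local script="$1" header help_line
  # Print line 2 only if it is a "# ..." comment, with the prefix stripped.
  header="$(sed -n '2s/^# //p' "$script")"
  # sed reads all input, so the tool never sees a closed pipe.
  help_line="$(bash "$script" --help | sed -n '1p')"
  if [[ "$header" != "$help_line" ]]; then
    echo "mismatch: header='${header}' help='${help_line}'" >&2
    return 1
  fi
  if [[ "$help_line" != *" — "* ]]; then
    echo "missing em-dash separator in: ${help_line}" >&2
    return 1
  fi
}
```

Running `check_manifest_line` against a tool file before `lean-tools --verify` gives a quick local signal; the real verifier remains the authority on what passes.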

Structured exit codes.

0 — success
1 — expected not-found condition (PR doesn't exist, no matching branch)
2 — ambiguous or invalid input from the user
3 — environment issue (missing binary, not authenticated, wrong cwd)

Never use exit 1 for every error. Claude Code surfaces exit codes and can react to them; the differentiation matters.

Environment checks before work. If the tool depends on gh, jq, docker, etc., verify presence and auth before doing anything:

command -v gh >/dev/null || die "gh not found in PATH" 3
gh auth status >/dev/null 2>&1 || die "gh not authenticated" 3
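When a tool needs several binaries, the presence check can be folded into one call. This is a sketch under assumptions: `require_bins` is a hypothetical helper name, and `die` is defined locally so the block stands alone (it prints only its first argument and exits with the second).

```shell
# Local copy of die: message in $1, exit code in $2 (default 1).
die() { echo "$(basename "$0"): $1" >&2; exit "${2:-1}"; }

# Hypothetical helper: exit 3 on the first required binary missing from PATH.
require_bins() {
  local bin
  for bin in "$@"; do
    command -v "$bin" >/dev/null 2>&1 || die "${bin} not found in PATH" 3
  done
}
```

A github-* tool would then start with `require_bins gh jq`, followed by the `gh auth status` check shown above, since presence and authentication are separate failure causes.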

Parameter inference pattern

When a tool accepts multiple input forms (bare number, URL, branch name, or current-branch default), resolve in this order and log the resolution to stderr:

arg="${1:-}"

if [[ -z "$arg" ]]; then
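  # Default from context (current branch, cwd, etc.)
  target=$(derive_from_context) || die "no default target available" 1
  # The default is an inference too, so it is logged like the others
  echo "# inferred: target=${target} from context" >&2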
elif [[ "$arg" =~ ^[0-9]+$ ]]; then
  # Bare number
  target="$arg"
elif [[ "$arg" =~ ^https?:// ]]; then
  # URL — parse with regex, not cut/awk
  [[ "$arg" =~ ^https?://github\.com/([^/]+)/([^/]+)/pull/([0-9]+) ]] \
    || die "URL format not recognized: $arg" 2
  target="${BASH_REMATCH[3]}"
  echo "# inferred: target=${target} from URL" >&2
else
  # Last resort: treat as a name to look up
  target=$(lookup_by_name "$arg") || die "no target found for: $arg" 1
  echo "# inferred: target=${target} from name ${arg}" >&2
fi

The inference log is not optional. When Claude sees the stderr line, it learns the tool's behavior. When a terminal user runs it, they verify the tool didn't misresolve. Silent resolution is a bug.

Output thresholding pattern

For tools whose output can be small or large (diffs, logs, file lists), apply automatic thresholding. Small → show inline. Large → show summary + drill-down command.

THRESHOLD="${CC_DIFF_THRESHOLD:-200}"

output=$(generate_full_output)
# printf avoids echo's option parsing when the text is something like "-n"
line_count=$(printf '%s\n' "$output" | wc -l)

if [[ "$line_count" -le "$THRESHOLD" ]]; then
  printf '%s\n' "$output"
else
  # Summary: stat + file list + drill-down hint
  generate_summary
  echo
  echo "→ scope with: $(basename "$0") ${target} <path>"
fi

The → scope with: line is the critical element. It tells Claude exactly how to narrow down without guessing flag syntax. Format the drill-down invocation to be copy-pasteable as-is.
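A self-contained version of the pattern can be written as a filter over stdin. This is a minimal sketch: `emit_thresholded` is a hypothetical name, and the five-line preview stands in for whatever summary a real tool would generate.

```shell
# Hypothetical filter: $1 is the line threshold, $2 the drill-down command.
# Small input is passed through; large input becomes a summary plus hint.
emit_thresholded() {
  local threshold="$1" hint="$2" output line_count
  output="$(cat)"
  if [[ -z "$output" ]]; then
    line_count=0
  else
    # tr strips the padding some wc implementations add.
    line_count="$(printf '%s\n' "$output" | wc -l | tr -d ' ')"
  fi
  if (( line_count <= threshold )); then
    if [[ -n "$output" ]]; then printf '%s\n' "$output"; fi
  else
    echo "${line_count} lines (threshold ${threshold}); first 5:"
    printf '%s\n' "$output" | sed -n '1,5p'
    echo
    echo "→ scope with: ${hint}"
  fi
}
```

For example, `git diff main | emit_thresholded 200 "$(basename "$0") main <path>"` would print a short diff unchanged and reduce a long one to a preview with the scoping hint as the last line.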

Starter template

Save to ~/.claude/tools/<name> and chmod +x. Replace the <tool-name> placeholders.

#!/usr/bin/env bash
# <tool-name> <args> — <one-line description>
#
# Exit codes: 0 success  1 not-found  2 bad-input  3 env-issue

set -euo pipefail

usage() {
  cat <<'EOF'
<tool-name> <args> — <one-line description>

<Longer description, usage patterns, what it returns.>

Accepts:
  <arg-form-1>   <what it means>
  (nothing)      <default behavior>

Exit codes:
  0 success  1 not-found  2 bad-input  3 env-issue
EOF
}

# die <message> [exit-code]: print only the message, not the trailing code
die() { echo "$(basename "$0"): $1" >&2; exit "${2:-1}"; }

[[ "${1:-}" == "--help" ]] && { usage; exit 0; }

# Environment checks
command -v <required-binary> >/dev/null || die "<binary> not found in PATH" 3

# Argument parsing / inference
arg="${1:-}"
# ...inference logic...

# Main work — prefer a single structured call over a pipeline
<do the thing>

When to use thresholding vs bounded output

Threshold when the output varies wildly: diffs, logs, search results.
Always bound when the output is consistently truncatable: tsc errors (first 20 is fine), test failures (first failure + count), log lines.
Don't bound when the output is consistently small: PR status (6 lines), check counts (1-5 lines).

If you're unsure, threshold. It degrades gracefully in both directions.
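Bounding, as opposed to thresholding, can be sketched as a filter that keeps the first N lines and reports how many were dropped. `bound_lines` is a hypothetical helper name, and the wording of the trailer line is an assumption.

```shell
# Hypothetical filter: pass through at most $1 lines of stdin and, if anything
# was cut, append a count of the omitted lines.
bound_lines() {
  local max="$1"
  # awk consumes all input, so the producer never hits a closed pipe.
  awk -v max="$max" '
    NR <= max { print }
    END { if (NR > max) printf "... %d more lines omitted\n", NR - max }
  '
}
```

A typecheck tool could pipe its compiler output through `bound_lines 20`, which matches the "first 20 is fine" guidance above while still telling the reader that more exists.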

Consolidation examples to reference

Merge, don't split: github-pr-comments should not exist because github-pr-review already surfaces unresolved threads as part of its output. Adding a dedicated comments tool would duplicate responsibility.

Split, don't merge: github-pr-status and github-pr-checks are separate because their output shapes differ. Status is a 6-line overview; checks is a single line designed for polling. Forcing one tool to do both would create a --mode=status|checks flag that bifurcates the output logic — a smell.

Testing the tool

After writing:

chmod +x ~/.claude/tools/<name>
lean-tools --verify                              # checks the --help contract
~/.claude/tools/<name> --help                    # inspect the manifest line
~/.claude/tools/<name>                           # test the no-arg default path
~/.claude/tools/<name> <invalid-input>; echo $?  # verify exit code 2

Then invoke the tool the way Claude would. If Claude still reaches for the raw command after the tool exists, the manifest entry is not triggering. Because the SessionStart manifest shows only name <args>, sharpen the name and argument signature first, then the description on the first --help line.
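The exit-code checks can also be scripted so they are repeatable. The helper below, `expect_exit`, is a hypothetical name used for this sketch; it runs any command, discards its output, and compares the exit code with the expected one.

```shell
# Hypothetical test helper: expect_exit <code> <command> [args...]
expect_exit() {
  local want="$1" got=0
  shift
  # "|| got=$?" records the failure without tripping set -e in the caller.
  "$@" >/dev/null 2>&1 || got=$?
  if [[ "$got" -eq "$want" ]]; then
    echo "ok   exit ${got}: $*"
  else
    echo "FAIL want ${want}, got ${got}: $*"
    return 1
  fi
}
```

Typical use is one line per contract case: expect 0 for the tool's `--help`, 2 for a deliberately invalid argument, and 3 when a required binary is hidden from PATH.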

Common mistakes

Using two hyphens (--) instead of an em-dash (—) in the manifest line. lean-tools --verify warns on this.
Writing the manifest description in implementation terms. "Wraps gh pr view with jq projection" is about how the tool works. "PR overview: title, state, review, mergeable, checks" is about what Claude should use it for. Only the second triggers correctly.
Collapsing multiple concerns into flags. github-pr --status, github-pr --checks, github-pr --diff is worse than three separate tools. Claude's trigger matching works better on distinct tool names than on flag combinations.
Silent parameter inference. If the tool resolves "a branch name" to "PR #18162", log it to stderr. Otherwise Claude (and you) can't tell what happened when the resolution is wrong.
No environment check. A github-* tool that assumes gh is authenticated will fail with a confusing error. Check and exit 3 with a clear message.
Forgetting exit code differentiation. exit 1 for everything collapses the failure modes Claude could otherwise react to.
Building speculatively. The consolidation rule exists because it's tempting to pre-build github-pr-merge, github-pr-close, github-pr-comment. Don't. Build only after the histogram shows actual recurring use.

Where to place the new tool

Three scopes exist, with shadow precedence project > user > plugin (nearest wins on basename collision):

Plugin bin/ — the shipped defaults for the whole team. Only touch this via a PR to the toolkit repo. Don't drop personal scripts here on your own machine; they'll be overwritten on plugin updates.
~/.claude/tools/ — your personal tools and local patches. Not on PATH; invoked by path. Start here for anything you're iterating on.
<project>/.claude/tools/ — project-specific overrides. Only when the logic would be wrong or confusing outside this repo (e.g., a monorepo that routes through a non-standard task runner).

Default path: start in ~/.claude/tools/. Promote to the plugin repo via PR once it's proven across projects. Use a project shadow only when the tool genuinely can't generalise.
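The shadow precedence can be pictured as a first-match lookup. This is only a sketch of the stated order (project > user > plugin), not lean-tools' real resolution code; `resolve_tool` is a hypothetical name, and the plugin bin directory is passed as a parameter because its actual location is not given here.

```shell
# Hypothetical resolver: resolve_tool <name> <project-root> <plugin-bin-dir>
# Prints the path of the first executable match; returns 1 if none exists.
resolve_tool() {
  local name="$1" project_root="$2" plugin_bin="$3" dir
  for dir in "$project_root/.claude/tools" "$HOME/.claude/tools" "$plugin_bin"; do
    if [[ -x "$dir/$name" ]]; then
      echo "$dir/$name"
      return 0
    fi
  done
  return 1
}
```

With the same basename present in all three places, the project copy is returned; removing it exposes the user copy, then the plugin default.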

After placing the file and running chmod +x, the tool is immediately available: lean-tools re-scans when a tool file's mtime is newer than the cache, so no manual regeneration is needed. Claude will see it on the next session start (or immediately after /compact).

Final check

Before finishing, confirm:

Manifest line on line 2 of the script matches the first line of --help
lean-tools --verify passes
Exit codes differentiate success / not-found / bad-input / env-issue
Parameter inference (if any) logs to stderr
Environment prerequisites are checked before work
Output is either small-by-design, bounded, or thresholded
No duplication of an existing tool's responsibility
