Skill

code-discipline

Apply LLM coding discipline to every edit - minimum code, surgical changes, surface assumptions, goal-driven execution. Use this skill ALWAYS when writing or editing code, especially when modifying existing files. Triggers on any code-writing task, refactor, bug fix, or feature implementation. Inspired by Karpathy's observations on common LLM coding pitfalls. Prevents overcomplication, scope creep, silent assumptions, and weak success criteria.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/solo-sdlc:code-discipline

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Karpathy's observation: "Models make wrong assumptions on your behalf and just run along with them without checking. They don't manage their confusion, don't seek clarifications, don't surface inconsistencies, don't present tradeoffs, don't push back when they should. They really like to overcomplicate code and APIs, bloat abstractions, implement a bloated construction over 1000 lines when 100 ...

SKILL.md

204 lines · ~2k tokens

Stats

Language: Shell

Parent stars: 0

Maintenance: Good

Last Commit: May 5, 2026

Code Discipline Skill

Why This Exists

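This skill targets the failure modes Karpathy describes in LLM-written code: silent assumptions, overcomplication, scope creep, and weak success criteria.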

Four Rules (apply to every edit)

Rule 1: Minimum code

The goal is the smallest correct change.

No features that weren't asked for
No abstractions for code used in one place
No "flexibility" or "configurability" not requested
No defensive programming for cases that can't happen
No try/except wrapping that just rethrows

Tests:

Could this be half as long? If yes, halve it.
Could a senior engineer call this overcomplicated? If yes, simplify.
Am I adding something "just in case"? Remove it.

Examples:

❌ Bloated:

from typing import Optional

class UserService:
    def __init__(self, repo: UserRepository, cache: Optional[CacheBackend] = None,
                 metrics: Optional[MetricsBackend] = None,
                 feature_flags: Optional[FeatureFlagBackend] = None):
        self._repo = repo
        self._cache = cache or NullCache()
        self._metrics = metrics or NullMetrics()
        self._flags = feature_flags or NullFlags()

    async def get_user(self, user_id: str) -> User | None:
        # ... 30 lines of cache check, metrics, flags ...
        ...

✅ Minimum:

async def get_user(repo: UserRepository, user_id: str) -> User | None:
    return await repo.find_by_id(user_id)

If caching is needed later, add it then. Don't pre-build for a hypothetical future.
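The "try/except wrapping that just rethrows" item in the list above is easy to miss in review. A minimal sketch of the difference, using a hypothetical `parse_port` helper that is not part of the original skill:

```python
def parse_port_wrapped(raw: str) -> int:
    # Bloated: the handler adds no context, no recovery, no cleanup.
    try:
        return int(raw)
    except ValueError:
        raise


def parse_port(raw: str) -> int:
    # Minimum: same behaviour, the ValueError propagates on its own.
    return int(raw)
```

Both functions return the same values and raise the same exception; the second simply has nothing left to remove.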

Rule 2: Surgical changes

Touch only what you must.

Don't "improve" adjacent code
Don't reformat code your edit didn't change
Don't rename things that work
Don't update style of existing code to match your preferences
Match the file's existing style even if you'd do it differently
If you see something unrelated that's broken, MENTION it - don't fix it

Dead code rule:

Remove imports/variables/functions YOUR changes made unused
Do NOT remove pre-existing dead code unless explicitly asked

The trace test: every changed line should trace directly to what was requested. If a line changed and you can't say WHY (mapping to the user's request), revert it.
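One way to run the trace test mechanically is to list exactly which lines changed and justify each one. The sketch below uses the standard-library `difflib`; the `changed_lines` helper is an illustrative name, not something the skill itself defines.

```python
import difflib


def changed_lines(before: str, after: str) -> list[str]:
    """Return only the removed ("- ") and added ("+ ") lines between two versions."""
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    # ndiff also emits unchanged ("  ") and hint ("? ") lines; drop those.
    return [line for line in diff if line.startswith(("+ ", "- "))]


before = "import os\nimport sys\n\nprint(sys.argv)\n"
after = "import os\nimport sys\n\nprint(sys.argv[1:])\n"

for line in changed_lines(before, after):
    print(line)  # each printed line should map to the user's request
```

If the list contains a line you cannot tie to the request, that line is a candidate for reverting.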

Examples:

❌ Scope creep:

User: "Add VIP discount to CalculateShipping"
You: changes CalculateShipping AND renames customerType to customerCategory AND adds doc comments AND reformats with new linter rules

✅ Surgical:

You: change ONLY CalculateShipping. Even if customerType is a bad name, leave it. Even if there are no doc comments, don't add them. Mention "noticed customerType could be clearer; not changing per scope" in your output.
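As a rough illustration of the surgical version, assume the existing function looks like the one below. The signature, the flat rates, and the 10% discount are all invented for this sketch; only the two marked lines are new.

```python
# Hypothetical existing function. The camelCase names are kept on purpose:
# they are the file's existing style, and renaming them is out of scope.
def CalculateShipping(weightKg: float, customerType: str) -> float:
    cost = 5.0 + 1.5 * weightKg
    if customerType == "vip":  # added for the request
        cost *= 0.9            # added for the request (assumed 10% discount)
    return round(cost, 2)
```

The resulting diff is two lines, and both trace directly to "Add VIP discount".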

Rule 3: Surface assumptions

When unsure - say so.

The fastest way to waste hours: assume something silently and code on top of it.

When this rule applies:

Two valid interpretations of the request - state both, ask which
The request contradicts existing code - point out the contradiction
A specific edge case is unclear - ask
A tradeoff exists in implementation - present both options
The "obvious" approach has a non-obvious issue - flag it
A library/API change might affect things you can't see - ask before assuming

Surface assumption template:

"Before I implement: [request] - I'm reading this as X. Could also mean Y. Which?"

OR

"I'll need to [decision]. Two options:

Option A: [pros/cons]
Option B: [pros/cons]

Going with A unless you say otherwise."

Anti-pattern (the failure mode):

User: "Add user search"
You: silently decide to do prefix search, in-memory, case-sensitive, with pagination of 20, no permissions filter
30 minutes later: user has wrong implementation, has to ask for changes

Correct:

User: "Add user search"
You: "Quick check before implementing:

Search by what fields? (email + name? email only?)
Substring or prefix?
Case sensitive?
Apply visibility/permission filter?
Pagination size?

Defaults if you say 'just go': email+name, substring, case-insensitive, visibility-filtered, 20 per page."
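Once the user answers (or says "just go"), the agreed choices can stay visible in the code instead of being buried in it. A minimal in-memory sketch, where the `User` type, the `search_users` name, and every parameter name are assumptions made for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    email: str
    visible: bool = True


def search_users(
    users: list[User],
    query: str,
    *,
    fields: tuple[str, ...] = ("email", "name"),  # default: email + name
    prefix_only: bool = False,                    # default: substring match
    case_sensitive: bool = False,                 # default: case-insensitive
    only_visible: bool = True,                    # default: visibility-filtered
    page: int = 1,
    page_size: int = 20,                          # default: 20 per page
) -> list[User]:
    needle = query if case_sensitive else query.lower()

    def matches(user: User) -> bool:
        for field in fields:
            value = getattr(user, field)
            if not case_sensitive:
                value = value.lower()
            hit = value.startswith(needle) if prefix_only else needle in value
            if hit:
                return True
        return False

    hits = [u for u in users if (u.visible or not only_visible) and matches(u)]
    start = (page - 1) * page_size
    return hits[start : start + page_size]
```

Each keyword default corresponds to one of the questions asked, so a later change of mind is a one-argument change rather than a rewrite.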

Rule 4: Goal-driven execution

Define success BEFORE starting. Loop until verified.

Karpathy: "LLMs are exceptionally good at looping until they meet specific goals. Don't tell it what to do, give it success criteria and watch it go."

The transformation:

❌ Imperative: "Implement caching with Redis"
✅ Declarative: "Make getUserProfile p95 < 50ms when called repeatedly with same ID. Verify by running bench-getUserProfile.sh."
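A declarative goal like the one above only works if the check is executable. The skill does not show `bench-getUserProfile.sh`, so the following is a hypothetical Python stand-in that measures p95 latency of any zero-argument callable; `p95_ms` and `meets_target` are illustrative names.

```python
import statistics
import time
from typing import Callable


def p95_ms(fn: Callable[[], object], runs: int = 200) -> float:
    """Call fn repeatedly and return the 95th-percentile latency in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    return statistics.quantiles(samples, n=20)[-1]


def meets_target(fn: Callable[[], object], target_ms: float = 50.0) -> bool:
    """True when the measured p95 is under the stated success criterion."""
    return p95_ms(fn) < target_ms
```

Something like `meets_target(lambda: get_user_profile("42"))` would then be the pass/fail signal, with `get_user_profile` standing in for whatever function the goal names.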

Before starting any implementation:

State the success criteria. Concrete, observable, testable.
State the verification command/test.
After implementation: run the verification.
If verification fails: loop. Don't declare done.
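The four steps above amount to a small loop. A sketch under stated assumptions: the verification is a shell command whose exit code 0 means success, and `attempt` is a hypothetical callback that makes or revises the change. The attempt cap is also an addition here, so the loop reports failure rather than running forever.

```python
import subprocess
from typing import Callable


def verified(command: list[str]) -> bool:
    """Run the verification command; exit code 0 counts as success."""
    return subprocess.run(command, capture_output=True).returncode == 0


def loop_until_verified(
    attempt: Callable[[int], None], command: list[str], max_attempts: int = 3
) -> bool:
    for n in range(1, max_attempts + 1):
        attempt(n)             # make (or revise) the change
        if verified(command):  # run the stated check, never skip it
            return True
    return False               # report failure instead of declaring done
```

A `False` result is the cue to go back to the user with what was tried, not to announce completion.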

If the user gave a weak goal, push back: "Before I start - how will we know this is done? What's the test?"

Examples of weak vs strong:

❌ "Make it faster"
✅ "Cut p95 latency on /api/users from 800ms to <200ms, verified by running our load test for 5 minutes"
❌ "Make the UI nicer"
✅ "Match the mockup at docs/design/dashboard-v2.png, verify by side-by-side screenshot"
❌ "Refactor this"
✅ "Reduce CalculateShipping from 200 lines to <80 while keeping all current tests green; specifically split into 3 functions: validate, calculate, format"

Workflow

For every code-writing task:

Read the request carefully. Look for ambiguities, contradictions, edge cases.
Surface assumptions - state your interpretation, ask if uncertain.
Define success criteria - concrete, testable.
Plan minimum change - what's the smallest diff that achieves the goal?
Make surgical changes - only what's needed, in the existing style.
Verify against criteria - run the test, check the observable.
Loop if failed. Don't declare done.
Report what changed - tell user what you touched, what you noticed but left alone.

Output template (per task)

## Understanding
- Request: <one sentence>
- Interpretation: <yours>
- Assumptions: <list, especially uncertain ones>

## Success criteria
- <concrete, testable>

## Plan (minimum diff)
- File: <path> - <what changes>
- File: <path> - <what changes>

## Out of scope (noticed but not touching)
- <unrelated issue you saw>

## Verification
- Command: <how to verify>
- Result: <after running>

Anti-patterns to flag in your own output

Before submitting any change, ask yourself:

Did I add anything that wasn't asked for?
Did I touch any line that doesn't trace to the request?
Did I make any silent decisions that could have been clarifying questions?
Could the result be verified objectively? Did I verify it?
Could this be 50% shorter and still correct?

A "yes" to any question except the fourth, or a "no" to the fourth (verification), means revise before output.
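The checklist can be written down as a single predicate. This is a sketch with a hypothetical `SelfReview` dataclass; it treats the length question like the first three (a "yes" means revise), on the reading that shorter-and-still-correct code is what Rule 1 asks for.

```python
from dataclasses import dataclass


@dataclass
class SelfReview:
    added_unrequested: bool      # Did I add anything that wasn't asked for?
    untraceable_lines: bool      # Did I touch a line that doesn't trace to the request?
    silent_decisions: bool       # Did I decide silently instead of asking?
    verified: bool               # Is the result objectively verifiable, and did I verify it?
    could_be_half_as_long: bool  # Could this be 50% shorter and still correct?

    def needs_revision(self) -> bool:
        return (
            self.added_unrequested
            or self.untraceable_lines
            or self.silent_decisions
            or not self.verified
            or self.could_be_half_as_long
        )
```

Only one combination passes: nothing extra, nothing untraceable, nothing silent, verified, and not reducible by half.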

Reference reading

Andrej Karpathy's original Twitter observations on LLM coding pitfalls (Jan 2026)
forrestchang/andrej-karpathy-skills (GitHub) - source of these rules
Boris Cherny's workflow notes - howborisusesclaudecode.ru
