From contree
Set expected behaviour by writing or modifying test trees — the contract that says what the system should do before any code exists. TRIGGER when: the user describes a feature, capability, or behaviour change they want — even loosely (e.g. 'I want X', 'let's add Y', 'can we make it do Z', 'change how X works', 'I need to modify', 'remove this behaviour'). Trigger before any code is discussed or written.
How this skill is triggered — by the user, by Claude, or both
Slash command
/contree:changeThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Sets expected behaviour before code exists. Talks through a behaviour change with the user and writes it into `TEST_TREES.md` at the project root — making intent explicit and agreed before implementation begins.
Sets expected behaviour before code exists. Talks through a behaviour change with the user and writes it into TEST_TREES.md at the project root — making intent explicit and agreed before implementation begins.
What behaviour is being added, modified, or removed? Talk it through with the user. Clarify scope and boundaries before touching any trees.
Read the actual tests and source of the area being changed before drafting the tree edit. The existing tree is a claim about what the code does; the code is what it actually does. If they disagree, a tree edit grounded only in the tree will compound the drift. Reconcile pre-existing tree-code drift in the working area as part of the change — the new tree must be coherent with post-change reality, not with a stale snapshot. If reconciliation is non-trivial, surface it to the user and decide together whether to fold it in or split it out before proceeding.
Who or what consumes this behaviour? A user? An API client? Another module? The consumer's perspective is where you start — not the internals.
Use the consumer's vocabulary. If the consumer says "register", the tree says "registers" — not "calls POST /api/users".
Adding a new capability:
Start with the System tree — and only the System tree. The System tree names the user-visible behaviour and is the contract a real, max-realism functional test will hold the implementation to. Real driven adapters, real infrastructure, real boundaries. If breadth at that realism is unaffordable, the System tree shrinks to a single expansive journey at max realism — it never becomes a broad in-memory-wired suite.
Inner trees (Use-case, Domain, Port, Adapter) get added as tdd discovers them by pulling the failing functional test through the layers — they are derived from what the System test demands, not designed ahead of time. The exception is the pure-library case below (no slice), where Domain/Port trees may stand alone. Naming an inner tree before its need has surfaced is YAGNI failure: it commits to a decomposition the implementation hasn't asked for.
Name the tree — always <Layer>: <Subject> where <Layer> is one of Domain, Use-case, Port, Adapter, System, and <Subject> is a short noun phrase naming the unit under test. Without the layer prefix, readers and sync cannot tell what level a tree exercises and cannot detect duplication across layers. Without a clear subject, the tree's tests are unreadable.
One tree reifies exactly one test file. If a capability exposes multiple behavioural units (e.g. a module with generate AND isValid, each testable independently), write one tree per unit, not one tree grouping them under a shared header. Grouping destroys the one-tree-one-file invariant and forces the TDD skill to fabricate ambiguous test structure.
Write paths using EARS patterns (see EARS Patterns below) to describe the capability's operating principles:
<Layer>: <Subject>
then <ubiquitous outcome>
while <precondition>
then <outcome>
when <trigger>
then <outcome>
and <outcome>
if <error condition>
then <recovery outcome>
Add as a new ### subsection in TEST_TREES.md.
Modifying existing behaviour:
Find the existing tree. Add, change, or remove when/then paths to reflect the new behaviour. Don't rewrite paths that aren't changing.
Removing a capability:
Remove the tree from TEST_TREES.md. Confirm with the user first.
Name the tree's coverage per category.
Every tree must name its coverage in parentheses at the end of the tree-name line as semicolon-separated labelled pairs. The four categories are:
src — implementation file (the source the tree's behaviour lives in)unit — fast, isolated test (Domain / Use-case / in-memory Adapter in hex terms; or structural checks where the tree's "implementation" is prose)integration — test that exercises a real seam (driven Adapter against real infra, port contract against a real implementation)functional — whole-system end-to-end test (System layer, or an in-docker full-app test)Domain: Money (src: src/features/money/domain/money.ts; unit: src/features/money/domain/money.domain.test.ts; integration: none; functional: none)
add
when called with another Money of the same currency
then the sum's amount is the sum of the two amounts
A System tree with no 1:1 source file omits src:
System: save-score (unit: src/features/score/application/save-score.use-case.test.ts; functional: test/system/save-score.system.test.ts)
when a user submits a valid score
then the score is persisted
Gaps are declared explicitly. A category that is expected but not yet covered uses the value none so the gap is visible to readers and to sync. A category that is genuinely not applicable to the tree is omitted entirely — no placeholder needed.
Paths may also appear on a subtree or then line when the behaviour at that node is implemented by a distinct file — rare in practice, but the rule applies wherever a 1:1 file mapping holds.
If naming a (sub)tree's paths reveals an awkward shape — the tree doesn't sit naturally with where the code lives, or one tree spans multiple source files — treat that as design feedback: reshape the tree or the implementation until the paths fit. Do not strip the paths to hide the mismatch.
A System tree describes a slice's consumer-visible behaviour. Below it sits a set of smaller trees — one per behavioural unit the slice produces. Each smaller tree reifies one test file at one test layer.
The four test layers map to the hex seams:
[ Driving Adapter ] ←→ [ Port ] ←→ [ Use-case ] ←→ [ Port ] ←→ [ Driven Adapter ]
(HTTP / CLI) (iface) (orchestration) (iface) (Postgres / Stripe / FS)
↕
[ Domain ]
(entities, value objects,
domain services, rules)
| Layer | Seam under test | Collaborators | Speed | File convention |
|---|---|---|---|---|
| Domain | the pure core | none — no fakes, no mocks, no async | instant | *.domain.test.* |
| Use-case | orchestration + port boundaries | in-memory adapters (real implementations of the ports) | fast | *.use-case.test.* |
| Adapter | one adapter against its contract | driving: mocked use-case. driven: real infrastructure. | mixed | *.adapter.test.* |
| System | the whole wired app | real driven adapters at the highest tolerable realism — max-validity functional testing. If breadth at that realism is unaffordable, fall back to a single expansive journey at max realism, never to broad in-memory wiring. | slow | *.system.test.* in test/system/ |
Layers are named for the hex seam under test, not for infrastructure presence. Classic "unit/integration/functional" conflates pure Domain with mocked Use-case and overloads "integration" — seams give sharper targets.
Hex positions — the locations in the codebase where code sits:
application/ports/.Decomposition rules:
save-score, cancel-order). It drives the System test.ShortCode — when generate() produces a code, then isValid() accepts it). If no cross-function invariant exists, omit System altogether and document the omission — but never leave a System test file without a corresponding tree.Money, SessionToken). Trivial value objects don't earn a tree — they're implicit in the use-case.save-score-use-case). A use-case that just delegates to a single port doesn't earn a tree.score-http-handler). Thin adapters don't earn a tree.ScoreRepository, AuditLog). The tree is reified by a shared contract suite (see skills/tdd/SKILL.md) that both the in-memory and real adapters must pass. Name ports for capability, not technology (OrderRepository, not PostgresClient).ScoreRepository-Postgres for timeout/retry/schema; the port contract covers the rest).describe/it hierarchy mirrors the tree verbatim.Tree naming heuristic — every tree name is <Layer>: <Subject>. Pick the subject with observable behaviour at that layer:
| Layer | Subject | Example |
|---|---|---|
| Domain | domain object or service | Domain: Money, Domain: SessionToken |
| Use-case | the use-case | Use-case: save-score-use-case |
| Port | port interface (shared contract) | Port: ScoreRepository, Port: AuditLog |
| Adapter | adapter being exercised | Adapter: score-http-handler, Adapter: ScoreRepository-Postgres |
| System | capability/slice or cross-cutting policy | System: save-score, System: auth-enforcement |
Tree shape per layer — the naming heuristic names the tree; the shape rule organises its paths.
| Layer | Shape | Top-level nodes |
|---|---|---|
| Domain | Code-shaped | The unit's exported functions/methods. Paths = observable branches. |
| Use-case | Code-shaped | The use-case's entry point (usually execute or the function name). Paths = observable branches. |
| Port contract | Method-shaped | The port's methods. Paths = behaviours the contract requires of every implementation. |
| Adapter | Protocol-shaped | The adapter's exposed operations (HTTP routes, CLI commands, queue topics, real-infra concerns). |
| System | Consumer-shaped | Consumer-visible events on the slice. Consumer vocabulary; principles, not cases. |
At Domain, Use-case, and Port-contract, TDD + YAGNI means no branch without a path and no path without a branch — the tree maps onto the code's methods and branches. At Adapter and System, the tree describes observable behaviour at the seam, not the internal branches that produce it.
A Domain tree is shaped like its class/module:
Domain: Money (src: src/features/money/domain/money.ts; unit: src/features/money/domain/money.domain.test.ts; integration: none; functional: none)
add
when called with another Money of the same currency
then the sum's amount is the sum of the two amounts
and the currency is preserved
if called with another Money of a different currency
then CurrencyMismatch is thrown
multiply
when called with a positive factor
then the amount is multiplied by the factor
if called with a negative factor
then NegativeMultiplier is thrown
The test file's describe/it hierarchy mirrors this verbatim.
The in-memory adapter pattern
For each outbound port, ship two adapters that both satisfy the port contract:
OrderRepository (port interface)
├── PostgresOrderRepository ← real, used in production and in System and driven-adapter tests
└── InMemoryOrderRepository ← real, used in Use-case tests
The in-memory adapter is not a mock. It's a real implementation: stores data in a map, enforces the same invariants (unique IDs, referential rules, ordering guarantees) that the real adapter does. The composition root swaps it in at test time by pointing at a different wiring.
What this buys:
System tests do NOT lean on the in-memory adapter. They wire real driven adapters and exercise real infrastructure — that's the point of the System layer. The in-memory twin exists to serve Use-case tests, not to dilute System tests into slow Use-case tests with extra ceremony.
The shared port contract suite
In-memory substitution is only sound if both adapters really do satisfy the same contract. Enforce this with a shared contract suite: one test suite, imported by both adapter test files.
// src/features/score/application/ports/score-repository.contract.ts
export function scoreRepositoryContract(makeRepo: () => ScoreRepository) {
describe('ScoreRepository', () => {
describe('save', () => {
describe('when called with a score', () => {
it('makes the score retrievable by its id', async () => {
const repo = makeRepo()
await repo.save(someScore)
expect(await repo.findById(someScore.id)).toEqual(someScore)
})
})
describe('if called twice with the same score id', () => {
it('rejects the second call without side effects', async () => { /* ... */ })
})
})
})
}
// src/features/score/adapters/outbound/in-memory/in-memory-score-repository.adapter.test.ts
import { scoreRepositoryContract } from '../../../application/ports/score-repository.contract'
scoreRepositoryContract(() => new InMemoryScoreRepository())
// src/features/score/adapters/outbound/postgres/postgres-score-repository.adapter.test.ts
import { scoreRepositoryContract } from '../../../application/ports/score-repository.contract'
scoreRepositoryContract(() => new PostgresScoreRepository(testDb))
// Plus Postgres-specific tests: timeouts, schema, constraint violations.
The contract suite IS the port-contract tree (ScoreRepository). Both adapter tests run the shared suite. The real adapter's test file also has tests for behaviour that only exists at its seam (timeouts, retries, constraint violations) — those live in the same *.adapter.test.* file but outside the shared suite.
The tree in TEST_TREES.md:
Port: ScoreRepository (src: src/features/score/application/ports/score-repository.ts; unit: src/features/score/application/ports/score-repository.contract.ts; integration: src/features/score/adapters/outbound/postgres/postgres-score-repository.adapter.test.ts; functional: none)
save
when called with a score
then the score is retrievable by its id
if called twice with the same score id
then the second call is rejected without side effects
Feature-first module layout — use this directory shape when adding or touching a capability:
src/features/<name>/
domain/ # entities, value objects, domain services (+ .domain.test.*)
application/
ports/ # outbound port interfaces (+ .contract.ts shared suites)
use-cases/ # orchestration (+ .use-case.test.*)
adapters/
inbound/ # http, cli, queue, cron (+ .adapter.test.*)
outbound/
in-memory/ # in-memory implementations of outbound ports
<real>/ # real implementations (postgres, stripe, s3) (+ .adapter.test.*)
composition/ # explicit wiring; points at real or in-memory adapters per layer
The composition root is the only place that imports concrete adapters and wires them into use-cases. Nothing else should. Use-case and System tests wire in-memory adapters through it; production wires real adapters.
Share the trees with the user before moving to implementation. The trees are the contract — get agreement before building.
Once aligned, suggest the user runs sync to audit completeness and implement gaps.
when/then.then describes outcomes, including side effects — what the consumer observes, what changes, what's produced, what's prevented, what's written externally (files, network, logs, state), what's cleaned up. A side effect that another invocation, hook, process, or operator could detect is behaviour and belongs in the tree.then must assert something the when clause does not already imply — if a then merely restates the condition, it's a tautology and adds no value. "when created / then it is created" tests nothing. "when created with default / then value is zero" asserts a concrete outcome.if/then for error cases and unwanted behaviour. Absence of behaviour is part of the specification.sync compares these two directly.<Layer>: <Subject> (e.g. Domain: Money, System: save-score). The layer disambiguates trees that share a subject across layers; the subject is what is being tested. Without both, duplication can't be spotted and the test's purpose isn't legible.then clause states its assertion inline. No cross-leaf references: phrases like "see above", "as before", or "the existing X branch holds" are forbidden. If two leaves would say the same thing, write both — duplication is the diagnostic signal that surfaces shared concepts; suppressing it hides design problems.Good — uses EARS patterns to match each requirement's nature:
System: media-player
then supports mp3 and wav formats
while playing
when pause is pressed
then playback pauses at current position
when track ends
then the next track starts automatically
when a track is loaded
then playback begins from the start
if the file is corrupt
then playback is rejected with an error message
where bluetooth is available
then audio can be routed to a bluetooth device
Bad — missing layer and subject (just a bare capability slug):
media-player
when a track is loaded
then playback begins from the start
Good — names layer and subject so duplication is detectable and purpose is legible:
System: media-player
when a track is loaded
then playback begins from the start
Bad — enumerates cases:
System: media-player
when file is "song.mp3"
then song.mp3 plays
when file is "track.wav"
then track.wav plays
Good — states the principle once:
System: media-player
then supports mp3 and wav formats
when a track is loaded
then playback begins from the start
Bad — tautological (then restates the when, so the path asserts nothing):
System: media-player
when a track is loaded
then a track is loaded
when playback is paused
then it pauses
Good — then asserts an outcome the when does not imply:
System: media-player
when a track is loaded
then playback begins from the start
when playback is paused
then the current position is retained for resume
Bad — flat siblings for causally dependent behaviour:
Use-case: auth
when token is invalid
then refresh is attempted
when refresh fails
then user is logged out
Good — causal nesting (refresh failure depends on refresh being attempted):
Use-case: auth
when token is invalid
then refresh is attempted
when refresh fails
then user is logged out
Bad — uses implementation language:
System: media-player
when AudioContext.decodeAudioData resolves
then the Float32Array buffer is assigned to the source node
Good — uses consumer vocabulary:
System: media-player
when a track is loaded
then playback begins from the start
Test trees use EARS (Easy Approach to Requirements Syntax) to choose the right keyword for each requirement. Match the pattern to the requirement's nature — don't force everything into when/then.
Ubiquitous — always true, no condition:
then <outcome>
State-driven — active while a condition holds:
while <precondition>
then <outcome>
Event-driven — response to a trigger:
when <trigger>
then <outcome>
Optional feature — applies only when a feature is present:
where <feature>
then <outcome>
Unwanted behaviour — response to error or undesired situation:
if <condition>
then <outcome>
Complex — state + event combined:
while <precondition>
when <trigger>
then <outcome>
Causal nesting — when a trigger can only occur as a consequence of a prior outcome, nest it under that outcome:
when <trigger>
then <outcome>
when <consequence of outcome>
then <next outcome>
A when that depends on a preceding then is not a sibling — it is a child. If "refresh fails" can only happen because "refresh was attempted", nest it under the then that attempts the refresh.
Choose the pattern that fits: a system constraint is ubiquitous; a precondition that must hold is state-driven; a discrete trigger is event-driven; an error case is unwanted behaviour; a feature flag is optional. Combine when needed. Nest when one behaviour depends on another's outcome.
npx claudepluginhub elimydlarz/claude-code-plugins --plugin contreeProvides UI/UX resources: 50+ styles, color palettes, font pairings, guidelines, charts for web/mobile across React, Next.js, Vue, Svelte, Tailwind, React Native, Flutter. Aids planning, building, reviewing interfaces.
Fetches up-to-date documentation from Context7 for libraries and frameworks like React, Next.js, Prisma. Use for setup questions, API references, and code examples.