From contree
Prepare the project for ongoing test-tree-driven development — configures test framework with tree reporters and generates initial test trees in TEST_TREES.md. TRIGGER when: project has no test framework configured, no TEST_TREES.md at project root, or user is starting a new project. Run once per project.
Slash command: /contree:setup
Prepares the project for ongoing test-tree-driven development. Configures the test framework, generates initial test trees in TEST_TREES.md, and establishes the contract between intent and implementation that the other skills maintain.
1. Read before write. Always read existing config files before modifying them. Never overwrite — merge surgically.
2. Tree output is non-negotiable. If a framework can produce nested output, configure it. If it can only produce flat output, use it and be honest.
3. Four test layers, always. Layers are named for the hex seam under test, not for infrastructure presence:
   - Domain: *.domain.test.*
   - Use-case: *.use-case.test.*
   - Adapter: *.adapter.test.*
   - System: test/system/, *.system.test.*
   See skills/tdd/SKILL.md for the full mapping, the in-memory adapter pattern, and the shared port contract suite.
4. CI dual reporters. Configure tree output for local dev AND structured output (JUnit XML) for CI. Both, not either/or.
5. Verify after configuring. Run the tests and confirm tree-shaped output before moving on.
6. No test files. Setup configures the framework and generates requirement trees. Do NOT create any test files (*.test.*, *.spec.*). The tdd skill handles test implementation later.
Read project files — source code, existing tests, configs, CLAUDE.md. Understand:
- Whether TEST_TREES.md already exists at the project root
Detect existing test config. Check for these files before creating or modifying anything:
| Ecosystem | Config files to check |
|---|---|
| Vitest | vitest.config.*, test key in vite.config.* |
| Jest | jest.config.*, jest key in package.json |
| Mocha | .mocharc.*, mocha key in package.json |
| pytest | conftest.py, pytest.ini, [tool.pytest.ini_options] in pyproject.toml, [tool:pytest] in setup.cfg |
| RSpec | .rspec, spec/spec_helper.rb |
| Minitest | test/test_helper.rb |
| PHPUnit | phpunit.xml, phpunit.xml.dist |
| Pest | pest in composer.json |
| JUnit/Gradle | build.gradle(.kts) for testLogging or test-logger plugin |
| JUnit/Maven | pom.xml for surefire/failsafe config |
| Go | Makefile/scripts for gotestsum |
| Rust | .config/nextest.toml |
| Elixir | test/test_helper.exs |
| .NET | .csproj for test SDK references |
| Bats | test/*.bats, bats in package.json |
If config exists, merge into it — add the reporter setting alongside existing keys. Never replace the file.
Detect existing test framework from project manifests. If none exists, identify the most suitable for the project's language.
When multiple frameworks are detected (e.g., both Jest and Vitest deps exist during a migration), present both and ask the user which to use.
Present identified frameworks with trade-offs and recommendation. Include tree output quality in the comparison. Let the user choose before proceeding.
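The detection step above can be sketched as a small function over a parsed package.json. This is a minimal illustration, not contree's implementation: the names `detectFrameworks`, `decideFramework`, and the `Decision` type are hypothetical, and only three JS/TS frameworks are covered.

```typescript
// Hypothetical helper names; the shape is illustrative only.
type Manifest = {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
};

const KNOWN_FRAMEWORKS = ['vitest', 'jest', 'mocha'] as const;
type Framework = (typeof KNOWN_FRAMEWORKS)[number];

// List every known framework that appears in either dependency block.
function detectFrameworks(manifest: Manifest): Framework[] {
  const deps = { ...manifest.dependencies, ...manifest.devDependencies };
  return KNOWN_FRAMEWORKS.filter((name) => name in deps);
}

type Decision =
  | { kind: 'none' } // nothing installed: recommend one for the language
  | { kind: 'single'; framework: Framework }
  | { kind: 'ask-user'; candidates: Framework[] }; // e.g. mid-migration

// More than one hit is never resolved silently; the user chooses.
function decideFramework(found: Framework[]): Decision {
  if (found.length === 0) return { kind: 'none' };
  if (found.length === 1) return { kind: 'single', framework: found[0] };
  return { kind: 'ask-user', candidates: found };
}

const decision = decideFramework(
  detectFrameworks({ devDependencies: { jest: '^29.0.0', vitest: '^3.0.0' } }),
);
console.log(decision.kind);
```

The `ask-user` branch is the point of the sketch: when a migration leaves two frameworks installed, both are presented rather than one being picked implicitly.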
Confirm how conventions apply to this project:
- Domain: *.domain.test.*
- Use-case: *.use-case.test.*
- Adapter: *.adapter.test.*
- System: test/system/ at project root, *.system.test.*
- Port contract suite: *.contract.ts (not a test file — a suite imported by both the in-memory and real adapter tests)
Language-specific conventions that override defaults:
- Rust: Domain and Use-case tests are inline (#[cfg(test)] mod tests); Adapter (driven) and System tests live in tests/ at crate root — this is the language convention
- Go: Domain and Use-case tests are colocated (foo_test.go next to foo.go); Adapter (driven) tests use *_integration_test.go with //go:build integration tags; System tests live in test/system/ or tests/system/ per convention
- Ruby: the spec/ directory is the overwhelming convention — follow it, subdivide by layer (spec/domain/, spec/use_case/, spec/adapter/, spec/system/)
Monorepo strategy:
- System tests live in the root test/system/ if they exercise cross-package behaviour, or per-package if they test a single package
Configure the Domain, Use-case, and Adapter layers as separate projects/configurations in the test runner. See the Framework Reference below — the Vitest projects example shows one config with four projects.
Do NOT skip. Do NOT rely on defaults. Do NOT overwrite existing config — merge into it.
If the config already has a reporters or verbose key, check whether changing it would break CI (e.g., removing a JUnit XML reporter). Present the conflict to the user rather than silently overwriting.
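A merge that respects an existing CI reporter might look like the sketch below. It assumes reporters are listed either as a name or as a `[name, options]` tuple, which is how both Vitest and Jest accept them; `mergeReporters` and `reporterName` are hypothetical names introduced for illustration.

```typescript
// A reporter entry is a bare name or a [name, options] tuple.
type ReporterEntry = string | [string, Record<string, unknown>];

function reporterName(entry: ReporterEntry): string {
  return typeof entry === 'string' ? entry : entry[0];
}

// Append the wanted reporters that are missing.
// Existing entries are never removed, reordered, or reconfigured.
function mergeReporters(existing: ReporterEntry[], wanted: ReporterEntry[]): ReporterEntry[] {
  const present = new Set(existing.map(reporterName));
  const additions = wanted.filter((entry) => !present.has(reporterName(entry)));
  return [...existing, ...additions];
}

// The project's JUnit reporter and its options survive; 'tree' is added.
const merged = mergeReporters(
  [['junit', { outputFile: './reports/junit.xml' }]],
  ['tree', 'junit'],
);
console.log(merged.map(reporterName).join(', '));
```

Because the function only appends, a conflict such as an existing reporter that must be replaced cannot be resolved by it. That case still goes to the user.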
Separate command/config for the System layer:
- test/system/ at project root
- *.system.test.* naming
Determine whether a Docker harness is needed for the real-infra modes. See the Docker Harness Reference below. Key question: do Adapter (driven) or System (real-infra mode) tests need external processes — databases, queues, HTTP servers? If yes, set up a Docker Compose harness. If the software is pure in-process, Docker is unnecessary.
When configuring Docker:
- docker-compose.yml lives at project root (or test/functional/docker-compose.yml if the project root is already crowded)
- A test:system:real script (or test:adapter:real) orchestrates the full lifecycle
Install an appropriate mutation testing tool (see Mutation Testing Reference below). Configure with:
- Test-file exclusions (!src/**/*.domain.test.*, !src/**/*.use-case.test.*, !src/**/*.adapter.test.*, !src/**/*.contract.ts)
- Thresholds: high: 80, low: 60, break: 50
- A run script (npm run test:mutate)
Contree prescribes hexagonal architecture: domain is pure, I/O lives in adapters, dependencies point inward. Install a linter that enforces this so boundary violations break the build rather than the review.
For JS/TS projects — install dependency-cruiser:
pnpm add -D dependency-cruiser
Write .dependency-cruiser.cjs at the project root:
module.exports = {
forbidden: [
{
name: 'domain-pure',
comment: 'Domain has no I/O. It must not reach adapters or application code.',
severity: 'error',
from: { path: 'src/.+/domain/' },
to: { path: 'src/.+/(adapters|application)/' },
},
{
name: 'use-case-no-adapter',
comment: 'Use-cases depend on ports (interfaces), not concrete adapters.',
severity: 'error',
from: { path: 'src/.+/application/' },
to: { path: 'src/.+/adapters/' },
},
{
name: 'no-circular',
severity: 'error',
from: {},
to: { circular: true },
},
],
options: {
tsConfig: { fileName: 'tsconfig.json' },
},
};
Add a script and wire it into the project's lint command:
{
"scripts": {
"lint:arch": "depcruise src --config .dependency-cruiser.cjs",
"lint": "... && pnpm lint:arch"
}
}
Ensure CI runs pnpm lint (or pnpm lint:arch directly) so architectural violations fail builds.
For non-JS/TS projects — recommend the language-native equivalent. Don't attempt to install without a template; tell the user the rules they need to enforce (domain code must not import from adapters or application code; application code must not import from adapters) and name the tool:
| Language | Tool |
|---|---|
| Java / Kotlin | ArchUnit |
| Go | go list + depguard |
| Python | import-linter |
| Rust | cargo-modules with CI assertions |
State the limitation honestly: without contree-provided config, the user wires the rules themselves.
Configure commands to run only tests affected by recent changes. Be aware of the gotchas — several "changed" flags silently run zero tests in common situations.
Framework-native support:
| Framework | Command | Gotcha |
|---|---|---|
| Vitest | --changed | Only tracks changed source files, NOT changed test files. If you edit a test without changing source, zero tests run. Use --watch for local TDD instead. |
| Jest | --onlyChanged / -o | Uses git status — after committing, nothing is "changed" and zero tests run. Useless in CI. |
| Jest | --changedSince=origin/main | CI-appropriate. Requires git fetch origin main first (shallow clones break it). Compare against origin/main, not main. |
| pytest | pytest-testmon | Tracks dependencies via coverage.py. First run builds the map (slower). .testmondata goes in .gitignore. |
| pytest | --last-failed | Built-in. Re-runs failures from previous run. Good complement to testmon. |
| RSpec | --only-failures | Requires example_status_persistence_file_path in spec_helper. |
| Go | gotestsum --watch | File watcher, re-runs on save. No git-aware mode. |
| Rust | cargo nextest run + watchexec | No built-in changed mode. Use watchexec -e rs -- cargo nextest run. |
For local TDD: prefer file watchers (vitest --watch, gotestsum --watch, guard-rspec, watchexec) over git-based --changed flags. Watchers are more reliable during rapid red-green cycles.
For CI: use branch-comparison flags (--changedSince=origin/main, nx affected:test, turbo run test --filter=...[origin/main]). Ensure adequate git fetch depth.
Commands should be simple to invoke — package.json scripts, Makefile targets, or mix aliases.
Create TEST_TREES.md at the project root if it does not already exist, containing a short header noting that the file holds the project's test trees and that new trees should be added as ### subsections using EARS patterns.
Do not compose the trees yourself in this step. Tree decomposition is the change skill's expertise — it enforces one-tree-one-file, the system/inner-layer naming heuristic, the EARS-by-requirement-nature rule, and causal nesting. Doing it inline here tends to drop those rules and produce grouped or mis-layered trees.
Once the framework is configured and TEST_TREES.md exists, invoke /contree:change and let that skill drive the tree composition. Do not attempt it yourself, even if the project looks simple.
Do not create any *.test.* or *.spec.* files in this step, not even with .todo/.skip stubs. Tests are the tdd skill's output.
Add or update the following sections:
- A pointer line identifying TEST_TREES.md as the definition of the project's test trees. If CLAUDE.md already references TEST_TREES.md, do not duplicate the pointer.
This step is mandatory. Do not skip it. If MENTAL_MODEL.md does not exist at the project root, create it now — before VERIFY — with exactly seven H2 sections.
If MENTAL_MODEL.md already exists, leave it alone — its content is authoritative and must not be modified.
The seven H2 sections, in order, each followed by a one-line placeholder describing what belongs there:
- ## Core Domain Identity
- ## World-to-Code Mapping
- ## Ubiquitous Language
- ## Bounded Contexts
- ## Invariants
- ## Decision Rationale
- ## Temporal View
The placeholders are replaced as the project accrues real content; their purpose is to make the expected contents of each section legible without content yet.
Then add a pointer line to CLAUDE.md identifying MENTAL_MODEL.md as the definition of the project's mental model. If CLAUDE.md already references MENTAL_MODEL.md, do not duplicate the pointer.
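The create-once rule for MENTAL_MODEL.md can be sketched as a pure function. The seven section titles come from this document. The placeholder wording and the name `scaffoldMentalModel` are assumptions made for illustration, and reading or writing the file is left to the caller.

```typescript
// Section titles as listed in this skill, in order.
const MENTAL_MODEL_SECTIONS = [
  'Core Domain Identity',
  'World-to-Code Mapping',
  'Ubiquitous Language',
  'Bounded Contexts',
  'Invariants',
  'Decision Rationale',
  'Temporal View',
] as const;

// Returns the body to write, or null when the file already exists.
// Pass null for existingContent when MENTAL_MODEL.md is absent.
function scaffoldMentalModel(existingContent: string | null): string | null {
  // An existing file is authoritative and is left untouched.
  if (existingContent !== null) return null;
  return MENTAL_MODEL_SECTIONS
    // Placeholder wording is hypothetical; the skill only requires one line.
    .map((title) => `## ${title}\n_Placeholder: what belongs under "${title}"._\n`)
    .join('\n');
}

console.log(scaffoldMentalModel(null));
```

Returning null for an existing file keeps the "leave it alone" rule in one place instead of relying on every caller to check first.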
Run each layer's test suite and confirm tree-shaped output at each layer:
- Domain (*.domain.test.*) — tree-shaped (or best available for the language)
- Use-case (*.use-case.test.*) — tree-shaped
- Adapter (*.adapter.test.*) — tree-shaped
- System (*.system.test.*) — tree-shaped
Do NOT create test files to verify the reporter. If no tests exist yet, the empty suite's output (no tests found, reporter-formatted) is sufficient evidence that the reporter is wired correctly. Writing smoke tests or stubs violates rule #6 and rule #3 above (No fake code). The tdd skill writes tests later, from the trees.
Test trees use EARS (Easy Approach to Requirements Syntax) to choose the right keyword for each requirement. Match the pattern to the requirement's nature — don't force everything into when/then.
Ubiquitous — always true, no condition:
then <outcome>
State-driven — active while a condition holds:
while <precondition>
  then <outcome>
Event-driven — response to a trigger:
when <trigger>
  then <outcome>
Optional feature — applies only when a feature is present:
where <feature>
  then <outcome>
Unwanted behaviour — response to error or undesired situation:
if <condition>
  then <outcome>
Complex — state + event combined:
while <precondition>
  when <trigger>
    then <outcome>
Causal nesting — when a trigger can only occur as a consequence of a prior outcome, nest it under that outcome:
when <trigger>
  then <outcome>
    when <consequence of outcome>
      then <next outcome>
A when that depends on a preceding then is not a sibling — it is a child. If "refresh fails" can only happen because "refresh was attempted", nest it under the then that attempts the refresh.
Choose the pattern that fits: a system constraint is ubiquitous; a precondition that must hold is state-driven; a discrete trigger is event-driven; an error case is unwanted behaviour; a feature flag is optional. Combine when needed. Nest when one behaviour depends on another's outcome.
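One way to picture causal nesting is as a recursive data structure rendered with indentation. The sketch below is illustrative only: `TreeNode` and `render` are hypothetical names, the two-space indent is an assumption, and the token-refresh wording merely expands the refresh example mentioned above.

```typescript
type Keyword = 'while' | 'when' | 'where' | 'if' | 'then';

interface TreeNode {
  keyword: Keyword;
  text: string;
  children?: TreeNode[];
}

// Depth-first render: each level of nesting adds two spaces of indent.
function render(node: TreeNode, depth = 0): string[] {
  const line = `${'  '.repeat(depth)}${node.keyword} ${node.text}`;
  const nested = (node.children ?? []).flatMap((child) => render(child, depth + 1));
  return [line, ...nested];
}

// "the refresh fails" can only happen after a refresh was attempted,
// so it is a child of that `then`, not a sibling of the outer `when`.
const refreshTree: TreeNode = {
  keyword: 'when',
  text: 'the access token expires',
  children: [
    {
      keyword: 'then',
      text: 'a refresh is attempted',
      children: [
        {
          keyword: 'when',
          text: 'the refresh fails',
          children: [{ keyword: 'then', text: 'the user is signed out' }],
        },
      ],
    },
  ],
};

console.log(render(refreshTree).join('\n'));
```

Modelling the tree this way makes the sibling-versus-child decision explicit: a dependent `when` has nowhere to live except under the `then` that causes it.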
True tree output (nested indentation): Vitest, Jest, Mocha, RSpec, Gradle test-logger-plugin (mocha theme), Maven tree-reporter
Partial tree (one level grouping): pytest-spec, PHPUnit testdox, Pest testdox, Minitest SpecReporter
Flat only (no nesting model): Go, Rust, Elixir (ExUnit), Bats, Swift, .NET CLI
Tree reporter:
// vitest.config.ts
import { defineConfig } from 'vitest/config'
export default defineConfig({
test: {
// 'tree' gives nested describe/it output.
// CRITICAL: Do NOT use 'verbose' — in Vitest v3+ it switched to flat output.
// 'tree' is the correct reporter for nested indentation.
reporters: [
'tree',
// Add JUnit XML for CI alongside tree for local dev:
...(process.env.CI ? ['junit'] : []),
],
outputFile: process.env.CI ? { junit: './reports/junit.xml' } : undefined,
},
})
Separating the four test layers — use Vitest projects (replaces deprecated vitest.workspace.ts in v3.2+). One project per layer: domain, use-case, adapter, system. Same pattern shown below — adjust include globs and timeouts per layer. Adapter (driven) and System (with real infra) may need much higher timeouts than Domain/Use-case:
// vitest.config.ts
import { defineConfig } from 'vitest/config'
export default defineConfig({
test: {
reporters: ['tree'], // reporters are root-level only — silently ignored inside projects
projects: [
{
extends: true, // inherit root config (plugins, resolve.alias, etc.)
test: {
name: 'domain',
include: ['src/**/*.domain.test.{ts,js}'],
},
},
{
extends: true,
test: {
name: 'use-case',
include: ['src/**/*.use-case.test.{ts,js}'],
},
},
{
extends: true,
test: {
name: 'adapter',
include: ['src/**/*.adapter.test.{ts,js}'],
testTimeout: 30_000, // driven adapters hit real infra
hookTimeout: 30_000,
},
},
{
extends: true,
test: {
name: 'system',
include: ['test/system/**/*.system.test.{ts,js}'],
testTimeout: 30_000,
hookTimeout: 30_000,
},
},
],
},
})
Scripts:
{
"test": "vitest run",
"test:domain": "vitest run --project domain",
"test:use-case": "vitest run --project use-case",
"test:adapter": "vitest run --project adapter",
"test:system": "vitest run --project system",
"test:changed": "vitest run --changed",
"test:watch": "vitest",
"test:mutate": "stryker run"
}
Gotchas:
- reporters is root-level only — setting it inside a projects[*].test block is silently ignored
- extends: true in a project block is required to inherit root-level config — without it you get a bare Vite config
- --changed uses the import graph but only tracks changed source files, not changed test files — use --watch for local TDD
- vitest.workspace.ts is deprecated since v3.2 — use the projects array inside vitest.config.ts
Tree reporter:
// jest.config.ts
import type { Config } from 'jest'
const config: Config = {
// verbose: true IS Jest's tree output — it nests test names under describe blocks.
// There is no separate 'tree' reporter in Jest.
verbose: true,
// Add JUnit XML for CI:
reporters: [
'default',
...(process.env.CI ? [['jest-junit', {
outputDirectory: 'reports',
outputName: 'junit.xml',
}]] : []),
],
// Separate unit and functional via projects:
projects: [
{
displayName: 'unit', // required for --selectProjects to work
testMatch: ['<rootDir>/src/**/*.unit.test.{ts,js}'],
transform: { '^.+\\.tsx?$': 'ts-jest' },
testEnvironment: 'node',
},
{
displayName: 'functional',
testMatch: ['<rootDir>/test/functional/**/*.functional.test.{ts,js}'],
transform: { '^.+\\.tsx?$': 'ts-jest' },
testEnvironment: 'node',
testTimeout: 30_000,
},
],
}
export default config
Scripts:
{
"test": "jest",
"test:unit": "jest --selectProjects unit",
"test:functional": "jest --selectProjects functional",
"test:changed": "jest --changedSince=origin/main",
"test:mutate": "stryker run"
}
Gotchas:
- verbose and reporters are shared across all projects — you cannot set them per-project
- displayName is required for --selectProjects and --ignoreProjects to work
- --onlyChanged uses git status — after committing, zero tests run; use --changedSince=origin/main for CI
- --changedSince requires the base branch to be fetchable — in CI run git fetch --no-tags --depth=1 origin main first, then use origin/main (not main)
- When projects is configured and you use Stryker with Jest, you may need a separate jest.config for Stryker that targets Domain + Use-case tests only, without the projects array
- Do not install ts-jest if the project uses Vitest (which handles TypeScript natively)
Tree reporter:
# .mocharc.yml
# 'spec' is the tree-style reporter (nested describe/it). It is also the default.
reporter: spec
require:
- tsx # TypeScript support
recursive: true
timeout: 5000
extension:
- ts
- js
Separating test suites — use separate config files:
.mocharc.unit.yml:
require: [tsx]
spec: 'src/**/*.unit.test.{ts,js}'
reporter: spec
parallel: true
jobs: 4
timeout: 5000
.mocharc.functional.yml:
require: [tsx]
spec: 'test/functional/**/*.functional.test.{ts,js}'
reporter: spec
parallel: false # functional tests often need serial execution
timeout: 30000
Scripts:
{
"test:unit": "mocha --config .mocharc.unit.yml",
"test:functional": "mocha --config .mocharc.functional.yml",
"test:mutate": "stryker run"
}
Gotchas:
- No built-in --changed flag — use a file watcher or a script: mocha $(git diff --name-only -- '*.test.ts')
- In parallel mode, root hooks must be loaded via --require with a root hook plugin file
- The spec reporter works correctly in parallel mode
Install (pick the runner matching your test framework):
pnpm add -D @stryker-mutator/core @stryker-mutator/vitest-runner
# OR: @stryker-mutator/jest-runner
# OR: @stryker-mutator/mocha-runner
# Optional: @stryker-mutator/typescript-checker
Vitest runner config:
// stryker.config.mjs
/** @type {import('@stryker-mutator/api/core').PartialStrykerOptions} */
export default {
testRunner: 'vitest',
vitest: {
configFile: 'vitest.config.ts',
dir: '.',
// Only run tests related to mutated files — MUCH faster:
related: true,
},
mutate: [
'src/**/*.ts',
// Exclude test files — critical when tests are colocated with source:
'!src/**/*.test.ts',
'!src/**/*.spec.ts',
'!src/**/*.unit.test.ts',
'!src/**/*.functional.test.ts',
'!src/**/*.d.ts',
],
coverageAnalysis: 'perTest', // most efficient — always use this
reporters: ['clear-text', 'progress', 'html'],
htmlReporter: { fileName: 'reports/mutation/index.html' },
thresholds: { high: 80, low: 60, break: 50 },
// Incremental mode — stores state between runs for speed.
// Commit the file or store as CI artifact for cross-run benefits.
incremental: true,
incrementalFile: 'reports/stryker-incremental.json',
// TypeScript checker prunes non-compiling mutants early (faster):
checkers: ['typescript'],
tsconfigFile: 'tsconfig.json',
concurrency: 4,
timeoutMS: 10_000,
timeoutFactor: 1.5,
ignoreStatic: true, // skip mutants in static initializers (low value, slow)
}
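How the three threshold values are read can be sketched as a small classifier. It follows Stryker's documented meaning of high, low, and break; the function name `judgeScore` and the `Verdict` shape are hypothetical.

```typescript
interface Thresholds {
  high: number;
  low: number;
  break: number | null; // null means the build never fails on score
}

interface Verdict {
  band: 'ok' | 'warning' | 'danger';
  failsBuild: boolean;
}

// score is the mutation score as a percentage (0 to 100).
function judgeScore(score: number, thresholds: Thresholds): Verdict {
  const band = score >= thresholds.high ? 'ok' : score >= thresholds.low ? 'warning' : 'danger';
  // Only `break` affects the exit code; high and low only affect reporting.
  const failsBuild = thresholds.break !== null && score < thresholds.break;
  return { band, failsBuild };
}

const configured: Thresholds = { high: 80, low: 60, break: 50 };
console.log(judgeScore(55, configured)); // reported as danger, build still passes
console.log(judgeScore(45, configured)); // below break, build fails
```

With the values used in this skill, a score between 50 and 60 is reported as poor but does not fail CI. Raising `break` is what tightens the gate.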
Jest runner config — same structure but:
testRunner: 'jest',
jest: {
projectType: 'custom',
configFile: 'jest.config.ts',
enableFindRelatedTests: true, // equivalent of vitest.related
},
Mocha runner config — same structure but:
testRunner: 'mocha',
mochaOptions: {
spec: ['src/**/*.unit.test.ts'],
config: '.mocharc.unit.yml',
require: ['tsx'],
timeout: 10_000,
ui: 'bdd',
},
Note: Mocha runner does NOT reliably support coverageAnalysis: 'perTest' — fall back to 'all' if you see errors.
Gotchas:
- Match the runner to the framework: @stryker-mutator/vitest-runner for Vitest, jest-runner for Jest, etc. Mismatching silently fails or crashes.
- vitest.related: true and jest.enableFindRelatedTests: true are critical for performance — without them Stryker runs ALL tests for every mutant
- coverageAnalysis: 'perTest' is the most efficient option — 'all' re-runs the full suite per mutant
- ignoreStatic: true skips mutants in const x = 'hello' at module scope — these are killed by every importing test, slow and low value
- thresholds.break is null by default (no CI failure) — set it to enforce the gate
- checkers: ['typescript'] requires installing @stryker-mutator/typescript-checker
Tree reporter — pytest-spec + pytest-describe:
pip install pytest-spec pytest-describe
# or: uv add --dev pytest-spec pytest-describe
# pyproject.toml
[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = "--spec --strict-markers"
# pytest-describe: enable when/context prefixes for nested blocks
describe_prefixes = ["describe_", "context_", "when_"]
# pytest-spec: configure output format
spec_header_format = "{module_path}:"
spec_test_format = "{result} {name}"
spec_success_indicator = "+"
spec_failure_indicator = "-"
spec_skipped_indicator = "?"
# Markers for test categorisation
markers = [
"unit: Fast isolated unit tests",
"functional: End-to-end functional tests",
"slow: Tests taking >5s",
]
strict_markers = true
pytest-describe enables nested describe/context blocks:
def describe_wallet():
def describe_after_deposit():
def it_has_the_deposited_amount(wallet):
assert wallet.balance == 100
pytest-spec formats the output as indented tree. They compose — use both together for best results.
Separating unit and functional tests:
tests/
unit/
conftest.py # auto-marks all tests as @pytest.mark.unit
test_models.py
functional/
conftest.py # auto-marks all tests as @pytest.mark.functional
test_api.py
conftest.py # shared fixtures
Auto-mark by directory in tests/unit/conftest.py:
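import pytest
from pathlib import Path
UNIT_DIR = Path(__file__).parent
def pytest_collection_modifyitems(items):
    # The hook receives every collected item, so filter by directory (item.path needs pytest >= 7).
    for item in items:
        if UNIT_DIR in item.path.parents:
            item.add_marker(pytest.mark.unit)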
Run independently:
pytest tests/unit/ # or: pytest -m unit
pytest tests/functional/ # or: pytest -m functional
Changed-test runner — pytest-testmon:
pip install pytest-testmon
pytest --testmon # first run builds dependency map; subsequent runs only affected tests
pytest --last-failed # built-in: re-run failures from previous run
.testmondata goes in .gitignore — it is machine-specific.
Mutation testing — mutmut:
pip install mutmut
# pyproject.toml
[tool.mutmut]
paths_to_mutate = ["src/"]
tests_dir = ["tests/"]
runner = "python -m pytest -x --tb=short -q"
do_not_mutate = [
"src/*/migrations/*",
"src/*/config.py",
]
mutate_only_covered_lines = true # huge speed improvement — always enable
mutmut run # run all mutations
mutmut run "src/myapp/models*" # target specific modules
mutmut browse # TUI to inspect results (replaces mutmut html in v3)
Gotchas:
- Do not combine --spec with -v/--verbose — use --spec instead, not both
- mutmut v3: mutmut html is gone, use mutmut browse (TUI)
- mutate_only_covered_lines = true is critical for speed on large codebases
- pytest runs unittest.TestCase natively — get tree output by running unittest tests through pytest with pytest-spec
Tree reporter — RSpec:
# .rspec
--format documentation
--color
--order random
--require spec_helper
The documentation formatter prints nested describe/context/it blocks as indented text.
Separating spec directories:
# spec/spec_helper.rb
RSpec.configure do |config|
config.define_derived_metadata(file_path: %r{/spec/functional/}) do |metadata|
metadata[:functional] = true
end
config.define_derived_metadata(file_path: %r{/spec/unit/}) do |metadata|
metadata[:unit] = true
end
# Persistence file for --only-failures:
config.example_status_persistence_file_path = "spec/examples.txt"
end
Run by tag:
rspec --tag unit
rspec --tag functional
rspec --only-failures # re-run previous failures
rspec --next-failure # re-run previous failures, stopping at the first one that still fails
File watching — guard-rspec:
# Gemfile
gem 'guard-rspec', require: false
Mutation testing — mutant:
# Gemfile
group :development, :test do
gem 'mutant'
gem 'mutant-rspec'
end
bundle exec mutant run --include lib --require my_project --integration rspec -- 'MyApp::User'
bundle exec mutant run --include lib --require my_project --integration rspec -- 'MyApp::User#valid?'
Mutant is the gold standard for Ruby mutation testing — mature, actively maintained (v0.14+). Works best on focused classes/modules rather than entire codebases at once. Test selection uses longest RSpec example group description prefix match.
Minitest note: If the project uses Minitest, minitest-reporters with SpecReporter gives one level of grouping (class > test) but not true nesting. If tree output matters, recommend RSpec.
Best available output — gotestsum:
go install gotest.tools/gotestsum@latest
gotestsum --format testdox ./... # BDD-style sentences, grouped by package
gotestsum --format testname ./... # one line per test with package prefix
gotestsum --format testdox --watch ./... # file watcher for TDD
gotestsum --junitfile results.xml ./... # JUnit XML for CI
testdox output groups by package, then lists tests as sentences — one level deep. Go's test model has no describe/context nesting, so no tool can produce a deep tree. Be honest about this.
Separating unit and integration tests — build tags:
// integration_test.go
//go:build integration
package myapp
// ...
go test ./... # unit tests only (no tag)
go test -tags=integration ./... # both unit AND integration
Critical: -tags=integration runs tagged AND untagged files. To run ONLY integration tests, also tag unit tests with //go:build !integration, or use the -short convention:
func TestSlowIntegration(t *testing.T) {
if testing.Short() {
t.Skip("skipping integration test in short mode")
}
}
go test -short ./... # skip integration
go test ./... # run everything
Mutation testing — gremlins:
go install github.com/go-gremlins/gremlins/cmd/gremlins@latest
gremlins unleash # run from module root
gremlins unleash --tags=unit # with build tags
Gremlins (v0.6+, actively maintained) is the best Go mutation tool available. Supports arithmetic, conditionals, increment/decrement mutations. Limitation: runs full test suite per mutation, so impractical for large monolithic modules. Works well for microservice-sized modules (which is most Go code).
Alternatives: go-mutesting (original abandoned; Avito fork has sporadic maintenance) — prefer gremlins.
Best available output — cargo nextest:
cargo install cargo-nextest --locked
# or: cargo binstall cargo-nextest
# .config/nextest.toml
[profile.default]
test-threads = "num-cpus"
fail-fast = true
slow-timeout = { period = "60s", terminate-after = 2 }
status-level = "pass"
failure-output = "immediate"
success-output = "never"
[profile.ci]
retries = 3
fail-fast = false
failure-output = "immediate-final"
[profile.ci.junit]
path = "junit.xml" # relative to the profile's store dir: target/nextest/ci/junit.xml
cargo nextest run # all tests
cargo nextest run --lib # unit tests only (inline #[cfg(test)])
cargo nextest run -E 'kind(test)' # integration tests only (tests/ dir)
cargo nextest run --profile ci # CI profile with retries + JUnit
cargo nextest is a strict upgrade over cargo test — each test runs in its own process (better isolation), parallel by default, better failure output. Only limitation: cannot run doctests (use cargo test --doc separately).
Output is flat — module paths, not nested indentation. Rust's #[test] model has no describe/context hierarchy. Be honest about this.
Test separation follows Rust conventions:
- Unit: #[cfg(test)] mod tests inside source files — access private items
- Integration: tests/ directory at crate root — separate crates, public API only
cargo nextest run --lib # unit only
cargo nextest run -E 'kind(test)' # integration only
Mutation testing — cargo-mutants:
cargo install --locked cargo-mutants
# or: cargo binstall cargo-mutants
# .cargo/mutants.toml
test_tool = "nextest" # use nextest instead of cargo test
Add a speed-optimised profile in Cargo.toml:
[profile.mutants]
inherits = "test"
debug = false # skip debug symbols — faster builds
cargo mutants # all mutations
cargo mutants -f "src/user.rs" # specific file
cargo mutants -F "validate" # specific function regex
cargo mutants --shard 1/4 # CI sharding (parallel)
cargo mutants --profile=mutants # use speed-optimised profile
cargo-mutants (v1.1+, actively maintained) is the most mature Rust mutation tool. Replaces function bodies with default return values, deletes match arms, replaces operators. Works on any stable compiler (no nightly required).
Best available output — ExUnit trace mode:
# test/test_helper.exs
ExUnit.start(trace: true)
Or: mix test --trace
Trace mode sets max_cases: 1 (serial), prints each module and test name. Output is flat — describe block names are prepended to test names as string prefixes, no visual indentation.
Describe blocks are limited to ONE level of nesting — ExUnit forbids nested describe by design. Composition happens through named setup functions:
describe "when empty" do
setup [:create_empty_order]
test "is not ready", %{order: order} do
refute Order.ready?(order)
end
end
Test separation — tags:
# test/test_helper.exs
ExUnit.start(trace: true, exclude: [:integration])
# In integration test files:
@moduletag :integration
mix test # unit only (integration excluded)
mix test --include integration # everything
mix test --only integration # integration only
Mix aliases for convenience in mix.exs:
defp aliases do
[
"test.unit": ["test --exclude integration"],
"test.integration": ["test --only integration"],
]
end
Mutation testing: No mature tool exists. Muzak and Exavier are both unmaintained. For similar confidence, use property-based tests with StreamData instead. Be honest about this limitation.
PHPUnit — testdox config:
<!-- phpunit.xml -->
<phpunit testdox="true" colors="true">
<testsuites>
<testsuite name="Unit">
<directory>tests/Unit</directory>
</testsuite>
<testsuite name="Functional">
<directory>tests/Functional</directory>
</testsuite>
</testsuites>
<source>
<include>
<directory>src</directory>
</include>
</source>
</phpunit>
Testdox groups by class and converts camelCase to sentences — one level deep (class > test). No nested describe in PHPUnit.
vendor/bin/phpunit --testsuite=Unit
vendor/bin/phpunit --testsuite=Functional
Pest PHP alternative: If the project uses Pest (v3+), it supports describe/it blocks and has built-in mutation testing:
./vendor/bin/pest # run tests
./vendor/bin/pest --mutate # mutation testing
./vendor/bin/pest --mutate --min=80 # fail if MSI below 80%
Pest v3's built-in mutation testing is a significant advantage over managing Infection separately.
Infection (if not using Pest):
// infection.json5
{
"source": {
"directories": ["src"],
"excludes": ["Config", "Migrations"]
},
"timeout": 10,
"threads": "max",
"logs": {
"text": "infection.log",
"html": "infection.html",
"summary": "summary.log"
},
"minMsi": 50,
"minCoveredMsi": 80,
"testFramework": "phpunit",
"testFrameworkOptions": "--testsuite=Unit"
}
vendor/bin/infection --threads=max --show-mutations
vendor/bin/infection --git-diff-lines # only changed lines — great for CI
Tree reporter — gradle-test-logger-plugin:
// build.gradle.kts
plugins {
id("com.adarshr.test-logger") version "4.0.0"
}
testlogger {
theme = com.adarshr.gradle.testlogger.theme.ThemeType.MOCHA
showExceptions = true
showStackTraces = true
showPassed = true
showSkipped = true
showFailed = true
slowThreshold = 2000
}
The MOCHA theme produces nested tree output from @Nested JUnit 5 test classes. Use MOCHA_PARALLEL when maxParallelForks > 1.
Separating test source sets — JVM Test Suite Plugin (built-in since Gradle 7.3):
testing {
suites {
val test by getting(JvmTestSuite::class) {
useJUnitJupiter()
// src/test/java — unit tests
}
val functionalTest by registering(JvmTestSuite::class) {
useJUnitJupiter()
// src/functionalTest/java — functional tests
dependencies {
implementation(project())
}
targets {
all {
testTask.configure { shouldRunAfter(test) }
}
}
}
}
}
tasks.named("check") {
dependsOn(testing.suites.named("functionalTest"))
}
Run: ./gradlew test (unit) vs ./gradlew functionalTest.
JUnit 5 @Nested for tree structure:
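import org.junit.jupiter.api.Nested;
import org.junit.jupiter.api.Test;

class OrderTest {
    // Each @Nested class becomes one level of the reported tree
    @Nested class WhenEmpty {
        @Test void isNotReady() { /* ... */ }
        @Nested class AfterAddingItem {
            @Test void isReady() { /* ... */ }
        }
    }
}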
Mutation testing — PIT (pitest):
plugins {
    id("info.solidsoft.pitest") version "1.19.0-rc.3"
}
pitest {
    pitestVersion.set("1.19.1")
    junit5PluginVersion.set("1.2.3")
    targetClasses.set(setOf("com.example.*"))
    targetTests.set(setOf("com.example.*Test"))
    threads.set(4)
    outputFormats.set(setOf("HTML", "XML"))
    timestampedReports.set(false)
    mutationThreshold.set(50)
}
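Run: ./gradlew pitest. Incremental analysis reuses results between runs, but only when pitest's history files are configured (historyInputLocation and historyOutputLocation); without them each run starts from scratch.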
Kotest note (Kotlin): Kotest has expressive spec DSLs (DescribeSpec, BehaviorSpec) but when run via Gradle's JUnit Platform runner, output is flat paths, not indented tree. JUnit 5 @Nested + gradle-test-logger-plugin gives better CLI tree output.
Tree reporter — maven-surefire-junit5-tree-reporter:
<!-- pom.xml -->
<plugin>
<artifactId>maven-surefire-plugin</artifactId>
<version>3.5.3</version> <!-- MUST be <= 3.5.3; breaks on 3.5.4+ -->
<dependencies>
<dependency>
<groupId>me.fabriciorby</groupId>
<artifactId>maven-surefire-junit5-tree-reporter</artifactId>
<version>1.5.1</version>
</dependency>
</dependencies>
<configuration>
<reportFormat>plain</reportFormat>
<consoleOutputReporter>
<disable>true</disable>
</consoleOutputReporter>
<statelessTestsetInfoReporter
implementation="org.apache.maven.plugin.surefire.extensions.junit5.JUnit5StatelessTestsetInfoTreeReporter">
<theme>UNICODE</theme>
</statelessTestsetInfoReporter>
</configuration>
</plugin>
Critical: Pin surefire to 3.5.3 — the tree reporter v1.5.1 is incompatible with surefire 3.5.4+.
Use maven-failsafe-plugin (same config pattern) for functional/integration tests (*IT.java).
PIT for Maven:
<plugin>
<groupId>org.pitest</groupId>
<artifactId>pitest-maven</artifactId>
<version>1.19.1</version>
<dependencies>
<dependency>
<groupId>org.pitest</groupId>
<artifactId>pitest-junit5-plugin</artifactId>
<version>1.2.3</version>
</dependency>
</dependencies>
<configuration>
<targetClasses><param>com.example.*</param></targetClasses>
<targetTests><param>com.example.*Test</param></targetTests>
<threads>4</threads>
<mutationThreshold>50</mutationThreshold>
<timestampedReports>false</timestampedReports>
</configuration>
</plugin>
Run: mvn org.pitest:pitest-maven:mutationCoverage
Incremental (only changed code): mvn org.pitest:pitest-maven:scmMutationCoverage
Output: dotnet test output is flat in ALL verbosity modes — it lists Namespace.Class.Method PASSED one per line. There is no nested indentation in the CLI. True tree output only exists in Visual Studio/Rider GUIs.
dotnet test --logger "console;verbosity=detailed" # most verbose, still flat
dotnet test --logger "trx" # structured output for CI
Test separation — separate .csproj projects:
tests/
  MyApp.UnitTests/MyApp.UnitTests.csproj
  MyApp.FunctionalTests/MyApp.FunctionalTests.csproj
dotnet test tests/MyApp.UnitTests/
dotnet test tests/MyApp.FunctionalTests/
Mutation testing — Stryker.NET:
dotnet tool install -g dotnet-stryker
// stryker-config.json
{
  "stryker-config": {
    "solution": "MyApp.sln",
    "test-projects": ["tests/MyApp.UnitTests/MyApp.UnitTests.csproj"],
    "mutate": ["**/*.cs", "!**/obj/**", "!**/bin/**", "!**/Migrations/**"],
    "reporters": ["html", "progress", "cleartext"],
    "thresholds": { "high": 80, "low": 60, "break": 0 },
    "concurrency": 4,
    "coverage-analysis": "perTest",
    "since": {
      "enabled": true,
      "target": "main"
    }
  }
}
dotnet stryker # full run
dotnet stryker --since:main # only mutate changes since main branch
The since feature is very useful for CI: it only mutates code changed since the target branch. To see mutations grouped by file as a tree in the console, use the cleartexttree reporter in place of cleartext.
Be honest: .NET CLI test output is flat. The value here is in the test structure (separate projects, clear naming) and mutation testing, not in tree-shaped terminal output.
Output: Bats output is flat only. There are no describe/context blocks.
bats --pretty test/ # coloured pass/fail
bats --formatter tap test/ # TAP format for CI
Simulate tree structure through naming conventions:
@test "UserRegistration: when valid details: creates account" { ... }
@test "UserRegistration: when duplicate email: rejects" { ... }
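Because the hierarchy lives only in the names, a small post-processor can recover a tree view from the flat list. The following is a minimal sketch, assuming the names use ": " as the separator as in the examples above; render_tree is a hypothetical helper, not part of Bats.

```python
def render_tree(names: list[str], sep: str = ": ") -> str:
    """Render flat 'A: b: c' test names as an indented tree."""
    lines: list[str] = []
    previous: list[str] = []
    for name in names:
        parts = name.split(sep)
        # Count the leading segments shared with the previous name,
        # never counting the leaf so every test is always printed.
        shared = 0
        while (shared < len(parts) - 1 and shared < len(previous) - 1
               and parts[shared] == previous[shared]):
            shared += 1
        for depth in range(shared, len(parts)):
            lines.append("  " * depth + parts[depth])
        previous = parts
    return "\n".join(lines)


print(render_tree([
    "UserRegistration: when valid details: creates account",
    "UserRegistration: when duplicate email: rejects",
]))
```

The sketch only merges a prefix with the immediately preceding name, so it relies on related tests being declared next to each other in the file.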
No mutation testing tool for Bash. Be honest about this.
Output: swift test --verbose is flat.
No mature mutation testing tool. Be honest about this.
Docker IS needed when Adapter (driven) or System tests must exercise the software against real external processes:
ffmpeg, imagemagick, wkhtmltopdf)
Docker is NOT needed when:
Rule of thumb: if you need to docker run or brew install something before tests can pass, that dependency belongs in a Docker harness so the test suite is self-contained.
Every Docker harness follows the same lifecycle:
start dependencies → wait for readiness → run tests → tear down
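The lifecycle above can be sketched as a small wrapper. This is an illustration under stated assumptions, not a required implementation: run_with_harness and its injectable run parameter are hypothetical names, and the wrapper relies on docker compose up --wait to cover the readiness step for services that define healthchecks.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]


def _run(cmd: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_with_harness(compose_file: str, test_cmd: Sequence[str],
                     run: Runner = _run) -> int:
    """start dependencies -> wait for readiness -> run tests -> tear down."""
    compose = ["docker", "compose", "-f", compose_file]
    try:
        # --wait blocks until services with healthchecks report healthy
        if run([*compose, "up", "-d", "--wait"]) != 0:
            return 1
        return run(test_cmd)
    finally:
        # Tear down even when startup or the tests fail
        run([*compose, "down", "--volumes", "--remove-orphans"])
```

Passing the runner in as a parameter keeps the wrapper itself testable without Docker: a fake runner can record the commands and return canned exit codes.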
The harness lives alongside the real-infra test layers:
test/system/
  docker-compose.yml    # service definitions (shared with driven-adapter tests if needed)
  wait-for-ready.sh     # readiness checks (or use healthchecks in compose)
  *.system.test.*       # System test files
Or, for projects where docker-compose.yml belongs at root (e.g., the project already has one for dev):
docker-compose.test.yml   # test-specific overrides
test/functional/
  *.functional.test.*
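The first layout above lists a wait-for-ready.sh readiness check as an alternative to compose healthchecks. As a rough sketch of what such a check might do, the hypothetical wait_for_port helper below polls a TCP port with a bounded deadline instead of a fixed sleep. It assumes that an accepted TCP connection is a good enough readiness signal, which is not true for every service; a compose healthcheck such as pg_isready is the stronger option when available.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.2) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not accepting connections yet: give up at the deadline,
            # otherwise pause briefly and retry.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the Postgres mapping used in this document, a call would look like wait_for_port("localhost", 5433).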
services:
  db:
    image: postgres:17-alpine
    environment:
      POSTGRES_DB: test
      POSTGRES_USER: test
      POSTGRES_PASSWORD: test
    ports:
      - "5433:5432" # non-default port to avoid clashing with local Postgres
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U test"]
      interval: 2s
      timeout: 5s
      retries: 10
    tmpfs:
      - /var/lib/postgresql/data # RAM-backed storage: fast, disposable
Key decisions:
tmpfs: data lives in RAM, tests are faster, nothing persists between runs
healthcheck: compose knows when the service is actually ready, not just started
When the software itself IS the server being tested:
services:
  db:
    image: postgres:17-alpine
    # ... same as above ...
  app:
    build:
      context: ../.. # project root
      dockerfile: Dockerfile # or Dockerfile.test if different from prod
    environment:
      DATABASE_URL: postgres://test:test@db:5432/test
      PORT: "3000"
    ports:
      - "3001:3000" # non-default host port
    depends_on:
      db:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 2s
      timeout: 5s
      retries: 15
Tests run on the host and hit http://localhost:3001. The app container connects to db via the compose network (db:5432).
When to build the app in Docker vs run on host:
services:
  rabbitmq:
    image: rabbitmq:4-management-alpine
    ports:
      - "5673:5672" # AMQP
      - "15673:15672" # management UI (useful for debugging)
    healthcheck:
      test: ["CMD", "rabbitmq-diagnostics", "check_port_connectivity"]
      interval: 5s
      timeout: 10s
      retries: 10
services:
  redis:
    image: redis:7-alpine
    ports:
      - "6380:6379"
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 2s
      timeout: 5s
      retries: 10
services:
  db:
    image: postgres:17-alpine
    # ...
  redis:
    image: redis:7-alpine
    # ...
  app:
    build: ../..
    depends_on:
      db: { condition: service_healthy }
      redis: { condition: service_healthy }
    # ...
  worker:
    build: ../..
    command: ["node", "worker.js"]
    depends_on:
      db: { condition: service_healthy }
      redis: { condition: service_healthy }
    # ...
#!/usr/bin/env bash
# test/functional/run-docker.sh
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
COMPOSE_FILE="$SCRIPT_DIR/docker-compose.yml"
cleanup() {
docker compose -f "$COMPOSE_FILE" down --volumes --remove-orphans 2>/dev/null || true
}
trap cleanup EXIT
# Start services and wait for health
docker compose -f "$COMPOSE_FILE" up -d --wait
# Export connection details for tests
export DATABASE_URL="postgres://test:test@localhost:5433/test"
export REDIS_URL="redis://localhost:6380"
# Run migrations if needed
# npm run db:migrate (or equivalent)
# Run functional tests
npm run test:functional
{
"test:functional": "vitest run --project functional",
"test:functional:docker": "bash test/functional/run-docker.sh",
"test:functional:ci": "bash test/functional/run-docker.sh"
}
.PHONY: test-functional
test-functional:
	docker compose -f test/functional/docker-compose.yml up -d --wait
	DATABASE_URL=postgres://test:test@localhost:5433/test \
		pytest tests/functional/ || (docker compose -f test/functional/docker-compose.yml down -v; exit 1)
	docker compose -f test/functional/docker-compose.yml down -v
The tests themselves should not know about Docker — they connect to services via environment variables or config, same as they would in production.
Node.js/TypeScript example:
// test/functional/user-registration.functional.test.ts
import { describe, it, expect, beforeAll, afterAll } from 'vitest'
import { createApp } from '../../src/app'
describe('UserRegistration', () => {
  let app: ReturnType<typeof createApp>
  beforeAll(async () => {
    // Uses DATABASE_URL from environment (set by run-docker.sh)
    app = createApp()
    await app.db.migrate()
  })
  afterAll(async () => {
    await app.close()
  })
  describe('when a new user registers with valid details', () => {
    it('creates the user account', async () => {
      const res = await app.inject({
        method: 'POST',
        url: '/users',
        payload: { email: '[email protected]', password: 'secret123' },
      })
      expect(res.statusCode).toBe(201)
    })
  })
})
Python example:
# tests/functional/test_user_registration.py
import os
import pytest
import httpx
BASE_URL = os.environ.get("APP_URL", "http://localhost:3001")
def describe_user_registration():
    def describe_when_valid_details():
        def it_creates_account():
            resp = httpx.post(f"{BASE_URL}/users", json={
                "email": "[email protected]",
                "password": "secret123",
            })
            assert resp.status_code == 201
Go example:
// test/functional/user_test.go
//go:build integration

package functional

import (
	"net/http"
	"os"
	"testing"
)

func TestUserRegistration_ValidDetails_CreatesAccount(t *testing.T) {
	baseURL := os.Getenv("APP_URL")
	if baseURL == "" {
		baseURL = "http://localhost:3001"
	}
	// ...
}
localhost:<port>
beforeAll/setup
# Bats example for a CLI that talks to a database
@test "import command loads CSV into database" {
  run ./mycli import --file fixtures/data.csv --db "$DATABASE_URL"
  [ "$status" -eq 0 ]
  # Verify data landed in the database
  count=$(psql "$DATABASE_URL" -t -c "SELECT count(*) FROM imports")
  [ "$(echo "$count" | tr -d ' ')" -eq 42 ]
}
Usually no Docker needed. Functional tests:
Same as Web API, but also consider:
tmpfs for databases in test compose files. Without it, data survives docker compose down if volumes aren't explicitly removed, causing flaky tests.
healthcheck + depends_on: condition: service_healthy. Never use sleep to wait for services; it's fragile and slow.
services: or a DinD sidecar.
trap cleanup EXIT in shell wrappers so services are torn down even when tests fail. Without this, orphaned containers accumulate.
postgres:17-alpine, not postgres:latest) to avoid surprise breakage when upstream releases a new major version.
context: ../..) must reach the project root. Use .dockerignore to keep the context small.
platform: linux/amd64 to the service if you hit exec format error. This is slower (Rosetta emulation) but works.
beforeAll, use transactions that rollback, or recreate the database. Never depend on state from a previous test run.
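Of the isolation options mentioned above, the rolled-back transaction is the cheapest to illustrate. The sketch below uses an in-memory SQLite database purely as a stand-in for the real test database, and rolled_back is a hypothetical helper; with Postgres the same shape would apply through whatever driver the project uses.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def rolled_back(conn: sqlite3.Connection):
    """Yield the connection, then undo whatever the test wrote."""
    try:
        yield conn
    finally:
        conn.rollback()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT UNIQUE)")
conn.commit()  # schema is committed once, outside any test

with rolled_back(conn) as db:
    # The INSERT implicitly opens a transaction that is never committed
    db.execute("INSERT INTO users VALUES ('a@example.test')")
    inside = db.execute("SELECT count(*) FROM users").fetchone()[0]

after = conn.execute("SELECT count(*) FROM users").fetchone()[0]
print(inside, after)
```

The row is visible inside the block and gone afterwards, so the next test starts from the same empty table regardless of what ran before it.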