From warden
Use when writing a test that verifies a mock was called instead of the effect the call produced; reaching for a spy/stub/inspector to peek at internal state from a test; reviewing a test that breaks every time the production code is refactored; deciding what to assert about a function whose contract is "given X, the world looks like Y".
Slash command: /warden:et-0004-asserting-observable-behavior
<!-- generated from tenets/ET-0004-asserting-observable-behavior.md by `uv run poe build` — do not edit by hand. -->
Type: best-practice · Tier: 1
A test asserts the observable behavior of the system under test: the return value, the persisted state, the message emitted, the HTTP/database/file effect produced, or the exception raised. It does not assert how that behavior was produced (which private method was called, which collaborator was invoked, in what order, with what arguments) unless the call itself is the behavior, e.g. the unit exists to send that exact message. When the only way to express the test is to reach for a mock and verify a method call, the unit under test has no observable contract; fix the design, not the test.
Implementation-coupled tests fail every time someone refactors the production code, even when the behavior is unchanged. The team learns that a green test suite means "nothing was refactored", not "nothing broke" — so refactoring stops, the code rots, and the suite becomes a maintenance tax rather than a safety net. Worse, mock-verification tests routinely pass while the system is broken: the mock returns whatever the test set up, regardless of whether the real collaborator would. The test asserts that two pieces of code agree on a fiction. Behavior tests survive refactors because the contract — not the mechanism — is what matters to the caller.
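The "two pieces of code agree on a fiction" failure mode can be sketched in Python. Everything in this sketch is hypothetical: `FakeGateway`, `Checkout`, and the rule that a gateway rejects non-positive amounts are assumptions made for illustration, not part of any real codebase.

```python
from unittest.mock import MagicMock


class FakeGateway:
    """Hypothetical in-memory collaborator that enforces the assumed rule."""

    def __init__(self):
        self.charges = []

    def charge(self, amount_cents):
        if amount_cents <= 0:
            raise ValueError("amount must be positive")
        self.charges.append(amount_cents)


class Checkout:
    """Hypothetical unit under test, deliberately containing a sign bug."""

    def __init__(self, gateway):
        self._gateway = gateway

    def pay(self, amount_cents):
        # Bug: the amount is negated before it is charged.
        self._gateway.charge(-amount_cents)


def mock_test_passes_despite_bug():
    gateway = MagicMock()
    Checkout(gateway).pay(500)
    # The mock accepts any argument, so the call check passes.
    gateway.charge.assert_called_once()
    return True


def behavior_test_catches_bug():
    gateway = FakeGateway()
    try:
        Checkout(gateway).pay(500)
    except ValueError:
        return True  # the fake rejected the bad charge
    # No exception: the bug is caught only if the recorded charge is wrong.
    return gateway.charges != [500]
```

The mock-based check stays green because the mock never objects to a negative amount; the fake, which behaves like the assumed collaborator, surfaces the defect.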
// BAD: verifies that `_repo.Save` was called, not that the order was actually saved.
[Fact]
public void Submit_PersistsOrder()
{
    var repoMock = new Mock<IOrderRepo>();
    var sut = new OrderService(repoMock.Object);
    sut.Submit(new Order { Id = 42 });
    repoMock.Verify(r => r.Save(It.Is<Order>(o => o.Id == 42)), Times.Once);
    // refactor: introduce a UnitOfWork that batches saves → test breaks even though the order is still saved.
}
from unittest.mock import MagicMock

# BAD: peeks at private state, asserts the spy, ignores the actual effect.
def test_enqueue_increments_pending_count(monkeypatch):
    queue = JobQueue()
    spy = MagicMock(wraps=queue._pending)  # reaching into _pending
    monkeypatch.setattr(queue, "_pending", spy)
    queue.enqueue(Job("a"))
    spy.append.assert_called_once()  # asserts append was called, not that the job is enqueued
// BAD: every test that uses `sendEmail` is coupled to the mailer's internal API.
test("welcome email is sent", () => {
  const mailer = { send: vi.fn() };
  const svc = new SignupService(mailer);
  svc.signup({ email: "[email protected]" });
  expect(mailer.send).toHaveBeenCalledWith(
    expect.objectContaining({ to: "[email protected]", template: "welcome-v3" }),
  );
  // refactor: switch from `template` to `subject`+`body` → test breaks; the user still got the email.
});
// GOOD: assert the persisted result through the observable surface (the repo's read side).
[Fact]
public void Submit_PersistsOrder()
{
    var repo = new InMemoryOrderRepo(); // real fake, observable
    var sut = new OrderService(repo);
    sut.Submit(new Order { Id = 42 });
    Assert.Equal(42, repo.GetById(42).Id); // the order is actually retrievable
}
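Why the read-side assertion survives the batching refactor can be sketched in Python. This is a loose analog of the C# example under stated assumptions: `InMemoryOrderRepo`, both service classes, the `save_all` method, and the dict-shaped order are all hypothetical names invented for illustration.

```python
class InMemoryOrderRepo:
    """Hypothetical fake: stores orders in a dict, readable via get_by_id."""

    def __init__(self):
        self._orders = {}

    def save(self, order):
        self._orders[order["id"]] = order

    def save_all(self, orders):
        for order in orders:
            self.save(order)

    def get_by_id(self, order_id):
        return self._orders[order_id]


class DirectOrderService:
    """Version 1: saves each order immediately."""

    def __init__(self, repo):
        self._repo = repo

    def submit(self, order):
        self._repo.save(order)


class BatchingOrderService:
    """Version 2, after a refactor: writes through a batch call instead."""

    def __init__(self, repo):
        self._repo = repo

    def submit(self, order):
        self._repo.save_all([order])


def order_is_retrievable(service_cls):
    # The same behavioral check, applied to whichever implementation is passed in.
    repo = InMemoryOrderRepo()
    service_cls(repo).submit({"id": 42})
    return repo.get_by_id(42)["id"] == 42
```

Both versions satisfy `order_is_retrievable`. A mock-based check that `save` was called exactly once on the injected collaborator would fail for the second version, which calls `save_all` instead, even though the order is still stored.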
# GOOD: assert the public observable — what a caller can see.
def test_enqueue_makes_job_visible():
    queue = JobQueue()
    queue.enqueue(Job("a"))
    assert queue.peek().name == "a"  # observable through the public API
    assert queue.size() == 1
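For the test above to be writable, the queue needs a public read side. The source does not show `JobQueue` or `Job`, so the following is only one assumed shape that would satisfy the test; the `deque` storage and the `frozen` dataclass are illustrative choices.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str


class JobQueue:
    def __init__(self):
        self._pending = deque()  # private storage; tests never touch it

    def enqueue(self, job):
        self._pending.append(job)

    def peek(self):
        """Return the next job without removing it."""
        return self._pending[0]

    def size(self):
        return len(self._pending)
```

Because the test only goes through `enqueue`, `peek`, and `size`, the storage could later change from a `deque` to something else without the test noticing.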
// GOOD: capture what the world saw — a real fake records sent messages, the test asserts on them.
test("welcome email is sent", () => {
  const mailer = new FakeMailer(); // records sends, exposes a `sent` accessor
  const svc = new SignupService(mailer);
  svc.signup({ email: "[email protected]" });
  expect(mailer.sent).toHaveLength(1);
  expect(mailer.sent[0].to).toBe("[email protected]");
  // refactor: swap template engine, change call shape → test still passes if the email goes out.
});
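A recording fake of the kind used above is small. The following Python sketch is an assumed analog, not the `FakeMailer` from the TypeScript example (which the source does not show); the message fields and the `SignupService` body are hypothetical.

```python
class FakeMailer:
    """Hypothetical recording fake: keeps every message it is asked to send."""

    def __init__(self):
        self._sent = []

    def send(self, message):
        # Copy the message so later mutation by the caller cannot alter the record.
        self._sent.append(dict(message))

    @property
    def sent(self):
        # Return a copy so tests cannot modify the fake's internal list.
        return list(self._sent)


class SignupService:
    """Hypothetical unit under test; the message shape is an assumption."""

    def __init__(self, mailer):
        self._mailer = mailer

    def signup(self, email):
        self._mailer.send(
            {"to": email, "subject": "Welcome", "body": "Thanks for signing up."}
        )
```

A test then asserts on `mailer.sent`, for example that it holds one message addressed to the new user, and says nothing about how `send` was invoked.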
save(x), getById(x.id) returns x").
That contract is what the test should assert against.
npx claudepluginhub c-hoeller/agent-plugins --plugin warden