From ironclaude
Detect and eliminate testing theatre - tests that can't prevent regressions
Slash command: /ironclaude:testing-theatre-detection
Enforce zero tolerance for testing theatre across all test types.
Whenever soliciting user input — choices, confirmations, or selections — ALWAYS use the AskUserQuestion tool. NEVER ask via prose. Follow the format in .claude/rules/ask-user-question-format.md: Re-ground context, Predict, Options.
Announce professional mode status:
Using testing-theatre-detection skill. Professional mode is ACTIVE - architect mode enforced (no code changes).
If invoked with exact file path:
Skip scope determination. Analyze the provided file directly.
Example: /testing-theatre-detection src/auth/login.test.js
If invoked from plan execution (subagent with task context): Identify tests related to the specific task/feature:
If manual invocation (no file path provided): Use AskUserQuestion tool:
If auto-invoked (from code-review): Automatically use current changes (git diff --name-only)
Run appropriate command based on scope:
- import.*{ComponentName} or import.*'path/to/file'
- git diff --name-only | grep -E '\.(test|spec)\.(js|ts|tsx|java|py)$'
- **/*.test.js, **/*Test.java, **/test_*.py
For each test file, determine framework:
Jest Detection:
- Files: .test.js, .spec.js, .test.ts, .spec.ts, .test.tsx, .spec.tsx
- Imports: import { test, expect } from, import { describe, it } from
- Framework: jest
JUnit Detection:
- Files: *Test.java, *Tests.java
- Markers: import org.junit, @Test, @Disabled
- Framework: junit
pytest Detection:
- Files: test_*.py, *_test.py
- Markers: import pytest, @pytest.mark
- Framework: pytest
React Testing Library Detection:
- Files: .test.tsx, .spec.tsx
- Imports: import { render } from '@testing-library/react'
- Framework: jest + react-testing-library
If framework cannot be determined:
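The detection rules above can be sketched as a small lookup. This is a minimal illustration, not the skill's actual logic: `detectFramework` is a hypothetical helper, it assumes either a file-name match or a content marker is enough, and it returns `null` for the undetermined case without saying how that case should be resolved.

```javascript
// Hypothetical helper: maps a test file's path and source text to a framework label.
// Assumption: a file-name match OR a content marker is sufficient evidence.
function detectFramework(filePath, source) {
  const name = filePath.split(/[\\/]/).pop();

  // React Testing Library is checked first because its files also look like Jest files.
  if (source.includes("@testing-library/react")) {
    return "jest + react-testing-library";
  }
  if (/\.(test|spec)\.(js|ts|tsx)$/.test(name)) {
    return "jest";
  }
  if (/Tests?\.java$/.test(name) || /import org\.junit/.test(source)) {
    return "junit";
  }
  if (/^test_.*\.py$|_test\.py$/.test(name) || /import pytest|@pytest\.mark/.test(source)) {
    return "pytest";
  }
  return null; // framework could not be determined
}
```

For example, `detectFramework("tests/test_login.py", "import pytest")` yields `"pytest"`.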
Jest patterns to detect:
- Patterns: it.skip(, test.skip(, xit(, xtest(, describe.skip(
- Regex: \.skip\(|\b(xit|xtest)\(
JUnit patterns to detect:
- Patterns: @Disabled, @Ignore annotations
- Regex: @(Disabled|Ignore)
pytest patterns to detect:
- Patterns: @pytest.mark.skip, @pytest.mark.xfail
- Regex: @pytest\.mark\.(skip|xfail)
For each match found:
Issue: Skipped Test (Critical)
Line {number}: {test name}
Problem: Test is disabled - known broken behavior being ignored
Risk: Production bug hiding behind disabled test
Fix: Remove skip/disable annotation and fix the failing test
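As a rough sketch of the skipped-test scan, the per-framework patterns can be applied line by line. `findSkippedTests` and `SKIP_PATTERNS` are hypothetical names, and the Jest pattern is written so that standalone `xit(` and `xtest(` calls match.

```javascript
// Hypothetical pattern table, one regex per framework label.
const SKIP_PATTERNS = {
  jest: /\.skip\(|\b(?:xit|xtest)\(/,
  junit: /@(?:Disabled|Ignore)/,
  pytest: /@pytest\.mark\.(?:skip|xfail)/,
};

// Returns one entry per source line that disables a test.
function findSkippedTests(source, framework) {
  const pattern = SKIP_PATTERNS[framework];
  if (!pattern) return [];
  const hits = [];
  source.split("\n").forEach((text, index) => {
    if (pattern.test(text)) {
      hits.push({ line: index + 1, text: text.trim() });
    }
  });
  return hits;
}
```

Each hit carries the 1-based line number needed for the "Line {number}" field of the issue.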
Detection approach: For each test function, count assertion statements.
Jest assertions to count:
- expect( statements
- assert( statements
- Regex: expect\(|assert\(
JUnit assertions to count:
- assert, assertEquals, assertTrue, assertFalse, etc.
- Regex: assert[A-Z]\w+\(
pytest assertions to count:
- assert statements
- Regex: ^\s+assert\s
Tautological assertion patterns:
- Jest: expect(x).toBe(x), expect(true).toBe(true)
- JUnit: assertTrue(true), assertEquals(x, x)
- pytest: assert True, assert x == x
For each test with zero assertions:
Issue: No Assertions (Critical)
Line {number}: test "{name}"
Problem: Test has no expect() calls - always passes regardless of implementation
Risk: Cannot detect regressions in {component} behavior
Fix:
test('{name}', () => {
  // Arrange
  const result = functionUnderTest(input);
  // Assert
  expect(result).toBe(expectedValue);
  expect(result.property).toEqual(expectedProperty);
});
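Counting assertions and spotting the simplest tautologies could look roughly like the following. The function names are hypothetical, the input is assumed to be the text of a single test body, and the tautology check only covers the Jest forms `expect(x).toBe(x)` and `expect(x).toEqual(x)` with a parenthesis-free argument.

```javascript
// Assertion regexes per framework, using the patterns listed for each.
const ASSERTION_PATTERNS = {
  jest: /expect\(|assert\(/g,
  junit: /assert[A-Z]\w+\(/g,
  pytest: /^\s+assert\s/gm,
};

// Counts assertion statements in the text of one test body.
function countAssertions(testBody, framework) {
  const pattern = ASSERTION_PATTERNS[framework];
  if (!pattern) return 0;
  return (testBody.match(pattern) || []).length;
}

// Detects expect(x).toBe(x) / expect(x).toEqual(x), where both sides are the same text.
const JEST_TAUTOLOGY = /expect\(\s*([^()]+?)\s*\)\.(?:toBe|toEqual)\(\s*\1\s*\)/;

function isTautological(line) {
  return JEST_TAUTOLOGY.test(line);
}
```

A test whose count is zero, or whose only assertions are tautological, would be reported as "No Assertions".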
Detection approach: Count mock statements vs real code invocations in each test.
Mock patterns to count:
Jest:
- Patterns: jest.mock(, jest.spyOn(, mockImplementation, mockReturnValue
- Regex: jest\.(mock|spyOn)|mock(Implementation|ReturnValue)
JUnit:
- Patterns: @Mock, Mockito.mock(, when(, verify(
- Regex: @Mock|Mockito\.(mock|when|verify)
pytest:
- Patterns: @patch, Mock(), MagicMock()
- Regex: @patch|Mock\(\)|MagicMock\(\)
Calculate ratio:
For each over-mocked test:
Issue: Over-Mocking (Critical)
Line {number}: test "{name}"
Problem: {percentage}% of test is mocking - not testing real behavior
Risk: Tests pass but production code may be broken
Fix: Reduce mocking. Test real integrations when possible:
- Mock external dependencies (APIs, databases) at boundaries
- Use real implementations for internal logic
- Integration tests should test actual integration
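One possible way to obtain the {percentage} figure is to measure the share of non-empty lines in a test body that contain a mock pattern. This is an assumption about how the ratio might be computed, not the skill's own formula; `mockPercentage` is a hypothetical name, and the cut-off above which a test counts as over-mocked is deliberately left to the caller.

```javascript
// Mock regexes per framework, as listed under "Mock patterns to count".
const MOCK_PATTERNS = {
  jest: /jest\.(mock|spyOn)|mock(Implementation|ReturnValue)/,
  junit: /@Mock|Mockito\.(mock|when|verify)/,
  pytest: /@patch|Mock\(\)|MagicMock\(\)/,
};

// Assumed metric: percentage of non-empty lines that contain a mock statement.
function mockPercentage(testBody, framework) {
  const pattern = MOCK_PATTERNS[framework];
  const lines = testBody
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  if (!pattern || lines.length === 0) return 0;
  const mockLines = lines.filter((line) => pattern.test(line)).length;
  return Math.round((mockLines / lines.length) * 100);
}
```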
Detection approach: Find tests that ONLY use snapshot assertions without behavioral assertions.
Snapshot patterns:
- Patterns: toMatchSnapshot(), toMatchInlineSnapshot()
- Regex: toMatchSnapshot|toMatchInlineSnapshot
Check logic:
For each snapshot-only test:
Issue: Snapshot Only (Critical)
Line {number}: test "{name}"
Problem: Only uses toMatchSnapshot() with no behavior validation
Risk: Doesn't verify {component} actually works
Fix: Add assertions for interactive behavior:
- Test click handlers are called with correct arguments
- Test state changes correctly
- Test props are applied correctly
- Test accessibility attributes
- Then use snapshots as supplementary check
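A snapshot-only check might be sketched as below. It assumes a test is "snapshot only" when it has at least one snapshot matcher and every `expect(` call in the body ends in one; `isSnapshotOnly` is a hypothetical name.

```javascript
// Returns true when the test body asserts nothing except snapshots.
// Assumption: each snapshot matcher belongs to exactly one expect( call.
function isSnapshotOnly(testBody) {
  const expectCalls = (testBody.match(/expect\(/g) || []).length;
  const snapshotCalls = (testBody.match(/toMatchSnapshot|toMatchInlineSnapshot/g) || []).length;
  return snapshotCalls > 0 && snapshotCalls === expectCalls;
}
```

A body that adds even one behavioral assertion next to the snapshot is no longer flagged.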
Detection approach: Find try/catch blocks or conditional logic that can prevent test failures.
Patterns to detect:
Try/catch with no rethrow:
- Jest: try { ... } catch (e) { } or catch (e) { console.log }
- JUnit: catch (Exception e) { } with no throw/fail
- pytest: except: pass or except Exception:
Conditional assertions:
- if (condition) { expect(...) } - assertion might not run
Search patterns:
- catch\s*\([^)]+\)\s*\{\s*\}
For each error swallowing pattern:
Issue: Error Swallowing (Critical)
Line {number}: test "{name}"
Problem: Try/catch or conditional logic can prevent test failure
Risk: Test passes even when code throws errors
Fix: Either:
- Remove try/catch and let test fail on error
- If testing error handling, assert the error: expect(() => fn()).toThrow()
- Remove conditional logic around assertions
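The two JavaScript cases can be approximated with regexes over a test body. This sketch reuses the empty-catch search pattern given above; the conditional-assertion regex and the name `findErrorSwallowing` are illustrative assumptions and only catch an `expect(` inside the first brace level of an `if`.

```javascript
// Empty catch block, using the search pattern listed above.
const EMPTY_CATCH = /catch\s*\([^)]+\)\s*\{\s*\}/;
// Assumed pattern: an expect( call inside the braces directly following an if (...).
const CONDITIONAL_EXPECT = /if\s*\([^)]*\)\s*\{[^}]*expect\(/;

// Returns a list of short labels for each error-swallowing construct found.
function findErrorSwallowing(testBody) {
  const findings = [];
  if (EMPTY_CATCH.test(testBody)) findings.push("empty catch block");
  if (CONDITIONAL_EXPECT.test(testBody)) findings.push("conditional assertion");
  return findings;
}
```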
Find test command:
Check package.json (for JS/TS projects):
- Look for: scripts.test or scripts.test-ci
- Run: npm test or yarn test
Check Makefile (for any project):
- Look for: ^test:|^test-.*:
- Run: make test
Check build.gradle (for Java projects):
- Look for: task test
- Run: ./gradlew test
Direct framework invocation (fallback):
- Jest: npx jest --coverage
- JUnit: ./gradlew test or mvn test
- pytest: pytest --cov
If ambiguous, use AskUserQuestion tool:
Execute test command: Run with Bash tool:
Example:
npm test 2>&1
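For a JS/TS project, choosing the command from an already-parsed package.json might look like this. `chooseTestCommand` is a hypothetical helper; it covers only the package.json check and the Jest fallback, and it assumes npm rather than yarn.

```javascript
// Picks a test command from a parsed package.json object (JS/TS projects only).
function chooseTestCommand(packageJson) {
  const scripts = (packageJson && packageJson.scripts) || {};
  if (scripts.test) return "npm test";
  if (scripts["test-ci"]) return "npm run test-ci";
  return "npx jest --coverage"; // direct framework invocation as fallback
}
```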
Check 1: Exit Code
❌ Test Failures (Critical)
Problem: {count} tests failed
Risk: Cannot assess test quality when tests don't pass
Fix: Address test failures first:
{list of failed tests from output}
Check 2: Warning Messages
Scan output for warning patterns:
- WARN:, WARNING:, Warning:
- deprecated, deprecation
- (node:) with warning
For each warning:
⚠️ Test Warning (Critical)
Problem: Test output contains warning
Warning: {warning text}
Risk: Warnings indicate unreliable test behavior
Fix: Address the warning - update deprecated APIs, fix configuration
Check 3: Error Messages (even if tests pass)
Scan output for error patterns:
- Error:, ERROR:
- Exception:
- failed to
For each error in passing tests:
❌ Hidden Error (Critical)
Problem: Test output contains errors but tests still pass
Error: {error text}
Risk: Test is swallowing errors
Fix: Update test to fail on errors or fix the underlying issue
Check 4: Flaky Indicators
Scan for flaky patterns:
- timeout, timed out
- ETIMEDOUT, ECONNREFUSED
- UnhandledPromiseRejection
- race condition
For each flaky indicator:
🔀 Flaky Test Indicator (Critical)
Problem: Test output suggests flaky/unreliable behavior
Indicator: {text}
Risk: Test may pass/fail randomly
Fix:
- Add proper async/await handling
- Increase timeouts if necessary
- Fix race conditions
- Mock unstable dependencies
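Checks 2 to 4 all scan the captured test output for fixed strings, so they can share one pass. The sketch below is illustrative: `scanTestOutput` and `OUTPUT_CHECKS` are hypothetical names, matching is case-sensitive, and the "(node:) with warning" case is covered only indirectly through the `Warning:` pattern.

```javascript
// One entry per output check, using the patterns listed for each.
const OUTPUT_CHECKS = [
  { kind: "Test Warning", pattern: /WARN:|WARNING:|Warning:|deprecated|deprecation/ },
  { kind: "Hidden Error", pattern: /Error:|ERROR:|Exception:|failed to/ },
  {
    kind: "Flaky Test Indicator",
    pattern: /timeout|timed out|ETIMEDOUT|ECONNREFUSED|UnhandledPromiseRejection|race condition/,
  },
];

// Scans test output line by line; a line may trigger more than one check.
function scanTestOutput(output) {
  const findings = [];
  output.split("\n").forEach((text, index) => {
    for (const { kind, pattern } of OUTPUT_CHECKS) {
      if (pattern.test(text)) {
        findings.push({ kind, line: index + 1, text: text.trim() });
      }
    }
  });
  return findings;
}
```

The "Hidden Error" findings only matter when the run's exit code reported success, so a caller would combine this with Check 1.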
Find coverage reports:
Jest (JSON format):
- coverage/coverage-final.json
JUnit (JaCoCo XML format):
- build/reports/jacoco/test/jacocoTestReport.xml
pytest (JSON format via pytest-cov):
- .coverage or coverage.json
Coverage metrics to extract:
Correlation with static analysis:
⚠️ High Coverage, Low Assertions (Critical)
Problem: {coverage}% code coverage but only {count} assertions
Risk: Executing code without verifying behavior - false confidence
Fix: Add meaningful assertions that verify:
- Return values
- State changes
- Side effects
- Error conditions
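For the Jest case, a statement-coverage figure can be derived from the parsed coverage-final.json, where each file entry carries an `s` map of statement hit counts. The sketch below is a minimal illustration with hypothetical function names; the thresholds that define "high coverage" and "low assertions" are not stated here, so they are passed in by the caller.

```javascript
// Computes overall statement coverage (0-100) from a parsed coverage-final.json object.
function statementCoverage(coverageFinal) {
  let total = 0;
  let covered = 0;
  for (const file of Object.values(coverageFinal)) {
    for (const hits of Object.values(file.s || {})) {
      total += 1;
      if (hits > 0) covered += 1;
    }
  }
  return total === 0 ? 0 : Math.round((covered / total) * 100);
}

// Correlates coverage with the assertion count from static analysis.
// Both thresholds are caller-supplied assumptions.
function isHighCoverageLowAssertions(coveragePct, assertionCount, minCoverage, maxAssertions) {
  return coveragePct >= minCoverage && assertionCount <= maxAssertions;
}
```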
Report structure:
Testing Theatre Audit Report
=============================
Scope: {scope description} ({count} test files analyzed)
Status: {✅ CLEAN or ❌ {count} issues found (MUST FIX)}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📁 {file_path} ({issue_count} issues)
{for each issue in file:}
❌ {Issue Type} (Critical)
Line {number}: {test name}
Problem: {description}
Risk: {impact}
Fix: {guidance with code example if applicable}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Summary: {total_issues} critical issues across {file_count} files
All issues must be fixed before production readiness.
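The report structure above could be rendered from a flat list of issues by grouping on file path. This is a sketch under assumptions: `renderReport` and the issue shape `{ file, type, line, testName, problem, risk, fix }` are hypothetical, and the separator width is arbitrary.

```javascript
// Renders the audit report; issues are grouped by file in first-seen order.
function renderReport(scope, filesAnalyzed, issues) {
  const rule = "━".repeat(45);
  const byFile = new Map();
  for (const issue of issues) {
    if (!byFile.has(issue.file)) byFile.set(issue.file, []);
    byFile.get(issue.file).push(issue);
  }

  const status =
    issues.length === 0 ? "✅ CLEAN" : `❌ ${issues.length} issues found (MUST FIX)`;
  const out = [
    "Testing Theatre Audit Report",
    "=============================",
    `Scope: ${scope} (${filesAnalyzed} test files analyzed)`,
    `Status: ${status}`,
    rule,
  ];

  for (const [file, fileIssues] of byFile) {
    out.push(`📁 ${file} (${fileIssues.length} issues)`);
    for (const issue of fileIssues) {
      out.push(
        `❌ ${issue.type} (Critical)`,
        `Line ${issue.line}: ${issue.testName}`,
        `Problem: ${issue.problem}`,
        `Risk: ${issue.risk}`,
        `Fix: ${issue.fix}`
      );
    }
    out.push(rule);
  }

  if (issues.length > 0) {
    out.push(
      `Summary: ${issues.length} critical issues across ${byFile.size} files`,
      "All issues must be fixed before production readiness."
    );
  }
  return out.join("\n");
}
```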
Grouping logic:
Issue formatting:
npx claudepluginhub robertphyatt/ironclaude --plugin ironclaude