From claude-commands
Provides centralized testing utilities, debug protocols, and CI/local parity guidelines. Includes shared lib modules for evidence capture, MCP client, campaign utilities, and browser testing guidance for Playwright vs chrome-superpower.
Slash command: /claude-commands:testing-infrastructure
**Purpose**: Centralized testing utilities, debug protocols, and CI/local parity guidelines.
Always use testing_mcp/lib/ utilities - NEVER reimplement test infrastructure.
| Module | Functions |
|---|---|
| lib/evidence_utils.py | get_evidence_dir(), capture_provenance(), save_evidence(), write_with_checksum(), create_evidence_bundle(), save_request_responses() |
| lib/mcp_client.py | MCPClient(base_url, timeout), client.tools_call(tool_name, args) |
| lib/campaign_utils.py | create_campaign(), process_action(), get_campaign_state(), ensure_game_state_seed() |
| lib/server_utils.py | start_local_mcp_server(), pick_free_port(), DEFAULT_EVIDENCE_ENV |
| lib/model_utils.py | settings_for_model(), update_user_settings() |
| lib/narrative_validation.py | validate_narrative_quality(), extract_dice_notation() |
# Import from lib modules
from testing_mcp.lib.evidence_utils import get_evidence_dir, capture_provenance
from testing_mcp.lib.mcp_client import MCPClient
from testing_mcp.lib.campaign_utils import create_campaign, process_action
# NEVER reimplement these functions
Anti-pattern: writing custom capture_provenance(), get_evidence_dir(), save_evidence(), or any other function that duplicates testing_mcp/lib/ functionality.
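One way to catch accidental reimplementations is a small lint pass over test files. The sketch below is an assumption about how such a check could look, not part of testing_mcp/lib/: the function name `find_reimplementations` and the `RESERVED_NAMES` set are hypothetical, and the set only lists the evidence helper names from the table above.

```python
import ast

# Helper names copied from the lib/evidence_utils.py row of the module table.
# Hypothetical constant: extend it to cover other lib modules as needed.
RESERVED_NAMES = {
    "get_evidence_dir",
    "capture_provenance",
    "save_evidence",
    "write_with_checksum",
    "create_evidence_bundle",
    "save_request_responses",
}


def find_reimplementations(source: str) -> list[str]:
    """Return reserved helper names that `source` defines as functions."""
    tree = ast.parse(source)
    return sorted(
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and node.name in RESERVED_NAMES
    )


# Importing a helper is fine; defining one with the same name is flagged.
good = "from testing_mcp.lib.evidence_utils import get_evidence_dir\n"
bad = "def capture_provenance():\n    return {}\n"

print(find_reimplementations(good))  # []
print(find_reimplementations(bad))   # ['capture_provenance']
```

The check only looks at function definitions, so it will not flag code that duplicates the behaviour under a different name; it is a guard against the most common copy-paste case.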
MANDATORY GUIDANCE: Choose the right tool for the task.
chrome-superpower
Use for: Exploratory manual browsing and interactive testing
DON'T use for: Deterministic automated tests (async operations don't complete reliably)
Playwright
Use for: Deterministic browser tests with validation
Pattern:
from playwright.sync_api import sync_playwright
playwright = sync_playwright().start()
browser = playwright.chromium.launch(headless=True)
context = browser.new_context()
page = context.new_page()
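# Navigate and interact (url is the page under test, supplied by the caller)
page.goto(url)
page.fill("#input", "text")
page.click("button[type='submit']")
# Wait for results to render, then validate them
page.wait_for_selector(".story-entry")
elements = page.query_selector_all(".story-entry")
assert len(elements) > 0, "Expected story content"
# Release browser resources
browser.close()
playwright.stop()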
Why this matters: chrome-superpower returns immediately from async JavaScript operations (shows [object Promise]), making it unsuitable for tests that need to validate streaming responses. Playwright properly waits for events and async operations.
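The failure mode described above can be illustrated with a Python analogy. This is not chrome-superpower's or Playwright's code; `stream_story` is a hypothetical stand-in for a streaming response. It only shows the difference between inspecting a pending handle and waiting for the result.

```python
import asyncio


async def stream_story() -> list[str]:
    """Hypothetical stand-in for a response that takes time to arrive."""
    await asyncio.sleep(0.01)
    return ["story entry 1", "story entry 2"]


def check_without_waiting() -> str:
    # Calling the coroutine function does not run it; we only get a handle,
    # much like seeing "[object Promise]" instead of the resolved value.
    pending = stream_story()
    text = str(pending)
    pending.close()  # avoid a "never awaited" warning
    return text


async def check_with_waiting() -> list[str]:
    # Awaiting runs the operation to completion before validating.
    return await stream_story()


unwaited = check_without_waiting()
entries = asyncio.run(check_with_waiting())

print(unwaited.startswith("<coroutine object"))  # True
print(len(entries))  # 2
```

A test that asserts on `unwaited` can never see story content, whereas one that asserts on `entries` validates the actual result.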
Mock external dependencies to ensure tests pass in both CI and local environments:
from unittest.mock import MagicMock, patch

with patch('shutil.which', return_value='/usr/bin/command'):
    with patch('subprocess.run') as mock_run:
        mock_run.return_value = MagicMock(returncode=0)
        # test code here
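A fuller version of the mocking pattern above might look like the following sketch. The function `tool_version` is a hypothetical piece of code under test, invented here for illustration; only the two `patch` targets come from the original snippet.

```python
import shutil
import subprocess
import unittest
from unittest.mock import MagicMock, patch


def tool_version(tool: str) -> str:
    """Hypothetical code under test: locate a CLI tool and ask its version."""
    path = shutil.which(tool)
    if path is None:
        raise FileNotFoundError(tool)
    completed = subprocess.run([path, "--version"], capture_output=True, text=True)
    return completed.stdout.strip()


class ToolVersionTest(unittest.TestCase):
    def test_version_without_real_binary(self):
        # Neither the binary nor a subprocess is needed, so the test behaves
        # the same in CI and on a developer machine.
        with patch("shutil.which", return_value="/usr/bin/command"):
            with patch("subprocess.run") as mock_run:
                mock_run.return_value = MagicMock(returncode=0, stdout="1.2.3\n")
                self.assertEqual(tool_version("command"), "1.2.3")
                mock_run.assert_called_once_with(
                    ["/usr/bin/command", "--version"],
                    capture_output=True,
                    text=True,
                )


# Run the case explicitly so the outcome is available as a value.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ToolVersionTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
print(outcome.wasSuccessful())  # True
```

Patching `shutil.which` and `subprocess.run` by their module paths works here because `tool_version` looks both up through the module at call time.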
Rules:
- Mock shutil.which(), subprocess.run(), and file ops
- Files under $PROJECT_ROOT/tests/* may use direct logging
# CORRECT - Debug info in assertion
debug_info = f"function_result={result}, context={context}"
self.assertTrue(result, f"FAIL DEBUG: {debug_info}")
# WRONG - Print statements (lost in CI)
print(f"Debug: {result}")
ZERO TOLERANCE: Fix ALL test failures in CI
LOCAL TESTING: Don't run full test suite locally - rely on GitHub CI
TESTING=true python $PROJECT_ROOT/tests/test_<specific>.py
testing_mcp suites
- Treat testing_mcp/*.py files as script entrypoints, not pytest-collected test modules.
- cd testing_mcp && ../vpython test_<name>.py --server http://127.0.0.1:8001
- cd testing_mcp && ../vpython test_<name>.py --start-local
- ./vpython testing_mcp/schema/test_schema_<name>.py
- Do not use pytest testing_mcp/... for script-style files that parse CLI args or expect script runtime setup.
testing_ui browser auth bypass
Follow $PROJECT_ROOT/testing_ui/README_TEST_MODE.md exactly:
- TESTING_AUTH_BYPASS=true.
- ?test_mode=true&test_user_id=<id>.
- window.testAuthBypass.enabled
- X-Test-Bypass-Auth: true
- X-Test-User-ID: <id>
MCP_SERVER_URL="https://..." MCP_TEST_MODE=real node scripts/mcp-smoke-tests.mjs
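The auth bypass query parameters and headers listed above can be assembled with small helpers so that tests do not hand-build URLs. This is a sketch under assumptions: the helper names, the example base URL, and the user ID are hypothetical, and README_TEST_MODE.md remains the authority on where each mechanism applies.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_test_mode(url: str, user_id: str) -> str:
    """Append test_mode and test_user_id to a URL, keeping existing params."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [("test_mode", "true"), ("test_user_id", user_id)]
    return urlunsplit(parts._replace(query=urlencode(query)))


def bypass_headers(user_id: str) -> dict[str, str]:
    """Build the two bypass headers named in the list above."""
    return {"X-Test-Bypass-Auth": "true", "X-Test-User-ID": user_id}


# Hypothetical base URL and user ID, for illustration only.
url = with_test_mode("http://localhost:8080/app?tab=story", "test-user-123")
headers = bypass_headers("test-user-123")

print(url)
# http://localhost:8080/app?tab=story&test_mode=true&test_user_id=test-user-123
print(headers["X-Test-User-ID"])  # test-user-123
```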
/tmp/repo/branch/smoke_tests/
- evidence-standards.md - Evidence capture standards
- end2end-testing.md - E2E test patterns
npx claudepluginhub jleechanorg/claude-commands --plugin claude-commands
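The script-entrypoint convention for testing_mcp files described earlier can be sketched as follows. The `--server` and `--start-local` flags come from the commands above; treating them as mutually exclusive and required, and the function names, are assumptions made for this illustration.

```python
import argparse


def parse_args(argv: list[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="script-style MCP test")
    # Assumption: exactly one of the two modes must be chosen.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--server", help="URL of an already running server")
    group.add_argument("--start-local", action="store_true",
                       help="start a local server for this run")
    return parser.parse_args(argv)


def main(argv: list[str]) -> int:
    args = parse_args(argv)
    mode = "local" if args.start_local else f"remote:{args.server}"
    print(f"running against {mode}")
    return 0


# In a real script the entrypoint would be guarded like this:
# if __name__ == "__main__":
#     raise SystemExit(main(sys.argv[1:]))

exit_code = main(["--server", "http://127.0.0.1:8001"])
```

A file shaped like this expects its own command-line arguments and setup at launch, which pytest collection does not provide; that is the reason for running it through vpython as a script.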