From numeric-mcp-toolkit
Builds a new, production-ready Cowork skill for an accountant who wants to automate a workflow — one that survives a real close, not a demo. Four phases (Ideate, Plan, Build, Review), each producing a document that becomes required input to the next. Plan fans out specialists (data, accounting controls, Numeric integration, architecture, pressure-test). Build assembles the new skill with performance discipline baked in. Review fans out a five-reviewer panel (controller, accounting process expert, auditor, adversarial, on-call) plus a synthesizer that renders one verdict before the skill ships. Trigger when the user says "build me a skill", "build an accounting skill", "automate this workflow", "automate my close process", "build me an automation", "compound automation", "ideate plan build review", "I want to automate [X]", "design an automation", "help me automate", or describes wanting to automate a repeatable accounting workflow. Use this skill even when the user does not name the framework explicitly.
How this skill is triggered — by the user, by Claude, or both
Slash command
/numeric-mcp-toolkit:accounting-skill-builder
The summary Claude sees in its skill listing, used to decide when to auto-load this skill
Walks the user through automating an accounting workflow using compound engineering — each phase produces a doc that becomes the required input to the next, so context compounds rather than resets. Four phases — Ideate, Plan, Build, Review. Stop after Review.
agents/accounting-process-expert.md
agents/accounting.md
agents/adversarial-reviewer.md
agents/architecture.md
agents/auditor.md
agents/controller-skeptic.md
agents/data-sources.md
agents/interviewer.md
agents/numeric-integration.md
agents/pressure-test.md
agents/production-on-call.md
agents/review-synthesizer.md
references/accounting-controls.md
references/build-patterns.md
references/numeric-mcp-principles.md
references/performance-patterns.md
references/skill-catalog.md
templates/brief.md
templates/plan.md
templates/review.md
The user is an accountant working in their own Cowork. They have access to the Numeric MCP Toolkit (existing skills covering close pulse, retro, flux, accruals, JE generation, rec workbooks, anomaly scans, audit evidence, AR/AP aging, etc.) plus their own connectors. The output of this skill is a production-ready automation tailored to their workflow — most often a refinement of an existing toolkit skill, sometimes a new script or scheduled task, sometimes a Cowork artifact.
Production-ready is the bar, not a working demo. The skill must hold up against a real close: correct accounting treatment, the right controls and approval gates, deterministic outputs, graceful failure, and performance that scales to real data volumes. The Phase 4 review panel exists to enforce that bar — nothing is declared done until a controller, process expert, auditor, adversarial reviewer, and on-call engineer have each signed off.
The four phases pin scope progressively: brief pins what to solve, plan pins how, build pins what's actually shipped, review pins what works in production.
Phase 1 (Ideate): load agents/interviewer.md and run the interview in the main thread. The interviewer covers the question flow with a thinking-partner stance: it pushes back where the framing is narrow and surfaces what the user didn't say. It stops as soon as the brief template can be filled and the user agrees the framing is right.
Pushback at this stage is conversational, in-the-moment. The deeper structural pushback (cross-cutting framing critique, adjacent-workflow surfacing, cadence and output-shape inversion) happens in Phase 2 via the pressure-test specialist.
Output: {run_id}/brief.md using templates/brief.md.
Gate to Phase 2: surface the brief, ask via AskUserQuestion whether to proceed, edit, or stop.
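The gate's three outcomes (proceed, edit, stop) can be read as a small state machine. The sketch below is only an illustration in Python: `Phase`, `GateAnswer`, and `next_phase` are hypothetical names, not part of the toolkit, and it assumes the same three answers apply at every gate. In the actual skill the question is asked through AskUserQuestion and the orchestrator follows the answer.

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    IDEATE = 1
    PLAN = 2
    BUILD = 3
    REVIEW = 4


class GateAnswer(Enum):
    PROCEED = "proceed"
    EDIT = "edit"
    STOP = "stop"


def next_phase(current: Phase, answer: GateAnswer) -> Optional[Phase]:
    """Return the phase to run next, or None when the run ends."""
    if answer is GateAnswer.STOP:
        return None
    if answer is GateAnswer.EDIT:
        # Stay in the current phase: revise its document, then ask again.
        return current
    if current is Phase.REVIEW:
        # The skill stops after Review; there is no fifth phase.
        return None
    return Phase(current.value + 1)
```

Keeping "edit" as a transition back to the same phase means the next phase always starts from a document the user has signed off on.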
Phase 2 (Plan) happens in two passes.
First pass — rough plan. Synthesize the brief into a rough plan in the main thread. No subagents yet. Produce a high-level outline of the workflow steps the automation will run:
1. [step]: [what the automation does]
2. [step]: [what the automation does]
...
Include a guess at delivery shape (refinement of existing skill, new skill, scheduled task, artifact) and which existing toolkit skill (if any) is the closest fit. This pass is fast — maybe a paragraph plus the outline.
Surface the rough plan to the user. Ask via AskUserQuestion whether the workflow outline matches what they meant, what should change, and whether anything is missing. Capture revisions inline.
Second pass — specialists fill gaps. Once the user signs off on the workflow outline, fan out the five specialist subagents in parallel via the Task tool, each reading the brief and the rough plan:
agents/data-sources.md: maps required data and access paths; runs connectivity probes
agents/accounting.md: applies the control taxonomy in references/accounting-controls.md
agents/numeric-integration.md: confirms the existing-skill recommendation; codifies Numeric MCP interaction principles
agents/architecture.md: finalizes delivery shape and file layout, and runs the scope-confirm gate
agents/pressure-test.md: pushes back on the brief itself through four lenses (pain/friction, inversion/removal, assumption-breaking, leverage/second-order). Surfaces gaps and questions the user should answer before Build.
Each writes to {run_id}/plan/{agent}.md. The parent merges them into {run_id}/plan.md using templates/plan.md. Pressure-test's pushbacks surface to the user as part of the merged plan; the user can revise the brief in response, dismiss specific pushbacks, or proceed.
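The fan-out and merge described above can be pictured as follows. This is a rough Python sketch under stated assumptions: `run_specialist` is a hypothetical stand-in for a Task-tool subagent call, and the merge is a plain concatenation rather than the real templates/plan.md layout. Only the five agent names and the output paths come from the text.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

SPECIALISTS = [
    "data-sources",
    "accounting",
    "numeric-integration",
    "architecture",
    "pressure-test",
]


def run_specialist(agent: str, brief: str, rough_plan: str) -> str:
    """Hypothetical stand-in for a subagent: returns that agent's plan section."""
    return (
        f"## {agent}\n"
        f"(findings based on a {len(brief)}-char brief "
        f"and a {len(rough_plan)}-char rough plan)\n"
    )


def fan_out(run_dir: Path, brief: str, rough_plan: str) -> Path:
    """Run all specialists in parallel, write each section, then merge."""
    plan_dir = run_dir / "plan"
    plan_dir.mkdir(parents=True, exist_ok=True)

    # Every specialist reads the same two inputs, so they can run concurrently.
    with ThreadPoolExecutor(max_workers=len(SPECIALISTS)) as pool:
        futures = {
            agent: pool.submit(run_specialist, agent, brief, rough_plan)
            for agent in SPECIALISTS
        }

    sections = []
    for agent in SPECIALISTS:  # fixed order keeps the merged plan deterministic
        text = futures[agent].result()
        (plan_dir / f"{agent}.md").write_text(text, encoding="utf-8")
        sections.append(text)

    merged = run_dir / "plan.md"
    merged.write_text("\n".join(sections), encoding="utf-8")  # placeholder merge
    return merged
```

Merging in a fixed agent order, rather than completion order, keeps plan.md stable between runs even though the specialists finish at different times.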
Gate to Phase 3: surface the merged plan. Architecture's confirm-scope question runs here. User confirms or narrows.
Phase 3 (Build) runs in the main thread. Read brief.md and plan.md, then construct the deliverable per the plan. Load references/build-patterns.md for the methodology: output types, sub-roles, design checklist, predictable output paths, deliverable file tree.
The deliverable is always a Cowork skill. When constructing the SKILL.md and its bundled files, invoke the skill-creator skill for skill-specific best practices (frontmatter, progressive disclosure, output formats, examples). Don't duplicate skill-creator's content.
What the skill produces when run varies — report / external mutation / working file / Cowork artifact / composite. The output type drives the skill's control discipline. See references/build-patterns.md for the discipline per type.
Performance discipline. Apply the patterns in references/performance-patterns.md while writing — caching, fan-out, materiality gates, scope-confirm gates, short-circuits, subagent contracts, gated submission. The new skill ships with its own scoped performance.md.
Output: {run_id}/build/ containing the deliverable file tree, plus _build_summary.md (one-pager listing every file + why) and _panel_inputs.md (paths the Phase 4 panel will read).
Bridge to Phase 4: the user copies {run_id}/build/{deliverable}/ to their workspace folder (e.g., ~/Documents/Claude/{deliverable}/). Cowork registers the new skill. The user can now invoke it.
Phase 4 (Review) has two steps, with a synthesizer at the end. First, the user runs the new skill on test data themselves. Then a five-reviewer panel fans out in parallel against the captured outputs, and a synthesizer renders one verdict.
Step 1 — User runs the new skill. The user invokes the new skill on a recent closed period by default (or sample data, or synthetic input — they choose). Build's design defaults to dry-run mode; the user doesn't need a separate runner agent. Outputs land in the user's workspace folder under the run's chosen output path.
Before invoking the panel, confirm with the user: which run should the panel review? (Most recent run, named period, or specific output folder.) Capture the path — every panel reviewer needs it.
Step 2 — Five-reviewer panel (parallel via Task tool). Each reviewer receives three input paths:
The captured test-run output path from Step 1 (the run the user chose for review)
{run_id}/build/{deliverable}/ (the skill source: SKILL.md, references/, scripts/, performance.md)
{run_id}/brief.md and {run_id}/plan.md (for context: controls list, expected scope, materiality)
Each reviewer applies its own lens and returns structured JSON.
agents/controller-skeptic.md: stakeholder lens. "Would I sign this with my name on the certification?"
agents/accounting-process-expert.md: workflow-shape lens. Macro fit (S2P / O2C / R2R / H2R) and micro design (waste, variability, defects).
agents/auditor.md: evidence lens. "Could a Big 4 senior rely on this control? Walk me through the population, sample, re-performance test."
agents/adversarial-reviewer.md: failure-construction lens. Builds specific scenarios that break the deliverable (assumption violations, composition failures, cascades, abuse cases).
agents/production-on-call.md: operational lens. "It's 3am Tuesday and this just failed. How does the user know? What's the runbook?"
Each writes to {run_id}/review/{reviewer}.json.
Step 3 — Synthesize. agents/review-synthesizer.md reads the user's test-run output, all five panel JSONs, and the brief (for the human-time baseline). Dedupes overlapping findings, triages everything P1 (must fix) / P2 (should fix) / P3 (nice to fix), and renders one verdict — Cleared / Cleared with conditions / Blocked.
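To make the dedupe, triage, and verdict steps concrete, here is a minimal Python sketch. Several things in it are assumptions rather than toolkit behavior. The `Finding` fields are invented for illustration, since the real JSON schema lives in the agent files. Exact-match deduplication stands in for the synthesizer's judgment about overlapping findings. The rule mapping priorities to a verdict (any P1 blocks, any P2 adds conditions) is a plausible reading, not something the text states.

```python
from dataclasses import dataclass

PRIORITIES = ("P1", "P2", "P3")  # must fix / should fix / nice to fix


@dataclass(frozen=True)
class Finding:
    reviewer: str
    priority: str  # one of PRIORITIES
    summary: str


def dedupe(findings):
    """Keep one finding per summary (case-insensitive), at its most severe priority."""
    best = {}
    for f in findings:
        key = f.summary.strip().lower()
        current = best.get(key)
        if current is None or PRIORITIES.index(f.priority) < PRIORITIES.index(current.priority):
            best[key] = f
    # Most severe first, so the review doc leads with the must-fix items.
    return sorted(best.values(), key=lambda f: PRIORITIES.index(f.priority))


def verdict(findings):
    """Assumed rule: any P1 blocks; any P2 means conditions; otherwise cleared."""
    levels = {f.priority for f in findings}
    if "P1" in levels:
        return "Blocked"
    if "P2" in levels:
        return "Cleared with conditions"
    return "Cleared"
```

When two reviewers raise the same issue at different priorities, this sketch keeps the more severe rating, so one cautious reviewer is enough to hold the finding at that level.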
The synthesizer records the iteration backlog in the run's review doc ({run_id}/review.md) — internal to the build, and the right place for it. It is never written into the delivered skill's SKILL.md, which ships clean (often to a public library).
Output: {run_id}/review.md using templates/review.md. End.
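Taken together, the four phases write to a predictable tree under {run_id}. The Python sketch below collects those paths in one place. `RunPaths` is a hypothetical helper, not part of the toolkit, and it assumes _build_summary.md and _panel_inputs.md sit directly inside build/.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class RunPaths:
    root: Path  # the {run_id} directory

    @property
    def brief(self) -> Path:  # Phase 1 output
        return self.root / "brief.md"

    def specialist(self, agent: str) -> Path:  # Phase 2, one file per specialist
        return self.root / "plan" / f"{agent}.md"

    @property
    def plan(self) -> Path:  # Phase 2 merged output
        return self.root / "plan.md"

    @property
    def build(self) -> Path:  # Phase 3 deliverable tree
        return self.root / "build"

    @property
    def build_summary(self) -> Path:  # assumed to live inside build/
        return self.build / "_build_summary.md"

    @property
    def panel_inputs(self) -> Path:  # assumed to live inside build/
        return self.build / "_panel_inputs.md"

    def reviewer(self, name: str) -> Path:  # Phase 4, one JSON per reviewer
        return self.root / "review" / f"{name}.json"

    @property
    def review(self) -> Path:  # Phase 4 final output
        return self.root / "review.md"
```

For example, `RunPaths(Path("run-001")).reviewer("auditor")` resolves to run-001/review/auditor.json.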
Things that commonly go sideways. The orchestrator should watch for these.
When a tag is ambiguous, ask via AskUserQuestion rather than inferring it. Wrong tag → wrong control pack in Phase 4.
If a ## Iteration backlog already exists in {run_id}/review.md, items the user hasn't crossed off must carry forward. Older items don't disappear; only completed ones do. (The backlog lives in the review doc, never in the shipped SKILL.md.)
list_financial_accounts and get_workspace_context are cached for 24h. If the user changed their chart of accounts or added entities, the cache will lie. Build should expose a --refresh flag that clears the cache.
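The 24h cache and the --refresh flag mentioned above could look roughly like this. It is a minimal Python sketch, assuming a file-per-lookup JSON cache. `cached_call`, `clear_cache`, and the cache layout are hypothetical, and `fetch` stands in for whatever actually calls list_financial_accounts or get_workspace_context.

```python
import argparse
import json
import time
from pathlib import Path

TTL_SECONDS = 24 * 60 * 60  # 24h, as stated in the text


def cached_call(cache_dir: Path, name: str, fetch, now=None):
    """Return fetch()'s result, reusing a cached copy younger than 24h."""
    now = time.time() if now is None else now
    path = cache_dir / f"{name}.json"
    if path.exists():
        entry = json.loads(path.read_text(encoding="utf-8"))
        if now - entry["fetched_at"] < TTL_SECONDS:
            return entry["value"]
    value = fetch()
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"fetched_at": now, "value": value}), encoding="utf-8")
    return value


def clear_cache(cache_dir: Path) -> None:
    """Delete every cached lookup so the next call refetches."""
    if cache_dir.is_dir():
        for path in cache_dir.glob("*.json"):
            path.unlink()


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--refresh", action="store_true",
                        help="clear cached lookups before running")
    return parser.parse_args(argv)
```

A skill's entry point would call `clear_cache(cache_dir)` when `parse_args().refresh` is set, which gives the user a way out after changing the chart of accounts or adding entities.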
npx claudepluginhub numeric-io/mcp-community --plugin numeric-mcp-toolkit