By flexion
Agent Artifex — designing and testing MCP servers, agents, chatbots, and tool-calling systems using evidence-based patterns from Anthropic, OpenAI, RAGAS, and academic research
Use when the user asks "what testing do we need?", "what are our testing gaps?", "we have some tests but are they enough?", "is our MCP server well-tested?", "what should we test next?", "audit our test coverage for AI", "we keep getting bad responses and don't know why", "our agent picks the wrong tool sometimes", "how is my tool design?", "are my descriptions good enough?", "review my error messages", "is my system prompt well-designed?", "audit my MCP server design", "what design gaps do I have?", or needs to diagnose AI design or testing gaps in an existing project. Also use when someone says "assess my testing", "review our test strategy for AI", or "assess my design".
Use when the user wants to design an MCP server, agent, chatbot, or tool-calling system for quality. This includes: designing tool descriptions, structuring parameters and schemas, writing error messages for LLM consumers, designing system prompts, planning multi-turn conversations, architecting tool sets, or designing response formats. Also use when someone says "how should I design", "what makes a good tool description", "how should I structure my errors", "design my MCP server", "how do I organize my tools", or any task where they want to follow evidence-based design principles before or while building.
Use when the user asks "what is the AI testing framework?", "what are the design principles?", "explain the 7 design areas", "explain MCP testing", "what are the testing areas?", "what's the testing pyramid for AI?", "how do the testing layers relate?", "what should I test in my MCP server?", "overview of AI agent design and testing", or needs a comprehensive reference overview of the AI design and testing guidelines. Also use when someone is new to designing or testing AI systems and wants the big picture before diving into implementation.
Use when the user is unsure which AI services skill to invoke, asks for "AI services guidance" generally, says "help me design my MCP server", "how should I structure my tools?", "what makes a good tool description?", "how do I design my agent?", "help me test my MCP server", "how should I test my agent?", "what testing do I need?", "my tests are flaky", "the agent picks the wrong tool", "how do I set up CI for AI tests?", or wants a structured guided experience that routes across multiple AI services skills. Also use when the user mentions designing or testing chatbots, tool descriptions, evals, or agent behavior without specifying a particular skill.
Use when the user wants to improve an existing MCP server, agent, chatbot, or tool-calling system. This includes: improving tool descriptions, fixing error messages, adding output schemas, writing tests, implementing quality checks, adding evals, setting up test harnesses, or any task where they say "help me improve", "fix my descriptions", "add tests", "write evals", "implement quality checks", "make my server better", "apply the design principles", or are ready to make code changes to improve quality. This skill covers both design application (making the code better) and test implementation (verifying the code is good). For scaffolding new projects, use claude-api:mcp-builder. For design principles without code changes, use agent-artifex:design.
Your strategic coding partner.
Like a cycling domestique, it carries the water, stays focused on your goals, and handles the unglamorous work you don't want to do.
"Remember Sammy Jankis."
Like Leonard in Memento, Claude can't form long-term memories. Context window fills up, conversation resets, everything vanishes. memento gives Claude its tattoos—session files that persist decisions, progress, and context across resets.
Helps developers embody Flexion fundamentals across conversation boundaries:
"I told you. You agreed. You forgot. Repeat."
You've written the perfect CLAUDE.md. Claude reads it. Claude agrees. By turn 47, Claude ignores half of it. mantra provides curated behavioral rules that are automatically loaded via Claude Code's native .claude/rules/ mechanism—ensuring consistent behavior from turn 1 to turn 100.
Helps developers embody Flexion fundamentals throughout long sessions:
"The burden is mine now."
JIRA tickets, Azure DevOps work items, commit messages, PR descriptions. The awful-but-important work that kills your flow. onus handles the project management bureaucracy so you can code.
Supports GitHub Issues (default, zero-config), JIRA, and Azure DevOps work items.
Helps developers embody Flexion fundamentals while staying in flow:
External (GitHub/JIRA/Azure DevOps)
│
▼ fetch issue details
[onus]
│
▼ populate session file
[memento] ←── "What's next?" lookup
│
▼ read session context
[mantra] ──► native rules auto-loaded
Each plugin works standalone but gains enhanced behavior when used together.

Session resumption showing mantra (context refresh counter), onus (issue tracking), and memento (session file) working together. Claude reads the session file and picks up exactly where the previous conversation left off.
| Tool | Version | Used By | Purpose |
|---|---|---|---|
| Claude Code | 2.0.12+ | All | Plugin host (plugin system introduced in v2.0.12) |
| Node.js | 18+ | All | Runtime for hooks and scripts |
| git | 2.x | All | Branch detection, commits, session tracking |
When using /onus:fetch, Claude will use these tools to retrieve work items:
| Tool | Platform | Purpose |
|---|---|---|
| GitHub CLI (gh) | GitHub | Fetch issues, create PRs (recommended) |
| Claude's WebFetch | JIRA, Azure DevOps | Built-in HTTP tool for API calls |
Note: For JIRA/Azure DevOps, Claude uses its built-in WebFetch capability. No additional tools required.
| Variable | Platform | How to Get |
|---|---|---|
| `GITHUB_TOKEN` | GitHub | Create PAT with `repo` scope |
| `JIRA_TOKEN` | JIRA | `echo -n "email:api_token" \| base64` (Get API token) |
| `AZURE_DEVOPS_TOKEN` | Azure DevOps | `echo -n ":pat" \| base64` (Create PAT with Work Items Read) |
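The two base64 recipes in the table can be sketched as a short shell snippet. This is an illustration only: the credential values and the helper variables `JIRA_EMAIL`, `JIRA_API_TOKEN`, and `AZURE_PAT` are hypothetical placeholders, not names the plugin reads. Only `JIRA_TOKEN` and `AZURE_DEVOPS_TOKEN` come from the table.

```shell
# Hypothetical placeholder credentials; substitute your own values.
JIRA_EMAIL="dev@example.com"
JIRA_API_TOKEN="example-api-token"
AZURE_PAT="example-pat"

# JIRA: base64 of "email:api_token".
# printf '%s' writes no trailing newline, matching `echo -n`.
# tr strips the line wraps that some base64 builds insert into long output.
JIRA_TOKEN="$(printf '%s' "${JIRA_EMAIL}:${JIRA_API_TOKEN}" | base64 | tr -d '\n')"

# Azure DevOps: base64 of ":pat" (empty username, a colon, then the PAT).
AZURE_DEVOPS_TOKEN="$(printf '%s' ":${AZURE_PAT}" | base64 | tr -d '\n')"

export JIRA_TOKEN AZURE_DEVOPS_TOKEN
```

Using `printf '%s'` rather than `echo -n` is a portability choice, since some shells print `-n` literally. The encoded value is the same either way.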
# Check required tools
git --version # git version 2.x
node --version # v18.x or higher
claude --version # Claude Code 2.x
# Check GitHub CLI (optional, for onus with GitHub)
gh --version # gh version 2.x
gh auth status # Verify authentication
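The manual checks above could also be scripted. The following is a minimal sketch, assuming only that each tool prints its version number somewhere in its `--version` output. The helper names `major_version` and `meets_minimum` are hypothetical and not part of any plugin.

```shell
# Extract the first number from strings like "v18.19.0" or "git version 2.43.0".
major_version() {
  printf '%s\n' "$1" | sed -E 's/^[^0-9]*([0-9]+).*/\1/'
}

# Succeed when the major version found in $1 is at least $2.
meets_minimum() {
  [ "$(major_version "$1")" -ge "$2" ]
}

# Report on Node.js only if it is installed; 18 is the minimum from the requirements table.
if command -v node >/dev/null 2>&1; then
  if meets_minimum "$(node --version)" 18; then
    echo "node: ok"
  else
    echo "node: too old, need 18+"
  fi
else
  echo "node: not found"
fi
```

The same two helpers could be applied to `git --version` with a minimum of 2. A Claude Code check would need to compare the full 2.0.12 version, which this major-only sketch does not do.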
/plugin marketplace add flexion/claude-domestique
npx claudepluginhub flexion/claude-domestique --plugin agent-artifex
Session management for Claude Code - auto-injected via hooks, zero config
Work item automation for Claude Code - auto-injected via hooks, zero config
Behavioral skills for Claude Code - critical assessment, evidence-based debugging, and rule authoring