By rhesis-ai
Design, run, and analyze AI test suites on the Rhesis platform via MCP connection, enabling the full explore-plan-create-execute-analyze workflow for automated testing cycles.
Own this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimOwn this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimBased on adoption, maintenance, documentation, and repository signals. Not a security audit or endorsement.
npx claudepluginhub rhesis-ai/rhesis --plugin rhesisOpen-source testing and regression detection framework for AI agents. Golden baseline diffing, CI/CD integration, works with LangGraph, CrewAI, OpenAI, Anthropic Claude, HuggingFace, Ollama, and MCP.
SDK Usability Benchmark — generate, execute, judge, and analyze AI agent benchmark suites
AI test generation with Ralph-loop quality gate: coder → reviewer → iterate
Chaos-test LLM agents for resilience. Injects latency, errors, timeouts, and malformed responses between your agent and its API to find failure modes before production does.
Set up evaluation of AI agents with tool call validation, correctness checks, task completion, and tool reliability using Dokimos. Framework-agnostic — works with any agent framework.
Benchmark, evaluate, and optimize skills to ensure reliable performance across all LLMs