By orq-ai
Agent skills for building, deploying, evaluating, and monitoring LLM pipelines on the orq.ai platform.
Show workspace analytics — requests, cost, tokens, errors, top models, and drill-down trends
List available AI models and their capabilities
Quick entry point into the `orq-manage-skills` skill. Routes to the right phase based on the first argument, or asks if no argument is given.
Interactive onboarding guide — set up credentials, connect to orq.ai, and learn every command and skill
Query and summarize traces with filters — debugging entry point before orq-analyze-trace-failures
Write and run evaluatorq evaluation scripts (Python or TypeScript) for a single agent or deployment — custom scorers, built-in evaluators, and dataset-driven evaluation. For CLI workflows, use the companion skills: `orq-red-team` for `eq redteam` adversarial testing and `orq-simulate-agent` for `eq sim` multi-turn user simulation. Do NOT use when comparing multiple agents head-to-head (use orq-compare-agents) or when running orq.ai-native experiments only (use orq-run-experiment).
Read production traces, identify what's failing, and build failure taxonomies using open coding and axial coding methodology. Use when debugging agent or pipeline quality, investigating "why are my outputs bad?", or before building any evaluator — error analysis must come first. Do NOT use when you already have identified failure modes and need evaluators (use orq-build-evaluator) or datasets (use orq-generate-synthetic-dataset).
Design, create, and configure orq.ai Agents with tools, instructions, knowledge bases, and memory stores. Use when building new agents, attaching KBs or memory, writing system instructions, selecting models, or setting up RAG pipelines. Do NOT use for debugging existing agents (use orq-analyze-trace-failures) or comparing agents across frameworks (use orq-compare-agents).
Create validated LLM-as-a-Judge evaluators following best practices — binary Pass/Fail judges with TPR/TNR validation for measuring specific failure modes. Use when you need to automate quality checks, build guardrails, or measure a specific failure mode identified during trace analysis. Do NOT use when failures are fixable with prompt changes (use orq-optimize-prompt) or when failure modes are unknown (use orq-analyze-trace-failures first).
Run cross-framework agent comparisons using evaluatorq from orqkit — compares any combination of agents (orq.ai, LangGraph, CrewAI, OpenAI Agents SDK, Vercel AI SDK) head-to-head on the same dataset with LLM-as-a-judge scoring. Use when comparing agents, benchmarking, or wanting side-by-side evaluation. Do NOT use when comparing only orq.ai configurations with no external agents (use orq-run-experiment instead).
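The evaluator-building skill above mentions validating a binary Pass/Fail judge with TPR/TNR. As a rough illustration of what that check measures, here is a minimal sketch that compares judge verdicts against human labels. It is not part of the orq.ai tooling; the function name `tpr_tnr` is hypothetical, and treating "Pass" as the positive class is an assumption.

```python
from typing import Sequence, Tuple


def tpr_tnr(human: Sequence[bool], judge: Sequence[bool]) -> Tuple[float, float]:
    """Return (TPR, TNR) of judge verdicts measured against human labels.

    True means "Pass" (assumed positive class). TPR is the share of
    human-Pass items the judge also passes; TNR is the share of
    human-Fail items the judge also fails.
    """
    if len(human) != len(judge):
        raise ValueError("label lists must have the same length")
    positives = sum(1 for h in human if h)
    negatives = len(human) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one Pass and one Fail human label")
    true_pos = sum(1 for h, j in zip(human, judge) if h and j)
    true_neg = sum(1 for h, j in zip(human, judge) if not h and not j)
    return true_pos / positives, true_neg / negatives


if __name__ == "__main__":
    human_labels = [True, True, True, False, False]
    judge_labels = [True, True, False, False, True]
    tpr, tnr = tpr_tnr(human_labels, judge_labels)
    print(round(tpr, 2), tnr)  # prints: 0.67 0.5
```

A judge with a high TPR but low TNR lets failures through, which is why both rates are reported rather than plain accuracy.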
External network access: connects to servers outside your machine.
Uses power tools: uses Bash, Write, or Edit tools.
Agent Skills for the full Build → Evaluate → Optimize lifecycle of LLM pipelines on orq.ai.
Skills are multi-step workflows that require reasoning (e.g. build an agent, run an experiment).
Commands are quick actions for immediate results (e.g. list traces, show analytics).
Each skill encodes best practices from prompt engineering, agent design, evaluation methodology, and experimentation into repeatable workflows. Together they cover the lifecycle, from creating agents and writing prompts, through trace analysis and dataset generation, to running validated experiments and iterating on results.
Built on the Agent Skills standard format, so it works with any compatible agent (Claude Code, Cursor, Gemini CLI, and others).
An orq.ai account
An API key from Settings → API Keys
export ORQ_API_KEY=your-key-here
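Since every install path below reads `ORQ_API_KEY` from the environment, it can help to confirm the variable is actually set before continuing. The following is a minimal sketch of such a check; the helper name `require_api_key` is hypothetical and not something the plugins provide.

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return ORQ_API_KEY from the environment, or raise if it is unusable."""
    env = os.environ if env is None else env
    key = env.get("ORQ_API_KEY", "").strip()
    # Reject an unset variable and the placeholder from the export line above.
    if not key or key == "your-key-here":
        raise RuntimeError("ORQ_API_KEY is not set; export a real key first")
    return key


if __name__ == "__main__":
    require_api_key()
    print("ORQ_API_KEY is set")
```

This only checks that a value is present. It does not contact orq.ai or validate the key.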
| Tool | Recommended install |
|---|---|
| Claude Code (CLI) | Claude Code plugin |
| Claude Cowork (Desktop) | Claude Cowork install guide |
| Cursor | Cursor install guide |
| Codex | Codex install |
| Gemini CLI, Cline, Copilot, Windsurf | Skills-only install (npx) |
| Any MCP-capable client | MCP-only install |
Use this if you want easy access to all components — skills, MCP tools, and trace hooks — in one install. Installed via the orq-ai/assistant-plugins marketplace.
# In Claude Code:
/plugin marketplace add orq-ai/assistant-plugins
# Install all 3 plugins
/plugin install orq-skills@assistant-plugins
/plugin install orq-mcp@assistant-plugins
/plugin install orq-trace@assistant-plugins
| Plugin | What it gives you |
|---|---|
| orq-skills | Skills, commands, and agents for the Build → Evaluate → Optimize lifecycle |
| orq-mcp | MCP server registration, so Claude can call orq.ai APIs directly |
| orq-trace | OTLP tracing hooks that capture Claude Code sessions into orq.ai |
Verify with the interactive onboarding — checks ORQ_API_KEY, MCP reachability, and credentials:
/orq:quickstart
# Skills (writes to ~/.agents/skills/, which Codex scans by default)
npx skills add orq-ai/assistant-plugins --agent codex -g -y
# orq.ai MCP server (writes [mcp_servers.orq-workspace] to ~/.codex/config.toml)
codex mcp add orq-workspace \
--url https://my.orq.ai/v2/mcp \
--bearer-token-env-var ORQ_API_KEY
Restart Codex. Verify with /mcp (lists orq-workspace) or the prompt "List my orq.ai agents".
Codex plugins don't support slash commands, so /orq:* shortcuts are Claude Code-only — in Codex, describe tasks in natural language to trigger skills.
Use this when you're on a non-Claude agent (Cursor, Gemini CLI, Cline, Copilot CLI, Codex, Windsurf, and many others), or when you only want the skills without MCP/trace hooks.
npx skills add orq-ai/assistant-plugins
Auto-detects your agent and writes skills to the correct location (e.g. .claude/skills/, .cursor/rules/). Run inside your project directory.
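To see where the installer wrote skills in a given project, a small check like the one below may be useful. It is only a sketch: it looks for the two example locations named above, and other agents may use directories not listed here. The function name `find_skill_dirs` is hypothetical.

```python
from pathlib import Path
from typing import List

# Example locations taken from the text above; not an exhaustive list.
CANDIDATE_DIRS = (".claude/skills", ".cursor/rules")


def find_skill_dirs(project_root: str) -> List[str]:
    """Return the candidate skill directories that exist under project_root."""
    root = Path(project_root)
    return [rel for rel in CANDIDATE_DIRS if (root / rel).is_dir()]


if __name__ == "__main__":
    found = find_skill_dirs(".")
    print(found if found else "no known skill directories found")
```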
Use this when you want orq.ai MCP tools in a tool that isn't the Claude Code plugin (Claude Desktop, other MCP-capable clients, or manual Claude Code setup).
# Manual registration in Claude Code
claude mcp add --transport http orq-workspace https://my.orq.ai/v2/mcp \
--header "Authorization: Bearer ${ORQ_API_KEY}"
For other clients, most accept a JSON block with url + headers:
{
  "mcpServers": {
    "orq-workspace": {
      "type": "http",
      "url": "https://my.orq.ai/v2/mcp",
      "headers": { "Authorization": "Bearer ${ORQ_API_KEY}" }
    }
  }
}
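Whether a client expands `${ORQ_API_KEY}` inside its JSON config is client-specific, so it is worth checking that client's documentation. For clients that need a literal value, the sketch below generates the same block with the key resolved from the environment. The helper name `build_mcp_config` is hypothetical; the server name, URL, and header come from the JSON above.

```python
import json
import os

MCP_URL = "https://my.orq.ai/v2/mcp"


def build_mcp_config(api_key: str) -> dict:
    """Build the mcpServers block shown above with a literal bearer token."""
    if not api_key:
        raise ValueError("api_key must be non-empty")
    return {
        "mcpServers": {
            "orq-workspace": {
                "type": "http",
                "url": MCP_URL,
                "headers": {"Authorization": f"Bearer {api_key}"},
            }
        }
    }


if __name__ == "__main__":
    # The printed JSON contains the secret; avoid committing it to a repo.
    print(json.dumps(build_mcp_config(os.environ["ORQ_API_KEY"]), indent=2))
```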
tests/scripts/validate-plugin-manifests.sh
Quick-action slash commands. Use /orq:<command> in Claude Code.
Automatically trace Claude Code sessions to Orq. Captures sessions, turns, tool calls, and LLM responses as hierarchical OTLP spans.