By Azure
Copilot agent skills for running standardized evaluation workflows with AgentOps Toolkit and Microsoft Foundry agents.
AgentOps Doctor - surface regressions, latency spikes, error rates, and safety hits across AgentOps eval history, Azure Monitor traces, and Foundry control plane.
Generate or update agentops.yaml (flat 1.0 schema) by inspecting the workspace. Trigger on "configure agentops", "agentops.yaml", "set up evaluation", "what should I evaluate". Infer the agent target and dataset from the codebase; ask only when nothing can be found.
Create or extend a JSONL evaluation dataset for AgentOps. Trigger on "create dataset", "generate test data", "JSONL", "more eval rows". Infer the agent's domain from the codebase and produce realistic rows; never fabricate data when the domain is unclear.
Run AgentOps evaluations end-to-end against any agent (Foundry hosted/prompt agent, HTTP/JSON endpoint, or raw model deployment). Trigger on phrases like "run eval", "evaluate my agent", "benchmark", "agentops eval", "compare runs". Uses the flat agentops.yaml schema.
Read, regenerate, and explain AgentOps evaluation reports. Trigger on "show report", "explain scores", "regenerate report", "what do these metrics mean". Operates on results.json and report.md produced by `agentops eval run`.
A CLI, local Cockpit, and agent skills that help teams operationalize AI agents on Microsoft Foundry with standardized evaluation, observability, tracing, and operational practices.
AgentOps Toolkit is a CLI, local Cockpit, and agent skills that help teams move Microsoft Foundry agents from demo/POC to production with standardized evaluation, CI/CD gates, readiness diagnostics, release evidence, and trace-driven regression loops.
The project enables:
evidence.json + evidence.md) that summarize whether a candidate is ready, warning-only, or blocked
AgentOps is not a replacement for Microsoft Foundry. Foundry remains the system of record for hosted agents, cloud evaluations, traces, runtime monitoring, red teaming, datasets, operations, and Azure resource posture. AgentOps adds the repo-side developer workflow around those capabilities: configuration, repeatable eval gates, CI/CD wiring, local artifacts, Doctor diagnostics, and a Cockpit that points back to the right Foundry or Azure Monitor surface.
| Surface | Microsoft Foundry provides | AgentOps Toolkit provides |
|---|---|---|
| Agent runtime | Hosted agents, model deployments, traces, monitor views | Target resolution, local/CI eval invocation, normalized run artifacts |
| Evaluations | Cloud eval execution, Foundry evaluation reports, dataset assets | Source-controlled eval config, threshold gates, PR reports, baseline comparison |
| Observability | Foundry Monitor, App Insights, traces, operations views | Telemetry wiring, CI eval spans, Doctor finding spans, Cockpit deep links |
| Operations | Active alerts, red teaming, runtime health, Azure resources | Readiness checklist, workflow generation, repo/CI hygiene checks, release evidence |
| Continuous improvement | Traces, datasets, online evaluation signals | Reviewed trace-to-dataset candidates and regression gates |
| Developer workflow | Portal experience and Azure platform services | CLI-first automation that teams can run locally and in CI |
The design goal is simple: AgentOps accelerates adoption of Foundry by making the developer workflow repeatable, observable, and CI-friendly; it does not duplicate Foundry's portal experience.
Core outputs:
- results.json (machine-readable)
- report.md (human-readable)
- evidence.json / evidence.md (release-readiness evidence, when generated by `agentops doctor --evidence-pack`)

Exit code contract:
- 0: execution succeeded and all thresholds passed
- 2: execution succeeded but one or more thresholds failed
- 1: runtime or configuration error
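
The exit code contract above is what a CI step would branch on. The following is a minimal sketch of that branching, not part of AgentOps itself: the names `EvalOutcome`, `classify`, `should_block_merge`, and `run_eval` are hypothetical helpers introduced for illustration, and treating any code outside 0/1/2 as an error is an assumption rather than documented behavior.

```python
import subprocess
from enum import IntEnum


class EvalOutcome(IntEnum):
    """Exit codes from the contract above (names are illustrative)."""
    PASSED = 0             # execution succeeded and all thresholds passed
    ERROR = 1              # runtime or configuration error
    THRESHOLDS_FAILED = 2  # execution succeeded but a threshold failed


def classify(returncode: int) -> EvalOutcome:
    """Map a process return code onto the three documented outcomes."""
    try:
        return EvalOutcome(returncode)
    except ValueError:
        # Assumption: codes outside the contract (signals, crashes)
        # are treated the same as a runtime error.
        return EvalOutcome.ERROR


def should_block_merge(outcome: EvalOutcome,
                       block_on_threshold_failure: bool = True) -> bool:
    """Decide whether a CI gate should fail for this outcome."""
    if outcome is EvalOutcome.PASSED:
        return False
    if outcome is EvalOutcome.THRESHOLDS_FAILED:
        # A team may choose to warn rather than block on threshold misses.
        return block_on_threshold_failure
    return True  # errors always block


def run_eval(command=("agentops", "eval", "run")) -> EvalOutcome:
    """Run the eval command and classify its exit code.

    check=False keeps subprocess from raising on non-zero codes,
    since 2 is a meaningful result here, not a crash.
    """
    completed = subprocess.run(list(command), check=False)
    return classify(completed.returncode)
```

Keeping code 2 distinct from code 1 lets a pipeline tell "the agent regressed" apart from "the evaluation could not run", which usually call for different follow-up.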
npx claudepluginhub azure/agentops --plugin agentops-accelerator