Plugins listed here are tagged for this technology stack and auto-indexed from public GitHub repositories.
Claude Code plugins tagged for Prometheus development. Browse commands, agents, skills, and more.
Diagnose performance bottlenecks, implement distributed tracing, and manage incident response with Prometheus, Grafana, OpenTelemetry, and Datadog. Define SLIs/SLOs, run blameless postmortems, and build production-ready observability pipelines for microservices and infrastructure.
Orchestrate multi-agent teams for complex AI-driven projects: decompose tasks, match capabilities, coordinate workflows, manage shared context and errors, distribute workloads, monitor performance with Prometheus and OpenTelemetry, and synthesize insights from interactions. Integrates PowerShell, .NET, and Azure ops via specialist subagents.
Generate alerting rules for Prometheus, Grafana, PagerDuty, and Datadog to monitor performance metrics like latency, errors, throughput, resources, availability, and SLO violations. Produces configs with thresholds and rationale, routing, escalation policies, runbooks, and testing steps.
Centralize performance metrics from apps, systems, databases, caches, and services into Prometheus, StatsD, or CloudWatch using unified naming. Generate instrumentation code, Prometheus configs, Grafana dashboards, retention policies, and alerts for comprehensive monitoring workflows.
Deploy full monitoring stacks like Prometheus, Grafana, or Datadog to Kubernetes or Docker environments, configuring exporters, scrape targets, alerting rules, and Grafana dashboards. Generate production-ready DevOps setup code and configurations tailored to your infrastructure requirements.
Apply 28 prioritized best-practice rules for ClickHouse schema design, query optimization, and data ingestion, with companion skills for running ClickHouse SQL in Python, reviewing schemas and queries, writing Node.js client code, troubleshooting performance issues, and setting up local or cloud ClickHouse environments.
Use slash commands to set up performance monitoring with New Relic or Datadog APM in Node.js apps, including instrumentation and custom metrics, and deploy full observability stacks with Prometheus metrics, Jaeger or Zipkin tracing, ELK or Fluentd logging, alerting, and Grafana or Kibana dashboards.
Use the gcx CLI to debug Grafana observability stacks: investigate alerts, SLO breaches, and synthetic check failures via Prometheus metrics and Loki logs; manage dashboards, SLOs, and resources with GitOps; scaffold Go projects; and automate setup and code generation for resources-as-code.
Organize knowledge spatially with memory palace techniques—build, navigate, and maintain virtual structures for enhanced recall. Capture PR review insights and manage digital garden health to preserve context and decisions across projects.
Helps Qdrant users optimize and manage vector search deployments: tune performance, diagnose search quality, scale clusters, deploy infrastructure, upgrade versions, and use client SDKs.
Conduct specialized code reviews on Go projects, auditing web server architecture, middleware, concurrency patterns, data persistence with PostgreSQL, BubbleTea TUIs, Wish SSH servers, Prometheus instrumentation, and testing practices to ensure idiomatic, secure, high-performance code.
Adopt OpenTelemetry observability across your stack: configure and deploy the Collector, instrument applications in multiple languages, write and debug OTTL transformations, and validate attribute conventions.
Delegate observability implementation to expert agents that handle OpenTelemetry instrumentation for distributed tracing, structured logging pipelines with tools like Vector and Loki, Prometheus metrics and alerting, Grafana dashboards, SLO definitions, and incident response workflows for optimized system debugging.
Delegate SRE expertise to an agent for production incident response with triage, roles, and templates; generate Prometheus queries for golden signals, SLIs, alerting rules, and dashboards; define SLOs, error budgets, and capacity plans; implement JavaScript patterns like circuit breakers and retries for reliable distributed systems.
Manage Cloud SQL for PostgreSQL on GCP: provision instances, explore databases, audit health, monitor performance via PromQL, manage replication, optimize vector search, and tune configurations.
Run syncable CLI skills to analyze project tech stacks and monorepos, audit dependencies for CVEs/licenses/copyleft, scan code for secrets/vulnerabilities/insecure patterns, validate IaC (Dockerfiles/Compose/Terraform/K8s manifests), optimize K8s clusters for cost/resources, and execute secure deployments to GCP/Azure with audits.
Delegate SDLC workflows to specialist AI agents that architect cloud-native systems, design databases, conduct deep web research, optimize performance and observability, distill repo knowledge, and build production agents via orchestrated pipelines.
Helps instrument and configure OpenTelemetry telemetry pipelines across languages and the Collector: SDK setup, YAML config, OTTL, version compatibility, semantic conventions, migration patterns, and synthetic data generation.
Manage the full lifecycle of AlloyDB for PostgreSQL databases on GCP: provision clusters and instances, create and manage IAM or built-in users with role grants, explore schemas and run SQL queries, monitor health and replication, and troubleshoot performance using Cloud Monitoring metrics.
Automate Rootly incident management in Claude: create/triage/resolve incidents, manage alerts/workflows/services/on-call schedules, generate blameless postmortems with AI analysis, track action items, and check service health/status.
Process and transform data using jq, SQL, or pandas; design ETL/ELT pipelines for batch or streaming; perform time series forecasting, anomaly detection, and analytics; architect streaming systems with Kafka; generate insights and visualizations via natural language commands and specialist agents.
Implement production observability and SRE reliability: configure dashboards, metrics, alerts, SLOs, tracing in Datadog, CloudWatch, Prometheus, Grafana; orchestrate incident response from triage to postmortems; audit logs for SOC2, GDPR compliance; leverage specialist agents for log analysis, performance optimization, and cost-effective monitoring.
Investigate observability stacks by querying traces, logs, and metrics in OpenSearch with PPL and Prometheus with PromQL, correlating via OTel conventions from metric spikes to error logs, checking component health, and defining SLOs/SLIs.
Diagnose VictoriaMetrics performance issues by analyzing query execution traces for bottlenecks, cardinality bloat, and unused metrics, and orchestrate investigations across metrics, logs, traces, and alerts in Kubernetes environments.
Automatically discover Grafana Cloud stacks via gcx, configure an analysis environment with a Python venv, analyze Prometheus metric DPM rates, and identify top cost drivers through per-series breakdowns in sorted tables.
Manage D&D 5e campaigns as a Dungeon Master by creating modules, NPCs, characters, and encounters; auditing plot continuity, encounter balance, and loot distribution; generating procedural Dungeondraft battle maps; mapping NPC networks; pressure-testing for exploits; searching monster and spell catalogs; running session prep checklists; and querying the local Mimir database.
Query the full VictoriaMetrics observability stack directly from your editor: run PromQL/MetricsQL metric queries on VictoriaMetrics, search and analyze logs with LogsQL in VictoriaLogs, discover and retrieve distributed traces via Jaeger API in VictoriaTraces, and manage AlertManager alerts and silences using curl-based bash skills.
Diagnose Kubernetes cluster health comprehensively with dynamic API discovery, run kubectl operations for debugging pods/services/deployments, and monitor operator-specific status for ArgoCD, Prometheus, Crossplane, and Cert-Manager using specialized agents.
Automate AIDLC operations on AWS: execute self-improving loops via continuous trace evaluation and PR proposals, run 4-stage canary deployments on Kubernetes with SLO gates and human approvals, handle incident response from alarms, enforce cost budgets with model recommendations, and log audits for compliance.
Scaffold Grafana v12.x plugin projects (panels, data sources, apps, backends) with @grafana/create-plugin and Docker hot-reload dev environments. Cover the full plugin lifecycle with the React/Go/TypeScript SDKs: build, test, sign, and publish. Query Prometheus/Loki billing metrics (active series, ingestion, storage, cardinality, costs) via the Grafana API.
Accelerate enterprise development with automated CI/CD pipelines, multi-cloud deployments, code quality enforcement, and comprehensive documentation generation. Integrates with GitHub Actions, Kubernetes, Terraform, and monitoring tools to streamline team workflows.
Manage Cloud SQL for SQL Server instances on GCP: provision instances, create databases and users, clone environments, take backups, and monitor performance with PromQL queries for slow queries, CPU, and memory.
Architect end-to-end IoT systems from embedded firmware development and protocol selection (MQTT, CoAP) to edge computing on Docker/Kubernetes, device security with TLS/secure boot, time-series data pipelines using ClickHouse/Prometheus/Grafana, cloud integrations with AWS IoT Core/Azure IoT Hub/GCP, and digital twin modeling.
Assume the Senior DevOps Engineer role to architect production infrastructure on AWS, GCP, and Azure using Docker, Kubernetes, and Terraform; design GitOps CI/CD pipelines with ArgoCD and GitHub Actions; configure Prometheus/Grafana monitoring stacks; implement Vault secrets management; conduct incident response with runbooks and postmortems; and optimize cloud costs through FinOps practices.
Analyze Kubernetes cluster resource efficiency across nodes, workloads, Karpenter provisioning, OOM events, and costs. Generate reports with utilization stats, detected issues, actionable recommendations, and historical comparisons using Prometheus metrics.
Deploy and manage OpenTelemetry Collector pipelines shipping to Coralogix, instrument applications with OTel SDKs, write and debug OTTL transformations, and resolve telemetry semantic issues across Kubernetes and cloud environments.
Act as an expert vmkteam Go developer handling the full SDLC for API services: scaffold projects with PostgreSQL repos and zenrpc, decompose and resolve YouTrack tasks end-to-end, perform multi-persona GitLab MR code reviews, automate CI/CD deploys to Nomad, monitor Prometheus/Sentry/Grafana/Loki metrics, logs, and errors, investigate production incidents, generate RPC clients, and run Playwright browser automation.
Run production incident investigations and SLO monitoring in Honeycomb directly from Claude Code — query traces and metrics, analyze root causes with BubbleUp, interpret SLO burn rates, and instrument applications with OpenTelemetry.
Automate DevOps workflows by generating GitHub Actions CI/CD pipelines, Dockerfiles with multi-stage builds and security scans, docker-compose setups, and Kubernetes YAMLs for zero-downtime deployments using rolling, blue-green, or canary strategies with rollbacks and monitoring.