From grafana-app-sdk
Identifies Prometheus metrics driving high data points per minute (DPM) with per-label breakdown. Helps optimize Grafana Cloud costs by finding noisy, high-cardinality metrics.
Slash command: /grafana-app-sdk:dpm-finder
A Grafana Professional Services tool for identifying which Prometheus metrics drive high Data Points per Minute (DPM). Analyzes metric-level DPM with per-label breakdown to help optimize Grafana Cloud costs.
Source: https://github.com/grafana-ps/dpm-finder
git clone https://github.com/grafana-ps/dpm-finder.git
cd dpm-finder
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Copy .env_example to .env and fill in the values:
- PROMETHEUS_ENDPOINT -- The Prometheus endpoint URL (must end in .net, with nothing after it)
- PROMETHEUS_USERNAME -- Tenant ID / stack ID (numeric)
- PROMETHEUS_API_KEY -- Grafana Cloud API key (glc_... format)

If gcx is available, use it to find stack details:
gcx config check # Show active stack context
gcx config list-contexts # List all configured stacks
gcx config view # Full config with endpoints
The Prometheus endpoint follows the pattern:
https://prometheus-{cluster_slug}.grafana.net
The username is the numeric stack ID. gcx auto-discovers service URLs from the stack slug via GCOM.
Look up the stack in the Grafana Cloud portal, or query the usage datasource:
grafanacloud_instance_info{name=~"STACK_NAME.*"}
Extract cluster_slug for the endpoint URL and id for the username.
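
The constraints on the three values can be checked before running the tool. The following is a minimal sketch, assuming only the rules stated above (endpoint pattern, numeric username, glc_ key prefix). The helpers `build_endpoint` and `validate_config` are hypothetical and are not part of dpm-finder, and the slug in the usage line is made up.

```python
from urllib.parse import urlparse


def build_endpoint(cluster_slug: str) -> str:
    """Fill in the documented pattern https://prometheus-{cluster_slug}.grafana.net."""
    return f"https://prometheus-{cluster_slug}.grafana.net"


def validate_config(endpoint: str, username: str, api_key: str) -> list[str]:
    """Return a list of problems; an empty list means the values look plausible."""
    problems = []
    parsed = urlparse(endpoint)
    # The endpoint must end in .net with nothing after it (no path, no trailing slash).
    if parsed.path or parsed.query or not (parsed.hostname or "").endswith(".net"):
        problems.append("PROMETHEUS_ENDPOINT must end in .net with no trailing path")
    # The username is the numeric stack ID.
    if not username.isdigit():
        problems.append("PROMETHEUS_USERNAME must be the numeric stack ID")
    # Grafana Cloud API keys use the glc_ prefix.
    if not api_key.startswith("glc_"):
        problems.append("PROMETHEUS_API_KEY should start with glc_")
    return problems


if __name__ == "__main__":
    # Hypothetical slug, stack ID, and key, for illustration only.
    print(validate_config(build_endpoint("example-slug"), "123456", "glc_example"))
```

An empty list means none of the three checks failed; it does not prove the credentials are accepted by the API.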
./dpm-finder.py -f json -m 2.0 -t 8 --timeout 120 -l 10
| Flag | Default | Description |
|---|---|---|
| -f, --format | csv | Output format: csv, text, txt, json, prom |
| -m, --min-dpm | 1.0 | Minimum DPM threshold to include a metric |
| -t, --threads | 10 | Concurrent processing threads |
| -l, --lookback | 10 | Lookback window in minutes for DPM calculation |
| --timeout | 60 | API request timeout in seconds |
| --cost-per-1000-series | (none) | Dollar cost per 1000 series; adds estimated_cost column |
| -q, --quiet | false | Suppress progress output |
| -v, --verbose | false | Enable debug logging |
| -e, --exporter | false | Run as Prometheus exporter instead of one-shot |
| -p, --port | 9966 | Exporter server port |
| -u, --update-interval | 86400 | Exporter metric refresh interval in seconds |
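
To make the -l and -m flags concrete, here is a hedged sketch of the arithmetic, assuming a series' DPM is the number of samples it received in the lookback window divided by the window length in minutes. How dpm-finder computes this internally is not shown in this document; `series_dpm`, `metric_dpm`, and `above_threshold` are hypothetical helpers, and treating the threshold as inclusive is an assumption. Taking the maximum across series follows the description of the dpm field in the JSON output.

```python
def series_dpm(sample_count: int, lookback_minutes: float) -> float:
    """Samples received in the window divided by the window length in minutes."""
    if lookback_minutes <= 0:
        raise ValueError("lookback_minutes must be positive")
    return sample_count / lookback_minutes


def metric_dpm(series_sample_counts: list[int], lookback_minutes: float = 10) -> float:
    """Metric-level DPM, taken as the maximum across the metric's series."""
    return max(
        (series_dpm(count, lookback_minutes) for count in series_sample_counts),
        default=0.0,
    )


def above_threshold(dpm: float, min_dpm: float = 1.0) -> bool:
    """Assumption: a metric at exactly the threshold is included."""
    return dpm >= min_dpm


if __name__ == "__main__":
    # A series scraped every 15 seconds produces 40 samples in a 10-minute window.
    print(series_dpm(40, 10))            # 4.0
    print(metric_dpm([40, 10, 20], 10))  # 4.0
```

Under these assumptions, a metric scraped once per minute sits at a DPM of 1.0, which is why the default -m value of 1.0 acts as a baseline.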
Output files are written to the current working directory.
JSON (-f json) -> metric_rates.json
Best for programmatic analysis. Includes per-series DPM breakdown:
- metrics[].metric_name -- the metric name
- metrics[].dpm -- data points per minute (maximum across this metric's individual series)
- metrics[].series_count -- number of active time series
- metrics[].series_detail[] -- per-label-set DPM breakdown (sorted by DPM descending)
- total_metrics_above_threshold -- count of metrics above threshold
- performance_metrics.total_runtime_seconds -- total processing time
- performance_metrics.average_metric_processing_seconds -- average time per metric
- performance_metrics.total_metrics_processed -- total metrics analyzed
- performance_metrics.metrics_per_second -- processing throughput

CSV (-f csv) -> metric_rates.csv
Columns: metric_name, dpm, series_count (plus estimated_cost if --cost-per-1000-series is set).

Text (-f text) -> metric_rates.txt
Human-readable format with per-series breakdown and performance statistics.

Prometheus (-f prom) -> metric_rates.prom
Prometheus exposition format suitable for Alloy's prometheus.exporter.unix textfile collector.
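
For programmatic analysis of the JSON output, a short script can rank metrics using the documented top-level fields. This is a sketch only: `top_metrics` is a hypothetical helper, the sample data is invented to match the field names listed above, and the inner structure of series_detail entries is not assumed.

```python
import json


def top_metrics(report: dict, limit: int = 5) -> list[tuple[str, float, int]]:
    """Return (metric_name, dpm, series_count) rows, highest DPM first."""
    rows = [
        (m["metric_name"], m["dpm"], m["series_count"])
        for m in report.get("metrics", [])
    ]
    # Sort by DPM, then by series count, both descending.
    rows.sort(key=lambda row: (row[1], row[2]), reverse=True)
    return rows[:limit]


# Invented sample shaped like the documented fields; real data comes from metric_rates.json.
SAMPLE = json.loads("""
{
  "metrics": [
    {"metric_name": "node_cpu_seconds_total", "dpm": 4.0, "series_count": 64, "series_detail": []},
    {"metric_name": "http_requests_total", "dpm": 6.0, "series_count": 120, "series_detail": []},
    {"metric_name": "up", "dpm": 2.0, "series_count": 10, "series_detail": []}
  ],
  "total_metrics_above_threshold": 3
}
""")

if __name__ == "__main__":
    for name, dpm, series in top_metrics(SAMPLE):
        print(f"{name}: dpm={dpm} series={series}")
```

To analyze a real run, load the file with `json.load(open("metric_rates.json"))` and pass the result to `top_metrics`.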
- Use series_detail to identify which label combinations drive the highest DPM.
- If --cost-per-1000-series is set, use estimated_cost to prioritize by spend.

When running dpm-finder against multiple stacks, limit to a maximum of 3 concurrent runs. Batch the stacks and wait for each batch to complete before starting the next.
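
The batching rule for multiple stacks can be sketched as follows. This is an illustration under stated assumptions: `run_in_batches` and `batched` are hypothetical helpers, and `run_one` stands in for whatever launches dpm-finder for a single stack (for example a subprocess call with that stack's credentials), which is not shown here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable

MAX_CONCURRENT = 3  # limit stated above


def batched(items: list[str], size: int) -> list[list[str]]:
    """Split items into consecutive batches of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def run_in_batches(
    stacks: Iterable[str],
    run_one: Callable[[str], object],
    batch_size: int = MAX_CONCURRENT,
) -> dict[str, object]:
    """Run `run_one` per stack, one batch at a time, and collect results by stack."""
    results: dict[str, object] = {}
    for batch in batched(list(stacks), batch_size):
        # Leaving the with-block waits for every run in this batch to finish,
        # so the next batch only starts once the current one is complete.
        with ThreadPoolExecutor(max_workers=batch_size) as pool:
            for stack, result in zip(batch, pool.map(run_one, batch)):
                results[stack] = result
    return results


if __name__ == "__main__":
    # str.upper is a stand-in for the real per-stack runner.
    print(run_in_batches(["stack-a", "stack-b", "stack-c", "stack-d"], str.upper))
```

Running whole batches to completion is simpler than a sliding window of three and matches the instruction to wait for each batch before starting the next.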
The tool automatically excludes:
- Metrics with *_count, *_bucket, or *_sum suffixes
- Metrics with the grafana_* prefix
- Metrics with aggregation rules (/aggregations/rules)

Run as a long-lived Prometheus exporter instead of one-shot analysis:
./dpm-finder.py -e -p 9966 -u 86400
Serves metrics at http://localhost:PORT/metrics. Recalculates at the configured interval (default: daily). See README.md for full exporter and Docker documentation.
Alternative to local Python setup:
docker build -t dpm-finder:latest .
docker run --rm --env-file .env -v $(pwd)/output:/app/output \
dpm-finder:latest --format json --min-dpm 2.0
See README.md for full Docker Compose, production deployment, and monitoring integration docs.
- Authentication errors: verify that the API key has the metrics:read scope. Confirm PROMETHEUS_USERNAME matches the numeric stack ID.
- Timeouts: increase --timeout for large metric sets. The default is 60s; use 120s or higher for stacks with thousands of metrics.
- Empty results: lower the --min-dpm threshold. Check that PROMETHEUS_ENDPOINT does not have a trailing path after .net.

The tool retries failed API requests with exponential backoff (up to 10 retries). Rate-limited responses (HTTP 429) are backed off automatically. HTTP 4xx errors other than 429 are not retried.
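
The retry behavior described above can be illustrated with a small sketch. It is not dpm-finder's actual code: `HTTPStatusError`, `should_retry`, and `call_with_backoff` are hypothetical names, and the 1-second base delay and doubling factor are assumptions, since the document only states the retry limit and which status codes are retried.

```python
import time
from typing import Callable


class HTTPStatusError(Exception):
    """Hypothetical error carrying the HTTP status of a failed request."""

    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


def should_retry(status: int) -> bool:
    """Retry 429 and non-4xx failures; do not retry other 4xx responses."""
    if status == 429:
        return True
    return not (400 <= status < 500)


def call_with_backoff(
    request: Callable[[], object],
    max_retries: int = 10,
    base_delay: float = 1.0,  # assumed value
    sleep: Callable[[float], None] = time.sleep,
) -> object:
    """Call `request`, retrying with exponentially growing delays."""
    for attempt in range(max_retries + 1):
        try:
            return request()
        except HTTPStatusError as err:
            # Give up when retries are exhausted or the status is not retryable.
            if attempt == max_retries or not should_retry(err.status):
                raise
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a production version would usually also cap the maximum delay.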
dpm-finder.py # Main CLI tool (one-shot + exporter modes)
requirements.txt # Python dependencies
.env_example # Template for credential configuration
Dockerfile # Multi-stage Docker build
docker-compose.yml # Docker Compose orchestration
README.md # Full project documentation
npx claudepluginhub grafana/skills --plugin grafana-app-sdk