Skill

cost-management

Monitors Grafana Cloud costs, sets usage alerts, attributes spending by label, and reduces cardinality with Adaptive Metrics/Logs. Use for observability budget analysis and optimization.

monitoring

Popularity

Stars

158

Forks

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/grafana-app-sdk:cost-management

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

> **Docs**: https://grafana.com/docs/grafana-cloud/cost-management-and-billing/

SKILL.md

207 lines · ~1.4k tokens

Stats

LanguageJavaScript

Stars158

Forks13

MaintenanceExcellent

Last CommitJun 17, 2026

Actions

View Source View Plugin View on GitHub View README

Grafana Cloud Cost Management

Docs: https://grafana.com/docs/grafana-cloud/cost-management-and-billing/

Cost Management & Billing Application

Access: My Account → Cost Management (or within your Grafana Cloud stack)

FOCUS-compliant (FinOps Open Cost and Usage Specification) billing dashboards showing:

Spending by signal type (metrics, logs, traces, profiles)
Month-over-month trends
Usage vs. quota tracking
Invoice download

Cost Attribution by Label

Tag your telemetry at ingestion to enable per-team cost reporting:

// Add cost attribution labels in Alloy
prometheus.remote_write "cloud" {
  endpoint {
    url = sys.env("PROMETHEUS_URL")
    basic_auth {
      username = sys.env("PROM_USER")
      password = sys.env("GRAFANA_CLOUD_API_KEY")
    }
  }
  external_labels = {
    team    = "platform",
    project = "checkout-service",
    env     = "production",
  }
}

loki.write "cloud" {
  endpoint {
    url = sys.env("LOKI_URL")
    basic_auth {
      username = sys.env("LOKI_USER")
      password = sys.env("GRAFANA_CLOUD_API_KEY")
    }
  }
  external_labels = {
    team    = "platform",
    project = "checkout-service",
  }
}

Usage Alerts

Set alerts before you hit quota or budget thresholds:

# Alert when approaching metrics quota
groups:
  - name: grafana-cloud-usage
    rules:
      - alert: MetricsUsageHigh
        expr: grafana_cloud_metrics_active_series / grafana_cloud_metrics_limit > 0.8
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Grafana Cloud metrics usage >80% of quota"

      - alert: LogsIngestionHigh
        expr: increase(grafana_cloud_logs_bytes_ingested_total[24h]) > 50e9  # 50GB/day
        labels:
          severity: warning
        annotations:
          summary: "Grafana Cloud log ingestion >50GB today"

Adaptive Metrics (Reduce Cardinality)

Automatically identifies unused or high-cardinality metrics and generates aggregation rules.

# View recommendations
curl https://yourstack.grafana.net/api/plugins/grafana-adaptive-metrics-app/resources/v1/recommendations \
  -H "Authorization: Bearer <token>"

# Apply aggregation rule — drops high-cardinality labels from a metric
- match: "^http_request_duration_seconds.*"
  action: keep
  match_labels:
    - method
    - status_code
    - service
  # Drops: pod, container, instance, node — reduces series from 10k → 50

Workflow:

Go to Grafana Cloud → Adaptive Metrics
Review recommended aggregation rules (sorted by series reduction impact)
Test rules in "Preview" mode before applying
Apply rules — takes effect within 5 minutes

Adaptive Logs (Reduce Log Volume)

Drop or sample log lines before ingestion using Loki's pipeline stages in Alloy:

loki.process "filter_logs" {
  forward_to = [loki.write.cloud.receiver]

  // Drop health check logs (high volume, low value)
  stage.drop {
    expression = ".*GET /health.*"
  }

  // Drop debug logs in production
  stage.drop {
    source     = "level"
    expression = "debug"
  }

  // Sample verbose info logs (keep 10%)
  stage.sampling {
    rate = 0.1
    source = "level"
    value  = "info"
  }
}

Adaptive Traces (Reduce Trace Volume)

Use Alloy tail-based sampling to keep only important traces:

otelcol.processor.tail_sampling "cost_control" {
  decision_wait = "10s"
  policy {
    name = "keep-errors"
    type = "status_code"
    status_code { status_codes = ["ERROR"] }
  }
  policy {
    name = "keep-slow"
    type = "latency"
    latency { threshold_ms = 1000 }
  }
  policy {
    name = "sample-rest"
    type = "probabilistic"
    probabilistic { sampling_percentage = 5 }
  }
  output {
    traces = [otelcol.exporter.otlp.cloud.input]
  }
}

Key Metrics for Cost Monitoring

# Active metric series (billed unit for metrics)
grafana_cloud_metrics_active_series

# Series by label (find high-cardinality sources)
topk(20, count by (__name__) ({__name__=~".+"}))

# Log bytes ingested per stream
sum(increase(loki_ingester_chunk_size_bytes_sum[24h])) by (namespace, app)

# Trace spans ingested
rate(tempo_distributor_spans_received_total[5m])

Optimization Checklist

Run Adaptive Metrics recommendations — typically reduces series 40-60%
Drop health/readiness probe logs in Alloy pipeline
Set sampling rate for traces (5-10% is typical for most workloads)
Review top-N high-cardinality metrics: topk(20, count by (__name__))
Add cost attribution labels (team, project) to all Alloy configs
Set usage alerts at 80% of quota
Review and clean up unused dashboards and data sources (they don't reduce cost but indicate stale collection)
Use recording rules to pre-aggregate expensive PromQL queries

Understanding Grafana Cloud Pricing

Signal	Billing Unit
Metrics	Active series (unique label combinations)
Logs	Bytes ingested
Traces	Spans ingested
Profiles	Bytes ingested
Synthetic Monitoring	Check executions
k6	VUh (Virtual User hours)

cost-management

Popularity

Invocation

Context Preview

SKILL.md

cost-management

Popularity

Invocation

Context Preview

SKILL.md

Grafana Cloud Cost Management

Cost Management & Billing Application

Cost Attribution by Label

Usage Alerts

Adaptive Metrics (Reduce Cardinality)

Adaptive Logs (Reduce Log Volume)

Adaptive Traces (Reduce Trace Volume)

Key Metrics for Cost Monitoring

Optimization Checklist

Understanding Grafana Cloud Pricing

Similar Skills

Grafana Cloud Cost Management

Cost Management & Billing Application

Cost Attribution by Label

Usage Alerts

Adaptive Metrics (Reduce Cardinality)

Adaptive Logs (Reduce Log Volume)

Adaptive Traces (Reduce Trace Volume)

Key Metrics for Cost Monitoring

Optimization Checklist

Understanding Grafana Cloud Pricing

Similar Skills