Monitors and analyzes LLM application data already in Langfuse — dashboards, metrics, and alerting for cost, latency, quality, and volume. Use whenever the user wants to observe or report on production Langfuse data: "monitor my LLM app", "build a Langfuse dashboard", "track cost / latency / quality over time", "Langfuse metrics API", "score analytics", "set up a spend alert", "alert me when costs spike", "dashboard for production monitoring", or interpreting usage/cost/quality trends. Owns operating-the-data (dashboards/metrics/alerting); defers instrumentation to the vendored `langfuse` skill and score/evaluator design to the `langfuse-evaluation` skill.
Slash command: `/claude-langfuse-plugin:langfuse-monitoring`
This skill covers *operating the data* once it's in Langfuse: visualizing it (dashboards),
extracting it (metrics API), analyzing eval scores (score analytics), and alerting. It does not
cover emitting the data (that's instrumentation — vendored langfuse skill) or designing the scores
(that's the langfuse-evaluation skill).
For the exact dashboard UI or metrics API schema, fetch the live docs by appending .md to the page URL (e.g.
https://langfuse.com/docs/metrics/features/metrics-api.md).

- If the data is not in Langfuse yet, that is instrumentation (vendored langfuse skill), not here.
- Identify the metric family (cost, latency, quality, or volume) and the dimensions to slice by (trace name/feature, user, model, tags, release). See references/dashboards.md for the metric/dimension model.
- Route to the matching reference: dashboards in references/dashboards.md, programmatic metrics in references/metrics-api.md (use v2), eval-score analysis in references/score-analytics.md.
- For a new production app, stand up the three durable dashboards (production health, cost optimization, quality/UX) from references/dashboards.md.
- For alerting, disambiguate first: Spend Alerts = your Langfuse Cloud bill, not app cost. For app-level cost/latency/quality alerts, build a Metrics-API-driven check. See references/alerting.md.
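As a rough illustration of an application-level check, the sketch below compares already-fetched metric rows against thresholds. Everything here is hypothetical: the `Threshold` class, the `find_breaches` helper, and the row keys (`name`, `total_cost`, `p95_latency_ms`) are illustrative names, not the Metrics API response schema. Fetching the rows and delivering the alert are left out.

```python
from dataclasses import dataclass
from typing import Iterable, List, Mapping


@dataclass(frozen=True)
class Threshold:
    """Hypothetical alert rule: fire when `metric` is above `limit`."""
    metric: str
    limit: float


def find_breaches(rows: Iterable[Mapping], thresholds: Iterable[Threshold]) -> List[str]:
    """Return one message per (row, rule) pair whose value exceeds the limit.

    `rows` stands in for per-feature aggregates you have already pulled
    from the Metrics API; the key names are assumptions for this sketch.
    """
    rules = list(thresholds)
    breaches = []
    for row in rows:
        for rule in rules:
            value = row.get(rule.metric)
            # Skip rows that do not carry this metric at all.
            if value is not None and float(value) > rule.limit:
                label = row.get("name", "unknown")
                breaches.append(f"{label}: {rule.metric}={value} exceeds {rule.limit}")
    return breaches


# Made-up sample data, only to exercise the function.
rows = [
    {"name": "chat", "total_cost": 12.5, "p95_latency_ms": 900},
    {"name": "summarize", "total_cost": 3.0, "p95_latency_ms": 4200},
]
rules = [Threshold("total_cost", 10.0), Threshold("p95_latency_ms", 3000)]

for message in find_breaches(rows, rules):
    print(message)
```

Keeping the comparison separate from data fetching and notification makes the rule logic easy to test on its own; a scheduled job could call it with fresh rows and forward any messages to whatever channel you use.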
Reference files:

- references/dashboards.md: metrics & dimensions, curated vs custom dashboards, the widget model, and the three standard dashboards to build (health / cost / quality).
- references/metrics-api.md: programmatic metrics (use v2), the query model (view/metrics/dimensions/filters/timeDimension), and v1→v2 migration gotchas.
- references/score-analytics.md: zero-config eval-score analysis, covering distributions, trends, and judge-vs-human agreement metrics (MAE/RMSE, Cohen's Kappa/F1).
- references/alerting.md: Spend Alerts (Cloud billing) vs application-level alerting (Metrics API + your own check); how to alert on cost/latency/quality.

| Need | Where |
|---|---|
| Dashboards, metrics extraction, score analytics, alerting | this skill |
| Emitting cost/latency/userId/tags on traces (instrumentation) | vendored langfuse skill |
| Designing the scores/evaluators being monitored | langfuse-evaluation skill |
| Onboarding-time spend-alert setup / production-readiness checklist | langfuse-setup skill |
| Formal judge calibration (vs lightweight score analytics) | vendored langfuse skill judge-calibration.md |
| Exact dashboard UI / metrics API schema | live docs (.md-append) |
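
The query model named for references/metrics-api.md (view / metrics / dimensions / filters / timeDimension) can be pictured as a single JSON object. The sketch below only assembles such an object. The five top-level keys come from this skill, but the helper name `build_metrics_query` and every value (`observations`, `totalCost`, `sum`, `providedModelName`, `day`) are assumptions for illustration; check the live docs for the actual v2 schema before relying on any of them.

```python
import json


def build_metrics_query(view, metrics, dimensions, filters, granularity):
    """Assemble a query object from the five parts named in this skill.

    The nested shapes (e.g. {"granularity": ...}) are assumed, not
    confirmed against the live metrics API schema.
    """
    return {
        "view": view,
        "metrics": metrics,
        "dimensions": dimensions,
        "filters": filters,
        "timeDimension": {"granularity": granularity},
    }


# Illustrative values only: daily cost per model.
query = build_metrics_query(
    view="observations",                                       # assumed view name
    metrics=[{"measure": "totalCost", "aggregation": "sum"}],  # assumed measure
    dimensions=[{"field": "providedModelName"}],               # assumed dimension
    filters=[],
    granularity="day",
)

print(json.dumps(query, indent=2))
```

Building the query as plain data keeps it easy to log, diff, and reuse between a dashboard export and an alerting check.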
npx claudepluginhub jbaham2/claude-langfuse-plugin --plugin claude-langfuse-plugin