claude-langfuse-plugin

Name: claude-langfuse-plugin
Author: jbaham2

Expert system for Langfuse setup, observability, prompt management, evaluation, and monitoring. Bundles distilled knowledge, skills, agents, commands, and hooks.

npx claudepluginhub jbaham2/claude-langfuse-plugin --plugin claude-langfuse-plugin

Popularity

Stars: Med: 0 · Avg: 285

Installs: Med: 0 · Avg: 1

What's Inside

Slash Commands (6)

Lf Deploy

/lf-deploy

Plan or operate a self-hosted Langfuse deployment — sizing, scaling, backups, upgrades, security.

Lf Eval

/lf-eval

Design and set up a Langfuse evaluation — method choice, scores, datasets/experiments, online/offline.

Lf Health Check

/lf-health-check

Check a Langfuse connection/project is healthy — auth, ingestion, and recent activity.

Lf Monitor

/lf-monitor

Build or query Langfuse monitoring — dashboards, metrics (cost/latency/quality/volume), alerting.

Lf Setup

/lf-setup

Guide Langfuse onboarding and setup — deployment choice, keys, first trace, prod-readiness.

Lf Trace Review

/lf-trace-review

Reviews traces to find, classify, and quantify failure modes.

Agents (3)

eval-designer

/eval-designer

Designs a complete Langfuse evaluation strategy for an LLM application — choosing methods, defining scores, shaping datasets/experiments, and planning online + offline evaluation. Use when the user asks to "design an eval strategy", "figure out how to evaluate my agent/RAG/chatbot", "what should I measure and how", or wants a structured evaluation plan rather than ad-hoc scoring. Explores the codebase when available to ground the plan in the actual application.

setup-doctor

/setup-doctor

Diagnoses why a Langfuse setup isn't working — traces not appearing, auth/connection errors, region/key mismatches, SDK init problems, flush-on-exit issues. Use when the user says "my traces aren't showing up", "Langfuse auth is failing", "can't connect to Langfuse", "is my Langfuse setup working", or a first-trace check fails. Read-only: it diagnoses and recommends, never mutates data.

trace-reviewer

/trace-reviewer

Reviews Langfuse traces/observations at scale to find, classify, and quantify failure modes, then recommends fixes and regression test cases. Use when the user says "review my traces", "what's going wrong in production", "analyze failures in my Langfuse traces", "find error patterns", or wants systematic error analysis rather than reading traces one by one. Read-only: it analyzes and recommends, never mutates data.

Skills (5)

langfuse-deployment

/langfuse-deployment

Operating a self-hosted Langfuse deployment — architecture, sizing, scaling, backups, upgrades, and security. Use whenever the user is running or planning to run Langfuse on their own infrastructure: "operate / run self-hosted Langfuse", "deploy Langfuse on Kubernetes / Docker / AWS / GCP / Azure", "Langfuse sizing / resource requirements", "scale Langfuse / ingestion throughput", "back up Langfuse", "upgrade Langfuse / background migrations", "Langfuse SSO / encryption / VPC / air-gapped", or "Langfuse production deployment". Owns HOW to run self-hosted Langfuse well; the self-host-vs-Cloud and tier decision lives in the `langfuse-setup` skill, and exact configs in live docs.

langfuse-evaluation

/langfuse-evaluation

Designs and runs LLM evaluation with Langfuse — the strategy and workflow layer for scoring quality, building datasets, and running experiments. Use whenever the user is evaluating LLM output quality with Langfuse: "evaluate my LLM app", "which eval method should I use", "set up LLM-as-a-judge", "create a dataset / run an experiment", "score my traces", "offline vs online evaluation", "test prompt changes before deploying", "build a regression test set", or interpreting experiment results. Owns eval STRATEGY and the datasets/experiments/scores workflow; defers judge calibration and CI/CD experiment code to the vendored `langfuse` skill, and exact SDK code to live docs.

langfuse-monitoring

/langfuse-monitoring

Monitors and analyzes LLM application data already in Langfuse — dashboards, metrics, and alerting for cost, latency, quality, and volume. Use whenever the user wants to observe or report on production Langfuse data: "monitor my LLM app", "build a Langfuse dashboard", "track cost / latency / quality over time", "Langfuse metrics API", "score analytics", "set up a spend alert", "alert me when costs spike", "dashboard for production monitoring", or interpreting usage/cost/quality trends. Owns operating-the-data (dashboards/metrics/alerting); defers instrumentation to the vendored `langfuse` skill and score/evaluator design to the `langfuse-evaluation` skill.

langfuse-setup

/langfuse-setup

Orchestrates Langfuse adoption decisions and production-readiness — the planning the official `langfuse` skill doesn't cover. Use whenever the user is deciding HOW to adopt Langfuse or whether their setup is ready: "set up Langfuse", "Langfuse Cloud or self-host", "which Langfuse region", "configure Langfuse keys/env", "is my Langfuse setup production ready", "Langfuse prod checklist", "my traces aren't showing up", or planning a Langfuse rollout. Defers instrumentation CODE to the vendored `langfuse` skill — this skill owns the decisions, order, and verification around it.

langfuse

/langfuse

Interact with Langfuse and access its documentation. Use when needing to (1) query or modify Langfuse data programmatically via the CLI — traces, prompts, datasets, scores, sessions, and any other API resource, (2) look up Langfuse documentation, concepts, integration guides, or SDK usage, or (3) understand how any Langfuse feature works. This skill covers CLI-based API access (via npx) and multiple documentation retrieval methods.

Hooks (1)

Event Hooks

File writes

1 hook across 1 event

MCP Servers (2)

langfuse-docs

External

langfuse-data-platform

External

Stats

Version: 0.2.0

Language: Python

Stars: 0

Maintenance: Good

License: MIT

Last Commit: Jun 16, 2026

Added: Jun 17, 2026

Available In

claude-langfuse

Safety Signals

Caution

Modifies files

Hook triggers on file write and edit operations

External network access

Connects to servers outside your machine

README

Claude Langfuse Plugin

A Claude Code plugin that turns Claude into an expert at Langfuse — instrumentation, setup, evaluation, monitoring, and self-hosting — for Langfuse Cloud and self-hosted deployments, across the Python and JS/TS SDKs and 50+ framework integrations.

It bundles 5 skills, 6 commands, 3 agents, 1 safety hook, and both Langfuse MCP servers, designed on one principle: distill durable judgment, fetch the facts live. The plugin carries the decisions and workflows (which eval method when, how to size a deployment, how to read an experiment); it fetches exact, version-sensitive code from the live Langfuse docs at runtime so it never goes stale.

Features

Skills (auto-activate when relevant)

Skill	What it owns
`langfuse` (vendored, official)	Instrumentation code, the `langfuse-cli`, live-docs access, prompt migration, judge calibration, CI/CD experiment gates, error analysis, SDK upgrades
`langfuse-setup`	Adoption decisions (Cloud vs self-host, region), onboarding sequence, first-trace verification, production-readiness checklist
`langfuse-evaluation`	Eval strategy — methods, scores model, datasets/experiments, LLM-as-a-judge, code evaluators, human annotation, RAG/agent/multi-turn/external-pipeline evals, experiment interpretation
`langfuse-monitoring`	Dashboards, the Metrics API, score analytics, alerting (Spend-Alerts vs app cost)
`langfuse-deployment`	Self-hosting: architecture & sizing, scaling, backups/upgrades/migrations, security & SSO

Commands (you invoke explicitly)

Command	Does
`/lf-setup`	Guides onboarding — deployment choice, keys, first trace, prod-readiness
`/lf-eval`	Designs/sets up an evaluation — method, scores, datasets/experiments, online/offline
`/lf-monitor`	Builds or queries monitoring — dashboards, metrics, alerting
`/lf-deploy`	Plans/operates a self-hosted deployment — sizing, scaling, backups, security
`/lf-trace-review`	Reviews traces to find, classify, and quantify failure modes
`/lf-health-check`	Checks a connection/project: auth, ingestion, recent activity

Agents (autonomous, delegated tasks)

Agent	Does	Access
`eval-designer`	Explores your codebase and produces a concrete, app-specific evaluation plan	read + reasoning
`setup-doctor`	Diagnoses why a setup isn't working (traces missing, auth, region mismatch, flush)	read-only
`trace-reviewer`	Builds a quantified failure taxonomy from your traces and proposes fixes + regression cases	read-only

Hook

A non-blocking PostToolUse check that warns if you accidentally hardcode a Langfuse secret key (sk-lf-…) in an edited file. High-signal, never blocks your edit.
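
To make the idea behind the hook concrete, here is a minimal Python sketch of a non-blocking secret-key check. It is not the plugin's actual hook code. The function names, the regular expression, and the assumption that a real key is `sk-lf-` followed by at least eight hex digits or hyphens are all illustrative. The check only returns warnings and never raises, mirroring the "never blocks your edit" behavior described above.

```python
import re
from typing import List, Tuple

# Assumption: a real secret key looks like "sk-lf-" plus a UUID-like body.
# The plugin's real hook may use a different pattern.
SECRET_KEY_PATTERN = re.compile(r"sk-lf-[0-9a-fA-F-]{8,}")


def find_hardcoded_secret_keys(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, redacted_key) for each suspected secret key."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in SECRET_KEY_PATTERN.finditer(line):
            key = match.group(0)
            # Keep only a short prefix so the warning does not leak the key.
            findings.append((line_number, key[:10] + "..."))
    return findings


def warn_on_secret_keys(path: str, text: str) -> List[str]:
    """Build warning messages; the caller decides how to show them."""
    return [
        f"warning: {path}:{line_number}: "
        f"possible hardcoded Langfuse secret key ({redacted})"
        for line_number, redacted in find_hardcoded_secret_keys(text)
    ]
```

Placeholders such as `sk-lf-...` do not match this pattern, so templates and documentation snippets are not flagged.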

MCP servers (`.mcp.json`)

langfuse-docs — public docs search/retrieval (no auth). 3 tools.
langfuse-data-platform — authenticated access to your project's prompts, traces, observations, scores, datasets, evaluators, metrics, and getHealth. 61 tools.

Installation

1. Prerequisites

Claude Code installed.
A Langfuse project (Cloud sign-up at cloud.langfuse.com, or self-hosted) and a project API key pair (pk-lf-… / sk-lf-…) from Project Settings → API Keys.

2. Install the plugin

Local (development / trying it out):

git clone https://github.com/jbaham2/claude-langfuse-plugin
claude --plugin-dir /path/to/claude-langfuse-plugin

Via a marketplace (shareable install):

/plugin marketplace add jbaham2/claude-langfuse-plugin
/plugin install claude-langfuse-plugin

(Exact marketplace commands can vary by Claude Code version — see the Claude Code plugin docs.)

3. Set credentials (never paste keys into chat)

Copy the template and fill it in:

cp skills/langfuse-setup/assets/.env.example .env

Set in your shell or .env:

export LANGFUSE_PUBLIC_KEY="pk-lf-..."
export LANGFUSE_SECRET_KEY="sk-lf-..."
export LANGFUSE_BASE_URL="https://cloud.langfuse.com"   # your region or self-host URL
export LANGFUSE_HOST="$LANGFUSE_BASE_URL"               # some tools read HOST instead
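
Before launching Claude Code, it can help to confirm that these variables are set and that the two keys have not been swapped. The sketch below is a hypothetical helper, not part of the plugin: `check_langfuse_env` is an invented name, and the only facts it relies on are the variable names and the `pk-lf-` / `sk-lf-` prefixes shown above. It reports problems without ever printing a key value.

```python
from typing import List, Mapping

# Key prefixes as shown in the prerequisites (pk-lf-... / sk-lf-...).
REQUIRED_PREFIXES = {
    "LANGFUSE_PUBLIC_KEY": "pk-lf-",
    "LANGFUSE_SECRET_KEY": "sk-lf-",
}


def check_langfuse_env(env: Mapping[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, prefix in REQUIRED_PREFIXES.items():
        value = env.get(name, "")
        if not value:
            problems.append(f"{name} is not set")
        elif not value.startswith(prefix):
            # Report the expected prefix only, never the value itself.
            problems.append(f"{name} does not start with {prefix}")
    # Some tools read LANGFUSE_HOST instead of LANGFUSE_BASE_URL.
    base_url = env.get("LANGFUSE_BASE_URL") or env.get("LANGFUSE_HOST") or ""
    if not base_url:
        problems.append("neither LANGFUSE_BASE_URL nor LANGFUSE_HOST is set")
    elif not base_url.startswith(("http://", "https://")):
        problems.append("base URL should start with http:// or https://")
    return problems
```

Calling `check_langfuse_env(os.environ)` after `import os` runs the check against the current shell environment.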

4. Enable the data-platform MCP

The authenticated MCP uses Basic Auth. Generate the token and export it before launching Claude Code (the value is interpolated into .mcp.json at load):

# GNU base64 wraps output at 76 columns; tr removes the embedded newline.
export LANGFUSE_MCP_AUTH="$(printf '%s:%s' "$LANGFUSE_PUBLIC_KEY" "$LANGFUSE_SECRET_KEY" | base64 | tr -d '\n')"
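
For readers who want to see what this token contains, the following Python sketch builds the same value: the Base64 encoding of `public_key:secret_key`, which is the credential format used by HTTP Basic Auth. The function name `basic_auth_token` is illustrative. How `.mcp.json` consumes `LANGFUSE_MCP_AUTH` is not shown in this excerpt, so treat the sketch as a way to verify the token, not as the plugin's implementation.

```python
import base64


def basic_auth_token(public_key: str, secret_key: str) -> str:
    """Return base64("public_key:secret_key") as a single-line ASCII string."""
    raw = f"{public_key}:{secret_key}".encode("utf-8")
    # b64encode never inserts line breaks, so the token is always one line.
    return base64.b64encode(raw).decode("ascii")


def decode_basic_auth_token(token: str) -> str:
    """Decode a token back to "public_key:secret_key" for local checking."""
    return base64.b64decode(token).decode("utf-8")
```

Decoding the exported value locally and comparing it with your key pair is a quick check that the token was not truncated or split across lines. Avoid pasting the token or the decoded result into chat.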

Set the right region endpoint in .mcp.json (the url of langfuse-data-platform):

View full README on GitHub
