Skill

run-lightdash-evals

From lightdash-agentops

Orchestrate evaluation runs and test case management for Lightdash agents.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/lightdash-agentops:run-lightdash-evals

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Skill for managing and executing evaluations for Lightdash AI agents.

SKILL.md

51 lines · ~504 tokens

Stats

Language: Shell

Parent stars: 3

Maintenance: Good

Last Commit: Mar 28, 2026

Run Lightdash Evaluations

Skill for managing and executing evaluations for Lightdash AI agents.

Purpose

Enables the "Eval-Driven Development" workflow by providing tools to create evaluation suites, append test cases (prompts), execute evaluation runs, and analyze the results.

Tools

Wraps the following MCP tools from the lightdash-tools server:

ldt__list_agent_evaluations
ldt__get_agent_evaluation
ldt__create_agent_evaluation
ldt__update_agent_evaluation
ldt__append_agent_evaluation_prompts
ldt__run_agent_evaluation
ldt__list_agent_evaluation_runs
ldt__get_agent_evaluation_run_results
ldt__delete_agent_evaluation

Safety Mode Compliance

Read Tools: ldt__list_agent_evaluations, ldt__get_agent_evaluation, ldt__list_agent_evaluation_runs, ldt__get_agent_evaluation_run_results.
Write-Safe Tools: ldt__create_agent_evaluation, ldt__update_agent_evaluation, ldt__append_agent_evaluation_prompts, ldt__run_agent_evaluation.
Write-Destructive Tools: ldt__delete_agent_evaluation.
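
The classification above can be expressed as a lookup table. The following is a minimal sketch, not part of the skill itself: the `Safety` enum, the `TOOL_SAFETY` table, and `is_allowed` are hypothetical names, and the rule that a mode permits every tool at or below its own level is an assumption about how safety modes are meant to work.

```python
from enum import IntEnum


class Safety(IntEnum):
    """Hypothetical ordering of the three safety classes, least to most risky."""
    READ = 0
    WRITE_SAFE = 1
    WRITE_DESTRUCTIVE = 2


# Tool names and their classes are taken from the lists above.
TOOL_SAFETY = {
    "ldt__list_agent_evaluations": Safety.READ,
    "ldt__get_agent_evaluation": Safety.READ,
    "ldt__list_agent_evaluation_runs": Safety.READ,
    "ldt__get_agent_evaluation_run_results": Safety.READ,
    "ldt__create_agent_evaluation": Safety.WRITE_SAFE,
    "ldt__update_agent_evaluation": Safety.WRITE_SAFE,
    "ldt__append_agent_evaluation_prompts": Safety.WRITE_SAFE,
    "ldt__run_agent_evaluation": Safety.WRITE_SAFE,
    "ldt__delete_agent_evaluation": Safety.WRITE_DESTRUCTIVE,
}


def is_allowed(tool: str, mode: Safety) -> bool:
    """Assumed rule: a mode permits tools at or below its level; unknown tools are denied."""
    level = TOOL_SAFETY.get(tool)
    return level is not None and level <= mode
```

Under this assumed rule, `is_allowed("ldt__delete_agent_evaluation", Safety.WRITE_SAFE)` is false, so a delete would only pass in a mode that explicitly permits destructive writes.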

Behavior

Test Case Management:
- Use ldt__append_agent_evaluation_prompts to add 20-50 diverse test cases representing real-world user queries.
- Organize evaluations by agent or project to maintain clarity.
Execution:
- Trigger a run using ldt__run_agent_evaluation.
- Monitor the progress using ldt__list_agent_evaluation_runs.
Analysis:
- Once a run is complete, fetch the detailed results via ldt__get_agent_evaluation_run_results.
- Identify patterns in failures (e.g., specific dimensions or metrics that the agent struggles with).
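
The test-case and analysis steps above can be supported by two small local checks. This is an illustrative sketch only: it does not call the MCP tools, and the result shape it assumes (a `passed` flag and a `tags` list per result) is hypothetical, since the actual payload of ldt__get_agent_evaluation_run_results is not described here.

```python
from collections import Counter

# Range taken from the guidance above: 20-50 test cases per suite.
MIN_PROMPTS, MAX_PROMPTS = 20, 50


def check_prompt_set(prompts):
    """Return a list of problems with a prompt set; an empty list means it looks fine."""
    problems = []
    cleaned = [p.strip() for p in prompts]
    if not MIN_PROMPTS <= len(cleaned) <= MAX_PROMPTS:
        problems.append(
            f"expected {MIN_PROMPTS}-{MAX_PROMPTS} prompts, got {len(cleaned)}"
        )
    if any(not p for p in cleaned):
        problems.append("empty prompt")
    # Case-insensitive duplicates reduce the diversity of the suite.
    counts = Counter(p.lower() for p in cleaned if p)
    dupes = sorted(p for p, n in counts.items() if n > 1)
    if dupes:
        problems.append(f"duplicate prompts: {dupes}")
    return problems


def failure_hotspots(results):
    """Count failed results per tag, most frequent first.

    Assumes each result is a dict with a boolean 'passed' and an optional
    'tags' list (for example, the dimensions or metrics a prompt touches).
    """
    counts = Counter()
    for result in results:
        if not result["passed"]:
            counts.update(result.get("tags", []))
    return counts.most_common()
```

Tagging each prompt with the dimensions or metrics it exercises is one way to make failure patterns countable; any other grouping key would work the same way.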

Rules

ALWAYS create or update an evaluation suite before deploying major changes to an agent's prompt.
NEVER delete an evaluation suite without explicit confirmation.
Use the agent-tuner sub-agent to automatically process evaluation results for improvement.
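
One way to make "explicit confirmation" concrete is to require the user to type the suite name before a delete proceeds. The sketch below is an assumption about how such a guard could look; `confirm_delete` and `guarded_delete` are hypothetical helpers, and `delete_fn` stands in for whatever actually invokes ldt__delete_agent_evaluation.

```python
def confirm_delete(suite_name: str, typed_confirmation: str) -> bool:
    """True only when the user typed the exact suite name (surrounding whitespace ignored)."""
    return bool(suite_name) and typed_confirmation.strip() == suite_name


def guarded_delete(suite_name: str, typed_confirmation: str, delete_fn) -> bool:
    """Call delete_fn(suite_name) only after confirmation; return whether it ran.

    delete_fn is a placeholder for the real delete call, which is not shown here.
    """
    if not confirm_delete(suite_name, typed_confirmation):
        return False
    delete_fn(suite_name)
    return True
```

Matching on the full name rather than a yes/no answer makes an accidental confirmation less likely when several suites have similar names.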
