Run and manage evaluations in Adaline to test prompt quality at scale. Use when creating evaluation runs, polling status, analyzing results, or cancelling runs.
Slash command: /skills:adaline-evaluations
Evaluations run a prompt against a dataset and score each row with one evaluator. They are asynchronous: create a run, poll its status, then read paginated results.
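Key terms:
- runId: the identifier returned when a run is created; pass it as the {evaluationId} path parameter in follow-up calls
- evaluatorId: the single evaluator that scores each row
- Grade: pass, fail, or unknown
- Status lifecycle:

    queued -> running -> completed
                      -> failed
                      -> cancelling -> cancelled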
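The status lifecycle above lends itself to a simple polling loop: keep fetching the run until it reaches a terminal state. The following is a minimal sketch, not the SDK's own helper. `wait_for_run` and the `fetch_status` callable are hypothetical names, and the sketch assumes the status call yields one of the status strings listed above.

```python
import time
from typing import Callable

# Terminal states taken from the lifecycle above; "cancelling" is transitional.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def wait_for_run(
    fetch_status: Callable[[], str],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the run reaches a terminal state and return that state.

    fetch_status is a hypothetical callable that returns the current status
    string, for example by calling the status endpoint.
    """
    waited = 0.0
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"run still {status!r} after {timeout_s} seconds")
        sleep(interval_s)
        waited += interval_s
```

Passing `sleep` as a parameter keeps the loop testable without real delays; the interval and timeout defaults are arbitrary and should be tuned to the size of the dataset.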
Set these environment variables when credentials are available:
- ADALINE_API_KEY: workspace API key from Admin > API Keys
- ADALINE_PROMPT_ID: prompt to evaluate
- ADALINE_EVALUATOR_ID: evaluator to run
- ADALINE_DATASET_ID: optional dataset override

Base URL: https://api.adaline.ai/v2
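A script can read these variables once and fail early when one is missing. The sketch below is illustrative only: `load_config` is a hypothetical helper, and treating the first three variables as required is an assumption. The `Authorization: Bearer` header matches the curl examples in this skill.

```python
import os
from typing import Mapping, Optional

BASE_URL = "https://api.adaline.ai/v2"  # base URL stated in this skill

# Assumption: these three are required; ADALINE_DATASET_ID is optional.
REQUIRED = ("ADALINE_API_KEY", "ADALINE_PROMPT_ID", "ADALINE_EVALUATOR_ID")


def load_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect settings from the environment (or from a mapping, for tests)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "base_url": BASE_URL,
        "headers": {
            "Authorization": f"Bearer {env['ADALINE_API_KEY']}",
            "Content-Type": "application/json",
        },
        "prompt_id": env["ADALINE_PROMPT_ID"],
        "evaluator_id": env["ADALINE_EVALUATOR_ID"],
        # None means "no dataset override".
        "dataset_id": env.get("ADALINE_DATASET_ID") or None,
    }
```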
| Symptom | First Fix |
|---|---|
| Create body rejected | Use singular evaluatorId, not the old plural evaluator field |
| Follow-up GET returns 404 | Use response runId as the {evaluationId} path parameter |
| Results missing row data | Add expand=row on the results endpoint |
| Pagination skips results | Use pagination.nextCursor, not page numbers |
| Python example returns coroutine | Await SDK methods inside an asyncio event loop |
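The pagination row in the table can be illustrated with a cursor loop. This is a sketch under stated assumptions: `iter_results` and `fetch_page` are hypothetical names, the key that holds each page's rows (`data` here) is a guess, and the loop assumes that a missing or empty `pagination.nextCursor` marks the last page. How the cursor is sent back to the API is left to `fetch_page`, since the parameter name is not given here.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_results(
    fetch_page: Callable[[Optional[str]], Dict[str, Any]],
    items_key: str = "data",  # assumed name of the list of rows in each page
) -> Iterator[Any]:
    """Yield every result by following pagination.nextCursor.

    fetch_page(cursor) is a hypothetical callable that requests one page;
    it receives None for the first page and the previous nextCursor after that.
    """
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page.get(items_key, [])
        cursor = (page.get("pagination") or {}).get("nextCursor")
        if not cursor:
            return  # assumption: no cursor means there are no more pages
```

Following the cursor returned by the server, rather than computing page numbers, is what avoids skipped results when rows are added between requests.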
Create an evaluation run:

curl -X POST "https://api.adaline.ai/v2/prompts/$ADALINE_PROMPT_ID/evaluations" \
  -H "Authorization: Bearer $ADALINE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "evaluatorId": "evaluator_abc123",
    "datasetId": "dataset_abc123"
  }'
The response returns runId. Use that value as evaluationId in status/results/cancel calls.
Check run status:

curl "https://api.adaline.ai/v2/prompts/$ADALINE_PROMPT_ID/evaluations/$RUN_ID" \
  -H "Authorization: Bearer $ADALINE_API_KEY"

List failed results with row data:

curl "https://api.adaline.ai/v2/prompts/$ADALINE_PROMPT_ID/evaluations/$RUN_ID/results?grade=fail&expand=row&limit=50" \
  -H "Authorization: Bearer $ADALINE_API_KEY"

Cancel a run:

curl -X POST "https://api.adaline.ai/v2/prompts/$ADALINE_PROMPT_ID/evaluations/$RUN_ID/cancel" \
  -H "Authorization: Bearer $ADALINE_API_KEY"
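The four curl calls share one URL shape, which a small helper can make explicit. The sketch below only builds URLs and sends nothing; `evaluation_url` is a hypothetical name, and the paths and query parameters are the ones used in the curl examples.

```python
from typing import Optional
from urllib.parse import quote, urlencode

BASE_URL = "https://api.adaline.ai/v2"  # base URL stated in this skill


def evaluation_url(
    prompt_id: str,
    run_id: Optional[str] = None,
    action: Optional[str] = None,  # "results" or "cancel", as in the curl calls
    **query: object,
) -> str:
    """Build the URL for the create, status, results, or cancel call."""
    if action is not None and run_id is None:
        raise ValueError("an action requires a run_id")
    parts = [BASE_URL, "prompts", quote(prompt_id, safe=""), "evaluations"]
    if run_id is not None:
        # The runId from the create response fills the {evaluationId} slot.
        parts.append(quote(run_id, safe=""))
    if action is not None:
        parts.append(action)
    url = "/".join(parts)
    if query:
        url += "?" + urlencode(query)
    return url
```

For example, `evaluation_url(prompt_id, run_id, "results", grade="fail", expand="row", limit=50)` reproduces the results URL from the curl example.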
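TypeScript:

// Assumes an initialized `adaline` client, with promptId, evaluatorId, and datasetId in scope.

// Create the run; the response carries runId.
const run = await adaline.prompts.evaluations.create({
  promptId,
  evaluation: { evaluatorId, datasetId },
});

// Check status, passing runId as evaluationId.
const status = await adaline.prompts.evaluations.get({
  promptId,
  evaluationId: run.runId,
});

// List failed results, including the row data.
const results = await adaline.prompts.evaluations.results.list({
  promptId,
  evaluationId: run.runId,
  grade: 'fail',
  expand: 'row',
});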
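Python:

# Assumes an initialized `adaline` client and CreateEvaluationRequest imported from the SDK.
# These calls must be awaited inside a coroutine running on an asyncio event loop.

# Create the run; the response carries run_id.
run = await adaline.prompts.evaluations.create(
    prompt_id=prompt_id,
    evaluation=CreateEvaluationRequest(evaluator_id=evaluator_id, dataset_id=dataset_id),
)

# Check status, passing run_id as evaluation_id.
status = await adaline.prompts.evaluations.get(
    prompt_id=prompt_id,
    evaluation_id=run.run_id,
)

# List failed results, including the row data.
results = await adaline.prompts.evaluations.results.list(
    prompt_id=prompt_id,
    evaluation_id=run.run_id,
    grade="fail",
    expand="row",
)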
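Because the Python calls use `await`, they need to run inside a coroutine driven by an event loop; calling them bare returns a coroutine object instead of a result. The sketch below shows that pattern with `asyncio.run`. `FakeEvaluations` is a stand-in written for this example, since the SDK's import path and client constructor are not shown here, and its return values are made up.

```python
import asyncio
from types import SimpleNamespace


class FakeEvaluations:
    """Stand-in for the SDK's evaluations client (hypothetical, for illustration)."""

    async def create(self, prompt_id: str, evaluation: dict) -> SimpleNamespace:
        return SimpleNamespace(run_id="run_demo")

    async def get(self, prompt_id: str, evaluation_id: str) -> SimpleNamespace:
        return SimpleNamespace(status="completed")


async def run_evaluation(evaluations, prompt_id: str, evaluator_id: str) -> str:
    """Create a run, then fetch its status, awaiting each call."""
    run = await evaluations.create(
        prompt_id=prompt_id,
        evaluation={"evaluator_id": evaluator_id},
    )
    status = await evaluations.get(prompt_id=prompt_id, evaluation_id=run.run_id)
    return status.status


def main() -> str:
    # asyncio.run starts an event loop, runs the coroutine, and closes the loop.
    return asyncio.run(run_evaluation(FakeEvaluations(), "prompt_demo", "evaluator_demo"))


if __name__ == "__main__":
    print(main())
```

With the real SDK, the client object would replace `FakeEvaluations()`, and the body of `run_evaluation` would hold the calls shown above.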
- Store runId in CI or job metadata so later steps can poll and fetch results.
- To triage failures, filter results with grade=fail&expand=row.

See references/api.md for request/response schemas and curl examples.
npx claudepluginhub adaline/skills --plugin skills