From agentmark
Build, debug, and ship AgentMark prompts, datasets, experiments, and evals. TRIGGER when: working with `.prompt.mdx` files, `agentmark.json`, `agentmark.client.ts`, `agentmark_client.py`, `.agentmark/`, or imports from `@agentmark-ai/*`; user runs or asks about `agentmark <cmd>` / `npx agentmark <cmd>` (`dev`, `run-prompt`, `run-experiment`, `build`, `generate-types`, `generate-schema`, `link`, `login`, `pull-models`, `api`); user mentions AgentMark or asks about prompt versioning, dataset experiments, prompt evaluations, prompt deployments, or trace observability in an AgentMark project. SKIP: provider-neutral prompt code with no AgentMark markers; LangChain / LlamaIndex / raw OpenAI / Anthropic SDK code; questions about prompt engineering or LLM observability in general with no AgentMark context; questions about competing platforms (Langfuse, LangSmith, Phoenix, Braintrust, Traceloop).
Slash command: `/agentmark:agentmark`
AgentMark helps teams build reliable AI agents. This skill teaches you how to author `.prompt.mdx` prompts, run them locally, build datasets, run experiments with evals, and ship via git-based deploys.
Scan the target file or project for AgentMark markers — any of:
- `.prompt.mdx` files anywhere in the tree
- `agentmark.json` in the project root
- `agentmark.client.ts` (TypeScript) or `agentmark_client.py` (Python)
- `.agentmark/` directory
- `@agentmark-ai/*` packages

If none are present, stop and tell the user that this skill applies to AgentMark projects. Ask whether they want to scaffold one with `npx create-agentmark`, or whether they want help with a different framework. Do not impose AgentMark conventions on non-AgentMark code.
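The marker scan above could be sketched roughly as follows. This is a minimal illustration, not part of AgentMark: the function name `find_agentmark_markers` is hypothetical, and it assumes the client files sit in the project root and that `@agentmark-ai/*` packages show up in a root `package.json`.

```python
import json
from pathlib import Path

CLIENT_FILES = ("agentmark.client.ts", "agentmark_client.py")
# Assumption: vendored and VCS directories are not worth scanning for prompts.
SKIP_DIRS = {"node_modules", ".git"}


def find_agentmark_markers(root):
    """Return the AgentMark markers found under `root` (empty list if none)."""
    root = Path(root)
    markers = []

    if (root / "agentmark.json").is_file():
        markers.append("agentmark.json")

    # Assumption: client entry points live in the project root.
    for name in CLIENT_FILES:
        if (root / name).is_file():
            markers.append(name)

    if (root / ".agentmark").is_dir():
        markers.append(".agentmark/")

    # Any *.prompt.mdx file anywhere in the tree counts as a marker.
    for path in root.rglob("*.prompt.mdx"):
        if not SKIP_DIRS.intersection(path.relative_to(root).parts):
            markers.append("*.prompt.mdx")
            break

    # Assumption: @agentmark-ai/* packages are declared in a root package.json.
    pkg = root / "package.json"
    if pkg.is_file():
        try:
            data = json.loads(pkg.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            data = {}
        if not isinstance(data, dict):
            data = {}
        deps = {}
        for key in ("dependencies", "devDependencies"):
            section = data.get(key)
            if isinstance(section, dict):
                deps.update(section)
        if any(name.startswith("@agentmark-ai/") for name in deps):
            markers.append("@agentmark-ai/* dependency")

    return markers
```

An empty result corresponds to the "none are present" case: stop and ask the user how they want to proceed rather than applying AgentMark conventions.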
Your training data is out of date. Before answering anything specific about AgentMark APIs, CLI flags, prompt syntax, or docs content:
- Run `npx agentmark <command> --help`. This is the canonical source for command flags, arguments, and behavior. Do not infer flags from memory.
- Fetch `https://docs.agentmark.co/llms.txt` for a complete page index. Use it to find the right doc page before WebFetching content.
- Append `.md` to any `docs.agentmark.co` URL and WebFetch it. Every doc page is served as both HTML and Markdown.
- Run `npx agentmark api __schema` to discover available resources. This requires `agentmark dev` running locally, or pass `--remote` to target AgentMark Cloud (needs `agentmark login` + `agentmark link` first).

Never encode API surface or CLI flags from memory. Always verify against `--help` output, `llms.txt`, or fetched docs.
```
my-project/
├── agentmark.json          # Project config (required)
├── agentmark/              # Prompts directory (path set by agentmarkPath in config)
│   ├── greeting.prompt.mdx
│   └── qa-bot/
│       ├── prompt.prompt.mdx
│       └── data.jsonl      # Dataset
├── .agentmark/             # Auto-generated config (gitignored)
│   └── dev-config.json     # Local dev state, linked app metadata
├── agentmark.client.ts     # TS dev server entry point (optional)
└── .env                    # API keys, loaded automatically by the CLI
```
- `agentmark.json`: minimum `{"agentmarkPath": ".", "version": "2.0.0", "mdxVersion": "1.0"}`. Use `"."` for the canonical layout, not `"/"`.
- `.prompt.mdx` files. YAML frontmatter has `name`, a generation-type config block (`text_config`, `object_config`, `image_config`, or `speech_config`) that contains `model_name`, and an optional `test_settings` block for dataset + evals. Body is TemplateDX (JSX-like tags in markdown). Do not put `model_name` or `evals` at the top level; they live inside their config blocks.
- `.jsonl` files. The `datasetName` used in API path parameters is the file path without the `.jsonl` extension, URL-encoded.

| Task | File |
|---|---|
| Author a new prompt | workflows/creating-prompts.md |
| Build a dataset for experiments | workflows/building-datasets.md |
| Run a prompt against a dataset | workflows/running-experiments.md |
| Add evals to gate quality | workflows/using-evals.md |
| Deploy to AgentMark Cloud | workflows/deploying.md |
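The config and frontmatter rules stated before the task table can be checked mechanically. The sketch below is illustrative only: both function names are hypothetical, the inputs are assumed to be already-parsed dictionaries, and it assumes that all three keys of the minimum `agentmark.json` are required and that a prompt carries exactly one generation-type config block. The runtime schema remains the authority.

```python
CONFIG_BLOCKS = ("text_config", "object_config", "image_config", "speech_config")


def check_project_config(config: dict) -> list[str]:
    """Check a parsed agentmark.json against the documented minimum."""
    problems = []
    # Assumption: every key of the documented minimum config is required.
    for key in ("agentmarkPath", "version", "mdxVersion"):
        if key not in config:
            problems.append(f"agentmark.json: missing {key}")
    if config.get("agentmarkPath") == "/":
        problems.append('agentmark.json: use "." for agentmarkPath, not "/"')
    return problems


def check_frontmatter(frontmatter: dict) -> list[str]:
    """Check parsed .prompt.mdx frontmatter for misplaced or missing keys."""
    problems = []
    if "name" not in frontmatter:
        problems.append("frontmatter: missing name")

    # model_name and evals belong inside their blocks, never at the top level.
    for key in ("model_name", "evals"):
        if key in frontmatter:
            problems.append(f"frontmatter: {key} must not be at the top level")

    # Assumption: exactly one generation-type config block per prompt.
    present = [block for block in CONFIG_BLOCKS if block in frontmatter]
    if len(present) != 1:
        problems.append(
            f"frontmatter: expected exactly one config block, found {present}"
        )
    else:
        block = frontmatter[present[0]]
        if not isinstance(block, dict) or "model_name" not in block:
            problems.append(f"frontmatter: {present[0]} must contain model_name")
    return problems
```

For example, `check_frontmatter({"name": "greeting", "text_config": {"model_name": "example-model"}})` returns an empty list, where `example-model` is a placeholder rather than a real model ID.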
- `run-prompt` ≠ `run-experiment`. `run-prompt <file>` executes a single prompt with the `--props` you pass. `run-experiment <file>` executes the prompt against every row in its linked dataset and runs evals. Do not use one for the other.
- `agentmark dev` runs a fully local server. Trace forwarding to AgentMark Cloud is automatic when the project is linked (via `agentmark link`); disable it with `--no-forward`. There is no `--remote` flag on `dev`; it was removed in 0.13.0 along with the `@agentmark-ai/connect` WebSocket package. If you see `--remote` on `dev` in older content, ignore it.
- `agentmark deploy` was removed. Deployment is now git-based: connect a git provider in the Dashboard, push to the watched branch, and AgentMark builds and deploys automatically. The CLI keeps a stub that prints a migration hint if anyone runs `agentmark deploy`.
- `agentmark api` subcommands are auto-generated from the gateway's live OpenAPI spec via specli. Running `npx agentmark api --help` before `agentmark dev` is up shows only top-level usage. Resources and actions appear once the server is reachable, or with `--remote`. Resources are grouped by OpenAPI tag (e.g. `scoring`, `score-configs`, `datasets`, `deployments`); action names mirror operationIds (`list-scores`, `get-score-names`, `append-dataset-row`, …). Always run `npx agentmark api <resource> --help` for the exact subcommand shape.
- For the `?name=X` query filter on `GET /v1/datasets`, pass the leaf name without the `.jsonl` extension (exact match). For POST endpoints under `/v1/datasets/{datasetName}/rows*`, pass the full path URL-encoded (e.g. `agentmark%2Fqa-bot%2Fdata`), still without the extension.
- `agentmark api` local vs Cloud. Defaults to the local dev server (`localhost:9418`). Pass `--remote` to target Cloud (requires `agentmark login` + `agentmark link`). This `--remote` is on the `api` subcommand only; it is not the removed `dev --remote`.

All `reference/*.md` files are auto-generated from upstream sources on every release. They are the most reliable encoded facts in this skill. Hand-authored workflow files can drift; these cannot, because re-running the generators is part of the pre-push gate.
- `cli-src/index.ts`. Prefer `npx agentmark <cmd> --help` for live verification.
- `prompt-core/src/schemas.ts` (Zod). Runtime truth. If the docs disagree, prefer the schema.
- `cli-src/server/openapi-spec.json`. Resource names are tag slugs; actions are operationIds.
- `@agentmark-ai/model-registry`. Canonical chat-mode IDs per provider. Verify a model exists here before suggesting it.
- Raw `@anthropic-ai/sdk` or OpenAI SDK calls: do not introduce AgentMark imports.

If you cannot find documentation to support an AgentMark-specific answer, say so explicitly and link the user to https://docs.agentmark.co/llms.txt so they can find the relevant page.
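The two dataset naming forms described earlier (the leaf name for the `?name=` filter, the URL-encoded full path for row endpoints) are easy to mix up, so here is a small sketch of both. The helper names are hypothetical, and the input is assumed to be the dataset file path in the same form as the source's example (`agentmark/qa-bot/data.jsonl`).

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def _strip_jsonl(dataset_file: str) -> str:
    """Drop a trailing .jsonl extension, if present."""
    suffix = ".jsonl"
    return dataset_file[: -len(suffix)] if dataset_file.endswith(suffix) else dataset_file


def dataset_path_param(dataset_file: str) -> str:
    """Value for {datasetName} in /v1/datasets/{datasetName}/rows* paths."""
    # safe="" makes quote() encode "/" as %2F instead of leaving it intact.
    return quote(_strip_jsonl(dataset_file), safe="")


def dataset_query_name(dataset_file: str) -> str:
    """Value for the ?name= filter on GET /v1/datasets (leaf name only)."""
    return PurePosixPath(_strip_jsonl(dataset_file)).name


print(dataset_path_param("agentmark/qa-bot/data.jsonl"))  # agentmark%2Fqa-bot%2Fdata
print(dataset_query_name("agentmark/qa-bot/data.jsonl"))  # data
```

Passing `safe=""` is the important detail: by default `urllib.parse.quote` leaves `/` unencoded, which would split the dataset name across path segments.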
npx claudepluginhub agentmark-ai/skills --plugin agentmark