Control local llama.cpp-backed models through the LlamaStash CLI. Use when an agent needs to install or initialize LlamaStash, list models, start or stop a model, inspect daemon or proxy health, pull GGUFs from HuggingFace, or obtain the local OpenAI-compatible proxy address for harnesses like Claude Code, OpenClaw, OpenCode, Pi, and other shell-capable AgentSkills clients.
Slash command: /llamastash:llamastash
Use the `llamastash` CLI to install, inspect, and control local models managed
by LlamaStash.
llamastash status --json 2>/dev/null || echo '{"error":{"code":"not_ready","message":"llamastash is not installed, the daemon is not running, or the CLI is not configured yet"}}'
If llamastash is missing or not configured yet, bring it up in this order:
Install the binary using the machine's native path:
# macOS
brew install llamastash/llamastash/llamastash
# Linux
curl -fsSL https://llamastash.cli.rs/install.sh | sh
# Portable fallback
cargo install llamastash --locked
Verify the binary:
llamastash --version
Run the non-interactive first-run flow:
llamastash init --recommended --json
Verify the install and branch on findings, not on doctor's exit code:
llamastash doctor --json
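The install step above depends on the platform. A minimal Python sketch of that choice follows, assuming the agent detects the OS through `platform.system()`; the helper name `install_command` is hypothetical, and the command strings are copied verbatim from the steps above.

```python
import platform
from typing import Optional

# Install commands copied from the steps above. Keying them by
# platform.system() values is an assumption about how the OS is detected.
INSTALL_COMMANDS = {
    "Darwin": "brew install llamastash/llamastash/llamastash",
    "Linux": "curl -fsSL https://llamastash.cli.rs/install.sh | sh",
}
PORTABLE_FALLBACK = "cargo install llamastash --locked"


def install_command(system: Optional[str] = None) -> str:
    """Return the shell command that installs llamastash on the given OS."""
    if system is None:
        system = platform.system()
    # Anything that is not macOS or Linux gets the portable cargo fallback.
    return INSTALL_COMMANDS.get(system, PORTABLE_FALLBACK)
```

The function only selects a command string; running it, then `llamastash --version`, `init`, and `doctor`, is left to the caller in the order listed above.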
For agent use, prefer:
llamastash init --recommended --json
llamastash doctor --json
llamastash list --json
llamastash status --json
llamastash recommend --json
llamastash pull <owner/repo[:filename.gguf]> --json
Do not parse the human-readable table or colored output.
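One way to follow this rule from an agent is to funnel every call through a small wrapper that appends `--json` and refuses anything that is not JSON. The sketch below is illustrative only: `run_json` and `parse_cli_json` are hypothetical helper names, and it assumes the JSON payload arrives on stdout.

```python
import json
import subprocess
from typing import Any, Tuple


def parse_cli_json(stdout: str) -> Any:
    """Parse CLI stdout as JSON and fail loudly on table or colored output."""
    try:
        return json.loads(stdout)
    except json.JSONDecodeError as exc:
        raise ValueError("expected JSON on stdout; was --json passed?") from exc


def run_json(*args: str) -> Tuple[int, Any]:
    """Run `llamastash <args> --json`; return (exit code, parsed payload).

    Assumes the payload is written to stdout. Raises FileNotFoundError
    if the llamastash binary is not on PATH.
    """
    proc = subprocess.run(
        ["llamastash", *args, "--json"],
        capture_output=True,
        text=True,
        check=False,  # the exit code is returned to the caller instead
    )
    return proc.returncode, parse_cli_json(proc.stdout)
```

For example, `run_json("status")` would invoke `llamastash status --json`, and `run_json("pull", "<owner/repo>")` would place `--json` after the model reference, matching the command forms listed above.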
Before start or stop, read the catalog first and reuse the exact discovered
name from list --json.
llamastash list --json
llamastash start <exact-model-name>
llamastash status --json
start and stop are not the primary machine contract surfaces. Confirm the
result with status --json.
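The "reuse the exact discovered name" rule can be enforced before `start` or `stop` is ever called. This sketch assumes the caller has already extracted the model names from `list --json`; the payload layout is not described here, so the function takes a plain sequence of names. `resolve_exact_name` is a hypothetical helper.

```python
from typing import Sequence


def resolve_exact_name(requested: str, catalog_names: Sequence[str]) -> str:
    """Return `requested` only if it matches exactly one catalog name.

    Comparison is exact and case-sensitive; no prefix or fuzzy matching,
    so a typo fails here rather than at launch time.
    """
    matches = [name for name in catalog_names if name == requested]
    if len(matches) != 1:
        raise LookupError(
            f"{requested!r} matched {len(matches)} models; expected exactly one"
        )
    return matches[0]
```

Failing early on zero or several matches keeps the agent from guessing a name and then having to interpret a failed `start`.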
doctor findings, not its exit code
`llamastash doctor --json` always exits 0. Escalate when `findings` is not empty.
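Because the exit code carries no signal here, the check has to look at the payload. A minimal sketch follows, assuming `findings` is a top-level key holding a list in the `doctor --json` output; that layout is an assumption, and `needs_escalation` is a hypothetical helper.

```python
def needs_escalation(doctor_report: dict) -> bool:
    """Decide whether a parsed `doctor --json` report should be escalated.

    Assumes `findings` is a top-level list (an assumption about the payload).
    A report without the key cannot be confirmed clean, so it also escalates.
    """
    if "findings" not in doctor_report:
        return True
    findings = doctor_report["findings"] or []
    return len(findings) > 0
```

Treating a missing `findings` key as a reason to escalate is a conservative design choice of this sketch, not documented CLI behavior.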
When wiring another harness to the built-in OpenAI-compatible proxy, read the
bound address from status --json and build the base URL as
http://<proxy.listen>/v1.
Default mode usually lands on 127.0.0.1:11435. Ollama-compat mode usually
lands on 127.0.0.1:11434. If the base port is busy, LlamaStash may bind the
next free port in the documented scan window, so check the live value first.
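Building the base URL from the live value rather than a hard-coded port might look like the sketch below. It assumes `proxy.listen` in the text denotes a nested `proxy` object with a `listen` string of the form `host:port`; `proxy_base_url` is a hypothetical helper.

```python
def proxy_base_url(status: dict) -> str:
    """Build the OpenAI-compatible base URL from a parsed `status --json` payload.

    Assumes the bound address lives at status["proxy"]["listen"] as "host:port".
    """
    try:
        listen = status["proxy"]["listen"]
    except (KeyError, TypeError) as exc:
        raise ValueError("status payload has no proxy.listen value") from exc
    return f"http://{listen}/v1"
```

With a payload such as `{"proxy": {"listen": "127.0.0.1:11435"}}`, this yields `http://127.0.0.1:11435/v1`, which is the value to hand to the other harness.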
Important codes:
| Code | Meaning |
|---|---|
| 64 | bad CLI usage |
| 65 | daemon unreachable |
| 66 | model reference matched zero or multiple models |
| 67 | launch failed |
| 68 | stop failed |
| 69 | pull failed |
| 70 | llama-server binary not found |
| 71 | unexpected error |
| 72 | init aborted before substantive work |
| 73 | init download failed |
| 74 | init smoke failed |
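An agent that shells out to the CLI can keep these codes in a lookup table rather than scattering magic numbers. The mapping below is copied from the table above; `describe_exit` is a hypothetical helper, and treating 0 as success follows the usual process convention rather than anything stated in the table.

```python
# Exit codes and meanings copied from the table above.
EXIT_CODES = {
    64: "bad CLI usage",
    65: "daemon unreachable",
    66: "model reference matched zero or multiple models",
    67: "launch failed",
    68: "stop failed",
    69: "pull failed",
    70: "llama-server binary not found",
    71: "unexpected error",
    72: "init aborted before substantive work",
    73: "init download failed",
    74: "init smoke failed",
}


def describe_exit(code: int) -> str:
    """Translate a llamastash exit code into a short description."""
    if code == 0:
        return "success"  # conventional; not listed in the table
    return EXIT_CODES.get(code, f"undocumented exit code {code}")
```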
See references/commands.md for command patterns and examples.
See INSTALL.md#for-ai-agents for agent installation patterns and examples.
npx claudepluginhub llamastash/llamastash --plugin llamastash