Run GGUF models locally with Mozilla Llamafile, launching OpenAI-compatible API servers configurable for GPU/CPU inference, SDK integrations, installations, startups, and connection troubleshooting in offline setups.
Own this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimOwn this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimBased on adoption, maintenance, documentation, and repository signals. Not a security audit or endorsement.
npx claudepluginhub jamie-bitflight/claude_skills --plugin llamafileRead The Fucking Prompt — finds the strongest user reaction to an AI instruction-following failure in a chosen session, reconstructs the triggering assistant output, and renders a shareable terminal-style PNG.
This skill should be used when the model needs to ensure code quality through comprehensive linting and formatting. It provides automatic linting workflows for orchestrators (format → lint → resolve via concurrent agents) and sub-agents (lint touched files before task completion). Prevents claiming "production ready" code without verification. Includes linting rules knowledge base for ruff, mypy, and bandit, plus the linting-root-cause-resolver agent for systematic issue resolution.
When setting up commit message validation for a project. When project has commitlint.config.js or .commitlintrc files. When configuring CI/CD to enforce commit format. When extracting commit rules for LLM prompt generation. When debugging commit message rejection errors.
Comprehensive Perl 5.30+ development plugin with modular skills for scripting, CPAN ecosystem, environment setup, testing, linting, and validation. Includes specialized agents for script development, code auditing, and CLI architecture.
Build FastMCP 3.x Python MCP servers — covers provider/transform architecture (including CodeMode, Tool Search, and server-level transforms), component versioning, session state, authorization (MultiAuth, PropelAuth, connection-pooled token verifiers), evaluation creation, Pydantic validation, async patterns, STDIO and HTTP transports, nginx reverse proxy deployment, background tasks, Prefab Apps UI, security patterns, client SDK usage, testing, deployment, and migration from FastMCP v2. TypeScript is a legacy reference only and is not updated for v3.
Run AI models locally with Ollama - free alternative to OpenAI, Anthropic, and other paid LLM APIs. Zero-cost, privacy-first AI infrastructure.
When calling LLM APIs from Python code. When connecting to llamafile or local LLM servers. When switching between OpenAI/Anthropic/local providers. When implementing retry/fallback logic for LLM calls. When code imports litellm or uses completion() patterns.
Local-first resolver for Hugging Face models (GGUF, MLX, safetensors). The agent checks your own storage and any mounted drives before downloading anything.
Delegate heavy code generation to a local LLM (Ollama / LM Studio). Save tokens, keep oversight.
Spawn any third-party LLM provider with an Anthropic-compatible API (e.g. DeepSeek, GLM, Kimi, Qwen, MiniMax) as real Claude Code agent-team teammates or one-shot subagents — driven exactly like native teammates. Your main session's own auth is untouched (OAuth subscription or API key, either works); provider workers bill the provider API key via apiKeyHelper (the key never enters env/argv/history). Requires the `cc-fleet` binary on PATH, installed separately.
Editorial "LLM Application Developer" bundle for Claude Code from Antigravity Awesome Skills.