By Hellblazer
Locally-hosted Qwen 3.6 as a Claude Code coprocessor — delegate bulk or cheap work to long-lived supervised inference sessions over MCP. Backends include llama.cpp Metal on Apple Silicon and llama.cpp Vulkan on AMD Strix Halo.
Manage the qwen-stack supervisor's backend list — list configured backends with live health, add a new backend, remove one, or test connectivity. Operates on `~/.qwen-coprocessor-stack/config.json` and hot-reloads in the running supervisor without restart. Use when the user types `/qwen-stack:backends ...`.
Manage the qwen-stack supervisor's session-budget caps (`max_context_tokens` and `max_tool_calls`) — show current resolved values with source priority, set one or both fields in the config file, or clear them back to env / hardcoded defaults. Operates on the `session_budget` field in `~/.qwen-coprocessor-stack/config.json` and hot-reloads in the running supervisor without restart. Use when the user types `/qwen-stack:budget ...`.
Manage the qwen-stack supervisor's session-default extension list — show current value with source priority, set a new comma-separated list, set explicit-empty (suppresses CLI defaults), or clear (CLI defaults apply). Operates on the `default_extensions` field in `~/.qwen-coprocessor-stack/config.json` and hot-reloads in the running supervisor without restart. Use when the user types `/qwen-stack:defaults ...`.
List installed Qwen Code extensions on the supervisor host with version, source, enabled state per scope, and declared commands/skills/agents/MCP servers. Read-only listing for v0.3 — install / remove / enable / disable are deferred to v0.4. Use when the user types `/qwen-stack:extensions` or asks "what qwen extensions are installed" / "what does extension X provide".
One-glance overview of qwen-stack — plugin version, supervisor process state, dist build freshness, configured backends with live health, config-file path, and any obvious red flags (stale binary, env override masking config, dead default backend). Use when the user types `/qwen-stack:status` or asks "is the qwen stack healthy" / "what's running" / "is everything wired up".
Locally-hosted Qwen 3.6 wired into Claude Code as an MCP coprocessor. Claude Code runs unmodified with normal subscription auth; Qwen is exposed as a small set of MCP tools that Claude can call to delegate cheap or bulk work to long-lived, supervised inference sessions.
The supervisor is a TypeScript MCP server
(mcp-bridges/qwen-agent-server) that
manages session lifecycle, backend routing, KV-cache affinity, and
permission gating on top of @qwen-code/sdk.
Any OpenAI-compatible endpoint serving a Qwen 3.6 GGUF works as a
backend; the standard deployments are llama.cpp Metal on Apple Silicon
(Qwen 3.6 27B) and llama.cpp Vulkan on AMD Strix Halo (Qwen 3.6 35B-A3B).
Full design rationale: docs/rdr/RDR-001.
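Because a backend is just an OpenAI-compatible endpoint, a reachability check can be sketched in a few lines. The helper names below (`toBaseUrl`, `probeBackend`, `FetchLike`) are hypothetical and are not taken from the supervisor's source; the only thing assumed about the backend is the standard `GET /v1/models` route that OpenAI-compatible servers such as llama-server expose. The supervisor's real health check may work differently.

```typescript
// Hypothetical sketch: these helpers are illustrative, not the supervisor's actual code.

// Minimal shape of the fetch function we need, so a stub can be injected in tests.
type FetchLike = (url: string) => Promise<{ ok: boolean; status: number }>;

// Normalize "host:port", "http://host:port" or ".../v1/" into a base URL ending in /v1.
function toBaseUrl(address: string): string {
  const withScheme = /^https?:\/\//.test(address) ? address : `http://${address}`;
  const trimmed = withScheme.replace(/\/+$/, "");
  return trimmed.endsWith("/v1") ? trimmed : `${trimmed}/v1`;
}

// Resolve to true if the backend answers GET <base>/models with a 2xx status.
async function probeBackend(address: string, fetchImpl: FetchLike): Promise<boolean> {
  try {
    const res = await fetchImpl(`${toBaseUrl(address)}/models`);
    return res.ok;
  } catch {
    // Connection refused, DNS failure, and similar errors all count as "down".
    return false;
  }
}
```

On Node 18 or later the global `fetch` can be passed as `fetchImpl`, for example `probeBackend("localhost:8080", fetch)`. Injecting the fetch function keeps the sketch testable without a running llama-server.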
npm, and Claude Code
installed and signed in. Portable; not Apple-specific.
scripts/setup-mac-host.sh,
scripts/start-stack.sh) builds llama.cpp with Metal and runs
Qwen 3.6 27B at localhost:8080. Apple Silicon, ~25 GB free disk.
host:port/v1,
reached over Tailscale or any other network you trust.
# 1. Build llama.cpp with Metal support and download Qwen 3.6 27B (~25 GB).
./scripts/setup-mac-host.sh
# 2. Start llama-server (cold start: ~5 min off external SSD, ~5 s off NVMe).
./scripts/start-stack.sh
# 3. Build the supervisor (compiles dist/server.js — postinstall runs tsc).
( cd mcp-bridges/qwen-agent-server && npm install )
# 4. Register the supervisor with Claude Code. Either:
# a) install as a plugin (recommended — see "Install as a plugin" below), or
# b) run ./scripts/setup-qwen-agent-server.sh (legacy `claude mcp add` path).
# 5. Run Claude Code anywhere — the qwen_* tools are now available.
claude
To shut down the local llama-server: ./scripts/stop-stack.sh.
This repo doubles as a Claude Code plugin (qwen-stack). After npm install
in step 3:
# From any shell with the claude CLI on PATH:
claude plugin marketplace add /path/to/this/repo
claude plugin install qwen-stack@qwen-stack
# Then reload from any CC session: /reload-plugins
The plugin manifest at .claude-plugin/plugin.json registers the supervisor's
MCP server with ${CLAUDE_PLUGIN_ROOT} resolved to the plugin install
location, so paths stay portable.
Migrating from the old qwen-coprocessor-stack plugin name (pre-0.3.0):
claude plugin uninstall qwen-coprocessor-stack
claude plugin marketplace remove qwen-coprocessor-stack
claude plugin marketplace add /path/to/this/repo
claude plugin install qwen-stack@qwen-stack
State lives at ~/.qwen-coprocessor-stack/config.json (object form,
forward-extensible — backends, default_extensions today).
| Command | Purpose |
|---|---|
| `/qwen-stack:status` | One-glance overview — plugin version, supervisor process, build freshness, backends + health, env overrides, red flags |
| `/qwen-stack:backends list \| add \| remove \| test` | Backend lifecycle — edits config file in place; supervisor hot-applies on next spawn |
| `/qwen-stack:extensions list \| info <name>` | Read-only listing of installed Qwen Code extensions with version, source, enabled state, declared commands/skills/agents/MCP servers |
| `/qwen-stack:defaults list \| set <a,b,c> \| set --none \| clear` | Manage the session-default extension list applied when a spawn doesn't specify opts.extensions.only |
| `/qwen-stack:budget list \| set [--max-context-tokens N] [--max-tool-calls M] \| clear [field]` | Manage the session_budget caps that abort a runaway session before the HTTP layer panics |
Resolution priorities (env > file > default):
- QWEN_BACKENDS env → config.backends → built-in single-local default.
- QWEN_DEFAULT_EXTENSIONS env → config.default_extensions → CLI defaults from extension-enablement.json.

Existing in-flight sessions stay pinned to their backend and resolved extension set (RDR-001 §Q3, RDR-002 §drain semantics) — config edits affect new spawns only.
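The env > file > default order can be sketched as a small resolver that also reports which source won, which is the kind of "source priority" the list commands display. This is a minimal sketch under assumptions: `resolve`, `parseList`, and the types are hypothetical names, and treating an empty env string as an explicit empty list is a guess about the supervisor's behavior, not something the README states.

```typescript
// Hypothetical sketch: names and types are illustrative, not the supervisor's actual code.
type Source = "env" | "file" | "default";

interface Resolved<T> {
  value: T;
  source: Source;
}

// Pick the first defined candidate in env > file > default order.
function resolve<T>(envValue: T | undefined, fileValue: T | undefined, fallback: T): Resolved<T> {
  if (envValue !== undefined) return { value: envValue, source: "env" };
  if (fileValue !== undefined) return { value: fileValue, source: "file" };
  return { value: fallback, source: "default" };
}

// Assumed format: a comma-separated list. An unset variable yields undefined
// (fall through to the next source); an empty string yields an explicit empty list.
function parseList(raw: string | undefined): string[] | undefined {
  if (raw === undefined) return undefined;
  return raw
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Example: no env override, the config file names one extension, so the file wins.
const resolvedExtensions = resolve(parseList(undefined), ["ext-a"], ["cli-default"]);
console.log(resolvedExtensions.source, resolvedExtensions.value);
```

Keeping `undefined` (unset) distinct from `[]` (explicitly empty) is what lets an explicit-empty setting suppress the CLI defaults instead of falling through to them.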
The inner Qwen has no automatic mid-flight compaction; an open-ended task
that reads dozens of files can accumulate tool_result payload past the
backend's context window and crash the HTTP layer with ECONNRESET. v0.4
adds a guardrail that aborts the session cleanly before that happens.
Two caps, both per session: `max_context_tokens` and `max_tool_calls`.
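A per-session guardrail over `max_context_tokens` and `max_tool_calls` could look roughly like the following. Only the two field names come from the config; the `SessionUsage` shape, the `checkBudget` helper, the strict `>` comparison, and the way tokens are counted are all assumptions made for illustration.

```typescript
// Hypothetical sketch: field names mirror the session_budget config keys,
// everything else is assumed and not taken from the supervisor's source.
interface SessionBudget {
  max_context_tokens?: number;
  max_tool_calls?: number;
}

interface SessionUsage {
  contextTokens: number; // running estimate of tokens accumulated in the session
  toolCalls: number; // number of tool calls made so far
}

// Return a human-readable abort reason if a cap is exceeded, otherwise null.
// An unset cap is treated as "no limit" for that dimension.
function checkBudget(budget: SessionBudget, usage: SessionUsage): string | null {
  if (budget.max_context_tokens !== undefined && usage.contextTokens > budget.max_context_tokens) {
    return `context tokens ${usage.contextTokens} exceed cap ${budget.max_context_tokens}`;
  }
  if (budget.max_tool_calls !== undefined && usage.toolCalls > budget.max_tool_calls) {
    return `tool calls ${usage.toolCalls} exceed cap ${budget.max_tool_calls}`;
  }
  return null;
}
```

Running a check like this after each tool result would let the session be aborted with a clear reason before the accumulated payload overflows the backend's context window.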