From togetherai-skills
Deploys custom Dockerized inference workers on Together AI's managed GPU infrastructure using Jig CLI, Sprocket SDK, and async queue jobs for container-level control.
How this skill is triggered — by the user, by Claude, or both
Slash command
/togetherai-skills:together-dedicated-containers
The summary Claude sees in its skill listing, used to decide when to auto-load this skill
Use Dedicated Container Inference when the user needs a custom runtime, not just managed model hosting.
Core building blocks:
together-dedicated-endpoints for standard model hosting without custom containers
together-gpu-clusters for full cluster ownership and orchestration control
together-chat-completions, together-images, or together-video when a serverless product already covers the task

pyproject.toml for image, runtime, autoscaling, and mounts.
together>=2.0.0). If the user is on an older version, they must upgrade first: uv pip install --upgrade "together>=2.0.0".
pyproject.toml as the source of truth for deployment behavior.

npx claudepluginhub togethercomputer/skills --plugin togetherai-skills

Manages single-tenant GPU endpoints on Together AI with autoscaling and no rate limits. Deploys fine-tuned or uploaded models, sizes hardware, and handles endpoint lifecycle.
Runs Python code on Modal's serverless platform with GPUs, autoscaling, and containerized dependencies. Use for ML model deployment, batch processing, scheduled jobs, and GPU-accelerated APIs.
Deploys ML training jobs and inference services to Vast.ai GPU cloud using optimized Docker images, CLI scripting, and automation for GPU instance provisioning.
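
The together>=2.0.0 requirement mentioned earlier can be checked before any deployment work starts. The following is a minimal sketch that uses only the Python standard library. The helper names (parse_release, meets_minimum, check_together_sdk) are hypothetical and not part of the Together SDK, and the version comparison is deliberately simplified: it looks only at the leading numeric release segment, so a pre-release such as 2.0.0rc1 would be treated as satisfying the minimum.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum taken from the "together>=2.0.0" requirement in the skill text.
MIN_VERSION = (2, 0, 0)

# Upgrade command quoted from the skill text.
UPGRADE_COMMAND = 'uv pip install --upgrade "together>=2.0.0"'


def parse_release(version_string):
    """Return the leading numeric release segment as a tuple of ints.

    Simplified on purpose: suffixes such as "rc1" or ".post1" are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"unrecognized version: {version_string!r}")
    parts = tuple(int(p) for p in match.group(0).split("."))
    # Pad short versions so "2" compares as (2, 0, 0).
    return parts + (0,) * (len(MIN_VERSION) - len(parts))


def meets_minimum(version_string, minimum=MIN_VERSION):
    """True if the release segment is at least the required minimum."""
    return parse_release(version_string) >= minimum


def check_together_sdk():
    """Describe whether the installed 'together' package is new enough."""
    try:
        installed = version("together")
    except PackageNotFoundError:
        return f"together is not installed; run: {UPGRADE_COMMAND}"
    if meets_minimum(installed):
        return f"together {installed} satisfies >=2.0.0"
    return f"together {installed} is too old; run: {UPGRADE_COMMAND}"


if __name__ == "__main__":
    print(check_together_sdk())
```

Running the script prints a one-line status. When the package is missing or older than 2.0.0, the message repeats the upgrade command given in the skill text.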