Develop AI applications on the Together AI platform: run inference, fine-tune models, generate embeddings, produce images and video, perform speech-to-text and text-to-speech, manage GPU infrastructure, and execute remote Python sandboxes.
Text-to-speech and speech-to-text via Together AI, including REST, streaming, and realtime WebSocket TTS, plus transcription, translation, diarization, timestamps, and live STT. Reach for it whenever the user needs audio in or audio out on Together AI rather than chat generation, image or video creation, or model training.
High-volume, asynchronous offline inference at up to 50% lower cost via Together AI's Batch API. Prepare JSONL inputs, upload files, create jobs, poll status, and download outputs. Reach for it whenever the user needs non-interactive bulk inference rather than real-time chat or evaluation jobs.
Real-time and streaming text generation via Together AI's OpenAI-compatible chat/completions API, including multi-turn conversations, tool and function calling, structured JSON outputs, and reasoning models. Reach for it whenever the user wants to build or debug text generation on Together AI, unless they specifically need batch jobs, embeddings, fine-tuning, dedicated endpoints, dedicated containers, or GPU clusters.
Custom Dockerized inference workers on Together AI's managed GPU infrastructure. Build with Sprocket SDK, configure with Jig CLI, submit async queue jobs, and poll results. Reach for it whenever the user needs container-level control rather than a standard model endpoint or raw cluster.
Single-tenant GPU endpoints on Together AI with autoscaling and no rate limits. Deploy fine-tuned or uploaded models, size hardware, and manage endpoint lifecycle. Reach for it whenever the user needs predictable always-on hosting rather than serverless inference, custom containers, or raw clusters.
Own this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimOwn this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimBased on adoption, maintenance, documentation, and repository signals. Not a security audit or endorsement.
A collection of 12 agent skills that provide comprehensive knowledge of the Together AI platform — inference, training, embeddings, audio, video, images, function calling, and infrastructure.
Each skill teaches AI coding agents how to use a specific Together AI product, including API patterns, SDK usage (Python and TypeScript), CLI commands, direct API usage, model selection, and best practices. Skills include runnable Python scripts (using the Together Python v2 SDK), TypeScript examples, and CLI/API workflow guidance.
Compatible with Claude Code, Cursor, Codex, and Gemini CLI.
Skills are markdown instruction files that give AI coding agents domain-specific knowledge. When an agent detects that a skill is relevant to your task, it loads the skill's instructions and uses them to write better code.
Each skill contains:
SKILL.md — Lean routing guidance for the agent: when to use the skill, when to hand off, and where to look nextreferences/ — Detailed reference docs (model lists, API parameters, CLI commands)scripts/ — Runnable Python scripts demonstrating complete workflowsagents/openai.yaml — Optional UI metadata for OpenAI/Codex surfaces| Skill | Description | Scripts |
|---|---|---|
| together-chat-completions | Real-time and streaming text generation via Together AI's OpenAI-compatible chat/completions API, including multi-tur... | async_parallel.py, chat_basic.py, debug_headers.py, reasoning_models.py, structured_outputs.py, tool_call_loop.py |
| together-images | Text-to-image generation and image editing via Together AI, including FLUX and Kontext models, LoRA-based styling, re... | generate_image.py, kontext_editing.py, lora_generation.py |
| together-video | Text-to-video and image-to-video generation via Together AI, including keyframe control, model and dimension selectio... | generate_video.py, image_to_video.py |
| together-audio | Text-to-speech and speech-to-text via Together AI, including REST, streaming, and realtime WebSocket TTS, plus transc... | stt_realtime.py, stt_transcribe.py, tts_generate.py, tts_websocket.py |
| together-embeddings | Dense vector embeddings, semantic search, RAG pipelines, and reranking via Together AI. | embed_and_rerank.py, rag_pipeline.py, semantic_search.py |
| together-fine-tuning | LoRA, full fine-tuning, DPO preference tuning, VLM training, function-calling tuning, reasoning tuning, and BYOM uplo... | dpo_workflow.py, finetune_workflow.py, function_calling_finetune.py, reasoning_finetune.py, vlm_finetune.py |
| together-batch-inference | High-volume, asynchronous offline inference at up to 50% lower cost via Together AI's Batch API. | batch_workflow.py |
| together-evaluations | LLM-as-a-judge evaluation framework on Together AI. | run_evaluation.py |
| together-sandboxes | Remote Python execution in managed sandboxes on Together AI with stateful sessions, file uploads, data analysis, char... | execute_with_session.py |
| together-dedicated-endpoints | Single-tenant GPU endpoints on Together AI with autoscaling and no rate limits. | deploy_finetuned.py, manage_endpoint.py, upload_custom_model.py |
| together-dedicated-containers | Custom Dockerized inference workers on Together AI's managed GPU infrastructure. | queue_client.py, sprocket_hello_world.py |
| together-gpu-clusters | On-demand and reserved GPU clusters (H100, H200, B200) on Together AI with Kubernetes or Slurm orchestration, shared ... | manage_cluster.py, manage_storage.py |
Install all skills at once using skills.sh:
npx skills add togethercomputer/skills
This works with Claude Code, Cursor, Codex, and other agents that support the Agent Skills specification.
cp -r skills/together-* your-project/.claude/skills/
# Global availability
cp -r skills/together-* ~/.claude/skills/
Marketplace plugin coming soon.
cp -r skills/together-* your-project/.cursor/skills/
Cursor plugin marketplace listing coming soon.
cp -r skills/together-* your-project/.agents/skills/
gemini extensions install https://github.com/togethercomputer/skills.git --consent
# Claude Code
ls your-project/.claude/skills/together-*/SKILL.md
# Codex
ls your-project/.agents/skills/together-*/SKILL.md
You should see one SKILL.md per installed skill.
Once installed, skills activate automatically when the agent detects a relevant task. No explicit invocation is needed.
npx claudepluginhub togethercomputer/skills --plugin togetherai-skillsClaude Code skill pack for Together AI (18 skills)
Skills for finding, comparing, running, and prompting AI models on Replicate
Machine learning training and inference pipeline using cloud GPUs (Modal, Lambda Labs, RunPod) with HuggingFace ecosystem - no local GPU required
Skills for the OpenRouter platform: TypeScript SDK, model discovery, pricing, image generation, and provider performance
LLM post-training — unified interface for SFT, OSFT, LoRA fine-tuning, and GRPO reinforcement learning
Skills for NVIDIAs ecosystem spans GPU acceleration, CUDA, AI agents, inference, robotics, Physical AI, Omniverse, and simulation. This plugin helps you understand the pieces, choose a path, validate your setup, and build practical NVIDIA-powered workflows.