From togetherai-skills
Generates real-time streaming text via Together AI's OpenAI-compatible chat/completions API. Supports multi-turn chats, tool/function calling, structured JSON outputs, and reasoning models for building/debugging interactive inference.
How this skill is triggered — by the user, by Claude, or both
Slash command
/togetherai-skills:together-chat-completionsThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Use Together AI's serverless chat/completions API for interactive inference workloads:
agents/openai.yamlreferences/api-parameters.mdreferences/function-calling-patterns.mdreferences/models.mdreferences/reasoning-models.mdreferences/structured-outputs.mdscripts/async_parallel.pyscripts/chat_basic.pyscripts/chat_basic.tsscripts/debug_headers.pyscripts/debug_headers.tsscripts/reasoning_models.pyscripts/reasoning_models.tsscripts/structured_outputs.pyscripts/structured_outputs.tsscripts/tool_call_loop.pyscripts/tool_call_loop.tsUse Together AI's serverless chat/completions API for interactive inference workloads:
Treat this skill as the default entry point for Together AI text generation unless the task is clearly offline batch processing, vector retrieval, model training, or infrastructure management.
together-batch-inference for large offline runs, backfills, or lower-cost asynchronous jobstogether-embeddings for vector search, semantic retrieval, or rerankingtogether-fine-tuning when the user wants to train or adapt a modeltogether-dedicated-endpoints when the user needs always-on single-tenant hostingtogether-dedicated-containers or together-gpu-clusters for custom infrastructuretogether>=2.0.0). If the user is on an older version, they must upgrade first: uv pip install --upgrade "together>=2.0.0".client.chat.completions.create() for Python and client.chat.completions.create() for TypeScript.messages history for multi-turn conversations; do not rebuild context from final text only.json_schema over looser JSON modes when the user needs stable machine-readable output.tools (no response_format), Phase 2 sends response_format (no tools) after tool results are appended.response_format; accumulate chunks and parse the final concatenated string as JSON.async_parallel.py or hand off to batch inference.npx claudepluginhub zainhas/skills --plugin togetherai-skillsReal-time and streaming text generation via Together AI's OpenAI-compatible chat/completions API, including multi-turn conversations, tool calling, structured JSON outputs, and reasoning models.
Runs Together AI inference for chat completions, streaming, images, and embeddings using Python or Node.js OpenAI-compatible clients. For testing open-source LLMs like Llama.
Provides TypeScript code for Groq chat completions, tool calling, JSON mode, and structured outputs. For building real-time AI chat interfaces and function calling with fast inference.