From togetherai-skills
High-volume, asynchronous offline inference via Together AI's Batch API. Prepares JSONL inputs, uploads files, creates jobs, polls status, and downloads outputs for bulk classification, synthetic data generation, and dataset transformations.
How this skill is triggered — by the user, by Claude, or both
Slash command
/togetherai-skills:together-batch-inferenceThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Use Together AI's Batch API for large offline workloads where latency is not the primary concern.
Use Together AI's Batch API for large offline workloads where latency is not the primary concern.
Typical fits:
together-chat-completions for real-time requests or tool-calling appstogether-evaluations for managed LLM-as-a-judge workflowstogether-embeddings for retrieval-specific vector generationcustom_id and body.purpose="batch-api".input_file_id=... and the target endpoint.custom_id.together>=2.0.0). If the user is on an older version, they must upgrade first: uv pip install --upgrade "together>=2.0.0".input_file_id, not legacy file parameters.custom_id stable and meaningful so result reconciliation is easy.client.batches.create() returns a wrapper; access the batch object via response.job (e.g., response.job.id). client.batches.retrieve() returns the batch object directly.max_tokens low (e.g., 4), use temperature: 0, and constrain the system prompt to return only the label. This minimizes output tokens and cost.npx claudepluginhub togethercomputer/skills --plugin togetherai-skillsProcesses thousands of documents asynchronously using Google's Gemini Batch API at 50% lower cost. Enforces correct API patterns to avoid silent failures.
Guides Together AI workflows for inference, fine-tuning, and model deployment using OpenAI-compatible API and Python SDK. Covers errors, models, and batch inference.
Runs inference-time scaling on multiple prompts from JSONL, CSV, or TXT files. Useful for batch processing, evaluation runs, or dataset-level scaling.