ClaudePluginHub

Community directory for discovering and installing Claude Code plugins.


© 2025 ClaudePluginHub

Community Maintained · Not affiliated with Anthropic


together-embeddings | togetherai-skills


Skill

together-embeddings

From togetherai-skills

Generates dense vector embeddings, performs semantic search, builds RAG pipelines, and reranks results via Together AI. Use for retrieval plumbing before the generation step.

Popularity

Stars

32

Forks

4

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/togetherai-skills:together-embeddings

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Use this skill for semantic retrieval components:

Supporting Files

agents/openai.yaml
references/api-reference.md
references/models.md
scripts/embed_and_rerank.py
scripts/embed_and_rerank.ts
scripts/rag_pipeline.py
scripts/semantic_search.py

SKILL.md

76 lines · ~977 tokens

Stats

Language: Python

Stars: 32

Forks: 4

Maintenance: Excellent

Last Commit: Jun 10, 2026


Tags

semantic-search


Together Embeddings & Reranking

Overview

Use this skill for semantic retrieval components:

- create embeddings
- batch embeddings
- build retrieval or RAG pipelines
- rerank retrieved candidates

This skill is for retrieval plumbing, not for the final language-model response itself.
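As a rough illustration of the "batch embeddings" item, the sketch below splits a corpus into fixed-size batches and passes each batch to an embedding callable. The names `batched`, `embed_in_batches`, `fake_embed`, and the default `batch_size=32` are hypothetical and are not taken from the skill's scripts. In real use, `embed_fn` would wrap the Together embeddings call described in references/api-reference.md.

```python
from typing import Callable, Iterator, List, Sequence

Vector = List[float]
# Hypothetical signature: takes a batch of texts, returns one vector per text.
EmbedFn = Callable[[Sequence[str]], List[Vector]]


def batched(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(
    texts: Sequence[str], embed_fn: EmbedFn, batch_size: int = 32
) -> List[Vector]:
    """Embed texts batch by batch, preserving input order."""
    vectors: List[Vector] = []
    for batch in batched(texts, batch_size):
        result = embed_fn(batch)
        # Guard against a backend that drops or duplicates rows.
        if len(result) != len(batch):
            raise RuntimeError("embed_fn returned a different number of vectors than inputs")
        vectors.extend(result)
    return vectors


def fake_embed(batch: Sequence[str]) -> List[Vector]:
    """Offline stand-in for a real embedding call: [length, vowel count]."""
    return [[float(len(t)), float(sum(c in "aeiou" for c in t))] for t in batch]
```

Keeping the embedding call behind a plain callable makes it easy to swap the stand-in for the real API client without touching the batching logic.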

When This Skill Wins

- Build vector search or semantic similarity features
- Add embedding generation to a data pipeline
- Improve retrieval quality with reranking
- Assemble a retrieval stage before calling a chat model

Hand Off To Another Skill

- Use together-chat-completions for the final answer-generation step
- Use together-batch-inference for very large offline embedding backfills
- Use together-dedicated-endpoints when reranking requires a dedicated deployment

Quick Routing

- Embeddings API usage
  - Read references/api-reference.md
  - Start with scripts/embed_and_rerank.py or scripts/embed_and_rerank.ts
- Semantic search (embed, store, query)
  - Start with scripts/semantic_search.py -- includes an in-memory vector store, cosine-similarity retrieval, and optional rerank
- RAG pipeline composition
  - Start with scripts/rag_pipeline.py
- Model selection and rerank constraints
  - Read references/models.md
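To make the "in-memory vector store, cosine-similarity retrieval" idea concrete, here is a minimal standalone sketch. It is not the contents of scripts/semantic_search.py. The class name `InMemoryVectorStore` and its methods are hypothetical, and the vectors are assumed to come from whatever embedding model is used for both indexing and querying.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemoryVectorStore:
    """Hypothetical prototype store: a flat list scanned on every query."""

    def __init__(self) -> None:
        self._texts: List[str] = []
        self._vectors: List[List[float]] = []

    def add(self, text: str, vector: Sequence[float]) -> None:
        """Store a text alongside its precomputed embedding."""
        self._texts.append(text)
        self._vectors.append(list(vector))

    def search(self, query_vector: Sequence[float], top_k: int = 3) -> List[Tuple[str, float]]:
        """Return up to top_k (text, score) pairs, highest cosine similarity first."""
        scored = [
            (text, cosine_similarity(query_vector, vector))
            for text, vector in zip(self._texts, self._vectors)
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]
```

A linear scan like this costs one similarity computation per stored vector on every query, which is fine for prototypes and small corpora but is the reason a dedicated vector database is the usual choice at scale.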

Workflow

1. Confirm that the user needs vectors or retrieval, not direct generation.
2. Choose the embedding model and batch shape.
3. Generate embeddings for corpus and query paths consistently.
4. Retrieve candidates. An in-memory cosine-similarity store works for prototyping and small corpora (see semantic_search.py). Use a dedicated vector database for production scale.
5. Rerank only when the extra latency and endpoint requirement are justified. When no dedicated rerank endpoint is available, cosine-similarity ranking is a reasonable fallback.
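The last two workflow steps can be sketched as a single ranking function that reranks when a reranker is supplied and otherwise falls back to cosine-similarity order. This is an illustrative sketch only: `rank_candidates`, `RerankFn`, and `toy_rerank` are hypothetical names, and a real `rerank_fn` would wrap a call to a dedicated rerank endpoint rather than the toy scorer shown here.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Candidate = Tuple[str, float]  # (text, cosine-similarity score from first-stage retrieval)
# Hypothetical signature: (query, documents) -> one relevance score per document.
RerankFn = Callable[[str, Sequence[str]], List[float]]


def rank_candidates(
    query: str,
    candidates: Sequence[Candidate],
    rerank_fn: Optional[RerankFn] = None,
    top_n: int = 3,
) -> List[Candidate]:
    """Return the top_n candidates, reranked if possible, else ordered by cosine score."""
    by_cosine = sorted(candidates, key=lambda c: c[1], reverse=True)
    if rerank_fn is None:
        # Fallback path: no rerank endpoint available, keep first-stage order.
        return by_cosine[:top_n]
    texts = [text for text, _ in by_cosine]
    scores = rerank_fn(query, texts)
    if len(scores) != len(texts):
        raise RuntimeError("rerank_fn must return one score per document")
    reranked = sorted(zip(texts, scores), key=lambda pair: pair[1], reverse=True)
    return reranked[:top_n]


def toy_rerank(query: str, documents: Sequence[str]) -> List[float]:
    """Offline stand-in for a reranker: 1.0 if the query string occurs in the document."""
    return [1.0 if query in doc else 0.0 for doc in documents]
```

Making the reranker an optional argument keeps the second stage separate from retrieval, so a deployment without a rerank endpoint runs the same code path with `rerank_fn=None`.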

High-Signal Rules

- Python scripts require the Together v2 SDK (together>=2.0.0). If the user is on an older version, they must upgrade first: uv pip install --upgrade "together>=2.0.0".
- Keep embeddings and reranking conceptually separate; rerank is a second-stage precision step.
- Reranking in this repo assumes a dedicated endpoint. Do not promise serverless rerank unless the product changes. When no endpoint is available, fall back to cosine-similarity ranking.
- The embedding model has a 514-token context limit. Chunk longer documents before embedding.
- The rag_pipeline.py example demonstrates retrieval plus generation; treat generation as a hand-off to chat completions.
- Preserve model consistency across indexing and querying.
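One way to follow the chunking rule is a sliding window over words with a small overlap, as in the sketch below. It rests on stated assumptions: whitespace-separated words are only a rough proxy for model tokens (subword tokenizers usually produce more tokens than words), so the default `max_words=300` is a deliberately conservative, hypothetical budget under the 514-token limit, not a value from the skill. The function name `chunk_words` and the `overlap` default are likewise illustrative.

```python
from typing import List


def chunk_words(text: str, max_words: int = 300, overlap: int = 30) -> List[str]:
    """Split text into word windows of at most max_words, sharing `overlap` words.

    Assumption: word count is used as a stand-in for token count. For a strict
    guarantee, count tokens with the embedding model's own tokenizer instead.
    """
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be in the range [0, max_words)")
    words = text.split()
    if not words:
        return []
    step = max_words - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once this window has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks
```

The overlap keeps sentences that straddle a boundary represented in both neighboring chunks, at the cost of embedding a few words twice.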

Resource Map

- API details: references/api-reference.md
- Model guide: references/models.md
- Python embeddings example: scripts/embed_and_rerank.py
- TypeScript embeddings example: scripts/embed_and_rerank.ts
- Python semantic search: scripts/semantic_search.py
- Python RAG pipeline: scripts/rag_pipeline.py

Official Docs

- Embeddings Overview
- Rerank Overview
- Embeddings API
- Rerank API

$ npx claudepluginhub togethercomputer/skills --plugin togetherai-skills

Similar Skills

embedding-strategies

37.9k

Guides selection and optimization of embedding models for vector search and RAG, including model comparisons, chunking strategies, dimension reduction, and Python templates for OpenAI and local models.

antigravity-bundle-data-engineering


embedding-strategies

41.0k

Guides selection and optimization of embedding models for vector search, including chunking, dimension reduction, and multilingual support.

antigravity-awesome-skills


rag-implementation

15

Build RAG systems for LLM apps using vector databases, embeddings, and retrieval strategies. Use for document Q&A, grounded chatbots, and semantic search.

llm-application-dev
