From perplexity-pack
Choose and implement validated Perplexity architecture blueprints for different scales. Use when designing new Perplexity integrations, choosing between monolith, service, and microservice architectures, or planning migration paths for Perplexity applications. Trigger with phrases like "perplexity architecture", "perplexity blueprint", "how to structure perplexity", "perplexity project layout", "perplexity microservice".
Slash command: /perplexity-pack:perplexity-architecture-variants
Deployment architectures for the Perplexity Sonar search API at different scales. Perplexity's search-augmented generation fits a range of patterns, from simple search widgets to full research automation pipelines.
Direct Widget. Best for: adding AI search to an app, < 500 queries/day.
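```python
import os
from flask import Flask, jsonify, request
from openai import OpenAI

app = Flask(__name__)
# Perplexity exposes an OpenAI-compatible endpoint; the env var name is an assumption.
pplx_client = OpenAI(api_key=os.environ["PERPLEXITY_API_KEY"], base_url="https://api.perplexity.ai")

@app.route('/ask')
def ask():
    response = pplx_client.chat.completions.create(
        model="sonar", messages=[{"role": "user", "content": request.args["q"]}]
    )
    return jsonify({
        "answer": response.choices[0].message.content,
        # citations is a Perplexity-specific field; default to [] if absent
        "citations": getattr(response, "citations", [])
    })
```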
Cached Layer. Best for: repeated queries, 500-5K queries/day, research tools.
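```python
import hashlib
import json

class CachedResearch:
    def __init__(self, client, cache, ttl=1800):  # ttl in seconds (1800 = 30 minutes)
        self.client = client  # OpenAI-compatible client pointed at Perplexity
        self.cache = cache    # cache exposing Redis-style get() and setex()
        self.ttl = ttl

    def search(self, query: str, model: str = "sonar"):
        # Include the model in the key so answers from different models never collide
        digest = hashlib.sha256(f"{model}:{query}".encode()).hexdigest()
        key = f"pplx:{digest}"
        cached = self.cache.get(key)
        if cached is not None:
            return json.loads(cached)
        result = self.client.chat.completions.create(
            model=model, messages=[{"role": "user", "content": query}]
        )
        data = {
            "answer": result.choices[0].message.content,
            # citations is a Perplexity-specific field; default to [] if absent
            "citations": getattr(result, "citations", []),
        }
        self.cache.setex(key, self.ttl, json.dumps(data))
        return data
```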
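
The cached layer above only needs two methods from its cache: `get` and `setex`. For local development without a Redis server, a small in-memory stand-in might look like the sketch below. The class name `InMemoryTTLCache` and its `clock` parameter are hypothetical and not part of the pack; the sketch is not thread-safe and never bounds its size.

```python
import time


class InMemoryTTLCache:
    """Hypothetical stand-in exposing Redis-style get() and setex()."""

    def __init__(self, clock=time.monotonic):
        # clock is injectable so expiry can be exercised without sleeping
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: evict lazily and report a miss
            del self._store[key]
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        # Same argument order as Redis SETEX: key, seconds, value
        self._store[key] = (self._clock() + ttl_seconds, value)
```

An instance can be passed wherever the cached layer expects its `cache` argument, which keeps the Redis dependency out of unit tests.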
Research Pipeline. Best for: automated research, 5K+ queries/day, report generation.
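```python
import asyncio

class ResearchPipeline:
    def __init__(self, client):
        # client is assumed to be an async OpenAI-compatible client (e.g. openai.AsyncOpenAI)
        self.client = client

    async def research_topic(self, topic: str) -> dict:
        # Decompose into sub-questions
        sub_questions = await self.decompose(topic)
        # Run parallel searches
        results = await asyncio.gather(*[
            self.search_with_cache(q) for q in sub_questions
        ])
        # Synthesize into report
        report = await self.synthesize(topic, results)
        return {"topic": topic, "sections": results, "synthesis": report}

    async def decompose(self, topic: str) -> list[str]:
        # Await the call so the event loop is not blocked while the request runs
        r = await self.client.chat.completions.create(
            model="sonar", messages=[
                {"role": "system", "content": "Break this topic into 3-5 specific research questions."},
                {"role": "user", "content": topic}
            ])
        # Drop blank lines so empty strings are never sent as queries
        lines = r.choices[0].message.content.strip().split("\n")
        return [line.strip() for line in lines if line.strip()]

    # search_with_cache() and synthesize() are not shown in the source and must be supplied.
```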
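
The pipeline calls `search_with_cache` and `synthesize`, which the source does not show. The sketch below is one possible shape for them, not the pack's implementation. `PipelineSteps`, `FakeClient`, the in-process dict cache, and the prompt wording are all hypothetical. `sonar-pro` is used because the decision matrix lists it for this variant. The client is assumed to expose an awaitable `chat.completions.create(...)`.

```python
import asyncio
from types import SimpleNamespace


class PipelineSteps:
    """Hypothetical helpers for the two pipeline steps not shown in the source."""

    def __init__(self, client, model="sonar-pro"):
        self.client = client
        self.model = model
        self._cache = {}  # query -> result dict (in-process, no expiry)

    async def search_with_cache(self, query: str) -> dict:
        if query in self._cache:
            return self._cache[query]
        r = await self.client.chat.completions.create(
            model=self.model, messages=[{"role": "user", "content": query}]
        )
        data = {
            "question": query,
            "answer": r.choices[0].message.content,
            "citations": getattr(r, "citations", []),
        }
        self._cache[query] = data
        return data

    async def synthesize(self, topic: str, results: list) -> str:
        # Feed every sub-answer back to the model as plain-text notes
        notes = "\n\n".join(f"Q: {s['question']}\nA: {s['answer']}" for s in results)
        r = await self.client.chat.completions.create(
            model=self.model, messages=[
                {"role": "system", "content": "Combine these findings into a short report."},
                {"role": "user", "content": f"Topic: {topic}\n\n{notes}"},
            ])
        return r.choices[0].message.content


class FakeClient:
    """Offline stub so the sketch can run without network access."""

    def __init__(self):
        self.calls = 0
        self.chat = SimpleNamespace(completions=SimpleNamespace(create=self._create))

    async def _create(self, model, messages):
        self.calls += 1
        msg = SimpleNamespace(content=f"echo: {messages[-1]['content']}")
        return SimpleNamespace(choices=[SimpleNamespace(message=msg)], citations=[])
```

Two identical sub-questions launched concurrently can both miss the cache in this sketch; deduplicating the question list before `asyncio.gather` would avoid the extra request.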

Decision matrix:

| Factor | Direct Widget | Cached Layer | Research Pipeline |
|---|---|---|---|
| Volume | < 500/day | 500-5K/day | 5K+/day |
| Use Case | Quick answers | Repeated queries | Deep research |
| Latency | 2-5s | 50ms (cached) | 10-30s |
| Model | sonar | sonar | sonar-pro |
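
The volume row of the matrix can be turned into a starting-point selector, as in the sketch below. `Variant` and `pick_variant` are hypothetical names. Treating exactly 5,000 queries/day as pipeline territory is an assumption, since the table's ranges overlap at that value. Volume is only one factor, so the result is a default to revisit against use case and latency needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    model: str


# Names and models copied from the decision matrix.
DIRECT = Variant("Direct Widget", "sonar")
CACHED = Variant("Cached Layer", "sonar")
PIPELINE = Variant("Research Pipeline", "sonar-pro")


def pick_variant(queries_per_day: int) -> Variant:
    """Return the matrix column whose volume range contains queries_per_day."""
    if queries_per_day < 0:
        raise ValueError("queries_per_day must be non-negative")
    if queries_per_day < 500:
        return DIRECT
    if queries_per_day < 5000:  # assumption: 5,000 itself maps to the pipeline
        return CACHED
    return PIPELINE
```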

Common issues:

| Issue | Cause | Solution |
|---|---|---|
| Slow in UI | No caching | Cache repeated queries |
| High cost | sonar-pro everywhere | Route by complexity |
| Stale answers | Long cache TTL | Reduce TTL for current events |
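
Two of the fixes above, "route by complexity" and "reduce TTL for current events", can be approximated with simple heuristics. The sketch below is illustrative only. The word-count threshold, the keyword lists, and the 300-second short TTL are assumptions to tune against real traffic. Only the model names and the 1800-second default come from the examples above.

```python
import re

# Assumed hints that a query is about fast-moving news
CURRENT_EVENT_HINTS = ("today", "latest", "breaking", "this week", "right now")

# Assumed markers of a comparative or analytical question
COMPLEX_PATTERN = re.compile(r"\b(compare|versus|vs|analy[sz]e)\b", re.IGNORECASE)


def pick_model(query: str) -> str:
    """Send long or multi-part questions to sonar-pro, everything else to sonar."""
    long_query = len(query.split()) > 25
    multi_part = query.count("?") > 1 or bool(COMPLEX_PATTERN.search(query))
    return "sonar-pro" if long_query or multi_part else "sonar"


def pick_ttl(query: str, default_ttl: int = 1800, short_ttl: int = 300) -> int:
    """Use a shorter cache TTL (in seconds) when the query looks time-sensitive."""
    q = query.lower()
    return short_ttl if any(hint in q for hint in CURRENT_EVENT_HINTS) else default_ttl
```

Keyword matching like this will misclassify some queries. A cheap classifier call or per-endpoint configuration is a reasonable next step if the heuristics prove too coarse.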

Basic usage: Apply a Perplexity architecture variant to a standard project setup with the default configuration.
Advanced scenario: Customize the chosen variant for production environments with multiple constraints and team-specific requirements.
npx claudepluginhub nickloveinvesting/nick-love-plugins --plugin perplexity-pack