From perplexity-pack
Identifies Perplexity Sonar API pitfalls like generic chatbot misuse, ignoring citations, wrong SDK imports, and unset max_tokens during code reviews, onboarding, and audits.
Slash command: /perplexity-pack:perplexity-known-pitfalls
Real gotchas when integrating Perplexity Sonar API. Perplexity uses an OpenAI-compatible chat endpoint but performs live web searches -- a fundamentally different paradigm from standard LLM completions. These pitfalls come from treating it like a regular chatbot.
Perplexity searches the web per request. Using it for tasks that don't need web search wastes money.
# BAD: general chatbot (wastes a search query)
response = call_perplexity("Write me a haiku about cats")
# Costs $0.005+ for something any LLM can do offline
# GOOD: leverage web search capability
response = call_perplexity(
"What are the latest Next.js 15 features released this month?",
search_recency_filter="month"
)
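The snippets in this document call call_perplexity, which is never defined here. Below is a minimal sketch of what such a helper might look like, using only the standard library. It assumes the OpenAI-compatible /chat/completions path on https://api.perplexity.ai, a PERPLEXITY_API_KEY environment variable, and sonar as the default model. The names build_payload and call_perplexity and the max_tokens default are illustrative, not part of any official SDK.

```python
import json
import os
import urllib.request

# Assumed endpoint: the OpenAI-compatible chat completions path.
API_URL = "https://api.perplexity.ai/chat/completions"


def build_payload(query, model="sonar", max_tokens=512, **search_options):
    """Build the JSON body; extra options such as search_recency_filter pass through."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": query}],
        "max_tokens": max_tokens,
    }
    payload.update(search_options)
    return payload


def call_perplexity(query, **options):
    """Send one query and return the decoded JSON response (makes a network call)."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(query, **options)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['PERPLEXITY_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

This version returns a plain dict, so fields are read with data["choices"] and data.get("citations") directly rather than through model_dump().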
Perplexity returns [1], [2] markers in text with a separate citations array. Ignoring them loses the key value prop.
data = response.model_dump() # or response.json() for raw HTTP
answer = data["choices"][0]["message"]["content"]
citations = data.get("citations", []) # NOT in choices — top-level field
# BAD: displaying raw markers
print(answer) # "According to [1], Node.js 22 adds..."
# GOOD: replace markers with links
for i, url in enumerate(citations, 1):
    answer = answer.replace(f"[{i}]", f"[{i}]({url})")
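Sequential string replacement works for short answers. A single regex pass is one alternative that leaves alone markers with no matching citation and markers that are already linked. This is a sketch only, and link_citations is a hypothetical helper name:

```python
import re


def link_citations(answer, citations):
    """Turn [n] markers into Markdown links using the 1-based citations list."""

    def replace_marker(match):
        index = int(match.group(1))
        if 1 <= index <= len(citations):
            return f"[{index}]({citations[index - 1]})"
        return match.group(0)  # no matching citation: keep the marker unchanged

    # The negative lookahead skips markers that are already followed by "(".
    return re.sub(r"\[(\d+)\](?!\()", replace_marker, answer)
```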
There is no @perplexity/sdk or perplexity Python package. Use the standard OpenAI client.
// BAD — this package doesn't exist
import { PerplexityClient } from "@perplexity/sdk";
// GOOD — use OpenAI client with Perplexity base URL
import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.PERPLEXITY_API_KEY,
baseURL: "https://api.perplexity.ai",
});
Without max_tokens, responses can be arbitrarily long, increasing costs unpredictably.
// BAD: no token limit — output cost can spike
await client.chat.completions.create({
model: "sonar-pro", // $15/M output tokens!
messages: [{ role: "user", content: "Tell me about AI" }],
});
// GOOD: always set max_tokens
await client.chat.completions.create({
model: "sonar-pro",
messages: [{ role: "user", content: "Tell me about AI" }],
max_tokens: 1024,
});
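To see what a cap buys, the worst-case output cost of one request can be bounded from max_tokens and the per-token price. The sketch below uses the $15 per million output tokens figure quoted above for sonar-pro. It covers output tokens only, not input tokens or any per-request search fee, and max_output_cost_usd is a hypothetical helper name.

```python
def max_output_cost_usd(max_tokens, usd_per_million_output_tokens):
    """Upper bound on the output-token cost of one request, in US dollars."""
    return max_tokens * usd_per_million_output_tokens / 1_000_000


# 1024 tokens at $15 per million output tokens
print(f"{max_output_cost_usd(1024, 15.0):.5f}")  # prints 0.01536
```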
Without search_recency_filter, Perplexity may cite outdated articles.
# BAD: may return articles from any time period
response = call_perplexity("current Bitcoin price")
# GOOD: constrain to recent results
response = call_perplexity(
"current Bitcoin price",
search_recency_filter="day" # hour | day | week | month
)
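A typo in the filter value is easy to miss, so it may help to validate it before sending. The sketch below assumes the allowed values are exactly the four listed in the comment above; the API may accept others. with_recency is a hypothetical helper name.

```python
# Values taken from the comment above; treat this set as an assumption.
ALLOWED_RECENCY = {"hour", "day", "week", "month"}


def with_recency(payload, recency):
    """Return a copy of the request payload with a validated search_recency_filter."""
    if recency not in ALLOWED_RECENCY:
        raise ValueError(
            f"unsupported search_recency_filter {recency!r}; "
            f"expected one of {sorted(ALLOWED_RECENCY)}"
        )
    return {**payload, "search_recency_filter": recency}
```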
Each message in the conversation may trigger new search queries. Sending 20 turns of history is expensive and slow.
# BAD: 20 turns of history = many search queries
messages = long_history + [{"role": "user", "content": "summarize"}]
# GOOD: summarize context, send focused query
messages = [
{"role": "system", "content": "Answer based on web search."},
{"role": "user", "content": f"Context: {summary}\nQuestion: {question}"}
]
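When a summary is not available, another option is to send only the system message plus the most recent turns. The sketch below assumes messages are dicts with "role" and "content" keys. It also drops any leading non-user turn so the trimmed window starts with a user message, on the assumption that the API expects user and assistant turns to alternate. trim_history and max_recent are hypothetical names.

```python
def trim_history(messages, max_recent=4):
    """Keep system messages plus at most max_recent of the latest other messages."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    recent = rest[-max_recent:] if max_recent > 0 else []
    # Drop leading assistant turns so the window begins with a user message.
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return system + recent
```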
sonar-pro costs roughly 3x (input) to 15x (output) more per token than sonar. Using it for simple factual lookups wastes budget.
// BAD: sonar-pro for a trivial question
await client.chat.completions.create({
model: "sonar-pro", // $3 input + $15 output per M tokens
messages: [{ role: "user", content: "What is the capital of France?" }],
});
// GOOD: match model to complexity
const model = isComplexQuery(query) ? "sonar-pro" : "sonar";
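isComplexQuery is not defined in this document. One possible heuristic, sketched in Python, routes on query length and a few keywords. The keyword list and the word-count threshold are illustrative guesses, not tuned values, and is_complex_query and pick_model are hypothetical names.

```python
# Illustrative hints only; a real router would be tuned on actual traffic.
COMPLEX_HINTS = ("compare", "analyze", "trade-off", "step by step", "pros and cons")


def is_complex_query(query, max_simple_words=12):
    """Guess whether a query needs the larger model."""
    text = query.lower()
    too_long = len(text.split()) > max_simple_words
    return too_long or any(hint in text for hint in COMPLEX_HINTS)


def pick_model(query):
    """Route simple lookups to sonar and everything else to sonar-pro."""
    return "sonar-pro" if is_complex_query(query) else "sonar"
```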
search_domain_filter supports either allowlist (include) or denylist (exclude with - prefix), but not both in the same request.
// BAD: mixing modes
search_domain_filter: ["python.org", "-reddit.com"] // ERROR
// GOOD: pick one mode
search_domain_filter: ["python.org", "docs.python.org"] // Allowlist
// OR
search_domain_filter: ["-reddit.com", "-quora.com"] // Denylist
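A small check before the request is sent can catch a mixed filter early. This sketch follows the rule stated above, where entries prefixed with "-" are denylist entries; validate_domain_filter is a hypothetical helper name.

```python
def validate_domain_filter(domains):
    """Return "allowlist" or "denylist"; raise ValueError if the two modes are mixed."""
    denied = [d for d in domains if d.startswith("-")]
    if denied and len(denied) != len(domains):
        raise ValueError(
            "search_domain_filter mixes allowlist and denylist entries: "
            f"{domains!r}"
        )
    return "denylist" if denied else "allowlist"
```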
Every uncached call performs a web search. At scale, duplicate queries burn budget.
// BAD: same query hits API every time
app.get("/search", async (req, res) => {
const result = await client.chat.completions.create({ ... });
res.json(result);
});
// GOOD: cache by query hash
const cache = new LRUCache({ max: 1000, ttl: 3600_000 });
app.get("/search", async (req, res) => {
const key = hash(req.query.q);
if (cache.has(key)) return res.json(cache.get(key));
const result = await client.chat.completions.create({ ... });
cache.set(key, result);
res.json(result);
});
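The same idea can be sketched in Python with the standard library only. This version normalizes the query before hashing so trivially different spellings share a cache entry. It evicts the oldest entry when full, which is simpler than a true LRU policy. TTLCache, cache_key, cached_search, and the fetch callback are all hypothetical names; the one-hour TTL and 1000-entry limit mirror the values above.

```python
import hashlib
import time


def cache_key(query):
    """Hash a case- and whitespace-normalized query."""
    normalized = " ".join(query.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class TTLCache:
    """Tiny in-memory cache with per-entry expiry and oldest-first eviction."""

    def __init__(self, ttl_seconds=3600.0, max_entries=1000, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._max = max_entries
        self._clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key, value):
        self._store.pop(key, None)
        if len(self._store) >= self._max:
            del self._store[next(iter(self._store))]  # evict the oldest insertion
        self._store[key] = (self._clock() + self._ttl, value)


def cached_search(query, fetch, cache):
    """Return a cached result if present; otherwise call fetch(query) and store it."""
    key = cache_key(query)
    hit = cache.get(key)
    if hit is not None:
        return hit
    result = fetch(query)
    cache.set(key, result)
    return result
```

Passing the clock in as a parameter keeps expiry testable without sleeping.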
The API is at api.perplexity.ai, not api.perplexity.com.
// BAD
baseURL: "https://api.perplexity.com" // Wrong domain
// GOOD
baseURL: "https://api.perplexity.ai" // Correct
- Using the openai package, not fake @perplexity/sdk
- Base URL: https://api.perplexity.ai
- max_tokens set on every request
- Citations read from the response.citations array
- search_recency_filter used for time-sensitive queries

| Pitfall | Impact | Detection |
|---|---|---|
| No caching | 3-5x cost overrun | Check cache hit rate metric |
| Wrong model | Budget waste | Grep for sonar-pro in simple query paths |
| No max_tokens | Unpredictable costs | Grep for create() calls without max_tokens |
| PII in queries | Privacy violation | Run sanitization check in CI |
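The "grep for create() calls without max_tokens" check in the table can be approximated with a short script. This sketch scans source text for chat.completions.create( and inspects the argument span by counting parentheses. It is a heuristic: parentheses inside string literals will confuse it, and it cannot see arguments passed through a spread or **kwargs. find_calls_missing_max_tokens is a hypothetical name.

```python
import re

CREATE_CALL = re.compile(r"chat\.completions\.create\s*\(")


def find_calls_missing_max_tokens(source):
    """Return 1-based line numbers of create() calls whose arguments lack max_tokens."""
    missing = []
    for match in CREATE_CALL.finditer(source):
        depth, pos = 1, match.end()
        # Walk forward to the parenthesis that closes this call.
        while pos < len(source) and depth > 0:
            if source[pos] == "(":
                depth += 1
            elif source[pos] == ")":
                depth -= 1
            pos += 1
        call_arguments = source[match.end():pos]
        if "max_tokens" not in call_arguments:
            missing.append(source.count("\n", 0, match.start()) + 1)
    return missing
```

In CI, the function could be run over each source file, failing the build when the returned list is non-empty.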
npx claudepluginhub jeremylongshore/claude-code-plugins-plus-skills --plugin perplexity-pack