From anthropic-pack
Provides Claude API reference architectures: sync FastAPI gateway, async Redis queues, multi-model routing. Use when designing scalable Anthropic integrations.
Slash command: /anthropic-pack:anth-reference-architecture
Three validated architecture patterns for Claude API integrations: synchronous API gateway, async queue-based processing, and multi-model routing.
User → API Gateway → Claude Service → Messages API
                          ↓
                    Response → User
# Best for: chatbots, interactive tools, low-volume (<100 RPM)
from fastapi import Body, FastAPI
import anthropic

app = FastAPI()
# Async client: a blocking call inside `async def` would stall the event loop.
client = anthropic.AsyncAnthropic(max_retries=3, timeout=60.0)

@app.post("/chat")
async def chat(prompt: str = Body(..., embed=True)):
    # Expects a JSON body of the form {"prompt": "..."}.
    msg = await client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return {"text": msg.content[0].text, "tokens": msg.usage.output_tokens}
User → API → Queue (Redis/SQS) → Worker Pool → Messages API
        ↑                                           ↓
        └──────────── Status/Result ←── Result Store ←───┘
# Best for: batch processing, high-volume, background tasks
import uuid

import anthropic
from redis import Redis
from rq import Queue

redis = Redis()
task_queue = Queue("claude-tasks", connection=redis)
result_store = Redis(db=1)

def process_task(task_id: str, prompt: str, model: str):
    # Runs inside an RQ worker: call Claude, then keep the text for one hour.
    client = anthropic.Anthropic()
    msg = client.messages.create(
        model=model,
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    result_store.setex(f"result:{task_id}", 3600, msg.content[0].text)

# Enqueue: returns the ID the caller uses to look up the result later
def submit(prompt: str) -> str:
    task_id = str(uuid.uuid4())
    task_queue.enqueue(process_task, task_id, prompt, "claude-sonnet-4-20250514")
    return task_id
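The diagram's Status/Result path has no code of its own, so here is a minimal sketch of the read side. It assumes the worker writes to the `result:{task_id}` key as `process_task` does. The names `get_result`, `ResultStore`, and the `"pending"`/`"done"` status values are hypothetical, and `store` can be any object with a Redis-like `get` method.

```python
from typing import Optional, Protocol


class ResultStore(Protocol):
    """Anything with a Redis-like get(); a redis.Redis instance fits this shape."""

    def get(self, name: str) -> Optional[bytes]: ...


def get_result(store: ResultStore, task_id: str) -> dict:
    """Return the stored text if the worker has finished, else a pending status.

    Assumption: a missing key means either "not finished yet" or "expired
    after the TTL"; this sketch does not distinguish the two.
    """
    raw = store.get(f"result:{task_id}")
    if raw is None:
        return {"task_id": task_id, "status": "pending"}
    # redis-py returns bytes unless the client was created with decode_responses=True.
    text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
    return {"task_id": task_id, "status": "done", "text": text}
```

A status endpoint could call `get_result(result_store, task_id)` and return the dict as JSON. Telling failed tasks apart from pending ones would need a separate failure key or the queue's own job status.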
User → Router → Haiku (classify/extract)
              → Sonnet (general/code)
              → Opus (research/complex)
              → Batches (bulk/offline)
import anthropic

class ModelRouter:
    def __init__(self):
        self.client = anthropic.Anthropic()
        self.classifier = anthropic.Anthropic()  # Can be same client

    def route_and_execute(self, prompt: str, context: dict) -> str:
        # Step 1: Classify with Haiku (cheap, fast)
        # Model IDs change over time; check them against the current model list.
        classification = self.classifier.messages.create(
            model="claude-3-5-haiku-20241022",
            max_tokens=32,
            messages=[{
                "role": "user",
                "content": f"Classify this request as: simple|moderate|complex|bulk\n\n{prompt[:200]}"
            }]
        )
        complexity = classification.content[0].text.strip().lower()

        # Step 2: Route to appropriate model
        # "bulk" has no entry here, so it falls through to the Sonnet default;
        # send it to the Batches path separately if offline processing is wanted.
        model_map = {
            "simple": "claude-3-5-haiku-20241022",
            "moderate": "claude-sonnet-4-20250514",
            "complex": "claude-opus-4-20250514",
        }
        model = model_map.get(complexity, "claude-sonnet-4-20250514")

        # Step 3: Execute with selected model
        msg = self.client.messages.create(
            model=model,
            max_tokens=4096,
            messages=[{"role": "user", "content": prompt}]
        )
        return msg.content[0].text
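The router matches the classifier's reply exactly, so a reply such as "Complex." or "This is moderate" silently falls back to the default. Below is a sketch of a more tolerant lookup, assuming the same four labels. `parse_label`, `pick_model`, and `LABELS` are hypothetical names, and the model map is passed in rather than hard-coded so no model IDs are assumed here.

```python
import re
from typing import Mapping, Optional

LABELS = ("simple", "moderate", "complex", "bulk")


def parse_label(reply: str) -> Optional[str]:
    """Return the first known label that appears as a whole word in the reply."""
    for word in re.findall(r"[a-z]+", reply.lower()):
        if word in LABELS:
            return word
    return None


def pick_model(reply: str, model_map: Mapping[str, str], default: str) -> Optional[str]:
    """Map a classifier reply to a model ID.

    Returns None for "bulk" so the caller can hand the request to a batch
    path instead (assumption: batch handling lives elsewhere).
    """
    label = parse_label(reply)
    if label is None:
        return default
    if label == "bulk":
        return None
    return model_map.get(label, default)
```

Taking the first matching word is a simplification: a reply like "not simple, complex" would resolve to `simple`. Constraining the classifier prompt to a one-word answer keeps that case rare.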
my-claude-app/
├── src/
│   ├── main.py                  # FastAPI app
│   ├── claude/
│   │   ├── client.py            # Singleton + config
│   │   ├── router.py            # Model routing logic
│   │   ├── tools.py             # Tool definitions
│   │   └── prompts/             # System prompts as files
│   ├── workers/
│   │   └── claude_worker.py     # Queue consumer
│   └── middleware/
│       ├── rate_limiter.py      # App-level rate limiting
│       └── cost_tracker.py      # Spend monitoring
├── tests/
│   ├── unit/                    # Mocked tests
│   └── integration/             # Live API tests
└── config/
    ├── .env.development
    ├── .env.staging
    └── .env.production
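The tree lists `middleware/rate_limiter.py` for app-level rate limiting but does not show what it might contain. One common approach is a token bucket, sketched here under assumptions: the `TokenBucket` class, its parameters, and the injectable `clock` are all hypothetical, and this version is in-process only and not thread-safe.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow about `rate_per_min` requests per minute, with bursts up to `capacity`."""

    def __init__(self, rate_per_min: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate_per_sec = rate_per_min / 60.0
        self.capacity = float(capacity)
        self.tokens = float(capacity)  # start with a full bucket
        self.clock = clock
        self.last = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; False means the caller should back off."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A gateway handler could call `try_acquire()` before each Claude request and return HTTP 429 when it is `False`. With several worker processes, the bucket state would need to live in a shared store such as Redis.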
| Architecture | Failure Mode | Mitigation |
|---|---|---|
| Sync Gateway | 429/5xx blocks user | Circuit breaker + fallback response |
| Queue-Based | Worker crashes | Dead-letter queue + retry policy |
| Multi-Model | Router misclassifies | Default to Sonnet (safest middle) |
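The table names "circuit breaker + fallback response" as the sync gateway mitigation without showing one. The following is a minimal sketch of that idea, not a production implementation. `CircuitBreaker`, `max_failures`, `reset_after`, and the injectable `clock` are hypothetical names, and it catches every exception for brevity, where a real gateway would likely narrow this to the SDK's rate-limit and server-error exceptions.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Open after `max_failures` consecutive errors; allow a retry after `reset_after` seconds."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T], fallback: T) -> T:
        # While open, skip the upstream call until the cool-down has elapsed.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback
            self.opened_at = None  # half-open: let one trial call through
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback
        self.failures = 0  # any success closes the circuit
        return result
```

In the gateway, `fn` would wrap the `messages.create` call and `fallback` would be a canned reply such as "The assistant is busy, please retry shortly." The async handler would need an `async` variant of `call` that awaits `fn`.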
For multi-environment setup, see anth-multi-env-setup.
npx claudepluginhub jeremylongshore/claude-code-plugins-plus-skills --plugin anthropic-pack