Neurometric CLI - check status, replay captures, and optimize AI costs
Slash command: /neurometric:neurometric
Unified Neurometric CLI for monitoring, replaying, and optimizing AI API usage.
/neurometric <command> [options]
Commands:
- status - Check gateway connection status
- replay [count] - View recent API captures
- optimize [options] - Get cost optimization recommendations
Check the current Neurometric gateway connection status.
/neurometric status
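Shows the current status of the Neurometric gateway connection, including the masked API key, the gateway URL, the number of captures in the last 24 hours, and the providers that have environment variables configured.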
When the user invokes /neurometric status:
# Check if NEUROMETRIC_API_KEY is set
if [ -z "$NEUROMETRIC_API_KEY" ]; then
echo "NEUROMETRIC_API_KEY is not set"
exit 1
fi
# Check gateway connectivity
curl -s -H "Authorization: Bearer $NEUROMETRIC_API_KEY" \
"https://api.neurometric.ai/v1/status"
If connected, show:
Neurometric Status: Connected
API Key: ...${NEUROMETRIC_API_KEY: -8} (last 8 chars)
Gateway: https://api.neurometric.ai
Captures: <count> API calls (last 24h)
Environment variables configured for: OpenAI, Anthropic, Cohere, Mistral, Groq, Together
If not configured, show:
Neurometric Status: Not Configured
NEUROMETRIC_API_KEY is not set. To enable capture:
export NEUROMETRIC_API_KEY="sk_live_..."
Then restart your Claude Code session.
If connection fails, show:
Neurometric Status: Connection Failed
Could not reach https://api.neurometric.ai
Check your network connection and API key.
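The three status outcomes above can be sketched as a single rendering function. This is a minimal illustration, not the plugin's implementation: `maskApiKey`, `renderStatus`, and the shape of the `probe` argument are hypothetical names introduced here, and the provider line is left out for brevity.

```javascript
// Sketch only: maskApiKey, renderStatus and the probe shape are assumptions,
// not part of the Neurometric plugin.
const GATEWAY = "https://api.neurometric.ai";

// Show only the last 8 characters of the key, as in the Connected message.
function maskApiKey(key) {
  return "..." + key.slice(-8);
}

// probe: { ok: boolean, captures?: number }, an assumed summary of the
// result of the /v1/status request.
function renderStatus(apiKey, probe) {
  if (!apiKey) {
    return [
      "Neurometric Status: Not Configured",
      "NEUROMETRIC_API_KEY is not set. To enable capture:",
      'export NEUROMETRIC_API_KEY="sk_live_..."',
      "Then restart your Claude Code session.",
    ].join("\n");
  }
  if (!probe.ok) {
    return [
      "Neurometric Status: Connection Failed",
      `Could not reach ${GATEWAY}`,
      "Check your network connection and API key.",
    ].join("\n");
  }
  return [
    "Neurometric Status: Connected",
    `API Key: ${maskApiKey(apiKey)} (last 8 chars)`,
    `Gateway: ${GATEWAY}`,
    `Captures: ${probe.captures} API calls (last 24h)`,
  ].join("\n");
}

console.log(renderStatus(process.env.NEUROMETRIC_API_KEY, { ok: true, captures: 0 }));
```

Checking for the missing key first mirrors the shell snippet, which exits before attempting any network call.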
neurometric.config.json
Fetch and display recent AI API calls captured by Neurometric.
/neurometric replay [count]
Arguments:
count (optional): Number of recent prompts to show. Default: 5, Max: 20
Fetches and displays recent API calls captured during this session or from Neurometric's history. Useful for:
When the user invokes /neurometric replay:
// Default to 5 when no valid number is given; clamp to the range 1..20
const count = Math.min(Math.max(parseInt(args, 10) || 5, 1), 20);
curl -s -H "Authorization: Bearer $NEUROMETRIC_API_KEY" \
"https://api.neurometric.ai/v1/captures?session_id=$SESSION_ID&limit=$COUNT"
For each capture, show:
─────────────────────────────────────────
Capture #<index> | <timestamp> | <provider>/<model>
─────────────────────────────────────────
PROMPT:
<system prompt if present>
<user message>
RESPONSE:
<assistant response>
Tokens: <prompt_tokens> in / <completion_tokens> out
Latency: <latency_ms>ms
─────────────────────────────────────────
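As a rough sketch of how the count handling and the per-capture layout could fit together: the field names `timestamp`, `provider`, `model`, `prompt_tokens`, `completion_tokens`, and `latency_ms` are taken from the placeholders above, while `system`, `user`, `response`, the function names, and the lower bound of 1 on the count are assumptions made for illustration. The real `/v1/captures` response shape may differ.

```javascript
// Sketch only: clampCount, formatCapture and the capture object shape are assumptions.
const RULE = "─".repeat(41);

// Default 5, maximum 20 (from the argument description); the minimum of 1 is an added guard.
function clampCount(arg) {
  const n = Number.parseInt(arg, 10) || 5;
  return Math.min(Math.max(n, 1), 20);
}

// Render one capture in the layout shown above.
function formatCapture(capture, index) {
  const lines = [
    RULE,
    `Capture #${index} | ${capture.timestamp} | ${capture.provider}/${capture.model}`,
    RULE,
    "PROMPT:",
  ];
  if (capture.system) lines.push(capture.system); // system prompt only if present
  lines.push(capture.user, "RESPONSE:", capture.response);
  lines.push(`Tokens: ${capture.prompt_tokens} in / ${capture.completion_tokens} out`);
  lines.push(`Latency: ${capture.latency_ms}ms`, RULE);
  return lines.join("\n");
}

const sample = {
  timestamp: "2026-03-01T12:00:00Z",
  provider: "openai",
  model: "gpt-4o",
  user: "Summarize this document.",
  response: "Here is a summary.",
  prompt_tokens: 120,
  completion_tokens: 45,
  latency_ms: 812,
};
console.log(formatCapture(sample, 1));
```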
If no captures are found, show:
No captures found for this session.
Make sure:
- NEUROMETRIC_API_KEY is set (/neurometric status)
- You have made AI API calls in this session
- The Neurometric gateway is reachable
After showing the captures, ask:
Would you like to replay any of these prompts? Specify the capture number to re-run it.
If the user selects a capture to replay:
NEUROMETRIC_API_KEY to be set
/v1/captures endpoint
Analyze AI usage and get recommendations for cost optimization, model routing, and latency improvements.
/neurometric optimize [mode] [options]
Modes:
- --captures (default): Fetch recent API calls from Neurometric and analyze actual usage
- --scan [path]: Scan codebase for AI SDK patterns (default path: ./)
- --describe "...": Analyze a natural language workflow description
Options:
- --days N: Analysis period for captures mode (default: 7)
- --output json: Output as JSON instead of formatted text
/neurometric optimize
/neurometric optimize --captures --days 30
/neurometric optimize --scan ./src
/neurometric optimize --describe "RAG pipeline that extracts JSON from documents and summarizes them"
When the user invokes /neurometric optimize:
Determine the mode and options from the arguments:
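// Default values
let mode = 'captures';
let scanPath = './';
let description = '';
let days = 7;
let outputJson = false;

// --scan [path]: the path is optional and may contain hyphens (e.g. ./my-app),
// so accept any token that does not start with "--"
if (args.includes('--scan')) {
  mode = 'scan';
  const scanMatch = args.match(/--scan\s+(?!--)(\S+)/);
  if (scanMatch) scanPath = scanMatch[1];
}

// --describe "...": the description must be wrapped in double quotes
if (args.includes('--describe')) {
  mode = 'describe';
  const descMatch = args.match(/--describe\s+"([^"]+)"/);
  if (descMatch) description = descMatch[1];
}

// --days N: analysis period for captures mode
const daysMatch = args.match(/--days\s+(\d+)/);
if (daysMatch) days = parseInt(daysMatch[1], 10);

// --output json: tolerate extra whitespace between the flag and its value
if (/--output\s+json\b/.test(args)) outputJson = true;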
Mode: captures (default)
Fetch recent API calls from Neurometric and analyze:
curl -s -H "Authorization: Bearer $NEUROMETRIC_API_KEY" \
"https://api.neurometric.ai/v1/captures?days=$DAYS&limit=1000"
Parse the response and extract:
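One plausible extraction step is to group the captures by model and total their token counts and latency. The sketch below assumes the response has already been parsed into an array of objects carrying the `provider`, `model`, `prompt_tokens`, `completion_tokens`, and `latency_ms` fields used in the replay output; `summarizeByModel` is a hypothetical helper, and the actual response shape is not documented here.

```javascript
// Sketch only: summarizeByModel and the capture shape are assumptions.
function summarizeByModel(captures) {
  const totals = new Map();
  for (const c of captures) {
    const key = `${c.provider}/${c.model}`;
    const t = totals.get(key) ?? { calls: 0, promptTokens: 0, completionTokens: 0, latencyMs: 0 };
    t.calls += 1;
    t.promptTokens += c.prompt_tokens;
    t.completionTokens += c.completion_tokens;
    t.latencyMs += c.latency_ms;
    totals.set(key, t);
  }
  // One row per model, with mean latency instead of the running sum.
  return [...totals].map(([model, t]) => ({
    model,
    calls: t.calls,
    promptTokens: t.promptTokens,
    completionTokens: t.completionTokens,
    avgLatencyMs: t.latencyMs / t.calls,
  }));
}

const rows = summarizeByModel([
  { provider: "openai", model: "gpt-4o", prompt_tokens: 100, completion_tokens: 50, latency_ms: 800 },
  { provider: "openai", model: "gpt-4o", prompt_tokens: 300, completion_tokens: 150, latency_ms: 1000 },
  { provider: "anthropic", model: "claude-sonnet-4", prompt_tokens: 200, completion_tokens: 80, latency_ms: 700 },
]);
console.log(rows);
```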
Mode: scan
Scan the codebase for AI SDK usage patterns:
# Find OpenAI SDK usage (quote the path so directories with spaces work)
grep -r "openai\." --include="*.py" --include="*.ts" --include="*.js" "$SCAN_PATH"
# Find Anthropic SDK usage
grep -r "anthropic\." --include="*.py" --include="*.ts" --include="*.js" "$SCAN_PATH"
# Find model specifications
grep -rE "(gpt-4|gpt-3|claude|gemini)" --include="*.py" --include="*.ts" --include="*.js" "$SCAN_PATH"
Analyze the patterns to identify:
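To make the grep results easier to analyze, the same three patterns can be applied line by line and tagged by kind. This is a sketch under assumptions: `scanSource` and its output shape are hypothetical, and it works on a source string that the caller has already read from disk.

```javascript
// Sketch only: scanSource and the hit shape are assumptions.
// The patterns are the same ones used by the grep commands above.
const PATTERNS = {
  openai: /openai\./,
  anthropic: /anthropic\./,
  model: /(gpt-4|gpt-3|claude|gemini)/,
};

// Return one hit per (line, pattern) match, with 1-based line numbers.
function scanSource(file, source) {
  const hits = [];
  source.split("\n").forEach((text, i) => {
    for (const [kind, re] of Object.entries(PATTERNS)) {
      if (re.test(text)) hits.push({ file, line: i + 1, kind, text: text.trim() });
    }
  });
  return hits;
}

const example = [
  'import OpenAI from "openai";',
  'const r = await openai.chat.completions.create({ model: "gpt-4o" });',
].join("\n");
const hits = scanSource("app.js", example);
console.log(hits);
```

Note that the import line is not reported, because the `openai\.` pattern requires a literal dot after the name.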
Mode: describe
Parse the workflow description and classify tasks based on keywords:
Use this pricing table (March 2026):
| Model | Input ($/1M tokens) | Output ($/1M tokens) | P50 Latency | Best For |
|---|---|---|---|---|
| gpt-4o | 5.00 | 15.00 | 800ms | Complex reasoning, code generation |
| gpt-4o-mini | 0.15 | 0.60 | 400ms | Simple tasks, extraction, classification |
| gpt-4-turbo | 10.00 | 30.00 | 1000ms | Legacy, complex tasks |
| gpt-3.5-turbo | 0.50 | 1.50 | 300ms | Simple chat, legacy |
| claude-sonnet-4 | 3.00 | 15.00 | 700ms | Balanced performance, coding |
| claude-opus-4 | 15.00 | 75.00 | 1200ms | Most complex reasoning |
| claude-haiku-3.5 | 0.80 | 4.00 | 300ms | Fast, simple tasks |
| gemini-2.0-flash | 0.075 | 0.30 | 200ms | Speed-critical, high volume |
| gemini-2.0-pro | 1.25 | 5.00 | 500ms | Balanced Gemini option |
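Because the prices are quoted per million tokens, the cost of a call is the prompt tokens times the input price plus the completion tokens times the output price, each divided by 1,000,000. A minimal sketch using the table above follows; `PRICING` and `callCost` are illustrative names, not part of the plugin.

```javascript
// Prices in USD per 1M tokens, copied from the pricing table above.
const PRICING = {
  "gpt-4o": { input: 5.0, output: 15.0 },
  "gpt-4o-mini": { input: 0.15, output: 0.6 },
  "gpt-4-turbo": { input: 10.0, output: 30.0 },
  "gpt-3.5-turbo": { input: 0.5, output: 1.5 },
  "claude-sonnet-4": { input: 3.0, output: 15.0 },
  "claude-opus-4": { input: 15.0, output: 75.0 },
  "claude-haiku-3.5": { input: 0.8, output: 4.0 },
  "gemini-2.0-flash": { input: 0.075, output: 0.3 },
  "gemini-2.0-pro": { input: 1.25, output: 5.0 },
};

// Sketch only: callCost is a hypothetical helper.
function callCost(model, promptTokens, completionTokens) {
  const p = PRICING[model];
  if (!p) throw new Error(`No pricing for model: ${model}`);
  return (promptTokens / 1e6) * p.input + (completionTokens / 1e6) * p.output;
}

// 1M prompt tokens and 200k completion tokens:
// gpt-4o is about 5.00 + 3.00 = 8.00 USD, gpt-4o-mini about 0.15 + 0.12 = 0.27 USD.
console.log(callCost("gpt-4o", 1_000_000, 200_000).toFixed(2));
console.log(callCost("gpt-4o-mini", 1_000_000, 200_000).toFixed(2));
```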
Task Classification Heuristics:
| Signal | Task Type | Recommended Tier |
|---|---|---|
| response_format: json or JSON schema | Extraction | Mini/Flash |
| tools: or functions: present | Function calling | Mini/Medium |
| Short input, short output | Simple Q&A | Mini/Flash |
| Long output (>500 tokens) | Generation | Medium |
| Multi-turn or chain-of-thought | Reasoning | Large |
| Code in prompt or response | Coding | Medium/Large |
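The heuristics can be read as an ordered set of checks. The sketch below is one possible reading, with several assumptions: the table does not say which signal wins when more than one applies, so the precedence here is a guess; `containsCode` and `turns` are hypothetical fields the caller would have to compute; and "Simple Q&A" is treated as the fallback because no threshold for "short" is given.

```javascript
// Sketch only: classifyCall, the field names containsCode/turns and the check order are assumptions.
function classifyCall(call) {
  if (call.response_format) return { taskType: "Extraction", tier: "Mini/Flash" };
  if (call.tools || call.functions) return { taskType: "Function calling", tier: "Mini/Medium" };
  if (call.containsCode) return { taskType: "Coding", tier: "Medium/Large" };
  if (call.turns > 1) return { taskType: "Reasoning", tier: "Large" };
  if (call.completion_tokens > 500) return { taskType: "Generation", tier: "Medium" };
  // Nothing else matched: treat as short input, short output.
  return { taskType: "Simple Q&A", tier: "Mini/Flash" };
}

console.log(classifyCall({ response_format: { type: "json_object" } }));
console.log(classifyCall({ prompt_tokens: 300, completion_tokens: 800 }));
console.log(classifyCall({ prompt_tokens: 20, completion_tokens: 10 }));
```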
Model Recommendations by Task:
| Task Type | Budget | Balanced | Premium |
|---|---|---|---|
| Extraction | gemini-2.0-flash | gpt-4o-mini | gpt-4o |
| Classification | gemini-2.0-flash | gpt-4o-mini | claude-sonnet-4 |
| Simple chat | gpt-4o-mini | claude-haiku-3.5 | claude-sonnet-4 |
| Generation | gpt-4o-mini | claude-sonnet-4 | gpt-4o |
| Coding | gpt-4o-mini | claude-sonnet-4 | claude-opus-4 |
| Complex reasoning | claude-sonnet-4 | gpt-4o | claude-opus-4 |
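The recommendation table maps naturally onto a lookup keyed by task type and tier. A minimal sketch follows; `RECOMMENDATIONS` and `recommendModel` are illustrative names, and defaulting to the Balanced column is an assumption rather than documented behavior.

```javascript
// Rows copied from the recommendation table above.
const RECOMMENDATIONS = {
  "Extraction": { budget: "gemini-2.0-flash", balanced: "gpt-4o-mini", premium: "gpt-4o" },
  "Classification": { budget: "gemini-2.0-flash", balanced: "gpt-4o-mini", premium: "claude-sonnet-4" },
  "Simple chat": { budget: "gpt-4o-mini", balanced: "claude-haiku-3.5", premium: "claude-sonnet-4" },
  "Generation": { budget: "gpt-4o-mini", balanced: "claude-sonnet-4", premium: "gpt-4o" },
  "Coding": { budget: "gpt-4o-mini", balanced: "claude-sonnet-4", premium: "claude-opus-4" },
  "Complex reasoning": { budget: "claude-sonnet-4", balanced: "gpt-4o", premium: "claude-opus-4" },
};

// Sketch only: returns null for an unknown task type or tier.
function recommendModel(taskType, tier = "balanced") {
  const row = RECOMMENDATIONS[taskType];
  if (!row) return null;
  return row[tier] ?? null;
}

console.log(recommendModel("Extraction"));        // balanced column
console.log(recommendModel("Coding", "premium")); // premium column
```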
For each task category:
Standard Output:
+-------------------------------------------------------------+
| Neurometric Optimization Report |
+-------------------------------------------------------------+
| API Calls Analyzed: {total_calls} |
| Current Cost: ${current_cost}/month (projected) |
| Optimized Cost: ${optimized_cost}/month (projected) |
| Potential Savings: ${savings}/month ({percent}%) |
+-------------------------------------------------------------+
## Recommendations
### High Impact: {task_description} -> {recommended_model}
- Current: {current_model} (${current_monthly})
- Recommended: {recommended_model} (${recommended_monthly})
- Savings: ${savings}/mo ({percent}%)
- Latency: {latency_change}
- Confidence: {High|Medium|Low}
- Rationale: {why_this_recommendation}
### Medium Impact: ...
## Summary by Task Type
| Task Type | Calls | Current Model | Recommended | Monthly Savings |
|-----------|-------|---------------|-------------|-----------------|
| Extraction | 450 | gpt-4o | gpt-4o-mini | $66.45 |
| Generation | 200 | gpt-4o | claude-sonnet-4 | $12.00 |
| ...
## Next Steps
1. Update model selection for extraction tasks (highest impact)
2. Consider implementing model routing based on task complexity
3. Review the {N} calls using premium models for simple tasks
---
View full analysis at: https://studio.neurometric.ai/optimize
JSON Output (when --output json):
{
"summary": {
"calls_analyzed": 1247,
"current_cost_monthly": 127.45,
"optimized_cost_monthly": 48.20,
"savings_monthly": 79.25,
"savings_percent": 62
},
"recommendations": [
{
"impact": "high",
"task_type": "extraction",
"current_model": "gpt-4o",
"recommended_model": "gpt-4o-mini",
"current_cost": 68.50,
"optimized_cost": 2.05,
"savings": 66.45,
"savings_percent": 97,
"latency_change_ms": -200,
"confidence": "high",
"rationale": "JSON extraction tasks don't require advanced reasoning"
}
],
"by_task_type": {
"extraction": { "calls": 450, "current": "gpt-4o", "recommended": "gpt-4o-mini" },
"generation": { "calls": 200, "current": "gpt-4o", "recommended": "claude-sonnet-4" }
}
}
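The `summary` fields are related by simple arithmetic: savings is current cost minus optimized cost, and the percentage is savings divided by current cost. The sketch below reproduces the sample values; `buildSummary` is a hypothetical helper, and rounding money to two decimals and the percentage to a whole number is an assumption inferred from the example.

```javascript
// Round to two decimal places for currency fields.
const round2 = (x) => Math.round(x * 100) / 100;

// Sketch only: buildSummary is an illustrative helper, not the plugin's code.
function buildSummary(callsAnalyzed, currentCost, optimizedCost) {
  const savings = currentCost - optimizedCost;
  return {
    calls_analyzed: callsAnalyzed,
    current_cost_monthly: round2(currentCost),
    optimized_cost_monthly: round2(optimizedCost),
    savings_monthly: round2(savings),
    // Guard against division by zero when nothing was spent.
    savings_percent: currentCost > 0 ? Math.round((savings / currentCost) * 100) : 0,
  };
}

// Using the sample figures: 127.45 - 48.20 = 79.25, and 79.25 / 127.45 rounds to 62%.
const summary = buildSummary(1247, 127.45, 48.2);
console.log(JSON.stringify({ summary }, null, 2));
```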
No NEUROMETRIC_API_KEY (captures mode):
Error: NEUROMETRIC_API_KEY not set
To use captures mode, set your API key:
export NEUROMETRIC_API_KEY="sk_live_..."
Alternatively, try:
/neurometric optimize --scan ./
/neurometric optimize --describe "your workflow"
No captures found:
No API calls found in the last {days} days.
Make sure:
- NEUROMETRIC_API_KEY is set (/neurometric status)
- You have made AI API calls through the gateway
- Try increasing the time range: --days 30
No AI SDK patterns found (scan mode):
No AI SDK usage patterns found in {path}
Searched for:
- OpenAI SDK (openai.*)
- Anthropic SDK (anthropic.*)
- Model references (gpt-*, claude-*, gemini-*)
Try:
- Specifying a different path: --scan ./src
- Using describe mode: --describe "your workflow"
| Command | Description |
|---|---|
| /neurometric status | Check gateway connection |
| /neurometric replay | Show last 5 captures |
| /neurometric replay 10 | Show last 10 captures |
| /neurometric optimize | Analyze captures for savings |
| /neurometric optimize --scan ./ | Scan codebase for AI patterns |
| /neurometric optimize --describe "..." | Analyze workflow description |
npx claudepluginhub neurometricai/neurometric-plugin