Sets up production monitoring for SaaS apps: uptime checks with UptimeRobot, error tracking with Sentry, performance metrics via Vercel/Netlify. Handles incidents and alerts.
Slash command: /solo-founder-superpowers:monitor
**This skill is for production monitoring and incident response.** For debugging specific bugs, use **debug**. For pre-launch readiness checks, use **go-live**. For security-specific monitoring (auth events, API abuse), use **secure**. For analytics and user behavior tracking, use **analytics**.
Basic Monitoring:
- [ ] Uptime monitoring (is site up?)
- [ ] Error tracking (are errors happening?)
- [ ] Performance monitoring (is it slow?)
- [ ] User activity (are people using it?)
- [ ] Critical alerts configured
- [ ] Check dashboard daily
See MONITORING-SETUP.md for implementation.
Without monitoring:
With monitoring:
Goal: Know about problems before users tell you.
Uptime monitoring - Pings your app at a fixed interval (typically every 1 to 5 minutes, depending on the tool and plan)
Free tools:
Setup:
1. Sign up for UptimeRobot
2. Add monitor for https://yourapp.com
3. Add your email (and a phone number, if you want text alerts) as alert contacts
4. Get notified if the site is down
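To make the idea of an uptime check concrete, here is a minimal TypeScript sketch of how a single ping result might be classified. The `ProbeResult` shape, the `classifyProbe` name, and the 5-second slow threshold are assumptions for illustration; they do not describe how UptimeRobot or any other service works internally.

```typescript
// Hypothetical shape of one uptime probe result (not any vendor's API).
interface ProbeResult {
  status: number | null; // HTTP status code, or null if the request failed or timed out
  latencyMs: number;     // time until the response (or the timeout), in milliseconds
}

type ProbeVerdict = "up" | "slow" | "down";

// Assumed threshold: a response slower than 5 seconds counts as "slow".
const SLOW_THRESHOLD_MS = 5000;

function classifyProbe(probe: ProbeResult): ProbeVerdict {
  // No response, or any status outside 200-399, is treated as down.
  if (probe.status === null || probe.status < 200 || probe.status >= 400) {
    return "down";
  }
  return probe.latencyMs > SLOW_THRESHOLD_MS ? "slow" : "up";
}

console.log(classifyProbe({ status: 200, latencyMs: 180 }));    // "up"
console.log(classifyProbe({ status: 503, latencyMs: 40 }));     // "down"
console.log(classifyProbe({ status: null, latencyMs: 30000 })); // "down"
```

A hosted monitor runs a check like this from outside your infrastructure, which is why it can still alert you when your own servers are unreachable.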
Error tracking - Captures JavaScript errors and API failures
Free tools:
Claude Code:
Add Sentry error tracking to my app:
- Install @sentry/nextjs (or appropriate package)
- Capture all frontend errors and API errors
- Include user context (email, ID)
- Configure source maps for readable stack traces
- Set up Sentry.init in both client and server entry points
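As a rough idea of what the resulting setup might contain, the sketch below builds the options object that is typically passed to `Sentry.init`. The `dsn`, `environment`, and `tracesSampleRate` options are real Sentry settings; the `buildSentryOptions` helper, the environment variable names, and the 10% sample rate are assumptions made for this example.

```typescript
// Subset of the Sentry.init options used in this sketch.
interface MonitoringOptions {
  dsn: string;
  environment: string;
  tracesSampleRate: number;
}

// Hypothetical helper: derive the options from environment variables.
// The variable names SENTRY_DSN and APP_ENV are assumptions.
function buildSentryOptions(env: Record<string, string | undefined>): MonitoringOptions {
  const dsn = env["SENTRY_DSN"];
  if (!dsn) {
    throw new Error("SENTRY_DSN is not set; errors would not be reported");
  }
  const environment = env["APP_ENV"] ?? "development";
  return {
    dsn,
    environment,
    // Assumed policy: sample 10% of traces in production, all of them elsewhere.
    tracesSampleRate: environment === "production" ? 0.1 : 1.0,
  };
}

// In a real app (assuming @sentry/nextjs is installed), both the client and
// server entry points would call something like:
//   Sentry.init(buildSentryOptions(process.env));
//   Sentry.setUser({ id: user.id, email: user.email }); // user context after sign-in
```

Failing loudly when the DSN is missing is a deliberate choice here: a silently disabled error tracker looks the same as an app with no errors.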
Lovable / Replit (paste into chat):
Add error tracking to my app. I want to be notified when errors happen.
Use Sentry (free tier). Show me how to:
1. Create a Sentry account and project
2. Add the tracking code to my app
3. Test that errors are being captured
Performance monitoring - Tracks page load times
Free tools:
Setup:
Must monitor:
Nice to have:
For MVP: Focus on the "must monitor" only.
Configure alerts for:
Critical (text me immediately):
Important (email within an hour):
Informational (daily digest):
Tell AI:
Configure monitoring alerts:
- Critical: Text to [phone]
- Important: Email to [email]
- Send summary: Daily at 9am
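The three alert tiers above amount to a small routing table. The following TypeScript sketch shows one way to express it; the type names and the `routeAlert` function are hypothetical and not part of any monitoring product.

```typescript
type AlertLevel = "critical" | "important" | "informational";
type AlertChannel = "sms" | "email" | "daily-digest";

interface MonitoringAlert {
  level: AlertLevel;
  message: string;
}

// Routing table mirroring the tiers: text, email, daily summary.
const CHANNEL_BY_LEVEL: Record<AlertLevel, AlertChannel> = {
  critical: "sms",
  important: "email",
  informational: "daily-digest",
};

function routeAlert(alert: MonitoringAlert): AlertChannel {
  return CHANNEL_BY_LEVEL[alert.level];
}

console.log(routeAlert({ level: "critical", message: "Site is down" }));          // "sms"
console.log(routeAlert({ level: "informational", message: "42 signups today" })); // "daily-digest"
```

Keeping the mapping in one table makes it easy to audit which events can wake you up, which helps keep alert noise down.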
5-minute morning check:
Daily Check:
1. Open monitoring dashboard
2. Check uptime (should be 100% yesterday)
3. Check error count (any spikes?)
4. Check performance (slower than usual?)
5. Review any alerts from overnight
If all green: You're done, 5 minutes.
If red: Investigate using debug skill.
Green: Site responding
Red: Site down or slow to respond
What to check:
Look for:
Priority:
Look for:
When errors spike:
1. Open error tracking dashboard (Sentry)
2. Find the most frequent error
3. Read error message and stack trace
4. Note: How many users affected?
5. Note: Started when?
6. Check: Did we deploy recently?
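Steps 2 through 5 are what an error tracker's dashboard computes for you. As an illustration only, the TypeScript sketch below groups raw error events by message and reports frequency, affected users, and first occurrence. The `CapturedError` shape is an assumption and does not match Sentry's actual event format.

```typescript
// Hypothetical, simplified error event (not Sentry's event schema).
interface CapturedError {
  message: string;
  userId: string;
  timestamp: string; // ISO 8601 in UTC, e.g. "2024-05-01T10:00:00Z"
}

interface ErrorSummary {
  message: string;
  count: number;
  affectedUsers: number;
  firstSeen: string;
}

// Group events by message and sort so the most frequent error comes first.
function summarizeErrors(events: CapturedError[]): ErrorSummary[] {
  const groups = new Map<string, { count: number; users: Set<string>; firstSeen: string }>();
  for (const e of events) {
    let g = groups.get(e.message);
    if (g === undefined) {
      g = { count: 0, users: new Set<string>(), firstSeen: e.timestamp };
      groups.set(e.message, g);
    }
    g.count += 1;
    g.users.add(e.userId);
    // Same-format ISO 8601 UTC strings sort chronologically as plain strings.
    if (e.timestamp < g.firstSeen) g.firstSeen = e.timestamp;
  }
  const summaries: ErrorSummary[] = [];
  groups.forEach((g, message) => {
    summaries.push({ message, count: g.count, affectedUsers: g.users.size, firstSeen: g.firstSeen });
  });
  return summaries.sort((a, b) => b.count - a.count);
}
```

Comparing `firstSeen` against your deploy times answers step 6: an error that first appears minutes after a release is a strong rollback candidate.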
Give to AI:
Error in production:
[Paste error message and stack trace]
Affected: [X] users in last [Y] hours
Started: [timestamp]
Recent deploys: [any?]
Please:
1. Explain what's wrong
2. Propose hotfix
3. How to test before deploying
When user reports problem:
User Report Investigation:
1. Can you reproduce it?
2. Check monitoring for errors at that time
3. Check logs for that user
4. Check if others affected
5. Determine severity
Then use debug skill to fix.
Tell AI:
User reported: [issue description]
User: [email or ID]
Timestamp: [when it happened]
Check monitoring and logs for this user at this time.
What errors or issues do you see?
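If your logs are structured, the lookup described in that prompt is a simple filter. Here is a minimal TypeScript sketch, assuming each log entry carries a `userId` and an ISO 8601 `timestamp`; the `AppLogEntry` shape, the function name, and the 15-minute default window are illustrative assumptions.

```typescript
// Hypothetical structured log entry.
interface AppLogEntry {
  timestamp: string; // ISO 8601, e.g. "2024-05-01T10:00:00Z"
  level: "ERROR" | "WARN" | "INFO" | "DEBUG";
  message: string;
  userId?: string;
}

// Return this user's log entries within +/- windowMinutes of the reported time.
function logsForUserNear(
  logs: AppLogEntry[],
  userId: string,
  reportedAt: string,
  windowMinutes: number = 15,
): AppLogEntry[] {
  const center = Date.parse(reportedAt);
  const windowMs = windowMinutes * 60 * 1000;
  return logs.filter(
    (entry) =>
      entry.userId === userId &&
      Math.abs(Date.parse(entry.timestamp) - center) <= windowMs,
  );
}
```

User-reported times are often approximate, so widening `windowMinutes` is usually worth trying before concluding that nothing was logged.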
Catch issues before users:
Weekly checks:
Weekly Review:
- [ ] Error trends (going up or down?)
- [ ] Performance trends (slower?)
- [ ] New error types introduced
- [ ] Uptime issues resolved
- [ ] Alert noise (too many false alerts?)
Monthly checks:
Monthly Health:
- [ ] Compare to last month
- [ ] Any degradation?
- [ ] Any improvements?
- [ ] Monitoring gaps (what's not tracked?)
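The weekly and monthly comparisons come down to percent change between two periods. A minimal TypeScript sketch follows; the 10% tolerance and the function names are assumptions, and the classification applies to metrics where lower is better, such as error counts or page load times.

```typescript
// Percent change from the previous period to the current one.
function percentChange(previous: number, current: number): number {
  if (previous === 0) return current === 0 ? 0 : Infinity;
  return ((current - previous) / previous) * 100;
}

type Trend = "improving" | "stable" | "degrading";

// For lower-is-better metrics. Assumed tolerance: changes within 10% count as stable.
function classifyTrend(previous: number, current: number, tolerancePct: number = 10): Trend {
  const change = percentChange(previous, current);
  if (change > tolerancePct) return "degrading";
  if (change < -tolerancePct) return "improving";
  return "stable";
}

console.log(percentChange(120, 150)); // 25
console.log(classifyTrend(120, 150)); // "degrading"
console.log(classifyTrend(200, 100)); // "improving"
```

A tolerance band avoids treating ordinary week-to-week variation as a trend; tune it to how noisy your own numbers are.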
Recommended for MVP:
Uptime:
Errors:
Performance:
Logs:
Cost: $0/month until you need more.
Upgrade when:
Paid tiers (typically $20-50/mo):
For < 1000 users: Free tiers sufficient.
| Mistake | Fix |
|---|---|
| No monitoring set up | Set up before launch |
| Alert fatigue (too many alerts) | Only alert on critical issues |
| Checking once a month | Check daily (5 minutes) |
| Ignoring trends | Watch for degradation over time |
| No alerts configured | Set up text alerts for downtime |
| Monitoring but not acting | Use monitoring to find and fix issues |
Good trends:
Warning trends:
Critical trends:
Action: Address warning trends before they become critical.
Logging:
Monitoring:
Both needed: Monitoring alerts you, logs help debug.
Tell AI:
Add application logging:
- Log all errors with context
- Log API requests/responses
- Log slow operations (>1s)
- Log authentication events
- Don't log sensitive data
Format: JSON with timestamp, level, message, context
Send to: [Platform logs or external service]
Log levels:
Production: Log ERROR and WARN only.
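The logging requirements above (JSON format, level filtering, no sensitive data) can be sketched in a few lines of TypeScript. The `formatLog` function and the list of sensitive keys are assumptions for illustration; a real app would more likely use an established logging library.

```typescript
type AppLogLevel = "DEBUG" | "INFO" | "WARN" | "ERROR";

const LEVEL_ORDER: Record<AppLogLevel, number> = { DEBUG: 0, INFO: 1, WARN: 2, ERROR: 3 };

// Assumed list of context keys that must never reach the logs.
const SENSITIVE_KEYS = ["password", "token", "cardNumber"];

// Returns a JSON log line, or null when the level is below minLevel.
// minLevel defaults to WARN, matching "log ERROR and WARN only" in production.
function formatLog(
  level: AppLogLevel,
  message: string,
  context: Record<string, unknown>,
  minLevel: AppLogLevel = "WARN",
  now: Date = new Date(),
): string | null {
  if (LEVEL_ORDER[level] < LEVEL_ORDER[minLevel]) return null;
  const safeContext: Record<string, unknown> = {};
  Object.keys(context).forEach((key) => {
    safeContext[key] = SENSITIVE_KEYS.indexOf(key) >= 0 ? "[REDACTED]" : context[key];
  });
  return JSON.stringify({ timestamp: now.toISOString(), level, message, context: safeContext });
}

const line = formatLog("ERROR", "Payment failed", { userId: "u_1", password: "hunter2" });
if (line !== null) console.log(line);
```

This redacts only top-level keys; nested objects would need a recursive pass, which is omitted here to keep the sketch short.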
Third-party services:
Payments (Stripe):
Email (SendGrid):
Database:
Tell AI:
Add monitoring for [service]:
- Alert on failures
- Track success rate
- Log errors with context
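Tracking success rate for a third-party service can be as simple as counting outcomes. The TypeScript sketch below is one hypothetical approach; the `ServiceHealth` class, the 99% threshold, and the 20-call minimum sample are assumptions, not defaults of Stripe, SendGrid, or any monitoring tool.

```typescript
// Hypothetical in-memory tracker for one external service.
class ServiceHealth {
  readonly service: string;
  private readonly minSuccessRate: number;
  private readonly minSamples: number;
  private successes = 0;
  private failures = 0;

  constructor(service: string, minSuccessRate: number = 0.99, minSamples: number = 20) {
    this.service = service;
    this.minSuccessRate = minSuccessRate;
    this.minSamples = minSamples;
  }

  // Record the outcome of one call to the service.
  record(ok: boolean): void {
    if (ok) this.successes += 1;
    else this.failures += 1;
  }

  successRate(): number {
    const total = this.successes + this.failures;
    return total === 0 ? 1 : this.successes / total;
  }

  // Alert only once there are enough samples to trust the rate.
  shouldAlert(): boolean {
    const total = this.successes + this.failures;
    return total >= this.minSamples && this.successRate() < this.minSuccessRate;
  }
}

const payments = new ServiceHealth("stripe");
for (let i = 0; i < 98; i++) payments.record(true);
payments.record(false);
payments.record(false);
console.log(payments.successRate()); // 0.98
console.log(payments.shouldAlert()); // true
```

The minimum sample size keeps a single failed call right after a deploy from paging you on its own.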
When alerts fire:
Incident Response:
1. Acknowledge alert (mark as seen)
2. Assess severity:
- Critical: Site down, payments failing
- High: Errors affecting many users
- Medium: Isolated issues
3. Immediate action:
- Critical: Hotfix or rollback
- High: Fix within hours
- Medium: Fix in next deploy
4. Update users if needed
5. Post-mortem after resolved
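Steps 2 and 3 of the incident response can be written down as a small decision function, which is useful if you want alerts to arrive pre-labelled. This TypeScript sketch is illustrative; the `IncidentSignal` shape and the threshold of 10 users for "many users" are assumptions.

```typescript
type Severity = "critical" | "high" | "medium";

// Hypothetical summary of what is known when an alert fires.
interface IncidentSignal {
  siteDown: boolean;
  paymentsFailing: boolean;
  affectedUsers: number;
}

// Assumed cutoff for "errors affecting many users".
const MANY_USERS = 10;

function assessSeverity(signal: IncidentSignal): Severity {
  if (signal.siteDown || signal.paymentsFailing) return "critical";
  if (signal.affectedUsers >= MANY_USERS) return "high";
  return "medium";
}

// Immediate action per severity, as listed in the incident response steps.
const IMMEDIATE_ACTION: Record<Severity, string> = {
  critical: "Hotfix or rollback",
  high: "Fix within hours",
  medium: "Fix in next deploy",
};

const severity = assessSeverity({ siteDown: false, paymentsFailing: true, affectedUsers: 3 });
console.log(severity, "->", IMMEDIATE_ACTION[severity]); // critical -> Hotfix or rollback
```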
Critical incidents:
1. Assess impact (how many affected?)
2. Quick fix or rollback
3. Deploy hotfix
4. Verify fixed
5. Monitor closely for an hour
6. Update status page if you have one
- ✅ Know about issues before users report them
- ✅ Uptime >99.9%
- ✅ Errors caught and fixed quickly
- ✅ Performance trends stable or improving
- ✅ Daily monitoring routine (5 minutes)
- ✅ Alerts configured and actionable
- ✅ Issues resolved proactively
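To put the 99.9% uptime target in concrete terms, the arithmetic below converts it into a downtime budget. This is a small TypeScript sketch; the function names are made up for the example, and results are rounded to one decimal place.

```typescript
// Minutes of downtime allowed in a period while still meeting the uptime target.
function allowedDowntimeMinutes(targetUptimePct: number, periodDays: number): number {
  const totalMinutes = periodDays * 24 * 60;
  const raw = totalMinutes * (1 - targetUptimePct / 100);
  return Math.round(raw * 10) / 10; // round to one decimal place
}

// Measured uptime percentage for a period, given total downtime in minutes.
function uptimePercent(periodDays: number, downtimeMinutes: number): number {
  const totalMinutes = periodDays * 24 * 60;
  return (1 - downtimeMinutes / totalMinutes) * 100;
}

console.log(allowedDowntimeMinutes(99.9, 30)); // 43.2
console.log(uptimePercent(30, 60) >= 99.9);    // false: one hour down misses the target
```

In other words, a 30-day month at 99.9% leaves roughly 43 minutes of total downtime, so a single one-hour outage is enough to miss the target.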
npx claudepluginhub whawkinsiv/solo-founder-superpowers --plugin solo-founder-superpowers