This document covers the alert configuration for Sentry error tracking and monitoring on JudgeFinder.io.
DSN (Data Source Name):
https://[key]@sentry.io/[project-id]
Environment Variables:
SENTRY_DSN=<your-dsn>
NEXT_PUBLIC_SENTRY_DSN=<your-dsn> # For client-side errors
SENTRY_TRACES_SAMPLE_RATE=0.1 # 10% of transactions
SENTRY_REPLAYS_SESSION_SAMPLE_RATE=0 # Disabled by default
SENTRY_REPLAYS_ON_ERROR_SAMPLE_RATE=1 # 100% on errors
Notification Channels:
- Email: notifications@judgefinder.io
- Slack: #monitoring channel
- PagerDuty: Critical incidents (optional)
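The sample-rate variables above arrive as strings and feed `Sentry.init`; a defensive-parsing sketch (plain TypeScript; `sampleRate` is an illustrative helper, not a Sentry SDK function):

```typescript
// Parse a Sentry sample-rate env var (e.g. SENTRY_TRACES_SAMPLE_RATE) into
// a number in [0, 1], falling back when the value is unset or malformed.
function sampleRate(envValue: string | undefined, fallback: number): number {
  const n = Number(envValue);
  return Number.isFinite(n) && n >= 0 && n <= 1 ? n : fallback;
}

// "0.1" parses to 0.1 (10% of transactions, matching the variable above);
// unset or out-of-range values use the fallback.
```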
Trigger Conditions:
- When event.type == "error"
- Frequency: > 1% of all events in last 5 minutes
- Affected environments: production, staging
Notification Actions:
- Send to: Slack #monitoring + Email
- Mention: @channel in Slack
- Create incident: Yes
- Page on-call: Yes
Configuration:
name: Error Rate > 1%
filter:
  environment: [production, staging]
  event.type: error
  percentage: 1.0
  timeframe: 5m
actions:
  - slack:
      channel: '#monitoring'
      mention: '@channel'
  - email:
      recipient: 'notifications@judgefinder.io'
  - create_incident: true
  - pagerduty:
      severity: critical
Response SLA: 15 minutes
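The "> 1% of all events in 5 minutes" condition reduces to simple window arithmetic; a minimal sketch (illustrative only, not Sentry's actual rule evaluator):

```typescript
// Returns true when errors exceed thresholdPct percent of all events
// in the evaluation window (mirrors the "> 1% in 5m" rule above).
function breachesErrorRate(
  errorCount: number,
  totalCount: number,
  thresholdPct = 1.0,
): boolean {
  if (totalCount === 0) return false; // empty window: nothing to alert on
  return (errorCount / totalCount) * 100 > thresholdPct;
}
```

Note the strict inequality: a window at exactly 1% does not fire, matching the "> 1%" wording.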
Trigger Conditions:
- When exception is caught
- Handled: false
- First occurrence: Yes
- Environment: production
Notification Actions:
- Send to: Email + Slack (immediate)
- Alert frequency: Once per exception type per hour
- Create Sentry issue: Yes
Configuration:
name: Unhandled Exceptions
filter:
  environment: production
  exception.handled: false
  first_occurrence: true
actions:
  - slack:
      channel: '#alerts'
      mention: '@developers'
  - email:
      recipient: ['admin@judgefinder.io', 'ops@judgefinder.io']
  - create_issue: true
  - escalate_after: 10m
    to: pagerduty
Response SLA: 5 minutes
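"Once per exception type per hour" is a deduplication window; a sketch of that logic (illustrative, not Sentry's implementation):

```typescript
// Suppress repeat alerts for the same exception type within a window.
class AlertDeduper {
  private lastSent = new Map<string, number>();
  constructor(private windowMs: number = 60 * 60 * 1000) {}

  // Returns true (and records the send) only if no alert for this
  // exception type went out within the window.
  shouldAlert(exceptionType: string, nowMs: number): boolean {
    const last = this.lastSent.get(exceptionType);
    if (last !== undefined && nowMs - last < this.windowMs) return false;
    this.lastSent.set(exceptionType, nowMs);
    return true;
  }
}
```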
Trigger Conditions:
- When measurement.duration > 2000 ms
- Endpoint: /api/*
- Frequency: > 10 occurrences in 10 minutes
- Environment: production
Notification Actions:
- Send to: Slack #performance
- Trigger performance review
- Do not alert for first occurrence
Configuration:
name: API Response Time > 2s
filter:
  environment: production
  transaction: /api/*
  measurement.duration: '>2000'
  frequency: '>10:10m'
actions:
  - slack:
      channel: '#performance'
  - metric_alert: response_time_warning
  - log_to_cloudwatch: true
Response SLA: 30 minutes
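Frequency strings such as '>10:10m' follow this document's count-per-window shorthand (not official Sentry rule syntax); a parsing sketch:

```typescript
// Parse this document's ">COUNT:WINDOWm" shorthand, e.g. ">10:10m"
// meaning "more than 10 occurrences within a 10-minute window".
function parseFrequency(s: string): { count: number; windowMinutes: number } {
  const m = /^>(\d+):(\d+)m$/.exec(s);
  if (!m) throw new Error(`unrecognized frequency: ${s}`);
  return { count: Number(m[1]), windowMinutes: Number(m[2]) };
}
```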
Trigger Conditions:
- When database operation duration > 5000ms
- Database: Supabase/PostgreSQL
- Frequency: > 5 occurrences in 15 minutes
Notification Actions:
- Log to monitoring dashboard
- Alert engineering team
- No paging (not critical)
Configuration:
name: Slow Database Query
filter:
  span.op: db.query
  measurement.duration: '>5000'
  frequency: '>5:15m'
actions:
  - slack:
      channel: '#performance'
      message: 'Slow database query detected'
  - log_to_cloudwatch: true
  - create_monitoring_ticket: true
Response SLA: 1 hour
Trigger Conditions:
- When heap memory usage > 80%
- Frequency: Sustained for 5 minutes
- Environment: production
Notification Actions:
- Alert ops team
- Log memory profile
- Trigger performance review
Configuration:
name: Memory Usage High
filter:
  environment: production
  memory.heap_usage_percentage: '>80'
  duration: 5m
actions:
  - slack:
      channel: '#ops'
      message: 'High memory usage on production'
  - email:
      recipient: 'ops@judgefinder.io'
  - capture_heap_dump: true
Response SLA: 30 minutes
Trigger Conditions:
- When auth-related exceptions occur
- Exception contains: "auth", "token", "permission"
- Frequency: > 20 occurrences in 10 minutes
Notification Actions:
- Alert security team
- Log all details
- Review for potential breach
Configuration:
name: Authentication Failures
filter:
  event.type: error
  tags.error_type: [auth, authentication, token]
  frequency: '>20:10m'
actions:
  - slack:
      channel: '#security'
      mention: '@security'
  - email:
      recipient: 'security@judgefinder.io'
  - log_to_cloudwatch:
      log_group: security_events
  - escalate_to: incident_commander
Response SLA: 10 minutes
Trigger Conditions:
- CourtListener API failures
- Stripe API failures
- External service timeouts
- Frequency: > 5 in 5 minutes
Notification Actions:
- Alert integration team
- Log API response
- Monitor for degradation
Configuration:
name: External API Failures
filter:
  tags.service: [courtlistener, stripe, external]
  event.type: error
  frequency: '>5:5m'
actions:
  - slack:
      channel: '#integrations'
  - email:
      recipient: ['integrations@judgefinder.io']
  - create_monitoring_alert: true
  - auto_disable_feature: false
Response SLA: 30 minutes
Trigger Conditions:
- Stripe payment failures
- Transaction declined
- Webhook processing fails
- Frequency: > 1 in 5 minutes
Notification Actions:
- Immediate paging
- Alert payment ops
- Create incident
- Capture transaction details
Configuration:
name: Payment Processing Errors
filter:
  tags.feature: payments
  tags.service: stripe
  event.type: error
  frequency: '>1:5m'
actions:
  - pagerduty:
      severity: critical
      title: 'Payment processing failure'
  - slack:
      channel: '#payments'
      mention: '@payments-team'
  - email:
      recipient: ['payments@judgefinder.io', 'finance@judgefinder.io']
  - create_incident: true
  - capture_transaction_details: true
Response SLA: 5 minutes
Trigger Conditions:
- New release deployment
- Error rate spike > 5x baseline
- First hour after deployment
- Environment: production
Notification Actions:
- Alert deployment team
- Suggest rollback
- Create deployment incident
Configuration:
name: Post-Deployment Error Spike
filter:
  environment: production
  release: release.*
  time_since_release: '0-60m'
  error_rate_increase: '>5x'
actions:
  - slack:
      channel: '#deployments'
      message: 'Error spike detected post-deployment'
      suggest_rollback: true
  - email:
      recipient: 'devops@judgefinder.io'
  - create_deployment_incident: true
Response SLA: 15 minutes
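The "> 5x baseline" spike condition is a ratio test; a minimal sketch (illustrative only; the zero-baseline behavior is an assumption, not Sentry's):

```typescript
// True when the post-deploy error rate exceeds `factor` times the
// pre-deploy baseline (rates as errors per minute, or any shared unit).
function isErrorSpike(currentRate: number, baselineRate: number, factor = 5): boolean {
  // Assumption: with no baseline at all, any errors are treated as a spike.
  if (baselineRate <= 0) return currentRate > 0;
  return currentRate / baselineRate > factor;
}
```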
Trigger Conditions:
- Client-side JavaScript errors
- Frequency: > 100 in 1 hour
- Environment: production
- Not ignored errors
Notification Actions:
- Log to performance dashboard
- Weekly summary email
- No real-time alert
Configuration:
name: JavaScript Console Errors
filter:
  environment: production
  platform: javascript
  event.type: error
  frequency: '>100:1h'
  ignore_tags: [expected_error]
actions:
  - slack:
      channel: '#frontend'
      frequency: weekly
  - email:
      recipient: 'frontend-team@judgefinder.io'
      frequency: daily_digest
  - log_to_cloudwatch: true
Response SLA: No SLA (informational)
Via Web Console:
- Go to: Sentry Project > Alerts
- Toggle rules on/off
- Click Save
Via API:
# Disable rule
curl -X PUT https://sentry.io/api/0/projects/{org}/{project}/rules/{rule_id}/ \
-H "Authorization: Bearer {auth_token}" \
-d '{"enabled": false}'
# Enable rule
curl -X PUT https://sentry.io/api/0/projects/{org}/{project}/rules/{rule_id}/ \
-H "Authorization: Bearer {auth_token}" \
-d '{"enabled": true}'
Ignore specific exception:
filter:
  error.message: "Network error"
action: ignore
Ignore error by pattern:
filter:
  error.message: "*network*"
  environment: staging
action: ignore
Resurrecting ignored errors:
- Go to: Sentry Project > Settings > Ignore
- Click "Restore" for the error
- Confirm
Smart grouping enabled:
- Groups similar errors by:
- Exception type
- Stack trace
- Error message fingerprint
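Grouping can also be steered from the client by setting an event fingerprint (a real Sentry concept, usually applied in a `beforeSend` hook); the event shape below is simplified for illustration:

```typescript
// Simplified stand-in for a Sentry event; the real type lives in @sentry/types.
type EventLike = { fingerprint?: string[]; tags?: Record<string, string> };

// Group by Sentry's default fingerprint plus service and endpoint tags.
// "{{ default }}" keeps the built-in grouping as the base and appends
// our own dimensions.
function applyCustomFingerprint(event: EventLike): EventLike {
  const service = event.tags?.service ?? "unknown";
  const endpoint = event.tags?.endpoint ?? "unknown";
  event.fingerprint = ["{{ default }}", service, endpoint];
  return event;
}
```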
Custom grouping:
group_by:
  - exception.type
  - tags.service
  - tags.endpoint
Setup:
- Go to: Sentry Project > Integrations > Slack
- Authorize Sentry app to Slack workspace
- Select channels for notifications
- Configure notification frequency
Slack Commands:
@sentry-app ignore
@sentry-app resolve
@sentry-app assign @username
@sentry-app status
Configure:
- Go to: Settings > Notifications > Project Alerts
- Add recipients:
Digest Email:
- Frequency: Daily
- Time: 9:00 AM UTC
- Include: All alerts from past 24h
Auto-create issues:
- Go to: Integrations > GitHub
- Authorize repository
- Configure:
- Create issue on: Error > threshold
- Issue template: Use standard template
- Auto-assign: @dev-team
Issue Template:
## Error: {error.title}
**Severity:** {error.level}
**First seen:** {error.first_seen}
**Last seen:** {error.last_seen}
**Occurrences:** {error.count}
### Stack Trace
\`\`\`
{error.stack_trace}
\`\`\`
### Context
- User: {user.username}
- URL: {request.url}
- Environment: {environment}
### Action Items
- [ ] Investigate root cause
- [ ] Create fix
- [ ] Test fix
- [ ] Deploy to production
Custom dashboard:
// Metrics to display
{
  "widgets": [
    {
      "title": "Error Rate (Last 24h)",
      "type": "stat",
      "query": "event.type:error"
    },
    {
      "title": "Top 10 Errors",
      "type": "table",
      "query": "event.type:error",
      "sort": "-frequency"
    },
    {
      "title": "Error Trend",
      "type": "line_chart",
      "query": "event.type:error",
      "period": "1h"
    },
    {
      "title": "Slowest Transactions",
      "type": "table",
      "query": "measurement.duration:>1000"
    },
    {
      "title": "User Impact",
      "type": "stat",
      "query": "affected_users"
    }
  ]
}
- Error Volume
  - Total errors per day
  - Error growth rate
  - Critical vs. warning errors
- Response Time
  - p50, p95, p99 latencies
  - Slow transaction endpoints
  - Database query performance
- User Impact
  - Unique users affected
  - Error blast radius
  - Session impact
- Error Sources
  - Frontend vs. backend
  - Top error types
  - New errors introduced
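The p50/p95/p99 latencies listed above can be computed from raw duration samples with the nearest-rank method (one of several common percentile definitions):

```typescript
// Nearest-rank percentile over raw duration samples (ms).
// p is in (0, 100]; the input need not be pre-sorted.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based nearest rank
  return sorted[Math.min(rank, sorted.length) - 1];
}
```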
# Create a test event
curl -X POST https://[key]@sentry.io/api/[project_id]/store/ \
-H "Content-Type: application/json" \
-d '{
  "message": "Test alert",
  "level": "error",
  "exception": {
    "values": [
      {
        "type": "TestException",
        "value": "This is a test error"
      }
    ]
  }
}'
- Create test error in staging
- Verify Slack message sent
- Verify email received
- Confirm PagerDuty incident (if configured)
- Review in Sentry dashboard
- Set appropriate thresholds:
  - Avoid alerting on every error
  - Use frequency-based rules
  - Set time windows for context
- Use alert grouping:
  - Group similar errors
  - Reduce noise
  - Focus on unique issues
- Regular rule reviews:
  - Monthly: Check rule effectiveness
  - Disable unused rules
  - Adjust thresholds based on data
Upon alert:
- Acknowledge alert in Slack
- Create incident ticket
- Assign to on-call engineer
- Begin investigation
- Implement fix
- Verify resolution
- Document post-mortem
Track these metrics:
- Frontend performance (LCP, FID, CLS)
- API endpoint latency
- Database query duration
- Third-party API response times
- Memory and CPU usage
When UptimeRobot detects downtime:
- Automatically create Sentry alert
- Tag with service and endpoint
- Correlate with error spike
- Provide context for response
Stream Sentry errors to CloudWatch:
Sentry -> Webhook -> Lambda -> CloudWatch
Configuration:
webhook:
  url: https://lambda.amazonaws.com/webhook
  events: [error, transaction]
  include_details: true
Sentry Plan: Business tier - $9/month per project
Quota allocation:
- Errors: 10M events/month
- Transactions: 50M events/month
- Replays: 1000 sessions/month
Cost optimization:
- Use beforeSend to filter events
- Reduce sample rates in non-critical envs
- Ignore known harmless errors
- Archive old issues
// sentry.client.config.ts
import * as Sentry from '@sentry/nextjs'

Sentry.init({
  dsn: process.env.NEXT_PUBLIC_SENTRY_DSN,
  beforeSend(event, hint) {
    // Drop known, harmless errors before they are sent
    if (event.message?.includes('Network error')) {
      return null
    }
    // Drop all events outside production
    if (process.env.NODE_ENV !== 'production') {
      return null
    }
    return event
  },
  tracesSampleRate: process.env.NODE_ENV === 'production' ? 0.1 : 1.0,
})
- Verify DSN is correct
- Check browser console for errors
- Verify beforeSend isn't filtering the event
- Check network requests (DevTools)
- Review Sentry project settings
- Check rule conditions
- Verify grouping is working
- Adjust frequency thresholds
- Review filter conditions
- Enable breadcrumbs
- Attach user context
- Add custom tags
- Include HTTP request details
- Set up Slack integration
- Configure critical rules (Rules 1, 2, 8)
- Enable warning rules (Rules 3-7)
- Create dashboard
- Configure team notifications
- Document runbook for each rule
- Schedule monthly review
- Sentry Docs: https://docs.sentry.io/
- Alert Rules: https://docs.sentry.io/product/alerts/
- Integrations: https://docs.sentry.io/product/integrations/
- Best Practices: https://docs.sentry.io/platforms/javascript/best-practices/