Application Alerting

Configure alerts for critical application events.

Alert Channels

| Channel | Configuration |
| --- | --- |
| Email | SMTP configuration |
| Slack | Webhook URL |
| PagerDuty | Integration key |
| Discord | Webhook URL |
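
As an illustration of how the channels above might be tied to alert severity, here is a minimal sketch that assumes notifications are routed through a Prometheus Alertmanager. That is an assumption: this page does not say which router is in use, and Grafana contact points would be configured differently. The receiver names, the `#alerts` channel, and the angle-bracket values are hypothetical placeholders.

```yaml
# Sketch only: assumes Alertmanager handles routing (not confirmed by this page).
route:
  receiver: team-slack              # default: warning and info alerts go to Slack
  routes:
    - matchers:
        - severity="critical"       # matches the severity label set on critical rules
      receiver: on-call-pagerduty   # critical alerts page the on-call engineer

receivers:
  - name: team-slack                # hypothetical receiver name
    slack_configs:
      - api_url: "<SLACK_WEBHOOK_URL>"          # placeholder, supply your own webhook
        channel: "#alerts"                      # hypothetical channel
  - name: on-call-pagerduty         # hypothetical receiver name
    pagerduty_configs:
      - routing_key: "<PAGERDUTY_INTEGRATION_KEY>"  # placeholder integration key
```

Routing on the `severity` label keeps the paging decision in one place, so individual rules only need to set the label.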

Critical

| Alert | Condition | Action |
| --- | --- | --- |
| API Down | Health check fails for 3 min | Page on-call |
| Database Down | No DB connection | Page on-call |
| High Error Rate | 5xx > 5% for 5 min | Page on-call |
| Disk Full | Disk usage > 90% | Alert ops team |
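
The Disk Full condition could be written in the same rule format used under Grafana Alerting. The sketch below assumes disk metrics come from node_exporter (`node_filesystem_avail_bytes` and `node_filesystem_size_bytes`). The group name, alert name, and filesystem filter are illustrative choices and do not come from this page.

```yaml
groups:
  - name: gauzy-critical-examples   # hypothetical group name
    rules:
      - alert: DiskAlmostFull       # hypothetical alert name
        # Used fraction = 1 - available / total, per mounted filesystem.
        # Assumes node_exporter metrics; tmpfs and overlay mounts are excluded.
        expr: >
          (1 - node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"}
             / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"}) > 0.90
        labels:
          severity: critical
        annotations:
          summary: "Disk usage > 90% on {{ $labels.instance }} ({{ $labels.mountpoint }})"
```

The table gives no duration for this condition, so the sketch has no `for:` clause and fires as soon as the threshold is crossed. Adding one would suppress brief spikes.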

Warning

| Alert | Condition | Action |
| --- | --- | --- |
| High Latency | P99 > 2 s for 10 min | Notify team |
| Memory Usage High | > 80% for 15 min | Investigate |
| Queue Backlog | > 1000 pending jobs | Scale workers |
| SSL Expiring | Certificate expires in < 14 days | Renew cert |
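
Two of the warning conditions can be sketched in the same rule format used under Grafana Alerting. Both rely on assumptions. The latency rule assumes the API exports a Prometheus histogram named `http_request_duration_seconds`, which is a common convention but is not confirmed by this page. The certificate rule assumes blackbox_exporter is probing the HTTPS endpoint and exposing `probe_ssl_earliest_cert_expiry`. The group and alert names are hypothetical.

```yaml
groups:
  - name: gauzy-warning-examples    # hypothetical group name
    rules:
      - alert: APIHighLatency       # hypothetical alert name
        # P99 over all instances, estimated from histogram buckets.
        # The metric name is an assumption; adjust to what the API exports.
        expr: >
          histogram_quantile(0.99,
            sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) > 2
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "API P99 latency > 2s"

      - alert: SSLCertificateExpiring   # hypothetical alert name
        # Seconds until expiry, compared against 14 days.
        # Assumes blackbox_exporter probes the endpoint.
        expr: probe_ssl_earliest_cert_expiry - time() < 14 * 24 * 3600
        labels:
          severity: warning
        annotations:
          summary: "Certificate for {{ $labels.instance }} expires in < 14 days"
```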

Info

| Alert | Condition | Action |
| --- | --- | --- |
| Deployment Complete | New version deployed | Verify |
| Backup Complete | Daily backup finished | Log |

Grafana Alerting

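groups:
  - name: gauzy-alerts
    rules:
      - alert: APIHighErrorRate
        # sum() on both sides gives the ratio across all series; without it the
        # division only pairs series whose labels (including status) are identical.
        expr: sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.05
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "API error rate > 5%"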