Escalation Contract
Status: active.
Severity Definitions
- Sev1: Full service outage, auth unavailable, or critical data integrity risk.
- Sev2: Major feature degraded for many users, no safe workaround.
- Sev3: Limited degradation, workaround available.
- Sev4: Minor issue, no material business impact.
Response Targets
- Sev1: acknowledge <= 15 minutes, mitigation start <= 30 minutes.
- Sev2: acknowledge <= 30 minutes, mitigation start <= 60 minutes.
- Sev3: acknowledge <= 4 business hours, fix plan <= 1 business day.
- Sev4: acknowledge <= 1 business day, scheduled backlog handling.
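The Sev1 and Sev2 targets above are wall-clock limits, so they can be checked mechanically. The following is a minimal sketch, not part of the contract: the names `RESPONSE_TARGETS` and `breached_targets` are hypothetical, and Sev3/Sev4 are left out on purpose because their targets are in business hours and days, which would need a working calendar that this document does not define.

```python
from datetime import datetime, timedelta

# Wall-clock targets copied from the Response Targets list (Sev1 and Sev2 only).
# Sev3/Sev4 use business time and are intentionally not modelled here.
RESPONSE_TARGETS = {
    "Sev1": {
        "acknowledge": timedelta(minutes=15),
        "mitigation_start": timedelta(minutes=30),
    },
    "Sev2": {
        "acknowledge": timedelta(minutes=30),
        "mitigation_start": timedelta(minutes=60),
    },
}


def breached_targets(severity, opened_at, now,
                     acknowledged_at=None, mitigation_started_at=None):
    """Return the names of targets that were missed or are overdue at `now`."""
    events = {
        "acknowledge": acknowledged_at,
        "mitigation_start": mitigation_started_at,
    }
    breached = []
    for name, limit in RESPONSE_TARGETS[severity].items():
        deadline = opened_at + limit
        # An event that has not happened yet is judged against the current time.
        effective = events[name] if events[name] is not None else now
        if effective > deadline:  # "<=" in the contract: on-time at the deadline
            breached.append(name)
    return breached


if __name__ == "__main__":
    opened = datetime(2024, 1, 1, 12, 0)
    # Sev1 opened at 12:00, still unacknowledged at 12:20.
    print(breached_targets("Sev1", opened, opened + timedelta(minutes=20)))
```

A check like this could run on a timer from the incident tooling and page the next person in the escalation chain when it returns a non-empty list; how that paging works is outside this sketch.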
Escalation Chain
- On-call engineer (primary owner)
- Platform/backend lead
- Product owner / stakeholder representative
- Executive escalation (institutional sponsor) for Sev1 or extended Sev2 incidents
Communication Contract
- Open incident channel immediately for Sev1/Sev2.
- Post updates every 30 minutes for Sev1, every 60 minutes for Sev2.
- Include: impact, affected services, mitigation status, next update time.
- Publish postmortem for Sev1/Sev2 within 3 business days.
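The update rules above (required fields plus a fixed cadence per severity) lend themselves to a small template. This is an illustrative sketch only: the `StatusUpdate` class, its field names, and the rendered layout are assumptions, not a format this contract prescribes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

# Update cadence from the Communication Contract (Sev1 and Sev2 only).
UPDATE_INTERVAL = {
    "Sev1": timedelta(minutes=30),
    "Sev2": timedelta(minutes=60),
}


@dataclass
class StatusUpdate:
    """One incident-channel update carrying the four required fields."""
    severity: str
    impact: str
    affected_services: List[str]
    mitigation_status: str
    posted_at: datetime

    @property
    def next_update_at(self) -> datetime:
        # Next update time follows from the severity's cadence.
        return self.posted_at + UPDATE_INTERVAL[self.severity]

    def render(self) -> str:
        return "\n".join([
            f"[{self.severity}] Impact: {self.impact}",
            f"Affected services: {', '.join(self.affected_services)}",
            f"Mitigation status: {self.mitigation_status}",
            f"Next update: {self.next_update_at.isoformat()}",
        ])


if __name__ == "__main__":
    update = StatusUpdate(
        severity="Sev1",
        impact="Users cannot log in",
        affected_services=["auth", "API"],
        mitigation_status="Rolling back last deploy",
        posted_at=datetime(2024, 1, 1, 12, 0),
    )
    print(update.render())
```

Deriving the next update time from the severity, rather than typing it by hand, keeps each post consistent with the cadence stated in the contract.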
Mandatory Operational Artifacts
- Monitoring dashboard links for API health, auth, cache, error rates.
- Runbook links for session/auth failure, cache failure, DB degradation.
- Ownership matrix in docs/ownership/raci.md.
Pilot Acceptance Requirement
No pilot is considered operationally ready unless this contract and the RACI matrix are approved by maintainers and institutional stakeholders.