Escalation Contract

Status: active.

Severity Definitions

  • Sev1 - Full service outage, auth unavailable, or critical data integrity risk.
  • Sev2 - Major feature degraded for many users, no safe workaround.
  • Sev3 - Limited degradation, workaround available.
  • Sev4 - Minor issue, no material business impact.

Response Targets

  • Sev1: acknowledge <= 15 minutes, mitigation start <= 30 minutes.
  • Sev2: acknowledge <= 30 minutes, mitigation start <= 60 minutes.
  • Sev3: acknowledge <= 4 business hours, fix plan <= 1 business day.
  • Sev4: acknowledge <= 1 business day, scheduled backlog handling.
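
The wall-clock targets above can be encoded as data so that tooling can compute deadlines. The following is a minimal sketch, not part of the contract itself. It assumes the clock starts when the incident is opened (the contract does not say when it starts), and the names RESPONSE_TARGETS, deadlines, and is_breached are hypothetical. Sev3 and Sev4 are omitted because their targets are in business hours/days and no business calendar is defined here.

```python
from datetime import datetime, timedelta

# Wall-clock targets copied from the Response Targets list (Sev1 and Sev2 only).
# Sev3/Sev4 use business hours/days and would need a business calendar,
# which this contract does not define.
RESPONSE_TARGETS = {
    "Sev1": {
        "acknowledge": timedelta(minutes=15),
        "mitigation_start": timedelta(minutes=30),
    },
    "Sev2": {
        "acknowledge": timedelta(minutes=30),
        "mitigation_start": timedelta(minutes=60),
    },
}


def deadlines(severity, opened_at):
    """Return the deadline for each response step.

    Assumption: the clock starts at opened_at (incident open time).
    """
    targets = RESPONSE_TARGETS[severity]
    return {step: opened_at + limit for step, limit in targets.items()}


def is_breached(severity, step, opened_at, now):
    """True if `now` is strictly past the deadline for `step` (targets are <=)."""
    return now > deadlines(severity, opened_at)[step]
```

For example, deadlines("Sev1", opened_at) returns the acknowledge and mitigation_start deadlines as datetimes, which a paging or dashboard tool could compare against the current time.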

Escalation Chain

  1. On-call engineer (primary owner)
  2. Platform/backend lead
  3. Product owner / stakeholder representative
  4. Executive escalation (institutional sponsor), for Sev1 or extended Sev2 incidents
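
One way to represent the chain above in tooling is an ordered list with the executive level gated on severity. This is an illustrative sketch only. The contract does not define what makes a Sev2 "extended", so that decision is left to the caller as a boolean flag, and the names ESCALATION_CHAIN and applicable_chain are hypothetical.

```python
# Ordered escalation levels, copied from the Escalation Chain list.
ESCALATION_CHAIN = [
    "On-call engineer (primary owner)",
    "Platform/backend lead",
    "Product owner / stakeholder representative",
    "Executive escalation (institutional sponsor)",
]


def applicable_chain(severity, extended=False):
    """Return the escalation levels that apply to an incident, in order.

    Assumption: `extended` is decided by the caller, because the contract
    does not state a threshold for an "extended" Sev2.
    """
    include_executive = severity == "Sev1" or (severity == "Sev2" and extended)
    if include_executive:
        return list(ESCALATION_CHAIN)
    # Everything except the executive level.
    return ESCALATION_CHAIN[:-1]
```

Keeping the "extended" decision outside the function avoids inventing a duration threshold that the contract does not specify.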

Communication Contract

  • Open an incident channel immediately for Sev1/Sev2.
  • Post updates every 30 minutes for Sev1, every 60 minutes for Sev2.
  • Include in each update: impact, affected services, mitigation status, and next update time.
  • Publish a postmortem for Sev1/Sev2 within 3 business days.
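
The update cadence and required fields above could be captured in a small message template, so that every post carries the same four items. This is a sketch under assumptions: the StatusUpdate class, its field names, and the rendered text layout are hypothetical, and the next update time is computed from the posting time plus the interval for the severity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

# Update intervals copied from the Communication Contract list.
UPDATE_INTERVAL = {
    "Sev1": timedelta(minutes=30),
    "Sev2": timedelta(minutes=60),
}


@dataclass
class StatusUpdate:
    """One incident-channel update with the fields the contract requires."""

    severity: str
    impact: str
    affected_services: List[str]
    mitigation_status: str
    posted_at: datetime

    @property
    def next_update_at(self):
        # Assumption: the next update is due one interval after this post.
        return self.posted_at + UPDATE_INTERVAL[self.severity]

    def render(self):
        """Format the update as plain text for posting to the channel."""
        return "\n".join([
            f"[{self.severity}] Impact: {self.impact}",
            f"Affected services: {', '.join(self.affected_services)}",
            f"Mitigation status: {self.mitigation_status}",
            f"Next update: {self.next_update_at:%Y-%m-%d %H:%M}",
        ])
```

Building the message from a structure rather than free text makes it harder to omit the next update time, which is the field most often forgotten under pressure.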

Mandatory Operational Artifacts

  • Monitoring dashboard links for API health, auth, cache, error rates.
  • Runbook links for session/auth failure, cache failure, DB degradation.
  • Ownership matrix in docs/ownership/raci.md.

Pilot Acceptance Requirement

No pilot is considered operationally ready until both this contract and the RACI matrix (docs/ownership/raci.md) have been approved by maintainers and institutional stakeholders.