Runbook: <Datadog alert name or symptom>

Datadog Resources

  • Monitor: TODO

  • Dashboard: TODO

  • Logs: TODO

  • APM: TODO

Symptoms

How this manifests — what the alert says, what dashboards show.

Likely Causes

  1. Most common

  2. Next most common

Diagnosis

  1. Step-by-step checks with exact queries / commands

Mitigation / Resolution

  1. Steps to stop the bleeding and resolve. Prefer safe, reversible actions first (feature flags, restarts) before code changes.

Escalation

When to page secondary / notify #incidents / tag a team lead.