Worth Reading: Cloudflare Control Plane Outage
Cloudflare experienced a significant outage in early November 2023 and published a detailed post-mortem report. You should read the whole report; here are my CliffsNotes:
- Regardless of how much redundancy you have, sometimes all systems will fail at once. Having redundant systems decreases the probability of total failure but does not reduce it to zero.
- As your systems grow, they gather hidden- and circular dependencies.
- You won’t uncover those dependencies unless you run a full-blown disaster recovery test (not a fake one)
- If you don’t test your disaster recovery plan, it probably won’t work when needed.
Also (unrelated to Cloudflare outage):


