Jamie Herre

Author Archives: Jamie Herre

Cloudflare incident on November 14, 2024, resulting in lost logs

On November 14, 2024, Cloudflare experienced an incident which impacted the majority of customers using Cloudflare Logs. During the roughly 3.5 hours that these services were impacted, about 55% of the logs we normally send to customers were not sent and were lost. We’re very sorry this happened, and we are working to ensure that a similar issue doesn't happen again.

This blog post explains what happened and what we’re doing to prevent recurrences. Also, the systems involved and the particular class of failure we experienced will hopefully be of interest to engineering teams beyond those specifically using these products.

Failures within systems at scale are inevitable, and it’s essential that subsystems protect themselves from failures in other parts of the larger system to prevent cascades. In this case, a misconfiguration in one part of the system caused a cascading overload in another part of the system, which was itself misconfigured. Had it been properly configured, it could have prevented the loss of logs.

Background

Cloudflare’s network is a globally distributed system enabling and supporting a wide variety of services. Every part of this system generates event logs which contain detailed metadata about what’s happening with our systems around Continue reading

Explaining Cloudflare’s ABR Analytics

Explaining Cloudflare's ABR Analytics

Cloudflare’s analytics products help customers answer questions about their traffic by analyzing the mind-boggling, ever-increasing number of events (HTTP requests, Workers requests, Spectrum events) logged by Cloudflare products every day.  The answers to these questions depend on the point of view of the question being asked, and we’ve come up with a way to exploit this fact to improve the quality and responsiveness of our analytics.

Useful Accuracy

Consider the following questions and answers:

What is the length of the coastline of Great Britain? 12.4K km
What is the total world population? 7.8B
How many stars are in the Milky Way? 250B
What is the total volume of the Antarctic ice shelf? 25.4M km3
What is the worldwide production of lentils? 6.3M tonnes
How many HTTP requests hit my site in the last week? 22.6M

Useful answers do not benefit from being overly exact.  For large quantities, knowing the correct order of magnitude and a few significant digits gives the most useful answer.  At Cloudflare, the difference in traffic between different sites or when a single site is under attack can cross nine orders of magnitude and, in general, all our traffic follows a Continue reading