On November 18, 2025, Cloudflare’s network experienced a significant failure and was unable to deliver traffic for approximately two hours and ten minutes. Nearly three weeks later, on December 5, 2025, our network again failed to serve traffic for 28% of the applications behind it for about 25 minutes.
We published detailed post-mortem blog posts following both incidents, but we know that we have more to do to earn back your trust. Today we are sharing details about the work underway at Cloudflare to prevent outages like these from happening again.
We are calling the plan “Code Orange: Fail Small”, which reflects our goal of making our network more resilient to errors or mistakes that could lead to a major outage. A “Code Orange” means the work on this project is prioritized above all else. For context, we declared a “Code Orange” at Cloudflare once before, following another major incident that required top priority from everyone across the company. We feel the recent events require the same focus. Code Orange is our way to enable that to happen, allowing teams to work cross-functionally as necessary to get the job done while pausing any other work.
The Code… Continue reading
Cloudflare's latest transparency report — covering the first half of 2025 — is now live. As part of our commitment to transparency, Cloudflare publishes such reports twice a year, describing how we handle legal requests for customer information and reports of abuse of our services. Although we’ve been publishing these reports for over 10 years, we’ve continued to adapt our transparency reporting and our commitments to reflect Cloudflare’s growth and changes as a company. Most recently, we made changes to the format of our reports to make them even more comprehensive and understandable.
In general, we try to provide updates on our approach or the requests that we receive in the transparency report itself. To that end, we have some notable updates for the first half of 2025. But our transparency report can only go so far in explaining the numbers.
In this blog post, we’ll do a deeper dive on one topic: Cloudflare’s approach to streaming and claims of copyright violations. Given increased access to AI tools and other systems, bad actors have become increasingly sophisticated in the ways they attempt to abuse systems to stream copyrighted content, often incorporating steps to hide their behavior. We’ve… Continue reading
They say time goes faster as you get older, and it seems to be true. Another year has (almost) gone by.
Try to disconnect from the crazy pace of the networking world, forget the “vibe coding with AI will make engineers obsolete” stupidities (hint: Fifth Generation Languages and Natural Language Programming were all the rage in the 1980s and 1990s), and focus on your loved ones. I would also like to wish you all the best in 2026!
In the meantime, I’m working on weaning netlab off of a particular automation tool (you can always track the progress on GitHub). Expect the first results in the January netlab release.
For years, platform teams have known what a service mesh can provide: strong workload identity, authorization, mutual TLS authentication and encryption, fine-grained traffic control, and deep observability across distributed systems. In theory, Istio checked all the boxes. In practice, though, many teams hit a wall.
Across industries like financial services, media, retail, and SaaS, organizations told a similar story. They wanted mTLS between services to meet regulatory or security requirements. They needed safer deployment capabilities like canary rollouts and traffic splitting. They wanted visibility that went beyond IP addresses.
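To make the mTLS requirement concrete, here is a minimal sketch of what a mesh automates on every connection, using Python's standard ssl module. This is illustrative only; all hostnames, ports, and certificate paths are hypothetical placeholders, and a real mesh also handles certificate issuance and rotation.

```python
import socket
import ssl

# Mutual TLS sketch: both sides present certificates and verify the peer
# against a shared CA. A service mesh automates exactly this handshake
# (plus cert issuance and rotation) for every service-to-service call.
# All file paths and hostnames below are hypothetical placeholders.

def mtls_client(host: str = "payments.internal", port: int = 8443) -> None:
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.load_verify_locations(cafile="ca.pem")            # trust the mesh CA
    ctx.load_cert_chain(certfile="client.pem", keyfile="client-key.pem")
    with socket.create_connection((host, port)) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            tls.sendall(b"GET /health HTTP/1.0\r\n\r\n")
            print(tls.recv(4096))

def mtls_server(port: int = 8443) -> None:
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain(certfile="server.pem", keyfile="server-key.pem")
    ctx.load_verify_locations(cafile="ca.pem")
    ctx.verify_mode = ssl.CERT_REQUIRED                   # *require* a client cert
    with socket.create_server(("0.0.0.0", port)) as srv:
        conn, _ = srv.accept()
        with ctx.wrap_socket(conn, server_side=True) as tls:
            print("peer cert:", tls.getpeercert()["subject"])
```

Doing this by hand for every service is exactly the toil a mesh is supposed to remove, which is what makes the operational-cost story below so frustrating.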
However, traditional sidecar-based meshes came with real costs.
In several cases, teams started down the Istio service mesh path, only to pause or roll back entirely because the ongoing operational complexity was too high. The value of a service mesh was clear, but the sidecar-based architecture was not sustainable for many production environments.
In many cases, organizations evaluated service meshes with clear goals in mind. They wanted mTLS between services, better control over traffic during deployments, and observability that could keep up. Continue reading
As the Internet centralizes and gets “big,” standards are often being sidelined or consumed. What are the possible results of abandoning standards? Is there anything “normal network engineers” can do about it?
When you’re dealing with large amounts of data, it’s helpful to get a quick overview — which is exactly what aggregations provide in SQL. Aggregations, also known as “GROUP BY queries”, provide a bird’s-eye view, so you can quickly gain insights from vast volumes of data.
That’s why we are excited to announce support for aggregations in R2 SQL, Cloudflare's serverless, distributed analytics query engine, which is capable of running SQL queries over data stored in R2 Data Catalog. Aggregations will allow users of R2 SQL to spot important trends and changes in the data, generate reports, and find anomalies in logs.
This release builds on the already-supported filter queries, which are foundational for analytical workloads and allow users to find needles in haystacks of Apache Parquet files.
In this post, we’ll unpack the utility and quirks of aggregations, and then dive into how we extended R2 SQL to support running such queries over vast amounts of data stored in R2 Data Catalog.
Aggregations, or “GROUP BY queries”, generate a short summary of the underlying data.
A common use case for aggregations is generating reports. Consider a table called “sales”, which contains… Continue reading
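You don't need R2 SQL to see what a GROUP BY computes. Here is a self-contained sketch using Python's built-in sqlite3 module with a made-up "sales" table echoing the example above; the schema and numbers are invented for illustration.

```python
import sqlite3

# Illustrative only: a tiny in-memory "sales" table (schema and rows are
# invented) and an aggregation that summarizes it with GROUP BY.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("EU", "widget", 120.0),
        ("EU", "gadget", 80.0),
        ("US", "widget", 200.0),
        ("US", "widget", 50.0),
    ],
)

# One summary row per region: order count and total revenue.
for row in conn.execute(
    "SELECT region, COUNT(*) AS orders, SUM(amount) AS revenue "
    "FROM sales GROUP BY region ORDER BY revenue DESC"
):
    print(row)  # ('US', 2, 250.0) then ('EU', 2, 200.0)
```

Four detail rows collapse into two summary rows, one per distinct region: the "short summary of the underlying data" described above, which is what makes aggregations useful for reports and anomaly hunting at scale.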
Want to look up various HTTP status/error codes when troubleshooting a DNS BGP network server problem? Start at http.pizza for badly-needed stress relief (HT: Networking Notes), then start a chat session with your new AI friend exploring more focused resources like the Wikipedia list of HTTP status codes.
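If you'd rather stay in a terminal than a browser, the same lookup works from Python's standard library; note that status 418 was only added to HTTPStatus in Python 3.9.

```python
from http import HTTPStatus

# Look up a status code's name and description without leaving Python.
status = HTTPStatus(418)          # needs Python 3.9+ for 418
print(status.phrase)              # "I'm a Teapot"
print(status.description)
print(HTTPStatus(502).phrase)     # "Bad Gateway"
```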
Who rules the skies? When it comes to air transportation, one name immediately comes to mind. Yes, the United States is the country with the most airports in the world. The number is staggering and leaves every other country far behind. This phenomenon is more than just a statistic; it reflects a unique combination of geography, economy, and culture. Let's dig deeper.
The United States tops the global list with an astonishing total number of airports. According to data from the FAA (Federal Aviation Administration), there are more than 19,000 airports. That figure covers many kinds of facilities. Of course, not every airport is as big as JFK or LAX; most are small facilities. Still, all of them contribute to a massive aviation infrastructure.
The FAA divides airports into two main categories: public-use and private-use. Public-use airports are open to everyone, while private-use airports serve only their owners. It is the combination of the two that produces such a large total. In addition, the general aviation culture in the US is very strong. Many individuals and companies own private aircraft, so demand for private runways has soared.

Ingress NGINX Controller, the trusty staple of countless platform engineering toolkits, is about to be put out to pasture. This news was announced by the Kubernetes community recently, and very quickly circulated throughout the cloud-native space. It’s big news for any platform team that currently uses the NGINX Controller because, as of March 26, 2026, there will be no more bug fixes, no more critical vulnerability patches, and no more enhancements as Kubernetes continues to release new versions.
If you’re feeling ambushed, you’re not alone. For many teams, this isn’t just an inconvenient roadmap update; it’s unexpected news that now puts long-term traffic management decisions front and center. You know you need to migrate yesterday, but the best path forward can be a confusing labyrinth of platforms and unfamiliar tools. Questions you might ask yourself:
Do you find a quick drop-in Ingress replacement?
Does moving to Gateway API make sense, and can you commit enough resources to do a full migration?
If you decide on Gateway API, then what is the best option for a smooth transition?
With Ingress NGINX on the way out, platform teams are standing at a… Continue reading
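Whatever path you choose, a sensible first step is an inventory of what you actually run. A minimal sketch, assuming the official kubernetes Python client and a working kubeconfig, that lists every Ingress pointing at the soon-unmaintained nginx controller:

```python
from kubernetes import client, config

# Sketch: list every Ingress that references the nginx ingress class,
# i.e. the resources a migration plan would need to cover. Assumes the
# official `kubernetes` client package and a working kubeconfig.
config.load_kube_config()
net = client.NetworkingV1Api()

for ing in net.list_ingress_for_all_namespaces().items:
    ingress_class = ing.spec.ingress_class_name or (
        (ing.metadata.annotations or {}).get("kubernetes.io/ingress.class")
    )
    if ingress_class == "nginx":
        print(f"{ing.metadata.namespace}/{ing.metadata.name}")
```

Checking both spec.ingressClassName and the legacy kubernetes.io/ingress.class annotation matters, because older manifests often still use the annotation form.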
A month ago, I described how Ansible release 12 broke the network device configuration modules, the little engines (that could) that brought us from the dark days of copy-and-paste into the more-survivable land of configuration templates.
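For readers who never lived through the copy-and-paste era: configuration templating means rendering device configs from structured data, which Ansible does with Jinja2. A minimal standalone sketch, assuming the jinja2 package; the template and data are invented for illustration, loosely mirroring how a config module would consume a template file.

```python
from jinja2 import Template

# Standalone illustration of configuration templating: render interface
# configuration from structured data, roughly what Ansible's network
# config modules do with a Jinja2 template. Template and data invented.
template = Template(
    "{% for intf in interfaces %}"
    "interface {{ intf.name }}\n"
    " description {{ intf.desc }}\n"
    " ip address {{ intf.ip }}\n"
    "{% endfor %}"
)

print(template.render(interfaces=[
    {"name": "GigabitEthernet0/1", "desc": "uplink", "ip": "10.0.0.1/30"},
]))
```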
Three releases later (they just released 13.1), the same bug is still there (at least it was on a fresh Python virtual-environment install I made on an Ubuntu 24.04 server on December 13th, 2025), making all device_config modules unusable (without changing your Ansible playbooks) for configuration templating. Even worse: …