Kubernetes has transformed how we deploy and manage applications. It gives us the ability to spin up a virtual data center in minutes, scaling infrastructure with ease. But with great power comes great complexity, and in the case of Kubernetes, that complexity is security.
By default, Kubernetes permits all traffic between workloads in a cluster. This “allow by default” stance is convenient during development and testing, but it’s dangerous in production. It’s up to DevOps, DevSecOps, and cloud platform teams to lock things down.
To improve the security posture of a Kubernetes cluster, we can use microsegmentation, a practice that limits each workload’s network reach so it can only talk to the specific resources it needs. This is an essential security method in today’s cloud-native environments.
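To make the contrast between "allow by default" and microsegmentation concrete, here is a minimal sketch in Python that builds two standard `networking.k8s.io/v1` NetworkPolicy manifests: one that denies all traffic for every pod in a namespace, and one that re-opens a single path. The namespace, labels, policy names, and port are hypothetical, and the policies only take effect if the cluster's network plugin enforces NetworkPolicy.

```python
import json


def default_deny(namespace: str) -> dict:
    """Select every pod in the namespace and allow no ingress or egress."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            # An empty podSelector matches all pods in the namespace.
            "podSelector": {},
            # Listing both types with no rules blocks both directions.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


def allow_ingress(namespace: str, target: dict, source: dict, port: int) -> dict:
    """Allow TCP traffic to `target` pods only from `source` pods on one port."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-selected-ingress", "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": target},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [{"podSelector": {"matchLabels": source}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical namespace and labels, for illustration only.
    policies = [
        default_deny("shop"),
        allow_ingress("shop", {"app": "api"}, {"app": "web"}, 8080),
    ]
    # kubectl accepts JSON manifests as well as YAML.
    for policy in policies:
        print(json.dumps(policy, indent=2))
```

Starting from a deny-all policy and adding narrow allow rules is one common way to approach microsegmentation; it is not necessarily the approach the original post goes on to describe.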
We all understand that network policies can achieve microsegmentation; in other words, they can divide our Kubernetes network model into isolated pieces. This is important since Kubernetes is usually used to provide multiple teams with their infrastructure needs or to host multiple workloads for different tenants. With that, you would think network policies are first-class citizens of clusters. However, when we dig into implementing them, three operational challenges Continue reading
What is the Jevons Paradox? Tom, Eyvonne, and Russ discuss how this famous paradox impacts network engineering.
Social media users are tired of losing their identity and data every time a platform shuts down or pivots. In the ATProto ecosystem — short for Authenticated Transfer Protocol — users own their data and identities. Everything they publish becomes part of a global, cryptographically signed shared social web. Bluesky is the first big example, but a new wave of decentralized social networks is just beginning. In this post I’ll show you how to get started, by building and deploying a fully serverless ATProto application on Cloudflare’s Developer Platform.
Why serverless? The overhead of managing VMs, scaling databases, maintaining CI pipelines, distributing data across availability zones, and securing APIs against DDoS attacks pulls focus away from actually building.
That’s where Cloudflare comes in. You can take advantage of our Developer Platform to build applications that run on our global network: Workers deploy code globally in milliseconds, KV provides fast, globally distributed caching, D1 offers a distributed relational database, and Durable Objects manage WebSockets and handle real-time coordination. Best of all, everything you need to build your serverless ATProto application is available on our free tier, so you can get started without spending a cent. You can find the code in Continue reading
The Cloudflare Business Intelligence team manages a petabyte-scale data lake and ingests thousands of tables every day from many different sources. These include internal databases such as Postgres and ClickHouse, as well as external SaaS applications such as Salesforce. These tasks are often complex, and tables may have hundreds of millions or billions of rows of new data each day. They are also business-critical for product decisions, growth planning, and internal monitoring. In total, about 141 billion rows are ingested every day.
As Cloudflare has grown, the data has become ever larger and more complex. Our existing Extract Load Transform (ELT) solution could no longer meet our technical and business requirements. After evaluating other common ELT solutions, we concluded that their performance generally did not surpass our current system, either.
It became clear that we needed to build our own framework to cope with our unique requirements — and so Jetflow was born.
Over 100x efficiency improvement in GB-s:
Our longest running job with 19 billion rows was taking 48 hours using 300 GB of memory, and now completes in 5.5 hours using 4 GB of memory
We estimate that ingestion of Continue reading
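As a quick sanity check on the GB-s claim, the figures quoted above for the longest-running job can be plugged into a few lines of Python. This is a rough sketch that assumes the quoted memory figure was held for the entire runtime, which the post does not state; the function name is made up for illustration.

```python
def gb_seconds(hours: float, memory_gb: float) -> float:
    """Memory-time cost of a job, assuming constant memory use for its whole runtime."""
    return hours * 3600 * memory_gb


# Figures quoted for the longest-running job (19 billion rows).
before = gb_seconds(48, 300)   # old system: 48 hours at 300 GB
after = gb_seconds(5.5, 4)     # new system: 5.5 hours at 4 GB
improvement = before / after

print(f"before: {before:,.0f} GB-s")
print(f"after:  {after:,.0f} GB-s")
print(f"improvement: {improvement:.0f}x")
```

Under that assumption, this particular job improves by several hundred times, which is consistent with the "over 100x" headline figure.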
One should never trust the technical details published by the industry press, but assuming the Tomahawk Ultra puff piece isn’t too far off the mark, the new Broadcom ASIC (supposedly loosely based on emerging Ultra Ethernet specs):
If you’re ancient enough, you might recognize #3 as part of Fibre Channel, #2 and #3 as part of IEEE 802.1 LLC2 (used by IBM to implement SNA over Token Ring and Ethernet), and all three as the fundamental ideas of X.25 that Broadcom obviously reinvented at 800 Gbps speeds, proving (yet again) RFC 1925 Rule 11.
I have a little confession. Sometimes I like to go into Best Buy and just listen. I pretend to be shopping for modem bearings or a left-handed torque wrench. What I’m really doing is hearing how people sell computers. I remember when 8x CD burners were all the rage. I recall picking one particular machine because it had an integrated Sound Blaster card. Today, I just marvel at how the associates rattle off a long string of impressive-sounding nonsense that consumers will either buy hook, line, and sinker or refute based on some YouTube reviewer recommendation. Every once in a while, though, I hear someone who actually does understand the lingo and it is wonderful. They listen and understand the challenges and don’t sell a $3,000 gaming computer to a grandmother just to play Candy Crush and look up grandkid photos on Facebook.
What does that story have to do with the title of this post? Well, dear young readers, you may not remember the time when Best Buy Blue was locked in mortal competition with Circuit City Red. In a time before Amazon was ascendant you had to pick between the two giants of Continue reading
On July 19, 2025, Microsoft disclosed CVE-2025-53770, a critical zero-day Remote Code Execution (RCE) vulnerability. Assigned a CVSS 3.1 base score of 9.8 (Critical), the vulnerability affects SharePoint Server 2016, 2019, and the Subscription Edition, along with unsupported 2010 and 2013 versions. Cloudflare’s WAF Managed Rules now includes 2 emergency releases that mitigate these vulnerabilities for WAF customers.
The vulnerability's root cause is improper deserialization of untrusted data, which allows a remote, unauthenticated attacker to execute arbitrary code over the network without any user interaction. Moreover, what makes CVE-2025-53770 uniquely threatening is its methodology – the exploit chain, labeled "ToolShell." ToolShell is engineered to play the long game: attackers are not only gaining temporary access, but also taking the server's cryptographic machine keys, specifically the ValidationKey and DecryptionKey. Possessing these keys allows threat actors to independently forge authentication tokens and __VIEWSTATE payloads, granting them persistent access that can survive standard mitigation strategies such as a server reboot or removing web shells.
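Why stolen keys outlive a reboot can be shown with a generic message authentication sketch. This is not SharePoint's or ASP.NET's actual token or view state format; it is a plain HMAC-SHA256 illustration with hypothetical names, showing that anyone holding the signing key can produce payloads the server accepts until that key is rotated.

```python
import hashlib
import hmac
import secrets


def sign(payload: bytes, key: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the payload."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify(payload: bytes, tag: bytes, key: bytes) -> bool:
    """Check the tag in constant time, as a server would."""
    return hmac.compare_digest(sign(payload, key), tag)


# The server's signing key (stand-in for a validation key).
server_key = secrets.token_bytes(32)

# Assumption for illustration: the attacker has copied the key.
stolen_key = server_key

# The attacker signs a payload of their choosing, offline.
forged_payload = b"attacker-chosen state"
forged_tag = sign(forged_payload, stolen_key)

# A reboot does not change the key, so the forgery still verifies.
accepted_before_rotation = verify(forged_payload, forged_tag, server_key)

# Only after the key is rotated does the same forgery get rejected.
rotated_key = secrets.token_bytes(32)
accepted_after_rotation = verify(forged_payload, forged_tag, rotated_key)

print("accepted before rotation:", accepted_before_rotation)
print("accepted after rotation: ", accepted_after_rotation)
```

The takeaway for defenders is that removing web shells or restarting a server leaves the old keys in place; the forged payloads stop verifying only once the keys themselves are replaced.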
In response to the active nature of these attacks, the U.S. Cybersecurity and Infrastructure Security Agency (CISA) added CVE-2025-53770 to its Known Exploited Vulnerabilities (KEV) catalog with an emergency remediation deadline. Continue reading
Cloudflare’s network currently spans more than 330 cities in over 125 countries, and we interconnect with over 13,000 network providers in order to provide a broad range of services to millions of customers. The breadth of both our network and our customer base provides us with a unique perspective on Internet resilience, enabling us to observe the impact of Internet disruptions at both a local and national level, as well as at a network level.
As we have noted in the past, this post is intended as a summary overview of observed and confirmed disruptions, and is not an exhaustive or complete list of issues that have occurred during the quarter. A larger list of detected traffic anomalies is available in the Cloudflare Radar Outage Center. Note that both bytes-based and request-based traffic graphs are used within the post to illustrate the impact of the observed disruptions — the choice of metric was generally made based on which better illustrated the impact of the disruption.
In our Q1 2025 summary post, we noted that we had not observed any government-directed Internet shutdowns during the quarter. Unfortunately, that forward progress was short-lived — in the second quarter of 2025, we Continue reading
A while ago, I published a blog post proudly describing the netlab integration test that should check for incorrect OSPF network types in netlab-generated device configurations. Almost immediately, Erik Auerswald pointed out that my test wouldn’t detect that error (it might detect other errors, though) as the OSPF network adjacency is always established even when the adjacent routers have mismatching OSPF network types.
I made one of the oldest testing mistakes: I checked whether my test would work under the correct conditions but not whether it would detect an incorrect condition.
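The mistake is easy to reproduce in miniature. The sketch below is not netlab's validation API; the data class and check functions are hypothetical. It models the behavior described above (the adjacency comes up even when the network type is wrong) and adds a negative control that asks each check whether it actually notices a deliberately broken device.

```python
from dataclasses import dataclass


@dataclass
class OspfInterface:
    neighbor_state: str   # e.g. "Full" once the adjacency is up
    network_type: str     # e.g. "point-to-point" or "broadcast"


def check_adjacency(intf: OspfInterface) -> bool:
    """Weak check: passes whenever the adjacency is established."""
    return intf.neighbor_state == "Full"


def check_network_type(intf: OspfInterface, expected: str) -> bool:
    """Direct check: inspects the property we actually care about."""
    return intf.network_type == expected


# Correct device: both checks pass, which proves very little.
good = OspfInterface(neighbor_state="Full", network_type="point-to-point")

# Deliberately misconfigured device. The adjacency is modeled as still
# coming up, as described in the text above.
bad = OspfInterface(neighbor_state="Full", network_type="broadcast")

passes_on_good = check_adjacency(good) and check_network_type(good, "point-to-point")

# Negative control: does each check fail on the broken device?
weak_detects_fault = not check_adjacency(bad)
direct_detects_fault = not check_network_type(bad, "point-to-point")

print("both checks pass on good config:", passes_on_good)
print("adjacency check detects fault:  ", weak_detects_fault)
print("network-type check detects fault:", direct_detects_fault)
```

Running a test against a known-bad input before trusting it is cheap, and it would have exposed the weak check immediately.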
Recently, I started self-hosting most of the apps I use, like Memos for note-taking and Paperless-NGX for document management. The next one on the list was Immich. Immich is a self-hosted photo and video backup solution that supports features like facial recognition and automatic uploads.
In this post, we’ll look at how to set up Immich as a Docker container and also how to add an NFS share as an external library.
I have a lot of pictures on my NAS that I’ve collected over the years. This includes photos of friends, family, and ones from my older phones. I wanted a way to manage and organise them from one place. I also didn’t want to upload all of them to Google or Apple, which would cost quite a bit. Continue reading