Archive

Category Archives for "CloudFlare"

Cloudflare outage on November 18, 2025

On 18 November 2025 at 11:20 UTC (all times in this blog are UTC), Cloudflare's network began experiencing significant failures to deliver core network traffic. This showed up to Internet users trying to access our customers' sites as an error page indicating a failure within Cloudflare's network.

The issue was not caused, directly or indirectly, by a cyber attack or malicious activity of any kind. Instead, it was triggered by a change to one of our database systems' permissions which caused the database to output multiple entries into a “feature file” used by our Bot Management system. That feature file, in turn, doubled in size. The larger-than-expected feature file was then propagated to all the machines that make up our network.

The software running on these machines to route traffic across our network reads this feature file to keep our Bot Management system up to date with ever changing threats. The software had a limit on the size of the feature file that was below its doubled size. That caused the software to fail.

After we initially wrongly suspected the symptoms we were seeing were caused by a hyper-scale DDoS attack, we correctly identified the core issue and were able Continue reading

Replicate is joining Cloudflare

We have some big news to share today: Replicate, the leading platform for running AI models, is joining Cloudflare.

We first started talking to Replicate because we shared a lot in common beyond just a passion for bright color palettes. Our mission for Cloudflare’s Workers developer platform has been to make building and deploying full-stack applications as easy as possible. Meanwhile, Replicate has been on a similar mission to make deploying AI models as easy as writing a single line of code. And we realized we could build something even better together by integrating the Replicate platform into Cloudflare directly.

We are excited to share this news and even more excited for what it will mean for customers. Bringing Replicate’s tools into Cloudflare will continue to make our Developer Platform the best place on the Internet to build and deploy any AI or agentic workflow.

What does this mean for you? 

Before we spend more time talking about the future of AI, we want to answer the questions that are top of mind for Replicate and Cloudflare users. In short: 

For existing Replicate users: Your APIs and workflows will continue to work without interruption. You will soon benefit from the Continue reading

Finding the grain of sand in a heap of Salt

How do you find the root cause of a configuration management failure when you have a peak of hundreds of changes in 15 minutes on thousands of servers?

That was the challenge we faced as we built the infrastructure to reduce release delays due to failures of Salt, a configuration management tool. (We eventually reduced such failures on the edge by over 5%, as we’ll explain below.) We’ll explore the fundamentals of Salt, and how it is used at Cloudflare. We then describe the common failure modes and how they delay our ability to release valuable changes to serve our customers.

By first solving an architectural problem, we provided the foundation for self-service mechanisms to find the root cause of Salt failures on servers, datacenters and groups of datacenters. This system is able to correlate failures with git commits, external service failures and ad hoc releases. The result of this has been a reduction in the duration of software release delays, and an overall reduction in toilsome, repetitive triage for SRE.

To start, we will go into the basics of the Cloudflare network and how Salt operates within it. And then we’ll get to how we solved the challenge Continue reading

Connecting to production: the architecture of remote bindings

Remote bindings are bindings that connect to a deployed resource on your Cloudflare account instead of a locally simulated resource – and recently, we announced that remote bindings are now generally available

With this launch, you can now connect to deployed resources like R2 buckets and D1 databases while running Worker code on your local machine. This means you can test your local code changes against real data and services, without the overhead of deploying for each iteration. 

In this blog post, we’ll dig into the technical details of how we built it, creating a seamless local development experience.

Developing on the Workers platform

A key part of the Cloudflare Workers platform has been the ability to develop your code locally without having to deploy it every time you wanted to test something – though the way we’ve supported this has changed greatly over the years. 

We started with wrangler dev running in remote mode. This works by deploying and connecting to a preview version of your Worker that runs on Cloudflare’s network every time you make a change to your code, allowing you to test things out as you develop. However, remote mode isn’t perfect — Continue reading

A closer look at Python Workflows, now in beta

Developers can already use Cloudflare Workflows to build long-running, multi-step applications on Workers. Now, Python Workflows are here, meaning you can use your language of choice to orchestrate multi-step applications.

With Workflows, you can automate a sequence of idempotent steps in your application with built-in error handling and retry behavior. But Workflows were originally supported only in TypeScript. Since Python is the de facto language of choice for data pipelines, artificial intelligence/machine learning, and task automation – all of which heavily rely on orchestration – this created friction for many developers.

Over the years, we’ve been giving developers the tools to build these applications in Python, on Cloudflare. In 2020, we brought Python to Workers via Transcrypt before directly integrating Python into workerd in 2024. Earlier this year, we built support for CPython along with any packages built in Pyodide, like matplotlib and pandas, in Workers. Now, Python Workflows are supported as well, so developers can create robust applications using the language they know best.

Why Python for Workflows?

Imagine you’re training an LLM. You need to label the dataset, feed data, wait for the model to run, evaluate the loss, adjust the model, and repeat. Without automation, Continue reading

DIY BYOIP: a new way to Bring Your Own IP prefixes to Cloudflare

When a customer wants to bring IP address space to Cloudflare, they’ve always had to reach out to their account team to put in a request. This request would then be sent to various Cloudflare engineering teams such as addressing and network engineering — and then the team responsible for the particular service they wanted to use the prefix with (e.g., CDN, Magic Transit, Spectrum, Egress). In addition, they had to work with their own legal teams and potentially another organization if they did not have primary ownership of an IP prefix in order to get a Letter of Agency (LOA) issued through hoops of approvals. This process is complex, manual, and  time-consuming for all parties involved — sometimes taking up to 4–6 weeks depending on various approvals. 

Well, no longer! Today, we are pleased to announce the launch of our self-serve BYOIP API, which enables our customers to onboard and set up their BYOIP prefixes themselves.

With self-serve, we handle the bureaucracy for you. We have automated this process using the gold standard for routing security — the Resource Public Key Infrastructure, RPKI. All the while, we continue to ensure the best quality of service by Continue reading

Extract audio from your videos with Cloudflare Stream

Cloudflare Stream loves video. But we know not every workflow needs the full picture, and the popularity of podcasts highlights how compelling stand-alone audio can be. For developers, processing a video just to access audio is slow, costly, and complex. 

What makes video so expensive? A video file is a dense stack of high-resolution images, stitched together over time. As such, it is not just “one file” —  it’s a container of high-dimensional data such as frames per second, resolution, codecs. Analyzing video means traversing time resolution frame rate.

Why audio extraction

By comparison, an audio file is far simpler. If an audio file consists of only one channel, it is defined as a single waveform. The technical characteristics of this waveform are defined by the sample rate (the number of audio samples taken per second), and the bit depth (the precision of each sample).

With the rise of computationally intensive AI inference pipelines, many of our customers want to perform downstream workflows that require only analyzing the audio. For example:

  • Power AI and Machine Learning: In addition to translation and transcription, you can feed the audio into Voice-to-Text models for speech recognition or analysis, or AI-powered summaries.

  • Improve Continue reading

Async QUIC and HTTP/3 made easy: tokio-quiche is now open-source

A little over 6 years ago, we presented quiche, our open source QUIC implementation written in Rust. Today we’re announcing the open sourcing of tokio-quiche, our battle-tested, asynchronous QUIC library combining both quiche and the Rust Tokio async runtime. Powering Cloudflare’s Proxy B in Apple iCloud Private Relay and our next-generation Oxy-based proxies, tokio-quiche handles millions of HTTP/3 requests per second with low latency and high throughput. tokio-quiche also powers Cloudflare Warp’s MASQUE client, replacing our WireGuard tunnels with QUIC-based tunnels, and the async version of h3i.

quiche was developed as a sans-io library, meaning that it implements the state machine required to handle the QUIC transport protocol while not making any assumptions about how its user intends to perform IO. This means that, with enough elbow grease, anyone can write an IO integration with quiche! This entails connecting or listening on a UDP socket, managing sending and receiving UDP datagrams on that socket while feeding all network information to quiche. Given we need this integration to be async, we’d have to do all this while integrating with an async Rust runtime. tokio-quiche does all of that for you, no grease required.

Lowering the barrier to Continue reading

How Workers VPC Services connects to your regional private networks from anywhere in the world

In April, we shared our vision for a global virtual private cloud on Cloudflare, a way to unlock your applications from regionally constrained clouds and on-premise networks, enabling you to build truly cross-cloud applications.

Today, we’re announcing the first milestone of our Workers VPC initiative: VPC Services. VPC Services allow you to connect to your APIs, containers, virtual machines, serverless functions, databases and other services in regional private networks via Cloudflare Tunnels from your Workers running anywhere in the world. 

Once you set up a Tunnel in your desired network, you can register each service that you want to expose to Workers by configuring its host or IP address. Then, you can access the VPC Service as you would any other Workers service binding — Cloudflare’s network will automatically route to the VPC Service over Cloudflare’s network, regardless of where your Worker is executing:

export default {
  async fetch(request, env, ctx) {
    // Perform application logic in Workers here	

    // Call an external API running in a ECS in AWS when needed using the binding
    const response = await env.AWS_VPC_ECS_API.fetch("http://internal-host.com");

    // Additional application logic in Workers
    return new Response();
  },
};

Workers VPC is now Continue reading

Building a better testing experience for Workflows, our durable execution engine for multi-step applications

Cloudflare Workflows is our take on "Durable Execution." They provide a serverless engine, powered by the Cloudflare Developer Platform, for building long-running, multi-step applications that persist through failures. When Workflows became generally available earlier this year, they allowed developers to orchestrate complex processes that would be difficult or impossible to manage with traditional stateless functions. Workflows handle state, retries, and long waits, allowing you to focus on your business logic.

However, complex orchestrations require robust testing to be reliable. To date, testing Workflows was a black-box process. Although you could test if a Workflow instance reached completion through an await to its status, there was no visibility into the intermediate steps. This made debugging really difficult. Did the payment processing step succeed? Did the confirmation email step receive the correct data? You couldn't be sure without inspecting external systems or logs. 

Why was this necessary?

As developers ourselves, we understand the need to ensure reliable code, and we heard your feedback loud and clear: the developer experience for testing Workflows needed to be better.

The black box nature of testing was one part of the problem. Beyond that, though, the limited testing offered came at a high Continue reading

Fresh insights from old data: corroborating reports of Turkmenistan IP unblocking and firewall testing

Here at Cloudflare, we frequently use and write about data in the present. But sometimes understanding the present begins with digging into the past.  

We recently learned of a 2024 turkmen.news article (available in Russian) that reports Turkmenistan experienced “an unprecedented easing in blocking,” causing over 3 billion previously-blocked IP addresses to become reachable. The same article reports that one of the reasons for unblocking IP addresses was that Turkmenistan may have been testing a new firewall. (The Turkmen government’s tight control over the country’s Internet access is well-documented.) 

Indeed, Cloudflare Radar shows a surge of requests coming from Turkmenistan around the same time, as we’ll show below. But we had an additional question: Does the firewall activity show up on Radar, as well? Two years ago, we launched the dashboard on Radar to give a window into the TCP connections to Cloudflare that close due to resets and timeouts. These stand out because they are considered ungraceful mechanisms to close TCP connections, according to the TCP specification. 

In this blog post, we go back in time to share what Cloudflare saw in connection resets and timeouts. We must remind our readers that, as passive observers, Continue reading

Load Balancing Monitor Groups: Multi-Service Health Checks for Resilient Applications

Modern applications are not monoliths. They are complex, distributed systems where availability depends on multiple independent components working in harmony. A web server might be running, but if its connection to the database is down or the authentication service is unresponsive, the application as a whole is unhealthy. Relying on a single health check is like knowing the “check engine” light is not on, but not knowing that one of your tires has a puncture. It’s great your engine is going, but you’re probably not driving far.

As applications grow in complexity, so does the definition of "healthy." We've heard from customers, big and small, that they need to validate multiple services to consider an endpoint ready to receive traffic. For example, they may need to confirm that an underlying API gateway is healthy and that a specific ‘/login’ service is responsive before routing users there. Until now, this required building custom, synthetic services to aggregate these checks, adding operational overhead and another potential point of failure.

Today, we are introducing Monitor Groups for Cloudflare Load Balancing. This feature provides a new way to create sophisticated, multi-service health assessments directly on our platform. With Monitor Groups, you can bundle Continue reading

Improving the trustworthiness of Javascript on the Web

The web is the most powerful application platform in existence. As long as you have the right API, you can safely run anything you want in a browser.

Well… anything but cryptography.

It is as true today as it was in 2011 that Javascript cryptography is Considered Harmful. The main problem is code distribution. Consider an end-to-end-encrypted messaging web application. The application generates cryptographic keys in the client’s browser that lets users view and send end-to-end encrypted messages to each other. If the application is compromised, what would stop the malicious actor from simply modifying their Javascript to exfiltrate messages?

It is interesting to note that smartphone apps don’t have this issue. This is because app stores do a lot of heavy lifting to provide security for the app ecosystem. Specifically, they provide integrity, ensuring that apps being delivered are not tampered with, consistency, ensuring all users get the same app, and transparency, ensuring that the record of versions of an app is truthful and publicly visible.

It would be nice if we could get these properties for our end-to-end encrypted web application, and the web as a whole, without requiring a single central authority like Continue reading

Unpacking Cloudflare Workers CPU Performance Benchmarks

On October 4, independent developer Theo Browne published a series of benchmarks designed to compare server-side JavaScript execution speed between Cloudflare Workers and Vercel, a competing compute platform built on AWS Lambda. The initial results showed Cloudflare Workers performing worse than Node.js on Vercel at a variety of CPU-intensive tasks, by a factor of as much as 3.5x.

We were surprised by the results. The benchmarks were designed to compare JavaScript execution speed in a CPU-intensive workload that never waits on external services. But, Cloudflare Workers and Node.js both use the same underlying JavaScript engine: V8, the open source engine from Google Chrome. Hence, one would expect the benchmarks to be executing essentially identical code in each environment. Physical CPUs can vary in performance, but modern server CPUs do not vary by anywhere near 3.5x.

On investigation, we discovered a wide range of small problems that contributed to the disparity, ranging from some bad tuning in our infrastructure, to differences between the JavaScript libraries used on each platform, to some issues with the test itself. We spent the week working on many of these problems, which means over the past week Workers got better and faster Continue reading

Introducing REACT: Why We Built an Elite Incident Response Team

Cloudforce One’s mission is to help defend the Internet. In Q2’25 alone, Cloudflare stopped an average of 190 billion cyber threats every single day. But real-world customer experiences showed us that stopping attacks at the edge isn’t always enough. We saw ransomware disrupt financial operations, data breaches cripple real estate firms, and misconfigurations cause major data losses.

In each case, the real damage occurred inside networks.

These internal breaches uncovered another problem: customers had to hand off incidents to separate internal teams for investigation and remediation. Those handoffs created delays and fractured the response. The result was a gap that attackers could exploit. Critical context collected at the edge didn’t reach the teams managing cleanup, and valuable time was lost. Closing this gap has become essential, and we recognized the need to take responsibility for providing customers with a more unified defense.

Today, Cloudforce One is launching a new suite of incident response and security services to help organizations prepare for and respond to breaches.

These services are delivered by Cloudforce One REACT (Respond, Evaluate, Assess, Consult Team), a group of seasoned responders and security veterans who investigate threats, hunt adversaries, and work closely with executive leadership to guide Continue reading

How we found a bug in Go’s arm64 compiler

Every second, 84 million HTTP requests are hitting Cloudflare across our fleet of data centers in 330 cities. It means that even the rarest of bugs can show up frequently. In fact, it was our scale that recently led us to discover a bug in Go's arm64 compiler which causes a race condition in the generated code.

This post breaks down how we first encountered the bug, investigated it, and ultimately drove to the root cause.

Investigating a strange panic

We run a service in our network which configures the kernel to handle traffic for some products like Magic Transit and Magic WAN. Our monitoring watches this closely, and it started to observe very sporadic panics on arm64 machines.

We first saw one with a fatal error stating that traceback did not unwind completely. That error suggests that invariants were violated when traversing the stack, likely because of stack corruption. After a brief investigation we decided that it was probably rare stack memory corruption. This was a largely idle control plane service where unplanned restarts have negligible impact, and so we felt that following up was not a priority unless it kept happening.

And then it kept happening. 

Coredumps Continue reading

Payload on Workers: a full-fledged CMS, running entirely on Cloudflare’s stack

Tucked behind the administrator login screen of countless websites is one of the Internet’s unsung heroes: the Content Management System (CMS). This seemingly basic piece of software is used to draft and publish blog posts, organize media assets, manage user profiles, and perform countless other tasks across a dizzying array of use cases. One standout in this category is a vibrant open-source project called Payload, which has over 35,000 stars on GitHub and has generated so much community excitement that it was recently acquired by Figma.

Today we’re excited to showcase a new template from the Payload team, which makes it possible to deploy a full-fledged CMS to Cloudflare’s platform in a single click: just click the Deploy to Cloudflare button to generate a fully-configured Payload instance, complete with bindings to Cloudflare D1 and R2. Below we’ll dig into the technical work that enables this, some of the opportunities it unlocks, and how we’re using Payload to help power Cloudflare TV. But first, a look at why hosting a CMS on Workers is such a game changer.

Behind the scenes: Cloudflare TV’s Payload instance

Serverless by design

Most CMSs are designed to be hosted on a conventional server that runs Continue reading

Nationwide Internet shutdown in Afghanistan extends localized disruptions

Just after 11:30 UTC (16:00 local time) on Monday, September 29, 2025, subscribers of wired Internet providers in Afghanistan experienced a brief service interruption, lasting until just before 12:00 UTC (16:30 local time). Cloudflare traffic data for AS38472 (Afghan Wireless) and AS131284 (Etisalat) shows that traffic from these mobile providers remained available during that period.

However, just after 12:30 UTC (17:00 local time), the Internet was completely shut down, with Afghani news outlet TOLOnews initially reporting in a post on X that “Sources have confirmed to TOLOnews that today (Monday), afternoon, fiber-optic Internet will be shut down across the country.” This shutdown is likely an extension of the regional shutdowns of fiber optic connections that took place earlier in September, and it will reportedly remain in force “until further notice”. (The earlier regional shutdowns are discussed in more detail below.)

While Monday’s first shutdown was only partial, with mobile connectivity apparently remaining available, the graphs below show that the second event took the country completely offline, with web and DNS traffic dropping to zero at a national level, as seen in the graphs below.

While the shutdown will impact subscribers to fixed and mobile Internet Continue reading

15 years of helping build a better Internet: a look back at Birthday Week 2025

Cloudflare launched fifteen years ago with a mission to help build a better Internet. Over that time the Internet has changed and so has what it needs from teams like ours.  In this year’s Founder’s Letter, Matthew and Michelle discussed the role we have played in the evolution of the Internet, from helping encryption grow from 10% to 95% of Internet traffic to more recent challenges like how people consume content. 

We spend Birthday Week every year releasing the products and capabilities we believe the Internet needs at this moment and around the corner. Previous Birthday Weeks saw the launch of IPv6 gateway in 2011,  Universal SSL in 2014, Cloudflare Workers and unmetered DDoS protection in 2017, Cloudflare Radar in 2020, R2 Object Storage with zero egress fees in 2021,  post-quantum upgrades for Cloudflare Tunnel in 2022, Workers AI and Encrypted Client Hello in 2023. And those are just a sample of the launches.

This year’s themes focused on helping prepare the Internet for a new model of monetization that encourages great content to be published, fostering more opportunities to build community both inside and outside of Cloudflare, and evergreen missions like making more features available to Continue reading

An AI Index for all our customers

Today, we’re announcing the private beta of AI Index for domains on Cloudflare, a new type of web index that gives content creators the tools to make their data discoverable by AI, and gives AI builders access to better data for fair compensation.

With AI Index enabled on your domain, we will automatically create an AI-optimized search index for your website, and expose a set of ready-to-use standard APIs and tools including an MCP server, LLMs.txt, and a search API. Our customers will own and control that index and how it’s used, and you will have the ability to monetize access through Pay per crawl and the new x402 integrations. You will be able to use it to build modern search experiences on your own site, and more importantly, interact with external AI and Agentic providers to make your content more discoverable while being fairly compensated.

For AI builders—whether developers creating agentic applications, or AI platform companies providing foundational LLM models—Cloudflare will offer a new way to discover and retrieve web content: direct pub/sub connections to individual websites with AI Index. Instead of indiscriminate crawling, builders will be able to subscribe to specific sites that have opted in for Continue reading