Search platforms historically crawled web sites with the implicit promise that, as the sites showed up in the results for relevant searches, they would send traffic on to those sites — in turn leading to ad revenue for the publisher. This model worked fairly well for several decades, with a whole industry emerging around optimizing content for prominent placement in search results. It led to higher click-through rates, more eyeballs for publishers, and, ideally, more ad revenue. However, the emergence of AI platforms over the last several years, and the incorporation of AI "overviews" into classic search platforms, has turned the model on its head. When users turn to these AI platforms with queries that used to go to search engines, they often won't click through to the original source site once an answer is provided — and that assumes that a link to the source is provided at all! No clickthrough, no eyeballs, and no ad revenue.
To provide perspective on the scope of this problem, Cloudflare Radar launched crawl-to-refer ratios on July 1, based on traffic seen across our entire customer base. These ratios compare the number of crawling requests for HTML pages from the crawler Continue reading
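As a rough illustration of what a crawl-to-refer ratio measures, here is a minimal sketch; the counts below are hypothetical, not real Radar data:

```python
# Hypothetical sketch: computing a crawl-to-refer ratio for one AI platform.
# The request and visit counts are illustrative, not real Radar data.
def crawl_refer_ratio(crawl_requests: int, referred_visits: int) -> float:
    """HTML crawl requests seen per visit referred back to the site."""
    if referred_visits == 0:
        return float("inf")  # the platform crawls but sends no traffic back
    return crawl_requests / referred_visits

# e.g. 38,000 crawl requests against 500 referred visits
print(crawl_refer_ratio(38_000, 500))  # 76.0
```

The higher the ratio, the more a platform takes (crawls) relative to what it gives back (referrals).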
Last week, we wrote about face cropping for Images, which runs an open-source face detection model in Workers AI to automatically crop images of people at scale.
It wasn’t too long ago when deploying AI workloads was prohibitively complex. Real-time inference previously required specialized (and costly) hardware, and we didn’t always have standard abstractions for deployment. We also didn’t always have Workers AI to enable developers — including ourselves — to ship AI features without this additional overhead.
And whether you’re skeptical or celebratory of AI, you’ve likely seen its explosive progression. New benchmark-breaking models are released every week. We now expect a fairly high degree of accuracy — the more important differentiators are how well a model fits within a product’s infrastructure and what developers do with its predictions.
This week, we’re introducing background removal for Images. This feature runs a dichotomous image segmentation model on Workers AI to isolate subjects in an image from their backgrounds. We took a controlled, deliberate approach to testing models for efficiency and accuracy.
Here’s how we evaluated various image segmentation models to develop background removal.
In computer vision, image segmentation is the process of splitting Continue reading
Empowering content creators in the age of AI with smarter crawling controls and direct communication channels
Imagine you run a regional news site. Last month an AI bot scraped 3 years of archives in minutes — with no payment and little to no referral traffic. As a small company, you may struggle to get the AI company's attention for a licensing deal. Do you block all crawler traffic, or do you let them in and settle for the few referrals they send?
It’s a choice between two bad options.
Cloudflare wants to help break that stalemate. On July 1st of this year, we declared Content Independence Day based on a simple premise: creators deserve control of how their content is accessed and used. Today, we're taking the next step in that journey by releasing AI Crawl Control to general availability — giving content creators and AI crawlers an important new way to communicate.
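One concrete channel sites already use to express crawling preferences is robots.txt. The sketch below, using Python's standard-library `urllib.robotparser`, shows how a policy can admit a search crawler while disallowing a hypothetical AI training crawler; the user-agent names are examples only:

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt that allows a search crawler but disallows a
# (hypothetical) AI training crawler; the user-agent names are examples only.
ROBOTS_TXT = """\
User-agent: ExampleSearchBot
Allow: /

User-agent: ExampleAITrainingBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.can_fetch("ExampleSearchBot", "/articles/today"))      # True
print(parser.can_fetch("ExampleAITrainingBot", "/articles/today"))  # False
```

Of course, robots.txt is only a request — it relies on crawlers choosing to honor it, which is part of why enforcement and direct communication channels matter.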
Today, we're rebranding our AI Audit tool as AI Crawl Control and moving it from beta to general availability. This reflects the tool's evolution from simple monitoring to detailed insights and control over how AI systems can access your content.
The Continue reading
Publishers and content creators have historically relied on traditional keyword-based search to help users navigate their website’s content. However, traditional search is built on outdated assumptions: users type in keywords to indicate intent, and the site returns a list of links for the most relevant results. It’s up to the visitor to click around, skim pages, and piece together the answer they’re looking for.
AI has reset expectations, and that paradigm is breaking: how we search for information has fundamentally changed.
Users no longer want to search websites the old way. They’re used to interacting with AI systems like Copilot, Claude, and ChatGPT, where they can simply ask a question and get an answer. We’ve moved from search engines to answer engines.
At the same time, websites now have a new class of visitors, AI agents. Agents face the same pain with keyword search: they have to issue keyword queries, click through links, and scrape pages to piece together answers. But they also need more: a structured way to ask questions and get reliable answers across websites. This means that websites need a way to give the agents they trust controlled access, so Continue reading
On the surface, the goal of handling bot traffic is clear: keep malicious bots away, while letting through the helpful ones. Some bots are evidently malicious — such as mass price scrapers or those testing stolen credit cards. Others are helpful, like the bots that index your website. Cloudflare has segmented this second category of helpful bot traffic through our verified bots program, vetting and validating bots that are transparent about who they are and what they do.
Today, the rise of agents has transformed how we interact with the Internet, often blurring the distinctions between benign and malicious bot actors. Bots are no longer directed only by the bot owners, but also by individual end users to act on their behalf. These bots directed by end users are often working in ways that website owners want to allow, such as planning a trip, ordering food, or making a purchase.
Our customers have asked us for easier, more granular ways to ensure specific bots, crawlers, and agents can reach their websites, while continuing to block bad actors. That’s why we’re excited to introduce signed agents, an extension of our verified bots program that gives a new bot Continue reading
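The general idea behind cryptographically attributable bot requests can be sketched with a toy HMAC scheme. This is a conceptual illustration only — real proposals for bot authentication use asymmetric signatures (e.g. Ed25519 over HTTP message components), not a shared secret:

```python
import base64
import hashlib
import hmac

# Conceptual sketch only: real bot-signature schemes (e.g. HTTP Message
# Signatures) use asymmetric keys such as Ed25519, not a shared secret.
SECRET = b"demo-shared-secret"  # hypothetical key material

def sign_request(method: str, path: str, agent_id: str) -> str:
    """Produce a signature binding the agent identity to request metadata."""
    msg = f"{method}\n{path}\n{agent_id}".encode()
    return base64.b64encode(hmac.new(SECRET, msg, hashlib.sha256).digest()).decode()

def verify_request(method: str, path: str, agent_id: str, signature: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign_request(method, path, agent_id)
    return hmac.compare_digest(expected, signature)

sig = sign_request("GET", "/products", "example-agent")
print(verify_request("GET", "/products", "example-agent", sig))  # True
print(verify_request("GET", "/admin", "example-agent", sig))     # False
```

The point of the sketch: a site can verify *who* a request claims to come from, and a signature over one request cannot be replayed against a different path.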
Getting the observability you need is challenging enough when the code is deterministic, but AI presents a new challenge — a core part of your user’s experience now relies on a non-deterministic engine that produces unpredictable outputs. Many factors can influence the results, from the model to the system prompt. And on top of that, you still have to worry about performance, reliability, and costs.
Solving performance, reliability, and observability challenges is exactly what Cloudflare was built for, and two years ago, with the introduction of AI Gateway, we set out to extend the same level of control to our users in the age of AI.
Today, we’re excited to announce several features to make building AI applications easier and more manageable: unified billing, secure key storage, dynamic routing, and security controls with Data Loss Prevention (DLP). This means that AI Gateway becomes your go-to place to control costs and API keys, route between different models and providers, and manage your AI traffic. Check out our new AI Gateway landing page for more information at a glance.
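To make "dynamic routing" concrete, here is a hypothetical sketch of routing logic that picks a provider and model from request attributes. The provider and model names, and the thresholds, are invented for illustration and are not AI Gateway's actual configuration:

```python
# Hypothetical sketch of dynamic routing: pick a provider and model based
# on request attributes. All names and thresholds are illustrative only.
from dataclasses import dataclass

@dataclass
class Route:
    provider: str
    model: str

def route(prompt_tokens: int, needs_vision: bool) -> Route:
    if needs_vision:
        return Route("provider-a", "vision-model")
    if prompt_tokens > 8_000:
        return Route("provider-b", "long-context-model")
    return Route("provider-c", "fast-small-model")  # cheap, fast default

print(route(200, False))     # Route(provider='provider-c', model='fast-small-model')
print(route(20_000, False))  # falls through to the long-context route
```

A gateway evaluating rules like these in one place is what lets you swap models or providers without touching application code.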
When using an AI provider, you typically have to sign up for Continue reading
Inference powers some of today’s most powerful AI products: chat bot replies, AI agents, autonomous vehicle decisions, and fraud detection. The problem is, if you’re building one of these products on top of a hyperscaler, you’ll likely need to rent expensive GPUs from large centralized data centers to run your inference tasks. That model doesn’t work for Cloudflare — there’s a mismatch between Cloudflare’s globally-distributed network and a typical centralized AI deployment using large multi-GPU nodes. As a company that operates our own compute on a lean, fast, and widely distributed network within 50ms of 95% of the world’s Internet-connected population, we need to be running inference tasks more efficiently than anywhere else.
This is further compounded by the fact that AI models are getting larger and more complex. As we started to support these models, like the Llama 4 herd and gpt-oss, we realized that we couldn’t just throw money at the scaling problems by buying more GPUs. We needed to utilize every bit of idle capacity and be agile with where each model is deployed.
After running most of our models on the widely used open source inference and serving engine vLLM, we figured out it Continue reading
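The capacity problem described above — squeezing many models onto scarce GPUs — can be illustrated with a toy first-fit placement. The capacities and model sizes are made up, and this is not Cloudflare's actual scheduler:

```python
# Toy first-fit-decreasing placement: assign models to GPUs by free memory.
# Model sizes and GPU capacity (in GB) are made up; this is a sketch of the
# capacity-packing problem, not Cloudflare's actual scheduler.
def place_models(models: dict[str, int], gpu_capacity_gb: int) -> list[dict[str, int]]:
    gpus: list[dict[str, int]] = []
    # Place the largest models first to reduce fragmentation.
    for name, size in sorted(models.items(), key=lambda kv: -kv[1]):
        for gpu in gpus:
            if sum(gpu.values()) + size <= gpu_capacity_gb:
                gpu[name] = size
                break
        else:
            gpus.append({name: size})  # no GPU fits: open a new one
    return gpus

demo = {"llama-large": 60, "gpt-oss-mid": 30, "small-a": 8, "small-b": 8}
print(place_models(demo, gpu_capacity_gb=80))  # packs into 2 GPUs
```

Even this naive heuristic shows why placement matters: the small models ride along in the leftover memory of a GPU already serving a large one, instead of each claiming hardware of their own.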
As the demand for AI products grows, developers are creating and tuning a wider variety of models. While adding new models to our growing catalog on Workers AI, we noticed that not all of them are used equally – leaving infrequently used models occupying valuable GPU space. Efficiency is a core value at Cloudflare, and with GPUs being the scarce commodity they are, we realized that we needed to build something to fully maximize our GPU usage.
Omni is an internal platform we’ve built for running and managing AI models on Cloudflare’s edge nodes. It does so by spawning and managing multiple models on a single machine and GPU using lightweight isolation. Omni makes it easy and efficient to run many small and/or low-volume models, combining multiple capabilities by:
Spawning multiple models from a single control plane,
Implementing lightweight process isolation, allowing models to spin up and down quickly,
Isolating the file system between models to easily manage per-model dependencies, and
Over-committing GPU memory to run more models on a single GPU.
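The last point, over-committing GPU memory, can be sketched as a simple admission check: admit models as long as their combined footprint stays under a multiple of physical memory, betting that low-volume models rarely peak together. The ratio and sizes are illustrative, not Omni's real policy:

```python
# Sketch of GPU memory over-commitment: admit a new model if the total
# declared footprint stays within a multiple of physical memory. The
# overcommit ratio and sizes are illustrative, not Omni's real policy.
def can_admit(resident_gb: list[float], new_model_gb: float,
              physical_gb: float, overcommit: float = 1.5) -> bool:
    return sum(resident_gb) + new_model_gb <= physical_gb * overcommit

print(can_admit([40, 30], 40, physical_gb=80))  # True:  110 GB <= 120 GB budget
print(can_admit([40, 30], 60, physical_gb=80))  # False: 130 GB >  120 GB budget
```

The bet only pays off for low-volume models, since a real system still has to handle the case where residents spike simultaneously.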
Cloudflare aims to place GPUs as close as we possibly can to people and applications that are using them. With Omni in place, we’re now able to run Continue reading
When we first launched Workers AI, we made a bet that AI models would get faster and smaller. We built our infrastructure around this hypothesis, adding specialized GPUs to our datacenters around the world that can serve inference to users as fast as possible. We created our platform to be as general as possible, but we also identified niche use cases that fit our infrastructure well, such as low-latency image generation or real-time audio voice agents. To lean in on those use cases, we’re bringing on some new models that will help make it easier to develop for these applications.
Today, we’re excited to announce that we are expanding our model catalog to include closed-source partner models that fit this use case. We’ve partnered with Leonardo.Ai and Deepgram to bring their latest and greatest models to Workers AI, hosted on Cloudflare’s infrastructure. Leonardo and Deepgram both have models with a great speed-to-performance ratio that suit the infrastructure of Workers AI. We’re starting off with these great partners — but expect to expand our catalog to other partner models as well.
The benefit of using these models on Workers AI is that we don’t only have a standalone inference Continue reading
Large Language Models (LLMs) are rapidly evolving from impressive information retrieval tools into active, intelligent agents. The key to unlocking this transformation is the Model Context Protocol (MCP), an open-source standard that allows LLMs to securely connect to and interact with any application — from Slack to Canva, to your own internal databases.
This is a massive leap forward. With MCP, an LLM client like Gemini, Claude, or ChatGPT can answer more than just "tell me about Slack." You can ask it: "What were the most critical engineering P0s in Jira from last week, and what is the current sentiment in the #engineering-support Slack channel regarding them? Then propose updates and bug fixes to merge."
This is the power of MCP: turning models into teammates.
But this great power comes with proportional risk. Connecting LLMs to your most critical applications creates a new, complex, and largely unprotected attack surface. Today, we change that. We’re excited to announce Cloudflare MCP Server Portals are now available in Open Beta. MCP Server Portals are a new capability that enable you to centralize, secure, and observe every MCP connection in your organization. Continue reading
As Generative AI revolutionizes businesses everywhere, security and IT leaders find themselves in a tough spot. Executives are mandating speedy adoption of Generative AI tools to drive efficiency and stay abreast of competitors. Meanwhile, IT and Security teams must rapidly develop an AI Security Strategy, even before the organization really understands exactly how it plans to adopt and deploy Generative AI.
IT and Security teams are no strangers to “building the airplane while it is in flight”. But this moment comes with new and complex security challenges. There is an explosion in new AI capabilities adopted by employees across all business functions — both sanctioned and unsanctioned. AI Agents are ingesting authentication credentials and autonomously interacting with sensitive corporate resources. Sensitive data is being shared with AI tools, even as security and compliance frameworks struggle to keep up.
While it demands strategic thinking from Security and IT leaders, the problem of governing the use of AI internally is far from insurmountable. SASE (Secure Access Service Edge) is a popular cloud-based network architecture that combines networking and security functions into a single, integrated service that provides employees with secure and efficient access to the Internet and to corporate resources, regardless Continue reading
Security teams are racing to secure a new attack surface: AI-powered applications. From chatbots to search assistants, LLMs are already shaping customer experience, but they also open the door to new risks. A single malicious prompt can exfiltrate sensitive data, poison a model, or inject toxic content into customer-facing interactions, undermining user trust. Without guardrails, even the best-trained model can be turned against the business.
Today, as part of AI Week, we’re expanding our AI security offerings by introducing unsafe content moderation, now integrated directly into Cloudflare Firewall for AI. Built with Llama, this new feature allows customers to leverage their existing Firewall for AI engine for unified detection, analytics, and topic enforcement, providing real-time protection for Large Language Models (LLMs) at the network level. Now with just a few clicks, security and application teams can detect and block harmful prompts or topics at the edge — eliminating the need to modify application code or infrastructure. This feature is immediately available to current Firewall for AI users. Those not yet onboarded can contact their account team to participate in the beta program.
Cloudflare's Firewall for AI protects user-facing LLM applications from abuse and Continue reading
Starting today, all users of Cloudflare One, our secure access service edge (SASE) platform, can use our API-based Cloud Access Security Broker (CASB) to assess the security posture of their generative AI (GenAI) tools: specifically, OpenAI’s ChatGPT, Claude by Anthropic, and Google’s Gemini. Organizations can connect their GenAI accounts and within minutes, start detecting misconfigurations, Data Loss Prevention (DLP) matches, data exposure and sharing, compliance risks, and more — all without having to install cumbersome software onto user devices.
As Generative AI adoption has exploded in the enterprise, IT and Security teams need to hustle to keep themselves abreast of newly emerging security and compliance challenges that come alongside these powerful tools. In this rapidly changing landscape, IT and Security teams need tools that help enable AI adoption while still protecting the security and privacy of their enterprise networks and data.
Cloudflare’s API CASB and inline CASB work together to help organizations safely adopt AI tools. The API CASB integrations provide out-of-band visibility into data at rest and security posture inside popular AI tools like ChatGPT, Claude, and Gemini. At the same time, Cloudflare Gateway provides in-line prompt controls and Shadow AI identification. It applies policies and Continue reading
The availability of SaaS and Gen AI applications is transforming how businesses operate, boosting collaboration and productivity across teams. However, with increased productivity comes increased risk, as employees turn to unapproved SaaS and Gen AI applications, often dumping sensitive data into them for quick productivity wins.
The prevalence of “Shadow IT” and “Shadow AI” creates multiple problems for security, IT, GRC and legal teams. For example:
Gen AI applications may train their models on user inputs, which could expose proprietary corporate information to third parties, competitors, or even through clever attacks like prompt injection.
Applications may retain user data for long periods, share data with third parties, have lax security practices, suffer a data breach, or even go bankrupt, leaving sensitive data exposed to the highest bidder.
Gen AI applications may produce outputs that are biased, unsafe or incorrect, leading to compliance violations or bad business decisions.
In spite of these problems, blanket bans of Gen AI don't work. They stifle innovation and push employee usage underground. Instead, organizations need smarter controls.
Security, IT, legal and GRC teams therefore face a difficult challenge: how can you appropriately assess each third-party application, without Continue reading
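As one small example of the kind of control involved, a DLP-style detector can flag likely credit card numbers by combining a pattern match with a Luhn checksum to cut false positives. This is a minimal sketch; real DLP engines cover far more data types and formats:

```python
import re

# Minimal sketch of one DLP-style detector: find 16-digit sequences and
# validate them with the Luhn checksum to cut false positives. Real DLP
# engines cover many more data types, separators, and formats.
def luhn_ok(digits: str) -> bool:
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:       # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_card_numbers(text: str) -> list[str]:
    return [m for m in re.findall(r"\b\d{16}\b", text) if luhn_ok(m)]

print(find_card_numbers("test card 4242424242424242, junk 1234567812345678"))
# only the Luhn-valid number is flagged
```

Running checks like this in line with traffic, rather than in each application, is what lets a single policy cover both sanctioned tools and shadow AI.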
The digital landscape of corporate environments has always been a battleground between efficiency and security. For years, this played out in the form of "Shadow IT" — employees using unsanctioned laptops or cloud services to get their jobs done faster. Security teams became masters at hunting these rogue systems, setting up firewalls and policies to bring order to the chaos.
But the new frontier is different, and arguably far more subtle and dangerous.
Imagine a team of engineers, deep into the development of a groundbreaking new product. They're on a tight deadline, and a junior engineer, trying to optimize their workflow, pastes a snippet of a proprietary algorithm into a popular public AI chatbot, asking it to refactor the code for better performance. The tool quickly returns the revised code, and the engineer, pleased with the result, checks it in. What they don't realize is that their query, and the snippet of code, are now part of the AI service’s training data, or perhaps logged and stored by the provider. Without anyone noticing, a critical piece of the company's intellectual property has just been sent outside the organization's control, a silent and unmonitored data leak.
This isn't a Continue reading
The revolution is already inside your organization, and it's happening at the speed of a keystroke. Every day, employees turn to generative artificial intelligence (GenAI) for help with everything from drafting emails to debugging code. And while using GenAI boosts productivity—a win for the organization—it also creates a significant data security risk: employees may share sensitive information with a third party.
Despite this risk, the data is clear: employees already treat these AI tools like a trusted colleague. In fact, one study found that nearly half of all employees surveyed admitted to entering confidential company information into publicly available GenAI tools. Unfortunately, the risk of human error doesn’t stop there. Earlier this year, a new feature in a leading LLM meant to make conversations shareable had a serious unintended consequence: thousands of private chats — including work-related ones — were indexed by Google and other search engines. Neither incident was malicious. Instead, both were miscalculations about how these tools would be used, and it certainly did not help that organizations lacked the right tools to protect their data.
While the instinct for many may be to deploy Continue reading
If you’re here on the Cloudflare blog, chances are you already understand AI pretty well. But step outside our circle, and you’ll find a surprising number of people who still don’t know what it really is — or why it matters.
We wanted to come up with a way to make AI intuitive, something you can actually see and touch to get what’s going on. Hands on, not just hand-wavy.
The idea we landed on is simple: nothing comes into the world fully formed. Like us, and like the Internet, AI didn’t show up fully formed. So we asked ourselves: what if we told the story of AI as it learns and grows?
Episode by episode, we’d give it new capabilities, explain how those capabilities work, and explore how they change the way AI interacts with the world. Giving it a voice. Letting it see. Helping it learn. And maybe even letting it imagine the future.
So we made AI Avenue, a show where I (Craig) explore the fun, human, and sometimes surprising sides of AI… with a little help from my co-host Yorick, a robot hand with a knack for comic timing and the occasional eye-roll. Together, we travel, Continue reading
We are witnessing in real time as AI fundamentally changes how people work across every industry. Customer support agents can respond to ten times the tickets. Software engineers are reviewers of AI-generated code instead of spending hours pounding out boilerplate code. Salespeople can get back to focusing on building relationships instead of tedious follow-up and administration.
This technology feels magical, and Cloudflare is committed to helping companies build world class AI-driven experiences for their employees and customers.
There is a catch, however. Any time a brand-new technology with such widespread appeal emerges, the technology often outpaces the tools in place to govern, secure, and control it. We're already starting to see stories of vibe-coded apps leaking all their users' details. LLM chats that were intended to be shared only between colleagues are actually out on the web, being indexed by search engines for all the world to see. AI Agents are being given the keys to the application kingdom, enabling them to work autonomously across an organization — but without proper tracking and control. And then there’s the risk of a well-meaning employee uploading confidential company or customer data into an LLM, which Continue reading
For over two decades, we've built real-time communication on the Internet using a patchwork of specialized tools. RTMP gave us ingest. HLS and DASH gave us scale. WebRTC gave us interactivity. Each solved a specific problem for its time, and together they power the global streaming ecosystem we rely on today.
But using them together in 2025 feels like building a modern application with tools from different eras. The seams are starting to show—in complexity, in latency, and in the flexibility needed for the next generation of applications, from sub-second live auctions to massive interactive events. We're often forced to make painful trade-offs between latency, scale, and operational complexity.
Today Cloudflare is launching the first Media over QUIC (MoQ) relay network, running on every Cloudflare server in datacenters in 330+ cities. MoQ is an open protocol being developed at the IETF by engineers from across the industry—not a proprietary Cloudflare technology. MoQ combines the low-latency interactivity of WebRTC, the scalability of HLS/DASH, and the simplicity of a single architecture, all built on a modern transport layer. We're joining Meta, Google, Cisco, and others in building implementations that work seamlessly together, creating a shared foundation for the next generation of real-time Continue reading
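The relay idea at the heart of MoQ — subscribers receive copies of the objects a publisher pushes on a named track — can be sketched with a toy in-memory model. Real MoQ runs over QUIC with subscriptions, priorities, and relay caching; none of that is modeled here:

```python
# Toy in-memory sketch of the relay/fan-out idea behind MoQ: a publisher
# pushes objects on a named track, and the relay copies each object to every
# subscriber. Real MoQ runs over QUIC with priorities and caching.
from collections import defaultdict

class Relay:
    def __init__(self) -> None:
        self.subscribers: dict[str, list[list]] = defaultdict(list)

    def subscribe(self, track: str) -> list:
        inbox: list = []                     # one inbox per subscriber
        self.subscribers[track].append(inbox)
        return inbox

    def publish(self, track: str, obj: bytes) -> None:
        for inbox in self.subscribers[track]:
            inbox.append(obj)                # fan one upstream object out to all

relay = Relay()
a = relay.subscribe("live/camera1")
b = relay.subscribe("live/camera1")
relay.publish("live/camera1", b"frame-0")
print(a, b)  # both inboxes received the same object
```

The publisher sends each object once; the relay does the fan-out. Chaining relays is what makes this pattern scale from one viewer to millions without burdening the origin.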
On August 21, 2025, an influx of traffic directed toward clients hosted in the Amazon Web Services (AWS) us-east-1 facility caused severe congestion on links between Cloudflare and AWS us-east-1. Many users connecting to or receiving connections from Cloudflare via servers in AWS us-east-1 experienced high latency, packet loss, and failed connections to origins.
Customers with origins in AWS us-east-1 began experiencing impact at 16:27 UTC. The impact was substantially reduced by 19:38 UTC, with intermittent latency increases continuing until 20:18 UTC.
This was a regional problem between Cloudflare and AWS us-east-1, and global Cloudflare services were not affected. The degradation in performance was limited to traffic between Cloudflare and AWS us-east-1. The incident was a result of a surge of traffic from a single customer that overloaded Cloudflare's links with AWS us-east-1. It was a network congestion event, not an attack or a BGP hijack.
We’re very sorry for this incident. In this post, we explain what the failure was, why it occurred, and what we’re doing to make sure this doesn’t happen again.
Cloudflare helps anyone to build, connect, protect, and accelerate their websites on the Internet. Most customers host their Continue reading