Archive

Category Archives for "Networking"

Piecing together the Agent puzzle: MCP, authentication & authorization, and Durable Objects free tier

It’s not a secret that at Cloudflare we are bullish on the future of agents. We’re excited about a future where AI can not only co-pilot alongside us, but where we can actually start to delegate entire tasks to AI. 

While it hasn’t been too long since we first announced our Agents SDK to make it easier for developers to build agents, building towards an agentic future requires continuous delivery towards this goal. Today, we’re making several announcements to help accelerate agentic development, including:

  • New Agents SDK capabilities: Build remote MCP clients, with transport and authentication built-in, to allow AI agents to connect to external services. 

  • BYO Auth provider for MCP: Integrations with Stytch, Auth0, and WorkOS to add authentication and authorization to your remote MCP server. 

  • Hibernation for McpAgent: Automatically sleep stateful, remote MCP servers when inactive and wake them when needed. This allows you to maintain connections for long-running sessions while ensuring you’re not paying for idle time. 

  • Durable Objects free tier: We view Durable Objects as a key component for building agents, and if you’re using our Agents SDK, you need access to it. Until today, Durable Objects Continue reading

Welcome to Developer Week 2025

We’re kicking off Cloudflare’s 2025 Developer Week — our innovation week dedicated to announcements for developers.

It’s an exciting time to be a developer. In fact, as a developer, the past two years might have felt a bit like every week is Developer Week. Starting with the release of ChatGPT, it has felt like each day has brought a new, disruptive announcement, whether it’s new models, hardware, agents, or other tools. From late 2024 and in just the first few months of 2025, we’ve seen the DeepSeek model challenge assumptions about what it takes to train a new state-of-the-art model, MCP introduce a new standard for how LLMs interface with the world, and OpenAI’s o4 model Ghiblify the world.

And while it’s exciting to witness a technological revolution unfold in front of your eyes, it’s even more exciting to partake in it. 

A new era of innovation

One of the marvels of the recent AI revolution is the extent to which the cost of experimentation has gone down. Ideas that would have taken whole weekends, weeks, or months to build can now be turned into working code in a day. You can vibe-code your way through things you might Continue reading

AI for Network Engneers: Challenges in AI Fabric Design

 Introduction


Figure 10-1 illustrates a simple distributed GPU cluster consisting of three GPU hosts. Each host has two GPUs and a Network Interface Card (NIC) with two interfaces. Intra-host GPU communication uses high-speed NVLink interfaces, while inter-host communication takes place via NICs over slower PCIe buses.

GPU-0 on each host is connected to Rail Switch A through interface E1. GPU-1 uses interface E2 and connects to Rail Switch B. In this setup, inter-host communication between GPUs connected to the same rail passes through a single switch. However, communication between GPUs on different rails goes over three hops  Rail–Spine–Rail switches.

In Figure 10-1, we use a data parallelization strategy where a training dataset is split into six micro-batches, which are distributed across the GPUs. All GPUs use the shared feedforward neural network model and compute local model outputs. Next, each GPU calculates the model error and begins the backward pass to compute neuron-based gradients. These gradients indicate how much, and in which direction, the weight parameters should be adjusted to improve the training result (see Chapter 2 for details).

Figure 10-1: Rail-Optimized Topology.


Egress Interface Congestions


After computing all gradients, each GPU stores the results in a local memory buffer and Continue reading

Meta’s Llama 4 is now available on Workers AI

As one of Meta’s launch partners, we are excited to make Meta’s latest and most powerful model, Llama 4, available on the Cloudflare Workers AI platform starting today. Check out the Workers AI Developer Docs to begin using Llama 4 now.

What’s new in Llama 4?

Llama 4 is an industry-leading release that pushes forward the frontiers of open-source generative Artificial Intelligence (AI) models. Llama 4 relies on a novel design that combines a Mixture of Experts architecture with an early-fusion backbone that allows it to be natively multimodal.

The Llama 4 “herd” is made up of two models: Llama 4 Scout (109B total parameters, 17B active parameters) with 16 experts, and Llama 4 Maverick (400B total parameters, 17B active parameters) with 128 experts. The Llama Scout model is available on Workers AI today.

Llama 4 Scout has a context window of up to 10 million (10,000,000) tokens, which makes it one of the first open-source models to support a window of that size. A larger context window makes it possible to hold longer conversations, deliver more personalized responses, and support better Retrieval Augmented Generation (RAG). For example, users can take advantage of that increase to summarize multiple documents or Continue reading

io_uring, kTLS and Rust for zero syscall HTTPS server

Around the turn of the century we started to get a bigger need for high capacity web servers. For example there was the C10k problem paper.

At the time, the kinds of things done to reduce work done per request was pre-forking the web server. This means a request could be handled without an expensive process creation.

Because yes, creating a new process for every request was something perfectly normal.

Things did get better. People learned how to create threads, making things more light weight. Then they switched to using poll()/select(), in order to not just spare the process/thread creation, but the whole context switch.

I remember a comment on Kuro5hin from anakata, the creator of both The Pirate Bay and the webserver that powered it, along the lines of “I am select() of borg, resistance is futile”, mocking someone for not understanding how to write a scalable webserver.

But select()/poll() also doesn’t scale. If you have ten thousand connections, that’s an array of ten thousand integers that need to be sent to the kernel for every single iteration of your request handling loop.

Enter epoll (kqueue on other operating systems, but I’m focusing Continue reading

From Python to Go 018. Interaction With Network Devices Using GNMI.

Hello my friend,

Within past blog posts we covered how to interact with network devices (in fact, many servers support that as well) using SSH/CLI and NETCONF/YANG. Those two protocols allow you to confidently cover almost all cases for managing devices, whether you prefer more human-like approach, that is templating CLI commands and pushing them via SSH or using structured XML data and send it via NETCONF. However, there are more protocols and today we are going to talk about the most modern to the date, which is called GNMI (generalized network management interface).

How Does Automation Help Real Business?

Talking to students at our trainings and with customers and peers at various events, there is often a concern arise that small and medium businesses don’t have time and/or need to invest in automation. There is no time as engineers are busy solving “real” problems, like outages, customer experience degradation, etc. There is no need because why to bother, we have engineers to tackle issue. From my experience automation allows to save so much time on doing operational tasks and to free it up for improving user experience and developing new problems, that is difficult to overestimate. It really is Continue reading

Querying Data in Infrahub via the Python SDK

Querying Data in Infrahub via the Python SDK

Infrahub provides multiple ways to interact with your infrastructure data, including the Web GUI, GraphQL queries, and the Python SDK. These can be used to query, modify, create, or delete data in Infrahub. In this post, we’ll focus on using the Python SDK to query data from Infrahub.

This post assumes you are familiar with basic Python and Infrahub. If you’re new to these topics, don’t worry, you can still follow along.

Throughout this post, we’ll be using the Always-On Infrahub demo instance, which is available for anyone to access via this link. The demo instance already has some data in it, so if you’d like to follow along or try this yourself, you can use it without needing to set up anything.

Originally published under - https://www.opsmill.com/querying-data-in-infrahub-via-the-python-sdk/

Introduction

The Python SDK supports both synchronous and asynchronous Python. However, in this post, we’ll focus on using synchronous Python, which I hope most of us are comfortable with. We’ll cover async in a future blog post.

Interacting with Infrahub through the Python SDK is done using a client object, which defines the Infrahub instance you’ll be working with. This client acts as the connection point, allowing you to Continue reading

Dave Taht, Who Sped Up Networks More Than You’ll Ever Know, Has Died

Dave Taht, LinkedIn profile photo. I don’t recall when I first met InterOp conference in the 80s when it was the best-ever networking conference or at a science-fiction convention — Balticon? — around the same time. Whether it was when he was talking about TCP/IP networking or playing Bufferbloat Project with Jim Gettys, focusing on reducing Internet latency and improving network performance. His work on advanced queuing algorithms like Common Applications Kept Enhanced (CAKE) significantly enhanced network efficiency, making these technologies part of the default networking stack in many Linux distributions, the popular open source embedded router operating system,

Hedge 265: Out of Band Networks

Out of band management networks were once more common than they are today. Should we go back to building out of band management networks? Should out of band management networks be virtual or physical? How can we sell out of band management networks to the folks paying the bills? Daryll Swer joins Tom Ammon and Russ White to discuss the importance of OOB management.

download

TNO023: Networking’s Third Phase – The Network Operator Experience

Guest Chris Grundemann believes that NetOps is in the third phase of networking–improving the network operator experience. Not just making the network functional or improving end user experience. In this episode, Chris tells his origin story at a wireless service provider and growth into a founder of multiple companies. He also shares his community-focused work... Read more »

Cloudflare’s commitment to CISA Secure-By-Design pledge: delivering new kernels, faster

As cyber threats continue to exploit systemic vulnerabilities in widely used technologies, the United States Cybersecurity and Infrastructure Agency (CISA) produced best practices for the technology industry with their Secure-by-Design pledge. Cloudflare proudly signed this pledge on May 8, 2024, reinforcing our commitment to creating resilient systems where security is not just a feature, but a foundational principle.

We’re excited to share and provide transparency into how our security patching process meets one of CISA’s goals in the pledge: Demonstrating actions taken to increase installation of security patches for our customers.

Balancing security patching and customer experience 

Managing and deploying Linux kernel updates is one of Cloudflare’s most challenging security processes. In 2024, over 1000 CVEs were logged against the Linux kernel and patched. To keep our systems secure, it is vital to perform critical patch deployment across systems while maintaining the user experience. 

A common technical support phrase is “Have you tried turning it off and then on again?”.  One may be  surprised how often this tactic is used — it is also an essential part of how Cloudflare operates at scale when it comes to applying our most critical patches. Frequently restarting systems exercises the Continue reading

Transitioning into Networking, 2025 Edition

Elmer sent me the following question:

I’ve been working in systems engineering (Linux, virtualization, infrastructure ops) and am considering shifting toward network engineering or architecture. I got my CCNA years ago and started CCNP but didn’t continue.

I’d really appreciate any thoughts you might have on how someone with my background could best make that transition today, especially with how things are evolving around automation and the cloud.

I keep answering a variant of this question every other year or so (2019, 2021, 2023, 2024). I guess it’s time for another answer, so here we go.

KubeCon Europe: Kgateway Aims To Be the Kubernetes Onramp

Kubernetes network administrators at KubeCon + CloudNativeCon EU this week in London should drop by the ease the management of moving traffic to and from clusters. Built on top of Kubernetes Gateway API, the open source Solo.io, and went under the name Gloo Gateway. At last year’s KubeCon +_ CloudNativeCon North America 2024, the company announced that it would be donating the software to the Cloud Native Computing Foundation (CNCF), changing the software’s name to kgateway in the process. In March, CNCFGloo open source repository will be deprecated over time. The Importance of the Kubernetes Gateway API In 2023, the

N4N020: To Cert Or Not To Cert?

To cert or not to cert? That is the question Holly & Ethan discuss on today’s episode. Will a certification really land you a networking job? Are certs the guaranteed path to tech career success? We consider this, talking through the benefits, challenges and even risks of networking industry certification. And there’s some bonus material,... Read more »

Improve your media pipelines with the Images binding for Cloudflare Workers

When building a full-stack application, many developers spend a surprising amount of time trying to make sure that the various services they use can communicate and interact with each other. Media-rich applications require image and video pipelines that can integrate seamlessly with the rest of your technology stack.

With this in mind, we’re excited to introduce the Images binding, a way to connect the Images API directly to your Worker and enable new, programmatic workflows. The binding removes unnecessary friction from application development by allowing you to transform, overlay, and encode images within the Cloudflare Developer Platform ecosystem.

In this post, we’ll explain how the Images binding works, as well as the decisions behind local development support. We’ll also walk through an example app that watermarks and encodes a user-uploaded image, then uploads the output directly to an R2 bucket.

The challenges of fetch()

Cloudflare Images was designed to help developers build scalable, cost-effective, and reliable image pipelines. You can deliver multiple copies of an image — each resized, manipulated, and encoded based on your needs. Only the original image needs to be stored; different versions are generated dynamically, or as requested by a user’s browser, then subsequently served Continue reading

ARP Challenges in EVPN/VXLAN Symmetric IRB

Whenever I claimed that EVPN is The SIP of Networking, vendor engineers quickly told me that “EVPN interoperability is a solved problem” and that they run regular multi-vendor interoperability labs to iron out the quirks. As it turns out, things aren’t as rosy in real life; it’s still helpful to have an EVPN equivalent of the DTMF tone generators handy.

I encountered a particularly nasty quirk when running the netlab EVPN integration test using symmetric IRB with an anycast gateway between Nokia SR Linux (or Juniper vSwitch) and FRR container.

Lab topology

Lab topology

Calico Whisker, Your New Ally in Network Observability

With the upcoming release of Calico v3.30 on the horizon, we are excited to introduce Calico Whisker, a simple yet powerful User Interface (UI) designed to enhance network observability and policy debugging. If you’ve ever struggled to make sense of network flow logs or troubleshoot policies in a complex Kubernetes cluster, Whisker is your friend!

Whisker is a three part deployment that holds a UI, backend and a gRPC channel to communicate with the Felix brain of Calico to gather live flow information and present it in a human readable, easy to understand way. But before we get started let’s dive into why Whisker is a must-have for your Kubernetes environment, what problems it solves, and how it can streamline your policy management.

Navigating Network Flows is Difficult

In Kubernetes environments, network flows are the backbone of communication between workloads. As clusters scale, so does the complexity of managing these flows and their security. Without clear visibility and effective observability tools, teams often struggle with:

  • Diagnosing unexplained workload behavior and determining why certain applications aren’t working as expected.
  • Identifying the real reason why certain workload communications are permitted or denied, which stems from understanding which policies are affecting specific Continue reading