N4N042: Meet MACsec

MACsec is a protocol for encrypting Ethernet frames on a local (though not always local) network. Ethan Banks and Holly Metlitzky have an ELI5 (explain like I’m 5) discussion of what exactly MACsec is and how it differs from IPsec. They talk about when and whether you need to implement MACsec with all the... Read more »

Async QUIC and HTTP/3 made easy: tokio-quiche is now open-source

A little over 6 years ago, we presented quiche, our open source QUIC implementation written in Rust. Today we’re announcing the open sourcing of tokio-quiche, our battle-tested, asynchronous QUIC library that combines quiche with the Rust Tokio async runtime. Powering Cloudflare’s Proxy B in Apple iCloud Private Relay and our next-generation Oxy-based proxies, tokio-quiche handles millions of HTTP/3 requests per second with low latency and high throughput. tokio-quiche also powers Cloudflare WARP’s MASQUE client, replacing our WireGuard tunnels with QUIC-based tunnels, and the async version of h3i.

quiche was developed as a sans-io library, meaning that it implements the state machine required to handle the QUIC transport protocol without making any assumptions about how its user intends to perform IO. This means that, with enough elbow grease, anyone can write an IO integration with quiche! Doing so entails connecting to or listening on a UDP socket, sending and receiving UDP datagrams on that socket, and feeding all received network data to quiche. Given we need this integration to be async, we’d have to do all this while integrating with an async Rust runtime. tokio-quiche does all of that for you, no grease required.
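
To make the split concrete, here is a rough Python sketch of the driving loop a sans-io integration has to provide. This is not the quiche or tokio-quiche API; the methods on conn are hypothetical stand-ins for a sans-io protocol state machine:

import socket

def drive(conn, peer_addr):
    # The state machine (conn) never touches the network; this loop owns the
    # UDP socket and shuttles datagrams between the socket and the state machine.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.connect(peer_addr)
    sock.settimeout(0.05)  # crude stand-in for the timers a real integration manages

    while not conn.is_closed():                      # hypothetical state-machine API
        for datagram in conn.outgoing_datagrams():   # flush anything pending
            sock.send(datagram)
        try:
            conn.receive_datagram(sock.recv(65535))  # feed incoming packets back in
        except socket.timeout:
            conn.handle_timeout()                    # let the state machine drive retransmissions

This loop, plus timer and task management on top of Tokio, is the part tokio-quiche now handles for you.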

Lowering the barrier to Continue reading

Extract audio from your videos with Cloudflare Stream

Cloudflare Stream loves video. But we know not every workflow needs the full picture, and the popularity of podcasts highlights how compelling stand-alone audio can be. For developers, processing a video just to access audio is slow, costly, and complex. 

What makes video so expensive? A video file is a dense stack of high-resolution images, stitched together over time. As such, it is not just “one file”: it’s a container of high-dimensional data, with properties such as frame rate, resolution, and codecs. Analyzing video means traversing time, resolution, and frame rate all at once.

Why audio extraction

By comparison, an audio file is far simpler. If an audio file consists of only one channel, it is defined as a single waveform. The technical characteristics of this waveform are defined by the sample rate (the number of audio samples taken per second), and the bit depth (the precision of each sample).
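
A quick back-of-the-envelope comparison in Python shows why that matters; the resolutions and rates below are illustrative defaults, not Stream-specific numbers:

# Rough, uncompressed data rates for illustrative formats
audio_bytes_per_sec = 44_100 * (16 // 8) * 2     # 44.1 kHz, 16-bit samples, stereo
video_bytes_per_sec = 1920 * 1080 * 3 * 30       # 1080p frames, 24-bit color, 30 fps

print(f"raw audio: {audio_bytes_per_sec / 1e6:.2f} MB/s")   # ~0.18 MB/s
print(f"raw video: {video_bytes_per_sec / 1e6:.2f} MB/s")   # ~186.62 MB/s
print(f"video is ~{video_bytes_per_sec // audio_bytes_per_sec}x the data")  # ~1058x

Even before codecs and containers enter the picture, audio is roughly three orders of magnitude less raw data to move and analyze.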

With the rise of computationally intensive AI inference pipelines, many of our customers want to perform downstream workflows that require only analyzing the audio. For example:

  • Power AI and Machine Learning: In addition to translation and transcription, you can feed the audio into Voice-to-Text models for speech recognition or analysis, or AI-powered summaries (see the sketch after this list).

  • Improve Continue reading
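
As a rough illustration of the Voice-to-Text bullet above, here is one way to transcribe an extracted audio file locally with the open-source openai-whisper package. The file name episode.mp3 is hypothetical, and this is one possible model, not a Cloudflare Stream API:

# pip install openai-whisper  (also requires ffmpeg on the system)
import whisper

model = whisper.load_model("base")         # small general-purpose speech model
result = model.transcribe("episode.mp3")   # hypothetical audio extracted from a video
print(result["text"])                      # the full transcript as one string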

D2DO286: Scaling Kubernetes Across Clouds – Identity, DNS, and Security

If you think managing Kubernetes clusters is hard, what about managing Kubernetes clusters across three different public clouds? We dive into the challenges that arise from running multi-cloud Kubernetes workloads. These challenges include workload identity, DNS query resolution, and security. Here to help us navigate this complexity and offer possible solutions is Goutam Tadi, Staff... Read more »

PP085: News Roundup – Naked Satellite Signals, Account Recovery Buddies, Busting Ghost Networks

Did you know college students are snooping on satellite transmissions? On today’s news roundup we discuss new research in which university investigators use off-the-shelf equipment to intercept traffic from geostationary satellites and discover that a lot of it is unencrypted. We also dig into the credential hygiene lessons we can learn from a corpus of... Read more »

NAN105: Campus Network Automation, Powered by Cisco Agentic Workflows (Sponsored)

Cisco Workflows is a new platform that makes network automation easier, smarter, and safer. On today’s episode, sponsored by Cisco, we get introduced to Cisco Workflows by Stephen Orr, Distinguished Solutions Engineer; and Reid Butler, Director of Product Management. They break down how Workflows helps you ditch repetitive tasks, roll out changes faster, and plug... Read more »

How Workers VPC Services connects to your regional private networks from anywhere in the world

In April, we shared our vision for a global virtual private cloud on Cloudflare, a way to unlock your applications from regionally constrained clouds and on-premises networks, enabling you to build truly cross-cloud applications.

Today, we’re announcing the first milestone of our Workers VPC initiative: VPC Services. VPC Services let Workers running anywhere in the world connect, via Cloudflare Tunnels, to your APIs, containers, virtual machines, serverless functions, databases, and other services in regional private networks.

Once you set up a Tunnel in your desired network, you can register each service that you want to expose to Workers by configuring its host or IP address. Then you can access the VPC Service as you would any other Workers service binding, and requests are automatically routed to the service over Cloudflare’s network, regardless of where your Worker is executing:

export default {
  async fetch(request, env, ctx) {
    // Perform application logic in Workers here	

    // Call an external API running in ECS on AWS when needed, using the binding
    const response = await env.AWS_VPC_ECS_API.fetch("http://internal-host.com");

    // Additional application logic in Workers
    return new Response();
  },
};

Workers VPC is now Continue reading

Worth Reading: AI Won’t Replace Network Engineers

Jason Gintert published an excellent explanation of why AI won’t replace (all) network engineers, and reading it felt like reading one of my “automation won’t replace network engineers” blog posts.

Here’s a quote to get you in the mood:

AI will make good engineers better and will expose mediocre ones. If your value proposition is memorizing CLI commands or being a human grep for log files, then yes, you might need to be worried.

Build Your First HTTP Server in Python

We see HTTP everywhere on the web; it’s considered one of its backbones. Think of it as the “language” that allows browsers, servers, and websites to talk to one another. HTTP is a protocol that defines a structured way to request and exchange information. With an HTTP server, you can provide access to data, tools, and services, allowing a client to request information or trigger actions.

Think of HTTP like ordering at a restaurant. You don’t walk into the kitchen and ask the chef for your meal yourself. You give your order to the server, the server passes it along to the right people, and a short time later you have a finished meal. If you need something else, like salt, you again speak to the server rather than finding it yourself.

HTTP works similarly. Your browser sends a request to the web server, and the web server figures out where the right information is and delivers it back to your browser. In this analogy, the human server represents the HTTP server: it takes in the browser’s request for information, identifies where the information is, and returns it to the browser. An HTTP server is a service Continue reading
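
The excerpt cuts off before the article's code, but a minimal sketch using only Python's standard library looks something like this (the handler, port, and response text are illustrative, not the article's own example):

from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Take the "order" (the request) and return a finished "meal" (the response)
        body = b"Hello from your first HTTP server!"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    server = HTTPServer(("localhost", 8000), HelloHandler)
    print("Serving on http://localhost:8000")
    server.serve_forever()

Run the script and point a browser at http://localhost:8000: the browser places the order, the handler prepares the response, and the "meal" comes back to your table.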

Zero-Trust with Zero-Friction eBPF in Calico v3.31

Calico eBPF by default*

*some conditions may apply

Calico has used eBPF as one of its data planes since version 3.13, released more than five years ago. At the time, this was an exciting step forward, introducing a new, innovative data plane that quickly gained traction within the Calico community. Since then, there have been many changes and continued evolution, all thanks to the many adopters of the then-new data plane.

However, there has been one persistent challenge in the installation process since day one: bootstrapping the eBPF data plane required a manual setup step. This extra friction point often frustrated operators and slowed adoption.

With the launch of Calico v3.31, that hurdle to using the eBPF data plane has finally been removed. For many environments (see Limitations section below), you can now install Calico with eBPF enabled right out of the box with no manual setup required.

Simply use the provided installation manifest (custom-resources-bpf.yaml), which comes preconfigured with the data plane option set to eBPF.

To get started, follow the instructions in Install Calico networking and network policy for on-premises deployments to enjoy a much smoother installation experience.

See Calico eBPF in action with Continue reading

Building a better testing experience for Workflows, our durable execution engine for multi-step applications

Cloudflare Workflows is our take on "Durable Execution": a serverless engine, powered by the Cloudflare Developer Platform, for building long-running, multi-step applications that persist through failures. When Workflows became generally available earlier this year, it gave developers a way to orchestrate complex processes that would be difficult or impossible to manage with traditional stateless functions. Workflows handles state, retries, and long waits, allowing you to focus on your business logic.

However, complex orchestrations require robust testing to be reliable. Until now, testing Workflows was a black-box process. Although you could check whether a Workflow instance reached completion by awaiting its status, there was no visibility into the intermediate steps. This made debugging really difficult. Did the payment processing step succeed? Did the confirmation email step receive the correct data? You couldn't be sure without inspecting external systems or logs.

Why was this necessary?

As developers ourselves, we understand the need to ensure reliable code, and we heard your feedback loud and clear: the developer experience for testing Workflows needed to be better.

The black box nature of testing was one part of the problem. Beyond that, though, the limited testing offered came at a high Continue reading

The Curious Case of Default OSPF Interface Timers

We run two types of integration tests before shipping a netlab release: device integration tests that check whether we correctly implemented netlab features on all supported devices, and platform integration tests that check whether rarely-used core functionality works as expected.

I want some validation included in the platform integration tests to ensure that the lab devices have started and that the links and the management network work as expected. The simplest way to get that done is to start OSPF with short hello intervals (to get adjacency up in no time), for example:

Ultra Ethernet Transport

The Ultra Ethernet Consortium has a mission to deliver an Ethernet-based, open, interoperable, high-performance, full-communications stack architecture to meet the growing network demands of AI & HPC at scale. The recently released UE-Specification-1.0.1 includes an Ultra Ethernet Transport (UET) protocol with similar functionality to RDMA over Converged Ethernet (RoCEv2).

The sFlow instrumentation embedded as a standard feature of data center switch hardware from all leading vendors (Arista, Cisco, Dell, Juniper, NVIDIA, etc.) provides a cost-effective solution for gaining visibility into UET traffic in large production AI/ML fabrics.

The easiest way to get started is to use the pre-built sflow/prometheus Docker image to analyze the sFlow telemetry:

docker run -p 8008:8008 -p 6343:6343/udp sflow/prometheus

The chart at the top of this page shows an up-to-the-second view of UET operations using the included Flow Browser application; see Defining Flows for a list of available UET attributes. Getting Started describes how to set up the sFlow monitoring system.

Flow metrics with Prometheus and Grafana describes how to collect custom network traffic flow metrics using the Prometheus time series database and include the metrics in Grafana dashboards. Use the Flow Browser to explore Continue reading

Fresh insights from old data: corroborating reports of Turkmenistan IP unblocking and firewall testing

Here at Cloudflare, we frequently use and write about data in the present. But sometimes understanding the present begins with digging into the past.  

We recently learned of a 2024 turkmen.news article (available in Russian) that reports Turkmenistan experienced “an unprecedented easing in blocking,” causing over 3 billion previously-blocked IP addresses to become reachable. The same article reports that one of the reasons for unblocking IP addresses was that Turkmenistan may have been testing a new firewall. (The Turkmen government’s tight control over the country’s Internet access is well-documented.) 

Indeed, Cloudflare Radar shows a surge of requests coming from Turkmenistan around the same time, as we’ll show below. But we had an additional question: does the firewall activity show up on Radar as well? Two years ago, we launched a dashboard on Radar to give a window into TCP connections to Cloudflare that close due to resets and timeouts. These stand out because they are considered ungraceful mechanisms for closing TCP connections, according to the TCP specification.

In this blog post, we go back in time to share what Cloudflare saw in connection resets and timeouts. We must remind our readers that, as passive observers, Continue reading

Ansible Release 12: the Windows Vista Moment

My first encounter with Ansible release 12 wasn’t exactly encouraging. We were using a few Ansible Jinja2 filters (ipaddr and hwaddr) in internal netlab templates, and all of a sudden those templates started crashing due to some weird behavior of attributes starting with an underscore.

We implemented “don’t use Ansible release 12” as a quick workaround, but postponing painful things is never a good solution (see also: visiting a dentist), so I decided to try to make netlab work with Ansible release 12. What a mistake to make.

Build BGP Labs in Minutes, Not Hours with Netlab

What if I told you that all it takes to build a simple BGP lab with two eBGP peers (or even a hundred, for that matter) is a single YAML file? No need to add nodes on a GUI, connect links, or configure interface IPs manually. You just define the lab in a YAML file as shown below, and in about two minutes, you’ll have two routers of your choice fully configured with BGP and an established eBGP session.

provider: clab
defaults.device: eos
defaults.devices.eos.clab.image: ceos:4.34.2

addressing:
  mgmt:
    ipv4: 192.168.200.0/24

nodes:
  - name: r1
    module: [ bgp ]
  - name: r2
    module: [ bgp ]

bgp:
  as_list:
    100:
      members: [ r1 ]
    200:
      members: [ r2 ]

links:
  - r1-r2

Once the lab is up, both routers show an established eBGP session:

r1#show ip bgp summary
BGP summary information for VRF default
Router identifier 10.0.0.1, local AS number 100
Neighbor Status Codes: m - Under maintenance
  Description              Neighbor V AS           MsgRcvd   MsgSent  InQ OutQ  Up/Down State   PfxRcd PfxAcc PfxAdv
  r2                       10.1.0.2 4 200                5         5    0    0 00:00:15 Estab   1      1      1
r2#show ip bgp summary 
BGP summary information for VRF default
Router identifier 10.0.0.2,  Continue reading