Jason Gintert published an excellent explanation of why AI won’t replace (all) network engineers, and reading it felt like reading one of my “automation won’t replace network engineers” blog posts.
Here’s a quote to get you in the mood:
AI will make good engineers better and will expose mediocre ones. If your value proposition is memorizing CLI commands or being a human grep for log files, then yes, you might need to be worried.
Calico has used eBPF as one of its dataplanes since version 3.13, released more than five years ago. At the time, this was an exciting step forward, introducing a new, innovative data plane that quickly gained traction within the Calico community. Since then, there have been many changes and continued evolution, all thanks to the many adopters of the then-new data plane.
However, there has been one persistent challenge in the installation process since day one: bootstrapping the eBPF data plane required a manual setup step. This extra friction point often frustrated operators and slowed adoption.
With the launch of Calico v3.31, that hurdle to using the eBPF data plane has finally been removed. For many environments (see Limitations section below), you can now install Calico with eBPF enabled right out of the box with no manual setup required.
Simply use the provided installation manifest (custom-resources-bpf.yaml), which comes preconfigured with the data plane option set to eBPF.
To get started, follow the instructions in Install Calico networking and network policy for on-premises deployments to enjoy a much smoother installation experience.
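For orientation, here is a minimal sketch of the setting that selects the eBPF data plane in an operator-managed install. It assumes the Tigera operator's Installation resource and its linuxDataplane field. The custom-resources-bpf.yaml manifest shipped with the release is the authoritative version and may carry additional settings that are not shown here.

```yaml
# Illustrative fragment only, not a copy of custom-resources-bpf.yaml.
# Assumes an operator-managed install using the Installation resource.
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    # Selects the eBPF data plane instead of the default iptables one.
    linuxDataplane: BPF
```

Use the manifest from the release you are installing rather than hand-writing this resource, since the defaults around it can differ between versions.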
See Calico eBPF in action with Continue reading
Cloudflare Workflows is our take on "Durable Execution." They provide a serverless engine, powered by the Cloudflare Developer Platform, for building long-running, multi-step applications that persist through failures. When Workflows became generally available earlier this year, they allowed developers to orchestrate complex processes that would be difficult or impossible to manage with traditional stateless functions. Workflows handle state, retries, and long waits, allowing you to focus on your business logic.
However, complex orchestrations require robust testing to be reliable. To date, testing Workflows was a black-box process. Although you could test if a Workflow instance reached completion through an await to its status, there was no visibility into the intermediate steps. This made debugging really difficult. Did the payment processing step succeed? Did the confirmation email step receive the correct data? You couldn't be sure without inspecting external systems or logs.
As developers ourselves, we understand the need to ensure reliable code, and we heard your feedback loud and clear: the developer experience for testing Workflows needed to be better.
The black box nature of testing was one part of the problem. Beyond that, though, the limited testing offered came at a high Continue reading
We run two types of integration tests before shipping a netlab release: device integration tests that check whether we correctly implemented netlab features on all supported devices, and platform integration tests that check whether rarely-used core functionality works as expected.
I want to have some validation included in the platform integration tests to ensure the lab devices are started, and that the links and the management network work as expected. The simplest way to get that done is to start OSPF with short hello intervals (to get adjacency up in no time), for example:
The sFlow instrumentation embedded as a standard feature of data center switch hardware from all leading vendors (Arista, Cisco, Dell, Juniper, NVIDIA, etc.) provides a cost effective solution for gaining visibility into UET traffic in large production AI / ML fabrics.
The easiest way to get started is to use the pre-built sflow/prometheus Docker image to analyze the sFlow telemetry:
docker run -p 8008:8008 -p 6343:6343/udp sflow/prometheus
The chart at the top of this page shows an up-to-the-second view of UET operations using the included Flow Browser application; see Defining Flows for a list of available UET attributes. Getting Started describes how to set up the sFlow monitoring system.
Flow metrics with Prometheus and Grafana describes how to collect custom network traffic flow metrics using the Prometheus time series database and include the metrics in Grafana dashboards. Use the Flow Browser to explore Continue reading
Here at Cloudflare, we frequently use and write about data in the present. But sometimes understanding the present begins with digging into the past.
We recently learned of a 2024 turkmen.news article (available in Russian) that reports Turkmenistan experienced “an unprecedented easing in blocking,” causing over 3 billion previously-blocked IP addresses to become reachable. The same article reports that one of the reasons for unblocking IP addresses was that Turkmenistan may have been testing a new firewall. (The Turkmen government’s tight control over the country’s Internet access is well-documented.)
Indeed, Cloudflare Radar shows a surge of requests coming from Turkmenistan around the same time, as we’ll show below. But we had an additional question: Does the firewall activity show up on Radar, as well? Two years ago, we launched the dashboard on Radar to give a window into the TCP connections to Cloudflare that close due to resets and timeouts. These stand out because they are considered ungraceful mechanisms to close TCP connections, according to the TCP specification.
In this blog post, we go back in time to share what Cloudflare saw in connection resets and timeouts. We must remind our readers that, as passive observers, Continue reading
My first encounter with Ansible release 12 wasn’t exactly encouraging. We were using a few Ansible Jinja2 filters (ipaddr and hwaddr) in internal netlab templates, and all of a sudden those templates started crashing due to some weird behavior of attributes starting with underscore.
We implemented “don’t use Ansible release 12” as a quick workaround, but postponing painful things is never a good solution (see also: visiting a dentist), so I decided to try to make netlab work with Ansible release 12. What a mistake to make.

What if I told you that all it takes to build a simple BGP lab with two eBGP peers (or even a hundred, for that matter) is a single YAML file? No need to add nodes on a GUI, connect links, or configure interface IPs manually. You just define the lab in a YAML file as shown below, and in about two minutes, you’ll have two routers of your choice fully configured with BGP and an established eBGP session.
provider: clab
defaults.device: eos
defaults.devices.eos.clab.image: ceos:4.34.2
addressing:
  mgmt:
    ipv4: 192.168.200.0/24
nodes:
  - name: r1
    module: [ bgp ]
  - name: r2
    module: [ bgp ]
bgp:
  as_list:
    100:
      members: [ r1 ]
    200:
      members: [ r2 ]
links:
  - r1-r2

r1#show ip bgp summary
BGP summary information for VRF default
Router identifier 10.0.0.1, local AS number 100
Neighbor Status Codes: m - Under maintenance
  Description  Neighbor  V  AS   MsgRcvd  MsgSent  InQ  OutQ  Up/Down   State  PfxRcd  PfxAcc  PfxAdv
  r2           10.1.0.2  4  200  5        5        0    0     00:00:15  Estab  1       1       1

r2#show ip bgp summary
BGP summary information for VRF default
Router identifier 10.0.0.2, Continue reading
The Raspberry Pi 3 Model B may be an older board, but it remains a […]
The post Debian 13 Trixie on Raspberry Pi 3B first appeared on Brezular's Blog.
https://codingpackets.com/blog/yt-dlp-quick-reference
https://codingpackets.com/blog/libvirt-tips-and-tricks
It’s time again for Tom, Eyvonne, and Russ to talk about current articles they’ve run across in their day-to-day reading. This time we talk about WiFi in the home, how often users think a global problem is really local, and why providers have a hard time supporting individual homes and businesses. The second topic is one no one really cares about … apathy. What causes apathy? How can we combat it? Join us for this episode of the Hedge … if you can bring yourself to care!
Kubernetes adoption is growing, and managing secure and efficient network communication is becoming increasingly complex. With this growth, organizations need to enforce network policies with greater precision and care. However, implementing these policies without disrupting operations can be challenging.
That’s where Calico Whisker comes in. It helps teams implement network policies that follow the principle of least privilege, ensuring workloads communicate only as intended. Since most organizations introduce network policies after applications are already running, safe and incremental rollout is essential.
To support this, Calico Whisker offers staged network policies, which allow teams to preview a policy’s effect in a live environment before enforcing it. Alongside this, policy traces in Calico Whisker provide deep visibility into how both enforced and pending policies impact traffic. This makes it easier to understand policy behaviour, validate intent, and troubleshoot issues in real time. In this post, we’ll walk through real-world policy trace outputs and show how they help teams confidently deploy and refine network policies in production Kubernetes clusters.
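To make the idea of a staged policy concrete, here is a hedged sketch of what one might look like. It assumes the projectcalico.org/v3 StagedNetworkPolicy resource with the same spec shape as a regular Calico NetworkPolicy. The namespace, labels, policy name, and port are hypothetical and are not taken from the original post.

```yaml
# Illustrative staged policy: previewed against live traffic, not enforced.
# All names, labels, and the port below are made up for this example.
apiVersion: projectcalico.org/v3
kind: StagedNetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: demo
spec:
  # Applies to the (hypothetical) backend workloads.
  selector: app == 'backend'
  types:
    - Ingress
  ingress:
    # Would allow only frontend pods to reach the backend on TCP 8080.
    - action: Allow
      protocol: TCP
      source:
        selector: app == 'frontend'
      destination:
        ports:
          - 8080
```

Because the policy is staged, traffic keeps flowing as before, and the policy trace reports what the policy would have done with each flow had it been enforced.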
It’s important to reiterate the network policy behaviour in Kubernetes, as understanding this foundation is key to effectively interpreting policy traces and ensuring the right traffic flow decisions are Continue reading