You Can’t Patch People

One thing I’ve noticed in IT is how quickly we’re willing to use software to solve people problems. Over my career I’ve seen all manner of crazy solutions to get around people being lazy or uneducated. Remember vMotion? Or OTV for stretched layer 2? Why do you think those solutions came about? I posit that it’s because it’s faster to write software than to patch people.

Hacking Humans

I see this most often in cybersecurity. Developers love to create software solutions that prevent things from happening. Phishing, in all its various forms, is one of the top priorities for solutions that prevent information leaks. While we have invested a lot in phishing tests and education, it’s also very likely that there are controls in place that prevent users from accidentally giving out information to threat actors.

Why are we so willing to write software to fix problems instead of teaching people to avoid those issues? I think in part it’s because software is predictable. If I create an app or write some controls into a platform it’s going to behave the same way every time. That’s the definition of deterministic. Every time the software …

HN824: That’s Not a Job for an LLM: The Right Way to Apply AI to Network Operations (Sponsored)

On today’s sponsored Heavy Networking, we get off the AI hype train to talk about how different artificial intelligence techniques usefully impact network operations, and where they aren’t a fit. The various forms of AI represent a set of tools that, like any tool, have use cases, capabilities, and limitations. Our guest is Avi Freedman, CEO...

State of Network Automation with Urs Baumann

I stopped tracking the (lack of) progress in network automation years ago, when I realized I had nothing new to say. As an eternal optimist, I hoped I was just missing something, but Urs Baumann (the guest of Software Gone Wild Episode 206) destroyed my hopes when he said, “I can still use the same slides I created 10 years ago”. On a more positive note, he recently completed his Master’s thesis on AI in network engineering, so we ended with a nice chat on its potential impact.

Worth Reading: AI and Knowledge Stagnation

Another week, another interesting AI article (is anyone writing about anything else these days?), this time from Noah Smith (another author worth following). I found this gem hidden in his weekly roundup:

Instead of trying to write a piece of code from scratch, or prove a math theorem from scratch, or figure out some piece of knowledge for yourself, you just ask AI to do it all for you. So everyone ends up getting the right answers to questions whose answers are already known, so they don’t end up adding anything new.

CSS & vertical rhythm for text, images, and tables

Vertical rhythm aligns lines to a consistent spacing cadence down the page. It creates a predictable flow for the eye to follow. Thanks to the rlh CSS unit, vertical rhythm is now easier to implement for text. But illustrations and tables can disrupt the layout. The amateur typographer in me wants to follow Bringhurst’s wisdom:

Headings, subheads, block quotations, footnotes, illustrations, captions and other intrusions into the text create syncopations and variations against the base rhythm of regularly leaded lines. These variations can and should add life to the page, but the main text should also return after each variation precisely on beat and in phase.

Robert Bringhurst, The Elements of Typographic Style

Text

Three factors govern vertical rhythm: font size, line height and margin or padding. Let’s set our baseline with an 18-pixel font and a 1.5 line height:

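html {
  font-size: 112.5%;  /* 18px when the browser default is 16px */
  line-height: 1.5;   /* 18px * 1.5 = 27px per line */
}
h1, h2, h3, h4 {
  font-size: 100%;    /* headings start at the body text size */
}
html, body,
h1, h2, h3, h4,
p, blockquote,
dl, dt, dd, ol …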
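
As a rough sketch of where this baseline leads, the rules below space block elements in whole multiples of the root line height, using the rlh unit mentioned earlier. It assumes the browser default of 16 pixels, so 112.5% gives an 18-pixel font and 1rlh works out to 18 × 1.5 = 27 pixels. The selectors and values are illustrative assumptions, not the original article’s stylesheet, and the larger h2 is a hypothetical override of the 100% heading size set above.

```css
/* Illustrative only: these rules are assumptions, not the article's own CSS. */
/* With html { font-size: 112.5%; line-height: 1.5 } and a 16px default,
   1rlh = 18px * 1.5 = 27px. */

/* Reset block margins so every gap is chosen deliberately. */
p, blockquote, ul, ol, dl {
  margin-block: 0;
}

/* One full line of space between consecutive blocks keeps text on the grid. */
p + p, p + blockquote, blockquote + p {
  margin-block-start: 1rlh;
}

/* A larger heading occupies two grid lines, so the text after it lands on beat. */
h2 {
  font-size: 1.5rem;    /* 27px */
  line-height: 2rlh;    /* 54px, a whole multiple of the 27px grid */
  margin-block: 1rlh 0; /* one line above, none below */
}
```

Because every line height and margin here is a whole multiple of 1rlh, a run of these blocks should, in principle, always end on the 27-pixel grid. Images and tables need their own handling, since their intrinsic heights are rarely a multiple of the line height.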

TCG074: From SOAR to Agents: Why Practical Automation Has to Survive Contact with Real Infrastructure

Eyvonne Sharp and William Collins speak with Sif Baksh, Principal Solutions Architect at Tines, to discuss the power of automation. Sif shares some personal stories of how he has been able to use automation to innovate and modernize networking operations. They also discuss the importance of learning AI and using it as a tool, how...

Making Rust Workers reliable: panic and abort recovery in wasm‑bindgen

Rust Workers run on the Cloudflare Workers platform by compiling Rust to WebAssembly, but as we’ve found, WebAssembly has some sharp edges. When things go wrong with a panic or an unexpected abort, the runtime can be left in an undefined state. For users of Rust Workers, panics were historically fatal, poisoning the instance and possibly even bricking the Worker for a period of time.

While we were able to detect and mitigate these issues, there remained a small chance that a Rust Worker would unexpectedly fail and cause other requests to fail along with it. An unhandled Rust abort in a Worker affecting one request might escalate into a broader failure affecting sibling requests or even continue to affect new incoming requests. The root cause of this was in wasm-bindgen, the core project that generates the Rust-to-JavaScript bindings Rust Workers depend on, and its lack of built-in recovery semantics.

In this post, we’ll share how the latest version of Rust Workers handles comprehensive Wasm error recovery that solves this abort-induced sandbox poisoning. This work has been contributed back into wasm-bindgen as part of our collaboration within the wasm-bindgen organization formed last year. First with panic=unwind support, which ensures that …

Ten Years of ITNOG

I spent the last two days in Bologna at ITNOG 10 in the excellent company of Italian networking engineers (many of them personal friends) and a few guests from around the world. As always, the organizers and the program committee didn’t disappoint – it was a smoothly organized, lovely event full of interesting presentations. Thanks a million to everyone involved; I’ll definitely be back!

Now for the highlights, starting with the ultimate catnip for the differently attentive: running two presentations in parallel on the same screen with the soundtrack distributed via headphones. I’ve never seen anything like that, and while it looked weird (I have no idea how the presenters took it), it turned out to be very useful, as you could easily tune out AI-washing presentations and switch to something more interesting. On the other hand, you could be faced with a hard choice of having to select one of two excellent presentations:

KubeVirt Networking: How to Preserve VM IP Addresses During Migration

Organisations are re-evaluating their VM infrastructure. The economics have shifted, the tooling has matured, and the case for running two separate platforms, one for containers, one for VMs, is getting harder to justify. Platform teams that spent years managing hypervisor infrastructure are being asked to consolidate, and most are landing on the same answer: Kubernetes.

KubeVirt makes running VMs on Kubernetes possible. But KubeVirt networking – what happens to a VM’s IP address, VLAN, and security posture when it lands in a cluster – is where most migration plans hit a wall. The reasons go beyond cost:

  • Most enterprises already run Kubernetes. Containers are already there. Adding VMs to the same platform consolidates tooling, lifecycle management, networking models, and security policy into a single operational model.
  • Two platforms means double the overhead. Separate infrastructure means separate upgrade cycles, separate monitoring, separate network configuration, and separate on-call runbooks. Platform consolidation has direct operational value.
  • Kubernetes is mature enough. KubeVirt has reached the point where it’s a viable production choice for enterprise VM workloads.

The decision to migrate is being made. The question is how to do it without causing chaos.

Introducing KubeVirt

KubeVirt extends the Kubernetes API with new custom resource …