Chesterton’s Fence

Imagine yourself walking down a country lane, lush green grass around you, no farm animals anywhere, when suddenly you see a fence right in the middle of the path. You think: now that’s a bit silly, that fence is blocking the path, somebody should have it removed. And by thinking that, you fall right into the predicament known as Chesterton’s Fence: you see something that you instinctively feel does not belong, and you want to remove it. Perhaps that is exactly what needs to be done, but not before you ask a very important question: why? Why is the fence here? What function does it serve? Who put it there? What were they trying to achieve?

In any complex system, and most of the systems we work with these days are complex, problems often arise from the relationships and interactions between components. Our systems contain many components, some with special optimizations, some acting as local stabilizers, that might appear inefficient and unintuitive. Other components, or parts of the system, seem to serve no apparent purpose at all.

Any given component is usually self-contained and can be understood, reasoned about, modified and improved by one Continue reading

DNS Centrality

As a collection of intertwined markets, aspects of the Internet have been prone to excessive market distortions, where one provider, or a small clique of providers, in a market sector becomes so dominant that there is no effective competition and no possibility of admitting additional market entrants. This form of market dominance is often termed "centrality". How centralised is the DNS?

The Case for VM and Container Consolidation in 2026

Two platforms, two teams, two procurement relationships, all doing one job. There’s a reason it ended up this way. There isn’t a reason it has to stay this way.

Ask anyone at a typical enterprise why the VM platform and the container platform are separate, and they’ll give you a sensible answer. The VM estate has been there for fifteen years. It runs the workloads the business depends on. Kubernetes got stood up later, when application teams started building microservices, and giving them their own environment made more sense than retrofitting one onto VMware. Two platforms, two teams, two roadmaps.

That’s how most enterprises got here.

The reasoning was sound at the time. The question is whether it still is.

This is the consolidation question most enterprises haven’t actually revisited, and it’s the one quietly absorbing more of your budget each year.

Figure 1. The current state most enterprises operate today.

Why VM and container platforms ended up separate

If you operate both platforms, you know the shape of this already. There’s a VMware team: vSphere admins, network engineers who know NSX, storage specialists, plus a separate procurement relationship for the underlying virtualisation stack. Then there’s a Kubernetes team: platform Continue reading

PP111: New HPE Mist Features Validate NAC Changes, Enable Inline Microsegmentation (Sponsored)

HPE has announced new features in its Juniper Mist portfolio. On today’s sponsored Packet Protector, we dig into those features, including a dry run option that lets organizations test and refine Network Access Control (NAC) policies before pushing them out, a policy validation feature that can identify shadow NAC rules, and a microsegmentation capability aimed... Read more »

NB576: IBM Gets Big Bucks to Build Quantum Chip Fab; AT&T Sues to Hang Up on Copper Phone Lines

Take a Network Break! We sound the alarm about a critical vulnerability in an on-prem Azure stack. On the news side, AI NetOps startup Selector adds public cloud observability to its portfolio, Versa Networks adds zero trust capabilities to its AI assistant, and IBM gets a billion-dollar investment to build a foundry to fabricate quantum... Read more »

Worth Reading: Your Code Is Worthless

Did you manage not to stumble on a dramatic post explaining how someone generated 10,000 lines of code with AI while wasting time on your LinkedIn feed? Congratulations, you’re lucky.

However, as Nathaniel Fishel explained in his Your Code Is Worthless article, “lines of code” is a useless vanity metric that sounds great in a LinkedIn self-promotion but doesn’t matter when one has to maintain the product one has shipped to customers. Add natural laziness, and you have a perfect storm. As he wrote:

Kubernetes Operational Maturity: Secure and Resilient Cluster Federation with Cluster Mesh

Practically no one runs a single Kubernetes cluster in production these days. Maybe that’s how it started but data sovereignty requirements, acquisitions, AI initiatives and the need for edge servers, among other considerations, have pulled most enterprises into multi-cluster territory whether they planned for it or not. Reaching Kubernetes operational maturity—the point at which a fleet of clusters operates as one secure, observable, policy-consistent system—depends entirely on how those clusters are connected. Operating in a multi-cluster environment has evolved into the unspoken standard, one requiring a careful re-evaluation of the network architectures used to link clusters together.

That re-evaluation rarely happens. Most enterprises connect their clusters with the same networking patterns they were using before Kubernetes existed: load balancers fronting internal services, DNS records published to external zones, and IP-based firewall rules. Those patterns were built for north-south traffic moving in and out of a traditional data center perimeter, not for east-west traffic moving between internal workloads.

Running east-west traffic on north-south plumbing

The conventional way to make services in one cluster reachable from another is to expose them externally: a load balancer in front, a DNS name registered in a public zone, and a firewall rule allowing traffic in. Continue reading

SONIC Part III: SONiC Introduction

SONiC is a vendor-neutral, Linux-based network operating system (NOS) that uses a database-driven architecture. Its software components run in multiple containers and exchange information through Redis. In SONiC, several named databases are defined for different functions, and these databases are mapped to Redis logical database IDs. Through this design, configuration data, application state, operational state, and ASIC-related state move between software layers by means of specialized processes.

Different hardware vendors may add their own platform integrations, transceiver support, monitoring utilities, or management workflows. However, the core SONiC architecture remains the same. This is one of the main reasons why SONiC knowledge, troubleshooting methods, and automation practices are transferable across different hardware platforms.

Vendor neutrality does not mean that every SONiC-based implementation behaves exactly the same in every operational detail. It means that different implementations follow the same architectural model. To organize information clearly, SONiC defines several named databases, each of which is mapped to a Redis logical database ID:

  • CONFIG_DB (Redis DB 4): Stores the user’s intended configuration.
  • APPL_DB (Redis DB 0): Stores application-level objects that are ready for processing by lower software layers.
  • STATE_DB (Redis DB 6): Stores operational state information about system Continue reading
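
As a rough illustration of the name-to-ID mapping listed above, the following Python sketch builds a `redis-cli` invocation that selects a logical database with the `-n` option. Only the three database names and their IDs come from the list; the helper name `redis_cli_command` and the sample key are assumptions made for illustration. The sketch only constructs the command and does not run it, so no Redis server is needed.

```python
# Database names and IDs as given in the list above; nothing else is
# taken from SONiC itself.
SONIC_DB_IDS = {
    "APPL_DB": 0,
    "CONFIG_DB": 4,
    "STATE_DB": 6,
}


def redis_cli_command(db_name, *redis_args):
    """Build (but do not run) a redis-cli command for a named database.

    Hypothetical helper: it only translates the database name into the
    logical database ID passed to redis-cli with -n.
    """
    db_id = SONIC_DB_IDS[db_name]
    return ["redis-cli", "-n", str(db_id), *redis_args]


# Illustrative key only; the keys present depend on the platform.
print(redis_cli_command("CONFIG_DB", "HGETALL", "PORT|Ethernet0"))
```

Keeping the mapping in one table means troubleshooting scripts can refer to databases by name rather than by a bare number.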

Scaling Akvorado BMP RIB with sharding

To associate routing information, such as AS paths or BGP communities, with flows, Akvorado can import routes through the BGP Monitoring Protocol (BMP). As the Internet routing table contains more than 1 million routes, Akvorado needs to scale to tens of millions of routes. This has been a long-standing challenge, but I expect this issue is now fixed by using RIB sharding, a method that splits the routing database into several parts to enable concurrent updates.
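
The sharding idea described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Akvorado's implementation (which is written in Go): the class name `ShardedRIB`, the CRC32-based shard selection, and the exact-match dictionary standing in for a prefix tree are all hypothetical. It only shows how giving each shard its own lock lets updates to different shards proceed without contending on one global lock.

```python
import ipaddress
import threading
import zlib


class ShardedRIB:
    """Toy RIB split into shards, each guarded by its own lock."""

    def __init__(self, shard_count=4):
        # One dictionary and one lock per shard (assumed layout).
        self._shards = [{} for _ in range(shard_count)]
        self._locks = [threading.Lock() for _ in range(shard_count)]

    def _shard_index(self, network):
        # Assumption: pick the shard from a CRC32 of the prefix text.
        return zlib.crc32(str(network).encode()) % len(self._shards)

    def add_route(self, prefix, peer, next_hop):
        network = ipaddress.ip_network(prefix)
        index = self._shard_index(network)
        # Only the shard owning this prefix is locked.
        with self._locks[index]:
            self._shards[index].setdefault(network, {})[peer] = next_hop

    def routes(self, prefix):
        network = ipaddress.ip_network(prefix)
        index = self._shard_index(network)
        with self._locks[index]:
            # Return a copy so callers never touch shard state unlocked.
            return dict(self._shards[index].get(network, {}))


rib = ShardedRIB()
rib.add_route("2001:db8:1::/48", peer=3, next_hop="2001:db8::3:1")
rib.add_route("2001:db8:1::/48", peer=4, next_hop="2001:db8::4:1")
print(rib.routes("2001:db8:1::/48"))
```

A real RIB also needs longest-prefix matching, which this exact-match sketch leaves out.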

Previous implementation

Akvorado connects 2 elements to build its RIB:

  1. a prefix tree, and
  2. a list of routes attached to each prefix.

Akvorado BMP RIB implementation without sharding, showing the memory layout of each structure and one single read/write lock.

In the diagram above, the RIB stores five IPv4 prefixes and two IPv6 prefixes. One of them, 2001:db8:1::/48, contains three routes:

  • from peer 3, next hop 2001:db8::3:1, AS 65402, AS path 65402, community 65402:31,
  • from peer 4, next hop 2001:db8::4:1, same ASN, AS path, and community,
  • from peer 5, next hop 2001:db8::5:1, AS 65402, AS path 65401 65402 Continue reading

The Five Pillars of AI Agent Accountability: A Diagnostic Framework for Engineering Leaders

You’re in a board meeting. The CISO is presenting on AI risk. The CFO asks a simple question:

“When that finance agent we deployed last quarter accessed a customer payment record, can we tell who authorized it, what policy permitted it, and produce the full audit trail?”

The CISO looks at the head of the platform. The head of the platform looks at security. Nobody answers.

If you can picture that meeting happening at your company, you’re not alone. McKinsey found that only one-third of organizations have AI agent governance maturity at level 3 or higher. The other two-thirds are exactly the silence in that boardroom.

This post is the diagnostic framework that closes that gap. It’s part 2 of a five-part series on AI agent accountability, and if you only have time to read one post in the series, read this one. By the end you’ll have a five-question assessment to run with your team this week, and a maturity model to score where you stand today.

Not all governance equals AI agent accountability. Many enterprises believe they’re covered because they have network policies or an API gateway, but governance without accountability is security theater: it Continue reading

HN828: How Selector Unifies Cloud and On-Prem Network Observability (Sponsored)

Selector is extending its AI-driven network observability capabilities into public clouds. On today’s sponsored episode, we dig into how Selector gathers and analyzes public cloud network telemetry, how it integrates cloud and on-prem network data to provide end-to-end visibility, how it integrates with third-party Application Performance Monitoring (APM) systems to correlate network and application performance,... Read more »

Hedge 306: RPKI Transport

Synchronizing information across the Internet looks, at first glance, like a fairly simple problem to solve. Just copy a file to a host and create a magic protocol, right? Not really. Each kind of data has its own set of requirements, and RPKI data, used to provide security information for BGP, is no different. Job Snijders joins Tom and Russ to talk about ERIK, a protocol developed to synchronize RPKI records.
 
For more information, check out Job’s web site and the IETF draft.

Technology Short Take 196

Welcome to Technology Short Take 196! Just in time for the US Memorial Day holiday, I am back with another list of articles related to various data center technologies like networking, security, operating systems, and applications. You will find articles on VPNs, Linux local privilege escalation (LPE) vulnerabilities, browser quirks and workarounds, the death of Terraform (again), and so much more. Enjoy your weekend reading!

Networking

Servers/Hardware

Security

Continue reading

Public Videos: OpenFlow Deep Dive

Remember OpenFlow, the One Protocol to Bind Them All? I haven’t heard anyone even mention it in ages, and I never bothered to ask whether anyone is still using it after the dismal results of the 2022 poll.

Anyway, if you still have to deal with that ancient blunder, six hours of deep dive videos I recorded a decade ago might still be useful. You can watch them without an ipSpace.net account.

Looking for more binge-watching materials? You’ll find them here.

Announcing Claude Compliance API support with Cloudflare CASB

Today, we are extending Cloudflare’s cloud access security broker (CASB) to support the Claude Compliance API. Security and compliance teams can now monitor Claude usage directly in the Cloudflare dashboard. No endpoint agents required.

Enterprise security teams have long struggled to see how users interact with sanctioned and unsanctioned applications. The rapid adoption of AI applications has made this harder. Employees spend significant time in these new surface areas, and their interactions differ from traditional SaaS: users upload files, share freeform prompts, and providers generate content that may contain sensitive data.

Cloudflare CASB helps solve this problem. One API integration gives you out-of-band visibility and control over the applications your organization uses. This integration builds on our existing support for AI governance, extending coverage over the most common tools security teams now manage. 

The fast path to safe AI adoption

AI adoption has outpaced security governance. While IT and security teams raced to enable AI tools for productivity, the controls lagged behind. Most organizations today operate with partial visibility: they may block unauthorized AI tools at the network layer, but they cannot see what happens inside sanctioned ones.

This matters because AI tools are not like traditional SaaS Continue reading
