It always takes longer to find a problem than it should. Moving through the troubleshooting process often feels like swimming in molasses—it’s never fast enough or far enough to get the application back up and running before that crucial deadline. The “swimming in molasses effect” doesn’t end when the problem is found, either—repairing the problem requires juggling a thousand variables, most of which are unknown, combined with the wit and sagacity of a soothsayer to work with vendors, code releases, and unintended consequences.
It’s enough to make a network engineer want to find a mountain top and assume an all-knowing pose—even if they don’t know anything at all.
The problem of taking longer, though, applies in every area of computer networking. It takes too long for the packet to get there, it takes too long for the routing protocol to converge, and it takes too long to support a new application or server. It takes so long to create and validate a network design change that the hardware, software, and processes created are obsolete before they are used.
Why does it always take too long? A short story often told to me by my Grandfather—a farmer—might help.
One morning a Continue reading
We talk a lot about telemetry in the networking world, but generally as a set of disconnected things we measure, rather than as an entire system. We also tend to think about what we can measure, rather than what is useful to measure. Dinesh Dutt argues we should be thinking about observability, and how to see the network as a system. Listen in as Dinesh, Tom, and Russ talk about observability, telemetry, and Dinesh’s open source network observability project.
The Hedge is over 90 episodes now … I’m a little biased, but I believe we’re building the best content in network engineering—a good blend of soft skills, Internet policy, research, open source projects, and relevant technical content. You can always follow the Hedge here on Rule 11, of course, but it’s also available on a number of services, including—
I think it’s also available on Amazon Music, but I don’t subscribe to that service so I can’t see it. You can check the Podcast Directory for other services, as well. If you enjoy the Hedge, please post a positive rating so others can find it more easily.
I’ll be teaching a three-hour live webinar on data center fabrics on the 20th of August—
Data centers are the foundation of the cloud, whether private, public, on the edge, or in the center of the network. This training will focus on topologies and control planes, including scale, performance, and centralization. This training is important for network designers and operators who want to understand the elements of data center design that apply across all hardware and software types.
We tend to think every technology and every product is roughly unique—so we tend to stay up late at night looking at packet captures and learning how to configure each product individually, and chasing new ones as if they are the brightest new idea (or, in marketing terms, the best thing since sliced bread). Reality check: they aren’t. This applies across life, of course, but especially to technology. From a recent article—
RFC1925 rule 11 states—
Rule 11 isn’t just a funny saying—rule 11 is your friend. If you want to learn new things quickly, learn rule 11 first. A basic understanding of the theory of networking will carry across all products, all Continue reading
On the 28th—in two days—I’m doing a master class over at Juniper on DC fabric disaggregation. I’ll spend some time defining the concept (there are two different ideas we use the word disaggregation to describe), and then consider some of the positive and negative aspects of disaggregation. This is a one hour session, and it’s free. Register here.
In most areas of life, where there are standards, there is some kind of enforcing agency. For instance, there are water standards, and there is a water department that enforces these standards. There are electrical standards, and there is an entire infrastructure of organizations that make certain as few people as possible are electrocuted each year. What about Internet standards? Most people are surprised when they realize there is no such thing as a “standards police” on the Internet.
Listen in as George Michaelson, Evyonne Sharp, Tom Ammon, and Russ White discuss the reality of standards enforcement in the Internet ecosystem.
There is never enough. Whatever you name in the world of networking, there is simply not enough. There are not enough ports. There is not enough speed. There is not enough bandwidth. Many times, the problem of “not enough” manifests itself as “too much”—there is too much buffering and there are too many packets being dropped. Not so long ago, the Internet community decided there were not enough IP addresses and decided to expand the address space from 32 bits in IPv4 to 128 bits in IPv6. The IPv6 address space is almost unimaginably huge—2 to the 128th power is about 340 trillion, trillion, trillion addresses. That is enough to provide addresses to stacks of 10 billion computers blanketing the entire Earth. Even a single subnet of this space is enough to provide addresses for a full data center where hundreds of virtual machines are being created every minute; each /64 (the default subnet allocation size in IPv6) contains the equivalent of 4 billion entire IPv4 address spaces.
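The arithmetic behind these claims is easy to check. A minimal sketch, using nothing but integer math:

```python
# Back-of-the-envelope check of the IPv6 address-space claims above.
ipv6_total = 2 ** 128   # total IPv6 addresses
ipv4_total = 2 ** 32    # total IPv4 addresses (~4.3 billion)
per_slash64 = 2 ** 64   # addresses in a single /64 subnet

# About 3.40e+38 -- roughly 340 trillion, trillion, trillion addresses.
print(f"{ipv6_total:.2e}")

# Each /64 holds 2^32 (~4 billion) copies of the entire IPv4 space.
print(per_slash64 // ipv4_total)  # prints 4294967296
```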
But… what if the current IPv6 address space simply is not enough? Engineers working in the IETF have created two different solutions over the years for just this eventuality. In 1994 RFC1606 provided a Continue reading
What if you could connect a lot of devices to the Internet—without any kind of firewall or other protection—and observe attackers trying to find their way “in?” What might you learn from such an exercise? One thing you might learn is that a lot of attacks seem to originate from within a relatively small group of IP addresses—IP addresses acting badly. Listen in as Leslie Daigle of Thinking Cat and the Techsequences podcast, Tom Ammon, and Russ White discuss just such an experiment and its results.
A couple of weeks ago, I joined Leslie Daigle and Alexa Reid on Techsequences to talk about free speech and the physical platform—does the right to free speech include the right to build and operate physical facilities like printing presses and web hosting? I argue it does. Listen in if you want to hear my argument, and how this relates to situations such as the “takedown” of Parler.
While reading a research paper on address spoofing from 2019, I ran into this on NAT (really PAT) failures—
The authors state 49% of the NATs they discovered in their investigation of spoofed addresses fail in one of these two ways. From what I remember way back when the first NAT/PAT device (the PIX) was deployed in the real world (I worked in TAC at the time), there was a lot of discussion about what a firewall should do with packets sourced from addresses not indicated anywhere.
If I have an access list including 192.168.1.0/24, and I get a packet sourced from 192.168.2.24, Continue reading
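The prefix check in question is straightforward to illustrate. A minimal sketch using Python's standard `ipaddress` module (not any particular vendor's ACL implementation—the drop behavior is an assumption about a typical default-deny policy):

```python
# Does a source address fall inside a permitted prefix?
import ipaddress

permitted = ipaddress.ip_network("192.168.1.0/24")

def matches_acl(source: str) -> bool:
    """Return True if the source address is inside the permitted prefix."""
    return ipaddress.ip_address(source) in permitted

print(matches_acl("192.168.1.10"))  # True: inside 192.168.1.0/24
print(matches_acl("192.168.2.24"))  # False: outside the prefix, so a
                                    # default-deny device would drop it
```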
Automation is surely one of the best things to come to the networking world—the ability to consistently apply a set of changes across a wide array of network devices has increased the speed at which network engineers can respond to customer requests, increased the security of the network, and reduced the number of hours required to build and maintain large-scale systems. There are downsides to automation, as well—particularly when operators begin to rely on automation to solve problems that really should be solved someplace else.
In this episode of the Hedge, Andrew Wertkin from Bluecat Networks joins Tom Ammon and Russ White to discuss the naïve reliance on automation.
I’m teaching a webinar on router internals through Pearson (Safari Books Online) on the 23rd of July. From the abstract—
A network device—such as a router, switch, or firewall—is often seen as a single “thing,” an abstract appliance that is purchased, deployed, managed, and removed from service as a single unit. While network devices do connect to other devices, receiving and forwarding packets and participating in a unified control plane, they are not seen as a “system” in themselves.
What is the first thing almost every training course in routing protocols begins with? Building adjacencies. What is considered the “deep stuff” in routing protocols? Knowing packet formats and processes down to the bit level. What is considered the place where the rubber meets the road? How to configure the protocol.
I’m not trying to cast aspersions at widely available training, but I sense we have this all wrong—and this is a sense I’ve had ever since my first book was released in 1999. It’s always hard for me to put my finger on why I consider this way of thinking about network engineering less-than-optimal, or why we approach training this way.
This, however, is one thing I think is going on here—
We believe that by knowing ever-deeper reaches of detail about a protocol, we not only become more educated engineers, but will also be able to make better decisions in the design and troubleshooting spaces.
To some degree, we think we are managing the Continue reading