Russ

Author Archives: Russ

Master Class: Security in the Design of DC Fabrics

I’m teaching another master class over at Juniper on February the 10th at 12 noon PT (3PM ET):

It’s typical to think about scale, speed, oversubscription, and costs when designing a data center fabric. But what about security in a world increasingly focused on privacy, data protection, and preventing downtime caused by cyber breaches? This session will consider how data center fabric software and control plane components can impact security, including the ability to effectively manage segmentation policy, controlling failure domains, and the impact host-based routing has on fabric security.

You can register here.

It is Easier to Move a Problem than Solve it (RFC1925, Rule 6)

Early on in my career as a network engineer, I learned the value of sharing. When I could not figure out why a particular application was not working correctly, it was always useful to blame the application. Conversely, the application owner was often quite willing to share their problems with me, as well, by blaming the network.

A more cynical way of putting this kind of sharing is the way RFC 1925, rule 6 puts is: “It is easier to move a problem around than it is to solve it.”

Of course, the general principle applies far beyond sharing problems with your co-workers. There are many applications in network and protocol design, as well. Perhaps the most widespread case deployed in networks today is the movement to “let the controller solve the problem.” Distributed routing protocols are hard? That’s okay, just implement routing entirely on a controller. Understanding how to deploy individual technologies to solve real-world problems is hard? Simple—move the problem to the controller. All that’s needed is to tell the controller what we intend to do, and the controller can figure the rest out. If you have problems solving any problem, just call it Software Defined Continue reading

Rethinking BGP on the DC Fabric (part 2)

In my last post on this topic, I laid out the purpose of this series—to start a discussion about whether BGP is the ideal underlay control plane for a DC fabric—and gave some definitions. Here, I’d like to dive into the reasons to not use BGP as a DC fabric underlay control plane—and the first of these reasons is BGP converges very slowly and requires a lot of help to converge at all.

Examples abound. I’ve seen the results of two testbeds in the last several years where a DC fabric was configured with each router (switch, if you prefer) in a separate AS, and some number of routes pushed into the network. In both cases—one large-scale, the other a more moderately scaled network on physical hardware—BGP simply failed to converge. Why? A quick look at how BGP converges might help explain these results.

Assume we are watching the 110::/64 route (attached to A, on the left side of the diagram), at P. What happens when A loses it’s connection to 110::/64? Assuming every router in this diagram is in a different AS, and the AS path length is the only factor determining the best path at every router.

Watching Continue reading

The Hedge Podcast #69: Container Networking Done Right

Everyone who’s heard me talk about container networking knows I think it’s a bit of a disaster. This is what you get, though, when someone says “that’s really complex, I can discard the years of experience others have in designing this sort of thing and build something a lot simpler…” The result is usually something that’s more complex. Alex Pollitt joins Tom Ammon and I to discuss container networking, and new options that do container networking right.

download

Rethinking BGP on the DC Fabric

Everyone uses BGP for DC underlays now because … well, just because everyone does. After all, there’s an RFC explaining the idea, every tool in the world supports BGP for the underlay, and every vendor out there recommends some form of BGP in their design documents.

I’m going to swim against the current for the moment and spend a couple of weeks here discussing the case against BGP as a DC underlay protocol. I’m not the only one swimming against this particular current, of course—there are at least three proposals in the IETF (more, if you count things that will probably never be deployed) proposing link-state alternatives to BGP. If BGP is so ideal for DC fabric underlays, then why are so many smart people (at least they seem to be smart) working on finding another solution?

But before I get into my reasoning, it’s probably best to define a few things.

In a properly design data center, there are at least three control planes. The first of these I’ll call the application overlay. This control plane generally runs host-to-host, providing routing between applications, containers, or virtual machines. Kubernetes networking would be an example of an application overlay control plane.

Continue reading

Controversial Reading 013021: Freedom of Speech

In the past, I have blended links of a more controversial nature about culture, technology, and governance into my weekend reads posts. There has been so much, however, on the situation with social media platforms blocking prominent people, and the Parler takedown, that it seemed worth setting aside an entire post containing some of the interesting things I’ve run across on these topics. I may, from time to time, gather up more controversial sets of reading into separate posts in the future, so people can skip (or read) them if they want to.


But then I think of this comment from a recent essay by Cory Doctorow: “The one entity Facebook will never, ever protect you from is Facebook.” We need to face quite clearly the fact that these recent events serve to consolidate the power of the tech giants—tech giants who quite literally have no principles to guide them other than self-interest, though they might occasionally discover reasons to act on our behalf.


Infrastructure companies much closer to the bottom of the technical “stack”— including Amazon Web Services (AWS), and Google’s Android and Apple’s iOS app stores—decided to cut off service not just to an individual but to
Continue reading

The Hedge Podcast #66: Daniel Migault and the ADD Working Group

The modern DNS landscape is becoming complex even for the end user. With the advent of so many public resolvers, DNS over TLS (DoT) and DNS over HTTPS (DoH), choosing a DNS resolver has become an important task. The ADD working group will, according to their page—

…focus on discovery and selection of DNS resolvers by DNS clients in a variety of networking environments, including publicnetworks, private networks, and VPNs, supporting both encrypted and unencrypted resolvers.

In this episode of the Hedge, Daniel Migault joins Alvaro Retana and Russ White to discuss Requirements for Discovering Designated Resolvers, draft-box-add-requirements-02.

download

Agglutinating Problems Considered Harmful (RFC2915, Rule 5)

In the networking world, many equate simplicity with the fewest number of moving parts. According to this line of thinking, if there are 100 routers, 10 firewalls, 3 control planes, and 4 management systems in a network, then reducing the number of routers to 95, the number of firewalls to 8, the number of control planes to 1, and the number of management systems to 3 would make the system “much simpler.” Disregarding the reduction in the number of management systems, scientifically proven to always increase in number, it does seem that reducing the number of physical devices, protocols in use, etc., would tend to decrease the complexity of the network.

The wise engineers of the IETF, however, has a word of warning in this area that all network engineers should heed. According to RFC1925, rule 5: “It is always possible to agglutinate multiple separate problems into a single complex interdependent solution. In most cases this is a bad idea.” When “conventional wisdom” and the wisdom of engineers with the kind of experience and background as those who write IETF documents contradict one another, it is worth taking a deeper look.

A good place to begin is Continue reading

Focus is a Virtue

The modern world craves our attention—but only in short bursts. To give your attention to any one thing for too long is failing, it seems, because you might miss out on something else of interest. We have entered the long tail of the attention economy, grounded in finding every smaller slices of time in which the user’s attention can be captured and used.

The damage of the attention economy is wide-ranging, including the politicization of everything, and the replacing ideas in politics with hate and fear. But for the network engineering world, the problem is exactly as Ethan describes— Technology mastery will be increasingly in the hands of the very few as a dwindling number of folks are willing, or perhaps even able, to create a mental state of focused learning. The application delivery stacks are enormously more complex than they were 25 years ago. Learning them requires a huge amount of focus over long periods of time.

The problem is obvious for anyone with eyes to see. What is the solution? The good news is there are solutions. The bad news is these solutions are swimming upstream against the major commercial interests of our day, so it’s going to Continue reading

The Hedge Podcast 67: Daniel Beveridge and the Structure of Innovation

Innovation and disruption are part the air we breath in the information technology world. But what is innovation, and how do we become innovators? When you see someone who has invented a lot of things, either shown in patents or standards or software, you might wonder how you can become an innovator, too. In this episode of the Hedge, Tom Ammon, Eyvonne Sharp, and Russ White talk to Daniel Beveridge about the structure of innovation—how to position yourself in a place where you can innovate, and how to launch innovation.

download

The History of the Cisco TAC

The Cisco Technical Assistance Center, or TAC, was as responsible for the growth of computer networking as any technology or other organization. TAC trained the first generation of network engineers, both inside Cisco and out, creating a critical mass of talent that spread out into the networking world, created a new concept of certifications, and set a standard that every other technical support organization has sought to live up to since. Join Joe Pinto, Phil Remaker, Alistair Woodman, Donald Sharp, and Russ White as we dive into the origins of TAC.

download

God Objects Considered Harmful

Every software developer has run into “god objects”—some data structure or database that every process must access no matter what it is doing. Creating god objects in software is considered an anti-pattern—something you should not do. Perhaps the most apt description of the god object I’ve seen recently is you ask for a banana, and you get the gorilla as well.

We seem to have a deep desire to solve all the complexity of modern networks through god objects. There was ATM, which was going to solve all our networking problems by allowing the edge device (or a centralized controller) to control the path its traffic takes through the network. There is LISP, which is going to solve every mapping and tunneling/transport problem in the entire networking world (including mobility and security). There is SDN, which is going to solve everything by pushing it all into a controller.

And now there is BGP, which can be a link state protocol (LSVR), the ideal DC fabric control plane, the ideal interdomain protocol, the ideal IGP … a sort-of distributed god object that solves everything, everywhere, all the time (life in the fast lane…).

The problem is, a bunch of people Continue reading

Webinar: How the Internet Really Works Part 1

On the 22nd, I’m giving a three hour course called How the Internet Really Works. I tried making this into a four hour course, but found I still have too much material, so I’ve split the webinar into two parts; the second part will be given in February. This part is about how systems work, who pays for what, and other higher level stuff. The second part will be all about navigating the DFZ. From the Safari Books site:

This training is designed for beginning engineers who do not understand the operation of the Internet, experienced engineers who want to “fill in the gaps,” project managers, coders, and anyone else who interacts with the Internet and wants to better understand the various parts of this complex, global ecosystem.

You can register here.

Technologies that Didn’t: ARCnet

In the late 1980’s, I worked at a small value added reseller (VAR) around New York City. While we deployed a lot of thinnet (RG58 coax based Ethernet for those who don’t know what thinnet is), we also had multiple customers who used ARCnet.

Back in the early days of personal computers like the Amiga 500, the 8086 based XT (running at 4.77MHz), and the 8088 based AT, all networks were effectively wide area, used to connect PDP-11’s and similar gear between college campuses and research institutions. ARCnet was developed in 1976, and became popular in the early 1980’s, because it was, at that point, the only available local area networking solution for personal computers.

ARCnet was not an accidental choice in the networks I supported at the time. While thinnet was widely available, it required running coax cable. The only twisted pair Ethernet standard available at the time required new cables to be run through buildings, which could often be an expensive proposition. For instance, one of the places that relied heavily on ARCnet was a legal office in a small town in north-central New Jersey. This law office had started out in an older home over a Continue reading

1 25 26 27 28 29 162