Russ

Author Archives: Russ

On the ‘net: The Art of Conviction

I was recently a guest on The Art of Conviction podcast, where we covered a bit of my background, some of the challenges I’ve faced in getting where I am, and then we moved into a discussion around my recently finished dissertation. I’m working to find places to publish more in the area of worldview and culture; I’ll point to those here as I can find a “home” for that side of my life.

You can find the recording here.

Beyond my episode, The Art of Conviction is a fascinating podcast; you should really subscribe and listen in.

Rethinking BGP on the DC Fabric (part 3)

The first post on this topic considered some basic definitions and the reasons why I am writing this series of posts. The second considered the convergence speed of BGP on a dense topology such as a DC fabric, and what mechanisms we normally use to improve BGP’s convergence speed. This post considers two objections to worrying about slow convergence: that convergence speed is not important, and that ECMP with high fanouts will take care of any convergence speed issues. The network below will be used for this discussion.

Two servers are connected to this five-stage butterfly: S1 and S2. Assume, for a moment, that some service is running on both S1 and S2. This service is configured in active-active mode, with all data synchronized between the servers. If some fabric device, such as C7, fails, traffic destined to either S1 or S2 across that device will be very quickly (within tens of milliseconds) rerouted through some other device, probably C6, to reach the same destination. This will happen no matter what routing protocol is being used in the underlay control plane, so why does BGP’s convergence speed matter? Further, if these services are running in the overlay, or they are designed to discover Continue reading

The Hedge #70: FR Routing Update

FR Routing is a widely used and supported open source routing stack. In this episode of the Hedge, Alistair Woodman, Quentin Young, Donald Sharp, Tom Ammon, and Russ White discuss recent updates, additions to the CI/CD system, the release process, and operating system support. If you’re looking for a good open source, containerized routing stack for everything from route servers to DC fabrics and labbing to production, you should check out FR Routing.

Master Class: Security in the Design of DC Fabrics

I’m teaching another master class over at Juniper on February the 10th at 12 noon PT (3PM ET):

It’s typical to think about scale, speed, oversubscription, and costs when designing a data center fabric. But what about security in a world increasingly focused on privacy, data protection, and preventing downtime caused by cyber breaches? This session will consider how data center fabric software and control plane components can impact security, including the ability to effectively manage segmentation policy, controlling failure domains, and the impact host-based routing has on fabric security.

You can register here.

It is Easier to Move a Problem than Solve it (RFC1925, Rule 6)

Early on in my career as a network engineer, I learned the value of sharing. When I could not figure out why a particular application was not working correctly, it was always useful to blame the application. Conversely, the application owner was often quite willing to share their problems with me, as well, by blaming the network.

A more cynical way of putting this kind of sharing is the way RFC 1925, rule 6 puts it: “It is easier to move a problem around than it is to solve it.”

Of course, the general principle applies far beyond sharing problems with your co-workers. There are many applications in network and protocol design, as well. Perhaps the most widespread case deployed in networks today is the movement to “let the controller solve the problem.” Distributed routing protocols are hard? That’s okay, just implement routing entirely on a controller. Understanding how to deploy individual technologies to solve real-world problems is hard? Simple—move the problem to the controller. All that’s needed is to tell the controller what we intend to do, and the controller can figure the rest out. If you have problems solving any problem, just call it Software Defined Continue reading

Rethinking BGP on the DC Fabric (part 2)

In my last post on this topic, I laid out the purpose of this series—to start a discussion about whether BGP is the ideal underlay control plane for a DC fabric—and gave some definitions. Here, I’d like to dive into the reasons to not use BGP as a DC fabric underlay control plane—and the first of these reasons is BGP converges very slowly and requires a lot of help to converge at all.

Examples abound. I’ve seen the results of two testbeds in the last several years where a DC fabric was configured with each router (switch, if you prefer) in a separate AS, and some number of routes pushed into the network. In both cases—one large-scale, the other a more moderately scaled network on physical hardware—BGP simply failed to converge. Why? A quick look at how BGP converges might help explain these results.

Assume we are watching the 110::/64 route (attached to A, on the left side of the diagram), at P. What happens when A loses its connection to 110::/64? Assume every router in this diagram is in a different AS, and the AS path length is the only factor determining the best path at every router.

Watching Continue reading

The Hedge Podcast #69: Container Networking Done Right

Everyone who’s heard me talk about container networking knows I think it’s a bit of a disaster. This is what you get, though, when someone says “that’s really complex, I can discard the years of experience others have in designing this sort of thing and build something a lot simpler…” The result is usually something that’s more complex. Alex Pollitt joins Tom Ammon and me to discuss container networking, and new options that do container networking right.

Rethinking BGP on the DC Fabric

Everyone uses BGP for DC underlays now because … well, just because everyone does. After all, there’s an RFC explaining the idea, every tool in the world supports BGP for the underlay, and every vendor out there recommends some form of BGP in their design documents.

I’m going to swim against the current for the moment and spend a couple of weeks here discussing the case against BGP as a DC underlay protocol. I’m not the only one swimming against this particular current, of course—there are at least three proposals in the IETF (more, if you count things that will probably never be deployed) proposing link-state alternatives to BGP. If BGP is so ideal for DC fabric underlays, then why are so many smart people (at least they seem to be smart) working on finding another solution?

But before I get into my reasoning, it’s probably best to define a few things.

In a properly designed data center, there are at least three control planes. The first of these I’ll call the application overlay. This control plane generally runs host-to-host, providing routing between applications, containers, or virtual machines. Kubernetes networking would be an example of an application overlay control plane.

Continue reading

Controversial Reading 013021: Freedom of Speech

In the past, I have blended links of a more controversial nature about culture, technology, and governance into my weekend reads posts. There has been so much, however, on the situation with social media platforms blocking prominent people, and the Parler takedown, that it seemed worth setting aside an entire post containing some of the interesting things I’ve run across on these topics. I may, from time to time, gather up more controversial sets of reading into separate posts in the future, so people can skip (or read) them if they want to.


But then I think of this comment from a recent essay by Cory Doctorow: “The one entity Facebook will never, ever protect you from is Facebook.” We need to face quite clearly the fact that these recent events serve to consolidate the power of the tech giants—tech giants who quite literally have no principles to guide them other than self-interest, though they might occasionally discover reasons to act on our behalf.


Infrastructure companies much closer to the bottom of the technical “stack”— including Amazon Web Services (AWS), and Google’s Android and Apple’s iOS app stores—decided to cut off service not just to an individual but to
Continue reading

The Hedge Podcast #66: Daniel Migault and the ADD Working Group

The modern DNS landscape is becoming complex even for the end user. With the advent of so many public resolvers, DNS over TLS (DoT) and DNS over HTTPS (DoH), choosing a DNS resolver has become an important task. The ADD working group will, according to their page—

…focus on discovery and selection of DNS resolvers by DNS clients in a variety of networking environments, including public networks, private networks, and VPNs, supporting both encrypted and unencrypted resolvers.

In this episode of the Hedge, Daniel Migault joins Alvaro Retana and Russ White to discuss Requirements for Discovering Designated Resolvers, draft-box-add-requirements-02.

Agglutinating Problems Considered Harmful (RFC1925, Rule 5)

In the networking world, many equate simplicity with the fewest number of moving parts. According to this line of thinking, if there are 100 routers, 10 firewalls, 3 control planes, and 4 management systems in a network, then reducing the number of routers to 95, the number of firewalls to 8, the number of control planes to 1, and the number of management systems to 3 would make the system “much simpler.” Disregarding the reduction in the number of management systems, scientifically proven to always increase in number, it does seem that reducing the number of physical devices, protocols in use, etc., would tend to decrease the complexity of the network.

The wise engineers of the IETF, however, have a word of warning in this area that all network engineers should heed. According to RFC1925, rule 5: “It is always possible to agglutinate multiple separate problems into a single complex interdependent solution. In most cases this is a bad idea.” When “conventional wisdom” and the wisdom of engineers with the kind of experience and background as those who write IETF documents contradict one another, it is worth taking a deeper look.

A good place to begin is Continue reading

Focus is a Virtue

The modern world craves our attention, but only in short bursts. To give your attention to any one thing for too long is failing, it seems, because you might miss out on something else of interest. We have entered the long tail of the attention economy, grounded in finding ever smaller slices of time in which the user’s attention can be captured and used.

The damage of the attention economy is wide-ranging, including the politicization of everything, and the replacement of ideas in politics with hate and fear. But for the network engineering world, the problem is exactly as Ethan describes: Technology mastery will be increasingly in the hands of the very few as a dwindling number of folks are willing, or perhaps even able, to create a mental state of focused learning. The application delivery stacks are enormously more complex than they were 25 years ago. Learning them requires a huge amount of focus over long periods of time.

The problem is obvious for anyone with eyes to see. What is the solution? The good news is there are solutions. The bad news is these solutions are swimming upstream against the major commercial interests of our day, so it’s going to Continue reading

The Hedge Podcast 67: Daniel Beveridge and the Structure of Innovation

Innovation and disruption are part of the air we breathe in the information technology world. But what is innovation, and how do we become innovators? When you see someone who has invented a lot of things, either shown in patents or standards or software, you might wonder how you can become an innovator, too. In this episode of the Hedge, Tom Ammon, Eyvonne Sharp, and Russ White talk to Daniel Beveridge about the structure of innovation: how to position yourself in a place where you can innovate, and how to launch innovation.

The History of the Cisco TAC

The Cisco Technical Assistance Center, or TAC, was as responsible for the growth of computer networking as any technology or other organization. TAC trained the first generation of network engineers, both inside Cisco and out, creating a critical mass of talent that spread out into the networking world, created a new concept of certifications, and set a standard that every other technical support organization has sought to live up to since. Join Joe Pinto, Phil Remaker, Alistair Woodman, Donald Sharp, and Russ White as we dive into the origins of TAC.
