Ivan Pepelnjak

Author Archives: Ivan Pepelnjak

BGP Labs: Protect EBGP Sessions

I published another BGP labs exercise a few days ago. You can use it to practice EBGP session protection, including Generalized TTL Security Mechanism (GTSM) and TCP MD5 checksums [1].
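To make the GTSM idea concrete, here is a minimal sketch of the receive-side TTL check, assuming the sender transmits packets with TTL 255 and every router in the path decrements it by one. The function name and the `routers_in_path` parameter are hypothetical; real implementations express the same idea with a vendor-specific "hops" setting whose exact meaning differs between platforms.

```python
MAX_TTL = 255  # GTSM senders transmit with the maximum TTL


def gtsm_accepts(received_ttl: int, routers_in_path: int = 0) -> bool:
    """Return True if a received packet passes a GTSM-style TTL check.

    routers_in_path is a hypothetical parameter: the number of routers
    expected between the two BGP speakers (0 for directly connected peers).
    """
    if not 0 <= routers_in_path < MAX_TTL:
        raise ValueError("routers_in_path must be between 0 and 254")
    # Every router on the path decrements the TTL by one, so a legitimate
    # packet cannot arrive with a TTL lower than this value.
    min_ttl = MAX_TTL - routers_in_path
    return received_ttl >= min_ttl


if __name__ == "__main__":
    print(gtsm_accepts(255))     # directly connected peer
    print(gtsm_accepts(64))      # packet that crossed many routers
    print(gtsm_accepts(254, 1))  # peer behind one router
```

A remote attacker cannot make a packet arrive with a TTL that high, because every router between the attacker and the target lowers it.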

I strongly recommend running the BGP labs with netlab, but if you like extra work, feel free to use any system you like, including physical hardware.


  1. I would love to add TCP-AO to the mix, but it’s not yet supported by the Linux kernel, and so cannot be used in Cumulus Linux or FRR containers.

Addresses in a Networking Stack

After discussing names, addresses and routes, it’s time for the next question: what kinds of addresses do we need to make things work?

End-users (clients) are usually interested in a single thing: they want to reach the service they want to use. They don’t care about nodes, links, or anything else.

End-users might want to use friendly service names, but we already know we need addresses to make things work. We need application level service identifiers – something that identifies the services that the clients want to reach.

Names, Addresses and Routes

It always helps to figure out the challenges of a problem you’re planning to solve, and to have well-defined terminology. This blog post will mention a few challenges we might encounter while addressing various layers of the networking stack, from the data-link layer all the way up to the application layer, and introduce the concepts of names, addresses and routes.

According to Martin Fowler, one of the best quotes I found on the topic originally came from Phil Karlton:

Dataplane MAC Learning with EVPN

Johannes Resch submitted the following comment to the Is Dynamic MAC Learning Better Than EVPN? blog post:

I’ve also recently noticed some vendors claiming that dataplane MAC learning is so much better because it reduces the number of BGP updates in large scale SP EVPN deployments. Apparently, some of them are working on IETF drafts to bring dataplane MAC learning “back” to EVPN. Not sure if this is really a relevant point - we know that BGP scales nicely, and it’s relatively easy to deploy virtualized RR with sufficient VPU resources.

While he’s absolutely correct that BGP scales nicely, the question to ask is “what is the optimal way to deliver a Carrier Ethernet service?”

Worth Reading: Where Are the Self-Driving Cars?

Gary Marcus wrote an interesting essay describing the failure of self-driving cars to face the unknown unknowns. The following gem from his conclusions applies to AI in general:

In a different world, less driven by money, and more by a desire to build AI that we could trust, we might pause and ask a very specific question: have we discovered the right technology to address edge cases that pervade our messy real world? And if we haven’t, shouldn’t we stop hammering a square peg into a round hole, and shift our focus towards developing new methodologies for coping with the endless array of edge cases?

Obviously, that’s not going to happen; we’ll keep throwing more GPU power at the problem, trying to solve it by brute force.

Reliable ECMP with Static Routing

One of my readers wanted to use EIBGP (hint: wrong tool for this particular job) to load balance outgoing traffic from a pair of WAN edge routers. He’s using a design very similar to this one with VRRP running between WAN edge routers, and the adjacent firewall cluster using a default route to the VRRP IP address.

The problem: all output traffic goes to the VRRP IP address which is active on one of the switches, and only a single uplink is used for the outgoing traffic.

Case Study: BGP Routing Policy

Talking about BGP routing policy mechanisms is nice, but it’s even better to see how real Internet Service Providers use those tools to implement real-life BGP routing policy.

Getting that information is incredibly hard as everyone considers their setup a secret sauce. Fortunately, there are a few exceptions; Pim van Pelt described the BGP Routing Policy of IPng Networks in great detail. The article is even more interesting as he’s using Bird2 configuration language that looks almost like a programming language (as compared to the ancient route-maps used by vendors focused on “industry-standard” CLI).

Have fun!

Layer-3 WAN Handoff (L3Out) in VXLAN/EVPN Fabrics

I got a question from a few of my students regarding the best way to implement end-to-end EVPN across multiple locations. Obviously there’s the multi-pod and multi-site architecture for people believing in the magic powers of stretching VLANs across the globe, but I was looking for something that I could recommend to people who understand that you have to have an L3 boundary if you want to have multiple independent failure domains (or availability zones).

Random Thoughts on Zero-Trust Architecture

When preparing the materials for the Design Clinic section describing Zero-Trust Network Architecture, I wondered whether I was missing something crucial. After all, I couldn’t find anything new when reading the NIST documents – we saw everything they’re describing 30 years ago (remember Kerberos?).

In late August I dropped by the fantastic Roundtable and Barbecue event organized by Gabi Gerber (running Security Interest Group Switzerland) and used the opportunity to join the Zero Trust Architecture roundtable. Most other participants were seasoned IT security professionals with a level of skepticism approaching mine. When I mentioned I failed to see anything new in the now-overhyped topic, they quickly expressed similar doubts.

BGP Labs: Simple Routing Policy Tools

The first set of BGP labs covered the basics; the next four will help you master simple routing policy tools (BGP weights, AS-path filters, prefix filters) using real-life examples:

The labs are best used with netlab (it supports BGP on almost 20 different devices), but you could use any system you like (including GNS3 and CML/VIRL). If you’re stubborn enough it’s possible to make them work with the physical gear, but don’t ask me for help. For more details, read the Installation and Setup documentation.

Lifetime ipSpace.net Subscription

More than thirteen years after I started creating vendor-neutral webinars, it’s time for another change: the ipSpace.net subscriptions became perpetual. If you have an active ipSpace.net subscription, it will stay valid indefinitely (and I’ll stop nagging you with renewal notices).

Wow, Free Lunch?

Sadly, that’s not the case.

ARP and Static Routes

A few days ago, I described how ARP behaves when the source- and destination IP addresses are not on the same subnet (TL&DR: it doesn’t care). Now, let’s see how routers use ARP to get the destination MAC address for various entries in the IP routing table. To keep things simple, we’ll use static routes to insert entries in the IP routing table.

We’ll run our tests in a small virtual lab with two Linux hosts and an Arista vEOS switch. The link between H1 and RTR is a regular subnet. H2 has an IP address on the Ethernet interface, but RTR uses an unnumbered interface.
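As a rough mental model of what such tests explore, the following sketch shows the commonly observed rule for picking the IP address a router sends an ARP request for. It is an assumption-laden illustration, not the behavior of any particular device: the `StaticRoute` class, the `arp_target` function, and the interface names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StaticRoute:
    """Hypothetical model of a static route in the IP routing table."""
    prefix: str
    interface: str
    next_hop: Optional[str] = None  # None models a route pointing to an interface


def arp_target(route: StaticRoute, destination_ip: str) -> str:
    """Return the IPv4 address the router would try to resolve with ARP.

    Assumed rule: with a next hop, resolve the next hop; without one,
    treat the destination as directly reachable and resolve it instead.
    """
    if route.next_hop is not None:
        return route.next_hop
    return destination_ip


if __name__ == "__main__":
    via_next_hop = StaticRoute("192.168.0.0/24", "Ethernet1", next_hop="10.0.0.2")
    via_interface = StaticRoute("192.168.0.0/24", "Ethernet2")
    print(arp_target(via_next_hop, "192.168.0.5"))
    print(arp_target(via_interface, "192.168.0.5"))
```

A route pointing to an interface therefore results in one ARP entry per destination, while a route with a next hop needs a single entry for the next hop.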

Worth Reading: Looking Inside Large Language Models

Bruce Davie published an interesting overview article about Large Language Models. It would be worth reading just for the copious links to in-depth articles; I particularly like his conclusions:

We mistake performance (producing realistic text) for competence (understanding the world).

Having a model for language is different from having a model of the world.

And that’s a perfect explanation why it makes no sense to expect ChatGPT and friends to produce picture-perfect device configurations or always-working code.

ARP Details Behind the Scenes

When figuring out how unnumbered IPv4 interfaces work, Daniel Dib asked an interesting question: How does ARP work when the source and destination IPv4 addresses are not in the same segment (as is usually the case when using unnumbered interfaces)?

TL&DR: ARP doesn’t care about subnets. If the TCP/IP stack needs to find a MAC address of a node it thinks is adjacent, ARP does its best, no matter what.
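One way to see why ARP cannot care about subnets is to look at what an ARP request carries. The sketch below builds the 28-byte ARP payload for IPv4 over Ethernet; the function name and the sample addresses are made up for illustration. The payload contains the sender and target addresses but no subnet mask, so nothing in the packet says whether the two addresses share a subnet.

```python
import ipaddress
import struct


def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Build the payload of an ARP request for IPv4 over Ethernet."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,                                        # hardware type: Ethernet
        0x0800,                                   # protocol type: IPv4
        6,                                        # hardware address length
        4,                                        # protocol address length
        1,                                        # operation: request
        sender_mac,                               # sender hardware address
        ipaddress.IPv4Address(sender_ip).packed,  # sender protocol address
        b"\x00" * 6,                              # target hardware address (unknown)
        ipaddress.IPv4Address(target_ip).packed,  # target protocol address
    )


if __name__ == "__main__":
    # Hypothetical addresses from two unrelated subnets
    mac = bytes.fromhex("020000000001")
    payload = build_arp_request(mac, "10.0.0.1", "172.16.0.2")
    print(len(payload), payload.hex())
```

Whether a node answers such a request depends on its TCP/IP stack, not on anything in the ARP payload itself.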

BGP Labs: The Basics

The first BGP labs are online. They cover the basic stuff (one has to start with the basics, right?):

The labs are supposed to be run on virtual devices, but if you’re stubborn enough it’s possible to make them work with the physical gear. In theory, you could use any system you like to set up the virtual lab (including GNS3 and CML/VIRL), but your life will be way easier if you use netlab – it supports BGP on almost 20 different devices. For more details, read the Installation and Setup documentation.

How GitHub Learned How Hard Distributed Systems Are

Anne Baretta found a great video describing the October 2018 GitHub failure. Here’s the TL&DW:

  • The failure was caused by a short (~ 1 minute) disconnect of the primary data center.
  • The database replicas failed over to the secondary data center, but that failover was never tested and of course some stuff didn’t work.
  • In the meantime, batch jobs modified data in the primary data center, making the two replicas out-of-sync.
  • It took them over 24 hours to clean up the mess.