ddib, Author at NetworkingNexus.net

ddib

Author Archives: ddib

PMTUD in MPLS-enabled Networks

In the previous post on MSS, MSS Clamping, PMTUD, and MTU, we learned how PMTUD is performed by setting the Don’t fragment flag in the IP header which leads to the device that needs to perform fragmentation dropping the packet and sending ICMP Fragmentation needed packet towards the source. In MPLS-enabled networks, it’s not always possible to send the ICMP packet straight towards the source as the P routers have no knowledge of the customer specific networks. In RFC 3032 – MPLS Label Stack Encoding, such a scenario is described:

Suppose one is using MPLS to "tunnel" through a transit routing
domain, where the external routes are not leaked into the domain's
interior routers.  For example, the interior routers may be running
OSPF, and may only know how to reach destinations within that OSPF
domain.  The domain might contain several Autonomous System Border
Routers (ASBRs), which talk BGP to each other.  However, in this
example the routes from BGP are not distributed into OSPF, and the
LSRs which are not ASBRs do not run BGP.

In this example, only an ASBR will know how to route to the source of
some arbitrary packet.  If an interior router needs  Continue reading

MSS, MSS Clamping, PMTUD, and MTU

Maximum Segment Size (MSS) and MSS clamping are concepts that can be confusing. How do they relate to the MTU (Maximum Transmission Unit)? Before we setup a lab to demonstrate these concepts, let’s give some background. Note that this entire post assumes a maximum frame size of 1518 bytes, the original Ethernet definition, and does not cover jumbo frames.

Ethernet frame

Almost all interfaces today are Ethernet. The original 802.3 standard from 1985 defined the minimum size- and maximum size frame as the following:

minFrameSize – 64 octets.
maxFrameSize – 1518 octets.

With a maximum frame size of 1518 octets (bytes), that leaves 1500 bytes of payload as the Ethernet frame adds 18 bytes, 14 bytes of header and 4 bytes of trailer. The Ethernet frame is shown below:

IP header

An IPv4 IP header adds at least 20 bytes to the frame. The IPv4 header is shown below:

Note that more than 20 bytes can be used if the header has IP options. With no options in the IP header, there’s 1480 bytes remaining for the L4 protocol such as UDP or TCP.

TCP header

TCP also adds a minimum of 20 bytes, meaning that the maximum payload Continue reading

Not All OSPF Inter-area Traffic Traverses Interfaces In Area 0

Everyone knows that OSPF is a link state protocol. Those that study also discover that OSPF behaves like distance vector between areas as Type-1- and Type-2 LSAs are not flooded between areas, but rather summarized in Type-3 LSAs. This means that OSPF is a logical star, or hub with spokes, where Area 0 is the backbone and all other areas must connect to Area 0. This is shown below:

With this topology, since all the areas only connect to the backbone area, traffic between areas must traverse the backbone:

We learn about this behavior in literature where there is a very straight forward topology where each ABR is only attached to one area beyond the backbone. Such a topology is shown below:

In such a topology, traffic between RT04 and RT05 has to traverse the backbone. This is shown below:

However, what if you have a topology which is not as clear cut? Where an ABR attaches to multiple areas? This is what we will explore in this post. We’ll be using the topology below:

In this topology, RT02 and RT03 are ABRs. RT02 is attached to both Area 1 and Area 2 in addition to the backbone, while RT03 Continue reading

Ethernet History Deepdive – Why Do We Have Different Frame Types?

In my previous post Encapsulation of PDUs On Trunk Ports, I showed what happens to PDUs when you change the configuration of a trunk. You may have noticed that there are typically three different types of Ethernet encapsulations that we see:

Ethernet II.
802.2 LLC.
802 SNAP.

Historically, there were even more than three, but we’re ignoring that for now. Why do we have three? To understand this, we need to go back in history.

The Origin of Ethernet

In the early 70’s, Robert Metcalfe, inspired by ARPANET and ALOHAnet had been working on developing what we today know as Ethernet. He published a paper in 1976, together with David Boggs, named Ethernet: Distributed Packet Switching for Local Computer Networks:

This image has an empty alt attribute; its file name is Ethernet_paper_1975.png

In the paper, they describe the addressing used in Ethernet:

3.3 Addressing
Each packet has a source and destination, both of which are identified in the packet’s header.
A packet placed on the Ether eventually propagates to all stations. Any station can copy a packet
from the Ether into its local memory, but normally only an active destination station matching ‘its
address in the packet’s header will do so as the packet passes. By convention, a Continue reading

Why Are OSPF Type 5 LSAs Flooded?

I recently saw a great question on Reddit, on why Type-5 (AS-external) LSAs are flooded, in comparison to Type-3 (Summary) that are regenerated at the ABR. To investigate this, we’ll use the following simple topology where R2 and R3 are ABRs:

OSPF Behavior Type-3 LSAs

Let’s see how OSPF handles Summary LSAs. Let’s first look at Area 1, where R4 is advertising 169.254.0.0/24 into it. This can be seen in the LSDB of R2:

R2#show ip ospf data router 203.0.113.4

            OSPF Router with ID (203.0.113.2) (Process ID 1)

                Router Link States (Area 1)

  LS age: 74
  Options: (No TOS-capability, DC)
  LS Type: Router Links
  Link State ID: 203.0.113.4
  Advertising Router: 203.0.113.4
  LS Seq Number: 80000009
  Checksum: 0x1DF0
  Length: 84
  Number of Links: 5

    Link connected to: a Stub Network
     (Link ID) Network/subnet number: 169.254.0.0
     (Link Data) Network Mask: 255.255.255.0
      Number of MTID metrics: 0
       TOS 0 Metrics: 1

    Link connected to: another Router (point-to-point)
     (Link ID) Neighboring Router ID: 203.0.113.3
     (Link Data) Router Interface address: 192.0.2.14
      Number of MTID metrics: 0
       TOS 0  Continue reading

Some History on VLAN 1 in Cisco Switches

Over the years, there has been a lot of discussion on if VLAN 1 in Cisco switches is special or not. Does it have any characteristics that other VLANs don’t? I covered some of this in the Is VLAN 1 Special in Cisco Networks. This time I thought it would be interesting to give some historical perspective on VLAN 1 and describe some of the implementation details that I learned from Francois Tallet. Francois was heavily involved in L2 and STP when at Cisco.

The 802.1Q standard was released at the end of 1998. Several years before that, Cisco had introduced Inter-Switch Links (ISL) and Dynamic ISL (DISL) to support VLANs. The main difference between ISL and 802.1Q is that ISL encapsulates the entire frame as opposed to 802.1Q that adds a field to the existing frame. DISL was a method of forming trunks dynamically, a predecessor to Dynamic Trunking Protocol (DTP) if you will.

Before VLANs and before ISL, it was simple to send control plane protocol frames such as CDP, PAgP, STP, etc. There was no concept of VLANs so there was no relation to VLANs or encapsulating/tagging the frames. When VLANs were introduced, now Continue reading

Encapsulation of PDUs On Trunk Ports

When I studied for my CCIE almost 15 years ago, I recall that I was fascinated by how different PDUs such as CDP, DTP, STP would have different encapsulations on a trunk depending on the configuration of it. What happens when you change the native VLAN? What happens if the native VLAN is not allowed on the trunk? What happens if you tag the native VLAN? There aren’t many resources describing this as most people don’t care for this level of detail, but there are situations where this is important. The goal of this post is to configure different protocols and see how they are encapsulated using different trunk configurations. You don’t need to consume this entire post, rather use it as a reference for different scenarios. Just be aware that some of this may be platform/OS specific.

The protocols we’ll cover for this post are:

CDP.
LLDP.
DTP.
PAgP.
LACP.
PVST+.
RPVST+.
MST.

The topology is going to be very simple, two switches connected by a single link:

These are IOSv-L2 devices:

SW1#show version
Cisco IOS Software, vios_l2 Software (vios_l2-ADVENTERPRISEK9-M), Experimental Version 15.2(20200924:215240) [sweickge-sep24-2020-l2iol-release 135]
Copyright (c) 1986-2020 by Cisco Systems, Inc.
Compiled Tue 29-Sep-20 11:53 by sweickge


 Continue reading

Adding Arista Switch to CML

I wanted to add Arista switches to CML to do some STP interopability testing. However, the process of adding them is not well described. I had to refer to some Youtube videos to understand what to do. This is what you’ll need for CML 2.7:

Download images from Arista software downloads.
Upload images to CML.
Create node- and image definition.

The first thing you need to do is to download images. Thankfully, Arista provides images for anyone that’s registered, whether you are an existing partner/customer, or not. Go to Arista’s login page and create an account if you don’t already have one. When logged in, go to Support -> Software Download:

When on the downloads page, scroll down until you see cEOS-lab and vEOS-lab. Expand the vEOS lab section:

You will need to download two images:

Aboot – Boot loader.
vEOS – The actual NOS.

Grab one of the Aboot images such as Aboot-veos-serial-8.0.2.ios:

The Aboot serial image outputs to serial while the other image outputs to VGA. I didn’t have any issues using the serial one in CML.

You’ll then need the actual vEOS file. Previously, there was a process needed to convert Continue reading

Detecting Mismatched Native VLANs

Many people have seen the message logged to their switch about a mismatched native VLAN on a trunk, but how is it detected? There are two methods of detecting mismatched native VLAN on a trunk link:

CDP.
STP when using a Per-VLAN flavor such as PVST+ or RPVST+.

To demonstrate how this happens, I will setup a very simple topology in CML with two switches connected by a trunk link as seen below:

At this point only the following has been configured on the trunk link:

interface GigabitEthernet0/0
 switchport trunk encapsulation dot1q
 switchport mode trunk
 negotiation auto

Now, let’s take a look at the PDUs being generated, CDP and STP. For CDP we can see the following in Wireshark:

Frame 31: 354 bytes on wire (2832 bits), 354 bytes captured (2832 bits)
IEEE 802.3 Ethernet 
Logical-Link Control
Cisco Discovery Protocol
    Version: 2
    TTL: 180 seconds
    Checksum: 0x474d [correct]
    [Checksum Status: Good]
    Device ID: SW2
    Software Version
    Platform: Cisco 
    Addresses
    Port ID: GigabitEthernet0/0
    Capabilities
    VTP Management Domain: 
    Native VLAN: 1
        Type: Native VLAN (0x000a)
        Length: 6
        Native VLAN: 1
    Duplex: Full
    Trust Bitmap: 0x00
    Untrusted port CoS: 0x00
    Management Addresses

Notice that the native VLAN is signaled and that it Continue reading

802.1Q-Tagged Frames Through Unmanaged Switch – Forwarded or Dropped?

As a follow-up to the post yesterday on native VLANs, there was a question on what would happen to 802.1Q-tagged frames traversing an unmanaged switch. Unmanaged in this case being a switch that does not support VLANs. While this might be more of a theoretical question today, it’s still interesting to dive into it to better understand how a 802.1Q-tagged frame is different from an untagged frame.

Before we can answer the question on what a VLAN-unaware switch should do, let’s refresh our memory on the Ethernet header. The Ethernet frame consists of Destination MAC, Source MAC, Ethertype, and FCS. 802.1Q adds an additional four bytes consisting of Tag Protocol Identifier (TPID) and Tag Control Information (TCI). This is shown below:

Note how the TPID in the tagged frame is in the place of EtherType for untagged frames. It’s also a 2-byte field and the TPID is set to 0x8100 for tagged frames. The EtherType field is still there and would be for example 0x0800 for IPv4 payload.

To demonstrate what this looks like on the wire, I’ve setup two routers with the following configuration:

hostname R1
!
vrf definition ETHERNET
 !
 address-family ipv4
 exit-address-family
!
interface GigabitEthernet1.100
 encapsulation  Continue reading

Why Do We Have Native VLANs?

Recently, my friend Andy Lapteff asked an excellent question. Why do we have native VLANs? As in, why allow untagged frames on a trunk link?

There was a time where we didn’t have VLANs. At first there was hubs, then bridges, multi-port bridges, and finally switches. Cisco was one of the first vendors to introduce VLANs, even before it became a standard, through the use of Inter Switch Links (ISL). ISL is long gone and encapsulated the entire Ethernet frame so native VLANs were not relevant there. In 1998, the 802.1Q standard was released.

In 802.1Q, 1.2 VLAN aims and benefits, the following is described:

a) VLANs are supported over all IEEE 802 LAN MAC protocols, and over shared media LANs as well as point-to-point LANs.
b) VLANs facilitate easy administration of logical groups of stations that can communicate as if they were on the same LAN. They also facilitate easier administration of moves, adds, and changes in members of these groups.
c) Traffic between VLANs is restricted. Bridges forward unicast, multicast, and broadcast traffic only
on LAN segments that serve the VLAN to which the traffic belongs.
d) As far as possible, VLANs maintain compatibility Continue reading

Why Didn’t We Have Anycast Gateways Before VXLAN?

A while back I started thinking about why it took so long before we started using anycast gateways. I started thinking about what would be the reason(s) for not doing it earlier. I came up with some good reasons and it started making sense to me. I then asked you all what your thoughts were and received a ton of great responses. Here are a few that were mentioned:

It was a natural evolution.
More powerful devices.
We didn’t have overlays.
There were no protocols to map what device a MAC sits behind.
Reusing the same IP would cause IP conflicts.

These are all certainly true to some degree. I would argue though that the main reason why we didn’t have it earlier is because of the topology and protocols we used in traditional LANs. The typical design was to have three layers, access, distribution, and core. The links in access to distribution layer were L2 only and the distribution layer had all the L3 configuration. A typical topology looked like this:

In a topology like this, there are only two devices that host the L3 configuration needed for hosts. When you have two of something, it’s natural to think Continue reading

Cisco vPC in VXLAN/EVPN Network – Part 6 – vPC Enhancements

There are a lot of options when it comes to vPC. What enhancements should you consider? I’ll go through some of the options worth considering.

Peer Switch – The Peer Switch feature changes how vPC behaves in regards to STP. Without this enabled, you would configure different STP priorities on the primary and secondary switch. The secondary switch forwards BPDUs coming from vPC-connected switches towards the primary switch. The secondary switch doesn’t process these received BPDUs. Only the primary switch sends BPDUs to the vPC-connected switches. Note that the secondary switch can process and send BPDUs to switches that are only connected to the secondary switch. Without Peer Switch it looks like this:

The BPDU sent by SW04 is not processed by SW02. It is forwarded towards SW01.
- SW04 BPDU is only sent initially. Port will become Root port and stop sending BPDUs.
SW02 sends BPDU towards SW05 as it is not connected with vPC. The BPDU has information about cost to Root (SW01).
SW02 doesn’t send BPDU towards SW03 as it is connected with vPC.
SW01 and SW02 have different STP priorities and send distinct BPDUs. They are not one switch from STP perspective.
- If SW01 goes down, STP Continue reading

Cisco vPC in VXLAN/EVPN Network – Part 5 – Potential Pitfalls

Like I hinted at in an earlier post, there are a some failure scenarios you need to consider for vPC. The first scenario we can’t really do much with, but I’ll describe it anyway. The topology is the one below:

Server4 needs to send a packet to Server1. Leaf4 has the following routes for 198.51.100.11:

Leaf4# show bgp l2vpn evpn 198.51.100.11
BGP routing table information for VRF default, address family L2VPN EVPN
Route Distinguisher: 192.0.2.3:32777
BGP routing table entry for [2]:[0]:[0]:[48]:[0050.56ad.8506]:[32]:[198.51.100.11]/272, version 13677
Paths: (2 available, best #2)
Flags: (0x000202) (high32 00000000) on xmit-list, is not in l2rib/evpn, is not in HW

  Path type: internal, path is valid, not best reason: Neighbor Address, no labeled nexthop
  AS-Path: NONE, path sourced internal to AS
    203.0.113.12 (metric 81) from 192.0.2.12 (192.0.2.2)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000 10001
      Extcommunity: RT:65000:10000 RT:65000:10001 SOO:203.0.113.12:0 ENCAP:8
          Router MAC:00ad.e688.1b08
      Originator: 192.0.2.3 Cluster list: 192.0.2.2 

  Advertised path-id 1
  Path type: internal, path is valid, is best path,  Continue reading

Cisco vPC in VXLAN/EVPN Network – Part 4 – Fabric Peering

Like I mentioned in a previous post, normally leafs don’t connect to leafs, but for vPC this is required. What if we don’t want to use physical interfaces for this interconnection? This is where fabric peering comes into play. Now, unfortunately my lab, which is virtual, does not support fabric peering so I will just introduce you to the concept. Let’s compare the traditional vPC to fabric peering, starting with traditional vPC:

The traditional vPC has the following pros and cons:

Pros:
- No dependency on other devices for peer link and peer keepalive link.
- No contention for bandwidth on interfaces as they are dedicated.
- This also means no QoS configuration is required.
- Intent of configuration is clear with dedicated interfaces.
Cons:
- Requires dedicated interfaces that could be used for something else.
- Interfaces have a cost, both from perspective of buying the switch, but also SFPs.

Now let’s compare that to fabric peering:

Fabric peering has the following pros and cons:

Pros:
- No dedicated interfaces required.
- Thus reducing cost.
- Resiliency as there are multiple paths between the two switches.
Cons:
- Dependency to other devices.
- Dependency to underlay.
- Contention for bandwidth with other traffic.
- May require QoS.
- May be more difficult to Continue reading

Cisco vPC in VXLAN/EVPN Network – Part 3 – Verifying Connectivity

The following topology is used:

We want to verify connectivity and traffic flow towards:

Gateway of Server3.
Server1.
Server2.
Server4.

Let’s start with the gateway. The gateway is at 10.0.0.1 and has a MAC address of 0001.0001.0001:

server3:~$ ip neighbor | grep 10.0.0.1
10.0.0.1 dev bond0 lladdr 00:01:00:01:00:01 STALE

This is an anycast gateway MAC. When initiating a ping towards 10.0.0.1, it can go to either Leaf1 or Leaf2. I will run Ethanalyzer on the switches to confirm which one is receiving the ICMP Echo Request:

Leaf1# ethanalyzer local interface inband display-filter "icmp" limit-captured-frames 0
Capturing on 'ps-inb'

Leaf2# ethanalyzer local interface inband display-filter "icmp" limit-captured-frames 0
Capturing on 'ps-inb'

Then initiate ping from Server3:

server3:~$ ping 10.0.0.1
PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
64 bytes from 10.0.0.1: icmp_seq=1 ttl=255 time=1.19 ms
64 bytes from 10.0.0.1: icmp_seq=2 ttl=255 time=1.29 ms
^C
--- 10.0.0.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 1.192/1.242/1.292/0.050 ms

Cisco vPC in VXLAN/EVPN Network – Part 2 – Configuring vPC

When building leaf and spine networks, leafs connect to spines, but leafs don’t connect to leafs, and spines don’t connect to spines. There are exceptions to this and vPC is one of those exceptions. The leafs that are going to be part of the same vPC need to connect to each other. There are two ways of achieving this:

Physical interfaces.
Fabric peering.

We will first use physical interfaces and then later remove that and use fabric peering. Now, my lab is virtual so take physical with a grain of salt, but they will be dedicated interfaces. The following will be required to configure vPC:

Enable vPC.
Enable LACP.
Create vPC domain.
Create a VRF for the vPC peer keepalive.
Configure the interface for vPC peer keepalive.
Configure the vPC peer keepalive.
Configure the vPC peer link interfaces.
Configure the vPC peer link.

This is shown below:

The vPC is now up:

Leaf1# show vpc
Legend:
                (*) - local vPC is down, forwarding via vPC peer-link

vPC domain id                     : 1   
Peer status                       : peer adjacency formed ok      
vPC keep-alive status             : peer is alive                 
Configuration consistency status  : success 
Per-vlan consistency status       : success                       
Type-2 consistency status         : success 
 Continue reading

Cisco vPC in VXLAN/EVPN Network – Part 1 – Anycast VTEP

Many vendors offer MLAG features, that is, the ability to form a PortChannel (some vendors call it trunk or bond) towards two separate devices. In this post, I will cover the following:

Briefly describe vPC in a traditional network.
Describe vPC in a VXLAN/EVPN network.
Configure leaf switches to support vPC.
Setup of Ubuntu Linux host to bond two interfaces and use LACP.
Verification of the setup.

Traditional vPC

On Cisco Nexus switches, virtual Port Channel (vPC) has been a highly used feature for many years. It has been used towards other network devices such as firewalls, routers, and switches, but also towards hosts running hypervisors such as ESX.

As opposed to other technologies such as Virtual Switching System (VSS) or StackWise Virtual, it does not require the two switches to become one to provide the ability to do MLAG. Instead, the two devices appear as one in PDUs such as LACP, STP, and IGMP, by using a vPC system MAC address as the source MAC. With MLAG features, the two switches need to verify the other is alive and also synchronize state and perform consistency checking. This is done by connecting them with a vPC peer keepalive link, and Continue reading

CCNA 200-301 Updated To Version 1.1

Cisco is updating the Cisco Certified Network Associate (CCNA) exam to version 1.1. In the past, Cisco only did major updates to their exams. Since then, they have moved to doing more frequent and minor updates, in a more agile fashion. Before going in to the changes, let’s answer some common questions that are covered in Cisco’s FAQ:

Why is the CCNA being updated?
Cisco regularly performs reviews of their exams. Exams get updated to clarify exam topics, introduce new ones, and phase out outdated products and solutions.

What is being added?
New topics include generative AI, cloud network management, and machine learning.

When can candidates register for CCNA v1.1?
Registration begins on August 20, 2024.

What if I’m already studing for CCNA v1.0?
Complete your study and take the CCNA v1.0 exam.

What percentage of the exam is being updated?
Approximately 10% of the exam is updated.

When is the last day to test for CCNA v1.0?
The last day of testing for CCNA v1.0 is August 19, 2024.

So what is being changed? The different domains and their percentages is not changing. The domains and their percentage remain as:

1.0 Network Continue reading

1000BASE-T Part 4 – Link Down Detection

In the previous three parts, we learned about all the interesting things that go on in the PHY with scrambling, descrambling, synchronization, auto negotiation, FEC encoding, and so on. This is all essential knowledge that we need to have to understand how the PHY can detect that a link has gone down, or is performing so badly that it doesn’t make sense to keep the link up.

What Does IEEE 802.3 1000BASE-T Say?

The function in 1000BASE-T that is responsible for monitoring the status of the link is called link monitor and is defined in 40.4.2.5. The standard does not define much on what goes on in link monitor, though. Below is an excerpt from the standard:

Link Monitor determines the status of the underlying receive channel and communicates it via the variable
link_status. Failure of the underlying receive channel typically causes the PMA’s clients to suspend normal
operation.
The Link Monitor function shall comply with the state diagram of Figure 40–17.

The state diagram (redrawn by me) is shown below:

While 1000BASE-T leaves what the PHY monitors in link monitor to the implementer, there are still some interesting variables and timers that you should be Continue reading

« Previous 1 2 3 4 … 12 Next »