ddib

Author Archives: ddib

CCNA 200-301 Updated To Version 1.1

Cisco is updating the Cisco Certified Network Associate (CCNA) exam to version 1.1. In the past, Cisco only did major updates to their exams. Since then, they have moved to doing more frequent and minor updates, in a more agile fashion. Before going into the changes, let’s answer some common questions that are covered in Cisco’s FAQ:

    Why is the CCNA being updated?
    Cisco regularly performs reviews of their exams. Exams get updated to clarify exam topics, introduce new ones, and phase out outdated products and solutions.

    What is being added?
    New topics include generative AI, cloud network management, and machine learning.

    When can candidates register for CCNA v1.1?
    Registration begins on August 20, 2024.

    What if I’m already studying for CCNA v1.0?
    Complete your study and take the CCNA v1.0 exam.

    What percentage of the exam is being updated?
    Approximately 10% of the exam is updated.

    When is the last day to test for CCNA v1.0?
    The last day of testing for CCNA v1.0 is August 19, 2024.

    So what is being changed? The domains and their percentages are not changing. They remain as:

    1000BASE-T Part 4 – Link Down Detection

    In the previous three parts, we learned about all the interesting things that go on in the PHY with scrambling, descrambling, synchronization, auto negotiation, FEC encoding, and so on. This is all essential knowledge that we need to have to understand how the PHY can detect that a link has gone down, or is performing so badly that it doesn’t make sense to keep the link up.

    What Does IEEE 802.3 1000BASE-T Say?

    The function in 1000BASE-T that is responsible for monitoring the status of the link is called link monitor and is defined in 40.4.2.5. The standard does not define much on what goes on in link monitor, though. Below is an excerpt from the standard:

    Link Monitor determines the status of the underlying receive channel and communicates it via the variable
    link_status. Failure of the underlying receive channel typically causes the PMA’s clients to suspend normal
    operation.
    The Link Monitor function shall comply with the state diagram of Figure 40–17.

    The state diagram (redrawn by me) is shown below:

    While 1000BASE-T leaves what the PHY monitors in link monitor to the implementer, there are still some interesting variables and timers that you should be Continue reading

    Troubleshooting vPC in My Virtual Lab

    I’m preparing a blog post on setting up vPC in a VXLAN/EVPN environment. While doing so, I ran into some issues. Rather than simply fixing them, I wanted to share the troubleshooting experience as it can be useful to see all the things I did to troubleshoot, including commands, packet captures, etc., and learn a little about virtual networking. As always, thanks to Peter Palúch for providing assistance with the process.

    Topology

    The following topology implemented in ESX is used:

    Background

    I had just configured the vPC peer link and vPC peer link keepalive. I verified that the vPC was functional with the following command:

    Leaf1# show vpc
    Legend:
                    (*) - local vPC is down, forwarding via vPC peer-link
    
    vPC domain id                     : 1   
    Peer status                       : peer adjacency formed ok      
    vPC keep-alive status             : peer is alive                 
    Configuration consistency status  : success 
    Per-vlan consistency status       : success                       
    Type-2 consistency status         : success 
    vPC role                          : primary                       
    Number of vPCs configured         : 1   
    Peer Gateway                      : Disabled
    Dual-active excluded VLANs        : -
    Graceful Consistency Check        : Enabled
    Auto-recovery status              : Disabled
    Delay-restore status              : Timer is off.(timeout = 30s)
    Delay-restore SVI status          : Timer is off.(timeout =  Continue reading

    1000BASE-T Part 3 – Autonegotiation

    In this post, we’ll take a closer look at auto negotiation. Auto negotiation has the following characteristics:

    • It is required to be supported.
    • Transmits capabilities for speed/duplex.
    • Negotiates Energy Efficient Ethernet (EEE) capabilities.
    • Determines the leader/follower relationship on the link.
    • Needed for PHY Control, a PMA subfunction.
    • Performed when initializing the link.
    • Auto-MDIX.

    The Auto Negotiation transmitter and receiver are actually a separate system in their own right. In multi-speed PHY devices, auto negotiation is used to select the highest speed that both sides of the link are capable of, before the link is trained. However, it is important to understand that support for auto negotiation is not optional, but the standard does not require it to be implemented (thanks to Eric Peterson for clarifying this). A leader and follower must be decided so that clock synchronization can take place. Without auto negotiation, this would have to be manually configured. On some devices it is possible to configure speed on a 1000BASE-T interface. However, this normally does not disable auto negotiation, but rather limits what capabilities get advertised.
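    The idea of selecting the highest common capability can be sketched in a few lines of Python. This is only an illustration: the capability names and the `resolve` function are hypothetical, and the priority list is a simplified subset of the priority resolution defined in IEEE 802.3, not the full table.

```python
# Simplified, illustrative priority list (most preferred first).
# The names are hypothetical labels; the real IEEE 802.3 priority
# resolution table contains more technologies than shown here.
PRIORITY = [
    "1000BASE-T full duplex",
    "1000BASE-T half duplex",
    "100BASE-TX full duplex",
    "100BASE-TX half duplex",
    "10BASE-T full duplex",
    "10BASE-T half duplex",
]


def resolve(local_advertised, peer_advertised):
    """Return the most preferred mode both sides advertise, or None."""
    common = set(local_advertised) & set(peer_advertised)
    for mode in PRIORITY:
        if mode in common:
            return mode
    return None


# A port configured for 100 Mbit/s still negotiates; it just advertises less.
limited_port = ["100BASE-TX full duplex", "100BASE-TX half duplex"]
print(resolve(limited_port, PRIORITY))
```

    Note how configuring a lower speed in this model does not turn negotiation off. It only shrinks the set of advertised capabilities, so the common mode ends up lower.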

    Auto negotiation is performed using Fast Link Pulses (FLP). Historically, 10BASE-T used Link Test Pulse (LTP) to verify the integrity Continue reading

    1000BASE-T Part 2 – Deepdive

    In 1000BASE-T Part 1, we reviewed the layers and what their purpose is. Now we’re going to go much deeper into the layers that relate to the PHY, which are PCS, PMA, and Autonegotiation. First though, let’s review the objectives of 1000BASE-T:

    • Support the CSMA/CD MAC.
    • Comply with specifications for GMII (Clause 35).
    • Support 1000 Mbit/s repeater (Clause 41).
    • Provide line transmission that supports full and half duplex operation.
    • Meet or exceed FCC Class A/CISPR or better operation.
    • Support operation over 100 meters of copper balanced cabling (defined in 40.7).
    • Bit Error Ratio less than or equal to 10^-10.
    • Support Auto negotiation (Clause 28).

    How does 1000BASE-T achieve a bandwidth of 1000 Mbit/s? As you probably know, the twisted pair cable consists of four pairs, eight wires in total, where transmit and receive are separated to achieve full duplex operation:

    The meaning of hybrid in this context is that transmit and receive are performed on the same pair. Every pair is capable of 250 Mbit/s data rate, for a total of 1000 Mbit/s. As PAM-5 encoding is used (more on this later), the baud rate is 125 MBd. This means that the PHY receives 8-bit words to send every Continue reading
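    The arithmetic behind these numbers can be checked with a short Python sketch. It assumes that each PAM-5 symbol carries two data bits per pair (8 bits spread over four pairs per symbol period) and that the symbol rate is 125 million symbols per second. The function names are made up for illustration.

```python
PAIRS = 4                  # twisted pairs in the cable
SYMBOL_RATE_MBAUD = 125    # million symbols per second, per pair
DATA_BITS_PER_SYMBOL = 2   # assumption: 2 data bits per PAM-5 symbol per pair


def per_pair_mbps():
    """Data rate carried by a single pair in Mbit/s."""
    return SYMBOL_RATE_MBAUD * DATA_BITS_PER_SYMBOL


def total_mbps():
    """Aggregate data rate across all pairs in Mbit/s."""
    return per_pair_mbps() * PAIRS


def symbol_period_ns():
    """Duration of one symbol period in nanoseconds."""
    return 1000 / SYMBOL_RATE_MBAUD


print(per_pair_mbps(), total_mbps(), symbol_period_ns())
```

    With these assumptions, one 8-bit word is transmitted per symbol period, two bits on each of the four pairs.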

    1000BASE-T Part 1 – Introduction

    How does Ethernet detect that a link goes down? I asked myself this a couple of weeks ago, thinking it was a simple question, and realized I didn’t have a very good answer. I had more to learn about Ethernet and the physical layer, and so does pretty much the entire networking industry. Through the gracious help of Peter Jones at Cisco, I got in touch with George Zimmerman, an independent professional with a PhD in electrical engineering and a history of teaching at Caltech, who works within the IEEE on different standards. To answer my initial question, we first need to understand more about Ethernet, and especially the physical layer. As every version of Ethernet has a slightly different PHY, I will be covering 1000BASE-T. This will be covered in a series of posts, this being the first.

    Going back to the OSI model, most roles in networking put the focus on layers two to four:

    This is natural as most of our work relates to these layers.

    When we think of two hosts communicating, we imagine that the transceivers connect to each other and that there are ones and zeroes traveling across the cable:

    Continue reading

    How Anycast VTEP Broke My Lab And What I Learned

    I’m preparing a massive blog post on vPC in the context of VXLAN/EVPN and while doing so I accidentally broke my lab. What a great learning experience! I thought I would share it with you and how to perform troubleshooting of this scenario.

    My topology looks like this:

    Before I made any changes, there was full connectivity between these hosts, meaning that both bridging and routing were working. I then changed the loopback1 (NVE source interface) configuration of Leaf-1 and Leaf-2 to add a secondary IP. This was the initial configuration:

    ! Leaf-1
    interface loopback1
      description VTEP
      ip address 203.0.113.1/32
      ip router ospf UNDERLAY area 0.0.0.0
      ip pim sparse-mode
    ! Leaf-2
    interface loopback1
      description VTEP
      ip address 203.0.113.2/32
      ip router ospf UNDERLAY area 0.0.0.0
      ip pim sparse-mode

    This then changed to:

    ! Leaf-1
    interface loopback1
      description VTEP
      ip address 203.0.113.1/32
      ip address 203.0.113.12/32 secondary
      ip router ospf UNDERLAY area 0.0.0.0
      ip pim sparse-mode
    ! Leaf-2
    interface loopback1
      description VTEP
      ip address 203.0.113.2/32
      ip address 203.0.113.12/32 secondary
      ip router ospf UNDERLAY area 0.0.0.0
      ip pim  Continue reading

    Routed Packet Walk in VXLAN/EVPN Network

    In a previous post, I walked through how a packet gets bridged in a VXLAN/EVPN network. In this post, I’ll go through how a packet gets routed, that is, how a packet gets from one VNI to another VNI. The following topology will be used:

    The lab has the following characteristics:

    • OSPF in the underlay.
    • Ingress replication for BUM traffic through the use of EVPN.
    • ARP suppression is enabled.

    Server-2 initiates a ping towards Server-4:

    Frame 562: 98 bytes on wire (784 bits), 98 bytes captured (784 bits) on interface ens257, id 4
    Ethernet II, Src: 00:50:56:ad:f4:8d, Dst: 00:01:00:01:00:01
    Internet Protocol Version 4, Src: 10.0.0.22, Dst: 198.51.100.44
    Internet Control Message Protocol
        Type: 8 (Echo (ping) request)
        Code: 0
        Checksum: 0xd745 [correct]
        [Checksum Status: Good]
        Identifier (BE): 17 (0x0011)
        Identifier (LE): 4352 (0x1100)
        Sequence Number (BE): 1 (0x0001)
        Sequence Number (LE): 256 (0x0100)
        [Response frame: 563]
        Timestamp from icmp data: Mar  3, 2024 08:38:35.804470000 Romance Standard Time
        [Timestamp from icmp data (relative): 0.000701509 seconds]
        Data (40 bytes)

    The destination MAC is 0001.0001.0001 which is the Anycast GW MAC configured on Leaf-2. As this MAC is used on SVI for VLAN 20 of Leaf-2, the Continue reading
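    The capture shows the MAC as 00:01:00:01:00:01 while the switch notation is 0001.0001.0001. As a rough sketch of the decision a leaf makes on ingress, the Python below normalizes both notations and treats a frame addressed to the gateway MAC as a candidate for routing, and anything else as a candidate for bridging. The function names are hypothetical, and a real switch does considerably more than this single comparison.

```python
HEX_DIGITS = "0123456789abcdef"


def normalize_mac(mac):
    """Strip separators so colon and dotted notations compare equal."""
    return "".join(c for c in mac.lower() if c in HEX_DIGITS)


def forwarding_action(dst_mac, gateway_mac):
    """Return 'route' if the frame is addressed to the gateway MAC, else 'bridge'."""
    if normalize_mac(dst_mac) == normalize_mac(gateway_mac):
        return "route"
    return "bridge"


# Destination MAC from the capture compared with the configured Anycast GW MAC.
print(forwarding_action("00:01:00:01:00:01", "0001.0001.0001"))
```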

    EVPN – Asymmetric vs Symmetric IRB

    It is well known that VXLAN supports bridging frames, that is, forwarding frames that belong to the same L2 segment. In the beginning, this is all that was supported. There was no VXLAN routing. In essence, the HW didn’t support taking a VXLAN encapsulated packet, decapsulating it, and then performing a L3 lookup. This meant that another device was needed to do the L3 lookup. Think of it as router on a stick where the VTEP would decapsulate the packet and forward it (based on L2 lookup) to a gateway. This gateway needed to have L3 interfaces for all the L2 VNIs that needed routing. Now, this is still applicable in a design where a FW should inspect traffic between all VNIs, but HW has supported VXLAN routing for a long time, that is, taking a packet from one VNI and routing it to another VNI. This is referred to as Integrated Routing and Bridging (IRB), as the device is capable of both bridging and routing packets. IRB is described in RFC 9135.

    There are two types of IRB, asymmetric and symmetric. Asymmetric vs symmetric refers to how the lookup is performed to do routing. Let’s first take a Continue reading

    EVPN Terminology

    RFCs are a great source of information for understanding all the details of a protocol. Often they do require the reader to be quite technical and the terminology can be confusing if you aren’t used to the type of language and writing style used in RFCs. In this post, I go through some of the most important terminology in EVPN and VXLAN to help you build your understanding of the different forwarding constructs and how they interact.

    The picture below shows some of the most important terminology in EVPN:

    Let’s go through the terms used in the diagram and some additional ones:

    • Attachment circuit – An interface that is associated with a bridge table. The AC that the packet arrived on is determined by examining the port, and optionally VLAN tag.
    • Broadcast Domain – The Broadcast domain consists of all devices and hosts that would receive a broadcast frame when sent in that domain (assuming no ARP optimization features are used). This is normally a VLAN, and it normally maps to one subnet. From a VXLAN perspective, it would be a L2 VNI. An EVI may contain one or more BDs depending on service model.
    • Bridge Table – Bridge Table Continue reading

    Bridging Packet Walk In VXLAN/EVPN Network

    In this post I walk you through all the steps and packets involved in two hosts communicating over a L2 VNI in a VXLAN/EVPN network. The topology below is the one we will be using:

    The lab has the following characteristics:

    • OSPF in the underlay.
    • Ingress replication for BUM traffic through the use of EVPN.
    • ARP suppression is enabled.
    • ARP cache is cleared on Server-1 and Server-4 before initiating the packet capture.
    • Server-1 is the host sourcing traffic by pinging Server-4.

    Server-1 clears the ARP entry for Server-4 and initiates the ping:

    sudo ip neighbor del 198.51.100.44 dev ens160
    ping 198.51.100.44
    PING 198.51.100.44 (198.51.100.44) 56(84) bytes of data.
    64 bytes from 198.51.100.44: icmp_seq=1 ttl=64 time=6.38 ms
    64 bytes from 198.51.100.44: icmp_seq=2 ttl=64 time=4.56 ms
    64 bytes from 198.51.100.44: icmp_seq=3 ttl=64 time=4.60 ms

    Below is a packet capture showing the ARP request from Server-1:

    Frame 7854: 60 bytes on wire (480 bits), 60 bytes captured (480 bits) on interface ens257, id 4
    Ethernet II, Src: 00:50:56:ad:85:06, Dst: ff:ff:ff:ff:ff:ff
    Address Resolution Protocol (request)
        Hardware type: Ethernet (1)
        Protocol type:  Continue reading

    Why Is BFD More Light Weight Than Routing Hellos?

    There are many articles on BFD. It is well known that BFD has the following advantages over routing protocol hellos/keepalives:

    • BFD is more light weight than hellos/keepalives.
    • Multiple clients can register to BFD instead of configuring each protocol with aggressive timers.
    • On some platforms, BFD can be offloaded to the hardware instead of the CPU.
    • BFD provides faster timers than routing protocols.
    • BFD is less CPU intensive.

    What does light weight mean, though? Does it mean that the packets are smaller? Let’s compare a BFD packet to an OSPF Hello. Starting with the OSPF Hello:

    Frame 269: 114 bytes on wire (912 bits), 114 bytes captured (912 bits) on interface ens192, id 1
    Ethernet II, Src: 00:50:56:ad:8d:3c, Dst: 01:00:5e:00:00:05
    Internet Protocol Version 4, Src: 203.0.113.0, Dst: 224.0.0.5
    Open Shortest Path First
        OSPF Header
            Version: 2
            Message Type: Hello Packet (1)
            Packet Length: 48
            Source OSPF Router: 192.168.128.223
            Area ID: 0.0.0.0 (Backbone)
            Checksum: 0x7193 [correct]
            Auth Type: Null (0)
            Auth Data (none): 0000000000000000
        OSPF Hello Packet
        OSPF LLS Data Block
    

    There are 114 bytes on the wire consisting of:

    Catalyst SD-WAN Enhanced Application Aware Routing

    Traditionally, Cisco has leveraged BFD to monitor tunnels and their performance and Application Aware Routing (AAR) to reroute traffic. BFD has been used to measure:

    • Latency.
    • Loss.
    • Jitter.

    Additionally, BFD is also used to verify liveness of the tunnels. This works well, but there are some drawbacks to using a separate protocol for measuring performance:

    • You are adding control plane packets competing for bandwidth with packets in the data plane.
    • Sending control plane packets frequently may overload the control plane.
      • This may lead to false positives.
    • It’s not guaranteed that control plane packets and data plane packets are treated equally.
    • AAR did take some time to react to poor transports as it had to collect enough measurements before reacting.
    • AAR didn’t have a built-in dampening mechanism.

    With the default BFD settings, BFD packets are sent every second. The default AAR configuration consists of six buckets that hold 10 minutes of data each. This means that with the default settings, AAR will react in 10-60 minutes depending on how poorly the transport is performing. The most aggressive AAR configuration recommended by Cisco was to have 5 buckets holding 2 minutes of data each. AAR would then react in 2-10 minutes which I Continue reading
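    The reaction times quoted above follow directly from the bucket settings. The Python sketch below assumes, as the text describes, that the fastest reaction takes one full bucket and the slowest takes all buckets. The function name is hypothetical and this is not how the product computes it internally.

```python
def reaction_window_minutes(buckets, minutes_per_bucket):
    """Return (fastest, slowest) AAR reaction time in minutes.

    Assumption: the fastest reaction needs one full bucket of data and
    the slowest needs every bucket to be filled.
    """
    fastest = minutes_per_bucket
    slowest = buckets * minutes_per_bucket
    return fastest, slowest


# Default configuration: six buckets of 10 minutes each.
print(reaction_window_minutes(6, 10))
# Most aggressive recommended configuration: five buckets of 2 minutes each.
print(reaction_window_minutes(5, 2))
```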

    Catalyst SD-WAN 20.13 – RBAC

    Catalyst SD-WAN has supported Role Based Access Control (RBAC) for a long time. It has been possible to use predefined roles or create custom roles and define what areas the user should have access to. However, before 20.13 it was not possible to define a scope. In large companies it’s quite common that one group manages one set of devices, for example all the sites in EU, all the sites in the US, etc. There may also be multiple business units within the company which may share some infrastructure but operate autonomously from each other where a BU should only have access to its own set of devices. As of 20.13, it is now possible to define scope when using RBAC in Catalyst SD-WAN.

    There is another feature, called Network Hierarchy, that is somewhat related to RBAC. When onboarding devices, you assign a Site ID to the device. The site is then assigned a name in the format of SITE_SiteID, for example SITE_10 when using a Site ID of 10. By default all sites belong to the global node as can be seen below:

    Note that it says Auto-Generated site. It is possible to edit the site Continue reading

    NX-OS Forwarding Constructs For VXLAN/EVPN

    In this post we will look at the forwarding constructs in NX-OS in the context of VXLAN and EVPN. Having knowledge of the forwarding constructs helps both with understanding the protocols and with troubleshooting. BRKDCN-3040 from Cisco Live has a nice overview of the components involved:

    There are components that are platform independent (PI) and platform dependent (PD). Below I’ll explain what each component does:

    • ARP – Information from ARP requests/responses is needed to build adjacencies. The information learned from ARP is used to populate IP address field in RT2 and hence also to populate the ARP suppression cache.
    • IPv6 ND – ND fills the role of ARP, but for IPv6.
    • Adjacency Manager – Resolves the MAC addresses of directly attached hosts.
    • Host Mobility Manager – Tracks the endpoints and their movements.
    • L2FM – The Layer2 Forwarding Manager. A platform dependent component that programs ASICs for L2 forwarding. Keeps track of MAC addresses, their placement and moves, and synchronizes this information across ASICs, line cards, and vPC peers when vPC is in use.
    • MFDM – Multicast Forwarding Database Manager. A platform dependent component that programs ASICs with information to perform multicast forwarding.
    • L2RIB – The component that handles Continue reading

    EVPN Route Type 5

    In a previous post, EVPN Deepdive Route Types 2 and 3, I covered route types 2 and 3. In this post I’ll cover route type 5 which is used for advertising IP prefixes. This route type is covered in RFC 9136.

    There are two main use cases for advertising IP prefixes in EVPN route type 5:

    • Advertising external prefixes into the VXLAN network.
    • Advertising prefixes for connectivity towards silent hosts.

    The first scenario is pretty obvious. There are other places in the network, such as remote offices via a WAN, partners and external parties, as well as the internet. To route towards these destinations, a route type is needed and this is route type 5. Remember, route type 2 only provides host routing which poses the following problems for external connectivity:

    • Advertising everything as /32 and /128 would be highly inefficient.
    • It requires an EVPN speaker to generate the RT2 and the external prefixes are originated from non-EVPN speakers.
    • It would not be possible to advertise a default route.
    • Without RT5, external connectivity would have to be advertised by a protocol other than EVPN.

    The last bullet may be worth expanding a bit on. If the external prefixes aren’t advertised Continue reading

    Simulate a Silent Host in a VXLAN Network

    I’m working on a blog post explaining route type 5 in EVPN. To demonstrate a scenario with a silent host, I want to simulate this behavior. Normally, hosts can be quite chatty and ARP for their GW, for example. In this post I will show how arptables on Linux can be used to simulate a silent host.

    Currently the leaf switch has an ARP entry for the host:

    Leaf4# show ip arp vrf Tenant1
    
    Flags: * - Adjacencies learnt on non-active FHRP router
           + - Adjacencies synced via CFSoE
           # - Adjacencies Throttled for Glean
           CP - Added via L2RIB, Control plane Adjacencies
           PS - Added via L2RIB, Peer Sync
           RO - Re-Originated Peer Sync Entry
           D - Static Adjacencies attached to down interface
    
    IP ARP Table for context Tenant1
    Total number of entries: 1
    Address         Age       MAC Address     Interface       Flags
    198.51.100.44   00:15:20  0050.56ad.7d68  Vlan10           

    It is possible to ping the host from the leaf switch:

    Leaf4# ping 198.51.100.44 vrf Tenant1
    PING 198.51.100.44 (198.51.100.44): 56 data bytes
    64 bytes from 198.51.100.44: icmp_seq=0 ttl=63 time=1.355 ms
    64 bytes from 198.51.100.44:  Continue reading

    VXLAN/EVPN – Host routing

    In a previous post, Advertising IPs In EVPN Route Type 2, I described use cases for advertising IP addresses in EVPN route type 2. I have already covered host ARP and host mobility, so today we will focus on host routing.

    To be able to show this scenario, I have added another server (SERVER-2) and will be using the topology below:

    There is already existing configuration for VLAN 10 (L2 VNI) and for VLAN 100 (L3 VNI) which is shown below:

    vrf context Tenant1
      vni 10001
      rd auto
      address-family ipv4 unicast
        route-target both auto
        route-target both auto evpn
    !
    interface Vlan10
      no shutdown
      vrf member Tenant1
      ip address 198.51.100.1/24
      fabric forwarding mode anycast-gateway
    !
    interface Vlan100
      no shutdown
      mtu 9216
      vrf member Tenant1
      ip forward

    To get SERVER-2 connected the following is needed:

    • Configure VLAN 20 and map it to L2 VNI (VNI 10002).
    • Make the L2 VNI a member of the NVE.
    • SVI for VLAN 20.
    • Configure port towards SERVER-2 in VLAN 20.

    This is shown below:

    vlan 20
      vn-segment 10002
    !
    interface nve1
      member vni 10002
        ingress-replication protocol bgp
    !
    interface Vlan20
      no shutdown
      vrf member Tenant1
      ip address 10.0.0.1/24
      fabric forwarding mode anycast-gateway
    !
    interface Ethernet1/3
       Continue reading

    VXLAN/EVPN – Host mobility

    In the previous post VXLAN/EVPN – Host ARP, I talked about how knowing the MAC/IP of endpoints allows for ARP suppression. In this post we’ll take a look at host mobility. The topology used is the same as in the previous post:

    Currently SERVER-1 is connected to LEAF-1. What happens if SERVER-1 moves to LEAF-2? This would be a common scenario for a virtual infrastructure. First let’s take a look at what routes LEAF-4 has for SERVER-1:

    Leaf4# show bgp l2vpn evpn 0050.56ad.8506
    BGP routing table information for VRF default, address family L2VPN EVPN
    Route Distinguisher: 192.0.2.3:32777
    BGP routing table entry for [2]:[0]:[0]:[48]:[0050.56ad.8506]:[0]:[0.0.0.0]/216, version 662
    Paths: (2 available, best #2)
    Flags: (0x000202) (high32 00000000) on xmit-list, is not in l2rib/evpn, is not in HW
    
      Path type: internal, path is valid, not best reason: Neighbor Address, no labeled nexthop
      AS-Path: NONE, path sourced internal to AS
        203.0.113.1 (metric 81) from 192.0.2.12 (192.0.2.2)
          Origin IGP, MED not set, localpref 100, weight 0
          Received label 10000
          Extcommunity: RT:65000:10000 ENCAP:8
          Originator: 192.0.2.3 Cluster list: 192.0.2.2 
    
      Advertised  Continue reading

    VXLAN/EVPN – Host ARP

    In the last post Advertising IPs In EVPN Route Type 2, I described how to get IPs advertised in EVPN route type 2, but why do we need it? There are three main scenarios where having the MAC/IP mapping is useful:

    • Host ARP.
    • Host mobility.
    • Host routing.

    In this post I will cover the first use case and the topology below will be used:

    Host ARP

    When two hosts in the same subnet want to send Ethernet frames to each other, they will ARP to discover the MAC address of the other host. This is no different in a VXLAN/EVPN network. The ARP frame, which is broadcast, will have to be flooded to other VTEPs either using multicast in the underlay or by ingress replication. Because the frame is broadcast, it will have to go to all the VTEPs that have that VNI. The scenario with ingress replication is shown below:

    In this scenario, SERVER-1 is sending an ARP request to get the MAC address of SERVER-4. As all leafs are participating in the L2 VNI, LEAF-1 will perform ingress replication and send it to all leafs. However, sending the ARP request to LEAF-2 and LEAF-3 is not needed Continue reading
