There are many articles on BFD. It is well known that BFD has the following advantages over routing protocol hellos/keepalives:
What does light weight mean, though? Does it mean that the packets are smaller? Let’s compare a BFD packet to an OSPF Hello. Starting with the OSPF Hello:
Frame 269: 114 bytes on wire (912 bits), 114 bytes captured (912 bits) on interface ens192, id 1 Ethernet II, Src: 00:50:56:ad:8d:3c, Dst: 01:00:5e:00:00:05 Internet Protocol Version 4, Src: 203.0.113.0, Dst: 220.127.116.11 Open Shortest Path First OSPF Header Version: 2 Message Type: Hello Packet (1) Packet Length: 48 Source OSPF Router: 192.168.128.223 Area ID: 0.0.0.0 (Backbone) Checksum: 0x7193 [correct] Auth Type: Null (0) Auth Data (none): 0000000000000000 OSPF Hello Packet OSPF LLS Data Block
There’s 114 bytes on the wire consisting of:
Traditionally, Cisco has leveraged BFD to monitor tunnels and their performance and Application Aware Routing (AAR) to reroute traffic. BFD has been used to measure:
Additionally, BFD is also used to verify liveliness of the tunnels. This works well, but there are some drawbacks to using a separate protocol for measuring performance:
With the default BFD settings, BFD packets are sent every second. The default AAR configuration consists of six buckets that hold 10 minutes of data each. This means that with the default settings, AAR will react in 10-60 minutes depending on how poorly the transport is performing. The most aggressive AAR configuration recommended by Cisco was to have 5 buckets holding 2 minutes of data each. AAR would then react in 2-10 minutes which I Continue reading
Catalyst SD-WAN has supported Role Based Access Control (RBAC) for a long time. It has been possible to use predefined roles or create custom roles and defining what areas the user should have access to. However, before 20.13 it was not possible to define a scope. In large companies it’s quite common that one group manages one set of devices, for example all the sites in EU, all the sites in the US, etc. There may also be multiple business units within the company which may share some infrastructure but operate autonomously from each other where a BU should only have access to its own set of devices. As of 20.13, it is not possible to define scope when using RBAC in Catalyst SD-WAN.
There is another feature, called Network Hierarchy that is somewhat related to RBAC. When onboarding devices, you assign a Site ID to the device. The site is then assigned a name in the format of SITE_SiteID, for example SITE_10 when using a Site ID of 10. By default all sites belong to the global node as can be seen below:
Note that it says Auto-Generated site. It is possible to edit the site Continue reading
In this post we will look at the forwarding constructs in NX-OS in the context of VXLAN and EVPN. Having knowledge of the forwarding constructs helps both with understanding of the protocols, but also to assist in troubleshooting. BRKDCN-3040 from Cisco Live has a nice overview of the components involved:
There are components that are platform independent (PI) and platform dependent (PD). Below I’ll explain what each component does:
In a previous post, EVPN Deepdive Route Types 2 and 3, I covered route types 2 and 3. In this post I’ll cover route type 5 which is used for advertising IP prefixes. This route type is covered in RFC 9136.
There are two main use cases for advertising IP prefixes in EVPN route type 5:
The first scenario is pretty obvious. There are other places in the network, such as remote offices via a WAN, partners and external parties, as well as the internet. To route towards these destinations, a route type is needed and this is route type 5. Remember, route type 2 only provides host routing which poses the following problems for external connectivity:
The last bullet may be worth expanding a bit on. If the external prefixes aren’t advertised Continue reading
I’m working on a blog post explaining route type 5 in EVPN. To demonstrate a scenario with a silent host, I want to simulate this behavior. Normally, hosts can be quite chatty and ARP for their GW, for example. In this post I will show how arptables on Linux can be used to simulate a silent host.
Currently the leaf switch has an ARP entry for the host:
Leaf4# show ip arp vrf Tenant1 Flags: * - Adjacencies learnt on non-active FHRP router + - Adjacencies synced via CFSoE # - Adjacencies Throttled for Glean CP - Added via L2RIB, Control plane Adjacencies PS - Added via L2RIB, Peer Sync RO - Re-Originated Peer Sync Entry D - Static Adjacencies attached to down interface IP ARP Table for context Tenant1 Total number of entries: 1 Address Age MAC Address Interface Flags 198.51.100.44 00:15:20 0050.56ad.7d68 Vlan10
It is possible to ping the host from the leaf switch:
Leaf4# ping 198.51.100.44 vrf Tenant1 PING 198.51.100.44 (198.51.100.44): 56 data bytes 64 bytes from 198.51.100.44: icmp_seq=0 ttl=63 time=1.355 ms 64 bytes from 198.51.100.44: Continue reading
In an previous post Advertising IPs In EVPN Route Type 2, I described use cases for advertising IP addresses in EVPN route type 2. Host ARP and host mobility I already covered so today we will focus on host routing.
To be able to show this scenario, I have added another server (SERVER-2) and will be using the topology below:
There is already existing configuration for VLAN 10 (L2 VNI) and for VLAN 100 (L3 VNI) which is shown below:
vrf context Tenant1 vni 10001 rd auto address-family ipv4 unicast route-target both auto route-target both auto evpn ! interface Vlan10 no shutdown vrf member Tenant1 ip address 198.51.100.1/24 fabric forwarding mode anycast-gateway ! interface Vlan100 no shutdown mtu 9216 vrf member Tenant1 ip forward
To get SERVER-2 connected the following is needed:
This is shown below:
vlan 20 vn-segment 10002 ! interface nve1 member vni 10002 ingress-replication protocol bgp ! interface Vlan20 no shutdown vrf member Tenant1 ip address 10.0.0.1/24 fabric forwarding mode anycast-gateway ! interface Ethernet1/3 Continue reading
In the previous post VXLAN/EVPN – Host ARP, I talked about how knowing the MAC/IP of endpoints allows for ARP suppression. In this post we’ll take a look at host mobility. The topology used is the same as in the previous post:
Currently SERVER-1 is connected to LEAF-1. What happens if SERVER-1 moves to LEAF-2? This would be a common scenario for a virtual infrastructure. First let’s take a look at LEAF-4 on what routes we have for SERVER-1:
Leaf4# show bgp l2vpn evpn 0050.56ad.8506 BGP routing table information for VRF default, address family L2VPN EVPN Route Distinguisher: 192.0.2.3:32777 BGP routing table entry for ::::[0050.56ad.8506]::[0.0.0.0]/216, version 662 Paths: (2 available, best #2) Flags: (0x000202) (high32 00000000) on xmit-list, is not in l2rib/evpn, is not in HW Path type: internal, path is valid, not best reason: Neighbor Address, no labeled nexthop AS-Path: NONE, path sourced internal to AS 203.0.113.1 (metric 81) from 192.0.2.12 (192.0.2.2) Origin IGP, MED not set, localpref 100, weight 0 Received label 10000 Extcommunity: RT:65000:10000 ENCAP:8 Originator: 192.0.2.3 Cluster list: 192.0.2.2 Advertised Continue reading
In the last post Advertising IPs In EVPN Route Type 2, I described how to get IPs advertised in EVPN route type 2, but why do we need it? There are three main scenarios where having the MAC/IP mapping is useful:
In this post I will cover the first use case and the topology below will be used:
When two hosts in the same subnet want to send Ethernet frames to each other, they will ARP to discover the MAC address of the other host. This is no different in a VXLAN/EVPN network. The ARP frame, which is broadcast, will have to be flooded to other VTEPs either using multicast in the underlay or by ingress replication. Because the frame is broadcast, it will have to go to all the VTEPs that have that VNI. The scenario with ingress replication is shown below:
In this scenario, SERVER-1 is sending an ARP request to get the MAC address of SERVER-4. As all leafs are participating in the L2 VNI, LEAF-1 will perform ingress replication and send it to all leafs. However, sending the ARP request to LEAF-2 and LEAF-3 is not needed Continue reading
In my last post EVPN Deepdive Route Types 2 and 3, we took a deepdive into these two route types. I mentioned that the IP address of a host, a /32 or /128 address, could optionally be advertised. I also mentioned that this is mainly to facilitate features such as ARP suppression where a VTEP will be aware of the MAC/IP mapping and not have to flood BUM frames. However, in my last lab no IP addresses were advertised. Why is that? How do we get them advertised?
Currently, I have only setup a L2 VNI in the lab. This provides connectivity for the VLAN that my hosts are in, but it does not provide any L3 services. There is no SVI configured and there is also no L3 service configured that can route between different VNIs. The “standard” way of setting this up would be to configure anycast gateway on the leafs where every leaf that hosts the VNI has the same IP/MAC, but I consider this to be an optimization that I want to cover in a future posts. I prefer to break things down into their components and focus on the configuration needed for each component Continue reading
In my last post on Configuring EVPN, we setup EVPN but configured no services. In this post we will configure a basic L2 service so we can dive into the different EVPN route types. This post will cover route type 2 and 3 together as you will commonly see these together. This post will cover:
The topology we will use for this post is shown below:
Before diving into configuration, let’s discuss something that is often overlooked, VTEP discovery.
Without EVPN, VXLAN uses flood and learn behavior for discovery of VTEPs. This means that any host sending VXLAN frames would be considered a trusted VTEP in the network. This is obviously not great from a security perspective. When using EVPN, adding VTEPs is based on BGP messages. A VTEP will learn about other VTEPs based on these BGP updates. It’s not a specific route type, but rather any type of EVPN message. This makes it more difficult to add a rogue Continue reading
Yesterday I posted a tricky question to Twitter. If you have a working VPNv4 environment and create a VRF with only a Route Distinguisher (RD) but without Route Targets (RT), will the route be exported? The answer may surprise you! The configuration supplied in the question was similar to the one below:
vrf definition QUIZ rd 198.51.100.1:100 ! address-family ipv4 exit-address-family ! interface GigabitEthernet2 vrf forwarding QUIZ ip address 203.0.113.1 255.255.255.0 ! router bgp 65000 ! address-family ipv4 vrf QUIZ network 203.0.113.0
Notice how this VRF has a RD but no RT. Will this router, PE1, advertise the route into VPNv4? Most would say no, but the answer is yes. Let’s first check that we see the route locally on PE1 in VRF QUIZ:
PE1#show bgp vpnv4 uni vrf QUIZ 203.0.113.0 BGP routing table entry for 198.51.100.1:100:203.0.113.0/24, version 4 Paths: (1 available, best #1, table QUIZ) Advertised to update-groups: 1 Refresh Epoch 1 Local 0.0.0.0 (via vrf QUIZ) from 0.0.0.0 (198.51.100.1) Origin IGP, metric 0, localpref 100, weight 32768, valid, sourced, local, best mpls Continue reading
In this post we will configure EVPN on NX-OS. We will reuse the VXLAN topology from my previous post. The following will describe the setup in this post:
The BGP topology is shown below:
I will cover all the details of configuring EVPN and establishing the BGP sessions. We will then cover the actual exchange of routes in detail in separate posts in the future.
Starting out, the following globals and features need to be configured:
Next, let’s configure BGP on the spines with the following settings:
Then let’s configure BGP on the leafs:
The devices will now advertise that they have AFI L2VPN and SAFI EVPN:
The BGP sessions are now up:
Leaf1# show bgp l2vpn evpn sum BGP summary information for VRF default, address family L2VPN EVPN BGP router identifier 192.0.2.3, local AS number 65000 BGP table version is 4, L2VPN EVPN config peers Continue reading
In previous posts I described VXLAN using flood and learn behavior using multicast or ingress replication. The drawback to flood and learn is that frames need to be flooded/replicated for the VTEPs to learn of each other and for learning what MAC addresses are available through each VTEP. This isn’t very efficient. Isn’t there a better way of learning this information? This is where Ethernet VPN (EVPN) comes into play. What is it? As you know, BGP can carry all sorts of information and EVPN is just BGP with support to carry information about VTEPs, MAC addresses, IP addresses, VRFs, and some other stuff. What does EVPN provide us?
Note that the use of EVPN doesn’t entirely remove the need for flooding using multicast or ingress replication. Hosts still need to use ARP/ND to find the MAC address of each other, although ARP suppression could potentially help with that. There may also be protocols such as DHCP that leverage broadcast for some messages. In addition, there may be silent hosts in the fabric where VTEP is not aware that the host is Continue reading
Recently a posted a question to Twitter about connecting two Cisco Catalyst switches. One switch has already booted and has the following configuration:
interface GigabitEthernet0/0 description SW02 switchport mode trunk switchport trunk allowed vlan 1,10,20,30 switchport nonegotiate
The other switch is connected to Gi1/0/48 and has just been powered on. It has no configuration so it is booting with the default configuration. The intention is to onboard a new switch via Catalyst Center using Plug and Play (PNP).
Based on the responses not many people were able to describe what would happen and why or why not this scenario would work. There are some interesting details here and before running into this scenario myself I thought that it might work. Before we can answer if it will work, let’s list what we know at this point in time about the two switches, SW01, and SW02. For SW01 we know that:
For SW02 we know that:
Most organizations are terribly bad at interviewing people. They overcomplicate things by holding too many interviews (more than 2-3) and often focus their interview on trivia and memorization rather than walking through a scenario. Every interview should have some form of a scenario and a whiteboard if you are hiring a Network Engineer. Rather than overcomplicating things, here’s how you can interview someone using a single scenario that you can expand on and go to different depths at different stages depending on the focus of the role.
You are an employee working in a large campus network. Your computer has just started up and has not previously communicated with anything before you open your browser and type in microsoft.com.
Before any communication can take place, you need an IP address. What IP protocols are there? What are the main differences between the two?
Things to look for: IPv4 vs IPv6. ARP vs ND. DHCP vs RA. Broadcast vs multicast.
What methods are there of configuring an IP address?
Things to look for: Static IP vs DHCP vs RA.
When I need to communicate to something external, traffic goes through a gateway. What type of device would Continue reading
On 31 October 2023, I took and passed the Implementing and Operating Cisco Service Provider Network Core Technologies (SPCOR) exam on my first attempt. Most of you know that I am recognized as an expert on Cisco Service Provider technologies given that I was one of the first to pass the new CCIE SPv4 exam in early 2016. Three months later, I released book of nearly 3,000 pages detailing all the technologies involved in that blueprint, going extremely deep into every topic. I later partnered with two other Service Provider experts to sell an ultimate study bundle that combined my textbook with their lab workbook. I think it’s fair to say that I should have crushed this exam, and although I did pass, it was far more difficult than I anticipated.
A few years ago, I also took the SCOR exam, but didn’t write a blog about it because I felt the exam was unremarkable. My friend Craig Stansbury has an excellent SCOR learning path at Pluralsight that I used to pass the exam on the first try, but on balance, it felt like a regular CCNP exam. SPCOR was a whole new class of difficulty. When I Continue reading
On 17 October 2023, I took and passed the Automating and Programming Cisco Data Center Solutions (DCAUTO) exam on my first attempt. This is the seventh DevNet exam I’ve passed. After the retirement of the Webex and IoT specialty exams, the Collaboration specialty and Expert exams remain the only two I haven’t attempted. Much like my experience with enterprise, service provider, and security automation, I have years of real-life experience automating various data center solutions, primarily by working with Nexus and NDO (formerly MSO). I’ve spoken about the topic on various podcasts and professional training courses many times. Believe it or not, I don’t have as much real-life automation experience with ACI or UCS, which are key data center products for Cisco, so I studied those areas intensely.
It’s worth mentioning that Cisco’s new certification road map introduces small changes at regular intervals to all of their certification exams. This is smart as it leads to less “blueprint shock” every few years, plus gives learners an opportunity to master the newest technologies in an incremental way. Because Cisco updated DCAUTO earlier this year, I took the v1.1 exam. I’m not kidding when I say the exam was Continue reading
In the last post we used multicast to forward BUM frames. In this post, static ingress replication is used which removes the requirement for multicast in the underlay. Before setting this up, let’s do a comparison of multicast vs ingress replication. To make this interesting, let’s use a larger topology consisting of 2 Spines and 32 Leafs:
Assume that all Leafs have the same VNIs. In a scenario with multicast, Leaf-1 sends for example an ARP request encapsulated in multicast towards Spine-1. This is a single frame. Spine-1 then replicates this frame and sends it on all its links towards the Leafs:
This is very efficient as a single frame is sent by Leaf-1 and then 31 copies are sent on the other links to the Leafs. If the ARP request is 110 bytes, then a total of 32 x 110 = 3520 bytes have been consumed to send this ARP request. This is optimal forwarding from a resource consumption perspective and one of the strengths of multicast. Now let’s compare that to ingress replication. With ingress replication, Leaf-1 sends 31x copies of the frame (with different destination IP) towards Spine-1. Spine-1 then forwards those frames on each link towards Continue reading
There are two main methods that can be used to forward broadcast, unknown unicast, and multicast (BUM) frames in a VXLAN network:
In this post, we take a detailed look at how multicast can be used to forward BUM frames by running multicast in the underlay. We are using the topology from my Building a VXLAN Lab Using Nexus9000v post. The Spine switches are configured using the Nexus feature Anycast RP. That is, no MSDP is used between the RPs. To be able to forward broadcast frames such as ARP in our topology, the following is required:
The Leaf switches have the following configuration to enable multicast:
ip pim rp-address 192.0.2.255 group-list 224. Continue reading