Archive

Category Archives for "The Network Times"

Configuration of BGP afi/safi L2VPN EVPN and NVE Tunnel Interface

Overlay Network Routing: MP-BGP L2VPN/EVPN



EVPN Fabric Control Plane – MP-BGP


Instead of being a protocol, EVPN is a solution that utilizes the Multi-Protocol Border Gateway Protocol (MP-BGP) for its control plane in an overlay network. Besides, EVPN employs Virtual eXtensible Local Area Network (VXLAN) encapsulation for the data plane of the overlay network.

Multi-Protocol BGP (MP-BGP) is an extension of BGP-4 that allows BGP speakers to encode Network Layer Reachability Information (NLRI) of various address types, including IPv4/6, VPNv4, and MAC addresses, into BGP Update messages. The MP_REACH_NLRI path attribute (PA) carried within MP-BGP update messages includes Address Family Identifier (AFI) and Subsequent Address Family Identifier (SAFI) attributes. The combination of AFI and SAFI determines the semantics of the carried Network Layer Reachability Information (NLRI). For example, AFI-25 (L2VPN) with SAFI-70 (EVPN) defines an MP-BGP-based L2VPN solution, which extends a broadcast domain in a multipoint manner over a routed IPv4 infrastructure using an Ethernet VPN (EVPN) solution.
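To make the AFI/SAFI combination concrete, here is a minimal Python sketch that packs and unpacks the two values the way they sit at the start of the MP_REACH_NLRI attribute value (a 2-byte AFI followed by a 1-byte SAFI). The function names are my own, hypothetical helpers, and the sketch deliberately ignores the rest of the attribute (next hop and NLRI).

```python
import struct

# AFI/SAFI code points named in the text.
AFI_L2VPN = 25
SAFI_EVPN = 70


def encode_afi_safi(afi: int, safi: int) -> bytes:
    """Pack AFI (2 bytes) and SAFI (1 byte) in network byte order."""
    return struct.pack("!HB", afi, safi)


def decode_afi_safi(data: bytes) -> tuple:
    """Read AFI and SAFI back from the first three bytes."""
    afi, safi = struct.unpack("!HB", data[:3])
    return afi, safi


header = encode_afi_safi(AFI_L2VPN, SAFI_EVPN)
print(header.hex())             # 001946
print(decode_afi_safi(header))  # (25, 70)
```

A receiver uses this pair to decide how to interpret the NLRI bytes that follow.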

BGP EVPN Route Types (BGP RT) carried in BGP update messages describe the advertised EVPN NLRIs (Network Layer Reachability Information) type. Besides publishing IP Prefix information with IP Prefix Route (EVPN RT 5), BGP EVPN uses MAC Advertisement Route (EVPN RT 2) Continue reading

Single-AS EVPN Fabric with OSPF Underlay: Underlay Network Multicast Routing: Any-Source Multicast – ASM

 Underlay Network Multicast Routing: PIM-SM

In a traditional Layer 2 network, switches forward Intra-VLAN data traffic based on the destination MAC address of Ethernet frames. Therefore, hosts within the same VLAN must resolve each other's MAC-IP address bindings using Address Resolution Protocol (ARP). When a host wants to open a new IP connection with a device in the same subnet and the destination MAC address is unknown, the connection initiator generates an ARP Request message. In the message, the sender provides its own MAC-IP binding information and queries the MAC address of the owner of the target IP. The ARP Request messages are Layer 2 Broadcast messages with the destination MAC address FF:FF:FF:FF:FF:FF. 
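As an illustration of the message described above, the following Python sketch builds the raw bytes of an ARP Request inside an Ethernet frame. The MAC and IP addresses are hypothetical documentation values, and the helper names are mine. The sketch only constructs the frame and does not send it.

```python
import socket
import struct

BROADCAST_MAC = bytes.fromhex("ffffffffffff")
ETHERTYPE_ARP = 0x0806


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into six raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_arp_request(sender_mac: str, sender_ip: str, target_ip: str) -> bytes:
    """Return an Ethernet frame carrying an ARP Request."""
    # Ethernet header: broadcast destination, sender as source, EtherType ARP.
    eth = BROADCAST_MAC + mac_to_bytes(sender_mac) + struct.pack("!H", ETHERTYPE_ARP)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # hardware address length
        4,                            # protocol address length
        1,                            # opcode: request
        mac_to_bytes(sender_mac),     # sender MAC (the sender's own binding)
        socket.inet_aton(sender_ip),  # sender IP
        bytes(6),                     # target MAC: unknown, all zeros
        socket.inet_aton(target_ip),  # target IP being queried
    )
    return eth + arp


# Hypothetical addresses for illustration only.
frame = build_arp_request("00:00:5e:00:53:01", "192.0.2.10", "192.0.2.20")
print(len(frame))  # 42
```

The frame is 42 bytes long: a 14-byte Ethernet header plus a 28-byte ARP payload.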

An EVPN Fabric is a routed network and requires a solution for Layer 2 Broadcast messages. We can either select the BGP EVPN-based Ingress-Replication (IR) solution or enable Multicast routing in the Underlay network. This chapter introduces the latter model. As in the previous Unicast Routing section, we follow the Multicast deployment workflow of the Nexus Dashboard Fabric Controller (NDFC) graphical user interface.

Figure 2-4 depicts the components needed to deploy Multicast service in the Underlay network. The default option for selecting “RP mode” is ASM (Any-Source Multicast). ASM is Continue reading

Single-AS EVPN Fabric with OSPF Underlay: Underlay Network Unicast Routing

 Introduction


Figure 2-1 illustrates the components essential for designing a Single-AS, Multicast-enabled OSPF Underlay EVPN Fabric. These components need to be established before constructing the EVPN fabric. I've grouped them into five categories based on their function.

  • General: Defines the IP addressing scheme for Spine-Leaf Inter-Switch links, sets the BGP AS number and the number of BGP Route-Reflectors, and sets the MAC address of the Anycast gateway for client-side VLAN routing interfaces.
  • Replication: Specifies the replication mode for Broadcast, Unknown Unicast, and Multicast (BUM) traffic generated by Tenant Systems. The options are Ingress-Replication and Multicast (ASM or BiDir options).
  • vPC: Describes vPC multihoming settings such as the vPC Peer Link VLAN ID and Port-Channel ID and the vPC Auto-recovery and Delay Restore timers, and defines the vPC Peer Keepalive interface.
  • Protocol: Defines the numbering schema for Loopback interfaces and sets the OSPF Area identifier and OSPF process name.
  • Resources: Reserves IP address ranges for the Loopback interfaces defined in the Protocol category and for the Rendezvous Point specified in the Replication category. Besides, in this section, we reserve Layer 2 and Layer 3 VXLAN and VLAN ranges for overlay network segments.

The model presented in Figure 2-1 outlines the steps for configuring an EVPN fabric using the Continue reading

BGP EVPN with VXLAN: Fabric Overview

 




The figure illustrates the simplified operation model of an EVPN Fabric. At the bottom of the figure are four devices, Tenant Systems (TS), connected to the network. When speaking about a TS, I am referring to physical or virtual hosts. Besides, a Tenant System can be a forwarding component attached to one or more Tenant-specific Virtual Networks. Examples of TS forwarding components include firewalls, load balancers, switches, and routers.

We have connected TS1 and TS2 to VLAN 10 and TS3-4 to VLAN 20. VLAN 10 is associated with EVPN Instance (EVI) 10010 and VLAN 20 to EVI 10020. Note that VLAN-Id is switch-specific, while EVI is Fabric-wide. Thus, subnet A can have VLAN-Id XX on one Leaf switch and VLAN-Id YY on another. However, we must map both VLAN XX and YY to the same EVPN Instance.
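The switch-local versus fabric-wide scoping can be sketched as a small lookup table in Python. The leaf names and the VLAN IDs on the second leaf are hypothetical. Only VLAN 10/EVI 10010 and VLAN 20/EVI 10020 come from the text.

```python
# Per-leaf VLAN-to-EVI mappings. VLAN IDs are switch-specific,
# while the EVI values are shared across the fabric.
vlan_to_evi = {
    "Leaf-101": {10: 10010, 20: 10020},
    "Leaf-102": {110: 10010, 120: 10020},  # hypothetical VLAN IDs
}


def vlans_for_evi(evi: int) -> dict:
    """Return the local VLAN ID each leaf uses for the given EVI."""
    return {
        leaf: vlan
        for leaf, mapping in vlan_to_evi.items()
        for vlan, mapped_evi in mapping.items()
        if mapped_evi == evi
    }


def same_segment(leaf_a: str, vlan_a: int, leaf_b: str, vlan_b: int) -> bool:
    """Two local VLANs belong to the same segment if they map to the same EVI."""
    return vlan_to_evi[leaf_a][vlan_a] == vlan_to_evi[leaf_b][vlan_b]


print(vlans_for_evi(10010))                            # {'Leaf-101': 10, 'Leaf-102': 110}
print(same_segment("Leaf-101", 10, "Leaf-102", 110))   # True
print(same_segment("Leaf-101", 10, "Leaf-102", 120))   # False
```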

When a TS connected to the Fabric sends the first Ethernet frame, the Leaf switch stores the source MAC address in the MAC address table, from where it is copied to the Layer 2 routing table (L2RIB) of the EVPN Instance. Then, the BGP process of the Leaf switch advertises the MAC address with its reachability information to its BGP EVPN peers, essentially the Spine switches. Continue reading

Azure Networking: Cloud Scale Load Balancing

 Introduction


During the load balancer deployment process, we define a virtual IP (a.k.a front-end IP) for our published service. As a next step, we create a backend (BE) pool to which we attach Virtual Machines using either their associated vNIC or Direct IP (DIP). Then, we bind the VIP to BE using an Inbound rule. Besides, in this phase, we create and associate health probes with inbound rules for monitoring VM's service availability. If VMs in the backend pool also initiate outbound connections, we build an outbound policy, which states the source Network Address Translation (SNAT) rule (DIP, src port > VIP, src port).  

This chapter provides an overview of the components of the Azure load balancer service: Centralized SDN Controller, Virtual Load balancer pools, and Host Agents. In this chapter, we discuss control plane and data plane operation.  


Management & Control Plane – External Connections

Figure 20-1 depicts our example diagram. The top-most box, Loadbalancer deployment, shows our LB settings. We intend to forward HTTP traffic from the Internet to VIP 1.2.3.4 to either DIP 10.0.0.4 (vm-beetle) or DIP 10.0.0.5 (vm-bailey). The health probe associated with Continue reading

Adding TS’s IP Address to MAC-VRF (L2RIB) and IP-VRF (L3RIB)

In the previous chapter, we discussed how a VTEP learns the local TS's MAC address and the process through which the MAC address is programmed into BGP tables. An example VTEP device was configured with a Layer 2 VLAN and an EVPN Instance without deploying a VRF Context or VLAN routing interface. This chapter introduces, at a theoretical level, how the VTEP device, besides the TS's MAC address, learns the TS's IP address information after we have configured the VRF Context and routing interface for our example VLAN.


Figure 1-3: MAC-VRF Tenant System’s IP Address Propagation.

I have divided Figure 1-3 into three sections. The section on the top left, Integrated Routing and Bridging - IRB illustrates the components required for intra-tenant routing and their interdependencies. By configuring a Virtual Routing and Forwarding Context (VRF Context), we create a closed routing environment with a per-tenant IP-VRF L3 Routing Information Base (L3RIB). Within the VRF Context, we define the Layer 3 Virtual Network Identifier (L3VNI) along with the Route Distinguisher (RD) and Route Target (RT) values. The RD of the VRF Context enables the use of overlapping IP addresses across different tenants. Based on the RT value of the VRF Context, Continue reading

Another Ethernet VPN (EVPN) Introduction

Ethernet VPN (EVPN) Introduction


Instead of being a protocol, EVPN is a solution that utilizes the Multi-Protocol Border Gateway Protocol (MP-BGP) for its control plane in an overlay network. Besides, EVPN employs Virtual eXtensible Local Area Network (VXLAN) encapsulation for the data plane of the overlay network.


EVPN Control Plane: MP-BGP AFI: L2VPN, SAFI: EVPN


Multi-Protocol BGP (MP-BGP) is an extension of BGP-4 that allows BGP speakers to encode Network Layer Reachability Information (NLRI) of various address types, including IPv4/6, VPNv4, and MAC addresses, into BGP Update messages. 

The MP_REACH_NLRI path attribute (PA) carried within MP-BGP update messages includes Address Family Identifier (AFI) and Subsequent Address Family Identifier (SAFI) attributes. The combination of AFI and SAFI determines the semantics of the carried Network Layer Reachability Information (NLRI). For example, AFI-25 (L2VPN) with SAFI-70 (EVPN) defines an MP-BGP-based L2VPN solution, which extends a broadcast domain in a multipoint manner over a routed IPv4 infrastructure using an Ethernet VPN (EVPN) solution. 

BGP EVPN Route Types (BGP RT) carried in BGP update messages describe the advertised EVPN NLRIs (Network Layer Reachability Information) type. Besides publishing IP Prefix information with IP Prefix Route (EVPN RT 5), BGP EVPN uses MAC Advertisement Route Continue reading

BGP EVPN Part IV: MAC-VRF L2RIB Update: Local MAC Address

In Figure 1-3 we have VLAN 10 mapped to EVI/MAC-VRF L2VNI10000. TS-A1 (IP: 192.168.11.12/MAC: 1000.0010.beef) is connected to VLAN10 via Attachment Circuit (AC) Ethernet 1/2, (ifindex: 1a000200). 

Figure 1-3: MAC-VRF: L2RIB Local Learning Process.


Example 1-1 shows the VLAN to L2VNI mapping information. 


Leaf-101# show vlan id 10 vn-segment
VLAN Segment-id
---- -----------
10   10000       

Example 1-1: VLAN to EVPN Instance Mapping Information.
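If you collect this output programmatically, a few lines of Python are enough to turn it into a VLAN-to-L2VNI dictionary. This is only a sketch that assumes the two-column layout shown in Example 1-1. The function name is hypothetical.

```python
# Output copied from Example 1-1.
SHOW_OUTPUT = """\
VLAN Segment-id
---- -----------
10   10000
"""


def parse_vn_segment(output: str) -> dict:
    """Map VLAN ID to VN segment (L2VNI), skipping header and separator lines."""
    mapping = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows consist of exactly two numeric columns.
        if len(fields) == 2 and fields[0].isdigit() and fields[1].isdigit():
            mapping[int(fields[0])] = int(fields[1])
    return mapping


print(parse_vn_segment(SHOW_OUTPUT))  # {10: 10000}
```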


Step-1 and 2: MAC Table Update 


During the startup process, TS-A1 sends a Gratuitous ARP (GARP) message to announce its presence on the network and validate the uniqueness of its IP address. It uses its IP address in the Target IP field (Example 1-2). If another host responds to this unsolicited ARP reply, it indicates a potential IP address conflict. 

Ethernet II, Src: 10:00:00:10:be:ef, Dst: Broadcast (ff:ff:ff:ff:ff:ff)
Address Resolution Protocol (reply/gratuitous ARP)
    Hardware type: Ethernet (1)
    Protocol type: IPv4 (0x0800)
    Hardware size: 6
    Protocol size: 4
    Opcode: reply (2)
    [Is gratuitous: True]
    Sender MAC address: 10:00:00:10:be:ef (10:00:00:10:be:ef)
    Sender IP address: 192.168.11.12
    Target MAC address: Broadcast Continue reading

BGP EVPN Part III: BGP EVPN Local Learning Fundamentals

Multi-Protocol BGP (MP-BGP) is a BGP-4 extension that enables BGP speakers to encode Network Layer Reachability Information (NLRI) of various address types, such as IPv4/6, VPNv4, and MAC addresses, into BGP Update messages. MP-BGP features an MP_REACH_NLRI Path-Attribute (PA), which utilizes an Address Family Identifier (AFI) to describe service categories. Subsequent Address Family Identifier (SAFI), in turn, defines the solution used for providing the service. For example, L2VPN (AFI 25) is a primary category for Layer-2 VPN services, and the Ethernet Virtual Private Network (EVPN: SAFI 70) provides the service. Another L2VPN service is Virtual Private LAN Service (VPLS: SAFI 65). The main differences between these two L2VPN services are that only EVPN supports active/active multihoming, has a control-plane-based MAC address learning mechanism, and operates over an IP-routed infrastructure.

EVPN utilizes various Route Types (EVPN RT) to describe the Network Layer Reachability Information (NLRI) associated with Unicast, BUM (Broadcast, Unknown unicast, and Multicast) traffic, as well as ESI Multihoming. The following sections explain how EVPN RT 2 (MAC Advertisement Route) is employed to distribute MAC and IP address information of Tenant Systems enabling the expansion of VLAN over routed infrastructure. 

The Tenant System refers to a host, virtual machine, Continue reading

BGP EVPN Part II: Network Virtualization Overlay with BGP EVPN and VXLAN – Introduction

In Figure 1-1, we have a routed 3-stage Clos Fabric, where all Inter-Switch links are routed point-to-point layer-3 connections. As explained in previous sections, a switched layer-2 network with an STP control plane allows only one active path per VLAN/Instance and only VLAN-based traffic load sharing. Thanks to the Equal Cost Multi-Path (ECMP) support of routing protocols, a routed Clos Fabric enables flow-based traffic load balancing using all links from the ingress leaf via the spine layer down to the egress leaf. Routing protocols converge faster and less disruptively than STP does after a topology change. Besides, a routed Clos Fabric architecture allows horizontal bandwidth scaling: we can increase the overall bandwidth capacity between switches by adding a new spine switch. Dynamic routing protocols also allow a lossless In-Service Software Upgrade (ISSU) for standalone and virtualized devices by advertising infinite metrics or withdrawing all advertised routes.

But how do we stretch layer-2 segments over layer-3 infrastructure in a Multipoint-to-Multipoint manner, allowing tenant isolation and routing between segments? The answer relies on the Network Virtualization Overlay (NVO3) framework. 

BGP EVPN, as an NVO3 control plane protocol, uses EVPN Route Types (RT) in update messages for identifying the type of advertised EVPN NLRIs (Network Continue reading

BGP EVPN Part-I: Challenges in Traditional Switched Datacenter Networks

Inefficient Link Utilization

The default Layer 2 Control Plane protocol in Cisco NX-OS is Rapid Per-VLAN Spanning Tree Plus (Rapid PVST+), which runs an instance of the 802.1w standard Rapid Spanning Tree Protocol (RSTP) per VLAN. Rapid PVST+ builds a VLAN-specific, loop-free Layer 2 data path from the STP root switch to all non-root switches. Spanning Tree Protocol, no matter which mode we use, allows only one active path at a time and blocks all redundant links. One general solution for activating all Inter-switch links is placing the STP root for odd and even VLANs on different switches. However, STP allows only VLAN-based traffic load balancing.


CPU and Memory Usage

After building a loop-free data path, switches running Rapid PVST+ monitor the state of the network by using Spanning Tree instance-based Bridge Protocol Data Units (BPDU). By default, each switch sends instance-based BPDU messages from their designated port in two-second intervals. If we have 2000 VLANs, all switches must process 2000 BPDUs. To reduce CPU and Memory consumption caused by BPDU processing, we can use Multiple Spanning Tree – MSTP (802.1s), where VLANs are associated with Instances. For example, we can attach VLANs 1-999 to one instance and Continue reading

Cisco Intent-Based Networking: Part II – Cisco ISE and Catalyst Center Migration

Cisco Identity Service Engine (ISE) and Catalyst Center Integration

Before you can add Cisco ISE to Catalyst Center’s global network settings as an Authentication, Authorization, and Accounting server (AAA) for clients and manage the Group-Based access policy implemented in Cisco ISE, you must integrate them. 

This post starts by explaining how to activate the pxGrid service on ISE, which ISE uses for pushing policy changes to Catalyst Center (steps 1a-f). Next, it illustrates the procedure for enabling External RESTful Services (ERS) read/write on Cisco ISE, which allows external clients to perform Create, Read, Update, and Delete (CRUD) operations on ISE. Catalyst Center uses ERS for pushing configuration to ISE. After starting the pxGrid service and enabling ERS, this post discusses how to initiate the connection between ISE and Catalyst Center (steps 2a-h and 3a-b). The last part depicts the Group-Based Access Control migration processes (4a-b).

Step-1: Start pxGrid Service and Enable ERS on ISE

Open the Administration tab on the main view of Cisco ISE. Then, under the System tab, select the Deployment option. The Deployment Nodes section displays the Cisco ISE Node along with its personas. In Figure 1-3, a standalone ISE Node comprises three personas: Policy Continue reading

Cisco Intent-Based Networking: Part I – Introduction

 Introduction

This chapter introduces Cisco's approach to Intent-based Networking (IBN) through their Centralized SDN Controller, Cisco DNA Center, rebranded as Cisco Catalyst Center (from now on, I am using the abbreviation C3 for Cisco Catalyst Center). We focus on the network green field installation, showing workflows, configuration parameters, and relationships and dependencies between building blocks. The C3 workflow is divided into four main entities: 1) Design, 2) Policy, 3) Provision, and 4) Assurance, each having its own sub-processes. This chapter introduces the Design phase focusing on Network Hierarchy, Network Settings, and Network Profile with Configuration Templates. 

This post deprecates the previous post, "Cisco Intent-Based Networking: Part I, Overview."

Network Hierarchy

Network Hierarchy is a logical structure for organizing network devices. At the root of this hierarchy is the Global Area, where you establish your desired network structure. In our example, the hierarchy consists of four layers: Area (country - Finland), Sub-area (city - Joensuu), Building (JNS01), and Floor (JNS01-FLR01). Areas and Buildings indicate the location, while Floors provide environmental information relevant to wireless networks, such as floor type, measurements, and wall properties.
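The hierarchy from our example can be modeled as a nested dictionary. The sketch below walks the tree and prints one slash-separated path per level. The path notation and the function name are my own assumptions for illustration, not a description of how Catalyst Center stores the hierarchy.

```python
# Example hierarchy from the text: Area / Sub-area / Building / Floor under Global.
hierarchy = {
    "Global": {
        "Finland": {
            "Joensuu": {
                "JNS01": {
                    "JNS01-FLR01": {},
                },
            },
        },
    },
}


def site_paths(tree: dict, prefix: tuple = ()) -> list:
    """Return every node in the hierarchy as a slash-separated path."""
    paths = []
    for name, children in tree.items():
        path = prefix + (name,)
        paths.append("/".join(path))
        paths.extend(site_paths(children, path))
    return paths


for path in site_paths(hierarchy):
    print(path)
# Last line printed: Global/Finland/Joensuu/JNS01/JNS01-FLR01
```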


Network Settings

Network settings define device credentials (CLI, HTTP(S), SNMP, and NETCONF) required for accessing devices Continue reading

Cisco Intent Based Networking: Part I, Overview

This post introduces Cisco's approach to Intent-based Networking (IBN) through their Centralized SDN Controller, DNA Center, rebranded as Catalyst Center. We focus on the network green field installation, showing workflows, configuration parameters, and relationships and dependencies between building blocks.

Figure 1-1 is divided into three main areas: a) Onboard and Provisioning, b) Network Hierarchy and Global Network Settings, and c) Configuration Templates and Site Profiles.

We start a green field network deployment by creating a Network Design. In this phase, we first build a Network Hierarchy for our sites. For example, a hierarchy can define a Continent/Country/City/Building/Floor structure. Then, we configure global Network Settings. This phase includes both Network and Device Credentials configuration. AAA, DHCP, DNS servers, DNS name, and Time Zone, which are automatically inherited throughout the hierarchy, are part of the Network portion. Device Credentials, in turn, define CLI, SNMP read/write, HTTP(S) read/write username/password, and CLI enable password. The credentials are used later in the Discovery phase.

Next, we build site- and device-type-specific configuration templates. As a first step, we create a Project, a folder for our templates. In Figure 1-1, we have a Composite template into which we attach two Regular templates. Regular templates include Continue reading

The Network Times 2023-08-27 09:59:00

 

Available at Leanpub and Amazon


About This Book

A modern application typically comprises several modules, each assigned specific roles and responsibilities within the system. Application architecture governs the interactions and communications between these modules and users. One prevalent architecture is the three-tier architecture, encompassing the Presentation, Application, and Data tiers. This book explains how you can build a secure and scalable networking environment for your applications running in Microsoft Azure. Besides a basic introduction to Microsoft Azure, the book explains various solutions from the perspectives of Virtual Machine Internet access, connectivity, security, and scalability.


Azure Basics: You will learn the hierarchy of Microsoft Azure datacenters, i.e., how a group of physical datacenters forms an Availability Zone within the Azure Region. Besides, you learn how to create a Virtual Network (VNet), divide it into subnets, and deploy Virtual Machines (VM). You will also learn how the subnet in Azure differs from the subnet in traditional networks.


Internet Access: Depending on the role of the application, VMs have different Internet access requirements. Typically, front-end VMs in the presentation tier/DMZ are visible on the Internet, allowing external hosts to initiate connections. VMs in the Application and Data tiers are rarely accessible from Continue reading

NVA Part V: NVA Redundancy with Azure Internal Load Balancer – On-Prem Connec

 Introduction


In Chapter Five, we deployed an internal load balancer (ILB) in the vnet-hub. It was attached to the subnet 10.0.1.0/24, where it obtained the frontend IP (FIP) 10.0.1.6. Next, we created a backend pool and associated our NVAs with it. Finally, we bound the frontend IP 10.0.1.6 to the backend pool to complete the ILB setup.


Next, in vnet-spoke1, we created a route table called rt-spoke1. This route table contained a user-defined route (UDR) for 10.2.0.0/24 (vnet-spoke2) with the next-hop set as 10.0.1.6. We attached this route table to the subnet 10.1.0.0/24. Similarly, in vnet-spoke2, we implemented a user-defined route for 10.1.0.0/24 (vnet-spoke1). By configuring these UDRs, we ensured that the spoke-to-spoke traffic would pass through the ILB and one of the NVAs on vnet-hub. Note that in this design, the Virtual Network Gateway is not required for spoke-to-spoke traffic.
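The effect of the UDR can be sketched as a longest-prefix-match lookup. The Python below models only the single rt-spoke1 entry described above. Returning None stands for "no user-defined route matched", in which case Azure would fall back to its system routes, which this sketch does not model. The function name is hypothetical.

```python
import ipaddress

# rt-spoke1 as described in the text: one UDR for vnet-spoke2 via the ILB frontend IP.
RT_SPOKE1 = [
    (ipaddress.ip_network("10.2.0.0/24"), "10.0.1.6"),
]


def udr_next_hop(route_table: list, destination: str):
    """Return the next hop of the most specific matching UDR, or None."""
    dst = ipaddress.ip_address(destination)
    matches = [(net, nh) for net, nh in route_table if dst in net]
    if not matches:
        return None
    # Longest prefix wins when several entries match.
    return max(matches, key=lambda entry: entry[0].prefixlen)[1]


print(udr_next_hop(RT_SPOKE1, "10.2.0.4"))   # 10.0.1.6
print(udr_next_hop(RT_SPOKE1, "192.0.2.1"))  # None
```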


In this chapter, we will add a Virtual Network Gateway (VGW) into the topology and establish an IPsec VPN connection between the on-premises network edge router and VGW. Additionally, we will deploy a new route table called "rt-gw-snet" where we add routing entries to the spoke VNets with the next-hop IP address 10.0.1.6 (ILB's frontend IP). Besides, we will add a routing entry 10.3.0.0/16 > 10.0.1.6 into the existing route tables on vnet-spoke-1 and vnet-spoke-2 (not shown in figure 6-1). This configuration will ensure that the spoke to spoke and spoke to on-prem flows are directed through one of the Network Virtual Appliances (NVAs) via ILB. The NVAs use the default route table, where the VGW propagates all the routes learned from VPN peers. However, we do not propagate routes from the default route table into the "rt-gw-snet" and "rt-prod-1" route tables. To enable the spoke VNets to use the VGW on the hub VNet, we allow it in VNet peering configurations.


  1. The administrator of the mgmt-pc opens an SSH session to vm-prod-1. The connection initiation begins with the TCP three-way handshake. The TCP SYN message is transmitted over the VPN connection to the Virtual Gateway (VGW) located on the vnet-hub. Upon receiving the message, the VGW first decrypts it and performs a routing lookup. The destination IP address, 10.1.0.4, matches the highlighted routing entry in the route table rt-gw-snet.
  2. The VGW determines the location (the IP address of the hosting server) of 10.0.1.6, encapsulates the message with tunnel headers, and forwards it to an Internal Load Balancer (ILB) using the destination IP address 10.0.1.6 in the tunnel header.
  3. The Internal Load Balancer receives the TCP SYN message. As the destination IP address in the tunnel header matches one of its frontend IPs, the ILB decapsulates the packet. It then checks which backend pool (BEP) is associated with the frontend IP (FIP) 10.0.1.6 to determine to which VMs it can forward the TCP SYN message. Using a hash algorithm (in our example, the 5-tuple), the ILB selects a VM from the backend pool members, in this case, NVA2. The ILB performs a location lookup for the IP address 10.0.1.5, encapsulates the TCP SYN message with tunnel headers, and finally sends it to NVA2.
  4. The message reaches the hosting server of NVA2, which removes the encapsulation since the destination IP in the tunnel header belongs to itself. Based on the SYN flag set in the TCP header, the packet is identified as the first packet of the flow. Since this is the initial packet of the flow, there is no flow entry programmed into the Generic Flow Table (GFT) related to this connection. The parser component generates a metadata file from the L3 and L4 headers of the message, which is then processed by the Virtual Filtering Platform (VFP) layers associated with NVA2. Following the VFP processing, the TCP SYN message is passed to NVA2, and the GFT is updated with flow information and associated actions (Allow and Encapsulation instructions). Besides, the VFP process creates a corresponding entry for the return packets in the GFT (reversed source and destination IPs and ports). Please refer to the first chapter for more detailed information on VFP processes.
  5. We do not have any pre-routing or post-routing policies configured on either NVA. As a result, NVA2 simply routes the traffic out of the eth0 interface based on its routing table. The ingress TCP SYN message has already been processed by the VFP layers, and the GFT has been updated accordingly. Consequently, the egress packet can be forwarded based on the GFT without the need for additional processing by the VFP layers.
  6. Subsequently, the encapsulated TCP SYN message is transmitted over VNet peering to vm-prod-1, located on vnet-spoke-1. Upon reaching the hosting server of vm-prod-1, the packet is processed in a similar manner as we observed with NVA. The encapsulation is removed, and the packet undergoes the same VFP processing steps as before.
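The backend selection in step 3 can be sketched in Python as a deterministic hash over the 5-tuple. This is an illustration only: the text does not describe the hash function Azure uses, so SHA-256 is my stand-in, and the source IP and port of mgmt-pc are hypothetical values.

```python
import hashlib

# Backend pool members from the text.
BACKEND_POOL = ["NVA1", "NVA2"]


def select_backend(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
                   protocol: str, pool: list = BACKEND_POOL) -> str:
    """Pick a pool member from a hash of the 5-tuple (stand-in hash: SHA-256)."""
    key = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}|{protocol}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(pool)
    return pool[index]


# Hypothetical SSH flow from mgmt-pc to vm-prod-1 (10.1.0.4).
flow = ("10.3.0.4", 50000, "10.1.0.4", 22, "tcp")
first = select_backend(*flow)
second = select_backend(*flow)
print(first == second)  # True: every packet of the same flow hits the same NVA
```

Because the hash input never changes during a connection, all packets of the flow land on the same NVA, which is what keeps the session state on one appliance.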


Figure 6-1: ILB Example Topology.

Continue reading

NVA Part IV: NVA Redundancy with Azure Internal Load Balancer

Introduction

To achieve active/active redundancy for a Network Virtual Appliance (NVA) in a Hub-and-Spoke VNet design, we can utilize an Internal Load Balancer (ILB) to enable Spoke-to-Spoke traffic.

Figure 5-1 illustrates our example topology, which consists of a vnet-hub and spoke VNets. The ILB is associated with the subnet 10.0.1.0/24, where we allocate a Frontend IP address (FIP) using dynamic or static methods. Unlike a public load balancer's inbound rules, we can choose the High-Availability (HA) ports option to load balance all TCP and UDP flows. The backend pool and health probe configurations remain the same as those used with a Public Load Balancer (PLB).

From the NVA perspective, the configuration is straightforward. We enable IP forwarding in the Linux kernel and on the virtual NIC but do not configure pre-routing (destination NAT) policies. We can use Post-routing policies (source NAT) if we want to hide real IP addresses or if symmetric traffic paths are required. To route egress traffic from spoke sites to the NVAs via the ILB, we create subnet-specific route tables in the spoke VNets. The "rt-spoke1" route table has the specific entry "10.2.0.0/24 > 10.0.1.6 (ILB)" rather than a default route because vm-prod-1 has a public IP address used for external access. If we were to set the default route, as we have in the subnet 10.2.0.0/24 in "vnet-spoke2", the external connection would fail.

Figure 5-1: ILB Example Topology.

NVA Part III: NVA Redundancy – Connection from the Internet

This chapter is the first part of a series on Azure's highly available Network Virtual Appliance (NVA) solutions. It explains how we can use load balancers to achieve active/active NVA redundancy for connections initiated from the Internet.

In Figure 4-1, Virtual Machine (VM) vm-prod-1 uses the load balancer's Frontend IP address 20.240.9.27 to publish an application (SSH connection) to the Internet. Vm-prod-1 is located behind an active/active NVA FW cluster. Vm-prod-1 and NVAs have vNICs attached to the subnet 10.0.2.0/24.

Both NVAs have identical Pre- and Post-routing policies. If the ingress packet's destination IP address is 20.240.9.27 (load balancer's Frontend IP) and the transport layer protocol is TCP, the policy changes the destination IP address to 10.0.2.6 (vm-prod-1). Additionally, before routing the packet through the Ethernet 1 interface, the Post-routing policy replaces the original source IP with the IP address of the egress interface Eth1.
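The two policies can be modeled as pure functions over a packet. This Python sketch is not the NVA's actual rule syntax. The Eth1 address and the remote host address are hypothetical, since the text does not give them.

```python
from dataclasses import dataclass, replace

FRONTEND_IP = "20.240.9.27"   # load balancer's Frontend IP (from the text)
VM_PROD_1 = "10.0.2.6"        # vm-prod-1 (from the text)
ETH1_IP = "10.0.2.4"          # hypothetical address of the NVA's Eth1 interface


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    protocol: str


def pre_routing(pkt: Packet) -> Packet:
    """Destination NAT: rewrite the Frontend IP to vm-prod-1 for TCP packets."""
    if pkt.dst_ip == FRONTEND_IP and pkt.protocol == "tcp":
        return replace(pkt, dst_ip=VM_PROD_1)
    return pkt


def post_routing(pkt: Packet, egress_ip: str = ETH1_IP) -> Packet:
    """Source NAT: replace the original source IP with the egress interface IP."""
    return replace(pkt, src_ip=egress_ip)


# Hypothetical remote host opening an SSH connection to the Frontend IP.
ingress = Packet(src_ip="198.51.100.10", dst_ip=FRONTEND_IP, protocol="tcp")
egress = post_routing(pre_routing(ingress))
print(egress)  # Packet(src_ip='10.0.2.4', dst_ip='10.0.2.6', protocol='tcp')
```

Because vm-prod-1 sees the NVA's own address as the source, its replies return through the same NVA that handled the inbound packet.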

The second vNICs of the NVAs are connected to the subnet 10.0.1.0/24. We have associated these vNICs with the load balancer's backend pool. The Inbound rule binds the Frontend IP address to the Backend pool and defines the load-sharing policies. In our example, the packets of SSH connections from the remote host to the Frontend IP are distributed between NVA1 and NVA2. Moreover, the Inbound rule determines the Health Probe policy associated with it.

Note! Using a single VNet design eliminates the need to define static routes in the subnet-specific route table and the VM's Linux kernel. This solution is suitable for small-scale implementations. However, the Hub-and-Spoke VNet topology offers simplified network management, enhanced security, scalability, performance, and hybrid connectivity. I will explain how to achieve NVA redundancy in the Hub-and-Spoke VNet topology in upcoming chapters.



Figure 4-1: Example Diagram. 

NVA Part II – Internet Access with a single NVA

Introduction

In the previous chapter, you learned how to route east-west traffic through the Network Virtual Appliance (NVA) using subnet-specific route tables with User Defined Routes (UDR). This chapter introduces how to route north-south traffic between the Internet and your Azure Virtual Network through the NVA.

Figure 3-1 depicts our VNet setup, which includes DMZ and Web Tier zones. The NVA, vm-nva-fw, is connected to subnet snet-north (10.0.2.0/24) in the DMZ via a vNIC with Direct IP (DIP) 10.0.2.4. We've also assigned a public IP address, 51.12.90.63, to this vNIC. The second vNIC is connected to subnet snet-west (10.0.0.0/24) in the Web Tier, with DIP 10.0.0.5. We have enabled IP Forwarding in both vNICs and Linux kernel. We are using Network Security Groups (NSGs) for filtering north-south traffic.

Our web server, vm-west, has a vNIC with DIP 10.0.0.4 that is connected to the subnet snet-west in the Web Tier. We have associated the route table to the subnet with the UDR, which forwards traffic to destination IP 141.192.166.81 (remote host) to NVA. To publish the web server to the internet, we've used the public IP of NVA. 

On the NVA, we have configured a Destination NAT rule that rewrites the destination IP address to 10.0.0.4 for packets with the source IP address 141.192.166.81 and protocol ICMP. To simulate an HTTP connection, we're using ICMP requests from a remote host.


Figure 3-1: Example Diagram.

Continue reading

Routing in Azure Subnets

Introduction

Subnets, aka Virtual Local Area Networks (VLANs) in traditional networking, are Layer-2 broadcast domains that enable attached workloads to communicate without crossing a Layer-3 boundary, the subnet Gateway. Hosts sharing the same subnet resolve each other’s MAC-IP address binding using Address Resolution Protocol, which relies on Broadcast messages. That is why we often use the Failure domain definition with subnets. We can spread subnets between physical devices over Layer-2 links using VLAN tagging, defined in the IEEE 802.1Q standard. Besides, tunnel encapsulation solutions supporting a tenant/context identifier enable us to extend subnets over Layer-3 infrastructure. Virtual eXtensible LAN (VXLAN) using VXLAN Network Identifier (VNI) and Network Virtualization using Generic Routing Encapsulation (NVGRE) using Tenant Network ID (TNI) are examples of Network Virtualization Over Layer 3 (NVO3) solutions. If you have to spread the subnet over an MPLS-enabled network, you can choose to implement Virtual Private LAN Service (VPLS) or Virtual Private Wire Service (VPWS), among other solutions.
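To show where the tenant identifier lives in VXLAN, here is a small Python sketch that packs and parses the 8-byte VXLAN header: a flags byte with the "VNI valid" bit set, reserved bits, and a 24-bit VNI. The VNI value 10000 and the function names are chosen only for illustration.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the I flag: VNI field is valid


def build_vxlan_header(vni: int) -> bytes:
    """Return the 8-byte VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits).
    # Word 2: VNI (24 bits) + reserved (8 bits).
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Extract the VNI from the first 8 bytes of a VXLAN header."""
    _, second_word = struct.unpack("!II", header[:8])
    return second_word >> 8


hdr = build_vxlan_header(10000)
print(hdr.hex())       # 0800000000271000
print(parse_vni(hdr))  # 10000
```

With 24 bits, the VNI space allows roughly 16 million segments, compared with the 4094 usable VLAN IDs of 802.1Q.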

In Azure, the concept of a subnet is different. You can think about it as a logical domain within a Virtual Network (VNet), where attached VMs share the same IP address space and use the same shared routing policies. Broadcast and Multicast traffic is not natively supported in Azure VNet. However, you can use a cloudSwXtch VM image from swXtch.io to build a Multicast-enabled overlay network within VNet. 

Default Routing in Virtual Network

This section demonstrates how the routing between subnets within the same Virtual Network (VNet) works by default. Figure 2-1 illustrates our example Azure VNet setup where we have deployed two subnets. The interface eth0 of vm-west and interface eth1 of vm-nva-fw are attached to subnet snet-west (10.0.0.0/24), while interface eth2 of vm-nva-fw and interface eth0 of vm-east are connected to subnet snet-east (10.0.1.0/24). All three VMs use the VNet default routing policy, which routes Intra-VNet data flows directly between the source and destination endpoint, regardless of which subnets they are connected to. Besides, the Network Security Groups (NSGs) associated with vNICs share the same default security policies, which allow inbound and outbound Intra-VNet data flows, InBound flows from the Load Balancer, and OutBound Internet connections.

Now let’s look at what happens when vm-west (DIP: 10.0.0.4) pings vm-east (DIP: 10.0.1.4), recapping the operation of VFP. Note that Accelerated Networking (AccelNet) is not enabled on either VM.

  1. The VM vm-west sends an ICMP Request message to vm-east. The packet arrives at the Virtual Filtering Platform (VFP) for processing. Since this is the first packet of the flow, the Flow Identifier and associated Actions are not in the Unified Flow Table (UFT). The Parser component extracts the 5-tuple header information (source IP, source port, destination IP, destination port, and transport protocol) as metadata from the original packet. The metadata is then processed in each VFP layer to generate a flow-based entry in the UFT.
  2. The destination IP address matches the Network Security Group's (NSG) default outbound rule, which allows Intra-VNet flows. Then the metadata is passed on to the routing process. Since we haven't yet deployed subnet-specific route tables, the result of the next-hop route lookup is 3.3.3.3, the Provider Address (PA) of Host-C.
  3. Intra-VNet connections use private IP addresses (DIP, Direct IP), so the VFP process bypasses the NAT layer. The VNet layer, responsible for encapsulation/decapsulation, constructs the tunnel headers (IP/UDP/VXLAN). It creates the outer IP header with the source IP 1.1.1.1 (Host-A) and destination IP 3.3.3.3 (Host-C), resolved by the Routing layer. Besides, it adds the Virtual Network Identifier (VNI) into the VXLAN header.
  4. After each layer has processed the metadata, the result is encoded into the Unified Flow Table (UFT) as a Flow-Id with a push action (Encapsulation).
  5. The Header Transposition engine (HT) modifies the original packet based on the UFT actions. It adds tunnel headers leaving all original header information intact. Finally, the modified packet is transmitted to the upstream switch. The subsequent packets are forwarded based on the UFT.
  6. The Azure switching infra forwards the packet based on the destination IP address on the outer IP header (tunnel header).
  7. The VFP on Host-C processes the ingress ICMP Request message in the same manner as VFP in Host-A but in reversed order starting with decapsulation in the VNet layer.
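The first-packet versus subsequent-packet behavior in the steps above can be sketched as a flow cache in Python. This is a heavily simplified model, not Azure's implementation: the layer processing is collapsed into one function, the VNI value is hypothetical, and ICMP is given port 0 so that it fits the 5-tuple key.

```python
# DIP -> Provider Address of the hosting server (Host-A and Host-C from the text).
PROVIDER_ADDRESS = {"10.0.0.4": "1.1.1.1", "10.0.1.4": "3.3.3.3"}
VNI = 5000  # hypothetical Virtual Network Identifier

unified_flow_table = {}  # flow 5-tuple -> cached actions


def process_layers(flow: tuple) -> dict:
    """Slow path: derive the actions for a new flow.

    The NSG layer is modeled as always allowing Intra-VNet traffic, the NAT
    layer is bypassed, and the Routing and VNet layers resolve the outer
    addresses used for encapsulation.
    """
    src_ip, _src_port, dst_ip, _dst_port, _protocol = flow
    return {
        "action": "encap",
        "outer_src": PROVIDER_ADDRESS[src_ip],
        "outer_dst": PROVIDER_ADDRESS[dst_ip],
        "vni": VNI,
    }


def forward(flow: tuple):
    """Return the actions for a packet and which path produced them."""
    if flow in unified_flow_table:
        return unified_flow_table[flow], "fast-path"
    actions = process_layers(flow)
    unified_flow_table[flow] = actions  # later packets skip the layers
    return actions, "slow-path"


icmp_flow = ("10.0.0.4", 0, "10.0.1.4", 0, "icmp")
print(forward(icmp_flow))  # first packet: processed by the layers (slow-path)
print(forward(icmp_flow))  # next packet: served from the flow table (fast-path)
```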

Figure 2-1: Example Topology Diagram.
Continue reading