Author Archives: Amit Aneja

Next-Generation Reference Design Guide for NSX-T

The NSX-T 2.5 release marks a milestone for NSX-T, as announced at VMworld 2019 by SVP Umesh Mahajan. 2019 has been a year of phenomenal growth for VMware's NSX-T, with wide adoption by enterprises across several verticals. In 2019, we introduced two ground-breaking releases: NSX-T 2.4 and NSX-T 2.5. With these two releases, NSX-T has matured into an enterprise-ready system, becoming the de facto enterprise software-defined networking (SDN) platform of choice.

To support our customers in their network and security virtualization journey, we introduced the NSX-T design guide with the NSX-T 2.0 release, providing guidance on how customers should design their data centers with NSX-T.

Today, we are excited to announce the next version of the NSX-T design guide, based on the generally available NSX-T 2.5 release. It is a foundational overhaul of the design guidance and leading best practices. There have been numerous L2-L7 feature additions and platform enhancements since NSX-T 2.0. This design guide covers the functional aspects of these enhancements and provides design guidance for them.

What readers can expect in the new NSX-T Design Guide:

Introducing IPv6 in NSX-T Data Center 2.4

With the latest release, VMware NSX-T Data Center 2.4, we announced support for IPv6. Since the exhaustion of the IPv4 address space, IPv6 adoption has continued to increase around the world. A quick look at the Google IPv6 adoption statistics shows that IPv6 adoption is ramping up. With advances in the IoT space and the explosion in the number of endpoints (mobile devices), this adoption will continue to grow. IPv6 increases the address length from IPv4's 32 bits to 128 bits, providing more than enough globally unique IP addresses for global end-to-end reachability. Several government agencies mandate the use of IPv6. In addition, IPv6 provides operational simplification.

The NSX-T Data Center 2.4 release introduces dual stack support for the interfaces on a logical router (now referred to as a Gateway). You can now leverage all the goodness of distributed routing and the distributed firewall in a single-tier or multi-tiered topology. If you are wondering what dual stack is: it is the capability of a device to simultaneously originate and understand both IPv4 and IPv6 packets. In this blog, I will discuss the IPv6 features that are made generally available Continue reading

Flexible deployment options for NSX-T Data Center Edge VM

Each datacenter is unique and is designed to serve specific business needs. To serve these needs, you could have a small or a large ESXi/KVM footprint. NSX-T Data Center can be leveraged to provide networking and security benefits regardless of the size of your datacenter. This blog focuses on a critical infrastructure component of NSX-T Data Center: the NSX-T Edge node. Refer to my previous blogs, where I discussed how the centralized components of a logical router are hosted on Edge nodes, which also provide centralized services like N-S routing, NAT, DHCP, load balancing, and VPN. To consume these services, traffic from compute nodes must go through the Edge node.

These NSX-T Edge nodes can be hosted in a dedicated Edge cluster or a collapsed Management and Edge cluster, as discussed in the NSX-T Reference Design Guide. In small datacenter topologies, NSX-T Edge nodes can also be hosted in a compute cluster, making it a collapsed Compute and Edge cluster design. Please refer to the NSX-T Reference Design Guide to understand the pros and cons of using a dedicated cluster vs. a shared cluster.

In this blog, I will cover various deployment options for the NSX-T Edge VM form factor Continue reading

NSX-T: Multi-Tiered Routing Architecture

Multi-tenancy exists in some shape or form in almost every network. For an Enterprise network, it can be the separation of tenants based on different business units, departments, different security/network policies or compliance requirements. For a service provider, multi-tenancy can simply be separation of different customers (tenants).

Multi-tenancy doesn’t just allow separation of tenants, but also provides control boundaries as to who controls what. For instance, tenant administrators can control/configure the network and security policies for their specific tenants and a service provider administrator can either provide a shared service or provide inter-tenant or WAN connectivity.

In the logical routing world of NSX-T, this provider function can provide connectivity between the tenant logical networks and the physical infrastructure. It can also provide inter-tenant communication or shared services (like NAT, load balancer, etc.) to the tenants.

In my previous post, NSX-T: Routing where you need it (Part 1), I discussed how NSX-T provides optimized E-W distributed routing and N-S centralized routing. In addition to that, NSX-T supports a multi-tiered routing model with logical separation between provider router functions and tenant routing functions. The concept of multi-tenancy is built into the routing model. The top-tier logical router is referred to Continue reading

NSX-T: Routing where you need it (Part 2, North-South Routing)

In the first part of this blog series, NSX-T: Routing where you need it (Part 1), I discussed how East-West (E-W) routing is completely distributed on NSX-T and how routing is done by the Distributed Router (DR) running as a kernel module in each hypervisor. 

In this post, I will explain how North-South (N-S) routing is done in NSX-T, and we will also look at ECMP topologies. N-S routing is provided by the centralized component of the logical router, also known as the Service Router. Before we get into N-S routing or the packet walk, let's define the Service Router.

Service Router (SR)

Whenever a service that cannot be distributed is enabled on a Logical Router, a Service Router (SR) is instantiated. There are some services in NSX-T today that are not distributed, such as:

1) Connectivity to physical infrastructure
2) NAT
3) DHCP server
4) MetaData Proxy
5) Edge Firewall
6) Load Balancer

Let's take a look at one of these services (connectivity to physical devices) and see why a centralized routing component makes sense for running it. Connectivity to the physical topology is intended to exchange routing information from the NSX domain to external networks (DC, Campus or Continue reading

NSX-T: Routing where you need it (Part 1)


Network virtualization has come a long way. NSX has played a key role in redefining and modernizing networking in the datacenter. Providing an optimal routing path for traffic has been one of the topmost priorities of network architects. Thanks to NSX distributed routing, routing between different subnets on an ESXi hypervisor can be done in the kernel, and traffic never has to leave the hypervisor. With NSX-T, we take this a step further and extend this network functionality to multi-hypervisor and multi-cloud environments. NSX-T is a platform that provides network and security virtualization for a plethora of compute nodes such as ESXi, KVM, bare metal, public clouds and containers.


This blog series will introduce NSX-T routing and focus primarily on distributed routing. I will explain distributed routing in detail with a packet walk between VMs on the same and on different hypervisors, connectivity to the physical infrastructure, and multi-tenant routing. Let's start with a quick reference to the NSX-T architecture.


NSX-T Architecture

NSX-T has built-in separation of the management plane (NSX-T Manager), control plane (Controllers) and data plane (hypervisors, containers, etc.). I highly recommend going through the NSX-T whitepaper for detailed information on the architecture to understand the components and functionality of each of the planes.


A couple of interesting points that I want to highlight about the architecture:

  • NSX-T Manager is decoupled from vCenter and is designed to run across all these heterogeneous platforms.
  • NSX-T Controllers serve as the central control point for all logical switches within a network and maintain information about hosts and logical switches/routers.
  • NSX-T Manager and NSX-T Controllers can be deployed in a VM form factor on either ESXi or KVM.
  • To provide networking to different types of compute nodes, NSX-T relies on a virtual switch called the "hostswitch". The NSX management plane fully manages the lifecycle of this hostswitch, which is a variant of the VMware virtual switch on ESXi-based endpoints and of Open vSwitch (OVS) on KVM-based endpoints.
  • The data plane stretches across a variety of compute nodes: ESXi, KVM, containers, and NSX-T Edge nodes (the on/off ramp to the physical infrastructure).
  • Each of the compute nodes is a transport node and will have a TEP (Tunnel End Point). Depending on the teaming policy, a host could have one or more TEPs.
  • NSX-T uses GENEVE as the underlying overlay protocol for these TEPs to carry Layer 2 information across Layer 3. GENEVE provides the flexibility of inserting metadata as TLV (Type, Length, Value) fields that can be used for new features. One example of the information carried is the VNI (Virtual Network Identifier). We recommend an MTU of 1600 to account for the encapsulation header. More details on GENEVE can be found in the IETF draft.
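The GENEVE base header and the MTU arithmetic behind the 1600-byte recommendation can be sketched in Python. This is a hand-rolled illustration of the header layout from the IETF GENEVE specification, not NSX-T code; the VNI value matches the one used later in this post.

```python
import struct

def geneve_header(vni: int, opt_len_words: int = 0, proto: int = 0x6558) -> bytes:
    """Build an 8-byte GENEVE base header per the IETF specification.
    vni: 24-bit Virtual Network Identifier.
    opt_len_words: length of the variable TLV options, in 4-byte words.
    proto: inner protocol; 0x6558 = Transparent Ethernet Bridging.
    """
    ver = 0
    byte0 = (ver << 6) | (opt_len_words & 0x3F)   # Ver(2) | OptLen(6)
    byte1 = 0                                      # O and C flags clear here
    vni_field = (vni & 0xFFFFFF) << 8              # VNI in top 24 bits, low byte reserved
    return struct.pack("!BBHI", byte0, byte1, proto, vni_field)

hdr = geneve_header(vni=21386)
print(len(hdr))                                    # → 8 (base header, no options)
print(hdr[4] << 16 | hdr[5] << 8 | hdr[6])         # → 21386 (the VNI read back)

# Why an MTU of ~1600: outer Ethernet(14) + outer IPv4(20) + UDP(8)
# + GENEVE base(8) + inner Ethernet(14) wraps a 1500-byte inner payload.
overhead = 14 + 20 + 8 + 8 + 14
print(1500 + overhead)                             # → 1564, under the 1600 recommendation
```

The TLV option space (OptLen above) is where GENEVE carries extensible metadata beyond the base header.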


Before we dive deep into routing, let me define a few key terms.

A Logical Switch is a broadcast domain that can span multiple compute hypervisors. VMs in the same subnet connect to the same logical switch.


A Logical Router provides North-South and East-West routing between different subnets and has two components: a distributed component that runs as a kernel module in the hypervisor, and a centralized component that takes care of centralized functions like NAT, DHCP and LB, and provides connectivity to the physical infrastructure.

Types of interfaces on a Logical Router

  • Downlink – Interface connecting to a logical switch.
  • Uplink – Interface connecting to the physical infrastructure/physical router.
  • RouterLink – Interface connecting two logical routers.


Edge nodes are appliances with a pool of capacity to run centralized services, and they are the on/off ramp to the physical infrastructure. You can think of an Edge node as an empty container that hosts one or multiple logical routers to provide centralized services and connectivity to physical routers. An Edge node is a transport node, just like a compute node, and also has a TEP IP to terminate overlay tunnels.

Edge nodes are available in two form factors: bare metal or VM (leveraging Intel's DPDK technology).


Moving on, let’s also get familiarized with the topology that I will use throughout this blog series.

I have two hypervisors in the above topology, ESXi and KVM. Both hypervisors have been prepared for NSX and have each been assigned a TEP (Tunnel End Point) IP, ESXi host:, KVM host: These hosts have L3 connectivity between them via the transport network. I have created three logical switches via NSX Manager and have connected a VM to each of the switches. I have also created a logical router named Tenant 1 Router, which is connected to all the logical switches and acts as the gateway for each subnet.

Before we look at the routing table, packet walk, etc., let's look at how the configuration looks in NSX Manager. Here is the switching configuration, showing the three logical switches.

Following is the routing configuration showing the Tenant 1 Logical Router.


Once configured via NSX Manager, the logical switches and routers are pushed to both hosts, ESXi and KVM. Let's validate that on both hosts. The following is the output from ESXi showing the logical switches and router.


The following is the output from the KVM host showing the logical switches and router.


NSX Controller MAC learning and advertisement


Before we look at the packet walk, it is important to understand how remote MAC addresses are learnt by the compute hosts. This is done via the NSX Controllers. As soon as a VM comes up and connects to a logical switch, the local TEP registers the VM's MAC with the NSX Controller. The following output from the NSX Controller shows that the MAC addresses of Web VM1, App VM1 and DB VM1 have been reported by their respective TEPs. The NSX Controller publishes this MAC/TEP association to the compute hosts, depending on the type of host.
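Conceptually, the controller's MAC/TEP table behaves like the toy sketch below: each transport node reports the MACs of its local VMs, and the controller publishes the merged table. The MAC addresses and TEP IPs here are made up for illustration; this is not the controller's actual implementation.

```python
# Toy model of the NSX Controller's MAC/TEP association table.
mac_tep_table: dict[str, str] = {}

def register(tep_ip: str, mac: str) -> None:
    """A transport node reports a locally-learnt VM MAC to the controller."""
    mac_tep_table[mac] = tep_ip

def publish() -> dict[str, str]:
    """The controller pushes the full MAC/TEP association table to the hosts."""
    return dict(mac_tep_table)

register("", "00:50:56:aa:00:01")  # Web VM1 behind the ESXi TEP
register("", "00:50:56:aa:00:02")  # App VM1 behind the ESXi TEP
register("", "00:50:56:bb:00:03")  # DB VM1 behind the KVM TEP

# A host consults the published table: which TEP owns DB VM1's MAC?
print(publish()["00:50:56:bb:00:03"])       # →
```

This is exactly the lookup a host performs before encapsulating a frame destined for a remote VM: MAC in, remote TEP IP out.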

Now, we will look at the communication between VMs on the same hypervisor.


Distributed Routing for VMs hosted on the same Hypervisor


We have Web VM1 and App VM1 hosted on the same ESXi hypervisor. Since we are discussing communication between VMs on the same host, I am showing just the relevant topology below.


 Following is how traffic would go from Web VM1 to App VM1.

  1. Web VM1 ( sends traffic to the gateway, as the destination ( is in a different subnet. This traffic traverses Web-LS and goes to the downlink interface of the local distributed router running as a kernel module on the ESXi host.
  2. A routing lookup happens on the ESXi distributed router, and the subnet is a connected route. The packet gets routed and is put on App-LS.
  3. A destination MAC lookup for the MAC address of App VM1 is needed to forward the frame. Since App VM1 is also hosted on the same ESXi host, the MAC address lookup finds a local MAC entry, as highlighted in the diagram above.
  4. The L3 rewrite is done and the packet is sent to App VM1.

Please note that the packet didn't have to leave the hypervisor to get routed; this routing happened in the kernel. Now that we understand communication between two VMs (in different subnets) on the same hypervisor, let's take a look at the packet walk from Web VM1 ( on ESXi to DB VM1 ( hosted on KVM.


Distributed Routing for VMs hosted on the different Hypervisors (ESXi & KVM)


  1. Web VM1 ( sends traffic to the gateway, as the destination ( is in a different subnet. This traffic traverses Web-LS and goes to the downlink interface of the local distributed router on the ESXi host.
  2. A routing lookup happens on the ESXi distributed router. The packet gets routed and is put on DB-LS. The following output shows the distributed router on the ESXi host and its routing table.

  3. A destination MAC lookup for the MAC address of DB VM1 is needed to forward the frame. The MAC lookup is done, and the MAC address of DB VM1 has been learnt via the remote TEP. Again, this MAC/TEP association table was published by the NSX Controller to the hosts.

  4. The ESXi TEP encapsulates the packet and sends it to the remote TEP with Src IP=, Dst IP=
  5. The packet is received at the remote KVM TEP, where the VNI (21386) is matched. A MAC lookup is done, and the packet is delivered to DB VM1 after removing the encapsulation header.

A quick traceflow validates the above packet walk.

This concludes the routing components part of this blog. In the next post of this series, I will discuss multi-tenant routing and connectivity to the physical infrastructure.

Continue reading
