Author Archives: packetexpert

Juniper IP-CLOS (EVPN-VxLAN) Data Center – Design Options and Config Guide

1        Overview

IP-CLOS provides a scalable option for large-scale Data Centers built for hosting providers or the Infrastructure as a Service (IaaS) model. The IP-CLOS model consists of spine and leaf layer switches, where the leaf layer switches provide direct connectivity to Bare Metal Servers (BMS), hypervisor-based servers or other network devices (e.g. firewalls, load balancers) for the services layer. Each leaf device is connected to all spine devices through high-speed links; connectivity between spine and leaf is IP based, thus offering ECMP (equal cost multipath) load balancing over the IP links.

The question arises: why do we need an IP-CLOS based Data Center? The main and primary reason is to remove the upper limit on the maximum number of VLANs. In a switching-based Data Center, whether traditional 3-tier (Core, Distribution & Access) or modern (spine-and-leaf switching fabric or flat switching fabric, e.g. Juniper Virtual Chassis Fabric and Juniper QFabric), we still have an upper limit of 4096 VLANs available inside a single Data Center. In an IP-CLOS based Data Center, VLAN values are not significant; once traffic is received on the leaf layer from servers/external network devices it is converted into VxLAN packets and will be identified by Continue reading
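To illustrate the leaf-side building blocks of such a design, a minimal Junos sketch is shown below; the VLAN ID, VNI, loopback address, route distinguisher and route target are assumed values for illustration only, not taken from the design guide.

# Map a local VLAN to a VXLAN VNI (assumed values)
set vlans VLAN100 vlan-id 100
set vlans VLAN100 vxlan vni 10100

# VTEP source and EVPN route parameters (assumed loopback/RD/RT)
set switch-options vtep-source-interface lo0.0
set switch-options route-distinguisher 10.0.0.11:1
set switch-options vrf-target target:65000:1

# Advertise the configured VNIs via EVPN with VXLAN encapsulation
set protocols evpn encapsulation vxlan
set protocols evpn extended-vni-list all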

Multistage MC-LAG in Data Center

1       Executive Summary

Compute virtualization and converged infrastructure have introduced tremendous changes in Data Center networks. Traditional network design (Core, Aggregation and Access layers), coupled with spanning tree protocol for management of layer 2 loops, simply could not meet the requirements of virtual machine mobility and the elephant flows generated by modern applications. All major network vendors have collaborated and brought new technologies to solve modern-day Data Center challenges. 3-tier traditional networks are being replaced with flat switching fabrics or scalable IP fabrics.

2       Multi-Chassis LAG, A Solution

Multi-Chassis Link Aggregation Group is another solution, besides switching fabric and IP fabric, where access devices or servers can have active-active connectivity and traffic load sharing over links connected to two different network devices. The basic idea is to remove the effects of spanning tree protocol and offer an active-active topology with redundancy for safe link and device fail-over.

In this solution paper, we will discuss how to design a Data Center network for a small to medium organization with a collapsed core architecture (Core and Aggregation layers combined into a single layer), with active-active multi-homing between servers and access layer switches and active-active multi-homing between access and core layer network devices, thus completely removing spanning Continue reading
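As a flavour of the access side of such a design, a minimal sketch of the server-facing MC-LAG interface on one peer is shown below; the interface name, mc-ae-id, chassis-id and LACP system-id are assumptions for illustration (the ICCP/ICL peering between the two devices is shown in the MC-LAG post further down this page).

# Server-facing LAG, one leg on each MC-LAG peer (assumed interface name)
set interfaces xe-0/0/10 ether-options 802.3ad ae0
set interfaces ae0 aggregated-ether-options lacp active
set interfaces ae0 aggregated-ether-options lacp system-id 00:00:00:00:00:01

# mc-ae parameters; chassis-id differs on the second peer (0 vs 1)
set interfaces ae0 aggregated-ether-options mc-ae mc-ae-id 1
set interfaces ae0 aggregated-ether-options mc-ae chassis-id 0
set interfaces ae0 aggregated-ether-options mc-ae mode active-active
set interfaces ae0 aggregated-ether-options mc-ae status-control active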

Starting SDN Learning Journey- Through Open vSwitch

[Image: openvswitch.png]

Introduction

Software Defined Networking (SDN) is no longer a new topic, but many network/system engineers still find it hard to work out how to start learning SDN. Many SDN solutions exist in the market and each has its pros and cons. The objective of this blog is to give an idea of SDN basics to engineers who want to start their SDN learning journey.

Reference topology

  • 2 x Ubuntu hosts (14.04 LTS), each with multiple NICs
  • Open vSwitch installed on each host, with one bridge instance created.
  • VirtualBox installed on each host; VirtualBox will be used to host the guest virtual machines (VM-A & VM-B)

Topology Description

The Open vSwitch instance (e.g. br0) on each host will have the following interfaces (a command sketch follows the list):-

  • A tap interface which will be used to bind the guest VM to Open vSwitch
  • Eth1 of each host will be added to the Open vSwitch
  • The IP address/subnet mask of Eth1 on each host will be configured on the Open vSwitch bridge itself (br0)
  • Guest VM eth1 will be configured with an IP/subnet mask different from the host IP/subnet mask
  • VXLAN/GRE tunnels will be configured on each host (using the host IP addresses)
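A rough command sketch for one host is shown below, following the topology above; the bridge, tap and tunnel names plus the IP addresses are assumptions, and the second host mirrors this with its own addresses.

# Create the bridge and attach the physical and tap interfaces
ovs-vsctl add-br br0
ovs-vsctl add-port br0 eth1
ovs-vsctl add-port br0 tap0

# Move the host IP from eth1 to the bridge itself (assumed addressing)
ip addr flush dev eth1
ip addr add 192.168.100.1/24 dev br0
ip link set br0 up

# VXLAN tunnel towards the other host (use type=gre for a GRE tunnel instead)
ovs-vsctl add-port br0 vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=192.168.100.2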

Step by Step setting up Lab

It is assumed Ubuntu 14. Continue reading

Contrail Integration with Bare Metal Devices via EVPN-VxLAN

In this blog we will discuss how to integrate bare metal devices with Juniper Contrail (SDN Controller) by using EVPN-VXLAN.

My earlier blogs on Contrail can be viewed at these links: Blog-1, Blog-2.

Reference Topology  

[Figure: evpn-vxlan]

Problem statement: “A guest VM spawned inside the SDN environment needs to communicate with a Bare Metal Device (same subnet or different subnet; here we will discuss the former use case only).”

Solution: “An EVPN-based control plane will be established between the MX router and the Contrail controller to exchange ARP entries between them, and a VxLAN-based forwarding plane will be configured for communication between guest VMs and Bare Metal Devices.”

Solution components (a sketch of the MX-side configuration follows the list):-

  1. Contrail GUI
    • The RED network 2.2.2.0/ is configured and VMs are spawned using the OpenStack “Horizon” web GUI (not covered in this article)
    • Configure VxLAN as the first encapsulation method under “Encapsulation Priority Order”: go to Configure, then Infrastructure, then Global Config, and click the edit button.
    • Select the VxLAN Identifier Mode as “user configured”
    • Configure the VxLAN ID & route target community for the desired network
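On the MX router side, the gateway configuration might be sketched roughly as below; the instance, VLAN, VNI, route target, loopback and neighbor values are assumptions for illustration and must match what was entered in the Contrail GUI above.

# EVPN-VXLAN virtual switch on the MX (assumed names and values)
set routing-instances RED instance-type virtual-switch
set routing-instances RED vtep-source-interface lo0.0
set routing-instances RED route-distinguisher 10.0.0.1:100
set routing-instances RED vrf-target target:64512:100
set routing-instances RED protocols evpn encapsulation vxlan
set routing-instances RED protocols evpn extended-vni-list 100
set routing-instances RED bridge-domains BD-RED vlan-id 100
set routing-instances RED bridge-domains BD-RED vxlan vni 100

# MP-iBGP session towards the Contrail control node, carrying EVPN routes
set protocols bgp group contrail type internal
set protocols bgp group contrail local-address 10.0.0.1
set protocols bgp group contrail family evpn signaling
set protocols bgp group contrail neighbor 10.10.10.10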

Deep Dive- Contrail Data Center Interconnect

In the previous blog we discussed Juniper Contrail Data Center Interconnect at a high level and how to connect physical servers with servers deployed inside the SDN environment. In this blog we will take a deep dive into both scenarios, discussing in detail the configuration options, control plane and data plane operations involved in both options:-

[Figure: picture1]

The following components are included in the reference topology:-

  1. 1 x MX-5, which will be configured as the Data Center Edge Router
  2. Contrail Control Node
  3. Compute 51 (which has 1 x vRouter)
  4. Compute 52 (which has 1 x vRouter)
  5. MP-iBGP will be configured by the Contrail Control Node between itself and all vRouters.
  6. The Contrail node will act as Route Reflector (RR) and all vRouters will act as clients to the RR.
  7. Each vRouter will establish GRE tunnels (for data plane forwarding) with all other vRouters.
  8. The MX-5 (Data Center Edge Router) will also establish MP-iBGP peering with the Contrail Control node and will establish GRE tunnels with all vRouters.

Now, if we recall the iBGP route advertisement rules and correlate them with our environment (a sketch of the MX-side configuration follows this list):-

  1. All vRouters, which are RR clients, will advertise routes only to the RR.
  2. The RR will receive routes from any of the clients and will advertise the received routes to all clients (except the vRouter from where the Continue reading
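On the MX-5, the integration with Contrail might be sketched as below; the addresses, AS number and compute subnet are assumptions for illustration. The MP-iBGP session towards the Contrail control node carries inet-vpn (L3VPN) routes, while dynamic GRE tunnels towards the vRouters provide the data plane.

# MP-iBGP peering with the Contrail control node (assumed addresses/AS)
set routing-options autonomous-system 64512
set protocols bgp group contrail type internal
set protocols bgp group contrail local-address 10.0.0.1
set protocols bgp group contrail family inet-vpn unicast
set protocols bgp group contrail neighbor 10.10.10.10

# Dynamic GRE tunnels towards the compute/vRouter subnet for forwarding
set routing-options dynamic-tunnels contrail-gre source-address 10.0.0.1
set routing-options dynamic-tunnels contrail-gre gre
set routing-options dynamic-tunnels contrail-gre destination-networks 10.10.10.0/24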

Data Center Interconnect for Juniper Contrail (SDN Controller)

 

Juniper Contrail is a Software Defined Networking (SDN) controller which automates network provisioning in a virtual Data Center. In a traditional server hypervisor environment there is still a need to configure and allow VLANs on the Data Center switch ports connected to the servers, which involves inordinate delays due to lengthy “Change Process” approvals and dependencies on many teams. Modern Data Centers cannot afford such delays in service provisioning, as a delay in service provisioning means lost revenue.

The scope of this blog is to discuss:-

  1. How physical servers can talk with servers deployed inside the SDN environment.
  2. Layer 2 & Layer 3 Data Center Interconnect (DCI) solutions between two enterprise Data Centers (DCs)

[Figure: contrail]

The above diagram shows the architecture of Contrail. A quick overview of Contrail's inner workings is given below; please follow this link for in-depth reading on Contrail (http://www.opencontrail.org/opencontrail-architecture-documentation/):

  1. The Contrail control node acts as the central brain.
  2. Contrail installs an instance of a vRouter on each compute node.
  3. Each vRouter on a compute node creates a separate VRF (Virtual Routing and Forwarding table) for each particular subnet for which Virtual Machines are created.
  4. Full-mesh MP-iBGP is configured by Contrail between all vRouters, and overlay tunnels (MPLS over GRE, MPLS over UDP or VXLAN) are used to Continue reading

Blade Chassis to End of Row Switches Connectivity & High Availability Options

A Spanning Tree Protocol (STP) free network inside the Data Centre is a main focus for network vendors, and many technologies have been introduced in the recent past to resolve STP issues in the data centre and ensure optimal link utilization. The advent of switching modules inside blade enclosures, coupled with the requirement for optimal link utilization starting right from the blade server, has made today's Data Centre network more complex.

In this blog, we will discuss how the traditional model of network switch placement (End of Row) can be coupled with blade chassis, along with the different options available for end-to-end connectivity/high availability.

Network switches are placed at the End of Row, and in order to remove STP, Multi-Chassis Link Aggregation (MC-LAG) is deployed. Please see one of my earlier blogs for an understanding of MC-LAG.

Option 1: Rack-mounted servers are used as compute machines; the servers have multiple NICs installed in a pass-through module, and the Virtual Machines hosted inside the servers require active/active NIC teaming.

[Figure: picture5]

Option 2: The blade chassis has multiple blade servers and each blade server has more than one NIC (connected to the blade chassis switches through internal fabric links). Virtual Machines hosted inside the blade servers require active/active NIC teaming.

[Figure: picture6]

Option 3 : Blade Chassis Continue reading

Packet Walk Through-Part 1

The objective of this blog is to discuss an end-to-end packet (client to server) traversing a service provider network, with special consideration of performance-affecting factors.

 

[Figure: screenshot]

 

 

We will suppose the client needs to access a service hosted on the server connected to CE-2; all the network links and NICs on the end systems are Ethernet based. Under normal circumstances, almost all vendors' compute machines (PCs/servers) generate IP datagrams of 1500 bytes (20-byte header + 1480 data bytes).

[Figure: ip]

Fragmentation:- If any link is unable to handle a 1500-byte IP datagram, the packet will be fragmented and forwarded to its destination, where it will be re-assembled. Fragmentation and re-assembly introduce overhead, and overall performance will definitely be degraded. In the IP header, the following fields are important for detecting fragmentation and re-assembling the packets (a worked example follows the list):-

  • Identification:- Carries the same value in all fragments of a datagram, so the receiver knows which fragments belong together
  • Flags – 3 bits. Bit 0 is always 0; bit 1 is DF (fragmentation allowed or not, 0 and 1 respectively); bit 2 is MF (more fragments expected or last fragment, 1 and 0 respectively)
  • Fragment Offset:- Indicates, in 8-byte units, where the data of this fragment sits within the original payload (after removal of the IP header), so the packet can be re-assembled
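As a quick worked example (the downstream link MTU is an assumed value): suppose the 1500-byte datagram above (20-byte header + 1480 data bytes) must cross a link with an MTU of 1400 bytes and DF is not set.

Maximum data per fragment = 1400 - 20 = 1380 bytes, rounded down to a multiple of 8 = 1376 bytes
Fragment 1: total length 1396 (20 + 1376), Fragment Offset 0, MF = 1
Fragment 2: total length 124 (20 + 104), Fragment Offset 172 (1376 / 8), MF = 0
Both fragments carry the same Identification value, and the receiver re-assembles 1376 + 104 = 1480 data bytes.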

With below Continue reading

Junos MTU Handling on Access & Trunk Ports

MTU is a most important aspect for the proper functionality of any application. In this blog post I will highlight MTU handling by Junos based devices for 802.3 untagged and 802.1Q tagged packets.

[Figure: 802-3]

A simple 802.3 packet header is shown above; the total packet size is 1514 bytes (14-byte header + 1500 bytes maximum payload). Now we will see how Junos based devices handle MTU on access ports.

 

  • LAB> show interfaces xe-1/0/32
    Physical interface: xe-1/0/32, Enabled, Physical link is Up

    Link-level type: Ethernet, MTU: 1514, MRU: 0, Link-mode: Auto, Speed: Auto, BPDU Error: None, MAC-REWRITE Error: None, Loopback: Disabled, Source filtering: Disabled, Flow control: Disabled, Auto-negotiation: Disabled,
    ———-output omitted for brevity——————–
    Protocol eth-switch, MTU: 1514

  • LAB> monitor traffic interface xe-1/0/32 no-resolve layer2-headers print-hex
    02:09:00.266841 Out 00:31:46:52:dd:80 > 00:1e:0b:d3:1d:1a, ethertype 802.1Q (0x8100), length 1486: vlan 243, p 0, ethertype IPv4, truncated-ip – 32 bytes missing!
    (tos 0x0, ttl 64, id 49385, offset 0, flags [DF], proto: ICMP (1), length: 1500)
    192.168.243.1 > 192.168.243.52: ICMP echo reply, id 29316, seq 5, length 1480

 

  • As we can see, the access interface “xe-1/0/32” shows an MTU of 1514, but when we monitor traffic on Continue reading
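To carry full-size 1500-byte IP packets inside 802.1Q tagged frames, the interface MTU has to account for the extra 4-byte VLAN tag. A minimal sketch is shown below, reusing the interface name from the output above; the exact value depends on the payload sizes that must be carried.

# 1514 bytes (untagged Ethernet frame) + 4 bytes for the 802.1Q tag
set interfaces xe-1/0/32 mtu 1518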

Juniper QFX 5100 & VMware ESXi Host NIC Teaming – Design Considerations

The objective of this article is to highlight design considerations for NIC teaming between a Juniper QFX 5100 (Virtual Chassis – VC) and a VMware ESXi host.

Reference topology is as under:-

We have 2 x Juniper QFX 5100-48S switches which are deployed as a VC in order to provide connectivity to compute machines. All compute machines are running the VMware ESXi hypervisor. A Link Aggregation Group (LAG, or active/active NIC teaming) is required between the compute machines and the QFX 5100 VC.

  • Data traffic from server to switch – the xe-0/0/0 interface on both switches is connected to NIC 3 & 4 on a single compute machine.
  • ESXi host management and vMotion traffic from server to switch – the xe-0/0/45 interface on both switches is connected to NIC 1 & 2 ports on the compute machine.
  • VLAN IDs
    • Data VLANs – 116, 126
    • vMotion – 12
    • ESXi Management – 11

Hence, the requirement is to configure a LAG (active/active NIC teaming) between the compute machines and the network switch for optimal link utilization, in addition to fault tolerance in case one physical link between the network switch and a compute machine goes down.

In order to achieve the required results, one needs to understand the default load balancing mechanism over LAG member interfaces in Juniper devices, and the same load balancing mechanism must be configured on VMware ESXi Continue reading
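For reference, the switch side of such a LAG might be sketched as below; the AE number and the second member's interface numbering are assumptions, while the VLAN IDs follow the topology above. Note that "lacp active" presumes LACP support on the ESXi side (vSphere Distributed Switch); with a standard vSwitch and the "Route based on IP hash" teaming policy, that line would be omitted to form a static LAG.

# Allow aggregated interfaces on the Virtual Chassis
set chassis aggregated-devices ethernet device-count 2

# One member leg from each VC member towards NIC 3 & 4 of the host
set interfaces xe-0/0/0 ether-options 802.3ad ae0
set interfaces xe-1/0/0 ether-options 802.3ad ae0

# LAG towards the ESXi host carrying the data VLANs
set interfaces ae0 aggregated-ether-options lacp active
set interfaces ae0 unit 0 family ethernet-switching interface-mode trunk
set interfaces ae0 unit 0 family ethernet-switching vlan members [ 116 126 ]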

Integrating SRX in Svc Provider Network (Routing and Multi-tenancy Considerations)

Service Provider networks always have complex requirements for multi-tenancy, routing & security, and pose challenges to network architects. In this blog I will write about SRX integration in a Service Provider network, highlighting methodologies for handling the challenges of implementing security features with multi-tenancy and routing considerations.

[Figure: srx-in-sp]

REFERENCE TOPOLOGY

Devices have been classified into the following segments based on their role (a multi-tenancy configuration sketch follows the list):-

  • Remote Customer Network (consists of customer PCs connected to the Provider Edge through the Customer Edge).
  • Provider Network (consists of Provider Edge routers and Provider Backbone routers).
  • Data Center Network (consists of the Internet firewall and the servers inside the Data Center directly connected to the Internet firewall).
  • Internet Edge (consists of the Internet router connected to the Internet firewall, hence providing internet access to the customer networks connected to the Data Center through the provider network).
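To give a flavour of the multi-tenancy approach, a minimal sketch of a per-customer routing instance with its own security zone on the SRX is shown below; the instance, zone, interface and policy names are hypothetical.

# Per-customer routing instance for route separation (hypothetical names)
set routing-instances CUST1-VR instance-type virtual-router
set routing-instances CUST1-VR interface ge-0/0/1.100

# Dedicated security zone for the customer-facing interface
set security zones security-zone CUST1-ZONE interfaces ge-0/0/1.100 host-inbound-traffic system-services ping

# Example policy: allow the customer zone to reach the Data Center zone
set security policies from-zone CUST1-ZONE to-zone DC-ZONE policy CUST1-TO-DC match source-address any destination-address any application any
set security policies from-zone CUST1-ZONE to-zone DC-ZONE policy CUST1-TO-DC then permit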

Traffic flow and security requirements are as under:-

  • Customer 1 Network (PC-1) requires access to Server-1 installed in the Data Center and to a public DNS server reachable via the Internet Edge router.
  • Continue reading

Multi-Chassis-Link Aggregation (MC-LAG)

In my earlier blog (Junos High Availability Design Guide) I discussed how to make use of redundant Routing Engines by configuring features like GRES, NSR and NSB to reduce downtime to the minimum possible level.

The real problem is that only one RE is active at a time and all PFEs must be connected to the active RE. In case of failure of the primary Routing Engine (RE), the backup RE takes over and all PFEs now need to connect to the new primary RE. This scenario can cause momentary disruption of services.

MC-LAG (active-active) is a correct solution to the problem described above, as it offers two active REs in two different devices/chassis. Important concepts for proper MC-LAG configuration/functionality are as under (a configuration sketch follows):-

  • Inter Chassis Control Protocol. The MC-LAG peers use the Inter-Chassis Control Protocol (ICCP) to exchange control information and coordinate with each other to ensure that data traffic is forwarded properly. ICCP replicates control traffic and forwarding states across the MC-LAG peers and communicates the operational state of the MC-LAG members. It uses TCP as a transport protocol and requires Bidirectional Forwarding Detection (BFD) for fast convergence. Because ICCP uses TCP/IP to communicate between the peers, the two peers must be connected to Continue reading
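A minimal sketch of the ICCP peering and the multi-chassis protection (ICL) link on one peer is shown below; the local/peer addresses and the ICL interface are assumptions, and the second peer mirrors this configuration with the addresses swapped.

# ICCP peering between the two MC-LAG devices (assumed addresses)
set protocols iccp local-ip-addr 10.1.1.1
set protocols iccp peer 10.1.1.2 session-establishment-hold-time 50
set protocols iccp peer 10.1.1.2 liveness-detection minimum-interval 1000

# Inter-Chassis Link (ICL) protecting the MC-AE interfaces
set multi-chassis multi-chassis-protection 10.1.1.2 interface ae1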

Junos High Availability Design Guide

High availability is one of the important considerations during the network design and deployment stages, and almost all network vendors support various high availability features.

The objective of this article is to describe Junos best practices required to achieve minimum downtime in case of fail-over scenarios.

The Routing Engine, or control plane, is the brain of a Junos based device, running and executing all the management functions. Most Junos based devices offer redundant Routing Engines (either by default or through explicit configuration, e.g. a virtual chassis). Only one Routing Engine can be active at a time (with the exception of active-active MC-LAG, which is beyond the scope of this blog). The mere presence of a second Routing Engine in a Junos device will not add any advantage with respect to high availability unless certain features are configured.

  • Graceful Routing Engine Switchover (GRES). GRES enables synchronization of kernel and chassis daemon state between the master Routing Engine and the backup Routing Engine; in case of failure of the master Routing Engine, the Packet Forwarding Engine (PFE) will simply reconnect to the new master Routing Engine (which was the backup Routing Engine before the fail-over).

Preparing for a Graceful Routing Engine Switchover

 

Graceful Routing Engine Switchover Process

GRES can be configured with the following configuration command:-

set chassis redundancy graceful-switchover
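GRES is usually paired with the other features mentioned above. A combined sketch, assuming redundant Routing Engines and Layer 2 protocols in use, might be:

# Keep kernel/chassis state in sync and switch over gracefully (GRES)
set chassis redundancy graceful-switchover

# Preserve Layer 3 routing protocol state across a switchover (NSR)
set routing-options nonstop-routing
set system commit synchronize

# Preserve Layer 2 protocol state, e.g. spanning tree (NSB)
set protocols layer2-control nonstop-bridging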

Effects of Continue reading