
Author Archives: Diane Patton

NetQ agent on a host

We all know and love NetQ – it works hand-in-hand with Linux to accelerate data center operations, and customers love how easy it is to install and operate. NetQ can both prevent and find issues in a data center by viewing the entire data center as a whole and providing three different types of services:

  • Preventative: NetQ allows an engineer to check all data center configurations and state in a few steps from any location in the network. The validation can be done on a virtual network using Vagrant with Cumulus VX or, if a virtual environment is not available, during a change outage window. Since NetQ has built-in analyzers that treat the network as a whole, no scripting is required, and the validation is done from one location rather than hop by hop. This can also shorten the outage windows needed for network changes.
  • Proactive: NetQ supplies notifications if something goes wrong in the network by either logging it to a file or integrating with third party applications like Slack, PagerDuty, or Splunk. It can also be filtered to ensure the right Continue reading
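NetQ's check output can be consumed programmatically, which makes the preventative validation above scriptable. Below is a minimal sketch of parsing a check result; the JSON shape and field names here are invented for illustration and do not reflect NetQ's actual schema:

```python
import json

# Hypothetical sample shaped loosely like a BGP validation result;
# the field names are assumptions for illustration only.
sample = json.loads("""
{
  "summary": {"checkedNodeCount": 4, "failedNodeCount": 1},
  "failedNodes": [
    {"hostname": "leaf02", "peer": "swp51", "reason": "Hold Timer Expired"}
  ]
}
""")

def summarize_bgp_check(result):
    """Return (checked, failed, reasons) from a check result dict."""
    checked = result["summary"]["checkedNodeCount"]
    failed = result["summary"]["failedNodeCount"]
    reasons = [f'{n["hostname"]}:{n["peer"]} {n["reason"]}'
               for n in result.get("failedNodes", [])]
    return checked, failed, reasons

checked, failed, reasons = summarize_bgp_check(sample)
print(f"{failed}/{checked} nodes failed BGP validation")
for r in reasons:
    print(" ", r)
```

A check like this could gate a change window: if `failed` is non-zero, the change is rolled back before the window closes.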

NetQ + Kubernetes: bringing container visibility with the leading container orchestrator

Businesses today have to get applications to market faster than ever, but with the same budget or less. Because of this, modern data centers are evolving to support a change in application delivery. To get applications to market faster and increase revenue, applications that were once built as one monolithic entity are being segmented and deployed separately, communicating amongst themselves. These pieces of applications, often referred to as microservices, are frequently deployed as containers. This results in much faster deployment and a quicker update cycle. However, the network teams operating the infrastructure supporting these applications often have no visibility into how their networks are being utilized, and thus make design, operations and troubleshooting decisions blindly. Now, Cumulus NetQ provides this visibility from container deployments all the way to the spine switches and beyond — accelerating operations and providing the crucial information to efficiently design and operate the networks running containers.

Understanding the challenges of container management

This new application design and deployment method using containers makes operating and managing the infrastructure that supports them very challenging. The containers often have to talk with each other within or across data centers, or to the outside world. An Continue reading

Why and how to deploy Voyager

In Part I of this blog series, “What is the open packet optical switch, Voyager?”, we discussed the challenges and remedies for providing additional bandwidth for intra and inter data center connections. DWDM is a powerful technology that provides hundreds of gigabits of bandwidth over hundreds or thousands of kilometers using just a fiber pair. We also reviewed some information about DWDM networks and transponder functionality. Voyager provides all the functionality of Cumulus Linux running on a Broadcom Tomahawk based switch and integrates the transponders into the switch itself, all in 1RU. That makes Voyager the first open, fully integrated box handling DWDM, Layer 2 and Layer 3 in a single rack unit, and an extremely flexible one.

Incorporating routing, switching and DWDM in one node could mean fewer boxes in the network, since DWDM functionality can be incorporated directly into border leafs. Because Voyager runs Cumulus Linux (CL), all CL data center functionality, such as VXLAN routing with EVPN, is also supported. For example, a pair of Voyager nodes can act as centralized VXLAN routers with EVPN, hosting VXLAN VTEPs, running MLAG, and providing long distance DWDM connectivity, all in one box!

Voyager also Continue reading

What is the open packet optical switch, Voyager?

Modern web-scale data centers are thirsty for bandwidth. Popular applications such as video and virtual reality are increasing in demand, causing data centers to require higher and higher bandwidths — both within data centers and between data centers. In this blog post, we will briefly discuss the current challenges in the optics space as well as some of the key technical aspects of Voyager’s DWDM transponder. In part two of this series, we will cover why Voyager is a unique, powerful and robust solution.

The challenges to accommodate longer distances

Within a data center, organizations are adding higher and higher bandwidth ports and connections to accommodate the need for more bandwidth. However, connections spanning longer distances between data centers may be limited and expensive. Therefore, a critical requirement for businesses with this challenge is supporting longer distance spans at higher bandwidths over a small number of fiber pairs.

The optical industry solves the bandwidth problem using Dense Wave Division Multiplexing (DWDM). DWDM allows many separate connections on one fiber pair by sending them over different wavelengths. Although the wavelengths are sent on the same physical fiber, they act as “ships in the night” and don’t interact Continue reading
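The "many connections on one fiber pair" idea can be made concrete with the ITU-T G.694.1 DWDM grid, which anchors channels at 193.1 THz with, for example, 50 GHz spacing. A small sketch of the arithmetic behind those separate wavelengths:

```python
C = 299_792_458.0  # speed of light, m/s

def itu_channel(n, spacing_ghz=50.0):
    """Frequency (THz) and wavelength (nm) of an ITU-T G.694.1 grid
    channel, offset n slots from the 193.1 THz anchor."""
    f_thz = 193.1 + n * spacing_ghz / 1000.0
    wavelength_nm = C / (f_thz * 1e12) * 1e9
    return f_thz, wavelength_nm

# Two adjacent 50 GHz channels: same physical fiber, distinct wavelengths.
for n in (0, 1):
    f, wl = itu_channel(n)
    print(f"channel {n:+d}: {f:.2f} THz ~ {wl:.3f} nm")
```

Each 50 GHz slot is an independent "ship in the night": a transponder tuned to one channel never sees its neighbors.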

CI/CD: What does it mean for the network engineer?

The continuous integration/continuous delivery (CI/CD) process is very popular in the DevOps industry. CI/CD creates a more agile software development environment, which provides benefits including the faster delivery of applications. As a network engineer, are there any aspects of this I can benefit from to improve network operations and achieve the same goal: design and deploy an agile network that provides customers access to those applications as fast as they are deployed? After all, quick, reliable application delivery is only as fast as customers can access it.

This blog post outlines how treating infrastructure as code and implementing a CI/CD workflow can ease the life of a network engineer. It also describes how using Cumulus VX and Cumulus NetQ can simplify this process further.

What does “infrastructure as code” mean?

Generally, it means treating all your network node configurations as code that you manage externally to the nodes. The program identifies each individual node and renders all the configurations for all the nodes in the network in one step. This also means all configuration changes happen in this code, and the code itself accesses the box to deploy the configurations, not the engineer. Configuration deployment can be done Continue reading
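As a minimal sketch of that one-step rendering, the example below produces every node's loopback stanza from a single inventory; the hostnames and addresses are invented for illustration, and a real deployment would use a full-featured tool such as Ansible with Jinja2 templates:

```python
from string import Template

# One template plus one inventory dict: the "code" that fully
# describes the nodes, managed outside the nodes themselves.
LOOPBACK_TMPL = Template(
    "auto lo\n"
    "iface lo inet loopback\n"
    "    address $loopback/32\n"
)

inventory = {
    "leaf01": {"loopback": "10.0.0.11"},
    "leaf02": {"loopback": "10.0.0.12"},
    "spine01": {"loopback": "10.0.0.21"},
}

def render_all(inv):
    """Produce every node's interface stanza in one step."""
    return {host: LOOPBACK_TMPL.substitute(vars_)
            for host, vars_ in inv.items()}

configs = render_all(inventory)
print(configs["leaf01"])
```

A change to the template or inventory regenerates every affected node's configuration at once, which is exactly what makes the CI/CD validation step practical.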

VXLAN routing with EVPN: asymmetric vs. symmetric model

We all know and love EVPN as a control plane for VXLAN tunnels over a layer 3 infrastructure (Need a refresher? Check out our blog post on the topic). EVPN gives us the ability to deploy VXLAN tunnels without controllers. Plus, it offers a range of other benefits such as reduction of data center traffic through ARP suppression, quick convergence during mobility, one routing protocol for both underlay and overlay and the inherent ability to support multi-tenancy (just to name a few). So EVPN for VXLAN for all of your layer 2 needs, right? Well it’s a little more complicated than that.

Customers need to also communicate between VXLANs and between a VXLAN tunnel and the outside world, so VXLAN routing must also be enabled in the network — which is what I cover in this post. Previous generation merchant silicon does not internally support VXLAN routing, so customers implement a workaround — adding an external loopback cable, sometimes called hyperloop, to the switch. The newer chips that support VXLAN routing allow us to route directly on the ASIC, eliminating the need for the hyperloop.
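For reference, the encapsulation being routed here carries a 24-bit VNI in an 8-byte VXLAN header (RFC 7348). A small sketch of that framing, with an arbitrary VNI value:

```python
import struct

def vxlan_header(vni):
    """Build an 8-byte VXLAN header (RFC 7348): a flags byte with the
    I bit set, then the 24-bit VNI in bits 8..31 of the second word."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI is a 24-bit field")
    return struct.pack("!II", 0x08 << 24, vni << 8)

def vni_of(header):
    """Recover the VNI from a VXLAN header."""
    _, word2 = struct.unpack("!II", header)
    return word2 >> 8

hdr = vxlan_header(10100)
print(hdr.hex(), vni_of(hdr))
```

Routing between VXLANs means moving a frame from one VNI's tunnel to another, which is precisely the step older merchant silicon could not do without the hyperloop.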

VXLAN routing can be performed with one of two architectures – centralized or distributed. Continue reading

5 ways to design your container network

There’s been a lot of talk about container networking in the industry lately (heck, we can’t even stop talking about it). And it’s for a good reason. Containers offer a fantastic way to develop and manage microservices and distributed applications easily and efficiently. In fact, that’s one of the reasons we launched Host Pack — to make container networking even simpler. Between Host Pack and NetQ, you can get fabric-wide connectivity and visibility from server to switch.

There are a variety of ways you can deploy a container network using Host Pack and Cumulus Linux, and we have documented some of them in several Validated Design Guides discussed below. Wondering which deployment method is right for your business? This blog post is for you.

Docker Swarm with Host Pack

Overview: The Docker Swarm with Host Pack solution uses the connectivity module within Host Pack: Free Range Routing (FRR) running in a container. The FRR container runs on the servers and uses BGP unnumbered for Layer 3 connectivity, enabling the hosts to participate in the routing fabric. We use Docker Swarm as the container orchestration tool for simplicity.
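To give a feel for what the FRR container's BGP unnumbered setup looks like, here is a sketch that renders a minimal configuration stanza; the ASN and interface names are placeholders, not values from the Validated Design Guides:

```python
def frr_bgp_unnumbered(asn, interfaces):
    """Render a minimal FRR BGP unnumbered stanza: each neighbor is
    named by interface rather than by IP address."""
    lines = [f"router bgp {asn}"]
    lines += [f" neighbor {ifname} interface remote-as external"
              for ifname in interfaces]
    lines += [" address-family ipv4 unicast",
              "  redistribute connected",
              " exit-address-family"]
    return "\n".join(lines)

print(frr_bgp_unnumbered(65101, ["eth1", "eth2"]))
```

The `interface remote-as external` form is what makes the session unnumbered: no per-link addressing plan is needed for the host to join the fabric.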

Choose this deployment if:

Troubleshooting with Docker Swarm + NetQ

Say you are a network engineer, and you were recently told your company will be building applications using a distributed/microservices architecture with containers moving forward. You know how important this is for the developers — it gives them tremendous flexibility to develop and deploy money-making applications. However, what does this mean for the network? It can be much more technically challenging to plan, operate, and manage a network with containers than a traditional network. The containers may need to talk with each other and to the outside world, and you won’t even know IF they exist, let alone WHERE they exist! Yet, the network engineer is responsible for the containers’ connectivity and high availability.

Since the containers are deployed inside a host — on a virtual Ethernet network — they can be invisible to network engineers. Orchestration tools such as Docker Swarm, Apache Mesos or Kubernetes make it very easy to spin up and take down containers on various hosts in a network, and may even do so without human intervention. Many containers are also ephemeral, and the traffic patterns between the servers hosting containers can be very dynamic, constantly shifting throughout the network.


Cumulus Networks understands Continue reading

Converge your network with priority flow control (PFC)

Back in April, we talked about a feature called Explicit Congestion Notification (ECN). We discussed how ECN is an end-to-end method used to converge networks and save money. Priority flow control (PFC) is a different way to accomplish the same goal. Since PFC supports lossless or near lossless Ethernet, you can run applications, like RDMA, over Converged Ethernet (RoCE or RoCEv2) over your current data center infrastructure. Since RoCE runs directly over Ethernet, a different method than ECN must be used to control congestion. In this post, we’ll concentrate on the Layer 2 solution for RoCE — PFC, and how it can help you optimize your network.

What is priority flow control?

Certain data center applications can tolerate little or no loss. However, traditional Ethernet is connectionless and allows traffic loss; it relies on the upper layer protocols to re-send or provide flow control when necessary. To provide flow control at the Ethernet layer, 802.3x was developed. It defines a standard Ethernet PAUSE frame sent upstream when congestion is experienced, telling the sender to “stop sending” for a few moments. The PAUSE frame stops traffic BEFORE the buffer Continue reading
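Where PFC improves on the plain 802.3x PAUSE is granularity: the IEEE 802.1Qbb frame carries a class-enable vector plus one pause timer per priority, so one class can be paused while the rest keep flowing. A sketch of building that payload (using priority 3 for the RoCE class is a common convention, not a requirement):

```python
import struct

def pfc_payload(pause_quanta):
    """Build the opcode, class-enable vector and eight per-priority
    timer fields of an 802.1Qbb PFC frame. pause_quanta maps a
    priority (0-7) to a pause time in quanta (1 quantum = 512 bit times)."""
    opcode = 0x0101              # PFC opcode, vs 0x0001 for plain PAUSE
    enable_vector = 0
    timers = [0] * 8
    for prio, quanta in pause_quanta.items():
        enable_vector |= 1 << prio   # bit n enables timer for priority n
        timers[prio] = quanta
    return struct.pack("!HH8H", opcode, enable_vector, *timers)

# Pause only priority 3 for the maximum 0xFFFF quanta;
# all other priorities keep flowing.
frame = pfc_payload({3: 0xFFFF})
print(frame.hex())
```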

Converge your network with explicit congestion notification

Every now and again, we like to highlight a piece of technology or solution featured in Cumulus Linux that we find especially useful. Priority Flow Control (PFC) and Explicit Congestion Notification (ECN) are exactly such things. In short, these technologies allow you to converge networks and save money. By supporting lossless or near lossless Ethernet, you can now run applications such as RDMA over Converged Ethernet (RoCE) or RoCEv2 over your current data center infrastructure. In this post, we’ll concentrate on the end-to-end solution for RoCEv2 – ECN and how it can help you optimize your network. We will cover PFC in a future post.

What is explicit congestion notification?

ECN is a mechanism supported by Cumulus Linux that helps provide end-to-end lossless communication between two endpoints over an IP routed network. Normally, protocols like TCP use dropped packets to indicate congestion, which then tells the sender to “slow down.” Explicit congestion notification uses this same concept, but instead of dropping packets after the queues are completely full, it notifies the receiving host that there was some congestion before the queues are completely full, thereby avoiding dropped traffic. It uses the IP layer (ECN bits in the IP TOS header) Continue reading
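Those ECN bits are the two low-order bits of the TOS/Traffic Class byte (RFC 3168); the upper six bits carry the DSCP. A small sketch of decoding them, and of the re-marking a congested switch performs instead of dropping:

```python
# RFC 3168 ECN codepoints in the low two bits of the TOS byte.
ECN_NAMES = {0b00: "Not-ECT", 0b01: "ECT(1)", 0b10: "ECT(0)", 0b11: "CE"}

def ecn_of(tos_byte):
    """Return (dscp, ecn_name) decoded from a TOS byte."""
    return tos_byte >> 2, ECN_NAMES[tos_byte & 0b11]

# Sender marks the packet ECN-capable; a congested switch flips
# ECT(0) to CE (Congestion Experienced) rather than dropping it.
tos = (26 << 2) | 0b10          # DSCP AF31 (26) + ECT(0)
marked = (tos & ~0b11) | 0b11   # set CE, DSCP untouched
print(ecn_of(tos), ecn_of(marked))
```

The receiver then echoes the CE signal back to the sender, which slows down exactly as it would after a drop, but without any retransmission.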

Announcing EVPN for scalable virtualized networks

This post was updated on 2/22/17 to reflect the official launch of EVPN. You can now access EVPN in general availability. Read the white paper to learn more about this exciting new feature.

When we set out to build new features for Cumulus Linux, we ask ourselves two questions: 1) How can we make network operators’ jobs easier? And 2) How can we help businesses use web-scale IT principles to build powerful, efficient and highly-scalable data centers? With EVPN, we believe we nailed both.

Why EVPN?

Many data centers today rely on layer 2 connectivity for specific applications and IP address mobility. However, an entirely layer 2 data center brings challenges such as large failure domains, spanning tree complexities, difficult troubleshooting, and limited scale, since only 4094 VLANs are supported.

Therefore, modern data centers are moving to a layer 3 fabric, which means running a routing protocol, such as BGP or OSPF, between the leaf and spine switches. To provide layer 2 connectivity between hosts and VMs on different racks, as well as maintain multi-tenant separation, a layer 2 overlay solution such as VXLAN is deployed. However, VXLAN does not define a control plane to learn and exchange Continue reading
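The scale contrast above is simple bit arithmetic: the 802.1Q VLAN ID is a 12-bit field with two reserved values, while the VXLAN VNI is 24 bits wide:

```python
# Segment-ID space: 802.1Q VLANs vs VXLAN VNIs.
vlan_ids = 2**12 - 2        # IDs 0x000 and 0xFFF are reserved
vxlan_vnis = 2**24          # 24-bit VXLAN Network Identifier
print(vlan_ids, vxlan_vnis, vxlan_vnis // vlan_ids)
```

In other words, the overlay offers roughly four thousand times as many segments as VLANs alone, which is what makes multi-tenancy at web scale workable.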

Is web-scale networking secure? This infographic breaks it down.

At Cumulus Networks, we take a lot of pride in the fact that web-scale networking using Cumulus Linux can have an immense impact on an organization’s ability to scale, automate and even reduce costs. However, we know that efficiency and growth are not the only things our customers care about.

In fact, many of our customers are interested first and foremost in the security of web-scale networking with Cumulus Linux. Many conclude that a web-scale, open environment can be even more secure than a closed proprietary one. Keep reading to learn more or scroll to the bottom to check out our infographic “The network security debate: Web-scale vs. traditional networking”

Here are some of the ways web-scale networking with Cumulus Linux keeps your data center switches secure:

  • Cumulus Linux uses the same standard secure protocols and procedures as a proprietary vendor: For example, OpenSSH is used by both traditional closed vendors and Cumulus Linux. Standardized MD5 is used for router authentication, and Cumulus Linux supports management VRF.
  • Web-scale networking has more “eyes” on the code with community support: Linux has a large community of developers from different backgrounds and interests supporting the integrity of the code. Since an entire community of Continue reading

Cumulus Linux Network Command Line Utility

Here at Cumulus Networks, we believe that network engineers are the real heart of an organization. They’re the ones managing switches, running the data center, and generally keeping an organization moving efficiently and securely.

We also believe web-scale networking and its benefits should be accessible to everyone, and the best way to make that happen is to leverage the power of disaggregation and native Linux. Although web-scale networking is very flexible and agile and offers many benefits, there can be a learning curve, as Linux uses separate, independently developed applications, each with its own syntax for configuring the switch.

So how do we bridge these two beliefs? Allow us to introduce the Cumulus Linux Network Command Line Utility (NCLU). Scheduled for our early December 3.2 release, NCLU flattens the learning curve so all network engineers can benefit from web-scale networking, while still integrating with and supporting the traditional Linux methods. In short, NCLU makes Cumulus Linux easily accessible to everyone.

What is Cumulus Linux Network Command Line Utility?

NCLU is a command line utility for Cumulus Linux that rides in the Linux user space as seen below. It provides consistent access to networking commands directly via bash, Continue reading
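To give a feel for the NCLU workflow of staging commands, reviewing them, and committing, here is a sketch that just assembles the command strings rather than executing them; the bridge ports and VLAN range are placeholders:

```python
def nclu_bridge_commands(bridge_ports, vlans):
    """Assemble the NCLU commands for a basic VLAN-aware bridge,
    ending with the review and commit steps."""
    cmds = [f"net add bridge bridge ports {','.join(bridge_ports)}",
            f"net add bridge bridge vids {vlans}",
            "net pending",   # show the staged diff before applying
            "net commit"]    # apply all staged changes atomically
    return cmds

for cmd in nclu_bridge_commands(["swp1", "swp2"], "10-20"):
    print(cmd)
```

The staging model is the point: nothing touches the running configuration until `net commit`, so a typo can be inspected with `net pending` and discarded instead of deployed.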

With the Hype around SDN: Is It Worth It in a Data Center?

With all the hype about SDN, we wonder: Is it worth deploying it in a data center? What goals does SDN meet? Is there a simpler way to achieve those same goals?

We think there is a better way, and it’s called Open Networking. And Open Networking is available today.

Vivek Venkataraman (my colleague and co-author of this blog) and I recently presented “Achieving SDN Goals through Open Networking” at the Open Networking Summit in Santa Clara, CA. The talk was well attended and interactive. We examined the goals of SDN:

  • Controlling costs
  • Achieving agility and flexibility
  • Easily enabling and encouraging innovation
  • Using the network as a platform

You can achieve all of these goals today in a more pragmatic and efficient fashion using Linux and Open Networking.

Controlling Costs

Reducing both CapEx and OpEx is always an important goal for any enterprise. First, Open Networking reduces CapEx by encouraging vendor competition and giving customers choice, all the way from the optics and silicon up to the OS and applications. The modular choice allows data centers to be designed to their exact business requirements, so customers pay for and deploy only the necessary hardware, features and applications that Continue reading