Archive

Category Archives for "Networking"

Removing uncertainty through “what-if” capacity planning

Infrastructure planning for a network that serves more than 81 million requests per second at peak and is globally distributed across more than 330 cities in 120+ countries is complex. The capacity planning team at Cloudflare ensures there is enough capacity in place all over the world so that our customers have one less thing to worry about: our infrastructure, which should just work. Through our processes, the team gives careful consideration to “what-ifs”. What if something unexpected happens and one of our data centers fails? What if one of our largest customers triples or quadruples its request count? Across a gamut of scenarios like these, the team works to understand where traffic will be served from and how the Cloudflare customer experience may change.

This blog post gives a look behind the curtain of how these scenarios are modeled at Cloudflare, and why it's so critical for our customers.

Scenario planning and our customers

Cloudflare customers rely on the data centers that Cloudflare has deployed all over the world, placing us within 50 ms of approximately 95% of the Internet-connected population globally. But round-trip time to our end users means little if those data centers don’t have the capacity Continue reading

Cloudflare incident on September 17, 2024

On September 17, 2024, during routine maintenance, Cloudflare inadvertently stopped announcing fifteen IPv4 prefixes, affecting some Business plan websites for approximately one hour. During this time, IPv4 traffic for these customers would not have reached Cloudflare, and users attempting to connect to websites assigned addresses within those prefixes would have received errors. 

We’re very sorry for this outage. 

This outage was the result of an internal software error and not the result of an attack. In this blog post, we’re going to talk about what the failure was, why it occurred, and what we’re doing to make sure this doesn’t happen again.

Background

Cloudflare assembled a dedicated Addressing team in 2019 to simplify the ways that IP addresses are used across Cloudflare products and services. The team builds and maintains systems that help Cloudflare conserve and manage its own network resources. The Addressing team also manages periodic changes to the assignment of IP addresses across infrastructure and services at Cloudflare. In this case, our goal was to reduce the number of IPv4 addresses used for customer websites, allowing us to free up addresses for other purposes, like deploying infrastructure in new locations. Since IPv4 addresses are a finite Continue reading

How Cloudflare is helping domain owners with the upcoming Entrust CA distrust by Chrome and Mozilla

Chrome and Mozilla announced that they will stop trusting Entrust’s public TLS certificates issued after November 12, 2024 and December 1, 2024, respectively. This decision stems from concerns related to Entrust’s ability to meet the CA/Browser Forum’s requirements for a publicly trusted certificate authority (CA). To prevent Entrust customers from being impacted by this change, Entrust has announced that they are partnering with SSL.com, a publicly trusted CA, and will be issuing certs from SSL.com’s roots to ensure that they can continue to provide their customers with certificates that are trusted by Chrome and Mozilla. 

We’re excited to announce that we’re going to be adding SSL.com as a certificate authority that Cloudflare customers can use. This means that Cloudflare customers that are currently relying on Entrust as a CA and uploading their certificate manually to Cloudflare will now be able to rely on Cloudflare’s certificate management pipeline for automatic issuance and renewal of SSL.com certificates. 

CA distrust: responsibilities, repercussions, and responses

With great power comes great responsibility

Every publicly trusted certificate authority (CA) is responsible for maintaining a high standard of security and compliance to ensure that the certificates they issue are trustworthy. Continue reading

EVPN Hub-and-Spoke Layer-3 VPN

Now that we’ve figured out how to implement a hub-and-spoke VPN design on a single PE-router, let’s do the same thing with EVPN. It turns out to be trivial:

  • We’ll split the single PE router into three PE devices (pe_a, pe_b, and pe_h).
  • We’ll add a core router (p) and connect it to all three PE devices.

As we want to use EVPN and have a larger core network, we’ll also have to enable VLANs, VXLAN, BGP, and OSPF on the PE devices.

This is the topology of our expanded lab:

HW036: eduroam Visitor Access (eVA): Simplifying Campus Guest Wi-Fi Access

In today’s episode, guest Cheryl Connell joins host Keith Parsons to talk about the eduroam Visitor Access (eVA) system. Cheryl explains that eVA is a free add-on for institutions with an existing eduroam setup, allowing them to create temporary usernames and passwords for guests without needing a separate guest network. They discuss the challenges of... Read more »

How to Use GitPython to Manage Git Repositories?

I know what you're thinking: we usually manage our Python code via Git to track changes, so what do I mean by using GitPython to manage Git repositories? I recently faced a situation where I needed to automate a Git workflow: pulling the latest changes from a Git repository, creating a branch, making some changes, viewing the diff, committing, and then pushing my branch back to the remote repository.

Doing this repeatedly was time-consuming, and I figured there must be a way to automate this. With Python, virtually anything is possible. I found a Python library called 'GitPython' that does exactly this. So, let's get to it.

What Is GitPython?

GitPython is a Python library that lets you work with Git repositories. It allows you to manage Git tasks using Python code, making it easy to automate things like commits, branches, and pushes without using the command line. This is useful for automating repetitive Git tasks directly from Python.

For example, you can use it to pull the latest updates from a repository, create new branches, and commit your changes. It also provides a way to view diffs, so you can see what has changed Continue reading
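To make that workflow concrete, here is a minimal sketch using GitPython; the repository path, branch name, file name, and commit message are placeholders, and it assumes the remote is named origin and that you have push access.

```python
from git import Repo

# Open an existing local repository (path is a placeholder)
repo = Repo("/path/to/my-repo")

# Pull the latest changes from the remote (assumed to be called 'origin')
origin = repo.remotes.origin
origin.pull()

# Create and check out a new branch (name is illustrative)
branch = repo.create_head("automated-change")
branch.checkout()

# ... edit some files in the working tree here ...

# Show the diff of the working tree against the index
print(repo.git.diff())

# Stage and commit the changes (file name is a placeholder)
repo.index.add(["config.yaml"])
repo.index.commit("Update config.yaml via GitPython")

# Push the new branch back to the remote
origin.push(refspec=f"{branch.name}:{branch.name}")
```

The repo.git attribute is a thin wrapper around the git command line, so anything you would normally type after git (like diff above) is available when the higher-level API doesn't cover it.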

Hub-and-Spoke VPN Topology

Hub-and-spoke topology is by far the most complex topology I’ve ever encountered in the MPLS/VPN (and now EVPN) world. It’s used when you want to push all the traffic between sites attached to a VPN (spokes) through a central site (hub), for example, when using a central firewall.

You get the following diagram when you model the traffic flow requirements with VRFs. The forward traffic uses light yellow arrows, and the return traffic uses dark orange ones.

Artificial Intelligence for Network Engineers: Introduction

Several books on artificial intelligence (AI) and deep learning (DL) have been published over the past decade. However, I have yet to find a book that explains deep learning from a networking perspective while providing a solid introduction to DL. My goal is to fill this gap by writing a book titled AI for Network Engineers (note that the title may change during the writing process). Writing about such a complex subject will take time, but I hope to complete and release it within a year.

Part I: Deep Learning and Deep Neural Networks

The first part of the book covers the theory behind Deep Learning. It begins by explaining the construct of a single artificial neuron and its functionality. Then, it explores various Deep Neural Network models, such as Feedforward Neural Networks (FNN), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN). Next, it discusses data and model parallelization strategies such as Data, Pipeline, and Tensor Parallelism, explaining how training jobs whose input data and/or model size exceeds the memory capacity of the GPUs in a single server can be distributed across multiple GPU servers.

Part II: AI Data Center Networking - Lossless Ethernet

After a brief Continue reading

Network CI/CD Pipeline – Speed Up Your CI Jobs with GitLab Cache

In the previous pipelines, I’ve been using Python and had to install multiple pip modules. If we have five different jobs, we end up installing the same pip modules again and again for each job, which takes a while to complete. Remember, each job runs in its own pristine environment, meaning it builds a fresh Docker container and installs all the required modules before running the script we need. This repetition can slow down the pipeline significantly.

In this blog post, let’s look at how you can use GitLab Cache to speed up your jobs and avoid unnecessary reinstallations. If you are new to GitLab or CI/CD in general, I highly recommend checking out my previous GitLab introduction post below.

Network CI/CD Pipeline - GitLab Introduction
In this part, we’ll discuss what exactly GitLab is and the role it plays in the whole CI/CD process. We’ll explore how to use GitLab as a Git repository, how to install GitLab runners

The Problem With My Previous Approach

This is how my pipeline looked before. Though it worked perfectly fine, it took around 45 seconds to run each job and just over 3 minutes for the entire pipeline to Continue reading
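To give a feel for what the fix looks like, here is a rough, hypothetical .gitlab-ci.yml fragment (the image tag, job name, and script names are placeholders, not the exact pipeline from this series): it points pip's download cache into the project directory and declares that directory under cache:, so later jobs can reuse the packages instead of downloading them again.

```yaml
# Hypothetical fragment illustrating GitLab's cache feature
image: python:3.11

variables:
  PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"   # keep pip's cache inside the project

cache:
  key:
    files:
      - requirements.txt    # a new cache is built whenever requirements.txt changes
  paths:
    - .cache/pip            # this directory is saved and restored between jobs

stages:
  - validate

lint:
  stage: validate
  script:
    - pip install -r requirements.txt   # hits the restored cache instead of the network
    - python validate_configs.py        # placeholder script
```

Note that the cache only stores pip's downloaded packages; each job still runs pip install, but it completes much faster because nothing has to be fetched from PyPI.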

AI for Network Engineers: Chapter 1 – Deep Learning Basics

Content

Introduction 
Artificial Neuron 
  Weighted Sum for Pre-Activation Value 
  ReLU Activation Function for Post-Activation 
  Bias Term
  S-Shaped Functions – TANH and SIGMOID
Network Impact
Summary
References

Introduction


Artificial Intelligence (AI) is a broad term for solutions that aim to mimic the functions of the human brain. Machine Learning (ML), in turn, is a subset of AI, suitable for tasks like simple pattern recognition and prediction. Deep Learning (DL), the focus of this section, is a subset of ML that leverages algorithms to extract meaningful patterns from data. Unlike ML, DL does not necessarily require human intervention, such as providing structured, labeled datasets (e.g., 1,000 bird images labeled as “bird” and 1,000 cat images labeled as “cat”). 


DL utilizes layered, hierarchical Deep Neural Networks (DNNs), where hidden and output layers consist of computational units, artificial neurons, which individually process input data. The nodes in the input layer pass the input data to the first hidden layer without performing any computations, which is why they are not considered neurons or computational units. Each neuron calculates a pre-activation value (z) based on the input received from the previous layer and then applies an activation function to this value, producing a post-activation output (ŷ) value. There are various DNN models, such as Feed-Forward Neural Networks (FNN), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN), each designed for different use cases. For example, FNNs are suitable for simple, structured tasks like handwritten digit recognition using the MNIST dataset [1], CNNs are effective for larger image recognition tasks such as with the CIFAR-10 dataset [2], and RNNs are commonly used for time-series forecasting, like predicting future sales based on historical sales data. 
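As a minimal numerical illustration of that computation (the input values, weights, and bias below are made up), a single neuron forms the weighted sum z and passes it through a ReLU activation to produce its output:

```python
import numpy as np

def relu(z):
    """ReLU activation: returns z when positive, 0 otherwise."""
    return np.maximum(0.0, z)

x = np.array([0.5, -1.2, 3.0])   # inputs from the previous layer (made-up values)
w = np.array([0.8, 0.4, -0.6])   # one weight per input
b = 0.1                          # bias term

z = np.dot(w, x) + b             # pre-activation value: weighted sum plus bias
y_hat = relu(z)                  # post-activation output

print(f"z = {z:.2f}, output = {y_hat:.2f}")   # z = -1.78, output = 0.00
```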


To provide accurate predictions based on input data, neural networks are trained using labeled datasets. The MNIST (Modified National Institute of Standards and Technology) dataset [1] contains 60,000 training and 10,000 test images of handwritten digits (grayscale, 28x28 pixels). The CIFAR-10 [2] dataset consists of 60,000 color images (32x32 pixels), with 50,000 training images and 10,000 test images, divided into 10 classes. The CIFAR-100 dataset [3], as the name implies, has 100 image classes, with each class containing 600 images (500 training and 100 test images per class). Once the test results reach the desired level, the neural network can be deployed to production.


Figure 1-1: Deep Learning Introduction.

Continue reading

Private VLAN (PVLAN) Configuration Example

We all know that, by default, all the devices in the same VLAN can talk to each other. For example, if you have a switch with multiple devices connected to it and they are part of the same VLAN, they can communicate without any restrictions. But there are times when you might want to keep the devices in the same VLAN while preventing them from talking to each other. This is where Private VLANs come into play, offering control over who can talk to whom within the 'same VLAN'. So, let’s get started; we will cover the following topics:

  1. Isolated VLAN, Community VLAN, and Promiscuous Port
  2. A Very Simple Private VLAN Example
  3. Private VLAN with Multiple Switches (Trunk)
  4. Private VLAN to Default Gateway over Trunk

Private VLAN (PVLAN) Introduction

Let's break down how Private VLANs work with a simple scenario. Imagine we have a "users" VLAN where all the laptops connect. Suppose we have a mix of Windows and Linux devices. We want to ensure that Windows devices can't communicate with each other at all. However, it's okay for Linux devices to talk to each other, but they shouldn't communicate with the Windows devices either.
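As a rough sketch of how that scenario could be configured on an IOS-style switch (the VLAN IDs and interface names below are made up, and exact commands vary by platform), the Windows ports would sit in an isolated VLAN, the Linux ports in a community VLAN, and the uplink towards the default gateway would be a promiscuous port:

```
! Hypothetical IOS-style sketch; VLAN IDs and interface names are placeholders
!
! Secondary VLANs: isolated for the Windows hosts, community for the Linux hosts
vlan 101
 private-vlan isolated
vlan 102
 private-vlan community
!
! Primary VLAN tying the secondary VLANs together
vlan 100
 private-vlan primary
 private-vlan association 101,102
!
! Host port for a Windows laptop (isolated)
interface GigabitEthernet0/1
 switchport mode private-vlan host
 switchport private-vlan host-association 100 101
!
! Host port for a Linux laptop (community)
interface GigabitEthernet0/2
 switchport mode private-vlan host
 switchport private-vlan host-association 100 102
!
! Promiscuous port towards the default gateway
interface GigabitEthernet0/24
 switchport mode private-vlan promiscuous
 switchport private-vlan mapping 100 101,102
```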

Continue reading