Technology Short Take 180

Welcome to Technology Short Take #180! It’s hard to believe that July is almost over, and that 2024 is flying past us. It’s probably time that you, my readers, took some time to slow down and read more technical blogs. To help with that, I just happen to have a little collection of links to share. Enjoy!

Networking

  • Read this article to better understand why native VLANs exist.

Servers/Hardware

Security

Experience Expansion

Recently at Networking Field Day, one of the presenters for cPacket had a wonderful line that stuck with me:

There’s no compression algorithm for experience.

Like, floored. Because it hits at the heart of a couple of different things going on in the IT industry right now, things that showcase why it feels like everything is on the verge of falling apart and what we can do about it.

Misteaks Hapin

Let’s just get this out of the way: you are going to screw up. Anyone doing any job ever for any amount of time has made a mistake. I know I’ve made my fair share of them over the years. When I finished chastising myself, I looked back at what happened, figured out what went wrong, and made sure that it didn’t happen that exact same way again. That’s experience.

Experience is key to understanding why we do things the way we do them or why we don’t do something a certain way. You know how you get experience? By doing it. It’s rare that someone can read a book or a blog post about some topic and instantly know everything there is to know about Continue reading

Using netlab Reports

Did you know you can use netlab to generate reports describing your lab topology, IP addressing, BGP details, or OSPF areas? The magic command (netlab report) was introduced in August 2023, followed a few months later by netlab show reports to display the available reports.

You can generate the reports in text, Markdown, or HTML format. The desired format is selected with the report name suffix. For example, the bgp-asn.md report will create Markdown text.

Let’s see how that works.
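For example, here’s a minimal sketch of driving those commands from Python. It assumes netlab is installed and a lab topology is already running in the current directory; bgp-asn.md comes from the paragraph above, while the addressing report name is an assumption used only for illustration.

import subprocess

def netlab(*args: str) -> str:
    """Run a netlab command and return its standard output."""
    result = subprocess.run(["netlab", *args], capture_output=True, text=True, check=True)
    return result.stdout

print(netlab("show", "reports"))        # list the reports bundled with this netlab release
print(netlab("report", "addressing"))   # no suffix: plain-text report printed to the terminal
print(netlab("report", "bgp-asn.md"))   # .md suffix: the bgp-asn report rendered as Markdown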

Meta Lets Its Largest Llama AI Model Loose Into The Open Field

A scant three months ago, when Meta Platforms released the Llama 3 AI model in 8B and 70B versions (the numbers refer to the billions of parameters in each model), we asked the question we have asked of every open source tool or platform since the dawn of Linux: Who’s going to profit from it, and how are they going to do it?

Meta Lets Its Largest Llama AI Model Loose Into The Open Field was written by Jeffrey Burt at The Next Platform.

I’m a Network Engineer, I Want to Learn Cloud, What Should I Do?

As a Network Engineer, I often receive messages on LinkedIn and through my blog from people asking, “How do I start learning about Cloud?” After getting so many similar messages, I thought it would be easier to write a dedicated blog post to address this. If you’re looking for a quick answer, I’ll tell you this: learning about Cloud is easier than you might think, especially if you’re already familiar with networking concepts like BGP, subnets, and routing.

💡 Please note that when I mention “Cloud,” I’m specifically talking about the networking aspects of cloud computing. The cloud covers a vast array of technologies, and trying to learn everything is almost impossible. So, my focus here is primarily on understanding how networking functions within the cloud, and perhaps managing some virtual machines (VMs). I’ll be focusing on AWS since that’s the cloud environment I’m most familiar with.

Please note, this blog post isn’t intended to teach you everything about AWS but rather to point you in the right direction on how to begin learning. The best way to learn is by actively doing something in AWS and picking up more knowledge as you go.
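To make “actively doing something” concrete, here is a minimal sketch using Python and boto3 that lists the VPCs and subnets in a region. It assumes boto3 is installed and AWS credentials are already configured; the region name is just a placeholder.

import boto3

# Walk the networking constructs you already know (VPCs and subnets) via the EC2 API.
ec2 = boto3.client("ec2", region_name="eu-west-1")   # example region

for vpc in ec2.describe_vpcs()["Vpcs"]:
    print(f"VPC {vpc['VpcId']}  CIDR {vpc['CidrBlock']}")
    subnets = ec2.describe_subnets(
        Filters=[{"Name": "vpc-id", "Values": [vpc["VpcId"]]}]
    )["Subnets"]
    for subnet in subnets:
        print(f"  Subnet {subnet['SubnetId']}  {subnet['CidrBlock']}  AZ {subnet['AvailabilityZone']}")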

If You Continue reading

Dropped packet notifications with Arista Networks

Visibility into dropped packets is essential for Artificial Intelligence/Machine Learning (AI/ML) workloads, where a single dropped packet can stall large-scale computational tasks, idling millions of dollars' worth of GPU/CPU resources and delaying the completion of business-critical workloads. Enabling real-time sFlow telemetry provides the observability into traffic flows and packet drops needed to effectively manage these networks.

The availability of the Arista EOS 4.31.4M maintenance release brings sFlow dropped packet monitoring, previously demonstrated using the 4.30.1F feature release (see SC23 Dropped packet visibility demonstration), to production networks; see the EOS Life Cycle Policy for release details.
sflow sampling 50000
sflow polling-interval 20
sflow vrf mgmt destination 203.0.113.100
sflow vrf mgmt source-interface Management0
sflow run
The above Arista EOS commands enable sFlow counter polling (every 20 seconds) and 1-in-50,000 packet sampling on all ports, sending the sFlow telemetry to the sFlow analyzer at 203.0.113.100.
flow tracking mirror-on-drop
  sample limit 100 pps
  !
  tracker SFLOW
    exporter SFLOW
      format sflow
      collector sflow
      local interface Management0
  no shutdown
The above commands add sFlow Dropped Packet Notification Structures to the sFlow telemetry feed using Broadcom Mirror on Drop (MoD) instrumentation. Broadcom implements mirror-on-drop in Jericho 2, Trident 3, and Tomahawk 3, Continue reading
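On the collector side (203.0.113.100 in the configuration above), a quick way to confirm that telemetry is actually arriving is to listen on UDP port 6343 and decode the fixed sFlow version 5 datagram header. The Python sketch below is only a sanity check, not a real analyzer, and it handles IPv4 agent addresses only; in practice you would point the switches at a full sFlow analyzer.

import socket
import struct

# Bind to the standard sFlow collector port
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 6343))

while True:
    data, (src, _port) = sock.recvfrom(65535)
    if len(data) < 28:
        continue
    # sFlow v5 datagram header: version, agent address type, agent address,
    # sub-agent id, sequence number, switch uptime (ms), number of samples
    version, addr_type = struct.unpack_from("!II", data, 0)
    if version != 5 or addr_type != 1:   # 1 = IPv4 agent address
        continue
    agent = socket.inet_ntoa(data[8:12])
    _sub_agent, seq, uptime_ms, samples = struct.unpack_from("!IIII", data, 12)
    print(f"{src}: agent {agent} seq {seq} uptime {uptime_ms} ms samples {samples}")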

TL001: The Line Between Management and IC Leadership

This first episode of Technically Leadership explores distinctions and commonalities between the management track and the staff engineer track with guests Nick Silkey and Martin Smith. Our guests share their stories from both perspectives and offer advice for those considering similar paths in technical leadership. Episode guests: Nick Silkey and Martin Smith. Nick Silkey, Senior... Read more »

Making WAF ML models go brrr: saving decades of processing time

We made our WAF Machine Learning models 5.5x faster, reducing execution time by approximately 82%, from 1519 to 275 microseconds! Read on to find out how we achieved this remarkable improvement.
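Those headline numbers are consistent with one another; a quick back-of-the-envelope check:

before_us, after_us = 1519, 275                       # execution time before/after, in microseconds
print(f"speedup:   {before_us / after_us:.1f}x")      # prints 5.5x
print(f"reduction: {1 - after_us / before_us:.0%}")   # prints 82%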

WAF Attack Score is Cloudflare's machine learning (ML)-powered layer built on top of our Web Application Firewall (WAF). Its goal is to complement the WAF and detect attack bypasses that we haven't encountered before. This has proven invaluable in catching zero-day vulnerabilities, like the one detected in Ivanti Connect Secure, before they are publicly disclosed and enhancing our customers' protection against emerging and unknown threats.

Since its launch in 2022, WAF attack score adoption has grown exponentially, now protecting millions of Internet properties and running real-time inference on tens of millions of requests per second. The feature's popularity has driven us to seek performance improvements, enabling even broader customer use and enhancing Internet security.

In this post, we will discuss the performance optimizations we've implemented for our WAF ML product. We'll guide you through specific code examples and benchmark numbers, demonstrating how these enhancements have significantly improved our system's efficiency. Additionally, we'll share the impressive latency reduction numbers observed after the rollout.

Before diving Continue reading

BGP, EVPN, VXLAN, or SRv6?

Daniel Dib asked an interesting question on LinkedIn when considering an RT5-only EVPN design:

I’m curious what EVPN provides if all you need is L3. For example, you could run pure L3 BGP fabric if you don’t need VRFs or a limited amount of them. If many VRFs are needed, there is MPLS/VPN, SR-MPLS, and SRv6.

I received a similar question numerous times in my previous life as a consultant. It’s usually caused by vendor marketing polluting PowerPoint slide decks with acronyms without explaining the fundamentals. Let’s fix that.

Native Kubernetes cluster mesh with Calico

[Image: workloads from remote clusters]

As Kubernetes continues to gain traction in the cloud-native ecosystem, the need for robust, scalable, and highly available cluster deployments has become more noticeable.

While a Kubernetes cluster can easily expand via additional nodes, the downside of such an approach is that you might have to spend a lot of time troubleshooting the underlying networking or managing and updating resources between clusters. On top of that, a multi-regional scenario or hyper-cloud environment might be off-limits, depending on the limitations that a cloud provider or your Kubernetes distro imposes on your environment.

Calico Enterprise cluster mesh is a suite of features native to Kubernetes with a multi-layer design that connects two or more Kubernetes clusters and seamlessly shares resources between them. This post will explore cluster mesh, its benefits, and how it can enhance your Kubernetes environment.

Projects that provide cluster mesh

Multiple projects offer cluster mesh, and while they are all similar in basic principles, each has a different take on implementing this solution in an environment.

The following table is a brief overview of notable projects that offer cluster mesh:

Calico Open Source | Calico Enterprise | Cilium | Submariner
Encapsulation | IPIP | Direct | Continue reading

How to Implement 802.1X from Scratch?

If you're a Network Engineer looking to learn what 802.1X is and how you can implement it in your network, you've come to the right place. 802.1X might seem confusing at first glance due to its various components, and the fact that it can be implemented in numerous ways. But don't worry, I'm here to break down each component and simplify the whole process for you. By the end of this post, you'll have a clear understanding of 802.1X and how to set it up, whether for wired or wireless networks.

Here is what we will cover in this blog post.

  1. What is our end goal?
  2. Network Access Control (NAC)
  3. What exactly is 802.1X?
  4. What do I need to start using 802.1X?
  5. Which protocol to use? (EAP-TLS, PEAP, TEAP)
  6. Cisco ISE Introduction
  7. Supplicant (end-device) configuration
  8. MAB

What is Our End Goal?

Let's talk about our end goal. Imagine our current setup, where the WiFi network is secured with just a Pre-Shared Key (PSK) and the wired network is open, allowing anyone to plug in a laptop and gain access. This isn't ideal for security.

Our main aim is to shift towards a more secure authentication Continue reading

HS078: Is It Time To Dump Microsoft?

Should enterprises ditch Microsoft because of security concerns? Microsoft’s numerous vulnerabilities and questionable responses make continued use a significant risk. At the same time, Microsoft’s strong integration and utility in enterprise environments make it attractive to keep. Johna Till Johnson and John Burke debate. They also weigh considerations including the challenges of... Read more »

Meta Llama 3.1 now available on Workers AI

At Cloudflare, we’re big supporters of the open-source community – and that extends to our approach for Workers AI models as well. Our strategy for our Cloudflare AI products is to provide a top-notch developer experience and toolkit that can help people build applications with open-source models.

We’re excited to be one of Meta’s launch partners to make their newest Llama 3.1 8B model available to all Workers AI users on Day 1. You can run their latest model by simply swapping out your model ID to @cf/meta/llama-3.1-8b-instruct or test out the model on our Workers AI Playground. Llama 3.1 8B is free to use on Workers AI until the model graduates out of beta.
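If you want to try the model from outside a Worker, it can also be reached through the Workers AI REST API. Here is a minimal sketch in Python, assuming you have a Cloudflare account ID and an API token with Workers AI permissions; both values below are placeholders, and only the model ID comes from the announcement above.

import requests

ACCOUNT_ID = "your-account-id"   # placeholder
API_TOKEN = "your-api-token"     # placeholder; needs Workers AI permissions
MODEL = "@cf/meta/llama-3.1-8b-instruct"

resp = requests.post(
    f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/{MODEL}",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json={"messages": [{"role": "user", "content": "Explain BGP in one sentence."}]},
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["result"]["response"])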

Meta’s Llama collection of models has consistently shown high-quality performance in areas like general knowledge, steerability, math, tool use, and multilingual translation. Workers AI is excited to continue to distribute and serve the Llama collection of models on our serverless inference platform, powered by our globally distributed GPUs.

The Llama 3.1 model is particularly exciting, as it is released in a higher precision (bfloat16), incorporates function calling, and adds support across 8 languages. Having multilingual support built-in means that you can Continue reading