AMD’s Long And Winding Road To The Hybrid CPU-GPU Instinct MI300A

Back in 2012, when AMD was in the process of backing out of the datacenter CPU business and did not really have its datacenter GPU act together at all, the US Department of Energy exhibited the enlightened self-interest that is a strong foundation of both economics and politics and took a chance and invested in AMD to do research in memory technologies and hybrid CPU-GPU computing at exascale.

AMD’s Long And Winding Road To The Hybrid CPU-GPU Instinct MI300A was written by Timothy Prickett Morgan at The Next Platform.

VMware’s ‘Private Cloud’ Solution Emerges Under Broadcom

VMware Cloud Foundation’s (VCF) new configuration was eagerly awaited. There have been many questions about what VCF would look like exactly and, more importantly, what it would mean for DevOps customers now that VCF is under the Broadcom umbrella. While there has been a lot of discussion about price increases for some customers following licensing changes and other attributes of VMware honing its product portfolio under Broadcom, we have now seen, during the past few days, releases detailing what VCF now means, what it has to offer and what is planned for the future. To that end, a lot of care has been taken to accommodate more emerging needs, especially for private cloud ownership involving large, geographically distributed operations across many different sectors. This often includes IoT and edge applications that private cloud is configured for. There is also a simplification in VCF’s portfolio now under Broadcom, which we will detail below. The company detailed several features, including VCF’s management in line with hyper-convergence and combining storage operations environments under a single umbrella, uniting or “de-siloing” them. This offers many advantages and accounts for much of the hype surrounding VCF. At the same time, the development of VCF’s offering Continue reading

TL000: Announcing Technically Leadership, a New Podcast for the Next Phase of Your Career

Technically Leadership is a brand new podcast on the Packet Pushers network. Host Laura Santamaria explores leadership in the tech industry, with conversations and insights to help you development your management skills. Whether you’re considering your first management role or you’re an experienced manager working your way to the C-suite, this podcast is for you.... Read more »

Worth Reading: AI Is Still a Delusion

Here’s another AI rant to spice your summer: AI Is Still a Delusion, including an excellent example of how the latest LLMs flunk simple logical reasoning. I particularly liked this one-line summary:

The real danger today is not that computers are smarter than us but that we think computers are smarter than us and consequently trust them to make decisions they should not be trusted to make.

It might be worth remembering that quote when an AI-powered management appliance messes up your network because of a false positive ;)

eBPF: Enabling Security and Performance to Co-Exist

Today, most organizations and individuals use Linux and the Linux kernel with a “one-size-fits-all” approach. This differs from how Linux was used in the past–for example, 20 years ago, many users would compile their kernel and modify it to fit their specific needs, architectures and use cases. This is no longer the case, as one-size-fits-all has become good enough. But, like anything in life, “good enough” is not the best you can get.

Enter: Extended Berkeley Packet Filter (eBPF). eBPF allows users to modify one-size-fits-all to fit their specific needs. While this was not impossible before, it was cumbersome and often unsecure.

eBPF is a feature available in Linux kernels that allows users to safely load programs into the kernel, to customize its operation. With eBPF, the kernel and its behavior become highly customizable, instead of being fixed.

Utilizing eBPF, users can load a program into the kernel and instruct the kernel to execute their program if, for example, a certain packet is seen or another event occurs. eBPF lets programs run without needing to add additional modules or modify the kernel source code. Users can think of it as a lightweight, sandboxed virtual machine (VM) within the Linux kernel Continue reading

Eliminating hardware with Load Balancing and Cloudflare One

In 2023, Cloudflare introduced a new load balancing solution supporting Local Traffic Management (LTM). This year, we took it a step further by introducing support for layer 4 load balancing to private networks via Spectrum. Now, organizations can seamlessly balance public HTTP(S), TCP, and UDP traffic to their privately hosted applications. Today, we’re thrilled to unveil our latest enhancement: support for end-to-end private traffic flows as well as WARP authenticated device traffic, eliminating the need for dedicated hardware load balancers! These groundbreaking features are powered by the enhanced integration of Cloudflare load balancing with our Cloudflare One platform, and are available to our enterprise customers. With this upgrade, our customers can now utilize Cloudflare load balancers for both public and private traffic directed at private networks.

Cloudflare Load Balancing today

Before discussing the new features, let's review Cloudflare's existing load balancing support and the challenges customers face.

Cloudflare currently supports four main load balancing traffic flows:

  1. Internet-facing load balancers connecting to publicly accessible endpoints at layer 7, supporting HTTP(S).
  2. Internet-facing load balancers connecting to publicly accessible endpoints at layer 4 (Spectrum), supporting TCP and UDP services
  3. Internet-facing load balancers connecting to private endpoints at layer 7 HTTP(S) via Cloudflare Tunnels.
  4. Continue reading

Q2 2024 Internet disruption summary

Cloudflare’s network spans more than 320 cities in over 120 countries, where we interconnect with over 13,000 network providers in order to provide a broad range of services to millions of customers. The breadth of both our network and our customer base provides us with a unique perspective on Internet resilience, enabling us to observe the impact of Internet disruptions. Thanks to Cloudflare Radar functionality released earlier this year, we can explore the impact from a routing perspective, as well as a traffic perspective, at both a network and location level.

As we have seen in previous years, nationwide exams take place across several MENA countries in the second quarter, and with them come government directed Internet shutdowns. Cable cuts, both terrestrial and submarine, caused Internet outages across a number of countries, with the ACE submarine cable being a particular source of problems. Maintenance, power outages, and technical problems also disrupted Internet connectivity, as did unknown issues. And as we have frequently seen in the two-plus years since the conflict began, Internet connectivity in Ukraine suffers as a result of Russian attacks.

As we have noted in the past, this post is intended as a summary overview Continue reading

AI/ML Networking: Part-II: Introduction of Deep Neural Networks

Machine Learning (ML) is a subset of Artificial Intelligence (AI). ML is based on algorithms that allow learning, predicting, and making decisions based on data rather than pre-programmed tasks. ML leverages Deep Neural Networks (DNNs), which have multiple layers, each consisting of neurons that process information from sub-layers as part of the training process. Large Language Models (LLMs), such as OpenAI’s GPT (Generative Pre-trained Transformers), utilize ML and Deep Neural Networks.

For network engineers, it is crucial to understand the fundamental operations and communication models used in ML training processes. To emphasize the importance of this, I quote the Chinese philosopher and strategist Sun Tzu, who lived around 600 BCE, from his work The Art of War.

If you know the enemy and know yourself, you need not fear the result of a hundred battles.

We don’t have to be data scientists to design a network for AI/ML, but we must understand the operational fundamentals and communication patterns of ML. Additionally, we must have a deep understanding of network solutions and technologies to build a lossless and cost-effective network for enabling efficient training processes.

In the upcoming two posts, I will explain the basics of: 

a) Data Models: Continue reading

Encapsulation of PDUs On Trunk Ports

When I studied for my CCIE almost 15 years ago, I recall that I was fascinated by how different PDUs such as CDP, DTP, STP would have different encapsulations on a trunk depending on the configuration of it. What happens when you change the native VLAN? What happens if the native VLAN is not allowed on the trunk? What happens if you tag the native VLAN? There aren’t many resources describing this as most people don’t care for this level of detail, but there are situations where this is important. The goal of this post is to configure different protocols and see how they are encapsulated using different trunk configurations. You don’t need to consume this entire post, rather use it as a reference for different scenarios. Just be aware that some of this may be platform/OS specific.

The protocols we’ll cover for this post are:

  • CDP.
  • LLDP.
  • DTP.
  • PAgP.
  • LACP.
  • PVST+.
  • RPVST+.
  • MST.

The topology is going to be very simple, two switches connected by a single link:

These are IOSv-L2 devices:

SW1#show version
Cisco IOS Software, vios_l2 Software (vios_l2-ADVENTERPRISEK9-M), Experimental Version 15.2(20200924:215240) [sweickge-sep24-2020-l2iol-release 135]
Copyright (c) 1986-2020 by Cisco Systems, Inc.
Compiled Tue 29-Sep-20 11:53 by sweickge


 Continue reading

Torero – Boots on the Ground Framework for Automation Sharing

Background One of the many useful things I came away with from AutoCon1 in Amsterdam was a "to-do" to investigate torero. Launched by Itential at NAF's AutoCon1 as a community based product and presented to the AutoCon community by Peter Sprygada in Amsterdam, one has to take notice and I did. You can see Mr. READ MORE

The post Torero – Boots on the Ground Framework for Automation Sharing appeared first on The Gratuitous Arp.

Ongoing Saga: How Much Money Will Be Spent On AI Chips?

Everybody knows that companies, particularly hyperscalers and cloud builders but now increasingly enterprises hoping to leverage generative AI, are spending giant round bales of money on AI accelerators and related chips to create AI training and inference clusters.

Ongoing Saga: How Much Money Will Be Spent On AI Chips? was written by Timothy Prickett Morgan at The Next Platform.

Using a Git Commit Template

Although I’m sure that I’d seen or read about Git commit templates previously, the idea of using a Git commit template was brought to the forefront recently with the release of version 11 of Tower for Mac. When I’m using macOS, Tower is my graphical Git client of choice. However, most of my Git operations are via the terminal, and I’m not always using macOS (I also spend a fair amount of time using Linux). For that reason, I wanted to implement a more “platform-neutral” solution to Git commit templates. In this post, I’ll share with you what I learned about using a Git commit template.

Let me start with explaining why I wanted to start using a Git commit template. Some time ago, I was exposed to the Conventional Commits specification, and decided I wanted to adopt this for my personal projects. As with starting any new habit, it can be challenging to remember, and I found myself committing changes to personal repositories without following the specification. So when I was reminded of Git commit template functionality via its inclusion in Tower, I immediately realized this would be a great way to help build new habits (and Continue reading

Adding Arista Switch to CML

I wanted to add Arista switches to CML to do some STP interopability testing. However, the process of adding them is not well described. I had to refer to some Youtube videos to understand what to do. This is what you’ll need for CML 2.7:

  • Download images from Arista software downloads.
  • Upload images to CML.
  • Create node- and image definition.

The first thing you need to do is to download images. Thankfully, Arista provides images for anyone that’s registered, whether you are an existing partner/customer, or not. Go to Arista’s login page and create an account if you don’t already have one. When logged in, go to Support -> Software Download:

When on the downloads page, scroll down until you see cEOS-lab and vEOS-lab. Expand the vEOS lab section:

You will need to download two images:

  • Aboot – Boot loader.
  • vEOS – The actual NOS.

Grab one of the Aboot images such as Aboot-veos-serial-8.0.2.ios:

The Aboot serial image outputs to serial while the other image outputs to VGA. I didn’t have any issues using the serial one in CML.

You’ll then need the actual vEOS file. Previously, there was a process needed to convert Continue reading

Worth Reading: GitHub Copilot Workspace Review

In Matt Duggan’s blog post, you’ll find a scathing review of another attempt to throw AI spaghetti at the wall to see if they stick: the GitHub Copilot Workspace.

He also succinctly summarized everything I ever wanted to say about the idea of using AI tools to generate networking configurations:

Having a tool that makes stuff that looks right but ends up broken is worse than not having the tool at all.