Figure 8-1 depicts some of the model parameters that need to be stored in GPU memory: a) Weight matrices associated with connections to the preceding layer, b) Weighted sum (z), c) Activation values (y), d) Errors (E), e) Local gradients (local ∇), f) Gradients received from peer GPUs (remote ∇), g) Learning rates (LR), and h) Weight adjustment values (Δw).
In addition, the training and test datasets, along with the model code, must also be stored in GPU memory. However, a single GPU may not have enough memory to accommodate all these elements. To address this limitation, an appropriate parallelization strategy must be chosen to efficiently distribute computations across multiple GPUs.
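To see why a single GPU runs out of memory, a rough back-of-the-envelope sketch helps. The sketch below counts only the per-parameter copies listed above (weights, local gradients, and gradients received from peers); the three-copies assumption, fp32 precision, and the 7-billion-parameter model size are illustrative assumptions, not figures from the chapter:

```python
# Rough GPU memory estimate for training a dense model.
# Assumes fp32 (4 bytes per value); model size is hypothetical.

BYTES_PER_PARAM = 4  # fp32

def training_footprint(num_params: int) -> int:
    """Bytes needed for weights, local gradients, and remote (peer)
    gradients: three fp32 copies of every parameter."""
    copies = 3  # weights + local grad + grads averaged from peers
    return num_params * BYTES_PER_PARAM * copies

# A hypothetical 7-billion-parameter model:
params = 7_000_000_000
gib = training_footprint(params) / 2**30
print(f"{gib:.0f} GiB")  # ~78 GiB -- well beyond a single 40 GiB GPU
```

Activations, optimizer state, and the datasets only add to this total, which is why a parallelization strategy is needed in the first place.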
This chapter introduces the most common strategies: data parallelism, model parallelism, pipeline parallelism, and tensor parallelism.
Figure 8-1: Overview of Neural Network Parameters.
In data parallelism, each GPU holds an identical copy of the complete model but processes different mini-batches of data. Gradients from all GPUs are averaged and synchronized before the model is updated. This approach is effective when the model fits within a single GPU’s memory.
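The gradient-averaging step can be sketched in a few lines. This is a toy simulation of two replicas (the replica count and gradient values are made up for illustration), not a framework implementation:

```python
import numpy as np

# Each "GPU" computes a gradient on its own micro-batch. Before the
# weight update, the gradients are averaged (an all-reduce), so every
# replica applies the same update and the model copies stay identical.

def all_reduce_mean(grads: list[np.ndarray]) -> np.ndarray:
    return np.mean(grads, axis=0)

weights = np.zeros(3)
lr = 0.1
local_grads = [np.array([1.0, 2.0, 3.0]),   # gradient from GPU 0
               np.array([3.0, 2.0, 1.0])]   # gradient from GPU 1

avg = all_reduce_mean(local_grads)           # [2.0, 2.0, 2.0]
weights -= lr * avg                          # identical update on every GPU
print(weights)                               # [-0.2 -0.2 -0.2]
```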
In Figure 8-2, the batch of training data is split into eight micro-batches. The first four micro-batches are Continue reading
AI (Artificial Intelligence) is a broad concept encompassing machines that simulate or duplicate human cognitive tasks, with Machine Learning (ML) serving as its data-driven engine. Both have existed for decades but gained fresh momentum when Generative AI (AI models that can create text, images, audio, code, and video) surged in popularity following the release of OpenAI’s ChatGPT in late 2022. In this blog post, we examine the most popular Generative AI services and how they evolved throughout 2024 and early 2025. We also try to answer questions such as how much traffic growth these Generative AI websites have experienced from Cloudflare’s perspective, how much of that traffic was malicious, and other insights.
To accomplish this, we use aggregated data from our 1.1.1.1 DNS resolver to measure the popularity of specific Generative AI services. We typically do this for our Year in Review and now also on the DNS domain rankings page of Cloudflare Radar, where we aggregate related domains for each service and identify sites that provide services to users. For overall traffic growth and attack trends, we rely on aggregated data from the cohort of Generative AI customers that use Cloudflare for performance (including Continue reading
You can use SR-MPLS, MPLS-TE, or an SDN controller to build virtual circuits (label-switched paths) across the network core. The controller can push the LSPs into network devices with PCEP, BGP-LU, or some sort of NETCONF/RESTCONF trickery.
Unfortunately, you’re only half done once you have installed the LSPs. You still have to persuade the network devices to use them. Welcome to the confusing world of traffic steering explored in the Loopback as a Service blog post by Dmytro Shypovalov.
Before we get into what Tailscale is or how it compares to a traditional remote access VPN, let’s take a quick look at Tailscale in action. The main problem Tailscale solves is remote access to your internal workloads.
In my homelab, I have a server running Linux. When I’m on my home network, I can access it directly without any issues. But if I step outside and want to access the same server over the Internet, Tailscale makes that much easier and you can have it up and running in about 10 minutes for free.
Typically, you would set up some kind of VPN, either running on a server or on a dedicated firewall. Then you’d install a VPN client on your devices and point them to the public IP of your VPN server or firewall. That’s exactly what I have at the moment.
Tailscale takes a completely different approach, and you don’t need any of that. I’m not saying one is better than the other; I’m just pointing this out for comparison. I’ve shared my thoughts on the pros and cons of each solution at the end of this post.
Head over to the Tailscale Continue reading
Hello my friend,
This blog continues the discussion of how to manage devices (network switches and routers, servers, virtual machines, etc.) using SSH, which we started in the previous blog. In this discussion we’ll cover advanced interaction with devices, including running multiple commands, changing contexts, and performing validations. Let’s dive in.
Each programming language has its strengths and weaknesses. Golang, by virtue of being lower-level (at least compared to Python), is very good when performance and efficiency are paramount. However, you don’t need it for all applications. Python gives you quicker time to market, the possibility to develop your code iteratively in Jupyter, and a vast ecosystem of existing libraries. Both programming languages are important, and both play a crucial role in IT and network infrastructure management. So if you are good with Python, learn Go (Golang) using our blog series.
And if you are not, or if you want a good holistic introduction to IT and network automation, enroll in our training programs:
We offer the following training programs in network automation for you:
To avoid needless typing, the fish shell features command abbreviations to expand some words after pressing space. We can emulate such a feature with Zsh:
# Definition of abbrev-alias for auto-expanding aliases
typeset -ga _vbe_abbrevations
abbrev-alias() {
    alias $1
    _vbe_abbrevations+=(${1%%\=*})
}
_vbe_zle-autoexpand() {
    local -a words; words=(${(z)LBUFFER})
    if (( ${#_vbe_abbrevations[(r)${words[-1]}]} )); then
        zle _expand_alias
    fi
    zle magic-space
}
zle -N _vbe_zle-autoexpand
bindkey -M emacs " " _vbe_zle-autoexpand
bindkey -M isearch " " magic-space

# Correct common typos
(( $+commands[git] ))  && abbrev-alias gti=git
(( $+commands[grep] )) && abbrev-alias grpe=grep
(( $+commands[sudo] )) && abbrev-alias suod=sudo
(( $+commands[ssh] ))  && abbrev-alias shs=ssh

# Save a few keystrokes
(( $+commands[git] )) && abbrev-alias gls="git ls-files"
(( $+commands[ip] ))  && {
    abbrev-alias ip6='ip -6'
    abbrev-alias ipb='ip -brief'
}

# Hard to remember options
(( $+commands[mtr] )) && abbrev-alias mtrr='mtr -wzbe'
Here is a demo where gls is expanded to git ls-files after pressing space.
I don’t auto-expand all aliases. I keep using regular aliases when slightly modifying the behavior of a command or for well-known abbreviations:
alias df='df -h'
alias du='du -h'
alias rm='rm -i'
alias mv='mv -i'
alias ll='ls -ltrhA'
Nvidia sells the lion’s share of the parallel compute underpinning AI training, and it has a very large – and probably dominant – share of AI inference. …
Broadcom And Marvell Ride The Compute Engine Independence Wave was written by Timothy Prickett Morgan at The Next Platform.
Today, we are thrilled to announce Media Transformations, a new service that brings the magic of Image Transformations to short-form video files wherever they are stored.
Since 2018, Cloudflare Stream has offered a managed video pipeline that empowers customers to serve rich video experiences at global scale easily, in multiple formats and quality levels. Sometimes, the greatest friction to getting started isn't even about video, but rather the thought of migrating all those files. Customers want a simpler solution that retains their current storage strategy to deliver small, optimized MP4 files. Now you can do that with Media Transformations.
For customers with a huge volume of short video, such as generative AI output, e-commerce product videos, social media clips, or short marketing content, uploading those assets to Stream is not always practical. Furthermore, Stream’s key features like adaptive bitrate encoding and HLS packaging offer diminishing returns on short content or small files.
Instead, content like this should be fetched from our customers' existing storage like R2 or S3 directly, optimized by Cloudflare quickly, and delivered efficiently as small MP4 files. Cloudflare Images customers reading this will note that this sounds just like their existing Image Transformation Continue reading
Jeroen van Bemmel and Stefano Sasso contributed tons of new device features for the netlab release 1.9.5:
Cumulus Linux (NVUE):
Dell saw a sequential slump in server sales in its most recent quarter as customers were awaiting access to systems using Nvidia’s “Blackwell” GPUs, and rival Hewlett Packard Enterprise had a similar issue when it turned in its first quarter of fiscal 2025, which ended in early February. …
GPU Transitions, Aggressive Server Pricing Squeeze HPE Profits was written by Timothy Prickett Morgan at The Next Platform.
It’s a bit embarrassing as a Network Engineer that we’ve made it this far into the Docker series without looking into Docker Networking and IP Addresses. So, in this part of the series, let’s take a quick look at the basics of Docker Networking. There’s a lot more to Docker networking than what we’ll cover here, but this should be enough to get most people started. We can always explore advanced topics in future posts.
If you haven’t been following the Docker series and just landed on this post, don’t worry; you can still follow along without any issues. If you’re curious about the previous posts, feel free to check them out below.
As always, if you find this post helpful, press the ‘clap’ button. It means a lot to me and helps me know you enjoy this type of content.
Container networking refers to the ability for containers to connect to and communicate with each other or Continue reading
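One basic idea behind that communication is worth illustrating: containers attached to the same Docker network receive addresses from the same subnet, which is what lets them reach each other directly. By default, Docker’s bridge network typically uses 172.17.0.0/16 (the docker0 bridge), although this can vary per installation; the two container addresses below are hypothetical examples, not output from a real daemon:

```python
import ipaddress

# Docker's default "bridge" network commonly uses 172.17.0.0/16.
# Containers on the same bridge get addresses from that subnet, so
# they can talk to each other directly without any port publishing.
bridge = ipaddress.ip_network("172.17.0.0/16")

web = ipaddress.ip_address("172.17.0.2")  # hypothetical first container
db  = ipaddress.ip_address("172.17.0.3")  # hypothetical second container

print(web in bridge and db in bridge)  # True: same subnet, direct reachability
```

Reaching a container from outside the host is a different story (port publishing, user-defined networks), which is where the more advanced topics come in.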
I wrote dozens of posts describing various fundamentals of networking technologies. They were a bit hard to find, so I organized them into subcategories and created a summary page to display them. I hope you like the new format.