The Queuing Theory webinar by Rachel Traylor is now available without a valid ipSpace.net account. Enjoy!
In his letter to Intel employees, new chief executive officer Lip-Bu Tan, who starts his new job next Tuesday, tells them that he is “never deterred by challenges.” …
Lip-Bu Tan: Intel’s New – And Maybe Last – CEO was written by Timothy Prickett Morgan at The Next Platform.
Many providers count on detection in the global routing table to discover and counter BGP route hijacks. What if there were a kind of BGP hijack that cannot be detected using current mechanisms? Henry Birge-Lee joins Tom Ammon and Russ White to discuss a kind of stealthy BGP attack that avoids normal detection, and how we can resolve these attacks.
To find out more, check this RIPE video.
Like OSPF, IS-IS needs a router to originate the pseudo-node for a LAN segment. IS-IS standards call that router a Designated Intermediate System (DIS), and since it is not responsible for flooding, it does not need a backup.
Want to know more? The Influence the Designated IS Election lab exercise provides the details (and some hands-on work).
Oracle’s cloud may have been in the running to be the host of a massive AI training system for Elon Musk’s xAI startup, with a purported $10 billion in rentals at stake. …
Oracle Has Some Big Advantages To Mainstream AI was written by Timothy Prickett Morgan at The Next Platform.
AI models have rapidly evolved from GPT-2 (1.5B parameters) in 2019 to models like GPT-4 (1+ trillion parameters) and DeepSeek-V3 (671B parameters, using Mixture-of-Experts). More parameters enhance context understanding and text/image generation but increase computational demands. Modern AI is now multimodal, handling text, images, audio, and video (e.g., GPT-4V, Gemini), and task-specific, fine-tuned for applications like drug discovery, financial modeling or coding. As AI models continue to scale and evolve, they require massive parallel computing, specialized hardware (GPUs, TPUs), and crucially, optimized networking to ensure efficient training and inference.
In Model Parallelism, the neural network is partitioned across multiple GPUs, with each GPU responsible for specific layers of the model. This strategy is particularly beneficial for large-scale models that surpass the memory limitations of a single GPU.
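As a minimal sketch of the idea (assuming PyTorch and two visible GPUs; the device names and layer sizes are illustrative, not taken from the chapter), each stage of the network is placed on a different device and the activations are moved explicitly between them during the forward pass:

```python
# Minimal model-parallel sketch (assumes PyTorch and two visible GPUs).
# Each stage of the network lives on a different device; activation tensors
# are moved explicitly between devices during the forward pass.
import torch
import torch.nn as nn

class TwoGPUModel(nn.Module):
    def __init__(self):
        super().__init__()
        # First hidden layer on GPU 0; remaining layers on GPU 1
        self.stage1 = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU()).to("cuda:0")
        self.stage2 = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU(),
                                    nn.Linear(4096, 10)).to("cuda:1")

    def forward(self, x):
        x = self.stage1(x.to("cuda:0"))
        # The activation tensor crosses the GPU-to-GPU interconnect here
        x = self.stage2(x.to("cuda:1"))
        return x

model = TwoGPUModel()
out = model(torch.randn(32, 1024))   # output resides on cuda:1
```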
Conversely, Pipeline Parallelism involves dividing the model into consecutive stages, assigning each stage to a different GPU. This setup allows data to be processed in a pipeline fashion, akin to an assembly line, enabling simultaneous processing of multiple training samples. Without pipeline parallelism, each GPU would process its inputs sequentially from the complete dataset, while all other GPUs remain idle.
Our example neural network in Figure 8-3 consists of three hidden layers and an output layer. The first hidden layer is assigned to GPU A1, while the second and third hidden layers are assigned to GPU A2 and GPU B1, respectively. The output layer is placed on GPU B2. The training dataset is divided into four micro-batches and stored on the GPUs. These micro-batches are fed sequentially into the first hidden layer on GPU A1.
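As a rough illustration of that schedule (a toy loop, not code from the chapter), the snippet below prints which micro-batch each of the four stages works on at every time step; after the initial fill phase, all four GPUs are busy simultaneously, which is exactly the assembly-line effect pipeline parallelism is after:

```python
# Toy schedule mirroring the Figure 8-3 setup: 4 pipeline stages
# (GPU A1, A2, B1, B2) and 4 micro-batches. At time step t, stage s
# works on micro-batch t - s, so after a short fill phase every GPU
# has a micro-batch in flight.
stages = ["GPU A1", "GPU A2", "GPU B1", "GPU B2"]
micro_batches = 4

for t in range(micro_batches + len(stages) - 1):
    work = []
    for s, gpu in enumerate(stages):
        mb = t - s
        work.append(f"{gpu}: mb{mb}" if 0 <= mb < micro_batches else f"{gpu}: idle")
    print(f"t={t}  " + " | ".join(work))
```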
Note 8-1. In this example, we use a small training dataset. However, if the dataset is too large to fit on a Continue reading
Wouldn’t it be funny if all of that money that Microsoft spent last year paying neocloud upstart CoreWeave was just to support ever-embiggening AI training workloads at OpenAI as it makes its GPT models smarter? …
What A Tangled OpenAI Web We CoreWeave was written by Timothy Prickett Morgan at The Next Platform.
In the rapidly evolving world of Kubernetes, network security remains one of the most challenging aspects for organizations. The shift to dynamic containerized environments brings challenges like inter-cluster communication, rapid scaling, and multi-cloud deployments. These challenges, compounded by tool sprawl and fragmented visibility, leave teams grappling with operational inefficiencies, misaligned priorities, and increasing vulnerabilities. Without a unified solution, organizations risk security breaches and compliance failures.
Calico reimagines Kubernetes security with a holistic, end-to-end approach that simplifies operations while strengthening defenses. By unifying key capabilities like ingress and egress gateways, microsegmentation, and real-time observability, Calico empowers teams to bridge the gaps between security, compliance, and operational efficiency. The result is a scalable, robust platform that addresses the unique demands of containerized environments without introducing unnecessary complexity. Let’s look at how Calico’s key network security capabilities make this possible.
The Calico Ingress Gateway is a Kubernetes-native solution, built on the Envoy Gateway, that serves as a centralized entry point for managing and securing incoming traffic to your clusters. Implementing the Kubernetes Gateway API specification, it replaces traditional ingress controllers with a more robust, scalable, and flexible architecture that is capable of more Continue reading
SPONSORED FEATURE: As an industry, financial services is accustomed to big numbers. …
Data Deluge Pushes Financial Services Deeper Into AI was written by Timothy Prickett Morgan at The Next Platform.
It all started with an innocuous article describing the MTU basics. As the real purpose of the MTU is to prevent packet drops due to fixed-size receiver buffers, and I spend most of my time in virtual labs, I wanted to check how various virtual network devices react to incoming oversized packets.
As the first step, I created a simple netlab topology in which a single link had a slightly larger than usual MTU… and then all hell broke loose.
With RISC-V International, the body controlling the RISC-V instruction set, located in Switzerland for the past five years, RISC-V now has just as much right to call itself indigenous to Europe as does Arm Ltd, the British chip company that finds itself on the other side of the English Channel after the Brexit break up and that is still around 90 percent owned by Japanese conglomerate SoftBank. …
Europe Takes Another Whack At Homegrown Compute Engines was written by Timothy Prickett Morgan at The Next Platform.
Figure 8-1 depicts some of the model parameters that need to be stored in GPU memory: a) Weight matrices associated with connections to the preceding layer, b) Weighted sum (z), c) Activation values (y), d) Errors (E), e) Local gradients (local ∇), f) Gradients received from peer GPUs (remote ∇), g) Learning rates (LR), and h) Weight adjustment values (Δw).
In addition, the training and test datasets, along with the model code, must also be stored in GPU memory. However, a single GPU may not have enough memory to accommodate all these elements. To address this limitation, an appropriate parallelization strategy must be chosen to efficiently distribute computations across multiple GPUs.
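To get a feel for why a single GPU runs out of memory, here is a back-of-the-envelope estimate. It relies on a common rule of thumb for mixed-precision Adam training (roughly 16 bytes of state per parameter), which is an assumption on my part rather than a figure from the chapter:

```python
# Rough rule-of-thumb memory math (assumes mixed-precision Adam training):
# per parameter you keep FP16 weights (2 B), FP16 gradients (2 B), and FP32
# optimizer state -- master weights, momentum, variance (4 B each) -- i.e.
# about 16 bytes/parameter, before activations and datasets.
def training_memory_gib(num_params: float, bytes_per_param: int = 16) -> float:
    return num_params * bytes_per_param / 2**30

# A hypothetical 7B-parameter model already needs ~104 GiB of state,
# more than a single 80 GiB GPU can hold.
print(f"{training_memory_gib(7e9):.0f} GiB")
```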
This chapter introduces the most common strategies: data parallelism, model parallelism, pipeline parallelism, and tensor parallelism.
Figure 8-1: Overview of Neural Network Parameters.
In data parallelization, each GPU has an identical copy of the complete model but processes different mini-batches of data. Gradients from all GPUs are averaged and synchronized before updating the model. This approach is effective when the model fits within a single GPU’s memory.
In Figure 8-2, the batch of training data is split into eight micro-batches. The first four micro-batches are Continue reading
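As a rough illustration of the gradient-averaging step described above (plain NumPy, purely illustrative and not the book’s code), every replica computes gradients on its own micro-batch, the gradients are averaged — the job an all-reduce collective performs across GPUs in real frameworks — and every replica applies the same update, so the model copies stay identical:

```python
# Data-parallel update rule in miniature: identical weights on every replica,
# per-replica gradients, averaged gradient (what all-reduce computes),
# then the same weight update applied everywhere.
import numpy as np

num_replicas, lr = 4, 0.1
weights = np.ones(3)                                             # identical copy on each GPU
local_grads = [np.random.randn(3) for _ in range(num_replicas)]  # one gradient per micro-batch

avg_grad = sum(local_grads) / num_replicas   # all-reduce (average) across replicas
weights -= lr * avg_grad                     # same update on every replica
print(weights)
```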