Crafting A DGX-Alike AI Server Out Of AMD GPUs And PCI Switches

Not everybody can afford an Nvidia DGX AI server loaded up with the latest “Hopper” H100 GPU accelerators or even one of its many clones available from the OEMs and ODMs of the world.

The post Crafting A DGX-Alike AI Server Out Of AMD GPUs And PCI Switches first appeared on The Next Platform.

Crafting A DGX-Alike AI Server Out Of AMD GPUs And PCI Switches was written by Timothy Prickett Morgan at The Next Platform.

Predicting and Surviving Correlated Failures Redundancy

In this archived panel discussion, Frank Ohlhorst, Henry Sow, and Stephen Lawton hold an in-depth conversation on 'Predicting and Surviving Correlated Failures Redundancy'. The excerpt comes from the live 'Network Resilience Boot Camp' virtual event, presented by Data Center Knowledge and Network Computing and moderated by Bonnie D. Graham.

Hybrid workforce demands change from network ops

The pandemic forced businesses to send employees home to work, but even in recovery, the workforce trend is going strong. Some remote work measures were considered a temporary fix, and now the hybrid work reality demands IT organizations reassess how they can deliver consistent support, service, and technology to employees wherever they decide to work.

“There have been a lot of conversations about return to work, but it’s not really happening,” said Shamus McGillicuddy, vice president of research at Enterprise Management Associates, during a recent webinar.

Hiding from history on Linux

Linux shells like bash have a convenient way of remembering commands that you type, making it easy to run them again without having to retype them. Just use the history command (which is a bash built-in) and then type an exclamation point followed by the number shown in front of the command you want to rerun in the history output. Alternatively, you can back up to that command by pressing the up arrow key as many times as needed and then pressing Return. Don’t forget, though, that you can also set up commands you are likely to use often as aliases by adding a line to your ~/.bashrc file, so that you don’t need to search for them in your command history.
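What follows is not the article's own example but a minimal sketch of the alias approach described above. The helper name add_alias, the alias name recent, and the use of a scratch file in place of ~/.bashrc are all assumptions made for illustration.

```shell
#!/bin/sh
# add_alias FILE NAME COMMAND
# Append "alias NAME='COMMAND'" to FILE unless that exact line is already there.
# (Hypothetical helper for illustration; it is not taken from the article.)
add_alias() {
    line="alias $2='$3'"
    grep -qxF -- "$line" "$1" 2>/dev/null || printf '%s\n' "$line" >> "$1"
}

# Work on a scratch file here rather than touching the real ~/.bashrc.
rc_file=$(mktemp)
add_alias "$rc_file" recent 'history | tail -n 20'
add_alias "$rc_file" recent 'history | tail -n 20'   # second call changes nothing
cat "$rc_file"
```

Pointing the helper at ~/.bashrc and opening a new interactive bash session would make recent list the last 20 history entries, any of which can then be rerun with an exclamation point and its number.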

Backblaze sees rise in hard drive failure rates

The latest quarterly report from Backblaze on hard drive reliability reveals a rise in failures among certain drives.

Backblaze is a pure storage provider; cloud storage is all they do, and they dig deep into the statistics of hard drive failure and share their data with the industry. The company currently has a massive inventory of 241,297 hard disk drives of varying capacities and from various brands. (In recent quarters, Backblaze has added SSD performance to its measurements, but SSDs are still early in their deployment lifecycle, so patterns over time have yet to fully emerge.)

ECL set to build modular, hydrogen-powered data centers

ECL has announced what it says will be the world’s first modular, sustainable, off-grid data center that uses hydrogen as its primary power source, promising carbon neutral performance and 99.9999% uptime.

Modular data centers are designed to go together like building blocks, allowing companies to start small and grow as their capacity needs increase. The ECL data centers will come in 1 megawatt blocks.

ECL's data-center-as-a-service offering is geared primarily to mid-sized data center operators, as well as large companies with a mix of cloud and on-premises IT environments. It claims its data centers will have a total cost of ownership that's two-thirds of what a traditional colocation data center environment would cost when measured over five years.
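To put the 99.9999% uptime figure in perspective, the short sketch below converts an availability percentage into the downtime it permits per year. It assumes a 365-day year, and the function name is hypothetical; the only number taken from the article is 99.9999.

```shell
#!/bin/sh
# downtime_seconds_per_year PCT
# Print the downtime, in seconds per year, implied by an availability of PCT percent.
# Assumption: a 365-day year (leap years ignored).
downtime_seconds_per_year() {
    awk -v pct="$1" 'BEGIN { printf "%.1f\n", (1 - pct / 100) * 365 * 24 * 3600 }'
}

# The uptime figure quoted in the announcement.
downtime_seconds_per_year 99.9999
```

Under those assumptions the call prints 31.5, which is roughly half a minute of downtime per year.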

GPU Shortages Will Prop Up The Clouds In More Ways Than One

For the last two quarters at least, the generic infrastructure server market – the one running databases, application servers, various web layers, and print and file serving workloads the world over – has been in a recession.

The post GPU Shortages Will Prop Up The Clouds In More Ways Than One first appeared on The Next Platform.

GPU Shortages Will Prop Up The Clouds In More Ways Than One was written by Timothy Prickett Morgan at The Next Platform.

Containerlab dashboard

The GitHub sflow-rt/containerlab project contains example network topologies for the Containerlab network emulation tool that demonstrate real-time streaming telemetry in realistic data center topologies and network configurations. The examples use the same FRRouting (FRR) engine that is part of SONiC, NVIDIA Cumulus Linux, and DENT network operating systems. Containerlab can be used to experiment before deploying solutions into production. Examples include: tracing ECMP flows in leaf and spine topologies, EVPN visibility, and automated DDoS mitigation using BGP Flowspec and RTBH controls.
The screen capture at the top of this article shows a real-time dashboard displaying up-to-the-second traffic analytics gathered from the 5-stage Clos fabric shown above. This article walks through the steps needed to run the example.
git clone https://github.com/sflow-rt/containerlab.git
cd containerlab
./run-clab
Run the above commands to download the project and run Containerlab on a system with Docker installed. Docker Desktop is a convenient way to run the labs on a laptop.
containerlab deploy -t clos5.yml
Start the emulation.
./topo.py clab-clos5
Post the topology to the sFlow-RT REST API. Connect to http://localhost:8008/app/containerlab-dashboard/html/ to access the dashboard shown at the top of this article.
docker exec -it clab-clos5-h1 iperf3 -c 172.16. Continue reading

Network Break 442: HashiCorp Swaps Open Source For BSL; Open Enterprise Linux Goes After RHEL

Today on Network Break we discuss big moves in open source, including HashiCorp switching from an open source license to "business source" and Red Hat competitors banding together to offer an alternative to Red Hat Enterprise Linux (RHEL). We also discuss Google's odd attempt to get employees back to the office by charging them to stay at an on-campus hotel.

The post Network Break 442: HashiCorp Swaps Open Source For BSL; Open Enterprise Linux Goes After RHEL appeared first on Packet Pushers.

Wasm core dumps and debugging Rust in Cloudflare Workers


A clear sign of maturity for any new programming language or environment is how easy and efficient it is to debug. Programming, like any other complex task, involves various challenges and potential pitfalls. Logic errors, off-by-ones, null pointer dereferences, and memory leaks are some examples of things that can make software developers desperate if their workflows and tools don't let them pinpoint and fix these issues quickly.

WebAssembly (Wasm) is a binary instruction format designed to be a portable and efficient target for the compilation of high-level languages like Rust, C, C++, and others. In recent years, it has gained significant traction for building high-performance applications in web and serverless environments.

Cloudflare Workers has had first-party support for Rust and Wasm for quite some time. We've been using this powerful combination to bootstrap and build some of our most recent services, like D1, Constellation, and Signed Exchanges, to name a few.

Using tools like Wrangler, our command-line tool for building with Cloudflare developer products, makes streaming real-time logs from our applications running remotely easy. Still, to be honest, debugging Rust and Wasm with Cloudflare Workers involves a lot of the good old time-consuming and Continue reading

Interfaces Management with Ansible validated content using the network.interfaces collection

Introduction

At AnsibleFest 2022, we announced a new addition to the content ecosystem offered through Red Hat Ansible Automation Platform: Ansible validated content. Ansible validated content is use case-focused and provides an expert-guided path for performing operational tasks.  

While Red Hat Ansible Certified Content Collections focus on how to integrate platforms (typically in the form of modules), Ansible validated content offers expert best practices and guidance for how to perform operations or tasks (typically in the form of roles or playbooks). Some Ansible validated content may depend on certified content (modules) for integration.

Specifically in the network automation area, we have already seen the release of the network.base and network.bgp validated content.

Network engineers commonly ask about automation for network interfaces, which are the fundamental connection point for endpoints (as layer 2 access ports) or for other networking devices that extend the network to other domains (as layer 3 interfaces). However, it is extremely challenging to collect data at scale and at the same time standardize interface settings according to specific rules through automation.

For this reason, we want to introduce you to the new network.interfaces collection. In this blog, we will show how Continue reading

Kubernetes 005. Overview of MicroK8s from Canonical (Ubuntu-folks).

Dear friend,

This year I had the pleasure and privilege of attending KubeCon Europe 2023, and that was the first time I heard about MicroK8s. It sounded interesting, and I decided I should experiment with it and write a blog post; but it didn’t catch my attention enough to put it at the top of my list, so I put it on the back burner instead. Last week I was talking to a colleague of mine, who told me that he needed to test something in his production Kubernetes at home. I was curious what he meant by “production Kubernetes cluster at home”, and it turned out to be MicroK8s. At that point I thought I had no more excuses, so I should just sit down and write it.

Is Kubernetes Used in Network Automation?

It is, indeed. Last week, when we posted a blog about starting programming in C, we had an interesting discussion on LinkedIn about Go vs Python with one right-honorable gentleman, who rightfully suggested that one of the main weaknesses of Python is that it requires installing dependencies on the host before you can use an application. However, to be brutally honest, many Continue reading