Cloudflare R2 and MosaicML enable training LLMs on any compute, anywhere in the world, with zero switching costs

Building the large language models (LLMs) and diffusion models that power generative AI requires massive infrastructure. The most obvious component is compute – hundreds to thousands of GPUs – but an equally critical (and often overlooked) component is the data storage infrastructure. Training datasets can be terabytes to petabytes in size, and this data needs to be read in parallel by thousands of processes. In addition, model checkpoints need to be saved frequently throughout a training run, and for LLMs these checkpoints can each be hundreds of gigabytes!
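To get a feel for why checkpoints reach hundreds of gigabytes, here is a rough back-of-the-envelope sketch. The parameter count and byte costs below are illustrative assumptions (fp32 weights plus two Adam-style optimizer states per parameter), not figures from any specific model:

```python
def checkpoint_size_gb(n_params: float,
                       bytes_per_weight: int = 4,
                       optimizer_bytes_per_param: int = 8) -> float:
    """Rough full-checkpoint size: weights plus optimizer state.

    Adam-style optimizers typically keep two extra fp32 tensors
    (momentum and variance) per parameter, hence 8 extra bytes.
    """
    total_bytes = n_params * (bytes_per_weight + optimizer_bytes_per_param)
    return total_bytes / 1e9  # decimal GB

# A hypothetical 7-billion-parameter model:
print(checkpoint_size_gb(7e9))  # 84.0 (GB) -- and models keep getting bigger
```

Saved every few thousand steps, checkpoints of this size add up quickly, which is why checkpoint storage matters as much as the dataset itself.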

To manage storage costs and scalability, many machine learning teams have been moving to object storage to host their datasets and checkpoints. Unfortunately, most object store providers use egress fees to “lock in” users to their platform. This makes it very difficult to leverage GPU capacity across multiple cloud providers, or take advantage of lower / dynamic pricing elsewhere, since the data and model checkpoints are too expensive to move. At a time when cloud GPUs are scarce, and new hardware options are entering the market, it’s more important than ever to stay flexible.
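To see why egress fees make data "too expensive to move", consider a sketch with an illustrative flat per-GB egress price (roughly in line with the rates major clouds have charged for internet egress; check current pricing, which varies by provider and tier):

```python
def egress_cost_usd(dataset_tb: float, price_per_gb: float) -> float:
    """Cost to move a dataset out of a cloud at a flat per-GB egress rate."""
    return dataset_tb * 1000 * price_per_gb  # decimal TB -> GB

# Moving a 100 TB training dataset at an illustrative $0.09/GB:
print(f"${egress_cost_usd(100, 0.09):,.0f}")  # $9,000 for every full copy moved
# With zero-egress object storage such as R2, the same move incurs $0 in egress.
```

Multiply that by every time you want to chase cheaper or more available GPUs in another cloud or region, and the lock-in effect is clear.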

In addition to high egress fees, there is a technical barrier to object-store-centric machine learning training. Reading and …

Use Snowflake with R2 to extend your global data lake

R2 is the ideal object storage platform to build data lakes. It’s infinitely scalable, highly durable (eleven 9's of annual durability), and has no egress fees. Zero egress fees mean zero vendor lock-in. You are free to use the tools you want to get the maximum value from your data.

Today we’re excited to announce our partnership with Snowflake so that you can use Snowflake to query data stored in your R2 data lake and load data from R2 into Snowflake. Organizations use Snowflake's Data Cloud to unite siloed data, discover and securely share it, and execute diverse analytic workloads across multiple clouds.

One challenge of loading data into Snowflake database tables and querying external data lakes is the cost of data transfer. If your data is coming from a different cloud or even different region within the same cloud, this typically means you are paying an additional tax for each byte going into Snowflake. Pairing R2 and Snowflake lets you focus on getting valuable insights from your data, without having to worry about egress fees piling up.
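Tools like Snowflake reach R2 through its S3-compatible API, which is served from an account-scoped endpoint. As a minimal sketch (the account ID, bucket, and key below are placeholders), the endpoint and path-style object URLs look like this:

```python
def r2_s3_endpoint(account_id: str) -> str:
    """R2's S3-compatible API endpoint is scoped to your Cloudflare account."""
    return f"https://{account_id}.r2.cloudflarestorage.com"

def r2_object_url(account_id: str, bucket: str, key: str) -> str:
    """Path-style URL for an object behind the S3-compatible API."""
    return f"{r2_s3_endpoint(account_id)}/{bucket}/{key}"

# Hypothetical account and bucket names:
print(r2_object_url("abc123", "datalake", "events/2023/05/16.parquet"))
```

Any S3-compatible client or integration can then be pointed at that endpoint with an R2 API token serving as the access key pair.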

Getting started

Sign up for R2 and create an API token

If you haven’t already, you’ll need to sign up for R2.

Announcing connect() — a new API for creating TCP sockets from Cloudflare Workers

Today, we are excited to announce a new API in Cloudflare Workers for creating outbound TCP sockets, making it possible to connect directly to any TCP-based service from Workers.

Standard protocols including SSH, MQTT, SMTP, FTP, and IRC are all built on top of TCP. Most importantly, nearly all applications need to connect to databases, and most databases speak TCP. And while Cloudflare D1 works seamlessly on Workers, and some hosted database providers allow connections over HTTP or WebSockets, the vast majority of databases, both relational (SQL) and document-oriented (NoSQL), require clients to connect by opening a direct TCP “socket”, an ongoing two-way connection that is used to send queries and receive data. Now, Workers provides an API for this, the first of many steps to come in allowing you to use any database or infrastructure you choose when building full-stack applications on Workers.
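The Workers connect() API itself is JavaScript, but the underlying idea is the same everywhere: an ongoing two-way byte stream between client and server. As a plain-Python illustration (using a toy in-process echo server as a stand-in for a database), a client opens a socket, writes a "query", and reads the reply over the same connection:

```python
import socket
import socketserver
import threading

class EchoHandler(socketserver.BaseRequestHandler):
    """Toy TCP service: echoes back whatever the client sends."""
    def handle(self):
        data = self.request.recv(1024)
        self.request.sendall(data)

# Stand-in for a database server listening on a TCP port.
server = socketserver.TCPServer(("127.0.0.1", 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address

# The client side: open a socket, send a "query", read the response.
with socket.create_connection((host, port)) as sock:
    sock.sendall(b"SELECT 1;")
    reply = sock.recv(1024)

server.shutdown()
print(reply)  # b'SELECT 1;' -- the same bytes, round-tripped over TCP
```

A real database driver layers its wire protocol (authentication, query framing, result decoding) on top of exactly this kind of socket, which is what the new Workers API now makes possible.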

Database drivers, the client code used to connect to databases and execute queries, are already using this new API. pg, the most widely used JavaScript database driver for PostgreSQL, works on Cloudflare Workers today, with more database drivers to come.

The TCP Socket API is available today to everyone. Get started by reading the TCP Socket API documentation.

IT pros worry about network data being fed to AI tools

As more IT organizations apply artificial intelligence (AI), machine learning (ML), and so-called AIOps technology to network management, network data is critical to success. AI/ML technology requires more and more data to learn individual networks, derive insights, and offer recommendations. Unfortunately, many organizations encounter problems when trying to feed network data to these AI tools. In other words, network teams need to modernize their approach to network data before they embrace AI technology.

Enterprise Management Associates recently surveyed 250 IT professionals about their experience with AI/ML-driven network management solutions for a report, “AI-Driven Networks: Leveling up Network Management.” It found that data problems are the number-two technical challenge they encounter when applying AI/ML to network management. Only network complexity is a bigger technical issue.

Google launches A3 supercomputer VMs

Google Cloud announced a new supercomputer virtual-machine series aimed at rapidly training large AI models. Unveiled at the Google I/O conference, the new A3 supercomputer VMs are purpose-built to handle the considerable resource demands of a large language model (LLM). “A3 GPU VMs were purpose-built to deliver the highest-performance training for today’s ML workloads, complete with modern CPU, improved host memory, next-generation Nvidia GPUs and major network upgrades,” the company said in a statement.

The instances are powered by eight Nvidia H100 GPUs (Nvidia’s newest GPU, which began shipping earlier this month), as well as Intel’s 4th Generation Xeon Scalable processors, 2TB of host memory, and 3.6 TB/s of bisectional bandwidth between the eight GPUs via Nvidia’s NVSwitch and NVLink 4.0 interconnects.

Cisco aims for full-stack observability with AppDynamics/ThousandEyes tie-in

Cisco is more tightly integrating its network- and application-intelligence tools in an effort to help customers quickly diagnose and remediate performance problems. An upgrade to Cisco's Digital Experience Monitoring (DEM) platform melds the vendor’s AppDynamics application observability capabilities and ThousandEyes network intelligence with a bi-directional, OpenTelemetry-based integration package. (Read more about how to shop for network observability tools.)

The goal with DEM is to get business, infrastructure, networking, security operations, and DevSecOps teams working together more effectively to find the root cause of a problem and quickly address the issue, said Carlos Pereira, Cisco Fellow and chief architect in its Strategy, Incubation & Applications group.

Network Break 430: Cisco Viptela Customers Have A Certifiably Bad Day; IT Crimes And Punishments

Take a Network Break! This week we cover some follow-up on Lumen. Then we dive into a massive Cisco blunder that let a digital certificate expire on some models of the Viptela SD-WAN appliance, causing device failures. Extreme Networks releases a new Wi-Fi 6E AP and core and aggregation switches, a Ubiquiti employee who stole […]

The post Network Break 430: Cisco Viptela Customers Have A Certifiably Bad Day; IT Crimes And Punishments appeared first on Packet Pushers.

Zero Trust Security for AI

A collection of tools from Cloudflare One to help your teams use AI services safely

Cloudflare One gives teams of any size the ability to safely use the best tools on the Internet without management headaches or performance challenges. We’re excited to announce Cloudflare One for AI, a new collection of features that help your team build with the latest AI services while still maintaining a Zero Trust security posture.

Large Language Models, Larger Security Challenges

A Large Language Model (LLM), like OpenAI’s GPT or Google’s Bard, consists of a neural network trained against a set of data to predict and generate text based on a prompt. Users can ask questions, solicit feedback, and lean on the service to create output from poetry to Cloudflare Workers applications.

The tools also bear an uncanny resemblance to a real human, and as in some real-life personal conversations, oversharing can become a serious problem with these AI services. This risk multiplies due to the types of use cases where LLMs thrive: these tools can help developers solve difficult coding challenges or help information workers distill succinct reports from a mess of notes. While helpful, every input fed into a prompt becomes a piece of …