Moving Baselime from AWS to Cloudflare: simpler architecture, improved performance, over 80% lower cloud costs

Introduction

When Baselime joined Cloudflare in April 2024, our architecture had evolved to hundreds of AWS Lambda functions, dozens of databases, and just as many queues. We were drowning in complexity and our cloud costs were growing fast. We are now building Baselime and Workers Observability on Cloudflare and will save over 80% on our cloud compute bill. The estimated potential Cloudflare costs are for Baselime, which remains a stand-alone offering, and the estimate is based on the Workers Paid plan. Not only did we achieve huge cost savings, we also simplified our architecture and improved overall latency, scalability, and reliability.

Cost (daily)                       Before (AWS)                         After (Cloudflare)

Compute                            $650 - AWS Lambda                    $25 - Cloudflare Workers
CDN                                $140 - Cloudfront                    $0 - Free
Data Stream + Analytics database   $1,150 - Kinesis Data Stream + EC2   $300 - Workers Analytics Engine

Total (daily)                      $1,940                               $325
Total (annual)                     $708,100                             $118,625 (83% cost reduction)

Table 1: AWS vs. Workers Costs Comparison ($USD)
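The totals in Table 1 can be checked with a few lines of arithmetic. The sketch below copies the daily figures from the table and assumes the annual totals are the daily totals multiplied by 365, which reproduces the published numbers. The variable names are illustrative only.

```python
# Daily costs in USD, copied from Table 1 as (AWS, Cloudflare) pairs.
daily_costs = {
    "Compute": (650, 25),
    "CDN": (140, 0),
    "Data Stream + Analytics database": (1150, 300),
}

# Assumption: annual totals are daily totals times 365 days.
DAYS_PER_YEAR = 365

aws_daily = sum(aws for aws, _ in daily_costs.values())
cf_daily = sum(cf for _, cf in daily_costs.values())

aws_annual = aws_daily * DAYS_PER_YEAR
cf_annual = cf_daily * DAYS_PER_YEAR

# Relative saving of the Cloudflare bill against the AWS bill.
reduction_pct = (1 - cf_daily / aws_daily) * 100

print(f"Daily:  AWS ${aws_daily:,} vs Cloudflare ${cf_daily:,}")
print(f"Annual: AWS ${aws_annual:,} vs Cloudflare ${cf_annual:,}")
print(f"Reduction: {reduction_pct:.0f}%")
```

Most of the remaining Cloudflare spend in this breakdown is the analytics database line ($300 of $325 per day), while compute falls from $650 to $25 per day.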

When we joined Cloudflare, we immediately saw a surge in usage. Within the first week following the announcement, we were processing over a billion events daily, and our weekly active users had tripled.

As the platform grew, so did the challenges...

Workers Builds: integrated CI/CD built on the Workers platform

During 2024’s Birthday Week, we launched Workers Builds in open beta — an integrated Continuous Integration and Delivery (CI/CD) workflow you can use to build and deploy everything from full-stack applications built with the most popular frameworks to simple static websites onto the Workers platform. With Workers Builds, you can connect a GitHub or GitLab repository to a Worker, and Cloudflare will automatically build and deploy your changes each time you push a commit.

Workers Builds is intended to bridge the gap between the developer experiences for Workers and Pages, the latter of which launched with an integrated CI/CD system in 2020. As we continue to merge the experiences of Pages and Workers, we wanted to bring one of the best features of Pages to Workers: the ability to tie deployments to existing development workflows in GitHub and GitLab with minimal developer overhead. 

In this post, we’re going to share how we built the Workers Builds system on Cloudflare’s Developer Platform, using Workers, Durable Objects, Hyperdrive, Workers Logs, and Smart Placement.

The design problem

The core problem for Workers Builds is how to pick up a commit from GitHub or GitLab and start a...

Google Covers Its Compute Engine Bases Because It Has To

The minute that search engine giant Google wanted to be a cloud, and the several years later that Google realized that companies were not ready to buy full-on platform services that masked the underlying hardware but wanted lower level infrastructure services that gave them more optionality as well as more responsibility, it was inevitable that Google Cloud would have to buy compute engines from Intel, AMD, and Nvidia for its server fleet.

Google Covers Its Compute Engine Bases Because It Has To was written by Timothy Prickett Morgan at The Next Platform.

Ethernet at NANOG 92

Ethernet has been the mainstay of much of the networking environment for almost 50 years now, but that doesn't mean that it has remained unchanged over that period. The technology has evolved through continual increases in the scale of Ethernet networks: in capacity, reach, and number of connections. I'd like to report on a couple of Ethernet-related presentations from the recent NANOG 92 meeting, held in Toronto in October 2024, that described some recent developments in Ethernet.

Installing Certificate on ISE Lab Server

When ISE is installed, all the certificates used for different services such as EAP, Admin portal, etc., are self-signed. Below is a short summary of the certificates that ISE uses:

  • Admin – Authentication of the ISE admin portal (GUI).
  • EAP Authentication – EAP protocols that use SSL/TLS tunneling.
  • RADIUS DTLS – RADsec server (encrypted RADIUS).
  • pxGrid – pxGrid controller.
  • SAML – For SAML signing.
  • Portal – For portals.

The certificates can be seen by going to Administration -> System -> Certificates:

A certificate can be viewed by selecting the checkbox and clicking View:

Self-signed certificates aren’t good, because clients have no trusted CA to validate them against. Certificates should be signed by a trusted CA. That could be a public root CA, or more commonly, especially for labs, an internal CA. Before such a certificate can be installed, ISE must be configured to trust that CA. This is done by importing the root CA certificate. I’ll download the certificate from the web service on the ADCS server. The web service is reachable on https://<IP of ADCS server>/certsrv/. Click Download a CA certificate, certificate chain or CRL:

On the next page, change to Base 64 and then click Download CA certificate:

The file is downloaded...
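Before importing the downloaded file into ISE, it can be worth confirming that it really is in Base 64 (PEM) form rather than binary DER. The following is a minimal Python sketch using only the standard library. The function name looks_like_base64_cert and the sample bytes are hypothetical and are not part of ISE or ADCS. It only checks the encoding, not whether the certificate itself is valid or trusted.

```python
import ssl


def looks_like_base64_cert(data: bytes) -> bool:
    """Return True if data is a Base 64 (PEM) encoded certificate blob.

    Hypothetical helper for illustration: it checks the encoding only,
    not the certificate's contents, validity, or trust chain.
    """
    try:
        text = data.decode("ascii")
    except UnicodeDecodeError:
        # Binary DER data is generally not valid ASCII.
        return False
    try:
        # Raises ValueError if the BEGIN/END CERTIFICATE markers are missing
        # or the Base 64 body cannot be decoded.
        ssl.PEM_cert_to_DER_cert(text.strip())
    except ValueError:
        return False
    return True


# Placeholder bytes standing in for a DER-encoded certificate (not a real one).
sample_der = bytes([0x30, 0x82, 0x01, 0x0A]) + b"\xff" * 16

# Wrap the same bytes in Base 64 with PEM header and footer lines.
sample_pem = ssl.DER_cert_to_PEM_cert(sample_der).encode("ascii")

print(looks_like_base64_cert(sample_pem))  # Base 64 form
print(looks_like_base64_cert(sample_der))  # binary DER form
```

In practice the bytes would be read from the downloaded file, for example with open(path, "rb").read(), where path is wherever the browser saved the CA certificate.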

Cloudflare’s perspective of the October 30 OVHcloud outage

On October 30, 2024, cloud hosting provider OVHcloud (AS16276) suffered a brief but significant outage. According to their incident report, the problem started at 13:23 UTC, and was described simply as “An incident is in progress on our backbone infrastructure.” OVHcloud noted that the incident ended 17 minutes later, at 13:40 UTC. Because OVHcloud is a major global cloud hosting provider, some customers use it as an origin for sites delivered by Cloudflare: if a given content asset is not in our cache for a customer’s site, we retrieve the asset from OVHcloud.

We observed traffic starting to drop at 13:21 UTC, just ahead of the reported start time. By 13:28 UTC, it was approximately 95% lower than pre-incident levels. Recovery appeared to start at 13:31 UTC, and by 13:40 UTC, the reported end time of the incident, it had reached approximately 50% of pre-incident levels.
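The timeline above can be laid out as simple time arithmetic. The sketch below uses only the timestamps quoted in the text. The helper names utc and minutes_between are illustrative, not part of any Cloudflare or OVHcloud tooling.

```python
from datetime import datetime


def utc(hhmm: str) -> datetime:
    """Build a timestamp on October 30, 2024 from an HH:MM string (UTC)."""
    return datetime.strptime(f"2024-10-30 {hhmm}", "%Y-%m-%d %H:%M")


def minutes_between(later: datetime, earlier: datetime) -> int:
    """Whole minutes elapsed from earlier to later."""
    return int((later - earlier).total_seconds() // 60)


# Times reported by OVHcloud.
reported_start = utc("13:23")
reported_end = utc("13:40")

# Times observed in Cloudflare's traffic data.
observed_drop = utc("13:21")
observed_trough = utc("13:28")    # traffic approximately 95% lower
observed_recovery = utc("13:31")  # recovery appears to start

reported_duration = minutes_between(reported_end, reported_start)
lead_time = minutes_between(reported_start, observed_drop)
time_to_trough = minutes_between(observed_trough, observed_drop)

print(f"Reported duration: {reported_duration} min")
print(f"Drop observed {lead_time} min before reported start")
print(f"Drop to trough: {time_to_trough} min")
```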

Traffic from OVHcloud (AS16276) to Cloudflare

Cloudflare generally exchanges most of our traffic with OVHcloud over peering links. However, as shown below, peered traffic volume during the incident fell significantly. It appears that some small amount of traffic briefly began to flow over transit links from Cloudflare to OVHcloud due to sudden...

HW039: Demystifying Private Mobile Networks

What is a private mobile network and how does it work? Guest Jeremy Rollinson, an expert in private cellular networks, joins host Keith Parsons to clarify misconceptions about private mobile networks, from terminology to spectrum allocations. They explore the differences between public and private networks, the evolution of private mobile networks, the importance of understanding...

HS087: Alkira’s Multi-Cloud NaaS Bridges Networking and Security (Sponsored)

Startup Alkira has built a Network as a Service (NaaS) offering that extends from on prem to public cloud and multi-cloud. Today’s sponsored episode of Heavy Strategy digs into Alkira’s capabilities in multi-cloud networking, security, automation, and cost transparency. Guest Manan Shah, SVP of Product at Alkira, explains how Alkira simplifies network management, enhances...

HPC Gets A Reconfigurable Dataflow Engine To Take On CPUs And GPUs

No matter how elegant and clever the design is for a compute engine, the difficulty and cost of moving existing – and sometimes very old – code from the device it currently runs on to that new compute engine is a very big barrier to adoption.

HPC Gets A Reconfigurable Dataflow Engine To Take On CPUs And GPUs was written by Timothy Prickett Morgan at The Next Platform.