Harshal Brahmbhatt

Author Archives: Harshal Brahmbhatt

Scaling with safety: Cloudflare’s approach to global service health metrics and software releases

Has your browsing experience ever been disrupted by this error page? Sometimes Cloudflare returns "Error 500" when our servers cannot respond to your web request. This inability to respond could have several potential causes, including problems caused by a bug in one of the services that make up Cloudflare's software stack.

We know that our testing platform will inevitably miss some software bugs, so we built guardrails to gradually and safely release new code before a feature reaches all users. Health Mediated Deployments (HMD) is Cloudflare’s data-driven solution to automating software updates across our global network. HMD works by querying Thanos, a system for storing and scaling Prometheus metrics. Prometheus collects detailed data about the performance of our services, and Thanos makes that data accessible across our distributed network. HMD uses these metrics to determine whether new code should continue to roll out, pause for further evaluation, or be automatically reverted to prevent widespread issues.

Cloudflare engineers configure signals from their service, such as alerting rules or Service Level Objectives (SLOs). For example, the following Service Level Indicator (SLI) checks the rate of HTTP 500 errors over 10 minutes returned from a service in our software stack.

sum(rate(http_request_count{code="500"}[10m]))  Continue reading

Introducing Object Lifecycle Management for Cloudflare R2

Introducing Object Lifecycle Management for Cloudflare R2
Introducing Object Lifecycle Management for Cloudflare R2

Last year, R2 made its debut, providing developers with object storage while eliminating the burden of egress fees. (For many, egress costs account for over half of their object storage bills!) Since R2’s launch, tens of thousands of developers have chosen it to store data for many different types of applications.

But for some applications, data stored in R2 doesn’t need to be retained forever. Over time, as this data grows, it can unnecessarily lead to higher storage costs. Today, we’re excited to announce that Object Lifecycle Management for R2 is generally available, allowing you to effectively manage object expiration, all from the R2 dashboard or via our API.

Object Lifecycle Management

Object lifecycles give you the ability to define rules (up to 1,000) that determine how long objects uploaded to your bucket are kept. For example, by implementing an object lifecycle rule that deletes objects after 30 days, you could automatically delete outdated logs or temporary files. You can also define rules to abort unfinished multipart uploads that are sitting around and contributing to storage costs.

Getting started with object lifecycles in R2

Cloudflare dashboard

Introducing Object Lifecycle Management for Cloudflare R2
  1. From the Cloudflare dashboard, select R2.
  2. Select your R2 bucket.
  3. Navigate to Continue reading