Alex Fattouche

Author Archives: Alex Fattouche

Migrating billions of records: moving our active DNS database while it’s in use

According to a survey done by W3Techs, as of October 2024, Cloudflare is used as an authoritative DNS provider by 14.5% of all websites. As an authoritative DNS provider, we are responsible for managing and serving all the DNS records for our clients’ domains. This means we have an enormous responsibility to provide the best service possible, starting at the data plane. As such, we are constantly investing in our infrastructure to ensure the reliability and performance of our systems.

DNS is often referred to as the phone book of the Internet, and is a key component of the Internet. If you have ever used a phone book, you know that they can become extremely large depending on the size of the physical area it covers. A zone file in DNS is no different from a phone book. It has a list of records that provide details about a domain, usually including critical information like what IP address(es) each hostname is associated with. For example:

example.com      59 IN A 198.51.100.0
blog.example.com 59 IN A 198.51.100.1
ask.example.com  59 IN A 198.51.100.2

It is not unusual Continue reading

Making zone management more efficient with batch DNS record updates

Customers that use Cloudflare to manage their DNS often need to create a whole batch of records, enable proxying on many records, update many records to point to a new target at the same time, or even delete all of their records. Historically, customers had to resort to bespoke scripts to make these changes, which came with their own set of issues. In response to customer demand, we are excited to announce support for batched API calls to the DNS records API starting today. This lets customers make large changes to their zones much more efficiently than before. Whether sending a POST, PUT, PATCH or DELETE, users can now execute these four different HTTP methods, and multiple HTTP requests all at the same time.

Efficient zone management matters

DNS records are an essential part of most web applications and websites, and they serve many different purposes. The most common use case for a DNS record is to have a hostname point to an IPv4 address, this is called an A record:

example.com 59 IN A 198.51.100.0

blog.example.com 59 IN A 198.51.100.1

ask.example.com 59 IN A 198.51. Continue reading

How we improved DNS record build speed by more than 4,000x

How we improved DNS record build speed by more than 4,000x

This post is also available in 简体中文, 日本語, Español.

How we improved DNS record build speed by more than 4,000x

Since my previous blog about Secondary DNS, Cloudflare's DNS traffic has more than doubled from 15.8 trillion DNS queries per month to 38.7 trillion. Our network now spans over 270 cities in over 100 countries, interconnecting with more than 10,000 networks globally. According to w3 stats, “Cloudflare is used as a DNS server provider by 15.3% of all the websites.” This means we have an enormous responsibility to serve DNS in the fastest and most reliable way possible.

Although the response time we have on DNS queries is the most important performance metric, there is another metric that sometimes goes unnoticed. DNS Record Propagation time is how long it takes changes submitted to our API to be reflected in our DNS query responses. Every millisecond counts here as it allows customers to quickly change configuration, making their systems much more agile. Although our DNS propagation pipeline was already known to be very fast, we had identified several improvements that, if implemented, would massively improve performance. In this blog post I’ll explain how we managed to drastically improve our DNS record propagation speed, and the Continue reading

Moving k8s communication to gRPC

Moving k8s communication to gRPC
Moving k8s communication to gRPC

Over the past year and a half, Cloudflare has been hard at work moving our back-end services running in our non-edge locations from bare metal solutions and Mesos Marathon to a more unified approach using Kubernetes(K8s). We chose Kubernetes because it allowed us to split up our monolithic application into many different microservices with granular control of communication.

For example, a ReplicaSet in Kubernetes can provide high availability by ensuring that the correct number of pods are always available. A Pod in Kubernetes is similar to a container in Docker. Both are responsible for running the actual application. These pods can then be exposed through a Kubernetes Service to abstract away the number of replicas by providing a single endpoint that load balances to the pods behind it. The services can then be exposed to the Internet via an Ingress. Lastly, a network policy can protect against unwanted communication by ensuring the correct policies are applied to the application. These policies can include L3 or L4 rules.

The diagram below shows a simple example of this setup.

Moving k8s communication to gRPC

Though Kubernetes does an excellent job at providing the tools for communication and traffic management, it does not help the developer decide the Continue reading

Secondary DNS – Deep Dive

How Does Secondary DNS Work?

Secondary DNS - Deep Dive

If you already understand how Secondary DNS works, please feel free to skip this section. It does not provide any Cloudflare-specific information.

Secondary DNS has many use cases across the Internet; however, traditionally, it was used as a synchronized backup for when the primary DNS server was unable to respond to queries. A more modern approach involves focusing on redundancy across many different nameservers, which in many cases broadcast the same anycasted IP address.

Secondary DNS involves the unidirectional transfer of DNS zones from the primary to the Secondary DNS server(s). One primary can have any number of Secondary DNS servers that it must communicate with in order to keep track of any zone updates. A zone update is considered a change in the contents of a  zone, which ultimately leads to a Start of Authority (SOA) serial number increase. The zone’s SOA serial is one of the key elements of Secondary DNS; it is how primary and secondary servers synchronize zones. Below is an example of what an SOA record might look like during a dig query.

example.com	3600	IN	SOA	ashley.ns.cloudflare.com. dns.cloudflare.com. 
2034097105  // Serial
10000 //  Continue reading

Orange Clouding with Secondary DNS

What is secondary DNS?

Orange Clouding with Secondary DNS

In a traditional sense, secondary DNS servers act as a backup to the primary authoritative DNS server.  When a change is made to the records on the primary server, a zone transfer occurs, synchronizing the secondary DNS servers with the primary server. The secondary servers can then serve the records as if they were the primary server, however changes can only be made by the primary server, not the secondary servers. This creates redundancy across many different servers that can be distributed as necessary.

There are many common ways to take advantage of Secondary DNS, some of which are:

  1. Secondary DNS as passive backup - The secondary DNS server sits idle until the primary server goes down, at which point a failover can occur and the secondary can start serving records.
  2. Secondary DNS as active backup - The secondary DNS server works alongside the primary server to serve records.
  3. Secondary DNS with a hidden primary - The nameserver records at the registrar point towards the secondary servers only, essentially treating them as the primary nameservers.

What is secondary DNS Override?

Secondary DNS Override builds on the Secondary DNS with a hidden primary model by allowing our Continue reading