In 2011 we launched the Cloudflare Apps platform in an article that described Cloudflare as “not ... the sexiest business in the world.” Sexy or not, Cloudflare has since grown from the 3.5 billion pageviews a month we were doing then to over 1.3 trillion per month today. Along the way, we’ve powered more than a million app installations onto our customers’ websites.
For the last 6 years Cloudflare has been focused on building one of the world’s largest networks. The importance of that work has not left as much time as we would have liked to improve our app platform. With just 21 apps, we knew we were not delivering all that our marketplace could offer.
About six months ago, we were introduced to the team at Eager. Eager was building its own app store for installation onto any website. They impressed us with their ability to enable even the most non-technical website owner to install powerful tools to improve their sites through a slick interface. Eager’s platform included the features we wanted in our marketplace, like the ability to preview an app on a user's site before installing it. Even better, Eager had a powerful app Continue reading
Like most of you, I first heard of Cloudflare via this blog. I read about HTTP/2, Railgun, the Hundredth Data Center, and Keyless SSL — but I never thought I would work here. I, along with my co-founder Adam, and our friends and coworkers were hard at work building something very different. We were working on a tool which spent most of its life in the web browser, not on servers all around the world: an app store for your website. Using our tool a website owner could find and install any of over a hundred apps which could help them collect feedback from their visitors, sell products on their site, or even make their site faster.
Our goal was to create a way for every website owner to find and install all of the open-source and SaaS tools technical experts use every day. As developers ourselves, we wanted to make it possible for a developer in her basement to build the next great tool and get it on a million websites (and make a million dollars) the next day. We didn’t want her to succeed because she had the biggest sales or marketing team, or the most Continue reading
M42 Smart Motorway in the West Midlands, UK; courtesy of Highways England.
The load time of your website not only affects your search engine rankings, but is also correlated to the conversion rate on your site:
Cloudflare is determined to help website administrators boost the performance of their websites. From today, Cloudflare users on our Business plan will gain a previously Enterprise-only Page Rule option, “Bypass Cache on Cookie”. When used in conjunction with a “Cache Everything” Page Rule, this setting allows for websites to cache the HTML of anonymous page visits without affecting dynamic content.
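The interaction between the two Page Rule settings can be sketched in a few lines of TypeScript. This is only an illustration of the decision described above, not Cloudflare's implementation: the `CachePageRule` shape, the `isHtmlCacheable` function, and the idea of matching a pattern against cookie names are all assumptions made for this sketch.

```typescript
// Hypothetical shape for the two Page Rule settings discussed above.
interface CachePageRule {
  cacheEverything: boolean;
  // Pattern matched against cookie names; an assumption for this sketch.
  bypassCacheOnCookie?: RegExp;
}

// Returns true when the HTML response may be served from cache.
function isHtmlCacheable(rule: CachePageRule, cookieHeader?: string): boolean {
  if (!rule.cacheEverything) {
    return false; // this sketch treats HTML as uncacheable without "Cache Everything"
  }
  const pattern = rule.bypassCacheOnCookie;
  if (!pattern || !cookieHeader) {
    return true; // no bypass pattern configured, or no cookies were sent
  }
  // Split "a=1; b=2" into cookie names and look for a bypass match.
  const names = cookieHeader.split(";").map((pair) => pair.trim().split("=")[0]);
  return !names.some((cookieName) => pattern.test(cookieName));
}
```

With a rule such as `{ cacheEverything: true, bypassCacheOnCookie: /^session/ }`, a request carrying only `theme=dark` would be treated as cacheable, while one carrying `session=abc123` would be sent to the origin.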
By caching anonymous page views, Cloudflare is able to help ensure that your origin webserver doesn't waste time constantly regenerating pages which change rarely. This ultimately allows us Continue reading
The following blog post describes a debugging adventure on Cloudflare's Mesos-based cluster. This internal cluster is primarily used to process log file information so that Cloudflare customers have analytics, and for our systems that detect and respond to attacks.
The problem encountered didn't have any effect on our customers, but did have engineers scratching their heads...
At some point in one of our clusters we started seeing errors like this (an NXDOMAIN for an existing domain on our internal DNS):
lookup some.existing.internal.host on 10.36.0.9:53: no such host
This seemed very weird, since the domain did indeed exist. It was one of our internal domains! Engineers had mentioned that they'd seen this behaviour, so we decided to investigate deeper. Queries triggering this error were varied and ranged from dynamic SRV records managed by mesos-dns to external domains looked up from inside the cluster.
Our first naive attempt was to run the following in a loop:
while true; do dig some.existing.internal.host > /tmp/dig.txt || break; done
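One caveat when adapting a loop like this: `dig` exits with status 0 whenever it receives a response, NXDOMAIN included, so `|| break` only fires on failures such as a timeout. Checking the `status:` field of the saved output is one way around that. The TypeScript sketch below shows the idea; `digStatus` is a hypothetical helper, and the sample text only mimics the header line that `dig` prints.

```typescript
// Extracts the response code (NOERROR, NXDOMAIN, SERVFAIL, ...) from dig's
// text output, or returns null when no header line is present.
function digStatus(output: string): string | null {
  const match = /status:\s*([A-Z]+)/.exec(output);
  return match ? match[1] : null;
}

// Hypothetical sample resembling the header dig prints for a failed lookup.
const sampleOutput = ";; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 4242";
const lookupFailed = digStatus(sampleOutput) === "NXDOMAIN";
```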
Running this for a while on one server did not reproduce the problem: all the lookups were successful. Then we took our service Continue reading
Recent headline grabbing DDoS attacks provoked heated debates in the DNS community. Everyone has strong opinions on how to harden DNS to avoid downtime in the future. Is it better to use a single DNS provider or multiple? What DNS TTL values are best? Does DNSSEC make you more or less exposed?
CC BY 2.0 image by Leticia Chamorro
These are valid questions worth serious discussion, but tuning your own DNS server settings is not the full story. Together, as a community, we need to harden the DNS protocol itself. We need to prepare it to withstand the toughest DDoS attacks the future will surely bring. In this blog post I'll point out an obscure feature in the core DNS protocol. It is not practical to use this "hidden" feature for DDoS mitigation now, but with a small tweak it could become extremely useful. The feature is currently unused not because of protocol problems, but because of the DNS Top Level Domain (TLD) operators' apathy. If it were working, it would reduce DDoS recovery time for the DNS servers under attack.
The feature in question is: DNS TLD glue records. More specifically DNS TLD glue records with Continue reading
On Wednesday afternoon, Cloudflare and other Internet companies noticed that the West African country of The Gambia had dropped off the Internet - the day before the presidential election that was planned to be held there on Thursday, December 1st. This is not unprecedented. The Ugandan government blocked access to Facebook and WhatsApp during its recent election. Internet blocking by governments has also been seen in Gabon. Even Ghana toyed with the idea earlier this year.
Gambia has a population of 1.8 million people, and according to World Internet Stats, Internet penetration is growing fast and is almost 20%. The latest statistics indicate that at least ten percent of Gambians are using Facebook. As shown in the graph below, on Thursday, the Gambian government cut off access to the global Internet and for 39 hours hundreds of thousands of Gambians were unable to use online services on which they rely every day.
All the networks in Gambia disappeared from the global routing tables. This could have been caused by a soft reconfiguration of Internet routers; or by a physical powering down of telecommunications equipment. At this point, we do not know. What we do know is that we Continue reading
Back in March my colleague Marek wrote about a Winter of Whopping Weekend DDoS Attacks where we were seeing 400Gbps attacks occurring mostly at the weekends. We speculated that attackers were busy with something else during the week.
This winter we've seen a new pattern, and attackers aren't taking the week off, but they do seem to be working regular hours.
CC BY 2.0 image by Carol VanHook
On November 23, the day before US Thanksgiving, our systems detected and mitigated an attack that peaked at 172Mpps and 400Gbps. The attack started at 1830 UTC and lasted non-stop for almost exactly 8.5 hours stopping at 0300 UTC. It felt as if an attacker 'worked' a day and then went home.
The very next day the same thing happened again (although the attack started 30 minutes earlier at 1800 UTC).
On the third day the attacker started promptly at 1800 UTC but went home a little early at around 0130 UTC. But they managed to peak the attack over 200Mpps and 480Gbps.
And the attacker just kept this up day after day. Right through Thanksgiving, Black Friday, Cyber Monday and into this week. Night after night attacks were peaking Continue reading
Sometime before midnight Monday (UK local time) a ship dropped its anchor and broke, not one, not two, but three undersea cables serving the island of Jersey in the English Channel. Jersey is part of the Channel Islands along with Guernsey and some smaller islands.
Image courtesy TeleGeography Submarine Cable Map
These things happen and that’s not a good thing. The cut was reported on the venerable BBC news website. For the telecom operators in Jersey (JT Global) this wasn’t good news. However, looking at the traffic from Cloudflare’s point of view, we can see that while the cable cut removed the direct path from London to Jersey, it was replaced by the backup path from Paris to Jersey. The move was 100% under the control of the BGP routing protocol. It's a relief that there's a fallback for when these unpredictable events happen.
Here's a look at one network on the island.
The red traffic is being served from our London data center (the normal location for all Jersey traffic) and the blue traffic is coming from our Paris data center. The step could well be caused by either a delayed break in one of the cables or the Continue reading
If you have experienced HTTP/2 for yourself, you are probably aware of the visible performance gains possible with HTTP/2 due to features like stream multiplexing, explicit stream dependencies, and Server Push.
There is, however, one important feature that is not obvious to the eye: HPACK header compression. Current implementations of Apache and nginx servers, as well as edge networks and CDNs using them, do not support the full HPACK implementation. We have, however, implemented the full HPACK in nginx, and upstreamed the part that performs Huffman encoding.
CC BY 2.0 image by Conor Lawless
This blog post gives an overview of the reasons for the development of HPACK, and the hidden bandwidth and latency benefits it brings.
As you probably know, a regular HTTPS connection is in fact an overlay of several connections in the multi-layer model. The most basic connection you usually care about is the TCP connection (the transport layer), on top of that you have the TLS connection (mix of transport/application layers), and finally the HTTP connection (application layer).
In the days of yore, HTTP compression was performed in the TLS layer, using gzip. Both headers and body were compressed indiscriminately, Continue reading
Documentation for JavaScript projects has traditionally been generated via annotations inserted as code comments. While this gets the job done, it seems far from ideal. In this post, I’ll explore how to use TypeScript to generate documentation from source code alone.
CC BY-SA 2.0 image by David Joyner
TypeScript is JavaScript with optional types. Here’s a simple example:
// Sends some data to some analytics endpoint
function sendAnalyticsJS(data) {
  if (typeof data.type !== 'string') {
    throw new Error('The `type` property is required')
  }
  navigator.sendBeacon('/beacon', JSON.stringify(data))
}
// Results in run-time error
// The `type` property is required
sendAnalyticsJS({ foo: 'bar' })
The JavaScript code will result in a run-time error. This is fine if the developer catches it early, but it would be better if the developer were warned as the bug was introduced. Here’s the same code written using TypeScript:
// Describe the shape of the data parameter
interface IAnalyticsData {
  // Type is required
  type: string
  // All other fields are fair game
  [propName: string]: string
}
// We don’t particularly need the data.type check here since
// the compiler will stamp out the majority of those cases.
function sendAnalytics(data: IAnalyticsData) {
if Continue reading
It's 2016 and almost every site using Cloudflare (more than 4 million of them) is using IPv6. Because of this, Cloudflare sees significant IPv6 traffic globally where networks have enabled IPv6 to the consumer.
The top IPv6 networks are shown here.
The chart shows the percentage of IPv6 within a specific network vs. the relative bandwidth of that network. We will talk about specific networks below.
IPv6 is faster for two reasons. The first is that many major operating systems and browsers like iOS, MacOS, Chrome and Firefox impose anywhere from a 25ms to 300ms artificial delay on connections made over IPv4. The second is that some mobile networks won’t need to perform extra v4 -> v6 and v6 -> v4 translations to connect visitors to IPv6 enabled sites if the phone is only assigned an IPv6 address. (IPv6-only phones are becoming very common. If you have a phone on T-Mobile, Telstra, SK Telecom, Orange, or EE UK, to name a few, it’s likely you’re v6-only.)
How much faster is IPv6? Our data shows that visitors connecting over IPv6 were able to connect and load pages in 27% less time than visitors connecting Continue reading
Hola Barcelona! The land of Antoni Gaudí, Salvador Dalí, Ferrán Adriá and Lionel Messi is now also home to Cloudflare.
Located alongside the Mediterranean, Barcelona is the capital of the autonomous community of Catalonia, the second-most populated municipality in Spain and the core of the fifth-most populous urban area in Europe. Our data center in Barcelona is our 3rd in the Iberian Peninsula following our deployments in Madrid and Lisbon, our 28th in Europe, and 101st globally. This means not only better performance in Catalonia and Spain, but additional redundancy for European data centers. As of this moment, Cloudflare has a point of presence (PoP) in 7 out of Europe's 10 most populous urban areas, with number 8 coming soon (all roads have been said to lead there…).
CC BY-NC-ND 2.0 image by Luc Mercelis
Cloudflare has connected to the Catalunya Neutral Internet Exchange (“CATNIX”). This raises the number of exchanges that Cloudflare is a participant in to over 150. As we expand our peering, more visitors get served locally. If you wish to connect with us at CATNIX (or any of our other locations) you can find our peering details on our PeeringDB profile.
Only a Continue reading
In a recent post we discussed how we have been adding resilience to our network.
The strength of the Internet is its ability to interconnect all sorts of networks — big data centers, e-commerce websites at small hosting companies, Internet Service Providers (ISP), and Content Delivery Networks (CDN) — just to name a few. These networks are either interconnected with each other directly using a dedicated physical fiber cable, through a common interconnection platform called an Internet Exchange (IXP), or they can even talk to each other by simply being on the Internet connected through intermediaries called transit providers.
The Internet is like the network of roads across a country and navigating roads means answering questions like “How do I get from Atlanta to Boise?” The Internet equivalent of that question is asking how to reach one network from another. For example, as you are reading this on the Cloudflare blog, your web browser is connected to your ISP and packets from your computer found their way across the Internet to Cloudflare’s blog server.
Figuring out the route between networks is accomplished through a protocol designed 25 years ago (on two napkins) called BGP.
BGP allows interconnections between Continue reading
Come join us at Cloudflare HQ in San Francisco on Tuesday, November 22 for another cryptography meetup. We had such a great time at the last one, we decided to host another.
We’ll start the evening at 6:00 p.m. with time for networking, followed by short talks by leading experts starting at 6:30 p.m. Pizza and beer are provided! RSVP here.
Here are the confirmed speakers:
Emily Stark is a software engineer on the Google Chrome security team, where she focuses on making TLS more usable and secure. She spends lots of time analyzing field data about the HTTPS ecosystem and improving web platform features like Referrer Policy and Content Security Policy that help developers migrate their sites to HTTPS. She has also worked on the DevTools security panel and the browser plumbing that supports other security UI surfaces like the omnibox. (That green lock icon is more complicated than you'd think!)
Previously, she was a core developer at Meteor Development Group, where she worked on web framework security and internal infrastructure, and a graduate student researching client-side cryptography in web browsers. Emily has a master's degree from MIT and a bachelor's degree from Stanford, Continue reading
Last Friday the popular DNS service Dyn suffered three waves of DDoS attacks that affected users first on the East Coast of the US, and later users worldwide. Popular websites, some of which are also Cloudflare customers, were inaccessible. Although Cloudflare was not attacked, joint Dyn/Cloudflare customers were affected.
Almost as soon as Dyn came under attack we noticed a sudden jump in DNS errors on our edge machines and alerted our SRE and support teams that Dyn was in trouble. Support was ready to help joint customers and we began looking in detail at the effect the Dyn outage was having on our systems.
An immediate concern internally was that since our DNS servers were unable to reach Dyn they would be consuming resources waiting on timeouts and retrying. The first question I asked the DNS team was: “Are we seeing increased DNS response latency?” rapidly followed by “If this gets worse are we likely to?”. Happily, the response to both those questions (after the team analyzed the situation) was no.
CC BY-SA 2.0 image by tracyshaun
However, that didn’t mean we had nothing to do. Operating a large scale system like Cloudflare that Continue reading
The last few weeks have seen several high-profile outages in legacy DNS and DDoS-mitigation services due to large scale attacks. Cloudflare's customers have, understandably, asked how we are positioned to handle similar attacks.
While there are limits to any service, including Cloudflare, we are well architected to withstand these recent attacks and continue to scale to stop the larger attacks that will inevitably come. We are, multiple times per day, mitigating the very botnets that have been in the news. Based on the attack data that has been released publicly, and what has been shared with us privately, we have been successfully mitigating attacks of a similar scale and type without customer outages.
I thought it was a good time to talk about how Cloudflare's architecture is different from most legacy DNS and DDoS-mitigation services and how that's helped us keep our customers online in the face of these extremely high volume attacks.
Before delving into our architecture, it's worth taking a second to think about another analogous technology problem that is better understood: scaling databases. From the mid-1980s, when relational databases started taking off, through the early 2000s the way companies thought of scaling Continue reading
Today there is an ongoing, large scale Denial-of-Service attack directed against Dyn DNS. While Cloudflare services are operating normally, if you are using both Cloudflare and Dyn services, your website may be affected.
Specifically, if you are using CNAME records which point to a zone hosted on Dyn, our DNS queries directed to Dyn might fail, making your website unavailable and presenting a “1001” error message.
Some popular services that might rely on Dyn for part of their operations include GitHub Pages, Heroku, Shopify and AWS.
As a possible workaround, you might be able to update your Cloudflare DNS records from CNAMEs (referring to Dyn hosted records) to A/AAAA records specifying the origin IP of your website. This will allow Cloudflare to reach your origin without the need for an external DNS lookup.
Note that if you use different origin IP addresses, for example based on the geographical location, you may lose some of that functionality by using plain A/AAAA records. We recommend that you provide addresses for many of your different locations, so that load will be shared amongst them.
Customers with a CNAME setup (which means Cloudflare is not configured in your domain NS records) where the main Continue reading
One of the base principles of cryptography is that you can't just encrypt multiple messages with the same key. At the very least, what will happen is that two messages that have identical plaintext will also have identical ciphertext, which is a dangerous leak. (This is similar to why you can't encrypt blocks with ECB.)
If you think about it, a pure encryption function is just like any other pure computer function: deterministic. Given the same set of inputs (key and message) it will always return the same output (the encrypted message). And we don't want an attacker to be able to tell that two encrypted messages came from the same plaintext.
The solution is the use of IVs (Initialization Vectors) or nonces (numbers used once). These are byte strings that are different for each encrypted message. They are the source of non-determinism that is needed to make duplicates indistinguishable. They are usually not secret, and are distributed prepended to the ciphertext, since they are necessary for decryption.
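To make the determinism problem concrete, here is a minimal TypeScript sketch. It uses a toy XOR stream cipher built on a small xorshift generator, which is NOT cryptographically secure and stands in for a real cipher only so the example stays self-contained. All names (`toyKeystream`, `toyEncrypt`, `toyDecrypt`) and the numeric key and nonce values are hypothetical.

```typescript
// Toy keystream generator: illustration only, NOT cryptographically secure.
function toyKeystream(key: number, nonce: number, length: number): number[] {
  // Mix key and nonce into a 32-bit starting state.
  let state = (key ^ Math.imul(nonce + 1, 0x9e3779b1)) | 0;
  if (state === 0) {
    state = 1; // xorshift must not start from an all-zero state
  }
  const stream: number[] = [];
  for (let i = 0; i < length; i++) {
    state ^= state << 13;
    state ^= state >>> 17;
    state ^= state << 5;
    stream.push(state & 0xff);
  }
  return stream;
}

// Encrypts by XORing with the keystream; the nonce is prepended in the clear.
function toyEncrypt(key: number, nonce: number, plaintext: string): number[] {
  const stream = toyKeystream(key, nonce, plaintext.length);
  const message = [nonce]; // not secret, but needed for decryption
  for (let i = 0; i < plaintext.length; i++) {
    message.push(plaintext.charCodeAt(i) ^ stream[i]);
  }
  return message;
}

// Reads the nonce from the front of the message and reverses the XOR.
function toyDecrypt(key: number, message: number[]): string {
  const nonce = message[0];
  const stream = toyKeystream(key, nonce, message.length - 1);
  let plaintext = "";
  for (let i = 1; i < message.length; i++) {
    plaintext += String.fromCharCode(message[i] ^ stream[i - 1]);
  }
  return plaintext;
}
```

Encrypting the same plaintext twice with the same key and the same nonce yields identical output, which is exactly the leak described above. Changing only the nonce changes the ciphertext, and the receiver can still decrypt because the nonce travels with the message.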
The distinction between IVs and nonces is controversial and not binary. Different encryption schemes require different properties to be secure: some just need them to never repeat, in which case we commonly Continue reading
Over the last few weeks we've seen DDoS attacks hitting our systems that show that attackers have switched to new, large methods of bringing down web applications. They appear to come from the Mirai botnet (and relations) which were responsible for the large attacks against Brian Krebs.
Our automatic DDoS mitigation systems have been handling these attacks, but we thought it would be interesting to publish some of the details of what we are seeing. In this article we'll share data on two attacks, which are perfect examples of the new trends in DDoS.
In the past we've written extensively about volumetric DDoS attacks and how to mitigate them. The Mirai attacks are distinguished by their heavy use of L7 (i.e. HTTP) attacks as opposed to the more familiar SYN floods, ACK floods, and NTP and DNS reflection attacks.
Many DDoS mitigation systems are tuned to handle volumetric L3/4 attacks; in this instance attackers have switched to L7 attacks in an attempt to knock web applications offline.
Seeing the move towards L7 DDoS attacks we put in place a new system that recognizes and blocks these attacks as they happen. The Continue reading
Over the last six years, we’ve built the tooling, infrastructure and expertise to run a DNS network that handles our scale - we’ve answered a few million DNS queries in the few seconds since you started reading this.
DNS is the backbone of the Internet. Every email, website visit, and API call ultimately begins with a DNS lookup. The Internet is built on DNS, so every hosting company, registrar, TLD operator, and cloud provider must be able to run reliable DNS.
Last year CloudFlare launched Virtual DNS, providing DDoS mitigation and a strong caching layer of 100 global data centers to those running DNS infrastructure.
Today we’re expanding that offering with two new features for an extra layer of reliability: Serve Stale and DNS Rate Limiting.
Virtual DNS sits in front of your DNS infrastructure. When DNS resolvers look up answers on your authoritative DNS, the query first goes to CloudFlare Virtual DNS. We either serve the answer from cache, if we have it, or we reach out to your nameservers to get the answer and respond to the DNS resolver.
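The cache-or-origin flow described above can be sketched roughly as follows in TypeScript. This is a simplified model and not the actual Virtual DNS code: the class name, the synchronous `queryOrigin` callback, the single fixed TTL, and the fallback to an expired entry when the origin call throws are all assumptions made for illustration.

```typescript
interface CachedAnswer {
  answer: string;
  expiresAt: number; // milliseconds on the caller's clock
}

type AnswerSource = "cache" | "origin" | "stale";

// Hypothetical model of a caching layer in front of authoritative nameservers.
class VirtualDnsSketch {
  private entries: Map<string, CachedAnswer> = new Map();
  private queryOrigin: (qname: string) => string;
  private ttlMs: number;

  constructor(queryOrigin: (qname: string) => string, ttlMs: number) {
    this.queryOrigin = queryOrigin;
    this.ttlMs = ttlMs;
  }

  resolve(qname: string, now: number): { answer: string; source: AnswerSource } {
    const cached = this.entries.get(qname);
    if (cached && cached.expiresAt > now) {
      return { answer: cached.answer, source: "cache" }; // fresh cache hit
    }
    try {
      // Cache miss or expired entry: ask the customer's nameservers.
      const answer = this.queryOrigin(qname);
      this.entries.set(qname, { answer, expiresAt: now + this.ttlMs });
      return { answer, source: "origin" };
    } catch (err) {
      // Assumed behaviour: fall back to an expired entry if the origin is down.
      if (cached) {
        return { answer: cached.answer, source: "stale" };
      }
      throw err;
    }
  }
}
```

Passing the current time in as `now` keeps the sketch deterministic, so the fresh, expired, and origin-down paths can each be exercised without waiting on a real clock.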
Even if your DNS servers are down, Virtual DNS can now answer on your behalf Continue reading