This is part 3 of a six part series based on a talk I gave in Trento, Italy. To start from the beginning go here.
After Cloudbleed, lots of things changed. We started to move away from memory-unsafe languages like C and C++ (there’s a lot more Go and Rust now). And every SIGABRT or crash on any machine results in an email to me and a message to the team responsible. And I don’t let the team leave those problems to fester.
So Cloudbleed was a terrible time. Let’s talk about a great time. The launch of our public DNS resolver 1.1.1.1. That launch is a story of an important Cloudflare quality: audacity. Google had launched 8.8.8.8 years ago and had taken the market for a public DNS resolver by storm. Their address is easy to remember, their service is very fast.
But we thought we could do better. We thought we could be faster, and we thought we could be more memorable. Matthew asked us to get the address 1.1.1.1 and launch a secure, privacy-preserving, public DNS resolver in a couple of months. Continue reading
This is the text I prepared for a talk at Speck&Tech in Trento, Italy. I thought it might make a good blog post. Because it is 6,000 words I've split it into six separate posts.
Here's part 1:
I’ve worked at Cloudflare for more than seven years. Cloudflare itself is more than eight years old. So, I’ve been there since it was a very small company. About twenty people in fact. All of those people (except one, me) worked from an office in San Francisco. I was the lone member of the London office.
Today there are 900 people working at Cloudflare spread across offices in San Francisco, Austin, Champaign IL, New York, London, Munich, Singapore and Beijing. In London, my “one-person office” (which was my spare bedroom) is now almost 200 people and in a month, we’ll move into new space opposite Big Ben.
The numbers tell a story about enormous growth. But it’s growth that’s been very carefully managed. We could have grown much faster (in terms of people); we’ve certainly raised enough money to do so.
I ended up at Cloudflare because I gave a really good talk at a conference. Well, Continue reading
At Cloudflare, we aim to make the Internet faster and safer for everyone. One way we do this is through caching: we keep a copy of our customer content in our 165 data centers around the world. This brings content closer to users and reduces traffic back to origin servers.
Today, we’re excited to announce a huge change in our how cache works. Cloudflare Workers now integrates the Cache API, giving you programmatic control over our caches around the world.
Figuring out what to cache and how can get complicated. Consider an e-commerce site with a shopping cart, a Content Management System (CMS) with many templates and hundreds of articles, or a GraphQL API. Each contains a mix of elements that are dynamic for some users, but might stay unchanged for the vast majority of requests.
Over the last 8 years, we’ve added more features to give our customers flexibility and control over what goes in the cache. However, we’ve learned that we need to offer more than just adding settings in our dashboard. Our customers have told us clearly that they want to be able to express their ideas in code, to build Continue reading
HTTP is the application protocol that powers the Web. It began life as the so-called HTTP/0.9 protocol in 1991, and by 1999 had evolved to HTTP/1.1, which was standardised within the IETF (Internet Engineering Task Force). HTTP/1.1 was good enough for a long time but the ever changing needs of the Web called for a better suited protocol, and HTTP/2 emerged in 2015. More recently it was announced that the IETF is intending to deliver a new version - HTTP/3. To some people this is a surprise and has caused a bit of confusion. If you don't track IETF work closely it might seem that HTTP/3 has come out of the blue. However, we can trace its origins through a lineage of experiments and evolution of Web protocols; specifically the QUIC transport protocol.
If you're not familiar with QUIC, my colleagues have done a great job of tackling different angles. John's blog describes some of the real-world annoyances of today's HTTP, Alessandro's blog tackles the nitty-gritty transport layer details, and Nick's blog covers how to get hands on with some testing. We've collected these and more at https://cloudflare-quic.com. And if that tickles your fancy, be sure Continue reading
This is a guest post by Tejas Dinkar, who is the Head of Engineering at Quintype, a platform for digital publishing. He’s continually looking for ways to make applications run faster and cheaper. You can find him on Github and Twitter.
TL;DR: Check out create-cloudflare-worker.
At Quintype, we are continually looking for new and innovative ways to use our CDN. Quintype moved to Cloudflare last year, partly because of the power of Cloudflare Workers. Workers have been a very important tool in our belt, and in this blog post we will talk a little bit about our worker development lifecycle.
Cloudflare Workers have drastically changed the way we architect and deploy things at Quintype. Quintype is a platform that powers many publishers, including many high volume ones like The Quint, BloombergQuint, Swarajya, and Fortune India. An average month sees hundreds of millions of page views come through our network.
Maintaining a healthy cache hit ratio is the key to scaling a content heavy app. Ensuring requests are served from Cloudflare is faster, and cheaper, as requests do not have to come through to an origin. We actively architect our apps to ensure that we Continue reading
As of December 22, 2018, parts of the US Government have “shut down” because of a lapse in appropriation. The shutdown has caused the furlough of employees across the government and has affected federal contracts. An unexpected side-effect of this shutdown has been the expiration of TLS certificates on some .gov websites. This side-effect has emphasized a common issue on the Internet: the usage of expired certificates and their erosion of trust.
For an entity to provide a secure website, it needs a valid TLS certificate attached to the website server. These TLS certificates have both start dates and expiry dates. Normally certificates are renewed prior to their expiration. However, if there’s no one to execute this process, then websites serve expired certificates--a poor security practice.
This means that people looking for government information or resources may encounter alarming error messages when visiting important .gov websites:
The content of the website hasn’t changed; it’s just the cryptographic exchange that’s invalid (an expired certificate can’t be validated). These expired certificates present a trust problem. Certificate errors often dissuade people from accessing a website, and imply that the site is not to be trusted. Browsers purposefully make it difficult to continue to Continue reading
During last year’s Birthday Week we announced early support for QUIC, the next generation encrypted-by-default network transport protocol designed to secure and accelerate web traffic on the Internet.
We are not quite ready to make this feature available to every Cloudflare customer yet, but while you wait we thought you might enjoy a slice of quiche, our own open-source implementation of the QUIC protocol written in Rust.
Quiche will allow us to keep on top of changes to the QUIC protocol as the standardization process progresses and experiment with new features more easily. Let’s have a quick look at it together.
The main design principle that guided quiche’s initial development was exposing most of the QUIC complexity to applications through a minimal and intuitive API, but without making too many assumptions about the application itself, in order to allow us to reuse the same library in different contexts.
For example, while we think Rust is great, most of the stack that deals with HTTP requests on Cloudflare’s edge network is still written in good ol’ C, which means that our QUIC implementation would need to be integrated into that.
The quiche API can process Continue reading
Cloudflare is proud to partner with Mesosphere on their new Argo Tunnel offering available within their DC/OS (Data Center / Operating System) catalogue! Before diving deeper into the offering itself, we’ll first do a quick overview of the Mesophere platform, DC/OS.
Mesosphere DC/OS provides application developers and operators an easy way to consistently deploy and run applications and data services on cloud providers and on-premise infrastructure. The unified developer and operator experience across clouds makes it easy to realize use cases like global reach, resource expansion, and business continuity.
In this multi cloud world Cloudflare and Mesosphere DC/OS are great complements. Mesosphere DC/OS provides the same common services experience for developers and operators, and Cloudflare provides the same common service access experience across cloud providers. DC/OS helps tremendously for avoiding vendor lock-in to a single provider, while Cloudflare can load balance traffic intelligently (in addition to many other services) at the edge between providers. This new offering will allow you to load balance through the use of Argo Tunnel.
Cloudflare Argo Tunnel is a private connection between your services and Cloudflare. Tunnel makes it such that only traffic that routes through the Continue reading
This is a guest post by Jamie Mason, who is the Head of Test Servers at SamKnows. This post originally appears on the SamKnows Megablog.
We leveraged Cloudflare Workers to expand the SamKnows measurement infrastructure.
At SamKnows, we run lots of tests to measure internet performance. Actually, that’s an understatement. Our software is embedded on tens of millions of devices, and that number grows daily.
We measure performance between the user’s home and the internet, across dozens of metrics. Some of these metrics measure the performance of major video-streaming services, popular games, or large websites. Others focus on the more traditional ‘quality of service’ metrics: speed, latency, and packet loss.
In order to measure speed, latency, and packet loss, SamKnows needs test servers to carry out the measurements against. These servers should be relatively near to the user’s home - this ensures that we’re measuring solely the user’s internet connection (i.e. what their Internet Service Provider sells them) and not some external factor.
As a result, we manage high-capacity test servers all over the world. Some are donated by research groups, some we host ourselves in major data centers, and still others are run inside ISPs’ own networks.
Customers Continue reading
When you launch your domain to the world, you rely on the Domain Name System (DNS) to direct your users to the address for your site. However, DNS cannot guarantee that your visitors reach your content because DNS, in its basic form, lacks authentication. If someone was able to poison the DNS responses for your site, they could hijack your visitors' requests.
The Domain Name System Security Extensions (DNSSEC) can help prevent that type of attack by adding a chain of trust to DNS queries. When you enable DNSSEC for your site, you can ensure that the DNS response your users receive is the authentic address of your site.
We launched support for DNSSEC in 2014. We made it free for all users, but we couldn’t make it easy to set up. Turning on DNSSEC for a domain was still a multistep, manual process. With the launch of Cloudflare Registrar, we can finish the work to make it simple to enable for your domain.
You can now enable DNSSEC with a single click if your domain is registered with Cloudflare Registrar. Visit the DNS tab in the Cloudflare dashboard, click "Enable DNSSEC", and we'll handle the rest. If you are Continue reading
Serverless technology is still in its infancy, and some people are unsure about where it’s headed. Join Zack Bloom, Director of Product for Product Strategy at Cloudflare, on a journey to explore the serverless future where developers “just write code,” pay for exactly what they use, and completely forget about where code runs; then see why current platforms won't be able to get developers all the way there.
The talk below was originally presented and recorded at Serverless Computing London in November 2018. If you’d like to join us in person to talk about serverless, we’ll be announcing 2019 event locations throughout the year on the docs page.
Many of the technical challenges of serverless (cold-start time, memory overhead, and CPU context switching) are solved by a new architecture which translates technology developed for web browsers onto the server. Learn about how serverless platforms built using isolates are helping to expand the kinds of applications built using serverless.
Zack Bloom helps build the future of the Internet as the Director of Product for Product Strategy at Cloudflare. He was a co-founder of Eager, an Continue reading
This is a guest post by Ben Chartrand, who is a Development Manager at Timely. You can check out some of Ben's other Workers projects on his GitHub and his blog.
At Timely we started a project to migrate our web applications from legacy Azure services to a modern PaaS offering. In theory it meant no code changes.
We decided to start with our webhooks. All our endpoints can be grouped into four categories:
Despite their limited number, these are vitally important. We did a lot of testing but it was clear we’d only really know if everything was working once we had production traffic. How could we migrate traffic?
Change the CNAME to point to the new hosting infrastructure. This is high risk. DNS takes time to propagate so, if we needed to roll back, it would take time. We would also be shifting over everything at once.
Use a traffic manager to shift a percentage of traffic using Cloudflare Load Balancing. We could start at, say, 5% traffic to the new infrastructure Continue reading
My curiosity was piqued by an LWN article about IOCB_CMD_POLL - A new kernel polling interface. It discusses an addition of a new polling mechanism to Linux AIO API, which was merged in 4.18 kernel. The whole idea is rather intriguing. The author of the patch is proposing to use the Linux AIO API with things like network sockets.
Hold on. The Linux AIO is designed for, well, Asynchronous disk IO! Disk files are not the same thing as network sockets! Is it even possible to use the Linux AIO API with network sockets in the first place?
The answer turns out to be a strong YES! In this article I'll explain how to use the strengths of Linux AIO API to write better and faster network servers.
But before we start, what is Linux AIO anyway?
Linux AIO exposes asynchronous disk IO to userspace software.
Historically on Linux, all disk operations were blocking. Whether you did open()
, read()
, write()
or fsync()
, you could be sure your thread would stall if the needed data and meta-data was not ready in disk cache. This usually isn't Continue reading
À peine la page du calendrier tournée que nous constatons plus de troubles sur Internet.
Aujourd’hui, Cloudflare peut confirmer, chiffres à l’appui, qu’Internet a été coupé en République Démocratique du Congo, information précédemment révélée par de multiples organes de presse. Cette coupure a eu lieu alors que se déroulait l’élection présidentielle le 30 Décembre dernier, et perdure pendant la publication des résultats.
Tristement, cette situation est loin d’être une nouveauté. Nous avons fait état d'événements similaires par le passé, y compris lors d’une autre coupure d’Internet en RDC il y a moins d’un an. Une courbe malheureusement bien familière est aujourd’hui visible sur notre plateforme de gestion du réseau, montrant que le trafic dans le pays atteint péniblement un quart de son niveau habituel.
Notez que le diagramme est gradué en temps UTC, et que la capitale de la RDC Kinshasa est dans le fuseau horaire GMT+1.
La chute du trafic a démarré en milieu de journée le 31 Décembre 2018 (à environ 10h30 UTC, soit 11h30 heure locale à Kinshasa). Celà est d’autant plus frappant quand sont superposées toutes les courbes quotidiennes:
Ci-dessus, la courbe rouge représente le trafic du 31 Décembre, et les courbes grises celui des 8 Continue reading
The calendar has barely flipped to 2019 and already we’re seeing Internet disruptions.
Today, Cloudflare can quantitatively confirm that Internet access has been shut down in the Democratic Republic of the Congo, information already reported by many press organisations. This shutdown occurred as the presidential election was taking place on December the 30th, and continues as the results are published.
Sadly, this act is far from unprecedented. We have published many posts about events like this in the past, including a different post about roughly three days of Internet disruption in the Democratic Republic of the Congo less than a year ago. A painfully familiar shape can be seen on our network monitoring platform, showing that the traffic in the country is barely reaching a quarter of its typical level:
Note that the graph is based on UTC and Democratic Republic of the Congo’s capital Kinshasa has the timezone of GMT+1.
The drop in bandwidth started just before midday on 31 December 2018 (around 10:30 UTC, 11:30 local time in Kinshasa). This can be clearly seen if we overlay each 24 hour day over each other:
The red line is 31 December, the gray lines the previous eight days. Looking Continue reading
At Cloudflare, we are constantly looking into ways to improve development experience for Workers and make it the most convenient platform for writing serverless code.
As some of you might have already noticed either from our public release notes, on cloudflareworkers.com or in your Cloudflare Workers dashboard, there recently was a small but important change in the look of the inspector.
But before we go into figuring out what it is, let's take a look at our standard example on cloudflareworkers.com:
The example worker code featured here acts as a transparent proxy, while printing requests / responses to the console.
Commonly, when debugging Workers, all you could see from the client-side devtools is the interaction between your browser and the Cloudflare Worker runtime. However, like in most other server-side runtimes, the interaction between your code and the actual origin has been hidden.
This is where console.log
comes in. Although not the most convenient, printing random things out is a fairly popular debugging technique.
Unfortunately, its default output doesn't help much with debugging network interactions. If you try to expand either of request or response objects, all you can see is just a bunch of lazy accessors:
Pink Dot SG is an event which takes place every June in Singapore to celebrate LGBTQIA+ pride! Cloudflare participated this year, on June 21st. We’re a little late, but wanted to share what we got up to. Pink Dot SG started in 2009, as a way for queer people and allies alike to demonstrate their belief that everyone deserves the “freedom to love.”
Proudflare, Cloudflare's LGBTQIA+ employee resource group, finds ways to support and provide resources for the LGBTQIA+ community, both within Cloudflare and in the larger community.
Proudflare started in 2017 in our San Francisco headquarters and in 2018, the Proudflare Singapore chapter was formed. We were excited to participate in our first public-facing event and demonstrate Cloudflare’s commitment to equality and dignity for all people!
We took to the streets this year to celebrate, but more importantly demand equality for our community in Singapore. It was an exciting event, with heaps of buzz, cheer, and joy amongst the crowd! Pink Dot SG included LGBTQIA+-themed events, information tents, a concert, and onstage were 10 Declarations for Equality, a list of changes the LGBTQIA+ community and their allies are ready for and Continue reading
Last year we published some crypto challenges to keep you momentarily occupied from the festivities. This year, we're doing the same. Whether you're bored or just want to learn a bit more about the technologies that encrypt the internet, feel free to give these short cryptography quizzes a go.
We're withholding answers until the start of the new year, to give you a chance to solve them without spoilers. Before we reveal the answers; if you manage to solve them, we'll be giving the first 5 people to get the answers right some Cloudflare swag. Fill out your answers and details using this form so we know where to send it.
Have fun!
NOTE: Hints are below the questions, avoid scrolling too far if you want to avoid any spoilers.
Client says hello, as follows:
00000c07ac01784f437dbfc70800450000f2560140004006db58ac1020c843c
7f80cd1f701bbc8b2af3449b598758018102a72a700000101080a675bce16787
abd8716030100b9010000b503035c1ea569d5f64df3d8630de8bdddd1152e75f
528ae577d2436949ce8deb7108600004400ffc02cc02bc024c023c00ac009c0
08c030c02fc028c027c014c013c012009f009e006b0067003900330016009d
009c003d003c0035002f000a00af00ae008d008c008b010000480000000b
000900000663666c2e7265000a00080006001700180019000b0002010000
0d0012001004010201050106010403020305030603000500050100000000
001200000017000052305655494338795157524c656d6443436c5246574651
675430346754456c4f52564d674e434242546b51674e513d3d
[Raw puzzle without text wrap]
A user has an authenticator device to generate one time passwords for logins to their banking website. The implementation contains a fatal flaw.
At the following times, the following codes are generated (all in GMT/UTC):
The Time to First Byte (TTFB) of a site is the time from when the user starts navigating until the HTML for the page they requested starts to arrive. A slow TTFB has been the bane of my existence for more than the ten years I have been running WebPageTest.
Looking at a recent test data set (~100k pages):
— Patrick Meenan (@patmeenan) August 10, 2016
20% TTFB > 3s
80% start render > 5s (10% > 10s)
500 pages were > 15MB
So much slow to fix
There is a reason why TTFB appears as one of the few “grades” that WebPageTest scores a site on and, specifically, why it is the first grade in the list.
If the first byte is slow, EVERY other metric will also be slow. Improving it is one of the few cases where you can predict what the impact will be on every other measurement. Every millisecond improvement in the TTFB translates directly into a millisecond of savings in every other measurement (i.e. first paint will be 500ms faster if TTFB improves by 500ms). That said, a fast ttfb doesn't guarantee a fast experience but a slow ttfb does guarantee a slow Continue reading
In December 2017 we unveiled the true potential of Cloudflare’s scale: to find the best commercially available mince pie and let the world know about it. In 2018 we’ve all been extremely busy helping Cloudflare & our customers and therefore we left it very late this year. Uncomfortably late.
If you want to know the best mince pie to buy in 2018 right now, skip straight to the bottom of this post where we reveal the winner. If you want to understand more about what makes a mince pie great and how we can learn this at Cloudflare’s scale - read on.
With a very short amount of time to get this research out to a discerning and demanding public, we engaged the entire Cloudflare London team to help. Team members diligently went out and purchased mince pies from all over the South East of England for everyone to taste.
On Monday the team assembled for a “Mince Pie Jam”, where we would taste & consistently review each pie:
As we can see from even the most cursory Internet Continue reading