The North American Network Operators Group (NANOG) is the locus of modern Internet innovation and of the day-to-day, cumulative network-operational knowledge of thousands of network engineers. NANOG itself is a non-profit membership organization, but you don’t need to be a member to attend the conference or join the mailing list. That said, if you can become a member, you’re helping a good cause.
The next NANOG conference starts in a few days (February 6-8, 2017) in Washington, DC. Nearly 900 network professionals are converging on the city to discuss a variety of network-related issues, both big and small, all related to running and improving the global Internet. For this upcoming meeting, Cloudflare has three network professionals in attendance: two from the San Francisco office and one from the London office.
With the conference starting next week, it seemed a great opportunity to explain to readers of the blog why a NANOG conference is so worth attending.
While it seems obvious how to do some network tasks (you unpack the spiffy new wireless router from its box, set up its security and plug it in), the global Internet is, alas, somewhat more complex. Continue reading
Today a severe vulnerability was announced by the WordPress Security Team that allows unauthenticated users to change content on a site using unpatched (below version 4.7.2) WordPress.
CC BY-SA 2.0 image by Nicola Sap De Mitri
The problem was found by the team at Sucuri and reported to WordPress. The WordPress team worked with WAF vendors, including Cloudflare, to roll out protection before the patch became available.
Earlier this week we rolled out two rules to protect against exploitation of this issue (both types mentioned in the Sucuri blog post). We have been monitoring the situation and have not observed any attempts to exploit this vulnerability before it was announced publicly.
Customers on a paid plan will find two rules in WAF, WP0025A and WP0025B, that protect unpatched WordPress sites from this vulnerability. If the Cloudflare WordPress ruleset is enabled then these rules are automatically turned on and blocking.
As we have in the past with other serious and critical vulnerabilities like Shellshock and previous issues with JetPack, we have enabled these two rules for our free customers as well.
Free customers who want full protection for their WordPress sites can upgrade to a Continue reading
Nick Sullivan and I gave a talk about TLS 1.3 at 33c3, the latest Chaos Communication Congress. The congress, attended by more than 13,000 hackers in Hamburg, has been one of the hallmark events of the security community for more than 30 years.
You can watch the recording below, or download it in multiple formats and languages on the CCC website.
The talk introduces TLS 1.3 and explains how it works in technical detail, why it is faster and more secure, and touches on its history and current status.
The slide deck is also online.
This was an expanded and updated version of the internal talk previously transcribed on this blog.
In related news, TLS 1.3 is reaching a percentage of Chrome and Firefox users this week, so websites with the Cloudflare TLS 1.3 beta enabled will load faster and more securely for all those new users.
You can enable the TLS 1.3 beta from the Crypto section of your control panel.
Cloudflare’s mission is to help build a better Internet. That means a faster, more secure, open Internet world-wide. We have millions of customers using our services like free SSL, an advanced WAF, the latest compression and the most up to date security to ensure that their web sites, mobile apps and APIs are secure and fast.
One vital area of web technology has lagged behind in terms of speed and security: online ads. And consumers have been turning to ad blocking technology to secure and speed up their own web browsing.
Today, Cloudflare is introducing a new product to make web ads secure, fast and safe. That product is Firebolt.
With Firebolt, ad networks can instantly speed up and secure their ads, resulting in happy consumers and better conversion rates.
Firebolt delivers:
Lightning fast ad delivery
Cloudflare's global network of 102 data centers in 50 countries, combined with routing and performance technologies, makes the delivery of online ads to any device up to five times faster.
Free, simple SSL
Adding SSL to ad serving has been challenging for some ad networks. Cloudflare has years of experience providing free, one click SSL for our customers. Firebolt ads are Continue reading
We predict that in 2017 more than half of the traffic to Cloudflare's network will come from mobile devices. Even when pages are formatted for display on a small screen, the mobile web is built on traditional web protocols and technologies that were designed for desktop CPUs, network connections, and displays. As a result, browsing the mobile web feels sluggish compared with using native mobile apps.
In October 2015, the team at Google announced Accelerated Mobile Pages (AMP), a new, open technology to make the mobile web as fast as native apps. Since then, a large number of publishers have adopted AMP. Today, 600 million pages across 700,000 different domains are available in the AMP format.
The majority of traffic to this AMP content comes from people running searches on Google.com. If a visitor finds content through some source other than a Google search, even if the content can be served from AMP, it typically won't be. As a result, the mobile web continues to be slower than it needs to be.
Cloudflare's Accelerated Mobile Links helps solve this problem, making content, regardless of how it's discovered, app-quick. Once enabled, Accelerated Mobile Continue reading
Today Cloudflare is publishing its seventh transparency report, covering the second half of 2016. For the first time, we are able to present information on a previously undisclosed National Security Letter (NSL) Cloudflare received in the 2013 reporting period.
Wikipedia provides the most succinct description of an NSL:
An NSL is an administrative subpoena issued by the United States federal government to gather information for national security purposes. NSLs do not require prior approval from a judge.… NSLs typically contain a nondisclosure requirement, frequently called a "gag order", preventing the recipient of an NSL from disclosing that the FBI had requested the information. https://en.wikipedia.org/wiki/National_security_letter
Shortly before the New Year, the FBI sent us the following letter about that NSL.
The letter withdrew the nondisclosure provisions (the “gag order”) contained in NSL-12-358696, which had constrained Cloudflare since the NSL was served in February 2013. At that time, Cloudflare objected to the NSL. The Electronic Frontier Foundation agreed to take our case, and with their assistance, we brought a lawsuit under seal to protect our customers' rights.
Early in the litigation, the FBI rescinded the NSL in July 2013 and withdrew the request for information. So no customer Continue reading
While working to make the Internet a better place, we also want to make it easier for our customers to have control of their content and APIs, and who has access to them. Using Cloudflare’s Token Authentication features, customers can implement access control via URL tokens or HTTP request headers without having to build complex back-end systems.
Cloudflare will check these tokens at the edge before any request is relayed to an origin or served from cache. If the token is not valid the request is blocked. Since Cloudflare handles all the token validation, the origin server does not need to have complex authentication logic. In addition, a malicious user who attempts to forge tokens will be blocked from ever reaching the origin.
Leveraging our edge network of over 100 data centers, customers can use token authentication to perform access control checks on content and APIs, as well as allowing Cloudflare to cache private content and only serve it to users with a valid token tied specifically to that cached asset.
Performing access control on the edge has many benefits. Brute force attempts and other attacks on private assets don't ever reach Continue reading
We extensively monitor our network and use multiple systems that give us visibility, including external monitoring and internal alerts when things go wrong. One of the most useful systems is Grafana, which allows us to quickly create arbitrary dashboards. And a heavy user of Grafana we are: at last count we had 645 different Grafana dashboards configured in our system!
grafana=> select count(1) from dashboard;
count
-------
645
(1 row)
This post is not about our Grafana systems though. It's about something we noticed a few days ago, while looking at one of those dashboards. We noticed this:
This chart shows the number of HTTP requests per second handled by our systems globally. You can clearly see multiple spikes, and this chart most definitely should not look like a porcupine! The spikes were large in scale - 500k to 1M HTTP requests per second. Something very strange was going on.
Our intuition indicated an attack - but our attack mitigation systems didn't confirm it. We'd seen no major HTTP attacks at those times.
It would be bad if we were under such heavy HTTP attack and our mitigation systems didn't notice it. Without more ideas, we Continue reading
At midnight UTC on New Year’s Day, deep inside Cloudflare’s custom RRDNS software, a number went negative when it should always have been, at worst, zero. A little later this negative value caused RRDNS to panic. This panic was caught using the recover feature of the Go language. The net effect was that some DNS resolutions to some Cloudflare managed web properties failed.
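To make the mechanism concrete, here is a minimal Go sketch of a panic being caught with recover. It assumes, purely for illustration, that the negative number ends up as the argument to rand.Int63n, which panics for any argument that is not positive. The paragraph above does not say which call panicked inside RRDNS, and the function names here are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

// pickWeighted stands in for code that expects a positive weight.
// rand.Int63n panics when its argument is <= 0.
func pickWeighted(weight int64) int64 {
	return rand.Int63n(weight)
}

// safePick converts a panic in pickWeighted into an ordinary error.
func safePick(weight int64) (n int64, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return pickWeighted(weight), nil
}

func main() {
	// A negative weight panics; recover turns it into a failed lookup.
	if _, err := safePick(-1); err != nil {
		fmt.Println("lookup failed:", err)
	}
	// A positive weight works normally.
	n, err := safePick(10)
	fmt.Println(n >= 0 && n < 10, err)
}
```

Recovering keeps the process alive, but the request that triggered the panic still fails, which matches the net effect described above: the server keeps running while some resolutions return errors.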
The problem only affected customers who use CNAME DNS records with Cloudflare, and only affected a small number of machines across Cloudflare's 102 PoPs. At peak approximately 0.2% of DNS queries to Cloudflare were affected and less than 1% of all HTTP requests to Cloudflare encountered an error.
This problem was quickly identified. The most affected machines were patched in 90 minutes and the fix was rolled out worldwide by 0645 UTC. We are sorry that our customers were affected, but we thought it was worth writing up the root cause for others to understand.
Cloudflare customers use our DNS service to serve the authoritative answers for DNS queries for their domains. They need to tell us the IP address of their origin web servers so we can contact the Continue reading
A man, a plan, a canal, a data center. Over 5 million Internet properties are now faster across Panama, as Cloudflare turned up its newest data center in Panama City. This is our 102nd data center globally, and brings us to a special milestone as our network now spans 50 countries. While perhaps not quite as big an announcement as the $5B Panama Canal expansion, the websites of many important newspapers, TV stations, banks and airlines can be accessed directly from Panama.
‘A man a plan a canal’ y un data center! A Partir de hoy más de 5 millones de sitios en internet serán mas rápidos desde Panamá, ya que hemos introducido nuestro más reciente centro de datos en Ciudad de Panamá. Es nuestro centro de datos número 102 a nivel mundial. Es una hito especial ya que la red global de Cloudflare alcanza 50 países a nivel mundial. Tal vez no es un anuncio tan grande como la inauguración de la expansión del canal, pero a partir de hoy muchos de los sitios importantes de diarios, Televisoras, bancos y aerolíneas serán servidos directamente desde Panamá.
Bridge of the Americas
Puente de las Américas
An abbreviated version of this post originally appeared on TechCrunch
Looking back over 2016, we saw the good and bad that comes with widespread use and abuse of the Internet.
In both Gabon and Gambia, Internet connectivity was disrupted during elections. The contested election in Gambia started with an Internet blackout that lasted a short time. In Gabon, the Internet shutdown lasted for days. Even as we write this, countries like DR Congo are discussing blocking specific Internet services, clearly forgetting the lessons learned in these other countries.
CC BY 2.0 image by Aniket Thakur
DDoS attacks continued throughout the year, hitting websites big and small. Back in March, we wrote about 400 Gbps attacks that were happening over the weekend, and then in December, it looked like attackers were treating attacks as a job to be performed from 9 to 5.
In addition to real DDoS attacks, there were also empty threats from a group calling itself Armada Collective and demanding Bitcoin for sites and APIs to stay online. Another group popped up to copy the same modus operandi.
The Internet of Things became what many had warned it would become: an army of devices used for attacks. A botnet Continue reading
APIs are increasingly becoming the backbone of the modern internet - whether you're ordering food from an app on your phone or browsing a blog using a modern JavaScript framework, chances are those requests are flowing through an API. Given the need for APIs to evolve through refactoring and extension, having great automated tests allows you to develop fast without needing to slow down to run manual tests to work out what’s broken. Additionally, by having tests in place you’re able to firmly identify the requirements that your API should meet; your API tests effectively form a tangible and executable specification. API testing offers an end-to-end mechanism for testing the behaviour of your API, which has advantages in both reliability and development productivity.
In this post I'll be demonstrating how you can test RESTful APIs in an automated fashion using PHP, by building a testing framework through creative use of two packages - Guzzle and PHPUnit. The resulting tests will be something you can run outside of your API as part of your deployment or CI (Continuous Integration) process.
Guzzle acts as a powerful HTTP client which we can use to simulate HTTP Requests against our API. Though PHPUnit Continue reading
This piece was originally written for the Gopher Academy advent series. We are grateful to them for allowing us to republish it here.
Back when crypto/tls was slow and net/http young, the general wisdom was to always put Go servers behind a reverse proxy like NGINX. That's not necessary anymore!
At Cloudflare we recently experimented with exposing pure Go services to the hostile wide area network. With the Go 1.8 release, net/http and crypto/tls proved to be stable, performant and flexible.
However, the defaults are tuned for local services. In this article we'll see how to tune and harden a Go server for Internet exposure.
crypto/tls
You're not running an insecure HTTP server on the Internet in 2016. So you need crypto/tls. The good news is that it's now really fast (as you've seen in a previous advent article), and its security track record so far is excellent.
The default settings resemble the Intermediate recommended configuration of the Mozilla guidelines. However, you should still set PreferServerCipherSuites to ensure safer and faster cipher suites are preferred, and CurvePreferences to avoid unoptimized curves: a client using CurveP384 would cause up to a second of CPU to be consumed on our Continue reading
Cloudflare has an automatic image optimization feature called Polish, available to customers on paid plans. It recompresses images and removes unnecessary data so that they are delivered to browsers more quickly.
Up until now, Polish has not changed image types when optimizing (even if, for example, a PNG might sometimes have been smaller than the equivalent JPEG). But a new feature in Polish allows us to swap out an image for an equivalent image compressed using Google’s WebP format when the browser is capable of handling WebP and delivering that type of image would be quicker.
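One plausible way to make that decision is sketched below in Go. Browsers that can display WebP commonly advertise image/webp in the Accept request header, so the sketch serves WebP only when that type is listed and the WebP version is smaller. The text above does not describe how Polish detects browser support, so the Accept-header check, the byte-size comparison and the function names are all assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// acceptsWebP reports whether an Accept header value lists image/webp.
// For brevity it ignores quality values such as ";q=0".
func acceptsWebP(accept string) bool {
	for _, part := range strings.Split(accept, ",") {
		// Drop parameters such as ";q=0.8" before comparing.
		mediaType := strings.TrimSpace(strings.SplitN(part, ";", 2)[0])
		if strings.EqualFold(mediaType, "image/webp") {
			return true
		}
	}
	return false
}

// chooseFormat picks WebP only when the client accepts it and it is smaller.
func chooseFormat(accept, origFormat string, origSize, webpSize int) string {
	if acceptsWebP(accept) && webpSize < origSize {
		return "webp"
	}
	return origFormat
}

func main() {
	withWebP := "image/webp,image/*,*/*;q=0.8"
	withoutWebP := "image/*,*/*;q=0.8"
	fmt.Println(chooseFormat(withWebP, "jpeg", 120000, 80000))    // webp
	fmt.Println(chooseFormat(withoutWebP, "jpeg", 120000, 80000)) // jpeg
	fmt.Println(chooseFormat(withWebP, "png", 5000, 6000))        // png
}
```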
CC-BY 2.0 image by John Stratford
The main image formats used on the web haven’t changed much since the early days (apart from the SVG vector format, PNG was the last one to establish itself, almost two decades ago).
WebP is a newer image format for the web, proposed by Google. It takes advantage of progress in image compression techniques since formats such as JPEG and PNG were designed. It is often able to compress the images into a significantly smaller amount of data than the older formats.
WebP is versatile and able to replace the three main Continue reading
We use Salt to manage our ever growing global fleet of machines. Salt is great for managing configurations and being the source of truth. We use it for remote command execution and for network automation tasks. It allows us to grow our infrastructure quickly with minimal human intervention.
CC-BY 2.0 image by Kevin Dooley
We got to thinking. Are DNS records not just a piece of the configuration? We concluded that they are and decided to manage our own records from Salt too.
We are strong believers in eating our own dog food, so we make our employees use the next version of our service before rolling it to everyone else. That way if there's a problem visiting one of the 5 million websites that use Cloudflare it'll get spotted quickly internally. This is also why we keep our own DNS records on Cloudflare itself.
Cloudflare has an API that allows you to manage your zones programmatically without ever logging into the dashboard. Until recently, we were using handcrafted scripts to manage our own DNS records via our API. These scripts were in exotic languages like PHP for historical reasons and had interesting behavior that not everybody enjoyed. Continue reading
In 2011 we launched the Cloudflare Apps platform in an article that described Cloudflare as “not ... the sexiest business in the world.” Sexy or not, Cloudflare has since grown from the 3.5 billion pageviews a month we were doing then to over 1.3 trillion per month today. Along the way, we’ve powered more than a million app installations onto our customers’ websites.
For the last 6 years Cloudflare has been focused on building one of the world’s largest networks. The importance of that work has not left as much time as we would have liked to improve our app platform. With just 21 apps, we knew we were not delivering all that our marketplace could offer.
About six months ago, we were introduced to the team at Eager. Eager was building its own app store for installation onto any website. They impressed us with their ability to enable even the most non-technical website owner to install powerful tools to improve their sites through a slick interface. Eager’s platform included the features we wanted in our marketplace, like the ability to preview an app on a user's site before installing it. Even better, Eager had a powerful app Continue reading
Like most of you, I first heard of Cloudflare via this blog. I read about HTTP/2, Railgun, the Hundredth Data Center, and Keyless SSL — but I never thought I would work here. I, along with my co-founder Adam, and our friends and coworkers were hard at work building something very different. We were working on a tool which spent most of its life in the web browser, not on servers all around the world: an app store for your website. Using our tool a website owner could find and install any of over a hundred apps which could help them collect feedback from their visitors, sell products on their site, or even make their site faster.
Our goal was to create a way for every website owner to find and install all of the open-source and SaaS tools technical experts use every day. As developers ourselves, we wanted to make it possible for a developer in her basement to build the next great tool and get it on a million websites (and make a million dollars) the next day. We didn’t want her to succeed because she had the biggest sales or marketing team, or the most Continue reading
M42 Smart Motorway in the West Midlands, UK; courtesy of Highways England.
The load time of your website not only affects your search engine rankings, but is also correlated to the conversion rate on your site:
Cloudflare is determined to help website administrators boost the performance of their websites. From today, Cloudflare users on our Business plan will gain a previously Enterprise-only Page Rule option, “Bypass Cache on Cookie”. When used in conjunction with a “Cache Everything” Page Rule, this setting allows for websites to cache the HTML of anonymous page visits without affecting dynamic content.
By caching anonymous page views, Cloudflare is able to help ensure that your origin webserver doesn't waste time constantly regenerating pages which change rarely. This ultimately allows us Continue reading
The following blog post describes a debugging adventure on Cloudflare's Mesos-based cluster. This internal cluster is primarily used to process log file information so that Cloudflare customers have analytics, and for our systems that detect and respond to attacks.
The problem encountered didn't have any effect on our customers, but did have engineers scratching their heads...
At some point in one of our clusters we started seeing errors like this (an NXDOMAIN for an existing domain on our internal DNS):
lookup some.existing.internal.host on 10.36.0.9:53: no such host
This seemed very weird, since the domain did indeed exist. It was one of our internal domains! Engineers had mentioned that they'd seen this behaviour, so we decided to investigate deeper. Queries triggering this error were varied and ranged from dynamic SRV records managed by mesos-dns to external domains looked up from inside the cluster.
Our first naive attempt was to run the following in a loop:
while true; do dig some.existing.internal.host > /tmp/dig.txt || break; done
Running this for a while on one server did not reproduce the problem: all the lookups were successful. Then we took our service Continue reading
Recent headline-grabbing DDoS attacks provoked heated debates in the DNS community. Everyone has strong opinions on how to harden DNS to avoid downtime in the future. Is it better to use a single DNS provider or multiple? What DNS TTL values are best? Does DNSSEC make you more or less exposed?
CC BY 2.0 image by Leticia Chamorro
These are valid questions worth serious discussion, but tuning your own DNS server settings is not the full story. Together, as a community, we need to harden the DNS protocol itself. We need to prepare it to withstand the toughest DDoS attacks the future will surely bring. In this blog post I'll point out an obscure feature in the core DNS protocol. It is not practical to use this "hidden" feature for DDoS mitigation now, but with a small tweak it could become extremely useful. The feature is currently unused not because of protocol problems, but because of apathy among the DNS Top Level Domain (TLD) operators. If it were working, it would reduce DDoS recovery time for the DNS servers under attack.
The feature in question is: DNS TLD glue records. More specifically DNS TLD glue records with Continue reading