Meyer Zinn

Author Archives: Meyer Zinn

Making Magic Transit health checks faster and more responsive

Making Magic Transit health checks faster and more responsive
Making Magic Transit health checks faster and more responsive

Magic Transit advertises our customer’s IP prefixes directly from our edge network, applying DDoS mitigation and firewall policies to all traffic destined for the customer’s network. After the traffic is scrubbed, we deliver clean traffic to the customer over GRE tunnels (over the public Internet or Cloudflare Network Interconnect). But sometimes, we experience inclement weather on the Internet: network paths between Cloudflare and the customer can become unreliable or go down. Customers often configure multiple tunnels through different network paths and rely on Cloudflare to pick the best tunnel to use if, for example, some router on the Internet is having a stormy day and starts dropping traffic.

Making Magic Transit health checks faster and more responsive

Because we use Anycast GRE, every server across Cloudflare’s 200+ locations globally can send GRE traffic to customers. Every server needs to know the status of every tunnel, and every location has completely different network routes to customers. Where to start?

In this post, I’ll break down my work to improve the Magic Transit GRE tunnel health check system, creating a more stable experience for customers and dramatically reducing CPU and memory usage at Cloudflare’s edge.

Everybody has their own weather station

To decide where to send traffic, Cloudflare edge servers Continue reading