At Cloudflare, we take steps to ensure we are resilient against failure at all levels of our infrastructure. This includes Kafka, which we use for critical workflows such as sending time-sensitive emails and alerts.
We learned a lot about keeping our applications that leverage Kafka healthy, so they can always be operational. Application health checks are notoriously hard to implement: What determines an application as healthy? How can we keep services operational at all times?
These can be implemented in many ways. We’ll talk about an approach that allows us to considerably reduce incidents with unhealthy applications while requiring less manual intervention.
Cloudflare is a big adopter of Kafka. We use Kafka as a way to decouple services due to its asynchronous nature and reliability. It allows different teams to work effectively without creating dependencies on one another. You can also read more about how other teams at Cloudflare use Kafka in this post.
Kafka is used to send and receive messages. Messages represent some kind of event like a credit card payment or details of a new user created in your platform. These messages can be represented in multiple ways: JSON, Protobuf, Avro and so on.
It’s easy to get excited about what seems to be a new technology and conclude that it will forever change the way we do things. For example, I’ve seen claims that SmartNICs (also known as Data Processing Units – DPU) will forever change the network.
TL&DR: Of course they won’t.
Before we start discussing the details, it’s worth remembering what a DPU is: it’s another server with its own CPU, memory, and network interface card (NIC) that happens to have PCI hardware that emulates the host interface cards. It might also have dedicated FPGA or ASICs.
It’s easy to get excited about what seems to be a new technology and conclude that it will forever change the way we do things. For example, I’ve seen claims that SmartNICs (also known as Data Processing Units – DPU) will forever change the network.
TL&DR: Of course they won’t.
Before we start discussing the details, it’s worth remembering what a DPU is: it’s another server with its own CPU, memory, and network interface card (NIC) that happens to have PCI hardware that emulates the host interface cards. It might also have dedicated FPGA or ASICs.
On today's Network Break podcast we cover a raft of Juniper vulnerabilities, whether Cisco should patch serious vulnerabilities in end-of-life products, a big T-Mobile breach, Avaya dealing with significant debt, sweeping rounds of layoffs, and more IT news.
The post Network Break 414: 230 Juniper Vulnerabilities, Should Cisco Patch An EOL Router, T-Mobile Takes Weeks To Spot Breach appeared first on Packet Pushers.
The post Optimizing BGP convergence appeared first on Noction.
The post Optimizing BGP convergence appeared first on Noction.