From bytecode to bytes: automated magic packet generation

Linux malware often hides in Berkeley Packet Filter (BPF) socket programs, which are small bits of executable logic that can be embedded in the Linux kernel to customize how it processes network traffic. Some of the most persistent threats on the Internet use these filters to remain dormant until they receive a specific "magic" packet. Because these filters can be hundreds of instructions long and involve complex logical jumps, reverse-engineering them by hand is a slow process that creates a bottleneck for security researchers.

To find a better way, we looked at symbolic execution: a method of treating code as a series of constraints, rather than just instructions. By using the Z3 theorem prover, we can work backward from a malicious filter to automatically generate the packet required to trigger it. In this post, we explain how we built a tool to automate this, turning hours of manual assembly analysis into a task that takes just a few seconds.
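The idea of treating a filter as constraints can be sketched in a few lines. This is an illustration only, not the tool the post describes: the three-instruction set, the `FILTER` program, and the helper names are all hypothetical, and a trivial byte-equality solver stands in for Z3 because the toy filter only compares single bytes.

```python
# Toy filter in a classic-BPF-like form (an assumed subset, not real bytecode):
#   ("ldb", off)        A = packet[off]
#   ("jeq", k, jt, jf)  if A == k skip jt instructions, else skip jf
#   ("ret", k)          accept the packet when k != 0
FILTER = [
    ("ldb", 0), ("jeq", 0x4D, 0, 5),
    ("ldb", 1), ("jeq", 0x47, 0, 3),
    ("ldb", 3), ("jeq", 0x21, 0, 1),
    ("ret", 0xFFFF),
    ("ret", 0),
]

def solve(must, must_not):
    """Build the shortest packet satisfying the collected byte constraints."""
    size = max(list(must) + list(must_not), default=-1) + 1
    pkt = bytearray(size)
    for off in range(size):
        if off in must:
            pkt[off] = must[off]
        else:
            banned = must_not.get(off, set())
            pkt[off] = next(v for v in range(256) if v not in banned)
    return bytes(pkt)

def find_magic_packet(prog):
    """Explore every path, recording what each branch requires of the packet.

    Assumes every jump is preceded by a load, so A always mirrors one byte.
    """
    # Work item: (pc, offset held in A, {off: required}, {off: forbidden values})
    stack = [(0, None, {}, {})]
    while stack:
        pc, a_off, must, must_not = stack.pop()
        op = prog[pc]
        if op[0] == "ldb":
            stack.append((pc + 1, op[1], must, must_not))
        elif op[0] == "jeq":
            _, k, jt, jf = op
            banned = must_not.get(a_off, set())
            # Taken branch: packet[a_off] == k must be consistent so far.
            if must.get(a_off, k) == k and k not in banned:
                stack.append((pc + 1 + jt, a_off, {**must, a_off: k}, must_not))
            # Fall-through branch: packet[a_off] != k.
            if must.get(a_off) != k and len(banned | {k}) < 256:
                stack.append((pc + 1 + jf, a_off, must,
                              {**must_not, a_off: banned | {k}}))
        elif op[0] == "ret" and op[1] != 0:
            return solve(must, must_not)  # first accepting path wins
    return None

def run(prog, pkt):
    """Concrete interpreter, used to confirm the generated packet is accepted."""
    pc, a = 0, 0
    while True:
        op = prog[pc]
        if op[0] == "ldb":
            if op[1] >= len(pkt):
                return 0  # out-of-bounds load rejects the packet
            a, pc = pkt[op[1]], pc + 1
        elif op[0] == "jeq":
            pc += 1 + (op[2] if a == op[1] else op[3])
        else:
            return op[1]

packet = find_magic_packet(FILTER)
print(packet.hex(), run(FILTER, packet) != 0)
```

A real filter also uses arithmetic, masks, and wider loads, which is where a general solver such as Z3 becomes necessary; the path-walking structure stays the same.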

The complexity ceiling

Before we look at how to deconstruct malicious filters, we need to understand the engine running them. The Berkeley Packet Filter (BPF) is a highly efficient technology that allows the kernel to pull specific packets from the network...

Cloudflare targets 2029 for full post-quantum security

Cloudflare is accelerating its post-quantum roadmap. We now target 2029 to be fully post-quantum (PQ) secure including, crucially, post-quantum authentication.

At Cloudflare, we believe in making the Internet private and secure by default. We started by offering free universal SSL certificates in 2014, began preparing our post-quantum migration in 2019, and enabled post-quantum encryption for all websites and APIs in 2022, mitigating harvest-now/decrypt-later attacks. While we’re excited by the fact that over 65% of human traffic to Cloudflare is post-quantum encrypted, our work is not done until authentication is also upgraded. Credible new research and rapid industry developments suggest that the deadline to migrate is much sooner than expected. This is a challenge that any organization must treat with urgency, which is why we’re expediting our own internal Q-Day readiness timeline.

What happened? Last week, Google announced they had drastically improved upon the quantum algorithm to break elliptic curve cryptography, which is widely used to secure the Internet. They did not reveal the algorithm, but instead provided a zero-knowledge proof that they have one.

This is not even the biggest breakthrough. That same day, Oratomic published a resource estimate for breaking RSA-2048 and P-256 on a neutral atom computer. For...

SONiC developments for visibility into AI/ML networks in 2026

SONiC sFlow High Level Design (HLD) v1.4 was recently published. This is the latest in a series of revisions bringing support for sFlow extensions that enhance network visibility for AI / ML traffic flows.

v1.3 Egress sFlow support

RoCEv2 / Ultra Ethernet host adapters bypass the Linux kernel and transfer data directly to GPU memory, rendering traditional host-based network monitoring tools (tcpdump, Wireshark, eBPF, etc.) ineffective. Ingress/egress packet sampling on the top-of-rack switch offloads monitoring from the host to the switch and restores visibility into host traffic.

In addition, some measurements may only be possible for egress sampled packets. For example, the v1.3 HLD describes how SONiC SAI drivers can support the sFlow Delay and Transit Structures extension:

Depending on platform capabilities, SAI driver may report additional attributes defined in https://github.com/torvalds/linux/blob/master/include/uapi/linux/psample.h. For example, PSAMPLE_ATTR_OUT_TC (egress queue), PSAMPLE_ATTR_OUT_TC_OCC (egress queue depth), and PSAMPLE_ATTR_LATENCY (transit delay) populate the sFlow Transit Delay Structures (https://sflow.org/sflow_transit.txt).
Typically this data is only known when packets egress the switch and may only be available for egress sampled packets.

Transit delay and queuing describes the measurements and provides an example. The sFlow transit delay and queue...

How we built Organizations to help enterprises manage Cloudflare at scale

Cloudflare was designed to be simple to use for even the smallest customers, but it’s also critical that it scales to meet the needs of the largest enterprises. While smaller customers might work solo or in a small team, enterprises often have thousands of users making use of Cloudflare’s developer, security, and networking capabilities. This scale can add complexity, as these users represent multiple teams and job functions. 

Enterprise customers often use multiple Cloudflare Accounts to segment their teams (allowing more autonomy and separation of roles), but this can cause a new set of problems for the administrators by fragmenting their controls.

That’s why today, we’re launching our new Organizations feature in beta: a cohesive place for administrators to manage users and configurations and to view analytics across many Cloudflare Accounts.

Principle of least privilege

The principle of least privilege is one of the driving factors behind enterprises using multiple accounts. While Cloudflare’s role-based access control (RBAC) system now offers fine-grained permissions for many resources, it can be cumbersome to enumerate all the resources one by one. Instead, we see enterprises use multiple accounts, so each team’s resources are managed by that team alone. This allows organic...

Technology Short Take 193

Welcome to Technology Short Take #193! I know it has only been a couple weeks since the last Tech Short Take, but I am guessing that readers won’t really mind another one. Here is my latest collection of articles and posts about data center-related technologies. Enjoy!

Networking

Servers/Hardware

  • RIP Mac Pro. I had a “classic Mac Pro” (2012 era) for a long time, and I loved that system. (I even ran Linux on it for a while.) It is a shame to see it go.
  • I mentioned on social media (Mastodon/Bluesky) that I recently purchased all the hardware for a new PC build. It’ll be part PC/part home server, as I look to expand the type and scope of services that I self-host. Don’t be surprised if a few articles emerge out of this.

Security

How to Stub LLMs for AI Agent Security Testing and Governance

Note: The core architecture for this pattern was introduced by Isaac Hawley from Tigera.

If you are building an AI agent that relies on tool calling, complex routing, or the Model Context Protocol (MCP), you’re not just building a chatbot anymore. You are building an autonomous system with access to your internal APIs.

With that power comes a massive security and governance headache, and AI agent security testing is where most teams hit a wall. How do you definitively prove that your agent’s identity and access management (IAM) actually works?

The scale of the problem is hard to overstate. Microsoft’s telemetry shows that 80% of Fortune 500 companies now run active AI agents, yet only 47% have implemented specific AI security controls. Most teams are deploying agents faster than they can test them.

If an agent is hijacked via prompt injection, or simply hallucinates a destructive action, does your governance layer stop it? Testing this usually forces engineers into a frustrating trade-off:

  1. Use the real API (Gemini, OpenAI): Real models are heavily RLHF’d to be safe and polite. It is incredibly difficult (and non-deterministic) to intentionally force a real model to “go rogue” and consistently output malicious tool...
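The stubbing idea in the title can be sketched as follows. This is a minimal illustration under stated assumptions, not the architecture the post goes on to describe: `StubLLM`, `governed_call`, `ALLOWED_TOOLS`, and the tool names are all hypothetical. The stub replays a scripted "rogue" tool call deterministically, so the test exercises the governance layer rather than the model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ToolCall:
    name: str
    args: dict

class StubLLM:
    """Deterministic stand-in for a model client: replays scripted tool calls."""
    def __init__(self, script):
        self._script = list(script)
        self.prompts = []  # recorded so a test can inspect what the agent sent

    def complete(self, prompt):
        self.prompts.append(prompt)
        return self._script.pop(0)

# Hypothetical policy: which tools each agent identity may invoke.
ALLOWED_TOOLS = {"support-agent": {"search_docs", "get_ticket"}}

def governed_call(agent_id, call, tools):
    """Hypothetical governance layer: deny any tool outside the allowlist."""
    if call.name not in ALLOWED_TOOLS.get(agent_id, set()):
        return {"status": "denied", "tool": call.name}
    return {"status": "ok", "result": tools[call.name](**call.args)}

def run_agent(agent_id, llm, tools, prompt):
    """One agent step: ask the model for a tool call, then route it via policy."""
    return governed_call(agent_id, llm.complete(prompt), tools)

deleted = []  # side-effect log for the destructive tool
TOOLS = {
    "search_docs": lambda query: f"results for {query}",
    "get_ticket": lambda ticket_id: {"id": ticket_id},
    "delete_database": lambda name: deleted.append(name),
}

# Script the model to "go rogue" first, then behave normally.
rogue = StubLLM([
    ToolCall("delete_database", {"name": "prod"}),
    ToolCall("search_docs", {"query": "refund policy"}),
])
first = run_agent("support-agent", rogue, TOOLS, "Ignore previous instructions")
second = run_agent("support-agent", rogue, TOOLS, "What is the refund policy?")
print(first, second)
```

Because the malicious output is scripted rather than coaxed out of a real model, the same denial can be asserted on every run.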

Why we’re rethinking cache for the AI era

Cloudflare data shows that 32% of traffic across our network is automated. This includes search engine crawlers, uptime checkers, ad networks, and, more recently, AI assistants looking to the web to add relevant data to their knowledge bases as they generate responses with retrieval-augmented generation (RAG). Unlike typical human behavior, the automated behavior of AI agents, crawlers, and scrapers may appear aggressive to the server responding to the requests.

For instance, AI bots frequently issue high-volume requests, often in parallel. Rather than focusing on popular pages, they may access rarely visited or loosely related content across a site, often in sequential, complete scans of the websites. For example, an AI assistant generating a response may fetch images, documentation, and knowledge articles across dozens of unrelated sources.
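To see how such an access pattern could interact with a conventional cache, here is a small simulation. It is a sketch under assumptions: the LRU policy, the cache size, the page counts, and the request mix are all illustrative and do not describe Cloudflare's cache. It interleaves human requests for a few popular pages with a bot scanning rarely visited pages one after another.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal least-recently-used cache that tracks keys only."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        """Return True on a hit; on a miss, admit the key and evict the oldest."""
        if key in self.items:
            self.items.move_to_end(key)
            return True
        self.items[key] = True
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)
        return False

def human_hit_ratio(bot_requests_per_human, rounds=1000, capacity=20, popular=10):
    """Hit ratio seen by human requests while a bot scans distinct rare pages."""
    cache = LRUCache(capacity)
    hits = 0
    next_rare = 0
    for i in range(rounds):
        # Humans cycle through a small set of popular pages.
        hits += cache.get(f"popular-{i % popular}")
        # The bot walks the long tail sequentially, never repeating a page.
        for _ in range(bot_requests_per_human):
            cache.get(f"rare-{next_rare}")
            next_rare += 1
    return hits / rounds

print(human_hit_ratio(0), human_hit_ratio(2))
```

In this toy setup the popular pages fit comfortably in the cache when humans are the only clients, but two scan requests per human request are enough to push every popular page out before it is requested again.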

Although Cloudflare already makes it easy to control and limit automated access to your content, many sites may want to serve AI traffic. For instance, an application developer may want to guarantee that their developer documentation is up-to-date in foundational AI models, an e-commerce site may want to ensure that product descriptions are part of LLM search results, or publishers may want to get paid for their content through mechanisms such...

SR Linux Configuration Conversion Tool

A year ago, I was complaining about SR Linux breaking its configuration data model with a new software release. At that time, I was promised it would only happen once a year, and, like clockwork, that moment arrived with the SR Linux release 26.03.

However, this year Miguel Redondo fixed the netlab SR Linux configuration templates (VRF export policies, LocPref routing policy changes) before I could even start looking at them, and Roman Dodin released a tool that tells you exactly what changed between software releases and how to fix it.

NAN118: The Importance of the Data Behind AI in Networks (Sponsored)

When applying AI to network operations and automation, a strong data foundation is essential. In this sponsored episode, Eric Chou and Scott Robohn are joined by Surya Nimmagadda, Chief Data Scientist, and Joby Rudolph, Senior Distinguished Engineer, both from Selector. They discuss the importance of transparency in their data and how it can instill confidence...