We often try to “institutionalize” things that work into repeatable processes, and most of the time it doesn’t work. The process becomes unwieldy, eventually failing to prevent failures while stifling innovation. How can we get out of this rut? By differentiating between architecture and process. Far too many IT shops try to replace architecture with process. Our second topic for this episode is the destructive lies of the tool trope. Tools are not “neutral”; they shape the way we think and work. A primary example of a tool that can reshape our thinking and doing in very negative ways is … the process.
Depending on your configuration, the Linux kernel can produce a hung task warning message in its log. Searching the Internet and the kernel documentation, you can find a brief explanation that the kernel process is stuck in the uninterruptible state and hasn’t been scheduled on the CPU for an unexpectedly long period of time. That explains the warning’s meaning, but doesn’t provide the reason it occurred. In this blog post we’re going to explore how the hung task warning works, why it happens, whether it is a bug in the Linux kernel or in the application itself, and whether it is worth monitoring at all.
The hung task message in the kernel log looks like this:
INFO: task XXX:1495882 blocked for more than YYY seconds.
Tainted: G O 6.6.39-cloudflare-2024.7.3 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:XXX state:D stack:0 pid:1495882 ppid:1 flags:0x00004002
. . .
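If you want to monitor these warnings, a first step is pulling the task name, PID, and blocked time out of the log text. The following is a minimal Python sketch, not part of the original post. The regular expressions are modeled only on the two sample lines shown above, so the exact field layout is an assumption that may not hold across kernel versions, and `parse_hung_task` is a hypothetical helper name.

```python
import re

# Assumption: patterns are derived from the sample lines above only;
# real kernel output may vary between versions and configurations.
HEADER_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)
DETAIL_RE = re.compile(
    r"task:(?P<comm>\S+)\s+state:(?P<state>\S)\s+stack:(?P<stack>\d+)\s+"
    r"pid:(?P<pid>\d+)\s+ppid:(?P<ppid>\d+)\s+flags:(?P<flags>0x[0-9a-fA-F]+)"
)


def parse_hung_task(lines):
    """Return a dict for the first hung task report in `lines`, or None.

    The header line supplies the task name, PID, and blocked time; the
    matching "task:... state:..." line (if present) adds state, parent
    PID, and flags.
    """
    report = None
    for line in lines:
        header = HEADER_RE.search(line)
        if header and report is None:
            report = {
                "comm": header["comm"],
                "pid": int(header["pid"]),
                "blocked_secs": int(header["secs"]),
            }
            continue
        detail = DETAIL_RE.search(line)
        if detail and report is not None and int(detail["pid"]) == report["pid"]:
            report["state"] = detail["state"]
            report["ppid"] = int(detail["ppid"])
            report["flags"] = int(detail["flags"], 16)
            return report
    return report
```

Feeding the function the lines of a kernel log returns a dictionary for the first report it finds. A state of `D` in the result corresponds to the `state:D` field in the sample above.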
Processes in Linux can be in different states. Some of them are running or ready to run on the CPU — they are in the TASK_RUNNING state. Others are waiting for some signal or event to happen, e.g. network packets to arrive or terminal input Continue reading
This is my response to Rob Pike’s words On Bloat.
I’m not surprised to see this from Pike. He’s a NIH extremist. And yes, in this aspect he’s my spirit animal when coding for fun. I’ll avoid using a framework or a dependency because it’s not the way that I would have done it, and it doesn’t do it quite right… for me.
And he correctly recognizes the technical debt that an added dependency involves.
But I would say that he has two big blind spots.
He doesn’t recognize that not using the available dependency is also adding huge technical debt. Every line of code you write is code that you have to maintain, forever.
The option for most software isn’t “use the dependency” vs “implement it yourself”. It’s “use the dependency” vs “don’t do it at all”. If the latter means adding 10 human years to the product, then most of the time the trade-off makes it not worth doing at all.
He shows a dependency graph of Kubernetes. Great. So are you going to write your own Kubernetes now?
Pike is a good enough coder that he can write his own editor (wikipedia: “Pike has written many text Continue reading
As we kick off the new year, we’re excited to introduce the latest updates to Calico, designed to create a single, unified platform for all your Kubernetes networking, security, and observability needs. These new features help organizations reduce tool sprawl, streamline operations, and lower costs, making it more convenient and efficient to manage Kubernetes environments.
In this blog, we’ll highlight some of the most exciting additions, including a major new product capability: an ingress gateway.
Managing and securing traffic in Kubernetes environments is one of the most complex and critical challenges organizations face today. With more than 60% of enterprises having adopted Kubernetes, according to an annual CNCF survey, controlling and optimizing how external traffic enters clusters is more important than ever. As applications grow in scale and complexity, legacy ingress solutions often fall short, plagued by operational inefficiencies, reliance on proprietary APIs, limited scalability, and difficulty in customization. These limitations make it difficult for teams to maintain consistent performance and robust security across their environments.
To address these challenges, we’re excited to introduce the Calico Ingress Gateway, an enterprise-hardened, 100% upstream distribution of Envoy Gateway that leverages and expands the Continue reading
The quantum computing space is replete with big-name companies like IBM, Google, Microsoft, Amazon, and Intel touting incremental but important steps they’re taking to bring the long-promised technology to the fore. …
German HPC Center Is The First Buyer For New D-Wave Quantum Computer was written by Jeffrey Burt at The Next Platform.
PARTNER CONTENT: As the technology industry continues its shift towards AI dominance, an important schism is opening up that threatens to impact scientific progress, along with important humanitarian endeavors such as disaster response. …
The Hidden Cost Of Compromise: Why HPC Still Demands Precision was written by Timothy Prickett Morgan at The Next Platform.
Audit logs are a critical tool for tracking and recording changes, actions, and resource access patterns within your Cloudflare environment. They provide visibility into who performed an action, what the action was, when it occurred, where it happened, and how it was executed. This enables security teams to identify vulnerabilities, ensure regulatory compliance, and assist in troubleshooting operational issues. Audit logs provide critical transparency and accountability. That's why we're making them "automatic" — eliminating the need for individual Cloudflare product teams to manually send events. Instead, audit logs are generated automatically in a standardized format when an action is performed, providing complete visibility and ensuring comprehensive coverage across all our products.
We're excited to announce the beta release of Automatic Audit Logs — a system that unifies audit logging across Cloudflare products. This new system is designed to give you a complete and consistent view of your environment’s activity. Here’s how we’ve enhanced our audit logging capabilities:
Standardized logging: Previously, audit log generation depended on separate internal teams, which could lead to gaps and inconsistencies. Now, audit logs are automatically produced in a seamless and standardized way, eliminating Continue reading
Vini Motta decided to use AI on ipSpace.net content to find what it would recommend as the projects to work on in order to become employable in 2025. Here are the results he sent me; my comments are inline on a gray background.
While the hyperscalers and big cloud builders all are racing as fast as they can to build the biggest – and presumably the best – models, or collections of models, to win the AI race and become the Microsoft or Red Hat of commercial-grade models, the acquisition of AI hardware and envelope pushing on AI model architecture is not indicative of the adoption of AI by enterprises. …
Cisco Is The Bellwether Of Enterprise AI Adoption was written by Timothy Prickett Morgan at The Next Platform.
Hewlett Packard Enterprise last summer introduced the first of its Gen12 ProLiant systems, packed with Nvidia’s latest GPU accelerators and aimed squarely at the rapidly expanding AI space that in less than two years went from prompt-and-respond chatbots to AI agents that can reason, plan, and collaborate on their own. …
HPE Sets Gen12 ProLiant Servers Loose On AI And The Edge was written by Jeffrey Burt at The Next Platform.
In the previous blog post, I described the usual mechanisms used to connect virtual machines or containers in a virtual lab, and the drawbacks of using Linux bridges to connect virtual network devices.
In this blog post, we’ll see how KVM/QEMU/libvirt/Vagrant use UDP tunnels to connect virtual machines, and how containerlab creates point-to-point vEth links between Linux containers.
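To make the UDP tunnel idea concrete, here is a minimal Python sketch of a point-to-point link carried over UDP: each endpoint binds a local UDP port and sends every frame as the payload of a datagram addressed to its peer. This illustrates the general concept only and is not how KVM, QEMU, libvirt, or Vagrant implement it. The `UdpLinkEndpoint` class and its methods are hypothetical names, and the sketch assumes both ends run on the loopback interface.

```python
import socket


class UdpLinkEndpoint:
    """One end of a point-to-point link carried over UDP (illustrative only)."""

    def __init__(self, host="127.0.0.1", local_port=0):
        # Port 0 asks the operating system to pick a free ephemeral port.
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self.sock.bind((host, local_port))
        self.sock.settimeout(2.0)  # avoid blocking forever in recv_frame()
        self.host = host
        self.peer = None

    @property
    def port(self):
        """Local UDP port this endpoint is bound to."""
        return self.sock.getsockname()[1]

    def set_peer(self, peer_port):
        """Record the UDP port of the endpoint at the other end of the link."""
        self.peer = (self.host, peer_port)

    def send_frame(self, frame: bytes) -> None:
        """Send one frame as the payload of a single UDP datagram."""
        if self.peer is None:
            raise RuntimeError("peer not configured")
        self.sock.sendto(frame, self.peer)

    def recv_frame(self) -> bytes:
        """Receive one datagram and return its payload as the frame."""
        data, _addr = self.sock.recvfrom(65535)
        return data

    def close(self) -> None:
        self.sock.close()
```

Two endpoints wired to each other's ports behave like the two ends of a cable: whatever one side sends, the other side receives, with no bridge or learning switch in between.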
Nvidia may be shipping its “Blackwell” B100, B200, and GB200 compute engines, but not in high enough volumes for server maker Supermicro to meet its revenue expectations in the quarter ended in December. …
Extended “Blackwell” GPU Ramp Cools Growth At Supermicro was written by Timothy Prickett Morgan at The Next Platform.
There are two ways to make a programmable switch that can run network applications and accelerate certain network functions. …
Cisco Cuts Network Costs By Welding Nexus Switch To AMD DPU was written by Timothy Prickett Morgan at The Next Platform.
Now that 2025 has been here for a few weeks and 2024 has closed with a variety of year-end traditions — from Christmas and Hanukkah celebrations to New Year’s Eve (NYE) countdowns, as well as celebrations of Orthodox Christmas, and Lunar/Chinese New Year — let’s examine how these events have shaped online behavior across continents and cultures. Reflecting on Christmas and NYE 2024 provides insights into how these trends compared with those of the previous year, as detailed in an earlier blog.
One notable finding is the remarkable consistency in human online patterns from one year to the next, a trend that persists despite cultural differences among countries. Data from over 50 countries reveal how people celebrated in 2024–2025, offering a timely reminder of typical holiday trends. While Christmas remains a dominant influence in many regions, other cultural and religious events — such as Hanukkah and local festivities — also shape online habits where Western traditions hold less sway.
In regions where Christmas is deeply rooted, Internet traffic dips significantly during Christmas Eve dinners, midnight masses, morning gift exchanges, and Christmas Day lunches, a pattern evident in both our previous and current analyses.
This analysis focuses exclusively on non-bot Internet Continue reading
Despite the many advantages of Segment Routing, some networks still prefer to use RSVP for traffic engineering – and they can have good reasons for this. Is there any value in an SDN controller with RSVP-TE, compared to configuring policies on each …