Category Archives for "Networking – The New Stack"

How To Read a Traceroute for Network Troubleshooting

The traceroute tool is one of the most valuable yet straightforward diagnostic utilities available for network troubleshooting. Built into virtually every operating system, traceroute runs a connection test from one computer to another device, showing each “hop” the data takes between network devices. This comprehensive guide will help you understand how traceroute works, interpret its results and recognize common network problems it can reveal.

Traceroute: Understanding What It Does

To see traceroute in action, we can begin with a simple example of running a traceroute from your computer to Catchpoint’s servers. The specific results will be different for each person. However, in most cases, the results will show you around four to 20 “hops” that packets take to get from your computer to Catchpoint’s servers and back. The first one would likely be your local router, and from there, the data will take multiple “hops” through your internal network and out through your internet service provider (ISP) and over the internet, before finally reaching Catchpoint’s servers. Figure 1 shows an example of what you might see on the command prompt of a Windows computer.

Figure 1: Image of a traceroute command and the results generated.

Understanding how to run this Continue reading
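As a rough illustration of what "reading" a traceroute involves, the sketch below parses hop lines in the general style of Windows `tracert` output into a hop number, three round-trip times and a responding address. The sample text and its addresses are hypothetical placeholders (documentation ranges), not the contents of Figure 1, and the function name `parse_hops` is an assumption made for this example.

```python
import re

# Hypothetical sample in the style of Windows `tracert` output. The addresses
# are placeholders from documentation ranges, not real measurements.
SAMPLE = """\
  1     1 ms    <1 ms     1 ms  192.168.1.1
  2    12 ms    11 ms    13 ms  203.0.113.1
  3     *        *        *     Request timed out.
  4    25 ms    24 ms    26 ms  198.51.100.7
"""

HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")      # hop number, then the rest
PROBE_RE = re.compile(r"(<?\d+)\s+ms|\*")      # "12 ms", "<1 ms" or "*"


def parse_hops(text):
    """Return one dict per hop line; lines that are not hop lines are skipped."""
    hops = []
    for line in text.splitlines():
        match = HOP_RE.match(line)
        if not match:
            continue
        rest = match.group(2)
        probes = list(PROBE_RE.finditer(rest))[:3]
        if len(probes) < 3:
            continue  # not a hop line with three probes
        # "*" means the probe got no reply; "<1 ms" is recorded as 1 (an upper bound).
        rtts = [
            None if p.group(1) is None else int(p.group(1).lstrip("<"))
            for p in probes
        ]
        answered = [r for r in rtts if r is not None]
        host = rest[probes[-1].end():].strip() if answered else None
        hops.append({
            "hop": int(match.group(1)),
            "rtts_ms": rtts,
            "avg_ms": sum(answered) / len(answered) if answered else None,
            "host": host,
        })
    return hops


if __name__ == "__main__":
    for hop in parse_hops(SAMPLE):
        print(hop)
```

A hop whose three probes all time out is not necessarily a fault: if later hops still answer, that device is probably just not replying to the probes.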

Dave Taht, Who Sped Up Networks More Than You’ll Ever Know, Has Died

Dave Taht, LinkedIn profile photo. I don’t recall when I first met [...] InterOp conference in the ’80s, when it was the best-ever networking conference, or at a science-fiction convention (Balticon?) around the same time. Whether it was when he was talking about TCP/IP networking or playing [...] Bufferbloat Project with Jim Gettys, focusing on reducing internet latency and improving network performance. His work on advanced queuing algorithms like Common Applications Kept Enhanced (CAKE) significantly enhanced network efficiency, making these technologies part of the default networking stack in many Linux distributions, the popular open source embedded router operating system,

KubeCon Europe: Kgateway Aims To Be the Kubernetes Onramp

Kubernetes network administrators at KubeCon + CloudNativeCon EU this week in London should drop by the [...] ease the management of moving traffic to and from clusters. Built on top of Kubernetes Gateway API, the open source [...] Solo.io, and went under the name Gloo Gateway. At last year’s KubeCon + CloudNativeCon North America 2024, the company announced that it would be donating the software to the Cloud Native Computing Foundation (CNCF), changing the software’s name to kgateway in the process. In March, CNCF [...] Gloo open source repository will be deprecated over time.

The Importance of the Kubernetes Gateway API

In 2023, the

AI in Network Observability: The Dawn of Network Intelligence

Let’s face it. The modern network is a beast: a sprawling, complex organism of clouds, data centers, SaaS apps, home offices, and, depending on your industry vertical, factories, offices, retail locations, or branches. Mix in the internet as the backbone to connect them all, as well as an ever-increasing volume and velocity of data, and it becomes clear that traditional monitoring tools are now akin to peering through a keyhole to look at a vast landscape. They simply can’t see the bigger picture, and a new approach is needed: Enter Artificial Intelligence (AI), the game-changer ushering in a new era of Network Intelligence.

From Reactive to Intelligent: The AI Revolution

Remember the days of watching hundreds of dashboards, sifting through endless logs, and deciphering cryptic alerts? Those days are fading fast. Machine Learning and Generative AI are transforming network observability from a reactive chore to a proactive science. ML algorithms, trained on vast datasets of enriched, context-savvy network telemetry, can now detect anomalies in real time, predict potential outages, foresee cost overruns, and even identify subtle performance degradations that would otherwise go unnoticed. Imagine an AI that can predict a spike in malicious traffic based on historical patterns and automatically Continue reading

Choosing Manual or Auto-Instrumentation for Mobile Observability

As applications run in production, you’ll need to find out what’s happening. You might want to know if you’re overloading the hardware, moving to the wrong feature in an A/B test or, on the mobile side, even such simple contingencies as whether the battery is running out. Developing an app to send information about itself means adding instrumentation. Apps can send such telemetry as [...] OpenTelemetry (OTel) project. An added benefit is, if most applications in a given language need to observe the same types of operations and workflows, developers building on the OTel standard can identify and build

Making the Fediverse More Accessible With Claude 3.7 Sonnet

A few years ago I abandoned Twitter in favor of Mastodon. Recent events validate that choice and underscore the strategic importance of a decentralized fediverse that can’t be owned by a single corporate or state actor. But while Mastodon meets my needs, much of the Twitter diaspora has gone to Bluesky. That’s fine for now but might not always be. In an article titled “[...] Bridgy Fed (a service that enables you to connect together your website, fediverse account and Bluesky account) will help. But Bridgy Fed needs to be easier to Continue reading

WanAware: 21 Packets’ Affordable Observability Play

[...] 21Packets’ infrastructure and customers. The company’s founders promise extensive integration capabilities, significantly reducing the costs organizations typically incur when integrating observability or telemetry data feeds with other sources. The platform not only extends and integrates with multiple data sources but does so at a significantly lower cost. This aspect of the offering is critical for those organizations that have numerous data feeds that might also be spread around the world geographically, [...] WanAware, chief strategy officer/CISO and CEO of 21Packets, who holds other executive positions concurrently, said. “The reason why customers often come to us is they have resources all over the globe and they don’t have the ability to scale their observability platforms to support their environment. They don’t have teams that can operate it and they don’t have subject matter experts,” Collins said. “So, they need the ability to use an IT generalist and have an Continue reading

How To Find and Fix What’s Trashing Your App Performance

Troubleshooting slow app or website performance is one of the more frustrating tasks developers face. Reliability is a key performance indicator and user experience metric, so when something goes wrong, it rises immediately to the top of your priority list. Unless you can find and fix the problem fast, everything else you’d planned to do today is getting pushed into the future.

The Scenario

Picture this: You’re busy at your keyboard, trying to hit a milestone on an important yet incremental upgrade to your company’s key user-facing app. Suddenly, your ticketing system goes berserk: dozens of users are reporting that they can’t access an essential feature in your app. More patient users report they can access it … eventually. It’s taking minutes to load, not seconds, as they’re used to. You check your app’s performance in your monitoring tool, and everything looks fine: all the indicators are showing green. But your users are saying the app is slow, which, as far as they’re concerned, means it’s down. So, what exactly is going on here?

Uncover What’s Degrading Digital Performance

The evolution to internet-centric application delivery has made it increasingly challenging for IT orgs to identify the root Continue reading

Choosing the Right Transport Protocol: TCP vs. UDP vs. QUIC

We often think of protocol choice as a purely technical decision, but it’s a critical factor in the user experience and how your application is consumed. This is a high-impact business decision, making it crucial for the technical team first to understand the business situation and priorities. Choosing the right transport protocol (TCP, UDP or QUIC) profoundly impacts scalability, reliability and performance. These protocols function like different postal services, each offering a unique approach to delivering messages across networks. Should your platform prioritize the reliability of a certified letter, the speed of a doorstep drop-off or the innovation of a couriered package with signature confirmation? This decision-making framework breaks down the strengths, weaknesses, and ideal use cases of TCP, UDP and QUIC. It gives platform engineers and architects the insights to choose the proper protocol for their systems.

Overview of Protocols

Most engineers are familiar with TCP and have heard of UDP. Some may even have hands-on experience with QUIC. However, to make the right choice, it’s helpful to align on how these protocols compare before diving into the decision-making framework.

TCP: The Certified Letter
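To make the postal analogy a little more concrete, here is a minimal sketch using Python's standard `socket` module on the loopback interface. It sends the same short message once over TCP, which first sets up a connection and delivers an ordered byte stream, and once over UDP, which just sends a datagram with no handshake or delivery guarantee. The function names `demo_tcp` and `demo_udp` are made up for this example, and QUIC is left out because the standard library has no QUIC implementation.

```python
import socket
import threading


def _tcp_echo_once(server_sock):
    """Accept one connection and echo back whatever arrives first."""
    conn, _ = server_sock.accept()
    with conn:
        conn.sendall(conn.recv(1024))


def demo_tcp(message: bytes) -> bytes:
    """Connection-oriented: connect (handshake), send, read the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        worker = threading.Thread(target=_tcp_echo_once, args=(server,))
        worker.start()
        with socket.create_connection(("127.0.0.1", port), timeout=2) as client:
            client.sendall(message)
            reply = client.recv(1024)           # fine for a short message on loopback
        worker.join()
    return reply


def demo_udp(message: bytes) -> bytes:
    """Connectionless: no handshake, one datagram, no delivery guarantee."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2)                  # a lost datagram would raise a timeout
        sender.sendto(message, receiver.getsockname())
        data, _ = receiver.recvfrom(1024)
    return data


if __name__ == "__main__":
    print("TCP reply:", demo_tcp(b"hello"))
    print("UDP datagram:", demo_udp(b"hello"))
```

On loopback both calls normally succeed; the difference shows up on real networks, where TCP retransmits lost segments and UDP leaves any recovery to the application.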

ByteDance to Network a Million Containers with Netkit

Engineers from the Chinese social media conglomerate ByteDance are taking early advantage of a recently released feature of the Linux kernel called netkit that provides a faster way for containers to communicate with each other across a cluster. First released in Linux kernel 6.7 in December 2023, netkit is a [...] now-discontinued Netkit that was used to create virtual networks on a single server) and has been touted as a way to streamline container networking. Like the rest of the cloud native world, ByteDance uses Virtual Ethernet (veth) [...] had a number of bottlenecks that slowed communication rates across containers. In fact, veth requires each packet to traverse two network stacks, one of the sender and the other of the recipient, even if the two containers communicating were on the same Continue reading

NVMe-oF Substantially Reduces Data Access Latency

Modeling hyperscaler cloud architecture is gaining significant momentum in enterprise data centers as many IT teams are repatriating their public cloud workloads back on premises, modernizing their data center for cloud native workloads or building their own specialized public cloud services. They want to integrate the best capability and efficiency aspects of the public cloud with on-premises control. Several key benefits of the public cloud are driving data-center requirements, which include efficiency, scalability, flexibility, automation and agility. Three technological innovations have emerged as key enablers of best-of-breed cloud architecture that delivers the benefits promised by the public cloud: software-defined storage, open source orchestrators such as Kubernetes, and NVMe-oF (Nonvolatile Memory Express Over Fabrics). All are gaining popularity as foundational components of modern cloud architecture.

What Is NVMe-oF?

The NVMe-oF v1.0 specification was released in June 2016. NVMe-oF is a network protocol that extends the parallel access and low latency features of the Nonvolatile Memory Express (NVMe) protocol across networked storage. Originally designed for local storage and common in direct-attached storage (DAS) architectures, NVMe delivers high-speed data access and low latency by directly interfacing with solid-state disks. NVMe-oF allows these same advantages to be achieved in distributed and Continue reading

Cloud Monitoring’s Blind Spot: The User Perspective

The evolution of internet-centric application delivery has worsened IT’s visibility gaps into what impacts an end user’s experience. This problem is exacerbated when these gaps lead to negative business consequences, such as loss of revenue or lower Net Promoter Scores (NPS). The need to address this worsening visibility gap problem is reinforced by Gartner’s recent publication of its first

OpenVPN or WireGuard? A Detailed Performance Breakdown

OpenVPN has been a dominant player in the VPN space since its release in 2001. With a 23-year history, OpenVPN has proven to be a reliable and secure protocol. However, it has some downsides, particularly regarding performance and ease of use. OpenVPN creates a secure tunnel between two endpoints using SSL/TLS for encryption. While robust, the protocol is complex and requires considerable resources to run efficiently. Setting up and managing OpenVPN can be cumbersome, especially for DevOps teams juggling multiple environments and configurations. It wouldn’t be the first time an OpenVPN server stopped working because the TLS certificates expired. WireGuard, on the other hand, is the new kid on the block, having been introduced in recent years. What sets WireGuard apart from OpenVPN is its simplicity and efficiency. While OpenVPN relies on older, more complex cryptographic algorithms, WireGuard uses modern encryption that is both faster and more secure. Unlike OpenVPN, WireGuard is integrated directly into the Linux kernel, meaning it operates at a lower level and with less overhead. This results in faster connection times and lower resource usage. One of the significant benefits of WireGuard is its minimal codebase — about 10% the size of OpenVPN’s — which reduces Continue reading

Deciphering the Open Systems Interconnection Model

Unless you’ve studied for a network cert, the Open Systems Interconnection (OSI) model is probably somewhat of a mystery to you. Maybe you heard of it from a coworker, or maybe you saw it in a marketing campaign for something on AWS. Maybe you thought “Layer 3” was just some new buzzword. Such shorthand references to the OSI model, however, can be useful if you can decode them, as they can help you understand where in your network stack a tool could fit or where to look for a problem during an incident call. Before we get too far, let me address a point of contention. Many people will say the theoretical OSI model is outdated. The model is theoretical, true, and the real world is certainly more complex than it may lead you to believe. Its layers don’t neatly map to specific devices, and other models exist that more accurately reflect the real world, such as the Transmission Control Protocol/Internet Protocol (TCP/IP) model.

It’s useful to think of the OSI model as an abstraction that allows us to reason about the separation of concerns on a network. We use it to think through troubleshooting steps should Continue reading
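For readers who want to decode shorthand such as “Layer 3” on the spot, the seven standard OSI layer names fit in a small lookup table. The sketch below is only an illustration; the helper name `decode` and the accepted spellings (“Layer 3”, “L3”) are assumptions made for this example.

```python
import re

# The seven layers of the OSI reference model, numbered from the bottom up.
OSI_LAYERS = {
    1: "Physical",
    2: "Data Link",
    3: "Network",
    4: "Transport",
    5: "Session",
    6: "Presentation",
    7: "Application",
}

# Accepts "Layer 3", "layer3", "L3" and similar spellings (an assumption of
# this sketch, not a standard notation).
SHORTHAND_RE = re.compile(r"^\s*(?:layer|l)\s*(\d)\s*$", re.IGNORECASE)


def decode(shorthand: str) -> str:
    """Translate OSI shorthand such as 'Layer 3' or 'L7' into the layer name."""
    match = SHORTHAND_RE.match(shorthand)
    if not match or int(match.group(1)) not in OSI_LAYERS:
        raise ValueError(f"not an OSI layer reference: {shorthand!r}")
    number = int(match.group(1))
    return f"Layer {number}: {OSI_LAYERS[number]}"


if __name__ == "__main__":
    for text in ("Layer 3", "L4", "L7"):
        print(decode(text))
```

So a tool described as working at “Layer 3” sits at the Network layer, and one described as “L7” sits at the Application layer.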

P99 Conf: 3 Ways to Squash Application Latency

We’ve all been frustrated by latency, either as users of an application, or as developers building such apps. At ScyllaDB’s annual [...] Pekka Enberg, founder and CTO of [...] shared his favorite tips for spotting and removing latency from systems. “Latency lurks everywhere,” said Enberg, who also has authored a [...] once estimated that it loses 1% of sales for every 100 ms of latency. Enberg has thought plenty about ways of reducing latency and has boiled down his solutions into three different approaches: Reduce data movement Continue reading
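Spotting latency usually starts with looking at the tail of the distribution, not the average, which is what the "P99" in the conference name refers to: the 99th-percentile latency. The sketch below uses the nearest-rank definition of a percentile; the sample values are invented for illustration, and other percentile definitions (for example, interpolating ones) can give slightly different numbers.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)   # 1-based rank
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds: most requests are fast,
# and two slow outliers are hidden in the tail.
latencies_ms = [10] * 98 + [250, 900]

if __name__ == "__main__":
    mean = sum(latencies_ms) / len(latencies_ms)
    print("mean:", mean)
    print("p50:", percentile(latencies_ms, 50))
    print("p99:", percentile(latencies_ms, 99))
```

With these made-up numbers the median stays at 10 ms while the 99th percentile is 250 ms, which is why tail percentiles tend to reveal latency problems that averages smooth over.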