An analogous observation that readers may be familiar with is the importance of minimizing costs when investing in order to maximize returns (see Vanguard Principle 3: Minimize cost).

Suppose that a 100-server pool is being monitored and that visibility allows the orchestration system to realize a 10% improvement through better workload scheduling and placement. That increases the pool's capacity by 10% without the need to add 10 servers, and it saves the associated CAPEX/OPEX costs.
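The arithmetic behind the server-pool example can be sketched in a few lines. This is a minimal illustration that assumes capacity scales linearly with the scheduling improvement; the function name `servers_avoided` and its parameters are hypothetical and do not come from the original post.

```python
def servers_avoided(pool_size: int, improvement_pct: int) -> float:
    """Return how many servers' worth of capacity a scheduling gain provides.

    Assumes (for illustration only) that capacity scales linearly:
    an N% efficiency gain on a pool behaves like N% more servers.
    """
    return pool_size * improvement_pct / 100


def effective_capacity(pool_size: int, improvement_pct: int) -> float:
    """Return the pool's capacity, in server equivalents, after the gain."""
    return pool_size + servers_avoided(pool_size, improvement_pct)


if __name__ == "__main__":
    pool, gain = 100, 10  # the figures used in the text
    print(f"Servers avoided: {servers_avoided(pool, gain):.0f}")
    print(f"Effective capacity: {effective_capacity(pool, gain):.0f} server equivalents")
```

Multiplying the avoided server count by a per-server CAPEX/OPEX figure would give the cost saving; no such figure is stated in the text, so it is left out here.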
Some time ago we discovered that certain very slow downloads were getting abruptly terminated and began investigating whether that was a client (i.e. web browser) or server (i.e. us) problem.
Some users were unable to download a binary file a few megabytes in length. The story was simple—the download connection was abruptly terminated even though the file was in the process of being downloaded. After a brief investigation we confirmed the problem: somewhere in our stack there was a bug.
Describing the problem was simple, and reproducing it was easy with a single curl command, but fixing it took a surprising amount of effort.
CC BY 2.0 image by jojo nicdao
In this article I'll describe the symptoms we saw, how we reproduced it and how we fixed it. Hopefully, by sharing our experiences we will save others from the tedious debugging we went through.
Two things caught our attention in the bug report. First, only users on mobile phones were experiencing the problem. Second, the asset causing issues—a binary file—was pretty large, at around 30MB.
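The symptom described above, a slow transfer that ends before the whole file has arrived, can be illustrated with a small throttled reader. This is only a sketch under stated assumptions: it is not the curl command or the test case from the original investigation, and `slow_download`, its parameters, and the in-memory stream standing in for a network response are all hypothetical.

```python
import io
import time


def slow_download(stream, expected_length, bytes_per_second,
                  chunk_size=1024, sleep=time.sleep):
    """Read `stream` at roughly `bytes_per_second` and report completeness.

    Returns (bytes_received, complete). `complete` is False when the stream
    ends before `expected_length` bytes have arrived, which is how an
    abruptly terminated download would look to a client.
    """
    received = 0
    while received < expected_length:
        chunk = stream.read(min(chunk_size, expected_length - received))
        if not chunk:
            # End of stream before the expected length: truncated transfer.
            break
        received += len(chunk)
        # Pause so the average read rate stays near bytes_per_second.
        sleep(len(chunk) / bytes_per_second)
    return received, received == expected_length


if __name__ == "__main__":
    # Hypothetical stand-in for a response that was cut off after 3000 bytes
    # even though 5000 bytes were announced.
    truncated = io.BytesIO(b"x" * 3000)
    got, ok = slow_download(truncated, expected_length=5000,
                            bytes_per_second=100_000)
    print(f"received {got} of 5000 bytes, complete={ok}")
```

Comparing the number of bytes received against the announced length is what distinguishes a truncated download from one that was merely slow.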
After a fruitful session with tcpdump, one of our engineers was able to prepare a test case that reproduced the…
Check out these wireless vendors driving innovation in enterprise WiFi.