War Stories: Always Check Your Inputs
The extremely irregular War Stories series returns with an anecdote from 15 years ago: investigating a problem with a web app that seemed to crash only when one particular person used it. Ultimately it was a simple problem, but it took me a while to track down. I blame Perl.
ISPY With my Little Eye
“ispy” was our custom-built system that archived SMS logs from all our SMSCs, aggregating them onto one server for analysis. Message contents were kept for a short period, while CDRs (sending and receiving numbers and timestamps, but no content) were stored for longer.
The system had a web interface that support staff could use to investigate customer reports of SMS issues. They could enter source and/or destination MSISDNs and see when messages were sent and, potentially, their contents. Access to contents was restricted, and was typically used only for things like abuse investigations.
This system worked well, usually.
Except when it didn’t.
Every few weeks, we’d get reports that L2 support couldn’t access the system. We’d log in, see that one process was using 99% CPU, kill it, and it would be OK for a while. Normally the system was I/O bound, so we…