Searching for the cause of hung tasks in the Linux kernel
Depending on your configuration, the Linux kernel can produce a hung task warning message in its log. Searching the Internet and the kernel documentation, you can find a brief explanation that the kernel process is stuck in the uninterruptable state and hasn’t been scheduled on the CPU for an unexpectedly long period of time. That explains the warning’s meaning, but doesn’t provide the reason it occurred. In this blog post we’re going to explore how the hung task warning works, why it happens, whether it is a bug in the Linux kernel or application itself, and whether it is worth monitoring at all.
The hung task message in the kernel log looks like this:
INFO: task XXX:1495882 blocked for more than YYY seconds.
Tainted: G O 6.6.39-cloudflare-2024.7.3 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:XXX state:D stack:0 pid:1495882 ppid:1 flags:0x00004002
. . .
Processes in Linux can be in different states. Some of them are running or ready to run on the CPU — they are in the TASK_RUNNING
state. Others are waiting for some signal or event to happen, e.g. network packets to arrive or terminal input Continue reading