Introducing runqstat — New Linux Run Queue & Load Average Tool
Getting real CPU Load Average on Linux
Linux Load Average is as old as the hills, and about as confusing: few engineers can tell you what it really measures or how it's calculated.
Fortunately, Brendan Gregg has a great summary, including its history, in Linux Load Averages: Solving the Mystery, where you can see the core problem:
Linux Load Average includes I/O
It’s as simple as that. Unlike other UNIX-like systems (Solaris, AIX, etc.), Linux counts BOTH running/runnable processes AND blocked (uninterruptible) processes, which are usually waiting for I/O. This is a bad thing, and dates back to a 24-year-old patch that Brendan tracked down.
Why do we care?
We care because on any mixed-use system (using both CPU and I/O), the resulting Load Average is highly variable and often useless. This is especially bad with highly threaded applications that do I/O, since it’s easy to have dozens or hundreds of threads blocked on slow disks.
What most of us really want is the traditional Unix measure of Load Average: the number of runnable processes, averaged over time. This tells us how loaded our system is, and we can look at the queue as a measure of saturation, i.e. a continual CPU queue means we are overloaded.
So, how can we get that?
We really want the CPU Run Queue length, which becomes > 0 when processes have to wait for the CPU. But it’s hard to get accurately, and the Linux kernel is not very friendly in this area.
First, what Linux folks call the run queue includes running tasks, so it’s not really the queue, but the queue plus what’s running. That’s confusing, and thus you need to subtract the number of CPUs to get a sense of the real queue.
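That subtraction can be sketched in a few lines of Go. This is an illustrative helper (not runqstat's actual code); in practice the CPU count would come from runtime.NumCPU():

```go
package main

import "fmt"

// effectiveQueue estimates the true CPU queue: the kernel's "run queue"
// count (which includes running tasks) minus the CPUs that can actually
// run them, floored at zero. Hypothetical helper for illustration.
func effectiveQueue(running, ncpu int) int {
	q := running - ncpu
	if q < 0 {
		return 0
	}
	return q
}

func main() {
	// 12 runnable tasks on an 8-CPU box: only 4 are actually waiting.
	fmt.Println(effectiveQueue(12, 8)) // 4
	// 3 runnable tasks on 8 CPUs: nothing is queued at all.
	fmt.Println(effectiveQueue(3, 8)) // 0
}
```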
Several tools can get the current instantaneous run queue size. This is what the vmstat “r” column and “sar -q” show; both get this from /proc/stat’s procs_running metric.
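Extracting that metric yourself is straightforward. Here is a minimal sketch in Go that parses the procs_running line out of /proc/stat-formatted text; on a live Linux box you would feed it the contents of os.ReadFile("/proc/stat"):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// procsRunning pulls the procs_running count out of /proc/stat text.
// Returns -1 if the field is missing or malformed.
func procsRunning(stat string) int {
	for _, line := range strings.Split(stat, "\n") {
		if strings.HasPrefix(line, "procs_running ") {
			n, err := strconv.Atoi(strings.TrimSpace(strings.TrimPrefix(line, "procs_running ")))
			if err != nil {
				return -1
			}
			return n
		}
	}
	return -1
}

func main() {
	// Abbreviated /proc/stat excerpt (values illustrative).
	sample := "cpu  1234 0 567 89012 34 0 5 0 0 0\nprocs_running 3\nprocs_blocked 1\n"
	fmt.Println(procsRunning(sample)) // 3
}
```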
The problem is that this is an instantaneous count, so it’s very noisy and needs to be averaged to be useful, but the kernel won’t do this for you. Ideally there’d be a kernel output in /proc for CPU-only and I/O-only load averages, but no one has created this yet.
So we need a tool that tries to emulate how Load Average works, by rapidly sampling this instant run queue size, and averaging this over time.
runqstat does this — it’s a simple command line tool written in Go that by default samples this queue size every 10ms and averages this over one second. The goal is a stable and accurate measure of how busy the system is.
It includes options to change the sample rate and averaging time, plus to subtract the number of CPUs so you can just see the actual queue size if you want.
It also has an option to calculate the average blocked count, which in theory you can subtract from the Load Average to get a different view of the CPU saturation (and to see how much trouble Load Average is causing you).
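The arithmetic behind that view is simple. Assuming Linux Load Average ≈ runnable tasks + blocked (uninterruptible) tasks, subtracting an averaged blocked count gives a rough CPU-only load; this is illustrative math, not runqstat's exact calculation:

```go
package main

import "fmt"

// cpuOnlyLoad subtracts the average blocked-task count from the
// reported load average, floored at zero, to approximate a
// CPU-only load figure. Illustrative sketch.
func cpuOnlyLoad(loadavg, avgBlocked float64) float64 {
	l := loadavg - avgBlocked
	if l < 0 {
		return 0
	}
	return l
}

func main() {
	// A load average of 12.0 with ~9 tasks blocked on slow disks
	// suggests the real CPU load is only about 3.
	fmt.Println(cpuOnlyLoad(12.0, 9.0)) // 3
}
```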
With this tool, ops teams finally have a good way to monitor their Linux servers’ CPU saturation.
Find it on GitHub — runqstat