Using dstat to analyze important system parameters

Alexandar Sidorov, 123RF

Overview

Dstat can help you figure out why your computer is running slowly, and much more.

Linux users have a large repertoire of tools at their fingertips for gauging the current load on system resources. You may already have used tools like Ifstat, Iostat, Vmstat, and Netstat. These tools and others like them are either aimed at professionals or display only some of the parameters you want to see. Dstat, on the other hand, suits both novices and experienced administrators, because it presents its information in a well-structured, color-coded layout. With a little practice, even less experienced users can spot processes that place a significant load on the system.

Table 1: Dstat Parameters

Switch           Function
-c               displays CPU metrics
-d               outputs bandwidth for disk use
-g               activates paging statistics
-l               load average according to Linux kernel
-m               displays values for the RAM
-n               outputs bandwidth of the network throughput
-s               activates swapping statistics
-y               output of important system values
--disk-tps       number of disk operations per second
--net-packets    number of packets running through the network interfaces
--thermal        reads temperature sensors
--top-cpu        displays the process causing the largest CPU load
--top-io         displays application with the highest disk throughput
--top-mem        displays application with the largest memory use
--bw             activates a different color profile
--nocolor        deactivates all colors

Dag Wieers was the primary developer of Dstat, which is written in Python. The original intention behind the program was to combine the functions of well-known tools such as Ifstat, Iostat, Netstat, and Vmstat, giving users a comprehensive view of network, disk, and memory status. The tool also includes numerous extensions that display metrics for many different applications. While the kernel provides standard measurements in the usual way through the virtual /proc filesystem, the software has its own modules to read values for applications as well.

The program is available as a binary package for all current distributions. In Raspbian, you can install it as root (for example, by prefixing the command with sudo) using:

apt install dstat

Simple, Yet Powerful

Typically, the Debian (and Raspbian) package installs the program to /usr/bin/. The modules which deliver actual functionality can be found in /usr/share/dstat/. If you would like to write your own extension for the program, this directory contains many helpful examples.
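
As a rough illustration of how the module files relate to command-line switches, the following Python sketch lists the plugin files in a directory and derives a switch name from each one. It assumes that plugin files follow the pattern dstat_<name>.py and that underscores in the name correspond to hyphens in the switch (for example, dstat_top_cpu.py for --top-cpu). The function names are hypothetical and not part of Dstat.

```python
import os


def plugin_option(filename):
    """Map a plugin file name such as 'dstat_top_cpu.py' to '--top-cpu'.

    Returns None for files that do not match the assumed naming pattern.
    """
    base = os.path.basename(filename)
    if not (base.startswith("dstat_") and base.endswith(".py")):
        return None
    name = base[len("dstat_"):-len(".py")]
    if not name:
        return None
    return "--" + name.replace("_", "-")


def list_plugin_options(directory="/usr/share/dstat"):
    """Return the sorted switch names for all plugin files in a directory."""
    if not os.path.isdir(directory):
        return []  # directory missing, e.g. dstat is not installed
    options = (plugin_option(entry) for entry in os.listdir(directory))
    return sorted(opt for opt in options if opt is not None)
```

Calling list_plugin_options() without arguments inspects the Debian default directory mentioned above and returns an empty list if that directory does not exist.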

Dstat has numerous options for selecting and reading out relevant information. If you call the tool without any parameters, it will behave as if you had combined the -c, -d, -n, -g, and -y options (Figure 1). You can find additional information on the various options by calling the manpage via man dstat. Table 1 lists the most important ones.

Figure 1: If called without any parameters, dstat will deliver a set of values which may make it possible to locate bottlenecks.

The program displays values in a table with fixed-width columns. The choice of colors is optimized for dark backgrounds. You can switch to a design for light backgrounds with --bw or use --nocolor for a monochrome display.

The first line of output lists the options that are active by default. The next line names the five main areas, below which you will find the individual columns. In our example from Figure 1, the program accordingly shows metrics for the CPU, disk, network, paging, and system.

During testing, the system was idle. The color red, as seen in the last column in Figure 1, only serves to improve readability and does not indicate a problem. However, dstat does use color changes within a single column to indicate a change in conditions, such as a switch from idle to full load. Green always indicates a healthy condition.

To end the program, simply press [Ctrl]+[C]. If you know in advance how long you want the program to run, you can specify this when you call it. For example, dstat 1 5 updates the output at intervals of 1 second and exits after 5 updates, that is, after about 5 seconds.
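
For scripted use, the delay and count arguments can be assembled programmatically. The Python sketch below builds the argument list for a call such as dstat 1 5 and only runs it when a dstat executable is found on the PATH. The helper names build_dstat_command and run_dstat are hypothetical.

```python
import shutil
import subprocess


def build_dstat_command(options=(), delay=1, count=None):
    """Build an argument list of the form: dstat [options] delay [count]."""
    if delay < 1:
        raise ValueError("delay must be at least 1 second")
    if count is not None and count < 1:
        raise ValueError("count must be at least 1")
    command = ["dstat", *options, str(delay)]
    if count is not None:
        command.append(str(count))  # without a count, dstat runs until stopped
    return command


def run_dstat(options=(), delay=1, count=5):
    """Run dstat for a fixed number of updates and return its exit code.

    Returns None when no dstat executable is on the PATH.
    """
    if shutil.which("dstat") is None:
        return None
    return subprocess.call(build_dstat_command(options, delay, count))
```

For instance, run_dstat(["-c", "--top-cpu"], delay=1, count=5) corresponds to typing dstat -c --top-cpu 1 5 at the prompt.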

Processors at a Glance

When reviewing the system to figure out why it is running more slowly than usual, take a look at the CPU metrics. To do this, call dstat as follows:

dstat -c -y -l --proc-count --top-cpu

Figure 2 again shows the CPU and system areas, enabled with -c and -y respectively. In addition, dstat displays the load average (-l), the number of processes (--proc-count), and the process that is using the most CPU resources (--top-cpu).

Figure 2: In addition to standard CPU metrics, this example indicates which process eats up the largest share of computing time.

For a system with multiple processors and cores, the tool summarizes the workload of all of the CPUs. If you would like a detailed report, you can use the -C option together with a comma-separated list of the cores you wish to monitor, for example -C 0,1.

The first group of columns in the output, for the CPU, contains the values that these tools typically measure. These include usr and sys, which indicate what percentage of CPU time was spent in user space and kernel space. The idl column contains the percentage of unused CPU capacity.

This last value is especially important for determining whether there are potential issues. If it is high, the system is largely idle. The values under the wai abbreviation, on the other hand, show the percentage of time the CPU spent waiting for I/O operations to complete. If these values are high, an I/O bottleneck may have formed.

The columns labeled hiq and siq show the percentage of CPU time spent handling hardware and software interrupts. High values indicate heavy use of the system but do not necessarily mean that a problem has occurred.
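
To make these percentages more concrete, the following Python sketch derives them from two samples of the aggregate cpu line in /proc/stat. It is an illustration and not Dstat's actual implementation. It assumes the usual Linux layout in which the first seven fields after the cpu label are user, nice, system, idle, iowait, irq, and softirq, and it counts nice time as part of usr. The function names are hypothetical.

```python
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu" or len(parts) < 1 + len(FIELDS):
        raise ValueError("not an aggregate cpu line: %r" % line)
    return dict(zip(FIELDS, (int(value) for value in parts[1:1 + len(FIELDS)])))


def cpu_percentages(before, after):
    """Return usr, sys, idl, wai, hiq and siq as percentages of elapsed CPU time."""
    delta = {key: after[key] - before[key] for key in FIELDS}
    total = sum(delta.values())

    def pct(*keys):
        if total <= 0:
            return 0.0  # no time elapsed between the two samples
        return 100.0 * sum(delta[key] for key in keys) / total

    return {
        "usr": pct("user", "nice"),  # assumption: nice time counted as usr
        "sys": pct("system"),
        "idl": pct("idle"),
        "wai": pct("iowait"),
        "hiq": pct("irq"),
        "siq": pct("softirq"),
    }
```

In practice, you would read the first line of /proc/stat twice, one interval apart, and pass both parsed samples to cpu_percentages.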

The system field is divided into the int column, for the total number of interrupts, and the csw column, for context switches. A context switch occurs whenever multitasking pauses one process so that another can use the CPU. If the number of context switches is higher than normal, this can indicate that the CPU is unable to keep up with the tasks at hand. However, this holds true only if the idle value referred to above is near zero.
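
Both values are rates derived from counters that only ever increase. The Python sketch below shows the idea using the intr and ctxt lines of /proc/stat, whose first numeric field is the running total of interrupts and context switches respectively. The sample texts and function names are hypothetical, and the code is a simplified illustration rather than Dstat's own logic.

```python
def read_counter(stat_text, key):
    """Return the first numeric field of the /proc/stat line that starts with key."""
    for line in stat_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == key:
            return int(parts[1])
    raise KeyError(key)


def per_second(before_text, after_text, key, interval=1.0):
    """Convert the growth of a counter between two samples into a rate per second."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    growth = read_counter(after_text, key) - read_counter(before_text, key)
    return growth / interval


# Hypothetical samples taken one second apart
before = "cpu  1000 0 500 8000 300 100 100 0 0 0\nintr 500000 12 0 7\nctxt 900000\n"
after = "cpu  1050 0 520 8020 305 102 103 0 0 0\nintr 500250 14 0 9\nctxt 900800\n"

interrupts = per_second(before, after, "intr")        # comparable to the int column
context_switches = per_second(before, after, "ctxt")  # comparable to the csw column
```

With the sample data above, the interrupt counter grows by 250 and the context switch counter by 800 within the one-second interval.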

The third large field, load-avg, shows the system load averaged over the past 1, 5, and 15 minutes as reported by the kernel. In a Linux system, the load value serves as the standard indicator of whether a system is overloaded or idle. The kernel derives the load values from the number of processes that are running or waiting to run. The next column, proc, indicates the current number of processes.

The last column, most-expensive, shows which process is currently consuming the largest share of CPU resources. As for the load values, as long as the load remains below the number of available processor cores, the computer has capacity to spare. When the load rises to more than twice the number of available cores, problems will arise, because the CPU can no longer keep up with the tasks at hand.
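
The rule of thumb for the load values can be written down as a short check. The Python sketch below parses text in the format of /proc/loadavg and compares a load value with the number of cores. The thresholds follow the description above, while the label for the range in between ("busy") and the function names are assumptions made for illustration.

```python
import os


def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages from /proc/loadavg text."""
    fields = text.split()
    if len(fields) < 3:
        raise ValueError("unexpected loadavg format: %r" % text)
    return tuple(float(value) for value in fields[:3])


def classify_load(load, cores):
    """Compare a load value with the number of available cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if load < cores:
        return "spare capacity"
    if load > 2 * cores:
        return "overloaded"
    return "busy"  # assumption: label for loads between 1x and 2x the core count


def current_status(path="/proc/loadavg"):
    """Classify the 1-minute load of the local machine (Linux only)."""
    with open(path) as handle:
        one_minute = parse_loadavg(handle.read())[0]
    return classify_load(one_minute, os.cpu_count() or 1)
```

On a four-core Raspberry Pi, for example, a load of 3.2 would still count as spare capacity under this rule, while a load of 9.5 would be classified as overloaded.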

Each row represents a snapshot of one update interval, which is one second by default. If the CPU measurements contain some surprises, it is a good idea to take a closer look at memory usage.
