Linux System Monitoring

=**System Resources:**=

Being able to monitor the performance of your system is essential. If system resources become too low, it can cause a lot of problems. System resources can be taken up by individual users, or by services your system may host, such as email or web pages. Knowing what is happening can help determine whether system upgrades are needed, or whether some services need to be moved to another machine.

=**Traceroute with ping:**=

**mtr** combines the functionality of traceroute and ping in a single network diagnostic tool.

# **mtr google.com**

=**Finger - user information lookup program**=
**finger** [-**lmsp**] [//user ...//] [//user@host ...//]

Options are:
 * -**s** **Finger** displays the user's login name, real name, terminal name and write status (as a ``*'' after the terminal name if write permission is denied), idle time, login time, office location and office phone number. Login time is displayed as month, day, hours and minutes, unless more than six months ago, in which case the year is displayed rather than the hours and minutes. Unknown devices as well as nonexistent idle and login times are displayed as single asterisks.
 * -**l** Produces a multi-line format displaying all of the information described for the -**s** option as well as the user's home directory, home phone number, login shell, mail status, and the contents of the files ``.plan'', ``.project'', ``.pgpkey'' and ``.forward'' from the user's home directory. Phone numbers specified as eleven digits are printed as ``+N-NNN-NNN-NNNN''. Numbers specified as ten or seven digits are printed as the appropriate subset of that string. Numbers specified as five digits are printed as ``xN-NNNN''. Numbers specified as four digits are printed as ``xNNNN''. If write permission is denied to the device, the phrase ``(messages off)'' is appended to the line containing the device name. One entry per user is displayed with the -**l** option; if a user is logged on multiple times, terminal information is repeated once per login. Mail status is shown as ``No Mail.'' if there is no mail at all, ``Mail last read DDD MMM ## HH:MM YYYY (TZ)'' if the person has looked at their mailbox since new mail arrived, or ``New mail received ...'', ``Unread since ...'' if they have new mail.
 * -**p** Prevents the -**l** option of **finger** from displaying the contents of the ``.plan'', ``.project'' and ``.pgpkey'' files.
 * -**m** Prevent matching of //user// names. //User// is usually a login name; however, matching will also be done on the users' real names, unless the -**m** option is supplied. All name matching performed by **finger** is case insensitive.

If no options are specified, **finger** defaults to the -**l** style output if operands are provided, otherwise to the -**s** style. Note that some fields may be missing, in either format, if information is not available for them. If no arguments are specified, **finger** will print an entry for each user currently logged into the system.

**Finger** may be used to look up users on a remote machine. The format is to specify a //user// as ``**user@host**'' or ``**@host**'', where the default output format for the former is the -**l** style, and the default output format for the latter is the -**s** style. The -**l** option is the only option that may be passed to a remote machine.

If standard output is a socket, **finger** will emit a carriage return (^M) before every linefeed (^J). This is for processing remote finger requests when invoked by fingerd(8).

=**The top command**=

The most common of these commands is **top**. **top** displays a continually updating report of system resource usage.

# **top**
12:10:49 up 1 day, 3:47, 7 users, load average: 0.23, 0.19, 0.10
125 processes: 105 sleeping, 2 running, 18 zombie, 0 stopped
CPU states: 5.1% user 1.1% system 0.0% nice 0.0% iowait 93.6% idle
Mem: 512716k av, 506176k used, 6540k free, 0k shrd, 21888k buff
Swap: 1044216k av, 161672k used, 882544k free 199388k cached
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
2330 root 15 0 161M 70M 2132 S 4.9 14.0 1000m 0 X
2605 weeksa 15 0 8240 6340 3804 S 0.3 1.2 1:12 0 kdeinit
3413 weeksa 15 0 6668 5324 3216 R 0.3 1.0 0:20 0 kdeinit
18734 root 15 0 1192 1192 868 R 0.3 0.2 0:00 0 top
1619 root 15 0 776 608 504 S 0.1 0.1 0:53 0 dhclient
1 root 15 0 480 448 424 S 0.0 0.0 0:03 0 init
2 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 keventd
3 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kapmd
4 root 35 19 0 0 0 SWN 0.0 0.0 0:00 0 ksoftirqd_CPU0
9 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 bdflush
5 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kswapd
10 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kupdated
11 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 mdrecoveryd
15 root 15 0 0 0 0 SW 0.0 0.0 0:01 0 kjournald
81 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 khubd
1188 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kjournald
1675 root 15 0 604 572 520 S 0.0 0.1 0:00 0 syslogd
1679 root 15 0 428 376 372 S 0.0 0.0 0:00 0 klogd
1707 rpc 15 0 516 440 436 S 0.0 0.0 0:00 0 portmap
1776 root 25 0 476 428 424 S 0.0 0.0 0:00 0 apmd

The top portion of the report lists information such as the system time, uptime, CPU usage, physical and swap memory usage, and number of processes. Below that is a list of the processes sorted by CPU utilization.

You can modify the output of top while it is running. If you hit i, top will no longer display idle processes. Hit i again to see them again. Hitting M will sort by memory usage, T will sort by how much CPU time the processes have used, and P will sort by CPU usage again.

In addition to viewing options, you can also modify processes from within the top command. You can use u to view processes owned by a specific user, k to kill processes, and r to renice them.

For more in-depth information about processes you can look in the /proc filesystem. In the /proc filesystem you will find a series of subdirectories with numeric names. These directories are associated with the process IDs of currently running processes. In each directory you will find a series of files containing information about the process.

=**The iostat command**=

**iostat** displays average CPU utilization and disk I/O information. This is a great command to monitor your disk I/O usage.

# **iostat**
Linux 2.4.20-24.9 (myhost) 12/23/2003
avg-cpu: %user %nice %sys %idle
62.09 0.32 2.97 34.62
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
dev3-0 2.22 15.20 47.16 1546846 4799520

For 2.4 kernels the device is named using the device's major and minor number. In this case the device listed is /dev/hda. To have iostat print this out for you, use the -x option.

# **iostat -x**
Linux 2.4.20-24.9 (myhost) 12/23/2003
avg-cpu: %user %nice %sys %idle
62.01 0.32 2.97 34.71
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
/dev/hdc 0.00 0.00 .00 0.00 0.00 0.00 0.00 0.00 0.00 2.35 0.00 0.00 14.71
/dev/hda 1.13 4.50 .81 1.39 15.18 47.14 7.59 23.57 28.24 1.99 63.76 70.48 15.56
/dev/hda1 1.08 3.98 .73 1.27 14.49 42.05 7.25 21.02 28.22 0.44 21.82 4.97 1.00
/dev/hda2 0.00 0.51 .07 0.12 0.55 5.07 0.27 2.54 30.35 0.97 52.67 61.73 2.99
/dev/hda3 0.05 0.01 .02 0.00 0.14 0.02 0.07 0.01 8.51 0.00 12.55 2.95 0.01

The iostat man page contains a detailed explanation of what each of these columns means.
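As a rough illustration of putting the extended report to use, the sketch below picks out devices whose %util exceeds a limit. The function name busy_devices and its threshold argument are made up for this example, and it assumes the report has a header line beginning with Device that contains a %util column, as in the sample output in this section; column layouts differ between iostat versions.

```shell
# busy_devices LIMIT
# Reads "iostat -x" style output on stdin and prints each device whose
# %util value is greater than LIMIT. (Hypothetical helper, not part of iostat.)
busy_devices() {
    awk -v limit="$1" '
        # Header line: remember which column holds %util.
        $1 ~ /^Device/ {
            col = 0
            for (i = 1; i <= NF; i++) if ($i == "%util") col = i
            next
        }
        # Data lines after the header: compare numerically.
        col && NF >= col && ($col + 0) > (limit + 0) { print $1, $col }
    '
}

# Example: iostat -x | busy_devices 10
```

Looking the column up by name rather than by position keeps the sketch usable when a given iostat version adds or reorders columns.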

=**The ps command**=

**ps** provides a list of the processes currently running. This command offers a wide variety of options. A common use is to list all processes currently running. To do this you would use the **ps -ef** command. (Screen output from this command is too large to include; the following is only a partial output.)

user@server:~> **ps -ef**
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 Dec22 ? 00:00:03 init
root 2 1 0 Dec22 ? 00:00:00 [keventd]
root 3 1 0 Dec22 ? 00:00:00 [kapmd]
root 4 1 0 Dec22 ? 00:00:00 [ksoftirqd_CPU0]
root 9 1 0 Dec22 ? 00:00:00 [bdflush]
root 5 1 0 Dec22 ? 00:00:00 [kswapd]
root 6 1 0 Dec22 ? 00:00:00 [kscand/DMA]
root 7 1 0 Dec22 ? 00:01:28 [kscand/Normal]
root 8 1 0 Dec22 ? 00:00:00 [kscand/HighMem]
root 10 1 0 Dec22 ? 00:00:00 [kupdated]
root 11 1 0 Dec22 ? 00:00:00 [mdrecoveryd]
root 15 1 0 Dec22 ? 00:00:01 [kjournald]
root 81 1 0 Dec22 ? 00:00:00 [khubd]
root 1188 1 0 Dec22 ? 00:00:00 [kjournald]
root 1675 1 0 Dec22 ? 00:00:00 syslogd -m 0
root 1679 1 0 Dec22 ? 00:00:00 klogd -x
rpc 1707 1 0 Dec22 ? 00:00:00 portmap
root 1813 1 0 Dec22 ? 00:00:00 /usr/sbin/sshd
ntp 1847 1 0 Dec22 ? 00:00:00 ntpd -U ntp
root 1930 1 0 Dec22 ? 00:00:00 rpc.rquotad
root 1934 1 0 Dec22 ? 00:00:00 [nfsd]
root 1942 1 0 Dec22 ? 00:00:00 [lockd]
root 1943 1 0 Dec22 ? 00:00:00 [rpciod]
root 1949 1 0 Dec22 ? 00:00:00 rpc.mountd
root 1961 1 0 Dec22 ? 00:00:00 /usr/sbin/vsftpd /etc/vsftpd/vsftpd.conf
root 2057 1 0 Dec22 ? 00:00:00 /usr/bin/spamd -d -c -a
root 2066 1 0 Dec22 ? 00:00:00 gpm -t ps/2 -m /dev/psaux

In the **ps -ef** output, the first column shows who owns the process. The second column is the process ID. The third column is the parent process ID. This is the process that generated, or started, the process. The fourth column is the CPU usage (in percent). The fifth column is the start time, or date if the process has been running long enough. The sixth column is the tty associated with the process, if applicable. The seventh column is the cumulative CPU usage (total amount of CPU time it has used while running). The eighth column is the command itself. With this information you can see exactly what is running on your system and kill runaway processes, or those that are causing problems.

To list your own processes in a user-oriented format, use **ps u**:

hottub:~$ **ps u**
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
username 11307 0.0 0.7 1736 980 pts/0 S 10:05 0:00 -bash
username 11332 0.0 0.5 2348 716 pts/0 R 10:06 0:00 ps u

To get information about processes other than your own:

$ **ps aux**

For a wider output that wraps:

$ **ps auxw**

The **who** output in the Monitoring Users section shows that user aweeks is logged on to both pts/1 and pts/2, but what if we want to see what they are doing? We could do a **ps -u aweeks** and get the following output:

user@server:~> **ps -u aweeks**
20876 pts/1 00:00:00 bash
20904 pts/2 00:00:00 bash
20951 pts/2 00:00:00 ssh
21012 pts/1 00:00:00 ps

From this we can see that the user is running ps and ssh.

Sometimes a user will have a runaway process that needs to be stopped, or you will need to stop a program that's running in the background. In these cases, you'll use the **kill** command. Get the PID (Process Identification) of your **bash** shell by using **ps u**, then kill it:

$ **kill -9 YOUR_PID**

(-9 sends the KILL signal, which forces the kill to happen at the system level because the process cannot catch or ignore it. Since this kills your shell, you will need to log in again to start a new one.)
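Before killing a process it can help to confirm what it is. The numeric directories under /proc described in the top section hold that information, and the sketch below reads a few fields from /proc/PID/status. The function name proc_summary is invented for this example; the field names (Name, State, Pid, PPid, Uid) are the ones found in the status file on Linux.

```shell
# proc_summary PID
# Prints the name, state, PID, parent PID and real UID of a process by
# reading /proc/PID/status. Only the first word of each value is shown.
# (Hypothetical helper name; Linux only.)
proc_summary() {
    status_file="/proc/$1/status"
    if [ ! -r "$status_file" ]; then
        echo "proc_summary: no readable entry for PID $1" >&2
        return 1
    fi
    awk '$1 == "Name:" || $1 == "State:" || $1 == "Pid:" ||
         $1 == "PPid:" || $1 == "Uid:" { print $1, $2 }' "$status_file"
}

# Example: proc_summary $$    (describes the current shell)
```

The PPid value is the same parent process ID that **ps -ef** reports in its third column.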

=**The vmstat command**=

**vmstat** provides a report showing statistics for system processes, memory, swap, I/O, and the CPU. The first report gives averages since the last reboot. When vmstat is run with a delay argument, each later report covers the preceding sampling interval.

# **vmstat**
procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
0 0 0 181604 17000 26296 201120 0 2 8 24 149 9 61 3 36

The following was taken from the vmstat man page.

**FIELD DESCRIPTIONS**

**Procs**
 * r: The number of processes waiting for run time.
 * b: The number of processes in uninterruptible sleep.
 * w: The number of processes swapped out but otherwise runnable. This field is calculated, but Linux never desperation swaps.

**Memory**
 * swpd: the amount of virtual memory used (kB).
 * free: the amount of idle memory (kB).
 * buff: the amount of memory used as buffers (kB).

**Swap**
 * si: Amount of memory swapped in from disk (kB/s).
 * so: Amount of memory swapped to disk (kB/s).

**IO**
 * bi: Blocks received from a block device (blocks/s).
 * bo: Blocks sent to a block device (blocks/s).

**System**
 * in: The number of interrupts per second, including the clock.
 * cs: The number of context switches per second.

**CPU** (these are percentages of total CPU time)
 * us: user time
 * sy: system time
 * id: idle time

=**The lsof command**=

The **lsof** command will print out a list of every file that is in use. Since Linux considers everything a file, this list can be very long. However, this command can be useful in diagnosing problems. An example of this is if you wish to unmount a filesystem, but you are being told that it is in use. You could use this command and **grep** for the name of the filesystem to see who is using it.
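The lsof-and-grep approach just described can be narrowed so that only the file name column is matched. This is a sketch only: mount_users is a made-up helper, it assumes the default lsof column layout (COMMAND, PID, USER, and so on, with NAME last) and file names without spaces, and matching on a path prefix is only an approximation of "on that filesystem".

```shell
# mount_users MOUNTPOINT
# Reads lsof output on stdin and prints the command, PID and user of every
# entry whose NAME (last column) is the mount point or lies beneath it.
# (Hypothetical helper, not part of lsof.)
mount_users() {
    awk -v mnt="$1" '
        BEGIN { prefix = (mnt == "/") ? "/" : mnt "/" }
        NR == 1 { next }                       # skip the header line
        $NF == mnt || index($NF, prefix) == 1 { print $1, $2, $3 }
    ' | sort -u
}

# Example: lsof | mount_users /home
```

Compared with a plain grep, this avoids false matches such as a directory named /homework when looking for /home.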

Or suppose you want to see all files in use by a particular process. To do this you would use **lsof -p** //processid//.

=**The df command**=

**df** is the simplest tool available to view disk usage. Simply type in df and you'll be shown disk usage for all your mounted filesystems in 1K blocks.

user@server:~> **df**
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/hda3 5242904 759692 4483212 15% /
tmpfs 127876 8 127868 1% /dev/shm
/dev/hda1 127351 33047 87729 28% /boot
/dev/hda9 10485816 33508 10452308 1% /home
/dev/hda8 5242904 932468 4310436 18% /srv
/dev/hda7 3145816 32964 3112852 2% /tmp
/dev/hda5 5160416 474336 4423928 10% /usr
/dev/hda6 3145816 412132 2733684 14% /var

You can also use the **-h** option to see the output in "human-readable" format. This will be in K, Megs, or Gigs depending on the size of the filesystem. Alternatively, you can use the **-B** option to specify the block size.

In addition to space usage, you can use the **-i** option to view the number of used and available inodes.

user@server:~> **df -i**
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/hda3 0 0 0 - /
tmpfs 31969 5 31964 1% /dev/shm
/dev/hda1 32912 47 32865 1% /boot
/dev/hda9 0 0 0 - /home
/dev/hda8 0 0 0 - /srv
/dev/hda7 0 0 0 - /tmp
/dev/hda5 656640 26651 629989 5% /usr
/dev/hda6 0 0 0 - /var
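One common use of these reports is to flag filesystems that are filling up. The following is a minimal sketch, assuming POSIX-format output from **df -P** (one line per filesystem, usage percentage in the fifth column, mount point in the sixth) and mount points without spaces. The function name full_filesystems and its threshold argument are hypothetical.

```shell
# full_filesystems PERCENT
# Reads "df -P" output on stdin and prints the mount point and usage of
# every filesystem whose usage is at or above PERCENT.
# (Hypothetical helper, not part of df.)
full_filesystems() {
    awk -v limit="$1" '
        NR == 1 { next }                 # skip the header line
        {
            use = $5
            sub(/%$/, "", use)           # "28%" becomes "28"
            if (use ~ /^[0-9]+$/ && use + 0 >= limit + 0) print $6, $5
        }
    '
}

# Example: df -P | full_filesystems 90
```

Rows whose usage column is not a number, such as the "-" shown for some filesystems in the inode report, are skipped rather than treated as zero.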

=**The du command**=

Now that you know how much space has been used on a filesystem, how can you find out where that data is? To view usage by a directory or file you can use **du**.

user@server:~> **du file.txt**
1300 file.txt

Or, as with df, you can use the **-h** option and get the same output in "human-readable" form.

user@server:~> **du -h file.txt**
1.3M file.txt

Unless you specify a filename, du will act recursively.

user@server:~> **du -h /usr/local**
4.0K /usr/local/games
16K /usr/local/include/nessus/net
180K /usr/local/include/nessus
208K /usr/local/include
62M /usr/local/lib/nessus/plugins/.desc
97M /usr/local/lib/nessus/plugins
164K /usr/local/lib/nessus/plugins_factory
97M /usr/local/lib/nessus
12K /usr/local/lib/pkgconfig
2.7M /usr/local/lib/ladspa
104M /usr/local/lib
112K /usr/local/man/man1
4.0K /usr/local/man/man2
4.0K /usr/local/man/man3
4.0K /usr/local/man/man4
16K /usr/local/man/man5
4.0K /usr/local/man/man

If you just want a summary of that directory you can use the **-s** option.

user@server:~> **du -hs /usr/local**
210M /usr/local

=**Monitoring Users**=

Just because you're paranoid doesn't mean they AREN'T out to get you...
Source Unknown

From time to time there are going to be occasions where you will want to know exactly what people are doing on your system. Maybe you notice that a lot of RAM is being used, or a lot of CPU activity. You are going to want to see who is on the system, what they are running, and what kind of resources they are using.

=**The who command**=

The easiest way to see who is on the system is to do a **who** or **w**. **who** is a simple tool that lists who is logged on to the system and what port or terminal they are logged on at.

user@server:~> **who**
bjones pts/0 May 23 09:33
wally pts/3 May 20 11:35
aweeks pts/1 May 22 11:03
aweeks pts/2 May 23 15:04

=**The w command**=

Even easier than using the who and **ps -u** commands is to use **w**. **w** will print out not only who is on the system, but also the commands they are running.

user@server:~> **w**
aweeks :0 09:32 ?xdm? 30:09 0.02s -:0
aweeks pts/0 09:33 5:49m 0.00s 0.82s kdeinit: kded
aweeks pts/2 09:35 8.00s 0.55s 0.36s vi sag-0.9.sgml
aweeks pts/1 15:03 59.00s 0.03s 0.03s /bin/bash

From this we can see that I have a KDE session running, I'm working in this document :-), and have another terminal open sitting idle at a bash prompt.
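When many people are logged in, the **who** listing can be easier to read as a count of sessions per user. A small sketch, assuming the login name is the first whitespace-separated field of each line, as in the who output shown in this section; sessions_per_user is a name chosen for this example.

```shell
# sessions_per_user
# Reads who-style output on stdin and prints "user count" lines, one per
# login name, sorted by name. (Hypothetical helper, not part of who.)
sessions_per_user() {
    awk 'NF > 0 { count[$1]++ }
         END { for (user in count) print user, count[user] }' | sort
}

# Example: who | sessions_per_user
```

For the who listing in this section, that would report two sessions for aweeks and one each for bjones and wally.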