Processes and Daemons
- Fundamentally, kernels provide a few logical constructs that mediate access to either real or virtual
resources. The two most important in Unix are processes and filesystems.
- You can view the characteristics of processes on a Unix machine with a variety of programs, including
ps, top, lsof, and ls.
What Unix/Linux system administrators see: ps
[user@localhost]$ cat /etc/redhat-release
Fedora release 11 (Leonidas)
[user@localhost]$ ps -elf # Sys V syntax ; Berkeley is more like ps alxwwww
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
4 S root 1 0 0 80 0 - 1020 poll_s Aug25 ? 00:00:00 /sbin/init
1 S root 29 2 0 80 0 - 0 pdflus Aug25 ? 00:00:00 [pdflush]
1 S root 31 2 0 75 -5 - 0 kswapd Aug25 ? 00:00:06 [kswapd0]
0 S root 1260 1 0 80 0 - 2783 wait Aug25 ? 00:00:00 /bin/sh /command/svscanboot
0 S root 1283 1260 0 80 0 - 985 hrtime Aug25 ? 00:00:01 svscan /service
0 S root 1289 1283 0 80 0 - 942 poll_s Aug25 ? 00:00:00 supervise dnscache
0 S root 1290 1283 0 80 0 - 942 poll_s Aug25 ? 00:00:00 supervise log
4 S 501 1291 1289 0 80 0 - 1326 poll_s Aug25 ? 00:00:01 /usr/local/bin/dnscache
4 S Gdnslog 1292 1290 0 80 0 - 978 pipe_w Aug25 ? 00:00:00 multilog t ./main
4 S root 1659 1 0 80 0 - 42145 epoll_ Aug25 ? 00:00:00 cupsd -C /etc/cups/cupsd.conf
5 S ntp 1897 1 0 80 0 - 7985 poll_s Aug25 ? 00:00:00 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
5 S root 1954 1 0 80 0 - 19398 poll_s Aug25 ? 00:00:00 sendmail: accepting connections
1 S smmsp 1962 1 0 80 0 - 15739 pause Aug25 ? 00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
1 S root 1974 1 0 80 0 - 25073 hrtime Aug25 ? 00:00:00 crond
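A listing like the one above can also be picked apart programmatically. The following is a minimal sketch rather than a robust parser: it assumes the fifteen-column ps -elf layout shown here, and the helper names parse_ps_line and looks_like_daemon are made up for illustration.

```python
# Column names taken from the header line of the ps -elf listing above.
PS_COLUMNS = "F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD".split()


def parse_ps_line(line):
    """Split one ps -elf output line into a dict keyed by column name."""
    # CMD may itself contain spaces, so split at most len(PS_COLUMNS) - 1 times.
    fields = line.split(None, len(PS_COLUMNS) - 1)
    if len(fields) != len(PS_COLUMNS):
        raise ValueError("unexpected ps line: %r" % line)
    return dict(zip(PS_COLUMNS, fields))


def looks_like_daemon(proc):
    """Rough heuristic (an assumption): child of init with no terminal."""
    return proc["PPID"] == "1" and proc["TTY"] == "?"


if __name__ == "__main__":
    sample = "1 S root 1974 1 0 80 0 - 25073 hrtime Aug25 ? 00:00:00 crond"
    proc = parse_ps_line(sample)
    print(proc["PID"], proc["CMD"], looks_like_daemon(proc))
```

The heuristic is deliberately crude; kernel threads and supervised processes (such as dnscache under supervise in the listing) will not match it.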
What Unix/Linux system administrators see: top
[root@localhost root]# top -b -n1 # run in batch mode for one iteration
08:17:41 up 1 day, 18:12, 2 users, load average: 9.69, 9.14, 8.89
115 processes: 114 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: cpu user nice system irq softirq iowait idle
total 0.0% 0.0% 0.9% 0.0% 0.9% 0.0% 98.0%
Mem: 510344k av, 392504k used, 117840k free, 0k shrd, 17208k buff
240368k actv, 55488k in_d, 4760k in_c
Swap: 522104k av, 90392k used, 431712k free 72852k cached
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
1090 root 20 0 1088 1088 832 R 0.9 0.2 0:00 0 top
1 root 15 0 492 456 432 S 0.0 0.0 0:08 0 init
3 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 keventd
What Unix/Linux system administrators see: lsof
[root@localhost root]# lsof # heavily redacted to fit on page
COMMAND PID USER NODE NAME
sendmail 20824 root 159526 /lib/libcrypt-2.3.2.so
sendmail 20824 root 159568 /lib/libcrypto.so.0.9.7a
sendmail 20824 root 319023 /usr/lib/libldap.so.2.0.17
sendmail 20824 root 32286 /usr/lib/sasl/libcrammd5.so.1.0.19
sendmail 20824 root 32104 /usr/kerberos/lib/libk5crypto.so.3.0
sendmail 20824 root 32095 /lib/tls/libdb-4.2.so
sendmail 20824 root 318943 /usr/lib/libz.so.1.1.4
sendmail 20824 root 65611 /dev/null
sendmail 20824 root TCP anothermachine.com:smtp->10.1.1.20:
sendmail 20824 root 65611 /dev/null
sendmail 20824 root 16220 socket
sendmail 20824 root TCP anothermachine.com:smtp->10.1.1.20:
sendmail 20824 root TCP localhost.localdomain:48512->localh
sendmail 20824 root TCP anothermachine.com:smtp->10.1.1.20:
Processes and Daemons : fork and clone
-
One of the most important logical resources in Unix is that of processes.
-
A new process is created by fork; or, alternatively, in Linux with clone since processes
and threads are both just task_struct in Linux.
-
With clone, memory, file descriptors, and signal handlers can remain shared
between parent and child, depending on the flags passed (such as CLONE_VM, CLONE_FILES, and CLONE_SIGHAND).
-
With fork, these are copied, not shared.
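The difference between sharing and copying can be seen from user space. The sketch below is an illustration under stated assumptions, not a direct use of clone: it relies on Python threads (which on Linux are created with a clone that shares memory) to stand in for the shared case, and on os.fork for the copied case. The function names are made up for this example.

```python
import os
import threading


def mutate_in_thread(shared):
    """Append to the list from a second thread; memory is shared."""
    worker = threading.Thread(target=shared.append, args=("thread",))
    worker.start()
    worker.join()


def mutate_in_fork(shared):
    """Append to the list in a forked child; only the child's copy changes."""
    pid = os.fork()
    if pid == 0:
        shared.append("child")  # modifies the child's private copy only
        os._exit(0)             # leave the child without running parent code
    os.waitpid(pid, 0)          # reap the child so it does not become a zombie


if __name__ == "__main__":
    data = []
    mutate_in_thread(data)
    mutate_in_fork(data)
    print(data)  # the parent sees the thread's change but not the child's
```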
Starting a Unix/Linux process
- exec instantiates a new executable:
-
Usually, when doing an exec, the named file is loaded into the current process's memory
space.
-
If the executable is dynamically linked, then the dynamic loader maps in the necessary bits (not done if
the binary is statically linked.)
-
The code in the initial .text section is then executed. (There are three main types of sections:
-
.text sections for executable code
-
.data sections (including read-only .rodata sections)
-
.bss sections (Block Started by Symbol), which contain ``uninitialized'' data)
-
SCRIPTS: However, if the first two characters of the file are #! and the following characters name
a valid pathname to an executable file, then that executable is loaded into memory instead. This is done for "scripts", such
as bash, Perl, and Python scripts.
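The #! check described above can be imitated in user space. This is only a sketch of the idea, not the kernel's implementation: the function name script_interpreter is hypothetical, and the choice to treat everything after the interpreter path as one optional argument follows Linux behavior, which other systems may not share.

```python
def script_interpreter(path):
    """Return (interpreter, optional_argument) if the file starts with #!,
    otherwise None."""
    with open(path, "rb") as handle:
        first_line = handle.readline()
    if not first_line.startswith(b"#!"):
        return None
    # Everything after the interpreter path is kept as a single argument.
    parts = first_line[2:].strip().split(None, 1)
    if not parts:
        return None
    interpreter = parts[0].decode()
    argument = parts[1].decode() if len(parts) > 1 else None
    return interpreter, argument


if __name__ == "__main__":
    import sys
    for name in sys.argv[1:]:
        print(name, script_interpreter(name))
```

Unlike the kernel, this sketch does not check that the named interpreter exists or is executable.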
Some Typical Assembly Code
.file "syslog.c" # the source file name for this code
.data # a data section
.align 4 # put PC on 4 (or 16) byte alignment
.type LogFile,@object # create a reference of type object
.size LogFile,4 # and give it 4 bytes in size
LogFile: # address for object
.long -1 # initialize to a value of -1
.align 4 # align . to 4 (16) byte
.type LogStat,@object # a new object reference is created
.size LogStat,4 # give it 4 bytes also
LogStat: # here's its address in memory
.long 0 # and initialize it to zero
.section .rodata # here's a ``read-only'' section
.LC0: # local label for a string
.string "syslog" # initialized to "syslog"
[ ... ]
.text # now we have some executable code
.globl syslog # and it is a global symbol for
.type syslog,@function # a function syslog()
syslog:
pushl %ebp # and away we go...
movl %esp, %ebp
subl $8, %esp
Daemon processes
When we refer to a daemon process, we are referring to a process with these
characteristics:
-
Generally persistent (though it may spawn temporary helper processes like xinetd does)
-
No controlling terminal (the controlling tty process group, tpgid, is shown as -1 in ps)
-
Parent process is generally init (process 1)
-
Generally has its own process group id and session id.
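The last two characteristics usually come from a call to setsid after a fork. The sketch below shows only that one step, under the assumption of a POSIX system; a real daemon typically also forks a second time, changes directory, and closes inherited file descriptors. The function name spawn_session_leader is invented for this example.

```python
import os


def spawn_session_leader():
    """Fork a child that calls setsid() and reports (pid, pgid, sid)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: not a process group leader, so setsid() is allowed.
        os.close(read_fd)
        os.setsid()  # new session, new process group, no controlling terminal
        report = "%d %d %d" % (os.getpid(), os.getpgrp(), os.getsid(0))
        os.write(write_fd, report.encode())
        os._exit(0)
    # Parent: collect the child's report and reap it.
    os.close(write_fd)
    data = b""
    while True:
        chunk = os.read(read_fd, 64)
        if not chunk:
            break
        data += chunk
    os.close(read_fd)
    os.waitpid(pid, 0)
    child_pid, pgid, sid = (int(value) for value in data.split())
    return child_pid, pgid, sid


if __name__ == "__main__":
    print(spawn_session_leader())
```

After setsid the child's process group id and session id both equal its own pid, which is what ps reports for most daemons.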
Daemon processes
Generally a daemon provides a service. So why not put such services in the kernel?
-
Another level of modularity that is easy to control
-
Let's keep from growing the already largish kernel
-
Ease (and safety) of killing and restarting processes
-
Logically, daemons generally share the characteristics one expects of ordinary user processes (except for the lack
of controlling terminal.)
BSD-ish: Kernel and user daemons: swapper
-
An increasing number of Unix/Linux daemons execute in kernel mode. pagedaemon and swapper are two early examples from the BSD world, but the
number of such daemons has been growing in recent years.
-
BSD swapper (pid 0) daemon : The BSD swapper is a kernel daemon. swapper moves whole processes between main memory and secondary storage (swapping out and swapping in) as part of the
operating system's virtual memory system.
-
In BSD-land, the swapper is the first process to start after the kernel is loaded.
(If the machine crashes immediately after the kernel is loaded then you may not have your swap space configured correctly.)
-
The swapper is described as a separate kernel process in other non-BSD UNIXes.
For example, it appears in the Linux process table as kswapd and in the Solaris/OpenIndiana process table
as sched (the SysV swapper was sometimes called the scheduler because it 'scheduled' the allocation of memory and thus influenced the CPU scheduler).
BSD: Kernel and user daemons: pagedaemon
-
BSD pagedaemon. In days gone by, the third process created by the BSD kernel was always the pagedaemon
and always had pid 2. These days, it's just another in the rapidly proliferating ``kernel processes'' in BSD.
The pagedaemon as a kernel process originated with BSD systems (demand paging was initially a BSD feature) and was later adopted by AT&T. The pageout
process (still pid 2) in Solaris/OpenIndiana provides the same function under a different name.
-
This is all automatic — not much for a system administrator to do, except monitor system behavior to make sure the system isn't thrashing. You might expect to see this process taking up a lot of cpu time if there were thrashing.
Kernel and user daemons: init
TRADITIONAL: init (pid 1) daemon: The first ``user'' process started by the kernel; its userid is 0. All other ``normal'' processes are descendants of init. Depending on the boot parameters, init might do something along these lines:
- Spawn a single-user shell at the console.
- Begin the multi-user start-up scripts (which are, unfortunately, not standardized across UNIXes; see pp. 32-41 and pp. 886-887 in LAH)
- Perhaps start up daemontools "svscan" (probably indicated by something like SV:123456:respawn:/command/svscanboot in /etc/inittab)
- Or start systemd, initng, upstart, or !?!?
There is a lot of flux in this area; we saw, for instance, in Fedora 11-13 replacement of the old SysV init with upstart, but now Fedora 14-17 have moved to systemd; hopefully, whatever the engine, we can get better dependency resolution than we have had previously and faster boot times. (Take a look at /etc/event.d on Fedora for instance.)
While systemd can support old AT&T scripts, it is designed instead to process any startup parameters itself rather than execute a standalone script.
Kernel and user daemons: update (aka bdflush/kupdate and fsflush)
- update daemons: An update daemon executes the sync() system call every 30 seconds or so. The sync() system call flushes the system buffer cache; it is desirable because UNIX uses delayed write when buffering file I/O to and from disk. These go under a variety of names, such as variants of update, flush, and sync.
- It's best not to just turn off a UNIX machine without flushing the buffer cache. It is better to halt the system using /etc/shutdown, /etc/halt, or poweroff; these commands attempt to put the system in a quiescent state (including calling sync()).
- I like to do something like sync ; sync ; poweroff or sync ; sync ; reboot just to make sure a few manual synchronizations are made. When I am removing a USB drive, I like to do something like sync ; umount /media/disk ; sync .
- The update daemon goes by many names (also see pdflush, bdflush(2), and bdflush(8) in Linux and fsflush in Solaris).
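A program that cannot wait for the periodic flush can request one itself. The following is a minimal sketch, assuming a Unix system and Python 3.3 or later (where os.sync is available); the function name write_and_flush is made up for illustration.

```python
import os


def write_and_flush(path, data):
    """Write data to path and push it out of the buffer cache."""
    with open(path, "wb") as handle:
        handle.write(data)
        handle.flush()             # move data from Python's buffer to the kernel
        os.fsync(handle.fileno())  # ask the kernel to write this file to disk
    os.sync()                      # system-wide flush, like the sync command


if __name__ == "__main__":
    write_and_flush("example.dat", b"important bytes\n")
```

os.fsync covers a single file, while os.sync asks for every dirty buffer to be scheduled for writing, which is roughly what the update daemon does on a timer.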
Comments in the code: what Linux kernel comments say about dirty buffers and pages
/*
* The relationship between dirty buffers and dirty pages:
*
* Whenever a page has any dirty buffers, the page's dirty bit is set, and
* the page is tagged dirty in its radix tree.
*
* At all times, the dirtiness of the buffers represents the dirtiness of
* subsections of the page. If the page has buffers, the page dirty bit is
* merely a hint about the true dirty state.
*
* When a page is set dirty in its entirety, all its buffers are marked dirty
* (if the page has buffers).
*
* When a buffer is marked dirty, its page is dirtied, but the page's other
* buffers are not.
*
* Also. When blockdev buffers are explicitly read with bread(), they
* individually become uptodate. But their backing page remains not
* uptodate - even if all of its buffers are uptodate. A subsequent
* block_read_full_page() against that page will discover all the uptodate
* buffers, will set the page uptodate and will perform no I/O.
*/
(from fs/buffer.c in kernel 3.9.4)
Kernel and user daemons: inetd and xinetd
- Even though well-written daemons consume little CPU time, they do take up virtual memory and process table entries.
- Years ago, as people created new services, the idea of a super-daemon inetd was created to manage the class of network daemons.
- Many network servers were mediated by the inetd daemon at connect time, though some, such as sendmail, postfix, qmail, and sshd were not typically under inetd.
- The original inetd listened for requests for connections on behalf of the various network services and then started the appropriate daemon, handing off the network connection pointers to the daemon.
- Some examples are pserver, rlogin, telnet, ftp, talk, and finger.
- The configuration file that told inetd which servers to manage was /etc/inetd.conf.
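The hand-off that inetd performs can be sketched in a few lines. This is a simplified illustration, not inetd's actual code: it handles a single TCP connection, waits for the server to finish (real inetd normally does not for nowait services), and the function name serve_one_connection is hypothetical.

```python
import os
import socket


def serve_one_connection(listener, argv):
    """Accept one connection and run argv with the socket as stdin/stdout.

    Returns the exit status of the server program.
    """
    connection, _peer = listener.accept()
    pid = os.fork()
    if pid == 0:
        try:
            # Child: make the network connection file descriptors 0 and 1,
            # then replace this process with the real server program.
            os.dup2(connection.fileno(), 0)
            os.dup2(connection.fileno(), 1)
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # reached only if the exec failed
    # Parent: the child owns the connection now.
    connection.close()
    _pid, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    print("listening on port", listener.getsockname()[1])
    serve_one_connection(listener, ["cat"])  # "cat" acts as an echo service
```

Because the server program only sees stdin and stdout, it needs no networking code of its own, which is what made it cheap to put small services under inetd.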
Amusingly enough, this very same line of reasoning is being revived by systemd; see this blog posting by its author. (Note that daemontools has also used a related idea since 2001, but more for monitoring purposes.)
Kernel and user daemons: inetd and xinetd
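- The /etc/services file: This file maps TCP and UDP service names to port numbers.
- The /etc/inetd.conf file: This file has one line per service, with the fields service name, socket type, protocol, wait/nowait, user, server program, and server program arguments (see pages 887-893 in LAH and ``man inetd.conf'').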
- The successor to inetd was xinetd, which combined standard inetd functions with other useful features, such as logging and access control.
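To make the classic inetd.conf layout concrete, here is a small sketch that splits one entry into its seven traditional fields (service name, socket type, protocol, wait/nowait, user, server program, server arguments). The sample entry in the demo is illustrative and not taken from any particular machine, and the names InetdEntry and parse_inetd_line are assumptions for this example.

```python
from collections import namedtuple

InetdEntry = namedtuple(
    "InetdEntry",
    "service socket_type protocol wait user server args",
)


def parse_inetd_line(line):
    """Parse one inetd.conf line; return None for blanks and comments."""
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    fields = line.split()
    if len(fields) < 6:
        raise ValueError("too few fields: %r" % line)
    # Everything after the server path is the argument vector (argv[0] first).
    return InetdEntry(*fields[:6], args=fields[6:])


if __name__ == "__main__":
    # Illustrative entry only.
    entry = parse_inetd_line("ftp stream tcp nowait root /usr/sbin/in.ftpd in.ftpd -l")
    print(entry.service, entry.server, entry.args)
```

The service name in the first field is looked up in /etc/services to find the port on which to listen.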
Kernel and user daemons: inetd and xinetd
xinetd is configured through /etc/xinetd.conf and the files in /etc/xinetd.d/*. The former sets the general behavior of the daemon, while the directory /etc/xinetd.d contains a separate file per service. Your CentOS machines use xinetd instead of inetd.
Kernel and user daemons: inetd and xinetd
When installing new software packages you may have to modify /etc/inetd.conf, /etc/xinetd.d/ files, and/or /etc/services. A hangup signal (kill -HUP SOMEPID) will get inetd/xinetd to re-read its config file. Or you might be
able to use a startup script, such as ``/etc/init.d/inetd restart'' or ``service inetd restart''.
Kernel and user daemons: portmap and rpcbind
portmap/rpcbind : portmap (rpcbind on OpenSolaris and BSD) maps Sun Remote Procedure Call (RPC) program numbers to ports; the program names and numbers themselves are listed in /etc/rpc. Typically,
/etc/rpc looks something like:
[root@vm5 etc]# more /etc/rpc
#ident ``@(#)rpc 1.11 95/07/14 SMI'' /* SVr4.0
#
# rpc
#
portmapper 100000 portmap sunrpc rpcbind
rstatd 100001 rstat rup perfmeter rstat_svc
rusersd 100002 rusers
nfs 100003 nfsprog
ypserv 100004 ypprog
mountd 100005 mount showmount
ypbind 100007
walld 100008 rwall shutdown
yppasswdd 100009 yppasswd
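The layout of this file is simple enough to read with a few lines of code. The sketch below assumes the format shown above (name, program number, optional aliases, with # starting a comment); the function name parse_rpc_table is made up for illustration.

```python
def parse_rpc_table(text):
    """Map each RPC program name to (program_number, [aliases])."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, number, *aliases = line.split()
        table[name] = (int(number), aliases)
    return table


if __name__ == "__main__":
    with open("/etc/rpc") as handle:
        for name, (number, aliases) in sorted(parse_rpc_table(handle.read()).items()):
            print(number, name, " ".join(aliases))
```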
Kernel and user daemons: portmap/rpcbind
- Sun RPC is used by other services, such as NFS and NIS. RPC servers register with this daemon and RPC clients get the port number for a service from the daemon. You can find operational information using rpcinfo. For example, rpcinfo -p will list the RPC services on the local machine.
- Some daemons may fail if portmap isn't running. Most UNIXes these days automatically start up portmap after installation, so it's usually not a problem. Also, there are subtle points that have oddly crept in from the old tcpwrappers package that can affect the portmapper. See for example /etc/hosts.deny.
Kernel and user daemons: syslogd
syslogd :
syslogd is a daemon whose function is to handle logging requests from
- the kernel
- other user processes, primarily daemon processes
- processes on other machines, since syslogd can listen for logging requests across a network
Note that syslogd is generally being replaced by rsyslogd.
Kernel and user daemons: syslogd
A process can make a logging request to syslogd by using the function syslog(3). syslogd determines what to do with logging requests according to the configuration file /etc/syslog.conf.
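Most languages wrap syslog(3). As a minimal sketch, here is the same request made through Python's standard syslog module; the ident string exampled and the function name log_startup are hypothetical, and where the message ends up depends entirely on the local syslogd configuration.

```python
import syslog


def log_startup(ident="exampled"):
    """Send one informational message to syslogd under the daemon facility."""
    # LOG_PID adds the process id to each message; LOG_DAEMON is the facility.
    syslog.openlog(ident, syslog.LOG_PID, syslog.LOG_DAEMON)
    syslog.syslog(syslog.LOG_INFO, "service started")
    syslog.closelog()


if __name__ == "__main__":
    log_startup()
```

The facility (daemon) and priority (info) chosen here are what the selectors in the configuration file are matched against.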
/etc/syslog.conf generally looks something like:
*.info;mail.none;news.none;authpriv.none;cron.none /var/log/messages
authpriv.* /var/log/secure
mail.* /var/log/maillog
cron.* /var/log/cron
*.emerg *
uucp,news.crit /var/log/spooler
local7.* /var/log/boot.log
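How the selectors on the left-hand side pick out messages can be sketched as follows. This is a simplified model of the traditional rules, not syslogd's code: it assumes that a priority matches messages at that level or higher, that none excludes a facility, and that later selectors on a line override earlier ones; extensions such as = and ! are ignored. The function name selector_matches is invented for this example.

```python
# Priorities from least to most severe.
LEVELS = ["debug", "info", "notice", "warning", "err", "crit", "alert", "emerg"]


def selector_matches(selectors, facility, level):
    """Return True if a message with this facility and level is selected."""
    matched = False
    for selector in selectors.split(";"):
        facilities, _dot, threshold = selector.partition(".")
        if facilities != "*" and facility not in facilities.split(","):
            continue  # this selector says nothing about our facility
        if threshold == "none":
            matched = False
        elif threshold == "*":
            matched = True
        else:
            matched = LEVELS.index(level) >= LEVELS.index(threshold)
    return matched


if __name__ == "__main__":
    first_line = "*.info;mail.none;news.none;authpriv.none;cron.none"
    print(selector_matches(first_line, "daemon", "notice"))
    print(selector_matches(first_line, "mail", "err"))
```

Under this model the first line of the sample file sends everything at info or above to /var/log/messages except mail, news, authpriv, and cron messages, which have their own files.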
Kernel and user daemons: syslogd
- For a single UNIX machine, the default /etc/[r]syslog.conf will suffice. Also, you should note that Linux distributions have been moving to rsyslogd, which provides expanded capabilities (such as logging directly to a database) and still tries to preserve the capabilities of the original syslogd.
- You should read the file and figure out where the most common error messages end up (/var/adm/messages or /var/log/messages are typical default locations).
- If you are going to manage a number of UNIX machines, consider learning how to modify /etc/[r]syslog.conf on the machines so all the syslog messages are routed to a single ``LOGHOST''.
- You can also use the "perf" kernel monitoring tools to look at system activity. For instance, "perf top" or "perf stat CMD" can occasionally yield interesting data
for a system administrator.