I want to take some time to talk about the fundamental toolset on which most of the programs that system administrators work with are built.
The most important of these are the system calls. When we run strace to see exactly what a process is doing, we are watching this fundamental interaction between a program and its requests to the operating system, usually for access to resources controlled by the operating system.
Instructions on how to (heuristically) find your current set of Linux system calls.
Some thoughts on proliferating file descriptor mediation are here.
A Unix system call is a direct request to the kernel regarding a system resource. It might be a request for a file descriptor to manipulate a file, it might be a request to write to a file descriptor, or any of hundreds of possible operations.
These are exactly the tools that every Unix program is built upon.
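To make the idea of a "direct request to the kernel" concrete, here is a minimal sketch that issues a system call by number through glibc's syscall(2) instead of the usual getpid() wrapper. The helper name raw_getpid is my own invention for illustration, and the sketch assumes Linux with glibc.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes syscall() on glibc */
#endif
#include <sys/syscall.h>       /* SYS_getpid */
#include <sys/types.h>
#include <unistd.h>            /* syscall(), getpid() */

/* Hypothetical helper: ask the kernel for our process id by system call
   number, bypassing the getpid() library wrapper. */
pid_t raw_getpid(void)
{
    return (pid_t)syscall(SYS_getpid);
}
```

Both raw_getpid() and getpid() should report the same value; running a caller under strace shows a getpid entry for the raw request.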
In some sense, the mainstay operations are those on the file system.
Unlike many other resources, which are just artifacts of the operating system and disappear at each reboot, a change to a file system generally has some permanence. [Of course, it is possible and even common to create ``RAM'' disk filesystems: they are quite fast, and for items that are meant to be temporary they are quite acceptable. (You might do this in /var/spool/incoming when setting up MailScanner, for instance.)]
In C-land, a file descriptor is an int (in the real world, it is the size of the register used for return values from system calls). It provides stateful access to an i/o resource such as a file on a filesystem, a pseudo-terminal, or a socket to a tcp session.
open() -- create a new file descriptor to access a file
close() -- deallocate a file descriptor
dup() -- duplicate a file descriptor
dup2() -- improved way to duplicate a file descriptor; you can choose the new file descriptor number
sendmsg(), recvmsg() -- a process can pass file descriptors to another process using sendmsg() and recvmsg() over Unix sockets; the new file descriptors are treated as if they had been created by dup()
Here's a simple program to play with: test-dup.c
fchmod() -- change the permissions of a file associated with a file descriptor
fchown() -- change the ownership of a file associated with a file descriptor
fchdir() -- change the working directory for a process via fd
mmap() -- create a memory mapping of a file
Here's a simple program to play with: test-fchdir.c
Here's another simple program to play with: test-mmap.c
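As a rough illustration of mmap(), the sketch below writes a message to a temporary file, maps the file read-only, and copies the bytes back out of the mapping. It is not test-mmap.c; the helper name roundtrip_through_mmap and the /tmp template are hypothetical.

```c
#include <stdlib.h>     /* mkstemp() */
#include <string.h>     /* memcpy() */
#include <sys/mman.h>   /* mmap(), munmap() */
#include <unistd.h>     /* write(), close(), unlink() */

/* Hypothetical helper: store msg in a temp file, map it, and copy the
   mapped bytes into out. Returns 0 on success, -1 on error. */
int roundtrip_through_mmap(const char *msg, char *out, size_t len)
{
    char path[] = "/tmp/mmap-demo-XXXXXX";   /* illustrative temp file */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    /* mmap() rejects a zero-length mapping, so refuse len == 0 up front. */
    if (len == 0 || write(fd, msg, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }

    void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                    /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return -1;

    memcpy(out, p, len);          /* file contents read as plain memory */
    munmap(p, len);
    return 0;
}
```

Note that the descriptor can be closed as soon as the mapping exists; the mapping keeps its own reference to the file.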
fcntl() -- miscellaneous manipulation of file descriptors: dup(), set close on exec(), set to non-blocking, set to asynchronous mode, locks, signals
ioctl() -- manipulate the underlying ``device'' parameters for a file descriptor
flock() -- lock a file associated with a file descriptor
pipe() -- create a one-way association between two file descriptors so that output from one goes to the input of the other
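Pipes are most often used between a parent and a child, so here is a minimal sketch in which a forked child writes into the pipe and the parent reads from it. The helper name read_from_child is hypothetical, and the sketch assumes a short message that fits in a single write.

```c
#include <sys/types.h>
#include <sys/wait.h>   /* waitpid() */
#include <unistd.h>     /* pipe(), fork(), read(), write(), close(), _exit() */

/* Hypothetical helper: the child writes msg into a pipe and the parent
   reads up to cap bytes from it. Returns bytes read, or -1 on error. */
ssize_t read_from_child(const char *msg, size_t len, char *buf, size_t cap)
{
    int p[2];                     /* p[0] is the read end, p[1] the write end */
    if (pipe(p) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (pid == 0) {               /* child: writer */
        close(p[0]);
        ssize_t w = write(p[1], msg, len);
        _exit(w == (ssize_t)len ? 0 : 1);
    }

    close(p[1]);                  /* parent keeps only the read end */
    ssize_t n = read(p[0], buf, cap);
    close(p[0]);
    waitpid(pid, NULL, 0);        /* reap the child */
    return n;
}
```

Closing the unused ends matters: if the parent kept its copy of the write end open, a reader waiting for end-of-file would never see it.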
select(), poll(), epoll() -- multiplex on pending i/o to or from a set of file descriptors
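Here is a small sketch of the multiplexing idea using poll() on a single descriptor: a pipe is not readable while empty and becomes readable once a byte has been written. The helper names readable_within and pipe_readiness are made up for this example.

```c
#include <poll.h>      /* poll(), struct pollfd, POLLIN */
#include <unistd.h>    /* pipe(), write(), close() */

/* Hypothetical helper: 1 if fd has data to read within timeout_ms,
   0 if it does not, -1 on error. */
int readable_within(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;
    return (rc > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* Hypothetical demo: record readiness of a pipe's read end before and
   after one byte is written to it. Returns 0 on success, -1 on error. */
int pipe_readiness(int *before, int *after)
{
    int p[2];
    if (pipe(p) < 0)
        return -1;

    *before = readable_within(p[0], 0);     /* timeout 0: just a check */
    int ok = (write(p[1], "x", 1) == 1);
    if (ok)
        *after = readable_within(p[0], 0);

    close(p[0]);
    close(p[1]);
    return ok ? 0 : -1;
}
```

With more entries in the pollfd array, the same call waits on many descriptors at once, which is the whole point of select(), poll() and the epoll family.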
read() -- read data from a file descriptor
readv() -- read data from a file descriptor
send() -- send data to a file descriptor
sendto() -- send data to a file descriptor
sendmsg() -- send data to a file descriptor
write() -- send data to a file descriptor
writev() -- send data to a file descriptor
recv() -- read data from a file descriptor
recvfrom() -- read data from a file descriptor
recvmsg() -- read data from a file descriptor
fsync() -- forces a flush for a file descriptor
readdir() -- raw read of a directory entry from a file descriptor (old); there's no glibc interface for this function
getdents() -- raw read of directory entries from a file descriptor; there's no glibc interface for this function
fstat() -- return information about a file associated with a fd: inode, perms, hard links, uid, gid, size, modtimes
fstatfs() -- return the mount information for the filesystem that the file descriptor is associated with
access() -- returns a value indicating if a file is accessible
chmod() -- changes the permissions on a file in a filesystem
chown() -- changes the ownership of a file in a filesystem
link() -- create a hard link to a file
symlink() -- create a soft link to a file
mkdir() -- create a new directory
rmdir() -- remove a directory
stat() -- return information about a file associated with a pathname: inode, perms, hard links, uid, gid, size, modtimes
statfs() -- return the mount information for the filesystem that the pathname is associated with
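Since stat() works from a pathname and fstat() from a file descriptor, a natural exercise is to check whether the two refer to the same file by comparing device and inode numbers. This is only a sketch; the helper name same_file is hypothetical.

```c
#include <fcntl.h>      /* open(), for callers that need a descriptor */
#include <sys/stat.h>   /* stat(), fstat(), struct stat */
#include <unistd.h>     /* close() */

/* Hypothetical helper: 1 if path and fd name the same file (same device
   and inode), 0 if they differ, -1 if either lookup fails. */
int same_file(const char *path, int fd)
{
    struct stat by_path, by_fd;
    if (stat(path, &by_path) < 0 || fstat(fd, &by_fd) < 0)
        return -1;
    return by_path.st_dev == by_fd.st_dev &&
           by_path.st_ino == by_fd.st_ino;
}
```

For example, after fd = open("/dev/null", O_RDONLY), same_file("/dev/null", fd) should report a match while same_file("/", fd) should not.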
alarm -- set an alarm clock for a SIGALRM to be sent to a process; time measured in seconds
getitimer, setitimer -- read or set an alarm clock measured in fractions of a second that delivers one of SIGALRM, SIGVTALRM, or SIGPROF
kill -- send an arbitrary signal to an arbitrary process
killpg -- send an arbitrary signal to all processes in a process group
sigaction -- interpose a signal handler (can include special ``default'' or ``ignore'' handlers)
sigprocmask -- change the list of blocked signals
wait -- block until a child process changes state (for example, exits)
waitpid -- wait for a state change in any child or in a specific child (can be blocking or non-blocking)
chdir -- change the working directory for a process to dirname
chroot -- change the root filesystem for a process
execve -- execute another binary in this current process
fork -- create a new child process running the same binary
clone -- allows the child to share execution context (unlike fork(2))
exit -- terminate the current process
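The classic way these calls fit together is fork, then execve in the child, then waitpid in the parent. Here is a minimal sketch of that pattern; the helper name run_and_wait is hypothetical, and the exit code 127 for a failed execve is simply borrowed from shell convention.

```c
#include <sys/types.h>
#include <sys/wait.h>   /* waitpid(), WIFEXITED(), WEXITSTATUS() */
#include <unistd.h>     /* fork(), execve(), _exit() */

extern char **environ;  /* the current environment, passed on unchanged */

/* Hypothetical helper: run the binary at path with argv in a child and
   return its exit status, or -1 if it could not be run or did not exit
   normally. */
int run_and_wait(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                 /* child: become the new program */
        execve(path, argv, environ);
        _exit(127);                 /* reached only if execve() failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For instance, calling it with path "/bin/sh" and the argument vector {"sh", "-c", "exit 7", NULL} should hand back the shell's exit status.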
getdtablesize -- report how many file descriptors this process can have active simultaneously
getgid -- return the group id of this process
getuid -- return the user id of this process
getpgid -- return process group id of this process
getpgrp -- return process group id of this process (same as getpgid(0))
getpid -- return the process id of this process
getppid -- return parent process id of this process
getrlimit -- gets a resource limit on this process (core size, cpu time, data size, stack size, and others)
getrusage -- find amount of resource usage by this process
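As a quick illustration of getrlimit, this sketch fetches the soft and hard limits on open file descriptors, which is the same resource that getdtablesize reports on. The helper name fd_limits is hypothetical.

```c
#include <sys/resource.h>   /* getrlimit(), struct rlimit, RLIMIT_NOFILE */

/* Hypothetical helper: fetch the soft and hard limits on the number of
   open file descriptors. Returns 0 on success, -1 on error. */
int fd_limits(rlim_t *soft, rlim_t *hard)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
        return -1;
    *soft = rl.rlim_cur;    /* the limit currently enforced */
    *hard = rl.rlim_max;    /* ceiling the soft limit may be raised to */
    return 0;
}
```

The soft limit is what the process actually runs into; an unprivileged process may raise it with setrlimit, but only up to the hard limit.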
nice() -- change the calling process's priority
setpriority() -- arbitrarily change any process's (or group or user) priority
getpriority() -- get any process's priorities
socket -- create a file descriptor (can be either network or local)
bind -- bind a file descriptor to an address, such as a tcp port
listen -- specify willingness for some number of connections to be blocked waiting on accept()
accept -- tell a file descriptor to block until there is a new connection
connect -- actively connect to a listen()ing socket
setsockopt -- set options on a given socket associated with fd, such as out-of-band data, keep-alive information, congestion notification, final timeout, and so forth (see man tcp(7)); also allows user credentials to be passed with each invocation of recvmsg() on a Unix socket
getsockopt -- retrieve information about options enabled for a given connection from fd; also allows user credentials to be retrieved for a given Unix socket
getpeername -- retrieve information about the other side of a connection from fd
getsockname -- retrieve information about this side of a connection from fd
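To see socket, bind, listen, connect and accept working together without needing a network, here is a minimal sketch that plays both server and client over a local (Unix domain) stream socket inside one process. The helper name unix_socket_roundtrip is hypothetical, and the caller supplies the socket path; the sketch removes whatever is at that path, so point it only at a scratch location.

```c
#include <string.h>       /* memset(), strncpy() */
#include <sys/socket.h>   /* socket(), bind(), listen(), accept(), ... */
#include <sys/un.h>       /* struct sockaddr_un */
#include <unistd.h>       /* close(), unlink() */

/* Hypothetical helper: send msg from a client socket to a server socket
   bound at path. Returns the bytes the server side received, or -1. */
ssize_t unix_socket_roundtrip(const char *path, const char *msg, size_t len,
                              char *buf, size_t cap)
{
    struct sockaddr_un addr;
    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof addr.sun_path - 1);

    int srv = socket(AF_UNIX, SOCK_STREAM, 0);
    int cli = socket(AF_UNIX, SOCK_STREAM, 0);
    int conn = -1;
    ssize_t n = -1;

    unlink(path);                 /* clear a stale socket file, if any */
    if (srv >= 0 && cli >= 0 &&
        bind(srv, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        listen(srv, 1) == 0 &&
        /* connect() completes once the request sits in the listen queue */
        connect(cli, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        (conn = accept(srv, NULL, NULL)) >= 0 &&
        send(cli, msg, len, 0) == (ssize_t)len)
        n = recv(conn, buf, cap, 0);

    if (conn >= 0) close(conn);
    if (cli >= 0) close(cli);
    if (srv >= 0) close(srv);
    unlink(path);
    return n;
}
```

Swapping AF_UNIX and sockaddr_un for AF_INET and sockaddr_in gives the tcp version of the same sequence; the order of the calls does not change.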
brk -- allocate memory for the data segment for the current process
gethostname -- gets a ``canonical hostname'' for the machine
sethostname -- sets a ``canonical hostname'' for the machine
gettimeofday -- gets the time of day for the running kernel
settimeofday -- sets the time of day for the running kernel
mount -- attaches a filesystem to a directory and makes it available
sync -- flushes all filesystem buffers, forcing changed blocks to ``drives'' and updating superblocks
futex -- raw locking (lets a process block waiting on a change to a specific memory location)
sysinfo -- provides direct access from the kernel to:
    load average
    total ram for system
    available ram
    amount of shared memory existing
    amount of memory used by buffers
    total swap space
    swap space available
    number of processes currently in proctable
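The sysinfo call is an easy one to try out. This sketch pulls a few of the values listed above into plain variables; the helper name memory_snapshot is hypothetical, and the sketch is Linux-specific.

```c
#include <sys/sysinfo.h>   /* sysinfo(), struct sysinfo */

/* Hypothetical helper: report total and free ram in bytes and the current
   process count. Returns 0 on success, -1 on error. */
int memory_snapshot(unsigned long long *total_bytes,
                    unsigned long long *free_bytes,
                    unsigned short *procs)
{
    struct sysinfo si;
    if (sysinfo(&si) < 0)
        return -1;

    /* Memory fields are counted in units of mem_unit bytes. */
    *total_bytes = (unsigned long long)si.totalram * si.mem_unit;
    *free_bytes  = (unsigned long long)si.freeram  * si.mem_unit;
    *procs       = si.procs;
    return 0;
}
```

The same structure also carries the load averages, buffer and shared memory figures, and swap totals, so tools like free and uptime need little more than this one call.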
msgctl -- SYS V messaging control (uid, gid, perms, size)
msgget -- SYS V message queue creation/access
msgrcv -- receive a SYS V message
msgsnd -- send a SYS V message
shmat -- attach memory location to SYS V shared memory segment
shmctl -- SYS V shared memory control (uid, gid, perms, size, etc)
shmget -- SYS V shared memory creation/access
shmdt -- detach from SYS V shared memory segment
semctl -- SYS V semaphores control
semget -- SYS V semaphores creation/access
semop -- SYS V semaphores operations
shm_open -- POSIX shared memory create/access
shm_unlink -- POSIX shared memory release
sem_open -- POSIX semaphores create/access
sem_post -- POSIX semaphores unlock
sem_wait -- POSIX semaphores try to get semaphore, block if cannot
sem_trywait -- POSIX semaphores try to get semaphore, do not block if cannot
sem_timedwait -- POSIX semaphores try to get semaphore, block for a time if cannot
sem_unlink -- POSIX semaphores (only needed for named/filesystem-based semaphores)
sem_init -- POSIX semaphores (only needed for unnamed/memory-based semaphores)
sem_destroy -- POSIX semaphores (only needed for unnamed/memory-based semaphores)
sem_close -- POSIX
One of the more interesting developments in the last few years is the idea of improving our ability to detect changes in a filesystem. The most recent incarnation of this idea is the inotify system. It can monitor either files or entire directories for events (but it is not recursive — you would have to monitor each directory separately). Interestingly enough, you can use select to monitor the monitoring... ;-)
The interface consists of three new system calls, and also re-uses our old friends read(2) and close(2):
Here's an example program using inotify: inotify_test.c.
Another interesting development in the last few years is the idea of replacing the old asynchronous signal system with queued, synchronously delivered signals.
This has been implemented with the signalfd system. Like inotify, it sets up a queue that can be read with read(2), but instead of file system events, we pick up signal events.
The interface consists of just one new system call, and also re-uses our old friends read(2) and close(2):
Here's an example program using signalfd: signalfd-test.c.