COMPUTER AND NETWORK
SYSTEM ADMINISTRATION
CIS 5406-01
Summer 1997 - Lesson 7
The Network File System and NT Shares
A. Introduction to NFS
- What was life like before NFS?
- built on top of:
UDP - User Datagram Protocol (unreliable delivery)
XDR - eXternal Data Representation (machine independent data format)
RPC - Remote Procedure Call
1. NFS is both a set of specifications and an implementation
2. The protocol specifications are independent of architecture
and operating system
3. two protocols - the mount protocol and the NFS protocol
   - the mount protocol establishes the initial link between client
     and server machines
   - the NFS protocol provides a set of RPCs for remote file
     operations
> searching a directory
> reading a set of directory entries
> manipulating links and directories
> accessing file attributes
> reading and writing files
> notably missing are open() and close()
> there is no equivalent to UNIX file tables on the server
side
> each request must provide full set of arguments including
a unique file identifier and offset
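   - example: both protocols register with the portmapper, so "rpcinfo -p"
     run against a server shows them (a sketch; the host name, versions,
     and ports shown are illustrative):

         rpcinfo -p xi
            program vers proto   port
             100005    1   udp  32771  mountd
             100003    2   udp   2049  nfs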
4. problems
- performance (even with UDP)
> modified data may be cached locally on the client
> when the client's cache is flushed to the server, the data must
  be written to the server's disk before the call returns to the
  client
> the benefits of server caching are lost
- semantics
> UNIX semantics (without NFS)
+ writes to an open file are visible immediately to other
users who have the file open at the same time
+ the file is viewed as a single resource
> Session semantics (a la the Andrew File System)
+ writes to an open file are not visible to others having
it open at the same time
+ once a file is closed the changes are visible only in
the sessions opened later
> NFS claimed to implement UNIX semantics
+ there are two client caches: file blocks and file attributes
+ cached attributes are validated with server on an open()
+ the biod process implements read-ahead and delayed-write techniques
+ newly created files may not be visible to other sites for
up to 30 seconds
+ it is indeterminate whether writes to a file will be immediately
seen by other clients who have the file open for reading
> example
- touch file on xi
- ls on some other machine
- rm file on xi
- ls (quickly!) on other machine
- If a single NFS stat() request hangs, it can hang up UNIX commands,
like "df"!
- "magic cookies" (random numbers) used to short-cut future
validations. Given to client from server, client can use it
to re-connect whenever a server comes back up after a crash.
--> Can be spoofed <-- Note that "stale cookies" (yuck) can
make a client hang (solution: remount the filesystem on the
client to make it get a new, fresh cookie).
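   - example: recovering from a stale cookie by hand (a sketch; the mount
     point and server path are only examples):

         # unmount the wedged file system, then remount it so the
         # client gets a new, fresh cookie from the server
         umount /home/cs4
         mount mount:/real/cs4 /home/cs4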
B. Server
1. mountd - Sun's UNIX implementation of the mount protocol
- SunOS 4.x reads /etc/exports
- uses "exportfs" to have mountd reload table ("exportfs -a")
- example: xi:/etc/exports
/ -ro,access=lpdaemon:lpdaemon2,root=mu
/usr -ro,access=lpdaemon:lpdaemon2,root=mu
/real/cs25 -access=lpdaemon:lpdaemon2:majorslab,root=mu:nu:tau
/real/cs26 -access=lpdaemon:lpdaemon2:majorslab,root=mu:nu
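   - to verify what a SunOS 4.x server is actually exporting, query its
     mountd with "showmount -e" (a sketch; the output is illustrative):

         showmount -e xi
            export list for xi:
            /           lpdaemon,lpdaemon2
            /usr        lpdaemon,lpdaemon2
            /real/cs25  lpdaemon,lpdaemon2,majorslab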
- SunOS 5.x reads /etc/dfs/dfstab
- uses "share" to have mountd reload table (see Table 17.4, p. 371)
- example: export:/etc/dfs/dfstab
share -F nfs -o ro,root=nu:mu /
share -F nfs -o ro,root=nu:mu /usr
share -F nfs -o rw=lpdaemon:lpdaemon2:majorslab,root=nu:mu: /real/cs13
share -F nfs -o rw=lpdaemon:lpdaemon2:dad,root=nu:mu: /real/cs14
share -F nfs -o rw=lpdaemon:lpdaemon2:,root=nu:mu: /real/cs15
share -F nfs -o rw=lpdaemon:lpdaemon2,root=nu:mu:beta:chi\
:epsilon:kill:rho:sigma:socket:exec:sync /real/cs16
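   - on SunOS 5.x, "shareall" re-reads /etc/dfs/dfstab, and "share" with
     no arguments lists the current shares (a sketch):

         shareall             # (re)share everything in /etc/dfs/dfstab
         share                # with no arguments, list current shares
         dfshares export      # ask a remote server what it is sharing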
- Linux (Slackware and RedHat, at least) uses /etc/exports and "kill -HUP"
to mountd. Linux (apparently) provides "NFS multiplying" --
NFS serving of an NFS mounted file system.
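   - a sketch of the Linux reload sequence (the daemon may be named
     "rpc.mountd" or just "mountd" depending on the distribution):

         # send mountd a HUP so it re-reads /etc/exports; the bracketed
         # grep pattern keeps grep from matching its own process
         kill -HUP `ps ax | grep '[r]pc.mountd' | awk '{print $1}'`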
- Table 17.1, 17.2, and 17.3 give further implementation specifics.
2. nfsd
- handles requests for NFS file service
- very small; it basically turns around and calls into the kernel
- system tuning - See Table 17.5, page 372
- Nemeth says 10 on a dedicated file server
- Loukides says leave it at 4 (performance tuning book)
- he says the kernel inode table and file table size are
more important (an NFS server has more open files)
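   - the daemon count is set where nfsd is started at boot; a sketch,
     assuming a SunOS 4.x-style /etc/rc.local:

         # start 10 NFS server daemons on a dedicated file server,
         # per Nemeth's advice
         if [ -f /usr/etc/nfsd ]; then
                 nfsd 10 & echo -n ' nfsd'
         fi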
C. Client side
1. extended "mount" command, accepts "host:path" syntax for NFS filesystems
- /etc/fstab in SunOS 4.x
- example:
/dev/sd0a / 4.2 rw 1 1
/dev/sd0g /usr 4.2 rw 1 2
-- Where are the remote file systems? They are mounted by the automounter (see below)
- /etc/vfstab in SunOS 5.x
- example:
#device                 device                  mount    FS      fsck    mount
#to mount               to fsck                 point    type    pass    at boot
#-----------------------------------------------------------------------------
/proc - /proc proc - no
fd - /dev/fd fd - no
swap - /tmp tmpfs - yes
/dev/dsk/c0t3d0s0 /dev/rdsk/c0t3d0s0 / ufs 1 no
/dev/dsk/c0t3d0s6 /dev/rdsk/c0t3d0s6 /usr ufs 2 no
/dev/dsk/c0t3d0s5 /dev/rdsk/c0t3d0s5 /opt ufs 5 yes
/dev/dsk/c0t3d0s1 - - swap - no
- Type "mount" to see currently mounted file systems
- example:
/dev/sd0a on / type 4.2 (rw)
/dev/sd0g on /usr type 4.2 (rw)
mount:/real/cs4 on /tmp_mnt/home/cs4 type nfs (rw,suid,hard,intr)
mount:/real/cs5 on /tmp_mnt/home/cs5 type nfs (rw,nosuid,hard,intr)
access:/real/cs23 on /tmp_mnt/home/cs23 type nfs (rw,nosuid,hard,intr)
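   - a remote file system can also be mounted by hand using the extended
     syntax; a sketch (server, path, and options are only examples):

         # "hard,intr" retries a down server forever but lets the user
         # interrupt a hung operation from the keyboard
         mkdir -p /mnt/cs4
         mount -o rw,nosuid,hard,intr mount:/real/cs4 /mnt/cs4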
2. NFS service is provided in kernel
- transparent to user
3. biod
- provides read-ahead and write-behind caching
- another tuning issue
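   - like nfsd, the biod count is set at boot; a sketch, again assuming
     a SunOS 4.x-style /etc/rc.local:

         # start 8 block I/O daemons for client-side read-ahead and
         # write-behind caching; 4 is the traditional default
         if [ -f /usr/etc/biod ]; then
                 biod 8 & echo -n ' biod'
         fi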
D. Administering NFS
1. a user must have an account on the file server, or access rights
   can't be checked (unknown users can default to the user "nobody")
2. in CompSci, the artificial-shell setup keeps users from logging into
   the file servers and running up the load
3. must keep UIDs and GIDs consistent across machines
4. don't mount outside of local net (can block NFS at router)
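   - example (for point 3 above): a quick consistency check of UIDs
     between a client's /etc/passwd and the NIS passwd map (a sketch;
     it prints local name/UID pairs that do not appear identically
     in NIS):

         ypcat passwd | awk -F: '{print $1, $3}' | sort > /tmp/nis.uids
         awk -F: '{print $1, $3}' /etc/passwd | sort > /tmp/local.uids
         comm -13 /tmp/nis.uids /tmp/local.uids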
E. auto-mounting
1. Sun's "automount" daemon (used on CompSci network)
- nice to keep one NIS automount map instead of ~50 per-machine
  /etc/fstab files
Operation (using CompSci mappings):
- the automounter appears to the kernel to be an NFS server
- automount uses its maps to locate a real NFS file server
- it then mounts the file system in a temporary location
and creates a symbolic link to the temporary location
- If the file system is not accessed within an appropriate
interval (five minutes by default), the daemon unmounts the
file system and removes the symbolic link
- if the indicated directory has not already been created, the
daemon creates it, and then removes it upon exiting.
- this is different from a regular mount for which the mount point
must already exist
- example (somewhat convoluted) configuration maps:
- auto.master (available via a NIS file; "ypcat -k auto.master")
/home auto.home # an indirect map all rooted at "/home"
/- auto.direct # "/-" means a direct map
/net -hosts -rw,nosuid,hard,intr
# "-host" means use
# NIS "host.byname" to look
# up the hostname; will
# mount any permissible
# NFS server on "/net/..."
- auto.direct ("ypcat -k auto.direct")
Path mount() options actual location
---- --------------- ---------------
/nu0 -rw,nosuid,hard,intr sync:/real/nu0
/nu1 -rw,suid,hard,intr sync:/real/nu1
/nu2 -rw,suid,hard,intr sync:/real/nu2
/var/spool/mail -rw,nosuid,hard,intr nu:/usr/spool/realmail
- auto.home ("ypcat -k auto.home")
Path mount() options actual location
---- --------------- ---------------
s5 -rw,nosuid,hard,intr psi:/s5
s6 -rw,nosuid,hard,intr psi:/s6
cs4 -rw,suid,hard,intr mount:/real/cs4
cs5 -rw,nosuid,hard,intr mount:/real/cs5
cs6 -rw,nosuid,hard,intr mount:/real/cs6
cs7 -rw,nosuid,hard,intr mount:/real/cs7
cs8 -rw,nosuid,hard,intr mount:/real/cs8
cs9 -rw,suid,hard,intr mount:/real/cs9
cs10 -rw,nosuid,hard,intr mount:/real/cs10
cs11 -rw,suid,hard,intr mount:/real/cs11
.
.
.
cs38 -rw,nosuid,hard,intr pi:/real/cs38
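      - example: watching the automounter work (the output shown is
        illustrative):

            ls /home/cs4        # referencing the path triggers the mount
            mount | grep cs4
               mount:/real/cs4 on /tmp_mnt/home/cs4 type nfs (rw,suid,hard,intr)
            ls -ld /home/cs4    # the mount point is really a symbolic link
               lrwxrwxrwx  1 root ... /home/cs4 -> /tmp_mnt/home/cs4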
2. "amd" - Public domain automounter from Jan-Simon Pendry's doctoral
thesis (used at SCRI)
- new features; more flexible
- irritating features of the Sun implementation were improved
> amd does not hang if a remote file system goes down
> amd attempts to mount a replacement file system if and
  when one becomes available
- amd automatically unmounts (via "keep-alive")
- Interesting list of mount types (Table 17.7, page 380)
- non-blocking operation
- amd maps can be just as convoluted!
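      - example: an amd map entry for one of the file systems above (a
        sketch of amd's "option:=value" syntax; untested):

            /defaults   opts:=rw,nosuid,hard,intr
            cs4         type:=nfs;rhost:=mount;rfs:=/real/cs4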
F. Security
   Don't export to hosts on which untrusted users have root access.
   If you don't control root on a machine, don't export file systems to it.
   Block NFS UDP traffic (port 2049) at your router, if possible.
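   - example: on a Cisco router, an access list like the following could
     be applied inbound on the external interface (a sketch; the list
     number and interface name are assumptions):

         ! deny inbound NFS (UDP port 2049), permit everything else
         access-list 101 deny   udp any any eq 2049
         access-list 101 permit ip  any any
         interface Serial0
          ip access-group 101 in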
G. tuning NFS
nfsstat -c to see client side
---------------------------------------------------------------------
Client rpc:
calls badcalls retrans badxid timeout wait newcred timers
3175986 1991 0 1232 1991 0 0 5330
Client nfs:
calls badcalls nclget nclsleep
3173970 0 3173970 0
getattr setattr root lookup readlink read
192650 6% 49 0% 0 0% 831671 26% 2059211 64% 78054 2%
write create remove rename link symlink
140 0% 124 0% 50 0% 3 0% 7 0% 0 0%
mkdir rmdir readdir fsstat
0 0% 0 0% 940 0% 11071 0%
> what does the client spend most of its time doing?
- reading links and looking up information about files
  - the percentage of writes is low (so perhaps no need for an NFS
    server card?)
  - the client times out occasionally but isn't having to retransmit
- badxid: received a reply for which there is no outstanding call
- timeout: a call timed out
- badxid and timeouts are roughly equal, but are only .0006 of
all calls
- if timeouts or retransmissions were high, say > 5% then we
want to know why
- if badxid ~= timeout then server is too slow (and is dropping
packets)
- if badxid << timeout then go get your network analyzer
because packets are getting lost on the net due to some
other hardware problem
tuning with mount command:
    rsize=n     Set the read buffer size to n bytes.
    wsize=n     Set the write buffer size to n bytes.
    timeo=n     Set the NFS timeout to n tenths of a second.
    retrans=n   Set the number of NFS retransmissions.
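   - example: putting those options together on one mount command (the
     values are illustrative starting points, not recommendations):

         # 8K read/write buffers, 1.5 second initial timeout, 5 retries
         mount -o rw,hard,intr,rsize=8192,wsize=8192,timeo=15,retrans=5 \
               mount:/real/cs4 /mnt/cs4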
---------------------------------------------------------------------
nfsstat -s
Server rpc:
calls badcalls nullrecv badlen xdrcall
82414467 0 0 0 0
Server nfs:
calls badcalls
82414467 264
null getattr setattr root lookup
82760 0% 36039746 43% 217061 0% 0 0% 27784077 33%
readlink read
287401 0% 6382386 7%
wrcache write create remove rename
0 0% 2130913 2% 397712 0% 184138 0% 31848 0%
link symlink mkdir rmdir readdir fsstat
10468 0% 1062 0% 4461 0% 4616 0% 8807761 10% 48057 0%
> what does the server spend most of its time doing?
- getting attributes and performing lookups (for ls -l?)
- it's a good thing that attributes are cached on the client
side (using biod)
H. Beyond NFS
o AFS - Andrew File System, from CMU and Transarc Corp.
- Much better authentication (Kerberos)
- An 8-inch-high stack of installation books!
- Adds new file system type to kernel
- Addresses more than just file system semantics, also
user authentication, etc.
- Large local client-side disk cache improves performance
o DFS - Distributed File System from OSF
- "successor" to AFS; AFS-like
- Beginning to show up in most vendors' UNIX implementations
- Major part of DCE (Distributed Computing Environment)
Windows NT Shares
Chapter 9 of MWNTS4 gives information about Shares. A share is
a directory or other resource, such as a printer or CD-ROM drive,
that is designated to be used among network users (p. 248, MWNTS4).
Shares achieve much the same effect as an NFS server/client pair,
with much less fuss :)
Creating a share is simple: right-click on the drive or directory
and select the Sharing option.
You can also create multiple share names for the same device/directory.
Think of it as NFS-mounting the same file system at more than one
place in the file system hierarchy. To do so, from within Explorer you
can use the Share As option from the Disk menu or click on the Share
icon on the toolbar.
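NT also provides a command-line interface for shares; a sketch (the
share, path, and server names are only examples):

    REM create a share from the command line ...
    net share cs4=D:\real\cs4 /remark:"CS home directories"
    REM ... and attach it from a client, much like an NFS mount
    net use H: \\ntserver\cs4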
(This topic was covered in more detail in an earlier lecture.)