Aim: Study of Distributed File Systems - Network File System and Coda.
NFS PROTOCOL DEFINITION:
Servers change over time, and so can the protocol that they use. RPC therefore carries a version number with each RPC request. RFC 1094 describes version two of the NFS protocol. Even in the second version, there are a few obsolete procedures and parameters, which were expected to be removed in later versions. At the time RFC 1094 was written, an RFC for version three of the NFS protocol was under preparation.
NFS Overview:
The Network File System (NFS) is a client/server application that lets a computer user view and optionally store and update files on a remote computer as though they were on the user's own computer. The user's system needs to have an NFS client and the other computer needs the NFS server. Both of them require TCP/IP to be installed, since the NFS server and client use TCP/IP to send the files and updates back and forth. (However, the User Datagram Protocol, UDP, which comes with TCP/IP, is used instead of TCP with earlier versions of NFS.) NFS was developed by Sun Microsystems and has been designated a file server standard. Its protocol uses the Remote Procedure Call (RPC) method of communication between computers. You can install NFS on Windows 95 and some other operating systems using products like Sun's Solstice Network Client.
Using NFS, the user or a system administrator can mount all or a portion of a file system (which is a portion of the hierarchical tree in any file directory and subdirectory, including the one you find on your PC or Mac). The portion of your file system that is mounted (designated as accessible) can be accessed with whatever privileges go with your access to each file (read-only or read-write).
NFS has been extended to the Internet with WebNFS, a product and proposed standard that is now part of Netscape's Communicator browser. WebNFS offers what Sun believes is a faster way to access Web pages and other Internet files.
Diagram:
NFS Working
NFS consists of at least two main parts: a server and one or more clients. The client remotely accesses the data that is stored on the server machine. For this to function properly, a few processes have to be configured and running. The server has to be running the following daemons:
Daemon     Description
nfsd       The NFS daemon, which services requests from the NFS clients.
mountd     The NFS mount daemon, which carries out the requests that nfsd(8) passes on to it.
rpcbind    This daemon allows NFS clients to discover which port the NFS server is using.
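The port-discovery role of rpcbind can be illustrated with a small in-memory sketch. This is a toy model, not the real rpcbind wire protocol: the class ToyRpcbind and its methods are hypothetical names chosen for illustration. The program numbers 100003 (NFS) and 100005 (mount) and the NFS port 2049 are the standard assignments; the mountd port used here is an arbitrary assumption.

```python
NFS_PROGRAM = 100003    # ONC RPC program number assigned to NFS
MOUNT_PROGRAM = 100005  # ONC RPC program number assigned to the mount protocol


class ToyRpcbind:
    """Hypothetical in-memory stand-in for the rpcbind registry."""

    def __init__(self):
        self._ports = {}  # (program, version) -> port

    def register(self, program, version, port):
        """Called by a server daemon when it starts up."""
        self._ports[(program, version)] = port

    def getport(self, program, version):
        """Called by a client; 0 means 'not registered'."""
        return self._ports.get((program, version), 0)


registry = ToyRpcbind()
registry.register(NFS_PROGRAM, 2, 2049)     # nfsd on its usual port
registry.register(MOUNT_PROGRAM, 1, 20048)  # mountd port: arbitrary assumption

print(registry.getport(NFS_PROGRAM, 2))     # port the client should contact
print(registry.getport(NFS_PROGRAM, 3))     # unregistered version
```

The point of the indirection is that a client only needs to know the well-known rpcbind address; every other service port is looked up at run time.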
The client can also run a daemon, known as nfsiod. The nfsiod daemon services the requests from the NFS server. It is optional: it improves performance, but is not required for normal and correct operation.
File System Model
NFS assumes a file system that is hierarchical, with directories as all but the bottom level of files. Each entry in a directory (file, directory, device, etc.) has a string name. Different operating systems may have restrictions on the depth of the tree or the names used, and may use different syntax to represent the "pathname", which is the concatenation of all the "components" (directory and file names) in the name. A "file system" is a tree on a single server (usually a single disk or physical partition) with a specified "root". Some operating systems provide a "mount" operation to make all file systems appear as a single tree, while others maintain a "forest" of file systems. Files are unstructured streams of uninterpreted bytes. Version 3 of NFS uses a slightly more general file system model.
NFS looks up one component of a pathname at a time. It may not be obvious why it does not just take the whole pathname, traverse the directories, and return a file handle when it is done. There are several good reasons not to do this. First, pathnames need separators between the directory components, and different operating systems use different separators. One could define a Network Standard Pathname Representation, but then every pathname would have to be parsed and converted at each end. Other issues are discussed in section 3, NFS Implementation Issues, of the RFC.
Although files and directories are similar objects in many ways, different procedures are used to read directories and files. This provides a network standard format for representing directories. The same argument as above could have been used to justify a procedure that returns only one directory entry per call. The problem is efficiency: directories can contain many entries, and a remote call to return each one would be too slow.
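The component-at-a-time lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions: ToyServer and resolve are hypothetical names, file handles are plain integers, and the lookup call returns only a handle (the real NFS LOOKUP procedure also returns file attributes). It shows why the server never needs to know the client's pathname separator.

```python
class ToyServer:
    """Hypothetical server: maps a directory handle and a name to a child handle."""

    def __init__(self):
        # handle -> {component name: child handle}; handle 0 is the exported root
        self.dirs = {0: {"usr": 1}, 1: {"lib": 2}, 2: {"libc.a": 3}}
        self.calls = 0  # counts simulated remote calls

    def lookup(self, dir_handle, name):
        """One simulated remote call: resolve a single component."""
        self.calls += 1
        return self.dirs[dir_handle][name]


def resolve(server, root_handle, path, separator):
    """Client side: split the path locally, then look up one component per call."""
    handle = root_handle
    for component in path.split(separator):
        if component:  # skip empty pieces such as a leading separator
            handle = server.lookup(handle, component)
    return handle


server = ToyServer()
unix_style = resolve(server, 0, "/usr/lib/libc.a", "/")
dos_style = resolve(server, 0, "usr\\lib\\libc.a", "\\")
print(unix_style, dos_style, server.calls)
```

Both clients reach the same file handle even though they write pathnames differently, at the cost of one remote call per component.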
THE CODA FILE SYSTEM
Our next example of a distributed file system is Coda. Coda was developed at Carnegie Mellon University (CMU) in the 1990s, and is now integrated with a number of popular UNIX-based operating systems such as Linux. Coda differs from NFS in many ways, notably in its goal of high availability. This goal has led to advanced caching schemes that allow a client to continue operation despite being disconnected from a server. Overviews of Coda are given in (Satyanarayanan et al., 1990; Kistler and Satyanarayanan, 1992). A detailed description of the system can be found in (Kistler, 1996).
Overview of Coda
Coda was designed to be a scalable, secure, and highly available distributed file system. An important goal was to achieve a high degree of naming and location transparency so that the system would appear to its users very similar to a pure local file system. By taking high availability into account, the designers of Coda have also tried to reach a high degree of failure transparency.
Coda follows the same organization as AFS. Every Virtue workstation hosts a user-level process called Venus, whose role is similar to that of an NFS client. A Venus process is responsible for providing access to the files that are maintained by the Vice file servers. In Coda, Venus is also responsible for allowing the client to continue operation even if access to the file servers is (temporarily) impossible. This additional role is a major difference from the approach followed in NFS.
The internal architecture of a Virtue workstation is shown in Fig. 10-2. The important point is that Venus runs as a user-level process. As in NFS, a separate Virtual File System (VFS) layer intercepts all calls from client applications and forwards them either to the local file system or to Venus. Venus, in turn, communicates with Vice file servers using a user-level RPC system. The RPC system is constructed on top of UDP datagrams and provides at-most-once semantics.
Processes
Coda maintains a clear distinction between client and server processes. Clients are represented by Venus processes; servers appear as Vice processes. Both types of processes are internally organized as a collection of concurrent threads. Threads in Coda are nonpreemptive and operate entirely in user space. To allow continuous operation in the face of blocking I/O requests, a separate thread is used to handle all I/O operations, which it implements using low-level asynchronous I/O operations of the underlying operating system. This thread effectively emulates synchronous I/O without blocking an entire process.
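The idea of nonpreemptive, user-space threads can be illustrated with a small sketch. It is not Coda's thread package: Python generators stand in for threads, and the names worker and run_round_robin are hypothetical. A thread keeps the processor until it reaches an explicit yield point, so nothing is interrupted in between.

```python
from collections import deque


def worker(name, steps, log):
    """A cooperative thread: it runs until it voluntarily yields."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # explicit yield point; no preemption happens before this


def run_round_robin(threads):
    """Run generator-based threads in turn until all have finished."""
    ready = deque(threads)
    while ready:
        thread = ready.popleft()
        try:
            next(thread)  # resume the thread up to its next yield
        except StopIteration:
            continue      # thread finished; do not reschedule it
        ready.append(thread)


log = []
run_round_robin([worker("A", 2, log), worker("B", 3, log)])
print(log)  # ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```

In such a scheme a single blocking system call would stall every thread, which is why the text above describes a dedicated I/O thread built on asynchronous operations.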
Communication
Interprocess communication in Coda is performed using RPCs. However, the RPC2 system for Coda is much more sophisticated than traditional RPC systems such as ONC RPC, which is used by NFS. RPC2 offers reliable RPCs on top of the (unreliable) UDP protocol. Each time a remote procedure is called, the RPC2 client code starts a new thread that sends an invocation request to the server and subsequently blocks until it receives an answer. As request processing may take an arbitrary time to complete, the server regularly sends back messages to the client to let it know it is still working on the request. If the server dies, sooner or later this thread will notice that the messages have ceased and report back failure to the calling application.
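The failure-detection behavior described above can be modeled with a short deterministic sketch. It does not use RPC2's actual message format or API: the message kinds "BUSY" and "REPLY", the function await_reply, and the silence_limit parameter are all assumptions made for illustration. Message arrival times are supplied as data so that no real network or timer is needed.

```python
def await_reply(events, silence_limit):
    """Simulate a client thread blocked on an outstanding call.

    events: list of (arrival_time, kind, payload), sorted by time, where
            kind is "BUSY" (server still working) or "REPLY" (final answer).
    Returns ("ok", payload), or ("failed", time_at_which_client_gave_up).
    """
    last_heard = 0.0  # the request is sent at time 0
    for arrival, kind, payload in events:
        if arrival - last_heard > silence_limit:
            # The server was silent for too long before this message.
            return ("failed", last_heard + silence_limit)
        last_heard = arrival
        if kind == "REPLY":
            return ("ok", payload)
    # No more messages: the server stopped sending without answering.
    return ("failed", last_heard + silence_limit)


healthy = [(1.0, "BUSY", None), (2.0, "BUSY", None), (2.5, "REPLY", "data")]
crashed = [(1.0, "BUSY", None)]

print(await_reply(healthy, silence_limit=1.5))  # ('ok', 'data')
print(await_reply(crashed, silence_limit=1.5))  # ('failed', 2.5)
```

Because each "still working" message resets the silence window, a slow but live server is never mistaken for a dead one, regardless of how long the request takes.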
RPC2 also supports side effects, a mechanism by which the client and the server can communicate using an application-specific protocol. For example, when a client opens a file at a video server, RPC2 allows the client and the server to set up a separate connection for transferring the video data to the client on time. Connection setup is done as a side effect of an RPC call to the server. For this purpose, the RPC2 runtime system provides an interface of side-effect routines that is to be implemented by the application developer. For example, there are routines for setting up a connection and routines for transferring data. These routines are automatically called by the RPC2 runtime system at the client and server, respectively, but their implementation is otherwise completely independent of RPC2.
Naming
As mentioned, Coda maintains a naming system analogous to that of UNIX. Files are grouped into units referred to as volumes. A volume is similar to a UNIX disk partition (i.e., an actual file system), but generally has a much smaller granularity. It corresponds to a partial subtree in the shared name space as maintained by the Vice servers. Usually a volume corresponds to a collection of files associated with a user. Other examples of volumes include collections of shared binary or source files. Like disk partitions, volumes can be mounted.
Volumes are important for two reasons. First, they form the basic unit by which the entire name space is constructed. This construction takes place by mounting volumes at mount points. A mount point in Coda is a leaf node of a volume that refers to the root node of another volume. Second, volumes form the unit for server-side replication.
Fault Tolerance
Coda has been designed for high availability, which is mainly reflected by its sophisticated support for client-side caching and its support for server replication. We have discussed both in the preceding sections. An interesting aspect of Coda that needs further explanation is how a client can continue to operate while being disconnected, even if disconnection lasts for hours or days.
Conclusion: Hence we have studied NFS and Coda.