Stanford University
Computer Science 244b: Spring 2009
Assignment #2: Distributed Replicated Files
Assigned: Wednesday May 6, 2009
Due: Thursday, May 28, 2009 at 23:59
The goal of this assignment is to implement client and server prototypes
for a distributed file system in which the files are replicated. The
purpose of this assignment is to explore a service-specific protocol,
relying on transactions for reliable delivery rather than conventional
transport techniques.
For this assignment, you will be working individually.
Machine Compatibility
Your program must run on the Linux
machines (pod*.stanford.edu) in the Terman Engineering Computer
Cluster. You may use multicast packets to disseminate state as with
the previous Mazewar assignment. You are not allowed to use physical
broadcast packets to disseminate state. You can re-use the multicast
setup code from your first assignment. The multicast group address is
the same as in the first assignment, namely 0xe0010101.
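As a rough illustration only (it is not the setup code from the first assignment), the group address 0xe0010101 corresponds to 224.1.1.1, and joining it on a UDP socket might look like the sketch below. The names REPLFS_GROUP, replfs_group_string() and replfs_open_multicast() are made up for this example.

```c
#define _DEFAULT_SOURCE /* expose struct ip_mreq under strict -std=c99 */
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define REPLFS_GROUP 0xe0010101u /* group address given in the assignment */

/* Write the dotted form of the group address into out; NULL on failure. */
const char *replfs_group_string(char *out, size_t len)
{
    struct in_addr group;
    assert(out != NULL);
    group.s_addr = htonl(REPLFS_GROUP);
    return inet_ntop(AF_INET, &group, out, (socklen_t)len);
}

/* Open a UDP socket bound to port and joined to the multicast group.
 * Returns the socket descriptor, or -1 on any failure. */
int replfs_open_multicast(unsigned short port)
{
    struct sockaddr_in addr;
    struct ip_mreq mreq;
    int reuse = 1;
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0)
        return -1;

    /* Let several processes on one host share the port while testing. */
    if (setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof reuse) < 0)
        goto fail;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(sock, (struct sockaddr *)&addr, sizeof addr) < 0)
        goto fail;

    /* Ask the kernel to deliver packets sent to the group address. */
    mreq.imr_multiaddr.s_addr = htonl(REPLFS_GROUP);
    mreq.imr_interface.s_addr = htonl(INADDR_ANY);
    if (setsockopt(sock, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof mreq) < 0)
        goto fail;

    return sock;

fail:
    close(sock);
    return -1;
}
```

Packets are then sent with sendto() to the group address on the same port; no physical broadcast is involved.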
Structure, Interfaces, and Implementation
Your goal is to provide us with the following:
- A client side library
libclientReplFs.a that communicates
with the servers. It handles accesses from the application and propagates
them to the servers. The application interface that the client library
has to support is specified later in this specification.
- The server executable called
replFsServer.
Each server maintains a copy of the replicated file and keeps track of
all updates made to that file.
- A makefile that produces the above things given your submission.
We are providing you with:
- The interface to be supported by your client API.
- A few simple restrictions on how the server & client executables
behave (e.g. required parameters).
- A sample client application. The application uses the client interface
to make write calls to a replicated file. This code links
with your client library and runs on the client host thereby
providing a basic test case for your client and server implementation.
Note that this does not necessarily include all of the programs we
will use to test your system; it is a starting point only.
- Skeleton client side code. You may use this code and fill in
the missing parts or write your own client side code from scratch.
- A sample makefile that builds the client side library
libclientReplFs.a and links the sample application code.
You are free to extend the given makefile and add
replFsServer generation to it or to write your own
makefile.
The framework code we provide is in /usr/class/cs244b/replFs.
Applications which wish to use the distributed filesystem link to
the client-side library and
use the client-side interface to make write calls to a replicated
file. Of course, to this application, the replication and use of the
network must be entirely transparent.
Your most important tasks are to design and implement the
client/server protocol.
Required Client API (Application Program Interface)
The application interface to the client MUST include the following:
int InitReplFs(int portNum,int packetLoss);
int AddServer(char *id);
int OpenFile(char *name);
int WriteBlock(int fd, void *buffer, int byteOffset, int blockSize);
int Commit(int fd);
int Abort(int fd); (Use int instead of void)
int CloseFile(int fd);
Conspicuously absent from this API is a ReadBlock() call.
Reading is not required for this assignment.
- InitReplFs() gives your filesystem a chance to perform any
startup tasks and informs your system which UDP port # it is to use,
as well as the percentage (of 100) of packets that should be
randomly dropped.
Your system does not have to function correctly if InitReplFs()
is not called prior to using the other calls.
- AddServer() tells the filesystem to add another
server to its list of participating servers. The character string
passed to this routine may contain a dotted IP address or a
hostname (your code must handle both). This call should never fail.
You can assume that all AddServer() calls are made
before the OpenFile(). You don't need to take into
consideration server addition attempts while a file is being modified.
- OpenFile() takes the name of a file and returns a
file id or -1 on failure.
- WriteBlock() returns the number of bytes written.
- Commit() takes a file id and returns 0 if the changes made
to the file were committed and -1 otherwise.
- Abort() takes a file descriptor and discards
all changes since the last commit.
Abort() never fails.
- CloseFile() relinquishes all control that the client had over
the file. CloseFile() commits all changes that have not
been saved.
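The packetLoss parameter of InitReplFs() could be honored with a small helper such as the sketch below. The names set_packet_loss() and should_drop_packet() are hypothetical, and whether the drop is applied on send, on receive, or both is a design choice this handout does not specify.

```c
#include <assert.h>
#include <stdlib.h>

static int g_packet_loss = 0; /* percentage of packets to drop, 0..100 */

/* Record the drop percentage passed to InitReplFs(). */
void set_packet_loss(int percent)
{
    assert(percent >= 0 && percent <= 100);
    g_packet_loss = percent;
}

/* Return 1 if the next packet should be silently dropped, 0 otherwise.
 * rand() % 100 lies in 0..99, so 0 never drops and 100 always drops. */
int should_drop_packet(void)
{
    return (rand() % 100) < g_packet_loss;
}
```

A receive loop might call should_drop_packet() right after recvfrom() and simply ignore the packet when it returns 1.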
Protocol Skeleton
In providing the API listed above, the library code at the client and
the file servers conspire to provide distributed replicated files.
While we have provided an outline of how you should do this, your
first job should be to flesh out this skeleton.
A brief outline of the steps involved with accessing a file:
- The client opens the file by multicasting to the servers and
collecting responses. All reachable & running servers must
be accounted for.
- The client multicasts writes across the net. Upon
receiving these writes, the server writes to a copy of the file. Note
specifically that there are no ACKs at this stage.
- If the application commits, two-phase commit is used to ensure
that all updates from the previous step were received. (If not
received, this information must somehow be retransmitted.) Upon
preparing to commit changes to a file, the client needs to identify
to the servers in some way the list of updates to commit,
making sure that every server has all of the changes before sending
a commit message. If all
of the servers return an "OK to commit" message, the client then
sends out a commit message. Alternatively, the client may send an
abort message, whereupon the servers revert the file to its previous
state.
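One possible way for a server to decide its vote in the prepare phase is sketched below. It assumes each write carries a sequence number starting at 0 within the transaction and that the client announces the total number of writes when it asks to commit. Both of these, along with the names WriteLog, writelog_mark() and writelog_missing(), are illustrative assumptions, not part of the required protocol.

```c
#include <assert.h>
#include <string.h>

#define MAX_UPDATES 128 /* the assignment's bound on updates between commits */

typedef struct {
    unsigned char received[MAX_UPDATES]; /* 1 if that write has arrived */
} WriteLog;

/* Forget everything, e.g. after a commit or an abort. */
void writelog_reset(WriteLog *log)
{
    memset(log->received, 0, sizeof log->received);
}

/* Record the arrival of write number seq; duplicates are harmless. */
void writelog_mark(WriteLog *log, int seq)
{
    assert(seq >= 0 && seq < MAX_UPDATES);
    log->received[seq] = 1;
}

/* Given the client's count of writes in this transaction, store the
 * sequence numbers that never arrived in missing[] and return how many
 * there are. A result of 0 means the server can vote "OK to commit". */
int writelog_missing(const WriteLog *log, int total, int missing[MAX_UPDATES])
{
    int n = 0;
    assert(total >= 0 && total <= MAX_UPDATES);
    for (int seq = 0; seq < total; seq++) {
        if (!log->received[seq])
            missing[n++] = seq;
    }
    return n;
}
```

When the result is nonzero, the server would reply with the missing numbers so the client can retransmit just those writes before asking again.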
Assignment Assumptions
For this assignment, you are allowed and required to assume the following:
System-Wide Details
- You must use C/C++. This is to help prevent linkage problems we have had in the past. We should be able to run 'make' to build the required parts (libclientReplFs.a, a library, and replFsServer, an executable) on the TECC systems, and you must export the client API correctly to an arbitrary C client program.
- All communication is on a single UDP port. For testing, you can use the port in the file /usr/class/cs244b/replFs/port_info. Make sure your code still works with the -port option and parameter.
- You may assume a maximum block write size of 512 bytes.
- You may assume a maximum of 128 updates between commits.
- You may assume a maximum file size of 1MB.
- You can use any form of flow control that works, including
inserting delays between packet transmissions to
minimize packet overruns that cause packet loss.
- All communication should be protected with respect to data
integrity with a checksum. Packets with bad checksums must be
discarded.
- You must handle request-response packet loss from the client
(requestor) end in such a way that the system will not
hang or fail if a packet is lost.
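The assignment does not prescribe a checksum algorithm. As one example, the sketch below uses a 16-bit one's-complement sum in the style of the Internet checksum. The packet layout (checksum in the first two bytes, in network byte order) and the names packet_checksum(), packet_seal() and packet_verify() are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 16-bit one's-complement sum over len bytes, complemented at the end. */
uint16_t packet_checksum(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t sum = 0;

    while (len > 1) {
        sum += (uint32_t)((p[0] << 8) | p[1]);
        p += 2;
        len -= 2;
    }
    if (len == 1) /* odd trailing byte is padded with zero */
        sum += (uint32_t)(p[0] << 8);
    while (sum >> 16) /* fold the carries back into the low 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Fill in bytes 0-1 of the packet with the checksum of the whole packet. */
void packet_seal(uint8_t *pkt, size_t len)
{
    uint16_t c;
    assert(pkt != NULL && len >= 2);
    pkt[0] = 0;
    pkt[1] = 0;
    c = packet_checksum(pkt, len);
    pkt[0] = (uint8_t)(c >> 8);
    pkt[1] = (uint8_t)(c & 0xff);
}

/* Return 1 if the packet is intact, 0 if it must be discarded.
 * Summing a sealed packet, checksum included, folds to 0xffff,
 * so the complemented result is 0 exactly when nothing changed. */
int packet_verify(const uint8_t *pkt, size_t len)
{
    if (pkt == NULL || len < 2)
        return 0;
    return packet_checksum(pkt, len) == 0;
}
```

A receiver would call packet_verify() on every datagram and drop it without replying when the result is 0; the sender's retransmission logic then treats it like any other lost packet.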
Server Details
- Servers handle a single client, single file, and single
pending transaction at a time.
- Your server executable must accept & require these parameters:
-port port#: UDP port # for fs communication. (so we can test it)
-mount filepath : place where committed local
copies are to be stored by the server.
- Your server should maintain replicated copies of committed versions
of files on each machine in the location specified by the -mount
parameter.
On server startup, if this directory already exists, the server
should die with a message "machine already in use". If such a
directory does not exist, it should create it. Do not remove the
mount directory on server termination! For example, if the server
is started with:
replFsServer -port 4137 -mount /folder1/fs244b
And a client opens file "jane.txt" using
OpenFile("jane.txt"), the server should
create the file: /folder1/fs244b/jane.txt.
Note: you must comply with this scheme. It will be used to grade
your system. Systems which do not comply will be considered
nonworking.
- To support rollback on abort, the server can make a copy/copies of
the file on open/commit. This file should not be in the
"mount" directory
with the committed files, but should be somewhere else on the disk.
You are allowed to create another directory to store these files. The /tmp directory might be a good place if you choose to go this route.
However, any user that runs the server program should be able to create these temporary files. These files must be cleaned up on close.
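The startup rule for the -mount directory and the path scheme for committed files might be implemented along the lines of the sketch below. The names claim_mount_dir() and committed_path() and the 0755 permission mode are assumptions made for this example.

```c
#define _DEFAULT_SOURCE /* expose mkdir() under strict -std=c99 */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Create the mount directory. Returns 0 on success and -1 if it already
 * exists or cannot be created. Calling mkdir() directly, rather than
 * checking for existence first, avoids a race between two servers. */
int claim_mount_dir(const char *mount)
{
    assert(mount != NULL);
    if (mkdir(mount, 0755) == 0)
        return 0;
    if (errno == EEXIST)
        fprintf(stderr, "machine already in use\n");
    else
        perror("mkdir");
    return -1;
}

/* Build "<mount>/<name>" in out. Returns 0 on success, -1 if the
 * result (including the terminating NUL) does not fit in len bytes. */
int committed_path(char *out, size_t len, const char *mount, const char *name)
{
    assert(out != NULL && mount != NULL && name != NULL);
    if (strlen(mount) + 1 + strlen(name) + 1 > len)
        return -1;
    snprintf(out, len, "%s/%s", mount, name);
    return 0;
}
```

With -mount /folder1/fs244b and OpenFile("jane.txt"), committed_path() yields /folder1/fs244b/jane.txt. The server should exit when claim_mount_dir() returns -1.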
Client Details
- Addresses or names of hosts will be specified to the client at
startup time
via the client API. Servers should communicate with any client
that contacts them.
- The client deals with a single open file at a time. If the client already
has a file open, the OpenFile() call should return
an error.
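Because servers may be given either as dotted IP addresses or as hostnames, the client library needs to turn each string into an address. A minimal sketch using getaddrinfo(), which accepts both forms, follows. The name resolve_server() is hypothetical, and the sketch handles IPv4 only.

```c
#define _DEFAULT_SOURCE /* expose getaddrinfo() under strict -std=c99 */
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Resolve id (a dotted IPv4 address or a hostname) into *out.
 * Returns 0 on success, -1 if the name cannot be resolved. */
int resolve_server(const char *id, struct in_addr *out)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;

    assert(id != NULL && out != NULL);
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;      /* IPv4 only in this sketch */
    hints.ai_socktype = SOCK_DGRAM; /* all traffic is UDP */

    if (getaddrinfo(id, NULL, &hints, &res) != 0)
        return -1;

    /* Take the first address returned. */
    *out = ((struct sockaddr_in *)res->ai_addr)->sin_addr;
    freeaddrinfo(res);
    return 0;
}
```

Since AddServer() should never fail, a library built on this sketch would need its own policy for names that do not resolve, for example recording the server as unreachable.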
Testing Criteria
We will be using a test app to test your filesystem. It will not be supplied
to you in advance. It will attempt to
stress your system in a number of ways.
We will run tests with varying transaction size, with 1-3 servers, etc.
Report
The writeup is intended to be an insightful explication and analysis
of your work. As always, we encourage you to employ point form in
your answers and focus on protocol issues. Please put the write-up in a
README file.
The following sections should be included:
- Protocol Specification:
Document your protocol by specifying packet formats, sequencing
and semantics of packets and protocol events.
- Evaluation:
Discuss the merits and disadvantages of this
approach to replication versus using conventional reliable transport.
- Future Directions: Discuss extensions, refinements, and
modifications to your protocol and implementation that would be
required for real deployment. An answer to this discussion
necessarily includes consideration of large scale systems and files.
What to turn in
Run the submit script /usr/class/cs244b/bin/submit from the
directory that contains all the source files, the makefile and the README file. The README can be in pdf form if you wish.
This is assignment/project #2.
Tentative Rubric
50% of your grade will be based on the tests run on your implementation. The other 50%
of your grade is based upon the report.