CSCI-4210 Operating Systems Fundamentals
Fall, 1998
Extra Credit Programming Assignment 2
This project is worth 8 extra credits, which will be added to your total.

This assignment will test the extent to which run time can be improved by using threads to perform a number of complex I/O operations. Your program should read a file containing a number of WWW URLs and create client sockets to download all of these documents to the current directory.

Your program should be able to operate in two different modes, parallel and serial. In parallel mode, a separate thread should be created to download each document; in serial mode, your program should download the documents one after another, as a traditional program would. The user indicates which mode to use by passing either the -p flag or the -s flag. Exactly one of these flags must be passed as the first argument when the program is run, and the name of the file containing the URLs must be passed as the second argument.

To download a document from the web, create a stream socket and connect to port 80 of the host. Then issue a GET command. In its most rudimentary form, this is the word GET followed by the path name of the document, followed by HTTP/1.0, followed by two newline characters. For example, for this document:

http://www.cs.rpi.edu/index.html
connect your client to port 80 on www.cs.rpi.edu and then send the following request:

GET /index.html HTTP/1.0 
followed by two newline characters.

In order to test how much threads speed up run time, you need to time each operation. To do this, use the function gethrtime(), declared as follows:
#include <sys/time.h>
hrtime_t gethrtime(void);
DESCRIPTION
     gethrtime() returns the current high-resolution  real  time.
     Time  is  expressed as nanoseconds since some arbitrary time
     in the past; it is not correlated in any way to the time  of
     day,  and  thus  is not subject to resetting, drifting, etc.
     gethrtime() returns an hrtime_t,  which is a 64-bit (long long) 
     signed integer.
EXAMPLE
The following code fragment measures  the  average  cost  of
getpid(2):

     hrtime_t start, end;
     int i, iters = 100;

     start = gethrtime();
     for (i = 0; i < iters; i++)
          getpid();
     end = gethrtime();

     printf("Avg getpid() time = %lld nsec\n", (end - start) / iters); 

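gethrtime() is specific to Solaris. To try the same start/end timing pattern on another POSIX system, the following minimal sketch may help. It assumes that clock_gettime() with CLOCK_MONOTONIC is an acceptable stand-in, since it also counts from an arbitrary point in the past. The names now_ns(), nap_ms() and time_call() are hypothetical helpers invented for this illustration and are not part of the assignment; nap_ms() merely sleeps in place of a real download.

```c
#define _POSIX_C_SOURCE 200809L  /* expose clock_gettime() and nanosleep() */
#include <assert.h>
#include <time.h>

/* Hypothetical stand-in for gethrtime(): nanoseconds since an arbitrary
 * point in the past, read from the monotonic clock. */
static long long now_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Placeholder operation: sleep for *arg milliseconds instead of downloading. */
static void nap_ms(void *arg)
{
    int ms = *(int *)arg;
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Time a single call of op(arg) and return the elapsed nanoseconds. */
static long long time_call(void (*op)(void *), void *arg)
{
    long long start = now_ns();
    op(arg);
    return now_ns() - start;
}
```

A caller could wrap each download in time_call() and print the result in seconds with something like printf("%.3f sec\n", ns / 1e9). On Solaris, the body of now_ns() can simply return gethrtime().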
Your program should print out the time that it takes to download each document, the total number of bytes downloaded for each document, and the total run time of the program. In the serial case, the total run time of the program should be greater than the sum of all of the download times, while in the parallel case it should be somewhat less.

A sample input file might look like this:
http://www.cs.rpi.edu/courses/index.html
http://www.w3.org/News/Archive.html
http://www.freebsd.org/index.html
http://www.yahoo.com/Computers/index.html
http://www.caldera.com/LDP/LDP/tlk/tlk-toc.html
http://www.linux.org/index.html
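
Each line of such a file has to be split into a host name (to connect to on port 80) and a path (to place in the GET request). The following is a minimal sketch of that step, not a required design. The function names split_url() and build_request() and the buffer-size conventions are assumptions made for illustration, and only plain http:// URLs without an explicit port are handled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Split "http://host/path" into host and path.
 * Returns 0 on success, -1 if the URL does not start with http://
 * or if one of the output buffers is too small. */
static int split_url(const char *url, char *host, size_t hostlen,
                     char *path, size_t pathlen)
{
    const char *prefix = "http://";
    size_t plen = strlen(prefix);

    assert(url && host && path);
    if (strncmp(url, prefix, plen) != 0)
        return -1;

    const char *h = url + plen;               /* start of the host name */
    const char *slash = strchr(h, '/');       /* start of the path, if any */
    size_t hl = slash ? (size_t)(slash - h) : strlen(h);
    if (hl == 0 || hl >= hostlen)
        return -1;
    memcpy(host, h, hl);
    host[hl] = '\0';

    /* A URL with no path component refers to the root document. */
    const char *p = slash ? slash : "/";
    if (strlen(p) >= pathlen)
        return -1;
    strcpy(path, p);
    return 0;
}

/* Format the rudimentary request described above:
 * GET <path> HTTP/1.0 followed by two newline characters.
 * Returns the request length, or -1 if it does not fit in buf. */
static int build_request(const char *path, char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen, "GET %s HTTP/1.0\n\n", path);
    return (n < 0 || (size_t)n >= buflen) ? -1 : n;
}
```

For http://www.cs.rpi.edu/index.html this yields the host www.cs.rpi.edu and the path /index.html. Lines read with fgets() keep their trailing newline, which should be stripped before calling split_url(). The HTTP specification defines line endings as a carriage return followed by a line feed; the sketch uses plain newlines because that is what this handout asks for.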
Since this sample file contains six documents, in the parallel case your program should create six threads, one for each document. The main thread of your program should then issue a join for each thread. In the serial case, the program would simply download each of the six documents in order.

Some documents might hang for extended periods of time. If a document takes more than 20 seconds to download, the download of that document should be aborted and a suitable error message displayed.

You can choose to write the program in Java, C, or C++. Note that this project can only be done on a computer which supports threads, i.e. Solaris. This project is due at midnight on December 4.
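
The create-then-join structure described above for parallel mode can be sketched as follows. This is only an outline under stated assumptions: it uses POSIX threads (pthread_create() and pthread_join()), which is one of several thread interfaces available in C. The struct job type, the MAX_DOCS limit and the function names are hypothetical, and worker() stores a placeholder byte count instead of performing a real download.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define MAX_DOCS 64            /* hypothetical upper bound on URLs per file */

/* One unit of work. A real program would also carry the URL and timing here. */
struct job {
    int id;                    /* index of the document in the input file */
    long bytes;                /* filled in by the worker */
};

/* Thread entry point. Placeholder for the real download routine. */
static void *worker(void *arg)
{
    struct job *j = arg;
    j->bytes = 1000L * (j->id + 1);   /* made-up result, not a real download */
    return NULL;
}

/* Parallel mode: start one thread per job, then join every started thread.
 * Returns 0 on success, -1 if any create or join call failed. */
static int run_parallel(struct job *jobs, int n)
{
    pthread_t tid[MAX_DOCS];
    int started = 0, rc = 0;

    assert(n >= 0 && n <= MAX_DOCS);
    for (int i = 0; i < n; i++) {
        if (pthread_create(&tid[i], NULL, worker, &jobs[i]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }
    /* The main thread waits here until each worker has finished. */
    for (int i = 0; i < started; i++)
        if (pthread_join(tid[i], NULL) != 0)
            rc = -1;
    return rc;
}

/* Serial mode: perform the same work, in order, on the calling thread. */
static void run_serial(struct job *jobs, int n)
{
    for (int i = 0; i < n; i++)
        worker(&jobs[i]);
}
```

Because each thread writes only to its own struct job, no locking is needed in this sketch. Programs using POSIX threads usually have to be linked against the thread library; the exact compiler flag depends on the system.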