CSCI.6962 Distributed Computing over the Internet

Spring, 2003

Programming Assignment 1.

Do not show your code to any other group and do not look at any other group's code. Do not put your code in a public directory or otherwise make it public. However, you may get all the help you need from the TA or the instructor. You are encouraged to use the WebCT Discussions page to post problems so that other students can also see the answers. This project is to be done either individually or in pairs. If done in a pair, then at least one extension is required for a perfect grade.

The goal of this assignment is to create a distributed Web search engine using the Java RMI framework.

The search engine should be started with an initial URL and a maximum search depth. The engine should fetch the initial URL, index the word occurrences in that document, and produce a list of URLs to fetch in the next iteration. It must keep indexing documents until the maximum search depth is reached.
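
The crawl loop described above can be sketched as follows. This is only a minimal illustration, not a required design. The class name `DepthLimitedCrawler`, the `fetcher` parameter (a function from a URL to its HTML, or null on failure), and the convention that depth 0 means "index only the initial URL" are all assumptions. Links are found with a simple regular expression, and relative URLs are not resolved. The sketch uses lambdas, so it assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: level-by-level crawl up to a maximum depth.
class DepthLimitedCrawler {
    private static final Pattern LINK =
            Pattern.compile("href=\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);
    private static final Pattern TAG = Pattern.compile("<[^>]*>");
    private static final Pattern WORD = Pattern.compile("[A-Za-z0-9]+");

    // word -> (URL -> number of occurrences of the word in that document)
    final Map<String, Map<String, Integer>> index = new HashMap<>();

    // Assumption: depth 0 indexes only startUrl, depth 1 also its links, etc.
    void crawl(String startUrl, int maxDepth, Function<String, String> fetcher) {
        Set<String> seen = new HashSet<>();
        List<String> frontier = new ArrayList<>();
        frontier.add(startUrl);
        seen.add(startUrl);
        for (int depth = 0; depth <= maxDepth && !frontier.isEmpty(); depth++) {
            List<String> next = new ArrayList<>();
            for (String url : frontier) {
                String html = fetcher.apply(url);
                if (html == null) {
                    continue; // fetch failed; skip this document
                }
                indexDocument(url, html);
                Matcher m = LINK.matcher(html);
                while (m.find()) {
                    String link = m.group(1);
                    if (seen.add(link)) { // true only for URLs not seen before
                        next.add(link);
                    }
                }
            }
            frontier = next; // URLs to fetch in the next iteration
        }
    }

    // Strips tags, then counts each lower-cased word for this URL.
    void indexDocument(String url, String html) {
        String text = TAG.matcher(html).replaceAll(" ");
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            String word = m.group().toLowerCase();
            index.computeIfAbsent(word, k -> new HashMap<>())
                 .merge(url, 1, Integer::sum);
        }
    }
}
```

Passing the fetcher in as a function keeps the loop testable without network access. A real engine would supply a function that opens the URL and reads the response body.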

To distribute the indexing and querying, you need to create a meta-engine. The meta-engine is started with the locations of two or more search engines (as defined above) and it is able to combine their query results into a single result. When you query the meta-engine, the meta-engine queries the search engines that it represents using RMI and combines their results into a single result. A client of the meta-engine should not be able to tell the difference between a meta-engine and a normal search engine (the interface should be the same).
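
One way to give the meta-engine the same interface as a normal search engine is to have both implement a single RMI remote interface. The sketch below is illustrative only. The names `SearchEngine`, `Hit`, and `MetaEngine` are hypothetical, and so is the merge policy: when two engines report the same URL, the larger count is kept. Results are assumed to be ordered with the most occurrences first.

```java
import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// One query result; Serializable so RMI can pass it by value.
class Hit implements Serializable {
    private static final long serialVersionUID = 1L;
    final String url;
    final int count;

    Hit(String url, int count) {
        this.url = url;
        this.count = count;
    }
}

// Hypothetical remote interface shared by engines and meta-engines.
interface SearchEngine extends Remote {
    List<Hit> query(String word) throws RemoteException;
}

// A meta-engine is itself a SearchEngine, so clients cannot tell it apart.
class MetaEngine implements SearchEngine {
    private final List<SearchEngine> engines;

    // The engines would normally be RMI stubs looked up in a registry.
    MetaEngine(List<SearchEngine> engines) {
        this.engines = new ArrayList<>(engines);
    }

    @Override
    public List<Hit> query(String word) throws RemoteException {
        // Assumed policy: a URL reported by several engines keeps its
        // largest count instead of being listed more than once.
        Map<String, Integer> best = new HashMap<>();
        for (SearchEngine engine : engines) {
            for (Hit hit : engine.query(word)) {
                best.merge(hit.url, hit.count, Integer::max);
            }
        }
        List<Hit> merged = new ArrayList<>();
        for (Map.Entry<String, Integer> e : best.entrySet()) {
            merged.add(new Hit(e.getKey(), e.getValue()));
        }
        // Most occurrences first; ties broken by URL for a stable order.
        merged.sort((a, b) -> {
            int byCount = Integer.compare(b.count, a.count);
            return byCount != 0 ? byCount : a.url.compareTo(b.url);
        });
        return merged;
    }
}
```

To publish either kind of engine, the object could be exported with `UnicastRemoteObject.exportObject` and bound in an RMI registry. The meta-engine would then obtain its engine stubs with `Naming.lookup`. Because a meta-engine is itself a `SearchEngine`, meta-engines can also be nested.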

Your search engine can be queried at any point, even while the index is still being built. A query contains a single word. The engine should return the URLs of the documents that contain the word, sorted by the number of occurrences of the word in each document. A search engine client should display the results as HTML, or in some similarly readable form.
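
Because queries may arrive while the crawler is still adding to the index, the index structure must tolerate concurrent reads and writes. The sketch below shows one possible approach using `ConcurrentHashMap`. The class name `ConcurrentIndex` and its methods are hypothetical, and the descending sort order (most occurrences first, ties broken by URL) is an assumption.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: an index that can be queried while it is being built.
class ConcurrentIndex {
    // word -> (URL -> occurrences); both levels are safe for concurrent use.
    private final ConcurrentHashMap<String, ConcurrentHashMap<String, Integer>> postings =
            new ConcurrentHashMap<>();

    // Called by the crawler thread for every word occurrence it finds.
    void addOccurrence(String word, String url) {
        postings.computeIfAbsent(word.toLowerCase(), k -> new ConcurrentHashMap<>())
                .merge(url, 1, Integer::sum);
    }

    // Called by query threads; returns URLs, most occurrences first.
    List<String> query(String word) {
        Map<String, Integer> counts = postings.get(word.toLowerCase());
        if (counts == null) {
            return Collections.emptyList();
        }
        // Copy the entries first so counts cannot change during the sort.
        List<Map.Entry<String, Integer>> snapshot = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            snapshot.add(new AbstractMap.SimpleImmutableEntry<String, Integer>(
                    e.getKey(), e.getValue()));
        }
        snapshot.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> urls = new ArrayList<>();
        for (Map.Entry<String, Integer> e : snapshot) {
            urls.add(e.getKey());
        }
        return urls;
    }
}
```

A query issued mid-crawl reflects whatever has been indexed so far, so repeating it later may return more URLs or a different order.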

Possible Extensions:

More information about Java can be found at Sun's Java Web Site, while more information on RMI can be found at Sun's RMI information page.


The due date for this project is March 3rd, 2003, 11:55 pm EST. Submit your work through the assignments drop-off box on the course's WebCT page. Upload a JAR file containing all the relevant, documented files, along with a README file that gives instructions for running the program and explains any design decisions you made.