CSCI.6500 Distributed Computing over the Internet

#### Programming Assignment 2.

This assignment is to be done either individually or in pairs. Do not show your code to any other group and do not look at any other group's code. Do not put your code in a public directory or otherwise make it public. While you may get help directly from the instructor, you are encouraged to use the WebCT Discussions page to post problems so that other students can also answer and see the answers.

A Consistent-Hashing Fault-Tolerant Distributed Hashtable

The goal of this assignment is to implement using Oz, a fault-tolerant distributed hashtable such as Chord, capable of dealing with dynamic node additions, sequential node failures, and concurrent addition, update, and removal of key/value pairs.

Part 1 (50%): A Distributed Hashtable

You are to create a node program which takes three parameters as arguments: the logarithm of the maximum number of nodes in the hashtable (from here on n), the node's id, and the id of a node to connect to in the distributed hashtable. If there is no node to connect to, this is the initial node of the distributed hashtable. Each node must service the following messages:

• add(key, value), which adds a key/value pair to the distributed hashtable.
• get(key), which gets the value for a given key in the distributed hashtable.
• remove(key), which removes the key/value pair from the distributed hashtable.
• update(key, value), which updates the key/value pair in the distributed hashtable.

The get method must operate in logarithmic space and time complexity. That is, the finger table must be of size n and the number of hops to satisfy a lookup must grow proportional to n.

You can assume keys are long numbers, and you can use the mod 2n hashing function to get a ring identifier in the proper range.

Additionally, write programs to Add, Get, Remove, and Update elements in the hashtable. These programs should take a node's identifier and the corresponding parameters and send the appropriate message to the node in the distributed hashtable and return the result if there is one.

Using the distributed hashtable should look like the following:
--> node 8 1
--> node 8 136 1
--> node 8 35 1
Added "hello" to hashtable with key 291, hash 35, node 35.
--> node 8 99 136
Added "world" to hashtable with key 15, hash 15, node 35.
--> node 8 32 35
--> get 35 15
Retrieved "world" from the hashtable.
--> get 136 291
Retrieved "hello" from the hashtable.
--> remove 136 291
Removed "hello" with key 291 from the hashtable.
--> get 1 291
No key/value pair for key 291 was found in the hashtable.
--> update 99 15 "hello"
Updated value for key 15 to "hello".
--> get 35 15
Retrieved "hello" from the hashtable.

Part 2 (50%): A Soft Fault-Tolerant Distributed Hashtable

When a key/value pair is added to the hashtable, it must be replicated once at another node in the hashtable. In addition, each node must be able to accept a failure message. When a node receives a failure message it must remove itself from the distributed hashtable and then the distributed hashtable must replicate all key/value pairs lost by the failure of that node (by using the replicated key/value pairs). You can assume that failure messages will not happen concurrently, i.e., another failure message will not occur until the distributed hashtable has stabilized (replication of key/value pairs and restoration of finger tables have occurred).

Extra Credit Options (10% bonus if working individually, at least one required for full credit if working in a pair):

Option 1:
Instead of duplicating key/value pairs as in part 2, your hashtable must accept a replication parameter r. When a key/value pair is added to the hashtable, it is replicated r times. Include the three following graphs and a short discussion of these results.

• A graph of lookup time vs r, for n = 10, 100 and r = 1..5;
• A graph of addition time vs r, for n = 10, 100 and r = 1..5;
• A graph of failure recovery time vs r, for n = 10, 100 and r = 1..5;

Option 2:
Your distributed hashtable must be able to handle (non-concurrent) hard failures of nodes. A node could be terminated using Ctrl-C or a power off of the machine hosting the node and the distributed hashtable must be able to recover and still accept add/get/update/remove messages.

Due Date:
 Received Time Grade Modification before Tuesday, March 21st, 11:59PM +5% Wednesday, March 22nd, from 12:00AM to 11:59PM no modification (on time) Thursday, March 23rd, from 12:00AM to 11:59PM -10% from Friday, March 24th, 12:00AM to Saturday, March 25th, 11:59PM -25% after Sunday, March 26th, 12:00AM not accepted