Instructor: Stacy Patterson email@example.com
Office Hours: T W 3pm - 4pm or by appointment
TA: Lorson Blair firstname.lastname@example.org
TA Office Hours: M 3pm - 4pm, W 1pm - 3pm
Lectures: T F 12:20pm - 2:10pm
Lectures will be delivered via WebEx Meetings during the scheduled lecture time. They will also be recorded, and the links to the recorded lectures will be posted in Submitty.
Office hours are managed through the Submitty Office Hours Queue. The access code is dsa.
This course explores the principles of distributed systems, emphasizing fundamental issues underlying the design of such systems: communication, coordination, synchronization, and fault-tolerance. We will study key algorithms and theoretical results and explore how these foundations play out in modern systems and applications like cloud computing, edge computing, and peer-to-peer systems.
The dates of the exams are listed below. Lecture is cancelled on the exam dates.
The first three projects will be evaluated using Submitty's autograding feature for networked applications. Submitty uses Docker to deploy and test your application. It is not necessary for you to use Docker to test your code, but it is a good idea.
Below are instructions for configuring and using Docker, as well as instructions for replicating the Submitty test environment on your own machine. See the Project 1 description for more information on how to configure your projects.
- Basic instructions for using Docker for this class are here.
- Instructions for managing a Docker Network are here.
- Submitty provides a tool to create Docker containers and the Docker Network, and to deploy your code to these containers. The tool and instructions are available here.
The solution_directory should be your bin directory (after running build.sh).
- Example knownhosts.json file
- Example projects with build.sh and run.sh scripts: Python, Java
Project 1 - due 9/20/20 at 11pm
Project 2 - due 10/18/20 at 11pm (originally 10/12/20)
Project 3 - due 11/15/20 at 11pm
- Project Description (updated 10/23/20)
- Public Autograding Test Cases (updated 10/27/20)
- We have created a new Docker image with Java 11. Java 8 is no longer included. This image also has Python 3 and Go. Please use the updated image for your testing:
docker pull submittyrpi/csci4510:java-11
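After pulling the image, it may help to confirm that the toolchains listed above respond inside a container. The following is a minimal sketch, not part of the official course instructions. It assumes that java, python3, and go are on the container's PATH and that the image allows a command to be passed to docker run. The function name check_toolchain and the DOCKER override variable are hypothetical helpers introduced here for illustration.

```shell
#!/bin/sh
# Image tag taken from the announcement above.
IMAGE="submittyrpi/csci4510:java-11"

# Hypothetical override: set DOCKER=echo to preview the commands
# without actually starting any containers.
DOCKER="${DOCKER:-docker}"

# check_toolchain: run each toolchain's version command in a
# throwaway container (--rm removes the container on exit).
# Assumption: java, python3, and go are on the container's PATH.
check_toolchain() {
    "$DOCKER" run --rm "$IMAGE" java -version &&
    "$DOCKER" run --rm "$IMAGE" python3 --version &&
    "$DOCKER" run --rm "$IMAGE" go version
}
```

Calling check_toolchain after sourcing the script runs the three checks in order and stops at the first failure, which indicates which toolchain (if any) is missing or named differently in the image.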
Project 4 - due 12/11/20 at 11pm
Some papers are behind a paywall and can only be accessed from the RPI network.
- (9/4/20) Time, clocks, and the ordering of events in a distributed system, Leslie Lamport, Communications of the ACM, 1978.
- (9/4/20) Virtual time and global states of distributed systems, Friedemann Mattern, Parallel and Distributed Algorithms, 1989.
The version of the algorithm presented in class can be found in Section 7.
- (9/11/20) Coulouris et al., Section 14.3.
- (9/15-22/20) Efficient solutions to the replicated log and dictionary problems, Gene T.J. Wuu and Arthur J. Bernstein, Principles of Distributed Computing, 1984.
- (9/22/20) Intro to Mutual Exclusion. Coulouris, Sec. 15.2.
- (9/25/20) An optimal algorithm for mutual exclusion in computer networks, Glenn Ricart and Ashok K. Agrawala, Communications of the ACM, 1981.
Also see pages 561-562 in the first paper by L. Lamport
- (9/29/20) A tree-based algorithm for distributed mutual exclusion, Kerry Raymond, ACM Transactions on Computer Systems, 1989.
- (10/1/19) A √N algorithm for mutual exclusion in decentralized systems, Mamoru Maekawa, ACM Transactions on Computer Systems, 1985.
The Information Structure of Distributed Mutual Exclusion Algorithms, Beverly Sanders, ACM Transactions on Computer Systems, 1987
Note: The YIELD message in Sanders' paper corresponds to the RELINQUISH message in Maekawa's paper.
- (10/9/20) Concurrency Control and Recovery in Database Systems, Bernstein, Hadzilacos, and Goodman (2PC and 3PC in Chapter 7)
- (10/13/20) Paxos Made Simple, L. Lamport, ACM SIGACT News, 2001.
The Part-Time Parliament, L. Lamport, ACM Transactions on Computer Systems, 1998.
- (10/30/20) Elections in a Distributed Computing System, H. Garcia-Molina, IEEE Transactions on Computers, 1982.
- (11/3/20) Impossibility of Distributed Consensus with One Faulty Process, M. Fischer, N. Lynch, and M. Paterson, Journal of the ACM, 1985.
- (11/10/20) The Byzantine Generals Problem, L. Lamport, R. Shostak, and M. Pease, ACM Transactions on Programming Languages and Systems, 1982.
- (11/13/20) Authenticated Algorithms for Byzantine Agreement, D. Dolev and H. R. Strong, 1982
Also see Ch. 3 in Foundations of Distributed Consensus and Blockchains, E. Shi.
- (11/20/20) Distributed snapshots: determining global states of distributed systems, K. Chandy and L. Lamport, ACM Transactions on Computer Systems, 1985.
(also Coulouris Sec. 14.5)
- (12/4/20) Perspectives on the CAP Theorem, S. Gilbert and N. Lynch, Computer, 2012
- (12/4/20) Dynamo: Amazon's Highly Available Key-value Store, G. DeCandia et al., Symposium on Operating Systems Principles, 2007.