CSCI 4510/6510 - Distributed Systems and Algorithms - Fall 2019

General Information

Class Time and Place: T F 12:00pm - 1:50pm, Low 3051
Instructor: Stacy Patterson     sep AT cs.rpi.edu
Office Hours: T F 2:00pm - 3:00pm or by appointment, Lally 301
TA: Tim Castiglia     castit AT rpi.edu
TA Office Hours: W 1:00pm - 3:00pm, AE 127

Course Description

This course explores the principles of distributed systems, emphasizing fundamental issues underlying the design of such systems: communication, coordination, synchronization, and fault-tolerance. We will study key algorithms and theoretical results and explore how these foundations play out in modern systems and applications like cloud computing, edge computing, and peer-to-peer systems.

Course Syllabus
Intro Slides
Submitty course page

Prerequisites

  • CSCI-2300: Intro. to Algorithms
  • CSCI-4210: Operating Systems
  • Take-Home Quizzes

    Take-home quiz instructions

    In Class Quiz Dates

  • Quiz 1 - 9/13/19
  • Quiz 2 - 10/1/19
  • Quiz 3 - 10/18/19
  • Quiz 4 - 11/5/19
  • Quiz 5 - 11/22/19
  • Quiz 6 - 12/6/19 12/10/10 (quiz is optional)
  • Project Information

  • Project 0 - No due date
  • Your projects will evaluated on the CS department's servers using Docker containers. We will use Submitty for autograding of some test cases. Additional evaluation will be performed in live demonstrations of your code.

  • Project 1 description - due Sunday, Oct 20, 2019 at 11:00pm
  • Project 2 description - due Tuesday, Nov 26, 2019 at 11:00pm
  • Pseudocode and Resources

  • Wuu Bernstein Replicated Dictionary Algorithm
  • Raymond's Algorithm
  • Maekawa's deadlock-free algorithm
  • Video explaining Chandy-Lamport algorithm
  • Slides on Two Generals Problem
  • Synod Algorithm
  • Byzantine Generals Algorithm
  • Papers and Readings

    Some papers are behind a pay wall and can only be accessed from the RPI network.

  • (8/30/19) Time, clocks, and the ordering of events in a distributed system, Leslie Lamport, Communications of the ACM, 1978.
  • (9/6/19) The version of the algorithm presented in class can be found in Section 7 of Virtual time and global states of distributed systems, Friedemann Mattern, Parallel and Distributed Algorithms, 1989
    A paper that presents a similar algorithm is Timestamps in Message-Passing Systems That Preserve the Partial Ordering, Colin J. Fidge, 1987. The paper by Fidge gives a slightly different algorithm and uses a different method to compare vector timestamps.
  • (9/10/19) Coulouris, et al. Sections 14.3
  • (9/13/19) Efficient solutions to the replicated log and dictionary problems, Gene T.J. Wuu and Arthur R. Berntsein, Principles of Distributed Computing, 1984.
  • (9/13/19) Intro to Mutual Exclusion. Coulouris, Sec. 15.2.
  • (9/20/19) A optimal algorithm for mutual exclusion in computer networks, Glenn Ricart and Ashok K. Agrawala, Communications of the ACM, 1981.
    Also see pages 561-562 in the first paper by L. Lamport
  • (9/27/19) A tree-based algorithm for distributed mutual exclusion, Kerry Raymond, ACM Transactions on Computer Systems, 1989.
  • (10/1/19) A √ N Algorithm for mutual exclusion in decentralized systems, Mamoru Maekawa, ACM Transactions on Computer Systems, 1985.
    The Information Structure of Distributed Mutual Exclusion Algorithms, Beverly Sanders, ACM Transactions on Computer Systems, 1987
  • (10/4/19) Distributed snapshots: determining global states of distributed systems, K. Chandy and L. Lamport, ACM Transactions on Computer Systems, 1985.
    (also Coulouris Sec. 14.5)
  • (10/21/19) Paxos Made Simple, L. Lamport, ACM SIGACT News, 2001.
    The Part-Time Parliament, L. Lamport, ACM Transactions on Computer Systems, 1998.
  • (11/1/19) Concurrency Control and Recovery in Database Systems, Bernstein, Hadzilacos, and Goodman (2PC and 3PC in Chapter 7)
  • (11/8/19) Impossibility of Distributed Consensus with One Faulty Process, M. Fischer, N. Lynch, and M. Paterson, Journal of the ACM, 1985.
  • (11/15/19) The Byzantine Generals Problem, L. Lamport, R. Shostak, and M. Pease, ACM Transactions on Programming Languages and Systems, 1982.
  • (12/3/19) Perspectives on the CAP Theorem, S. Gilbert and N. Lynch, Computer, 2012
    Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services, S. Gilbert and N. Lynch, ACM SIGACT News, 2002.
  • (12/3/19) Chord: a scalable peer-to-peer lookup protocol for Internet applications, I. Stoica, et al., IEEE/ACM Transactions on Networking, 2003.
  • (12/6/19) Dynamo: Amazon's Highly Available Key-value Store, G. DeCandia et al., Symposium on Operating Systems Principles, 2007.
  •