CSE 6117, Winter 2016
CSE 6117
Theory of Distributed Computing
Winter 2016
Instructor: Eric Ruppert
Office: Lassonde Building, room 3042
Email: [my last name] @cse.yorku.ca
Telephone: (416) 736-2100 ext. 33979
Lectures: Mondays and Wednesday 17:30-19:00 in room 211 of Stong College
Office Hours: Wednesdays 12:00-13:00 and Fridays 14:00-15:00 or by appointment, or just drop by when I'm in.
Usually, the best way to contact me is by email. Please use your cse account when sending me email, and start your subject line with "[6117]". Send messages in plain text, without attachments.
Announcements
- (Apr 12) I have posted final grades for the course. You can check all your grades by typing "courseInfo 6117 2015-16 W" on a departmental unix machine.
- (Mar 24) Leqi Zhu is coming to York to give a talk on Friday, April 1 at 3:00 on a topic closely related to the content of this course. (A formal announcement will appear soon.) Please make every effort to attend the talk.
- (Mar 24) The university has suspended classes today (Mar 24) due to the weather, so we will not have a class today. My office hour for today is also cancelled. In our next meeting on Monday, March 28, we will have presentations by those originally scheduled to speak on March 23.
- (Mar 21) Our class on Wednesday, March 23 has been rescheduled to Thursday, March 24 from 13:00 to 14:30 in Lassonde 3033.
My office hour on Wednesday, March 23 is cancelled (sorry for the short notice). Instead, I will have office hours on Thursday, March 24 at 16:00 and on Monday, March 28 at 10:00.
- (Mar 16) Starting March 21, our class will be in Lassonde 3033 for the student presentations. (If you are using your own computer, you might want to check in advance that your slides look right on the projector in that room.)
- (Mar 13) The deadline for assignment 6 will be extended to Wednesday, March 16 (at 5:30 pm).
- (Mar 12) There was a typo in assignment 6, part (b)(ii): the two configurations should be C_PQ and C_QP, obtained by allowing processes P and Q to take one step each in opposite orders from C. The corrected version appears below.
- (Mar 9) I will have an extra office hour on Monday, March 14 from 10:00 to 11:00.
- (Feb 25) The test on Monday, February 29 will be in Lassonde 3033 instead of the usual classroom.
- (Feb 25) I will have an office hour on Monday, February 29 from 10:00 to 11:00.
- (Feb 22) By popular demand, the midterm test has been moved to Feb 29. I hope to have room 3033 in the Lassonde Building booked for the midterm.
- (Feb 21) My office hour on Friday, Feb 26 is cancelled, but I will have an additional office hour on Thursday, Feb 25 from 14:30-15:30.
- (Feb 15) During reading week, I will have my office hour on Wednesday, Feb 17, but not on Friday Feb 19.
- (Feb 13) For the test, you will be permitted to bring one 8.5 inch by 11 inch sheet of paper with handwritten notes (on both sides) into the test with you.
- (Feb 2) I have extended the deadline for submitting assignment 3 to 3:00 p.m. on Friday, Feb 5.
- (Jan 30) Clarification: On assignment 3, the name P_{i,j} should not be used within the algorithm itself: processes do not know which row or column they are in. I only named them so that you could talk about the processes when you write up your solution (if you want to).
- (Jan 27) My office hour on January 29 will be at 1pm instead of 2pm.
Course Description
Can a given problem be solved in a distributed system? If so, how
efficiently can it be solved? We investigate how the answers to these
questions depend on aspects of the underlying distributed system
including synchrony, fault-tolerance and the means of communication
between processes.
A tentative list of topics:
- shared-memory and message-passing models of distributed systems,
- mutual exclusion,
- agreement problems (consensus, leader-election, Byzantine agreement, approximate agreement),
- broadcast and multicast algorithms,
- impossibility results and lower bounds,
- the consensus hierarchy,
- implementing shared data structures,
- randomization in distributed computing,
- self-stabilization, and
- a theoretical model for mobile computing.
Marking scheme
Homework exercises | 50% |
Test | 20% |
Class presentation | 20% |
Class participation | 10% |
The class presentation will be a 25-minute talk summarizing the results of a
research paper on the theory of distributed computing that you find in
the literature.
We will have some checkpoints along the way before the presentations. (More information on this to come.)
Good places to look for a paper include
PODC
or
DISC
conference proceedings or the journal
Distributed Computing.
This survey paper has a bibliography containing lots of papers that would be suitable to choose.
If you have a topic in mind, I might be able to help you find a
good paper if you come talk to me.
Important Dates
First class | Monday, January 4 |
Deadline to drop course | Tuesday, January 26 |
Reading week (no lectures) | February 15 to 19 |
Test in Lassonde 3033 | Monday, February 29 |
Last class | Wednesday, March 30 |
Lectures
These will be filled in as the term progresses.
The references below are intended for students who want to read more about the topics discussed in class. Sometimes the readings might be helpful for the assignments. Sometimes they will extend the ideas covered in lectures.
- January 4: Introduction describing what the course is about. Two Generals problem. (See these notes from an earlier year.)
- January 6: Dijkstra's mutual exclusion algorithm. [Dij65], some notes
- January 11: Modelling distributed systems. [LL90]
- January 18: a linearizability, implementing swmr registers from message passing [HW90, ABD95].
- January 20: Flooding to achieve broadcasting and building a spanning tree [AW04, Chapter 2.2-2.3 or Lyn96, Chapter 15.3 or San06, Chapter 1.5, 2.1]
- January 25: Distributed MST algorithm [GHS83 or Lyn96, Chapter 15.5]
- January 27: Leader Election [AW04, chapter 3]
- February 1: Byzantine Agreement [AW04, chapter 5.2], [Lynch96, chapter 6].
- February 8: Synchronous consensus with halting failures [AW04, chapter 5.1]
- February 10: Time lower bound on synchronous consensus with f halting failures [AW04, chapter 5.1]
- February 22: Introduction to shared memory.
- February 24: Wait-free consensus [FLP85, AW04 Section 5.3.1]
- February 29: Midterm test
- March 2: Wait-free consensus continued [Her91]
- March 7: Snapshots [AW04, chapter 10.2-10.3, AADGMS93]
- March 14: Universality of Consensus [Her91]
- March 16: Mutual exclusion: Bakery algorithm [AW04, Sec 4.4.1] and space lower bound [AE14, Sec 6.1].
- March 21 presentations:
Yuping on consensus with dynamic omission failures
- March 28 presentations:
Jungho on counterexample guided abstraction refinement,
Osama on leader election in mobile networks
- March 30 presentations:
Toni on asynchronous agreement,
Raghavender on autonomous robots,
Forouqsadat on routing without regret,
Abeer on vertex colouring
- April 4 presentations:
Mingbin on skiplist-based priority queues,
Norah on Distributed sorting,
Omar on Cortical learning,
Pooria on conflict on a communication channel
References
There is no required textbook for the course. However, I shall sometimes recommend readings from books or papers. These references will be listed here, and the list will grow during the term. Accessing some of the links below may require you to be logged into a machine at York, so that you can access the ACM Digital Library, etc.
Books
Papers
This list is from the 2014 version of the course, but it gives you an idea of the kinds of topics covered. I'll add and subtract items from the list during the term.
- [AADGMS93] Yehuda Afek, Hagit Attiya, Danny Dolev, Eli Gafni, Michael Merritt and Nir Shavit. Atomic snapshots of shared memory. J. of the ACM, 40(4), pages 873-890, September 1993.
- [Ang80] D. Angluin. Local and global properties in networks of processors. In Proc. 12th ACM Symposium on Theory of Computing, pages 82-93, 1980.
- [AADFP06] Dana Angluin, James Aspnes, Zoë Diamadi, Michael J. Fischer and René Peralta. Computation in networks of passively mobile finite-state sensors. Distributed Computing, 18(4), pages 235-253, 2006.
- [ABD95] Hagit Attiya, Amotz Bar-Noy and Danny Dolev, Sharing memory robustly in message-passing systems. J. of the ACM, 42, p124-142, 1995.
- [ABDPR90] Hagit Attiya, Amotz Bar-Noy, Danny Dolev, David Peleg and Rüdiger Reischuk. Renaming in an asynchronous environment. J. of the ACM, 37(3), pages 524-548, July 1990.
- [AGPV90] Baruch Awerbuch, Oded Goldreich, David Peleg and Ronen Vainish. A trade-off between information and communication in broadcast protocols. J. of the ACM, 37(2), pages 238-256, April 1990.
- [BG93] P. Berman and J. Garay. Cloture votes: n/4-resilient distributed consensus in t+1 rounds. Mathematical Systems Theory 26(1), pages 3-19, 1993.
- [BW01] Saâd Biaz and Jennifer L. Welch. Closed form bounds for clock synchronization under simple uncertainty assumptions. Information Processing Letters, 80, pages 151-157, 2001.
- [Dij65] E. W. Dijkstra. Solution of a problem in concurrent programming control . Communications of the ACM, 8(9), page 569, September, 1965. A very early paper on mutual exclusion.
- [Dij74] Edsger W. Dijkstra. Self-stabilizing systems in spite of distributed control. Communications of the ACM, 17(4), pages 643-644, November 1974.
- [DM90] Cynthia Dwork and Yoram Moses. Knowledge and common knowledge in a Byzantine environment: Crash failures. Information and Computation, 88(2), pages 156-186, October 1990.
- [EFR08] Faith Ellen, Panagiota Fatourou and Eric Ruppert. The space complexity of unbounded timestamps. Distributed Computing, 21(2), pages 103-115, 2008.
- [EFRV10] Faith Ellen, Panagiota Fatourou, Eric Ruppert and Franck van Breugel. Non-blocking Binary Search Trees. In Proc. 29th ACM Symposium on Principles of Distributed Computing, pages 131-140, 2010.
- [FR03] Faith Fich and Eric Ruppert. Hundreds of impossibility results for distributed computing. In Distributed Computing, 16(2-3), pages 121-163, 2003.
- [FLP85] Michael J. Fischer, Nancy A. Lynch and Michael S. Paterson. Impossibility of distributed consensus with one faulty process. Journal of the ACM, 32(2), pages 374-382, April 1985.
- [GHS83] R. G. Gallager, P. A. Humblet and P. M. Spira. A distributed algorithm for minimum-weight spanning trees. ACM Transactions on Programming Languages and System Sciences, 5(1), pages 66-77, 1983.
- [HMM85] Joseph Y. Halpern, Nimrod Megiddo and Ashfaq A. Munshi. Optimal precision in the presence of uncertainty. Journal of Complexity, 1, pages 170-196, 1985.
- [Her91] Maurice Herlihy. Wait-free synchronization. ACM Transactions on Programming Languages and Systems, 13(1), pages 124-149, January 1991.
- [HR95] Maurice Herlihy and Sergio Rajsbaum. Algebraic topology and distributed computing: A primer. In Computer Science Today: Recent Trends and Developments, pages 203-217. Springer, 1995.
- [HS99] Maurice Herlihy and Nir Shavit. The topological structure of asynchronous computability. Journal of the ACM, 46(6), pages 858-923, November 1999.
- [HW90] Maurice P. Herlihy and Jeannette M. Wing. Linearizability: A correctness condition for concurrent objects. ACM Transactions on Programming Languages and Systems, 12(3), pages 463-492, July 1990.
- [Lam74] Leslie Lamport. A new solution of Dijkstra's concurrent programming problem. Communications of the ACM, 18(8), pages 453-455, August 1974.
- [LL90] Leslie Lamport and Nancy Lynch. Distributed Computing: Models and methods. In Handbook of Theoretical Computer Science, Volume B, Chapter 18, Elsevier, 1990. (On reserve in Steacie Library.) A good, brief introduction to the area. It describes aspects of distributed models and surveys some important early results.
- [LM85] Leslie Lamport and P. M. Melliar-Smith. Synchronizing clocks in the presence of faults. J. of the ACM, 32(1), pages 52-78, 1985.
- [MA94] Mark Moir and James H. Anderson. Fast, Long-Lived Renaming. In Proc. of the 8th International Workshop on Distributed Algorithms, pages 141-155, 1994. (There have been several better versions of adaptive renaming algorithms published afterwards, but I just want to cover the most basic version of the splitter algorithm. See Mark Moir's page for some of the improvements.)
- [Tau04] Gadi Taubenfeld. The black-white bakery algorithm and related bounded-space, adaptive, local-spinning and FIFO algorithms. In Distributed Computing, 18th International Conference, pages 56-70, 2004.
- [Tre86] R.K. Treiber. Systems programming: Coping with parallelism. Tech report RJ5118, IBM Almaden Research Center, 1986.
Web Pages
Previous versions of this course:
Fall 2011,
Fall 2014.
Exercises
Try to keep your answers as simple as possible (but no simpler).
Important note on collaboration on homework assignments: It is okay to discuss the general approach to solving a problem with your classmates. However, you should not take any written notes away from such a discussion, and you should write up your solution on your own. Also, to protect yourself against charges of plagiarism, you should write on the front page of your assignment the names of any classmates that you discussed the problem with, and any outside sources that you used.

This page was last updated on April 12, 2016