Theory of Distributed Computing

Fall 2009

Office: Computer Science Building, room 3042

Email: [my last name] @cse.yorku.ca

Telephone: (416) 736-2100 ext. 33979

Facsimile: (416) 736-5872

Lectures: Tuesdays and Thursdays 11:30-1:00 in room 537 in the south wing of the Ross building

Office Hours: I will hold office hours on Tuesdays at 1:30 and Fridays at 2:00. Note: On Nov 6 my office hour will be at 1pm instead of 2pm.

The best way to contact me is probably by email. Please use your cse account when sending me email, and start your subject line with "[6117]". Send messages in plain text, without attachments.

- (Dec 3) My office hour on Dec 4 will be at 4:00 instead of 1:00.
- (Nov 24) I will be out of town on Nov 27, so my usual office hour will be cancelled. However, you can drop by on Nov 26 between 1:30 and 3:30 if you have questions.
- (Nov 12) Due to a meeting, I have to change the time of my office hour tomorrow (Nov 13) from 2pm to 4pm.
- (Oct 30) My office hour on Nov 6 will be moved from 2pm to 1pm.
- (Oct 22) For assignment 4, the question is about leader election, not randomized leader election. So I'm only interested in deterministic algorithms.
- (Sep 17) I am away the week of Sep 21-25 to attend a conference. There will be no classes or office hours during that week, but I will try to read email. (I may not respond as quickly as usual, though.)
- (Sep 11) The second York programming contest of 2009-10 will be on Wednesday, September 16 from 7:00 to 9:00 p.m. See this page for more details.
- (Sep 7) There will be a programming contest on Friday, September 11 from 12:30 to 3:00 in CSEB 1006. See this page for more details.

- shared-memory and message-passing models of distributed systems,
- mutual exclusion,
- agreement problems (consensus, leader-election, Byzantine agreement, approximate agreement),
- broadcast and multicast algorithms,
- impossibility results and lower bounds,
- the consensus hierarchy,
- implementing shared data structures,
- randomization in distributed computing,
- self-stabilization, and
- a theoretical model for mobile computing.

Marking scheme

Homework exercises | 80% |

Class presentation | 20% |

The presentation will be a 25-minute talk summarizing the results of a research paper on the theory of distributed computing that you find in the literature. We will have some checkpoints along the way before the presentations. (More information on this to come.) Good places to look for a paper include PODC or DISC conference proceedings or the journal Distributed Computing. This survey paper has a bibliography containing lots of papers that would be suitable to choose. If you have a topic in mind, I might be able to help you find a good paper if you come talk to me.

The references below are intended for students who want to read more about the topics discussed in class. Sometimes the readings might be helpful for the assignments. Sometimes they will extend the ideas covered in lectures.

- Sep 10: Introduction. Two Generals. (See these notes from 2006.)
- Sep 15: Dijkstra's mutual exclusion algorithm [Dij65]. (See these notes.)
- Sep 17: Comparing models, simulating swsr registers with messages for failure-free system.
- Sep 29: Broadcasting [AW04, chapter 2].
- Oct 1: Distributed minimum spanning tree algorithm [GHS83],[Lyn96, Sec 4.4].
- Oct 6: Leader election in a ring: algorithms and impossibility results. [Ang80][AW04, chapter 3]
- Oct 8: Leader election in a ring: complexity lower bound. [AW04, chapter 3]
- Oct 20: Synchronous Byzantine consensus in a complete graph (impossible when n<=2f, possible when n>4f) [AW04, chapter 5][Lynch96, chapter 6]
- Oct 22: Synchronous Byzantine consensus in a complete graph (impossible when n<=3f) [AW04, chapter 5][Lynch96, chapter 6]
- Oct 27: Synchronous Byzantine consensus in an arbitrary graph. [AW04, chapter 5][Lynch96, chapter 6]
- Oct 29: Synchronous consensus tolerating crash failures. [AW04, Sec 5.1][Lyn96, Sec 6.7]. See also these notes from a previous year.
- Nov 3: Linearizability [HW90], Lock-freedom, Wait-freedom. Snapshot objects [AADGMS93].
- Nov 5: Snapshot objects [AADGMS93].
- Nov 10: Impossibility of asynchronous wait-free consensus using registers [Her91].
- Nov 12: Consensus numbers of other object types [Her91].
- Nov 17: Impossibility of consensus using registers in the presence of 1 failure (based on [FLP85]). Implementing swmr register from swsr register [AW04, section 10.2.2].
- Nov 19: Implementing mwmr register from sw snapshot [AW04, Section 10.2.3], Herlihy's universal construction [Her91].
- Nov 24: Universal construction (continued).
- Nov 26: Renaming. [AW04, Section 16.3], [MA94]
- Dec 1: Population Protocols (slides).

- [AW04] Hagit Attiya and Jennifer Welch.
*Distributed Computing: Fundamentals, Simulations and Advanced Topics, 2nd edition*. Wiley, 2004. - [GR06] Rachid Guerraoui and Luís Rodrigues.
*Introduction to Reliable Distributed Programming.*Springer, 2006. - [HS08] Maurice Herlihy and Nir Shavit.
*The Art of Multiprocessor Programming*. Morgan Kaufmann, 2008 - [Lyn96] Nancy A. Lynch.
*Distributed Algorithms*. Morgan Kaufmann, 1996. - [San06] Nicola Santoro.
*Design and Analysis of Distributed Algorithms*. Wiley, 2006. - [Tau06] Gadi Taubenfeld.
*Synchronization Algorithms and Concurrent Programming*. Pearson Education, 2006.

- [AADGMS93] Yehuda Afek, Hagit Attiya, Danny Dolev, Eli Gafni, Michael Merritt and Nir Shavit. Atomic snapshots of shared memory.
*J. of the ACM*, 40(4), pages 873-890, September 1993. - [Ang80] D. Angluin. Local and global properties in networks of processors. In
*Proc. 12th ACM Symposium on Theory of Computing*, pages 82-93, 1980. - [AADFP06] Dana Angluin, James Aspnes, Zoë Diamadi, Michael J. Fischer and René Peralta. Computation in networks of passively mobile finite-state sensors.
*Distributed Computing*, 18(4), pages 235-253, 2006. - [ABD95] Hagit Attiya, Amotz Bar-Noy and Danny Dolev, Sharing memory robustly in message-passing systems.
*J. of the ACM*, 42, p124-142, 1995. - [ABDPR90] Hagit Attiya, Amotz Bar-Noy, Danny Dolev, David Peleg and Rüdiger Reischuk. Renaming in an asynchronous environment.
*J. of the ACM*, 37(3), pages 524-548, July 1990. - [AGPV90] Baruch Awerbuch, Oded Goldreich, David Peleg and Ronen Vainish. A trade-off between information and communication in broadcast protocols.
*J. of the ACM*, 37(2), pages 238-256, April 1990. - [BG93] P. Berman and J. Garay. Cloture votes: n/4-resilient distributed consensus in t+1 rounds.
*Mathematical Systems Theory*26(1), pages 3-19, 1993. - [BW01] Saâd Biaz and Jennifer L. Welch. Closed form bounds for clock synchronization under simple uncertainty assumptions.
*Information Processing Letters*, 80, pages 151-157, 2001. - [Dij65] E. W. Dijkstra. Solution of a problem in concurrent programming control .
*Communications of the ACM*, 8(9), page 569, September, 1965. A very early paper on mutual exclusion. - [Dij74] Edsger W. Dijkstra. Self-stabilizing systems in spite of distributed control.
*Communications of the ACM*, 17(4), pages 643-644, November 1974. - [DM90] Cynthia Dwork and Yoram Moses. Knowledge and common knowledge in a Byzantine environment: Crash failures.
*Information and Computation*, 88(2), pages 156-186, October 1990. - [FR03] Faith Fich and Eric Ruppert. Hundreds of impossibility results for distributed computing. In
*Distributed Computing*, 16(2-3), pages 121-163, 2003. - [FLP85] Michael J. Fischer, Nancy A. Lynch and Michael S. Paterson. Impossibility of distributed consensus with one faulty process.
*Journal of the ACM*, 32(2), pages 374-382, April 1985. - [GHS83] R. G. Gallager, P. A. Humblet and P. M. Spira. A distributed algorithm for minimum-weight spanning trees.
*ACM Transactions on Programming Languages and System Sciences*, 5(1), pages 66-77, 1983. - [HMM85] Joseph Y. Halpern, Nimrod Megiddo and Ashfaq A. Munshi. Optimal precision in the presence of uncertainty.
*Journal of Complexity*, 1, pages 170-196, 1985. - [Her91] Maurice Herlihy. Wait-free synchronization.
*ACM Transactions on Programming Languages and Systems*, 13(1), pages 124-149, January 1991. - [HR95] Maurice Herlihy and Sergio Rajsbaum. Algebraic topology and distributed computing: A primer. In
*Computer Science Today: Recent Trends and Developments*, pages 203-217. Springer, 1995. - [HS99] Maurice Herlihy and Nir Shavit. The topological structure of asynchronous computability.
*Journal of the ACM*, 46(6), pages 858-923, November 1999. - [HW90] Maurice P. Herlihy and Jeannette M. Wing. Linearizability: A correctness condition for concurrent objects.
*ACM Transactions on Programming Languages and Systems*, 12(3), pages 463-492, July 1990. - [Lam04] Leslie Lamport. A new solution of Dijkstra's concurrent programming problem.
*Communications of the ACM*, 18(8), pages 453-455, August 1974. - [LL90] Leslie Lamport and Nancy Lynch. Distributed Computing: Models and methods. In
*Handbook of Theoretical Computer Science*, Volume B, Chapter 18, Elsevier, 1990. (On reserve in Steacie Library.) A good, brief introduction to the area. It describes aspects of distributed models and surveys some important early results. - [LM85] Leslie Lamport and P. M. Melliar-Smith. Synchronizing clocks in the presence of faults.
*J. of the ACM*, 32(1), pages 52-78, 1985. - [MA94] Mark Moir and James H. Anderson. Fast, Long-Lived Renaming. In
*Proc. of the 8th International Workshop on Distributed Algorithms*, pages 141-155, 1994. (There have been several better versions of adaptive renaming algorithms published afterwards, but I just want to cover the most basic version of the splitter algorithm. See Mark Moir's page for some of the improvements.) - [Tau04] Gadi Taubenfeld. The black-white bakery algorithm and related bounded-space, adaptive, local-spinning and FIFO algorithms. In
*Distributed Computing, 18th International Conference*, pages 56-70, 2004.

- The ACM Symposium on Principles of Distributed Computing (PODC) and the International Symposium on Distributed Computing. These are the main conferences for the theory of distributed computing.

*This page was last updated on December 3, 2009
*