LRDE Seminar: Designing robust distributed systems with weakly interacting feedback structures - Peter Van Roy, organized by the LRDE at EPITA. http://seminaire.lrde.epita.fr/2013-04-24.php
Large distributed systems on the Internet are subject to hostile environmental conditions such as node failures, erratic communications, and partitioning, and to global problems such as hotspots, attacks, multicast storms, chaotic behavior, and cascading failures. How can we build these systems so that they function predictably in such situations while remaining easy to understand and maintain? In our work on building self-managing systems in the SELFMAN project, we have discovered a useful design pattern for building complex systems, namely as a set of Weakly Interacting Feedback Structures (WIFS). A feedback structure consists of a graph of interacting feedback loops that together maintain one global system property. We give examples of biological and computing systems that use WIFS, such as the human respiratory system and the TCP family of network protocols. We then show the usefulness of the design pattern by applying it to the Scalaris key/value store from SELFMAN. Scalaris is based on a structured peer-to-peer network with extensions for data replication and transactions. Scalaris achieves high performance: from 4000 to 14000 read-modify-write transactions per second on a cluster of 1 to 15 nodes, each containing two dual-core Intel Xeon processors at 2.66 GHz. Scalaris is a self-managing system that consists of five WIFS, one each for connectivity, routing, load balancing, replication, and transactions. We conclude by explaining why WIFS are an important design pattern for building complex systems, and we outline how to extend the WIFS approach to allow proving global properties of these systems.
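As a rough illustration of what a single feedback loop looks like in code, here is a minimal Python sketch. It is not taken from Scalaris or SELFMAN. It assumes a loop that monitors one quantity (a replica count), compares it with a target, and applies a small correction each round. The names `ReplicaLoop`, `run`, and `failures` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass
class ReplicaLoop:
    """Hypothetical feedback loop: keep the live replica count at a target."""
    target: int
    replicas: int = 0

    def monitor(self) -> int:
        # Observe the subsystem: signed distance from the target.
        return self.target - self.replicas

    def step(self) -> None:
        # Derive a correction from the observed error and apply it.
        error = self.monitor()
        if error > 0:
            self.replicas += 1   # actuate: create one replica per round
        elif error < 0:
            self.replicas -= 1   # actuate: retire one surplus replica


def run(loop: ReplicaLoop, failures: dict, steps: int) -> list:
    """Run `steps` rounds; `failures` maps a round number to replicas lost."""
    history = []
    for t in range(steps):
        # Disturbance from the environment (for example, node failures).
        loop.replicas = max(0, loop.replicas - failures.get(t, 0))
        loop.step()
        history.append(loop.replicas)
    return history


history = run(ReplicaLoop(target=3), failures={5: 2}, steps=10)
print(history)
```

In this sketch the loop converges to the target, is knocked away by the simulated failure at round 5, and then recovers. A feedback structure in the sense of the abstract would combine several such loops that together maintain one global property; the sketch shows only the single-loop building block.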