Parametrized Verification of Fault-tolerant Distributed Algorithms

Project head: Josef Widder.

Supported by the Austrian Science Fund (FWF) under project number P27722.



In the design of reliable computer systems, one has to address two challenges: on the one hand, we have to design means that tolerate partial failure that is outside the control of a system designer (e.g., power outages), and on the other hand, find design faults (bugs) in order to fix them. The former is addressed by means of replication and fault-tolerant distributed algorithms (FTDAs), while the latter is dealt with by rigorous system and software engineering methods, such as model checking. These two challenges have been studied by disjoint research communities. The distributed algorithm community has created a wealth of interesting FTDAs. While in the past these FTDAs were implemented only in safety-critical systems, we currently see more and more implementations of FTDAs for other application domains. The more implementations we will see, the more important it becomes to have effective methods to verify that these algorithms actually perform the task they were designed for.

The classic approach to ascertain the correctness of FTDAs is paper-and-pencil proofs. Conducting such proofs is an error-prone task, because the reasoning required is actually quite different from arguments in mainstream computer science. As a result, even published distributed algorithms contain bugs. It is therefore questionable whether FTDAs that are designed to increase to reliability of computer systems actually reach this goal. Hence, we require additional means to increase the trust in the correctness of FTDAs. This project explores model checking for this purpose, because model checking promises a high degree and automation, and has been used successfully for software and hardware verification. In particular this project considers the issue of parametrization in FTDAs.

FTDAs are classically parametrized by the number of processes n and the assumed maximal number of faults t, using resilience conditions that require, for instance, that a majority of the processes is correct, which is formalized using expressions such as n > 2t. This leads to the verification problem that a given FTDA should be verified for all combinations of n and t satisfying n > 2t, that is, an infinite family of system instances with fixed n and t has to be verified. The verification community has considered similar problems in the research area of parametrized model checking. Parametrized model checking is still considered a field that is full of open research questions. In particular, FTDAs have several features that introduce challenges that have not been considered in the literature so far, such as, the mentioned resilience conditions, process code that uses the parameters, faults, intricate timing assumptions, etc.

Within the PRAVDA project, we will develop new parametrized model checking methods, that will allow us to ascertain the correctness of state-of-the-art FTDAs.

Recent Project Publications

On the Completeness of Bounded Model Checking for Threshold-Based Distributed Algorithms: Reachability. Igor Konnov, Helmut Veith, Josef Widder. Information and Computation, 2016. Accepted manuscript available online: 10-MAR-2016
What You Always Wanted to Know about Model Checking of Fault-Tolerant Distributed Algorithms (invited talk). Igor Konnov, Helmut Veith, Josef Widder. PSI 2015. Revised Selected Papers, 2016. to appear
Challenges in Model Checking of Fault-tolerant Designs in TLA+
. Igor Konnov, Helmut Veith, Josef Widder. 2015, Unpublished contribution to the 8th International Workshop on Exploiting Concurrency Efficiently and Correctly, San Francisco, CA, USA, July.
SMT and POR beat Counter Abstraction: Parameterized Model Checking of Threshold-Based Distributed Algorithms. Igor Konnov, Helmut Veith, Josef Widder. CAV (Part I), volume 9206 of LNCS, pages 85-102, 2015.
Decidability of Parameterized Verification. Roderick Bloem, Swen Jacobs, Ayrat Khalimov, Igor Konnov, Sasha Rubin, Helmut Veith, Josef Widder
Synthesis Lectures on Distributed Computing Theory, volume 6, number 1, pages 1-170, 2015.