Parametrized Verification of Fault-tolerant Distributed Algorithms

Project head: Josef Widder.

Supported by the Austrian Science Fund (FWF) under project number P27722.




In the design of reliable computer systems, one has to address two challenges: on the one hand, we have to design means that tolerate partial failure that is outside the control of a system designer (e.g., power outages), and on the other hand, find design faults (bugs) in order to fix them. The former is addressed by means of replication and fault-tolerant distributed algorithms (FTDAs), while the latter is dealt with by rigorous system and software engineering methods, such as model checking. These two challenges have been studied by disjoint research communities. The distributed algorithm community has created a wealth of interesting FTDAs. While in the past these FTDAs were implemented only in safety-critical systems, we currently see more and more implementations of FTDAs for other application domains. The more implementations we will see, the more important it becomes to have effective methods to verify that these algorithms actually perform the task they were designed for.

The classic approach to ascertain the correctness of FTDAs is paper-and-pencil proofs. Conducting such proofs is an error-prone task, because the reasoning required is actually quite different from arguments in mainstream computer science. As a result, even published distributed algorithms contain bugs. It is therefore questionable whether FTDAs that are designed to increase to reliability of computer systems actually reach this goal. Hence, we require additional means to increase the trust in the correctness of FTDAs. This project explores model checking for this purpose, because model checking promises a high degree and automation, and has been used successfully for software and hardware verification. In particular this project considers the issue of parametrization in FTDAs.

FTDAs are classically parametrized by the number of processes n and the assumed maximal number of faults t, using resilience conditions that require, for instance, that a majority of the processes is correct, which is formalized using expressions such as n > 2t. This leads to the verification problem that a given FTDA should be verified for all combinations of n and t satisfying n > 2t, that is, an infinite family of system instances with fixed n and t has to be verified. The verification community has considered similar problems in the research area of parametrized model checking. Parametrized model checking is still considered a field that is full of open research questions. In particular, FTDAs have several features that introduce challenges that have not been considered in the literature so far, such as, the mentioned resilience conditions, process code that uses the parameters, faults, intricate timing assumptions, etc.

Within the PRAVDA project, we developed new parametrized model checking methods, that allow us to ascertain the correctness of state-of-the-art FTDAs. The project ended 31.12.2019.


Selected Publications

Communication-Closed Asynchronous ProtocolsAndrei Damian, Cezara Dragoi, Alexandru Militaru, Josef Widder. CAV (Part II), pages 344–363, 2019.

Verifying Safety of Synchronous Fault-Tolerant Algorithms Bounded Model CheckingIlina Stoilkovska, Igor Konnov, Josef Widder, Florian Zuleger. TACAS (Part II), pages 357–374, 2019. A pre-print including the proofs is available at:

Verification of Randomized Consensus Algorithms Under Round-Rigid Adversaries.Nathalie Bertrand, Igor Konnov, Marijana Lazic, Josef Widder. CONCUR, pages 33:1–33:15, 2019.

A short counterexample property for safety and liveness verification of fault-tolerant distributed algorithmsIgor Konnov, Marijana Lazic, Helmut Veith, Josef Widder. POPL, pages 719–734, 2017. A pre-print including the proofs is available at

SMT and POR beat Counter Abstraction: Parameterized Model Checking of Threshold-Based Distributed AlgorithmsIgor Konnov, Helmut Veith, Josef Widder. CAV (Part I), pages 85-102, 2015.