Self-stabilizing Byzantine Fault-Tolerant Distributed Algorithms for Integrated Circuits

The ultimate goal of SIC (Self-stabilizing Byzantine Fault-Tolerant Distributed Algorithms for Integrated Circuits) is to develop the foundations of a framework for the rigorous modeling and analysis of Byzantine fault-tolerant self-stabilizing distributed algorithms for VLSI circuits.

Funding: Austrian Science Fund (FWF), project no P26436 fwf.giffonds.gif

Collaborators: Christoph Lenzen (MPI Saarbrücken), Danny Dolev (Hebrew University), Thomas Nowak (ENS Paris), Michael Hofbauer (TU Wien, Institute of Electrodynamics, Microwave and Circuit Engineering)

Time Frame: 01. 11. 2013-31. 10. 2018

Contact Persons: Matthias Függer (Project Head), Ulrich Schmid

Research Team: Matthias Függer (Project Head), Ulrich Schmid, Martin Perner, Robert Najvirt

This project proposes to address robustness issues in very-large scale integrated (VLSI) circuits by means of self-stabilizing Byzantine fault-tolerant distributed algorithms. It aims at the development of the foundations of a suitable framework for correctness proofs and performance analyses for such algorithms, and its application to some representative problems in VLSI circuits. Besides permanent failures due to e.g. manufacturing defects and electrical wear-out, transient faults caused e.g. by ionizing particles, cross-talk and temporarily out-of-spec operating conditions are a major source of concern for the dependability of modern VLSI circuits, in particular, in critical applications domains like aerospace. In sharp contrast to classic fault-tolerance techniques like triple-modular redundancy, self-stabilizing algorithms can recover even from unbounded transient faults, and Byzantine-fault-tolerant self-stabilizing algorithms even allow this to happen in the presence of permanent faults.
VLSI circuits put unique challenges on Byzantine fault-tolerant self-stabilizing distributed algorithms, which have not been addressed by existing solutions at all. Besides the fact that millions of simple gates that continuously compute their outputs based on (past) inputs do not match the classic distributed state machine abstraction employed in distributed algorithms, VLSI circuits provide only very simple basic computation and communication features. As a consequence, neither existing distributed algorithms nor existing modeling and analysis frameworks can be carried over without substantial modification. Nevertheless, the Byzantine fault-tolerant self-stabilizing clock generation approach for VLSI circuits developed recently by the authors (in collaboration with Danny Dolev/Hebrew University and Christoph Lenzen/Weizmann Institute) revealed that it is principally feasible to do so.
The goal of SIC is to develop the foundations of a framework for the rigorous modeling and analysis of Byzantine fault-tolerant self-stabilizing distributed algorithms for VLSI circuits. The work to be performed in SIC involves solving fundamental research questions, in particular, sound and reasonably complete mathematical descriptions of the stabilization process and digital models for metastability generation and propagation, which cannot be solely addressed by engineering solutions in the self-stabilizing context. This work is complemented by devising Byzantine fault-tolerant self-stabilizing solutions for some representative problems in VLSI circuits, like distributed clock generation and reliable broadcast. Using the envisioned modeling and analysis framework, correctness proofs and performance analyses of these algorithms under adequate Byzantine failure models will be provided. The practical implementability of our algorithms will finally be demonstrated by means of simulation and FPGA prototype implementations, which also facilitate some experimental evaluation.