Building Reliable Distributed Systems (182.704)
News:
Please note that this course will NOT be taught in WS2019/20!
About the Lecture:
Aim
Acquisition of a basic understanding of the practical aspects of distributed algorithms in embedded systems. Participants of this course are able to implement the abstractions (message-passing interface, lock-step synchronous rounds) needed for executing distributed algorithms on a microcontroller board running a real-time operating system, thereby realizing the basic services of a networked embedded system. Executing suitable distributed algorithms in this system leads to a deeper understanding of the resulting overall performance and reliability.
ECTS breakdown (4.5 ECTS = 112.5 hours):
4h Introductory presentations
20h Reading RTOS and hardware documentation
68.5h Implementation, test and evaluation work
20h Documentation and presentation of results
Subject
The course consists of a short introductory part followed by the practical part, which should preferably be carried out in groups of two students. The introductory part offers a brief introduction to a real-time operating system (QNX Neutrino, RT-Linux), the hardware platform used (Pandaboard A4 with ARM Cortex-A9), and some basic communication primitives and clock synchronization algorithms. The practical part comprises (i) the installation of a suitable real-time operating system on the hardware platform and its (latency) evaluation, (ii) the implementation of some low-level abstractions (communication primitives, clock synchronization) for the execution of distributed algorithms, and (iii) the implementation and evaluation of some simple distributed algorithms on the resulting networked embedded system.
Lecturer
Ulrich Schmid, Kyrill Winkler
Homepage
https://ti.tuwien.ac.at/ecs/teaching/courses/brds
General Info
Prerequisite: Basic knowledge about distributed and dependable embedded systems (level of 182095 Dependable Systems and 182110 Embedded Systems Engineering or 183.168 Distributed Automation Systems); experience in microcontroller and operating systems programming (level of 182064 Microcontroller and 182.709 Operating Systems).
Enrolling: Personally, after the introductory lecture.
Language: English or German
Groups: The practical work should be done in groups of two.
General collaboration policy: Discussion of concepts with others is encouraged, but all work must be done on your own and written up in your own words. All your sources must be referenced, whether they are a person, a book, a solution set, a web page, public domain code, or anything else.
Grading
Grading is based on the written lab protocol (see below) and the oral presentation of your results as follows:
- Lab protocol: Description of the solution architecture and design + code (50%)
- Lab protocol: Evaluation results (20%)
- Oral presentation of the solution and the evaluation results (30%)
To also facilitate individual grading, please make it clear in the lab protocol (and also in the presentation) which group member took the lead on which part (unless you truly shared the work on all parts, in which case you should also present your accomplishments jointly).
We employ the following grading scheme:
- 90-100: Sehr Gut (S1, A)
- 80-89: Gut (U2, B)
- 70-79: Befriedigend (B3, C)
- 60-69: Genügend (G4, D)
- 0-59: Nicht genügend (N5, F)
Schedule (tentative)
All presentations will be held in the Library Embedded Computing Systems Group E182/2, Treitlstraße 1, 2nd floor. The (tentative - can be moved!) presentation time will usually be on Thursdays 13:15-14:45.
Day | Date | Time | Lecturer | Topic | Slides |
Thu | 11.10.2018 | 13:15-14:00 | Schmid | Introduction | Slides |
Thu | 25.10.2018 | 13:15-15:45 | Schmid | Introduction to QNX Neutrino | Slides |
 | | | | Introduction to RT Linux | Slides |
Thu | * | 13:15-14:45 | | Student presentations (RTOS) | |
Practical Work
At least two Pandaboards will be made available to you and can also be carried home until the final testing phase, which requires more than two nodes and should hence be done in the ECS lab (Treitlstraße 1-3, 2nd floor). Your TI chip card will be activated to unlock its door.
The actual work consists of 3 parts:
1. RTOS installation
Select "your" real-time operating system. We have some preferred choices, for which we know that a BSP for the Pandaboard is available:
- QNX Neutrino v6.5.0 SP1 (you need to sign an academic license agreement to get access)
- RT-Linux
Alternatively, you can set up an RTOS of your own choice. You should make sure, however, that a board support package for the Pandaboard exists; otherwise the amount of work needed will certainly be excessive. In this case, in addition to the detailed setup instructions (see setup.txt below), you are also expected to provide additional resources (some slides introducing the features of your RTOS, documentation links) that will allow us to include the RTOS in the list of feasible ones in future instances of this course.
Evaluate the local latency of your RTOS port, i.e., the histogram of the time from the occurrence of some trigger event (typically the expiration of a periodic timer) until the execution of a process dedicated to servicing this event.
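As a rough illustration of what such a latency histogram involves, the following host-side Python sketch emulates a periodic timer with absolute deadlines and buckets the observed wake-up overshoot. It is only a sketch under stated assumptions: the function names, the 1 ms period and the 10 us bin width are hypothetical, and the actual measurement on the Pandaboard would presumably use the timer services of the chosen RTOS rather than `time.sleep`.

```python
import time
from collections import Counter


def measure_wakeup_latencies(period_ns, samples):
    """Return wake-up latencies (ns) of a periodic timer emulated with sleep."""
    latencies = []
    next_deadline = time.monotonic_ns() + period_ns
    for _ in range(samples):
        # Sleep until the absolute deadline; the overshoot is the latency.
        remaining = next_deadline - time.monotonic_ns()
        if remaining > 0:
            time.sleep(remaining / 1e9)
        latencies.append(max(0, time.monotonic_ns() - next_deadline))
        next_deadline += period_ns  # absolute deadlines avoid cumulative drift
    return latencies


def latency_histogram(latencies_ns, bin_width_ns):
    """Bucket latencies into fixed-width bins keyed by the bin's lower edge."""
    counts = Counter((lat // bin_width_ns) * bin_width_ns for lat in latencies_ns)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Hypothetical parameters: 1 ms period, 1000 samples, 10 us bins.
    lat = measure_wakeup_latencies(period_ns=1_000_000, samples=1000)
    for lower_edge, count in latency_histogram(lat, 10_000).items():
        print(f"{lower_edge / 1000:8.0f} us  {count}")
    print("max latency:", max(lat) / 1000, "us")
```

The maximum and the tail of the histogram are usually of more interest than the mean, since later design parameters depend on worst-case behavior.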
2. Implementation of low-level distributed systems abstractions
Implement the following low-level abstractions on top of either the Ethernet or the wireless network stack of the RTOS:
- Asynchronous message-passing interface (point-to-point, both unreliable and reliable)
- Lock-step synchronous rounds with unreliable broadcast communication (which also requires implementation of a clock synchronization algorithm)
Note that you will also need to devise and implement a suitable scheme for node addressing.
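One common way to obtain reliable point-to-point delivery on top of an unreliable one is stop-and-wait retransmission with sequence numbers and acknowledgements. The simulation below is a minimal sketch of that idea, not a prescribed design: `LossyLink`, `Receiver`, `reliable_send` and the use of small integers as node addresses are all hypothetical, and a real implementation would sit on the RTOS network stack and use timeouts instead of a simulated drop.

```python
import random


class LossyLink:
    """Unreliable channel: each message is dropped with probability `loss`."""

    def __init__(self, loss, seed=0):
        self.loss = loss
        self.rng = random.Random(seed)

    def deliver(self, msg):
        return None if self.rng.random() < self.loss else msg


class Receiver:
    def __init__(self, node_id):
        self.node_id = node_id      # hypothetical addressing: integer node ids
        self.expected_seq = {}      # sender id -> next expected sequence number
        self.delivered = []         # payloads handed to the application

    def on_data(self, src, seq, payload):
        """Deliver each message exactly once; always answer with an ACK."""
        if seq == self.expected_seq.get(src, 0):
            self.delivered.append(payload)
            self.expected_seq[src] = seq + 1
        # Duplicates (retransmissions after a lost ACK) are only re-acknowledged.
        return ("ACK", self.node_id, src, seq)


def reliable_send(link, receiver, src, seq, payload, max_tries=50):
    """Retransmit until an ACK for `seq` arrives; return the number of tries."""
    for attempt in range(1, max_tries + 1):
        data = link.deliver(("DATA", src, receiver.node_id, seq, payload))
        if data is None:
            continue                # data lost: sender times out and retransmits
        ack = link.deliver(receiver.on_data(src, seq, payload))
        if ack is not None:
            return attempt          # ACK received, message is confirmed
    raise TimeoutError("no ACK after %d tries" % max_tries)


if __name__ == "__main__":
    link = LossyLink(loss=0.3, seed=42)
    rx = Receiver(node_id=2)
    for i in range(5):
        tries = reliable_send(link, rx, src=1, seq=i, payload=f"msg {i}")
        print(f"msg {i} confirmed after {tries} transmission(s)")
    print(rx.delivered)
```

The sequence number check at the receiver is what turns "at least once" (retransmission) into "exactly once" delivery to the application.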
3. Application implementation
Perform the following tasks:
- Implement a simple asynchronous distributed algorithm and evaluate the end-to-end delays (and possibly the termination time of the distributed algorithm, if it is a terminating one) in a reasonably large (at least 4 nodes) system using your asynchronous message-passing interface. Note that you could also use your clock synchronization algorithm here.
- Implement a simple fault-tolerant synchronous distributed algorithm, measure the synchronization accuracy and evaluate termination time and reliability in a reasonably large (at least 4 nodes) system using your synchronous lock-step round abstraction. Note that the round durations shall be chosen according to the measured end-to-end delays.
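To make the link between measured end-to-end delays and round durations concrete, here is a small sketch of one common sizing rule: a round must be at least as long as the worst observed delay plus the worst-case difference between any two synchronized clocks. This rule, the 20% safety margin and all function names are assumptions for illustration, not requirements stated by the course; all times are in microseconds.

```python
def round_duration_us(delays_us, precision_us, margin_percent=20):
    """Round length covering the worst measured delay plus clock precision."""
    worst = max(delays_us) + precision_us
    # Integer ceiling of worst * (1 + margin_percent / 100).
    return -(-worst * (100 + margin_percent) // 100)


def current_round(clock_us, start_us, duration_us):
    """Round number a node is in, derived from its synchronized clock."""
    return (clock_us - start_us) // duration_us


def received_in_time(send_round, delay_us, skew_us, duration_us):
    """True if a message sent at the start of `send_round` arrives before the
    receiver leaves that round. `skew_us` is how far the receiver's clock is
    ahead of the sender's; early arrivals are assumed to be buffered."""
    recv_clock = send_round * duration_us + delay_us + skew_us
    return recv_clock < (send_round + 1) * duration_us


if __name__ == "__main__":
    measured = [120, 400, 250]          # hypothetical end-to-end delays
    d = round_duration_us(measured, precision_us=50)
    print("round duration:", d, "us")
    print("round at t=1080 us:", current_round(1080, 0, d))
    print("worst case fits:", received_in_time(3, 400, 50, d))
```

A message that misses its round looks like an omission to the receiver, so an undersized round duration shows up directly in the reliability figures of the synchronous algorithm.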
Present a brief overview of the architecture and design of your solution, together with the evaluation results, in one of the student presentations.
Results
A. Lab Protocol
The lab protocol must briefly present the following:
- The architecture and the design of your distributed systems abstractions (2.), in particular, how they are integrated with the RTOS services.
- The results of the latency evaluation.
- The specification and the design of your application algorithms.
- The results of the performance evaluation.
Note that it is also perfectly acceptable if your lab protocol consists of presentation slides.
B. Code and Setup
Your complete solution (code + executables) must be uploaded to a subdirectory named BRDS in your ECS home directory. Make sure that it (and all its subdirectories and files) belongs to the group "brds" and is readable and writable by the group.
Besides the code, BRDS must also include
- your lab protocol (source and .pdf)
- a text file setup.txt, which describes in detail the steps needed to (i) install the RTOS on the Pandaboard and (ii) set up the cross-development environment.
C. Presentations
The course schedule contains a fixed slot for presentations every week (when there is no introductory lecture), where you can present your final solution and results (15-25 min presentation time): RTOS port and local latency evaluation for 1. as well as specification, architecture and design overview for 2. and 3. and evaluation results. Recall that it is also perfectly acceptable if your lab protocol consists of (a superset of) the presentation slides.
It is up to you to choose the presentation date - however, please send an email to one of the instructors at least one week before, so that we can enter your presentation into the schedule. Everybody is welcome to attend these presentations, but attendance is NOT mandatory.
Additional Documentation
- Pandaboard wiki
- QNX site
- QNX Quickstart Guide (for v6.6.0 - we are using v6.5.0 SP1, so be somewhat cautious ...)
- QNX Getting Started
- Description of VxWorks BSP for Pandaboard
Miscellaneous