Method for the construction and interpretation of high level models for distributed fault-tolerant systems

K. Tilly, I. Kiss, G. Roman, T. Dobrowiecki, A. R. Varkonyi-Koczy

Research output: Contribution to journal › Conference article

Abstract

Traditional solutions for achieving fault-tolerance are intended for use at design time and they generally capture system information at a very low (hardware or machine instruction) level. Increasing the reliability of complex information systems containing many (perhaps many thousands of) autonomous components requires different solutions. This article presents a new methodology for the implementation of large-scale, distributed fault-tolerant systems. System models are formed of objects describing requirements, services and resources, organized into high-level, top-down hierarchical decomposition structures. Since redundancy is a natural property of any large-scale system, using such models it is possible to achieve fault-tolerant behaviour by finding multiple appropriate mappings between requirements and available services, and to support the required services by available resources. The distributed system is extended with dedicated components, called diagnostic centres, which manage distinct parts of the system model, continuously observe the operation of the distributed system, and find alternative requirement-service mappings if some services fail to fulfil their associated requirements. In the following sections the elements and the structure of the proposed system modelling method are presented, an appropriate fault model is defined, and the algorithms for model interpretation are described.
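
The remapping idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the class names (`Service`, `DiagnosticCentre`), the set-based matching rule, and the example data are all assumptions made for illustration. It only shows how redundancy among services allows a requirement to be re-bound to an alternative service when one fails.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Service:
    """Hypothetical service description (not taken from the paper)."""
    name: str
    provides: frozenset  # requirements this service can fulfil
    needs: frozenset     # resources this service depends on


@dataclass
class DiagnosticCentre:
    """Hypothetical component managing one part of the system model."""
    services: list
    available_resources: set
    failed: set = field(default_factory=set)  # names of services observed to fail

    def candidates(self, requirement):
        # Healthy services that fulfil the requirement and whose
        # resources are all currently available.
        return [
            s for s in self.services
            if requirement in s.provides
            and s.name not in self.failed
            and s.needs <= self.available_resources
        ]

    def remap(self, requirement, failed_service):
        # Record the failure, then return an alternative service or None.
        self.failed.add(failed_service)
        alternatives = self.candidates(requirement)
        return alternatives[0] if alternatives else None


# Assumed example: two redundant storage services on different nodes.
centre = DiagnosticCentre(
    services=[
        Service("disk_a", frozenset({"storage"}), frozenset({"node1"})),
        Service("disk_b", frozenset({"storage"}), frozenset({"node2"})),
    ],
    available_resources={"node1", "node2"},
)
```

With this data, `centre.remap("storage", "disk_a")` returns the `disk_b` service, and a second call reporting `disk_b` as failed returns `None` because no redundant service remains.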

Original language: English
Pages (from-to): 72-81
Number of pages: 10
Journal: Proceedings of the IEEE Symposium on Reliable Distributed Systems
Publication status: Published - Jan 1 1995
Event: Proceedings of the 1994 IEEE 14th Symposium on Reliable Distributed Systems - Bad Neuenahr, Ger
Duration: Sep 13 1995 to Sep 15 1995

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computer Networks and Communications
