Optimal software rejuvenation for tolerating soft failures

András Pfening, Sachin Garg, Antonio Puliafito, Miklós Telek, Kishor S. Trivedi

Research output: Contribution to journal (Article)

75 Citations (Scopus)

Abstract

Recent studies have brought to light the phenomenon of software "aging", which causes the performance of software to degrade with time. Software rejuvenation is a fault-tolerance technique that counteracts aging. In this paper, we address the problem of determining the optimal time to rejuvenate server-type software that experiences "soft failures" (witnessed in telecommunication systems) because of aging. The service rate of the software gradually decreases with time and settles to a very low value. Since the performability in this state is unacceptable, it is necessary to "renew" the software to its peak performance level. We develop Markov decision models for such a system for two different queuing policies. For each policy, we define the look-ahead-n cost functions and prove results on the convergence of these functions to the optimal minimal cost function. We also prove simple rules to determine optimal times to rejuvenate for a realistic cost criterion. Finally, the results are illustrated numerically and the effectiveness of the MDP model is compared with that of the simple rules.
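The idea of look-ahead-n cost functions converging to an optimal stopping cost can be illustrated with a small toy model. The sketch below is not the paper's model: the exponential decay of the service rate, the discounted cost criterion, the truncated age horizon, and every parameter value (`mu_max`, `mu_min`, `decay`, `arrival_rate`, `stop_cost`, `discount`) are assumptions chosen only for illustration. Here `history[n][t]` is the best achievable cost at software age `t` if rejuvenation must happen within at most `n` further steps.

```python
import math


def service_rate(t, mu_max=10.0, mu_min=1.0, decay=0.1):
    """Hypothetical aging curve: the rate decays from mu_max and settles at mu_min."""
    return mu_min + (mu_max - mu_min) * math.exp(-decay * t)


def look_ahead_costs(n_max, horizon=100, arrival_rate=5.0,
                     stop_cost=20.0, discount=0.95):
    """Return (cont, history) for a toy discounted optimal stopping problem.

    cont[t]       assumed per-step cost of NOT rejuvenating at age t
                  (arrivals in excess of the current service rate)
    history[n][t] look-ahead-n cost at age t
    Ages at or beyond `horizon` are lumped into one state (an assumption
    that is harmless once the service rate has settled).
    """
    ages = range(horizon + 1)
    cont = [max(0.0, arrival_rate - service_rate(t)) for t in ages]
    v = [stop_cost] * (horizon + 1)  # look-ahead-0: rejuvenate immediately
    history = [v]
    for _ in range(n_max):
        # Either rejuvenate now, or pay cont[t] and face the previous
        # look-ahead cost one step later.
        v = [min(stop_cost, cont[t] + discount * v[min(t + 1, horizon)])
             for t in ages]
        history.append(v)
    return cont, history


def first_stop_age(cont, v, stop_cost, discount):
    """Smallest age at which rejuvenating is no worse than continuing."""
    horizon = len(v) - 1
    for t in range(horizon + 1):
        if stop_cost <= cont[t] + discount * v[min(t + 1, horizon)]:
            return t
    return None


if __name__ == "__main__":
    cont, history = look_ahead_costs(50)
    print("rejuvenate at age:", first_stop_age(cont, history[-1], 20.0, 0.95))
```

In this toy setting the look-ahead costs are non-increasing in `n` and stop changing after finitely many iterations, and the resulting policy coincides with a one-line threshold rule: rejuvenate at the first age where `cont[t] >= (1 - discount) * stop_cost`. That mirrors, in a much simpler form, the kind of comparison between a full decision model and a simple rule that the abstract describes.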

Original language: English
Pages (from-to): 491-506
Number of pages: 16
Journal: Performance Evaluation
Volume: 27-28
Publication status: Published - Oct 1996

Keywords

  • Fault tolerant systems
  • Markov decision process
  • Optimal stopping problem
  • Software rejuvenation

ASJC Scopus subject areas

  • Software
  • Modelling and Simulation
  • Hardware and Architecture
  • Computer Networks and Communications

Cite this

Pfening, A., Garg, S., Puliafito, A., Telek, M., & Trivedi, K. S. (1996). Optimal software rejuvenation for tolerating soft failures. Performance Evaluation, 27-28, 491-506.