Transparent parallel checkpointing and migration in clusters and ClusterGrids

Research output: Contribution to journal (Article)

3 Citations (Scopus)

Abstract

This paper introduces a novel approach to parallel checkpointing that supports fault tolerance and migration among the clusters of a ClusterGrid environment running various middleware components. Based on an architectural analysis, compatibility and integrity requirements are identified and the corresponding conditions are established. Several available checkpointing systems are then checked against these conditions to examine their conformity. Finally, a novel checkpointing approach is defined, and the Parallel Grid Runtime and Application Development Environment (P-GRADE) Grid Programming Tool is adapted accordingly.

Original language: English
Pages (from-to): 171-181
Number of pages: 11
Journal: International Journal of Computational Science and Engineering
Volume: 4
Issue number: 3
Publication status: Published - Jan 1 2009

Keywords

  • Checkpoint
  • Cluster
  • ClusterGrid
  • Condor
  • Graphical programming environment
  • Grid
  • MP
  • Message passing
  • Migration
  • Parallel
  • PVM

ASJC Scopus subject areas

  • Software
  • Modelling and Simulation
  • Hardware and Architecture
  • Computational Mathematics
  • Computational Theory and Mathematics
