The paper describes a parallel program checkpointing mechanism and its potential application in Grid systems in order to migrate applications among Grid sites. The checkpointing mechanism can automatically (without user interaction) support generic PVM programs created by the PGRADE Grid programming environment. The developed checkpointing mechanism is general enough to be used by any Grid job manager but the current implementation is connected to Condor. As a result, the integrated Condor/PGRADE system can guarantee the execution of any PVM program in the Grid. Notice that the Condor system can only guarantee the execution of sequential jobs. Integration of the Grid migration framework and the Mercury Grid monitor results in an observable Grid execution environment where the performance monitoring and visualization of PVM applications are supported even when the PVM application migrates in the Grid.
|Number of pages||10|
|Journal||Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)|
|Publication status||Published - Dec 1 2004|
ASJC Scopus subject areas
- Theoretical Computer Science
- Computer Science(all)