Evaluating the reproducibility cost of the scientific workflows

Anna Bánáti, P. Kacsuk, Miklós Kozlovszky

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In almost all research fields, scientific studies can be implemented as in silico experiments. These are modelled by scientific workflows, which describe the data or control flow between consecutive computational tasks. Because these experiments are data- and compute-intensive, they need parallel and distributed infrastructures (grids, clusters, clouds and supercomputers) to be enacted. The complexity of these infrastructures and their continuously changing environment pose a major challenge for reproducibility, which is often needed for sharing results or for judging scientific claims within the scientific community. The parameters necessary for reproducing a workflow can originate from different sources (infrastructural, third-party, or related to the binaries), and they may change or become unavailable by the time of re-execution. In most cases, however, the lack of the original parameters can be compensated for by replacing, evaluating or simulating the values of the descriptors, at some extra cost, in order to make the workflow reproducible. In this paper we give the expected cost of making a workflow reproducible or, more precisely, determine the probability that making a workflow reproducible costs more than a predefined cost C.
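The abstract does not state the paper's cost model, so the following is only a minimal illustrative sketch under assumptions that are not taken from the paper: each descriptor is independently unavailable with some probability, and compensating for a missing descriptor incurs a fixed cost. The names `cost_distribution`, `expected_cost` and `prob_cost_exceeds`, and the example numbers, are hypothetical.

```python
def cost_distribution(descriptors):
    """Return {total_cost: probability} for a list of descriptors.

    Each descriptor is a pair (p_unavailable, compensation_cost).
    Assumption (not from the paper): descriptors fail independently,
    and a missing descriptor always costs exactly compensation_cost.
    """
    dist = {0.0: 1.0}
    for p, c in descriptors:
        updated = {}
        for total, prob in dist.items():
            # Descriptor still available: no extra cost.
            updated[total] = updated.get(total, 0.0) + prob * (1.0 - p)
            # Descriptor unavailable: pay the compensation cost.
            updated[total + c] = updated.get(total + c, 0.0) + prob * p
        dist = updated
    return dist


def expected_cost(descriptors):
    """Expected total cost, by linearity of expectation."""
    return sum(p * c for p, c in descriptors)


def prob_cost_exceeds(descriptors, threshold):
    """Probability that the total compensation cost is more than threshold."""
    dist = cost_distribution(descriptors)
    return sum(prob for total, prob in dist.items() if total > threshold)


if __name__ == "__main__":
    # Hypothetical workflow with two descriptors.
    example = [(0.5, 2.0), (0.5, 4.0)]
    print(expected_cost(example))           # expected cost
    print(prob_cost_exceeds(example, 3.0))  # P(cost > C) for C = 3
```

Under these assumptions the expected cost follows directly from linearity of expectation, while the tail probability for a threshold C needs the full distribution of the total cost, which the sketch builds one descriptor at a time.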

Original language: English
Title of host publication: SACI 2016 - 11th IEEE International Symposium on Applied Computational Intelligence and Informatics, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 187-190
Number of pages: 4
ISBN (Electronic): 9781509023790
DOI: https://doi.org/10.1109/SACI.2016.7507367
Publication status: Published - Jul 7 2016
Event: 11th IEEE International Symposium on Applied Computational Intelligence and Informatics, SACI 2016 - Timisoara
Duration: May 12 2016 → May 14 2016

Other

Other: 11th IEEE International Symposium on Applied Computational Intelligence and Informatics, SACI 2016
City: Timisoara
Period: 5/12/16 → 5/14/16

Fingerprint

Scientific Workflow
Workflow
Reproducibility
Work Flow
Costs and Cost Analysis
Costs
Infrastructure
Supercomputers
Supercomputer
Data Flow
Flow control
Descriptors
Experiment
Consecutive
Sharing
Experiments
Computer Simulation
Binary
Grid
Necessary

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Control and Systems Engineering
  • Control and Optimization
  • Health Informatics

Cite this

Bánáti, A., Kacsuk, P., & Kozlovszky, M. (2016). Evaluating the reproducibility cost of the scientific workflows. In SACI 2016 - 11th IEEE International Symposium on Applied Computational Intelligence and Informatics, Proceedings (pp. 187-190). [7507367] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/SACI.2016.7507367
