The flowbster cloud-oriented workflow system to process large scientific data sets

P. Kacsuk, József Kovács, Zoltán Farkas

Research output: Contribution to journalArticle

4 Citations (Scopus)

Abstract

The paper describes a new cloud-oriented workflow system called Flowbster. It was designed to create efficient data pipelines in clouds by which large compute-intensive data sets can efficiently be processed. The Flowbster workflow can be deployed in the target cloud as a virtual infrastructure through which the data to be processed can flow and meanwhile it flows through the workflow it is transformed as the business logic of the workflow defines it. Instead of using the enactor based workflow concept Flowbster applies the service choreography concept where the workflow nodes directly communicate with each other. Workflow nodes are able to recognize if they can be activated with a certain data set without the interaction of central control service like the enactor in service orchestration workflows. As a result Flowbster workflows implement a much more efficient data path through the workflow than service orchestration workflows. A Flowbster workflow works as a data pipeline enabling the exploitation of pipeline parallelism, workflow parallel branch parallelism and node scalability parallelism. The Flowbster workflow can be deployed in the target cloud on-demand based on the underlying Occopus cloud deployment and orchestrator tool. Occopus guarantees that the workflow can be deployed in several major types of IaaS clouds (OpenStack, OpenNebula, Amazon, CloudSigma). It takes care of not only deploying the nodes of the workflow but also to maintain their health by using various health-checking options. Flowbster also provides an intuitive graphical user interface for end-user scientists. This interface hides the low level cloudoriented layers and hence users can concentrate on the business logic of their data processing applications without having detailed knowledge on the underlying cloud infrastructure.

Original languageEnglish
Pages (from-to)55-83
Number of pages29
JournalJournal of Grid Computing
Volume16
Issue number1
DOIs
Publication statusPublished - Jan 11 2018

Keywords

  • Cloud orchestration
  • Data stream
  • Multi-cloud
  • Scientific data
  • Workflow

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Hardware and Architecture
  • Computer Networks and Communications

Fingerprint Dive into the research topics of 'The flowbster cloud-oriented workflow system to process large scientific data sets'. Together they form a unique fingerprint.

  • Cite this