Meta-Tools For Designing Scientific Workflow Management Systems: Part-I, Survey


Scientific workflows require the coordination of data processing activities, resulting in executions driven by data dependencies. Because of the scales involved and the repetition of analyses, workflows are typically executed in coordinated campaigns, with each execution managed and controlled by a workflow management system. A workflow management system is therefore required to (1) provide facilities for specifying workflows, including intermediate steps, inputs and outputs, and parameters; (2) manage the execution of the workflow based on the specified parameters; (3) provide facilities for managing data provenance; and (4) provide facilities for monitoring the progress of the workflow, including facilities to detect anomalies, isolate faults, and provide recovery actions. In this paper, Part I of a two-part series, we compare several state-of-the-art workflow management systems with respect to these four primary requirements.