Pegasus workflow management system

Continuing the discussion from CI Meeting 2023/09/27:

An update on my research about using Pegasus as our workflow management engine: my preliminary reading of the documentation is encouraging. Pegasus checks most if not all of our boxes:

  • :white_check_mark: Native support for containerized jobs
  • :white_check_mark: Provenance data is collected in a database, and the data can be summarized with tools such as pegasus-statistics, pegasus-plots, or directly with SQL queries.
  • :white_check_mark: Workflow API: workflows can be defined in YAML, but the recommended way to create Pegasus workflows is the Python API.
  • :white_check_mark: Many Execution Environments are supported:
    • Local Execution
    • Condor Pools and Glideins
    • Grids
    • Clouds
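
To make the workflow API point more concrete, here is a minimal sketch of the underlying idea as I understand it: jobs declare the files they read and write, and the execution order follows from those declarations. This is not Pegasus code. The `Job` dataclass, the `execution_order` helper, and the file names are all hypothetical stand-ins that use only the Python standard library (3.9+); the real API lives in the `Pegasus.api` package and should be checked against the user guide.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Job:
    """Hypothetical stand-in for a workflow job; NOT the Pegasus Job class."""
    name: str
    inputs: set = field(default_factory=set)
    outputs: set = field(default_factory=set)


def execution_order(jobs):
    """Order jobs so that each one runs after the jobs that produce its inputs."""
    # Map every output file to the name of the job that produces it.
    producer = {f: job.name for job in jobs for f in job.outputs}
    # A job depends on the producers of its inputs; files with no producer
    # (such as raw.dat below) are treated as pre-existing inputs.
    graph = {
        job.name: {producer[f] for f in job.inputs if f in producer}
        for job in jobs
    }
    return list(TopologicalSorter(graph).static_order())


# Jobs are listed out of order on purpose; the file links determine the order.
jobs = [
    Job("analyze", inputs={"clean.dat"}, outputs={"result.dat"}),
    Job("preprocess", inputs={"raw.dat"}, outputs={"clean.dat"}),
    Job("plot", inputs={"result.dat"}, outputs={"figure.png"}),
]
order = execution_order(jobs)
print(order)  # ['preprocess', 'analyze', 'plot']
```

A real workflow manager adds what this sketch leaves out: data staging, retries, containers, and provenance capture.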

I refuse to fall victim to the sunk cost fallacy here. I have already spent a considerable amount of time working on a custom workflow management solution, but it is primitive compared to Pegasus, and it’s hard to imagine that the disadvantages of using Pegasus could outweigh its benefits in terms of features and robustness.

There are people at NCSA who have some familiarity with Pegasus, so I’m getting their input as well. Any additional insight from the MUSES community would be appreciated.

It turns out that Pegasus is already being used as the official workflow management system for ACCESS, through which the NCSA Delta HPC cluster resources are allocated.

Pegasus is a workflow management system, which enables you to run computational workflows across ACCESS resources. You will seamlessly be able to orchestrate jobs and data movements at different resource providers. At this point, Pegasus on ACCESS is mainly used for high throughput computing (HTC) workloads. This means jobs which can fit on a single compute node (single core, multicore, or single node MPI jobs).

Pegasus is being used in production to execute scientific workflows in several different disciplines including astronomy, gravitational-wave physics, bioinformatics, earthquake engineering, helio-seismology, limnology, machine learning, and molecular dynamics, among others. Pegasus provides the necessary abstractions for scientists to create workflows and allows for transparent execution of these workflows on a range of computing platforms. More information can be found on the Pegasus website or in the Pegasus user guide.
