This repository contains source code and the final version of my PhD thesis titled "Ergonomics and efficiency of workflows on HPC clusters".
The rendered thesis is available online; a PDF version can be found here.
The goal of my thesis was to design approaches that enable effortless and efficient execution of task graphs (DAGs, workflows, pipelines, …) on heterogeneous HPC (High-Performance Computing) clusters, also known as supercomputers. This research led to the following three main contributions:
HyperQueue is an HPC-focused task runtime that can automatically load balance task graphs across different Slurm/PBS allocations and provides sophisticated heterogeneous resource management.
It is actively being used by researchers in various HPC centers across Europe; I consider it to be the most successful output of my thesis.
RSDS is a reimplementation of the Dask server and scheduler, which improves its performance in HPC scenarios and is backwards compatible with existing Dask programs.