Taskblaster workflow for high-throughput simulations

This repository contains a brief tutorial demonstrating how to perform high-throughput DFT calculations using Taskblaster with FHI-aims. Taskblaster is a lightweight Python framework for designing and managing high-throughput workflows. It allows multiple tasks (e.g. structural relaxation, band structure calculation, post-processing analysis) to be executed via a command-line interface. For detailed information, please refer to the official Taskblaster documentation.

Required packages

You will also need to choose an electronic structure package; in this tutorial, we will be using FHI-aims. Once all the required packages are installed, you can run the calculations on a laptop as well as on a supercomputer.

Installation

It is recommended to run the workflow by creating a separate environment. You can create a conda environment as follows:

conda create -n tb_workflow
conda activate tb_workflow
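
Note that creating an environment without specifying any packages leaves it empty; if Python and pip are not already available inside it, you can install them before proceeding (the Python version here is only an example):

conda install python=3.11 pip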

You can clone this repository to get the files required to set up a Taskblaster workflow. Inside the cloned directory, run:

pip install --editable .

This will set up a high-throughput workflow package named ht_workflow and install the packages needed to run the workflow, such as ASE, if they are not already installed.

Optionally, once this is done, you can check that everything works as expected by importing the module in Python: import ht_workflow
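
The same check can be done in one line from the shell; the command should finish silently if the installation succeeded:

python -c "import ht_workflow"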

If you are using the latest version of ASE (3.23.0), you can configure FHI-aims with ASE by setting the FHI-aims executable command and the basis set location in the $HOME/.config/ase/config.ini file. An example is shown below:

[aims]
    command = mpirun -np 6 aims.x
    default_species_directory = /path/to/FHIaims/species_defaults/defaults_2020/light

where aims.x should be replaced by the name of your FHI-aims executable (including its full path if it is not on your PATH), and default_species_directory should point to the species_defaults folder of your FHI-aims installation.
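
To check that ASE picks up this configuration, you can run a small single-point calculation outside the workflow. This is only an illustrative sketch and not part of the repository; the calculator settings below are arbitrary:

from ase.build import bulk
from ase.calculators.aims import Aims

# ASE reads the aims command and species directory from ~/.config/ase/config.ini
atoms = bulk('Si')
atoms.calc = Aims(xc='pbe', kpts=(4, 4, 4))
print('Total energy (eV):', atoms.get_potential_energy())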

Input file preparation

We will divide this tutorial into two sections. In the first section, we discuss how to set up a workflow where multiple tasks (different DFT calculations) can be carried out on a single material. In the second section, we will see how this workflow can be modified to do a single task on multiple materials. You may also try the combination of both as an exercise.

Two files are needed to run the Taskblaster workflow and must therefore be placed in the directory where you set up the workflow:

  • tasks.py, which defines the different tasks to be performed;
  • workflow.py, which defines the order of and dependencies between the tasks.

Examples of these files can be found in the ht_workflow/single_material and ht_workflow/multi_material directories for the respective cases.

Single Material-Based Workflow

In a new directory, the workflow repository can be initialized by running tb init ht_workflow. Passing the ht_workflow module here allows the workflow to save objects using the ASE JSON encoder.

You may then copy over the files tasks.py and workflow.py from ht_workflow/single_material into your current working directory. You will need to edit the line

species_dir = "/scratch/projects/bep00114/softwares/aims_new/FHIaims/species_defaults/defaults_2020/light"

in tasks.py to the location of your species_defaults folder, as specified in the ASE config above.

We will select bulk Si as the material of interest, since it is a simple system with two atoms in the unit cell. As we can see from the tasks.py file, we define three different tasks to be performed on bulk Si: a PBEsol relaxation, an HSE06 SCF calculation and an HSE06 band structure calculation. This is a reasonable workflow hierarchy, as the PBEsol functional gives reliable geometries while HSE06 gives more accurate energetics and band gaps for narrow-gap semiconductors like Si. The tasks are defined as Python functions, which are called from the Single_Material_Workflow class in the workflow.py file.
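
For orientation, the overall shape of workflow.py is sketched below. This is only a schematic based on the Taskblaster documentation; the task names, node target strings and argument names are illustrative, and the file shipped in ht_workflow/single_material is the authoritative version.

import taskblaster as tb

@tb.workflow
class Single_Material_Workflow:
    # Input crystal structure (bulk Si in this tutorial)
    atoms = tb.var()

    @tb.task
    def pbesol_relax(self):
        # Relaxation starts directly from the input structure
        return tb.node('pbesol_relax', atoms=self.atoms)

    @tb.task
    def hse_scf(self):
        # The HSE06 SCF step uses the relaxed geometry as input
        return tb.node('hse_scf', atoms=self.pbesol_relax)

    @tb.task
    def hse_band(self):
        # The band structure step depends on the HSE06 SCF task
        return tb.node('hse_band', scf=self.hse_scf)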

Now we can run the command

tb workflow workflow.py

This generates the tasks we have defined. They can be listed by running tb ls, which will show

state    deps  tags        worker        time     folder
───────────────────────────────────────────────────────────────────────────────
new      0/1                                      tree/hse_band
new      0/1                                      tree/hse_scf
new      0/0                                      tree/pbesol_relax

The state new indicates that these tasks are newly created and have not been submitted yet. The deps column shows the dependencies of the tasks: the PBEsol relaxation starts from scratch, whereas the remaining tasks take the outputs of other tasks as inputs, exactly as we defined them in the workflow.py file. Now we can run these tasks via

tb run .

An example output of this command will look like:

Starting worker rank=000 size=001
[rank=000 2024-09-09 17:31:25 N/A-0/1] Worker class: —
[rank=000 2024-09-09 17:31:25 N/A-0/1] Required tags: —
[rank=000 2024-09-09 17:31:25 N/A-0/1] Supported tags: —
[rank=000 2024-09-09 17:31:25 N/A-0/1] Main loop
Got task <taskblaster.worker.LoadedTask object at 0x1473c7c217c0>
[rank=000 2024-09-09 17:31:25 N/A-0/1] Running pbesol_relax ...
[rank=000 2024-09-09 17:31:50 N/A-0/1] Task pbesol_relax finished in 0:00:25.201288
Got task <taskblaster.worker.LoadedTask object at 0x1473c7bfcb20>
[rank=000 2024-09-09 17:31:50 N/A-0/1] Running hse_scf ...
[rank=000 2024-09-09 17:32:09 N/A-0/1] Task hse_scf finished in 0:00:18.724472
Got task <taskblaster.worker.LoadedTask object at 0x1473c0d40160>
[rank=000 2024-09-09 17:32:09 N/A-0/1] Running hse_band ...
[rank=000 2024-09-09 17:34:41 N/A-0/1] Task hse_band finished in 0:02:32.160490
[rank=000 2024-09-09 17:34:41 N/A-0/1] No available tasks, end worker main loop

As we can see, all three tasks finished successfully. We can verify this by inspecting the task directories inside the tree directory. The FHI-aims input and output files generated for each of the tasks can also be found in this repository. Please note that, for tutorial purposes, we have used very light settings, which should be changed for production runs. Despite these light settings, we get a band gap of 1.38 eV, which is not too far from the experimental band gap of 1.12 eV.
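
The band gap can be found in the FHI-aims output of the band structure task. Assuming the output is captured in a file named aims.out inside the task folder (the exact file name depends on how the calculator is set up in tasks.py), it can be located with, for example:

grep -i gap tree/hse_band/aims.out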

If any step of the calculation failed, e.g. if you forgot to change the species_dir in tasks.py, then you may see something like the following after running tb ls

state    deps  tags        worker        time     folder
───────────────────────────────────────────────────────────────────────────────
cancel   0/1                                      tree/hse_band
cancel   0/1                                      tree/hse_scf
fail     0/0               N/A-0/1       00:00:00 tree/pbesol_relax
^^^^  RuntimeError: The requested species_dir /scratch/projects/bep00114/softwares/aims_new/FHIaims/species_defaults/defaults_…

If you wish to edit anything in tasks.py and then re-run the workflow, you must first use the tb unrun command, followed by the folder you wish to unrun, e.g.

tb unrun tree/pbesol_relax --force

The workflow will then be reset, which you can verify by running tb ls:

state    deps  tags        worker        time     folder
───────────────────────────────────────────────────────────────────────────────
new      0/1                                      tree/hse_band
new      0/1                                      tree/hse_scf
new      0/0                                      tree/pbesol_relax

You can then edit the tasks.py file and re-run the workflow. You could also edit other parameters, e.g. the size of the k-grid, in a similar manner.
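
For instance, if the relaxation task builds its calculator with the ASE Aims interface, the k-grid would be controlled by a keyword along the lines of the hypothetical fragment below; the exact parameter names in the repository's tasks.py may differ:

from ase.calculators.aims import Aims

# Hypothetical fragment of tasks.py; species_dir is the variable edited earlier
calc = Aims(xc='pbesol',
            kpts=(8, 8, 8),  # size of the k-grid; increase for production runs
            species_dir=species_dir)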

Multi Material-Based Workflow

Now we will look into a scenario where tasks are executed over multiple materials. This is often useful in materials science, especially when building databases or machine-learning datasets. For simplicity, we will consider four bulk solids (Si, Ag, Ti, Au) and use a single task, the PBEsol relaxation. You can copy both the tasks.py and workflow.py files from the ht_workflow/single_material directory. Since we are using only the PBEsol relaxation, we can comment out or delete the lines in the workflow.py file related to the other tasks. We also rename the class from Single_Material_Workflow to Multi_Material_Workflow.

To work with multiple materials, we can create an ASE database using the write_db.py script in the ht_workflow/multi_material directory. Copy it to your working directory and execute it. This will generate an ASE database file named bulk.db, which stores the basic information for these materials, such as lattice parameters, atomic positions and magnetic moments.
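
For reference, a minimal version of such a script could look roughly as follows; the actual write_db.py in the repository may store additional properties (e.g. magnetic moments):

from ase.build import bulk
from ase.db import connect

# Build simple bulk structures and store them in an ASE database file
db = connect('bulk.db')
for name in ['Si', 'Ag', 'Ti', 'Au']:
    atoms = bulk(name)  # default crystal structure and lattice constant from ASE
    db.write(atoms)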

After initializing a repository as we have done previously, we can add the information for all the materials to the workflow by copying over the totree.py file, also found in the ht_workflow/multi_material directory, and running

tb workflow totree.py

This will create a tree structure with the four materials, and you should see the following output

     add: new      0/0   tree/Si2/material
     add: new      0/0   tree/Ag/material
     add: new      0/0   tree/Ti2/material
     add: new      0/0   tree/Au/material

Now, we modify the workflow function inside workflow.py from the single-Si case so that it runs over all the materials that have just been added to the tree. The @tb.parametrize_glob decorator instantiates the workflow once for every task matching the pattern */material, so each material gets its own copy of the tasks:

@tb.parametrize_glob('*/material')
def workflow(material):
    return Multi_Material_Workflow(atoms=material)

This modified workflow.py file can be found inside the ht_workflow/multi_material directory.

Finally, as in the single-material case, run tb workflow workflow.py to generate the tasks. tb ls will then return:

entry:         add new      0/1   tree/Ag/pbesol_relax 
entry:         add new      0/1   tree/Au/pbesol_relax 
entry:         add new      0/1   tree/Si2/pbesol_relax 
entry:         add new      0/1   tree/Ti2/pbesol_relax 

As we can see, the same task (PBEsol relaxation) is now defined for all the materials (the subscript in the material names indicates the number of atoms in the unit cell).

Now we can submit these tasks by running

tb run .

which will show an output similar to:

Starting worker rank=000 size=001
[rank=000 2024-09-09 22:35:22 N/A-0/1] Worker class: —
[rank=000 2024-09-09 22:35:22 N/A-0/1] Required tags: —
[rank=000 2024-09-09 22:35:22 N/A-0/1] Supported tags: —
[rank=000 2024-09-09 22:35:22 N/A-0/1] Main loop
Got task <taskblaster.worker.LoadedTask object at 0x147cb68ae3a0>
[rank=000 2024-09-09 22:35:22 N/A-0/1] Running Ag/material ...
[rank=000 2024-09-09 22:35:22 N/A-0/1] Task Ag/material finished in 0:00:00.002124
Got task <taskblaster.worker.LoadedTask object at 0x147cb68ae550>
[rank=000 2024-09-09 22:35:22 N/A-0/1] Running Ag/pbesol_relax ...
[rank=000 2024-09-09 22:35:48 N/A-0/1] Task Ag/pbesol_relax finished in 0:00:26.249008
Got task <taskblaster.worker.LoadedTask object at 0x147cafa62fd0>
[rank=000 2024-09-09 22:35:48 N/A-0/1] Running Au/material ...
[rank=000 2024-09-09 22:35:48 N/A-0/1] Task Au/material finished in 0:00:00.002103
Got task <taskblaster.worker.LoadedTask object at 0x147cafa62fa0>
[rank=000 2024-09-09 22:35:48 N/A-0/1] Running Au/pbesol_relax ...
[rank=000 2024-09-09 22:36:16 N/A-0/1] Task Au/pbesol_relax finished in 0:00:27.658316
Got task <taskblaster.worker.LoadedTask object at 0x147cafa62fa0>
[rank=000 2024-09-09 22:36:16 N/A-0/1] Running Si2/material ...
[rank=000 2024-09-09 22:36:16 N/A-0/1] Task Si2/material finished in 0:00:00.001984
Got task <taskblaster.worker.LoadedTask object at 0x147cb7107f70>
[rank=000 2024-09-09 22:36:16 N/A-0/1] Running Si2/pbesol_relax ...
[rank=000 2024-09-09 22:36:33 N/A-0/1] Task Si2/pbesol_relax finished in 0:00:16.813452
Got task <taskblaster.worker.LoadedTask object at 0x147cb7107f70>
[rank=000 2024-09-09 22:36:33 N/A-0/1] Running Ti2/material ...
[rank=000 2024-09-09 22:36:33 N/A-0/1] Task Ti2/material finished in 0:00:00.002379
Got task <taskblaster.worker.LoadedTask object at 0x147cafa62fa0>
[rank=000 2024-09-09 22:36:33 N/A-0/1] Running Ti2/pbesol_relax ...
[rank=000 2024-09-09 22:37:09 N/A-0/1] Task Ti2/pbesol_relax finished in 0:00:36.020013
[rank=000 2024-09-09 22:37:09 N/A-0/1] No available tasks, end worker main loop

You can inspect the results of the calculations by going into the respective directories for each material.
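
For example, the relaxed structures can be collected with ASE. The output file name below assumes the ASE Aims calculator's default (aims.out) and may differ in your setup:

from ase.io import read

# Read the final geometry from the FHI-aims output of each relaxation task
for name in ['Si2', 'Ag', 'Ti2', 'Au']:
    atoms = read(f'tree/{name}/pbesol_relax/aims.out', format='aims-output')
    print(name, atoms.cell.lengths())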

Note that tb run . will run the calculations for all the materials. If you have an expensive task and would like to submit it individually for each material, you can specify the corresponding path when running the tasks. For instance, tb run tree/Ag will perform the task(s) only for Ag.

To run the calculations on a supercomputer, either for all the materials together or for specific materials, you can include the same command in your submission script. An example SLURM script is shown below:

#!/bin/bash
#SBATCH -o ./Ag_out.%j
#SBATCH -e ./Ag_err.%j
#SBATCH -D ./
#SBATCH -J Ag
#SBATCH --nodes=5
#SBATCH --ntasks-per-node=96
#SBATCH --cpus-per-task=1
#SBATCH --time=05:00:00
ulimit -s unlimited
module load anaconda3/2019.10
source /sw/tools/anaconda3/2019.10/skl/etc/profile.d/conda.sh
conda activate tb_workflow
tb run /path/to/tree/Ag
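
The script can then be submitted in the usual way (the file name is just an example):

sbatch submit_Ag.sh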