Taskblaster workflow for high-throughput simulations
This repository contains a brief tutorial demonstrating how to perform high-throughput DFT calculations using Taskblaster with FHI-aims. Taskblaster is a lightweight Python framework for designing and managing high-throughput workflows. It allows multiple tasks (e.g. structural relaxation, band-structure calculation, post-processing analysis) to be executed via a command-line interface. For detailed information, please refer to the official documentation.
Required packages
Besides Taskblaster itself, you will need to choose an electronic structure package. In this tutorial, we will be using FHI-aims. You can run the calculations on a laptop as well as on a supercomputer once all of these packages are installed.
Installation
It is recommended to run the workflow in a separate environment. You can create a conda environment as follows:
conda create -n tb_workflow
conda activate tb_workflow
You can clone this repository to get the files required to set up a Taskblaster workflow. In the cloned directory, run:
pip install --editable .
This sets up a high-throughput workflow package named ht_workflow and installs the packages necessary to run the workflow, such as ASE, if they are not already installed.
Optionally, once this is done, you can check that everything works as expected by importing the module in Python:
import ht_workflow
If you are using the latest version of ASE (3.23.0), you can configure FHI-aims with ASE by setting the FHI-aims executable command and the basis-set location in the $HOME/.config/ase/config.ini file. An example is shown below:
[aims]
command = mpirun -np 6 aims.x
default_species_directory = /path/to/FHIaims/species_defaults/defaults_2020/light
where aims.x should be the name of your FHI-aims executable (including its location if it is not on your path), and default_species_directory should point to where the species_defaults folder is found in your FHI-aims installation.
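If you want to check programmatically that the config file is well formed, the [aims] section can be read with Python's standard configparser module. The helper below is a hypothetical sketch (load_aims_config is not part of ASE or Taskblaster):

```python
import configparser
import tempfile
from pathlib import Path

# Hypothetical helper: parse the [aims] section of an ASE config.ini so you
# can sanity-check the executable command and species directory before
# launching the workflow.
def load_aims_config(path):
    parser = configparser.ConfigParser()
    parser.read(path)
    section = parser["aims"]
    return section["command"], section["default_species_directory"]

# Demonstrate on a sample file mirroring the snippet above.
sample = """\
[aims]
command = mpirun -np 6 aims.x
default_species_directory = /path/to/FHIaims/species_defaults/defaults_2020/light
"""
with tempfile.TemporaryDirectory() as tmpdir:
    ini = Path(tmpdir) / "config.ini"
    ini.write_text(sample)
    command, species_dir = load_aims_config(ini)
    print(command)      # mpirun -np 6 aims.x
    print(species_dir)  # /path/to/FHIaims/species_defaults/defaults_2020/light
```

ASE itself reads this file automatically; the sketch is only a convenience for catching typos early.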
Input file preparation
We will divide this tutorial into two sections. In the first section, we discuss how to set up a workflow where multiple tasks (different DFT calculations) can be carried out on a single material. In the second section, we will see how this workflow can be modified to do a single task on multiple materials. You may also try the combination of both as an exercise.
Two files are required for running the Taskblaster workflow and must therefore be placed in the directory where you set up the workflow: tasks.py, defining the different tasks to be performed, and workflow.py, defining the order of and dependencies between the tasks. Examples of these files can be found in the ht_workflow/single_material and ht_workflow/multi_material directories for the respective cases.
Single Material-Based Workflow
In a new directory, a workflow can be initialized by running tb init ht_workflow. This allows the workflow to save objects using the ASE JSON encoder. You may then copy the files tasks.py and workflow.py from ht_workflow/single_material into your current working directory. You will need to edit the line
species_dir = "/scratch/projects/bep00114/softwares/aims_new/FHIaims/species_defaults/defaults_2020/light"
in tasks.py to point to the location of your species_defaults folder, as specified in the ASE config above.
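A wrong species_dir only surfaces at run time, so it can be worth verifying the path first. A small hypothetical helper (not part of the repository):

```python
from pathlib import Path

# Hypothetical check: the species_dir entered in tasks.py should exist and
# contain per-element basis-set files (FHI-aims names them e.g. '14_Si_default').
def check_species_dir(path):
    p = Path(path)
    return p.is_dir() and any(p.iterdir())

print(check_species_dir("/nonexistent/species_defaults"))  # False
```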
We select bulk Si as the material of interest, as it is a simple system with two atoms in the unit cell. As we can see from the tasks.py file, we define three tasks to be performed on bulk Si: a PBEsol relaxation, an HSE06 SCF calculation, and an HSE06 band-structure calculation. This is a reasonable workflow hierarchy, since the PBEsol functional gives reliable geometries while HSE06 gives more accurate energetics and band gaps for narrow-gap semiconductors such as Si. These tasks are defined as Python functions, which are called under the Single_Material_Workflow class in the workflow.py file.
Now we can run the command
tb workflow workflow.py
This generates the tasks we have defined. They can be checked by running tb ls, which will show:
state deps tags worker time folder
───────────────────────────────────────────────────────────────────────────────
new 0/1 tree/hse_band
new 0/1 tree/hse_scf
new 0/0 tree/pbesol_relax
The label new indicates that these tasks are newly created and have not been run yet. The deps column shows the dependencies of the tasks: the PBEsol relaxation starts from scratch, whereas the remaining tasks take the outputs of other tasks as inputs, in the same way we defined them in the workflow.py file. Now we can run these tasks via
tb run .
An example output of this command will look like:
Starting worker rank=000 size=001
[rank=000 2024-09-09 17:31:25 N/A-0/1] Worker class: —
[rank=000 2024-09-09 17:31:25 N/A-0/1] Required tags: —
[rank=000 2024-09-09 17:31:25 N/A-0/1] Supported tags: —
[rank=000 2024-09-09 17:31:25 N/A-0/1] Main loop
Got task <taskblaster.worker.LoadedTask object at 0x1473c7c217c0>
[rank=000 2024-09-09 17:31:25 N/A-0/1] Running pbesol_relax ...
[rank=000 2024-09-09 17:31:50 N/A-0/1] Task pbesol_relax finished in 0:00:25.201288
Got task <taskblaster.worker.LoadedTask object at 0x1473c7bfcb20>
[rank=000 2024-09-09 17:31:50 N/A-0/1] Running hse_scf ...
[rank=000 2024-09-09 17:32:09 N/A-0/1] Task hse_scf finished in 0:00:18.724472
Got task <taskblaster.worker.LoadedTask object at 0x1473c0d40160>
[rank=000 2024-09-09 17:32:09 N/A-0/1] Running hse_band ...
[rank=000 2024-09-09 17:34:41 N/A-0/1] Task hse_band finished in 0:02:32.160490
[rank=000 2024-09-09 17:34:41 N/A-0/1] No available tasks, end worker main loop
As we can see, all three tasks finished successfully. We can inspect the directories of these tasks inside the tree directory to verify this. The FHI-aims input and output files generated for each task can be found in this repository. Please note that for tutorial purposes we have used very light settings, which should be changed for production runs. Despite these light settings, we obtain a band gap of 1.38 eV, which is not too far from the experimental band gap of 1.12 eV.
If any step of the calculation failed, e.g. if you forgot to change the species_dir in tasks.py, then you may see something like the following after running tb ls:
state deps tags worker time folder
───────────────────────────────────────────────────────────────────────────────
cancel 0/1 tree/hse_band
cancel 0/1 tree/hse_scf
fail 0/0 N/A-0/1 00:00:00 tree/pbesol_relax
^^^^ RuntimeError: The requested species_dir /scratch/projects/bep00114/softwares/aims_new/FHIaims/species_defaults/defaults_…
If you wish to edit anything in tasks.py and then re-run the workflow, you must first use the tb unrun command, followed by the folder you wish to unrun, e.g.
tb unrun tree/pbesol_relax --force
The workflow will then be reset, which you can verify by running tb ls:
state deps tags worker time folder
───────────────────────────────────────────────────────────────────────────────
new 0/1 tree/hse_band
new 0/1 tree/hse_scf
new 0/0 tree/pbesol_relax
You can then edit the tasks.py file and re-run the workflow. Other parameters, e.g. the size of the k-grid, can be edited in a similar manner.
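For example, a hypothetical way to organize such settings in tasks.py is a plain dictionary of FHI-aims keywords (k_grid, relativistic, and xc are real FHI-aims/ASE keyword names; the names LIGHT and PRODUCTION and the specific values are made up for illustration):

```python
# Hypothetical parameter sets for tutorial vs. production runs.
LIGHT = {
    "xc": "pbesol",
    "k_grid": (4, 4, 4),
    "relativistic": ("atomic_zora", "scalar"),
}

PRODUCTION = {
    **LIGHT,
    "k_grid": (12, 12, 12),  # denser k-grid for converged results
}

print(PRODUCTION["k_grid"])  # (12, 12, 12)
```

Grouping settings like this makes it easy to unrun a task, swap in the tighter dictionary, and re-run without touching the task logic itself.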
Multi Material-Based Workflow
Now we will look at a scenario where tasks are executed over multiple materials. This is often useful in materials science, especially when building databases or machine-learning datasets. For simplicity, we will consider four bulk solids (Si, Ag, Ti, Au) and a single task, the PBEsol relaxation. You can copy both the tasks.py and workflow.py files from the ht_workflow/single_material directory. Since we are using only the PBEsol relaxation, we can comment out or delete the lines in the workflow.py file related to the other tasks. We also rename the class from Single_Material_Workflow to Multi_Material_Workflow.
To work with multiple materials, we create an ASE database using the write_db.py file in the ht_workflow/multi_material directory. Copy it to your working directory and execute it. This generates an ASE database file named bulk.db, which stores basic information such as lattice parameters, atomic positions, and magnetic moments for these materials.
After initializing a repository as we have done previously, we can add the information for all the materials to the workflow by copying over the totree.py file, also found in the ht_workflow/multi_material directory, and running
tb workflow totree.py
This will create a tree structure with the four materials, and you should see the following output:
add: new 0/0 tree/Si2/material
add: new 0/0 tree/Ag/material
add: new 0/0 tree/Ti2/material
add: new 0/0 tree/Au/material
Now we modify the def workflow routine inside workflow.py from the single-Si case so that it runs over all the materials that have just been added to the tree:
@tb.parametrize_glob('*/material')
def workflow(material):
    return Multi_Material_Workflow(atoms=material)
This modified workflow.py file can be found in the ht_workflow/multi_material directory. Finally, similar to the single-material case, running tb workflow workflow.py initiates the workflow. tb ls will then return:
entry: add new 0/1 tree/Ag/pbesol_relax
entry: add new 0/1 tree/Au/pbesol_relax
entry: add new 0/1 tree/Si2/pbesol_relax
entry: add new 0/1 tree/Ti2/pbesol_relax
As we can see, the same task (PBEsol relaxation) is now defined for all the materials (the stoichiometry in the folder names reflects the number of atoms in each unit cell).
Now we can submit these tasks by running
tb run .
which will show an output similar to:
Starting worker rank=000 size=001
[rank=000 2024-09-09 22:35:22 N/A-0/1] Worker class: —
[rank=000 2024-09-09 22:35:22 N/A-0/1] Required tags: —
[rank=000 2024-09-09 22:35:22 N/A-0/1] Supported tags: —
[rank=000 2024-09-09 22:35:22 N/A-0/1] Main loop
Got task <taskblaster.worker.LoadedTask object at 0x147cb68ae3a0>
[rank=000 2024-09-09 22:35:22 N/A-0/1] Running Ag/material ...
[rank=000 2024-09-09 22:35:22 N/A-0/1] Task Ag/material finished in 0:00:00.002124
Got task <taskblaster.worker.LoadedTask object at 0x147cb68ae550>
[rank=000 2024-09-09 22:35:22 N/A-0/1] Running Ag/pbesol_relax ...
[rank=000 2024-09-09 22:35:48 N/A-0/1] Task Ag/pbesol_relax finished in 0:00:26.249008
Got task <taskblaster.worker.LoadedTask object at 0x147cafa62fd0>
[rank=000 2024-09-09 22:35:48 N/A-0/1] Running Au/material ...
[rank=000 2024-09-09 22:35:48 N/A-0/1] Task Au/material finished in 0:00:00.002103
Got task <taskblaster.worker.LoadedTask object at 0x147cafa62fa0>
[rank=000 2024-09-09 22:35:48 N/A-0/1] Running Au/pbesol_relax ...
[rank=000 2024-09-09 22:36:16 N/A-0/1] Task Au/pbesol_relax finished in 0:00:27.658316
Got task <taskblaster.worker.LoadedTask object at 0x147cafa62fa0>
[rank=000 2024-09-09 22:36:16 N/A-0/1] Running Si2/material ...
[rank=000 2024-09-09 22:36:16 N/A-0/1] Task Si2/material finished in 0:00:00.001984
Got task <taskblaster.worker.LoadedTask object at 0x147cb7107f70>
[rank=000 2024-09-09 22:36:16 N/A-0/1] Running Si2/pbesol_relax ...
[rank=000 2024-09-09 22:36:33 N/A-0/1] Task Si2/pbesol_relax finished in 0:00:16.813452
Got task <taskblaster.worker.LoadedTask object at 0x147cb7107f70>
[rank=000 2024-09-09 22:36:33 N/A-0/1] Running Ti2/material ...
[rank=000 2024-09-09 22:36:33 N/A-0/1] Task Ti2/material finished in 0:00:00.002379
Got task <taskblaster.worker.LoadedTask object at 0x147cafa62fa0>
[rank=000 2024-09-09 22:36:33 N/A-0/1] Running Ti2/pbesol_relax ...
[rank=000 2024-09-09 22:37:09 N/A-0/1] Task Ti2/pbesol_relax finished in 0:00:36.020013
[rank=000 2024-09-09 22:37:09 N/A-0/1] No available tasks, end worker main loop
You can inspect the results of the calculations by going into the respective directory for each material. Note that this will run the calculations for all the materials. If you have an expensive task and would like to submit it individually for each material, you can specify the corresponding path when running the tasks. For instance, tb run tree/Ag will perform the task(s) only for Ag.
To run the calculations on a supercomputer, either for all the materials together or for specific materials, you can put the same command in the submission script. An example SLURM script is shown below:
#!/bin/bash
#SBATCH -o ./Ag_out.%j
#SBATCH -e ./Ag_err.%j
#SBATCH -D ./
#SBATCH -J Ag
#SBATCH --nodes=5
#SBATCH --ntasks-per-node=96
#SBATCH --cpus-per-task=1
#SBATCH --time=05:00:00
ulimit -s unlimited
module load anaconda3/2019.10
source /sw/tools/anaconda3/2019.10/skl/etc/profile.d/conda.sh
conda activate tb_workflow
tb run /path/to/tree/Ag