Part 2: Replica-Exchange Grand-Canonical (REGC) scheme
This Replica-Exchange Grand-Canonical (REGC) code mainly focuses on calculating the phase diagrams of materials including cluster, nanoparticle, and surface in a reactive gas phase. The basic physical theories and techniques in REGC method are described in:
1 Y. Zhou, M. Scheffler, and L. M. Ghiringhelli, "Determining surface phase diagrams including anharmonic effects," Phys. Rev. B 100, 174106 (2019) https://link.aps.org/pdf/10.1103/PhysRevB.100.174106
2 Y. Zhou. Surface phase diagrams including anharmonic effects via a replica-exchange grand-canonical method. Doctoral thesis, Technische Universität Berlin, Berlin, 2020. https://depositonce.tu-berlin.de/handle/11303/11723
3 Y. Zhou, C. Zhu, M. Scheffler, and L. M. Ghiringhelli, "Ab initio approach for thermodynamic surface phases with full considerabtion of anharmonic effect -- the example of hydrogen at Si(100)", Phys. Rev. Lett 128, 246101 https://journals.aps.org/prl/pdf/10.1103/PhysRevLett.128.246101 Here we will run a very small example of a \(\mathrm{Si}_2\) cluster in a \(\mathrm{H}_2\) gas phase to demonstrate the workflow of the REGC method.
Please contact:zhouwaiwai9@gmail.com if any questions are raised for this section.
Getting started with FHI-panda
FHI-panda is written in python3
. The FHI-panda package is distributed in source code form, the following prerequisites are needed:
- compile FHI-aims always with PLUMED external package for ab initio MD
- python3.X
- numpy2.1.1
- mpi4py if needed for some cases
- pymbar https://pymbar.readthedocs.io/en/master/ for post-processing
To compile FHI-aims with PLUMED, add the following setting in your current CMake file and recompile
set(USE_PLUMED ON CACHE BOOL “”)
If you have used the point release 240910
the name of the executable compiled with PLUMED should be the following:
aims.240910.meta.scalapack.mpi.x
All FHI-panda scripts are shipped with the tutorial material, which you can download or clone from the following site:
It includes `exercises_1` for perform a REGC-MD run and `exercise_2` for the post-processing of REGC data.
## The Input files for REGC molecular dynamics
We will perform a REGC molecular dynamics calculation for a $\mathrm{Si}_2$ cluster in contact with
$H_2$ gas. Here, FHI-aims code was used for total energy and
molecular dynamics calculations. Please go to the following directory:
The working directory should contain the following necessary files:
1. **REGC.dat**: The REGC main input file
2. **\*.py**: All the python source files of FHI-panda
3. **run.sh**: The shell script for submit the REGC job
4. **control.in.E**: the FHI-aims input file. Note that the geometry info is already included in REGC.dat.
5. **subjob**: a bash script that defines how to run the replica in parallel.
## Preparing the `REGC.dat` input file
The main input file `REGC.dat` contains all the necessary
parameters for the structure prediction. For the $\mathrm{Si}_2$ cluster in a $\mathrm{H}_2$
gas phase it should look like this:
########################## The Basic Parameters of REGCMD
SystemName = Si2Hx NumberOfSpecies = 2 NameOfAtoms = Si D radius = 4 Center = 0.0 0.0 0.0 Numberofatom = 2 Numberofmetalatom = 2 @DistanceOfIon 2.14 1.4 1.4 0.75 @End Current_step = 0 Num_T = 2 Num_u = 2 Temperature = 300 400 Mu = -0.3 -0.2 MDt = 0.02 @Atomposition 0.94787 3.59052 0.22819 0.17231 3.14166 2.76918 @End N_total = 20000 mass = 28.086 2.014101 PickUp = F
The file consists of input tags that can be given in any order, or be omitted while the default values are used. Here we offer a quick overview of `REGC.dat` syntax:
- The general syntax is `"tag labels" = "value1 value2 value3 ..."`. For matrix inputs, it started with `\@tag` and ended with `\@End`. Values that are not specified in the REGC.dat file are assigned a default value. Input values should be separated by space.
- The lines beginning with \# is taken as explanation of the following `tag label`.
- Logical values can be given as T or True, F or False.
- The labels are not case sensitive.
In the following the input settings are described in detail. You can skip this subsection and continue with the exercise [by continuing with the next section](#running-regc-jobs-in-parallel).
### Input settings
* `SystemName = <string>`: A string gives a name for the description system. *Default value*: REGCMD
* `NumberOfSpecies = <integer>`: Number of different chemical species in the system. For example, this number is "2" for Si$_2$ in contact with $\mathrm{H}_2$ gas. *Default value*: There is no default. You must provide the number.
* `NameOfAtoms = <string1> <string2> <string3> ...`: Elemental symbols of
each chemical species separated by space. Taking $\mathrm{Si}_2$ in
contact with $\mathrm{H}_2$ gas as examples, one will write `NameOfAtoms
= Si H`. *Default value*: There is no default. You must define it.
* `Mass = <real1> <real2> <real3> ...`: The atomic mass of each chemical
species separated by space. Taking $\mathrm{Si}_2$ in contact with
H$_2$ gas as examples, one will write `Mass = 28.0855
1.0079`. *Default value*: There is no default. You must define it.
* `NumberOfmetalAtoms = <integer>`: The total number of metal particles
in the initial structure. Here the term `metal particles` refers to
the particles of bare cluster/surface. *Default value*: There is no default. You must define it.
* `@DistanceOfIon` (array)
```
@DistanceOfIon
<real11> <real12> <real13> ...
<real21> <real22> <real23> ...
...
@End
```
Minimal interatomic distances (in unit of angstrom) in a format of $n×n$ matrix. The rank $n$ of the matrix is determined by the `NumberOfSpecies`. For $\mathrm{Si}_4\mathrm{H}_x$ `NumberOfSpecies=2`,
the $2×2$ matrix is defined as:
```
@DistanceOfIon
d_11 d_12
d_11 d_12
@End
```
where $d_{11}$, $d_{12}$ are the minima distances between Si-Si and
Si-H, respectively. Thereby, are the minima distance between H-Si and
H-H, respectively. *Default value*: There is no default. You must define it.
* `SystemType = <integer>`: Our method targets at determining phase
diagrams of materials, including clusters(`<integer>`=0),
nanoparticles(`<integer>`=1), surfaces(`<integer>`=2) and bulk
(`<integer>`=3). *Default value*: There is no default. You must define it.
* `Num_T = <integer>`: the number of different temperatures selected in
the simulation. *Default value*: There is no default. You must define it.
* `Num_U = <integer>`: the number of different chemical potentials
selected in the simulation. *Default value*: There is no default. You must define it.
!!! important "The total number of replicas"
In total, the number of replicas is `Num_T`×`Num_U`!
* `Current_step = <integer>`: the number of REGC steps in the
simulation. If you just begin from scratch, set it to 0, otherwise, find
the number in the head line of you restart.dat files. *Default value*: There is no default. You must define it.
* `PickUp = <logical>`: If `False`, the calculation is from scratch. If `True`, it will start from a previous simulation, together set with `Current_step`. *Default value*: There is no default. You must define it.
* `MDt = <real>`: in ps unit. The simulation time for the single molecular
dynamics. *Default value*: There is no default. You must define it.
* `N_total=<integer>`: The total number of REGC steps for the
simulation. *Default value*: There is no default. You must define it.
* `NumberOfAtoms = <integer>` or `<integer1>, <integer2, integer3, ...>`: The
total number of particles in the initial structure. If all the replicas
share the same initial structure, then only one integer should be given,
on the other hand, each replica has different initial structures. *Default value*: There is no default. You must define it.
The following tag labels are system-dependent. We give the examples for
clusters and surfaces. First of all, the tag labels are illustrated for
clusters (`SystemType = 0`):
* `Radius = <real>`: The radius of the sphere where cluster are confined
during REGC simulation in angstrom unit. The sphere should be large
enough so that the adsorption of gas is able to reach its condensed
line. *Default value*: There is no default. You must define it.
* `@Atomposition` (array)
```
@Atomposition
<real11> <real12> <real13>
<real21> <real22> <real23>
...
@End\
```
If the same initial structure are shared with all replicas, please set
this tag
```
@Replica0atomposition
<real11> <real12> <real13>
<real21> <real22> <real23>
...
@End
@Replica1atomposition
<real11> <real12> <real13>
<real21> <real22> <real23>
...
@End
@Replica2atomposition
<real11> <real12> <real13>
<real21> <real22> <real23>
...
\@End
...
```
## Running REGC jobs in parallel
Note that one of merits of Replica Exchange method is its ***embarrassingly parallel*** nature,
which indicating that the total energy/molecular dynamics calculations of
replicas can perform simultaneously. The following `subjobs` file executes 4 replicas in parallel assuming you have 4 MPI tasks available on your machine:
The `subjobs` file uses the MPI executable `mpirun`, but this depends on the actual computer setup (it could be `srun`, `orterun`, etc.). Also the name of the FHI-aims executable can be different on your machine. **Please adapt it to your setup.**
Now, we are ready to start the calculation:
!!! todo "Excercise 1"
Run a REGCMD simulation for $\mathrm{Si}_2$ systems. Temperatures are 300 and 500 K. The chemical potentials ($\Delta \mu_{\mathrm{H}}$) are -0.3 and -0.2 eV. The input files to fulfill this task are prepared for you in the folder `Tutorials/P2-REGC/REGC/exercise_1/Si2_run`. You only need to do the following two steps
1. Adapt the MPI executable and FHI-aims executable name in the `subjob` file according to your system setup.
2. Execute `python core.py > log.core`
## The Output files of the REGC molecular dynamics
During a REGC simulation, we define two kinds of replicas: one is called
*physical* replica which is an actual collection of atoms
(configuration) throughout the course of the simulation; the other is
*logical* replica referred to a specific thermodynamic state $T_i$ and
$\mu_i$ The major output files and folders are listed in the running
directory:
* **`restart.dat`**: The file contains all the information for restart a
REGC calculation including: total energy, particle position
* **`log_rex`**: The file contains all the information regarding to the
swapping process.
* **`*_*_.out`**: The file collects the MD simulation of each *logical*
replica (specific thermodynamic state) after each **REGC step**.
* **`rex_*_*`**: The file collects the MD simulation of each *logical*
replica (specific thermodynamic state) after each **replica swapping**.
* **`replica_*/`**: The folder contains the *trajctory* files.
* **trajectory**: The file collects the MD calculation of the *physical*
replica (specific configuration) after each REGC step.
Here is an example of the output files of a specific REGC-MD simulation for
$\mathrm{Si}_2$ cluster in contact with $\mathrm{H}_2$. The so-called
*physical* replicas are under the `replica_*/` folders:
trajectory
python check_convergence.py## Analysis of Results **Explosive** amount of data are generated in a REGC simulation. Here we only explain in detail analysis REGC data in two aspects: Convergence and a *posteriori* analysis via MBAR. ### Convergence One simple criteria to meet for a convergence is that the histogram (distribution) of a variable of interest at each thermodynamical state become stable or in the other term does not change with the further more simulation time. This variable can be any properties you are interested in. For the applications in Ref. [1], the distribution of adsorption energy or the number of adsorbed particle for the *logical* replicas are chosen. There is a python script named `check_convergence.py` taking care of the convergence check. Just type the following command in your running directory:
N__ E__You will get a collection of files for each thermodynamic state:
data/ reweight_regcmc.pyThen the idea of ***Block Average*** [3] can be used to check whether the simulation is long enough. ![adsorbedN](Figure1.png) ![adsorbedE](Figure2.png) ### Post-processing via MBAR There is an example that MBAR is applied to analyze the data of T-only replica exchange simulation in the pymbar code (<https://github.com/choderalab/pymbar>). The example is given in our package to illustrate how to use MBAR to analyze our data in the directory `Tutorials/P2-REGC/REGC/exercise_2/N_H`:
2D PMFThe `reweight_regcmc.py` file is python script to execute this data analysis and can be used to estimate a 2D potential of mean force from REGC simulation data at target temperature and chemical potential. All the required input data are under the `data/` directory. !!! todo "Excercise 2" Run a simple point post-processing for the $\mathrm{Si}_2$ data of ref.[1] at (360K, -0.1 eV) and identify the number of adsorbed hydrogen of most probably phase. Do the following: 1. Go to REGC/excercise_2/N_H 2. Execute `python reweight_regcmc.py` 3. Inspect the resulting file `f_k.out` The relative free energy surface ('g' column in `f_k.out`) and the number of chemisorbed D atoms ('phi' column in `f_k.out`) is shown as below:
bin phi N g dg
0 0.500000 21 4.658 0.242
1 1.500000 133 52.301 0.424
2 2.500000 1647 45.040 0.120
3 3.500000 76 43.130 1.002
4 4.500000 803 32.017 0.258
5 5.500000 99 21.015 1.002
6 6.500000 3304 0.000 0.000
7 7.500000 7 37.755 0.959
```
Here, the free energy is the relative free energy with respect to the lowest free energy. For the most probable phase, the relative free energy is 0.0 ("g = 0.0"). The corresponding "phi = 6.5" is the bin center of the \(N_\mathrm{H}\) bin [6, 7.0), so the most probable \(N_\mathrm{H}\) = 6 at (360K, -0.1 eV). The info of the most probable phase is also written in 'PTD' file. For more information about MBAR, you may go to this website https://pymbar.readthedocs.io/en/master/#.
In order to construct phase diagram, the post-processing of REGC data need to be performed at every 1 K and every 0.01 eV for the whole T-P region of relevant interest.
After the post-processing, 3D data including \(T\), \(\mu\) and \(N_\mathrm{H}\) are obtained as shown in "3D_MuN" file. With the relation between \(T\), \(p\) and \(\mu\), "3D_phasediagram" is ready for a \(T\)-\(p\) phase diagram as shown below.
Reference
1 Y. Zhou, M. Scheffler, and L. M. Ghiringhelli, "Determining surface phase diagrams including anharmonic effects," Phys. Rev. B 100, 174106 (2019) https://link.aps.org/pdf/10.1103/PhysRevB.100.174106
2 Y. Zhou. Surface phase diagrams including anharmonic effects via a replica-exchange grand-canonical method. Doctoral thesis, Technische Universität Berlin, Berlin, 2020. https://depositonce.tu-berlin.de/handle/11303/11723
3 Y. Zhou, C. Zhu, M. Scheffler, and L. M. Ghiringhelli, "Ab initio approach for thermodynamic surface phases with full considerabtion of anharmonic effect -- the example of hydrogen at Si(100)", Phys. Rev. Lett 128, 246101 https://journals.aps.org/prl/pdf/10.1103/PhysRevLett.128.246101