Part 2: Replica-Exchange Grand-Canonical (REGC) scheme
This Replica-Exchange Grand-Canonical (REGC) code mainly focuses on calculating the phase diagrams of materials including cluster, nanoparticle, and surface in a reactive gas phase. The basic physical theories and techniques in REGC method are described in:
1 Y. Zhou, M. Scheffler, and L. M. Ghiringhelli, "Determining surface phase diagrams including anharmonic effects," Phys. Rev. B 100, 174106 (2019) https://link.aps.org/pdf/10.1103/PhysRevB.100.174106
2 Y. Zhou. Surface phase diagrams including anharmonic effects via a replica-exchange grand-canonical method. Doctoral thesis, Technische Universität Berlin, Berlin, 2020. https://depositonce.tu-berlin.de/handle/11303/11723
Here we will run a very small example of a \(\mathrm{Si}_2\) cluster in a \(\mathrm{H}_2\) gas phase to demonstrate the workflow of the REGC method.
Getting started with FHI-panda
FHI-panda is written in python2.7
. The FHI-panda package is distributed in source code form, the following prerequisites are needed:
- compile FHI-aims always with PLUMED external package for ab initio MD
- python 2.7
- numpy 1.17.0
- mpi4py if needed for some cases
- pymbar https://pymbar.readthedocs.io/en/master/ for post-processing
To compile FHI-aims with PLUMED, add the following setting in your current CMake file and recompile
set(USE_PLUMED ON CACHE BOOL “”)
If you have used the point release 210716_2
the name of the executable compiled with PLUMED should be the following:
aims.210716_2.meta.scalapack.mpi.x
All FHI-panda scripts are shipped with the tutorial material, which you can download or clone from the following site:
https://gitlab.com/FHI-aims-club/tutorials/introduction-of-ab-initio-thermodynamics-and-regc/
Change into the directory:
introduction-of-ab-initio-thermodynamics-and-regc/Tutorials/P2-REGC/REGC
It includes exercises_1
for perform a REGC-MD run and exercise_2
for the post-processing of REGC data.
The Input files for REGC molecular dynamics
We will perform a REGC molecular dynamics calculation for a \(\mathrm{Si}_2\) cluster in contact with \(H_2\) gas. Here, FHI-aims code was used for total energy and molecular dynamics calculations. Please go to the following directory:
introduction-of-ab-initio-thermodynamics-and-regc/Tutorials/P2-REGC/REGC/exercise_1/Si2_run
The working directory should contain the following necessary files:
- REGC.dat: The REGC main input file
- *.py: All the python source files of FHI-panda
- run.sh: The shell script for submit the REGC job
- control.in.E: the FHI-aims input file. Note that the geometry info is already included in REGC.dat.
- subjob: a bash script that defines how to run the replica in parallel.
Preparing the REGC.dat
input file
The main input file REGC.dat
contains all the necessary
parameters for the structure prediction. For the \(\mathrm{Si}_2\) cluster in a \(\mathrm{H}_2\)
gas phase it should look like this:
################################ The Basic Parameters of REGCMD ################################
SystemName = Si2Hx
NumberOfSpecies = 2
NameOfAtoms = Si D
radius = 4
Center = 0.0 0.0 0.0
Numberofatom = 2
Numberofmetalatom = 2
@DistanceOfIon
2.14 1.4
1.4 0.75
@End
Current_step = 0
Num_T = 2
Num_u = 2
Temperature = 300 400
Mu = -0.3 -0.2
MDt = 0.02
@Atomposition
0.94787 3.59052 0.22819
0.17231 3.14166 2.76918
@End
N_total = 20000
mass = 28.086 2.014101
PickUp = F
The file consists of input tags that can be given in any order, or be omitted while the default values are used. Here we offer a quick overview of REGC.dat
syntax:
- The general syntax is
"tag labels" = "value1 value2 value3 ..."
. For matrix inputs, it started with\@tag
and ended with\@End
. Values that are not specified in the REGC.dat file are assigned a default value. Input values should be separated by space. - The lines beginning with # is taken as explanation of the following
tag label
. - Logical values can be given as T or True, F or False.
- The labels are not case sensitive.
In the following the input settings are described in detail. You can skip this subsection and continue with the exercise by continuing with the next section.
Input settings
-
SystemName = <string>
: A string gives a name for the description system. Default value: REGCMD -
NumberOfSpecies = <integer>
: Number of different chemical species in the system. For example, this number is "2" for Si\(_2\) in contact with \(\mathrm{H}_2\) gas. Default value: There is no default. You must provide the number. -
NameOfAtoms = <string1> <string2> <string3> ...
: Elemental symbols of each chemical species separated by space. Taking \(\mathrm{Si}_2\) in contact with \(\mathrm{H}_2\) gas as examples, one will writeNameOfAtoms = Si H
. Default value: There is no default. You must define it. -
Mass = <real1> <real2> <real3> ...
: The atomic mass of each chemical species separated by space. Taking \(\mathrm{Si}_2\) in contact with H\(_2\) gas as examples, one will writeMass = 28.0855 1.0079
. Default value: There is no default. You must define it. -
NumberOfmetalAtoms = <integer>
: The total number of metal particles in the initial structure. Here the termmetal particles
refers to the particles of bare cluster/surface. Default value: There is no default. You must define it. -
@DistanceOfIon
(array)@DistanceOfIon <real11> <real12> <real13> ... <real21> <real22> <real23> ... ... @End
Minimal interatomic distances (in unit of angstrom) in a format of \(n×n\) matrix. The rank \(n\) of the matrix is determined by the
NumberOfSpecies
. For \(\mathrm{Si}_4\mathrm{H}_x\)NumberOfSpecies=2
, the \(2×2\) matrix is defined as:where \(d_{11}\), \(d_{12}\) are the minima distances between Si-Si and Si-H, respectively. Thereby, are the minima distance between H-Si and H-H, respectively. Default value: There is no default. You must define it.@DistanceOfIon d_11 d_12 d_11 d_12 @End
-
SystemType = <integer>
: Our method targets at determining phase diagrams of materials, including clusters(<integer>
=0), nanoparticles(<integer>
=1), surfaces(<integer>
=2) and bulk (<integer>
=3). Default value: There is no default. You must define it. -
Num_T = <integer>
: the number of different temperatures selected in the simulation. Default value: There is no default. You must define it. -
Num_U = <integer>
: the number of different chemical potentials selected in the simulation. Default value: There is no default. You must define it.
The total number of replicas
In total, the number of replicas is Num_T
×Num_U
!
-
Current_step = <integer>
: the number of REGC steps in the simulation. If you just begin from scratch, set it to 0, otherwise, find the number in the head line of you restart.dat files. Default value: There is no default. You must define it. -
PickUp = <logical>
: IfFalse
, the calculation is from scratch. IfTrue
, it will start from a previous simulation, together set withCurrent_step
. Default value: There is no default. You must define it. -
MDt = <real>
: in ps unit. The simulation time for the single molecular dynamics. Default value: There is no default. You must define it. -
N_total=<integer>
: The total number of REGC steps for the simulation. Default value: There is no default. You must define it. -
NumberOfAtoms = <integer>
or<integer1>, <integer2, integer3, ...>
: The total number of particles in the initial structure. If all the replicas share the same initial structure, then only one integer should be given, on the other hand, each replica has different initial structures. Default value: There is no default. You must define it.
The following tag labels are system-dependent. We give the examples for
clusters and surfaces. First of all, the tag labels are illustrated for
clusters (SystemType = 0
):
-
Radius = <real>
: The radius of the sphere where cluster are confined during REGC simulation in angstrom unit. The sphere should be large enough so that the adsorption of gas is able to reach its condensed line. Default value: There is no default. You must define it. -
@Atomposition
(array)@Atomposition <real11> <real12> <real13> <real21> <real22> <real23> ... @End\
If the same initial structure are shared with all replicas, please set this tag
@Replica0atomposition <real11> <real12> <real13> <real21> <real22> <real23> ... @End @Replica1atomposition <real11> <real12> <real13> <real21> <real22> <real23> ... @End @Replica2atomposition <real11> <real12> <real13> <real21> <real22> <real23> ... \@End ...
Running REGC jobs in parallel
Note that one of merits of Replica Exchange method is its embarrassingly parallel nature,
which indicating that the total energy/molecular dynamics calculations of
replicas can perform simultaneously. The following subjobs
file executes 4 replicas in parallel assuming you have 4 MPI tasks available on your machine:
cd replica_00
mpirun -n 4 aims.x >> aims.out&
cd ..
cd replica_01
mpirun -n 4 aims.x >> aims.out&
cd ..
cd replica_02
mpirun -n 4 aims.x >> aims.out&
cd ..
cd replica_03
mpirun -n 4 aims.x >> aims.out&
cd ..
wait
The subjobs
file uses the MPI executable mpirun
, but this depends on the actual computer setup (it could be srun
, orterun
, etc.). Also the name of the FHI-aims executable can be different on your machine. Please adapt it to your setup.
Now, we are ready to start the calculation:
python2.7 core.py >log.core
Excercise 1
Run a REGCMD simulation for \(\mathrm{Si}_2\) systems. Temperatures are 300 and 500 K. The chemical potentials (\(\Delta \mu_{\mathrm{H}}\)) are -0.3 and -0.2 eV. The input files to fulfill this task are prepared for you in the folder Tutorials/P2-REGC/REGC/exercise_1/Si2_run
. You only need to do the following two steps
- Adapt the MPI executable and FHI-aims executable name in the
subjob
file according to your system setup. - Execute
python2.7 core.py > log.core
The Output files of the REGC molecular dynamics
During a REGC simulation, we define two kinds of replicas: one is called physical replica which is an actual collection of atoms (configuration) throughout the course of the simulation; the other is logical replica referred to a specific thermodynamic state \(T_i\) and \(\mu_i\) The major output files and folders are listed in the running directory:
-
restart.dat
: The file contains all the information for restart a REGC calculation including: total energy, particle position -
log_rex
: The file contains all the information regarding to the swapping process. -
*_*_.out
: The file collects the MD simulation of each logical replica (specific thermodynamic state) after each REGC step. -
rex_*_*
: The file collects the MD simulation of each logical replica (specific thermodynamic state) after each replica swapping. -
replica_*/
: The folder contains the trajctory files. -
trajectory: The file collects the MD calculation of the physical replica (specific configuration) after each REGC step.
300.0_-0.2.out 400.0_-0.3.out replica_02 rex_300.0_-0.3
300.0_-0.3.out replica_00 replica_03 rex_400.0_-0.2
400.0_-0.2.out replica_01 rex_300.0_-0.2 rex_400.0_-0.3
replica_*/
folders:
ls replica_00
>>> trajectory
Analysis of Results
Explosive amount of data are generated in a REGC simulation. Here we only explain in detail analysis REGC data in two aspects: Convergence and a posteriori analysis via MBAR.
Convergence
One simple criteria to meet for a convergence is that the histogram
(distribution) of a variable of interest at each thermodynamical state
become stable or in the other term does not change with the further more
simulation time. This variable can be any properties you are interested
in. For the applications in Ref. [1], the
distribution of adsorption energy or the number of adsorbed particle for
the logical replicas are chosen. There is a python script named
check_convergence.py
taking care of the convergence check. Just type
the following command in your running directory:
python check_convergence.py
You will get a collection of files for each thermodynamic state:
N_*_* E_*_*
Then the idea of Block Average [3] can be
used to check whether the simulation is long enough.
Post-processing via MBAR
There is an example that MBAR is applied to analyze the data of T-only
replica exchange simulation in the pymbar code
(https://github.com/choderalab/pymbar). The example is given in our
package to illustrate how to use MBAR to analyze our data in the
directory Tutorials/P2-REGC/REGC/exercise_2/N_H
:
data/
reweight_regcmc.py
The reweight_regcmc.py
file is python script to execute this data
analysis and can be used to estimate a 2D potential of mean force from
REGC simulation data at target temperature and chemical potential. All
the required input data are under the data/
directory.
Excercise 2
Run a simple point post-processing for the \(\mathrm{Si}_2\) data of ref.[1] at (360K, -0.1 eV) and identify the number of adsorbed hydrogen of most probably phase. Do the following:
- Go to REGC/excercise_2/N_H
- Execute
python reweight_regcmc.py
- Inspect the resulting file
f_k.out
The relative free energy surface ('g' column in f_k.out
) and the number of chemisorbed D atoms ('phi' column in f_k.out
) is shown as below:
2D PMF
bin phi N g dg
0 0.500000 21 4.658 0.242
1 1.500000 133 52.301 0.424
2 2.500000 1647 45.040 0.120
3 3.500000 76 43.130 1.002
4 4.500000 803 32.017 0.258
5 5.500000 99 21.015 1.002
6 6.500000 3304 0.000 0.000
7 7.500000 7 37.755 0.959
Here, the free energy is the relative free energy with respect to the lowest free energy. For the most probable phase, the relative free energy is 0.0 ("g = 0.0"). The corresponding "phi = 6.5" is the bin center of the \(N_\mathrm{H}\) bin [6, 7.0), so the most probable \(N_\mathrm{H}\) = 6 at (360K, -0.1 eV). The info of the most probable phase is also written in 'PTD' file. For more information about MBAR, you may go to this website https://pymbar.readthedocs.io/en/master/#.
In order to construct phase diagram, the post-processing of REGC data need to be performed at every 1 K and every 0.01 eV for the whole T-P region of relevant interest.
After the post-processing, 3D data including \(T\), \(\mu\) and \(N_\mathrm{H}\) are obtained as shown in "3D_MuN" file. With the relation between \(T\), \(p\) and \(\mu\), "3D_phasediagram" is ready for a \(T\)-\(p\) phase diagram as shown below.
Reference
1 Y. Zhou, M. Scheffler, and L. M. Ghiringhelli, "Determining surface phase diagrams including anharmonic effects," Phys. Rev. B 100, 174106 (2019) https://link.aps.org/pdf/10.1103/PhysRevB.100.174106
2 Y. Zhou. Surface phase diagrams including anharmonic effects via a replica-exchange grand-canonical method. Doctoral thesis, Technische Universität Berlin, Berlin, 2020. https://depositonce.tu-berlin.de/handle/11303/11723
3 D. Frenkel and B. Smit. Understanding molecular simulation: from algorithms to applications, volume 1. Elsevier, 2001.