Skip to content

The canonical ensemble

Exercise 4: Testing thermostats

Time estimation: \(\sim\)1.5h

Most "real-life" experiments cannot be done in a situation where the energy is explicitly kept constant. Instead, other quantities like the average temperature or pressure can be easily controlled. As you may know, an ensemble where the temperature is kept constant is called a canonical ensemble.

We will here focus on stochastic schemes to simulate thermostats in molecular dynamics. We will include a brief description of them below, as well as the idea behind other types of thermostats that will not be studied in this exercise, but that are nevertheless quite popular.

Theory for thermostats

  1. Langevin thermostat1 The Langevin thermostat is expressed through the following differential equation for the momenta (here in one dimension, without loss of generality): $$ \dot{p}(t) = -\frac{\partial V\left(R\right)}{\partial R} -\gamma p(t) + \sqrt{2m\gamma T}\xi(t)$$ where \(V\) is the potential energy, \(\gamma\) is the friction parameter and \(\xi(t)\) is a stochastic variable distributed like a Gaussian white noise (just as in the SVR thermostat). The friction term works like a drag that cools the system, while the random-noise term has a counter effect of heating the system. Since a system in the canonical ensemble must obey detailed-balance there is a relationship between the friction and the white noise that must be obeyed, often also called the fluctuation-dissipation relation which is given by \(\langle\xi(t) \xi(0) \rangle= 2 k_B T \gamma \delta(t)\). The Langevin equation describes, for example, Brownian motion -- i.e. a Markovian (memoryless) stochastic differential equation. A Langevin thermostat disturbs dynamics considerably and is very sensitive on the value of \(\gamma\), being hard to choose an optimal value for systems with degrees of freedom spanning a wide frequency range.

  2. Stochastic velocity-rescaling thermostat (SVR)2. In this algorithm, which takes on ideas of much older stochastic thermostats3,4 and velocity rescaling schemes5, a deviation of the instantaneous kinetic energy is corrected in the following way:

    \[dK = \left[\overline{K} - K(t)\right] \frac{dt}{\tau} + 2 \sqrt{\frac{K(t)\overline{K}}{N_f\tau}}\xi(t)\]

    where \(\overline{K}\) is the target kinetic energy, \(K(t) = p^2(t)/2m\) is the instantaneous kinetic energy, \(\tau\) is the relaxation time of the thermostat, \(N_f\) is the number of degrees of freedom, and \(\xi\) is a white noise term (the derivative of a Wiener process) that obeys \(\left\langle \xi(t) \xi(t') \right\rangle = \delta(t-t')\).

    Wiener process

    An example of a Wiener process \(W(t)\) is the Brownian motion.
    \(W(t)\) has the following characteristics:

    • \(\xi(0) = 0\);
    • \(W(t)\) is continuous;
    • the increments are independent;
    • \(W(t_2)-W(t_1)\) is a Gaussian with average 0 and \(\sigma=t_2-t_1\).

    The svr thermostat yields the correct distribution of \(K\), does not perturb too much the dynamics, and its accuracy and efficiency is rather independent of \(\tau\). Still, for (very) small non-ergodic isolated molecules it may have some issues.

  3. Colored noise thermostats 6

    Colored noise thermostats are an extension of Langevin thermostats; indeed they are also called Generalized Langevin Equation (GLE) thermostats. They are constructed via introducing auxiliary degrees of freedom \({\textbf{s}}\) to the dynamics. These extra degrees of freedom model a Markovian process in higher dimensions, but give rise to non-Markovian dynamics when the fictitious degrees of freedom are integrated out. Since these thermostats can directly act in different frequency ranges, they can be more efficient at thermalization of certain systems.

    Theory behind colored noise thermostats

    The equations of motion are:

    \[\begin{aligned} \dot{R} & = p/m \\ \begin{pmatrix} \dot{p} \\ { \dot{s}}\end{pmatrix} & = \begin{pmatrix} -V^{\prime}(R) \\ 0 \end{pmatrix} - %\begin{pmatrix} a_{pp} {\bf a}_p^T \\ {\bf a}_p {\bf A} \end{pmatrix} {\textbf{ A}}_p \begin{pmatrix} p \\ {\textbf{ s}}\end{pmatrix} + %\begin{pmatrix} b_{pp} {\bf b}_p^T \\ {\bf b}_p {\bf B} \end{pmatrix} {\textbf{ B}}_p \begin{pmatrix} {\boldsymbol \xi} \end{pmatrix}, \label{eq:gle-eom} \end{aligned}\]

    where \({\boldsymbol \xi}\) is an array of uncorrelated Gaussian (white) noises, \(V^{\prime}(R)\) is the gradient of the potential and the \({\textbf{A}}_p\) and \({\textbf{B} }_p\) are matrices that obey the relation

    \[{\textbf{A}_p} {\textbf{C}_p} + {\textbf{C}_p }{\textbf{A}_p^T} = {\textbf{B}_p} {\textbf{B}_p^T},\]

    where \({\textbf{C}}_p\) is the covariance matrix defined as \(\textbf{C}_p = \langle (p, \textbf{s})^T (p, \textbf{ s})\rangle\). By integrating out the \(\textbf{ s}\) degrees of freedom, one gets dynamics of a non-Markovian process in the physical variables, with the EOM given by

    \[\begin{aligned} \dot{R} &= p/m \\ \dot{p} &= -\frac{\partial V}{\partial R} - \int_{-\infty}^t Q(t-\tau)p(\tau) + \zeta(t), \end{aligned}\]

    where \(\zeta(t)\) is a correlated noise and \(Q(t-\tau)\) is a time (or frequency) dependent memory kernel which depends on the drift matrix \({\textbf{A}}_p\). The fluctuation-dissipation theorem (and canonical sampling) is obeyed if \(\langle \zeta(t) \zeta(0) \rangle = k_B T Q(t)\), or equivalently that \({\textbf{C}}_p = k_B T \mathbb{1}\), where \(\mathbb{1}\) is the identity matrix. Since these thermostats can directly act in different frequency ranges, they can be more efficient at thermalization of certain systems. One has to choose the parameters of the \({\textbf{A}}_p\) matrix, which is fitted in the original papers 6 by optimizing functions that ensure, for example, a minimal correlation time for the potential energy on a wide frequency range.

    Colored-Noise Thermostats à la Carte Michele Ceriotti, Giovanni Bussi, and Michele Parrinello Journal of Chemical Theory and Computation 2010 6 (4), 1170-1180

  4. Extended Lagrangian approach: the Nosé-Hoover thermostat 3,7.

    Nosé-Hoover thermostat

    Even though we will not use this kind of thermostats in the exercise, we give here a small explanation about this class of thermostats. Equations of motion derived from the Lagrangian of the system conserve the total energy of the system. One can write an extended Lagrangian, by adding fictitious degrees of freedom, such that the overall total energy is conserved but the atomic subsystem can span ensembles other than microcanonical. With the Nosé-Hoover Lagrangian, the atomic subsystem samples the canonical ensemble. The equations of motion of the Nose-Hoover thermostat are:

    \[\begin{aligned} \mathbf{\dot{R}}_i &= \mathbf{p}_i/m_i\\ \mathbf{\dot{p}}_i &= -\frac{\partial V\left( \mathbf{R}^N \right)}{\partial \mathbf{R}_i} - \frac{\Pi \mathbf{p}_i}{Q}\\ \dot{\eta} &= \frac{\Pi}{Q} \\ \dot{\Pi} &= \left( \sum_i \frac{\mathbf{p}_i^2}{m_i} - \frac{g}{\beta} \right)\end{aligned}\]

    where \(g\) is the number of degrees of freedom of the system, \(V\) is the potential energy, \(Q\) the "thermostat mass", and \(\mathbf{p}_i\) and \(m_i\) the momenta and masses of the \(i\)th particle of the system, respectively. The conjugated momentum \(\Pi\) of the extra coordinate \(\eta\) acts as a fluctuating drag parameter to the atomic subsystem. The conserved energy associated to the equations of motion is:

    \[\mathscr{E} = \sum_i \frac{\mathbf{p}_i^2}{2m_i} + \mathscr{U} \left( \mathbf{r}^N \right) + \frac{1}{2} \frac {\Pi ^2}{Q} + g \frac{\eta}{\beta}\]

    In practice, most implementations of the Nosé-Hoover thermostat actually employ Nosé-Hoover chains, where several thermostats are coupled to each other giving rise to coupled equations of motion with different masses \(Q_j\). This scheme considerably ameliorates ergodicity issues present when a single Nosé thermostat is applied to the system in question 8.

Instructions part 1: Running the simualtions

  • You will find some templates for the input files in the Exercise 4 folder. For this exercise we will simulate H5O2+.

    You should use a time step of 0.5 fs, which corresponds to the following line in the i-pi input.xml file:

    <timestep units="femtosecond"> 0.5 </timestep>

    and we will be simulating all systems at 300 K. Inside the <ensemble> </ensemble> block, we put:

    <temperature units="kelvin"> 300 </temperature>

    We also initialize the initial velocities to match the Maxwell distribution for given temperature. Inside the <initialize> </initialize> block:

    <velocities mode="thermal" units="kelvin"> 300 </velocities>

    Since we are simulating a canonical ensemble, the dynamics mode will be set to NVT:

    <dynamics mode="nvt">
  • The file provided in this folder is tailored to run fast (and inaccurate) simulations, so that this exercise can be done within the time frame proposed here.


    Please do not use this type of file for real production calculations.

  • Each group will run only one of the different thermostats discussed above, and the division will be made clear by the tutors at hand.

    Two of these thermostats, namely the Stochastic Velocity Rescaling (SVR) and the Langevin ones, have parameters that you can play with. In order to obtain reasonable results, one should provide an educated guess for the value of these thermostats' parameters. In order to realize to which extent such user-given parameters can influence a simulation, you will be asked to try different values for these parameters, as explained below.

    1. SVR

      <thermostat mode="svr"> 
          <tau units="femtosecond"> xxx </tau>

      Note that the influence of \(\tau\) is similar to the mass parameter \(Q\) for the Nosé-Hoover thermostat.

      Here tau (\(\tau\)) is a relaxation time of the thermostat, and is given in femtoseconds. Its value has to be chosen by the user (i.e., you !). The performance of the thermostat depends on the value of \(\tau\). In order to gauge the influence of this parameter, we ask you to test two different values for \(\tau\): one value which should yield a correct behavior, for example \(\tau=20\), and one more extreme value, for example \(\tau=500\) or \(\tau=0.002\).


      For the latter case of \(\tau=0.002\) it's not necessary to run a simulation for too long, 0.5 ps will be more than enough. You can run it the last and reduce its length if you run out of time.

    2. Langevin

      <thermostat mode="langevin">
          <tau units="femtosecond"> xxx </tau>

      Just like for the SVR thermostat, you have to choose a value for \(\tau\).

    3. GLE

      The Generalized Langevin Equation provides a very flexible framework to manipulate the dynamics of a classical system, improving sampling efficiency.

      This thermostat takes a matrix as an input parameter. It can be generated at Choose the parameters in the website as it is shown in the figure below, and copy them to the appropriate section of your input.xml file.

      GLE parameters

  • While waiting for the simulations to complete, you can calculate what would be the temperature corresponding to a zero point energy (\(\hbar\omega/2\)) for a frequency \(\omega_1\) = 3000 cm\(^{-1}\) and for \(\omega_2\) = 100 cm\(^{-1}\) considering a system with only 1 degree of freedom. Also show which frequency \(\omega\) corresponds to a temperature of 300 K.

Instructions part 2: Analysing the results

  • Use the script to extract different properties, like the temperature, the kinetic energy and the potential energy:

    python <property> ex4.out

    where is a placeholder for the property you want to extract. You can choose from temperature, kinetic_md, potential, conserved, kinetic_md(H) and kinetic_md(O). This will output a single-column file containing the desired property without any change of units. Plot each of these quantities. Is it what you were expecting?

  • Let us analyze a bit further the output. Use the script in order to obtain the the cumulative average, that is, the average temperature over time computed at each step of the simulation. You can do this by running the following script:

    python temperature.dat xxx

    Here we assumed that 'temperature.dat' is your single-column file containing the temperature of the system and xxx means that the first xxx data points are discarded.

    Please try different xxx values and see how the cumulative average change. Do you understand why? Can we assert that the system is correctly thermalized?

    How do the different thermostats you used perform? Are any of those working better or faster?

    Temperature of H and O species

    Here we show the kinetic energy per atom for each species in Langevin MD with \(\tau\) ranging from 0.02 to 200 fs. When the system is thermalized, each atom has \(K=3/2 k_BT\) and both species have the same kinetic energy/ One can see that, as intensity of the thermostat increases, the thermalization rate improves, but it saturates when thermostat becomes too intense.

    You can do the same exercise for the svr thermostat. You should see that it is less sensitive to the parameter \(\tau\). This is one reason why normally svr (and GLE) are preferred over the Langevin thermostat (but there are other reasons also, discussed in the original publications).

    plot_lgv200 plot_lgv20 plot_lgv2 plot_lgv0.2 plot_lgv0.02

  • Please repeat the previous item for the potential energy (i.e first extract the data from the ipi-output and then compute the cumulative average). Do you see a different behaviour? Can we assert that the system is correctly thermalized? From which point?

  • In the folder long_traj_outputs we provide i-PI output files with longer trajectories with different \(\tau\) and thermostats.

    Please repeat the previous procedure for the same thermostats and \(\tau\) values that you have simulated but using these extended outputs and try to answer again the questions. Has any of them changed?

  • The thermostat is not only in our simulation to thermalize the system, but also to help us to sample the phase space. We want to sample the phase space in a way that is consistent with the Boltzmann distribution but at the same time "as fast as possible". One way to measure this efficiency is to see how fast an observable decorrelates in time. The faster it does, the more efficient is our sampling scheme. The normalized correlation function for an observable \(A\) is given by

    \[\label{eq:CorrFunc} C(t) = \frac{\langle A(t)A(0)\rangle - \langle A \rangle^2}{\langle{A}^2\rangle - \langle{A}\rangle^2} \]

    where \(\langle \rangle\) stands for the average value. \(C(t)\) starts from 1 at \(t=0\) and goes to 0 when \(t \to \infty\). It tells you to which extent the observable \(A\) at time \(t'\) and at time \(t'+ t\) are correlated. There are several ways to define the autocorrelation time from \(C(t)\) but here we use the simplest one which is looking when \(C(t)\) is below a given threshold.

  • You can compute the correlation function and correlation time by typing:

    python temperature.dat xxx

    Here we assumed again that 'temperature.dat' is your single-column file containing the temperature of the system and 'xxx' means that the first xxx data points are discarded. The script will print in the screen the correlation time and will produce a file with the correlation function.

    Plot the autocorrelation function against time for the thermostats that you have been working with. What are their correlation time? (How) does it change with respect to thermostat \(\tau\) parameter?

  • Repeat the same for the potential energy, does it behave similarly or not?

  • The statistical uncertainty of an observable can be estimated as:

    \[ \epsilon_A \approx \frac{\text{std}_A}{\sqrt{N_\text{ind}}} \]

    where std\({_A}\) is the standard deviation \(\left(\text{std}_A = \sqrt{\sum_i^N (A_i- \langle A\rangle)^2/N-1}\right)\) of our measurements of the observable \(A\) and \(N_\text{ind}\) is the number of independent measurements. One can be tempted to use this formula considering that \(N_\text{ind}\) is given by the total amount of steps in our simulation. However, consecutive steps in a MD simulation are really correlated. Therefore, a proper estimation of \(\epsilon_A\) is given by:

    $$\label{eq:ERROR} \epsilon \approx \frac{\text{std}}{\sqrt{N/\tau_\text{corr,A}}}, $$ where \(\tau_\text{corr,A}\) is the correlation time of \(A\) expressed in steps and \(N\) is the total number of simulation steps. From the equation above it is clear why we want to minimize \(\tau_\text{corr,A}\). Which are the other (two) ways to reduce the uncertainty?

  • Use the script in order to obtain the average and standard deviations and then compute the average temperature and potential energy expressed as

    `mean_value` $\pm$ `uncertainty`

    The syntax, as usual, is:

    `python temperature.dat xxx `
  • Finally, in the solutions you will find a file called statistics.dat which contains averages, standard deviations and correlation times for all the thermostats. Look how the correlation times change for a given thermostat along different \(\tau\) values and along the three thermostats families. Does the correlation time of the temperature behave similar to the one of the potential energy? Why? Which, in your opinion, is the best thermostat for this system and why? And for other systems?


    For very small \(\tau\) values the molecular diffusion is suppressed. You can observe that if you look at the movies provided in the folder long_traj_outputs/movies.