04 The Closed-Shell MP2 energy

Unlike the Hartree-Fock energy, correlation energies like the MP2 (second-order Moller-Plesset perturbation) energy are usually expressed in terms of MO-basis quantities (integrals, MO energies). The most expensive part of the calculation is the transformation of the two-electron integrals from the AO to the MO basis representation. The purpose of this project is to understand this transformation and fundamental techniques for its efficient implementation. The theoretical background and a concise set of instructions for this project may be found here.

# Following os.chdir code is only for thebe (live code), since only in thebe default directory is /home/jovyan
import os
if os.getcwd().split("/")[-1] != "Project_04":
    os.chdir("source/Project_04")
from solution_04 import Molecule as SolMol

from pyscf import gto
import numpy as np
import scipy.linalg
from scipy.linalg import fractional_matrix_power
from typing import Tuple
np.set_printoptions(precision=7, linewidth=120, suppress=True)
# Solution mol only uses PySCF approach
sol_mole = SolMol()
sol_mole.construct_from_dat_file("input/h2o/STO-3G/geom.dat")
sol_mole.obtain_mol_instance(basis="STO-3G")
sol_mole.obtain_nao()
sol_mole.obtain_nocc()

Molecule Object Initialization

In this project, we may use the updated Molecule initialization. Since these are technical details, we toggle the code and explanation below. Most of the method functions are illustrated in Project 01 and Project 03.

class Molecule:

    def __init__(self):
        # Project 03 existed
        self.atom_charges = NotImplemented  # type: np.ndarray
        self.atom_coords = NotImplemented  # type: np.ndarray
        self.natm = NotImplemented  # type: int
        self.mol = NotImplemented  # type: gto.Mole
        self.nao = NotImplemented  # type: int
        self.charge = 0  # type: int
        self.nocc = NotImplemented  # type: int
        # Project 04 added
        self.mo_coeff = NotImplemented  # type: np.ndarray
        self.mo_energy = NotImplemented  # type: np.ndarray
        self.eri_ao = NotImplemented  # type: np.ndarray
        self.eri_mo = NotImplemented  # type: np.ndarray
        self.energy_rhf = NotImplemented  # type: np.ndarray
        self.energy_corr = NotImplemented  # type: np.ndarray

    def construct_from_dat_file(self, file_path: str):
        with open(file_path, "r") as f:
            dat = np.array([line.split() for line in f.readlines()][1:])
            self.atom_charges = np.array(dat[:, 0], dtype=float).astype(int)
            self.atom_coords = np.array(dat[:, 1:4], dtype=float)
            self.natm = self.atom_charges.shape[0]

    def obtain_mol_instance(self, basis: str, verbose=0):
        mol = gto.Mole()
        mol.unit = "Bohr"
        mol.atom = "\n".join([("{:3d} " + " ".join(["{:25.18f}"] * 3)).format(chg, *coord) for chg, coord in zip(self.atom_charges, self.atom_coords)])
        mol.basis = basis
        mol.charge = self.charge
        mol.spin = 0
        mol.verbose = verbose
        self.mol = mol.build()

    def eng_nuclear_repulsion(self) -> float:
        return self.mol.energy_nuc()

    def obtain_nao(self):
        self.nao = self.mol.nao_nr()

    def obtain_nocc(self):
        assert (self.atom_charges.sum() - self.charge) % 2 == 0
        self.nocc = (self.atom_charges.sum() - self.charge) // 2

    def integral_ovlp(self) -> np.ndarray:
        return self.mol.intor("int1e_ovlp")

    def integral_kin(self) -> np.ndarray:
        return self.mol.intor("int1e_kin")

    def integral_nuc(self) -> np.ndarray:
        return self.mol.intor("int1e_nuc")

    def get_hcore(self) -> np.ndarray:
        return self.integral_kin() + self.integral_nuc()

    def integral_eri(self) -> np.ndarray:
        return self.mol.intor("int2e")

    def integral_ovlp_m1d2(self) -> np.ndarray:
        return fractional_matrix_power(self.integral_ovlp(), -1/2)

    def get_fock(self, dm: np.ndarray) -> np.ndarray:
        return self.get_hcore() + (self.eri_ao * dm).sum(axis=(-1, -2)) - 0.5 * (self.eri_ao * dm[:, None, :]).sum(axis=(-1, -3))

    def get_coeff_from_fock_diag(self, fock: np.ndarray) -> np.ndarray:
        return scipy.linalg.eigh(fock, self.integral_ovlp())[1]

    def make_rdm1(self, coeff: np.ndarray) -> np.ndarray:
        return 2 * coeff[:, :self.nocc] @ coeff[:, :self.nocc].T

    def get_updated_dm(self, dm: np.ndarray) -> np.ndarray:
        return self.make_rdm1(self.get_coeff_from_fock_diag(self.get_fock(dm)))

    def eng_total(self, dm: np.ndarray) -> float:
        return (0.5 * (self.get_hcore() + self.get_fock(dm)) * dm).sum() + self.eng_nuclear_repulsion()

    def scf_process(self, dm_guess: np.ndarray=None) -> Tuple[float, np.ndarray]:
        eng, dm = 0., np.zeros((self.nao, self.nao)) if dm_guess is None else np.copy(dm_guess)
        max_iter, thresh_eng, thresh_dm = 64, 1e-10, 1e-8
        for epoch in range(max_iter):
            eng_next, dm_next = self.eng_total(dm), self.get_updated_dm(dm)
            if np.abs(eng_next - eng) < thresh_eng and np.linalg.norm(dm_next - dm) < thresh_dm:
                eng, dm = eng_next, dm_next
                break
            eng, dm = eng_next, dm_next
        return eng, dm

The meaning of additional attributes of Molecule are:

  • mo_coeff \(C_{\mu p}\) Molecular orbital coefficient

  • mo_energy \(\varepsilon_p\) Molecular orbital energy (from fock diagonalization)

  • eri_ao \((\mu \nu | \kappa \lambda)\) Atomic orbital electron repulsion integral

  • eri_mo \((p q | r s)\) Molecular orbital electron repulsion integral

  • energy_rhf \(E_\mathrm{tot}^\mathsf{RHF}\) Total energy of RHF

  • energy_corr \(E_\mathrm{corr}\) Correlation energy, in this project it is MP2’s form \(E_\mathrm{corr}^\mathsf{MP2}\)

mo_coeff \(C_{\mu p}\), mo_energy \(e_p\), eri_ao \((\mu \nu | \kappa \lambda)\), energy_rhf \(E_\mathrm{tot}^\mathsf{RHF}\) is already calculated in SCF process, as illustrated in Project 03. To store these quantities as attribute of Molecule is out of consideration of convenience and efficiency (one does not need to evaluate atomic eri every time when it is called).

The other two quantities, eri_mo \((pq|rs)\) and energy_corr \(E_\mathrm{corr}\) is our main concern in this project.

Step 1: Two-Electron Integrals

The Mulliken-ordered integrals are defined as:

\[ (\mu \nu | \kappa \lambda) = \int \phi_\mu (\boldsymbol{r}_1) \phi_\nu (\boldsymbol{r}_1) \frac{1}{|\boldsymbol{r}_1 - \boldsymbol{r}_2|} \phi_\kappa (\boldsymbol{r}_2) \phi_\lambda (\boldsymbol{r}_2) \, \mathrm{d} \boldsymbol{r}_1 \, \mathrm{d} \boldsymbol{r}_2 \]

We have already read and used this integral in Project 03. However we still does not have a function storing this integral to attribute of Molecule.

Implementation

def obtain_eri_ao(mole: Molecule):
    # Attribute Modification: `eri_ao` atomic orbital electron repulsion integral
    raise NotImplementedError("Exactly 1 line of code")

Molecule.obtain_eri_ao = obtain_eri_ao

Solution

sol_mole.obtain_eri_ao()
sol_mole.eri_ao.shape
(7, 7, 7, 7)

Step 2: Obtain the SCF Intermediate Results

We have already implemented an SCF process, but results are not stored in attributes of Molecule currently. We need to have a function to store RHF total energy \(E_\mathrm{tot}^\mathsf{RHF}\), molecular orbital coefficients \(C_{\mu p}\) and molecular orbital energy \(\varepsilon_p\).

Implementation

def obtain_scf_intermediates(mole: Molecule, dm_guess: np.ndarray=None):
    # Attribute Modification: `energy_rhf` Total energy of RHF
    # Attribute Modification: `mo_energy` Molecular orbital energies
    # Attribute Modification: `mo_coeff` Molecular orbital coefficients
    raise NotImplementedError("About 3~5 lines of code")

Molecule.obtain_scf_intermediates = obtain_scf_intermediates

Solution

sol_mole.obtain_scf_intermediates()
sol_mole.energy_rhf, sol_mole.mo_energy, sol_mole.mo_coeff
(-74.94207992819237,
 array([-20.2628916,  -1.2096974,  -0.5479647,  -0.4365272,  -0.3875867,   0.4776187,   0.5881393]),
 array([[-0.9944346, -0.2391589,  0.       , -0.0936832, -0.       ,  0.1116399,  0.       ],
        [-0.024097 ,  0.8857356, -0.       ,  0.4795859,  0.       , -0.6695791, -0.       ],
        [ 0.       ,  0.       , -0.6072848, -0.       ,  0.       ,  0.       , -0.9192343],
        [-0.0031615,  0.0858962,  0.       , -0.7474314, -0.       , -0.7384886, -0.       ],
        [ 0.       , -0.       , -0.       ,  0.       , -1.       ,  0.       , -0.       ],
        [ 0.0045937,  0.1440396, -0.4529977, -0.3294712, -0.       ,  0.7098495,  0.7324607],
        [ 0.0045937,  0.1440396,  0.4529977, -0.3294712,  0.       ,  0.7098495, -0.7324607]]))

Step 3: MO ERI Transformation (Naïve Algorithm)

Molecular orbital (MO) electron repulsion integral (ERI) could be defined as

\[ (pq|rs) = \int \phi_p (\boldsymbol{r}_1) \phi_q (\boldsymbol{r}_1) \frac{1}{|\boldsymbol{r}_1 - \boldsymbol{r}_2|} \phi_r (\boldsymbol{r}_2) \phi_s (\boldsymbol{r}_2) \, \mathrm{d} \boldsymbol{r}_1 \, \mathrm{d} \boldsymbol{r}_2 \]

just similar to definition of AO ERI \((\mu \nu | \kappa \lambda)\). Note that \(r\) in subscript means any (occupied or virtual) molecular orbital, not related to electron coordinate vector \(\boldsymbol{r}_1\) or \(\boldsymbol{r}_2\).

The most straightforward expression of the AO/MO integral transformation is

\[ (pq|rs) = \sum_\mu \sum_\nu \sum_\kappa \sum_\lambda C_{\mu p} C_{\nu q} (\mu \nu | \kappa \lambda) C_{\kappa r} C_{\lambda s} \]

This approach is easy to implement (hence the word “naïve” above), but is expensive (time-costly) due to its \(O(N^8)\) computational order. Nevertheless, you should start with this algorithm to get your code working, and run timings for the test cases below to get an idea of its computational cost.

Implementation

def get_eri_mo_naive(mole: Molecule) -> np.ndarray:
    # Naive algorithm
    # Output: MO electron repulsion integral
    raise NotImplementedError("About 5~15 lines of code")

Molecule.get_eri_mo_naive = get_eri_mo_naive

Solution

%%time
sol_mole.get_eri_mo_naive()[0, 3]
CPU times: user 19.9 s, sys: 78.1 ms, total: 19.9 s
Wall time: 20.9 s
array([[ 0.166246 ,  0.0218153, -0.       ,  0.0244196, -0.       ,  0.0027303,  0.       ],
       [ 0.0218153,  0.0126143, -0.       , -0.0184892,  0.       , -0.0184186, -0.       ],
       [-0.       , -0.       ,  0.0055846,  0.       , -0.       , -0.       , -0.0002852],
       [ 0.0244196, -0.0184892,  0.       , -0.0089254, -0.       ,  0.0012042, -0.       ],
       [-0.       ,  0.       , -0.       , -0.       ,  0.0046733, -0.       , -0.       ],
       [ 0.0027303, -0.0184186, -0.       ,  0.0012042, -0.       ,  0.0180084,  0.       ],
       [ 0.       , -0.       , -0.0002852, -0.       , -0.       ,  0.       ,  0.0037402]])

This algorithm is not suitable if basis gets even larger.

Step 4: MO ERI Transformation (Smarter Algorithm)

Notice that none of the orbital coefficients \(\mathbf{C}\) in the above expression have any indices in common. Thus, the summation could be rearranged such that:

\[ (pq|rs) = \sum_\lambda C_{\lambda s} \left[ \sum_\kappa C_{\kappa r} \left[ \sum_\nu C_{\nu q} \left[ \sum_\mu C_{\mu p} (\mu \nu | \kappa \lambda) \right] \right] \right] \]

This means that each summation within brackets could be carried out separately, starting from the innermost summation over \(\mu\), if we store the results at each step. This reduces the \(O(N^8)\) algorithm above to four \(O(N^5)\) steps.

After you have the noddy algorithm working and timed, modify it to use this smarter algorithm and time the test cases again. Enjoy!

Implementation

def get_eri_mo_smarter(mole: Molecule):
    # Smarter algorithm
    # Output: MO electron repulsion integral
    raise NotImplementedError("About 5 + 3~5 * 4 lines of code")

Solution

%%time
sol_mole.get_eri_mo_smarter()[0, 3]
CPU times: user 93.8 ms, sys: 0 ns, total: 93.8 ms
Wall time: 103 ms
array([[ 0.166246 ,  0.0218153, -0.       ,  0.0244196, -0.       ,  0.0027303,  0.       ],
       [ 0.0218153,  0.0126143, -0.       , -0.0184892,  0.       , -0.0184186, -0.       ],
       [-0.       , -0.       ,  0.0055846,  0.       , -0.       ,  0.       , -0.0002852],
       [ 0.0244196, -0.0184892,  0.       , -0.0089254, -0.       ,  0.0012042, -0.       ],
       [-0.       ,  0.       , -0.       , -0.       ,  0.0046733, -0.       , -0.       ],
       [ 0.0027303, -0.0184186,  0.       ,  0.0012042, -0.       ,  0.0180084,  0.       ],
       [ 0.       , -0.       , -0.0002852, -0.       , -0.       ,  0.       ,  0.0037402]])
%%timeit
sol_mole.get_eri_mo_smarter()[0, 3]
97 ms ± 801 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Step 5: MO ERI Transformation (numpy.einsum)

numpy.einsum (NumPy API) provides an extremely concise way to do tensor contraction, with sufficient support of efficiency.

We still use the following formula:

\[ (pq|rs) = \sum_{\mu \nu \kappa \lambda} C_{\mu p} C_{\nu q} (\mu \nu | \kappa \lambda) C_{\kappa r} C_{\lambda s} \]

and molecular orbital basis ERI could be calculated as

eri_mo = np.einsum("up, vq, uvkl, kr, ls -> pqrs", coeff, coeff, eri_ao, coeff, coeff, optimize=True)

where optimize=True tells numpy.einsum to use the optimized (smarter) algorithm. This optional parameter is critical

Finally, store molecular orbital ERI into attribute eri_mo in Molecule.

Implementation

def get_eri_mo_einsum(mole: Molecule):
    # Use numpy.einsum to automatically find optimal contraction path
    # Output: MO electron repulsion integral
    raise NotImplementedError("About 1~3 lines of code")

Molecule.get_eri_mo_einsum = get_eri_mo_einsum
def obtain_eri_mo(mole: Molecule):
    # Attribute Modification: `eri_mo` Molecular orbital electron repulsion integral
    raise NotImplementedError("Exactly 1 line of code")

Molecule.obtain_eri_mo = obtain_eri_mo

Solution

%%time
sol_mole.get_eri_mo_einsum()[0, 3]
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 1.49 ms
array([[ 0.166246 ,  0.0218153, -0.       ,  0.0244196, -0.       ,  0.0027303,  0.       ],
       [ 0.0218153,  0.0126143, -0.       , -0.0184892,  0.       , -0.0184186, -0.       ],
       [-0.       , -0.       ,  0.0055846,  0.       , -0.       ,  0.       , -0.0002852],
       [ 0.0244196, -0.0184892,  0.       , -0.0089254, -0.       ,  0.0012042, -0.       ],
       [-0.       ,  0.       , -0.       , -0.       ,  0.0046733, -0.       , -0.       ],
       [ 0.0027303, -0.0184186,  0.       ,  0.0012042, -0.       ,  0.0180084,  0.       ],
       [ 0.       , -0.       , -0.0002852, -0.       , -0.       ,  0.       ,  0.0037402]])
%%timeit
sol_mole.get_eri_mo_einsum()
416 µs ± 2.87 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Step 6: Compute the MP2 Energy

Restricted MP2 correlation energy is (Helgaker, et al., eq 14.4.56)

\[ E_\mathrm{corr}^\mathsf{MP2} = \sum_{iajb} \frac{(ia|jb) \big[ 2 (ia|jb) - (ib|ja) \big]}{\varepsilon_i + \varepsilon_j - \varepsilon_a - \varepsilon_b} \]

where \(i\) and \(j\) denote doubly-occupied orbitals and \(a\) and \(b\) unoccupied orbitals, and the denominator involves the MO energies.

Finally, store MP2 correlation energy into attribute energy_corr in Molecule.

Implementation

def eng_mp2_corr(mole: Molecule):
    # Output: (Restricted) MP2 correlation energy
    raise NotImplementedError("About 4~10 lines of code")

Molecule.eng_mp2_corr = eng_mp2_corr
def obtain_mp2_corr(mole: Molecule):
    # Attribute Modification: `energy_corr` Post-HF (in this project is RMP2) correlation energy
    raise NotImplementedError("Exactly 1 line of code")

Molecule.obtain_mp2_corr = obtain_mp2_corr

Solution

sol_mole.obtain_eri_mo()
sol_mole.obtain_mp2_corr()
print("SCF total       energy: {:16.8f}".format(sol_mole.energy_rhf))
print("MP2 correlation energy: {:16.8f}".format(sol_mole.energy_corr))
print("MP2 total       energy: {:16.8f}".format(sol_mole.energy_rhf + sol_mole.energy_corr))
SCF total       energy:     -74.94207993
MP2 correlation energy:      -0.04914964
MP2 total       energy:     -74.99122956

Test Cases

The input structures, integrals, etc. for these examples are found in the input directory. You can also use PySCF approach and simply ignore the integral files.

Water/STO-3G (Directory)

sol_mole = SolMol()
sol_mole.construct_from_dat_file("input/h2o/STO-3G/geom.dat")
sol_mole.obtain_mol_instance(basis="STO-3G")
sol_mole.print_solution_04()
SCF total       energy:     -74.94207993
MP2 correlation energy:      -0.04914964
MP2 total       energy:     -74.99122956

Water/DZ (Directory)

sol_mole = SolMol()
sol_mole.construct_from_dat_file("input/h2o/DZ/geom.dat")
sol_mole.obtain_mol_instance(basis="DZ")
sol_mole.print_solution_04()
SCF total       energy:     -75.97787898
MP2 correlation energy:      -0.15270988
MP2 total       energy:     -76.13058885

Water/DZP (Directory) For this molecule, some additional code is included. See Project 03 for details.

sol_mole = SolMol()
sol_mole.construct_from_dat_file("input/h2o/DZP/geom.dat")
sol_mole.obtain_mol_instance(basis="DZP_Dunning")
sol_mole.mol._basis["H"][2][1][0] = 0.75
sol_mole.mol.cart = True
sol_mole.mol.build()
sol_mole.print_solution_04()
SCF total       energy:     -76.00882179
MP2 correlation energy:      -0.22251923
MP2 total       energy:     -76.23134103

Methane/STO-3G (Directory)

sol_mole = SolMol()
sol_mole.construct_from_dat_file("input/ch4/STO-3G/geom.dat")
sol_mole.obtain_mol_instance(basis="STO-3G")
sol_mole.print_solution_04()
SCF total       energy:     -39.72685032
MP2 correlation energy:      -0.05604667
MP2 total       energy:     -39.78289699

References

  • Szabo, A.; Ostlund, N. S. Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory Dover Publication Inc., 1996.

    ISBN-13: 978-0486691862

  • Helgaker, T.; Jørgensen, P.; Olsen, J. Molecular Electronic-Structure Theory John Wiley & Sons, ltd., 2000

    ISBN-13: 978-1118531471