
Advanced Molecular Science: Electronic Structure Theory

Krzysztof Szalewicz et al.

Department of Physics and Astronomy, University of Delaware, Newark, DE 19716, USA

(Dated: December 17, 2017)

Abstract

These Lecture Notes were prepared during a one-semester course at the University of Delaware. Some lectures were given by students and the corresponding notes were also prepared by students. The goal of this course was to cover the material from first principles, assuming only the knowledge of standard quantum mechanics at an advanced undergraduate level. Thus, all the concepts are defined and all theorems are proved. Some material that looks ahead is stated without proof; which material this is should be obvious from the context. About 95% of the material given in the notes was actually presented in class, in the traditional blackboard-and-chalk manner.


CONTENTS

I. Introduction
   A. Spinorbitals
   B. Products of complete basis sets

II. Symmetries of many-particle functions
   A. Symmetric group
   B. Determinant

III. Separation of nuclear and electronic motion
   A. Hamiltonian in relative coordinates
   B. Born-Oppenheimer approximation
   C. Adiabatic approximation and nonadiabatic correction

IV. The independent-particle model: the Hartree-Fock method
   A. Slater determinant and antisymmetrizer
   B. Slater-Condon rules
   C. Derivation of Hartree-Fock equations

V. Second-quantization formalism
   A. Annihilation and creation operators
   B. Products and commutators of operators
   C. Hamiltonian and number operator
   D. Normal products and Wick's theorem
      1. Normal-Product
      2. Contractions (Pairings)
      3. Time-independent Wick's theorem
      4. Outline of proof of Wick's theorem
      5. Comprehensive proof of Wick's theorem
      6. Particle-hole formalism
      7. Normal products and Wick's theorem relative to the Fermi vacuum
      8. Generalized Wick's theorem
      9. Normal-product form of operators with respect to Fermi's vacuum

VI. Density-functional theory
   A. Thomas-Fermi-Dirac method
   B. Hohenberg-Kohn theorems
   C. Kohn-Sham method
   D. Local density approximation
   E. Generalized gradient approximations (GGA)
   F. Beyond GGA

VII. Variational Method
   A. Configuration Interaction (CI) method
      1. Size extensivity of CI
      2. MCSCF, CASSCF, RASSCF, and MRCI
   B. Basis sets and basis set convergence
   C. Explicitly-correlated methods
      1. Coulomb cusp
      2. Hylleraas function
      3. Slater geminals
      4. Explicitly-correlated Gaussian functions

VIII. Many-body perturbation theory (MBPT)
   A. Rayleigh-Schrödinger perturbation theory (classical derivation)
   B. Hylleraas variation principle
   C. Møller-Plesset perturbation theory
   D. Diagrammatic expansions for MPPT
      1. Diagrammatic notation
      2. One-particle operator
      3. Two-particle operators
      4. Hugenholtz diagrams
      5. Antisymmetrized Goldstone diagrams
      6. Diagrammatic representation of RSPT
   E. Time versions
      1. Time version of the first kind
      2. Time version of the second kind
   F. Connected and disconnected diagrams
   G. Linked and unlinked diagrams
   H. Factorization lemma (Frantz and Mills)
   I. Linked-cluster theorem
   J. Removal of spin

IX. Coupled cluster theory
   A. Exponential ansatz
   B. Size consistency
   C. CC method with double excitations
   D. Equivalence of CC and MBPT theory
   E. Noniterative triple excitations correction
   F. Full triple and higher excitations

X. Linear response theory
   A. Response function
      1. Density-density response function
      2. Calculation of properties from response functions
   B. Linear response in CC approach
      1. CC equations
      2. Hellmann-Feynman theorem
      3. Linear response CC for static perturbation
      4. Lambda equations

XI. Treatment of excited states
   A. Excitation energies from TD-DFT
   B. Limitations of single-reference CC methods
   C. The equation-of-motion coupled-cluster method
   D. Multireference coupled-cluster methods

XII. Intermolecular interactions
   A. Symmetry-adapted perturbation theory
   B. Asymptotic expansion of interaction energy
   C. Intermolecular interactions in DFT

XIII. Diffusion Monte Carlo

XIV. Density-matrix approaches
   A. Reduced density matrices
   B. Spinless density matrices
   C. N-representability
   D. Density matrix functional theory
   E. Contracted Schrödinger equation

XV. Density matrix renormalization group (DMRG)
   A. Singular value decomposition
   B. SVD applications
   C. DMRG wave function
   D. Expectation values and diagrammatic notation
   E. Matrix product ansatz
   F. DMRG algorithm
   G. DMRG in practice
   H. Dynamic correlation and excited states
   I. Applications to atoms and molecules
   J. Limitations

I. INTRODUCTION

The subject of these lecture notes is methods of solving Schrödinger's equation for atoms, molecules, biomolecular aggregates, and solids. Schrödinger's equation provides a very accurate description of most types of matter under most conditions, where by "most" we understand the materials and conditions found on Earth. Exceptions include materials that contain heavy atoms, where relativistic effects have to be accounted for, and high-precision measurements, where not only relativistic but also quantum electrodynamics (QED) effects play a role. We will restrict our attention to solutions of Schrödinger's equation. Incorporation of relativistic and QED effects can be achieved by a fairly straightforward extension of the methods discussed here. We will also restrict our attention to systems built of electrons and nuclei treated as point particles. Thus, we will not consider phenomena which involve nuclear reactions. However, many of the methods discussed here are used in theoretical nuclear physics.

For systems with up to 4 electrons and 1 to 3 nuclei, one can now solve Schrödinger's equation to almost any desired precision, although for the most complicated systems of this type this requires huge amounts of computer resources. Some of the methods used in such calculations, such as the variational method with explicitly correlated functions (i.e., functions depending explicitly on electron-electron distances), will be briefly discussed here, but we will devote most of the time to systems that are larger and for which such methods are not applicable. The difficulty of solving Schrödinger's equation for systems with 5 or more electrons originates from the dimensionality of the problem. For example, the benzene molecule contains 12 nuclei and 42 electrons, so that Schrödinger's equation for this system is 162-dimensional. Thus, this equation can be solved only by making approximations (although quantum diffusion Monte Carlo methods, which will be considered later on, do solve such equations "almost" directly). The main approximation applied is the many-particle (or many-electron or many-body) expansion. Therefore, most of the material covered here belongs to the branch of physics called many-body physics. The concept of the many-particle expansion is based on the observation that in a many-particle system the most important interactions are those involving only two particles. This leads to several methods that are hierarchical in the number of simultaneously interacting particles considered.

The particles that we will consider almost exclusively will be electrons. Many-particle theories applied to bound states of such particles are known as electronic structure theory. The reasons for using the word "structure" are unclear, but it probably relates to the shell structure of atoms and the orbital picture of molecules.


A. Spinorbitals

Electrons are fermions of spin 1/2. The wave function of a single electron depends on the space coordinate $\mathbf{r} = [x, y, z]$, with each variable in the range $(-\infty, \infty)$, and on the spin coordinate $s$, which takes only the values $\pm 1/2$. Therefore, the wave function for a single electron, called a spinorbital, can be written in the form of the so-called spinor

$$\boldsymbol{\psi}(\mathbf{r}) = \begin{bmatrix} \psi_+(\mathbf{r}) \\ \psi_-(\mathbf{r}) \end{bmatrix} \leftrightarrow \psi(\mathbf{r}, s) \equiv \psi(x),$$

where the $\psi_+$ component is the amplitude of finding the electron with spin projection (the eigenvalue of the operator $S_z$) equal to $\hbar/2$ ("spin up"), whereas the $\psi_-$ component is the amplitude of finding this particle with spin projection $-\hbar/2$ ("spin down"). Note that an electron in the state $\boldsymbol{\psi}$ is in general in a mixed spin state, i.e., a superposition of spin up and spin down. It will be more convenient to use the other form of the wave function shown in the equation above,

$$\psi_+(\mathbf{r}) = \psi\left(\mathbf{r}, \tfrac{1}{2}\right), \qquad \psi_-(\mathbf{r}) = \psi\left(\mathbf{r}, -\tfrac{1}{2}\right).$$

This form is particularly convenient to use in the expectation values of operators (matrix elements) involving spinorbitals. In the one-electron case, for an operator $f$ that does not contain spin operators,

$$\langle \psi | f \phi \rangle = \sum_{s=-1/2,\,1/2} \int d^3r\, \psi^*(x) f(\mathbf{r}) \phi(x),$$

i.e., we sum over the spin variable and integrate over the three space variables. Since $f$ does not contain spin operators, the sum over the spin degrees of freedom can be computed immediately.

In most cases we will consider pure spin states, i.e., states with the property that either $\psi_+(\mathbf{r})$ or $\psi_-(\mathbf{r})$ is zero. For example,

$$\boldsymbol{\psi}(\mathbf{r}) = \begin{bmatrix} \psi_+(\mathbf{r}) \\ 0 \end{bmatrix}$$

is the spinorbital with spin projection $\hbar/2$, or the "spin alpha ($\alpha$)" state, whereas the other option is called the "spin beta ($\beta$)" state. In such cases, the pure-spin spinorbitals can be denoted as

$$\psi_+(\mathbf{r}) \text{ and } \psi_-(\mathbf{r}) \quad \text{or} \quad \psi(\mathbf{r})\alpha(s) \text{ and } \psi'(\mathbf{r})\beta(s) \quad \text{or} \quad \psi_\alpha(\mathbf{r}) \text{ and } \psi_\beta(\mathbf{r}),$$

where $\alpha(1/2) = \beta(-1/2) = 1$ and $\alpha(-1/2) = \beta(1/2) = 0$, and the spatial part is called the orbital. Note that $\psi_+(\mathbf{r})$ and $\psi_-(\mathbf{r})$ are now different spinorbitals, whereas before they were components of a single spinorbital. We also continue using the symbol $\psi(x)$, assuming that $\psi(x)|_{s=1/2}$ either describes pure spin alpha or is zero. One sometimes uses a somewhat confusing notation where the spinorbital and the orbital are denoted by the same symbol, so that we have, for example, $\psi(x) = \psi(\mathbf{r})\alpha(s)$. The meaning of this symbol should be obvious from the context.

If $\psi$ and $\phi$ represent the same spin projection, they are simultaneously nonzero at either $s = 1/2$ or $s = -1/2$ and zero at the opposite value, so that the spin summation can be performed and leaves only the spatial integral. If the spin projections are opposite, at each value of $s$ one of the spinorbitals is zero, so that the result of the summation over spins is zero. The same becomes more transparent in the alpha/beta notation, for example,

$$\langle \psi\alpha | f \phi\alpha \rangle = \sum_{s=-1/2,\,1/2} \alpha^2(s) \int d^3r\, \psi^*(\mathbf{r}) f(\mathbf{r}) \phi(\mathbf{r}) = \int d^3r\, \psi^*(\mathbf{r}) f(\mathbf{r}) \phi(\mathbf{r}),$$

and for opposite spins one obviously gets zero, since $\alpha(s)\beta(s) = 0$ at both values of $s$.
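The spin summation can be illustrated with a small numerical sketch. It is only a toy check under stated assumptions: the three-dimensional spatial integral is replaced by a sum over a one-dimensional grid (which does not change the spin algebra), and the orbitals `orbital_a`, `orbital_b` and the operator `f` are hypothetical choices made for illustration.

```python
import math

H = 0.01                                      # grid spacing (illustrative)
GRID = [-5.0 + k * H for k in range(1001)]    # 1D stand-in for the spatial integral
SPINS = (0.5, -0.5)                           # the two values of the spin coordinate s

def alpha(s):
    return 1.0 if s == 0.5 else 0.0

def beta(s):
    return 1.0 if s == -0.5 else 0.0

def orbital_a(r):                             # hypothetical real orbitals
    return math.exp(-r * r)

def orbital_b(r):
    return math.exp(-0.5 * r * r)

def f(r):                                     # spin-independent multiplicative operator
    return r * r

def matrix_element(psi, phi):
    """<psi|f phi>: sum over s and integrate over r for spinorbitals psi(r, s)."""
    return sum(psi(r, s).conjugate() * f(r) * phi(r, s) * H
               for s in SPINS for r in GRID)

psi_alpha = lambda r, s: orbital_a(r) * alpha(s)
phi_alpha = lambda r, s: orbital_b(r) * alpha(s)
phi_beta = lambda r, s: orbital_b(r) * beta(s)

same_spin = matrix_element(psi_alpha, phi_alpha)
opposite_spin = matrix_element(psi_alpha, phi_beta)
spatial_only = sum(orbital_a(r) * f(r) * orbital_b(r) * H for r in GRID)
```

Here `same_spin` reproduces `spatial_only`, while `opposite_spin` vanishes term by term because one of the two spin functions is zero at each value of `s`.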

B. Products of complete basis sets

One of the main theorems used in many-body theory states that a complete basis set in the space of many-particle functions can be formed as a product of complete basis sets of single-particle functions. Let us show that this is the case on the simplest example of a function of two variables. Let us assume that $\{g_i(x)\}_{i=1}^{\infty}$ is a complete basis set for functions of one variable. Then the set of products $\{g_i(x) g_j(y)\}$ is a complete set in the space of functions of two variables, i.e., any function $f(x, y)$ can be expanded in this set

$$f(x, y) = \sum_{i,j=1}^{\infty} c_{ij}\, g_i(x) g_j(y).$$

To see that this is indeed the case, consider the function $f(x, y)$ at a fixed value of $x$ denoted by $x_0$. Since $f(x_0, y)$ is just a function of a single variable, we may write

$$f(x_0, y) = \sum_j d_j(x_0)\, g_j(y).$$

However, taken at different values of $x_0$, $d_j(x_0)$ is just a function of a single variable and can be expanded in our basis

$$d_j(x) = \sum_i c_{ij}\, g_i(x),$$

which proves the theorem.
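The two steps of this proof can be followed numerically. The sketch below is an illustration only: the orthonormal sine basis on $[0, \pi]$, the test function `f`, the grid size `M`, and the truncation orders are all assumptions chosen for convenience, not part of the notes.

```python
import math

M = 40                                   # number of grid intervals on [0, pi]
h = math.pi / M
grid = [k * h for k in range(M + 1)]

def g(i, x):
    """Orthonormal sine basis on [0, pi]: g_i(x) = sqrt(2/pi) sin(i x)."""
    return math.sqrt(2.0 / math.pi) * math.sin(i * x)

def f(x, y):
    """A non-separable test function vanishing on the boundary of the square."""
    return x * (math.pi - x) * y * (math.pi - y) * math.exp(-(x - y) ** 2)

def expansion_error(n):
    """RMS error on the grid of the expansion in the n x n product basis."""
    G = [[g(i, x) for x in grid] for i in range(1, n + 1)]
    F = [[f(x, y) for y in grid] for x in grid]
    # Step 1: at each fixed x0 = x_k, expand in y to get d_j(x_k).
    D = [[h * sum(F[k][l] * G[j][l] for l in range(M + 1)) for k in range(M + 1)]
         for j in range(n)]
    # Step 2: expand each d_j(x) in the same basis to get c_ij.
    C = [[h * sum(D[j][k] * G[i][k] for k in range(M + 1)) for j in range(n)]
         for i in range(n)]
    squared = 0.0
    for k in range(M + 1):
        # partial[j] = sum_i c_ij g_i(x_k)
        partial = [sum(C[i][j] * G[i][k] for i in range(n)) for j in range(n)]
        for l in range(M + 1):
            approx = sum(partial[j] * G[j][l] for j in range(n))
            squared += (approx - F[k][l]) ** 2
    return math.sqrt(squared / (M + 1) ** 2)

errors = {n: expansion_error(n) for n in (2, 4, 8, 16)}
```

The entries of `errors` decrease as more product functions are included, as expected for a complete basis; a finite truncation is of course only an approximation to the infinite sum.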

II. SYMMETRIES OF MANY-PARTICLE FUNCTIONS

Since electrons are fermions, electronic wave functions have to be antisymmetric. This chapter will show how to achieve this goal. The notion of antisymmetry is related to permutations of the electrons' coordinates. Therefore, we will start with a discussion of the permutation group.

A. Symmetric group

The permutation group, also known as the symmetric group, is the group of all operations on a set of $N$ distinct objects that order the objects in all possible ways. The group is denoted by $S_N$ (we will show below that this is a group). We will call these operations permutations and denote them by the symbol $\sigma_i$. For a set consisting of the numbers $1, 2, \ldots, N$, the permutation $\sigma_i$ orders these numbers in such a way that $k$ is at the $j$th position. Often a better way of looking at permutations is to say that permutations are all one-to-one mappings of the set $\{1, 2, \ldots, N\}$ onto itself: $\sigma_i(k) = j$, where $j$ has to go over all elements of the set as $k$ does.

The number of permutations is $N!$. Indeed, we can place any of the $N$ objects at position 1, so there are $N$ possible placements. For each of them, we can place one of the remaining $N - 1$ objects at the second position, so that the number of possible arrangements is now $N(N - 1)$. Continuing in this way, we prove the theorem.

For three numbers, 1, 2, 3, there are the following $3! = 6$ arrangements: 123, 132, 213, 231, 312, 321.

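One can use the following "matrix" to denote permutations:

$$\sigma = \begin{pmatrix} 1 & 2 & \ldots & k & \ldots & N \\ \sigma(1) & \sigma(2) & \ldots & \sigma(k) & \ldots & \sigma(N) \end{pmatrix}.$$

The order of columns in the matrix above is convenient, but note that if the columns were ordered differently, this would still be the same permutation. An example of a permutation in this notation is

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix}.$$

We define the operation of multiplication within the set of permutations as $(\sigma\sigma')(k) = \sigma(\sigma'(k))$. For example, if

$$\sigma_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 3 & 1 \end{pmatrix},$$

then

$$\sigma_2 \sigma_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 1 & 2 & 4 \end{pmatrix}.$$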
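The multiplication rule and the count of arrangements can be reproduced with a few lines of code. This is a minimal sketch; representing a permutation by the bottom row of its two-row symbol, and the helper name `compose`, are conventions assumed here for illustration.

```python
from itertools import permutations

def compose(sigma, sigma_prime):
    """(sigma sigma')(k) = sigma(sigma'(k)); entry k-1 of a tuple is the image of k."""
    return tuple(sigma[sigma_prime[k] - 1] for k in range(len(sigma)))

sigma1 = (3, 4, 1, 2)
sigma2 = (2, 4, 3, 1)
product = compose(sigma2, sigma1)    # (3, 1, 2, 4), as in the example above

# The 3! = 6 arrangements of the numbers 1, 2, 3
arrangements = ["".join(map(str, p)) for p in permutations((1, 2, 3))]
```

Note that the right factor acts first, so `compose(sigma1, sigma2)` is in general a different permutation.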

We can now check that these operations satisfy the group postulates:

• Closure: $\sigma\sigma' \in S_N$. The proof is simple: the product maps each number from the set onto a number from the set and, as a composition of two one-to-one mappings, is itself one-to-one; therefore it is a permutation.

• Existence of unity $I$: this is the permutation $\sigma(k) = k$.

• Existence of inverse, i.e., for each $\sigma$ there exists $\sigma^{-1}$ such that $\sigma\sigma^{-1} = I$. Clearly, the inverse can be defined such that if $\sigma(k) = j$, then $\sigma^{-1}(j) = k$.

• Multiplications are associative:

$$\sigma_3 (\sigma_2 \sigma_1) = (\sigma_3 \sigma_2) \sigma_1.$$

The proof is a homework problem.

One important theorem resulting from these definitions is that the set of products of a single permutation with all elements of $S_N$ is equal to $S_N$:

$$\sigma S_N = S_N.$$

Proof: Due to closure, the only possibility of not reproducing the whole group is that two different elements of $S_N$, $\sigma' \neq \sigma''$, are mapped by $\sigma$ onto the same element:

$$\sigma\sigma' = \sigma''' = \sigma\sigma''.$$

Multiplying this equation by $\sigma^{-1}$ from the left, we get $\sigma' = \sigma''$, which contradicts our assumption.

Another theorem states that $\{\sigma^{-1}\} = S_N$, i.e., the set of inverses of all elements of $S_N$ is again $S_N$. This is equivalent to saying that $\sigma$ and $\sigma^{-1}$ are in one-to-one correspondence. Indeed, assume that there are two permutations that are inverse to $\sigma$: $\sigma_1\sigma = I = \sigma_2\sigma$. Multiplying this by $\sigma^{-1}$ from the right, we get $\sigma_1 = \sigma_2$.
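For a small group, the postulates and the two theorems above can be checked by brute force. The sketch below does this for $S_4$; the tuple representation and the helper names `compose` and `inverse` are assumptions made for illustration, and such an enumeration is a check, not a proof.

```python
from itertools import permutations
from math import factorial

N = 4
S_N = set(permutations(range(1, N + 1)))   # bottom rows of the two-row symbols
identity = tuple(range(1, N + 1))

def compose(sigma, sigma_prime):
    """(sigma sigma')(k) = sigma(sigma'(k))."""
    return tuple(sigma[sigma_prime[k] - 1] for k in range(N))

def inverse(sigma):
    """If sigma(k) = j, then sigma^{-1}(j) = k."""
    inv = [0] * N
    for k, j in enumerate(sigma, start=1):
        inv[j - 1] = k
    return tuple(inv)

count_ok = len(S_N) == factorial(N)
closure_ok = all(compose(a, b) in S_N for a in S_N for b in S_N)
inverse_ok = all(compose(s, inverse(s)) == identity for s in S_N)

sigma = (3, 4, 1, 2)
coset_ok = {compose(sigma, s) for s in S_N} == S_N      # sigma S_N = S_N
inverses_ok = {inverse(s) for s in S_N} == S_N          # {sigma^{-1}} = S_N
```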

One important property of permutations is that each permutation can be written as a product of the simplest possible permutations, called transpositions. A transposition is a permutation involving only two elements:

$$\tau = \tau_{ij} = (ij) = \left\{ \begin{array}{l} \sigma(i) = j \\ \sigma(j) = i \\ \sigma(k) = k \ \text{for } k \neq i, j \end{array} \right\} = \begin{pmatrix} 1 & 2 & \ldots & i & \ldots & j & \ldots & N \\ 1 & 2 & \ldots & j & \ldots & i & \ldots & N \end{pmatrix}.$$

To prove that any permutation can be written as a product of transpositions, we just construct such a product. For a permutation $\sigma$ written as

$$\sigma = \begin{pmatrix} 1 & 2 & \ldots & k & \ldots & N \\ i_1 & i_2 & \ldots & i_k & \ldots & i_N \end{pmatrix} \tag{1}$$

first find $i_1$ in the set $1, 2, \ldots, N$ and transpose it with 1 (unless $i_1 = 1$, in which case do nothing). This puts $i_1$ at position 1. Then consider the set with $i_1$ removed, find $i_2$, and transpose it with 2. Continuing in this way, we get the mapping of expression (1), which proves the theorem. The decomposition of a permutation into transpositions is not unique, as we can always insert $\tau_{ij}\tau_{ij} = I$. Although the number of transpositions in a decomposition is not unique, this number is always either odd or even for a given permutation. The proof of this important theorem is given as homework. Thus, $(-1)^{\pi_\sigma}$, where $\pi_\sigma$ is the number of transpositions in an arbitrary decomposition, is always 1 or $-1$ for a given permutation, and we can classify each permutation as either odd or even. We say that each permutation has a definite parity.

One theorem concerning the parity of permutations is that $(-1)^{\pi_\sigma} = (-1)^{\pi_{\sigma^{-1}}}$, i.e., that a permutation and its inverse have the same parity. This results from the fact that each transposition is its own inverse, so that the transpositions of a decomposition of $\sigma$ taken in reverse order give a decomposition of $\sigma^{-1}$ with the same number of factors.
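The constructive proof translates directly into a short procedure. The following is a sketch under the assumption that a permutation is stored as the bottom row of its two-row symbol; the function names are hypothetical.

```python
from itertools import permutations

def transpositions(sigma):
    """Build sigma from the ordered arrangement 1, 2, ..., N by swaps,
    following the construction in the text; returns the swapped position pairs."""
    arrangement = list(range(1, len(sigma) + 1))
    swaps = []
    for pos, target in enumerate(sigma):
        where = arrangement.index(target)        # find i_k among the remaining entries
        if where != pos:                         # do nothing if it is already in place
            arrangement[pos], arrangement[where] = arrangement[where], arrangement[pos]
            swaps.append((pos + 1, where + 1))
    assert tuple(arrangement) == tuple(sigma)
    return swaps

def parity_sign(sigma):
    """(-1)^(number of transpositions in the decomposition found above)."""
    return (-1) ** len(transpositions(sigma))

def inverse(sigma):
    inv = [0] * len(sigma)
    for k, j in enumerate(sigma, start=1):
        inv[j - 1] = k
    return tuple(inv)

S4 = list(permutations(range(1, 5)))
same_parity_as_inverse = all(parity_sign(s) == parity_sign(inverse(s)) for s in S4)
sum_of_signs = sum(parity_sign(s) for s in S4)   # equal numbers of even and odd elements
```

For example, `transpositions((3, 4, 1, 2))` finds two swaps, so this permutation is even.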

B. Determinant

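The fundamental zeroth-order approximation for the wave function in the theory of many fermions is Slater's determinant. Thus, we have to study the concept of a determinant. For a general $N \times N$ matrix $A$ with elements $a_{ij}$, the determinant is defined as

$$|A| \equiv \det A = \begin{vmatrix} a_{11} & a_{12} & \ldots & a_{1N} \\ a_{21} & a_{22} & \ldots & a_{2N} \\ \ldots & \ldots & \ldots & \ldots \\ a_{N1} & a_{N2} & \ldots & a_{NN} \end{vmatrix} = \sum_{\sigma} (-1)^{\pi_\sigma} a_{\sigma(1)1} a_{\sigma(2)2} \cdots a_{\sigma(N)N} \tag{2}$$

where the sum is over all permutations of the numbers 1 to $N$ and $\pi_\sigma$ is the number of transpositions in a decomposition of $\sigma$, so that $(-1)^{\pi_\sigma}$ is the parity of the permutation.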

There are several important theorems involving determinants that we will now prove. First, let us show that $|A| = |A^T|$, which also means that the definition (2) can be written as

$$|A| = \sum_{\sigma} (-1)^{\pi_\sigma} a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{N\sigma(N)}. \tag{3}$$

To prove this property, first consider $\sigma(i) = 1$. There must be exactly one such factor $a_{i\sigma(i)}$ in each term of formula (3). Denote this value of $i$ in a given term by $i_1$ and move $a_{i_1 1}$ to the first position in the product

$$a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{i_1 1} \cdots a_{N\sigma(N)} = a_{i_1 1}\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{i_1 - 1\,\sigma(i_1 - 1)}\, a_{i_1 + 1\,\sigma(i_1 + 1)} \cdots a_{N\sigma(N)}.$$

Next, look for $\sigma(i) = 2 = \sigma(i_2)$ and move $a_{i_2 2}$ to the second position in the product. Continuing, one eventually gets

$$a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{N\sigma(N)} = a_{i_1 1} a_{i_2 2} \cdots a_{i_k k} \cdots a_{i_N N}. \tag{4}$$

The set $i_1, i_2, \ldots, i_N$ is a permutation $\bar{\sigma}$: $\bar{\sigma}(k) = i_k$. Note that $\bar{\sigma} \neq \sigma$ in general. Also, the permutations $\bar{\sigma}$ originating from different terms in expansion (3) are all different. This is so since from $\sigma(i_k) = k$ and $i_k = \bar{\sigma}(k)$ it follows that $\sigma(\bar{\sigma}(k)) = k = (\sigma\bar{\sigma})(k)$. Thus, $\bar{\sigma} = \sigma^{-1}$. Therefore, if we sum all possible terms on the right-hand side of Eq. (4), we sum over all permutations of $S_N$ (as shown earlier, $\{\sigma^{-1}\} = S_N$). The only remaining issue is the sign. The sign is right since we have proved that the parities of $\sigma$ and $\sigma^{-1}$ are the same. This completes the proof.
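The permutation-sum definition (2) and its transposed form (3) can be evaluated literally for a small matrix. This is a sketch for illustration only (the cost grows as $N!$); the sign is obtained from the inversion count, whose parity agrees with that of any decomposition into transpositions, a standard fact assumed here without proof. The example matrix is arbitrary.

```python
from itertools import permutations

def sign(sigma):
    """(-1)^pi_sigma computed from the number of inversions of sigma."""
    n = len(sigma)
    inversions = sum(1 for p in range(n) for q in range(p + 1, n) if sigma[p] > sigma[q])
    return -1 if inversions % 2 else 1

def det_eq2(a):
    """Eq. (2): products a[sigma(1)][1] ... a[sigma(N)][N] (0-based in code)."""
    n = len(a)
    total = 0
    for sigma in permutations(range(n)):
        term = sign(sigma)
        for k in range(n):
            term *= a[sigma[k]][k]
        total += term
    return total

def det_eq3(a):
    """Eq. (3): products a[1][sigma(1)] ... a[N][sigma(N)]."""
    n = len(a)
    total = 0
    for sigma in permutations(range(n)):
        term = sign(sigma)
        for k in range(n):
            term *= a[k][sigma[k]]
        total += term
    return total

A = [[2, 1, 0],
     [4, 3, 1],
     [5, 1, 6]]
d2, d3 = det_eq2(A), det_eq3(A)      # both equal 15 for this matrix
```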

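The next important theorem says that if one interchanges two columns (or rows) in a determinant, the value of the determinant changes sign:

$$|A_{i \leftrightarrow j}| = -|A|,$$

where $A_{i \leftrightarrow j}$ denotes the matrix with such an interchange. The proof is as follows. We can assume without loss of generality that $i < j$. Denote

$$A = \{a_{kl}\}, \qquad A_{i \leftrightarrow j} = \{a'_{kl}\},$$

$$a_{kl} = a'_{kl} \quad \text{if } l \neq i, j, \tag{5}$$

$$a_{ki} = a'_{kj}, \qquad a_{kj} = a'_{ki}. \tag{6}$$

Therefore, in the expansions of $|A|$ and $|A_{i \leftrightarrow j}|$, we can identify identical terms, modulo sign. Pick a term in the expansion of $|A|$,

$$(-1)^{\pi_\sigma} a_{\sigma(1)1} a_{\sigma(2)2} \cdots a_{\sigma(i)i} \cdots a_{\sigma(j)j} \cdots a_{\sigma(N)N},$$

where $\sigma$ is here some fixed permutation of $1, 2, \ldots, N$. To find the corresponding term in the expansion of $|A_{i \leftrightarrow j}|$,

$$|A_{i \leftrightarrow j}| = \sum_{\bar{\sigma}} (-1)^{\pi_{\bar{\sigma}}} a'_{\bar{\sigma}(1)1} a'_{\bar{\sigma}(2)2} \cdots a'_{\bar{\sigma}(i)i} \cdots a'_{\bar{\sigma}(j)j} \cdots a'_{\bar{\sigma}(N)N},$$

we should choose $\bar{\sigma}(k) = \sigma(k)$ for $k \neq i, j$ since, due to (5), $a'_{\bar{\sigma}(k)k} = a_{\sigma(k)k}$ if $k \neq i, j$. Analogously, $\bar{\sigma}(i) = \sigma(j)$ and $\bar{\sigma}(j) = \sigma(i)$ since, due to (6),

$$a'_{\bar{\sigma}(j)j} = a_{\bar{\sigma}(j)i} = a_{\sigma(i)i},$$

where the second equality results from our assumption $\bar{\sigma}(j) = \sigma(i)$, and, similarly,

$$a'_{\bar{\sigma}(i)i} = a_{\bar{\sigma}(i)j} = a_{\sigma(j)j}.$$

This can be done for all $N!$ terms in $|A|$, so that there is a one-to-one correspondence between terms, modulo sign. Since

$$\bar{\sigma}(k) = \begin{cases} \sigma(k) & k \neq i, j \\ (\sigma\tau_{ij})(k) & k = i \text{ or } j \end{cases} \; = (\sigma\tau_{ij})(k)$$

(if $k \neq i, j$, $\tau_{ij}$ has no effect), the permutations $\sigma$ and $\bar{\sigma}$ differ by one transposition and therefore $(-1)^{\pi_{\bar{\sigma}}} = -(-1)^{\pi_\sigma}$, which proves the theorem.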


Another theorem states that if a column of a matrix is a linear combination of two (or more) column matrices, the determinant of this matrix is equal to the linear combination of determinants, each containing one of these column matrices:

$$|A(a_j = \beta b + \gamma c)| = \beta\, |A(a_j = b)| + \gamma\, |A(a_j = c)|. \tag{7}$$

The proof follows from the fact that the definition of the determinant implies that each term in the expansion (2) contains exactly one element from each column and each row. Thus, each term contains the factor $\beta b_i + \gamma c_i$ for some $i$ and can be written as a sum of two terms. Pulling the coefficients in front of the determinants proves the theorem.
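Both the sign change under a column interchange and the linearity property (7) can be checked on an integer example, where the arithmetic is exact. The matrix, the columns `b` and `c`, and the coefficients below are arbitrary illustrative choices, and `det` is a direct (and inefficient) implementation of the permutation sum.

```python
from itertools import permutations

def det(a):
    """Determinant as a signed sum over permutations (column form)."""
    n = len(a)
    total = 0
    for sigma in permutations(range(n)):
        inversions = sum(1 for p in range(n) for q in range(p + 1, n)
                         if sigma[p] > sigma[q])
        term = -1 if inversions % 2 else 1
        for k in range(n):
            term *= a[sigma[k]][k]
        total += term
    return total

def with_column(a, j, column):
    """Copy of a with column j replaced by the given column."""
    return [row[:j] + [column[r]] + row[j + 1:] for r, row in enumerate(a)]

def swap_columns(a, i, j):
    out = [row[:] for row in a]
    for row in out:
        row[i], row[j] = row[j], row[i]
    return out

A = [[2, 1, 0, 3], [4, 3, 1, 1], [5, 1, 6, 2], [1, 0, 2, 7]]
b = [1, 2, 3, 4]
c = [0, -1, 5, 2]
beta, gamma = 3, -2
mixed = [beta * bi + gamma * ci for bi, ci in zip(b, c)]

swap_holds = det(swap_columns(A, 0, 2)) == -det(A)
linear_holds = (det(with_column(A, 1, mixed))
                == beta * det(with_column(A, 1, b)) + gamma * det(with_column(A, 1, c)))
```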

One more theorem, which is the subject of a homework problem, is that the determinant of a product of two matrices is the product of determinants: $|AB| = |A||B|$. This theorem can be used to prove that the determinant of a unitary matrix, i.e., a matrix with the property $UU^\dagger = I$, where the dagger denotes a matrix which is transposed and complex conjugated, is a complex number of modulus 1. Indeed, denoting $z = |U|$,

$$1 = |UU^\dagger| = |U||U^\dagger| = |U||(U^T)^*| = |U||U^*| = |U||U|^* = |z|^2,$$

where we used the theorem about the determinant of a transposed matrix.

Finally, a homework problem shows that the determinant of $A$ can be computed using the so-called Laplace expansion

$$|A| = \sum_i (-1)^{i+j} a_{ij} |M_{ij}| = \sum_j (-1)^{i+j} a_{ij} |M_{ij}|,$$

where the matrix $M_{ij}$ is obtained from the matrix $A$ by removing the $i$th row and $j$th column. In the first sum the column index $j$ is fixed (expansion along a column), whereas in the second sum the row index $i$ is fixed (expansion along a row).
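These last three statements can also be checked numerically. The sketch below uses a recursive Laplace expansion as the determinant routine; the integer matrices `A` and `B` and the 2×2 unitary matrix `U` are arbitrary examples chosen for illustration.

```python
import math

def minor(a, i, j):
    """Matrix obtained from a by removing row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]

def det(a):
    """Laplace expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j)) for j in range(len(a)))

def det_along_column(a, j):
    """Laplace expansion along column j; the sign (-1)^(i+j) is the same
    for 0-based and 1-based indices."""
    return sum((-1) ** (i + j) * a[i][j] * det(minor(a, i, j)) for i in range(len(a)))

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

A = [[2, 1, 0], [4, 3, 1], [5, 1, 6]]
B = [[1, -2, 0], [3, 1, 4], [0, 2, -1]]
product_holds = det(matmul(A, B)) == det(A) * det(B)
laplace_holds = all(det_along_column(A, j) == det(A) for j in range(3))

r = 1 / math.sqrt(2)
U = [[r, r], [1j * r, -1j * r]]      # rows are orthonormal, so U is unitary
modulus = abs(det(U))                # close to 1 up to rounding
```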

III. SEPARATION OF NUCLEAR AND ELECTRONIC MOTION

For a molecule consisting of $K$ particles, nuclei and electrons, the Hamiltonian is

$$H = -\sum_{i=1}^{K} \frac{\hbar^2}{2m_i} \nabla^2_{\mathbf{R}_i} + \sum_{i<j}^{K} \frac{q_i q_j}{|\mathbf{R}_i - \mathbf{R}_j|} \tag{8}$$

where $m_i$ ($q_i$) is the mass (charge) of particle $i$ and all coordinates are measured in a space-fixed coordinate system. This leads to an equation in $3K$ dimensions. For example, for the hydrogen molecule, it is 12-dimensional. While for small molecules it is currently possible to solve Schrödinger's equation with this Hamiltonian, one can easily reduce the dimensionality. First, one can rigorously separate the center-of-mass motion, reducing the number of dimensions by 3. Second, one can approximately separate the electronic and nuclear motions. For the hydrogen molecule, the resulting equation for the electronic motion is then six-dimensional. For molecules larger than the hydrogen molecule, the gain is not as dramatic since the number of electrons in molecules containing heavier atoms is much larger than the number of nuclei. Nevertheless, this separation is always performed since it is easier to solve equations that concern identical particles (i.e., electrons) than several different kinds. The separation of nuclear and electronic motion is a good approximation since a nucleus is at least about 2000 times heavier than an electron and therefore, at comparable kinetic energies, nuclei move about $\sqrt{2000}$ times slower. Thus, as the slow nuclei move, the fast electrons follow them and their distribution around the nuclei is not much different than in the case of stationary nuclei. Such a separation of motions is called the adiabatic approximation. In the case of molecules, we more often use the so-called Born-Oppenheimer (BO) approximation, which is a further simplification of the adiabatic one. The BO approximation, also called the clamped-nuclei approximation, just means that electrons move in the field of nuclei clamped in space. The solutions of the clamped-nuclei Schrödinger equation are called the electronic states.
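The dimension counting in this paragraph is simple arithmetic, sketched below together with the speed ratio quoted in the text. The helper name `dimensions` is hypothetical.

```python
import math

def dimensions(n_nuclei, n_electrons):
    """Dimensionality of Schrödinger's equation at three levels of reduction."""
    full = 3 * (n_nuclei + n_electrons)   # all particles, space-fixed frame
    after_cm = full - 3                   # center-of-mass motion separated
    electronic = 3 * n_electrons          # clamped-nuclei electronic problem
    return full, after_cm, electronic

h2 = dimensions(2, 2)             # (12, 9, 6) for the hydrogen molecule
benzene = dimensions(12, 42)      # the full problem is 162-dimensional
speed_ratio = math.sqrt(2000)     # about 45, for a mass ratio of 2000
```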

In many cases, one has to go beyond the adiabatic approximation. This is needed for small molecules when one needs to get very accurate results, or for molecules of any size in certain regions of nuclear configurations where the adiabatic approximation breaks down due to strong interactions between energetically close electronic states. One usually starts from the adiabatic approximation and solves equations that couple the electronic and nuclear motions in a perturbative fashion, computing in this way the so-called nonadiabatic effects.

A. Hamiltonian in relative coordinates

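To simplify notation, let us restrict our attention to diatomic molecules with nuclear masses $M_1$ and $M_2$. A generalization to molecules with more nuclei is straightforward. Let $\mathbf{R}_i$, $i = 1, 2$, denote the coordinates of the two nuclei, whereas the coordinates of the $N$ electrons will be denoted by $\mathbf{r}'_i$, all coordinates still in a space-fixed system (the prime distinguishes them from the relative electronic coordinates $\mathbf{r}_i$ introduced below). Now introduce the center of mass (CM)

$$\mathbf{R}_{\rm CM} = \frac{1}{M} \left( M_1 \mathbf{R}_1 + M_2 \mathbf{R}_2 + m_e \sum_{i=1}^{N} \mathbf{r}'_i \right),$$

where $m_e$ is the mass of an electron and $M = M_1 + M_2 + N m_e$ is the total mass of the molecule, and the relative coordinates

$$\mathbf{R} = \mathbf{R}_1 - \mathbf{R}_2, \qquad \mathbf{r}_i = \mathbf{r}'_i - \frac{1}{2} \left( \mathbf{R}_1 + \mathbf{R}_2 \right).$$

We have chosen to measure electronic positions from the geometric center of the nuclei. Another possible choice is to measure them from the center of nuclear mass.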

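To transform the Hamiltonian (8), we have to perform some chain-rule differentiations corresponding to the following change of variables: $[\mathbf{R}_1, \mathbf{R}_2, \mathbf{r}'_1, \ldots, \mathbf{r}'_N] \to [\mathbf{R}_{\rm CM}, \mathbf{R}, \mathbf{r}_1, \ldots, \mathbf{r}_N]$, where primes mark the space-fixed electronic coordinates. For the $x$ components,

$$\frac{\partial}{\partial X_1} = \frac{\partial X_{\rm CM}}{\partial X_1} \frac{\partial}{\partial X_{\rm CM}} + \frac{\partial X}{\partial X_1} \frac{\partial}{\partial X} + \sum_i \frac{\partial x_i}{\partial X_1} \frac{\partial}{\partial x_i} = \frac{M_1}{M} \frac{\partial}{\partial X_{\rm CM}} + \frac{\partial}{\partial X} - \frac{1}{2} \sum_i \frac{\partial}{\partial x_i},$$

$$\frac{\partial}{\partial X_2} = \frac{M_2}{M} \frac{\partial}{\partial X_{\rm CM}} - \frac{\partial}{\partial X} - \frac{1}{2} \sum_i \frac{\partial}{\partial x_i},$$

$$\frac{\partial}{\partial x'_i} = \frac{m_e}{M} \frac{\partial}{\partial X_{\rm CM}} + \frac{\partial}{\partial x_i}.$$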

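Now the second derivatives (with $x'_i$ denoting the space-fixed electronic coordinate and $x_i$ the relative one):

$$\frac{\partial^2}{\partial X_1^2} = \left( \frac{M_1}{M} \right)^2 \frac{\partial^2}{\partial X_{\rm CM}^2} + \frac{\partial^2}{\partial X^2} + \frac{1}{4} \left( \sum_i \frac{\partial}{\partial x_i} \right)^2 + 2 \frac{M_1}{M} \frac{\partial}{\partial X_{\rm CM}} \frac{\partial}{\partial X} - \frac{M_1}{M} \frac{\partial}{\partial X_{\rm CM}} \sum_i \frac{\partial}{\partial x_i} - \frac{\partial}{\partial X} \sum_i \frac{\partial}{\partial x_i},$$

$$\frac{\partial^2}{\partial X_2^2} = \left( \frac{M_2}{M} \right)^2 \frac{\partial^2}{\partial X_{\rm CM}^2} + \frac{\partial^2}{\partial X^2} + \frac{1}{4} \left( \sum_i \frac{\partial}{\partial x_i} \right)^2 - 2 \frac{M_2}{M} \frac{\partial}{\partial X_{\rm CM}} \frac{\partial}{\partial X} - \frac{M_2}{M} \frac{\partial}{\partial X_{\rm CM}} \sum_i \frac{\partial}{\partial x_i} + \frac{\partial}{\partial X} \sum_i \frac{\partial}{\partial x_i},$$

$$\frac{\partial^2}{\partial x_i'^2} = \left( \frac{m_e}{M} \right)^2 \frac{\partial^2}{\partial X_{\rm CM}^2} + \frac{\partial^2}{\partial x_i^2} + 2 \frac{m_e}{M} \frac{\partial}{\partial X_{\rm CM}} \frac{\partial}{\partial x_i}.$$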

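Plugging these derivatives into the kinetic energy part of the Hamiltonian gives

$$\begin{aligned}
T_x ={}& -\frac{\hbar^2}{2} \frac{M_1}{M^2} \frac{\partial^2}{\partial X_{\rm CM}^2} - \frac{\hbar^2}{2M_1} \frac{\partial^2}{\partial X^2} - \frac{\hbar^2}{8M_1} \left( \sum_i \frac{\partial}{\partial x_i} \right)^2 - \frac{\hbar^2}{M} \frac{\partial}{\partial X_{\rm CM}} \frac{\partial}{\partial X} + \frac{\hbar^2}{2M} \frac{\partial}{\partial X_{\rm CM}} \sum_i \frac{\partial}{\partial x_i} + \frac{\hbar^2}{2M_1} \frac{\partial}{\partial X} \sum_i \frac{\partial}{\partial x_i} \\
& -\frac{\hbar^2}{2} \frac{M_2}{M^2} \frac{\partial^2}{\partial X_{\rm CM}^2} - \frac{\hbar^2}{2M_2} \frac{\partial^2}{\partial X^2} - \frac{\hbar^2}{8M_2} \left( \sum_i \frac{\partial}{\partial x_i} \right)^2 + \frac{\hbar^2}{M} \frac{\partial}{\partial X_{\rm CM}} \frac{\partial}{\partial X} + \frac{\hbar^2}{2M} \frac{\partial}{\partial X_{\rm CM}} \sum_i \frac{\partial}{\partial x_i} - \frac{\hbar^2}{2M_2} \frac{\partial}{\partial X} \sum_i \frac{\partial}{\partial x_i} \\
& -\frac{\hbar^2}{2} \frac{N m_e}{M^2} \frac{\partial^2}{\partial X_{\rm CM}^2} - \frac{\hbar^2}{2m_e} \sum_i \frac{\partial^2}{\partial x_i^2} - \frac{\hbar^2}{M} \frac{\partial}{\partial X_{\rm CM}} \sum_i \frac{\partial}{\partial x_i}.
\end{aligned}$$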

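Terms 4 and 10 cancel, as do terms 5, 11, and 15. Terms 1, 7, and 13 can be added together, and the masses in the numerators add up to $M$. We therefore get

$$T_x = -\frac{\hbar^2}{2M} \frac{\partial^2}{\partial X_{\rm CM}^2} - \frac{\hbar^2}{2\mu} \frac{\partial^2}{\partial X^2} - \frac{\hbar^2}{2m_e} \sum_i \frac{\partial^2}{\partial x_i^2} - \frac{\hbar^2}{8\mu} \left( \sum_i \frac{\partial}{\partial x_i} \right)^2 + \frac{\hbar^2}{2} \left( \frac{1}{M_1} - \frac{1}{M_2} \right) \frac{\partial}{\partial X} \sum_i \frac{\partial}{\partial x_i},$$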

where $1/\mu = 1/M_1 + 1/M_2$. Since the CM coordinates appear only in the first term, the center-of-mass motion can be separated. After adding the terms in the other two directions, the remaining Hamiltonian, expressed only in relative coordinates, can be written as

$$H = -\frac{\hbar^2}{2\mu} \nabla^2_{\mathbf{R}} - \frac{\hbar^2}{2m_e} \sum_i \nabla^2_{\mathbf{r}_i} - \frac{\hbar^2}{8\mu} \left( \sum_i \nabla_{\mathbf{r}_i} \right)^2 - \frac{\hbar^2}{2} \frac{1}{\mu_a} \nabla_{\mathbf{R}} \cdot \sum_i \nabla_{\mathbf{r}_i} + V,$$

where we denoted $\frac{1}{M_2} - \frac{1}{M_1} = \frac{1}{\mu_a}$ and $V$ denotes the second term in the Hamiltonian (8). Since this term contains only interparticle distances, it is unaffected by the transformation.
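The coefficients of the transformed kinetic energy can be cross-checked without writing out all fifteen terms. The sketch below rests on the observation that, for a linear change of variables with Jacobian $J$, the operator $\sum_{\rm old} m_{\rm old}^{-1} \partial^2_{\rm old}$ becomes $\sum_{ab} G_{ab}\, \partial_a \partial_b$ with $G = J\, \mathrm{diag}(1/m)\, J^T$. The function name `kinetic_metric` and the numerical masses are hypothetical choices for illustration; exact rational arithmetic is used so that the comparisons are exact.

```python
from fractions import Fraction

def kinetic_metric(M1, M2, me, N):
    """G[a][b] = sum_old J[a][old] J[b][old] / m_old, so that
    T_x = -(hbar^2/2) sum_ab G[a][b] d/dq_a d/dq_b with
    new coordinates q = (X_CM, X, x_1..x_N) and old (X_1, X_2, x'_1..x'_N)."""
    M = M1 + M2 + N * me
    masses = [M1, M2] + [me] * N
    half = Fraction(1, 2)
    J = [[M1 / M, M2 / M] + [me / M] * N,                   # X_CM
         [Fraction(1), Fraction(-1)] + [Fraction(0)] * N]   # X = X_1 - X_2
    for i in range(N):                                      # x_i = x'_i - (X_1 + X_2)/2
        J.append([-half, -half] + [Fraction(int(k == i)) for k in range(N)])
    size = N + 2
    return [[sum(J[a][o] * J[b][o] / masses[o] for o in range(size))
             for b in range(size)] for a in range(size)]

M1, M2, me, N = Fraction(7), Fraction(3), Fraction(1, 5), 3   # arbitrary test masses
G = kinetic_metric(M1, M2, me, N)
M = M1 + M2 + N * me
inv_mu = 1 / M1 + 1 / M2
inv_mu_a = 1 / M2 - 1 / M1
```

In this notation the cross term of the Hamiltonian corresponds to `2 * G[1][2 + i]`, which equals `inv_mu_a`, and the vanishing of `G[0][1]` and `G[0][2 + i]` expresses the separation of the center-of-mass motion.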

B. Born-Oppenheimer approximation

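The Hamiltonian can be divided into two parts

$$H = H_0 + H' \tag{9}$$

$$H_0 = -\frac{\hbar^2}{2m_e} \sum_i \nabla^2_{\mathbf{r}_i} + V \tag{10}$$

$$H' = -\frac{\hbar^2}{2\mu} \nabla^2_{\mathbf{R}} - \frac{\hbar^2}{8\mu} \left( \sum_i \nabla_{\mathbf{r}_i} \right)^2 - \frac{\hbar^2}{2} \frac{1}{\mu_a} \nabla_{\mathbf{R}} \cdot \sum_i \nabla_{\mathbf{r}_i}. \tag{11}$$

The Hamiltonian $H_0$ is called the electronic Hamiltonian since it acts only on electronic coordinates. It is also called the clamped-nuclei Hamiltonian since it describes the system if $H'$ is neglected (and $H'$ becomes zero if the nuclear masses go to infinity, so that the nuclei do not move, i.e., are clamped in space). Such an approach is called the BO approximation. The electronic Schrödinger equation is

$$\left[ -\frac{\hbar^2}{2m_e} \sum_i \nabla^2_{\mathbf{r}_i} + V \right] \psi(\mathbf{r}_1, \ldots, \mathbf{r}_N; R) = E(R)\, \psi(\mathbf{r}_1, \ldots, \mathbf{r}_N; R).$$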

Since the equation is different for each internuclear separation $R = |\mathbf{R}|$, the wave function and the energy depend parametrically on $R$. We use the word "parametrically" to emphasize that $R$ is not a variable in the electronic Schrödinger equation, but the equation has to be solved separately for each value of $R$ that is of interest. For molecules with more than two nuclei, the electronic wave function depends parametrically on the positions of all nuclei.

Despite the name "clamped-nuclei" approximation, one solves for the nuclear motion in the BO approximation. To do this, one assumes the exact wave function to be a product of the electronic wave function and of a function of $\mathbf{R}$

$$\Psi(\mathbf{r}_1, \ldots, \mathbf{r}_N; \mathbf{R}) \approx \psi(\mathbf{r}_1, \ldots, \mathbf{r}_N; R)\, f(\mathbf{R}).$$

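Next, approximate $H'$ by its first term and plug this function into the approximate Schrödinger equation

$$\left[ -\frac{\hbar^2}{2\mu} \nabla^2_{\mathbf{R}} - \frac{\hbar^2}{2m_e} \sum_i \nabla^2_{\mathbf{r}_i} + V \right] \psi(\mathbf{r}_1, \ldots, \mathbf{r}_N; R)\, f(\mathbf{R}) = E\, \psi(\mathbf{r}_1, \ldots, \mathbf{r}_N; R)\, f(\mathbf{R}).$$

The function $f$ can be pulled out from the second and third terms of the Hamiltonian. We now make one more approximation and neglect the terms resulting from the action of the first term on $\psi$. Then we can write

$$-\frac{\hbar^2}{2\mu} \psi(\mathbf{r}_1, \ldots, \mathbf{r}_N; R)\, \nabla^2_{\mathbf{R}} f(\mathbf{R}) + f(\mathbf{R}) \left[ -\frac{\hbar^2}{2m_e} \sum_i \nabla^2_{\mathbf{r}_i} + V \right] \psi(\mathbf{r}_1, \ldots, \mathbf{r}_N; R) = E\, \psi(\mathbf{r}_1, \ldots, \mathbf{r}_N; R)\, f(\mathbf{R}).$$

The bracket acting on $\psi$ gives $E(R)\psi$ by the electronic Schrödinger equation. We then multiply from the left by $\psi^*$ and integrate over the electron coordinates, assuming $\langle \psi | \psi \rangle = 1$ for all $R$. We then get

$$\left[ -\frac{\hbar^2}{2\mu} \nabla^2_{\mathbf{R}} + E(R) \right] f(\mathbf{R}) = E f(\mathbf{R}).$$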

Thus, the electronic energy becomes the potential energy surface for the motion of nuclei.
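As a small illustration of the electronic energy acting as a potential for the nuclei, the sketch below takes a model curve $E(R)$, extracts its curvature at the minimum, and forms the harmonic estimate of the lowest nuclear level. Everything specific here is an assumption made for illustration: the Morse form of `morse`, its parameters, the reduced mass `mu`, units with $\hbar = 1$, and the harmonic approximation itself, which is not part of the derivation above.

```python
import math

def morse(R, D=1.0, a=1.5, R0=2.0):
    """Hypothetical electronic energy curve E(R) in arbitrary units."""
    return D * (1.0 - math.exp(-a * (R - R0))) ** 2

def second_derivative(func, x, step=1e-4):
    """Central finite-difference estimate of the second derivative."""
    return (func(x + step) - 2.0 * func(x) + func(x - step)) / step ** 2

mu = 900.0                                 # hypothetical reduced nuclear mass
k = second_derivative(morse, 2.0)          # curvature of E(R) at its minimum
omega = math.sqrt(k / mu)                  # harmonic vibrational frequency
zero_point = 0.5 * omega                   # harmonic estimate of the lowest level
```

For the Morse form the curvature at the minimum is $2Da^2$, which the finite-difference value should reproduce closely; a quantitative treatment would solve the nuclear equation for the actual $E(R)$.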

C. Adiabatic approximation and nonadiabatic correction

The BO approximation discussed above can be obtained from a more rigorous procedure that originates from the exact solutions of the Schrödinger equation for all particles. We can expand such a solution in complete basis sets in electronic and nuclear coordinates using the theorem discussed earlier

$$\Psi(\mathbf{r}_1, \ldots, \mathbf{r}_N, \mathbf{R}) = \sum_{ij} c_{ij}\, \psi_i(\mathbf{r}_1, \ldots, \mathbf{r}_N)\, g_j(\mathbf{R}) = \sum_i \psi_i(\mathbf{r}_1, \ldots, \mathbf{r}_N) \sum_j c_{ij}\, g_j(\mathbf{R}) = \sum_i \psi_i(\mathbf{r}_1, \ldots, \mathbf{r}_N)\, f_i(\mathbf{R}),$$

where the last, equivalent form is more convenient to use. However, since we want to use the solutions of the electronic Schrödinger equation rather than some arbitrary complete basis set, our expansion becomes

$$\Psi(\mathbf{r}_1, \ldots, \mathbf{r}_N, \mathbf{R}) = \sum_j \psi_j(\mathbf{r}_1, \ldots, \mathbf{r}_N; R)\, f_j(\mathbf{R}). \tag{12}$$

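One can view this expression as using a different complete basis set for each $R$.

We now insert the expansion (12) into Schrödinger's equation (with the CM motion separated), multiply by $\psi_i^*(\mathbf{r}_1, \ldots, \mathbf{r}_N; R)$, and integrate over electronic coordinates. Let us work out the first term in the operator $H'$:

$$-\frac{\hbar^2}{2\mu} \sum_j \langle \psi_i | \nabla^2_{\mathbf{R}}\, \psi_j f_j \rangle = -\frac{\hbar^2}{2\mu} \sum_j \left[ \langle \psi_i | (\nabla^2_{\mathbf{R}} \psi_j) \rangle f_j + \langle \psi_i | \psi_j \rangle \nabla^2_{\mathbf{R}} f_j + 2 \langle \psi_i | (\nabla_{\mathbf{R}} \psi_j) \rangle \cdot \nabla_{\mathbf{R}} f_j \right], \tag{13}$$

where the parentheses inside the integrals indicate that differentiations with respect to $\mathbf{R}$ are performed only inside the parentheses. Similarly, for the third term we get

$$-\frac{\hbar^2}{2\mu_a} \sum_j \Big\langle \psi_i \Big| \nabla_{\mathbf{R}} \cdot \sum_k \nabla_{\mathbf{r}_k} \psi_j f_j \Big\rangle = \tag{14}$$

$$-\frac{\hbar^2}{2\mu_a} \sum_j \left[ \Big\langle \psi_i \Big| \Big( \nabla_{\mathbf{R}} \cdot \sum_k \nabla_{\mathbf{r}_k} \psi_j \Big) \Big\rangle f_j + \Big\langle \psi_i \Big| \sum_k \nabla_{\mathbf{r}_k} \psi_j \Big\rangle \cdot \nabla_{\mathbf{R}} f_j \right]. \tag{15}$$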

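The sum of the first term in Eq. (13), of the matrix element of the second operator in Eq. (11), and of the first term on the right-hand side of Eqs. (14)-(15) can be written as

$$-\frac{\hbar^2}{2\mu} \sum_j \langle \psi_i | (\nabla^2_{\mathbf{R}} \psi_j) \rangle f_j - \frac{\hbar^2}{8\mu} \sum_j \Big\langle \psi_i \Big| \Big( \sum_k \nabla_{\mathbf{r}_k} \Big)^2 \psi_j \Big\rangle f_j - \frac{\hbar^2}{2\mu_a} \sum_j \Big\langle \psi_i \Big| \Big( \nabla_{\mathbf{R}} \cdot \sum_k \nabla_{\mathbf{r}_k} \psi_j \Big) \Big\rangle f_j = \sum_j H'_{ij} f_j \tag{16}$$

where $H'_{ij}$ are the matrix elements of $H'$ between the electronic wave functions, with $H'$ interpreted in such a way that it does not act outside the integrals, i.e., $H'_{ij}$ are simply functions of $R$. With this definition, we can write Schrödinger's equation as

$$-\frac{\hbar^2}{2\mu} \nabla^2_{\mathbf{R}} f_i(\mathbf{R}) + E_i(R) f_i(\mathbf{R}) - E f_i(\mathbf{R}) + \sum_j H'_{ij}(R) f_j(\mathbf{R}) - \frac{\hbar^2}{\mu} \sum_j \langle \psi_i | (\nabla_{\mathbf{R}} \psi_j) \rangle \cdot \nabla_{\mathbf{R}} f_j(\mathbf{R}) - \frac{\hbar^2}{2\mu_a} \sum_j \Big\langle \psi_i \Big| \sum_k \nabla_{\mathbf{r}_k} \psi_j \Big\rangle \cdot \nabla_{\mathbf{R}} f_j(\mathbf{R}) = 0 \tag{17}$$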

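where we used the orthonormality of the electronic wave functions for each $R$ to obtain the first three terms. The last two terms will be written as

$$\mathbf{B}_{ij}(R) \cdot \nabla_{\mathbf{R}} f_j(\mathbf{R}) = -\hbar^2 \left[ \frac{1}{\mu} \langle \psi_i | (\nabla_{\mathbf{R}} \psi_j) \rangle + \frac{1}{2\mu_a} \Big\langle \psi_i \Big| \sum_k \nabla_{\mathbf{r}_k} \psi_j \Big\rangle \right] \cdot \nabla_{\mathbf{R}} f_j(\mathbf{R}).$$

We will now show that $\mathbf{B}_{ii}(R) = 0$ for real electronic functions (one can always choose electronic functions to be real, for a proof see Shankar p. 177). This is because we have

$$0 = \nabla_{\mathbf{R}} \langle \psi_i | \psi_i \rangle = \langle \nabla_{\mathbf{R}} \psi_i | \psi_i \rangle + \langle \psi_i | \nabla_{\mathbf{R}} \psi_i \rangle = 2 \langle \psi_i | \nabla_{\mathbf{R}} \psi_i \rangle,$$

so that the first term is zero. The second term is zero since it is proportional to the expectation value of the momentum operator. The latter value is zero since for a real wave function the probabilities of finding momentum $P$ and $-P$ are the same (in one dimension, $|\langle e^{-ipx/\hbar} | \psi \rangle|^2 = |\langle e^{ipx/\hbar} | \psi \rangle|^2$, and this result generalizes to any number of dimensions). Now we can move the off-diagonal terms to the right-hand side, getting

$$\left[ -\frac{\hbar^2}{2\mu} \nabla^2_{\mathbf{R}} + E_i(R) + H'_{ii}(R) - E \right] f_i(\mathbf{R}) = -\sum_{j \neq i} \left[ H'_{ij}(R) f_j(\mathbf{R}) + \mathbf{B}_{ij}(R) \cdot \nabla_{\mathbf{R}} f_j(\mathbf{R}) \right] \tag{18}$$

Note that this equation is still equivalent to Schrödinger's equation. This set of coupled equations can be solved directly for very small molecules, but usually one solves it perturbatively, treating the right-hand side as a perturbation.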
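The argument used for the vanishing of the diagonal elements $\mathbf{B}_{ii}$, namely that $\langle \psi | \nabla \psi \rangle = 0$ for a real, square-integrable function, can be checked on a one-dimensional grid. The sketch below is illustrative only: the two model functions, the grid, and the central-difference derivative are assumptions, and the complex function is included merely as a contrast for which the overlap does not vanish.

```python
import cmath
import math

H = 0.01
GRID = [-8.0 + k * H for k in range(1601)]     # covers [-8, 8]

def derivative_overlap(psi):
    """Approximate <psi | d psi/dx> with central differences and a grid sum."""
    values = [psi(x) for x in GRID]
    total = 0.0
    for k in range(1, len(values) - 1):
        dpsi = (values[k + 1] - values[k - 1]) / (2.0 * H)
        total += values[k].conjugate() * dpsi * H
    return total

# A real function without any particular symmetry
real_psi = lambda x: (1.0 + 0.3 * x) * math.exp(-0.5 * (x - 0.7) ** 2)
# A complex function carrying a plane-wave factor exp(2ix)
complex_psi = lambda x: cmath.exp(2.0j * x) * math.exp(-0.5 * x * x)

real_result = derivative_overlap(real_psi)         # vanishes up to rounding
complex_result = derivative_overlap(complex_psi)   # close to 2i*sqrt(pi)
```

For the complex function the overlap is purely imaginary, consistent with a nonzero expectation value of the momentum $-i\hbar\, d/dx$.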

The last form of Schrödinger's equation is appropriate for making the approximations discussed above. Since the off-diagonal matrix elements are usually smaller than the diagonal ones, one obvious approximation is to neglect the right-hand side. This gives the adiabatic approximation. The resulting equation for $f_i(\mathbf{R})$ differs from the BO equation by the term $H'_{ii}(R)$, which is called the adiabatic or diagonal correction. Thus, the BO approximation differs from the adiabatic approximation by this correction. The adiabatic equation is of the same degree of difficulty as the BO equation since in each case the nuclei move on a potential energy surface. Since the diagonal correction is usually small, it is neglected in most current calculations.

The adiabatic approximation fails when two potential energy surfaces $E_i(R)$ and $E_{i'}(R)$ become close to each other. Clearly, in such cases some off-diagonal matrix elements are not significantly smaller than the diagonal ones, since there are two electronic wave functions which are similar. In such cases, one has to include at least the off-diagonal matrix elements that couple these states.

IV. THE INDEPENDENT-PARTICLE MODEL: THE HARTREE-FOCK METHOD

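The problem to solve is the time-independent Schrödinger equation with the Hamiltonian

$$H = -\frac{\hbar^2}{2m}\sum_{i=1}^{N}\nabla_i^2 - \sum_{a=1}^{N_{\rm nuc}}\sum_{i=1}^{N}\frac{Z_a e^2}{|r_i - R_a|} + \frac{1}{2}\sum_{i\neq j}^{N}\frac{e^2}{|r_i - r_j|} \qquad (19)$$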

where $m$ denotes the electron's mass, $e$ the electron's charge, $N$ is the number of electrons, $N_{\rm nuc}$ is the number of nuclei, $Z_a$ is the charge of nucleus $a$, $r_i$ are the positions of the electrons, and $R_a$ are the positions of the nuclei. Note that this Hamiltonian is the same as the Hamiltonian defined by Eq. (10) except that we neglected the nuclear-nuclear repulsion terms. These terms give just a constant in any type of electronic structure approach and this constant can simply be added to the final result. We also dropped the subscript "0" since this will be the only Hamiltonian considered from now on.

Despite the simplification of eliminating the nuclear degrees of freedom, the solution of the clamped-nuclei Schrödinger equation for even simple molecules, such as the water molecule with 10 electrons and 30 spatial degrees of freedom, appears to be an impossible task. The main idea for simplification that may come to mind is to solve such an equation one electron at a time, which is then a 3-dimensional problem. In the most straightforward approach, this would mean that one neglects all interactions between electrons in the Hamiltonian (19). With such an approximation, the problem rigorously separates into $N$ one-electron problems when the wave function is written as a product of one-electron functions. However, this straightforward independent-particle model works poorly. In particular, when an electron in a molecule or solid is far from a nucleus, it does not see an object of charge $Z_a$ since other electrons screen the nuclear charge. There were several efforts in the early days of quantum mechanics to scale nuclear charges to account for the screening. One step further is to include in the one-electron equation an interaction with an electron cloud representing an average of the electron positions, leading to a family of mean-field methods. It turns out there is a rigorous and systematic way of achieving the best possible representation of the mean field, called the Hartree-Fock (HF) method. The wave function in this method is an antisymmetrized product of one-electron functions and the method still requires solving only one-electron equations; however, the set of equations is coupled.

A. Slater determinant and antisymmetrizer

The wave function in the HF method is written in the form of a Slater determinant

$$\Psi(x_1,x_2,\ldots,x_N) = \frac{1}{\sqrt{N!}}
\begin{vmatrix}
\phi_{k_1}(x_1) & \phi_{k_1}(x_2) & \cdots & \phi_{k_1}(x_N)\\
\phi_{k_2}(x_1) & \phi_{k_2}(x_2) & \cdots & \phi_{k_2}(x_N)\\
\vdots & \vdots & & \vdots\\
\phi_{k_N}(x_1) & \phi_{k_N}(x_2) & \cdots & \phi_{k_N}(x_N)
\end{vmatrix} \qquad (20)$$

where $x_i = (r_i, s_i)$ denotes the spatial and spin coordinates of the $i$th electron and the single-electron functions $\phi_{k_i}(x_j)$ are called spinorbitals. The spin variable takes on the values $\pm\frac{1}{2}$. We will use only pure-spin spinorbitals, which means that a given $\phi_i(x)$ has to be zero at $s = \frac{1}{2}$ and nonzero at $s = -\frac{1}{2}$ or vice versa. The spinorbitals form a complete set of one-electron functions and the subset included in the Slater determinant is an arbitrary subset of such spinorbitals. Of course, as follows from the properties of determinants, all spinorbitals have to be different, otherwise the determinant is zero. As we will show soon, the normalization factor assures that the determinant is normalized to 1 if the set of spinorbitals is orthonormal.
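The two determinant properties just mentioned (antisymmetry under exchange of electrons and vanishing for repeated spinorbitals) can be checked numerically. The following is a minimal sketch, not part of the original notes: the particle-in-a-box spinorbitals and the helper names `slater_determinant` and `make_spinorbital` are illustrative assumptions.

```python
import math
import numpy as np

def slater_determinant(orbitals, coords):
    """Value of the normalized Slater determinant of Eq. (20).

    orbitals: list of N callables phi_k(x); coords: list of N coordinates x_i = (r_i, s_i).
    """
    n = len(orbitals)
    # Row k, column i holds phi_k(x_i), matching the layout of Eq. (20).
    matrix = np.array([[phi(x) for x in coords] for phi in orbitals], dtype=float)
    return np.linalg.det(matrix) / math.sqrt(math.factorial(n))

def make_spinorbital(n, spin):
    """Hypothetical pure-spin spinorbital: 1D box function times a spin selector."""
    def phi(x):
        r, s = x
        return math.sqrt(2.0) * math.sin(n * math.pi * r) if s == spin else 0.0
    return phi

orbitals = [make_spinorbital(1, +0.5), make_spinorbital(1, -0.5), make_spinorbital(2, +0.5)]
coords = [(0.21, +0.5), (0.48, -0.5), (0.77, +0.5)]

psi = slater_determinant(orbitals, coords)
# Exchanging electrons 1 and 2 swaps two columns and flips the sign.
psi_swapped = slater_determinant(orbitals, [coords[1], coords[0], coords[2]])
# Using the same spinorbital twice gives two equal rows and a vanishing determinant.
psi_repeated = slater_determinant([orbitals[0], orbitals[0], orbitals[2]], coords)
```

Here `psi_swapped` equals `-psi` and `psi_repeated` vanishes to rounding accuracy, which is all the sketch is meant to show.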

The determinant can also be written as the result of the action of an operator $A$ called the antisymmetrizer

$$A = \frac{1}{N!}\sum_\sigma (-1)^{\pi_\sigma} P_\sigma \qquad (21)$$

where the sum is over all $N!$ permutations $P_\sigma$ of $N$ electrons. Since $P_\sigma$ now acts on electron coordinates, we call it an operator. The normalization factor assures that the antisymmetrizer is idempotent, i.e., $A^2 = A$. This can be seen from the following

$$A^2 = \frac{1}{(N!)^2}\sum_{\sigma,\sigma'} (-1)^{\pi_\sigma}(-1)^{\pi_{\sigma'}} P_\sigma P_{\sigma'}.$$

Consider a fixed value of $\sigma$. The products of $P_\sigma$ with all operators $P_{\sigma'}$ give the set of all $S_N$ operators. Thus, as we sum over $\sigma$, we get $N!$ times the set of all permutation operators. The result will therefore be equal to $A$ if the signs are right. This is the case since if we write $P_{\sigma''} = P_\sigma P_{\sigma'}$ and expand $P_\sigma$ and $P_{\sigma'}$ into products of transpositions, we see that the number of transpositions in $P_{\sigma''}$ is the sum of the numbers of transpositions in $P_\sigma$ and $P_{\sigma'}$, so that $(-1)^{\pi_{\sigma''}} = (-1)^{\pi_\sigma}(-1)^{\pi_{\sigma'}}$.
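The idempotency argument can be illustrated on a discretized model in which an $N$-electron function is an array with one index per electron and $P_\sigma$ permutes the array axes. This is only a sketch under that assumption; the grid size and the names `parity_sign` and `antisymmetrize` are illustrative.

```python
import itertools
import math
import numpy as np

def parity_sign(perm):
    """Return (-1)^pi for a permutation given as a tuple, via its inversion count."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def antisymmetrize(psi):
    """Apply A = (1/N!) sum_sigma (-1)^pi_sigma P_sigma to an array psi[x1, ..., xN]."""
    n = psi.ndim
    total = np.zeros_like(psi)
    for perm in itertools.permutations(range(n)):
        # P_sigma relabels the electron coordinates, i.e. permutes the array axes.
        total = total + parity_sign(perm) * np.transpose(psi, perm)
    return total / math.factorial(n)

rng = np.random.default_rng(0)
psi = rng.normal(size=(4, 4, 4))    # arbitrary 3-electron function on a 4-point grid
a_psi = antisymmetrize(psi)
aa_psi = antisymmetrize(a_psi)      # applying A a second time changes nothing
```

In this model `aa_psi` coincides with `a_psi`, and `a_psi` changes sign when any two of its axes are swapped.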

Acting with $A$ on the product of spinorbitals,

$$\Psi(x_1,x_2,\ldots,x_N) = \sqrt{N!}\,A\left(\phi_{k_1}(x_1)\phi_{k_2}(x_2)\ldots\phi_{k_N}(x_N)\right),$$

we indeed get the Slater determinant since the antisymmetrizer realizes the definition of the determinant.

Let us now prove that the Slater determinant is normalized if the spinorbitals are orthonormal:

$$\langle\phi_i|\phi_j\rangle = \sum_s\int\phi_i^*(x)\phi_j(x)\,d^3r = \delta_{ij}$$

where we defined the bracket notation that will be used for integrals from now on. Notice that the bracket includes summation over the spins. This summation runs over $s = \frac{1}{2}$ and $s = -\frac{1}{2}$. If $i = j$, the spinorbital is nonzero at only one of the values of $s$. Two spinorbitals may have the same spatial part but differ by spin. Then for each value of $s$ one of the spinorbitals is zero, so that the two are orthogonal. Other pairs of different spinorbitals can be orthogonal already due to different spins and/or due to orthogonality of the spatial components (usually one assumes, however, that different spatial parts are always orthogonal). The overlap integral can be written as

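$$\langle\Psi|\Psi\rangle = N!\,\langle A\left(\phi_{k_1}(x_1)\phi_{k_2}(x_2)\ldots\phi_{k_N}(x_N)\right)|A\left(\phi_{k_1}(x_1)\phi_{k_2}(x_2)\ldots\phi_{k_N}(x_N)\right)\rangle$$

where the brackets denote integration over the space and spin coordinates of all electrons. Since $A$ is obviously Hermitian and we have shown that it is idempotent, we can move it to the ket, getting

$$\langle\Psi|\Psi\rangle = N!\,\langle\phi_{k_1}(x_1)\phi_{k_2}(x_2)\ldots\phi_{k_N}(x_N)|A\left(\phi_{k_1}(x_1)\phi_{k_2}(x_2)\ldots\phi_{k_N}(x_N)\right)\rangle.$$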

Consider first the identity permutation in $A$. For this term, the integral separates into the product of $N$ one-electron integrals, with the integrand in each one-electron integral being the square modulus of a spinorbital. Thus, each such integral is 1. Now consider the term in which electron 1 is permuted with electron 2:

$$\langle\phi_{k_1}(x_1)\phi_{k_2}(x_2)\ldots\phi_{k_N}(x_N)|\phi_{k_1}(x_2)\phi_{k_2}(x_1)\ldots\phi_{k_N}(x_N)\rangle.$$

Now we get two integrals that are zero: $\langle\phi_{k_1}(x_1)|\phi_{k_2}(x_1)\rangle$ and $\langle\phi_{k_2}(x_2)|\phi_{k_1}(x_2)\rangle$. Thus, this contribution is zero. Clearly, any permutation of electrons in the ket leads to a zero term. Thus, $A$ reduces to $\frac{1}{N!}I$ and $\langle\Psi|\Psi\rangle = 1$.

B. Slater-Condon rules

Matrix elements of the Hamiltonian (19) with Slater determinants can be written in terms of matrix elements between spinorbitals using the so-called Slater-Condon rules. Let us define the operators

$$h(i) = -\frac{\hbar^2}{2m}\nabla_i^2 - \sum_{a=1}^{N_{\rm nuc}}\frac{Z_a e^2}{|r_i - R_a|} \qquad (22)$$

$$F = \sum_{i=1}^{N} h(i) \qquad (23)$$

$$G = \frac{1}{2}\sum_{i\neq j}^{N}\frac{e^2}{|r_i - r_j|} = \frac{1}{2}\sum_{i\neq j}^{N} g(ij) \qquad (24)$$

where we introduced a short-hand notation replacing $x_i$ by $i$. The rules for the operator $F$ are

$$\langle\Psi|F\Psi\rangle = \sum_{i=1}^{N} h_{ii} \qquad (25)$$

$$\langle\Psi|F\Psi'\rangle = h_{ik} \qquad (26)$$

$$\langle\Psi|F\Psi''\rangle = 0 \qquad (27)$$

where $\Psi$ denotes the determinant built from the set of spinorbitals $\phi_1, \phi_2, \ldots, \phi_N$, $\Psi'$ differs from $\Psi$ by the replacement of the spinorbital $\phi_i$ by the spinorbital $\phi_k$, $k > N$, and $\Psi''$ includes two such replacements. We have also introduced a short-hand notation for the integrals, i.e., $h_{ik} = \langle\phi_i|h\phi_k\rangle$. The rules are actually valid for any set of orbitals, but we will need them only for the set specified.

The proof of Eq. (25) is as follows. As we did when proving the normalization of Slater determinants, we can move the antisymmetrizer, only now from the ket to the bra. We first have to commute it with the operator $F$. This is possible since this operator is symmetric, i.e., it does not change if we permute any electrons in it. Thus, we can pull $F$ through the antisymmetrizer. Then, using the Hermiticity and idempotency of $A$, we get

$$\begin{aligned}
\langle\Psi|F\Psi\rangle &= N!\,\langle A\left(\phi_1(1)\phi_2(2)\ldots\phi_N(N)\right)|\sum_i h(i)\left(\phi_1(1)\phi_2(2)\ldots\phi_N(N)\right)\rangle\\
&= N!\,\langle A\left(\phi_1(1)\phi_2(2)\ldots\phi_N(N)\right)|\left[h(1)\phi_1(1)\phi_2(2)\ldots\phi_N(N)\right.\\
&\qquad + \phi_1(1)h(2)\phi_2(2)\ldots\phi_N(N) + \ldots\\
&\qquad \left.+ \phi_1(1)\phi_2(2)\ldots h(N)\phi_N(N)\right]\rangle
\end{aligned}$$

We can see that, as in the proof of normalization, any permutation of the electrons in the bra will lead to a zero integral. Thus, $A$ reduces to $\frac{1}{N!}I$, which proves Eq. (25).

In the case of Eq. (26), we will have in the ket one spinorbital, $\phi_k$, which is orthogonal to all spinorbitals in the bra. Thus, the integral involving this spinorbital can be nonzero only if $h$ acts on it. Moreover, in the bra one has to permute the electrons in such a way that spinorbital $\phi_i$ is a function of the same electron as $\phi_k$, since $\phi_i$ is orthogonal to all spinorbitals in the ket. Thus, only a single permutation in the bra survives, which proves Eq. (26).

In the case of Eq. (27), we will have in the ket two spinorbitals, $\phi_k$ and $\phi_l$, which are absent in the bra. Only one of them can be acted upon by $h$, so that the other spinorbital will always make the integrals zero, which proves Eq. (27).

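The analogous formulas involving $G$ are (proofs in a homework):

$$\langle\Psi|G\Psi\rangle = \frac{1}{2}\sum_{i,j=1}^{N}\left(g_{ijij} - g_{ijji}\right) \qquad (28)$$

$$\langle\Psi|G\Psi'\rangle = \sum_{j=1}^{N}\left(g_{ijkj} - g_{ijjk}\right) \qquad (29)$$

$$\langle\Psi|G\Psi''\rangle = g_{ijkl} - g_{ijlk} \qquad (30)$$

$$\langle\Psi|G\Psi'''\rangle = 0 \qquad (31)$$

where $\Psi'''$ denotes a triply substituted Slater determinant and

$$g_{ijkl} = \langle\phi_i\phi_j|g\,\phi_k\phi_l\rangle = \sum_{s_1,s_2}\iint d^3r_1\,d^3r_2\,\phi_i^*(x_1)\phi_j^*(x_2)\frac{e^2}{|r_1 - r_2|}\phi_k(x_1)\phi_l(x_2).$$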
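The Slater-Condon rules can be verified numerically for $N = 2$ in a discretized model, where spinorbitals are orthonormal vectors on a small grid, $h$ is a symmetric matrix, and $g(x_1,x_2)$ is a symmetric table of values. The sketch below is an illustration under these assumptions (random model operators, zero-based orbital labels, hypothetical helper names), not a calculation from the notes.

```python
import numpy as np

rng = np.random.default_rng(1)
M = 5                                            # number of grid points (illustrative)
# Columns of an orthogonal matrix serve as real orthonormal spinorbitals on the grid.
phi, _ = np.linalg.qr(rng.normal(size=(M, M)))
h = rng.normal(size=(M, M)); h = (h + h.T) / 2   # model one-electron operator
v = rng.normal(size=(M, M)); v = (v + v.T) / 2   # model symmetric g(x1, x2)

def determinant(a, b):
    """Two-electron Slater determinant Psi[x1, x2] built from spinorbitals a and b."""
    return (np.outer(phi[:, a], phi[:, b]) - np.outer(phi[:, b], phi[:, a])) / np.sqrt(2.0)

def apply_F(psi):
    """F = h(1) + h(2) acting on Psi[x1, x2]."""
    return h @ psi + psi @ h.T

def apply_G(psi):
    """G = g(12) is a multiplicative operator in the grid model."""
    return v * psi

def h_me(i, k):
    """h_ik = <phi_i | h phi_k>."""
    return phi[:, i] @ h @ phi[:, k]

def g_me(i, j, k, l):
    """g_ijkl = <phi_i phi_j | g phi_k phi_l>."""
    return np.einsum("x,y,xy,x,y->", phi[:, i], phi[:, j], v, phi[:, k], phi[:, l])

psi = determinant(0, 1)      # Psi
psi1 = determinant(0, 2)     # Psi': spinorbital 1 replaced by 2
psi2 = determinant(2, 3)     # Psi'': spinorbitals 0, 1 replaced by 2, 3

# (direct matrix element, Slater-Condon expression) pairs for Eqs. (25)-(30)
checks = [
    (np.sum(psi * apply_F(psi)), h_me(0, 0) + h_me(1, 1)),
    (np.sum(psi * apply_F(psi1)), h_me(1, 2)),
    (np.sum(psi * apply_F(psi2)), 0.0),
    (np.sum(psi * apply_G(psi)),
     0.5 * sum(g_me(i, j, i, j) - g_me(i, j, j, i) for i in (0, 1) for j in (0, 1))),
    (np.sum(psi * apply_G(psi1)),
     sum(g_me(1, j, 2, j) - g_me(1, j, j, 2) for j in (0, 1))),
    (np.sum(psi * apply_G(psi2)), g_me(0, 1, 2, 3) - g_me(0, 1, 3, 2)),
]
```

Each pair in `checks` agrees to rounding accuracy, which is the content of Eqs. (25)-(30) for this two-electron model.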

C. Derivation of Hartree-Fock equations

The Hartree-Fock method seeks the Slater determinant $\Psi$ which minimizes the expectation value of the Hamiltonian

$$E_{\rm HF} = \min_\Psi\frac{\langle\Psi|H\Psi\rangle}{\langle\Psi|\Psi\rangle} \geq E_0.$$

The Ritz variational principle guarantees that $E_{\rm HF}$ is always greater than or equal to the exact ground-state energy $E_0$ of a given system. Since $\Psi$ is built from spinorbitals, the method finds the optimal spinorbitals for the ground state of a system and can be considered to be the ultimate mean-field method. We will always assume that the spinorbitals are orthonormal, so that $\Psi$ is normalized and we can write

$$E_{\rm HF} = \min_\Psi\langle\Psi|H\Psi\rangle.$$

Using the Slater-Condon rules for the Hamiltonian (19), the expectation value can be written as

$$\langle\Psi|H\Psi\rangle = \sum_{i=1}^{N} h_{ii} + \frac{1}{2}\sum_{i,j=1}^{N}\left(g_{ijij} - g_{ijji}\right)$$

where we still assume that the set of spinorbitals is enumerated by $1, \ldots, N$ and we included the $i = j$ term in the second sum since the two terms cancel in this case.
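Given arrays of one- and two-electron integrals, the energy expression above is a pair of index contractions. The following sketch uses random model integrals (an assumption made only for illustration) and a hypothetical function name `hf_energy`; it also checks that including the $i = j$ terms changes nothing.

```python
import numpy as np

def hf_energy(h, g, n_occ):
    """E = sum_i h_ii + 1/2 sum_ij (g_ijij - g_ijji) over the first n_occ spinorbitals.

    h[i, k] = <phi_i|h phi_k>, g[i, j, k, l] = <phi_i phi_j|g phi_k phi_l>.
    """
    occ = slice(0, n_occ)
    g_occ = g[occ, occ, occ, occ]
    one_body = np.einsum("ii->", h[occ, occ])
    coulomb = np.einsum("ijij->", g_occ)
    exchange = np.einsum("ijji->", g_occ)
    return one_body + 0.5 * (coulomb - exchange)

rng = np.random.default_rng(2)
M, N = 6, 3
h = rng.normal(size=(M, M)); h = (h + h.T) / 2
g = rng.normal(size=(M, M, M, M))
g = (g + g.transpose(1, 0, 3, 2)) / 2    # enforce g_ijkl = g_jilk (relabeling of electrons)

energy = hf_energy(h, g, N)
# The same quantity as a sum over distinct pairs i < j; the i = j terms cancel.
pair_sum = sum(g[i, j, i, j] - g[i, j, j, i] for i in range(N) for j in range(i + 1, N))
energy_pairs = sum(h[i, i] for i in range(N)) + pair_sum
```

The values `energy` and `energy_pairs` coincide, reflecting the cancellation of the Coulomb and exchange terms for $i = j$.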

To find the minimum, we have to vary each orbital. Since the orbitals are complex, one has to vary both the real part and the imaginary part. Equivalently, one can vary the orbital and its complex conjugate

$$\phi_i \to \phi_i + \delta\phi_i, \qquad \phi_i^* \to \phi_i^* + \delta\phi_i^*.$$

We will start by varying the $\phi_i^*$'s only and we will find that this is sufficient to obtain a solvable set of equations. One can then check that varying the $\phi_i$'s gives an equivalent set of equations. Since we assumed that the spinorbitals are orthonormal, we have to impose the condition $\langle\phi_i|\phi_j\rangle = \delta_{ij}$ during the optimization. This can be done by adding to the expectation value the condition multiplied by Lagrange's undetermined multipliers. Thus, we will minimize

$$L = \langle\Psi|H\Psi\rangle - \sum_{i\leq j=1}^{N}\lambda_{ij}\left(\langle\phi_i|\phi_j\rangle - \delta_{ij}\right).$$

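Replacing all $\phi_i^*$ by $\phi_i^* + \delta\phi_i^*$, we get

$$\begin{aligned}
L[\phi_i^* + \delta\phi_i^*] &= \sum_i\langle\phi_i + \delta\phi_i|h\phi_i\rangle + \frac{1}{2}\sum_{ij}\langle(\phi_i + \delta\phi_i)(\phi_j + \delta\phi_j)|g\left(\phi_i\phi_j - \phi_j\phi_i\right)\rangle\\
&\quad - \sum_{i\leq j}\lambda_{ij}\left(\langle\phi_i + \delta\phi_i|\phi_j\rangle - \delta_{ij}\right)\\
&= \sum_i\langle\phi_i|h\phi_i\rangle + \frac{1}{2}\sum_{ij}\langle\phi_i\phi_j|g\left(\phi_i\phi_j - \phi_j\phi_i\right)\rangle\\
&\quad + \sum_i\langle\delta\phi_i|h\phi_i\rangle + \frac{1}{2}\sum_{ij}\langle\delta\phi_i\phi_j|g\left(\phi_i\phi_j - \phi_j\phi_i\right)\rangle + \frac{1}{2}\sum_{ij}\langle\phi_i\delta\phi_j|g\left(\phi_i\phi_j - \phi_j\phi_i\right)\rangle\\
&\quad - \sum_{i\leq j}\lambda_{ij}\langle\delta\phi_i|\phi_j\rangle
\end{aligned}$$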

where the first two terms give the value of the functional at the minimum and we have used the fact that the spinorbitals are orthonormal at the minimum, so there is no orthonormality term in this part. We have also omitted the terms that are products of orbital increments as they are of second order. The term with $\delta\phi_j$ can be shown to be equal to the preceding term

$$\frac{1}{2}\sum_{ij}\langle\phi_i\delta\phi_j|g\left(\phi_i\phi_j - \phi_j\phi_i\right)\rangle = \frac{1}{2}\sum_{ij}\langle\delta\phi_j\phi_i|g\left(\phi_j\phi_i - \phi_i\phi_j\right)\rangle = \frac{1}{2}\sum_{ij}\langle\delta\phi_i\phi_j|g\left(\phi_i\phi_j - \phi_j\phi_i\right)\rangle$$

where in the first step we interchanged the coordinates of electrons 1 and 2 in the integral and in the second step we interchanged the summation indices. We can now write

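$$\begin{aligned}
L[\phi_i^* + \delta\phi_i^*] = E_{\rm HF}[\phi_i^*] + \sum_i\sum_{s_1}\int d^3r_1\,\delta\phi_i^*(x_1)\,\Big[&h(r_1)\phi_i(x_1)\\
&+ \sum_j\sum_{s_2}\int d^3r_2\,\phi_j^*(x_2)\,g(r_1,r_2)\left(\phi_i(x_1)\phi_j(x_2) - \phi_j(x_1)\phi_i(x_2)\right)\\
&- \sum_j\lambda_{ij}\phi_j(x_1)\Big]
\end{aligned}$$

Since at the minimum the linear increment has to be zero for an arbitrary $\delta\phi_i^*$, this can be achieved only if the whole expression in the large square bracket is equal to zero for any $x_1$:

$$h(r_1)\phi_i(x_1) + \sum_j\sum_{s_2}\int d^3r_2\,\phi_j^*(x_2)\,g(r_1,r_2)\left(\phi_i(x_1)\phi_j(x_2) - \phi_j(x_1)\phi_i(x_2)\right) = \sum_j\lambda_{ij}\phi_j(x_1).$$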

These are the Hartree-Fock equations for spinorbitals. Let us rewrite them by introducing the so-called Coulomb and exchange operators

$$J(r_1) = \sum_j J_j(r_1) = \sum_j\sum_{s_2}\int d^3r_2\,\phi_j^*(x_2)\,g(r_1,r_2)\,\phi_j(x_2)$$

$$K(r_1)\phi(x_1) = \sum_j K_j(r_1)\phi(x_1) = \sum_j\sum_{s_2}\int d^3r_2\,\phi_j^*(x_2)\,g(r_1,r_2)\,\phi(x_2)\,\phi_j(x_1)$$

where $\phi(x_1)$ is an arbitrary spinorbital. Note that while $J$ is just a multiplicative operator, $K$ is an integral operator since it integrates over the function it acts upon. Notice that the operators $J$ and $K$ do not depend on spin but act on functions that include the spin coordinate. Using these operators, and noting that the exchange term enters the HF equations with a minus sign, one can rewrite the HF equations as

$$\left[h(r_1) + J(r_1) - K(r_1)\right]\phi_i(x_1) = \sum_j\lambda_{ij}\phi_j(x_1). \qquad (32)$$

Equations (32) are a set of $N$ equations for $N$ spinorbitals $\phi_i$, depending also on $N(N+1)/2$ Lagrange multipliers. It is possible to transform these equations to the so-called canonical form where only the diagonal multipliers are present. To achieve this goal, we will use the important theorem stating that Slater determinants are invariant under unitary transformations of spinorbitals (i.e., the determinant is the same, up to a phase factor, when expressed in the original and in the transformed spinorbitals). Let us denote the transformed set of spinorbitals by $\phi'_i$, so that

$$\boldsymbol{\phi} = \begin{bmatrix}\phi_1\\ \phi_2\\ \vdots\\ \phi_N\end{bmatrix} \to \boldsymbol{\phi}' = \begin{bmatrix}\phi'_1\\ \phi'_2\\ \vdots\\ \phi'_N\end{bmatrix} = U\boldsymbol{\phi}$$

with $UU^\dagger = I$. The Slater determinant built of the transformed spinorbitals can be written as

$$\Psi' = \left|\phi'_1(x_1)\phi'_2(x_2)\ldots\phi'_N(x_N)\right| =
\begin{vmatrix}
\sum_i u_{1i}\phi_i(1) & \sum_i u_{1i}\phi_i(2) & \cdots & \sum_i u_{1i}\phi_i(N)\\
\sum_i u_{2i}\phi_i(1) & \sum_i u_{2i}\phi_i(2) & \cdots & \sum_i u_{2i}\phi_i(N)\\
\vdots & \vdots & & \vdots\\
\sum_i u_{Ni}\phi_i(1) & \sum_i u_{Ni}\phi_i(2) & \cdots & \sum_i u_{Ni}\phi_i(N)
\end{vmatrix}
= \left|\,U\begin{bmatrix}
\phi_1(1) & \phi_1(2) & \cdots & \phi_1(N)\\
\phi_2(1) & \phi_2(2) & \cdots & \phi_2(N)\\
\vdots & \vdots & & \vdots\\
\phi_N(1) & \phi_N(2) & \cdots & \phi_N(N)
\end{bmatrix}\right| = |U|\,\Psi$$

where $|\ldots|$ denotes a determinant and $[\ldots]$ a matrix, and where we started to use the short-hand notation $x_i \equiv i$. Since the determinant of a unitary matrix is a complex number of modulus 1, $|U|$ is just a multiplicative phase factor, which is irrelevant.
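The invariance statement rests on $\det(U\Phi) = \det(U)\det(\Phi)$ with $|\det U| = 1$. A short numerical sketch, with random complex numbers standing in for the spinorbital values $\phi_k(x_i)$ (an assumption for illustration only):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 4
# Phi[k, i] = phi_k(x_i): values of N spinorbitals at N electron coordinates.
Phi = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
# A unitary matrix obtained from the QR decomposition of a complex Gaussian matrix.
U, _ = np.linalg.qr(rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N)))

det_original = np.linalg.det(Phi)
det_transformed = np.linalg.det(U @ Phi)   # rows are phi'_k = sum_i u_ki phi_i
phase = np.linalg.det(U)                   # complex number of modulus 1
```

The transformed determinant equals `phase * det_original`, so the two wave functions differ only by a phase factor.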

If we apply the unitary transformation to the HF equations, i.e., replace all spinorbitals by transformed spinorbitals, this substitution has to be made also in the $J$ and $K$ operators. Let us now show that these operators are invariant under such a transformation. For $J$ we have

$$\begin{aligned}
J[\phi'_i] &= \sum_j\sum_{s_2}\int d^3r_2\,(\phi'_j)^*(2)\,g(r_1,r_2)\,\phi'_j(2)\\
&= \sum_j\sum_{s_2}\int d^3r_2\,g(r_1,r_2)\sum_i u^*_{ji}\phi_i^*(2)\sum_k u_{jk}\phi_k(2)\\
&= \sum_j\sum_{s_2}\int d^3r_2\,g(r_1,r_2)\sum_i u^\dagger_{ij}\phi_i^*(2)\sum_k u_{jk}\phi_k(2)\\
&= \sum_{s_2}\int d^3r_2\,g(r_1,r_2)\sum_{i,k}\phi_i^*(2)\phi_k(2)\sum_j u^*_{ji}u_{jk}.
\end{aligned}$$

The last sum is the same as in $U^\dagger U = I$, so that

$$\begin{aligned}
J[\phi'_i] &= \sum_{s_2}\int d^3r_2\,g(r_1,r_2)\sum_{i,k}\phi_i^*(2)\phi_k(2)\,\delta_{ik}\\
&= \sum_i\sum_{s_2}\int d^3r_2\,g(r_1,r_2)\,\phi_i^*(2)\phi_i(2) = J[\phi_i],
\end{aligned}$$

i.e., the Coulomb operator is indeed invariant under the unitary transformation of orbitals. An analogous proof holds for $K$.

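Let us write the HF equations in matrix form

$$\left[h(r) + J(r) - K(r)\right]\boldsymbol{\phi} = \boldsymbol{\lambda}\boldsymbol{\phi}$$

and rewrite them in terms of the transformed spinorbitals, using $\boldsymbol{\phi} = U^\dagger\boldsymbol{\phi}'$,

$$\left[h(r) + J(r) - K(r)\right]U^\dagger\boldsymbol{\phi}' = \boldsymbol{\lambda}U^\dagger\boldsymbol{\phi}'$$

and multiply this equation from the left by $U$ to get

$$\left[h(r) + J(r) - K(r)\right]UU^\dagger\boldsymbol{\phi}' = U\boldsymbol{\lambda}U^\dagger\boldsymbol{\phi}'.$$

Since the matrix $\boldsymbol{\lambda}$ is symmetric (Hermitian for complex spinorbitals), we can always find a unitary transformation that diagonalizes it, getting

$$\left[h(r) + J(r) - K(r)\right]\boldsymbol{\phi}' = \boldsymbol{\lambda}_{\rm diag}\boldsymbol{\phi}'.$$

Denoting the diagonal elements of $\boldsymbol{\lambda}_{\rm diag}$ by $\epsilon_i$, we can write the canonical HF equations as

$$\left[h(r) + J(r) - K(r)\right]\phi_i(x) = \epsilon_i\phi_i(x), \qquad i = 1, 2, \ldots, N.$$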

These equations are sometimes called pseudo-eigenvalue equations since the operators $J$ and $K$ depend on all $\phi_i$'s. The equations have to be solved iteratively, i.e., one first assumes some initial orbitals (for example, for atoms these can be the hydrogenic orbitals), solves the resulting eigenvalue problem, computes new operators $J$ and $K$ using the spinorbitals just obtained, and so on. The convergence of the iterations can sometimes be a problem and several methods have been developed to deal with such problems.

The most often used method of solving HF problems is to expand the spinorbitals in terms of some known basis functions $\chi_j$

$$\phi_i = \sum_j c_{ij}\chi_j.$$

Then the Hartree-Fock equations become matrix pseudo-eigenvalue equations, sometimes called Hartree-Fock-Roothaan or Hartree-Fock-Roothaan-Hall equations. The basis functions can be, in particular, atomic orbitals and such an approach is then called the linear combination of atomic orbitals (LCAO) method. This approach can be used not only to compute energies for atoms, molecules, and solids, but also as an interpretative tool. In particular, it can be used to interpret the chemical bond in molecules. In the simplest case of a diatomic molecule, if one restricts the linear combinations to pairs of orbitals with similar energy, one of the two molecular orbital energies resulting from such a pair is lower than either atomic orbital energy whereas the other one is higher (this fact is not obvious). Thus, if the spinorbitals corresponding to the lower level are occupied (such spinorbitals are called the bonding ones), there is a gain in energy. This picture can also be applied to solids, where the orbital energies corresponding to different pairs of atoms will be very close to each other, forming the so-called bands. The relations between the highest filled or partly filled band and the lowest empty band determine whether a solid is an insulator, semiconductor, or conductor. However, for conductors the HF method encounters several problems.
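The iterative solution in a basis set can be sketched in a few lines. The example below is a minimal illustration, not a production algorithm: it assumes an orthonormal basis (so no overlap matrix appears), real model integrals for a hypothetical six-site system with a weak interaction, and a plain fixed-point iteration without any convergence acceleration. The function name `scf` and all numerical parameters are assumptions.

```python
import numpy as np

def scf(h, g, n_occ, max_iter=200, tol=1e-10):
    """Iterate F(D) C = C eps until the density matrix stops changing.

    h[p, q] = <chi_p|h chi_q>, g[p, q, r, s] = <chi_p chi_q|g chi_r chi_s> (real).
    """
    _, C = np.linalg.eigh(h)                  # initial guess: eigenvectors of h alone
    D = C[:, :n_occ] @ C[:, :n_occ].T         # D[s, r] = sum_j C[s, j] C[r, j]
    converged = False
    for _ in range(max_iter):
        J = np.einsum("prqs,sr->pq", g, D)    # Coulomb matrix
        K = np.einsum("prsq,sr->pq", g, D)    # exchange matrix
        eps, C = np.linalg.eigh(h + J - K)    # solve the pseudo-eigenvalue problem
        D_new = C[:, :n_occ] @ C[:, :n_occ].T
        converged = np.max(np.abs(D_new - D)) < tol
        D = D_new
        if converged:
            break
    J = np.einsum("prqs,sr->pq", g, D)
    K = np.einsum("prsq,sr->pq", g, D)
    energy = np.sum(h * D) + 0.5 * np.sum((J - K) * D)
    return energy, eps, C, converged

# Hypothetical model: site energies 0..5, nearest-neighbour coupling, weak repulsion.
M, N = 6, 2
sites = np.arange(M)
h = np.diag(sites.astype(float)) - 0.3 * (np.eye(M, k=1) + np.eye(M, k=-1))
v = 0.05 / (1.0 + np.abs(sites[:, None] - sites[None, :]))
g = np.zeros((M, M, M, M))
for p in range(M):
    for q in range(M):
        g[p, q, p, q] = v[p, q]               # g_pqrs = v(p, q) delta_pr delta_qs

energy, eps, C, converged = scf(h, g, N)
```

At self-consistency the energy satisfies $E = \frac{1}{2}\sum_i(\epsilon_i + h_{ii})$ over the occupied spinorbitals, which gives a simple consistency check of the converged result. For realistic integrals the plain iteration used here may oscillate or diverge, which is the convergence problem mentioned in the text.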

V. SECOND-QUANTIZATION FORMALISM

In many-particle theory, one often uses the so-called formalism of second quantization, i.e., one expresses formulas in terms of operators that create or annihilate particles in some spinorbitals. The name of the formalism is somewhat misleading. It results from the first use of such operators to quantize the electromagnetic field. This happened in the late 1920s, after the formulation of Schrödinger's equation, which was the first quantization, i.e., the quantization of particles. Since we will consider only particles, this leads to a kind of oxymoron: we will use second-quantization tools in a first-quantization approach.

A. Annihilation and creation operators

Slater determinants can be described using the occupation number representation:

$$\Psi = \sqrt{N!}\,A\left[\phi_{k_1}(x_1)\phi_{k_2}(x_2)\ldots\phi_{k_N}(x_N)\right] \leftrightarrow |n_1, n_2, \ldots, n_i, \ldots\rangle$$

$$n_i = 1 \ \text{if}\ i \in \{k_1, k_2, \ldots, k_N\}, \qquad n_i = 0 \ \text{otherwise}$$

where we assumed that the spinorbitals form an ordered set. Conventionally, we order the spinorbitals coming from the HF method according to their orbital energies (for degenerate spinorbitals, the order is arbitrary). To have a unique relation to the standard way of writing the Slater determinant, we assume that the spinorbitals on the left-hand side are also ordered. The number of positions in the occupation number representation is infinite, but we can display explicitly only the sequence up to the highest occupied orbital. For example,

$$\Psi = \sqrt{5!}\,A\left[\phi_2(x_1)\phi_7(x_2)\phi_9(x_3)\phi_{10}(x_4)\phi_{11}(x_5)\right] \leftrightarrow |0100001011100\ldots\rangle$$
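The mapping from a list of occupied spinorbitals to an occupation-number string is easily written down. The helper name `occupation_string` below is an illustrative assumption; spinorbital labels are 1-based as in the text.

```python
def occupation_string(occupied, length=None):
    """Occupation-number string for a set of occupied spinorbital labels (1-based)."""
    length = max(occupied) if length is None else length
    return "".join("1" if i in occupied else "0" for i in range(1, length + 1))

# The example from the text: spinorbitals 2, 7, 9, 10, 11 occupied, 13 positions shown.
ket = occupation_string({2, 7, 9, 10, 11}, length=13)
```

For this input the string is `0100001011100`, the sequence displayed in the example above.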

The creation operator can now be defined as

$$a_i^\dagger|\Psi\rangle = \begin{cases}
a_i^\dagger|\ldots\overset{i}{0}\ldots\rangle = (-1)^\sigma|\ldots\overset{i}{1}\ldots\rangle\\[4pt]
a_i^\dagger|\ldots\overset{i}{1}\ldots\rangle = 0
\end{cases}$$

where $\sigma$ denotes the number of '1's before the position $i$. In words, if spinorbital $\phi_i$ is absent in $|\Psi\rangle$, this orbital is added at the $i$th place. If this orbital is present, the action of $a_i^\dagger$ gives zero.

The phase factor $(-1)^\sigma$ is needed in order to create exactly the same determinant as would be built from the set of the spinorbitals present in $|\Psi\rangle$ plus the spinorbital $\phi_i$. To see that the phase is correct, first add $\phi_i$ as the first row to $|\Psi\rangle$. This gives us a uniquely defined determinant, but it may differ by a phase from the standard determinant. We can then permute this row with other rows until it arrives at its proper position among the occupied spinorbitals. This gives the phase factor $(-1)^\sigma$.

We can use the creation operators to define the ground-state determinant of a system with $N$ electrons:

$$|\Phi_0\rangle = a_1^\dagger a_2^\dagger\ldots a_N^\dagger|{\rm vac}\rangle.$$

A homework exercise will show that the order chosen gives the correct ground-state determinant. In the case of a closed-shell system (e.g., atoms with complete shells occupied), this definition gives a unique determinant. For open-shell systems, where the highest occupied spinorbital is degenerate, several determinants can be created. In most cases, a linear combination of such determinants is required to form a wave function with proper symmetry properties.

The annihilation operator is defined analogously to the creation operator

$$a_i|\Psi\rangle = \begin{cases}
a_i|\ldots\overset{i}{1}\ldots\rangle = (-1)^\sigma|\ldots\overset{i}{0}\ldots\rangle\\[4pt]
a_i|\ldots\overset{i}{0}\ldots\rangle = 0,
\end{cases}$$

so that we may consider this action as first permuting $\phi_i$ until it arrives at the first row and then annihilating it.
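These definitions translate directly into code acting on occupation tuples. In the sketch below, a state is a pair `(phase, occ)` and a vanishing result is represented by `None`; this representation and the function names are assumptions made for illustration.

```python
def create(i, state):
    """Apply a_i^dagger to state = (phase, occ); occ is a tuple of 0/1, labels are 1-based."""
    if state is None:
        return None
    phase, occ = state
    if occ[i - 1] == 1:
        return None                       # spinorbital already present
    sigma = sum(occ[: i - 1])             # number of '1's before position i
    return ((-1) ** sigma * phase, occ[: i - 1] + (1,) + occ[i:])

def annihilate(i, state):
    """Apply a_i to state = (phase, occ), with the same phase convention."""
    if state is None:
        return None
    phase, occ = state
    if occ[i - 1] == 0:
        return None                       # nothing to annihilate
    sigma = sum(occ[: i - 1])
    return ((-1) ** sigma * phase, occ[: i - 1] + (0,) + occ[i:])

vac = (1, (0, 0, 0, 0))
state_13 = create(1, create(3, vac))      # a_1^dagger a_3^dagger |vac>
state_31 = create(3, create(1, vac))      # a_3^dagger a_1^dagger |vac>
```

The two orderings produce the same occupation tuple `(1, 0, 1, 0)` with opposite phases, and creating the same spinorbital twice returns `None`.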

B. Products and commutators of operators

If several creation operators act in sequence, the order is important since these operators do not commute. Let us find the commutation rule by acting with a pair of creation operators on a state with zero electrons. We call such a state the true vacuum and denote it by $|{\rm vac}\rangle$. We get (assuming without loss of generality that $i < j$):

$$a_i^\dagger a_j^\dagger|{\rm vac}\rangle = a_i^\dagger|\ldots\overset{j}{1}\ldots\rangle = |\ldots\overset{i}{1}\ldots\overset{j}{1}\ldots\rangle$$

$$a_j^\dagger a_i^\dagger|{\rm vac}\rangle = a_j^\dagger|\ldots\overset{i}{1}\ldots\rangle = -|\ldots\overset{i}{1}\ldots\overset{j}{1}\ldots\rangle$$

Thus, since $a_i^\dagger a_i^\dagger = 0$ from the definition, we have

$$a_i^\dagger a_j^\dagger = -a_j^\dagger a_i^\dagger \quad\text{or}\quad \left[a_i^\dagger, a_j^\dagger\right]_+ = 0,$$

where $[a,b]_+ = ab + ba$ is the anticommutator. Thus, the creation operators anticommute. We have considered here the action of $a_i^\dagger a_j^\dagger$ on the vacuum state only, but one can easily see that the result is the same for any determinant since if either $\phi_i$ or $\phi_j$ is included in the determinant, we get zero. In the opposite case, the reasoning is the same as for the vacuum case.

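Clearly, the commutation rule for annihilation operators is analogous to that for the creation operators

$$\left[a_i, a_j\right]_+ = 0.$$

To show this, we have to act on a determinant containing both $\phi_i$ and $\phi_j$, and the minus sign results analogously to the case of creation operators.

Let us now find the commutation rules for products of creation and annihilation operators

$$a_i^\dagger a_i|\Psi\rangle = \begin{cases} 0 & \text{if } i \notin \Psi\\ |\Psi\rangle & \text{if } i \in \Psi. \end{cases}$$

Note that the phase factors cancel: $(-1)^\sigma(-1)^\sigma = 1$. Analogously,

$$a_i a_i^\dagger|\Psi\rangle = \begin{cases} 0 & \text{if } i \in \Psi\\ |\Psi\rangle & \text{if } i \notin \Psi. \end{cases}$$

Thus, in the anticommutator, one of the two terms will always reproduce $|\Psi\rangle$, so that

$$\left[a_i^\dagger, a_i\right]_+ = 1.$$

Finally, consider (again assuming $i < j$)

$$a_i^\dagger a_j|\Psi\rangle = \begin{cases} 0 & \text{if } j \notin \Psi \text{ or } i \in \Psi\\ (-1)^{\sigma_j}(-1)^{\sigma_i}|\ldots\overset{i}{1}\ldots\overset{j}{0}\ldots\rangle & \text{otherwise} \end{cases}$$

$$a_j a_i^\dagger|\Psi\rangle = \begin{cases} 0 & \text{if } j \notin \Psi \text{ or } i \in \Psi\\ (-1)^{\sigma_j + 1}(-1)^{\sigma_i}|\ldots\overset{i}{1}\ldots\overset{j}{0}\ldots\rangle & \text{otherwise} \end{cases}$$

where the additional power of $-1$ results from the fact that the number of occupied states before position $j$ increases by one due to the action of $a_i^\dagger$. Thus, the action in the opposite order gives the same result times $-1$, so that we have

$$\left[a_i^\dagger, a_j\right]_+ = \delta_{ij}.$$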
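All three anticommutation rules can be confirmed at once by representing the operators as matrices in the $2^m$-dimensional occupation-number basis of $m$ spinorbitals. The construction below, a Kronecker product with a sign factor for every earlier position, is the one usually called the Jordan-Wigner representation; it is not used in the notes and is introduced here only as a numerical check.

```python
from functools import reduce
import numpy as np

def annihilation_matrices(m):
    """Matrices of a_1, ..., a_m in the occupation-number basis of m spinorbitals."""
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])   # maps |1> to |0> at one position
    sign = np.diag([1.0, -1.0])                  # factor (-1)^n for an earlier position
    eye = np.eye(2)
    ops = []
    for i in range(m):
        factors = [sign] * i + [lower] + [eye] * (m - i - 1)
        ops.append(reduce(np.kron, factors))
    return ops

def anticommutator(x, y):
    return x @ y + y @ x

m = 3
a = annihilation_matrices(m)
a_dag = [op.T for op in a]                       # real matrices: adjoint = transpose
identity = np.eye(2 ** m)

rules_hold = all(
    np.allclose(anticommutator(a[i], a[j]), 0.0)
    and np.allclose(anticommutator(a_dag[i], a_dag[j]), 0.0)
    and np.allclose(anticommutator(a_dag[i], a[j]), identity * (i == j))
    for i in range(m) for j in range(m)
)
```

The flag `rules_hold` evaluates to `True` for this three-spinorbital example.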


C. Hamiltonian and number operator

Since linear combinations of products of creation and annihilation operators can be used to construct an arbitrary wave function in the Hilbert space, one can use such linear combinations to construct various operators. Let us first construct a simple operator called the occupation number operator

$$\hat{N}_i = a_i^\dagger a_i.$$

As shown above, this operator acting on any determinant gives 0 if spinorbital $i$ is absent and recovers the determinant when it is present. We may say therefore that the eigenvalues of this operator are 0 and 1 and these eigenvalues are the occupation numbers for $\phi_i$ in $|\Psi\rangle$

$$\hat{N}_i|\Psi\rangle = n_i|\Psi\rangle.$$

We can now construct the number operator

$$\hat{N} = \sum_{i=1}^{\infty}\hat{N}_i$$

which gives

$$\hat{N}|\Psi\rangle = \sum_{i=1}^{\infty} n_i|\Psi\rangle = N|\Psi\rangle.$$

Thus, this operator "detects" the number of electrons in $|\Psi\rangle$.

Let us now consider the one-electron part of the Hamiltonian

$$F = \sum_{i=1}^{N} f(r_i).$$

We postulate that $F$ can be written as

$$F = \sum_{i,j=1}^{\infty} f_{ij}\,a_i^\dagger a_j$$

where, as before, $f_{ij} = \langle\phi_i|f\phi_j\rangle$. To prove this expression, it is sufficient to show that the matrix elements of this operator with arbitrary determinants are the same as those resulting from the Slater-Condon rules

$$\langle\Psi|F\Psi\rangle = \sum_{i,j=1}^{\infty} f_{ij}\langle\Psi|a_i^\dagger a_j\Psi\rangle = \sum_{i,j\in\Psi} f_{ij}\delta_{ij} = \sum_{i\in\Psi} f_{ii}$$

We could reduce the sum to go only over the indices of the orbitals present in $|\Psi\rangle$ since if $j$ is not in this range, then the action of $a_j$ gives zero, whereas if $i$ is not in this range, the determinant created in the ket is orthogonal to the one in the bra. For indices in the range, the action of $a_i^\dagger a_j$ with $i \neq j$, as discussed earlier, gives either zero or a determinant different from the original one, which makes the matrix element equal to zero. Thus, the only case when the matrix element is nonzero is $i = j$, and then $a_i^\dagger a_i$ is just the occupation number operator. This proves the theorem for the same determinant on both sides.

Now consider the case when spinorbital $k$ in the ket is replaced by spinorbital $l$. All the arguments from the previous case still hold except that in addition $i$ has to be equal to $k$ to restore $\phi_k$, otherwise we get zero. Thus, the next to last sum is only over $j$

$$\langle\Psi|F\Psi'\rangle = \sum_{i,j=1}^{\infty} f_{ij}\langle\Psi|a_i^\dagger a_j\Psi'\rangle = \sum_{i,j\in\Psi,\Psi'} f_{ij}\langle\Psi|a_i^\dagger a_j\Psi'\rangle = f_{kl}.$$

In this case, $j$ has to be equal to $l$ to annihilate the replacement spinorbital and $i$ has to be equal to $k$ to create $\phi_k$ in $|\Psi'\rangle$, which gives just a single matrix element. The overall sign is plus since the spinorbitals are annihilated and created at the same position. For the $|\Psi''\rangle$ case, $a_i^\dagger a_j$ is unable to annihilate two replacement orbitals, so the result is zero. Thus, the second-quantized form of $F$ gives the same matrix elements as the first-quantized form and therefore the two forms are equivalent.
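The matrix elements of the second-quantized $F$ can be evaluated mechanically by applying $a_i^\dagger a_j$ to occupation tuples. The sketch below assumes zero-based spinorbital labels, a random symmetric matrix for $f_{ij}$, and hypothetical helper names; the replaced and replacing spinorbitals are chosen adjacent so that no sign from intermediate occupied positions appears.

```python
import numpy as np

def apply_excitation(i, j, occ):
    """Apply a_i^dagger a_j to an occupation tuple; return (sign, new_occ) or None."""
    if occ[j] == 0:
        return None
    sign = (-1) ** sum(occ[:j])               # phase from annihilating position j
    occ = occ[:j] + (0,) + occ[j + 1:]
    if occ[i] == 1:
        return None
    sign *= (-1) ** sum(occ[:i])              # phase from creating at position i
    return sign, occ[:i] + (1,) + occ[i + 1:]

def one_body_element(f, bra, ket):
    """<bra| sum_ij f_ij a_i^dagger a_j |ket> for occupation tuples bra and ket."""
    total = 0.0
    for i in range(len(ket)):
        for j in range(len(ket)):
            result = apply_excitation(i, j, ket)
            if result is not None and result[1] == bra:
                total += result[0] * f[i, j]
    return total

rng = np.random.default_rng(4)
f = rng.normal(size=(5, 5)); f = (f + f.T) / 2

psi = (1, 1, 0, 1, 0)        # spinorbitals 0, 1, 3 occupied
psi1 = (1, 1, 0, 0, 1)       # spinorbital 3 replaced by 4
psi2 = (1, 0, 1, 0, 1)       # two replacements: 1 -> 2 and 3 -> 4

diagonal = one_body_element(f, psi, psi)      # equals f_00 + f_11 + f_33
single = one_body_element(f, psi, psi1)       # equals f_34
double = one_body_element(f, psi, psi2)       # vanishes
electrons = one_body_element(np.eye(5), psi, psi)   # number operator gives N = 3
```

With $f$ replaced by the unit matrix the same routine reproduces the number operator, so `electrons` equals the number of occupied spinorbitals.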

The proof of a similar expression for the operator $G$ is left as a homework problem

$$G = \frac{1}{2}\sum_{i,j,k,l=1}^{\infty} g_{ijkl}\,a_i^\dagger a_j^\dagger a_l a_k.$$

Notice that the order of indices in the matrix element is different from that in the string of operators.

D. Normal products and Wick’s theorem

1. Normal-Product

The normal-product of creation and annihilation operators is defined as the rearranged product of these operators such that all creation operators are to the left of all annihilation operators, with a phase factor corresponding to the parity of the permutation producing the rearrangement. For an arbitrary product of creation and annihilation operators $ABC\ldots$, the normal-product is denoted as $n[ABC\ldots]$ and is given as

$$n[ABC\ldots] = (-1)^\sigma a^\dagger b^\dagger\ldots uv\ldots$$

where

$$a^\dagger b^\dagger\ldots uv\ldots = P(ABC\ldots),$$

$P$ being the permutation of the operators $A, B, C, \ldots$ and $\sigma$ being the parity of the permutation. This definition is not unique, since any rearrangement of the creation operators among themselves and/or the annihilation operators among themselves is permissible, but it would always be accompanied by an appropriate change in the phase factor; thus all forms of a normal-product are equivalent. Examples are as follows:

$$n[a^\dagger b] = a^\dagger b, \quad n[ab^\dagger] = -b^\dagger a, \quad n[ab] = ab = -ba, \quad n[a^\dagger b^\dagger] = a^\dagger b^\dagger = -b^\dagger a^\dagger$$

$$n[a^\dagger bc^\dagger d] = -a^\dagger c^\dagger bd = a^\dagger c^\dagger db = c^\dagger a^\dagger bd = -c^\dagger a^\dagger db$$

The usefulness of the normal-product form is that its physical vacuum expectation value is zero:

$$\langle{\rm vac}|n[ABC\ldots]|{\rm vac}\rangle = 0 \quad\text{if } [ABC\ldots] \text{ is not empty.}$$

2. Contractions (Pairings)

In order to be able to compute expectation values of general operator strings, we will take advantage of Wick's theorem. To formulate it, we need to define the contraction (or pairing) of operators, denoted by a bracket joining the two operators. For a pair of creation or annihilation operators $A$, $B$, we define their contraction as

$$\wick{\c A \c B} \equiv AB - n[AB].$$

Specifically, the four possibilities are:

$$\wick{\c a^\dagger \c b^\dagger} = a^\dagger b^\dagger - a^\dagger b^\dagger = 0,$$

$$\wick{\c a \c b} = ab - ab = 0,$$

$$\wick{\c a^\dagger \c b} = a^\dagger b - a^\dagger b = 0,$$

$$\wick{\c a \c b^\dagger} = ab^\dagger - (-b^\dagger a) = [a, b^\dagger]_+ = \delta_{ab}.$$

A normal-product with contractions is defined as follows:

$$\wick{n[ABC\ldots \c1 R \ldots \c2 S \ldots \c1 T \ldots \c2 V \ldots]} = (-1)^\sigma\,\wick{\c R \c T}\;\wick{\c S \c V}\ldots n[ABC\ldots]$$

where all the contracted pairs have been put in front of the normal-product and $\sigma$ is the parity of the permutation.

3. Time-independent Wick’s theorem

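A product of a string of creation and annihilation operators is equal to their normal-product plus the sum of all possible normal-products with contractions. Symbolically,

$$\begin{aligned}
ABCD\ldots = {}& n[ABCD\ldots] + n[\wick{\c A \c B}CD\ldots] + n[\wick{\c A B \c C}D\ldots] + n[\wick{\c A BC \c D}\ldots] + \ldots + n[A\wick{\c B \c C}D\ldots] + n[A\wick{\c B C \c D}\ldots]\\
& + \ldots + n[AB\wick{\c C \c D}\ldots] + n[\wick{\c1 A \c1 B \c2 C \c2 D}\ldots] + n[\wick{\c1 A \c2 B \c1 C \c2 D}\ldots] + n[\wick{\c1 A \c2 B \c2 C \c1 D}\ldots] + \ldots
\end{aligned}$$

Thus, all possible contractions of one pair, two pairs, etc. are included. The importance of the above result is that the vacuum expectation value of any normal-product with contractions is zero unless all operators are contracted. The reason is that each contraction contributes a factor of zero or 1 and, if an uncontracted normal-product remains, its vacuum expectation value is zero. For example, consider $a^\dagger bc^\dagger de^\dagger f$. Applying Wick's theorem, we get

$$a^\dagger bc^\dagger de^\dagger f = n[a^\dagger bc^\dagger de^\dagger f] + n[a^\dagger\wick{\c b \c c^\dagger}de^\dagger f] + n[a^\dagger\wick{\c b c^\dagger d \c e^\dagger}f] + n[a^\dagger bc^\dagger\wick{\c d \c e^\dagger}f] + n[a^\dagger\wick{\c1 b \c1 c^\dagger \c2 d \c2 e^\dagger}f]$$

where we have omitted all contractions except those of the form $\wick{\c a \c b^\dagger}$, since the omitted ones vanish. Since no fully contracted term survives, the vacuum expectation value of this operator product is zero. A more complex example, $ab^\dagger cd^\dagger ef^\dagger$, is given as a homework.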
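Wick's theorem turns a vacuum expectation value into a sum over fully contracted terms, which can be generated recursively by pairing the first operator with each later one. The sketch below compares this with a direct evaluation in which the operators are applied to the vacuum one by one. The encoding of operators as pairs `("a", p)` for an annihilator and `("c", p)` for a creator, and the function names, are assumptions made for illustration.

```python
def contraction(x, y):
    """Contraction of x standing to the left of y: nonzero only for a_p a_q^dagger, p = q."""
    (kind_x, p), (kind_y, q) = x, y
    return 1 if (kind_x == "a" and kind_y == "c" and p == q) else 0

def wick_vev(ops):
    """Vacuum expectation value as the sum of all fully contracted terms."""
    if not ops:
        return 1
    if len(ops) % 2:
        return 0
    first, rest = ops[0], ops[1:]
    total = 0
    for k, partner in enumerate(rest):
        value = contraction(first, partner)
        if value:
            # Bringing the partner next to the first operator passes k operators.
            total += (-1) ** k * value * wick_vev(rest[:k] + rest[k + 1:])
    return total

def direct_vev(ops, n_orb):
    """Apply the operators right to left to |vac> and project back on <vac|."""
    phase, occ = 1, [0] * n_orb
    for kind, p in reversed(ops):
        if (kind == "c") == bool(occ[p]):
            return 0          # creating an occupied or annihilating an empty spinorbital
        phase *= (-1) ** sum(occ[:p])
        occ[p] = 1 - occ[p]
    return 0 if any(occ) else phase

strings = [
    [("a", 0), ("c", 0)],
    [("a", 0), ("a", 1), ("c", 1), ("c", 0)],
    [("a", 0), ("a", 1), ("c", 0), ("c", 1)],
    [("a", 0), ("c", 0), ("a", 1), ("c", 1), ("a", 0), ("c", 0)],
    [("c", 0), ("a", 1), ("c", 2), ("a", 3), ("c", 4), ("a", 5)],   # a†bc†de†f
]
results = [(wick_vev(s), direct_vev(s, 6)) for s in strings]
```

For each string the two numbers in `results` agree; the last string is the example from the text, for which both vanish.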

4. Outline of proof of Wick’s theorem

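In a normal-ordered product $p^\dagger q^\dagger\ldots uv$ all contractions vanish since in such a product there can be no contractions involving an annihilation operator to the left of a creation operator. Thus, if a string of operators is already in normal-product form, we have

$$p^\dagger q^\dagger\ldots uv = n[p^\dagger q^\dagger\ldots uv] + \sum(\text{all possible contractions})$$

since all terms in the sum vanish. Thus Wick's theorem holds in this case. Consider next the case where one pair of operators is out of normal order:

$$\begin{aligned}
p^\dagger q^\dagger\ldots rs^\dagger\ldots uv &= p^\dagger q^\dagger\ldots\left([r, s^\dagger]_+ - s^\dagger r\right)\ldots uv\\
&= p^\dagger q^\dagger\ldots\delta_{rs}\ldots uv - p^\dagger q^\dagger\ldots s^\dagger r\ldots uv\\
&= n[p^\dagger q^\dagger\ldots\wick{\c r \c s^\dagger}\ldots uv] + n[p^\dagger q^\dagger\ldots rs^\dagger\ldots uv]
\end{aligned}$$

All other contractions vanish, so Wick's theorem still holds.

Now consider the case where we have two annihilation operators to the left of one of the creation operators:

$$\begin{aligned}
p^\dagger q^\dagger\ldots rst^\dagger\ldots uv &= p^\dagger q^\dagger\ldots r\wick{\c s \c t^\dagger}\ldots uv - p^\dagger q^\dagger\ldots rt^\dagger s\ldots uv\\
&= p^\dagger q^\dagger\ldots r\wick{\c s \c t^\dagger}\ldots uv - p^\dagger q^\dagger\ldots\wick{\c r \c t^\dagger}s\ldots uv + p^\dagger q^\dagger\ldots t^\dagger rs\ldots uv\\
&= n[p^\dagger q^\dagger\ldots r\wick{\c s \c t^\dagger}\ldots uv] + n[p^\dagger q^\dagger\ldots\wick{\c r s \c t^\dagger}\ldots uv] + n[p^\dagger q^\dagger\ldots rst^\dagger\ldots uv]
\end{aligned}$$

again satisfying Wick's theorem, since all other contractions vanish. This procedure can be continued for all pairs of operators out of normal order.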


5. Comprehensive proof of Wick’s theorem

We shall prove this theorem in three steps. We first prove a lemma L1 which expressesan arbitrary normal-product, multiplied on the right by a single operator, in terms of thenormal-product and normal-products with pairings of all the operators involved. Next,we shall generalize this lemma (L2) for normal-products with contraction and, finally,we shall use the two lemmas to prove the general theorem.

L1:

n[M1M2...Mk]Ml = n[M1M2...MkMl] +k∑i=1

n[M1...Mi ...MkMl]

ConsiderMl is an annihilator, say a, then indeed all the normal-product with contractionappearing on the right hand side vanishes. And we get

n[M1M2...Mk]a = n[M1M2...Mka]

Since a is an annihilator, it can be taken inside the normal-product. We can thus assumethat Ml is a creator say b†. Moreover, without any loss of generality, we can furtherassume that all the operators Mi , i = 1, ..., k are annihilators. Since one can easily extendthis special case to a general case as follows. We simply multiply from the left both sidesof L1 with the product of pertinent creation operator. These being to the left of all theoperators may be brought inside of all the normal-products. Then we can add to the righthand side the terms in which these added creators are contracted, one by one, with thelast operator in the product, b†. All these terms vanish, since a contraction of two creatorsvanishes, and will not change the validity of our identity. Finally, we may rearrange theorder of the operator Mi , i = 1, ..., k in each normal-product as desired. Moreover, anypermutation of the operatorsMi , i = 1, ..., k will not reverse the ordering of the contratedpairs, since the only contraction present is with the right most operator in the product,which is not affected by the permutation.It thus remains to prove L1 for the special case in which M1,M2, ...,Mk are annihilators,say a1, a2, ..., ak, and Ml is the creator b†, i.e.,

\[
n[a_1 a_2 \cdots a_k]\, b^\dagger = n[a_1 a_2 \cdots a_k b^\dagger] + \sum_{i=1}^{k} n[a_1 \cdots \underbracket{a_i \cdots a_k b^\dagger}] \tag{33}
\]

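We can now use induction, since the above equation is clearly valid for $k = 1$:

\[
n[a_1]\, b^\dagger = n[a_1 b^\dagger] + n[\underbracket{a_1 b^\dagger}], \qquad \text{i.e.,} \qquad a_1 b^\dagger = -b^\dagger a_1 + \delta_{a_1 b},
\]

which is just the anticommutation relation. We suppose that Eq. (33) is valid for $k = N \geq 1$ and prove that it is also valid for $k = N + 1$.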


To do so, we multiply Eq. (33), written for $k = N$, by an arbitrary annihilator, say $a_0$, from the left:

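\[
a_0\, n[a_1 a_2 \cdots a_N]\, b^\dagger = a_0\, n[a_1 a_2 \cdots a_N b^\dagger] + \sum_{i=1}^{N} a_0\, n[a_1 \cdots \underbracket{a_i \cdots a_N b^\dagger}] \tag{34}
\]

Consider now the left-hand side of Eq. (34). Since all the operators in the normal-product are annihilators, we can bring $a_0$ into the normal-product:

\[
a_0\, n[a_1 a_2 \cdots a_N]\, b^\dagger = n[a_0 a_1 a_2 \cdots a_N]\, b^\dagger \tag{35}
\]

Similarly, we can rewrite all the terms under the summation symbol on the right-hand side of Eq. (34), obtaining

\[
\sum_{i=1}^{N} a_0\, n[a_1 \cdots \underbracket{a_i \cdots a_N b^\dagger}] = \sum_{i=1}^{N} n[a_0 a_1 \cdots \underbracket{a_i \cdots a_N b^\dagger}] \tag{36}
\]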

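Finally, the first term on the right-hand side of Eq. (34) can be rearranged to the form

\[
a_0\, n[a_1 a_2 \cdots a_N b^\dagger] = (-1)^N a_0 b^\dagger\, n[a_1 a_2 \cdots a_N] = (-1)^N \left( n[a_0 b^\dagger] + \underbracket{a_0 b^\dagger} \right) n[a_1 a_2 \cdots a_N] \tag{37}
\]

where in the last equality we have used the definition of a contraction. Furthermore,

\[
n[a_0 b^\dagger]\, n[a_1 a_2 \cdots a_N] = -b^\dagger a_0 a_1 \cdots a_N = -n[b^\dagger a_0 a_1 \cdots a_N]
\]

\[
\underbracket{a_0 b^\dagger}\, n[a_1 a_2 \cdots a_N] = n[\underbracket{a_0 b^\dagger}\, a_1 a_2 \cdots a_N] = (-1)^N n[\underbracket{a_0 a_1 \cdots a_N b^\dagger}]
\]

Substituting the above two equations into Eq. (37), we get

\[
a_0\, n[a_1 a_2 \cdots a_N b^\dagger] = (-1)^{N+1} n[b^\dagger a_0 a_1 \cdots a_N] + (-1)^{2N} n[\underbracket{a_0 a_1 \cdots a_N b^\dagger}]
\]

\[
a_0\, n[a_1 a_2 \cdots a_N b^\dagger] = n[a_0 a_1 \cdots a_N b^\dagger] + n[\underbracket{a_0 a_1 \cdots a_N b^\dagger}] \tag{38}
\]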

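Now substituting Eqs. (35), (36), and (38) into Eq. (34), we finally get

\[
n[a_0 a_1 a_2 \cdots a_N]\, b^\dagger = n[a_0 a_1 \cdots a_N b^\dagger] + n[\underbracket{a_0 a_1 \cdots a_N b^\dagger}] + \sum_{i=1}^{N} n[a_0 a_1 \cdots \underbracket{a_i \cdots a_N b^\dagger}]
\]

\[
n[a_0 a_1 a_2 \cdots a_N]\, b^\dagger = n[a_0 a_1 \cdots a_N b^\dagger] + \sum_{i=0}^{N} n[a_0 a_1 \cdots \underbracket{a_i \cdots a_N b^\dagger}] \tag{39}
\]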
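The base case and the first induction step of L1 can be checked by brute force on a small Fock space. The sketch below is only an illustration under stated assumptions: determinants are stored as bit strings over three spinorbitals, the sign convention is the usual one (a factor of $-1$ for every occupied spinorbital preceding the one acted upon), and the helper names `apply`, `apply_string`, and `combine` are invented for this example. For $k = 2$, Eq. (33) reads $a_1 a_2 b^\dagger = b^\dagger a_1 a_2 + \delta_{a_2 b}\, a_1 - \delta_{a_1 b}\, a_2$, where the minus sign in the last term comes from moving $b^\dagger$ next to $a_1$.

```python
N_ORB = 3  # number of spinorbitals in the toy Fock space (an assumption of this sketch)


def apply(op, state):
    """Apply one operator to a state stored as {occupation bitmask: coefficient}.

    op is ("c", p) for the creator of spinorbital p or ("a", p) for its annihilator.
    """
    kind, p = op
    out = {}
    for bits, coeff in state.items():
        occupied = (bits >> p) & 1
        if (kind == "c") == bool(occupied):
            continue  # creating on a filled or annihilating an empty spinorbital gives 0
        below = bin(bits & ((1 << p) - 1)).count("1")  # occupied spinorbitals before p
        sign = -1 if below % 2 else 1
        new_bits = bits ^ (1 << p)
        out[new_bits] = out.get(new_bits, 0) + sign * coeff
    return out


def apply_string(ops, state):
    """Apply a product of operators; the rightmost operator acts first."""
    for op in reversed(ops):
        state = apply(op, state)
    return state


def combine(*terms):
    """Linear combination of (factor, state) pairs with zero coefficients dropped."""
    total = {}
    for factor, state in terms:
        for bits, coeff in state.items():
            total[bits] = total.get(bits, 0) + factor * coeff
    return {bits: coeff for bits, coeff in total.items() if coeff != 0}


def delta(p, q):
    return 1 if p == q else 0


def base_case_holds(p, q):
    """Check a_p b_q^+ = -b_q^+ a_p + delta_pq on every basis determinant."""
    for bits in range(1 << N_ORB):
        ket = {bits: 1}
        lhs = combine((1, apply_string([("a", p), ("c", q)], ket)))
        rhs = combine((-1, apply_string([("c", q), ("a", p)], ket)), (delta(p, q), ket))
        if lhs != rhs:
            return False
    return True


def k2_case_holds(p1, p2, q):
    """Check a_1 a_2 b^+ = b^+ a_1 a_2 + delta_{2b} a_1 - delta_{1b} a_2."""
    for bits in range(1 << N_ORB):
        ket = {bits: 1}
        lhs = combine((1, apply_string([("a", p1), ("a", p2), ("c", q)], ket)))
        rhs = combine(
            (1, apply_string([("c", q), ("a", p1), ("a", p2)], ket)),
            (delta(p2, q), apply(("a", p1), ket)),
            (-delta(p1, q), apply(("a", p2), ket)),
        )
        if lhs != rhs:
            return False
    return True
```

Calling `base_case_holds(p, q)` and `k2_case_holds(p1, p2, q)` for all index choices in `range(N_ORB)` tests the operator identities on every determinant of the toy space; the bit-string representation was chosen only because it keeps the signs explicit.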

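We now generalize L1 to the case of normal-products with contractions.

L2:

\[
n[M_1 M_2 \cdots \underbracket{M_i \cdots\cdots\,} \cdots M_k]\, M_l = n[M_1 M_2 \cdots \underbracket{M_i \cdots\cdots\,} \cdots M_k M_l] + \sum_{j=1,\, j \notin C}^{k} n[M_1 M_2 \cdots \underbracket{M_i \cdots\cdots\,} \cdots \underbracket{M_j \cdots M_k M_l}]
\]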


where $C$ designates the index set of those operators $M_i$, $i = 1, ..., k$, which are already contracted in the normal-product on the left-hand side. Thus, in the normal-product on the left-hand side and in the first term on the right-hand side only the operators $M_i$, $i \in C$, are contracted, while in the last term there is an additional contraction involving the last operator $M_l$ and one of the yet unpaired operators $M_j$, $j = 1, ..., k$; $j \notin C$. The proof of this lemma is very easy when one realizes that L2 reduces to L1 when $C$ is empty. For nonempty $C$, the contracted pairs with indices in $C$ are numbers, so they can be taken out of the normal-product on the left-hand side and out of every normal-product on the right-hand side (with the same sign in each term), and what remains is L1 for the uncontracted operators.

We are now ready to prove Wick’s theorem. We shall again use mathematical induction, since from the definition of a contraction the theorem holds for $N = 2$,

\[
M_1 M_2 = n[M_1 M_2] + n[\underbracket{M_1 M_2}] = n[M_1 M_2] + \underbracket{M_1 M_2}
\]

and, trivially, for $N = 1$. We thus assume its validity for $N \geq 2$ and prove that it is then also valid for $N + 1$.

Indeed, multiplying Wick’s theorem for $N$ operators by an arbitrary creation or annihilation operator $M_{N+1}$ from the right, we obtain

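\[
\begin{aligned}
M_1 M_2 \cdots M_N M_{N+1} ={}& n[M_1 M_2 \cdots M_N]\, M_{N+1} + \sum_{1 \le i < j \le N} n[M_1 \cdots \underbracket{M_i \cdots M_j} \cdots M_N]\, M_{N+1} \\
&+ \cdots + \sum n[M_1 \cdots M_{k-1} M_k M_{k+1} \cdots M_N]_{\text{all but } M_k \text{ contracted}}\, M_{N+1} \\
&+ \Big( \sum n[M_1 \cdots M_N]_{\text{fully contracted}}\, M_{N+1} \Big)
\end{aligned}
\tag{40}
\]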

where the last term (enclosed in parentheses), in which all the operators inside the normal-product are contracted, is present only if $N$ is even, while the last unbracketed term, in which all but one operator are paired in the normal-product, is the last term when $N$ is odd.

Using now L1 to express the first term on the right-hand side of Eq. (40) and, similarly, L2 for all the subsequent terms, we get

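\[
\begin{aligned}
M_1 M_2 \cdots M_N M_{N+1} ={}& n[M_1 M_2 \cdots M_N M_{N+1}] + \sum_{i=1}^{N} n[M_1 \cdots \underbracket{M_i \cdots M_N M_{N+1}}] \\
&+ \sum_{1 \le i < j \le N} n[M_1 \cdots \underbracket{M_i \cdots M_j} \cdots M_N M_{N+1}] + \sum_{1 \le i < j \le N}\, \sum_{k \ne i,j} n[M_1 \cdots \underbracket{M_i \cdots M_j} \cdots \underbracket{M_k \cdots M_N M_{N+1}}] \\
&+ \cdots + \sum n[M_1 \cdots M_{k-1} M_k M_{k+1} \cdots M_N M_{N+1}]_{\text{all but } M_k,\, M_{N+1} \text{ contracted}} \\
&+ \sum n[M_1 \cdots M_{k-1} \underbracket{M_k M_{k+1} \cdots M_N M_{N+1}}]_{\text{all others contracted}} + \Big( \sum n[M_1 \cdots M_N M_{N+1}]_{M_1, \ldots, M_N \text{ fully contracted}} \Big)
\end{aligned}
\tag{41}
\]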

The first two terms on the right-hand side of Eq. (41) originate from the first term on the right-hand side of Eq. (40), and the third and fourth terms of Eq. (41) originate from the second term of Eq. (40). For $N$ even, the terms in the last line contain all possible normal-products with all but one operator contracted, since $(N + 1)$ is then odd. Conversely, for odd $N$, Eq. (41) contains all possible fully contracted terms. Consequently, Wick’s theorem also holds for $(N + 1)$ operators and thus in general.

6. Particle-hole formalism

Instead of referring all SDs and their matrix elements back to the vacuum state

\[
|I\rangle = |a_1 a_2 \cdots a_N\rangle = a_1^\dagger a_2^\dagger \cdots a_N^\dagger |\mathrm{vac}\rangle
\]

it is more convenient to begin with a fixed reference state, also called the Fermi vacuum in contrast with the physical vacuum $|\mathrm{vac}\rangle$,

\[
|0\rangle \equiv |\Phi_0\rangle = |ijk \cdots n\rangle
\]

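and define other SDs relative to it, e.g.,

\[
\begin{aligned}
|\Phi_i^a\rangle &\equiv a^\dagger i\, |\Phi_0\rangle = |ajk \cdots n\rangle && \text{(single excitation)}, \\
|\Phi_{ij}^{ab}\rangle &\equiv a^\dagger b^\dagger j i\, |\Phi_0\rangle = |abk \cdots n\rangle && \text{(double excitation)}, \\
|\Phi_i\rangle &\equiv i\, |\Phi_0\rangle = |jk \cdots n\rangle && \text{(electron removal)}, \\
|\Phi^a\rangle &\equiv a^\dagger |\Phi_0\rangle = |aijk \cdots n\rangle && \text{(electron attachment)},
\end{aligned}
\]

etc. Notice also that

\[
|\Phi_{ij}^{ab}\rangle = |\Phi_{ji}^{ba}\rangle = -|\Phi_{ij}^{ba}\rangle = -|\Phi_{ji}^{ab}\rangle
\]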

The spinorbitals $i, j, k, \ldots, n$ that are occupied in $|0\rangle$ are called hole states (they appear explicitly only when an electron is excited out of them by, e.g., $i$, creating a hole in the reference state), while the other spinorbitals $a, b, \ldots$ are called particle states. We shall use the letters $i, j, k, \ldots$ to indicate indices restricted to hole states, the letters $a, b, c, \ldots$ to indicate indices restricted to particle states, and the letters $p, q, r, \ldots$ to indicate any state (either hole or particle, without restriction). We assume an energy level separating the filled (hole) states (present in $|0\rangle$) from the empty (particle) states. This energy level is called the Fermi level. Using this notation, we find that

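\[
i^\dagger |0\rangle = 0, \qquad a |0\rangle = 0, \qquad \langle 0| i = 0, \qquad \langle 0| a^\dagger = 0
\]

It is convenient to define a new set of operators, sometimes called pseudo-creation and pseudo-annihilation operators (or quasi-operators), via

\[
b_i = a_i^\dagger, \quad b_i^\dagger = a_i, \qquad b_a = a_a, \quad b_a^\dagger = a_a^\dagger
\]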


Thus $b_i^\dagger$ creates a vacancy in state $i$ while $b_i$ eliminates such a vacancy. The particle pseudo-operators are identical to the ordinary particle operators, while the hole pseudo-creation and pseudo-annihilation operators are equivalent to the ordinary hole annihilation and creation operators, respectively. The motivation for this notation is that all pseudo-annihilation operators operating to the right on the Fermi vacuum state give zero and all pseudo-creation operators operating to the left on the Fermi vacuum state also give zero,

\[
b_p |0\rangle = 0, \qquad \langle 0| b_p^\dagger = 0
\]

7. Normal products and Wick’s theorem relative to the Fermi vacuum

Now we modify the concepts of normal products, contractions, and Wick’s theorem so that they relate to a reference state (the Fermi vacuum) instead of the physical vacuum. A product of creation and/or annihilation operators is said to be in normal order relative to the Fermi vacuum $|0\rangle \equiv |\Phi_0\rangle = |ijk \cdots n\rangle$ if all pseudo-creation operators $a^\dagger, \ldots$ and $i, \ldots$ are to the left of all pseudo-annihilation operators $a, \ldots$ and $i^\dagger, \ldots$. Using the notation

\[
b_i^\dagger = a_i = i, \quad b_i = a_i^\dagger = i^\dagger, \quad b_a^\dagger = a_a^\dagger = a^\dagger, \quad b_a = a_a = a
\]

the product is in normal order if all the $b_p^\dagger$ operators are to the left of all the $b_p$ operators. Since

\[
b_p |0\rangle = 0, \qquad \langle 0| b_p^\dagger = 0
\]

the Fermi-vacuum expectation value of a normal-ordered product of such operators vanishes. To distinguish the new type of normal product from the previous type, it is often written as

\[
N[ABC \cdots] = (-1)^\sigma\, b_p^\dagger b_q^\dagger \cdots b_u b_v,
\]

instead of $n[ABC \cdots]$, which is used when the ordering is relative to the physical vacuum. The power $\sigma$ of the phase factor is the parity of the permutation from $ABC \cdots$ to $b_p^\dagger b_q^\dagger \cdots b_u b_v$. Contractions relative to the Fermi vacuum will be denoted by brackets above the operators instead of below, and we have

\[
\overbracket{A B} \equiv AB - N[AB]
\]

So for contractions relative to the Fermi vacuum we find that the only nonzero contractions are

\[
\overbracket{i^\dagger j} = \delta_{ij}, \qquad \overbracket{a b^\dagger} = \delta_{ab}
\]
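These statements are easy to confirm numerically. Because the Fermi-vacuum expectation value of $N[AB]$ vanishes and a contraction is a number, the contraction of $A$ and $B$ equals $\langle 0|AB|0\rangle$. The following sketch, which assumes a toy model with two hole and two particle spinorbitals and uses made-up helper names (`apply`, `apply_string`, `overlap`, `expectation`), evaluates these expectation values on bit-string determinants.

```python
HOLES = (0, 1)      # spinorbitals occupied in the Fermi vacuum (toy choice)
PARTICLES = (2, 3)  # spinorbitals empty in the Fermi vacuum (toy choice)


def apply(op, state):
    """Apply ("c", p) (creator) or ("a", p) (annihilator) to {bitmask: coefficient}."""
    kind, p = op
    out = {}
    for bits, coeff in state.items():
        occupied = (bits >> p) & 1
        if (kind == "c") == bool(occupied):
            continue  # the operator annihilates this determinant
        below = bin(bits & ((1 << p) - 1)).count("1")
        sign = -1 if below % 2 else 1
        new_bits = bits ^ (1 << p)
        out[new_bits] = out.get(new_bits, 0) + sign * coeff
    return out


def apply_string(ops, state):
    """Apply a product of operators; the rightmost operator acts first."""
    for op in reversed(ops):
        state = apply(op, state)
    return state


def overlap(bra, ket):
    """Scalar product of two states with real coefficients."""
    return sum(coeff * ket.get(bits, 0) for bits, coeff in bra.items())


# |0> = a_0^+ a_1^+ |vac>, i.e. the determinant with both hole spinorbitals filled
FERMI_VACUUM = {sum(1 << i for i in HOLES): 1}


def expectation(ops):
    """Fermi-vacuum expectation value <0| ops |0> of an operator string."""
    return overlap(FERMI_VACUUM, apply_string(ops, FERMI_VACUUM))


# contraction of i^+ with j for hole states, and of a with b^+ for particle states
hole_contractions = {(i, j): expectation([("c", i), ("a", j)]) for i in HOLES for j in HOLES}
particle_contractions = {
    (a, b): expectation([("a", a), ("c", b)]) for a in PARTICLES for b in PARTICLES
}
```

The two dictionaries reproduce $\delta_{ij}$ and $\delta_{ab}$, while the reversed orderings ($i j^\dagger$ for holes, $a^\dagger b$ for particles) give zero.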

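A normal product with contractions is also defined in the same way as in the case where it is relative to the physical vacuum. If, on the left-hand side, $R$ is contracted with $T$, $S$ with $V$, and so on, then

\[
N[ABC \cdots R \cdots S \cdots T \cdots V \cdots] = (-1)^\sigma\, \overbracket{R T}\;\, \overbracket{S V} \cdots N[ABC \cdots]
\]

where $N[ABC \cdots]$ on the right-hand side contains only the uncontracted operators and $\sigma$ is the parity of the permutation that moves the contracted pairs to the front.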


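\[
\begin{array}{lll}
\text{Quantity} & \text{True vacuum formalism} & \text{Fermi vacuum formalism} \\
\hline
\text{vacuum state} & |\mathrm{vac}\rangle & |0\rangle \text{ or } |\Phi_0\rangle \\
\text{creation operator} & b_a^\dagger = a_a^\dagger = a^\dagger & b_i^\dagger = a_i = i \\
\text{annihilation operator} & b_a = a_a = a & b_i = a_i^\dagger = i^\dagger \\
\text{normal product of operators} & n[ABC \cdots] & N[ABC \cdots]
\end{array}
\]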

If we recall the proof of Wick’s theorem, we see immediately that the same proof applies to the particle-hole version of the theorem. We only have to replace everywhere the true vacuum quantities with the corresponding Fermi vacuum quantities, as indicated in the table given above. We can thus immediately write the particle-hole form of Wick’s theorem as follows:

\[
ABCD \cdots = N[ABCD \cdots] + \sum (\text{all possible contractions})
\]

As indicated, the sum is over all possible contractions of one pair, two pairs, etc. The operators are particle-hole operators defined with respect to $|0\rangle$. The usefulness of this theorem is at least partly due to the fact that the Fermi vacuum expectation value of a normal product vanishes unless it is fully contracted, so that

\[
\langle 0| A \cdots B \cdots C \cdots D \cdots |0\rangle = \sum \langle 0| N[A \cdots B \cdots C \cdots D \cdots] |0\rangle
\]

where the sum is over all fully contracted normal products. From here on, unless explicitly stated otherwise, whenever we talk of the vacuum we will be referring to the Fermi vacuum, and whenever we talk of normal products or contractions we will be referring to these concepts relative to the Fermi vacuum.

8. Generalized Wick’s theorem

To complete this phase of the analysis, we need one more theorem, the generalized Wick’s theorem, dealing with products of normal products of operators. This is needed since we shall have to evaluate matrix elements of the normal-product operator $W$ between various Slater determinants (not just the reference SD), as for example in

\[
\langle \Phi_{ij\cdots}^{ab\cdots} | W | \Phi_{lm\cdots}^{de\cdots} \rangle = \langle 0| i^\dagger j^\dagger \cdots b a\; W\; d^\dagger e^\dagger \cdots m l |0\rangle
\]

Here we have a vacuum expectation value of a product of three operator strings, each of which separately is in normal-product form, since

\[
N[i^\dagger j^\dagger \cdots b a] = i^\dagger j^\dagger \cdots b a, \qquad N[d^\dagger e^\dagger \cdots m l] = d^\dagger e^\dagger \cdots m l
\]


The generalized Wick’s theorem states that a general product of creation and annihilation operators in which some operator strings are already in normal-product form is given as the overall normal product of all the creation and annihilation operators plus the sum of all overall normal products with contractions, except that, since contractions of pairs of operators that are already in normal order vanish, no contractions between pairs of operators within the same original normal product need be included:

\[
N[A_1 A_2 \cdots]\, N[B_1 B_2 \cdots]\, N[C_1 C_2 \cdots] = N[A_1 A_2 \cdots B_1 B_2 \cdots C_1 C_2 \cdots] + \sum{}^{\prime}\, N[(\text{all possible contractions})]
\]

where the sum is over contractions of one pair at a time, two pairs, etc., and the prime on the summation sign indicates that no “internal” contractions (between operators within the same original normal product) are included.

Note that the case in which the original product contains some individual creation or annihilation operators not within any normal product is also included in the scope of the generalized Wick’s theorem, since for such operators $A = N[A]$.

9. Normal-product form of operators with respect to Fermi’s vacuum

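One-electron operators: Let us consider a one-electron operator

\[
F = \sum_{pq} \langle p|f|q\rangle\, p^\dagger q \tag{42}
\]

Using Wick’s theorem,

\[
p^\dagger q = N[p^\dagger q] + \overbracket{p^\dagger q}
\]

The contracted term vanishes unless $p$ and $q$ are the same hole state (call it $i$), when it is equal to 1, and thus

\[
F = \sum_{pq} \langle p|f|q\rangle\, N[p^\dagger q] + \sum_i \langle i|f|i\rangle = F_N + \sum_i \langle i|f|i\rangle
\]

where $F_N$ is the normal-product form of the operator of Eq. (42),

\[
F_N = \sum_{pq} \langle p|f|q\rangle\, N[p^\dagger q]
\]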

The expectation value of $F_N$ in the Fermi vacuum is zero, i.e., $\langle 0|F_N|0\rangle = 0$. To show it, consider the four possible combinations of particle and hole operators in $p^\dagger q$.

Case I: Both operators correspond to particle states; then $\langle 0|N[a^\dagger b]|0\rangle = \langle 0|a^\dagger b|0\rangle = 0$, since there is no particle state to annihilate in $|0\rangle$.

Case II: Both operators correspond to hole states; then $\langle 0|N[i^\dagger j]|0\rangle = -\langle 0|j i^\dagger|0\rangle = 0$.

Case III: The creation operator corresponds to a hole state and the annihilation operator to a particle state, such that $\langle 0|N[i^\dagger a]|0\rangle = \langle 0|i^\dagger a|0\rangle = 0$.

Case IV: The creation operator corresponds to a particle state and the annihilation operator to a hole state, such that $\langle 0|N[a^\dagger i]|0\rangle = \langle 0|a^\dagger i|0\rangle = \langle 0|\Phi_i^a\rangle = 0$.

Therefore, we have

\[
\langle 0|F|0\rangle = \sum_i \langle i|f|i\rangle, \qquad F = F_N + \langle 0|F|0\rangle
\]
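The result $\langle 0|F|0\rangle = \sum_i \langle i|f|i\rangle$ can be illustrated on a small example. The sketch below makes several assumptions that are not part of the notes: a toy space of two hole and two particle spinorbitals, an arbitrary symmetric matrix `F_MATRIX` standing in for the integrals $\langle p|f|q\rangle$, and invented helper names. Besides the expectation value, it projects $F|0\rangle$ onto the singly excited determinants $|\Phi_i^a\rangle = a^\dagger i|0\rangle$, which picks out the coefficients $f_{ai}$ of the $a^\dagger i$ terms.

```python
HOLES = (0, 1)      # occupied in the Fermi vacuum (toy choice)
PARTICLES = (2, 3)  # empty in the Fermi vacuum (toy choice)
N_ORB = 4

# Hypothetical one-electron integrals <p|f|q>; any symmetric matrix would do.
F_MATRIX = [
    [1.0, 0.2, 0.3, 0.4],
    [0.2, 2.0, 0.5, 0.6],
    [0.3, 0.5, 3.0, 0.7],
    [0.4, 0.6, 0.7, 4.0],
]


def apply(op, state):
    """Apply ("c", p) (creator) or ("a", p) (annihilator) to {bitmask: coefficient}."""
    kind, p = op
    out = {}
    for bits, coeff in state.items():
        occupied = (bits >> p) & 1
        if (kind == "c") == bool(occupied):
            continue  # the operator annihilates this determinant
        below = bin(bits & ((1 << p) - 1)).count("1")
        sign = -1 if below % 2 else 1
        new_bits = bits ^ (1 << p)
        out[new_bits] = out.get(new_bits, 0) + sign * coeff
    return out


def apply_string(ops, state):
    """Apply a product of operators; the rightmost operator acts first."""
    for op in reversed(ops):
        state = apply(op, state)
    return state


def overlap(bra, ket):
    return sum(coeff * ket.get(bits, 0) for bits, coeff in bra.items())


def apply_F(state):
    """Apply F = sum_pq f_pq p^+ q to a state."""
    out = {}
    for p in range(N_ORB):
        for q in range(N_ORB):
            for bits, coeff in apply_string([("c", p), ("a", q)], state).items():
                out[bits] = out.get(bits, 0.0) + F_MATRIX[p][q] * coeff
    return out


FERMI_VACUUM = {sum(1 << i for i in HOLES): 1}
F_on_vacuum = apply_F(FERMI_VACUUM)

# <0|F|0>, expected to equal the sum of f_ii over hole states
expectation_F = overlap(FERMI_VACUUM, F_on_vacuum)

# <Phi_i^a|F|0> with |Phi_i^a> = a^+ i |0>, expected to equal f_ai
singles = {
    (a, i): overlap(apply_string([("c", a), ("a", i)], FERMI_VACUUM), F_on_vacuum)
    for a in PARTICLES
    for i in HOLES
}
```

With the matrix above, `expectation_F` is $f_{00} + f_{11} = 3$, so $F_N = F - 3$ has a vanishing Fermi-vacuum expectation value in this toy model.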

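Note that $F_N$ contains hole-hole, particle-particle, and hole-particle terms,

\[
\begin{aligned}
F_N &= \sum_{ij} f_{ij}\, N[i^\dagger j] + \sum_{ab} f_{ab}\, N[a^\dagger b] + \sum_{ia} f_{ia}\, N[i^\dagger a] + \sum_{ai} f_{ai}\, N[a^\dagger i] \\
&= -\sum_{ij} f_{ij}\, j i^\dagger + \sum_{ab} f_{ab}\, a^\dagger b + \sum_{ia} f_{ia}\, i^\dagger a + \sum_{ai} f_{ai}\, a^\dagger i
\end{aligned}
\]

Two-electron operators: Next consider a two-electron operator,

\[
G = \frac{1}{2} \sum_{pqrs} \langle pq|g|rs\rangle\, p^\dagger q^\dagger s r \tag{43}
\]

The derivation of the normal-product form of this operator is left for homework.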

VI. DENSITY-FUNCTIONAL THEORY

The solution of Schrödinger’s equation in the clamped-nuclei approximation is a $3N$-dimensional function (not including the spin degrees of freedom), where $N$ is the number of electrons,

\[
H \Psi(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N) = E\, \Psi(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N).
\]

Each wave function gives a unique electron density

\[
\rho(\mathbf{r}) = N \sum_{s, s_2, \ldots, s_N} \int d^3 r_2 \cdots d^3 r_N\, |\Psi(\mathbf{x}, \mathbf{x}_2, \ldots, \mathbf{x}_N)|^2. \tag{44}
\]

Note that since $\Psi$ is antisymmetric, it does not matter which particle is left out of the integrations. We see immediately that

\[
\int d^3 r\, \rho(\mathbf{r}) = N.
\]

Another commonly used symbol denoting $\rho$ is $n$. Obviously, solutions of electronic structure problems would be much easier if one could replace $\Psi$ by $\rho$, an object that is only three-dimensional. While it may initially seem impossible, attempts to do so go back to the early days of quantum mechanics, and there is now a solid mathematical foundation for such an approach. Here is a summary of major historical developments in this field.


1927 Llewellyn H. Thomas proposes to apply the expressions coming from the quantum statistical treatment of the uniform electron gas (the latter can be found in most statistical mechanics textbooks) to atoms. While the electron density in atoms is obviously nonuniform, Thomas assumed that it is uniform locally.

1928 Enrico Fermi independently comes up with a similar idea.

1930 Paul Dirac extends the theory to include the so-called exchange terms. This approach is now called the Thomas-Fermi-Dirac (TFD) method. The essence of this theory is that the energy of a system is written as a functional of the electron density, with all terms in the functional originating from the quantum statistical treatment of the electron gas.

1935 Carl Weizsäcker proposes a correction to the kinetic energy term in TFD.

1951 John Slater develops a method which is a combination of the HF method and TFD; in particular, the density is computed from the Slater determinant and the method solves one-electron equations similar to the HF equations. The Slater method was in many respects similar to the Kohn-Sham method discussed below, but was missing a rigorous derivation.

1964 Pierre Hohenberg and Walter Kohn (HK) prove that there exists a functional of $\rho$ which upon minimization gives the exact ground-state energy. The method is called the density-functional theory (DFT). However, HK say nothing about how to construct such a functional.

1965 Kohn and Lu Jeu Sham (KS) derive one-electron equations similar to the HF equations that can be solved for spinorbitals, which then give the $\rho$ that minimizes the density functional. The functional used in the original KS method has several terms taken from the TFD method, and therefore this approach is now called the local-density approximation (LDA).

1998 Walter Kohn receives the Nobel Prize in Chemistry for DFT.

2017 Hundreds of approximations to the unknown exact density functional have been proposed by now, and DFT is the most used computational method in many fields of physics and chemistry.

A. Thomas-Fermi-Dirac method

Although it is often stated that the Thomas-Fermi-Dirac (TFD) method originates from quantum statistical mechanics, no statistical approach is needed to derive this method in its basic form. The reason is that the statistical treatment is taken in the limit of temperature $T \to 0$, where one can use non-statistical quantum mechanics.

The Thomas-Fermi (TF) theory expressions come from considering a system of noninteracting spin-1/2 fermions of mass equal to the electron mass placed in a cubic box. Then one adds the interelectron Coulomb interactions as a first-order correction, neglecting at this point the permutational symmetry of the wave function (as in the Hartree approach). The TFD extension fixes this deficiency, i.e., computes the first-order correction accounting for antisymmetry.

Let us consider a system of $N$ noninteracting spin-1/2 fermions of mass equal to the electron mass $m$ placed in a cubic box of side $L$ (volume $V = L^3$). Since the Hamiltonian is separable into Hamiltonians of individual particles, the solution of the Schrödinger equation for such a system reduces to solutions of single-particle equations, and then to separate solutions for each dimension, giving orbitals

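\[
\psi_{n_x,n_y,n_z}(\mathbf{r}) = \psi_x(x)\, \psi_y(y)\, \psi_z(z) = \sqrt{\frac{8}{V}}\, \sin(k_x x) \sin(k_y y) \sin(k_z z)
\]

and orbital energies

\[
\varepsilon_{n_x,n_y,n_z} = \frac{\pi^2 \hbar^2}{2 m L^2} \left( n_x^2 + n_y^2 + n_z^2 \right), \qquad n_x, n_y, n_z = 1, 2, \ldots
\]

where

\[
k_x = \frac{\pi}{L} n_x, \qquad k_y = \frac{\pi}{L} n_y, \qquad k_z = \frac{\pi}{L} n_z.
\]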

We assumed that the box extends from 0 to $L$ in each dimension and that the potential is zero inside the box and infinite outside, so that the boundary conditions are $\psi_x(0) = \psi_x(L) = 0$ and similarly for the other dimensions. One may also assume periodic boundary conditions, $\psi_x(x) = \psi_x(x + L)$ and similarly for the other dimensions; this assumption leads to the same results in the limit of a large number of particles.

We assume that each orbital corresponds to two spinorbitals, one with $s = 1/2$ and one with $s = -1/2$. The total wave function for the ground state of this system is then the Slater determinant built from the $N$ spinorbitals with the lowest energies. This means that each orbital energy level is doubly occupied or, in terms of spinorbitals, doubly degenerate (the highest occupied energy level is called the Fermi level). The total energy of the system is

\[
E = \sum_{n_x,n_y,n_z}^{\mathrm{occ}} \varepsilon_{n_x,n_y,n_z}. \tag{45}
\]

If we put a dot in a three-dimensional coordinate system at each point $(n_x, n_y, n_z)$, the part of the space with positive coordinates will be divided into cubes of side 1. For large $N$, the surface formed by the largest such values of $n_x, n_y, n_z$, limited by $\sqrt{n_x^2 + n_y^2 + n_z^2} \le r$ for some fixed, sufficiently large $r$, is well approximated by the surface of a sphere of radius $r$. The volume of the considered part of the space limited by this surface is $\frac{1}{8} \cdot \frac{4}{3} \pi r^3$, so the number of states inside this surface is $n_r = \frac{1}{3} \pi r^3$, where we multiplied by 2 to include spin degeneracy. The number of states in a shell $(r, r + dr)$ is therefore $dn_r = \pi r^2\, dr$. All these states have (approximately) the same orbital energy $\varepsilon_r = (\pi^2 \hbar^2 / 2 m L^2)\, r^2$. Thus, we can obtain the total energy of the system by integrating

\[
E = \int \varepsilon_r\, dn_r = \int_0^{r_F} \frac{\pi^2 \hbar^2}{2 m L^2}\, r^2\, \pi r^2\, dr = \frac{\pi^3 \hbar^2}{10 m L^2}\, r_F^5
\]

where $r_F$ denotes the radius corresponding to the Fermi level. This radius can be found from

\[
N = \frac{1}{3} \pi r_F^3
\]

which gives

\[
E = \frac{\pi^3 \hbar^2}{10 m L^2} \left( \frac{3N}{\pi} \right)^{5/3}.
\]

This energy can be further written in terms of the electron (number) density $\rho = N/V$ as

\[
E = \frac{\pi^3 \hbar^2}{10 m} \left( \frac{3N}{\pi L^3} \right)^{5/3} L^3 = \frac{\pi^3 \hbar^2}{10 m} \left( \frac{3\rho}{\pi} \right)^{5/3} V = C_F V \rho^{5/3} \tag{46}
\]

where $C_F = \frac{3}{10} \frac{\hbar^2}{m} (3\pi^2)^{2/3}$ is the so-called Fermi constant. We will later use atomic units, where this constant reduces to $C_F = \frac{3}{10} (3\pi^2)^{2/3} \approx 2.871$. Note that this energy is just the kinetic energy, the only energy in the case of a noninteracting fermion gas.
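How good is the replacement of the sum over box states by an integral? The sketch below, an illustration only, fills the lowest box orbitals with two electrons each and compares the exact sum of Eq. (45) with the large-$N$ expression $E = \frac{\pi^3 \hbar^2}{10 m L^2} (3N/\pi)^{5/3}$. Energies are measured in units of $\pi^2 \hbar^2 / (2 m L^2)$, in which an orbital energy is simply $n_x^2 + n_y^2 + n_z^2$ and the large-$N$ expression becomes $(\pi/5)(3N/\pi)^{5/3}$. The function names and the cutoff heuristic `n_max` are assumptions of this example.

```python
import math


def box_ground_state_energy(n_electrons):
    """Exact kinetic energy of N noninteracting electrons in a cubic box.

    Units: pi^2 hbar^2 / (2 m L^2). Each orbital holds two electrons.
    """
    r0 = (3 * n_electrons / math.pi) ** (1 / 3)  # large-N estimate of the Fermi radius
    n_max = int(1.2 * r0) + 3                    # heuristic cutoff for quantum numbers
    levels = sorted(
        nx * nx + ny * ny + nz * nz
        for nx in range(1, n_max + 1)
        for ny in range(1, n_max + 1)
        for nz in range(1, n_max + 1)
    )
    n_orbitals = n_electrons // 2
    # Orbitals left out have a quantum number > n_max, hence energy >= (n_max + 1)^2 + 2,
    # so the cutoff is adequate if the highest occupied level lies below that value.
    assert levels[n_orbitals - 1] < (n_max + 1) ** 2 + 2
    return 2 * sum(levels[:n_orbitals])


def large_n_energy(n_electrons):
    """Large-N (Thomas-Fermi) kinetic energy in the same units."""
    return math.pi / 5 * (3 * n_electrons / math.pi) ** (5 / 3)


ratio_small = box_ground_state_energy(1_000) / large_n_energy(1_000)
ratio_large = box_ground_state_energy(100_000) / large_n_energy(100_000)
```

The ratio approaches 1 from above as $N$ grows. The convergence is slow because the hard-wall box excludes the planes with a vanishing quantum number, a surface effect whose relative size falls off only as $1/r_F$.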

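For future reference, let us find the expressions for the Fermi energy and the Fermi wave vector. The Fermi energy is the orbital energy at $r_F$,

\[
\varepsilon_F = \frac{\pi^2 \hbar^2}{2 m L^2}\, r_F^2 = \frac{\pi^2 \hbar^2}{2 m L^2} \left( \frac{3N}{\pi} \right)^{2/3} = \frac{\pi^2 \hbar^2}{2 m} \left( \frac{3N}{\pi L^3} \right)^{2/3} = \frac{\pi^2 \hbar^2}{2 m} \left( \frac{3\rho}{\pi} \right)^{2/3} \tag{47}
\]

and the Fermi wave vector is

\[
k_F = \sqrt{2 m \varepsilon_F} = \pi \hbar \left( \frac{3\rho}{\pi} \right)^{1/3} = \hbar \left( 3\pi^2 \rho \right)^{1/3} \tag{48}
\]

(with the factor $\hbar$ kept, this quantity is the Fermi momentum; in atomic units the two coincide).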

Thomas and Fermi used the expression of Eq. (46) as the kinetic energy in their model even when it was applied to atoms, molecules, or solids, despite the fact that the electron density in such systems is obviously not constant. A critical assumption of the TF model is that the density can be assumed locally constant, the so-called local density approximation (LDA). Next, Thomas and Fermi moved to the interacting electron gas, i.e., added to the Hamiltonian the electron Coulomb repulsion term as a perturbation operator

\[
U_{ee} = \frac{1}{2} \sum_{i \neq j}^{N} \frac{e^2}{|\mathbf{r}_i - \mathbf{r}_j|} \tag{49}
\]

and included the first-order correction, i.e., the expectation value of $U_{ee} = G$ with the product of ground-state spinorbitals, in their energy expression. If we recall the derivation of Eq. (28),

\[
\langle \Psi | G \Psi \rangle = \frac{1}{2} \sum_{i,j=1}^{N} \left( g_{ijij} - g_{ijji} \right) \tag{50}
\]

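using just the product produces only the first term in this expression. This term can be written as

\[
\frac{1}{2} \sum_{i,j=1}^{N} g_{ijij} = \frac{1}{2} \sum_{i,j=1}^{N} \sum_{s_1, s_2} \iint d^3 r_1\, d^3 r_2\, \phi_i^*(\mathbf{x}_1) \phi_j^*(\mathbf{x}_2)\, \frac{e^2}{|\mathbf{r}_1 - \mathbf{r}_2|}\, \phi_i(\mathbf{x}_1) \phi_j(\mathbf{x}_2) \tag{51}
\]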

The sum over spinorbitals can be replaced by the electron density. To see it, let us write Eq. (44) for a Slater determinant. Due to the orthonormality of spinorbitals, each coordinate that is integrated over must appear in the same spinorbital in the bra and in the ket. Thus, the only surviving terms are the squared moduli of the individual spinorbitals depending on $\mathbf{x}$,

\[
\rho(\mathbf{r}) = \frac{N (N-1)!}{(\sqrt{N!})^2} \sum_s \sum_{i=1}^{N} \phi_i^*(\mathbf{x}) \phi_i(\mathbf{x}) = \sum_s \sum_{i=1}^{N} \phi_i^*(\mathbf{x}) \phi_i(\mathbf{x}) = 2 \sum_{i=1}^{N/2} \psi_i^*(\mathbf{r}) \psi_i(\mathbf{r}) \tag{52}
\]

where $1/\sqrt{N!}$ comes from the definition of the determinant and the factor $(N-1)!$ is the number of permutations (identical in the bra and ket) of spinorbitals other than $\phi_i(\mathbf{x})$. In the last step, we have summed over spin, recalling that pairs of spinorbitals are related to the same orbital, so that the sum runs over the $N/2$ doubly occupied orbitals. Using Eq. (52) we can write Eq. (51) as

\[
\frac{1}{2} \iint d^3 r_1\, d^3 r_2\, \rho(\mathbf{r}_1)\, \frac{e^2}{|\mathbf{r}_1 - \mathbf{r}_2|}\, \rho(\mathbf{r}_2) = J_H[\rho]. \tag{53}
\]

This term is known as the Hartree energy and is denoted by $J_H[\rho]$. It describes the Coulomb interaction of the electron density with itself.

For atoms, molecules, and solids, Thomas and Fermi included, of course, also the Coulomb interaction of electrons with nuclei

\[
V = -\sum_{a=1}^{N_{\mathrm{nuc}}} \sum_{i=1}^{N} \frac{Z_a e^2}{|\mathbf{r}_i - \mathbf{R}_a|} = \sum_{i=1}^{N} v(\mathbf{r}_i). \tag{54}
\]

Since this is a one-electron operator, its expectation value with the ground-state Slater determinant is

\[
\langle \Psi | V \Psi \rangle = \sum_{i=1}^{N} v_{ii} = \int \rho(\mathbf{r})\, v(\mathbf{r})\, d^3 r = V[\rho].
\]

Since we now have a different kinetic energy at each point of space, we have to average the expression (46) over the space, obtaining in this way the kinetic energy of the TF method

\[
T^{\mathrm{TF}} = \frac{1}{V} \int E\, d^3 r = C_F \int \rho^{5/3}(\mathbf{r})\, d^3 r
\]


The total energy expression in the TF method is therefore

\[
E^{\mathrm{TF}}[\rho] = T^{\mathrm{TF}}[\rho] + V[\rho] + J_H[\rho].
\]

This functional of $\rho$ can be minimized with respect to $\rho$, but we will not discuss such methods. For atoms, the functional has often been evaluated with densities obtained from the HF method.
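As an illustration of what evaluating the functional with a given density involves, the sketch below computes the three TF terms for the hydrogen 1s density $\rho(r) = e^{-2r}/\pi$ on a radial grid. Everything specific here is an assumption of the example: atomic units ($\hbar = e = m = 1$), a spherically symmetric density, a midpoint-rule grid, the function names, and the constant $C_F = \frac{3}{10}(3\pi^2)^{2/3}$, which is what the prefactor of Eq. (46) gives in atomic units. The Hartree energy of a spherical density is accumulated shell by shell: a thin shell of charge $q$ at radius $r$ interacts with the charge $Q$ inside it as $qQ/r$.

```python
import math

C_F = 0.3 * (3.0 * math.pi ** 2) ** (2.0 / 3.0)  # TF kinetic constant, atomic units


def hydrogen_1s_density(r):
    """Exact ground-state density of the hydrogen atom (atomic units)."""
    return math.exp(-2.0 * r) / math.pi


def tf_energy_terms(density, nuclear_charge=1.0, r_max=30.0, n_points=6000):
    """Return (T_TF, V, J_H) for a spherical density around a single nucleus."""
    h = r_max / n_points
    kinetic = attraction = hartree = 0.0
    charge_inside = 0.0
    for j in range(n_points):
        r = (j + 0.5) * h                     # midpoint grid avoids r = 0
        rho = density(r)
        shell_volume = 4.0 * math.pi * r * r * h
        shell_charge = rho * shell_volume
        kinetic += C_F * rho ** (5.0 / 3.0) * shell_volume
        attraction -= nuclear_charge * shell_charge / r
        # interaction with the enclosed charge plus the self-energy of the shell
        hartree += shell_charge * (charge_inside + 0.5 * shell_charge) / r
        charge_inside += shell_charge
    return kinetic, attraction, hartree


T_TF, V_NE, J_H = tf_energy_terms(hydrogen_1s_density)
E_TF = T_TF + V_NE + J_H
```

For this density the nuclear attraction and Hartree integrals are known analytically ($-1$ and $5/16$ hartree), which checks the quadrature, while the TF kinetic term comes out near 0.289 hartree, well below the exact kinetic energy of 0.5 hartree.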

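The TFD method is an extension of the TF method which includes the permutational symmetry in evaluating the expectation value of the $U_{ee}$ operator; the resulting additional term is sometimes called the Dirac exchange energy. In contrast to the Hartree term, which is valid for any set of orbitals, the Dirac term is explicitly computed with orbitals of the noninteracting gas. The exchange integral of HF theory can be written in terms of the one-particle density matrix

\[
K = \frac{1}{2} \sum_{i,j=1}^{N} g_{ijji} = \frac{1}{2} \sum_{i,j=1}^{N} \sum_{s_1, s_2} \iint d^3 r_1\, d^3 r_2\, \phi_i^*(\mathbf{x}_1) \phi_j^*(\mathbf{x}_2)\, \frac{e^2}{|\mathbf{r}_1 - \mathbf{r}_2|}\, \phi_j(\mathbf{x}_1) \phi_i(\mathbf{x}_2).
\]

Let us first sum over spin. We have

\[
\sum_{s_1} \psi_{i'}^*(1) \sigma(1)\, \psi_{j'}(1) \sigma'(1) \sum_{s_2} \psi_{i'}(2) \sigma(2)\, \psi_{j'}^*(2) \sigma'(2)
\]

where $i'$ and $j'$ are orbital indices. Note that $i'$ is coupled with the same $\sigma$ in both places, as it is the same spinorbital. Thus, if in the sum over $s_1$ we have, say, the $+-$ combination of spins, the same combination appears in the sum over $s_2$. Therefore, the only nonvanishing terms are $++$ and $--$, so we get an overall factor of 2 from the spin summation and we can write

\[
\begin{aligned}
K &= \iint d^3 r_1\, d^3 r_2 \sum_{i}^{N/2} \psi_i^*(\mathbf{r}_1) \psi_i(\mathbf{r}_2)\, \frac{e^2}{|\mathbf{r}_1 - \mathbf{r}_2|} \sum_{j}^{N/2} \psi_j(\mathbf{r}_1) \psi_j^*(\mathbf{r}_2) \\
&= \frac{1}{4} \iint d^3 r_1\, d^3 r_2\, \rho_1(\mathbf{r}_1, \mathbf{r}_2)\, \frac{e^2}{|\mathbf{r}_1 - \mathbf{r}_2|}\, \rho_1(\mathbf{r}_2, \mathbf{r}_1) \\
&= \frac{1}{4} \iint d^3 r_1\, d^3 r_2\, \left| \rho_1(\mathbf{r}_1, \mathbf{r}_2) \right|^2 \frac{e^2}{|\mathbf{r}_1 - \mathbf{r}_2|}
\end{aligned}
\]

The quantity $\rho_1$ is the one-electron (reduced, i.e., integrated over spin) density matrix

\[
\rho_1(\mathbf{r}_1, \mathbf{r}_2) = 2 \sum_{i}^{N/2} \psi_i(\mathbf{r}_1) \psi_i^*(\mathbf{r}_2) \tag{55}
\]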

and the factor 2 in its definition leads to the factor 1/4 in the expression for $K$.

We will now compute the one-electron density matrix for the noninteracting electron gas. In contrast to what we did when deriving the kinetic energy expression, it is now more convenient to assume periodic boundary conditions. One can prove that for large $N$ the two conditions give the same answers, but we will not do it, since at this point we have already made much more drastic approximations than could arise from a possible inconsistency resulting from different boundary conditions. With the periodic conditions, the orbital wave functions are of the form

\[
\psi_{n_x,n_y,n_z} = \frac{1}{\sqrt{V}}\, e^{i \mathbf{k} \cdot \mathbf{r}}
\]

where $k_v = \frac{2\pi}{L} n_v$ with $n_v = 0, \pm 1, \pm 2, \ldots$. The density matrix of this system is

\[
\rho_1(\mathbf{r}_1, \mathbf{r}_2) = \frac{2}{V} \sum_{n_x,n_y,n_z}^{\mathrm{occ}} e^{i \mathbf{k} \cdot (\mathbf{r}_1 - \mathbf{r}_2)}.
\]

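As before, for large $N$ we can change the summation to an integration

\[
\rho_1(\mathbf{r}_1, \mathbf{r}_2) = \frac{2}{V} \int^{\mathrm{occ}} e^{i \mathbf{k} \cdot \mathbf{r}_{12}}\, dn_x\, dn_y\, dn_z = \frac{1}{4\pi^3} \int_{|\mathbf{k}| \le k_F} e^{i \mathbf{k} \cdot \mathbf{r}_{12}}\, d^3 k
\]

where in changing variables we used $dk_x = (2\pi/L)\, dn_x$ and so on, which gives the overall Jacobian $V/(8\pi^3)$. The upper limit of the integration was defined in Eq. (48). This integral can be evaluated in spherical coordinates; this evaluation is given as homework. The result is

\[
\rho_1(\mathbf{r}_1, \mathbf{r}_2) = \frac{1}{\pi^2 s^3} \left[ \sin(s k_F) - s k_F \cos(s k_F) \right] \tag{56}
\]

where $s = |\mathbf{r}_{12}|$. It is now natural to view $\rho_1$ as a function of the variables $\mathbf{s} = \mathbf{r}_1 - \mathbf{r}_2$ and $\mathbf{r} = (\mathbf{r}_1 + \mathbf{r}_2)/2$. We see that it is independent of $\mathbf{r}$ and of the direction of $\mathbf{s}$, as expected for the uniform gas. Notice that $\rho_1$ does depend on $\rho$ via $k_F$.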
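Two quick numerical checks of Eq. (56) are sketched below; they are sanity checks, not the analytic evaluation asked for in the homework. The first uses the fact that the diagonal of the density matrix must reproduce the density, $\rho_1(\mathbf{r}, \mathbf{r}) = \rho$, so the $s \to 0$ limit of Eq. (56) should equal $k_F^3/(3\pi^2) = \rho$. The second compares the closed form with a Simpson-rule quadrature of the radial integral $\frac{1}{\pi^2 s} \int_0^{k_F} k \sin(ks)\, dk$ that remains after the angular integration. Atomic units ($\hbar = 1$), the test density, and the function names are assumptions of this example.

```python
import math


def rho1_closed(s, k_f):
    """Closed form of Eq. (56) for the uniform-gas density matrix."""
    x = s * k_f
    return (math.sin(x) - x * math.cos(x)) / (math.pi ** 2 * s ** 3)


def rho1_numeric(s, k_f, n=2000):
    """Simpson quadrature of (1 / (pi^2 s)) * integral_0^kF k sin(k s) dk; n must be even."""
    h = k_f / n

    def f(k):
        return k * math.sin(k * s)

    total = f(0.0) + f(k_f)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(j * h)
    return total * h / 3.0 / (math.pi ** 2 * s)


rho = 0.5                                         # arbitrary test density
k_fermi = (3.0 * math.pi ** 2 * rho) ** (1.0 / 3.0)

diagonal_limit = rho1_closed(1e-3, k_fermi)       # should be close to rho
closed_value = rho1_closed(1.7, k_fermi)
numeric_value = rho1_numeric(1.7, k_fermi)
```

Both checks depend on $\rho$ only through $k_F$, in line with the remark that $\rho_1$ depends on $\rho$ via $k_F$.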

To compute the Dirac exchange energy used in TFD, we assume as before that the uniform gas expression is valid locally and, with $\rho$ dependent on $\mathbf{r}$, average over space, obtaining

\[
K_D = C_x \int \rho^{4/3}(\mathbf{r})\, d^3 r, \qquad C_x = \frac{3}{4} \left( \frac{3}{\pi} \right)^{1/3}. \tag{57}
\]

The derivation of this expression is left as homework.
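Without carrying out the analytic derivation, the value of $C_x$ can be checked numerically. Inserting Eq. (56) into the expression for $K$ and integrating over the directions of $\mathbf{s}$ gives, per unit volume and in atomic units, $K/V = \frac{1}{4} \int d^3 s\, |\rho_1(s)|^2 / s = \frac{k_F^4}{\pi^3}\, I$ with $I = \int_0^\infty (\sin x - x \cos x)^2 / x^5\, dx$. With $k_F = (3\pi^2 \rho)^{1/3}$ this yields $C_x = (3\pi^2)^{4/3} I / \pi^3$. The sketch below evaluates $I$ by Simpson's rule on a finite interval and adds an asymptotic tail estimate; the cutoff, the grid, and the function names are assumptions of this example.

```python
import math


def integrand(x):
    """(sin x - x cos x)^2 / x^5, which behaves as x / 9 for small x."""
    if x == 0.0:
        return 0.0
    g = math.sin(x) - x * math.cos(x)
    return g * g / x ** 5


def radial_integral(x_max=200.0, n=40000):
    """Simpson's rule on [0, x_max] plus an asymptotic tail estimate; n must be even."""
    h = x_max / n
    total = integrand(0.0) + integrand(x_max)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * integrand(j * h)
    # For large x the integrand behaves as cos^2(x) / x^3, which averages to 1 / (2 x^3);
    # integrating that from x_max to infinity gives 1 / (4 x_max^2).
    return total * h / 3.0 + 1.0 / (4.0 * x_max ** 2)


I_VALUE = radial_integral()
C_X_NUMERIC = (3.0 * math.pi ** 2) ** (4.0 / 3.0) * I_VALUE / math.pi ** 3
C_X_CLOSED = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)
```

The quadrature gives $I \approx 1/4$, so that $C_x \approx 0.7386$, in agreement with Eq. (57).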

B. Hohenberg-Kohn theorems

The first HK theorem states that, for the ground state of a system, the knowledge of $\rho$ allows one to determine $\Psi$ and vice versa. The latter is obvious from the definition of density. For systems consisting of atoms with no external potentials, the proof of the former part of this theorem is very simple. Since the sources of the potential are nuclei and the electron interaction with a nucleus is singular, the density will have sharp peaks (cusps) exactly at the positions of nuclei. The steepness depends on the charge of a nucleus. Thus, the knowledge of density gives us locations and charges of nuclei. Therefore, one can write Schrödinger’s equation and solve it to find $\Psi$. The importance of this theorem is that it shows that the density alone gives all the needed information about the system. Also note that the theorem discusses only the exact ground-state density.

The second HK theorem states that there exists a functional of density, denoted by $E[\rho]$, that upon minimization with respect to $\rho$ gives the ground-state energy

\[
E[\rho] \ge E[\rho_0] = E_0 \quad \text{where} \quad \rho(\mathbf{r}) > 0 \quad \text{and} \quad \int d^3 r\, \rho(\mathbf{r}) = N.
\]

The density is arbitrary except for satisfying the two conditions listed, which originate from the definition of $\rho$ in Eq. (44).

To prove this theorem, we will follow the arguments given by Levy. We start from the Ritz variational principle

\[
E_0 = \min_{\Psi} \langle \Psi | H \Psi \rangle
\]

where $\Psi$ belongs to the Hilbert space of normalized antisymmetric $N$-electron functions. We can then write

\[
E_0 = \min_{\Psi} \langle \Psi | H \Psi \rangle = \min_{\rho} \left[ \min_{\Psi \to \rho} \langle \Psi | H \Psi \rangle \right]
\]

with $\rho$ constrained by the conditions specified in the theorem. The meaning of the double minimization is as follows. We go over the space of all possible $\rho$’s and for a given $\rho$ find all $\Psi$’s that give this $\rho$. Out of this set, we select the $\Psi$ that gives the lowest expectation value of the Hamiltonian. Clearly, if we go over all $\rho$’s, we will eventually find the ground-state energy, which proves the theorem.

One subtlety to discuss is whether for an arbitrary $\rho$ there exists an antisymmetric $\Psi$ which gives this $\rho$ via Eq. (44). One can indeed prove that this is the case (by constructing a set of orthonormal spinorbitals from the density and then constructing a Slater determinant from these spinorbitals). We say that each $\rho$ (fulfilling the constraints) is $N$-representable. However, we do not need to prove this statement to complete the proof of the second HK theorem. Since for each $\Psi$ there exists a $\rho$, we will sweep the space of all $\Psi$’s when going over all $\rho$’s. If there were $\rho$’s that are not $N$-representable (which is not the case), we could just ignore them.

The importance of the Hohenberg-Kohn work is mainly conceptual, stemming from the fact that it has put density-functional theory on solid mathematical ground, in contrast to the TFD and Slater methods, which were both based on ad hoc arguments. However, the HK theorems did not offer any new practical tools. The proof that the functional exists is via the wave functions, so it tells us nothing about finding the actual functional that could be applied without invoking wave functions. Significant efforts have been made by many researchers to find good approximations to such a functional, but in fact all the proposed pure density functionals work poorly. The family of methods that do work, now known under the name DFT, are in fact not true DFT approaches, i.e., are not based on density alone. These methods originate from the Kohn-Sham ideas, which will be discussed next, and use spinorbitals in addition to densities.

C. Kohn-Sham method

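Let us write the Hamiltonian of Eq. (19) with the following notation for the three consecutive terms

\[
H = T + V + U_{ee}.
\]

The expectation value of the multiplicative operator $V$, which is the sum of one-electron terms

\[
V = -\sum_{a=1}^{N_{\mathrm{nuc}}} \sum_{i=1}^{N} \frac{Z_a e^2}{|\mathbf{r}_i - \mathbf{R}_a|} = \sum_{i=1}^{N} v(\mathbf{r}_i),
\]

can be written as an explicit functional of density

\[
\langle \Psi | V \Psi \rangle = N \sum_{s_1, \ldots, s_N} \int d^3 r_1 \cdots d^3 r_N\, v(\mathbf{r}_1)\, |\Psi(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N)|^2 = \int d^3 r_1\, v(\mathbf{r}_1) \rho(\mathbf{r}_1). \tag{58}
\]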

We can therefore write the HK theorem as

\[
E_0 = \min_{\rho} \left[ \int d^3 r\, v(\mathbf{r}) \rho(\mathbf{r}) + F_{\mathrm{HK}}[\rho] \right]
\]

where

\[
F_{\mathrm{HK}}[\rho] = T^{\mathrm{HK}}[\rho] + U_{ee}^{\mathrm{HK}}[\rho] = \min_{\Psi \to \rho} \langle \Psi | \left( T + U_{ee} \right) \Psi \rangle = \langle \Psi_{\min}[\rho] | \left( T + U_{ee} \right) \Psi_{\min}[\rho] \rangle.
\]

There is nothing new in this equation except for introducing the notation.

One might think that an explicit density functional can be obtained when the $U_{ee}$ operator is neglected. This is not so, since the operator $T$ is a differential operator. Thus, if we replace $V$ by $T$ in Eq. (58), this equation cannot be integrated to depend on density only. To overcome this difficulty, Kohn and Sham replaced $T^{\mathrm{HK}}[\rho]$ by an expression used in the HF method, i.e., by the expectation value of $T$ with a Slater determinant

\[
T^{\mathrm{HK}}[\rho] \to T_S[\phi_i[\rho]] = -\frac{1}{2} \sum_{i=1}^{N} \langle \phi_i | \nabla^2 \phi_i \rangle
\]

where we started to use atomic units such that $\hbar = e = m_e = 1$. The notation introduced in $T_S$ indicates that spinorbitals can be considered to be determined by the density. We may say that $T_S[\phi_i[\rho]]$ is an explicit functional of orbitals and an implicit functional of density (we will not make use of these concepts). The density can be calculated from a Slater determinant using the Slater-Condon rules for the following electron-density operator

\[
\hat{\rho} = \sum_{i=1}^{N} \delta(\mathbf{r} - \mathbf{r}_i),
\]

i.e.,

\[
\rho(\mathbf{r}) = \langle \Psi | \hat{\rho}\, \Psi \rangle = \sum_{i=1}^{N} \sum_{s_1} \int d^3 r_1\, \phi_i^*(\mathbf{x}_1)\, \delta(\mathbf{r} - \mathbf{r}_1)\, \phi_i(\mathbf{x}_1) = \sum_{i=1}^{N} \sum_{s} \left| \phi_i(\mathbf{x}) \right|^2 = \sum_{i=1}^{N} \left| \varphi_i(\mathbf{r}) \right|^2, \tag{59}
\]

where we replaced $s_1$ by $s$ in the next-to-last expression, $\varphi_i$ is the orbital part of the spinorbital $\phi_i$, and where we assumed pure spin states, so that the sum over the spin part is one. So far we do not know how to determine the spinorbitals; we will get to this issue later on.

The next important idea of Kohn and Sham was to write the $U_{ee}^{\mathrm{HK}}[\rho]$ term as a sum of the Coulomb interaction of the density with itself

\[
E_H[\rho] = \frac{1}{2} \int d^3 r\, d^3 r'\, \frac{\rho(\mathbf{r}) \rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|},
\]

called the Hartree energy, and of the remainder. This term appears in the HF theory as the expectation value of the $N$-electron Coulomb operator and in the TFD theory, so this choice was natural.

With the two steps discussed above, the $F_{\mathrm{HK}}[\rho]$ functional can be written as

\[
F_{\mathrm{HK}}[\rho] = T_S[\phi_i[\rho]] + E_H[\rho] + E_{xc}[\rho]
\]

where the last term, called the exchange-correlation energy, collects all interactions not included in the first two terms. It is worthwhile to write this term explicitly

\[
E_{xc}[\rho] = T^{\mathrm{HK}}[\rho] - T_S[\phi_i[\rho]] + U_{ee}^{\mathrm{HK}}[\rho] - E_H[\rho].
\]

Thus, despite the label “exchange-correlation", this term includes kinetic energy correc-tions. The term is expected to correct for the electron correlation effects not included inEH[ρ] and for the effects resulting from antisymmetrization of the Slater determinantwhich in the HF method lead to the exchange operator. When a concrete Exc[ρ] isconstructed, one usually considers separately the correlation and exchange components,denoted by Ec[ρ] and Ex[ρ], respectively. All the hundreds of DFT methods in use differby the selection of Exc. In the simplest case used in the original KS paper, this term istaken from the TFD theory. We will discuss various choices of Exc later on.

The complete KS functional can be written as

\[
E^{KS}_0[\rho] = T_S[\phi_i[\rho]] + \int d^3r\, v(\mathbf{r})\rho(\mathbf{r}) + E_H[\rho] + E_{xc}[\rho].
\]


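We will find its minimum in a way analogous to that used in the derivation of the HF method. Since $\rho$ is expressed in terms of $\phi_i$ via Eq. (59), the variation of $\rho$ will be expressed via variations of the $\phi_i$'s. Thus, we will vary, as in the HF method,

\[
\phi_i^* \to \phi_i^* + \delta\phi_i^*
\]

and this variation will imply the variation of $\rho$

\[
\rho(\mathbf{r}) \to \rho(\mathbf{r}) + \delta\rho(\mathbf{r})
= \sum_{s} \sum_{i=1}^{N} \left(\phi_i^*(x) + \delta\phi_i^*(x)\right)\phi_i(x)
= \rho(\mathbf{r}) + \sum_{s} \sum_{i=1}^{N} \phi_i(x)\,\delta\phi_i^*(x).
\]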

We now have to impose the two conditions on $\rho$. The positiveness condition is automatically satisfied if the definition (59) is used. The normalization to $N$ will be achieved if each orbital is normalized. We will in addition require that the orbitals are orthogonal to each other since only then Eq. (59) holds. Thus, the conditions will be imposed in exactly the same way as in the HF method, i.e., we will minimize

\[
L^{KS}[\rho] = E^{KS}_0[\rho] - \sum_{j \ge i}^{N} \lambda_{ij}\left(\langle \phi_i | \phi_j \rangle - \delta_{ij}\right).
\]

The linear variations of the kinetic energy, the nuclear attraction, the Hartree, and the constraint terms are exactly the same as in the HF method

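\[
\delta T_S = -\frac{1}{2} \sum_{i=1}^{N} \langle \delta\phi_i | \nabla^2 \phi_i \rangle
\]

\[
\delta \int d^3r\, v(\mathbf{r})\rho(\mathbf{r}) = \sum_{i=1}^{N} \langle \delta\phi_i | v\, \phi_i \rangle
\]

\[
\delta E_H[\rho] = \sum_{i=1}^{N} \langle \delta\phi_i | J \phi_i \rangle
\]

\[
\delta \sum_{j \ge i}^{N} \lambda_{ij}\left(\langle \phi_i | \phi_j \rangle - \delta_{ij}\right) = \sum_{j \ge i}^{N} \lambda_{ij} \langle \delta\phi_i | \phi_j \rangle.
\]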

For the exchange-correlation (xc) energy term, we have to use symbolic notation since we do not know this term explicitly. This term is assumed to have the form of an integral of the so-called xc energy density

\[
E_{xc}[\rho] = \int d^3r\, e_{xc}[\rho](\mathbf{r}),
\]

where $e_{xc}(\mathbf{r})$ is some function of $\rho(\mathbf{r})$; in the simplest cases it can be just a power of $\rho$. Therefore, we can write

\[
E_{xc}[\rho + \delta\rho] = E_{xc}[\rho] + \delta E_{xc}[\rho] + O[(\delta\rho)^2]
= E_{xc}[\rho] + \int d^3r\, e'_{xc}(\mathbf{r})\,\delta\rho(\mathbf{r}) + O[(\delta\rho)^2]
\]


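where $e'_{xc}(\mathbf{r})$ is defined as the function which, integrated with $\delta\rho(\mathbf{r})$, gives $\delta E_{xc}[\rho]$. Thus, this is an analog of the standard derivative: $f(x + \delta x) = f(x) + f'(x)\,\delta x + O((\delta x)^2)$. We will use the notation

\[
e'_{xc}(\mathbf{r}) \equiv \frac{\delta E_{xc}}{\delta\rho(\mathbf{r})} \equiv v_{xc}(\mathbf{r})
\]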

and we call $v_{xc}(\mathbf{r})$ the functional derivative of $E_{xc}[\rho]$. Notice that $v_{xc}(\mathbf{r})$ is a function of $\mathbf{r}$, whereas $E_{xc}[\rho]$ is just a single real number for a given $\rho$. Although this definition may appear to be abstract, it is simple to find $v_{xc}$ in practice. For example, the exchange energy density in TFD is $e_x(\mathbf{r}) = C_x \rho(\mathbf{r})^{4/3}$ so that

\[
E_x[\rho] = C_x \int d^3r\, \rho(\mathbf{r})^{4/3}
\]

\[
E_x[\rho + \delta\rho] = C_x \int d^3r\, \left(\rho(\mathbf{r}) + \delta\rho(\mathbf{r})\right)^{4/3}
= C_x \int d^3r \left(\rho(\mathbf{r})^{4/3} + \frac{4}{3}\rho(\mathbf{r})^{1/3}\,\delta\rho(\mathbf{r}) + \ldots\right),
\]

where we applied Taylor's expansion of $f(x) = x^{4/3}$ (valid for any value $\rho(\mathbf{r}) = x$). Thus, $v_x(\mathbf{r}) = \frac{4}{3} C_x \rho(\mathbf{r})^{1/3}$ in this case. One may also comment that the familiar derivation of the Euler-Lagrange equations in classical mechanics uses concepts analogous to those defined above and can also be formulated using functional derivatives.
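The definition of the functional derivative can be checked numerically. The Python sketch below compares the change of $E_x$ under a small perturbation $\delta\rho$ with the linear term $\int d^3r\, v_x \delta\rho$ on a radial grid. The value used for $C_x$ (the Dirac exchange constant), the hydrogen-like test density, and the Gaussian perturbation are illustrative assumptions, not part of the derivation above.

```python
import numpy as np

# Assumed value of C_x: the Dirac exchange constant in atomic units.
C_X = -0.75 * (3.0 / np.pi) ** (1.0 / 3.0)

def exchange_energy(rho, weights):
    """E_x[rho] = C_x * integral of rho^(4/3), as a weighted sum on a grid."""
    return C_X * np.sum(weights * rho ** (4.0 / 3.0))

def exchange_potential(rho):
    """Analytic functional derivative v_x = (4/3) C_x rho^(1/3)."""
    return (4.0 / 3.0) * C_X * rho ** (1.0 / 3.0)

# Uniform radial grid with weights 4 pi r^2 dr (spherically symmetric density).
r = np.linspace(1e-4, 20.0, 4000)
weights = 4.0 * np.pi * r**2 * (r[1] - r[0])

rho = np.exp(-2.0 * r) / np.pi        # illustrative hydrogen-like 1s density
delta_rho = 1e-6 * np.exp(-r**2)      # small, arbitrary perturbation

energy_change = exchange_energy(rho + delta_rho, weights) - exchange_energy(rho, weights)
linear_term = np.sum(weights * exchange_potential(rho) * delta_rho)
relative_error = abs(energy_change - linear_term) / abs(linear_term)

print("E_x[rho + d rho] - E_x[rho] =", energy_change)
print("integral of v_x * d rho     =", linear_term)
print("relative difference         =", relative_error)
```

The two numbers should agree up to terms of second order in $\delta\rho$, which is the content of the definition of $v_x$.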

The derivation of a formula for $v_{xc}(\mathbf{r})$ gets a bit more complicated if the exchange-correlation energy depends also on derivatives of $\rho$, as will be discussed later on. In this case, we have

\[
E_{xc}[\rho] = \int d^3r\, e_{xc}[\rho, \nabla\rho](\mathbf{r}).
\]

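The quantity $e_{xc}[\rho, \nabla\rho](\mathbf{r})$ is some concrete function of $\rho$ and of $\rho_i \equiv \partial\rho/\partial x_i$. Although the derivatives $\rho_i$ are defined by $\rho$, we can first treat them as independent variables [like in $df(x, y(x)) = (\partial f/\partial x)\,dx + (\partial f/\partial y)\,dy = (\partial f/\partial x)\,dx + (\partial f/\partial y)(dy/dx)\,dx$]. At a given point $\mathbf{r}$ and for a given $\rho$, the linear variation of $e_{xc}[\rho, \nabla\rho](\mathbf{r})$ is the sum of the increments $\delta\rho$ and $\delta\rho_i$ multiplied by the regular partial derivatives of $e_{xc}$ with respect to these variables

\[
\delta e_{xc} = e_{xc}[\rho + \delta\rho, \rho_1 + \delta\rho_1, \rho_2 + \delta\rho_2, \rho_3 + \delta\rho_3] - e_{xc}[\rho, \nabla\rho] - O[(\delta\rho)^2]
\]

\[
= \frac{\partial e_{xc}}{\partial\rho}\,\delta\rho
+ \frac{\partial e_{xc}}{\partial\rho_1}\,\delta\rho_1
+ \frac{\partial e_{xc}}{\partial\rho_2}\,\delta\rho_2
+ \frac{\partial e_{xc}}{\partial\rho_3}\,\delta\rho_3.
\]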

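We now have to eliminate the increments $\delta\rho_i$ in favor of $\delta\rho$, similarly as in the derivation of the Euler-Lagrange equations. This can be done by integrating by parts:

\[
\int d^3r\, \frac{\partial e_{xc}}{\partial\rho_i}\,\delta\rho_i
= \int d^3r\, \frac{\partial e_{xc}}{\partial\rho_i}\,\frac{\partial\,\delta\rho}{\partial x_i}
= \left.\frac{\partial e_{xc}}{\partial\rho_i}\,\delta\rho\,\right|_{-\infty}^{\infty}
- \int d^3r\, \frac{\partial}{\partial x_i}\left(\frac{\partial e_{xc}}{\partial\rho_i}\right)\delta\rho.
\]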

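The surface term vanishes since $\delta\rho$ vanishes at infinity and we eventually have

\[
\int d^3r\, v_{xc}(\mathbf{r})\,\delta\rho
= \int d^3r \left[\frac{\partial e_{xc}}{\partial\rho} - \sum_i \frac{\partial}{\partial x_i}\frac{\partial e_{xc}}{\partial\rho_i}\right]\delta\rho.
\]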
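The gradient formula can be tested on a toy case. The sketch below assumes a one-dimensional model "functional" with energy density $e = (d\rho/dx)^2$, chosen only because its derivative is trivial: the formula above then gives $v = -2\, d^2\rho/dx^2$. This toy functional, the Gaussian density, and the perturbation are assumptions made for illustration and do not correspond to any functional used in practice.

```python
import numpy as np

# One-dimensional grid; all quantities here belong to a toy model.
x = np.linspace(-10.0, 10.0, 4001)
h = x[1] - x[0]

def toy_gradient_energy(rho):
    """E[rho] = integral of (d rho/dx)^2, with a finite-difference derivative."""
    drho = np.gradient(rho, h)
    return np.sum(drho**2) * h

rho = np.exp(-x**2)                          # illustrative density
delta_rho = 1e-6 * np.exp(-(x - 0.5)**2)     # small perturbation, zero at infinity

# Functional derivative from the formula: de/d(rho) - d/dx [de/d(rho_x)] = -2 rho''.
rho_second_derivative = (4.0 * x**2 - 2.0) * np.exp(-x**2)
v_toy = -2.0 * rho_second_derivative

energy_change = toy_gradient_energy(rho + delta_rho) - toy_gradient_energy(rho)
linear_term = np.sum(v_toy * delta_rho) * h
relative_error = abs(energy_change - linear_term) / abs(linear_term)

print("E[rho + d rho] - E[rho] =", energy_change)
print("integral of v * d rho   =", linear_term)
print("relative difference     =", relative_error)
```

The agreement is limited by the finite-difference error of the grid derivative and by the neglected second-order term.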

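Using the definition of the functional derivative, we can write the linear variation of the xc energy in terms of variations of orbitals as

\[
\delta E_{xc}[\rho] = \int d^3r\, v_{xc}(\mathbf{r})\,\delta\rho(\mathbf{r})
= \sum_{s} \sum_{i=1}^{N} \int d^3r\, v_{xc}(\mathbf{r})\,\phi_i(x)\,\delta\phi_i^*(x).
\]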

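Now we have determined the variations of all terms in $L^{KS}[\rho]$. Assuming that all variations of spinorbitals are zero except for that of spinorbital $\phi_i^*$ (or, alternatively, noticing that all variations are independent), we get

\[
\left\langle \delta\phi_i \left| \left(-\frac{1}{2}\nabla^2 + v + J + v_{xc}\right)\phi_i \right.\right\rangle - \sum_j \lambda_{ij} \langle \delta\phi_i | \phi_j \rangle = 0.
\]

As in the HF case, this implies that the ket multiplying $\langle\delta\phi_i|$ has to be identically equal to zero, i.e.,

\[
\left(-\frac{1}{2}\nabla^2 + v(\mathbf{r}) + J(\mathbf{r}) + v_{xc}(\mathbf{r})\right)\phi_i(x) = \sum_j \lambda_{ij}\,\phi_j(x).
\]

Since $\rho$ is obviously invariant under unitary transformations of orbitals, so is $v_{xc}(\mathbf{r})$, and we can diagonalize the matrix $\lambda_{ij}$ by a unitary transformation, obtaining the canonical KS equations

\[
\left(-\frac{1}{2}\nabla^2 + v(\mathbf{r}) + J(\mathbf{r}) + v_{xc}(\mathbf{r})\right)\phi_i(x) = \epsilon_i\,\phi_i(x).
\]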

The KS equations are similar to the HF equations. The major difference is the presence of the $v_{xc}$ potential and the absence of the exchange operator $K$. One might think that it would be a good idea to include $K$, but the predictions of such methods are poor. However, there is a whole family of density functionals that add a fraction of $K$ while at the same time subtracting an equivalent part of $v_x$. Such approaches are called hybrid DFT methods. Another major difference between the HF and KS approaches is that in the former case one computes the total energy of the system as an expectation value of the Hamiltonian, whereas in the latter case one uses the appropriate KS functional. Of course, this is consistent with the way each type of equations was obtained.

D. Local density approximation

The exchange-correlation energy is defined in the KS method as

Exc = T +Uee − TS −EH

where $T$ is the exact kinetic energy and $U_{ee}$ is the exact electron repulsion energy. Thus, despite its name, it should also correct for the deficiencies in the description of the kinetic energy by $T_S$. However, little work has been done in this direction; major efforts in the field have been directed at improving the description of components resulting from electron correlation and electron exchange (however, the so-called meta-GGA theories that will be discussed later use terms related to kinetic energies).

The electron repulsion term is partly accounted for in the KS approach by the $E_H$ term, the Coulomb repulsion of the density with itself. This term is the same as in the HF method, except that it is computed with KS rather than HF densities. In the HF approach, this term does not include any correlation effect (if the electron correlation energy is defined as $E_{corr} = E_{exact} - E_{HF}$). Thus, we do not expect it to completely describe electron correlation in the KS approach. We need further contributions.

The need for some term related to electron exchange is clear when we realize that the exchange operator of the HF method is missing in the KS orbital equation. It would be simple to add this operator, and indeed the so-called hybrid methods, which will be discussed later on, do it. However, just adding the complete $K$ was found to give poor results.

The truth is that the meaning of the words "exchange" and "correlation" in relation to $E_{xc}$ should be taken in only a loose sense. Nevertheless, there is a rich literature devoted to constructing $E_{xc}$ functionals and most of this work is based on solid physics. One often discusses the two terms separately, writing

Exc = Ec +Ex

So, how does one get any expression for $E_{xc}$? The preceding discussion might indicate that this is an impossible task. However, as usual in physics, one studies simple, exactly solvable models and tries to design expressions which work for such models. One of the most important models is the homogeneous interacting electron gas (HIEG). As discussed in Sec. ??, the TFD model is derived from this physical system. This also means that if the TFD model is applied to the HIEG, it is expected to work very well, and this is indeed the case. The KS approach also uses the local density approximation (LDA), as does TFD. In fact, the original KS model of 1965 is often called the LDA approach. It is similar to TFD in several respects. First, the terms $V$ and $E_H$ are identical. Next, Kohn and Sham have also taken $E_x$ from TFD:

Ex = KD

where $K_D$ is defined by Eq. (57). Thus the main difference is the use of $T_S$ instead of $T_{TF}$, which was the main feature of Slater's density functional theory. The $E_c$ term was usually set to zero in early LDA variants, although there were some attempts to approximate this term based on perturbation theory of the interacting electron gas. Note that while the noninteracting electron gas problem can be solved exactly, no analytic solutions exist for the interacting gas except in the limits of very large and very small densities.

The use of LDA may seem a huge approximation for atoms, molecules, and solids. Indeed, in its initial form KS/LDA did not work well for molecules, in many cases predicting that well-known molecules are not bound. The greatest successes of this approach were for metals, where the conduction electrons resemble an electron gas.


The LDA method is still widely used. The main difference between the modern LDA and the original KS version is the addition of an $E_c$ term fitted to nearly exact numerical calculations for the interacting uniform electron gas performed by Ceperley and Alder in 1980. The calculations were performed using the diffusion Monte Carlo (DMC) method, which will be discussed later on. The only quantity obtained by Ceperley and Alder was the total energy of the electron gas as a function of density. One can then arbitrarily define the correlation energy as

Ec[ρ] = Etotal[ρ]− TS[ρ]−ED[ρ]. (60)

These numerical results were then fitted by some simple analytic functions. One may notice that there are no $V$ and $E_H$ terms in this equation. The reason is that for the interacting uniform electron gas, one has to use a uniform positive background to compensate for the electron charges, and the Coulomb interactions included in these two terms add up to zero. Notice further the arbitrariness of definition (60): the two subtracted terms are not the exact kinetic and exchange energies of the system but some approximations of these quantities.

The fits of $E_c$ are usually expressed in terms of the quantity $r_S$ called the Wigner-Seitz radius and defined by

\[
\frac{4}{3}\pi r_S^3 = \frac{1}{\rho} = \frac{V}{N},
\]

i.e., it is the radius of a sphere with volume corresponding to the volume occupied by one electron. $E_c$ is usually expressed as

\[
E_c = \int \rho(\mathbf{r})\,\varepsilon_c(\mathbf{r})\, d^3r,
\]

where $\varepsilon_c(\mathbf{r})$ is called the correlation energy density. Several fits of this quantity have been published; a particularly simple one was developed by Chachiyo in 2016:

\[
\varepsilon_c = a \ln\left(1 + \frac{b}{r_s} + \frac{b}{r_s^2}\right),
\]

where $a$ and $b$ are fit parameters. Other fits have been published by Vosko, Wilk, and Nusair (VWN) in 1981 and by Perdew and Wang (PW92) in 1992.
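As an illustration of how such a fit is used, the sketch below converts a density into the Wigner-Seitz radius and evaluates the Chachiyo form of $\varepsilon_c$. The numerical values of $a$ and $b$ are those we recall for the spin-unpolarized gas and should be treated as an assumption to be verified against the original paper; the function names are hypothetical.

```python
import math

# Assumed parameters of the Chachiyo fit (spin-unpolarized gas, atomic units);
# quoted from memory, to be checked against the original publication.
A_CHACHIYO = (math.log(2.0) - 1.0) / (2.0 * math.pi**2)
B_CHACHIYO = 20.4562557

def wigner_seitz_radius(rho):
    """Radius r_s of the sphere whose volume is 1/rho."""
    return (3.0 / (4.0 * math.pi * rho)) ** (1.0 / 3.0)

def correlation_energy_density(r_s, a=A_CHACHIYO, b=B_CHACHIYO):
    """eps_c = a * ln(1 + b/r_s + b/r_s^2), energy per electron."""
    return a * math.log(1.0 + b / r_s + b / r_s**2)

# Example: a density for which r_s = 1 bohr.
rho_example = 3.0 / (4.0 * math.pi)
rs_example = wigner_seitz_radius(rho_example)
eps_example = correlation_energy_density(rs_example)
print("r_s =", rs_example, " eps_c =", eps_example, "hartree per electron")
```

In an LDA calculation, $\varepsilon_c$ evaluated at the local $r_s(\mathbf{r})$ would be multiplied by $\rho(\mathbf{r})$ and integrated over space to give $E_c$.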

E. Generalized gradient approximations (GGA)

LDA has been developed using the uniform electron gas as the underlying model, a system where $\nabla\rho(\mathbf{r}) = 0$, but the resulting formulas are applied to systems where $\nabla\rho(\mathbf{r}) \neq 0$. It is therefore natural to include $\nabla\rho(\mathbf{r})$ in DFT. Attempts to do so have a long history: the Weizsäcker correction to the TF kinetic energy is the earliest example. The expansion in powers of $\nabla\rho(\mathbf{r})$ was discussed in the HK 1964 paper. This expansion, now called the gradient expansion approximation (GEA), is usually expressed in terms of the quantity

\[
s(\mathbf{r}) = \frac{|\nabla\rho(\mathbf{r})|}{2 k_F\, \rho(\mathbf{r})} = \frac{|\nabla\rho(\mathbf{r})|}{2 (3\pi^2)^{1/3} \rho(\mathbf{r})^{4/3}}
\]

or in terms of

\[
x(\mathbf{r}) = \frac{|\nabla\rho(\mathbf{r})|}{\rho(\mathbf{r})^{4/3}}.
\]

The exchange energy can then be written as

\[
E_x = C_x \int \rho^{4/3}(\mathbf{r}) \left(1 + D_x s^2(\mathbf{r})\right) d^3r = C_x \int \rho^{4/3}(\mathbf{r})\, F_x(s)\, d^3r,
\]

where $F_x(s)$ is called the enhancement factor. Notice that this expression does reduce to the uniform gas limit for $\nabla\rho(\mathbf{r}) = 0$. Note also that there are no terms linear in the components of $\nabla\rho(\mathbf{r})$. The reason is that $E_x$ is assumed to be a quantity independent of the external potential, i.e., dependent only on electron-electron interactions. Thus, this quantity should be invariant under rotations. The coefficient $D_x$ can be determined by requiring that $E_x$ satisfies some exact constraints or by fitting to experimental data or to results from wave-function-based calculations. We will not discuss these issues since calculations applying GEA have demonstrated that it gives poor results, usually worse than LDA. The underlying reason is that GEA violates several exact conditions that LDA does satisfy (again, we will not discuss these issues).

The problems of GEA were solved by a family of methods known under the name generalized gradient approximations (GGA). To explain heuristically the main idea of these solutions, let us consider the exchange hole defined as

\[
\rho_x(\mathbf{r}_1, \mathbf{r}_2) = -\frac{1}{2}\,\frac{|\rho_1(\mathbf{r}_1, \mathbf{r}_2)|^2}{\rho(\mathbf{r}_1)},
\]

where $\rho_1$ has been defined by Eq. (55). Numerical GEA calculations show that while this quantity is well reproduced by GEA for small $|\mathbf{r}_1 - \mathbf{r}_2|$, in some range of intermediate values of the interelectron separation the values of the exchange hole are much too large. One simple solution is to cut off this region in numerical calculations. However, the unphysical behaviour is clearly related to the enhancement factor $F_x(s)$ increasing too fast. Thus, the GGA methods use some functions multiplying $s^2$ that damp this growth. One of the popular GGA enhancement factors was introduced by Becke in 1988:

\[
F_x(x) = 1 + x^2\, \frac{\beta}{1 + 6\beta x \sinh^{-1}(x)}
\]

with $\beta = 0.0042$ fitted to reproduce atomic HF energies. One can plot this function to see that it increases slower than $x^2$ for large $x$. GGAs perform much better than LDA for virtually all systems and are now the mainstream DFT methods.
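Instead of plotting, the damping can be seen from a few numbers. The sketch below evaluates the enhancement factor exactly in the form written above and compares it with the undamped quadratic growth $1 + \beta x^2$; the function names are hypothetical and the sample values of $x$ are arbitrary.

```python
import math

BETA = 0.0042  # value quoted in the text

def becke88_enhancement(x, beta=BETA):
    """F_x(x) = 1 + beta x^2 / (1 + 6 beta x asinh(x)), as written in the text."""
    return 1.0 + beta * x**2 / (1.0 + 6.0 * beta * x * math.asinh(x))

def undamped_enhancement(x, beta=BETA):
    """Quadratic (GEA-like) growth 1 + beta x^2, used only for comparison."""
    return 1.0 + beta * x**2

for x in (0.0, 1.0, 10.0, 100.0):
    print(f"x = {x:6.1f}   damped = {becke88_enhancement(x):8.4f}"
          f"   undamped = {undamped_enhancement(x):8.4f}")
```

For small $x$ the two forms coincide, while for large $x$ the damped factor grows only like $x/\ln x$.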


F. Beyond GGA

To build on the success of GGA, one can think about including further terms in GEA, starting from the $s^4$ term. This leads to a family of methods called meta-GGA, which also use terms dependent on the so-called spinorbital kinetic energy density

\[
\tau(\mathbf{r}) = \frac{1}{2} \sum_{s} \sum_{i=1}^{N} \left|\nabla\phi_i(x)\right|^2.
\]

Another extension of the GGA approach is the inclusion of the HF exchange operator in the KS one-electron equations, which leads to a family of the so-called hybrid GGA methods. The HF exchange is, of course, calculated with KS orbitals, so it is different from the actual HF exchange. This term is often called "exact" exchange, but of course it is not exact. The HF exchange operator is multiplied by some fractional number $\alpha$ and the exchange potential $v_x$ is then multiplied by $1 - \alpha$. There are also methods called range-separated hybrid (RSH) methods, which admix the HF exchange in a variable amount depending on the value of $|\mathbf{r}_1 - \mathbf{r}_2|$ in the exchange integral.

The whole family of DFT methods is sometimes visualized in the form of the so-called "Jacob's ladder" proposed by Perdew. The consecutive "rungs", listed from the top down, are

———— virtual (e.g., RPA)

———— hybrid: HF exchange

———— metaGGA: |∇ρ|2, τ

———— GGA: ∇ρ

———— LDA: ρ(r)

The highest rung consists of theories that use virtual orbitals. As discussed earlier, the KS method is not a true DFT method since it uses orbitals. However, all the rungs but the top one use only the occupied orbitals. Although the use of orbitals is numerically more costly than the use of density only, the restriction to occupied orbitals makes the cost increase manageable, and even hybrid metaGGA methods are numerically much less expensive than even the simplest wave function methods above the HF level. However, it is possible to use both occupied and virtual KS orbitals in many-body methods such as those discussed in the next sections. One of the simplest ones is the random-phase approximation (RPA), which can be viewed as a special case of the coupled cluster method with double excitations (CCD) [see Sec. IX C]. Of course, the costs of such approaches are the same as the costs of the corresponding wave function approaches. One may ask why one should use DFT in such cases. One reason is that the unperturbed problem may be closer to the exact solution than in the case when HF is used as the zeroth-order approximation. This is the case in particular for metals, where HF works poorly.


Let us now present a very brief list of the most popular functionals (the total number of DFT functionals proposed is a few hundred).

Functional  Type          Authors                         Year  Typical use
LDA         nonempirical  Kohn and Sham                   1965  solids
BLYP        nonempirical  Becke, Lee, Yang, and Parr      1988  molecules
PBE         nonempirical  Perdew, Burke, and Ernzerhof    1996  molecules and solids
SCAN        nonempirical  Perdew et al.                   2015  molecules and solids
B3LYP       fitted        Becke                           1993  molecules
PBE0        fitted        Adamo and Barone                1999  molecules
M06-2X      fitted        Truhlar et al.                  2006  molecules

The second and third functionals belong to the GGA rung, the fourth is a metaGGA, the fifth and sixth are hybrid GGAs, and the last one is a hybrid metaGGA. The first four functionals can be classified as nonempirical, i.e., the parameters were fixed mostly using various exact conditions, possibly with some fitting to a limited set of data such as atomic total energies. In the fitted functionals, the parameters were adjusted by fitting DFT predictions to a set of benchmark data obtained both from experiments and from accurate calculations using wave function methods.

VII. VARIATIONAL METHOD

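The variational principle is the basis of an approximation method that provides a simple way of placing an upper bound on the ground-state energy of any quantum system. We start with the inequality

\[
\langle E \rangle = \frac{\langle\psi|H|\psi\rangle}{\langle\psi|\psi\rangle} \ge E_0 \tag{61}
\]

where $\langle E \rangle$ is the expectation value of the energy, $|\psi\rangle$ is an arbitrary state, and $E_0$ is the lowest eigenvalue of the Hamiltonian $H$. The proof of this claim is as follows. First, let us assume that the arbitrary state is normalized, i.e., $\langle\psi|\psi\rangle = 1$. If we expand $|\psi\rangle = \sum_n c_n |n\rangle$ in the orthonormal eigenstates of $H$, $H|n\rangle = E_n|n\rangle$, with $\sum_n |c_n|^2 = 1$ to ensure normalization, then we can write for the expectation value of the energy

\[
\langle E \rangle = \sum_{n,m=0}^{\infty} c_m^* c_n \langle m|H|n\rangle = \sum_{n,m=0}^{\infty} c_m^* c_n E_n \delta_{mn} = \sum_{n=0}^{\infty} |c_n|^2 E_n
\]

\[
\langle E \rangle = E_0 \sum_{n=0}^{\infty} |c_n|^2 + \sum_{n=0}^{\infty} |c_n|^2 (E_n - E_0) \ge E_0.
\]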

In the case of a non-degenerate ground state, we have an equality only if $|c_0| = 1$, which implies that $c_n = 0$ for all $n \neq 0$. If we consider a family of states $|\psi(\alpha)\rangle$, which depend on some number of parameters $\alpha_i$, we can define

\[
E(\alpha) = \frac{\langle\psi(\alpha)|H|\psi(\alpha)\rangle}{\langle\psi(\alpha)|\psi(\alpha)\rangle} \ge E_0.
\]


Here, we still have the relation $E(\alpha) \ge E_0$ for all parameters $\alpha$. The lowest upper bound on the ground-state energy is then obtained from the minimum value of $E(\alpha)$ over the range of parameters $\alpha$, which is found by requiring that the first derivatives vanish,

\[
\left.\frac{\partial E}{\partial\alpha_i}\right|_{\alpha = \alpha_k} = 0,
\]

giving us the upper bound $E_0 \le E(\alpha_k)$. Unfortunately, the variational method does not tell us how far above the ground state $E(\alpha_k)$ lies. Despite this limitation, when the set of states $|\psi(\alpha)\rangle$ is chosen fairly close to the ground state, the variational method can give remarkably accurate results.
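A standard textbook illustration, not worked out in these notes, is the hydrogen atom with a single Gaussian trial function $\psi(r) \propto e^{-\alpha r^2}$, for which the energy expectation value in atomic units is $E(\alpha) = \frac{3}{2}\alpha - 2\sqrt{2\alpha/\pi}$. The sketch below minimizes this function with a simple golden-section search (the helper is written here only for the example) and compares the result with the exact ground-state energy of $-0.5$ hartree.

```python
import math

def gaussian_trial_energy(alpha):
    """E(alpha) for the hydrogen atom with trial function exp(-alpha r^2), in hartree."""
    return 1.5 * alpha - 2.0 * math.sqrt(2.0 * alpha / math.pi)

def golden_section_minimize(f, lo, hi, tol=1e-10):
    """Locate the minimum of a unimodal function f on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 0.5 * (a + b)

alpha_opt = golden_section_minimize(gaussian_trial_energy, 0.01, 2.0)
energy_opt = gaussian_trial_energy(alpha_opt)
exact_energy = -0.5

print("optimal alpha       =", alpha_opt)    # analytic value: 8 / (9 pi)
print("variational energy  =", energy_opt)   # analytic value: -4 / (3 pi)
print("exact energy        =", exact_energy)
```

The variational energy lies above the exact one, as it must, and the size of the gap reflects how poorly a single Gaussian describes the cusp and the tail of the exact exponential wave function.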

A. Configuration Interaction (CI) method

The basic idea of Configuration Interaction (CI) is to diagonalize the $N$-electron Hamiltonian in a basis of $N$-electron functions, i.e., Slater determinants. Essentially, what we are doing here is representing the exact wave function as a linear combination of $N$-electron trial functions and then using the variational method to minimize the energy. If a complete basis were used, we would obtain the exact energies of both the ground state and all excited states of the system. In principle, this provides an exact solution to the many-electron problem; however, in practice, only a finite set of $N$-electron trial functions is manageable, so the CI wavefunction expansion is typically truncated at specific excited configurations. As a result of the size restrictions on practical CI calculations, CI often provides only upper bounds to the exact energies.

The CI wavefunction is a linear combination of known Slater determinants $|\Phi_i\rangle$ with unknown coefficients. This allows us to write the eigenvectors of our Hamiltonian as

\[
|\Psi_j\rangle = \sum_i c_{ij} |\Phi_i\rangle.
\]

Generally, the Slater determinants are constructed from excitations of the Hartree-Fock "reference" determinant $|\Phi_0\rangle$:

\[
|\Psi\rangle = c_0 |\Phi_0\rangle + \sum_{r,a} c_a^r |\Phi_a^r\rangle + \sum_{r<s,\,a<b} c_{ab}^{rs} |\Phi_{ab}^{rs}\rangle + \sum_{r<s<t,\,a<b<c} c_{abc}^{rst} |\Phi_{abc}^{rst}\rangle + \ldots \tag{62}
\]

where $|\Phi_a^r\rangle$ represents the singly excited Slater determinant formed by replacing spinorbital $\phi_a$ with $\phi_r$. Similarly, $|\Phi_{ab}^{rs}\rangle$ represents the doubly excited Slater determinant formed by replacing spinorbital $\phi_a$ with $\phi_r$ and spinorbital $\phi_b$ with $\phi_s$, and so on for higher excited determinants. Every $N$-electron Slater determinant is formed from a set of $N$ spinorbitals, $\{\phi_i\}_{i=1}^{N}$.

We can rewrite Eq. 62 in a more general form, $|\Psi^{CI}\rangle = \sum_{i=0} c_i |\Phi_i\rangle$, where $i = 0$ refers to our reference Hartree-Fock determinant, $i = 1$ refers to the first singly excited determinant, and so on. We now optimize our total CI wavefunction via the Ritz variational method:

\[
E = \frac{\langle\Psi^{CI}|H|\Psi^{CI}\rangle}{\langle\Psi^{CI}|\Psi^{CI}\rangle}.
\]

If we then expand the CI wavefunction in a linear combination of our Slater determinants, we get

\[
E = \frac{\sum_i \sum_j c_i^* c_j \langle\Phi_i|H|\Phi_j\rangle}{\sum_i \sum_j c_i^* c_j \langle\Phi_i|\Phi_j\rangle}.
\]

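The variational procedure corresponds to setting all the derivatives of our energy with respect to the expansion coefficients equal to zero. Rearranging, we get

\[
E \sum_i \sum_j c_i^* c_j \langle\Phi_i|\Phi_j\rangle = \sum_i \sum_j c_i^* c_j \langle\Phi_i|H|\Phi_j\rangle.
\]

Differentiating both sides with respect to $c_j$ (taking, for simplicity, real coefficients and a real symmetric Hamiltonian matrix) gives

\[
\frac{\partial E}{\partial c_j} \sum_{ik} c_i^* c_k \langle\Phi_i|\Phi_k\rangle + 2E \sum_i c_i \langle\Phi_i|\Phi_j\rangle
= 2 \sum_i c_i \langle\Phi_i|H|\Phi_j\rangle + \sum_{ik} c_i^* c_k \frac{\partial}{\partial c_j}\left(\langle\Phi_i|H|\Phi_k\rangle\right).
\]

The first term vanishes because $\partial E/\partial c_j = 0$ at the minimum of the energy, and the last term vanishes since the matrix elements do not depend on the coefficients. Since the basis functions are orthonormal, we obtain

\[
E \sum_i c_i \delta_{ij} = \sum_i c_i \langle\Phi_i|H|\Phi_j\rangle
\]

\[
\sum_i H_{ij} c_i - \sum_i E \delta_{ij} c_i = 0
\]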

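where $H_{ij} = \langle\Phi_i|H|\Phi_j\rangle$. Since there is one equation for each $j$, we can transform this set of equations into a matrix equation:

\[
(\mathbf{H} - E\mathbf{I})\mathbf{c} = 0
\]

\[
\mathbf{H}\mathbf{c} = E\mathbf{c}
\]

\[
\begin{pmatrix}
H_{00} - E & H_{01} & \cdots & H_{0j} & \cdots \\
H_{10} & H_{11} - E & \cdots & H_{1j} & \cdots \\
\vdots & \vdots & \ddots & \vdots & \\
H_{j0} & \cdots & \cdots & H_{jj} - E & \cdots \\
\vdots & \vdots & & \vdots & \ddots
\end{pmatrix}
\begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_j \\ \vdots \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ \vdots \end{pmatrix} \tag{63}
\]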

Solving these secular equations is equivalent to diagonalizing the CI matrix. The CI ground-state energy is then obtained as the lowest eigenvalue of the CI matrix, and the corresponding eigenvector contains the coefficients $c_i$ in front of the determinants in Eq. 62. The second lowest eigenvalue corresponds to the first excited state, the third lowest to the second excited state, and so on.
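In practice the diagonalization is carried out numerically. The sketch below uses a small, made-up symmetric matrix standing in for the CI matrix (the numbers are hypothetical, not computed from any molecule) and NumPy's symmetric eigensolver, which returns the eigenvalues in ascending order.

```python
import numpy as np

# Hypothetical 3x3 CI matrix in an orthonormal determinant basis (hartree).
# H[0, 0] plays the role of the reference (HF) energy.
ci_matrix = np.array([
    [-1.00,  0.10,  0.00],
    [ 0.10, -0.50,  0.20],
    [ 0.00,  0.20,  0.30],
])

# eigh is appropriate for real symmetric (Hermitian) matrices.
energies, coefficients = np.linalg.eigh(ci_matrix)

ground_energy = energies[0]
ground_vector = coefficients[:, 0]   # columns are the eigenvectors

print("state energies:          ", energies)
print("ground-state coefficients:", ground_vector)
print("energy lowering vs. reference:", ground_energy - ci_matrix[0, 0])
```

The lowest eigenvalue can never lie above the reference diagonal element, which is the matrix form of the variational principle.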

We have mentioned that the CI expansion is typically truncated at specific excited configurations. From the Slater-Condon rules, we know that matrix elements between determinants that differ in more than two spinorbitals vanish; therefore, only singly and doubly excited determinants can interact directly with the reference determinant. Due to Brillouin's theorem, the matrix elements $\langle S|H|\Phi_0\rangle$ are zero as well. The structure of the CI matrix in the basis of the HF determinant and its excitations is then given as

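\[
\mathbf{H} =
\begin{array}{c}
\langle\Phi_0| \\ \langle S| \\ \langle D| \\ \langle T| \\ \langle Q| \\ \vdots
\end{array}
\begin{pmatrix}
\langle\Phi_0|H|\Phi_0\rangle & 0 & \langle\Phi_0|H|D\rangle & 0 & 0 & \cdots \\
0 & \langle S|H|S\rangle & \langle S|H|D\rangle & \langle S|H|T\rangle & 0 & \cdots \\
\langle D|H|\Phi_0\rangle & \langle D|H|S\rangle & \langle D|H|D\rangle & \langle D|H|T\rangle & \langle D|H|Q\rangle & \cdots \\
0 & \langle T|H|S\rangle & \langle T|H|D\rangle & \langle T|H|T\rangle & \langle T|H|Q\rangle & \cdots \\
0 & 0 & \langle Q|H|D\rangle & \langle Q|H|T\rangle & \langle Q|H|Q\rangle & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix} \tag{64}
\]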

where $|\Phi_0\rangle$ is the Hartree-Fock reference determinant, $|S\rangle$ denotes the singly excited determinants, $|D\rangle$ the doubly excited determinants, and so on. The blocks $\langle X|H|Y\rangle$ which are not necessarily zero may still be sparse, meaning that most of their elements are zero. Let us look at a matrix element belonging to the block $\langle D|H|Q\rangle$. The matrix element $\langle\Phi_{ab}^{rs}|H|\Phi_{cdef}^{tuvw}\rangle$ will be nonzero only if $\phi_a$ and $\phi_b$ are contained in the set $\{\phi_c, \phi_d, \phi_e, \phi_f\}$, and if $\phi_r$ and $\phi_s$ are contained in the set $\{\phi_t, \phi_u, \phi_v, \phi_w\}$.

The task at hand is then to calculate each matrix element and to diagonalize the CI matrix. As we include more and more excitations in the CI expansion, we capture more and more electron correlation. CI also needs larger one-electron basis sets in order to capture the correlation energy accurately. We can thus increase the size of the CI matrix by adding more excited configurations or by increasing the basis set size. However, there is a problem with adding more and more excitations or basis functions: it is very expensive to do so. If the number of spinorbitals produced by HF is $2M$, the number of determinants that can be constructed is $\binom{2M}{N}$, where $N$ is the number of electrons. Taking into account all possible excitations in the expansion is known as Full CI (FCI), and the cost of this method scales factorially, as $O(N!)$.
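The growth of the FCI space is easy to quantify. The sketch below counts all determinants for $N$ electrons in $2M$ spinorbitals and also the number of determinants at each excitation level relative to the reference, $\binom{N}{k}\binom{2M-N}{k}$; spin and spatial symmetry restrictions are ignored, and the function names are hypothetical.

```python
from math import comb

def n_determinants(n_spinorbitals, n_electrons):
    """Total number of Slater determinants: binomial(2M, N)."""
    return comb(n_spinorbitals, n_electrons)

def n_excited_determinants(n_spinorbitals, n_electrons, level):
    """Determinants obtained by exciting `level` electrons out of the reference."""
    n_virtual = n_spinorbitals - n_electrons
    return comb(n_electrons, level) * comb(n_virtual, level)

# Example: 10 electrons in 20 spinorbitals.
n_so, n_el = 20, 10
total = n_determinants(n_so, n_el)
by_level = [n_excited_determinants(n_so, n_el, k) for k in range(n_el + 1)]

print("total number of determinants:", total)
for k, count in enumerate(by_level):
    print(f"  excitation level {k}: {count}")
```

Summing over all excitation levels recovers the total, while truncation at doubles keeps only a small fraction of the determinants.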

Because of the cost of Full CI, what is usually done is to keep only the lower excitations and truncate the CI matrix; e.g., CI Doubles (CID) takes into consideration only double excitations. Since the single excitations do not couple directly with the HF determinant (Brillouin's theorem), the most significant contribution to the correlation energy must come from the double excitations, since they are the first excitations coupled with the HF Slater determinant. This gives a reduced matrix which is much more feasible for practical computation; however, this introduces another problem: the lack of size extensivity.

1. Size extensivity of CI

A method is said to be size extensive if the calculated energy scales linearly with the number of particles $N$, i.e., the word "extensive" is used in the same sense as in thermodynamics. The truncated CI will introduce errors in the wave function, which will in turn cause errors in the energy and all other properties. A particular result of truncating the $N$-electron basis is that the CI energies obtained are no longer size extensive.

Let us show that truncated CI is not size extensive through an example. Consider two noninteracting hydrogen (H$_2$) molecules. We expect the total energy of the two molecules to be the sum of the energies of the individual molecules, i.e., $E(2\mathrm{H}_2) = 2E(\mathrm{H}_2)$. Using CID for a single H$_2$ molecule in a minimal basis, where the only excitation coupled to the ground state is the double excitation $\sigma^2 \to \sigma^{*2}$, will result in the exact energy in this basis. However, if we use the CID method for the pair of molecules, the energy of the two molecules at large separation will not be the same as the sum of their energies calculated separately. The exact wavefunction for this system will look like

\[
|\Psi\rangle = \mathcal{A}\, |\psi_a\rangle |\psi_b\rangle
\]

\[
|\Psi\rangle = \mathcal{A}\left(a_0 |\sigma^2\rangle_a + a_2 |\sigma^{*2}\rangle_a\right)\left(a_0 |\sigma^2\rangle_b + a_2 |\sigma^{*2}\rangle_b\right)
\]

\[
|\Psi\rangle = \mathcal{A}\left(a_0^2\, |\sigma^2\rangle_a |\sigma^2\rangle_b + a_0 a_2\, |\sigma^2\rangle_a |\sigma^{*2}\rangle_b + a_0 a_2\, |\sigma^{*2}\rangle_a |\sigma^2\rangle_b + a_2^2\, |\sigma^{*2}\rangle_a |\sigma^{*2}\rangle_b\right)
\]

where $\mathcal{A}$ is the antisymmetrization operator, the state $|\sigma^2\rangle$ corresponds to both electrons being in the ground-state orbital, and $|\sigma^{*2}\rangle$ corresponds to both electrons being in the excited orbital. Notice that the last term in the expansion contains double excitations on both molecules. It is therefore a quadruply excited determinant, which is left out in the CID calculation. In order to recover the missing energy, we would have to include quadruply excited determinants in the CI basis set, since local double excitations can happen simultaneously on both subsystems.
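This argument can be reproduced with a two-level model. In the sketch below each molecule is described by a $2\times 2$ CI matrix in the basis $\{|\sigma^2\rangle, |\sigma^{*2}\rangle\}$; the coupling and excitation energy are made-up numbers, and the antisymmetrizer is ignored because the molecules do not interact. The dimer Hamiltonian is the sum of the two monomer Hamiltonians, and "CID" for the dimer is modeled by deleting the product state in which both molecules are doubly excited.

```python
import numpy as np

# Hypothetical monomer parameters (hartree).
COUPLING = 0.1           # <sigma^2| H |sigma*^2>
EXCITATION_ENERGY = 1.0  # energy of |sigma*^2> relative to |sigma^2>

h_monomer = np.array([[0.0, COUPLING],
                      [COUPLING, EXCITATION_ENERGY]])
e_monomer = np.linalg.eigvalsh(h_monomer)[0]

# Noninteracting dimer: H = h (x) 1 + 1 (x) h in the product basis
# ordered as (00), (0*), (*0), (**), where * marks a doubly excited monomer.
identity = np.eye(2)
h_dimer = np.kron(h_monomer, identity) + np.kron(identity, h_monomer)

e_dimer_full = np.linalg.eigvalsh(h_dimer)[0]          # includes the (**) state
e_dimer_cid = np.linalg.eigvalsh(h_dimer[:3, :3])[0]   # (**) state removed

print("2 x E(monomer)      =", 2.0 * e_monomer)
print("E(dimer), full CI   =", e_dimer_full)
print("E(dimer), CID model =", e_dimer_cid)
```

The full treatment of the dimer reproduces twice the monomer energy, whereas the truncated one lies above it, which is the size-extensivity error.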

It is clear that the fraction of the correlation energy recovered by a truncated CI will diminish as the size of the system increases, making it a progressively less accurate method. However, if we do truncate CI, we should realize that the spinorbitals in the Slater determinants come from the HF method, so we should allow those orbitals to re-optimize as we take linear combinations of the determinants. We should also consider, for example, not exciting the inner-shell orbitals, since the computational cost of those excitations is large while their effect on energy differences is small. We can neglect these excitations by "freezing" the core orbitals and applying CI only to the higher orbitals.

2. MCSCF, CASSCF, RASSCF, and MRCI

The Multi-Configurational Self-Consistent Field (MCSCF) method is another approach to the CI method, in which we decide on a set of determinants that can sufficiently describe our system. Each of the determinants is constructed from spinorbitals that are not fixed, but optimized so as to lower the total energy as much as possible. The main idea here is to use the variational principle to optimize not only the coefficients in front of the determinants, but also the spinorbitals used to construct the determinants. In a sense, the MCSCF method is a combination of the CI method and the HF method (if the number of determinants chosen is just 1, we get back the HF method).

The classical MCSCF approach follows very closely the Ritz variational method described before. We start with the MCSCF wavefunction, which has the form of a finite linear combination of Slater determinants $\Phi_I$

\[
\Psi^{MCSCF} = \sum_I c_I \Phi_I,
\]

where $c_I$ are the variational coefficients. Next, we calculate the coefficients $c_I$ using the variational method, without changing the determinants. Next, we vary the orbital expansion coefficients inside the determinants at fixed CI coefficients to obtain the best determinants. Finally, we repeat by going back and expanding the MCSCF wavefunction in terms of the newly optimized determinants.

The MCSCF method is mainly used to generate a qualitatively correct wavefunction, i.e., to recover the "static" part of the correlation. The goal is usually not to recover a large fraction of the total correlation energy, but to recover all the changes that occur in the correlation energy for a given process. A major problem that this procedure faces is figuring out which configurations are necessary to include for the property of interest.

The Complete Active Space Self-Consistent Field (CASSCF) method is a special case of the MCSCF method. Starting from the molecular orbitals computed with HF, we partition the space of these orbitals into an active and an inactive space. The inactive space is chosen from the low-energy orbitals, i.e., the orbitals that are doubly occupied in all determinants (inner shells). The remaining orbitals belong to the active space. Within the active space, we consider all possible occupancies and excitations of the active orbitals to obtain the set of determinants in the expansion of the MCSCF wavefunction (hence, "complete").

A common notation used for CASSCF is the following: [n,m]-CASSCF, where n is the number of electrons distributed in all possible ways in m orbitals. For example, [11,8]-CASSCF for the molecule NO pertains to the problem of 11 valence electrons being distributed between all configurations that can be constructed from 8 molecular orbitals. As with any full CI expansion, the CASSCF expansion quickly becomes too large to be useful, even for modest active spaces. To overcome this problem, a variation called the Restricted Active Space Self-Consistent Field (RASSCF) method is used.
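The size of a complete active space can be estimated by simple counting. The sketch below assumes that m counts spatial orbitals and that the numbers of spin-up and spin-down electrons are fixed; spatial symmetry is ignored, so the result is an upper estimate of the number of determinants. The split of the 11 electrons of NO into 6 spin-up and 5 spin-down electrons is our assumption for a doublet state, and the function name is hypothetical.

```python
from math import comb

def cas_determinant_count(n_orbitals, n_alpha, n_beta):
    """Number of determinants with n_alpha and n_beta electrons in n_orbitals."""
    return comb(n_orbitals, n_alpha) * comb(n_orbitals, n_beta)

# [11,8]-CASSCF example from the text, assuming 6 alpha and 5 beta electrons.
no_count = cas_determinant_count(8, 6, 5)
print("[11,8] active space:", no_count, "determinants")

# The count grows very quickly with the size of a half-filled active space.
for m in (8, 12, 16):
    print(f"[{m},{m}] active space:", cas_determinant_count(m, m // 2, m // 2))
```

The rapid growth in the second loop shows why the active space has to be kept small.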

In the RASSCF method, the active orbitals are divided into 3 subspaces, RAS1, RAS2, and RAS3. Each of these subspaces has restrictions on the excitations allowed. A typical example is one where RAS1 includes orbitals that are occupied in the HF reference determinant, RAS2 includes orbitals treated at the full CI level (or limited to SDTQ excitations), and RAS3 includes virtual orbitals that are empty in the HF determinant. The full CI expansion within the active space severely restricts the number of orbitals and electrons that can be treated by CASSCF methods. Any configurations in addition to those from the RAS2 space can be generated by allowing excitations from one space to another, for example, by allowing 2 electrons to be excited from RAS1 to RAS3. In essence, a typical example of the RASSCF method generates configurations by a combination of a full CI in a small number of orbitals in RAS2 and a CISD in a somewhat larger orbital space in RAS1 and RAS3.

Excitation energies of truncated CI methods such as the ones described above are generally too high, since the excited states are not as well correlated as the ground state is. To correlate ground and excited states equally, one can use a method called Multi-Reference Configuration Interaction (MRCI), which uses more than one reference determinant from which singly, doubly, and higher excited determinants are generated (this set of reference determinants is called the model space). MRCI also gives a better correlation of the ground state, which is important if the system under consideration has more than one dominant determinant, since some higher excited determinants are also taken into the CI space. The CI expansion is then obtained by replacing the spinorbitals in the model-space determinants by virtual orbitals.

B. Basis sets and basis set convergence

The standard wave functions used in solving Schrödinger's equation for atoms and molecules are constructed from antisymmetric products of spinorbitals. In most methods, these spinorbitals are generated by expanding them in a finite set of simple basis functions. The choice of basis functions for a molecular calculation is therefore important and depends on which system we wish to analyze. There are hundreds of basis sets that can be used, each optimized for a specific purpose. The most general types include Slater-type orbitals (STO) and Gaussian-type orbitals (GTO). Here, we will consider Thom Dunning's correlation-consistent basis sets, which were designed for converging post-HF calculations systematically to the complete basis set limit using extrapolation techniques. Correlation-consistent basis sets are built by adding functions corresponding to electron shells to a core set of HF functions.

What we will need for carrying out accurate correlated calculations is not only a set of spinorbitals that resemble as closely as possible the occupied orbitals of the atomic systems, but also a set of virtual correlating orbitals into which the correlated electrons can be excited. An obvious candidate here is the set of canonical orbitals from the HF calculations; however, since the lowest virtual HF orbitals are very diffuse, they are not well suited for correlating the ground-state electrons, except when the full set of orbitals is used. Another strategy is to generate correlating atomic orbitals for molecular calculations by relying on the energy criterion alone, i.e. to adjust the exponents of the correlating orbitals so as to maximize their contribution to the correlation energy. By doing this, we should be able to generate sets of correlating orbitals that are more compact, i.e. contain fewer primitive basis functions. This method generates correlation-consistent basis sets, meaning that each basis set contains all correlating orbitals that lower the energy by comparable amounts as well as all orbitals that lower the energy by larger amounts.

In these correlation-consistent basis sets, each correlating orbital is represented by a single primitive chosen so as to maximize its contribution to the correlation energy, and all correlating orbitals that make similar contributions to the correlation energy are added simultaneously. A hierarchy of basis sets can then be set up that is correlation-consistent in the sense that each basis set contains all correlating orbitals that lower the energy by comparable amounts as well as all orbitals that lower the energy by larger amounts. The main advantage of this method is that it allows us to employ smaller primitive sets.

Correlation-consistent basis sets were designed to converge systematically to the complete basis set limit using extrapolation techniques. Let us consider the structure of the correlation-consistent basis sets in more detail. We will start with the two main families of basis sets, cc-pVXZ and cc-pCVXZ, where X = D, T, Q, 5, 6, 7, ... Here, cc-p stands for correlation-consistent polarized, and V and CV stand for valence and core-valence, respectively. The p indicates the presence of polarization functions in the basis set. XZ denotes the zeta level, which tells us how many basis functions are used for each atomic orbital. As we increase X, we add functions of higher angular momentum, which span a higher angular space. The basis functions are added in shells, e.g., for the C atom, cc-pVDZ consists of [3s2p1d], cc-pVTZ consists of [4s3p2d1f], and cc-pVQZ consists of [5s4p3d2f1g]. The main difference between the two families is that the cc-pCVXZ basis sets are extended from the standard cc-pVXZ sets for additional flexibility in the core region. A prefix aug can be added to the two families of basis sets above to indicate that one set of diffuse functions is added for every angular momentum present in the basis, improving flexibility in the outer valence region.
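The growth of the basis with the cardinal number can be made concrete by counting functions. The following is a minimal sketch, assuming spherical-harmonic components (2l+1 functions per shell); the helper name `count_functions` and the dictionary `CARBON` are illustrative and not part of any basis-set library.

```python
# Number of spherical-harmonic components (2l + 1) per angular-momentum label.
# Assumption: spherical rather than Cartesian Gaussians.
COMPONENTS = {"s": 1, "p": 3, "d": 5, "f": 7, "g": 9, "h": 11}


def count_functions(contraction):
    """Count basis functions in a contraction string such as '3s2p1d'."""
    total = 0
    digits = ""
    for ch in contraction:
        if ch.isdigit():
            digits += ch          # accumulate the shell count
        else:
            total += int(digits) * COMPONENTS[ch]
            digits = ""
    return total


# Contracted sets quoted in the text for the carbon atom.
CARBON = {
    "cc-pVDZ": "3s2p1d",
    "cc-pVTZ": "4s3p2d1f",
    "cc-pVQZ": "5s4p3d2f1g",
}

for name, contraction in CARBON.items():
    print(name, contraction, count_functions(contraction))
```

With these assumptions the three sets contain 14, 30, and 55 functions, which illustrates how quickly the cost grows with X.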

As the number of basis functions increases, the wavefunctions become better represented and the energy decreases toward the complete basis set (CBS) limit. An infinite number of basis functions is impossible to employ in practice, but we can try to estimate the energy at the CBS limit. By using hierarchical basis sets, i.e. correlation-consistent sets with adjacent cardinal numbers, we can calculate the energy for a couple of points and then extrapolate to the energies of larger basis sets.

If we look at the dependence of the HF energy on the basis set size, we will see that the error in the HF energy should decrease exponentially with the cardinal number X. The correlation energy converges differently, with an error that scales as $X^{-3}$. This allows us to carry out calculations at, for example, Dζ and Tζ, and fit the energies on a logarithmic plot of energy vs. X. This line can then be used to extrapolate what the energies would be at higher ζ, or even at the CBS limit.
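As an illustration of the extrapolation idea, the sketch below assumes that the correlation energy follows $E(X) = E_{CBS} + A X^{-3}$ exactly, so that two energies at different cardinal numbers determine $E_{CBS}$. The function name `cbs_two_point` and the numbers used are hypothetical, chosen only to show the arithmetic.

```python
def cbs_two_point(e_x, x, e_y, y):
    """Two-point extrapolation assuming E(X) = E_CBS + A * X**-3.

    e_x, e_y: correlation energies at cardinal numbers x and y.
    Eliminating A from the two equations gives the expression below.
    """
    return (x**3 * e_x - y**3 * e_y) / (x**3 - y**3)


# Synthetic data generated from the assumed model (not real calculations).
E_CBS_MODEL = -0.400
A_MODEL = 0.050
e_dz = E_CBS_MODEL + A_MODEL / 2**3
e_tz = E_CBS_MODEL + A_MODEL / 3**3

print(cbs_two_point(e_tz, 3, e_dz, 2))
```

For data that obey the model exactly the formula recovers the model limit; for real energies the result is only an estimate whose quality depends on how well the $X^{-3}$ form holds.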


C. Explicitly-correlated methods

In this section, we will consider methods that utilize wavefunctions that depend explicitly on the interelectronic distance $r_{12}$. Such an explicitly correlated wavefunction leads to much faster convergence of the CI expansion and dramatically improves the accuracy of the energy. Recall that in the HF method we neglected all instantaneous interactions between the electrons, i.e. the HF wavefunction did not depend on $r_{12}$ near $r_{12} = 0$. This method overestimates the probability of finding two electrons close together and thus overestimates the electron repulsion energy. To account for the interactions between electrons, we must somehow incorporate the interelectronic distance into our calculation. However, these explicitly correlated methods do bring a couple of problems. First, the resulting algorithms are much more difficult to implement. Second, they are incompatible with concepts such as orbitals and electron configurations, since they avoid the one-electron approximation from the very beginning.

1. Coulomb cusp

We will consider the behavior of the exact wavefunctions for coinciding particles; in particular, where the electronic Hamiltonian becomes singular and gives rise to a cusp in the wavefunction. For simplicity, we will examine the ground state of He, for which we can easily generate accurate approximations to the true wavefunction. The Hamiltonian of He is given as

$$ H = -\frac{1}{2}\nabla_1^2 - \frac{1}{2}\nabla_2^2 - \frac{2}{|\mathbf{r}_1|} - \frac{2}{|\mathbf{r}_2|} + \frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} $$

We can see here that the singularities of this Hamiltonian occur if $\mathbf{r}_1 = 0$, $\mathbf{r}_2 = 0$, or $\mathbf{r}_1 - \mathbf{r}_2 = 0$. At these points, the exact solution of Schrödinger's equation must provide contributions to $H\Psi$ that balance the singularities in $H$ to ensure that the local energy remains constant and equal to the energy eigenvalue $E$. The only possible source of this balancing is the kinetic energy term. It is convenient to express the Hamiltonian in terms of the relative coordinates $r_1$, $r_2$, and $r_{12}$, where $r_1$ and $r_2$ are the distances of the electrons from the nucleus and $r_{12}$ is the interelectronic distance. Doing so, we get

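$$ H = -\frac{1}{2}\sum_{i=1}^{2}\left(\frac{\partial^2}{\partial r_i^2} + \frac{2}{r_i}\frac{\partial}{\partial r_i} + \frac{2Z}{r_i}\right) - \left(\frac{\partial^2}{\partial r_{12}^2} + \frac{2}{r_{12}}\frac{\partial}{\partial r_{12}} - \frac{1}{r_{12}}\right) - \left(\frac{\mathbf{r}_1}{r_1}\cdot\frac{\mathbf{r}_{12}}{r_{12}}\frac{\partial}{\partial r_1} + \frac{\mathbf{r}_2}{r_2}\cdot\frac{\mathbf{r}_{21}}{r_{21}}\frac{\partial}{\partial r_2}\right)\frac{\partial}{\partial r_{12}} $$

Schrödinger's equation must be well behaved, so the singularities must somehow cancel, leading to nuclear and interelectronic cusps. In order for the singularities to cancel, the terms that multiply $1/r_i$ and $1/r_{12}$ must cancel. We'll only look at the electron-electron cusp, for which the terms with $1/r_{12}$ must vanish in $H\Psi$. From the second term, we find that this leads to

$$ \left.\frac{\partial\Psi}{\partial r_{12}}\right|_{r_{12}=0} = \frac{1}{2}\Psi(r_{12} = 0) $$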


which describes the behavior of the wavefunction when the electrons coincide and represents the electron-electron cusp condition. This cusp condition is impossible to fulfill using orbital-based wave functions.

If we do a FCI expansion for He in terms of Slater-type orbitals, we will get

$$ \Psi^{FCI} = e^{-\zeta(r_1+r_2)}\sum_{ijk} c_{ijk}\left(r_1^i r_2^j + r_1^j r_2^i\right) r_{12}^{2k} $$

where the summation is over all nonnegative integers. This FCI expansion thus contains all possible combinations of powers of $r_1$, $r_2$ and $r_{12}^2$. Our wavefunction now includes the interelectronic distance $r_{12}$; however, since only even powers of $r_{12}$ are present, the cusp condition can never be satisfied. This missing cusp condition in the wavefunction leads to slow convergence of CI with respect to the basis set. This is an intrinsic problem shared by all wavefunction expansions in orbital products. In order to fix this problem and gain faster convergence, we will introduce an explicit linear dependence on $r_{12}$ into the wavefunction,

$$ \Psi^{CI}_{r_{12}} = \left(1 + \frac{1}{2}r_{12}\right)\Psi^{CI} $$

Now if we take the derivative, we get

$$ \left.\frac{\partial\Psi^{CI}_{r_{12}}}{\partial r_{12}}\right|_{r_{12}=0} = \frac{1}{2}\Psi^{CI}(r_{12} = 0) = \frac{1}{2}\Psi^{CI}_{r_{12}}(r_{12} = 0) $$

which satisfies the Coulomb cusp condition exactly. In general, we may impose the correct Coulomb cusp behavior on any determinant-based wave function $\Phi$ by multiplying the expansion by some correlating function $\gamma$, such that

$$ \gamma = 1 + \frac{1}{2}\sum_{i<j} r_{ij} $$

which leads to the correct non-differentiable cusp in the product function $\gamma\Phi$. A proof of this is assigned as a homework problem. However, just because it now has the correct cusp behavior doesn't mean that the associated improvements in the energy are significant. If we plot the helium ground-state energy as a function of the number of terms in the expansion, we will see that introducing the single $r_{12}$ term reduces the error by 2 orders of magnitude. In order to converge the CI-R12 energy with even more accuracy, we will need an even more flexible wave function.
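The effect of the linear factor on the slope at coalescence can be checked numerically. The sketch below uses a made-up model function `psi_ci` that depends on $r_{12}$ only through even powers (standing in for an orbital-based expansion); its coefficients are arbitrary and purely illustrative.

```python
def psi_ci(r12):
    """Model orbital-type factor: only even powers of r12 (arbitrary coefficients)."""
    return 1.0 + 0.3 * r12**2 - 0.05 * r12**4


def psi_r12(r12):
    """The same function multiplied by the linear correlation factor (1 + r12/2)."""
    return (1.0 + 0.5 * r12) * psi_ci(r12)


def slope_at_zero(f, h=1e-6):
    """One-sided finite-difference derivative at r12 = 0 (r12 is a distance, r12 >= 0)."""
    return (f(h) - f(0.0)) / h


# Without the linear term the slope vanishes, so the cusp condition fails.
print(slope_at_zero(psi_ci), 0.5 * psi_ci(0.0))
# With the linear term the slope equals half the value at coalescence.
print(slope_at_zero(psi_r12), 0.5 * psi_r12(0.0))
```

The first line prints a slope close to zero against a required value of 0.5, while the second shows the two numbers agreeing to within the finite-difference error.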

2. Hylleraas function

The Hylleraas function is one such function. Hylleraas was the first to succeed in constructing an accurate wavefunction for the singlet S state of the helium atom. If we generalize the FCI expansion in STO's to include all powers of $r_{12}$, we obtain

$$ \Psi^{H} = e^{-\zeta(r_1+r_2)}\sum_{ijk} c_{ijk}\left(r_1^i r_2^j + r_2^i r_1^j\right) r_{12}^{k} $$

which is usually expressed as

$$ \Psi^{H} = e^{-\zeta s}\sum_{ijk} c_{ijk}\, s^i t^{2j} u^k $$

where s, t and u are the so-called Hylleraas coordinates,

$$ s = r_1 + r_2, \qquad t = r_1 - r_2, \qquad u = r_{12} $$

The Hylleraas function is usually truncated according to $i + 2j + k \le N$; even so, it achieves very high accuracy with only a few terms, especially for helium. This function is, however, applicable only to few-electron atomic systems, since the complexity of the function increases dramatically with the number of electrons.

3. Slater geminals

Geminals, or two-electron functions, are another type of explicitly correlated functions that represent a generalization of single-electron orbitals accounting for intra-orbital correlation effects. The wavefunction is expanded into two-electron basis functions in addition to orbital products. The primary cusp condition suggests that such an expansion is effective for geminal basis functions with the asymptotic behavior

$$ f_{12} = \frac{1}{2}r_{12} + O(r_{12}^2) $$

Including these $f_{12}$ functions requires two-electron integrals for operators involving $f_{12}$ and $r_{12}^{-1}$, such as

$$ K^{(Q)}_{12} = -(\nabla_1 f_{12})\cdot(\nabla_1 f_{12}) $$

Most explicitly correlated methods have employed basis functions such as the linear $r_{12}$ (R12) or Gaussian-type geminals (GTG),

$$ f^{R12}_{12} = \frac{1}{2}r_{12}, \qquad f^{GTG}_{12} = \sum_{G}^{N_G} c_G\, e^{-\zeta_G r_{12}^2} $$

A downside of R12 functions is that the associated energies do not always cover a sufficient fraction of the correlation energy. GTGs do not suffer from such a problem at large $r_{12}$; however, they never fulfill the cusp condition exactly. Despite this, a modest number of GTGs can still represent a suitable range of $r_{12}$ accurately. The main disadvantage is that the computation of the integrals involved can get relatively costly, especially for operators quadratic in $f_{12}$, which involve $N_G^2/2$ primitive operations. Slater-type geminals (STG), or Slater geminals, with the form

$$ f^{STG}_{12} = -\frac{r_c}{2}\, e^{-r_{12}/r_c} $$

where $r_c$ is a scale-length parameter, remedy the above problems of GTGs. These functions use STO's as geminal basis functions to incorporate interelectronic distances. STG simplifies the quadratic operators to exponential forms, i.e.

$$ K^{(Q)}_{12} = -\frac{1}{4}\, e^{-2r_{12}/r_c}, \qquad \left(f^{STG}_{12}\right)^2 = -r_c^2\, K^{(Q)}_{12} $$

It turns out that STGs provide better results than GTG and R12 correlation factors. For example, results of at least 5ζ quality are obtained in a Tζ basis when STGs are used. From a computational point of view, STGs are also more efficient due to their compact and short-range form.
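Two properties of the Slater geminal can be verified numerically: its slope at $r_{12} = 0$ equals 1/2, as the cusp condition requires, and $K^{(Q)}_{12}$ reduces to a single exponential. The sketch below assumes that $f_{12}$ depends on $r_{12}$ only, so that $|\nabla_1 f_{12}| = |df_{12}/dr_{12}|$; all function names are illustrative.

```python
import math


def f_stg(r12, rc=1.0):
    """Slater-type geminal f = -(rc/2) exp(-r12/rc)."""
    return -0.5 * rc * math.exp(-r12 / rc)


def k_q_numeric(r12, rc=1.0, h=1e-5):
    """K^(Q) = -(grad f).(grad f), with the radial derivative taken by central difference."""
    dfdr = (f_stg(r12 + h, rc) - f_stg(r12 - h, rc)) / (2.0 * h)
    return -dfdr**2


def k_q_closed(r12, rc=1.0):
    """Closed form quoted in the text: K^(Q) = -(1/4) exp(-2 r12 / rc)."""
    return -0.25 * math.exp(-2.0 * r12 / rc)


rc = 1.3
h = 1e-6
slope0 = (f_stg(h, rc) - f_stg(0.0, rc)) / h   # one-sided slope at coalescence
print(slope0)                                   # close to 0.5
print(k_q_numeric(0.7, rc), k_q_closed(0.7, rc))
```

The slope is independent of $r_c$, which is why the scale length can be tuned freely without spoiling the cusp.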

4. Explicitly-correlated Gaussian functions

Explicitly correlated Gaussians were proposed to describe N-particle wavefunctions with a basis of exponential functions whose argument involves the squares of the interelectronic distances,

$$ \psi_{ECG} = A\, e^{-\sum_{i<j}^{N} \alpha_{ij} (\mathbf{r}_i - \mathbf{r}_j)^2} $$

where $\alpha_{ij}$ are adjustable parameters. These functions were called Gaussian-type geminals (GTG) in their earliest, two-electron version. At first, these functions with exponential correlating factors were underestimated and claimed to converge much more slowly than correlating functions containing powers of $r_{12}$. It was shown that careful optimization of the nonlinear parameters allows very short expansions of high quality for certain molecules, such as H2.

The main advantage of these functions is that they have very simple integrals, which results in easy applicability to general many-center molecules. The integrals are no more complicated than ordinary Gaussian integrals, involving only the exponential function and the well-known gamma function. Other advantages of these types of functions are that they give very high accuracy since the basis functions are correlated, which is magnified for systems with strongly attractive interparticle interactions. The quadratic form involving $r_{ij}$ also permits the reduction of the Hamiltonian matrix elements to very simple analytic expressions, which do not gain any more algebraic complexity for $N \ge 3$. The main disadvantages they seem to exhibit are that they are unable to describe the electron-nuclear cusp, they vanish too quickly at large distances, and the Gaussian correlation factor does not reproduce the electron-electron cusp, as mentioned in the previous section.

VIII. MANY-BODY PERTURBATION THEORY (MBPT)

The second-quantized formalism is perhaps most extensively utilized in the field of perturbation theory of many-electron systems. This is due to the tedious derivations necessary to arrive at feasible working formulae, especially at the higher orders of PT.......

A. Rayleigh Schrödinger perturbation theory (classical derivation)

Let us first review the essence of the nondegenerate Rayleigh-Schrödinger perturbation theory. Consider the time-independent Schrödinger equation

$$ H\Psi_n = E_n\Psi_n \tag{65} $$

Finding solutions to this equation is, in most cases, a difficult task. Assume, however, that the Hamiltonian consists of two Hermitian parts, a zero-order part and a perturbation,

$$ H = H_0 + V \tag{66} $$

It is convenient to write it in the following form

$$ H = H_0 + \lambda V \tag{67} $$

where $\lambda$ is an "order parameter" that is used to classify the various contributions by their order. We assume that the solutions to the zeroth-order eigenvalue problem for $H_0$ are known,

$$ H_0\Phi_n = E_n^{(0)}\Phi_n \tag{68} $$

with

$$ \langle\Phi_m|\Phi_n\rangle = \delta_{mn} \tag{69} $$

If $\Phi_n$ is nondegenerate, it is possible to number the solutions in such a way that

$$ \lim_{\lambda\to 0}\Psi_n = \Phi_n, \qquad \lim_{\lambda\to 0}E_n = E_n^{(0)} \tag{70} $$

And if there are degeneracies, it is possible to choose the zero-order solutions so that (70) is still satisfied. We define

$$ \chi_n = \Psi_n - \Phi_n, \qquad \Delta E_n = E_n - E_n^{(0)} \tag{71} $$

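Here we have partitioned $\Psi_n$ into two parts, one parallel (i.e. proportional) to $\Phi_n$ and the other orthogonal to it. So it is convenient to use intermediate normalization:

$$ \langle\Phi_n|\Phi_n\rangle = 1, \qquad \langle\chi_n|\Phi_n\rangle = 0, \qquad \langle\Psi_n|\Phi_n\rangle = \langle\Phi_n + \chi_n|\Phi_n\rangle = 1, \qquad \langle\Psi_n|\Psi_n\rangle = 1 + \langle\chi_n|\chi_n\rangle \tag{72} $$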

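To proceed further, we use the order parameter $\lambda$ and expand:

$$ \Psi_n = \Phi_n + \chi_n = \Psi_n^{(0)} + \lambda\Psi_n^{(1)} + \lambda^2\Psi_n^{(2)} + \dots \qquad (\Psi_n^{(0)} \equiv \Phi_n) $$
$$ E_n = E_n^{(0)} + \Delta E_n = E_n^{(0)} + \lambda E_n^{(1)} + \lambda^2 E_n^{(2)} + \dots \tag{73} $$

Substituting into the Schrödinger equation

$$ (H - E_n)\Psi_n = 0 \tag{74} $$

with $H = H_0 + \lambda V$, we get

$$ \left(H_0 + \lambda V - E_n^{(0)} - \lambda E_n^{(1)} - \lambda^2 E_n^{(2)} - \dots\right)\left(\Psi_n^{(0)} + \lambda\Psi_n^{(1)} + \lambda^2\Psi_n^{(2)} + \dots\right) = 0 \tag{75} $$

Equating coefficients of powers of $\lambda$ gives for $\lambda^0$, $\lambda^1$, and $\lambda^2$, respectively:

$$ \left(H_0 - E_n^{(0)}\right)\Psi_n^{(0)} = 0 \quad \text{(zero order)}, \tag{76} $$
$$ \left(H_0 - E_n^{(0)}\right)\Psi_n^{(1)} = \left(E_n^{(1)} - V\right)\Psi_n^{(0)} \quad \text{(first order)}, \tag{77} $$
$$ \left(H_0 - E_n^{(0)}\right)\Psi_n^{(2)} = \left(E_n^{(1)} - V\right)\Psi_n^{(1)} + E_n^{(2)}\Psi_n^{(0)} \quad \text{(second order)} \tag{78} $$

and in general, for $\lambda^m$, the mth-order equation

$$ \left(H_0 - E_n^{(0)}\right)\Psi_n^{(m)} = \left(E_n^{(1)} - V\right)\Psi_n^{(m-1)} + \sum_{l=0}^{m-2} E_n^{(m-l)}\Psi_n^{(l)} \tag{79} $$

which becomes

$$ \left(E_n^{(0)} - H_0\right)\Psi_n^{(m)} = V\Psi_n^{(m-1)} - \sum_{l=0}^{m-1} E_n^{(m-l)}\Psi_n^{(l)} \tag{80} $$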

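In order to get expressions for $E_n^{(m)}$ we apply $\langle\Phi_n|$ to each equation and integrate. For $\lambda^1$ we get

$$ \langle\Phi_n|H_0 - E_n^{(0)}|\Psi_n^{(1)}\rangle = \langle\Phi_n|E_n^{(1)} - V|\Phi_n\rangle \tag{81} $$

By the Hermitian property of $H_0$ we have

$$ \langle\underbrace{\left(H_0 - E_n^{(0)}\right)\Phi_n}_{=0}|\Psi_n^{(1)}\rangle = E_n^{(1)} - \underbrace{\langle\Phi_n|V|\Phi_n\rangle}_{\equiv V_{nn}} \tag{82} $$

and so

$$ E_n^{(1)} = \langle\Phi_n|V|\Phi_n\rangle = V_{nn} \tag{83} $$

Thus we have obtained $E_n^{(1)}$ without knowledge of $\Psi_n^{(1)}$, and the same can be done for each order m:

$$ \underbrace{\langle\Phi_n|\left(E_n^{(0)} - H_0\right)}_{=0}|\Psi_n^{(m)}\rangle = \langle\Phi_n|V|\Psi_n^{(m-1)}\rangle - \sum_{l=0}^{m-1} E_n^{(m-l)}\underbrace{\langle\Phi_n|\Psi_n^{(l)}\rangle}_{=\delta_{l0}} \tag{84} $$

giving

$$ E_n^{(m)} = \langle\Phi_n|V|\Psi_n^{(m-1)}\rangle \tag{85} $$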

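Thus, in principle, we can obtain each $E_n^{(m)}$ from the previous $\Psi_n^{(m-1)}$ and then solve for $\Psi_n^{(m)}$ etc., while always maintaining $\langle\Phi_n|\Psi_n^{(m)}\rangle = 0$ $(m > 0)$.

To calculate $\Psi_n^{(m)}$ we can expand it in terms of the known zero-order solutions $\Phi_k$. This exploits the fact that the eigenfunctions of any semibounded Hermitian operator form a complete set:

$$ \Psi_n^{(m)} = \sum_k a_{kn}^{(m)}\Phi_k = \sum_k |\Phi_k\rangle\langle\Phi_k|\Psi_n^{(m)}\rangle, \qquad a_{kn}^{(m)} = \langle\Phi_k|\Psi_n^{(m)}\rangle \quad \text{(to be determined)} \tag{86} $$

To obtain $a_{kn}^{(m)}$ we multiply the mth-order equation by $\langle\Phi_k|$ and integrate:

$$ \underbrace{\langle\Phi_k|\left(E_n^{(0)} - H_0\right)}_{\left(E_n^{(0)} - E_k^{(0)}\right)\langle\Phi_k|}|\Psi_n^{(m)}\rangle = \underbrace{\langle\Phi_k|V|\Psi_n^{(m-1)}\rangle}_{\sum_j \langle\Phi_k|V|\Phi_j\rangle\langle\Phi_j|\Psi_n^{(m-1)}\rangle} - \sum_{l=0}^{m-1} E_n^{(m-l)}\underbrace{\langle\Phi_k|\Psi_n^{(l)}\rangle}_{=a_{kn}^{(l)}} \tag{87} $$

Thus

$$ \left(E_n^{(0)} - E_k^{(0)}\right)a_{kn}^{(m)} = \sum_j V_{kj}\,a_{jn}^{(m-1)} - \sum_{l=0}^{m-1} E_n^{(m-l)}a_{kn}^{(l)} \tag{88} $$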

In this equation the $l = 0$ contributions are to be interpreted as $a_{kn}^{(0)} = \langle\Phi_k|\Phi_n\rangle = \delta_{kn}$. This result provides a system of equations for the coefficients $a_{kn}^{(m)}$, to be solved order by order, but the first thing to notice is that we have no equation for $a_{nn}^{(m)}$; this coefficient is arbitrary, corresponding to the arbitrariness of adding any multiple of the zero-order solution $\Phi_n$. This arbitrariness appears for each order $\Psi_n^{(m)}$ separately. The following choice of intermediate normalization can thus be made for each order:

$$ \langle\Phi_n|\Psi_n^{(m)}\rangle = 0 \quad (m > 0), \qquad a_{nn}^{(m)} = 0 \quad (m > 0). $$

Consequently

$$ a_{nn}^{(m)} = \delta_{m0} \tag{89} $$

Since $a_{kn}^{(0)} = \delta_{kn}$, the first-order equation becomes:

$$ \left(E_n^{(0)} - E_k^{(0)}\right)a_{kn}^{(1)} = \sum_j V_{kj}\underbrace{a_{jn}^{(0)}}_{\delta_{jn}} - E_n^{(1)}a_{kn}^{(0)} = V_{kn} - E_n^{(1)}a_{kn}^{(0)} = V_{kn} \quad (n \neq k) $$
$$ a_{kn}^{(1)} = \frac{V_{kn}}{E_n^{(0)} - E_k^{(0)}} \quad (n \neq k) \tag{90} $$

Thus we have the well-known result

$$ \Psi_n^{(1)} = \sum_{k\neq n}\frac{V_{kn}}{E_n^{(0)} - E_k^{(0)}}\,\Phi_k \tag{91} $$

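From this we get the second-order energy,

$$ E_n^{(2)} = \langle\Phi_n|V|\Psi_n^{(1)}\rangle = \sum_k \langle\Phi_n|V|\Phi_k\rangle\langle\Phi_k|\Psi_n^{(1)}\rangle = \sum_k a_{kn}^{(1)}V_{nk} = \sum_{k\neq n}\frac{V_{nk}V_{kn}}{E_n^{(0)} - E_k^{(0)}} = -\sum_{k\neq n}\frac{|V_{kn}|^2}{E_k^{(0)} - E_n^{(0)}} \tag{92} $$

which is also well known. This process can be continued in the same manner to higher orders, e.g.,

$$ a_{kn}^{(2)} = \left(E_n^{(0)} - E_k^{(0)}\right)^{-1}\left[\sum_{j\neq n} a_{jn}^{(1)}V_{kj} - E_n^{(1)}a_{kn}^{(1)} - E_n^{(2)}a_{kn}^{(0)}\right] = \sum_{j\neq n}\frac{V_{kj}V_{jn}}{\left(E_k^{(0)} - E_n^{(0)}\right)\left(E_j^{(0)} - E_n^{(0)}\right)} - \frac{V_{kn}V_{nn}}{\left(E_k^{(0)} - E_n^{(0)}\right)^2} \quad (k \neq n) \tag{93} $$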

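Using (93) in $E_n^{(3)} = \langle\Phi_n|V|\Psi_n^{(2)}\rangle = \sum_{k\neq n} V_{nk}\,a_{kn}^{(2)}$ we can write

$$ E_n^{(3)} = \sum_{k,j\neq n}\frac{V_{nk}V_{kj}V_{jn}}{\left(E_k^{(0)} - E_n^{(0)}\right)\left(E_j^{(0)} - E_n^{(0)}\right)} - V_{nn}\sum_{k\neq n}\frac{V_{nk}V_{kn}}{\left(E_k^{(0)} - E_n^{(0)}\right)^2} \tag{94} $$

It is evident that while this procedure is quite straightforward, the book-keeping for the order-by-order generation of the wave function and energy is cumbersome.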
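The order-by-order recursion can be delegated to a short program. The following is a minimal sketch of the recursion for the expansion coefficients and energies under intermediate normalization, assuming a real symmetric perturbation matrix given in the eigenbasis of $H_0$ and a nondegenerate state n. The function name `rspt_energies` and the 2x2 test matrices are illustrative only.

```python
def rspt_energies(e0, V, n, max_order):
    """Return [E^(0), E^(1), ..., E^(max_order)] for state n.

    e0: zeroth-order energies (state n must be nondegenerate)
    V:  real symmetric perturbation matrix in the basis of H0 eigenfunctions
    """
    dim = len(e0)
    # a[m][k] = <Phi_k|Psi_n^(m)>; zeroth order is the Kronecker delta.
    a = [[1.0 if k == n else 0.0 for k in range(dim)]]
    E = [e0[n]]
    for m in range(1, max_order + 1):
        # Energy of order m from the wave function of order m - 1.
        E.append(sum(V[n][j] * a[m - 1][j] for j in range(dim)))
        am = [0.0] * dim  # a_nn^(m) = 0 by intermediate normalization
        for k in range(dim):
            if k == n:
                continue
            rhs = sum(V[k][j] * a[m - 1][j] for j in range(dim))
            rhs -= sum(E[m - l] * a[l][k] for l in range(m))
            am[k] = rhs / (e0[n] - e0[k])
        a.append(am)
    return E


# Two-level example: H0 = diag(0, 1), V = [[a, b], [b, c]].
e0 = [0.0, 1.0]
V = [[0.1, 0.2], [0.2, -0.3]]
E = rspt_energies(e0, V, n=0, max_order=3)
print(E)
```

For this two-level model the closed-form sum-over-states expressions give $E^{(1)} = a$, $E^{(2)} = -b^2$ and $E^{(3)} = b^2(c - a)$ (unit energy gap), which the recursion reproduces.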

B. Hylleraas variation principle

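Hylleraas showed that the first-order wave function and the second-order energy can also be determined variationally. According to the Hylleraas variation principle, a trial wave function $\tilde\Psi_n^{(1)}$ that approximates the first-order wave function yields an upper bound to the second-order energy. To see this, we take the first-order equation (77), multiply it by $\langle\Psi_n^{(1)}|$ and integrate,

$$ \langle\Psi_n^{(1)}|\left(H_0 - E_n^{(0)}\right)|\Psi_n^{(1)}\rangle = \langle\Psi_n^{(1)}|\left(E_n^{(1)} - V\right)|\Phi_n\rangle $$
$$ 0 = \langle\Psi_n^{(1)}|\left(H_0 - E_n^{(0)}\right)|\Psi_n^{(1)}\rangle + \langle\Psi_n^{(1)}|\left(V - E_n^{(1)}\right)|\Phi_n\rangle \tag{95} $$

To this equation we add the equation for the second-order energy,

$$ E_n^{(2)} = \langle\Phi_n|V - E_n^{(1)}|\Psi_n^{(1)}\rangle \tag{96} $$

Adding (95) and (96) gives

$$ E_n^{(2)} = \langle\Phi_n|V - E_n^{(1)}|\Psi_n^{(1)}\rangle + \langle\Psi_n^{(1)}|\left(H_0 - E_n^{(0)}\right)|\Psi_n^{(1)}\rangle + \langle\Psi_n^{(1)}|\left(V - E_n^{(1)}\right)|\Phi_n\rangle $$
$$ = 2\,\mathrm{Re}\,\langle\Psi_n^{(1)}|\left(V - E_n^{(1)}\right)|\Phi_n\rangle + \langle\Psi_n^{(1)}|\left(H_0 - E_n^{(0)}\right)|\Psi_n^{(1)}\rangle \tag{97} $$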

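If we define a functional of the trial function $\tilde\Psi_n^{(1)}$,

$$ J_2\left[\tilde\Psi_n^{(1)}\right] = 2\,\mathrm{Re}\,\langle\tilde\Psi_n^{(1)}|\left(V - E_n^{(1)}\right)|\Phi_n\rangle + \langle\tilde\Psi_n^{(1)}|\left(H_0 - E_n^{(0)}\right)|\tilde\Psi_n^{(1)}\rangle \tag{98} $$

then we can write

$$ J_2\left[\tilde\Psi_n^{(1)}\right] \ge E_n^{(2)} \tag{99} $$

If $\tilde\Psi_n^{(1)}$ is the exact first-order correction to the wave function, from (97) it follows that

$$ J_2\left[\Psi_n^{(1)}\right] = E_n^{(2)} \tag{100} $$

Otherwise, the functional $J_2\left[\tilde\Psi_n^{(1)}\right]$ yields an upper bound for $E_n^{(2)}$. It can then be proved that equation (77) for the first-order correction follows directly from the variation of the functional $J_2\left[\tilde\Psi_n^{(1)}\right]$ equated to zero.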

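$$ \delta J_2\left[\tilde\Psi_n^{(1)}\right] = \langle\delta\tilde\Psi_n^{(1)}|\left(V - E_n^{(1)}\right)|\Phi_n\rangle + \langle\Phi_n|\left(V - E_n^{(1)}\right)|\delta\tilde\Psi_n^{(1)}\rangle + \langle\delta\tilde\Psi_n^{(1)}|\left(H_0 - E_n^{(0)}\right)|\tilde\Psi_n^{(1)}\rangle + \langle\tilde\Psi_n^{(1)}|\left(H_0 - E_n^{(0)}\right)|\delta\tilde\Psi_n^{(1)}\rangle \tag{101} $$

Requiring that $\delta J_2\left[\tilde\Psi_n^{(1)}\right] = 0$ for any $\delta\tilde\Psi$ (including $\delta\tilde\Psi^*$), we get

$$ 0 = \langle\delta\tilde\Psi_n^{(1)}|\left(V - E_n^{(1)}\right)|\Phi_n\rangle + \langle\delta\tilde\Psi_n^{(1)}|\left(H_0 - E_n^{(0)}\right)|\tilde\Psi_n^{(1)}\rangle $$

and then

$$ \left(H_0 - E_n^{(0)}\right)\tilde\Psi_n^{(1)} = \left(E_n^{(1)} - V\right)\Psi_n^{(0)} \tag{102} $$

for which $\tilde\Psi_n^{(1)} = \Psi_n^{(1)}$ is a solution (since the above relation is equivalent to the first-order equation).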

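Next we show that if $E_n^{(0)}$ is the lowest eigenvalue of $H_0$ then $J_2\left[\tilde\Psi_n^{(1)}\right]$ is an upper bound for $E_n^{(2)}$. We write the trial wave function $\tilde\Psi_n^{(1)}$ as

$$ \tilde\Psi_n^{(1)} = \Psi_n^{(1)} + \chi \tag{103} $$

Using (103) in (98) gives

$$ J_2\left[\tilde\Psi_n^{(1)}\right] = 2\,\mathrm{Re}\,\langle\Psi_n^{(1)} + \chi|\left(V - E_n^{(1)}\right)|\Phi_n\rangle + \langle\Psi_n^{(1)} + \chi|\left(H_0 - E_n^{(0)}\right)|\Psi_n^{(1)} + \chi\rangle $$
$$ = 2\,\mathrm{Re}\,\langle\Psi_n^{(1)}|\left(V - E_n^{(1)}\right)|\Phi_n\rangle + 2\,\mathrm{Re}\,\langle\chi|\left(V - E_n^{(1)}\right)|\Phi_n\rangle + \langle\Psi_n^{(1)}|\left(H_0 - E_n^{(0)}\right)|\Psi_n^{(1)}\rangle + 2\,\mathrm{Re}\,\langle\chi|\underbrace{\left(H_0 - E_n^{(0)}\right)|\Psi_n^{(1)}\rangle}_{\left(E_n^{(1)} - V\right)|\Phi_n\rangle} + \langle\chi|\left(H_0 - E_n^{(0)}\right)|\chi\rangle \tag{104} $$

The second and fourth terms in the above equation cancel each other, so

$$ J_2\left[\tilde\Psi_n^{(1)}\right] = J_2\left[\Psi_n^{(1)}\right] + \langle\chi|\left(H_0 - E_n^{(0)}\right)|\chi\rangle \tag{105} $$

If $E_n^{(0)}$ is the lowest eigenvalue of $H_0$ then the integral $\langle\chi|\left(H_0 - E_n^{(0)}\right)|\chi\rangle$ is nonnegative, and zero if and only if $\chi$ is the corresponding eigenfunction. Therefore,

$$ J_2\left[\tilde\Psi_n^{(1)}\right] \ge E_n^{(2)} \tag{106} $$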

Thus $J_2\left[\tilde\Psi_n^{(1)}\right]$, with an arbitrary trial function $\tilde\Psi_n^{(1)}$ containing adjustable parameters, can be used in a variational approach for finding approximations to the first-order wave function and second-order energy, and this provides an upper bound to $E_n^{(2)}$ in the case of a state having the lowest zero-order energy (provided that $\Phi_n$ is an exact eigenfunction of $H_0$).
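The upper-bound property can be illustrated in a finite basis. In the sketch below the trial first-order function is represented by its real expansion coefficients in the eigenbasis of $H_0$, and the perturbation matrix is real symmetric; the function names and the 3x3 numbers are hypothetical.

```python
def second_order_energy(e0, V, n=0):
    """Sum-over-states second-order energy for state n."""
    return -sum(V[k][n] ** 2 / (e0[k] - e0[n]) for k in range(len(e0)) if k != n)


def exact_first_order(e0, V, n=0):
    """Coefficients of the exact first-order wave function (intermediate normalization)."""
    return [0.0 if k == n else V[k][n] / (e0[n] - e0[k]) for k in range(len(e0))]


def hylleraas_j2(c, e0, V, n=0):
    """Hylleraas functional for real trial coefficients c in the H0 eigenbasis."""
    e1 = V[n][n]
    j2 = 0.0
    for k, ck in enumerate(c):
        vk = V[k][n] - (e1 if k == n else 0.0)   # <Phi_k|(V - E1)|Phi_n>
        j2 += 2.0 * ck * vk + ck * ck * (e0[k] - e0[n])
    return j2


e0 = [0.0, 1.0, 2.5]                        # state 0 has the lowest zero-order energy
V = [[0.1, 0.2, -0.3],
     [0.2, 0.0, 0.4],
     [-0.3, 0.4, 0.5]]

e2 = second_order_energy(e0, V)
c_exact = exact_first_order(e0, V)
c_trial = [c + d for c, d in zip(c_exact, [0.0, 0.1, -0.2])]  # exact function plus an error chi

print(e2, hylleraas_j2(c_exact, e0, V), hylleraas_j2(c_trial, e0, V))
```

The functional equals the second-order energy at the exact coefficients and lies above it for the perturbed ones, by exactly the quadratic form in the error.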

C. Møller-Plesset perturbation theory

The role of the many-body theory is to evaluate the energy expressions of the different orders of RSPT, which contain many-electron wave functions, in terms of orbital contributions. The matrix elements should be expressed in terms of integrals over one-electron functions. In the course of quantum mechanical application, the following points should be clarified:

1. The nonrelativistic Born-Oppenheimer many-body Hamiltonian projected onto a given basis set can be most conveniently specified in the usual second-quantized form. The underlying basis set is assumed to be orthonormalized; MBPT calculations are usually performed in the molecular orbital (MO) basis, which meets this criterion.

2. The choice of the zeroth-order Hamiltonian is arbitrary; any Hermitian operator would do in principle. One wishes to choose $H_0$ as close to $H$ as possible in order to obtain favorable convergence properties of the perturbation series. On the other hand, $H_0$ should be as simple as possible, since one should be able to diagonalize it and obtain its complete set of eigenfunctions. A practical balance between these two conflicting requirements is to choose $H_0$ as the Fock operator:

$$ H_0 = F = \sum_p \varepsilon_p\, p^\dagger p \tag{107} $$

in terms of molecular spinorbital operators $p$ and orbital energies $\varepsilon_p$. By this choice, the perturbation operator $V$ describes the electron correlation (the error of the Hartree-Fock approach), and the aim of the perturbation calculation is to improve the HF energy towards the exact solution of the Schrödinger equation in the same basis set. This is the so-called Møller-Plesset partitioning.

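The formal expansion of the MPPT partitioned Hamiltonian may be written as

$$ H = H_0 + V \;\Rightarrow\; V = H - F = \underbrace{\sum_{i<j}\frac{1}{r_{ij}}}_{H_2} - \underbrace{\sum_i\left[J(i) - K(i)\right]}_{U} \tag{108} $$

where, with curly braces denoting a normal product relative to the Fermi vacuum,

$$ H_2 = \frac{1}{4}\sum_{pqrs}\langle pq||rs\rangle\{p^\dagger q^\dagger s r\} + \sum_{pqi}\langle pi||qi\rangle\{p^\dagger q\} + \frac{1}{2}\sum_{ij}\langle ij||ij\rangle \tag{109} $$

and

$$ U = \sum_{pqi}\langle pi||qi\rangle\{p^\dagger q\} + \langle 0|U|0\rangle = \sum_{pqi}\langle pi||qi\rangle\{p^\dagger q\} + \sum_{ij}\langle ij||ij\rangle \tag{110} $$

Then we can write

$$ V = H_2 - U = \frac{1}{4}\sum_{pqrs}\langle pq||rs\rangle\{p^\dagger q^\dagger s r\} - \frac{1}{2}\sum_{ij}\langle ij||ij\rangle \tag{111} $$

In the above derivation no assumption has been made about $F$. We can now assume canonical HF orbitals, for which (107) is valid.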

3. Accepting the partition described by (107), the solution of the zeroth-order equation involves the solution of the Hartree-Fock problem. We have to specify the ground state and the excited many-electron states explicitly. The ground state is simply the Fermi vacuum,

$$ |\Psi_0^{(0)}\rangle = |\text{Fermi vacuum}\rangle = |HF\rangle = |0\rangle \tag{112} $$

The excited states can be classified according to the number of electrons to be excited.

Singly excited states are given by

$$ \Psi_K^{(0)} = a^\dagger i|0\rangle \tag{113} $$

where $K$ labels the $i \to a$ excitation. Equation (113) expresses that an electron is annihilated from spinorbital $i$ and inserted into $a$.

A doubly excited state is given by

$$ \Psi_K^{(0)} = a^\dagger b^\dagger j i|0\rangle \tag{114} $$

where $K$ labels the double excitation $i \to a$, $j \to b$.

Now let us evaluate the RSPT formulae using second quantization. Starting from zeroth order, and writing $\{\;\}$ for the normal product relative to the Fermi vacuum, we have

$$ H_0\Psi_0^{(0)} = F|0\rangle = \sum_i \varepsilon_i\{i^\dagger i\}|0\rangle + \sum_i \varepsilon_i|0\rangle = \sum_i^{occ}\varepsilon_i|0\rangle \tag{115} $$

Here only the hole-hole pair has contributed. So the Fermi vacuum is the zeroth-order ground-state eigenfunction of $H_0$, and $\sum_i \varepsilon_i$ is the sum of the energies of the occupied orbitals and not the HF energy. The first-order contribution is given by

$$ E_0^{(1)} = \langle 0|V|0\rangle \tag{116} $$

It follows that the energy to first order is

$$ E = E_0^{(0)} + E_0^{(1)} = \langle 0|H_0|0\rangle + \langle 0|V|0\rangle = \langle 0|H_0 + V|0\rangle = \langle 0|H|0\rangle = \langle 0|H_1 + H_2|0\rangle \tag{117} $$

Using the second-quantized form of $H$ mentioned previously, we can express the one-electron and two-electron parts as

$$ H_1 = H_{1,N} + \sum_i \langle i|h|i\rangle, \qquad H_2 = H_{2,N} + H'_{2,N} + \frac{1}{2}\sum_{ij}\langle ij|v|ij\rangle \tag{118} $$

Substituting (118) into (117) we get

$$ E = \sum_i \langle i|h|i\rangle + \frac{1}{2}\sum_{ij}\langle ij|v|ij\rangle \tag{119} $$

$$ E = E_{ref} = E_{HF} \tag{120} $$

which is the expectation value of the full Hamiltonian with the Hartree-Fock wave function, i.e. the Hartree-Fock electronic energy. We see that, using the Møller-Plesset partitioning, the first order of perturbation theory corrects the sum of orbital energies to the true HF energy.

The first-order energy can then be written as

$$ E_0^{(1)} = \langle 0|H|0\rangle - \langle 0|H_0|0\rangle = -\sum_i \varepsilon_i + \sum_i h_{ii} + \frac{1}{2}\sum_{ij}\langle ij||ij\rangle = -\sum_i u_{ii} + \frac{1}{2}\sum_{ij}\langle ij||ij\rangle = -\sum_{ij}\langle ij||ij\rangle + \frac{1}{2}\sum_{ij}\langle ij||ij\rangle = -\frac{1}{2}\sum_{ij}\langle ij||ij\rangle \tag{121} $$

where we have used $F = H + U$ and $f_{pq} = h_{pq} + u_{pq}$.

In deriving the second-order result, the explicit form of the perturbation operator $V$ should be specified. We can write

$$ V = H - H_0 \tag{122} $$

To evaluate the second-order formula, the only matrix element we need is $V_{0K}$, since the second-order energy correction can also be written as

$$ E_0^{(2)} = -\sum_{K\neq 0}\frac{|V_{0K}|^2}{E_K^{(0)} - E_0^{(0)}} \tag{123} $$

where $K$ labels an excited state. In principle, it can be a p-fold excited state with p = 1, 2, 3, .... However, it is easy to show that only p = 2 contributes to $V_{0K}$. Let us first check the role of singly excited states. From the Brillouin theorem we know that the full Hamiltonian does not have such a matrix element:

$$ H_{0K} = \langle\Psi_0^{(0)}|H|\Psi_K^{(0)}\rangle = 0 \tag{124} $$

that is,

$$ \langle\Psi_0^{(0)}|H_0 + V|\Psi_K^{(0)}\rangle = E_K^{(0)}\langle\Psi_0^{(0)}|\Psi_K^{(0)}\rangle + \langle\Psi_0^{(0)}|V|\Psi_K^{(0)}\rangle = V_{0K} = 0 \tag{125} $$

where the zeroth-order Schrödinger equation and the orthogonality of the zeroth-order states are utilized. It follows that $V_{0K} = 0$ if $K$ is a singly excited state.

Any state more than doubly excited also gives zero contribution to $V_{0K}$, because $V$ contains at most two-electron terms; by the Slater-Condon rule, a two-electron operator between determinants with more than two non-coincidences gives zero.

So only doubly excited states contribute to the matrix element $V_{0K}$, and thus only they enter the second-order formula (123). With this result, the matrix element $V_{0K}$ can be evaluated:

$$ V_{0K} = \langle\Psi_0^{(0)}|H - H_0|\Psi_K^{(0)}\rangle \tag{126} $$

First, we evaluate the one-electron part of $V$ using the generalized Wick's theorem and (118),

$$ \langle\Psi_0^{(0)}|H_1|\Psi_K^{(0)}\rangle = \sum_{pq} h_{pq}\langle\Psi_0^{(0)}|p^\dagger q|\Psi_K^{(0)}\rangle = \sum_{pq} h_{pq}\langle\Psi_0^{(0)}|p^\dagger q\, a^\dagger b^\dagger j i|\Psi_0^{(0)}\rangle \tag{127} $$

where $p$ and $q$ include both hole and particle states. The second term in $F$ will not contribute because $\Psi_0^{(0)}$ and $\Psi_K^{(0)}$ are orthogonal.

$$ = \sum_{pq} h_{pq}\langle\Psi_0^{(0)}|\,\{p^\dagger q\, a^\dagger b^\dagger j i\} + (\text{terms with one contraction}) + \dots + \text{all allowed contractions}\,|\Psi_0^{(0)}\rangle \tag{128} $$

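None of the terms obtained here is fully contracted, so the vacuum expectation value of such an operator vanishes. The same argument holds for $H_0$. So the only non-zero contribution comes from the two-electron operator of $H$. Now,

$$ \langle\Psi_0^{(0)}|H_2|\Psi_K^{(0)}\rangle = \langle\Psi_0^{(0)}|H_{2,N} + H'_{2,N} + \frac{1}{2}\sum_{ij}\langle ij|v|ij\rangle|\Psi_K^{(0)}\rangle \tag{129} $$

Here, again, the second term is zero because $H'_{2,N}$ has a form similar to $H_0$, and the third term vanishes because of orthogonality. Then,

$$ \langle\Psi_0^{(0)}|H_2|\Psi_K^{(0)}\rangle = \langle\Psi_0^{(0)}|H_{2,N}|\Psi_K^{(0)}\rangle = \frac{1}{2}\sum_{pqrs}\langle pq|v|rs\rangle\langle\Psi_0^{(0)}|p^\dagger q^\dagger s r\, a^\dagger b^\dagger j i|\Psi_0^{(0)}\rangle \tag{130} $$

Using the generalized Wick's theorem and collecting the fully contracted terms with non-zero contractions,

$$ = \frac{1}{2}\sum_{pqrs}\langle pq|v|rs\rangle\langle\Psi_0^{(0)}|\,\{p^\dagger q^\dagger s r\, a^\dagger b^\dagger j i\} + (\text{the four full contractions pairing } p^\dagger, q^\dagger \text{ with } i, j \text{ and } s, r \text{ with } a^\dagger, b^\dagger)\,|\Psi_0^{(0)}\rangle \tag{131} $$

The first term has no contribution, so

$$ = \frac{1}{2}\sum_{pqrs}\langle pq|v|rs\rangle\left[-\delta_{pi}\delta_{qj}\delta_{sa}\delta_{rb} + \delta_{pi}\delta_{qj}\delta_{sb}\delta_{ra} - \delta_{pj}\delta_{qi}\delta_{sb}\delta_{ra} + \delta_{pj}\delta_{qi}\delta_{sa}\delta_{rb}\right] $$
$$ = \frac{1}{2}\left[-\langle ij|v|ba\rangle + \langle ij|v|ab\rangle - \langle ji|v|ab\rangle + \langle ji|v|ba\rangle\right] = \langle ij|v|ab\rangle - \langle ij|v|ba\rangle = \langle ij||ab\rangle \tag{132} $$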

We then collect all non-zero terms and substitute them into (123). The excitation energy in the denominator of the second-order formula is determined by the change in the sum of the orbital energies due to the change in the occupancy of the orbitals upon excitation:

$$ E_0^{(2)} = -\sum_{a<b,\; i<j}\frac{|\langle ij||ab\rangle|^2}{\varepsilon_a + \varepsilon_b - \varepsilon_i - \varepsilon_j} \tag{133} $$

Equation (133) is the second-order Møller-Plesset (MP2) formula for the correction energy in terms of the spinorbitals.
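Once the antisymmetrized integrals and orbital energies are available, the sum in (133) is a few lines of code. The sketch below is only an illustration: the function name `mp2_energy`, the callable `g` that returns $\langle ij||ab\rangle$, and the toy numbers are all assumptions, not output of any electronic-structure program.

```python
from itertools import combinations


def mp2_energy(eps, occ, vir, g):
    """Second-order energy as a restricted sum over i < j (occupied) and a < b (virtual).

    eps: orbital energies indexed by spinorbital
    occ, vir: sorted lists of occupied and virtual spinorbital indices
    g(i, j, a, b): antisymmetrized two-electron integral <ij||ab>
    """
    e2 = 0.0
    for i, j in combinations(occ, 2):        # pairs with i < j
        for a, b in combinations(vir, 2):    # pairs with a < b
            denom = eps[a] + eps[b] - eps[i] - eps[j]
            e2 -= abs(g(i, j, a, b)) ** 2 / denom
    return e2


# Toy model: two occupied and two virtual spinorbitals, one double excitation.
eps = [-1.0, -1.0, 0.5, 0.5]
occ, vir = [0, 1], [2, 3]


def toy_integral(i, j, a, b):
    """Hypothetical constant value for <01||23>."""
    return 0.1


e2 = mp2_energy(eps, occ, vir, toy_integral)
print(e2)
```

In this toy case the single term gives $-0.1^2/3$. In a real calculation `g` would look up transformed integrals, and the denominators stay positive as long as there is a gap between occupied and virtual orbital energies.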

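Similarly, for the third-order formula (94), we have already calculated $V_{00} = E_0^{(1)}$ and $V_{0K}$. The only unknown matrix element is $V_{KJ} = \langle\Psi_K^{(0)}|V|\Psi_J^{(0)}\rangle$. Here also only doubly excited states with the two-electron operator term will contribute, as mentioned before. Using the generalized Wick's theorem we can write

$$ \langle\Psi_{lm}^{cd}|p^\dagger p|\Psi_{ij}^{ab}\rangle = 0, \qquad \langle\Psi_{lm}^{cd}|p^\dagger q|\Psi_{ij}^{ab}\rangle = 0 \tag{134} $$

Then,

$$ V_{KJ} = \langle\Psi_K^{(0)}|V|\Psi_J^{(0)}\rangle = \langle\Psi_0^{(0)}|l^\dagger m^\dagger d c\; p^\dagger q^\dagger s r\; a^\dagger b^\dagger j i|\Psi_0^{(0)}\rangle $$
$$ = \langle\Psi_0^{(0)}|\,(\text{sum of the contracted forms of } l^\dagger m^\dagger d c\; p^\dagger q^\dagger s r\; a^\dagger b^\dagger j i) + \dots + \text{many terms}\,|\Psi_0^{(0)}\rangle \tag{135} $$

The remaining work is left as an exercise. If we evaluate all fully contracted terms, we end up with

$$ E_0^{(3)} = \frac{1}{8}\sum_{i,j,l,m}^{occ}\sum_{a,b}^{vir}\frac{\langle ij||ab\rangle\langle ab||lm\rangle\langle lm||ij\rangle}{(\varepsilon_a + \varepsilon_b - \varepsilon_i - \varepsilon_j)(\varepsilon_a + \varepsilon_b - \varepsilon_l - \varepsilon_m)} + \frac{1}{8}\sum_{i,j}^{occ}\sum_{a,b,c,d}^{vir}\frac{\langle ij||ab\rangle\langle ab||cd\rangle\langle cd||ij\rangle}{(\varepsilon_a + \varepsilon_b - \varepsilon_i - \varepsilon_j)(\varepsilon_c + \varepsilon_d - \varepsilon_i - \varepsilon_j)} + \sum_{i,j,l}^{occ}\sum_{a,b,c}^{vir}\frac{\langle ij||ab\rangle\langle lb||cj\rangle\langle ac||il\rangle}{(\varepsilon_a + \varepsilon_b - \varepsilon_i - \varepsilon_j)(\varepsilon_a + \varepsilon_c - \varepsilon_i - \varepsilon_l)} \tag{136} $$

This is the MP3 formula for the correction energy in terms of the spinorbitals.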


D. Diagrammatic expansions for MPPT

1. Diagrammatic notation

Evaluating terms in the second-quantization treatment can sometimes be cumbersome and error-prone, so a diagrammatic notation was introduced to make the calculations easier. It helps to list all nonvanishing distinct terms in the perturbation sums, to elucidate certain cancellations in these sums, and to provide a systematic framework for the discussion and manipulation of the various surviving terms.

Time ordering represents the time sequence in the application of various operators, and this is indicated in the diagrams by means of a time axis for the sequence of events. Another common arrangement is to place the time axis horizontally, from right to left. The actual time at which each event occurs (i.e. an operator acts) is irrelevant; only the sequence is significant.

FIG. 1. Time Ordering (time axis t)

We start with the representation of a Slater determinant (SD). The reference state (the Fermi vacuum) is represented by nothing, i.e. by a position on the time axis at which there are no lines or other symbols. Any other SD is represented by vertical or diagonal directed lines, pointing upward for particles and downward for holes, with labels identifying the spinorbitals.

$$ |\Psi_i^a\rangle = a^\dagger i|0\rangle $$

[Diagram: hole line i and particle line a]

The horizontal double line represents the point of operation of the normal-product operator, and below or above it we have the Fermi vacuum. To avoid phase ambiguity, we can indicate which particle index appears above which hole index.

$$ |\Psi_{ij}^{ab}\rangle = a^\dagger b^\dagger j i|0\rangle $$

[Diagram: hole line i, particle line a, hole line j, particle line b]

2. One-particle operator

Now we consider the representation of operators. We begin with a one-electronoperator in the normal form, say,

UN =∑pq

〈p|u|q〉p†q (137)

acting on singly excited Slater determinant |Ψ ai 〉 = a†i|0〉 The action and representation

of the individual terms in the sum over p, q in (137) will depend on whether p and q areparticle or hole indices. For illustration we will consider two cases particle−particle(pp)and particle − hole(ph) only and remaining cases are left as an exercise.

We begin with a (pp) term. Applying the one-electron operator to the singly excited Slater determinant, we obtain (using the generalized Wick's theorem)

$$ \langle b|u|c\rangle\, b^\dagger c\, a^\dagger i\,|0\rangle = \langle b|u|c\rangle\,\delta_{ac}\,|\Psi_i^b\rangle = \langle b|u|a\rangle\,|\Psi_i^b\rangle \tag{138} $$

which is represented by the diagram

[Diagram: hole line i; particle line a entering the vertex and particle line b leaving it.]

Here at the bottom we had $|\Psi_i^a\rangle$ and at the top we have $|\Psi_i^b\rangle$, the resulting determinant. The point of action of the operator is marked by the interaction line (or vertex). We associate the integral $\langle b|u|a\rangle$ with the vertex as a multiplicative factor. Note that the bra spinorbital in the integral corresponds to the line leaving the vertex, while the ket corresponds to the entering line.

Similarly, for a (ph) term,

$$ \langle b|u|j\rangle\, b^\dagger j\, a^\dagger i\,|0\rangle = \langle b|u|j\rangle\,|\Psi_{ij}^{ab}\rangle \tag{139} $$

showing that the resulting determinant is $|\Psi_{ij}^{ab}\rangle$.

[Diagram: lines i and a; particle line b and hole line j attached to the vertex.]

The following principles are used to draw diagrams of this kind:

1. The interaction is denoted by a dotted, horizontal line and the electron orbitals involved in that interaction by solid, vertical lines, connected with the interaction line at a vertex.

2. A core orbital is represented by a line directed downwards (hole line) and a virtual orbital by a line directed upwards (particle line).

3. The orbitals belonging to the initial state (to the right in the matrix element) have their arrows pointing toward the interaction vertex, those of the final state away from the vertex.

3. Two-particle operators

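We now turn to a two-particle operator in normal-product form,

$$ W = \frac{1}{2}\sum_{pqrs} \langle pq|rs\rangle\, p^\dagger q^\dagger s r = \frac{1}{4}\sum_{pqrs} \langle pq\|rs\rangle\, p^\dagger q^\dagger s r \tag{140} $$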

This operator is denoted by an interaction line connecting two half-vertices at the same level (i.e. the same point on the time axis). The two half-vertices and the interaction line constitute a single vertex. Each individual half-vertex has one incoming and one outgoing line, each of which may be a particle line or a hole line. The association of line labels with the two-electron integral indices and the creation or annihilation operators follows the same rule as for one-body vertices:

$$ \begin{aligned} \text{incoming line} &\leftrightarrow \text{annihilation operator} \leftrightarrow \text{ket state} \\ \text{outgoing line} &\leftrightarrow \text{creation operator} \leftrightarrow \text{bra state} \end{aligned} \tag{141} $$

$$ \begin{aligned} \text{electron 1} &\leftrightarrow \text{left half-vertex} \\ \text{electron 2} &\leftrightarrow \text{right half-vertex} \end{aligned} \tag{142} $$

The integral indices associated with a two-body vertex are assigned according to the scheme

$$ \langle \text{left-out}\ \ \text{right-out} \,|\, \text{left-in}\ \ \text{right-in} \rangle \tag{143} $$

while the corresponding operator product can be described by

$$ (\text{left-out})^\dagger\, (\text{right-out})^\dagger\, (\text{left-in})\, (\text{right-in}) \tag{144} $$

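Diagrams employing this representation of the two-body interaction (which is based on non-antisymmetrized integrals) are called Goldstone diagrams. Consider a simple example, the vacuum expectation value of $W^2$. Using Wick's theorem we obtain

$$ \begin{aligned} \langle 0|W^2|0\rangle &= \frac{1}{2}\sum_{pqrs} \langle pq|rs\rangle\, \frac{1}{2}\sum_{tuvw} \langle tu|vw\rangle\, \langle 0|p^\dagger q^\dagger s r\, t^\dagger u^\dagger w v|0\rangle \\ &= \frac{1}{4}\Big[ \sum_{abij} \langle ij|ab\rangle\langle ab|ij\rangle - \sum_{abij} \langle ij|ab\rangle\langle ba|ij\rangle \Big] + \frac{1}{4}\Big[ -\sum_{abij} \langle ij|ab\rangle\langle ab|ji\rangle + \sum_{abij} \langle ij|ab\rangle\langle ba|ji\rangle \Big] \end{aligned} \tag{145} $$

Each of the four terms corresponds to one diagram, drawn using the rules defined earlier. The first and fourth diagrams are equivalent (by exchange of the two half-vertices at the top or bottom), and so are the second and third, so we keep only the first and third terms in the sum, each counted twice. The correct phase factor is given by the rule $(-1)^{h-l}$, where h is the number of hole lines and l is the number of loops. So,

$$ \langle 0|W^2|0\rangle = \frac{1}{2}\sum_{abij} \langle ij|ab\rangle\langle ab|ij\rangle - \frac{1}{2}\sum_{abij} \langle ij|ab\rangle\langle ab|ji\rangle \tag{146} $$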

The factor 1/2 derives from the fact that each of these diagrams is symmetric under reflection in a vertical plane through its middle. If we want to do similar calculations for a matrix element like $\langle 0|W^3|0\rangle$, the Goldstone representation leads to a rapidly growing number of distinct diagrams as the number of interaction vertices increases, reflecting the individual listing of each possible exchange. There is also some difficulty in making sure that all those distinct possibilities have been listed exactly once, since it is not always easy to determine whether two diagrams are equivalent. However, the advantage of Goldstone diagrams is the straightforward determination of phase factors.

The difficulties associated with the use of the Goldstone representation can be overcome by basing the analysis on the antisymmetric integrals $\langle pq\|rs\rangle$. Since the exchange contribution is incorporated within each antisymmetrized integral, such an approach leads to a much smaller number of distinct diagrams. The diagrams using this representation of the W operator are called Hugenholtz diagrams.

4. Hugenholtz diagrams

Hugenholtz diagrams maintain the usual (Goldstone) form for one-body operators but represent the two-body vertex as a single large dot with two incoming and two outgoing lines (each of which can be a particle or hole line). The labels on the outgoing lines appear in the bra part of the antisymmetrized integral, while the incoming labels appear in the ket part. The order of the labels in each part is indeterminate, and therefore the phase of the corresponding algebraic interpretation is indeterminate.

The Hugenholtz representation of the $\langle 0|W^2|0\rangle$ matrix element has just one distinct diagram instead of two,

$$ \langle 0|W^2|0\rangle = [\text{Hugenholtz diagram}] = \frac{1}{4}\sum_{abij} \langle ij\|ab\rangle\langle ab\|ij\rangle $$

Expansion of the antisymmetrized integrals in terms of ordinary integrals gives four terms, which are equal in pairs, reproducing the two-term result obtained with Goldstone diagrams. The weight factor 1/4 is obtained by counting the number of pairs of equivalent lines in the diagram: a pair of lines is equivalent if they connect the same pair of vertices in the same direction. Each pair of equivalent lines contributes a factor 1/2. The diagram for $\langle 0|W^2|0\rangle$ has two such pairs, resulting in a weight factor 1/4. Comparing the Goldstone and Hugenholtz representations for a further example is left as an exercise; it is a good way to convince yourself of the power of the two representations.
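The equality of the Goldstone result, Eq. (146), and the single Hugenholtz term can be checked numerically. The following is a minimal sketch, assuming real orbitals (so that $\langle ab|ij\rangle = \langle ij|ab\rangle$) and using random numbers in place of actual integrals; the function names and the dictionary layout `g[i, j, a, b]` are illustrative choices, not part of the notes.

```python
import itertools
import random

random.seed(0)
occ, virt = range(2), range(3)

# Hypothetical real integrals <ij|ab>, symmetrized so that <ij|ab> = <ji|ba>
# (exchange of the two electron labels).
raw = {key: random.uniform(-1, 1)
       for key in itertools.product(occ, occ, virt, virt)}
g = {(i, j, a, b): 0.5 * (raw[i, j, a, b] + raw[j, i, b, a])
     for (i, j, a, b) in raw}

def goldstone_w2(g):
    """Eq. (146): direct and exchange diagrams, each with weight 1/2."""
    direct = sum(g[i, j, a, b] * g[i, j, a, b] for (i, j, a, b) in g)
    # For real orbitals <ab|ji> = <ji|ab>
    exchange = sum(g[i, j, a, b] * g[j, i, a, b] for (i, j, a, b) in g)
    return 0.5 * direct - 0.5 * exchange

def hugenholtz_w2(g):
    """Single Hugenholtz diagram: (1/4) sum <ij||ab><ab||ij>."""
    return 0.25 * sum((g[i, j, a, b] - g[i, j, b, a]) ** 2
                      for (i, j, a, b) in g)

print(goldstone_w2(g), hugenholtz_w2(g))
```

The two printed numbers agree to rounding error, which reflects the statement that the four terms from expanding the antisymmetrized integrals are equal in pairs.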


5. Antisymmetrized Goldstone diagrams

The construction and interpretation of antisymmetrized Goldstone (ASG) diagrams can be summarized by the following rules:

1. Generate all distinct Hugenholtz skeletons.

2. For each skeleton assign arrows in all distinct ways to generate Hugenholtz diagrams.

3. Expand each Hugenholtz diagram into an ASG diagram in any of the possible equivalent ways.

4. Interpret each two-body vertex in each ASG diagram in terms of an antisymmetrized integral, with the usual $\langle \text{left-out}\ \text{right-out} \,\|\, \text{left-in}\ \text{right-in} \rangle$ arrangement.

5. Interpret each one-body vertex in each ASG diagram as in ordinary Goldstone diagrams.

6. Assign a phase factor $(-1)^{h-l}$, as for ordinary Goldstone diagrams.

7. Assign a weight factor $\left(\tfrac{1}{2}\right)^n$, where n is the number of equivalent line pairs; two lines are equivalent if they connect the same two vertices in the same direction.
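Rules 6 and 7 are simple bookkeeping and can be sketched in a few lines. The helper below is a hypothetical illustration (its name and argument names are not from the notes): it takes the counts read off a diagram and returns the numerical prefactor.

```python
from fractions import Fraction

def asg_prefactor(n_hole_lines, n_loops, n_equivalent_pairs):
    """Phase (-1)**(h - l) times weight (1/2)**n for an ASG diagram."""
    phase = -1 if (n_hole_lines - n_loops) % 2 else 1
    weight = Fraction(1, 2) ** n_equivalent_pairs
    return phase * weight

# <0|W^2|0>: two hole lines, two loops, two pairs of equivalent lines
print(asg_prefactor(2, 2, 2))
```

For the $\langle 0|W^2|0\rangle$ diagram this gives the factor 1/4 quoted in the Hugenholtz discussion.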

6. Diagrammatic representation of RSPT

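The zero- and first-order energies are given by

$$ E_0^{(0)} = \sum_i \varepsilon_i \tag{147} $$

$$ E_0^{(1)} = -\frac{1}{2}\sum_{ij} \langle ij|V|ij\rangle \tag{148} $$

The second-order energy expression can alternatively be written in the following equivalent form, which is more useful,

$$ E_0^{(2)} = \sum_{K\neq 0} \frac{\langle \Psi_0^{(0)}|V|\Psi_K^{(0)}\rangle \langle \Psi_K^{(0)}|V|\Psi_0^{(0)}\rangle}{E_0^{(0)} - E_K^{(0)}} = \langle \Psi_0^{(0)}|V R_0 V|\Psi_0^{(0)}\rangle = \frac{1}{4}\sum_{a,b,i,j} \frac{|\langle ij\|ab\rangle|^2}{\varepsilon_i + \varepsilon_j - \varepsilon_a - \varepsilon_b} \tag{149} $$

where

$$ R_0 = \sum_{K\neq 0} \frac{|\Psi_K^{(0)}\rangle \langle \Psi_K^{(0)}|}{E_0^{(0)} - E_K^{(0)}} \tag{150} $$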

is called the resolvent operator. Its presence in an expression is represented diagrammatically by a thin horizontal line cutting the particle-hole lines, as shown in the figure.

[Diagram: particle-hole lines a, i, j, b cut by a thin horizontal line labeled R0.]

$R_0$ does not change the state on which it operates; it only represents the division by the energy denominator. Therefore any particle or hole lines present below the point of action of $R_0$ continue unchanged above it.

The sum of the zero- and first-order energies is the Hartree-Fock energy; the MP2 and MP3 expressions we have derived are correlation-energy corrections to it. In practice, MP2 is relatively inexpensive and gives a significant improvement. In principle, one could go to higher orders of perturbation theory (MP3, MP4, etc.), but the computer programs become much harder to write, and the results (perhaps surprisingly) do not necessarily get any better.

E. Time versions

Diagrams that can be transformed into one another by topological deformations that do not preserve the order of the operators along the time axis are referred to as time versions of the same diagram.

1. Time version of the first kind

Time versions of the first kind are obtained from one another by permutations of vertices that do not change the particle-hole character of any fermion line in the diagram. For example, the diagrams in Fig. (2) are time versions of the first kind of a fourth-order energy diagram with two U vertices.

2. Time version of the second kind

When the vertex permutation changes the hole-particle character of at least one line, we obtain time versions of the second kind. Thus the diagram in Fig. (3) is a time version of the second kind of the first diagram.

FIG. 2. Time version of first kind.

FIG. 3. Time version of second kind.

F. Connected and disconnected diagrams

In the second-order wavefunction we have both disconnected and connected diagrams.

$$ |\Psi^{(2)}\rangle = R_0 W R_0 W|0\rangle $$

Fig. (4) shows all the possible Hugenholtz diagrams for the second-order wavefunction contribution. We have one disconnected diagram (i), which yields a quadruply excited contribution, while the remaining diagrams are connected and correspond to triply (ii), (iii), doubly (iv), (v), (vi) and singly (vii), (viii) excited contributions. No vacuum contribution can arise, since any diagram having only internal lines (i.e. an energy diagram) would lead to a dangerous denominator.

FIG. 4. Second order wavefunction correction.

FIG. 5. Third order wavefunction correction.

G. Linked and unlinked diagrams

We have seen in the preceding section that for the wavefunction contribution we obtain disconnected diagrams already in the second order. In higher orders of perturbation theory disconnected diagrams of another type occur. For example, a few possible third-order wavefunction diagrams that are disconnected are shown in Fig. (5). All these diagrams are bona fide wavefunction diagrams (i.e., they have no dangerous denominators). However, we shall see that the last diagram (iii) has a very different character from the first two diagrams (i) and (ii), since it contains an energy diagram as a disconnected part.

We shall refer to energy diagrams, which have no external lines, as vacuum diagrams (or vacuum parts when they form a disconnected part of some diagram), since they represent Fermi vacuum mean values. Further, a diagram is unlinked if it has at least one disconnected vacuum part, and linked if it has no disconnected vacuum part. Any unlinked diagram is by definition a disconnected diagram, while a linked diagram can be either connected or disconnected. In the latter case, however, none of its disconnected parts can be a vacuum diagram. On the other hand, a connected diagram is always linked (even if it is a vacuum diagram), while a disconnected diagram can be either linked or unlinked, depending on whether or not all of its disconnected parts are of a non-vacuum type.

Obviously, each unlinked diagram has a number of time versions of the first kind, since its disconnected parts can be positioned relative to one another in all distinct ways that do not introduce dangerous denominators. Thus, in the case of the diagram in Fig. (5(iii)), there are two possible time versions, as shown in Fig. (6).

FIG. 6. Unlinked part of third order wavefunction correction.

The contributions from these two time versions differ only in the denominator part, since all the scalar factors associated with the vertices and all the operators associated with external lines are clearly identical. Designating the denominator of the vacuum part, considered as a separate diagram, by a and, similarly, the denominator of the part involving external lines (considered separately) by b, the contribution from both time versions is

$$ N\,\frac{1}{b(a+b)b} + N\,\frac{1}{a(a+b)b} $$

where N designates the identical numerator part. Carrying out the sum we get

$$ N\,\frac{1}{b(a+b)}\left(\frac{1}{b} + \frac{1}{a}\right) = N\,\frac{a+b}{b(a+b)ab} = N\,\frac{1}{ab^2} \tag{151} $$

FIG. 7. Linked disconnected diagrams for parts A and B.

The above result is easily seen to be precisely the contribution, except for the sign, from the third-order renormalization term in the third-order wavefunction correction,

$$ |\Psi^{(3)}\rangle = \Big[ \underbrace{(R_0 W)^3}_{\text{principal term}} - \underbrace{\langle 0|W R_0 W|0\rangle\, R_0^2 W}_{\text{renormalization term}} \Big] |0\rangle \tag{152} $$

The renormalization term is given, up to the sign, by the product of the second-order energy contribution and the first-order wavefunction contribution taken with the squared denominator (from $R_0^2$), and thus equals $-N\,\frac{1}{a}\frac{1}{b^2}$. Therefore, the renormalization term in Eq. (152) exactly cancels the contribution from the unlinked diagrams of Fig. (6) originating from the principal third-order term, given by Eq. (151).
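The denominator algebra behind this cancellation can be verified with exact rational arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the values of a and b are arbitrary negative integers standing in for energy denominators.

```python
from fractions import Fraction

def unlinked_time_versions(a, b):
    """Sum of the two time versions: 1/(b(a+b)b) + 1/(a(a+b)b)."""
    return Fraction(1, b * (a + b) * b) + Fraction(1, a * (a + b) * b)

def renormalization_factor(a, b):
    """Denominator factor of the renormalization term, up to sign: 1/(a b^2)."""
    return Fraction(1, a * b * b)

a, b = -3, -7  # arbitrary sample denominators
print(unlinked_time_versions(a, b), renormalization_factor(a, b))
```

Both expressions evaluate to the same fraction, so adding the renormalization term with its minus sign removes the unlinked contribution.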

H. Factorization lemma (Frantz and Mills)

Consider all the possible time versions of a linked diagram consisting of two disconnected parts, called A and B, as shown in Fig. (7). Let the set of energy denominators for part A alone be $\Delta^A_\mu$, $\mu = 1, \dots, m$, and for part B be $\Delta^B_\nu$, $\nu = 1, \dots, n$. The denominators are numbered along the time axis, i.e., the lowest denominators are $\Delta^A_1$ and $\Delta^B_1$ in parts A and B, respectively.

The denominator contribution from all time versions of the first kind, corresponding to all possible orderings of the vertices in parts A and B relative to one another, can be written as

$$ D^{AB}_{mn} = \sum_{\alpha,\beta} \prod_{p=1}^{m+n} \left( \Delta^A_{\alpha(p)} + \Delta^B_{\beta(p)} \right)^{-1} \tag{153} $$

where the summation extends over all sets of (m+n) integer pairs $\Gamma_p = (\alpha(p), \beta(p))$ such that $0 \le \alpha(p) \le m$ and $0 \le \beta(p) \le n$, defined as follows:

1. $\Gamma_1 = (1,0)$ or $\Gamma_1 = (0,1)$.

2. $\Gamma_{p+1} = (\alpha(p)+1, \beta(p))$ or $\Gamma_{p+1} = (\alpha(p), \beta(p)+1)$.

3. $\Gamma_{m+n} = (m,n)$, i.e. $\alpha(m+n) = m$, $\beta(m+n) = n$.

We also define $\Delta^A_0 = \Delta^B_0 = 0$.

For the separate disconnected parts the denominators are given by the products of $\Delta^A_\mu$ or $\Delta^B_\nu$, which can also be written using the general expression Eq. (153) as

$$ D^A_m = D^{AB}_{m0} = \prod_{\mu=1}^{m} \left( \Delta^A_\mu \right)^{-1} \quad \text{and} \quad D^B_n = D^{AB}_{0n} = \prod_{\nu=1}^{n} \left( \Delta^B_\nu \right)^{-1} \tag{154} $$

where we define $D^A_0 = D^B_0 = D^{AB}_{00} = 1$. The desired factorization lemma can now be simply stated as

$$ D^{AB}_{mn} = D^A_m D^B_n \tag{155} $$

The proof is easily carried out using mathematical induction. The lemma holds when m = 0 or n = 0, since

$$ D^{AB}_{m0} = D^A_m D^B_0 = D^A_m \quad \text{or} \quad D^{AB}_{0n} = D^A_0 D^B_n = D^B_n $$

in agreement with Eq. (154). Assume that the lemma holds for $M = m-1, N = n$ and for $M = m, N = n-1$, with $m, n \ge 1$, i.e. $D^{AB}_{MN} = D^A_M D^B_N$. Clearly, all the terms in $D^{AB}_{mn}$ can be divided into two disjoint classes according to whether the last (topmost) interaction occurs in the A or in the B subgraph. The last (top) denominator factor is always the same, as required by rule 3, namely $(\Delta^A_m + \Delta^B_n)$, so we can write

$$ D^{AB}_{mn} = \left( \Delta^A_m + \Delta^B_n \right)^{-1} \left( D^{AB}_{m-1,n} + D^{AB}_{m,n-1} \right) \tag{156} $$

since all the remaining factors are identical with those characterizing the disconnected diagrams that result when the top vertex is deleted, either in the A part ($M = m-1, N = n$) or in the B part ($M = m, N = n-1$). This result holds even when m or n equals 1.

Since the lemma Eq. (155) holds for $M = m-1, N = n$ and $M = m, N = n-1$ by assumption, we can write Eq. (156) as

$$ D^{AB}_{mn} = \left( \Delta^A_m + \Delta^B_n \right)^{-1} \left( D^A_{m-1} D^B_n + D^A_m D^B_{n-1} \right) \tag{157} $$

The denominators of the separate parts are given by a single product, Eq. (154), so we have

$$ D^A_m = D^A_{m-1} \left( \Delta^A_m \right)^{-1} \implies D^A_{m-1} = D^A_m \Delta^A_m $$

$$ D^B_n = D^B_{n-1} \left( \Delta^B_n \right)^{-1} \implies D^B_{n-1} = D^B_n \Delta^B_n $$

so we get

$$ D^{AB}_{mn} = \left( \Delta^A_m + \Delta^B_n \right)^{-1} \left( D^A_m \Delta^A_m D^B_n + D^A_m D^B_n \Delta^B_n \right) = D^A_m D^B_n $$

proving the lemma.
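The lemma can also be checked by brute force for small m and n by enumerating every relative ordering of the vertices, i.e. every path $\Gamma_1, \dots, \Gamma_{m+n}$ allowed by rules 1 to 3. The sketch below assumes arbitrary integer denominators; the function names `d_single` and `d_joint` are illustrative and not part of the notes.

```python
from fractions import Fraction
from itertools import combinations

def d_single(deltas):
    """Eq. (154): product of inverse denominators of one part."""
    out = Fraction(1)
    for d in deltas:
        out /= d
    return out

def d_joint(delta_a, delta_b):
    """Eq. (153): sum over all time versions of the first kind."""
    m, n = len(delta_a), len(delta_b)
    da = [0] + list(delta_a)  # Delta^A_0 = 0
    db = [0] + list(delta_b)  # Delta^B_0 = 0
    total = Fraction(0)
    # A path is fixed by choosing which of the m+n steps advance part A.
    for a_steps in combinations(range(m + n), m):
        a_steps = set(a_steps)
        alpha = beta = 0
        term = Fraction(1)
        for p in range(m + n):
            if p in a_steps:
                alpha += 1
            else:
                beta += 1
            term /= da[alpha] + db[beta]
        total += term
    return total

delta_a = [-2, -5]      # hypothetical denominators of part A
delta_b = [-3, -4, -1]  # hypothetical denominators of part B
print(d_joint(delta_a, delta_b), d_single(delta_a) * d_single(delta_b))
```

Exact fractions are used so that the comparison is not affected by rounding; the number of paths grows as the binomial coefficient $\binom{m+n}{m}$, which is why the factorized form is so much more convenient.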

I. Linked-cluster theorem

We saw that for the third-order correction to the wavefunction, the renormalization term was cancelled by the unlinked term from the principal term. This happens at all orders, so the contribution in each order is given by the linked diagrams only.

In this derivation the linked-diagram expansions for the wave function and energy are substituted into the recursive form Eq. (158) of the Schrödinger equation (this equation can be found in Shavitt's Chapter 2, Eq. (2.75)), and the factorization theorem is used to show that this expansion satisfies the equation. To prove this assertion we first rewrite Eq. (2.75) from Shavitt's book in a form appropriate for RSPT,

$$ |\Psi\rangle = |0\rangle + R_0 \left( W - \Delta E \right) |\Psi\rangle \tag{158} $$

$$ \Delta E = \langle 0|W|\Psi\rangle \tag{159} $$

where $R_0 \equiv R_0(E_0)$, $W = V - E^{(1)}$ and $\Delta E = E - E_{\rm ref} = E - E_0 - E^{(1)}$. The implicit equations (158), (159) for $|\Psi\rangle$ and $\Delta E$ are entirely equivalent to the Schrödinger equation.

We need to prove that these equations are satisfied by the linked-diagram expansions

$$ |\Psi\rangle = \sum_{n=0}^{\infty} \left[ (R_0 W)^n |0\rangle \right]_L \tag{160} $$

$$ \Delta E = \sum_{n=1}^{\infty} \langle 0|W (R_0 W)^n |0\rangle_L \tag{161} $$

where the subscript L indicates that the summations are limited to linked diagrams only (note that the n = 0 term is missing in the summation for $\Delta E$ in Eq. (161) because $\langle 0|W|0\rangle = 0$). We are going to prove this assertion by substituting Eq. (160) and Eq. (161) into the recursive equations (158), (159) and showing that the latter are then satisfied.

We first substitute Eq. (160) in Eq. (159), obtaining

$$ \Delta E = \sum_{n=1}^{\infty} \langle 0|W \left[ (R_0 W)^n |0\rangle \right]_L \tag{162} $$

It is easy to verify that all the closed diagrams that can be formed by adding a new top vertex to the upwards-open linked n-vertex diagrams are linked (because all disconnected parts of the open diagram must be closed by the single added vertex) and constitute the complete set of all closed linked (n+1)-vertex diagrams. Therefore Eq. (162) is consistent with Eq. (161), i.e. Eq. (159) is satisfied.

Next we substitute Eq. (160) in Eq. (158), resulting in

$$ \begin{aligned} |\Psi\rangle &= |0\rangle + \sum_{n=0}^{\infty} R_0 \left( W - \Delta E \right) \left[ (R_0 W)^n |0\rangle \right]_L \\ &= |0\rangle + \sum_{n=0}^{\infty} R_0 W \left[ (R_0 W)^n |0\rangle \right]_L - \sum_{n=0}^{\infty} \Delta E\, R_0 \left[ (R_0 W)^n |0\rangle \right]_L \end{aligned} \tag{163} $$

Each term of the first sum over n in the second line of Eq. (163) consists of all the upwards-open (n+1)-vertex diagrams that can be formed by adding one vertex (and the corresponding resolvent) to all upwards-open linked n-vertex diagrams. Each resulting diagram either is linked or is unlinked with a single separate closed part (if the added vertex closed a disconnected part of the n-vertex open diagram) and has the top vertex of the closed part as the top vertex of the entire diagram. We may therefore rewrite Eq. (163) in the form

$$ |\Psi\rangle = |0\rangle + \sum_{n=0}^{\infty} \left[ R_0 W (R_0 W)^n |0\rangle \right]_L + \sum_{n=0}^{\infty} \left[ R_0 W \left[ (R_0 W)^n |0\rangle \right]_L \right]_U - \sum_{n=0}^{\infty} \Delta E\, R_0 \left[ (R_0 W)^n |0\rangle \right]_L \tag{164} $$

where the subscript U indicates restriction to unlinked terms. The factorization theorem can then be used to show the cancellation of the last two sums in this equation: each term in the third sum can be described by an open diagram with an insertion above its top vertex, and this diagram cancels the contributions to the second sum from the sum of the corresponding unlinked two-part open diagrams in which the top vertex of the closed part is the top vertex of the entire diagram. The remaining terms on the right-hand side are equivalent to the linked-diagram expansion Eq. (160), proving that this expansion satisfies Eq. (158) and the Schrödinger equation.

J. Removal of spin

So far the formalism has been specified in terms of spinorbitals, and no attempt has been made to consider the effects of spin. However, since the nonrelativistic Hamiltonian does not contain spin coordinates, integration over the spin variables is easily carried out and results in significant economies in the calculations.

The simplest way in which spin affects the perturbation theory summations is that some integrals vanish because of spin orthogonality. Thus if we indicate the spin factor of a spinorbital by putting a bar over β spinorbitals, and no bar over α's, we have

$$ \begin{aligned} \langle pq\|rs\rangle &= \langle pq|v|rs\rangle - \langle pq|v|sr\rangle, \\ \langle p\bar q\|r\bar s\rangle &= \langle pq|v|rs\rangle, \\ \langle p\bar q\|\bar r s\rangle &= -\langle pq|v|sr\rangle, \\ \langle \bar p q\|\bar r s\rangle &= \langle pq|v|rs\rangle, \\ \langle \bar p q\|r\bar s\rangle &= -\langle pq|v|sr\rangle, \\ \langle \bar p\bar q\|\bar r\bar s\rangle &= \langle pq|v|rs\rangle - \langle pq|v|sr\rangle, \end{aligned} $$

where the integrals on the r.h.s. are over the spatial factors only, and

$$ \begin{aligned} \langle \bar p q\|rs\rangle = \langle p\bar q\|rs\rangle = \langle pq\|\bar r s\rangle = \langle pq\|r\bar s\rangle &= 0, \\ \langle \bar p\bar q\|rs\rangle = \langle pq\|\bar r\bar s\rangle &= 0, \\ \langle \bar p\bar q\|\bar r s\rangle = \langle \bar p\bar q\|r\bar s\rangle = \langle \bar p q\|\bar r\bar s\rangle = \langle p\bar q\|\bar r\bar s\rangle &= 0. \end{aligned} $$

Thus, out of the 16 possible combinations of spin assignments to the four orbitals in an antisymmetric two-electron integral, 10 of the resulting integrals vanish completely and four are reduced to a single spatial integral.

Taking the second-order energy in the canonical RHF case as an example, we have:

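$$ E^{(2)} = \frac{1}{4}\sum_{abij} \frac{\langle ij\|ab\rangle\langle ab\|ij\rangle}{\varepsilon_{ij}^{ab}} $$

where $\varepsilon_{ij}^{ab} = \varepsilon_i + \varepsilon_j - \varepsilon_a - \varepsilon_b$, as in Eq. (149), and the sum runs over spinorbitals. Writing out the six nonvanishing spin cases (the sums now run over spatial orbitals),

$$ \begin{aligned} E^{(2)} = {} & \frac{1}{4}\sum_{abij} \frac{1}{\varepsilon_{ij}^{ab}} \left[ \langle ij\|ab\rangle\langle ab\|ij\rangle + \langle i\bar j\|a\bar b\rangle\langle a\bar b\|i\bar j\rangle + \langle i\bar j\|\bar a b\rangle\langle \bar a b\|i\bar j\rangle \right] \\ & + \frac{1}{4}\sum_{abij} \frac{1}{\varepsilon_{ij}^{ab}} \left[ \langle \bar i j\|a\bar b\rangle\langle a\bar b\|\bar i j\rangle + \langle \bar i j\|\bar a b\rangle\langle \bar a b\|\bar i j\rangle + \langle \bar i\bar j\|\bar a\bar b\rangle\langle \bar a\bar b\|\bar i\bar j\rangle \right] \end{aligned} $$

$$ \begin{aligned} E^{(2)} = {} & \frac{1}{4}\sum_{abij} \frac{1}{\varepsilon_{ij}^{ab}} \left[ \langle ij\|ab\rangle\langle ab\|ij\rangle + \langle ij|v|ab\rangle\langle ab|v|ij\rangle + \langle ij|v|ba\rangle\langle ab|v|ji\rangle \right] \\ & + \frac{1}{4}\sum_{abij} \frac{1}{\varepsilon_{ij}^{ab}} \left[ \langle ij|v|ba\rangle\langle ab|v|ji\rangle + \langle ij|v|ab\rangle\langle ab|v|ij\rangle + \langle ij\|ab\rangle\langle ab\|ij\rangle \right] \end{aligned} $$

$$ E^{(2)} = \frac{1}{4}\sum_{abij} \frac{1}{\varepsilon_{ij}^{ab}} \left[ 2\langle ij\|ab\rangle\langle ab\|ij\rangle + 2\langle ij|v|ab\rangle\langle ab|v|ij\rangle + 2\langle ij|v|ba\rangle\langle ab|v|ji\rangle \right] $$

$$ E^{(2)} = \frac{1}{2}\sum_{abij} \frac{1}{\varepsilon_{ij}^{ab}} \left[ \left( \langle ij|v|ab\rangle - \langle ij|v|ba\rangle \right) \left( \langle ab|v|ij\rangle - \langle ab|v|ji\rangle \right) + \langle ij|v|ab\rangle\langle ab|v|ij\rangle + \langle ij|v|ba\rangle\langle ab|v|ji\rangle \right] $$

$$ \begin{aligned} E^{(2)} = {} & \frac{1}{2}\sum_{abij} \frac{1}{\varepsilon_{ij}^{ab}} \left[ \langle ij|v|ab\rangle\langle ab|v|ij\rangle - \langle ij|v|ba\rangle\langle ab|v|ij\rangle - \langle ij|v|ab\rangle\langle ab|v|ji\rangle \right] \\ & + \frac{1}{2}\sum_{abij} \frac{1}{\varepsilon_{ij}^{ab}} \left[ \langle ij|v|ba\rangle\langle ab|v|ji\rangle + \langle ij|v|ab\rangle\langle ab|v|ij\rangle + \langle ij|v|ba\rangle\langle ab|v|ji\rangle \right] \end{aligned} $$

$$ E^{(2)} = \frac{1}{2}\sum_{abij} \frac{1}{\varepsilon_{ij}^{ab}} \left[ 2\langle ij|v|ab\rangle\langle ab|v|ij\rangle + 2\langle ij|v|ba\rangle\langle ab|v|ji\rangle - \langle ij|v|ab\rangle\langle ab|v|ji\rangle - \langle ij|v|ba\rangle\langle ab|v|ij\rangle \right] $$

Exchanging the electron labels in the second factor of the second and third terms, $\langle ab|v|ji\rangle = \langle ba|v|ij\rangle$, gives

$$ E^{(2)} = \frac{1}{2}\sum_{abij} \frac{1}{\varepsilon_{ij}^{ab}} \left[ 2\langle ij|v|ab\rangle\langle ab|v|ij\rangle + 2\langle ij|v|ba\rangle\langle ba|v|ij\rangle - \langle ij|v|ab\rangle\langle ba|v|ij\rangle - \langle ij|v|ba\rangle\langle ab|v|ij\rangle \right] $$

where the summations are over the distinct spatial orbitals only. Since a, b are dummy summation indices and can be interchanged, we find that the first two terms in the brackets are equal (after summation), and so are the third and fourth. Thus

$$ E^{(2)} = \sum_{abij} \frac{1}{\varepsilon_{ij}^{ab}} \left[ 2\langle ij|v|ab\rangle\langle ab|v|ij\rangle - \langle ij|v|ab\rangle\langle ba|v|ij\rangle \right] $$

$$ E^{(2)} = \sum_{abij} \frac{\langle ij|v|ab\rangle}{\varepsilon_{ij}^{ab}} \left[ 2\langle ab|v|ij\rangle - \langle ba|v|ij\rangle \right] $$

Similar treatments hold for other terms.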
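The spin summation above can be checked numerically by comparing the spinorbital formula with the final spatial-orbital formula. The sketch below assumes a toy model with two occupied and two virtual spatial orbitals, real orbitals, random numbers in place of the integrals $\langle pq|v|rs\rangle$, and made-up orbital energies; all names are illustrative.

```python
import itertools
import random

random.seed(1)
n_occ, n_orb = 2, 4
eps = [-1.5, -0.7, 0.4, 1.1]  # hypothetical orbital energies
occ, virt = range(n_occ), range(n_occ, n_orb)

raw = {k: random.uniform(-1, 1)
       for k in itertools.product(range(n_orb), repeat=4)}
# Impose <pq|rs> = <qp|sr> = <rs|pq> = <sr|qp> (real orbitals)
v = {(p, q, r, s): 0.25 * (raw[p, q, r, s] + raw[q, p, s, r]
                           + raw[r, s, p, q] + raw[s, r, q, p])
     for (p, q, r, s) in raw}

def anti(P, Q, R, S):
    """<PQ||RS> for spinorbitals given as (spatial index, spin) pairs."""
    (p, sp), (q, sq), (r, sr), (s, ss) = P, Q, R, S
    direct = v[p, q, r, s] if (sp == sr and sq == ss) else 0.0
    exchange = v[p, q, s, r] if (sp == ss and sq == sr) else 0.0
    return direct - exchange

def mp2_spinorbital():
    """(1/4) sum over spinorbitals of <ij||ab><ab||ij> / eps_ij^ab."""
    occ_so = [(i, s) for i in occ for s in (0, 1)]
    virt_so = [(a, s) for a in virt for s in (0, 1)]
    total = 0.0
    for I, J in itertools.product(occ_so, repeat=2):
        for A, B in itertools.product(virt_so, repeat=2):
            denom = eps[I[0]] + eps[J[0]] - eps[A[0]] - eps[B[0]]
            total += anti(I, J, A, B) * anti(A, B, I, J) / denom
    return 0.25 * total

def mp2_spatial():
    """Sum over spatial orbitals of <ij|v|ab>[2<ab|v|ij> - <ba|v|ij>] / eps."""
    total = 0.0
    for i, j in itertools.product(occ, repeat=2):
        for a, b in itertools.product(virt, repeat=2):
            denom = eps[i] + eps[j] - eps[a] - eps[b]
            total += v[i, j, a, b] * (2 * v[a, b, i, j] - v[b, a, i, j]) / denom
    return total

print(mp2_spinorbital(), mp2_spatial())
```

The spatial form loops over 16 index combinations here instead of 256, which illustrates the economy gained by integrating out spin.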

IX. COUPLED CLUSTER THEORY

Coupled cluster theory was introduced in 1960 by Coester and Kümmel for calculating nuclear binding energies. In 1966 J. Čížek, and later Čížek together with J. Paldus, reformulated the method for electron correlation in atoms and molecules.

A. Exponential ansatz

$$ \Psi_{CC} = \Omega\, \Phi_0 $$

$$ \Psi_{CC} = e^{T} \Phi_0 $$

where Ω is often called the wave operator, as it takes an unperturbed solution into the exact solution, and

$$ T = T_1 + T_2 + T_3 + \dots + T_m $$

where

$$ T_1 = \sum_{ia} t_i^a\, a^\dagger i $$

$$ T_2 = \frac{1}{(2!)^2} \sum_{ijab} t_{ij}^{ab}\, a^\dagger b^\dagger j\, i $$

$$ \vdots $$

$$ T_m = \frac{1}{(m!)^2} \sum_{ij\dots ab\dots} t_{ij\dots}^{ab\dots}\, a^\dagger b^\dagger \cdots j\, i $$

where m ≤ N and N represents the number of electrons. The coefficients $t_{ij\dots}^{ab\dots}$ are to be determined and are usually referred to as the "amplitudes" of the corresponding operators. They are antisymmetric in their indices,

$$ t_{ij}^{ab} = -t_{ji}^{ab} = -t_{ij}^{ba} = t_{ji}^{ba} $$

The simplest coupled cluster approach is that of coupled cluster doubles (CCD), in which T is truncated to

$$ T_{CCD} = T_2 $$

The most common extension of this model is coupled cluster singles and doubles (CCSD), defined by

$$ T_{CCSD} = T_1 + T_2 $$

and similarly

$$ T_{CCSDT} = T_1 + T_2 + T_3 $$

B. Size consistency

Consider a system AB composed of two non-interacting components A and B,

$$ \Phi_0(AB) = \Phi_0(A)\Phi_0(B) $$

$$ T(AB) = T(A) + T(B) $$

Then

$$ \begin{aligned} \Psi(AB) &= e^{T(AB)} \Phi_0(AB) \\ &= e^{T(A)+T(B)} \Phi_0(A)\Phi_0(B) \\ &= e^{T(A)}\Phi_0(A)\, e^{T(B)}\Phi_0(B) \\ &= \Psi(A)\Psi(B) \end{aligned} $$

This separability of the wavefunction ensures the additivity of the energy:

$$ \begin{aligned} H(AB)\Psi(AB) &= [H(A) + H(B)]\Psi(A)\Psi(B) \\ &= [E(A) + E(B)]\Psi(A)\Psi(B) \\ &= [E(A) + E(B)]\Psi(AB) \end{aligned} $$
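The factorization $e^{T(A)+T(B)}\Phi_0(A)\Phi_0(B) = \Psi(A)\Psi(B)$ can be illustrated with a toy model. The sketch below assumes each subsystem has only two states (reference and one excited state), so that T(A) and T(B) are nilpotent 2x2 matrices acting on different factors of a product space; the amplitudes and all helper names are hypothetical.

```python
from fractions import Fraction

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def add(x, y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def kron(x, y):
    """Kronecker product: operator x on subsystem A, y on subsystem B."""
    return [[xa * yb for xa in rx for yb in ry] for rx in x for ry in y]

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def exp_series(t, n_terms):
    """Taylor series of exp(t); exact if t is nilpotent with t**n_terms = 0."""
    result = term = identity(len(t))
    for k in range(1, n_terms):
        term = [[entry / k for entry in row] for row in matmul(term, t)]
        result = add(result, term)
    return result

def apply(m, vec):
    return [sum(m[i][j] * vec[j] for j in range(len(vec))) for i in range(len(m))]

t_a, t_b = Fraction(1, 3), Fraction(-2, 5)   # hypothetical amplitudes
T_A = [[0, 0], [t_a, 0]]                     # excites reference -> excited state
T_B = [[0, 0], [t_b, 0]]
T_AB = add(kron(T_A, identity(2)), kron(identity(2), T_B))

psi_a = apply(exp_series(T_A, 3), [1, 0])
psi_b = apply(exp_series(T_B, 3), [1, 0])
psi_ab = apply(exp_series(T_AB, 3), [1, 0, 0, 0])
product = [ca * cb for ca in psi_a for cb in psi_b]
print(psi_ab, product)
```

The coefficient of the doubly excited product state, $t_A t_B$, arises from the quadratic term of the exponential; this is the term a linear truncation of the wave operator would miss.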

C. CC method with double excitations

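The Schrödinger equation is

$$ H\Psi_{CCD} = E_{CCD}\Psi_{CCD} \tag{165} $$

Projecting onto the reference function gives

$$ \langle \Phi_0|H|\Psi_{CCD}\rangle = E_{CCD} \langle \Phi_0|\Psi_{CCD}\rangle $$

We can put $\langle \Phi_0|\Psi_{CCD}\rangle = 1$ by the choice of intermediate normalization, so that

$$ E_{CCD} = \langle \Phi_0|H|\Psi_{CCD}\rangle \tag{166} $$

where $\Psi_{CCD} = e^{T_2}\Phi_0$. In order to make our calculation easier, we write the total Hamiltonian H in normal-ordered form, i.e.

$$ H = H_N + \langle 0|H|0\rangle = H_N + E_{\rm ref} $$

where

$$ H_N = F_N + W_N = \sum_{pq} f_{pq}\, p^\dagger q + \frac{1}{4}\sum_{pqrs} \langle pq\|rs\rangle\, p^\dagger q^\dagger s r $$

and

$$ E_{\rm ref} = \langle 0|H|0\rangle $$

So equation (166) becomes

$$ E_{CCD} - E_{\rm ref} = \langle \Phi_0|(H - E_{\rm ref})|\Psi_{CCD}\rangle $$

$$ \Delta E_{CCD} = \langle \Phi_0|H_N|\Psi_{CCD}\rangle $$

For simplicity let $|\Phi_0\rangle = |0\rangle$, so

$$ \Delta E_{CCD} = \langle 0|H_N e^{T_2}|0\rangle $$

$$ \Delta E_{CCD} = \langle 0|H_N \left( 1 + T_2 + \tfrac{1}{2}T_2^2 \right)|0\rangle \tag{167} $$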

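The first term in the above equation is zero because $H_N$ is in normal-ordered form, and the third term is zero by the Slater-Condon rules, so we get

$$ \Delta E_{CCD} = \langle 0|H_N T_2|0\rangle $$

$$ \Delta E_{CCD} = \frac{1}{4}\sum_{ijab} t_{ij}^{ab}\, \langle 0|\Big[ \sum_{pq} f_{pq}\, p^\dagger q + \frac{1}{4}\sum_{pqrs} \langle pq\|rs\rangle\, p^\dagger q^\dagger s r \Big] a^\dagger b^\dagger j\, i|0\rangle $$

The first term is zero, since a normal-ordered one-body operator cannot be fully contracted with the four operators of $T_2$, and the second term becomes

$$ \Delta E_{CCD} = \frac{1}{16}\sum_{ijab}\sum_{pqrs} \langle pq\|rs\rangle\, t_{ij}^{ab}\, \langle 0|p^\dagger q^\dagger s r\, a^\dagger b^\dagger j\, i|0\rangle $$

Only the fully contracted terms survive. There are four of them, obtained by contracting $p^\dagger, q^\dagger$ with i, j and r, s with $a^\dagger, b^\dagger$ in all possible ways:

$$ \Delta E_{CCD} = \frac{1}{16}\sum_{ijab}\sum_{pqrs} \langle pq\|rs\rangle\, t_{ij}^{ab} \left[ \delta_{pi}\delta_{qj}\delta_{sb}\delta_{ra} - \delta_{pi}\delta_{qj}\delta_{sa}\delta_{rb} - \delta_{pj}\delta_{qi}\delta_{sb}\delta_{ra} + \delta_{pj}\delta_{qi}\delta_{sa}\delta_{rb} \right] $$

so

$$ \Delta E_{CCD} = \frac{1}{16}\sum_{ijab} \left[ \langle ij\|ab\rangle - \langle ij\|ba\rangle - \langle ji\|ab\rangle + \langle ji\|ba\rangle \right] t_{ij}^{ab} $$

Since $\langle ij\|ab\rangle = -\langle ij\|ba\rangle = -\langle ji\|ab\rangle = \langle ji\|ba\rangle$, the above equation becomes

$$ \Delta E_{CCD} = \frac{1}{4}\sum_{ijab} \langle ij\|ab\rangle\, t_{ij}^{ab} $$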
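The factor 1/4 in the energy expression compensates for the fact that the unrestricted sum visits each distinct pair i<j, a<b four times. A minimal sketch of this bookkeeping, assuming random antisymmetric tables in place of actual integrals and amplitudes (the storage convention `t[i, j, a, b]` for $t_{ij}^{ab}$ and the function names are illustrative assumptions):

```python
import itertools
import random

random.seed(2)
occ, virt = range(3), range(3, 7)  # hypothetical spinorbital index ranges

def antisymmetric_table(rows, cols):
    """Random x[p, q, r, s], antisymmetric in (p, q) and in (r, s)."""
    raw = {k: random.uniform(-1, 1)
           for k in itertools.product(rows, rows, cols, cols)}
    return {(p, q, r, s): raw[p, q, r, s] - raw[q, p, r, s]
                          - raw[p, q, s, r] + raw[q, p, s, r]
            for (p, q, r, s) in raw}

w = antisymmetric_table(occ, virt)  # stands in for <ij||ab>
t = antisymmetric_table(occ, virt)  # stands in for t_ij^ab

def ccd_energy(w, t):
    """Delta E = (1/4) sum_{ijab} <ij||ab> t_ij^ab (unrestricted sum)."""
    return 0.25 * sum(w[k] * t[k] for k in w)

def ccd_energy_restricted(w, t):
    """Same quantity as a sum over i<j, a<b without the factor 1/4."""
    return sum(w[i, j, a, b] * t[i, j, a, b]
               for (i, j, a, b) in w if i < j and a < b)

print(ccd_energy(w, t), ccd_energy_restricted(w, t))
```

In an actual calculation the amplitudes would come from solving the CCD amplitude equations; the random tables here only exercise the index symmetry.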

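To calculate the energy we need the amplitudes $t_{ij}^{ab}$. We can obtain equations for these amplitudes by projecting equation (165), written in terms of $H_N$ (so that the eigenvalue is $\Delta E_{CCD}$), onto all double excitations, i.e.

$$ \langle \Phi_{ij}^{ab}|H_N|\Psi_{CCD}\rangle = \Delta E_{CCD}\, \langle \Phi_{ij}^{ab}|\Psi_{CCD}\rangle $$

$$ \langle \Phi_{ij}^{ab}|H_N e^{T_2}|0\rangle = \Delta E_{CCD}\, \langle \Phi_{ij}^{ab}|e^{T_2}|0\rangle $$

$$ \langle \Phi_{ij}^{ab}|H_N \left( 1 + T_2 + \tfrac{1}{2}T_2^2 \right)|0\rangle = \Delta E_{CCD}\, \langle \Phi_{ij}^{ab}|\left( 1 + T_2 + \tfrac{1}{2}T_2^2 \right)|0\rangle $$

$$ \langle \Phi_{ij}^{ab}|H_N \left( 1 + T_2 + \tfrac{1}{2}T_2^2 \right)|0\rangle = \Delta E_{CCD}\, \langle \Phi_{ij}^{ab}|T_2|0\rangle $$

$$ \langle \Phi_{ij}^{ab}|H_N \left( 1 + T_2 + \tfrac{1}{2}T_2^2 \right)|0\rangle = \frac{1}{(2!)^2}\, \Delta E_{CCD} \sum_{i'j'a'b'} \langle \Phi_{ij}^{ab}|\Phi_{i'j'}^{a'b'}\rangle\, t_{i'j'}^{a'b'} $$

$$ \langle \Phi_{ij}^{ab}|H_N \left( 1 + T_2 + \tfrac{1}{2}T_2^2 \right)|0\rangle = \Delta E_{CCD}\, t_{ij}^{ab} \tag{168} $$

Let us evaluate each term on the LHS of equation (168) separately. The first term is equal to

$$ \langle \Phi_{ij}^{ab}|H_N|0\rangle = \langle ab\|ij\rangle $$

where we used the Slater-Condon rules. Now let us evaluate the second term,

$$ \langle \Phi_{ij}^{ab}|H_N T_2|0\rangle = \frac{1}{4}\sum_{klcd} \langle \Phi_{ij}^{ab}|H_N|\Phi_{kl}^{cd}\rangle\, t_{kl}^{cd} $$

$$ \langle \Phi_{ij}^{ab}|H_N T_2|0\rangle = \frac{1}{4}\sum_{klcd} \langle \Phi_{ij}^{ab}|(F_N + W_N)|\Phi_{kl}^{cd}\rangle\, t_{kl}^{cd} $$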

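We evaluate each term in the above equation separately. Let us name the first term $L_1$ and the second $L_2$, so

$$ L_1 = \frac{1}{4}\sum_{klcd} \langle \Phi_{ij}^{ab}|F_N|\Phi_{kl}^{cd}\rangle\, t_{kl}^{cd} $$

$$ L_1 = \frac{1}{4}\sum_{klcd}\sum_{pq} f_{pq}\, \langle 0|i^\dagger j^\dagger b\, a\; p^\dagger q\; c^\dagger d^\dagger l\, k|0\rangle\, t_{kl}^{cd} $$

Here 16 full contractions are possible. In four of them the lines are paired as i with k, j with l, a with c and b with d, with the operator pair $p^\dagger q$ inserted on one of the four lines; these give

$$ \sum_{pq} f_{pq} \left[ \delta_{ik}\delta_{jl}\delta_{bd}\delta_{ap}\delta_{cq} + \delta_{ik}\delta_{jl}\delta_{ac}\delta_{bp}\delta_{dq} - \delta_{ik}\delta_{jq}\delta_{lp}\delta_{ac}\delta_{bd} - \delta_{iq}\delta_{kp}\delta_{jl}\delta_{ac}\delta_{bd} \right] t_{kl}^{cd} $$

The twelve remaining contractions give contributions equal to these four terms, canceling the factor 1/4, and hence we get

$$ L_1 = \sum_c f_{ac}\, t_{ij}^{cb} + \sum_d f_{bd}\, t_{ij}^{ad} - \sum_l f_{lj}\, t_{il}^{ab} - \sum_k f_{ki}\, t_{kj}^{ab} $$

Changing the dummy summation indices in some terms and permuting some indices gives the following result:

$$ L_1 = -\sum_c \left( f_{ac}\, t_{ij}^{bc} - f_{bc}\, t_{ij}^{ac} \right) - \sum_k \left( f_{jk}\, t_{ik}^{ab} - f_{ik}\, t_{jk}^{ab} \right) $$

In the canonical Hartree-Fock case

$$ L_1 = -\sum_c \left( \varepsilon_a \delta_{ac}\, t_{ij}^{bc} - \varepsilon_b \delta_{bc}\, t_{ij}^{ac} \right) - \sum_k \left( \varepsilon_j \delta_{jk}\, t_{ik}^{ab} - \varepsilon_i \delta_{ik}\, t_{jk}^{ab} \right) $$

$$ L_1 = -\left( \varepsilon_a t_{ij}^{ba} - \varepsilon_b t_{ij}^{ab} \right) - \left( \varepsilon_j t_{ij}^{ab} - \varepsilon_i t_{ji}^{ab} \right) $$

$$ L_1 = \left( \varepsilon_a + \varepsilon_b - \varepsilon_i - \varepsilon_j \right) t_{ij}^{ab} $$

where the last step uses the antisymmetry of the amplitudes, $t_{ij}^{ba} = t_{ji}^{ab} = -t_{ij}^{ab}$.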

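For the two-particle part of the linear term, we have to evaluate

$$ L_2 = \frac{1}{16}\sum_{pqrs}\sum_{klcd} \langle pq\|rs\rangle\, \langle 0|i^\dagger j^\dagger b\, a\; p^\dagger q^\dagger s r\; c^\dagger d^\dagger l\, k|0\rangle\, t_{kl}^{cd} $$

To obtain valid full contractions in this case, two of the four operators of the first normal product must be contracted with operators of the third product, and the other two with operators of the second. The contractions can be classified into three cases: (a) the first and third products are connected by two pairs of hole-index operators, (b) by two pairs of particle-index operators, (c) by one pair of each type.

In case (a), $i^\dagger$ and $j^\dagger$ are contracted with k and l. This can be done in two equivalent ways, so that

$$ L_{2a} = \frac{1}{8}\sum_{pqrs}\sum_{cd} \langle pq\|rs\rangle\, \langle 0|b\, a\; p^\dagger q^\dagger s r\; c^\dagger d^\dagger|0\rangle\, t_{ij}^{cd} $$

The remaining operators can be fully contracted in four ways,

$$ L_{2a} = \frac{1}{8}\sum_{pqrs}\sum_{cd} \langle pq\|rs\rangle \left[ \delta_{bq}\delta_{ap}\delta_{ds}\delta_{cr} - \delta_{bq}\delta_{ap}\delta_{cs}\delta_{dr} - \delta_{bp}\delta_{aq}\delta_{ds}\delta_{cr} + \delta_{bp}\delta_{aq}\delta_{cs}\delta_{dr} \right] t_{ij}^{cd} $$

$$ L_{2a} = \frac{1}{2}\sum_{cd} \langle ab\|cd\rangle\, t_{ij}^{cd} $$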

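Now, in case (b), a and b are contracted with $c^\dagger$ and $d^\dagger$, again in two equivalent ways, so that

$$ L_{2b} = \frac{1}{8}\sum_{pqrs}\sum_{kl} \langle pq\|rs\rangle\, \langle 0|i^\dagger j^\dagger\; p^\dagger q^\dagger s r\; l\, k|0\rangle\, t_{kl}^{ab} $$

By contractions similar to those for $L_{2a}$, we get

$$ L_{2b} = \frac{1}{2}\sum_{kl} \langle kl\|ij\rangle\, t_{kl}^{ab} $$

For $L_{2c}$, one hole operator and one particle operator of the first product are contracted with operators of the third product. There are four choices of this pair ($i^\dagger$ or $j^\dagger$, and a or b), and each can be realized in four equivalent ways (contraction with k or l, and with $c^\dagger$ or $d^\dagger$), which reduces the factor 1/16 to 1/4:

$$ \begin{aligned} L_{2c} = \frac{1}{4}\sum_{pqrs}\sum_{kc} \langle pq\|rs\rangle \big[ & \langle 0|j^\dagger b\; p^\dagger q^\dagger s r\; c^\dagger k|0\rangle\, t_{ik}^{ac} - \langle 0|i^\dagger b\; p^\dagger q^\dagger s r\; c^\dagger k|0\rangle\, t_{jk}^{ac} \\ & - \langle 0|j^\dagger a\; p^\dagger q^\dagger s r\; c^\dagger k|0\rangle\, t_{ik}^{bc} + \langle 0|i^\dagger a\; p^\dagger q^\dagger s r\; c^\dagger k|0\rangle\, t_{jk}^{bc} \big] \end{aligned} $$

The first term in the above equation can be fully contracted in four ways, which give equal contributions. After simplifying, we get for the first term

$$ -\sum_{kc} \langle bk\|cj\rangle\, t_{ik}^{ac} $$

Applying the same procedure to the other three terms of $L_{2c}$ and combining all the terms, we get

$$ L_{2c} = -\sum_{kc} \left( \langle bk\|cj\rangle\, t_{ik}^{ac} - \langle bk\|ci\rangle\, t_{jk}^{ac} - \langle ak\|cj\rangle\, t_{ik}^{bc} + \langle ak\|ci\rangle\, t_{jk}^{bc} \right) $$

After adding $L_1$, $L_{2a}$, $L_{2b}$, $L_{2c}$, we get

$$ L = L_1 + L_{2a} + L_{2b} + L_{2c} $$

$$ \begin{aligned} L = {} & \left( \varepsilon_a + \varepsilon_b - \varepsilon_i - \varepsilon_j \right) t_{ij}^{ab} + \frac{1}{2}\sum_{cd} \langle ab\|cd\rangle\, t_{ij}^{cd} + \frac{1}{2}\sum_{kl} \langle kl\|ij\rangle\, t_{kl}^{ab} \\ & - \sum_{kc} \left( \langle bk\|cj\rangle\, t_{ik}^{ac} - \langle bk\|ci\rangle\, t_{jk}^{ac} - \langle ak\|cj\rangle\, t_{ik}^{bc} + \langle ak\|ci\rangle\, t_{jk}^{bc} \right) \end{aligned} $$

Now we evaluate the quadratic term in equation (168):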

$$ Q = \langle \Phi_{ij}^{ab}|H_N \left( \tfrac{1}{2}T_2^2 \right)|0\rangle $$

$$ Q = \langle \Phi_{ij}^{ab}|(F_N + W_N)\left( \tfrac{1}{2}T_2^2 \right)|0\rangle $$

$$ Q = \langle \Phi_{ij}^{ab}|F_N \left( \tfrac{1}{2}T_2^2 \right)|0\rangle + \langle \Phi_{ij}^{ab}|W_N \left( \tfrac{1}{2}T_2^2 \right)|0\rangle $$

The first term in the above equation, containing the one-electron operator, is zero. This becomes clearer if we think in terms of diagrams. Since $T_2^2$ corresponds to a quadruple excitation while the target state is a double excitation, we must use a diagram of de-excitation level −2, but $F_N$ has de-excitation level at most −1 and hence its contribution vanishes. The second term, containing the two-electron operator, is

$$ Q = \frac{1}{2}\langle \Phi_{ij}^{ab}|W_N T_2^2|0\rangle $$

$$ Q = \frac{1}{8}\sum_{pqrs}\sum_{m>n,\,e>f}\ \sum_{k>l,\,c>d} \langle pq\|rs\rangle\, \langle 0|i^\dagger j^\dagger b\, a\; p^\dagger q^\dagger s r\; c^\dagger d^\dagger l\, k\; e^\dagger f^\dagger n\, m|0\rangle\, t_{kl}^{cd}\, t_{mn}^{ef} $$

No nonzero contractions are possible between the third and fourth normal products in the above equation. Thus, to obtain nonzero contributions, four of the eight operators in the third and fourth normal products have to be contracted with the first product, and the remaining four with the second product.

We shall first consider the case in which the four operators of the first product are contracted with the four operators of the fourth. This term, and the similar one in which the four contractions are between the first and third normal products, represent unlinked contributions, since the set of contractions involving the first normal product is decoupled from the set involving the second. Considering the inequalities in the restricted summations over m, n, e, f and the restriction i > j, a > b, the contractions between the first and fourth products can be accomplished in only one way, giving

$$ Q_a = \frac{1}{8}\sum_{pqrs}\ \sum_{k>l,\,c>d} \langle pq\|rs\rangle\, \langle 0|p^\dagger q^\dagger s r\; c^\dagger d^\dagger l\, k|0\rangle\, t_{kl}^{cd}\, t_{ij}^{ab} $$

The remaining operators can be contracted in four possible ways, which give equal contributions, so that

$$ Q_a = \frac{1}{2}\sum_{k>l,\,c>d} \langle kl\|cd\rangle\, t_{kl}^{cd}\, t_{ij}^{ab} $$

The same result is obtained (after renaming the summation indices) for the case in which the four operators of the first product are contracted with those in the third product, and thus we get

$$ Q_b = \frac{1}{2}\sum_{k>l,\,c>d} \langle kl\|cd\rangle\, t_{kl}^{cd}\, t_{ij}^{ab} $$
The remaining terms in the quadratic contribution fall into four classes, depending on the pattern of contractions of the first normal product. In class (a) the two hole operators of the first product are contracted with either the third or the fourth product (i.e. $i^\dagger$ and $j^\dagger$ are contracted with k and l, respectively, or with m and n, respectively, using ordered sums) while the two particle operators are contracted with the fourth or third product, respectively. These two types of contraction produce equal results, canceling a factor 1/2. Then converting to unrestricted summations adds a factor 1/4, which is later canceled by the four equivalent ways of contracting the remaining operators, giving

$$ Q_c = \frac{1}{16}\sum_{pqrs}\sum_{klcd} \langle pq\|rs\rangle\, \langle 0|p^\dagger q^\dagger s r\; l\, k\; c^\dagger d^\dagger|0\rangle\, t_{ij}^{cd}\, t_{kl}^{ab} $$

$$ Q_c = \frac{1}{4}\sum_{klcd} \langle kl\|cd\rangle\, t_{ij}^{cd}\, t_{kl}^{ab} $$

In class (b) one hole and one particle operator of the first normal product are contracted with operators in the third product, while the remaining two operators are contracted with operators in the fourth. Converting to unrestricted summations, which introduces an additional factor 1/16, we find that there are 64 choices for these contractions. Specifically, there are four ways for $i^\dagger$ and a to be contracted with operators in the third product, while $j^\dagger$ and b can be contracted with operators in the fourth product in four ways, giving 16 equal terms; contracting $i^\dagger$ and a with operators in the fourth product while $j^\dagger$ and b are contracted with operators in the third product gives 16 more terms equal to the above, for a total of 32 equal terms. Another set of 32 equal terms is obtained by contracting $i^\dagger$ and b with operators in the third product while $j^\dagger$ and a are contracted with operators in the fourth product, or vice versa. In total, after renaming the summation indices and performing the remaining contractions, we get

$$ Q_d = \frac{1}{4}\sum_{pqrs}\sum_{klcd} \langle pq\|rs\rangle\, \langle 0|p^\dagger q^\dagger s r\; c^\dagger k\, d^\dagger l|0\rangle \left( t_{ik}^{ac}\, t_{jl}^{bd} - t_{ik}^{bc}\, t_{jl}^{ad} \right) $$

$$ Q_d = \sum_{klcd} \langle kl\|cd\rangle \left( t_{ik}^{ac}\, t_{jl}^{bd} - t_{ik}^{bc}\, t_{jl}^{ad} \right) $$

$$ Q_d = \sum_{klcd} \langle kl\|cd\rangle \left( t_{ik}^{ac}\, t_{jl}^{bd} + t_{ik}^{bd}\, t_{jl}^{ac} \right) $$

where the last form follows from renaming c and d in the second term and using $\langle kl\|dc\rangle = -\langle kl\|cd\rangle$.

In classes (c) and (d) three operators of the first normal product are contracted with operators in the third product and one with an operator in the fourth, or vice versa. In class (c) the set of three operators in the first product consists of two particle operators and one hole operator, while in class (d) it consists of one particle operator and two hole operators. Furthermore, each case can be generated in two distinct ways, depending on whether the set of three operators is $i^\dagger ab$ or $j^\dagger ab$ for (c) and $i^\dagger j^\dagger a$ or $i^\dagger j^\dagger b$ for (d). There are 16 possibilities in each case: the set of three operators in the first product can be contracted with operators in the third or the fourth product, and in each case these three contractions can be done in four ways, while the remaining single contraction can be chosen in two ways. The 16 possibilities lead to equivalent results, canceling the factor 1/16 obtained by converting to unrestricted summations. As an example, the first $Q_e$ term can be written in the form

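\[
\frac{1}{8}\sum_{pqrs}\sum_{mnef}\sum_{klcd}\cdots
= -\frac{1}{8}\sum_{pqrs}\sum_{klcd}\langle pq\|rs\rangle\,\langle 0|p^\dagger q^\dagger s r\, c^\dagger d^\dagger k l|0\rangle\; t^{cd}_{kj}\, t^{ab}_{li}
\]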

The sign reflects the odd number of interchanges needed to move all the contracted operators to the front in pairs (note that the summation index $m$ is changed to $l$ after the contraction). The remaining operators can be contracted in four ways:

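\[
\langle 0|\,p^\dagger q^\dagger s r\, c^\dagger d^\dagger k l + p^\dagger q^\dagger s r\, c^\dagger d^\dagger k l + p^\dagger q^\dagger s r\, c^\dagger d^\dagger k l + p^\dagger q^\dagger s r\, c^\dagger d^\dagger k l\,|0\rangle
\]

(the four terms differ only in which pairs of operators are contracted), so that the first $Q_e$ term becomes

\[
-\frac{1}{2}\sum_{klcd}\langle kl\|cd\rangle\; t^{ab}_{ik}\, t^{cd}_{jl}
\]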

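Similarly, for the second term of $Q_e$ we get

\[
-\frac{1}{2}\sum_{klcd}\langle kl\|cd\rangle\; t^{cd}_{ik}\, t^{ab}_{jl}
\]

and hence we get

\[
Q_e = -\frac{1}{2}\sum_{klcd}\langle kl\|cd\rangle\,\bigl(t^{cd}_{ik}\,t^{ab}_{jl} + t^{ab}_{ik}\,t^{cd}_{jl}\bigr)
\]

Using the same procedure for case (d), we get

\[
Q_f = -\frac{1}{2}\sum_{klcd}\langle kl\|cd\rangle\,\bigl(t^{ac}_{ij}\,t^{bd}_{kl} + t^{bd}_{ij}\,t^{ac}_{kl}\bigr)
\]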

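When all the $Q$ contributions are put together, we get

\[
\begin{aligned}
Q ={}& \frac{1}{8}\sum_{klcd}\langle kl\|cd\rangle\, t^{cd}_{kl}\, t^{ab}_{ij}
 + \frac{1}{8}\sum_{klcd}\langle kl\|cd\rangle\, t^{cd}_{kl}\, t^{ab}_{ij}
 + \frac{1}{4}\sum_{klcd}\langle kl\|cd\rangle\, t^{cd}_{ij}\, t^{ab}_{kl} \\
&+ \sum_{klcd}\langle kl\|cd\rangle\,\bigl(t^{ac}_{ik}\,t^{bd}_{jl} + t^{bd}_{ik}\,t^{ac}_{jl}\bigr)
 - \frac{1}{2}\sum_{klcd}\langle kl\|cd\rangle\,\bigl(t^{cd}_{ik}\,t^{ab}_{jl} + t^{ab}_{ik}\,t^{cd}_{jl}\bigr) \\
&- \frac{1}{2}\sum_{klcd}\langle kl\|cd\rangle\,\bigl(t^{ac}_{ij}\,t^{bd}_{kl} + t^{bd}_{ij}\,t^{ac}_{kl}\bigr)
\end{aligned}
\]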

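Substituting all of these contributions into Eq. (168) and canceling terms, we get the CCD amplitude equation

\[
\begin{aligned}
\varepsilon^{ab}_{ij}\, t^{ab}_{ij} ={}& \langle ab\|ij\rangle
 + \frac{1}{2}\sum_{cd}\langle ab\|cd\rangle\, t^{cd}_{ij}
 + \frac{1}{2}\sum_{kl}\langle kl\|ij\rangle\, t^{ab}_{kl} \\
&- \sum_{kc}\bigl(\langle bk\|cj\rangle\, t^{ac}_{ik} - \langle bk\|ci\rangle\, t^{ac}_{jk}
 - \langle ak\|cj\rangle\, t^{bc}_{ik} + \langle ak\|ci\rangle\, t^{bc}_{jk}\bigr) \\
&+ \frac{1}{4}\sum_{klcd}\langle kl\|cd\rangle\, t^{cd}_{ij}\, t^{ab}_{kl}
 + \sum_{klcd}\langle kl\|cd\rangle\,\bigl(t^{ac}_{ik}\,t^{bd}_{jl} + t^{bd}_{ik}\,t^{ac}_{jl}\bigr) \\
&- \frac{1}{2}\sum_{klcd}\langle kl\|cd\rangle\,\bigl(t^{cd}_{ik}\,t^{ab}_{jl} + t^{ab}_{ik}\,t^{cd}_{jl}\bigr)
 - \frac{1}{2}\sum_{klcd}\langle kl\|cd\rangle\,\bigl(t^{ac}_{ij}\,t^{bd}_{kl} + t^{bd}_{ij}\,t^{ac}_{kl}\bigr)
\end{aligned}
\]

To solve this equation, we write it compactly as

\[
\varepsilon^{ab}_{ij}\, t^{ab}_{ij} = \langle ab\|ij\rangle + L(t) + Q(tt)
\]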

where $L(t)$ and $Q(tt)$ denote the terms linear and quadratic in the amplitudes, respectively. The question now is how to solve for the amplitudes $t$. Here we will use an iterative method. First we set $L(t)$ and $Q(tt)$ equal to zero. Thus the first approximation to $t^{ab}_{ij}$ is

\[
t^{ab}_{ij} = \frac{\langle ab\|ij\rangle}{\varepsilon^{ab}_{ij}}
\]

This gives an estimate of each amplitude. These approximate values are then substituted back into the right-hand side to obtain improved amplitudes from the left-hand side, and so forth, until the iterative process reaches self-consistency and yields the CC wave function for the ground state of the system. A more efficient approach takes the initial amplitudes from a short CI expansion, with subsequent linearization of the terms containing the initial (known) amplitudes.
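The substitution scheme described above can be illustrated on a toy problem. The following is only a minimal sketch, not a CCD implementation: the amplitudes are flattened into a short vector, and the arrays `EPS`, `A`, `LIN` and `QUAD` are hypothetical numbers standing in for the denominators $\varepsilon^{ab}_{ij}$, the integrals $\langle ab\|ij\rangle$, and the coefficients of $L(t)$ and $Q(tt)$.

```python
def iterate_amplitudes(eps, a, lin, quad, tol=1e-12, max_iter=500):
    """Solve eps[p]*t[p] = a[p] + sum_q lin[p][q]*t[q]
    + sum_{q,r} quad[p][q][r]*t[q]*t[r] by repeated substitution."""
    n = len(a)
    # First approximation: L(t) and Q(tt) set to zero.
    t = [a[p] / eps[p] for p in range(n)]
    for iteration in range(1, max_iter + 1):
        new_t = []
        for p in range(n):
            rhs = a[p]
            for q in range(n):
                rhs += lin[p][q] * t[q]
                for r in range(n):
                    rhs += quad[p][q][r] * t[q] * t[r]
            new_t.append(rhs / eps[p])
        change = max(abs(x - y) for x, y in zip(new_t, t))
        t = new_t
        if change < tol:
            return t, iteration
    raise RuntimeError("amplitude iteration did not converge")


def residual(eps, a, lin, quad, t):
    """Left-hand side minus right-hand side of the amplitude equation."""
    n = len(a)
    out = []
    for p in range(n):
        rhs = a[p]
        for q in range(n):
            rhs += lin[p][q] * t[q]
            for r in range(n):
                rhs += quad[p][q][r] * t[q] * t[r]
        out.append(eps[p] * t[p] - rhs)
    return out


# Hypothetical two-amplitude model (all numbers are illustrative only).
EPS = [-2.0, -3.0]
A = [0.10, 0.20]
LIN = [[0.00, 0.05], [0.05, 0.00]]
QUAD = [[[0.02, 0.01], [0.01, 0.02]], [[0.01, 0.02], [0.02, 0.01]]]

amplitudes, n_iterations = iterate_amplitudes(EPS, A, LIN, QUAD)
```

The iteration converges quickly here because the couplings are small compared with the denominators; for nearly degenerate denominators the same scheme may converge slowly or diverge.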

D. Equivalence of CC and MBPT theory

Here we will show that the CC form of the wave function can be derived from the infinite-order MBPT wave function. The total wave function in MBPT can be written as

\[
\Psi_{\mathrm{MBPT}} = \sum_{n=0}^{\infty}(R_0 V_N)^n|0\rangle = \Phi_0 + \Psi^{(1)} + \Psi^{(2)} + \cdots
\]

where the superscripts indicate the order in VN and where VN = F0N +W

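The algebraic expressions for the individual orders of $\Psi$ can be written as

\[
\Psi^{(1)} = \frac{1}{4}\sum_{abij}\frac{\langle ab\|ij\rangle}{\varepsilon^{ab}_{ij}}\,\Phi^{ab}_{ij}
 + \sum_{ai}\frac{\langle a|f|i\rangle}{\varepsilon^{a}_{i}}\,\Phi^{a}_{i}
\]

\[
\begin{aligned}
\Psi^{(2)} ={}& \frac{1}{8}\sum_{abcdij}\frac{\langle ab\|cd\rangle\langle cd\|ij\rangle}{\varepsilon^{ab}_{ij}\,\varepsilon^{cd}_{ij}}\,\Phi^{ab}_{ij}
 - \sum_{abcijk}\frac{\langle ak\|cj\rangle\langle cb\|ik\rangle}{\varepsilon^{ab}_{ij}\,\varepsilon^{bc}_{ik}}\,\Phi^{ab}_{ij}
 + \frac{1}{2}\sum_{abcij}\frac{\langle aj\|cb\rangle\langle cb\|ij\rangle}{\varepsilon^{cb}_{ij}\,\varepsilon^{a}_{i}}\,\Phi^{a}_{i} \\
&+ \frac{1}{4}\sum_{abcdijk}\frac{\langle ab\|dk\rangle\langle ad\|ij\rangle}{\varepsilon^{abc}_{ijk}\,\varepsilon^{ad}_{ij}}\,\Phi^{abc}_{ijk}
 - \frac{1}{4}\sum_{abcijkl}\frac{\langle lc\|jk\rangle\langle ab\|il\rangle}{\varepsilon^{abc}_{ijk}\,\varepsilon^{ab}_{il}}\,\Phi^{abc}_{ijk} \\
&+ \frac{1}{16}\sum_{abcdijkl}\frac{\langle cd\|kl\rangle\langle ab\|ij\rangle}{\varepsilon^{abcd}_{ijkl}\,\varepsilon^{ab}_{ij}}\,\Phi^{abcd}_{ijkl} + \cdots
\end{aligned}
\]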

$\Psi^{(1)}$, $\Psi^{(2)}$, etc. contain expressions corresponding to both connected and disconnected diagrams. Let $r = 1$ label the class of all expressions corresponding to connected wave-function diagrams. We can represent this class by the connected operator $T$,

\[
T|0\rangle = \sum_{n=1}^{\infty}(R_0 V_N)^n|0\rangle_C
\]

where

\[
T = \sum_{m=1}^{N} T_m
\]

In perturbation theory, each $T_m$ collects the connected $m$-fold excitation contributions from all orders, so we can write

\[
T_m = \sum_{n=1}^{\infty} T_m^{(n)}
\]

\[
T_m^{(n)}|0\rangle = (R_0 V_N)^n|0\rangle_{C,m}
\]

and hence the corresponding expansion for the amplitudes is

\[
t^{abc\ldots}_{ijk\ldots} = \sum_{n=1}^{\infty} t^{abc\ldots(n)}_{ijk\ldots}
\]

As an example, the expression corresponding to $T_2^{(1)}$ is

\[
T_2^{(1)}|0\rangle = \frac{1}{4}\sum_{abij}\frac{\langle ab\|ij\rangle}{\varepsilon^{ab}_{ij}}\,\Phi^{ab}_{ij}
\]

where

\[
t^{ab(1)}_{ij} = \frac{\langle ab\|ij\rangle}{\varepsilon^{ab}_{ij}}
\]

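The first few terms in the expansions of $T_1$ and $T_2$ in terms of MBPT expressions are

\[
\begin{aligned}
T_1|0\rangle &= \bigl(T_1^{(1)} + T_1^{(2)} + T_1^{(3)} + \cdots\bigr)|0\rangle \\
&= \Bigl(\sum_{ai}\frac{\langle a|f|i\rangle}{\varepsilon^{a}_{i}}\,a^\dagger i
 + \frac{1}{2}\sum_{abcij}\frac{\langle aj\|cb\rangle\langle cb\|ij\rangle}{\varepsilon^{cb}_{ij}\,\varepsilon^{a}_{i}}\,a^\dagger i
 - \frac{1}{2}\sum_{abcij}\frac{\cdots}{\varepsilon^{a}_{i}}\,a^\dagger i + \cdots\Bigr)|0\rangle
\end{aligned}
\]

\[
\begin{aligned}
T_2|0\rangle &= \bigl(T_2^{(1)} + T_2^{(2)} + T_2^{(3)} + \cdots\bigr)|0\rangle \\
&= \Bigl(\frac{1}{4}\sum_{abij}\frac{\langle ab\|ij\rangle}{\varepsilon^{ab}_{ij}}\,a^\dagger b^\dagger j i
 + \frac{1}{8}\sum_{abcdij}\frac{\langle ab\|cd\rangle\langle cd\|ij\rangle}{\varepsilon^{ab}_{ij}\,\varepsilon^{cd}_{ij}}\,a^\dagger b^\dagger j i
 - \sum_{abcijk}\frac{\langle ak\|cj\rangle\langle cb\|ik\rangle}{\varepsilon^{ab}_{ij}\,\varepsilon^{bc}_{ik}}\,a^\dagger b^\dagger j i + \cdots\Bigr)|0\rangle
\end{aligned}
\]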

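Similar expressions can be written for $T_3$ and the higher cluster operators.

Now we will show that the $r = 2$ expressions of MBPT (those with two disconnected parts) correspond to the quadratic terms in CC theory. We illustrate this for $m = 2$, $r = 2$, with each factor taken in first order. The last term of $\Psi^{(2)}$ can be symmetrized with respect to the interchange $(ab,ij)\leftrightarrow(cd,kl)$ of the summation indices:

\[
\begin{aligned}
&\frac{1}{16}\sum_{abcdijkl}\frac{\langle cd\|kl\rangle\langle ab\|ij\rangle}{\varepsilon^{abcd}_{ijkl}\,\varepsilon^{ab}_{ij}}\;a^\dagger b^\dagger j i\,c^\dagger d^\dagger l k \\
&= \frac{1}{2}\,\frac{1}{16}\sum_{abcdijkl}\langle cd\|kl\rangle\langle ab\|ij\rangle
 \Bigl(\frac{1}{\varepsilon^{abcd}_{ijkl}\,\varepsilon^{ab}_{ij}} + \frac{1}{\varepsilon^{abcd}_{ijkl}\,\varepsilon^{cd}_{kl}}\Bigr)\,a^\dagger b^\dagger j i\,c^\dagger d^\dagger l k \\
&= \frac{1}{2}\,\frac{1}{16}\sum_{abcdijkl}\langle cd\|kl\rangle\langle ab\|ij\rangle
 \Bigl(\frac{1}{(\varepsilon^{ab}_{ij}+\varepsilon^{cd}_{kl})\,\varepsilon^{ab}_{ij}} + \frac{1}{(\varepsilon^{ab}_{ij}+\varepsilon^{cd}_{kl})\,\varepsilon^{cd}_{kl}}\Bigr)\,a^\dagger b^\dagger j i\,c^\dagger d^\dagger l k
\end{aligned}
\]

After simplification, using $\frac{1}{(x+y)x} + \frac{1}{(x+y)y} = \frac{1}{xy}$, we get

\[
\begin{aligned}
&= \frac{1}{2}\,\frac{1}{16}\sum_{abcdijkl}\frac{\langle cd\|kl\rangle\langle ab\|ij\rangle}{\varepsilon^{cd}_{kl}\,\varepsilon^{ab}_{ij}}\;a^\dagger b^\dagger j i\,c^\dagger d^\dagger l k \\
&= \frac{1}{2}\bigl(T_2^{(1)}\bigr)^2
\end{aligned}
\]

If we collect terms in a similar fashion at all orders, we eventually recover all the terms of the exponential in CC theory, i.e.

\[
\Psi_{\mathrm{MBPT}} = e^{T}|0\rangle
\]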
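The simplification step in the factorization above rests on a purely algebraic identity for the energy denominators. The following is a small sketch that checks this identity in exact rational arithmetic; the sample values of $x = \varepsilon^{ab}_{ij}$ and $y = \varepsilon^{cd}_{kl}$ are arbitrary, hypothetical numbers.

```python
from fractions import Fraction


def symmetrized_denominator(x, y):
    """(1/2) * [1/((x + y) x) + 1/((x + y) y)]: average over both orderings."""
    return Fraction(1, 2) * (1 / ((x + y) * x) + 1 / ((x + y) * y))


def factorized_denominator(x, y):
    """(1/2) * 1/(x y): product of two independent first-order denominators."""
    return Fraction(1, 2) / (x * y)


# Hypothetical denominators (negative, as for occupied-minus-virtual energies).
SAMPLES = [
    (Fraction(-3, 2), Fraction(-5, 4)),
    (Fraction(-7, 3), Fraction(-1, 9)),
    (Fraction(-2, 1), Fraction(-2, 1)),
]

CHECKS = [symmetrized_denominator(x, y) == factorized_denominator(x, y)
          for x, y in SAMPLES]
```

Each entry of `CHECKS` compares the two sides exactly, so no floating-point tolerance is involved.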

E. Noniterative triple excitations correction

The CCSDT (coupled cluster singles, doubles, and triples) approximation is more accurate than CCSD, but its cost scales as $N^8$ and it is therefore computationally very expensive. To reduce the cost, MBPT has been used to estimate the effect of triple excitations instead of carrying out a full CCSDT calculation; the resulting method is the well-known CCSD(T) approach. We will now show how MBPT can be used to obtain the (T) correction for connected triple excitations.

We can decompose the normal-ordered Hamiltonian $H_N$ as follows

\[
H_N = H_0 + H_1 = F_N + V_N
\]

where the zeroth-order component of the Hamiltonian is taken to be the Fock operator, so that the perturbation is the remaining two-electron operator $V_N$. We can also decompose the cluster operators by perturbation order, as in the previous section,

\[
T_m = T_m^{(1)} + T_m^{(2)} + T_m^{(3)} + \cdots \tag{169}
\]

We define the similarity-transformed Hamiltonian as

\[
\bar H = e^{-T} H_N\, e^{T}
\]

For $T = T_1 + T_2 + T_3$, the above equation takes the form

\[
\bar H = \Bigl(H_N + H_N T_1 + H_N T_2 + H_N T_3 + \frac{1}{2}H_N T_1^2 + \frac{1}{2}H_N T_2^2 + \frac{1}{2}H_N T_3^2 + H_N T_1 T_2 + H_N T_1 T_3 + \cdots\Bigr)_C
\]

The proof of this relation is left as a homework problem. In the previous section we proved the equivalence between CC theory and MBPT in terms of the wave function; the equivalence in terms of the energy can be shown in a similar way. The CCSD energy contains contributions identical to those of the MBPT(2) and MBPT(3) energies, but lacks the triple-excitation contribution necessary for MBPT(4). Thus a natural approach to the "triples problem" is to correct the CCSD energy for the missing MBPT(4) terms using the CCSDT similarity-transformed Hamiltonian,

\[
\bar H = e^{-T_1-T_2-T_3}\, H_N\, e^{T_1+T_2+T_3}
\]

Inserting the expansion (169) for $m = 1, 2, 3$ into this expression, we get

\[
\bar H = \bar H^{(0)} + \bar H^{(1)} + \bar H^{(2)} + \cdots
\]

Here we are interested in calculating $E^{(4)}$, so we need $\bar H^{(4)}$, whose relevant part is

\[
\bar H^{(4)} = \bigl(V_N T_2^{(3)}\bigr)_C
\]

so

\[
E^{(4)} = \langle 0|\bigl(V_N T_2^{(3)}\bigr)_C|0\rangle
\]

Inserting the operators $V_N$ and $T_2^{(3)}$ into the above equation and using Wick's theorem, we get

\[
E^{(4)} = \frac{1}{4}\sum_{ijab}\langle ij\|ab\rangle\, t^{ab(3)}_{ij} \tag{170}
\]

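In the above equation, $t^{ab(3)}_{ij}$ is not known and needs to be calculated from

\[
0 = \langle\Phi^{ab}_{ij}|\bar H^{(3)}|0\rangle
\]

\[
0 = \langle\Phi^{ab}_{ij}|\Bigl(F_N T_2^{(3)} + V_N T_1^{(2)} + V_N T_2^{(2)} + V_N T_3^{(2)} + \frac{1}{2}V_N\bigl(T_2^{(1)}\bigr)^2\Bigr)|0\rangle
\]

This equation contains $T_3^{(2)}$ and hence involves the amplitudes $t^{abc(2)}_{ijk}$. To find these amplitudes, we use

\[
0 = \langle\Phi^{abc}_{ijk}|\bar H^{(2)}|0\rangle
\]

\[
0 = \langle\Phi^{abc}_{ijk}|\Bigl(V_N T_2^{(1)} + F_N T_3^{(2)} + \frac{1}{2}V_N\bigl(T_2^{(1)}\bigr)^2\Bigr)|0\rangle
\]

After inserting the operators and using Wick's theorem, we get

\[
\varepsilon^{abc}_{ijk}\, t^{abc(2)}_{ijk} = P(k|ij)P(a|bc)\sum_{d}\langle bc\|dk\rangle\, t^{ad(1)}_{ij} - P(i|jk)P(c|ab)\sum_{l}\langle lc\|jk\rangle\, t^{ab(1)}_{il} \tag{171}
\]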

where the permutation operators $P(p|qr)$ perform antisymmetric permutations of index $p$ with indices $q$ and $r$. These $T_3^{(2)}$ amplitudes may then be used to compute the $T_2^{(3)}$ amplitudes, which may in turn be used in equation (170) to compute the triple-excitation contribution to the fourth-order energy, $E_T^{(4)}$. The corrected CCSD energy is

\[
E_{\mathrm{CCSD+T(4)}} = E_{\mathrm{CCSD}} + E_T^{(4)}
\]

and the approach is referred to as the CCSD+T(4) method. If, on the other hand, one chooses to use the converged CCSD $T_2$ amplitudes rather than the first-order $T_2$ amplitudes in equation (171), one obtains a different correction, which is called CCSD+T(CCSD) or CCSD[T],

\[
E_{\mathrm{CCSD+T(CCSD)}} = E_{\mathrm{CCSD}} + E_T^{[4]}
\]

This approach is reported to give quantitatively incorrect predictions of molecular properties for some systems. In 1989 a similar analysis was developed by Raghavachari et al., who determined that a fifth-order energy contribution involving single excitations, denoted $E_{ST}^{[5]}$, should be included in the CCSD correction as well. This component may be derived from the second-order $T_3$ contribution to the third-order $T_1$ operator, which subsequently contributes to fourth-order $T_2$. Although the diagrammatic techniques described above are particularly convenient for deriving $E_{ST}^{[5]}$, here we will simply present the final equation

\[
E_{ST}^{[5]} = \frac{1}{4}\sum_{ijkabc}\langle jk\|bc\rangle\, t^{a}_{i}\, t^{abc}_{ijk}
\]

where the triple-excitation amplitudes are determined using a modified form of Eq. (171) that includes converged $T_2$ amplitudes

\[
\varepsilon^{abc}_{ijk}\, t^{abc(2)}_{ijk} = P(i|jk)P(a|bc)\Bigl[\sum_{d}\langle bc\|di\rangle\, t^{ad}_{jk} - \sum_{l}\langle la\|jk\rangle\, t^{bc}_{il}\Bigr]
\]

Hence, the total CCSD(T) energy may be succinctly written as

\[
E_{\mathrm{CCSD(T)}} = E_{\mathrm{CCSD}} + E_T^{[4]} + E_{ST}^{[5]}
\]

This method of energy calculation is called the CCSD(T) approach and is regarded as the "gold standard" of quantum chemistry.

A second method for solving the amplitude equations is the multivariable Newton-Raphson method. The amplitude equations are nonlinear. We can write them in matrix form (by defining $t^{ab}_{ij}$ as the $(ij,ab)$ element of the column vector $t$) as

\[
0 = a + bt + ctt \tag{172}
\]

where $a = \langle ab\|ij\rangle$. The solution of these nonlinear algebraic equations poses a substantial difficulty in implementing coupled cluster theory.

To solve the nonlinear amplitude equations, we choose $t$ such that the vector $f(t)$ defined as

\[
f(t) \equiv a + bt + ctt \tag{173}
\]

becomes equal to zero. This is done by expanding $f(t)$ about a point $t_0$. Keeping only the linear terms in this Taylor expansion and setting $f(t)$ equal to zero, one obtains equations for the changes $\Delta t$ in the amplitudes, which can be expressed as

\[
f^{ab}_{ij}(t) = 0 = f^{ab}_{ij}(t_0) + \sum_{klcd}\Bigl(\frac{\partial f^{ab}_{ij}}{\partial t^{cd}_{kl}}\Bigr)_{t_0}\Delta t^{cd}_{kl} \tag{174}
\]

The step lengths (corrections to $t_0$) can be obtained by solving the above set of linear equations and are then used to update the amplitudes

\[
t = t_0 + \Delta t
\]

These values of $t$ can then be used as a new $t_0$ vector for the next application of Eq. (174). This multidimensional Newton-Raphson procedure, which involves the solution of a large number of coupled linear equations, is repeated until the $\Delta t$ values are sufficiently small (convergence). Although the first applications of the coupled cluster method to quantum chemistry did employ this Newton-Raphson scheme, the numerical problems involved in solving the large multivariable inhomogeneous equations (174) have led more recent workers to use the perturbative techniques discussed already.
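The Newton-Raphson update of Eqs. (172)-(174) can be sketched for a small model system. This is a minimal illustration under stated assumptions, not production coupled cluster code: the amplitudes form a short vector, the arrays `A_VEC`, `B_MAT` and `C_TEN` are hypothetical stand-ins for $a$, $b$ and $c$, and the linear equations are solved by plain Gaussian elimination.

```python
def f_vec(a, b, c, t):
    """f(t) = a + b t + c t t for a flattened amplitude vector t."""
    n = len(a)
    return [a[p]
            + sum(b[p][q] * t[q] for q in range(n))
            + sum(c[p][q][r] * t[q] * t[r] for q in range(n) for r in range(n))
            for p in range(n)]


def jacobian(b, c, t):
    """df_p/dt_q = b[p][q] + sum_r (c[p][q][r] + c[p][r][q]) t[r]."""
    n = len(t)
    return [[b[p][q] + sum((c[p][q][r] + c[p][r][q]) * t[r] for r in range(n))
             for q in range(n)] for p in range(n)]


def solve_linear(mat, rhs):
    """Solve mat x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(mat[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - known) / aug[row][row]
    return x


def newton_raphson(a, b, c, t0, tol=1e-12, max_iter=50):
    """Repeat t <- t0 + delta_t, with delta_t from the linearized Eq. (174)."""
    t = list(t0)
    for iteration in range(1, max_iter + 1):
        f0 = f_vec(a, b, c, t)
        delta = solve_linear(jacobian(b, c, t), [-value for value in f0])
        t = [ti + di for ti, di in zip(t, delta)]
        if max(abs(d) for d in delta) < tol:
            return t, iteration
    raise RuntimeError("Newton-Raphson iteration did not converge")


# Hypothetical two-amplitude model (illustrative numbers only).
A_VEC = [0.10, 0.20]
B_MAT = [[2.0, -0.05], [-0.05, 3.0]]
C_TEN = [[[0.02, 0.01], [0.01, 0.02]], [[0.01, 0.02], [0.02, 0.01]]]

t_solution, newton_steps = newton_raphson(A_VEC, B_MAT, C_TEN, [0.0, 0.0])
```

Each step requires the Jacobian and a linear solve, which is what becomes costly when the number of amplitudes is large.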

Other methods have also been devised to solve the nonlinear equations of coupled cluster theory. One such method within the perturbative framework is the reduced linear equations technique developed by Purvis and Bartlett. Although this method was designed to solve large systems of linear equations efficiently, it can also be used for the nonlinear coupled cluster equations by assuming an approximate linearization of the nonlinear terms.

F. Full triple and higher excitations

Owing to the ongoing growth in computational resources, it is nowadays often possible to perform full coupled cluster singles, doubles, and triples (CCSDT) calculations in cases that demand very high accuracy. This method was first formulated in 1987. The complete inclusion of $T_3$ makes the method considerably more demanding, with a cost that scales as $N^8$. However, in many cases it is necessary to go beyond this level and to include correlation effects beyond CCSDT. The next method is CC singles, doubles, triples, and quadruples (CCSDTQ), which is very expensive, with a computational scaling of $N^{10}$. It is therefore of interest to explore methods intermediate between CCSDT and CCSDTQ with a reduced scaling of the cost. The CC wave function including quadruple excitations is given by $\Psi_{\mathrm{CCSDTQ}} = \exp(T_1 + T_2 + T_3 + T_4)|0\rangle$.

X. LINEAR RESPONSE THEORY

Linear response theory is used in situations where a system of electrons is subject to a small perturbation, for example an electric or magnetic field from a probe in an experiment. The response properties of a system determine its screening (dielectric) properties and can be used to study its excited states. The density-density response function determines the second-order dispersion energy in symmetry-adapted perturbation theory (SAPT) and the exact exchange-correlation energy $E_{xc}$ of density functional theory.

A. Response function

Let us consider a system in the ground state $|\Psi_0\rangle$ of a Hamiltonian $H_0$, i.e., $H_0|\Psi_0\rangle = E_0|\Psi_0\rangle$. Let $A$ be an observable of the system; its expectation value is $A_0 = \langle\Psi_0|A|\Psi_0\rangle$. The time evolution operator in this situation is $U(t,t_0) = e^{-i(t-t_0)H_0}$, and therefore the wave function only acquires the time-dependent phase $e^{-i(t-t_0)E_0}$. Let the system be subject to a perturbation $H_1(t) = F(t)B$, where $F(t)$ is a time-dependent field coupled to an observable $B$ of the system. The total Hamiltonian becomes $H(t) = H_0 + H_1(t)$. As a consequence, the expectation value of $A$ in general also becomes time dependent, i.e.,

\[
A(t) = \langle\Psi(t)|A|\Psi(t)\rangle, \qquad t > t_0 . \tag{175}
\]

The difference $A(t) - A_0$ is called the response of $A$ to the perturbation $H_1(t)$. The response can in general be written as

\[
A(t) - A_0 = A_1(t) + A_2(t) + A_3(t) + \cdots , \tag{176}
\]

where $A_1(t)$ is the change that is first order (linear) in the perturbation $H_1(t)$, $A_2(t)$ is second order (quadratic), and so on. We will limit the discussion to the linear response $A_1(t)$. The time evolution operator can now be written as $U(t,t_0) = e^{-i(t-t_0)H_0}\,U_1(t,t_0)$. The state of the system after the application of the perturbation evolves as

\[
|\Psi(t)\rangle = U(t,t_0)|\Psi_0\rangle = e^{-i(t-t_0)H_0}\,U_1(t,t_0)|\Psi_0\rangle . \tag{177}
\]

The time-dependent Schrödinger equation gives

\[
i\frac{\partial|\Psi(t)\rangle}{\partial t} = [H_0 + H_1(t)]\,|\Psi(t)\rangle \tag{178}
\]

\[
i\frac{\partial U_1(t,t_0)}{\partial t} = e^{i(t-t_0)H_0}\,H_1(t)\,e^{-i(t-t_0)H_0}\,U_1(t,t_0) \tag{179}
\]

\[
U_1(t,t_0) = U_1(t_0,t_0) - i\int_{t_0}^{t}dt'\,e^{i(t'-t_0)H_0}\,H_1(t')\,e^{-i(t'-t_0)H_0}\,U_1(t',t_0) \tag{180}
\]

Since the perturbation is turned on at $t = t_0$, we have $U_1(t_0,t_0) = 1$. Therefore,

\[
U_1(t,t_0) = 1 - i\int_{t_0}^{t}dt'\,e^{i(t'-t_0)H_0}\,H_1(t')\,e^{-i(t'-t_0)H_0}\,U_1(t',t_0) \tag{181}
\]

This integral equation can be solved iteratively. The zeroth-order solution is $U_1^{(0)}(t,t_0) = 1$; thus, the first-order solution is

\[
U_1^{(1)}(t,t_0) = 1 - i\int_{t_0}^{t}dt'\,e^{i(t'-t_0)H_0}\,H_1(t')\,e^{-i(t'-t_0)H_0} . \tag{182}
\]

The time evolution operator to first order in the perturbation is

\[
U^{(1)}(t,t_0) = e^{-i(t-t_0)H_0}\Bigl[1 - i\int_{t_0}^{t}dt'\,e^{i(t'-t_0)H_0}\,H_1(t')\,e^{-i(t'-t_0)H_0}\Bigr] . \tag{183}
\]

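The linear response can then be written as

\[
A_1(t) = A(t) - A_0 = \langle\Psi(t)|A|\Psi(t)\rangle - \langle\Psi_0|A|\Psi_0\rangle
= \langle\Psi_0|U^{(1)\dagger}(t,t_0)\,A\,U^{(1)}(t,t_0)|\Psi_0\rangle - \langle\Psi_0|A|\Psi_0\rangle \tag{184}
\]

Consider, with $H_1(t') = F(t')B$ and the notation $O(\tau) = e^{i\tau H_0}\,O\,e^{-i\tau H_0}$ for $O = A, B$,

\[
\begin{aligned}
&U^{(1)\dagger}(t,t_0)\,A\,U^{(1)}(t,t_0) \\
&= \Bigl[1 + i\int_{t_0}^{t}dt'\,F(t')B(t'-t_0)\Bigr]e^{+i(t-t_0)H_0}A\,e^{-i(t-t_0)H_0}\Bigl[1 - i\int_{t_0}^{t}dt'\,F(t')B(t'-t_0)\Bigr] \\
&= \Bigl[1 + i\int_{t_0}^{t}dt'\,F(t')B(t'-t_0)\Bigr]A(t-t_0)\Bigl[1 - i\int_{t_0}^{t}dt'\,F(t')B(t'-t_0)\Bigr] \\
&= \Bigl[1 + i\int_{t_0}^{t}dt'\,F(t')B(t'-t_0)\Bigr]\Bigl[A(t-t_0) - i\int_{t_0}^{t}dt'\,F(t')A(t-t_0)B(t'-t_0)\Bigr] \\
&= A(t-t_0) - i\int_{t_0}^{t}dt'\,F(t')A(t-t_0)B(t'-t_0) + i\int_{t_0}^{t}dt'\,F(t')B(t'-t_0)A(t-t_0) + O(F^2) \\
&= A(t-t_0) - i\int_{t_0}^{t}dt'\,F(t')\bigl\{A(t-t_0)B(t'-t_0) - B(t'-t_0)A(t-t_0)\bigr\} + O(F^2) \\
&= A(t-t_0) - i\int_{t_0}^{t}dt'\,F(t')\bigl[A(t-t_0), B(t'-t_0)\bigr] + O(F^2),
\end{aligned} \tag{185}
\]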

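where $\bigl[A(t-t_0), B(t'-t_0)\bigr]$ is a commutator. Using Eq. (185) in Eq. (184) and keeping terms up to first order in the field $F$, we get

\[
\begin{aligned}
A_1(t) &= \langle\Psi_0|A|\Psi_0\rangle - i\int_{t_0}^{t}dt'\,F(t')\langle\Psi_0|\bigl[A(t-t_0), B(t'-t_0)\bigr]|\Psi_0\rangle - \langle\Psi_0|A|\Psi_0\rangle \\
&= -i\int_{t_0}^{t}dt'\,F(t')\langle\Psi_0|\bigl[A(t-t_0), B(t'-t_0)\bigr]|\Psi_0\rangle
\end{aligned} \tag{186}
\]

Now consider

\[
\begin{aligned}
\bigl[A(t-t_0), B(t'-t_0)\bigr] &= A(t-t_0)B(t'-t_0) - B(t'-t_0)A(t-t_0) \\
&= e^{i(t-t_0)H_0}A\,e^{-i(t-t')H_0}B\,e^{-i(t'-t_0)H_0} - e^{i(t'-t_0)H_0}B\,e^{i(t-t')H_0}A\,e^{-i(t-t_0)H_0} \\
&= e^{i(t'-t_0)H_0}\bigl[A(t-t'), B\bigr]e^{-i(t'-t_0)H_0},
\end{aligned} \tag{187}
\]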

where $A(t-t') = e^{i(t-t')H_0}A\,e^{-i(t-t')H_0}$. The operator $e^{-i(t'-t_0)H_0}$ is unitary and $|\Psi_0\rangle$ is an eigenstate of $H_0$, so this transformation only contributes phase factors that cancel in the expectation value:

\[
\begin{aligned}
\langle\Psi_0|\bigl[A(t-t_0), B(t'-t_0)\bigr]|\Psi_0\rangle
&= \langle\Psi_0|e^{i(t'-t_0)H_0}\bigl[A(t-t'), B\bigr]e^{-i(t'-t_0)H_0}|\Psi_0\rangle \\
&= \langle\Psi_0|e^{i(t'-t_0)E_0}\bigl[A(t-t'), B\bigr]e^{-i(t'-t_0)E_0}|\Psi_0\rangle \\
&= \langle\Psi_0|\bigl[A(t-t'), B\bigr]|\Psi_0\rangle .
\end{aligned} \tag{188}
\]

Thus the linear response from Eq. (186), using Eq. (188), is

\[
A_1(t) = -i\int_{t_0}^{t}dt'\,F(t')\langle\Psi_0|\bigl[A(t-t'), B\bigr]|\Psi_0\rangle \tag{189}
\]

The response function is defined as

\[
\chi_{AB}(t-t') \equiv -i\,\Theta(t-t')\langle\Psi_0|\bigl[A(t-t'), B\bigr]|\Psi_0\rangle, \tag{190}
\]

where $\Theta(t-t')$ is the step function, equal to 1 for $t \ge t'$ and zero otherwise. This ensures causality and allows us to replace the upper limit of integration by $\infty$. Thus the linear response of $A$ in terms of the response function is

\[
A_1(t) = \int_{-\infty}^{\infty}dt'\,F(t')\,\chi_{AB}(t-t') \tag{191}
\]

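where the lower limit has been extended to $-\infty$ because the field $F(t)$ is zero for all times before $t_0$. Let us write Eq. (191) in Fourier space,

\[
\begin{aligned}
\frac{1}{2\pi}\int_{-\infty}^{\infty}d\omega\,A_1(\omega)e^{-i\omega t}
&= \frac{1}{(2\pi)^2}\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}dt'\,d\omega\,d\omega'\,F(\omega)\chi_{AB}(\omega')\,e^{-i\omega t'}e^{-i(t-t')\omega'} \\
&= \frac{1}{(2\pi)^2}\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}dt'\,d\omega\,d\omega'\,F(\omega)\chi_{AB}(\omega')\,e^{-i(\omega-\omega')t'}e^{-i\omega' t} \\
&= \frac{1}{2\pi}\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}d\omega\,d\omega'\,F(\omega)\chi_{AB}(\omega')\,\delta(\omega-\omega')\,e^{-i\omega' t} \\
&= \frac{1}{2\pi}\int_{-\infty}^{\infty}d\omega\,F(\omega)\chi_{AB}(\omega)\,e^{-i\omega t},
\end{aligned} \tag{192}
\]

where we used $2\pi\delta(\omega-\omega') = \int_{-\infty}^{\infty}dt'\,e^{-i(\omega-\omega')t'}$. The relation in Eq. (192) holds for any value of $t$, so we can write

\[
A_1(\omega) = F(\omega)\,\chi_{AB}(\omega) \tag{193}
\]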

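The Fourier transform of Eq. (190) gives the frequency-dependent response function,

\[
\begin{aligned}
\chi_{AB}(\omega) &= \int_{-\infty}^{\infty}d\tau\,\chi_{AB}(\tau)e^{i\omega\tau}
= -i\int_{-\infty}^{\infty}d\tau\,\Theta(\tau)\langle\Psi_0|\bigl[A(\tau), B\bigr]|\Psi_0\rangle e^{i\omega\tau} \\
&= \lim_{\eta\to0^+}\frac{1}{2\pi}\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}d\tau\,d\omega'\,\frac{e^{-i\omega'\tau}}{\omega'+i\eta}\langle\Psi_0|\bigl[A(\tau), B\bigr]|\Psi_0\rangle e^{i\omega\tau},
\end{aligned} \tag{194}
\]

where we have used the integral representation of the step function, $\Theta(\tau) = \lim_{\eta\to0^+}\frac{i}{2\pi}\int_{-\infty}^{\infty}d\omega'\,\frac{e^{-i\omega'\tau}}{\omega'+i\eta}$. Now using the resolution of the identity $1 = \sum_{j=0}^{\infty}|\Psi_j\rangle\langle\Psi_j|$ we get

\[
\begin{aligned}
\chi_{AB}(\omega) ={}& \lim_{\eta\to0^+}\sum_{j=0}^{\infty}\frac{1}{2\pi}\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}d\tau\,d\omega'\,\frac{\langle\Psi_0|A|\Psi_j\rangle\langle\Psi_j|B|\Psi_0\rangle}{\omega'+i\eta}\,e^{-i(\omega'-\omega+\Omega_j)\tau} \\
&- \lim_{\eta\to0^+}\sum_{j=0}^{\infty}\frac{1}{2\pi}\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}d\tau\,d\omega'\,\frac{\langle\Psi_0|B|\Psi_j\rangle\langle\Psi_j|A|\Psi_0\rangle}{\omega'+i\eta}\,e^{-i(\omega'-\omega-\Omega_j)\tau},
\end{aligned} \tag{195}
\]

where $\Omega_j = E_j - E_0$. Now using the standard integral

\[
\frac{1}{2\pi}\int\!\!\int d\tau\,d\omega'\,\frac{e^{-i(\omega'-\omega+\Omega_j)\tau}}{\omega'+i\eta}
= \int d\omega'\,\frac{\delta(\omega'-\omega+\Omega_j)}{\omega'+i\eta}
= \frac{1}{\omega-\Omega_j+i\eta}, \tag{196}
\]

we can write Eq. (195) as

\[
\chi_{AB}(\omega) = \lim_{\eta\to0^+}\sum_{j=1}^{\infty}\Bigl[\frac{\langle\Psi_0|A|\Psi_j\rangle\langle\Psi_j|B|\Psi_0\rangle}{\omega-\Omega_j+i\eta} - \frac{\langle\Psi_0|B|\Psi_j\rangle\langle\Psi_j|A|\Psi_0\rangle}{\omega+\Omega_j+i\eta}\Bigr], \tag{197}
\]

where the $j = 0$ contributions of the first and second terms cancel each other. This is the so-called Lehmann representation of the response function.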
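The sum-over-states formula (197) is straightforward to evaluate once excitation energies and transition matrix elements are available. The sketch below assumes a finite list of excited states, Hermitian operators $A$ and $B$ (so that $\langle\Psi_0|B|\Psi_j\rangle = \langle\Psi_j|B|\Psi_0\rangle^*$), and a small but finite $\eta$; the function name and the single-state model are hypothetical.

```python
def lehmann_response(omega, states, eta=1e-3):
    """Evaluate chi_AB(omega) from Eq. (197) for a finite set of excited states.

    states: iterable of (Omega_j, <Psi_0|A|Psi_j>, <Psi_j|B|Psi_0>).
    """
    chi = 0j
    for excitation_energy, a_0j, b_j0 in states:
        resonant = a_0j * b_j0 / (omega - excitation_energy + 1j * eta)
        # Hermiticity gives <0|B|j> = conj(<j|B|0>) and <j|A|0> = conj(<0|A|j>).
        antiresonant = (b_j0.conjugate() * a_0j.conjugate()
                        / (omega + excitation_energy + 1j * eta))
        chi += resonant - antiresonant
    return chi


# Hypothetical model: one excited state at Omega = 0.5 with A = B.
MODEL_STATES = [(0.5, 0.3, 0.3)]

static_response = lehmann_response(0.0, MODEL_STATES, eta=1e-9)
resonant_response = lehmann_response(0.5, MODEL_STATES, eta=1e-3)
```

For this model the static limit is $-2|\langle\Psi_0|A|\Psi_1\rangle|^2/\Omega_1 = -0.36$, while at $\omega = \Omega_1$ the imaginary part grows as $1/\eta$, which reflects the pole at the excitation energy.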

1. Density-density response function

Consider a perturbation coupled to the electron density,

\[
H_1(t) = \int d^3r'\,v_1(\mathbf r',t)\,n(\mathbf r'), \tag{198}
\]

where $v_1(\mathbf r',t)$ is the fluctuation in the external potential. The corresponding change in the density is

\[
n_1(\mathbf r,t) = \int_{-\infty}^{\infty}dt'\int d^3r'\,\chi_{nn}(\mathbf r,\mathbf r',t-t')\,v_1(\mathbf r',t'), \tag{199}
\]

where $\chi_{nn}(\mathbf r,\mathbf r',t-t')$ is the density-density response function, which can be written as

\[
\chi_{nn}(\mathbf r,\mathbf r',t-t') = -i\,\Theta(t-t')\langle\Psi_0|[n(\mathbf r,t-t'), n(\mathbf r')]|\Psi_0\rangle . \tag{200}
\]

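In frequency space we can write

\[
n_1(\mathbf r,\omega) = \int d^3r'\,\chi_{nn}(\mathbf r,\mathbf r',\omega)\,v_1(\mathbf r',\omega), \tag{201}
\]

where $\chi_{nn}(\mathbf r,\mathbf r',\omega)$ is the density-density response function in frequency space. A fluctuation in the Kohn-Sham potential $v_s(\mathbf r',\omega)$ can induce a change in the density

\[
n_1(\mathbf r,\omega) = \int d^3r'\,\chi_s(\mathbf r,\mathbf r',\omega)\,v_{s1}(\mathbf r',\omega), \tag{202}
\]

where $\chi_s(\mathbf r,\mathbf r',\omega)$ is the Kohn-Sham response function. The Kohn-Sham potential is given as

\[
v_s(\mathbf r,\omega) = v_H(\mathbf r,\omega) + v_{xc}(\mathbf r,\omega) + v_{ext}(\mathbf r,\omega)
\]

\[
v_{s1}(\mathbf r,\omega) = v_{H1}(\mathbf r,\omega) + v_{xc1}(\mathbf r,\omega) + v_{ext1}(\mathbf r,\omega) \tag{203}
\]

\[
\begin{aligned}
&= \int d^3r'\,\frac{\delta v_H[n](\mathbf r,\omega)}{\delta n(\mathbf r',\omega)}\bigg|_{n_0} n_1(\mathbf r',\omega) + \int d^3r'\,\frac{\delta v_{xc}[n](\mathbf r,\omega)}{\delta n(\mathbf r',\omega)}\bigg|_{n_0} n_1(\mathbf r',\omega) + v_1(\mathbf r,\omega) \\
&= \int d^3r'\,\frac{n_1(\mathbf r',\omega)}{|\mathbf r-\mathbf r'|} + \int d^3r'\,f_{xc}(\mathbf r,\mathbf r',\omega)\,n_1(\mathbf r',\omega) + v_1(\mathbf r,\omega) \\
&= \int d^3r'\,[w(\mathbf r,\mathbf r') + f_{xc}(\mathbf r,\mathbf r',\omega)]\,n_1(\mathbf r',\omega) + v_1(\mathbf r,\omega) \\
&= \int d^3r'\,f_{Hxc}(\mathbf r,\mathbf r',\omega)\,n_1(\mathbf r',\omega) + v_1(\mathbf r,\omega)
\end{aligned} \tag{204}
\]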

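where $w(\mathbf r,\mathbf r') = \frac{1}{|\mathbf r-\mathbf r'|}$, $f_{xc}(\mathbf r,\mathbf r',\omega) = \frac{\delta v_{xc}[n](\mathbf r,\omega)}{\delta n(\mathbf r',\omega)}\Big|_{n_0}$ is the so-called exchange-correlation kernel, and $f_{Hxc}(\mathbf r,\mathbf r',\omega) = w(\mathbf r,\mathbf r') + f_{xc}(\mathbf r,\mathbf r',\omega)$ is the so-called Hartree-xc kernel. The KS response function $\chi_s$ relates the change of the density $n_1$ to the change of the KS potential given in Eq. (204) as

\[
\begin{aligned}
n_1(\mathbf r,\omega) &= \int d^3r'\,\chi_s(\mathbf r,\mathbf r',\omega)\,v_{s1}(\mathbf r',\omega) \\
&= \iint d^3r'\,d^3r''\,\chi_s(\mathbf r,\mathbf r',\omega)\,f_{Hxc}(\mathbf r',\mathbf r'',\omega)\,n_1(\mathbf r'',\omega) + \int d^3r'\,\chi_s(\mathbf r,\mathbf r',\omega)\,v_1(\mathbf r',\omega) .
\end{aligned} \tag{205}
\]

Consider

\[
\begin{aligned}
&\iint d^3r'\,d^3r''\,\chi_s(\mathbf r,\mathbf r',\omega)\,f_{Hxc}(\mathbf r',\mathbf r'',\omega)\,n_1(\mathbf r'',\omega) \\
&= \iiint d^3r'\,d^3r''\,d^3r'''\,\chi_s(\mathbf r,\mathbf r',\omega)\,f_{Hxc}(\mathbf r',\mathbf r'',\omega)\,\chi(\mathbf r'',\mathbf r''',\omega)\,v_1(\mathbf r''',\omega)
\end{aligned} \tag{206}
\]

Using Eq. (206) and Eq. (201) in Eq. (205) we can write

\[
\begin{aligned}
\int d^3r'''\,\chi(\mathbf r,\mathbf r''',\omega)\,v_1(\mathbf r''',\omega) ={}& \int d^3r'''\,\chi_s(\mathbf r,\mathbf r''',\omega)\,v_1(\mathbf r''',\omega) \\
&+ \iiint d^3r'\,d^3r''\,d^3r'''\,\chi_s(\mathbf r,\mathbf r',\omega)\,f_{Hxc}(\mathbf r',\mathbf r'',\omega)\,\chi(\mathbf r'',\mathbf r''',\omega)\,v_1(\mathbf r''',\omega)
\end{aligned} \tag{207}
\]

Since this relationship is valid for an arbitrary perturbation $v_1$, one can write

\[
\begin{aligned}
\chi(\mathbf r,\mathbf r',\omega) &= \chi_s(\mathbf r,\mathbf r',\omega) + \iint d^3r''\,d^3r'''\,\chi_s(\mathbf r,\mathbf r'',\omega)\,f_{Hxc}(\mathbf r'',\mathbf r''',\omega)\,\chi(\mathbf r''',\mathbf r',\omega) \\
&= \chi_s(\mathbf r,\mathbf r',\omega) + \iint d^3r''\,d^3r'''\,\chi_s(\mathbf r,\mathbf r'',\omega)\,[w(\mathbf r'',\mathbf r''') + f_{xc}(\mathbf r'',\mathbf r''',\omega)]\,\chi(\mathbf r''',\mathbf r',\omega)
\end{aligned} \tag{208}
\]

This equation is the so-called Dyson screening equation. An iterative formal solution is possible if one knows the exchange-correlation kernel $f_{xc}$. If we set $f_{xc} = 0$ we end up with the random-phase approximation (RPA).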
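The iterative solution of the Dyson equation (208) can be sketched on a discretized model in which the response functions and the kernel become small matrices and the spatial integrals become matrix products. This is only an illustration under stated assumptions: the two-point "grid", the matrices `CHI_S` and `F_HXC`, and the absorption of quadrature weights into the matrices are all hypothetical choices, and a single fixed frequency is considered.

```python
def mat_mul(x, y):
    """Matrix product; stands in for the spatial integration in Eq. (208)."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(x, y):
    return [[a + b for a, b in zip(row_x, row_y)] for row_x, row_y in zip(x, y)]


def max_abs_diff(x, y):
    return max(abs(a - b) for row_x, row_y in zip(x, y)
               for a, b in zip(row_x, row_y))


def dyson_iterate(chi_s, f_hxc, tol=1e-13, max_iter=1000):
    """Iterate chi <- chi_s + chi_s f_Hxc chi, starting from chi = chi_s."""
    kernel_term = mat_mul(chi_s, f_hxc)
    chi = [row[:] for row in chi_s]
    for iteration in range(1, max_iter + 1):
        new_chi = mat_add(chi_s, mat_mul(kernel_term, chi))
        change = max_abs_diff(new_chi, chi)
        chi = new_chi
        if change < tol:
            return chi, iteration
    raise RuntimeError("Dyson iteration did not converge")


# Hypothetical 2x2 model at one frequency (illustrative numbers only).
CHI_S = [[-0.5, 0.1], [0.1, -0.4]]
F_HXC = [[0.6, 0.2], [0.2, 0.5]]

chi_full, dyson_steps = dyson_iterate(CHI_S, F_HXC)
```

The plain iteration converges only when the product $\chi_s f_{Hxc}$ is small enough in norm; otherwise one would solve the linear equation $(1 - \chi_s f_{Hxc})\chi = \chi_s$ directly. Setting the xc part of `F_HXC` to zero corresponds to the RPA.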

2. Calculation of properties from response functions

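The response function can be used to calculate the polarizability of a system, which is a very important physical property. If a system is subject to an electric field $\mathbf E(t) = \epsilon\sin(\omega t)\,\mathbf e_z$, then the first-order dipole polarization $\mathbf p_1$ is given as

\[
\mathbf p_1(t) = \int dt'\,\boldsymbol\alpha(t-t')\,\mathbf E(t'), \tag{209}
\]

\[
\mathbf p_1(\omega) = \boldsymbol\alpha(\omega)\,\mathbf E(\omega) \tag{210}
\]

where $\boldsymbol\alpha$ is the dipole-dipole polarizability tensor. The perturbation due to $\mathbf E(t)$ is $v_1 = z\epsilon\sin(\omega t)$, which in Fourier space becomes $v_1(\mathbf r,\omega) = \epsilon z/2$; it polarizes the system of electrons and changes the density,

\[
n_1(\mathbf r,\omega) = \int d^3r'\,\chi(\mathbf r,\mathbf r',\omega)\,v_1(\mathbf r',\omega) \tag{211}
\]

The $z$-component of the polarization is

\[
p_{1z} = -\int d^3r\,z\,n_1(\mathbf r,\omega) \tag{212}
\]

\[
= -\iint d^3r\,d^3r'\,z\,\chi(\mathbf r,\mathbf r',\omega)\,v_1(\mathbf r',\omega) \tag{213}
\]

\[
= -\frac{\epsilon}{2}\iint d^3r\,d^3r'\,z\,\chi(\mathbf r,\mathbf r',\omega)\,z' \tag{214}
\]

Comparison of Eqs. (210) and (214) leads to the conclusion that the dipole-dipole polarizability $\alpha_{zz}$ is given as

\[
\alpha_{zz}(\omega) = -\iint d^3r\,d^3r'\,z\,\chi(\mathbf r,\mathbf r',\omega)\,z' . \tag{215}
\]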

Here $\Delta E$ is the energy with respect to the vacuum reference state. Multiplying the above equation from the left by $e^{-T}$ yields

\[
\bigl(e^{-T}H_N\,e^{T} - \Delta E\bigr)|0\rangle = 0
\]

which we can write as

\[
(\bar H - \Delta E)|0\rangle = 0
\]

where $\bar H = e^{-T}H_N\,e^{T}$. Here $\bar H$, which is also called the CC effective Hamiltonian or the similarity-transformed CC Hamiltonian, is non-Hermitian and is a similarity transformation of the normal-ordered Hamiltonian. Also, $\Delta E$ is the energy corresponding to the state vector $|0\rangle$.

Now let us take a closer look at the CC effective Hamiltonian $\bar H = e^{-T}H_N\,e^{T}$. Using the Baker-Campbell-Hausdorff expansion one can write

\[
e^{-B}A\,e^{B} = A + [A,B] + \frac{1}{2!}[[A,B],B] + \frac{1}{3!}[[[A,B],B],B] + \ldots \tag{222}
\]

The transformed CC Hamiltonian thus becomes

\[
e^{-T}H_N\,e^{T} = H_N + [H_N,T] + \frac{1}{2!}[[H_N,T],T] + \frac{1}{3!}[[[H_N,T],T],T] + \frac{1}{4!}[[[[H_N,T],T],T],T] \tag{223}
\]

Here the series terminates with the four-fold commutator because the Hamiltonian contains at most two-particle interactions.

Now $T$ contains only particle creation and hole annihilation operators, and the only possible non-zero contractions are

\[
\overline{a\,b^\dagger} = \delta_{ab}
\]

\[
\overline{i^\dagger j} = \delta_{ij}
\]

which shows that $[T_m, T_n] = 0$. Therefore the only non-zero commutators are those between $H_N$ and $T$. Since the only non-zero contractions are those of a particle creation operator with a particle annihilation operator on its left and of a hole annihilation operator with a hole creation operator on its left, the only non-zero terms are those with $H_N$ on the left. $\bar H$ therefore becomes

\[
\bar H = H_N + H_N T + \frac{H_N T T}{2!} + \ldots = (H_N\,e^{T})_C
\]

Here the subscript $C$ denotes that only fully connected terms are included. We can now rewrite the Schrödinger equation for the CC wave function as

\[
(H_N\,e^{T})_C|0\rangle = \Delta E|0\rangle \tag{224}
\]

Now let us define the ground-state and excited-state projection operators $P$ and $Q$ as

\[
P = |0\rangle\langle 0| \tag{225}
\]

\[
Q = 1 - P \tag{226}
\]

which have the properties

\[
P^2 = P \tag{227}
\]

\[
Q^2 = (1-P)^2 = 1 - P - P + P^2 = Q \tag{228}
\]
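The projector properties (227) and (228), together with $PQ = 0$, can be checked in a finite matrix representation. The sketch below assumes a hypothetical four-dimensional space in which the reference state $|0\rangle$ is the first basis vector.

```python
def outer(u, v):
    """Outer product |u><v| for real vectors."""
    return [[ui * vj for vj in v] for ui in u]


def mat_mul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_sub(x, y):
    return [[a - b for a, b in zip(row_x, row_y)] for row_x, row_y in zip(x, y)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


DIM = 4                         # hypothetical model-space dimension
VACUUM = [1.0, 0.0, 0.0, 0.0]   # reference state |0> as the first basis vector

P = outer(VACUUM, VACUUM)       # P = |0><0|
Q = mat_sub(identity(DIM), P)   # Q = 1 - P
ZERO = [[0.0] * DIM for _ in range(DIM)]
```

In this representation $Q$ is the unit matrix with its first diagonal element set to zero, so it projects onto all states other than the reference.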

Therefore, using the projection operators, equation (224) can be written as

\[
\langle 0|\bar H|0\rangle = \Delta E \tag{229}
\]

and

\[
\langle\Phi^{ab\ldots}_{ij\ldots}|\bar H|0\rangle = 0 \tag{230}
\]

i.e.

\[
P\bar H P = \Delta E\,P \tag{231}
\]

and

\[
Q\bar H P = 0 \tag{232}
\]

These are called the CC amplitude equations. We are going to use these equations and the Hellmann-Feynman theorem to derive the CC energy functional using linear response theory.

2. Hellmann-Feynman theorem

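As we already know, response theory is an alternative way of studying the properties of a molecular system in the presence of a perturbation (e.g. an external electric or magnetic field, or a displacement of the nuclei). Using the Hellmann-Feynman theorem, we can study first- or higher-order properties even if the perturbed wave function is not known. The theorem states that the expectation value of a first-order property is equal to the first derivative of the energy with respect to the applied perturbation, evaluated at zero perturbation (higher-order properties correspond to higher derivatives). Let us prove the theorem.

Since we are considering a system in a perturbing field, we can write $H = H(\lambda)$, and a Maclaurin series expansion gives

\[
H(\lambda) = H^{(0)} + \lambda H^{(1)} + \lambda^2 H^{(2)} + \ldots \tag{233}
\]

where $H^{(n)} = \frac{1}{n!}\frac{d^nH}{d\lambda^n}\Big|_{\lambda=0}$.

Here $H^{(0)}$ is the unperturbed Hamiltonian of the system and $\lambda$ is the perturbation parameter. For a linear perturbation all higher-order terms with $n > 1$ are zero and the Hamiltonian is

\[
H(\lambda) = H^{(0)} + \lambda H^{(1)} \tag{234}
\]

Similarly, the energy and the wave function are

\[
E(\lambda) = E^{(0)} + \lambda E^{(1)} + \ldots \tag{235}
\]

and

\[
\Psi(\lambda) = \Psi^{(0)} + \lambda\Psi^{(1)} + \ldots \tag{236}
\]

Now the Schrödinger equation gives

\[
H(\lambda)\Psi(\lambda) = E(\lambda)\Psi(\lambda) \tag{237}
\]

\[
(H^{(0)} + \lambda H^{(1)})(\Psi^{(0)} + \lambda\Psi^{(1)} + \ldots) = (E^{(0)} + \lambda E^{(1)} + \ldots)(\Psi^{(0)} + \lambda\Psi^{(1)} + \ldots)
\]

Collecting the terms linear in $\lambda$ and projecting onto $\langle\Psi^{(0)}|$ (the terms containing $\Psi^{(1)}$ cancel because $H^{(0)}$ is Hermitian and $H^{(0)}\Psi^{(0)} = E^{(0)}\Psi^{(0)}$), we obtain

\[
\frac{\langle\Psi^{(0)}|H^{(1)}|\Psi^{(0)}\rangle}{\langle\Psi^{(0)}|\Psi^{(0)}\rangle} = E^{(1)} = \frac{dE(\lambda)}{d\lambda}\bigg|_{\lambda=0} \tag{238}
\]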

which is the simplest case of the Hellmann-Feynman theorem: a first-order property of a system can be obtained as the derivative of the perturbed energy at $\lambda = 0$, i.e. at zero perturbation.

As an example, consider a system in an external electric field $\vec{\mathcal E}$ (the perturbation in this case). The energy of interaction of the system with the field is given by

\[
E(\vec{\mathcal E}) = -\vec{\mathcal E}\cdot\sum_u q_u\,\vec r_u \tag{239}
\]

It is evident from this equation that the first-order property, $\frac{dE(\vec{\mathcal E})}{d\mathcal E_a}\Big|_{\vec{\mathcal E}=0}$, is nothing but the negative of the $a$-th component of the dipole moment of the system. Higher-order properties can be obtained following the same method.
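Equation (238) can be verified numerically on a model for which the exact energy is known. The sketch below uses a hypothetical real symmetric two-level Hamiltonian $H(\lambda) = H^{(0)} + \lambda H^{(1)}$ with $H^{(0)} = \mathrm{diag}(e_1, e_2)$, so that the unperturbed ground state is the first basis vector and $\langle\Psi^{(0)}|H^{(1)}|\Psi^{(0)}\rangle = v_{11}$; all parameter values are illustrative only.

```python
import math


def ground_energy(lam, e1=-1.0, e2=0.5, v11=0.3, v12=0.2, v22=-0.1):
    """Lowest eigenvalue of the 2x2 matrix H0 + lam * H1 (closed form)."""
    h11 = e1 + lam * v11
    h22 = e2 + lam * v22
    h12 = lam * v12
    mean = 0.5 * (h11 + h22)
    half_gap = 0.5 * math.sqrt((h11 - h22) ** 2 + 4.0 * h12 ** 2)
    return mean - half_gap


def numerical_derivative(func, x, step=1e-4):
    """Central finite-difference approximation to d func / d x."""
    return (func(x + step) - func(x - step)) / (2.0 * step)


# <Psi0|H1|Psi0> equals v11 because Psi0 = (1, 0) at lambda = 0.
EXPECTATION_VALUE = 0.3
ENERGY_DERIVATIVE = numerical_derivative(ground_energy, 0.0)
```

The finite-difference derivative agrees with the expectation value up to the discretization error of the central difference, without the perturbed wave function ever being constructed.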


3. Linear response CC for static perturbation

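Now that we have all the required equations, we can derive the energy functional for CC. The CC amplitude equations are given by Eqs. (231) and (232),

\[
P\bar H P = \Delta E\,P
\]

and

\[
Q\bar H P = 0
\]

Taking the derivative of Eq. (231) with respect to $\lambda$ gives

\[
\frac{d\Delta E}{d\lambda}P = P\,\frac{d\bar H}{d\lambda}\,P \tag{240}
\]

Now, since $\bar H = e^{-T}H_N\,e^{T}$,

\[
\begin{aligned}
\frac{d\bar H}{d\lambda}\bigg|_{\lambda=0} &= -e^{-T}\frac{dT}{d\lambda}H_N\,e^{T} + e^{-T}H_N\,e^{T}\frac{dT}{d\lambda} + e^{-T}\frac{dH_N}{d\lambda}e^{T} \\
&= [\bar H, T^{\lambda}] + \bar H^{[\lambda]}
\end{aligned}
\]

where $\bar H^{[\lambda]} = e^{-T}\frac{dH_N}{d\lambda}e^{T}\Big|_{\lambda=0}$ and $T^{\lambda} = \frac{dT}{d\lambda}\Big|_{\lambda=0}$.

Inserting this expression into the energy derivative above, the commutator contributes

\[
\begin{aligned}
P[\bar H, T^{\lambda}]P &= P\bar H(P+Q)T^{\lambda}P - PT^{\lambda}(P+Q)\bar H P \\
&= \Delta E\,PT^{\lambda}P + P\bar H Q T^{\lambda}P - \Delta E\,PT^{\lambda}P = P\bar H Q T^{\lambda}P
\end{aligned}
\]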

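This expression can be simplified by using the CC amplitude equation as follows:

\[
Q\bar H P = 0
\]

\[
Q\,\frac{d\bar H}{d\lambda}\,P = 0
\]

i.e.

\[
Q\bigl(\bar H^{[\lambda]} + [\bar H, T^{\lambda}]\bigr)P = 0
\]

Now, the second term gives

\[
Q[\bar H, T^{\lambda}]P = Q\bar H(P+Q)T^{\lambda}P - QT^{\lambda}(P+Q)\bar H P = Q(Q\bar H Q - \Delta E)QT^{\lambda}P
\]

Therefore, using the fact that $Q^2 = Q$, we get

\[
QT^{\lambda}P = Q(\Delta E - Q\bar H Q)^{-1}Q\bar H^{[\lambda]}P
\]

and Eq. (240) can be written as

\[
\Delta E^{\lambda}P = P\bar H^{[\lambda]}P + P\bar H Q(\Delta E - Q\bar H Q)^{-1}Q\bar H^{[\lambda]}P \tag{241}
\]

where $\Delta E^{\lambda} = \frac{d\Delta E}{d\lambda}\Big|_{\lambda=0}$. In the above equation, $T^{\lambda}$ has been eliminated. We can now define the effective resolvent operator $R(\lambda) = Q[\Delta E(\lambda) - Q\bar H(\lambda)Q]^{-1}Q$. The above equation then becomes

\[
\Delta E^{\lambda}P = P\bar H^{[\lambda]}P + P\bar H R Q\bar H^{[\lambda]}P \tag{242}
\]

To emphasize that this equation is valid at $\lambda = 0$, let us write it in the following manner:

\[
\Delta E^{(1)}P = P\bar H^{[\lambda]}(0)P + P\bar H(0)R(0)Q\bar H^{[\lambda]}(0)P
\]

Here we can see that $P\bar H(0)R(0)Q$ is evaluated at $\lambda = 0$ and does not contain the perturbation operator; we therefore define a new operator $\Lambda$ as

\[
\Lambda(0) = P\bar H(0)R(0)Q
\]

and therefore we get

\[
\Delta E^{\lambda}P = P(1+\Lambda)\bar H^{[\lambda]}P \tag{243}
\]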

which, when integrated over $\lambda$, gives

\[
\Delta E\,P = P(1+\Lambda)\bar H P \tag{244}
\]

This is called the fundamental CC energy functional. It is independent of the perturbation, and if it is evaluated at $\lambda = 0$, $\Lambda$ needs to be determined only once. The $\Lambda$ operator satisfies a set of linear equations under stationarity conditions, which are derived in the next section.

4. Lambda equations

To exploit the stationarity requirements of the functional, we set the coefficients of the variations of its arguments equal to zero, so that the Hellmann-Feynman theorem remains valid.

We start with a functional defined as

\[
E(\Lambda,T) = P\,E(\Lambda,T)\,P = P(1+\Lambda)\bar H P
\]

where $\bar H = e^{-T}H_N\,e^{T}$, $\Lambda$ is a de-excitation operator, and $T$ is an excitation operator, satisfying the equations

\[
\Lambda P = 0, \qquad \Lambda Q = \Lambda, \qquad PT = 0, \qquad QT = T
\]

The variation of the functional with respect to its arguments is given by

\[
P\,\delta E\,P = P\,\delta\Lambda\,Q\bar H P + P(1+\Lambda Q)\,\delta(e^{-T}H_N\,e^{T})\,P = 0 + P(1+\Lambda Q)[\bar H,\delta T]P \tag{245}
\]

Since the functional is stationary with respect to its arguments, the coefficients of $\delta\Lambda$ and $\delta T$ must vanish; the coefficient of $\delta\Lambda$ vanishes by virtue of the CC amplitude equations. We can simplify the second term using the CC amplitude equations $P\bar H P = \Delta E\,P$ and $Q\bar H P = 0$, i.e.

\[
P(1+\Lambda Q)[\bar H,\delta T]P = P(\bar H + \Lambda Q\bar H - \Delta E\,\Lambda)Q\,\delta T\,P
\]

The stationarity condition of the functional with respect to $T$ gives

\[
P(\bar H + \Lambda Q\bar H - \Delta E\,\Lambda)Q = 0
\]

To evaluate $\Lambda$, we can either invert the operator in equation (32), which is a tedious procedure, or exploit the stationarity condition of the functional to obtain a set of linear equations that $\Lambda$ satisfies.

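Starting from the above equation we get

\[
P(1+\Lambda Q)(\bar H - \Delta E) = 0
\]

Here an extra term $\Delta E\,PQ$ (which is equal to zero since $P$ and $Q$ are orthogonal projectors) has been added to arrive at this form.

Projecting this equation onto $P$ from the right reproduces the CC energy functional, while projecting onto $Q$ from the right gives the $\Lambda$ equations:

\[
P(1+\Lambda Q)(\bar H - \Delta E)Q = 0
\]

\[
P\bar H Q + P\Lambda\bar H Q - \Delta E\,P\Lambda Q = 0
\]

This equation has an explicit energy dependence, which can produce disconnected terms. The energy dependence can be eliminated in the following way:

\[
\begin{aligned}
P\Lambda\bar H Q &= P[\Lambda,\bar H]Q + P\bar H(P+Q)\Lambda Q \\
&= P(\Lambda\bar H)_C Q + \Delta E\,P\Lambda Q + P\bar H Q\Lambda Q
\end{aligned}
\]

Therefore the $\Lambda$ equation, after eliminating the energy dependence, is

\[
P\bar H Q + P(\Lambda\bar H)_C Q + P\bar H Q\Lambda Q = 0 \tag{246}
\]

or

\[
P(H_N\,e^{T})_C Q + P\bigl(\Lambda(H_N\,e^{T})_C\bigr)_C Q + P(H_N\,e^{T})_C Q\Lambda Q = 0 \tag{247}
\]

where the last term is disconnected. For an arbitrary excited state, the above equation can be written in explicit form as

\[
\langle 0|H_N\,e^{T}|\Phi^{ab\ldots}_{ij\ldots}\rangle + \langle 0|\Lambda(H_N\,e^{T})_C|\Phi^{ab\ldots}_{ij\ldots}\rangle_C
+ \sum_{\substack{k<l<\ldots\\ c<d<\ldots}}\langle 0|H_N\,e^{T}|\Phi^{cd\ldots}_{kl\ldots}\rangle_C\,\langle\Phi^{cd\ldots}_{kl\ldots}|\Lambda|\Phi^{ab\ldots}_{ij\ldots}\rangle = 0 \tag{248}
\]

These are the $\Lambda$ equations. For a singly excited state the equation can be written as

\[
\langle 0|H_N\,e^{T}|\Phi^{a}_{i}\rangle + \langle 0|\Lambda(H_N\,e^{T})_C|\Phi^{a}_{i}\rangle_C = 0 \tag{249}
\]

where the disconnected term does not contribute to the $\Lambda$ equation because there is no intermediate excited state between the vacuum and a singly excited state.

For a doubly excited state, the $\Lambda$ equation reads

\[
\langle 0|H_N\,e^{T}|\Phi^{ab}_{ij}\rangle + \langle 0|\Lambda(H_N\,e^{T})_C|\Phi^{ab}_{ij}\rangle_C
+ \sum_{\substack{k=i,j\\ c=a,b}}\langle 0|H_N\,e^{T}|\Phi^{c}_{k}\rangle_C\,\langle\Phi^{c}_{k}|\Lambda|\Phi^{ab}_{ij}\rangle = 0 \tag{250}
\]

Once these linear equations are known, we can solve them for $\Lambda$.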

XI. TREATMENT OF EXCITED STATES

A. Excitation energies from TD-DFT

The poles of density-density response function give exact excitation energies. A finiteresponse can be sustained by a system at its excitation frequencies even in the absence ofany external perturbation. Therefore, setting v1 = 0 in Eq. (205) one can write

n1(r,Ω) ="

d3r′d3r′′χs(r′,r′′,Ω)fHxc(r′,r′′,Ω)n1(r′′,Ω) (251)

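In the spin-dependent formalism,

$$ n_{1\sigma}(\mathbf r,\Omega) = \sum_{\sigma'\sigma''} \iint d^3r'\, d^3r''\, \chi_{s,\sigma\sigma'}(\mathbf r,\mathbf r',\Omega)\, f_{Hxc,\sigma'\sigma''}(\mathbf r',\mathbf r'',\Omega)\, n_{1\sigma''}(\mathbf r'',\Omega) \qquad (252) $$

Pre-multiplying by $\int d^3r'''\, f_{Hxc,\sigma\sigma'}(\mathbf r,\mathbf r''',\Omega)$ and using the notation

$$ g_{\sigma\sigma'}(\mathbf r,\Omega) = \int d^3r'\, f_{Hxc,\sigma\sigma'}(\mathbf r,\mathbf r',\Omega)\, n_{1\sigma'}(\mathbf r',\Omega), \qquad (253) $$

we can write Eq. (252) as

$$ g_{\sigma\sigma'}(\mathbf r,\Omega) = \sum_{\sigma''\sigma'''} \iint d^3r'\, d^3r''\, f_{Hxc,\sigma\sigma'}(\mathbf r,\mathbf r',\Omega)\, \chi_{s,\sigma'\sigma''}(\mathbf r',\mathbf r'',\Omega)\, g_{\sigma''\sigma'''}(\mathbf r'',\Omega) \qquad (254) $$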

The Kohn-Sham response function can be written as

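$$ \chi_{s,\sigma\sigma'}(\mathbf r,\mathbf r',\Omega) = \delta_{\sigma\sigma'} \sum_{jk} \alpha_{jk\sigma} \frac{\Phi^*_{jk\sigma}(\mathbf r)\,\Phi_{jk\sigma}(\mathbf r')}{\Omega - \omega_{jk\sigma} + i\eta}, \qquad (255) $$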


where

αjkσ = fkσ − fjσ (256)

Φjkσ (r) = φ∗jσ (r)φkσ (r) (257)

ωjkσ = εjσ − εkσ (258)

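Substituting Eq. (255) into Eq. (254) gives

$$ \begin{aligned}
g_{\sigma\sigma'}(\mathbf r,\Omega) &= \sum_{jk\sigma''\sigma'''} \frac{\delta_{\sigma'\sigma''}\,\alpha_{jk\sigma'}}{\Omega - \omega_{jk\sigma'}} \iint d^3r'\, d^3r''\, f_{Hxc,\sigma\sigma'}(\mathbf r,\mathbf r',\Omega)\, \Phi^*_{jk\sigma'}(\mathbf r')\, \Phi_{jk\sigma'}(\mathbf r'')\, g_{\sigma''\sigma'''}(\mathbf r'',\Omega) \qquad (259) \\
&= \sum_{jk\sigma'''} \frac{\alpha_{jk\sigma'}}{\Omega - \omega_{jk\sigma'}} \underbrace{\int d^3r'\, f_{Hxc,\sigma\sigma'}(\mathbf r,\mathbf r',\Omega)\, \Phi^*_{jk\sigma'}(\mathbf r')}\; \underbrace{\int d^3r''\, \Phi_{jk\sigma'}(\mathbf r'')\, g_{\sigma'\sigma'''}(\mathbf r'',\Omega)} \qquad (260)
\end{aligned} $$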

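Now, renaming the summation indices in Eq. (260) as $j'k'$, multiplying the equation by $\sum_{\sigma'} \int d^3r\, \Phi_{jk\sigma}(\mathbf r)$, and using

$$ H_{jk\sigma}(\Omega) = \sum_{\sigma'} \int d^3r\, \Phi_{jk\sigma}(\mathbf r)\, g_{\sigma\sigma'}(\mathbf r,\Omega), \qquad (261) $$

$$ K_{jk\sigma,j'k'\sigma'} = \iint d^3r\, d^3r'\, \Phi^*_{jk\sigma}(\mathbf r)\, f_{Hxc,\sigma\sigma'}(\mathbf r,\mathbf r',\Omega)\, \Phi^*_{j'k'\sigma'}(\mathbf r'), \qquad (262) $$

we can write

$$ \begin{aligned}
H_{jk\sigma}(\Omega) &= \sum_{j'k'\sigma'} \frac{\alpha_{j'k'\sigma'}}{\Omega - \omega_{j'k'\sigma'}} \underbrace{\iint d^3r\, d^3r'\, \Phi^*_{jk\sigma}(\mathbf r)\, f_{Hxc,\sigma\sigma'}(\mathbf r,\mathbf r',\Omega)\, \Phi^*_{j'k'\sigma'}(\mathbf r')}_{K_{jk\sigma,j'k'\sigma'}}\, H_{j'k'\sigma'} \qquad (263) \\
&= \sum_{j'k'\sigma'} \frac{\alpha_{j'k'\sigma'}}{\Omega - \omega_{j'k'\sigma'}}\, K_{jk\sigma,j'k'\sigma'}\, H_{j'k'\sigma'} \qquad (264) \\
&= \sum_{j'k'\sigma'} \alpha_{j'k'\sigma'}\, K_{jk\sigma,j'k'\sigma'}\, \beta_{j'k'\sigma'}(\Omega), \qquad (265)
\end{aligned} $$

where

$$ \beta_{j'k'\sigma'}(\Omega) = \frac{H_{j'k'\sigma'}}{\Omega - \omega_{j'k'\sigma'}} \qquad (266) $$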

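Writing the left-hand side of Eq. (265) as $H_{jk\sigma} = (\Omega - \omega_{jk\sigma})\beta_{jk\sigma}$, which follows from Eq. (266), we obtain

$$ (\Omega - \omega_{jk\sigma})\,\beta_{jk\sigma}(\Omega) = \sum_{j'k'\sigma'} \alpha_{j'k'\sigma'}\, K_{jk\sigma,j'k'\sigma'}\, \beta_{j'k'\sigma'}(\Omega), \qquad (267) $$

$$ \omega_{jk\sigma}\,\beta_{jk\sigma}(\Omega) + \sum_{j'k'\sigma'} \alpha_{j'k'\sigma'}\, K_{jk\sigma,j'k'\sigma'}\, \beta_{j'k'\sigma'}(\Omega) = \Omega\,\beta_{jk\sigma}(\Omega) \qquad (268) $$

$$ \sum_{j'k'\sigma'} \left[\delta_{jj'}\delta_{kk'}\delta_{\sigma\sigma'}\,\omega_{j'k'\sigma'} + \alpha_{j'k'\sigma'}\, K_{jk\sigma,j'k'\sigma'}\right] \beta_{j'k'\sigma'}(\Omega) = \Omega\,\beta_{jk\sigma}(\Omega) \qquad (269) $$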


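The indices j (j′) and k (k′) take values such that if one runs over occupied orbitals, the other has to run over virtual orbitals, since for integer occupation numbers $\alpha_{jk\sigma}$ vanishes otherwise. We denote occupied orbitals by i (i′) and virtual orbitals by a (a′). Therefore, Eq. (269) can be written, for $(j,k) = (i,a)$ and $(j,k) = (a,i)$ respectively, as

$$ \sum_{j'k'\sigma'} \left[\delta_{ij'}\delta_{ak'}\delta_{\sigma\sigma'}\,\omega_{j'k'\sigma'} + \alpha_{j'k'\sigma'}\, K_{ia\sigma,j'k'\sigma'}\right] \beta_{j'k'\sigma'}(\Omega) = \Omega\,\beta_{ia\sigma}(\Omega) \qquad (270) $$

$$ \sum_{j'k'\sigma'} \left[\delta_{aj'}\delta_{ik'}\delta_{\sigma\sigma'}\,\omega_{j'k'\sigma'} + \alpha_{j'k'\sigma'}\, K_{ai\sigma,j'k'\sigma'}\right] \beta_{j'k'\sigma'}(\Omega) = \Omega\,\beta_{ai\sigma}(\Omega) \qquad (271) $$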

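Splitting the sum over $j'k'$ on the left-hand side of Eq. (270) into occupied-virtual pairs $(i'a')$ and virtual-occupied pairs $(a'i')$, and using $\alpha_{i'a'\sigma'} = -1$ and $\alpha_{a'i'\sigma'} = +1$ from Eq. (256), gives

$$ \begin{aligned}
&\sum_{i'a'\sigma'} \left[\delta_{ii'}\delta_{aa'}\delta_{\sigma\sigma'}\,\omega_{i'a'\sigma'} + \alpha_{i'a'\sigma'}\, K_{ia\sigma,i'a'\sigma'}\right] \beta_{i'a'\sigma'} + \sum_{i'a'\sigma'} \left[\delta_{ia'}\delta_{ai'}\delta_{\sigma\sigma'}\,\omega_{a'i'\sigma'} + \alpha_{a'i'\sigma'}\, K_{ia\sigma,i'a'\sigma'}\right] \beta_{a'i'\sigma'} \\
&\quad = \sum_{i'a'\sigma'} \left[\left(\delta_{ii'}\delta_{aa'}\delta_{\sigma\sigma'}\,\omega_{i'a'\sigma'} - K_{ia\sigma,i'a'\sigma'}\right) \beta_{i'a'\sigma'} + K_{ia\sigma,i'a'\sigma'}\, \beta_{a'i'\sigma'}\right] \qquad (272)
\end{aligned} $$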

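Therefore Eqs. (270) and (271) can be written as

$$ \sum_{i'a'\sigma'} \left[\left(\delta_{ii'}\delta_{aa'}\delta_{\sigma\sigma'}\,\omega_{i'a'\sigma'} - K_{ia\sigma,i'a'\sigma'}\right) \beta_{i'a'\sigma'} + K_{ia\sigma,i'a'\sigma'}\, \beta_{a'i'\sigma'}\right] = \Omega\,\beta_{ia\sigma} \qquad (273) $$

$$ \sum_{i'a'\sigma'} \left[-K_{ai\sigma,i'a'\sigma'}\, \beta_{i'a'\sigma'} + \left(\delta_{aa'}\delta_{ii'}\delta_{\sigma\sigma'}\,\omega_{a'i'\sigma'} + K_{ai\sigma,a'i'\sigma'}\right) \beta_{a'i'\sigma'}\right] = \Omega\,\beta_{ai\sigma} \qquad (274) $$

Now defining $X_{ia\sigma} = -\beta_{ia\sigma}$ and $Y_{ia\sigma} = \beta_{ai\sigma}$, and using the fact that $\omega_{i'a'\sigma'} = -\omega_{a'i'\sigma'}$, we can rewrite Eqs. (273) and (274) as

$$ \sum_{i'a'\sigma'} \left[\left(\delta_{ii'}\delta_{aa'}\delta_{\sigma\sigma'}\,\omega_{a'i'\sigma'} + K_{ia\sigma,i'a'\sigma'}\right) X_{i'a'\sigma'} + K_{ia\sigma,i'a'\sigma'}\, Y_{a'i'\sigma'}\right] = -\Omega\, X_{ia\sigma} \qquad (275) $$

$$ \sum_{i'a'\sigma'} \left[K_{ai\sigma,i'a'\sigma'}\, X_{i'a'\sigma'} + \left(\delta_{aa'}\delta_{ii'}\delta_{\sigma\sigma'}\,\omega_{a'i'\sigma'} + K_{ai\sigma,a'i'\sigma'}\right) Y_{a'i'\sigma'}\right] = \Omega\, Y_{ai\sigma} \qquad (276) $$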

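If the Kohn-Sham orbitals are real, then from Eq. (262) $K_{ia\sigma,i'a'\sigma'} = K_{ai\sigma,a'i'\sigma'}$. Therefore, we can write

$$ \begin{pmatrix} A & B \\ B & A \end{pmatrix} \begin{pmatrix} X \\ Y \end{pmatrix} = \Omega \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} X \\ Y \end{pmatrix} \qquad (277) $$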

where matrices A and B are called Hessians and have elements

Aiaσ ,i′a′σ ′ (Ω) = δii′δaa′δσσ ′ωa′i′σ ′ +Kiaσ ,i′a′σ ′ (Ω) (278)

Biaσ ,i′a′σ ′ (Ω) = Kiaσ ,i′a′σ ′ (279)

Equation (277) is known as the Casida equation, which can in principle be solved to obtain the exact excitation energies Ωn. The solutions of Eq. (277) also include −Ωn, the corresponding deexcitation energies. An excitation energy Ωn may represent single as well as multiple excitations. Solving the equation exactly requires the exact Kohn-Sham orbitals and the corresponding energy eigenvalues along with the exchange-correlation kernel fxc, which is unknown in general. Moreover, the problem is infinite-dimensional, so in practice one needs to approximate fxc and solve the equation in a finite-dimensional space. One common approximation is the Tamm-Dancoff approximation, which ignores deexcitation processes. Another is the small matrix approximation (SMA), in which the off-diagonal elements of the matrices A and B are neglected; it may work well in special conditions.
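As an illustration of Eq. (277) and of the two approximations just mentioned, the following minimal sketch solves a toy two-transition problem. It assumes real orbitals and a frequency-independent coupling matrix K; the numerical values of `omega_ks` and `K` and all function names are hypothetical, chosen only for the example. For real symmetric A and B, Eq. (277) implies $(A-B)(A+B)(X+Y) = \Omega^2 (X+Y)$, which is what `casida_full` uses.

```python
import numpy as np

def build_ab(omega_ks, K):
    """A = diag(omega) + K and B = K, cf. Eqs. (278)-(279), for a frequency-independent K."""
    omega_ks = np.asarray(omega_ks, dtype=float)
    K = np.asarray(K, dtype=float)
    return np.diag(omega_ks) + K, K.copy()

def casida_full(A, B):
    """Positive roots of Eq. (277), obtained from (A - B)(A + B) Z = Omega^2 Z."""
    w2 = np.linalg.eigvals((A - B) @ (A + B))
    return np.sort(np.sqrt(w2.real))

def casida_tda(A):
    """Tamm-Dancoff approximation: set B = 0, so Omega are the eigenvalues of A."""
    return np.linalg.eigvalsh(A)

def casida_sma(A, B):
    """Small matrix approximation: keep only the diagonal elements of A and B."""
    a, b = np.diag(A), np.diag(B)
    return np.sort(np.sqrt((a - b) * (a + b)))

# Hypothetical Kohn-Sham transition energies (eps_a - eps_i) and coupling matrix.
omega_ks = [0.30, 0.50]
K = [[0.05, 0.02], [0.02, 0.04]]

A, B = build_ab(omega_ks, K)
full = casida_full(A, B)
tda = casida_tda(A)
sma = casida_sma(A, B)
print("full Casida:", full)
print("TDA        :", tda)
print("SMA        :", sma)
```

In the SMA each transition decouples and the formula reduces to $\Omega_q^2 = \omega_q(\omega_q + 2K_{qq})$, which makes it easy to see how the kernel shifts a bare Kohn-Sham transition energy.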

B. Limitations of single-reference CC methods

The conventional single-reference coupled-cluster method is very effective for electronic states dominated by a single determinant, such as most molecular ground states near their equilibrium geometry. Such states are predominantly closed-shell singlet states, and CC calculations on them produce pure singlet wave functions. But even these states become dominated by more than one determinant when one or more bonds are stretched close to breaking. Moreover, most excited, ionized and electron-attached states are open-shell states, so that single-reference CC based on RHF orbitals is usually not appropriate for the calculation of entire potential-energy surfaces. One solution to these problems is to resort to multireference methods. An effective alternative in many cases is provided by the equation-of-motion coupled-cluster (EOM-CC) method.

C. The equation-of-motion coupled-cluster method

The basic idea of EOM-CC is to start with a conventional CC calculation on some initial state, usually a conveniently chosen closed-shell state, and obtain the desired target state by application of a CI-like linear operator acting on the initial-state CC wave function. Although the calculations for the two states must use the same set of nuclei in the same geometrical arrangement and the same set of spinorbitals defining a common Fermi vacuum state |0〉, they need not have the same number of electrons.

In the EOM-CC method we consider two Schrödinger-equation eigenstates simultaneously, an initial state Ψ0 and a target state Ψk,

HΨ0 = E0Ψ0, HΨk = EkΨk . (280)

The initial state is often referred to as the reference state. The aim of the method is to determine the energy difference

ωk = Ek −E0 (281)

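If we use the normal-product form of the Hamiltonian, equations (280) become

$$ H_N \Psi_0 = \Delta E_0 \Psi_0 \qquad (282) $$

$$ H_N \Psi_k = \Delta E_k \Psi_k \qquad (283) $$

where $\Delta E_0 = E_0 - E_{ref}$ and $\Delta E_k = E_k - E_{ref}$, with $E_{ref} = \langle 0|H|0\rangle$. Then we have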

ωk = ∆Ek −∆E0 (284)

The initial-state coupled-cluster wave function is represented by the action of an exponential wave operator $\Omega_0 = e^T$ on a single-determinant reference function $|0\rangle$,

$$ |\Psi_0\rangle = \Omega_0|0\rangle = e^T|0\rangle \qquad (285) $$

An operator $R_k$ is used to generate the target state from the initial state,

$$ |\Psi_k\rangle = R_k|\Psi_0\rangle \qquad (286) $$

so that, using (285), the target-state Schrödinger equation (283) can be written in the form

$$ H_N R_k e^T|0\rangle = \Delta E_k R_k e^T|0\rangle \qquad (287) $$

In the EOM-CC case, if all possible excitations from the initial state are included, we have

$$ R_k = r_0 + \sum_{i,a} r_i^a\, a^\dagger i + \ldots \qquad (288) $$

Since $R_k$ is an excitation operator, it commutes with the CC cluster operator T and all its components.

Multiplying (287) on the left by $e^{-T}$ and using the commutation between $R_k$ and T, we get

$$ \bar H R_k|0\rangle = \Delta E_k R_k|0\rangle \qquad (289) $$

where $\bar H = e^{-T} H_N e^T$, showing that $R_k|0\rangle$ is a right eigenfunction of $\bar H$ with eigenvalue $\Delta E_k$. Since $\bar H$ is not Hermitian, it also has left eigenfunctions $\langle 0|L_k$, with the same eigenvalues $\Delta E_k$ as the corresponding right eigenfunctions $R_k|0\rangle$, satisfying

$$ \langle 0|L_k \bar H = \langle 0|L_k \Delta E_k \qquad (290) $$

The operator $L_k$ is a de-excitation operator,

$$ L_k = l_0 + \sum_{i,a} l^i_a\, i^\dagger a + \ldots \qquad (291) $$

and therefore satisfies

$$ L_k P = 0, \qquad L_k = L_k Q \qquad (292) $$

For the initial state (k = 0) we have $R_0 = 1$, but $L_0 \neq 1$. The two sets of eigenfunctions are biorthogonal and can be normalized to satisfy

$$ \langle 0|L_k R_l|0\rangle = \delta_{kl} \qquad (293) $$

They provide a resolution of the identity,

$$ 1 = \sum_k R_k|0\rangle\langle 0|L_k \qquad (294) $$
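The biorthogonality relation (293) and the resolution of the identity (294) are generic properties of a diagonalizable non-Hermitian matrix, and can be checked numerically on a small stand-in for $\bar H$. The sketch below is only an illustration: the matrix `H_bar` is a hypothetical 4×4 non-symmetric matrix built by a similarity transformation, not an actual coupled-cluster effective Hamiltonian.

```python
import numpy as np

n = 4
# Hypothetical non-symmetric stand-in for H-bar: similarity transform of a diagonal matrix.
S = np.eye(n) + 0.2 * np.triu(np.ones((n, n)), k=1)   # unit upper triangular, always invertible
H_bar = S @ np.diag([0.0, 0.4, 0.9, 1.5]) @ np.linalg.inv(S)

# Right eigenvectors are the columns of R: H_bar R = R diag(evals).
evals, R = np.linalg.eig(H_bar)

# Left eigenvectors are the rows of L = R^{-1}; this choice enforces L R = 1, cf. Eq. (293).
L = np.linalg.inv(R)

overlap = L @ R                                                   # should be the unit matrix
resolution = sum(np.outer(R[:, k], L[k, :]) for k in range(n))    # cf. Eq. (294)
left_residual = L @ H_bar - np.diag(evals) @ L                    # cf. Eq. (290)

print(np.round(overlap.real, 12))
print(np.abs(left_residual).max())
```

Choosing the left vectors as the rows of the inverse of the right-eigenvector matrix is one convenient normalization; any rescaling $R_k \to c R_k$, $L_k \to L_k / c$ preserves Eq. (293).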

Also, because R0 = 1 we have

〈0|Lk |0〉 = δk0 (295)

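Since $R_0 = 1$, the initial-state version of (289) is

$$ \bar H|0\rangle = \Delta E_0|0\rangle \qquad (296) $$

Multiplying this equation on the left by $R_k$ and subtracting it from (289), we obtain the EOM-CC equation in the form

$$ [\bar H, R_k]|0\rangle = (\Delta E_k - \Delta E_0) R_k|0\rangle \qquad (297) $$

or

$$ (\bar H R_k|0\rangle)_C = \omega_k R_k|0\rangle \qquad (298) $$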

D. Multireference coupled-cluster methods

As in the case of quasidegenerate perturbation theory, multireference coupled-cluster (MRCC) theory is designed to deal with electronic states for which a zero-order description in terms of a single Slater determinant does not provide an adequate starting point for calculating the electron correlation effects.

All multireference methods are based on the generalized Bloch equation (Lindgren 1974)

[Ω, H0]P = VΩP −ΩP VΩP (299)


The projection operator P projects onto a model space spanned by a set of model functions $\Phi_\alpha$,

$$ P = \sum_\alpha |\Phi_\alpha\rangle\langle\Phi_\alpha| = \sum_\alpha P_\alpha, \qquad P_\alpha = |\Phi_\alpha\rangle\langle\Phi_\alpha| \qquad (300) $$

and $\Omega = \Omega P$ is the wave operator which, when operating on the model space, produces the space spanned by the perturbed wave functions,

$$ \Psi_\alpha = \Omega\Phi_\alpha \qquad (301) $$

By rearranging the terms in (299) and using $H_0 + V = H$, while noting that $P\Omega = P$, the result is

HΩ = Ω(H0P + VΩ) = Ω(H0 + V )Ω (302)

or

HΩ = ΩHΩ (303)

The functions Ψα are not individually eigenfunctions of H but span the space of eigenfunctions Ψα for which the model space forms a zero-order approximation,

HΨα = EαΨα (304)

Applying P from the left, we get the matrix eigenvalue equation

P HΩΨα = EαΨα (305)

The operator

$$ H^{eff} = P H \Omega \qquad (306) $$

which operates entirely in P-space and whose eigenfunctions and eigenvalues are $\Phi_\alpha$ and $E_\alpha$, respectively, is called the effective Hamiltonian operator. With this notation, the generalized Bloch equation (303) can be written in the form

$$ H\Omega = \Omega H^{eff} \qquad (307) $$
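The content of Eqs. (306) and (307) can be checked on a small matrix model. In the sketch below, which is an illustration under stated assumptions rather than an MRCC implementation, the model space consists of the first `d` basis vectors, the matrices `H0` and `V` are hypothetical, and the wave operator is built directly from the exact eigenvectors (something that is of course not available in a real calculation).

```python
import numpy as np

def wave_operator(H, d):
    """Wave operator Omega = Omega P for the d lowest eigenstates of H.

    The model space is spanned by the first d basis vectors. Omega is constructed
    from exact eigenvectors so that P Omega = P (intermediate normalization).
    """
    n = H.shape[0]
    evals, evecs = np.linalg.eigh(H)
    U = evecs[:, :d]            # target eigenvectors
    C = U[:d, :]                # their projections onto the model space
    Omega = np.zeros((n, n))
    Omega[:, :d] = U @ np.linalg.inv(C)
    return Omega, evals[:d]

n, d = 5, 2
H0 = np.diag([0.0, 0.1, 1.0, 1.5, 2.0])          # hypothetical zeroth-order energies
V = 0.05 * (np.ones((n, n)) - np.eye(n))         # hypothetical perturbation
H = H0 + V
P = np.diag([1.0] * d + [0.0] * (n - d))

Omega, exact = wave_operator(H, d)
H_eff = P @ H @ Omega                            # Eq. (306)

bloch_residual = H @ Omega - Omega @ H_eff       # Eq. (307): should vanish
eff_eigs = np.sort(np.linalg.eigvals(H_eff[:d, :d]).real)
print(eff_eigs, exact)
```

The effective Hamiltonian is a small non-symmetric matrix acting only in the model space, yet its eigenvalues coincide with exact eigenvalues of the full H, which is the property that multireference methods exploit.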

The Hilbert-space approach to MRCC theory assumes a separate Fermi-vacuum definition, and thus a separate partition of the spinorbitals into hole and particle states, for each model-space determinant.

The wave operator is separated into individual wave operators for the different model states,

$$ \Omega = \sum_\alpha \Omega_\alpha = \sum_\alpha e^{T^\alpha} P_\alpha \qquad (308) $$

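Substituting the definition of the wave operator (308), the generalized Bloch equation (307) may be written in the form

$$ \sum_\beta H e^{T^\beta} P_\beta = \sum_\beta e^{T^\beta} P_\beta H^{eff} P \qquad (309) $$

Multiplying on the left by $e^{-T^\alpha}$ and on the right by $P_\alpha$, we obtain

$$ e^{-T^\alpha} H e^{T^\alpha} P_\alpha = \sum_\beta e^{-T^\alpha} e^{T^\beta} P_\beta H^{eff} P_\alpha \qquad (310) $$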

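Applying an external-space determinant $\langle\Phi^{ab\ldots}_{ij\ldots}(\alpha)| = \langle a^\dagger i\, b^\dagger j \ldots \Phi_\alpha|$ on the left and the model function $|\Phi_\alpha\rangle$ on the right, we obtain equations for the external-excitation amplitudes $t^{ab\ldots}_{ij\ldots}(\alpha)$ contained in the operators $T^\alpha$,

$$ \langle\Phi^{ab\ldots}_{ij\ldots}(\alpha)|e^{-T^\alpha} H e^{T^\alpha}|\Phi_\alpha\rangle = \sum_\beta \langle\Phi^{ab\ldots}_{ij\ldots}(\alpha)|e^{-T^\alpha} e^{T^\beta}|\Phi_\beta\rangle \langle\Phi_\beta|H^{eff}|\Phi_\alpha\rangle \qquad (311) $$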

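The matrix elements of the effective Hamiltonian $H^{eff}$ appearing in this equation are obtained, using (306) and (308), as

$$ H^{eff}_{\beta\alpha} = \langle\Phi_\beta|H^{eff}|\Phi_\alpha\rangle = \langle\Phi_\beta|H\Omega|\Phi_\alpha\rangle = \langle\Phi_\beta|H e^{T^\alpha}|\Phi_\alpha\rangle \qquad (312) $$

We can use the CC effective Hamiltonian for model function $\Phi_\alpha$,

$$ \bar H^\alpha = e^{-T^\alpha} H e^{T^\alpha} \qquad (313) $$

Then the equations for the external-excitation amplitudes take the form

$$ \langle\Phi^{ab\ldots}_{ij\ldots}(\alpha)|\bar H^\alpha|\Phi_\alpha\rangle = \sum_\beta \langle\Phi^{ab\ldots}_{ij\ldots}(\alpha)|e^{-T^\alpha} e^{T^\beta}|\Phi_\beta\rangle H^{eff}_{\beta\alpha} \qquad (314) $$

To evaluate the matrix element in (312), we note that

$$ \langle\Phi_\alpha|e^{-T^\alpha} = \langle\Phi_\alpha|(1 - T^\alpha + \ldots) = \langle\Phi_\alpha| \qquad (315) $$

and therefore

$$ H^{eff}_{\alpha\alpha} = \langle\Phi_\alpha|e^{-T^\alpha} H e^{T^\alpha}|\Phi_\alpha\rangle = \langle\Phi_\alpha|\bar H^\alpha|\Phi_\alpha\rangle \qquad (316) $$

For the off-diagonal elements we insert $e^{T^\alpha} e^{-T^\alpha} = 1$ and obtain

$$ H^{eff}_{\beta\alpha} = \langle\Phi_\beta|e^{T^\alpha} \bar H^\alpha|\Phi_\alpha\rangle \qquad (317) $$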


Next, we consider the first factor in the sum on the r.h.s. of (314),

$$ S^{ab\ldots}_{ij\ldots}(\alpha\beta) = \langle\Phi^{ab\ldots}_{ij\ldots}(\alpha)|e^{-T^\alpha} e^{T^\beta}|\Phi_\beta\rangle = \langle\Phi^{ab\ldots}_{ij\ldots}(\alpha)|e^{-T^\alpha} e^{T^\beta}|\Phi^{xy\ldots}_{uv\ldots}(\alpha)\rangle \qquad (318) $$

where the model function $\Phi_\beta$ has been written as an excitation $\Phi^{xy\ldots}_{uv\ldots}(\alpha)$ relative to $\Phi_\alpha$. Inserting a resolution of the identity between the two exponentials, we obtain

$$ S^{ab\ldots}_{ij\ldots}(\alpha\beta) = \sum_I \langle\Phi^{ab\ldots}_{ij\ldots}(\alpha)|e^{-T^\alpha}|\Phi_I\rangle \langle\Phi_I|e^{T^\beta}|\Phi^{xy\ldots}_{uv\ldots}(\alpha)\rangle \qquad (319) $$

The series expansions of the exponentials in this equation result in expressions involving CI-like amplitudes, corresponding to linear combinations of T amplitudes and their products. Combining the matrix $H^{eff}$ with the CI-like amplitudes, an eigenvalue equation for $H^{eff}$ can be derived. The diagrammatic representation and the evaluation of this expansion are described in detail by Paldus, Li and Petraco (2004). Several applications of Hilbert-space SU-MRCCSD were discussed by Li and Paldus (2003c, 2004), who compared model spaces of different dimensions with high-excitation single-reference CI.

XII. INTERMOLECULAR INTERACTIONS

Intermolecular interactions (forces) determine the structure and properties of clusters, nanostructures, and condensed phases including biosystems. The interaction energy of a cluster of N atoms or molecules (called monomers) with n electrons is defined in the following way. The time-independent Schrödinger equation in the Born-Oppenheimer approximation can be written as

H(r1, ...,rn;Q1, ...,QN )Ψ (x1, ...,xn;Q1, ...,QN ) = Etot(Q1, ...,QN )Ψ (x1, ...,xn;Q1, ...,QN )

with the usual notation for the electron coordinates and with the variable $Q_i = (R_i, \omega_i, \xi_i)$ denoting the set of coordinates needed to specify the geometry of the ith monomer: the position of the center of mass, $R_i$, the set of three Euler angles $\omega_i$ defining the orientation of the monomer, and a set of internal monomer coordinates, $\xi_i$. The interaction energy is then defined as the difference between $E_{tot}$ and the sum of monomer energies $E_i(\xi_i)$,

$$ E_{int}(Q_1, \ldots, Q_N) = E_{tot}(Q_1, \ldots, Q_N) - \sum_i E_i(\xi_i). $$

Interaction energies defined in this way are sometimes called "vertical" interaction energies since the geometry of each monomer is the same as its geometry in the cluster.

The interaction energy of an N-mer can be represented in the form of the following many-body expansion (assuming rigid monomers)

$$ E_{int}[N] = E_{int}[2,N] + E_{int}[3,N] + \ldots + E_{int}[N,N], $$


where the term $E_{int}[k,N]$ is called the k-body contribution to the N-mer energy. The two-body contribution is just the sum of the interaction energies of all isolated monomer pairs, i.e., all dimers,

$$ E_{int}[2,N] = \sum_{i<j} E_{int}(Q_i,Q_j)[2,2]. $$

Analogously, the three-body contribution is

$$ E_{int}[3,N] = \sum_{i<j<k} E_{int}(Q_i,Q_j,Q_k)[3,3]. $$

This definition applied to a trimer (N = 3) shows that the trimer three-body energy is

$$ E_{int}(Q_1,Q_2,Q_3)[3,3] = E_{int}[N] - \sum_{i<j}^{3} E_{int}(Q_i,Q_j)[2,2]. $$

The higher-rank terms are defined in an analogous way.

Any of the electronic structure methods can be used to compute interaction energies from the definition given above (the so-called supermolecular approach). However, since interaction energies are usually more than an order of magnitude smaller in absolute value than chemical-bond energies and at least four orders of magnitude smaller in absolute value than the total electronic energies of atoms or molecules, the most natural method for investigating these phenomena is to start from isolated monomers and treat the interactions as small perturbations of this system. Such an approach is called symmetry-adapted perturbation theory (SAPT). An early version of SAPT was introduced already in the 1930s by Eisenschitz and London. A generally applicable SAPT was developed in the late 1970s and 1980s.
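The many-body expansion defined above can be evaluated recursively: the k-body term of a k-mer is its interaction energy minus all lower-body terms of its subclusters. The sketch below illustrates this with a purely hypothetical model energy function (`toy_interaction_energy`, an attractive pair term plus a repulsive triple term); in a real application that function would be replaced by supermolecular electronic-structure calculations. All names are assumptions made for the example.

```python
from itertools import combinations
import math

def toy_interaction_energy(positions):
    """Hypothetical model E_int of a cluster: -1/r^6 pair terms plus a three-body term."""
    e = 0.0
    for a, b in combinations(positions, 2):
        e -= 1.0 / math.dist(a, b) ** 6
    for a, b, c in combinations(positions, 3):
        e += 0.1 / (math.dist(a, b) * math.dist(a, c) * math.dist(b, c)) ** 3
    return e

def many_body_terms(monomers, interaction_energy):
    """Return {k: E_int[k, N]} for k = 2..N.

    eps[subset] stores the k-body energy of that k-mer, i.e. its interaction energy
    minus the lower-body contributions of all its proper subclusters.
    """
    n = len(monomers)
    eps = {}
    contributions = {}
    for k in range(2, n + 1):
        total_k = 0.0
        for subset in combinations(range(n), k):
            e = interaction_energy([monomers[i] for i in subset])
            for j in range(2, k):
                for sub in combinations(subset, j):
                    e -= eps[sub]
            eps[subset] = e
            total_k += e
        contributions[k] = total_k
    return contributions

# Four hypothetical monomers, represented by their center-of-mass positions.
positions = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 5.0)]
terms = many_body_terms(positions, toy_interaction_energy)
for k, value in terms.items():
    print(f"E_int[{k},4] = {value:.6e}")
```

Because the model energy contains only pair and triple terms, the four-body contribution comes out as zero up to rounding, and the sum of all contributions reproduces the total interaction energy of the tetramer.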

A. Symmetry-adapted perturbation theory

The simplest perturbation theory of intermolecular interactions is just the standard Rayleigh-Schrödinger (RS) perturbation theory discussed earlier. For a dimer, we partition the total Hamiltonian as

H = H0 + V = HA + HB + V

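where $H_X$ is the Hamiltonian of the isolated monomer X and V is the intermonomer interaction potential

$$ V = \sum_{\alpha\in A}\sum_{\beta\in B} \frac{Z_\alpha Z_\beta}{R_{\alpha\beta}} - \sum_{\alpha\in A}\sum_{j\in B} \frac{Z_\alpha}{r_{j\alpha}} - \sum_{i\in A}\sum_{\beta\in B} \frac{Z_\beta}{r_{i\beta}} + \sum_{i\in A}\sum_{j\in B} \frac{1}{r_{ij}}. $$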

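The zeroth-order problem is

$$ (H_0 - E_0)\Phi_0 = 0 $$

where $\Phi_0 = \Phi_A\Phi_B$ and $E_0 = E_A + E_B$. We get the standard set of RS equations

$$ (H_0 - E_0)\Phi^{(n)}_{RS} = -V\Phi^{(n-1)}_{RS} + \sum_{k=1}^{n} E^{(k)}_{RS}\,\Phi^{(n-k)}_{RS} $$

$$ E^{(n)}_{RS} = \langle\Phi_0|V\Phi^{(n-1)}_{RS}\rangle. $$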

It can be shown (a homework problem) that the first-order energy

$$ E^{(1)}_{RS} = \langle\Phi_0|V\Phi_0\rangle $$

can be expressed as a Coulomb interaction of the unperturbed charge densities of the monomers, i.e., an electrostatic interaction. Therefore, this term is usually called the electrostatic energy and denoted as $E^{(1)}_{elst}$.

The second-order energy can be written in the form of the usual spectral expansion

$$ E^{(2)}_{RS} = \sum_{k+l\neq 0} \frac{|\langle\Phi^A_0\Phi^B_0|V\Phi^A_k\Phi^B_l\rangle|^2}{E^A_0 + E^B_0 - E^A_k - E^B_l}. $$

This energy consists of two physically distinct components, the induction energy

$$ E^{(2)}_{ind} = \sum_{k\neq 0} \frac{|\langle\Phi^A_0\Phi^B_0|V\Phi^A_k\Phi^B_0\rangle|^2}{E^A_0 - E^A_k} + \sum_{l\neq 0} \frac{|\langle\Phi^A_0\Phi^B_0|V\Phi^A_0\Phi^B_l\rangle|^2}{E^B_0 - E^B_l} $$

and the dispersion energy

$$ E^{(2)}_{disp} = \sum_{k\neq 0}\sum_{l\neq 0} \frac{|\langle\Phi^A_0\Phi^B_0|V\Phi^A_k\Phi^B_l\rangle|^2}{E^A_0 + E^B_0 - E^A_k - E^B_l}. $$

In the induction energy expression, one can integrate in the first (second) sum over the coordinates of system B (A), obtaining in this way the electrostatic potential of monomer B (A) acting on system A (B). Thus, the induction energy is the response of a monomer to the electrostatic field of the interacting partner. The dispersion energy term is a pure quantum effect resulting from the correlation of electronic positions in system A with those in system B.
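The split of the second-order energy into induction and dispersion is simply a partition of the sum over product states: terms where only one monomer is excited versus terms where both are. The following sketch shows this bookkeeping for a hypothetical finite-dimensional model, a three-level "monomer A" and a two-level "monomer B" coupled by an arbitrary symmetric matrix `V`. The energies and the matrix are invented for the illustration and do not represent any physical system.

```python
import numpy as np

def second_order_terms(EA, EB, V):
    """Induction and dispersion parts of the RS second-order energy.

    EA, EB : monomer energies, ground state first.
    V      : interaction matrix in the product basis |k l>, with index k * len(EB) + l.
    """
    nA, nB = len(EA), len(EB)
    V4 = V.reshape(nA, nB, nA, nB)              # V4[k, l, k', l'] = <k l|V|k' l'>
    coupling = V4[0, 0]                          # <0 0|V|k l>
    denom = (EA[0] - EA)[:, None] + (EB[0] - EB)[None, :]
    mask = np.ones((nA, nB), dtype=bool)
    mask[0, 0] = False                           # exclude the unperturbed state
    terms = np.zeros((nA, nB))
    terms[mask] = np.abs(coupling[mask]) ** 2 / denom[mask]
    e_ind = terms[1:, 0].sum() + terms[0, 1:].sum()   # one monomer excited
    e_disp = terms[1:, 1:].sum()                       # both monomers excited
    return e_ind, e_disp

# Hypothetical model: all energies and couplings are arbitrary illustration values.
EA = np.array([0.0, 1.0, 1.6])
EB = np.array([0.0, 0.8])
V = 0.01 * np.ones((6, 6))

e_ind, e_disp = second_order_terms(EA, EB, V)
e1 = V[0, 0]
print("E(1)      =", e1)
print("E(2)_ind  =", e_ind)
print("E(2)_disp =", e_disp)
```

For a weak coupling such as this one, the sum $E_0 + E^{(1)} + E^{(2)}_{ind} + E^{(2)}_{disp}$ should be close to the lowest eigenvalue of the full model Hamiltonian, with the remaining difference of third order in V.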

The RS approach uses wave functions that are not globally antisymmetric; they are antisymmetric only with respect to exchanges of electrons within a monomer. We often say that the RS theory violates Pauli's exclusion principle. Despite this, it was shown by performing numerical calculations for one- and two-electron monomers that the RS theory actually does converge to the correct ground-state energy. However, this is no longer true if even one of the monomers includes three or more electrons. We will return to this subject later on. An even more serious problem is that the RS approach does not give the repulsive walls at short intermonomer distances, i.e., it becomes unphysical there. It is easiest to see this in interactions of rare-gas atoms, where the electrostatic energy is very small (cf. classical electrostatic interactions of spherical charge distributions), so the interaction energy is dominated by the second-order term, which we know is negative for the ground state. Despite problems at small separations, the RS method gives nearly exact energies at large separations, as will be discussed below, and is the basis for the multipole expansion of the interaction energy, also discussed later.

To solve this problem of the RS method, one has to antisymmetrize the wave functions. We can no longer use the RS approach since already Φ0 has to be antisymmetrized, so that the zeroth-order equation does not hold. There are several ways of introducing the antisymmetry constraint in a perturbative way, leading to the family of SAPT methods. One way to derive several variants of SAPT is to iterate the Bloch form of Schrödinger's equation

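$$ \Psi = \Phi_0 + R_0\left(\langle\Phi_0|V\Psi\rangle - V\right)\Psi \qquad (320) $$

where

$$ R_0 = \sum_{m\neq 0} \frac{|\Phi_m\rangle\langle\Phi_m|}{E_m - E_0} \qquad (321) $$

is the same resolvent operator as used before. To derive Bloch's equation, write Schrödinger's equation as

$$ (H_0 + V)\Psi = (E_0 + \Delta E)\Psi $$

or

$$ (H_0 - E_0)\Psi = (\Delta E - V)\Psi \qquad (322) $$

and act from the left with $R_0$. Since $R_0(H_0 - E_0) = 1 - |\Phi_0\rangle\langle\Phi_0|$ and assuming intermediate normalization $\langle\Phi_0|\Psi\rangle = 1$, we get the Bloch equation, where $\Delta E = \langle\Phi_0|V\Psi\rangle$ follows from multiplication of Eq. (322) by $\langle\Phi_0|$.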

The Bloch equation can be iterated starting from replacing Ψ by Φ0. We get

$$ \Psi_n = \Phi_0 + R_0\,(E_n - V)\,\Psi_{n-1} $$

with

$$ E_n = \langle\Phi_0|V\Psi_{n-1}\rangle. $$
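The iteration just written down is easy to try on a small matrix problem. The sketch below assumes a diagonal $H_0$ with nondegenerate ground state and a weak, hypothetical perturbation matrix `V`; the function name `bloch_iterate` and all numbers are illustrative assumptions. The resolvent of Eq. (321) is diagonal in this basis, so it is stored as a vector.

```python
import numpy as np

def bloch_iterate(levels, V, n_iter=100):
    """Iterate Psi_n = Phi_0 + R_0 (E_n - V) Psi_{n-1}, with E_n = <Phi_0|V Psi_{n-1}>.

    levels : eigenvalues of the diagonal H_0, ground state first.
    Returns the converged energy shift Delta E and the wave function in
    intermediate normalization.
    """
    levels = np.asarray(levels, dtype=float)
    phi0 = np.zeros(len(levels))
    phi0[0] = 1.0
    r0 = np.zeros(len(levels))
    r0[1:] = 1.0 / (levels[1:] - levels[0])      # reduced resolvent, Eq. (321)
    psi = phi0.copy()
    for _ in range(n_iter):
        e_n = phi0 @ (V @ psi)
        psi = phi0 + r0 * (e_n * psi - V @ psi)
    return phi0 @ (V @ psi), psi

# Hypothetical three-level model with a weak symmetric perturbation.
levels = [0.0, 1.0, 2.5]
V = 0.1 * np.array([[0.5, 1.0, 0.3],
                    [1.0, -0.2, 0.4],
                    [0.3, 0.4, 0.1]])

delta_e, psi = bloch_iterate(levels, V)
exact = np.linalg.eigvalsh(np.diag(levels) + V)[0]
print("iterated Delta E:", delta_e)
print("exact    Delta E:", exact - levels[0])
```

For a perturbation that is small compared with the level spacing, the iteration converges to the exact energy shift; for strong coupling it may diverge, in the same way as the perturbation series it reproduces.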

Note that n is not the order of perturbation theory here. This set of equations is equivalent to RS perturbation theory in the sense that consecutive iterations reproduce higher and higher orders of this theory. However, since AΨ = Ψ, where A is the antisymmetrizer defined earlier, we can insert A in front of Ψ on the right-hand side of Eq. (320). After iterating, one gets the following set of equations

$$ \Psi_n = \Phi_0 + R_0\left(\langle\Phi_0|V G\Psi_{n-1}\rangle - V\right) F\,\Psi_{n-1} $$

$$ E_n = \frac{\langle\Phi_0|V G'\Psi_{n-1}\rangle}{\langle\Phi_0|G'\Psi_{n-1}\rangle} \qquad (323) $$

where F, G and G′ can be A or 1. Particular choices of these operators lead to the following SAPT methods:

F    G    G′   Ψ0     name
1    1    A    Φ0     Symmetrized RS (SRS)
A    1    1    AΦ0    Jeziorski-Kolos (JK)
A    A    A    AΦ0    Eisenschitz-London-Hirschfelder-van der Avoird (EL-HAV)

One may think that the EL-HAV method, applying the antisymmetrizer in all possible places, should work best. This is not the case since at large intermolecular separations this method in low order is not compatible with the RS approach (we omit the proof) and, as mentioned above, the RS approach is very accurate at such separations. The reason is that the exchange effects in interaction energies, i.e., effects resulting from the global antisymmetrization, decay exponentially with increasing intermonomer separation R, whereas the total interaction energy, as will be shown below, decays as inverse powers of R. One can see the former from the zeroth-order antisymmetrized wave function for two interacting hydrogen atoms

$$ \mathcal{A}\,[1s_A(r_1)1s_B(r_2)] = (1 \pm P_{12})\,1s_A(r_1)1s_B(r_2) = 1s_A(r_1)1s_B(r_2) \pm 1s_A(r_2)1s_B(r_1) $$

where we use the spin-free approach and therefore consider two types of states which, after multiplication by singlet and triplet spin functions, will form antisymmetric wave functions. The first term in the last part of the equation written above is the Φ0 of the RS theory and the second term is the exchange one. The latter term, when used in the bra of expression (323), will lead to integrals where electron 1 is on center A in the bra and on center B in the ket, and similarly for electron 2. The effect is that all such integrals are proportional to two-center overlap integrals, and such integrals have to decay exponentially since wave functions decay exponentially.

In contrast to EL-HAV, the SRS and JK methods have correct asymptotics. This correctness is evident for SRS since the SRS wave function corrections are the same as those of RS. Moreover, the two latter methods are identical in the first two orders. In practice, modern SAPT implementations always use SRS due to its simplicity.

The derivation presented above assumed that one knows the exact wave functions of the monomers. In practice, this is possible only for the smallest atoms. Thus, a generally applicable SAPT has to use methods analogous to the MBPT methods discussed earlier. The zeroth-order approximation that can be computed accurately for very large systems is the Hartree-Fock level. One then simultaneously accounts for the intramonomer correlation energies, e.g., at the MP2 level or at the CCSD level, and the SAPT expansion effects.


We conclude this section by examining the spectrum of Li–H in order to understand why RS perturbation theory has to diverge for dimers containing one or two monomers with three or more electrons. The simplest example of such a system is Li–H. The left column of the figure shows the spectrum of the unperturbed system, i.e., the energies $E^{Li}_k + E^{H}_l$. The right column shows the physical spectrum of LiH at the minimum separation of the dimer. The word physical indicates that the wave functions are completely antisymmetrized. The middle column shows the spectrum of LiH at the same R but in the space of functions that are antisymmetrized only within Li. Since the Pauli exclusion principle does not apply to such states, one may have states well approximated by a wave function with three electrons occupying the 1s orbital: two electrons of Li and one electron coming from H. One can show (see a homework problem) that the lowest energy of such a system is much below the lowest physical energy and that the continuous spectrum starts below the lowest physical state. The physical states thus appear in this spectrum "submerged" in the unphysical (sometimes called Pauli-forbidden) continuum. Since this means that the physical states are degenerate with this continuum, one cannot expect convergence.

FIG. 8. Spectrum of Li–H.

B. Asymptotic expansion of interaction energy

When the distance between monomers becomes large, one can expand the interaction potential V in a multipole series. This series is defined in most E&M textbooks. For the electron repulsion term, we have

$$ \frac{1}{|\mathbf r_1 - \mathbf r_2|} = \sum_{l_A,l_B=0}^{\infty}\ \sum_{m=-l_<}^{l_<} \frac{K^m_{l_A l_B}}{R^{l_A+l_B+1}}\, Q^m_{l_A}(\mathbf r_1)\, Q^{-m}_{l_B}(\mathbf r_2), $$

where $l_< = \min(l_A, l_B)$, K is a combinatorial coefficient

$$ K^m_{l_A l_B} = (-1)^{l_B} \frac{(l_A + l_B)!}{[(l_A + m)!\,(l_A - m)!\,(l_B + m)!\,(l_B - m)!]^{1/2}} $$

and the solid harmonics $Q^m_l(\mathbf r)$ (also called $2^l$-pole moment operators) are expressed through the standard spherical harmonics as

$$ Q^m_l(\mathbf r) = -\left(\frac{4\pi}{2l+1}\right)^{1/2} r^l\, Y^m_l(\hat{\mathbf r}). $$

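One of the homework problems shows that the first-order electrostatic energy can be written as

$$ E^{(1)}_{elst} = \iint \rho_A(\mathbf r_1)\, v(\mathbf r_1,\mathbf r_2)\, \rho_B(\mathbf r_2)\, d^3r_1\, d^3r_2 $$

where

$$ v(\mathbf r_1,\mathbf r_2) = \frac{1}{|\mathbf r_1 - \mathbf r_2|} - \frac{1}{N_B}\sum_\beta \frac{Z_\beta}{|\mathbf r_1 - \mathbf R_\beta|} - \frac{1}{N_A}\sum_\alpha \frac{Z_\alpha}{|\mathbf r_2 - \mathbf R_\alpha|} + \frac{1}{N_A N_B}\sum_{\alpha,\beta} \frac{Z_\alpha Z_\beta}{|\mathbf R_\alpha - \mathbf R_\beta|} \qquad (324) $$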

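with the sums over α and β running over the nuclei of systems A and B, respectively, $N_A$ and $N_B$ denoting the numbers of electrons of the monomers (the factors $1/N_X$ compensate for the integration of $\rho_X$ over the coordinate that does not appear in a given term), and the $Z_\gamma$'s denoting the nuclear charges. Let us apply the asymptotic expansion to the first term in this expression:

$$ \iint \rho_A(\mathbf r_1) \frac{1}{|\mathbf r_1 - \mathbf r_2|} \rho_B(\mathbf r_2)\, d^3r_1\, d^3r_2 = \sum_{l_A,l_B=0}^{\infty}\ \sum_{m=-l_<}^{l_<} \frac{K^m_{l_A l_B}}{R^{l_A+l_B+1}} \int \rho_A(\mathbf r_1)\, Q^m_{l_A}(\mathbf r_1)\, d^3r_1 \int \rho_B(\mathbf r_2)\, Q^{-m}_{l_B}(\mathbf r_2)\, d^3r_2 $$

The first (second) of the two integrals can be recognized as a component of the multipole moment of monomer A (B) of rank $l_A$ ($l_B$). If the molecules are neutral and polar, the first nonvanishing moment is the dipole moment. For such systems the electrostatic energy decays as $1/R^3$. Similar derivations can be performed for the remaining terms in Eq. (324) and in higher orders of RS perturbation theory.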
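The $1/R^3$ decay for neutral polar monomers can be checked numerically with a classical point-charge model. In the sketch below each "molecule" is a hypothetical pair of charges ±q separated by d along the z axis, so that its dipole moment is μ = qd; the exact Coulomb energy between the two charge sets is compared with the leading dipole-dipole term, which for two collinear, parallel dipoles is $-2\mu_A\mu_B/R^3$. The function names and parameter values are assumptions made for the illustration.

```python
import numpy as np

def coulomb_energy(charges_a, pos_a, charges_b, pos_b):
    """Exact electrostatic interaction energy between two sets of point charges."""
    energy = 0.0
    for qa, ra in zip(charges_a, pos_a):
        for qb, rb in zip(charges_b, pos_b):
            energy += qa * qb / np.linalg.norm(np.asarray(ra) - np.asarray(rb))
    return energy

def polar_molecule(center_z, q=1.0, d=1.0):
    """Hypothetical neutral polar 'molecule': +q and -q on the z axis, dipole q*d along +z."""
    charges = [q, -q]
    positions = [(0.0, 0.0, center_z + d / 2), (0.0, 0.0, center_z - d / 2)]
    return charges, positions

def dipole_dipole_collinear(mu_a, mu_b, R):
    """Leading multipole term for two parallel dipoles lying on the intermolecular axis."""
    return -2.0 * mu_a * mu_b / R**3

for R in (5.0, 10.0, 50.0):
    qa, ra = polar_molecule(0.0)
    qb, rb = polar_molecule(R)
    exact = coulomb_energy(qa, ra, qb, rb)
    asym = dipole_dipole_collinear(1.0, 1.0, R)
    print(f"R = {R:5.1f}  exact = {exact: .6e}  dipole-dipole = {asym: .6e}")
```

The agreement improves as R grows, since the neglected higher multipole terms decay with higher inverse powers of R.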

C. Intermolecular interactions in DFT

Density functional theory (DFT) is the most often used method in computational studies of matter. In the standard Kohn-Sham (KS) implementation, all electron correlation effects are included in the exchange-correlation energy. The exact form of this energy is unknown, and a large number of approximate functionals have been constructed to describe it, as discussed earlier. While such functionals describe many properties of matter quite accurately, there are also several properties where all existing functionals fail, and one such example is intermolecular interactions, which involve atoms or molecules separated by several angstroms or more. The local density approximation (LDA) obviously misses any interactions between distant regions. The semilocal generalized gradient approximations (GGAs) still cannot describe long-range electron correlations due to the limited range of the exchange-correlation hole. The size of such a hole is of the order of 1 Å, so correlation interactions between regions separated by much more than this distance cannot be recovered. One can say that these methods are myopic, with a range of vision of about 1 Å.

Interaction energies given by most DFT methods can be brought into agreement with accurate interaction energies by adding a negative correction, which at very large R (for systems with no dipole and quadrupole moments) is simply the dispersion energy. For shorter R, the dispersion energy has to be tapered, differently for each DFT method. This observation led to a family of methods supplementing DFT interaction energies by a "dispersion" correction (computed for example as an atom-atom function fitted to results of calculations with wave function methods and properly tapered), referred to as DFT+D type methods. The DFT+D methods are reasonably successful, reproducing complete interaction energy curves with errors of the order of a few percent, but this approach is no longer a first-principles one.

There were also several so-called nonlocal density functionals created. These are first-principles approaches but at the present time are less accurate than DFT+D methods.

XIII. DIFFUSION MONTE CARLO

The diffusion Monte Carlo (DMC) method provides a way, different from those seen so far in this course, to solve the time-dependent Schrödinger equation of a system. Let us consider a single particle of mass m in a one-dimensional box. The transformation from real time to imaginary time is made by the substitution

$$ \tau = it \qquad (325) $$

together with the shift of the energy scale $V(x) \to V(x) - E_R$, where $E_R$ is a reference energy. With this transformation the Schrödinger equation reads

$$ \hbar\,\partial_\tau\psi = \frac{\hbar^2}{2m}\,\partial_x^2\psi - [V(x) - E_R]\,\psi \qquad (326) $$

Its solution can be written as

$$ \psi(x,\tau) = \sum_n c_n\,\phi_n(x)\,\exp[-(E_n - E_R)\tau/\hbar] \qquad (327) $$


where $\phi_n(x)$ and $E_n$ are the eigenstates and eigenvalues of the time-independent Schrödinger equation, respectively. There are three possibilities for $\tau\to\infty$: (i) if $E_R > E_0$, the wavefunction diverges exponentially fast; (ii) if $E_R < E_0$, the wavefunction vanishes; and (iii) if $E_R = E_0$, we get $\psi(x,\tau) = c_0\phi_0(x)$. This behavior provides the basis of the DMC method: for $E_R = E_0$ the wavefunction $\psi(x,\tau)$ converges to the ground state $\phi_0(x)$ regardless of the choice of the initial wavefunction $\psi(x,0)$, as long as there is an overlap between the initial wavefunction and the ground state, namely as long as $c_0 \neq 0$. The path-integral method can be used to solve Eq. (326). Readers are referred to standard quantum mechanics textbooks to convince themselves that the following equation is true:

$$ \psi(x,\tau) = \lim_{N\to\infty} \int_{-\infty}^{\infty} \left(\prod_{j=0}^{N-1} dx_j\right) \prod_{n=1}^{N} W(x_n)\,P(x_n,x_{n-1})\,\psi(x_0,0) \qquad (328) $$

where the probability density P and the weight function W are given by

$$ P(x_n,x_{n-1}) = \sqrt{\frac{m}{2\pi\hbar\Delta\tau}}\, \exp\left[-\frac{m(x_n - x_{n-1})^2}{2\hbar\Delta\tau}\right] \qquad (329) $$

$$ W(x_n) = \exp[-(V(x_n) - E_R)\Delta\tau/\hbar] \qquad (330) $$

with $\Delta\tau = \tau/N$. Note that $\int_{-\infty}^{\infty} P(x,y)\,dy = 1$ and that P is a Gaussian probability density for the random variable $x_n$ with mean $x_{n-1}$ and standard deviation $\sigma = \sqrt{\hbar\Delta\tau/m}$.

Equation (328) has to be solved numerically, and one may use the so-called Monte Carlo method. In this method an N-dimensional integral

$$ I = \int_{-\infty}^{\infty} \left(\prod_{j=0}^{N-1} dx_j\right) f(x_0, \ldots, x_{N-1})\, P(x_0, \ldots, x_{N-1}) \qquad (331) $$

with P being a probability density can be approximated as

$$ I \approx \frac{1}{\mathcal N} \sum_{\substack{i=1 \\ x^{(i)}\in P}}^{\mathcal N} f(x^{(i)}_0, \ldots, x^{(i)}_{N-1}) \qquad (332) $$

where $x^{(i)}_j \in P$ means that the sample points, $i = 1, 2, \ldots, \mathcal N$; $j = 0, 1, \ldots, N-1$, are selected randomly with probability density P. The larger the number of sample points $\mathcal N$, the better the approximation for I.

While the Monte Carlo method is able to calculate $\psi(x,\tau)$, it is unable to find $E_0$ and $\phi_0(x)$. An improvement over this method is called diffusion Monte Carlo, which will be explained here. The basic idea is to consider the wavefunction as a probability density, sampling the initial wavefunction $\psi(x_0,0)$ at $N_0$ points. In fact, this method generates $N_0$ Gaussian random walkers which evolve in time:

$$ x^{(i)}_n = x^{(i)}_{n-1} + \sigma\rho^{(i)}_n \qquad (333) $$

where $x^{(i)}_n$ is generated according to Eq. (329) with mean value $x^{(i)}_{n-1}$ and standard deviation σ, and $\rho^{(i)}_n$ is a Gaussian random number with mean 0 and variance 1. This stochastic process looks exactly like a Brownian diffusion process. The generated "random walkers" are called "particles" or "replicas" in the DMC method. Instead of tracing the motion of each particle, one follows the motion of the whole ensemble of replicas. The integrand in Eq. (328) can be interpreted as:

W (xn)P (xn,xn−1)...W (x2)P (x2,x1)W (x1)P (x1,x0)ψ(x0,0) (334)

where $\psi(x_0,0)$, $P(x_1,x_0)$, $W(x_1)$, ..., $P(x_n,x_{n-1})$ and $W(x_n)$ are process 0, process 1, process 2, ..., process 2N − 1 and process 2N, respectively.

Initial state: The 0th process describes particles distributed according to the initial wavefunction $\psi(x_0,0)$, which is typically chosen as a δ-function, $\psi(x,0) = \delta(x - x_0)$.

Diffusive displacement: The DMC algorithm produces $x_1 = x_0 + \sigma\rho_1$, $x_2 = x_1 + \sigma\rho_2$, etc., by generating random numbers $\rho_n$, $n = 1, 2, \ldots$

Birth-death processes: After each time step, each particle is replaced by a number of replicas given by

$$ m_n = \min[\,\mathrm{int}[W(x_n) + u],\, 3\,] \qquad (335) $$

where $u$ is a random number uniformly distributed in $[0,1]$. If $m_n = 0$, the particle is deleted and its diffusion process is terminated (death). If $m_n = 1$, there is no effect: the particle stays alive and the algorithm takes it to the next diffusion step. If $m_n = 2$, the particle goes to the next diffusion step and another particle starts a new series at the present location (birth). If $m_n = 3$, the scenario is similar to the previous case, but two newly born replicas start at the current location.

Algorithm: We can now summarize the algorithmic steps of DMC:

1) One starts with $N_0$ particles at positions $x_0^{(i)}$, $i = 1,2,\ldots,N_0$, which are placed according to the distribution $\psi(x_0,0)$. It is more convenient to choose all replicas to start at the same point $x_0$.

2) Rather than following the fate of each replica separately, one follows all replicas simultaneously:

$$x_1^{(j)} = x_0^{(j)} + \sqrt{\hbar\Delta\tau/m}\;\rho_1^{(j)}, \qquad j = 1,2,\ldots,N_0 \tag{336}$$

This is regarded as a one-step diffusion process of the replicas.

3) Once the new position $x_1^{(j)}$ is calculated, one evaluates $W(x_1^{(j)})$ through Eq. 330, and from Eq. 335 one determines a set of integers $m_1^{(j)}$ for $j = 1,2,\ldots,N_0$. Replicas with $m_1^{(j)} = 0$ are terminated. Replicas with $m_1^{(j)} = 1$ are left unaffected. Replicas with $m_1^{(j)} = 2$ or $3$ go to the next diffusion step, and one or two additional replicas, respectively, are added to the system at the current position.

4) The number of replicas is counted, which determines $N_1$.

5) During the combined diffusion and birth-death processes, the distribution of replicas changes in such a way that the coordinates $x_1^{(j)}$ are now distributed according to the probability density $\psi(x,\Delta\tau)$.

6) As a result of the birth-death processes, the total number of replicas, $N_1$, now differs from $N_0$. One wants to keep the number of replicas almost constant during the calculation. Therefore, one uses a suitable choice of $E_R$ to compensate for the increase or decrease in the number of replicas. Note that for sufficiently small $\Delta\tau$, Eq. 330 can be approximated as $W(x) \approx 1 - (V(x) - E_R)\Delta\tau/\hbar$. Averaging over all replicas,

$$\langle W\rangle_1 \approx 1 - (\langle V\rangle_1 - E_R)\Delta\tau/\hbar \tag{337}$$

with $\langle V\rangle_1 = \frac{1}{N_1}\sum_{j=1}^{N_1} V(x_1^{(j)})$. One would like $\langle W\rangle_1$ to eventually be equal to unity. Therefore,

$$E_R^{(1)} = \langle V\rangle_1 \tag{338}$$

$E_R^{(2)}$ can be evaluated as (the proof is left for homework):

$$E_R^{(2)} = E_R^{(1)} + \frac{\hbar}{\Delta\tau}\left(1 - \frac{N_1}{N_0}\right) \tag{339}$$

The diffusive displacement, the birth-death processes, and the estimation of a new $E_R$ are repeated until $E_R$ and the distribution of replicas converge to stationary values. The distribution of replicas can then be interpreted as the ground-state wavefunction, and the ground-state energy can be calculated as $E_0 = \lim_{n\to\infty}\langle V\rangle_n$.
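The steps above can be sketched in a few lines of Python for a one-dimensional harmonic oscillator, $V(x) = x^2/2$, in units where $\hbar = m = 1$ (exact $E_0 = 1/2$). This is a minimal illustration under stated assumptions, not a production implementation: Eqs. 330 and 335 are not reproduced in this part of the notes, so the code assumes the commonly used forms $W = \exp[-(V - E_R)\Delta\tau/\hbar]$ and $m = \min(\mathrm{int}(W + u), 3)$. The function name `dmc_harmonic`, the parameter values, and the choice to use the fixed target population $N_0$ in the $E_R$ update at every step are likewise assumptions made for this sketch.

```python
import numpy as np

def dmc_harmonic(n0=500, dtau=0.05, n_steps=2000, seed=0):
    """Toy DMC for V(x) = x**2 / 2 with hbar = m = 1 (illustrative sketch)."""
    rng = np.random.default_rng(seed)

    def potential(x):
        return 0.5 * x**2

    x = np.zeros(n0)                    # step 1: all replicas start at x0 = 0
    e_ref = potential(x).mean()         # initial reference energy E_R
    v_history = []
    for _ in range(n_steps):
        # step 2 (Eq. 336): Gaussian diffusive displacement of all replicas
        x = x + np.sqrt(dtau) * rng.standard_normal(x.size)
        # step 3: weights W (assumed form of Eq. 330), integers m (assumed Eq. 335)
        w = np.exp(-(potential(x) - e_ref) * dtau)
        m = np.minimum((w + rng.random(x.size)).astype(int), 3)
        x = np.repeat(x, m)             # m = 0 death, m = 1 survival, m = 2, 3 birth
        # steps 4-6: count replicas and update E_R (Eqs. 338 and 339)
        v_mean = potential(x).mean()
        e_ref = v_mean + (1.0 - x.size / n0) / dtau
        v_history.append(v_mean)
    # estimate E0 as the average of <V>_n over the second half of the run
    return float(np.mean(v_history[n_steps // 2:])), x.size

e0, n_final = dmc_harmonic()
print(f"E0 estimate: {e0:.3f}, final number of replicas: {n_final}")
```

The estimate carries both a statistical error and a time-step error of order $\Delta\tau$, so it should only be expected to approach $1/2$ as $\Delta\tau \to 0$ and the run length grows.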

XIV. DENSITY-MATRIX APPROACHES

The quantum state of a single particle has thus far been described by a wavefunction $\Psi(x)$ in coordinate and spin space. In this section, we consider an alternative representation of the quantum state, called the density matrix. The density matrix was originally introduced in quantum statistical mechanics to describe a system whose state is incompletely specified. Although describing a quantum system with the density matrix is equivalent to using the wavefunction, density matrices have been shown to be more practical for certain time-dependent problems. The general $N$-th order density matrix is formally defined as

$$\gamma_N(x'_1 x'_2 \ldots x'_N, x_1 x_2 \ldots x_N) \equiv \Psi_N(x'_1, x'_2, \ldots, x'_N)\,\Psi_N^*(x_1, x_2, \ldots, x_N) \tag{340}$$

where $x_i = (r_i, s_i)$ denotes the spatial and spin coordinates of particle $i$. Note that the density matrix depends on two independent sets of variables, $\{x'_i\}$ and $\{x_i\}$, which together give $\gamma_N$ a numerical value.

Equivalently, Eq. 340 can be viewed as the coordinate representation of the density operator,

$$\hat\gamma_N = |\Psi_N\rangle\langle\Psi_N| \tag{341}$$

since

$$\langle x'_1 x'_2 \ldots x'_N|\hat\gamma_N|x_1 x_2 \ldots x_N\rangle = \langle x'_1 x'_2 \ldots x'_N|\Psi_N\rangle\langle\Psi_N|x_1 x_2 \ldots x_N\rangle$$

Note that $\hat\gamma_N$ can also be thought of as the projection operator onto the state $|\Psi_N\rangle$. For normalized $\Psi_N$ we then have

$$\mathrm{Tr}(\hat\gamma_N) = \int \Psi_N^*(x^N)\,\Psi_N(x^N)\,dx^N = 1$$

where $x^N$ stands for the set $\{x_i\}_{i=1}^N$. The trace of an operator $\hat A$ is defined as the sum of the diagonal elements of the matrix representing $\hat A$, or as the integral over the diagonal if the representation is continuous, as above. It can also be verified that

$$\langle\hat A\rangle = \mathrm{Tr}(\hat\gamma_N\hat A) = \mathrm{Tr}(\hat A\hat\gamma_N)$$

From this, the density operator $\hat\gamma_N$ can be seen to carry the same information as the $N$-electron wavefunction $|\Psi_N\rangle$. Note that while $|\Psi\rangle$ is defined only up to an arbitrary phase factor, $\hat\gamma_N$ for a given state is unique. $\hat\gamma_N$ is also positive semidefinite and Hermitian. The state of the system is said to be pure if it can be described by a wavefunction, and mixed if it cannot. A system in a mixed state can be characterized by a probability distribution over all accessible pure states. We can think of $\gamma_N(x'_1 \ldots x'_N, x_1 \ldots x_N)$ as an element of a matrix (the density matrix); if we set $x_i = x'_i$ for all $i$, we get the diagonal elements of the density matrix,

$$\gamma_N(x_1 x_2 \ldots x_N) \equiv \Psi_N^*(x_1, x_2, \ldots, x_N)\,\Psi_N(x_1, x_2, \ldots, x_N) = |\Psi_N(x_1, x_2, \ldots, x_N)|^2$$

which is the diagonal of the $N$-th order density matrix for a pure state. Note that this is also the probability distribution associated with a solution of the Schrödinger equation. We can express the Schrödinger equation in the density-matrix formalism by taking the time derivative of the density operator and using the Hermiticity of $\hat H$:

$$\begin{aligned}
\frac{\partial}{\partial t}\hat\gamma_N &= \left(\frac{\partial}{\partial t}|\Psi_N\rangle\right)\langle\Psi_N| + |\Psi_N\rangle\left(\frac{\partial}{\partial t}\langle\Psi_N|\right) \\
\frac{\partial}{\partial t}\hat\gamma_N &= \left(\frac{\hat H}{i\hbar}|\Psi_N\rangle\right)\langle\Psi_N| - |\Psi_N\rangle\left(\langle\Psi_N|\frac{\hat H}{i\hbar}\right) \\
i\hbar\frac{\partial}{\partial t}\hat\gamma_N &= \left[\hat H, \hat\gamma_N\right]
\end{aligned} \tag{342}$$

This equation describes how the density operator evolves in time. We can generalize the density operator $\hat\gamma_N$ to the ensemble density operator

$$\hat\Gamma = \sum_i p_i |\Psi_i\rangle\langle\Psi_i| \tag{343}$$

where $p_i$ is the probability of the system being found in the state $|\Psi_i\rangle$, and the sum is over the complete set of all accessible pure states. Since $p_i$ is a probability, it has the following properties:

$$p_i \ge 0, \qquad \sum_i p_i = 1$$
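The properties of pure-state and ensemble density operators stated above can be checked numerically in a small finite-dimensional model. The following is a minimal sketch; the three-dimensional Hilbert space, the states `psi1` and `psi2`, the observable `a_op`, and the weights `p` are arbitrary choices made only for illustration.

```python
import numpy as np

def density_operator(psi):
    """Matrix of |psi><psi| for a normalized state vector psi."""
    return np.outer(psi, psi.conj())

# Two orthonormal pure states in a 3-dimensional toy Hilbert space
psi1 = np.array([1.0, 1.0j, 0.0]) / np.sqrt(2.0)
psi2 = np.array([0.0, 0.0, 1.0])

# A Hermitian observable A
a_op = np.array([[1.0, 2.0, 0.0],
                 [2.0, 0.0, 1.0],
                 [0.0, 1.0, 3.0]])

gamma = density_operator(psi1)
trace_pure = np.trace(gamma).real                       # Tr(gamma) = 1
expect_wavefunction = (psi1.conj() @ a_op @ psi1).real  # <psi|A|psi>
expect_trace = np.trace(gamma @ a_op).real              # Tr(gamma A)

# gamma does not depend on the arbitrary phase of psi
phase_invariant = np.allclose(density_operator(np.exp(0.7j) * psi1), gamma)

# A pure-state gamma is a projector; an ensemble operator (Eq. 343) is not
is_projector_pure = np.allclose(gamma @ gamma, gamma)
p = np.array([0.25, 0.75])
big_gamma = p[0] * density_operator(psi1) + p[1] * density_operator(psi2)
is_projector_mixed = np.allclose(big_gamma @ big_gamma, big_gamma)

print(trace_pure, expect_wavefunction, expect_trace)
print(phase_invariant, is_projector_pure, is_projector_mixed)
```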

We can then rewrite Eq. 342 in terms of the ensemble density operator to obtain

$$i\hbar\frac{\partial}{\partial t}\hat\Gamma = \left[\hat H, \hat\Gamma\right] \tag{344}$$

which is true if $\hat\Gamma$ only involves states with the same number of particles, as is the case in the canonical ensemble. This equation is also known as the von Neumann equation, the quantum mechanical analog of the Liouville equation. For stationary states, $\hat\Gamma$ is independent of time, which means that

$$\left[\hat H, \hat\Gamma\right] = 0$$

which implies that $\hat H$ and $\hat\Gamma$ can be chosen to share a common set of eigenvectors. Work in statistical mechanics deals heavily with systems at thermal equilibrium, where the density matrix is characterized by thermally distributed populations in the quantum states,

$$\hat\rho = \frac{e^{-\beta\hat H}}{Z}$$

where $\beta = 1/k_B T$, $k_B$ is the Boltzmann constant, and $Z$ is the partition function

$$Z = \mathrm{Tr}(e^{-\beta\hat H})$$

In this language, one can express a thermally averaged expectation value as

$$\langle\hat\Omega\rangle = \mathrm{Tr}(\hat\Omega\hat\rho) = \frac{\mathrm{Tr}(\hat\Omega\, e^{-\beta\hat H})}{Z}$$

With a mixed state, we have less than perfect knowledge of what the quantum state is. We can quantify how much information is missing by defining the entropy as

$$S = -k_B\,\mathrm{Tr}[\hat\rho\ln\hat\rho]$$
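As a small numerical illustration of the thermal density matrix, the sketch below builds $\hat\rho = e^{-\beta\hat H}/Z$ for a two-level model and checks that it is stationary and has unit trace. The Hamiltonian matrix `h`, the value of `beta`, and the convention $k_B = 1$ are assumptions chosen only for this example.

```python
import numpy as np

def thermal_state(h, beta):
    """Return rho = exp(-beta H)/Z and the populations of the eigenstates of H."""
    energies, vectors = np.linalg.eigh(h)
    # shift by the lowest energy for numerical stability; Z cancels the shift
    boltzmann = np.exp(-beta * (energies - energies.min()))
    populations = boltzmann / boltzmann.sum()
    rho = (vectors * populations) @ vectors.conj().T   # V diag(p) V^dagger
    return rho, populations

h = np.array([[0.0, 0.5],
              [0.5, 1.0]])      # toy two-level Hamiltonian
beta = 2.0                      # 1/(k_B T) with k_B = 1

rho, populations = thermal_state(h, beta)
commutator = h @ rho - rho @ h                        # vanishes for a stationary state
energy = np.trace(h @ rho).real                       # <H> = Tr(H rho)
entropy = -np.sum(populations * np.log(populations))  # S/k_B = -Tr[rho ln rho]

print(np.trace(rho).real, energy, entropy)
```

Because $\hat\rho$ is diagonal in the eigenbasis of $\hat H$, the entropy reduces to $-k_B\sum_i p_i\ln p_i$ over the thermal populations, which is what the last line of the computation evaluates.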

The basic Hamiltonian operator, Eq. 19, is a sum of two symmetric one-electron operators and a symmetric two-electron operator, none of which depends on spin. Combined with the fact that the wavefunctions $\Psi_N$ are antisymmetric, this allows expectation values to be systematically simplified by integrating the density matrix over $N-2$ of its variables, giving rise to the concepts of reduced density matrices and spinless density matrices.

A. Reduced density matrices

The reduced density matrix of order $p$ is defined as

$$\gamma_p(x'_1 x'_2 \ldots x'_p, x_1 x_2 \ldots x_p) = \binom{N}{p}\int\cdots\int \gamma_N(x'_1 x'_2 \ldots x'_p x_{p+1} \ldots x_N, x_1 x_2 \ldots x_p x_{p+1} \ldots x_N)\,dx_{p+1}\ldots dx_N \tag{345}$$

where $\binom{N}{p}$ is a binomial coefficient and $\gamma_N$ is defined in Eq. 340. This operation is also known as taking the partial trace of the density matrix. For example, the first-order density matrix $\gamma_1$ is defined as

$$\gamma_1(x'_1, x_1) = N\int\cdots\int \Psi(x'_1 x_2 \ldots x_N)\,\Psi^*(x_1 x_2 \ldots x_N)\,dx_2\ldots dx_N \tag{346}$$

and normalizes to

$$\mathrm{Tr}\,\gamma_1(x'_1, x_1) = \int\gamma_1(x_1, x_1)\,dx_1 = N$$

Similarly, the second-order density matrix $\gamma_2$ is defined as

$$\gamma_2(x'_1 x'_2, x_1 x_2) = \frac{N(N-1)}{2}\int\cdots\int \Psi(x'_1 x'_2 x_3 \ldots x_N)\,\Psi^*(x_1 x_2 x_3 \ldots x_N)\,dx_3\ldots dx_N \tag{347}$$

and normalizes to the number of electron pairs,

$$\mathrm{Tr}\,\gamma_2(x'_1 x'_2, x_1 x_2) = \int\!\!\int\gamma_2(x_1 x_2, x_1 x_2)\,dx_1 dx_2 = \frac{N(N-1)}{2}$$

The reduced density matrices $\gamma_1$ and $\gamma_2$ just defined are coordinate-space representations of operators $\hat\gamma_1$ and $\hat\gamma_2$, acting on the one- and two-particle Hilbert spaces, respectively. We can express the one-particle operator in terms of its eigenvalues and eigenvectors,

$$\hat\gamma_1 = \sum_i n_i |\psi_i\rangle\langle\psi_i|$$

where the eigenvalues $n_i$ are the occupation numbers and the eigenvectors $|\psi_i\rangle$ are the natural spinorbitals. Similarly, the two-particle operator can be expressed as

$$\hat\gamma_2 = \sum_i g_i |\theta_i\rangle\langle\theta_i|$$

where the eigenvalues $g_i$ are the occupation numbers and the eigenvectors $|\theta_i\rangle$ are called natural geminals. It also follows that $n_i \ge 0$ and $g_i \ge 0$. Comparing these two operators with Eq. 343, we can see that $n_i$ is proportional to the probability of the one-electron state $|\psi_i\rangle$ being occupied and $g_i$ is proportional to the probability of the two-electron state $|\theta_i\rangle$ being occupied.
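To make the natural spinorbitals and occupation numbers concrete, consider a two-electron wavefunction expanded in $K$ orthonormal spinorbitals, $\Psi(x_1, x_2) = \sum_{ij} C_{ij}\phi_i(x_1)\phi_j(x_2)$ with $C_{ij} = -C_{ji}$ and $\sum_{ij}|C_{ij}|^2 = 1$. Inserting this expansion into the definition of $\gamma_1$ gives the matrix $2\,CC^\dagger$ in the spinorbital basis. The sketch below diagonalizes it; the basis size `k`, the random coefficients, and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
k = 4                                  # number of orthonormal spinorbitals (toy value)

# Random antisymmetric, normalized coefficient matrix C_ij = -C_ji
a = rng.standard_normal((k, k))
c = a - a.T
c /= np.linalg.norm(c)                 # Frobenius norm: sum_ij |C_ij|^2 = 1

# First-order reduced density matrix for N = 2: gamma1 = 2 C C^dagger
gamma1 = 2.0 * c @ c.conj().T
occupations, natural_orbitals = np.linalg.eigh(gamma1)

# For a single 2x2 Slater determinant the occupations are exactly 0 or 1
c_det = np.zeros((k, k))
c_det[0, 1], c_det[1, 0] = 1.0 / np.sqrt(2.0), -1.0 / np.sqrt(2.0)
occupations_det = np.linalg.eigvalsh(2.0 * c_det @ c_det.T)

print("trace:", np.trace(gamma1))
print("occupations (correlated):", occupations)
print("occupations (determinant):", occupations_det)
```

The trace equals $N = 2$, and for the correlated state the occupation numbers are fractional, while for the single determinant they are 0 or 1.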

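Now let us consider the expectation values of one- and two-electron operators with an antisymmetric $N$-body wavefunction $\Psi$. For a one-electron operator

$$\hat O_1 = \sum_{i=1}^{N} O_1(x_i, x'_i)$$

we have

$$\langle\hat O_1\rangle = \mathrm{Tr}(\hat O_1\hat\gamma_N) = \int O_1(x_1, x'_1)\,\gamma_1(x'_1, x_1)\,dx_1 dx'_1 \tag{348}$$

If the one-electron operator is local, i.e. $O_1(r', r) = O_1(r)\delta(r' - r)$, we conventionally write down only the diagonal part; thus

$$\langle\hat O_1\rangle = \mathrm{Tr}(\hat O_1\hat\gamma_N) = \int\left[O_1(x_1)\,\gamma_1(x'_1, x_1)\right]_{x'_1 = x_1} dx_1$$

Similarly, a local two-electron operator has the form

$$\hat O_2 = \sum_{i<j}^{N} O_2(x_i, x_j)$$

with the corresponding expectation value

$$\langle\hat O_2\rangle = \mathrm{Tr}(\hat O_2\hat\gamma_N) = \int\!\!\int\left[O_2(x_1, x_2)\,\gamma_2(x'_1 x'_2, x_1 x_2)\right]_{x'_1 = x_1,\, x'_2 = x_2} dx_1 dx_2$$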

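We thus obtain for the expectation value of the Hamiltonian, Eq. 19, in terms of density matrices

$$E = \mathrm{Tr}(\hat H\hat\gamma_N) = E[\gamma_1, \gamma_2] = \int\left[\left(-\tfrac{1}{2}\nabla_1^2 + v(r_1)\right)\gamma_1(x'_1, x_1)\right]_{x'_1 = x_1} dx_1 + \int\!\!\int\frac{1}{|r_1 - r_2|}\,\gamma_2(x_1 x_2, x_1 x_2)\,dx_1 dx_2 \tag{349}$$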

We can further simplify this result by integrating over the spin variables.

B. Spinless density matrices

The first-order and second-order spinless density matrices are defined by

$$\rho_1(r'_1, r_1) = \int\gamma_1(r'_1 s_1, r_1 s_1)\,ds_1 = N\int\cdots\int\Psi(r'_1 s_1 x_2 \ldots x_N)\,\Psi^*(r_1 s_1 x_2 \ldots x_N)\,ds_1 dx_2\ldots dx_N \tag{350}$$

and

$$\rho_2(r'_1 r'_2, r_1 r_2) = \int\!\!\int\gamma_2(r'_1 s_1 r'_2 s_2, r_1 s_1 r_2 s_2)\,ds_1 ds_2 = \frac{N(N-1)}{2}\int\cdots\int\Psi(r'_1 s_1 r'_2 s_2 x_3 \ldots x_N)\,\Psi^*(r_1 s_1 r_2 s_2 x_3 \ldots x_N)\,ds_1 ds_2 dx_3\ldots dx_N \tag{351}$$

We can introduce a shorthand notation for the diagonal elements of $\rho_1$,

$$\rho_1(r_1) = \rho_1(r_1, r_1) = N\int\cdots\int|\Psi|^2\,ds_1 dx_2\ldots dx_N$$

and similarly for $\rho_2$,

$$\rho_2(r_1, r_2) = \rho_2(r_1 r_2, r_1 r_2) = \frac{N(N-1)}{2}\int\cdots\int|\Psi|^2\,ds_1 ds_2 dx_3\ldots dx_N$$

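Also note that, from the above definitions, we can express the first-order density matrix in terms of the second-order density matrix,

$$\rho_1(r'_1, r_1) = \frac{2}{N-1}\int\rho_2(r'_1 r_2, r_1 r_2)\,dr_2$$

$$\rho(r_1) = \frac{2}{N-1}\int\rho_2(r_1, r_2)\,dr_2$$

where $\rho(r_1) \equiv \rho_1(r_1)$ is the electron density.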

The expectation value of the Hamiltonian, Eq. 349, in terms of spinless density matrices now becomes

$$E = E[\rho_1(r'_1, r_1), \rho_2(r_1, r_2)] = \int\left[-\tfrac{1}{2}\nabla^2\rho_1(r', r)\right]_{r' = r} dr + \int v(r)\rho(r)\,dr + \int\!\!\int\frac{1}{|r_1 - r_2|}\,\rho_2(r_1, r_2)\,dr_1 dr_2 \tag{352}$$

where the three terms represent the electronic kinetic energy, the nuclear-electron potential energy, and the electron-electron potential energy, respectively. Note that since we can express the first-order density matrix in terms of the second-order one, only the second-order density matrix is needed for the expectation value of the Hamiltonian.

C. N-representability

From Eq. 349, one may hope to minimize the energy with respect to the density matrices, thus avoiding having to work with the $4N$-dimensional wavefunction. Since only the second-order density matrix is needed for the energy minimization, the trial $\gamma_2$ must correspond to some antisymmetric wavefunction $\Psi$; i.e., for any guessed second-order density matrix $\gamma_2$ there must be a $\Psi$ from which it comes via its definition, Eq. 347. This is the $N$-representability problem for the second-order density matrix.

For a trial reduced density matrix to be $N$-representable, it must correspond to some antisymmetric wavefunction from which it comes via Eq. 345. It is a difficult task to obtain the necessary and sufficient conditions for a reduced density matrix $\gamma_2$ to be derivable from an antisymmetric wavefunction $\Psi$. Instead, it may be easier to solve the ensemble $N$-representability problem for $\Gamma_2$, where $\Gamma_p$ is the $p$-th order mixed-state (ensemble) density matrix defined as

$$\Gamma_p(x'_1 x'_2 \ldots x'_p, x_1 x_2 \ldots x_p) = \binom{N}{p}\int\cdots\int\Gamma_N(x'_1 x'_2 \ldots x'_p x_{p+1} \ldots x_N, x_1 x_2 \ldots x_p x_{p+1} \ldots x_N)\,dx_{p+1}\ldots dx_N \tag{353}$$

Since

$$E_0 = \mathrm{Tr}(\hat H\hat\Gamma_N^0) \le \mathrm{Tr}(\hat H\hat\Gamma_N)$$

it is completely legitimate to enlarge the class of trial density operators for an $N$-electron problem from the pure-state set to the set of positive unit-trace density operators made up from $N$-electron states. This minimization leads to the $N$-electron ground-state energy and to the ground-state $\gamma_N$ if the ground state is not degenerate, or to an arbitrary convex sum of the density operators of all degenerate ground states if it is degenerate. Thus the minimization in Eq. 352 can be done over ensemble $N$-representable $\Gamma_2$. For a given $\Gamma_1$,

$$\hat\Gamma_1 = \sum_i n_i |\psi_i\rangle\langle\psi_i|$$

the necessary and sufficient conditions for it to be ensemble $N$-representable are that

$$0 \le n_i \le 1 \tag{354}$$

for all of the eigenvalues of $\Gamma_1$. This conforms nicely with the Pauli exclusion principle.

Let us now prove this theorem on the necessary and sufficient conditions for the first-order density matrix to be $N$-representable. The necessary conditions for $\Gamma_1$ and $\Gamma_2$ are those that must hold if they satisfy Eq. 353 for a proper $\Gamma_N$. The sufficient conditions are those that guarantee the existence of a $\Gamma_N$ that reduces to $\Gamma_1$ or $\Gamma_2$. The set of $\Gamma_1$ or $\Gamma_2$ that simultaneously satisfies both necessary and sufficient conditions is called the set of $N$-representable $\Gamma_1$ or $\Gamma_2$. If the energy is minimized over sets of $\Gamma_1$ and $\Gamma_2$ satisfying only necessary conditions, an energy lower than the true energy can be obtained (lower bound). If it is minimized over sets satisfying only sufficient conditions, an energy higher than the true energy is obtained (upper bound). If one minimizes the energy over all sets satisfying both the necessary and the sufficient conditions, the ground-state energy is obtained.

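The necessary conditions on $\gamma_1$ and $\gamma_2$ imposed by $N$-representability are also called Pauli conditions and are as follows. If $|\psi_i\rangle$ is some normalized spinorbital state and $|\psi_i\psi_j\rangle$ is a normalized $2\times 2$ Slater determinant built from orthonormal $\psi_i$ and $\psi_j$, then

$$0 \le \langle\psi_i|\hat\gamma_1|\psi_i\rangle \le 1 \tag{355}$$

$$0 \le \langle\psi_i\psi_j|\hat\gamma_2|\psi_i\psi_j\rangle \le 1 \tag{356}$$

In the coordinate representation, they can be written as

$$0 \le \int\!\!\int dx_1 dx'_1\,\psi_i^*(x'_1)\,\gamma_1(x'_1, x_1)\,\psi_i(x_1) \le 1 \tag{357}$$

$$0 \le \frac{1}{2}\int\!\!\int\!\!\int\!\!\int dx_1 dx'_1 dx_2 dx'_2\left[\,|\psi_i^*(x'_1)\psi_j^*(x'_2)|\;\gamma_2(x'_1 x'_2, x_1 x_2)\;|\psi_i(x_1)\psi_j(x_2)|\,\right] \le 1 \tag{358}$$

where $|\psi_i(x_1)\psi_j(x_2)|$ denotes the unnormalized $2\times 2$ determinant $\psi_i(x_1)\psi_j(x_2) - \psi_j(x_1)\psi_i(x_2)$.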

Eq. 355 is equivalent to the requirement that the eigenvalues of $\gamma_1$ satisfy Eq. 354, whereas Eq. 356 is not equivalent to a condition on the eigenvalues of $\gamma_2$, since the eigenfunctions of $\gamma_2$ are in general not $2\times 2$ Slater determinants. Let us prove this relation for $\gamma_1$. We start by introducing the field creation and annihilation operators, which create and annihilate a particle at the point $x$ in coordinate space:

$$\hat\psi^\dagger(x) = \sum_i\psi_i^*(x)\,a_i^\dagger, \qquad \hat\psi(x) = \sum_i\psi_i(x)\,a_i$$

where

$$a_i^\dagger = \int dx\,\psi_i(x)\,\hat\psi^\dagger(x), \qquad a_i = \int dx\,\psi_i^*(x)\,\hat\psi(x)$$

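Let us consider an arbitrary single-particle operator $\hat A(i)$ and perform a spectral decomposition,

$$\hat A(i) = \sum_{\alpha\beta}|\alpha\rangle\langle\alpha|\hat A|\beta\rangle\langle\beta| = \sum_{\alpha\beta}A_{\alpha\beta}|\alpha\rangle\langle\beta|$$

The simplest $N$-particle operator can be constructed as

$$\hat A_N = \hat A(1) + \hat A(2) + \cdots + \hat A(N)$$

where $\hat A(i)$ acts on the states of the $i$th particle,

$$\hat A(i)|\alpha_1\cdots\alpha_i\cdots\alpha_N\rangle = \sum_{\beta_i}\langle\beta_i|\hat A|\alpha_i\rangle|\alpha_1\cdots\beta_i\cdots\alpha_N\rangle = \sum_{\beta_i}A_{\beta_i\alpha_i}|\alpha_1\cdots\beta_i\cdots\alpha_N\rangle$$

giving

$$\hat A_N|\alpha_1\cdots\alpha_i\cdots\alpha_N\rangle = \sum_{i=1}^{N}\sum_{\beta_i}A_{\beta_i\alpha_i}|\alpha_1\cdots\beta_i\cdots\alpha_N\rangle$$

We can instead write $\hat A$ in terms of creation and annihilation operators,

$$\hat A = \sum_{\alpha\beta}A_{\alpha\beta}\,a_\alpha^\dagger a_\beta = \int dx_1 dx'_1\,A(x_1, x'_1)\,\hat\psi^\dagger(x_1)\hat\psi(x'_1)$$

The expectation value of $\hat A$ is then obtained from

$$\langle\hat A\rangle = \langle\Psi|\hat A|\Psi\rangle = \int dx_1 dx'_1\,A(x_1, x'_1)\,\langle\Psi|\hat\psi^\dagger(x_1)\hat\psi(x'_1)|\Psi\rangle$$

If we now compare this equation to Eq. 348 and let $\hat A = \hat O_1$, we find that the first-order density matrix can be written as

$$\gamma_1(x'_1, x_1) = \langle\Psi|\hat\psi^\dagger(x_1)\hat\psi(x'_1)|\Psi\rangle$$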

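We can now rewrite Eq. 357 as

$$0 \le \int\!\!\int dx_1 dx'_1\,\psi_i^*(x'_1)\,\langle\Psi|\hat\psi^\dagger(x_1)\hat\psi(x'_1)|\Psi\rangle\,\psi_i(x_1) \le 1$$

$$0 \le \int\!\!\int dx_1 dx'_1\,\psi_i^*(x'_1)\,\langle\Psi|\sum_l\psi_l^*(x_1)a_l^\dagger\sum_j\psi_j(x'_1)a_j|\Psi\rangle\,\psi_i(x_1) \le 1$$

By orthonormality of the spinorbitals, the integrals over $x_1$ and $x'_1$ select $l = i$ and $j = i$, so this reduces to

$$0 \le \langle\Psi|a_i^\dagger a_i|\Psi\rangle \le 1$$

where $a_i^\dagger a_i = \hat N_i$ is just the number operator, which gives the occupation number of the $i$th spinorbital. Since $\hat N_i$ is a projection operator, its expectation value is always nonnegative and less than or equal to 1. Hence the Pauli condition for the first-order density matrix, Eq. 355, is satisfied.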

The sufficient conditions require that there exist a $\Gamma_N$ that reduces to $\Gamma_1$ and $\Gamma_2$. The proof for $\Gamma_1$ is as follows. First we need a simple lemma about vectors and convex sets. A set is convex if an arbitrary positively weighted average of any two elements in the set also belongs to the set. Next we define an extreme element of a convex set as an element $E$ such that $E = p_1 Y_1 + p_2 Y_2$ implies that $Y_1$ and $Y_2$ are both multiples of $E$. The lemma then states that the set $L$ of vectors $v = (v_1, v_2, \ldots)$ in a space of arbitrary but fixed dimension with $0 \le v_i \le 1$ and $\sum_i v_i = N$ is convex, and that its extreme elements are the vectors with $N$ components equal to 1 and all other components equal to 0. Given this lemma, it is clear that any $\gamma_1$ or $\Gamma_1$ satisfying Eq. 354 is an element of a convex set whose extreme elements are those $\gamma_1^0$ or $\Gamma_1^0$ that have $N$ eigenvalues equal to 1 and the rest equal to 0. Each of these $\gamma_1^0$ or $\Gamma_1^0$ determines, up to a phase, a determinantal $N$-electron wavefunction and a unique corresponding pure-state density operator $\gamma_N^0$. Some positively weighted sum of these $\gamma_N^0$ will be the $\Gamma_N$ that reduces to the given $\gamma_1$ or $\Gamma_1$ through Eq. 353. Hence, sufficiency is proved.

D. Density matrix functional theory

We have studied DFT in depth in an earlier section, where we started from a variational principle that had the electron density $\rho(r)$ as the basic variable. Using the concept of density matrices, we can construct a corresponding density-matrix-functional theory (DMFT), in which the basic variable is the first-order reduced density matrix $\gamma_1(x', x)$ or the first-order spinless density matrix $\rho_1(r', r)$. The main advantage of DMFT over DFT is that the kinetic energy is known as an explicit functional of the density matrix, so there is no need to introduce an auxiliary system. The unknown exchange-correlation part only has to describe the electron-electron interactions, whereas in DFT the kinetic energy part was also included. As we know, the first-order density matrix determines the density,

$$\rho(r) = \rho_1(r, r) = \int\gamma_1(x', x)\Big|_{x = x'}\,ds$$

and by the Hohenberg-Kohn theorem, it determines all properties including the energy,

$$E = E[\gamma_1] = E[\rho_1]$$

The explicit form of the functional is given via a constrained search, where the search can be over all trial density operators,

$$E[\gamma_1] = \min_{\gamma_N\to\gamma_1}\mathrm{Tr}(\hat\gamma_N\hat H)$$

The variational principle for DMFT can be written as

$$\delta\{E[\gamma_1] - \mu N[\gamma_1]\} = 0 \tag{359}$$

where $\mu$ is the chemical potential. This variational principle stands for

$$E_0 = \min_{\gamma_1}E[\gamma_1] \tag{360}$$

If we parametrize $\gamma_1$ in terms of natural spinorbitals $\psi_i$ and occupation numbers $n_i$, we obtain

$$\mu = \left(\frac{\partial E}{\partial n_i}\right)_{\psi_i,\,n_{j\ne i}} \quad\text{for all } i$$

assuming that

$$0 < n_i < 1 \tag{361}$$

If there are any natural orbitals among the complete set of natural orbitals for which Eq. 361 is not true, then the above assertion for $\mu$ does not hold for them; such an orbital with $n_i = 1$ has $\partial E/\partial n_i \le \mu$. It is a conjecture that Eq. 361 holds for all orbitals in an atom or molecule.

If the variation in Eq. 359 is carried out with respect to the natural orbitals themselves, with orthonormalization conditions imposed, the result is a set of coupled differential equations for the natural orbitals, which are very different from the KS equations derived previously.

According to Eq. 349, the first-order density matrix determines all components of the energy explicitly except for $V_{ee}[\gamma_1]$. The constrained search in Eq. 360 may therefore be restricted to searching for the $\gamma_N$ that minimizes $V_{ee}$,

$$V_{ee}[\gamma_1] = \min_{\gamma_N\to\gamma_1}\mathrm{Tr}(\hat\gamma_N\hat V_{ee})$$

or, since $\gamma_N\to\gamma_2\to\gamma_1$, we could write

$$V_{ee}[\gamma_1] = \min_{\gamma_2\to\gamma_1}\int\!\!\int\frac{1}{r_{12}}\,\gamma_2(x_1 x_2, x_1 x_2)\,dx_1 dx_2$$

where the trial $\gamma_2$ would have to satisfy the $N$-representability conditions outlined in the previous section.

E. Contracted Schrödinger equation

The many-body Hamiltonian operator is a sum of one- and two-electron operators, which is why the energy of an $N$-electron system can be expressed as a functional of the 2-RDM, which depends on the variables of only two electrons. This gives us a method of studying the structure of electronic systems by determining the 2-RDM instead of the $N$-electron wavefunction. The main question is whether the Schrödinger equation can be mapped onto the two-electron space and, if so, what the properties of the resulting equations are. There have been two approaches to answering this question: one is to integrate the Schrödinger equation obtained in first quantization, resulting in what is called the density equation; the other is to apply a contracting mapping to the matrix representation of the Schrödinger equation, obtaining what is known as the contracted Schrödinger equation (CSE). It was found that although the two equations look very different, they are in fact equivalent. In this section, we focus on the contracted Schrödinger equation. We begin with an introduction to the notation.

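The many-body Hamiltonian can be written as

$$\hat H = \sum_{ij}h_{ij}\,a_i^\dagger a_j + \frac{1}{2}\sum_{ijkl}\langle ij|kl\rangle\,a_i^\dagger a_j^\dagger a_l a_k$$

where the one-electron basis is assumed to be finite and formed by $2K$ orthogonal spinorbitals. We can rewrite it as

$$\hat H = \frac{1}{2}\sum_{ijkl}K_{ijkl}\,a_i^\dagger a_j^\dagger a_l a_k \tag{362}$$

where the elements of the two-particle reduced Hamiltonian matrix are given by

$$K_{ijkl} = \frac{1}{N-1}\left(h_{ik}\delta_{jl} + h_{jl}\delta_{ik}\right) + \langle ij|kl\rangle$$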

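The reduced Hamiltonian matrix has the same symmetry properties as the two-electron integral matrix.

In the second-quantization formalism, the $p$-th order reduced density matrix ($p$-RDM) can be written as

$${}^pD^{\Psi\Psi'}_{i_1,i_2,\ldots,i_p;\,j_1,j_2,\ldots,j_p} = \frac{1}{p!}\langle\Psi|a_{i_1}^\dagger a_{i_2}^\dagger\cdots a_{i_p}^\dagger a_{j_p}\cdots a_{j_2}a_{j_1}|\Psi'\rangle \tag{363}$$

When $\Psi \ne \Psi'$, this expression defines an element of the $p$-th order transition reduced density matrix ($p$-TRDM). We will work with the case of pure states with $\Psi = \Psi'$,

$${}^pD^{\Psi}_{i_1,i_2,\ldots,i_p;\,j_1,j_2,\ldots,j_p} = \frac{1}{p!}\langle\Psi|a_{i_1}^\dagger a_{i_2}^\dagger\cdots a_{i_p}^\dagger a_{j_p}\cdots a_{j_2}a_{j_1}|\Psi\rangle \tag{364}$$

The complementary matrix to the $p$-RDM is the $p$-th order holes reduced density matrix ($p$-HRDM),

$${}^p\bar D^{\Psi}_{i_1,i_2,\ldots,i_p;\,j_1,j_2,\ldots,j_p} = \frac{1}{p!}\langle\Psi|a_{j_p}\cdots a_{j_2}a_{j_1}a_{i_1}^\dagger a_{i_2}^\dagger\cdots a_{i_p}^\dagger|\Psi\rangle \tag{365}$$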

where "hole" implies that $\Psi$ itself is the reference state.

This second-quantization formalism for the density matrix is equivalent to the expressions in the previous sections. The equivalence is shown using the field creation and annihilation operators formulated in Section C. Integrating the $N$-electron density matrix,

$${}^ND_{1,2\ldots N;\,1',2'\ldots N'} = \Psi(1,2\ldots N)\,\Psi^*(1',2'\ldots N')$$

over coordinates $(p+1)$ to $N$ defines the $p$-RDM,

$${}^pD^{\Psi}_{1,2\ldots p;\,1'2'\ldots p'} = \binom{N}{p}\int\Psi(1,2\ldots p,p+1\ldots N)\,\Psi^*(1',2'\ldots p',p+1\ldots N)\,d(p+1)\ldots dN$$

where the binomial prefactor is the normalization of Eq. 345 and we have changed notation for convenience, $(i_1, i_2\ldots i_p, j_1, j_2\ldots j_p)\to(1,2\ldots p,1'2'\ldots p')$.

Let us begin with a quantum system of $N$ fermions characterized by the Schrödinger equation (SE)

$$\hat H|\psi_n\rangle = E_n|\psi_n\rangle$$

where the wavefunction $\psi_n$ depends on the coordinates of the $N$ particles. We will use second quantization to derive the contracted Schrödinger equation, emphasizing the use of test functions for contracting the SE onto a lower particle space. Nakatsuji's theorem tells us that there is a one-to-one mapping between the $N$-representable RDM solutions of the CSE and the wavefunction solutions of the SE. The proof will be covered as homework.

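Because the $N$-particle Hamiltonian contains at most two-electron excitations, the expectation value of $\hat H$ yields the energy

$$E = \frac{1}{2}\sum_{ijkl}K_{ijkl}\langle\psi|a_i^\dagger a_j^\dagger a_l a_k|\psi\rangle = \sum_{ijkl}K_{ijkl}\,{}^2D^{\psi}_{ij,kl}$$

where

$${}^2D^{\psi}_{ij,kl} = \frac{1}{2}\langle\psi|a_i^\dagger a_j^\dagger a_l a_k|\psi\rangle$$

Next, define functions to test the two-electron space,

$$\langle\Phi^{ij}_{kl}| = \langle\psi|a_i^\dagger a_j^\dagger a_l a_k$$

If we take the inner product of the test functions with the SE, we obtain

$$\langle\psi|a_i^\dagger a_j^\dagger a_l a_k\hat H|\psi\rangle = E\langle\psi|a_i^\dagger a_j^\dagger a_l a_k|\psi\rangle = 2E\,{}^2D_{ij,kl}$$

Substituting the Hamiltonian, Eq. 362, into the above, we obtain

$$\frac{1}{2}\sum_{pqst}K_{pqst}\langle\psi|a_i^\dagger a_j^\dagger a_l a_k\,a_p^\dagger a_q^\dagger a_t a_s|\psi\rangle = 2E\,{}^2D_{ij,kl}$$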

Rearranging creation and annihilation operators to produce RDMs, we generate the 2,4-CSE

$$\left(\underline{K}\;\underline{{}^2D^{\Psi}}\right)_{ij,kl} + 3\sum_{pqt}\left(K_{pq,it}\,{}^3D^{\Psi}_{pqj,ktl} + K_{pq,jt}\,{}^3D^{\Psi}_{pqi,ltk}\right) + 6\sum_{pqst}K_{pq,st}\,{}^4D^{\Psi}_{pqij,stkl} = E\,{}^2D^{\Psi}_{ij,kl}$$

where the first term comes from considering two pairs of indices being the same (this removes two pairs of operators, leaving us with the 2-RDM). The second term comes from considering one pair of indices being the same (this removes one pair of operators, leaving us with the two 3-RDM terms). The factor of 3 in front comes from the different ways the operators can be rearranged. The last term comes from considering no indices being the same (this allows us to commute the operators so that all creation operators are on the left and all annihilation operators are on the right). The factor of 6 comes from the possible arrangements of the indices. The underlines in the first term indicate matrices.

We can see that this equation depends on the 3-RDM and 4-RDM, which makes it indeterminate. If we knew how to build higher-order RDMs from the 2-RDM, this equation would allow us to solve iteratively for the 2-RDM. This class of problems is called the reconstruction problem, and two approaches have been explored. One approach is to explicitly represent the 3-RDM and 4-RDM as functionals of the 2-RDM; the other is to construct a family of 4-RDMs from the 2-RDM by imposing ensemble representability conditions.

Besides this difficulty, we also have to ask whether the solutions of this equation coincide with those of the SE. The derivation of the 2-CSE shows that the SE implies the 2-CSE, but does the converse hold true? The answer is yes; the $p$-CSE for $p \ge 2$ is equivalent to the SE, as stated in the following theorem.

Theorem [Nakatsuji]: If the RDMs are $N$-representable, then the $p$-CSE is satisfied by the $p$-, $(p+1)$-, and $(p+2)$-RDM if and only if the $N$-DM satisfies the Schrödinger equation.

The proof of this theorem will be a homework problem; here we only outline it. First, the SE is satisfied if and only if the following dispersion relation is satisfied:

$$\langle\Psi|\hat H^2|\Psi\rangle - \langle\Psi|\hat H|\Psi\rangle^2 = 0$$

Therefore, we must prove that the 2-CSE in second-quantized form implies the dispersion relation. Since for $p > 2$ the $p$-CSE implies the 2-CSE, the demonstration is also valid for the higher-order equations. Note that this is not valid for the 1-CSE, since the Hamiltonian includes two-electron terms. One of the important consequences of the equivalence of the 2-CSE and higher-order CSEs with the SE is that the CSEs may be applied to the study of excited states.
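The dispersion relation used in the outline above is easy to check numerically in a finite basis: the energy variance vanishes for every eigenvector of $\hat H$ and is strictly positive for a superposition of eigenvectors with different energies. The sketch below uses a random real symmetric matrix as a stand-in Hamiltonian; the matrix size, the random seed, and the helper name `energy_variance` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((6, 6))
h = 0.5 * (a + a.T)                    # random real symmetric "Hamiltonian"

def energy_variance(h, psi):
    """Return <psi|H^2|psi> - <psi|H|psi>^2 for a real vector psi."""
    psi = psi / np.linalg.norm(psi)
    h_psi = h @ psi
    return float(h_psi @ h_psi - (psi @ h_psi) ** 2)

energies, vecs = np.linalg.eigh(h)

# Eigenstates satisfy the SE, so their variance vanishes (ground and excited states)
var_eigen = [energy_variance(h, vecs[:, n]) for n in range(6)]

# A superposition of two eigenstates with different energies violates the relation
var_mixed = energy_variance(h, vecs[:, 0] + vecs[:, 1])

print(np.max(np.abs(var_eigen)), var_mixed)
```

Since the variance vanishes for excited eigenstates as well as the ground state, the check also illustrates why the CSE is not restricted to the ground state.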


XV. DENSITY MATRIX RENORMALIZATION GROUP (DMRG)

The Density Matrix Renormalization Group (DMRG) is an approximation method which was first conceived by Steven White in 1992 as a way to handle strongly correlated quantum lattices. In the context of molecular physics, the renormalization group structure is often ignored, and DMRG is viewed as a special wave-function ansatz. We will start by writing the wave-function as

$$|\Psi\rangle = |\Psi_{HF}\rangle + |\Psi_{corr}\rangle \tag{366}$$

$$\langle\Psi_{HF}|\Psi\rangle = 1 \tag{367}$$

where $|\Psi_{HF}\rangle$ is the Hartree-Fock wave-function and $|\Psi_{corr}\rangle$ is the correlation part. In most methods, it is assumed that $|\Psi_{corr}\rangle$ is small compared to the exact wave function. Issues arise when the coefficients in the expansion of $|\Psi_{corr}\rangle$ are of order unity or larger. The primary challenge of strongly correlated systems is that a large number of determinants contribute significantly to the wave function. The goal of DMRG is to overcome this complexity by encoding the idea of locality into a wave-function that may include all possible determinants. In other words, most of the quantum phase space is not explored by physical ground states, which makes the strongly correlated problem far more tractable. Currently DMRG is able to describe about 40 electrons in 40 orbitals for compact molecules, and about 100 electrons in 100 orbitals for elongated molecules.

A. Singular value decomposition

In discussing DMRG, singular value decomposition (SVD) proves to be an indispensable tool, since it can break down the coefficient tensor of a strongly correlated wave function. SVD decomposes any real $M\times N$ matrix $\psi$ ($M \ge N$) as

$$\psi = UDV^T, \tag{368}$$

where $U$ is an $M\times M$ column-orthogonal matrix,

$$U^TU = I_{M\times M}, \tag{369}$$

$V$ is a square $N\times N$ column-orthogonal matrix,

$$V^TV = I_{N\times N}, \tag{370}$$

and $D$ is a (rectangular) diagonal matrix whose diagonal elements $\sigma_1, \cdots, \sigma_N$ are arranged in decreasing order.

The columns of $U$ are chosen as the eigenvectors of $\psi\psi^T$, the columns of $V$ are chosen as the eigenvectors of $\psi^T\psi$, and the singular values are the square roots of the eigenvalues of $\psi^T\psi$.

The SVD is in general not unique, and in many texts the diagonal matrix is defined to be square and either the left or the right orthogonal matrix is defined to be rectangular, but all of these formulations are equivalent.

In order to prove the SVD, we must first show that the eigenvalues of the real symmetric matrix $\psi^T\psi$ are real and non-negative and that its eigenvectors can be chosen orthogonal.

Given a real-valued $\psi$ with dimensions $M\times N$, we may construct $\psi^T\psi$, which is real symmetric.

Given any eigenvalue $\lambda$ and corresponding normalized eigenvector $\vec x$, we will show that $\lambda$ must be real:

$$\langle\vec x, \psi^T\psi\vec x\rangle = \lambda\langle\vec x, \vec x\rangle = \lambda \tag{371}$$

but

$$\langle\vec x, \psi^T\psi\vec x\rangle = \langle\psi^T\psi\vec x, \vec x\rangle = \lambda^*\langle\vec x, \vec x\rangle = \lambda^* \tag{372}$$

which means that $\lambda = \lambda^*$ is real.

Now we will show that these eigenvalues are non-negative:

$$\lambda = \langle\vec x, \psi^T\psi\vec x\rangle = \langle\psi\vec x, \psi\vec x\rangle = \|\psi\vec x\|_2^2 \ge 0. \tag{373}$$

Now let us assume that $\psi^T\psi$ has distinct eigenvalues, and take any two of them $(\lambda, \mu)$ with their corresponding normalized eigenvectors $(\vec x, \vec y)$. From this we will show that the eigenvectors are orthogonal:

$$\lambda\langle\vec y, \vec x\rangle = \langle\vec y, \psi^T\psi\vec x\rangle = \langle\psi^T\psi\vec y, \vec x\rangle = \mu\langle\vec y, \vec x\rangle \;\Rightarrow\; (\lambda - \mu)\langle\vec y, \vec x\rangle = 0 \;\Rightarrow\; \langle\vec y, \vec x\rangle = 0 \text{ since } \mu \ne \lambda.$$

If any of the eigenvalues are degenerate, we can choose linearly independent eigenvectors spanning the degenerate eigenspace. These are orthogonal to all eigenvectors corresponding to different eigenvalues by the argument above, so we may apply the Gram-Schmidt process within the degenerate eigenspace to produce a set of orthogonal eigenvectors for the degenerate eigenvalues.

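The proof of the SVD is now as follows. For all non-zero eigenvalues $\lambda_i$ ($i \in \{1, \cdots, r\}$) of $\psi^T\psi$, ordered from largest to smallest, we define $\sigma_i = \sqrt{\lambda_i}$ and $\vec u_i = \psi\vec x_i/\sigma_i$. Then $\langle\vec u_i, \vec u_j\rangle = \delta_{i,j}$. These $\vec u_i$ may therefore be extended to an orthonormal basis of $\mathbb{R}^M$. Now construct a matrix $U$ by taking these vectors in order as its columns, and define the matrix $V$ in the same way by using the $\vec x_i$ as its columns. From this definition we see that

$$(U^T\psi V)_{i,j} = \vec u_i^{\,T}\psi\vec x_j = \frac{\vec x_i^{\,T}\psi^T\psi\vec x_j}{\sigma_i} = \begin{cases} 0 & i > r \text{ or } j > r \\ \sigma_i\delta_{i,j} & \text{else} \end{cases} \;\equiv D_{i,j}, \tag{374}$$

where the middle expression applies for $i \le r$; for $j > r$ the element vanishes because $\psi\vec x_j = 0$, and for $i > r$ because the added basis vectors $\vec u_i$ are orthogonal to the range of $\psi$.

From this we have shown that $\psi = UDV^T$.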
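The construction used in the proof can be followed step by step in NumPy: diagonalize $\psi^T\psi$, sort the eigenvalues in decreasing order, take $\sigma_i = \sqrt{\lambda_i}$, and form $\vec u_i = \psi\vec x_i/\sigma_i$. The sketch below does this for a small full-column-rank matrix, so that all $\sigma_i > 0$ and no extension of the basis is needed (this yields the "thin" form with a rectangular $U$). The example matrix is an arbitrary choice.

```python
import numpy as np

psi = np.array([[2.0, 0.0],
                [1.0, 1.0],
                [0.0, 2.0]])           # M = 3, N = 2, full column rank

# Eigen-decomposition of the real symmetric matrix psi^T psi
lam, x = np.linalg.eigh(psi.T @ psi)
order = np.argsort(lam)[::-1]          # sort eigenvalues from largest to smallest
lam, x = lam[order], x[:, order]

sigma = np.sqrt(lam)                   # singular values sigma_i = sqrt(lambda_i)
u = (psi @ x) / sigma                  # columns u_i = psi x_i / sigma_i

reconstructed = u @ np.diag(sigma) @ x.T

print("singular values:", sigma)
print("U^T U:\n", u.T @ u)
print("max reconstruction error:", np.max(np.abs(reconstructed - psi)))
```

For this matrix $\psi^T\psi$ has eigenvalues 6 and 4, so the singular values are $\sqrt 6$ and $2$, in agreement with `np.linalg.svd`.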

B. SVD applications

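Let us now apply SVD to a quantum system $A$, which is described by an $M$-dimensional orthogonal basis $\{|i\rangle_A\}_{i=1}^{M}$ and is surrounded by an environment $B$, which is described by an $N$-dimensional orthogonal basis $\{|j\rangle_B\}_{j=1}^{N}$.

The state of this combined time-independent system can be represented as

$$|\Psi\rangle = \sum_{i=1}^{M}\sum_{j=1}^{N}\psi_{i,j}|i\rangle_A|j\rangle_B, \tag{375}$$

where $\psi$ is real.

Now suppose that some operator $\hat O$ acts on the quantum system but not on the environment. The expectation value of this operator may be written as

$$\langle\hat O\rangle = \sum_{i,i'=1}^{M}\sum_{j,j'=1}^{N}\psi_{i,j}\psi_{i',j'}\langle i|\hat O|i'\rangle\delta_{j,j'} = \sum_{i,i'=1}^{M}\sum_{j=1}^{N}\psi_{i,j}\psi_{i',j}\langle i|\hat O|i'\rangle = \sum_{i,i'=1}^{M}\rho^A_{i',i}O_{i,i'} = \mathrm{Tr}_A[\rho^A\hat O], \tag{376}$$

where $\rho^A = \mathrm{Tr}_B[\rho] = \mathrm{Tr}_B[|\Psi\rangle\langle\Psi|]$, with elements $\rho^A_{i',i} = \sum_j\psi_{i',j}\psi_{i,j}$, is the reduced density matrix for this system.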

As an example, let us consider a specific case of the system just described, in which one spin in system $A$ is in contact with an environment $B$ that also contains one spin:

$$|\Psi\rangle = \frac{1}{5}\left(4|\uparrow\downarrow\rangle - 3|\downarrow\uparrow\rangle\right), \tag{377}$$

The resulting coefficient matrix is

$$\psi = \frac{1}{5}\begin{pmatrix} 0 & 4 \\ -3 & 0 \end{pmatrix}, \tag{378}$$

and it has the singular value decomposition

$$U = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad D = \frac{1}{5}\begin{pmatrix} 4 & 0 \\ 0 & 3 \end{pmatrix}, \qquad V = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \tag{379}$$

From this we may show that (proved in the homework for the general case)

$$\rho^A = \frac{16}{25}\vec u_1\vec u_1^{\,T} + \frac{9}{25}\vec u_2\vec u_2^{\,T}. \tag{380}$$

Now suppose we were to approximate this RDM by keeping only its largest eigenvalue (the most probable state). Clearly, this approximation would yield inexact expectation values, but in general, as the coefficient matrices become sufficiently large and a smaller proportion of the eigenvectors is truncated, approximations of this type become better and better until the solutions converge to the exact case. This is the real essence of DMRG: we use SVD to lower the dimensionality of the space over which we search for ground states by searching only over the most probable states.
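The two-spin example can be reproduced numerically. The sketch below computes the SVD of the coefficient matrix of Eq. 378, builds $\rho^A = \psi\psi^T$, and compares the expectation value of an operator acting only on $A$ with the value obtained after truncating to the largest singular value. The choice of $\hat S_z$ (in units of $\hbar$) as the test operator and the renormalization of the truncated density matrix are assumptions made for illustration.

```python
import numpy as np

# Coefficient matrix of Eq. 378: rows index spin A (up, down), columns index spin B
psi = np.array([[0.0, 4.0],
                [-3.0, 0.0]]) / 5.0

u, s, vt = np.linalg.svd(psi)           # psi = u @ diag(s) @ vt

rho_a = psi @ psi.T                      # reduced density matrix of system A
rho_from_svd = (u * s**2) @ u.T          # sum_k s_k^2 u_k u_k^T, cf. Eq. 380

sz = np.diag([0.5, -0.5])                # S_z acting on system A only
expect_exact = np.trace(rho_a @ sz)

# Keep only the largest singular value and renormalize the trace
rho_trunc = s[0]**2 * np.outer(u[:, 0], u[:, 0])
rho_trunc /= np.trace(rho_trunc)
expect_trunc = np.trace(rho_trunc @ sz)

print("singular values:", s)
print("<S_z> exact:", expect_exact, " truncated:", expect_trunc)
```

With only two states the truncation error is large (0.5 instead of 0.14); the approximation becomes useful when the discarded singular values carry a small total weight.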

C. DMRG wave function

To understand the DMRG wave function, let us start with the FCI wave function expanded in a complete basis of determinants,

$$|\Psi\rangle = \sum_{\{s_j\}}\psi_{s_1 s_2\cdots s_L}|s_1 s_2\cdots s_L\rangle, \tag{381}$$

$$|s_j\rangle \in \{|\mathrm{vac}\rangle, |\uparrow\rangle, |\downarrow\rangle, |\uparrow\downarrow\rangle\} \tag{382}$$

Here $|s_1 s_2\cdots s_L\rangle$ describes the occupancy of $L$ orbitals, and the coefficient tensor $\psi$ in the expansion above has dimension $4^L$. This problem becomes intractable as $L$ gets large since, unlike in normal CI, we are unable to truncate this tensor, because sparsity is not assumed for strongly correlated systems.

The FCI tensor may be exactly decomposed by singular value decomposition as follows:

$$\psi_{s_1 s_2\cdots s_L} = \sum_{\alpha_1}U[1]_{s_1,\alpha_1}\,s[1]_{\alpha_1}\,V[1]_{\alpha_1,s_2\cdots s_L}. \tag{383}$$

$$A[1]^{s_1}_{\alpha_1} \equiv U[1]_{s_1,\alpha_1}\,s[1]_{\alpha_1}. \tag{384}$$

We may repeat this step to decompose $V[1]$, grouping the indices $(\alpha_1 s_2)$ into a single row index:

$$V[1]_{\alpha_1,s_2\cdots s_L} = \sum_{\alpha_2}U[2]_{\alpha_1 s_2,\alpha_2}\,s[2]_{\alpha_2}\,V[2]_{\alpha_2,s_3\cdots s_L} \tag{385}$$

$$A[2]^{s_2}_{\alpha_1,\alpha_2} \equiv U[2]_{\alpha_1 s_2,\alpha_2}\,s[2]_{\alpha_2} \tag{386}$$

We may continue doing this until the coefficient tensor is exactly decomposed,

$$\psi_{s_1 s_2\cdots s_L} = \sum_{\{\alpha_k\}}A[1]^{s_1}_{\alpha_1}A[2]^{s_2}_{\alpha_1,\alpha_2}\cdots A[L]^{s_L}_{\alpha_{L-1}} = \mathrm{Tr}\left[A[1]^{s_1}A[2]^{s_2}\cdots A[L]^{s_L}\right]. \tag{387}$$

This form is useful because, instead of variationally optimizing the FCI tensor, we may optimize over the tensors of its decomposition and truncate the virtual ($\alpha$) dimension. As the dimension $D$ of the virtual index is increased, the MPS ansatz includes a larger region of the full Hilbert space until it exactly captures the original FCI wave-function.
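The successive-SVD construction of Eqs. 383 to 387 can be written as a short routine. The sketch below decomposes a coefficient tensor $\psi_{s_1\cdots s_L}$ into site tensors $A[k]$ of shape (left bond, physical, right bond), following the convention $A = U s$ used above, and contracts them back. The function names `tensor_to_mps` and `mps_to_tensor`, the optional `max_bond` truncation, and the random test tensor with $L = 4$ are assumptions made for illustration.

```python
import numpy as np

def tensor_to_mps(psi, max_bond=None):
    """Decompose psi[s1,...,sL] into MPS tensors of shape (left, d, right)."""
    dims = psi.shape
    tensors = []
    left = 1
    rest = psi.reshape(left, -1)
    for d in dims[:-1]:
        # group (alpha_{k-1}, s_k) into the row index and SVD, cf. Eq. 385
        u, s, vt = np.linalg.svd(rest.reshape(left * d, -1), full_matrices=False)
        if max_bond is not None:               # truncate the virtual dimension
            u, s, vt = u[:, :max_bond], s[:max_bond], vt[:max_bond]
        tensors.append((u * s).reshape(left, d, -1))   # A = U s, cf. Eq. 386
        rest = vt                                      # V is decomposed next
        left = s.size
    tensors.append(rest.reshape(left, dims[-1], 1))    # last site tensor
    return tensors

def mps_to_tensor(tensors):
    """Contract all virtual indices to recover the full coefficient tensor."""
    out = tensors[0]
    for a in tensors[1:]:
        out = np.tensordot(out, a, axes=1)     # last bond of out with first of a
    return out.reshape([a.shape[1] for a in tensors])

rng = np.random.default_rng(0)
psi = rng.standard_normal((4, 4, 4, 4))        # L = 4 orbitals, 4 states each
psi /= np.linalg.norm(psi)

mps = tensor_to_mps(psi)
exact_error = np.max(np.abs(mps_to_tensor(mps) - psi))
bond_dims = [a.shape[2] for a in mps[:-1]]

truncated = tensor_to_mps(psi, max_bond=4)
truncation_error = np.linalg.norm(mps_to_tensor(truncated) - psi)

print("bond dimensions:", bond_dims)
print("exact error:", exact_error, " truncated error:", truncation_error)
```

A random tensor has no special structure, so the truncation error here is substantial; the ansatz is efficient only for states whose singular values decay quickly.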

D. Expectation values and diagrammatic notation

At this point it is convenient to introduce diagrammatic notation for tensors. In tensornetworks objects are denoted by shapes with lines connecting them. The amount of linesdetermines what type of object is being dealt with. Scalars have no lines, vectors areconnected by one line, matrices have two, and higher dimensional tensors have more.

FIG. 9. Diagrammatic notation for tensors.

With this notation we may represent the FCI tensor and its MPS decomposition (Fig. 10), and we may compute overlap integrals by contracting two tensors over like indices (Fig. 11).
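The overlap contraction can be sketched as follows, assuming two random real MPS with site tensors of shape (left bond, physical index, right bond). The names `random_mps`, `overlap`, and `to_vector` are hypothetical helpers. The contraction proceeds site by site from the left, so the full tensor of size d^L is never formed; the dense vector is built here only to check the result.

```python
import numpy as np

def random_mps(L, d, D, rng):
    """Random MPS with open boundaries and virtual dimension D."""
    dims = [1] + [D] * (L - 1) + [1]
    return [rng.standard_normal((dims[k], d, dims[k + 1])) for k in range(L)]

def overlap(bra, ket):
    """<bra|ket>, contracting like physical and virtual indices site by site."""
    env = np.ones((1, 1))
    for A, B in zip(bra, ket):
        # env[a, b] carries the open virtual indices of the bra and the ket
        env = np.einsum("ab,asc,bsd->cd", env, A.conj(), B)
    return env[0, 0]

def to_vector(mps):
    """Dense coefficient vector, used only as a reference for small systems."""
    out = mps[0]
    for A in mps[1:]:
        out = np.tensordot(out, A, axes=([-1], [0]))
    return out.reshape(-1)

rng = np.random.default_rng(2)
bra = random_mps(6, 4, 5, rng)
ket = random_mps(6, 4, 5, rng)
print(overlap(bra, ket), np.vdot(to_vector(bra), to_vector(ket)))
```

Each step costs a number of operations polynomial in D, in contrast to the exponential cost of the dense inner product.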

Lastly, we may use this notation to represent operators and their expectation values. Given a many-body operator O,


FIG. 10. Diagrammatic notation for FCI tensor and MPS tensor.

FIG. 11. Diagrammatic notation for FCI contractions and MPS overlaps.

$$
O = \sum_{\{r_j\},\{s_j\}} O^{r_1 r_2 \cdots r_L}_{s_1 s_2 \cdots s_L}\, |r_1 r_2 \cdots r_L\rangle \langle s_1 s_2 \cdots s_L|, \tag{388}
$$

we may write it as:

FIG. 12. Diagrammatic notation for an arbitrary operator.

Expectation values are calculated by contracting the open indices on each side with the appropriate wave functions.


FIG. 13. Diagrammatic notation for an expectation value.
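As a concrete instance of such a contraction, the sketch below evaluates the expectation value of a one-site operator in an MPS. The choice of operator (the orbital occupation number in the basis of Eq. (382)), the random real MPS, and the function name `expect_one_site` are assumptions made for illustration; a general many-body operator would be contracted in the same way, with more operator legs.

```python
import numpy as np

# Occupation number in the basis |vac>, |up>, |down>, |updown> of Eq. (382)
N_OP = np.diag([0.0, 1.0, 1.0, 2.0])

def expect_one_site(mps, op, site):
    """<Psi| op acting on one site |Psi> for tensors of shape (left, phys, right)."""
    env = np.ones((1, 1))
    for k, A in enumerate(mps):
        O = op if k == site else np.eye(A.shape[1])
        # close the physical legs of the bra and the ket through the operator
        env = np.einsum("ab,asc,st,btd->cd", env, A.conj(), O, A)
    return env[0, 0]

rng = np.random.default_rng(3)
L, d, D = 5, 4, 3
dims = [1] + [D] * (L - 1) + [1]
mps = [rng.standard_normal((dims[k], d, dims[k + 1])) for k in range(L)]

norm = expect_one_site(mps, np.eye(d), 0)      # <Psi|Psi>
occupations = [expect_one_site(mps, N_OP, k) / norm for k in range(L)]
print(occupations)
```

The division by the norm is needed because the random MPS is not normalized.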

E. Matrix product ansatz

In DMRG it is assumed that the wave function can be written as a matrix product state as described before (MPS ansatz), where the virtual dimensions are truncated to D.

This wave function is invariant under a number of transformations. Between any two neighboring tensors A[i] and A[i + 1] we may insert the identity, written as $X X^{-1}$ for any invertible matrix X, without changing the wave function. Using this freedom, the DMRG wave function may be written in canonical form as

$$
|\Psi\rangle = \sum_{\{s_j\}} \mathrm{Tr}\left[ L[1]^{s_1} \cdots L[p-1]^{s_{p-1}}\, C[p]^{s_p}\, R[p+1]^{s_{p+1}} \cdots R[L]^{s_L} \right] |s_1 s_2 \cdots s_L\rangle, \tag{389}
$$

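where the L and R tensors satisfy the orthogonality conditions

$$
\sum_{s_k,\alpha_{k-1}} \left( L[k]^{s_k} \right)^\dagger_{\alpha_k,\alpha_{k-1}} L[k]^{s_k}_{\alpha_{k-1},\beta_k} = \delta_{\alpha_k,\beta_k}, \tag{390}
$$

$$
\sum_{s_k,\alpha_k} R[k]^{s_k}_{\alpha_{k-1},\alpha_k} \left( R[k]^{s_k} \right)^\dagger_{\alpha_k,\beta_{k-1}} = \delta_{\alpha_{k-1},\beta_{k-1}}. \tag{391}
$$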
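One common way to reach a left-normalized form is a QR factorization of each site tensor; this is an assumption of the sketch below, since the notes do not specify how the canonical form is constructed. Writing A[k] = Q R and regrouping A[k] A[k+1] = Q (R A[k+1]) leaves every matrix product, and hence the wave function, unchanged, while Q satisfies the left orthogonality condition. The function names are hypothetical.

```python
import numpy as np

def left_normalize(mps):
    """Left-normalize sites 1..L-1 by QR; the last site absorbs the remainder."""
    out = [A.copy() for A in mps]
    for k in range(len(out) - 1):
        Dl, d, Dr = out[k].shape
        Q, R = np.linalg.qr(out[k].reshape(Dl * d, Dr))
        out[k] = Q.reshape(Dl, d, -1)
        # push R into the next tensor so that the product A[k] A[k+1] is unchanged
        out[k + 1] = np.tensordot(R, out[k + 1], axes=([1], [0]))
    return out

def to_vector(mps):
    """Dense coefficient vector, used only to verify gauge invariance."""
    out = mps[0]
    for A in mps[1:]:
        out = np.tensordot(out, A, axes=([-1], [0]))
    return out.reshape(-1)

rng = np.random.default_rng(4)
L, d, D = 5, 4, 3
dims = [1] + [D] * (L - 1) + [1]
mps = [rng.standard_normal((dims[k], d, dims[k + 1])) for k in range(L)]
canon = left_normalize(mps)

for Lk in canon[:-1]:
    gram = np.einsum("asc,asd->cd", Lk.conj(), Lk)   # left-hand side of Eq. (390)
    print(np.allclose(gram, np.eye(gram.shape[0])))
print(np.allclose(to_vector(canon), to_vector(mps)))
```

A right-normalized form satisfying Eq. (391) can be obtained analogously by sweeping from the right.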

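From these L and R tensors we may define sets of renormalized many-particle basis states, left and right, where

$$
|\alpha^L_{p-1}\rangle = \sum_{\{s_j\},\,\alpha_1 \cdots \alpha_{p-2}} L[1]^{s_1}_{\alpha_1} \cdots L[p-1]^{s_{p-1}}_{\alpha_{p-2},\alpha_{p-1}}\, |s_1 s_2 \cdots s_{p-1}\rangle, \tag{392}
$$

$$
|\alpha^R_{p+1}\rangle = \sum_{\{s_j\},\,\alpha_{p+2} \cdots \alpha_{L-1}} R[p+2]^{s_{p+2}}_{\alpha_{p+1},\alpha_{p+2}} \cdots R[L]^{s_L}_{\alpha_{L-1}}\, |s_{p+2} s_{p+3} \cdots s_L\rangle, \tag{393}
$$

which satisfy

$$
\langle \alpha^L_{p-1} | \beta^L_{p-1} \rangle = \delta_{\alpha_{p-1},\beta_{p-1}}, \tag{394}
$$

$$
\langle \alpha^R_{p+1} | \beta^R_{p+1} \rangle = \delta_{\alpha_{p+1},\beta_{p+1}}. \tag{395}
$$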

These left and right vectors form renormalized bases of the many-body Hilbert spaces spanned by orbitals 1 to p − 1 and orbitals p + 2 to L, respectively. Consider the left side. For each site k from 1 to p − 2, the many-body basis is augmented by one orbital and subsequently truncated again to at most D renormalized basis states:

$$
|\alpha^L_{k-1}\rangle \otimes |s_k\rangle \to |\alpha^L_k\rangle = \sum_{\alpha_{k-1}, s_k} A[k]^{s_k}_{\alpha_{k-1},\alpha_k}\, |\alpha^L_{k-1}\rangle |s_k\rangle. \tag{396}
$$

This implies that DMRG is a renormalization group for many-body Hilbert spaces.
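As a short check (a sketch, writing the step index as k and assuming that A[k] is left-normalized in the sense of Eq. (390)), the orthonormality of the left basis, Eq. (394), is preserved by each step of Eq. (396). Assuming $\langle \alpha^L_{k-1} | \gamma^L_{k-1} \rangle = \delta_{\alpha_{k-1},\gamma_{k-1}}$ and $\langle s_k | t_k \rangle = \delta_{s_k,t_k}$,

$$
\begin{aligned}
\langle \alpha^L_k | \beta^L_k \rangle
&= \sum_{\alpha_{k-1}, s_k} \sum_{\gamma_{k-1}, t_k} \left( A[k]^{s_k}_{\alpha_{k-1},\alpha_k} \right)^{*} A[k]^{t_k}_{\gamma_{k-1},\beta_k}\, \langle \alpha^L_{k-1} | \gamma^L_{k-1} \rangle \langle s_k | t_k \rangle \\
&= \sum_{\alpha_{k-1}, s_k} \left( A[k]^{s_k} \right)^\dagger_{\alpha_k,\alpha_{k-1}} A[k]^{s_k}_{\alpha_{k-1},\beta_k}
= \delta_{\alpha_k,\beta_k}.
\end{aligned}
$$

Since the single-orbital states at k = 1 are orthonormal, the result follows by induction on k.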

F. DMRG algorithm

Our goal is to approximate the diagonalization of the exact Hamiltonian in the orthogonal basis $|\alpha^L_{p-1}\rangle \otimes |s_p\rangle \otimes |s_{p+1}\rangle \otimes |\alpha^R_{p+1}\rangle$. The DMRG algorithm consists of successive sweeps over the orbitals during which two neighboring MPS tensors are variationally optimized. We achieve this by first contracting two adjacent matrices of the MPS into a new tensor, which we then variationally optimize:

$$
\sum_{\alpha_i} A[i]^{s_i}_{\alpha_{i-1},\alpha_i}\, A[i+1]^{s_{i+1}}_{\alpha_i,\alpha_{i+1}} = B[i]^{s_i,s_{i+1}}_{\alpha_{i-1},\alpha_{i+1}}. \tag{397}
$$
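In code, Eq. (397) is a single contraction over the shared virtual index. The sketch below assumes site tensors stored with shape (left bond, physical index, right bond); the function name `two_site_tensor` and the array sizes are illustrative.

```python
import numpy as np

def two_site_tensor(A_i, A_ip1):
    """Contract the shared virtual index of two neighboring site tensors.

    Shapes: A_i (a, s, b), A_ip1 (b, t, c) -> B (a, s, t, c), cf. Eq. (397).
    """
    return np.einsum("asb,btc->astc", A_i, A_ip1)

rng = np.random.default_rng(5)
A_i = rng.standard_normal((3, 4, 5))      # alpha_{i-1}, s_i, alpha_i
A_ip1 = rng.standard_normal((5, 4, 2))    # alpha_i, s_{i+1}, alpha_{i+1}
B = two_site_tensor(A_i, A_ip1)
print(B.shape)
```

The resulting tensor has two physical and two virtual indices and is the variational object of each DMRG step.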

We will do this by choosing a trial wave function (for the first step only) and extremizing the Lagrangian

$$
\mathcal{L} = \langle \Psi(B[i]) | H | \Psi(B[i]) \rangle - E_i \langle \Psi(B[i]) | \Psi(B[i]) \rangle, \tag{398}
$$

with respect to the complex conjugate of B[i]. From our prior knowledge of the variational principle, we know that this yields the eigenvalue problem

$$
H[i]^{\mathrm{eff}} B[i] = E_i B[i]. \tag{399}
$$
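The step from Eq. (398) to Eq. (399) can be sketched as follows, assuming that the left, site, and right basis states are orthonormal. The composite index $\mu = (\alpha_{i-1}, s_i, s_{i+1}, \alpha_{i+1})$ and the symbol $|\mu\rangle$ are shorthand introduced only for this sketch. The wave function is linear in B[i],

$$
|\Psi(B[i])\rangle = \sum_{\mu} B[i]_{\mu}\, |\mu\rangle, \qquad |\mu\rangle = |\alpha^L_{i-1}\rangle \otimes |s_i\rangle \otimes |s_{i+1}\rangle \otimes |\alpha^R_{i+1}\rangle,
$$

so that

$$
\mathcal{L} = \sum_{\mu\nu} B[i]^{*}_{\mu}\, H[i]^{\mathrm{eff}}_{\mu\nu}\, B[i]_{\nu} - E_i \sum_{\mu} B[i]^{*}_{\mu} B[i]_{\mu}, \qquad H[i]^{\mathrm{eff}}_{\mu\nu} = \langle \mu | H | \nu \rangle.
$$

Setting $\partial \mathcal{L} / \partial B[i]^{*}_{\mu} = 0$ gives

$$
\sum_{\nu} H[i]^{\mathrm{eff}}_{\mu\nu}\, B[i]_{\nu} = E_i\, B[i]_{\mu},
$$

which is Eq. (399) written in components. The orthonormality of the basis is what makes this an ordinary, rather than a generalized, eigenvalue problem.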

Once B[i] is found, it is decomposed with an SVD and truncated if there are more than D singular values.


During each iteration, B[i] is constructed, the corresponding effective Schrödinger equation is solved, and the solution B[i] is decomposed using singular value decomposition; A[i] is then redefined as the contraction of U[i] with s[i], and A[i + 1] is set to the corresponding right-normalized matrix V[i]. Once this iteration is completed, i → i + 1 (or i → i − 1) and the process is repeated until i = L (i = 1), at which point the sweep reverses direction. The algorithm proceeds back and forth in consecutive "sweeps" until some convergence criterion is met. These calculations are repeated with increasing virtual dimension D in order to monitor the convergence, since we may often extrapolate $E_{\mathrm{FCI}}$ from the DMRG energies $E_D$.
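The decomposition step of one iteration can be sketched as follows. This is a minimal illustration, not a full DMRG implementation: B is a random real tensor standing in for the solution of the effective eigenvalue problem, the function name `split_two_site` is hypothetical, and the index order (left bond, s_i, s_{i+1}, right bond) is an assumption of the sketch. The singular values are absorbed into the left tensor, matching the convention A[i] = U[i] s[i] described above.

```python
import numpy as np

def split_two_site(B, D):
    """Split B[a, s, t, c] by SVD, keeping at most D singular values."""
    Dl, d1, d2, Dr = B.shape
    U, s, Vh = np.linalg.svd(B.reshape(Dl * d1, d2 * Dr), full_matrices=False)
    keep = min(D, len(s))
    discarded = float(np.sum(s[keep:] ** 2))         # weight of the truncated states
    A_left = (U[:, :keep] * s[:keep]).reshape(Dl, d1, keep)   # U s
    A_right = Vh[:keep, :].reshape(keep, d2, Dr)              # right-normalized V
    return A_left, A_right, discarded

rng = np.random.default_rng(6)
B = rng.standard_normal((3, 4, 4, 3))
A_left, A_right, discarded = split_two_site(B, D=5)
B_approx = np.einsum("asb,btc->astc", A_left, A_right)
print(A_left.shape, A_right.shape, discarded)
```

The discarded weight returned here is the quantity usually monitored during a sweep, since it measures how much of B is lost by capping the virtual dimension at D.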

G. DMRG in practice

Before a DMRG calculation is run, it is very important to choose an appropriate orbital ordering for the MPS ansatz and an appropriate initial wave function. The error in the energy caused by the starting guess is estimated to be an order of magnitude smaller than that caused by the ordering of the orbitals.

It is best to start with a small active space and subsequently add previously frozen orbitals; this can be done by starting from a small CASSCF, CI, or HF calculation for the orbitals.

The choice and ordering of orbitals is nontrivial. In early DMRG, White was able to order the orbitals easily since the system was one-dimensional: neighboring electrons were far more correlated than electrons which were very far from one another. For many elongated molecules a spatially local basis is useful since the system is roughly one-dimensional.

H. Dynamic correlation and excited states

DMRG is not able to correct for dynamic electron correlation on its own. These correlations may be accounted for by using DMRG to solve for the active space in combination with CASSCF.

DMRG may find excited states by projecting out lower-lying eigenstates, or by targeting specific excited states with state-specific algorithms. Additionally, DMRG may be combined with linear response theory (DMRG-LRT), which allows for the calculation of excited states and other response properties.

I. Applications to atoms and molecules

DMRG has been applied to many systems in order to calculate ground-state properties, excited-state energies, polarizabilities, and many properties that require the one- or two-body reduced density matrix (which may easily be computed from the MPS), such as spin densities and dipole moments. Currently it is one of the best methods for one-dimensional and almost one-dimensional systems, heavy molecules for which relativistic effects are important, and systems such as transition metals which have a very large active space.

J. Limitations

Although DMRG is applicable to most systems, it is only efficient at describing locality in one spatial dimension. In large molecules, orbital ordering and the MPS structure as a whole become impractical. For higher-dimensional systems, more general tensor network states (TNS) are being developed, which increase the number of virtual dimensions summed over and help encode locality into the TNS wave function.
