This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
= 41'12t@IY> =*= =
*
(3.37)
=
Since T12 = T21. the operator T'12 = T12 (1-P12) is also complex symmetric. This follows from the fact that <@IT12'I Y> = J I d x l d ~ 2 @*(1,2)T12(1-P12)"(1.2) = Ti2 [ Y(1,2) - "(2.1)) = = I I dxldx2 @*(1,2) =
I I dxldx2 {@*(1,2)- @*(2,1))Ti2 "(1.21,
(3.38)
where we have simply interchanged the names of the integration coordinates x i and x2 in the second term. Using (3.37).one obtains further
which proves the statement. The first term T1 in the right-hand side of (3.35)is complex symmetric. In order to show that the second term tl is also complex symmetric tlt = tl*, one has to show that - for any pair of one-particle functions f(1) and g(1) in L2 - one has= = * = * = . (3.40)
Using (3.36)and (3.39).one obtains directly
= I dxl f * ( l ) t l g(1) =
Per-Olov Liiwdin, Piotr Froelich, and Manoj Mishra
208
= I dxl g(l){xk.1 I h 2 Vl(2) Tl2' Wk(2) ekl ==
,
fc(1) =
(3.41)
where in the proof we have interchanged the summation indices k and 1 and used the fact that ekl= elk. Hence tl is complex symmetric, and one has (3.42) The fact that also the effective one-particle Hamiltonian T&l) is complex symmetric h a s deep-going consequences and renders considerable simplifications in the theory.
={
Let u s first evaluate the canonical Hartree-Fock functions ~ p p ...WN ] which are solutions to the equations
vl,w2,
where the matrix h is on classical canonical for, with the eigenvalues tk on the diagonal. In such a case, one can introduce the conjugate set 4 = {@I,$9, ... , ON} through application of relation (2.26) to the one-particle case, which gives
4 = y*-1.
(3.44)
One has further
I n such a case, the one-particle projector p defined by (3.32) takes the simplified form
with the kernel
Many-Particle Hamiltonian Transformations
209
(3.47) which if often very useful in the numerical applications. In solving the non-linear Hartree-Fock equations (3.43). one
using the iterative SCF-procedure discussed in reference C. If one starts from a one-particle projector p which is complex is
symmetric, this property is going to stay invariant under the iteration procedure, and one finally reaches a solution with the desired property: D, = Ca*cCa I C,*>-'-
3.4. Special Case when Ti= T* = T. Let us now consider the special case when a complex symmetric operator is real, s o that T* = T. I n this case, the operator T is also self-adjoint. Tt = T. and one can use the results of the conventional Hartree-Fock method '. The eigenvalues are real, 3L = h*, and - if an eigenvalue h is non-degenerate. the associated eigenfunction C is necessarily real or a real function multiplied by a constant phase factor exp(i a). In both cases, one has D = C*cC I C*>-' = C. In the conventional Hartree-Fock theory, the one-particle projector p takes the form p= l p c ~ I ~ - 1 < ~ l t
(3.48)
and it has hence the special property p+ = p.
(3.49)
I t is then easily shown that also the effective one-particle operator Teff is self-adjoint:
t
Teff (1) = Teff(1).
(3.50)
If one starts the iterative SCF-procedure from a self-adjoint operator p, the properties (3.50)and (3.49)are going to be invariant under the iterations and are going to characterize the final solution. Since Teff is self-adjoint, the classical canonical matrix D is always on diagonal form. We note that, in this case, there is no need to assume that the set oy = {wl. w2, ... , WN) should consist of real functions, and that it is sometimes
210
Per-Olov Lawdin, Piotr Froelich, and Manoj Mishra
convenient to consider complex sets. I t is a peculiarity of the Hartree-Fock scheme that the properties of the approximate eigenfunction C, does not always reflect the properties of the exact eigenfunction C. A well-known example is given by the symmetry dilemma which says that if an eigenfunction C has a special symmetry property characterized by the projector 0. so that
',
OC=C.
TO=OT,
(3.51)
then the approximate eigenfunction C, obtained from the variation principle 61 = 0 does not automatically have this property. This means that the condition OCa = C a becomes a constraint on the variation principle, and one has the choice between looking for an "absolute" minimum I. in which the Slater determinant C, may be a mixture of different symmetry types, and a "local" minimum Iloc > I, in which the constraint is satisfied. I t should further be observed that, if one starts from a Slater determinant which satisfies the condition OCa = C,, then the constraint is satisfied during the entire iterative SCF-procedure. and that this approach leads to the so-called restricted Hartree-Fock (RHF) scheme. In this particular case, the matrix It in the canonical Hartree-Fock equations may contain non-diagonal Lagrangian multipliers tkl # 0, but we note that they are of a different character than those occurring in the Jordan blocks. Since the operator T is complex symmetric, Tf = T*. one may wonder what happens if one starts the iterative SCFprocedure from a complex set y = ( wl.w2, ... , vN) and a oneparticle projector p of the form (3.32).i.e. p = Iolp>-
1
(3.52)
instead of (3.48). In such a case, the two relations p t = p* and = T,ff* are going to be invariant under the iterative SCFprocedure and characteristic also for the final solutions. Since the relations (3.44)and (3.45)are now valid, one has also the property D, = C,*-l, but we note that the Slater determinant C, is not necessarily real, or a r e d function multiplied by a constant phase factor cia. This means also that the approHmate T,f;
Many-Particle Hamiltonian Transformations
21 I
eigenvalue h defined by the bi-variational principle is not necessarily real. It is clear that this complex symmetric Hartree-Fock scheme will reduce to the conventional Hartree-Fock scheme, if and only if the relation
y*= IQ.
(3.53)
is satisfied, i.e. if the linear space spanned by the functions ~ p = p { ... , wN} is stable under the operation of complex conju-
w1.w2.
gation (*). In this case, the one-particle operator (3.52)reduces to the conventional form (3.48).and - for the Slater determinant Ca = (N!)-1'2 Iw(xi)I - one obtains
where the determinant I a I is a constant factor, and the approximate eigenvalue I becomes real. If one starts the iterative SCFprocedure from a set olp, which satisfies the relation (3.53).the relations pt = p* and Tefft = Teff* are going to stay invariant under the iterations, and one obtains the conventional HartreeFock scheme. On the other hand, if one starts from a set which does not satisfy (3.53),the relations p t = p* and Teff'= Teff*are going to
stay invariant under the iterative SCF-procedure, and we note that there is nothing in this process which says that the final solution y must necessarily satisfy relation (3.53).Hence one can expect that the complex symmetric Hartree-Fock scheme and the conventional scheme will sometimes give different results.
3.5. Invariance of the Property of "Complex Symmetry" under Restricted Similarity Transformations.- Starting from the general similarity transformatiion (3.1).one obtains the two relations TU =
(TUlt = (Ut)-'TtUt.
(3.55)
212
Per-Olov Liiwdin, Piotr Froelich, and Manoj Mishra
One sees immediately that, if the operator T is originally complex symmetric, so that Ti= T*. it will keep this property invariant under the similarity transformation provided that the operator U satisfies the condition ( U + ) - l = u*,
(3.56)
in which case one speaks of a restricted similarity transformation: see also reference A, eqs. (A.1.54-1.71). Let us further consider the stability problem for the operators T and Ti in the form given by the relations (2.19)and (2.20).For the con'ugate eigenfunctions Cu and DU to the transformed operators T and (TU)'.one gets in analogy with (3.8)that
L
CU=UC,
DU = (Ut)-l D,
(3.57)
I n the special case when T is complex symmetric, one has further according to (2.26)that D = C * < C I C * > - l . S i n c e= = = . one obtains
DU = ( U y D = U*C*-l = (CU)* -l, (3.59) and this important relation is hence invariant under the restric--ted similarity transformations. The many-particle operator U defined by the product (3.2) defines a restricted similarity transformation provided that the one-particle operators u satisfy the condition:
(u+)-'= u*.
(3.60)
Let us now consider the one-particle transformations (3.19) which, in this case, take the form:
If all the functions in the sets D(P and
0 belong to the domain of u
and u*, respectively, the one-particle operators p and pu exist,
Many-Particle Hamiltonian Transformations
213
and one obtains - according to (3.21)and (3.28)- the relations U
p
-1
=UPU
,
1
T u = u T e ~ u .-
(3.62)
Using (3.60).one gets further that
i.e. the operators p and Teff remain complex symmetric under the restricted similarity transformations. In this particular case, one has
and the classical canonical form B stays invariant under the similarity transformation as one would expect. According to (3.44). one has further 0 = y*-l. and since eyuI yu*>= . one gets also ?#I
=
(yU)*-l.
(3.66)
Let us finally assume that one starts from a set yu which be-
longs to L2,but that this is not necessarily the case for the set y -1 u = u D(O . In such a case, the projector pu = I yU.c(yU)* I yU>-l<(l!+P)* I
(3.67)
as well as the operator T U eexist, ~ and the relations (3.63)are going to be satisfied. This implies that, if one has solved the canonical Hartree-Fock equations (3.43)for this particular case:
where hu is a classical canonical form, then one can immediately introduce the conjugate solutions +u through the relation
214
Per-Olov Uwdin, Piotr Froelich, and Manoj Mishra
(3.69) According to (3.45).they satisfy the conditions
I $5 =
n.
(3.70)
However, since the projector p does not necessarily exist, the invariance theorem t = QU could very well break down, corresponding to the case of "lost eigenvalues" in the general theory. Let us now consider the special case when Tt = T* = T, i.e. when the original operator T is self-adjoint and real. In the conventional Hartree-Fock scheme - in the following indicated by an index c (=conventional) - which is based on relation (3.48). one has p, = pet and Teff,, = Teff,,t , and it is obvious that the classical canonical form h, must be diagonal with only real eigenvalues.
For restricted similarity transformations, the transformed operator TU= UTU-' is still complex symmetric, and the same applies - according to (3.33)and (3.42)- to the one-particle operators pu and Tueff.If one starts from the assumption that oPp' = u
y,.
(3.71)
one gets according to (3.62)that pu=up,u
.
-1
(3.72)
and the invariance theorem t = Q shows then that, in this case, the classical canonical form h is diagonal with real eigenvalues. Since we are here interested in complex one-particle eigenvalues tuk, this trivial application of the Tarfala theorem leads hence to a negative result. I t is evident that, if one wants to avoid this trivial case, one has to abandon relation (3.71)and assume that the set yu has more general properties. For this purpose, we will start from the complex symmetric many-particle operator Tu and consider the associated one-particle operators
Many-Particle Hamiltonian Transformations
pu = I yu><(DpRIu y)* u > - l < ( q ) *I ,
215
(3.73)
and TUef&l)defined by (3.23). In this more general case, one is dealing with the complex symmetric Hartree-Fock scheme , and it may very well happen that the classical canonical form hU has complex eigenvalues. I t is not so easy to demonstrate this fact by a theoretical or numerical example for the simple reason that, even if it is easier to solve the Hartree-Fock equations than the no one has so far been able to original eigenvalue problem (2.3), solve the Hartree-Fock equations exactly. In the approximate procedures built on the use of truncated real orthonormal basis sets of order m, the matrix Tueffis a symmetric matrix of order mxm with complex elements, which - as a rule - has complex eigenvalues. Once one has solved the Hartree-Fock equations associated with the transformed operator TU= UTU-l. it is evident that it would be of interest to study also the inverse transformation T = U -1T uU, where T has the properties T = Ti = T*. In order to stay within the complex symmetric Hartree-Fock scheme, we will then consider the functions:
y = u -1y u.
(3.74)
If at least one of the functions yu is not in the domain of the
operator u-', at least one of the functions y would be situated outside L2, the complex symmetric Hartree-Fock scheme would break down, and one would have a situation which would be analogous to the case of a lost eigenvalue in the theory of the exact eigenfunctions C and D. We note, however, that the Hartree-Fock scheme is a n approximation in which the solutions do not necessarily reflect the behaviour of the exact eigenfunctions. I n fact, it may very well happen that the canonical Hartree-Fock functions y" are all situated in the domain of the operator u-', and this implies that also the transformed functions D(P = u-'prU are all situated in the L2 Hilbert space. In such a case, the one-particle operators p = l ~ > < ~ * l o o p > - l < ~ * l a n d T,ff exist, a n d one has t h e
Per-Olov Uwdin, Piotr Froelich, and Manoj Mishra
216
transformations:
p = u -1puU,
Tedl) = u -1Tue d l )
U,
(3.75)
which indicate that the pDerators D and Teffll) are cormley surnmetn'c but usuallv not sklFad_ioint:I t should be observed that the complex symmetric Hartree-Fock scheme is different from the conventional one, and that the former reduces to the latter if and only if the relation (3.53)is satisfied. i.e. if the linear space spanned by the set y is stable under complex conjugation. In order to obtain complex eigenvalues to the operator Tueff(l),it is hence necessary to avoid this particular situation, in which y*=
v pa,
-1
(u
u )* = (u-'o(Pu) p a , and
i.e. all the functions uutYuk* for k = 1, 2, ... , N should be situated in the linear space spanned by the functions yu = ( ~ " 1 ,wu2, ... , w u ~ } ;for algebraic details, see Appendix A. I t is evident that the relation (3.76)is a very special condition for the general set yu= (Yuk} which is very seldom satisfied. In practice, one knows for sure that it is not fulfilled as soon as the operator Tueff(l)has a t least one complex eigenvalue. We note finally that, in the numerical studies of the complex symmetric operators Tu and T u e ~it, is usually convenient to use orthonormal basis sets which are real, since the associated matrices will then automatically be symmetric with complex elements. I n the case of a truncated basis of order m, most of the eigenvalues will usually turn out to be complex, but we observe that one has to study their behaviour when m goes to infinity and the set becomes complete, before one can make any definite conclusions as to the existence of true complex eigenvalues to Tu and T u , ~ .respectively. The connection between the results of approximate numerical treatments and the exact theory is still a very interesting but mostly unsolved problem.
Many-Particle Hamiltonian Transformations
217
4. Properties of a Many-Particle Hamiltonian under Complex scaling.
4.1. Method of Complex Scaling: the Dilatation Operator. - The method of complex scaling is based on the use of the dilation operator u = u(q) - where q is a fixed complex parameter called the "scale factor" which is sometimes expressed in the alternative forms q = exp(8) = q exp(ia) - defined through the property
u(q) f(x) = q112 f(qx).
(4.1)
Here f= f(z) is an analytic function originally defined on the real axis {x),which is assumed to be analytic also in the point z = qx. An essential problem in the method of complex scaling in quantum mechanics is hence to study whether a wave function w = ~ ( xdefined ) on the real axis may be continued analytically out in the complex plane to the point z = qx. Since many analytic functions have natural boundaries, it is from the very beginning evident that there may be considerable restrictions on the parameter q itself. More generally, one may define the operator u through the relation u(q) f(z) = $ I 2 f(qz),
(4.2)
where it is assumed that the function f is analytic in both z and qz. The method of complex scaling h a s been rigorously developed during the last few decades by a number of mathematicians, and - for a selection of the key references - the reader is referred to our reference A. Here we will consider the method of complex scaling essentially from the physicist's point of view. There are many ways to show that the dilatation operator u satisfies the condition (3.60). which is characteristic for the restricted similarity transformations. By putting q = exp(8). it was proven in eq. (A.3.14)that u may be written in the form
where
Per-Olov Lawdin, Piotr Froelich, and Manoj Mishra
218
o = (1/2) + xd/dx = (2xi/h)(px+xp)/2,
(4.4)
and p = (h/2ni) d/dx is a self-adjoint operator with the special property p* = - p. We will here give an alternative derivation for finite values of 8.
Let us assume that f = f(x) is analytic around the point x=O within a radius p. so that - for Ix I < p - one has the Taylor expansion f(x) = zn (xn/n!) P(o).
(4.5)
Since (xd/dx)xn = nx", one gets (xd/dx)p 3 = np xn, as well as exp(8 xd/dx) xn = ($ ( ep/p!) =
{% (ep nP/p!)) x"
(X
d/dx)p)xn =
.
= exp (en) x" = (e8xn
(4.6)
Hence one has exp@ d/dx) f(x) = Zn (ee x)"/n! +n'(0)= f(eex),
(4.7)
provided that I eexl < p , so that the series is convergent and the function analytic in the point z= eex = q x. Hence
u = eem = exp ((n i/h) 0 (px+xp)).
(4.8)
Since further wt = - w = o*.one gets directly u t = exp(eo)t = exp
I-
e*o) = {
U(e*))-l
= { u * J - ~ , (4.9)
which proves our statement. For the specific algebraic ruIes valid for the dilatation operator u, etc., the reader is referred to reference A. I t should be observed that u is a n unbounded operator, as shown by some simple examples. 4.2. The Coulombic Many-Particle Hamiltonian.- For a system of atomic nuclei and electrons under the influence of electrostatic
forces, the many-particle Hamiltonian takes the form:
Many-Particle Hamiltonian Transformations
219
as described in (A.3.7).The method of complex scaling may be defined in many different ways, but - in reference A and here we have defined it through an unbounded similarity transformation of the type (1.1)which gives (4.11)
In an approximate numerical treatment based on the bivariational principle and the use of a finite (real or complex) orthonormal basis Q, one has
HU = q-2T + q-lv,
(4.12)
If the basis Q is chosen real, one has the additional advantage that the matrices T and V are real and symmetric, and - with q = p(cos a + i sin a ) - one can easily separate the real and imaginary parts of the ei envalue problem. I t is evident that, in such a case, the matrix W is automatically symmetric with complex elements. and this important matrix property was discovered in the numerical applications long before one started studying the corresponding operator property.
fJ
Keeping the basis fixed, one can now study the behaviour of the approximate eigenvalues as a function of the scale factor q and particularly of the "rotation angle" a. Using the results from the exact theory, it seems reasonable to believe that the eigenvalues which are approximately situated on a ray in the complex plane with the angle (-2a)belong to the "continuum". whereas those which are situated between this ray and the real axis may correspond to discrete complex eigenvalues associated with physical "resonances". I n the exact theory, these complex eigenvalues EU are independent of q as soon as they are "uncovered": dE/dq = d2E/dq2 = ... = dnE/dqn = ... = 0,
(4.13)
whereas in the approximate numerical treatment they show certain stability properties: in practice they are often determined from "stabilization graphs" based on the fact that at
220
Per-Olov Liiwdin, Piotr Froelich, and Manoj Mishra
least dE/da = 0. The Coulombic Hamiltonian (4.1) is invariant under translations and rotations, and it is hence convenient to separate the motion of the "center of mass" 5 and to study the new Hamiltonian:
H
H - p~2/Mg
(4.14)
in which one may further separate the rotation of the system
considered as a whole. One of the fundamental problems in quantum chemistry is that the Hamiltonian H' does not distinguish between different isomers or configurations, and that one still does not know how to treat the influence of the "nuclear motion" in full detail. In this situation, one often tries to use additional chemical insight obtained from experiments. In the Born-Oppenheirner approximation , one assumes that the nuclei g are very heavy (m, = -1, neglects the kinetic energy of the nuclei, and assumes that they are situated in fuced positions . We note that - even if today the calculation of internuclear istances and angles is a highly developed art - the general shapes of the molecules involved are often taken from chemical experience. In this scheme, the Hamiltonian takes the simplified form:
2
(4.15) where
2
Hg = e /rg,
(4.16)
and the indices g and h characterize the atomic nuclei and the indices i and j the electrons. Scaling also the nuclear coordinates properly (eg. by external scaling), one may now apply the method of complex scaling to the Born-Oppenheimer Hamiltonian (4.15). The results of the general theory of unbounded similarity transformations as outlined in reference A apply also to this case, and - for numerical applications - the reader is referred to the bibliography in this paper.
221
Many-Particle Hamiltonian Transformations
4.3. Application of the Hartree-Fock Method. - Since numerical Hartree-Fock programs dealing with complex numbers are available in many research groups, it seemed natural to apply this scheme also to the scaled Born-Oppenheimer Hamiltonian (4.15). As a consequence, some numerical results were obtained before the theory was developed, and - as we have emphasized in
the Introduction - some features seemed rather astonishing. The Hamiltonian H defined by (4.15) is self-adjoint and real, H = Ht= H*, and this implies that the transformed Hamiltonian Hu remains complex symmetric under complex scaling. Hence the same holds for the one-electron projector pu defined by (3.52) 1 pu = ll$J=-
(4.17)
and for the associated effective Hamiltonian HUefr(l).I n such a case, the Tarfala theorem expressed in the relations (3.21) and (3.28) is valid: 1
pu = u p U- ,
TueffIl) = u Teff(l) U-'.
(4.18)
In Sec. 3.3, we have emphasized that the complex symmetric Hartree-Fock scheme usually does not reduce to the conventional one for the special case when q = 1. More generally, in the complex Hartree-Fock scheme, the results obtained from a starting value q l do not necessarily reduce to the results obtained from another starting value q2 - i.e. each starting value leads to a specific result. In the numerical applications, it is hence convenient to fx the scale factor q, to use a real orthonormal basis @ of order m, and to calculate all the m eigenvalues &k of the effective Hamiltonian Heff(1) on the real axis and in the complex plane. One then studies the behaviour of these eigenvalues as the order m of the basis increases and looks for "stable" complex eigenvalues between the ray (-2a)and the real axis - often by means of "stabilization graphs". One then repeats the calculation for a series of values of the scale factor q. observing that the results are often discontinuous functions of the rotation angle a. An example of such a calculation will be given in the next section.
222
Per-Olov Uwdin, Piotr Froelich, and Manoj Mishra
5. Numerical Application of the Complex Scaling Method in the Hartree-Fock Approximation. The calculation of physical resonances associated with the existence of meta-stable states with a finite life-time plays an important role in modern physics. They occur e.g. in scattering problems during the interaction between atoms/molecules with incoming particles or with electromagnetic fields. A simple example may be given by considering a so-called s h a p e resonance arising in the course of the scattering of an electron by an atom. This resonance is associated with the temporary capture of the electron by the target atom (A) resulting in the formation of a metastable negative ion (A-)*. which then decays by an electron emissiong: e+A
+
(A-)* + e + A ,
(5.11
This process is most efficient if the incident electron has a resonance energy K, given by the difference of the average energy E, of the composite structure (A-)* and the energy Eg of the ground state of (A). Conventionally, the energy of (A-)*is expressed as a complex number E = E, - ir/2, where E, indicates the middle of the energy level and r gives its half-width. In this way, the study of slow electron-atom collisions can be essentially reduced to the study of the resonances of the negative ions. Resonances may be calculated in many different ways, and during the last decade - the complex scaling method has turned out to be particularly powerful for this purpose, since it has been shown that the complex discrete eigenvalues revealed in this approach are identical with the physical resonances; for a survey of this method, the reader is referred to paper A and references given there. I t is evident that, in calculating the eigenvalues of the transformed Hamiltonian HU = UHU-', methods based on the use of superposition of configurations (CI) are going to be particularly useful. However, it may also be of some interest to find out how much of the resonances may be found in the Hartree-Fock scheme and how much requires more deep-going correlation calculations. The start of the present work was based on the idea that at least some of the "shape resonances" could be detected already in the Hartree-Fock approximation applied to the scaled Hamiltonian.
Many-Particle Hamiltonian Transformations
223
I t is evident that, in order to solve this problem, the SCF resonances of the composite system (A-)* ought to be studied in full detail. However, in the numerical application described below, we have simplified the problem even further by assuming that one may obtain the resonances of the ionic system (A-) at least approximately by studying the "meta-stable" virtual (unoccupied) orbitals in the dilated SCF calculation performed on the target atom (A) alone. The details of such an approach for the study of a resonance formation and decay are given elsewhere lo.
By using this approach, we have studied the 2P shape resonance in the e-Be scattering problem. The spectrum of the bi-variationally obtained effective Hamiltonian for the beryllium atom (1s22s2) has been investigated, and this spectrum and its behaviour are presented in Figures 1-4. Even at this crude level of approximation, it seems possible to detect a "trace" of the resonance, i.e. to distinguish between the resonant virtual orbital eigenvalue from other virtual eigenvalues describing the dilated scattering solutions". In spite of the crudeness of the model, the energy and width of the resonance show reasonable agreement with those obtained with more accurate methods: see Table 1.
Per-Olov Liiwdin, Piotr Froelich, and Manoj Mishra
224
Figme 1.
- 0.080.00
-
-
I -,',,:
>
!!! c (
=0.7500
( i n i t ) =O.OOO
= 0.0200
i
-0.40-
h
E
o(
Ad
-0.16-
-0.24
f
-0.48 -
- 0.56-
-0.64
-
Fig. 1. Trajectory for the real and imaginary parts of the resonant orbital energy E denoted by dots ( 0 ) and the nearest scattering root (0) as functions of the rotation angle a in radians. The starting value is a = 0.0 a t the top right (ImIEI = 0.0).and the increment A a is 0.02. The real factor p in the scale parameter q = p exp(ia) is here fuced a t p = 0.75.
Many-Particle Hamiltonian Transformations
225
Figure 2a.
Fig.2a. Trajectory for the real and imaginary parts of the resonant orbital energy E denoted by dots ( 0 ) as a function of the real scale factor p. The starting value is p = 0.65 at top - left, and the increment Ap is 0.025. The rotation angle is fixed at a = 0.38.
Per-Olov Lawdin, Piotr Froelich, and Manoj Mishra
226
Figure 2b.
- 3.65
1\
-3.75 -3.85 -3.95
-
?(init)
= 0.650
A ? xO.0250
\
-
- 4.1 5 -
0,
= 0.4000
\
-
-4.05
4
X‘
-
w
v
E -4.25
‘i
-
1
c
1 1
\
-4.35
-4.45
I
-4.55 0.40
1
1
0.48
1
1
0.56
1
1
0.64
1
1
0.72
1
I
1
0.80
0.88
0.96
Fig. 2b. The same trajectory as in Fig. 2a. but with the rotation angle fured at a = 0.4.
227
Many-Particle Hamiltonian Transformations
Figure 3a.
-1.12
0.36
I
l
0.44
l
l
l
0.52
-
l
0.60
Re ( E ) eV
-
l
l
0.68
l
l
0.76
1
1
0.84
1
I
0.92
Fig.3a. Trajectory for the nearest scattering root (denoted by o in Fig. 1) as a function of the real scale factor p. The starting value is p = 0.65 at top left, and the increment A p is 0.025. The rotation angle is futed at a = 0.38. Note that the slope of the line corresponds to the value of a expected from the complex scaling theorem (AIm[EI/A Re[El = tan 2a).
228
Per-Olov Uwdin, Piotr Froelich, and Manoj Mishra
Figure 3b.
-1 .I 6
0.36
0.44
0.52
0.60
-Re(E)eV
-
0.68
0.76
0.84
0.92
Fig. 3b. The same data as in Fig. 3a, but with the rotation angle fxed at a = 0.4.
229
Many-Particle Hamiltonian Transformations
0.1 6
0.08
1
Figure 4.
p = 0.7750 o((init)
= 0.000
-0.68
- 0.56
7.00
7.08
-
7.16
12L 7.32 R e ( E ) eV XIO-’
7.40
-
748
7.56
Fig.4. Trajectory for the real and imaginary parts of the resonant orbital energy E denoted by dots ( 0 ) as a function of the rotation angle a with the starting value a = 0.0 at top right and the increment A a = 0.02. The real factor is here fixed at the optimal value p = 0.775 as determined from Fig.2. Note that the distances between consecutive points ( 0 ) are monotonically increasing in the beginning, but noticeably slowing down at the end which resembles a cusp. This phenomenon is associated with a resonance: for further details see reference 10.
Per-Olov Liiwdin, Piotr Froelich, and Manoj Mishra
230
Table I. Energy and Width of the 2P Resonance in the e-Be Scattering from Various Calculations.
a b C
d
e f
h i
Model Potential Static Exchange Phase Shift Static Exchange Plus Polarizability Phase Shift Static Exchange Phase Shift Static Exchange Plus Polarizability Phase Shift Static Exchange Cross Section Static Exchange Plus Polarizability Cross Section Complex SCF Coordinate Rotated Static Exchange Second-order Dilated Electron Propagator Based on RealSCF Complex SCF Complex CI: Singles, Doubles Singles, Doubles, and Triples Second-order Dilated Electron Propagator Based on Complex SCF This work Bivariational SCF
0.60 0.77
0.22 1.61
0.20 0.75
0.28 1.64
0.14
0.13
1.20
2.6
0.16 0.70
0.14 0.51
0.76
1.11
0.57 0.69 0.58 0.32
0.99 0.51 0.38 0.30
0.62 0.70
0.60 0.84
____________________-----------------------------
a J. Hunt and B.L.Moiseiwitsch, J.Phys, B 3.892 (1970). b H.A. Kurtz and Y. O h m , Phys.Rev. A 19,43 (1979). c H.A. Kurtz and K.B.Jordan, J.Phys. B14. 4361 (1981) d C.W. McCurdy. T.N. Rescigno, E.R. Davidson and J.G. Lauderdale. J.Chem.Phvs. 73,3268 119801. e T.N. Rescigno. C.W. CuGdy and A.E. Orel, Phys.Rev. A 17,1931 ( 1 978) f R.A. Donelly and J. Simons, J.Chem.Phys. 73,2858 (1980). g M. Mishra. 0. Goscinski, and Y. Ohm, J.Chem.Phys. 79.5495 (1983). h J.F. McNutt and C.W. Curdy, Phys.Rev. A 27,132 (1983). i M. Mishra, 0. Goscinski. and Y. Ohm, J.Chem.Phys. 79,5505, (1983).
Many-Particle Hamiltonian Transformations
231
6. Conclusions and Acknowledgements.
The motivation for the start of the research reported in this paper was a desire to find out if at least a "shape resonance" could be detected in the Hartree-Fock approximation applied to a scaled Hamiltonian of the type described by relation (4.12). Using computer programs available for other purposes, programs for solving the dilated SCF equations could be constructed. In starting from a real orthonormal one-electron basis, it was found that all relevant matrices were complex symmetric, which greatly facilitated the computations, that many of the oneelectron energies were complex, and that some of them showed certain "stability properties" indicating that they may be related to resonances. Even if it was evident that the one-electron spectra {&k}were discontinuous functions of the rotation angle a. it was still of interest to study what happened when this angle approached zero. If the Hartree-Fock functions D(P undergo a transformation y' =
uoop, and all the functions are situated in the domain of u, then the one-particle operators p and H,ff(l) defined by (3.20) and (3.23), respectively, undergo the transformations (3.2 1) and (3.28) here referred to as the Tarfala theorem I t is also assumed that the functions $ are situated in the domain of (ut)-'. and we note that - in the complex symmetric case - one h a s a$ = y*-l, and that, since (ut)-'= u*, this condition is automatically satisfied. Since the effective Hamiltonian undergoes a similarity transformation, the spectrum {&k} for the occupied orbitals ought to be invariant, and - at first sight - the Tarfala theorem seemed hence to be in contradiction to the numerical results already obtained. This puzzle was solved when it was realized that the complex symmetric Hartree-Fock scheme does not necessarily reduce to the conventional one when the Hamiltonian becomes real and self-adjoint for cc = 0. This result is also in agreement with some conclusions previously reported 11.12 .
I t is interesting to observe that, if one limits oneself to study the original real and self-adjoint Hamiltonian with Ht= H = H* and applies the complex symmetric Hartree-Fock method to this particular case without any transformations whatsoever, one
232
Per-Olov Uwdin, Piotr Froelich, and Manoj Mishra
ought still to obtain some complex one-electron energies, which depend on the starting point for the iterative SCF-procedure involved. By relating them to results of the complex scaling method, one could still draw the conclusion that they should be associated with physical "resonances" to the same extent a s this method. Numerical Hartree-Fock programs dealing with complex matrix elements have been available for quite some time in various research groups, and one could hence anticipate that several researchers should have encountered the occurrence of complex one-electron energies. In going over the literature, we have so far found at least one example that this phenomenon has been observed before13, but it seems likely that there should be many more. We hope that the analysis given here should help in clarifymg the nature of the complex symmetric Hartree-Fock scheme. In conclusion we note that one of the most urgent problems in this area of research is to find the connection between the exact theory and the numerical approximations based on the use of a finite basis of order m when the value of m increases and the basis - in priciple - tends to become complete.
Acknowledgements. This paper was started during the 1981 Workshop held at the Tarfala Glaciolog station in the Kebnekaise mountian region in Lappland, Sweden. In these international workshops, every day devoted to scientific discussions was usually followed by a day of climbing - a combination leading to a very special form of scientific and personal friendship between the participants. This paper is respectfully dedicated to the memory of the late Professor Walter Schytt, who during many years together with his wife Anna-Nora was host for these workshops and for the International Summer Institutes in Quantum Chemistry and Solid-state Theory arranged by the Uppsala Quantum Chemistry Group in collaboration with the Florida Quantum Theory Project. The warm hospitality of the Schytts is hereby greatfully acknowledged, and one of the theorems in this paper is called the Tarfala theorem in their honour. - The authors are also thankful to Mrs. Karin Lowdin for her valuable help in equipping the field part of the Tarfala workshops during many years. They would also like to thank Drs. Jean-Louis Calais and Erkki Briindas for many valuable comments in the discussions.
Many-Particle Hamiltonian Transformations
233
Appendix A. Treatment of One-Particle Subspaces of Order N. which are Stable under Complex Conjugation. Let us consider a subspace of the L2 Hilbert space, which is spanned by the linearly independent set ~ p o= {yfl. yf2, .... , y f ~ of ] order N. Such a subspace is said to be stable under complex conjugation (*), if
Introducing the metric matrix A = cp~pI ~ p p > and the reciprocal set I y?> = 1, and multiplying the yr= ~ d pA- having the property relation (A. 1) to the left by c yrI , one obtains
which is our explicit expression for the expansion matrix of order NxN. This leads also to the relations= ma,
= at A,
(A.3)
where the second relation is obtained by taking the adjoint of the first. One has now the following theorem: The nessary and sufficient condition that the subspace spanned by the set y is stable under complex conjugation is that the matrix a =satifies the relation
That the condition is necessary follows from the fact that, if p?* = ya, one has y?= y*c[a*,which immediately gives a-'= a*. In order to prove that the condition (A.4) is sufltcient. we will consider the diagonal elements of the matrix:
234
Per-Olov Uwdin, Piotr Froelich, and Manoj Mishra
Q =,
(A.5)
when a is defined by the relation (A.2).Using (A.3), one obtains directly: Q =- < y*ry>a- at +
+ a ta =
=
Dpp*>
-
at Aa
-
at A
Q
+
-A-lA A - l < y I y>=
={ - -I< Dop I Dgp*>]
={(a-l)*Q }
= 0,
= (A.6 )
i.e. all the matrix elements of Q - diagonal as well as nondiagonal - are identically vanishing. Hence, one has necessarily y*-ya = 0, which completes the proof. In relation (3.76).we are considering a stability relation of the type
where - for the sake of generality - we have replaced y" b y y 1 and a by 6. Here the operator v = u ut has the special property -~ (uut )* = v*. which gives v-'= ( u ~ ) - ~=uu*(ut)*= v = vt = (v-1) *
(A.8)
and we note that also v is connected with a restricted transformation. Multiplying relation (A.7)to the left by
g =-l .
(A.9)
Many-Particle Hamiltonian Transformations
235
One has now the theorem: The necessary and sufficient condition that the subspace spanned by the set yl is stable under complex conjugation combined with the operator v = uut is that
g-1 = @*.
(A. 10)
A simple proof is obtained by introducing the transformed functions opo = u-'y1 and reducing (A. 10) to (A.4). However, even
the direct proof is simple.
For this purpose, we will proceed as follows. In order to prove that the condition (A.10) is sujJcient, we observe that, if @ - l = g*. one has a n alternative form for the matrix f3 defined by (A.9):
g = ( p*,-l
= <9op1 I yl*>-l
(A. 1 1 )
Let us now consider the diagonal elements of the matrix Q1=
u-1pM#I u ty,* - U-lylg > =
Using the form (A.9) for @ in the last term, one finds that the last two terms cancel. Using the alternative form ( A . l l ) for in the second term, one obtains Q1
=
-= 0,
(A.13)
Hence all the matrix elements of Q1 - diagonal as well as non-
236
Per-Olov Uwdin, Piotr Froelich, and Manoj Mishra
diagonal - are identically vanishing, and one can draw the conclusion that utyl*- u - l w l b = 0 , i.e. that vy1* = The proof is then completed. I t is then fairly easy to test whether the stability conditions (3.76)are valid or not. It should be observed that there are also other forms than (A.9)for the expansion matrix g. Multiplying the relation (A.7)to
the left by I , where x = {xl,x 2 , .... , XN ] is an arbitrary set of N linearly independent functions, one obtains
Jp= <x I PBp1>-1<xI vl yl*> provided that the matrix A, = special choice
x
= v* ~
1
<x I y1>is
(A. 14)
non-singular. For the
one , obtains e.g. that
(A.15)
=
which is a positive definite matrix. As before, one proves that the relation 6-l = B* must necessarily be valid, and one has hence also the alternative form (
f3*,-'
=
<x* Iv- 1 I DPp#<x*
I D(p1*>.
(A.16)
which corresponds to the introduction of the set v*x* instead of x in the original formula (A.14). We note, however, that - in the case of a general set - one can no longer expect that the condition f3 = ( @ * ) - lshould be sufficient for the required stability property, and that our proof is limited to the specific choice x = v*yl. This completes the appendix.
Many-Particle Harniltonian Transformations
237
.
References
1. P.-0. Liiwdin, Advances in Quantum Chemistry (Academic Press), vol. 19 (1987).This is the first part of this paper and is here referred to as reference A.
2. P.-0. Liiwdin. J.Math.Phys. 24, 70 (1983).This paper is here referred to as reference B.
3. P. Froelich and P.-0. Lowdin, J.Math. Phys. 24, 88 (1983). This paper is here referred to as reference C. 4. P. Froelich and P.-0. Lijwdin. submitted to Int. J.Quantum Chem. This paper is here referred to as reference D.
5. P.A.M. Dirac, "The Principles of Quantum Mechanics", (4th ed. Clarendon Press, Oxford 1958). 6. V. Fock, ZPhysik 61,1266 (1930); P.A.M. Dirac. Proc. Cambridge Phil.Soc. 26, 376 (1930),27, 240 (1931). 7. D.R. Hartree, Proc. Cambridge Phil.Soc. 24, 89 (1928); V. Fock, Z.Physik 61. 126 (1930): J.C. Slater, Phys.Rev. 35,210 (1930). 8.
P.-0. Lijwdin, Revs. Mod. Phys. 35, 496 (1963).
9. G.J. Schultz. Revs.Mod.Phys. 45. 378 (1973); H.S. Taylor, Adv. Chem.Phys. 18. 91 (1970).
10. M. Mishra, Y. Ohm, and P. Froelich, Phys.Lett. 84, 4 (1981): M. Mishra. P. Froelich, and Y. O h m , Chem.Phys.Lett. 81, 339(1981); M. Mishra, 0.Goscinski, and Y. Ohm, J.Chem.Phys. 79, 5494 (1983); 79, 5505 (1983): M. Mishra, H.A. Kurtz, 0. Goscinski, and Y. Ohm. J.Chem.Phys. 79, 1896 (1983).
11. E. Brandas, P. Froelich, C.H. Obcemea, N. Elander, and M. Rittby. Phys.Rev. A 26, 3656 (1982): P. Froelich and E. B r a d a s . 1nt.J.Quantum Chem. 23. 91 (1983);E. B r a d a s . 1nt.J.Quantum Chem. QC Symp. 20, 119(1986); P. Krylstedt et.al., J.Phys. B 20, 1295 (1987).
12. P. Froelich, J.Math. Phys. 24, 2762 (1983). 13. C. W. McCurdy, J . L . Lauderdale. a n d R.C. Mowrey, J.Chem.Phys. 75, 1835 (1981).
This Page Intentionally Left Blank
NEWTON BASED OPTIMIZATION METHODS FOR OBTAINING MOLECULAR CONFORMATION John D. Head* Michael C. ZerneiQuantum Theory P r o j e c t University of Florida Department of Chemistry Gainesville, Florida 32611 and Guelph Waterloo Centre for Graduate Work in Chemistry University of Guelph Department of Chemistry Guelph, Ontario, N1G 2W1 CANADA
*Present address: University of Hawaii, Department of Chemistry, 2545 The Mall, Honolulu, Hawaii 96822 USA. ADVANCES IN QUANTUM CHEMISTRY, VOLUME 20
239
Copyright 0 1989 by Academic Press. Inc. All rights of reproduction in any form reserved.
240
John D. Head and Michael C. Zerner
TABLE OF CONTENTS 1. 2. 3. 4. 5. 5.1 5.2 5.3 5.4 5.5 5.6 5.7 6. 7. 8. 8.1 8.2 8.3 8.4 8.5 8.6 9.
Introduction Characterization of Minima Optimization Algorithms Line Search Strategy Search Directions Steepest Descent Newton's Method Quasi-Newton Methods Approximate Analytic Hessians Restricted Step Method The Augment Hessian Method Conjugate Gradients Analysis of Search Direction Methods Computational Procedures Results General Conclusions Newton's Method Quasi-Newton Methods with Unit Hessian Quasi-Newton Methods with Non-Unit Starting Hessians Approximate Hessians Augmented Hessian Conclusions Appendix Acknowledgements References
Newton Based Optimization Methods
1.
241
INTRODUCTION
One of the most important advances in molecular quantum mechanics of the last fifteen years has been the increased ease and accuracy of predicting molecular conformation through purely theoretical models [l-41. The stable geometries of molecules as yet unmade can be determined and molecular properties of these systems studied. The conformations and properties of electronically excited molecules can be examined theoretically, whereas experimentally these geometries are often very elusive. Transition states, the species that separates reactants from products, and that have but fleeting existence -- if at all, experimentally -- are also obtainable using molecular quantum quantum mechanics. Much of this progress has been centered around the discovery that energy derivatives can often be explicitly evaluated [3-71, freeing potential energy searches from inefficient minimization algorithms such as uniaxial methods, simplex methods and even numerical derivative methods [ 8 ] . We might classify methods that search potential energy surfaces as [ 9 ] : 1. Methods without derivatives, 2. Methods with numerical gradients (first derivatives), 3. Methods with analytic gradients, and 4 . Methods with analytic gradients, second derivatives (Hessian). with obvious extension and divisions. Methods of the first type such as the simplex methods [ l o ] are simply not competitive with the more modern methods which obtain gradients (except for diatomic molecules). Since first derivatives for nearly any wavefunction can be obtained analytically, at least with the help of the coupled perturbed Hartree-Fock Theory (CPHF) 1111, methods that use only numerical gradients will not be competitive except, again, for very small systems. Analytic second derivatives are generally time consuming to obtain, even for Hartree-Fock
242
John D.
Head and
Michael C.
Zerner
(molecular orbital) type wavefunction. The evaluation of second derivatives inevitably involves a t least the CPHF and, as such, is a factor of N more difficult than obtaining only the gradients, where N is the size of the basis set. Because of this, methods of the third type, those that use analytic first derivatives and numerical second, are important and are likely to remain important. In this paper we examine various optimization procedures that build up information on the second derivatives as they minimize the energy. We focus in this work only on energy minima, although the connection with techniques f o r finding saddle point, such as transition states, is reasonably straight forward, at least in concept [12,13]. The descriptions of the various potentially useful methods is reasonably complete, although we have chosen to give examples only f o r the methods we have found most promising. For comparisons we do calculate the exact Hessian matrix and perform true Newton steps, although, except in especially difficult cases, we do not recommend this time consuming procedure. On the other hand, i f vibrational spectra about a minima are to be examined, O L i f a transition state is sought, a Ifgood" Hessian is required. The update procedures that we examine are not reliably Itgood" enough for these purposes. In the next two sections we briefly detail the problem. Sections 4 and 5 describe two aspects of optimization algorithms, the line search and the search direction. These sections are somewhat lengthy as befits " a r t , " f o r there i s still a good deal of artistry in optimizing functions that do not explicitly show dependence on their independent variables. In section 4 we give an overview of the various suggested procedures in the context of a normal mode analysis where the "physics" of the situation is somewhat more transparent. In Sections 7 and 8 we present results for the more successful approaches. The calculations presented are of the Hartree-Fock self consistent field type utilizing the INDO11 [14,15] model Hamiltonian. Although the force constants from
243
Newton Based Optimization Methods
such calculations are generally twice as large as those obtained from ab-initio results [161, this nearly constant scaling can be shown not to affect the generality of the conclusions presented in the last section of this paper.
2. CHARACTERIZATION OF MINIMA We assume that the energy hypersurface E ( x ) is a unique single-valued function of the nuclear coordinates 5 . We further assume E(x) - and its derivatives with respect to 5 are continuous. Under these conditions, the second order Taylor series for the energy hypersurface E(5) is well defined. E(xk k gi
aE axi
=
Gij
k ' ) = E(x ) + g
+
Ix
=
xk
E'a
=
~x . a xIx i i
-
=
.x k
where gk = g ( xk ) is a column vector of the energy gradients, and Gk is the matrix of second derivatives, called the Hessian Matrix. sk is the "step" vector, given by sk = xk + l - x k A strong local minimum with E(X*)
occurs at
<
~ ( x )for all x
* x
*
x*
when
*+s
g =Oorg
* G > o
#
(2b)
= O
or S + G * ~ >
o
for all s
(2c)
Condition (2c) requires the Hessian matrix to be positive definite; that is, the eigenvalues of G* are all greater than zero.
244
john D. Head and Michael C. Zerner
The minimum usually found in a geometry optimization using Cartesian coordinates is a "weak" minimum, distinguished by the equality of
*
E(x )
E(x), xfx
*
The equality of Eq. (3a) arises because of the translational and rotational invariance of the energy. At the weak minimum we also have
*
g
= o
G*
2
0
*
where the Hessian matrix G has 6 (or 5) zero eigenvalues corresponding to the rotational and translational modes; the remaining 3N - 6 ( 5 ) eigenvalues should all be positive [17]. The 6 ( 5 ) coordinates corresponding to translations and rotations can be removed by a suitable choice of internal coordinates. However, the energy expression is invariant to this removal, and to any choice of linearly independent coordinates. This invariance property means that there is no particular advantage in an optimization procedure for any given choice of coordinates providing the initial steps (see below) are the same. Of course, it may be easier to constrain chemically meaningful coordinates, such as a bond length or angle, in an internal set of coordinates. A potential energy surface in general will have many extreme * +* or stationary points, all characterized by g = 0 . If s G s > 0 , * Eq. (2c), the point x represents a minimum. Such a minimum may be local or the global minimum; the latter by definition will have the lowest energy in the hypersurface. All the methods presented in this paper are only capable of finding local minima. In contrast, a transition state with positions given by x f is characterized as an extreme point having all eigenvalues of the Hessian matrix positive except one and only one which is negative
[18]. On such a "saddle point" the energy increases in all directions except along the "reaction path." Along this
245
Newton Based Optimization Methods
direction E(x # ) is a maximum separating two local minima. Other extreme points with two or more negative eigenvalues of the Hessian are of less immediate interest in chemistry.
3.
OPTIMIZATION ALGORITHMS
Optimization methods are generally based on the typical algorithm as follows [ 9 ] : Choose an initial set of coordinates and gradients (variables) xo and calculate the energy E(x') g(xo), then
k
1. Choose a search direction s , then 2. Perform a line search to determine how far along this k direction to move to reduce the value of E(x ). In the simplest case one chooses an a value in xk + l k us , then 3.
4.
=xk+
If the system has not converged, increase k to k + 1, and go to Step 4 . I f convergence is obtained, the optimization is complete. Calculate or estimate G a new Hessian at x and
Test for convergence.
+
+
repeat Steps 1 through 4 . The information gained about G in Step 4 is used to choose the search direction sk of Step 1. Generally the first +
search direction is chosen along the direction of steepest descent although other choices, as discussed below, are possible. An efficient algorithm balances the work of finding the best search directions ( sk ) against that of finding the optimal distance to move along that search direction (a).
4.
LINE SEARCH STRATEGY
The aim of the line search is to determine the appropriate displacement along a search direction. k
This is, if at the point
xk we have found a search direction s , the k + 1 point is that
246
John D. Head and Michael C. Zerner
given by k + 1 = x k + aksk
X
(4)
where ak is obtained from the line search and should be such that k E(xk ') is "reasonably" less than the pievious energy E(x ) . The decrease in energy should be reasonahle enough to ensure the optimization algorithm converges t o the local minimum. The weak k + l ) < E(xk ) can, of course, by itself requirement that E(x result in very slow convergence. Figure 1 illustrates a typical variation of energy with respect to a along the search diiection s k . In an exact line search, the minimizing value am, which gives a minimum in E(xk l ) , is located. This often requires several energy and gradient evaluations along sk . Such calculations may be more useful in selecting alternate search directions. The more efficient approach is to perform an inexact 0 1 "partial" line search where an a value is selected to givc a reasonable decrease in the function. For example, in Figure 1, an appropriate a might lie in the interval (a1,cx7). The problem now is to determine from E(xk + 1) and t: appropriate value of al and a 2' The energy along the Serlrrh direction sk up to I quadratic terms is given by Figure 1 +
+
+
E(X
k
+ ask
= E(X
k
+ ag+(x
k
.
sk
1 2 k+Gsk
?a s
(5)
The energy at xk + ask will be lowered providing
Inequality ( 6 ) , called "the descent condition," selects the sign required for the search direction sk t o ensure a reduction in the
Newton Based Optimization Methods
247
k energy. The minimum E(xk + % s ) value is found by differentiating (5) with respect to 0:
aE
aa -
.
g+(xk)
sk + ask+&
=
k
g+(xk + 0s
)
*
s
k
(7)
Thus the exact line search requires k +
g+(x
k)
QMS
.
sk
=
0
(8) k
I<
that is, the gradient g at x + %s is perpendicular t o the search direction sk (see, for example, Figure 2). An estimate for the reduction of the energy can he obtained from E q s . (5 -
SK /
Figure 2
with the greatest reduction - 1 / 2 % s k C . gk when a equals %. When a > QM, there is a reduction in t h e energy if
John D. Head and Michael C. Zerner
which implies that
k+ - gk < E k - E k t l < -1 sk+ . gk - 2%
0 S - p . ~
when 1
(1lb)
O l P S Z
Equations ( l l ) , originally suggested by Goldstein 1191 motivate the test for whether a is less than a2. For a to be inside the right extreme of the line search interval (al,a2), E
- E
k + l > - p u sk+ . gk
(12)
-
when p = 1/2 the line search becomes exact. For the left extreme a test similar to E q . (12) could be used. However, a more successful test in our experience is based on the gradients 1201. Since the Hessian G is positive definite, Eq. (7) gives (13)
the gradient vector component indicating for the new point X along sk should decrease. A successful test for the left extreme a1 is +
An exact line search is obtained when u = 0. Occasionally, and especially when starting with a poor k ' 1 > Igk 1, even when initial geometry the gradient norm Ig Ek < Ek. Such a situation is depicted i n Figure 2. When this occurs we use the "cosine test" of Eq. (14b) +
rather than the test of Eq. (14a); that is, the cosine of the angle between the step direction and the gradient along that
Newton Based Optimization Methods
249
direction. This is because the new gradient becomes perpendicular to the direction of the step as the line search becomes more exact regardless of the magnitude of the gradient. With left and right extremes of o! as chosen from Eqs. (12) and (14) the optimization procedure can be shown to converge, providing gk+ . sk # 0 . Numerically, however, the condition that gk+ . sk 0 results more often than appreciated. In the Newtonlike methods described below, this pathology is associated with ill-conditioned Hessian matrices. We discuss further this point in Section 6. One additional constraint is needed in the discussion of line searches. The condition,
-
safeguards against large coordinate changes which might accidentally lead to points outside the region of interest (in the neighborhood in which we have initiated the search), or beyond the neighborhood that we might reasonably expect quadratic behavior. In our update procedure, after considerable numerical experimentation, we have adopted the following line search algorithm which is a variant of Fletcher [9a]. Initially an interval ( aT1 ,aT2 ) is chosen, where the superscript T emphasizes trial values. A of Eq. (15) is taken as 0 . 4 a.u. when x is T chosen as Cartesian coordinates. a2 is initially taken as 0. T We then calculate a2 from E q . (15). OL is initially set to 1 as suggested by a quadratic potential, unless a.I < 1 in which T case we set a = a2. We then evaluate E(xk + ors ) and perform the right extreme test given by E q . (12). If this test fails u > a2 (as opposed to aT2 ) and is reduced in value using the interpolation formula
-
z
250
John D. Head and Michael C. Zerner
a
new
= : a
+ 1
EI ( a lT) = g ( x+
(a
k
- alT)
T k
[i
+ a l s ) .
k k + E(xk) - E(x + cws ) T ' T ( a - a1 ) E (a1 ) sk =
[E]
a
=
=1
T
The energy at xk + anewzk is now computed and the right line search test is repeated with a2 = a and a = a new until Eq. (12) is satisfied. The gradient g(xk + ask ) is now evaluated and used in the left extreme test, E q . ( 1 4 ) . If the left extreme test fails then a is too small. A trial a new is computed from the extrapolation formula
that a < an providing E ' ( a l ) > E'(a). The energy is reevaluated at xI? + a sk and both the right and left line new T = a and a = a searches are repeated with al new' When both tests have been successfully completed, the partial line search is finished and a new search direction is sought. The line search algorithm outlined is parameter dependent. After much numerical experimentation, we choose u = 0.9 and p = 0.03, parameters which constitute a fairly weak line search. In our experience it is more effective to change search directions than to determine a more exact minimum of energy along a given search direction. In practice a is usually equal to unity. When so
a line search fails, however, the subsequent treatment o f a can have a marked affect on the efficiency of the optimization. For steepest descent search directions, after the first step in any update procedure where the Hessian matrix is as yet unknown, aT = 0.4 is most useful.
251
Newton Based Optimization Methods
5.
SEARCH DIRECTIONS
5 . 1 Steepest Descent.
The search direction in steepest descent is given by the negative of the gradient s
k
k
(17)
= -g
While the steepest descent search direction sk can be shown to converge to the minimum with a proper line search, in practice the method has slow and often oscillatory behavior. Most QuasiNewton procedures, however, make this choice for the initial step, for there is no information yet on G. When working with different coordinates, q , the Taylor series expansion suggests
a+
*
g =
a;
*
gq
where 8 = B & 9
defines the new set of coordinate steps S
q
.
From Eq. (18)
B = B+ gq
leading to sk
4
=
B sk
=
-BB+~
q
as the proper replacement for E q . (17). Failure to use Eq. ( 2 1 ) correctly for the steepest descent can lead to symmetry breaking ( 2 1 1 . It has also on occasion led to incorrect conclusions on the importance of linear coordinate transformations. 5.2 Newton's Method.
Newton's method includes the quadratic nature of the energy hypersurface in the search direction computation. Consider the Taylor's series for the change in gradient given by
252
John D. Head and Michael C. Zerner
g(xk
k + Gk . sk
= g (x )
')
+
at an extremum g ( x * ) is zero. Hence a step in the direction of minimum is predicted by sk
= (x
* - xk )
=
-G-1g(xk)
(23)
Equation ( 2 3 ) is used to generate search direction sk for all the Newton-like algorithms. The actual s t e p taken tik = ask is determined by the line search and allows f o r the non-quadratic part of the function. In molecular quantum mechanics, the analytical calculation of G is very time consuming. Furthermore, as discussed later, the Hessian should be positive definite t o ensure a step in the direction of the local minimum. One solution to this later problem is t o precondition the Hessian matrix and this is discussed for the restricted step methods. The Quasi-Newton methods, presented next, provides alternative solution to both of these problems.
5.3 Quasi-Newton Methods. The Quasi-Newton, or variable metric, methods of optimization are essentially based on the algorithm 1. Set search direction s k = -Hkgk . k k 2 . Perform line search to give xk = xk + a s k + l 3 . Update Ek to E where the search direction is essentially given as a Newton step, Eq. ( 2 3 ) , but with Ak approximating the inverse Hessian. Hk avoids the explicit evaluation o f G-' and is updated on each algorithm cycle. The inverse Hessian A is updated by relating the differences in the gradients +
k + l
Y
k + l
= g
k
-g
to the differences in the coordinates
*
.
Newton Based Optimization Methods
253
For a quadratic function, the coordinates and gradients are related by Hk
yk =
+
ak
(26)
Equation (26) is the "Quasi-Newton" condition; it is fundamental to all the updating formula. There have been many updates proposed, and we briefly review some of the more important ones. The simplest are based on
H~
=
+
H~ + E~
=
k E + alu>
(27)
where I u> is a vector; I u>
k E
k
+ alu>
Lk
(28) k k and hence 1 u> is proportioned to I g > - H I y >. Taking, for simplicity, the scale a = -l we obtain from Eqs. (27) and y
=
k
(28)
where the k superscript has been dropped f o r convenience. Rank one formula of this type have been proposed by Broyden 1221, Davidon [ 2 3 ] , Fiacco and McCormick [ 2 4 ] , and Murtagh and Sargent 1251, and examined for their utility in molecular quantum mechanics by many workers [ 2 , 2 6 ] . A common approach used is to 1 take E as a unit matrix, and thus the first step s1 is in the direction of steepest descent. An alternative procedure, potentially useful when the initial search is near the minimum of energy, is to start the Hessian as a diagonal matrix of known force constants for bonded atoms that have been tabulated for the model at hand (27,28). In such a case, internal coordinates should be used, Eqs. (20) and (21). In later cycles more of the quadratic nature of the energy is included in Ek . Unfortunately,
John D. Head and Michael C. Zerner
254
this simple rank one update procedure has two serious disadvantages. The positive definiteness o f the inverse Hessian may be lost on updating. Secondly, the denominator in E q . (29) may become close to zero, with resulting loss in numerical precision. Murtagh and Sargent [ 2 9 ] have developed successful algorithms for detecting these problems but not correcting them. Often the only solution is to reset the inverse Hessian back to the unit matrix. This results in the loss of all the previous curvature information. Nevertheless, the Murtagh Sargent variant of the rank one formula has in the past been the most successful in molecular geometry searches. There is a whole class of rank two update formula, the two most important members are
where the subscript DFP indicates Davidson, Fletcher and Powell [30] who developed this update; and
'BFGS+
< > (1 *) <&Iy) E
=
+
JS><S
+
suggested by Broyden, Fletcher, Goldfarb and Shanno (BFGS) [31]. The complete family of rank two update is given by a linear combination of the DFP and BFGS formula 1321
and is known as the Broyden rank two "family." These rank two updates have several desirable features, but perhaps the most important property is that they retain the positive definiteness of the Hessian if the appropriate line search is used. The advantage of a particular rank two formula over another in molecular problems is not clear, and for .__ exact line searches all
Newton Based Optimization Methods
255
the formula give the same search directions. A s is indicated in the result section, the BFGS formula in our experience tends to be the most successful with partial line searches. An alternative type of update formula has been suggested by Greenstadt [33]. The updating formula has the usual form
where I E> = I S> - 8 ) y>. In the GI-eenstadtformula, the positive definiteness of the inverse Hessian is not retained? nor is i t invariant to coordinate changes. In the above we have given update formula for the inverse Hessian for direct use in such equations as Eq. ( 2 3 ) . In t h e Appendix we summarize the update formulas for the Hessian itself. Such formulas may be of greater utility in methods such as t h e augmented Hessian and resricted step methods, discussed be ow. 5.4 Approximate Analytic Hessians Exact second derivatives methods require the solution of the coupled perturbed Hartree-Fock equat o n s , CPHF [11,34,35]. At the Hartree-Fock level this requires several steps in addition to the usual SCF procedure and the evaluation of the first derivatives. 1. An ao to mo transformation. Only those integrals with two occupied and two virtual indices are required?
and . This is a N5 step. This transformation step can be avoided by working directly in the ao basis [ 3 6 ] but at the expense of increased I/O. 2. Evaluation of the integral second derivatives, a step proportional t o the number of nuclear coordinates times 5 the number integrals, about N . 3. The solution of the CPHF equations, a N4 step for each 5 nuclear coordinate, or again about N .
256
John D. Head and Michael C. Zerner
Obtaining the second derivatives for a S C F wave function is thus a factor of N times more difficult than i t is for evaluating the SCF energy and the first derivatives. The resulting CPHF equations to be solved are
(1 - A) U(l) - B
=
maib.
c
-
=
0
Kai
EiiOj -
bj A E ~ a~ [E~(o) , ~ -~ ~ ~ ( -0I ) bj
'qi,
=
+ ( l - hai
, bj)
(34c) (34d)
is the matrix of first order changes in the mo where U(') coefficients
C(X)
=
c(xo)lJ(xo
U(X0
+
ax)
,
the matrix of first derivatives given by
=
U(X0)
+ 6X) +
=
CU(XO + 6 X )
+
...
and
with
8
(1)
(35a) (35b)
Newton Based Optimization Methods
PPV From Eq. (35c), regrouping terms,
or
Expanding U(l) itself as a second perturbation sequence
Equations such as (38) are immediately siiggested by the original CPHF Eq. (34),
u
=
(1 - A ) -1 B
=
C
A"B n = O
the only difference being the denominator shift by. Whereas Eq. (40) is slow to converge, E q . (38a) is not. Replacement of U(l) by in
dl)
2
a "NUC + - - ax ay
2
a slJv
c "pv axay +
aplJv aHuv
E
F
T
+
258
John D. Head and Michael C. Zerner
with
+ C*pi cvp L pi P ) generally give the second derivatives to within 50.01 aulbohr, far better than any of the update procedures. For the first iterates, V'l), only theand integrals are required, nartvirtual and "i" occupied. All the integral second derivatives are required. For semi-empirical methods the reduction of two of the three additional steps after the SCF and first derivative evaluation is enoimous, a full factor of N as seen in the results section. F o r ab-initio .____ theories, reduction of the integral transforms is approximately a factor of 4 . Realized efficiencies then depends on the relative time/rate required for the integral transform step and the derivative integral steps, and this is often a function of the computer used.
5.5 Restricted Step Methods. The Quasi-Newton methods are very powerful techniques for locating local minima. The success of these procedures can largely be attributed to the retention of the positive definiteness of the Hessian. In this section we present an alternative approach which h a s the added capability of handling Hessians which may contain negative eigenvalues. Such procedures will prove especially useful for starting geometries which are well removed from the desired minima. For example, a starting
Newton Based Optimization Methods
259
geometry may correspond to a transition state struc ure w th one negative eigenvalue in the Hessian. The restricted step algorithm should be capable of optimizing towards both the reactant and the product geometries. The restricted step method of the type discussed below were originally proposed by Levenberg and Marquardt (37,381 and extended to minimization algorithms by Goldfeld, Quandt and Trotter (391. Recently Simons [13] discussed the restricted step method with respect to molecular energy hypersurfaces. The basic idea again is that the energy hypersurface E(x) can reasonably be approximated, at least locally, by the quadratic function 1 E(x + 15) = E(x) + g+(x) . d + 2 d+G(x)d
where the step 15 is restricted in some way. In our case we restricted the step norm 181 to be smaller or equal to some trust region size A which is a measure of the extent of the quadratic approximation.
The minimum on such a quadratics surface can readily be identified. i t is either a minimum within the region, with the gradient vector g null and the Hessian matrix G positive definite or the minimum is on the boundary of the region. The step 15 towards the minimum is found by minimizing E(x + 8 ) with respect to 8 imposing the constraint from Eq. (42). Introducing a Lagrange's undetermined multiplier X we obtain the functional F
=
x (6'8 E(x + 15) - 2
-
2 A )
which on differentiation becomes
and 15 = - ( G
- XI)- 1g
260
John D. Head and Michael C. Zerner
X is a parameter which ensures that the matrix ( G
- M) is a
positive definite and that L is essentially given by a Newtonlike step. Analysis of Eqs. ( 4 3 ) reveals that a downhill search requires
where Xmin is the lowest eigenvalue of G . When X is zero we obtain the usual Newton step. If X equals zero satisfies the inequality ( 4 4 ) and the step size IS( is smaller than A, then we have the situation where the minimum is within the quadratic region. The restricted step method requires the two parameters A and X t o be defined. Various numerical tests have been suggested for varying the trust region size A dependjng upon the quality of the quadratic surfaces [ 9 , 3 7 , 3 8 ] . Until now we have defined this trust region A as fixed, as explained with the discussion on Eq. (15). It may be the nature of the energy surfaces we have examined, but very little seems gained in searching for energy minima by examining the trust region in detail. In addition, solving for A, which determines A is a non-trivial problem [ 3 8 ] . The augmented Hessian procedure described next circumvents this particular problem by introducing a natural choice for A. In our experience the Newton and Quasi-Newton methods with the line search described i s faster to reach a minimum than the restricted step method with confidence region. This latter method is a more "conservative" method as discussed in the results section. The restricted step methods, however, are very effective in searching for transition states [ 4 0 , 4 1 ] .
5.6 The Augmented Hessian Method The starting point in the augmented Hessian approach is the eigenvalue equation
Newton Based Optimization Methods
261
where the gradient g i s scaled by ci. The step s is found by diagonalizing Eq. (45a) for the lowest eigenvalue X and lowest = 1. By expanding Eq. (45a) two normalized eigenvector v2 + equations are obtained. GV + a@ g
=
XV
u= XI3
Rearranging (45b) gives the step
to be compared with the restricted step equation, Eq. ( 4 3 c ) , and Newton's equation, Eq. ( 2 2 b ) . The eigenvalues of the augmented Hessian (Xi will bracket those of the Hessian itself, (Xim). In particular, the lowest eigenvalue XI must be zero +
')
+
or negative. In the diagonalization one of the following three situation
are obtained. 1.
2.
X < 0 and B = 0 . I n this case v = v(1ow) is the eigenvector of G c-orrespondingto the lowest eigenvalue.is zero and X = X(10w). A displacement along v(1ow) will reduce the energy. Alternatively, if a displacement along v(1ow) is not desired, then v(1ow) is shifted to a higher eigenvalue by G' = G + X(shift) Iv(low)>
John D. Head and Michael C. Zerner
262
Intermediate between the two conditions is X < 0 and B # 0. G has a negative eigenvalue but s given by Eq. (45d) will give a reduction in energy along the negative eigenvector mode. The step size, I s I , depends in a non-linear way on a. Since Is1 is a monotonically decreasing function of a, interpolation can be easily used to find a step o f the appropriate length. The augmented Hessian method requires an exact Hessian, or an update method on the Hessian itself. The update formula for the Hessian analysis to the inverse Hessian appear in the Appendix. At each step in the update procedure, the procedure i s defined such that the Hessian times its inverse is unity. 3.
5.7 Conjugate Gradients Of some historic value, and for completeness, we mention that an alternative method for selecting search directions is to use the conjugate gradient approach 191. The initial direction is taken as that given by steepest desce'nt s
1
=
1 -g
then for k
>
(46a)
1, s
is given by
+
k Sk + l = - gk + l + C
BjSJ
j = 1
where B is determined by requiring all the search directions be j "conjugate" to each other; that is si + G s j = O
i #j
In the Flecher-Reeves algorithm 1421, E q . (46b) is simplified to S
with
k + l -
k + 1
- -g
+
@ksk
Newton Based Optimization Methods
263
where Eq. (46d) is exact for a quadratic function. Other k algorithms present different ways of forming @ which are equivalent for a quadratic function. The principle drawback of conjugate-gradient methods are that they are designed for use with exact line searches. They do have the interesting feature of not explicitly requiring the Hessian in the equatioiis. It is now fairly well established, however, that when there is gradient information available, the conjugate gradient approaches are not competitive with the quasi-Newton methods 191.
6. ANALYSIS OF SEARCH DIRECTION METHODS The Hessian matrix G is real and symmetric. It can then be resolved for analysis in terms of its eigenvalues X i and its eigenvectors I wi> as
Since the set Iwi> is complete, the gradient Ig> and the step Is> can be decomposed as 1
6. =
J
<w. (g> J
ll. = <w. Is> J
From Eq. (47) and Eq. ( 4 4 ) Is>
=
-
c‘i Xi i
Iw.> 1
264
John D. Head and Michael C. Zerner
ni
=
- &./A i
i
The descent condition <slg>
<
0 becomes
and is certainly obeyed if each &.n < 0 o r if Xi > 0 for all i. i i This is, of course, the case, if G o r its approximations are positive definite. When G is indefinite, the restricted step method modifies the Xi, E q . ( 4 3 ) , so they are all positive. The most common techniques when they run into trouble, however, reset all Xi to 1 as in the Murtagh Sargent procedure [ 2 9 ] . Some techniques diagonalize G , and, if A . is negative, simply change its sign. Such approaches do not systematically modify the Hessian and reduce the overall efficiency o f the optimization algorithm. The angle, 8 , between the step and the gradient is given by
where Xmin and Amax are the minimum and maximum eigenvalues of G. Uhen cos 9 approaches zero, the step and the gradient become perpendicular, leading to slow convergence. In practice this may occur when the a gradient vector has major components associated with the large eigenvalues of G , while the step has major components elsewhere. Again the restricted step method, or the augmented Hessian method, which shift a l l X equally, give a more favorable Xmin/Xmax ratio for descent. In terms of the normal mode analysis the Taylor series expansion for the energy can be rewritten as
Newton Based Optimization Methods
265
Eqs. (51) demonstrate the need for G to be positive definite. We see explicitly from Eq. (51a) that for similar step displacements the major energy contribution comes from those modes with i large eigenvalues. Conversely, E q . (51b) indicate the major energy contribution comes from the small eigenvalue modes when the gradient components si are comparable. In an optimization, the large eigenvalue modes are minimized quickly at first, until a1 the gradient components sl become similar; then larger step displacements along softer modes occur. This is illustrated in the Results Section. If ci = 0, then there will be no displacements in these directions whether Xi is zero or not. Since the energy is invariant to translations and rotations, the center of coordinates and moments about these coordinates will be conserved. In the former case i t can be shown that A 1. = 0; in the case of rotation the corresponding Xi may not equal zero except at the extreme point, but the corresponding si = 0. Since the energy is also invariant t o symmetry operations that leave the Hamiltonian invariant, the gradient vector will have nonvanishing components only along totally symmetric modes, "A" (see [ 4 3 ] for a more detailed proof). (g> =
1
EilOi>
i&A Since motion along totally symmetric modes cannot change the point group symmetry, this symmetry is preserved throughout the optimization. The exception i s at the extreme points themselves, where a higher symmetry might be obtained. Symmetry greatly reduces the number of natural parameters in the optimization problem. The number of parameters is simply the number of totally symmetric vectors in G. A t the same time, if the local minimum being sought is at lower symmetry than the starting geometry i t will not be found. For example, a planar molecule will remain planar throughout the optimization even if a non-planar structure is of lower energy.
266
John D. Head and Michael C. Zerner
7. COMPUTATIONAL PROCEDURES All calculations reported are performed within the INDO/1 semi-empirical framework [14,15]. These methods have also been programmed for the Quantum Theory Project ab-initio programs, and similar conclusions are drawn. The energy derivatives with respect to the nuclear coordinates are obtained from analytical formula. The "exact" second derivative matrix is obtained from the CPHF equations. For comparison, these results are compared with numerical second derivatives in which each coordinate xi is symmetrically displaced in turn by Sxl to give the central difference second derivative
Hence in a system of N-nuclei the formation of the numerical Hessian matrix requires 6N full SCF calculations, one for each coordinate displacement. Second derivatives are obtained much more rapidly through analytic expressions such as those of Eq. (40), than through the use of numerical expressions such as that of Eq. (53). A projection technique i s used to eliminate from G the small non-zero eigenvalues associated with the rotation and translation normal modes [ 4 4 ] . The exact inverse Hessian matrix H is obtained by diagonalizing G and discarding those eigenvectors with zero eigenvalues when back transforming to If. Selection of a convergence criterion in an optimization procedure is an important consideration. In theory, optimization is complete when g = 0 . The more practical alternative is t o ask for a maximal gradient component, or gradient norm or step size <61S>1/2, or for a minimum change in the energy, or for all of these. An assumption, often not true, is made that the norms of the gradient and step size monotonically decrease through the optimization.
Newton Based Optimization Methods
267
In practice, when all Cartesian gradients are less than atomic units, bond lengths are usually within kO.01 A and bond angles within 1-2' of their final optimized values. Torsional dihedral angles require a much finer optimization. In some cases, large changes in coordinates can be driven along very soft modes by very small gradients, E q . ( 4 8 ) . The physical meaning of optimizing along such soft modes, however, is not clear. Its justification lies only in the aesthetic.s of the mathematics.
8. RESULTS 8.1 General Conclusions, Table I summarizes resu ts f o r an arbitrary starting and the final geometry for hydrogen peroxide. The starting geometry is not greatly different from he final geometry, the largest Cartesian coordinate difference is approximately 0.02 A along the z direction. The relatively small starting gradients, also given in Table I, reflect the closeness of the initial geometry to the optimized geometry. A summary of the Newton iterations for hydrogen peroxide is given in Table 11. We report the variation of the 6 non-zero eigenvalues of the exact Hessian matrix G, four of which transform as al irreducible representations and two as a2. We see that the eigenvalues stay roughly constant throughout the optimization except for the smallest one which is initially negative indicating that we are more or less in the quadratic region of the energy surface. The eigenvalues also have a large range in values, the very small eigenvalue 0.0178 corresponds to a torsional mode of the hydrogens around the oxygen axis. Table I1 also gives the scalar products si of the Hessian vibrational eigenvectors with the gradient (g> and the step component ni calculated by Eq. (52). For the Newton optimization
268
John D. Head and Michael C. Zerner
TABLE I:
H202 COORDINATES AND GRADIENTS. a) Trial Coordiates ( A )
0 0 H H
X
1.1527552 -1.1527552 1.8897626 -1.8897626
Y -0.1322834 0.1322834 1.0771647 -1.0771647
z 0.1511810 0.1511810 -1.1338576 -1.1338576
In i t ial Grad i en t s ( a. u .) gY
0 0
H H
-0.0079031 0.0079031 -0.0124657 0.0124657
0.0238236 -0.0238236 --0.0210846 0.0210846
gz
-0.0226673 -0.0226673 0.0226673 0.0226673
b) Final Geometry ( A ) X
0 0 H H
1.1563486 -1.1563486 1.8934178 -1.8934178
Y -0.1356317 0.1356377 1.0810208 -1.0810208
z
0.1695288 0.1695288 -1.1522053 -1.1522053
Newton Based Optimization Methods
TABLE 11:
H 0
-
X’A&E
ALL QUANTITIES ARE IN ATOMIC UNITS. THE HESSIAN ETGENVALUES. a)
X Values
a
al
al
cycle
al 1
2
3
4
1 2 3 4 5
4.5053 4.3564 4.3572 4.3586 4.3586
2.1933 2.0450 2.0334 2.0346 2.0346
.1630 .la07 .i7aa .i784 .1784
-..0178 .0159 .0181 .oi7a .0178
b)E1
=
1 -4.804 D-3 9.044 D-4 9.220 D-6 9.406 D-7 5.250 D-8
6.702 2.131 -6.721 -5.526 -3.705
1.066 -2.076 -2.116 -2.158 -1.204
2
d)
.2592 .2818 .2873 .2874 .2874
-3.186 5.550 5.067 9.319 1.522
D-4 D-4 D-5 D-6 D-6
(48b)
4
5.014 D-3
2.827 D-3 -2.521 D-4 -3.021 D-5 -4.896 D-6
-1.786 -3.493 -2.799 -5.227 -8.530
D-2 D-2 D-3 D-4 D-5
Ma*
lql
Summary of the Optimization
Energy (a.u.)
-34.828473 -34.829517 -34.829530 -34.829530 a
Eq.
3
-3.055 D-2 -1.042 D-3 3.305 D-4 2.176 D-6 1.821 D-7
D-3 D-4 D-6 D-7 D-8
6
4
-8.171 D-4 --5.109D-4 4.507 D-5 5.309 D-6 8.705 D-7
D-2 D-3 D-4 D-6 D-7
= < s ( w ~ >t
1
a2
2.3411 2.1600 2.1518 2.1538 2.1538
3
a
Q1
a2 5
1
< g l ~ ~ E> q~. ,(48.3)
2
C)
269
l g ( (a.u./bohr)
0.023824 0.000994 0.000276 0.000007
0.067195 0.002435 0.000676 0.000012
All Computed lei I and Ini I along the a7 modes (i 14 less than 10- a.u.
=
5 and 6) are
270
John D. Head and Michael C. Zerner
all searches were successful with
Q
=
1.
The scalar products Q1
and Ei are essentially zero for the a2 vibrational modes as anticipated from symmetry conservation. Initially the largest gradient and step scalar products occiir for the 2al vibrational mode which corresponds to Although the 2al gradient the next two optimization occurs along the soft 4al
a symmetric OH stretching mode. component sl remains the largest for cycles, the largest step component lli mode f o r the reasons discussed in the
previous section. Table I11 presents an analysis o t the energy change between the second and third optimization cycles. E ~ =Q <slwi><wilg> is ~ the descent condition and should be negative for each ai." Here only the symmetric modes contribute (see Section 6 ) .
For this
cycle the major decrease in energy is along the soft fourth mode. This is so even though the largest gradient component (Table 11) remains along the second mode. There ic: from this specific analysis and the more general one of the previous section an apparent danger in using only the magnitude of the gradient as a check for convergence of the optimiiation procedure.
TABLE 111: ENERGY DESCENT FOR H202 FROM CYCLE 2 TO CYCLE 3 (10-6H).
<s
loi><wiIG >a
1/2Xi<S IWi>2b AEi
-0.18775
-2.72n50
-1.44431
-19.38615
0.09388
1.11019
0.72207
9.69983
-0.09387
--1.10931
-0.72224
- 9.68632
Total: -11.61174 :This is the descent condition and should be negative. The contribution of the first two terms in the Taylor series expansion of the energy.
-
Newton Based Optimization Methods
271
TABLE IV: GEOMETRY OPTIMIZATION O F H,O, USING THE BFGS ALGORITH.
Xla 1
1.0 2.1776 3.9361 4.4429 4.8876 6.2775 4.4232 4.3586
2 4 6
8 10 15 Exact aAtomic units o r Hartrees.
TABLE V:
2
10 15
1.0 1.D 1.4179 7.. 0445 2.3818 7.6888 ?.I109 2.0346
l3 1.0 1.0 1.0 0.9996 1.0324 0.3846 0.1948 0.1784
x4
1.0 0.9849 0.9400 0.8710 0.1471 0.0552 0.0178 0.0178
STEP SIZES (a.u.) IN THE RFGS OPTIMIZATION OF THE GEOMETRY O F H202.
CycleIMode 4 6 8
A2
1 3.924 1.707 -1.194 7.146 -5.193 -1.012
D-3 D-3 D-5 D-5 D-5 D-5
2
3
----
4. 171 1.200 1.015 -4.750 -1.380
D-3 D-4 D-4
D-5 D 5
4
_________ -_--4.148 D-5 -2.094 D-4 -5.344 D-4 -3.654 D-5
-6.861 3.028 9.083 -6.410 -2.389 2.907
D-3 D-3 D-4 D-4 D-3 D-4
A similar analysis for a B F G S quasi-Newton optimization of hydrogen peroxide is given in Tables T V and V starting with the same initial geometry a in Table I. Only the symmetric vibrational modes are included in 'Tables IV and V since only these are updated. A more detailed discussion of the Hessian eigenvalue evolution for the B F G S scheme is given below, but in Table IV we see that it takes O V ~ Keight Hessian updates before the low frequency normal mode is realized. However the eighth cycle energy is only 2vH above the converged energy. 8 . 2 Newton's Method.
In this section, we present a summary of some Newton optimizations. The number of energy and gradient calculations required to reach the minimum forms a frame of reference for considering the efficiency of the different optimization
John D. Head and Michael C. Zerner
272
procedures. The molecules studied are H2C0, NH3 and C2H6. The initial geometry for each molecule has been chosen to have low symmetry (C,) by making a 0.08 a.u. coordinate displacement along each vibrational mode from the minimum. A second initial geometry for NH3 is considered with 0.05 a.u. displacements along each mode from the minimum. The number of line searches, energy and gradient evaluations are given in Table VI for both the Newton and Quasi-Newton methods. Table VI clearly indicates that the use of an exact inverse Hessian requires less points to arrive at the optimum geometry. However in Table VI we have not included the relative computer times required to form the second derivative matrix. If this is taken into account, then the Newton's method with its requirement for an exact Hessian matrix is considerably slower than the quasi-Newton procedures. TABLE VI:
NUMBER OF OPTIMIZATION CYCLES REQUIRED FOR DIFFERENT UPDATE FORMULAE~.
Exact
Rank 1
BFGS
DFP
Greenstadt
H2C0
4(5,5)
9(11,10)
11(12,12)
12(13,13)
16(18,17)b
NH3d
6(8,7)
7(11, 8)
7 ( 8, 8)
11(12,12)
12(13,13)b
NH3e
3(4,4)
7( 9, 8)
7 ( 8, 8)
10(11,11)
11(13,12)b
C2H6
6(7,7)
17( 19,18)
19( 20,20)
29(30,30)'
19(27,20)b
aThe notation x(y,z) refers t o x line searches, y energy bevolutions and z gradient evaluations. Energy changing less than 5.D-07 07 Ha but above true minimum energy value. C dOptimizaton not quite converged. Displaced 0.08 a . u . along each normal mode. eDisplaced 0.05 a.u. along each normal mode.
273
Newton Based Optimization Methods
TABLE VII:
1 5.281D+OO 4.209D+00 3 4.249D+00 4 4.279D+00 5 4.279D+00 2
Eval. No. E G
1 2 3
4 5
1 2 3 4 5
EXACT HESSIAN OPTIMIZATION OF HZCO - HESSIAN EIGENVALUES (a.11. )
.
1.592D+00 2.146D+00 2.046D+00 2.033D+00 2.0331)+00
Energy
-24.316168 -24.336801 -24.339573 -24.339593 -24.339593
TABLE VIII:
1 3.405D+00 2.209D+00 2.072D+00 4 2.440D+OO 2.355D+00 5 2.216D+OO 2.213D+OO 6 2.27OD+OO 2.270D+00 7 2.274D+OO 2.274D+00 8 2.274D+00 2.274D+OO
1 2 3 4 5 6 7 8
1 2 3 4 5 6 7
lg
I
Max
0.369646 0.084631 0.009807 0.000050
lql
0.264229 0.066710 0.007186 0.000034
n.oooooo
1.92213-01 1.906D-01 1.847D-01 1.830D-01 1.830D-01
n.oooooo
1.104D-01 1.417D-01 1.584D-01 1.589D-01 1.589D-01 Opt Type
a
0.00000D+00 1 1.00000D+00 1 1.00000D+00 1 1.00000D+00 1 ~.OOOOOD+OO 1
THE OPTIMIZATION NH WITH THE EXACT HESSIAN - HESSIAN EIGENVALU~S(a.11. ) .
3 3.179D+OO
Eval. No. E G
7.879D-01 1.996D-01 1.060D+00 1.996D-01 1.023Dt00 2.217D-01 1.023D+00 2.3961)-01 1.023Dt00 2.396D-01
Energy
-12.499372 -12.492400 -12.502913 -12.521926 -12.523048 -12.523215 -12.523216 -12.523216
1.341Dt00 1.431D+00 1.354D+00 1.468Dt00 1.454D+OO 1.451Dt00 1.451D+00
lg
1.684D-01 3.866D-02 1.689D-01 8.842D-02 1.553D--Ol 1.492D-01 1.750D-01 1.567D-01 1.609D-01 1.543D-01 1.599D-01 1.54313-01 1.599D-01 1.5431)-01
I
Max
k iI
0.358164
0.198326
0.321531 0.032392 0.016203 0.000832 0.000012 0.000000
0.176294 0.021575 0.012235 0.000694 0.000010 0.000000
-3.8401)-03 5.403D-02 1.122D-01 1.564D-01 1.5431)-01 1.543D-01 1.543D-01
a
0.00000D+OO 1.83049D-01 5.95025D-02 1.00000D+00 1.00000D+OO 1.00000D+00 1.00000D+00 1.00000D+00
Tables VII and VIII examine the nature of the Hessian matrix for HZCO and NH3 as the geometry varies during the optimization. In both cases the starting geometry is far from the quadratic region about the minima. After the first geometry update in H2C0 (cycle 2) the Hessian eigenvalues are within 10% of their final
274
John D. Head and Michael C. Zerner
value. The next step (cycle 3 ) shows gradients reduced by a factor of 10, and Hessian eigenvalues within 1% of their final values. A step with these values leads to a reduction of the gradient by a factor of 200. Similar conclusions can be reached for NH3 from the information in Table V I I I . Highly accurate Hessian matrices at the early stages of an optimization are probably unnecessary. In the final stages of an optimization, when the quadratic region is being approached, an accurate Hessian has a big impact. This is a l s o demonstrated in Table VI when comparing the optimization of NH3 with two different starting geometries. When the starting geometry is far enough removed from the neighborhood of the first geometry each successive step takes the molecule through regions of the energy hypersurface with quite different quadratic behavior, see below. In the case of NH3, Table VIII, the final Hessian shows two degenerate modes, a requirement of C3v symmetry. The initial distorted geometry has quite diffeient eigenvalues for the "parents" of these modes, and i t is quite interesting to follow the evolution of the eigenvalues of these modes through the optimization procedure. The starting geometry is poor enough t o produce very large gradients and a negative eigenvalue in the Hessian. The step, with a = 1, produced from this gradient is sufficiently large as t o produce coordinate displacements larger than allowed by Eq. (15). The geometry and energy, E Z , obtained from the step scaled by the value 0.183 [from Eq. ( 1 5 ) ] fail the right hand line search test. The step is then scaled by 0.0595 obtained from the quadratic interpolation formula (16a) leading to an energy E3 After this, optimization is lower than the starting energy El. rapid. 8 . 3 Quasi-Newton Methods with Unit Hessian. Here we consider the relative efficiencies of the different inverse update formula. The calculations take the initial inverse Hessian as a unit matrix and the initial a in the line
Newton Based Optimization Methods
275
search as 0.4. Table VI provides a comparison of the number of line searches, energy and gradient evaluation for the different formula. The results of Table VI as well as studies on other systems, indicates that the rank one and BFGS methods are the most successful formula and are similar in their optimization efficiency for small molecules. Our experience and those of others [ 9 ] suggest that popular DFP formula may be quite successful with very exact line searches, although still not as good as the rank one and BFGS methods. The Greenstadt method was found not to converge for many of the systems studied. A study of the eigenvalues of the updated Hessian are revealing. Tables IX, X, XI and XI1 report these values for NH3 with the initial displacement 0.08 a.u. along each normal mode as described previously. A s expected, the rank one update summarized in Table IX updates one eigenvalue each cycle until all six modes are realized. The negative eigenvalue shown in the first mode on the fourth cycle is an example of a serious flaw. A step taken with this approximate inverse Hessian led to an increase in energy and a succession of right line search failures. The eighth evaluation is finally taken with a very small step, a = 0.007. Most optimization, of course, do not work with the normal modes and do not diagonalize the Hessian as we have done here to examine the problem. If this were done, then this particular negative eigenvalue could have been set back to its previous value, or even back to unity. Working in normal modes might be desirable in ab-initio calculations, however, where this diagonalization, of order N 3 , is a small step compared with the formation of the integrals and Fock matrices, both N 4 steps, where N is the size of the basis. The rank two BFGS and DFP formula form two non-unit eigenvalues on the first update, again reflecting the rank two nature of the update. Both formula, as expected, preserve the positive definite nature of the Hessian.
276
John D. Head and Michael C. Zerner
TABLE IX:
RANK 1 UPDATE OPTIMIZATION OF NH3 - HESSIAN EIGENVALUES (a.u.).
1 1.000D+00 1.000D+00 1.000D+00 1.000D+00 1.000D+00 1.000D+00 2 2.724D+00 1.000D+00 1.000D+00 1.000D+00 1.000D+00 I.OOOD+OO 3 3.227D+00 2.714D+00 1.000D+00 1.000D+00 1.000D+00 1.000D+00 4 -2.354D-02 2.717D+00 1.802D+00 1.000D+00 1.000D+OO 1.68913-01 8 2.719D+00 2.3318+00 1.458D+00 1.000D+00 1.000D+00 1.689D-01 9 2.587D+OO 2.303D+00 1.456D+00 1.000D+00 9.9941)-01 1.581D-01 10 2.37OD+OO 2.267D+00 1.438D+00 9.999D-01 9.862D-01 1.55213-01 11 2.359D+OO 2.262D+00 1.435D+00' 9.9971)-01 5.302D-01 1.506D-01 Exact
2.274D+OO 2.274D+00 1.452D+00 1.600D-01 1.543D-01 1.543D-01 Eval. No. E G
1 2 3 4 5 6 7 8 9 10 11
1 2 3 4
5 6 7 8
kil
Energy
l!z I
-12.499372 -12.521284 -12.522080 -12.522281 -12.476193 -12.520752 -12.522073 -12.522235 -12.523207 -12.523216 -12.523216
0.358164 0.050917 0.028010 0.019552
0.198326 0.031968 0.017822 0.012914
0.020010 0.004431 0.000612 0.000086
0.012987 0.002909 0.000343 0.000049
Max
a
0.00000D+00 4.00000D-01 1.00000D+00 1.00000D+00 8.67976D-01 8.01833D-02 1.55110D-02 3.671051)-03 1.00000D+00 1.00000D+00 1.00000D+00
277
Newton Based Optimization Methods
TABLE X:
BFGS OPTIMIZATION OF NH3 - HESSIAN EIGENVALUES (a.u.). -
1 2 3 4
5 6 7
9
1.000D+00 l.OOOD+OO l.OOOD+OO 1.032D+00 1.527D+00 1.514Dt00 1.476D+00 1.472D+00
1.00D+00 2.723D+00 3.719D+00 2.836D+00 2.785D+00 2.577D+OO 2.423D+OO 2.434D+OO
1.000D+00 l.OOOD+OO 1.000Dt00 l.OOOD+OO 1.000D+00 1.000Dt00 1.000D+00 1.000D+00
1.000D+00 1.000D+00 1.000D+00 9.784D-01 1.000D+00 6.7251)-01 1.000D+00 2.641D-01 1.000D+00 1.618D-01 9.9941)-01 1.541D-01 9.952D-01 1.627D-01 9.254D-01 1.586D-01
Exact 2.274D+00 2.274D+00 1.452D+00 1.600D-01 1.543D-01 1.543D-01 Eval. No. E G
1 2 3 4 5 6 7 8
1 2 3 4 5 6 7 8
Energy
Ig I
Max lgil
a
-12.499372 -12.521284 -12.522072 -12.522491 -12.523027 -12.523199 -12.523215 -12.523216
0.358164 0.050917 0.028891 0.019688 0.016294 0.007948 0.000941 0.000101
0.198326 0.031968 0.018206 0.010447 0.010447 0.005281 0.000621 0.000059
0.00000D+00 4.00000D-01 1.00000D+00 1.00000D+00 1.OOOOOD~OO 1.00000D+00 1.00000D+00 1.00000D+00
278
John D. Head and Michael C. Zerner
TABLE XI: DFP UPDATE OPTIMIZATION OF NH3 - HESSIAN EIGENVALUES (a.u.).
1 2 3 4 5 6 7 8
9 10 11 12
1.000D+00 2.732D+00 2.721D+OO 4.004D+00 5.743D+OO 6.785D+OO 5.846D+OO 3.321D+00 2.620D+00 2.629D+OO 2.814D+00 2.638D+OO
1.000D+00 1.000D+00 1.926D+00 2.584D+00 2.602D+00 2.601D+00 2.600D+00 2.589D+00 2.175D+00 2.475D+00 2.584D+00 2.431D+00
1.000D+00 1.000D+00 1.000D+00 1.112D400 1.285D+00 1.294D+00 1.310D+00 1.328D+00 1.277D+00 1.215D+00 1.242D+00 1.258D+00
1.000D+00 1.000D+00 1.000D+00 1.00D+00 l.OOOD+OO 9.918D-01 1.000D+00 1.000D+00 7.735D-01 1.000D+00 1.000D+00 4.347D-01 1.000D+00 1.000D+00 2.399D-01 1.000D+00 9.999D-01 1.578D-01 1.001D+00 9.999D-01 1.555D-01 1.000D+00 9.9991)-01 2.168D-01 1.000D+00 9.953D-01 2.8421)-01 1.001D+00 9.884D-01 2.120D-01 1.001D+00 9.838D-01 1.611D-01 1.000D+00 9.828D-01 1.571D-01
Exact 2.274D+00 2.274D+00 1.452+00 Eval. No. E G
1 2 3
4 5 6 7
8 9 10 11 12
1 2 3 4 5 6 7 8 9 10 11 12
Energy
-12.499372 -12.521284 -12.522077 -12.522434 -12.522807 -12.523011 -12.523144 -12.523190 -12.523210 -12.523215 -12.523216 -12.523216
lg I 0.358164 0.05091 7 0.028333 0.018602 0.019977 0.021627 0.016718 0.009809 0.003410 0.000453 0.000173 0.000096
1.600D-01 1.543D-01 1.543D-01
Max Igi
I
0.198326 0.031968 0.017964 0.013570 0.009532 0.012255 0.009988 0.005927 0.001953 0.000453 0.000173 0.000096
a
279
Newton Based Optimization Methods
TABLE XII: 1 2 3 4 5 6 7 8 9 10 11 12 13
GREENSTADT UPDATE IN THE OPTIMIZATION OF NH3 HESSIAN EIGENVALUES (a.u.).
1.000D+00 1.000D+00 2.724D+00 1.000D+00 2.772D+00 1.758D+00 1.025D+01 2.304D+00 -1.302D+01 2.313D+OO 2.892D+00 2.269D+00 -lS729D+022.339D+OO 1.060D+01 2.337D+00 3.653D+00 2.342D+00 7.378D+00 2.309D+00 -3.148D+02 2.320D+00 -4.799D+01 2.346D+OO -4.799D-02 6.525D+01
1.000D+00 1.000D+00 1.000D+00 1.034D+OO 1.142D+00 l.lllD+OO 1.000D+00 1.117D+00 1.131D+00 1.101D+00 1.096D+00 l.llOD+OO 2.183D+00
1.000D+00 1.000D+00 1.000D+00 1.000D+00 1.000D+00 9.969D-01 1.000D+00 1.000D+00 8.639D-01 1.000D+00 1.000D+00 6.917D-01 1.000D+00 9.999D-01 5.936D-01 1.000D+00 9.998D-01 7.8701)-01 1.000D+00 9.998D-01 3.647D-01 1.000D+00 9.998D+00 4.593D-01 1.000D+00 9.998D-01 4.937D-01 1.001D+00 9.994D-01 4.248D-01 1.001D+00 9.998D-01 3.999D-01 1.000D+00 9.898D-01 3.974D-01 1.001D+00 4.0588-01 2.509D-01
Exact
2.274D+00 2.2274D+00 1.452D+00 1.600D-01 1.543D-01 1.543D-01 Eval. No. E G
1 2 3 4 5 6 7 8 9 10 11 12
1 2 3 4 5 6 7 8 9 10 11 12
I
Energy
lg
-12.499372 -12.521284 -12.522079 -12.522402 -12.522565 -12.522591 -12.522774 -12.522889 -12.522968 -12.523059 -12.523154 -12.523155
0.358164 0.050917 0.028129 0.019O60 0.023526 0.019286 0.014169 0.029717 0.018153 0.011297 0.009138 0.010399
Max Igi I
a
0.198326 0.031968 0.017874 0.013480 0.011952 0.010941 0.007835 0.021467 0.011784 0.006420 0.005761 0.006741
We notice that even at optimization, none of these procedures yield particularly accurate Hessian matrices. For most of the systems we have examined, however, the BFGS method is most accurate, followed by the rank one method. Least accurate is the Greenstadt method. The modes least accurately represented are the softer modes. If this system had been given an initial geometry that had C3v symmetry, then only the 2al modes 3 and 4 of the tables, would be updated.
John D. Head and Michael C. Zerner
280
Summarizing, the BFGS and rank one formula appear to be the most successful for updates, at least for small molecules. The rank one formula, however, has a serious disadvantage of not preserving the positive definiteness of the Hessian. TABLE XIII:
A COMPARISION OF THE MURTAGH-SARGENT (MS) PROCEDURE
WITH A BROYDE-FANNO-GOLDFARB-SHANNO (BFGS) PROCEDURE. THE NOTATION IS THE SAME AS THAT OF TABLE VI.
System
H\
H\ c,
H'
H'
c = c
Max. Cornp. Grad. (a.u . /bohr) =
MS
BFGS
0 0.20
>50
24( 27,26)
0.39
11(14,13)
10( 11,ll)
0.16
16( 20,18)
10( 11,ll)
0.15
>50
43( 44,44)
Larger molecules have very many soft modes and it is often difficult t o optimize their structures. For these systems, however, the BFGS formula has a strong advantage over the rank methods that we have examined. Some examples are presented in Table XI11 Pyrimidine and 2,2' paracyclophane are more "typical" examples. The number of SCF cycles in these two cases are 14 versus 11 and 2 0 versus 11 for the Murtagh-Sargent (MS) versus BFGS methods, respectively. The other two molecules in this study were not optimized in 50 seatches by the rank one MS
Newton Based Optimization Methods
281
method, partially a consequence of our initial distortions hat case, for example the destroy all symmetry. In the P O)2(CH3)2 rotations of the two CH3 groups are soft and very strongly coupled.
8.4 Quasi-Newton Methods with Non-Unit Starting Hessians. Ve consider starting the optimization procedure with a more accurate Hessian than the unit matrix described in the previous section. Three approximations are suggested. The first, not examined here, is to have a table of force constants between bonded atoms for "typical" situations, and to form the starting Hessian from this table (27,281. The second is to start the optimization with an exact Hessian. Although a time-consuming process, the numbers of subsequent geometry searches, as summarized in Table XIV, is nearly halved in the case of H2C0, and reduced from 8 to 6 in the case of NH3. An approximate Hessian can also be formed in an SCF procedure by displacing the coordinates as described in Section VII, but freezing the density at the initial geometry. This procedure avoids 6N SCF calculations although new integrals involving the displaced atom must be calculated at each displacement. Starting with this approximate Hessian does not show a great advantage over starting with the unit matrix, as typified in Table XIV Since it is more time-consuming than the latter, i t is not worthwhile. Starting with an exact Hessian matrix (or one formed from a Table of force constants) at an arbitrary setting geometry is of questionable value. If the starting geometry is sufficiently near the optimized geometry, and "quadratically related" to it, an exact Hessian is advantageous (H2C0 of Table XIV) and may allow optimization along the soft modes not easily obtained otherwise. Far from the sought structure, this time-consuming procedure is not worthwhile, for the Hessian itself is changing too rapidly through the optimization. What might be desirable is a better (more rapidly obtained) approximate Hessian used until the optimization had led to a geometry in the neighborhood of the
282
John D. Head and Michael C. Zerner
minimum and then a stable update procedure such as BFGS employed. Tables XV and XVI follow the course of an optimization of NH3 utilizing an exact inverse Hessian and an approximate one, respectively, and should be compared with Tables VIII - XII. For ab-initio calculations such a n appi-oximateHessian might be effectively obtained using analytic expressions from the much faster semi-empirical methods. In addition, large basis set ab-initio calculations might have the Hessian matrix estimated at the minimum basis set level. TABLE XIV:
NUMBER OF BFGS OPTIMIZATION CYCLES USING DIFFERENT INITIAL HESSIAN MATRICES. Unit H
11(12,12)
H2C0
N
H
___
~
Approx. H
Exact H
Newton
9 ( 10,lO)
6(7,7)
4(5,5)
~
_.
aDisplaced 0.05 a.u. along each normal mode. TABLE XV:
BFGS OPTIMIZATION OF MH WITH INITIAL EXACT INVERSE HESSIAN - HESSIAN EIC~NVALUES(a.u.) .
1 2.907D+00 2.522D+00 3 2.401D+00 4 2.431D+00 5 2.377D+00 6 2.331D+00 2
2.238D+00 2.223D+00 2.256D+00 2.256D+00 2.228D+00 2.257D+00
1.409D+00 1.422D+OO 1.454D+00 1.454D+00 1.457D+OO 1.447D+00
1.634D-01 9.583D-02 1.633D-01 9.628D-02 1.633D-01 1.495D-01 1.633D-01 1.495D-01 1.6381)-01 1.595D-01 2.6331)-01 1.561D-01
7.978D-02 8.5761)-02 9.5851)-02 9.5851)-02 9.6871)-02 1.199D-01
Exact
2.274D+00 2.274D+00 1.451D+OO 1.599D-01 1.543D-01 1.543D-01 Eval. No. E G
1
1
2
2
3 1 5 6
3 4 5 6
Energy
-12.514386 -12.522959 -12.523022 -12.523215 -12.523216 -12.523216
lg I
0.204926 0.024481 0.013808 0.001333 0.000101 0.000018
Max
kil
0.112413 0.014681 0.009355 0.000805 0.000066 0.000012
a
0.00000D+00 1.00000D+00 1.00000D+00 1.00000D+00 1.00000D+00 1.00000D+00
283
Newton Based Optimization Methods
TABLE XVI: BFGS OPTIMIZATON OF NH WITH INITIAL APPROXIMATE INVERSE HESSIAN - HESSIAN ~IGENVALUES(a.u.
1 3.173D+00 2.493D+OO 1.668D+00 2 2.804D+00 2.477D+00 1.598D+00 3 2.905D+00 2.498Dc00 1.462D+00 4 3.240D+00 2.505D+00 1.564D+00 5 2.732D+OO 2.469D+OO 1.540D+OO 6 2.559D+00 2.359D+00 1.498D+OO 7 2.512D+00 2.335D+00 1.550D+00
3.570D-01 2/244D-01 1.760D-01 3.570D-01 2.260D-01 1.765D-01 2.621D-01 2.277D-01 1.764D-01 2.285D-01 1.801D-01 1.761D-01 2.284D-01 1.765D-01 1.587D-01 2.282D-01 1.765D-01 1.672D-01 2.034D-01 1.766D-01 1.659D-01
Exact
2.274D+00 2.2740+00 1.451D+00 1.599D-01 1.543D-01 1.543D-01 Eval. No. E G
1 2 3 1 5 6 7
1 2 3 4 5 6 7
I
Energy
lg
-12.514386 -12.522727 -12.523121 -12.523202 -12.523216 -12.523216 -12.523216
0.204926 0.033809 0.007852 0.003959 0.002398 0.000363 0.000035
a
0.112413 0.016685 0.006127 0.002449 0.001415 0.000239 0.000027
8.5 Approximate Hessians As seen in Table XV, the use of the V(l) approximate Hessians, Eq. (38), is remarkable. For a l l practical purposes the number of geometry searches I - e q i i i r e d for optimization using V(l) is about the same number as that required using the exact Hessian. For semi-empirical methods there is an impressive savings of computer time over both the BFGS procedure, and the exact evalution o f the Hessian. For large systems the calculation of V(l) is still practical, although the exact Hessian is not. [2,2'] paracyclophane optimizes in but 3 complete SCF calculations using the V ( 1 ) approximation, to be compared with the 11 BFGS SCF cycles (and 10 searches) reported is Table XIII. For ab-initio methods the use of V(l) might be expected to lead to a savings of a factor of two - four in computer time for large systems f o r the reasons previously discussed. Although the
284
John D. Head and Michael C. Zerner
use of the V(l) method is of advantage over the use of the exact Hessian for geometry searches, the BFGS method is likely still to be much faster overall except in pathological cases.
8.6 Augmented Hessian. We [ 4 4 ] , as well as others [40,41], have found the augmented Hessian or related restricted step methods of great use for determining transition states in molecules. For minima, this method is most useful when an accurate estimate of the second derivative matrix is available and negative eigenvalues occur in it. Otherwise, use of this method slows down the optimization as shown in Table XVII. The reason for this is the restricted nature of the step caused by the denominator shift in Eq. (45d). The augmented Hessian method is very successful with the V(l) estimate of the Hessian, as indicated in this Table. The augmented Hessian is not, in general, reliable with any of the update procedures we have examined. TABLE XVII:
A SUMMARY OF SOME OPTIMIZATIONS.
For each type of optimization the number of cycles is given. The number in paranthesis is time (min:sec) on a VAX11/780, and because of system load, are in possible error of 20%. The test calculations are INDO SCF. System
NH3
CH30H
Cis 1,2-Br,F Cyclopropane
Initiala Grad Norm 0.169 0.160 0.143 8 25(4:45) 2 ( 10:24) BFGS, Eq. 31 Newton Method 6 4(1:42) 9( 5 : 4 6 ) Approx. Hessian, Aug. Approx. 5 9(2 :28) 17(10:30) Hessian, Eq. 45 Newton Method, Exact Hessian, 3 7(4:30) 8(37:32) Aug. Hessian, 3 9(5:28) 17( 8 4 :42) Exact Hessian Eq. 45 aInitially all symmetry is broken. For convergence all gradients bhave been reduced to below 5*10-4 a.u./bohr. The approximate Hessian is formed from the V(l) method of Eq. 38.
Newton Based Optimization Methods
285
A more interesting use of the augmented Hessian is
illustrated for HCN. A transition state for the HCN to HNC has been found. The initial geometry for HCN is then formed by displacing the coordinates 0.05 a.u. along the negative eigenvalue normal mode. The BFGS method with unit starting Hessian is slow to find the minimum with this starting conformation, requiring 12 (17, 1 4 ) o r 12 line search, 17 energy evaluations and 14 gradients evaluations. The augmented Hessian approach required 8 (lZ,ll), rpprenenting a considerable reduction of effort. Both methods insure the positive definiteness of the Hessian throughout the cycling - as evidenced that both obtained the minimum - but clearly the extra curvature introduced by the augmented Hessian, E q s . ( 4 5 ) , has helped.
9. CONCLUSIONS
In this paper we have reviewed and studied several Hessian update procedures for optimizing the geometries of polyatomic molecules. Of the several methods we have studied, many of which we have reported on here in detail, the most reliable and efficient is the BFGS. The weak line search outlined in this work is sufficient to ensure successful optimization even when starting geometries are generated through molecular mechanics [ 4 5 ] or ball and stick models [ 4 6 j . Although the number of iterations in the optimizations procedure can be reduced by starting update procedures with an exact Hessian or Newton step, the time involved in forming this Hessian is seldom worth the effort unless the gradients are sufficiently small as to suggest that the current geometry is close to the optimized geometry. At this stage in the optimization, a single Newton step can save considerable effort. These findings are dramatically summarized for
286
John D. Head and Michael C. Zerner
0
CH2 = CH - 0 -
IIP
I
- CN
in Table XVIII, a case with many soft modes. It should be noted, however, for reference that either the rank one or BFGS methods after 10 cycles have found all bond lengths to within k0.02 a.u. of their final value, and all bond angles to within +3O. The difficult modes involve optimizatiori o t dihedral angels that are coupled with pseudo-rotations. 0
TABLE XVIII:
OPTIMIZATION OF C H ~ CH - 0
II -
P - CN, a.u.
I
N( CH3 1 2 Murtagh-Sargent Rank 1
1 5 10 11 15
AE (a.u.)
lgl
0.0163 0.0149 0.0141
0.0050 0.0026 0.0024
---
0.0109 25 0.0101
___
0.0024 0.0020
BFGS + 10th Cycle Exact Hessian
BFGS
AF,
(a.11.)
0.0163 0.0134 0.0093 _-_
0.0032
0.0002
lgl 0.0050 0.0030 0.0046
___
0.0030 0.0026
AE (a.u.1 0.0163 0.0134 0.0083 0.0000
lgl 0.0050
0.0030 0.0048
6x10-
In addition to examining Newton steps, we have also examined methods of the restricted step o r augmented Hessian type. Although of enormous utility in searching for transition states, the very nature of their construction makes these methods not effective for finding minima unless the starting geometries have negative Hessian eigenvalues. In such a case augmented Hessian steps could be determined until the second (non-zero) eigenvalues of the augmented Hessian is positive, and subsequent steps determined using the Newton methods outlined.
Newton Based Optimization Methods
287
We have also examined here t h e use of approximate solutions of the coupled perturbed Hartree-Fock equations for estimating the Hessian matrix. This Hessian appears to be more accurate than any updated Hessian we have been able to generate during the normal course of an optimization (usually the structure has optimized to within the specified tolerance before the Hessian is very accurate). For semi-empirical methods the use of this approximation in a Newton-like algorithm for minima appears optimal as demonstrated in Table 17. In ab-initio methods searching for minima, the BFGS procedure we describe is the best compromise. APPENDIX Update formula for the direct Hessian G. Rank 1
corresponding to Eq. (29). BFGS, corresponding t o E q . (31)
DFP, corresponding to E q . ( 3 0 ) GDFP k + l = G k +
Note that at each cycle of the update, Gk + 1 Hk + 1 -- Hk + 1 k = 1 providing the same update is used for Gk and E Gk +
.
288
John D. Head and Michael C. Zerner
ACKNOWLEDGEMENTS This work is supported in part through a grant from the Division of Sponsored Research at the University of Florida, and through a grant from the US Air F o r c e (Tyndall) F08635-83-C0136.
REFERENCES
1. 2.
P. Pulay, in Modern Theoretical Chemistry, Vol. 4, ed. H.F. Schaefer I11 (Plenum Press, New York, 1977), p. 153.
H.B. Schlegel, Adv. in Chemical Physics, 67, 249 (1987).
3. Geometric Derivatives of Energy Surfaces an Molecular Properties, ed. P. Jorgensen and J. Simons, NATO AS1 series C, Vol. 166 Reidel, Boston (1985). 4.
P. Pulay, Mol. Phys. 17, ~ - 197 . (1969);
18, 473
(1970).
5. I.W. McIver, Jr. and A. Komornicki, Chem. Phys. Lett. 303 (1971).
lo,
6. H.B. Schlegel, S . L . Wolfe, J. Chem. Phys. 63,3632 (1975); 67, 4181 (1977); 67, 4194 (1977). 7.
H.B. Schlegel, J. Chem. Phys.
c,3676 (1982).
8. M.C. Zerner, in Computational Methods for Molecular Structure Determination: Theory and Technique, NRCC Proceedings No. 8. 9. a)
10.
R. Fletcher, in Practical Method of Optimization, Vol. 1, (Wiley, New York, 1980);
b)
P.E. Gill, W. Murray and M.H. Wright, in Practical Optimization, (Academic Press, New York, 1981).
a)
W. Spendley, G.R. Hext and F.R. Himsworth, Technometrics 4, 441 (1962); 3 . A . Nelder and R. Mead, Comp. Jour. 1, 308 (1965).
b)
11.
J.A. Pople, R. Krishnan, H.B. Schlegel and J.S. Binkley,
12.
a) b)
Int. J. Quantum Chem.
s,225 (1979).
K. Muller, Angew. Chem. 19, 1 (1980); S. Bell and J.S. CrightoG J. Chem. Phys. (1984).
80, 2464
289
Newton Based Optimization Methods
13. a) b)
J. Simons, P. Jdrgensen, H. Taylor and J. Ozment, J, Phys. Chem. 87, 2745 (1983). A. Banerjee, N. Adams, J. Simons and R. Shepard, J. 89, 52 (1985). Phys. Chem. -
14. J.A. Pople, D.L. Beveridge and P.A. Dobosh, J. Chem. Phys. 47, 2026 (1967). 15. A.D. Bacon and M.C. Zerner, Theoret. Chim. Acta, 53, 21 (1979). 16. G. Fogarsi and P. Pulay, J. Mol. Struct.
2,275
(1977).
17. E.B. Wilson, Jr., J.C. Decius and P.C. Cross, in Molecular Vibrations, (McGraw-Hill, New York, 1955).
64, 371
18.
J.N. Murrell and K.J. Laidler, Trans. Faraday SOC.
19.
A.A. Goldstein, SIAM J. Control 3, 147 (1965).
20.
M.J.D. Powell, in Optimization in Action, ed. L.C.W. Dixon (Academic Press, New York, 1976), p . 117.
21.
53, 279 (1979). P. Scharfenberg, Theor. Chim. Acta -
22.
21, 368 (1967). C.G. Broyden, Maths. Comput. -
23.
W.C. Davidon, Computer J.
(1968).
LO, 406
(1968).
24. A.V. Fiacco and G.P. McCormick, in Non-linear Programming, (Wiley, New York, 1968).
25.
B.A. Murtagh and R.W.H. Sargent, in Optimization, ed. R. Fletcher, (Academic Press, London, 1969).
26. For example, the ab-initio packages GAUSSIAN 8 0 , HONDO, TEXAS and GAMES. 27.
H.B. Schlegel, Theoret. Chim. Acta
66, 333 (1984).
2, 128 (1934); 3, 227 (1935). 28. R.M. Badger, J. Chem. Phys. 13, 185 (1970). 29. B.A. Murtagh and R.W.H. Sargent, Comput. J . -
30. a) b) 31.
W.C. Davidon, AEC Res. and Dev. Report ANL-5990 1959) ; 6, 163 (1963).
R. Fletcher and M.J.D. Powell, Comput. J.
a) C.G. Broyden, J. Inst. Maths Applns. 6, 76, 222 1970) ; b) R. Fletcher, Coput. J. 13, 317 (1970)y 24, 23 (1970); c) D. Godfarb, Maths. Compz. d) D.F. Shanno, Maths. Comput. 24, 647 (1970).
290
John D. Head and Michael C. Zerner
22, 368
32.
C.G. Broyden, Maths. Comput.
33.
J.C. Greenstadt, Maths. Comput. 2 6 ,1 ( 1 9 7 0 ) .
34.
a)
b) c)
(1967).
R.M. Steens, R. Pitzer and W.N. Libscomb, J. Chem. Phys. 38, 550 ( 1 9 6 3 ) .
Gerratt and I.M. Mills, J. Chem. Phys,
( 1 9 6 8 ) ; 4 9 , 1730 ( 1 9 6 8 ) .
T.C. Coves and M. Karplus, J . Chem. Phys.
49, 50,
1719 3649
(1969). 35.
J.D Head and M.C. Zerner, Chem. Phys. Letters (1986).
131, 359
36.
Y. Osamura, Y. Yamaguchi, P. Saxe, M.A. Vincent, J.F. Gaw and H.F. Schaeffer 111, Chem. Phys. 72, 131 ( 1 9 8 2 ) .
37.
K. Levenberg, Quant. Appl. Math. 2, 164 ( 1 9 4 4 ) .
38.
D.W. Marquardt, SIAM J. 11, 431 ( 1 9 6 3 ) .
39. 40.
S.M. Goldfeld, R.E. Quandt and H.F. Trotter, Econometrica
34,
541 ( 1 9 6 6 ) .
C.J. Cerjan and W.M. Miller, J. Chem. Phys.,
2, 2800
(1981). 41.
J. Simons, P. Jorgensen, H . Taylor and J . Ozment, J. Phys. Chem. 87 2745 ( 1 9 8 3 ) ; D. O’Neal, H. Taylor and J. Simons, J. Phys. Chem., 88, 1510 ( 1 9 8 4 ) ; H. Taylor and J. Simons, J. Phys. Chem., 89, 52 ( 1 9 8 5 ) ; D.T. Nguyen and D.A. Case, J. 4020 ( 1 9 8 5 ) . Phys. Chem.,
e,
1, 149
42.
R. Fletcher and C.M. Reeves, Cornput. J.
43.
P. Pechukas, J. Chem. Phys. 64, 1516 ( 1 9 7 6 ) .
44.
J.D. Head and M.C. Zerner,
45.
N.L. Allinger, J. Amer. C h m .
46.
D.G. Purvis, QUIPU Program System, Quantum Theory Project, University of Florida.
t o be
(1964).
published.
Soc. 99 8127 ( 1 9 7 7 ) .
USE OF CLUSTER EXPANSION METHODS IN THE OPEN-SHELL CORRELATION PROBLEM Debashis Mukherjee Theory Group, Department of Physical Chemistry Indian Association for the Cultivation of Science Calcutta 700 032, India Sourav Pal Physical Chemistry Division National Chemical Laboratory Pune 411 008, India
1. INTRODUCTION 2. ‘THEORETICAL MODEL CHEMISTRY‘ AND ITS RELEVANCE
TO THE OPEN-SHELL FORMALISMS
3 . CLUSTER EXPANSION AND SIZE-EXTENSIVITY
4. CATEGORIZATION OF THE OPEN-SHELL MANY-BODY FORMALISMS 4.1 The Earlier Open-Shell Perturbative Developments 4.2 The Propagator Methods 4.3 The Open-Shell Cluster Expansion Approac h 4.4 Certain General Comments 5. SEMI-CLUSTER EXPANSION THEORIES FOR THE OPEN-SHELL STATES 5.1 General Considerations 5.2 Variational Methods 5.3 Non-variational Methods 6. FULL CLUSTER EXPANSION THEORIES IN HILBERT SPACE 6.1 General Preliminaries 6.2 Single Root Hilbert Space Formalisms 6.3 Multiroot Hilbert Space Formalisms 7. FULL CLUSTER EXPANSION THEORIES IN FOCK SPACE 7.1 Preliminaries for a Fock Space Approach ADVANCES IN QUANTUM CHEMISTRY, VOLUME 20
291
Copyright 0 1989 by Academic Press. Inc. All rights of reproduction in any form ~ ~ e ~ e d .
292
Debashis Mukherjee and Sourav Pal
7.2 Similarity Transformation-Based Fock Space Theories 7.3 Bloch Equation-Based Fock Space Theories 7.4 Variational CC Theory in Fock Space 8. SIZE-EXTENSIVE FORMULATIONS WITH INCOMPLETE MODEL SPACE (IMS) 8.1 Preliminaries 8.2 General Formulation 8.3 The Case of IMS having valence holes and particles 8.4 Applications of IMS-based CC Theories 8.5 Size-extensivity of energies with INS 9., CONCLUDING REMARKS ACKNOWLEDGEMENTS REFERENCES
1. INTRODUCTION
Our aim in this review is to highlight mainly the formal developments of the open-shell many-body cluster expansion theories that have taken shape over the last one and a half decades. We have now come to believe that a proper understanding and interpretation of most of the spectroscopic and dynamical phenomena involving excited or ionized molecules and molecular fragments require models for open-shell states transcending the simple orbital description. The language of occupation number representation - coupled with the hole-particle description of the operators, the use of Wick’s theorem and the attendant diagrammatic depiction of the various terms/l/- makes the many-body approach a very attractive alternative to the traditional CI-based theories. The open-shell many-body perturbation theory(MBPT)/2-9/ and the propagator or the Green’s function methods/lO-16/ have now developed into vast computational technology, generating softwares which are cost-effective and competitive with the CI-based codes. The cluster expansion based open-shell many-body formalisms possess even greater potentiality since they not only share and embed all the desirable features and technical advantages of the open-shell MBPT or propagator theories, but also are inherently non-perturbative in nature. They provide more compact descriptions of the electron correlation and are more flexible. Moreover, the cluster expansion
Cluster Expansion Methods in Open-Shell Correlation
293
of the open-shell functions provides u 5 with valuable insight regarding the structure of general many-electron wave-functions. In fact, over the last few years there have been very fruitful developments /91-96,138-144/ on the size-extensivity of approximate functions (which generate energies scaling properly with the size of the molecules ) from a careful analysis of the structure of the cluster expansion for an open-shell function of arbitrary complexity, which have a s yet no counterparts in the other many-body branches . The developments of the cluster expansion theories appear to have reached a stage where a clear perspective is beginning to emerge, although n o comprehensive review of the various facets of the approach and a critical evaluation of the seemingly disparate formalisms put forward i s available in the literature. There are, however, several reviews on closed-shell coupled cluster theories where the openshell cluster expansion theories are also touched upon/18,19,21,22/. A few reviews on the open-shell MBPT describe in broad terms the cluster expansion techniques in so far a 5 they relate to MBPT /20,23/. A concise survey of what we shall call 'full cluster expansion theories' appears in a recent article by Lindgren and Mukherjee/?4(a)/. We shall try to present the various theories from a general and unified point of view. In our exposition, we shall not necessarily follow the chronological order of the developments. Rather, we shall follow a logical course which will bring out the evolution of the cluster expansion strategy, gradually encompassing finer aspects of the electron correlation that are peculiar to the open-shell states. This attitude leads to a natural subdivision of the currently available open-shell cluster-expansion formalisms into various categories. The historical order will be maintained a s far a s i s possible in the delineation of each category.
2. 'THEORETICGL MODEL CHEMISTRY' AND ITS RELEVhNCE TO THE OPEN-SHELL FORMALISMS
Before describing the unifying theme, which will classify the various theoretical developments in the cluster expahsion formalisms, it is pertinent to summarize first certain essential criteria that an
294
Debashis Mukherjee and Sourav Pal
acceptable theoretical model, describing large systems l i k e molecules of interest in chemistry, should satisfy. In the modern many-body parlance the necessity of maintaining certain qualities of the wave-function satisfying these criteria has been emphasized by Bartlett et a1/17,18/, Pople et a1/24/ and Kutzelnigg/46/, although an awareness about these aspects can be traced back to such early practitioners a s Bruckner/25/, Goldstone/Zb/, Hubbard and Huyenholti/27/,Sinanoglu/28/, Primas/29/, and Coester/JO/. A 'Theoretical Model Chemistry'/l8,24/ has an underlying approximation scheme which i s of potentially uniform precision for all the phenomena it wants to describe, so that the viability of the model can be tested by an appeal to the experimental results. The criteria that a theoretical model should satisfy are the following : ( a ) Size-extensivity, i.e., the calculated energies should scale properly with the size of the molecule (or, equivalently, with the number of electrons) (b) Generality ,i.e., the model should not depend for its validity on some special configurations or their symmetry. (c) The model should respect the proper space and spin symmetry required by the symmetry of the system. (d) The model should guarantee separability of the molecule into proper fragments if the situation so demands. (e) The model should be versatile enough, so that it applies equally well to ground, excited, ionized or open-shell states. W e may recall that the desirability of ensuring size-extensivity for a closed-shell state was one of the principal motivations behind the formulation of the MBPT for the closed-shells. The 'linked cluster theorem' of Bruckner/25/, Goldstone/26/ and Hubbard/27/, proving that each term in the perturbation series for energy can be represented by a 'linked' (connected) diagram directly reflects the size-extensivity of the theory. Hubbard/27/ and Coester/30/ even pointed out immediately after the inception of MBPT/25,26/, that the size-extensivity 1 5 intimately related to a cluster expansion structure of the associated wave-operator that is not just confined The corresponding nononly to perturbative theory perturbative scheme for the closed-shells w a s first described by Coester and Kumme1/30,31/ in nuclear physics and this was transcribed to quantum chemistry
.
Cluster Expansion Methods in Open-Shell Correlation
295
by Cizek/32/, and then has been further developed by Paldus. Bartlett/l8,2l(a)/ and others/33-37/. This formalism, called the coupled-cluster (CC) method for closed shells, is now widely accepted a 5 the most versatile and powerful electron correlation theory for closed shell systems such a s the molecular closed shell ground state/l8,21/. Fss Bartlett/lB/ has pointed out, for a closed shell single determinant Hartree-Fock (HF) function a s the starting point for MBPT or CC theory, separability into neutral fragments is not automatically guaranteed, since sometimes the UHF rather than the F?HF dissociates properly. When the criteria (a) and i d ) above are both satisfied, the model may be called size-consistent/24/. hlthough the closed shell CC theory has been sometimes used to describe some special open-shell situations/3?,40/ ( l i k e single configuration triplets/39/) or dissociating species/40-42/, from the point of view o f generality, and on physical grounds it i s natural to look for open-shell generalizations of the cluster expansion formalisms a s developed for the closed shells. The criteria ( a ) to (e) noted above pose special problems for the open-shell states, having n o corresponding analogues for the closed-shell situation. In any theoretical model for the open-shell situation, one first identifies a conceptually minimum set of determinants which are essential to describe the basic physico-chemical properties which the model sets out to describe (such a s space-spin symmetry, treatment of quasidegeneracy, separability into proper fragments). The determinants of this starting or 'reference' function may be said to span a 'model space/3/. Correlation is brought in by superposing other configurations belonging to the 'virtual' space. It is customary to designate the orbitals which are completely filled in all the model space functions a 5 'core' orbitals, those which are partially occupied (i.e,. occupied in some model space functions and unoccupied in others ) a s the 'valence' orbitals. The rest of the orbitals are called 'particles'. The above nomenclature owes its origin to the early developments of the open-shell MBPT (see 12-9,20/ for extensive reviews), and forms a convenient starting point for all the later modern open-shell formalisms that have followed. The total number of electrons in the model space functions is more than those in the core, according to this scheme. Sometimes it i s more convenient to take a core having more electrons than
296
Debashis Mukherjee and Sourav Pal
in the model space functions, and vacant core orbitals are interpreted a 5 ‘valence holes’ in the core. We shall, at the moment, side-step this complication. Most of the discussions to follow remain valid for this latter situation a 5 well. The unifying theme for describing electron correlation in the open-shell states may now be laid down. We wish to look upon the overall effect of electron correlation a 5 stemming from additive contributions from the correlation among the core-electrons, the correlation among the valence electrons and the interactions of core and valence electrons leading to core-polarization and core-relaxation effects. The various open-shell developments to be described emphasize these various effects in various degrees. More importantly, we can envisage several different, although inter-related, aspects of the size-extensivity requirements which are special to the open-shell situations. They define different models for the open-shell many-body formalisms: (all Core-extensive models : The model is explicitly size-extensive with respect to the core electrons N only. This is quite useful for a good C descript on o f electron correlation for a series of open-shell systems with a fixed number N of the valence electrons, when Nc>>NV. The bulkVof the size-extensivity then comes from the size-extensivity o f the core-correlation. The valence correlation and the core-valence interaction need not be so rigorously size-extensive, since N does not change in the V series. Computation of the energies o f the states in an isovalent series like Be,Mg etc. with a few valence electrons,treatment of electron-detached or attached states of closed shell molecules, low-lying excited states of closed-shell molecules with only a K ) can be few valence electrons (Li Na 2’ 2’ conveniently performed with such 2a model without much size-extensivity error. ( a 2 ) ell-electron extensive models: These models are explicitly size-extensive with respect to the total number of electrons N (=N + N 1 only. Here, C the dissection of the total correlatign energy into core correlation, valence correlation and core-valence interaction plays a subsidiary role. Addition and deletion o f valence electrons to the system demands that one starts from scratch once again. (a3) Core-valence extensive models : These models are explicitly size-extensive with respect to NCand N V separately. This is obviously the most general
,...
Cluster Expansion Methods in Open-Shell Correlation
297
size-extensive model, s i n c e i t n o t o n l y a u t o m a t i c a l l y e n s u r e s t h e f e a t u r e ( a 2 1 above, but a l s o t h e f e a t u r e (al), when N i s fixed. In a d d i t i o n , when e l e c t r o n s a r e d e l e t e d Yrom o r added t o t h e v a l e n c e o r b i t a l s , thereby c h a n g i n g N f o r a fixed N it looks u p o n t h e V C' c o n s e q u e n t c h a n g e in e n e r g y a s a c h a n g e o f core-valence i n t e r a c t i o n a c c o m p a n y i n g t h e c h a n g e in N V and a n a d d i t i o n a l v a l e n c e c o r r e l a t i o n effect. O n e needs to compute only these additional energies, which a r e d i f f e r e n t i a l c o r r e l a t i o n e n e r q i e s , and c a n t h u s by-pass a m o r e e l a b o r a t e c a l c u l a t i o n d e novo. T h i s a s p e c t i s o f paramount i n t e r e s t in f o r m u l a t i n g a size-extensive theory o f s u c h e n e r g y d i f f e r e n c e s of s p e c t r o s c o p i c i n t e r e s t as i o n i z a t i o n potential (IF), e l e c t r o n a f f i n i t y ( E A ) o r e x c i t a t i o n e n e r g y ( E E ) of the closed shell systems, since the common core-correlation e n e r g y c a n c e l s o u t o n taking energy d i f f e r e n c e s and t h u s need n o t b e c o m p u t e d a t all.Another i m p o r t a n t a d v a n t a g e o f t h i s model i s that, when t h e c o r r e l a t i o n e f f e c t i s s h o r t range( a s i s usally t h e c a s e ) , s u c c e s s i v e a d d i t i o n o f v a l e n c e e l e c t r o n s will g r a d u a l l y lead t o a s a t u r a t i o n in t h e c h a n g e in e l e c t r o n c o r r e l a t i o n , and t h e o v e r a l l v a l e n c e c o r r e l a t i o n i n v o l v i n g many v a l e n c e e l e c t r o n s may b e c o m p u t e d from a k n o w l e d g e o f v a l e n c e c o r r e l a t i o n involving f e w e r electrons. T h e core-valence e x t e n s i v e f o r m a l i s m s w o u l d t h u s b e c o m e progressively m o r e a c c u r a t e w i t h increasing n u m b e r o f v a l e n c e electrons. In v i e w o f t h e f i n e r c l a s s i f i c a t i o n s ( a l ) t o ( a 3 ) of t h e size-extensivity, t h e a s p e c t s of t h e size-consistency for t h e open-shell t h e o r i e s a r e a l s o analogously m o d i f i e d , d e p e n d i n g o n w h i c h of t h e f e a t u r e s (all t o ( a 3 1 i s c h o s e n f o r size-extensivity. W h a t e v e r i s t h e c h o i c e , however, t h e size-consistency i s most c o n v e n i e n t l y e n s u r e d if t h e model s p a c e i s s e l e c t e d in s u c h a way t h a t t h e s t a r t i n g f u n c t i o n s c o n s t r u c t e d f r o m t h e model s p a c e f u n c t i o n s d i s s o c i a t e i n t o t h e desired fragments. W h e n t h e c o r e e l e c t r o n s a r e m u c h m o r e rigidly bound c o m p a r e d t o t h e v a l e n c e e l e c t r o n s , and t h e core s e p a r a t e s i n t o closed-shell fragment c o r e s , t h e v a r i o u s f r a g m e n t s t h a t o n e n e e d s to d e s c r i b e in t h e d i f f e r e n t d i s s o c i a t i o n l i m i t s c a n all b e g e n e r a t e d f r o m t h e model s p a c e f u n c t i o n s if t h e model s p a c e c o n t a i n s all t h e d e t e r m i n a n t s t h a t c a n b e v a l e n c e e l e c t r o n s in t h e formed by a l l o c a t i n g N V v a l e n c e o r b i t a l s in all p o s s i b l e ways. T h i s i s t h e s p a c e , f o r e x a m p l e , w h i c h i s o f t e n u s e d in t h e CI methods, and i s c a l l e d t h e 'complete active space' ( C A S ) 1431 o r a 'full-valence' model space. In t h e
298
Debashir Mukherjee and Sourav Pal
current many-body terminology, one calls this a complete model space’ / 4 4 / , although this nomenclature 1s unfortunately not in conformity with the definition of ’completeness’ in a configurational expansion a s given by L&wdin/lOB/. Most of the many-body formalisms used to date start with a complete model space. Size-consistency then follows a 5 a consequence of the size-extensivity of the model underlying the theory. I f the model space is incomplete/44/, great care is needed to choose proper model functions ensuring proper dissociation into fragments, and all possible fragments cannot certainly be described in a size-consistent manner with a set o f fixed model space functions spanning an imcomplete model space /45/. It is,however, possible to ensure separability into specfic fragments/9&/ by a suitable choice of the model space. For a general but comprehensive discussion of the aspects of sizeextensivity, see e.g., a recent paper b y Chaudhuri e t a1/147/.
3.
CLUSTER EXPFINSION AND SIZE-EXTENSIVITY
There is an intimate connection between the cluster-expansion of a wave-function and the property of size-extensivity. T o describe this aspect in the simplest manner,it is pertinen.t to recall first the closed-shell ground state. The ways to encompass the open-shell states can then be i.ndicated as appropriate extensions and generalizations of the closed-shell cluster expansion strategy. For an N-electron closed-shell state with a dominant reference function H the exact 0’ wave-function !P can be written a5 a superposition o f 0 various n-fold excited determinants H’ on Ho. 9 may n thought to be generated from H by a wave opera?or
n
:
(3.1)
In the many-body description, Ho 1s usually taken a s the ‘vacuum‘, and ‘holes‘ and ’particles’ are appropriately defined as those occupied and vacant i n
Cluster Expansion Methods in Open-Shell Correlation
299
i respectively. Hn ' s may then be viewed a s nh-np eEcited determinants-obtained from 0 0 The amplitudes C1 in the expansion, eq.(3.1), are n a measure of the importance of n-electron correlation in a A s the number of amplitudes C: increases 0 astronomically with the increase in N, any practicable computational scheme has to truncate the expansion, eq. (3.lj. A systematic rationale for truncation is provided by a model for electron correlation. l-hus, for example, Sinanoglu/28/ advocated that the pair-correlation is the dominant electron correlation for a closed-shell state with the HF function taken a s Unfortunately, however, a restriction to the 0 approximation R = 1 + C>2, leads to an energy which becomes increasingly poorer with growing N. The approximation is hence not size-extensive. r 3inanoglu/20/ demonstrated that, even if the pair-correlation is dominant, the amplitudes C4,C6,C, etc are not negligible, and they are determined by simultaneous correlation of two, three,.. pairs of electrons correlating 5 i m u n t a n e o u s l y / ~ b / . l n c l u s i o nof these leads to the sire-extensivity o f energy. This finding ties u p neatly with the earlier insight offered by Hubbard/27/ and Coester/3C)/ qleaned from M B P T , which leads to the cluster- e x pans i on for / 2 9 - 3 0 / from
H
.
.
+ .
...
=
0
exp(T) 9
0
= M0
(3.2)
where T is a combination of various nh-np excitation : operators T n N
T = X Tn n=l
(3.3)
Tn introduces true n-particle correlation, and products like TkTm etc., arising out of the expansion eq.(3.2), generate simultaneous presence of k-particle amd m-particle correlations in a (k+m)-fold excited determinants etc. The truncation o f T z T2 then corresponds to the pair-correlation model of Sinanoglu, while incorporating higher excited states with several disjoint pair excitations induced through k the powers T The amplitudes for Tn may be c a 1 1 ed 2 'linked' or connected clusters for n electrons. The difficulty of a linear variation method such as CI lies in its inability to realize the cluster expansion structure eq.(3.2),in a simple and practicable manner.
Debashir Mukherjee and Sourav Pal
300
Following an analysis due to Primas/29/, we can indicate now one hallmark of a size-extensive representation of a wave-function a 5 revealed in the expansion, eq.(3.2), above. Once we identify the most important true n-electron correlations, a s reflected in the nature of the operators T the total energy n’ for a size-extensive Y must become additively 0 separable in the asymptotic limit of no interactions between the clusters. Let u s note carefully that we are talking here of the separability into various qroup5 of interactions (clusters) and n o t of separability into fragments. Unless this feature i s explicitly built-in, no approximate wave-function can have this property automatically. It is straightforward to verify that, once we identify the linked clusters with groups A,B, etc., then in the limit of no interaction between the groups, the representation, eq. ( 3 . 2 ) leads to an additive separation of the energy with respect to these groups. From eq.(3.2), we find that (3.4)
Let us assume that
A + HB
(3.5)
+....
H W H
when the groups d o not interact, and write T as T
w TA
TB
+
+....
The operators with different labels commute in the limit of no interactions between the groups, and we have
Eo* EA +
E
B
+....
(3.7)
with
EA = t m 0
1
exp(-T
A
)
HA exp(TA)lBo> etc.
(3.8)
As we have taken the groupings A,B etc., to refer to true linked clusters T , TB etc., the operators T TB must appear a s physicaflv connected entities in ?Ae occupation number representations. EA, EB etc., will also then appear a s connected entities - a 5 a consequence of the multi-commutator expansion generated by eqs. (3.8). Since the groupings are
Cluster Expansion Methods in Open-Shell Correlation
30 I
arbitrary, E itself is a connected entity. 0
The linked-cluster theorem for energy, from the above analysis, is a consequence o f the connectivity of T, and the exponential structure for Sl Size-extensivity is thus seen a s a consequence of cluster expansion of the wave function. Specfic realizations o f the situation are provided by the Bruckner-Goldstone MBPT/25,26/, a s indicated by Hubbard/27/, o r in the non-perturbative CC theory a5 indicated by Coester/JO,31/, Kummel/31/, Cizek/32/, Paldus/33/, Bartlett/2l(a)/ and others/30-38/. There are also the earlier approximate many-electron theories like CEPQ/47/, Sinanoglu's Many Electron Theory/28/ or the CI methods with 'cluster correction'/46/. For the open-shell cluster expansion theories, we have the three different choices (all to (as), a 5 described in Sec.2, for ensuring the three different types of size-extensivity. Unlike the closed-shell situation, we have here the core electrons a s well as valence electrons and we have the option of explicitly maintaining the size-extensivity of the core electrons only, or of N + Nv or N and Nv separately.
.
C
For the core-extensive theories (1.e.. with the feature (all), there must be an explicit cluster expansion structure with respect to the core electrons, and no such cluster expansion maintained for the valence electrons. The wave-operator C z should have then either of the following forms 0
Wexp(T)
(5.9)
(2
exp(T)W
(3.10)
where exp(T) has the np-nh cluster expansion structure with respect to the core, a s in eq. 1 3 - 2 1 , and W is a linear combination of several operators, a s in a C I , introducing valence- and core-valence correlations. There may be several ways to realize the operators exp(T) and W /50-63/. Silverstone and Sinanoglu/48/, already in 1965, indicated how a cluster-expansion satisfying sizeextensivity with respect to all the electrons (feature (a211 can be effected. There is also a related work by Roby/49/. Transcribed into occupation number representation, this amounts to a cluster expansion of the closed-shell type with respect to each determinant
Debashis Mukherjee and Sourav Pal
302
in the model space
Yk =
C Cikexp(T i ) & . 1
1
(3.11)
where the set { & . I spans the model space and C . ‘ 5 are k the combining cokfficients for the k-th root.Tkis is the general cluster expansion structure o f a i size-extensive theory with feature fa2). Each T excites various n-fold excitations from the corresponding & . ’ s and there is no subdivision into core correlatio;, valence correlation and core-valence interaction. The various formalisms/64-64/ in this category will differ in their realization o f the wavefunction, eq.(3.11). The size-extensive theories with the feature (a31 can be developed by converting the representation of the operator W from a linear structure ( a s in eq. ( 3 . 9 ) or (3.10)) to an exponential-like cluster structure. There are several ways of achieving this/67-78/, and they represent the various approaches to the solution of the problem. We may mention here that the closed-shell CC theory/32/ results from the corresponding MRPT/25.26/ only when the latter is a Rayleigh-Schrodinger- ( R S ) series. In a Brillouin-Wigner (BW) perturbation expansion, there is a parametric dependence of the wave-operator on the total energy, and no cluster decomposition is possible for non-interacting groups A , B , etc., owing to the dependence o f each of the cluster operatorsoperators on the total energy. Consequently only the R S type o f development in open-shell MBPT theories can be fully size-extensive. Any part of the expansion o f the wave-operator lackinq R S structure will not satisfy size-extensivity/luY/.
4. CATEGORIZATION OF THE OPEN-SHELL MANY BODY FORMALISM5 4.1. The Earlier Open-shell Perturbative Developments The first core-extensive open-shell MBPT / Z i (i.e., with the feature ( a l ) ) was developed b y Bloch
303
Cluster Expansion Methods in Open-Shell Correlation
and Horowitz. In this formulation, there is an energy-dependent effective hamiltonian H ( k E k ) which ( 0 ) e f acts on an appropriate combination @ of the model k space functions to generate k E k - the energy shift with respect to the correlated core. Diagrammatically speaking, n o term of H has a disconnected closed diagram representing ti% core-energy, although the diagrams may be disconnected, since there is n o explicit size-extensivity with respect to valence electrons. The model space is chosen to be complete. Brandow/3,20/ has discussed the Bloch-Horowitz theory in detail, and based his RS MBPT formulation on this a s the starting point. Brandow's theory/3/is the first fully size-extensive many-body formalism with the feature (a3). The diagrams representing the terms of this R S effective hamiltonian Heff(having now no AEk's) are all fully dependence on the unknown connected, indicating that the valence correlation and the core-valence interaction are calculated in a i s known as size-extensive manner. The series for H eff the linked valence expansion/3/. The essential reouirement for arriving at the proof of the above 'linked valence' expansion/;/ was With the assumption that the model space is complete a complete model space, the model space functions separate into proper fragments a s well, and Brandow's theory is thus additionally size-consistent. Brandow chose the intermediate normalization convention for the perturbed functions. It can then be shown that, a s a consequence, is nonhermitian, "ef f whose non-orthogonal eigenvectors are just the model space projections of the exact functionsi3/. There i s a transformation, due to des Cloizeaux/??/, by which an equivalent hermitian effective hamiltonoan can be derived, if desired. There have been several other formulations of RS MBPT/4-9/ with the feature ( a x ) . Brandow/3/ also attempted to generalize his formalism when the model space contains valence particles a 2 well as valence holes. H e implicitly assumed that a model space containing determinants, corresponding to all possible occupancy of fixed number o f valence holes and valence particles among the valence hole orbitals and valence particle orbitals, is also complete and consequently claimed that the linked valence expansion will be valid. This model space, however, is imcomplete_ /20,44,71,94/, since with respect to the real electrons there are still other determinants in which the electrons can be allocated among the valence orbitals in many other ways. Thus, in this case the linked valence expansion does not f f
.
304
Debashis Mukherjee and Sourav Pal
generally hold good, a 5 Brandow himself later pointed out/20/. Hose and Kaldor/44/ had been the first to have analyzed in detail the structure of the R S MBPT in intermediate normalization when the model space is generally incomplete. They showed that there are disconnected diagrams in this case. In their formulation, they abandoned the classification of the orbitals into core, valence and particles, and instead chose each ket @ . on which H acts a s the vacuum. Until very recen$ly /91-96,1F$/, it was not clear whether the disconnected diagrams appearing in the formalism spell a breakdown of the size-extensivity. F \ l s o , since in this formulation, the core energy i s not explicitly subtracted, the size-extensivity- if at all respected- will be of the type (a2). Clearly, if the model space is made complete the formalism will go over to a size-extensive theory of the type (a3), without disconnected diagrams. This may be looked upon a s a perturbative realization of the SilverstoneSinanoglu expansion/40/. Chaudhuri et a1 have recently discussed/l47/ and demonstrated/l46/ that the energies obtained by diagonalizing the H of the Hose-Kaldor ef f formalism entail disconnected terms indicating a breakdown of size-extensivity. We shall discuss again the aspect of size-extensive theories with imcomplete model space in Secs. 4.3 and 8.4, where the emphasis i s on the cluster expansion approach. 4.2.The Propagator Methods The propagator, or equation of motion methods, and the related Green’s function techniques/lO-16/ were primarily designed to generate EE,IP or Et? of a closed shell ground state. The size-extensivity that they ensure is of the type (all, since the self-energy operator-playing a role rather analogous to that of H - i s dependent on the unknown shift AEk, and in tE1s sense they are structurally analogous to the BW series of the Bloch-Horowitz type/?/.The kinship with the Bloch-Horowitz series i s apparent, since there i s again no closed core diagram in the series for the self-energy. There are major structural differences also- notably in their choice of model space (for a discussion on this point, see, e.g., Brandow/81/, Robb et a1/82/and Mukherjee and Kutzelnigg/B4/), but the feature of core-extensivity is shared by both. The formulation of an EOM-type of theory satisfying ff
Cluster Expansion Methods in Open-Shell Correlation
305
size-extensivity is rather recent, and w e refer to the works of Kuo e t a1/83/ and to a forthcoming paper by Mukherjee and Kutzelnigg/B4/ for a detailed discussion of the subject.
4.3
The Open-Shell Cluster Expansion Approach
The development o f the open-shell many-body cluster expansion methods has followed a rather circuitous and tangled route. The early developments by Mukherjee and co-workers/68,69/ directly emphasized the size-extensivity feature (ax). Their5 was essentially a complete model space formulation, and it was tailored for computing both total state energies and energy differences/69/. The formalism necessitates an explicit consideration of the differential correlation energies as more and more electrons are added to the valence orbitals. This added flexibility requires that we need consider not only the desired complete model space for a fixed N but also a l l the lower valence (henceforth called 'x:bduced' model spaces with fewer valence electrons m , (O5m5 N ) a s V well. The formalism is thus not just confined to a specfic Hilbert space of a fixed number of electrons / 6 8 , 6 9 / ,but is defined in a Fock-space which is a direct sum of several Hilbert spaces of different electron numbers. Mukherjee e t a 2 have discussed several Fock-space strategies/68,69/ to arrive a t size-extensive formulations. These methods may be viewed a s open-shell generalizations of the closed-shell CC method. In nuclear physics. Offermann et and Ey/70/ proposed a related but different formalism which also shares the size-extensivity feature (a3), and have the flexibility of describing differential correlation energy. This i s also a Fock space oriented approach. Somewhat later, Lindgren/71/ generalized his formulation of the Open-shell MBPT/4/ to derive an open-shell coupled cluster method which may be viewed as a highly summed u p version of the MBPT. Lindgren started from a complete model space with fixed N and V' looked for a set of equations which will guarantee the connected nature of the cluster amplitudes. 6lthough Lindgren's cluster expansion contained the flexibility of describing the differential correlation energy similar to that in the theory of Mukherjee e t a1 / 6 8 , 6 9 / , he confined h i s development to Hilbert space
Debashir Mukherjee and Sourav Pal
306
.
of fixed N This generally leads to an underdeterzined system of equations for the cluster amplitudes since their number far exceeds the number of equations/66,69/. Lindgren, however, augmented his formalism by a set of sufficiency conditions for ensuring the connectivity of cluster amplitudes which gave a 5 many equations a s are unknowns. Haque and Mukher~ee/69/ commented that these conditions essentially imply adopting a Fock-space strategy, and in fact Haque demonstrated/69/ that Lindgren's equations can be derived by assuming that his waveoperator for the N -valence problems behaves a s a V wave-operator for the lower valence problems. This point has been further clarified in the recent analysis of Lindgren and Mukherjee/94(a)/ and Mukherjee/94( b)/. The next set of open-shell cluster expansion theories to appear on the scene emphasized the size-extensivity feature (al), and all of them were designed to compute energy differences with a fixed number of valence electrons. Several related theories may be described here - (i) the level-shift function approach in a time-dependent CC framework by Monkhorst/56/ and later generalizations by Dalgaard and Monkhorst/57/, also by Takahasi and Paldus/lOS/, ( i i ) the CC-based linear response theory by Mukherjee and Mukherjee/58/, and generalized later by Ghosh et al/59,60,107/.(iii)the closely related formulations by NakatsuJi/50,52/ and Emrich/bZ/ and (iv) variational theories by Paldus e t &/54/ and Saute et a1/55/ and by Nakatsuji/SW. The next two formulations advocated maintaining size-extensivity with respect to the total electron number N , i.e., they subscribed to the size-extensivity feature (a2). The method of Banerjee and Simons/65/ started with a CAS-SCF function and used a cluster expansion inducing correlation of valence electrons only. The core remains frozen in this theory, leading to some computational simplifications, but at the expense of generality. Since only one combination of determinant5 was chosen a s the reference function - a s in an MR-CI-only one root could be calculated. There was thus no Hef in this theory. Extensions o f this approach to include core-excitations were initiated by Laidig e t a1 /115(a)/ and Hoffman and Simons/ll5(b)/. The theory of Jeziorski and Monkhorst/66/ was general, and this sought to compute all the roots of an Heff.defined in a complete model space. The cluster expansion used was the Silverstone-Sinanoglu expansion (eq. (3.ll)j/48/,
Cluster Expansion Methods in Open-Shell Correlation
307
i and each k e t o n which exp(T 1 a c t s w a s taken as the vacuum in the s a m e spirit a 5 that of Hose and Kaldor/44/. Heff w a s proved to be connected by a rather involved algebraic route. All these theories work in Hilbert space of a fixed number of electrons and a r e thus conceptually distinct from t h e Fock-space based theories of t h e t y p e (a3). Kutzelniggi76i and Kutzelnigg and Koch/77/ revived the cluster expansion approach of t h e t y p e (a31 in a series of papers where t h e n a m e 'Fock S p a c e Quantum Chemistry' appeared first. They emphasized that t h e structure of H in a occupation number representation is much simpler than in a Hilbert s p a c e s i n c e t h i s i s independent of the number of electrons a n d , o n c e the valence orbitals a r e identified, a s i m i l a r ' t y -i transformation of H by C2 c a n bring L =O HC2 t o a form where valence orbitals c o u p l e only w i t h the valence orbitals through L. A diagonalization of this part of L having valence o r b i t a l s only in a model space of a specified N will then furnish selected V eigenvalues of H for an N -valence problem. In effect, V this means that Qne d e f i n e s C2 in Fock s p a c e , transforms H t o L in Fock s p a c e , and projects o n t o a N -valence Hilbert s p a c e t o finally construct model V H o n t h e Hilbert space. Operationally speaking, tKis method advocates the s a m e strategy a5 that expounded by Mukherjee e t a1/67-691 o r Lindgren/71/ a 5 amplified by Haque and Mukherjee/69/. But there is a subtle difference. While t h e earlier works/67-69/ used explicit determinantal projectors for the equations determining t h e cluster amplitudes and Heff,Kutzelnigg and Koch emphasized the operator aspects of the s a m e equations without determinantal proJectors. T h e Fock-space strategy is t h u s more explicit in the latter. Kutzelnigg and Koch/77/ considered complete model s p a c e theories o n l y , with valence orbitals as o f the particle type. A related approach containing holes and particles w a s initiated by Stotarczyk and Monkhorsti85/, which, however, -strictly speaking- i s n o t an effective hamiltonian theory ( s e e , e.g., Lindgren and Mukherjee/94(a)/ for a critique o n this point).See also S i n h a e t al/ll9/. It should be emphasized here that the convenience of working in a complete model s p a c e for arriving at the connectedness of H w a s first s h o w n by eff Brandow/3/ and t h i s view-point h a s s i n c e influenced most of t h e later developments of t h e open-shell MBPT and CC theory. When t h e valence orbitals a r e rather well-spaced in energy, this may c r e a t e a s e r i o u s numerical instability in actual applications. Whenever f
f
308
Debashis Mukherjee and Sourav Pal
there are some virtual space determinants close in energy to some model space functions, they will mix strongly resulting in the divergence o f the perturbation series for H or an ill-conditioning o f ef f the open-shell CC equations. The offending virtual determinants for which such strong mixing occurs are called 'intruder states'/86/. Their presence has been a perennial problem in both nuclear physics/Bb/ and quantum chemistry/44,87/. Fls we have already discussed, Hose and Kaldor/44/ formulated an incomplete model space version of MBPT. Their motivation was to include only those functions in the model space which do not mix too strongly with the virtual space functions. Haque and Mukherjee/88/ formulated a hermitian version of MBPT for incomplete model space. Jeziorski and Monkhorst/bb/ indicated how their CC formulation for the complete model space may be generalized to the incomplete model space situation leading to the CC analogue o f the Hose-Kaldor MBPT/44/. In all these developments, disconnected diagrams appear in H and they are not sizeeff' extensive/146,147/. Recently, Lindgren/88/ introduced the concept o f a quasi-complete model space, which is a very special type o f incomplete model space, for which he claimed i s connected. In a quasi-complete model that H space,eg6e valence orbitals are bunched in different groups, the occupancy of orbitals in each group is held fixed and the various possible occupancies within each group are exhausted. Looked at from this classification, a model space consisting of nh-mp determinants- as discussed by Brandow/3,20/ in his generalized MBPT for valence holes and particles- is quasi-complete , with valence holes and valence particles in different groups. It is known that Heff is disconnected in this case.Unfortunately for a general quasi-complete caie also, H is not connected/90/. Size-extensivity of Phe energies is thus not respected in these formalisms/f46,147/. Very recently, Mukherjee/?l/ made a re-investigation of the problem of the appearance o f disconnected terms in H for nh-np determinants as ef f constituting a model space, and traced the origin o f the disconnected to the tacit assumption of maintaininq the intermediate normalization o f the eigenfunctions. By abandoning the intermediate normalization, and by adopting a diferent normalization, Mukherjee proved that there are no Using this same strategy, disconnected terms in H ef f ' Mukherjee/92/ proved for he first time that for a f f
Cluster Expansion Methods in Open-Shell Correlation
309
general incomplete model space a connected H results if the intermediate normalization iseff abandoned. We may term those normalization conventions is connected a 5 size-extensive for which H ef f normalizations /92,93/.To prove the connectedness of the cluster amplitudes and also of H f f , it was found essential to adopt a Fock-space straeegy and to postulate that the same wave-operator 0 applies to all.the various valence sectors of the Fock space. The wave-operator is thus 'valence-universal'. Connected MBPT and CC theory for a general model space from a unified viewpoint has also been presented by MukherJee/93/. Some further generalizations and the necessary and sufficient conditions for arriving at a connected H have been discussed by Lindgren and Mukherjee/SZ?f The generalizations of /76/ and / 7 7 / to tackle incomplete model space were done by Kutzelnigg et a1/95/ and Mukherjee et & / 9 6 / , where a special type of incomplete model space - called the 'isolated incomplete model space'/95/ - has been introduced which possesses the interesting property of supporting a kind of intermediate normalization a s well a 5 generating a connected H . There has been considerable activity inef&is field over the last two years/138-147/. 4.4 Certain General Comments
For the analysis of the various formalisms, manipulation of the equations, generating normal product of terms via Wick's theorem, and particularly for indicating how the proofs of the several different linked cluster theorems are achieved, we shall make frequent use of diagrams. For the sake of uniformity, we shall mostly adhere to the Hugenholtz convention/l/. All the constituents of the diagrams will be operators in normal order with respect to suitable closed-shell determinant taken a s the vacuum. We shall refer to the creation/annihilation operators with respect to this vacuum after the h-p transformation.The hamiltonian H will also be taken to be in normal order with respect to @ : 0
- 7 -
Debashis Mukherjee and Sourav Pal
310
Here { . . ) dendtes the normal ordering, and and~.ar the e one- and antisymmetrized twoelectron integrals. Their skeletons are depicted in A,B, etc., stand for general spin orbitals. We Fig.1. shall denote the orbitals by the Greek letters u ,(3 , etc_., particle orbitals by ~ , q , e t c ._hole ~ valence by a,@, etc., and particle valence by p,q, etc. Fig. 2 shows a typical classification scheme containing valence holes and valence particles. In a parallel terminology, the valence orbitals (valence holes and/or valence particles) are called 'active' and the non-valence holes and particles are called 'inactive'/B/. Fig. 2 displays this terminology a5 well. We shall sometimes use these two terminologies interchangeably. The inactive holes and particles are indicated by single arrows on the lines in the usual manner. The active holes and .particles are depicted a 5 lines with double arrows. These are illustrated in Fig. 3. We shall also choose the convention o f drawing the diagrams horizontally whereby the sequence of operators appearing from right to left will be depicted a s a sequence o f appropriate vertices sitting in the same order starting from the right.
5.SEMI-CLUSTER EXPRNSION THEORIES FOR THE OPEN-SHELL STATES General Physical Considerations
5.1
When the number of valence electrons are far less in comparison to the number of core electrons, the bulk of the size-extensivity o f the total energy
a.
b.
Fig. 1
( a ) The skeletons for the operator h. ( b ) Corresponding skeletons for v
Cluster Expansion Methods in Open-Shell Correlation
valence
valence
31 I
- (active)
holes
-
I inactive
I
Fig.2
Classification of orbitals into hole5 and particles. Also displayed are further divisions into inactive and active orbitals.
Fig.3
The arrow conventions for inactive and active holes and particles.
of an open-shell state stems from the size-extensivity of the core-electron energy. This is also the situation where the major portion of the electron correlation remains relatively unchanged on going over from the core system to the open-shell state. For example,for the low-lying valence excited or ionized states derived from a closed shell ground state, it i s a good approximatin to hold that the bulk of electron correlation remains unaffected. The inclusion of the correlation between the valence electrons and the core-valence interaction can be taken care of by a V valence wave-operator W Thus for a fixed number of valence electrons, if we introduce a model space then a spanned by a set o f M determinants Cli), semi-cluster representation o f the associated exact functions \Yk,incorporating a size-extensive description of the core-correlation, can be given in the following manner:
.
Debashis Mukherjee and Sourav Pal
312
M 1 ' 1
M
(5.1.1) C
where W = e x p ( T ) i s t h e wave-operator o f t h e core Soand each (6. i s o b t a i n e d by t h e a c t i o n o f N v a l e n c e I t c r e a t i o n o p e r a t o r s ( v a l e n c e h o l e s and/or p a r x i c l e s ) Y . on t h e c o r e H t a k e n as t h e vacuum : 0'
@.
I
= Yi t
a0
(5.1.2)
Wv i s a c o m b i n a t i o n o f v a r i o u s nh-np e x c i t a t i o n s tfrom t h e s e t {ai}. I f we d e n o t e t h e s e t of o p e r a t o r s Y . a s o f t h e t y p e W1, t h e n t h e o p e r a t o r s i n d u c i n g nh-np' e x c i t a t i o n s can be s a i d t o be o f t h e t y p e s W l+n. w r i t i n g eq.(5.1.1) i n l o n g hand, we f i n d M
(5.1.3)
where t h e v a r i o u s nh-np e x c i t e d d e t e r m i n a n t s { @ 1 w i t h r e p e c t t o P.'s a r e g e n e r a t e d by t h e o p e r a t o r m g i f o l d CY 1 c n t a i n s d i n t h e s e t CWl+nl. A s the s e t s {Vil an8 CY}, contain creation o p e r a t o r s o n l y , they commute w i t h Wc, and we can r e w r i t e eq.(5.1.3) as
5
w
0
'yk
=
Rk 'y,
(5.1.4)
where \y = WcQo i s t h e c o r r e l a t e d c o r e , and C l k ' e x c i t a ? i o n o p e r a t o r ' d e f i n e d by t h e r e l a t i o n
is an
Eq.(5.1.5) i n d i c a t e s t h a t Yk r e c e i v e s c o n t r i b u t i o n from the c o r r e l a t i n g v i r t u a l d e t e r m i n a n t s CSm) i n two ways : ( a ) c o n t r i b u t i o n o r i g i n a t i n g from t h e set } g e n e r a t i n g oh-np e x c i t a t i o n s o u t o f {ii) ,and a t from t h e p r o d u c t e x c i t a t i o n s o f t h e form W g e n e r a t i n g a d d i t i o n a l mh-mp e x c i t a t i o n s . I ++n t h e c o r e - r e o r g a n i z a t i o n and c o r e - c o r r e l a t i o n changes a t t e n d a n t w i t h t h e a d d i t i o n o f v a l e n c e h o l e s a n d / o r p a r t i c l e s a r e n o t v e r y severe, t h e n a
Cluster Expansion Methods in Open-Shell Correlation
313
t r u n c a t i o n of t h e s e t {W } t o a f e w s m a l l n will b e a good a p p r o x i m a t i o n , a d + " m o s t of t h e a d d i t i o n a l c o r r e l a t i o n will b e well-simul ted by p r o d u c t e x c i t a t i o n s of t h e t y p e '1+nT For generating the working equations for the v a r i o u s semi-cluster e x p a n s i o n f o r m a l i s m s , i t will b e c o n v e n i e n t t o d e n o t e b y t h e s e t {B ) t h e u n i o n o f {B.) 1 and { S } , and t h e a s s o c i a t e d o p e r a t o r m a n i f o l d as { Yz}.mThe u n k n o w n q u a n t i t i e s a r e t h e s e t s {C. 1 and ik {C,,} denoted henceforth collectively a s and C_={C 1 . W e may e n v i s a g e s e v e r a l a p p r o a c h e s f o r t h e i r ak . determination.
r:
5.2
Variational M e t h o d s
Fls t h e n a m e indicates, t h e s e m e t h o d s d e t e r m i n e t h e c o e f f i c i e n t s C by a n appeal t o t h e variational p r i n c i p l e for t h e s t a t e e n e r g i e s E W e shall discuss k' t w o principal f o r m a l i s m s w h i c h m a k e u s e o f t h i s strategy. In o n e v a r i a n t , d u e t o P a l d u s e t &/54/ and S a u t e et &/55/, a linked c l u s t e r t h e o r e m i s proved whereby t h e c o r r e l a t e d core-energy, E k c o m e s a 5 a s u m of E 0' and a r a t i o of t h e f o r m N /D , w h e r e N and D k k are c o n n e c t e d e n t i t i e s ( s e e ei.(k.2.4 ) ) :
Ek
= ?
kl
H
IW k > f < W k l W k >
= Eo + N k / D k
(5.2.1 1
Eq.(5.2.1) i n d i c a t e s that e a c h E h a 5 a c o n s t a n t k S i n c e 9 i s a s s u m e d t o b e known, energy s h f t E 0 the variaiona? p r i n c i p l e s f o r Ek a n d AEk ( t h e e n e r g y is core-extensive d i f f e r e n c e ) a r e synonymous. % since thee are n o disconnected terms containing E
.
0
.
W i t h Ze a s t h e Hartree-Fock f u n c t i o n f o r t h e ground sta?e, eq.(5.2.1) can generate a connected e x p r e s s i o n for s u c h e n e r g y d i f f e r e n c e s a s IP, E A o r EE d i r e c t l y , d e p e n d i n g on t h e m a n i f o l d in
Debashir Mukherjee and Sourav Pal
314
Eab = <So lWCtYaHY2WC lHo>c
(5.2.3)
b
eq.(5.2.1)
(5.2.2)
c a n be a l t e r n a t i v e l y w r i t t e n as
AEk = C t kXk
/
CkACk -t-
(5.2.4)
w h i c h i s r a t i o o f c o n n e c t e d b i l i n e a r h e r m i t i a n forms. Variation of AEk w i t h r e s p e c t t o t h e c o e f f i c i e n t s C k leads t o a n e i g e n v a l u e e q u a t i o n of t h e f o r m :
E Ck = AEk ACk
(5.2.5)
Using a perturbative analysis t o estimate the i m p o r t a n c e of v a r i o u s o p e r a t o r s , P a l d u s e t a1 /54/ advocated t h e f o l l o w i n g t r u n c a t i o n scheme:
(5.2.6)
w h e r e 0 = H and t h e u n i t o p e r a t o r I for X . and A respectively. S a u t e e t a 1 / 5 5 / applied t h i s t r u n c t i o n s c h e m e for IP,EA and EE c a l c u l a t i o n s o f model n-systems in Pariser-Parr-Pople framework, and s h o w e d t h e v i a b i l i t y of t h e formalism. NakatsujiiSO/ developed an alternative variant, w h e r e e s s e n t i a l l y t h e s a m e a n s a t z a s eq. (5.1.4) i s invoked t h o u g h n o a t t e m p t w a s m a d e t o d e r i v e a linked c l u s t e r t h e o r e m in a n e x p l i c i t manner. A s s u m i n g that t h e c l u s t e r a m p l i t u d e s o f t h e o p e r a t o r T a r e known, t h e c o e f f i c i e n t s Ck a r e d e t e r m i n e d d i r e c t l y from a variation o f t h e e x p r e s s i o n <*k
I
H-Ek
= 0
(5.2.7)
leading t o a n e i g e n v a l u e e q u a t i o n of t h e f o r m
N a k a t s u j i also s h o w e d , s o m e w h a t i n t h e line of P a l d u s e t a1/54/, that n o g e n e r a l i t y i s lost in a s s u m i n g that
Cluster Expansion Methods in Open-Shell Correlation
9
k
and Y
0
315
a r e orthogonal.
In actual numerical implementations, Nakatsuji and h i s co-work.ers/50/ used an a p p r o x i m a t i o scheqe? in w h i c h matrix-elements of t h e t y p ea r e neglected, s i n c e they a r e both higher o r d e r in e f f e c t and d i f f i c u l t t o compute. IP and EE c o m p u t a t i o n s o n several prototypical s y s t e m s produced encouraging results. It, however, a p p e a r s that Nakatsuji in h i s later applications/50-52/ tended t o favour a non-variational, projection methodpresumably in v i e w of i t s r e l a t i v e e a s e o f applications. T h i s h a s been covered in Sec.5.3. 5.3.Non-Variational
Methods
T h e r e a r e several non-variational semi-cluster expansion f o r m a l i s m s of s e e m i n g l y d i s p a r a t e structures/56-63/. All but o n e of them a r e closely related/%-621, and they all a s s u m e t h e a n s a t z , eq.(5.1.4), e i t h e r implicitly o r explicitly. T h e s e t h e o r i e s u s e t h e projection method for o b t a i n i n g C k’ w h i c h g e n e r a t e s a nonhermitian e i g e n v a l u e problem, w i t h AE ‘ 5 a s t h e eigenvalues. A different a n s a t z for qk, leaking t o a hermitian eigenproblem h a s a l s o been suggested/63/. M u k h e r j e e and M u k h e r j e e proposed what they termed a 5 a CC-based linear r e s p o n s e theory ( C C - L R T ) / 5 8 / . The method i n v o k e s t h e a p p a r a t u s of t h e linear r e s p o n s e formalism for c a l c u l a t i n g e n e r g y d i f f e r e n c e s directly. T h e essential idea i s t o c o m p u t e t h e d y n a m i c linear r e s p o n s e of a closed-shell ground s t a t e s u b j e c t e d t o a coupling w i t h a photon field of frequency (rs, and o b t a i n t h e e n e r g y d i f f e r e n c e s a s poles o f t h e linear r e s p o n s e function a s a function o f cd. T h e interaction hamiltonian H . is taken t o be o f t h e form int “int = iv
el
CC-C+I
(5.3.1)
w h e r e V el i n v o l v e s e l e c t r o n i c o p e r a t o r s o n l y , and C / C .? a r e annihilation/creation o p e r a t o r s for a photon field of frequency cd. In t h e a b s e n c e of a c o u p l i n g , t h e c o m p o s i t e s y s t e m of t h e molecular ground s t a t e and photon s a t i s f i e s t h e e q u a t i o n s (5.3.2)
Debashis Mukherjee and Sourav Pal
316
H
Ph
= o C tC
(5.3.3)
X
i s a n e i g e n s t a t e of H w i t h e i g e n v a l u e n G). *O i g h a s s u m e d t o b e o f t h e p h CC-form. In t h e p r e s e n c e o f interaction, t h e e i g e n s t a t e of t h e c o m p o s i t e s y s t e m will n o t h a v e a d e f i n i t e n u m b e r o f photons, and t h e e x t e n t o f c o r r e l a t i o n a s a r e s u l t of c o u p l i n g will a l s o b e modified. T h e s e c h a n g e s c a n b e induced by t h e action of a second cluster operator of the exponential form : exp(S1. T h e o p e r a t o r S d e s t r o y s / c r e a t e s zero, one, two,..., p h o t o n s and s i m u l t a n e o u s l y i n d u c e v a r i o u s nh-rnp e x c i t a t i o n s o u t o f 5 T h e n a t u r e of t h e e l e c t r o n i c part of t h e c l u s t e r o p e p a t o r in S i s d i c t a t e d by t h e n a t u r e of t h e e n e r g y d i f f r e n c e w e a r e 7. interested in. F o r IP/EPI c a l c u l a t i o n s , Ve in eq.(5.3.1) will d e s t r o y / c r e a t e a n e l e c t r o n f r o m *O , and c o n s e q u e n t l y S s h o u l d inv Ive nh-(nT1)p excitations. S i m i l a r l y , for EE, VeP will c o n s e r v e t h e n u m b e r of e l e c t r o n s , and S s h o u l d i n v o l v e nh-np excitations. F o r c o m p u t i n g t h e linear response,+it s u f f i c e s t o r e t a i n o n l y t h e t e r m s linear in C/C :
.
+
w h e r e S- and S a r e t h e e l e c t r o n i c p a r t s of S,a5 discussed above. T o a r r i v e a t t h e w o r k i n g e q u a t i o n s f o r CC-LRT, w e proceed a s follows/58,60/. S t a r t i n g from t h e S c h r o d i n g e r e q u a t i o n in t h e p r e s e n c e of t h e perturbation : CH + H
Ph
+
H.
int
1 e x p ( T + S ) i5 X ~
= E
0 Ph exp(T+S) @ X
*
Ph
(5.3.5)
(5.3.6)
w h e r e t h e s i n g l e bar o p e r a t o r s
6
= exp(-T)
0 exp(T).
6
a r e d e f i n e d by (5.3.7)
If w e n o w premultiply eq. (5.3.6) f u r t h e r by exp(-S), and t r u n c a t e t h e r e s u l t a n t o p e r a t o r s a f t e r t h e t e r m s linear in photon d e s t r u c t i o n / c r e a t i o n o p e r a t o r s ( h e n c e t h e t e r m 'linear' r e s p o n s e theory),
Cluster Expansion Methods in Open-Shell Correlation
317
then w e o b t a i n
is t h e energy of t h e F o m p o s i t e i n t h e where E(') linear r e s p o n s e approximation. S- in S o f eq.(S.3.4) c a n be written in t e r m s o f o p e r a t o r s (YX) : (5.3.9).
+
T o determine the vectors X-, w e project eq.(5.3.8) o n t o the v a r i o u s f u n c t i o n s
<@,ICH,Sf1
+
+ Gel 2 w S - l S
0
>
= 0
-
(5.3 10
Utilizing eq.(S.3.9), and introducing the nonhermitian matrix A and t h e vector V t h r o u g h the re1 at ions
( 5.3.12) we
have
+
[ A f w I J X - + V = O
The coefficients
+
+
X-
(5.3.13)
a r e t h u s g i v e n by
X - = - [ C l _ + w I ]
-1
v
(5.3.14 1
+
Determining X- is e q u i v a l e n t t o d e t e r m i n i n g t h e operator S, and hence of t h e f i r s t o r d e r perturbed function S @ X According t o t h e gfneral theory o f 1 inear r e ~ p o ~t h s e~ amp1 ~ - i t u d e s o f S-, i .e. , t h e v e c t o r s X - , will become s i n g u l a r when w m a t c h e s a n elementary excitation s u c h a 5 IP,EA o r EE. T h i s w 1 1 become happens w h e n e v e r t h e m a t r i c e s [ A singular. If w k ' s a r e e i g e n v a l u e s o f A obtained from
*
a
Debashis Mukherjee and Sourav Pal
318
CI Z k =
Wk
Zk
(5.3.15)
then t h e v e c t o r s X’ will b e c o m e s i n g u l a r for GI = T ak. S i n c e t h e positive p o l e 5 o f t h e linear r e s p o n s e function correspond to a n absorption o f photon, t h e s o l u t i o n of eq.(5.3.15) f u r n i s h e s u s directly w i t h nEk’s..Eq.(5.3.15) i s t h e working e q u a t i o n for CC-LRT. Explicit expression for A w a s derived for EE in 1581 and foy IP in /59/. L e t u s n o t e that t h e actual form for Ve h a s n o t been needed t o g e t wk’s. It may n o t b e immediately apparentthat CC-LRT d o e s indeed u t i l i z e t h e ansatz, eq.(5.1.4) To see t h i s explicitly, w e f o l l o w t h e a n a l y s i s o f G h o s h e t c / 6 0 / . Using t h e ansatz, eq.(5.1.4), w e f rst write an equation for AE a s in EOM/13,15,16/ k (
5.3.16)
S i n c e Wc and fik c o m m u t e , w e find from eq.(5.3.16), using eq.(5.3.7), t h e relation (
5.3.17 )
I
P r o j e c t i n g o n t o t h e s t a t e s < H . and using eqs.(5.1.5) and (5.3.11) w e hive A Ck = AEk Ck
(5.3.18)
w h i c h e s t a b l i s h e s that t h e energy obtained f r o m CC-LRT is indeed t h e s a m e a s that obtained f r o m an EOM strategy using t h e method o f projection and t h e ansatz, eq.(5.1.4). CC-LRT w a s derived in t h i s a l t e r n a t i v e manner by Adnan e t a1/97/ and G h o s h e t al/bO/, w h i c h s h o w s that CC-LRT may be viewed a 5 a renormalized nh-mp T D A ; t h e renomalization e n t e r s v i a t h e dressed molecular hamiltonian incorporating t h e ground s t a t e correlation. It should b e mentioned that Emrich/62/, in t h e c o n t e x t o f many-body n u c l e a r s t r u c t u r e theory, a l s o arrived a t eq.(5.3.18) f o r EE, but unfortunately h e missed t h e e a r l i e r works. T h e r e i s also a related e a r l i e r paper by Coester/lZ8/. P i l o t numerical a p p l i c a t i o n s o f CC-LRT w e r e performed by fidnan e t a1/98/ for s i n g l e t E E , and by G h o s h e t a 1 / 5 9 / for IP c a l c u l a t i o n s o n model ne l e c t r o n systems. G h o s h e t a1/58/ a l s o spin-adapted t h e formalism for EE t o e n c o m p a s s both s i n g l e t and
Cluster Expansion Methods in Open-Shell Correlation
319
triplet excitations, and pilot calculations with this scheme were performed by Adnan e t a&/97/. In the abinitio framework, the method has since been applied by Mukhopadhyay et a.3/99/ and Roy et a1/100/ for IP calculations covering both the outer and inner valence region. The theory was able to predict the satellite spcetra in the inner valence region of molecules l i k e Nitrogen and Water satisfactorily. In these applications, the earlier strategy of Adnan e t a&/97,98/ and Ghosh et a1/59/ was utilized. T t T, approximation was used and was computed from the closed shell CC theory. The matrix-elements o f are computed next. The construction of H i s facilitated by the diagrammatic handling of the various terms. As typical examples, the one-body hole-hole term of H and the two-body p-h - p-h term of H are shown in Figs. 4(a) and 4(b). In general, any term of H of a given shape (dictated by the nature of the lines joined to it) i s obtained a 5 a collection of connected diagrams of the same shape obtained by joining H and T vertices in every possible manner. Mukherjee amd Mukherjee/58/ listed she expression for all the one- and-two-body terms of H , and denoted_ them respectively by F and V /lOl/. Once T is known, F and can be computed once for all and stored. Three and higher body terms of H should, in principle, be included but they have been neglected as yet. The diagrams entering the matrix CI for IP and EE, with nk confined to Wland W2 manifolds,are shown in Figs.5. Closely related to CC-LRT is the non-variational formulation of Nakatsuji/S0-52/ and Hirao/S3/. Starting with the ansatz, eq.(5.1.4), they projected the Schrodinger equation for onto the functions
I,
Extensive numerical investigations of this formalism were undertaken by NakatsuJi/52/ and Hirao/53/ for I P and EE computations. For IP calculations, the operator manifold taken by them were and WZ,.and product excitations of the form W T W1 1 2 were also included. A similar approximation scheme for EE was d l 5 0 used, although all the spin-adapted W operators for the triplet EE calculations were no? included. This should be contrasted with the scheme o f
Debashis Mukherjee and Sourav Pal
320
- = - +
a.
><=x+@
b.
Fig.4
(a) Typical t e r m s o f the h-h matrix elements of F . ( b ) ph-ph matrix elements of
The
components
w,
Matrix
ionization
excitation
- w;'
-
w, w;
w,-
w*
Fig.5
w?
- w:
Typical matrix elements o f CI coupling various blocks f o r I P and EE computations.
Cluster Expansion Methods in Open-Shell Correlation
321
G h o s h e t a1/58/, w h e r e t h e full s p a c e of W utilized. S i n c e o n l y a f e w e i g e n v a l u e s and 2 t h e c o r r e s p o n d i n g e i g e n v e c t o r s o f eq.(5.3.19) a r e of interest, H i r a o and N a k a t s u j i / l 0 2 / d e v e l o p e d a g e n e r a l i z a t i o n of Davidson's a l g o r i t h m / l 0 3 / f o r n o n s y m m e t r i c matrices. H i r a o / 5 3 / a l s o d e v i s e d a d i r e c t cl s t e r e x p a n ion m e t h o d for c o n s t r u c t i n g t h e vector Yr 3,. CIC , w h e r e C is a trial e i g e n v e c t o r , f r o m t h e molecular integrals directly a s an intermediate step for t h e i t e r a t i v e s o l u t i o n of t h e e i g e n v a l u e equation. ( T h e f e a s i b i l i t y of s u c h a p r o c e d u r e in CC-LRT w a s emphasized e v e n e a r l i e r by FIdnan e t a1/98/.) Inner and o u t e r v a l e n c e IP and v a l e n c e and R y d b e r g e x c i t e d s t a t e s of s e v e r a l prototypical s y s t e m s w e r e c o m p u t e d /50-53/. T h e non-variational f o r m a l i s m s d e s c r i b e d a b o v e g e n e r a t e non-hermitian eigenproblems. A h e r m i t i a n v e r s i o n o f CC-LRT w a s s u g g e s t e d in a p r o p a g a t o r theory f r a m e w o r k by P r a s a d e t a1/63/.The f o r m a l i s m , however, i s m o r e c o m p l e x than CC-LRT and i n v o l v e s m o r e c o m p u t a t i o n a l work f o r t h e s a m e d e g r e e of accuracy. T h e o r i g i n of t h e non-hermiticity of t h e matri? A in CC-LRT c a n b e traced t o t h e non-hermiticity o f H w h i c h i s o b t a i n e d from H by a s i m i l a r i t y , r a t h e r than u n i t a r y , t r a n s f o r m a t i o n v i a exp(T). P r a s a d e t a1/63/ s u g g e s t e d instead a u n i t a r y c l u s t e r o p e r a t o r e x p ( u ) : for c o r r e l a t i n g i 0
'Yo = exp(o)S0
with o = T
-
T
*
(5.3.20)
W e shall r e v i e w n e x t t h e time-dependent v e r s i o n of t h e C C theory of M o n k h o r s t / 5 6 / w h i c h a l s o g e n e r a t e s a linear r e s p o n s e f u n c t i o n c l o s e l y c o r r e s p o n d i n g t o t h e CC-LRT r e s p o n s e function. T h i s f o r m a l i s m a l s o t h u s f a l l s in t h e c a t e g o r y of non-variational method. T o u n d e r l i n e t h e s i m i l a r i t y of t h i s m e t h o d w i t h CC-LRT, w e shall d e n o t e a n a l o g o u s e n t i t i e s by t h e s a m e s y m b o l s and n o c o n f u s i o n s h o u l d a r i s e if t h e c o n t e x t i s remembered. I f t h e m o l e c u l e in i t s g r o u n d s t a t e i s s u b j e c t e d t o a c o u p l i n g w i t h a time-dependent field w i t h t h e interaction hamiltonian : H.
Int
-
(t) =
vel
Eexp(iwt) + exp(-ic,lt)l e x p ( a t ) (5.3.21)
with a O+, corresponding to adiabatic switching from i n f i n t e past, t h e time-dependent S c h r o d i n g e r e q u a t i o n in p r e s e n c e of t h e c o u p l i n g i s g i v e n by
Debashir Mukherjee and Sourav Pal
322
Eq.(5.3.21)
indicates that ( 5.3.23)
F o l l o w i n g Langhoff e t a1/104/, Monkhorst/%/ separated t h e s e c u l a r and n o r m a l i z a t i o n terms i n * ( t ) b y f a c t o r i z i n g U(t,-m) i n t o a regular p a r t Ur(t,-m) and a vacuum term < ~ o ~ u ~ : *o>
The vacuum term g e n e r a t e s t h e complex l e v e l s h i f t i n v o l v i n g t h e s e c u l a r and n o r m a l i z a t i o n components :
= exp[-i(E
<*olU(t,-m)[Yo>
0
t + e(t)J
(5.3.25)
where e ( t ) i s g i v e n by
t
e(t
+l
= Lt
a* U s i n g eq
<*olHint
U( t
,-m) I @ o > d t (5.3.26)
-m
(5.3.26).
[ H + Hint(t)
eq.(5.3.22)
- (Eo +
may be w r i t t e n
as
e ( t ) ) l U r ' @ o = iacl ( t , - m ) / 8 t * r 0
(5.3.27) Monkhorst/Sb/ ( s e e a l s o Dalgaard and M o n k h o r s t / 5 7 / ) p o s t u l a t e d t h e f o l l o w i n g ansatz u (t,-m) :
= exp(S(t))
Ur(t,-m)
for
(5.3.28)
w i t h S a5 v a r i o u s nh-np e x c i t a t i o n s . T h i s , coupled w i t h t h e CC ansatz f o r a0, l e a d s t o t h e time-dependent CC equation :
CH
+
Hint(t)
- (Eo + e ( t ) ) l e x p ( T + S ( t ) ) P o
= i S ( t ) / B t exp(T+ S ( t ) ) a o To compute t h e l i n e a r
response f u n c t i o n ,
(5.3.29) i t suffices
323
Cluster Expansion Methods in Open-Shell Correlation
to expand S(t) up to the harmonic terms involving S(t)
I,
S+exp(iwt) + S-exp(-iot)
(
~3
:
5.3.30)
Premultiplying eq.(5.3.39) by exp(-T-S), retaining terms up to those linear in S , and projecting onto the states
<@,I
CH,S-+
3 +
Gel
2 w
+ sISo>
= 0
(5.3.31)
which Show5 that the linear response function obtained from S(t)9 will have the same pole-structure 0 a s that obtained from CC-LRT, since eq.(5.3.31) is entirely equivalent to eq.(5.3.10) of CC-LRT. Monkhorst/56/ discussed only the case of EE calculation and did not elaborate on the scheme further. The computation o f the transition moments was also touched upon, and later elaborated by Dalgaard and Monkhorst/57/. It should be mentioned here that the computation of the transition moments in the context of residues of dynamical polarizability was also indicated in the CC-LRT framework by Mukherjee and Mukherjee/58/. The time-dependent linear response version may be viewed a s the fourier transformed counter part of the CC-LRT. The proper treatment of the secular terms in the time dependent treatment i s a rather tricky business. This i s entirely bypassed in a time-independent approach. Takahashi and Paldus /105/ recently formulated the spin-adapted version of the time-dependent formalism for computing singlet as well a s triplet EE. In a pilot n-electron calculation, they computed the full with the two-body term of T. They did not precompute H , however, which is itself a scalar in both point group and spin. They ins ead spin-adapted each diagram containing H , T and W / W ). They noted the close similarity of tgeir approach with that o f Ghosh et a1 /50/ and endorsed Ghosh et a1 /58-60/ 02 the desirability o f a prior computation of H. Calculations on the singlet and triplet EE with S 1: S have earlier been performed by Sekino and Bartlett/bl?. Recently Geerstsen and Oddershede /106/ utilized the CC ground state function for calculating EE using polarization propagator. This approach may also be viewed a s a semi-cluster expansion strategy. However a clear connection with the wave function approach is difficult to establish in a propagator theory (unless it is consistent /63,84/) and we shall not elaborate further on this theory.
.
Debashis Mukherjee and Sourav Pal
324
6.FULL C L U S T E R E X P A N S I O N T H E O R I E S IN H I L B E R T S P A C E 6.l.General
Preliminaries
A s e x p l a i n e d in Sec.2, t h e full cluster-expansion t h e o r i e s in H i l b e r t s p a c e a r e d e s i g n e d t o c o m p u t e wave-functions f o r t h e open-shell s t a t e s t h a t a r e e x p l i c i t l y size-extensive w i t h r e s p e c t t o t h e total n u m b e r of e l e c t r o n s N. T h e u n d e r l y i n g c l u s t e r s t r u c t u r e of all t h e s e d e v e l o p m e n t s i s w h a t w a s e n v i s a g e d by S i l v e r s t o n e and Sinanoglu/48/:
wk
M
=
c cik e x p ( T i )
(6.1 - 1 )
i=l
A s d i s c u s s e d in S e c 2., t h e m o s t c o m m o n l y u s e d s t r a t e g y i s t o t a k e a s e t of model s p a c e f u n c t i o n s {ai) t h a t i s 'complete' w i t h r e s p e c t . t o v a l e n c e occupancies. T h e c l u s t e r o p e r a t o r s T 1 in t h a t c a s e g e n e r a t e f r o m t h e model s p a c e d e t e r m i n a n t s C . the 1' virtual d e t e r m i n a n t s X 1 by e x c i t i n g c o r e o r b i t a l s t o v a l e n c e / p a r t i c l e o r b i t a l s a n d , in addition, s o m e Since valence orbitals t o particle orbitals it i s p o s s i b l e t o p r o d u c e a particular d e t e r m i n a n t X 1 from s e v e r a l P . ' s by.the a c t i o n of t h e a p p r o p r i a t e c o m p o n e n t s o f t h e T 1 o p e r a t o r s , t h e overall a m p l i t u d e of a particular X comes out a s a combination o f several T 1 amplitudes. If w e e x p l i c i t l y i n d i c a t e by a n a d d i t i o n a l label 1 t h e d e t e r m i n a n t X that i s 1 1 g e n e r a t e d by a T , t h e n w e may w r i t e
.
(6.1 - 2 )
where
U s i n g eqs. (6.1.2) and (6.1.31, written a s
wk
=
c c.ik a!.I i
+
cc 1 i
eq.(b.l.l)
ECik t i +
Let u s now note carefully
x1
+
c a n be
.... (6.1.4)
the chief structural
Cluster Expansion Methods in Open-Shell Correlation
325
d i f f e r e n c e between t h e c l u s t e r e x p a n s i o n f o r t h e c l o s e d s h e l l s and t h e S i l v e r s t o n e - S i n a n o g l u c l u s t e r e x p a n s i o n for t h e o p e n shells. If w e c o n f i n e o u r a t t e n t i o n t o o n l y o n e s t a t e Yk, c o n f i g u r a t i o n a l coefficients d . accompanying the determinants X i k 1 are t h e only u n k n o w n v a r i a b l e s a p p e a r i n g in t h e S c h r o d i n q e r e q u a t i o n f o r 1l Eq.(6.1.4) s h o w s that k' t h e s e c o e f f i c i e n t s a r e T o m e p a r t i c u l a r linear amplitudes. T h e s e t o f c o m b i n a t i o n s of t h e t" a m p l i t u d e s {t"') for all i and 1 a r e t h u s linearly d e p e n d e n t in *k and t h e r e i s n o d y n a m i c a l way t o d e t e r m i n e them from t h e Schrijdinger e q u a t i o n f o r Y k only. In t h e i r a p p l i c a t i o n s / 4 8 , 1 1 0 / , S i l v e r s t o n e and Sinanoqlu used an 'anonymous parentaqe o x i m a t i o n ' s c h e m e w h e r e they a s s u m e d that all t h e a m p l i t u d e s f o r a g i v e n 1 a r e i n d e p e n d e n t of i, w h i c h reduced t h e n u m b e r of v a r i a b l e s in T to t h e number of independent coefficients of X 1 of YkIntroducing p r o j e c t o r s P . = I Q i > < H i l , eq.(b.l.l) be w r i t t e n a s
y'
M
(6.1.5) where M ..
9; =
c
Cik H i i=1
,
(6.1.6)
M 0 =
C [ e x p T i ) ]Pi
(6.1.7)
i=l In t h e a n o n y m o u s p a r e n t a g e a p p r o x i m a t i o n , T all i, and eq.(6 1.7) r e d u c e s t o 0
' I
exp(T)
i
==
T for
(6.1.8)
w h i c h h a s t h e f o r m of t h e c l u s t e r e x p a n s i o n f o r t h e c l o s e d shells. S i l v e r s t o n e and S i n a n o g l u a d v o c a t e d t h e c h o i c e of *o a s t h e CAS-SCF f u n c t i o n ( c a l l e d by t h e m k a s t h e g e n e r a l i z e d RHF f u n c t i o n /llO/). S i n c e T ' s c a n g e n e r a t e o n l y t h e virtual s e t CX 1 , t h e c o e f f i c i e n t s 1 {Cik} a c c o m p a n y i n g {iPi} in t h e e x a c t Yk r e m a i n frozen their CAS-SCF values. T h i s i n t r o d u c e s an a d d i t i o n a l approximation. A n o t h e r way t o c i r c u m v e n t t h e linear d e p e n d e n c y of t h e T a m p l i t u d e s i s s u g g e s t e d by t h e m o r e r e c e n t M u l t i r e f e r e n c e CI (MR-CI) developments/lll,ll2/. In a n
326
Debashis Mukherjee and Sourav Pal
MR-CI approach, one superposes virtual determinants X
that are obtained by single, double,.. excitations out of the set.{$.) spanning the reference space. D noting it the various oserators in ucing excitations a s Y' m' is found that the set { Y @ . , Vi and m) i s linearly m dependent, and once must choose only a linearly independent subset thereof for performing the CI. an analogous cluster expansion can thus be envisaged, where fi is taken to be of the form eq.(6.1.8), but T is now confined to an expression of the form
9
T = C ' tm t m' m
(6.1.9)
where the sum C runs over the linearly independent excitations. Since this choice can be made in many ways, a unique prescription demands hat we specify explicitly which of the operators ( Y 1 are chosen a s m forming the linearly independent set. The simplest way to achieve this i s to consider e c h X1 in turn, and operators o f the select from among the possible Y' form Yt+l only one, which fixes The i. The genesis of X . i s thus biased towards a fixed i, and we may call this scheme as a 'preferred parentage approximation'. The way out of this difficulty was shown in the context o f open-shell MBPT by Brandow/3/, whose strategy is not, however, confined to perturbative theories only. I f we consider all the M functions Uk that can be generated from M linearly independent starting functions U" then the totality of the Schrsdinger equationk'for M such Yk's have precisely the same number of unkn0wn.c mbining coefficients {d 1 as the amplitudes {ti' P 1 , since the number of lk . functions { $ . I and the number of roots E are both. equal to M. 'In this case, then, the ampfitudes {ti") can, in principle, be rigorously determined from the M Schrsdinger equations. Furthermore, if we determine the coefficients { C . } from the Schradinger equation Ak for W '5, rather than from a prior CAS-SCF calcuyation, then the sets { C . k ) and {ti' 3 are both exact in principle. The effective hamiltonian approach envisaged by Brandow/3/ achieved precisely this in the MBPT framework. It should be noted that Brandow's approach used a size-extensive description of the type (a3), rather than ( a 2 ) . But this approach to generate the functions Wk for k=l,M i s equally relevant to the theories of the type (a2). Since Ti+ can only excite, each Y can be written a s
+
k
(6.1.10)
327
Cluster Expansion Methods in Open-Shell Correlation
where 8 Qk
c o n t a i n s o n l y t h e s e t CX,?,
k
=
C
C ( Ti ) n /n!Qii
Cik
and i s g i v e n by (6.1.11)
n=l
i
T h e f u n c t i o n s Yk s a t i s f y t h e i n t e r m e d i a t e normalization with respect t o t h e unperturbed f u n c t i o n s U"
-
k
.
= 1 V k=l,M
(6.1.12)
If w e n o w c o n s i d e r t h e g e n e r a l i z e d S i l v e r s t o n e Sinanoglu strategy discussed above, then the most natural way t o proceed i s by way o f an e f f e c t i v e hamiltonian formalism. W e i n t r o d u c e a s i n s l e ( s t a t e - u n i v e r s a l ) wave-operator 0, w h o s e a c t i o n o n Yo's p r o d u c e t h e f u n c t i o n s Yk, d e f i n e d by eq.(6.1.1) k o r (6.1.71, and w r i t e Schrijdinger e q u a t i a n s f o r Yk's a5 H0YE =
EkOUZ
V k=l,M
(6.1.13)
Introducing t w o i d e m p o t e n t and m u t u a l l y e x c l u s i v e p r o j e c t o r s P and Q for t h e s p a c e s s p a n n e d by {Bpi} and ( X 1 ? , eq.(b.l.l3) can be converted into an equivalent o p e r a t o r equation. Eq.(6.1.12) implies
PUk =
.;:
(6.1.14)
so that Yo's a r e t h e model s p a c e p r o j e c t i o n s of t g e eigenfunckions. If w e a s s u m e that t h e f u n c t i o n s Yk a r e linearly i n d e p e n d e n t , then t h e m a t r i x o f c o e f f i c i e n t s CC. 3 i s invertible, and eq.(6.1.13) can ik be c o n v e r t e d t o HW = CMeffP
(6
,.1 - 1 51
w h e r e He f f is g i v e n by (6.1.16)
w i t h E a s t h e d i a g o n a l m a t r i x of d i m e n s i o n M w i t h e n t r i e s Ek in t h e diagonals. S i n c e n c o n v e r t s Wo t o k qk, eq.(6.1.14) also implies that P W = P
(6.1.17)
328
Debashis Mukherjee and Sourav Pal
which i s t h e o p e r a t o r equivalent o f t h e intermediate normalization for ‘*k. Using eq.(6.1.17), it f o l l o w s that “ef f
= PHCF,
so that t h e eq.(6.1.13)
(6.1.18 i s equivalent t o (6.1.19
H W = WHW
Equation of t h i s form for an open-shell s i t u a t i o n w a s apparently first considered by Bloch/llb/, and i s n o w known a s t h e B l o c h equation/4/. Eq.(b.l.lb) indicates that t h e diagonalization o f H w h i c h i s defined in eff’ the model s p a c e only, will f u r n i s h u s w i t h t h e e i g e n v a l u e s Ek and the associated c o e f f i c i e n t s { C . 3 . c a n h e n c e be c a l l e d a n e f f e c t i v e h a m i l t o n i a n t k T h e “eff unknown c o m p o n e n t o f n in eq.(6.1.19) is QW, since PLF i s already fixed by eq.(6.1.19). T h i s c a n be determined from t h e Q-projection of eq.(6.1.19) QHW = Q W ‘ H W
(6.1.20)
Eqs. (6.1.18) and (6.1.19) f o r m t h e s t a r t i n g point of several open-shell many-body formalisms. T h u s Lindgren/4/ developed h i s version of t h e open-shell MBPT by expanding t h e s e e q u a t i o n s o r d e r by o r d e r in perturbation theory, and Offermann et E y / 7 0 / and J e z i o r s k i and Monkhorst/66/ based their corresponding coupled c l u s t e r f o r m a l i s m s o n them. M u k h e r j e e e t e/67-69/ and Kutzelnigg/76/ and Kutzelnigg and Koch/77/ a l s o m a d e u s e o f t h e s e equations-fmplicitly they first premultiplied eq.(6.1.14) by 0 and then projected the resultant e q u a t i o n s by P and Q. F o r t h e c l u s t e r e x p a n s i o n o f the t y p e (a2), w e may t h u s c o n c e i v e o f t w o approaches. O n e i s t o i n v o k e a s i n g l e r o o t s t r a t e g y , and u s e e i t h e r a n o n y m o u s parentage o r preferred parentage approximation. T h i s but by i t s very approach by-passes t h e need for H n a t u r e c a n n o t g e n e r a t e a potentiaTf:’exact formalism. T h e o t h e r i s t o u 5 e t h e multi-root strategy through t h e B l o c h equation, and thereby produce a formally exact theory. W e shall review these t w o t y p e s of s c h e m e s in Secs. 6.2 and 6.3 respectively.
a,
6.2
Single R o o t Hilbert Space Formalisms B a n e r j e e and S i m o n s / 6 4 / developed a s i n g l e root
329
Cluster Expansion Methods in Open-Shell Correlation
formalism that utilizes a wave-operator structurally very similar to that in the closed-shell CC theory. 0 They started from a C A S - S C F wave-function \ y k , and wanted to incorporate only the valence correlations by exciting the valence orbitals to the particle orbitals. The core thus remained frozen. For the cluster wave operator 0 , they chose -the ansatz, eq.(b.l.B), and used a selection scheme for the various operators in T which is essentially the same a s the preferred parentage approximation. They chose T to be of the form T =
CTn
(6.2.1)
n
where T is an n-body excitation operator with the n structure
(6.2.2)
- where the sets m =(p,q,..p,q) are chosen in such a way that {Tn.-5.} form a linearly independent set. For i determinlng the cluster amplitudes of T, they used a projection method :
<@EIYm,n exp(-T) H exp(T)i\yok> V
linearly
= 0,
independent m
(6.2.3)
Several simplifications occur in eq.ib.2.3) a s a consequence of the approximations involved. Firstly, one-valence excitation amplitudes of T 1 are expected to be small, since upto the first order the single 0 excitations tekl Y d o not mix with the C A S - S C F @ ; (the Generalized B?I! louin Theorem) /64/ They can thus be neglected. This is entirely analogous to the situation for the closed-shells, where the T operator is neglected for the HF function -5 as the siarting 0 point. Secondly, the components o f the T operator commute among themselves, again just as in the closedshell case, and as a result the Hausdorff expansion in powers of T in eq (6.2.3) terminates at the quartic term.
-
330
Debashis Mukherjee and Sourav Pal
The energy Ek can be found out from (6.2.4)
In their applications/64,65/, Banerjee and Simons used unitary-group generators/ll3,114/ in a spinfree form to adapt the resulting CC equations to proper spins, and applied this formalism for studying various small prototypical systems - starting with H molecule at 2. large internuclear separation, to the singlet-triplet splitting of CH2 and obtaining the potential curve for the symmetrical H abstraction reaction from BeH,. The neglect the core excitations may not 6e computationally justified in many cases - particularly the semi-internal ones/64/, as has been found in many MR-CI results, but their inclusion will make the resultant formalism much more involved. Such a generalization has nevertheless been attempted recently by Laidig et al/ll5(a)/, who computed the potential curves for N2 including the semi-internal excitations a s well at the linearized level (i.e., exp(-T) H exp(T1 truncated at the first commutator). Although Laidig et a1 couch their formulation in terms of the equations of the multi-root approach o f Jeziorski and Monkhorst/66/, the approximations used by them are precisely the same a 5 those implied by the preferred parentage approximation, and they generate a includein T single root theory.Laidig et &/115(a)/ the semi-internal excitations, which makes the virtual space { X 3 somewhat bigger. The equations they invoke 1 for solving the T amplitudes are o f the following form
03
< X I I H + [ H , T J l W ~ >= C)
(6.2.5)
T is constructed from only those uper tors
which Y' m produce a linearly independent set {Y' H .3 . For m .i obtaining the coefficients { C . 3 , Laidig et a 2 ik advocated a projection onto the functions CP.), 1 ead ing to
-,,
which shows that a particular root of the matrix H is the desired eigenvalue E The other roots are k' . spurious. This contrasts with the multi-root strategy,
331
Cluster Expansion Methods in Open-Shell Correlation 0
where a l l T ' s a r e used t o g e n e r a t e a s e t o f qk'5. Very r e c e n t f y Hoffman and S i m o n s / l l 5 ( b ) / used a s i m i l a r s t r a t e g y where t h e y used a u n i t a r y c l u s t e r t o p e r a t o r exp(T-T i n s t e a d o f e x p ( T ) t o produce a h e r m i t i a n m a t r i x L.
6.3
The M u l t i - r o o t
H i l b e r t Space F o r m a l i s m s
J e z i o r s k i and M o n k h o r s t / 6 6 / d e v e l o p e d a r i g o r o u s c l u s t e r expansion formalism f o r t h e m a n i f o l d o f s t a t e s { * 1 , s t a r t i n g f r o m a c o m p l e t e model space, u s i n g t h e extended S i l v e r s t o n e - S i n a n o g l u formalism o u t 1i n e d i n eq.16.1.7) and (6.1.10) t o ( 6 . 1 . 2 0 ) . R o f eq.(6.1.7) has a h y b r i d 5 t r u c t u r e : t h e r e a r e c r e a t i o n / a n n i h i l a t i o n o p e r a t o r s i n Ti 5 , and t h e r e a r e i n a d d i t i o n e x p l i c i t F o l l o w i n g Hose R I - e l e c t r o n d e t e r m i n a n t a l p r o j e c t o r s Pi. and k a l d o r / 4 4 / , J e z i o r s k i and M o n k h o r s t i 6 6 / c h o s e e a c h @ o n w h i c h t h e o p e r a t o r e x p ( T 1 ) a c t s a s t h e vacuum f o r c & l c u l a t i n g t h e matrix-elements. W i t h t h i s choice, i t is s u f f i c i e n t t o i n d i c a t e i n e a c h Ti only the o r b i t a l s and t h e o r b i t a l 5 r e p l a c e d t o r e a c h a vacated i n & v i r t u a l detekminant X T h u s , i n t h i s scheme i t is 1' more u s e f u l t o c l a s s i f y t h e o r b i t a l s i n t o h o l e s and f o r any o p e r a t o r p a r t i c l e s w i t h r e s p e c t t o each 6 T h i s , somewhat uno:thodox, p r o c e d u r e has a c t i n g on b o t h a d v a n t i g e s and l i m i t a t i o n s , a s we s h a l l o b s e r v e a t an a p p r o p r i a t e p l a c e l a t e r . o p e r a t o r can be w r i t t e n as An n-body Ti
.
a
1
,a
2
E
(6.3.1)
i
Ar+ :+cited
s t a t e f u n c t i o n X 1 r e a c h e d by t h e roduct a a c t i n g on m i may be d e n o t e d by Ifi:*(i)> a c c a a Fo: t6e aet&rmination o f c l u s t e r amplitudes s, J e z i o r s k i and M o n k h o r s t s t a r t e d f r o m t h e B l o c h e q u a t i o n / l l b / , eq.(6.1.19), p r o J e c t e d i t on t h e k e t ,'and p r e m u l t i p l i e d i t w i t h e x p ( - T 1 ) , to finally obkain a
0 3 'f'
I@
iiP.lexp(-T J
i
)
H er,p(T
i
(6.3.2)
Debashis Mukherjee and Sourav Pal
332
i where use is made of the fact thatiThe left side of eq.(6.3.2) connestes if T is connected, but the connectedness of the right s i d e is not immediately manifest In fact, a special proof is needed to prove that Tl's are connected. The difficulty lies in provinq that (exp(-T1) exp(T J 1 ) is c o n n e c d d . Since T /T' have no reference to orbitals unaffected by them, the bookkeeping of the terms becomes recondite in nature. The use of multiple vacua makes clear diagrammatic rules hard to formulate. This is unfortunate, since a proof of the connectivity of an operator becomes quite easy in the diagrammatic representation. Moreover, since the vacua a . may not be spin singlets, the operators T ' are Aot spin-scalars, and spin-adaptations of the CC equations (6.3.2) are quite complex. fi positive advantage of the formalism is that, with respect.to 5 . a 5 the vacuum, the ranks of the operators in Ti are'controlled by the orbitals i vacated and replaced in 5 . ,so that the operators in T thus carry n o redundant 0: spectator labels, and this has advantages in book-keeping. The perturbative expansion of eq.(6.3.2) generates the complete model space version of the Hose-Kaldor formalism/44/. Laidig and Bartlett/ll8(a)/ have implemented the theory in its linearized form. They applied it to the symmetric dissociating potential curve for the ground state of water and the symmetric H2 abstraction from €3eH2 Each Ti i s truncated at the two-body level. The results were compared to the corresponding MR-CISD values. T o our knowledge, a quadratic extension of the formalism has not been applied computationally. A spin-adapted version of the linearized model has also been recently formulated/llB(b)/.
.
-
7.FULL CLUSTER EXPANSION THEORIES IN FOCK SPGCE 7.1.Preliminaries For a Fock Space Approach In this section, we shall motivate towards the need for a Fock-space approach to generate core-valence extensive cluster expansion theories (i.e., of type (a3)), introduced in Sec.2. Since we have to maintain size-extensivity o f the energy
Cluster Expansion Methods in Open-Shell Correlation
333
explicitly w i t h respect t o both N and N v , t h e r e C should be a c l u s t e r e x p a n s i o n involving c l u s t e r o p e r a t o r s c a p a b l e o f c o r r e l a t i n g t h e c o r e and a l s o o f introducing core-valence interaction and v a l e n c e correlation e f f e c t s involving 1,2,.., N v v a l e n c e electrons. T h e d e g r e e of flexibility in t h e wave-operator i s t h u s m u c h m o r han w h a t i s needed to j u s t g e n e r a t e t h e f u n c t i o n s Y ~ p V f f r o ma n N -valence V model s p a c e spanned by (H.}. C o n s e q u e n t l y , if w e c o n f i n e o u r s e l v e s t o a pa:ticular N -valence H i l b e r t V space,then the n u m b e r of c l u s t e r a m p l i t u d e s in t h e wave-operator become linearly deoendent. T h i s i s s o m e w h a t a n a l o g o u s to w h a t w a s encountered in Sec.6.2 and 6.3 describing t h e s i n g l e r o o t and m u l t i r o o t cluster-expansions o f t h e t y p e (a2): t h e Silverstone-Sinanoglu e x p a n s i o n i n v o l v e s linearly dependent c l u s t e r a m p l i t u d e s for o n e function W k ; t h i s redundancy is eliminated if w e c o n s i d e r all t h e M f u n c t i o n s {Y } that c a n be generated f r o m t h e M-valence mo%el space. In a s i m i l a r manner, t h e redundancy in t h e c l u s t e r a m p l i t u d e s of t h e c l u s t e r expansion of t h e t y p e (a31 c a n be eliminated if w e c o n s i d e r n o t only t h e S c h r M i n g e r equation for t h e N V -valence problem but a l s o all the lower n-valence T h e s i m p l e s t procedure subduced problems, w i t h O l n l N V i s t o postulate t h e e x i s t e n c e o f a s i n q l e wave-operator w h i c h g e n e r a t e s t h e v a r i o u s e x a c t f u n c t i o n s *;(“’for all t h e n-valence problems/68/, w i t h 05 n+- N
.
V
(7.1.1) R c a n t h u s be regarded a s a v a l e n c e universal wave-operator/69/. Denoting by S ‘ m ’ , t h e c l u s t e r o p e r a t o r o f v a l e n c e rank m ( i - e . , having m valencedestruction o p e r a t o r s ) , R should be taken of t h e form
(7.1.2) for treating u p t o t h e N v a l e n c e problem. T h e V a m p l i t u d e s o f S‘O’ correspond t o t h e core-correlation c l u s t e r amplitudes, and S ( 4 ’ , s ‘ 2 ’ , etc., successively introduce valence c o r r e l a t i o n and core-valence interaction e f f e c t s c o n t a i n i n g o n e , two,.., N v valence electrons. T h e model s p a c e for t h e n-valence problem is spanned by t h e s e t ( @ ! n ’ ) . There a r e t w o possibie w a y s t o g e n e r a t e t h e CC-
334
Debashis Mukherjee and Sourav Pal
e q u a t i o n s f r o m eq.(7.1.1). O n e way i s t o c o m p u t e for t h e c l u s t e r a m p l i t u d e s of S by s o l v i n g t h e projected equations of the form
< X i n ’ [ C2-‘HC2
I@:“’>=
0
V l,i and n
The energies a r e finally obtained from
(7.1.3)
the equations
(7.1.4) A n o t h e r way i s t o s t a r t from t h e p r e c u r s o r of t h e B l o c h e q u a t i o n , eq.(6.1.15), generalized to Fock space/69,76,93/ and u s e t h e Q and P p r o j e c t i o n s for t h e n-valence sector. W r i t i n g t h e F o c k s p a c e B l o c h equation a s
w i t h Heff a valence-universal F o c k s p a c e e f f e c t i v e hamiltonian, t h e e q u a t i o n s d e t e r m i n i n g t h e c l u s t e r amplitudes can be obtained from
and H i s o b t a i n e d from ef f
when C2 s a t i s f i e s ‘ i n t e r m e d i a t e n o r m a l i z a t i o n ’ :
B o t h m e t h o d s h a v e been tried, and e a c h h a s i t 5 o w n o p e r a t i o n a l advantages. W e shall c l a s s i f y t h e s e m e t h o d s i n t o t w o categories: ( a ) t h o s e u t i l i z i n g eqs.(7.1.3) and (7.1.4) will b e c a l l e d ’ s i m i l a r i t y transformation-based F o c k s p a c e and ( b ) t h o s e u s i n g eqs. t h e o r i e s ‘ (See, Sec.7.2) (7.1.6) and (7.1.7) will b e c a l l e d B l o c h equationbased F o c k s p a c e theories(Sec. 7.3). It s h o u l d , however, b e o b s e r v e d t h a t t h e s t a r t i n g e q u a t i o n s for t h e t w o t y p e s of t h e o r i e s a r e eqs. (7.1.51, s i n c e a pre-multiplication by 0 followed by Q t n and P c r ’ ’ p r o j e c t i o n f u r n i s h e s eqs.(7.1.3) and (7.1.4) respectively. For truncated c a l c u l a t i o n s , t h e t w o a p p r o a c h e s w i l l , however, d i f f e r and t h e i r c o n v e r g e n c e
Cluster Expansion Methods in Open-Shell Correlation
335
behaviour will also be different. A variational strategy is also possible where the cluster amplitudes in C2 are determined by varying the energies with respect to the cluster amplitudes. This is covered in Sec .7.4. The various Fock space methods that we shall describe principally differ in their choice of the actual form of the cluster operator. Following the earlier analysis of Primas/29/, the form of $2 will be of exponential type, symbolically represented by eq.(7.1.1), but it may not strictly be exponential. There have been several choices for R , each having its own advantages. The Fock space approach has the potential advantage of exploiting the fact that operators written in the occupation number representation are independent of electron number, so that all the manipulations involving H and R can be performed at the operator level first, which is somewhat simpler and more transparent than workin with the matrixelements involving functions {I. (n?} and (Xin’}. Furthermore, since the same ope:atot-s appear in all the n-valence sectors of the Fock-space, the projections onto the wave-functions can be performed at the very end, and it i s possible to treat systems with varyinq n (i.e., different degrees o f ionization) on the same footing, and in an explicitly size-extensive manner with respect to the valence electrons. A s it has turned out, only the Fock-space approach has the potentiality to furnish explicitly connected size-extensive theories for a general incomplete model space/91-96/, which shows its flexibility and generality. 7.2
Similarity Transformation-Based Fock-Space Theories
Mukherjee et a1/68-69/ formulated an explicitly connected CC theory for complete model space by invoking a valence universal cluster operator S> o f the form 0 = exp(S)
(7.2.1)
with S given a s a sum over various SIm’ operators of valence rank m :
336
Debashir Mukherjee and Sourav Pal
+=o
>=o
a. Fig.6
b.
** C.
d.
(a),(b) L b l o c k s f o r t h e zero-valence problem ( c ) F\ typical d i a g r a m f o r (a). ( d ) A typical diagram for (b). N o t e that a two-valence S c a n c o n t r i b u t e t o a zero-valence block, a s in 6(d).
(7.2.2) In their f i r s t exposition, M u k h e r j e e e t a 1 / 6 7 / did n o t i n c l u d e in S C m 'o p e r a t o r s w i t h passive valence lines, w h i c h w a s rectified in t h e s u b s e q u e n t papers/68,69/. Actually it i s t h e presence of passive valence l i n e s w i t h d i f f e r e n t v a l e n c e l a b e l s attached t o them that d i s t i n g u i s h e s t h e d i f f e r e n t s c a t t e r i n g a m p l i t u d e s that i n v o l v e t h e s a m e c h a n g e in occupancy but c o n n e c t d i f f e r e n t p a i r s of f u n c t i o n s XCr" and !S(n) 1 i T o prove t h e connectivity o f t h e c l u s t e r o p e r a t o r S , M u k h e r j e e e t al/68-69/ c h o s e a hierarchical strategy o f proceeding from t h e zero-valevce c o r e problem and going u p w a r d s by s u c c e s s i v e l y considering one, two,.. valence problems, stopping finally a t t h e desired N v valence level. S i n c e S c o n t a i n s both v a l e n c e c r e a t i o n and d e s t r u c t i o n o p e r a t o r s , the c o m p o n e n t s of S d o n o t c o m m u t e among t h e m s e l v e s and, a s a result, in t h e expression of L = Q-*HR i n eqs.(7.1.13) and (7.1.14) using eq.(7.2.1), w e have many S-S contractions. T h e multi-commutator Hausdorff formula for t h e transformed hamiltonian L connected t e r m s only, and it h a s o p e r a t o r s of varying v a l e n c e rank. Diagrammatically, e a c h c o m p o n e n t o f L c a n b e depicted a s a c o m p o s i t e block o f a g i v e n s h a p e w i t h v a l e n c e destruction o p e r a t o r s entering f r o m t h e right, and hole, valence and particle o p e r a t o r s emanating from t h e left. T h e block c o n s i s t s o f all t h e connected d i a g r a m s that c a n b e obtained by j o i n i n g H w i t h v a r i o u s S v e r t i c e s from left and right i n a s p e c f i c manner : F r o m t h e multi-commutator expansion, it f o l l o w s that w e should s t a r t o u t f r o m H and begin connecting t h e H vertex
Cluster Expansion Methods in Open-Shell Correlation
337
b.
a.
C.
Fig.7
( a ) One-valence t w o body block w i t h n o c o m m o n valence o r b i t a l s before and after scattering. ( b ) One-valence block w i t h a c o m m o n v a l e n c e label. ( c ) Reduction o f ( b ) d u e t o a lower valence equation (Fig.6(a)).
w i t h S v e r t i c e s o n i t s i m m e d i a t e left o r right and m o v e out systematically c o n n e c t i n g t h e n e x t S n e i g h b o u r s and proceed without leaving any i n t e r m e d i a t e S uncontracted during t h e process. F o r t h e zero-valence c o r e problem, eq.(7.1.3) can be diagrammatically depicted a s in Figs.6(a) and ( b ) for illustration. FIlthough they look like t h e closed shell CC e q u a t i o n s in shape, t h e r e is a m a j o r difference. Owing t o the S-S contractions, m a n y S t n ’ o p e r a t o r s o f v a l e n c e rank g r e a t e r than z e r o will a l s o c o n t r i b u t e t o eq.(7.1.3) for n=O. A f e w typical d i a g r a m s a r e shown in Figs.6(c) and (d). T h i s s h o w s that the c l u s t e r o p e r a t o r s for t h e v a r i o u s m-valence Hilbert s p a c e s e c t o r s a r e all coupled together, and eqs.(7.1.3) for all n have t o be s i m u l t a n e o u s l v considered to g e t t h e c l u s t e r amplitudes. P r o c e e d i n g to t h e o n e valence problem, eq.(7.1.3) generates d i a g r a m s w h i c h may be o f t w o types. In o n e type, t h e valence o r b i t a l s in H. a r e vacated, t o be replaced by another v a l e n c e o r pa:ticle orbital. T h e L d i a g r a m s in t h i s c a s e a r e j u s t b l o c k s o f v a l e n c e rank 1, w i t h n o valence label o n t h e e n t e r i n g l i n e s from t h e right of t h e block ever appearing o n t h e l i n e s t o t h e left. FI typical s u c h diagram will look a 5 in Fig.7(a). For some common t h e o t h e r s type, X i n ’ and @!”’share
338
Debashir Mukherjee and Sourav Pal
valence orbital. In this case, eq.(7.1.3) generates two types of blocks, whose sum vanishes. In one type, s o m e valence lines actually enter the block, but they leave on the left carrying the same labels ('passive scattering'), a s shown in the first block of Fig.7(b), and there is another type of block in which the valence lines d o not enter the diagram at all ('spectator scattering'), a s shown for the second block of Fig.7(b) for the one-valence 5ituation.Since the spectator line does not contribute to the matrix-elements, the last block independently vanishes from Fig. b(a), corresponding to the equations for the core-problem. Thus, eq.(7.1.3) in this case also reduces to a block o f valence rank 1 equated to zero, a5 shown in Fig.7(c). Thus irrespective of whether we (1' and @ ! " the have valence orbitals common to X 1 1 ' resultant eq5.(7.1.3) are all one-valence L blocks equated to zero. Proceeding upwards, it easily follows that the eqs.(7.1.3) for an n-valence problem will look like a block of valence rank n , equated to zero. Since in eq.(7.1.3), the diagrams entering the block are all connected, and are of same shape, the S t m ' operators are all connected. This i s the first linked cluster theorem for an open shell CC theory of the type (a3) using similarity transformation. Since the operator S and its powers always excite to the virtual space, the functions a;''' are generated in the intermediate normalization. The valence universal nonhermitian H can be diagrammatically ef f depicted a s blocks with incoming and outgoing lines labelled by valence only. Typical one-valence and two valence blocks are shown in Figs. 8 ( a ) and (b), and some representative diagrams contributing to these-are shown in F i g s . B ( c ) and (d). W e note here that 5"' has appeared in the one-valence block, Fig. 8(c), and this indicates how the various m-valence blocks are coupled together. I f we drop the constant closed whose value is equal to zero-valence part of H from eqs. (7.1.4), then we the correlated core engf;;, shall get energy differences with respect to the core AEL"' directly. Mukherjee e t & / b e / utilized this strategy to calculate ground state energy, i t s I P and excitation energy for a model rr-electron problem b y starting from the doubly charged molecular ground state a 5 the core. Mukherjee et &/67/ also advocated a strategy whereby one may generate a hermitian Heff In this formulation, one needs a valence universal unitary cluster operator exp(cr), with u given by
-
339
Cluster Expansion Methods in Open-Shell Correlation
b.
a. Fig.8
d.
C.
(a),(b) One- and two-valence parts of H (c),(d) Typical terms contributing to (zff(b). n = S - St
(7.2.3)
where S has the same form a 5 in eqs.(7.2.1) and (7.2.2). Since CI i s more compler than S, the corresponding equations will contain more terms. This approach was adopted in /120/ to generate a formal theory of a hermitain valence-shell hamiltonian, where the connectivity of the effective hamiltonian was also explicitly established. There i s also a related work by Westhaus et al/l21/, where the valence universality of Sl is not mentioned and the logical basis of the connectivity remains somewhat obscure. A s we have already discussed, in the formulation using the ansatz, eq.(7.TAf), the calculation of the energy remains coupled with the calculation differences AE of the correlaked core energy E This contrasts with 0 where the situation of R S MBPT for open-shells/3-9/, the core energy E can be determined without reference to the open-shelloproblems, which enter in the next stage. This decoupling was termed ’the core-valence separation’ by Brandow/3/. A generalization of the cluster ansatz for which allows such a core-valence separation was formulated later by Mukherjee e t a&/121/. In this version, S> is written in the form
.
C-2
= exp(Tc)exp(Tv)
(7.2.4)
where T has the form of the closed-shell cluster C operator, and T contains all the operators S‘”’ with 15 n 5 N The cyosed-shell cluster amplitudes are first &lved from the closed-shell CC equations for the core @ :
.
0
<Xi*’lexp(-Tc)
H
exp(Tc)lSo>
= 0
(7.2.5)
The various m-valence problems are handled simultaneously, with the core cluster-amplitudes
340
Debashir Mukherjee and Sourav Pal
frozen a t their closed-shell values. Introducing t h e dressed hamiltonian H t h r o u g h t h e relation :
fi = exp(-T
C
)
H exp(T
C
(7.2.6)
)
and s e p a r a t i n g i t s vacuum part E f i = E
0
0
explicitly,
+Hc
(7.2.7)
w e find t h a t t h e e q u a t i o n s for t h e v a r i o u s m-valence s e c t o r s c a n be w r i t t e n a 5
(m>
T h e energy d i f f e r e n c e AEk eigenvalue equations
a r e obtained from t h e
T h e c a l c u l a t i o n of t h e c o r e c l u s t e r amplitudes, eq.(7.2.5), a r e t h u s completely decoupled from that of t h e v a l e n c e c l u s t e r amplitudes, eq.(7.2.8). It is interesting t o n o t e that R in eq.(7.2.4) can be regarded a s a v a l e n c e universal wave-operator, i.e.,n i s a l s o t h e wave-operator for t h e coreproblem/94(a)/. T h i s assertion f o l l o w s from t h e s i m p l e o b s e r v a t i o n that all T o p e r a t o r s have destruction o p e r a t o r s and hence thgy a n n i h i l a t e S Mukhopadhyay et a 1 / 1 2 2 / generalizgd t h e a n s a t z (7.2.4) still further. They suggested t h e u s e o f a ‘n’ r e c u r s i v e procedure whereby t h e wave-operator at the n-valence lave1 is recursively generated from t h e lower v a l e n c e R ’ 5 . Thus, they d e f i n e
.
n‘”-”
0‘
=
n‘O’
= exp(Tc)
exp(~‘”’
(7.2.10a
(7.2.l o b )
C l e a r l y , w i t h t h i s c h o i c e , n o t only a r e t h e e q u a t i o n s for t h e c o r e c l u s t e r a m p l i t u d e s decoupled f r o m t h e rest, but a l s o t h e e q u a t i o n s for t h e T‘”’ cluatera m p l i t u d e s h a v e only T‘”’amp1itudes w i t h m < n frozen at their m-valence values. It should again b e noted that (m> 0‘”’ acting o n a P ‘ ~ ’for m
341
Cluster Expansion Methods in Open-Shell Correlation
e x p o n e n t i a l ansatz i s t h a t t h e components o f t h e c l u s t e r o p e r a t o r ( S o r T ) do n o t commute among V themselves. There a r e thus c o n t r a c t i o n s o f t h e t y p e rl n ( S S ) o r ( T v T v ) , and t h i s C O U ~ ~t h~ eS v a r i o u s
n
n
m-valence s e c t o r s t o g e t h e r . C l e a r l y , i f S S o r T Tv c o n t r a c t i o n s c o u l d be avoided, no S o r T operayor o f V v a l e n c e r a n k >m can appear f o r an m-valence problem, and a d e c o u p l i n g o f t h e v a r i o u s v a l e n c e s e c t o r s would be p o s s i b l e . Mukherjee/69/ and Haque and M u k h e r j e e / 6 9 / advocated t h e use o f normal o r d e r e d e x p o n e n t i a l a n s a t z , i n t h e manner o f L i n d g r e n / 7 1 / , t o a v o i d t h e S S c o n t r a c t i o n s . They showed t h a t n is o f t h e form
with { . . I i n d i c a t i n g t h e normal o r d e r i n g w i t h r e s p e c t t o Bo has t h e i n t e r e s t i n g p r o p e r t y t h a t
w'"'
=
n("' P("'
Q m
(7.2.12)
where 0'"' i s a normal o r d e r e d e x p o n e n t i a l c o n t a i n i n g o p e r a t o r s up t o v a l e n c e r a n k m o n l y . Moreover, eq.(7.2.11) implies that
S'"'
0 = exp(S'O')
( 7 -2.131
exp(S1
s
.
( 0 )
where c o n t a i n s a l l t h e S'"' o p e r a t o r s except S ( 0 ) Eq.(7.2.12) shows t h a t S>'O& e x p ( 5 ) is t h e c l u s t e r expansion o p e r a t o r f o r t h e c l o s e d s h e l l problem, and t h e S'O' a m p l i t u d e s can be found o u t i n a manner c o m p l e t e l y decoupled from t h e r e s t o f t h e S o p e r a t o r s ; . Introducing the operator H through eq.(7.2.7), the c (0) equations determining t h e S c l u s t e r amplitudes f o r 1 m SN can be w r i t t e n as V
4"'I n'"'-'
Hc
n'"'
I*:"'>
= 0, V 1, i,1 5 r n S N V (7.2.14 1
I t has been shown i n /69/ ( s e e a l s o /123/) t h a t t h e operator c2("'-'HCc2("' is e x p l i c i t l y connected, so t h a t , i n v o k i n g t h e same arguments a s t h o s e used i n p r o v i n g t h e connected n a t u r e o f t h e S o p e r a t o r s i n an o r d i n a r y e x p o n e n t i a l , t h e connectedness o f S o p e r a t o r can be proved. The b l o c k s r e p r e s e n t i n g t h e matrix-elements o f eq.(7.2.14) w i l l now have diagrams i n which S-S c o n t r a c t i o n s d o n o t appear, and as a r e s u l t a complete d e c o u p l i n g o f t h e v a r i o u s m-valence
342
Debashis Mukherjee and Sourav Pal
s e c t o r s i s possible. H a q u e a n d M u k h e r j e e / 6 9 / s h o w e d t h a t t h e d i a g r a m s in t h e b l o c k s of eq.(7.2.14) a r e all g e n e r a t e d by c o n t r a c t i n g H w i t h t h e S v e r t i c e s o n i t s right, a v o i d i n g S-S c o n t r a c t i o n s , and j o i n i n g t h e r e s u l t a n t c o m p o s i t e w i t h t h e S v e r t i c e s o n i t s left in t h e s a m e way but a l l o w i n g c o n t r a c t i o n s b e t w e e n t h e S o p e r a t o r s t h i s time. K u t z e l n i g g / 7 6 / and K u t z e l n i g g and K o c h / 7 7 / advocated that t h e o p e r a t o r t r a n s f o r m a t i o n L = fi-‘Hfi b e performed in F o c k s p a c e e n t i r e l y , i n d e p e n d e n t o f t h e p a r t i c u l a r Nv-valence model s p a c e a b o u t w h i c h o n e i s interested By d e f i n i n g t h e ’ c l o s e d ’ o p e r a t o r s a 5 t h e o n e s w i t h ’ a c t i v e ‘ ( v a l e n c e ) l i n e s o n l y , and ’ n o n - d i a g o n a l ’ o p e r a t o r s a s t h e o n e s in w h i c h t h e r e a r e a c t i v e l i n e s o n o n e s i d e and t h e r e i s a t least o n e i n a c t i v e line o n t h e o t h e r , t h e F o c k s p a c e d e c o u p l i n g c o n d i t i o n s , e q u i v a l e n t t o eq.(7.1.3), c a n be s t a t e d o n a n o p e r a t o r level. K u t z e l n i g g / 7 6 / and K u t z e l n i g g and K o c h / 7 7 / c l a s s i f i e d t h e o p e r a t o r s depending o n the types o f lines attached to them as follows: ( i ) Xc : a diagonal operator with active lines only ( i i ) Xfi : a non-diagonal o p e r a t o r in w h i c h t h e o u t g o i n g l i n e s a r e all active. ( i i i ) XB : a n off-diagonal o p e r a t o r in w h i c h t h e incoming l i n e s a r e all a c t i v e ( i v ) X o : a n o p e r a t o r w i t h a c t i v e and i n a c t i v e l i n e s o n b o t h sides. K u t z e l n i g g / 7 6 / and K u t z e l n i g g and K o c h / 7 7 / emphasized that t h e classification o f the operators a i n t o t h e c a t e g o r i e s C, #, B and 0 i s e s s e n t i a l l y F o c k s p a c e c o n c e p t , so that t h e c o n d i t i o n that LB v a n i s h e s will a u t o m a t i v a l l y g e n e r a t e o p e r a t o r e q u i v a l e n t of t h e eq.(7.1.3). Lc i s then t h e Fock-space effective hamiltonian H f f , and appropriaten-valence e n e r g i e s E c a n b e o b 7 a i n e d by a k projection of L o n t o t h e P‘”’space and i t s C s u b s e q u e n t d i a g o n a l i z a t i o n . C l e a r l y , t o e n s u r e that t h e LB o p e r a t o r s of all v a l e n c e r a n k s ( n ) a r e z e r o , w e n e e d in t h e Fock-space wave-operator C2 c l u s t e r o p e r a t o r s S, w h i c h a r e o f t h e t y p e SB, of all valence ranks
.
s2
= exp(SB)
(7.2.15) ( 7-2.16)
The equations determining the cluster amplitudes are
Cluster Expansion Methods in Open-Shell Correlation
343
then given by LLr'
= O ;
V n
(7.2.17)
It should be noted that Mukherjee et aJ/67,60/ in their formulation did use a Fock-space wave-operator, although they expressed their decoupling conditions (7.1.3) in terms of explicit determinantal projections. A s a result, when valence lines are present, eq.(7.1.3) leads to a sum of various blocks equated to zero, a s in Fig.7(b), for example. The lower valence blocks with spectator lines, however, are individually zero as follows from the decoupling conditions for the lower valence blocks, and a s a result each individual block appearing in the decoupling condition, eq.(7.1.3), is zero. These blocks are indeed the L operators of various ranks. B . Thus, operationally speaking, the Fock-space approach of Kutzelnigg/74/ and Kutzelnigg and Koch/77/ i s entirely equivalent to the Fock-space formalism of Mukherjee et a1/67,60/. But conceptually speaking, there is a subtle difference : Kutzelnigg and Koch emphasize the decoupling condition, on the operator level, and a s a result directly comes to eq.(7.2.17) without the intermediary of going through an expression of the type of Fig. 7(b) which i s bound to arise when explicit projectors are used. Kutzelnigg and Koch/77/ also introduced a unitary Fock space cluster operator 52 = exp(cr) with an In this antihermitian Fock space u, as in eq.(7.2.3). case, the transformed Fock space hamiltonian L should be brought into a form such that (7.2.i a ) so that there would be decoupling from both sides. Clearly cr should be of the form 0
= SB - s;
(
7-2.19)
and S i i s an operator of the type A . Since Kutzelnigg and Koch utilized a wave-operator of the exponential structure, the various valence sectors will be generally coupled. Numerical applications/77,124/ on the low-lying energy levels of small molecules such a s H2 produced encouraging results for the unitary choice of $2. It should be mentioned that an earlier work of
344
Debashis Mukherjee and Sourav Pal
Reitz and Kutzelnigg/78/ w a s a forerunner of the Kutzelnigg-Koch strategy where IP of closed-shell molecules w e r e sought by using a s i n g l e unitary cluster transformation for both the ground s t a t e and the ionized states. The method h a s s i n c e been implemented t o c o m p u t e principal IP's of a s e r i e s o f small a t o m s and molecules/l25/. H a q u e and Mukherjee/l26/ developed a Fock-space in w h i c h t h e method for generating hermitian H e f cluster wave-operator n is not unitary, following a suggestion of Jorgensen/l27/, w h o c h o s e C> to preserve the norm of the functions in t h e model space. T h u s CI satisfies PR+W =
P
( 7.2.20)
and it f o l l o w s that
i.e., i s hermitian. T h e condition (7.2.20) may be %ff.. called an isometry' condition. H a q u e and Mukherjee developed a Fock-space version of t h i s approach, in which R i s written in t h e form = {exp(S+X)}
( 7-2.22)
where S h a s t h e s a m e form a s in eq.(7.2.11), and X a r e 'closed' o p e r a t o r s inducing valence t o valence scatterings. Using t h i s 0, they derived a s e t of e q u a t i o n s like eq.(7.2.14). T h e closed operator X is chosen in s u c h a way that H becomes hermitian. In a straightforward extension fhe Jorgensen approach/l27/, H a q u e and Mukhejee/l26/, used a Focks p a c e equivalent of eq.(7.2.20), which r e a d s (7.2.23)
S i n c e t h e first term in the expansion of C2 is the unit operator, eq. (7.2.20) indicates that all the nvalence conn+ected g i a g r a m s obtained by contracting powers of S and X o p e r a t o r s w i t h t h o s e of S and X will vanish. It c a n be s h o w n that t h e r e is n o loss of generality if X is chosen a s hermitian. H a q u e and M kherjee /126/ explicitly used t h e condi()$on that - Heff, for defining the o p e r a t o r s X ff T K e r e w a s a l s o a related earlier work by Kvasnicka /72/, which hermitized t h e CC theory o f Lindgren/71/. F o r m o r e d i s c u s s i o n s o n hermitian Heff' see Sec.7.4.
+ -
.
34s
Cluster Expansion Methods in Open-Shell Correlation
7.3
B l o c h E q u a t i o n Based F o c k S p a c e Theory:
O f f e r m a n n e t a1 /70/ formulated a n open-shell CC f o r m a l i s m in N u c l e a r P h y s i c s in w h i c h a B l o c h waveo p e r a t o r 0'"' f o r a n n-valence problem is o b t a i n e d recursively f r o m t h o s e of t h e lower v a l e n c e problems: n'O'
( 0 )
= exp(~
P'O'
(7.3.1)
and so o n , w h e r e e d e n o t e s a n o u t e r product between t w o o p e r a t o r s ( i.e., n o c o n t r a c t i o n s b e t w e e n t h e o p e r a t o r s a r e allowed). Defined t h i s way, R'"' in their work e m b e d s in it i n f o r m a t i o n a b o u t all t h e lower V a l e n c e s e c t o r s and d e f i n e s a F o c k s p a c e strategy. A s i m i l a r i d e a w a s a l s o pursued by Ey/70/. T h e c l u s t e r a m p l i t u d e s a r e o b t a i n e d r e c u r s i v e l y , by s o l v i n g t h e z e r o v a l e n c e problem first, s t a r t i n g w i t h the Bloch equation for the zero valence problem s o l v i n g t h e o n e v a l e n c e problem n e x t , w i t h S'"amp1itudes kept f r o z e n , and so on- f i n a l l y arriving a t t h e d e s i r e d N v a l e n c e s i t u a t i o n . It V s h o u l d b e noted that t h e a n s a t z f o r n'"', o b t a i n e d r e c u r s i v e l y following t h e c h a i n (7.3.11, (7.3.2) .., i s s i m i l a r t o t h e o n e u s e d in eq.(7.2.10) with the d i f f e r e n c e t h a t t h e o u t e r p r o d u c t c o n v e n t i o n in eq.(7.3.3) e l i m i n a t e s c o n t r a c t i o n s between S operators. S i n c e 0"" is r e c u r s i v e l y g e n e r a t e d f r o m n' m ' o f lower v a l e n c e ranks, i t m i g h t a p p e a r t h a t t h e r e i s n o u n d e r l y i n g a s s u m p t i o n of v a l e n c e u n i v e r s a l i t y in t h i s formulation. H o w e v e r , interestingly enough, 0'"' d e f i n e d in t h i s s c h e m e h a 5 t h e property t h a t
,
so that o n e may a g a i n f o r m a l l y c l a i m that t h e m e t h o d h a s a v a l e n c e universal wave-operator. L i n d g r e n / 7 1 / d e v e l o p e d a n open-shell CC theory by postulating a n wave-operator R in n o r m a l o r d e r : 0
= texp(S)l
(7.3.5)
where the structure of the cluster operator 0 w a s inferred by a n appeal t o t h e open-shell MBPT/4/,
Debashis Mukherjee and Sourav Pal
346
starting from B l o c h equation/llb/. He c a l l e d o p e r a t o r s c a u s i n g t r a n s i t i o n s within t h e model s p a c e a 5 'closed' and t h o s e c.ausing transition o u t s i d e t h e model-space a s 'open'. L i n d g r e n ' s starting equation w a s CQ
,Hal
P
(NV) =
vzlp
(NV)
- w
(NY)
(Nv) Vzlp
(7.3.6)
w h i c h o r i g i n a t e s f r o m eq. (6.1.15), and t h e assumption of i n t e r m e d i a t e normalization.1nserting eq. (7.3.5) t o eq.(7.3.6) and expanding R ( o r equivalently S ) in orderof perturbation theory,Lindgren(Lf/ concluded operators of that S m u s t b e the s u m o v e r v a r i o u s S rank 01. m I N In t h e original formulation V Lindgren/71/ confined himself t o t h e particular N -valence model space, and consequently t h e c l u s t e r V a m p l i t u d e s of v a r i o u s v a l e n c e r a n k s a r e n o t necessarily linearly independent. T h i s s i t u a t i o n i s entirely a n a l o g o u s t o t h e o n e discussed in Sec. 7.1. Using W i c k ' s theorem, eq. (7.3.6) w i t h eq. (7.3.5), leads t o
.
( 7-3.7)
n
w h e r e t h e c o n t r a c t i o n s VC2 e x p r e s s i o n s of the form
n
{Vn)
=
C
etc mean
the connected
m
l/n! { V S
S. S..}
(7.3.8)
n
(N") Since the S operators a o n e are sufficient to > d e t e r m i n e t h e e i g e n f u n c t i o n s rY v , (7.3.7) a r e an under-determined s e t o f c o u p l e 8 e q u a t i o n s f o r S'"' w i t h 05 n I N L i n d g r e n t h u s needed s o m e auxiliary c o n d i t i o n s t o X e t e r m i n e t h e lower v a l e n c e c l u s t e r amplitudes. F o r t h i s h e appealed t o MBPT, and demanded that t h e o r d e r by o r d e r e x p a n s i o n o f S in p o w e r s of V should p r o d u c e t h e wave-operator of t h e o p e n shell MBPT. S i n c e S h a s t o be connected then, L i n d g r e n postulated a s e t of sufficiency c o n d i t i o n s w h i c h would g u a r a n t e e t h e connectivity. T h u s h e demanded that t h e following r e l a t i o n s must hold:
(rn
.
CS,Hol P
(Nv)
n
= {VR}P
(NV)
n
(Nv)
- (Weff1P
(7.3.9)
It should be noted t h a t t h e validity of eq.
Cluster Expansion Methods in Open-Shell Correlation
347
(7.3.9) i m p l i e s t h e v a l i d i t y of eq. (7.3.71, but n o t vice-versa. S i n c e V i s closed, taking t h e closed part o f eq.(7.3.9) feads to f f
(7.3.10) T h e method h a s s i n c e been u s e d t o c o m p u t e e n e r g i e s of quasi-degenerate Be iso-electronic s e q u e n c e g r o u n d state/l31/. S u p e r f i c i a l l y it m i g h t a p p e a r t h a t L i n d g r e n ’ s formulation i s a Hilbert space approach involving r e d u n d a n t S o p e r a t o r s , and t h e r e d u n d a n c y i s e l i m i n a t e d by a s e t o f a u g m e n t e d e q u a t i o n s w h i c h a t t h e s a m e t i m e e n s u r e s t h e c o n n e c t e d n e s s of H ff’ H o w e v e r , a s n o t e d by M u k h e r j e e , and H a q u e a n 8 Mukherjee/69/, use of the sufficiency conditions (7.3.9) a m o u n t s in e f f e c t t o a s s u m i n g that Q i s a valence-universal wave-operator. In f a c t H a q u e h a s e x p l i c i t l y d e m o n s t r a t e d / l 2 3 / that t h e u s e of a valence-universal n in t h e Fock-space B l o c h e q u a t i o n l e a d s a u t o m a t i c a l l y t o e q n (7.3.9) w i t h t h e ad-hoc s u f f i c i e n c y requirement. W e g i v e t h e s k e t c h of a general proof here, s i n c e i t s h o w s t h a t t h e e x t r a information content of a Fock-space 0 , a s opposed to a H i l b e r t s p a c e , c a n b e u s e d t o a d v a n t a g e for e n s u r i n g the connectivity of the cluster amplitudes o f S/93/. F o r a valence-universal 5 ) , t h e F o c k - s p a c e B l o c h e q u a t i o n (6.1.15) l e a d s t o (7.3.11)
n
n
C a l l i n g t h e c o m p o s i t e s HQ and W e as 2 and 8 respectively, eq. ( 7 - 3 . 1 1 ) i m p l i e s t h a t
where the superscript (yk,indicates operators of v a l e n c e rank 0. S i n c e R c o n t a i n s v a r i o u s nh-np e x c i t a t i o n s in S w h i c h a r e linearly i n d e p e n d e n t , it follows that
F o r t h e one-valence
problem, s i m i l a r l y
348
Debashis Mukherjee and Sourav Pal
and because o f eq.
(7.3.131,
w e have
and it again f o l l o w s that
It t h u s f o l l o w s generally, proceeding u p w a r d s in valence rank, t h a t
{Z(k’}
= (e
tk,
}
V k
( 7-3.17)
from w h i c h eq. (7.3.9) f O l l O W 5 easily o n s u m m i n g eq. (7.3.17) f o r t h e v a r i o u s r a n k s k, and using t h e partition o f H i n t o H +V. T h e a b o v e a n a l y s i g i n d i c a t e s that L i n d g r e n ’ s eq. (7.3.9) i s restrictive, s i n c e eq. (7.3.17) s h o w s that w e may w r i t e a m o r e general Fock-space e q u a t i o n
n
CS,Ho]P‘”’ = CV*)P‘”’
-
n
(nVeff}P‘”’
Vn (7.3.18)
Thus, w i t h eq. (7.3.9) it will not b e possible t o c o m p u t e energy d i f f e r e n c e s A E k , s u c h a s IP etc. In contrast, t h e Haque-Mukherjee formulation/69/, however, u s e s eq. (7.3.181, and a s a result IP,E&,EE etc., c a n b e calculated. Lindgren, however, later generalized h i s e q u a t i o n suitably to c o m p u t e IP etc.,/129,130/, w h i c h m e a n s that eqn (7.3.18) w a s implicitly used. Haque-Mukherjee f o r m u l a t i o n / 6 9 / a l s o i n d i c a t e s that t h e r e i s an automatic decoupling o f t h e e q u a t i o n s for t h e v a r i o u s n-valence sectors. T h e r e a r e no S t m ’ a m p l i t u d e s w i t h m > n for t h e n-valence problem. T h i s decoupling, w h i c h i s riqorous, h a s s i n c e been termed a s t h e ’ s u b s y s t e m embeddinq c o n d i t i o n ‘ ( S E C ) / 6 9 / . T h i s l e a d s t o a c o m p u t a t i o n a l l y a d v a n t a g e o u s scheme. H a q u e and Kaldor/132/, and Kaldor/133/ m a d e several a p p l i c a t i o n s o f t h e formalism o n small a t o m s and recently o n molecules/l34/. In t h e s e applications, low-lying excited s t a t e s o f diatomic species, valence IP’s o f closed shell ground s t a t e s and loe-lying excitation e n e r g i e s of s i m p l e a t o m s and m o l e c u l e s a r e computed. It a p p e a r s from t h e s e c a l c u l a t i o n s that
Cluster Expansion Methods in Open-Shell Correlation
349
three body cluster operators a t the one and two valence levels may be important for a quantitative accuracy. Kaldor et a1/148/ have very recently also applied the formaiism for studying the potential surface of NH + H reaction. 3 Sinha et a1/135/ also applied the Haque-Mukherjee formalism for IP calculations, but they envisaged a more general computational strategy where not only the principal IP values at the outer and inner valence region but also the satellite peaks with dominant shake-up effects were aimed at. For this, they included all the hole orbitals upto the inner valence region as active. Since the dimension of the model space for the one-hole valence problem i s Just equal to the number of hole orbitals, it may at first appear surprising that satellite peaks can at all be obtained in this scheme in addition to the main peaks. We should, however, note that the CC equations are inherently non-linear in nature. In fact, at any nvalence level the eq. (7.3.18) involves quadratic (n? powers of S Sinha et a1/135/ exploited the nonlinear nature of the equations to generate multiple solutions. The main peaks corresponded to those solution set5 which could be reached from the starting iterates obtained from the linearized equation. The satellite peaks are obtained when the starting amplitudes are estimated from iterates for the 5"' the eigenvectors of a small CI problem involving the model space and the dominant shake-up amplitudes. This finding bolsters the view that a CC-approach does transcend an MBPT in both compactness and information content. In a RS MBPT/3-9/ , the number of roots obtained will always be just M. The Bloch-Horowitz/Z/ theory, however, can furnish more than M roots but at the expense of the 1055 of size-extensivity with respect to N
.
V
.
Very recently Sinha et &/136/ have analytically shown that the CC-based LRT for IP i s algebraically equivalent to the CC formalism of Haque and Mukherjee/69/ for one-hole valence problem. This indicates that the CC equations for determining the IP's can be cast into an equivalent eigenvalue equation. In fact Sinha et a1 /136/ have demonstrated that the relation between the CC-LRT for I P and the crrresponding CC theory i5 the same a s that between a CI problem and the associated partitioned problem as obtained by the Soliverez transformation/l37/. For open-shells containing more than one valence, the correspondence between the CC-LRT and the CC equations no longer holds, but Sinha et a1 showed the CC
Debashir Mukherjee and Sourav Pal
350
e q u a t i o n s c a n still be c a s t i n t o an e q u i v a l e n t e i g e n v a l u e problem/l37/. T h i s is c o m p u t a t i o n a l l y very u s e f u l , s i n c e t h e associated e i g e n v a l u e p r o b l e m s a r e usually numerically e a s i e r t o s o l v e and t e c h n i q u e s exist w h i c h a l l o w e i g e n v e c t o r s of a g i v e n s t r u c t u r e t o be reached by s u i t a b l e root-homing procedures. Using t h i s strategy, S i n h a e t a1 have c o m p u t e d t h e h u g e r s p e c t r a of F H 0 and HF/135/, including s a t e l i t e 2' peaks/l36/ f o r t 6 e first. t i m e using an open-shell CC theory. Main peaks for N , CO and H 0 have a 1 5 0 been 2 computed by Pal e t a1/136,140/. W e have s h o w n a s typical e x a m p l e s o f c o n s t r u c t i n g t h e CC equations, t h e d i a g r a m s c o n t r i b u t i n g t o o p e n z'~', ZC2' in Figs. 9 ( a ) t o ( f ) T h o s e parts of in Figs.9(g) t o (h). T h e contributing to e'2'argPshown valence o r b i t a l s a r e p a r t i c l e type. T h e d i a g r a m s c o n t r i b u t i n g t o H'l'and a r e s h o w n in Figs.lO(a), e f ( b ) respectively. a n o t h e r example, t h e CC e q u a t i o n s for S"'amp1itudes for IP a r e s h o w n in Figs.11.
.
65
7.4
Hi::
Variational CC Theory In F o c k S p a c e :
Pal et a1/73(a)/ formulated a variational CC theory for energy d i f f e r e n c e s in Fock space, w h i c h g e n e r a t e s a hermitian Heff * They employed t h e ansatz, eq. (7.2.22) for <> :
(7-4.1)
0 = Cexp(S+X)}
w i t h S of t h e form of eq. (7.2.11), and X a r e closed o p e r a t o r s c a u s i n g s c a t t e r i n g s w i t h t h e model space. i3 is assumed, a s in eq.(7.2.23), to s a t i s f y t h e 'isometry' c o n d i t i o n :
a.
b.
e.
Fig.9
(a)-(f),
d.
c.
D i a g r a m s for Z
f.
t2) .
OP
351
Cluster Expansion Methods in Open-Shell Correlation
a.
b.
Fig.10
(1) Typical diagrams for Heff
Fig.11
CC e q u a t i o n s for '5'''
for
) and H (e 2 ff
.
the one hole problem.
352
Debashis Mukherjee and Sourav Pal
g u a r a n t e e s that Heff will be hermitian.
Eq.(7.4.2)
M is a hermitian closed operator.
In their formulation, Pal e t a1 determined the closed-she1 1 cluster amplitudes S'O' by varying the expectation value o f the ground s t a t e energy E0 with respect t o the cluster amplitudes/73(b)/ :
< @ o l . . . I H >c s t a n d s for connected quantities. for the one-valence T h e cyuster-amplitudes of S"' problems ( I P o r E F I ) w e r e determined by minimizing the : expectation values E
I
(7.4.5) Owin? t o normal ordering,€ I i. s . a t the most quadratic 1) and X"'and their adjoints.Minimization of E I' in S subject to the constraint eq. (7.4.2) for P"', leads to e q u a t i o n s of the form
where A ' s a r e t h e Lagrange's multipliers. Solution of eq. (7.4.61, along w i t h t h e isometry c ond i ti on Mi:'
-
0,
V J,K
(7.4.7)
which is equivalent t o eq. (7.4.2), furnishes u s wif?, and X"' amplitudes. W e n o w write energy Ek the 5'" (1) for e a c h ylk as
Cluster Expansion Methods in Open-Shell Correlation
353
where (7.4.9)
M i n i m i z a t i o n of E"' with respect t o CIk'5 k a s e t o f e q u a t i o n s of t h e f o r m (i'
-<1>
J
HIJ 'Jk
= Ek
*(l>
where H
IJ
'Ik
leads to
(7.4.10)
i s g i v e n by
I
T h e matrix H i g hermitian. By d r o p p i n g t h e v a c u u m d i a g r a m s f r o m H, w e may a l s o o b t a i n t h e e n e r g y d i f f e r e n c e s AEk = E k - Eo directly. F o r t h e two-valence problem, o n e may proceed similarly. In t h e two-valence s e c t o r , t h e o n e v a l e n c e a m p l i t u d e s will b e r i g o r o u s l y frozen. P a l ex ag1/73(a)/ d i s c u s s e d t h e g e n e r a l n-valence c a s e in detail. H a q u e and K a l d o r / l 3 2 ( b ) / a p p l i e d t h i s f o r m a l i s m to c o m p u t e IP's o f N2 and H 2 0 .
8.
8.1
SIZE E X T E N S I V E F O R M U L A T I O N S W I T H I N C O M P L E T E M O D E L SPPlCE : Preliminaries :
T h e r e a r e mainly t w o inter-related r e a s o n s why o n e would l i k e t o f o r m u l a t e a s i z e - e x t e n s i v e t h e o r y involving i n c o m p l e t e model spaces: ( i ) t o b y p a s s intruders, ( i i ) f o r c a l c u l a t i n g t h e low-lying E E ' s o f a closed-shell g r o u n d s t a t e , o n e f e e l s that a s e t o f hole-particle d e t e r m i n a n t s would s u f f i c e a s a c h o i c e for a r e a s o n a b l e model s p a c e , w h i c h i n v o l v e s v a l e n c e h o l e s a 5 well a s v a l e n c e particles. T h i s i s a n IMS. I t c a m e t o b e g e n e r a l l y a c k n o w l e d g e d /20,44,66,90/ that t h e t h e o r i e s u s i n g I M S g e n e r a t e d i s c o n n e c t e d
Debashir Mukherjee and Sourav Pal
354
d i a g r a m s in H ff.In t h e quasi-complete model s p a c e theory of LinSgren/SO/, o f w h i c h t h e h-p model s p a c e i s a special c a s e , disconnected d i a g r a m s a p p e a r in CIppearance o f disconnected d i a g r a m s in H Hef ff s p e l l s a breakdown of t h e size-extensivity of ?he calculated energies/146,147/. T h i s s i t u a t i o n h a s c h a n g e d very recently w i t h t h e discovery o f Mukherjee/91-?3/ that t h e real reason behind getting a disconnected Heff for an I M S lies in ( i ) n o t using a Fock-space valence-universal R and ( i i ) n o t abandoning t h e intermediate normalization ( I N ) f o r R, w h i c h i s n o t size-extensive for INS. If t h e assumption of IN i s abandoned, it i s relatively straight-forward t o prove t h e linked c l u s t e r theorem ( L C T ) f o r i n c o m p l e t e model space. Gbandoning IN, Mukherjee/Sl/ initially proved L C T for i n c o m p l e t e model s p a c e s having n-hole n-particle determinants, s h o w i n g a l s o a t t h e s a m e t i m e t h e validity of t h e core-valence separation. T h e corresponding open-shell perturbation theory o f Brandow/20/ for s u c h c a s e s leads t o unlinked t e r m s and a breakdown o f t h e core-valence separation, w h i c h used IN for R. M u k h e r j e e emphasized that it i s essential to have a valence-universal w a v e o p e r a t o r S.2 within a F o c k s p a c e formulation/91/ s u c h that it a l s o c o r r e l a t e s t h e subduced v a l e n c e sectors. L a t e r on, t h e proof o f LCT w a s s h o w n for m o r e general model spaces/92/. T h e F o c k s p a c e a p p r o a c h is particularly importantfor this, w i t h the additional c o n d i t i o n s o f abandonment o f the conventional IN. M u k h e r j e e used a normal ordered ansatz/91-93/for t h e w a v e o p e r a t o r Q in o r d e r that t h e s o l u t i o n for t h e different v a l e n c e s e c t o r s will be decoupled ( S E C , a s in /69/ 1 . T h i s i s particularly useful c o m p u t a t i o n a l l y and l e a d s t o s i g n i f i c a n t s a v i n g o f t h e c o m p u t e r time.
.
8.2
General F o r m u l a t i o n
In w h a t follows, it will be useful t o c l a s s i f y t h e parent n-valence INS belonging t o t h e s p a c e P, t h e complernentary'active s p a c e , w h o s e union w i t h P f o r m s t h e CMS, a s R and t h e 'true' virtual s p a c e containing i n a c t i v e labels a s (3. T h e c l u s t e r o p e r a t o r s inducing t h e transition P+R a r e all labelled by v a l e n c e lines, and will henceforth be denoted a s 'quasi-open'. T h e o p e r a t o r s making t r a n s i t i o n s P+Q will be c a l l e d 'open'. T h e o p e r a t o r s c o n n e c t i n g model s p a c e s only will b e c a l l e d 'closed'. In g e n e r a l , for a valenceuniversal R, it so h a p p e n s that products of quasi-open
355
Cluster Expansion Methods in Open-Shell Correlation
o p e r a t o r s a r e closed. Thus, f o r e x a m p l e , c o n s i d e r a n I M S w i t h h-p determinants. S i n c e b o t h t h e v a c u u m So and t h e d o u b l y e x c i t e d s t a t e s a r e in t h e s p a c e R , t h e r e will b e both de-excitation and e x c i t a t i o n t y p e s o f quasi-open S o p e r a t o r s in 0, a s d e p i c t e d in Figs. 12(a) and (b). T h e i r product, however, i s a c l o s e d o p e r a t o r , a s s h o w n in Fig.lZ(c). Thus,although w e m a y c h o o s e a c l u s t e r o p e r a t o r S a s c o n t a i n i n g quasi-open and o p e n o p e r a t o r s o n l y , a wave-operator o f t h e f o r m e x p ( S ) o r {exp(S)) will gene r a t e c l o s e d o p e r a t o r s s t e m m i n g from t h e e x p a n s i o n involving p r o d u c t s o f quasi-open operators. &T intermediate normalization i s thus incompatible w i t h t h e a b o v e c h o i c e of S. We s h o w below, f o l l o w i n g Mukherjee/92-94/, how a connected H c a n b e o b t a i n e d u s i n g a Valenceeff. universal R w h i c h u s e s a n o r m a l i z a t i o n c o n v e n t i o n d i f f e r e n t f r o m i n t e r m e d i a t e normalization. W e s t a r t f r o m t h e Fock-space B l o c h Equation: p'k'
H Z IP ' k'
= meff
v
(8.2.1)
olklr,,
(8.2.3)
Using t h e a r g u m e n t s s i m i l a r t o t h o s e u s e d in 5 e c 7.3, w e a r r i v e a t (8.2.4) (8.2.5a) =
n {weffi
a.
Fig.12
(8.2.5b)
b.
C.
( a ) De-excitation and ( b ) e x c i t a t i o n q-op o p e r a t o r s in a n I M S w i t h h-p d e t e r m i n a n t s . ( c ) D e m o n s t r a t i o n t h a t t h e i r product i 5 closed.
356
Debashir Mukherjee and Sourav Pal
Since H is a c l o s e d o p e r a t o r , w e have ef f
(8.2 - 6 ) (8.2.7) (8.2.8) T h e eqs. (8.2.6) and (8.2.7) a r e t h e defining e q u a t i o n s for S and S and eq. (8.2.8) d e f i n e s Heff. T h e strucfur"8 o f thgPLquations s h o w s that S is a connected opeartor, and so i s H T h e most * and C M S 1ies in the c o n s p i c u o u s d i f f e r e n c e be tween f&f; a p p e a r a n c e of R in t h e former, a s in eq. (8.2.8). S i n c e eqs.(b.2.6) and (8.2.7) a r e q u i t e s i m i l a r in s t r u c t u r e t o t h o s e for a CMS, w e need n o t d i s c u s s thei r c o n s t r u c t i o n in detail. In fact, t h e left hand s i d e of eqs.(8.2.6) and (8.2.7) a r e entirely a n a l o g o u s t o those for a CMS. On their right hand s i d e , they will have c o n t r a c t i o n s of p o w e r s of S w i t h H and s i n c e f' Hef f , f r o m eq. (8.2.81, h a s a d i f f e r e n t e structure compared t o that in a CMS, w e shall d i s c u s s t h i s part in s o m e detail. Writing as
Rcl = 1 + w cl
(8.2.9)
tk,
w e may w r i t e H as ef f (8.2.10) tk>
w h i c h s h o w s a way t o recursively g e n e r a t e Hegf i s usually a s m a l l matrix, t h e ex r a As ff compl icaFion of i t s s t r u c t u r e in an I M S d o e s n o t pose any computational disadvantage. Kutzelnigg e t a1/95/ h a v e discussed in detail t h e Fock-space c l a s s i f i c a t i o n of o p e r a t o r s for quasic o m p l e t e model space. They d l 5 0 introduced a n e w t y p e od IMS, c a l l e d t h e isolated incomplete model space(1IMS). In IIMS, p r o d u c t s of q-open o p e r a t o r s a r e all q-open, n e v e r closed. A s a result R, = 1 , just a s in a CMS. T H e resulting CC e q u a t i o n s $or IFhS have t h u s e x a c l y t h e 5 a m e s t r u c t u r e a s in t h e CMS. M u k h e r j e e e t a 1 / 9 6 / have discussed t h e Fock-space
Cluster Expansion Methods in Open-Shell Correlation
357
classification scheme of the operators for a general model space. They also considered the unitary cluster ansatz for <>
,
(2 = exp(cv)
(8.2-11)
u = s - s t, s = s
q-='P
+ s
(8.2.12) OF
(8.2.13)
The choice (8.2.12) for (2 leads to a hermitian Heff. Mukherjee/YS/ has discussed the isometric normalization f o r i-2 in IMS, which also leads to a hermitian H eff' Chaudhuri et a1 have recently formulated an open-shell MBPT for IMS, abandoning intermediate at each order in a way normalization, by choosing 0 cl which ensures the connectedness of H /150/. eff 8.3 The Case of Ills Having Valence-holes and -particles A s ha5 been stressed in Sec. 8.1 the incomplete model space is both a computational and conceptual necessity for quasidegenerate systems. Since it is formally possible to have a completely connected cluster expansion theory based on a general incomplete model space it has been possible to apply such methods to the problem of excitation energies (EE). Because of the hierarchical generation of terms, the process of obtaining EE also leads to ionization potentials (IF) and electron affinities (EA). We will discuss the theory in relation to generating such quantities in this section. For the low-lying excited states of a closed-shell N-electron ground state one chooses the Hartree-Fock function Z a s the vacuum and uses the particle-hole (p-h) excyted determinants a s spanning the model space for excited states. The p-h excited determinants are a reasonable choice for the model space on the energetic consideration of the orbitals. One chooses a set of 'active' holes and 'active' particles to constitute the model space. This choice, according to our experience, avoids the intruder state problems for most cases o f study if the active orbitals are chosen properly. A s emphasized previously, the h-p excited determinants constitute an incomplete model space, even if we exhaust all the
358
Debashis Mukherjee and Sourav Pal
a c t i v e holes and a c t i v e particles in forming the determinants. S i n h a e t a1 formulated t h e MRCC theory for EE assuming all holes and particles t o be active/l38/. However, for the s a k e of computational convenience it is usual t o treat only a few of the particle and hole o r b i t a l s a 5 active, namely t h o s e near t h e Fermi level. Pal et al, arguing o n similar lines a 5 S i n h a e t a1/138/ developed a similar formulation for t h e c a s e of active-inactive subdivision/l39-140/. S i n c e t h e IMS under consideration h a s both valence holes and valence particles, it i s better t o classify t h e o p e r a t o r s according t o their p-h valence ranks. ck.1, In Fock s p a c e w e designate by ( 9 . } a model 1 s p a c e of kp-lh determinants. In the present c a s e k and 1 range from 0 t o 1. An operator A w h i c h c a n destroy m active particles and rnk a c t i v e holes i s denoted p
(m
, m >
.
E a c h of t h e s e i s a sum of several one, as t w o ..body operators. In t h e sprit of earlier d i s c u s s i o n s t h e valence universal w a v e operator S2 may be written as, ~2 = {exp(S r k , l i
11,
(8.3.1) (8.3.2)
S u c h a w a v e operator $3 i s c a p a b l e of correlating 7 k)l> the lower valence model spaces.Thus,for the IMS !P. w i t h k=l=l the lower valence model s p a c e s a r e *“,O’ 0 * a’> (O,O> .and \Er ,corresponding t o (N+l)-electron, i (A-1)-electron and Noelectron ground s t a t e problems. O n e c a n set u p t h e equations for t h e k=l=l valence- problem in a hierarchical manner. T h e valence s e c t o r s a r e defined in t e r m s of particle-hole rank. T h u s t h e Fock s p a c e Bloch equation would be
*(
Using a r g u m e n t s similar t o t h o s e given in Sec 8.2, w e find
where
Cluster Expansion Methods in Open-Shell Correlation
359
For the solution one starts from the lowest holeparticle 5ector ( o , o ) for s'*"' and then solves the ( 0 , l ) and (1,O) sectors finally reaching the target ( 1 , l ) sector. The equation for the(0,O)sector reduces to that for the closed shells. The model spaces for (0,l) and ( 1 , O ) sectors are always complete by definition. The energies of these two sectors are those of (N-1) electron states and (N+1) electron states respectively. I f we collectively call the q-open and openifry;ators a s external, then the equations for S are given by
The model space projection yields
tk.1,
"eff
. -
For complete model spaces, since IN is valid, we have
-
lcl
for k+l = 1
(8.3.7)
i.e.,eqs (8.3.5) and (8.3.6) yield the complete model space equations a 5 in the case o f ( N i l ) and ( N - 1 ) e 1 ec tron ,ef~;es. However, for the excited states k=l=l, eq.(8.3.7) is not valid. Using OC eq. ( B . Z . ~ ] ,*lcl ' ("" may be defined recursively a s "ef f
In recent calculations by Kaldor et a1 /133,134/, use was made of p-h excited determinants with IN for 0 ignoring the fact that such a choice causes the generation of unlinked diagrams in Heff' What Kaldor and co-workers did was essentially to ignore the deexcitation operators of the type shown in Fig. 12(a), and consequently impose IN on 0 . At first sight, it then appears that their ignoring the disconnected terms in Hef is a result of the cancellation of errors:(l) use o f IN for < > ; ( 2 )neglect of de-excitation s . However, Sinha et a1/138/ have shown that, when all the particles and holes are active, it i s impossible
360
Debashis Mukherjee and Sourav Pal (1.1)
.
to connect ;y to get a connected Heff The reason for this lieg'in the fact that the operator S ( 0 , O ) 9 of 2 Fig. 1 2 ( b ) has n o lines on its right, and hence cannot on its right. Thus althou h IN be contracted to H ef f, ( I , I > ( 9 ,i > does not hold good, i s still equal to H exactly a s in a CMS theory with I N for n.They gffo ' noted that the S ( " I ' operator,which can de-excite the p-h model space ?o Qi lying outside the model space, determining can contribute only Po the S("" equations only and the calcuf'ation of all other excitation operators S'i'l' as well as d o not ef f depend on the S'i'l' operator. Hence for computing the 2 . energy of the excited states one does not have to involve S " ' " operators at all. If the vacuum 2 diagrams are dropped in the construction of H ,the results are directly the difference energies.efAe same observation was found to be true even when only some holes and particles are active, as shown by Pal e t a1 /13?,140/. See /?4(a), 138-140/ for a more extensive treatment of this aspect. The above formulation is quite general and applies equally well to quasi-complete model spaces having m holes and n particles.When there are several p-h valence ranks in the parent model space, the situation is fairly complicated. The subduced model spaces in this case may belong to the parent model space itself. The valence-universality of 0 in such a situation implies that i s the wave-operator for all the subduced model spaces, in addition to those which have same number of electrons a s in the parent model space. It appears that a more convenient route to solve this problem is to redefine the core in such a way that hole5 for the problem become particles and treat it a5 an IMS involving valence particles only. The general I M S containing Oh-Op, lh-lp, nh-np has recently been considered by Stolarczyk and Monkhorst 185,'.
...,
8.4. Applications of IMS-based CC Theories The formalisms described in Secs. 8.2 and 8.3 have found several recent applications. Apart from the IP results /99,100,132/, there have also been calculations of some atomic electron affinities/l34/ all of which accompany EE computations. There are a few molecular EE results of N CO, CH N CH C 0 , C H by 2' Pal e t a1/139,140/, Rittby et a1/141/ ?n?'Mat?ie & a&/141/. The results so far have been obtained mostly
Cluster Expansion Methods in Open-Shell Correlation
361
in t h e s i n g l e s and d o u b l e s approximations. K a l d o r h a s reported s o m e IP r e s u l t s i n c l u d i n g a p p r o x i m a t e triples/134/. T h e r e h a v e a l s o been IP r e s u l t s w i t h a p p r o x i m a t e t r i p l e s in t h e g r o u p o f Bartlett/l42/. According t o t h e e x p e r i e n c e g a i n e d , it i s computationally convenient to construct exp(-~'~'*'H exp(s'o'o' ) , w h i c h i s a n e c e s s a r y building block, s e p a r a t e l y and s t o r e it a s a d v o c a t e d by M u k h o p a d h y a y e t a1/99/. M o s t of t h e r e s u l t s f o r IP a r e r e a s o n a b l y good. H o w e v e r , t h e r e a r e c e r t a i n third-order t e r m s w h i c h c a n b e a c c o u n t e d for by triples. Initial IP c a l c u l a t i o n w i t h t r i p l e s in a b a s i s i n c l u d i n g polarization f u n c t i o n s yield a l m o s t c h e m i c a l l y a c c u r a t e o r near-chemically a c c u r a t e results. T h e q u a l i t y of EF\ r e s u l t s in t h e s a m e b a s i s i s n o t u s u a l l y a 5 good b e c a u s e o n e n e e d s t o h a v e a r e a s o n a b l y large b a s i s s e t f o r EA calculations. T h e low-lying EE c a l c u l a t i o n s w i t h m o d e s t b a s i s s e t s of s y s t e m s like N2,C0, CH CO etc., yield r e s u l t s c o m p a r a b l e w i t h e x p e r i m e n z s and t h o s e o b t a i n e d by o t h e r methods. fit d o u b l e z e t a and o n e p o l a r i z a t i o n f u n c t i o n b a s i s level t h e d i s c r e p a n c i e s of s o m e o f t h e E E ' s f o r t h e s y s t e m s studied from the experimental results a r e usually much less than 0.1 eV. T h e d i s c r e p a n c y f o r m o s t o t h e r s i s from 0.1-0.5eV and a r e o f t e n lesser t h a n o t h e r methods. K o c h u s e d t h e IMS-based CC theory t o c F 2 p u t e to t h e _+Ps t a t e s of He and t h e e x c i t e d s t a t e s of C F+d/1 4 9 / In low-lying e x c i t e d s t a t e c a l c u l a t i o n s o n e h a s so far been a b l e t o g e t rid of t h e i n t r u d e r s t a t e problem. H o w e v e r , i t i s s t i l l n o t e s t a b l i s h e d t h a t in s t u d y i n g potential s u r f a c e s e v e n s u c h a g e n e r a l i n c o m p l e t e model space-based M R C C i s c a p a b l e of handling i n t r u d e r s t a t e problems. e l t h o u g h , in a very r e c e n t pilot potential e n e r g y c a l c u l a t i o n , K o c h and M u k h e r j e e / l 4 3 / c o u l d g e t m o s t r e g i o n s of t h e potential c u r v e s for He+', and K a l d o r / 1 4 4 / and Ben-Shlomo and 2 K a l d o r / l 4 5 / g o t c u r v e s for L i H , f u r t h e r s t u d i e s a r e c l e a r l y needed.
.
8.5
Size-extensivity
of E n e r g i e s w i t h I M S
It may a p p e a r t h a t t h e c o n n e c t e d n e s s of H is eff n o t by itself a g u a r a n t e e for t h e size-extensivity of energies, s i n c e d i s c o n n e c t e d t e r m s may a p p e a r d u r i n g t h e diagonalization. C h a u d h u r i e t a1 1 1 4 6 1 h a v e d e m o n s t r a t e d r e c e n t l y that s u c h i s n o t t h e c a s e because o f t h e s p e c i a l s t r u c t u r e of H eff'
362
Debashir Mukherjee and Sourav Pal
O n e may f o l l o w t h e d i a g o n a l i z a t i o n p r o c e s s in o r d e r s of perturbation. In a g e n e r a l CI m a t r i x w i t h IMS, t h e d i s c o n n e c t e d t e r m s a r i s i n g f r o m t h e normc o r r e c t i o n may b e c a s t a s p r o d u c t s of c l o s e d d i a g r a m s , w i t h i n t e r m e d i a t e s belonging t o t h e P s p a c e o r t o t h e R space. T h e r e a r e n o c o m p e n s a t i n g d i a g r a m s w i t h R s p a c e i n t e r m e d i a t e s c o m i n g f r o m t h e general sum, s i n c e R s p a c e f u n c t i o n s a r e n o t present in t h e CI matrix. T h i s i s t h e r e a s o n why t h e r e a r e d i s c o n n e c t e d d i a g r a m s in a CI w i t h IMS. In t h e H matrix, however, t h e i n t e r m e d i a t e s in t h e n o r m 'ierm c a n n e v e r h a v e R s p a c e d e n o m i n a t o r s , s i n c e all q u a s i o p e n L o p e r a t o r s leading R t r a n s i t i o n s a r e by c o n s t r u c t i o n zero. T h e to P disconnected terms with P space intermediates a r e e x a c t l y c a n c e l l e d by t h e c o n t r i b u t i o n s f r o m t h e general s u m w i t h P s p a c e intermediates. T h e situation can be illustrated using, as an ex m p l e , a n I M S w i t h B and s i n g l y e x c i t e 9 s t a t e s {a a B 1. T h e o p e r a t o r g of t h e t y p e A a a are matrix-e+em&tsPa?e zero. A s qugsP-8pen type. T h u s L a r e s u l t @ d o e s n o t coub?e w i t h € a a H I(Fig.l3(a)). T h e g r o u n d s t a t e e n e r g y i s then jugt" O, and is size-extensive. It c a n a l s o b e s h o w n t R a $ t h e o t h e r r o o t s a r e a l s o size-extensive. In a CI matrix w i t h t h e s a m e functions, however, t h e s i t u a t i o n i s i s not zero q u a l i t a t i v e l y different. H e r e H (Fiq.l3(b)), and a s a r e s u l t wePf?ave d i s c o n n e c t e d t e r m s in t h e g r o u n d s t a t e e n e r g y from t h e norm-correction having d o u b l y , triply R space i n t e r m e d i a t e s w h i c h r e m a i n uncancelled. T h i s is s h o w n in Figs.l4(a),(b). F o r m o r e e x t e n s i v e d i s c u s s i o n s see e.g. /146/. It i s tempting t o f o r c e IN o n fl, o n c e a c o n n e c t e d f o r m a l i s m i s developed.-Thus, it may a p p e a r t h a t o n e s a t i s f y i n g IN f o r t h e c a n d e f i n e a d i f f e r e ?&,f parent model s p a c e P f
f
-
P
...
n=
mi;
(8.5.1)
w h i c h will d e f i n e a n & by a e f f related t o H similarity transformation and thus wily generate a size-extensive theory.This i s n o t so in p r a c t i c e , s i n c e f o r a S r u n c a t e d c a l c u l a t i o n it i s i m p o s s i b l e t o find o u t a n 0-with a n e x p o n e n t i a l s t r u c t u r e w h i c h generates an H similar t o an H A simple wayto ef f look at-the t r a n s f o r m a t i o n (8.5.17 is t o i n c l u d e in &Cexp(S)) s o m e c l o s e d o p e r a t o r Xc s u c h t h a t fl = 1. F o r t h e h-p model s p a c e , f o r exampfe, a two-bodyCA c a n b e u s e d t o e l i m i n a t e t h e d i s c o n n e c t e d c l o s e d terms, a s s h o w n in Fig. 15. This, however, g e n e r a t e s a f f
f
f
,
-
Cluster Expansion Methods in Open-Shell Correlation
363
1 Fig.13
( a ) T h e s t r u c t u r e of H matrix. N o t e t h e null-block in t h e loweFffdiagonal. ( b ) T h e s t r u c t u r e of H-matrix in t h e s a m e I M S .
a. Fig.14
b.
( a ) T h e norm-correction t e r m f r o m Fig.l3(b) a t s e c o n d order.(b) T h i s c a n b e c o n v e r t e d t o d i a g r a m s w i t h d o u b l y e x c i t e d intermediates.
Fig.15 G e n e r a t i o n o f a d i s c o n n e c t e d X c l i m p o s i t i o n o f IN o n R.
after the
364
Debashis Mukherjee and Sourav Pal
disconnected X o p e r a t o r w h i c h will yield a cl disconnected f r o m t h e expression "ef f (8.5.2)
F o r a n e x t e n s i v e d i s c u s s i o n of t h i s aspect, see t h e recent paper by C h a u d h u r i e t a1/146/, w h o a l s o d e m o n s t r a t e t h e e x p l i c i t break-down of sizeextensivity o f energy if IN i s enforced o n 0 .
9.
CONCLUDING REMARKS
:
W e may s u m m a r i z e o u r c o n c l u s i o n s a s follows: ( a ) F o r a size-extensive theory for both core- and valence-electrons, a F o c k s p a c e formulation o f f e r s the best insight a 5 r e g a r d s t h e connectivity o f t h e c l u s t e r o p e r a t o r S and H for a CMS. In particular, ef f t h e Fock s p a c e s t r a t e g y c a n be used t o g e n e r a t e energy d i f f e r e n c e s w i t h respect t o a closed shell ground state. ( b ) F o r an INS, t h e connectivity of S and H c a n b e proved only in F o c k space. T h i s r e q u i r e s t h e u s e o f normalization for a v a l e n c e universal Q that is size-extensive. ( c ) When N > > N V , . a n d N i s kept fixed, C V core-extensive t h e o r i e s using semi-cluster e x p a n s i o n may be a v i a b l e strategy. T h e quality o f c a l c u l a t i o n will be increasingly poorer w i t h t h e i n c r e a s e in N V ( d ) F r o m the initial t r e n d s o f producing a c c u r a t e results, r i g o r o u s m a i n t e n a n c e o f size-extensivity and t h e potentiality t o o b v i a t e t h e intruder-state problem, it a p p e a r s that t h e formulation c o v e r e d in Sec. 8 will turn o u t t o b e o n e of t h e m o s t versatile and v i a b l e t o o l s in t h e c u r r e n t s t a t u s o f e n e r g y d i f f e r e n c e c a l c u l a t i o n s ( s u c h a 5 IP, Efi and EE ) in particular, and electronic s t r u c t u r e c a l c u l a t i o n s in general.
.
ACKNOWLEDGEMENTS T h e a u t h o r s thank W.Kutzelnigg, 1 - L i n d g r e n , U.Kaldor, R.J.Bartlett and S.Koch f o r many stimulating
Cluster Expansion Methods in Open-Shell Correlation
365
d i s c u s s i o n s and f o r explaining t h e i r v i e w p o i n t s about t h e open-shell c l u s t e r e x p a n s i o n theories, and P.O. Lawdin for h i s kind patience w i t h u s as t h e e d i t o r of t h i s volume. Special t h a n k s a r e d u e t o S.Mukhopadhyay, D.Mukhopadhyay and A - M a j u m d a r for their kind help in bringing themanuscript in final form. T h e financial support of t h e Department of S c i e n c e and Technology ( N e w D e 1 h i ) i s also g r a t e f u l l y acknowledged.
REFERENCES 1. See, e.g., ( a ) J. Paldus, and J. Cizek, Adv. Quantum Chem. '3, 305 (1975). ( b ) P. Jorgensen, and J. S i m o n s , Second-quantization Based M e t h o d s In Quantum Chemistry (Acad. P r e s s , N.Y., 1981) for pedagogical surveys. 2. C.Bloch and J - H o r o w i t z , Nucl. Phys., t3, 9 1
(1958).
Rev. Mod. Phys., 5 9 , 771 (1967). Adv. Quantum Chem., 10, 187 (1977). 1 - L i n d g r e n , J. Phys. BJ, 2441 (1971). 1.Lindgren and J.Morrison, Atomic Many Body Theory ( S p r i n g e r Verlag, Heidelberg, 1981). V - K v a s n i c k a , Adv. Chem. Phys., 36, 345 (1977). G.Oberlechner, F.Owono-N-Guema and J.Richert, N u o v o C i m . , B B , 23 (1970). T.T.S.Kuo, S.Y.Lee and E.F.Ratcliffe, Nucl. Phys.,A176, 65 (1971). M.B.Johnson and M.Baranger, Ann. Phys., 62, 172 (1971). A - B a n e r j e e , D - M u k h e r j e e and J.Simons, J. Chem. Phys., 62, 1979, 1995 (1982). J - L i n d e r b e r g and Y.Ohrn, P r o p a g a t o r s In Quantum Chemistry (Acad. Press,N.Y., 1973). G.Csanak, H.S.Taylor and R.Yaris, Adv. At. Mol. Phys., 288 (1971). L.S.Cederbaum and W.Domcke, Adv. Chem. Phys., 3 6 , 206 (1977). W von Niessen, L - S - C e d e r b a u m , W.Domcke and J. Schirmer, in Computational M e t h o d s In Chemistry 1980). (Ed: J.Bargon, P l e n u m P r e s s , N.Y., ( a ) Y.0hrn and G.Born, Adv. Quantum Chem., 13, 1 ( 1981). ( b ) J.Oddershede, Adv. Q u a n t u m Chem., ll, 275 (1976). J.Simons, Ann. Rev. Phys. Chem., 2 3 , 15 (1975).
3 . B.Brandow,
4. 5. 6.
7. 8. 9. 10.
11.
12.
13.
z,
366
Debashis Mukherjee and Sourav Pal
14. J.Paldus and J.Cizek, J. Chem. Phys., 6 0 , 49 ( 1974) 15. M.F.H~rman, K.F.Freed and D.Y.Yeager, Fldv. Mole. Phys., 48, 1 (1981). 16. T.H.Dunning and V.Mckoy, J. Chem. Phys., 47, 1735 (1967). T.Shibuya and V.Mckoy, Phys. Rev., A2, 2208 (1970). 17. R.J.Bartlett and G.D.Purvis, Phys.Scripta, 21,255 (1980). R.J.Bartlett and G.D.Puvis, 1nt.J. Quantum Chem., 14, 561 (1978). 18. R.J.Bartlett, Ann. Rev. Phys. Chem., 3 2 , 359 (1981) 19. J.Paldus, in New H o r i z o n s O f Quantum Chemistey (Ed: P.Lowdin and B.Pullman, Reidel, Dordrecht, 1983) 20. B.Brandow, in New H o r i z o n s O f Quantum Chemistry (Ed: P.Lowdin and B.Pullman, Reidel, Dordrecht, 19831 21. (a) R.J.Bartlett, C.E.Dykstra and J.Paldus, in Advanced and Computational Approaches to t h e Electronic S t r u c t u r e o f Molecules (Ed: C.E.Dykstra, Reidel,Dordrecht, 1984) ( b ) K.Jankowski, in M e t h o d s in Computational Chemistry, Vol 1 (Ed: S.Wilson, Plenum Press, N.Y., 1987) 22. V.Kvasnicka, V.Laurinc and S.Biskupic, Phys. Rep., 90, 159 (1982) V.Kvasnicka, V - L a u r i n c , S - B i s k u p i c and M.Haring, Cldv. Chem. Phys., 5 2 , 181 (1983) 23 B.Brandow, Int. J. Quantum Chem., l5, 207 (1979) 24. J.A.Pople, J.S.Binkley and R.Seeger, Int. J. Quantum Chem. S y m p . , E , 1 (1976) 25. K.A.Bruckner, Phys. Rev., 100, 36 (1955) 26. J.Goldstone, Proc. Roy. SOC., 6239, 267 ( 1957) 27. J.Hubbard. Proc. Roy. SOC., A240, 539 (1957) N.M.Hugenholtz, Physica, 2 3 , 481 (1957) 28. 0 - S i n a n o g l u , Adv. Chem. Phys., b_, 315 (1964) 29. H.Primas, in Modern Quantum Chemistry; Istambul L e c t u r e s (Ed: O.Sinanoglu, Flcad. P r e s s , N.Y., 1965) 30. F.Coester, Nucl. Phys., 1, 421 (1958) 31. F.Coester and H.Kume1, Nucl. Phys., l7, 477 ( 1960) H.Kume1, Nucl. Phys., 22, 177 (1969) 32. J.Cizek, J. Chem. Phys., 4 5 , 4256 (1966) Fldv. Chem. Phys.. l4, 35 (19691
Cluster Expansion Methods in Open-Shell Correlation
33. J.Paldus, J.Cizek and I.Shavitt, Phys. Rev., AM, 50 (1972) 34. See, e.g., refs 18,19,21,22 for extensive references 35* (a)S.Koch and W.Kutzelnigg, Theo. Chim. Acta., 5 9 , 387 (1981). (b)M.R.Hoffmann , H.F.Schaefer, Adv. Quantum. Chem. ,207 (1986). 36. P.R.Taylor, G.B.Bacskay, N.S.Hush and fi.C.Hurley, Chem. Phys. Lett., 4 l , 444 (1976). 37. R-Ahlrichs, Comput. Phys. Comm.,l7, 31 (1979). 38. J.Paldus and J.Cizek, Physica Scripta, 3, 251 (1980). 39. J.W.Keney, J.Simons, G.D.Purvis and R.J.Bartlett, J. Am. Chem. SOC., 100, 6930 (1978). 40. Y.Lee, S.A.Kucharski and R.J.Bartlett, J. Chem. Phys., 81, 5906 (1984). M.Urban, J.Noga, S.J.Cole and R.J.Barelett, J. Chem. Phys., 83, 4041 (1985). 41. S-Wilson, K.Jankowski and J.Paldus, Int. J. Quantum C h e m . , Z , 1781 (1983) 42. G.F.F\dams, G.D.Bent, G.D.Purvis and R.J.Bartlett, Chem. Phys. L461 (1981). G.F.Adams, G.D.Bent, R.J.Bartlett and G.D.Purvis, in Potential Energy Surfaces and Dynamics Calculations (Ed: D.G.Truhlar, Plenum Press, N.Y., 1981). 43. P.E.Siegbahn, Int. J. Quantum Chem., l8, 1229 (1980). 44. G.Hose and U.Kaldor, J. Phys. 3827 (1979). Phys. Scripta, Z l , 357 (1980). Chem. Phys., 6 2 , 419 (1981). J. Phys. Chem., 886, 2133 (1982). 45. M.Shepard, J. Chem. Phys., 80, 1225 (1984). 46. For a thorough and modern treatment, see, e.g., W-Kutzelnigg in Modern Theoretical Chemistry Vol 3 (Ed: H.F.Schaefer, Plenum Press, 1977). 47. W.Meyer, J. Chem. Phys., 58, 1017 (19731. 48. H.Silverstone and O.Sinanoglu, J. Chem. Phys., 44, 1899,3608 (1966). 49. K-Roby, Int. J. Quantum Chem., 6, 101 (1972). 50. H.Nakatsuji, Chem. Phys. Lett., 59, 362 (1978). Chem. Phys. Lett., 6 7 . 329 (1979). H.Nakatsuji and K.Hirao, J. Chem. Phys., 61, 2053 (1978). J. Chem. Phys., 668, 4279 (1978).
,a
u,
367
368
Debashir Mukherjee and Sourav Pal
51. H.Nakatsuji and T.Yonezawa, Chem. Phys. Lett., 87 426 (1982). 52. H.Nakatsuji, Int. J. Quantum Chem. Symp., l7, 241 (1983) and references therein 53. K.Hirao, J. Chem. Phys., 79, 5000 (1983) 54. J.Paldus, J.Cizek, M.Saute and L.Laforgue, Phys. Rev., A x , 805 (1978) 55. M - S a u t e , L.Laforgue, J.Paldus and J.Cizek, Int. J. Quantum Chem., l5, 463 (1979) 56. H.J.Monkhosrt, Int. J. Quantum Chem., S A , 421 (1977). 57. E.Dalgaard and H.J.Monkhorst, Phys. Rev., A a , 1217 (1983). 58. D - M u k h e r j e e and P.K.Mukherjee, Chem. Phys., 3 7 , 325 (1979). S.Ghosh, D - M u k h e r j e e and S.N.Bhattacharyay, Chem. Phys., 72 961 (1982). 59. S.Ghosh, D - M u k h e r j e e and S.N.Bhattacharyay, Mol. Phys., 43, 173 (1981). 60. S.Ghosh and D-Mukherjee, Proc. Ind. Acad. Sci., 93, 947 (1984). 61. H.Sekino and R.J.Bartlett, Int. J. Quantum Chem. S,255 (1984). 62. K.Emrich, Nucl. Phys.,A351, 379 (1981). 63. M.D.Prasad, S.Pal and D - M u k h e r j e e , Phys. Rev., 4 3 , 1287 (1985). 64. A.Banerjee and J - S i m o n s , Int. J. Quantum Chem., l ? , 207 (1981). 65. A.Banerjee and J.Simons, J.Chem. Phys., 76, 4548 (1982). Chem. Phys., 82, 297 (1983). Chem. Phys., 8 7 , 215 (1984). 66. B.Jeziorski and H.J.Monkhorst, Phys. Rev., 4 2 , 1668 (1981). 67. D - M u k h e r j e e , R.K.moitra and A.Mukhopadhyay, Pramana, 4, 247 (1975). Mol. Phys., 30, 1861 (1975). 68. (a)D.Mukherjee, R.K.Moitra and A.Mukhopadhyay, Mol. Phys., 33, 955 (1977). (b)D.Mukherjee ,G.Mukhopadhyay and R.K.Moitra , Z.Nuturforsch , 3 ,1549 (1978) 69. D - M u k h e r j e e , Pramana, l2, 203 (1979). M.Haque and D.MukherJee, J.Chem. Phys., 8 0 , 5058 (1984). 70. R.Offermann, W.Ey and H.Kumme1, Nucl. Phys., F1273, 349 (1976). R.Offermann, Nucl. Phys., A 2 7 3 , 368 (1976). W.Ey, Nucl. Phys., 8296, 189 (1978). 71. I.Lindgren, Int. J. Quantum Chem. Svmp., l2, 33 (1978).
Cluster Expansion Methods in Open-Shell Correlation
369
72. V.Kvasnicka, Chem. Phys. Lett., 79, 89 (19811, Adv. Chem. Phys., 52, 181 (1982). 73. ( a ) S.Pa1, M.D.Prasad and D.Mukherjee, Theoret. Chim. Acta., 6 6 , 311 (1984), Theoret. Chim. Acta., 62, 523 (1983). 74. M - H a q u e and D.Mukherjee, Proc. 5th ICQC, Montreal(l985). 75. P.Westhaus, 1nt.J.Quantum Chem., 7 5 , 463 (1973) 76. W.Kutzelnigg, J.Chem. Phys., 7 7 , 3081 (1981) J. Chem. Phys., 83, 822 (1984). 77. W-Kutzelnigg and S.Koch, J. Chem. Phys., 7 9 , 4315 (1983). 78. H - R e i t z and W.Kutzelnigg, Chem. Phys. Lett., 66, 11 (1979). 79. J.des Cloizeaux , Phys.Rev.,l35A , 685,698 (1964) 80. B-Lukman and O.Goscinski, Chem. Phys. Lett., 573 (1970) 81. B-Brandow, Ann. Phys., 64, 2 1 (1971). 82. M . A , R o b b , D.Hegarty and S.Prime, in Excited States in Quantum Chemistry (Ed: C.A.Nicolaides and D.R.Beek,Reidel). 83. T.T.S.Kuo and E.M.Krenciglowa, Nucl. Phys., ~ 3 4 2 ,454 ii980). 84. D-Mukherjee and W-Kutzelnigg, in Many Body Methods in Quantum Chemistry (Ed: U.Kaldor, Springer Verlag, Heidelberg, 1988). 85. L.Z.Stolarczyk and H.J.Monkhorst, Phys. Rev., A X , 725,743 (1986). Phys. Rev. AX,l?06,1926 (1988). 86. T.H.Schucan and H.A.Weidenmuller, Ann. Phys., 73, 1 0 8 (1972). Ann. Phys., 76. 483 (1973). 87 D.Hegarty and M.A.Robb, Mol. Phys., 3 7 , 1455 (1979). H.Baker, M.A.Robb and Z.Slattery, Mol. Phys., 44, 1035 (1981). 88. M.Haque and D-Mukherjee, Pramana, 23, 651(1984). 89. I.Lindgren, Phys. Scripta, 32, 291 (1985). 90. 1.Lindgren. Phys. Scripta, 32, 6 1 1 (19851. Sci., 9 6 , 1 4 5 91. D.Mukkerjee, Proc. Ind. &ad. (1986). 92. D.Mukherjee, Chem. Phys. Lett., 125, 207 (1986) 93. D-Mukherjee, Int. J. Quantum Chem. Symp., 20 ,409 (1986) (a) 1.Lindgren and D-Mukherjee, Phys. Rep., 94* 151 , 93 (1987) (b) D.MukherJee, in Condensed Matter Theories, Vol 3 (Ed: J-Arponen, R.F.Bishop and M.Monninen, Plenum Press, N.Y., 1988).
z,
.
.
.
.
370
Debashir Mukherjee and Sourav Pal
W.Kutzelnigg, D.Mukherjee and S.Koch, J. Chem. Phys., 87, 5902 (1987). 96. D.Mukherjee, W-Kutzelnigg and S.lioch, J. Chem. Phys., 87, 5911 (1987). 97. S.S.Z.4dnan, S.N.Bhattacharyay and D-Mukherjee, Chem. Phys. Lett., 85, 204(1982) 98. S.S.Z.Adnan, S.N.Bhattacharyay and D-Mukherjee, Mol. Phys., 3 9 , 519 (1982). 99. S.K.Mukhopadhyay, D.Sinha, M.D.Prasad and D-Mukherjee, Chem. Phys. Lett., 117,437 (1986). 100. S.Roy, S.Sengupta, D-Mukherjee and P.K.Mukherjee, Int. J. Quantum Chem., 2 3 , 205 (1986). 101. In the-collection of the various expressions for F and V, a particular type o f integral was unfortunately dropped.In actual applications these terms are ,however, present. 102. K.Hirao and H-Nakatsuji, J. Comput. Phys., 45, 246 (1982). See also, S.Rettrup, J. Comput. Phys., 45, 100 (1982) . 103. E.R.Davidson, J.Comput. Phys., l7, 87 (1975) J. Phys., 4 3 , L179 (1980). 104. P.W.Langhoff, S.T.Epstein and N.Karplus, Rev. Nod. Phys., 4 5 , 602 (1972) 105. H-Takahashi and J.Paldus, J . Chem. Phys., 8 5 , 1486 (1986). 106. J.Geertsen and J-Oddershede, J. Chem. Phys., 82 , 2112, (1986) 107. S.Ghosh, S.K.Mukhopadhyay, R.Chaudhuri and D.Mukherjee, to be published 108. P.O.L&din, Phys. Rev., 97, 1474 (1955). 109. See. e . g . , P.O.L&din, J. Math. Phys., 3, 1171 (1962) for an early discussion o f this aspect. Also, P.O.L&din, in Horizons o f Quantum Chemistry (Ed: K.Fukui and B-Pulmann, Reidel, 1980). 110 .O.Sinanoglu and I.Oksuz, Phys. Rev.Lett., 21, 507 (1988) ,Phys. Rev., 42 (1969) 0.Sinannoglu and B.J.Skutnik, J.Chem. Phys., 62, 3670 ( 1974). 111. R.J.Bruna , S.D.Peyermhoff and W.Butscher, Mol. Phys., 35, 771 (1978). R.J.Bruckner and S.Peyerimhoff, in New Horizons o f Quantum Chemistry, (Ed: P.O.Lowdin and B.Pulmann, Reidel, Dordrecht, 1983). 112. B.Jonsson, B.O.Ross, P.R.Taylor and P.E.M.Siegbagn, J. Chem. Phys., 74 ,4566 (1981). 95.
.
.
.
u,
.
Cluster Expansion Methods in Open-Shell Correlation
113.
114. 115.
116. 117. 118.
119. 120. 121.
122.
123. 124 125
-
126.
127. 128.
371
H.Lischka, R.Shepard, F.B.Brown and I.Shavitt, Int. J. Quantum Chem. Symp., lS, 9 1 (1981). J. Paldus, in Theoretical Chemistry : Advances and Perspectives Vol 2, (Ed: H.Eyring and D-Henderson, Acad Press, N.Y., 1976). 1-Shavitt, Int. J. Quantum Chem., 5 2 , 131 (1977) Int. J. Quantum Chem., 5 3 , 5 (1978). ( a ) W.D.Laidig, P.Saxe and R.J.Bartlett, J. Chem. Phys., 86, 887 (1987). ( b ) M.Hoffmann and J-Simons, J. Chem. Phys., 82, 993 ( 1988). ( c ) H.Baker and M.A.Robb, Mol. Phys., 5 0 , 20 (1983). (d) R.Tanaka anmd H.Terashima, Chem. Phys. Letts., &, 558 (1984). C.Bloch, Nucl. Phys., 6, 329 (1958). P.O.L&wdin, Adv. Chem. Phys., 2 , 207 (1959). (a) W.D.Laidig and R.J.Bartlett, Chem. Phys. Letts., 104, 424 (1984). (b) B.Jeziorski and J.Paldu5, J. Chem. Phys., 8 8 , 5673 (1989). D.Sinha, R-Chaudhuri, S.Sengupta and D.Mukherjee, to be published D. Mukher jee, R. K Moi tra and A . Iluk hopadhyay , Pramana, 9 _ , 6 (1977) P.Westhaus, Int. J. Quantum Chem., 7 5 , 463 (1973). P.Westhaus and E.G.Bradford, J. Chem. Phys., 6 3 , 5416 (1975). P.Westhaus, E.G.Bradford and D.Hal1, ibid, 62, 1067 (1975) . D.Muk her j ee , R K .Moi tra and A Mukhopadhyay , Ind. J. Pure and Appl. Phys., 15, 6 1 3 ( 1 9 7 7 ) A .Muk hopadhyay , R .K .Moitra and D .MukherJee, J. Phys, l2, 1. M.Haque, Ph. D thesis, University o f Calcutta (1983). S.Koch, Ph. D. thesis, Ruhr- University o f Bochum (1983) . W-Kutzelnigg, H.Durmaz, H.Reitz and S . K o c h , Proc. Ind. Acad. Sciences, 96 ,177 (1986). M.Haque and D-Mukherjee, 5th ICQC Proceedings Montreal (1985). F.Jorgensen, Mol. Phys., 29, 1137 (1975). F.Coester, in Lectures in Theoretical Physics, Vol 11B (Ed: K.T.Mahanthappa, and W.E.Brittin, Gordon and Breach, N.Y.,1969) -
.
.
.
.
.
-
372
Debashis Mukherjee and Sourav Pal
,
129. 1-Lindgren, Phys. Rev., A 3 1 130. S.Salomonson, Z . Phys., A316,
(1985). 1273 135 (1984).
S.Salomonson and A.M.Martensson-Pendrill, Phys. Rev., F\30, 712 ( 1 9 8 4 ) . 131. S.Solomonson, 1-Lindgren and A-Martensson, Physica Scripta, 21, 351 ( 1 9 8 1 ) . 132. ( a ) A.Haque and U.Kaldor, Chem. Phys. Lett.,
u,347
(1985).
ibid, 120, 261 ( 1 9 8 5 ) . ibid, 128, 45 ( 1 9 8 6 ) . (b) 4.Haque and U.Kaldor, Int. J. Quantum Chem., 2 9 , 425 (1986) 133. U.Kaldor, J. Comput. Phys., S _ , 448 ( 1 9 8 7 ) . 134. U.Kaldor, J. Chem. Phys., 87, 467 ( 1 9 8 7 ) . 135. D-Sinha, S.K.Mukhopadhyay, M.D.Prasad and D-Mukherjee, Chem. Phys. Lett.,
.
213
(1986)
.
m,
D.Sinha, S.K.Mukhopadhyay, R.Chaudhuri and D.Mukherjee, C h e m . P h y s . L e t t s , m , S 4 4 ( 1 9 8 9 ) 137. E.Soliverez, Phys. Rev. A,* 4 (1981). 138. D.Sinha, S.Mukhopadhyay and D.Mukherjee, Chem. Phys. Letts., 127, 369 ( 1 9 8 6 ) . 139. S.Pa1, M.Rittby, R.J.Bartlett, D.Sinha and D.Mukherjee, Chem. Phys. Letts, 137, 136.
273 ( 1 9 8 7 ) . 140.
S.Pal, M.Rittby, R.J.Bartlett, D.Sinha and D-Mukherjee, J. Chem. Phys., 8388, 4357 ( 1 9 8 8 ) .
(a) M.Rittby, S.Pal and R.J.Bartlett, J. Chem. Phys., (in press). (b) R.Mattie, M.Rittby, R.J.Bartlett and S.Pal, in Aspects of Many Body Effects in Molecules and Extended Systems (Ed: D-Mukherjee, Springer Verlag, Heidelberg, in press). 142. S.Pa1, R.Mattie, M.Rittby and R.J.Bartlett, to be published . 143. S.Koch and D.Mukherjee, Chem. Phys. Letts., 141.
145 ,
321 ( 1 9 8 8 ) .
U.Kaldor, in Condensed Matter Theorie5,Vol 3 (Ed: J-Arponen, R.F.Bishop and M-Manninen, Plenun Press, N . Y . , 1988). 145. S.Ben-Shlomo and U.Kaldor, J. Chem. Phys., 144.
83,956
(1988).
R.Chaudhuri,S.Guha and D-Mukherjee, in Many Body Methods in Quantum Chemistry (Ed: U.Kaldor, Springer Verlag, Heidelberg, 1989, in press). 147. R.Chaudhuri,D.MukherJee and M.D.Prasad, in Aspects o f Many Body Effects in Molecules and Extended Systems,(Ed: D-Mukherjee, 146.
Cluster Expansion Methods in Open-Shell Correlation
373
Springer Verlag, Heidelberg, 1989, in press) 148. U.~aldor,S.Roszak,P.C.Harlharihara~ and J.J.kaufman, J. Chem. Phys., (1988), in press. 149. S . K o c h , in Aspects of Many Body Effects in Molecules and Extended S y s t e m s (Ed : D. Mub.herJee, Springer Verlag, 1989, in press). 1 5 0 . R.Chaudhuri ,D.Mukhopadhyay and D.Muhherjee, in Aspects of Many Body Effect5 in Molecules and Extended Systems,(Ed: D.Mukherjee, Springer Verlag, Heidelberg, 1989, in press) ADDENDUM
:
There have been s o m e very recent works which w e quote for completeness :
(a) Int. J. (b) Aspects Systems press).
Meissner, K . Jankowski and J. Wasilewski, Quantum Chem. 34, 535 (1988). R.F. Bishop, J. Arponen and E. Pajanne, in of Many-Body Effects in Molecules and Extended (Ed : D. Mukherjee, Springer-Verlag, 1989, in L.
( c ) B - J e z i o r s k i and J.Paldus ,in Many Body Methods In Quantum Chemistry (Ed :U.Kaldor ,Springer Verlag, Heidelberg ,1989 ,in press) ( d ) U.Kaldor ,in Condensed Matter Theories,Vol 4 (Ed :J.Keller ,Plenum P r e s s ,1989 ,in press) ( e ) S.Guha ,R.Chaudhuri and D - M u k h e r j e e , in Condensed Matter Theorie5,Vol 4 (Ed.:J.Keller, Plenum P r e s s ,1989 ,in press)
This Page Intentionally Left Blank
JACOB1 ROTATIONS: A GENERAL PROCEDURE FOR ELECTRONIC ENERGY OPTIMIZATION*
Ramon Carb6 Joan Mir6 Unitat de Quimica, Col.legi Universitari de Girona, 1707 1 -Girona , SPAIN
Llorenq Doming0 Seccio de Quimica Quantica, Dept. de Quimiometria, Institut Quimic de Sarria, 08017-Barcelona, SPAIN
Juan J. Novoa Dept. de Quimica Fisica, Facultat de Quimica, Univ. de Barcelona, 08028-Barcelona. SPAIN
*
A contribution from the "Grup de Quimica Quhntica del Institut d'Estudis Catalans"
ADVANCES IN QUANTUM CHEMISTRY, VOLUME 20
375
Copyright 0 1989 by Academic Press. Inc. All rights of repmduction in any form reserved.
Ramon Carbo, Joan Miri, Lloreng Domingo, and Juan J. Novoa
376
TABLE OF C 0 " T S 1.
2. 3. 4.
5.
6.
7.
Introduction Elementary Jacobi Rotations Variation of Quantum Mechanical Objects in the MO framework via Elementary Jacobi Rotations Some application examples 4.1. EJR on monoconfigurational density matrices 4.2. EJR on CI energy 4.3. EJR on two electron molecular integrals Monoconfigurational energy expressions 5.1. General variation structure 5.2. The Roothaan-Bagus Atomic Energy as a particular case Multiconfigurational Energy Expression 6.1. Variational structure 6.2. Structure of MC Elementary Jacobi Rotations 6.3. The CI-MO computation algorithms 6.4. Results and discussion 6.5 Comparison with the exponential MCSCF methodology Optimization of rotation sine 7.1. An exact solution 7.2. Performance improving procedures 7.3. Localization procedures and Jacobi rotations Acknowledgements Appendix I Appendix I1 References
Jacobi Rotations
377
1. INTRODUCTION
Some time ago, we started / 1 / the development of the Elementary Jacobi Rotation (EJR) algorithms which were known from old times /2, 3, 41. As a consequence of this development some intermediate work has been published /5,6/. The present paper corresponds, essentially, to the application of our EJR experience to multiconfigurational calculations. The current work presented here must be necessarily considered as a new step towards an easy, comprehensive and cheap way to obtain atomic and molecular wavefunctions. In the same path we are attempting here to show that the Jacobi Rotation techniques may be considered viable alternative procedures to the usual SCF and Unitary Transformation algorithms. We present here some developments which can compete with other procedures found in the current literature and program lore. In this sense, we deal with t h e current state of the art in the calculation of the minimal rotation steps leading to the optimal energy and wavefunction. After this, perhaps, the reader will be indulgent with the structure of this paper. First we will present a broad survey of the ideas which are necessary to build up t h e Jacobi Rotation techniques. A second part will study various energy forms, mono- and multiconfigurational, and some results will be given to illustrate t h e way optimal orthonormality constrained energies are obtained. A final part describes some a l g o r i t h m s used, which a r e i m p o r t a n t f o r e f f i c i e n t
378
Ramon CarM. Joan Miro, Lloreng Dorningo, and Juan J. Novoa
implementation of the rotations as well as some relationships between energy optimization and localization procedures.
2. ELEMENTARY JACOB1 ROTATIONS
1) An Elementary Jacobi Rotation (EJR) over n-dimensional space can be constructed, in order to fulfill the needs of the present work, as an orthogonal matrix Jij(a)= (Jij,pq(a)) with the following prescription: a) Chosen an active m i r of indices ( i j ) and a rotation angle, then: b) Jij,ii(a)= Jijjj(a)= cos
It is easy to find that
iii) Jij ( a )=Jij+ (a)
2) The action of Jij over a n-dimensional column vector produces a new vector such as
c=( cp)
379
Jacobi Rotations
(2.1)
whose components are: ri = Ci cos a- C j sin a rj = Ci sin a+ cj cos rp = cp V p + i j This is sufficient to ensure norm vectors :
invariance
3) From a set of orthormalized column vectors that is
c + c= I
for
the
involved
C = ( c i ,c2 ,...C m ) ,
(2.4)
any EJR over C leaves the induced metric over the set invariant, that is, if:
R
= Jijc
(2.5)
then
The same property is obtained if R = C Jij. 4) I t is well known that the product of a set of unitary matrices gives as a result a unitary matrix. Therefore, if S={i,j) is some set of active index pairs, and % = ( Jij) and attached set of EJR matrices, the product
380
Ramon Carb6, Joan Miro, Lloreng Domingo. and Juan J. Novoa
(2.7)
is a unitary matrix. 5) A usual representation exponential form:
of
a
where A is hermitean that is A + = A . ei* = cos A
+ i sin
A
unitary
matrix
is
the
But one can also write (2.9)
or using the well known representation of complex matrices, it can be written:
cos A + i sin A =
[
cos A sin A
-sin A cos A]
(2.10)
Thus the EJR can be viewed as the simplest form of an exponential unitary transformation matrix, expressed in a real space.
3. VARIATION OF QUANTUM MECHANICAL OBJECTS IN THE MO FRAMEWORK VIA ELEMENTARY JACOB1 ROTATIONS
The use of EJR as a tool for Quantum Chemical purposes may be based on the simple transformation of a couple of MO
381
Jacobi Rotations
functions. Let us suppose that (11>,12>,...,In>) are a set of monoelectronic functions, then choosing an active pair of indices (i.j) there can be defined an EJR of the MO pair (li>,lj>) as: li> ------->c li> - s Ij> Ij> ------->s li> + c Ij> taking into account that c2+s* = l . The above transformation can be put as MO pair, rewriting simply:
a
variation
of
the
61i> = (c-1) li> - s Ij> 6lj> = s li> + (c-1) Ij> Practical use of this transformation or variation scheme can be greatly simplified if i n the MO framework one redefines the functions and integrals involved as quantum mechanical o b i e c t s of first-, second-, ... n-th order following that they depend on one, two, ..., n active indices prone to be transformed through an EJR. Suppose that one is facing the transformation of any object bearing the active index pair ( i j ) . Then, for example, the MO's (li>, Ij>) behave as first-order objects, the monoelectronic integrals ( h i i ,hjj ,hij ) or the projectors (li><jl) as second-order objects and so on. The interesting fact about this idea appears to be included i n the transformation mechanics under EJR of objects of any order, which can be deduced from recursive transformation of lower order ones, without any further need to derive a new expression for each MO structure. The general transformation algorithm can therefore be
382
Ramon Carbi, Joan Miro, Llorenq Domingo, and Juan J. Novoa
described in a) Given a set of first b) First
the following form: an active index pair ( i j ) and an EJR sine s, as well as order objects {[p]), then order objects transformation is defined as:
[i] ----->c [i] - s U] ----->s [i] + c
ti]
u]
(3.3)
c) Second order objects transform by means of the algorithm: [iil = [il
@
[il ----> (c [i] - s ti]) @ ( c [il - s
GI)
(3.4)
that is
(3.5) which can be written as a variation:
and taking into account the relationships between sine and cosine, the expression is further simplified:
Now, calling: A = uj] - [ii] B = [ij] + Gi]
one arrives, using the same procedure, at
(3.8)
383
Jacobi Rotations
S[ijJ = Slji] = - (s2 B
+ cs A)
(3.9)
and also: Suj] = -s2 A
+ cs B
(3.10)
d) Any other n-th-order object can be manipulated in the same way, as, for example, if an object is written as [81,82,..., e n ] , with e,=(i or j ) , then the EJR transformation can be deduced from any one of the direct products
4. SOME APPLICATION EXAMPLES
As application examples of the EJR algorithm let us first show t h e variation of functions of second order objects as Density matrices and CI energy expressions.
4.1.
EJR on Monoconfigurational Density Matrices Using the expression D= P
for the first
wP Ip>
order density
matrix
it is easy to prepare
the
384
Ramon Carbo, Joan Miro, Lloreng Domingo. and Juan J. Novoa
expression for the EJR over an active pair ( i j ) ,
D=
C w 1p><jI P
J
(4.2)
P+ij
and using the previous results, order objects to find that:
for
the
variation
of
second-
2
G D = ( C I + - ~ ~ )A( S -CSB)
(4.3)
where A and B are defined as in the formulae (3.8) substituting t h e second order objects by the appropriate projectors.
4.2. EJR on CI Energy
Another application example can be described in respect of the CI energy expression
(4.4)
where ( C p ) are variational coefficients and { H ~ Q ) the hamiltonian matrix representation over the configuration state functions. EJR here can be performed i n the active index pair {I,J) on the set (Cp). This is a case of mixed first- and second-order objects, as one can write, supposing real coefficients and symmetric matrix elements:
Jacobi Rotations
385
Then, after some manipulation 6E = (c-1) El,
+ s E, + s c El, + s2 Em
(4.6)
where
if the gradient of 6 E with respect to the rotation sine i s found there is obtained
-a- -Z ~ E- t EIo+Eol + c E l l - s t E,, + 2 s E m =
as
= E,,
+ c El, + 2 s E,-
t (El,- s El, )
where t=s/c. So, if s-->O, c-->l and thus t-->O
386
Ramon Carb6, Joan Miro, Lloreng Domingo, and Juan J. Novoa
then at the extremum E,
+ Ell = 0
(4.10)
This is the same as to obtain [C'H],C,-[
C'HIIC,=O
(4.1 1)
which can be considered as a CI form of Brillouin theorem. Furthermore, this development is sufficiently general as to be considered a way to obtain the minimal eigenvalue and eigenvector of a symmetric matrix.
4.3.
EJR on Two-electron Molecular Integrals Two-electron integrals of the form
transform under EJR as first, second, third, and fourth-order objects / l / . Formulae for these transformations were deduced and published some time ago / 1 / and will not be repeated here, but some properties may be useful to the interested reader in order to implement the present formalism.
387
Jacobi Rotations
4.3.1. Coulomb and Exchange integrals.
4.3.1.1: Considering the usual Coulomb and exchange and Kpj it is interesting to note that Jpi, Jpj, Kpi
6 Jpj = - 6 J . P
integrals
(4.13)
and
6 KA. = - 6 KP.
(4.14)
4.3.1.2: Considering the integrals J i i , Jjj, Kii and Kjj then it is easy to show, since J i i = Kii and Jjj = Kjj, that 6Jii= GKii and GJjj=GKjj.
Concerning also relationship, that is
Jij and
Kij one
can
6 J..=6K.. 'J
4.3.2.
'J
obtain
a
similar
(4.15)
Invariants and relationships
4.3.2.1: The following integral expressions are an EJR: a) 6( J i i + J . . + 2 J. . ) = 0 JJ
(4.16)
'1
b) 6 ( J i i + J . . - 2 J . . + 4 K. . ) = 0 JJ
'J
invariant
1J
(4.17)
under
388
Ramon Carbo, Joan Miro, Lloreng Domingo, and Juan J. Novoa
4.3.2.2: Calling P = J..- J.. JJ
11
Q = (i i I i j) + (ij I j j)
(4.18)
and using 4
4
y=c - s 3
3
Q=sc +s c
(4.19)
then P --->y P - Q Q Q --->y Q + Q P
(4.20)
4.3.2.3: If now P = J i i + JJ.J. - 2 J.1J. - 4 K LJ. . Q = 4 [ (i i I i j) - ( i j I j j ) ]
(4.21)
and 4
4
y=c -6s2c2+s 2
2
0=4cs(c - s )
then
(4.22)
389
Jacobi Rotations
P --->y P - CJ Q Q ---> CJ P + Y Q
(4.23)
The theoretical framework to obtain EJR variation formulae from any MO expression has been developed so far, the following sections will deal with useful computational cases.
5. MONOCONFIGURATIONAL ENERGY EXPRESSIONS
5.1.
General Variation Structure
Let us write the monoconfigurational energy in terms MO integrals as
of
the
(5.1)
where (op). (apq)and (Ppq) are parameters depending on the state under consideration, ( h p q ) the monoelectronic hamiltonian integrals, ( J p q ) the two-electron Coulomb integrals and { Kpq) the exchange ones. Upon choosing an active orbital index pair ( i j ) and a rotation sine, variation of the energy through a EJR can be obtained from the preceding section. The net result is a fourth order sine polynomial G E = s c E l + s2 E 2 + s3 c E 3 + s4 E4
390
Ramon Carbo, Joan Mir6, Lloreng Domingo, and JuanJ. Novoa
for the energy variation, with the additional complication of the presence of a cosine factor in the odd terms. The polynomial coefficients (Ek) are complicated functions of the MO integrals and can be written using a . . = a i i- 2 a . . + a . . 'J
b.. IJ
IJ
JJ
= p i i - 2 p .'J. +p.. JJ
(5.3)
as E, = 4 4 I F. - Fi I j > J
E 2 = 2 ( - <j IF.- FiI j > + 2 a . . K . . -b. . ( J . . + K . . ) ] J J I 1 11 'J 1J 1J E, = 4 (a,.- b. . ) J
11
11
E,=(Jii+J..-2J..-4K..)(a..-b..) JJ
1J
1J
IJ
1J
(5.4)
where (Fi, Fj} are the Fock operators for the orbitals ( i j } , that is, for example: 1
Fi = ? m i h i +
( a i p J p -pipKp) P
(5.5)
being ( J p ] and ( K p ) the usual Coulomb and exchange operators respectively. The interesting features of these coefficients can be summarized in the following points: a) El is no more than the Brillouin matrix element for orbitals ( i j ) . Thus, at convergence, one must have El=(). This feature will be further discussed when describing sine optimization proceduresjn section 7.1.4.
Jacobi Rotations
391
b) E4 contains an invariant integral structure as has been previously obtained. c) E3, E4 are related by a quadruple rotation as described in section 4.3.2. d) E2 is the term which drives the sign of the hessian of the energy with respect to the sine, thus governing the structure of the energy minimum. Finally, Fock operators have been used in order to obtain compact expressions for the coefficients but they are not necessary, obviously, to the implementation of the algorithm, which can be entirely written over MO integrals. Application examples to various molecular structures will in the multiconfigurational section, where be shown the monoconfigurational MO space has been used as starting point for every calculation reported.
5.2.
The Roothaan-Bagus Atomic Energy as a Particular Case
According to Roothaan and Bagus /7/, the energy of some atomic states can be written as a variational functional of two density matrices: the usual total density matrix containing all the occupied orbitals with their corresponding well known occupation numbers, and a second one, in which the occupation numbers for the doubly occupied orbitals are set to zero, namely, the open shell density matrix. In this way, two different sets of occupation numbers may be defined: O T , the classical ones, and the open shell occupation numbers: 0 0 which fulfill
392
Ramon Carbo, Joan Miro, Lloreng Domingo, and JuanJ. Novoa
The particular electronic state being computed is defined through a set of parameters which are included in the construction of the two-electron contribution hypermatrices called P and Q. These hypermatrices correspond to the total and open shell density matrices, respectively, and are built up in such a way that they behave as Coulomb terms in front of MO variation. In this way, the energy expression is written as
where DT and Dg are the density matrices above mentioned, and H, P and Q are the one-electron hamiltonian and twoelectron integral contribution h ypermatrices. Any Jacobi rotation between two MO's will simply modify both density matrices in the way described in section 4.1. Once these density variation expressions, see formula (4.3), are applied on the Roothaan-Bagus energy and after collecting terms of similar order the usual four-parameter expression for the energy increment (5.2) is obtained. The parameters are now defined by the formulae: EI=F:B E,= F : A
+ -1 Y 2
B
E3=X:B 1 E =-(X:A-Y:B) 4
2
where F is the Fock matrix:
Jacobi Rotations
F = A-
H + A-
D,
393
P - Amo Do: Q
(5.9)
and X , Y two auxiliary matrices defined by: 2
X = ( Ao,P
2
- Ao,Q): A (5.10)
being A, B, and A O T , A o o defined as in formulae (3.8) and (4.3). When symmetry relationships are introduced in the calculation, it is important to realize that only t h e submatrices corresponding to the rotating orbital symmetry have to be computed for both sets of matrices A, B and X, Y. If optimization is carried out in the MO basis, the variational terms are t h e integrals instead of the density matrices, which become diagonal and their values equal to the occupation The Roothaan-Bagus energy expression can therefore numbers. be written as:
(5.1 1)
with the same definitions as above for the a, h, P and Q, but now, these last three must be considered calculated on the MO basis.
394
Ramon Carb6, Joan Mir6, Lloreng Domingo, and Juan J. Novoa
The variation of the integrals under a single Jacobi rotation over an active pair of orbitals ( i j ) has already been defined i n section 4.3, and noting that both P and Q hypermatrices transform in the same way, these expressions alone will suffice, through substitution in the energy expression, to obtain the formulae for the definition of t h e energy variation parameters :
okP(ij I kk)] - 2Awe
E,=2 A T [ hij+
E2=2A% [ ( h i i - hjj) +
c
o p Q ( i j I pp)
FO
k
C ok(P(i i I
kk) - P(i j I kk))
+
k
+2
A 4 P( i j I i j) - 2Aoe[
c
op(Q(i i I pp) - Q(i j Ipp) )I -
PEO
2
- 2 Ao,Q( i j i j) 2
E, = 2A% [ P(i I i j) - P(i j I j j)] - 2Aoi[Q(i i I i j ) - Q(ij I j j)] .
E,= ‘A< 2 1
--Am:[ 2
n
[ P(i i I i i) + P( j j I j j) - 2P( i i I j j) - 4 P(i j I i j) ] -
Q(i i I i i ) +Q(i j I j j) 2Q(i i I j j) - 4 Q ( i j I ij)] ~
(5.12)
This expression toghether with the ones that perform the rotation of the h, P and Q integral sets , provide the algorithm to implement the Jacobi rotation method on the MO basis using the Roothaan-Bagus energy expression. Both algorithms were implemented on a highly modified version of Roos, Salez, Veillard and Clementi atomic program /8/ and tested for a very large number of the calculations reported by Clementi and Roetti /9/.
Jacobi Rotations
395
6. MULTICONFIGURATIONAL ENERGY EXPRESSION
In the initial work 111 a general review on SCF methodology was presented, but as time has passed by we think meccesary an updating of the state of the art in this field. For this purpose i t will be presented a broad analysis of the development of multiconfigurational techniques covering the 1980-1987 period. During this years some reviews have appeared. The first one is due to Hinze 1101. Inside a report edited by Dupuis 1111 one can find some other revision articles, as the one due to Dietrich and Wahl. Another excellent review is due to Olsen et al. 1121. Finally, Almlof and Taylor reviwed t h e implementation on vector-oriented hardware, while some of the chapters in the two volume series were dedicated to the MCSCF methodology. Other contributions in this field deal on some especific subjects as follows: Unitary transformations had been studied by Polezzo and Fantucci 115, 16, 171, Matsen and Nelling 1181, and Fantucci et al. 1191. Stability and initial vectors had been discussed by Levy 1201, general theoretical and optimization problems had been studied by Mukherjee 1211, McCullough 1221, Igawa et al. 1231, Lengsfield 1241, Polezzo and Fantucci 1251, Lengsfield and Liu 1261, Polezzo 1271, Werner and Meyer 1281, Ishikawa 1291, Siegbahn et al. 1301, Brooks et al. 1311. General convergence problems and other aspects had been treated by Yurtsever 1321, Das 1331, Camp and King 1341,
396
Ramon Carbo, Joan Mir6, Lloreng Domingo, and Juan J. Novoa
Baushlicher et al. 1351, and Werner and Knowles 1361. During this period, Joergensen's group has made a large contribution to this subject, as the interested reader can find in references 1371 to 1471. After facing the cumbersome but simple problem of integral energy transformation in EJR techniques, and having solved and used i t in atomic and molecular monoconfigurational frameworks, we managed to construct a multiconfigurational code, ARIADNE, which has been evolved through the various hardware facilities available to us. The following theoretical discussion was tested and results were obtained by means of a VAX-11 version of the ARIADNE program.
6.1. Variational Structure The use of unitary transformations in MCSCF procedures is a common practice in Quantum Chemistry today. The most extended type of unitary transformations are the exponential ones, introduced by Levy /48/, defined using the following equation:
The implementation of this type of transformation in the multiconfigurational problem has been very successful when associated with energy direct minimization. NewtonRaphson or related techniques have been used for this task 1441. The problem in these implementations is the truncation normally done on the exponential transformation during the computational process, either by the use of the Taylor expansion of the exponential
397
Jacobi Rotations
U = exp(X) = 1 + X + (1/2) X*X + ...
(6.2)
or by the use of the Haussdorf transformation H'= exp(-X) H exp(X) = H+[H,X]+ 0.5* [[H,Xl,Xl+ ...
(6.3)
In both cases, the energy increment in each transformation is given by the approximate formula truncated at the second order El- E
G
g*X
+ 0.5 * X* G
*X
(6.4)
where g is the gradient and G the hessian matrix of the energy with respect to the variational parameters X. The direct consequence of such a truncation is that these methods are n o longer exact and, therefore, one needs to use a NewtonRaphson iterative procedure in order to find the solution. Very efficient techniques have been described in the literature in order to optimize the performances of the process. An excellent review of some of these techniques can be found in the work of Olsen, Yeager, and Joergensen /14/. The Elementary Jacobi Rotations can be taken as an alternative form for the unitary matrix transformation of the multiconfiguration energy. The exponential and the EJR forms of the unitary matrix can be related through the expressions,already discussed in section 2.5, because the EJR are the minimal parts in which the exponential transformations can be divided . Based on the EJR technique it is possible to develop a new procedure for MC energy optimization with no truncation in the energy increment expressions. Therefore, this formalism may be more powerful than those based on the exponential
398
Ramon Carbo, Joan Miro, Lloreng Domingo, and JuanJ. Novoa
transformations. In the following sections we will study the general properties of the MC method based on EJR, comparing some of our results with those obtained from t h e exponential tran s f or ma t i on s .
6.2. Structure of MC Elementary Jacobi Rotations Using the same conventions previously introduced one can distinguish between two types of rotations: The MO rotations and the CI coefficients rotations of the MC vector, which appear in a CI wavefunction
Y=CGIK> K
where IK> are the chosen polyelectronic basis functions. Each MO EJR transformation is defined as in the monoconfigurational case, see formulae (3.1). whilst the CI rotations are similarly defined on the coefficients of the MC vector as in formulae (2.2). Application of such transformations to the electronic energy expression:
-1
oij4 I h I j > + C y i j k l < ji I r21 k 1 >
=
ijkl
i j
generates the
energy
increment
for
each
rotation.
In
order
Jacobi Rotations
399
to simplify the problem one can suppose that the CI and MO rotations are mutually independent. Thus, a two-step iteration MC formalism is obtained, which can be resumed in the equations:
where
6, y
are the CI rotation sine and cosine respectively.
AE = El, (c-1)+ (Eel + Ell C)S+ (Ea +E,, C)s
2
+ (Ea + El,
3
C)S+
+ E W s4 where s,c are the MO rotation sine and cosine respectively. Both polynomials define completely the structure of the EJR MC procedure by means of the values of the coefficients t a b and Eab. The equations used to compute the MO coefficients can be found in Appendix 1 . In order to find the optimal rotation one only needs to compute the absolute minimum of each polynomial with respect to the MO and CI rotational sines. This can be done using a Newton-Raphson or any related technique, i n a very simple way, as will be shown in the section 7. Once the computed optimal sines are known, one can obtain the value of the electronic energy increment for each rotation and compute a new set of values for MO and CI coefficients.
400
Ramon CarM, Joan Miro, Llorenq Domingo, and Juan J. Novoa
6.3. The CI-MO Computation Algorithms Jsing the equations described in section 6.2, it is possit le to compute the optimal rotation for each pair of MO or CI coefficients until convergence. As in the monoconfigurational case, the process can be simplified for the MO rotations if one takes into account the fact that rotations between orbitals of the same shell or of different symmetry species keep invariant the electronic energy. Similarly one can apply the same symmetry A scheme of the whole considerations to the CI rotations. MC computational process is described in Appendix 2. The MC procedure, as given in Appendix 2, will converge i n one iteration if no coupling between the MO's and CI coefficients is present, and the MO rotations are fully independent one from the other. However, this is an ideal situation, because i n practice the rotation of a pair of MO's changes the remaining orbital structure. This is not surprising if one considers that each MO rotation is a step forward in the diagonalization of the generalized Brillouin matrix, whose elements are defined as: 41 Fij-Fji Ij>; where F a b are generalized Fock operators, and i t is well known that i n Jacobi matrix diagonalization each off-diagonal element anihilation produces some changes i n the remaining matrix elements. The considerations made above are a direct consequence of the structure of the coefficients of the MO energy increment polynomial, as it is shown i n Appendix 1, where i t can be seen that the molecular integrals are present everywhere. Due to the MO coupling, the procedure given i n Appendix 2 has an iterative form in which the rotations in the MO space are repeated until convergence.
Jacobi Rotations
40 I
Because of the couplings within the CI coefficients rotation scheme, a similar iterative procedure is necessary for this part. Finally, the coupling between MO and CI coefficients, although small, as can be seen in Tables I11 and IV, forces repetition of the two MO and CI rotation schemes. Depending on the case and precission, each partial process converges in few iterations. A n alternative procedure, hereafter identified as algorithm MC-B in order to distinguish it from the previous one which will be called algorithm MC-A, can be constructed from the one described i n Appendix 2. It will be sufficient for this purpose to skip the MO and CI internal iterative steps, keeping the external loop, until convergence is reached.
6.4.
6.4.1.
Results and Discussion
Global convergence tests
In order to study the behaviour of the EJR algorithms the computations of Wood and Veillard on the H2 molecule /49/ have been repeated, owing to the fact this is one of the tests normally used in the exponential MC methods. The results are included in Table I . As can be seen in Table I, algorithm MC-A does not converge i n one iteration but almost quadratically in the same number of iterations as the f u l l Newton-Raphson exponential technique. This is due, as it is shown in Table 11, to the existence of a small coupling between the MO and CI rotations after the first iteration which is significative at the degree of precission used ( 1 . 0 ~ 1 0 - 8a.u.). It must be noticed that for this
402
Ramon Carbo, Joan Miro, LlorenG Domingo, and Juan J. Novoa
TABLE I. Total energy (a.u.) computed by different MC methods for the H2 molecule in its ground state. The basis set and geometry are those of Wood and Veillard /49/.
METHOD
Exact Algorithm MC-A Algorithm MC-B (this work) ITERATION exDonentiala (this work) -1.13268788d 0 -1.13411C - 1.13268666d - 1.15121878 1 - 1.15089 - 1.15129176 -1.15130498 2 - 1.15130 - 1.15 130474 - 1.15130503 3 -1.15131 - 1.15130503 4 5 6 7 8
First order MC methodh -1.1341OC - 1.13956 -1.14883 -1.15087 - 1.15 122 - 1.15129 - 1. I5130 - 1.15130 -1.15131
(a)
Results taken from reference 61. reference 49. A good starting point or level shifting used. Otherwise the process converges to the SCF value. (c) After a CI computation (see ref. 49). (d) After the SCF result. The result obtained i n a complete CI with this basis set is -1.16838 a,u. , the exact energy being -1.1745 a.u. /49/. (b) Results from
molecule algorithms MC-A and MC-B give the same final result in the same number of global iterations. Furthermore, the results of both algorithms are invariant upon the choice of starting CI and MO coefficients. In order to complete the study we have also tested the convergence of both algorithms for the Li2 system ground state,taking as reference the MC results of Lengsfield /50/. The
Jacobi Rotations
403
TABLE 11. Structure of the electronic energy increment (a.u.) using algorithm MC-A on the H 2 molecule in its ground state and at the equilibrium geometry (r(H-H)=l.4 a.u.). The basis set used is the uncontracted 5s of Huzinaga (ref. 59) plus a set of p orbitals with exponents 0.9 .
ITERATION 0 1 2 3 4
ELECTRONIC ENERGY -1.84697237 -1.86557746 -1.86559045 - 1.86559072 - 1.86559073
ENERGY INCREMENT . CI MO . -0.01297389 -0.00563118 -0.00001132 -0.00000166 -0.00000024 -0.00000003 -0.00000001 -0.00000000
TOTAL ENERGY =
- 1.15130503
TABLE 111. Variation of the electronic energy (a.u.) during MC computation using the algorithm MC-A on the ground the state of the Li2 molecule. The internuclear distance is set to 2.5 a.u. and the basis set is given i n reference 50. ELECTRONIC ITERATION ENERGY 0 -18.31803889 1 -1 8.32337641 2 -18.33034282 3 -18.33041655 4 -18.33041688 TOTAL EJR ENERGY = TOTAL ENERGY a =
ENERGY INCREMENT. CI MO . -0.00281133 -0.00252618 -0.00643254 -0.00053385 -0.00006846 -0.00000526 -0.00000021 -0.00000012
- 14.73041687 -14.73041770
(a) Result obtained by Lengsfield (ref. 50) using a Full MC exponential method after 4 iterations.
404
Ramon Carb6, Joan Miro, Lloreng Domingo, and Juan J. Novoa
TABLE IV. Variation of the electronic energy (a.u.) during the MC computation using algorithm MC-B on the ground state of the Li2 molecule. The internuclear distance is set to 2.5 a.u. and the basis set is given in reference 50.
ITERATIONS 0 1 2 3 4 5 6 7 8 9 10 11
ELECTRONIC ENERGY -1.831818370 -1 332317806 -1.833038739 - 1.833041 347 -1.833041529 -1.833041615 - 1 333041656 - 1 A33041676 - 1.833041685 - 1.833041689 - 1 33304 1692 -1.833041693
ENERGY INCREMENT . CI MO -0.00275609 -0.00223826 -0.00666303 -0.00054629 -0.oooO2246 -0.00000361 -0.00000000 -0.OOOOO 182 -0.00000000 -0.00000086 -0.00000000 -0.00000041 -0.00000000 -0.00ooOO19 -0.00000000 -0.00000009 -0.00000000 -0.00000004 -0.OOO00000 -0.00000002 -0.00000000 -0.00000001 -0.000oooo0 -0.00000000
TOTAL ENERGY =
-14.73041693
results for this molecule, which are s h o w n ~in Table 111 and Table IV for algorithms MC-A and MC-B respectively, are similar to those reported for the H2 molecule; namely, the convergence when using algorithm MC-A is found in 4 iterations and not in only one, due to t h e existence of a coupling between MO's and CI coefficients. Algorithm MC-B needs more iterations in order to reach the same final result, and has after the first three iterations a linear convergence behaviour instead of the quadratic shown by the algorithm MC-A.
Jacobi Rotations
6.4.2.
405
Structure of each EJR rotation. As was stated before, the structure of each EJR rotation is
fully defined by the values of the characteristic parameters c a b and Eat, for t h e CI and MO rotations, respectively. The energy parameters seem arbitrary but, in practice, their values present some relationships wich simplify notably the form of the polynomial. The values of those parameters for the first iteration, included in the Table V for the Li2 molecule computation, are a good example of this fact. The values found show that, for the MO rotations, the Eo3, E12, E o l , and El 0 coefficients are null. Therefore, the energy variation polynomial shape is defined by t h e remaining four parameters, and, in particular, by the dominant one, the E02 parameter. Only when this parameter is small enough the optimal sine has a value which is significatively different from zero. Some systematic study has shown that the polynomial shape is only affected by the relative weight of the E04 term against the E02 one. I n the polynomial structure, one can appreciate three different parts: when E02 > 0 where only one minimum at s=O is present, next when E02 < 0 then two minima at values of the sine +1 and -1 appear, and finally when E02 = 0 in which case the situation depends on the particular problem and there can be more than one minima. Therefore, one needs good optimization algorithms in order to locate the absolute minimal energy sine value. Finally, again for Li2, Table V also illustrates that at the end of the iterative process, at sweep 7, the sum of El1 and E o l must be n u l l , a consequence of the Generalized Brillouin theorem. In the case of the CI rotations, as seen in Table VI the
406
Ramon Carbi, Joan Miro, Lloreng Domingo, and Juan J. Novoa
TABLE V. Values (in a.u.) of the non-null Eij coefficients of the MO energy increment for the first MC iteration of the Li2 molecule given in Table IV. The optimum sine (s) and the computed energy increment with this sine (in a.u.) are also given for the first two and the last EJR sweep whose number is also indicated. Only those rotations giving energy increments greater than 1 . 0 ~ 1 0 a.u. - ~ ~ are included in the table. ROTATING SWEEP MO' 1 3-1' 0.%5%&4-2 0.3162 0.4902 5-1 -0.1258 -0.4356 5-3 -0.0209 -0.1655 6-2 0.6857 -0.2948 6-4 0.0004 0.0007 7- 1 0.0896 0.3382 0.0041 0.0783 7-3 8-2 -0.1898 -0.3898 8 - 4 0.0008-0.00 19 2 3-1 0.0054 0.0036 4-2 0.6998 0.2366 5-1 -0.1226 -0.4529 5-3 -0.0127 -0.1588 6-2 0.3289 -0.4692 6-4 -0.0004 -0.oOoo 7- 1 0.0935 0.348 1 7-3 0.01 14 0.0752 8-2 -0.2337 -0.3889 8-4 0.0013 -0.0018 7
3-1 4-2 5-3 6-4 7- 3 8-4
A2E
0.043 1 0.0526 6.7793 2.1898 5.439 1 0.008 1 7.9691 3.31 17 9.2903 0.0528 0.0562 4.6245 6.7888 2.1915 6.0079 0.0201 7.9747 3.3067 9.3618 0.0548
l1
-0.OT20 -0.0036 0.0012 0.0141 -0.0000 -0.0 172 0.0003 0.0131 -0.0009 -0.0128 0.0037 0.0019 0.0001 -0.0046 -0.0022 0.0046 0.0006 -0.0023 0.0003 -0.0046
ENERGY INCREMENT 0.1348 -0.00082209 0.0003 -0.00000064 -0.0001 -0.00000006 -0.oooO2294 -0.0032 -0.oooO -0.00000000 -0.00533516 0.5285 -0.oooO -0.00000000 -0.oooO1296 -0.0020 -0.00000002 0.0001 0.1195 -0.0333 -0.00000019 -0.0002 -0.oooO -0.00000000 -0.00000244 0.00 10 -0.00000020 0.0002 -0.00026236 -0.1 127 -0.00000001 -0.000 1 -0.00000040 0.0003 -0.00000000 -0.oooO 0.042 1 O.oooO9747
0.0058 0.0029 0.0557 -0.0000 0.oooO -0.0125 -0.1588 4.6468 -0.0000 0.oooO -0.0125 -0.1588 2.1916 0.0007 -0.0002 -0.OOO1 0.0002 0.0205 0.0000 -0.0007 0.0112 0.0754 3.3071 0.0007 -0.0001 0.0015 -0.0017 0.0549 -0.0000 0.0003
s
-0.OOOoooOO -0.00000000 -0.00000005 -0.00000001 -0.000oooO3 -0.00000000
407
Jacobi Rotations
TABLE VI. Values (in a.u.) of the non-null €,ij coefficients of the CI energy increment for the first MC iteration of the Li2 molecule computed with the same basis set as in Table IV with a CI space of 20 configurations. The optimum computed energy increment (in a.u.) is also given. For the first two sweeps all t h e rotations giving energy increments greater than 1 .Ox1 0 - 6 a.u. are included. Only the 5-1 rotation is shown in the last sweeps. ROTATING CI SWEEP COEFFICIENTS 510 - - _ _ _ _ _ _ _..................... -------1 4- 1 0. 5-1 0. 6- 1 0. 6-4 0.006 6-5 0.006 13-1 0.006 15-1 0.006 16-1 0.008 16-5 0.006 17-1 0.010 20- 1 0.010 2 5- 1 0.013 5-4 0.003 6- 1 0.008 6-5 0.009 15-4 0.002 15-5 0.009 20- 1 0.014 3 5- 1 0.020 0.025 4 5- 1 0.028 5 5- 1 6 5- 1 0.029 .... 11 5- 1 0.032
501
511
502
-------- -__----- -------0. -0.015 4.970 -0.004 -0.007 -0.001 -0.001 0.006 0. -0.016 -0.002 0.017 0.017 -0.137 0.001 -0.057 0. 0. 0.004 0.025 -0.206 -0.244 -0.266 -0.278 -0.293
-0.017 -0.100 0.002 -0.007 0. 0.204 0.185 0.002 -0.089 0. 0.110 0. -0.001 -0.006 -0.001 -0.003 -0.017 0.188 0.234 0.260 0.275
2.967 0.910 0.014 0.008 7.142 9.069 7.365 0.001 5.601 9.735 2.956 0.002 0.915 0.024 O.OO0 0.007 9.690 2.940 2.928 2.921 2.916
0.293 2.910
ENERGY INCREMENT -0.oooo12 -0.oooO35 -0.003141 -0.000002 -0.00 1274 -0.000001 -0.001149 -0.000964 -0.OoooO3 -0.000234 -0.OoooO7 -0.oooO62 -0.OoooO5 -0.000893 -0.00034 1 -0.000013 -0.000003 -0.OooOo2 -0.000026 -0.OooOo9 -0.OoooO3 -0.000001
-0.~6000
408
Ramon Carb6, Joan Miro, Llorenq Domingo, and Juan J. Novoa
structure of the EJR rotations is very simple. At convergence, Generalized Brillouin Theorem holds and therefore imposes that the sum 5 0 1 + 5 1 1 becomes cero. As a consequ,ence, the magnitude of the rotation sine will be strongly related with the evolving values of 501 and 51 1 . The convergence is normally achieved in few iterations, when the number of configurations is small.
6.4.3.
Extension to other molecular systems
I n order to test the general validity of the previous results, first we have applied algorithm MC-A to a set of diatomic systems, shown i n Table VII. In all the cases, we have used SCF MO's as starting point. The "split valence" basis set of Dunning and Hay /51/ was used for each atom and the interatomic distances were taken from the work of Snyder and Basch /52/ or, if not quoted there, as the experimental reported value. Almost quadratic convergence is obtained in all cases. The number of MO iterations needed i n each global iteration varies w i t h the number of orbitals present, as expected. Furthermore, we have also verified the invariance of the final result from the starting point. Results for some polyatomic molecules, given in Table VIII using Algorithm MC-B, show a similar behaviour.
6.4.4. On choosing the CI space. In order to obtain information on the best CI space we have carried out a systematic search with a different number of configurations on the helium atom using a STO triple zeta quality basis set. The results are given in Table IX and show that the best convergence with optimum correlation
Jacobi Rotations
409
TABLE V I I . Total energy for the ground state of various diatomic molecular structures studied using the Algorithm MC-A num.num. E(SCF) num. E(MC) num. MOLECULE orb. el. (a.u.) conf. (a.u.) iter. C(SCF) ---------------- ----- ___--- ____-----______--__ __----___------__----------- -__-----H2 16 2 -1.13268787 2 -1.15130503 3 0.994 LiH 11 4 -7.98243979 2 -7.98349017 6 0.999 Liz 8 6 -14.71814758 2 -14.73041693 3 0.968 BeH 11 5 -15.14360491 2 -15.15449758 3 0.996 CH+ 11 6 -37.88516611 2 -37.90448079 4 0.994 CH 11 7 -38.25735724 2 -38.27569839 5 0.994 NH 11 8 -54.94899181 2 -54.97034529 4 0.993 OH+ 1 8 -74.96433590 2 -74.98674456 0.994 OH 1 0.993 9 -75.38088047 2 -75.40549505 OH0.992 1 10 -75.34782807 2 -75.37201 130 8 12 -75.35644086 3 -75.41475156 0.964 c2 Be0 0.995 8 12 -89.39400793 2 -89.40884632 CN 8 13 -92.14880317 3 -92.2 1242918 0.965 FLi 8 12 -106.96395611 2 - 106.97651066 0.997 0.999 8 13 -108.29770309 3 -108.30286184 N2+ N2 18 14 -108.87813657 3 -108.94795882 4 0.966 co 18 14 -112.68483960 3 -112.73475610 5 0.979 BF 18 14 -124.08140340 3 -124.10727997 4 0.997 NO 18 15 -129.19383614 2 -129.23964110 4 0.975 18 16 -149.57112871 3 -149.60159868 4 0.992 18 18 -198.70749705 2 -198.78989444 5 0.960 F2 18 19 -198.81385465 2 -198.81689740 4 0.999 F2AH 20 14 -242.42947043 2 -242.44235399 4 0.999 SiH 20 15 -289.23178590 2 -289.38999376 4 0.999 PH 20 16 -341.24176813 2 -341.25451937 4 0.999 SH 20 17 -398.03741786 2 -398.05039714 4 0.999 HC1 20 18 -460.03030032 2 -460.04303296 4 0.999
oz
410
Ramon Carbo, Joan Miro, Lloreng Domingo, and JuanJ. Novoa
TABLE V I I I . Total energy (in a.u.) with the selected configurations for the ground state of some polyatomic molecules. Snyder and Basch geometry (ref. 52), the basis set given by Dunning and Hay (ref. 51), and Algorithm MC-B are used. Number of Molecule Svmmetrv &-& mte functions Total Energy Iterations CH2 N2H2 H2CO CH3OH
C2v C2h C2" Clh
'A1 'Ag 'A' 'A'
1 95
-38.848867 -38.901893
8
1 8
- 109.697 17 1 -109.713451
8
1 12
- 1 13.827063 - 1 13.877026
1
- 1 15.004535
12
- 1 15.033926
4 6
energy is obtained when the CI space is formed by the reference configuration plus all the double excitations from that one, or, alternatively, using a complete active space for the selected AO's. The conclusion that for a two electron system the best MCSCF function consists of the reference function and all the double excited ones, is a reformulation of the fact that, using natural orbitals, the f u l l CI function may be written i n terms of configurations consisting of a doubly occupied orbital only /53, 54/. In the cases where the configurational space was highly insuficiently described the process failed to converge in a reasonable number of iterations using either Algorithms MC-A or MC-B.
41 I
Jacobi Rotations
TABLE IX. Total energy (in xu.) computed by different configurational spaces for the Helium atom using a STO of triple zeta quality basis set. The configurations included in each case are indicated by the values of the coefficients when the process converges, otherwise are not shown. The criterion of convergence is set toI.Ox10-6 a.u. Configurations selected
__--____________________________________----------Total
Energy
( I S , ] ~ ) (ls,2s) (2s,2s) (ls,3s) (2s,3s) (3s,3s)
----- -_------_ __--_ ___-_ __--_ _____ _____ -________
--__-_-_ __-___-_
0.9708 -0.2367 -0.0358 -0.0066 -0.0067 -0.0048 0.9705 -0.2384 -0.0360 -0.0059 -0.0002 0.9706 -0.2377 -0.0362 -0.0069 -0.0043 -0.0072 -0.0048 0.9706 -0.2381 -0.0356 0.9396 -0.3413 -0.0176 -0.0186 -0.0088 0.9979 -0.0623 -0.0134 -0.0115 -0.0054 0.3573 -0.9330 -0.0080 0.0415 0,0035 0.9705 -0.2382 -0.0361 -0.0072 0.9703 -0.2393 -0.0358 -0.0017 0.9349 -0.3547 0.0035 -0.0021 0.9978 -0.0634 -0.0166 -0.0oO8 0.9703 -0.239 1 -0.0359 -0.0043 0.9979 -0.0634 -0.0138 -0.0043 0.3575 -0.9338 -0.0030 0.0044 0.9380 -0.3460 -0.0184 -0.0072 0.9979 -0.0622 -0.012 1 -0.0056 0.3576 -0.9338 0.0066 0.0044 0.9980 -0.0084 -0.0011 -0.0633 0.9977 -0.0012 -0.0232 -0.0632 0.9702 -0.2396 -0.0357 0.9978 -0.0634 -0.0145 0.3573 -0.9340 -0.0006 0.9351 -0.3544 -0.0017 0.9980 -0.0635 -0.0012 0.3576 -0.9339 0.0070 0.9702 -0.1712 0.1713 0.9351 -0.0019 -0.3545 0.9980 -0.0634 -0.0043 0.3578 -0.9338 0.0043
-2.8784 17 -2.877850 -2.878417 -2.878417 -2.878417 -2.878417 -2.878417 -2.877846 -2.877846 -2.877846 -2.877882 -2.878417 -2.878417 -2.878417 -2.878417 -2.878417 -2.878417 -2.877846 -2.877846 -2.877846 -2.877885 -2.877846 -2.877846 -2.877846 -2.877846 -1.349630 -2.877846 -2.878417 -2.878417
Global Iterations
_--_-_--___ 1
27 2 2 22 2 24 2 16 23 20 3 3 26 23 2 24 27 20 3 3 26 24 17 24 49 53 4 25
412
Ramon Cwb6, Joan Mir6, Lloreng Domingo, and Juan J. Novoa
TABLE IX (Continued) Configurations selected ................................................... 0.9979
-0.0633 -0.0026 -0.0635 -0.0012 -0.0633 0.9998 -0.0177 -0.0025 0.9978 -0.0155 -0.0633 0.3168 -0.1664 -0.9338 -0.0634 0.3578 -0.9337 0.9994 -0.0333 0.9999 -0.0097 -0.0097 0.9998 -0.0182 0.9352 -0.3539 0.9232 -0.3843 -0.0634 0.9999 -0.0025 0.9979 -0.0634 0.3577 -0.9338 0.3576 -0.9338 0.9994
0.9980
0.9980
0.9999
0.9980
-0.0128 -0.0333 0.9980 -0.0012
1.m 1.m
6.4.5.
Total -2.877846 - 1.301915 -2.877846 -2.877846 - 1.301915 -2.877846 -2.877846 -2.877846 -2.877846 -1.301722 -2.863280 -2.863280 - 1.301722 -2.877846 - 1.301722 -2.877846 -1.301908 -2.877846 -2.877846 -2.877846 -2.861671 - 1.301585
Global 5
3 21 27 3 5 22 3 25 2 3 2 2 86 2 5 3 5 25 25 1 1
On choosing the active orbital space.
In light of the previous results on helium, i n the construction of state functions we used a limited number of active MO, and on them we performed a complete set of excitations. The procedure is related to a similarity measure between density functions attached to the occupied MO. The
Jacobi Rotations
413
algorithm can be described with the following scheme: a) A set of occupied MO's is chosen (lil>,li2>, ... lin>). b) A set of virtual MO's is chosen ( I v ~ > , I v ~... >hn>) , in such a way as the pairs (li1>,lvl>), (li2>,lv2>), ... (lin>,lVn>) are most similar according to the measure
the where (iilvv) are Coulomb integrals involving occupied-virtual MO pair (li>,lv>). The similarity measure r(i,v) is nothing more than a correlation coefficient between the density functions of the MO pair. The meaning of r(i,v) is such as li> is more similar to Iv> then r(i,v) --->1. Therefore, given any occupied MO li>, the virtual pair Iv> is chosen as to fulfill max[v'](r(i,v')). c) A complete excitation set of state functions is built from (lil>,li2>,...lin>) and (Ivl>,lv2>,...Ivn>]. The underlying idea of this kind of virtual MO selection consists of maintaining as constant as possible the density of t h e basic configuration when substituting the MO li> by the most similar virtual Iv>, in an attempt to provide in this manner t h e highest interaction between t h e original determinant and the new ones. performed Following the previous procedure, we have some computations on the ground state of the water molecule, which can be found in Table X , in order to have some information on how many and which symmetry type the occupied MO, selected as active, must have. Optimal results were obtained when every symmetry species was represented i n the occupied active space.
Ramon CarM, Joan Miro, Lloreng Domingo, and Juan J. Novoa
414
TABLE X. Dependence of the total energy (in a.u.) with the selected configurations for the ground state of water. Snyder and Basch geometry (ref. 52),the basis set given by Dunning and Hay (ref. 51), and algorithm MC-B are used. The selected configurations are a complete active space defined by the active orbitals whose ordinal numbers are shown. The SCF reference energy, is -76.000945 a.u. Active orbitals
........................ la1 2al lbl 3a1 2b2
---- ----
---- ---- ----
1
2 3 4
5 1 1
2 3 4
1
5
1
2 2 2
3 4 3 3
1 1
4 4
2 2 2
5
4
3 3 4 3 4
5 5 5 5 5 5
Total Number of state functions energy Iterations ----------______-------------- ----------3 -76.003430 17 3 -76.014813 6 3 -76.011140 2 3 -76.014813 19 3 -76.016043 3 12 -76.017983 7 8 -76.0 13830 2 1 12 -76.017982 15 12 -76.054142 7 8 12 -76.040079 20 -76.029336 4 12 -76.054142 6 12 -76.040784 20 12 -76.040649 3 12 -76.054142 11 100 -76.032476 4 55 -76.088017 8 55 -76.088017 7 95 -76.078172 5 55 -76.088017 14
Jacobi Rotations
415
TABLE XI. Total energy (in a.u.) and optimum geometry (bohrs and degrees) computed using mono- and multiconfigurational EJR methods for various molecules in the given state. The number of state functions used as basis set is also given. Geometry _______________----Total State r(A-H) @(HAH) functions Energy -----------------_______----------_________2.27 128 1 -25.739455 128 23 -25.753884 2.28 132 1 -38.810618 2.07 133 59 -38.830790 2.08 130 1 -38.909755 2.05 128 28 -38.929805 2.07 180 1 -38.891992 2.03 180 10 -38.906385 2.04 2.11 106 1 -38.859876 103 33 -38.901 139 2.13 105 37 -38.905188 2.14 1 -38.552996 2.08 139 138 23 -38.57 1390 2.10 2.07 180 1 -38.545720 180 14 -38.563882 2.09 1.98 103 1 -55.479463 101 56 -55.531094 2.05 1.89 143 1 -55.498149 1.91 143 59 -55.522335 1.95 108 1 -55.539397 49 -55.592828 2.01 106 1 -55.492833 1.88 180 1.90 180 75 -55.510087
416
Ramon Carbo, Joan Miro, Llorens Domingo, and Juan J. Novoa
TABLE XI (Continued) Geometry _________--------___ State Total r(A-H) O W ) Molecule State functions Energy ----------------____ __----------_____ __------------ _______--1.96 180 NH2+ 3Cg1 -55.191655
"41 *
I420
'A1
H20+
2B1 2n
2A I
FH2+
6.4.6.
] A1
3 1 37 1 55 56 1 49 1 27 1 59 1 56
-55.209092 -55.111118 -55.161953 -76.002172 -76.09063 1 -76.061319 -75.600820 -75.659075 -75.579178 -75.604880 -75.578866 -75.605722 -99.829222 -99.888494
1.97 1.99 2.02 1.82 1.88 1.88 1.90 1.96 1.89 1.92 1.89 1.91 1.85 1.90
180 115 114 111 107 108 119 115 180 180 167 180 125 119
Applications on the geometry optimization of AH2 systems
With the above described techniques, a general monoand multiconfigurational geometry optimization procedure has been implemeted. The Optimization algorithm is the numerical Newton-Raphson procedure described by Payne / 5 5 / . Results on triatomic AH2 systems are presented in Table XI for various states of t h e neutral molecules and ions. As expected, the inclusion of correlation gives rise to some modifications in the geometry computed at the
Jacobi Rotations
417
monoconfigurational level. In all the cases reported in Table XI, no convergence problems were found.
6.5.- Comparison with the exponential MCSCF methodology. A comparison of the present algorithm with every one of the various MCSCF procedures used at present in literature is a task beyond the scope of this paper. But some kind of test must be done at least with one of the best procedures described so far. According to Werner /14/ the Augmented Hessian (AH) procedure of Lengsfield /62/ constitutes one of the best algorithms available today. Therefore, we have compared the present EJR procedure with the AH using t h e version programmed by Brooks, Laidig and Saxe contained in HOND07 /63/. B u t before making the comparison one must discuss some details concerning the exponential MCSCF methodology. It must be taken into account that second order MCSCF procedures, as the AH and other exact or approximate second order methods, converge quadratically when close to the final solution, but with a very small radius of convergence. More than this, when the MO-CI coupling is not included one finds linear convergence even with second order methods. For example, see Table I1 of Werner's paper /14/. A possible way to solve the convergence problems consists i n using high order energy derivatives in t h e Taylor energy expansion. The drawback is that a higher derivative calculation is really expensive. Consequently some authors had included these terms in some approximate way / 12,l 4, 28/. This, combined, for instance, with the Direct CI method of Knowles and Handy, using Slater determinants instead of CSF's, overcomes some problems.
418
Ramon Carbo, Joan Miro, Lloreng Domingo, and Juan J. Novoa
The advantage of the EJR procedure can be based on the fact that the present method includes all the energy derivatives up to any order in the energy expansion through the MO transformation, making the convergence radius infinite, as numerous computations had shown up to date /65/. These computations range from very simple molecules and basis sets, like some of those discussed in this paper, to large transition metal clusters. Consequently, the Jacobi Rotation procedure is truly independent of the starting MO's and opens the way to the construction of a Black Box MCSCF procedure. The claim of independent convergence is not a common feature of the usual MCSCF methodology. Indeed, MCSCF quadratic procedures may fail to converge even i n simple cases. Let us take, for example, the case of the CH2 molecule computed with an STO-3G basis set and a CI space formed by the l A 1 ground state configuration and the first biexcitation. Table XI1 shows the results for MCSCF computations using the f u l l y quadratic A H method including MO-CI couplings, the two-step AH method (where the MO-CI coupling is not considered) and the EJR procedure, where MO-CI coupling is not, at present, explicitly taken into account. The starting orbitals for all the computations are the SCF MO's. It is evident from these results that the fully quadratic AH procedure does not converge and one has to switch to the two-step AH which presents linear convergence. EJR converge in a similar number of iterations like the two-step A H method. It is possible to see that the linear convergence of the EJR procedure is due to the non inclusion of the MO-CI coupling in the energy variation of the MO transformations, as the convergence is achieved by optimizing only the MO rotations when the CI vector is set equal to the final CI values. The previous behaviour does not depend on the number of
419
Jacobi Rotations
configurations in the CI expansion as can be inferred from TABLE XI11 results, where the final MCSCF energy and the number of iterations for various CI spaces of the CAS (Complete Active Space) type is presented. In all the cases the two-step AH method gives the same energy and converges in a similar number of iterations than the EJR. For non-CAS CI spaces, see the results shown in Table I X where one can see how EJR convergence is fulfilled even in cases where noconvergence is found with the two-step or f u l l quadratic procedures.
TABLE X I I . Total energy (in a.u.) for a biconfigurational MCSCF computation of CH2 in the ' A 1 state with various algorithms. The STO-3G basis set was employed at the following geometry (in a.u.): C (O., O., 0.); HI (-1.8411892, O., 1.0531792); H2 (1.841 1892, O., 1.053 1792).
----------_____---__-----------------------------Aumented Hessian method ........................................................... With MO-CI coupling Without MO-CI coupling
-_-_-----------___________ ______________________________ -36.85564239 -36.85564239 -38.2 1989305 -38.22103444 -38.37692871 -38.3770102 1 -38.36733483 -38.37750608 -38.37036663 -38.37756460 -38.63733740 -38.37757648 -37.51216343 -38.37757222 -37.46986509 -38.37757988 -37.46993122 -38.37758005 -35.901I0428 -38.37758009 -33.62845154 -38.37758010 -23.7 1655960 -20.3 1578866 - 15.86176625 - 16.87542415 - 13.16446563
Elementary
Jacobi Rotations
___________--________________ -36.85564239 -38.28935957 -38.37694611 -38.37750796 -38.37756543 -38.37757663 -38.37757927 -38.37757983 -38.37757006 -38.37758009 -38.37758010
420
Ramon CarM, Joan Miro, Lloreng Domingo, and JuanJ. Novoa
TABLE XIII. Total energy (in a.u.) for a MCSCF computation of C H 2 in the 1A 1 state with a STO-3G basis set for various CAS expansions. The geometry (in a.u.) used was the following: C (O., O., 0.); H1 (-1.8411892, O., 1.0531792); H2 (1.8411892, O., 1.0531792). In all cases the initial orbitals feed into the MCSCF program were SCF orbitals.
Similar results have been obtained for the AIH, the A10, or A102 systems and, therefore, will not be presented here. Previous results show that the EJR method is very promising to solve the convergence problem in MCSCF computations. The only snag is the number of integral transformations to be done i n the exact formulation. However, this problem can be obviated if instead of the "exact" EJR algorithm outlined before, some kind of accumulated EJR is used /l/, where the integrals are only transformed after the f u l l set of rotations is done, in the way this problem is treated i n the exponential transformations. Preliminary work i n this direction shows that the efficiency of the method is kept while the amount of time in t h e intregral transformations is lowered.
421
Jacobi Rotations
7. OPTIMIZATION OF ROTATION SINE
7.1. An Exact Solution Finding the sine of the angle that minimizes the energy when two orbitals are transformed according to EJR is a univariate minimization problem and as such may be solved using any of the classical methods available /56/. However, in this particular case a straightforward solution may be obtained after a close inspection of t h e equations attached to each EJR particular case. 7.1.1.
A simple procedure
energy variation In most usual situations, when t h e polynomial coefficients are computed, those multiplying the higher powers of the sine may considered negligible in front of the one multiplying the lower terms. Therefore the energy may be approximated by the expression: E=Eo+ E ~ s c + E ~ s *
(7.1)
which will be very useful to start our analysis about t h e behaviour of the family of polynomial functions associated with the EJR variation problem. The derivatives with respect to the sine of the approximate polynomial may be written:
422
Ramon Carbo, Joan Mir6, Llorens Domingo, and Juan J. Novoa
aE/as
= - c (El t
- 2 E2 t - E l )
where t=s/c is the tangent of the rotation angle. Being quadratic in nature, the first derivative may have two real solutions which produces a vanishing value, that is: the energy extremal. These two solutions may be compactly written that:
(7.3)
where a=E 1/E2. The single parameter a may be used to further simplify the previously written equations, i n such a manner that:
E = Eo
all
+ E2 (a s c + s2 ) (7.4)
d2E/as2 =E2 (2
-
a t (3+t2 ))
Introducing the energy extrema previously found into these latter equations, i t is easy to see, that the sign of the root that corresponds to a minimum depends on the sign of E2, namely if E2 > 0 then the root has to be taken as negative, and viceversa. Furthermore, the solutions found in this way always correspond to energy decreasing steps. When E2 > 0, it is convenient, in order to avoid roundoff errors, to use a modification of the classical quadratic equation solution, such as:
413
Jacobi Rotations
which prevents substractions between magnitudes of similar order . The results, obtained with this simplified formula, can be used as starting points to solve the cases that arise from both mono- and multiconfigurational energy variation. 7.1.2.
Monoconfigurational energy
The first derivarive of the monoconfigurational variation (5.2) can be written as aE/& = - c [ ( E I + E ~s2)t2 -2(E2 +2E2 ~2)-(E1+3E3s2 )]
energy
(7.6)
where t is again the tangent of the rotation angle. One can see that, although having its parameters dependent on the sine, this first derivative is again quadratic i n the tangent. A very simple, and fast convergent, recursive method may be devised from this formula, leaving the rotation sine as a constant and finding, at each iteration step, the optimal tangent value, which will lead to a new sine value for the next iteration. Therefore, calling A=E1 + E3 s B= E2 + 2E4 s C= El + 3E3 s the solution has the form
(7.7)
424
Ramon Carb6, Joan Miro, Lloreng Domingo, and Juan J. Novoa
where the sign of the square root was taken according to the considerations made in the previous section. The expression on the left is used when B>O, while the one on the right for the case B
Multiconfigurational energy
A very similar reasoning may be used for the multiconfigurational case where the energy is written as i n (6.8). The expression for the first derivative is:
which leads again to a quadratic form when using
(7.10)
425
Jacobi Rotations
X
TOTAL= -
IE I
Ell=.
X
X
x
I
X
E02=+
El310 E04=X
t
-1 .
FIGURE 1.- Form adopted by the energy increment curve and its most important components for a particular case in a helium atom computation.
In this case the convergence surely will be a little worse as t h e multiconfigurational coefficients depend on smaller powers of the sine and will have a larger contribution. Figure 1 shows the value of energy increment versus the rotation sine (solid line) for a particularly complicated case found when optimizing the helium atom, together with the weight of the most important terms, identified by its Eij component. In any case, the convergence is still fast enough to make the algorithm suitable but one has to be careful with the two or more than two
426
Ramon Carbo, Joan Mir6, Llorens Domingo, and Juan J. Novoa
minima cases. 7.1.4.
Brillouin theorems
Brillouin theorems (see for example, reference /57/ ) can be deduced i n both mono- or multiconfigurational cases using the same technique as in the CI expression of section 4.2. Then making s=O in the first derivative formulae and forcing the gradient to nullity one easily obtains: a) M on ocon figurational
(7.1 1) This general expression takes the form 4 I A% h + A 9 D, : P - AmOD,: Q I j > = O
(7.12)
for the Roothaan-Bagus energy expression given in section 5.2. b) M u 1 t i c on f i g u ration a I
(7.13) It seems in this way that throughout EJR formalism one is able to find some sort of generalized Brillouin theorem for any arbitrary expression involving Quantum Mechanical objects. In fact, any Quantum Mechanical object may be written in a general way as:
427
Jacobi Rotations
Z = f(li>,lj>)
(7.14)
Then, a EJR of indices (i,j) will give rise to some structure which at least will contain terms in either s or sc, or both,
a z = ZOl s + Z l l
sc
+ O(s2)
(7.15)
thus, the first derivative may be written as:
a 6Z / 3s = ZOl + Z l l
c + O(s2)
(7.16)
and the extremum condition at s=O gives (asz/as),=o=z()l+Zll
which constitutes a Brillouin Mechanical object expression.
7.2.
=o
(7.17)
Theorem valid for any Quantum
Performance Improving Procedures
I t was mentioned before that as soon as the EJR iterative process is settled and has begun to converge, the rotations will be fairly small, - i n fact, computational experience shows that this happens after the fourth sweep, as in classical Jacobi diagonalization - so that approximate solutions of the energy polynomial, which discard terms with high powers of t h e rotation sine, may be accurate enough to guarantee the total process convergence. I t is also quite usual that from the whole set of active rotations, very few of them have energy decreasing the contributions several orders in magnitude higher than
Ramon Carbo, Joan Miro, Lloreng Domingo, and JuanJ. Novoa
420
remaining ones. These rotations that lay behind in the iterative process should be done more often than the others, if one wants the process to converge in a reasonable number of rotations. These considerations are important for an efficient implementation of a Jacobi Rotation Procedure. 7.2.1.
Approximate optimal solutions
When only the terms up to second order in the sine are considered i n the EJR energy variation expressions, simple and non-iterative expressions are found for the optimal rotation angle and its corresponding energy variation. In the monoconfigurational case, one obtains an expression similar to the one described i n section 7.1.1.,
E = Eo
+ El s + E2
(7.18)
~2
which has trivial solutions:
and
The multiconfigurational energy can be approximated to
where t h e well known approximation to the has been used. The solutions are therefore:
cosine
c=l-s/2
Jacobi Rotations
7.2.2.
429
Rotation selection and filtering
Between the collection of computational details which may be commented here, the strategy of performing the minor number of possible rotations, through a given energy minimization process, can help to cut the necessary computer time by a significant amount. The selection of rotations which can produce an energy lowering within a given precision range, avoiding the remaining meaningless ones,can be performed using energy increments estimated by means of the formulae given in the previous section,comparing the obtained results with a predetermined precision value. Although this precision threshold may be arbitrarily set, computational experience suggests that the best order of magnitude will be given by the arithmetic mean of the estimated energy lowerings for all the active rotations. Such a strategy requires the computation of the low order energy parameters for all the active orbital pairs before each sweep, but a significant amount of time is spent in this previous filtering. The alternative of estimating the energy decrements based upon t h e ones obtained in t h e previous sweep, although saving filtering time, d o not give as good results as using t h e whole set of avalaible rotations. Furthermore, a significant performance improvement is
430
Ramon Carb6, Joan Mir6, Llorens Domingo, and Juan J. Novoa
found if, at each sweep, t h e rotation that leads to the largest energy lowering is performed first, followed by a normal sweep; this sweep may even include the rotation already performed if its newly estimated energy decrement happens to be larger than the average value. The main steps in each iteration for the best actual s t r a t e g y are : a) Compute the estimated energy lowerings for every active index pair rotation: A E i j . b) Find the minimum of AEij and perform the corresponding rotation. c) Compute a threshold value as: T = -1X
AEij
"nt
(7.23)
where nrOt is the number of rotations. d) For each rotation: estimate the energy lowering AEij < T. and perform the rotation if
7.3.
AEij
Localization Procedures and Jacobi Rotations
EJR can be used for purposes other than energy minimization. As an example of the present scheme flexibility a short discussion on OM localization follows. The Edmiston-Ruedenberg localization procedure / 5 8 / consists of an intrashell transformation of selfrepulsion energy until the sum
(7.24)
431
Jacobi Rotations
attains its maximal value. Let us consider this expression more generally as a sum of norms of the density functions (pp=lp>}. A generalized norm can then be defined as
(7.25)
where 0 must be some positive definite operator. In t h e case of repulsion integrals 0 = l/r12. Thus, a general localization function may be defined as
P
(7.26)
The localization function Le may be considered from t h e point of view of EJR as a sum of fourth-order objects. Thus given an active pair { i j } and a rotation sine one obtains
GC=G[(iiIOIii)+(jjIOIjj)l= =-2sc[
(iil@lij)-(ijtOljj)]-
- (1/2) s t (i i~ 0 1 i i ) + ( j j I o I j j
) - 2(i i I
o I j j) -4(i j I 0 1
i j) 1
(7.27)
where S=sin2a, C=cos2a. But the interesting fact is that t h e above expression becomes the same as the variation of the square euclidean distance between functions pi and pj:
432
Ramon Carbo, Joan Mir6, Lloreng Domingo, and JuanJ. Novoa
2
D =(iil@lii)+(jjl@ljj)-2(iii@ljj)
(7.28)
which properly varied produces
6 D2= 2 6 C
(7.29)
Thus, maximization of Le in a general Edmiston-Ruedenberg scheme becomes the same as maximization of distances between pairs of intrashell orbitals. Another interesting feature is the possibility of using exactly the EJR scheme to obtain localized functions. That is, localized orbitals can be produced in the same general procedure as the energy minimization one, providing the way to obtain simultaneously with the best orbitals some orbitals which bear some kind of canonical localized form with respect to a positive definite operator. The best operator which fits both minimization and localization schemes is, of course, the electron repulsion l / r i 2. In this manner as optimal energy gives a minimal nondefinite measure between orbitals belonging to different shells, localization provides a maximal distance between orbitals belonging to the same shell. Brillouin theorem here looks as
that is, at the maximal distance one will have the relationship (iilij)=(ijijj)
(7.3 1)
Jacobi Rotations
433
between repulsion integrals if this measure has been chosen. Let us finally say that maximizing a distance measure becomes something analogous to minimize the similarity measure r(i,j) defined in section 6.4.5. Thus localization in this context becomes the same as to obtain within a shell the least similar set of functions.
ACKNOWLEDGEMENTS
Computer time has been made available through a grant CACYT-0657/81. One of us also (J.J.N.) wishes to acknowledge the support given by the "Caixa d'Estalvis d e Barcelona" and the Bristish Council. The authors whish also to express their gratitude to Mr. J. Peris and Mr. J. Moreno for the helpful work done during early stages of the EJR theoretical and computational development.
REFERENCES 1. Carbo, R., Domingo, LI., and Peris, J. J.,(1982) Adv. Quantum Chem. 15, 215. 2. Miller, K. J., Ruedenberg, K., (1968) J. Chem. Phys., 48, 3414. 3. Raffenetti, R. C., Ruedenberg, K., (1970) Int. J. Quantum Chem., 3, 625. 4. Hoffman, D. K., Raffenetti, R. C., Ruedenberg, K., (1972) J. Math. Phys., 13, 528.
434
Ramon Carbo, Joan Miro, Llorenq Domingo, and Juan J. Novoa
5. Carbo, R., Domingo, LI., Peris J. J., and Novoa J. J., (1983) J Mol. Struct., 93, 15. 6. Carbo, R., Domingo Ll., and Novoa J. J., (1985) J . Mol. Struct., 120, 357. 7. Roothaan, C. C. J., and Bagus, P. S.,(1963) Methods in Computational Physics, 2, 47. 8. Roos, B., Salez, C., Veillard, A., and Clementi, E., (1968) "A general program for calculation of Atomic SCF Orbitals by the expansion method".IBM Research. RJ 518. 9. Clementi, E. and Roetti, C., (1974) Atomic Data and Nuclear Data Tables, 14, 185. 10. Hinze, J., (1981) Int. J. Quantum Chem., Quantum Chem. Symp., 1 5 , 69. 1 1 . Dupuis, M., (ed) (198 1 ) NRCC Proceedings: "Proceedings of the Workshop on Recent Developments and Applications of Multi-Configurational Hartree-Fock Methods". 12. Olsen, J., Yeager, D. L. and Joergensen, P.,(1983) Adv. Chem. Phys., 54, 1. 13. Almlof, J and Taylor, P. R., (1984) NATO ASI Ser., Ser. C, 133, 107. 14. Werner, H. J., (1987) Adv. Chem. Phys., 69 (part II), 1 ; Shepard, R., Adv. Chem. Phys., 69 (part II), 63. 15. Polezzo, S., and Fantucci, P., (1980) Mol. Phys., 39, 1527. 16. Polezzo, S., and Fnntucci, P., (1980) Mol. Phys., 40, 759. 17. Polezzo, S., and Fantucci, P., (1981) Cazz. Chim. Ital., 111, 49. 18. Matsen, F., and Nelin, C. J., (1981) Int. J . Quantum Chem., 2 0 , 861. 19. Fantucci, P., Polezzo, S., Trombetta, L., (1981) Int. J . Quantum Chem., 19, 493. 20. Levy, B., (1981) Lawrence Berkeley Lab., [Rep.] LBL-12157, 134. 21. Mukherjee, N. G., (1980) Int. J. Quantum Chem., 18, 1045. 22. McCullough, E. A., (1982) J . Phys. Chem., 86, 2178. 23. Igawa, A., Yeager, D. L., and Fukutome, H., (1982) J . Chem. Phys., 7 6 , 5388. 24. Lengsfield, B. H., (1982) J . Chem. Phys., 77, 4073. 25. Polezzo, S., and Fantucci, P., (1982) Stud. Phys. Theor. Chem. 21, 181.
Jacobi Rotations
435
26. Lengsfield, B. H., and Liu, B., (1981) J. Chem. Phys., 75, 478. 27. Polezzo, S., (1980) Chem. Phys. Lett., 7 5 , 303. 28. Werner, H. J., and Meyer, W., (1980) J. Chem. Phys., 73, 2342; (1981) J. Chem. Phys., 74, 5794. 29. Ishikawa, Y. (1980) Z. Naturforsch., 35, 408. 30. Siegbahn, P., Heiberg, A., Roos, B., and Levy, B., (1980) Phys. Scr., 21, 323. 31. Brooks, B. R., Laiding, W. D., Saxe, P., Goddard, J. D., Schaeffer, H. F., (1981) Lect. Notes Chem., 22, 158. 32. Yurtsever, E., and Shillady, D., (1983) Chem. Phys. Lett., 94, 316. 33. Das, G., (1981) J. Chem. Phys., 74, 5775. 34. Camp, R. N., King, H. F., (1982) J. Chem. Phys., 77. 3056. 35. Bauschlicher, C. W., Bagus, P. S., Yarkony, D. R., and Lengsfield, B. H., (1981) J. Chem. Phys., 7 4 , 3965. 36. Werner, H. J., and Knowles, P. J., (1985) J. Chem. Pkys., 82, 5053; Knowles, P. J. and Werner, H. J.. (1985) C h c m . Phys. Lett., 115, 259. 37. Yeager, D. L., and Joergensen, P., (1980) M o l . Phys., 39, 587. 38. Yeager, D. L., Albertsen, P., and Joergensen, P., (1980) J . Chem. Phys., 73, 281 1. 39. Joergensen, P., Albertsen, P., and Yeager, D. L., (1980) J . Chem. Phys., 7 2 , 6466. 40. Joergensen, P., Olsen, J., and Yeager, D. L., (1981) J. Chem. Phys., 7 5 , 5802. 41. Olsen, J., and Joergensen, P., (1982) J. Chem. Phys., 77, 6109. 42. Yeager, D. L., Lynch, D., Nichols, J., Joergensen. P., and Olsen, J., (1982) J. Phys. Chem., 8 6 , 2140. 43. Joergensen, P., Swanstroem, P., and Yeager, D. L., (1983) J . Chern. Phys., 7 8 , 347. 44. Olsen, J., Joergensen, P., and Yeager, D. L., (1983) I n t . J. Quantum Chem., 24, 25. 45. Jensen, H. J . A. and Joergensen, P. (1984) J. Chem. Phys., 80, 1204. 46. Joergensen, P., and Simons, J., (1983) J . Chem. Phys., 7 9 , 334. 47. Golab, J. T., Yeager, D. L., Joergensen, P., (1985) Chem.
436
Ramon Carb6, Joan Miro. Llorenq Domingo, and Juan J. Novoa
Phys., 9 3 , 83. Levy, B., (1970) Int. J. Quantum Chem., 4, 297. Wood, M. H. and Veillard, A., (1973) Mol. Phys., 26, 595. Lengsfield, B. H., (1980) J. Chem. Phys., 73, 382. Dunning, T. H. and Hay, P. J., (1977) Mod. Theoret. Chem., 3 , 1. 52. Snyder, L. C. and Basch, H. B., (1972) "Molecular Wave Functions and Properties", John Wiley and Sons, New York. 53. Lowdin, P. O., (1955) Phys. Rev., 97, 1509. 54. Lowdin, P. 0. and Shull, H., (1956) Phys. Rev., 101, 1730. 55. Payne, P. W. (1976) J. Chem. Phys., 65, 1920. 56. Beveridge, G. S. G. and Schechter, R. S., (1970) "Optimization, Theory and Practice", McGraw-Hill Kogakusha, Tokyo. 57. Levy B., and Berthier G., (1968) t n t . J. Quantum Chem., 2, 307. 58. Edmiston, C. E. and Ruedenberg, K., (1963) Rev. Mod. Phys., 3 5 , 457. 59. Huzinaga, S.,(1965) J. Chem. Phys., 42,1293. 60. Iwata, S., (1981) Chem. Phys. Lett., 8 3 , 134. 61. Kendrick, J. and Hillier, I. H., (1976) Chem. Phys. Lett., 4 1 , 283. 62. Lengsfield, B. H., (1980) J. Chem. Phys., 73, 382. 63. HOND07, Dupuis, M., Watts, J. D., Villar, H. O., Hurst, G. J. B., (1988) IBM Scientific and Engineering Department, Kingston, NY. 64. Rubio, J., Novoa, J. J., Illas, F., (1986) Chem. Phys. Lett., 126, 98; Illas, F., Merchan, M., Pelissier M., and Malrieu, J. P., (1986)Chem. Phys., 107, 361; Illas, F., Rubio, J., Ricart, J. M., (1988) J. Chem. Phys., 88, 260; Rubio, J., Ricart, J. M., Illas, F., (1988) J. Comp. Chem., 9, 836; Illas, F., Rubio, J., Ricart, J. M., J. Chem. Phys., (1988) 89, 6376. 48. 49. 50. 51.
43 7
Jacobi Rotations
APPENDIX I
Expression of MO multiconfigurational energy variational coefficients
The multiconfigurational energy may be written in a way as:
yijk, ( i j I k l )
w . . h. . +
E= ij
rJ
‘1
general
ijkl
where the sums run over the indices wit.. the following constraints: j I i, I I k I i and I 2 j when i=k, or, with packed indices, this is the same as to consider [kl] I [iJ]. The energy variation is given by the formula (6.8) text where the parameters are defined by:
in
the
438
Ramon Carbo, Joan Mid, Llorenq Domingo, and Juan J. Novoa
Jacobi Rotations
439
440
Ramon CarM, Joan Miro, Llorens Domingo, and Juan J. Novoa
APPENDIX 11
Computational procedure followed in Algorithm MC-A
a) b)
Compute A 0 integrals Compute initial molecular orbitals c) Transform atomic to molecular integrals d) Define initial CI vector e ) Compute first and second order density matrices f) Jacobi rotations: global-iterations = 0 do while energy-increment > global-precision MO rotations: MO-sweep = 0 do while MO-energy-increment > MO-precision loop over ( i j ) active MO pairs * compute the MO-energy polynomial coefficients * compute optimum sine * rotate the MO's, and the molecular integrals end loop ( i j ) MO-sweep = MO-sweep + 1 enddo MO-energy
CI rotations: CI-sweep = 0 do while CI-energy-increment > CI-precision loop over { 1,J) active CI coefficient pairs
441
Jacobi Rotations
*
compute the energy polynomial coefficients * compute optimum sine * rotate CI vector * update the first and second order density matrices end loop (1,J) CI-sweep = CI-sweep + 1 enddo CI-energy global-iterations = global-iterations enddo energy-increment g)End algorithm MC-A
+
1
This Page Intentionally Left Blank
Index
A Ab initio quantum chemical methods, 101-112 computational cost, 104 embedding theory, 105-106 error bound method, 104-106 generalized valence bond-CI method, 108-110
generalized valence bond method, 106-108 localization scheme, 105 modified effective potential, 102-103 multiconfiguration electron state, 102 oxygen, binding site on Si(lW), 103-104 self-consistent field Hartree-Fock method, 101-102
self-consistent field studies, 102 vibrational frequency shift for oxygen on Ni(100), 102-103 wave function for dosed shell singlet state, 101 Acetylene, chemisorption binding energies, 87-90
Active orbital space, choosing, multiconfigurational energy expression, 412-414 Adatom-adatom forces, 113-114 Adatom-substrate binding forces, 114 Adjoint operators, stability problem, 190-198 bi-orthogonality theorem, 191 block-diagonalized, 194 classical canonical form, 191 complex symmetric operator, 193-194 general theory, 190-194 Hartree-Fock scheme, 195-198 Jordan blocks, 192-193 ket-bra operator, 190 Segre characteristics, 192 Adsorbate on metal surfaces, surface structure,
AH, systems, geometry optimization, multiconfigurational energy expression, 415-417 Alkylidynes, 120 Alloys, surface structure, 131-134 Aluminum, chemisorption on GaAs (110), 112 Angle-resolved auger electron spectroscopy amplitudes of outgoing partial waves, 72 surface structure, 56 Angle-resolved photoelectron emission spectroscopy, surface structure, 51 Angle-resolved photoemission finestructure, 67 relative intensity modulations, 68 surface structure, 28-29, 52 Angle-resolved ultraviolet photoemission spectroscopy, surface structure, 51 Angle-resolved x-ray photoelectron spectroscopy, surface structure, 51-52 Angle-resolved x-ray photoemission, surface structure, 51 Anonymous parentage approximation scheme, 325 Atomic adsorption, semiconductor surfaces, 165 Atomic beam diffraction, surface structure, 33-34
Atom-superposition electron-delocalization molecular orbital theory, 84 Auger electron emission, 71-72 Augmented Hessian procedure, 417-418
B Bi-orthogonality theorem, 191 Bloch equation, 328 Fock-space, 334, 341, 355, 358 zero valence problem, 345 Bloch equation based Fock space theory,
144-146
345-350
surface coverage, 114 Adsorbate-adsorbate interaction, 115
IP calculations, 349 satellite peaks, 349 443
444
Index
subsystem embedding condition, 348 valence-universal wave-operator, 347 wave-operator, 345 Bloch-Horowitz theory, 349 Bloch wave-operator, 345 x-Bonded model, 96-97 Bond length, contractions and reconstructions, 117-118 Born-Oppenheimer approximation, 220 Bragg peaks, LEED, 77 Brillouin theorem, 432 CI form, 386 rotation sine optimization, 426-427 Broyde-Fanno-Goldfarb-Shannomethod compared with Murtagh-Sargent method, 280 optimization hydrogen peroxide, 271 NH,, 275, 277, 282-283
C Carbon monoxide binding energy, 85 bonding geometry, 120 adsorbate, 95 bonding to platinum (100) surface, 97 chemisorption on metals extended Hiickel theory, 85-87 Ni(100), 99, 108-109 surface structure, 157-158 as molecular adsorbate, 119-120 CAS-SCF function, 325 C-C bond, 120 CH,, total energy, biconfigurational MCSCF method, 418-420 Chalcogen, chemisorption on metals Ni surfaces, 100 surface structure, 136-139 Chemical composition, surface structure, 36 CI energy elementary Jacobi rotations, 384-386 increment, non-null &j coefficients, 407-408 CI-MO computation algorithms, 4OO-401 CI space, choosing, multiconfigurational energy expression, 408, 410-412 CI wave function, multiconfigurational energy expression, 398 Clean metal structures, surface structure, 122-126 Clebsch-Gordon coefficients, 63
Cluster expansion methods, 292-293 incomplete model space, 357 size-extensivity, 298-302 core-extensive theories, 301 linked clusters, 300-301 open-shell cluster expansion theories, 301 pair-correlation, 299 particle correlations, 299 size-extensive theory, 301-302 wave-function, 298-300 Complementary active space, 354 Complete active space, 297 Complete model space, 298 Complete model space theory, 360 Complex scaling method, 188, 222-230 dilatation operator, 189, 217-218 e-Be scattering problems, 223, 230 nearest scattering root, 227-228 resonant orbital energy, 222-226, 229 SCF resonances, 223 shape resonance, 222 unbounded similarity transformation, 219 Conjugate gradients, line search strategy, 262-263 Core-extensive theories, 301 Core-hole decay, 30 Coulombic many-particle Hamiltonian, 218-220 Coulomb integral, elementary Jacobi rotations, 387 Coupled-cluster equations constructing, 350-351 IP calculations, 349-350 Coupled-cluster hear response theory, 315-319 hermitian version, 321 interaction Hamiltonian, 315, 321 IP calculations, 349 nonhermitian matrix, 317 numerical applications, 318-319 time-dependent version, 323 working equations, 316-318 Coupled-cluster theory closed shell, 295 incomplete model space, applications, 360-361 open-shell, 345 time-dependent version, 321-323 variational cluster amplitudes, 352 energy, 353 Fock-space, 350, 352-353 isometry condition, 352
Index
Coupled perturbed Hartree-Fock equations, 255-258, 287 Coupled perturbed Hartree-Fock theory, 241-242 CuCO clusters, 95 CusCO clusters, 95 Cu,,HI cluster, 106
D d-band, hybridization with s-band, 97 Debye-Waller attenuation factor, 60-61 DFP, update optimization of NH,, 275, 278 Diffuse low energy electron diffraction, surface structure, 27-28 Dilatation operator, 189, 217-218 Di-nitrogen, chemisorption on metals, surface structure, 157-158 Dipole scattering, 74 Discrete variational linear combinations of atomic orbitals, 98 Discrete variation self-consistent charge, 98
E e-Be scattering problems, 223, 230 Edmiston-Ruedenberg localization procedure, 430-432 Electron affinities, 357 Electron appearance potential fine structure spectroscopy, surface structure, 32 Electron-atom scattering, 59-61 Electron diffraction four-step description, 39-41 surface structure, 5, 25-29 Electron energy loss near-edge structure, surface structure, 55-56 Electronic energy increment structure, 401, 403 variation, 402-404 Electronic spectroscopy, surface structure, 36 Electron loss spectroscopy, surface structure, 54 Electron microscopy, surface structure, 53 Electron-photon scattering, 74-76 Electron propagation, 58-59 Electron scattering multiple, see Multiple electron scattering single versus multiple, 76-77 Electron-stimulated desorption, surface structure, 37-38
445
Elementary Jacobi rotations, 377-380 advantage, 418 CI energy, 384-386 comparison with exponential MSCSF methodology, 417-420 Coulomb integral, 387 exchange integral, 387 invariants and relationships, 387-389 localization procedures, 430-433 MO framework, 380-383 monoconfigurational density matrix, 383-384 monoconfigurational energy, see Monoconfigurational energy MO pair, 381 multiconfiguration energy, see Multiconfigurational energy expression structure, 398-399 optimization of rotation sine, see Rotation sine optimization quantum mechanical objects, 381 rotation structure, 405-408 two-electron molecular integrals, 386-389 unitary matrix, 379-380 Eley-Rideal mechanism, 87 Energy hypersurface, characterization of minima, 243-245 Equation of motion methods, 304-305 Error bound method, 104-106 Ethylidyne, chemisorption, 89, 91 Exchange integral. elementary Jacobi rotations, 387 Excitation energies, 357 Excitation operator, 312, 329 Extended appearance potential fine structure detection modes, 52-53 surface structure, 52-53 Extended fine structure techniques, 31 Extended Hiickel theory, 83-89 acetylene chemisorption binding energies, 87-90 advantage, 84 application. 85-89 atom-superposition electron-delocalization molecular orbital theory, 84 binding energies, 85 chemisorption of hydrocarbon molecules, 87 CO chemisorption, 85-87 CO-Pt, 86 electron delocalization energies, 84
446
Index
ethylidyne, chemisorption, 89, 91 molecular orbitals, 83 total energy, 83-84 Extended-surface model, 82 Extended x-ray absorption fine structure phase shifts, 69 surface structure, 31-32, 49 Extended x-ray energy loss fine structure, surface structure, 55
F Field ion microscopy, surface structure, 33 Fine-structure techniques, see also specific techniques range of multiple scattering, 77 surface structure, 29-32 Flecher-Reeves algorithm, 262 Fock-Dirac density matrix, 1% Fock-Dirac density operators, 206 Fock-Dirac operator, 187-188 Fock matrix, 393 Fock operator, 390-391 Fock-space equation, 348 Fock space theory, see Bloch equation based Fock space theory; Full cluster expansion theories Full cluster expansion theories Fock space, 332-353 advantage, 335 Bloch equation, 334 Bloch equation based theory, 345-350 flexibility in wave operator, 333 generating coupled-cluster equations, 334 similarity transformation-based theories, see Similarity transformation-based Fock-space theories valence universal wave-operator, 333 variational CC theory, 350, 352-353 Hilbert space, 324-323 Bloch equation, 328 CAS-SCF function. 325 cluster operators, 324 cluster structure, 324-325 effective Hamiltonian, 328 multireference CI approach, 325-326 multi-root formalisms, 331 Silverstone-Sinanoglu strategy, 327 single root formalisms, 328-331 Full-valence model space, 297
G GaAs, zincblende structure, 108, 110-111 Generalized valence bond theory, 106-108 via configuration interaction, 108-110 Geometry optimization, see also NH, AH, systems, multiconfigurational energy expression, 415-417 hydrogen peroxide, 271 Global convergence tests, multiconfigurational energy expression, 401-404 Golden-rule expression, 72 Greenstadt method, optimization of NH,,279
H Hamiltonian, see also Many-particle Hamiltonian dressed molecular, 318 effective, 328 disconnected, 364 energy-dependent, 303 generating hermitian, 344 hermitian, 350 matrix, 362-363 one-particle, 208 valence universal, 355 valence universal nonhermitian, 338 hermitian valence-shell, 339 interaction, 321 many-electron, transformed, 187-188 skeletons, 310 transformed Fock space, 343 Hartree-Fock equations canonical, 197, 213 coupled perturbed, 255-258 solving, 215 Hartree-Fock function, 357 canonical, 208, 215 closed shell single determinant, 295 ground state, 313 transformation, 231 Hartree-Fock method, see also Complex scaling method; lkrfala theorem application to many-particle Hamiltonian, 221 bi-variational expression, 195-196 canonical Hartree-Fock equations, 197 complex conjugation, 211 complex symmetric, 189, 215-216, 231 operator, 205-209 real operator, 209-211
Index dilated SCF equations, 231 effective one-electron operator, 196 effective one-particle Hamiltonian, 208 Fock-Dirac density matrix, 196 Fock-Dirac density operators, 206 iterative SCF-procedure, 210-211 one-particle projector, 206, 208-209 purpose, 198 restricted, 210 self-consistent, 101-102 similarity transformations, 201-205 Slater determinants, 195 stability problem of adjoint operators, 195-198 symmetry dilemma, 210 Hartree-Fock-Slater method, 94, 98-101 chemisorption on Ni surface carbon monoxide, 99 chalcogen atoms, 100 determination of energetically favorable adsorption site, 99-100 discrete variational linear combinations of atomic orbitals, 98 discrete variation self-consistent charge, 98 molecular ionization energies, 99 total energy calculations, 100-101 Hartree-Fock theory, coupled perturbed, 255-258, 287 Hartree-Fock total energy, 84 Haussdorf transformation, 397 HCN, augmented Hessian, 285 HKO, exact Hessian optimization, 273-274 Helium energy increment curve, 425 total energy, configurational spaces, 410-412 Hermitian matrices, 313-314 Hessian matrix, 243-245, see also QuasiNewton methods approximate, 282-283 BFGS optimization cycles required, 281-282 estimating, coupled perturbed HartreeFock equations, 287 HzCO, 273-274 inverse, 252-254 NH,, 273-274 resolving for analysis, 263 Hessian method approximate analytic, 255-258 augmented, 260-262, 284-285 High-resolution electron energy loss spectroscopy, 119 surface structure, 37, 54
447
Hilbert space, see QLTO Full cluster expansion theories one-electron, 189 H E E L S , scattering, 74-75 Hydrocarbon, chemisorption systems, 87 Hydrogen, chemisorption, 106 Hydrogen peroxide coordinates and gradients, 267-268 energy descent, 270 geometry optimization, 271 Newton iterations, 267, 269
I Impact scattering, 74 Incomplete model space, see Size extensive formulations, incomplete model space Inelastic electron-electron scattering, 72-73 Inelastic low-energy electron diffraction, surface structure, 54 Infra-red reflectance-adsorption spectroscopy, surface structure, 37 Insulators, surface structure, 160-163 Invariance theorem, 214 Ion scattering, surface structure, 35
J Jordan blocks, 192-193
L Langmuir-Hinshelwood mechanism, 87 Lattice-gas chemisorption systems, 28 LEED beam-set neglect, 67, 79 Bragg peaks, 17 defects, 78-79 diffuse theory, 66 disorder effects, 80 inverse, 65 long- versus short-range order, 78-79 range of multiple scattering, 77-78 renormalized forward scattering, 66 reverse scattering perturbation. 66 two-dimensional surface periodicity, 65 Level-shift function approach, 306 Liz molecule electronic energy variation, 402-404 non-null tij coefficients, 405-408 Linear response theory, 306
Index
448
Line search strategy, 245-250 approximate analytic Hessians, 255-258 augmented Hessian method, 260-262 conjugate gradients, 262-263 cosine test, 248 coupled perturbed Hartree-Fock equations, 255-258 descent condition, 246-247 downhill search, 260 extrapolation formula, 250 interpolation formula, 250 Newton’s method, 251-252 quadratics surface, 259 quasi-Newton methods, 252-255 restricted step methods, 258-260 search direction methods analysis, 263-265 steepest descent, 251 tests for right and left extremes, 248 Linked cluster theorem, 301, 313-314 incomplete model space, 354 Localization procedures, elementary Jacobi rotations, 430-433 Low energy electron diffraction advantage, 26 high Miller-index surfaces, 26 surface structure, 3, 25-27, 53-54
M Many-body perturbation theory core-extensive open-shell, 302-303 incomplete model space version, 308 open-shell, 292, 326, 354 Rayleigh-Schrodinger, 303-304 Many-particle Hamiltonian under complex scaling, 217-221 application of Hartree-Fock method, 221 Born-Oppenheimer approximation, 220 dilatation operator, 217-218 matrix property, 219 Coulombic, 218-220 stability problem, 188 symmetry property, 188 Many-particle operator, 187 similarity transformation, 199-216 associated kernels, 203 eigenfunction, 200-201 Hilbert space, 199 invariance of complex symmetry, 211-216
invariance property, 201 invariance theorem, 214 many-particle projector, 202-203 operator is self-adjoint and real, 214 real complex symmetric operator, 209-211 restricted, 211-216 Tarfala theorem, 201-205 transformation formula, 203 unbounded operator, 199 Many-particle projector, 202-203 MCSCF method biconfigurational, total energy of CH,, 418-420 comparison with elementary Jacobi rotations, 417-420 Medium-energy electron diffraction, surface structure, 53 MEED, chain method, 67 Method of complex scaling, see Complex scaling method Minima, characterization, 243-245 MO energy increment, non-null Eij coefficients, 405-406 MO functions, elementary Jacobi rotations, 380-383 MO integrals, monoconfigurational energy expressions, in terms of, 389 Molecular adsorption, 119-121 surface structure, 168-171 Molecular conformation, prediction through models, 241 Molecular integrals, two-electron, elementary Jacobi rotations, 386-389 Monoconfigurational density matrix, elementary Jacobi rotations, 383-384 Monoconfigurational energy expressions, 389-394 energy variation parameters, 394 Fock matrix, 393 Fock operators, 390-391 polynomial coefficients, 390 Roothaan-Bagus atomic energy, 391-394 rotation sine optimization, 423-424 sets of occupation numbers, 391 two-electron contribution hypermatrices, 392 variation structure, 389-391 MO pair, elementary Jacobi rotations, 381 MO rotations, multiconfigurational energy expression, 398-399
Index Muffin-tin model, 58 multiple electron scattering, 62 Mulliken population analysis, 98-99 Multi-commutator Haussdorff formula, 336 Multiconfigurational energy, 395-420 choosing active orbital space, 412-414 choosing CI space, 408, 410-412 CI-MO computation algorithms, 400-401 CI wave function, 398 computational procedure, 440-441 electronic energy increment structure, 401,403 electronic energy variation, 402-404 elementary Jacobi rotation, 387 extension to other molecular systems, 408-410 geometry optimization of AH, systems, 415-417 global convergence tests, 401-404 Haussdorf transformation, 397 MO rotations, 398-399 rotation sine optimization, 424-426 total energy, 401-402 ground state, 408-409 selected configurations, 408, 410 unitary transformations, 3%-397 variational coefficients, 437-439 variational structure, 396-398 Multiple electron scattering, 61-65 approximations, 66-69 forward, 76 high-energy fine-structure techniques, 67-68 at higher energies, 62 higher-order events, 79 incident electron beam, 65 intensity modulation, 68 at low energies, 61-62 matrix dimension, 64 muffin-tin model, 62 plane-wave expansion, 64-65 point-group symmetries, 64 range, 77-78 scattering amplitude, 63 versus single, 76-71 Multiple-scattering method, 89 Multireference CI approach, 325-326 Multi-root Hilbert space formalisms, 331-332 Murtagh-Sargent method, compared with BFGS, 280
449
N Near-edge fine structure techniques, surface structure, 30 Near-edge x-ray absorption fine structure, surface structure, 30, 50 Newton iterations, hydrogen peroxide, 267,269 Newton’s method line search strategy, 251-252 optimizations, 271-274 Hessian eigenvalues, 273-274 number of cycles required, 272 NH, BFGS optimization, 275, 277, 282-283 DFP update optimization, 275, 278 exact Hessian optimization, 273-274 Greenstadt update in optimization, 279 Rank 1 update optimization, 275-276 Ni,O cluster, 99, 102 Ni,,O, cluster, 102-103 Nitric oxide, chemisorption on metals, surface structure, 157-158
0 One-particle subspaces, stable under complex conjugation, 233-236 Open-shell coupled cluster theory, waveoperator, 345 Open-shell cluster expansion approach, 301, 305-309 core-excitation, 306-307 Fock-space strategy, 305-397 Hilbert space, 305-306 intermediate normalization, 308-309 occupation number representation, 307 quasi-complete model space, 308 virtual space determinants, 308 Open-shell many body formalism, 292, 302-310, 328 arrow conventions, 310-311 classification of orbitals into holes and particles, 310-311 earlier perturbative developments, 302-304 energydependent effective Hamiltonian, 303 Hugenholtz convention, 309-310 linked valence expansion, 303-304 open-shell cluster expansion approach, 305-309 propagator methods, 304-305
Index
450
Open-shell many-body perturbation theory, 292, 326, 354 Optical techniques, surface structure, 37 Optimization algorithms, 245 Oxygen binding site on Si (loo), 103-104 chemisorption GaAs (110), 108, 110-111 metals, 136-138 Ni(100), ordering, 114 molecular ionization energies, 99
P Pair-correlation model, 299 Phase shifts, 59 EXAFS,69 temperature-dependent, 60-61 Photoelectron diffraction, surface structure, 28-29 Photoelectron emission, 69 Photoemission matrix elements, 70-71 Photon propagation, 69-70 Physisorption, 115 T-back-bonding, 97 Plane waves, 57-58 Potential energy searches, 241-242 computational procedures, 266-267 summary of optimizations, 284 Potential energy surface characterization of minima, 243-245 line search strategy, see Line search strategy optimization algorithms, 245 Projection operator, antisymmetric, 201-202 Propagator methods, 304-305 Pseudo-selection rule, 73
Q Quantum mechanical objects, elementary Jacobi rotations, 381 Quasi-complete model space, 308 classification of operators, 356-357 Quasi-complete model space theory, 354 Quasi-Newton methods line search strategy, 252-255 with non-unit starting Hessians, 281-283 optimization cycles required, 272 optimization of hydrogen peroxide, 271 with unit Hessian, 274-281 BFGS optimization, 275, 277
comparison of MS and BFGS procedures, 280 DFP update optimization, 275, 278 Rank 1 update optimization, 275-276
R Rayleigh-Schrodinger many-body perturbation theory, 303-304 Reconstructed metals, surface structure, 131-134 Reflection high-energy electron diffraction, surface structure, 53 Resonant orbital energy, 223-226, 229 Restricted Hartree-Fock scheme, 210 Restricted step methods, line search strategy, 258-260 RHEED, chain method, 67 Roothaan-Bagus atomic energy, 391-394 Rotation angle, 219 Rotation sine optimization, 421-433 derivatives, 421-422 exact solution, 421-427 Brillouin theorems, 426-427 monoconfigurational energy, 423-424 multiconfigurational energy, 424-426 simple procedure, 421-423 modification of q u a h t i c equation, 422-423 performance improving procedures, 427-430 approximate optimal solutions, 428-429 rotation selection and filtering, 429-430
S s-band, hybridization with d-band, 97 Scale factor, 217, 219, 221 Scanning tunneling microscopy, surface structure, 34 Scattered wave, 60 Scattering amplitude, 60 Schrodinger equations, 327 Secondary-ion mass spectrometry, 36 Second harmonic generation, surface structure, 37 Segr6 characteristics, 192 Selenium, chemisorption on metals, 139 Self-consistent field Hartree-Fock method, 101-102 Self-consistent-field-Xa-scatteredwave method, 89, 92 adsorption of ethylene on Ni, 96-97
index advantages and disadvantages, 94, 98 calculations, 92-95 carbon monoxide adsorbate bonding geometries, 95 bonding to platinum (100) surface, 97 core and valence photoemission spectra, 95-96 core-level satellite structure, 96 electronic wave functions, 92 exchange potential, 92 molecular orbital study, 96 muffin-tin approximation to potential, 98 orbital energies, 94 r-bonded model, 96-97 potentials, 93-94 Self-consistent-field-Xwscatteredwave method, 89, 92 results, 95-98 Semiconductors atomic adsorption on, 165 surface structure, 149-155 Semi-cluster expansion theories, 310-323 anonymous parentage approximation scheme, 325 CC-LRT, 315-319 computation of transition moments, 323 core-correlation, 311-312 eigenvalue equation, 319 excitation operator, 312 hermitian matrices, 313-314 h-h matrix elements, 319-320 IP and EE computations, 319-321 linked cluster theorem, 313-314 non-variational methods, 315-323 physical considerations, 310-313 spin-adapted version, 323 time-dependent version, 321-323 vacuum term, 322 variational methods, 313-315 Shape resonance, 222 Similarity transformation-based Fock-space theories, 335-344 classification of operators, 342 closed-shell cluster amplitudes, 339-340 cluster wave-operator, 344 correlated core energy, 338 decoupling conditions, 342-343 eigenvalue equations, 340 hermitian valence-shell Hamiltonian, 339 multi-commutator expansion, 336-337 one-valence block, 337-338
451
passive scattering, 337-338 passive valence lines, 336 S operators, 341 spectator scattering, 337-338 two valence blocks, 338-339 unitary cluster operator, 343 valence universal cluster operator, 335 valence universal wave-operator, 339-340 zero-valence core problem, 334, 336-338 Single root Hilbert space formalisms, 328-331 core excitations, 330 energy, 330 excitation operator, 329 wave operator, 329 Size extensive formulations, 301-302 incomplete model space, 353-364 applications, 360-361 Bloch equation, 358 complementary active space, 354 construction zero, 362 defining equations for S operators, 356 disconnected effective Hamiltonian, 354, 364 effective Hamiltonian matrix, 362-363 electron affinities, 357 excitation energies, 357 general formulation, 354-357 having valence-holes and -particles, 357-360 IP calculations, 361 linked cluster theorem, 354 model space projection, 359 parent model space, 362 p-h excited determinants, 357-359 products of quasi-open operators, 354-355 quasi-open S operators, 355 size-extensivity of energies, 361-364 valence-universal effective Hamiltonian, 355 valence universal wave operator, 358 quasi-complete model space, classification of operators, 356-357 Slater’s transition-state theory, 94 Slater type orbitals, 98 Small-angle inelastic scattering, 74 S operators, 341 defining equations, 356 quasi-open, 355 Spherical-wave, 58 outgoing, amplitude, 71
Index
452
Substrates, atomic penetration into, 118-119 Subsystem embedding condition, 348 Sulfur, chemisorption on metals, 138-139 Superlattice, 116 Surface chemistry, 81-82, see also Ab initio quantum chemical methods; Selfconsistent-field-Xwscattered wave method extended Hiickel theory, 83-89 Hartree-Fock-Slater method, 98-101 multiple-scattering method, 89 Surface crystallography, 112-121 adsorbate surface coverage, 114 Miller indices, 115 mutual atomic interaction, 113-114 notation for surface lattices, 115-116 ordering principles, 113-115 Wood notation, 116 Surface EXAFS inner potential, 76-77 surface structure, 32, 49-50 Surface extended energy loss fine structure, surface structure, 55 Surface strain, 110 Surface structure, 2-4, see also Multiple scattering angle integrated mode, 41 angle-resolved auger electron spectroscopy, 56 angle resolved mode, 41 angle-resolved photoelectron emission spectroscopy, 51 angle-resolved photoemission finestructure, 28-29, 52 angle-resolved ultraviolet photoemission spectroscopy, 51 angle-resolved x-ray photoelectron spectroscopy, 51-52 angle-resolved x-ray photoemission, 51 atomic adsorbates on metal surfaces, 144-146
atomic beam diffraction, 33-34 atomic penetration into substrates, 118-119 bond angles versus bond directions, 81 bond length contractions and reconstructions, 117-118 chalcogen chemisorption on metals, 136-139
chemical composition, 36 clean metal structures, 122-126 cluster model, 82
CO, di-nitrogen and nitric oxide chemisorption on metals, 157-158 complementary techniques, 35-38 determination, 4-5 diffuse low energy electron diffraction, 27-28 electron diffraction, 5, 25-29, 39-41 electron energy loss near-edge structure, 55-56 electronic spectroscopy, 36 electron loss spectroscopy, 54 electron microscopy, 53 electron-stimulated desorption, 37-38 energy versus time diagrams, 42-46 extended appearance potential fine structure, 52-53 extended fine-structure techniques, 31 extended-surface model, 82 extended x-ray absorption fine structure,49 extended x-ray energy loss fine structure, 55 field ion microscopy, 33 fine structure techniques, 29-32 forms of disorder, 79-80 future needs and directions, 173 high-resolution electron loss spectroscopy, 54, 119 inelastic low-energy electron diffraction, 54 insulators, 160-163 ion scattering, 35 layer spacing versus bond lengths, 80-81 low energy electron diffraction, 3, 25-27, 53-54 medium-energy electron diffraction, 53 molecular adsorption, 119-121, 168-171 multiple scattering, 47 near-edge fine structure techniques, 30 near-edge x-ray absorption fine structure, 50 numbers and types, 117 optical techniques, 37 ordering, 113 photoelectron diffraction, 28-29 process formalisms. 56-57 reconstructed metals, 131-134 reflection high-energy electron diffraction, 53 scanning tunneling microscopy, 34 semiconductors. 149-155 S E W S , 32, 49-50 single versus multiple scattering, 76-77 structural information, 41, 47 electron propagation, 58-59
Index
surface extended energy loss fine structure, 55 surface topography techniques, 32-35 techniques, 6-24 unsaturated hydrocarbons, 120-121 vibrational spectroscopy, 37 x-ray absorption near-edge structure, 50 Surface topography techniques, surface structure, 32-35 Symmetry dilemma, 210
T Tarfala theorem, 189, 221, 231 many-particle operator, 201-205 Tellurium, chemisorption on metals, 139 Theoretical model, 293-298 all-electron extensive models, 296 core-extensive models, 2% core-valence extensive models, 296-297 criteria, 294-296 size-consistent, 295, 297 size-extensivity, 294, 297 Thermal desorption spectroscopy, 36 Two-electron molecular integrals, elementary Jacobi rotations, 386-389
453
U Unitary transformations, 396-397 Unsaturated hydrocarbons, surface structure, 120-121
V Valence lines, passive, 336 Valence universal cluster operator, 335 Valence universal wave-operator, 333, 335, 347, 358 Bloch, 345 Variable metric methods, line search strategy, 252-255 Vibrational spectroscopy, surface structure, 37
W Water, ground state, total energy, 413-414 Wick's theorem, 346 Wolfsberg-Helmholz formula, 83
X X-ray absorption near-edge structure, surface structure, 30, 50
This Page Intentionally Left Blank