,e)ds( S°' "" o i]£ + ot at t
or x 3 dx 2
P, = x - ^ ^
1-i^i
x 2
x dx Finally 2_
0
x
2
3
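Alternatively, a recursion formula can be used (e.g., Scheid, 1968)

$$(n+1)\,P_{n+1}(x) = (2n+1)\,x\,P_n(x) - n\,P_{n-1}(x)$$

which, from the same seed $P_0 = 1$ and $P_1 = x$, produces the sequence

$$P_2 = \tfrac{1}{2}(3x^2-1), \quad P_3 = \tfrac{1}{2}(5x^3-3x), \quad P_4 = \tfrac{1}{8}(35x^4-30x^2+3), \quad P_5 = \tfrac{1}{8}(63x^5-70x^3+15x) \qquad (2.6.13)$$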
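The recursion can be checked with a few lines of code. The following is a minimal sketch, assuming the three-term recursion $(n+1)P_{n+1} = (2n+1)xP_n - nP_{n-1}$ with the seeds $P_0 = 1$ and $P_1 = x$; the function name `legendre_coefficients` is chosen here for illustration only.

```python
from fractions import Fraction


def legendre_coefficients(n_max):
    """Coefficient lists (lowest power first) of P_0 ... P_n_max.

    Illustrative helper: applies (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}
    with exact rational arithmetic.
    """
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]  # P_0 = 1, P_1 = x
    for n in range(1, n_max):
        p_n, p_nm1 = polys[n], polys[n - 1]
        x_p_n = [Fraction(0)] + p_n  # multiplying by x shifts the powers up
        nxt = []
        for k in range(n + 2):
            a = x_p_n[k]
            b = p_nm1[k] if k < len(p_nm1) else Fraction(0)
            nxt.append(((2 * n + 1) * a - n * b) / (n + 1))
        polys.append(nxt)
    return polys[: n_max + 1]


if __name__ == "__main__":
    for degree, coeffs in enumerate(legendre_coefficients(5)):
        print(degree, [str(c) for c in coeffs])
```

Exact fractions are used so that the printed coefficients can be compared directly with the closed forms listed above.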
Linear algebra
Figure 2.12 The first Legendre polynomials.
Because of a different normalization
2.6.4 Associated Legendre polynomials

Legendre polynomials are one specific variety of a more extended class of orthogonal polynomials called associated Legendre polynomials. An associated Legendre polynomial $P_l^m(x)$ is defined relative to an ordinary Legendre polynomial $P_l(x)$ through

$$P_l^m(x) = (-1)^m (1-x^2)^{m/2}\,\frac{d^m P_l(x)}{dx^m} \qquad (2.6.14)$$

Alternative definitions lack the $(-1)^m$ term. $P_l(x)$ is therefore a concise expression for $P_l^0(x)$. Note that, because of the derivative term in equation (2.6.14), $P_l^m(x) = 0$ if $m > l$. Advanced calculus would show orthogonality properties with

$$\int_{-1}^{1} P_l^m(x)\,P_{l'}^m(x)\,dx = 0 \qquad (l \neq l')$$
2.6 Linear function spaces
Table 2.4. The first associated Legendre polynomials up to l = m = 2.

l    m    P_l^m(x)
0    0    1
1    0    x
1    1    -(1 - x^2)^(1/2)
2    0    (1/2)(3x^2 - 1)
2    1    -3(1 - x^2)^(1/2) x
2    2    3(1 - x^2)
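and

$$\int_{-1}^{1}\left[P_l^m(x)\right]^2 dx = \frac{2}{2l+1}\,\frac{(l+m)!}{(l-m)!}$$

The numerical generation of associated Legendre polynomials is discussed by Press et al. (1986). These authors use the following recurrence on $l$

$$(l-m)\,P_l^m(x) = x(2l-1)\,P_{l-1}^m(x) - (l+m-1)\,P_{l-2}^m(x) \qquad (2.6.15)$$

and the two starting values

$$P_m^m(x) = (-1)^m (2m-1)!!\,(1-x^2)^{m/2} \qquad (2.6.16)$$

where the double factorial $n!!$ denotes the product of all the odd integers $\le n$, and

$$P_{m+1}^m(x) = x(2m+1)\,P_m^m(x) \qquad (2.6.17)$$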
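A minimal sketch of this upward recurrence is given below. It assumes the sign convention with the $(-1)^m$ factor, the recurrence on $l$ written as $(l-m)P_l^m = x(2l-1)P_{l-1}^m - (l+m-1)P_{l-2}^m$, and the two starting values $P_m^m$ and $P_{m+1}^m$. The function name `assoc_legendre` is an illustrative choice and not the routine of Press et al.

```python
import math


def assoc_legendre(l, m, x):
    """Associated Legendre polynomial P_l^m(x) for 0 <= m and |x| <= 1.

    Illustrative sketch of the upward recurrence on l; returns 0 for m > l.
    """
    if m < 0 or abs(x) > 1.0:
        raise ValueError("requires m >= 0 and |x| <= 1")
    if m > l:
        return 0.0
    # Starting value P_m^m = (-1)^m (2m-1)!! (1 - x^2)^(m/2)
    pmm = 1.0
    root = math.sqrt((1.0 - x) * (1.0 + x))
    odd = 1.0
    for _ in range(m):
        pmm *= -odd * root
        odd += 2.0
    if l == m:
        return pmm
    # Second starting value P_{m+1}^m = x (2m+1) P_m^m
    pmmp1 = x * (2 * m + 1) * pmm
    # Recurrence on l for l >= m + 2
    for ll in range(m + 2, l + 1):
        pll = (x * (2 * ll - 1) * pmmp1 - (ll + m - 1) * pmm) / (ll - m)
        pmm, pmmp1 = pmmp1, pll
    return pmmp1


if __name__ == "__main__":
    x = 0.5
    for l in range(3):
        for m in range(l + 1):
            print(l, m, assoc_legendre(l, m, x))
```

The printed values can be compared with the closed forms of Table 2.4 evaluated at the same $x$.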
Examples of ordinary Legendre polynomials with $m = 0$ are

$$P_0^0(x) = 1$$

$$P_1^0(x) = x \times 1 \times P_0^0(x) = x$$

$$(2-0)\,P_2^0(x) = x \times 3 \times P_1^0(x) - 1 \times P_0^0(x) = 3x^2 - 1$$

The first polynomials $P_l^m(x)$ are given in Table 2.4. The number of associated Legendre polynomials up to the order $l$ is $(l+1)(l+2)/2$.

2.6.5 Spherical harmonics
Some global problems deal with the variations of some geochemical parameters on the surface of the Earth. Problems of that sort have recently appeared when, for example, the world-wide distribution of the 206 Pb/ 204 Pb and other isotopic ratios
in oceanic basalts permitted Dupre and Allegre (1983) to identify large-scale anomalous regions in the southern hemisphere for which Hart (1984) coined the name of 'Dupal anomaly'. The need to account for these variations with series of functions orthogonal on a sphere meets that encountered by geophysicists when trying to extract the most significant part of the variations in the gravity equipotential (geoid). These problems can be dealt with adequately using spherical harmonics. Spherical harmonics closely resemble normal Fourier harmonics except that they are functions of both the latitude and the longitude instead of the linear abscissa on a standard axis. Bi-dimensional Fourier analysis on a plane exists but is inadequate since the most desirable property of the requested expansion is the orthogonality of its components upon integration over the surface of the Earth, assumed to be spherical for most practical purposes. The standard coordinates on the surface of a sphere of radius r, i.e., the spherical coordinates, are the longitude
J the Earth surface
f(
(2.6.18)
/ 2 (4>, 6) dS{(j), 6) = const
(2.6.19)
and
r
I
the Earth surface
where the functional dependence of the surface element $dS$ on $\varphi$ and $\theta$ is to be determined. From Figure 2.13, we see that the arc length elements $dl_\varphi = r\sin\theta\,d\varphi$ and $dl_\theta = r\,d\theta$ satisfy the length conditions since (1) integrating $r\sin\theta\,d\varphi$ at constant $\theta$ for $\varphi$ varying from 0 to $2\pi$ gives the circumference $2\pi r\sin\theta$ of the small circle, and (2) integrating $r\,d\theta$ at constant $\varphi$ for $\theta$ varying from $-\pi$ to $\pi$ gives the circumference $2\pi r$ of the large circle. Likewise the surface element $dS = dl_\theta\,dl_\varphi$ satisfies the surface condition

$$\int_0^{2\pi}\!\!\int_0^{\pi} r\sin\theta\,d\varphi\;r\,d\theta = r^2\int_0^{2\pi}d\varphi\int_0^{\pi}\sin\theta\,d\theta = r^2\int_0^{2\pi}d\varphi\int_{-1}^{1}d(\cos\theta) = r^2 \times 2\pi \times 2 = 4\pi r^2 \qquad (2.6.20)$$

which is the surface $S$ of the sphere. Further calculations will be carried out with reference to a sphere of unit radius, which can always be arrived at by proper normalization.

Figure 2.13 The system of spherical coordinates: $\varphi$ is the longitude and $\pi/2-\theta$ is the latitude.

We get

$$S = \int_{\text{whole sphere}} dS = \int_0^{2\pi}d\varphi\int_{\theta=0}^{\theta=\pi} d(-\cos\theta) = 4\pi$$

which shows that the surface element is $dS = -d\varphi\,d(\cos\theta)$. As in the case of Fourier components, we will refrain from using complex variables. Instead, we will handle spherical harmonics as two separate sets of orthogonal functions $C_l^m(\varphi,\theta)$ and $S_l^m(\varphi,\theta)$ such that
Cr(
(2.6.21)
4TT
(2.6.22)
™^ 9) =
4n
Example: —^ -si 47r(2+l)! i.e.,
/
i 16n
Figure 2.14 The spherical harmonic $S_3^2(\varphi,\theta)$, plotted against longitude and latitude.
Figure 2.14 shows the spherical harmonic S32 (>,9). The normalization property of associated Legendre polynomials stated above, guarantees that these functions are orthogonal over the surface of a sphere
[2
(2.6.23)
A bounded function $f(\varphi,\theta)$ can be expanded as an infinite series of $C_l^m(\varphi,\theta)$ and $S_l^m(\varphi,\theta)$ over the surface of the sphere

$$f(\varphi,\theta) = \sum_{l=0}^{\infty}\sum_{m=0}^{l}\left[\alpha_{lm}\,C_l^m(\varphi,\theta) + \beta_{lm}\,S_l^m(\varphi,\theta)\right] \qquad (2.6.24)$$

where the $\alpha_{lm}$ and $\beta_{lm}$ coefficients are to be found from an integral expression similar to that used for Fourier coefficients. Up to a given order $l$ there is a total of $(l+1)(l+2)-(l+1) = (l+1)^2$ $\alpha_{lm}$ and $\beta_{lm}$ coefficients (since all $\beta_{l0}$ are zero), e.g., 49 coefficients are needed to expand a function in spherical harmonics to the order 6. Likewise, $\alpha_{00}$ is the average value of the function over the sphere. A typical application of spherical harmonics will be given in Chapter 5.
3 Useful numerical analysis
3.1 Functions of a single variable

3.1.1 Derivatives

The derivative of the function $f(x)$ with respect to the independent variable $x$ is the scalar number defined as

$$\frac{df(x)}{dx} = f_x'(x) = \lim_{\Delta x \to 0}\frac{f(x+\Delta x)-f(x)}{\Delta x}$$

$f_x'(x_0)$ represents the slope of the tangent to the curve $y = f(x)$ at $x = x_0$. An extremum, i.e., either a minimum or a maximum, corresponds to a null derivative. The quantity

$$\frac{d^n f(x)}{dx^n}$$

is the derivative of order $n$ of $f(x)$ with respect to the variable $x$ and is obtained by applying the derivation formula $n$ times.

• The log derivative of a function $f(x)$ is the derivative of $\ln f(x)$, i.e.,

$$\frac{d\ln f(x)}{dx} = \frac{f_x'(x)}{f(x)} \qquad (3.1.1)$$

The logarithmic differential of the function $f(x)$ is defined as $df(x)/f(x)$. If $f(x)$ can be written as a ratio of two functions $g(x)/h(x)$, then

$$\frac{df(x)}{f(x)} = \frac{dg(x)}{g(x)} - \frac{dh(x)}{h(x)} \qquad (3.1.2)$$
Example:

$$\frac{1+x}{x^2}\;d\!\left(\frac{x^2}{1+x}\right) = \left(\frac{2}{x} - \frac{1}{1+x}\right)dx$$
<& Optimum spike addition (Webster, 1960). We now deal with a specific example
that can be extended to any isotopic pair. Let us assume that $^{150}$Nd spike is added to 100 mg of a sample with ca. 10 ppm Nd in order to measure sample neodymium concentrations. Isotopic proportions of the 148 and 150 isotopes are 5.73 and 5.62 percent in natural Nd (molar weight 144.24), and 0.40 and 97.25 percent, respectively, in the Oak Ridge $^{150}$Nd spike. Calculate the optimum amount of a spike containing 12.5 nmol $^{150}$Nd per gram of solution to be added to this sample which minimizes the error on the calculated sample concentration. Assume that sample and spike Nd isotope compositions are perfectly known.

Let us consider the two isotopes $^{148}$Nd and $^{150}$Nd and measure the isotope composition of the spiked mixture. Using the subscripts sa, sp, and m for sample, spike, and mixture, it was shown in Section 1.3 how to calculate the spiking ratio $r$

$$r = \frac{N_{sa}^{150\mathrm{Nd}}}{N_{sp}^{150\mathrm{Nd}}} = \frac{\left(\dfrac{^{148}\mathrm{Nd}}{^{150}\mathrm{Nd}}\right)_{sp} - \left(\dfrac{^{148}\mathrm{Nd}}{^{150}\mathrm{Nd}}\right)_{m}}{\left(\dfrac{^{148}\mathrm{Nd}}{^{150}\mathrm{Nd}}\right)_{m} - \left(\dfrac{^{148}\mathrm{Nd}}{^{150}\mathrm{Nd}}\right)_{sa}}$$

where $N_{sa}^{150\mathrm{Nd}}$ and $N_{sp}^{150\mathrm{Nd}}$ refer to the numbers of $^{150}$Nd atoms of sample and spike, respectively, present in the mixture. Using $x$ to represent the $^{148}$Nd/$^{150}$Nd ratio, we get the more compact form

$$r = \frac{x_{sp}-x_m}{x_m-x_{sa}}$$

and try to find the value of $x_m$ which makes the relative error on $r$ a minimum. Assuming that the isotopic compositions $x_{sa}$ and $x_{sp}$ are perfectly known for the spike and sample, the relative error $dr/r$ on the spiking ratio $r$ is

$$\frac{dr}{r} = d\ln r = \frac{x_m(x_{sa}-x_{sp})}{(x_{sp}-x_m)(x_m-x_{sa})}\,\frac{dx_m}{x_m} = \gamma\,\frac{dx_m}{x_m}$$

The relative error on the measurement $dx_m/x_m$ is amplified by the factor $\gamma$ to give the relative error on the spiking ratio $dr/r$. The amplification term $\gamma$ goes to infinity for the extreme cases $x_m = x_{sa}$ (no spike) and $x_m = x_{sp}$ (no sample). Given that $(x_{sp}-x_{sa})$ is constant, $\gamma$ is minimum for

$$\frac{d}{dx_m}\left[\frac{x_m}{(x_{sp}-x_m)(x_m-x_{sa})}\right] = 0$$

or

$$(x_{sp}-x_m)(x_m-x_{sa}) - x_m(-x_m + x_{sa} + x_{sp} - x_m) = 0$$

which is equivalent to

$$x_m^2 = x_{sa}\,x_{sp} \qquad (3.1.3)$$
and hence

$$x_m = \sqrt{x_{sa}\,x_{sp}} \qquad (3.1.4)$$

Figure 3.1 Optimization of spike addition for the isotope dilution technique: $\gamma$ is the error amplification factor, plotted against the $^{148}$Nd/$^{150}$Nd ratio in the sample-spike mixture, $x_m$.
Relative error on isotope dilution measurements is minimum whenever the mixture isotopic ratio is the geometric mean of the spike and sample isotopic ratios, i.e., the square root of their product. In the present case, $x_{sa} = 5.73/5.62 = 1.0196$, and $x_{sp} = 0.40/97.25 = 0.004113$. We can plot the amplification factor modulus $|\gamma|$ as a function of the isotopic ratio in the mixture (Figure 3.1). Minimum amplification is obtained for

$$x_m = \sqrt{1.0196 \times 0.004113} = 0.0648$$

which corresponds to a spiking ratio $r = 0.06351$. The assay contains approximately

$$\frac{0.100\ \mathrm{g} \times 10 \times 10^{-6}}{144.24\ \mathrm{g\,mol^{-1}}} = 6.93\ \mathrm{nmol\ Nd}$$

or 0.0562 × 6.93 nmol = 0.3895 nmol of natural $^{150}$Nd. Addition of

$$\frac{0.3895}{0.06351} = 6.133\ \mathrm{nmol}\ ^{150}\mathrm{Nd}$$

from the spike, amounting to 6.133/12.5 = 0.49 g of spike solution, minimizes error propagation on concentration.
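The arithmetic of this example can be retraced with a short script. This is only a sketch under stated assumptions: the spiking ratio is taken as $r = (x_{sp}-x_m)/(x_m-x_{sa})$, the optimum mixture ratio as $x_m = \sqrt{x_{sa}x_{sp}}$, and the amplification factor as the logarithmic derivative of $r$ with respect to $x_m$. The function names are illustrative.

```python
import math


def amplification(x_m, x_sa, x_sp):
    """Error amplification factor gamma = d ln r / d ln x_m."""
    return x_m * (x_sa - x_sp) / ((x_sp - x_m) * (x_m - x_sa))


def optimum_spike(x_sa, x_sp, n_sa_150, spike_conc):
    """Return (x_m, r, nmol of spike 150Nd, grams of spike solution).

    n_sa_150 is in nmol and spike_conc in nmol 150Nd per gram of solution.
    """
    x_m = math.sqrt(x_sa * x_sp)          # optimum mixture ratio
    r = (x_sp - x_m) / (x_m - x_sa)       # N_sa / N_sp for 150Nd
    n_sp_150 = n_sa_150 / r
    return x_m, r, n_sp_150, n_sp_150 / spike_conc


x_sa = 5.73 / 5.62                        # 148Nd/150Nd in natural Nd
x_sp = 0.40 / 97.25                       # 148Nd/150Nd in the spike
nd_nmol = 0.100 * 10e-6 / 144.24 * 1e9    # nmol Nd in 100 mg at 10 ppm
n_sa_150 = 0.0562 * nd_nmol               # nmol 150Nd in the sample

x_m, r, n_sp_150, grams = optimum_spike(x_sa, x_sp, n_sa_150, 12.5)

if __name__ == "__main__":
    print(f"x_m = {x_m:.4f}, r = {r:.5f}")
    print(f"spike 150Nd = {n_sp_150:.3f} nmol, solution = {grams:.3f} g")
    print(f"gamma at optimum = {amplification(x_m, x_sa, x_sp):.3f}")
```

Evaluating `amplification` on either side of `x_m` is a quick way to confirm that the stationary point is indeed a minimum of $|\gamma|$.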
3.1.2 Equation of the tangent to a curve

Tangents to curves and surfaces play a key role in a number of petrologic or geochemical systems which undergo infinitesimal changes. Finding the compositional changes associated with the segregation of a mineral phase and deciding whether a system is stable relative to small perturbations are problems that commonly need the tangent equations to be found. Given the curve with equation $y = f(x)$, the limit of the chord intersecting the curve in $x$ and $x_0$ is the tangent to the curve at $x = x_0$. The slope $s$ of this tangent is given by

$$s = \lim_{x \to x_0}\frac{f(x)-f(x_0)}{x-x_0} = \left.\frac{df(x)}{dx}\right|_{x=x_0}$$

while the tangent equation must satisfy

$$\frac{y - f(x_0)}{x - x_0} = s$$

and therefore

$$y = f(x_0) + \left.\frac{df(x)}{dx}\right|_{x=x_0}(x-x_0) \qquad (3.1.5)$$
<& Cumulate control lines vs liquid lines of descent. In a study of the 1931-1986 basaltic eruptions from the Reunion Island (Indian Ocean), Albarede and Tamagnan (1988) found the Ni and Cr concentrations (in ppm) listed in Table 3.1 and shown in Figure 3.2. Samples with Ni>100ppm were found to contain large amounts of cumulus olivine. The last five samples of Table 3.1 are picrites. The smooth trend observed for all the rocks is suggestive of a complementary fractionationaccumulation relationship out of a single magma batch. Therefore, it is asked whether the trend observed for Ni and Cr in basalts with less than 100 ppm Ni may be ascribed to the removal of the olivine found in the cumulates. When minerals fractionate from a crystallizing magma, mass balance requires that, in a linear plot of Cr vs Ni, the points representing the composition of the parental
Table 3.1. Ni and Cr concentrations in ppm in Piton de la Fournaise lavas, Reunion Island, 1931-1986 (Albarede and Tamagnan, 1988).

Ni    Cr      Ni    Cr      Ni    Cr      Ni    Cr
55    58      74    191     79    230     144   494
56    76      74    181     80    166     146   448
61    66      74    145     81    189     157   523
62    82      75    224     83    223     203   555
65    150     75    177     88    236     462   903
67    128     77    157     88    248     802   1547
70    106     77    203     96    272     870   1700
71    123     78    184     100   363     890   1635
72    124     79    181     112   380     975   1740
73    160     79    172     118   442

Figure 3.2 Plot in log-log scales of the Ni and Cr concentrations (ppm) in post-1930 basalts (open symbols) and picrites (solid symbols) from the Piton de la Fournaise volcano, Reunion Island, Indian Ocean (Albarede and Tamagnan, 1988).
magma, the residual magma and the average cumulate define a straight line (see Chapter 1 and Figure 3.3). In order to define the composition of the solid in equilibrium with the magma at a given point of its fractionation history, let the residual magma composition approach that of the parental magma. We conclude that the instantaneous cumulate must lie on the tangent to the locus of liquids (the so-called liquid line of descent) at the point representing the magma (Figure 3.3). The tangent line is also the locus of all combinations between the magma and the instantaneous cumulates, i.e., the locus of the cumulative rocks belonging to a magmatic stage with a unique differentiation extent: this is the cumulate control line (by reference to the widely used olivine control line). The model developed below is from Albarede (1976).
Figure 3.3 Schematic residual magma-cumulate relationships for instantaneous and average magmatic products in a linear plot of Ni and Cr concentrations (points shown: parent magma, residual liquid, instantaneous cumulate, bulk cumulate). For mass balance to be obeyed, the instantaneous cumulate must lie on the tangent to the liquid line of descent at the point representing the instantaneous liquid.
In the present case, the question is whether Reunion olivine-rich rocks belong to a cumulate control line of the basaltic rocks. Through a simple log-log regression, we find that the basalt trend (= the liquid line of descent) can be approximated by a power law, a form justified in Sections 1.5 and 9.3.

$$C_{liq}^{Cr} = a\left(C_{liq}^{Ni}\right)^b \qquad (3.1.6)$$

The equation of the tangent to this curve in the (Ni, Cr) plane at $C^{Ni} = C_{liq}^{Ni}$ and $C^{Cr} = C_{liq}^{Cr}$ is

$$C^{Cr} - C_{liq}^{Cr} = \frac{dC_{liq}^{Cr}}{dC_{liq}^{Ni}}\left(C^{Ni} - C_{liq}^{Ni}\right) \qquad (3.1.7)$$

or, since

$$\frac{dC_{liq}^{Cr}}{dC_{liq}^{Ni}} = ab\left(C_{liq}^{Ni}\right)^{b-1} \qquad (3.1.8)$$

From the last two equations, we get

$$C^{Cr} = C_{liq}^{Cr} + ab\left(C_{liq}^{Ni}\right)^{b-1}\left(C^{Ni} - C_{liq}^{Ni}\right) \qquad (3.1.9)$$

which can be rewritten

$$C^{Cr} - C_{liq}^{Cr} = ab\,\frac{\left(C_{liq}^{Ni}\right)^{b}}{C_{liq}^{Ni}}\left(C^{Ni} - C_{liq}^{Ni}\right) \qquad (3.1.10)$$

Finally, comparing with equation (3.1.6), we get

$$s = \frac{C^{Cr} - C_{liq}^{Cr}}{C^{Ni} - C_{liq}^{Ni}} = b\,\frac{C_{liq}^{Cr}}{C_{liq}^{Ni}} \qquad (3.1.11)$$
The slope of the linear array defined by the olivine-rich lavas (olivine control line) with more than 100 ppm Ni is found by linear regression to be $(C^{Cr} - C_{liq}^{Cr})/(C^{Ni} - C_{liq}^{Ni}) = 1.6$, whence we conclude that the liquid which could be involved in the olivine-rich lavas has a ratio Cr/Ni = 1.6/b = 1.6/2.8 ≈ 0.6, which is well out of the range of the Cr/Ni ratios in basalts (1 to 3). Therefore, olivine-rich lavas are not cumulative rocks genetically related to the basaltic sequence. The large value of b is probably related to the presence of spinel on the liquidus. This does not exclude, however, that they may be cumulates from an earlier differentiation stage with a smaller b value, i.e., before spinel saturation.
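The exponent of the power law can be estimated from the data of Table 3.1 with an ordinary least-squares fit of ln Cr against ln Ni. The sketch below is an illustration only: it assumes that the basalt subset is every sample with Ni below 100 ppm and that an unweighted regression is adequate, neither of which is stated in detail in the text. The names `BASALTS` and `power_law_fit` are hypothetical.

```python
import math

# (Ni, Cr) in ppm: samples of Table 3.1 with Ni < 100 ppm (assumed subset)
BASALTS = [
    (55, 58), (56, 76), (61, 66), (62, 82), (65, 150), (67, 128),
    (70, 106), (71, 123), (72, 124), (73, 160), (74, 191), (74, 181),
    (74, 145), (75, 224), (75, 177), (77, 157), (77, 203), (78, 184),
    (79, 181), (79, 172), (79, 230), (80, 166), (81, 189), (83, 223),
    (88, 236), (88, 248), (96, 272),
]


def power_law_fit(pairs):
    """Least-squares fit of ln(Cr) = ln(a) + b ln(Ni); returns (a, b)."""
    xs = [math.log(ni) for ni, _ in pairs]
    ys = [math.log(cr) for _, cr in pairs]
    n = len(pairs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(y_mean - b * x_mean)
    return a, b


a, b = power_law_fit(BASALTS)
control_line_slope = 1.6                  # slope quoted for olivine-rich lavas
cr_over_ni = control_line_slope / b       # Cr/Ni of the liquid implied by s = b Cr/Ni

if __name__ == "__main__":
    print(f"a = {a:.3e}, b = {b:.2f}, implied Cr/Ni = {cr_over_ni:.2f}")
```

The implied Cr/Ni ratio follows from the tangent slope relation $s = b\,C_{liq}^{Cr}/C_{liq}^{Ni}$ and can be compared with the range of ratios observed in the basalts.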
b
sol •
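At first sight, the answer would be $\Delta G = \Delta G_0 = G_{liq}(X_{liq}^b) - G_{sol}(X_{sol}^b)$, but we will show that this is wrong. We assume that $dn = dn^a + dn^b$ moles are transferred from the liquid to the solid phase. Let us assign the symbol $\mu$ to chemical potentials, e.g., $\mu_{liq}^a$ for species a in the liquid. Then

$$dn\,\Delta G = \mu_{sol}^a\,dn_{sol}^a + \mu_{sol}^b\,dn_{sol}^b + \mu_{liq}^a\,dn_{liq}^a + \mu_{liq}^b\,dn_{liq}^b$$

We have assumed that $dn = dn_{sol}^a + dn_{sol}^b$ and therefore

$$dn_{liq}^a = -dn_{sol}^a, \qquad dn_{liq}^b = -dn_{sol}^b \qquad (3.1.12)$$

hence

$$dn\,\Delta G = \left(\mu_{sol}^a - \mu_{liq}^a\right)dn_{sol}^a + \left(\mu_{sol}^b - \mu_{liq}^b\right)dn_{sol}^b \qquad (3.1.13)$$

The molar fraction of b in the newly formed solid is defined as

$$X_{sol}^b = \frac{dn_{sol}^b}{dn}$$

with a similar definition for the liquid fractions. The change $\Delta G$ in Gibbs energy upon transfer of $dn$ moles is

$$\Delta G = \left(1 - X_{sol}^b\right)\left(\mu_{sol}^a - \mu_{liq}^a\right) + X_{sol}^b\left(\mu_{sol}^b - \mu_{liq}^b\right) \qquad (3.1.14)$$

which, for reasons which will appear later, is rewritten as

$$\Delta G = X_{sol}^b\,\mu_{sol}^b + \left(1 - X_{sol}^b\right)\mu_{sol}^a - \left[X_{sol}^b\,\mu_{liq}^b + \left(1 - X_{sol}^b\right)\mu_{liq}^a\right]$$

This equation can be recast into

$$\Delta G = X_{sol}^b\,\mu_{sol}^b + \left(1 - X_{sol}^b\right)\mu_{sol}^a - \left[X_{liq}^b\,\mu_{liq}^b + \left(1 - X_{liq}^b\right)\mu_{liq}^a\right]$$
$$\qquad\quad + \left(X_{liq}^b - X_{sol}^b\right)\left(\mu_{liq}^b - \mu_{liq}^a\right) \qquad (3.1.15)$$

The terms on the right-hand side of the first line represent the difference between the molar free enthalpies of the solid solution at composition $X_{sol}^b$ and of the liquid solution at composition $X_{liq}^b$. In addition, a standard result for binary systems (e.g., Swalin, 1962) states for $G$ and $\mu$ the following relationship

$$\mu^b - \mu^a = \frac{dG}{dX^b} \qquad (3.1.16)$$

Applying equation (3.1.16) to the liquid results in

$$\Delta G = G_{sol}(X_{sol}^b) - G_{liq}(X_{liq}^b) - \left(X_{sol}^b - X_{liq}^b\right)\left.\frac{dG_{liq}}{dX^b}\right|_{X_{liq}^b} \qquad (3.1.17)$$

Let us label with an asterisk the value of the free enthalpy taken at $X_{sol}^b$ along the tangent to the curve of the liquid free enthalpy at $X_{liq}^b$ (Figure 3.4) so that

$$G_{liq}^{*}(X_{sol}^b) = G_{liq}(X_{liq}^b) + \left(X_{sol}^b - X_{liq}^b\right)\left.\frac{dG_{liq}}{dX^b}\right|_{X_{liq}^b} \qquad (3.1.18)$$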
which gives the simpler expression

$$\Delta G = G_{sol}(X_{sol}^b) - G_{liq}^{*}(X_{sol}^b) \qquad (3.1.19)$$

Figure 3.4 Change of Gibbs free enthalpy at the onset of crystallization of a solid solution with composition $X_{sol}^b$ from a liquid solution with composition $X_{liq}^b$. The change corresponds to $\Delta G$, not to $\Delta G_0$.
The change in Gibbs free enthalpy is therefore the difference measured at Xsolb between the G value of the solid and that taken along the tangent to the liquid curve (Figure 3.4). At equilibrium, Gliq and Gsol have a common tangent and therefore AG = 0. <=• Such a calculation can be extended to other molar quantities such as the volume change (AV) or the entropy change (AS). Unlike AG, the quantities AV and AS do not vanish at equilibrium. Although they derived the same tangent rule through a significantly more complicated theory, Walker et al. (1988) showed that in the context of olivine flotation in melts at high pressure, the slope of pressure-temperature (P, T) equilibrium univariant curves given by the Clapeyron rule (e.g., Denbigh, 1968) cannot be used to retrieve the relevant information on the actual change in density upon melting and crystallization in the mantle.
3.1.3 Leibniz's rule for the derivative of a definite integral

• Given $f(x, t)$ a function of time and of another variable $x$, and given the time-dependent limits $\alpha(t)$ and $\beta(t)$, Leibniz's rule states that

$$\frac{d}{dt}\int_{\alpha(t)}^{\beta(t)} f(x,t)\,dx = \int_{\alpha(t)}^{\beta(t)}\frac{\partial f(x,t)}{\partial t}\,dx + f[\beta(t),t]\,\frac{d\beta}{dt} - f[\alpha(t),t]\,\frac{d\alpha}{dt} \qquad (3.1.20)$$

Example:

$$\frac{d}{dt}\int_0^{kt}\sin(2tx)\,dx = \int_0^{kt} 2x\cos(2tx)\,dx + \sin(2kt^2) \times k - \sin(0) \times 0$$
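3.1.4 Taylor series

• The Taylor expansion of the function $f(x)$ about the value $x_0$ is

$$f(x) = f(x_0) + (x-x_0)\,f'(x_0) + \frac{(x-x_0)^2}{2!}\,f''(x_0) + \ldots$$

or in compact form

$$f(x) = \sum_{n=0}^{\infty}\frac{(x-x_0)^n}{n!}\,f^{(n)}(x_0) \qquad (3.1.21)$$

where the exclamation mark stands for the factorial. The Taylor-McLaurin expansion of the function $f(x)$ about $x_0 = 0$ is a particular case of equation (3.1.21)

$$f(x) = f(0) + x\,f'(0) + \frac{x^2}{2!}\,f''(0) + \ldots$$

It is particularly useful for approximating some functions in the vicinity of $x = 0$:

(a) the exponential function

$$e^x = e^0 + x\,e^0 + \frac{x^2}{2!}\,e^0 + \frac{x^3}{3!}\,e^0 + \ldots = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \ldots \qquad (3.1.22)$$

(b) the natural logarithm function

$$\ln(1+x) = \ln 1 + \frac{x}{1+0} - \frac{x^2}{2!\,(1+0)^2} + \frac{2x^3}{3!\,(1+0)^3} - \ldots = x - \frac{x^2}{2} + \frac{x^3}{3} - \ldots$$

(c) the power series

$$(1+x)^a = 1 + ax + \frac{a(a-1)}{2!}\,x^2 + \frac{a(a-1)(a-2)}{3!}\,x^3 + \ldots \qquad (3.1.23)$$

with the particular case for $a = -1$

$$\frac{1}{1+x} = 1 - x + x^2 - x^3 + \ldots \qquad (3.1.24)$$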
Isotopic fractionation provides illustrative examples of first-order expansions of unknown functions. In general, the mass spectrometric measurement $r_i^j$ of the ratio between two isotopes of mass $m_i$ and $m_j$ of the same element differs from the natural value $R_i^j$. Only a very small fraction of the original sample produces ions and different processes taking place in different parts of the mass spectrometer act differently on the sensitivity of each isotope. We assume that instrumental isotopic fractionation is mass-dependent.

Equilibrium fractionation. A simple fractionation law, called the linear law (e.g., Hofmann, 1971), relates the measured and natural isotopic ratios through a function $f(\Delta m_i^j)$ of the mass difference $\Delta m_i^j = m_j - m_i$ between the isotopes defining the ratios

$$r_i^j = R_i^j\,f(\Delta m_i^j) \qquad (3.1.25)$$

Following Russel et al. (1978), we write the Taylor expansion of $f(\Delta m_i^j)$ about $\Delta m_i^j = 0$ and get

$$f(\Delta m_i^j) = f(0) + \Delta m_i^j\,f'(0) + \text{higher-order terms}$$

and drop the terms which involve derivatives of order higher than one. As one isotope cannot be fractionated relative to itself ($\Delta m_i^j = 0$), $f(0) = 1$ and the linear fractionation law reads

$$r_i^j = R_i^j\left(1 + \Delta m_i^j\,\delta\right) \qquad (3.1.26)$$

where $\delta = f'(0)$ is a constant coefficient called the mass discrimination or mass bias per mass unit. Let us take the example of the $^{148}$Nd/$^{150}$Nd ratio

$$\left(\frac{^{148}\mathrm{Nd}}{^{150}\mathrm{Nd}}\right)_{measured} = \left(\frac{^{148}\mathrm{Nd}}{^{150}\mathrm{Nd}}\right)_{natural}(1 - 2\delta)$$

In a ratio-ratio plot (Figure 3.5), typically $r_i^{j_2}$ vs $r_i^{j_1}$, the ratios combine as

$$\frac{r_i^{j_2} - R_i^{j_2}}{r_i^{j_1} - R_i^{j_1}} = \frac{R_i^{j_2}\,\Delta m_i^{j_2}}{R_i^{j_1}\,\Delta m_i^{j_1}}$$

which shows that they are linearly related.
Mass discrimination with distillation effects. Let us assume that the isotope composition of an element is being measured by thermal ionization. This method consists in ionizing the sample atoms by evaporation on a metal filament. Statistical thermodynamics (e.g., Denbigh, 1968) tells us that, while vapor pressure is a function of the molecular weight of the isotope, the fraction evaporated and ionized is a complex function of ionization potential, temperature, work-function of the filament, etc. At a given time, we assume that the ionization parameters are fixed. Therefore, the ionization probability of an isotope $i$ and, equivalently, the proportion of atoms on the filament that per unit time eventually ionizes is a function of $m_i$ only. Let $g(m_i)$ be that proportion. If we call $n_i$ the number of atoms of isotope $i$, we can write

$$\frac{dn_i}{dt} = -g(m_i)\,n_i \qquad (3.1.27)$$

Figure 3.5 The domain of the linear law for mass-dependent discrimination between two isotopic ratios.

For two isotopes $i$ and $j$, we combine expressions (3.1.27) as

$$\frac{d\ln r_i^j(t)}{dt} = \frac{d\ln n_j}{dt} - \frac{d\ln n_i}{dt} = \frac{dn_j}{n_j\,dt} - \frac{dn_i}{n_i\,dt} = -\left[g(m_j) - g(m_i)\right]$$

Expanding the function $g$ in a Taylor series to the first order with respect to $m$, we get the approximation

$$g(m_j) \approx g(m_i) + (m_j - m_i)\,g_m'(m_i)$$

or

$$\frac{d\ln r_i^j(t)}{dt} \approx -(m_j - m_i)\,g_m'(m_i)$$

Upon integration, this equation becomes

$$r_i^j(t) = r_i^j(0)\exp\left[-\Delta m_i^j\,g_m'(m_i)\,t\right] = r_i^j(0)\left[\exp\left(-g_m'(m_i)\,t\right)\right]^{\Delta m_i^j} \qquad (3.1.28)$$

where the pre-exponential term represents the isotopic ratio of the first fraction evaporated from the filament. This ratio is at equilibrium with the ratio of the sample initially on the filament. From the earlier discussion, we can express $r_i^j(0)$ as

$$r_i^j(0) = R_i^j\left(1 + \Delta m_i^j\,\delta\right)$$

Two limiting cases arise depending on the intensity of the distillation effect. If distillation is not important, we can expand the distillation exponential term to the first degree as

$$r_i^j(t) \approx R_i^j\left(1 + \Delta m_i^j\,\delta\right)\left[1 - \Delta m_i^j\,g_m'(m_i)\,t\right] \approx R_i^j\left\{1 + \Delta m_i^j\left[\delta - g_m'(m_i)\,t\right]\right\}$$

where the second-order terms are neglected. $[\delta - g_m'(m_i)\,t]$ is known as the time-dependent mass fractionation per mass unit. This is the widely used time-dependent linear law of mass fractionation suitable for large samples. For small samples, however, mass fractionation is important and the cumulative effects are dominant so $\delta$ can be neglected resulting in

$$r_i^j(t) = R_i^j\left[\exp\left(-g_m'(m_i)\,t\right)\right]^{\Delta m_i^j}$$

This relationship is known as the power law of mass fractionation (Wasserburg et al., 1981; Hart and Zindler, 1989). Writing $\alpha(t)$ for $-g_m'(m_i)\,t$ for the mass fractionation per mass difference unit, the $^{148}$Nd/$^{150}$Nd ratio ($\Delta m_i^j = -2$) would change with $\alpha(t)$ according to a power law if
3.1.5 Roots of implicit equations and extrema of functions: the Newton method

Some equations such as $f(x) = 0$ cannot be explicitly solved for $x$. If multiple solutions are not expected in a narrow range, Newton's method is often simple to implement and has faster convergence than the natural method of interval splitting. The method is recursive and uses the first-order expansion of $f(x)$ in the vicinity of the $k$th guess

$$f[x^{(k+1)}] \approx f[x^{(k)}] + [x^{(k+1)} - x^{(k)}]\,f'[x^{(k)}]$$

Expressing our goal that $f[x^{(k+1)}] = 0$ results in

$$x^{(k+1)} = x^{(k)} - \frac{f[x^{(k)}]}{f'[x^{(k)}]} \qquad (3.1.29)$$

Table 3.2. Calculation of $\sqrt{2}$ as a solution to the equation $x^2 - 2 = 0$ by the Newton method. Compare with the true value of 1.41421.

Step k    x(k)       f[x(k)]    f'[x(k)]   f[x(k)]/f'[x(k)]
0         3.00000    7.00000    6.00000    1.16667
1         1.83333    1.36111    3.66667    0.37121
2         1.46212    0.13780    2.92424    0.04712
3         1.41500    0.00222    2.83000    0.00078
4         1.41421    0.00000    2.82843    0.00000
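The Newton update $x^{(k+1)} = x^{(k)} - f[x^{(k)}]/f'[x^{(k)}]$ can be sketched in a few lines. The example below applies it to $x^2 - 2 = 0$ from the starting guess 3; the function name `newton` and the stopping tolerance are illustrative choices, not part of the original text.

```python
def newton(f, fprime, x0, tol=1e-5, max_iter=50):
    """Newton iteration for f(x) = 0; returns (root, list of successive guesses)."""
    x = x0
    history = [x]
    for _ in range(max_iter):
        step = f(x) / fprime(x)   # correction f/f' at the current guess
        x -= step
        history.append(x)
        if abs(step) < tol:       # stop once the correction is negligible
            break
    return x, history


root, history = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 3.0)

if __name__ == "__main__":
    for k, x in enumerate(history):
        print(k, f"{x:.5f}")
```

Printing `history` gives the successive guesses, which can be compared with the column of estimates in a hand calculation.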
The Newton method can also be used to find the extremum (maximum or minimum) of a function $f(x)$, i.e., the value for which the first derivative $f'(x)$ is zero. The iterative search for the extremum is implemented by a formula derived from equation (3.1.29)

$$x^{(k+1)} = x^{(k)} - \frac{f'[x^{(k)}]}{f''[x^{(k)}]}$$

The extremum $X$ is a minimum if the second derivative $f''(X)$ is positive, a maximum if $f''(X)$ is negative.

Find the square-root of 2. This amounts to solving

$$f(x) = x^2 - 2 = 0$$

Pretending that we ignore the result, we calculate $f'(x) = 2x$ and take 3 as the initial guess $x^{(0)}$. Hence, $f[x^{(0)}] = 3^2 - 2 = 7$, $f'[x^{(0)}] = 2 \times 3 = 6$ and

$$x^{(1)} = 3 - (7/6) = 1.8333$$

Table 3.2 lists the results for four iterations. After four iterations, the estimate reproduces the true value of $\sqrt{2}$ to the fifth decimal place.

Find the roots of the equation

$$g(x) = \tan x - \frac{x}{1+x^2} = 0$$
which appears in connection with diffusion from a sphere into a finite volume (see Chapter 8). $x$ (in radians) is the intersection of the periodic function $\tan x$ with the single-valued function $x(1+x^2)^{-1}$. Since $\tan x$ goes to infinity for $x = (2n+1)\pi/2$ ($n = 0, \pm 1, \pm 2, \ldots$), there is one of these intersections per interval $[(2n-1)\pi/2, (2n+1)\pi/2]$. Consequently, we take $2n\pi/2 = n\pi$ as the initial guess for each interval. In addition, let us replace $g(x)$ by $f(x)$ such that

$$f(x) = (1+x^2)\tan x - x$$

which is easier to handle and has the same roots since $(1+x^2)$ is strictly positive. The derivative $f'(x)$ is

$$f'(x) = 2x\tan x + (1+x^2)(1+\tan^2 x) - 1$$

Let us try for $n = +5$, i.e., with $5\pi = 15.708$ as the initial guess $x^{(0)}$. The first step gives

$$f[x^{(0)}] = (1 + 25\pi^2) \times 0 - 5\pi = -15.708$$

$$f'[x^{(0)}] = 2 \times 5\pi \times 0 + (1 + 25\pi^2)(1 + 0^2) - 1 = 246.739$$

$$x^{(1)} = 5\pi - (-15.708/246.739) = 15.772$$

and the second step

$$f[x^{(1)}] = (1 + 15.772^2)\tan 15.772 - 15.772 = 0.1492$$

$$f'[x^{(1)}] = 2 \times 15.772 \times \tan 15.772 + (1 + 15.772^2)(1 + \tan^2 15.772) - 1 = 251.770$$

$$x^{(2)} = 15.772 - (0.1492/251.770) = 15.771$$
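The improvement is not significant, so we stop here.

A primary isochron $^{207}$Pb/$^{204}$Pb vs $^{206}$Pb/$^{204}$Pb on a series of rock samples gives a slope of 0.256. Calculate the age $T$ of the isochron. Standard textbooks on chronology (e.g., Faure, 1986) give the slope of a $^{207}$Pb/$^{204}$Pb vs $^{206}$Pb/$^{204}$Pb isochron as a function of age, so that $T$ is the solution of

$$f(T) = \frac{1}{137.88}\,\frac{e^{\lambda_{235U}T} - 1}{e^{\lambda_{238U}T} - 1} - 0.256 = 0$$

Taking the derivative relative to $T$, we get

$$f'(T) = \frac{1}{137.88}\,\frac{\lambda_{235U}\,e^{\lambda_{235U}T}\left(e^{\lambda_{238U}T} - 1\right) - \lambda_{238U}\,e^{\lambda_{238U}T}\left(e^{\lambda_{235U}T} - 1\right)}{\left(e^{\lambda_{238U}T} - 1\right)^2}$$

Using $\lambda_{238U} = 0.155125$ Ga$^{-1}$ and $\lambda_{235U} = 0.98485$ Ga$^{-1}$ and an initial guess $T^{(0)} = 5$ Ga, Table 3.3 lists the results of the first five iterations.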
Table 3.3. Iterative calculation of the age of an isochron with a slope of 0.256 in the $^{207}$Pb/$^{204}$Pb vs $^{206}$Pb/$^{204}$Pb diagram by the Newton method.

Step k    T(k)       f[T(k)]    f'[T(k)]   f[T(k)]/f'[T(k)]
0         5.00000    0.58927    0.59555    0.98945
1         4.01055    0.17202    0.28647    0.60048
2         3.41007    0.03260    0.18549    0.17574
3         3.23433    0.00197    0.16359    0.01202
4         3.22231    0.00001    0.16219    0.00005
5         3.22226    0.00000    0.16219    0.00000
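The isochron age can be recomputed with the same Newton update. The sketch below assumes the slope function $f(T) = (e^{\lambda_{235}T}-1)/[137.88\,(e^{\lambda_{238}T}-1)] - 0.256$ and its analytical derivative obtained with the quotient rule; variable and function names are illustrative.

```python
import math

L238 = 0.155125   # decay constant of 238U, Ga^-1
L235 = 0.98485    # decay constant of 235U, Ga^-1
SLOPE = 0.256     # measured isochron slope


def f(t):
    """Isochron slope at age t (Ga) minus the measured slope."""
    return (math.exp(L235 * t) - 1.0) / (137.88 * (math.exp(L238 * t) - 1.0)) - SLOPE


def fprime(t):
    """Analytical derivative of f with respect to t (quotient rule)."""
    u = math.exp(L235 * t) - 1.0
    v = math.exp(L238 * t) - 1.0
    du = L235 * math.exp(L235 * t)
    dv = L238 * math.exp(L238 * t)
    return (du * v - u * dv) / (137.88 * v * v)


def isochron_age(t0=5.0, tol=1e-7, max_iter=50):
    """Newton iteration started at t0 (Ga); returns the list of age estimates."""
    ages = [t0]
    for _ in range(max_iter):
        step = f(ages[-1]) / fprime(ages[-1])
        ages.append(ages[-1] - step)
        if abs(step) < tol:
            break
    return ages


ages = isochron_age()

if __name__ == "__main__":
    for k, t in enumerate(ages):
        print(k, f"{t:.5f}")
```

The successive estimates can be set against the second column of Table 3.3.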
In a Concordia diagram $^{206}$Pb/$^{238}$U vs $^{207}$Pb/$^{235}$U, a series of zircons give a good alignment with a slope $a = 0.043633$ and an intercept $\beta = 0.094613$. Calculate the ages at which this line intersects the Concordia. The equation of the Concordia is (Wetherill, 1956; Faure, 1986)

$$y = \frac{^{206}\mathrm{Pb}}{^{238}\mathrm{U}} = e^{\lambda_{238U}T} - 1, \qquad x = \frac{^{207}\mathrm{Pb}}{^{235}\mathrm{U}} = e^{\lambda_{235U}T} - 1$$

whereas the straight-line equation reads

$$y - ax - \beta = 0$$

These equations can be combined as

$$f(T) = e^{\lambda_{238U}T} - 1 - a\left(e^{\lambda_{235U}T} - 1\right) - \beta = 0$$

which has for derivative

$$f'(T) = \lambda_{238U}\,e^{\lambda_{238U}T} - a\,\lambda_{235U}\,e^{\lambda_{235U}T}$$

Using two different initial guesses (5 and 0 Ga) in order to approach the two intersections from different directions, Table 3.4 gives the results of the first iterations towards each intersection. The zircon alignment intersects the Concordia at 1.00 and 2.00 Ga.

A basaltic liquid with an FeO content $C_0^{FeO}$ of 10 percent and an MgO content $C_0^{MgO}$ of 12 percent crystallizes olivine (ol). Calculate the FeO and MgO contents $C_{liq}^{FeO}$ and $C_{liq}^{MgO}$ after 15 percent crystallization. Irvine (1977) has shown (see Section 1.5) that
CfaFeO and CfoMgO being the contents of FeO in fayalite and MgO in forsterite, the
Table 3.4. Iterative calculation of the two age intercepts with the Concordia curve for a zircon linear array having a slope a = 0.043633 and an intercept β = 0.094613.

Step k    T(k)      f[T(k)]    f'[T(k)]   f[T(k)]/f'[T(k)]
First intercept:
1         5.0000    -4.8824    -5.5755    0.8757
2         4.1243    -1.6891    -2.2017    0.7672
3         3.3571    -0.5581    -0.9113    0.6124
4         2.7447    -0.1715    -0.4039    0.4245
5         2.3202    -0.0465    -0.1999    0.2326
6         2.0875    -0.0095    -0.1213    0.0784
7         2.0091    -0.0009    -0.0990    0.0090
8         2.0001    0.0000     -0.0965    0.0001
9         2.0000    0.0000     -0.0965    0.0000
Second intercept:
1         0.0000    -0.0946    0.1122     -0.8436
2         0.8436    -0.0113    0.0782     -0.1447
3         0.9883    -0.0008    0.0671     -0.0116
4         0.9999    0.0000     0.0661     -0.0001
5         1.0000    0.0000     0.0661     0.0000
solution is (Chapter 1 and Albarede, 1992) Q
FeO
Q MgO
(3.1.32) where KD is the ratio (FeO/MgO) ol /(FeO/MgO) liq and the parameter z is defined as C1/^
MgO//^1 MgO
Z — r Ujiq
/U 0
c\ i 'l'W
yD.l.DD)
Taking the derivative of/(z) with respect to z gives
f'(z)=-KD
r
F
FeO
(3.1.34)
Using a table of molar weights, we get

$$C_{fa}^{FeO} = 2 \times 71.85/203.78 = 70.52\ \text{percent}$$

$$C_{fo}^{MgO} = 2 \times 40.31/140.71 = 57.30\ \text{percent}$$

A natural starting value for $z$ is $F$. The calculation listed in Table 3.5 was stopped once the relative deviation in $z$ was less than 0.001. The final $z$ value of 0.4306 is converted through equation (3.1.33) into

$$C_{liq}^{MgO} = 0.4306 \times 12/0.85 = 6.08\ \text{percent}$$
Table 3.5. Iterative calculation of the z value, equation (3.1.33), for the f(z) = 0 equation (3.1.32).

Step k    f(z)       f'(z)      z
0         -0.1121    -0.2556    0.8500
1         0.0054     -0.2867    0.4115
2         0.0000     -0.2842    0.4305
3                               0.4306

and through equation (3.1.31) into

$$C_{liq}^{FeO} = 6.08 \times (10/12) \times (0.4306)^{0.29-1} = 9.21\ \text{percent}$$
<& Upon heating for a time t = 1 hr at 800°C, a plagioclase crystal with a radius a = 1 mm has lost 40 percent of the radiogenic argon it initially contained. Calculate the argon diffusion coefficient 2 at this temperature. Assume that the plagioclase crystal may be considered as a sphere with isotropic diffusion properties and that, initially, radiogenic argon was homogeneously distributed. Let us define the dimensionless number
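$$\tau = \frac{\mathcal{D}\,t}{a^2}$$

where $\mathcal{D}$ is the diffusion coefficient. From Section 8.6, the fraction $F(\tau)$ of argon left in the sphere at $\tau$ can be written

$$F(\tau) = \frac{6}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{n^2}\exp(-n^2\pi^2\tau) \qquad (3.1.35)$$

Hence, the implicit equation to be solved for $\tau$ is

$$f(\tau) = F(\tau) - 0.6 = 0$$

The derivative of $f(\tau)$ is

$$f'(\tau) = -6\sum_{n=1}^{\infty}\exp(-n^2\pi^2\tau) \qquad (3.1.36)$$

The first seven iterations produced from an arbitrary trial value $\tau = 0.0001$ are listed in Table 3.6. The final result is exact to better than four decimal places. This result allows the diffusion coefficient to be calculated as

$$\mathcal{D} = \frac{\tau\,a^2}{t} = \frac{0.0180 \times (0.1\ \mathrm{cm})^2}{3600\ \mathrm{s}} = 5.0 \times 10^{-8}\ \mathrm{cm^2\,s^{-1}}$$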
Table 3.6. Iterative calculation of the τ value corresponding to a lost fraction of radiogenic argon of 40 percent. Spherical geometry is assumed, equation (3.1.35).
Stepfc
F(T)
/'M
/W//'(T)
T
1 2 3 4 5 6 7
0.9664 0.9664 0.8444 0.6932 0.6145 0.6004 0.6000
-166.26 -166.26 -32.261 -14.028 -10.168 -9.6357 -9.6216
-0.0022 -0.0076 -0.0066 -0.0014 -0.0000 -0.0000
0.0001 0.0023 0.0099 0.0165 0.0179 0.0180 0.0180
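The argon-loss calculation can be retraced numerically. The sketch below rests on several assumptions that should be checked against Section 8.6: the dimensionless time is taken as $\tau = \mathcal{D}t/a^2$, the fraction left as $F(\tau) = (6/\pi^2)\sum n^{-2}\exp(-n^2\pi^2\tau)$, and the infinite series is truncated at 2000 terms (the parameter `n_terms` is an illustrative choice, needed because the series converges slowly at small $\tau$).

```python
import math


def fraction_left(tau, n_terms=2000):
    """Truncated series for the fraction of argon left in a sphere."""
    total = sum(math.exp(-(n * math.pi) ** 2 * tau) / n ** 2
                for n in range(1, n_terms + 1))
    return 6.0 / math.pi ** 2 * total


def fraction_left_derivative(tau, n_terms=2000):
    """Term-by-term derivative of the truncated series with respect to tau."""
    return -6.0 * sum(math.exp(-(n * math.pi) ** 2 * tau)
                      for n in range(1, n_terms + 1))


def solve_tau(target=0.6, tau0=1e-4, tol=1e-8, max_iter=50):
    """Newton iteration on F(tau) - target = 0, started at tau0."""
    tau = tau0
    for _ in range(max_iter):
        step = (fraction_left(tau) - target) / fraction_left_derivative(tau)
        tau -= step
        if abs(step) < tol:
            break
    return tau


tau = solve_tau()
radius_cm = 0.1          # a = 1 mm
time_s = 3600.0          # t = 1 hr
diffusion = tau * radius_cm ** 2 / time_s   # cm^2 per second

if __name__ == "__main__":
    print(f"tau = {tau:.4f}, D = {diffusion:.2e} cm^2/s")
```

Because $F(\tau)$ is a decreasing convex function, starting from a small trial value approaches the root from one side only, which is consistent with the steadily increasing estimates of a hand calculation.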
Find numerically the minimum over [0, 1] of the binary entropy function

$$f(x) = x\ln x + (1-x)\ln(1-x)$$

The obvious result $x = 0.5$ can be arrived at in a number of ways. The first and second derivatives are

$$f'(x) = \ln[x/(1-x)]$$

and

$$f''(x) = x^{-1} + (1-x)^{-1}$$

With the trial value $x^{(0)} = 0.1$, we obtain $f'[x^{(0)}] = -2.1972$, $f''[x^{(0)}] = 11.111$, and $x^{(1)} = 0.2978$. The result 0.5000 with four correct decimal places is obtained with $x^{(3)}$.
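The search for the extremum can be sketched with the update $x^{(k+1)} = x^{(k)} - f'[x^{(k)}]/f''[x^{(k)}]$. The code below assumes $f'(x) = \ln[x/(1-x)]$ and $f''(x) = 1/x + 1/(1-x)$ and simply carries out three iterations from the trial value 0.1; the function name is illustrative.

```python
import math


def newton_extremum(fprime, fsecond, x0, n_iter):
    """Apply x <- x - f'(x)/f''(x) n_iter times; returns all iterates."""
    xs = [x0]
    for _ in range(n_iter):
        x = xs[-1]
        xs.append(x - fprime(x) / fsecond(x))
    return xs


iterates = newton_extremum(
    lambda x: math.log(x / (1.0 - x)),      # first derivative
    lambda x: 1.0 / x + 1.0 / (1.0 - x),    # second derivative (positive: minimum)
    0.1,
    3,
)

if __name__ == "__main__":
    for k, x in enumerate(iterates):
        print(k, f"{x:.4f}")
```

The second derivative is positive everywhere on (0, 1), so the stationary point found is a minimum.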
3.1.6 Ordinary differential equations: the Euler method

Quite commonly, differential equations appear in the form

$$\frac{dy}{dt} = f(t, y) \qquad (3.1.37)$$

and cannot be solved explicitly. We have to resort to one of the many numerical methods of which the simplest versions are given here. The Euler method has little practical value, but forms the basis for most of the more elaborate methods. It consists in a first-order expansion of the derivative. The approximation at step $t^{(n+1)}$ is

$$y^{(n+1)} = y^{(n)} + \left[t^{(n+1)} - t^{(n)}\right] f\left[t^{(n)}, y^{(n)}\right] \qquad (3.1.38)$$
Table 3.7. Solution of the differential equation dy/dt = -2ty by the Euler method with a time step of 0.1 (left) and 0.01 (right).

t       y        -2ty      True value      t       y        -2ty      True value
0.00    1        0         1               0       1        0         1
0.10    1.0000   -0.2000   0.9900          0.01    1.0000   -0.0200   0.9999
0.20    0.9800   -0.3920   0.9608          0.02    0.9998   -0.0400   0.9996
...                                        ...
0.90    0.4655   -0.8379   0.4449          0.19    0.9663   -0.3672   0.9645
1.00    0.3817   -0.7634   0.3679          0.20    0.9627   -0.3851   0.9608
Solve the equation that results from introducing the Boltzmann variable into the diffusion equation (Chapter 8)

$$y'(t) = -2ty$$

given the initial value y(0) = 1. This equation has the exact solution $y = \exp(-t^2)$. For a rather coarse time step of 0.1, we obtain

$$y(0.1) = y(0) + (0.1 - 0.0)\times(-2\times 0\times 1) = 1$$

$$y(0.2) = y(0.1) + (0.2 - 0.1)\times(-2\times 0.1\times 1) = 0.98$$

Table 3.7 compares the accuracy for two different time step sizes. Obviously, final accuracy depends on the time step chosen, and considerable computational effort would be required for a good approximation.
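A minimal Python sketch of the Euler update applied to this example is given here. The function names `euler` and `rhs` are illustrative choices, and a constant time step is assumed.

```python
import math


def euler(f, t0, y0, h, n_steps):
    """Integrate dy/dt = f(t, y) with the explicit Euler update."""
    ts, ys = [t0], [y0]
    for n in range(n_steps):
        t, y = ts[-1], ys[-1]
        ys.append(y + h * f(t, y))
        ts.append(t0 + (n + 1) * h)  # avoids accumulating round-off in t
    return ts, ys


def rhs(t, y):
    # Right-hand side of y'(t) = -2 t y
    return -2.0 * t * y


ts_coarse, ys_coarse = euler(rhs, 0.0, 1.0, 0.1, 10)
ts_fine, ys_fine = euler(rhs, 0.0, 1.0, 0.01, 100)

# Compare both step sizes with the exact solution exp(-t^2) at t = 1
print(ys_coarse[-1], ys_fine[-1], math.exp(-1.0))
```

Dividing the step by ten costs ten times as many evaluations, which is the trade-off between accuracy and effort noted above.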
3.1.7 Ordinary differential equations: the Runge-Kutta method

Although Press et al. (1986) compare the use of the Runge-Kutta method to 'ploughing the fields' and that of high-order predictor-corrector schemes to 'racing on the fast lane with a sports car', we are still dealing with a reliable method that is easy to implement and quite successful. The Runge-Kutta method is sketched in Figure 3.6 and uses successive approximations of the function on the $[t^{(n)}, t^{(n+1)}]$ interval. Let us define

$$h = t^{(n+1)} - t^{(n)}$$

and the intermediate evaluations

$$k_1 = h f\left[t^{(n)}, y^{(n)}\right]$$

$$k_2 = h f\left[t^{(n)} + \frac{h}{2},\; y^{(n)} + \frac{k_1}{2}\right]$$

$$k_3 = h f\left[t^{(n)} + \frac{h}{2},\; y^{(n)} + \frac{k_2}{2}\right]$$

$$k_4 = h f\left[t^{(n)} + h,\; y^{(n)} + k_3\right]$$

Then

$$y^{(n+1)} = y^{(n)} + \frac{1}{6}\left(k_1 + 2k_2 + 2k_3 + k_4\right) \qquad (3.1.39)$$

This scheme has tight connections with Simpson's rule for numerical integration.

Figure 3.6 Numerical solution of ordinary differential equations: sketch of the four steps of the Runge-Kutta method to the order four giving the (n+1)th estimate $y^{(n+1)}$ from the nth estimate $y^{(n)}$.
Let us solve the same equation as above, $y'(t) = -2ty$, with initial value y(0) = 1 and a constant time step $t^{(n+1)} - t^{(n)} = 0.1$.
Table 3.8. Solution of the differential equation dy/dt = -2ty by the Runge-Kutta method to the order four with a time step of 0.1. The last column is the exact solution at t(n+1).

Step n  t(n)  y(n)     k1        k2        k3        k4        y(n+1)   exp(-t^2)
0       0     1        0.0000    -0.0100   -0.0100   -0.0198   0.9900   0.9900
1       0.1   0.9900   -0.0198   -0.0294   -0.0293   -0.0384   0.9608   0.9608
2       0.2   0.9608   -0.0384   -0.0471   -0.0469   -0.0548   0.9139   0.9139
3       0.3   0.9139   -0.0548   -0.0621   -0.0618   -0.0682   0.8521   0.8521
4       0.4   0.8521   -0.0682   -0.0736   -0.0734   -0.0779   0.7788   0.7788
5       0.5   0.7788   -0.0779   -0.0814   -0.0812   -0.0837   0.6977   0.6977
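Let us show how the method is implemented on the first time step, starting from y(0) = 1:

$$k_1 = 0.1\times\left[-2\times 0\times 1\right] = 0$$

$$k_2 = 0.1\times\left[-2\times\left(0 + \frac{0.1}{2}\right)\times\left(1 + \frac{0}{2}\right)\right] = -0.01$$

$$k_3 = 0.1\times\left[-2\times\left(0 + \frac{0.1}{2}\right)\times\left(1 + \frac{-0.01}{2}\right)\right] = -0.00995$$

$$k_4 = 0.1\times\left[-2\times(0 + 0.1)\times(1 - 0.00995)\right] = -0.019801$$

hence

$$y(0.1) = 1 + \frac{0 - 2\times 0.01 - 2\times 0.00995 - 0.019801}{6} = 0.99005$$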
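The first step worked out above can be reproduced with a short Python sketch of the scheme of equation (3.1.39). The helper names `rk4_step` and `rk4` are illustrative, and a constant step is assumed.

```python
import math


def rk4_step(f, t, y, h):
    """One fourth-order Runge-Kutta step for dy/dt = f(t, y)."""
    k1 = h * f(t, y)
    k2 = h * f(t + h / 2, y + k1 / 2)
    k3 = h * f(t + h / 2, y + k2 / 2)
    k4 = h * f(t + h, y + k3)
    return y + (k1 + 2 * k2 + 2 * k3 + k4) / 6


def rk4(f, t0, y0, h, n_steps):
    """Apply rk4_step repeatedly and return the list of y values."""
    ys = [y0]
    for n in range(n_steps):
        ys.append(rk4_step(f, t0 + n * h, ys[-1], h))
    return ys


def rhs(t, y):
    # Right-hand side of y'(t) = -2 t y
    return -2.0 * t * y


ys = rk4(rhs, 0.0, 1.0, 0.1, 6)
for n, y in enumerate(ys):
    t = 0.1 * n
    print(round(t, 1), round(y, 4), round(math.exp(-t * t), 4))
```

With the same step of 0.1 as the coarse Euler run, the printed values can be compared directly with the exact solution.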
A few more steps with results trimmed to the fourth digit are computed in Table 3.8. The Runge-Kutta method provides a robust and reasonably precise answer to most differential equations.

3.1.8 Interpolation with spline functions

Interpolating discrete data is an old concern of physics but, in the case of numerous data points, the conventional collocation and osculating polynomials are of too high a degree to be really useful (Scheid, 1968). Interpolation of n + 1 data points $y_0, y_1, \ldots, y_n$ tabulated at $x_0, x_1, \ldots, x_n$ for intermediate values of the independent variable x can be done by a number of methods, but one of the simplest and most elegant is the construction of cubic splines. On each interval, the 'data' function is approximated by a cubic polynomial such that, at their common point, the polynomials of neighboring intervals have identical values, slopes and curvatures (Ahlberg et al., 1967). Let us consider the interval $[x_{i-1}, x_i]$. A third degree polynomial
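has a linearly changing second derivative $y''$, hence

$$y''(x) = \gamma\, y''_{i-1} + (1-\gamma)\, y''_i \quad\text{for } x_{i-1}\le x\le x_i \qquad (3.1.40)$$

where $\gamma$ is a factor depending on x. Solving at the extremities $x_{i-1}$ and $x_i$ of the interval, we get

$$y''(x) = \frac{x_i - x}{x_i - x_{i-1}}\, y''_{i-1} + \left(1 - \frac{x_i - x}{x_i - x_{i-1}}\right) y''_i \quad\text{for } x_{i-1}\le x\le x_i \qquad (3.1.41)$$

while over the interval $x_i$ and $x_{i+1}$ the expression becomes

$$y''(x) = \left(1 - \frac{x - x_i}{x_{i+1} - x_i}\right) y''_i + \frac{x - x_i}{x_{i+1} - x_i}\, y''_{i+1} \quad\text{for } x_i\le x\le x_{i+1} \qquad (3.1.42)$$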
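These expressions can be integrated relative to $(x_i - x)$ and $(x - x_i)$, respectively:

$$y'(x) = -\frac{1}{2}\frac{(x_i - x)^2}{x_i - x_{i-1}}\, y''_{i-1} - \left[(x_i - x) - \frac{1}{2}\frac{(x_i - x)^2}{x_i - x_{i-1}}\right] y''_i + c_1 \quad\text{for } x_{i-1}\le x\le x_i$$

$$y'(x) = \left[(x - x_i) - \frac{1}{2}\frac{(x - x_i)^2}{x_{i+1} - x_i}\right] y''_i + \frac{1}{2}\frac{(x - x_i)^2}{x_{i+1} - x_i}\, y''_{i+1} + c_2 \quad\text{for } x_i\le x\le x_{i+1}$$

where $c_1$ and $c_2$ are two constants of integration. A second integration gives

$$y(x) = \frac{1}{6}\frac{(x_i - x)^3}{x_i - x_{i-1}}\, y''_{i-1} + \frac{1}{6}\left[3(x_i - x)^2 - \frac{(x_i - x)^3}{x_i - x_{i-1}}\right] y''_i - c_1 (x_i - x) + c_3 \quad\text{for } x_{i-1}\le x\le x_i$$

$$y(x) = \frac{1}{6}\left[3(x - x_i)^2 - \frac{(x - x_i)^3}{x_{i+1} - x_i}\right] y''_i + \frac{1}{6}\frac{(x - x_i)^3}{x_{i+1} - x_i}\, y''_{i+1} + c_2 (x - x_i) + c_4 \quad\text{for } x_i\le x\le x_{i+1}$$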
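Writing that the first cubic goes through the points $(x_{i-1}, y_{i-1})$ and $(x_i, y_i)$, we get

$$y_{i-1} = \frac{1}{6}(x_i - x_{i-1})^2 y''_{i-1} + \frac{1}{3}(x_i - x_{i-1})^2 y''_i - c_1 (x_i - x_{i-1}) + c_3, \qquad y_i = c_3$$

hence

$$c_1 = y_i'^{(-)} = \frac{y_i - y_{i-1}}{x_i - x_{i-1}} + \frac{x_i - x_{i-1}}{6}\, y''_{i-1} + \frac{x_i - x_{i-1}}{3}\, y''_i \qquad (3.1.43)$$

where $y_i'^{(-)}$ is the left-derivative at $x = x_i$. Likewise, writing that the second cubic goes through the points $(x_i, y_i)$ and $(x_{i+1}, y_{i+1})$, we get

$$y_i = c_4, \qquad y_{i+1} = \frac{1}{3}(x_{i+1} - x_i)^2 y''_i + \frac{1}{6}(x_{i+1} - x_i)^2 y''_{i+1} + c_2 (x_{i+1} - x_i) + c_4$$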
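hence

$$c_2 = y_i'^{(+)} = \frac{y_{i+1} - y_i}{x_{i+1} - x_i} - \frac{x_{i+1} - x_i}{3}\, y''_i - \frac{x_{i+1} - x_i}{6}\, y''_{i+1} \qquad (3.1.44)$$

where $y_i'^{(+)}$ is the right-derivative at $x = x_i$. The first left- and right-derivatives must be equal for the common point $x = x_i$, hence

$$\frac{y_i - y_{i-1}}{x_i - x_{i-1}} + \frac{x_i - x_{i-1}}{6}\, y''_{i-1} + \frac{x_i - x_{i-1}}{3}\, y''_i = \frac{y_{i+1} - y_i}{x_{i+1} - x_i} - \frac{x_{i+1} - x_i}{3}\, y''_i - \frac{x_{i+1} - x_i}{6}\, y''_{i+1}$$

which is recombined into

$$(x_i - x_{i-1})\, y''_{i-1} + 2(x_{i+1} - x_{i-1})\, y''_i + (x_{i+1} - x_i)\, y''_{i+1} = 6\left[\frac{y_{i+1} - y_i}{x_{i+1} - x_i} - \frac{y_i - y_{i-1}}{x_i - x_{i-1}}\right]$$

For n + 1 data points, (n - 1) equations such as the one above can be written with (n + 1) unknowns $y''_i$ (i = 0, ..., n). Two additional equations are needed, which most often are end conditions at i = 0 and i = n. The two conditions specifying the slopes

$$y'_0 = \frac{y_1 - y_0}{x_1 - x_0} - \frac{x_1 - x_0}{3}\, y''_0 - \frac{x_1 - x_0}{6}\, y''_1 \qquad (3.1.45)$$

$$y'_n = \frac{y_n - y_{n-1}}{x_n - x_{n-1}} + \frac{x_n - x_{n-1}}{6}\, y''_{n-1} + \frac{x_n - x_{n-1}}{3}\, y''_n \qquad (3.1.46)$$

make it possible to complement the system of equations and solve it for the (n + 1) unknowns $y''_i$ (i = 0, ..., n). In a matrix form, this is written as

$$AM = D \qquad (3.1.47)$$

where the current elements $a_{ij}$ of the (n + 1) x (n + 1) tri-diagonal matrix A are given by

$$a_{ij} = \begin{cases} 2(x_1 - x_0) & i = j = 0 \\ x_1 - x_0 & i = 0,\ j = 1 \\ x_i - x_{i-1} & j = i - 1,\ 1 \le i \le n \\ 2(x_{i+1} - x_{i-1}) & j = i,\ 1 \le i \le n - 1 \\ x_{i+1} - x_i & j = i + 1,\ 1 \le i \le n - 1 \\ 2(x_n - x_{n-1}) & i = j = n \\ 0 & \text{otherwise} \end{cases} \qquad (3.1.48)$$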
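Table 3.9. Chondrite-normalized Ce/Yb ratio of some recent lavas of the Piton de la Fournaise, Reunion Island (Albarede and Tamagnan, 1988).

i           0      1      2      3      4      5      6      7
Year        1948   1953   1956   1966   1972   1975   1981   1985
(Ce/Yb)N    20.9   21.2   22.0   20.8   21.7   22.4   21.3   18.9

$M$ is the (n + 1)-vector of the unknowns $y''_i$, while $D$ is the (n + 1)-vector with current element

$$d_0 = 6\left[\frac{y_1 - y_0}{x_1 - x_0} - y'_0\right]$$

$$d_i = 6\left[\frac{y_{i+1} - y_i}{x_{i+1} - x_i} - \frac{y_i - y_{i-1}}{x_i - x_{i-1}}\right] \quad (i = 1, \ldots, n-1) \qquad (3.1.49)$$

$$d_n = 6\left[y'_n - \frac{y_n - y_{n-1}}{x_n - x_{n-1}}\right]$$

Once the linear system is solved for the unknowns $y''$, the value interpolated at any arbitrary x can be calculated from either formula

$$y(x) = y_i - y_i'^{(-)}(x_i - x) + \frac{1}{2}\, y''_i (x_i - x)^2 - \frac{1}{6}\left(\frac{y''_i - y''_{i-1}}{x_i - x_{i-1}}\right)(x_i - x)^3 \quad\text{for } x_{i-1}\le x\le x_i \qquad (3.1.50)$$

$$y(x) = y_i + y_i'^{(+)}(x - x_i) + \frac{1}{2}\, y''_i (x - x_i)^2 + \frac{1}{6}\left(\frac{y''_{i+1} - y''_i}{x_{i+1} - x_i}\right)(x - x_i)^3 \quad\text{for } x_i\le x\le x_{i+1} \qquad (3.1.51)$$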
where the left- and right-derivatives $y_i'^{(-)}$ and $y_i'^{(+)}$ are given by equations (3.1.43) and (3.1.44), respectively. It is often preferred to choose the values of the derivatives instead of the curvatures as unknowns. In particular, the right-derivatives $y_i'^{(+)}$ at the data points (breakpoints) appear in the standard pp-form (piecewise polynomial) used in many software packages (de Boor, 1978). This transformation is a simple task using the relationship between derivatives and curvatures, preferably in a matrix form.

The Piton de la Fournaise volcano (Indian Ocean) erupts basalts with chemical compositions that change with time. The rare-earth elements have been measured on eight dated historic lavas (Table 3.9 and Figure 3.7, Albarede and Tamagnan, 1988), and chondrite-normalized (Ce/Yb)N ratios over the time interval 1948-1985 are given in Table 3.9. Calculate an annual interpolation of these results.

Figure 3.7 Spline interpolation of the chondrite-normalized Ce/Yb ratio in recent lava flows of the Piton de la Fournaise volcano (Albarede and Tamagnan, 1988), plotted against year. The end derivatives are supposed to be zero.

Let us build first the matrix A of coefficients $a_{ij}$ and the right-hand side vector D with the (admittedly questionable) assumption that the derivatives at the end-points are zero. Matrix A is calculated from equations (3.1.48), e.g., $a_{00} = 2\times(1953 - 1948) = 10$, $a_{01} = 1953 - 1948 = 5$, and so on. Vector D is calculated from equations (3.1.49), e.g.,

$$d_0 = 6\left[\frac{21.2 - 20.9}{1953 - 1948} - 0\right] = 0.36$$

$$d_1 = 6\left[\frac{22.0 - 21.2}{1956 - 1953} - \frac{21.2 - 20.9}{1953 - 1948}\right] = 1.24$$

so we arrive at

$$A = \begin{bmatrix} 10 & 5 & 0 & 0 & 0 & 0 & 0 & 0 \\ 5 & 16 & 3 & 0 & 0 & 0 & 0 & 0 \\ 0 & 3 & 26 & 10 & 0 & 0 & 0 & 0 \\ 0 & 0 & 10 & 32 & 6 & 0 & 0 & 0 \\ 0 & 0 & 0 & 6 & 18 & 3 & 0 & 0 \\ 0 & 0 & 0 & 0 & 3 & 18 & 6 & 0 \\ 0 & 0 & 0 & 0 & 0 & 6 & 20 & 4 \\ 0 & 0 & 0 & 0 & 0 & 0 & 4 & 8 \end{bmatrix} \quad\text{and}\quad D = \begin{bmatrix} 0.36 \\ 1.24 \\ -2.32 \\ 1.62 \\ 0.50 \\ -2.50 \\ -2.50 \\ 3.60 \end{bmatrix}$$
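The assembly and solution of this tri-diagonal system can be sketched in plain Python as follows. The function names `spline_curvatures` and `spline_value` are illustrative, the elimination is the standard Thomas algorithm (one possible solver, not necessarily the one used for the table), and the evaluation uses the left-interval cubic with the left-derivative of equation (3.1.43).

```python
def spline_curvatures(x, y, slope_start=0.0, slope_end=0.0):
    """Solve A M = D for the curvatures y'' with prescribed end slopes."""
    n = len(x) - 1
    h = [x[i + 1] - x[i] for i in range(n)]            # interval lengths
    s = [(y[i + 1] - y[i]) / h[i] for i in range(n)]   # chord slopes
    lower = [0.0] * (n + 1)   # sub-diagonal a[i][i-1]
    diag = [0.0] * (n + 1)    # diagonal a[i][i]
    upper = [0.0] * (n + 1)   # super-diagonal a[i][i+1]
    d = [0.0] * (n + 1)
    diag[0], upper[0], d[0] = 2 * h[0], h[0], 6 * (s[0] - slope_start)
    for i in range(1, n):
        lower[i], diag[i], upper[i] = h[i - 1], 2 * (h[i - 1] + h[i]), h[i]
        d[i] = 6 * (s[i] - s[i - 1])
    lower[n], diag[n], d[n] = h[n - 1], 2 * h[n - 1], 6 * (slope_end - s[n - 1])
    # Thomas algorithm: forward elimination, then back substitution
    for i in range(1, n + 1):
        w = lower[i] / diag[i - 1]
        diag[i] -= w * upper[i - 1]
        d[i] -= w * d[i - 1]
    m = [0.0] * (n + 1)
    m[n] = d[n] / diag[n]
    for i in range(n - 1, -1, -1):
        m[i] = (d[i] - upper[i] * m[i + 1]) / diag[i]
    return m


def spline_value(x, y, m, xq):
    """Evaluate the spline at xq, assumed to lie within [x[0], x[-1]]."""
    i = next(j for j in range(1, len(x)) if xq <= x[j])  # x[i-1] <= xq <= x[i]
    h = x[i] - x[i - 1]
    left_slope = (y[i] - y[i - 1]) / h + h * m[i - 1] / 6 + h * m[i] / 3
    u = x[i] - xq
    return (y[i] - left_slope * u + m[i] * u ** 2 / 2
            - (m[i] - m[i - 1]) / h * u ** 3 / 6)


years = [1948, 1953, 1956, 1966, 1972, 1975, 1981, 1985]
ce_yb = [20.9, 21.2, 22.0, 20.8, 21.7, 22.4, 21.3, 18.9]
curvatures = spline_curvatures(years, ce_yb)
y_1960 = spline_value(years, ce_yb, curvatures, 1960)
print([round(c, 4) for c in curvatures])
print(round(y_1960, 2))
```

Only the three diagonals are stored, so the cost grows linearly with the number of data points.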
Solving the matrix equation AM = D, we get the vector M of unknowns $y''_i$ (i = 0, ..., 7), which enables the left- and right-derivatives to be calculated through equations (3.1.43) and (3.1.44) (Table 3.10). Let us give an example by calculating the (Ce/Yb)N value interpolated for the year 1960 with the first interpolation formula. Clearly i = 3 (1966), so $x_3 - x = 1966 - 1960 = 6$, then $x_3 - x_2 = 1966 - 1956 = 10$. Moreover, $y_3 = 20.8$, $y_3'^{(-)} = -0.0423$ and $y''_3 = 0.0919$.
Table 3.10. Second derivatives, first left- and right-derivatives calculated for each data point listed in Table 3.9 through equations (3.1.43) to (3.1.46).

i    y''        y'(-)      y'(+)
0    -0.0185    0          0
1    0.1090     0.2262     0.2262
2    -0.1371    0.1840     0.1840
3    0.0919     -0.0423    -0.0423
4    0.0085     0.2589     0.2589
5    -0.0683    0.1693     0.1693
6    -0.2161    -0.6839    -0.6839
7    0.5581     0          0

Finally, $y''_3 - y''_2 = 0.0919 - (-0.1371) = 0.2290$, hence

$$y(x) = y_3 - y_3'^{(-)}(x_3 - x) + \frac{1}{2}\, y''_3 (x_3 - x)^2 - \frac{1}{6}\left(\frac{y''_3 - y''_2}{x_3 - x_2}\right)(x_3 - x)^3$$
which, upon replacement with the actual values, yields

$$y(t = 1960) = 20.8 - (-0.0423)\times 6 + \frac{1}{2}(0.0919)\times 6^2 - \frac{1}{6}\left(\frac{0.2290}{10}\right)\times 6^3 = 21.88$$

The calculation can be made for an arbitrary number of points provided their abscissae lie inside the range of x values. Figure 3.7 shows the characteristic features of spline interpolation: a very smooth aspect, although with some 'overshooting' problems, i.e., extrema located between the data points. Alternative interpolation schemes are discussed by Wiggins (1976).

3.2 Functions of several variables
3.2.1 Introduction

• Let $f(x_1, x_2, \ldots, x_n)$ be an n-multivariate scalar function of the independent variables $x_1, x_2, \ldots, x_n$. An equivalent notation is $f(\mathbf{x})$, where $\mathbf{x}$ is the vector made of the n variables $x_1, x_2, \ldots, x_n$. The partial derivative of the function $f(\mathbf{x})$ with respect to the variable $x_i$ is defined as

$$\frac{\partial f(\mathbf{x})}{\partial x_i} = \lim_{\Delta x_i\to 0}\frac{f(x_1, x_2, \ldots, x_i + \Delta x_i, \ldots, x_n) - f(x_1, x_2, \ldots, x_i, \ldots, x_n)}{\Delta x_i} \qquad (3.2.1)$$

Higher-order partial derivatives can be defined in a similar way. Using equation (3.2.1), we can show the important equality

$$\frac{\partial^2 f(\mathbf{x})}{\partial x\,\partial y} = \frac{\partial^2 f(\mathbf{x})}{\partial y\,\partial x} \qquad (3.2.2)$$
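• Let $\mathbf{u}(t)$ be an n-vector with components $(u_1, u_2, \ldots, u_n)$ depending on a single scalar variable t. The derivative of the vector $\mathbf{u}$ with respect to the scalar t is the n-vector defined as

$$\frac{d\mathbf{u}}{dt} = \lim_{\Delta t\to 0}\frac{\mathbf{u}(t + \Delta t) - \mathbf{u}(t)}{\Delta t} \qquad (3.2.3)$$

It makes a vector with n components $du_i/dt$ (i = 1, 2, ..., n). Example: the derivative of $\mathbf{u}(t) = [\ldots, \sin t]^T$ with respect to t is the vector $d\mathbf{u}/dt = [\ldots, \cos t]^T$.

The gradient vector $\operatorname{grad} f(\mathbf{x})$ of the scalar function $f(\mathbf{x})$ is an n-vector defined as

$$\operatorname{grad} f(\mathbf{x}) = \begin{bmatrix} \partial f(\mathbf{x})/\partial x_1 \\ \partial f(\mathbf{x})/\partial x_2 \\ \vdots \\ \partial f(\mathbf{x})/\partial x_n \end{bmatrix} \qquad (3.2.4)$$

The nabla notation $\nabla f(\mathbf{x})$ is commonly used. Example: the gradient vector of the function defined as $f(\mathbf{x}) = x_1 x_2^2 + \sin x_3$ is

$$\operatorname{grad} f(\mathbf{x}) = \begin{bmatrix} x_2^2 \\ 2x_1 x_2 \\ \cos x_3 \end{bmatrix}$$

As a particular case, the gradient of a scalar quantity obtained as the dot product of the constant column vector $\mathbf{u}$ $(u_1, \ldots, u_n)$ and the column vector $\mathbf{x}$ $(x_1, \ldots, x_n)$ is the vector $\mathbf{u}$ itself, for

$$\operatorname{grad} \mathbf{u}^T\mathbf{x} = \nabla \mathbf{u}^T\mathbf{x} = \begin{bmatrix} \partial \mathbf{u}^T\mathbf{x}/\partial x_1 \\ \partial \mathbf{u}^T\mathbf{x}/\partial x_2 \\ \vdots \\ \partial \mathbf{u}^T\mathbf{x}/\partial x_n \end{bmatrix} = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} \qquad (3.2.5)$$

Example: the gradient vector of the scalar $2x_1 + 3x_2$ is the column vector $[2, 3]^T$.

The variation $df(\mathbf{x})$ of an n-multivariate scalar function $f(\mathbf{x})$ along a displacement $d\mathbf{x}$ can be written as

$$df(\mathbf{x}) = \sum_{i=1}^{n}\frac{\partial f}{\partial x_i}\, dx_i = \left[\operatorname{grad} f(\mathbf{x})\right]^T d\mathbf{x} \qquad (3.2.6)$$

where $d\mathbf{x}$ is the n-vector with elements $dx_1, dx_2, \ldots, dx_n$. The variation $df(\mathbf{x})$ therefore has the meaning of the scalar product between the gradient and the displacement vector. Particular displacement vectors $d\mathbf{x}$ generate contour lines with constant $f(\mathbf{x})$ values such as $df(\mathbf{x}) = 0$. The dot product vanishes and the gradient is therefore a vector perpendicular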
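to the contour lines of the function. The gradient vector points uphill. If $df(\mathbf{x})$ is negative, the function $f(\mathbf{x})$ decreases along the direction $d\mathbf{x}$ and the opposite is true if it is positive. Example: at point $(1, 2, \pi/2)$, the direction $[-1, 3, 2]^T$ parallel to the infinitesimal change

$$d\mathbf{x} = (-dk,\; 3\,dk,\; 2\,dk)$$

with $dk > 0$ induces a change of the function $f(\mathbf{x}) = x_1 x_2^2 + \sin x_3$ defined previously, such that

$$df(\mathbf{x}) = -dk\times(2^2) + 3\,dk\times(2\times 1\times 2) + 2\,dk\times[\cos(\pi/2)] = 8\,dk$$

The direction $[-1, 3, 2]^T$ corresponds to an increasing value of $f(\mathbf{x})$.

• The divergence of a vector $\mathbf{v}$ with components $v_1, v_2, \ldots, v_n$ is the scalar number noted either $\operatorname{div}\mathbf{v}$ or, with the nabla notation, $\nabla\cdot\mathbf{v}$, defined as

$$\operatorname{div}\mathbf{v} = \nabla\cdot\mathbf{v} = \sum_{i=1}^{n}\frac{\partial v_i}{\partial x_i} \qquad (3.2.7)$$

The Taylor expansion of a bivariate function $f(x, y)$ in the neighborhood of the point $(x_0, y_0)$ is obtained by expanding $f(x, y)$ with respect to x at constant y, then with respect to y at constant x (or the other way around), and can be written as

$$f(x, y) = f(x_0, y_0) + (x - x_0)\frac{\partial f}{\partial x} + (y - y_0)\frac{\partial f}{\partial y} + \frac{1}{2}\left[(x - x_0)^2\frac{\partial^2 f}{\partial x^2} + 2(x - x_0)(y - y_0)\frac{\partial^2 f}{\partial x\,\partial y} + (y - y_0)^2\frac{\partial^2 f}{\partial y^2}\right] + \text{higher-order terms}$$

with all derivatives evaluated at $(x_0, y_0)$. Defining $\Delta\mathbf{x}$ as the column vector with elements $(x - x_0)$ and $(y - y_0)$, an alternative matrix formulation is

$$f(x, y) = f(x_0, y_0) + \Delta\mathbf{x}^T\operatorname{grad} f + \frac{1}{2}\Delta\mathbf{x}^T H\,\Delta\mathbf{x} + \text{higher-order terms} \qquad (3.2.8)$$

where the 2 x 2 symmetric curvature matrix or Hessian $H$ is defined as

$$H = \begin{bmatrix} \dfrac{\partial^2 f(x, y)}{\partial x^2} & \dfrac{\partial^2 f(x, y)}{\partial x\,\partial y} \\ \dfrac{\partial^2 f(x, y)}{\partial x\,\partial y} & \dfrac{\partial^2 f(x, y)}{\partial y^2} \end{bmatrix} \qquad (3.2.9)$$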
An extremum is a minimum if any fluctuation of the coordinates about this point causes the function to increase. It is a maximum if any fluctuation causes the function to decrease. In any other case, we are dealing with a saddle point. We write the fluctuation $\Delta f$ of $f(x, y)$
about the extremum with coordinates $(x^*, y^*)$ as

$$\Delta f = f(x, y) - f(x^*, y^*) \approx \frac{1}{2}\Delta\mathbf{x}^T H\,\Delta\mathbf{x}$$

In general, $H$ can be diagonalized as

$$H = U\Lambda U^{-1}$$

where $\Lambda$ is the diagonal matrix of real eigenvalues and $U$ an orthogonal and therefore invertible matrix. The fluctuation $\Delta f$ now reads

$$\Delta f \approx \frac{1}{2}\Delta\mathbf{x}^T U\Lambda U^{-1}\Delta\mathbf{x}$$

Introducing the new vector $\mathbf{z}$ such that $\mathbf{z} = U^{-1}\Delta\mathbf{x}$, we obtain

$$\Delta f \approx \frac{1}{2}\mathbf{z}^T\Lambda\mathbf{z} = \frac{1}{2}\operatorname{tr}\Lambda\mathbf{z}\mathbf{z}^T = \frac{1}{2}\left(\lambda_1 z_1^2 + \lambda_2 z_2^2\right)$$

where the properties of the trace of a matrix have been used (Section 2.2.4). If all eigenvalues are positive, the extremum $(x^*, y^*)$ corresponds to a minimum. If they are negative, the extremum corresponds to a maximum. If they are of mixed sign, we are dealing with a saddle point. Figure 3.8 depicts the different cases. These concepts are easily extended to more than two variables.
For $\mathbf{x} = (u, v)$, calculate the Hessian of the function $f(\mathbf{x})$ defined as

$$f(\mathbf{x}) = -\sin u\cos v$$

and map curvature changes. This function is plotted in Figure 3.9 and shows a regular pattern of maxima and minima. Its Hessian matrix is

$$H = \begin{bmatrix} \sin u\cos v & \cos u\sin v \\ \cos u\sin v & \sin u\cos v \end{bmatrix}$$

The eigenvalues of $H$ are $\sin u\cos v \pm \cos u\sin v$, with sum $\operatorname{tr} H = 2\sin u\cos v$; they have the same sign wherever $\det H > 0$. Curvature changes sign whenever the determinant of $H$ vanishes, i.e., for

$$\det H = \sin^2 u\cos^2 v - \cos^2 u\sin^2 v = 0$$

or

$$\det H = (\sin u\cos v - \cos u\sin v)\times(\sin u\cos v + \cos u\sin v) = 0$$
Figure 3.8 Curvature of a function f (x,y). Top: the Hessian H has two negative eigenvalues. Middle: two positive eigenvalues. Bottom: mixed-sign eigenvalues.
and, finally,

$$\det H = \sin(u - v)\times\sin(u + v) = 0$$

which holds for $u - v = k\pi$ or $u + v = k\pi$, with k an integer.
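The eigenvalue test can be illustrated numerically on this function. The sketch below uses the Hessian written above, whose eigenvalues are the diagonal term minus and plus the off-diagonal term; the function names and the three sample points are illustrative, and the classification is meaningful only at points where the gradient vanishes.

```python
import math


def hessian_terms(u, v):
    """Diagonal and off-diagonal terms of the Hessian of -sin(u) cos(v)."""
    a = math.sin(u) * math.cos(v)   # d2f/du2 = d2f/dv2
    b = math.cos(u) * math.sin(v)   # d2f/du dv
    return a, b


def classify(u, v, tol=1e-12):
    """Classify a stationary point from the signs of the two eigenvalues."""
    a, b = hessian_terms(u, v)
    lam1, lam2 = a - b, a + b       # eigenvalues of [[a, b], [b, a]]
    if lam1 > tol and lam2 > tol:
        return "minimum"
    if lam1 < -tol and lam2 < -tol:
        return "maximum"
    if lam1 * lam2 < -tol:
        return "saddle"
    return "degenerate"


# Three stationary points of f(u, v) = -sin(u) cos(v)
for point in [(math.pi / 2, 0.0), (-math.pi / 2, 0.0), (0.0, math.pi / 2)]:
    print(point, classify(*point))
```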
Figure 3.9 Plot and curvature of the function $f(u, v) = -\sin u\cos v$.
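3.2.2 System of implicit non-linear equations: the Newton-Raphson method

The Newton method can be extended to several variables in order to find the zeroes of n functions $f_i$ in n variables $x_i$, which we lump as the vector $\mathbf{x} = [x_1, \ldots, x_n]^T$, i.e., to solve the system of equations

$$f_i(x_1, \ldots, x_n) = f_i(\mathbf{x}) = 0 \qquad (3.2.10)$$

for i = 1, ..., n. We start with an initial guess $\mathbf{x}^{(0)} = [x_1^{(0)}, \ldots, x_n^{(0)}]^T$ of $\mathbf{x}$, expand the function to the first order, and make the result equal to zero. We get

$$f_i\left[x_1^{(1)}, \ldots, x_n^{(1)}\right] = f_i\left[x_1^{(0)}, \ldots, x_n^{(0)}\right] + \operatorname{grad} f_i\left[x_1^{(0)}, \ldots, x_n^{(0)}\right]\cdot\Delta\mathbf{x}^{(1)} = 0$$

where $\Delta\mathbf{x}^{(1)}$ is the vector of the increments $x_i^{(1)} - x_i^{(0)}$ (i = 1, ..., n). Lumping together in expanded form the n equations of this type, we obtain

$$\begin{bmatrix} \dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\ \vdots & & \vdots \\ \dfrac{\partial f_n}{\partial x_1} & \cdots & \dfrac{\partial f_n}{\partial x_n} \end{bmatrix} \begin{bmatrix} x_1^{(1)} - x_1^{(0)} \\ \vdots \\ x_n^{(1)} - x_n^{(0)} \end{bmatrix} = -\begin{bmatrix} f_1[\mathbf{x}^{(0)}] \\ \vdots \\ f_n[\mathbf{x}^{(0)}] \end{bmatrix} \qquad (3.2.11)$$

Let us call $f[\mathbf{x}^{(0)}]$ the n-column vector of the n values of $f_i$ calculated at $\mathbf{x}^{(0)}$, and $D[\mathbf{x}^{(0)}]$ the matrix of the derivatives. We assume that $D[\mathbf{x}^{(0)}]$ is non-singular, i.e., that its determinant (the Jacobian of the $\mathbf{x} \to f$ transformation) is different from zero. The increment $\Delta\mathbf{x}^{(1)}$ is calculated as

$$\Delta\mathbf{x}^{(1)} = -D^{-1}[\mathbf{x}^{(0)}]\, f[\mathbf{x}^{(0)}] \qquad (3.2.12)$$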
and the iteration repeated as far as needed. This formula extends equation (3.1.29) to multiple dimensions. Although several examples implementing the Newton-Raphson method for the computation of chemical equilibrium are developed in Chapter 6, we will now present some simple applications that illustrate its basic principles.

Let us assume that, at high temperature and ambient pressure, the binary system albite-anorthite (ab-an) is ideal. The temperature $T_f$ and enthalpy $\Delta H_f$ of melting of each component are

$$T_f^{ab} = 1373\ \text{K}, \qquad \Delta H_f^{ab} = 64.3\ \text{kJ mol}^{-1}$$

$$T_f^{an} = 1830\ \text{K}, \qquad \Delta H_f^{an} = 133.0\ \text{kJ mol}^{-1}$$

Assuming that $\Delta H_f$ are constant, calculate the composition of the solid and liquid coexisting at T = 1600 K. The variable X referring to mole fractions, equilibrium is achieved when the chemical potentials $\mu$ of each component are equal in each phase

$$\mu_{liq}^{ab}(0) + RT\ln X_{liq}^{ab} = \mu_{sol}^{ab}(0) + RT\ln X_{sol}^{ab}$$

$$\mu_{liq}^{an}(0) + RT\ln X_{liq}^{an} = \mu_{sol}^{an}(0) + RT\ln X_{sol}^{an}$$

where $R$ is the gas constant and $\mu(0)$ the Gibbs energy of the standard state (pure phase or end-member) at the same temperature and pressure. Closure requires the conditions

$$X_{liq}^{ab} + X_{liq}^{an} = 1, \qquad X_{sol}^{ab} + X_{sol}^{an} = 1$$

Equilibrium conditions can be recast into two equations in the two independent variables $X_{liq}^{an}$ and $X_{sol}^{an}$

$$f_1 = \mu_{liq}^{ab}(0) - \mu_{sol}^{ab}(0) + RT\left[\ln(1 - X_{liq}^{an}) - \ln(1 - X_{sol}^{an})\right] = 0$$

$$f_2 = \mu_{liq}^{an}(0) - \mu_{sol}^{an}(0) + RT\left[\ln X_{liq}^{an} - \ln X_{sol}^{an}\right] = 0 \qquad (3.2.14)$$

We recognize in the first two terms of each equation the Gibbs energy of melting $\Delta G_f$ of each end-member which, assuming that $\Delta H_f$ is constant, can be expressed as

$$\Delta G_f = \Delta H_f\left(1 - \frac{T}{T_f}\right)$$

and therefore

$$\Delta G_f^{ab} = 64.3\times(1 - 1600/1373) = -10.6\ \text{kJ mol}^{-1}$$

$$\Delta G_f^{an} = 133\times(1 - 1600/1830) = +16.7\ \text{kJ mol}^{-1}$$
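Let us define $\mathbf{x} = [X_{liq}^{an}, X_{sol}^{an}]^T$ and $f(\mathbf{x}) = [f_1(\mathbf{x}), f_2(\mathbf{x})]^T$. The matrix of partial derivatives $D(\mathbf{x})$ is

$$D(\mathbf{x}) = \begin{bmatrix} \dfrac{\partial f_1}{\partial X_{liq}^{an}} & \dfrac{\partial f_1}{\partial X_{sol}^{an}} \\ \dfrac{\partial f_2}{\partial X_{liq}^{an}} & \dfrac{\partial f_2}{\partial X_{sol}^{an}} \end{bmatrix} = \begin{bmatrix} -\dfrac{RT}{1 - X_{liq}^{an}} & \dfrac{RT}{1 - X_{sol}^{an}} \\ \dfrac{RT}{X_{liq}^{an}} & -\dfrac{RT}{X_{sol}^{an}} \end{bmatrix} \qquad (3.2.15)$$

Table 3.11. Iterative calculation of the solution to equation (3.2.14) through the Newton-Raphson method. The vector x is the set of the two variables X(liq, an) and X(sol, an); the starting value is x(0) = (0.6, 0.3). Columns D11 to D22 are the elements of D(n), -dx = D^-1 f, x(n+1) = x(n) + dx, and s = f^T f.

n   f1(n)        f2(n)        D11      D12     D21     D22      -dx1       -dx2       x1(n+1)   x2(n+1)   s
0   -18075       25937        -33257   19004   22171   -44343   0.29297    -0.43843   0.30703   0.73843   9.99 x 10^8
1   2329.8       5041.5       -19197   50857   43328   -18015   0.16061    0.10644    0.14642   0.63199   3.08 x 10^7
2   561.32       -2738.5      -15585   36148   90857   -21049   -0.02949   0.00281    0.17591   0.62917   7.81 x 10^6
3   -7.7393      -238.17      -16142   35874   75625   -21143   -0.00367   -0.00187   0.17958   0.63104   5.68 x 10^4
4   0.0369       -2.7995      -16215   36055   74079   -21081   -0.00004   -0.00002   0.17962   0.63106   7.84 x 10^0
5   -1.88x10^-6  -3.76x10^-4  -16216   36057   74061   -21080   5.8x10^-9  2.7x10^-9  0.17962   0.63106   1.41 x 10^-7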
Let us choose the initial guess $\mathbf{x}^{(0)} = (0.6, 0.3)$. Successive steps produce the results shown in Table 3.11. Figure 3.10 shows the Gibbs free enthalpy of the liquid and solid mixtures together with the final result $\mathbf{x}^{(6)} = (0.17962, 0.63106)$. The last column in Table 3.11 lists the squared modulus $s = f^T f$ of the vector $f$ as a convenient measure of convergence.

Not all calculations converge so nicely, especially when the derivatives of higher order are large and variable. The choice of variables as well as their value at the starting point may turn out to be critical in achieving reasonable convergence.
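The albite-anorthite calculation can be sketched in Python as follows. The value R = 8.314 J mol^-1 K^-1 is an assumption (the text does not quote the value it uses), the function names are illustrative, and the 2 x 2 system is inverted in closed form rather than with a general solver.

```python
import math

R = 8.314                                   # gas constant, J/(mol K); assumed value
T = 1600.0                                  # temperature, K
DG_AB = 64300.0 * (1.0 - T / 1373.0)        # Gibbs energy of melting of albite, J/mol
DG_AN = 133000.0 * (1.0 - T / 1830.0)       # Gibbs energy of melting of anorthite, J/mol


def residuals(x_liq, x_sol):
    """The two equilibrium equations f1, f2 in the anorthite mole fractions."""
    f1 = DG_AB + R * T * (math.log(1.0 - x_liq) - math.log(1.0 - x_sol))
    f2 = DG_AN + R * T * (math.log(x_liq) - math.log(x_sol))
    return f1, f2


def jacobian(x_liq, x_sol):
    """Matrix of partial derivatives D(x), returned row by row."""
    return ((-R * T / (1.0 - x_liq), R * T / (1.0 - x_sol)),
            (R * T / x_liq, -R * T / x_sol))


def newton_raphson(x_liq, x_sol, n_iter=6):
    """Repeat x <- x - D^-1 f for a fixed number of iterations."""
    for _ in range(n_iter):
        f1, f2 = residuals(x_liq, x_sol)
        (a, b), (c, d) = jacobian(x_liq, x_sol)
        det = a * d - b * c
        x_liq -= (d * f1 - b * f2) / det    # first component of D^-1 f
        x_sol -= (a * f2 - c * f1) / det    # second component of D^-1 f
    return x_liq, x_sol


x_liq, x_sol = newton_raphson(0.6, 0.3)
print(round(x_liq, 5), round(x_sol, 5))
```

The sketch has no safeguard against iterates leaving the interval (0, 1), where the logarithms are undefined, so the starting point matters.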
3.2.3 Extrema: the steepest-descent method

A large category of problems consists in finding the extremum of a function with respect to several variables. Finding the maximum of a function $f(\mathbf{x})$ is equivalent to finding the minimum of $-f(\mathbf{x})$, so the discussion will be restricted to the search for minima. Let us assume that $f(\mathbf{x})$ is a function in the n variables $x_1, x_2, \ldots, x_n$ collected into the vector $\mathbf{x}$. Since the $\operatorname{grad} f(\mathbf{x})$ vector points towards the maximum increase of $f(\mathbf{x})$ (Figure 3.11), minimizing $f(\mathbf{x})$ may be iteratively achieved using

$$\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} - \alpha\operatorname{grad} f[\mathbf{x}^{(k)}] \qquad (3.2.20)$$

where $\alpha$ is a constant. The linear search for the optimum value $\alpha_m$ of $\alpha$ is carried out either by bisection or by a more efficient method such as Davidon's cubic interpolation (e.g., Walsh, 1975; Fletcher, 1987). A measure of how fast $f(\mathbf{x})$ decreases from $\mathbf{x}^{(k)}$ to $\mathbf{x}^{(k+1)}$ is the scalar $g^{(k+1)}$, such that

$$g^{(k+1)} = \operatorname{grad} f[\mathbf{x}^{(k+1)}]\cdot\operatorname{grad} f[\mathbf{x}^{(k)}] \qquad (3.2.21)$$

Figure 3.10 Computation of equilibrium concentrations for coexisting solid and liquid solutions.

Figure 3.11 The steepest-descent method: the search in one direction is discontinued when no further decrease is possible, i.e., when the search direction is parallel to the local contour line. The next step starts in a perpendicular direction, i.e., in the direction opposite to the local gradient.
Table 3.12. Search for the minimum of the function $f(\mathbf{x}) = \exp(x_1^2 + 2x_2^2)$ by the method of the steepest descent. The scalar $\alpha$, equation (3.2.20), is estimated by crude linear search; g is the convergence criterion given by equation (3.2.21). Rows marked with an asterisk refer to the minimum along the search direction.

alpha    x1        x2        f         df/dx1    df/dx2     g
0        1.0       -1.0      20.0855   40.171    -80.342    8069
0.002    0.9197    -0.8393   9.5322
0.004    0.8393    -0.6786   5.0811
0.006    0.7590    -0.5179   3.0422
0.008    0.6786    -0.3573   2.0459
0.010    0.5983    -0.1966   1.5453
0.012    0.5179    -0.0359   1.3111
0.014    0.4376    0.1248    1.2494    1.0935    0.6236     -6.18     *
0.016    0.3573    0.2855    1.3373

0        0.4376    0.1248    1.2494    1.0935    0.6236     -6.18
0.05     0.3829    0.0936    1.1784
0.10     0.3283    0.0624    1.1225
0.15     0.2736    0.0312    1.0798
0.20     0.2189    0.0001    1.0491
0.25     0.1642    -0.0311   1.0293
0.30     0.1096    -0.0623   1.0200    0.2235    -0.2542    0.0859    *
0.35     0.0549    -0.0935   1.0207

0        0.1096    -0.0623   1.0200    0.2235    -0.2542    0.0859
0.1      0.0872    -0.0369   1.0104
0.2      0.0649    -0.0115   1.0045
0.3      0.0425    0.0140    1.0022    0.0852    0.0559     0.0048    *
0.4      0.0202    0.0394    1.0035

0        0.0425    0.0140    1.0022    0.0852    0.0559     0.0048
0.15     0.0297    0.0123    1.0012
0.30     0.0169    0.0130    1.0006
0.45     0.0042    0.0137    1.0004    0.0083    0.0549     0.0038    *
0.60     -0.0086   0.0144    1.0005
At the minimum along the kth search direction, the old and the new directions should be orthogonal, i.e.,

$$\operatorname{grad} f[\mathbf{x}^{(k+1)}]\cdot\operatorname{grad} f[\mathbf{x}^{(k)}] = 0 \qquad (3.2.22)$$

Calculate by the steepest-descent method the minimum of the function

$$f(\mathbf{x}) = \exp(x_1^2 + 2x_2^2)$$

using the starting point (+1, -1). The solution is obviously (0, 0). The first four stages of linear search are given in Table 3.12.
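A small Python sketch of the method on this example follows. It replaces the crude linear search of Table 3.12 by a bracketing step followed by a golden-section refinement, which is a swapped-in technique and not the one used for the table; all function names, the initial step and the tolerances are illustrative assumptions.

```python
import math


def f(x):
    return math.exp(x[0] ** 2 + 2.0 * x[1] ** 2)


def grad_f(x):
    fx = f(x)
    return [2.0 * x[0] * fx, 4.0 * x[1] * fx]


def line_search(x, g, step=1e-3, tol=1e-10):
    """Minimize f(x - alpha g) over alpha >= 0 for a unimodal profile."""
    def phi(alpha):
        return f([x[0] - alpha * g[0], x[1] - alpha * g[1]])

    hi = step
    while phi(2.0 * hi) < phi(hi):      # keep doubling while f still decreases
        hi *= 2.0
    a, b = 0.0, 2.0 * hi                # the minimum lies in [0, 2 hi]
    r = (math.sqrt(5.0) - 1.0) / 2.0    # golden-section ratio
    while b - a > tol:
        c, d = b - r * (b - a), a + r * (b - a)
        if phi(c) < phi(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


def steepest_descent(x0, n_stages=30, gtol=1e-8):
    """Repeat x <- x - alpha grad f with alpha from the line search."""
    x = list(x0)
    for _ in range(n_stages):
        g = grad_f(x)
        if math.hypot(g[0], g[1]) < gtol:
            break
        alpha = line_search(x, g)
        x = [x[0] - alpha * g[0], x[1] - alpha * g[1]]
    return x


x_min = steepest_descent([1.0, -1.0])
print(x_min, f(x_min))
```

Successive search directions are nearly perpendicular, which gives the zigzag path sketched in Figure 3.11.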
The steepest-descent method does converge towards the expected solution but convergence is slow in the vicinity of the minimum. In order to scale variations, we can use a second-order method. The most straightforward method consists in applying the Newton-Raphson scheme to the gradient vector of the function $f$ to be minimized. Since the gradient is zero at the minimum, we can use the updating scheme

$$\operatorname{grad} f[\mathbf{x}^{(k+1)}] = \operatorname{grad} f[\mathbf{x}^{(k)}] + H^{(k)}\left[\mathbf{x}^{(k+1)} - \mathbf{x}^{(k)}\right] = 0$$

or

$$\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} - \left[H^{(k)}\right]^{-1}\operatorname{grad} f[\mathbf{x}^{(k)}] \qquad (3.2.23)$$
where $H^{(k)}$ is the curvature matrix (Hessian) of $f$ at step k. Equation (3.2.23) is the extension of equation (3.1.30) to multiple dimensions. As will be seen in Chapter 5, this method is extremely useful for refinement near the minimum but may otherwise run away towards any point where the gradient vector vanishes, such as saddle points or even maxima. This inconvenience may be overcome by using the mixed scheme due to Marquardt (1963), which is written

$$\operatorname{grad} f[\mathbf{x}^{(k)}] + \left[H^{(k)} + \alpha I_n\right]\left[\mathbf{x}^{(k+1)} - \mathbf{x}^{(k)}\right] = 0$$

where $\alpha$ is a parameter and $I_n$ the (n x n) identity matrix. This scheme allows the user to shift from the most reliable gradient method far from the minimum, for large values of $\alpha$, to a fast-converging Newton-Raphson method near the minimum, for small values of $\alpha$ (Fletcher, 1987). Other methods, which also have the property of accelerated convergence near the minimum, take advantage of how the gradient varies locally, or, in other words, build up a local second-order approximation to the function $f$. The most commonly applied methods are those of conjugate gradients due to Fletcher and Reeves and the variable metric methods, e.g., the Davidon-Fletcher-Powell (DFP) and Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithms. These methods require more substantial theoretical developments, which may be found in the books by Walsh (1975), Press et al. (1986), and Fletcher (1987), and are found in most major software packages such as MatLab.

3.2.4 Constrained minimization

Constraints are relations of equality or inequality, which must be exactly obeyed by the unknowns of a model. A familiar example is the mineral abundances in a rock or the end-member proportions in a mixture, which must sum up to unity whatever the errors on the data.

• When the objective function to be minimized is linear, the problem is relatively simple (Figure 3.12) and is the substance of linear programming.
Let us take a simple example in n = 2 dimensions and assume that we are looking for the minimum of a linear function $f(\mathbf{x})$, where $\mathbf{x}$ is the vector $[x_1, x_2]^T$, with the equality constraint $\mathbf{b}^T\mathbf{x} = q$ ($\mathbf{b}$ and q being known constants) and the inequality constraints $x_1 \ge 0$, $x_2 \ge 0$. In geochemistry or thermodynamics, the equality constraints are typically conservation equations, while phase or end-member proportions are usually required to be non-negative. Constant values of $f(\mathbf{x})$ define straight lines in the plane $(x_1, x_2)$. The inequality constraints split the space in two subspaces. The feasible set of solutions is the convex subspace that comprises all the values of the vector $\mathbf{x}$ which satisfy both the equality and inequality constraints. A corner is a vector of n values (a point), an edge is a segment associated with a linear relationship between the n variables. In the case of Figure 3.12, we see that the feasible set is a segment of a straight line. It could be an empty set if the equality constraint was entirely contained in the forbidden subspace of the inequality constraints. The vector corresponding to the minimum, if it exists, may be shown to occupy a corner of the feasible set (e.g., Strang, 1976). The idea of the simplex method is to select any corner of the feasible set and to proceed from corner to corner along the edges in a direction that minimizes the function $f(\mathbf{x})$ until no further reduction can be achieved. A linear programming solution devised by Wright and Doherty (1970) is widely used in the literature for the problem of finding modal abundances, while an example of free energy minimization will be discussed in Chapter 6.

Figure 3.12 Minimum of the function $f(\mathbf{x})$ submitted to the equality constraint $\mathbf{b}^T\mathbf{x} = q$, and to the inequality constraints $x_1 \ge 0$, $x_2 \ge 0$. The feasible set is the segment of the constraint line located in the positive quadrant. The minimum occupies an edge of the feasible set.

When constraints are equality only, the method of Lagrange multipliers is of broad applicability. Let us consider, as in Figure 3.13, a function $f(\mathbf{x})$ to be minimized on which we impose the constraint that $g(\mathbf{x}) = 0$. From the previous discussion on the gradient properties, we know that $-\operatorname{grad} f$ is a vector perpendicular to the curves of constant $f(\mathbf{x})$ pointing towards decreasing values, while $\operatorname{grad} g$ is orthogonal to the locus of $\mathbf{x}$ vectors such as $g(\mathbf{x}) = 0$.
The orthogonal projection of $-\operatorname{grad} f$ on the constraint $g(\mathbf{x}) = 0$ represents a direction of decreasing $f(\mathbf{x})$. A minimum will be reached once $-\operatorname{grad} f$ and $\operatorname{grad} g$ are collinear, i.e.,

$$\operatorname{grad} f + \lambda\operatorname{grad} g = 0 \qquad (3.2.24)$$

where $\lambda$ is a constant called a Lagrange multiplier. If more than one constraint should be obeyed, as many Lagrange multipliers as there are constraints should be used. In practice, this procedure amounts to increasing the number of independent variables in the system by as many new variables as there are constraints. Equation (3.2.24) is indeed equivalent to finding the minimum of

$$S(\mathbf{x}, \lambda) = f(\mathbf{x}) + \lambda g(\mathbf{x}) \qquad (3.2.25)$$

The derivatives with respect to $\mathbf{x}$ produce the set of equations represented by

$$\operatorname{grad} S = \operatorname{grad} f + \lambda\operatorname{grad} g = 0 \qquad (3.2.26)$$

while the derivative with respect to $\lambda$ gives

$$g(\mathbf{x}) = 0 \qquad (3.2.27)$$

which is precisely the constraint which we want to be verified.

Figure 3.13 Constrained minimization: the minimum of a function $f(\mathbf{x})$ submitted to the constraint $g(\mathbf{x}) = 0$ occurs at M on the constraint subspace, here on the curve $g(\mathbf{x}) = 0$ where $\nabla f(\mathbf{x}) + \lambda\nabla g(\mathbf{x}) = 0$. P is the unconstrained minimum of $f(\mathbf{x})$. This principle is the base for the method of Lagrange multipliers.
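Find the minimum of

$$f(x, y) = x^2 + y^2$$

submitted to the constraint

$$x + y = 1$$

This problem amounts to minimizing

$$S = x^2 + y^2 + \lambda(x + y - 1)$$

where $\lambda$ is a Lagrange multiplier, relative to x, y, and $\lambda$. The derivatives relative to each variable cancel at the minimum

$$\frac{\partial S}{\partial x} = 2x + \lambda = 0, \qquad \frac{\partial S}{\partial y} = 2y + \lambda = 0, \qquad\text{and}\qquad \frac{\partial S}{\partial\lambda} = x + y - 1 = 0$$

Adding the first two equations and subtracting the third twice results in $\lambda = -1$, $x = 1/2$, and $y = 1/2$.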
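Because the three conditions are linear in x, y and the multiplier, they can be checked by solving a 3 x 3 linear system. The sketch below uses exact fractions and a small Gauss-Jordan elimination; the helper name `solve_linear` is illustrative, and the elimination is only one of several ways to solve such a system.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Gauss-Jordan elimination with partial pivoting on exact fractions."""
    n = len(b)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Unknowns ordered as (x, y, lambda):
#   2x      + lambda = 0
#        2y + lambda = 0
#   x +  y           = 1
coefficients = [[2, 0, 1],
                [0, 2, 1],
                [1, 1, 0]]
right_hand_side = [0, 0, 1]
x, y, lam = solve_linear(coefficients, right_hand_side)
print(x, y, lam)
```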
Distribution of energy states. According to quantum theory, the energy states $\varepsilon_0, \varepsilon_1, \varepsilon_2, \ldots$ that atoms in a gas, a liquid or a crystal can reach are distinct and have an equal probability of being taken by an atom. Standard textbooks (e.g., Swalin, 1962) show that the entropy $S$ of a population of N atoms, $n_i$ being in the energy state $\varepsilon_i$, is

$$S = -k\sum_i n_i\ln\frac{n_i}{N}$$

where k is the Boltzmann constant. Find the values of $n_i$ for maximum entropy $S$ for constant total energy E. The first constraint is a fixed total number of atoms

$$\sum_i n_i = N \quad\text{or}\quad \sum_i dn_i = 0 \qquad (3.2.28)$$

and the second is a fixed total energy

$$\sum_i \varepsilon_i n_i = E \quad\text{or}\quad \sum_i \varepsilon_i\, dn_i = 0 \qquad (3.2.29)$$

Let us seek the extremum of the function $S^+$ given by

$$S^+ = S + \alpha\left(\sum_i \varepsilon_i n_i - E\right) + \beta\left(\sum_i n_i - N\right)$$

where $\alpha$ and $\beta$ are two Lagrange multipliers. We first observe that, N being fixed,

$$\sum_i n_i\, d\ln\frac{n_i}{N} = \sum_i dn_i = 0$$

and therefore

$$dS = -k\sum_i \ln\frac{n_i}{N}\, dn_i$$

Each derivative relative to $n_0, n_1, n_2, \ldots$ must cancel, so for any state i

$$k\ln\frac{n_i}{N} - \alpha\varepsilon_i - \beta = 0$$

which we rearrange as

$$\frac{n_i}{N} = e^{\beta/k}\, e^{\alpha\varepsilon_i/k}$$

Summing up over all the energy states and applying the constraint (3.2.28), we get

$$e^{\beta/k}\sum_i e^{\alpha\varepsilon_i/k} = 1$$

and therefore

$$\frac{n_i}{N} = \frac{e^{\alpha\varepsilon_i/k}}{\sum_j e^{\alpha\varepsilon_j/k}}$$

Introducing this expression into $dS$ and equating to $\left(\sum_j \varepsilon_j\, dn_j\right)/T$ would show that $\alpha$ is equal to $-T^{-1}$ and produce the familiar Boltzmann distribution.

When the minimum of a function $f(\mathbf{x})$ is sought not algebraically, as in these examples, but numerically, different techniques exist, which are described in a specific and abundant literature. Once more, a simple approach makes use of the minimization properties of the gradient direction. The so-called gradient projection method consists in using the projection of the gradient direction onto the constraint subspace as the search direction (Figure 3.14). It works best with linear constraints and an example will be given in Chapter 6.
Figure 3.14 Search direction for the minimum of a function $f(x_1, x_2)$ submitted to the constraint $g(x_1, x_2) = 0$. The optimum direction is the projection of the opposite of the gradient onto the constraint subspace.
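3.2.5 The Runge-Kutta method for a system of differential equations

The single-variable method can be extended to a system comprising any number of differential equations. The Runge-Kutta method is commonly found in software packages. For simplicity, it will be described for a system of only two equations

$$\frac{dx}{dt} = x'_t = f(t, x, y)$$

$$\frac{dy}{dt} = y'_t = g(t, x, y) \qquad (3.2.30)$$

where $f$ and $g$ are two known functions of t, x, and y. Let us define

$$h^{(n)} = t^{(n+1)} - t^{(n)}$$

Given the intermediate evaluations

$$k_1 = h^{(n)} f\left[t^{(n)}, x^{(n)}, y^{(n)}\right], \qquad l_1 = h^{(n)} g\left[t^{(n)}, x^{(n)}, y^{(n)}\right]$$

$$k_2 = h^{(n)} f\left[t^{(n)} + \frac{h^{(n)}}{2}, x^{(n)} + \frac{k_1}{2}, y^{(n)} + \frac{l_1}{2}\right], \qquad l_2 = h^{(n)} g\left[t^{(n)} + \frac{h^{(n)}}{2}, x^{(n)} + \frac{k_1}{2}, y^{(n)} + \frac{l_1}{2}\right]$$

$$k_3 = h^{(n)} f\left[t^{(n)} + \frac{h^{(n)}}{2}, x^{(n)} + \frac{k_2}{2}, y^{(n)} + \frac{l_2}{2}\right], \qquad l_3 = h^{(n)} g\left[t^{(n)} + \frac{h^{(n)}}{2}, x^{(n)} + \frac{k_2}{2}, y^{(n)} + \frac{l_2}{2}\right]$$

$$k_4 = h^{(n)} f\left[t^{(n)} + h^{(n)}, x^{(n)} + k_3, y^{(n)} + l_3\right], \qquad l_4 = h^{(n)} g\left[t^{(n)} + h^{(n)}, x^{(n)} + k_3, y^{(n)} + l_3\right]$$

the values of x and y at time $t^{(n+1)}$ are

$$x^{(n+1)} = x^{(n)} + \frac{1}{6}\left(k_1 + 2k_2 + 2k_3 + k_4\right)$$

$$y^{(n+1)} = y^{(n)} + \frac{1}{6}\left(l_1 + 2l_2 + 2l_3 + l_4\right)$$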
Solve the following equation, which appears in connection with some diffusion problems, in C(t) from t = 0 to t = 0.5

$$C'' + 2tC' - 0.5C = 0 \qquad (3.2.31)$$

with the conditions C = 1 and C' = 0 at t = 0. Let us use a time step h of 0.1 and define x = C, y = C'. Equation (3.2.31) becomes a system of first-order equations

$$x' = y$$

$$y' = 0.5x - 2ty \qquad (3.2.32)$$

with the conditions x = 1 and y = 0 at t = 0. With reference to equation (3.2.30), the functions $f$ and $g$ are defined as

$$f(t, x, y) = y$$

$$g(t, x, y) = 0.5x - 2ty \qquad (3.2.33)$$
Let us work out the first step in detail using $t^{(0)} = 0$, $x^{(0)} = 1$, $y^{(0)} = 0$:

$$k_1 = 0.1\times 0 = 0$$

$$l_1 = 0.1\times(0.5\times 1 - 2\times 0\times 0) = 0.05$$

$$k_2 = 0.1\times\left(0 + \frac{0.05}{2}\right) = 0.0025$$

$$l_2 = 0.1\times\left[0.5\left(1 + \frac{0}{2}\right) - 2\left(0 + \frac{0.1}{2}\right)\left(0 + \frac{0.05}{2}\right)\right] = 0.04975$$

$$k_3 = 0.1\times\left(0 + \frac{0.04975}{2}\right) = 0.002488$$

$$l_3 = 0.1\times\left[0.5\left(1 + \frac{0.0025}{2}\right) - 2\left(0 + \frac{0.1}{2}\right)\left(0 + \frac{0.04975}{2}\right)\right] = 0.049814$$

$$k_4 = 0.1\times(0 + 0.049814) = 0.004981$$

$$l_4 = 0.1\times\left[0.5(1 + 0.002488) - 2(0 + 0.1)\times(0 + 0.049814)\right] = 0.049128$$

which gives the x and y values at the next step

$$x^{(1)} = 1 + \frac{0 + 2\times 0.0025 + 2\times 0.002488 + 0.004981}{6} = 1.0025$$

$$y^{(1)} = 0 + \frac{0.05 + 2\times 0.04975 + 2\times 0.049814 + 0.049128}{6} = 0.0497$$

Table 3.13. The first six iterations (n = 0, ..., 5) of the Runge-Kutta solution to equation (3.2.31) for a time-step of 0.1.

n   t(n)   x(n)     y(n)     k1       l1       k2       l2       k3       l3       k4       l4       x(n+1)   y(n+1)
0   0      1        0        0        0.0500   0.0025   0.0498   0.0025   0.0498   0.0050   0.0491   1.0025   0.0497
1   0.1    1.0025   0.0497   0.0050   0.0491   0.0074   0.0480   0.0074   0.0481   0.0098   0.0466   1.0099   0.0977
2   0.2    1.0099   0.0977   0.0098   0.0466   0.0121   0.0447   0.0120   0.0448   0.0142   0.0425   1.0219   0.1424
3   0.3    1.0219   0.1424   0.0142   0.0426   0.0164   0.0400   0.0162   0.0401   0.0183   0.0373   1.0382   0.1824
4   0.4    1.0382   0.1824   0.0182   0.0373   0.0201   0.0343   0.0200   0.0345   0.0217   0.0312   1.0582   0.2167
5   0.5    1.0582   0.2167   0.0217   0.0312   0.0232   0.0279   0.0231   0.0281   0.0245   0.0247   1.0813   0.2447
Table 3.13 gives the results up to t = 0.5.

Convective dispersal of a conservative tracer in a velocity field. This calculation is of major interest for many problems such as the geochemical evolution of the mantle, the mixing time of the ocean, the understanding of magma mixing processes, ... Justification of the equations used is presented in Chapter 8. Solve for t = 0.02, 0.04, 0.06, ... the system of differential equations

$$\frac{dx}{dt} = -\pi \sin \pi x \, \sin\left(\frac{\pi}{2} - \pi y\right), \qquad \frac{dy}{dt} = \frac{\pi}{2} \cos \pi x \, \cos\left(\frac{\pi}{2} - \pi y\right) \tag{3.2.34}$$

given x = y = 0.1 at t = 0. It is left to the reader to build Table 3.14.

Table 3.14. The first iterations (rows n = 0, ..., 5) of the Runge-Kutta solution to equation (3.2.34) for a time step of 0.02.

n  t(n)  x(n)    y(n)    f        g       k1       l1      k2       l2      k3       l3      k4       l4      x(n+1)  y(n+1)
0  0     0.1     0.1     -0.9233  0.4616  -0.0185  0.0092  -0.0167  0.0097  -0.0169  0.0097  -0.0153  0.0103  0.0832  0.1097
1  0.02  0.0832  0.1097  -0.7638  0.5129  -0.0153  0.0103  -0.0138  0.0108  -0.0139  0.0108  -0.0126  0.0113  0.0693  0.1205
2  0.04  0.0693  0.1205  -0.6303  0.5670  -0.0126  0.0113  -0.0114  0.0119  -0.0115  0.0119  -0.0104  0.0125  0.0578  0.1324
3  0.06  0.0578  0.1324  -0.5191  0.6244  -0.0104  0.0125  -0.0094  0.0131  -0.0095  0.0131  -0.0085  0.0137  0.0484  0.1455
4  0.08  0.0484  0.1455  -0.4268  0.6854  -0.0085  0.0137  -0.0077  0.0143  -0.0078  0.0144  -0.0070  0.0150  0.0406  0.1599
5  0.10  0.0406  0.1599  -0.3506  0.7501  -0.0070  0.0150  -0.0063  0.0157  -0.0064  0.0157  -0.0057  0.0164  0.0343  0.1756

3.2.6 Interpolation with spline functions

When a bi-dimensional table of data with unevenly distributed coordinates is available (e.g., data points on a map), the need for computing interpolated values is frequently encountered. A most common application is the drawing of contour maps (such as isopleths). Several procedures exist but the success of cubic splines makes this technique easily available from software packages. Two-dimensional interpolation is carried out as a successive construction of one-dimensional splines along rows followed by one-dimensional splines along columns (e.g., Press et al., 1986). Other useful spline variants in multiple dimensions are described by Sandwell (1987).

3.3 Partial differential equations: the finite differences method
Partial differential equations (PDE) involve relationships between partial derivatives of a function. One of the most common PDE problems to be solved in geochemistry is the diffusion equation, which, in the jargon of PDE specialists, is said to be parabolic because of the values of the coefficients of its derivatives, in contrast with elliptic equations (e.g., the Laplace equation in a square) or hyperbolic equations (e.g., the wave equation). Many methods exist which can solve these problems, some of which appeared in the last 20 years, e.g., the collocation and least-square methods (Finlayson, 1972), the variational finite-element method (Zienkiewicz, 1977) and the multigrid method (Hackbusch, 1985). However, the easiest route is by far the finite difference method (e.g., Mitchell, 1969). Although computational effort with finite differences is small, the results, surprisingly enough, are often quite satisfactory. In addition, although rather slow, finite differences are easy to learn and easy to implement. The increasing power of computers makes it possible to consider this method as a competitor to the most efficient and sophisticated methods, which are sometimes difficult to master for a non-specialist and always time-consuming to implement. Only for natural objects with irregular geometry, such as plutons, could alternatives to finite differences result in a net gain in the time and effort spent in solving the problem.
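3.3.1 One-dimensional diffusion problems: general

Let us consider the diffusion of a homogeneously distributed substance out of a slab with unit thickness in the direction x and infinite size in the other two dimensions. C(x, t) being the concentration at time t at a distance x from the origin, the diffusion equation is

$$\frac{\partial C(x,t)}{\partial t} = \mathcal{D} \frac{\partial^2 C(x,t)}{\partial x^2} \tag{3.3.1}$$

where $\mathcal{D}$ is the diffusion coefficient of the substance. The following initial condition holds

$$C(x, 0) = C_0(x)$$

while at the boundaries, the conditions are

$$C(0, t) = C(1, t) = 0$$

The first step is to discretize the (x, t) domain using a uniform mesh

$$x = i\,\Delta x, \quad i = 0, 1, \ldots, n = 1/\Delta x \qquad t = j\,\Delta t, \quad j = 0, 1, \ldots, \infty$$

and call $C_i^j$ the concentration at the mesh point (node) $x = i\,\Delta x$, $t = j\,\Delta t$ (here i and j do not have their usual meaning of element and phase). Let us define the central difference operator $\delta_x$ as

$$\delta_x C_i^j = C_{i+1/2}^j - C_{i-1/2}^j$$

From this definition, the second-order difference is

$$\delta_x^2 C_i^j = \delta_x\left(\delta_x C_i^j\right) = \delta_x\left(C_{i+1/2}^j - C_{i-1/2}^j\right)$$

or

$$\delta_x^2 C_i^j = C_{i-1}^j - 2C_i^j + C_{i+1}^j \tag{3.3.2}$$

In the finite difference schemes, the derivatives are replaced by difference operators, e.g., the first derivative

$$\frac{\partial C(x,t)}{\partial x} \approx \frac{\delta_x C_i^j}{\Delta x} = \frac{C_{i+1/2}^j - C_{i-1/2}^j}{\Delta x} \tag{3.3.3}$$

and

$$\frac{\partial^2 C(x,t)}{\partial x^2} \approx \frac{\delta_x\left(\delta_x C_i^j\right)}{\Delta x^2} = \frac{C_{i-1}^j - 2C_i^j + C_{i+1}^j}{\Delta x^2} \tag{3.3.4}$$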
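The time derivative is evaluated at mid-point between t and t + Δt, i.e.,

$$\frac{\partial C(x,t)}{\partial t} \approx \frac{\delta_t C_i^{j+1/2}}{\Delta t} = \frac{C_i^{j+1} - C_i^j}{\Delta t} \tag{3.3.5}$$

and the second derivative with respect to x is replaced by a linear combination of the second-order finite differences at t and t + Δt

$$\frac{\partial^2 C(x,t)}{\partial x^2} \approx \theta \frac{\delta_x^2 C_i^{j+1}}{\Delta x^2} + (1 - \theta) \frac{\delta_x^2 C_i^j}{\Delta x^2} \tag{3.3.6}$$

where θ is a parameter defined over [0, 1]. Rewriting the diffusion equation into a partial difference equation leads to

$$\frac{C_i^{j+1} - C_i^j}{\Delta t} = \mathcal{D}\left[\theta \frac{\delta_x^2 C_i^{j+1}}{\Delta x^2} + (1 - \theta) \frac{\delta_x^2 C_i^j}{\Delta x^2}\right] \tag{3.3.7}$$

Two well-known schemes can be derived from this equation. For θ = 0, the second-order difference is evaluated at t = jΔt and the scheme is called explicit

$$\frac{C_i^{j+1} - C_i^j}{\Delta t} = \mathcal{D} \frac{C_{i-1}^j - 2C_i^j + C_{i+1}^j}{\Delta x^2}$$

The 'molecule', which defines the nodes involved in the calculation for a pair of i, j values, is shown in Figure 3.15. Defining the parameter r as

$$r = \frac{\mathcal{D}\,\Delta t}{\Delta x^2} \tag{3.3.8}$$

and recombining

$$C_i^{j+1} = r C_{i-1}^j + (1 - 2r) C_i^j + r C_{i+1}^j \tag{3.3.9}$$

or equivalently, in terms of the central difference operator

$$C_i^{j+1} = \left(1 + r\,\delta_x^2\right) C_i^j$$

For r ≤ 1/2, the explicit scheme is shown to be stable, i.e., it does not make errors grow exponentially with time (e.g., Mitchell, 1969). Alternatively, θ = 1/2, i.e., evaluation of the second-order difference as an average of the values at t and t + Δt, leads to the robust Crank-Nicholson scheme

$$C_i^{j+1} - C_i^j = \frac{r}{2}\left(C_{i-1}^j - 2C_i^j + C_{i+1}^j + C_{i-1}^{j+1} - 2C_i^{j+1} + C_{i+1}^{j+1}\right) \tag{3.3.10}$$

The 'molecule' of the Crank-Nicholson scheme is also shown in Figure 3.15.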
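Equivalently, in terms of the central difference operator $\delta_x$, we can write

$$\left(1 - \frac{r}{2}\delta_x^2\right) C_i^{j+1} = \left(1 + \frac{r}{2}\delta_x^2\right) C_i^j$$

Multiplication by 2 and recombination leads to the linear system of equations

$$-r C_{i-1}^{j+1} + 2(1 + r) C_i^{j+1} - r C_{i+1}^{j+1} = r C_{i-1}^j + 2(1 - r) C_i^j + r C_{i+1}^j \tag{3.3.11}$$

Given the boundary conditions of zero concentration for i = 0 and i = n, the equations can be recast into matrix form

$$\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & 0 \\
-r & 2(1+r) & -r & \cdots & 0 & 0 \\
0 & -r & 2(1+r) & \ddots & 0 & 0 \\
\vdots & & \ddots & \ddots & -r & \vdots \\
0 & 0 & \cdots & -r & 2(1+r) & -r \\
0 & 0 & \cdots & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} C_0 \\ C_1 \\ C_2 \\ \vdots \\ C_{n-1} \\ C_n \end{pmatrix}^{j+1}
=
\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & 0 \\
r & 2(1-r) & r & \cdots & 0 & 0 \\
0 & r & 2(1-r) & \ddots & 0 & 0 \\
\vdots & & \ddots & \ddots & r & \vdots \\
0 & 0 & \cdots & r & 2(1-r) & r \\
0 & 0 & \cdots & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} C_0 \\ C_1 \\ C_2 \\ \vdots \\ C_{n-1} \\ C_n \end{pmatrix}^{j}$$

Let us lump the concentrations at the nodes i = 0, 1, 2, ..., n − 1, n into a single vector $c^j$ and define the (n + 1) × (n + 1) matrices $A_n$ and $B_n$ by their current elements $a_{ij}$ and $b_{ij}$, respectively, such that for the rows i = 1, ..., n − 1

$$\begin{aligned}
a_{ii} &= 2(1 + r), & b_{ii} &= 2(1 - r) & & \\
a_{ij} &= -r, & b_{ij} &= r & &\text{for } |i - j| = 1
\end{aligned}$$

while $a_{00} = a_{nn} = b_{00} = b_{nn} = 1$, and $a_{ij} = b_{ij} = 0$ otherwise. The matrix form of the Crank-Nicholson implicit scheme becomes

$$A_n c^{j+1} = B_n c^j$$

or

$$c^{j+1} = A_n^{-1} B_n c^j \tag{3.3.12}$$

which shows how the node concentrations at time t + Δt are calculated from those at time t. Although the Crank-Nicholson scheme is unconditionally stable, use of small r values, such as r < 1/2, improves accuracy (Mitchell, 1969).

Figure 3.15 Finite difference 'molecules' for explicit (left) and implicit Crank-Nicholson (right) schemes. Each panel shows the nodes i − 1, i, i + 1 as a function of time.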
Solve with both the explicit and implicit methods the equation

$$\frac{\partial C(x,t)}{\partial t} = \mathcal{D} \frac{\partial^2 C(x,t)}{\partial x^2}$$

with the initial condition C(x, 0) = 1 and the boundary conditions C(0, t) = C(l, t) = 0 for l = 1 cm and $\mathcal{D}$ = 0.005 cm² s⁻¹. For the purpose of illustration, a very coarse mesh size will be used with Δx = 0.25, i.e., n = 5, and Δt = 2.5 s. Therefore

$$r = \frac{\mathcal{D}\,\Delta t}{\Delta x^2} = \frac{0.005 \times 2.5}{0.25^2} = 0.2$$

We first implement the explicit scheme. The initial conditions require

$$\begin{aligned}
C_i^0 &= 1 \quad \text{for } i = 1, 2, 3, 4 & &\text{(initial condition)} \\
C_0^0 &= 0 & &\text{(boundary condition)} \\
C_5^0 &= 0 & &\text{(boundary condition)}
\end{aligned}$$

Points i = 5, 4, and 3 may be obtained by symmetry from points 0, 1, and 2. Hence at t = Δt = 2.5 s

$$\begin{aligned}
C_0^1 &= C_0^0 = 0 \quad \text{(boundary condition)} \\
C_1^1 &= 0.2 \times C_0^0 + 0.6 \times C_1^0 + 0.2 \times C_2^0 = 0.8 \\
C_2^1 &= 0.2 \times C_1^0 + 0.6 \times C_2^0 + 0.2 \times C_3^0 = 1
\end{aligned}$$

The same calculation can be carried on as far as needed. The results up to t = 4Δt = 10 s are depicted in Figure 3.16.

Figure 3.16 Results of the explicit difference scheme applied to the following diffusion problem: (a) initial concentration equal to unity (b) end concentrations at x = 0 and x = 1 cm are zero (c) $\mathcal{D}$ = 0.005 cm² s⁻¹. Length increment is Δx = 0.25, i the number of length increments. Time increment is Δt = 2.5 s, j the number of time increments.

Let us now turn to the implicit Crank-Nicholson method and form the matrix $A_n$, with diagonal terms 2(1 + 0.2) = 2.4, as

$$A_n = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
-0.2 & 2.4 & -0.2 & 0 & 0 & 0 \\
0 & -0.2 & 2.4 & -0.2 & 0 & 0 \\
0 & 0 & -0.2 & 2.4 & -0.2 & 0 \\
0 & 0 & 0 & -0.2 & 2.4 & -0.2 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}$$

Likewise, with diagonal terms 2(1 − 0.2) = 1.6,

$$B_n = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0.2 & 1.6 & 0.2 & 0 & 0 & 0 \\
0 & 0.2 & 1.6 & 0.2 & 0 & 0 \\
0 & 0 & 0.2 & 1.6 & 0.2 & 0 \\
0 & 0 & 0 & 0.2 & 1.6 & 0.2 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}$$

therefore

$$A_n^{-1} B_n = \begin{pmatrix}
1.0000 & 0 & 0 & 0 & 0 & 0 \\
0.1678 & 0.6784 & 0.1409 & 0.0118 & 0.0010 & 0.0001 \\
0.0141 & 0.1409 & 0.6902 & 0.1418 & 0.0118 & 0.0012 \\
0.0012 & 0.0118 & 0.1418 & 0.6902 & 0.1409 & 0.0141 \\
0.0001 & 0.0010 & 0.0118 & 0.1409 & 0.6784 & 0.1678 \\
0 & 0 & 0 & 0 & 0 & 1.0000
\end{pmatrix}$$

We can now calculate the approximation $c^1$ at t = Δt as

$$c^1 = A_n^{-1} B_n \begin{pmatrix} 0 \\ 1 \\ 1 \\ 1 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0.8321 \\ 0.9847 \\ 0.9847 \\ 0.8321 \\ 0 \end{pmatrix}$$

then iterate over successive time steps. The results up to t = 4Δt = 10 s are depicted in Figure 3.17.

Figure 3.17 Same as Figure 3.16 but for the implicit Crank-Nicholson scheme.
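Both schemes of this example can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the tridiagonal system is solved by forward elimination and back substitution (the Thomas algorithm) instead of forming $A_n^{-1}$ explicitly as in the text. The value r = 0.2 and the six nodes are those of the worked example.

```python
def explicit_step(c, r):
    """Explicit scheme, equation (3.3.9); the two end values are kept fixed."""
    new = c[:]
    for i in range(1, len(c) - 1):
        new[i] = r * c[i - 1] + (1 - 2 * r) * c[i] + r * c[i + 1]
    return new


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are not used."""
    n = len(rhs)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        m = diag[i] - lower[i] * cp[i - 1]
        cp[i] = upper[i] / m
        dp[i] = (rhs[i] - lower[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def crank_nicholson_step(c, r):
    """Implicit scheme, equation (3.3.11), with fixed end concentrations."""
    n = len(c)
    lower = [0.0] + [-r] * (n - 2) + [0.0]
    diag = [1.0] + [2 * (1 + r)] * (n - 2) + [1.0]
    upper = [0.0] + [-r] * (n - 2) + [0.0]
    rhs = [c[0]]
    rhs += [r * c[i - 1] + 2 * (1 - r) * c[i] + r * c[i + 1]
            for i in range(1, n - 1)]
    rhs += [c[-1]]
    return solve_tridiagonal(lower, diag, upper, rhs)


c0 = [0.0, 1.0, 1.0, 1.0, 1.0, 0.0]
c_explicit = explicit_step(c0, 0.2)
c_implicit = crank_nicholson_step(c0, 0.2)
print(c_explicit)
print([round(v, 4) for v in c_implicit])
```

Calling either function repeatedly on its own output advances the solution by further time steps.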
3.3.2 More boundary conditions
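For boundary conditions which require a prescribed value of the flux instead of concentration, we introduce what are usually called fictitious points. Let us assume that at x = 0, the condition is

$$\frac{\partial C(0,t)}{\partial x} = g_0 \tag{3.3.13}$$

We introduce the approximation valid for any value of t, hence of j

$$\frac{\partial C(0,t)}{\partial x} \approx \frac{C_1^j - C_{-1}^j}{2\,\Delta x} = g_0 \tag{3.3.14}$$

where i = −1 refers to the point symmetrical to the mesh point i = 1 relative to the interface. Then we write that the finite difference approximation holds for the mesh point i = 0 at the interface, e.g., for an implicit scheme

$$-r C_{-1}^{j+1} + 2(1 + r) C_0^{j+1} - r C_1^{j+1} = r C_{-1}^j + 2(1 - r) C_0^j + r C_1^j$$

Combining the two equations, the fictitious point concentrations can be eliminated

$$-r\left(C_1^{j+1} - 2 g_0 \Delta x\right) + 2(1 + r) C_0^{j+1} - r C_1^{j+1} = r\left(C_1^j - 2 g_0 \Delta x\right) + 2(1 - r) C_0^j + r C_1^j$$

or

$$2(1 + r) C_0^{j+1} - 2r C_1^{j+1} = 2(1 - r) C_0^j + 2r C_1^j - 4 r g_0 \Delta x \tag{3.3.15}$$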
As will be shown below, writing this equation as a matrix equation is rather straightforward. Similar techniques can be used for the so-called 'radiation' boundary conditions which involve linear combinations of concentration and concentration gradient and appear in connection with elemental fractionation between adjacent phases.

Solve with the Crank-Nicholson scheme the diffusion problem described in the worked example in Section 3.3.1

$$\frac{\partial C(x,t)}{\partial t} = \mathcal{D} \frac{\partial^2 C(x,t)}{\partial x^2}$$

with C(x, 0) = 1, C(l, t) = 0 for l = 1 cm, and $\mathcal{D}$ = 0.005 cm² s⁻¹, but with no flux at x = 0. Let us keep the same mesh size Δx = 0.25, i.e., n = 5, and Δt = 2.5 s. Let us form the matrices

$$A = \begin{pmatrix}
2(1+r) & -2r & 0 & 0 & 0 & 0 \\
-r & 2(1+r) & -r & 0 & 0 & 0 \\
0 & -r & 2(1+r) & -r & 0 & 0 \\
0 & 0 & -r & 2(1+r) & -r & 0 \\
0 & 0 & 0 & -r & 2(1+r) & -r \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
= \begin{pmatrix}
2.4 & -0.4 & 0 & 0 & 0 & 0 \\
-0.2 & 2.4 & -0.2 & 0 & 0 & 0 \\
0 & -0.2 & 2.4 & -0.2 & 0 & 0 \\
0 & 0 & -0.2 & 2.4 & -0.2 & 0 \\
0 & 0 & 0 & -0.2 & 2.4 & -0.2 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}$$

and

$$B = \begin{pmatrix}
2(1-r) & 2r & 0 & 0 & 0 & 0 \\
r & 2(1-r) & r & 0 & 0 & 0 \\
0 & r & 2(1-r) & r & 0 & 0 \\
0 & 0 & r & 2(1-r) & r & 0 \\
0 & 0 & 0 & r & 2(1-r) & r \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
= \begin{pmatrix}
1.6 & 0.4 & 0 & 0 & 0 & 0 \\
0.2 & 1.6 & 0.2 & 0 & 0 & 0 \\
0 & 0.2 & 1.6 & 0.2 & 0 & 0 \\
0 & 0 & 0.2 & 1.6 & 0.2 & 0 \\
0 & 0 & 0 & 0.2 & 1.6 & 0.2 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}$$

and the vector of flux conditions at x = 0

$$v = \begin{pmatrix} -4 r g_0 \Delta x & 0 & 0 & 0 & 0 & 0 \end{pmatrix}^T = 0$$

The matrix equation reads

$$A c^{j+1} = B c^j + v \tag{3.3.16}$$

We obtain

$$A^{-1} B = \begin{pmatrix}
0.6903 & 0.2837 & 0.0238 & 0.0020 & 0.0002 & 0.0000 \\
0.1419 & 0.7022 & 0.1429 & 0.0120 & 0.0010 & 0.0001 \\
0.0119 & 0.1429 & 0.6904 & 0.1419 & 0.0118 & 0.0012 \\
0.0010 & 0.0120 & 0.1419 & 0.6902 & 0.1409 & 0.0141 \\
0.0001 & 0.0010 & 0.0118 & 0.1409 & 0.6784 & 0.1678 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}$$

We can now calculate the approximation $c^1$ at t = Δt

$$c^1 = A^{-1} B \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1.0000 \\ 0.9999 \\ 0.9988 \\ 0.9859 \\ 0.8322 \\ 0 \end{pmatrix}$$

The results up to t = 4Δt = 10 s are depicted in Figure 3.18.

Figure 3.18 Same as Figure 3.17 but with no flux at x = 0.
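The flux condition only changes the first row of the system. A minimal Python sketch follows, in which the function names, the default argument values and the use of the Thomas algorithm in place of an explicit inverse are illustrative assumptions rather than the procedure of the text.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are not used."""
    n = len(rhs)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        m = diag[i] - lower[i] * cp[i - 1]
        cp[i] = upper[i] / m
        dp[i] = (rhs[i] - lower[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def cn_step_flux_left(c, r, g0=0.0, dx=0.25):
    """Crank-Nicholson step with prescribed flux g0 at x = 0 (equation 3.3.15)
    and fixed concentration at the last node."""
    n = len(c)
    lower = [0.0] + [-r] * (n - 2) + [0.0]
    diag = [2 * (1 + r)] * (n - 1) + [1.0]
    upper = [-2 * r] + [-r] * (n - 2) + [0.0]
    rhs = [2 * (1 - r) * c[0] + 2 * r * c[1] - 4 * r * g0 * dx]
    rhs += [r * c[i - 1] + 2 * (1 - r) * c[i] + r * c[i + 1]
            for i in range(1, n - 1)]
    rhs += [c[-1]]
    return solve_tridiagonal(lower, diag, upper, rhs)


c1 = cn_step_flux_left([1.0, 1.0, 1.0, 1.0, 1.0, 0.0], 0.2)
print([round(v, 4) for v in c1])
```

A non-zero `g0` would model a prescribed flux through the face x = 0 with the same code.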
3.3.3 A word about advection

Solving the purely advective equation or even introducing an advection term into the diffusion equation is a source of numerical difficulties. The simplest advection equation of a medium moving at velocity v in one dimension can be written

$$\frac{\partial C(x,t)}{\partial t} = -v \frac{\partial C(x,t)}{\partial x} \tag{3.3.17}$$

Most simple finite difference schemes tend to be either unstable, inaccurate or both. Strang (1986) recommends using Friedrichs scheme

$$\frac{C_i^{j+1} - \frac{1}{2}\left(C_{i+1}^j + C_{i-1}^j\right)}{\Delta t} = -v \frac{C_{i+1}^j - C_{i-1}^j}{2\,\Delta x} \tag{3.3.18}$$

With r = vΔt/Δx, the finite difference equation becomes

$$C_i^{j+1} = \frac{1 - r}{2} C_{i+1}^j + \frac{1 + r}{2} C_{i-1}^j \tag{3.3.19}$$

and the stability condition is −1 ≤ r ≤ +1. For the cases where advection appears as an additional term in the diffusion equation, the reader is referred to the discussion in Chapter 9 of the book by Fletcher (1991).

3.3.4 Two space coordinates: The ADI method

Given a rectangle R (a ≤ x₁ ≤ b, c ≤ x₂ ≤ d) and the two-dimensional diffusion equation

$$\frac{\partial C(x_1, x_2, t)}{\partial t} = \mathcal{D}\left[\frac{\partial^2 C(x_1, x_2, t)}{\partial x_1^2} + \frac{\partial^2 C(x_1, x_2, t)}{\partial x_2^2}\right] \tag{3.3.20}$$
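with the initial condition $C(x_1, x_2, 0) = C_0(x_1, x_2)$ and the boundary conditions

$$C(a, x_2, t) = C_a(x_2, t), \qquad C(b, x_2, t) = C_b(x_2, t)$$

The grid points are $i_1 \Delta x$ ($i_1 = 0, 1, \ldots, n_1$), $i_2 \Delta x$ ($i_2 = 0, 1, \ldots, n_2$), and $j\,\Delta t$. Explicit difference schemes show poor stability properties (Mitchell, 1969). In terms of the central difference operator, it may be shown that an accurate implicit equation is

$$\left[1 - \frac{r}{2}\left(\delta_{x_1}^2 + \delta_{x_2}^2\right)\right] C_{i_1,i_2}^{j+1} = \left[1 + \frac{r}{2}\left(\delta_{x_1}^2 + \delta_{x_2}^2\right)\right] C_{i_1,i_2}^{j}$$

which is the extension of the Crank-Nicholson implicit formula to two dimensions. The widely used Peaceman-Rachford method deals with the Laplacian operator in two steps. Approximation is carried out on one space derivative with an explicit scheme and an implicit scheme for the other derivative (hence the name of Alternating Direction Implicit or ADI method), which results in the two successive formulas

$$\left(1 - \frac{r}{2}\delta_{x_1}^2\right) C_{i_1,i_2}^{j+1/2} = \left(1 + \frac{r}{2}\delta_{x_2}^2\right) C_{i_1,i_2}^{j}$$

$$\left(1 - \frac{r}{2}\delta_{x_2}^2\right) C_{i_1,i_2}^{j+1} = \left(1 + \frac{r}{2}\delta_{x_1}^2\right) C_{i_1,i_2}^{j+1/2}$$

Developing, the two steps can be written

$$-r C_{i_1-1,i_2}^{j+1/2} + 2(1+r) C_{i_1,i_2}^{j+1/2} - r C_{i_1+1,i_2}^{j+1/2} = r C_{i_1,i_2-1}^{j} + 2(1-r) C_{i_1,i_2}^{j} + r C_{i_1,i_2+1}^{j} \tag{3.3.21}$$

$$-r C_{i_1,i_2-1}^{j+1} + 2(1+r) C_{i_1,i_2}^{j+1} - r C_{i_1,i_2+1}^{j+1} = r C_{i_1-1,i_2}^{j+1/2} + 2(1-r) C_{i_1,i_2}^{j+1/2} + r C_{i_1+1,i_2}^{j+1/2} \tag{3.3.22}$$

The 'molecule' of the difference scheme over the time steps is given in Figure 3.19. The half-step may not have in general the significance of an intermediate time step and is more a matter of computational convenience.

Figure 3.19 Finite difference 'molecule' for the alternating-directions implicit method (ADI) of solving the two-dimensional diffusion equation.

For sake of demonstration, let us assume that concentration on the edge is constant with time

$$C_{0,i_2}^{j+1/2} = C_{0,i_2}^{j} = C_{0,i_2}$$

with similar equations for the other three edges. Let us first compute the condition along the vertical (y) edges for the first half-step. For $i_1 = 1$ and $1 < i_2 < n_2 - 1$, equation (3.3.21) becomes

$$2(1+r) C_{1,i_2}^{j+1/2} - r C_{2,i_2}^{j+1/2} = r C_{1,i_2-1}^{j} + 2(1-r) C_{1,i_2}^{j} + r C_{1,i_2+1}^{j} + r C_{0,i_2} \tag{3.3.23}$$

A similar equation holds for $i_1 = n_1 - 1$ and $1 < i_2 < n_2 - 1$

$$-r C_{n_1-2,i_2}^{j+1/2} + 2(1+r) C_{n_1-1,i_2}^{j+1/2} = r C_{n_1-1,i_2-1}^{j} + 2(1-r) C_{n_1-1,i_2}^{j} + r C_{n_1-1,i_2+1}^{j} + r C_{n_1,i_2} \tag{3.3.24}$$

and for the horizontal (x) edges. For $i_2 = 1$ and $1 < i_1 < n_1 - 1$, equation (3.3.21) becomes

$$-r C_{i_1-1,1}^{j+1/2} + 2(1+r) C_{i_1,1}^{j+1/2} - r C_{i_1+1,1}^{j+1/2} = 2(1-r) C_{i_1,1}^{j} + r C_{i_1,2}^{j} + r C_{i_1,0} \tag{3.3.25}$$

and for $i_2 = n_2 - 1$,

$$-r C_{i_1-1,n_2-1}^{j+1/2} + 2(1+r) C_{i_1,n_2-1}^{j+1/2} - r C_{i_1+1,n_2-1}^{j+1/2} = r C_{i_1,n_2-2}^{j} + 2(1-r) C_{i_1,n_2-1}^{j} + r C_{i_1,n_2} \tag{3.3.26}$$

We now turn to the conditions at the corners. As an example, let us compute equation (3.3.21) of the ADI scheme for $i_1 = i_2 = 1$

$$2(1+r) C_{1,1}^{j+1/2} - r C_{2,1}^{j+1/2} = 2(1-r) C_{1,1}^{j} + r C_{1,2}^{j} + r\left(C_{0,1} + C_{1,0}\right) \tag{3.3.27}$$

and for $i_1 = n_1 - 1$ and $i_2 = n_2 - 1$

$$-r C_{n_1-2,n_2-1}^{j+1/2} + 2(1+r) C_{n_1-1,n_2-1}^{j+1/2} = r C_{n_1-1,n_2-2}^{j} + 2(1-r) C_{n_1-1,n_2-1}^{j} + r\left(C_{n_1,n_2-1} + C_{n_1-1,n_2}\right) \tag{3.3.28}$$

Let us call $\mathbf{c}_{i_2}$ the column vector of the inner concentrations $C_{1,i_2}, C_{2,i_2}, \ldots, C_{n_1-1,i_2}$ and form the system of equations for $1 < i_2 < n_2 - 1$

$$\begin{pmatrix}
2(1+r) & -r & 0 & \cdots & 0 \\
-r & 2(1+r) & -r & \cdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
0 & \cdots & -r & 2(1+r) & -r \\
0 & \cdots & 0 & -r & 2(1+r)
\end{pmatrix} \mathbf{c}_{i_2}^{j+1/2}
= r\,\mathbf{c}_{i_2-1}^{j} + 2(1-r)\,\mathbf{c}_{i_2}^{j} + r\,\mathbf{c}_{i_2+1}^{j}
+ r \begin{pmatrix} C_{0,i_2} \\ 0 \\ \vdots \\ 0 \\ C_{n_1,i_2} \end{pmatrix}$$

For $i_2 = 1$, the system becomes identical except that the term $r\,\mathbf{c}_{0}^{j}$ on the right-hand side is made of the boundary values $C_{1,0}, \ldots, C_{n_1-1,0}$, while for $i_2 = n_2 - 1$ the term $r\,\mathbf{c}_{n_2}^{j}$ is made of the boundary values $C_{1,n_2}, \ldots, C_{n_1-1,n_2}$. All these equations can be stacked side by side, where $C^j$ is the $(n_1 - 1) \times (n_2 - 1)$ matrix at time step j of concentration values at all the inner grid points (not including the boundaries). In a compact matrix form, the equation reads

$$A_{n_1-1} C^{j+1/2} = C^{j} B_{n_2-1} + r H \tag{3.3.29}$$

In this equation, $A_{n_1-1}$ is an $(n_1 - 1) \times (n_1 - 1)$ tridiagonal matrix with 2(1 + r) on the main diagonal and −r on the other two diagonals. $B_{n_2-1}$ is an $(n_2 - 1) \times (n_2 - 1)$ tridiagonal matrix with 2(1 − r) on the main diagonal and r on the other two diagonals. H is an $(n_1 - 1) \times (n_2 - 1)$ matrix built as indicated: its first and last rows receive the boundary values $C_{0,i_2}$ and $C_{n_1,i_2}$, its first and last columns receive the boundary values $C_{i_1,0}$ and $C_{i_1,n_2}$, and the two contributions add up at the corners.

For the second half-step described by equation (3.3.22), we can write ($i_2 = 2, 3, \ldots, n_2 - 2$)

$$-r\,\mathbf{c}_{i_2-1}^{j+1} + 2(1+r)\,\mathbf{c}_{i_2}^{j+1} - r\,\mathbf{c}_{i_2+1}^{j+1} =
\begin{pmatrix}
2(1-r) & r & 0 & \cdots & 0 \\
r & 2(1-r) & r & \cdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
0 & \cdots & r & 2(1-r) & r \\
0 & \cdots & 0 & r & 2(1-r)
\end{pmatrix} \mathbf{c}_{i_2}^{j+1/2}
+ r \begin{pmatrix} C_{0,i_2} \\ 0 \\ \vdots \\ 0 \\ C_{n_1,i_2} \end{pmatrix}$$

and expressions similar to those of the first half-step for concentrations on the edges. In a matrix form, the system of equations can be written

$$C^{j+1} A_{n_2-1} = B_{n_1-1} C^{j+1/2} + r H \tag{3.3.30}$$

In this equation, $A_{n_2-1}$ and $B_{n_1-1}$ are defined as before, while H is an $(n_1 - 1) \times (n_2 - 1)$ matrix. Rewriting the expression (3.3.29) for the first half-step as

$$C^{j+1/2} = A_{n_1-1}^{-1}\left(C^{j} B_{n_2-1} + r H\right)$$

and inserting into equation (3.3.30), the matrix equation becomes

$$C^{j+1} A_{n_2-1} = B_{n_1-1} A_{n_1-1}^{-1}\left(C^{j} B_{n_2-1} + r H\right) + r H$$

and finally

$$C^{j+1} = \left[B_{n_1-1} A_{n_1-1}^{-1}\left(C^{j} B_{n_2-1} + r H\right) + r H\right] A_{n_2-1}^{-1} \tag{3.3.31}$$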
We now get a fairly good feeling that handling boundary conditions in two dimensions is significantly more difficult than in one-dimensional problems. Flux conditions can be applied to the boundaries using the method of fictitious points.

Solve the two-dimensional diffusion equation

$$\frac{\partial C(x,y,t)}{\partial t} = \mathcal{D}\left[\frac{\partial^2 C(x,y,t)}{\partial x^2} + \frac{\partial^2 C(x,y,t)}{\partial y^2}\right]$$

in the rectangle defined by a ≤ x ≤ b = a + L and c ≤ y ≤ d, with the initial condition C(x, y, 0) = 0 and the boundary conditions C(a, y, t) = 0 together with boundary values varying linearly along the edges, as displayed on the fringe of the concentration matrices below, for a diffusion coefficient $\mathcal{D}$ = 0.005 cm² s⁻¹.

Let us mesh the rectangle assuming Δx = 0.2, which is equivalent to $n_1 = n_y = 4$, $n_2 = n_x = 5$, and Δt = 0.1, i.e., r = 0.0125 (note that x and y are switched relative to the indices $n_1$ and $n_2$). We now build the time-independent matrices

$$A_{n_2-1} = \begin{pmatrix}
2.0250 & -0.0125 & 0 & 0 \\
-0.0125 & 2.0250 & -0.0125 & 0 \\
0 & -0.0125 & 2.0250 & -0.0125 \\
0 & 0 & -0.0125 & 2.0250
\end{pmatrix}$$

and

$$B_{n_2-1} = \begin{pmatrix}
1.9750 & 0.0125 & 0 & 0 \\
0.0125 & 1.9750 & 0.0125 & 0 \\
0 & 0.0125 & 1.9750 & 0.0125 \\
0 & 0 & 0.0125 & 1.9750
\end{pmatrix}$$

which are combined into

$$B_{n_2-1} A_{n_2-1}^{-1} = \begin{pmatrix}
0.9754 & 0.0122 & 0.0001 & 0.0000 \\
0.0122 & 0.9755 & 0.0122 & 0.0001 \\
0.0001 & 0.0122 & 0.9755 & 0.0122 \\
0.0000 & 0.0001 & 0.0122 & 0.9754
\end{pmatrix}$$

Likewise

$$A_{n_1-1} = \begin{pmatrix}
2.0250 & -0.0125 & 0 \\
-0.0125 & 2.0250 & -0.0125 \\
0 & -0.0125 & 2.0250
\end{pmatrix}
\quad \text{and} \quad
B_{n_1-1} = \begin{pmatrix}
1.9750 & 0.0125 & 0 \\
0.0125 & 1.9750 & 0.0125 \\
0 & 0.0125 & 1.9750
\end{pmatrix}$$

so

$$B_{n_1-1} A_{n_1-1}^{-1} = \begin{pmatrix}
0.9754 & 0.0122 & 0.0001 \\
0.0122 & 0.9755 & 0.0122 \\
0.0001 & 0.0122 & 0.9754
\end{pmatrix}$$

H is built from the boundary conditions as

$$H = \begin{pmatrix}
0.2000 & 0.4000 & 0.6000 & 1.55 \\
0 & 0 & 0 & 0.50 \\
0 & 0 & 0 & 0.25
\end{pmatrix}$$

hence the contribution of the boundary values to equation (3.3.31)

$$\begin{pmatrix}
0.0025 & 0.0049 & 0.0075 & 0.0190 \\
0.0000 & 0.0000 & 0.0001 & 0.0062 \\
0.0000 & 0.0000 & 0.0000 & 0.0031
\end{pmatrix}$$

The concentration matrices will be shown as $C^j$ fringed on each edge by the boundary values, i.e., as $(n_y + 1) \times (n_x + 1) = 5 \times 6$ matrices, and a symbol with overbar $\bar{C}^j$ will be used. Given the initial concentration matrix

$$\bar{C}^0 = \begin{pmatrix}
0 & 0.2 & 0.4 & 0.6 & 0.8 & 1.0 \\
0 & 0 & 0 & 0 & 0 & 0.75 \\
0 & 0 & 0 & 0 & 0 & 0.50 \\
0 & 0 & 0 & 0 & 0 & 0.25 \\
0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}$$

we can calculate concentrations at t = Δt

$$\bar{C}^1 = \begin{pmatrix}
0 & 0.2 & 0.4 & 0.6 & 0.8 & 1.00 \\
0 & 0.0025 & 0.0049 & 0.0075 & 0.0190 & 0.75 \\
0 & 0.0000 & 0.0000 & 0.0001 & 0.0062 & 0.50 \\
0 & 0.0000 & 0.0000 & 0.0000 & 0.0031 & 0.25 \\
0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}$$

at t = 2Δt

$$\bar{C}^2 = \begin{pmatrix}
0 & 0.2 & 0.4 & 0.6 & 0.8 & 1.00 \\
0 & 0.0049 & 0.0098 & 0.0149 & 0.0372 & 0.75 \\
0 & 0.0000 & 0.0001 & 0.0003 & 0.0124 & 0.50 \\
0 & 0.0000 & 0.0000 & 0.0001 & 0.0061 & 0.25 \\
0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}$$

and so forth.
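The two half-steps (3.3.21) and (3.3.22) can also be programmed as successive sweeps over the columns and the rows of the fringed concentration matrix, each sweep solving one tridiagonal system. The Python sketch below follows this route rather than the matrix products of the text; the function names are hypothetical and the boundary values are assumed constant with time, as in the example.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are not used."""
    n = len(rhs)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        m = diag[i] - lower[i] * cp[i - 1]
        cp[i] = upper[i] / m
        dp[i] = (rhs[i] - lower[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def adi_step(C, r):
    """One full ADI time step on a grid C[i1][i2] that includes the
    (constant) boundary values on its fringe."""
    n1, n2 = len(C) - 1, len(C[0]) - 1
    half = [row[:] for row in C]
    # first half-step: implicit along i1, explicit along i2 (3.3.21)
    for i2 in range(1, n2):
        rhs = [r * C[i1][i2 - 1] + 2 * (1 - r) * C[i1][i2] + r * C[i1][i2 + 1]
               for i1 in range(1, n1)]
        rhs[0] += r * C[0][i2]
        rhs[-1] += r * C[n1][i2]
        m = n1 - 1
        col = solve_tridiagonal([-r] * m, [2 * (1 + r)] * m, [-r] * m, rhs)
        for i1 in range(1, n1):
            half[i1][i2] = col[i1 - 1]
    new = [row[:] for row in C]
    # second half-step: implicit along i2, explicit along i1 (3.3.22)
    for i1 in range(1, n1):
        rhs = [r * half[i1 - 1][i2] + 2 * (1 - r) * half[i1][i2]
               + r * half[i1 + 1][i2] for i2 in range(1, n2)]
        rhs[0] += r * C[i1][0]
        rhs[-1] += r * C[i1][n2]
        m = n2 - 1
        row = solve_tridiagonal([-r] * m, [2 * (1 + r)] * m, [-r] * m, rhs)
        for i2 in range(1, n2):
            new[i1][i2] = row[i2 - 1]
    return new


C0 = [[0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
      [0.0, 0.0, 0.0, 0.0, 0.0, 0.75],
      [0.0, 0.0, 0.0, 0.0, 0.0, 0.50],
      [0.0, 0.0, 0.0, 0.0, 0.0, 0.25],
      [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]
C1 = adi_step(C0, 0.0125)
C2 = adi_step(C1, 0.0125)
for line in C2:
    print(["%.4f" % v for v in line])
```

Sweeping with a tridiagonal solver avoids storing and inverting the matrices A, which matters for finer meshes.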
4 Probability and statistics
4.1 A single random variable

4.1.1 General

In most natural situations, physical and chemical parameters are not defined by a unique deterministic value. Due to our limited comprehension of the natural processes and imperfect analytical procedures (notwithstanding the interaction of the measurement itself with the process investigated), measurements of concentrations, isotopic ratios and other geochemical parameters must be considered as samples taken from an infinite 'reservoir' or population of attainable values. Defining random variables in a rigorous way would require a rather lengthy development of probability spaces and measure theory, which is beyond the scope of this book. For that purpose, the reader is referred to any of the many excellent standard textbooks on probability and statistics (e.g., Hamilton, 1964; Hoel et al., 1971; Lloyd, 1980; Papoulis, 1984; Dudewicz and Mishra, 1988). For most practical purposes, the statistical analysis of geochemical parameters will be restricted to the field of continuous random variables.

• A random variable X is defined by
(a) a continuous domain Ω of definition in $\mathbb{R}$, e.g., ]−∞, +∞[, [0, +∞[, or [−1, +1].
(b) its probability distribution function (sometimes called the cumulative distribution function or, for short, distribution function), i.e., a continuous non-decreasing function F(x) defined for each value x in Ω, such as

$$F(x) = \mathcal{P}\{X \le x\} \tag{4.1.1}$$

where $\mathcal{P}$ stands for 'probability of'. Strict notation conventions should be observed: an upper-case letter should refer to a random variable, while the same letter in lower case refers to the values taken by this variable. Writing the same relation at x + dx

$$F(x + dx) = \mathcal{P}\{X \le x + dx\}$$

and subtracting, we get

$$F(x + dx) - F(x) = \mathcal{P}\{x < X \le x + dx\}$$

The probability density function f(x) (also the density function or frequency function) is the derivative of the distribution function F(x). These two functions relate through

$$f(x)\,dx = dF(x) = \mathcal{P}\{x < X \le x + dx\} \tag{4.1.2}$$

f(x) therefore has the significance of a probability per unit of X in the neighborhood of the value x. Because F(x) is non-decreasing, its derivative f(x) is non-negative. Conversely, if Ω is the domain ]−∞, +∞[, the cumulative distribution function F(x) relates to the probability density function f(x) through

$$F(x) = \int_{-\infty}^{x} f(u)\,du \tag{4.1.3}$$

F(x) represents the surface under the curve of the function f(u) up to u = x (Figure 4.1). Since the total probability over the domain is 1

$$\int_{-\infty}^{+\infty} f(u)\,du = 1 \tag{4.1.4}$$

If p is an integer between 0 and 100, the pth percentile is the value $x_p$ of the random variable X which limits to the right p/100ths of the surface S under the probability curve (Figure 4.2).

Figure 4.1 Relationship between the probability density function f(x) of the continuous random variable X and the cumulative distribution function F(x). The shaded area under the curve f(x) up to $x_0$ is equal to the value of F(x) at $x_0$.

4.1.2 Expectation and moments

Let us consider a continuous random variable X defined over a given domain Ω of $\mathbb{R}$, such as [0, +∞[ or ]−∞, +∞[, and let f(x) be its probability density function over Ω. Let g(X) be a function of the random variable X.

• The expectation $\mathcal{E}(g)$ of g(X) is

$$\mathcal{E}(g) = \int_{\Omega} g(x) f(x)\,dx \tag{4.1.5}$$

and is usually denoted with Greek letters. Being defined by an integral, expectation is a linear operator. The expectation has the meaning of a centroid with the 'local' probability f(x)dx as a weight-function.

• Using parentheses to emphasize power functions, the nth moment is the expectation $\mathcal{E}[(X)^n]$ of $X^n$. The mean μ is the first moment

$$\mu = \mathcal{E}(X) = \int_{\Omega} x f(x)\,dx \tag{4.1.6}$$

• The nth central moment is $\mathcal{E}[(X - \mu)^n]$ or $\mathcal{E}\{[X - \mathcal{E}(X)]^n\}$. The variance σ² is the second central moment, i.e.,

$$\mathrm{var}(X) = \sigma_x^2 = \mathcal{E}\{[X - \mathcal{E}(X)]^2\} = \mathcal{E}(X^2) - \mathcal{E}(X)\,\mathcal{E}(X) \tag{4.1.7}$$

• The standard deviation σ is the square-root of the variance and has the same unit as the random variable. A random variable is standardized (or reduced) if its variance is unity and centered if its mean is zero.

Figure 4.2 Definition of the percentile p: the curve is the probability density function f(x) of the continuous random variable X. $x_p$ is the pth percentile when the surface up to $x_p$ represents p percent of the total surface S under the curve.
Calculate the mean and variance of the uniform density

$$f(x) = \frac{1}{2a} \quad \text{for } -a \le x \le a$$

The mean is the first moment, i.e.,

$$\mu = \mathcal{E}(X) = \int_{-a}^{+a} x \frac{1}{2a}\,dx = \left[\frac{x^2}{4a}\right]_{-a}^{+a} = 0$$

while the variance is the second central moment

$$\sigma^2 = \int_{-a}^{+a} x^2 \frac{1}{2a}\,dx = \left[\frac{x^3}{6a}\right]_{-a}^{+a} = \frac{a^3}{6a} - \frac{-a^3}{6a} = \frac{a^2}{3}$$

Find the variance of the Cauchy distribution

$$f(x) = \frac{1}{\pi}\,\frac{1}{1 + x^2}$$

Because f(x) is symmetric about zero, its mean is taken as zero. Its variance should be

$$\sigma^2 = \int_{-\infty}^{+\infty} \frac{1}{\pi}\,\frac{x^2}{1 + x^2}\,dx$$

but this integral diverges when x → ∞. The Cauchy distribution has no finite variance.

• The moment generating function $M(t) = \mathcal{E}(e^{tX})$ is particularly useful to compute moments. Expanding the exponential, we get

$$M(t) = \mathcal{E}\left[1 + tX + \frac{t^2}{2!}X^2 + \ldots\right] = 1 + t\,\mathcal{E}(X) + \frac{t^2}{2!}\,\mathcal{E}(X^2) + \ldots$$

Alternatively, M(t) may be expanded in Taylor series in the neighborhood of t = 0

$$M(t) = M(0) + tM'(0) + \frac{t^2}{2!}M''(0) + \ldots \tag{4.1.8}$$

where the number of primes indicates the order of the derivative. Comparing the last two equations shows that the moment of order n is equal to the nth derivative of M(t) for t = 0

$$\mathcal{E}(X^n) = \left[\frac{d^n M(t)}{dt^n}\right]_{t=0} \tag{4.1.9}$$
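The two worked examples on the uniform and Cauchy densities can be checked numerically from the definition (4.1.5). The Python sketch below uses a simple midpoint rule; the function name `expectation`, the half-width a = 2 and the truncation limits 10 and 100 are arbitrary choices made for illustration.

```python
import math


def expectation(g, f, lo, hi, n=100000):
    """Midpoint-rule approximation of the integral of g(x) f(x) over [lo, hi]."""
    dx = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * dx
        total += g(x) * f(x)
    return total * dx


a = 2.0


def uniform(x):
    return 1.0 / (2.0 * a)          # uniform density over [-a, a]


def cauchy(x):
    return 1.0 / (math.pi * (1.0 + x * x))


mean = expectation(lambda x: x, uniform, -a, a)
variance = expectation(lambda x: (x - mean) ** 2, uniform, -a, a)
print(mean, variance, a * a / 3.0)

# second moment of the Cauchy density truncated to [-L, L]:
# it keeps growing with L instead of converging
m2_10 = expectation(lambda x: x * x, cauchy, -10.0, 10.0)
m2_100 = expectation(lambda x: x * x, cauchy, -100.0, 100.0)
print(m2_10, m2_100)
```

The uniform variance agrees with a²/3, whereas the truncated second moment of the Cauchy density increases roughly in proportion to the truncation limit.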
Calculate the moment generating function of the general normal distribution f(x) given by

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right]$$

From the definition of the moment generating function, we write

$$M(t) = \int_{-\infty}^{+\infty} e^{tx} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right] dx$$

Multiplying and dividing the right-hand side by the exponential of μt + σ²t²/2, we get

$$M(t) = \exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right) \int_{-\infty}^{+\infty} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x - \mu)^2 - 2\sigma^2 t (x - \mu) + \sigma^4 t^2}{2\sigma^2}\right] dx$$

which we rearrange as

$$M(t) = \exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right) \int_{-\infty}^{+\infty} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x - \mu - \sigma^2 t)^2}{2\sigma^2}\right] dx$$

The integrand is almost the normal density. Introducing the variable

$$u = \frac{x - \mu - \sigma^2 t}{\sigma}$$

M(t) can be rewritten as

$$M(t) = \exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right) \times \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-u^2/2}\,du$$

On the right-hand side of the multiplication sign, we recognize the integral of the reduced normal density function which sums up to one, and therefore

$$M(t) = \exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right) \tag{4.1.10}$$

The first and second derivatives are

$$M'(t) = (\mu + \sigma^2 t) M(t)$$

$$M''(t) = \sigma^2 M(t) + (\mu + \sigma^2 t) M'(t) = \sigma^2 M(t) + (\mu + \sigma^2 t)^2 M(t)$$

The values of M(t) and its first two derivatives at t = 0 are M(0) = 1, M'(0) = μ, and M''(0) = μ² + σ². The mean of the normal density function is therefore

$$\mathcal{E}(X) = M'(0) = \mu \tag{4.1.11}$$

and its variance

$$\mathcal{E}(X^2) - \mathcal{E}(X)\,\mathcal{E}(X) = M''(0) - [M'(0)]^2 = \mu^2 + \sigma^2 - \mu^2 = \sigma^2 \tag{4.1.12}$$

two well-known and quite useful results.

4.1.3 A compendium of some common probability density functions

• A random variable is said to be distributed as the uniform distribution if f(x) is given by

$$f(x) = \frac{1}{b - a}, \quad a < x < b \tag{4.1.13}$$

and f(x) = 0 otherwise.

• Because of its importance in natural phenomena, e.g., radioactive decay and population dynamics, let us introduce the exponential distribution through an illustration. n atoms are assumed to decay over the time interval [0, θ], each atom having the same probability of decaying at any time in this interval. In other words, the time at which an atom decays is uniformly distributed over [0, θ]. Let us call N(t) the number of decay events between 0 and t. The probability that a single atom has not decayed at time t is 1 − t/θ (Figure 4.3). The probability that none of the n atoms has decayed at time t is

$$\mathcal{P}\{N(t) = 0\} = \left(1 - \frac{t}{\theta}\right)^n \tag{4.1.14}$$

The probability that the first decay takes place before t is therefore the complement of this expression to one. The cumulative distribution function of the time T to the first decay is

$$F(t) = \mathcal{P}\{T \le t\} = \mathcal{P}\{N(t) \ne 0\} = 1 - \mathcal{P}\{N(t) = 0\}$$

and therefore

$$F(t) = 1 - \left(1 - \frac{t}{\theta}\right)^n$$

The corresponding density of probability function f(t) is obtained as the derivative of F(t)

$$f(t) = \frac{dF}{dt} = \frac{n}{\theta}\left(1 - \frac{t}{\theta}\right)^{n-1} \tag{4.1.15}$$

Figure 4.3 Distribution of events on the time axis in a Poisson process over the time interval [0, θ]: the probability of occurrence for one event in a time interval depends on its duration only (see text).

Let us assume that probability of decay per unit time is constant, i.e.,

$$\lambda = \frac{n}{\theta}$$

We rewrite equation (4.1.15) as

$$f(t) = \lambda\left(1 - \frac{\lambda t}{n}\right)^{n-1} = \lambda \exp\left[(n - 1)\ln\left(1 - \frac{\lambda t}{n}\right)\right]$$

and 'enlarge' our view by increasing the number of decays n → ∞, which because of constant decay rate implies θ → ∞. For finite time t, λt ≪ n and, expanding the logarithm to the first order (see Section 3.1), we get the density of the exponential distribution defined over [0, +∞[

$$f(t) = \lambda \exp(-\lambda t) \tag{4.1.16}$$

The exponential distribution has mean μ = 1/λ and variance σ² = 1/λ².

• The most widely used distribution is the normal distribution (also Gaussian) with the familiar bell shape defined over ]−∞, +∞[. Its most general form is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right] \tag{4.1.17}$$

with mean μ and variance σ². The moment generating function is given by equation (4.1.10). It is said to be centered when μ = 0 and reduced when σ² = 1.

• A random variable X is distributed as a log-normal distribution if ln X is distributed as a normal distribution. If U is a random normal variable, a log-normal distribution is the distribution of a variable X such as

$$\ln X = \ln \mu + U \ln \sigma_R \tag{4.1.18}$$

or

$$X = \mu\,\sigma_R^{\,U} \tag{4.1.19}$$

$\sigma_R$ can be viewed as a 'relative' standard deviation: for U = ±1, X is multiplied, respectively divided, by $\sigma_R$.
180
Probability and statistics
Figure 4.4 Gamma probability density functions for scale parameter p = 1 and different shape parameters a = 1, 2, and 5.
The Cauchy distribution, also bell-shaped, is defined over ] — oo, + oo[ and reads (4.1.20) Its mean is zero but we have demonstrated that it does not have a finite variance. Several distributions belong to the family of gamma distributions defined over ]0, + oo[
fix)-
1
(4.1.21)
where a and p are parameters with a and p > 0. a is often referred to as the shape parameter and p as the scale parameter. This function is plotted in Figure 4.4 for /?= 1 and a = 1, 2, and 5. F(a) is the gamma Eulerian function
,-JV..
x*-le~xdx
F(a)=
(4.1.22)
Jo Jo
and is also well known in its form of the factorial function for integer values of a (4.1.23) Additional properties of this function are the recursion formula (4.1.24)
and the special value (4.1.25)
4.1 A single random variable
181
The mean of the gamma distribution is a/? and its variance a/?2. The moment generating function is (4.1.26) The parameters a and /? can be computed from // and a2 using a = (/V
p = <J2/n
(4.1.27)
Setting a = 1 and /?= I/A, we recognize the exponential distribution of equation (4.1.16). • For a = v/2 and /? = 2, the gamma distribution becomes the chi-squared (x2) distribution A
(
4
.
1
.
2
8
)
with mean v and variance 2v. For reasons that will appear later, v is known as the number of degrees offreedom. The pth percentile of the chi-squared distribution with v degrees of freedom is denoted xP,v2• A special case of the gamma distribution is obtained by replacing a by n+1, where n is an integer, and x by /to (this can be easily achieved using the change of variable technique discussed below). The resulting distribution f(x) is /(x) = i x n e - x \
(4.1.29)
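and is known, when regarded as a function of the integer n, as the Poisson distribution. A random variable X is distributed as the beta distribution over the range [0, 1] if its density function is given by

$$f(x) = \frac{\Gamma(\alpha_1+\alpha_2)}{\Gamma(\alpha_1)\,\Gamma(\alpha_2)}\,x^{\alpha_1-1}(1-x)^{\alpha_2-1} \qquad (4.1.30)$$

with the two parameters $\alpha_1$ and $\alpha_2 > 0$. This function is plotted in Figure 4.5 for different combinations of $\alpha_1$ and $\alpha_2$. Its mean and variance are

$$\mu = \frac{\alpha_1}{\alpha_1+\alpha_2} \qquad \sigma^2 = \frac{\alpha_1\alpha_2}{(\alpha_1+\alpha_2)^2(\alpha_1+\alpha_2+1)} \qquad (4.1.31)$$

The parameters $\alpha_1$ and $\alpha_2$ can be computed from $\mu$ and $\sigma^2$ using

$$\alpha_1 = \frac{\mu^2}{\sigma^2}(1-\mu) - \mu \quad \text{and} \quad \alpha_2 = \frac{\mu(1-\mu)^2}{\sigma^2} - (1-\mu) \qquad (4.1.32)$$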
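Equations (4.1.31) and (4.1.32) are inverse to each other, which can be checked with a minimal Python sketch. The function names are hypothetical; the input values (mean 0.20, standard deviation 0.10) are those of the rain-water example treated later in this section.

```python
def beta_parameters(mean, std):
    """Parameters (alpha_1, alpha_2) of a beta distribution, equation (4.1.32)."""
    variance = std ** 2
    if not 0.0 < mean < 1.0:
        raise ValueError("the mean of a beta variable lies between 0 and 1")
    if variance >= mean * (1.0 - mean):
        raise ValueError("variance too large for a beta distribution")
    alpha_1 = mean ** 2 * (1.0 - mean) / variance - mean
    alpha_2 = mean * (1.0 - mean) ** 2 / variance - (1.0 - mean)
    return alpha_1, alpha_2


def beta_moments(alpha_1, alpha_2):
    """Mean and variance of a beta distribution, equation (4.1.31)."""
    total = alpha_1 + alpha_2
    mean = alpha_1 / total
    variance = alpha_1 * alpha_2 / (total ** 2 * (total + 1.0))
    return mean, variance


a1, a2 = beta_parameters(mean=0.20, std=0.10)   # close to 3 and 12
mean_back, variance_back = beta_moments(a1, a2)
```

The round trip through `beta_moments` returns the mean and variance that were supplied, up to floating-point rounding.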
No simple form of the moment generating function exists. In the special case where $\alpha_1 = \alpha_2 = 1$, the beta distribution reduces to the uniform distribution over [0, 1].

• Finally, we will frequently refer to Snedecor's F-distribution. A random variable defined over $]0, +\infty[$ is distributed with the F-distribution with $\nu_1$ and $\nu_2$ degrees of freedom
Figure 4.5 Beta probability density functions for different parameter pairs $\alpha_1$ and $\alpha_2$ (horizontal axis: x, from 0 to 1).
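when its density function is given by

$$f(x) = \frac{\Gamma[(\nu_1+\nu_2)/2]}{\Gamma(\nu_1/2)\,\Gamma(\nu_2/2)} \left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2} \frac{x^{\nu_1/2-1}}{(1+\nu_1 x/\nu_2)^{(\nu_1+\nu_2)/2}} \qquad (4.1.33)$$

The pth percentile of the F distribution with $\nu_1$ and $\nu_2$ degrees of freedom is denoted $F_{p,\nu_1,\nu_2}$. The order of $\nu_1$ and $\nu_2$ is critical since the F-distribution is not symmetric in these variables. Its mean is

$$\mu = \frac{\nu_2}{\nu_2-2}, \qquad \nu_2 > 2 \qquad (4.1.34)$$

and its variance

$$\sigma^2 = \frac{2\nu_2^2(\nu_1+\nu_2-2)}{\nu_1(\nu_2-2)^2(\nu_2-4)}, \qquad \nu_2 > 4 \qquad (4.1.35)$$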
• Making the change of variable $x = t^2$ in the F density for $\nu_1 = 1$ and $\nu_2 = n$, we get the Student's t-distribution defined over $]-\infty, +\infty[$

$$f(t) = \frac{\Gamma[(n+1)/2]}{\sqrt{n\pi}\,\Gamma(n/2)} \left(1+\frac{t^2}{n}\right)^{-(n+1)/2} \qquad (4.1.36)$$

Its mean is zero and its variance $n/(n-2)$. The pth percentile of the t distribution with $\nu$ degrees of freedom is denoted $t_{p,\nu}$. The Student's t-distribution converges rapidly towards the normal distribution: in practice, when $\nu > 30$, the two distributions become indistinguishable.
Useful relations among percentiles are

$$t^2_{1-p/2,\nu} = F_{1-p,1,\nu} \qquad (4.1.37)$$

and (4.1.38)
4.1.4 Some relationships between fundamental distributions

Although the formal proof of some relationships will be postponed until the necessary background is exposed, it is probably necessary at this point to justify the rather lengthy compendium of distributions of Section 4.1.3.

The exponential distribution with parameter $\lambda$ is the distribution of waiting times ('distance' in time) between events which take place at a mean rate of $\lambda$. It is also the distribution of distances between features which have a uniform probability of occurrence (Poisson process), such as the simplest model of faults on a map. The gamma distribution with parameters n and $\lambda^{-1}$, where n is an integer, is the distribution of the waiting time between the first and the nth successive events in a Poisson process. Alternatively, the distribution f(t), such as

$$f(t) = \frac{1}{n!}(\lambda t)^n\,e^{-\lambda t} \qquad (4.1.39)$$

represents the probability that n events have occurred over the time t. The associated gamma and Poisson distributions are well-suited to describe non-negative quantities that result from the addition of a finite number of units. Detrital input to a sedimentary layer from a particular drainage basin through river transport can be treated as a Poisson process by assuming that small batches of sediments are carried to the sea at random times. Each single layer represents a time interval over which the total sedimentary mass added from that drainage basin may be treated as a gamma variable. The same simple approach can be used to model the distribution of contributions from a magma source to a given batch of magma. Deloule et al. (1991) have shown that the energy spectrum of hydrogen atoms in amphiboles measured by ion probe fits a gamma distribution consistent with the atom being ejected from the sample after absorption of about seven electrons by a neighboring iron atom.

Given two gamma variables X and Y with parameters $\alpha_X$, $\beta$ and $\alpha_Y$, $\beta$, respectively, the proportion $X/(X+Y)$ is distributed as a beta distribution with parameters $\alpha_X$ and $\alpha_Y$. This relationship between the exponential, gamma and beta distributions is useful in handling mass balance problems. Using once again the sedimentary example, if only two basins contribute material to a detrital layer with quantities being described by Poisson processes of identical rate $\lambda$ but different numbers of batches $n_X$ and $n_Y$, the proportion of each component is distributed as a beta distribution with parameters $n_X$ and $n_Y$. Although this model may appear a little simplistic, it is still appealing enough to use the beta distribution for mass fractions in mixtures.
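The gamma-to-beta relationship described above can be explored with a small Monte Carlo experiment. The sketch below is only an illustration under stated assumptions: each gamma variable is built as a sum of independent exponential deviates of identical rate, and the function names, the seed, the rate and the batch numbers (3 and 12) are all hypothetical choices.

```python
import random


def gamma_from_exponentials(n_units, rate, rng):
    """Sum of n_units independent exponential deviates: a gamma variable (n_units, 1/rate)."""
    return sum(rng.expovariate(rate) for _ in range(n_units))


def simulate_proportions(n_x, n_y, rate, n_trials, seed=0):
    """Proportions X / (X + Y) for two gamma variables sharing the same rate."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(n_trials):
        x = gamma_from_exponentials(n_x, rate, rng)
        y = gamma_from_exponentials(n_y, rate, rng)
        proportions.append(x / (x + y))
    return proportions


props = simulate_proportions(n_x=3, n_y=12, rate=2.0, n_trials=20000)
sample_mean = sum(props) / len(props)
sample_var = sum((p - sample_mean) ** 2 for p in props) / (len(props) - 1)
```

For a beta distribution with parameters 3 and 12, equation (4.1.31) predicts a mean of 0.2 and a variance of 0.01, which the simulated proportions should approach for a large number of trials.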
The physical and conceptual importance of the normal distribution rests on one unique property: the sum of n random variables distributed with almost any arbitrary distribution tends to be distributed as a normal variable when $n \to \infty$ (the Central Limit Theorem). Most processes that result from the addition of numerous elementary processes therefore can be adequately parameterized with normal random variables. On any sort of axis that extends from $-\infty$ to $+\infty$, or when density on the negative side is negligible, most physical or chemical random variables can be represented to a good approximation by a normal density function. The normal distribution can be viewed as a position distribution. The ratio of two normal random variables with zero mean is distributed as a Cauchy variable. Isotopic ratios such as $^{206}$Pb/$^{204}$Pb and $^{207}$Pb/$^{204}$Pb therefore should not be described as normal variables since ratios of ratios (e.g., $^{207}$Pb/$^{206}$Pb) should be distributed with a consistent distribution. A consistent distribution for isotopic ratios is the log-normal distribution. The square of the distance between two points with position distributed normally is distributed as the $\chi^2$ distribution with one degree of freedom. The sum of n such squared distances is distributed as the $\chi^2$ distribution with n degrees of freedom. The ratio of squared distances, which is used for instance to test the ratio of variances or the ratio of a squared distance to a variance, is distributed as an F-distribution.

4.1.5 Estimators

The set of all possible outcomes of a measurement considered as a random variable is usually called the population. The parameters of the density function associated with a particular population, e.g., mean or variance, are not physically accessible since their determination would require an infinite number of measurements. A measurement, or more commonly a set of measurements ('points' or 'observations'), produces a finite set of outcomes called a sample.
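Any convenient number describing in a compact form some property of the sample is called a statistic, e.g., the sample mean

$$\bar{x} = \frac{1}{m}\sum_{j=1}^{m} x_j \qquad (4.1.40)$$

where m is the number of observations $x_j$ in the sample, or the sample variance

$$s^2 = \frac{1}{m-1}\sum_{j=1}^{m} (x_j - \bar{x})^2 \qquad (4.1.41)$$

The $(m-1)$ factor will be dealt with below. An alternative expression for $s^2$ is obtained by developing the squared parentheses as

$$s^2 = \frac{1}{m-1}\left[\sum_{j=1}^{m} x_j^2 - m\bar{x}^2\right] = \frac{m}{m-1}\left[\frac{1}{m}\sum_{j=1}^{m} x_j^2 - \bar{x}^2\right] \qquad (4.1.42)$$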
where the last term is often read as the mean square minus the squared mean. Although this expression is usually the easiest to implement on a computer, it may lead to devastating roundoff errors.

If the value of a sample statistic $\hat\theta$ is used to estimate a parameter $\theta$ of the population, this statistic is called an estimator and its value for the sample the estimate. Sample mean $\bar{x}$ and variance $s^2$ are the usual estimators of the population mean $\mu$ and variance $\sigma^2$. Estimators are most useful and reliable when they are convergent, i.e.,

$$|\hat\theta - \theta| \to 0 \quad \text{when } m \to \infty \qquad (4.1.43)$$

and unbiased, i.e.,

$$\mathcal{E}(\hat\theta - \theta) = 0 \qquad (4.1.44)$$

Sample mean $\bar{x}$ and variance $s^2$ are convergent and unbiased estimators (e.g., Hamilton, 1964), which implies that the so-called empirical variance $\hat\sigma^2$ given by

$$\hat\sigma^2 = \frac{1}{m}\sum_{j=1}^{m}(x_j-\bar{x})^2 = \frac{m-1}{m}\,s^2$$

has a bias $-\sigma^2/m$. $\hat\sigma^2$ is nevertheless convergent since the bias tends to zero when $m \to \infty$. If the estimator $\hat\theta$ is not biased, we can write its variance $\sigma_{\hat\theta}^2$ as

$$\sigma_{\hat\theta}^2 = \mathcal{E}\{[\hat\theta - \mathcal{E}(\hat\theta)]^2\} = \mathcal{E}[(\hat\theta - \theta)^2] \qquad (4.1.46)$$

The distribution of a sample statistic is a sampling distribution. A particularly important result concerns the variance of the sample mean given by

$$\sigma_{\bar{x}}^2 = \mathcal{E}[(\bar{x} - \mu)^2] = \frac{\sigma^2}{m} \qquad (4.1.47)$$
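The roundoff warning attached to the 'mean square minus squared mean' form of the sample variance can be made concrete with a short Python sketch. The data below are a hypothetical sample chosen to have a large mean and a small spread; the function names are not from the text.

```python
import math


def two_pass_variance(values):
    """Sample variance from deviations about the mean, as in equation (4.1.41)."""
    m = len(values)
    mean = sum(values) / m
    return sum((x - mean) ** 2 for x in values) / (m - 1)


def one_pass_variance(values):
    """Sample variance from the sum of squares, as in equation (4.1.42)."""
    m = len(values)
    mean = sum(values) / m
    sum_of_squares = 0.0
    for x in values:
        sum_of_squares += x * x
    return (sum_of_squares - m * mean * mean) / (m - 1)


def standard_error(values):
    """Estimate of the standard error of the mean, after equation (4.1.47)."""
    return math.sqrt(two_pass_variance(values) / len(values))


# Hypothetical data: deviations of -6, -3, +3, +6 about a mean of 1e9 + 10
data = [1.0e9 + 4.0, 1.0e9 + 7.0, 1.0e9 + 13.0, 1.0e9 + 16.0]
s2_two_pass = two_pass_variance(data)   # exact value is 30
s2_one_pass = one_pass_variance(data)   # spoiled by cancellation in double precision
```

In double precision the squares are of order $10^{18}$ and cannot carry the units digit, so the subtraction in the one-pass form loses the variance entirely, whereas the two-pass form works with small deviations and stays accurate.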
A proof of this statement can be found in Hamilton (1964). The square root of $\sigma_{\bar{x}}^2$ is usually referred to as the standard error of the mean.

4.1.6 Change of variable

Let $\varphi$ be a monotonic function and denote by $\varphi^{-1}$ its inverse function, which is assumed to be single-valued (i.e., a value of the independent variable is associated with one value of the dependent variable). If the random variable X has the density function $f_X(x)$, then $Y = \varphi(X)$ has the density function $f_Y(y)$ given by

$$f_Y(y) = f_X(x)\left|\frac{dx}{dy}\right| \qquad (4.1.48)$$

A more rigorous statement can be found in standard textbooks such as Hoel et al. (1971). In order to demonstrate this relationship, let us call $F_X$ and $F_Y$ the distribution
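functions corresponding to $f_X$ and $f_Y$, respectively. Let us assume first that $\varphi$ is monotonically increasing, so that

$$F_Y(y) = \mathcal{P}(Y \le y) = \mathcal{P}[\varphi(X) \le y]$$

Since $\varphi$ is increasing,

$$F_Y(y) = \mathcal{P}[X \le \varphi^{-1}(y)] = F_X[\varphi^{-1}(y)]$$

Using the chain rule for differentiation, we get

$$f_Y(y) = \frac{dF_Y}{dy} = \frac{dF_X}{d\varphi^{-1}}\,\frac{d\varphi^{-1}}{dy}$$

Since $x = \varphi^{-1}(y)$, the first term on the right-hand side is simply the function $f_X$ and therefore

$$f_Y(y) = f_X[\varphi^{-1}(y)]\,\frac{d\varphi^{-1}}{dy}$$

If $\varphi$ is monotonically decreasing,

$$F_Y(y) = \mathcal{P}[X \ge \varphi^{-1}(y)] = 1 - F_X[\varphi^{-1}(y)]$$

and

$$f_Y(y) = -f_X[\varphi^{-1}(y)]\,\frac{d\varphi^{-1}}{dy}$$

Considering the sign of $dx/dy$, we can combine the two expressions in the form

$$f_Y(y) = f_X[\varphi^{-1}(y)]\left|\frac{d\varphi^{-1}}{dy}\right| \qquad (4.1.49)$$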
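Equation (4.1.49) translates almost literally into code. The following Python sketch is a minimal illustration, assuming the user supplies the inverse function and its derivative by hand; the helper names are hypothetical. It is applied to two monotonic transformations that are also worked out analytically in this section, $Y = -\ln X$ for a uniform X and $Y = e^X$ for a standard normal X.

```python
import math


def transformed_density(f_x, phi_inverse, dphi_inverse_dy):
    """Density of Y = phi(X) for a monotonic phi, following equation (4.1.49)."""
    def f_y(y):
        return f_x(phi_inverse(y)) * abs(dphi_inverse_dy(y))
    return f_y


def uniform_density(x):
    """Uniform density over [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def standard_normal_density(x):
    """Normal density with zero mean and unit variance."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


# Y = -ln X with X uniform: x = exp(-y) and dx/dy = -exp(-y)
f_exponential = transformed_density(
    uniform_density, lambda y: math.exp(-y), lambda y: -math.exp(-y))

# Y = exp(X) with X standard normal: x = ln y and dx/dy = 1/y (y > 0)
f_lognormal = transformed_density(
    standard_normal_density, math.log, lambda y: 1.0 / y)
```

The absolute value in `transformed_density` is what allows the same helper to serve for increasing and decreasing transformations alike.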
☞ If X is a normal random variable with zero mean and unit variance, find the distribution $f_Y(y)$ of the variable Y related to X through the function

$$Y = \sigma X + \mu$$

where $\sigma$ and $\mu$ are two constants ($\sigma > 0$). The assumption that x is distributed as

$$f_X(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$$

gives the result as a straightforward application of equation (4.1.49)

$$f_Y(y) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{(y-\mu)^2}{2\sigma^2}\right]$$

We recognize the non-central normal distribution with deviate $(y-\mu)/\sigma$.

Another way of handling changes of variables is through the moment generating function. If Z is the sum of two independent random variables X and Y, integration of the two variables under the integral can be carried out independently, hence

$$M_Z(t) = M_{X+Y}(t) = \mathcal{E}[e^{t(X+Y)}] = \mathcal{E}(e^{tX}e^{tY}) = \mathcal{E}(e^{tX})\,\mathcal{E}(e^{tY}) = M_X(t)\,M_Y(t)$$

Consequently, the distribution of the sum Z of two normal variables X and Y with respective moment generating functions

$$M_X(t) = \exp\left(\mu_X t + \frac{\sigma_X^2 t^2}{2}\right) \quad \text{and} \quad M_Y(t) = \exp\left(\mu_Y t + \frac{\sigma_Y^2 t^2}{2}\right)$$

can be identified from its own moment generating function. From the previous result

$$M_Z(t) = \exp\left[(\mu_X+\mu_Y)\,t + \frac{(\sigma_X^2+\sigma_Y^2)\,t^2}{2}\right] \qquad (4.1.51)$$

The sum Z of two normal variables X and Y is a normal variable. Its mean is the sum of individual means, its variance the sum of individual variances. If X is a normal variable with mean $\mu$ and variance
Likewise, the distribution of the sum Z of two gamma variables X and Y with identical second parameter $\beta$ and with moment generating functions $M_X(t) = (1-\beta t)^{-\alpha_X}$ and $M_Y(t) = (1-\beta t)^{-\alpha_Y}$ has interesting additive properties. Again

$$M_Z(t) = M_X(t)\,M_Y(t) = (1-\beta t)^{-(\alpha_X+\alpha_Y)}$$

The sum $Z = X + Y$ is therefore a gamma variable with first parameter $\alpha_X + \alpha_Y$. A straightforward consequence is that the sum of a $\chi^2$ variable with m degrees of freedom and a $\chi^2$ variable with n degrees of freedom is a $\chi^2$ variable with $m + n$ degrees of freedom.

When $\varphi^{-1}$ is not a single-valued function, the range of variation of the random variable X can be split into intervals over which the function has this desirable property. We can use this method to find the distribution of $Y = X^2$ where X is a normal random variable with zero mean and unit variance

$$f_X(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$$
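X is defined over $]-\infty, +\infty[$, but one value of Y corresponds to two values of X. We say that $\varphi^{-1}$ is double-valued. Let us call $F_X(x)$ and $F_Y(y)$ the distribution functions of X and Y, respectively, and $f_Y(y)$ the density function of Y. By definition

$$F_Y(y) = \mathcal{P}(Y \le y) = \mathcal{P}(X^2 \le y)$$

and

$$\mathcal{P}(X^2 \le y) = \mathcal{P}(-\sqrt{y} \le X \le \sqrt{y})$$

and therefore

$$F_Y(y) = F_X(\sqrt{y}) - F_X(-\sqrt{y})$$

If we apply the chain rule to the definition of the density function $f_Y(y)$

$$f_Y(y) = \frac{dF_Y(y)}{dy}$$

we get

$$f_Y(y) = \left[\frac{dF_X(\sqrt{y})}{d\sqrt{y}} + \frac{dF_X(-\sqrt{y})}{d(-\sqrt{y})}\right]\frac{1}{2\sqrt{y}}$$

Each term in the brackets can be replaced by the appropriate expression of $f_X$

$$f_Y(y) = \frac{1}{2\sqrt{y}}\left[\frac{1}{\sqrt{2\pi}}\,e^{-y/2} + \frac{1}{\sqrt{2\pi}}\,e^{-y/2}\right]$$

and, finally

$$f_Y(y) = \frac{1}{\sqrt{2\pi y}}\,e^{-y/2}$$

As stated in Section 4.1.4, $Y = X^2$ is distributed as a chi-squared distribution with one degree of freedom. ◊

☞ Find the distribution of $Y = -\ln X$ for X uniformly distributed as

$$f_X(x) = 1 \ \text{for } 0 \le x \le 1, \qquad f_X(x) = 0 \ \text{elsewhere}$$

Applying the rule for the change of variable, we get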
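$$f_Y(y) = f_X(x)\left|\frac{dx}{dy}\right| = \left|\frac{d\,e^{-y}}{dy}\right|$$

Y is therefore distributed exponentially since

$$f_Y(y) = e^{-y}, \qquad y \ge 0 \qquad (4.1.54)$$

If u is a uniform deviate, $-\ln u$ is an exponential deviate. ◊

☞ Find the distribution of $Y = e^X$ where X is a normal variable with mean $\mu$ and variance $\sigma^2$. The density function of X is

$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right]$$

while it is easily found that

$$\frac{dy}{dx} = e^x = y$$

The density of probability function g(y) of Y is therefore

$$g(y) = \frac{1}{y\,\sigma\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{\ln y-\mu}{\sigma}\right)^2\right] \qquad (4.1.55)$$

which is the log-normal distribution. ◊

☞ Show that, for an m-point sample from a normal population X with mean $\mu$ and variance $\sigma^2$, the quantity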
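$$\frac{(m-1)\,s^2}{\sigma^2} = \sum_{j=1}^{m}\left(\frac{x_j-\bar{x}}{\sigma}\right)^2 \qquad (4.1.56)$$

where $\bar{x}$ and $s^2$ are the usual sample mean and variance, is distributed as a chi-squared distribution with $m-1$ degrees of freedom. In order to prove this statement, we write

$$\sum_{j=1}^{m}(x_j-\bar{x})^2 = \sum_{j=1}^{m}[(x_j-\mu)-(\bar{x}-\mu)]^2$$

or, factoring the terms independent of j

$$\sum_{j=1}^{m}(x_j-\bar{x})^2 = \sum_{j=1}^{m}(x_j-\mu)^2 + m(\bar{x}-\mu)^2 - 2(\bar{x}-\mu)\sum_{j=1}^{m}(x_j-\mu)$$

Since

$$\sum_{j=1}^{m}(x_j-\mu) = m(\bar{x}-\mu)$$

we obtain

$$\sum_{j=1}^{m}(x_j-\bar{x})^2 = \sum_{j=1}^{m}(x_j-\mu)^2 - m(\bar{x}-\mu)^2$$

Dividing this equation by $\sigma^2$, it becomes

$$\sum_{j=1}^{m}\left(\frac{x_j-\bar{x}}{\sigma}\right)^2 = \sum_{j=1}^{m}\left(\frac{x_j-\mu}{\sigma}\right)^2 - \left(\frac{\bar{x}-\mu}{\sigma/\sqrt{m}}\right)^2$$

or

$$\sum_{j=1}^{m}\left(\frac{x_j-\mu}{\sigma}\right)^2 = \frac{(m-1)\,s^2}{\sigma^2} + \left(\frac{\bar{x}-\mu}{\sigma/\sqrt{m}}\right)^2$$

The left-hand side is the sum of m squared normal deviates with zero mean and unit variance and is therefore distributed as a chi-squared variable with m degrees of freedom. Referring to the sampling distribution of $\bar{x}$ given above, the second term on the right-hand side is also distributed as a chi-squared variable with one degree of freedom. Because of the additive property of chi-squared variables, the first term on the right-hand side is distributed as a chi-squared variable with $m-1$ degrees of freedom. ◊

☞ The $\delta^{18}$O of rain water changes with the fraction x of water precipitated from atmospheric vapor according to the law

$$\delta^{18}\mathrm{O} + 1000 = (\delta^{18}\mathrm{O}_0 + 1000)(1-x)^{\alpha-1} \qquad (4.1.58)$$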
where the subscript 0 refers to the equatorial value and $\alpha$ is the liquid-vapor $^{18}$O/$^{16}$O fractionation coefficient (e.g., Faure, 1986). The values of the random variable X representing the fraction of water precipitated from atmospheric vapor at a certain station are assumed to be distributed as a beta distribution with parameters m and n. The mean of the variable X is determined to be 0.20 and its standard deviation $\sigma = 0.10$. Find the probability density of rain water $\delta^{18}$O at this station assuming $\delta^{18}\mathrm{O}_0 = 0$ and $\alpha = 1.0111$.

The probability density distribution $f_X(x)$ of the random variable X is a beta distribution with parameters m and n. Therefore

$$f_X(x) = \frac{\Gamma(m+n)}{\Gamma(m)\,\Gamma(n)}\,x^{m-1}(1-x)^{n-1} \qquad (4.1.59)$$

From equation (4.1.32), the parameters m and n of a beta distribution can be computed from the mean and the variance as

$$m = \frac{\mu^2}{\sigma^2}(1-\mu) - \mu = 3 \quad \text{and} \quad n = \frac{\mu(1-\mu)^2}{\sigma^2} - (1-\mu) = 12$$

Using a table of factorial functions, we compute

$$\frac{\Gamma(15)}{\Gamma(3)\,\Gamma(12)} = 1092$$

which gives the probability density distribution of the random variable x

$$f_X(x) = 1092\,x^2(1-x)^{11}$$

From equation (4.1.58), and since $1/(\alpha-1) \approx 90$, we express $1-x$ as a function of $\delta^{18}$O

$$1-x = \left(1+\frac{\delta^{18}\mathrm{O}}{1000}\right)^{90}$$

We will need the derivative of x with respect to $\delta^{18}$O, which reads

$$\frac{dx}{d\delta^{18}\mathrm{O}} = -\frac{90}{1000}\left(1+\frac{\delta^{18}\mathrm{O}}{1000}\right)^{89} \qquad (4.1.60)$$

We now express the probability density distribution of the random variable $\delta^{18}$O as

$$f(\delta^{18}\mathrm{O}) = f_X(x)\left|\frac{dx}{d\delta^{18}\mathrm{O}}\right|$$

or, inserting the appropriate expressions for x and its derivative relative to $\delta^{18}$O

$$f(\delta^{18}\mathrm{O}) = 1092 \times \frac{90}{1000}\left[1-\left(1+\frac{\delta^{18}\mathrm{O}}{1000}\right)^{90}\right]^2\left(1+\frac{\delta^{18}\mathrm{O}}{1000}\right)^{1079}$$

The graph of this distribution is shown in Figure 4.6. Using the power series expansion of Section 3.1 to the first order and assuming

$$\left(1+\frac{\delta^{18}\mathrm{O}}{1000}\right)^{90} \approx 1 + 90\,\frac{\delta^{18}\mathrm{O}}{1000}$$

we find that the variable $(-0.09 \times \delta^{18}\mathrm{O})$ is approximately distributed as a beta distribution with parameters 3 and 13. The mode (maximum) of the distribution is at $\delta^{18}\mathrm{O} \approx -1.7$, its mean at $\delta^{18}\mathrm{O} \approx -2.1$. In contrast, application of equation (4.1.58) to the mean value of x would have given

$$\delta^{18}\mathrm{O} = 1000\,(1-0.2)^{0.0111} - 1000 \approx -2.5$$

which is slightly, although significantly, incorrect. ◊

Figure 4.6 Calculated probability density function of $\delta^{18}$O values in rainwater (see text for assumptions and parameters).

☞ The concentration $C_{sol}^i$ of an element i in the residual solid after extraction of a liquid fraction F by fractional melting is given by equation (9.3.14)

$$C_{sol}^i = C_0^i\,(1-F)^{1/D_i-1} \qquad (4.1.62)$$
where $C_0^i$ is the concentration in the solid source before melting and $D_i$ is the solid-liquid partition coefficient for element i. The symbol F is used for consistency with other chapters and should not be mistaken as representing a cumulative distribution. For batch partial melting, equation (9.2.2) multiplied by $D_i$ gives

$$\frac{C_{sol}^i}{C_0^i} = \frac{D_i}{D_i + F(1-D_i)} \qquad (4.1.63)$$

Assuming that F is a beta random variable with parameters m and n, calculate the probability density of $C_{sol}^i/C_0^i$ for each model. Assume a mean F of 0.04 and a standard deviation of 0.03 and apply the resulting equations to elements with $D_i = 0.005$ and $D_i = 0.05$.

As in the previous exercise, we compute the two parameters of the beta distribution of the random variable as $m = 5/3$ and $n = 40$. The probability density function $f_F(F)$ is

$$f_F(F) = \frac{\Gamma(m+n)}{\Gamma(m)\,\Gamma(n)}\,F^{m-1}(1-F)^{n-1} = 525.4\,F^{2/3}(1-F)^{39} \qquad (4.1.64)$$

which is plotted in Figure 4.7. Then, the probability density function $f(C_{sol}^i/C_0^i)$ is given by

$$f(C_{sol}^i/C_0^i) = f_F(F)\left|\frac{dF}{d(C_{sol}^i/C_0^i)}\right| \qquad (4.1.65)$$
Figure 4.7 Assumed probability density function for the degree of melting F (top). Resulting probability density functions for the reduced solid concentration of element i upon fractional melting (middle) and batch melting (bottom) for different solid-liquid partition coefficients $D_i$ ($D_i = 0.005$ and $D_i = 0.05$).
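Let us first express F for fractional melting as a function of $C_{sol}^i/C_0^i$

$$F = 1 - \left(\frac{C_{sol}^i}{C_0^i}\right)^{D_i/(1-D_i)}$$

while its derivative is

$$\frac{dF}{d(C_{sol}^i/C_0^i)} = -\frac{D_i}{1-D_i}\left(\frac{C_{sol}^i}{C_0^i}\right)^{D_i/(1-D_i)-1} \qquad (4.1.66)$$

For batch partial melting, F depends on $C_{sol}^i/C_0^i$ as

$$F = \frac{D_i}{1-D_i}\left(\frac{C_0^i}{C_{sol}^i} - 1\right)$$

while its derivative is

$$\frac{dF}{d(C_{sol}^i/C_0^i)} = -\frac{D_i}{1-D_i}\left(\frac{C_{sol}^i}{C_0^i}\right)^{-2} \qquad (4.1.67)$$

Inserting equations (4.1.64), (4.1.66), and (4.1.67) into equation (4.1.65) will provide the probability density function $f(C_{sol}^i/C_0^i)$. For fractional melting, for instance, writing $r = C_{sol}^i/C_0^i$, the result is

$$f(r) = 525.4\,\frac{D_i}{1-D_i}\left[1 - r^{D_i/(1-D_i)}\right]^{2/3} r^{40D_i/(1-D_i)-1}$$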
with a corresponding equation for batch melting. Both distributions have been plotted in Figure 4.7 for the assigned values of $D_i$. The differences in the probability density functions between the two processes are quite striking for very incompatible elements ($D_i = 0.05$ and $D_i = 0.005$): batch melting has curves more 'peaked' than has fractional melting. ◊

☞ The fraction F of a gas, e.g., argon, remaining at time t in a spherical mineral of radius R is given by equation (8.6.7) as

$$F = \frac{6}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{n^2}\exp\left(-n^2\pi^2\,\frac{\mathcal{D}t}{R^2}\right)$$

where $\mathcal{D}$ is the diffusion coefficient of the gas in the mineral. Calculate the fraction $\varphi$ of gas remaining at time t in a population of spherical minerals with radii distributed as a gamma distribution with parameters $\alpha$ and $\beta$.

Let us first illustrate with a simple example how to handle this problem. If, instead
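of a continuous size distribution, the mineral population was made of, say, 3 size groups of minerals, each with volume $V_1$, $V_2$, $V_3$, then each volume fraction would be $f_1$, $f_2$, $f_3$. At any time, the total fraction remaining $\varphi(t)$ could be expressed as the weighted mean of the remaining fraction for each size fraction

$$\varphi(t) = f_1 F(V_1,t) + f_2 F(V_2,t) + f_3 F(V_3,t)$$

where F(t) has been rewritten F(V, t) to emphasize the dependence on radius and hence volume. Now, for a continuous size distribution, $f_V(V)\,dV$ is the fraction of the population with volume between V and $V + dV$. The fraction remaining $\varphi(t)$ is therefore

$$\varphi(t) = \int_0^{\infty} F(V,t)\,f_V(V)\,dV$$

The probability density function $f_R(R)$ of the radii is

$$f_R(R) = \frac{R^{\alpha-1}\,e^{-R/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)} \qquad (4.1.70)$$

where the parameters $\alpha$ and $\beta$ can be related to the mean $\mu_R$ and variance $\sigma_R^2$ through

$$\mu_R = \alpha\beta \qquad \sigma_R^2 = \alpha\beta^2$$

or

$$\alpha = (\mu_R/\sigma_R)^2 \quad \text{and} \quad \beta = \mu_R/\alpha \qquad (4.1.71)$$

Let us introduce the dimensionless variable $R/\mu_R$

$$f_R(R) = \frac{\alpha^{\alpha}}{\mu_R^{\alpha}\,\Gamma(\alpha)}\,R^{\alpha-1}\,e^{-\alpha R/\mu_R} = \frac{\alpha^{\alpha}}{\mu_R\,\Gamma(\alpha)}\left(\frac{R}{\mu_R}\right)^{\alpha-1} e^{-\alpha R/\mu_R} \qquad (4.1.72)$$

The radius R relates to the volume of the sphere by

$$R = \left(\frac{3V}{4\pi}\right)^{1/3} \qquad (4.1.73)$$

with the derivative

$$\frac{dV}{dR} = 4\pi R^2 \qquad (4.1.74)$$

The probability density function of volumes, which we keep expressed as a function of the radii, is

$$f_V(V) = f_R(R)\left|\frac{dR}{dV}\right| = \frac{f_R(R)}{4\pi R^2} \qquad (4.1.75)$$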
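or, inserting equation (4.1.73)

$$f_V(V) = \frac{\alpha^{\alpha}}{4\pi\mu_R^3\,\Gamma(\alpha)}\left[\frac{1}{\mu_R}\left(\frac{3V}{4\pi}\right)^{1/3}\right]^{\alpha-3}\exp\left[-\frac{\alpha}{\mu_R}\left(\frac{3V}{4\pi}\right)^{1/3}\right]$$

As a function of the dimensionless variable $R/\mu_R$, the fraction F(R, t) remaining in a sphere of radius R at time t reads

$$F(R,t) = \frac{6}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{n^2}\exp\left[-n^2\pi^2\,\frac{\mathcal{D}t}{\mu_R^2}\left(\frac{\mu_R}{R}\right)^2\right]$$

Introducing the dimensionless time $\tau = \mathcal{D}t/\mu_R^2$ and the dimensionless radius $x = R/\mu_R$, the fraction of gas $\varphi(t)$ remaining in the population can be expressed as

$$\varphi(t) = \frac{6\,\alpha^{\alpha}}{\pi^2\,\Gamma(\alpha)}\sum_{n=1}^{\infty}\frac{1}{n^2}\int_0^{\infty}\exp\left(-\frac{n^2\pi^2\tau}{x^2}\right)x^{\alpha-1}\,e^{-\alpha x}\,dx \qquad (4.1.77)$$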
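A rough numerical evaluation of equation (4.1.77) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the infinite series is truncated once its terms become negligible, the integral is approximated by a midpoint rule over a finite range of dimensionless radii (adequate for $\alpha \ge 1$), and all function names and default settings are hypothetical rather than taken from the text.

```python
import math


def remaining_single(s, max_terms=2000):
    """Fraction of gas left in one sphere, with s = D*t/R**2 (dimensionless)."""
    total = 0.0
    for n in range(1, max_terms + 1):
        exponent = -(n * math.pi) ** 2 * s
        if exponent < -60.0:          # all later terms are negligible
            break
        total += math.exp(exponent) / n ** 2
    return 6.0 / math.pi ** 2 * total


def gamma_density_unit_mean(x, alpha):
    """Gamma density of the dimensionless radius x = R/mu_R (shape alpha, mean 1)."""
    log_f = (alpha * math.log(alpha) + (alpha - 1.0) * math.log(x)
             - alpha * x - math.lgamma(alpha))
    return math.exp(log_f)


def remaining_population(tau, alpha, x_max=10.0, n_steps=1000):
    """Midpoint-rule estimate of equation (4.1.77) for dimensionless time tau."""
    dx = x_max / n_steps
    total = 0.0
    for k in range(n_steps):
        x = (k + 0.5) * dx
        total += remaining_single(tau / x ** 2) * gamma_density_unit_mean(x, alpha) * dx
    return total


tau = 0.001                                   # early stage of outgassing
phi_single = remaining_single(tau)            # all grains with the mean radius
phi_population = remaining_population(tau, alpha=4.0)
```

In this sketch the population with a spread of radii retains less gas at early times than a population of identical grains with the same mean radius, because the small grains outgas quickly.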
This expression looks complicated but can be computed with minimum effort by using numerical quadrature software. Allowing a size distribution for the outgassing of argon from minerals (Turner, 1968) makes it easier to understand why more argon is lost at low temperature from natural crystals than predicted by the uniform size distribution, regardless of mineral geometry. ◊

4.1.7 Confidence intervals

If a random variable X is defined over a continuous domain $\Omega$ in $\mathbb{R}$, the unknown mean $\mu$ of a sample lies in a known two-sided confidence interval $\omega = [x_a, x_b]$ at $100(1-\alpha)$ percent, or, equivalently, is known at the $\alpha$ significance level, if

$$\mathcal{P}(\mu \in \omega) = 1-\alpha$$

or

$$\mathcal{P}(x_a \le \mu \le x_b) = 1-\alpha \qquad (4.1.78)$$

The two-sided confidence interval is limited by the $100\alpha/2$ and $100(1-\alpha/2)$ percentiles. A commonly used confidence interval is 95 percent ($\alpha = 0.05$), although for the search of outliers (rogue values produced during data acquisition), larger confidence intervals are occasionally preferred. An application of the confidence interval concept central to most statistical assessment is the t-test for small normal samples. Let us consider a normally distributed variable X with mean $\mu$ and variance $\sigma^2$. It will be demonstrated below that for m observations with sample mean $\bar{x}$ and variance $s^2$, the variable U defined as

$$U = \frac{\bar{x}-\mu}{s/\sqrt{m}}$$
is distributed as a t-variable with $m-1$ degrees of freedom. At the $100(1-\alpha)$ percent confidence level, we write

$$\mathcal{P}\left(t_{100\alpha/2,\,m-1} \le \frac{\bar{x}-\mu}{s/\sqrt{m}} \le t_{100(1-\alpha/2),\,m-1}\right) = 1-\alpha \qquad (4.1.79)$$

Upon multiplication by $s/\sqrt{m}$ and subtraction of $\bar{x}$, we get

$$\mathcal{P}\left(-\bar{x} + t_{100\alpha/2,\,m-1}\frac{s}{\sqrt{m}} \le -\mu \le -\bar{x} + t_{100(1-\alpha/2),\,m-1}\frac{s}{\sqrt{m}}\right) = 1-\alpha$$

Multiplying the inequalities in parentheses by $-1$ changes the 'smaller than' into 'greater than'

$$\mathcal{P}\left(\bar{x} - t_{100\alpha/2,\,m-1}\frac{s}{\sqrt{m}} \ge \mu \ge \bar{x} - t_{100(1-\alpha/2),\,m-1}\frac{s}{\sqrt{m}}\right) = 1-\alpha$$

Since the t-distribution is even, $t_{100\alpha/2,\,m-1} = -t_{100(1-\alpha/2),\,m-1}$, and, finally

$$\mathcal{P}\left(\bar{x} - t_{100(1-\alpha/2),\,m-1}\frac{s}{\sqrt{m}} \le \mu \le \bar{x} + t_{100(1-\alpha/2),\,m-1}\frac{s}{\sqrt{m}}\right) = 1-\alpha \qquad (4.1.80)$$

A widely used $\alpha = 5$ percent significance level produces a 95 percent confidence interval extending over $\pm t_{97.5,\,m-1}\,s/\sqrt{m}$ about the mean $\bar{x}$. For $m = 6$, 11, 31, and $\infty$, the $t_{97.5,\,m-1}$ values are 2.57, 2.23, 2.04, and 1.96, respectively (e.g., Spiegel, 1975). The last figure applies to the academic case of an infinite number of measurements and corresponds to the 95 percent confidence interval for a standard normal distribution. Therefore, the normal approximation of the t-distribution is correct to $\approx 14$ percent for $m > 10$ and to 4 percent for $m > 30$.

Alternatively, one may associate significance levels with specific intervals around the mean. For large m, intervals of $1 \times s/\sqrt{m}$, $2 \times s/\sqrt{m}$, and $3 \times s/\sqrt{m}$ on each side of $\bar{x}$ correspond to the 31.7, 4.6, and 0.3 percent significance levels, respectively. They limit 68.3, 95.4, and 99.7 percent of the surface enclosed under the density of probability curve. Jargon often refers to these intervals as the $1\sigma$, $2\sigma$, and $3\sigma$ intervals of the mean.

The same principle applies to the confidence interval of variances. We find that keeping the same sample from the same normal distribution, the variable v such as

$$v = (m-1)\,\frac{s^2}{\sigma^2}$$

is distributed as a chi-squared variable with $m-1$ degrees of freedom. We therefore write the definition of a two-sided $100(1-\alpha)$ percent confidence interval as

$$\mathcal{P}\left(\chi^2_{100\alpha/2,\,m-1} \le (m-1)\,\frac{s^2}{\sigma^2} \le \chi^2_{100(1-\alpha/2),\,m-1}\right) = 1-\alpha \qquad (4.1.81)$$
Dividing by $(m-1)\,s^2$ and taking the reciprocal, we get

$$\mathcal{P}\left(\frac{(m-1)\,s^2}{\chi^2_{100\alpha/2,\,m-1}} \ge \sigma^2 \ge \frac{(m-1)\,s^2}{\chi^2_{100(1-\alpha/2),\,m-1}}\right) = 1-\alpha$$

where the inequality signs have been reversed, or, equivalently

$$\mathcal{P}\left(s\sqrt{\frac{m-1}{\chi^2_{100(1-\alpha/2),\,m-1}}} \le \sigma \le s\sqrt{\frac{m-1}{\chi^2_{100\alpha/2,\,m-1}}}\right) = 1-\alpha \qquad (4.1.82)$$

☞ The $^{206}$Pb/$^{204}$Pb ratios of four samples from a Polynesian island have been determined to be 18.999, 19.091, 19.216, and 19.222. Assuming that these measurements represent a sample from a normal population, find a 95 percent confidence interval for the mean and the standard deviation of the population.

Here $m = 4$ and the number of degrees of freedom $m-1 = 3$. For a two-sided exclusion domain, a 95 percent confidence interval corresponds to $\alpha/2 = 2.5$ percent. From standard statistical tables (e.g., Spiegel, 1975), we obtain $t_{97.5,3} = 3.18$. Let us calculate the sample mean

$$\bar{x} = \frac{18.999 + 19.091 + 19.216 + 19.222}{4} = 19.132$$

and variance

$$s^2 = \frac{(18.999-19.132)^2 + (19.091-19.132)^2 + (19.216-19.132)^2 + (19.222-19.132)^2}{3} = 0.0115$$

or $s = 0.107$. A 95 percent confidence interval for the population mean $\mu$ is given by

$$\bar{x} - t_{97.5,3}\,\frac{s}{\sqrt{m}} \le \mu \le \bar{x} + t_{97.5,3}\,\frac{s}{\sqrt{m}}$$

or $18.961 \le \mu \le 19.303$. Likewise, we found the values $\chi^2_{2.5,3} = 0.216$ and $\chi^2_{97.5,3} = 9.35$ from the tables, which provides a 95 percent confidence interval for $\sigma$ as

$$s\sqrt{\frac{3}{9.35}} \le \sigma \le s\sqrt{\frac{3}{0.216}}$$

or $0.061 \le \sigma \le 0.400$.
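The arithmetic of this example can be retraced with a minimal Python sketch built on equations (4.1.80) and (4.1.82). The percentiles are the tabulated values quoted in the text ($t_{97.5,3} = 3.18$, $\chi^2_{2.5,3} = 0.216$, $\chi^2_{97.5,3} = 9.35$) and are passed in by hand; the function names are hypothetical.

```python
import math


def sample_mean_and_std(values):
    """Sample mean and standard deviation, with the (m - 1) divisor."""
    m = len(values)
    mean = sum(values) / m
    variance = sum((v - mean) ** 2 for v in values) / (m - 1)
    return mean, math.sqrt(variance)


def mean_interval(values, t_upper):
    """Two-sided confidence interval on the mean, equation (4.1.80)."""
    mean, s = sample_mean_and_std(values)
    half_width = t_upper * s / math.sqrt(len(values))
    return mean - half_width, mean + half_width


def sigma_interval(values, chi2_lower, chi2_upper):
    """Two-sided confidence interval on the standard deviation, equation (4.1.82)."""
    _, s = sample_mean_and_std(values)
    dof = len(values) - 1
    return s * math.sqrt(dof / chi2_upper), s * math.sqrt(dof / chi2_lower)


pb_ratios = [18.999, 19.091, 19.216, 19.222]          # data of the example
mu_low, mu_high = mean_interval(pb_ratios, t_upper=3.18)
sigma_low, sigma_high = sigma_interval(pb_ratios, chi2_lower=0.216, chi2_upper=9.35)
```

The interval on $\sigma$ is strongly asymmetric about s, which reflects the skewness of the chi-squared distribution with only three degrees of freedom.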
4.1.8 Random deviates

Models are often best understood relative to the situation they are designed to describe if their constitutive variables are allowed to fluctuate statistically in a realistic way. Once a variable has been assigned a suitable density of probability distribution and the parameters of this distribution have been chosen, the fluctuations can be conveniently produced by using random deviates from statistical tables. A random deviate is a particular value of a standard random variable. Many elementary books in statistics contain tables of deviates from uniform, normal, exponential, ... distributions. Many high-level computation-oriented programming languages (e.g., MatLab) and spreadsheets, such as Microsoft Excel, also contain random number generators. The book by Press et al. (1986) contains software that produces random deviates for the most commonly used probability distributions.

☞ Make a table of 20 'crustal' values of $\varepsilon_{Nd}(0)$ which is assumed to be a normal variable with mean $\mu = -12$ and standard deviation $\sigma = 3$. The assumption is that the random variable U such as

$$U = \frac{\varepsilon_{Nd}(0) + 12}{3}$$

is a normal variable with mean 0 and standard deviation 1. We first produce 20 random normal deviates $u_i$ for $i = 1, \ldots, 20$ (here the MatLab 'rand' function has been used with the option 'normal') and then compute the values $[\varepsilon_{Nd}(0)]^{(i)}$ using

$$[\varepsilon_{Nd}(0)]^{(i)} = -12 + 3u_i$$

Computation from Table 4.1 would produce the satisfactory values $\bar{x} = -11.6$ and $s = 3.1$. ◊

☞ Build a table of 20 values of 'basaltic' Ce concentrations $C_{Ce}$ assumed to have a log-normal distribution with mean $\mu = \ln 20$ ppm and a standard deviation $\sigma = \ln 4$. The assumption of a log-normal distribution is that the variable U such as

$$U = \frac{\ln C_{Ce} - \ln 20}{\ln 4}$$

is normally distributed with a mean 0 and a standard deviation 1. In a sense, the log-normal distribution is the normal distribution of relative errors: for $u = +1$, the concentration equals $\exp(\mu)$ multiplied by 4, while for $u = -1$, it is divided by 4. Once the normal deviates $u_i$ are produced independently for the two elements, the concentrations are calculated as

$$\ln (C_{Ce})^{(i)} = \ln 20 + u_i \ln 4$$

or $(C_{Ce})^{(i)} = 20 \times 4^{u_i}$ in ppm. Using a generator of normal deviates, we get Table 4.2. ◊

Table 4.1. Twenty random crustal values of $\varepsilon_{Nd}(0)$ from a normal population with mean $\mu = -12$ and standard deviation $\sigma = 3$ produced by the normal random deviates $u_i$.

 i     u_i       [eps_Nd(0)]^(i)
 1    -0.2345    -12.70
 2     1.4525     -7.64
 3     0.7631     -9.71
 4     0.1402    -11.58
 5     1.0192     -8.94
 6    -0.5806    -13.74
 7     0.9448     -9.17
 8    -1.7994    -17.40
 9     0.5777    -10.27
10     0.7101     -9.87
11     1.8706     -6.39
12    -0.4809    -13.44
13    -0.6921    -14.08
14    -0.2945    -12.88
15     1.4001     -7.80
16    -1.9289    -17.79
17    -0.8867    -14.67
18     0.0824    -11.76
19     0.2545    -11.24
20     0.4311    -10.71

Table 4.2. Twenty random basaltic Ce concentrations (in ppm) from a log-normal population with mean $\mu = \ln 20$ ppm and standard deviation $\sigma = \ln 4$ produced by the normal random deviates $u_i$.

 i     u_i     (C_Ce)^(i)
 1    -1.47      2.59
 2     0.27     29.00
 3     1.77    231.28
 4    -1.77      1.72
 5     1.63    190.94
 6     1.20    104.89
 7    -1.70      1.89
 8    -0.59      8.88
 9     2.21    425.51
10    -0.83      6.35
11    -2.37      0.75
12    -0.87      5.99
13    -0.70      7.58
14     0.86     65.84
15    -1.30      3.32
16     1.97    305.30
17    -2.30      0.82
18    -0.99      5.04
19     0.11     23.19
20     1.38    134.57
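Tables of this kind can be regenerated with any generator of normal deviates. The Python sketch below uses the standard-library `random.gauss` in place of the MatLab function mentioned in the text; the seed and the function names are hypothetical choices, so the deviates it draws differ from those of Tables 4.1 and 4.2.

```python
import random


def nd_value(u, mean=-12.0, std=3.0):
    """Crustal epsilon-Nd value obtained from a standard normal deviate u."""
    return mean + std * u


def ce_concentration(u, median_ppm=20.0, factor=4.0):
    """Log-normal Ce concentration (ppm): 20 x 4**u for a standard normal deviate u."""
    return median_ppm * factor ** u


rng = random.Random(1)                                   # fixed seed for reproducibility
nd_deviates = [rng.gauss(0.0, 1.0) for _ in range(20)]
ce_deviates = [rng.gauss(0.0, 1.0) for _ in range(20)]   # drawn independently

nd_table = [(i + 1, u, nd_value(u)) for i, u in enumerate(nd_deviates)]
ce_table = [(i + 1, u, ce_concentration(u)) for i, u in enumerate(ce_deviates)]
```

Feeding the first deviate of Table 4.1 into `nd_value`, i.e. `nd_value(-0.2345)`, returns a value that rounds to the tabulated -12.70.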
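4.2 Several random variables

• For an n column-vector X of n random variables $X = (X^1, X^2, \ldots, X^n)^T$, and a continuous domain $\Omega$ of definition in $\mathbb{R}^n$, we can define a joint multivariate distribution function $F_X(x^1, x^2, \ldots, x^n)$ consistent with the definition of single variable functions

$$F_X(x^1, x^2, \ldots, x^n) = \mathcal{P}(X^1 \le x^1, X^2 \le x^2, \ldots, X^n \le x^n) \qquad (4.2.1)$$

The joint multivariate density function $f_X(x^1, x^2, \ldots, x^n)$ is

$$f_X(x^1, x^2, \ldots, x^n) = \frac{\partial^n F_X}{\partial x^1\,\partial x^2 \cdots \partial x^n} \qquad (4.2.2)$$

If $\Omega$ coincides with $\mathbb{R}^n$, $F_X$ and $f_X$ are related through

$$F_X(x^1, x^2, \ldots, x^n) = \int_{-\infty}^{x^1}\int_{-\infty}^{x^2}\cdots\int_{-\infty}^{x^n} f_X(u^1, u^2, \ldots, u^n)\,du^1\,du^2 \cdots du^n$$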
An important concept is the marginal density function which will be better explained with the joint bivariate distribution of the two random variables X and Y and its density $f_{XY}(x, y)$. The marginal density function $f_X^M(x)$ is the density function for X calculated upon integration of Y over its whole range of variation. If X and Y are defined over $\mathbb{R}^2$, we get

$$f_X^M(x) = \int_{-\infty}^{+\infty} f_{XY}(x, y)\,dy$$

$f_X^M(x)\,dx$ measures the weight of the probability 'slice' taken along y at $X = x$ (Figure 4.8). $f_X^M(x)$ is the density function of X regardless of variations of Y. This concept is easily extended to higher dimensions.

Figure 4.8 A bivariate probability density function. The slice parallel to the y axis represents the marginal density $f_X^M(x)$.

Two random variables X and Y are independent if their joint density function $f_{XY}$ can be factored as a product of two density functions, each involving one variable, e.g.,

$$f_{XY}(x, y) = f_X(x)\,f_Y(y) \qquad (4.2.3)$$
• Given two random variables X and Y, the covariance of X and Y is

$$\mathrm{cov}(X, Y) = \sigma_{XY} = \int\!\!\int (x-\mu_X)(y-\mu_Y)\,f_{XY}(x, y)\,dx\,dy \qquad (4.2.4)$$

or

$$\mathrm{cov}(X, Y) = \mathcal{E}\{[X-\mathcal{E}(X)][Y-\mathcal{E}(Y)]\} \qquad (4.2.5)$$

which can be developed as

$$\mathrm{cov}(X, Y) = \mathcal{E}(XY) - \mathcal{E}(X)\,\mathcal{E}(Y) \qquad (4.2.6)$$

From this definition, cov(X, Y) is identical to cov(Y, X). For two independent variables, the definition of expectation shows that

$$\mathcal{E}(XY) = \mathcal{E}(X)\,\mathcal{E}(Y) \qquad (4.2.7)$$

so their covariance is zero. The reciprocal statement is not necessarily true. The correlation coefficient $\rho_{XY}$ of two random variables X and Y is

$$\rho_{XY} = \frac{\mathrm{cov}(X, Y)}{\sqrt{\mathrm{var}(X)\,\mathrm{var}(Y)}} \qquad (4.2.8)$$

and may be viewed as the covariance standardized between $-1$ and $+1$. The correlation coefficient measures the linear dependence between the two variables X and Y. Let us assume that they are perfectly correlated, i.e., $Y = aX + b$ with a and b constant. The linearity of the expectation operator amounts to

$$\mathcal{E}(Y) = a\,\mathcal{E}(X) + b$$

and therefore

$$Y - \mathcal{E}(Y) = a\,[X - \mathcal{E}(X)]$$

which, using equations (4.1.7) and (4.2.5), results in the relations between expected values $\mathrm{var}(Y) = a^2\,\mathrm{var}(X)$ and $\mathrm{cov}(X, Y) = a\,\mathrm{var}(X)$ and therefore

$$\rho_{XY} = \frac{a\,\mathrm{var}(X)}{\sqrt{a^2\,\mathrm{var}(X)^2}} = \frac{a}{|a|}$$

$\rho_{XY}$ is therefore equal to $+1$ if $a > 0$ and $-1$ if $a < 0$.

• Referring to the n random variables $X^1, X^2, \ldots, X^n$ collectively as the vector X, the same
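definitions apply and we can form the covariance (or dispersion) matrix $\Sigma_X$ as

$$\Sigma_X = \begin{bmatrix} \mathrm{var}(X^1) & \cdots & \mathrm{cov}(X^1, X^n) \\ \vdots & \ddots & \vdots \\ \mathrm{cov}(X^n, X^1) & \cdots & \mathrm{var}(X^n) \end{bmatrix}$$

No bold face will be used for the subscript X of $\Sigma_X$. The current element (i1, i2) of $\Sigma_X$ can be rewritten as $\mathcal{E}(X^{i1}X^{i2}) - \mathcal{E}(X^{i1})\,\mathcal{E}(X^{i2})$. The condensed form of the covariance matrix is obtained by using the outer product defined in Section 2.1

$$\Sigma_X = \mathcal{E}\{[X-\mathcal{E}(X)][X-\mathcal{E}(X)]^T\} \qquad (4.2.9)$$

The current element (i1, i2) of the correlation matrix $\rho$ is calculated by the relation (4.2.8). Let the standard-deviation matrix $\sigma$ be

$$\sigma = \begin{bmatrix} \sqrt{\mathrm{var}(X^1)} & 0 & \cdots & 0 \\ 0 & \sqrt{\mathrm{var}(X^2)} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sqrt{\mathrm{var}(X^n)} \end{bmatrix} \qquad (4.2.10)$$

It can be verified that

$$\Sigma_X = \sigma\rho\sigma \qquad (4.2.11)$$

The concept of covariance matrix can be extended to two distinct vectors of different dimensions, $X \in \mathbb{R}^m$ and $Y \in \mathbb{R}^n$,

$$\mathrm{Cov}(X, Y) = \mathcal{E}\{[X-\mathcal{E}(X)][Y-\mathcal{E}(Y)]^T\}$$

or

$$\mathrm{Cov}(X, Y) = \mathcal{E}(XY^T) - \mathcal{E}(X)\,\mathcal{E}(Y)^T \qquad (4.2.12)$$

the $m \times n$ matrix Cov(X, Y) being no longer symmetric or even square. The $m \times n$ correlation matrix Corr(X, Y) is also the matrix of correlation coefficients and is likewise no longer symmetric or even square. If $\sigma_X$ is the $m \times m$ matrix of standard deviations on X and $\sigma_Y$ is the $n \times n$ matrix of standard deviations on Y, we get the relationship equivalent to equation (4.2.11), i.e.,

$$\mathrm{Cov}(X, Y) = \sigma_X\,\mathrm{Corr}(X, Y)\,\sigma_Y \qquad (4.2.13)$$

The converse relationship

$$\mathrm{Corr}(X, Y) = \sigma_X^{-1}\,\mathrm{Cov}(X, Y)\,\sigma_Y^{-1} \qquad (4.2.14)$$

will prove useful for principal component analysis (Section 4.4).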
4.2.1 Estimators

For a vector X of n random variables with mean vector $\mu$ and $n \times n$ symmetric covariance matrix $\Sigma_X$, an m-point sample is a matrix X with n rows and m columns. We should be aware that the variables now appear row-wise, so that the current element $x_j^i$ of X is the jth measurement of the ith variable.

• The sample mean vector $\bar{x}$ is a column vector with n elements such as

$$\bar{x} = \frac{1}{m}\,XJ_m \qquad (4.2.15)$$

where $J_m$ is an m column-vector $(1, 1, \ldots, 1)^T$.

• Variances and covariances can be lumped together into the $n \times n$ symmetric sample covariance or dispersion matrix S (or $\hat\Sigma$) with current element $s_{i1,i2}$ such that

$$s_{i1,i2} = s_{i2,i1} = \frac{1}{m-1}\sum_{j=1}^{m}(x_j^{i1}-\bar{x}^{i1})(x_j^{i2}-\bar{x}^{i2}) \qquad (4.2.16)$$

where the sum is over all the observations j. We recognize the sample variances for $i1 = i2$ and covariances for $i1 \ne i2$. The 'deviation' matrix of X from the mean sample vector is the $n \times m$ matrix $X - \bar{x}J_m^T$ while the sample covariance matrix reads

$$S = \frac{1}{m-1}\,(X - \bar{x}J_m^T)(X - \bar{x}J_m^T)^T \qquad (4.2.17)$$

which is the most useful form of this matrix with the least roundoff error.

• The symmetric sample correlation matrix R (or $\hat\rho$) is similarly defined by its current element

$$r_{i1,i2} = \frac{s_{i1,i2}}{s_{i1}\,s_{i2}} = \frac{\sum_{j=1}^{m}(x_j^{i1}-\bar{x}^{i1})(x_j^{i2}-\bar{x}^{i2})}{\sqrt{\sum_{j=1}^{m}(x_j^{i1}-\bar{x}^{i1})^2\,\sum_{j=1}^{m}(x_j^{i2}-\bar{x}^{i2})^2}}$$

Calling the diagonal matrix of the sample standard deviations s, R and S relate through

$$S = sRs \qquad (4.2.18)$$

• Likewise, the sample covariance matrix between two vectors x and y would be

$$\mathrm{Cov}(x, y) = \frac{1}{m-1}\,(X - \bar{x}J_m^T)(Y - \bar{y}J_m^T)^T \qquad (4.2.19)$$

Defining $s_x$ and $s_y$ as the diagonal matrices of sample standard deviations on x and y, respectively, the sample correlation matrix would be

$$\mathrm{Corr}(x, y) = s_x^{-1}\,\mathrm{Cov}(x, y)\,s_y^{-1} \qquad (4.2.20)$$
We are dealing with unbiased estimators, so we can write
S(R) = e
which is the basis for statistical assessment in any modeling situation.
(4.2.21)
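The estimators of this section can be sketched numerically as follows. This is a minimal illustration, assuming NumPy is available and using the four lead isotope analyses of Table 4.3 as data; the variable names are ours and not part of the text.

```python
import numpy as np

# Rows are variables, columns are observations (n x m), as in the text.
# Values are the four Pb isotope analyses of Table 4.3.
X = np.array([
    [18.999, 19.091, 19.216, 19.222],   # 206Pb/204Pb
    [15.569, 15.616, 15.621, 15.619],   # 207Pb/204Pb
])
n, m = X.shape
J = np.ones((m, 1))                      # column vector of ones, J_m

xbar = X @ J / m                         # sample mean vector, eq. (4.2.15)
D = X - xbar @ J.T                       # deviation matrix, n x m
S = D @ D.T / (m - 1)                    # sample covariance matrix, eq. (4.2.17)

s = np.diag(np.sqrt(np.diag(S)))         # diagonal matrix of standard deviations
s_inv = np.linalg.inv(s)
R = s_inv @ S @ s_inv                    # sample correlation matrix, from eq. (4.2.18)

print(xbar.ravel())
print(S)
print(R[0, 1])
```

Forming the deviation matrix first and multiplying it by its transpose follows equation (4.2.17); the product s R s recovers S, which is a convenient consistency check.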
Table 4.3. Lead isotope ratios of four Polynesian lavas.

206Pb/204Pb    207Pb/204Pb
18.999         15.569
19.091         15.616
19.216         15.621
19.222         15.619

Four samples from a Polynesian island gave the lead isotope compositions given in Table 4.3. Calculate the mean and standard deviation vectors, the covariance matrix and the correlation coefficient between the two isotope ratios. The sample mean vector is $\bar{x} = [19.132, 15.606]^T$, the standard-deviation vector $s = [0.107, 0.025]^T$, and the covariance matrix S given by

$$S = \begin{bmatrix} 0.011509 & 0.002314 \\ 0.002314 & 0.000621 \end{bmatrix}$$

which, as expected for lead isotopic compositions, indicates a rather strong correlation between the two ratios (r = 0.87).

4.2.2 Useful multivariate distributions

• A normal (gaussian) probability density function in one centered and standardized variable X reads

$$f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$$

n independent normal centered and standardized variables lumped into an n-vector X will be distributed as a multivariate normal distribution with the joint density probability function

$$f(x) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}} e^{-x_i^2/2} = \frac{1}{(2\pi)^{n/2}} e^{-x^T x/2}$$ (4.2.22)

Since the standard deviations are unity and the variables are independent (zero covariance), the covariance matrix of X is the identity matrix $I_n$ and the contours of constant probability in the space $\Re^n$ are given by

$$x^T x = \text{constant}$$

The surfaces of constant probability density are hyper-spheres. In the more general case, the vector X with mean $\mu$ and n x n covariance matrix $\Sigma_X$ has the 'non-central' joint density of probability (see below)

$$f(x) = \frac{1}{(2\pi)^{n/2} \sqrt{\det \Sigma_X}} \exp\left[-\frac{1}{2}(x - \mu)^T \Sigma_X^{-1} (x - \mu)\right]$$ (4.2.23)

Contours of constant probability density in the space $\Re^n$ are such as

$$(x - \mu)^T \Sigma_X^{-1} (x - \mu) = \text{const}$$ (4.2.24)

and describe concentric hyper-ellipsoids centered in $\mu$. If $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of $\Sigma_X$, all non-negative, the axes of these hyper-ellipsoids along their eigenvectors have half-lengths proportional to $\sqrt{\lambda_1}, \sqrt{\lambda_2}, \ldots, \sqrt{\lambda_n}$ (see Section 2.4). This is the base of the widely used concept of 'error ellipse'.

Parallel to the case of a single random variable, the mean vector and covariance matrix of random variables involved in a measurement are usually unknown, suggesting the use of their sampling distributions instead. Let us assume that x is a vector of n normally distributed variables with mean n-column vector $\mu$ and covariance matrix $\Sigma_X$. A sample of m observations has a mean vector $\bar{x}$ and an n x n covariance matrix S. The properties of the t-distribution are extended to n variables by stating that the scalar $m(\bar{x} - \mu)^T S^{-1} (\bar{x} - \mu)$ is distributed as the Hotelling's $T^2$ distribution. The matrix S/m is simply the covariance matrix of the estimate $\bar{x}$. There is no need to tabulate the $T^2$ distribution since the statistic

$$\frac{m-n}{n(m-1)} T^2 = \frac{m-n}{n(m-1)} (\bar{x} - \mu)^T \left(\frac{S}{m}\right)^{-1} (\bar{x} - \mu)$$ (4.2.25)

may be shown (e.g., Seber, 1984) to be distributed as a distribution F with n and m - n degrees of freedom. When $m \gg n$, i.e., when the number of measurements largely exceeds the number of variables, the left-hand side of equation (4.2.25) tends towards $T^2/n$. From equation (4.1.38), $T^2$, which we can rewrite $nT^2/n$, is therefore distributed as

$$n F_{P, n, \infty} = \chi^2_{P, n}$$ (4.2.26)

Since $T^2$ is a chi-squared variable with n degrees of freedom, its mean value should tend to n when the number m of measurements is very large.
4.2.3 Change of variables Let
(4.2.27)
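where J(y/x) is the value for X = x and Y = y of the Jacobian of the transformation, i.e.,

$$J(Y/X) = \det \begin{bmatrix} \dfrac{\partial Y_1}{\partial X_1} & \cdots & \dfrac{\partial Y_1}{\partial X_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial Y_n}{\partial X_1} & \cdots & \dfrac{\partial Y_n}{\partial X_n} \end{bmatrix}$$ (4.2.28)

X is an n-vector of n independent normal standardized variables, i.e., with zero mean, unit variances and null covariances. Find (i) the density function and (ii) the covariance matrix of the vector Y given by

$$Y = AX + b$$ (4.2.29)

where $A \in \Re^{n \times n}$ is a non-singular matrix and $b \in \Re^n$.

(i) As we have seen above, the joint multivariate density function of the vector X is simply the product of the n normal density functions

$$f_X(x) = \frac{1}{(2\pi)^{n/2}} e^{-x^T x/2}$$

In the present case, J is simply the determinant det A of the matrix A and $\varphi^{-1}$ is calculated as

$$x = A^{-1}(y - b)$$

Using equation (4.2.27) that gives the density function of a dependent variable, we obtain

$$f_Y(y) = \frac{1}{(2\pi)^{n/2} |\det A|} \exp\left\{-\frac{1}{2} [A^{-1}(y - b)]^T A^{-1}(y - b)\right\}$$

Application of basic rules of matrix manipulation gives

$$[A^{-1}(y - b)]^T = (y - b)^T (A^T)^{-1}$$

and

$$(A^T)^{-1} A^{-1} = (A A^T)^{-1}$$

so that

$$f_Y(y) = \frac{1}{(2\pi)^{n/2} |\det A|} \exp\left[-\frac{1}{2} (y - b)^T (A A^T)^{-1} (y - b)\right]$$ (4.2.30)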
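(ii) The covariance matrix $\Sigma_X$ of X is defined as

$$\Sigma_X = \mathcal{E}\{[X - \mathcal{E}(X)][X - \mathcal{E}(X)]^T\}$$

Applying the linearity property of the expectation to the change of variable given by equation (4.2.29), we get

$$\mathcal{E}(Y) = A\,\mathcal{E}(X) + b$$

and, upon subtraction from equation (4.2.29)

$$Y - \mathcal{E}(Y) = A[X - \mathcal{E}(X)]$$

The transpose of this equality is

$$[Y - \mathcal{E}(Y)]^T = [X - \mathcal{E}(X)]^T A^T$$

Multiplying the last two equations and using the definition of the covariance matrix, we obtain

$$\mathcal{E}\{[Y - \mathcal{E}(Y)][Y - \mathcal{E}(Y)]^T\} = A\,\mathcal{E}\{[X - \mathcal{E}(X)][X - \mathcal{E}(X)]^T\} A^T$$

and therefore

$$\Sigma_Y = A \Sigma_X A^T$$ (4.2.31)

In the present case, $\Sigma_X = I_n$ and therefore

$$\Sigma_Y = A A^T$$

which gives the non-central joint normal density equation (4.2.23) for the vector Y

$$f_Y(y) = \frac{1}{(2\pi)^{n/2} \sqrt{\det \Sigma_Y}} \exp\left[-\frac{1}{2} (y - b)^T \Sigma_Y^{-1} (y - b)\right]$$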
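The result (4.2.31) can be checked by simulation. The following is a minimal sketch, assuming NumPy; the matrix A, the vector b, the seed and the number of draws are hypothetical values chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed, for repeatability only
n, n_draws = 3, 200_000

# Hypothetical non-singular matrix A and offset b, chosen only for illustration.
A = np.array([[1.0, 0.5, 0.0],
              [0.0, 2.0, -1.0],
              [0.3, 0.0, 1.5]])
b = np.array([1.0, -2.0, 0.5])

X = rng.standard_normal((n, n_draws))    # independent standardized normal variables
Y = A @ X + b[:, None]                   # Y = AX + b, eq. (4.2.29)

cov_theory = A @ A.T                     # eq. (4.2.31) with Sigma_X = I_n
cov_sample = np.cov(Y)                   # rows are variables, columns are draws
mean_sample = Y.mean(axis=1)

print(cov_theory)
print(cov_sample)
```

With this many draws the sample covariance matrix of Y should agree with A Aᵀ to about two decimal places, and the sample mean with b.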
A random n-vector X has a mean vector $\mu$ and an n x n covariance matrix $\Sigma_X$. Find the covariance matrix $\Sigma_\xi$ of the standard-deviation normalized vector

$$\xi = \sigma^{-1}(X - \mu)$$ (4.2.32)

From equation (4.2.31), we can write

$$\Sigma_\xi = \sigma^{-1} \Sigma_X \sigma^{-1}$$

but since, from equation (4.2.11),

$$\Sigma_X = \sigma \rho \sigma$$

we obtain the useful result

$$\Sigma_\xi = \rho$$ (4.2.33)

This result does not depend on the vector $\mu$ and can be extended to any origin of the vector $\xi$. The correlation matrix is therefore the covariance matrix of any standard-deviation normalized vector.

X is a normal random variable with mean $\mu$ and variance $\sigma^2$. Given the set of samples of m observations with mean $\bar{X}$ and variance $S^2$, which will be treated as independent random variables, show that the ratio

$$T = \frac{\bar{X} - \mu}{S/\sqrt{m}}$$ (4.2.34)
is distributed as the Student t-distribution with m - 1 degrees of freedom. We know from Section 4.1 that $\bar{X}$ is distributed normally with mean $\mu$ and variance $\sigma^2/m$ and therefore $(\bar{X} - \mu)/(\sigma/\sqrt{m})$ is a normal deviate with zero mean and unit variance. In addition, $(m-1)S^2/\sigma^2$ is distributed as a chi-squared variable with m - 1 degrees of freedom. In order to find independent variables from simple distributions, let us transform the value t of T as

$$t = \frac{\bar{x} - \mu}{s/\sqrt{m}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{m}} \, \frac{\sigma}{s} \quad \text{or} \quad t = \frac{(\bar{x} - \mu)/(\sigma/\sqrt{m})}{\sqrt{\dfrac{(m-1)s^2/\sigma^2}{m-1}}}$$

Defining

$$U = \frac{\bar{X} - \mu}{\sigma/\sqrt{m}}, \quad W = \frac{(m-1)S^2}{\sigma^2}, \quad \text{and} \quad v = m - 1$$ (4.2.35)

we infer that U is a normal deviate with zero mean and unit variance, W is a chi-squared variable with v degrees of freedom and T can be recast as

$$T = \frac{U}{\sqrt{W/v}}$$
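Since $\bar{X}$ and $S^2$ are independent, so are U and W. U is distributed with the density function $f_U(u)$ such that

$$f_U(u) = \frac{1}{\sqrt{2\pi}} e^{-u^2/2}$$

and W as

$$f_W(w) = \frac{1}{\Gamma(v/2)\, 2^{v/2}} \, w^{v/2 - 1} e^{-w/2}$$

Let us make the change of variables

$$X = \frac{U}{\sqrt{W/v}}, \quad \text{and} \quad Y = W$$

The Jacobian J of the transformation is

$$J = \det \begin{bmatrix} \dfrac{\partial X}{\partial U} & \dfrac{\partial X}{\partial W} \\ \dfrac{\partial Y}{\partial U} & \dfrac{\partial Y}{\partial W} \end{bmatrix} = \det \begin{bmatrix} \dfrac{1}{\sqrt{W/v}} & -\dfrac{U}{2W\sqrt{W/v}} \\ 0 & 1 \end{bmatrix} = \frac{1}{\sqrt{W/v}} = \frac{1}{\sqrt{Y/v}}$$

Since independence of U and W is assumed, the joint distribution function $f_{XY}(x, y)$ is

$$f_{XY}(x, y) = \frac{f_U(u) f_W(w)}{|J|}$$

or, expressing U and W as functions of x and y

$$f_{XY}(x, y) = \sqrt{\frac{y}{v}} \, \frac{1}{\sqrt{2\pi}} e^{-x^2 y/2v} \, \frac{1}{\Gamma(v/2)\, 2^{v/2}} \, y^{v/2 - 1} e^{-y/2}$$

This expression can be rearranged as

$$f_{XY}(x, y) = \frac{1}{\sqrt{2\pi v}\, \Gamma(v/2)\, 2^{v/2}} \, y^{(v-1)/2} \exp\left[-\frac{y}{2}\left(1 + \frac{x^2}{v}\right)\right]$$

The probability density function of the random variable x is obtained by integrating $f_{XY}(x, y)$ over the whole range of y variations (marginal density)

$$f_X(x) = \int_0^\infty f_{XY}(x, y)\, dy$$

which, after isolation of the constant terms, becomes

$$f_X(x) = \frac{1}{\sqrt{2\pi v}\, \Gamma(v/2)\, 2^{v/2}} \int_0^\infty y^{(v-1)/2} \exp\left[-\frac{y}{2}\left(1 + \frac{x^2}{v}\right)\right] dy$$

We recognize in the integral a form that is close to the Eulerian gamma function (4.1.22), which becomes more visible by preparing for variable change as

$$f_X(x) = \frac{(1 + x^2/v)^{-(v-1)/2}}{2\sqrt{\pi v}\, \Gamma(v/2)} \int_0^\infty \left[\frac{y}{2}\left(1 + \frac{x^2}{v}\right)\right]^{(v-1)/2} \exp\left[-\frac{y}{2}\left(1 + \frac{x^2}{v}\right)\right] dy$$

Introducing the new variable z as

$$z = \frac{y}{2}\left(1 + \frac{x^2}{v}\right)$$

we obtain the expression

$$f_X(x) = \frac{1}{\sqrt{\pi v}\, \Gamma(v/2)} \left(1 + \frac{x^2}{v}\right)^{-(v+1)/2} \int_0^\infty z^{(v-1)/2} e^{-z}\, dz$$

where the integral equals $\Gamma[(v+1)/2]$. The final probability density $f_X(x)$ is therefore

$$f_X(x) = \frac{\Gamma[(v+1)/2]}{\sqrt{\pi v}\, \Gamma(v/2)} \left(1 + \frac{x^2}{v}\right)^{-(v+1)/2}$$

which we identify as the Student t-distribution described by equation (4.1.36) with v = m - 1 degrees of freedom.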
The mean $\bar{x}$ of these measurements is 0.710259, while their standard deviation s is 0.0000104 (we take one more digit to keep a reasonable precision on ratios). Let us form the variable t which is meant to represent a specific value taken by the Student-t variable and such that

$$t = \frac{\bar{x} - 0.710250}{s/\sqrt{6-1}} = \frac{0.710259 - 0.710250}{0.0000104/\sqrt{5}} = 1.94$$

The Student-t percentile $t_{5, 97.5}$ is 2.57, so 95 percent of the surface enclosed under the Student distribution curve lies inside the interval [-2.57, +2.57]. Since t lies within that interval, we will assume that the mass spectrometer is unbiased for Sr isotope measurements.
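The arithmetic of this test is short enough to sketch in a few lines. The numbers below are those quoted in the text; taking m = 6 measurements is our reading of the factor 6 - 1 in the expression for t, and the percentile is hard-coded from the text rather than computed.

```python
import math

xbar = 0.710259       # sample mean quoted in the text
s = 0.0000104         # sample standard deviation quoted in the text
m = 6                 # number of measurements (assumed from the factor 6 - 1)
reference = 0.710250  # reference value the mean is compared with
t_crit = 2.57         # Student-t percentile t_{5,97.5}, as quoted in the text

# The text divides s by sqrt(m - 1); this sketch follows the text.
t = (xbar - reference) / (s / math.sqrt(m - 1))
consistent = abs(t) <= t_crit   # True when t falls inside [-2.57, +2.57]
print(round(t, 2), consistent)
```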
4.2.4 Confidence region of a sample from a normal population

The confidence intervals defined for a single random variable become confidence regions for jointly distributed random variables. In the case of a multivariate normal distribution, the equation of the surface limiting the confidence region of the mean vector will now be shown to be an n-dimensional ellipsoid. Let us assume that X is a vector of n normally distributed variables with mean n-column vector $\mu$ and covariance matrix $\Sigma_X$. A sample of m observations has a mean vector $\bar{x}$ and an n x n covariance matrix S. We know that the statistic $T^2 = m(\bar{x} - \mu)^T S^{-1} (\bar{x} - \mu)$ is distributed as the Hotelling's $T^2$ distribution and that

$$\frac{m-n}{n(m-1)} T^2 = \frac{m-n}{n(m-1)}\, m (\bar{x} - \mu)^T S^{-1} (\bar{x} - \mu)$$

is distributed as the distribution F with n and m - n degrees of freedom. A 100(1 - α) percent confidence interval for $T^2$ is defined as

$$P\left\{\frac{m-n}{n(m-1)} T^2 \le F_{100(1-\alpha), n, m-n}\right\} = 1 - \alpha$$ (4.2.36)

Because we are dealing with positive numbers, a one-sided confidence condition defines the confidence region. Equation (4.2.36) can be rewritten as

$$P\left\{(\bar{x} - \mu)^T S^{-1} (\bar{x} - \mu) \le \frac{n(m-1)}{m(m-n)} F_{100(1-\alpha), n, m-n}\right\} = 1 - \alpha$$ (4.2.37)

The coordinates $x_{100(1-\alpha)}(\mu)$ of the confidence region boundary are therefore given by the equation

$$[x_{100(1-\alpha)}(\mu) - \bar{x}]^T S^{-1} [x_{100(1-\alpha)}(\mu) - \bar{x}] = \frac{n(m-1)}{m(m-n)} F_{100(1-\alpha), n, m-n}$$ (4.2.38)

or

$$[x_{100(1-\alpha)}(\mu) - \bar{x}]^T \left[F_{100(1-\alpha), n, m-n}\, \frac{n(m-1)}{m-n}\, \frac{S}{m}\right]^{-1} [x_{100(1-\alpha)}(\mu) - \bar{x}] = 1$$ (4.2.39)

that is, an ellipsoid centered at $\bar{x}$. If $\lambda_i$ is the ith eigenvalue of S, the length $d_i$ of the semi-axis along the ith eigenvector will be

$$d_i = \sqrt{\lambda_i\, \frac{n(m-1)}{m(m-n)}\, F_{100(1-\alpha), n, m-n}}$$ (4.2.40)

For $m \gg n$, i.e., when the sample size greatly exceeds the number of variables, we could write, using equation (4.1.38), a slightly simpler form of the confidence region as

$$P\{m(\bar{x} - \mu)^T S^{-1} (\bar{x} - \mu) \le \chi^2_{100(1-\alpha), n}\} = 1 - \alpha$$ (4.2.41)

For the four samples from a Polynesian island considered above, draw the 95 percent confidence region for the mean $\mu$ of lead isotope ratios and compare the results with the individual 95 percent confidence interval for the mean of each ratio. We found $\bar{x} = [19.132, 15.606]^T$ and the covariance matrix S such that

$$S = \begin{bmatrix} 0.011509 & 0.002314 \\ 0.002314 & 0.000621 \end{bmatrix}$$

The eigenvalues of S are calculated by MatLab to form the diagonal matrix $\Lambda$ as

$$\Lambda = \begin{bmatrix} 1.198 \times 10^{-2} & 0 \\ 0 & 1.497 \times 10^{-4} \end{bmatrix}$$

and its eigenvector matrix as

$$U = \begin{bmatrix} 0.9799 & 0.1996 \\ 0.1996 & -0.9799 \end{bmatrix}$$

It will be easily checked that U is an orthogonal matrix. Moreover, from standard statistical tables

$$\sqrt{\frac{n(m-1)}{m(m-n)} F_{95, 2, 2}} = \sqrt{\frac{2(4-1)}{4(4-2)} \times 19.0} = 3.77$$

The length of the semi-axis is $\sqrt{0.0120} \times 3.77 = 0.412$ along the first eigenvector and $\sqrt{0.000150} \times 3.77 = 0.0462$ along the second eigenvector. We can now draw the 95 percent confidence ellipse using the method outlined in Section 2.4. First, let us write the diagonal form of the matrix S as

$$S = U \Lambda U^T$$

and therefore

$$S^{-1} = U \Lambda^{-1} U^T = (\Lambda^{-1/2} U^T)^T \Lambda^{-1/2} U^T$$ (4.2.42)

Inserting this form of $S^{-1}$ into the equation (4.2.39) of the confidence ellipse gives

$$\frac{[x_{100(1-\alpha)}(\mu) - \bar{x}]^T (\Lambda^{-1/2} U^T)^T \Lambda^{-1/2} U^T [x_{100(1-\alpha)}(\mu) - \bar{x}]}{\dfrac{n(m-1)}{m(m-n)} F_{100(1-\alpha), n, m-n}} = 1$$ (4.2.43)

Introducing the vector z defined as

$$z = \frac{\Lambda^{-1/2}}{\sqrt{\dfrac{n(m-1)}{m(m-n)} F_{100(1-\alpha), n, m-n}}}\, U^T (\mu - \bar{x})$$ (4.2.44)

equation (4.2.43) becomes

$$z^T z = 1$$

i.e., the equation of the unit circle. We therefore calculate the coordinates of an arbitrary number of points $z_i$ (i = 1, 2, ...) on the unit circle. This is most easily done by incrementing an arbitrary angle $\varphi_i$ from 0 to $2\pi$ and taking $z_i = [\cos \varphi_i, \sin \varphi_i]^T$. We next compute the coordinates for the 95 percent confidence ellipse of the mean using the reverse transformation

$$\mu = \bar{x} + \sqrt{\frac{n(m-1)}{m(m-n)} F_{100(1-\alpha), n, m-n}}\; U \Lambda^{1/2} z$$

We can now compute the coordinates $x_{95}^{(i)}(\mu)$ and $y_{95}^{(i)}(\mu)$ of the ith point on the 95 percent confidence ellipse as

$$\begin{bmatrix} x_{95}^{(i)}(\mu) \\ y_{95}^{(i)}(\mu) \end{bmatrix} = \begin{bmatrix} 19.132 \\ 15.606 \end{bmatrix} + \begin{bmatrix} -0.4043 & 0.0092 \\ -0.0824 & -0.0452 \end{bmatrix} \begin{bmatrix} \cos \varphi_i \\ \sin \varphi_i \end{bmatrix}$$
Table 4.4 gives the coordinates for 12 points of the 95 percent confidence region of the mean $\mu$. The complete ellipse is drawn in Figure 4.9. The 95 percent confidence intervals of the mean of each coordinate 206Pb/204Pb and 207Pb/204Pb are calculated for m = 4, i.e., for 3 degrees of freedom. From standard statistical tables, we obtain $t_{95, 3} = 3.18$ (two-sided). The 95 percent confidence intervals for the mean are therefore

$$19.132 \pm 3.18 \left(\frac{0.107}{\sqrt{4}}\right)$$ for 206Pb/204Pb

and

$$15.606 \pm 3.18 \left(\frac{0.025}{\sqrt{4}}\right)$$ for 207Pb/204Pb

These intervals are drawn in Figure 4.9. In the case of correlated variables, the confidence region calculated from the joint distribution is significantly larger than the confidence region calculated from individual distributions.

Table 4.4. Contours of the 95 percent confidence ellipse of the mean for the Pb isotope composition of Polynesia lavas given in Table 4.3.

φ_i (deg)   cos φ_i    sin φ_i    x95(μ)     y95(μ)
0           1.0000     0.0000     19.1412    15.5611
30          0.8660     0.5000     18.9378    15.5259
60          0.5000     0.8660     18.7864    15.5123
90          0.0000     1.0000     18.7277    15.5239
120         -0.5000    0.8660     18.7772    15.5575
150         -0.8660    0.5000     18.9219    15.6042
180         -1.0000    0.0000     19.1228    15.6514
210         -0.8660    -0.5000    19.3262    15.6866
240         -0.5000    -0.8660    19.4776    15.7002
270         -0.0000    -1.0000    19.5363    15.6886
300         0.5000     -0.8660    19.4868    15.6550
330         0.8660     -0.5000    19.3421    15.6083
360         1.0000     -0.0000    19.1412    15.5611

Figure 4.9 The 95 percent confidence ellipse of the mean for the Polynesian data listed in Table 4.3 (horizontal axis: 206Pb/204Pb; vertical axis: 207Pb/204Pb). The horizontal and vertical bars show the 95 percent confidence intervals of the mean calculated independently for each coordinate.

Table 4.5. Ce and Yb concentrations (ppm) and Ce/Yb ratios in a geochemical survey of 20 igneous samples.

Ce        Yb      Ce/Yb
22.07     5.22    4.23
11.83     4.01    2.95
12.31     5.02    2.45
10.75     5.37    2.00
8.98      0.72    12.48
29.98     1.54    19.42
16.83     0.97    17.29
117.78    1.21    97.00
19.73     1.25    15.84
75.45     1.26    60.06
94.34     2.20    42.85
63.78     1.11    57.55
16.47     0.67    24.59
3.83      0.91    4.23
8.78      0.71    12.29
42.06     0.85    49.65
5.71      2.57    2.22
103.49    1.03    100.72
74.33     0.93    80.00
29.82     0.94    31.59

In a random geochemical survey, Ce and Yb concentrations have been measured in twenty igneous samples. The results in ppm are reported in Table 4.5. Find the 95 percent bivariate confidence regions for the mean of (i) the Ce-Yb pair, (ii) the Ce/Yb-Ce pair.

(i) For the Ce and Yb pair, the sample mean is $\bar{x} = [38.42, 1.92]^T$ and the standard deviation vector $s = [36.17, 1.62]^T$, hinting at a strong non-normality. The correlation between the data is weak with a correlation coefficient of -0.28. The sample covariance matrix S is

$$S = \begin{bmatrix} 1308 & -16.42 \\ -16.42 & 2.61 \end{bmatrix}$$

with eigencomponents

$$\Lambda = \begin{bmatrix} 1308 & 0 \\ 0 & 2.408 \end{bmatrix} \quad \text{and} \quad U = \begin{bmatrix} -0.9999 & 0.0126 \\ 0.0126 & 0.9999 \end{bmatrix}$$

In agreement with the small correlation coefficient, the eigenvectors are nearly perfectly parallel to the coordinate axes. Using tables, we find

$$\frac{n(m-1)}{m(m-n)} F_{100(1-\alpha), n, m-n} = \frac{2(20-1)}{20(20-2)} F_{95, 2, 18} = 0.106 \times 2.62 = 0.277 = (0.526)^2$$

Defining z as a vector $(\cos \varphi_i, \sin \varphi_i)$ of coordinates for a point on the unit circle as above, the transformation formula for the contour of the 95 percent confidence region of $\mu$ is obtained as

$$\mu = \bar{x} + \sqrt{\frac{n(m-1)}{m(m-n)} F_{100(1-\alpha), n, m-n}}\; U \Lambda^{1/2} z$$

or, inserting the values

$$\begin{bmatrix} x_{95}^{(i)}(\mu) \\ y_{95}^{(i)}(\mu) \end{bmatrix} = \begin{bmatrix} 38.42 \\ 1.92 \end{bmatrix} + \begin{bmatrix} -19.0239 & -0.0103 \\ 0.2392 & -0.8162 \end{bmatrix} \begin{bmatrix} \cos \varphi_i \\ \sin \varphi_i \end{bmatrix}$$

(ii) For the Ce and Ce/Yb pair, the sample mean is $\bar{x} = [38.42, 31.97]^T$ and the standard deviation vector $s = [36.17, 32.14]^T$, hinting again at a strong non-normality. In contrast, the correlation between the data is strong, as could be expected from Ce appearing in both variables, with a correlation coefficient of 0.93. The sample covariance matrix S is

$$S = \begin{bmatrix} 1308 & 1080 \\ 1080 & 1033 \end{bmatrix}$$

and has the eigencomponents

$$\Lambda = \begin{bmatrix} 2260 & 0 \\ 0 & 81.5 \end{bmatrix} \quad \text{and} \quad U = \begin{bmatrix} 0.7504 & 0.6610 \\ 0.6610 & -0.7504 \end{bmatrix}$$

The eigenvector coordinates have a rather similar modulus in agreement with the strong correlation coefficient (0.93). $F_{95, n, m-n}$ is as in (i) and finally

$$\begin{bmatrix} x_{95}^{(i)}(\mu) \\ y_{95}^{(i)}(\mu) \end{bmatrix} = \begin{bmatrix} 38.42 \\ 31.97 \end{bmatrix} + \begin{bmatrix} -18.763 & 3.139 \\ -16.526 & -3.564 \end{bmatrix} \begin{bmatrix} \cos \varphi_i \\ \sin \varphi_i \end{bmatrix}$$

which is the equation of a rather 'steep' and elongated ellipse drawn in Figure 4.10. If X and Y are independent random variables, f(X, Y) and g(X, Y), where f and g are some functions of X and Y, may in general be suspected to be correlated to an extent that should always be carefully assessed.
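The construction of a confidence ellipse used in the two examples above can be sketched as follows, here for the Polynesian lead data of Table 4.3. This is a minimal illustration assuming NumPy; the F percentile is hard-coded from standard tables rather than computed, and the variable names are ours.

```python
import numpy as np

xbar = np.array([19.132, 15.606])                 # sample mean (Table 4.3 data)
S = np.array([[0.011509, 0.002314],
              [0.002314, 0.000621]])              # sample covariance matrix
n, m = 2, 4                                       # variables, observations
F_crit = 19.0                                     # F percentile (95 %, 2 and 2 d.o.f.), from tables

scale = n * (m - 1) / (m * (m - n)) * F_crit      # right-hand side of eq. (4.2.38)
lam, U = np.linalg.eigh(S)                        # eigenvalues in ascending order
semi_axes = np.sqrt(lam * scale)                  # eq. (4.2.40)

phi = np.deg2rad(np.arange(0, 361, 30))
z = np.vstack([np.cos(phi), np.sin(phi)])         # points on the unit circle
T = np.sqrt(scale) * U @ np.diag(np.sqrt(lam))    # maps the unit circle onto the ellipse
ellipse = xbar[:, None] + T @ z                   # 2 x 13 array of boundary points

# Each boundary point should satisfy eq. (4.2.38).
d = ellipse - xbar[:, None]
check = np.einsum('ij,ij->j', d, np.linalg.solve(S, d))
print(semi_axes)
```

Because `eigh` returns the eigenvalues in ascending order and eigenvector signs are arbitrary, the points may be traversed in a different order or direction than in Table 4.4, but they lie on the same ellipse.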
4.3 Error propagation and error calculation

4.3.1 General concepts

We have already met the concept of error propagation a few times when dealing with the change of variable formulas for probability distribution, but let us try to illustrate it with a simple example. We want to measure the diffusion coefficient $\mathcal{D}$ of uranium in a glass by maintaining at a specific temperature and for a specific time t the surface of one long glass rod in contact with a concentrated solution of uranium. We admit without further justification (see Section 8.5) that the depth x of uranium
zero and their variance is given by
m— 1
(4.4.6)

Now comes the very principle of the principal component analysis. A total variance is now defined as the trace of the matrix $S_x$ or, using a property of the trace of a matrix product given in Section 2.2

$$\mathrm{tr}\, S_x = \mathrm{tr}(U \Lambda U^T) = \mathrm{tr}(\Lambda U^T U) = \mathrm{tr}\, \Lambda = \sum_j \lambda_j$$ (4.4.7)

and the proportion of that variance explained by the component k is the ratio $p_k$ given by

$$p_k = \frac{\lambda_k}{\sum_j \lambda_j}$$ (4.4.8)

Adding variances on different variables at the denominator, e.g. pH and temperature in solutions, does not make much sense and is certainly not invariant upon rescaling. Proportions of explained total variance do not survive a simple change of units! For this reason, PCA is commonly carried out instead on normalized variables $\xi$ such as

$$\xi_i = s^{-1}(x_i - \bar{x})$$ (4.4.9)

where s is the diagonal matrix of sample standard deviations. The n x m matrix $\Xi$ collects the i = 1, ..., m normalized measurements $\xi_i$. As we have seen in equation (4.2.33), the covariance matrix of standardized variables is the correlation matrix of the non-standardized variables. Therefore, the $\xi_i$ have the correlation matrix R for covariance matrix. The diagonal form of R is

$$R = V \Delta V^T$$ (4.4.10)

where $\Delta$ is the diagonal matrix of eigenvalues $\delta_1, \ldots, \delta_n$ and V the matrix of the orthogonal eigenvectors $v_1, \ldots, v_n$ of R. The component $f_i^j$ of the ith vector $x_i$ along the jth eigenvector $v_j$ of R is given by

$$f_i^j = v_j^T \xi_i$$ (4.4.11)

and the components can be collected in an n x m matrix F obtained as before through

$$F = V^T \Xi$$ (4.4.12)

or, since V is an orthogonal matrix

$$\Xi = V F$$ (4.4.13)
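The steps from normalization to components can be sketched as follows. This is a minimal illustration assuming NumPy; the data are synthetic (generated with a fixed seed) and serve only to exercise equations (4.4.9) to (4.4.13).

```python
import numpy as np

rng = np.random.default_rng(1)                  # synthetic data, illustration only
n, m = 3, 50                                    # variables (rows), observations (columns)
X = rng.normal(size=(n, m)) * np.array([[1.0], [10.0], [0.1]])
X[1] += 5.0 * X[0]                              # make two variables correlated

xbar = X.mean(axis=1, keepdims=True)
s = X.std(axis=1, ddof=1, keepdims=True)        # sample standard deviations
Xi = (X - xbar) / s                             # normalized variables, eq. (4.4.9)

R = Xi @ Xi.T / (m - 1)                         # correlation matrix of X
delta, V = np.linalg.eigh(R)                    # R = V diag(delta) V^T, eq. (4.4.10)
delta, V = delta[::-1], V[:, ::-1]              # sort by decreasing eigenvalue

F = V.T @ Xi                                    # components, eq. (4.4.12)
p = delta / delta.sum()                         # proportion of variance per component
print(p)
```

Since the trace of a correlation matrix equals n, the proportions here are simply the eigenvalues divided by the number of variables, and they no longer depend on the units of the original variables.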
e.g., chemical heterogeneities in the glass or alternate transport processes, pointing to a failure of the simple model. In order to describe these fluctuations, we must therefore calculate the sample statistics of x and $\mathcal{D}$ from the set of measurements. In case (ii), we do not really know which part of the measurement dispersion can be ascribed to fluctuations in the diffusion process and which part comes from the poor measurement reliability. If we could describe the distribution of the random variable x, we could proceed as in the previous section by changing the variable and assess the distribution of $\mathcal{D}$. This is however an unlikely situation and only sample statistics, namely the sample mean and variance, are available from the experiment. A confidence interval (spread) on the measurement is identified as the experimental uncertainty on x. This uncertainty is then propagated by the techniques described below on the estimate of $\mathcal{D}$, then compared with the observed spread of the measurements. If the propagated experimental uncertainty on $\mathcal{D}$ is significantly larger than the measured confidence interval, we can decide that a unique $\mathcal{D}$ value is consistent with the observations. If the spread on $\mathcal{D}$ is significantly larger than the propagated experimental uncertainty, we can suspect either fluctuations in other parameters or, more seriously, failure of the model. Situations such as (ii) are the most frequent and actually turn out to be quite satisfactory since we can assess the limitations of the measurements relative to fluctuations (or failure) of the model.

We now know enough probability and statistics to make an assessment of the calculated errors. We will give some examples of error propagation, both linear and non-linear, using explicit or Monte-Carlo techniques. Examples of decision making (how significant is significant?) will be given in Chapter 5.
4.3.2 Linear error propagation
We have considered in some detail in Section 4.2 the case where the random vector Y of n ancillary or dependent variables relates linearly to those of a vector X of n principal or independent variables (e.g., raw data) with covariance matrix $\Sigma_X$ through the matrix equality

$$Y = AX + b$$

where $X, Y, b \in \Re^n$ and $A, \Sigma_X \in \Re^{n \times n}$. From equation (4.2.31), the covariance matrix $\Sigma_Y$ of Y is equal to

$$\Sigma_Y = A \Sigma_X A^T$$

There is no restriction in the derivation of this relationship that would prevent its extension to cases where $X \in \Re^n$, $\Sigma_X \in \Re^{n \times n}$, Y and $b \in \Re^m$, and $A \in \Re^{m \times n}$ with $m \neq n$. Then, $\Sigma_Y \in \Re^{m \times m}$ and equation (4.2.31) is still valid. Error propagation is achieved by replacing the population parameters by the value estimated by sampling, e.g., $\bar{x}$ for the sample mean

$$\bar{x}_y = A \bar{x}_x + b$$ (4.3.3)

and S for the sample covariance matrix

$$S_y = A S_x A^T$$ (4.3.4)

Table 4.6. Mineralogical matrix (molar fractions) of a metamorphic carbonate.

         ca     do     di
CaO      1      1/2    1/4
MgO      0      1/2    1/4
SiO2     0      0      1/2
where the estimates are subscripted with reference to the corresponding variables. The covariance matrix of the mean vector $\bar{x}_y$ would be derived through a similar expression.

A sample of metamorphic carbonate contains calcite CaCO3, dolomite Ca0.5Mg0.5CO3, and diopside CaMgSi2O6. A chemical analysis on the calcinated (CO2-free) rock indicates the following molar proportions: 0.525 (0.03) CaO, 0.225 (0.01) MgO, and 0.25 (0.02) SiO2 with standard deviations given in parentheses. Find the molar proportions of each mineral in the rock and their standard deviation. Let us denote $x = [x_{CaO}, x_{MgO}, x_{SiO_2}]^T$ the vector of rock concentrations, and $y = [y_{ca}, y_{do}, y_{di}]^T$ the vector of molar proportions of each mineral. The mineralogical matrix of this rock is given in Table 4.6. CaO mass balance between minerals and rock reads

$$y_{ca} + \frac{1}{2} y_{do} + \frac{1}{4} y_{di} = 0.525$$

Likewise, for MgO

$$\frac{1}{2} y_{do} + \frac{1}{4} y_{di} = 0.225$$

For SiO2, the equation is

$$\frac{1}{2} y_{di} = 0.250$$

Lumping the three equations in a matrix form, we get

$$\begin{bmatrix} 1 & 1/2 & 1/4 \\ 0 & 1/2 & 1/4 \\ 0 & 0 & 1/2 \end{bmatrix} \begin{bmatrix} y_{ca} \\ y_{do} \\ y_{di} \end{bmatrix} = \begin{bmatrix} x_{CaO} \\ x_{MgO} \\ x_{SiO_2} \end{bmatrix} = \begin{bmatrix} 0.525 \\ 0.225 \\ 0.250 \end{bmatrix}$$

a matrix equation which is inverted as

$$\begin{bmatrix} y_{ca} \\ y_{do} \\ y_{di} \end{bmatrix} = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 2 & -1 \\ 0 & 0 & 2 \end{bmatrix} \begin{bmatrix} 0.525 \\ 0.225 \\ 0.250 \end{bmatrix} = \begin{bmatrix} 0.3 \\ 0.2 \\ 0.5 \end{bmatrix}$$

Defining the matrix A as

$$A = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 2 & -1 \\ 0 & 0 & 2 \end{bmatrix}$$

this equality can be rewritten

$$y = Ax$$

In the absence of further information on correlations, we form the covariance matrix $S_x$ of the data vector x as

$$S_x = \begin{bmatrix} 0.03^2 & 0 & 0 \\ 0 & 0.01^2 & 0 \\ 0 & 0 & 0.02^2 \end{bmatrix}$$
Applying equation (4.3.4) for the linear propagation of errors, with A the inverse of the mineralogical matrix, gives the covariance matrix $S_y$ of the vector y

$$S_y = A S_x A^T = \begin{bmatrix} 10 & -2 & 0 \\ -2 & 8 & -8 \\ 0 & -8 & 16 \end{bmatrix} \times 10^{-4}$$

The square root of the diagonal elements gives the standard deviation and the final results are (standard deviations in parentheses)

$$y = \begin{bmatrix} 0.300\ (0.032) \\ 0.200\ (0.028) \\ 0.500\ (0.040) \end{bmatrix}$$
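The linear propagation of equation (4.3.4) for this example can be sketched as follows. This is a minimal illustration assuming NumPy, with the matrix A obtained by inverting the mineralogical matrix of Table 4.6 and uncorrelated errors on the oxide proportions; the variable names are ours.

```python
import numpy as np

# Mineralogical matrix of Table 4.6: rows CaO, MgO, SiO2;
# columns calcite, dolomite, diopside.
M = np.array([[1.0, 0.5, 0.25],
              [0.0, 0.5, 0.25],
              [0.0, 0.0, 0.5]])
x = np.array([0.525, 0.225, 0.25])        # measured oxide proportions
s_x = np.array([0.03, 0.01, 0.02])        # their standard deviations

A = np.linalg.inv(M)                      # y = A x
y = A @ x                                 # mineral proportions
S_x = np.diag(s_x**2)                     # uncorrelated errors assumed, as in the text
S_y = A @ S_x @ A.T                       # eq. (4.3.4)
s_y = np.sqrt(np.diag(S_y))               # propagated standard deviations

print(y)
print(s_y)
```

The off-diagonal terms of `S_y` show that the propagated errors on the mineral proportions are correlated even though the errors on the oxides were taken as independent.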
This procedure for propagating errors is not entirely satisfactory since it neglects a source of strong correlation: the phase proportions must sum up to 100 percent even when they are allowed to fluctuate within errors. This point is dealt with in Chapter 5.

Rare-earth elements in minerals can be measured in situ by ion probe. It is observed that Gd oxide peaks overlap with Yb masses (isobaric interference). The similarity of isotopic abundances makes correction by peak stripping rather imprecise. In an experiment, the following number of counts per second (cps) $I_m$ has been found for each of the following peaks: 87.0 cps at mass 171, 128.0 at 172, 95.6 at 173, and 174.1 at 174. Given the isotopic proportions $a_m^{Yb}$ and $a_m^{GdO}$ in the species Yb and Gd16O listed in Table 4.7, allot a number N of cps to each species. Standard deviation on each peak is assumed to be equal to the square root of the number of cps (Poisson statistics). The measurement time is 1 second. Repeated measurements show that intensities of each peak fluctuate with a correlation coefficient of 0.9.

Table 4.7. Atomic abundance of selected isotopes of Yb and GdO at mass m.

m      a_m(Yb)    a_m(GdO)
171    0.143      0.148
172    0.219      0.205
173    0.161      0.157
174    0.318      0.248

A peak is the sum of the contributing species weighted by the isotope abundance of each species on this isotope (interference equation)

$$I_m = a_m^{Yb} N_{Yb} + a_m^{GdO} N_{GdO}$$
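The covariance matrix of I can be computed from standard deviations and the unique correlation coefficient through equation (4.2.18)

$$S = \begin{bmatrix} 9.33 & 0 & 0 & 0 \\ 0 & 11.31 & 0 & 0 \\ 0 & 0 & 9.78 & 0 \\ 0 & 0 & 0 & 13.19 \end{bmatrix} \begin{bmatrix} 1 & 0.9 & 0.9 & 0.9 \\ 0.9 & 1 & 0.9 & 0.9 \\ 0.9 & 0.9 & 1 & 0.9 \\ 0.9 & 0.9 & 0.9 & 1 \end{bmatrix} \begin{bmatrix} 9.33 & 0 & 0 & 0 \\ 0 & 11.31 & 0 & 0 \\ 0 & 0 & 9.78 & 0 \\ 0 & 0 & 0 & 13.19 \end{bmatrix}$$

or

$$S = \begin{bmatrix} 87.00 & 105.53 & 91.20 & 123.04 \\ 105.53 & 128.00 & 110.62 & 149.24 \\ 91.20 & 110.62 & 95.60 & 128.97 \\ 123.04 & 149.24 & 128.97 & 174.00 \end{bmatrix}$$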
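Let us write the system of interference equations for the masses 171 and 172

$$\begin{bmatrix} I_{171} \\ I_{172} \end{bmatrix} = \begin{bmatrix} 0.143 & 0.148 \\ 0.219 & 0.205 \end{bmatrix} \begin{bmatrix} N_{Yb} \\ N_{GdO} \end{bmatrix}$$

The solution is

$$\begin{bmatrix} N_{Yb} \\ N_{GdO} \end{bmatrix} = \begin{bmatrix} -66.19 & 47.79 \\ 70.71 & -46.17 \end{bmatrix} \begin{bmatrix} 87.0 \\ 128.0 \end{bmatrix} = \begin{bmatrix} 358 \\ 242 \end{bmatrix}$$

with the covariance matrix given by linear propagation [equation (4.3.4)]

$$S_N = \begin{bmatrix} -66.19 & 47.79 \\ 70.71 & -46.17 \end{bmatrix} \begin{bmatrix} 87.00 & 105.53 \\ 105.53 & 128.00 \end{bmatrix} \begin{bmatrix} -66.19 & 47.79 \\ 70.71 & -46.17 \end{bmatrix}^T$$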
or 772 -10528
-105281 1372 J
As expected from nearly identical isotopic abundances of Yb and Gd16O at masses 171 and 172, the standard deviations on $N_{Yb}$ and $N_{GdO}$ are extremely large and errors are strongly correlated ($r \approx -1.0$). A slightly better result is obtained upon replacement of peak 172 by peak 174. The system of interference equations now reads

$$\begin{bmatrix} I_{171} \\ I_{174} \end{bmatrix} = \begin{bmatrix} 0.143 & 0.148 \\ 0.318 & 0.248 \end{bmatrix} \begin{bmatrix} N_{Yb} \\ N_{GdO} \end{bmatrix}$$

or

$$\begin{bmatrix} N_{Yb} \\ N_{GdO} \end{bmatrix} = \begin{bmatrix} 360 \\ 240 \end{bmatrix}$$
with covariance matrix [ 312 L-2896
-28961 932 J
Errors on $N_{Yb}$ and $N_{GdO}$ have been divided by a factor of 2 but remain strongly correlated.
4.3.3 Linearized error propagation for non-linear relationships

The approach developed in this section is of considerable practical importance for the assessment of errors on data obtained through a complex reducing procedure from raw measurements (e.g., optical and mass spectrometry), or on variables inferred through complex modeling. Given a relationship between a random variable X with mean $\mu_X$ and variance $\sigma_X^2$ and a dependent variable Y such as

$$Y = \varphi(X)$$ (4.3.5)

where $\varphi$ is some known function, we wonder how to estimate the corresponding statistics $\mu_Y$ and $\sigma_Y^2$. Error propagation can be achieved through different means. First, if the density of probability function is known for the variable X (usually a measurement), or if at least a reasonable guess of this function can be arrived at, the density of probability function for the variable Y can occasionally be calculated analytically through equation (4.1.48) provided the function
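$$Y = \varphi(\mu_X) + \left.\frac{d\varphi}{dX}\right|_{\mu_X} (X - \mu_X) + \text{higher-order terms}$$

Taking the expectation of each side gives

$$\mathcal{E}(Y) = \varphi(\mu_X) + \left.\frac{d\varphi}{dX}\right|_{\mu_X} [\mathcal{E}(X) - \mu_X] + \text{higher-order terms}$$

or, neglecting the terms of order higher than one

$$\mu_Y \approx \varphi(\mu_X)$$ (4.3.6)

These equations can be subtracted from each other to give

$$Y - \mu_Y \approx \left.\frac{d\varphi}{dX}\right|_{\mu_X} (X - \mu_X)$$

The expected values of the squares relate through

$$\mathcal{E}[(Y - \mu_Y)^2] \approx \left(\left.\frac{d\varphi}{dX}\right|_{\mu_X}\right)^2 \mathcal{E}[(X - \mu_X)^2]$$

The linear approximation for variance 'propagation' is therefore

$$\sigma_Y^2 \approx \left(\left.\frac{d\varphi}{dX}\right|_{\mu_X}\right)^2 \sigma_X^2$$ (4.3.7)

The useful equations (4.3.6) and (4.3.7), which are valid only for smooth monotonous functions, can be translated into relationships between the corresponding sample statistics

$$\bar{y} \approx \varphi(\bar{x})$$ (4.3.8)

and

$$s_y^2 \approx \left(\left.\frac{d\varphi}{dx}\right|_{\bar{x}}\right)^2 s_x^2$$ (4.3.9)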
Applying the equations (4.3.6) to (4.3.9) to highly non-linear functions $\varphi$ is usually inappropriate.

The mean 143Nd/144Nd of a sample has been found to be 0.513114 with a standard deviation of 0.000007. Given a present-day 143Nd/144Nd ratio in chondrites of 0.512638 (an arbitrarily precise estimate), find the standard deviation on the mean $\varepsilon_{Nd}(0)$ value. By definition, $\varepsilon_{Nd}(0)$ is calculated as

$$\varepsilon_{Nd}(0) = \left[\frac{(^{143}Nd/^{144}Nd)_{sample}}{(^{143}Nd/^{144}Nd)_{chondrites}} - 1\right] \times 10^4$$

which, in this case, leads to

$$\varepsilon_{Nd}(0) = \left(\frac{0.513114}{0.512638} - 1\right) \times 10^4 = 9.29$$

The derivative of the dependent variable $\varepsilon_{Nd}(0)$ relative to the independent variable $(^{143}Nd/^{144}Nd)_{sample}$ is

$$\frac{d\varepsilon_{Nd}(0)}{d(^{143}Nd/^{144}Nd)_{sample}} = \frac{10^4}{(^{143}Nd/^{144}Nd)_{chondrites}}$$

The variances relate to each other through

$$s^2[\varepsilon_{Nd}(0)] = \left[\frac{10^4}{(^{143}Nd/^{144}Nd)_{chondrites}}\right]^2 s^2[(^{143}Nd/^{144}Nd)_{sample}]$$

and, inserting the values, we get

$$s[\varepsilon_{Nd}(0)] = \frac{10^4}{0.512638} \times 0.000007 = 0.14$$
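The single-variable propagation of equations (4.3.8) and (4.3.9) can be sketched for this example as follows. This is a minimal illustration in plain Python; the function name `eps_nd` and the finite-difference step are ours and serve only to check the analytical derivative.

```python
ratio_sample = 0.513114        # mean 143Nd/144Nd of the sample
s_ratio = 0.000007             # its standard deviation
ratio_chondrites = 0.512638    # present-day chondritic 143Nd/144Nd, taken as exact

def eps_nd(r):
    """epsilon_Nd(0) for a measured 143Nd/144Nd ratio r."""
    return (r / ratio_chondrites - 1.0) * 1.0e4

eps = eps_nd(ratio_sample)                     # eq. (4.3.8)
derivative = 1.0e4 / ratio_chondrites          # d(eps)/d(ratio), constant here
s_eps = abs(derivative) * s_ratio              # eq. (4.3.9), written for standard deviations

# Finite-difference check of the analytical derivative.
h = 1.0e-7
numerical = (eps_nd(ratio_sample + h) - eps_nd(ratio_sample - h)) / (2 * h)
print(round(eps, 2), round(s_eps, 2))
```

Because the function is linear in the measured ratio, the linearized propagation is exact in this particular case.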
These relationships will be applied with utmost care for the determination of confidence intervals, especially when the probability density function of the dependent variable y is not symmetrical.

If we now consider an n-vector X of n random variables ('data') with mean $\mu_X$ and covariance matrix $\Sigma_X$ related to a vector Y of m ancillary variables through i = 1, ..., m functions $\varphi_i$

$$Y_i = \varphi_i(X_1, X_2, \ldots, X_n)$$ (4.3.10)

and expand each Y component in X about the mean $\mu_X$

$$Y_i = \varphi_i(\mu_X) + \sum_{j=1}^{n} \frac{\partial \varphi_i}{\partial X_j} (X_j - \mu_{X_j}) + \text{higher-order terms}$$

Grouping all similar equations in a matrix equality gives

$$Y(X) = Y(\mu_X) + \begin{bmatrix} \dfrac{\partial \varphi_1}{\partial X_1} & \cdots & \dfrac{\partial \varphi_1}{\partial X_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial \varphi_m}{\partial X_1} & \cdots & \dfrac{\partial \varphi_m}{\partial X_n} \end{bmatrix} (X - \mu_X) + \text{higher-order terms}$$

Denoting A the m x n matrix of partial derivatives $\partial \varphi_i / \partial X_j$, the previous equation becomes

$$Y(X) = Y(\mu_X) + A(X - \mu_X) + \text{higher-order terms}$$

Introducing the expectations, we obtain

$$\mathcal{E}[Y(X)] = Y(\mu_X) + A\,\mathcal{E}(X - \mu_X) + \text{higher-order terms}$$

or

$$\mu_Y \approx Y(\mu_X)$$ (4.3.11)

The approximate propagation formula for the covariance matrix is therefore

$$\Sigma_Y \approx A \Sigma_X A^T$$ (4.3.12)

We can relate the estimates $\bar{x}$ and S through the following propagation formulas for the mean

$$\bar{y} \approx Y(\bar{x})$$ (4.3.13)

and the variance

$$S_y \approx A S_x A^T$$ (4.3.14)

The same note of caution applies to strongly non-linear functions $\varphi_i$ as in the case of a single random variable.

The Nd crustal residence age of a sediment is calculated as the time where the isotopic ratio of the sediment or its igneous protolith had the same 143Nd/144Nd ratio as a model depleted mantle. It is assumed that, once the sedimentary protolith is extracted from the depleted mantle, no further 147Sm/144Nd fractionation takes place. Given a sediment with (143Nd/144Nd)_S = 0.511815 (s = 0.000012) and (147Sm/144Nd)_S = 0.108 (s = 0.001) and a model depleted mantle with present-day values of (143Nd/144Nd)_DM = 0.513114 and (147Sm/144Nd)_DM = 0.222, calculate the Nd crustal residence age of this sediment and its standard deviation. Assume that errors on each ratio are uncorrelated. The decay constant of 147Sm is $\lambda = 0.654 \times 10^{-11}$ a$^{-1}$. The equation of radioactive decay for the Sm-Nd system in the sediment reads
(
'
'
where the superscripts indicate the geological age (0 for present, T for the age) with a similar equation for the depleted mantle. At time T = TDM, the isotopic ratios were equal in both systems, and therefore
which gives TDM
1
"
/UM
= 147
(
V
/
/S
/ ^| ^ -|
Sm/144Nd)DM°-(147Sm/144Nd)s°
TDM is obtained from the equation ^DM = T l n | 1 +
14 4T 4 VM\ / ' 1 47 7CWI/14
Sm/
c ^ / 114 4 - ^ Nd) D M00 -( '114477Sm/ /
d\
0
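In order to propagate the uncertainties on $(^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_s$ and $(^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_s$ towards $T_{DM}$, we first need to compute the partial derivatives of $T_{DM}$ relative to these two variables. Using the rules of calculus, we get

$$\frac{\partial T_{DM}}{\partial (^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_s^{0}} = -\frac{1}{\lambda}\,\frac{\dfrac{1}{(^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{DM}^{0} - (^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{s}^{0}}}{1 + \dfrac{(^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{DM}^{0} - (^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{s}^{0}}{(^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{DM}^{0} - (^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{s}^{0}}}$$

which, with the help of equation (4.3.16), can be simplified into

$$\frac{\partial T_{DM}}{\partial (^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_s^{0}} = -\frac{1}{\lambda}\,\frac{1 - e^{-\lambda T_{DM}}}{(^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{DM}^{0} - (^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{s}^{0}}$$

Likewise, the derivative of $T_{DM}$ relative to $(^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_s$ is calculated as

$$\frac{\partial T_{DM}}{\partial (^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_s^{0}} = \frac{1}{\lambda}\,\frac{\dfrac{(^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{DM}^{0} - (^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{s}^{0}}{\left[(^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{DM}^{0} - (^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{s}^{0}\right]^{2}}}{1 + \dfrac{(^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{DM}^{0} - (^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{s}^{0}}{(^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{DM}^{0} - (^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{s}^{0}}}$$

or

$$\frac{\partial T_{DM}}{\partial (^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_s^{0}} = \frac{1}{\lambda}\,\frac{1 - e^{-\lambda T_{DM}}}{(^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{DM}^{0} - (^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{s}^{0}} \qquad (4.3.17)$$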
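Inserting the numerical values gives

$$T_{DM} = \frac{1}{0.654 \times 10^{-11}}\ln\left[1 + \frac{0.513114 - 0.511815}{0.222 - 0.108}\right] = 1.732\ \mathrm{Ga}$$

$$\frac{\partial T_{DM}}{\partial (^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_s^{0}} = \frac{1}{0.654 \times 10^{-11}} \times \frac{-0.011266}{0.513114 - 0.511815} = -1326\ \mathrm{Ga}$$

$$\frac{\partial T_{DM}}{\partial (^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_s^{0}} = \frac{1}{0.654 \times 10^{-11}} \times \frac{0.011266}{0.222 - 0.108} = 15.11\ \mathrm{Ga}$$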
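Summarizing the results, the vector A is

$$A = [-1326,\ 15.11]$$

The covariance matrix of the measured $(^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_s$ and $(^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_s$ is

$$S = \begin{pmatrix} (12 \times 10^{-6})^{2} & 0 \\ 0 & (0.001)^{2} \end{pmatrix}$$

with zero off-diagonal terms since the variables are uncorrelated. The variance of $T_{DM}$ can now be calculated from equation (4.3.14) as

$$s_{T_{DM}}^{2} = A\,S\,A^{T} = (-1326 \times 0.000012)^{2} + (15.11 \times 0.001)^{2} \approx 4.8 \times 10^{-4}\ \mathrm{Ga}^{2}$$

and its standard deviation as

$$s_{T_{DM}} \approx 0.022\ \mathrm{Ga}$$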
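The model-age calculation and its linearized error propagation can be checked numerically. The following is a minimal sketch, assuming the simplified derivatives of equation (4.3.17) and uncorrelated errors; the function names `nd_model_age` and `nd_model_age_sigma` are hypothetical and chosen only for this illustration.

```python
import math

LAMBDA_SM147 = 0.654e-11  # decay constant of 147Sm in 1/a, value quoted in the text


def nd_model_age(nd_s, sm_s, nd_dm, sm_dm, lam=LAMBDA_SM147):
    """Depleted-mantle model age T_DM in years (hypothetical helper)."""
    return math.log(1.0 + (nd_dm - nd_s) / (sm_dm - sm_s)) / lam


def nd_model_age_sigma(nd_s, sm_s, s_nd, s_sm, nd_dm, sm_dm, lam=LAMBDA_SM147):
    """Return T_DM, its two partial derivatives, and its standard deviation.

    Assumes uncorrelated errors on the two measured ratios, so the
    covariance matrix S is diagonal and A S A^T reduces to a sum of squares.
    """
    t = nd_model_age(nd_s, sm_s, nd_dm, sm_dm, lam)
    decay = 1.0 - math.exp(-lam * t)
    d_nd = -decay / (lam * (nd_dm - nd_s))   # dT/d(143Nd/144Nd)_s
    d_sm = decay / (lam * (sm_dm - sm_s))    # dT/d(147Sm/144Nd)_s
    variance = (d_nd * s_nd) ** 2 + (d_sm * s_sm) ** 2
    return t, d_nd, d_sm, math.sqrt(variance)


t, d_nd, d_sm, s_t = nd_model_age_sigma(
    nd_s=0.511815, sm_s=0.108, s_nd=0.000012, s_sm=0.001,
    nd_dm=0.513114, sm_dm=0.222,
)
GA = 1.0e9  # years per Ga
print(f"T_DM = {t / GA:.3f} Ga, s = {s_t / GA:.3f} Ga")
print(f"A = [{d_nd / GA:.0f}, {d_sm / GA:.2f}] Ga per unit ratio")
```

The two terms of the variance are of similar size here, so neither measured ratio dominates the uncertainty on the age.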
This method can be extended to sets of non-linear implicit relationships. Let us assume that the m-vector Y is defined by a set of k = 1, ..., m expressions as a function of the n components of the vector X through

$$\eta_k(Y_1, \ldots, Y_m) = \varphi_k(X_1, \ldots, X_n) \qquad (4.3.18)$$

In order to expand each expression to the first degree about the mean, we simply calculate the differential of each expression

$$\sum_{j=1}^{m}\frac{\partial \eta_k}{\partial Y_j}\,dY_j = \sum_{j=1}^{n}\frac{\partial \varphi_k}{\partial X_j}\,dX_j \qquad (4.3.19)$$
Replacing the infinitesimal increments by the deviation from the mean and combining the m differential equations (4.3.19), we get the system

$$\left[\frac{\partial \eta_k}{\partial Y_j}\right](Y - \mu_Y) = \left[\frac{\partial \varphi_k}{\partial X_j}\right](X - \mu_X) + \text{higher-order terms}$$

Noting A the m × n matrix with elements $\partial\varphi_k/\partial X_j$ and B the m × m matrix with elements $\partial\eta_k/\partial Y_j$ yields the compact expression

$$B(Y - \mu_Y) = A(X - \mu_X) + \text{higher-order terms}$$

Taking the expected value of each side, we get

$$B[\mu_Y - \mathcal{E}(Y)] = A[\mu_X - \mathcal{E}(X)] + \text{higher-order terms}$$

Neglecting at this point the higher-order terms, multiplying the compact expression by its transpose and taking the expectation gives

$$B\,\Sigma_Y B^{T} = A\,\Sigma_X A^{T} \qquad (4.3.20)$$
One can assume that the dependent variables Y are defined relative to the variables X by at least n independent equations. Upon pre-multiplication by $B^{-1}$ and post-multiplication by $(B^{T})^{-1}$, this equation becomes

$$\Sigma_Y = B^{-1}A\,\Sigma_X\,(B^{-1}A)^{T} \qquad (4.3.21)$$

The sample covariance matrices are likewise related through the following relationship

$$S_Y \approx B^{-1}A\,S_X\,(B^{-1}A)^{T} \qquad (4.3.22)$$

& Isotope dilution with mass fractionation correction. $N_{Nd}^{sp} = 0.1$ nmol of a $^{150}$Nd spike is added to a sample containing $N_{Nd}^{nat}$ mol of natural Nd. The mixture is run on a thermal ionization mass spectrometer. Intensities $I_m$ are measured in volts (V) on the digital voltmeter (DVM) at masses m = 144, 146, and 150 and reported (Table 4.8) with standard deviation together with isotopic abundances in natural Nd and commercial spike $^{150}$Nd. The concentration in the spike solution is thought to be known with a 1σ relative uncertainty of 1 percent and weighing errors are negligible. Tests have shown that intensity variations are correlated with a correlation coefficient
Table 4.8. Atomic abundance of isotopes of mass m in natural and spike Nd. $I_m$ and $s(I_m)$ are the measured intensities (in volts) at mass m and their standard deviations. Ion currents are converted into voltage through a high-value resistor.

m      a_m(nat)    a_m(sp)     I_m (V)     s(I_m) (V)
144    0.237881    0.005015    1.190409    0.000354
146    0.171726    0.004563    0.857826    0.000212
150    0.056239    0.977890    0.473914    0.000149
of 0.92. Calculate the number of moles of natural Nd present in the run and its standard deviation. The number of moles $N_m$ of mass m is the sum of moles of the contributing species (natural Nd and spike) weighted by the isotopic abundances $a_m^{nat}$ and $a_m^{sp}$ at mass m

$$N_m = a_m^{nat} N_{Nd}^{nat} + a_m^{sp} N_{Nd}^{sp} \qquad (4.3.23)$$

Thermal ionization has a mass-dependent efficiency and the intensity $I_m$ measured at mass m by the DVM relates to the actual number of moles $N_m$ through the unknown function f(m)

$$N_m = I_m f(m) \qquad (4.3.24)$$

As discussed in Section 3.1, f(m) can be linearly developed in the vicinity of an arbitrary mass value (e.g., 144) and we write

$$f(m) = f_{144}\,[1 - (m - 144)\,\delta] \qquad (4.3.25)$$

where δ is the mass discrimination factor or mass bias. We can therefore rearrange the equations for the three masses as a linear system

$$a_m^{nat}\,\frac{N_{Nd}^{nat}}{f_{144}} + a_m^{sp}\,\frac{N_{Nd}^{sp}}{f_{144}} + I_m (m - 144)\,\delta = I_m, \qquad m = 144, 146, 150 \qquad (4.3.26)$$

or

$$\begin{pmatrix} a_{144}^{nat} & a_{144}^{sp} & 0 \\ a_{146}^{nat} & a_{146}^{sp} & 2 I_{146} \\ a_{150}^{nat} & a_{150}^{sp} & 6 I_{150} \end{pmatrix} \begin{pmatrix} N_{Nd}^{nat}/f_{144} \\ N_{Nd}^{sp}/f_{144} \\ \delta \end{pmatrix} = \begin{pmatrix} I_{144} \\ I_{146} \\ I_{150} \end{pmatrix}$$
which can be solved for the three unknowns $N_{Nd}^{nat}/f_{144}$, $N_{Nd}^{sp}/f_{144}$, and δ. Since $N_{Nd}^{sp}$ is known, the solution of this system of equations will give $f_{144}$ and $N_{Nd}^{nat}$. In order to compute error propagation, we must evaluate the partial derivatives with respect to both the dependent and independent variables. This will be more clearly seen by differentiating equation (4.3.24)

$$dN_m = f(m)\,dI_m + I_m\,df(m) = a_m^{nat}\,dN_{Nd}^{nat} + a_m^{sp}\,dN_{Nd}^{sp}$$

while the differential of f(m) through equation (4.3.25) reads

$$df(m) = df_{144}\,[1 - (m - 144)\,\delta] - f_{144}(m - 144)\,d\delta$$

The independent variables fixed by the experiment are $I_{144}$, $I_{146}$, $I_{150}$ and $N_{Nd}^{sp}$ (n = 4). The dependent variables calculated from the three equations (4.3.26) are $N_{Nd}^{nat}$, δ, and $f_{144}$ (m = 3). We assemble the terms accordingly and write

$$a_m^{nat}\,dN_{Nd}^{nat} + I_m f_{144}(m - 144)\,d\delta - I_m[1 - (m - 144)\,\delta]\,df_{144} = f(m)\,dI_m - a_m^{sp}\,dN_{Nd}^{sp}$$

Inserting numerical values, the system

$$\begin{pmatrix} 0.237881 & 0.005015 & 0.000000 \\ 0.171726 & 0.004563 & 1.715652 \\ 0.056239 & 0.977890 & 2.843486 \end{pmatrix} \begin{pmatrix} N_{Nd}^{nat}/f_{144} \\ N_{Nd}^{sp}/f_{144} \\ \delta \end{pmatrix} = \begin{pmatrix} 1.190409 \\ 0.857826 \\ 0.473914 \end{pmatrix}$$

is solved as

$$\begin{pmatrix} N_{Nd}^{nat}/f_{144} \\ N_{Nd}^{sp}/f_{144} \\ \delta \end{pmatrix} = \begin{pmatrix} 5.00 \\ 0.200 \\ -0.00100 \end{pmatrix}$$
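i.e., $f_{144} = 0.500$ nmol V$^{-1}$, $N_{Nd}^{nat} = 2.500$ nmol, and a mass bias δ of −1.00 permil per mass unit. The reader will check that changing the reference mass, say to 148, does not change the results. We now define and compute matrix A as

$$A = \begin{pmatrix} f(144) & 0 & 0 & -a_{144}^{sp} \\ 0 & f(146) & 0 & -a_{146}^{sp} \\ 0 & 0 & f(150) & -a_{150}^{sp} \end{pmatrix} = \begin{pmatrix} 0.5000 & 0.0000 & 0.0000 & -0.005015 \\ 0.0000 & 0.5010 & 0.0000 & -0.004563 \\ 0.0000 & 0.0000 & 0.5030 & -0.97789 \end{pmatrix}$$

and matrix B as

$$B = \begin{pmatrix} a_{144}^{nat} & 0 & -I_{144}\,f(144)/f_{144} \\ a_{146}^{nat} & 2 f_{144} I_{146} & -I_{146}\,f(146)/f_{144} \\ a_{150}^{nat} & 6 f_{144} I_{150} & -I_{150}\,f(150)/f_{144} \end{pmatrix} = \begin{pmatrix} 0.2379 & 0.0000 & -1.1904 \\ 0.1717 & 0.8579 & -0.8595 \\ 0.0562 & 1.4218 & -0.4768 \end{pmatrix}$$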
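which gives the more compact relationship equivalent to equation (4.3.19)

$$B \begin{pmatrix} dN_{Nd}^{nat} \\ d\delta \\ df_{144} \end{pmatrix} = A \begin{pmatrix} dI_{144} \\ dI_{146} \\ dI_{150} \\ dN_{Nd}^{sp} \end{pmatrix}$$

After calculation, we get

$$B^{-1}A = \begin{pmatrix} -10.2164 & 21.3087 & -12.9082 & 25.0017 \\ -0.4213 & 0.5850 & -0.0006 & 0.0000 \\ -2.4616 & 4.2581 & -2.5795 & 5.0003 \end{pmatrix}$$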
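The covariance matrix $S_x$ of the independent variables is built from the correlation matrix and the standard deviation of each variable according to equation (4.2.18). Note that the uncertainties on spike addition (fourth variable) are not correlated to those of the intensity measurements. $S_x$ is computed as

$$S_x = \begin{pmatrix} 0.000354 & 0 & 0 & 0 \\ 0 & 0.000212 & 0 & 0 \\ 0 & 0 & 0.000149 & 0 \\ 0 & 0 & 0 & 0.001 \end{pmatrix} \begin{pmatrix} 1 & 0.92 & 0.92 & 0 \\ 0.92 & 1 & 0.92 & 0 \\ 0.92 & 0.92 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0.000354 & 0 & 0 & 0 \\ 0 & 0.000212 & 0 & 0 \\ 0 & 0 & 0.000149 & 0 \\ 0 & 0 & 0 & 0.001 \end{pmatrix}$$

or

$$S_x = \begin{pmatrix} 1.25 \times 10^{-7} & 6.90 \times 10^{-8} & 4.85 \times 10^{-8} & 0 \\ 6.90 \times 10^{-8} & 4.49 \times 10^{-8} & 2.91 \times 10^{-8} & 0 \\ 4.85 \times 10^{-8} & 2.91 \times 10^{-8} & 2.22 \times 10^{-8} & 0 \\ 0 & 0 & 0 & 1.00 \times 10^{-6} \end{pmatrix}$$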
The covariance matrix $S_y$ of the dependent variables is obtained through equation (4.3.22)

$$S_y = \begin{pmatrix} (0.025)^{2} & 0.000000 & 0.000126 \\ 0.000000 & (6 \times 10^{-5})^{2} & 0.000000 \\ 0.000126 & 0.000000 & (0.005)^{2} \end{pmatrix}$$

The solution with standard deviations quoted in parentheses therefore reads

$N_{Nd}^{nat}$ = 2.500 (0.025) nmol
δ = −0.00100 (0.00006) per mass unit
$f_{144}$ = 0.500 (0.005) nmol V$^{-1}$

A related example of linearized error propagation during the isotope dilution measurement of lead in rock samples using the double-spike technique is given by Hamelin et al. (1985). <=
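The isotope dilution example can be reproduced end to end with a few lines of linear algebra. The sketch below is only an illustration under the assumptions of the text (linear mass bias, correlation 0.92 between intensities, spike amount uncorrelated with them); the variable names are hypothetical, and `numpy.linalg.solve` is used in place of an explicit matrix inverse.

```python
import numpy as np

# Data of Table 4.8
masses = np.array([144.0, 146.0, 150.0])
a_nat = np.array([0.237881, 0.171726, 0.056239])
a_sp = np.array([0.005015, 0.004563, 0.977890])
I_m = np.array([1.190409, 0.857826, 0.473914])      # intensities (V)
s_I = np.array([0.000354, 0.000212, 0.000149])      # their standard deviations (V)
N_sp, s_N_sp = 0.1, 0.001                           # spike (nmol), 1 percent error
r = 0.92                                            # correlation between intensities
dm = masses - 144.0

# Linear system (4.3.26) for N_nat/f144, N_sp/f144 and delta
M = np.column_stack([a_nat, a_sp, I_m * dm])
ratio_nat, ratio_sp, delta = np.linalg.solve(M, I_m)
f144 = N_sp / ratio_sp
N_nat = ratio_nat * f144

# Matrices of partial derivatives: B d(dependent) = A d(independent)
f_m = f144 * (1.0 - dm * delta)
A = np.column_stack([np.diag(f_m), -a_sp])                      # 3 x 4
B = np.column_stack([a_nat, I_m * f144 * dm, -I_m * (1.0 - dm * delta)])
J = np.linalg.solve(B, A)                                       # B^-1 A

# Covariance of the independent variables (I144, I146, I150, N_sp)
s_x = np.append(s_I, s_N_sp)
corr = np.zeros((4, 4))
corr[:3, :3] = r
np.fill_diagonal(corr, 1.0)
S_x = np.outer(s_x, s_x) * corr

# Propagation formula (4.3.22); rows and columns ordered as N_nat, delta, f144
S_y = J @ S_x @ J.T
s_y = np.sqrt(np.diag(S_y))

print(f"N_nat = {N_nat:.3f} ({s_y[0]:.3f}) nmol")
print(f"delta = {delta:.5f} ({s_y[1]:.5f}) per mass unit")
print(f"f144  = {f144:.3f} ({s_y[2]:.3f}) nmol/V")
```

Most of the uncertainty on $N_{Nd}^{nat}$ and $f_{144}$ comes from the 1 percent error on the spike amount, whereas the uncertainty on δ depends only on the intensity measurements.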
4.3.4 Monte-Carlo simulations

In many cases, the functional relationship φ is too complicated for the distribution function of the dependent variable(s) to be analytically calculated, yet we do not want to linearize error propagation because we feel such a simplification would give results that are far too inaccurate. This is typically the case when the dependent variable is calculated by numerical integration of a differential equation. With the advent of fast desktop computers, the Monte-Carlo error propagation technique has become an easy way of circumventing this difficulty. A large number of random deviates on X (or on the vector X if we are dealing with a vector of random variables) are generated by software with the appropriate probability density, then φ(X) is recalculated with the new value(s) and the statistics are evaluated directly by the computer. Many deviates (> 200) are usually needed if we do not want the statistics to be biased by the 'computer sampling', e.g., if a Student-t distribution is to be accurately approximated by a normal distribution.

& A test of mixing is to be made in a $^{87}$Sr/$^{86}$Sr vs $1/C_{Sr}$ plot. In order to estimate statistically the quality of the alignment, the standard deviation of $1/C_{Sr}$ is needed for each data point. Assuming the mean value of $C_{Sr}$ for a sample is 70 ppm and its standard deviation 20 ppm, estimate through a Monte-Carlo method the mean of the inverse value $1/C_{Sr}$ and its standard deviation. The procedure consists in producing 500 normal deviates $u_i$, i.e., random numbers normally distributed with zero mean and unit variance. We then compute 500 random values $x_i$ of $C_{Sr}$ normally distributed with mean μ = 70 ppm and variance $\sigma^{2} = 20^{2}$ ppm$^{2}$ using

$$x_i = \mu + \sigma u_i = 70 + 20\,u_i$$

and then compute the 500 inverse values $y_i = 1/x_i$. In a real experiment with MatLab, the mean $C_{Sr}$ was 79.8 ppm and the standard deviation 20.4 ppm. The computed mean $1/C_{Sr}$ was 0.0136 ppm$^{-1}$ and the standard deviation 0.0045 ppm$^{-1}$. These estimates are significantly biased relative to the linear propagation theory, which would predict 0.0125 and 0.0031, respectively. This example shows that linear propagation should be applied with utmost care when a variable depends on another through a strongly non-linear relationship. <=
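The procedure can be sketched in a few lines. This is a minimal illustration, not the original MatLab experiment: the function name `monte_carlo_inverse`, the seed, and the rejection of non-positive draws (an assumption made here because a concentration cannot be negative) are choices of this example, so the numbers it prints will differ from those quoted above.

```python
import numpy as np


def monte_carlo_inverse(mean, std, n=500, seed=0):
    """Monte-Carlo statistics of 1/x for x normal with given mean and std.

    Hypothetical helper. Non-positive draws are discarded (assumption:
    concentrations must be positive), which slightly truncates the normal law.
    """
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(n)          # normal deviates, zero mean, unit variance
    x = mean + std * u                  # x_i = mu + sigma * u_i
    x = x[x > 0.0]
    y = 1.0 / x
    mean_x, std_x = x.mean(), x.std(ddof=1)
    return {
        "n_used": x.size,
        "mean_x": mean_x,
        "std_x": std_x,
        "mean_inv": y.mean(),
        "std_inv": y.std(ddof=1),
        # first-order (linear) propagation evaluated at the sample mean
        "linear_mean": 1.0 / mean_x,
        "linear_std": std_x / mean_x**2,
    }


result = monte_carlo_inverse(70.0, 20.0)
for key, value in result.items():
    print(f"{key:12s} {value:.5g}")
```

For positive samples the Monte-Carlo mean of 1/x is never smaller than the inverse of the sample mean, which is the bias that the linear formula misses.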
& We want to assess the extent to which the La/Yb ratio of a basaltic liquid is changed by the fractionation of a clinopyroxene-garnet mineral assemblage. The value of the chondrite-normalized La/Yb ratio of the primary melt is assumed to be 5.0. Clinopyroxene (cpx) and garnet (ga) are the only phases at the liquidus. The mean mineral-liquid partition coefficients are $K_{cpx}^{La} = 0.01$, $K_{cpx}^{Yb} = 0.4$, $K_{ga}^{La} = 0.005$, $K_{ga}^{Yb} = 4$. They are distributed as log-normal variables: the standard deviation of each ln K is ln 2 (i.e., we assume that individual partition coefficients are, on average, known to within a factor of two). From experience, we also know that the 'shape' of partition coefficient patterns is fairly stable, so we assume the correlation coefficient between errors on the ln K's of the same mineral to be r = 0.95. The fraction (1 − F) crystallized is distributed exponentially with a mean value of 0.2. The cumulate is made, on average, of $x_{cpx}$ = 40 percent cpx, this variable being log-normally distributed with a standard deviation of 0.1 (10 percent error). From the equations ruling elemental fractionation during progressive removal described in Section 9.3, the relationship between mean values is

$$\left(\frac{\mathrm{La}}{\mathrm{Yb}}\right)_{liq} = \left(\frac{\mathrm{La}}{\mathrm{Yb}}\right)_{0} F^{\,D_{La} - D_{Yb}}$$
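where D stands for the bulk solid-liquid partition coefficient and the subscript 0 refers to the undifferentiated magma. We first compute from equation (4.3.22) the covariance matrices $S_{\ln K}$ of ln K for each mineral

$$S_{\ln K} = \begin{pmatrix} \ln 2 & 0 \\ 0 & \ln 2 \end{pmatrix} \begin{pmatrix} 1 & 0.95 \\ 0.95 & 1 \end{pmatrix} \begin{pmatrix} \ln 2 & 0 \\ 0 & \ln 2 \end{pmatrix} = \begin{pmatrix} 0.4805 & 0.4564 \\ 0.4564 & 0.4805 \end{pmatrix}$$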
The vectors $[\ln K_j^{La}, \ln K_j^{Yb}]^{T}$ are normal with mean $[\ln 0.01, \ln 0.4]^{T}$ for j = cpx and $[\ln 0.005, \ln 4]^{T}$ for j = ga. Let us describe how one random estimate of La/Yb can be computed. The vector u such that

$$u = S_{\ln K}^{-1/2} \begin{pmatrix} \ln K_{cpx}^{La} - \ln 0.01 \\ \ln K_{cpx}^{Yb} - \ln 0.4 \end{pmatrix}$$

is a vector of two uncorrelated normal deviates with zero means and unit variances. We have, indeed, from equation (4.2.31), the equality

$$S_u = S_{\ln K}^{-1/2}\, S_{\ln K}\, \left(S_{\ln K}^{-1/2}\right)^{T} = I_2$$

which shows that the identity matrix $I_2$ is the covariance matrix of u. The square root of the matrix $S_{\ln K}$ is calculated as

$$S_{\ln K}^{1/2} = \begin{pmatrix} 0.5615 & 0.4065 \\ 0.4065 & 0.5615 \end{pmatrix}$$
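The matrix square root and its use to turn uncorrelated normal deviates into correlated ln K values can be sketched as follows. This is a minimal illustration: the symmetric square root is obtained here from the eigen-decomposition of $S_{\ln K}$, which is one possible choice, and the helper names `sym_sqrt` and `draw_partition_coefficients` are hypothetical.

```python
import numpy as np


def sym_sqrt(S):
    """Symmetric square root of a symmetric positive-definite matrix."""
    w, Q = np.linalg.eigh(S)
    return Q @ np.diag(np.sqrt(w)) @ Q.T


def draw_partition_coefficients(mean_K, root, rng):
    """One random pair (K_La, K_Yb): ln K = ln(mean K) + S^(1/2) u."""
    u = rng.standard_normal(2)           # uncorrelated normal deviates
    return np.exp(np.log(mean_K) + root @ u), u


sigma = np.log(2.0)                      # standard deviation of ln K
r = 0.95                                 # correlation between ln K_La and ln K_Yb
S_lnK = sigma**2 * np.array([[1.0, r], [r, 1.0]])
root = sym_sqrt(S_lnK)
print(np.round(root, 4))

# One random draw for clinopyroxene
rng = np.random.default_rng(1)
K_cpx, u = draw_partition_coefficients(np.array([0.01, 0.4]), root, rng)

# Check with the deviates of the first run of Table 4.9
u_run1 = np.array([-0.3604, -0.0179])
K_run1 = np.exp(np.log([0.01, 0.4]) + root @ u_run1)
print(np.round(K_run1, 4))
```

With the deviates of the first run, the sketch returns the clinopyroxene partition coefficients listed in Table 4.9 to the quoted precision.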
Table 4.9. La/Yb fractionation by garnet-clinopyroxene removal from a basaltic melt: examples of three Monte-Carlo runs of error propagation. The $u_i$ and $w_i$ deviates are normal deviates, the deviates $v_i$ are uniform. See text for the description of the computed random variables: K represents mineral-liquid partition coefficients, F the fraction of residual melt, x the fraction of a mineral in the cumulate, D bulk solid-liquid partition coefficients.

i    u_i (cpx)             K_cpx(La)   K_cpx(Yb)   u_i (ga)              K_ga(La)   K_ga(Yb)
1    -0.3604   -0.0179     0.0081      0.3420       1.6341    0.2622     0.0139     9.0044
2    -0.8877   -1.0016     0.0040      0.1589      -0.6429    1.0279     0.0053     5.4856
3    -0.3251   -0.3350     0.0073      0.2904       0.6876    1.1546     0.0118     10.115

i    w_i       v_i       1-F       F         x_cpx     x_ga      D_La      D_Yb
1    1.6552    0.4897    0.1428    0.8572    0.4684    0.5316    0.0112    4.9473
2    0.6287    0.6855    0.0755    0.9245    0.4247    0.5753    0.0048    3.2234
3    1.5346    0.6642    0.0818    0.9182    0.4630    0.5370    0.0097    5.5664

Using a generator of random numbers, we produce a 2-vector u of uncorrelated normal deviates and compute the random vector

$$\begin{pmatrix} \ln K_{cpx}^{La} \\ \ln K_{cpx}^{Yb} \end{pmatrix} = \begin{pmatrix} \ln 0.01 \\ \ln 0.40 \end{pmatrix} + S_{\ln K}^{1/2}\,u$$

We then compute $K_{cpx}^{La}$ and $K_{cpx}^{Yb}$ and proceed similarly for ga. $x_{cpx}$ is calculated in a similar way: if w is a normal deviate, we get $\ln x_{cpx} = \ln 0.40 + (\ln 1.1) \times w$ or $x_{cpx} = 0.40 \times 1.1^{w}$, while $x_{ga}$ is calculated as $1 - x_{cpx}$. The La bulk solid-liquid partition coefficient $D_{La}$ is calculated as

$$D_{La} = x_{cpx} K_{cpx}^{La} + x_{ga} K_{ga}^{La}$$
with a similar formula for $D_{Yb}$. In order to compute the fraction crystallized (1 − F), we note that, for v being a deviate uniformly distributed between 0 and 1, −ln v is an exponential deviate with unit mean, and −0.2 ln v is an exponential deviate with mean 0.2. The La/Yb ratio can now be calculated and we proceed identically with a different set of random deviates as many times as needed. Table 4.9 gives three cases (i = 1, 2, 3) of such a calculation with u and w standing for normal deviates and v for uniform deviates. Figure 4.11 plots 100 estimates of mineral-liquid partition coefficients and La/Yb ratios.

Figure 4.11 Monte-Carlo simulation (100 trials) of error propagation for La/Yb fractionation in residual melts by clinopyroxene-garnet removal from a basaltic parent magma (see text for parameter description and distributions used). Top: mineral-liquid partition coefficients for La and Yb. Bottom: variations of the La/Yb ratio as a function of the fraction F of residual melt.

4.4 Principal component analysis
$$S_x = U\Lambda U^{T} \qquad (4.4.1)$$

where Λ is the n × n diagonal matrix of eigenvalues $\lambda_1, \ldots, \lambda_n$ and U is the n × n matrix of orthogonal eigenvectors $u_j$ (j = 1, ..., n). The jth component of the ith observation vector $x_i$ is the scalar $e_{ji}$ such that

$$e_{ji} = u_j^{T}(x_i - \bar{x}) \qquad (4.4.2)$$

$e_{ji}$ is therefore the coordinate of the ith measurement $x_i$ along the jth eigenvector $u_j$ of $S_x$. From that definition, we can infer the average component along the jth eigenvector to be zero since

$$\bar{e}_j = \frac{1}{m}\sum_{i=1}^{m} u_j^{T}(x_i - \bar{x}) = u_j^{T}(\bar{x} - \bar{x}) = 0 \qquad (4.4.3)$$

The n coordinates associated with the n eigenvectors define the vector $e_i$ of the point vector $x_i$ in the new system of coordinates, which is written formally as

$$e_i = U^{T}(x_i - \bar{x})$$

with the number of vectors $e_i$ being m. Pre-multiplication of the old coordinates $x_i - \bar{x}$ by $U^{T}$ corresponds to a rotation that makes the new axes correspond to the eigenvectors of the matrix $S_x$ (see Section 2.2). The n × m matrix X of centered observations therefore produces an n × m matrix E through

$$E = U^{T}X \qquad (4.4.4)$$

Applying equation (4.3.4) for variable change, we get the component covariance matrix $S_e$ as

$$S_e = U^{T}S_xU = U^{T}U\Lambda U^{T}U = \Lambda \qquad (4.4.5)$$
Since Λ is diagonal, the covariance between components along different eigenvectors is zero and the variance of the components along the jth eigenvector is given by

$$s_{e_j}^{2} = \frac{1}{m-1}\sum_{i=1}^{m} e_{ji}^{2} = \lambda_j \qquad (4.4.6)$$

Now comes the very principle of principal component analysis. A total variance is now defined as the trace of the matrix $S_x$ or, using a property of the trace of a matrix product given in Section 2.2

$$\mathrm{tr}\,S_x = \mathrm{tr}(U\Lambda U^{T}) = \mathrm{tr}(\Lambda U^{T}U) = \mathrm{tr}\,\Lambda = \sum_{j=1}^{n}\lambda_j \qquad (4.4.7)$$

and the proportion of that variance explained by the component k is the ratio $p_k$ given by

$$p_k = \frac{\lambda_k}{\sum_{j=1}^{n}\lambda_j} \qquad (4.4.8)$$

Adding variances on different variables at the denominator, e.g. pH and temperature in solutions, does not make much sense and is certainly not invariant upon rescaling. Proportions of explained total variance do not survive a simple change of units! For this reason, PCA is commonly carried out instead on normalized variables ξ such as

$$\xi_i = s^{-1}(x_i - \bar{x}) \qquad (4.4.9)$$
where s is the diagonal matrix of sample standard deviations. The n × m matrix Ξ collects the i = 1, ..., m normalized measurements $\xi_i$. As we have seen in equation (4.2.33), the covariance matrix of standardized variables is the correlation matrix of the non-standardized variables. Therefore, the $\xi_i$ have the correlation matrix R for covariance matrix. The diagonal form of R is

$$R = V\Lambda V^{T} \qquad (4.4.10)$$

where Λ is the diagonal matrix of eigenvalues $\delta_1, \ldots, \delta_n$ and V the matrix of the orthogonal eigenvectors $v_1, \ldots, v_n$ of R. The component $f_{ji}$ of the ith vector $\xi_i$ along the jth eigenvector $v_j$ of R is given by

$$f_{ji} = v_j^{T}\xi_i \qquad (4.4.11)$$

and the components can be collected in an n × m matrix F obtained as before through

$$F = V^{T}\Xi \qquad (4.4.12)$$

or, since V is an orthogonal matrix

$$\Xi = VF \qquad (4.4.13)$$

Applying equation (4.3.4) for a change of variable, we get the component covariance matrix $S_f$ as

$$S_f = V^{T}RV = V^{T}V\Lambda V^{T}V = \Lambda \qquad (4.4.14)$$

As for the principal components of the covariance matrix, the principal components of the correlation matrix have zero covariance. In addition, the variance of a component is simply given by the corresponding eigenvalue, i.e.,

$$s_{f_j}^{2} = \frac{1}{m-1}\sum_{i=1}^{m} f_{ji}^{2} = \delta_j \qquad (4.4.15)$$

Since the trace of a matrix is invariant upon rotation, we get the sum of the n eigenvalues $\delta_j$ as the trace of the correlation matrix. The dimensionless total variance is therefore

$$\text{total variance} = \sum_{j=1}^{n}\delta_j = \mathrm{tr}\,\Lambda = \mathrm{tr}\,R = n \qquad (4.4.16)$$

The proportion of the total variance allotted to the kth component is

$$p_k = \frac{\delta_k}{\sum_{j=1}^{n}\delta_j} = \frac{\delta_k}{n} \qquad (4.4.17)$$
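Equations (4.4.9) to (4.4.17) translate almost line by line into code. The sketch below is a minimal illustration on synthetic data (the data, the seed and the function name `correlation_pca` are assumptions of this example); it follows the convention of the text that variables are stored as rows and observations as columns.

```python
import numpy as np


def correlation_pca(X):
    """PCA on the correlation matrix of an n x m data matrix X.

    Rows are the n variables, columns the m observations.
    """
    n, m = X.shape
    x_mean = X.mean(axis=1, keepdims=True)
    s = X.std(axis=1, ddof=1, keepdims=True)
    Xi = (X - x_mean) / s                 # normalized variables, eq. (4.4.9)
    R = Xi @ Xi.T / (m - 1)               # correlation matrix
    delta, V = np.linalg.eigh(R)          # eigenvalues come out in ascending order
    order = np.argsort(delta)[::-1]       # sort by decreasing eigenvalue
    delta, V = delta[order], V[:, order]
    F = V.T @ Xi                          # components, eq. (4.4.12)
    p = delta / n                         # explained proportions, eq. (4.4.17)
    return Xi, R, delta, V, F, p


# Synthetic example: two strongly correlated variables and one independent
rng = np.random.default_rng(0)
m = 200
t = rng.standard_normal(m)
X = np.vstack([
    t + 0.1 * rng.standard_normal(m),
    2.0 * t + 0.3 * rng.standard_normal(m),
    rng.standard_normal(m),
])
Xi, R, delta, V, F, p = correlation_pca(X)
S_f = F @ F.T / (m - 1)                   # component covariance matrix, eq. (4.4.14)
print("eigenvalues:", np.round(delta, 4))
print("percent of variance:", np.round(100.0 * p, 2))
```

The component covariance matrix `S_f` comes out diagonal, with the eigenvalues on its diagonal, and the eigenvalues sum to the number of variables.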
Relative to any other orthogonal reference frame and for any value of j, the total variance which is unaccounted for by the components associated with eigenvalues smaller than $\delta_j$ is minimal (e.g., Johnson and Wichern, 1982). Reduction of the number of independent variables can therefore be achieved by dropping all the components that do not account for a substantial fraction of the total variance, i.e., those with the smallest eigenvalues. The cutoff value depends on where the significance level is set for the problem under consideration. Let us assume that the components j = 1, ..., k are taken into consideration, while the components from k + 1 to n are considered as 'noise'. We therefore split the n × n matrix V into its most valuable or significant part, the n × k matrix $V_k$ of the eigenvectors associated with the k largest eigenvalues, and the 'noise' part, the n × (n − k) matrix $V_k^{\perp}$ of the remaining eigenvectors. $V_k$ and $V_k^{\perp}$ are made of orthogonal vectors. Likewise, the matrix F is split into a k × m matrix $F_k$ comprising the k significant eigencomponents and a (n − k) × m matrix $F_k^{\perp}$ of noise. Making this split apparent in the matrix equality (4.4.12), it becomes

$$F = \begin{pmatrix} F_k \\ F_k^{\perp} \end{pmatrix} = \begin{pmatrix} V_k & V_k^{\perp} \end{pmatrix}^{T}\,\Xi \qquad (4.4.18)$$

We are interested in the upper part $F_k$ of F and therefore in

$$F_k = V_k^{T}\,\Xi \qquad (4.4.19)$$

Conversely, the n × m matrix of reduced data Ξ can be considered as the sum of a significant part $\Xi_k$ and a noise part $\Xi_k^{\perp}$, both with the same dimension as Ξ, so equation (4.4.13) now becomes

$$\Xi = \Xi_k + \Xi_k^{\perp} = V_kF_k + V_k^{\perp}F_k^{\perp} \qquad (4.4.20)$$

The n × m matrix $\Xi_k$ of the significant part in the original space of the reduced data is therefore

$$\Xi_k = V_kF_k = V_kV_k^{T}\,\Xi \qquad (4.4.21)$$

or for the ith reduced vector $\xi_i$

$$\xi_i^{k} = V_kf_i^{k} = V_kV_k^{T}\,\xi_i \qquad (4.4.22)$$

The noise component of the reduced vector is the complement of $\xi_i^{k}$ to $\xi_i$. The squared modulus of this noise, which is the reduced squared distance $(d_i^{k})^{2}$ of the ith actual measurement to the point represented by its significant principal components, is

$$(d_i^{k})^{2} = (\xi_i^{k} - \xi_i)^{T}(\xi_i^{k} - \xi_i) \qquad (4.4.23)$$
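The reduced squared distance of equation (4.4.23) can be sketched as follows. This is a minimal illustration on synthetic data lying close to a plane; the helper name `reduced_squared_distance`, the data and the seed are assumptions of this example.

```python
import numpy as np


def reduced_squared_distance(Xi, V, k):
    """Squared distance of each reduced observation to the subspace
    spanned by the first k eigenvectors (columns of V), eq. (4.4.23)."""
    Vk = V[:, :k]
    Xi_k = Vk @ (Vk.T @ Xi)               # significant part, eq. (4.4.21)
    diff = Xi_k - Xi
    return np.sum(diff * diff, axis=0)


# Synthetic data: 4 variables, the last one nearly a combination of two others
rng = np.random.default_rng(1)
n, m = 4, 50
X = rng.standard_normal((n, m))
X[3] = X[0] + X[1] + 0.05 * rng.standard_normal(m)

# Normalized variables and ordered eigenvectors of the correlation matrix
Xi = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, ddof=1, keepdims=True)
R = Xi @ Xi.T / (m - 1)
delta, V = np.linalg.eigh(R)
order = np.argsort(delta)[::-1]
delta, V = delta[order], V[:, order]

d2 = reduced_squared_distance(Xi, V, k=2)
print("largest reduced squared distance:", d2.max())
print("sum of distances:", d2.sum(), "=", (m - 1) * delta[2:].sum())
```

A useful check is that the distances sum to (m − 1) times the sum of the discarded eigenvalues, so an observation with an unusually large share of that sum is a candidate outlier.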
This method is extremely useful to detect the points that have a large noise component (outliers) and therefore are exceedingly far from the subspace of the significant principal components. For the raw data, the significant part $x_i^{k}$ of the vector $x_i$ is

$$x_i^{k} = \bar{x} + s\,\xi_i^{k}$$

while its noise part is $x_i - x_i^{k}$. At this point we can describe most of the variability in the original data set by a small number of linear combinations of the original variables. In particular, once the components have been ordered with decreasing eigenvalues, graphing the first components pairwise will enable the data to be shown along the directions of maximum variability. It is convenient to draw a reference unit circle in the plane, or a unit reference sphere in a three-dimensional space, which shows the locus of points located at a distance of one standard deviation from the mean. In addition, a unit vector (one standard deviation) along each original axis $\xi_i$ can be drawn, which gives a quick visual feeling of how a plane defined by two components is oriented in the original reference frame, i.e., tells us 'where' the original data axis is in the component plane. If the unit vector of an original data axis lies in the plane of the two components, its representative point must be on the unit circle. If it is orthogonal to the two-component plane, it must project at the center of the circle. From equation (4.4.13), the components of the unit vectors in the original $\xi_i$ space are simply the rows of the matrix V. The correlation coefficients between the original data and the components, known as the component loadings, are also of great utility as they show which component carries more information on an original data axis. Multiplying equation (4.4.13) by $F^{T}$, we get

$$\Xi F^{T} = VFF^{T} \qquad (4.4.24)$$

Both Ξ, by definition, and F through equation (4.4.12), are centered, i.e., their expectation is a null matrix. Therefore the sample covariance matrix between the reduced data and the components is

$$\overline{\Xi F^{T}} - \bar{\Xi}\,\bar{F}^{T} = V\left(\overline{FF^{T}} - \bar{F}\,\bar{F}^{T}\right) = VS_f = V\Lambda \qquad (4.4.25)$$

where the bar over the matrices refers to the sample mean. The $\xi_i$ have unit variance while Λ is the covariance matrix of the components. From equation (4.2.20), the matrix of sample correlation coefficients is therefore

$$V\Lambda\,\Lambda^{-1/2} = V\Lambda^{1/2} \qquad (4.4.26)$$

The ijth term of this matrix represents the correlation coefficient (loading) between the ith variable and the jth principal component.
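The loading matrix can be checked against correlation coefficients computed directly between the normalized variables and the components. The following is a minimal sketch on synthetic data (data, seed and variable names are assumptions of this example).

```python
import numpy as np

# Synthetic data: n variables as rows, m observations as columns
rng = np.random.default_rng(2)
n, m = 3, 300
t = rng.standard_normal(m)
X = np.vstack([
    t + 0.5 * rng.standard_normal(m),
    t + 0.5 * rng.standard_normal(m),
    rng.standard_normal(m),
])

# Normalized variables, correlation matrix and ordered eigen-decomposition
Xi = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, ddof=1, keepdims=True)
R = Xi @ Xi.T / (m - 1)
delta, V = np.linalg.eigh(R)
order = np.argsort(delta)[::-1]
delta, V = delta[order], V[:, order]
F = V.T @ Xi                                   # components

# Loadings from eq. (4.4.26): column j of V scaled by sqrt(delta_j)
loadings = V * np.sqrt(delta)

# Direct estimate: correlation between variable i (row of Xi) and component j (row of F)
direct = np.corrcoef(np.vstack([Xi, F]))[:n, n:]

print(np.round(loadings, 4))
```

Because each normalized variable has unit variance, the squared loadings of a given variable sum to one over all the components.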
& Limestone samples from Coumiac in the Southern French Massif Central have been measured for major elements and carbon species by Grandjean (1989). The data are reported in Table 4.10. Total iron is counted as Fe2O3. Use PCA to suggest the possible mineral components that contribute to the rocks. The mean and standard deviation are first computed and shown in Table 4.10. Then the correlation matrix R is evaluated by computer as

          SiO2      Al2O3     Fe2O3     CaO       CO2       Org. C
SiO2      1.0000    0.9107    0.3530   -0.5461   -0.6008    0.1525
Al2O3     0.9107    1.0000    0.5481   -0.7273   -0.7637    0.1537
Fe2O3     0.3530    0.5481    1.0000   -0.9705   -0.9537    0.5949
CaO      -0.5461   -0.7273   -0.9705    1.0000    0.9921   -0.5296
CO2      -0.6008   -0.7637   -0.9537    0.9921    1.0000   -0.5367
Org. C    0.1525    0.1537    0.5949   -0.5296   -0.5367    1.0000
This matrix shows strong correlations between SiO2 and Al2O3 on one hand, between CaO and CO2 on the other hand, hinting thereby at a detrital and a calcite component, respectively. Fe2O3 is strongly anticorrelated with both CaO and CO2. The eigenvalues $\delta_j$ of the matrix R and their percentage $p_j$ of contribution to the total variance are given in Table 4.11.

Table 4.10. Major-element composition (weight percent) of the Coumiac limestones, South French Massif Central (Grandjean, 1989).

Layer #    SiO2    Al2O3   Fe2O3    CaO      CO2      Org. C
3          1.23    0.48    1.12     53.30    42.34    0.03
4          0.29    0.13    0.39     54.36    43.55    0.02
8          0.28    0.07    0.31     54.50    43.71    0.02
9          2.69    1.68    27.14    36.17    30.84    0.17
11         1.56    0.82    0.44     52.88    41.64    0.04
12         1.52    0.76    3.23     51.74    41.25    0.03
13         2.00    0.65    0.60     53.90    42.03    0.02
14         5.72    2.45    6.90     46.05    37.19    0.05
19         0.72    0.13    0.23     55.24    43.07    0.05
20         2.21    0.30    0.41     54.09    42.18    0.09
21         1.50    0.15    0.38     54.42    42.87    0.09
22         1.75    0.33    0.45     53.91    42.73    0.12
23         0.82    0.35    0.54     54.42    42.77    0.12
24         1.70    0.69    0.37     53.58    42.23    0.04
26         1.79    0.61    2.11     52.86    41.69    0.02
27         3.20    1.57    0.75     51.76    40.62    0.06
28         2.80    1.21    1.65     52.04    40.79    0.04
Mean       1.87    0.73    2.77     52.07    41.26    0.06
Std dev.   1.29    0.66    6.50     4.59     3.07     0.04

Table 4.11. Eigenvalues of the correlation matrix of major-element concentrations in Coumiac limestones and percentage of variance explained by each component. Note that the eigenvalues $\delta_j$ sum up to 6.

j               1         2         3         4         5         6
delta_j         4.2329    1.2262    0.4872    0.0470    0.0054    0.0013
p_j (percent)   70.55     20.44     8.12      0.78      0.09      0.02

The eigenvector matrix V is given by

$$V = \begin{pmatrix} 0.3432 & 0.5651 & -0.4362 & 0.6054 & -0.0602 & 0.0503 \\ 0.4034 & 0.4808 & -0.0592 & -0.7421 & -0.0329 & 0.2253 \\ 0.4391 & -0.3113 & 0.3565 & 0.2316 & -0.0678 & 0.7246 \\ -0.4722 & 0.1192 & -0.2731 & -0.0223 & 0.6312 & 0.5379 \\ -0.4788 & 0.0785 & -0.1953 & -0.0493 & -0.7692 & 0.3637 \\ 0.2730 & -0.5764 & -0.7526 & -0.1619 & -0.0241 & 0.0067 \end{pmatrix}$$
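The eigenvalues and percentages of Table 4.11 can be recomputed from the correlation matrix quoted above. The sketch below assumes that `numpy.linalg.eigh` is an acceptable substitute for whatever routine was originally used; since R is entered with four decimals only, the results agree with the table to about that precision, and each eigenvector is defined only up to its sign.

```python
import numpy as np

# Correlation matrix of the Coumiac limestones, in the order
# SiO2, Al2O3, Fe2O3, CaO, CO2, organic C
R = np.array([
    [ 1.0000,  0.9107,  0.3530, -0.5461, -0.6008,  0.1525],
    [ 0.9107,  1.0000,  0.5481, -0.7273, -0.7637,  0.1537],
    [ 0.3530,  0.5481,  1.0000, -0.9705, -0.9537,  0.5949],
    [-0.5461, -0.7273, -0.9705,  1.0000,  0.9921, -0.5296],
    [-0.6008, -0.7637, -0.9537,  0.9921,  1.0000, -0.5367],
    [ 0.1525,  0.1537,  0.5949, -0.5296, -0.5367,  1.0000],
])

delta, V = np.linalg.eigh(R)              # ascending eigenvalues
order = np.argsort(delta)[::-1]           # reorder by decreasing eigenvalue
delta, V = delta[order], V[:, order]
p = 100.0 * delta / R.shape[0]            # percentage of the total variance

for j, (d, pj) in enumerate(zip(delta, p), start=1):
    print(f"component {j}: eigenvalue {d:.4f}, {pj:.2f} percent")
print("first two components:", round(p[0] + p[1], 1), "percent")
```

The first two eigenvalues account for about 91 percent of the total variance, which is the justification for restricting the analysis to a single component plot.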
91 percent of the variance is explained by the first two components, so we can restrict our analysis to a plot of component 2 vs component 1 (Figure 4.12). Three groups of variables are identified: the calcitic end-member (CaO and CO2), the detrital end-member (SiO2 and Al2O3), and the organic end-member (organic C and Fe2O3), which suggests that diagenetic pyrite precipitation occurred whenever strong input of organic material made the environment reducing. Why two components if there are three end-members? The closure condition is an additional relationship, so the number of degrees of freedom for mixtures of these three end-members is only two. <=

Figure 4.12 Principal component analysis of the major elements in Coumiac limestones. 91 percent of the variance is explained by the first two components. The data can be explained by the combination of three chemical end-members: calcitic (CaO and CO2), detrital (SiO2 and Al2O3), and organic (organic C and Fe2O3). Because of the closure condition these three end-members translate into only two significant components.

& The isotopic composition of radiogenic elements in 40 groups of oceanic islands has been compiled by Vincent Salters from the Lamont-Doherty Geological Observatory and is reported in Table 4.12. Find the minimum number of variables to explain at least 90 percent of the variance. Find the deviating islands. Plot the first three components pairwise. As a reference, the mean and standard deviation of each variable are listed in the appropriate column of Table 4.12. Then, the correlation matrix R is calculated as

$$R = \begin{pmatrix} 1 & -0.812 & -0.470 & -0.099 & -0.117 \\ -0.812 & 1 & 0.344 & -0.005 & -0.041 \\ -0.470 & 0.344 & 1 & 0.855 & 0.880 \\ -0.099 & -0.005 & 0.855 & 1 & 0.894 \\ -0.117 & -0.041 & 0.880 & 0.894 & 1 \end{pmatrix}$$
Table 4.12. Mean isotopic data for oceanic islands (courtesy Vincent Salters). The $(d_i^{k})^{2}$ of each observation is its squared distance to the first two-component plane ('the Mantle Plane').

Island group      87Sr/86Sr   143Nd/144Nd   206Pb/204Pb   207Pb/204Pb   208Pb/204Pb   (d_i^k)^2
St Paul-Amst.     0.70373     0.51288       18.879        15.585        39.131        0.0298
Ascension         0.70283     0.51304       19.421        15.612        38.916        0.9158
Australs          0.70311     0.51288       20.533        15.733        39.876        2.3561
Cook-Australs     0.70367     0.51282       20.001        15.673        39.621        0.8359
N Cook            0.70455     0.51278       19.386        15.608        39.342        0.2172
S Cook            0.70370     0.51277       19.743        15.637        39.482        0.3823
Azores            0.70457     0.51281       19.707        15.703        39.810        1.1064
Balleny           0.70294     0.51297       19.752        15.600        39.359        0.7589
Bouvet            0.70369     0.51284       19.445        15.652        39.065        0.1384
Cameroon Line     0.70314     0.51290       20.020        15.672        39.758        1.2111
Cape Verde        0.70341     0.51284       19.254        15.580        39.026        0.0685
Carolines         0.70329     0.51297       18.462        15.489        38.289        1.3745
Christmas         0.70440     0.51269       18.639        15.605        38.742        0.3734
Cocos             0.70303     0.51299       19.234        15.589        38.973        0.5515
Comores           0.70341     0.51282       19.615        15.609        39.479        0.2638
Crozet            0.70400     0.51285       18.929        15.587        39.037        0.0101
Easter            0.70322     0.51290       19.865        15.640        39.670        0.7976
Fernando          0.70411     0.51281       19.409        15.634        39.331        0.1736
Galapagos         0.70312     0.51299       19.076        15.564        38.692        0.6122
Gough             0.70510     0.51254       18.445        15.624        38.99         1.5094
Hawaii            0.70376     0.51293       18.188        15.462        37.899        2.0918
Iceland           0.70311     0.51304       18.453        15.484        38.106        1.9802
Juan Fernandez    0.70366     0.51284       19.121        15.604        38.961        0.0175
Kerguelen         0.70506     0.51266       18.259        15.555        38.646        1.2409
Louisville        0.70358     0.51292       19.271        15.610        38.991        0.1321
Marion            0.70330     0.51293       18.562        15.540        38.367        0.7788
Marquesas         0.70424     0.51280       19.362        15.604        39.258        0.0874
NE Seamounts      0.70337     0.51285       20.155        15.629        39.907        0.9954
Nunivak           0.70290     0.51311       18.588        15.471        38.088        2.5327
Reunion           0.70414     0.51285       18.855        15.580        38.919        0.0505
Rio Grande        0.70478     0.51255       17.619        15.490        38.054        3.0063
Samoa             0.70553     0.51275       18.914        15.607        39.071        0.7729
San Felix         0.70409     0.51261       19.079        15.581        39.029        0.2985
Shimada           0.70484     0.51264       19.046        15.681        39.354        0.9822
Society           0.70478     0.51280       19.128        15.592        38.915        0.1748
St Helena         0.70289     0.51287       20.678        15.763        39.985        3.1309
Trinidade         0.70380     0.51271       19.116        15.601        39.110        0.0547
Tristan           0.70500     0.51255       18.476        15.518        38.867        1.5217
Tuamotus          0.70408     0.51271       18.132        15.490        38.879        0.7974
Walvis            0.70469     0.51254       17.914        15.492        38.472        2.1604
Mean              0.70387     0.51282       19.118        15.594        39.037
Std dev. (s)      0.00073     0.00014       0.690         0.070         0.528
Table 4.13. Eigenvalues and percentage of explained variance for the oceanic island isotope data of Table 4.12.

j               1         2         3         4         5
delta_j         2.9155    1.7631    0.1871    0.101     0.0333
p_j (percent)   58.31     35.26     3.74      2.02      0.67

The five eigenvalues of R were calculated by computer as the $\delta_j$ listed and rearranged in decreasing order in Table 4.13 together with the fraction $p_j$ in percent of the total variance explained by each component. The matrix V of the five eigenvectors, in the order of the eigenvalues, is

$$V = \begin{pmatrix} -0.2914 & 0.6122 & 0.6682 & 0.2672 & 0.1492 \\ 0.2150 & -0.6641 & 0.6659 & 0.1918 & -0.1806 \\ 0.5783 & 0.0132 & -0.0127 & 0.1888 & 0.7935 \\ 0.5149 & 0.2991 & 0.2851 & -0.7230 & -0.2036 \\ 0.5190 & 0.3074 & -0.1693 & 0.5775 & -0.5235 \end{pmatrix}$$
More than 93.5 percent of the variance is explained by the first two components, which tells us that two degrees of freedom describe most of the natural isotopic variation with the five chronometers. This observation has led to the concept of the 'Mantle Plane' of Zindler et al. (1982), since a plane is defined by only two independent variables, and has been extensively discussed by Allegre et al. (1987). Figure 4.13 shows that the first component is dominated by lead isotopes which plot next to the unit circle on the right, while the spread along the second component is dominated by the Sr-Nd anticorrelation. From component 3 to 5, the spread is small. The loading matrix V\1/2 given by equation (4.4.26) can be found in Table 4.14. Lead isotopes have strong correlation coefficients on the first component. They are decoupled from Sr and Nd isotopes which strongly correlate and anticorrelate, respectively, with the second component. On a global scale, Pb isotopic variations in oceanic islands seem to be decoupled from Sr and Nd isotopic variations. Let us decide that the cutoff value is k = 2, so the matrix Vk is made of the first two columns of V. As expected, the remaining part of the data, i.e., the components three tofive,which has been formally ascribed to noise, is very small. This is apparent in the last column of Table 4.12, where the (df)2 values have been listed. What Zindler et al. (1982) called the 'distance' of a datum to the Mantle Plane is the square-root of (df)2. Interestingly, many of the deviating points are those for which a HIMU component (Zindler and Hart, 1986) has been recognized (St Helena, Rio Grande, Australs). In order to account for this additional component, it is left to the reader to show that a third component would work adequately. «= What PC A is actually doing in this case in terms of processes is rather inappropriate. There is a consensus for explaining mantle isotopic variability as mixing geochemical
[Figure 4.13 image: scatter panels with axes labeled component 1, component 2 and component 3 (ticks from -2 to 2); plotted variables are 87Sr/86Sr, 143Nd/144Nd, 206Pb/204Pb, 207Pb/204Pb and 208Pb/204Pb.]
Figure 4.13 Principal component analysis of the mean isotopic data for oceanic islands (courtesy of Vincent Salters). In the top left corner, the plane of the first two components (the 'Mantle Plane' of Zindler et al., 1982) explains 93 percent of the variance. Component 1 is dominated by lead isotopes, component 2 by Sr and Nd isotopes. Other components are plotted for reference. In the top right corner, the 'Mantle Plane' is viewed sideways along the direction of the second component, so the distance of each point to the plane can be easily seen. In the bottom left corner, it is viewed along the axis of the first component. The bottom right corner shows how little variance is left with components 3 and 4.
entities, inadvertently also called 'mantle components' (e.g., Allegre, 1982; White, 1985; Zindler and Hart, 1986; Hart et al, 1992) which should not be confused with principal components. These mantle components most certainly represent contributions from distinct mantle reservoirs with distinct evolution. However, the process of 'adding-up' mantle components does not produce the linear relationships in ratio-ratio plots that PCA assumes implicitly, but rather generates hyperbolic mixing surfaces (see Chapter 1). The components created by PCA are a convenient simplification of rather questionable significance.
Table 4.14. Correlation coefficients (loadings) between the original reduced data and the components for the oceanic island isotope data of Table 4.12.

Component          1         2         3         4         5
87Sr/86Sr       -0.4976    0.8130    0.2890    0.0849    0.0273
143Nd/144Nd      0.3670   -0.8818    0.2880    0.0610   -0.0330
206Pb/204Pb      0.9875    0.0175   -0.0055    0.0600    0.1449
207Pb/204Pb      0.8791    0.3972    0.1233   -0.2297   -0.0372
208Pb/204Pb      0.8861    0.4082   -0.0732    0.1835   -0.0956
An additional problem of the PCA is how analytical uncertainties may be taken into account. The normalization step represented by equation (4.4.9) involves estimates of standard deviations. A common choice is the sample standard deviation, although Allegre et al. (1987) consider that a more natural scaling of the variations is the experimental uncertainty on an individual measurement. Both procedures lead to de-dimensionalized data but with an entirely different philosophy, depending on whether emphasis is on the natural or the analytical dispersion. For instance, most of the Ni variations in basalts are due to olivine fractionation, while most of the Mn variations can be ascribed to analytical uncertainties, since the partition coefficient of this element is unity for most femic phases. Normalizing concentrations to the sample standard deviations s results in the 1σ surface in the data space being represented by a unit reference circle in the component plane, and the analytical uncertainty surface associated with each point is an ellipse. Normalizing concentrations to the sample analytical uncertainty does the opposite: the analytical uncertainty volume associated with each point in the data space becomes a unit circle in the component plane, while the 1σ surface in the data space is represented by an ellipse.
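The two scalings discussed above can be compared on a small example. The sketch below is only an illustration under stated assumptions: the data are randomly generated stand-ins (not the values of Table 4.12), `sigma_analytical` is a hypothetical vector of measurement uncertainties, and `pca_loadings` is a helper name introduced here. It builds the loading matrix $V\Lambda^{1/2}$ from the eigen-decomposition of the covariance matrix of the scaled data.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for a (samples x 5 isotope ratios) table: two latent
# factors plus noise. These numbers are NOT the data of Table 4.12.
latent = rng.normal(size=(40, 2))
mixing = np.array([[-0.5, 0.8], [0.4, -0.9], [1.0, 0.0], [0.9, 0.4], [0.9, 0.4]])
data = latent @ mixing.T + 0.1 * rng.normal(size=(40, 5))


def pca_loadings(data, scale):
    """Return (loadings V*sqrt(Lambda), explained-variance fractions)."""
    reduced = (data - data.mean(axis=0)) / scale      # centered, scaled data
    cov = reduced.T @ reduced / (data.shape[0] - 1)   # covariance of scaled data
    eigval, eigvec = np.linalg.eigh(cov)              # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1]                  # re-sort in descending order
    eigval, eigvec = eigval[order], eigvec[:, order]
    return eigvec * np.sqrt(eigval), eigval / eigval.sum()


# Scaling 1: sample standard deviations (the covariance is then the correlation matrix).
load_s, frac_s = pca_loadings(data, data.std(axis=0, ddof=1))
# Scaling 2: hypothetical analytical uncertainties on each variable.
sigma_analytical = np.array([0.05, 0.05, 0.02, 0.02, 0.02])
load_a, frac_a = pca_loadings(data, sigma_analytical)

print(np.round(load_s, 3))
print(np.round(frac_s, 3), np.round(frac_a, 3))
```

With the first scaling each row of the loading matrix has a unit sum of squares, which is why the variables plot on or inside the unit circle. With the second scaling the row norms measure the natural dispersion of each variable in units of its analytical uncertainty.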
Inverse methods
Literature abounds with a rich terminology concerning the possible relationships between observations provided by experiment or analysis and parameters, which are the physical quantities needed for a mathematical formulation of a process (the model) to be uniquely determined. A forward problem relates observations to parameters by a relationship such as

$$\text{parameters} = \mathcal{L}^*(\text{observations})$$ (5.0.1)

where $\mathcal{L}^*$ is a matrix, differential, or integral operator and is usually easy to work with. For instance, given some observed (or assumed) mantle source composition, degree of melting, residual mineralogy and fractionation coefficients, calculating the composition of the basaltic melt segregated from the mantle source is a forward problem. Quite commonly, however, we are in situations where such an operator $\mathcal{L}^*$ is not available and, instead, the relationship goes through a known operator $\mathcal{L}$ that relates the parameters to the observations

$$\mathcal{L}(\text{parameters}) = \text{observations}$$

Assuming the operator $\mathcal{L}$ (the model) is known, the inverse problem consists in finding the inverse operator $\mathcal{L}^*$, and therefore the parameters which relate to the observations through equation (5.0.1).

In addition, very few observations are pristine, and basic measurements such as the angular deviation of a needle on a display, the linear expansion of a fluid, or voltages on an electronic device only represent analogs of the observation to be made. These observations are themselves dependent on a model of the measurement process attached to the particular device. For instance, we may assume that the deviation of a needle on a display connected to a resistance is proportional to the number of charged particles received by the resistance. The model of the measurement is usually well constrained and the analyst should be in control of the deterministic part through calibration, working curves, assessment of non-linearity, etc. If the physics of the measurement is correctly understood, the residual deviations from the experimental calibration may be considered as random deviates. Their assessment is an integral part of the measurement protocol and the moments of these random deviations should be known to the analyst and incorporated in the model. Fitting a parameter-dependent model to a set of observations consists in finding
the set of parameters that is most suitable for bringing the observations and the model as close as possible. Part of this chapter is dedicated to giving the terms 'most suitable' a statistical sense. Given a set of measurements and a set of unknown parameters related to each other by a set of known equations, the least-square method provides the minimum-variance estimate of the parameters. The least-square method is not to be confused with the maximum-likelihood method, which requires the knowledge of probability distributions, although they both result in identical solutions for normally distributed variables.

5.1 Linear estimates

5.1.1 General

Given a matrix $A_{m \times n}$ of known coefficients with m > n and a vector $\bar{y}_m$ of observations or data, an unknown or model vector $x_n$ of parameters is sought which fulfills the condition of the model

$$\bar{y} = Ax$$ (5.1.1)

This equation has in general no solution because the data vector $\bar{y}$ should represent a linear combination of the n column vectors $a_1, a_2, \ldots, a_n$ of the matrix A, for

$$\bar{y} = x_1 a_1 + x_2 a_2 + \cdots + x_n a_n$$ (5.1.2)

The vector $\bar{y}$ is therefore required to lie in the column-space of A, a desired property which, at least within a certain precision, is usually not met in practice. We therefore make the assumption that the sample data gathered in vector $\bar{y}$ are only our best estimates of the real (population) values $y$, which justifies the bar on the symbol as representing measured values. This notation contradicts the standard usage, but is consistent with the basic definitions of Chapter 4. Indeed, for an unbiased estimate, we can still write that

$$\mathcal{E}(\bar{y}) = y$$ (5.1.3)

The problem is therefore recast as a search for a population vector $y$, of which the measurement $\bar{y}$ is an estimate, and which satisfies the model, i.e.,

$$y = Ax$$ (5.1.4)

Neither of the population parameters x and y can be found, since their determination would require the whole range of attainable values to be measured. The least-square criterion provides estimates $\hat{x}$ and $\hat{y}$ of x and y, respectively, which also satisfy the model, i.e.,

$$\hat{y} = A\hat{x}$$ (5.1.5)

and makes the length of the residual vector $(\bar{y} - \hat{y})$ as small as possible. The least-square solution $\hat{y}$ is simply the orthogonal projection of the data vector $\bar{y}$ onto the column-space
of the matrix A (Figure 5.1), which we found in Chapter 2 to be given by

$$\hat{y} = A(A^TA)^{-1}A^T\bar{y} = P\bar{y}$$ (5.1.6)

The projector P associated with that projection is $A(A^TA)^{-1}A^T$, with dimension m × m and rank n. Comparing with equation (5.1.5), we obtain the least-square solution as

$$\hat{x} = (A^TA)^{-1}A^T\bar{y}$$ (5.1.7)

Figure 5.1 The least-square estimate $\hat{y}$ of the solution to equation (5.1.4) is the orthogonal projection of the observation vector $\bar{y}$ onto the column-space $a_1, a_2, \ldots, a_n$ of matrix A.
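Equations (5.1.6) and (5.1.7) translate directly into a few lines of linear algebra. The following minimal sketch uses a hypothetical 4 × 2 matrix `A` and data vector `y_meas` (both invented for illustration) and checks the properties stated above: P is symmetric and idempotent, its rank is n, and the residual is orthogonal to the column-space of A.

```python
import numpy as np

# Hypothetical design matrix (m = 4 observations, n = 2 parameters) and data.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y_meas = np.array([0.1, 0.9, 2.1, 2.9])

AtA = A.T @ A
x_hat = np.linalg.solve(AtA, A.T @ y_meas)   # equation (5.1.7)
P = A @ np.linalg.solve(AtA, A.T)            # projector A (A^T A)^-1 A^T
y_hat = P @ y_meas                           # equation (5.1.6)
residual = y_meas - y_hat                    # orthogonal to the columns of A

print("x_hat =", x_hat)
print("rank(P) =", np.linalg.matrix_rank(P))
print("A^T residual =", A.T @ residual)
```

Solving the normal equations with `np.linalg.solve` mirrors the formulas of the text; for poorly conditioned matrices a routine such as `np.linalg.lstsq` is usually a safer choice.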
The least-square solution itself does not depend on the probability distribution of y: it is simply a minimum-distance estimate. Later in this chapter, it will be shown, however, that its sampling properties are most easily described when the measurements are normally distributed. Finally, one may ask how a particular datum may influence the results. Various definitions of data importance, leverage, or influence can be used. P measures the importance of each observation, as seen from equation (5.1.6)

$$\hat{y}_i = \sum_{k=1}^{m} p_{ik}\,\bar{y}_k$$ (5.1.8)

where $p_{ik}$ is the ikth element of the projector P. This relationship tells us how much each datum $\bar{y}_k$ participates in the making of the estimate $\hat{y}_i$. Ideally, we would like the matrix P to be as close to a diagonal matrix as possible, which would ensure independent observations. In addition, all $p_{ii}$ ideally should be nearly equal in order for the contribution of each observation to the making of parameters to be equivalent. However, there is probably no better measure of how a particular observation i influences the model than comparing the solution $\hat{x}$ with the solution $\hat{x}(i)$ obtained by leaving out the ith datum (e.g., Sen and Srivastava, 1990). The derivation of the change $[\hat{x} - \hat{x}(i)]$ requires some particular results of matrix algebra not included in this book and the reader may refer to these authors for complete derivation of the
Table 5.1. Ion beam intensities $\bar{I}_i$ (mV) at mass i, isotopic abundances a of metals and adjusted intensities $\hat{I}_i$. Ion currents are converted into voltage through a high-value resistor.

Mass i   $\bar{I}_i$   $a_i^{Ce}$   $a_i^{Nd}$   $a_i^{Sm}$   $\hat{I}_i$
142      207           0.1113       0.2713       0            207.00
144      62            0            0.2380       0.031        62.29
146      43            0            0.1719       0            42.65
148      26            0            0.0576       0.113        26.10
150      22            0            0.0564       0.074        21.73
following equation

$$\hat{x} - \hat{x}(i) = (A^TA)^{-1}a_i\,\frac{\bar{y}_i - \hat{y}_i}{1 - p_{ii}}$$ (5.1.9)

where $a_i^T$ is the ith row of the matrix A.

Peak-stripping (mass spectrum deconvolution). An ion probe measures the ion current at the masses 142, 144, 146, 148 and 150, which are known to result from the overlapping isotopic signals of Ce, Nd and Sm. The vector $\bar{I}$ (not to be confused with the identity matrix) of measured peak ion currents $\bar{I}_{142}, \bar{I}_{144}, \ldots$ (values in millivolts) is given in Table 5.1. Neglecting instrumental mass fractionation (see Chapter 3), calculate the total elemental signal in millivolts for Ce, Nd, and Sm. Let $a_{142}^{Ce}$ be the atomic fraction of 142Ce in Ce and the like for other elements and isotopes. Let us call $I_{Ce}$, $I_{Nd}$ and $I_{Sm}$ the elemental signal, i.e., the total number of millivolts summed over all the isotopes of the same element. Mass balance requires

$$\bar{I}_{142} = a_{142}^{Ce} I_{Ce} + a_{142}^{Nd} I_{Nd} + a_{142}^{Sm} I_{Sm}$$
$$\bar{I}_{144} = a_{144}^{Ce} I_{Ce} + a_{144}^{Nd} I_{Nd} + a_{144}^{Sm} I_{Sm}$$

and so on for each mass. Defining the matrix A with current element $a_{ij}$ we can write

$$\begin{pmatrix} \bar{I}_{142} \\ \bar{I}_{144} \\ \bar{I}_{146} \\ \bar{I}_{148} \\ \bar{I}_{150} \end{pmatrix} = A \begin{pmatrix} I_{Ce} \\ I_{Nd} \\ I_{Sm} \end{pmatrix}$$

Consulting a table of nuclide abundances (Walker et al., 1989) we can build the matrix A out of rows and columns of Table 5.1. Note that abundances do not sum up to
unity because only a few isotopes were measured. Intermediate results are

$$A^TA = \begin{pmatrix} 0.0124 & 0.0302 & 0.0000 \\ 0.0302 & 0.1663 & 0.0181 \\ 0.0000 & 0.0181 & 0.0192 \end{pmatrix}$$

and

$$(A^TA)^{-1} = \begin{pmatrix} 159.21 & -32.20 & 30.28 \\ -32.20 & 13.208 & -12.421 \\ 30.28 & -12.421 & 63.747 \end{pmatrix}$$

Let us now build the 'stripping' matrix $(A^TA)^{-1}A^T$, which is independent of the mixing proportions

$$(A^TA)^{-1}A^T = \begin{pmatrix} 8.9847 & -6.7242 & -5.5345 & 1.5667 & 0.4245 \\ 0.0000 & 2.7586 & 2.2705 & -0.6427 & -0.1742 \\ 0.0000 & -0.9799 & -2.1351 & 6.4880 & 4.0167 \end{pmatrix}$$
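and gives the least-square solution as

$$\hat{x} = \begin{pmatrix} I_{Ce} \\ I_{Nd} \\ I_{Sm} \end{pmatrix} = \begin{pmatrix} 8.9847 & -6.7242 & -5.5345 & 1.5667 & 0.4245 \\ 0.0000 & 2.7586 & 2.2705 & -0.6427 & -0.1742 \\ 0.0000 & -0.9799 & -2.1351 & 6.4880 & 4.0167 \end{pmatrix} \begin{pmatrix} 207 \\ 62 \\ 43 \\ 26 \\ 22 \end{pmatrix} = \begin{pmatrix} 1255 \\ 248 \\ 105 \end{pmatrix}$$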
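The peak-stripping calculation can be reproduced numerically from the abundances and intensities of Table 5.1. The sketch below is a hedged illustration rather than the procedure used to prepare the book's tables: the names `stripping`, `influence` and `refit_without` are introduced here, and the influence of one observation is evaluated with equation (5.1.9) and cross-checked by actually refitting without that observation.

```python
import numpy as np

# Rows: masses 142, 144, 146, 148, 150; columns: Ce, Nd, Sm (Table 5.1).
A = np.array([
    [0.1113, 0.2713, 0.000],
    [0.0,    0.2380, 0.031],
    [0.0,    0.1719, 0.000],
    [0.0,    0.0576, 0.113],
    [0.0,    0.0564, 0.074],
])
I_meas = np.array([207.0, 62.0, 43.0, 26.0, 22.0])   # measured intensities, mV

stripping = np.linalg.solve(A.T @ A, A.T)   # (A^T A)^-1 A^T
x_hat = stripping @ I_meas                  # elemental signals Ce, Nd, Sm (mV)
P = A @ stripping                           # projector
I_adj = P @ I_meas                          # adjusted intensities


def influence(i):
    """Change x_hat - x_hat(i) from equation (5.1.9); needs p_ii < 1."""
    return stripping[:, i] * (I_meas[i] - I_adj[i]) / (1.0 - P[i, i])


def refit_without(i):
    """Least-square solution obtained after dropping observation i."""
    keep = np.arange(len(I_meas)) != i
    return np.linalg.lstsq(A[keep], I_meas[keep], rcond=None)[0]


delta_146 = influence(2)                    # effect of leaving out mass 146
print(np.round(x_hat, 1), np.round(I_adj, 2), round(np.trace(P), 4))
print(np.round(delta_146, 2), np.round(x_hat - refit_without(2), 2))
```

Mass 142 has $p_{ii} = 1$, so equation (5.1.9) cannot be evaluated for it: that observation is fitted exactly and carries all the information on Ce.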
Again, the total elemental signal is in millivolts. The projector P is

$$P = A(A^TA)^{-1}A^T = \begin{pmatrix} 1.0000 & 0 & 0 & 0 & 0 \\ 0 & 0.6262 & 0.4742 & 0.0482 & 0.0831 \\ 0 & 0.4742 & 0.3903 & -0.1105 & -0.0299 \\ 0 & 0.0482 & -0.1105 & 0.6961 & 0.4439 \\ 0 & 0.0831 & -0.0299 & 0.4439 & 0.2874 \end{pmatrix}$$
and the projected vector $\hat{I} = P\bar{I}$ is given in Table 5.1. The diagonal terms $p_{ii}$ sum up to 3 as expected. Examination of these terms shows that mass 142 must be measured as it is the only source of information for Ce. Mass 144, which is the second highest peak for Nd, and mass 148, which is the highest peak for Sm, contain more valuable information on these elements than the smaller
Table 5.2. Influence matrix: each figure represents by how much the variables in the first column change when the observation on the top row is left out.

Mass        142      144      146      148      150
$I_{Ce}$    2875     5.25    -3.16    -0.51     0.16
$I_{Nd}$    0.00    -2.16     1.30     0.21    -0.07
$I_{Sm}$   -0.00     0.77    -1.22    -2.12     1.54
peaks 146 and 150. The same conclusion is arrived at upon calculation of the influence of each observation. The influence vector of the ith observation is obtained by multiplying the ith column-vector of the matrix $(A^TA)^{-1}A^T$ calculated above by $(\bar{I}_i - \hat{I}_i)$ and dividing by $(1 - p_{ii})$, e.g., for mass 146

$$\frac{[-5.53,\ 2.27,\ -2.14]^T\,[43 - 42.65]}{1 - 0.39}$$

with complete results listed in Table 5.2. For instance, not measuring the mass 146 would decrease Nd intensity by 1.3 mV.

Isotope dilution. A similar technique can be used to achieve deconvolution of mass spectra when isotopic spikes (elements with artificially altered isotope composition) have been added to a sample so as to perform what is known as isotope dilution. When set up in conjunction with Thermal Ionisation Mass Spectrometry (TIMS) or Inductively Coupled Plasma Mass Spectrometry (ICP-MS), isotope dilution is a fairly precise technique for elemental analysis. Mass interferences make the calculation slightly more complicated but the peak-stripping technique is still applicable (Michard and Albarede, 1986). Let us assume that ion currents are measured at masses 140, 142, 143, 145 and 146. Each mass 'peak' results from the overlapping isotopic signals of Ce and Nd. As in the previous example, we define the vector $\bar{I}$ of measured ion currents (in millivolts) shown in Table 5.3. The analyst has added 2 µmol of 142-enriched Ce and 1 µmol of 145-enriched Nd Oak Ridge spikes. Determine the amount of natural Ce and Nd present in the run for the following mass spectrum. Let $a_{142}^{Ce}$ be the atomic fraction of 142Ce in natural Ce, $b_{142}^{Ce}$ the atomic fraction of 142Ce in the Ce spike, and likewise for the other elements and isotopes. Let us call $I_{Ce}^{nat}$ and $I_{Ce}^{sp}$ the elemental signal in the natural Ce and its spike, respectively, i.e., the total number of millivolts summed over all the isotopes of each element, and likewise for Nd. Mass balance requires
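$$\bar{I}_{140} = a_{140}^{Ce} I_{Ce}^{nat} + a_{140}^{Nd} I_{Nd}^{nat} + b_{140}^{Ce} I_{Ce}^{sp} + b_{140}^{Nd} I_{Nd}^{sp}$$
$$\bar{I}_{142} = a_{142}^{Ce} I_{Ce}^{nat} + a_{142}^{Nd} I_{Nd}^{nat} + b_{142}^{Ce} I_{Ce}^{sp} + b_{142}^{Nd} I_{Nd}^{sp}$$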
Table 5.3. First column: ion currents $\bar{I}_i$ (mV) measured for a natural-spike mixture of Ce and Nd. The matrix of isotopic abundances a, b was taken from the Chart of the Nuclides (Walker et al., 1989) and commercial Oak Ridge data sheets. Last column: adjusted values of the ion currents $\hat{I}_i$. Ion currents are converted into voltage through a high-value resistor.

                        Natural                    Spike
Mass i   $\bar{I}_i$   $a_i^{Ce}$   $a_i^{Nd}$   $b_i^{Ce}$   $b_i^{Nd}$   $\hat{I}_i$
140      330           0.8843       0            0.0789       0            330.0
142      280           0.1113       0.2713       0.9211       0.0125       280.0
143      6             0            0.1218       0            0.0075       6.24
145      32            0            0.0830       0            0.8967       32.01
146      10            0            0.1719       0            0.0431       9.82
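and likewise for other masses. In matrix form

$$\begin{pmatrix} \bar{I}_{140} \\ \bar{I}_{142} \\ \bar{I}_{143} \\ \bar{I}_{145} \\ \bar{I}_{146} \end{pmatrix} = A \begin{pmatrix} I_{Ce}^{nat} \\ I_{Nd}^{nat} \\ I_{Ce}^{sp} \\ I_{Nd}^{sp} \end{pmatrix}$$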
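From the data listed in Table 5.3, we calculate

$$A^TA = \begin{pmatrix} 0.7944 & 0.0302 & 0.1723 & 0.0014 \\ 0.0302 & 0.1249 & 0.2499 & 0.0861 \\ 0.1723 & 0.2499 & 0.8547 & 0.0115 \\ 0.0014 & 0.0861 & 0.0115 & 0.8061 \end{pmatrix}$$

and

$$(A^TA)^{-1} = \begin{pmatrix} 1.3328 & 0.6181 & -0.4486 & -0.0619 \\ 0.6181 & 23.3764 & -6.9274 & -2.4000 \\ -0.4486 & -6.9274 & 3.2767 & 0.6942 \\ -0.0619 & -2.4000 & 0.6942 & 1.4871 \end{pmatrix}$$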
Let us now build the stripping matrix $(A^TA)^{-1}A^T$, which again is independent of the mixing proportions and will be the same for every mixture that uses the same spike

$$(A^TA)^{-1}A^T = \begin{pmatrix} 1.1432 & -0.0979 & 0.0748 & -0.0042 & 0.1036 \\ 0.0000 & 0.0000 & 2.8292 & -0.2118 & 3.9150 \\ -0.1381 & 1.0975 & -0.8385 & 0.0475 & -1.1609 \\ 0.0000 & 0.0000 & -0.2812 & 1.1343 & -0.3485 \end{pmatrix}$$
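This leads to the solution (in mV of element)

$$\begin{pmatrix} I_{Ce}^{nat} \\ I_{Nd}^{nat} \\ I_{Ce}^{sp} \\ I_{Nd}^{sp} \end{pmatrix} = \begin{pmatrix} 1.1432 & -0.0979 & 0.0748 & -0.0042 & 0.1036 \\ 0.0000 & 0.0000 & 2.8292 & -0.2118 & 3.9150 \\ -0.1381 & 1.0975 & -0.8385 & 0.0475 & -1.1609 \\ 0.0000 & 0.0000 & -0.2812 & 1.1343 & -0.3485 \end{pmatrix} \begin{pmatrix} 330 \\ 280 \\ 6 \\ 32 \\ 10 \end{pmatrix} = \begin{pmatrix} 351.2 \\ 49.35 \\ 246.6 \\ 31.13 \end{pmatrix}$$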
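The same stripping can be sketched numerically for the spiked mixture of Table 5.3. The code below is an illustrative sketch: the variable names are introduced here, and the conversion of signals into amounts simply scales each natural-to-spike signal ratio by the spike amounts stated in the problem (2 µmol of Ce spike, 1 µmol of Nd spike).

```python
import numpy as np

# Rows: masses 140, 142, 143, 145, 146 (Table 5.3).
# Columns: natural Ce, natural Nd, Ce spike, Nd spike.
A = np.array([
    [0.8843, 0.0,    0.0789, 0.0],
    [0.1113, 0.2713, 0.9211, 0.0125],
    [0.0,    0.1218, 0.0,    0.0075],
    [0.0,    0.0830, 0.0,    0.8967],
    [0.0,    0.1719, 0.0,    0.0431],
])
I_meas = np.array([330.0, 280.0, 6.0, 32.0, 10.0])   # measured ion currents, mV
spike_umol = {"Ce": 2.0, "Nd": 1.0}                  # spike amounts from the problem

x_hat = np.linalg.lstsq(A, I_meas, rcond=None)[0]    # least-square elemental signals
ce_nat, nd_nat, ce_sp, nd_sp = x_hat

# Amount of natural element = (natural signal / spike signal) * spike amount.
n_ce = ce_nat / ce_sp * spike_umol["Ce"]
n_nd = nd_nat / nd_sp * spike_umol["Nd"]

I_adj = A @ x_hat                                    # adjusted ion currents
print(np.round(x_hat, 2))
print(f"natural Ce = {n_ce:.2f} umol, natural Nd = {n_nd:.2f} umol")
print(np.round(I_adj, 2))
```

Because the stripping matrix depends only on the isotopic abundances, the same matrix `A` can be reused for every mixture prepared with the same spikes; only `I_meas` changes.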
The amount of natural element present in the run was (351.2/246.6) × 2 µmol = 2.85 µmol of Ce and (49.35/31.13) × 1 µmol = 1.59 µmol of Nd. One should be careful not to work with mass units because of the widely different molar weights of the natural element and the spike. The P matrix given by
$$P = A(A^TA)^{-1}A^T = \begin{pmatrix} 1.0000 & 0 & 0 & 0 & 0 \\ 0 & 1.0000 & 0 & 0 & 0 \\ 0 & 0 & 0.3425 & -0.0173 & 0.4742 \\ 0 & 0 & -0.0173 & 0.9995 & 0.0125 \\ 0 & 0 & 0.4742 & 0.0125 & 0.6580 \end{pmatrix}$$
and the adjusted values $\hat{I}_i$ (Table 5.3) provide some enlightening information. Masses 140 and 142 must both be measured, since two masses are needed for isotope dilution, in this case for Ce. Their adjusted value is identical to the observed value. Mass 145 is the principal Nd spike and its measurement is therefore compulsory. Should one Nd isotope be dropped, it should be 143Nd, which is less abundant and hence less informative than 146Nd.

5.1.2 The least-square straight line and least-square plane

For a straight line, the linear relationship between two data vectors $x_m$ and $y_m$ can be written in three ways

$$y_m = a x_m + b$$ (5.1.10)
$$x_m = a y_m + b$$ (5.1.11)
$$1 = a x_m + b y_m$$ (5.1.12)
Handling these equations is normally done through the least-square method just discussed: on the right hand-side, the unknown vector will be the vector (a, b). The
ith row of the matrix $A_{m \times 2}$ of the general least-square problem will be made of the $(x_i, 1)$ in the first case, the $(y_i, 1)$ in the second case, and the $(x_i, y_i)$ in the third case. Different results for the slope and the intercept in an (x, y) diagram will be obtained for each case. Variable y in equation (5.1.10) and variable x in equation (5.1.11) are supposed to be known imperfectly. More complicated assumptions are implied by equation (5.1.12). In the jargon of regression techniques, the variable on the left-hand side of any equation is called the dependent variable, that on the right-hand side the independent variable. The least-square solution to equation (5.1.10) seeks adjusted values in the y direction, while for equation (5.1.11), the x direction is assumed. The third equation is known as orthogonal adjustment (Figure 5.2). Many books list the explicit solutions to some of these cases (e.g., Spiegel, 1975), but, with the advent of desktop computers that allow easy implementation of least-square solutions, these cumbersome expressions have become unnecessary. More powerful expressions have been proposed that enhance the visual assessment of error propagation or data influence (Provost, 1990).
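The three ways of setting up the matrix A can be tried side by side. The sketch below uses a small hypothetical sample (five invented points) and a helper `solve_ls` introduced here; each fit is rewritten as a slope and an intercept in the (x, y) diagram so that the results can be compared.

```python
import numpy as np

# Hypothetical sample, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])
ones = np.ones_like(x)


def solve_ls(A, rhs):
    """Least-square solution of A @ p = rhs."""
    return np.linalg.lstsq(A, rhs, rcond=None)[0]


# (5.1.10): y = a x + b, rows (x_i, 1).
a1, b1 = solve_ls(np.column_stack([x, ones]), y)
slope_1, intercept_1 = a1, b1

# (5.1.11): x = a y + b, rows (y_i, 1); rewritten as y = (x - b) / a.
a2, b2 = solve_ls(np.column_stack([y, ones]), x)
slope_2, intercept_2 = 1.0 / a2, -b2 / a2

# (5.1.12): 1 = a x + b y, rows (x_i, y_i); rewritten as y = (1 - a x) / b.
a3, b3 = solve_ls(np.column_stack([x, y]), ones)
slope_3, intercept_3 = -a3 / b3, 1.0 / b3

for name, s, c in [("(5.1.10)", slope_1, intercept_1),
                   ("(5.1.11)", slope_2, intercept_2),
                   ("(5.1.12)", slope_3, intercept_3)]:
    print(f"{name}: slope = {s:.4f}, intercept = {c:.4f}")
```

For positively correlated data the slope from (5.1.11), expressed in the (x, y) diagram, is never smaller than the slope from (5.1.10); the two coincide only when the points are perfectly aligned.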
Figure 5.2 Three different ways of adjusting a straight line to a set of observations. Adjustment in the y-direction (top), the x-direction (middle) and orthogonal adjustment (bottom).
It is left as an exercise to the reader to show that the mean point of a sample belongs to each least-square straight line. This fundamental property results in a disturbing feature of samples from loosely correlated variables: the mean point being constant, slopes and intercepts calculated from successive samples of the same population vary significantly, but remain strongly anticorrelated. Least-square straight lines seem to hinge around mean points. It may seem paradoxical, but the more loosely correlated two variables are, the more anticorrelated are the slopes and intercepts derived from least-square (regression) lines! This statistical artifact represents a major risk of misinterpretation. For a least-square plane, the linear relationship between three data vectors $x_m$, $y_m$ and $z_m$ can also be written in different ways, such as

$$z = ax + by + c$$ (5.1.13)

and

$$1 = ax + by + cz$$ (5.1.14)

Hirose and Kushiro (1993) have determined the composition of basaltic melts segregated from peridotite at 10-30 kbars. Some of the data are listed as molar fractions in Table 5.4. Making the rather crude assumption that the clinopyroxene solubility product $K_{cpx}$ can be formed as

$$K_{cpx} = [\mathrm{SiO_2}]^2[\mathrm{MgO}][\mathrm{CaO}]$$ (5.1.15)

where [brackets] refer to molar concentrations, determine the dependence of $\ln X_{cpx}$ on temperature T and pressure P for the runs saturated in clinopyroxene. The common thermodynamic expression for that dependence

$$\ln K_{cpx} = -\frac{\Delta H}{RT} + \frac{\Delta S}{R} - \frac{P\,\Delta V}{RT}$$

where ΔH, ΔS, and ΔV are the enthalpy, entropy, and volume of clinopyroxene solution, respectively (e.g., Denbigh, 1968), is assumed to hold. Check whether the liquids segregated from clinopyroxene-free peridotites are clinopyroxene-undersaturated. We can make this problem linear by selecting x = 1000/T, y = 1000P/T and z = $\ln K_{cpx}$ and write the relationship between thermodynamic quantities as

$$z = ax + by + c$$

We therefore solve in the least-square sense the matrix equation for the 16 clinopyroxene-present runs

$$\begin{pmatrix} 0.657 & 6.566 & 1 \\ 0.646 & 9.690 & 1 \\ \vdots & \vdots & \vdots \\ 0.556 & 16.685 & 1 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \end{pmatrix} = \begin{pmatrix} -6.149 \\ -6.365 \\ \vdots \\ -5.406 \end{pmatrix}$$
Table 5.4. Concentrations of selected elements (molar fractions) in melts from peridotite at different temperatures t (°C) and pressures P (kb) (Hirose and Kushiro, 1993). T is the temperature in K. *clinopyroxene-bearing residual assemblage. **spinel-clinopyroxene-bearing residual assemblage. Adjustment of $K_{cpx}$ given by equation (5.1.15) is made on clinopyroxene-bearing residual assemblages. $X_{cpx} < K_{cpx}$ in clinopyroxene-absent runs confirms that equation (5.1.15) predicts clinopyroxene solubility reasonably well.

Run #   P    t      SiO2    MgO     CaO     ln X_cpx   1000/T   1000P/T   ln K_cpx
1**     10   1250   0.466   0.106   0.093   -6.149     0.657    6.566     -6.179
4**     15   1275   0.450   0.111   0.077   -6.365     0.646    9.690     -6.347
7**     20   1350   0.432   0.148   0.090   -5.993     0.616    12.323    -6.026
14**    10   1250   0.466   0.109   0.086   -6.199     0.657    6.566     -6.179
15**    10   1300   0.460   0.137   0.111   -5.739     0.636    6.357     -5.691
18**    15   1300   0.457   0.112   0.075   -6.349     0.636    9.536     -6.100
2*      10   1300   0.456   0.142   0.111   -5.721     0.636    6.357     -5.691
8*      20   1375   0.435   0.164   0.098   -5.801     0.607    12.136    -5.795
10*     25   1425   0.429   0.182   0.100   -5.695     0.589    14.723    -5.733
19*     15   1350   0.448   0.178   0.120   -5.447     0.616    9.242     -5.629
21*     20   1375   0.430   0.188   0.108   -5.587     0.607    12.136    -5.795
22*     20   1425   0.444   0.213   0.108   -5.401     0.589    11.779    -5.354
23*     25   1425   0.438   0.182   0.100   -5.657     0.589    14.723    -5.733
24*     25   1450   0.438   0.214   0.106   -5.436     0.580    14.510    -5.517
25*     30   1500   0.411   0.224   0.103   -5.551     0.564    16.920    -5.465
26*     30   1525   0.424   0.241   0.103   -5.406     0.556    16.685    -5.262
3       10   1350   0.466   0.186   0.085   -5.681     0.616    6.161     -5.233
5       15   1350   0.443   0.153   0.105   -5.759     0.616    9.242     -5.629
6       15   1400   0.444   0.195   0.103   -5.534     0.598    8.966     -5.187
9       20   1425   0.429   0.200   0.100   -5.606     0.589    11.779    -5.354
11      25   1450   0.433   0.212   0.100   -5.532     0.580    14.510    -5.517
12      30   1475   0.420   0.195   0.097   -5.700     0.572    17.162    -5.675
13      30   1500   0.416   0.217   0.094   -5.641     0.564    16.920    -5.465
16      10   1350   0.463   0.182   0.109   -5.455     0.616    6.161     -5.233
17      10   1400   0.470   0.223   0.092   -5.396     0.598    5.977     -4.802
20      15   1400   0.453   0.213   0.104   -5.390     0.598    8.966     -5.187
and obtain a = -22.10, b ≈ -0.13 and c = 9.18. The estimates $\ln K_{cpx}$ (Table 5.4 and Figure 5.3) correlate well with the measured values. The values of $[\mathrm{SiO_2}]^2[\mathrm{MgO}][\mathrm{CaO}]$ for clinopyroxene-absent runs are on the left of the linear array in Figure 5.3, which correctly predicts clinopyroxene undersaturation.

5.1.3 Least-square polynomials

m pairs of experimental data $u_i$ and $y_i$ (i = 1, ..., m) are to be fitted with a polynomial of degree n - 1, such as

$$y = a_0 + a_1 u + a_2 u^2 + \cdots + a_{n-1} u^{n-1}$$ (5.1.16)
[Figure 5.3 image: axis ticks from -6.5 to -5.0; horizontal axis ln X_cpx; open circles mark cpx-absent runs.]
Figure 5.3 Correlation between the estimated $K_{cpx} = [\mathrm{SiO_2}]^2[\mathrm{MgO}][\mathrm{CaO}]$ at clinopyroxene saturation and the observed value in melts for cpx-present and cpx-absent runs in Hirose and Kushiro's (1993) experiments on peridotite melting. The straight line calculated from cpx-present runs represents the saturation line.
where the n coefficients $a_j$ are constants to be determined. In matrix form, this can be written

$$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix} = \begin{pmatrix} 1 & u_1 & \cdots & u_1^{n-1} \\ 1 & u_2 & \cdots & u_2^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & u_m & \cdots & u_m^{n-1} \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_{n-1} \end{pmatrix}$$

Let us lump together the m observations $y_i$ (i = 1, ..., m) into a vector y, the polynomial coefficients $a_j$ (j = 0, ..., n - 1) into a vector x of unknowns, and define the (j - 1)th power of the ith observable $(u_i)^{j-1}$ as the current term $a_{ij}$ of the matrix $A_{m \times n}$. We now apply the usual method. Polynomials of high degree tend to generate nearly singular matrices A, which result in excessive fluctuations.

In a basalt-rhyolite interdiffusion experiment (Alibert and Carron, 1980), potassium concentrations $C_K$ were measured in a basalt at a given arbitrary distance y in µm between rhyolitic and basaltic liquids experimentally heated for 5000 seconds (Table 5.5 and Figure 5.4). In order to determine the diffusion coefficients, a fit of the experimental points with a polynomial is requested. Use the reduced concentration $u_i$ (the fractional deviation of the concentration at $y_i$ from the concentrations in the original liquids) given by

$$u_i = \frac{2C_{K,i} - C_{K,1} - C_{K,m}}{C_{K,m} - C_{K,1}}$$

where $C_{K,1}$ and $C_{K,m}$ are the concentrations at the two end-points of the profile.
Table 5.5. Profile of concentrations $C_K$ (%) in the glass during a rhyolite-basalt diffusion experiment (Alibert and Carron, 1980). y (µm) is an arbitrary distance across the granite-basalt interface, u is a dimensionless concentration normalized to the concentrations at the profile end-points.

y        C_K     u          ŷ
0.0      5.00    -1.0000    4.6
10.0     4.99    -0.9940    5.8
26.3     4.79    -0.8657    24.8
34.8     4.37    -0.6090    36.9
42.5     3.78    -0.2388    40.6
50.8     3.26     0.0896    51.5
66.6     2.75     0.4090    67.7
83.1     2.39     0.6299    82.5
99.3     2.16     0.7761    97.1
109.3    2.01     0.8657    110.0
124.1    1.87     0.9552    127.5
134.0    1.82     0.9881    135.4
142.3    1.80     1.0000    138.5
From Table 5.5, we get

$$u = \frac{2C_K}{1.8 - 5.0} - \frac{1.8 + 5.0}{1.8 - 5.0} = -0.625\,C_K + 2.125$$

For n = 6, the matrix A, whose columns hold the powers $u^0, u^1, \ldots, u^5$, is calculated as

$$A = \begin{pmatrix}
1.000 & -1.000 & 1.000 & -1.000 & 1.000 & -1.000 \\
1.000 & -0.994 & 0.988 & -0.982 & 0.976 & -0.970 \\
1.000 & -0.866 & 0.749 & -0.649 & 0.562 & -0.486 \\
1.000 & -0.609 & 0.371 & -0.226 & 0.138 & -0.084 \\
1.000 & -0.239 & 0.057 & -0.014 & 0.003 & -0.001 \\
1.000 & 0.090 & 0.008 & 0.001 & 0.000 & 0.000 \\
1.000 & 0.409 & 0.167 & 0.068 & 0.028 & 0.011 \\
1.000 & 0.630 & 0.397 & 0.250 & 0.157 & 0.099 \\
1.000 & 0.776 & 0.602 & 0.467 & 0.363 & 0.282 \\
1.000 & 0.866 & 0.749 & 0.649 & 0.562 & 0.486 \\
1.000 & 0.955 & 0.912 & 0.872 & 0.832 & 0.795 \\
1.000 & 0.988 & 0.976 & 0.965 & 0.953 & 0.942 \\
1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000
\end{pmatrix}$$
[Figure 5.4 image: two panels, n = 6 (top) and n = 10 (bottom); horizontal axis: reduced concentration u, from -1 to 1.]
Figure 5.4 Least-square fit of the K concentration data in the Alibert and Carron (1980) experiment of diffusion at the basalt-rhyolite interface by a polynomial of degree (n - 1) with n = 6 (top) and n = 10 (bottom). When n increases from 6 to 10, the solution begins oscillating between the data.
and the 'observation' vector y as

$$y = [0.0,\ 10.0,\ 26.3,\ 34.8,\ 42.5,\ 50.8,\ 66.6,\ 83.1,\ 99.3,\ 109.3,\ 124.1,\ 134.0,\ 142.3]^T$$

(Actually, there is as much observation in the matrix A as there is in the vector y.) This gives the solution

$$\hat{x} = [47.72,\ 39.05,\ 33.65,\ -29.30,\ -9.802,\ 57.22]^T$$

or, equivalently

$$\hat{y} = 47.72 + 39.05u + 33.65u^2 - 29.30u^3 - 9.802u^4 + 57.22u^5$$
Table 5.6. Isotopic data on Heard Island volcanics (Barling and Goldstein, 1989).

Sample   206Pb/204Pb   87Sr/86Sr
65002    18.527        0.70478
65054    18.656        0.70480
65015    18.796        0.70482
69244    18.776        0.70488
H10      18.189        0.70523
65171    18.211        0.70534
65085    18.110        0.70547
65151    17.953        0.70600
69285    17.790        0.70792
Except for the end points, the fit is acceptable. Let well alone! Increasing the order of the fit does not improve the results and, at degree 9, the solution begins to oscillate wildly where it is not constrained by the measurements, i.e., between the data points (Figure 5.4).
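The polynomial fit of the diffusion profile can be sketched as follows, using the y and u columns of Table 5.5. This is an illustrative sketch only: `polyfit_ls` is a helper name introduced here, and the condition number of A is printed as one possible way of quantifying the 'nearly singular' behavior mentioned in the text. Because the tabulated u values are rounded, the coefficients obtained may differ slightly from those quoted above.

```python
import numpy as np

# Table 5.5: distance y (micrometres) and reduced concentration u.
y = np.array([0.0, 10.0, 26.3, 34.8, 42.5, 50.8, 66.6,
              83.1, 99.3, 109.3, 124.1, 134.0, 142.3])
u = np.array([-1.0000, -0.9940, -0.8657, -0.6090, -0.2388, 0.0896, 0.4090,
              0.6299, 0.7761, 0.8657, 0.9552, 0.9881, 1.0000])


def polyfit_ls(u, y, n):
    """Fit y = sum_j a_j u**j (j = 0..n-1); return coefficients, fit, cond(A)."""
    A = np.vander(u, n, increasing=True)            # columns u**0, u**1, ..., u**(n-1)
    coeffs = np.linalg.lstsq(A, y, rcond=None)[0]   # least-square coefficients
    return coeffs, A @ coeffs, np.linalg.cond(A)


coeffs6, fit6, cond6 = polyfit_ls(u, y, 6)
coeffs10, fit10, cond10 = polyfit_ls(u, y, 10)

print(np.round(coeffs6, 2))
print(np.round(fit6, 1))
print(f"cond(A): n = 6 -> {cond6:.3g}, n = 10 -> {cond10:.3g}")
print("residual sum of squares:",
      round(float(np.sum((y - fit6) ** 2)), 2),
      round(float(np.sum((y - fit10) ** 2)), 2))
```

Raising n always lowers the residual sum of squares at the data points, so a smaller misfit is not by itself evidence of a better curve; the behavior between the data points is what degrades.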
5.1.4 Least-square hyperbola

Given m pairs of experimental data $u_i$ and $v_i$ (i = 1, ..., m), and a general hyperbola equation

$$(u - u_\infty)(v - v_\infty) = c$$ (5.1.17)

where $u_\infty$ and $v_\infty$ are the positions of the asymptotes on the axes and c is a constant characteristic of the curvature, we rewrite this equation for each pair of measurements as

$$u_i v_i = c - u_\infty v_\infty + v_i u_\infty + u_i v_\infty$$ (5.1.18)

Then we lump together the m products $u_i v_i$ into a vector y, and $c - u_\infty v_\infty$, $u_\infty$, and $v_\infty$ into a 3-vector x of unknowns, and form the ith row of the $A_{m \times 3}$ matrix with 1, $v_i$ and $u_i$. Then apply the usual method.

Barling and Goldstein (1989) have measured Pb and Sr isotope compositions in recent lavas from Heard Island (Southern Indian Ocean). They obtained the data listed in Table 5.6. Find the parameters of a least-square mixing hyperbola fitting the observations. Examination of the data in a 87Sr/86Sr vs 206Pb/204Pb diagram (Figure 5.5) suggests that these samples form a hyperbolic array and therefore represent a suite of mixtures between two end-members. In order to fit a hyperbola to the data, let us build the
[Figure 5.5 image: 87Sr/86Sr (ticks from 0.704 to 0.709) against 206Pb/204Pb (ticks from 17.6 to 19.0).]
Figure 5.5 Least-square mixing hyperbola for the isotopic data on Heard Island of Barling and Goldstein (1989). Data from Table 5.6. The 87Sr/86Sr value of the MORB source (≈0.7025) lies below the horizontal asymptote. Asthenosphere and oceanic lithosphere are unlikely source components of Heard Island basalts.
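vectors x, y and the matrix A. The matrix equality Ax = y reads numerically

$$\begin{pmatrix}
1 & 0.70478 & 18.527 \\
1 & 0.70480 & 18.656 \\
1 & 0.70482 & 18.796 \\
1 & 0.70488 & 18.776 \\
1 & 0.70523 & 18.189 \\
1 & 0.70534 & 18.211 \\
1 & 0.70547 & 18.110 \\
1 & 0.70600 & 17.953 \\
1 & 0.70792 & 17.790
\end{pmatrix}
\begin{pmatrix} c - u_\infty v_\infty \\ u_\infty \\ v_\infty \end{pmatrix} =
\begin{pmatrix}
18.527 \times 0.70478 \\
18.656 \times 0.70480 \\
18.796 \times 0.70482 \\
18.776 \times 0.70488 \\
18.189 \times 0.70523 \\
18.211 \times 0.70534 \\
18.110 \times 0.70547 \\
17.953 \times 0.70600 \\
17.790 \times 0.70792
\end{pmatrix} =
\begin{pmatrix} 13.057 \\ 13.149 \\ 13.248 \\ 13.235 \\ 12.827 \\ 12.845 \\ 12.776 \\ 12.675 \\ 12.594 \end{pmatrix}$$

which gives the least-square solution

$$\hat{x} = \begin{pmatrix} -12.450 \\ 17.674 \\ 0.70442 \end{pmatrix}$$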
The mixing hyperbola has been drawn in Figure 5.5. The two asymptotes intersect
the axes at the values

206Pb/204Pb = 17.674
87Sr/86Sr = 0.70442

which, as discussed by Barling and Goldstein (1989), has the important corollary that a MORB-type mantle source (87Sr/86Sr < 0.703) is not involved in the genesis of the Heard Island lavas. Keeping the full precision we would find that the curvature factor is $4.2855 \times 10^{-4}$, so the mixing hyperbola has the equation

$$\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}} = 0.70442 + \frac{4.2855 \times 10^{-4}}{^{206}\mathrm{Pb}/^{204}\mathrm{Pb} - 17.674}$$

We should nevertheless be aware that this method does not guarantee that all points will fall onto the same branch of a hyperbola.
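The hyperbola fit of the Heard Island data can be sketched with the values of Table 5.6. In this illustrative sketch, `k` stands for the first unknown $c - u_\infty v_\infty$ and `sr_on_hyperbola` is a helper name introduced here. The final check addresses the caveat just mentioned, by testing whether all points lie on the same side of both asymptotes.

```python
import numpy as np

# Table 5.6: u = 206Pb/204Pb, v = 87Sr/86Sr (Barling and Goldstein, 1989).
u = np.array([18.527, 18.656, 18.796, 18.776, 18.189,
              18.211, 18.110, 17.953, 17.790])
v = np.array([0.70478, 0.70480, 0.70482, 0.70488, 0.70523,
              0.70534, 0.70547, 0.70600, 0.70792])

A = np.column_stack([np.ones_like(u), v, u])     # rows (1, v_i, u_i)
y = u * v                                        # products u_i v_i
x_hat = np.linalg.lstsq(A, y, rcond=None)[0]

k, u_inf, v_inf = x_hat          # k = c - u_inf * v_inf, equation (5.1.18)
c = k + u_inf * v_inf            # curvature constant of equation (5.1.17)


def sr_on_hyperbola(pb):
    """87Sr/86Sr predicted by the fitted hyperbola at a given 206Pb/204Pb."""
    return v_inf + c / (pb - u_inf)


same_branch = bool(np.all(u > u_inf) and np.all(v > v_inf))
print(f"u_inf = {u_inf:.3f}, v_inf = {v_inf:.5f}, c = {c:.4e}")
print("all points on one branch:", same_branch)
print(np.round(sr_on_hyperbola(u), 5))
```

The columns of A are nearly collinear here, since 87Sr/86Sr varies only in the third decimal place, so the unknowns are better obtained from a routine such as `np.linalg.lstsq` than by explicitly inverting $A^TA$.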
5.1.5 The periodogram

Periodic or nearly periodic variations of geochemical parameters with the time of deposition for sediments are quite commonly observed. Several characteristic frequencies of these variations can be related to the Milankovitch orbital frequencies, which makes the analysis of time series in sedimentary sections an attractive tool of paleoclimatology (Berger, 1988). δ18O and δ13C data in sedimentary carbonates sampled by drill cores offer the best-documented example of geochemical time series: unfortunately, the measurement cannot be triggered arbitrarily by the analyst as it can in many fields of geophysics. It is rather dictated by the existence of a support (a rock of appropriate composition) that carries the signal over some periods of time: the analyst has only loose control over where in a stratigraphic sequence the measurement can be made. Therefore, the wealth of methods dedicated to Fourier analysis usually fails for that sort of problem because measurements are most commonly made at times that are not equally spaced. In fact, the interpolation of the measurements is almost certain to give spurious results. Given a geochemical variable y, m measurements at times $t_1, t_2, \ldots, t_m$ produce the unevenly spaced time series $y_1, y_2, \ldots, y_m$, which we lump together as the vector y. In order to find possible periodicities, Lomb (1976) suggests fitting the data by a sine wave using a least-square criterion. For any arbitrary frequency f, the fitting function is written

$$y = a \sin 2\pi f(t - \tau) + b \cos 2\pi f(t - \tau)$$ (5.1.19)

where a and b must be determined by least-squares and τ is a time-shift variable that gives the solution some convenient properties. The fit can be represented by a series of equations linear in a and b, such that

$$y_i = a \sin 2\pi f(t_i - \tau) + b \cos 2\pi f(t_i - \tau)$$ (5.1.20)
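or in a matrix form

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix} = \begin{bmatrix} \sin 2\pi f(t_1-\tau) & \cos 2\pi f(t_1-\tau) \\ \sin 2\pi f(t_2-\tau) & \cos 2\pi f(t_2-\tau) \\ \vdots & \vdots \\ \sin 2\pi f(t_m-\tau) & \cos 2\pi f(t_m-\tau) \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix}$$

In short, the last equation takes the form $y = Ax$, where $A$ is an $m \times 2$ matrix and $x$ is the column vector made of $a$ and $b$. The minimum variance estimator $\hat{x}$ of $x$ reads

$$\hat{x} = (A^TA)^{-1}A^Ty$$

The $2 \times 2$ matrix $A^TA$ is given by

$$A^TA = \begin{bmatrix} \sum_{i=1}^{m} \sin^2 2\pi f(t_i-\tau) & \sum_{i=1}^{m} \sin 2\pi f(t_i-\tau)\cos 2\pi f(t_i-\tau) \\ \sum_{i=1}^{m} \sin 2\pi f(t_i-\tau)\cos 2\pi f(t_i-\tau) & \sum_{i=1}^{m} \cos^2 2\pi f(t_i-\tau) \end{bmatrix}$$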
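The solution becomes particularly simple and time-invariant if $\tau$ is chosen in such a way that the cross-terms vanish. Using basic trigonometric identities, we cancel the off-diagonal term

$$\sum_{i=1}^{m} \sin 2\pi f(t_i-\tau)\cos 2\pi f(t_i-\tau) = \frac{1}{2}\sum_{i=1}^{m} \sin 4\pi f(t_i-\tau) = \frac{1}{2}\cos 4\pi f\tau \sum_{i=1}^{m} \sin 4\pi f t_i - \frac{1}{2}\sin 4\pi f\tau \sum_{i=1}^{m} \cos 4\pi f t_i = 0 \tag{5.1.21}$$

hence

$$\cos 4\pi f\tau \sum_{i=1}^{m} \sin 4\pi f t_i = \sin 4\pi f\tau \sum_{i=1}^{m} \cos 4\pi f t_i$$

and

$$\tan 4\pi f\tau = \frac{\sum_{i=1}^{m} \sin 4\pi f t_i}{\sum_{i=1}^{m} \cos 4\pi f t_i} \tag{5.1.22}$$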
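Since the vector $A^Ty$ can be written

$$A^Ty = \begin{bmatrix} \sum_{i=1}^{m} y_i \sin 2\pi f(t_i-\tau) \\ \sum_{i=1}^{m} y_i \cos 2\pi f(t_i-\tau) \end{bmatrix}$$

the solution for $a$ and $b$ is given by

$$a = \frac{\sum_{i=1}^{m} y_i \sin 2\pi f(t_i-\tau)}{\sum_{i=1}^{m} \sin^2 2\pi f(t_i-\tau)} \tag{5.1.23}$$

and

$$b = \frac{\sum_{i=1}^{m} y_i \cos 2\pi f(t_i-\tau)}{\sum_{i=1}^{m} \cos^2 2\pi f(t_i-\tau)} \tag{5.1.24}$$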
The 'reduction in the sum of squares' is a concept that may a priori look surprising (Lomb, 1976; Scargle, 1982). Nevertheless, its use is supported by the convergence between the reduction in the sum of squares and the familiar power spectrum in Fourier analysis when the data become equally spaced. It is simply the difference $\Delta S(f)$ in the sum of squares before the fit and after the fit for one particular frequency

$$\Delta S(f) = \sum_{i=1}^{m} y_i^2 - \sum_{i=1}^{m} (y_i - \hat{y}_i)^2 \tag{5.1.25}$$

or, in a vector form,

$$\Delta S(f) = y^Ty - (y-\hat{y})^T(y-\hat{y}) = y^Ty - y^T(I-P)^T(I-P)y$$

where the usual symbol $P$ is used for the projector $A(A^TA)^{-1}A^T$. Since projectors are symmetric and idempotent, the reduction $\Delta S(f)$ becomes

$$\Delta S(f) = y^Ty - y^T(I-P)y = y^TPy$$

This is written in full,

$$\Delta S(f) = y^TA(A^TA)^{-1}A^Ty = (A^Ty)^T(A^TA)^{-1}A^Ty = y^TA\hat{x}$$

which reduces to

$$\Delta S(f) = \frac{\left[\sum_{i=1}^{m} y_i \sin 2\pi f(t_i-\tau)\right]^2}{\sum_{i=1}^{m} \sin^2 2\pi f(t_i-\tau)} + \frac{\left[\sum_{i=1}^{m} y_i \cos 2\pi f(t_i-\tau)\right]^2}{\sum_{i=1}^{m} \cos^2 2\pi f(t_i-\tau)} \tag{5.1.26}$$
Whenever the frequency $f$ becomes close to a strong periodic component of the measured signal, the terms in brackets tend to add up and the periodogram shows a power peak around $f$. Between these peaks, the terms are not correlated, their sign and amplitude tend to be random, and the sum will be small. Still with reference to Fourier analysis, it is common practice to plot the power $P(f)$ as $\Delta S(f)/2$.
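The recipe of equations (5.1.22) to (5.1.26) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `lomb_power` and the synthetic test series are hypothetical, and `atan2` is used in place of the plain arctangent, which also cancels the cross-term because the two choices of $4\pi f\tau$ differ by a multiple of $\pi$.

```python
import math

def lomb_power(times, values, freq):
    """Return P(f) = dS(f)/2 for one trial frequency (equations 5.1.22-5.1.26)."""
    omega = 2.0 * math.pi * freq
    # Time shift tau that cancels the off-diagonal term of A^T A (equation 5.1.22).
    sum_sin = sum(math.sin(2.0 * omega * t) for t in times)
    sum_cos = sum(math.cos(2.0 * omega * t) for t in times)
    tau = math.atan2(sum_sin, sum_cos) / (2.0 * omega)
    sines = [math.sin(omega * (t - tau)) for t in times]
    cosines = [math.cos(omega * (t - tau)) for t in times]
    # Elements of A^T y and the diagonal of A^T A.
    y_sin = sum(y * s for y, s in zip(values, sines))
    y_cos = sum(y * c for y, c in zip(values, cosines))
    sin_sq = sum(s * s for s in sines)
    cos_sq = sum(c * c for c in cosines)
    return 0.5 * (y_sin ** 2 / sin_sq + y_cos ** 2 / cos_sq)

# Hypothetical unevenly spaced series: a pure sine wave of frequency 3.
times = [0.00, 0.07, 0.11, 0.23, 0.31, 0.38, 0.52, 0.60, 0.74, 0.79, 0.93, 1.00]
signal = [math.sin(2.0 * math.pi * 3.0 * t + 0.4) for t in times]
spectrum = {f: lomb_power(times, signal, f) for f in (1.0, 3.0, 5.0)}
```

For a noise-free sine wave the fit at the true frequency is exact, so the reduction in the sum of squares equals the full sum of squares and the power at that frequency is the largest of the three.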
Table 5.7. $\delta^{18}$O values (‰) in pelagic foraminifera from hole 704 of the Ocean Drilling Program in the South Atlantic (Hodell and Cieselski, 1991). mbsf: meters below seafloor.

mbsf    d18O    mbsf    d18O    mbsf    d18O    mbsf    d18O
49.05   3.52    54.75   2.37    60.81   2.22    68.01   3.74
49.26   3.59    55.30   2.94    61.00   2.50    68.31   3.54
49.90   3.62    55.71   2.97    61.30   2.72    68.60   3.44
50.05   3.37    55.90   2.87    61.61   2.76    69.01   2.84
50.25   3.43    56.31   3.19    61.90   3.10    69.21   2.25
50.55   3.40    56.50   3.26    62.24   3.38    69.51   2.71
50.76   3.51    56.80   3.28    62.31   3.24    69.54   2.85
51.11   3.29    57.11   3.17    62.50   3.24    69.81   3.43
51.40   2.68    57.41   2.21    62.71   3.22    70.10   3.57
51.75   2.22    57.74   2.53    63.11   3.13    70.51   3.29
52.05   3.17    57.81   2.36    63.74   3.17    70.71   3.17
52.26   3.38    58.00   2.28    64.50   2.13    71.01   3.31
52.61   3.07    58.30   2.84    65.01   1.57    71.31   3.05
52.86   2.50    58.61   2.82    65.31   1.92    71.60   2.87
52.90   2.79    58.90   3.15    65.60   2.78    71.91   3.24
53.25   2.63    59.31   3.16    66.00   3.25    72.01   3.01
53.55   2.49    59.50   3.03    66.21   3.42    72.21   2.97
53.76   2.39    59.80   2.94    66.81   2.88    72.51   2.72
54.02   2.53    60.11   2.70    67.10   2.24    72.81   2.91
54.11   2.41    60.40   2.42    67.51   2.11
54.40   2.55    60.74   2.44    67.68   2.08
Neogene calcareous sediments were recovered from hole 704 of the Ocean Drilling Program in the South Atlantic. The $\delta^{18}$O of pelagic foraminifera has been analyzed at 82 different depths $z$, reported in meters below seafloor (mbsf), by Hodell and Cieselski (1991) (Table 5.7 and Figure 5.6). For simplicity, we will assume that depth below the sea bottom varies linearly with time. The first step in the calculation consists in removing the long-term drift over the whole period by fitting the data with a parabola and determining a periodogram out of the residuals from this fit. Depth has also been rescaled to a reduced depth $z_{\mathrm{red}}$ running from 0 to 1 in order to minimize round-off errors. Applying the method shown above, the best-fit parabola is obtained for

$$\delta^{18}\mathrm{O} = 3.2696 - 2.1282\,z_{\mathrm{red}} + 2.0372\,z_{\mathrm{red}}^2$$

This calculation gives the sequence of reduced $\delta^{18}$O, noted $\delta^{18}\mathrm{O}_{\mathrm{red}}$, i.e., the difference between observed and fitted values, which is plotted in Figure 5.7. Table 5.8 gives numerical results for a few data points and $f = 4$. From columns 3 and 4, we get

$$\sum_{i=1}^{82} \sin 4\pi(4 z_{\mathrm{red}}) = -1.8385, \qquad \sum_{i=1}^{82} \cos 4\pi(4 z_{\mathrm{red}}) = -2.7100$$
Figure 5.6 $\delta^{18}$O of Neogene pelagic foraminifera from the ODP hole 704 in the South Atlantic sampled at different depths $z$ reported in meters below seafloor (Hodell and Cieselski, 1991).

Figure 5.7 Same data as in Figure 5.6 but depth has been normalized and a parabolic trend has been removed from the data by least-square adjustment.
hence

$$\tau = \frac{1}{4\pi \times 4}\tan^{-1}\left(\frac{-1.8385}{-2.7100}\right) \approx 0.0119$$
Table 5.8. Reduced depth $z_{\mathrm{red}}$ and $\delta^{18}$O values after removal of a parabolic trend. Sample calculation for $f = 4$ at selected depths. $y_i = \delta^{18}\mathrm{O}_{\mathrm{red}}$, $\xi_1 = 4\pi f z_{\mathrm{red}}$, $\xi_2 = 2\pi f(z_{\mathrm{red}} - \tau)$.

z_red     y_i       sin xi_1   cos xi_1   sin xi_2   cos xi_2
0.0000    0.2504    0.0000     1.0000     -0.2937    0.9559
0.0088    0.3391    0.4298     0.9029     -0.0758    0.9971
0.0358    0.4240    0.9743     -0.2255    0.5655     0.8247
0.0421    0.1864    0.8553     -0.5182    0.6887     0.7250
...
0.9747    -0.1607   -0.9549    0.2969     -0.8032    0.5957
0.9874    -0.4343   -0.5929    0.8053     -0.5773    0.8166
1.0000    -0.2686   -0.0000    1.0000     -0.2937    0.9559
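Multiplying column 2 in Table 5.8 by columns 5 and 6 gives

$$\sum_{i=1}^{82} \delta^{18}\mathrm{O}_{\mathrm{red}} \sin 2\pi 4(z_{\mathrm{red}}-\tau) = 8.4603, \qquad \sum_{i=1}^{82} \delta^{18}\mathrm{O}_{\mathrm{red}} \cos 2\pi 4(z_{\mathrm{red}}-\tau) = 1.3883$$

while squaring columns 5 and 6 yields

$$\sum_{i=1}^{82} \sin^2 2\pi 4(z_{\mathrm{red}}-\tau) = 42.6374, \qquad \sum_{i=1}^{82} \cos^2 2\pi 4(z_{\mathrm{red}}-\tau) = 39.3626$$

Finally

$$P(4) = \frac{1}{2}\left(\frac{8.4603^2}{42.6374} + \frac{1.3883^2}{39.3626}\right) = 0.864$$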
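The arithmetic of this worked example is easy to re-run from the quoted sums. The short check below only uses numbers printed in the text; the variable names are illustrative, and the time shift is obtained from equation (5.1.22) with the principal value of the arctangent, which is an assumption about the branch used.

```python
import math

freq = 4.0

# Sums of sin(4*pi*f*z_red) and cos(4*pi*f*z_red) quoted in the text.
sum_sin, sum_cos = -1.8385, -2.7100
tau = math.atan(sum_sin / sum_cos) / (4.0 * math.pi * freq)

# Sums built from columns 2, 5 and 6 of Table 5.8.
y_sin, y_cos = 8.4603, 1.3883
sin_sq, cos_sq = 42.6374, 39.3626

# Equation (5.1.26) divided by two.
power = 0.5 * (y_sin ** 2 / sin_sq + y_cos ** 2 / cos_sq)

# sin(xi_2) at z_red = 0, to compare with the first row of Table 5.8.
sin_xi2_at_zero = math.sin(2.0 * math.pi * freq * (0.0 - tau))

print(round(tau, 4), round(power, 3), round(sin_xi2_at_zero, 4))
```

The two sums of squares add up to 82, the number of data points, which is a quick consistency check on the tabulated values.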
The calculation has been carried out for frequencies ranging from 1 to 20 with intervals of 0.2 and the results are shown in Figure 5.8. The periodogram shows a strong periodic component at $f \approx 3.5$ and its harmonics at $\approx 7$ and 10.5. The data are distributed over 23.76 meters. Given a sedimentation rate of 63 m Ma$^{-1}$ (Hodell and Cieselski, 1991), the section covers a time interval of about 380 000 years. The present stratigraphic section therefore contains evidence for a 380 000/3.5 $\approx 10^5$ year cyclic component, which is extremely common in the sedimentary record (e.g., Berger, 1988). Although a more detailed discussion does not pertain to this book, the peak width may be shown to be related to the analyzed core length.

5.1.6 Fitting global data with spherical harmonics
Given a finite number of measurements at a given latitude (90° $- \theta$) and longitude $\phi$ on the surface of the Earth, we look for a smooth function that could be fitted to the data and represent their variations to within any desired precision. Spherical harmonics are suitable because they make an orthogonal set of functions which can be expanded to an arbitrary degree (although limited by the number of data points): because of orthogonality, subsequent truncating of the solution to a lesser degree still gives a solution which satisfies the least-square criterion. Let us assume that $n$ measurements $y(\phi_i, \theta_i)$ ($i = 1, 2, \ldots, n$) of one geochemical parameter are to be fitted with spherical harmonics to the degree $l$, i.e., with a function $\hat{y}$ such as

(5.1.27)

Replacing $\phi$ and $\theta$ by the observed values, we get one equation such as equation (5.1.27) for each of the $n$ calculated $\hat{y}_i(\phi_i, \theta_i)$, which we collect as an $n$-vector $\hat{y}$. The $p = (l+1)^2$ unknown coefficients $\hat{a}_{lm}$ and $\hat{b}_{lm}$ are lumped together as a $p$-vector $x$ of unknowns. Two points make this problem a little difficult: indexing carefully the coefficients $\hat{a}_{lm}$ and $\hat{b}_{lm}$ in the vector $x$, and finding an efficient routine which enables $C_{lm}(\phi, \theta)$ and $S_{lm}(\phi, \theta)$ to be calculated. The first point is largely a matter of attention; the second can be solved by borrowing a routine from professional packages, e.g., the routine from Press et al. (1986) discussed in Section 2.6. The zero-order sine terms being zero, the individual least-square equation reads

Figure 5.8 Periodogram of the reduced data shown in Figure 5.7. The strong peak at ca. $f = 3.5$ and its harmonics at ca. 7.0 and 10.5 correspond to a ca. $10^5$ year periodic component.
Table 5.9. Average $^{206}$Pb/$^{204}$Pb ratios in basalts from some ocean islands (courtesy Vincent Salters).

Latitude      Longitude    206Pb/204Pb    Latitude      Longitude    206Pb/204Pb
(90° - θ_i)   φ_i                         (90° - θ_i)   φ_i
-28.69        65.46        18.879         20.36         -156.68      18.188
-7.95         -14.37       19.421         64.43         -19.73       18.453
-23.66        -149.37      20.533         -33.62        -78.83       19.121
-21.58        -155.61      20.001         -52.61        72.42        18.259
-19.58        -158.43      19.386         -45.22        -154.40      19.271
-21.51        -159.05      19.743         -46.92        37.75        18.562
38.50         -28.00       19.707         -9.29         -139.96      19.362
-67.08        -168.88      19.752         37.95         -61.88       20.155
-54.35        3.50         19.445         60.00         -166.00      18.588
1.32          7.52         20.020         -21.07        55.75        18.855
15.70         -24.12       19.254         -30.28        -35.28       17.619
6.93          158.32       18.462         -14.03        -171.36      18.914
-10.50        105.67       18.639         -26.42        -79.98       19.079
5.54          -87.08       19.234         16.87         -117.47      19.046
-12.04        43.74        19.615         -17.61        -148.91      19.128
-46.45        52.00        18.929         -15.97        -5.72        20.678
-26.47        -105.47      19.865         -20.50        -29.42       19.116
-3.83         -32.42       19.409         -37.10        -12.28       18.476
-0.52         -90.72       19.076         -20.07        -130.10      18.132
-40.33        -10.00       18.445         -28.66        2.49         17.914

Table 5.9, compiled by Vincent Salters from the Lamont-Doherty Geological Observatory, lists $^{206}$Pb/$^{204}$Pb data in basalts from different oceanic islands, the location of which is indicated on the map of Figure 5.9. Map the variations of this isotopic ratio with spherical harmonics to the degree 4.

The matrix $A$ is made of 40 rows and 25 columns, 15 columns for $C_{lm}(\phi_i, \theta_i)$ and 10 columns for $S_{lm}(\phi_i, \theta_i)$. $A$ is too large to be listed. The 15 coefficients $\hat{a}_{lm}$ and the 10 coefficients $\hat{b}_{lm}$ can be obtained through the standard procedure and are listed in Table 5.10. Contours of equal $^{206}$Pb/$^{204}$Pb can be drawn on a map by making a grid of longitudes and latitudes and inserting the calculated coefficients $\hat{a}_{lm}$ and $\hat{b}_{lm}$ in equation (5.1.27). The map of Figure 5.9, drawn as a Mercator projection, i.e., using the transformation

$$x = \text{longitude}, \qquad y = \ln \tan\left(\frac{\pi}{4} + \frac{\text{latitude}}{2}\right)$$

shows the characteristic problem of extrapolating functions. The spherical harmonics adjust more tightly in areas where data are abundant by letting themselves vary wildly where the data are missing, e.g., under the continents. The large bumps devoid of data (Australia, North America, Australia) should not be taken as features indicative of a geochemical trend.
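Two bookkeeping steps of this example lend themselves to a short sketch: listing which $(l, m)$ pairs make up the columns of $A$, and applying the Mercator transformation. The function names and the ordering of the columns are assumptions made for illustration (the text does not prescribe an ordering), and angles are converted from degrees to radians before the transformation.

```python
import math

def harmonic_columns(max_degree):
    """List the (l, m) index pairs of the cosine and sine terms up to max_degree."""
    # Cosine terms exist for every order m = 0..l.
    cosine = [(l, m) for l in range(max_degree + 1) for m in range(l + 1)]
    # Sine terms of order zero vanish, so m starts at 1.
    sine = [(l, m) for l in range(1, max_degree + 1) for m in range(1, l + 1)]
    return cosine, sine

def mercator(longitude_deg, latitude_deg):
    """Mercator map coordinates (x, y), with angles given in degrees."""
    x = math.radians(longitude_deg)
    y = math.log(math.tan(math.pi / 4.0 + math.radians(latitude_deg) / 2.0))
    return x, y

cosine_terms, sine_terms = harmonic_columns(4)
print(len(cosine_terms), len(sine_terms), len(cosine_terms) + len(sine_terms))
print(mercator(-156.68, 20.36))
```

For degree 4 the two lists contain 15 and 10 entries, which accounts for the 25 columns of the matrix. Position $k$ in the concatenated list can then serve as the column index of the corresponding coefficient in the vector of unknowns.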
Figure 5.9 Least-square fit of the $^{206}$Pb/$^{204}$Pb ratios listed in Table 5.9 on basalts from different oceanic islands with spherical harmonics to the degree 4. The results are reported as lines of constant values. Results in continental areas are not shown.
Table 5.10. Spherical harmonic expansion of the data listed in Table 5.9.

Coefficients a_lm
l     m = 0      1         2         3         4
0     71.75      0         0         0         0
1     7.206      -2.854    0         0         0
2     -0.434     4.316     -8.127    0         0
3     -6.791     0.810     -7.381    1.228     0
4     0.1378     -0.899    -1.859    -2.712    4.190

Coefficients b_lm
l     m = 0      1         2         3         4
0     0          0         0         0         0
1     0          -7.86     0         0         0
2     0          -13.2     3.665     0         0
3     0          1.38      -6.310    5.751     0
4     0          5.82      -0.274    0.335     -0.878
5.2 Non-linear least-squares

When the function to be fitted to data does not depend linearly on the parameters, iterative methods must be used. A slightly modified version of the Newton-Raphson method (Chapter 3) will be used (Hamilton, 1964). Let $x$ be the vector of the $n$ unknowns $x_j$ and $\bar{y} = f(x)$ the $m$-vector of 'observable' functions $\bar{y}_i = f_i(x)$. The analytical form of the functions $f_i(x)$ may be the same or not. Let the vector $y$ represent the $m$ observations $y_i$ of these functions. A vector $x$ is sought which minimizes the scalar $c^2$ such that

$$c^2 = \sum_{i=1}^{m} (\bar{y}_i - y_i)^2 \tag{5.2.1}$$

Since we are dealing with a finite sample, we define, as for the linear case, the least-square estimators $\hat{x}$ and $\hat{y}$ of $x$ and $\bar{y}$, respectively, as the vectors such as

$$\hat{y} = f(\hat{x}) \tag{5.2.2}$$

and satisfying

$$E(\hat{y}) = \bar{y} \quad \text{and} \quad E(\hat{x}) = x \tag{5.2.3}$$

which minimize the scalar $c^2$. Given an initial guess $x^0$ of $x$, we expand $f_i(x)$ in a Taylor series about $x^0$, truncated after the first-order terms,

$$f_i(x) \approx f_i(x^0) + \sum_{j=1}^{n} \left(\frac{\partial f_i}{\partial x_j}\right)_{x^0} (x_j - x_j^0) \tag{5.2.4}$$
Table 5.11. Ni concentrations (ng kg$^{-1}$) and depth (m) in the water column from the eastern Pacific (Bruland, 1980).

z       50    100   250   500   750   1050   1200   1500   1800   2000   2500   3000
C_Ni    220   265   360   425   525   575    580    625    630    640    630    640
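We then minimize

$$c^2 = \sum_{i=1}^{m} \left[ y_i - f_i(x^0) - \sum_{j=1}^{n} \left(\frac{\partial f_i}{\partial x_j}\right)_{x^0} (x_j - x_j^0) \right]^2 \tag{5.2.5}$$

relative to $x_1, x_2, \ldots, x_n$. This problem is equivalent to solving the linear least-square matrix equation

$$\Delta y = A\,\Delta x \tag{5.2.6}$$

where the current elements of the vectors $\Delta x$ and $\Delta y$ and of matrix $A_{m \times n}$ are

$$\Delta x_j = x_j - x_j^0 \tag{5.2.7}$$

$$\Delta y_i = y_i - f_i(x^0) \tag{5.2.8}$$

$$a_{ij} = \left(\frac{\partial f_i}{\partial x_j}\right)_{x^0} \tag{5.2.9}$$

From the initial guess $x^0$, we calculate the $m$ values of $f_i(x^0)$ and their derivatives relative to each $x_j$. Solving the least-square system, we get an improved estimate of $x$, which we use as the initial value for the next iteration, until the values cease to change significantly. Indicating the $k$th estimate by the superscript $(k)$, we can write

$$y - f(x^{(k)}) = A^{(k)} \left(x^{(k+1)} - x^{(k)}\right) \tag{5.2.10}$$

where $x^{(k)}$ and $A^{(k)}$ are the $k$th estimate of the vector $x$ and matrix $A$, respectively. The least-square iteration scheme is therefore

$$x^{(k+1)} = x^{(k)} + \left(A^{(k)T} A^{(k)}\right)^{-1} A^{(k)T} \left[y - f(x^{(k)})\right] \tag{5.2.11}$$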
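One step of this iteration scheme can be sketched as follows. The helper names (`solve`, `gauss_newton_step`) and the toy straight-line model are hypothetical and chosen only to keep the example checkable by hand; the normal equations are solved by plain Gaussian elimination, which is one of several possible choices rather than the method of the text.

```python
def solve(matrix, rhs):
    """Solve a small square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def gauss_newton_step(model, jacobian, x, observations):
    """Return the next estimate x + (A^T A)^-1 A^T [y - f(x)]."""
    a = jacobian(x)                                   # m rows of n partial derivatives
    dy = [y - f for y, f in zip(observations, model(x))]
    n = len(x)
    ata = [[sum(row[j] * row[k] for row in a) for k in range(n)] for j in range(n)]
    aty = [sum(row[j] * d for row, d in zip(a, dy)) for j in range(n)]
    dx = solve(ata, aty)
    return [xj + dj for xj, dj in zip(x, dx)]

# Toy check with a model that is linear in its parameters: y = x0 + x1 * t.
t_values = [0.0, 1.0, 2.0, 3.0]
observed = [2.0, 5.0, 8.0, 11.0]
line_model = lambda x: [x[0] + x[1] * t for t in t_values]
line_jacobian = lambda x: [[1.0, t] for t in t_values]
estimate = gauss_newton_step(line_model, line_jacobian, [0.0, 0.0], observed)
print(estimate)
```

For a model that is linear in its parameters a single step lands on the ordinary least-square solution whatever the starting point; for a genuinely non-linear model the step is simply repeated until the estimate stops changing.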
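Bruland (1980) has measured Ni concentrations (ng kg$^{-1}$) as a function of depth ($z$ in meters) in the water column from the eastern Pacific (Table 5.11). Find the best set of coefficients for the advection-diffusion model of Craig (1969) to fit these data.

In Section 8.8, we find that the advection-diffusion model of Craig (1969) amounts to a sum of exponentials. The data listed in Table 5.11 are to be fitted by

$$C_{\mathrm{Ni}}(z) = \exp\left(-\frac{z}{2l_m}\right)\left[\alpha \exp\left(\frac{\varepsilon z}{2l_m}\right) + \beta \exp\left(-\frac{\varepsilon z}{2l_m}\right)\right]$$

for $\alpha$, $\beta$, $\varepsilon$, and the mixing length $l_m = 600$ m (Craig, 1969). We define

$$y_i = C_{\mathrm{Ni}}(z_i)\exp\left(\frac{z_i}{2l_m}\right), \qquad f_i = \alpha \exp\left(\frac{\varepsilon z_i}{2l_m}\right) + \beta \exp\left(-\frac{\varepsilon z_i}{2l_m}\right)$$

and the vector $x = (\alpha, \beta, \varepsilon)^T$. Then we compute

$$a_{i1} = \frac{\partial f_i}{\partial \alpha} = \exp\left(\frac{\varepsilon z_i}{2l_m}\right)$$

$$a_{i2} = \frac{\partial f_i}{\partial \beta} = \exp\left(-\frac{\varepsilon z_i}{2l_m}\right)$$

$$a_{i3} = \frac{\partial f_i}{\partial \varepsilon} = \frac{z_i}{2l_m}\left[\alpha \exp\left(\frac{\varepsilon z_i}{2l_m}\right) - \beta \exp\left(-\frac{\varepsilon z_i}{2l_m}\right)\right]$$

Table 5.12. The first step of non-linear least-square refinement of the parameters $\alpha$, $\beta$, and $\varepsilon$ for selected depths in the water column. The last three columns represent the elements of the matrix $A$.

z_i     y_i        f_i        1000 a_i1   1000 a_i2   a_i3
50      229.36     -1.67      1042.55     959.19      -1.67
100     288.03     -3.34      1086.90     920.04      -3.34
...
2500    5059.65    -158.13    8031.19     124.51      -339.82
3000    7796.80    -242.01    12182.5     82.08       -613.23

First-iteration estimates: $\alpha = 668.93$, $\beta = -484.93$, $\varepsilon = 1.54$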
As the initial guess, we choose $\alpha^{(0)} = -20$, $\beta^{(0)} = +20$, $\varepsilon^{(0)} = 1$. Table 5.12 shows some results for the first iteration. For the second iteration, we take the values of $\alpha^{(1)}$, $\beta^{(1)}$ and $\varepsilon^{(1)}$ as the new starting point. The sequence of values taken by $\hat{x}$ is tabulated in Table 5.13. Convergence is achieved in 6-7 iterations. That $\varepsilon$ converges towards a value close to 1, i.e., the initial value, is entirely fortuitous. This value is an important indication that scavenging is not very efficient for Ni. The quality of the fit may be seen in Figure 5.10.

Table 5.13. Iterative refinement of the parameters $\alpha^{(k)}$, $\beta^{(k)}$ and $\varepsilon^{(k)}$.

Iteration k   alpha(k)   beta(k)    epsilon(k)
0             -20        20         1
1             668.93     -484.36    1.5445
2             455.05     -205.11    1.3697
3             537.75     -326.27    1.1193
4             643.61     -456.48    0.9943
5             668.54     -484.41    0.9833
6             668.79     -484.69    0.9834
7             668.79     -484.69    0.9834

Figure 5.10 Adjustment of Ni concentrations (ng kg$^{-1}$) in the water column (data from Bruland, 1980) with the advection-diffusion model of Craig (1969).

Let us calculate parameters for a non-linear fit used in Chapter 8 as an example of the Matano interface technique. In order to determine the diffusion coefficient of Ce in apatite, Iqdari and Velde (unpub. data, 1992) kept natural apatite in CeCl$_2$ at different temperatures for variable durations. In one run, the sample was kept for 15 days at 1100°C and Ce concentrations $C_{\mathrm{Ce}}$ were measured from the surface inwards along the c-axis, with $X$ being the distance to the surface. Table 5.14 and Figure 5.11 give the analytical results. It is found that, for the sake of Matano integration, these results could be fitted with a six-parameter equation

$$X = \frac{x_2}{C_{\mathrm{Ce}} - x_1} + \frac{x_4}{C_{\mathrm{Ce}} - x_3} + x_5 + x_6 C_{\mathrm{Ce}}$$
Table 5.14. Ce concentrations (%) in apatite immersed in CeCl$_2$ for 15 days at 1100°C as a function of the distance $X$ (μm) to the interface (Iqdari and Velde, unpub. data, 1992).

X     C_Ce     X     C_Ce     X     C_Ce
0     1.864    25    0.25     50    0.043
5     1.832    30    0.152    55    0.012
10    1.638    35    0.085    60    0.013
15    1.227    40    0.078    65    0.012
20    0.6      45    0.046    70    0.013

Figure 5.11 Iqdari and Velde's (unpub. data) Ce diffusion experiment on apatite. Adjustment of the distance to the mineral surface as a function of Ce concentrations (%).
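Find the vector $x = (x_1, x_2, \ldots, x_6)^T$ of unknown parameters using $[0, 2, 2, 2, 20, -20]^T$ as the initial estimate.

In this case, the vector $y$ is built out of the $X$ values (0, 5, ..., 70), while each equation differs from the others by the value $C_{\mathrm{Ce}}$ of the concentration. The $i$th row of the $15 \times 6$ matrix $A$ of partial derivatives is

$$\left[\frac{\partial X}{\partial x_1}, \frac{\partial X}{\partial x_2}, \frac{\partial X}{\partial x_3}, \frac{\partial X}{\partial x_4}, \frac{\partial X}{\partial x_5}, \frac{\partial X}{\partial x_6}\right] = \left[\frac{x_2}{(C_{\mathrm{Ce}} - x_1)^2}, \frac{1}{C_{\mathrm{Ce}} - x_1}, \frac{x_4}{(C_{\mathrm{Ce}} - x_3)^2}, \frac{1}{C_{\mathrm{Ce}} - x_3}, 1, C_{\mathrm{Ce}}\right]$$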
Table 5.15. Refinement of the fit parameters used to parameterize the Ce diffusion profile of Figure 5.11 up to iteration 12.

k     x1         x2       x3       x4         x5       x6
0     0          2        2        2          20       -20
1     -0.0061    1.499    1.894    -0.2042    21.00    -6.163
2     -0.0190    1.840    1.870    0.5083     19.78    -4.973
3     -0.0353    2.347    1.873    0.2791     18.21    -4.500
...
11    -0.0500    2.878    1.932    0.8753     16.98    -2.872
12    -0.0500    2.878    1.932    0.8792     16.98    -2.864
Inserting the initial guess into these expressions gives the first estimate of $A^0$ as

$$A^0 = \begin{bmatrix} 0.6 & 0.537 & 108.132 & -7.3529 & 1 & 1.8640 \\ 0.6 & 0.546 & 70.862 & -5.9524 & 1 & 1.8320 \\ 0.7 & 0.611 & 15.262 & -2.7624 & 1 & 1.6380 \\ 1.3 & 0.815 & 3.347 & -1.2937 & 1 & 1.2270 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 13888.9 & 83.333 & 0.506 & -0.5030 & 1 & 0.0120 \\ 11834.3 & 76.923 & 0.507 & -0.5033 & 1 & 0.0130 \end{bmatrix}$$

and $f(\hat{x}^0)$ as

$$f(\hat{x}^0) = [-30.91, -27.45, -17.06, -5.50, \ldots, 185.42, 172.58]^T$$
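The first row of $A^0$ and the first and last-but-one entries of $f(\hat{x}^0)$ can be checked with a few lines of Python. The six-parameter form used below, $X = x_2/(C - x_1) + x_4/(C - x_3) + x_5 + x_6 C$, is inferred from the tabulated derivatives and from the final fitted equation and should be read as an assumption; the function names are illustrative.

```python
def ce_model(c, x):
    """Distance X predicted for a Ce concentration c (assumed six-parameter form)."""
    x1, x2, x3, x4, x5, x6 = x
    return x2 / (c - x1) + x4 / (c - x3) + x5 + x6 * c

def ce_jacobian_row(c, x):
    """Partial derivatives of X with respect to x1..x6 at concentration c."""
    x1, x2, x3, x4, x5, x6 = x
    return [
        x2 / (c - x1) ** 2,   # dX/dx1
        1.0 / (c - x1),       # dX/dx2
        x4 / (c - x3) ** 2,   # dX/dx3
        1.0 / (c - x3),       # dX/dx4
        1.0,                  # dX/dx5
        c,                    # dX/dx6
    ]

initial_guess = [0.0, 2.0, 2.0, 2.0, 20.0, -20.0]
first_row = ce_jacobian_row(1.864, initial_guess)
f_first = ce_model(1.864, initial_guess)
f_late = ce_model(0.012, initial_guess)
print([round(v, 4) for v in first_row], round(f_first, 2), round(f_late, 2))
```

The very large entries in the first column for the low-concentration rows come from the $1/(C_{\mathrm{Ce}} - x_1)^2$ term with $x_1 = 0$, which is worth keeping in mind when judging the conditioning of the first iterations.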
The results of the first 12 iterations are listed in Table 5.15. Although total convergence is not achieved after 12 iterations, the fit provided by the parameters at this stage is quite acceptable (Figure 5.11) and the final form of this empirical least-square equation becomes

$$X = \frac{2.878}{C_{\mathrm{Ce}} + 0.05} + \frac{0.879}{C_{\mathrm{Ce}} - 1.93} + 17 - 2.86\,C_{\mathrm{Ce}}$$
5.3 Constrained least-squares

5.3.1 Linear constraints: the closure condition

Since many geochemical units are concentrations or fractions which sum up to unity, let us first demonstrate a useful statement. A vector is normalized when its components sum up to one. A normalized vector $y$ of $\Re^m$ satisfies the condition

$$J_m^T y = 1 \tag{5.3.1}$$

where $J_m$ is the $m$-vector $(1, 1, \ldots, 1)^T$. Given a matrix $A_{m \times n}$ with normalized column-vectors, i.e., satisfying

$$J_m^T A = J_n^T \tag{5.3.2}$$

then, for a normalized $y$, any vector $x_n$ such as

$$y = Ax \tag{5.3.3}$$

is also normalized. This can be shown by pre-multiplying the last equality by $J_m^T$

$$J_m^T y = J_m^T A x = J_n^T x = 1 \tag{5.3.4}$$
In most cases of interest, however, the system represented by equation (5.3.3) is overdetermined and we must enforce the closure condition with a different method. Let us return to a standard mass-balance least-square problem, such as, for instance, calculating the mineral abundances from the whole-rock and mineral chemical compositions. If $x_1, x_2, \ldots, x_n$ are the mineral fractions, which may be lumped together in a vector $x$, the closure condition

$$\sum_{j=1}^{n} x_j = 1$$

may be written

$$x^T J = J^T x = 1 \tag{5.3.5}$$

Arranging the whole-rock concentrations for each element $i$ ($i = 1, \ldots, m$) in a vector $y$, and putting the concentration of element $i$ in the phase $j$ at the $i$th row and $j$th column of the matrix $A_{m \times n}$ (mineral matrix), the usual overdetermined system is obtained

$$y = Ax \tag{5.3.6}$$

which has no exact solution. The constrained least-square problem is therefore to find estimates $\hat{x}$ and $\hat{y}$ for which three conditions hold: (i) estimates $\hat{x}$ and $\hat{y}$ fit the model equation (5.3.3); (ii) the distance $(y - A\hat{x})^T(y - A\hat{x})$ between $y$ and $\hat{y}$ is minimum; and (iii) the constraint equation (5.3.5) is obeyed. Let us define the Lagrange multiplier as $-2\lambda$ and form the function $c^2$ that we want to be minimum

$$c^2 = (y - A\hat{x})^T(y - A\hat{x}) - 2\lambda(\hat{x}^T J - 1) \tag{5.3.7}$$
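Since each term in $c^2$ is a scalar and therefore symmetric, equation (5.3.7) may be rewritten as

$$c^2 = y^T y - 2\hat{x}^T A^T y + \hat{x}^T A^T A \hat{x} - 2\lambda(\hat{x}^T J - 1)$$

Let us define the differential of the vector $\hat{x} = (\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_n)^T$ as the vector $d\hat{x} = (d\hat{x}_1, d\hat{x}_2, \ldots, d\hat{x}_n)^T$. Differentiating $c^2$ relative to $\hat{x}$ gives

$$dc^2 = -2\,d\hat{x}^T A^T y + d\hat{x}^T A^T A \hat{x} + \hat{x}^T A^T A\,d\hat{x} - 2\lambda\,d\hat{x}^T J = 0$$

The second and third scalar terms of $dc^2$ are the transpose of each other and are therefore equal, hence

$$dc^2 = -2\,d\hat{x}^T A^T y + 2\,d\hat{x}^T A^T A \hat{x} - 2\lambda\,d\hat{x}^T J$$

or

$$dc^2 = 2\,d\hat{x}^T (A^T A \hat{x} - A^T y - \lambda J) = 0 \tag{5.3.8}$$

Clearly, the only non-trivial solution to this equation is

$$A^T A \hat{x} = A^T y + \lambda J \tag{5.3.9}$$

and therefore

$$\hat{x} = (A^TA)^{-1}A^T y + \lambda (A^TA)^{-1} J = x_0 + \lambda (A^TA)^{-1} J \tag{5.3.10}$$

where $x_0$ is the unconstrained solution $(A^TA)^{-1}A^Ty$. It is now easy to calculate $\lambda$ in such a way that the constraint is obeyed. Pre-multiplying by $J^T$, we get

$$J^T \hat{x} = J^T (A^TA)^{-1} A^T y + \lambda J^T (A^TA)^{-1} J = 1$$

or, since each term in the equation is a scalar

$$\lambda = \frac{1 - J^T x_0}{J^T (A^TA)^{-1} J} \tag{5.3.11}$$

Note that the denominator is simply the sum of all the terms in the matrix $(A^TA)^{-1}$. Inserting this expression of $\lambda$ into the solution for $\hat{x}$, we get

$$\hat{x} = x_0 + \frac{1 - J^T x_0}{J^T (A^TA)^{-1} J}\,(A^TA)^{-1} J \tag{5.3.12}$$

a rather cumbersome expression which nevertheless may turn out to be useful to compute the error matrix on the solution using methods described later in this chapter.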
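The constrained solution can be sketched in code as the unconstrained solution plus a correction along $(A^TA)^{-1}J$. The sketch below is a minimal pure-Python illustration: the helper names are hypothetical, the normal equations are solved by Gaussian elimination as a convenience, and the input numbers are the komatiite compositions of Table 5.16 divided by 100.

```python
def solve(matrix, rhs):
    """Solve a small square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def closure_fit(a, y):
    """Return (x0, lam, x): unconstrained fit, Lagrange multiplier, closed fit."""
    n = len(a[0])
    ata = [[sum(row[j] * row[k] for row in a) for k in range(n)] for j in range(n)]
    aty = [sum(row[j] * yi for row, yi in zip(a, y)) for j in range(n)]
    x0 = solve(ata, aty)                 # (A^T A)^-1 A^T y
    g = solve(ata, [1.0] * n)            # (A^T A)^-1 J
    lam = (1.0 - sum(x0)) / sum(g)       # equation (5.3.11)
    x = [xj + lam * gj for xj, gj in zip(x0, g)]
    return x0, lam, x

# Rows: SiO2, TiO2, Al2O3, FeO, MgO, CaO, Na2O; columns: ol, cpx, ga (Table 5.16 / 100).
minerals = [
    [0.419, 0.546, 0.415],
    [0.0007, 0.0013, 0.0011],
    [0.0, 0.019, 0.18],
    [0.0777, 0.0222, 0.0704],
    [0.485, 0.158, 0.181],
    [0.0006, 0.206, 0.067],
    [0.0, 0.0144, 0.0],
]
liquid = [0.420, 0.003, 0.08, 0.065, 0.22, 0.08, 0.005]
x0, lam, x_closed = closure_fit(minerals, liquid)
print(x0, lam, x_closed, sum(x_closed))
```

Whatever the data, the closed solution sums to one by construction, which is the only property this sketch is meant to demonstrate.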
Table 5.16. Major-element composition (weight percent) of a komatiitic liquid and normative minerals.

         liq     ol      cpx     ga
SiO2     42.0    41.9    54.6    41.5
TiO2     0.3     0.07    0.13    0.11
Al2O3    8       0       1.9     18
FeO      6.5     7.77    2.22    7.04
MgO      22      48.5    15.8    18.1
CaO      8       0.06    20.6    6.7
Na2O     0.5     0       1.44    0

Let us recast the major-element composition of a komatiitic liquid (liq) into virtual minerals olivine (ol), clinopyroxene (cpx), and garnet (ga) whose compositions are listed in Table 5.16.

In the usual notation, the liquid column is the vector $y_7$ while the last three columns form the matrix $A_{7 \times 3}$. After the figures are divided by 100, the constrained solution is built using the following intermediate steps. First, we compute in the usual way the unconstrained solution $x_0$ as

$$x_0 = \begin{bmatrix} 0.210 \\ 0.267 \\ 0.442 \end{bmatrix}$$

This solution happens not to be normalized since

$$J^T x_0 = 0.919$$

We therefore calculate

$$(A^TA)^{-1} J = \begin{bmatrix} -0.51 \\ -1.67 \\ 6.46 \end{bmatrix}, \quad \text{and} \quad J^T (A^TA)^{-1} J = 4.28$$

which gives the value of the Lagrange multiplier $\lambda$ as

$$\lambda = (1 - 0.919)/4.28 = 0.0188$$

Inserting this value into equation (5.3.10) gives the final solution

$$\hat{x} = \begin{bmatrix} 0.210 \\ 0.267 \\ 0.442 \end{bmatrix} + \begin{bmatrix} -0.010 \\ -0.032 \\ 0.122 \end{bmatrix} = \begin{bmatrix} 0.200 \\ 0.235 \\ 0.564 \end{bmatrix}$$

5.3.2 Quadratic constraints: mineral reactions

Let us consider the problem of finding the stoichiometric coefficients of a mineral reaction. $m$ element concentrations have been measured on $n$ mineral phases of a rock ($C_i^j$, $i = 1, \ldots, m$; $j = 1, \ldots, n$) and it is suspected that the phases are not chemically independent. In other words, we can find $n$ numbers $\nu_j$ ($j = 1, \ldots, n$) such that

$$\sum_{j=1}^{n} \nu_j C_i^j = 0 \quad (i = 1, \ldots, m) \tag{5.3.13}$$

Obviously, the trivial solution $\nu_j = 0$ ($j = 1, \ldots, n$) does not fit our needs and we must search for solutions as a constrained problem in which the solution vector is of constant, yet arbitrary, length. In other words, we are interested in the direction of the vector that is best by some criterion, regardless of its magnitude, which we may conveniently take as unity. Let us lump the $C_i^j$ coefficients into the $m \times n$ matrix $A$ and the $n$ coefficients $\nu_j$ into the vector $x_n$, hence

$$Ax = 0 \tag{5.3.14}$$
Obviously, for $m \geq n$, such a system has in general no exact solution, and, as before, the search is restricted to that for an estimate $\hat{x}$ of $x$. As the solution is only approximate, a residual error vector $e$ may be defined such that

$$A\hat{x} = e \tag{5.3.15}$$

The least-square criterion suggests minimizing the squared modulus $e^Te$ of this error vector subject to the condition that the squared modulus $\hat{x}^T\hat{x}$ of the estimate is unity. The problem is therefore to minimize the sum $c^2$ such as

$$c^2 = \hat{x}^T A^T A \hat{x} - \lambda(\hat{x}^T\hat{x} - 1) \tag{5.3.16}$$

where $\lambda$ is a Lagrange multiplier. Differentiating with regard to $\hat{x}$, one gets

$$dc^2 = d\hat{x}^T A^TA \hat{x} + \hat{x}^T A^TA\,d\hat{x} - \lambda\,d\hat{x}^T \hat{x} - \lambda \hat{x}^T d\hat{x} = 0$$

or, since each term is a scalar and hence symmetric

$$dc^2 = 2\,d\hat{x}^T (A^TA\hat{x} - \lambda\hat{x}) = 0 \tag{5.3.17}$$

The solution to the problem is therefore the solution to the eigenvalue equation

$$A^TA\hat{x} = \lambda\hat{x} \tag{5.3.18}$$
and hence

$$c^2 = \lambda\hat{x}^T\hat{x} - \lambda(\hat{x}^T\hat{x} - 1) = \lambda \tag{5.3.19}$$

As the matrix $A^TA$ is positive semi-definite, i.e., it has non-negative eigenvalues, for $c^2$ to be minimum, the solution $\hat{x}$ must be the eigenvector $u_1$ associated with the smallest eigenvalue $\lambda_1$ of this matrix.

Table 5.17. Molar chemical composition of minerals in the assemblage quartz (qz) - muscovite (ms) - K-feldspar (Kf) - sillimanite (sil) - water (w).

         qz    ms    Kf     sil    w
SiO2     1     6     3      1      0
Al2O3    0     3     1/2    1      0
K2O      0     1     1/2    0      0
H2O      0     2     0      0      1

Find stoichiometric coefficients for the assemblage quartz (qz)-muscovite (ms)-K-feldspar (Kf)-sillimanite (sil)-water (w) given in the form of the mineral composition matrix of Table 5.17.

An eigencomponent routine confirms that the matrix $A^TA$ has one eigenvalue equal to zero. With the columns in the order of Table 5.17 (qz, ms, Kf, sil, w), the corresponding eigenvector is $[0.485, 0.2425, -0.485, -0.485, -0.485]^T$. It is common practice to use integers as stoichiometric coefficients. This can be achieved by dividing each component by the component of smallest modulus (0.2425), which produces the vector $[2, 1, -2, -2, -2]$ corresponding to the mineral reaction

2 quartz + 1 muscovite ↔ 2 K-feldspar + 2 sillimanite + 2 water
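The search for the eigenvector of smallest eigenvalue can be illustrated on the matrix of Table 5.17. The text relies on an eigencomponent routine; the sketch below swaps in inverse iteration with a small diagonal shift, a standard alternative chosen here only because it fits in a few lines of pure Python. The function names, the shift value and the number of iterations are assumptions.

```python
import math

def solve(matrix, rhs):
    """Solve a small square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def smallest_eigenvector(a, shift=1e-8, iterations=30):
    """Unit eigenvector of A^T A for its smallest eigenvalue, by inverse iteration."""
    n = len(a[0])
    ata = [[sum(row[j] * row[k] for row in a) for k in range(n)] for j in range(n)]
    # The small shift keeps the matrix invertible when the smallest eigenvalue is zero.
    shifted = [[ata[j][k] + (shift if j == k else 0.0) for k in range(n)] for j in range(n)]
    v = [1.0] * n
    for _ in range(iterations):
        w = solve(shifted, v)
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v

# Rows: SiO2, Al2O3, K2O, H2O; columns: qz, ms, Kf, sil, w (Table 5.17).
composition = [
    [1.0, 6.0, 3.0, 1.0, 0.0],
    [0.0, 3.0, 0.5, 1.0, 0.0],
    [0.0, 1.0, 0.5, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0, 1.0],
]
vector = smallest_eigenvector(composition)
smallest = min(abs(c) for c in vector)
coefficients = [c / smallest for c in vector]
if coefficients[0] < 0.0:
    coefficients = [-c for c in coefficients]   # fix the arbitrary overall sign
print([round(c, 6) for c in coefficients])
```

Multiplying the composition matrix by the resulting coefficients gives a zero residual for every oxide, which is the mass-balance statement of the reaction.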
Given the mineral assemblage quartz (qz)-pyroxene (px)-garnet (ga)-plagioclase (pl) with the compositions given in Table 5.18, discuss the possible reactions between mineral end-members.

Table 5.18. Molar chemical composition of minerals in the assemblage quartz (qz) - pyroxene (px) - garnet (ga) - plagioclase (pl).

         qz    px       ga     pl
SiO2     1     2        3      2.5
Al2O3    0     0.125    1      0.75
MgO      0     1        1.2    0
CaO      0     0.5      1.8    0.5
Na2O     0     0.125    0      0.25

Table 5.19. Molar chemical composition of mineral end-members in the assemblage of Table 5.18. See text for abbreviations.

         qz    jd     en    di    py    gr    ab     an
SiO2     1     2      2     2     3     3     3      2
Al2O3    0     1/2    0     0     1     1     1/2    1
MgO      0     0      2     1     3     0     0      0
CaO      0     0      0     1     0     3     0      1
Na2O     0     1/2    0     0     0     0     1/2    0

The mineralogical matrix of Table 5.18 is full rank ($n = 4$) and, in terms of the phase rule, does not seem to be a reactive assemblage since the rock contains four minerals for five independent (chemical) components. However, breaking down the last three minerals into end-members: pyroxene as enstatite (en) + diopside (di) + jadeite (jd); garnet as pyrope (py) + grossular (gr); and plagioclase as albite (ab) + anorthite (an) gives the new mineralogical matrix reported in Table 5.19. Three (8 - 5) independent mineralogical reactions exist among these end-members. We can calculate the whole set of possible reactions through a calculation of eigencomponents in all the systems with 5 + 1 = 6 end-members. Indeed, any assemblage with six end-members must involve at least one reaction. There are (8 × 7)/2 = 28 ways of discarding two end-members among eight and the same number of possible univariant reactions given in Table 5.20, each reaction being as legitimate as any other. Since only three reactions can be independent, other reactions can be obtained by linear combinations. For instance, reactions 2, 3, 4, 5, 15, 19, 20, ... are identical, 15 and 17 sum up to 18, and so on. Given reliable models of activity in plagioclase, clinopyroxene, and garnet, and thermodynamic data at various temperatures and pressures, we could devise many temperature-pressure estimates, although with only three degrees of freedom.

5.4 Handling errors in least-square problems

As soon as observations are considered as samples of random variables, we must redefine the concepts of distance and projection. Let us consider in three-dimensional space a vector $y$ of one observation of three random variables $Y_1$, $Y_2$, and $Y_3$ with its density of probability function $f_Y$. The statistical distance $c$ of the vector $y$ to another point $\bar{y}$ can be defined by the non-negative scalar $c^2$, which has already been met a few times, e.g., in equations (5.2.1) and (5.3.7), and such that

$$c^2 = -2\ln f_Y(\bar{y}) + \text{const} \tag{5.4.1}$$

and is a measure of the probability that $\bar{y}$ is different from $y$. The distance of $y$ to the plane
Table 5.20. The complete set of stoichiometric coefficients for the set of end-members described in Table 5.19. These coefficients have been obtained from the eigenvectors associated with the null eigenvalues of the mineralogical matrices formed out of 6 end-members or minerals at a time. No more than three equations from this set can be considered independent.
en
#
qz jd
1 2 3 4 5 6 7 8 9 10 11 12 13 14
0 0 3 -3 -1 -1 0 0 1 1 0 0 -1 -1 0 0 1 1 0 0 0 0 3 -3 0 0 3 -3 -1 0 2 -1 1 -21 0 -3 0 3 0 0 0 3 -3 6 0 -1 -5 0 0 3 -3 1 1 0 0
di py
gr ab an
-i 1 0 0 0 0 1 0 0 0 -1 0 0 0 1 0 0 0 -1 0 -l 1 0 0 -l 1 0 0 -l 0 0 1 1 0 -1 0 -2 -1 0 3 -1 -2 0 3 5/2 4 0 -6 -1 1 0 0 0 0 -1 0
# 15 16 17 18 19 20 21 22 23 24 25 26 27 28
qz jd
en
di py
gr ab an
0 L 0 0 0 0 -1 1 0 I 0 0 0 0 1 [ () 0 0 2 -1 -l 1 () 1L 2 -1 -l 0 -1 0 1L 1L 0 0 0 0 -1 ]L ]L 0 0 0 0 0 -1 1 0 _| ]L () 1 -2 0 1 1 -1 () -1L 1 -2 0 1I I 0 0 0 0 -1 0 * () 3 0 -2 -1 0 3 () ;3 3 0 -2 -1 -3 3 3 () 0 3 -1 -2 0 3 () -:3 0 -3 1 2 3 _3 () () 3 -3 -1 1 0 0 L
defined by the two vectors $a_1$ and $a_2$ is the maximum-likelihood point $\hat{y}$ of the plane, i.e., the point $\hat{y}$ of the plane that makes the function $f_Y(\hat{y})$ maximum.

If $Y_1$, $Y_2$, and $Y_3$ are normally distributed, the constant probability surfaces are ellipsoids centered at $y$ (Figure 5.12) and the statistical projection $\hat{y}$ of $y$ will be defined as the point where the plane is tangent to the innermost probability ellipsoid. Points on the same ellipsoid are by definition at the same statistical distance from $y$. If $S_y$ is the covariance matrix of the vector $y$, the statistical distance $c$ between $y$ and $\bar{y}$ is given by

$$c^2 = (y - \bar{y})^T S_y^{-1} (y - \bar{y}) \tag{5.4.2}$$

Figure 5.12 Statistical projection $\hat{y}$ of the observation vector $y$ onto the plane defined by the vectors $a_1$ and $a_2$. $\hat{y}$ is the point where the plane is tangent to the innermost probability ellipsoid.

5.4.1 A simple illustration: the weighted mean

A vector $x$ of $n$ random variables has been measured $m$ times, the $i$th measurement resulting in an estimate $x_i$ of the mean value and $S_i$ of the covariance matrix. A best estimate $\hat{x}$ of the pooled ('weighted') average makes the sum of squared statistical distances to each $x_i$ minimum. The scalar expression

$$c^2 = \sum_{i=1}^{m} (x - x_i)^T S_i^{-1} (x - x_i) \tag{5.4.3}$$

is therefore minimum for $x = \hat{x}$. Differentiating the scalar $c^2$ for $x$, we obtain

$$2c\,dc = \sum_{i=1}^{m} dx^T S_i^{-1} (x - x_i) + \sum_{i=1}^{m} (x - x_i)^T S_i^{-1} dx = 2\sum_{i=1}^{m} dx^T S_i^{-1} (x - x_i)$$

or, at minimum

$$\sum_{i=1}^{m} S_i^{-1} \hat{x} = \sum_{i=1}^{m} S_i^{-1} x_i \tag{5.4.4}$$

The solution $\hat{x}$ is given by

$$\hat{x} = \left(\sum_{i=1}^{m} S_i^{-1}\right)^{-1} \sum_{i=1}^{m} S_i^{-1} x_i \tag{5.4.5}$$

The covariance (standard error) matrix of $\hat{x}$ is given by

$$S_{\hat{x}} = \left(\sum_{i=1}^{m} S_i^{-1}\right)^{-1} \tag{5.4.6}$$

Defining the weight matrix $W_k$ of the $k$th measurement as

$$W_k = \left(\sum_{i=1}^{m} S_i^{-1}\right)^{-1} S_k^{-1} \tag{5.4.7}$$
Table 5.21. Average lead isotope compositions measured over three blocks with standard deviations and correlation coefficients. The last row indicates the weighted mean of the three blocks and its standard deviation and correlation coefficient.

Block #    206Pb/204Pb    s(206Pb/204Pb)    207Pb/204Pb    s(207Pb/204Pb)    rho
1          18.688         0.007             15.532         0.011             0.908
2          18.695         0.013             15.549         0.015             0.925
3          18.678         0.011             15.522         0.013             0.937
mean       18.688         0.0051            15.535         0.0068            0.912
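we get the compact 'weighted' average

$$\hat{x} = \sum_{i=1}^{m} W_i x_i \tag{5.4.8}$$

As a special case, replacing matrices and vectors by scalars, the case $n = 1$ reduces to the well-known weighted average

$$\hat{x} = \frac{\sum_{i=1}^{m} x_i / s_i^2}{\sum_{i=1}^{m} 1 / s_i^2} \tag{5.4.9}$$

and

$$s_{\hat{x}}^2 = \left(\sum_{i=1}^{m} \frac{1}{s_i^2}\right)^{-1} \tag{5.4.10}$$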
^ Three blocks of lead isotope measurements on the same lead sample by mass spectrometry have produced the average ratios 2 0 6 Pb/ 2 0 4 Pb and 2 0 7 Pb/ 2 0 4 Pb reported in Table 5.21 together with their standard deviation and correlation coefficient. Calculate the weighted mean and standard error matrix of these three blocks. From the definition of variance and correlation coefficient and equation (4.2.18), we find that the covariance matrix S1 is ["0.007 0 T 1 0.908T0.007 0 ~|_ ' L 0 O.OllJLo.908 1 j|_0 0.011 _|~
T0.490 0.699~ Lo.699 1.2ioj
Likewise
2
J 1.690 1.8041 „ Jl.210 1.3401 J = 10"4 and 5r3 = 10~4 L 1.804 2.250 J L1340 1.690J
288
Inverse methods
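The inverse matrices are calculated as

$$S_1^{-1} = 10^{4}\begin{bmatrix} 11.601 & -6.702 \\ -6.702 & 4.698 \end{bmatrix},\quad S_2^{-1} = 10^{4}\begin{bmatrix} 4.105 & -3.292 \\ -3.292 & 3.084 \end{bmatrix} \quad\text{and}\quad S_3^{-1} = 10^{4}\begin{bmatrix} 6.779 & -5.375 \\ -5.375 & 4.854 \end{bmatrix}$$

and therefore

$$S_1^{-1} + S_2^{-1} + S_3^{-1} = 10^{4}\begin{bmatrix} 22.486 & -15.368 \\ -15.368 & 12.635 \end{bmatrix}$$

The standard error matrix of the weighted average is calculated from equation (5.4.6) as

$$S_{\hat{x}} = 10^{-4}\begin{bmatrix} 0.2637 & 0.3207 \\ 0.3207 & 0.4692 \end{bmatrix}$$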
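The weight matrices are calculated as

$$W_1 = \begin{bmatrix} 0.9096 & -0.2604 \\ 0.5760 & 0.0551 \end{bmatrix},\quad W_2 = \begin{bmatrix} 0.0268 & 0.1210 \\ -0.2279 & 0.3913 \end{bmatrix},\quad W_3 = \begin{bmatrix} 0.0636 & 0.1394 \\ -0.3481 & 0.5536 \end{bmatrix}$$

It can be checked that $W_1$, $W_2$, and $W_3$ sum up to the identity matrix. The weighted average $\hat{x}$ is calculated from equation (5.4.8) as

$$\hat{x} = \begin{bmatrix} 0.9096 & -0.2604 \\ 0.5760 & 0.0551 \end{bmatrix}\begin{bmatrix} 18.688 \\ 15.532 \end{bmatrix} + \begin{bmatrix} 0.0268 & 0.1210 \\ -0.2279 & 0.3913 \end{bmatrix}\begin{bmatrix} 18.695 \\ 15.549 \end{bmatrix} + \begin{bmatrix} 0.0636 & 0.1394 \\ -0.3481 & 0.5536 \end{bmatrix}\begin{bmatrix} 18.678 \\ 15.522 \end{bmatrix}$$

or

$$\hat{x} = \begin{bmatrix} 12.954 \\ 11.619 \end{bmatrix} + \begin{bmatrix} 2.383 \\ 1.824 \end{bmatrix} + \begin{bmatrix} 3.351 \\ 2.093 \end{bmatrix} = \begin{bmatrix} 18.688 \\ 15.535 \end{bmatrix}$$

Due to its smaller uncertainties, the first measurement is dominant. The final results for the weighted average, equation (5.4.5), its standard errors, equation (5.4.6), and the correlation coefficient are reported in the last row of Table 5.21.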
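The worked example above can be retraced numerically. The following NumPy sketch applies equations (5.4.6) to (5.4.8) to the data of Table 5.21; the helper name `covariance` and the variable names are illustrative choices, not part of the original text, and small differences from the quoted digits are to be expected because the text rounds intermediate matrices.

```python
import numpy as np

# Block means (206Pb/204Pb, 207Pb/204Pb), standard deviations and
# correlation coefficients taken from Table 5.21
x = np.array([[18.688, 15.532], [18.695, 15.549], [18.678, 15.522]])
s = np.array([[0.007, 0.011], [0.013, 0.015], [0.011, 0.013]])
rho = np.array([0.908, 0.925, 0.937])


def covariance(sd, r):
    """2 x 2 covariance matrix built from two standard deviations and a correlation."""
    R = np.array([[1.0, r], [r, 1.0]])
    return np.diag(sd) @ R @ np.diag(sd)


S_inv = [np.linalg.inv(covariance(sd, r)) for sd, r in zip(s, rho)]
S_mean = np.linalg.inv(sum(S_inv))               # equation (5.4.6)
W = [S_mean @ Si for Si in S_inv]                # equation (5.4.7)
x_mean = sum(Wk @ xk for Wk, xk in zip(W, x))    # equation (5.4.8)

# Standard deviations and correlation coefficient of the weighted mean
sd_mean = np.sqrt(np.diag(S_mean))
rho_mean = S_mean[0, 1] / (sd_mean[0] * sd_mean[1])
```

Checking that the weight matrices sum to the identity, `np.allclose(sum(W), np.eye(2))`, is a cheap safeguard against transcription errors in the individual covariance matrices.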
5.4.2 Linear least-square systems

We assume that a random variable vector $Y$ of $\mathbb{R}^m$ (here upper-case is used to indicate not a matrix but an ordered set of m random variables) distributed as a multivariate normal distribution has been measured through an adequate analytical protocol (e.g., CaO concentration, the $^{87}$Sr/$^{86}$Sr ratio, ...). The outcome of this measurement is the data vector $y_m$. Here $y_m$ is the mean of a large number of measurements with expected value $\bar{y}_m$ and has a covariance matrix $S_y$. We know from Chapter 4 that the squared statistical distance $c^2$ between $y$ and $\bar{y}$, such that

$$c^2 = (y - \bar{y})^T S_y^{-1}(y - \bar{y}) \tag{5.4.11}$$
is then distributed as a chi-squared variable with m degrees of freedom (remember that the vector y has m components). From equation (4.2.41), the smaller the value of c2 , the larger the confidence limit 100(1—a) since
where the value of percentiles increases with (1 − α). As discussed in Section 4.2, the case of a small number of measurements could be handled in a similar way by adopting the appropriate statistics. The solution associated with the maximum probability (Figure 5.12), said to be maximum-likelihood, is given by those unbiased estimates $\hat{x}$ and $\hat{y}$ of $\bar{x}$ and $\bar{y}$ which make the value $\hat{c}^2$ of $c^2$ minimum and equal to

$$\hat{c}^2 = (y - \hat{y})^T S_y^{-1}(y - \hat{y}) \tag{5.4.13}$$

$\hat{x}$ and $\hat{y}$ also satisfy the model

$$\hat{y} = A\hat{x} \tag{5.4.14}$$
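$\hat{c}^2$ is known as the weighted sum of squared residuals. We can revert to the standard least-square solution by a change of variable as discussed in Chapter 2. Let us define

$$\psi = S_y^{-1/2}y \tag{5.4.15}$$

and

$$\hat{\psi} = S_y^{-1/2}\hat{y} \tag{5.4.16}$$

$\hat{c}^2$ can be written

$$\hat{c}^2 = (\psi - \hat{\psi})^T(\psi - \hat{\psi}) \tag{5.4.17}$$

Pre-multiplying equation (5.4.14) by $S_y^{-1/2}$, it becomes

$$\hat{\psi} = S_y^{-1/2}\hat{y} = S_y^{-1/2}A\hat{x} = \mathcal{A}\hat{x} \tag{5.4.18}$$

with the definition of $\mathcal{A}$ being embedded in the last equality. The standard least-square normal equations now apply as

$$\mathcal{A}^T\mathcal{A}\hat{x} = \mathcal{A}^T\psi \tag{5.4.19}$$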
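or, reverting to the initial variables

$$(S_y^{-1/2}A)^T S_y^{-1/2}A\hat{x} = (S_y^{-1/2}A)^T S_y^{-1/2}y \tag{5.4.20}$$

which is simply equivalent to

$$A^T S_y^{-1}A\hat{x} = A^T S_y^{-1}y \tag{5.4.21}$$

The least-square solution, written with the transformed variables $\mathcal{A} = S_y^{-1/2}A$ and $\psi = S_y^{-1/2}y$, is

$$\hat{x} = (\mathcal{A}^T\mathcal{A})^{-1}\mathcal{A}^T\psi \tag{5.4.22}$$

or

$$\hat{x} = (A^T S_y^{-1}A)^{-1}A^T S_y^{-1}y \tag{5.4.23}$$

and

$$\hat{y} = A(A^T S_y^{-1}A)^{-1}A^T S_y^{-1}y \tag{5.4.24}$$

with the projector P of $S_y^{-1/2}y$ onto $S_y^{-1/2}\hat{y}$ defined as

$$P = S_y^{-1/2}A(A^T S_y^{-1}A)^{-1}(S_y^{-1/2}A)^T \tag{5.4.25}$$

We see that for $S_y = I_m$, the cumbersome equations (5.4.23) and (5.4.24) reduce to the standard least-square solutions. The covariance matrix $S_{\hat{x}}$ on $\hat{x}$ can be obtained through equation (4.3.4)

$$S_{\hat{x}} = \left[(A^T S_y^{-1}A)^{-1}A^T S_y^{-1}\right] S_y \left[(A^T S_y^{-1}A)^{-1}A^T S_y^{-1}\right]^T \tag{5.4.26}$$

or, after simplification

$$S_{\hat{x}} = (A^T S_y^{-1}A)^{-1} \tag{5.4.27}$$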
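Again through equations (4.3.4) and (5.4.24), the covariance matrix $S_{\hat{y}}$ on $\hat{y}$ is

$$S_{\hat{y}} = \left[A(A^T S_y^{-1}A)^{-1}A^T S_y^{-1}\right] S_y \left[A(A^T S_y^{-1}A)^{-1}A^T S_y^{-1}\right]^T \tag{5.4.28}$$

and, after simplification

$$S_{\hat{y}} = A(A^T S_y^{-1}A)^{-1}A^T \tag{5.4.29}$$

The n × m covariance matrix $\mathrm{cov}(\hat{x}, y)$ between the model $\hat{x}$ and the observations $y$ can be obtained easily. By linearity

$$\hat{x} - \mathcal{E}(\hat{x}) = (A^T S_y^{-1}A)^{-1}A^T S_y^{-1}\left[y - \mathcal{E}(y)\right] \tag{5.4.30}$$

Multiplying by $[y - \mathcal{E}(y)]^T$ and taking the expectation gives

$$\mathrm{cov}(\hat{x}, y) = (A^T S_y^{-1}A)^{-1}A^T S_y^{-1}S_y = (A^T S_y^{-1}A)^{-1}A^T \tag{5.4.31}$$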
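Equations (5.4.23) to (5.4.31) translate almost line by line into matrix code. The NumPy sketch below is only an illustration: the function name `weighted_least_squares` and the straight-line data set (`t`, `y`, `sd`, a uniform correlation of 0.5) are hypothetical and do not come from the text.

```python
import numpy as np


def weighted_least_squares(A, y, S_y):
    """Solve y = A x for data with covariance matrix S_y."""
    S_inv = np.linalg.inv(S_y)
    S_x = np.linalg.inv(A.T @ S_inv @ A)   # (5.4.27) covariance of x_hat
    x_hat = S_x @ A.T @ S_inv @ y          # (5.4.23) least-square solution
    y_hat = A @ x_hat                      # (5.4.24) adjusted data
    S_yhat = A @ S_x @ A.T                 # (5.4.29) covariance of y_hat
    cov_xy = S_x @ A.T                     # (5.4.31) cov(x_hat, y)
    r = y - y_hat
    c2 = float(r @ S_inv @ r)              # (5.4.13) weighted sum of squared residuals
    return x_hat, y_hat, S_x, S_yhat, cov_xy, c2


# Hypothetical data: a straight line y = x1 + x2 * t observed at four points
t = np.array([0.0, 1.0, 2.0, 3.0])
A = np.column_stack([np.ones_like(t), t])
y = np.array([1.1, 2.9, 5.2, 6.8])
sd = np.array([0.1, 0.2, 0.1, 0.3])
R = np.full((4, 4), 0.5) + 0.5 * np.eye(4)   # correlation of 0.5 between all pairs
S_y = np.diag(sd) @ R @ np.diag(sd)

x_hat, y_hat, S_x, S_yhat, cov_xy, c2 = weighted_least_squares(A, y, S_y)
```

Explicit inverses are kept here to mirror the equations; for large or ill-conditioned systems, solving the normal equations with `np.linalg.solve` or working with a Cholesky factor of $S_y$ would usually be preferable.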
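In order to get the correlation coefficients, this matrix must be pre-multiplied by the inverse of the diagonal matrix having the standard deviations of $\hat{x}$ on the diagonal, and post-multiplied by the inverse of the diagonal matrix having the standard deviations of $y$ on the diagonal.

We will now investigate the sampling properties of the statistic representing the weighted sum of squared residuals $\hat{c}^2$ given by equation (5.4.13). We first observe that the slightly different expression $(\hat{y} - \bar{y})^T S_y^{-1}(y - \hat{y})$ is zero since, with $\psi = S_y^{-1/2}y$,

$$(\hat{y} - \bar{y})^T S_y^{-1}(y - \hat{y}) = (\hat{y} - \bar{y})^T S_y^{-1/2}S_y^{-1/2}(y - \hat{y}) = (\hat{\psi} - \bar{\psi})^T(\psi - \hat{\psi}) \tag{5.4.32}$$

where, from the properties of the standard least-square solution, the vector $(\psi - \hat{\psi})$ is orthogonal to the plane containing both $\hat{\psi}$ and $\bar{\psi}$ and therefore orthogonal to $(\hat{\psi} - \bar{\psi})$. Equation (5.4.13) for $\hat{c}^2$ therefore can be rewritten

$$\hat{c}^2 = (y - \bar{y})^T S_y^{-1}(y - \bar{y}) - (\hat{y} - \bar{y})^T S_y^{-1}(\hat{y} - \bar{y})$$

or

$$\hat{c}^2 = (y - \bar{y})^T S_y^{-1}(y - \bar{y}) - (\hat{x} - \bar{x})^T A^T S_y^{-1}A(\hat{x} - \bar{x})$$

and finally

$$\hat{c}^2 = (y - \bar{y})^T S_y^{-1}(y - \bar{y}) - (\hat{x} - \bar{x})^T S_{\hat{x}}^{-1}(\hat{x} - \bar{x}) \tag{5.4.33}$$

Since the two terms on the right-hand side are approximately distributed as chi-squared variables with m and n degrees of freedom, respectively, $\hat{c}^2$ is therefore distributed as a chi-squared variable with (m − n) degrees of freedom. The expected value of $\hat{c}^2$ is

$$\mathcal{E}(\hat{c}^2) = m - n \tag{5.4.34}$$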
a property which can be shown to hold even when $y$ is not normally distributed (take the expectation of $\hat{c}^2$ and apply the commutativity properties of the trace). A common use is to report the Mean Square of Weighted Deviations (MSWD), i.e., $\hat{c}^2/(m - n)$, whose expectation is 1. The value of the expectation can also be used to scale errors when the covariance matrix $S_y$ of the data vector is unknown. The usual procedure is to assume that $S_y$ can be approximated by

$$S_y \approx \sigma^2 I_m \tag{5.4.35}$$

where σ is an unknown scalar which is calculated in such a way that $\hat{c}^2$ is actually equal to (m − n). This way of assessing errors is sometimes referred to as error calculation in contrast with the more commendable procedure of error propagation used up to this point. In addition to depriving the user of the capability of testing the model through a test of $\hat{c}^2$ with respect to the physically determined uncertainties, the covariance matrix is assumed to be diagonal (uncorrelated measurements) and absolute uncertainties made equal for all the data. Error calculation should therefore be restricted to problems which do not need critical assessment.

Finally, the assessment of the importance and influence of each observation can be deduced from the 'error-free' case. The matrix of data importance (and a measure of their independence) is still the projector P. The reader can consult Sen and Srivastava (1990) for relevant expressions of their influence.

Table 5.22. Ion beam intensities $I_i$ (mV) at mass i with standard deviations $s(I_i)$ and adjusted intensities $\hat{I}_i$. See Table 5.1 for atomic abundances.

Mass i   I_i    s(I_i)   Adjusted I_i
142      207    1.439    205.8
144      62     0.787    61.6
146      43     0.656    42.2
148      26     0.510    25.7
150      22     0.469    21.4

Let us consider the peak-stripping problem solved in Section 5.1.1 when experimental uncertainties are taken into account. Ion probe measurements of the ion current at the masses 142, 144, 146, 148 and 150 are given in Table 5.22 and form the vector $I$. For the particular setting of counting times, one standard deviation of the mean has been found by previous measurements to equal 10 percent of the square-root of peak height (Poisson statistics). Because noise is largely due to fluctuations of beam intensity, peak heights have also been found to correlate with a correlation coefficient of 0.85. Calculate the total elemental signal in millivolts for Ce, Nd, and Sm and their covariance matrix, and discuss whether the linear signal addition is an acceptable hypothesis.

The relevant mass balance equations have been written in Section 5.1.1 and the matrix A is given in Table 5.1. The standard deviations on intensities are calculated, e.g., for mass 142 as $0.1 \times \sqrt{207} = 1.439$, and arranged to form the diagonal matrix $S$. Then the correlation matrix $R$ is formed out of 1 on the diagonal and 0.85 everywhere else, and, from equation (4.2.18), the covariance matrix $W_I$ is calculated as

$$W_I = SRS$$

which gives

$$W_I = \begin{bmatrix} 2.070 & 0.963 & 0.802 & 0.624 & 0.574 \\ 0.963 & 0.620 & 0.439 & 0.341 & 0.314 \\ 0.802 & 0.439 & 0.430 & 0.284 & 0.261 \\ 0.624 & 0.341 & 0.284 & 0.260 & 0.203 \\ 0.574 & 0.314 & 0.261 & 0.203 & 0.220 \end{bmatrix}$$
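The construction of this covariance matrix from standard deviations and a common correlation coefficient can be retraced with a short NumPy sketch. The variable names are illustrative; the peak heights, the 10 percent rule, and the coefficient 0.85 are those stated in the example.

```python
import numpy as np

# Peak heights (mV) at masses 142, 144, 146, 148 and 150 (Table 5.22)
I = np.array([207.0, 62.0, 43.0, 26.0, 22.0])

sd = 0.1 * np.sqrt(I)        # 10 percent of the square root of peak height
S = np.diag(sd)              # diagonal matrix of standard deviations
R = np.full((5, 5), 0.85)    # common correlation coefficient off the diagonal
np.fill_diagonal(R, 1.0)     # unit diagonal
W_I = S @ R @ S              # covariance matrix of the intensities
```

Because $R$ is positive-definite for any common correlation strictly between −1/4 and 1 in this five-peak case, $W_I$ can safely be inverted in the weighted least-square formulas.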
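We first calculate the covariance matrix of the solution as

$$(A^T W_I^{-1}A)^{-1} = \begin{bmatrix} 46.383 & 8.131 & 11.622 \\ 8.131 & 8.466 & 6.070 \\ 11.622 & 6.070 & 8.821 \end{bmatrix}$$

which gives the propagated standard deviations on $\hat{I}_{\rm Ce}$, $\hat{I}_{\rm Nd}$, and $\hat{I}_{\rm Sm}$ as the vector $[6.81, 2.91, 2.97]^T$. Given the standard-deviation matrix

$$S_{\hat{I}} = \begin{bmatrix} \sqrt{46.383} & 0 & 0 \\ 0 & \sqrt{8.466} & 0 \\ 0 & 0 & \sqrt{8.821} \end{bmatrix}$$

the correlation matrix ρ between these variables is

$$\rho = S_{\hat{I}}^{-1}(A^T W_I^{-1}A)^{-1}S_{\hat{I}}^{-1} = \begin{bmatrix} 1 & 0.41 & 0.58 \\ 0.41 & 1 & 0.70 \\ 0.58 & 0.70 & 1 \end{bmatrix}$$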
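Converting a covariance matrix into standard deviations and a correlation matrix is a two-line operation. The NumPy sketch below starts from the covariance matrix quoted in the text (variable names are illustrative); it reproduces the standard deviations and, to within the rounding of the quoted three-decimal entries, the correlation coefficients.

```python
import numpy as np

# Covariance matrix of the solution (I_Ce, I_Nd, I_Sm) as quoted in the text
C = np.array([[46.383, 8.131, 11.622],
              [8.131, 8.466, 6.070],
              [11.622, 6.070, 8.821]])

sd = np.sqrt(np.diag(C))       # propagated standard deviations
D_inv = np.diag(1.0 / sd)      # inverse of the standard-deviation matrix
rho = D_inv @ C @ D_inv        # correlation matrix
```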
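The least-square solution $\hat{I}_{\rm Ce}$, $\hat{I}_{\rm Nd}$, and $\hat{I}_{\rm Sm}$ is calculated from equation (5.4.23) as

$$\begin{bmatrix} \hat{I}_{\rm Ce} \\ \hat{I}_{\rm Nd} \\ \hat{I}_{\rm Sm} \end{bmatrix} = \begin{bmatrix} 1250.1 \\ 245.6 \\ 102.1 \end{bmatrix}$$

Determining the projector P is slightly more tedious ($W_I^{-1/2}$ should be computed through the methods described in Section 2.3) and gives

$$P = \begin{bmatrix} 0.908 & 0.099 & -0.203 & 0.027 & -0.178 \\ 0.099 & 0.776 & 0.377 & 0.146 & -0.025 \\ -0.203 & 0.377 & 0.339 & -0.177 & -0.098 \\ 0.027 & 0.146 & -0.177 & 0.729 & 0.380 \\ -0.178 & -0.025 & -0.098 & 0.380 & 0.248 \end{bmatrix}$$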
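The projector of equation (5.4.25) can be sketched as follows. This is an illustration under stated assumptions: the function name `weighted_projector` is invented here, the inverse square root of the covariance matrix is obtained from a symmetric eigen-decomposition (one possible method among those alluded to in the text), and the 5 × 3 matrix `A` is a hypothetical stand-in, not the abundance matrix of Table 5.1.

```python
import numpy as np


def weighted_projector(A, S_y):
    """P = S^(-1/2) A (A^T S^-1 A)^-1 (S^(-1/2) A)^T, equation (5.4.25)."""
    # Symmetric inverse square root of S_y from its eigen-decomposition
    vals, vecs = np.linalg.eigh(S_y)
    S_inv_half = vecs @ np.diag(vals ** -0.5) @ vecs.T
    B = S_inv_half @ A
    return B @ np.linalg.inv(B.T @ B) @ B.T


# Hypothetical full-rank design matrix (NOT the values of Table 5.1)
A = np.array([[1.0, 0.5, 0.0],
              [0.0, 1.0, 0.2],
              [0.0, 0.7, 0.0],
              [0.0, 0.3, 1.0],
              [0.0, 0.2, 0.6]])

# Data covariance built as in the example: 10 percent of sqrt(I), correlation 0.85
sd = 0.1 * np.sqrt(np.array([207.0, 62.0, 43.0, 26.0, 22.0]))
R = np.full((5, 5), 0.85)
np.fill_diagonal(R, 1.0)
S_y = np.diag(sd) @ R @ np.diag(sd)

P = weighted_projector(A, S_y)
importance = np.diag(P)        # importance of each observation
```

Whatever the design matrix, P is symmetric and idempotent, and its trace equals the number of model parameters, which gives a quick check on any tabulated projector.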
The diagonal elements of P sum up to three, the number of variables to be determined, and rank the importance of each peak in that determination. The 'adjusted' values that satisfy the model are calculated according to equation (5.4.24) and listed in Table 5.22. The covariance between the three model parameters $\hat{x}$ and the five observations $y$ can be calculated through equation (5.4.31). Their 3 × 5 correlation matrix $R(\hat{x}, y)$ can be shown to be

$$R(\hat{x}, y) = \begin{bmatrix} 0.75 & 0.43 & 0.31 & 0.51 & 0.41 \\ 0.76 & 0.96 & 0.76 & 0.79 & 0.68 \\ 0.69 & 0.73 & 0.54 & 0.89 & 0.71 \end{bmatrix}$$
Larger coefficients show that $\hat{I}_{\rm Nd}$ depends mostly on variations of $I_{144}$ while $\hat{I}_{\rm Sm}$ depends mostly on variations of $I_{148}$. The model itself can be tested against the sum of squared residuals $\hat{c}^2 = 4.01$. If, as a first approximation, we admit that intensities are normally distributed (which may not be too incorrect since all the values seem to be distant from zero by many standard deviations), $\hat{c}^2$ is distributed as a chi-squared variable with 5 − 3 = 2 degrees of freedom. Consulting statistical tables, we find that there is a probability of 0.05 that a chi-squared variable with two degrees of freedom exceeds 5.99, a value larger than the observed $\hat{c}^2$. We therefore accept at the 95 percent confidence level the hypothesis that the linear signal addition described by the mass balance equations is correct.

5.4.3 Non-linear least-square systems: isochrons

A particular experiment provides observations of n independent variables $X_j$ (j = 1, ..., n) and of a single dependent scalar variable $Y$ which we suspect to be related through a linear relationship such as

$$Y = \sum_{j=1}^{n} \alpha_j X_j + \beta \tag{5.4.36}$$

where the n $\alpha_j$ and the one β are parameters to be determined by the experiment. m observations are carried out (i = 1, ..., m). For n = 1, we are dealing with a straight line in the $(x, y)$ plane, for n = 2 with a plane in the $(x_1, x_2, y)$ space, and so on for higher dimensions. Examples of such relationships are countless: Rb-Sr and Sm-Nd isochrons, mixing planes and hyper-planes in multiple elemental or isotopic systems make the best known cases. Kent et al. (1990) list some applications, with emphasis on rare-gas isotopic systems. The variables $X_j$ and $Y$ can be either independent, which is nearly the case for Rb-Sr and Sm-Nd isochrons, or strongly correlated as in the Concordia U-Pb plot. The case β = 0 (lines or planes going through the origin) may be derived with little difficulty from the following by setting all the corresponding coefficients to zero. In order to take advantage of a matrix formulation, we define the vector $X_n$ of the n variables $X_j$ and the vector α of the n unknowns $\alpha_j$ and write

$$\alpha^T X_n + \beta - Y = 0 \tag{5.4.37}$$

We now proceed to m observations. The ith observation provides the estimates $x_{ij}$ of the independent variables $X_j$ and the estimate $y_i$ of the dependent variable $Y$. The n estimates $x_{ij}$ of the variables $X_j$ provided by this ith observation are lumped together into the vector $x_i$. We assume that the set of the (n + 1) data $(x_i, y_i)$ associated with the ith observation represent unbiased estimates of the mean $(\bar{x}_i, \bar{y}_i)$ of a random (n + 1)-vector distributed as a multivariate normal distribution. The unbiased character of the estimates is equivalent to

$$\mathcal{E}(x_{ij}) = \bar{x}_{ij} \tag{5.4.38}$$

$$\mathcal{E}(y_i) = \bar{y}_i \tag{5.4.39}$$

$S_i$ is the (n + 1) × (n + 1) covariance matrix of the ith measurement $(x_i, y_i)$. The jth diagonal term is the variance of $x_{ij}$, while the (n + 1)th diagonal term is the variance of $y_i$. The off-diagonal terms are the corresponding covariance terms. In order to illustrate how the maximum-likelihood expression can be built, let us consider the case n = 1 (only one $X$), which is the case of a straight line relating $X$ and $Y$. The expression of $c^2$ is given by
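$$c^2 = \begin{bmatrix} x_1 - \bar{x}_1 & y_1 - \bar{y}_1 & \cdots & x_m - \bar{x}_m & y_m - \bar{y}_m \end{bmatrix}\begin{bmatrix} S_1^{-1} & & 0 \\ & \ddots & \\ 0 & & S_m^{-1} \end{bmatrix}\begin{bmatrix} x_1 - \bar{x}_1 \\ y_1 - \bar{y}_1 \\ \vdots \\ x_m - \bar{x}_m \\ y_m - \bar{y}_m \end{bmatrix} \tag{5.4.40}$$

where the $S_i^{-1}$ matrices are 2 × 2 blocks on the matrix diagonal. If each observation is the result of a large number of measurements (ideally the mean value of many replicates), $c^2$ should be approximately distributed as a chi-squared variable with 2m degrees of freedom. Off-diagonal blocks are zero because different measurement pairs are supposed to have uncorrelated errors. In this quadratic form, only the terms with matching values of index will be different from zero. The expression of $c^2$ can be written as the sum of the quadratic forms corresponding to each measurement

$$c^2 = \sum_{i=1}^{m} \begin{bmatrix} x_i - \bar{x}_i & y_i - \bar{y}_i \end{bmatrix} S_i^{-1}\begin{bmatrix} x_i - \bar{x}_i \\ y_i - \bar{y}_i \end{bmatrix} \tag{5.4.41}$$

Generalizing this maximum-likelihood expression to an arbitrary number n of variables gives

$$c^2 = \sum_{i=1}^{m} \begin{bmatrix} (x_i - \bar{x}_i)^T & y_i - \bar{y}_i \end{bmatrix} S_i^{-1}\begin{bmatrix} x_i - \bar{x}_i \\ y_i - \bar{y}_i \end{bmatrix} \tag{5.4.42}$$

Lumping the vector $x_i$ and the scalar $y_i$ together into a single vector $u_i$ of dimension (n + 1), $c^2$ becomes

$$c^2 = \sum_{i=1}^{m} (u_i - \bar{u}_i)^T S_i^{-1}(u_i - \bar{u}_i) \tag{5.4.43}$$

where the means $\bar{u}_i$ are to satisfy the constraint

$$[\alpha^T, -1]\,\bar{u}_i + \beta = 0 \tag{5.4.44}$$

(Note that a scalar behaves as a symmetric matrix.) Because of finite sampling, α and β cannot be evaluated exactly. Instead, we will search for unbiased estimates $\hat{\alpha}$ and $\hat{\beta}$ of α and β together with unbiased estimates $\hat{y}_i$ and $\hat{x}_{ij}$ of $\bar{y}_i$ and $\bar{x}_{ij}$ that satisfy the linear model given by equation (5.4.37) and minimize the maximum-likelihood expression in $\bar{x}_i$ and $\bar{y}_i$. Introducing m Lagrange multipliers $\lambda_i$, one for each linear constraint to be satisfied by each set of variables corresponding to one observation, the hat variables minimize the expression $\gamma^2$ given by

$$\gamma^2 = \sum_{i=1}^{m} (u_i - \bar{u}_i)^T S_i^{-1}(u_i - \bar{u}_i) + 2\sum_{i=1}^{m} \lambda_i\left\{[\alpha^T, -1]\,\bar{u}_i + \beta\right\} \tag{5.4.45}$$

Differentiating $\gamma^2$ with respect to the $\bar{u}_i$, we get

$$\mathrm{d}\gamma^2 = -\sum_{i=1}^{m} 2\,\mathrm{d}\bar{u}_i^T S_i^{-1}(u_i - \bar{u}_i) + \sum_{i=1}^{m} 2\lambda_i\,\mathrm{d}\bar{u}_i^T\begin{bmatrix} \alpha \\ -1 \end{bmatrix}$$

or, after rearrangement under the same summation sign

$$\mathrm{d}\gamma^2 = -2\sum_{i=1}^{m} \mathrm{d}\bar{u}_i^T\left\{S_i^{-1}(u_i - \bar{u}_i) - \lambda_i\begin{bmatrix} \alpha \\ -1 \end{bmatrix}\right\} \tag{5.4.46}$$

For the sum to be zero regardless of the value taken by the $\mathrm{d}\bar{u}_i$, each term in the braces must vanish. Therefore, the values at the minimum, labeled with a hat on the symbol, satisfy

$$S_i^{-1}(u_i - \hat{u}_i) = \hat{\lambda}_i\begin{bmatrix} \hat{\alpha} \\ -1 \end{bmatrix} \tag{5.4.47}$$

or

$$\hat{u}_i = u_i - \hat{\lambda}_i S_i\begin{bmatrix} \hat{\alpha} \\ -1 \end{bmatrix} \tag{5.4.48}$$

Pre-multiplying the last expression by $[\hat{\alpha}^T, -1]$ and combining with equation (5.4.44) gives

$$[\hat{\alpha}^T, -1]\,\hat{u}_i = [\hat{\alpha}^T, -1]\,u_i - \hat{\lambda}_i[\hat{\alpha}^T, -1]\,S_i\begin{bmatrix} \hat{\alpha} \\ -1 \end{bmatrix} = -\hat{\beta} \tag{5.4.49}$$

In order to work with more compact notation, we introduce the weight $w_i$ of the ith observation. $w_i$ is the scalar given for its value at minimum by

$$\hat{w}_i = [\hat{\alpha}^T, -1]\,S_i\begin{bmatrix} \hat{\alpha} \\ -1 \end{bmatrix} \tag{5.4.50}$$

Combining equations (5.4.49) and (5.4.50) results in

$$\hat{\lambda}_i = \hat{w}_i^{-1}\left\{[\hat{\alpha}^T, -1]\,u_i + \hat{\beta}\right\} \tag{5.4.51}$$

Equation (5.4.48) becomes

$$\hat{u}_i = u_i - \hat{w}_i^{-1}\left\{[\hat{\alpha}^T, -1]\,u_i + \hat{\beta}\right\}S_i\begin{bmatrix} \hat{\alpha} \\ -1 \end{bmatrix} \tag{5.4.52}$$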
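which can be inserted into the expression (5.4.43) of $c^2$ to give its minimum value

$$\hat{c}^2 = \sum_{i=1}^{m} \hat{w}_i^{-2}\left\{[\hat{\alpha}^T, -1]\,u_i + \hat{\beta}\right\}^2[\hat{\alpha}^T, -1]\,S_i S_i^{-1}S_i\begin{bmatrix} \hat{\alpha} \\ -1 \end{bmatrix}$$

or after simplification

$$\hat{c}^2 = \sum_{i=1}^{m} \hat{w}_i^{-1}\left\{[\hat{\alpha}^T, -1]\,u_i + \hat{\beta}\right\}^2 \tag{5.4.53}$$

In a non-matrix form, this expression becomes

$$\hat{c}^2 = \sum_{i=1}^{m} \hat{w}_i^{-1}\left(y_i - \sum_{j=1}^{n} \hat{\alpha}_j x_{ij} - \hat{\beta}\right)^2 = \sum_{i=1}^{m} \hat{w}_i^{-1}\hat{\delta}_i^2 \tag{5.4.54}$$

with weight $\hat{w}_i$ defined as

$$\hat{w}_i = \sum_{j=1}^{n}\sum_{k=1}^{n} \hat{\alpha}_j\hat{\alpha}_k\,\mathrm{cov}(x_{ij}, x_{ik}) - 2\sum_{j=1}^{n} \hat{\alpha}_j\,\mathrm{cov}(x_{ij}, y_i) + \mathrm{var}(y_i) \tag{5.4.55}$$

and residual $\hat{\delta}_i$ as

$$\hat{\delta}_i = [\hat{\alpha}^T, -1]\,u_i + \hat{\beta} = \sum_{j=1}^{n} \hat{\alpha}_j x_{ij} + \hat{\beta} - y_i \tag{5.4.56}$$

Let us define $X$ as the m × n matrix with current element $x_{ij}$ and the vector $y_m$ as the vector with element $y_i$. With $J_m$ the m-vector whose components are all equal to one, the m-vector δ of residuals $\delta_i$ is

$$\delta = X\alpha + \beta J_m - y_m \tag{5.4.57}$$

and differentiates as

$$\mathrm{d}\delta = X\,\mathrm{d}\alpha + J_m\,\mathrm{d}\beta \tag{5.4.58}$$

$e_i$ being the ith unit vector with (m − 1) null components and the ith element equal to 1, the m × m weight matrix $W$ is defined as

$$W = \sum_{i=1}^{m} e_i w_i e_i^T \tag{5.4.59}$$

$c^2$ can therefore be written in the compact matrix form

$$c^2 = \delta^T W^{-1}\delta \tag{5.4.60}$$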
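In order to be able to proceed with differentiation, let us partition the covariance matrix of the ith measurement as

$$S_i = \begin{bmatrix} S_i^x & c_i \\ c_i^T & s_{iy}^2 \end{bmatrix} \tag{5.4.61}$$

where $S_i^x$ is the n × n covariance matrix of the n $x_{ij}$ (j = 1, ..., n), $c_i$ the vector of the covariances $\mathrm{cov}(x_{ij}, y_i)$, and $s_{iy}^2$ the variance of $y_i$. Using equation (5.4.50), $w_i$ can be expressed as

$$w_i = \alpha^T S_i^x\alpha - 2\alpha^T c_i + s_{iy}^2 \tag{5.4.62}$$

The last equation differentiates as

$$\mathrm{d}w_i = 2\,\mathrm{d}\alpha^T(S_i^x\alpha - c_i) = 2\,\mathrm{d}\alpha^T\varepsilon_i \tag{5.4.63}$$

where the vector $\varepsilon_i$ defined as

$$\varepsilon_i = S_i^x\alpha - c_i \tag{5.4.64}$$

differentiates as

$$\mathrm{d}\varepsilon_i = S_i^x\,\mathrm{d}\alpha \tag{5.4.65}$$

The differential of $c^2$ is

$$\mathrm{d}c^2 = 2\,\mathrm{d}\delta^T W^{-1}\delta - \sum_{i=1}^{m} w_i^{-2}\delta_i^2\,\mathrm{d}w_i$$

or, after expression (5.4.58) for dδ has been used

$$\mathrm{d}c^2 = 2\,\mathrm{d}\alpha^T\left[X^T W^{-1}\delta - \sum_{i=1}^{m} (w_i^{-2}\delta_i^2)\,\varepsilon_i\right] + 2J_m^T W^{-1}\delta\,\mathrm{d}\beta \tag{5.4.66}$$

Programming being usually easier with matrix forms, we define $E$ as being the m × n matrix made by the m vectors $\varepsilon_i^T$, and $z^{(a,b)}$ an m-vector with $w_i^{-a}\delta_i^b$ in ith position. Equation (5.4.66) can be rearranged as

$$\mathrm{d}c^2 = 2\,\mathrm{d}\alpha^T\left[X^T W^{-1}\delta - E^T z^{(2,2)}\right] + 2J_m^T W^{-1}\delta\,\mathrm{d}\beta$$

At the minimum, the partial derivatives of $c^2$ with respect to each variable must be zero, i.e., all the components of the gradient $g(\alpha, \beta)$ of $c^2$ in the space of the (n + 1) variables α and β must vanish. Canceling the coefficients of $\mathrm{d}\alpha^T$ gives the vector of the first n components of $g(\alpha, \beta)$ through the vector equation

$$\begin{bmatrix} g_1 \\ \vdots \\ g_n \end{bmatrix} = X^T W^{-1}\delta - \sum_{i=1}^{m} (w_i^{-2}\delta_i^2)\,\varepsilon_i = 0_n \tag{5.4.67}$$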
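while the coefficient of dβ gives the additional relation

$$g_{n+1}(\alpha, \beta) = J_m^T W^{-1}\delta = \sum_{i=1}^{m} \frac{\delta_i}{w_i} = 0 \tag{5.4.68}$$

The components of the weighted residual vector $W^{-1}\delta$ therefore sum up to zero. The most natural way of finding the minimum of $c^2$, i.e., of solving the system of equations (5.4.67) and (5.4.68) in the (n + 1) unknowns $\alpha_1, \ldots, \alpha_n, \beta$, is to use the Newton-Raphson method. In order to implement this method described in Chapter 3, the derivatives of each gradient component $g_i$ (i = 1, ..., n + 1) relative to each unknown are needed. Using the language of optimization, these derivatives are the elements of the (n + 1) × (n + 1) symmetric Hessian matrix $H$ of $c^2$. Let us partition the matrix $H$ as

$$H = \begin{bmatrix} H_{xx} & h_{xy} \\ h_{yx}^T & h_{yy} \end{bmatrix} \tag{5.4.69}$$

using the n × n submatrix $H_{xx}$, the n-vectors $h_{xy}$ and $h_{yx}$ and the scalar $h_{yy}$. The ijth element of $H_{xx}$ is $\partial g_i/\partial\alpha_j = \partial^2 c^2/\partial\alpha_i\partial\alpha_j$, the ith element of $h_{xy}$ and $h_{yx}$ are $\partial g_i/\partial\beta$ and $\partial g_{n+1}/\partial\alpha_i$, respectively, both equal to $\partial^2 c^2/\partial\alpha_i\partial\beta$, while $h_{yy}$ is $\partial g_{n+1}/\partial\beta = \partial^2 c^2/\partial\beta^2$. Differentiation of the first term in equation (5.4.67) gives

$$\mathrm{d}(X^T W^{-1}\delta) = X^T\,\mathrm{d}(W^{-1})\,\delta + X^T W^{-1}(X\,\mathrm{d}\alpha + J_m\,\mathrm{d}\beta) \tag{5.4.70}$$

$e_i^T X$ is the ith row of $X$ so we will refer to it as $x_i^T$ whereas $e_i^T\delta$ is simply $\delta_i$. Calculation of the first term on the right-hand side using equation (5.4.63) leads to

$$X^T\,\mathrm{d}(W^{-1})\,\delta = -2\sum_{i=1}^{m} (w_i^{-2}\delta_i)\,x_i\varepsilon_i^T\,\mathrm{d}\alpha \tag{5.4.71}$$

Differentiation of the last term in equation (5.4.67) gives

$$\mathrm{d}\left[\sum_{i=1}^{m} (w_i^{-2}\delta_i^2)\,\varepsilon_i\right] = -4\sum_{i=1}^{m} (w_i^{-3}\delta_i^2)\,\varepsilon_i\varepsilon_i^T\,\mathrm{d}\alpha + 2\sum_{i=1}^{m} (w_i^{-2}\delta_i)\,\varepsilon_i e_i^T(X\,\mathrm{d}\alpha + J_m\,\mathrm{d}\beta) + \sum_{i=1}^{m} (w_i^{-2}\delta_i^2)\,S_i^x\,\mathrm{d}\alpha \tag{5.4.72}$$

where use has been made of equation (5.4.57). The sum of the coefficients of dα in equations (5.4.70) to (5.4.72) are the elements of $H_{xx}$, therefore

$$H_{xx} = \sum_{i=1}^{m}\left[w_i^{-1}x_i x_i^T + 4w_i^{-3}\delta_i^2\,\varepsilon_i\varepsilon_i^T - 2w_i^{-2}\delta_i\,(x_i\varepsilon_i^T + \varepsilon_i x_i^T) - w_i^{-2}\delta_i^2\,S_i^x\right]$$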
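or, in a compact form

$$H_{xx} = X^T W^{-1}X + 4E^T Z^{(3,2)}E - 2\left[E^T Z^{(2,1)}X + X^T Z^{(2,1)}E\right] - \sum_{i=1}^{m} (w_i^{-2}\delta_i^2)\,S_i^x \tag{5.4.73}$$

where $Z^{(a,b)}$ is an m × m diagonal matrix with $w_i^{-a}\delta_i^b$ in ith position. Because $e_i^T J_m$ is equal to one, the coefficients of dβ give the vector $h_{xy}$ as

$$h_{xy} = X^T W^{-1}J_m - 2\sum_{i=1}^{m} (w_i^{-2}\delta_i)\,\varepsilon_i \tag{5.4.74}$$

Differentiation of $g_{n+1}(\alpha, \beta)$ in equation (5.4.68) is likewise

$$\mathrm{d}(J_m^T W^{-1}\delta) = J_m^T\,\mathrm{d}(W^{-1})\,\delta + J_m^T W^{-1}(X\,\mathrm{d}\alpha + J_m\,\mathrm{d}\beta) \tag{5.4.75}$$

For the first term on the right-hand side, we get

$$J_m^T\,\mathrm{d}(W^{-1})\,\delta = -2\sum_{i=1}^{m} w_i^{-2}\delta_i\,\varepsilon_i^T\,\mathrm{d}\alpha \tag{5.4.76}$$

The sum of the coefficients of dα in equations (5.4.75) and (5.4.76) are the elements of $h_{yx}$ which are, as expected, the transpose of $h_{xy}$ given by equation (5.4.74), while the coefficient of dβ in equation (5.4.75) gives the scalar $h_{yy}$

$$h_{yy} = J_m^T W^{-1}J_m \tag{5.4.77}$$

The Newton-Raphson scheme prescribes the updating formula

$$H^{(k)}\left(\begin{bmatrix} \alpha \\ \beta \end{bmatrix}^{(k+1)} - \begin{bmatrix} \alpha \\ \beta \end{bmatrix}^{(k)}\right) = -g^{(k)}$$

or

$$\begin{bmatrix} \alpha \\ \beta \end{bmatrix}^{(k+1)} = \begin{bmatrix} \alpha \\ \beta \end{bmatrix}^{(k)} - \left[H^{(k)}\right]^{-1}g^{(k)} \tag{5.4.78}$$

In the vicinity of the minimum, $H$ should be positive-definite. This may not be the case everywhere, in which case there is a small but real danger of iterating towards a saddle instead of the minimum. It is therefore highly advisable, especially when the data scatter about the best-fit straight line, plane, or hyper-plane, to use the best possible initial estimate. Most commonly, one of the linear estimates (Section 5.1) will be good enough. The estimates (adjusted values) $\hat{x}_i$ and $\hat{y}_i$ of $\bar{x}_i$ and $\bar{y}_i$ are calculated from equations (5.4.52) and (5.4.56) combined as

$$\hat{u}_i = u_i - \hat{w}_i^{-1}\hat{\delta}_i\,S_i\begin{bmatrix} \hat{\alpha} \\ -1 \end{bmatrix} \tag{5.4.79}$$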
It is estimated that, for reasonably good fits, the linear results hold, i.e., $\hat{c}^2$ is distributed as a chi-squared variable with (m − n − 1) degrees of freedom, since n slopes and one intercept are adjusted, and

$$\mathcal{E}(\hat{c}^2) = m - n - 1 \tag{5.4.80}$$

which makes it possible to test whether the data can be fitted by a linear relationship within a given confidence interval. Moreover, if m ≫ n, the contribution $\hat{c}_i^2$ of each observation to $\hat{c}^2$ given by

$$\hat{c}_i^2 = \hat{w}_i^{-1}\hat{\delta}_i^2 \tag{5.4.81}$$
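ideally should be equal and distributed as a chi-squared variable with 'nearly' one degree of freedom. Using statistical tables, we decide that the probability of $\hat{c}_i^2$ to exceed 3.84 is only 5 percent, so each observation with $\hat{c}_i^2$ much in excess of that value can be considered as unreliable. For a more elaborate discussion of outlier detection, the reader could refer to Kent et al. (1990).

Now, we would like to attach a variance to the estimates of α and β that make $c^2$ minimum. Given the complex and non-linear nature of the gradient equations (5.4.67) and (5.4.68), we assume for simplicity that $\hat{\alpha}$ and $\hat{\beta}$ are normally distributed and resort to linear propagation in order to retrieve an estimate for the covariance matrix of $\hat{\alpha}$ and $\hat{\beta}$. The covariance matrix $S_{\alpha\beta}$ of the vector $(\hat{\alpha}, \hat{\beta})$ is

$$S_{\alpha\beta} = \mathcal{E}\left\{\begin{bmatrix} \mathrm{d}\alpha \\ \mathrm{d}\beta \end{bmatrix}\begin{bmatrix} \mathrm{d}\alpha^T & \mathrm{d}\beta \end{bmatrix}\right\} \tag{5.4.82}$$

where dα and dβ are infinitesimal changes about the expected value. As described in Chapter 4, we let $X$ and $y$ together with α and β change slightly about the minimum value. For δ, we get

$$\mathrm{d}\delta = X\,\mathrm{d}\alpha + J_m\,\mathrm{d}\beta + \mathrm{d}X\,\alpha - \mathrm{d}y \tag{5.4.83}$$

The differential of $g(\alpha, \beta)$

$$\mathrm{d}g = \hat{H}\begin{bmatrix} \mathrm{d}\alpha \\ \mathrm{d}\beta \end{bmatrix} + \begin{bmatrix} X^T\hat{W}^{-1}(\mathrm{d}X\,\hat{\alpha} - \mathrm{d}y) + \mathrm{d}X^T\hat{W}^{-1}\hat{\delta} \\ J_m^T\hat{W}^{-1}(\mathrm{d}X\,\hat{\alpha} - \mathrm{d}y) \end{bmatrix}$$

is zero at the minimum, which results in

$$\hat{H}\begin{bmatrix} \mathrm{d}\alpha \\ \mathrm{d}\beta \end{bmatrix} = -\begin{bmatrix} X^T\hat{W}^{-1}(\mathrm{d}X\,\hat{\alpha} - \mathrm{d}y) + \mathrm{d}X^T\hat{W}^{-1}\hat{\delta} \\ J_m^T\hat{W}^{-1}(\mathrm{d}X\,\hat{\alpha} - \mathrm{d}y) \end{bmatrix} \tag{5.4.84}$$

where, as usual, the hat refers to values at minimum that should be treated as constants. The transpose expression is given by

$$\begin{bmatrix} \mathrm{d}\alpha^T & \mathrm{d}\beta \end{bmatrix}\hat{H}^T = -\begin{bmatrix} (\mathrm{d}X\,\hat{\alpha} - \mathrm{d}y)^T\hat{W}^{-1}X + \hat{\delta}^T\hat{W}^{-1}\mathrm{d}X & (\mathrm{d}X\,\hat{\alpha} - \mathrm{d}y)^T\hat{W}^{-1}J_m \end{bmatrix} \tag{5.4.85}$$

In order to obtain $S_{\alpha\beta}$, we will multiply equation (5.4.84) by equation (5.4.85) and take the expected value. What the products mean must be made explicit in a little more detail. The ikth element of the m × m matrix $\mathcal{E}\{[\mathrm{d}X\,\hat{\alpha} - \mathrm{d}y][\mathrm{d}X\,\hat{\alpha} - \mathrm{d}y]^T\}$ is

$$\mathcal{E}\left\{\left(\sum_{j=1}^{n} \mathrm{d}x_{ij}\hat{\alpha}_j - \mathrm{d}y_i\right)\left(\sum_{j=1}^{n} \mathrm{d}x_{kj}\hat{\alpha}_j - \mathrm{d}y_k\right)\right\}$$

and, since different measurements were assumed to be unrelated, it is zero for i ≠ k. The ith diagonal element is simply $\hat{w}_i$ and the matrix

$$\mathcal{E}\left\{[\mathrm{d}X\,\hat{\alpha} - \mathrm{d}y][\mathrm{d}X\,\hat{\alpha} - \mathrm{d}y]^T\right\} \approx \hat{W} \tag{5.4.86}$$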
We first observe that 1
£= £
wk-xdkdxkj
The ./7th element of
since the measurements are uncorrelated. Arranging this result in a matrix form gives
Finally, we evaluate the m x n matrix ^[(dXat-dy^d^W'1^. this matrix is
The i/th element of
Uwr'd £ (5.4.88) since expected values vanish for /c / f. The term in brackets is the 7th term of the vector £t = Sfx — ct and the matrix E is therefore a suitable approximation of the matrix g[{dXz-dy){dX* fV~ 1 ^) T ]. Combining equations (5.4.86) to (5.4.88) gives the linear approximation of the matrix fP as
(5.4.89) OJ
0
and the final estimate of S a # as
(5.4.90)
0
i
In the simplest case of a straight line, i.e., for n = 1, all the programs devised over the years to compute isochron parameters and error bars converge toward the same value. As discussed by Kent et al. (1990), most early calculation schemes (York, 1966; York, 1969; McIntyre et al., 1966; Williamson, 1968; Brooks et al., 1972) now look rather awkward and unnecessarily complicated. This situation simply reflects that the need for exhaustive error handling in linear estimation came from geochronology in the first place and not from statistical sciences. However, the output of these early computational schemes should not necessarily be considered incorrect. Similarly, even if not expressed in a full matrix form, Albarède and Provost's (1977) solution to mass-balance equations (β = 0) is a Newton-Raphson scheme in its own right. The matrix-oriented solution which has just been established is based on Kent et al. (1990) for the Newton-Raphson scheme, although the reader is warned about a typo in their equation (2.16). For isochrons, this method gives results indistinguishable from those obtained by Minster et al.'s (1979) implementation of Williamson's scheme. Finally, expressions for the assessment of the influence of individual observations have been given by Kent et al. (1990).

The $^{206}$Pb/$^{204}$Pb and $^{207}$Pb/$^{204}$Pb ratios of 11 Archean granites from Zimbabwe are given in Table 5.23 (courtesy Beatrice Luais) and shown in Figure 5.13. Together with the mean ratio values, the Table shows the in-run standard deviation of the mean of each ratio and of the $^{207}$Pb/$^{206}$Pb ratio. This ratio, being close to unity, is measured with better precision, which introduces a rather strong correlation between the variables $x$ = $^{206}$Pb/$^{204}$Pb and $y$ = $^{207}$Pb/$^{204}$Pb. Check whether these data form a statistically significant alignment. Give the age of the isochron $y = \alpha x + \beta$ with error limits. In the present case, the vector α reduces to the scalar α.
Table 5.23. Pb isotope composition of Archean granites from Zimbabwe (courtesy Beatrice Luais). Covariances in the last column are calculated as explained in the text.

Sample #   206Pb/204Pb   s(206Pb/204Pb)   207Pb/204Pb   s(207Pb/204Pb)   s(207Pb/206Pb)   cov(206Pb/204Pb, 207Pb/204Pb)
1          18.073        0.018            15.707        0.016            0.00043          0.000253
2          16.714        0.017            15.341        0.015            0.00046          0.000223
3          33.747        0.034            18.951        0.019            0.00028          0.000567
4          32.376        0.032            18.694        0.019            0.00029          0.000532
5          17.488        0.017            15.576        0.016            0.00045          0.000238
6          14.262        0.014            14.923        0.015            0.00052          0.000184
7          17.579        0.018            15.597        0.016            0.00044          0.000254
8          18.386        0.018            15.712        0.016            0.00043          0.000252
9          15.839        0.016            15.177        0.015            0.00048          0.000210
10         17.398        0.017            15.496        0.015            0.00045          0.000221
11         17.756        0.018            15.552        0.016            0.00044          0.000253

Figure 5.13 $^{206}$Pb/$^{204}$Pb and $^{207}$Pb/$^{204}$Pb ratios of Archean granites from Zimbabwe (B. Luais, unpub. data) and the least-square straight-line (isochron). Horizontal axis: $^{206}$Pb/$^{204}$Pb; vertical axis: $^{207}$Pb/$^{204}$Pb. (See text for results.)

We first have to calculate the covariance between the $^{206}$Pb/$^{204}$Pb and $^{207}$Pb/$^{204}$Pb ratios out of the variances on $^{206}$Pb/$^{204}$Pb, $^{207}$Pb/$^{204}$Pb and $^{207}$Pb/$^{206}$Pb. Given the ratio $r = a/b$ of two quantities $a$ and $b$, we first propagate the variance on $r$ from that on $a$ and $b$. This can be done by taking the log-derivative of the ratio through

$$\frac{\mathrm{d}r}{r} = \frac{\mathrm{d}a}{a} - \frac{\mathrm{d}b}{b}$$

Squaring and taking the expectation, we get, provided $a$ and $b$ are significantly different from zero

$$\frac{\mathrm{var}(r)}{r^2} \approx \frac{\mathrm{var}(a)}{a^2} + \frac{\mathrm{var}(b)}{b^2} - 2\,\frac{\mathrm{cov}(a, b)}{ab}$$

which gives the covariance between $a$ and $b$. Replacing $a$, $b$, and $r$ by the appropriate isotopic ratios, it becomes

$$\mathrm{cov}\left(\frac{^{206}\mathrm{Pb}}{^{204}\mathrm{Pb}}, \frac{^{207}\mathrm{Pb}}{^{204}\mathrm{Pb}}\right) = \frac{1}{2}\,\frac{^{206}\mathrm{Pb}}{^{204}\mathrm{Pb}}\,\frac{^{207}\mathrm{Pb}}{^{204}\mathrm{Pb}}\left[\frac{\mathrm{var}(^{206}\mathrm{Pb}/^{204}\mathrm{Pb})}{(^{206}\mathrm{Pb}/^{204}\mathrm{Pb})^2} + \frac{\mathrm{var}(^{207}\mathrm{Pb}/^{204}\mathrm{Pb})}{(^{207}\mathrm{Pb}/^{204}\mathrm{Pb})^2} - \frac{\mathrm{var}(^{207}\mathrm{Pb}/^{206}\mathrm{Pb})}{(^{207}\mathrm{Pb}/^{206}\mathrm{Pb})^2}\right]$$
which gives the covariances listed in the last column of Table 5.23. The covariance matrix can now be calculated. For instance $S_1$ is

$$S_1 = \begin{bmatrix} (0.018)^2 & 0.000253 \\ 0.000253 & (0.016)^2 \end{bmatrix}$$

The 11 points of the isochron diagram are drawn in Figure 5.13. Their respective error ellipses (1σ error) could be drawn using the method described in Section 2.4, but errors are too small for the ellipses to be clearly seen. In order to get a good starting value, we could solve the standard unweighted least-square set of 11 equations. Convergence properties are better shown, however, by taking nearly arbitrary starting values, e.g., α = 0.1 and β = 15. Some other choices may not converge, which shows that saddle points of the function $c^2$ distant from the actual minimum do exist. It would be too space-consuming to give every detail of the calculation. For the reader who needs some practice, it may help to have some intermediate results before the iterations start. $\varepsilon_1$ initially equals $0.018^2 \times 0.1 - 0.000253 = -0.000221$. The 11 × 1 matrix $E$ is equal to $0.001 \times [-0.221, -0.194, -0.451, -0.430, -0.209, -0.164, -0.222, -0.219, -0.184, -0.192, -0.221]^T$. The weights $w_i$ are calculated from equation (5.4.50), and the diagonal elements of $W^{-1}$ are equal to [4794, 5456, 3857, 3776, 4731, 5258, 4799, 4787, 5389, 5442, 4794]. The gradient $g$ is initially equal to $[911\,463, 53\,218]^T$ and the Hessian matrix $H$ to

$$H = 10^{7} \times \begin{bmatrix} 2.4826 & 0.1128 \\ 0.1128 & 0.0053 \end{bmatrix}$$

which gives the updated value

$$\begin{bmatrix} \alpha \\ \beta \end{bmatrix}_{\rm new} = \begin{bmatrix} 0.1 \\ 15 \end{bmatrix} - 10^{-3} \times \begin{bmatrix} 0.001174 & -0.02496 \\ -0.02496 & 0.5492 \end{bmatrix}\begin{bmatrix} 911\,463 \\ 53\,218 \end{bmatrix} = \begin{bmatrix} 0.3578 \\ 8.5188 \end{bmatrix}$$
Successive iterations converge extremely fast. After the fifth step, the results hardly change (Table 5.24) which, using the Newton method outlined in Section 3.1, indicates an age of T = 2.9065 Ga. The value of $\hat{c}^2$ = 83.45 can be compared with the 95th percentile of a chi-squared variable with 11 − 2 = 9 degrees of freedom (16.9) and seems much too high for the data set to form a statistically acceptable isochron. Table 5.25 shows that point 6 with a $\hat{c}_6^2$ value of 19.41 is particularly suspect.

Table 5.24. Iterative refinement of the isochron parameters (slope α and intercept β) for the lead isotope data listed in Table 5.23.

Iter. k   α         β        c2
0         0.1       15       7472
1         0.3578    8.5188   101882
2         0.26112   10.783   8247
3         0.21983   11.677   327.6
4         0.21056   11.861   83.94
5         0.21012   11.870   83.45
6         0.21012   11.870   83.45

Table 5.25. Partial reduced residuals $\hat{c}_i^2 = \hat{w}_i^{-1}\hat{\delta}_i^2$ and adjusted values of isotope compositions for the isotopic data of Table 5.23.

Sample i   c_i2    Adjusted 206Pb/204Pb   Adjusted 207Pb/204Pb
1          9.587   18.028                 15.658
2          10.16   16.760                 15.392
3          0.586   33.765                 18.965
4          2.762   32.339                 18.665
5          6.077   17.455                 15.538
6          19.41   14.211                 14.856
7          6.822   17.541                 15.556
8          2.724   18.410                 15.738
9          2.678   15.861                 15.203
10         5.320   17.431                 15.532
11         14.50   17.811                 15.612

Further elaboration on the data is a matter of individual appreciation. Leaving out the data with high $\hat{c}_i^2$ values may be justified only if supported by field or microscopic evidence. Multiplying the errors by a common factor, in such a way that $\hat{c}^2$ becomes exactly equal to the number of degrees of freedom (here m − n − 1 = 9), is common practice. Doing so, however, conceals the surrender of the critical test of whether the data do or do not define an isochron. We may rather question the use of the in-run covariance matrix as an estimate of the sampling error: in-run errors can be made almost as small as one wishes by increasing the number of ratios measured, which makes little physical sense. We can instead use the covariance matrix of sample replicates, when available, or the covariance matrix of standard replicates (e.g., NBS 981).

Estimating propagated errors requires some more effort of matrix programming. $E$ does not actually change very significantly over the iterations. The covariance matrix is

$$S_{\alpha\beta} = 10^{-3} \times \begin{bmatrix} 0.000\,399\,7 & -0.007\,857 \\ -0.007\,857 & 0.169\,03 \end{bmatrix}$$

which gives $s_\alpha$ = 0.000 632 or an $s_T \approx (0.000\,632/0.210\,121) \times 2.9065 = 0.0087$ Ga, $s_\beta$ = 0.0130 and a correlation coefficient between α and β of −0.956. The negative correlation reflects the property quoted above that the least-square straight line goes through the mean point. The corresponding error ellipse is drawn in Figure 5.14.

Figure 5.14 The 1σ error-ellipse for the $^{207}$Pb/$^{204}$Pb vs $^{206}$Pb/$^{204}$Pb isochron of Figure 5.13. A strong anticorrelation is generally expected between errors on the slope α and intercept β.
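The straight-line case (n = 1) of the Newton-Raphson scheme can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the original program: the function names `pieces` and `newton_step` are invented here, the covariances are recomputed from the three standard deviations of Table 5.23 rather than read from its last column, and the iteration starts from the unweighted least-square line instead of the arbitrary values α = 0.1 and β = 15 used in the text. Because the tabulated data are rounded, the converged values should only be expected to agree approximately with Tables 5.24 and 5.25.

```python
import numpy as np

# Table 5.23: x = 206Pb/204Pb, y = 207Pb/204Pb, with in-run standard deviations
x = np.array([18.073, 16.714, 33.747, 32.376, 17.488, 14.262,
              17.579, 18.386, 15.839, 17.398, 17.756])
sx = np.array([0.018, 0.017, 0.034, 0.032, 0.017, 0.014,
               0.018, 0.018, 0.016, 0.017, 0.018])
y = np.array([15.707, 15.341, 18.951, 18.694, 15.576, 14.923,
              15.597, 15.712, 15.177, 15.496, 15.552])
sy = np.array([0.016, 0.015, 0.019, 0.019, 0.016, 0.015,
               0.016, 0.016, 0.015, 0.015, 0.016])
sr = 1e-5 * np.array([43, 46, 28, 29, 45, 52, 44, 43, 48, 45, 44])  # s(207Pb/206Pb)

# Covariance of x and y deduced from the variance of their ratio r = y / x
r = y / x
cxy = 0.5 * x * y * ((sx / x) ** 2 + (sy / y) ** 2 - (sr / r) ** 2)


def pieces(alpha, beta):
    """Residuals, weights and eps_i for the straight-line case."""
    delta = alpha * x + beta - y                       # (5.4.56)
    w = alpha**2 * sx**2 - 2.0 * alpha * cxy + sy**2   # (5.4.62)
    eps = sx**2 * alpha - cxy                          # (5.4.64)
    return delta, w, eps


def newton_step(alpha, beta):
    """One Newton-Raphson update of (alpha, beta), equation (5.4.78)."""
    delta, w, eps = pieces(alpha, beta)
    g = np.array([np.sum(x * delta / w - delta**2 * eps / w**2),   # (5.4.67)
                  np.sum(delta / w)])                              # (5.4.68)
    hxx = np.sum(x**2 / w + 4 * delta**2 * eps**2 / w**3
                 - 4 * delta * x * eps / w**2
                 - delta**2 * sx**2 / w**2)                        # (5.4.73)
    hxy = np.sum(x / w - 2 * delta * eps / w**2)                   # (5.4.74)
    hyy = np.sum(1.0 / w)                                          # (5.4.77)
    H = np.array([[hxx, hxy], [hxy, hyy]])
    return np.array([alpha, beta]) - np.linalg.solve(H, g)


# Start from the unweighted least-square line (assumed initial estimate)
alpha, beta = np.polyfit(x, y, 1)
for _ in range(50):
    new = newton_step(alpha, beta)
    converged = np.allclose(new, [alpha, beta], rtol=0, atol=1e-12)
    alpha, beta = new
    if converged:
        break

delta, w, _ = pieces(alpha, beta)
c2_i = delta**2 / w                          # contribution of each observation, (5.4.81)
c2 = c2_i.sum()                              # compare with a chi-squared, 11 - 2 = 9 d.o.f.
suspect = np.flatnonzero(c2_i > 3.84) + 1    # sample numbers above the 5 percent limit
```

Solving the 2 × 2 system with `np.linalg.solve` rather than inverting $H$ is a numerical convenience; both are equivalent to the update formula for a well-conditioned Hessian.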
5.5 Gradient projection and the total inverse
A more general least-square technique applicable to non-linear problems with non-linear constraints has been put forward by Tarantola and Valette (1982) and is described at length in Tarantola (1987). The basic idea is that we rarely start computing a model without having an idea of what the model parameters should be. In other words, data and model parameters should be treated as a unique set of parameters and assigned a unique covariance matrix, the model then being treated as a set of constraints applied to the set as a whole. Let us consider, for instance, in the $(x, y)$ plane a set of three points $(u_j, v_j)$, $j = 1, 2, 3$, through which we want to fit a straight line with equation $v = au + b$. To the set of six 'data' $u_j$ and $v_j$, we will add the model parameters $a$ and $b$. Let us consider, for simplicity, that all the data have unit variance and uncorrelated errors. We can lump the data $u_j$, $v_j$ and the model parameters $a$ and $b$ together into a single $6 + 2 = 8$-vector $x$ for which we think an initial guess $x_0$ is acceptable. In an approach similar to the gradient projection method of Rosen (1961), Tarantola and Valette (1982) assume that finding the total inverse solution amounts to finding an estimate $\hat{x}$ of $x$ which minimizes the squared distance to the initial guess $(x - x_0)^T(x - x_0)$ with three constraints, such that

$$v_i - a u_i - b = 0 \quad (i = 1, 2, 3) \tag{5.5.1}$$

or any equivalent expression relating the components of the vector $x$. Let us start with the illustrative case where we seek to minimize the distance $(x - x_0)^T(x - x_0)$ subject to one constraint which we write $g(x) = 0$. We want to minimize the sum $c^2$ such as

$$c^2 = (x - x_0)^T(x - x_0) - 2\lambda g(x) \tag{5.5.2}$$

where $\lambda$ is a Lagrange multiplier. Differentiating relative to $x$, one gets

$$dc^2 = 2\,dx^T(x - x_0 - \lambda\,\mathrm{grad}\,g) \tag{5.5.3}$$

where use is made of the relationship

$$dg = \sum_i \frac{\partial g}{\partial x_i}\,dx_i = (\mathrm{grad}\,g)^T dx = dx^T\,\mathrm{grad}\,g$$

Therefore, $c^2$ is minimum for

$$x - x_0 = \lambda\,\mathrm{grad}\,g \tag{5.5.4}$$

Left-multiplying by $(\mathrm{grad}\,g)^T$, equation (5.5.4) becomes

$$(\mathrm{grad}\,g)^T(x - x_0) = \lambda\,(\mathrm{grad}\,g)^T\,\mathrm{grad}\,g$$

or

$$\lambda = [(\mathrm{grad}\,g)^T\,\mathrm{grad}\,g]^{-1}(\mathrm{grad}\,g)^T(x - x_0) \tag{5.5.5}$$

Therefore

$$x - x_0 = \mathrm{grad}\,g\,[(\mathrm{grad}\,g)^T\,\mathrm{grad}\,g]^{-1}(\mathrm{grad}\,g)^T(x - x_0)$$

On the right-hand side, we recognize the projector $P$ onto the gradient of $g$

$$P = \mathrm{grad}\,g\,[(\mathrm{grad}\,g)^T\,\mathrm{grad}\,g]^{-1}(\mathrm{grad}\,g)^T$$

More generally, if there are $n$ data points, $n$ Lagrange multipliers $\lambda_j$ will be needed. Let $m$ be the sum of the number of observations and the number of parameters. The general function $c^2$ to minimize will be

$$c^2 = (x - x_0)^T(x - x_0) - 2\sum_{j=1}^{n}\lambda_j g_j(x)$$

Upon differentiation, we obtain

$$dc^2 = 2\,dx^T\Big(x - x_0 - \sum_{j=1}^{n}\lambda_j\,\mathrm{grad}\,g_j\Big) \tag{5.5.6}$$
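The single-constraint projector $P$ can be illustrated numerically. The following is a minimal sketch assuming NumPy; the function name `gradient_projector`, the gradient vector and the displacement are hypothetical values chosen only for illustration.

```python
import numpy as np

def gradient_projector(grad_g):
    """Projector onto the direction of a single constraint gradient."""
    a = np.asarray(grad_g, dtype=float).reshape(-1, 1)   # column vector
    return a @ a.T / (a.T @ a).item()

# Hypothetical gradient of one constraint in a 3-parameter space
grad_g = np.array([1.0, 2.0, -1.0])
P = gradient_projector(grad_g)

d = np.array([0.3, -0.1, 0.5])     # hypothetical displacement x - x0
print(P @ d)                       # component of d along grad g
print(np.allclose(P @ P, P))       # idempotent: projecting twice changes nothing
print(np.allclose(P, P.T))         # symmetric: the projection is orthogonal
```

A vector parallel to the gradient is left unchanged by $P$, while any vector orthogonal to it is mapped to zero.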
Let us rewrite the term under the summation sign in matrix form. We now lump the $n$ Lagrange multipliers into the vector $\lambda$, and the $g_j(x)$ into a vector $g(x)$. We further define the $m \times n$ matrix $F$ of partial derivatives by its current term $f_{ij}$, which is the derivative of the $j$th constraint with respect to the $i$th parameter, as

$$f_{ij} = \frac{\partial g_j}{\partial x_i}$$

so that

$$\sum_{j=1}^{n}\lambda_j\,\mathrm{grad}\,g_j = F\lambda$$

The minimum of $c^2$ is for $\lambda$ such that

$$x - x_0 = F\lambda \tag{5.5.8}$$

Pre-multiplying by $F^T$ gives

$$F^T(x - x_0) = F^TF\lambda$$

and therefore

$$\lambda = (F^TF)^{-1}F^T(x - x_0) \tag{5.5.9}$$

Inserting that value of $\lambda$ into equation (5.5.8) gives

$$x - x_0 = F(F^TF)^{-1}F^T(x - x_0)$$

where we recognize the projector $F(F^TF)^{-1}F^T$ onto the subspace of constraint gradients. Since the constraints $g(x) = 0$ are to be satisfied at the minimum, the following form is suggested

$$x - x_0 = F(F^TF)^{-1}[F^T(x - x_0) - g(x)] \tag{5.5.10}$$

Tarantola and Valette (1982) recommend the fixed-point algorithm with the new step being calculated from the old step as

$$x_{new} = x_0 + F_{old}(F_{old}^TF_{old})^{-1}[F_{old}^T(x_{old} - x_0) - g(x_{old})] \tag{5.5.11}$$
$x_0$ being the best estimate of the solution which may differ from the initial guess. Implementation of a covariance structure into this numerical scheme is described in Tarantola and Valette (1982). In essence, an a priori covariance structure is assumed for the whole set of observations and parameters, which should be tightened by iterative refinements since we are still dealing with a minimum variance estimate.

Table 5.26. Drift correction during ICP-MS measurements: time (in seconds) and mass-138 intensity (in Mcps) data.

j    $t_j/1000$    $I_{138}(t_j)$
1    0.095         0.298988
2    0.210         0.270547
3    0.325         0.250778
4    0.405         0.242391
5    0.510         0.235427

There are some marked differences between the total inverse of Tarantola and Valette (1982) and other methods, such as those described for isochrons in the previous
section, which allow both the parameters and the observations to 'drift' toward the minimum value of c2. Some information is introduced in the total inverse approach that does not belong in the regular least-square solutions. It is extremely common, for instance, that geological evidence is used to make a strong statement about the age of a granite, while experience tells us that initial Sr isotopic ratios rarely fall outside a given range of values. If we try to make a simplistic statement, both the standard least-square and the total inverse methods amount to projecting the observations on the model subspace (as, e.g., in figure 5.12 with an arbitrary surface instead of the plane). The difference is that, contrary to least-squares, the total inverse includes any a priori knowledge of the parameters in the observations. When it comes to the covariance structure, however, problems become acute. Total inversion requires that a joint probability distribution is known for observations and parameters. This is usually not a problem for observations. The covariance structure among the parameters of the model becomes more obscure: how do we estimate the a priori correlation coefficient between age and initial Sr ratio in our isochron example without infringing seriously the objectivity of error assessment? When the a priori covariance structure between the observations and the model parameters is estimated, the chances that we actually resort to unsupported and unjustified speculation become immense. Total inversion must be well-understood in order for it not to end up as a formal exercise of consistency between a priori and a posteriori estimates. Total inverse produces cumbersome sets of equations, especially when errors are taken into account. As usual, not considering errors amounts to taking uncorrelated variables with unit variances, but offers attractive illustrative properties. 
Examples of application abound in the literature, but do not usually give enough detail for the student to use them as practical illustrative references. For these reasons, a simple illustration with no errors will be presented, and readers interested in a complete treatment should refer to Tarantola (1987).

Signal drift during ICP-MS measurement. Because of interaction between the optics and the ion beam, the sensitivity of the measurement degrades with time. The barium signal at mass 138 (in Mcps, million counts per second, noted $I_{138}$) has been found to decrease with time as given by Table 5.26. It has been found that the rate of change of the signal obeys the law

$$I_{138}(t) = I_{138}(\infty) + [I_{138}(0) - I_{138}(\infty)]\,e^{-\lambda t}$$

where $I_{138}(0)$, $I_{138}(\infty)$, and $\lambda$ represent constants. Given five measurements of $I_{138}$ at five different times, find $I_{138}(0)$, $I_{138}(\infty)$, and $\lambda$ by total inversion. The observations are five values of the time and five values of $I_{138}$, while the parameters are three. We thus have a vector of $m = 13$ variables, which we will rank as $t_1, I_{138}(t_1), \ldots, t_5, I_{138}(t_5), I_{138}(0), I_{138}(\infty)$, and $\lambda$. There are $n = 5$ relationships ($j = 1, \ldots, 5$) relating intensities to time

$$g_j(x) = I_{138}(t_j) - I_{138}(\infty) - [I_{138}(0) - I_{138}(\infty)]\,e^{-\lambda t_j} = 0$$
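The elements of the matrix $F$ are calculated from

$$\frac{\partial g_j}{\partial t_j} = \lambda[I_{138}(0) - I_{138}(\infty)]\,e^{-\lambda t_j}, \qquad \frac{\partial g_j}{\partial I_{138}(t_j)} = 1$$

$$\frac{\partial g_j}{\partial I_{138}(0)} = -e^{-\lambda t_j}, \qquad \frac{\partial g_j}{\partial I_{138}(\infty)} = -(1 - e^{-\lambda t_j}), \qquad \frac{\partial g_j}{\partial \lambda} = t_j[I_{138}(0) - I_{138}(\infty)]\,e^{-\lambda t_j}$$

the derivatives of $g_j$ with respect to $t_k$ and $I_{138}(t_k)$ being zero for $k \neq j$. The best $t_j$ and $I_{138}(t_j)$ a priori values clearly are the values for the observations themselves, while some a priori values must be chosen for the parameters. Examination of the data suggests that $I_{138}(0) = 0.35$ Mcps and $I_{138}(\infty) = 0.22$ Mcps are probably not unreasonable. Much of the signal decrease takes place over the measurement interval, so taking $\lambda = 1/100$ s$^{-1}$ is probably not a bad estimate. We form the initial vector of constraints $g_0(x)$ as

$$g_0(x) = [0.0287, 0.0346, 0.0257, 0.0201, 0.0146]^T$$

and the initial matrix $F_0$ of derivatives, written here in transposed form with one row per constraint (times being expressed in units of 1000 s as in Table 5.26, so that $\lambda = 10$), as

$$F_0^T = \begin{bmatrix}
0.5028 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -0.3867 & -0.6133 & 0.0048 \\
0 & 0 & 0.1592 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & -0.1225 & -0.8775 & 0.0033 \\
0 & 0 & 0 & 0 & 0.0504 & 1 & 0 & 0 & 0 & 0 & -0.0388 & -0.9612 & 0.0016 \\
0 & 0 & 0 & 0 & 0 & 0 & 0.0226 & 1 & 0 & 0 & -0.0174 & -0.9826 & 0.0009 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.0079 & 1 & -0.0061 & -0.9939 & 0.0004
\end{bmatrix}$$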
Applying equation (5.5.11) repeatedly gives the successive approximations listed in Table 5.27. Convergence of the calculation is extremely rapid and can be monitored by checking the squared modulus of the constraint vector $g(x)$, which should be zero at the minimum. It can be checked that, as expected from a projection, different starting values would converge to different solutions.

Table 5.27. Iterative refinement of the time-drift parameters by the total inverse method.

Iteration #          0             1             2             3
$t_1/1000$           0.0950        0.0895        0.0897        0.0897
$I_{138}(t_1)$       0.2990        0.2880        0.2878        0.2878
$t_2/1000$           0.2100        0.2075        0.2077        0.2077
$I_{138}(t_2)$       0.2705        0.2550        0.2549        0.2549
$t_3/1000$           0.3250        0.3247        0.3247        0.3247
$I_{138}(t_3)$       0.2508        0.2449        0.2449        0.2449
$t_4/1000$           0.4050        0.4050        0.4050        0.4050
$I_{138}(t_4)$       0.2424        0.2424        0.2424        0.2424
$t_5/1000$           0.5100        0.5100        0.5100        0.5100
$I_{138}(t_5)$       0.2354        0.2411        0.2411        0.2411
$I_{138}(0)$         0.3500        0.3563        0.3567        0.3567
$I_{138}(\infty)$    0.2200        0.2404        0.2404        0.2404
$1000\,\lambda$      10.000        9.9999        9.9999        9.9999
$g^T(x)g(x)$         3.3 x 10^-3   5.4 x 10^-8   7.4 x 10^-14  1.8 x 10^-19
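The fixed-point iteration (5.5.11) applied to the drift data can be sketched as follows. This is a minimal illustration assuming NumPy, not the authors' program: the function names (`unpack`, `g`, `F`, `total_inverse`) are hypothetical, times are expressed in units of 1000 s as in Table 5.26 (so that the a priori $\lambda$ is 10), and all data are given unit variance and no correlation.

```python
import numpy as np

t_obs = np.array([0.095, 0.210, 0.325, 0.405, 0.510])            # t/1000 (Table 5.26)
I_obs = np.array([0.298988, 0.270547, 0.250778, 0.242391, 0.235427])

def unpack(x):
    """Split the 13-vector into times, intensities, I(0), I(inf), lambda."""
    return x[0:10:2], x[1:10:2], x[10], x[11], x[12]

def g(x):
    """The five constraints g_j(x) relating intensities to time."""
    t, I, I0, Iinf, lam = unpack(x)
    return I - Iinf - (I0 - Iinf) * np.exp(-lam * t)

def F(x):
    """13 x 5 matrix of derivatives, F[i, j] = d g_j / d x_i."""
    t, I, I0, Iinf, lam = unpack(x)
    e = np.exp(-lam * t)
    Fm = np.zeros((13, 5))
    for j in range(5):
        Fm[2 * j, j] = lam * (I0 - Iinf) * e[j]      # d g_j / d t_j
        Fm[2 * j + 1, j] = 1.0                       # d g_j / d I_j
        Fm[10, j] = -e[j]                            # d g_j / d I(0)
        Fm[11, j] = -(1.0 - e[j])                    # d g_j / d I(inf)
        Fm[12, j] = t[j] * (I0 - Iinf) * e[j]        # d g_j / d lambda
    return Fm

def total_inverse(x0, n_iter=10):
    """Fixed-point iteration of equation (5.5.11), x0 being kept fixed."""
    x = x0.copy()
    for _ in range(n_iter):
        Fk, gk = F(x), g(x)
        rhs = Fk.T @ (x - x0) - gk
        x = x0 + Fk @ np.linalg.solve(Fk.T @ Fk, rhs)
    return x

x0 = np.empty(13)
x0[0:10:2] = t_obs
x0[1:10:2] = I_obs
x0[10:] = [0.35, 0.22, 10.0]          # a priori I(0), I(inf), lambda

x_hat = total_inverse(x0)
print(x_hat[10:], g(x_hat) @ g(x_hat))
```

Solving the small $n \times n$ system with `np.linalg.solve` avoids forming the inverse of $F^TF$ explicitly; the squared modulus of `g(x_hat)` printed at the end is the convergence monitor described above.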
5.6 The continuous inverse model

In some cases, what we expect to retrieve from observations is not a set of parameters but rather some function describing the continuous variation of a property. In order to illustrate the nature of this problem, we can take a geophysical analog: given many arrival times at various stations of seismic waves propagating from the focus of an earthquake, can we deduce the continuous function relating the seismic velocity to depth in the Earth? That sort of problem has generated an immense literature in geophysics (Backus and Gilbert, 1967; Parker, 1977; Tarantola, 1987), but is not of very common concern in geochemistry. Inferring spatial distributions of rare gases in minerals from stepwise heating data (Albarede, 1978) and a cooling history from fission track data (Corrigan, 1991) are among the very few but illustrative geochemical examples. Given the very special character of this problem, it will be developed from an example on outgassing data. Let us assume a spherical mineral with radius $R$ which initially contains a gas with concentration $C_0(r)$, $r$ being the radial distance from the center. Upon incremental heating, this gas is lost to the extraction line and at the $i$th heating step, when time is $t_i$, the fraction of initial gas remaining is $f(t_i)$. Loss takes place by radial diffusion with temperature-dependent, hence time-dependent, coefficient $\mathcal{D}(t)$. We assume that the total amount of gas held by the mineral at $t = 0$ is equal to one, i.e., that

$$\left(\frac{4}{3}\pi R^3\right)^{-1}\int_0^R 4\pi r^2 C_0(r)\,dr = 1 \tag{5.6.1}$$

We further change to the dimensionless spatial coordinate $x = r/R$ and introduce the dimensionless time $\tau$ such that

$$\tau = \int_0^t \frac{\mathcal{D}(t')}{R^2}\,dt' \tag{5.6.2}$$

where $t'$ is a dummy integration variable. The radial distribution of concentration is given for the sphere by Carslaw and Jaeger (1959) as

$$C(x, \tau) = \frac{2}{x}\sum_{k=1}^{\infty}\sin(k\pi x)\exp(-k^2\pi^2\tau)\int_0^1 x'\sin(k\pi x')\,C_0(x')\,dx' \tag{5.6.3}$$
An analogous expression can be found for parallel diffusion in Chapter 8. By integration of this expression over the entire sphere volume, the fraction $f(\tau)$ of initial gas remaining at dimensionless time $\tau$ becomes

$$f(\tau) = \frac{6}{\pi}\sum_{k=1}^{\infty}\frac{(-1)^{k+1}}{k}\exp(-k^2\pi^2\tau)\int_0^1 x\sin(k\pi x)\,C_0(x)\,dx \tag{5.6.4}$$

which allows us to formulate the inverse problem: given a set of fractions $f(\tau_i)$ of initial gas remaining when step $i$ is completed and assuming that the $\tau_i$ values can be estimated, derive the initial distribution $C_0(x)$ of the released gas in the heated mineral. We can exchange the order of summations, and rewrite

$$f(\tau) = \frac{6}{\pi}\int_0^1\left[\sum_{k=1}^{\infty}\frac{(-1)^{k+1}}{k}\sin(k\pi x)\exp(-k^2\pi^2\tau)\right]x\,C_0(x)\,dx \tag{5.6.5}$$

Let us define the kernel function $K(x, \tau)$ as

$$K(x, \tau) = \frac{2\sqrt{3}}{\pi}\sum_{k=1}^{\infty}\frac{(-1)^{k+1}}{k}\sin(k\pi x)\exp(-k^2\pi^2\tau) \tag{5.6.6}$$

The kernel function could also be made simpler by using

$$K(x, \tau) \approx x\sqrt{3} - \sqrt{3}\left[\mathrm{erf}\left(\frac{1+x}{2\sqrt{\tau}}\right) - \mathrm{erf}\left(\frac{1-x}{2\sqrt{\tau}}\right)\right] \tag{5.6.7}$$
where the error function erf is defined in Appendix A of Chapter 8 and the approximation is arrived at by using the properties of the theta function (Chapter 8, Appendix B). Let us define the $n \times n$ symmetric matrix $A$ by its current element $a_{ij}$ such that

$$a_{ij} = \int_0^1 K(x, \tau_i)K(x, \tau_j)\,dx = \frac{12}{\pi^2}\int_0^1\sum_{k=1}^{\infty}\frac{1}{k^2}\sin^2(k\pi x)\exp[-k^2\pi^2(\tau_i + \tau_j)]\,dx$$

Using the result

$$\int_0^1\sin^2(k\pi x)\,dx = \frac{1}{2}$$

derived in Section 2.6, we get the form

$$a_{ij} = \frac{6}{\pi^2}\sum_{k=1}^{\infty}\frac{1}{k^2}\exp[-k^2\pi^2(\tau_i + \tau_j)] \tag{5.6.8}$$

In Section 8.6, we derive a useful approximation valid up to 85 percent loss which amounts to

$$a_{ij} \approx 1 - \frac{6}{\sqrt{\pi}}\sqrt{\tau_i + \tau_j} + 3(\tau_i + \tau_j) \tag{5.6.9}$$

Equation (5.6.5) relating the fraction $f(\tau_i)$ remaining at time $\tau_i$ to the distribution $C_0(x)$ can now be written

$$f(\tau_i) = \int_0^1 K(x, \tau_i)[x\sqrt{3}\,C_0(x)]\,dx \tag{5.6.10}$$
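The agreement between the series (5.6.8) and the short-time approximation (5.6.9) can be checked numerically. The sketch below uses only the Python standard library; the function names `a_series` and `a_approx` and the number of terms retained in the series are assumptions made for illustration, and the test value is $\tau_2 + \tau_4$ from the lunar anorthosite example (Table 5.28).

```python
import math

def a_series(s, n_terms=200):
    """Equation (5.6.8): truncated series, with s = tau_i + tau_j."""
    return 6.0 / math.pi**2 * sum(
        math.exp(-(k * math.pi) ** 2 * s) / k**2 for k in range(1, n_terms + 1))

def a_approx(s):
    """Equation (5.6.9): short-time approximation."""
    return 1.0 - 6.0 * math.sqrt(s / math.pi) + 3.0 * s

s = math.exp(-9.45) + math.exp(-4.52)     # tau_2 + tau_4 from Table 5.28
print(s, a_series(s), a_approx(s))
```

For such a small value of $\tau_i + \tau_j$ the two expressions are numerically indistinguishable, which is why the closed form is convenient for building $A$.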
The critical step of the inverse problem is a projection of the unknown function onto the base formed by the kernel functions. This base is not orthogonal and, because the number of kernel functions is finite, cannot describe perfectly the unknown function. For that reason, we write that the unknown function $x\sqrt{3}\,C_0(x)$ is the sum of its projections with component $u_j$ along the $j$th kernel function plus any orthogonal complement labeled $x\sqrt{3}\,C_\perp(x)$ and belonging to the null-space of the kernel functions

$$x\sqrt{3}\,C_0(x) = \sum_{j=1}^{n}u_jK(x, \tau_j) + x\sqrt{3}\,C_\perp(x) = K^T(x)\,u + x\sqrt{3}\,C_\perp(x) \tag{5.6.11}$$

where the $n$ scalar components $u_j$ have been lumped into the vector $u$ and the $n$ kernel functions $K(x, \tau_j)$ into the functional vector $K(x)$. The functions $x\sqrt{3}\,C_\perp(x)$ belonging to the null-space of the kernel functions, i.e., satisfying the property

$$\int_0^1 K(x, \tau_i)[x\sqrt{3}\,C_\perp(x)]\,dx = 0, \quad i = 1, \ldots, n$$

exist whenever the kernel functions are not linearly independent, which is usually the case when the number of observations is large. Assuming for simplicity that the space of the $K(x, \tau_i)$ is full rank, i.e., that $x\sqrt{3}\,C_\perp(x)$ is zero, we write

$$f(\tau_i) = \int_0^1\sum_{j=1}^{n}u_jK(x, \tau_j)K(x, \tau_i)\,dx = \sum_{j=1}^{n}u_j\int_0^1 K(x, \tau_j)K(x, \tau_i)\,dx \tag{5.6.12}$$

or, in matrix notation

$$f = Au \tag{5.6.13}$$

Combining with equation (5.6.11) gives the final result

$$x\sqrt{3}\,C_0(x) = K^T(x)\,A^{-1}f \tag{5.6.14}$$

Table 5.28. Ar outgassing data for lunar anorthosite 15415 (Turner, 1972). $F$ is the remaining fraction of $^{38}$Ar at temperature $T$ °C. $\tau$ is derived from $^{37}$Ar outgassing data because this isotope is artificially produced by a nuclear reaction between fast neutrons and Ca.

Step #    T (°C)    1 - F     ln τ
1         600       0.0061    -12.20
2         700       0.0282    -9.45
3         800       0.107     -6.87
4         900       0.324     -4.52
5         1000      0.611     -2.98
6         1200      0.891     -1.74
Discussing practical aspects of the inverse model such as stability, resolution, and statistical assessment would require considerably more development of the subject than presented here. For these important topics, the reader can consult the references quoted above and their enclosed bibliography.

Spatial distribution of spallation Ar in the lunar anorthosite 15415 (Albarede, 1978). Retrieve the $^{38}$Ar spatial distribution in plagioclase crystals from the stepwise heating data of Turner (1972), assuming spherical minerals of identical grain size. The fractions lost are listed in Table 5.28 together with the $\tau_i$ values deduced from the diffusion of $^{37}$Ar, an isotope produced artificially by neutron activation and supposed to be homogeneously distributed in the minerals (see Chapter 8). The high-temperature steps have been ignored because evidence from $^{39}$Ar diffusion does not support simple diffusion loss at this temperature. We first build the matrix $A$. Let us show how to calculate the element $a_{24}$. We compute $\tau_2 + \tau_4 = e^{-9.45} + e^{-4.52} = 0.01097$, then using equation (5.6.9)

$$a_{24} = 1 - 6\sqrt{\frac{0.01097}{\pi}} + 3 \times 0.01097 = 0.67840$$

This calculation is repeated for the whole matrix and inversion of equation (5.6.13) produces the vector $u$ as

$$u = A^{-1}f = \begin{bmatrix}
0.9893 & 0.9693 & 0.8938 & 0.6794 & 0.3894 & 0.1083 \\
0.9693 & 0.9580 & 0.8902 & 0.6784 & 0.3891 & 0.1083 \\
0.8938 & 0.8902 & 0.8520 & 0.6661 & 0.3848 & 0.1073 \\
0.6794 & 0.6784 & 0.6661 & 0.5658 & 0.3443 & 0.0977 \\
0.3894 & 0.3891 & 0.3848 & 0.3443 & 0.2258 & 0.0685 \\
0.1083 & 0.1083 & 0.1073 & 0.0977 & 0.0685 & 0.0475
\end{bmatrix}^{-1}
\begin{bmatrix} 0.9939 \\ 0.9718 \\ 0.8930 \\ 0.6760 \\ 0.3890 \\ 0.1090 \end{bmatrix}
= \begin{bmatrix} 1.2251 \\ -0.2386 \\ 0.1331 \\ -0.3002 \\ 0.2531 \\ -0.0046 \end{bmatrix}$$
We now compute the vector $K(x)$ for each value of $x$ that we wish, say $x = 0.995$. This vector is made out of six $K(x, \tau_i)$ components. For $K(x, \tau_4)$, the calculation reads

$$K(0.995, e^{-4.52}) = 0.995\sqrt{3} - \sqrt{3}\left[\mathrm{erf}\left(\frac{1 + 0.995}{2\sqrt{e^{-4.52}}}\right) - \mathrm{erf}\left(\frac{1 - 0.995}{2\sqrt{e^{-4.52}}}\right)\right]$$

or

$$K(0.995, e^{-4.52}) = 1.723 - 1.732 \times (1 - 0.0270) = 0.0383$$

The complete vector $K(0.995)$ is

$$K(0.995) = [1.5261, 0.5294, 0.1431, 0.0383, 0.0131, 0.0043]^T$$

which enables us to compute the concentration of $^{38}$Ar at $x = 0.995$

$$C_0(0.995) = \frac{K^T(0.995)\,u}{0.995\sqrt{3}} = 1.0179$$

Figure 5.15 Distribution of spallogenic $^{38}$Ar in plagioclase crystals from the lunar anorthosite 15415 analyzed by Turner (1972), plotted against the radial distance $x = r/R$. A population of spherical grains with uniform grain size is assumed. The $\tau_i$ values were deduced from the diffusion of $^{37}$Ar (Albarede, 1978).

Values at different $x$ produce the distribution represented in Figure 5.15. Because of the spherical geometry, more points have been calculated next to the surface. It is difficult to assess in a few words the statistical and physical significance of this profile. However, it is consistent with a nearly homogeneous distribution of spallogenic $^{38}$Ar produced during interaction of the cosmic rays with calcium atoms and a subtle loss in recent time suggested by a rim depleted in $^{38}$Ar. Further applications, error handling, and discussions may be found in Albarede (1978).
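The whole calculation of this example can be sketched in a few lines. This is an illustrative reconstruction assuming NumPy, not the original program: the names `kernel` and `c0` are hypothetical, the matrix $A$ is built from the approximation (5.6.9), the kernel from the error-function form (5.6.7), and the data are those of Table 5.28. Because the kernel values are recomputed at full precision, the last digits may differ slightly from the rounded values quoted in the text.

```python
import math
import numpy as np

ln_tau = np.array([-12.20, -9.45, -6.87, -4.52, -2.98, -1.74])    # Table 5.28
frac_lost = np.array([0.0061, 0.0282, 0.107, 0.324, 0.611, 0.891])
tau = np.exp(ln_tau)
f = 1.0 - frac_lost                       # fractions remaining

# Matrix A from the approximation (5.6.9), with s_ij = tau_i + tau_j
s = tau[:, None] + tau[None, :]
A = 1.0 - 6.0 * np.sqrt(s / np.pi) + 3.0 * s
u = np.linalg.solve(A, f)                 # solves f = A u, equation (5.6.13)

def kernel(x, tau_i):
    """Approximate kernel function, equation (5.6.7)."""
    r = 2.0 * math.sqrt(tau_i)
    return math.sqrt(3.0) * (x - (math.erf((1 + x) / r) - math.erf((1 - x) / r)))

def c0(x):
    """Initial concentration at dimensionless radius x, from (5.6.14)."""
    K = np.array([kernel(x, t) for t in tau])
    return K @ u / (x * math.sqrt(3.0))

print(A[1, 3], kernel(0.995, tau[3]), c0(0.995))
```

Calling `c0` over a grid of $x$ values, denser near $x = 1$, gives a profile of the kind shown in Figure 5.15.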
6 Modeling chemical equilibrium
6.1 Introduction

The purpose of this chapter is to outline the simplest methods of arriving at a description of the distribution of species in mixtures of liquids, gases and solids. Homogeneous equilibrium deals with single phase systems, such as electrolyte solutions (e.g., seawater) or gas mixtures (e.g., a volcanic gas). Heterogeneous equilibrium involves coexisting gaseous, liquid and solid phases. Finding the distribution of components ('speciation') in all the phases of any system requires application of rather simple rules (Van Zeggeren and Storey, 1970):
(a) the mass conservation condition: unless radioactive decay is present, atoms and exchangeable particles like electrons must be conserved;
(b) the minimum Gibbs free energy condition: according to the Second Principle, a system must spontaneously evolve towards the lowest attainable energy condition, either stable or metastable;
(c) the phase rule, which tells us for a given choice of components (either atoms, such as O and Na, or particles such as protons and electrons) and for a number of externally imposed constraints (pressure, temperature, oxygen fugacity, ...) how many phases are present.
Condition (a) has been addressed repeatedly in this book, and we will see from different examples how two contrasting approaches to condition (b) may be used, through either the mass action laws or direct minimization of the Gibbs free energy. Condition (c) is a consequence of (a) and (b) and for a thorough discussion the reader is referred to the books by (i) Van Zeggeren and Storey (1970) and Smith and Missen (1982) for general principles, and (ii) Morel and Hering (1993) and Michard (1989) for the case of electrolyte solutions. Let us define the chemical components as the 'building blocks' describing all the possible chemical species present in the system (Morel and Hering, 1993).
The chemical components are the minimal set of atoms, species, or ions that may describe the entire attainable stoichiometry of the system. A desirable property is that these components be independent of each other, i.e., that they form an orthonormal set in composition space. By Morel's method, mass balance breaks down in two steps. First, in much the same way as we constructed a mineralogical matrix for a multi-mineral rock, we can build a component matrix for the solution we are dealing with, including all the possible species. Second, the solution recipe is written down. We may think of producing a solution by dissolving a given quantity (the 'recipe') of NaCl and K₂SO₄, or making a gas mixture by the combustion of hydrocarbon in air. In these cases, species present in the mixture must add up for each component to the total number of components in the recipe. We can think of imposing carbon dioxide pressure in equilibrium with a solution, which amounts to constraining dissolved H₂CO₃. A similar case arises in rocks where oxygen fugacity is buffered by a mineralogical assemblage, e.g., quartz-fayalite-magnetite. Given a solution with $m$ species and $n$ independent components, the component matrix $B_{m \times n}$ is usually rectangular with $m \geq n$. In the $n$-dimensional component space, there can be no more than $n$ independent components, and some additional $m - n$ independent relationships must exist between species concentrations. These additional relationships are chemical reactions and, in the same way as there is no unique choice of components, there is no unique set of independent reactions. For instance, the following three reactions

2 H₂O + CO₂ ⇌ CH₄ + 2 O₂
are not independent, since the third is simply the sum of the first two reactions, and there is no best choice of two equations. To each equation, we can associate a change in Gibbs free energy $G$ and an equilibrium constant $K$. In the case of non-ideal mixtures, $K$ relates activities for solutions and fugacities for gases. Let us assume the following reaction between the hypothetical species A, B, and C

$$\nu_A A \rightleftharpoons \nu_B B + \nu_C C$$

where the $\nu$ coefficients are stoichiometric coefficients. At equilibrium, the change in Gibbs free energy $\Delta G = \nu_B\mu_B + \nu_C\mu_C - \nu_A\mu_A$, where $\mu$ stands for the Gibbs free energy of each species, is zero. The total Gibbs free energy of the system must therefore be minimized with as many constraints on $\Delta G$ as there are independent reactions. Alternatively, we may take the mass action approach and write

$$\frac{[B]^{\nu_B}[C]^{\nu_C}}{[A]^{\nu_A}} = K(T, P)$$

where $K(T, P)$ is the equilibrium constant and depends on temperature and pressure. Equivalent expressions exist among fugacities for gaseous systems. Finding the speciation in this case amounts to solving simultaneously the $n$ linear conservation equations and the $m - n$ non-linear mass action equations. The case of activity coefficients in solutions is easily but tediously implemented since well-constrained expressions exist, like those produced by the Debye-Hückel theory for dilute solutions or the Pitzer expressions for concentrated solutions (brines). The interested reader may refer to Michard (1989) for a recent and still reasonably simple account. However simple to handle, activity coefficients introduce analytically cumbersome expressions incompatible with the size of a textbook. Real gas theory demands even more complicated developments.
6.2 The Newton-Raphson method applied to solutions
The Newton-Raphson method consists in solving simultaneously the conservation and mass action equations. Because of its simplicity and rather fast convergence, it is well-fitted to sets of non-linear equations in several unknowns, as described in Chapter 3.
6.2.1 Homogeneous equilibrium in solutions

We will follow the layout of the problem as described by Morel and Hering (1993) and use their conventions. A solution may be defined as a solvent, most frequently water, and solutes, which can be neutral species such as O₂ or, more frequently, ions such as Ca²⁺ or OH⁻. Charged particles, such as electrons and protons, are not present in solutions but nevertheless may be handled in the same way as other charged species (Stumm and Morgan, 1981). Solvent concentration overwhelms solute concentrations. For reasons of roundoff errors due to water being the dominant species, a simplification is introduced for dilute solutions (Morel and Hering, 1993). One unknown and one equation are simultaneously eliminated from the set of conservation equations which make the recipe. The unknown species H₂O is expressed as a function of the unknowns OH⁻ and H⁺, which assigns OH⁻ a −1 H⁺ coefficient, and the OH⁻ conservation equation (first column in Table 6.1) is left out.

The calcium carbonate system at constant Σ(CO₂). Assuming Σ(CO₂) = 2.2 mmol kg⁻¹ and m(Ca²⁺) = 1.2 mmol kg⁻¹, calculate the concentration of Ca²⁺, HCO₃⁻, CO₃²⁻, OH⁻ and H⁺ in the system. Assume the activity coefficients are unity and neglect H₂CO₃. The second dissociation constant of carbonic acid $K_2$ is 4.68 × 10⁻¹¹ (pK = 10.33) and the water dissociation constant is 10⁻¹⁴. We assume that no precipitation occurs, although this assumption will later be proved to be inadequate. Using a non-linear system is a complicated way of solving this particular problem, but this example is quite illustrative and can be extended to any number of components. Although ionized atoms like H⁺ and Ca²⁺ are natural components of the dilute solution under consideration, carbon and oxygen do not appear as such in natural systems. Since the group CO₃²⁻ is not destroyed in any reaction, it will therefore be taken as the carbon host.
The component matrix $B$ is shown in Table 6.1. As explained above, the H₂O row is subtracted from the OH⁻ row, which is left with −1 in the H⁺ column, which produces the new component matrix of Table 6.2.

Table 6.1. The full component matrix of the calcium carbonate system.

           OH⁻    H⁺    Ca²⁺    CO₃²⁻
H₂O         1      1     0       0
H⁺          0      1     0       0
OH⁻         1      0     0       0
Ca²⁺        0      0     1       0
HCO₃⁻       0      1     0       1
CO₃²⁻       0      0     0       1

Table 6.2. The component matrix of the calcium carbonate system in dilute solutions. This matrix can be derived from Table 6.1 by expressing the unknown species H₂O as a function of OH⁻ and H⁺, then removing the OH⁻ conservation equation, i.e., removing the corresponding column.

           H⁺    Ca²⁺    CO₃²⁻
H⁺          1     0       0
OH⁻        -1     0       0
Ca²⁺        0     1       0
HCO₃⁻       1     0       1
CO₃²⁻       0     0       1

The system to be solved involves five unknown concentrations which, for the sake of illustration, are made equal to the corresponding activities. An identical number of equations must be found that include component conservation plus a number of mass action laws corresponding to the formation of as many species as the excess of species over components. We first write the recipe, i.e., the mass balance for the components, not including the components of water. Calcium mass balance reads

$$[\mathrm{Ca^{2+}}] = 1.2 \times 10^{-3}$$

while, neglecting H₂CO₃, carbonate mass balance is

$$[\mathrm{HCO_3^-}] + [\mathrm{CO_3^{2-}}] = 2.2 \times 10^{-3}$$

Conservation of H⁺ and OH⁻ amounts to the electroneutrality condition

$$[\mathrm{H^+}] - [\mathrm{OH^-}] + 2[\mathrm{Ca^{2+}}] - [\mathrm{HCO_3^-}] - 2[\mathrm{CO_3^{2-}}] = 0$$

The species forming upon reactions are first the hydroxyl ion

$$[\mathrm{H^+}][\mathrm{OH^-}] = K_w = 10^{-14}$$

then the hydrogenocarbonate anion

$$\frac{[\mathrm{H^+}][\mathrm{CO_3^{2-}}]}{[\mathrm{HCO_3^-}]} = K_2$$

Let us build an $x$ vector out of the five unknowns

$$x_1 = [\mathrm{H^+}],\ x_2 = [\mathrm{OH^-}],\ x_3 = [\mathrm{Ca^{2+}}],\ x_4 = [\mathrm{HCO_3^-}],\ \text{and}\ x_5 = [\mathrm{CO_3^{2-}}]$$

The five equations to be solved read (the electroneutrality equation is written first)

$$f_1(x) = x_1 - x_2 + 2x_3 - x_4 - 2x_5$$
$$f_2(x) = x_3 - m(\mathrm{Ca^{2+}})$$
$$f_3(x) = x_4 + x_5 - \Sigma(\mathrm{CO_2})$$
$$f_4(x) = x_1x_2 - K_w$$
$$f_5(x) = (x_1x_5/x_4) - K_2$$
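The reduction of the full component matrix (Table 6.1) to the dilute-solution matrix (Table 6.2) is a simple row and column operation. The sketch below assumes NumPy; the species labels and variable names are illustrative, and the concentration vector used at the end is the initial guess adopted later in this example.

```python
import numpy as np

species = ["H2O", "H+", "OH-", "Ca2+", "HCO3-", "CO3--"]
components = ["OH-", "H+", "Ca2+", "CO3--"]

# Table 6.1: one row per species, one column per component
B_full = np.array([[1, 1, 0, 0],    # H2O   = OH- + H+
                   [0, 1, 0, 0],    # H+
                   [1, 0, 0, 0],    # OH-
                   [0, 0, 1, 0],    # Ca2+
                   [0, 1, 0, 1],    # HCO3- = H+ + CO3--
                   [0, 0, 0, 1]])   # CO3--

B = B_full.copy()
B[2] -= B[0]          # OH- = H2O - H+ : subtract the H2O row from the OH- row
B = B[1:, 1:]         # drop the H2O row and the OH- column

print(B)              # component matrix of Table 6.2

# Total of each remaining component for a vector c of species concentrations
c = np.array([1e-8, 1e-6, 1.2e-3, 1.1e-3, 1.1e-3])   # H+, OH-, Ca2+, HCO3-, CO3--
totals = B.T @ c
print(dict(zip(components[1:], totals)))
```

The Ca²⁺ and CO₃²⁻ totals obtained this way are the left-hand sides of the two mass-balance equations of the recipe.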
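Taking the derivatives of these five equations gives the matrix $D$

$$D = \begin{bmatrix}
1 & -1 & 2 & -1 & -2 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 \\
x_2 & x_1 & 0 & 0 & 0 \\
\dfrac{x_5}{x_4} & 0 & 0 & -\dfrac{x_1x_5}{x_4^2} & \dfrac{x_1}{x_4}
\end{bmatrix}$$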
In order to start the iterative calculation, a first estimate must be made. Although a subsequent section will show how to generate such an acceptable startup, the purpose of this exercise is to show how it works in a blind situation, which means that we do not want to be too smart. Let us assume that pH = 8, and split the carbonate component evenly between HCO₃⁻ and CO₃²⁻. [Ca²⁺] cannot be different from the amount present in the solution. We get the initial estimate, labeled with the superscript (0), as

$$x_1^{(0)} = 10^{-8},\ x_2^{(0)} = 10^{-6},\ x_3^{(0)} = 0.0012,\ x_4^{(0)} = 0.0011,\ x_5^{(0)} = 0.0011$$

which results in the five equations for the initial components of $f^{(0)}(x)$

$$f_1^{(0)}(x) = 10^{-8} - 10^{-6} + 2 \times 0.0012 - 0.0011 - 2 \times 0.0011 = -9.01 \times 10^{-4}$$
$$f_2^{(0)}(x) = 0.0012 - 0.0012 = 0$$
$$f_3^{(0)}(x) = 0.0011 + 0.0011 - 0.0022 = 0$$
$$f_4^{(0)}(x) = 10^{-8} \times 10^{-6} - 10^{-14} = 0$$
$$f_5^{(0)}(x) = (10^{-8} \times 0.0011/0.0011) - 4.68 \times 10^{-11} = 0.995 \times 10^{-8}$$

Let us define a goodness-of-fit criterion as the modulus of the residual vector given by

$$s = [f(x)^Tf(x)]^{1/2}$$
Inserting numerical values into the expression for matrix $D^{(0)}$, we obtain

$$D^{(0)} = \begin{bmatrix}
1 & -1 & 2 & -1 & -2 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 \\
10^{-6} & 10^{-8} & 0 & 0 & 0 \\
1 & 0 & 0 & -9.091 \times 10^{-6} & 9.091 \times 10^{-6}
\end{bmatrix}$$

and, using MatLab to compute $D^{-1}$, the increment vector $\Delta x$ as

$$\Delta x = -D^{-1}f(x) = \begin{bmatrix} 6.4167 \times 10^{-9} \\ -6.4167 \times 10^{-7} \\ 0 \\ 9.0034 \times 10^{-4} \\ -9.0034 \times 10^{-4} \end{bmatrix}$$

The first updated vector $x$, with its five components labeled with the superscript (1), is obtained as

$$x_1^{(1)} = 10^{-8} + 0.6417 \times 10^{-8} = 1.6417 \times 10^{-8}$$
$$x_2^{(1)} = 10^{-6} - 0.6417 \times 10^{-6} = 0.3583 \times 10^{-6}$$
$$x_3^{(1)} = 0.0012 + 0 = 0.0012$$
$$x_4^{(1)} = 0.0011 + 9.0034 \times 10^{-4} = 2.0003 \times 10^{-3}$$
$$x_5^{(1)} = 0.0011 - 9.0034 \times 10^{-4} = 1.9970 \times 10^{-4}$$
This calculation is repeated for successive estimates of $x$. At the fifth iteration, the solution obtained is

$$x_1^{(5)} = [\mathrm{H^+}] = 5.225 \times 10^{-10}\ (\mathrm{pH} = 9.28)$$
$$x_3^{(5)} = [\mathrm{Ca^{2+}}] = 0.0012$$
$$x_4^{(5)} = [\mathrm{HCO_3^-}] = 2.0191 \times 10^{-3}$$
$$x_5^{(5)} = [\mathrm{CO_3^{2-}}] = 1.8086 \times 10^{-4}$$

with the goodness-of-fit parameter

$$s = [f(x)^Tf(x)]^{1/2} = 1.75 \times 10^{\ldots}$$

In this particular case, convergence is extremely fast.
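The iteration of this example can be reproduced with a few lines of code. The sketch below assumes NumPy and replaces the explicit inverse $D^{-1}$ by a linear solve; the function names (`f`, `jacobian`, `newton_raphson`) and the fixed number of iterations are choices made for illustration, not part of the original treatment.

```python
import numpy as np

K_W, K_2 = 1e-14, 4.68e-11        # water and second carbonic acid dissociation constants
M_CA, SUM_CO2 = 1.2e-3, 2.2e-3    # recipe, in mol/kg

def f(x):
    """Residuals: electroneutrality, Ca and carbonate balance, two mass action laws."""
    x1, x2, x3, x4, x5 = x
    return np.array([x1 - x2 + 2 * x3 - x4 - 2 * x5,
                     x3 - M_CA,
                     x4 + x5 - SUM_CO2,
                     x1 * x2 - K_W,
                     x1 * x5 / x4 - K_2])

def jacobian(x):
    """Matrix D of partial derivatives of f with respect to x."""
    x1, x2, x3, x4, x5 = x
    return np.array([[1, -1, 2, -1, -2],
                     [0, 0, 1, 0, 0],
                     [0, 0, 0, 1, 1],
                     [x2, x1, 0, 0, 0],
                     [x5 / x4, 0, 0, -x1 * x5 / x4**2, x1 / x4]])

def newton_raphson(x0, n_iter=20):
    """Plain Newton-Raphson: x is updated by the increment -D^{-1} f(x)."""
    x = np.array(x0, dtype=float)
    for _ in range(n_iter):
        x = x + np.linalg.solve(jacobian(x), -f(x))
    return x

x_start = [1e-8, 1e-6, 0.0012, 0.0011, 0.0011]        # pH = 8, carbonate split evenly
first_step = np.linalg.solve(jacobian(x_start), -f(np.array(x_start)))
x = newton_raphson(x_start)
print("first increment:", first_step)
print("pH =", -np.log10(x[0]), " [CO3--] =", x[4])
```

No safeguard against negative concentrations is included; for this starting point the iterates stay positive, but a more robust implementation would damp the step or work with logarithms of the concentrations.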
6.2.2 Heterogeneous equilibrium in solutions

A phase distinct from the solution is also present. We will now consider two simple examples of (i) a solution coexisting with a solid phase and (ii) a solution coexisting with a solid phase at a given pressure of a gas that dissolves and dissociates in the solution.

Equilibrium with precipitation. The previous example calculated carbonate speciation admitting unrestricted solubility of all species. Actually, it is easily verified that the calculated calcium and carbonate concentrations exceed calcium carbonate solubility as measured by the solubility product

$$[\mathrm{Ca^{2+}}][\mathrm{CO_3^{2-}}] = K_s = 10^{-8.35} = 4.47 \times 10^{-9}$$

What is then the actual distribution of dissolved species and the amount of calcium carbonate precipitated from the solution? Let us add the moles of precipitated calcium carbonate as a sixth variable $x_6$, modify the calcium and carbonate conservation equations $f_2(x)$ and $f_3(x)$ in order to account for solid phase contribution, and use the expression of the solubility product as a sixth equation. The six equations to be solved read

$$f_1(x) = x_1 - x_2 + 2x_3 - x_4 - 2x_5$$
$$f_2(x) = x_3 + x_6 - m(\mathrm{Ca^{2+}})$$
$$f_3(x) = x_4 + x_5 + x_6 - \Sigma(\mathrm{CO_2})$$
$$f_4(x) = x_1x_2 - K_w$$
$$f_5(x) = (x_1x_5/x_4) - K_2$$
$$f_6(x) = x_3x_5 - K_s$$
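The matrix of partial derivatives is obtained by taking the derivatives of the previous equations

$$D = \begin{bmatrix}
1 & -1 & 2 & -1 & -2 & 0 \\
0 & 0 & 1 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 1 & 1 \\
x_2 & x_1 & 0 & 0 & 0 & 0 \\
\dfrac{x_5}{x_4} & 0 & 0 & -\dfrac{x_1x_5}{x_4^2} & \dfrac{x_1}{x_4} & 0 \\
0 & 0 & x_5 & 0 & x_3 & 0
\end{bmatrix}$$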
Let us keep the same initial guess as in the previous example and assume that no solid is present

$$x_1^{(0)} = 10^{-8},\ x_2^{(0)} = 10^{-6},\ x_3^{(0)} = 0.0012,\ x_4^{(0)} = 0.0011,\ x_5^{(0)} = 0.0011,\ x_6^{(0)} = 0$$

The initial values of $f_1^{(0)}(x)$ to $f_5^{(0)}(x)$ are left unchanged while the value of $f_6^{(0)}(x)$ is

$$f_6^{(0)}(x) = 0.0012 \times 0.0011 - 4.47 \times 10^{-9} \approx 1.320 \times 10^{-6}$$

The matrix $D^{(0)}$ of partial derivatives can be calculated as

$$D^{(0)} = \begin{bmatrix}
1 & -1 & 2 & -1 & -2 & 0 \\
0 & 0 & 1 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 1 & 1 \\
10^{-6} & 10^{-8} & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & -9.091 \times 10^{-6} & 9.091 \times 10^{-6} & 0 \\
0 & 0 & 0.0011 & 0 & 0.0012 & 0
\end{bmatrix}$$
Instead of using the previous goodness-of-fit criterion, we can examine the magnitude of the 'residual' $f(x)$ components after each iteration. After seven iterations, the component of $f(x)$ with the largest modulus is $f_5(x) = -1.5 \times 10^{-18}$, which is small enough for assuming convergence. The final components of vector $x$ are

$$x_3^{(7)} = [\mathrm{Ca^{2+}}] = 1.005 \times 10^{-3}$$
$$x_4^{(7)} = [\mathrm{HCO_3^-}] = 2.000 \times 10^{-3}$$
$$x_5^{(7)} = [\mathrm{CO_3^{2-}}] = 4.448 \times 10^{-6}$$

Most noticeable is the large change in pH induced by the precipitation of 0.196 mmol kg⁻¹ of calcium carbonate.

The calcium carbonate system at constant CO₂ pressure. We take the pressure of atmospheric carbon dioxide $p_{CO_2} = 3 \times 10^{-4}$ atm and assume calcium carbonate saturation. Calculate the concentration of Ca²⁺, H₂CO₃, HCO₃⁻, CO₃²⁻, OH⁻ and H⁺ in the system. Again, we assume the activity coefficients are unity. The carbonic acid dissociation constants are

$$\frac{[\mathrm{HCO_3^-}][\mathrm{H^+}]}{[\mathrm{H_2CO_3}]} = K_1 = 4.47 \times 10^{-7}$$

and

$$\frac{[\mathrm{CO_3^{2-}}][\mathrm{H^+}]}{[\mathrm{HCO_3^-}]} = K_2 = 4.68 \times 10^{-11}$$

while solubility of atmospheric carbon dioxide in water equilibrated with atmosphere obeys

$$\frac{[\mathrm{H_2CO_3}]}{p_{CO_2}} = \alpha = 0.04\ \mathrm{mol\ kg^{-1}\ atm^{-1}}$$
The assumption of constant CO 2 pressure makes the system extremely simple since [H2 CO 3 ] = apCO2 = 0.04 x 3 x 1(T4 = 1.2 x 10"5 and therefore the conservation of carbonates cannot be written with the equation used in the previous examples. In addition, calcium carbonate saturation requires
The five unknowns are , x 3 = [Ca2 + ], x 4 = [ H C O 3 ] , and x5 = [CO 3 2 "] Let us write the electroneutrality condition, as well as the equations for water and carbonic acid dissociation and finally the saturation condition as
)=
(x1x5/x4)-K2
The algebraic expressions that make the matrix D of partial derivatives are
Z) ( 0 ) =
1
-1
2
-1
-2
x2
xl
0
0
0
0
0
xtx5
x]
v42 X
x,
x4
0
0
X,
0
0
0
x5
0
x5 x4
X
We can now start the calculation, which requires an initial guess of the x values. Let us assume a pH value of 8 and therefore $x_1^{(0)} = 10^{-8}$, $x_2^{(0)} = 10^{-6}$, which constrains the hydrogenocarbonate content as $x_4^{(0)} = K_1 \alpha p_{\mathrm{CO_2}} / x_1^{(0)} = 0.536 \times 10^{-3}$, and the carbonate content as $x_5^{(0)} = K_2 x_4^{(0)} / [\mathrm{H^+}] = 0.251 \times 10^{-5}$. The initial calcium guess is calculated from the electroneutrality equation
\[ x_3^{(0)} \approx 0.271 \times 10^{-3} \]
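The initial matrix $D^{(0)}$ of partial derivatives is numerically evaluated as
\[ D^{(0)} = \begin{bmatrix} 1 & -1 & 2 & -1 & -2 \\ 10^{-6} & 10^{-8} & 0 & 0 & 0 \\ 5.36 \times 10^{-4} & 0 & 0 & 10^{-8} & 0 \\ 4.68 \times 10^{-3} & 0 & 0 & -8.72 \times 10^{-8} & 1.86 \times 10^{-5} \\ 0 & 0 & 2.51 \times 10^{-6} & 0 & 2.71 \times 10^{-4} \end{bmatrix} \]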
The initial residual, i.e., the initial vector of the functions we are trying to cancel, is
\[ f^{(0)}(x) = [-9.900 \times 10^{-7},\ 0,\ 0,\ 8.078 \times 10^{-28},\ -3.790 \times 10^{-9}]^T \]
This is no real surprise since the first four equations were assumed to be obeyed. Only the last condition of calcium carbonate saturation is grossly violated. The first increment is obtained in the usual way as $-D^{-1}f$ and gives
\[ \delta x^{(0)} = [1.8517 \times 10^{-8},\ -1.8517 \times 10^{-6},\ -5.0736 \times 10^{-4},\ -9.9326 \times 10^{-4},\ -9.2969 \times 10^{-6}]^T \]
After nine iterations, the calculation converges towards the result
\[ x^{(9)} = \begin{bmatrix} [\mathrm{H^+}] \\ [\mathrm{OH^-}] \\ [\mathrm{Ca^{2+}}] \\ [\mathrm{HCO_3^-}] \\ [\mathrm{CO_3^{2-}}] \end{bmatrix} = \begin{bmatrix} 5.35 \times 10^{-9} \\ 1.87 \times 10^{-6} \\ 5.11 \times 10^{-4} \\ 1.00 \times 10^{-3} \\ 8.76 \times 10^{-6} \end{bmatrix} \]
The corresponding residual is
\[ f^{(9)}(x) = [2.44 \times 10^{-20},\ -3.42 \times 10^{-22},\ 8.26 \times 10^{-18},\ -1.84 \times 10^{-19},\ -1.99 \times 10^{-16}]^T \]
Again, this is not the easiest way to work out the solution to this particular problem, but the example illustrates a method that can be extended to much more convoluted situations.
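The calculation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the saturation constant `KS` is not quoted in this passage and is inferred here from the initial residual and the converged concentrations, the third function is taken as $x_1 x_4 - K_1 \alpha p_{\mathrm{CO_2}}$, and the helper names (`residual`, `jacobian`, `solve`, `newton`) as well as the stopping test are choices made for this sketch.

```python
# Newton-Raphson sketch for the CaCO3 system at constant CO2 pressure.
# Unknowns: x = [H+], [OH-], [Ca2+], [HCO3-], [CO3 2-].

K1, K2, KW = 4.47e-7, 4.68e-11, 1.0e-14
ALPHA, P_CO2 = 0.04, 3.0e-4
KS = 4.47e-9  # ASSUMPTION: inferred from the quoted residuals, not given in the text


def residual(x):
    x1, x2, x3, x4, x5 = x
    return [
        x1 - x2 + 2 * x3 - x4 - 2 * x5,  # electroneutrality
        x1 * x2 - KW,                    # water dissociation
        x1 * x4 - K1 * ALPHA * P_CO2,    # first dissociation, [H2CO3] fixed
        x1 * x5 / x4 - K2,               # second dissociation
        x3 * x5 - KS,                    # calcium carbonate saturation
    ]


def jacobian(x):
    x1, x2, x3, x4, x5 = x
    return [
        [1.0, -1.0, 2.0, -1.0, -2.0],
        [x2, x1, 0.0, 0.0, 0.0],
        [x4, 0.0, 0.0, x1, 0.0],
        [x5 / x4, 0.0, 0.0, -x1 * x5 / x4**2, x1 / x4],
        [0.0, 0.0, x5, 0.0, x3],
    ]


def solve(a, b):
    """Solve a.y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    y = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (m[i][n] - s) / m[i][i]
    return y


def newton(x, max_iter=50, tol=1e-10):
    """Iterate x <- x - D^-1 f until the relative step is below tol."""
    for iteration in range(1, max_iter + 1):
        step = solve(jacobian(x), residual(x))
        x = [xi - si for xi, si in zip(x, step)]
        if max(abs(si) / abs(xi) for si, xi in zip(step, x)) < tol:
            return x, iteration
    return x, max_iter


# Starting guess built as in the text: pH = 8, calcium from charge balance.
h0 = 1.0e-8
hco3 = K1 * ALPHA * P_CO2 / h0
co3 = K2 * hco3 / h0
x0 = [h0, KW / h0, (hco3 + 2 * co3) / 2, hco3, co3]

solution, n_iter = newton(x0)
for name, value in zip(["H+", "OH-", "Ca2+", "HCO3-", "CO3 2-"], solution):
    print(f"[{name}] = {value:.3e}")
```

No logarithms are taken, so an intermediate iterate with a negative component does not stop the iteration; a production code would nevertheless guard against it.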
6.2.3 More about scaling
Because equilibrium constants may often be of widely different orders of magnitude, solving some problems may lead to roundoff errors and poor accuracy.

Mercury-chloride complexes in dilute solutions. This slightly more difficult example will be useful in showing how to handle poorly conditioned systems of equations. It is assumed that mercury chloride HgCl$_2$ is dissolved in pure water with a molality $m = 10^{-5}$ mol kg$^{-1}$. Given the equilibrium constants for chloride complex formation
\[ \frac{[\mathrm{Hg^{2+}}][\mathrm{Cl^-}]^2}{[\mathrm{HgCl_2^0}]} = \beta_{\mathrm{Cl}} = 10^{-13.14} \]
and for hydroxide complex formation
\[ \frac{[\mathrm{Hg^{2+}}][\mathrm{OH^-}]^2}{[\mathrm{Hg(OH)_2^0}]} = \beta_{\mathrm{OH}} = 10^{-21.85} \]
calculate the distribution of species in solution. A convenient choice of components is H$^+$, Hg$^{2+}$ and Cl$^-$ for six species H$^+$, OH$^-$, Hg$^{2+}$, Cl$^-$, HgCl$_2^0$ and Hg(OH)$_2^0$, which gives the component matrix shown in Table 6.3. Let us define the vector $x(x_1, x_2, \ldots, x_6)$ by its components
\[ x_1 = [\mathrm{H^+}],\quad x_2 = [\mathrm{OH^-}],\quad x_3 = [\mathrm{Hg^{2+}}],\quad x_4 = [\mathrm{Cl^-}],\quad x_5 = [\mathrm{HgCl_2^0}]\ \text{and}\ x_6 = [\mathrm{Hg(OH)_2^0}] \]
Table 6.3. The component matrix of mercury chloride in dilute solution.

Species        H+    Hg2+   Cl-
H+              1     0      0
OH-            -1     0      0
Hg2+            0     1      0
Cl-             0     0      1
HgCl2(0)        0     1      2
Hg(OH)2(0)     -2     1      0

The six equations in the six unknowns $x_1, x_2, \ldots, x_6$ are the electroneutrality condition, conservation of mercury and chloride components, plus the three mass action laws corresponding to water dissociation and mercury complexation by Cl$^-$ and OH$^-$. The condition of electroneutrality reads
\[ [\mathrm{H^+}] - [\mathrm{OH^-}] + 2[\mathrm{Hg^{2+}}] - [\mathrm{Cl^-}] = 0 \quad\text{or}\quad x_1 - x_2 + 2x_3 - x_4 = 0 \]
Conservation of mercury requires
\[ [\mathrm{Hg^{2+}}] + [\mathrm{HgCl_2^0}] + [\mathrm{Hg(OH)_2^0}] = 10^{-5} \quad\text{or}\quad x_3 + x_5 + x_6 = m \]
while, for chlorine
\[ [\mathrm{Cl^-}] + 2[\mathrm{HgCl_2^0}] = 2 \times 10^{-5} \quad\text{or}\quad x_4 + 2x_5 = 2m \]
As a function of the new variables, water dissociation equilibrium requires
\[ x_1 x_2 = K_w = 10^{-14} \]
while Hg complexation by chlorine reads
\[ \frac{x_3 x_4^2}{x_5} = \beta_{\mathrm{Cl}} = 10^{-13.14} \]
Likewise, for the Hg hydroxylated complex, we get
\[ \frac{x_3 x_2^2}{x_6} = \beta_{\mathrm{OH}} = 10^{-21.85} \]
Since the constant terms on the right-hand side of the previous equations vary by some 17 orders of magnitude, we may face serious roundoff errors in estimating f(x). One way of scaling the equations is to divide each conservation equation by the total amount of the corresponding component and each mass action relation by the corresponding equilibrium constant. If the choice of the initial estimate $x^{(0)}$ is not too awkward, we should obtain the six equations as differences between numbers of more similar magnitudes (ideally unity for all but the electroneutrality condition)
\[ f_1(x) = x_1 - x_2 + 2x_3 - x_4 \]
\[ f_2(x) = \frac{x_3 + x_5 + x_6}{m} - 1 \]
\[ f_3(x) = \frac{x_4 + 2x_5}{2m} - 1 \]
\[ f_4(x) = \frac{x_1 x_2}{K_w} - 1 \]
\[ f_5(x) = \frac{x_3 x_4^2}{\beta_{\mathrm{Cl}}\, x_5} - 1 \]
\[ f_6(x) = \frac{x_3 x_2^2}{\beta_{\mathrm{OH}}\, x_6} - 1 \]
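The matrix D of derivatives becomes
\[ D = \begin{bmatrix}
1 & -1 & 2 & -1 & 0 & 0 \\
0 & 0 & \dfrac{1}{m} & 0 & \dfrac{1}{m} & \dfrac{1}{m} \\
0 & 0 & 0 & \dfrac{1}{2m} & \dfrac{1}{m} & 0 \\
\dfrac{x_2}{K_w} & \dfrac{x_1}{K_w} & 0 & 0 & 0 & 0 \\
0 & 0 & \dfrac{x_4^2}{\beta_{\mathrm{Cl}} x_5} & \dfrac{2 x_3 x_4}{\beta_{\mathrm{Cl}} x_5} & -\dfrac{x_3 x_4^2}{\beta_{\mathrm{Cl}} x_5^2} & 0 \\
0 & \dfrac{2 x_2 x_3}{\beta_{\mathrm{OH}} x_6} & \dfrac{x_2^2}{\beta_{\mathrm{OH}} x_6} & 0 & 0 & -\dfrac{x_2^2 x_3}{\beta_{\mathrm{OH}} x_6^2}
\end{bmatrix} \]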
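Taking the arbitrary initial guess as $x_1^{(0)} = 10^{-7}$, $x_2^{(0)} = 10^{-7}$, $x_3^{(0)} = 10^{-7}$, $x_4^{(0)} = 10^{-5}$, $x_5^{(0)} = 10^{-5}$, $x_6^{(0)} = 10^{-5}$ and inserting these values into the six expressions for the components of f(x), we obtain
\[ f_1^{(0)} = 10^{-7} - 10^{-7} + 2 \times 10^{-7} - 10^{-5} = -9.8 \times 10^{-6} \]
\[ f_2^{(0)} = \frac{10^{-7} + 10^{-5} + 10^{-5}}{10^{-5}} - 1 = 1.01 \]
\[ f_3^{(0)} = \frac{10^{-5} + 2 \times 10^{-5}}{2 \times 10^{-5}} - 1 = 0.5 \]
\[ f_4^{(0)} = \frac{10^{-7} \times 10^{-7}}{10^{-14}} - 1 = 0 \]
\[ f_5^{(0)} = \frac{10^{-7} (10^{-5})^2}{7.244 \times 10^{-14} \times 10^{-5}} - 1 = 12.8 \]
\[ f_6^{(0)} = \frac{10^{-7} (10^{-7})^2}{1.412 \times 10^{-22} \times 10^{-5}} - 1 = 7.082 \times 10^{5} \]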
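and the matrix of partial derivatives
\[ D^{(0)} = \begin{bmatrix}
1 & -1 & 2 & -1 & 0 & 0 \\
0 & 0 & 10^{5} & 0 & 10^{5} & 10^{5} \\
0 & 0 & 0 & 0.5 \times 10^{5} & 10^{5} & 0 \\
10^{7} & 10^{7} & 0 & 0 & 0 & 0 \\
0 & 0 & 0.138 \times 10^{9} & 0.276 \times 10^{7} & -0.138 \times 10^{7} & 0 \\
0 & 1.416 \times 10^{13} & 0.708 \times 10^{13} & 0 & 0 & -0.708 \times 10^{11}
\end{bmatrix} \]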
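As a rough check of the scaling recipe, the scaled functions and their derivatives can be evaluated at the starting guess with a short Python sketch. The function names `scaled_residual` and `scaled_jacobian` are hypothetical, and the value $\beta_{\mathrm{Cl}} = 10^{-13.14}$ is the one implied by the factor $7.244 \times 10^{-14}$ used in the worked numbers.

```python
# Scaled equations for HgCl2 in dilute solution, evaluated at the initial guess.
# Unknowns: x = [H+], [OH-], [Hg2+], [Cl-], [HgCl2], [Hg(OH)2].

M = 1.0e-5                 # total mercury molality, mol/kg
KW = 1.0e-14
BETA_CL = 10.0 ** -13.14   # chloride complex (about 7.244e-14)
BETA_OH = 10.0 ** -21.85   # hydroxide complex (about 1.412e-22)


def scaled_residual(x):
    x1, x2, x3, x4, x5, x6 = x
    return [
        x1 - x2 + 2 * x3 - x4,                # electroneutrality (unscaled)
        (x3 + x5 + x6) / M - 1,               # Hg balance divided by m
        (x4 + 2 * x5) / (2 * M) - 1,          # Cl balance divided by 2m
        x1 * x2 / KW - 1,                     # water, divided by Kw
        x3 * x4**2 / (BETA_CL * x5) - 1,      # HgCl2, divided by beta_Cl
        x3 * x2**2 / (BETA_OH * x6) - 1,      # Hg(OH)2, divided by beta_OH
    ]


def scaled_jacobian(x):
    x1, x2, x3, x4, x5, x6 = x
    return [
        [1.0, -1.0, 2.0, -1.0, 0.0, 0.0],
        [0.0, 0.0, 1 / M, 0.0, 1 / M, 1 / M],
        [0.0, 0.0, 0.0, 1 / (2 * M), 1 / M, 0.0],
        [x2 / KW, x1 / KW, 0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, x4**2 / (BETA_CL * x5), 2 * x3 * x4 / (BETA_CL * x5),
         -x3 * x4**2 / (BETA_CL * x5**2), 0.0],
        [0.0, 2 * x2 * x3 / (BETA_OH * x6), x2**2 / (BETA_OH * x6), 0.0, 0.0,
         -x3 * x2**2 / (BETA_OH * x6**2)],
    ]


x0 = [1e-7, 1e-7, 1e-7, 1e-5, 1e-5, 1e-5]
f0 = scaled_residual(x0)
d0 = scaled_jacobian(x0)
for i, value in enumerate(f0, start=1):
    print(f"f{i}(x0) = {value:.4g}")
```

All mass-action residuals are now dimensionless departures from unity, which is the point of the scaling; the last one is still large only because the starting guess is far from equilibrium.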
We get the increment $\delta x^{(0)}$ as
\[ \delta x^{(0)} = [-1.445 \times 10^{-7},\ 1.445 \times 10^{-7},\ -0.905 \times 10^{-7},\ 0.933 \times 10^{-5},\ 0.335 \times 10^{-6},\ 0.986 \times 10^{-5}]^T \]
and the first updated estimate as
\[ x_1^{(1)} = 10^{-7} + 1.445 \times 10^{-7} = 2.445 \times 10^{-7} \]
\[ x_2^{(1)} = 10^{-7} - 1.445 \times 10^{-7} = -0.445 \times 10^{-7} \]
\[ x_3^{(1)} = 10^{-7} + 0.905 \times 10^{-7} = 1.905 \times 10^{-7} \]
\[ x_4^{(1)} = 10^{-5} - 0.933 \times 10^{-5} = 0.67 \times 10^{-6} \]
\[ x_5^{(1)} = 10^{-5} - 0.335 \times 10^{-6} = 0.967 \times 10^{-5} \]
\[ x_6^{(1)} = 10^{-5} - 0.986 \times 10^{-5} = 1.44 \times 10^{-7} \]
After 25 iterations, the following result is obtained
\[ x_1^{(25)} = [\mathrm{H^+}] = 1.294 \times 10^{-5}\quad (\mathrm{pH} = 4.89) \]
\[ x_4^{(25)} = [\mathrm{Cl^-}] = 1.294 \times 10^{-5} \]
\[ x_5^{(25)} = [\mathrm{HgCl_2^0}] = 0.353 \times 10^{-5} \]
\[ x_6^{(25)} = [\mathrm{Hg(OH)_2^0}] = 0.647 \times 10^{-5} \]
Most of the mercury is therefore in the form of chloride and hydroxide complexes. The 'residual' components of f(x) with the largest deviation are negligibly small, which indicates excellent convergence since they are to be compared with differences of numbers approximately equal to unity.

6.3 Gibbs energy minimization

6.3.1 Mixtures of ideal gases

We will now take advantage of a slightly different problem, homogeneous equilibrium in gases, in order to illustrate a different numerical approach, the steepest-descent method. At equilibrium, the Gibbs free energy of a system is minimum. Standard thermodynamics (e.g., Denbigh, 1968) states that the Gibbs free energy G of a perfect
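gas mixture containing $n_j$ moles of species j is given by
\[ G = \sum_j n_j \left( \mu_j^0 + \mathcal{R}T \ln p_j \right) \]
where $p_j$ is the partial pressure of species j, and $\mu_j^0$ the Gibbs free energy of the pure gas at unit pressure. $p_j$ is related to the total pressure P through
\[ p_j = \frac{n_j}{N} P \]
Rearranging with the use of logarithm properties
\[ G = \sum_j n_j \left( \mu_j^0 + \mathcal{R}T \ln P + \mathcal{R}T \ln n_j - \mathcal{R}T \ln N \right) \qquad (6.3.1) \]
where the total number N of gas moles in the mixture is
\[ N = \sum_j n_j \]
Let us define the constant $c_j$ as
\[ c_j = \frac{\mu_j^0}{\mathcal{R}T} + \ln P \]
and the reduced Gibbs free energy $\mathcal{G}$ as
\[ \mathcal{G}(n) = \frac{G}{\mathcal{R}T} = \sum_j n_j \left( c_j + \ln n_j - \ln N \right) \qquad (6.3.2) \]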
where the dependence on molar composition has been emphasized. In a formal way, our goal is to locate the minimum of $\mathcal{G}(n)$ relative to the vector $n(n_1, n_2, \ldots, n_j)$ subject to the constraint that the conservation equation holds, i.e., that the components in the species add up to the total number of components in the recipe. This task will be best handled by the steepest-descent method described in Chapter 3. The constraints will be handled through the gradient projection method, which is a close equivalent to using the method of Lagrange multipliers described in the same chapter. In addition, better understanding will be achieved if the material on projectors developed in Section 2.2 has been read first. The ith component of the gradient of function $\mathcal{G}$ must be equal to the chemical potential $\mu_i$. In order to prove it, we first observe that
\[ \frac{\partial \ln N}{\partial n_i} = \frac{1}{N} \]
and therefore
\[ \frac{\partial \mathcal{G}}{\partial n_i} = c_i + \ln n_i - \ln N + n_i \frac{1}{n_i} - \sum_j n_j \frac{1}{N} \]
or
\[ \frac{\partial \mathcal{G}}{\partial n_i} = c_i + \ln (n_i / N) = \mu_i \qquad (6.3.4) \]
We note all the derivatives in compact gradient form
\[ \nabla \mathcal{G} = c + \ln n - J \ln N = c + \ln n - J \ln J^T n \qquad (6.3.5) \]
where J is the column vector with all components equal to unity and the logarithm of a vector is taken component by component.
In a matrix form, the system of mass balance equation constraints (component matrix) reads
\[ B^T n - q = 0 \qquad (6.3.6) \]
where B is the component matrix and q the recipe of the system. As per Section 3.2, the unconstrained direction of steepest descent of $\mathcal{G}$ is
\[ \tilde{n}^{(k+1)} - n^{(k)} = -\alpha \nabla \mathcal{G}^{(k)} \qquad (6.3.7) \]
where $\tilde{n}^{(k+1)}$ is the unconstrained (k+1)th estimate of the vector n, and $\alpha$ is a constant to be determined. At the kth step, the mass balance constraints (6.3.6) require
\[ B^T n^{(k)} - q = 0 \qquad (6.3.8) \]
For the same constraint to be satisfied at step k+1, $n^{(k+1)}$ must obey
\[ B^T n^{(k+1)} - q = 0 \]
This holds true for $n^{(k+1)}$ relating to $\tilde{n}^{(k+1)}$ through
\[ n^{(k+1)} = \tilde{n}^{(k+1)} - B (B^T B)^{-1} \left[ B^T \tilde{n}^{(k+1)} - q \right] \qquad (6.3.9) \]
which can be checked upon pre-multiplying both sides by $B^T$. What we have actually performed is a projection (Figure 6.1) of the minimization direction on the constraint subspace of the matrix B using the projector P, such that
\[ P = I - B (B^T B)^{-1} B^T \qquad (6.3.10) \]
Figure 6.1 Search for the minimum of the Gibbs function $\mathcal{G}$ in a two-component space ($n_1$ and $n_2$ are mole numbers) with the mass conservation constraints $B^T n = q$. The search direction is the projection of the gradient onto the constraint subspace. Minimum is attained when the gradient is orthogonal to the constraint direction, which is the geometrical expression of the Lagrange multiplier methods.
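We get the update formula of the (k+1)th estimate as a function of the kth estimate as
\[ n^{(k+1)} = n^{(k)} + \delta n^{(k)} \qquad (6.3.11) \]
where $\delta n^{(k)}$ is the incremental correction to the kth vector $n^{(k)}$ to give the (k+1)th estimate
\[ \delta n^{(k)} = -\alpha P \nabla \mathcal{G}^{(k)} \qquad (6.3.12) \]
$P \nabla \mathcal{G}^{(k)}$ is therefore the direction of constrained minimization. As in the case of Lagrange multipliers, no progress can be made and search will stop when the (k+1)th minimization direction $P \nabla \mathcal{G}^{(k+1)}$ is orthogonal to the kth minimization direction $P \nabla \mathcal{G}^{(k)}$. A criterion for minimum is when the inner product of these vectors becomes less than an arbitrarily small value. From equation (6.3.5), the gradient of function $\mathcal{G}$ at the kth step is
\[ \nabla \mathcal{G}^{(k)} = c + \ln n^{(k)} - J \ln J^T n^{(k)} \qquad (6.3.13) \]
The condition for two consecutive minimization directions to be orthogonal is that their inner product $f(\alpha)$ vanishes, i.e.,
\[ f(\alpha) = [P \nabla \mathcal{G}^{(k)}]^T P \nabla \mathcal{G}^{(k+1)} = [P \nabla \mathcal{G}^{(k)}]^T \nabla \mathcal{G}^{(k+1)} = 0 \qquad (6.3.14) \]
where the second equality holds because the projector P is symmetric and idempotent. $f(\alpha)$ can be expanded as
\[ f(\alpha) = [P \nabla \mathcal{G}^{(k)}]^T \left\{ c + \ln [n^{(k)} + \delta n^{(k)}] - J \ln J^T [n^{(k)} + \delta n^{(k)}] \right\} = 0 \]
Defining the scalar r as
\[ r = J^T P \nabla \mathcal{G}^{(k)} \]
$f(\alpha)$ becomes
\[ f(\alpha) = [P \nabla \mathcal{G}^{(k)}]^T c + [P \nabla \mathcal{G}^{(k)}]^T \ln [n^{(k)} + \delta n^{(k)}] - r \ln [N^{(k)} - r\alpha] \qquad (6.3.15) \]
Since $f(\alpha)$ is a known function of $\alpha$, we will use a Newton step in order to find an approximate root $\alpha^{(1)}$ of $f(\alpha) = 0$. From equation (6.3.12), we know that
\[ \frac{d\, \delta n^{(k)}}{d\alpha} = -P \nabla \mathcal{G}^{(k)} \]
Defining $M^{(k)}$ as the diagonal matrix with $n^{(k)} + \delta n^{(k)}$ components on the diagonal, we get
\[ \frac{df(\alpha)}{d\alpha} = -[P \nabla \mathcal{G}^{(k)}]^T [M^{(k)}]^{-1} P \nabla \mathcal{G}^{(k)} + \frac{r^2}{N^{(k)} - r\alpha} \]
which we apply to $\alpha^{(0)} = 0$ (or, equivalently, $\delta n^{(k)} = 0$) to give $\alpha^{(1)}$
\[ \alpha^{(1)} = -\frac{f(\alpha)}{df(\alpha)/d\alpha} = \frac{f(0)}{[P \nabla \mathcal{G}^{(k)}]^T [M^{(k)}]^{-1} P \nabla \mathcal{G}^{(k)} - r^2 / N^{(k)}} \qquad (6.3.16) \]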
This Newton step can be repeated, but if the function $\mathcal{G}$ is not ill-behaved, it may prove simpler to restart the whole loop and recalculate the local gradient. Finding speciation in a multicomponent system is significantly more complicated when different phases are present. A well-known application is the equilibrium condensation model investigated in detail by several authors in order to reproduce the gross chemical features of the solar system (Larimer, 1967; Grossman, 1972; Grossman and Larimer, 1974). Thermodynamic modelling of mineral assemblages (Saxena and Eriksson, 1983) and of the liquid-line of descent of magmas (Ghiorso, 1985a, b) are other successful applications of this theory. The principal difficulty lies with the requirement of non-negative mole numbers, which requires some specific techniques to be used (Van Zeggeren and Storey, 1970; Smith and Missen, 1982).

Speciation in a gas mixture. Let us work out a case provided by Van Zeggeren and Storey (1970), involving combustion of propane in air in the proportions of one mole of propane (C$_3$H$_8$) and five moles of air (O$_2$ + 4N$_2$) at 40 atm and 2200 K, which provides a nice illustration of how to calculate the production of greenhouse gases by automobiles. Using ln 40 = 3.689 and $\mathcal{R}T$ = 18.292 kJ mol$^{-1}$, and thermochemical Gibbs free energy at 2200 K from Barin and Knacke (1973), we make Table 6.4. We recognize in the four columns below the components C, O, N and H, the 6 x 4 component matrix B.

Table 6.4. The component matrix and Gibbs energy of formation of various gaseous species in the C-O-N-H system at 2200 K.

j   Species   C   O   N   H   ΔGf,2200 (kJ mol-1)   c_j
1   CO2       1   2   0   0   -982.6                -50.03
2   N2        0   0   2   0   -498.7                -23.57
3   H2O       0   1   0   2   -752.4                -37.44
4   CO        1   1   0   0   -623.2                -30.38
5   H2        0   0   0   2   -361.8                -16.09
6   O2        0   2   0   0   -531.3                -25.36

The gas recipe with four conservation equations as functions of the mole number of propane $n^0_{\mathrm{C_3H_8}}$ and air $n^0_{\mathrm{air}}$ is, for carbon
\[ n_{\mathrm{CO_2}} + n_{\mathrm{CO}} = 3 n^0_{\mathrm{C_3H_8}} \]
and for oxygen
\[ 2 n_{\mathrm{CO_2}} + n_{\mathrm{H_2O}} + n_{\mathrm{CO}} + 2 n_{\mathrm{O_2}} = 2 n^0_{\mathrm{air}} \]
Nitrogen conservation requires
\[ 2 n_{\mathrm{N_2}} = 8 n^0_{\mathrm{air}} \]
while for hydrogen
\[ 2 n_{\mathrm{H_2O}} + 2 n_{\mathrm{H_2}} = 8 n^0_{\mathrm{C_3H_8}} \]
We can collect these equations in the compact form of equation (6.3.8)
\[ B^T n^{(0)} - q = 0 \]
which actually means
\[ \begin{bmatrix} 1 & 0 & 0 & 1 & 0 & 0 \\ 2 & 0 & 1 & 1 & 0 & 2 \\ 0 & 2 & 0 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & 2 & 0 \end{bmatrix} \begin{bmatrix} n_1 \\ n_2 \\ n_3 \\ n_4 \\ n_5 \\ n_6 \end{bmatrix} - \begin{bmatrix} 3 n^0_{\mathrm{C_3H_8}} \\ 2 n^0_{\mathrm{air}} \\ 8 n^0_{\mathrm{air}} \\ 8 n^0_{\mathrm{C_3H_8}} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \]
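The constants $c_j$ listed with Table 6.4 follow directly from their definition, $c_j = \mu_j^0/\mathcal{R}T + \ln P$. A minimal Python sketch, assuming the tabulated Gibbs energies stand for $\mu_j^0$ (the dictionary name `DELTA_G` is chosen for this illustration):

```python
import math

RT = 18.292              # kJ/mol at 2200 K, as quoted in the text
LN_P = math.log(40.0)    # total pressure of 40 atm

# Gibbs free energies at 2200 K (kJ/mol) from Table 6.4
DELTA_G = {
    "CO2": -982.6,
    "N2": -498.7,
    "H2O": -752.4,
    "CO": -623.2,
    "H2": -361.8,
    "O2": -531.3,
}

# Reduced constants c_j = mu_j0 / RT + ln P
c = {species: g / RT + LN_P for species, g in DELTA_G.items()}

for species, value in c.items():
    print(f"{species:4s} c = {value:8.2f}")
```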
Building the projector $P = I - B (B^T B)^{-1} B^T$ requires patience and gives
\[ P = \begin{bmatrix}
0.45 & 0.00 & -0.05 & -0.45 & 0.05 & -0.20 \\
0.00 & 0.00 & 0.00 & 0.00 & 0.00 & 0.00 \\
-0.05 & 0.00 & 0.45 & 0.05 & -0.45 & -0.20 \\
-0.45 & 0.00 & 0.05 & 0.45 & -0.05 & 0.20 \\
0.05 & 0.00 & -0.45 & -0.05 & 0.45 & 0.20 \\
-0.20 & 0.00 & -0.20 & 0.20 & 0.20 & 0.20
\end{bmatrix} \]
Numerical conditions are $n^0_{\mathrm{C_3H_8}} = 1$ and $n^0_{\mathrm{air}} = 5$. An acceptable initial guess is to be found that satisfies the constraint of equation (6.3.6). Although an alternative method will be described below, an efficient way to make such an estimate is to complement the matrix B and the vector q by arbitrary numbers in order to make a system of linear equations that can be conveniently solved. We like to use an initial guess with only positive values of mole numbers since we will have to take their logarithm. Moreover, such a choice makes more sense. In this particular case, finding an acceptable starting value will not be difficult. We would, however, appreciate a method that can be extended to a large number of species. A convenient trick is to use a random number generator to stuff the lower part of the matrix B and vector q with random numbers until the solution of the square system has only positive components. In other words, we solve systems such as
\[ \begin{bmatrix} 1 & 0 & 0 & 1 & 0 & 0 \\ 2 & 0 & 1 & 1 & 0 & 2 \\ 0 & 2 & 0 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & 2 & 0 \\ p & p & p & p & p & p \\ p & p & p & p & p & p \end{bmatrix} \begin{bmatrix} n_1 \\ n_2 \\ n_3 \\ n_4 \\ n_5 \\ n_6 \end{bmatrix} = \begin{bmatrix} 3 \\ 10 \\ 40 \\ 8 \\ p \\ p \end{bmatrix} \]
where each p position is assigned a different random number until the $n_i$'s are all positive. After a few seconds, a computer may produce the initial guess
\[ n^{(0)} = [2.2075,\ 20.000,\ 2.3291,\ 0.7925,\ 1.6709,\ 1.2317]^T \]
which gives $J^T n^{(0)} = N^{(0)} = 28.2317$ and $\ln N^{(0)} = 3.3404$. The components of $\nabla \mathcal{G}^{(0)}$, the gradient of the function $\mathcal{G}$, are
\[ \nabla \mathcal{G}^{(0)} = \begin{bmatrix} -50.03 + \ln 2.2075 - 3.3404 \\ -23.57 + \ln 20.000 - 3.3404 \\ -37.44 + \ln 2.3291 - 3.3404 \\ -30.38 + \ln 0.7925 - 3.3404 \\ -16.09 + \ln 1.6709 - 3.3404 \\ -25.36 + \ln 1.2317 - 3.3404 \end{bmatrix} = \begin{bmatrix} -52.5786 \\ -23.9147 \\ -39.9350 \\ -33.9530 \\ -18.9171 \\ -28.4921 \end{bmatrix} \]
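which gives the estimate of the reduced Gibbs function $\mathcal{G}$ as
\[ \mathcal{G}^{(0)} = [n^{(0)}]^T \nabla \mathcal{G}^{(0)} = -781.0\ \mathrm{mol} \]
The projection of the gradient along the constraint is
\[ P \nabla \mathcal{G}^{(0)} = [-1.6322,\ 0,\ -2.8284,\ 1.6322,\ 2.8284,\ 2.2303]^T \]
and therefore
\[ r = -1.6322 + 0 - 2.8284 + 1.6322 + 2.8284 + 2.2303 = 2.2303 \]
The matrix $M^{(0)}$ is
\[ M^{(0)} = \begin{bmatrix}
2.2075 & 0 & 0 & 0 & 0 & 0 \\
0 & 20 & 0 & 0 & 0 & 0 \\
0 & 0 & 2.3291 & 0 & 0 & 0 \\
0 & 0 & 0 & 0.7925 & 0 & 0 \\
0 & 0 & 0 & 0 & 1.6709 & 0 \\
0 & 0 & 0 & 0 & 0 & 1.2317
\end{bmatrix} \]
The inner product between the direction of minimization $\nabla \mathcal{G}^{(0)}$ and the direction of the constraint $P \nabla \mathcal{G}^{(0)}$ is
\[ f(0) = [P \nabla \mathcal{G}^{(0)}]^T \nabla \mathcal{G}^{(0)} = (-1.6322)(-52.5786) + (0)(-23.9147) + \ldots = 26.3016 \]
In order to calculate $df/d\alpha$ for $\alpha = 0$, we need to calculate
\[ [P \nabla \mathcal{G}^{(0)}]^T [M^{(0)}]^{-1} P \nabla \mathcal{G}^{(0)} = \frac{(-1.6322)^2}{2.2075} + \frac{0^2}{20.000} + \frac{(-2.8284)^2}{2.3291} + \ldots \]
hence
\[ \frac{df}{d\alpha} = -[P \nabla \mathcal{G}^{(0)}]^T [M^{(0)}]^{-1} P \nabla \mathcal{G}^{(0)} + \frac{(2.2303)^2}{28.2317} = -16.6530 \]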
The Newton step is
\[ \alpha = -\frac{f(\alpha)}{df(\alpha)/d\alpha} = -\frac{26.3016}{-16.6530} = 1.5794 \]
which suggests the increment
\[ \delta n^{(0)} = -\alpha P \nabla \mathcal{G}^{(0)} = [2.5779,\ 0,\ 4.4671,\ -2.5779,\ -4.4671,\ -3.5225]^T \]
Unfortunately, such an increment $\delta n$ makes some components of n negative. The idea is to use the largest $\alpha$ value that makes all n components non-negative. Logarithms, however, are no more tractable with null than with negative values. We therefore use a value of $\alpha$ which is 95 percent of that value that makes all the n components non-negative, i.e., inclusive of zero. In the present case, the fourth component is the limiting factor and we choose
\[ \alpha = 0.95 \times 0.7925 / 1.6322 = 0.4613 \]
The first updated vector $n^{(1)}$ is
\[ n^{(1)} = \begin{bmatrix} 2.2075 - 0.4613 \times (-1.6322) \\ 20.0000 - 0.4613 \times 0 \\ 2.3291 - 0.4613 \times (-2.8284) \\ 0.7925 - 0.4613 \times 1.6322 \\ 1.6709 - 0.4613 \times 2.8284 \\ 1.2317 - 0.4613 \times 2.2303 \end{bmatrix} = \begin{bmatrix} 2.9604 \\ 20.0000 \\ 3.6337 \\ 0.0396 \\ 0.3663 \\ 0.2030 \end{bmatrix} \]

Table 6.5. Mole fraction of gaseous species x after the seventh iteration. The column heading vZS refers to the results obtained by Van Zeggeren and Storey (1970) for the same system but using the JANAF tables of thermodynamic data.

Species   x        vZS
CO2       0.1084   0.1080
N2        0.7396   0.7387
H2O       0.1473   0.1467
CO        0.0026   0.0029
H2        0.0006   0.0008
O2        0.0016   0.0012
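The first projected steepest-descent step of this example can be reproduced with the following Python sketch. It is an illustration only: the projection is applied as $Pg = g - B(B^TB)^{-1}B^Tg$ without forming P explicitly, the step length is one Newton step from $\alpha = 0$ capped at 95 percent of the largest admissible value, and the helper names (`solve`, `gradient`, `project`, `first_step`) are choices made here.

```python
import math

# Component matrix B (rows: CO2, N2, H2O, CO, H2, O2; columns: C, O, N, H)
B = [[1, 2, 0, 0], [0, 0, 2, 0], [0, 1, 0, 2],
     [1, 1, 0, 0], [0, 0, 0, 2], [0, 2, 0, 0]]
C = [-50.03, -23.57, -37.44, -30.38, -16.09, -25.36]   # c_j from Table 6.4
N0 = [2.2075, 20.0, 2.3291, 0.7925, 1.6709, 1.2317]    # initial guess n(0)


def solve(a, b):
    """Solve a.y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    y = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (m[i][n] - s) / m[i][i]
    return y


def gradient(n):
    """Gradient of the reduced Gibbs function: c + ln n - ln N."""
    ln_total = math.log(sum(n))
    return [cj + math.log(nj) - ln_total for cj, nj in zip(C, n)]


def project(g):
    """Apply P = I - B (B^T B)^-1 B^T to the vector g."""
    rows, cols = len(B), len(B[0])
    btb = [[sum(B[k][i] * B[k][j] for k in range(rows)) for j in range(cols)]
           for i in range(cols)]
    btg = [sum(B[k][i] * g[k] for k in range(rows)) for i in range(cols)]
    y = solve(btb, btg)
    return [g[k] - sum(B[k][j] * y[j] for j in range(cols)) for k in range(rows)]


def first_step(n, safety=0.95):
    g = gradient(n)
    pg = project(g)
    r = sum(pg)
    f0 = sum(p * gi for p, gi in zip(pg, g))
    dfda = -sum(p * p / ni for p, ni in zip(pg, n)) + r * r / sum(n)
    alpha = -f0 / dfda                                  # Newton step from alpha = 0
    limits = [ni / p for ni, p in zip(n, pg) if p > 1e-12]
    if limits:                                          # keep all n strictly positive
        alpha = min(alpha, safety * min(limits))
    return alpha, [ni - alpha * p for ni, p in zip(n, pg)], pg, f0, dfda


alpha, n1, pg, f0, dfda = first_step(N0)
print("alpha =", round(alpha, 4))
print("n(1)  =", [round(v, 4) for v in n1])
```

Repeating `first_step` on its own output gives the subsequent iterations; because the update moves along a projected direction, the mass balance $B^T n = q$ is preserved at each step.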
After seven iterations, the inner product between the direction of minimization and the direction of the constraint becomes negligibly small, which shows that we are close to convergence and the solution does not move much from
\[ n^{(7)} = [2.9310,\ 20.0000,\ 3.9829,\ 0.0690,\ 0.0171,\ 0.0431]^T \]
The results are listed as mole fractions $x^{(7)}$ in Table 6.5 together with those of Van Zeggeren and Storey (1970). Considering that these authors used different sources of thermochemical data, the agreement of the present results with theirs is fairly good. The total reduced Gibbs free energy at that point is $\mathcal{G} = -791.612$ mol and
\[ G = \mathcal{G} \mathcal{R}T = -791.612\ \mathrm{mol} \times 18.292\ \mathrm{kJ\,mol^{-1}} = -14\,480\ \mathrm{kJ} \]
For a large number of gaseous species and for real gases, a more powerful method such as the method of conjugate gradients of Fletcher and Reeves (e.g., Fletcher, 1987; Press et al., 1986) would be more efficient.
6.3.2 Pure coexisting phases

The extreme case where the system is made of pure phases (no gas mixture, no solid or liquid solutions) can be handled in a slightly different way with a method that draws on linear programming methods. Given the Gibbs free energy of formation of all possible minerals, the objective for a rock of known composition is to find the mineral abundances that minimize the Gibbs free energy function of the rock. Linear constraints are the conservation equations of each element (or oxide). In addition, mineral abundances cannot be negative. Let us consider a rock at temperature T whose chemical composition q (recipe) is expressed as the vector of all the molar fractions $x_0$ of s elements or oxides. It is assumed that it can be made by an arbitrarily large number $p \geq s$ of mineral phases exclusive of solid solution. B is the component matrix of these minerals for the selected set of elements or oxides. Let $n_j$ be the number of moles of mineral j and $g_j$ its Gibbs free energy of formation $\Delta G_{f,T}$ estimated when formed from either the elements or the oxides. The function to be minimized is the Gibbs free energy G given by
\[ G = \sum_{j=1}^{p} n_j g_j = n^T g \qquad (6.3.17) \]
where n and g are vectors with components $n_j$ and $g_j$, respectively. The conservation equations are written in their usual matrix form
\[ B^T n = q \qquad (6.3.18) \]
In an s-dimensional space, s vectors at most can be independent. At equilibrium, a rock made of s elements cannot consist of more than s minerals, which implies that at least p - s of the p mole numbers are zero. In order to find the set of independent vectors that minimize the energy, we first rearrange the order of variables and split the vector n into two parts. The first part is the vector $n_B$ made of s base variables, and the second part is the vector $n_F$ of (p - s) free variables. Provided the base variables are non-negative, the non-negativity constraints can be satisfied by setting the free variables to zero. For the vector n to be a feasible solution, it should also satisfy the recipe equation, i.e.,
\[ B_B^T n_B + B_F^T n_F = q \qquad (6.3.19) \]
where the matrix B has been split into the upper s x s matrix $B_B$ and the lower (p - s) x s matrix $B_F$. Assuming that the base variables can be chosen in such a way that $B_B$ is regular, we can write
\[ n_B^T = q^T B_B^{-1} - n_F^T B_F B_B^{-1} \qquad (6.3.20) \]
For $n_F = 0$, we immediately get the relationship between $n_B$ and q. We now want to change both $n_B$ and $n_F$ in a direction that decreases G. More precisely, we will exchange one free for one base variable at a time as long as the Gibbs free energy can be decreased. The last equation can be differentiated as
\[ \delta n_B^T = -\delta n_F^T B_F B_B^{-1} \qquad (6.3.21) \]
and G as
\[ \delta G = \delta n^T g = [\delta n_B^T,\ \delta n_F^T] \begin{bmatrix} g_B \\ g_F \end{bmatrix} = \delta n_B^T g_B + \delta n_F^T g_F = \delta n_F^T \left( g_F - B_F B_B^{-1} g_B \right) \qquad (6.3.22) \]
where $g_B$ and $g_F$ are the Gibbs free energy of formation associated with the base and free variables, respectively. The vector $g_F - B_F B_B^{-1} g_B$ is the gradient of G with respect to the free variables with the constraint that changes keep the conservation equation satisfied. Each component of $n_F$ is zero and can only be increased. $\delta G$ therefore can be negative only if the constrained gradient $g_F - B_F B_B^{-1} g_B$ has negative components. The most natural policy is to increase the component i of $n_F$ associated with the most negative component of $g_F - B_F B_B^{-1} g_B$. Because the elements of the matrix B are all non-negative, equation (6.3.18) forces at least one element of $n_B$ to decrease when one element of $n_F$ increases. Actually, $n_B$ changes in proportion to the ith row of $B_F B_B^{-1}$, which we call $u_i^T$, but with the opposite sign. The constraint that each element of $n_B$ stays non-negative amounts to finding the largest scalar $\alpha$ such that
\[ n_B - \alpha u_i \geq 0 \qquad (6.3.23) \]
The first element k of $n_B$ reaching zero is found by selecting the smallest positive component of $n_B / u_i$ (the ratio being understood as the vector obtained by element to element ratio). At this point, the kth mineral of $n_B$ is exchanged with the ith mineral of $n_F$, which simply amounts to exchanging their corresponding rows in both B and g, and the calculation is restarted from the beginning. The calculation stops when G cannot be decreased any further, i.e., when each component of $g_F - B_F B_B^{-1} g_B$ is positive.

A rock in the SiO$_2$-MgO-CaO composition space at 1000 K can consist of the following minerals: periclase (pe), forsterite (fo), enstatite (en), quartz (qz), diopside (di), merwinite (me), larnite (la), and lime (li). The molar compositions of each mineral formula weight are listed in Table 6.6 together with their Gibbs energy of formation $\Delta G^0_{f,1000}$ from the elements given by Robie et al. (1978). Given a rock with molar composition $n^0_{\mathrm{SiO_2}} = 0.45$, $n^0_{\mathrm{MgO}} = 0.45$, and $n^0_{\mathrm{CaO}} = 0.10$, find the stable mineral assemblage.

Table 6.6. The component matrix and Gibbs free energy of formation for various minerals in the system SiO2-MgO-CaO.

#   Mineral          SiO2   MgO   CaO   ΔGf,1000 (kJ mol-1)
1   periclase (pe)   0      1     0     -493.092
2   forsterite (fo)  1      2     0     -1771.526
3   enstatite (en)   1      1     0     -1256.427
4   quartz (qz)      1      0     0     -729.920
5   diopside (di)    2      1     1     -2630.281
6   merwinite (me)   2      1     3     -3810.761
7   larnite (la)     1      0     2     -1926.167
8   lime (li)        0      0     1     -531.007
In a three-component space, three vectors at most can be independent. The vector n of eight variables (mole numbers) is therefore split into a vector $n_B$ of three base variables with non-negative values and a vector $n_F$ of eight minus three = five free variables equal to zero. Likewise, the matrix B is split into a 3 x 3 matrix $B_B$ and a 5 x 3 matrix $B_F$, while g is split as $g_B$ and $g_F$. Expressing the rock composition with respect to oxide minerals ensures that the components of the base vector $n_B$ combine as non-negative numbers to form the recipe. We therefore use the mineral assemblage qz-pe-li as the starting assemblage and rearrange B and g accordingly (the last column gives the mineral number of Table 6.6), i.e.,
\[ B = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 2 & 0 \\ 1 & 1 & 0 \\ 2 & 1 & 1 \\ 2 & 1 & 3 \\ 1 & 0 & 2 \end{bmatrix},\qquad g = \begin{bmatrix} -729.920 \\ -493.092 \\ -531.007 \\ -1771.526 \\ -1256.427 \\ -2630.281 \\ -3810.761 \\ -1926.167 \end{bmatrix} \begin{matrix} 4 \\ 1 \\ 8 \\ 2 \\ 3 \\ 5 \\ 6 \\ 7 \end{matrix} \]
The mole numbers of oxide minerals are obtained from equation (6.3.20) as
\[ n_B^T = q^T B_B^{-1} = [0.45\ \ 0.45\ \ 0.10] \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = [0.45\ \ 0.45\ \ 0.10] \]
343
which gives a total Gibbs free energy G of — 603.456 kJmol" 1 . "1 2 0" 1 1 0 "1 0 0" -1 BFBB
= 2 1 1 0 1 0
"1 2 0" 1 1 0 2 1 1
=
2 1 3 _0 0 1_ _1 0 2.
2 1 3 _1 0 2.
We now calculate the constrained gradient of G relative to the free variables making the components of the vector nF 1771.526" 1256.427 gF-BFBB
l
gB =
"1 2 0" 1 1 0 '-729.920"
' -55.422" -33.415
2630.281 - 2 1 1 -493.092 = -146.342 3810.761 1.926.167.
2 1 3 .-531.007.
-264.808 -134.233.
_1 0 2m
The most negative component is i — 4, so the merwinite mole number will be moved from the status of a free variable to that of a base variable. The fourth row i#4T of BpBy,1 is [2,1,3] and the components of nB/u^ are 0.45/2 = 0.225, 0.45/1=0.45, 0.10/3 = 0.03. The third component (k = 3, lime) of nB is first to reach zero upon increase of merwinite. We therefore exchange the rows assigned to lime and merwinite in the matrix B and the vector g as -729,920
4
-493.092
1
-3810.761
6
-1771.526
2
-1256.427
3
2 1 1
-2630.281
5
0 0 1
-531.007
1 0 2
-1926.167
1 1 0
9=
The procedure is repeated which produces the successive replacements li=>me=>di, then pe=>fo, and finally qz=>en. At this stage, the mineral assemblage of the rock is nen = 0.15, nfo = 0.10, rcdi = 0.10, while the constrained gradient gF-BFBB~1gB along the free variables has only positive components [22.007 (pe); 11.408 (qz); 84.572 (me); 101.519 (li); 80.213 (la)]. This last condition shows that the value of G = — 628.645 kJ mol ~ l is minimum, o
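The exchange procedure of this example can be sketched in Python as follows. This is a bare-bones illustration of the base/free exchange described above, not a general linear programming code: it assumes that the starting base is feasible, that a leaving mineral always exists, and that no degenerate ties occur. The names `MINERALS`, `solve` and `minimize_gibbs` are chosen for this sketch.

```python
# Minerals of Table 6.6: (name, (SiO2, MgO, CaO), Gibbs energy in kJ/mol)
MINERALS = [
    ("pe", (0, 1, 0), -493.092),
    ("fo", (1, 2, 0), -1771.526),
    ("en", (1, 1, 0), -1256.427),
    ("qz", (1, 0, 0), -729.920),
    ("di", (2, 1, 1), -2630.281),
    ("me", (2, 1, 3), -3810.761),
    ("la", (1, 0, 2), -1926.167),
    ("li", (0, 0, 1), -531.007),
]
Q = (0.45, 0.45, 0.10)   # rock recipe: SiO2, MgO, CaO


def solve(a, b):
    """Solve a.y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    y = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (m[i][n] - s) / m[i][i]
    return y


def minimize_gibbs(base, q, tol=1e-9):
    """Exchange one free for one base mineral until G cannot decrease."""
    base = list(base)
    s = len(q)
    while True:
        # bt = B_B^T: column k holds the composition of base mineral k
        bt = [[MINERALS[j][1][i] for j in base] for i in range(s)]
        n_base = solve(bt, q)                       # B_B^T n_B = q
        g_base = [MINERALS[j][2] for j in base]
        entering, best_cost, best_u = None, -tol, None
        for j in range(len(MINERALS)):
            if j in base:
                continue
            u = solve(bt, MINERALS[j][1])           # row of B_F B_B^-1
            cost = MINERALS[j][2] - sum(ui * gi for ui, gi in zip(u, g_base))
            if cost < best_cost:                    # most negative gradient
                entering, best_cost, best_u = j, cost, u
        if entering is None:
            return base, n_base
        # first base variable to reach zero (smallest positive ratio)
        ratios = [(n_base[k] / best_u[k], k) for k in range(s) if best_u[k] > tol]
        leaving = min(ratios)[1]
        base[leaving] = entering


# Start from the oxide assemblage qz-pe-li (indices into MINERALS).
base, n_base = minimize_gibbs([3, 0, 7], Q)
assemblage = {MINERALS[j][0]: n for j, n in zip(base, n_base)}
G = sum(MINERALS[j][2] * n for j, n in zip(base, n_base))
print(assemblage, round(G, 3))
```

Solving $B_B^T u = b_j$ for each free mineral avoids forming $B_B^{-1}$ explicitly, which is a design choice of this sketch rather than a requirement of the method.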
7 Dynamic systems

7.1 Introduction

Dynamics deals with changes in the state of a system with time. We can think of a geological system evolving in response to changes in geological parameters that are not explicitly time-dependent: although trace-element contents in differentiating magmas may change as a function of descriptive parameters that are time-dependent, they can be adequately described by their degree of fractionation. Likewise, the chemistry of clastic sediments with different provenance can be thought of as resulting from a time-dependent process, but most local chemical aspects of these sediments can be handled efficiently using source composition and mixing proportions. These systems are not described as dynamic systems because the time-dependence is not a critical factor in determining the geochemical variable of interest. In contrast, some other systems have characteristic time-scales involved, such as those in geochemistry through fluxes, that are time-dependent in essence, and homogenization processes that need some time to complete. These are real dynamic systems.

Let us first introduce some important definitions with the help of some simple mathematical concepts. Critical aspects of the evolution of a geological system, e.g., the mantle, the ocean, the Phanerozoic clastic sediments, ..., can often be adequately described with a limited set of geochemical variables. These variables, which are typically concentrations, concentration ratios and isotope compositions, evolve in response to change in some parameters, such as the volume of continental crust or the release of carbon dioxide in the atmosphere. We assume that one such variable, which we label f, is a function of time and other geochemical parameters. The rate of change in f per unit time can be written
\[ \frac{df}{dt} = F(f, x_i, t) \qquad (7.1.1) \]
where $x_i = x_i(t)$ is a time-dependent external parameter (e.g., temperature) and $F = F(f, x_i, t)$ any suitable function. Only one external parameter is needed to illustrate the general behavior of dynamic systems. $f_0$ and $F_0$ are the values of f and F at t = 0. A Taylor expansion of F to the first term in the neighborhood of $F_0$ gives
\[ \frac{df}{dt} = F_0 + \left( \frac{\partial F}{\partial f} \right)_0 (f - f_0) + \left( \frac{\partial F}{\partial x_i} \right)_0 (x_i - x_{i0}) + \left( \frac{\partial F}{\partial t} \right)_0 t \]
where zero subscripts indicate that derivatives are taken at t = 0. For sake of demonstration, we consider that F does not depend explicitly on time (autonomous system) and $x_i$ is constant. The last two terms of the right-hand side vanish and the rate of change equation becomes
\[ \frac{df}{dt} = F_0 + \left( \frac{\partial F}{\partial f} \right)_0 (f - f_0) \]
Let a be the derivative $(\partial F / \partial f)_0$, then
\[ \frac{df}{dt} = F_0 + a (f - f_0) \]
Changing to a new variable $F_0/a + f - f_0$ leads to the solution
\[ f - f_0 = \frac{F_0}{a} \left( e^{at} - 1 \right) \]
Only negative values of a lead to physically bounded values of f. The reciprocal of a has the dimension of time and is called the relaxation time of f in the system. (We will see later that for chemical adjustments, this parameter has the meaning of a residence time.) Relaxation time is a measure of how fast an isolated system adjusts to a change in its conditions (i.e., relaxes). $F_0$ is the forcing constant and produces a systematic drift in the chemical state of the system that depends on how the system interacts with its surroundings. In very simple words, forcing terms tell us where the system goes, while relaxation time measures the pace of change. Generally, most systems can have their rate of changes described in the general form
\[ \tau \frac{df}{dt} + f = h(x_i, t) \qquad (7.1.2) \]
where $\tau$ is the relaxation time, possibly time-dependent, and h a forcing function of time-dependent parameters. h can be deterministic if the function is exactly known for each t: this will be the case for the concentrations and isotopic ratios of interacting reservoirs. h can also be stochastic if only its statistical properties such as its mean and variance are known, as in the case of Brownian motion and diffusion. It is known in the latter case under the name of the Langevin equation (Haken, 1978).

7.2 Single-variable residence time analysis

7.2.1 Non-reactive species

A system, which can be thought of as a lake, an ocean basin, a domain of the mantle or of the crust, ..., has a constant volume V in m$^3$ (Figure 7.1). It receives an input Q of material (water, magma, sediments, ...) in m$^3$ a$^{-1}$ and releases an equivalent output. Assuming that the system is well-stirred, $C^i$ represents volumic concentrations in mol m$^{-3}$ of a conservative chemical species i. By conservative, it is meant (see Chapter 1) that its concentration in the reservoir can be modified only by processes taking place at the boundaries. Species i can be added to or subtracted from the system by solid, liquid or gaseous input and output, not by chemical reaction or radioactive decay inside the reservoir.

Figure 7.1 Box model for a non-reactive species i.

For the sake of illustration, we will consider a water reservoir, whose properties will be labeled 'liq'. Mass balance requires
\[ \frac{d (V C^i_{\mathrm{liq}})}{dt} = Q C^i_{\mathrm{in}} - Q C^i_{\mathrm{liq}} \]
'liq' and 'in' subscripts refer to the liquid (the reservoir and outlets) and input (upstream) values, respectively. Assuming constant V, we get
\[ \frac{V}{Q} \frac{d C^i_{\mathrm{liq}}}{dt} + C^i_{\mathrm{liq}} = C^i_{\mathrm{in}} \qquad (7.2.1) \]
which has the form of equation (7.1.2). We will first investigate the evolution of an element in a few simple situations where it is hosted by a single reservoir and its concentration affected by input and output fluxes and also by its chemical reactivity.

a) The amount $M^i(0)$ of species i is released at t = 0 in a reservoir which is initially free of this species, i.e.,
\[ C^i_{\mathrm{in}} = 0 \]
This is the simplest of autonomous systems with no 'force' acting on it (pure relaxation). At t = 0, the concentration is
\[ C^i_{\mathrm{liq}}(0) = \frac{M^i(0)}{V} \]
and, therefore
\[ C^i_{\mathrm{liq}}(t) = \frac{M^i(0)}{V} \exp\left( -\frac{t}{V/Q} \right) \qquad (7.2.2) \]
Multiplying (7.2.2) by V, the amount $M^i(t)$ left in the reservoir at t is therefore
\[ M^i(t) = M^i(0) \exp\left( -\frac{t}{V/Q} \right) \qquad (7.2.3) \]
Between t and t + dt, the outlet of the reservoir loses a quantity $dM^i(t)$ of species i that has 'resided' for a time t in the reservoir and given by
\[ dM^i(t) = -Q C^i_{\mathrm{liq}}(t)\, dt = -M^i(0) \frac{Q}{V} \exp\left( -\frac{t}{V/Q} \right) dt \]
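The residence time $\bar{t}$ of the species i is the mean age of each mass fraction, i.e.:
\[ \bar{t} = \frac{1}{M^i(\infty) - M^i(0)} \int_{M^i(0)}^{M^i(\infty)} t\, dM^i(t) = \int_0^\infty \frac{t}{V/Q} \exp\left( -\frac{t}{V/Q} \right) dt \]
since $M^i(\infty) = 0$. Introducing the new variable u = Qt/V, we obtain
\[ \bar{t} = \frac{V}{Q} \int_0^\infty u e^{-u}\, du \]
which is integrated by parts as
\[ \bar{t} = \frac{V}{Q} \int_0^\infty u e^{-u}\, du = \frac{V}{Q} \left[ -u e^{-u} \right]_0^\infty + \frac{V}{Q} \int_0^\infty e^{-u}\, du \]
and finally
\[ \bar{t} = V/Q \qquad (7.2.4) \]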
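Equation (7.2.4) can be checked numerically by integrating the age distribution of the outflowing mass with a trapezoidal rule. The sketch below is only illustrative: the reservoir volume and flux are hypothetical values, and the function name `mean_residence_time` is chosen here.

```python
import math


def mean_residence_time(volume, flux, n_steps=20000, span=40.0):
    """Trapezoidal estimate of the mean age of the mass leaving the box.

    The age distribution is (1/tau) exp(-t/tau) with tau = volume/flux;
    the integral of t times this density is truncated at span * tau.
    """
    tau = volume / flux
    h = span * tau / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        t = k * h
        weight = 0.5 if k in (0, n_steps) else 1.0
        total += weight * t * math.exp(-t / tau) / tau
    return total * h


V = 1.0e6   # reservoir volume in m3 (hypothetical value)
Q = 2.0e4   # throughput in m3 per year (hypothetical value)
t_mean = mean_residence_time(V, Q)
print(f"numerical mean age = {t_mean:.4f} a, V/Q = {V / Q:.4f} a")
```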
The relaxation time V/Q is therefore the mean residence time of the species i in the reservoir. This parameter does not depend on the nature of the species as long as the species is non-reacting In particular, V/Q is also the water residence (or renewal or flushing) time. For this reason, it will be denoted TH. An alternative and illustrative derivation of the residence-time equations involves Dirac delta-distributions. Let us assume that Cini(t) = M'(0)5(t)/g or, equivalently, that a mass M'(0) of i is injected into the reservoir at £ = 0, since C + coMH0) f + QO Q - ~ Cj At = M'(0) 5(t) df = Mj(0) J — oo
J — oo
Sc-
The homogeneous system has the solution
where A(t) is a time-dependent parameter. Taking the derivative and comparing with equation (7.2.1) leads to dA(t)
dr
( exp
Q\ M'(0) M'( 1= 5(0
V V V)
V
348
Dynamic systems
Again, we make use of the property of the delta-function integral to solve this first-order differential equation as

$$A(t) - A(0) = \frac{M^i(0)}{V}\int_0^t \delta(u)\exp\left(\frac{Q}{V}u\right)du = \frac{M^i(0)}{V}\exp\left(\frac{Q}{V}\,0\right) = \frac{M^i(0)}{V}$$

The reservoir being devoid of element $i$ at time $t = 0$, $A(0)$ is zero and therefore

$$C_{\rm liq}^i(t) = \frac{M^i(0)}{V}\exp\left(-\frac{Q}{V}t\right)$$

which is identical to equation (7.2.2). The exponential is the unit response function of the system, a fairly general concept that shows how transit through the system 'spreads' a unit input signal. More generally, time-dependent input signals can be decomposed as a succession of delta-like input signals of varying intensities and their individual outputs summed up in order to recover the total output signal. The unit response function is similar to the Green functions also hinted at in Chapter 8. However illustrative and elegant this method, most solutions can be derived in a much more straightforward way by calculus techniques.

b) Change in the input concentration at $t = 0$. The condition is now $C_{\rm in}^i = {\rm const}$ for $t > 0$. Upon integration of equation (7.2.1), we get

$$C_{\rm liq}^i(t) = C_{\rm liq}^i(0)\,e^{-t/\tau_H} + C_{\rm in}^i\left(1 - e^{-t/\tau_H}\right) \qquad (7.2.5)$$
7.2.2 Reactive species

We assume (Figure 7.2) that sedimentation takes place in the reservoir at the rate $P$ and that the species $i$ under consideration is entrained by the sediment with a concentration $C_{\rm sed}^i$

$$V\frac{dC_{\rm liq}^i}{dt} = Q\,C_{\rm in}^i - Q\,C_{\rm liq}^i - P\,C_{\rm sed}^i \qquad (7.2.6)$$

We introduce the solid-liquid partition coefficient $D_i$

$$D_i = \frac{C_{\rm sed}^i}{C_{\rm liq}^i}$$

Residence time and the forcing term become apparent when this equation is rearranged in the form of equation (7.1.2) as

$$\frac{dC_{\rm liq}^i}{dt} = -\frac{Q + P D_i}{V}\,C_{\rm liq}^i + \frac{Q}{V}\,C_{\rm in}^i$$

which gives the dynamic equation

$$\frac{\tau_H}{\alpha_i}\frac{dC_{\rm liq}^i}{dt} = \frac{C_{\rm in}^i}{\alpha_i} - C_{\rm liq}^i \qquad (7.2.7)$$

Figure 7.2 Box model for a reactive species i.
is a coefficient that measures the reactivity of the element in the reservoir and is equal to unity for a non-reactive species. Both the residence time T, (7.2.9)
and the forcing term are inversely proportional to a,. Note the additivity of inverse residence times I
Q
P
?i
V
V
1
1
- = - + - £ ; = — + —Dt *
TH
(7.2.10)
T sed
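Equations (7.2.8) to (7.2.10) are simple enough to evaluate directly. The sketch below is an illustration only: the function name is hypothetical, and the input values are those quoted for cadmium in the Greifensee (Stumm and Morgan, 1981) in the worked example later in this section.

```python
def reactivity(P, D, Q):
    """Reactivity factor alpha_i = 1 + P*D_i/Q of equation (7.2.8)."""
    return 1.0 + P * D / Q

# Greifensee values quoted in the text (volume m^3, fluxes per year)
V = 1.25e8      # lake volume, m^3
Q = 9.0e7       # water input, m^3 a^-1
P = 4.0e7       # sedimentation rate, kg a^-1
D_cd = 65.0     # sediment-water partition coefficient of Cd, m^3 kg^-1

tau_H = V / Q                        # water flushing time, a
alpha_cd = reactivity(P, D_cd, Q)    # about 30
tau_cd = tau_H / alpha_cd            # equation (7.2.9), a
tau_sed = V / P                      # time scale of sedimentation
inverse_sum = 1.0 / tau_H + D_cd / tau_sed   # right-hand side of (7.2.10)

print(tau_H, alpha_cd, tau_cd * 365.25)  # years, dimensionless, days
```

The last line converts the Cd residence time to days only for readability; the check that `1 / tau_cd` equals `inverse_sum` is the additivity rule (7.2.10).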
Residence time and reactivity are strongly correlated through equation (7.2.9). This is true for sea water composition since Whitfield and Turner (1979) showed a rather good correlation between oceanic residence times and seawater-crustal rock partition coefficients which are taken as a measure of element reactivity in the ocean. Actually, a better estimate of reactivity is given by oceanic suspensions, so Li (1982) suggested using pelagic clay-seawater concentration ratios as a proxy for partition coefficients. The mass balance equation (7.2.7) will now be solved for different cases: (a) a finite amount of the species $i$ is added to the reservoir at $t = 0$, (b) upstream concentration $C_{\rm in}^i$ is changed at $t = 0$, and (c) upstream concentration $C_{\rm in}^i$ is a periodic function of the time.

a) The amount $M^i(0)$ of species $i$ is released at $t = 0$ in a reservoir which is initially free of this species, i.e., $C_{\rm in}^i = 0$ for $t > 0$. At $t = 0$, the concentration is $C_{\rm liq}^i(0) = M^i(0)/V$. Concentration is expressed as above

$$C_{\rm liq}^i(t) = \frac{M^i(0)}{V}\,e^{-t/\tau_i} \qquad (7.2.11)$$

with the mass $M^i(t)$ held by the reservoir being

$$M^i(t) = M^i(0)\,e^{-t/\tau_i}$$
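b) The input concentration changes at $t = 0$. The condition now is $C_{\rm in}^i = {\rm const}$ for $t > 0$, then

$$\frac{\tau_H}{\alpha_i}\frac{d}{dt}\left(C_{\rm liq}^i - \frac{C_{\rm in}^i}{\alpha_i}\right) = -\left(C_{\rm liq}^i - \frac{C_{\rm in}^i}{\alpha_i}\right)$$

which is equivalent to

$$C_{\rm liq}^i(t) = C_{\rm liq}^i(0)\,e^{-t/\tau_i} + \frac{C_{\rm in}^i}{\alpha_i}\left(1 - e^{-t/\tau_i}\right) \qquad (7.2.12)$$

with the steady-state concentration $C_{\rm liq}^i(\infty)$ given by

$$C_{\rm liq}^i(\infty) = C_{\rm in}^i/\alpha_i \qquad (7.2.13)$$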
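As a cross-check of (7.2.12), the mass balance (7.2.7) can be integrated step by step and compared with the closed form. This is a hedged sketch: the forward-Euler scheme, the function names, and the step count are choices made here, and the parameter values ($\tau_H = 0.2$, $\alpha_i = 4$, input concentration doubled from 1 to 2) mimic those quoted in the caption of Figure 7.3.

```python
import math

def step_response(t, c0, c_in, tau_h, alpha):
    """Closed-form solution (7.2.12) for a constant input c_in after t = 0."""
    tau_i = tau_h / alpha
    return c0 * math.exp(-t / tau_i) + (c_in / alpha) * (1.0 - math.exp(-t / tau_i))

def integrate_euler(c0, c_in, tau_h, alpha, t_end, n_steps):
    """Forward-Euler integration of tau_H dC/dt = C_in - alpha * C."""
    dt = t_end / n_steps
    c = c0
    for _ in range(n_steps):
        c += dt * (c_in - alpha * c) / tau_h
    return c

tau_h, alpha = 0.2, 4.0
c_start = 1.0 / alpha        # steady state for the old input C_in = 1
c_in_new = 2.0               # input concentration doubled at t = 0

analytic = step_response(0.1, c_start, c_in_new, tau_h, alpha)
numeric = integrate_euler(c_start, c_in_new, tau_h, alpha, 0.1, 100_000)
print(analytic, numeric)
```

At long times both expressions tend to the steady state $C_{\rm in}^i/\alpha_i$ of equation (7.2.13).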
Steady-state is established more rapidly for a reactive than for a non-reactive species and the steady-state concentration will be smaller (Figure 7.3).

Cadmium in the Greifensee Lake (Stumm and Morgan, 1981). This Swiss lake has a volume $V$ of $1.25\times10^8$ m$^3$, water input is $Q = 9\times10^7$ m$^3$ a$^{-1}$, sedimentation rate is $P = 4\times10^7$ kg a$^{-1}$. Cd has a sediment-water partition coefficient $D_{\rm Cd} = 65$ m$^3$ kg$^{-1}$. Calculate the Cd residence time in the lake.

Let us compute the water flushing time

$$\tau_H = \frac{V}{Q} = \frac{1.25\times10^8}{9\times10^7} = 1.4\ {\rm a}\quad(\approx 17\ {\rm months})$$

and the reactivity factor from equation (7.2.8)

$$\alpha_{\rm Cd} = 1 + \frac{P D_{\rm Cd}}{Q} = 1 + \frac{4\times10^7\times65}{9\times10^7} \approx 30$$

Figure 7.3 Comparative evolution of the concentration for a non-reactive species and a reactive species when the input concentration is doubled at $t = 0$. In this particular case, $\tau_H = 0.2$ is the water residence time in time units, $\alpha_i = 4$ the reactivity coefficient, equation (7.2.8), of the reactive species.

The residence time $\tau_{\rm Cd} = \tau_H/\alpha_{\rm Cd}$ and the limiting concentration $C_{\rm in}^{\rm Cd}/\alpha_{\rm Cd}$ are divided by a factor of ~30 relative to a non-reactive case, e.g., chlorine. Entrainment by sediments flushes the excess Cd 30 times faster and decreases Cd steady-state concentration 30 times relative to a sediment-free lake.

c) Input concentration is a periodic function of time. The input concentration is assumed to take the form

$$C_{\rm in}^i(t) = \bar C_{\rm in}^i + \Delta C_{\rm in}^i \sin 2\pi\frac{t}{T}$$
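where $T$ is the period and $\bar C_{\rm in}^i$ and $\Delta C_{\rm in}^i$ are constant terms. The time-dependent forcing term is no longer zero. It seems reasonable to look for a solution in the form

$$C_{\rm liq}^i(t) = \bar C_{\rm liq}^i + \Delta C_{\rm liq}^i \sin 2\pi\frac{t - \delta t}{T}$$

where $\bar C_{\rm liq}^i$, $\Delta C_{\rm liq}^i$, and the phase shift $\delta t$ are constants to be determined from the mass balance equation (7.2.7) rewritten as

$$\frac{dC_{\rm liq}^i}{dt} = -\frac{1}{\tau_H}\left[\alpha_i \bar C_{\rm liq}^i + \alpha_i \Delta C_{\rm liq}^i \sin 2\pi\frac{t - \delta t}{T} - \bar C_{\rm in}^i - \Delta C_{\rm in}^i \sin 2\pi\frac{t}{T}\right] \qquad (7.2.14)$$

We use the identity

$$\sin 2\pi\frac{t - \delta t}{T} = \sin 2\pi\frac{t}{T}\cos 2\pi\frac{\delta t}{T} - \cos 2\pi\frac{t}{T}\sin 2\pi\frac{\delta t}{T}$$

on both sides of equation (7.2.14) and evaluate the derivative on the left-hand side, i.e.,

$$\frac{dC_{\rm liq}^i}{dt} = \frac{2\pi}{T}\Delta C_{\rm liq}^i\left[\cos 2\pi\frac{t}{T}\cos 2\pi\frac{\delta t}{T} + \sin 2\pi\frac{t}{T}\sin 2\pi\frac{\delta t}{T}\right]$$

First, the constant terms must cancel out, hence

$$\bar C_{\rm liq}^i = \frac{\bar C_{\rm in}^i}{\alpha_i}$$

The terms in $\cos(2\pi t/T)$ on both sides must be equal, hence

$$\frac{2\pi}{T}\Delta C_{\rm liq}^i \cos 2\pi\frac{\delta t}{T} = \frac{\alpha_i}{\tau_H}\Delta C_{\rm liq}^i \sin 2\pi\frac{\delta t}{T}$$

or

$$\tan 2\pi\frac{\delta t}{T} = \frac{2\pi\tau_H}{\alpha_i T} = \frac{2\pi\tau_i}{T} \qquad (7.2.15)$$

The equality of the sine terms gives

$$\frac{2\pi}{T}\Delta C_{\rm liq}^i \sin 2\pi\frac{\delta t}{T} = -\frac{1}{\tau_H}\left[\alpha_i \Delta C_{\rm liq}^i \cos 2\pi\frac{\delta t}{T} - \Delta C_{\rm in}^i\right]$$

or

$$\frac{\Delta C_{\rm in}^i}{\Delta C_{\rm liq}^i} = \frac{2\pi\tau_H}{T}\sin 2\pi\frac{\delta t}{T} + \alpha_i \cos 2\pi\frac{\delta t}{T} = \cos 2\pi\frac{\delta t}{T}\left(\frac{2\pi\tau_H}{T}\tan 2\pi\frac{\delta t}{T} + \alpha_i\right) = \alpha_i \cos 2\pi\frac{\delta t}{T}\left(\tan^2 2\pi\frac{\delta t}{T} + 1\right)$$

Using the well-known identity $\cos^2 = (1 + \tan^2)^{-1}$ and taking the reciprocal of each side

$$\frac{\Delta C_{\rm liq}^i}{\Delta C_{\rm in}^i} = \frac{1}{\alpha_i\sqrt{1 + \left(2\pi\tau_i/T\right)^2}} \qquad (7.2.16)$$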
Given that $\alpha_i \geq 1$, these equations show interesting properties of the solution (Figure 7.4). First, the amplitude $\Delta C_{\rm liq}^i$ of concentration fluctuations in the reservoir is damped relative to the amplitude $\Delta C_{\rm in}^i$ of the input fluctuations by a factor which depends on both the residence time $\tau_H$ of the fluid in the reservoir and the reactivity $\alpha_i$ of the element.
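The delay and damping of equations (7.2.15) and (7.2.16) can be evaluated and verified by substituting the trial solution back into the mass balance. The sketch below is illustrative only: the function names and the numerical values are assumptions, and the residual function simply measures how well the sinusoidal solution satisfies $\tau_H\,dC_{\rm liq}^i/dt = C_{\rm in}^i - \alpha_i C_{\rm liq}^i$.

```python
import math

def periodic_response(tau_h, alpha, period):
    """Return (delay, damping) from equations (7.2.15) and (7.2.16)."""
    tau_i = tau_h / alpha
    x = 2.0 * math.pi * tau_i / period
    delay = period * math.atan(x) / (2.0 * math.pi)      # phase shift delta-t
    damping = 1.0 / (alpha * math.sqrt(1.0 + x * x))     # amplitude ratio
    return delay, damping

def residual(t, tau_h, alpha, period, c_mean_in, dc_in):
    """Mismatch of the trial solution in tau_H dC/dt = C_in - alpha*C."""
    delay, damping = periodic_response(tau_h, alpha, period)
    amp = damping * dc_in
    phase = 2.0 * math.pi * (t - delay) / period
    c_liq = c_mean_in / alpha + amp * math.sin(phase)
    dcdt = amp * (2.0 * math.pi / period) * math.cos(phase)
    c_in = c_mean_in + dc_in * math.sin(2.0 * math.pi * t / period)
    return tau_h * dcdt - (c_in - alpha * c_liq)

delay, damping = periodic_response(tau_h=0.2, alpha=4.0, period=1.0)
print(delay, damping)
```

For a very long residence time the delay returned by `periodic_response` approaches a quarter of the period, and for a very short one it approaches zero.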
Figure 7.4 Effect of a periodically changing input concentration $C_{\rm in}^i$ for a species $i$ in a well-stirred reservoir. $T$ is the period. The concentration $C_{\rm liq}^i$ in the reservoir shows fluctuations with the same period $T$, but delayed by $\delta t$ and damped.

Second, the fluctuation is delayed by a time $\delta t$ which is a function of the residence time $\tau_i$ of the element in the reservoir. For an infinite residence time the argument of the tangent tends towards $\pi/2$ and the delay $\delta t$ towards $T/4$, while for a short residence time, the delay tends towards zero. As expected, reactive elements respond more rapidly than inert elements. The phase shift and the damping factor relating input to output concentrations represent the argument (angular phase) and the modulus of a complex function known as the transfer function of the reservoir. Such a function, however, is most conveniently introduced via Laplace and Fourier transforms. Applications of these geochemical concepts to the dynamics of volcanic sequences can be found in Albarede (1993).

7.2.3 Radioactive decay and first-order kinetics

When species $i$ disappears by either radioactive decay or chemical reaction with first-order kinetics, the mass balance equation must be changed according to

$$V\frac{dC_{\rm liq}^i}{dt} = Q\,C_{\rm in}^i - Q\,C_{\rm liq}^i - P\,C_{\rm sed}^i - \lambda_i V C_{\rm liq}^i \qquad (7.2.17)$$

where $\lambda_i$ is the decay constant (or kinetic rate coefficient) of the species $i$. The equation is easily modified into

$$\tau_H\frac{dC_{\rm liq}^i}{dt} = C_{\rm in}^i - \left(\alpha_i + \lambda_i\tau_H\right)C_{\rm liq}^i$$

The equations developed above for stable elements can therefore be worked out for radioactive elements or chemical reactions once the reactivity factor $\alpha_i$ has been changed into $\alpha_i + \lambda_i\tau_H$. The residence time $\tau_i^*$ of the element $i$ in the system is now

$$\tau_i^* = \frac{\tau_H}{\alpha_i + \lambda_i\tau_H} \qquad (7.2.18)$$

and the limiting or steady-state concentration

$$C_{\rm liq}^i(\infty) = \frac{C_{\rm in}^i}{\alpha_i + \lambda_i\tau_H}$$

For a pair made of a radioactive isotope $i$ and a stable isotope $j$ of the same element (e.g., $^{14}$C/$^{12}$C), it can be safely assumed that $\alpha_i = \alpha_j$. In this case, their ratio at steady-state may be written

$$\left(\frac{C^i}{C^j}\right)_{\rm liq} = \left(\frac{C^i}{C^j}\right)_{\rm in}\frac{\alpha_j}{\alpha_j + \lambda_i\tau_H} = \left(\frac{C^i}{C^j}\right)_{\rm in}\frac{1}{1 + \lambda_i\tau_j}$$

In a well-stirred reservoir at steady-state, we can calculate the residence time of the element from

$$\tau_j = \frac{1}{\lambda_i}\left[\frac{(C^i/C^j)_{\rm in}}{(C^i/C^j)_{\rm liq}} - 1\right] \qquad (7.2.19)$$
Broecker and Li (1970) and Broecker (1974) found that the $^{14}$C/$^{12}$C ratio in the deep ocean was 84 percent of this ratio in the pre-bomb surface ocean. Assuming that surface carbon (dissolved and falling debris) is the only source of deep ocean carbon, calculate the residence time $\tau_{\rm C}$ of this element in the deep ocean. The $^{14}$C decay constant is $1.2\times10^{-4}$ a$^{-1}$. From equation (7.2.19), we calculate the residence time of $^{14}$C in the deep ocean as

$$\tau_{\rm C} = \frac{1}{1.2\times10^{-4}}\times\left(\frac{1}{0.84} - 1\right) = 1600\ {\rm a}$$
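The arithmetic of this example follows directly from equation (7.2.19). A minimal sketch, with a hypothetical function name and the two numbers quoted in the text:

```python
def residence_time_from_ratio(decay_constant, ratio_liq_over_in):
    """Equation (7.2.19): tau = (1/lambda) * (R_in/R_liq - 1)."""
    return (1.0 / ratio_liq_over_in - 1.0) / decay_constant

# 14C decay constant (a^-1) and deep-ocean/surface 14C/12C ratio from the text
tau_c = residence_time_from_ratio(1.2e-4, 0.84)
print(round(tau_c))  # close to the 1600 a quoted in the text
```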
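$\tau_{\rm C}$ is also known, somewhat improperly, as the mixing time of the deep ocean.

7.2.4 Isotope and trace-element ratios

Let us consider two reactive species $i$ and $j$ (ions, elements, or isotopes) in a reservoir and their rate of change governed by the equations

$$\frac{dC_{\rm liq}^i}{dt} = -\frac{1}{\tau_H}\left(\alpha_i C_{\rm liq}^i - C_{\rm in}^i\right),\qquad \frac{dC_{\rm liq}^j}{dt} = -\frac{1}{\tau_H}\left(\alpha_j C_{\rm liq}^j - C_{\rm in}^j\right) \qquad (7.2.20)$$

Using the rule of ratio differentiation, the rate of change of the ratio $R = C^i/C^j$ can be written

$$\frac{dR_{\rm liq}}{dt} = \frac{1}{C_{\rm liq}^j}\frac{dC_{\rm liq}^i}{dt} - \frac{C_{\rm liq}^i}{\left(C_{\rm liq}^j\right)^2}\frac{dC_{\rm liq}^j}{dt} \qquad (7.2.21)$$

Inserting equations (7.2.20) into equation (7.2.21) gives

$$\frac{dR_{\rm liq}}{dt} = -\frac{1}{\tau_H}\frac{\alpha_i C_{\rm liq}^i - C_{\rm in}^i}{C_{\rm liq}^j} + \frac{1}{\tau_H}\frac{\alpha_j C_{\rm liq}^j - C_{\rm in}^j}{C_{\rm liq}^j}R_{\rm liq}$$

Upon simplification, with $R_{\rm in} = C_{\rm in}^i/C_{\rm in}^j$, this equation becomes

$$\frac{dR_{\rm liq}}{dt} = -\frac{1}{\tau_H}\left[\left(\alpha_i - \alpha_j\right)R_{\rm liq} + \frac{C_{\rm in}^j}{C_{\rm liq}^j}\left(R_{\rm liq} - R_{\rm in}\right)\right] \qquad (7.2.22)$$

Reformulating the rate of change of element $j$ in equation (7.2.20) as

$$\frac{C_{\rm in}^j}{C_{\rm liq}^j} = \alpha_j + \tau_H\frac{d\ln C_{\rm liq}^j}{dt}$$

we get the dynamic equation

$$\frac{dR_{\rm liq}}{dt} = -\frac{1}{\tau_H}\left(\alpha_i + \tau_H\frac{d\ln C_{\rm liq}^j}{dt}\right)R_{\rm liq} + \frac{1}{\tau_H}\left(\alpha_j + \tau_H\frac{d\ln C_{\rm liq}^j}{dt}\right)R_{\rm in} \qquad (7.2.23)$$

where the relaxation and forcing terms have been emphasized. For two isotopes with identical chemical properties $\alpha_i = \alpha_j$, whereas for a ratio of non-reactive elements $\alpha_i = \alpha_j = 1$.

DePaolo and Ingram (1985) found that the $^{87}$Sr/$^{86}$Sr ratio of the ocean has changed almost linearly from 0.7078 to 0.7092 over the last 35 million years. Holland (1978) estimates the oceanic residence time of Sr to be 4 million years. Find the relationship between the rate of change of seawater Sr concentration (presently 8 ppm) and the runoff $^{87}$Sr/$^{86}$Sr ratio.

Let $\alpha_{\rm Sr}$ be the common value of the reactivity coefficients for both isotopes. Replacing the subscripts 'liq' by 'SW', 'in' by 'runoff' and evaluating, we get the rate of change of $^{87}$Sr/$^{86}$Sr as

$$\frac{d(^{87}{\rm Sr}/^{86}{\rm Sr})_{\rm SW}}{dt} = \frac{0.7092 - 0.7078}{35} \approx 3.889\times10^{-5}\ {\rm Ma}^{-1}$$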
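Rewriting equation (7.2.23) as

$$\frac{1}{C_{\rm SW}^{\rm Sr}}\frac{dC_{\rm SW}^{\rm Sr}}{dt} = -\frac{1}{\tau_H/\alpha_{\rm Sr}} - \frac{dR_{\rm SW}/dt}{R_{\rm SW} - R_{\rm runoff}}$$

and inserting the numerical values, we get the relative rate of change of Sr concentrations as

$$\frac{1}{C_{\rm SW}^{\rm Sr}}\frac{dC_{\rm SW}^{\rm Sr}}{dt} = -\frac{1}{4} - \frac{38.89\times10^{-6}}{0.7078 + 38.89\times10^{-6}\times(35 - t) - (^{87}{\rm Sr}/^{86}{\rm Sr})_{\rm runoff}}$$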
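The last expression is easy to tabulate for different runoff ratios and residence times. The following sketch is an illustration under the numbers quoted in the text; the function name and the default arguments are assumptions made here, with $t$ in Ma before present and rates in Ma$^{-1}$.

```python
def sr_relative_rate(t_ma_bp, runoff_ratio, tau_sr, r_35=0.7078, slope=38.89e-6):
    """Relative rate of change of seawater Sr concentration (per Ma).

    r_35 is the seawater 87Sr/86Sr ratio 35 Ma ago and slope its rate of
    change, both taken from the worked example in the text.
    """
    r_sw = r_35 + slope * (35.0 - t_ma_bp)   # seawater ratio at time t
    return -1.0 / tau_sr - slope / (r_sw - runoff_ratio)

# Present day (t = 0), runoff ratio 0.711 (illustrative choice in the quoted range)
rate_4 = sr_relative_rate(0.0, 0.711, tau_sr=4.0)
rate_20 = sr_relative_rate(0.0, 0.711, tau_sr=20.0)
print(rate_4, rate_20)
```

With a 4 Ma residence time the result is a depletion of more than 20 percent per million years, and with 20 Ma it is roughly ten times smaller in magnitude.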
This relationship is drawn in Figure 7.5 for various runoff $^{87}$Sr/$^{86}$Sr ratios at $t = 0$, 15, and 30 Ma BP. Quite surprisingly, the Sr residence time of 4 Ma requires that seawater Sr concentration should change at a rate in excess of 20 percent per million years, which is extremely unrealistic. Curves were also drawn for $\tau_{\rm Sr} = 20$ Ma, which gives more acceptable but still quite rapid Sr depletion. This conclusion is insensitive to a particular choice of runoff $^{87}$Sr/$^{86}$Sr value in a reasonable range (0.710-0.712, see Albarede et al., 1981). This result indicates that the assumption of a constant $^{87}$Sr/$^{86}$Sr ratio in the runoff is inadequate. An extreme assumption would be that $^{86}$Sr concentration in seawater stays approximately constant. In other words, $^{86}$Sr would be at steady-state but not $^{87}$Sr, which translates into

$$R_{\rm SW} - R_{\rm runoff} \approx -\frac{\tau_H}{\alpha_{\rm Sr}}\frac{dR_{\rm SW}}{dt}$$

or

$$R_{\rm runoff} \approx R_{\rm SW} + \tau_{\rm Sr}\frac{dR_{\rm SW}}{dt}$$

Figure 7.5 Relative rate of change of Sr concentration in seawater calculated from equation (7.2.23) for the last 35 million years using the Sr isotope data for seawater of DePaolo and Ingram (1985). Calculations are made for $t = 0$, 15, 30 Ma. Residence time of Sr in the ocean is assumed to be 4 Ma (Holland, 1978, bottom) which gives an unrealistic rate of change for Sr concentration. An alternative residence time of 20 Ma (top) seems more adequate.
Applications of isotopic box models may be far-reaching: Albarede et al. (1981) have investigated the balance of Sr isotopes in seawater between runoff and ridge crest hydrothermal activity. They deduced a range of estimated values for $^{87}$Sr/$^{86}$Sr in the global river system of 0.7097-0.7113 nearly consistent with the direct estimate of 0.711 by Palmer and Edmond (1989). Raymo et al. (1988) and Richter et al. (1992) suggested that enhancement of continental erosion by the uplift of the Himalayas and Tibetan Plateau explains the modern increase in seawater $^{87}$Sr/$^{86}$Sr ratio. The seawater $^{87}$Sr/$^{86}$Sr record may carry geodynamic information of global importance.

The $^{87}$Sr/$^{86}$Sr ratio of the lavas erupted by Vesuvius has been found by Cortini and Hermes (1981) to decrease linearly from 0.70793 in the 1754 eruption down to 0.70722 in the 1882 eruption. This change is not correlated with a systematic trend in Sr concentrations. Assuming that lavas are erupted from a perfectly mixed reservoir withholding a constant mass of magma at an effusion rate $Q_{\rm out} = 0.001$ km$^3$ a$^{-1}$, estimate the size of this reservoir as a function of the $^{87}$Sr/$^{86}$Sr ratio in the input magma.

Neglecting the variations in Sr concentrations amounts to assuming $\alpha_{\rm Sr} = 1$. We get the rate of change of $^{87}$Sr/$^{86}$Sr as

$$\frac{dR_{\rm liq}}{dt} = \frac{d(^{87}{\rm Sr}/^{86}{\rm Sr})_{\rm liq}}{dt} = \frac{0.70722 - 0.70793}{1882 - 1754} = -5.55\times10^{-6}\ {\rm a}^{-1}$$

The magma residence time can be estimated by recasting equation (7.2.23), with $\alpha_{\rm Sr} = 1$ and the concentration term neglected, as

$$\tau_H = -\frac{R_{\rm liq} - R_{\rm in}}{dR_{\rm liq}/dt}$$

Inserting the numerical values gives

$$\tau_H = -\frac{R_{\rm liq} - R_{\rm in}}{dR_{\rm liq}/dt} = \frac{0.70793 - 5.55\times10^{-6}(t - 1754) - R_{\rm in}}{5.55\times10^{-6}}$$

and from the definition of the residence time $\tau_H$

$$V = Q_{\rm out}\tau_H = 0.001\times\frac{0.70793 - 5.55\times10^{-6}(t - 1754) - R_{\rm in}}{5.55\times10^{-6}}\ {\rm km}^3$$
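The reservoir size can be tabulated for any eruption date and input ratio. A minimal sketch, assuming the linear isotopic trend and the effusion rate quoted in the text; the function name and the trial input ratio of 0.704 are illustrative choices, not values from the source.

```python
def reservoir_volume(year, r_in, q_out=0.001, r_1754=0.70793, slope=-5.55e-6):
    """Reservoir volume (km^3) from V = Q_out * tau_H, with alpha_Sr = 1.

    slope is dR_liq/dt in a^-1 and r_1754 the lava ratio of the 1754 eruption.
    """
    r_liq = r_1754 + slope * (year - 1754)   # lava ratio at the given date
    tau_h = -(r_liq - r_in) / slope          # magma residence time, a
    return q_out * tau_h

# Illustrative input-magma ratio (assumed), evaluated for the year 1820
volume_1820 = reservoir_volume(1820, 0.704)
print(volume_1820)  # km^3
```

For this trial input ratio the volume stays below 1 km$^3$.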
[Figure 7.6: horizontal axis $(^{87}{\rm Sr}/^{86}{\rm Sr})_{\rm in}$, from 0.704 to 0.708]
Figure 9.16 Kinetic fractionation during crystal growth. Steady-state distribution of melt concentrations in the vicinity of a solid growing at the rate v for trace elements with different solid-liquid fractionation coefficients [equation (9.6.5), Tiller et al. (1953)]. The stippled area indicates the steady-state chemical boundary-layer with thickness S = @/v.
This relationship is drawn in Figure 7.6 as a function of the $^{87}$Sr/$^{86}$Sr ratio in the input magma for the dates $t = 1760$, 1820 and 1880. Provided the initial assumptions on the magmatic regime are valid, the magma chamber is smaller than 1 km$^3$.

Aplitic magma with constant composition is continuously injected into a dyke where it crystallizes as a quartz-feldspar assemblage while the residual liquid is expelled toward the surface. Sr in the dyke is found to be isotopically zoned. We assume that the $^{87}$Sr/$^{86}$Sr ratio of 0.710 measured in the earliest rock in the dyke represents the isotopic ratio of the injected magma. Trace-element partitioning suggests that the injected magma has a $^{87}$Rb/$^{86}$Sr ratio of 10 000 and that the ratio of reactivity coefficients $\alpha_{\rm Rb}/\alpha_{\rm Sr}$ was 0.05. We assume a flow-rate high enough for Rb and Sr concentrations to be at steady-state almost instantaneously. It is found that the most-evolved rocks in the dyke have a constant $^{87}$Sr/$^{86}$Sr ratio of 0.720. Estimate the residence time of Rb in the dyke.

Setting constant concentrations in equation (7.2.23), and replacing $R$ by $^{87}$Sr/$^{86}$Sr gives

$$\frac{d(^{87}{\rm Sr}/^{86}{\rm Sr})_{\rm liq}}{dt} = -\frac{(^{87}{\rm Sr}/^{86}{\rm Sr})_{\rm liq} - (^{87}{\rm Sr}/^{86}{\rm Sr})_{\rm in}}{\tau_{\rm Sr}} + \lambda_{^{87}{\rm Rb}}\,(^{87}{\rm Rb}/^{86}{\rm Sr})_{\rm liq}$$

where the additional term on the right-hand side accounts for radioactive decay at constant $^{86}$Sr. Since concentrations have reached steady-state and neglecting the effect of decay on $^{87}$Rb, combination with equation (7.2.13) gives

$$\frac{d(^{87}{\rm Sr}/^{86}{\rm Sr})_{\rm liq}}{dt} = -\frac{(^{87}{\rm Sr}/^{86}{\rm Sr})_{\rm liq} - (^{87}{\rm Sr}/^{86}{\rm Sr})_{\rm in}}{\tau_{\rm Sr}} + \lambda_{^{87}{\rm Rb}}\,\frac{(^{87}{\rm Rb}/^{86}{\rm Sr})_{\rm in}}{\alpha_{\rm Rb}/\alpha_{\rm Sr}}$$

Since the last differentiates have a constant $^{87}$Sr/$^{86}$Sr ratio, we assume the last equation to be at steady-state and therefore the residence time $\tau_{\rm Rb} = \tau_H/\alpha_{\rm Rb}$ of Rb in the dyke is

$$\frac{\tau_H}{\alpha_{\rm Rb}} = \frac{1}{\lambda_{^{87}{\rm Rb}}}\,\frac{(^{87}{\rm Sr}/^{86}{\rm Sr})_{\rm liq} - (^{87}{\rm Sr}/^{86}{\rm Sr})_{\rm in}}{(^{87}{\rm Rb}/^{86}{\rm Sr})_{\rm in}} = \frac{1}{1.42\times10^{-11}}\times\frac{0.720 - 0.710}{10\,000} \approx 7\times10^4\ {\rm a}$$

Dynamic accumulation of radiogenic $^{87}$Sr in a differentiating magmatic system has been suggested by Vidal et al. (1979) to account for the initial $^{87}$Sr/$^{86}$Sr heterogeneities in concentric granitic intrusions from the Kerguelen islands.
1 zt
This relation shows that relative fluctuations of concentrations about steady-state values are more important for elements with short residence time, i.e., for reactive species. Some elements are more abundant simply because they are chemically inert with respect to the processes taking place in their host reservoir. This is notably the case of N 2 and O 2 in the atmosphere, Na, Cl, Mg in the ocean, Si, Mg, Fe, Ca in the mantle. Atmosphere, seawater, and mantle peridotite are perceived as chemically homogeneous because of the remarkably inert behavior of their major elements. Chemical fluctuations may happen that potentially break the basic assumption of a well-stirred system made earlier in this chapter. This may not be a problem if a stirring process exists, such as thermal convection in the mantle and the atmosphere, or thermo-haline convection in the ocean, that mixes the system down to a certain distance and levels off heterogeneities. The related concept of mixing time, which we just met with carbon in the ocean, is scale-dependent, i.e., sample size dependent. A sample can be scaled in different ways: a sampling bottle in the ocean, a hand specimen for igneous rocks or the height of the melting column for lavas. In solids, the critical size for homogeneity is necessarily larger than mineral grain size and smaller than the system size itself (Figure 7.7). For a given sample size, an element is homogeneously distributed in a system if a suitable dispersion parameter, such as the standard deviation of concentration, falls below a critical level. The time it takes for the size of heterogeneities to decay below the sampling size is the mixing time of the system. An appropriate scale for the mixing time is the reciprocal of the local velocity gradient (see Chapter 8). If the residence time is significantly longer than the mixing time, the system levels off changes faster than they are introduced from the surroundings. 
If the mixing time is longer than the residence time, stirring is slow relative to external perturbations and the system is heterogeneous. The more reactive an element, the more variable is its concentration. The relationship between dispersion and residence time is well-known in the lower atmosphere (troposphere), where concentrations of reactive gases, such as H$_2$O and O$_3$, vary much more than those of inert gases (Junge, 1974). A related observation was made by Hofmann (1988) for the variability of trace elements in mantle-derived rocks: the variability is higher for incompatible than for compatible elements. In the mantle, melts play the role of the scavenger, which is also the role played by particles in the ocean. Incompatible elements, such as Th and La, for which the liquid-solid partition coefficient is higher, vary more than compatible elements such as Mg and Ni.

Figure 7.7 The size of heterogeneities depends on the sample size. In this figure, dots represent lithospheric material dispersed in the mantle. Small samples are more heterogeneous than large samples.

7.2.6 Stability of single-variable systems

When a geochemical variable, e.g., the concentration of an element in a reservoir, is constant with time, the system is said to be at equilibrium although a better practice in compliance with thermodynamics is to use the term steady-state. We now inquire about the stability of equilibrium, i.e., whether in a given state of equilibrium an arbitrarily small perturbation is going to decay and bring the system back to equilibrium or grow until the system achieves another state of equilibrium. The following derivation and example are largely inspired by Logan (1987). We can usually assume that, in a reservoir, the concentration $C$ of an element is initially at equilibrium and its rate of change obeys a law of the form

$$\frac{dC}{dt} = F(C, \mu) \qquad (7.2.24)$$
where $F$ is a known function of the concentration $C$ and a parameter $\mu$. At $t = 0$, the system is at equilibrium, i.e., $C = C_0$ and $F = 0$. What happens if, for a given value of the parameter, concentration is perturbed by a small increment $\delta C$? Letting

$$C = C_0 + \delta C$$

and substituting into the differential equation, we get

$$\frac{d(\delta C)}{dt} = F(C_0 + \delta C, \mu)$$

Expanding $F$ in a Taylor series to the first-order gives the approximation

$$\frac{d(\delta C)}{dt} = F(C_0 + \delta C, \mu) \approx F(C_0, \mu) + \frac{\partial F}{\partial C}(C_0, \mu)\,\delta C \qquad (7.2.25)$$

Because the initial condition of equilibrium requires that the first term on the right-hand side vanishes, this equation simply becomes

$$\frac{d(\delta C)}{dt} = \frac{\partial F}{\partial C}(C_0, \mu)\,\delta C$$

The solution is

$$\delta C(t) = \delta C(0)\exp\left[\frac{\partial F}{\partial C}(C_0, \mu)\,t\right]$$

which shows that for the perturbation to decay, the stability criterion is

$$\frac{\partial F}{\partial C}(C_0, \mu) < 0 \qquad (7.2.26)$$

The reader is referred to textbooks on differential equations and applied mathematics for a more rigorous and general proof of the stability criterion (e.g., Logan, 1987).

The first-order non-isothermal (FONI) reactor. A continuous, well-stirred magmatic reservoir similar to those discussed above is supposed to be thermally insulated. A dissolved element $i$ precipitates with a temperature-dependent rate of crystallization. Crystallization rate is assumed to obey first-order kinetics with Boltzmann temperature dependence such as
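$$\frac{dC_{\rm liq}^i}{dt} = -k\,C_{\rm liq}^i\exp\left(-\frac{E}{\mathcal{R}\,T_{\rm liq}}\right)$$

where $k$ is a constant, $\mathcal{R}$ the gas constant, and $E$ the activation energy of crystallization. In a system with input and output of liquid

$$V\frac{dC_{\rm liq}^i}{dt} = Q\,C_{\rm in}^i - Q\,C_{\rm liq}^i - kV\,C_{\rm liq}^i\exp\left(-\frac{E}{\mathcal{R}\,T_{\rm liq}}\right)$$

The heat balance follows a similar relationship with the rate of latent heat release in proportion with the amount of element crystallized

$$V c_p\frac{dT_{\rm liq}}{dt} = Q\,c_p T_{\rm in} - Q\,c_p T_{\rm liq} + L\,kV\,C_{\rm liq}^i\exp\left(-\frac{E}{\mathcal{R}\,T_{\rm liq}}\right)$$

where $c_p$ is the heat capacity of the mixture and $L$ the latent heat of crystallization per mass unit of element. Note the plus sign before the last term due to the heat released by crystallization. All parameters besides the descriptive variables $C_{\rm liq}^i$ and $T_{\rm liq}$ are assumed to be constant. Dividing the first equation by $V C_{\rm in}^i$ and rearranging, we get

$$\frac{1}{C_{\rm in}^i}\frac{dC_{\rm liq}^i}{dt} = \frac{Q}{V}\left(1 - \frac{C_{\rm liq}^i}{C_{\rm in}^i}\right) - k\,\frac{C_{\rm liq}^i}{C_{\rm in}^i}\exp\left(-\frac{E/\mathcal{R}\,T_{\rm in}}{T_{\rm liq}/T_{\rm in}}\right)$$

Once divided by $V c_p T_{\rm in}$, the second equation becomes

$$\frac{1}{T_{\rm in}}\frac{dT_{\rm liq}}{dt} = \frac{Q}{V}\left(1 - \frac{T_{\rm liq}}{T_{\rm in}}\right) + \frac{L\,C_{\rm in}^i}{c_p T_{\rm in}}\,k\,\frac{C_{\rm liq}^i}{C_{\rm in}^i}\exp\left(-\frac{E/\mathcal{R}\,T_{\rm in}}{T_{\rm liq}/T_{\rm in}}\right)$$

Introducing the reduced time $t'$, concentration $u$ and temperature $v$

$$t' = \frac{Q}{V}t,\qquad u = \frac{C_{\rm liq}^i}{C_{\rm in}^i},\qquad v = \frac{T_{\rm liq}}{T_{\rm in}}$$

and the dimensionless parameters

$$\mu = \frac{Q}{kV},\qquad \theta = \frac{L\,C_{\rm in}^i}{c_p T_{\rm in}},\qquad \gamma = \frac{E}{\mathcal{R}\,T_{\rm in}}$$

where $\mu$ is equivalent to a reduced flow-rate, $\theta$ measures the strength of latent heat effect, and $\gamma$ is the temperature-dependence of the kinetic factor. The two equations can be rewritten

$$\frac{du}{dt'} = 1 - u - \frac{u}{\mu}\,e^{-\gamma/v}$$

$$\frac{dv}{dt'} = 1 - v + \theta\,\frac{u}{\mu}\,e^{-\gamma/v}$$

Multiplying the first equation by $\theta$ and adding the two equations gives

$$\frac{d(\theta u + v)}{dt'} = 1 + \theta - (\theta u + v)$$

which can be integrated as

$$\theta u + v = 1 + \theta + \left(\theta u_0 + v_0 - 1 - \theta\right)e^{-t'}$$

where reduced concentration $u_0$ and temperature $v_0$ are the values at $t = 0$. Following Logan and for sake of illustration, we simply assume that $u_0 = v_0 = 1$, hence

$$v = 1 + \theta(1 - u)$$

We can now discuss the behavior of the reduced concentration $u$ by writing

$$\frac{du}{dt'} = 1 - u - \frac{u}{\mu}\exp\left[-\frac{\gamma}{1 + \theta(1 - u)}\right] = F(u, \mu) \qquad (7.2.27)$$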
Figure 7.8 Stability analysis of concentration in a simplified model of adiabatic magma chamber with first-order precipitation kinetics. The horizontal axis is the reduced flux $\mu$. Contours are those of the function $F(u, \mu)$ (equation 7.2.27). Unstable (actually bi-stable) behavior (hysteresis) is observed around the branch C-C' where the derivative of the function $F(u, \mu)$ relative to the reduced concentration $u$ is positive. Reducing the flux of magma produces a pathway A-B-C-D-E; increasing the flux produces A'-B'-C'-D'-E'.

The right-hand side function $F(u, \mu)$ is highly non-linear and is contoured in the $u$, $\mu$ variable space (Figure 7.8) for various values of $F$. The contours have simply been drawn by ascribing values to the dimensionless parameters ($\theta = 3$ and $\gamma = 1$) and $F$ ($-0.06$ to $+0.06$), then calculating $\mu$ from a range of $u$ values. Equilibrium is achieved for $F = 0$. However, for a given range of the reduced flow-rate $\mu$, approximately from 0.41-0.63, a given value of this parameter can be matched with
three values of the reduced dissolved concentration $u$. $u$ is therefore a multiple-valued function of $\mu$ and this state is known as a multiple steady-state. These multiple branches are not equivalent. From the contour lines in Figure 7.8, we can deduce which branch is stable, and which branch is not. The middle branch (C-C') lies in a range where, for a given value of $\mu$, $F$ increases with $u$. The derivative of $F$ with respect to $u$ is therefore positive which is just the criterion we found for an unstable equilibrium. Any fluctuation of $u$ at constant $\mu$ will drive the system away from the branch C-C'. The opposite holds for the upper and lower branches A-C and A'-C' that lie in a range where $F$ decreases when $u$ increases. The derivative of $F$ with respect to $u$ is therefore negative and any concentration fluctuation around an equilibrium state along these branches dies out rapidly. The branches A-C and A'-C' are stable steady-states. Let us now find out how the system works. Assume that it starts at a large reduced flow-rate (point A) and reduce the input slowly. Up to the point C, any deviation from the equilibrium curve will die out rapidly. At C, concentration fluctuations become unstable and the system evolves quite rapidly towards D ($\mu$ is fixed) where it finds a stable steady-state. The system has become unstable because reducing the flow-rate enhances crystallization which through the kinetic factor enhances the rate of precipitation and thereby depletes the residual liquid. The system quenches. Upon reducing the flow-rate further, the stable evolution continues towards point E. If the process is reversed and the flow-rate increased, the system evolves smoothly from A' to C' on a stable branch. At C', the excess heat brought in by a large input of fluid is no longer balanced by the output and is suddenly converted into latent heat. The system 'thaws'. A large fraction of solid is rapidly dissolved up to D' where the system joins a stable branch.
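The stability argument can be illustrated numerically by locating the zeros of $F(u, \mu)$ in equation (7.2.27) and testing the sign of $\partial F/\partial u$ at each of them, as in criterion (7.2.26). This is a sketch under stated assumptions: the parameter values $\theta = 3$, $\gamma = 10$, $\mu = 0.01$ are hypothetical and chosen here only because they produce three steady states; they are not claimed to be those of Figure 7.8. The grid search with bisection and the function names are likewise choices made for this example.

```python
import math

def F(u, mu, theta, gamma):
    """Right-hand side of equation (7.2.27)."""
    v = 1.0 + theta * (1.0 - u)          # reduced temperature, v = 1 + theta(1-u)
    return 1.0 - u - (u / mu) * math.exp(-gamma / v)

def dF_du(u, mu, theta, gamma, h=1e-7):
    """Centered finite-difference estimate of dF/du."""
    return (F(u + h, mu, theta, gamma) - F(u - h, mu, theta, gamma)) / (2.0 * h)

def steady_states(mu, theta, gamma, n_grid=4000):
    """Roots of F(u, mu) = 0 on (0, 1): sign changes on a grid, then bisection."""
    lo, hi = 1e-6, 1.0 - 1e-6
    step = (hi - lo) / n_grid
    roots = []
    a, fa = lo, F(lo, mu, theta, gamma)
    for k in range(1, n_grid + 1):
        b = lo + k * step
        fb = F(b, mu, theta, gamma)
        if fa * fb < 0.0:                 # a root is bracketed by [a, b]
            x0, x1, f0 = a, b, fa
            for _ in range(80):
                xm = 0.5 * (x0 + x1)
                fm = F(xm, mu, theta, gamma)
                if f0 * fm <= 0.0:
                    x1 = xm
                else:
                    x0, f0 = xm, fm
            roots.append(0.5 * (x0 + x1))
        a, fa = b, fb
    return roots

# Hypothetical parameter set (assumed for illustration only)
mu, theta, gamma = 0.01, 3.0, 10.0
roots = steady_states(mu, theta, gamma)
stable = [dF_du(u, mu, theta, gamma) < 0.0 for u in roots]
print(roots, stable)
```

For this parameter set the outer two roots satisfy the stability criterion and the middle one does not, which is the pattern described for the branches of Figure 7.8.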
The evolution of the system is therefore not reversible and is reminiscent of hysteresis effects in variable magnets. This dual behavior is one of the cases of what is known as a bifurcation. The system has access to competing steady-states and shifts from one state to another in a catastrophic move. The present case is fairly similar to the triggering of ignition in gases (Benson, 1982) in which combustion releases heat that enhances the rate of reaction. At the time of writing, the potential for irregularly fed magma bodies to have a liquid line of descent broken through bifurcation has not been explored. Instabilities due to non-linear interaction between processes of mass and heat conversion are well known in industrial chemical reactors and the interested reader could consult the book by Gray and Scott (1994). In Chapter 8, another case of bifurcation associated with metasomatic fronts is discussed in which the physical foundation of multiple steady-states is substantially different from the present example.

It must be realized that the basic reason for bifurcation is that the function $F$ is multiple-valued and therefore non-linear. Other sources of non-linearity, like auto-catalysis, have been explored systematically and have proven to be the starting point of geochemical catastrophes (e.g., Ortoleva, 1994).

7.2.7 Random geochemical variables

When the geochemical variable is not uniquely determined but is a random variable, we would like to be able to assess how the parameters of the population change through time. Let $c$ be a geochemical variable (e.g., the concentration of an element) assumed to be a continuous random variable defined over a domain $\Omega$ and $f(c)$ its density of probability function (e.g., a normal density function). $f(c)$ has the standard properties of a density function, i.e., $f(c) \geq 0$ everywhere and

$$\int_\Omega f(c)\,dc = 1$$

Given the autonomous evolution equation

$$\frac{dc}{dt} = F(c, t)$$

where $F$ is a known function, we inquire about how $f(c)$ changes with time. The problem is a Lagrangian problem actually similar to a conservation problem (see Chapter 8) since probabilities are conservative. Making a simple comparison with frequency histograms, whatever is lost from a frequency bin must be found in other bins since frequencies sum up to unity. The derivative $dc/dt$, or identically $F(c, t)$, has the meaning of a velocity along the $c$-axis position and $f(c)F(c)$ the meaning of a probability flux along that axis. Let $c_0$ and $c_0 + dc$ be two points along the $c$-axis, where $dc$ is arbitrarily small. $f(c_0)\,dc$ represents the probability that $c$ lies between $c_0$ and $c_0 + dc$. The fraction of the population that enters or leaves this segment at $c_0$ during the time interval $dt$ is $f(c_0)F(c_0)$ while $f(c_0 + dc)F(c_0 + dc)$ is the fraction of the population that enters or leaves this segment at $c_0 + dc$. We write that the rate of change of $f$ in that segment equals the sum of fluxes at both ends

$$\frac{\partial\left[f(c_0)\,dc\right]}{\partial t} = -\left[f(c_0 + dc)F(c_0 + dc) - f(c_0)F(c_0)\right] \qquad (7.2.28)$$

where the minus sign accounts for flux decreasing the inside probability when counted away from the boundaries. Linearizing the first term in the right-hand side through a Taylor series

$$f(c_0 + dc)F(c_0 + dc) \approx f(c_0)F(c_0) + \left[\frac{\partial(fF)}{\partial c}\right]_{c_0}dc$$

and, switching to an equality sign

$$\frac{\partial\left[f(c_0)\,dc\right]}{\partial t} = -\left[\frac{\partial(fF)}{\partial c}\right]_{c_0}dc$$

This equality holds true for any arbitrarily small segment $dc$ and any $c_0$, so the following equality is identically true

$$\frac{\partial f}{\partial t} = -\frac{\partial(fF)}{\partial c} = -F\frac{\partial f}{\partial c} - f\frac{\partial F}{\partial c} \qquad (7.2.29)$$
In a case where $F$ would contain a stochastic term (e.g., Brownian motion, noise), this equation would lead to the celebrated Fokker-Planck equation with a diffusion (second-order) term. This equation is a partial differential equation whose order depends on the exact form of $f$ and $F$. Its solution is usually not straightforward and integral transform methods (Laplace or Fourier) are necessary. The method of separation of variables rarely works. Nevertheless, useful information of practical geological importance is apparent in the form taken by this equation. The only density distributions that are time independent must obey

$$fF = f\frac{dc}{dt} = {\rm const}$$

If the process under investigation is radioactivity, for instance, then

$$\frac{dc}{dt} = -\lambda c$$

where $\lambda$ is the decay constant and the only steady density function would be proportional to $c^{-1}$. Unfortunately, such a distribution is usually unbounded. Radioactive decay affects the density function of radioactive elements. Two consequences of this simple analysis are far-reaching. First, the common perception that normal or log-normal functions may be used as catch-all probability density functions is physically untenable since these functions are not time-invariant relative to most geological processes (mixing, differentiation, ...). Second, there is more information on the physics of geological processes contained in the density function of concentrations, ratios, and other geochemical parameters than what is reflected by their mean or variance. Obviously, this information is deeply buried and convoluted, but deserves attention anyway.

7.2.8 Population dynamics

A related probabilistic approach to the evolution of heterogeneous systems consists in splitting the reservoirs into many units that have known geochemical characteristics and known rates of changes and to handle them collectively. We can relate this method to that of insurance companies which, in order to forecast their profitability and determine customers' contributions, divide the human population into classes defined by individual age, wealth, professional occupation, ... and assign each class, usually on the basis of surveys, a probability of accident, disease, or death. Models predicting the mass- and age-distribution of clastic sediments using a discretization of the geological and orogenic time-scale have been developed by Veizer and Jansen (1979) and applied to Nd crustal residence age by Allegre and Rousseau (1984). An extension of this model to the continuous time-scale was given by Michard et al. (1985) and will be discussed below. For the sake of illustration, we will calculate the Nd isotope composition of continents in a very simple model of crustal evolution.
A newly formed crustal segment results from the addition of both juvenile mantle and material recycled from the
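preexisting crustal segments. M(T, t) is the mass of crust formed prior to the time T and still surviving erosion at t. There is no continental crust at t = 0. The mean life of the crust relative to erosion is the constant τ, which means that the probability of a piece of crust to be eroded per unit time is independent of its age and equal to 1/τ, therefore
$$\frac{\partial M(T,t)}{\partial t} = -\frac{M(T,t)}{\tau}$$
This can be integrated into
$$M(T,t) = \text{const} \times e^{-t/\tau}$$
Writing this equation for t = T results in
$$M(T,T) = \text{const} \times e^{-T/\tau}$$
Next, we assume that juvenile crust is extracted from the mantle at a constant rate g. Therefore
$$M(T,T) = gT \tag{7.2.31}$$
or
$$M(T,t) = gT\,e^{-(t-T)/\tau} \tag{7.2.32}$$
The amount dM(T, t) of crust formed between T and T + dT still surviving at t is
$$dM(T,t) = \frac{\partial M(T,t)}{\partial T}\,dT = g\left(1+\frac{T}{\tau}\right)e^{-(t-T)/\tau}\,dT \tag{7.2.33}$$
and since the amount of crust M(t, t) existing at t is gt, the fraction f(T, t) dT of the preserved crust which formed between T and T + dT is
$$f(T,t)\,dT = \frac{1}{M(t,t)}\frac{\partial M(T,t)}{\partial T}\,dT = \frac{1}{t}\left(1+\frac{T}{\tau}\right)e^{-(t-T)/\tau}\,dT$$
The increment of crust newly formed at t is created at a rate g for the juvenile part, and, for the recycled part, as the negative of the erosion rate. Therefore
$$\left[\frac{\partial M(T,t)}{\partial T}\right]_{T=t} = g + \frac{1}{\tau}\int_0^t \frac{\partial M(T,t)}{\partial T}\,dT = g + \frac{M(t,t)}{\tau} = g\left(1+\frac{t}{\tau}\right) \tag{7.2.34}$$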
Note that the left-hand side has not been expressed as dM(t, t)/dt as in Michard et al. (1985), which would incorrectly imply a crustal growth rate, but as a probability density of the crustal ages for T in the vicinity of t. The integral on the right-hand side represents the eroded components summed over all the age classes [T, T + dT] from T = 0 to T = t.
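The survival law and the age density of the preserved crust can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names and the values of g, τ and t are hypothetical, and the quadrature is a plain Simpson rule. It verifies that the age density integrates to one and that the slope of M(T, t) at T = t equals g(1 + t/τ).

```python
import math

def surviving_mass(T, t, g, tau):
    """M(T, t): crust formed before time T that still survives at time t >= T."""
    return g * T * math.exp(-(t - T) / tau)

def age_density(T, t, tau):
    """f(T, t): fraction of the surviving crust per unit formation time T."""
    return (1.0 + T / tau) * math.exp(-(t - T) / tau) / t

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

# Hypothetical illustration values: g in mass units per Ga, tau and t in Ga.
g, tau, t = 1.0, 1.5, 4.5

# The age density should integrate to one over 0 <= T <= t.
norm = simpson(lambda T: age_density(T, t, tau), 0.0, t)

# Backward finite-difference slope of M(T, t) at T = t,
# to be compared with g * (1 + t / tau).
h = 1e-6
slope = (surviving_mass(t, t, g, tau) - surviving_mass(t - h, t, g, tau)) / h

print(norm, slope, g * (1.0 + t / tau))
```

Any other positive values of g, τ and t should give the same agreement, since the normalization does not depend on them.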
We can now turn to isotopic ratios. The Nd isotope composition $^{143}$Nd/$^{144}$Nd and the $^{147}$Sm/$^{144}$Nd ratio are noted y and x, respectively, with the m and c subscripts denoting mantle and crust. Using this notation, the chronometric equation of the Sm-Nd closed system reads
$$y(t) = y(T) + x(t)\left[e^{\lambda (t-T)} - 1\right]$$
(note that t and T are times and not ages). For practical purposes, the Sm/Nd ratios may be assumed to be constant and λt ≪ 1, so the chronometric equation linearly expanded becomes
$$y(t) \approx y(T) + \lambda x\,(t-T) \tag{7.2.35}$$
The $^{143}$Nd/$^{144}$Nd ratio at time t of a crust formed at T is
$$y_c(T,t) \approx y_c(T,T) + \lambda x_c\,(t-T) \tag{7.2.36}$$
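$y_c(t,t)$ is the weighted average of the ratio in the juvenile fraction and that in all the recycled fractions contributed by all crust segments formed between 0 and t
$$y_c(t,t) = \frac{g\,y_m(t) + \dfrac{1}{\tau}\displaystyle\int_0^t y_c(T,t)\,dM(T,t)}{g + \dfrac{1}{\tau}\displaystyle\int_0^t dM(T,t)} = \frac{g\,y_m(t) + \dfrac{1}{\tau}\displaystyle\int_0^t y_c(T,t)\,\dfrac{\partial M(T,t)}{\partial T}\,dT}{g + \dfrac{1}{\tau}\displaystyle\int_0^t \dfrac{\partial M(T,t)}{\partial T}\,dT} \tag{7.2.37}$$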
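Multiplying both sides of the last equality by the denominator of the right-hand side and combining the two integrals yields
$$g\left[y_c(t,t) - y_m(t)\right] = \frac{1}{\tau}\int_0^t \left[y_c(T,t) - y_c(t,t)\right]\frac{\partial M(T,t)}{\partial T}\,dT$$
It is convenient to use the difference of isotopic compositions between the crust and mantle values at the time the crust forms as a work variable, therefore
$$g\left[y_c(t,t) - y_m(t)\right] = \frac{1}{\tau}\int_0^t \left\{\left[y_c(T,t) - y_m(t)\right] - \left[y_c(t,t) - y_m(t)\right]\right\}\frac{\partial M(T,t)}{\partial T}\,dT$$
which we split as
$$g\left[y_c(t,t) - y_m(t)\right] = \frac{1}{\tau}\int_0^t \left[y_c(T,t) - y_m(t)\right]\frac{\partial M(T,t)}{\partial T}\,dT - \frac{1}{\tau}\left[y_c(t,t) - y_m(t)\right]\int_0^t \frac{\partial M(T,t)}{\partial T}\,dT$$
The last integral on the right-hand side is just M(t, t), i.e., gt, hence
$$g\left(1+\frac{t}{\tau}\right)\left[y_c(t,t) - y_m(t)\right] = \frac{1}{\tau}\int_0^t \left[y_c(T,t) - y_m(t)\right]\frac{\partial M(T,t)}{\partial T}\,dT$$
Since the y values under the integral sign are the values at t and not at the time T the crustal segment formed, a correction for decay over t − T brings us back to the formation time
$$\left(1+\frac{t}{\tau}\right)\left[y_c(t,t) - y_m(t)\right] = \frac{1}{\tau}\int_0^t \left[y_c(T,T) - y_m(T)\right]\left(1+\frac{T}{\tau}\right)e^{-(t-T)/\tau}\,dT + \frac{\lambda}{\tau}(x_c - x_m)\int_0^t \left(1+\frac{T}{\tau}\right)e^{-(t-T)/\tau}(t-T)\,dT$$
Defining the new variable u(t) as
$$u(t) = \left(1+\frac{t}{\tau}\right)e^{t/\tau}\left[y_c(t,t) - y_m(t)\right] \tag{7.2.38}$$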
with u(0) = 0, since mantle and crust are isotopically indistinguishable at t = 0, we can write
$$u(t) = \frac{1}{\tau}\int_0^t u(T)\,dT + \frac{\lambda}{\tau}(x_c - x_m)\int_0^t e^{T/\tau}\left(1+\frac{T}{\tau}\right)(t-T)\,dT \tag{7.2.39}$$
The last integral on the right-hand side of equation (7.2.39) is a standard example of calculus textbooks but we will nevertheless evaluate it through integration by parts. Let J be this integral divided by τ and expand the product of the different factors
$$J = \frac{1}{\tau}\int_0^t e^{T/\tau}\left[t + (t-\tau)\frac{T}{\tau} - \tau\left(\frac{T}{\tau}\right)^2\right]dT$$
This form suggests a change of variable z = T/τ, giving
$$J = \int_0^{t/\tau} e^{z}\left[t + (t-\tau)z - \tau z^2\right]dz$$
The easy route is to calculate this integral iteratively. Defining $I_n$ as
$$I_n = \int z^n e^z\,dz$$
Integration by parts gives
$$I_n = z^n e^z - n\int z^{n-1}e^z\,dz = z^n e^z - nI_{n-1}$$
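which produces the sequence
$$I_0 = e^z, \qquad I_1 = (z-1)\,e^z, \qquad I_2 = (z^2 - 2z + 2)\,e^z$$
Let us expand the expression under the integral sign of J
$$tI_0 + (t-\tau)I_1 - \tau I_2 = \left[t + (t-\tau)(z-1) - \tau(z^2 - 2z + 2)\right]e^z + \text{const}$$
We can get J by making the difference of this expression at z = t/τ and z = 0
$$J = (t-\tau)\,e^{t/\tau} - (-\tau)$$
which is rearranged as
$$J = (t-\tau)\,e^{t/\tau} + \tau = \tau\left[\left(\frac{t}{\tau} - 1\right)e^{t/\tau} + 1\right]$$
Inserting this expression of J into equation (7.2.39) gives
$$u(t) = \frac{1}{\tau}\int_0^t u(T)\,dT + \lambda\tau(x_c - x_m)\left[\left(\frac{t}{\tau} - 1\right)e^{t/\tau} + 1\right]$$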
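The closed form of J can be cross-checked against a direct quadrature, and the recursion on $I_n$ is easy to automate. The following sketch is an illustration only: the function names and the test values t = 4.5 and τ = 1.5 are hypothetical, and the quadrature is a plain Simpson rule.

```python
import math

def antiderivative_poly(n):
    """Coefficients (ascending powers of z) of p_n such that I_n = p_n(z) * e^z.

    Built from the recursion I_n = z^n e^z - n I_{n-1}, starting at I_0 = e^z.
    """
    coeffs = [1.0]
    for k in range(1, n + 1):
        new = [-k * c for c in coeffs] + [0.0]   # -k * p_{k-1}
        new[k] += 1.0                            # + z^k
        coeffs = new
    return coeffs

def J_closed(t, tau):
    """Closed form J = (t - tau) e^(t/tau) + tau."""
    return (t - tau) * math.exp(t / tau) + tau

def J_numeric(t, tau, n=2000):
    """Simpson rule for (1/tau) * integral_0^t e^(T/tau) (1 + T/tau) (t - T) dT."""
    def integrand(T):
        return math.exp(T / tau) * (1.0 + T / tau) * (t - T) / tau
    h = t / n
    total = integrand(0.0) + integrand(t)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0

print(antiderivative_poly(2))                 # coefficients of z^2 - 2z + 2
print(J_closed(4.5, 1.5), J_numeric(4.5, 1.5))
```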
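This integral equation can be transformed into an ordinary differential equation by taking its derivative relative to t and applying Leibniz's rule to the integral
$$u'(t) = \frac{u(t)}{\tau} + \lambda(x_c - x_m)\left[e^{t/\tau} + \left(\frac{t}{\tau} - 1\right)e^{t/\tau}\right]$$
Rearranging and noting that u(0) = 0, we get
$$u'(t) = \frac{u(t)}{\tau} + \frac{\lambda}{\tau}(x_c - x_m)\,t\,e^{t/\tau} \tag{7.2.40}$$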
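The solution to the homogeneous equation being $e^{t/\tau}$, we can write the solution u(t) to the complete equation as
$$u(t) = f(t)\,e^{t/\tau} \tag{7.2.41}$$
where f(t) is a function to be determined. Taking the derivative and comparing with equation (7.2.40), we get
$$u'(t) = f'(t)\,e^{t/\tau} + \frac{f(t)}{\tau}\,e^{t/\tau} = \frac{u(t)}{\tau} + \frac{\lambda}{\tau}(x_c - x_m)\,t\,e^{t/\tau}$$
or
$$f'(t) = \frac{\lambda}{\tau}(x_c - x_m)\,t$$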
which has the solution
$$f(t) = \frac{\lambda}{\tau}(x_c - x_m)\frac{t^2}{2} + \text{const}$$
Inserting this expression into equation (7.2.41), it becomes
$$u(t) = f(t)\,e^{t/\tau} = \frac{\lambda}{\tau}(x_c - x_m)\frac{t^2}{2}\,e^{t/\tau} + \text{const} \times e^{t/\tau}$$
We can now write explicitly u(t) in terms of the geochemical variable $y_c(t,t)$ through equation (7.2.38). Again, the condition that there is no primordial crust at t = 0 requires that $y_c(t,t)$ and $y_m(t)$ are equal at that time and therefore the constant is zero. Rearranging, we get
$$\left(1+\frac{t}{\tau}\right)\left[y_c(t,t) - y_m(t)\right] = \frac{\lambda}{\tau}(x_c - x_m)\frac{t^2}{2}$$
Finally, the expression derived by Michard et al. (1985) becomes
$$y_c(t,t) = y_m(t) + \frac{1}{2}\lambda(x_c - x_m)\frac{t^2}{\tau + t} \tag{7.2.42}$$
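As a sanity check on the chain of steps leading to equation (7.2.42), the differential equation (7.2.40) can be integrated numerically and converted back to $y_c(t,t) - y_m(t)$ through the definition of u(t). The sketch below is an illustration under stated assumptions: it uses a hand-written fourth-order Runge-Kutta scheme, works in units where λ(x_c − x_m) = 1 (the hypothetical parameter `lam_dx`), and takes arbitrary values of t and τ.

```python
import math

def rk4(f, t0, u0, t1, n=4000):
    """Classical fourth-order Runge-Kutta integration of u' = f(t, u)."""
    h = (t1 - t0) / n
    t, u = t0, u0
    for _ in range(n):
        k1 = f(t, u)
        k2 = f(t + 0.5 * h, u + 0.5 * h * k1)
        k3 = f(t + 0.5 * h, u + 0.5 * h * k2)
        k4 = f(t + h, u + h * k3)
        u += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return u

def offset_numeric(t, tau, lam_dx=1.0):
    """y_c(t,t) - y_m(t) obtained by integrating u' = u/tau + (lam_dx/tau) t e^(t/tau)."""
    def rhs(s, u):
        return u / tau + lam_dx * s * math.exp(s / tau) / tau
    u = rk4(rhs, 0.0, 0.0, t)
    return u * math.exp(-t / tau) / (1.0 + t / tau)

def offset_closed(t, tau, lam_dx=1.0):
    """Closed form of equation (7.2.42): lam_dx * t^2 / (2 (tau + t))."""
    return 0.5 * lam_dx * t * t / (tau + t)

print(offset_numeric(4.5, 1.5), offset_closed(4.5, 1.5))
```

The closed form tends to lam_dx · t/2 when τ is very small and to zero when τ is very large, which are the well-mixed and juvenile limits of the model.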
The crustal residence age $T_{DM}$ of sediments formed at time t, i.e., the mean age of their continental protolith, is defined as
$$T_{DM} = T_{strat} + \frac{1}{\lambda}\,\frac{y_c(t,t) - y_m(t)}{x_c - x_m} = T_{strat} + \frac{1}{2}\,\frac{t^2}{\tau + t} \tag{7.2.43}$$
where $T_{strat}$ is the deposition or stratigraphic age. If the characteristic time of erosion is short (τ ≈ 0), the crust is well-mixed and $T_{DM}$ is given by
$$T_{DM} = T_{strat} + \frac{T_0 - T_{strat}}{2}$$
where $T_0$ is the age of the oldest event of crust formation. If erosion is inefficient and slow (τ very large), there is no contribution from old to new crustal segments and $T_{DM} \approx T_{strat}$. The newly formed crust is said to be juvenile.

7.3 One element in several interacting reservoirs

Various geological problems deal with systems which, within a good approximation, can be considered as geochemically homogeneous over a certain time-scale. The mantle and the crust over periods of up to $\approx 10^6$ years, the ocean for most elements not involved in biological processes over periods of $\approx 10^3$ years are examples of quite homogeneous systems. These reservoirs can be thought of as 'boxes' with wholesale chemical properties, such as concentrations or 'total standing crop' of an element in the reservoir, and much can be learned about the geochemical evolution of several
boxes that are allowed to have chemical exchanges through the rather simple formalism of the 'box model'. Exchange of Sm and Nd between mantle and crust, and of CO2 between the ocean and atmosphere, will be investigated as simple practical examples. The simultaneous handling of multiple reservoirs by systems of equations was initiated by Southam and Hay (1976) and extensively developed by Lasaga (1980, 1981).

7.3.1 A closed-system 3-box model with concentrations as the variables

The layout of such a model is shown schematically in Figure 7.9. Since we are going to deal with only one conservative species, no ambiguity will arise if we drop temporarily the superscript. Let $V_k$ be the volume of the kth reservoir, $Q_{k\to l}$ the flux of material from reservoir (= box) k to reservoir l. There exists one equation per reservoir that describes the conservation of the species
$$\begin{aligned}
\frac{d(V_1C_1)}{dt} &= -(Q_{1\to 2} + Q_{1\to 3})\,C_1 + Q_{2\to 1}C_2 + Q_{3\to 1}C_3 \\
\frac{d(V_2C_2)}{dt} &= Q_{1\to 2}C_1 - (Q_{2\to 1} + Q_{2\to 3})\,C_2 + Q_{3\to 2}C_3 \\
\frac{d(V_3C_3)}{dt} &= Q_{1\to 3}C_1 + Q_{2\to 3}C_2 - (Q_{3\to 1} + Q_{3\to 2})\,C_3
\end{aligned} \tag{7.3.1}$$
These three equations are not independent, which can be checked by adding all three and verifying that the total amount of species i is constant
$$\frac{d}{dt}\left(V_1C_1 + V_2C_2 + V_3C_3\right) = 0 \tag{7.3.2}$$

Figure 7.9 A three-reservoir model. $V_j$ represents the volume of the reservoir j, $C_i$ the concentration in this reservoir of the element investigated, $Q_{i\to j}$ the material flux from reservoir i to reservoir j.
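Dividing each equation by the volume of the corresponding reservoir, we get
$$\begin{aligned}
\frac{dC_1}{dt} &= -\frac{Q_{1\to 2} + Q_{1\to 3}}{V_1}\,C_1 + \frac{Q_{2\to 1}}{V_1}\,C_2 + \frac{Q_{3\to 1}}{V_1}\,C_3 \\
\frac{dC_2}{dt} &= \frac{Q_{1\to 2}}{V_2}\,C_1 - \frac{Q_{2\to 1} + Q_{2\to 3}}{V_2}\,C_2 + \frac{Q_{3\to 2}}{V_2}\,C_3 \\
\frac{dC_3}{dt} &= \frac{Q_{1\to 3}}{V_3}\,C_1 + \frac{Q_{2\to 3}}{V_3}\,C_2 - \frac{Q_{3\to 1} + Q_{3\to 2}}{V_3}\,C_3
\end{aligned}$$
or, in a matrix form
$$\begin{pmatrix} dC_1/dt \\ dC_2/dt \\ dC_3/dt \end{pmatrix} =
\begin{pmatrix}
-\dfrac{Q_{1\to 2} + Q_{1\to 3}}{V_1} & \dfrac{Q_{2\to 1}}{V_1} & \dfrac{Q_{3\to 1}}{V_1} \\
\dfrac{Q_{1\to 2}}{V_2} & -\dfrac{Q_{2\to 1} + Q_{2\to 3}}{V_2} & \dfrac{Q_{3\to 2}}{V_2} \\
\dfrac{Q_{1\to 3}}{V_3} & \dfrac{Q_{2\to 3}}{V_3} & -\dfrac{Q_{3\to 1} + Q_{3\to 2}}{V_3}
\end{pmatrix}
\begin{pmatrix} C_1 \\ C_2 \\ C_3 \end{pmatrix} \tag{7.3.3}$$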
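The matrix is singular since $V_1$ multiplied by the first line + $V_2$ multiplied by the second line + $V_3$ multiplied by the third line sums up to zero. An entirely equivalent formulation uses the absolute amounts $V_iC_i$ present in each reservoir instead of concentrations
$$\frac{d(V_1C_1)}{dt} = -\frac{Q_{1\to 2} + Q_{1\to 3}}{V_1}\,(V_1C_1) + \frac{Q_{2\to 1}}{V_2}\,(V_2C_2) + \frac{Q_{3\to 1}}{V_3}\,(V_3C_3)$$
and similarly for reservoirs 2 and 3, with the resulting matrix equality
$$\begin{pmatrix} d(V_1C_1)/dt \\ d(V_2C_2)/dt \\ d(V_3C_3)/dt \end{pmatrix} =
\begin{pmatrix}
-\dfrac{Q_{1\to 2} + Q_{1\to 3}}{V_1} & \dfrac{Q_{2\to 1}}{V_2} & \dfrac{Q_{3\to 1}}{V_3} \\
\dfrac{Q_{1\to 2}}{V_1} & -\dfrac{Q_{2\to 1} + Q_{2\to 3}}{V_2} & \dfrac{Q_{3\to 2}}{V_3} \\
\dfrac{Q_{1\to 3}}{V_1} & \dfrac{Q_{2\to 3}}{V_2} & -\dfrac{Q_{3\to 1} + Q_{3\to 2}}{V_3}
\end{pmatrix}
\begin{pmatrix} V_1C_1 \\ V_2C_2 \\ V_3C_3 \end{pmatrix} \tag{7.3.4}$$
Again, the matrix is singular since its rows add up to a row of zeros.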
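The closure property of the three-box matrix can be illustrated with a few lines of code. The sketch below is only a hedged example: the volumes `V`, the fluxes `Q` (with `Q[i][j]` the material flux from box i to box j) and the explicit Euler integrator are hypothetical choices, with fluxes picked so that each volume stays constant. It builds the matrix acting on the amounts $V_iC_i$, checks that the rows add up to zero, and shows that the total amount is conserved while the concentrations relax towards a common value.

```python
def amount_matrix(Q, V):
    """Matrix B such that d(V_i C_i)/dt = sum_j B[i][j] * (V_j C_j).

    Q[i][j] is the material flux from reservoir i to reservoir j.
    """
    n = len(V)
    B = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                B[i][j] = Q[j][i] / V[j]     # input into i carried by material leaving j
                B[i][i] -= Q[i][j] / V[i]    # output from i towards j
    return B

def euler(B, amounts, dt, steps):
    """Explicit Euler integration of dx/dt = B x (adequate for this small demo)."""
    n = len(amounts)
    x = list(amounts)
    for _ in range(steps):
        rates = [sum(B[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [x[i] + dt * rates[i] for i in range(n)]
    return x

# Hypothetical volumes and material fluxes (inputs balance outputs for each box).
V = [1.0, 5.0, 50.0]
Q = [[0.0, 2.0, 0.5],
     [1.0, 0.0, 3.0],
     [1.5, 2.0, 0.0]]

B = amount_matrix(Q, V)
column_sums = [sum(B[i][j] for i in range(3)) for j in range(3)]

start = [10.0, 0.0, 0.0]                 # all of the species initially in box 1
end = euler(B, start, dt=0.01, steps=5000)
concentrations = [end[i] / V[i] for i in range(3)]

print(column_sums)       # each entry is zero up to round-off
print(sum(end))          # total amount, conserved
print(concentrations)    # all close to 10 / 56
```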
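This case can be generalized from three to n geochemical reservoirs using
$$\frac{dC_i}{dt} = -\sum_{\substack{j=1 \\ j\ne i}}^{n} \frac{Q_{i\to j}}{V_i}\,C_i + \sum_{\substack{j=1 \\ j\ne i}}^{n} \frac{Q_{j\to i}}{V_i}\,C_j \tag{7.3.5}$$
where the first summation refers to outputs and the second summation to inputs. Equivalently
$$\frac{d(V_iC_i)}{dt} = -\sum_{\substack{j=1 \\ j\ne i}}^{n} \frac{Q_{i\to j}}{V_i}\,(V_iC_i) + \sum_{\substack{j=1 \\ j\ne i}}^{n} \frac{Q_{j\to i}}{V_j}\,(V_jC_j) \tag{7.3.6}$$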
7.3.2 The general box model: an empirical model

It is often difficult to define precisely the elemental flux from a system to another as a product of a mass flux of a carrier multiplied by a concentration in this carrier. For instance, the flux of carbon from the biosphere to the atmosphere is not adequately represented by a carrier flux since carbon dioxide escapes directly to the air. We therefore have to resort to a direct formulation in terms of total quantities (amounts, e.g., in tons, kilograms or moles) and fluxes. Denoting $M_i$ the total quantity of the species under consideration in the reservoir i and $J_{i\to j}$ the flux of the same species from reservoir i to reservoir j, we note that
$$\left(\frac{dM_i}{dt}\right)_{out} = -\sum_{j\ne i} J_{i\to j} \quad\text{and}\quad \left(\frac{dM_i}{dt}\right)_{in} = \sum_{j\ne i} J_{j\to i} \tag{7.3.7}$$
which we rewrite
$$\left(\frac{dM_i}{dt}\right)_{out} = -\sum_{j\ne i} \frac{J_{i\to j}}{M_i}\,M_i \quad\text{and}\quad \left(\frac{dM_i}{dt}\right)_{in} = \sum_{j\ne i} \frac{J_{j\to i}}{M_j}\,M_j$$
We note that the ratio
$$k_{i\to j} = \frac{J_{i\to j}}{M_i}$$
is the ratio of a flux to a mass. It has therefore the dimension of an inverse time (e.g., a$^{-1}$) and it will be further assumed to be constant. This assumption amounts to considering that the time the element i spends within a reservoir is controlled by parameters independent of both the standing crop $M_i$ and the various fluxes $J_{i\to j}$ and $J_{j\to i}$. Typically, time constants arise from hydrodynamic conditions or from entrainment by major carrier species other than water, air, .... Such a simple model works well for diluted solutions but is clearly wrong when, for instance, $M_i$ is buffered by solubility conditions or when the fluxes are controlled by non-linear, e.g., autocatalytic effects (Lasaga, 1980). Combining the fluxes gives the mass balance of the species under consideration in the ith reservoir
$$\frac{dM_i}{dt} = -\sum_{\substack{j=1 \\ j\ne i}}^{n} k_{i\to j}M_i + \sum_{\substack{j=1 \\ j\ne i}}^{n} k_{j\to i}M_j \tag{7.3.8}$$
Defining the current element of the matrix A by the following expressions
$$a_{ii} = -\sum_{\substack{j=1 \\ j\ne i}}^{n} k_{i\to j} = -\frac{1}{\tau^{(i)}} \quad\text{and}\quad a_{ij} = k_{j\to i}$$
where $\tau^{(i)}$ is the residence time of the considered species in the ith reservoir. Lumping the amounts $M_i$ together into the vector x, the system can be recast into the standard form
$$\frac{dx}{dt} = Ax \tag{7.3.9}$$
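The construction of A from standing crops and fluxes is mechanical and can be sketched as follows. This is a minimal illustration, not taken from the text: the function names, the three-reservoir masses `M` and the fluxes `J` (with `J[i][j]` the flux from reservoir i to reservoir j) are hypothetical, and the fluxes are chosen so that the starting state is a steady state.

```python
def rate_matrix(J, M):
    """Matrix A of dx/dt = A x, with a_ij = k_{j->i} = J[j][i] / M[j] for i != j
    and a_ii = -sum_j k_{i->j} = -1 / (residence time in reservoir i)."""
    n = len(M)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                A[i][j] = J[j][i] / M[j]
                A[i][i] -= J[i][j] / M[i]
    return A

def residence_times(A):
    """Residence time in each reservoir: negative reciprocal of the diagonal of A."""
    return [-1.0 / A[i][i] for i in range(len(A))]

# Hypothetical standing crops and fluxes (inputs balance outputs in every reservoir).
M = [100.0, 50.0, 10.0]
J = [[0.0, 4.0, 1.0],
     [3.0, 0.0, 2.0],
     [2.0, 1.0, 0.0]]

A = rate_matrix(J, M)
tau = residence_times(A)
dMdt = [sum(A[i][j] * M[j] for j in range(3)) for i in range(3)]

print(tau)    # 20, 10 and 10/3 time units
print(dMdt)   # zero up to round-off: M is a steady state
```

Each residence time equals the standing crop divided by the total output flux of the reservoir, which is the meaning attached to $\tau^{(i)}$ above.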
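When the matrix A is constant, the system of differential equations is linear. This system is solved with the procedure described in Section 2.5. The non-symmetric matrix A is first diagonalized
$$A = U\Lambda U^{-1} = \sum_{i=1}^{n} \lambda_i\,\mathcal{A}_i$$
where $\mathcal{A}_i$ is the matrix formed as
$$\mathcal{A}_i = u_i v_i^{T}$$
with $u_i$ the ith column of U and $v_i^{T}$ the ith row of $U^{-1}$. If the matrix is time-invariant, or equivalently, if residence times are constant, the solution can be calculated as
$$x = e^{At}x_0 = \sum_{i=1}^{n} e^{\lambda_i t}\,\mathcal{A}_i\,x_0 \tag{7.3.10}$$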
where $x_0$ is the vector of amounts at t = 0. The matrix exponential $e^{At}$ is known as the transition matrix of the system. Due to the way the matrix is built, this system has very simple properties.

(i) The rows of the matrix A are not linearly independent, since
$$\sum_{i=1}^{n} a_{ij} = 0 \qquad (j = 1, \ldots, n)$$
hence the matrix is singular and has one eigenvalue equal to zero. In other words, in an n-box model, only $n^2 - n$ flux coefficients can be fixed independently.

(ii) Given one zero eigenvalue for A, since the complex eigenvalues of a real matrix come in conjugate pairs, a two-reservoir system cannot have a complex eigenvalue. A minimum number of three reservoirs is required for periodic fluctuations. A small number of reservoirs cannot give oscillations of significant amplitude (the reader is urged to make a numerical experiment with random matrices).

(iii) The non-zero eigenvalues have non-positive real parts, a consequence of applying Gershgorin's circle theorem columnwise (see Section 2.4). Indeed, eigenvalues are within the circles centered on the diagonal terms $a_{jj}$ (always negative) and having a radius $r_j$ such that
$$r_j = \sum_{i\ne j} a_{ij}$$
The modulus notation is omitted for these terms are positive. But as we just saw
$$\sum_{i=1}^{n} a_{ij} = 0$$
and therefore
$$|\lambda - a_{jj}| \le r_j = -a_{jj}$$
Since each $a_{jj}$ is negative, this inequality holds true only if the real part of each λ is negative or zero. As discussed in Section 2.5, the solution is therefore physically stable. When t → ∞ all but the exponential term with the zero eigenvalue tend to zero, possibly after a few oscillations if some eigenvalues are complex. The amounts relax towards the steady-state given by
$$x^{\infty} = \mathcal{A}_0\,x_0 \tag{7.3.11}$$
where $\mathcal{A}_0$ is the matrix $\mathcal{A}_i$ associated with the zero eigenvalue. If $x_0$ corresponds to the unperturbed set of elemental amounts in each reservoir at steady-state, i.e., if $x_0 = x^{\infty}$, then we can write
$$x^{\infty} = \mathcal{A}_0\,x^{\infty}$$
which is an alternative way of viewing the steady-state values as associated with $\lambda_i = 0$.

The global phosphate system is described in Figure 7.10 (Lasaga, 1980). Table 7.1 gives the amounts held by each reservoir, and Table 7.2 the fluxes between reservoirs. Assuming steady-state, calculate the evolution of the world phosphate system if $10\,000 \times 10^9$ kg of phosphorus from fertilizer (mined from an isolated reservoir) were dumped on land in a short period of time. An example will show how the $k_{i\to j}$ terms are evaluated
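Proceeding similarly with the other terms gives the matrix A
$$A = \begin{pmatrix}
-5.0\times10^{-9} & 9.15\times10^{-5} & 0 & 0 & 0 & 1.95\times10^{-5} \\
5.0\times10^{-9} & -4.18\times10^{-4} & 0.0212 & 0 & 0 & 0 \\
0 & 3.10\times10^{-4} & -0.0212 & 0 & 0 & 0 \\
0 & 0 & 0 & -7.54 & 0.384 & 0 \\
0 & 8.50\times10^{-6} & 0 & 7.23 & -0.390 & 6.66\times10^{-4} \\
0 & 0 & 0 & 0.304 & 6.53\times10^{-3} & -6.85\times10^{-4}
\end{pmatrix}$$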
Table 7.1. Amounts of phosphorus ($10^9$ kg P) stored in each reservoir at t = 0 and the initial perturbation.

i   Reservoir           Steady-state x∞      Perturbation δx
1   Sediments           4 × 10^9             0
2   Land                2 × 10^5             1 × 10^4
3   Terrestrial biota   3000                 0
4   Oceanic biota       138                  0
5   Surface ocean       2710                 0
6   Deep ocean          8.71 × 10^4          0
Table 7.2. Phosphorus fluxes between the six reservoirs ($10^9$ kg a$^{-1}$). Fluxes not given are assumed to be negligible.

$J_{6\to 5} = 58$    $J_{5\to 4} = 1040$    $J_{2\to 5} = 1.7$

Figure 7.10 The long-term phosphate system (Lasaga, 1980). The arrows show the fluxes that are taken into account. Reservoirs: 1, sediments; 2, land; 3, land biota; 4, oceanic biota; 5, surface ocean; 6, deep ocean.
The negative reciprocal of phosphorus residence time in each reservoir is found on the diagonal entries of matrix A (Table 7.3). A is factored giving six eigenvalues and six characteristic times of the system as the negative reciprocal of the eigenvalues (Table 7.4).

Table 7.3. Residence time of phosphorus in each reservoir.

Reservoir           τ_P (a)
Sediments           200 × 10^6
Land                2395
Terrestrial biota   47.2
Oceanic biota       0.133
Surface ocean       2.56
Deep ocean          1459

Table 7.4. Eigenvalues and characteristic times of the six-reservoir system for phosphorus.

i   Eigenvalue λ_i      Characteristic time (a)
1   -7.91               0.126
2   -0.0217             46.2
3   -0.0215             46.5
4   -9.85 × 10^-5       10150
5   -1.89 × 10^-5       52900
6   4.50 × 10^-19       ∞

The six eigen- (column-) vectors form the matrix U given by
$$U = \begin{pmatrix}
-6.71\times10^{-8} & -6.52\times10^{-4} & -2.99\times10^{-3} & -6.70\times10^{-1} & -7.18\times10^{-1} & 0.999999 \\
0.0 & 0.0 & 7.08\times10^{-1} & 7.38\times10^{-1} & -4.44\times10^{-5} & 5.00\times10^{-5} \\
0.0 & 0.0 & -7.05\times10^{-1} & 1.11\times10^{-2} & -6.67\times10^{-7} & 7.50\times10^{-7} \\
-7.20\times10^{-1} & -3.52\times10^{-2} & 1.56\times10^{-3} & -1.04\times10^{-4} & 1.07\times10^{-3} & 3.45\times10^{-8} \\
6.93\times10^{-1} & -6.89\times10^{-1} & 3.04\times10^{-2} & -2.04\times10^{-3} & 2.11\times10^{-2} & 6.78\times10^{-7} \\
2.72\times10^{-2} & 7.24\times10^{-1} & -3.23\times10^{-2} & -7.67\times10^{-2} & 6.96\times10^{-1} & 2.18\times10^{-5}
\end{pmatrix}$$

Figure 7.11 The long-term phosphate system. Evolution of the amount of phosphorus (in units of $10^9$ kg) held by the six systems described in Figure 7.10. One panel per reservoir (e.g., sediments, × $10^9$; deep ocean, × $10^3$); horizontal axis: time (a), logarithmic scale from $10^0$ to $10^8$.
The results are shown graphically in Figure 7.11. As shown by Lasaga (1980), the global relaxation time of the system, which is its longest finite time constant (52 900 a), is shorter than the longest residence time in the system (P in sediments with 200 Ma). It is left to the reader to show that these results can be predicted from Gershgorin's theorem (Section 2.4) applied linewise to matrix A. Matrix A is ill-conditioned (i.e., numerically singular) because its five non-zero eigenvalues vary by more than five orders of magnitude: the condition number $|\lambda_1/\lambda_5|$ is $4.2 \times 10^5$. Eigenvectors associated
with nearby eigenvalues are 'wobbly' (Golub and van Loan, 1983). The system 'hesitates' numerically between the truly zero sixth eigenvalue and the second smallest eigenvalue that corresponds to the longest time-constant (52 900 a). In numerical terms, the null-space of the matrix A behaves as if it had a dimension of two. The long-term behavior of a system is poorly known whenever reservoirs have widely different residence times. In the present case, the sediment reservoir contains most of the total phosphate. P, therefore, spends most of its time in this reservoir and the system reacts as if one extra eigenvalue was approaching zero. As a result, large machine- or software-dependent errors may arise on calculating the eigenvector coefficients associated with the smallest eigenvalues. We now calculate $\mathcal{A}_0$ as
$$\mathcal{A}_0 = \begin{pmatrix}
9.9993\times10^{-1} & 9.9993\times10^{-1} & \cdots & 9.9993\times10^{-1} \\
4.9996\times10^{-5} & 4.9996\times10^{-5} & \cdots & 4.9996\times10^{-5} \\
7.4995\times10^{-7} & 7.4995\times10^{-7} & \cdots & 7.4995\times10^{-7} \\
3.4497\times10^{-8} & 3.4497\times10^{-8} & \cdots & 3.4497\times10^{-8} \\
6.7745\times10^{-7} & 6.7745\times10^{-7} & \cdots & 6.7745\times10^{-7} \\
2.1773\times10^{-5} & 2.1773\times10^{-5} & \cdots & 2.1773\times10^{-5}
\end{pmatrix}$$
in which each of the six columns is the same.
It will be checked that
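within five decimal places. Dumping fertilizer on land perturbs the initial state of the system, so that we write
$$x_0 = x^{\infty} + \delta x$$
and the new steady-state will be given by
$$\mathcal{A}_0\,x_0 = \mathcal{A}_0\left(x^{\infty} + \delta x\right) = x^{\infty} + \mathcal{A}_0\,\delta x$$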
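The new steady-state can be evaluated with simple arithmetic once the structure of $\mathcal{A}_0$ is recognized. The sketch below is a hedged illustration: it assumes that every column of $\mathcal{A}_0$ equals the vector of steady-state fractions (which is what the printed matrix suggests), so that the perturbation is simply redistributed in proportion to those fractions. The amounts are those of Table 7.1; the variable names are hypothetical.

```python
reservoirs = ["Sediments", "Land", "Terrestrial biota",
              "Oceanic biota", "Surface ocean", "Deep ocean"]

# Table 7.1, in units of 10^9 kg P.
x_inf = [4e9, 2e5, 3000.0, 138.0, 2710.0, 8.71e4]
dx = [0.0, 1e4, 0.0, 0.0, 0.0, 0.0]

# Assumption: each column of A_0 is the vector of steady-state fractions.
total = sum(x_inf)
fractions = [x / total for x in x_inf]

# A_0 (x_inf + dx) = x_inf + fractions * (total perturbation).
added = sum(dx)
new_x_inf = [x + p * added for x, p in zip(x_inf, fractions)]

for name, p, old, new in zip(reservoirs, fractions, x_inf, new_x_inf):
    print(f"{name:18s} fraction = {p:.4e}   gain = {new - old:.4e}")
```

Under this assumption almost all of the added phosphorus ultimately ends up in the sediments, while the land keeps only about 0.5 × 10^9 kg of the 10^4 × 10^9 kg initially dumped on it.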
Sundquist (1985) provides an interesting application of these methods to the geological carbon cycle.

7.3.3 The general box model with forcing terms

One way of circumventing the difficulties encountered for systems with widely different time constants is to split the reservoirs into two categories. The first category will comprise the reservoirs with short residence times which will be explicitly required to satisfy the constraints of mass conservation. Reservoirs with long residence times will make up the second category which we will treat as sources and sinks. Equation (7.3.8) will be transformed into
$$\frac{dM_i}{dt} = -\sum_{\substack{j=1 \\ j\ne i}}^{n} k_{i\to j}M_i + \sum_{\substack{j=1 \\ j\ne i}}^{n} k_{j\to i}M_j + \sum_{\substack{j=1 \\ j\ne i}}^{n} \left(J^*_{j\to i} - J^*_{i\to j}\right) \tag{7.3.12}$$
where the J* symbols refer to input-output terms which can be functions of time and of all the $M_i$. Defining b as the vector made of the n rightmost summation terms in equation (7.3.12), we rewrite equation (7.3.9) as
$$\frac{dx}{dt} = Ax + b \tag{7.3.13}$$
If b does not depend explicitly on the $M_i$, decoupling reservoirs introduces a forcing term without affecting the short-term relaxation behavior. The diagonal decomposition of A as $U\Lambda U^{-1}$ gives
$$\frac{dx}{dt} = U\Lambda U^{-1}x + b$$
or, pre-multiplying by $U^{-1}$
$$\frac{d(U^{-1}x)}{dt} = \Lambda U^{-1}x + U^{-1}b \tag{7.3.14}$$
Introducing the new variables $y = U^{-1}x$ and $h = U^{-1}b$, we get a system of n scalar equations
$$\frac{dy_i}{dt} = \lambda_i y_i + h_i \tag{7.3.15}$$
which is integrated as
$$y_i(t) = y_i(0)\,e^{\lambda_i t} + e^{\lambda_i t}\int_0^t e^{-\lambda_i u}\,h_i(u)\,du$$
In matrix form, these n equations read
$$y = e^{\Lambda t}y_0 + e^{\Lambda t}\int_0^t e^{-\Lambda u}\,h(u)\,du$$
where $y_0 = U^{-1}x_0$ or
$$x = Ue^{\Lambda t}U^{-1}x_0 + Ue^{\Lambda t}\int_0^t e^{-\Lambda u}\,U^{-1}b(u)\,du$$
which, using the matrix exponential notation of Section 2.5, becomes
$$x = e^{At}x_0 + \int_0^t e^{A(t-u)}\,b(u)\,du \tag{7.3.16}$$
Two oceanic basins noted A and B contain initially a mass of sodium $M_A^0$ and $M_B^0$. Sodium residence time in each basin is $\tau_A$ and $\tau_B$. Sodium exchange takes place between A and B only (Figure 7.12) and steady-state is assumed. From t = 0, an important mass $M_0$ of evaporite is subject to weathering and brings sodium to basin A with a flux $J_0(t)$ described by the equation
$$J_0(t) = J_0(0)\,e^{-t/\theta} \tag{7.3.17}$$
where $J_0(0)$ and θ are constants. Calculate the new steady-state and the transient evolution of Na in the two reservoirs. Assume the following values: $\tau_A$ = 12 Ma, $\tau_B$ = 60 Ma, θ = 5 Ma, $M_A^0$ = 2, $M_B^0$ = 10, $M_0$ = 3, all the masses being in arbitrary units.

Figure 7.12 Exchange of Na in an open two-reservoir system: the flux $J_0(t)$ of Na weathered from evaporites introduces a forcing term into conservation equations.

Let us first calculate steady-state masses when all Na weathered from evaporites has been transported to the sea and all the parameters become time-invariant. At steady-state, fluxes between reservoirs must be equal
$$J^{\infty}_{A\to B} = J^{\infty}_{B\to A}$$
which, combined with the conservation condition
$$\Sigma(\text{Na}) = M_A^0 + M_B^0 + M_0 = M_A^{\infty} + M_B^{\infty}$$
gives the successive equalities
$$J^{\infty}_{B\to A} = J^{\infty}_{A\to B} = \frac{M_A^{\infty}}{\tau_A} = \frac{M_B^{\infty}}{\tau_B} = \frac{M_A^0 + M_B^0 + M_0}{\tau_A + \tau_B}$$
with the symbol ∞ labelling steady-state values. Therefore
$$M_A^{\infty} = \frac{\tau_A}{\tau_A + \tau_B}\left(M_A^0 + M_B^0 + M_0\right), \qquad M_B^{\infty} = \frac{\tau_B}{\tau_A + \tau_B}\left(M_A^0 + M_B^0 + M_0\right) \tag{7.3.18}$$
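Inserting numerical values gives $M_A^{\infty}$ = 2.5 and $M_B^{\infty}$ = 12.5 in the same units as the other masses. The total flux of sodium released by evaporites from t = 0 to t = ∞ must be equal to $M_0$ and therefore
$$M_0 = \int_0^{\infty} J_0(t)\,dt = J_0(0)\int_0^{\infty} e^{-t/\theta}\,dt = J_0(0)\,\theta$$
from which we can calculate $J_0(0)$. In matrix form, mass balance reads
$$\begin{pmatrix} dM_A/dt \\ dM_B/dt \end{pmatrix} = \begin{pmatrix} -\dfrac{1}{\tau_A} & \dfrac{1}{\tau_B} \\ \dfrac{1}{\tau_A} & -\dfrac{1}{\tau_B} \end{pmatrix}\begin{pmatrix} M_A \\ M_B \end{pmatrix} + \begin{pmatrix} J_0(t) \\ 0 \end{pmatrix} \tag{7.3.19}$$
or, in the usual form
$$\frac{dx}{dt} = Ax + b$$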
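The matrix A can be diagonalized as $U\Lambda U^{-1}$. The eigenvalues $\lambda_1$ and $\lambda_2$ are the roots of the equation
$$\left(-\frac{1}{\tau_A} - \lambda\right)\left(-\frac{1}{\tau_B} - \lambda\right) - \frac{1}{\tau_A\tau_B} = 0$$
i.e.,
$$\lambda_1 = -\left(\frac{1}{\tau_A} + \frac{1}{\tau_B}\right) = -\frac{\tau_A + \tau_B}{\tau_A\tau_B}$$
and, as expected from the closure condition, $\lambda_2 = 0$. The relaxation time τ of the whole system satisfies
$$\frac{1}{\tau} = \frac{1}{\tau_A} + \frac{1}{\tau_B} \tag{7.3.20}$$
or τ = 10 Ma. Using the method outlined in Section 2.5, we obtain the eigenvector matrix U and its inverse $U^{-1}$, up to the normalization of the eigenvectors, as
$$U = \begin{pmatrix} 1 & \tau_A \\ -1 & \tau_B \end{pmatrix} \quad\text{and}\quad U^{-1} = \frac{1}{\tau_A + \tau_B}\begin{pmatrix} \tau_B & -\tau_A \\ 1 & 1 \end{pmatrix}$$
The matrix exponential $e^{At}$ is calculated using the standard rules of matrix multiplication
$$e^{At} = Ue^{\Lambda t}U^{-1} = U\begin{pmatrix} e^{-t/\tau} & 0 \\ 0 & 1 \end{pmatrix}U^{-1}$$
which is evaluated as
$$e^{At} = \frac{1}{\tau_A + \tau_B}\begin{pmatrix} \tau_A + \tau_B\,e^{-t/\tau} & \tau_A\left(1 - e^{-t/\tau}\right) \\ \tau_B\left(1 - e^{-t/\tau}\right) & \tau_B + \tau_A\,e^{-t/\tau} \end{pmatrix} \tag{7.3.21}$$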
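Preparing for integration of the second term on the right-hand side of equation (7.3.16), we first compute the product $e^{A(t-u)}b(u)$, with $b(u) = (M_0/\theta)\,e^{-u/\theta}\,(1, 0)^T$, as
$$e^{A(t-u)}\,b(u) = \frac{M_0}{\theta(\tau_A + \tau_B)}\,e^{-u/\theta}\begin{pmatrix} \tau_A + \tau_B\,e^{-(t-u)/\tau} \\ \tau_B\left[1 - e^{-(t-u)/\tau}\right] \end{pmatrix}$$
The indefinite integral of $e^{A(t-u)}b(u)$ with respect to u is
$$\int e^{A(t-u)}\,b(u)\,du = \frac{M_0}{\tau_A + \tau_B}\begin{pmatrix} -\tau_A\,e^{-u/\theta} - \dfrac{\tau_B\tau}{\tau - \theta}\,e^{-(t-u)/\tau - u/\theta} \\ -\tau_B\,e^{-u/\theta} + \dfrac{\tau_B\tau}{\tau - \theta}\,e^{-(t-u)/\tau - u/\theta} \end{pmatrix} + \text{const}$$
which, evaluated between 0 and t, gives
$$\int_0^t e^{A(t-u)}\,b(u)\,du = \frac{M_0}{\tau_A + \tau_B}\begin{pmatrix} \tau_A\left(1 - e^{-t/\theta}\right) + \dfrac{\tau_B\tau}{\tau - \theta}\left(e^{-t/\tau} - e^{-t/\theta}\right) \\ \tau_B\left(1 - e^{-t/\theta}\right) - \dfrac{\tau_B\tau}{\tau - \theta}\left(e^{-t/\tau} - e^{-t/\theta}\right) \end{pmatrix} \tag{7.3.22}$$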
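We can combine the last results for the transient mass of sodium in reservoir A, noting that $e^{At}x_0 = x_0$ because the initial state is a steady state,
$$M_A(t) = M_A^0 + \frac{M_0}{\tau_A + \tau_B}\left[\tau_A\left(1 - e^{-t/\theta}\right) + \frac{\tau_B\tau}{\tau - \theta}\left(e^{-t/\tau} - e^{-t/\theta}\right)\right]$$
and in reservoir B
$$M_B(t) = M_B^0 + \frac{M_0\tau_B}{\tau_A + \tau_B}\left[\left(1 - e^{-t/\theta}\right) - \frac{\tau}{\tau - \theta}\left(e^{-t/\tau} - e^{-t/\theta}\right)\right]$$
In order to make the problem dimensionless, we divide the last two equations by $M_A^{\infty} + M_B^{\infty}$, and refer to the fraction of total sodium held at t by reservoirs A, B, and evaporites as $f_A(t)$, $f_B(t)$, and $f_0(t)$, respectively. The solution takes the simple form
$$f_A(t) = \frac{\tau_A}{\tau_A + \tau_B}\left[1 - f_0(0)\,e^{-t/\theta}\right] + f_0(0)\,\frac{\tau_B\tau}{(\tau_A + \tau_B)(\tau - \theta)}\left(e^{-t/\tau} - e^{-t/\theta}\right) \tag{7.3.23}$$
and
$$f_B(t) = \frac{\tau_B}{\tau_A + \tau_B}\left[1 - f_0(0)\,e^{-t/\theta}\right] - f_0(0)\,\frac{\tau_B\tau}{(\tau_A + \tau_B)(\tau - \theta)}\left(e^{-t/\tau} - e^{-t/\theta}\right) \tag{7.3.24}$$
and
$$f_0(t) = f_0(0)\,e^{-t/\theta} \tag{7.3.25}$$
This transient solution is depicted in Figure 7.13. For more complex dynamic problems, the use of Laplace or Fourier transforms would be particularly recommended (Wiberg, 1971; Anonymous, 1982).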
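The two-basin sodium example lends itself to a direct numerical check: the mass-balance equations with the exponentially decaying evaporite flux can be integrated step by step and compared with the closed-form transient. The sketch below is an illustration only. It uses the parameter values of the example, a hand-written fourth-order Runge-Kutta scheme, and function names that are hypothetical; the closed form coded here is the one obtained from the matrix exponential with the initial steady state as starting point.

```python
import math

tau_A, tau_B, theta = 12.0, 60.0, 5.0      # Ma
MA0, MB0, M0 = 2.0, 10.0, 3.0              # arbitrary mass units
tau = 1.0 / (1.0 / tau_A + 1.0 / tau_B)    # relaxation time of the pair, 10 Ma
total = MA0 + MB0 + M0

# New steady state: masses proportional to residence times.
MA_inf = tau_A / (tau_A + tau_B) * total
MB_inf = tau_B / (tau_A + tau_B) * total

def closed_form(t):
    """Transient masses (M_A, M_B) at time t from the analytical solution."""
    released = 1.0 - math.exp(-t / theta)
    mixed = tau / (tau - theta) * (math.exp(-t / tau) - math.exp(-t / theta))
    MA = MA0 + M0 / (tau_A + tau_B) * (tau_A * released + tau_B * mixed)
    MB = MB0 + M0 * tau_B / (tau_A + tau_B) * (released - mixed)
    return MA, MB

def rhs(t, MA, MB):
    """Right-hand side of the mass balance, with J0(0) = M0 / theta."""
    J0 = M0 / theta * math.exp(-t / theta)
    exchange = MA / tau_A - MB / tau_B
    return -exchange + J0, exchange

def integrate(t_end, n=4000):
    """Fourth-order Runge-Kutta integration from the initial steady state."""
    h = t_end / n
    t, MA, MB = 0.0, MA0, MB0
    for _ in range(n):
        a1, b1 = rhs(t, MA, MB)
        a2, b2 = rhs(t + h / 2, MA + h * a1 / 2, MB + h * b1 / 2)
        a3, b3 = rhs(t + h / 2, MA + h * a2 / 2, MB + h * b2 / 2)
        a4, b4 = rhs(t + h, MA + h * a3, MB + h * b3)
        MA += h * (a1 + 2 * a2 + 2 * a3 + a4) / 6
        MB += h * (b1 + 2 * b2 + 2 * b3 + b4) / 6
        t += h
    return MA, MB

print(MA_inf, MB_inf)
print(closed_form(20.0), integrate(20.0))
```

Dividing the masses by `total` gives the fractions plotted against time in the figure of this example.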
7.4 Several elements in several interacting reservoirs

This is the most general and promising case which finds application in all the major geochemical and geodynamic systems. Recent years have witnessed the outpouring of many interesting attempts at geochemical modeling using complicated systematics with application to the long-term evolution of the mantle-crust system and, as a response to environmental worries, the evolution of the ocean-atmosphere system. A crucial tool for a better understanding is the use of isotopic tracers, but the price is the handling of non-linear systematics. The same observation can be made for chemical systems in which reactions introduce non-linear relationships among the geochemical variables. One example of isotopic systems and one example of a chemical system will be selected with simple assumptions made in order to illustrate the potential virtues of this approach. Stability analysis of these non-linear systems is considered beyond the scope of this book and the reader may refer to specialized publications (e.g., Arnold, 1978; Procaccia and Ross, 1978, and the review of Crawford, 1991).
Figure 7.13 Evolution of the fractions $f_A(t)$ and $f_B(t)$ of total Na held by the two reservoirs A and B in the system described in Figure 7.12. $f_0(t)$ is the fraction remaining in evaporites at t. Horizontal axis: time, t (Ma), from 0 to 20.
7.4.1 Multiple reservoir isotopic systems

We will now show how the evolution of chemical and isotopic ratios in a multiple-reservoir configuration brings in information about kinetics and chemical fractionation among the reservoirs. That sort of approach goes back to the early days of modern isotope geochemistry (e.g., Jacobsen and Wasserburg, 1979; O'Nions et al., 1979; Allegre et al., 1980; DePaolo, 1980; Allegre et al., 1983a, b) but its usefulness as a geodynamic constraint is often insufficiently perceived because of many sources of indeterminacy. This is a special case of coupling concentrations since a ratio involves two concentration or quantity variables as its numerator and denominator. We will first set up the equations for the number of moles of three species, one stable species N, one radioactive species M, and one radiogenic species N*. For example, N = $^{144}$Nd, M = $^{147}$Sm and N* = $^{143}$Nd. For the evolution of a stable isotope ratio, e.g., $^{18}$O/$^{16}$O, the equations are still valid but, of course, without the terms involving radioactive decay. We denote R the ratio M/N, in this example $^{147}$Sm/$^{144}$Nd, and R* the ratio N*/N, here $^{143}$Nd/$^{144}$Nd. For the stable species N, mass conservation reads
$$\frac{dN_i}{dt} = -\sum_{j\ne i} \frac{J^N_{i\to j}}{N_i}\,N_i + \sum_{j\ne i} \frac{J^N_{j\to i}}{N_j}\,N_j \tag{7.4.1}$$
Dividing by the total amount of species N in the whole system and introducing the mass fraction (allotment) $x_j^N$ of species N hosted by the jth reservoir, we get
$$\frac{dx_i^N}{dt} = -x_i^N\sum_{j\ne i} k^N_{i\to j} + \sum_{j\ne i} k^N_{j\to i}\,x_j^N$$
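For the radioactive species, the mass balance equation contains a radioactive decay term
$$\frac{dM_i}{dt} = -\sum_{j\ne i} \frac{J^M_{i\to j}}{M_i}\,M_i + \sum_{j\ne i} \frac{J^M_{j\to i}}{M_j}\,M_j - \lambda M_i \tag{7.4.2}$$
where λ is the M decay constant, while for the radiogenic species, there is radioactive accumulation
$$\frac{dN_i^*}{dt} = -\sum_{j\ne i} \frac{J^{N^*}_{i\to j}}{N_i^*}\,N_i^* + \sum_{j\ne i} \frac{J^{N^*}_{j\to i}}{N_j^*}\,N_j^* + \lambda M_i$$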
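The ratio M/N in the ith reservoir changes as
$$\frac{dR_i}{dt} = \frac{d(M_i/N_i)}{dt} = \frac{1}{N_i}\left(\frac{dM_i}{dt} - \frac{M_i}{N_i}\frac{dN_i}{dt}\right)$$
Inserting the appropriate expressions, we get
$$\frac{dR_i}{dt} = -\sum_{j\ne i} \frac{J^M_{i\to j}}{M_i}\frac{M_i}{N_i} + \sum_{j\ne i} \frac{J^M_{j\to i}}{M_j}\frac{M_j}{N_i} - \lambda\frac{M_i}{N_i} + \frac{M_i}{N_i}\sum_{j\ne i} \frac{J^N_{i\to j}}{N_i} - \frac{M_i}{N_i}\sum_{j\ne i} \frac{J^N_{j\to i}}{N_j}\frac{N_j}{N_i}$$
which is equivalent to
$$\frac{dR_i}{dt} = -\sum_{j\ne i} \left(\frac{J^M_{i\to j}}{M_i} - \frac{J^N_{i\to j}}{N_i}\right)\frac{M_i}{N_i} + \sum_{j\ne i} \frac{N_j}{N_i}\left(\frac{J^M_{j\to i}}{M_j}\frac{M_j}{N_j} - \frac{J^N_{j\to i}}{N_j}\frac{M_i}{N_i}\right) - \lambda\frac{M_i}{N_i}$$
Since M and N are different chemical species, we make fractionation factors apparent in the form
$$\frac{dR_i}{dt} = -\sum_{j\ne i} \frac{J^N_{i\to j}}{N_i}\left(\frac{J^M_{i\to j}/J^N_{i\to j}}{M_i/N_i} - 1\right)\frac{M_i}{N_i} + \sum_{j\ne i} \frac{J^N_{j\to i}}{N_j}\frac{N_j}{N_i}\left(\frac{J^M_{j\to i}/J^N_{j\to i}}{M_j/N_j}\,\frac{M_j}{N_j} - \frac{M_i}{N_i}\right) - \lambda\frac{M_i}{N_i}$$
We call the expression
$$D^{M/N}_{i\to j} = \frac{J^M_{i\to j}/J^N_{i\to j}}{M_i/N_i} \tag{7.4.3}$$
the coefficient of relative fractionation between M and N upon extraction from reservoir i into reservoir j. It is truly the ratio of M/N in the material extracted towards the reservoir j from the parent reservoir i to M/N in the parent reservoir i.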
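Inserting this new variable, the equation now reads
$$\frac{dR_i}{dt} = \sum_{j\ne i} k^N_{i\to j}\left(1 - D^{M/N}_{i\to j}\right)R_i + \sum_{j\ne i} k^N_{j\to i}\,\frac{x_j^N}{x_i^N}\left(D^{M/N}_{j\to i}R_j - R_i\right) - \lambda R_i$$
where we have introduced the kinetic factors k for N defined in Section 7.3.2. The equation for the ratio R* = N*/N is quite similar, apart from a positive accumulation term. Isotopes of the same element have identical chemical properties, so fractionation factors are unity
$$D^{N^*/N}_{i\to j} = 1$$
and the evolution of the radiogenic ratio is given by
$$\frac{dR_i^*}{dt} = \sum_{j\ne i} k^N_{j\to i}\,\frac{x_j^N}{x_i^N}\left(R_j^* - R_i^*\right) + \lambda R_i$$
Summarizing, each set of radioactive-radiogenic pair + stable isotope gives a non-linear set of three equations per reservoir
$$\begin{aligned}
\frac{dx_i^N}{dt} &= -x_i^N\sum_{j\ne i} k^N_{i\to j} + \sum_{j\ne i} k^N_{j\to i}\,x_j^N \\
\frac{dR_i}{dt} &= \sum_{j\ne i} k^N_{i\to j}\left(1 - D^{M/N}_{i\to j}\right)R_i + \sum_{j\ne i} k^N_{j\to i}\,\frac{x_j^N}{x_i^N}\left(D^{M/N}_{j\to i}R_j - R_i\right) - \lambda R_i \\
\frac{dR_i^*}{dt} &= \sum_{j\ne i} k^N_{j\to i}\,\frac{x_j^N}{x_i^N}\left(R_j^* - R_i^*\right) + \lambda R_i
\end{aligned} \tag{7.4.5}$$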
Solving the forward problem of the isotopic and chemical evolution of n reservoirs exchanging a radioactive isotope and its daughter isotope requires the solution of 3n − 1 differential equations (the minus one stems from the closure condition). The parameters are n(n − 1) independent flux factors k for the stable isotope N and n(n − 1) independent M/N fractionation factors D. In addition, the n values of $R_i^*$, the n values of $R_i$, and the n − 1 allotments $x_i^N$ of the stable isotope among the reservoirs must be assumed at some time, preferably at the beginning of the evolution (e.g., 4.5 Ga ago), or in the modern times, in which case integration is carried out backwards in time. One kind of inverse problem providing useful kinetic information assumes a known evolution of N, R, and R*, typically their present-day values and derivatives. Another kind of inverse problem with an alternative set of equations replaces the knowledge of the rates of change in equations (7.4.5) by the properties of the bulk system, commonly by assuming some sort of grand average of the whole system, as with the chondritic composition of the bulk silicate Earth. Two reservoirs provide five equations for [2 × 2(2 − 1) + 1], i.e., five unknowns. The two-reservoir configuration can apparently be exactly solved for the kinetic and fractionation factors of each isotopic system, although we are going to find some indeterminacy for the fractionation factors. Three reservoirs provide eight equations per system for [2 × 3(3 − 1) + 1], i.e., 13 unknowns. The system cannot be solved for the kinetic and fractionation factors unless some assumptions can be made, presumably on some of the six fractionation factors. For a larger number of reservoirs, the problem turns out to be a non-linear set of underconstrained equations. It therefore becomes essential to make a large number of assumptions including the chemical fractionation factors.

Figure 7.14 The two-reservoir mantle-crust system.

Calculate the kinetic and fractionation factors for the $^{147}$Sm-$^{143}$Nd system in a simple modern Earth as a two-reservoir mantle-crust (m-c) system (Figure 7.14) as a function of Nd distribution in each reservoir. R is the elemental ratio $^{147}$Sm/$^{144}$Nd and R* the isotopic ratio $^{143}$Nd/$^{144}$Nd. Assume the following values for the parameters
$$R_m = 0.25, \qquad R_c = 0.12, \qquad R_m^* = 0.5131, \qquad R_c^* = 0.5118$$
We further assume that steady-state is achieved for Sm/Nd fractionation between the two reservoirs
$$\frac{dR_m}{dt} = \frac{dR_c}{dt} = 0$$
while examination of geochemical data on modern and ancient rocks suggests the following approximation
$$\frac{dR_m^*}{dt} = \lambda_{^{147}Sm}\,s_m, \qquad \frac{dR_c^*}{dt} = \lambda_{^{147}Sm}\,s_c$$
with $s_m$ = 0.22 and $s_c$ = 0.17. The rates of change $s_m$ and $s_c$ of the $^{143}$Nd/$^{144}$Nd ratio are expressed in units of $\lambda_{^{147}Sm}$ in order to emphasize the difference between the observed values for each reservoir and the value estimated from the curves of Nd isotope secular evolution. This was shown by DePaolo (1980) to be compelling evidence for continuing exchange between mantle and crust. For two reservoirs, the summation signs disappear from the descriptive equations (7.4.5). In addition, we can safely equate the kinetic and fractionation factors of $^{144}$Nd
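with those of elemental Nd. The six conservation equations (7.4.5) read
$$\begin{aligned}
\frac{dx_m^{Nd}}{dt} &= -k^{Nd}_{m\to c}\,x_m^{Nd} + k^{Nd}_{c\to m}\,x_c^{Nd} \\
\frac{dx_c^{Nd}}{dt} &= k^{Nd}_{m\to c}\,x_m^{Nd} - k^{Nd}_{c\to m}\,x_c^{Nd} \\
\frac{dR_m}{dt} &= k^{Nd}_{m\to c}\left(1 - D^{Sm/Nd}_{m\to c}\right)R_m + k^{Nd}_{c\to m}\,\frac{x_c^{Nd}}{x_m^{Nd}}\left(D^{Sm/Nd}_{c\to m}R_c - R_m\right) - \lambda R_m \\
\frac{dR_c}{dt} &= k^{Nd}_{c\to m}\left(1 - D^{Sm/Nd}_{c\to m}\right)R_c + k^{Nd}_{m\to c}\,\frac{x_m^{Nd}}{x_c^{Nd}}\left(D^{Sm/Nd}_{m\to c}R_m - R_c\right) - \lambda R_c \\
\frac{dR_m^*}{dt} &= k^{Nd}_{c\to m}\,\frac{x_c^{Nd}}{x_m^{Nd}}\left(R_c^* - R_m^*\right) + \lambda R_m \\
\frac{dR_c^*}{dt} &= k^{Nd}_{m\to c}\,\frac{x_m^{Nd}}{x_c^{Nd}}\left(R_m^* - R_c^*\right) + \lambda R_c
\end{aligned}$$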
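From the last two equations expressing the rate of change of isotopic ratios, we extract a relationship between the kinetic (flux) constants k and the fraction x of Nd apportioned to each reservoir
$$\frac{k^{Nd}_{c\to m}}{\lambda_{^{147}Sm}} = \frac{x_m^{Nd}}{x_c^{Nd}}\,\frac{s_m - R_m}{R_c^* - R_m^*}, \qquad \frac{k^{Nd}_{m\to c}}{\lambda_{^{147}Sm}} = \frac{x_c^{Nd}}{x_m^{Nd}}\,\frac{s_c - R_c}{R_m^* - R_c^*} \tag{7.4.6}$$
For compact notation, we define a dimensionless time as
$$\tau = \lambda_{^{147}Sm}\,t$$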
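The rate of change of Sm/Nd fractionation in the mantle (assumed to be at steady-state, which is not inconsistent with observations, see Albarede and Brouxel, 1987), will provide additional relationships for the fractionation factors D. Inserting expression (7.4.6) for k values into equations (7.4.5) produces
$$\begin{aligned}
\frac{dR_m}{d\tau} &= \frac{x_c^{Nd}}{x_m^{Nd}}\,\frac{s_c - R_c}{R_m^* - R_c^*}\left(1 - D^{Sm/Nd}_{m\to c}\right)R_m + \frac{R_m - s_m}{R_m^* - R_c^*}\left(D^{Sm/Nd}_{c\to m}R_c - R_m\right) - R_m \\
\frac{dR_c}{d\tau} &= \frac{x_m^{Nd}}{x_c^{Nd}}\,\frac{R_m - s_m}{R_m^* - R_c^*}\left(1 - D^{Sm/Nd}_{c\to m}\right)R_c + \frac{s_c - R_c}{R_m^* - R_c^*}\left(D^{Sm/Nd}_{m\to c}R_m - R_c\right) - R_c
\end{aligned}$$
In order to make apparent only two unknowns related to the partition coefficients, we can write
$$D^{Sm/Nd}_{c\to m}R_c - R_m = -\left(1 - D^{Sm/Nd}_{c\to m}\right)R_c - (R_m - R_c), \qquad D^{Sm/Nd}_{m\to c}R_m - R_c = -\left(1 - D^{Sm/Nd}_{m\to c}\right)R_m + (R_m - R_c)$$
which results in a system of differential equations that, in a matrix form, becomes
$$\begin{pmatrix} \dfrac{dR_m}{d\tau} + R_m + \dfrac{R_m - s_m}{R_m^* - R_c^*}(R_m - R_c) \\ \dfrac{dR_c}{d\tau} + R_c - \dfrac{s_c - R_c}{R_m^* - R_c^*}(R_m - R_c) \end{pmatrix} =
\begin{pmatrix} \dfrac{x_c^{Nd}}{x_m^{Nd}}\,\dfrac{s_c - R_c}{R_m^* - R_c^*}\,R_m & -\dfrac{R_m - s_m}{R_m^* - R_c^*}\,R_c \\ -\dfrac{s_c - R_c}{R_m^* - R_c^*}\,R_m & \dfrac{x_m^{Nd}}{x_c^{Nd}}\,\dfrac{R_m - s_m}{R_m^* - R_c^*}\,R_c \end{pmatrix}
\begin{pmatrix} 1 - D^{Sm/Nd}_{m\to c} \\ 1 - D^{Sm/Nd}_{c\to m} \end{pmatrix} \tag{7.4.7}$$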
The matrix on the right-hand side is singular (its determinant is clearly zero), so that the partition coefficients cannot be independently determined. However, the allotments x of Nd between the two reservoirs can be retrieved by adding the two lines of equation (7.4.7) after multiplication of the first row by x m Nd and of the second row by x c Nd . The terms involving the partition coefficients cancel out and we get Nd^m
i
Y
N d ^ c
\/n
n
\
^m ~ Sm
„
Nd,
/n
j? \
5
c ~
c
„
Nd
or
dx
— +K
-R
Inserting the numerical values for the modern Earth, we obtain ° 2 5 °' 2 2 0.25 - 0 0.5131-0.5118 = 0.666 0.17-0.12 0-(0.25-12) + 0.12 0.5131-0.5118
(0.12 - 0.25)
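About 40 percent of the Nd held by the mantle + crust system therefore resides in the crust. This estimate depends on neither the ($^{143}$Nd/$^{144}$Nd)$_{bulk}$ nor the ($^{147}$Sm/$^{144}$Nd)$_{bulk}$ ratio of the mantle-crust system (in particular, it does not depend on the chondritic silicate-Earth reference). Interestingly, ($^{143}$Nd/$^{144}$Nd)$_{bulk}$ and ($^{147}$Sm/$^{144}$Nd)$_{bulk}$ can be evaluated from the simple mass balance equations

$$\left(\frac{^{143}\mathrm{Nd}}{^{144}\mathrm{Nd}}\right)_{bulk} = x_c^{Nd}\left(\frac{^{143}\mathrm{Nd}}{^{144}\mathrm{Nd}}\right)_c + x_m^{Nd}\left(\frac{^{143}\mathrm{Nd}}{^{144}\mathrm{Nd}}\right)_m = 0.4 \times 0.5118 + 0.6 \times 0.5131 = 0.5126$$

$$\left(\frac{^{147}\mathrm{Sm}}{^{144}\mathrm{Nd}}\right)_{bulk} = x_c^{Nd} R_c + x_m^{Nd} R_m = 0.4 \times 0.12 + 0.6 \times 0.25 = 0.198$$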
These results fall fairly close to chondritic values, 0.51264 and 0.1967, respectively. The Nd kinetic factors (the reciprocal of residence times) can be computed from equations (7.4.6) as O22-0:25L 0.5118-0.5131 0.666
/c m ^ c Nd =
17 12 °-°0.5131-0.5118
which hints at steady-state not being attained over the age of the Earth (4.5 Ga). Finally, if we assume that crustal Sm and Nd are recycled to the mantle without fractionation, i.e., $D_{c \to m}^{Sm/Nd} \approx 1$, we get an estimate of $D_{m \to c}^{Sm/Nd}$ from equation (7.4.7) for the crust as 0=-0.25
Q17
~ a 1 2 (l-Dm^cSm/Nd) + 0 + (0.25-0.12) Q 1 7 ~ 0 1 2 0.5131-0.5118 0.5131-0.5118
0.12
and, therefore
If that sort of simplistic model is acceptable, the extent of fractionation hints at rather small degrees of melting in the mantle and/or the presence of a residual phase that fractionates Sm from Nd, probably garnet.

7.4.2 Non-linear coupling of geochemical reservoirs
When different elements in interacting reservoirs are allowed to dissociate, to react with each other to form complexes, and to precipitate, the rates of change of concentrations usually become non-linear functions of these concentrations. Since no unique theory can be established to describe all possible situations, a reasonably simple example amenable to calculation and of geological interest will give a perspective for more complete formulations. The outline of a model for the response of carbonates in a simplified ocean-atmosphere system to a sudden influx of carbon dioxide has been put forward by Broecker and Peng (1982), Berner et al. (1983), and Lasaga et al. (1985). The model presented in Figure 7.15 aims at suggesting tendencies and kinetics in the carbonate system on the 2000-200 000 year scale and will resort to drastic approximations which do not hold if time-scales outside of that range are considered. Because a well-mixed ocean is assumed, this model is particularly inappropriate for short-term predictions of the possible effects of fossil fuel burning. It may however be used to discuss the consequences of the glacial-interglacial changes in CO$_2$ that have been found in polar ice cores (Barnola et al., 1987).

Figure 7.15 A simple ocean-atmosphere-continent system. Pressure of CO$_2$ enhances Ca release from the continental crust (which is assumed to be made of CaSiO$_3$) and controls the depth of calcite saturation. Calcite precipitation is therefore controlled by the hypsometric curve, equation (7.4.8), and $p_{CO_2}$.

Assumptions of the present model are:
(1) The mass $M$ of the ocean ($1.37 \times 10^{21}$ kg) and the continental runoff ($3.6 \times 10^{22}$ kg Ma$^{-1}$) are constant over 200 000 years.
(2) Atmospheric CO$_2$ equilibrates instantaneously with dissolved oceanic carbonates. Actually, Broecker and Peng (1982) show that the rate-limiting step for this process is vertical mixing in the ocean, which takes place in approximately 1600 years.
(3) The ocean is chemically homogeneous and, in addition to carbonates, contains Ca$^{2+}$ plus enough inert ions to adjust alkalinity to the values observed in modern seawater ($\approx 2.1 \times 10^{-3}$ eq kg$^{-1}$). The inert ions, assigned as Na$^+$ and Cl$^-$ for convenience, are assumed to be time-invariant in the ocean.
(4) CaCO$_3$ surface productivity $P$ by organisms is constant. CaCO$_3$ preservation is proportional to the oceanic surface shallower than a depth $z_s$ which we identify with both the carbonate compensation depth (CCD) and the lysocline. The bathymetry of the oceans is approximated by a normal function with a mean value of 5 km and a standard deviation of 1.25 km. These figures are in reasonable agreement with the hypsometric curve of Menard and Smith (1966). By virtue of the relationship between normal functions and error functions (Chapter 8, Appendix A), the fraction of the ocean surface which is shallower than $z_s$ is therefore given by

$$F(z_s) = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{z_s - 5}{1.25\sqrt{2}}\right)\right] \qquad (7.4.8)$$
(5) Solubility of calcite increases significantly with increasing pressure. In the present ocean, which contains $10.3 \times 10^{-3}$ mol kg$^{-1}$ Ca$^{2+}$, the depth $z_s$ of CaCO$_3$ saturation is given by $[\mathrm{CO_3^{2-}}] = 90 \times 10^{-6} \times e^{0.16(z_s - 4)}$ (Broecker and Takahashi, 1978), where $z_s$ is the depth in km and carbonate concentrations are in mol kg$^{-1}$. Since seawater Ca$^{2+}$ concentration is allowed to change as a result of calcite precipitation and river input, it is more general to state that the saturation product $K_s$ changes as

$$K_s = [\mathrm{Ca^{2+}}][\mathrm{CO_3^{2-}}] = 0.927 \times 10^{-6} \times e^{0.16(z_s - 4)} \qquad (7.4.9)$$
(6) Dissolution of carbonate sediments is, probably unduly (Broecker and Peng, 1982), neglected.
(7) Continental crust is assumed to be made of calcium silicate. High CO$_2$ pressure increases weathering and Ca$^{2+}$ in the runoff according to the fictitious reaction

$$\mathrm{CaSiO_3 + CO_2 = SiO_2 + Ca^{2+} + CO_3^{2-}} \qquad (7.4.10)$$
A different weathering reaction proposed by Berner et al. (1983) involves HCO$_3^-$ instead, but is entirely equivalent to the present equation when HCO$_3^-$ dissociation is taken into account.
(8) Activity and fugacity coefficients are equal to unity.
(9) The present-day concentrations are taken as steady-state values.
(10) The ocean-atmosphere system contains $35 \times 10^{15}$ kg C, only 2 percent of which resides in the atmosphere (Berner and Berner, 1987).

Assuming that $0.7 \times 10^{15}$ kg of reduced C from forests or fossil fuel are nearly instantaneously burned as CO$_2$ at $t = 0$, i.e., that an additional 2 percent of oxidized carbon is injected into the ocean-atmosphere system, calculate how Ca$^{2+}$, alkalinity, and $\Sigma$(CO$_2$) in the ocean, the depth of Ca-carbonate saturation (lysocline), and the atmospheric $p_{CO_2}$ change over 100 000 years.

Carbonic acid is diprotic and its dissociation may be written with the two equations

$$K_1 = \frac{[\mathrm{H^+}][\mathrm{HCO_3^-}]}{[\mathrm{H_2CO_3}]} \qquad K_2 = \frac{[\mathrm{H^+}][\mathrm{CO_3^{2-}}]}{[\mathrm{HCO_3^-}]} \qquad (7.4.11)$$

while dissolution of atmospheric carbon dioxide in ocean and river water is assumed to obey Henry's law with constant solubility coefficient $\alpha$

$$[\mathrm{H_2CO_3}] = \alpha\, p_{CO_2} \qquad (7.4.12)$$

Combining the last three equations gives the CO$_2$ pressure as a function of the speciation of carbonates

$$p_{CO_2} = \frac{K_2}{\alpha K_1}\,\frac{[\mathrm{HCO_3^-}]^2}{[\mathrm{CO_3^{2-}}]} \qquad (7.4.13)$$
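For pH > p$K_1$, the total of carbonates can be approximated as

$$\Sigma(\mathrm{CO_2}) \approx [\mathrm{HCO_3^-}] + [\mathrm{CO_3^{2-}}] \qquad (7.4.14)$$

while, neglecting minor ions H$^+$ and OH$^-$, electrical neutrality demands

$$[\mathrm{Na^+}] + 2[\mathrm{Ca^{2+}}] = [\mathrm{Cl^-}] + [\mathrm{HCO_3^-}] + 2[\mathrm{CO_3^{2-}}]$$

For short, we write $p = p_{CO_2}$, $x = [\mathrm{Ca^{2+}}]$, $y = \Sigma(\mathrm{CO_2})$, and label riverine values with the subscript $r$. The alkalinity $A$ of a solution (Stumm and Morgan, 1981) is defined as its neutralizing capacity

$$A = [\mathrm{HCO_3^-}] + 2[\mathrm{CO_3^{2-}}] \qquad (7.4.15)$$

Subtracting (7.4.14) from (7.4.15) and turning to the short notation results in

$$[\mathrm{CO_3^{2-}}] = A - \Sigma(\mathrm{CO_2}) = A - y \qquad (7.4.16)$$

while, upon subtracting (7.4.15) from twice equation (7.4.14), we get

$$[\mathrm{HCO_3^-}] = 2\Sigma(\mathrm{CO_2}) - A = 2y - A \qquad (7.4.17)$$

Introducing $K' = \alpha K_1/K_2$, we now express equation (7.4.13) as

$$K'p = \frac{(2y - A)^2}{A - y} \qquad (7.4.18)$$

Rearranging, we get the second-order polynomial in $y$

$$4y^2 - (4A - K'p)\,y + A^2 - AK'p = 0$$

which has one non-negative solution (see Broecker and Peng, 1982)

$$y = \frac{1}{8}\left\{4A - K'p + \left[(4A - K'p)^2 - (4A)^2 + 16AK'p\right]^{1/2}\right\} \qquad (7.4.19)$$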
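The root selection in equation (7.4.19) can be checked with a few lines of code. The sketch below is only an illustration: the function names `sigma_co2` and `speciation` are hypothetical, and the numerical inputs are the steady-state values quoted later in this section ($A = 2.1 \times 10^{-3}$ eq kg$^{-1}$, $p = 3 \times 10^{-4}$ atm, $K' = 113$ mol kg$^{-1}$ atm$^{-1}$).

```python
import math

def sigma_co2(A, p, k_prime):
    """Non-negative root y of 4y^2 - (4A - K'p)y + A^2 - A K'p = 0 (eq. 7.4.19)."""
    kp = k_prime * p
    b = 4.0 * A - kp
    # Discriminant written as in the text; it simplifies to kp*(kp + 8A) >= 0.
    disc = b * b - (4.0 * A) ** 2 + 16.0 * A * kp
    return (b + math.sqrt(disc)) / 8.0

def speciation(A, y):
    """Return ([HCO3-], [CO3 2-]) from eqs (7.4.17) and (7.4.16)."""
    return 2.0 * y - A, A - y

# Steady-state values quoted in the text (illustrative inputs).
A_ss, p_ss, K_PRIME = 2.1e-3, 3.0e-4, 113.0
y_ss = sigma_co2(A_ss, p_ss, K_PRIME)
hco3, co3 = speciation(A_ss, y_ss)
print(f"Sigma(CO2) = {y_ss:.4e} mol/kg, [HCO3-] = {hco3:.3e}, [CO3 2-] = {co3:.3e}")
```

Substituting the two species back into equation (7.4.18) is a convenient consistency check, since any sign error in the quadratic shows up immediately as a mismatch with $K'p$.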
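The weathering reaction leads to the relationship

$$\frac{[\mathrm{Ca^{2+}}]_r\,[\mathrm{CO_3^{2-}}]_r}{p_{CO_2}} = K^*$$

where $K^*$ is the equilibrium constant. We assume that rivers transport Ca$^{2+}$ but very little other major ion besides carbonates, i.e.,

$$A_r = 2x_r$$

and therefore

$$\frac{x_r\,(2x_r - y_r)}{p} = K^*$$

Combining with equation (7.4.19) rewritten for runoff concentrations

$$y_r = \frac{1}{8}\left\{8x_r - K'p + \left[(8x_r - K'p)^2 - (8x_r)^2 + 32x_rK'p\right]^{1/2}\right\}$$

we get an explicit expression that can be solved for $K^*$ if $x_r$ and $p_{CO_2}$ are known

$$K^* = \frac{x_r}{p}\left\{2x_r - \frac{1}{8}\left[8x_r - K'p + \left((8x_r - K'p)^2 - (8x_r)^2 + 32x_rK'p\right)^{1/2}\right]\right\} \qquad (7.4.21)$$

This equation can also be solved for $x_r$ if $p_{CO_2}$ and $K^*$ are known

$$x_r\left\{2x_r - \frac{1}{8}\left[8x_r - K'p + \left((8x_r - K'p)^2 - (8x_r)^2 + 32x_rK'p\right)^{1/2}\right]\right\} - K^*p = 0 \qquad (7.4.22)$$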
by transformation into a fourth-degree polynomial. It is nevertheless solved faster using a Newton refinement scheme since a reasonably good initial estimate is usually available. The system is controlled by three independent variables, Ca$^{2+}$, $\Sigma$(CO$_2$), and alkalinity $A$. In differential form, the rates of change can be written as the difference between the input rate in runoff and the output rate by CaCO$_3$ sedimentation

$$M\frac{dx}{dt} = R\,x_r[p(x,y,A)] - P\,F[z_s(x,y,A)]$$
$$M\frac{dy}{dt} = R\,y_r - P\,F[z_s(x,y,A)]$$
$$M\frac{dA}{dt} = 2R\,x_r[p(x,y,A)] - 2P\,F[z_s(x,y,A)] \qquad (7.4.23)$$
with the initial conditions $x = x_0$, $y = y_0$, $A = A_0$. The rate of Ca and carbonate removal (CaCO$_3$ sedimentation) is equal to the calcite productivity $P$ multiplied by the preservation function $F$. We take $K' = 113$ mol kg$^{-1}$ atm$^{-1}$. At steady-state, which we note with the superscript $\dagger$, $p^\dagger = 3 \times 10^{-4}$ atm, $x^\dagger = 10.3 \times 10^{-3}$ mol kg$^{-1}$, $x_r^\dagger = 3.6 \times 10^{-4}$ mol kg$^{-1}$. Since in this model CaCO$_3$ is the only sink of carbon dioxide, $x_r^\dagger = y_r^\dagger$ at steady-state, and therefore $y_r = y_r^\dagger = 3.6 \times 10^{-4}$ mol kg$^{-1}$. This is enough to estimate the weathering constant $K^*$ from equation (7.4.21)

$$K^* = \frac{0.00036}{0.0003}\left\{2 \times 0.00036 - \frac{1}{8}\left[8 \times 0.00036 - 113 \times 0.0003 + \left((8 \times 0.00036 - 113 \times 0.0003)^2 - (8 \times 0.00036)^2 + 32 \times 0.00036 \times 113 \times 0.0003\right)^{1/2}\right]\right\}$$

or $K^* = 1.69 \times 10^{-5}$. Under a steady-state CO$_2$ pressure of $3 \times 10^{-4}$ atm, $A^\dagger = 2.1 \times 10^{-3}$ eq kg$^{-1}$. Inserting this value into equation (7.4.19) gives the steady-state $\Sigma$(CO$_2$)

$$y^\dagger = \frac{1}{8}\left\{4 \times 0.0021 - 113 \times 0.0003 + \left[(4 \times 0.0021 - 113 \times 0.0003)^2 - (4 \times 0.0021)^2 + 16 \times 0.0021 \times 113 \times 0.0003\right]^{1/2}\right\}$$

or $y^\dagger = 1.995 \times 10^{-3}$ mol kg$^{-1}$.
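The two uses of the weathering relation described above (computing $K^*$ from known runoff values, and recovering $x_r$ by Newton refinement when $p_{CO_2}$ and $K^*$ are known) can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's program: the function names are hypothetical, the derivative is taken by a centred finite difference, and the starting guess $x_r \approx (K^*K'p^2/4)^{1/3}$ is an assumption valid when $K'p$ is much larger than the runoff alkalinity.

```python
import math

K_PRIME = 113.0  # K' in mol kg^-1 atm^-1 (value quoted in the text)

def river_carbonate(x_r, p):
    """[CO3 2-] in runoff: A_r - y_r with A_r = 2 x_r and y_r from eq. (7.4.19)."""
    A_r, kp = 2.0 * x_r, K_PRIME * p
    y_r = (4.0 * A_r - kp + math.sqrt(kp * (kp + 8.0 * A_r))) / 8.0
    return A_r - y_r

def weathering_constant(x_r, p):
    """K* = [Ca2+]_r [CO3 2-]_r / p for known runoff calcium and CO2 pressure."""
    return x_r * river_carbonate(x_r, p) / p

def river_calcium(p, k_star, tol=1e-12, max_iter=50):
    """Newton refinement for x_r solving x_r [CO3 2-]_r - K* p = 0."""
    def g(x):
        return x * river_carbonate(x, p) - k_star * p
    x = (k_star * K_PRIME * p * p / 4.0) ** (1.0 / 3.0)  # assumed initial estimate
    for _ in range(max_iter):
        h = 1e-6 * x
        slope = (g(x + h) - g(x - h)) / (2.0 * h)  # finite-difference derivative
        step = g(x) / slope
        x -= step
        if abs(step) < tol * x:
            break
    return x

k_star = weathering_constant(3.6e-4, 3.0e-4)
x_back = river_calcium(3.0e-4, k_star)
print(f"K* = {k_star:.3e}, recovered x_r = {x_back:.4e} mol/kg")
```

The round trip (runoff calcium to $K^*$ and back) is a cheap way to confirm that the inversion converges before it is embedded in a time-stepping loop.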
We now need to compute the calcite productivity term $P$ from the conditions at steady-state

$$R\,x_r^\dagger = P\,F(z_s^\dagger)$$

which requires the preservation at steady-state to be estimated. Recasting equation (7.4.9) as a function of the variables $x$, $y$, and $A$ gives

$$x\,(A - y) = 0.927 \times 10^{-6} \times e^{0.16(z_s - 4)}$$

and at steady-state, the calcite saturation depth can be computed from

$$z_s^\dagger = 4 + \frac{13.89 + \ln\left[x^\dagger (A^\dagger - y^\dagger)\right]}{0.16}$$

Replacing the variables by numerical values, it becomes

$$z_s^\dagger = 4 + \frac{13.89 + \ln\left[10.3 \times 10^{-3} \times (2.100 - 1.995) \times 10^{-3}\right]}{0.16} = 4.9 \text{ km}$$

The fraction of the calcite produced which reaches the seafloor at steady-state is

$$F(z_s^\dagger) = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{z_s^\dagger - 5}{1.25\sqrt{2}}\right)\right] = 0.468$$

which gives the calcite production rate as

$$P = \frac{3.6 \times 10^{22} \times 3.6 \times 10^{-4}}{0.468} = 2.77 \times 10^{19} \text{ mol Ma}^{-1}$$

Introducing the time-constants characteristic of seawater renewal by runoff and calcite entrainment as

$$\tau_R = M/R = 1.37 \times 10^{21} / 3.6 \times 10^{22} = 0.0381 \text{ Ma}$$
$$M/P = 1.37 \times 10^{21} / 2.77 \times 10^{19} = 49.5 \text{ Ma kg mol}^{-1}$$

the system of differential equations can finally be written in a non-linear form amenable to calculation

$$\frac{dx}{dt} = \frac{x_r[p(x,y,A)]}{0.0381} - \frac{F[z_s(x,y,A)]}{49.5}$$
$$\frac{dy}{dt} = \frac{3.6 \times 10^{-4}}{0.0381} - \frac{F[z_s(x,y,A)]}{49.5}$$
$$\frac{dA}{dt} = 2\frac{dx}{dt} \qquad (7.4.24)$$

At time $t$, the state of the system is defined by a set of variables $x$, $y$, and $A$. The atmospheric $p_{CO_2}$ is calculated through equation (7.4.18), Ca$^{2+}$ in runoff ($x_r$) through equation (7.4.22). The saturation depth $z_s$ is obtained through equation (7.4.9), then the preservation function $F$ through equation (7.4.8). Integration can be carried out numerically using a Runge-Kutta method.
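The evaluation sequence just described can be sketched in code. This is a minimal illustration only, not the program used for the figures: it uses a classical fourth-order Runge-Kutta step with a fixed step of 100 years (an assumption), hypothetical function names, and it recomputes $K^*$ and $M/P$ from the steady-state values rather than using the rounded 1.69 × 10⁻⁵ and 49.5, so that the steady state is an exact fixed point of the discrete system. The recalibrated constants therefore differ slightly from the rounded values of the text.

```python
import math

K_PRIME = 113.0                               # K', mol kg^-1 atm^-1 (text)
X_SS, Y_SS, A_SS = 10.3e-3, 1.995e-3, 2.1e-3  # steady-state ocean values (text)
XR_SS = 3.6e-4                                # steady-state runoff Ca2+ (= y_r)
TAU_R = 0.0381                                # M/R in Ma (text)

def p_co2(y, A):
    """Atmospheric CO2 pressure from eq. (7.4.18)."""
    return (2.0 * y - A) ** 2 / (K_PRIME * (A - y))

def river_carbonate(x_r, p):
    """[CO3 2-] in runoff, A_r - y_r with A_r = 2 x_r (eq. 7.4.19)."""
    kp = K_PRIME * p
    return (8.0 * x_r + kp - math.sqrt(kp * (kp + 16.0 * x_r))) / 8.0

def preserved_fraction(x, y, A):
    """Saturation depth from eq. (7.4.9), then F from eq. (7.4.8)."""
    z_s = 4.0 + (13.89 + math.log(x * (A - y))) / 0.16
    return 0.5 * (1.0 + math.erf((z_s - 5.0) / (1.25 * math.sqrt(2.0))))

# Assumption: constants recalibrated so that the steady state is exactly stationary.
P_SS = p_co2(Y_SS, A_SS)
K_STAR = XR_SS * river_carbonate(XR_SS, P_SS) / P_SS
TAU_P = TAU_R * preserved_fraction(X_SS, Y_SS, A_SS) / XR_SS  # plays the role of M/P

def river_calcium(p, tol=1e-12, max_iter=50):
    """Newton refinement of eq. (7.4.22) with a finite-difference slope."""
    def g(x):
        return x * river_carbonate(x, p) - K_STAR * p
    x = (K_STAR * K_PRIME * p * p / 4.0) ** (1.0 / 3.0)
    for _ in range(max_iter):
        h = 1e-6 * x
        step = g(x) * 2.0 * h / (g(x + h) - g(x - h))
        x -= step
        if abs(step) < tol * x:
            break
    return x

def rhs(state):
    """Right-hand side of system (7.4.24); time in Ma."""
    x, y, A = state
    sink = preserved_fraction(x, y, A) / TAU_P
    dx = river_calcium(p_co2(y, A)) / TAU_R - sink
    dy = XR_SS / TAU_R - sink
    return (dx, dy, 2.0 * dx)

def rk4_step(state, dt):
    def shifted(k, f):
        return tuple(s + f * dt * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, 0.5))
    k3 = rhs(shifted(k2, 0.5))
    k4 = rhs(shifted(k3, 1.0))
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(state, t_end=0.1, dt=1e-4):
    history = [state]
    for _ in range(round(t_end / dt)):
        state = rk4_step(state, dt)
        history.append(state)
    return history

# 2 percent excess of oxidized carbon injected at t = 0.
history = integrate((X_SS, 1.02 * Y_SS, A_SS))
p_start = p_co2(history[0][1], history[0][2])
p_end = p_co2(history[-1][1], history[-1][2])
print(f"pCO2: {p_start:.3e} atm at t = 0, {p_end:.3e} atm at t = 0.1 Ma")
```

A fixed step is adequate here because the fastest relaxation time of the system is a few thousand years; an adaptive integrator would be the natural choice for longer runs.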
Figure 7.16 Medium-term response of the model described in Figure 7.15 to a sudden increase by 2 percent of the total amount of oxidized carbon in the ocean-atmosphere system. Ca (top) and $\Sigma$(CO$_2$) (bottom) in the ocean, plotted against time $t$ (Ma).

Figure 7.17 Same as Figure 7.16 for $p_{CO_2}$, seawater alkalinity $A$, runoff [Ca$^{2+}$] and the fraction $F$ of precipitated calcite which is preserved on the ocean floor. It takes a little less than 10 000 years for runoff calcium to neutralize the excess dissolved CO$_2$, but calcite precipitation takes much longer to eliminate Ca and carbonate excess.
The constitutive variables at $t = 0$ are $x = 10.3 \times 10^{-3}$ mol kg$^{-1}$, $y = 1.02 \times 1.995 \times 10^{-3}$ mol kg$^{-1}$, and $A = 2.1 \times 10^{-3}$ eq kg$^{-1}$. The system of differential equations has been solved using commercial Runge-Kutta software (MatLab) up to $t = 100\,000$ years. The major features of this calculation are quite interesting (Figures 7.16 and 7.17). An input of CO$_2$ into the ocean-atmosphere system results in a shallower saturation level (lower $F$): more CO$_2$ dissolves, the pH decreases and HCO$_3^-$ dominates over CO$_3^{2-}$. For constant alkalinity, the system accommodates more carbon as bicarbonates. Since poor preservation does not permit evacuation of the excess HCO$_3^-$ and CO$_3^{2-}$, the system must 'wait' until the burst of atmospheric CO$_2$ enhances erosion and drives enough Ca to the sea. Seawater alkalinity is raised in this process, the excess carbonic acid is neutralized and calcite sedimentation can resume. It takes roughly 10 000 years for this neutralization to take place and the lysocline to be restored. It takes another 100 000 years for the system to eliminate the excess calcium and carbonate ions by carbonate sedimentation. Further practical applications of modeling methods will be found in Walker (1991).
8 Transport, advection, and diffusion
8.1 Fluxes

8.1.1 Basic definitions

We already used the concept of conservative process in Chapter 1. In the present context, conservative properties are scalar, vector, or even tensor variables that can be added or subtracted. They can only be modified by exchange with the surroundings across interfaces and by being locally stored in sinks or locally released from sources. Mass and energy are conservative scalar properties, concentration and temperature are not. Momentum is a conservative vector property, velocity in general is not. A flux represents an 'amount' of a conservative property transported in a given time across a boundary, an outlet, or an imaginary surface. It may be a number of cars driving on a highway on Sunday, the mass of water running through a fault zone in a year, the heat flow across the Earth's surface or the vector describing the rate of transport of electromagnetic energy through space. Strictly speaking, we should call flux density whatever flux refers to a unit time and a unit surface and reserve the use of flux for less specific situations. As common usage has unfortunately decided differently, we will have to be careful about the possible ambiguities associated with flux denomination.

Let us consider a medium moving with velocity $\mathbf{v}$ (components $v_x$, $v_y$, $v_z$). A medium with non-zero velocity is said to be advective. Let us first define in the most general way the flux of volume at a point M of the familiar 3D space: this is simply the quantity of volume moving across the unit surface perpendicular to $\mathbf{v}$ per unit time. For an arbitrary surface $\delta S$ next to M and perpendicular to $\mathbf{v}$ (Figure 8.1) and during time $dt$, the volume will be $dV = \delta S\,|\mathbf{v}|\,dt$ and the flux per unit time and unit surface will be the vector $\mathbf{J}_v$ such that

$$\mathbf{J}_v = \frac{\delta S\,\mathbf{v}\,dt}{\delta S\,dt} = \mathbf{v}$$
In a moving medium, velocity therefore represents the flux of volume. The flux of material $\mathbf{J}$ associated with the movement is the amount of matter carried by the flux of volume. This flux is variable in space and time. If $\rho$ is the local density (in kg m$^{-3}$) of the medium, this flux is simply

$$\mathbf{J} = \mathbf{J}_v\,\rho = \rho\mathbf{v}$$

Figure 8.1 For a surface perpendicular to the displacement, the flux of volume through the element of surface $\delta S$ during the infinitesimal time interval $dt$ is the vector $\mathbf{v}\,\delta S\,dt$. $\mathbf{v}$ therefore is the flux of volume crossing the unit surface per unit time.

If $C^i$ (in kg/kg) is the local concentration of the species $i$ and if there is no movement of the different species relative to each other, the flux of the species $i$ will be the vector $\mathbf{J}^i$ such that

$$\mathbf{J}^i = \rho\mathbf{v}C^i \qquad (8.1.1)$$

If species $i$ moves with a velocity $\mathbf{v}^i$ different from that of other species, this expression must be modified as

$$\mathbf{J}^i = \rho^i\mathbf{v}^i = \rho\mathbf{v}^iC^i \qquad (8.1.2)$$

where $\rho^i = \rho C^i$ is the mass density of component $i$, again in kg m$^{-3}$. The total flux of matter $\mathbf{J}$ is the sum of the fluxes $\mathbf{J}^i$ of its individual components $i$ and therefore

$$\mathbf{J} = \sum_i \mathbf{J}^i \qquad (8.1.3)$$

The velocity $\mathbf{v}$ of the moving medium is the mean velocity for all species weighted by their concentration

$$\mathbf{v} = \sum_i C^i\mathbf{v}^i \qquad (8.1.4)$$

Were other units used for concentrations, namely mole numbers instead of weight, volume instead of mass, other reference frames should be chosen for the mean velocity (e.g., Brady, 1975).

Figure 8.2 For a surface which is not perpendicular to the displacement, only the component of the velocity perpendicular to the surface, i.e., $v_n$, measures the flux.

Now let us define the fluxes across an arbitrary surface (Figure 8.2): they simply are the scalar amount of volume, mass or species $i$ which crosses an arbitrary surface, not necessarily perpendicular to $\mathbf{v}$. This flux, which is here represented by the lower-case letter $j$, is the projection of the vector flux onto the normal to the surface. Since the dot product $\mathbf{v}\cdot\mathbf{n}$ (or $\mathbf{v}^T\mathbf{n}$) is the projection of $\mathbf{v}$ onto the normal to the surface, the flux of volume $j_v$ per unit surface is

$$j_v = \mathbf{v}\cdot\mathbf{n}$$

whereas the flux across the arbitrary surface will be

$$j_v\,dS = \mathbf{v}\cdot\mathbf{n}\,dS = \mathbf{v}\cdot d\mathbf{S}$$

where use is made of the oriented surface $d\mathbf{S}$, a vector defined as the product of the scalar surface $dS$ with the normal $\mathbf{n}$

$$d\mathbf{S} = \mathbf{n}\,dS$$

For a closed surface, the convention is to use a normal oriented outwards. Flux $j$ of matter and flux $j^i$ of species $i$ through $dS$ will be

$$j\,dS = \rho\mathbf{v}\cdot d\mathbf{S}$$

and

$$j^i\,dS = \rho C^i\mathbf{v}\cdot d\mathbf{S}$$

respectively.
8.2 The divergence theorem and the conservation equations

Let us consider the material balance in an infinitesimal cubic volume $dV$ around a point M with edges $dx$, $dy$, $dz$, so

$$dV = dx\,dy\,dz$$

The balance of material in the $x$ direction is the difference between the fluxes through the surfaces perpendicular to the $x$ axis (Figure 8.3). This corresponds to the difference

$$\mathbf{J}\left(x + \tfrac{dx}{2}, y, z\right)\cdot(\mathbf{i}\,dy\,dz) - \mathbf{J}\left(x - \tfrac{dx}{2}, y, z\right)\cdot(\mathbf{i}\,dy\,dz)$$

where notation emphasizes the dependence of the flux on the position variables and unit vectors, or its first-order expansion

$$\left[\mathbf{J}(x,y,z) + \frac{\partial\mathbf{J}(x,y,z)}{\partial x}\frac{dx}{2}\right]\cdot\mathbf{i}\,dy\,dz - \left[\mathbf{J}(x,y,z) - \frac{\partial\mathbf{J}(x,y,z)}{\partial x}\frac{dx}{2}\right]\cdot\mathbf{i}\,dy\,dz$$

This expression is equal to

$$\frac{\partial\left[\mathbf{J}(x,y,z)\cdot\mathbf{i}\right]}{\partial x}\,dx\,dy\,dz$$

The dot product $\mathbf{J}(x,y,z)\cdot\mathbf{i}$ is simply the component $J_x$ of $\mathbf{J}(x,y,z)$ along the $x$ axis. The total change over the volume $dV$ is the sum of the mass transport in the three directions, i.e.,

$$\left(\frac{\partial J_x}{\partial x} + \frac{\partial J_y}{\partial y} + \frac{\partial J_z}{\partial z}\right)dx\,dy\,dz = \mathrm{div}\,\mathbf{J}(x,y,z)\,dV$$

The divergence of the flux vector is therefore the net rate at which the transported quantity leaves the volume element $dV$. This can be integrated over an arbitrary volume $\Omega$ limited by the surface $\Sigma$ to give the divergence theorem of Gauss

$$\iiint_\Omega \mathrm{div}\,\mathbf{J}(x,y,z)\,dV = \iint_\Sigma \mathbf{J}(x,y,z)\cdot d\mathbf{S}$$

Figure 8.3 Material balance in a moving medium through the faces of an infinitesimal cube centered in M.

8.2.1 The continuity equation

In the real world, matter must be conserved. Let us relate the rate of variation of the mass contained in an arbitrary volume $\Omega$ to the flux across the surface $\Sigma$

$$\frac{d}{dt}\iiint_\Omega \rho\,dV = -\left\{\text{sum of all the fluxes across the boundaries}\right\} = -\iint_\Sigma \mathbf{J}(x,y,z)\cdot d\mathbf{S}$$

On the left-hand side, the order of integration relative to space coordinates and derivation relative to time can be interchanged, provided the time derivative is specified to be local at the point M($x$,$y$,$z$): a partial derivative sign is therefore necessary. On the right-hand side, the minus sign is necessary to make a flux vector pointing outwards decrease the mass content of the volume. The divergence theorem can be applied, which results in

$$\iiint_\Omega \frac{\partial\rho}{\partial t}\,dV = -\iiint_\Omega \mathrm{div}\,\mathbf{J}\,dV$$

Once mass flux is expressed as a function of density and velocity, it becomes

$$\iiint_\Omega \frac{\partial\rho}{\partial t}\,dV = -\iiint_\Omega \mathrm{div}\,\rho\mathbf{v}\,dV \qquad (8.2.1)$$

This equation holds true for any arbitrary volume, and so must be true for the volume element $dV$. We can therefore drop the integration sign and term $dV$ to obtain

$$\frac{\partial\rho}{\partial t} = -\mathrm{div}\,\rho\mathbf{v} \qquad (8.2.2)$$

which is the continuity equation. Velocities in an incompressible (isochoric) medium, i.e., a medium with constant $\rho$, will therefore verify

$$\mathrm{div}\,\mathbf{v} = 0 \qquad (8.2.3)$$
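8.2.2 The general transport equation

The elemental conservation equation is laid down on the same principles as the continuity equation. The rate of variation of the amount of element $i$ contained in an arbitrary volume $\Omega$ due to the flux $\mathbf{J}^i(x,y,z)$ across the surface $\Sigma$, and to chemical reactions is

$$\frac{d}{dt}\iiint_\Omega \rho C^i\,dV = -\iint_\Sigma \mathbf{J}^i(x,y,z)\cdot d\mathbf{S} + \iiint_\Omega \sum_k A_k^i\,dV$$

where $A_k^i$ is the production (> 0) or consumption (< 0) rate of the species $i$ in the $k$th chemical reaction (in mol or g per unit time and volume). If $A_k^i = 0$ for all $k$, the species is conservative. Note the minus sign on the first right-hand side term. As for the continuity equation, we change the order of derivation and integration on the left-hand side and apply the divergence theorem on the right-hand side

$$\iiint_\Omega \frac{\partial(\rho C^i)}{\partial t}\,dV = -\iiint_\Omega \mathrm{div}\,\mathbf{J}^i\,dV + \iiint_\Omega \sum_k A_k^i\,dV \qquad (8.2.4)$$

This equation holds for any arbitrary volume, and so must be true for the volume element $dV$. We can therefore drop the integration signs and the common factor $dV$ in order to obtain

$$\frac{\partial(\rho C^i)}{\partial t} = -\mathrm{div}\,\mathbf{J}^i + \sum_k A_k^i$$

or introducing the expression of $\mathbf{J}^i(x,y,z)$ as a function of the velocity $\mathbf{v}^i$

$$\frac{\partial(\rho C^i)}{\partial t} = -\mathrm{div}\,\rho\mathbf{v}^iC^i + \sum_k A_k^i$$

Let us split the flux term $\rho\mathbf{v}^iC^i$ into the purely advective term $\rho\mathbf{v}C^i$ and a diffusive term, hereafter also referred to as $\mathcal{J}^i$, describing the movement of the species $i$ relative to the center of mass, $\rho(\mathbf{v}^i - \mathbf{v})C^i$

$$\frac{\partial(\rho C^i)}{\partial t} = -\mathrm{div}\,\rho(\mathbf{v}^i - \mathbf{v})C^i - \mathrm{div}\,\rho\mathbf{v}C^i + \sum_k A_k^i$$

Expanding the derivatives of the left-hand side and the second divergence term on the right-hand side gives

$$C^i\frac{\partial\rho}{\partial t} + \rho\frac{\partial C^i}{\partial t} = -\mathrm{div}\,\mathcal{J}^i - C^i\,\mathrm{div}\,\rho\mathbf{v} - \rho\mathbf{v}\cdot\mathrm{grad}\,C^i + \sum_k A_k^i$$

which, using the continuity equation (8.2.2), can be simplified into

$$\rho\frac{\partial C^i}{\partial t} = -\mathrm{div}\,\mathcal{J}^i - \rho\mathbf{v}\cdot\mathrm{grad}\,C^i + \sum_k A_k^i \qquad (8.2.5)$$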
This is the fundamental transport equation for the species $i$, which does not depend on any assumption other than mass conservation. This equation is valid for a fixed reference frame, such as, using a comparison borrowed from Bird et al. (1960), counting fish in a river from a bridge. Occasionally, the fisherman wants to track the concentration in a given parcel of matter, say that he is now counting fish from a boat carried by the stream. Let us write the total differential of $C^i$

$$dC^i = \left(\frac{\partial C^i}{\partial t}\right)_{xyz}dt + \left(\frac{\partial C^i}{\partial x}\right)_{yzt}dx + \left(\frac{\partial C^i}{\partial y}\right)_{xzt}dy + \left(\frac{\partial C^i}{\partial z}\right)_{xyt}dz$$

Divide everything by $dt$

$$\frac{dC^i}{dt} = \left(\frac{\partial C^i}{\partial t}\right)_{xyz} + \left(\frac{\partial C^i}{\partial x}\right)_{yzt}\frac{dx}{dt} + \left(\frac{\partial C^i}{\partial y}\right)_{xzt}\frac{dy}{dt} + \left(\frac{\partial C^i}{\partial z}\right)_{xyt}\frac{dz}{dt}$$

If concentrations have to be known at a point moving with an arbitrary velocity, the increment ratios $dx/dt$, $dy/dt$, and $dz/dt$ can be constrained accordingly. Commonly, the velocity will be that of the medium itself (the boat is now drifting freely) and

$$\frac{dx}{dt} = v_x \qquad \frac{dy}{dt} = v_y \qquad \frac{dz}{dt} = v_z$$

where $v_x$, $v_y$, and $v_z$ are the components of the rate of displacement $\mathbf{v}$, which amounts to 'pinning' the ratio $dC^i/dt$ to the moving medium. This ratio is usually called the substantial derivative and noted with upper case D

$$\frac{DC^i}{Dt} = \frac{\partial C^i}{\partial t} + \mathbf{v}\cdot\mathrm{grad}\,C^i$$

The rate of change of $C^i$ along the movement therefore is

$$\rho\frac{DC^i}{Dt} = -\mathrm{div}\,\mathcal{J}^i + \sum_k A_k^i \qquad (8.2.6)$$

in which $\mathcal{J}^i$ denotes the diffusive flux term of equation (8.2.5).
8.3 Advection and percolation

In the case of pure advection (no molecular transport), the diffusion term in the general transport equation (8.2.5) is made equal to zero and time-dependent mass balance is expressed as

$$\frac{\partial C^i}{\partial t} = -\mathbf{v}\cdot\mathrm{grad}\,C^i + \frac{1}{\rho}\sum_k A_k^i \qquad (8.3.1)$$

8.3.1 Effect of bioturbation on concentration profiles in sediments

The mechanical activity of burrowing and digging animals, such as worms, mixes the surface layer of deep-sea sediments and therefore smears the stratigraphic record of the chemical changes at the sediment surface. This problem has been investigated quite thoroughly by a number of authors interested in recovering the original history of chemical fluxes (Berger and Heath, 1968; Ruddiman and Glover, 1972; Guinasso and Schink, 1975). The simplest approach considers a perfectly mixed bioturbated layer of thickness $L$ and homogeneous concentration $C^i$. If $v$ is the sedimentation rate, the mass balance condition for element $i$ reads

$$\rho L\frac{dC^i}{dt} = J_0^i(t) - \rho vC^i$$

where $J_0^i(t)$ is the flux (mol m$^{-2}$ s$^{-1}$) of $i$ at the sediment-water interface. Let us assign arbitrarily a time equal to zero to a layer presently at depth $z_0$ below the bottom of the bioturbated layer. The layer now at depth $z$ was formed at time $t$ such that

$$z = z_0 - \int_0^t v(\tau)\,d\tau$$

For a bioturbated layer of constant thickness $L$, the conservation equation now becomes

$$\rho vL\frac{dC^i}{dz} = \rho vC^i - J_0^i(z) \qquad (8.3.2)$$

Integration of equation (8.3.2) shows that, at constant sedimentation rate, a spike of element flux at $t = 0$ with no subsequent input will be smeared out upwards with a characteristic length $L$ and decrease exponentially (Figure 8.4) according to the law

$$C^i(t) = C^i(0)\,e^{-vt/L} = C^i(0)\,e^{-(z_0 - z)/L} \qquad (8.3.3)$$

where $C^i(0)$ is the concentration at $t = 0$ at the bottom of the bioturbated layer. Alternatively, a reduced flux of element $i$ per unit length can be computed from

$$\frac{J_0^i}{\rho v} = C^i - L\frac{dC^i}{dz} \qquad (8.3.4)$$

Michel et al. (1990) have measured the iridium content in Ocean Drilling Program hole 690 sediments from the Weddell Sea across the Cretaceous-Tertiary (K-T) boundary (Table 8.1). Depth $z$ is relative to an arbitrary level in the core. Compare the reduced iridium flux at each depth down the core assuming the bioturbated layer is either 4 or 8 cm thick. Making the approximation at the $k$th depth level that
Table 8.1. Iridium content (ppt or $10^{-12}$ g/g) in sediments from the section 15X of the ODP 690 hole (Michel et al., 1990). Depth z (cm) is measured relative to an arbitrary reference layer.

z     CIr      z     CIr      z     CIr
22    128      36    467      50    298
23    221      37    747      51    389
24    163      38    1101     52    248
25    190      39    1487     53    361
26    197      40    1566     54    218
27    126      41    983      55    223
28    164      42    1237     56    297
29    150      43    602      57    292
30    134      44    378      60    174
31    223      45    578      63    252
32    264      46    511      66    126
33    333      47    619      71    111
34    328      48    362      77    84

Figure 8.4 Smearing of a burst in the input of an element i at the surface of the sediment by bioturbation over a layer of constant thickness L (from Ruddiman and Glover, 1972). With time, combined sedimentation at rate v and bioturbation smears the concentration peak up the sedimentary column.

Figure 8.5 Reduced flux of iridium at the K-T boundary in the Weddell Sea assuming a bioturbated layer of constant thickness L = 4 cm and L = 8 cm. L = 0 cm corresponds to the raw data of Michel et al. (1990).
we get the reduced fluxes plotted in Figure 8.5. For L = 0, the original data are obtained. The negative values obtained for L = 4 and L = 8 cm appear to result from both a sharp peak and the assumption of constant L (this effect is akin to the Gibbs effect mentioned in Chapter 2). The values becoming more negative above the Ir peak when L increases suggest that the thickness of the bioturbated layer actually decreased dramatically as a result of the catastrophic K-T event, asteroid impact or volcanic eruption, that killed most of the burrowing organisms. The nearly periodic pattern of the Ir flux (10 cm ≈ 10 000 years) is not explained.
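The reduced-flux calculation can be sketched with the data of Table 8.1. The code below is an illustration under explicit assumptions: it takes the reduced flux as $C - L\,dC/dz$, the form implied by the mass balance of the mixed layer, and it approximates the derivative by a centred difference (one-sided at both ends of the core), which is not necessarily the discretization used for Figure 8.5. The function name `reduced_flux` is hypothetical.

```python
# Depth z (cm) and Ir content (ppt) from Table 8.1 (Michel et al., 1990).
Z = [22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
     36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48,
     50, 51, 52, 53, 54, 55, 56, 57, 60, 63, 66, 71, 77]
C_IR = [128, 221, 163, 190, 197, 126, 164, 150, 134, 223, 264, 333, 328,
        467, 747, 1101, 1487, 1566, 983, 1237, 602, 378, 578, 511, 619, 362,
        298, 389, 248, 361, 218, 223, 297, 292, 174, 252, 126, 111, 84]

def reduced_flux(z, c, thickness):
    """Reduced flux C - L dC/dz at each depth level.

    Assumption: centred finite difference for dC/dz in the interior and a
    one-sided difference at the first and last levels.
    """
    n = len(z)
    flux = []
    for k in range(n):
        lo, hi = max(k - 1, 0), min(k + 1, n - 1)
        slope = (c[hi] - c[lo]) / (z[hi] - z[lo])
        flux.append(c[k] - thickness * slope)
    return flux

flux0 = reduced_flux(Z, C_IR, 0.0)   # L = 0 returns the raw data
flux4 = reduced_flux(Z, C_IR, 4.0)   # L = 4 cm
flux8 = reduced_flux(Z, C_IR, 8.0)   # L = 8 cm
for depth, f0, f4, f8 in zip(Z, flux0, flux4, flux8):
    print(f"{depth:3d} {f0:8.1f} {f4:8.1f} {f8:8.1f}")
```

With this discretization the levels just above the Ir peak, where the concentration rises steeply with depth, give negative reduced fluxes, and the effect grows with $L$, in line with the discussion above.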
8.3.2 Exposure ages and the assessment of erosion rates

Cosmogenic isotopes, such as $^{10}$Be or $^{26}$Al, are created by the interaction of cosmic rays with the Earth's surface (Lal, 1988; Brown et al., 1991). Their measurement in surficial rocks has been suggested to provide a quantitative estimate of erosion rates. Relative to a depth $z$-axis with the origin kept at the surface, the assumptions of the model are as follows: (a) since the material moves upwards with respect to the origin, i.e., toward negative values, the material velocity is a negative scalar that will be considered constant and labeled $-v$; (b) due to the rays being progressively absorbed by the surface material, the production rate decays exponentially with depth from a value of $P_0$ at the surface with a characteristic attenuation distance $1/a$; (c) the cosmogenic isotope decays with a decay constant $\lambda$. Therefore, the rate of change of the concentration $C$ of cosmogenic isotope is the sum of the advective flux plus the production rate minus the decay rate

$$\frac{\partial C}{\partial t} = v\frac{\partial C}{\partial z} + P_0\,e^{-az} - \lambda C \qquad (8.3.5)$$

Integration of this partial differential equation will be easier by pinning the frame to the rock (remembering that the algebraic speed is $-v$)

$$\frac{DC}{Dt} = P_0\,e^{-a(\zeta - vt)} - \lambda C \qquad (8.3.6)$$

where $\zeta$ is now the distance to the interface at $t = 0$, i.e.,

$$\zeta = z + vt \qquad (8.3.7)$$

An equation without exponential would be easier to handle, so we change the variables as

$$u = C\,e^{a(\zeta - vt)}$$

with the substantial time derivative equal to

$$\frac{Du}{Dt} = \left(\frac{DC}{Dt} - avC\right)e^{a(\zeta - vt)}$$

The conservation equation simplifies into

$$\frac{Du}{Dt} = P_0 - (\lambda + av)\,u$$

which can be rearranged as

$$\frac{D}{Dt}\left[u - \frac{P_0}{\lambda + av}\right] = -(\lambda + av)\left[u - \frac{P_0}{\lambda + av}\right]$$

and integrated into

$$u - \frac{P_0}{\lambda + av} = \mathrm{const} \times e^{-(\lambda + av)t} \qquad (8.3.8)$$

Reverting to concentration

$$C\,e^{az} = \frac{P_0}{\lambda + av} + \mathrm{const}(\zeta) \times e^{-(\lambda + av)t}$$

we apply a condition of constant concentration $C_0$ at all depths at $t = 0$

$$C_0\,e^{a\zeta} = \frac{P_0}{\lambda + av} + \mathrm{const}(\zeta) \qquad (8.3.9)$$

where we have made a careful distinction between $z$ (depth at $t$) and $\zeta$ (depth at $t = 0$). Inserting the value of the constant into equation (8.3.8) gives

$$C\,e^{az} = \frac{P_0}{\lambda + av} + \left[C_0\,e^{a\zeta} - \frac{P_0}{\lambda + av}\right]e^{-(\lambda + av)t}$$

with the final result

$$C(z,t) = C_0\,e^{-\lambda t} + \frac{P_0}{\lambda + av}\,e^{-az}\left[1 - e^{-(\lambda + av)t}\right] \qquad (8.3.10)$$

At steady-state, surface concentration is

$$C(0,\infty) = \frac{P_0}{\lambda + av}$$

8.3.3 Dispersal of a conservative tracer in a velocity field

This is a problem of considerable geodynamic and petrological impact: on the mantle scale, molecular processes are negligible (Hofmann and Hart, 1978) and elements can
be considered as conservatively (= passively) dispersed by mantle convection (Richter and Ribe, 1979; Hoffman and McKenzie, 1985). Injection of oceanic crust and sediments at subduction zones creates geochemically anomalous zones in the uppermost mantle. These anomalies are entrained by mantle movements. Likewise, contamination of a magmatic body by roof pendants rapidly spreads under the effects of magma convection. Convective velocity fields are fairly complex as they depend on a number of major assumptions on the temperature dependency of rheology. We can illustrate the salient features of convective dispersal by choosing a simple velocity distribution in a rectangular convection cell ($0 < x < a$, $0 < y < b$) such as that associated with the onset of Benard instability in the conditions of Boussinesq approximations (e.g., Turcotte and Schubert, 1982). Let us make the calculation for the so-called 'free-slip' conditions, which permit free movement along the boundaries, both vertical and horizontal, such as a convection cell which would be limited by no rigid boundary. From Turcotte and Schubert (1982), we take the velocity field to be

$$\frac{dx}{dt} = -\psi_0\frac{\pi}{b}\sin\frac{\pi x}{a}\cos\frac{\pi y}{b}$$
$$\frac{dy}{dt} = \psi_0\frac{\pi}{a}\cos\frac{\pi x}{a}\sin\frac{\pi y}{b} \qquad (8.3.11)$$

where $\psi_0$ is a constant. No material crosses the limits, for $v_x = 0$ at $x = 0$ and $x = a$, and $v_y = 0$ at $y = 0$ and $y = b$, and velocities are maximum along the limits. Tracer $i$ is conservative, i.e., diffusion is negligible and no reaction changes its local concentration. Its concentration therefore satisfies the advection equation

$$\frac{DC^i}{Dt} = 0$$
In practice, the tracer follows the moving material with the same velocity, like dye transported by a solution. The position of a point 'dyed' by the tracer can be followed by solving the system of differential equations above. As shown in Section 3.2, a numerical Runge-Kutta scheme is appropriate. In Figure 8.6, we assume $a = b = \psi_0 = 1$ and points falling on a circle at $t = 0$ are tracked by increments of 0.2 until $t = 0.8$. The warping and stretching of the circle shows how mechanical mixing proceeds: the area enclosed by the curve is preserved, since material is conserved, but its perimeter is stretched considerably. The rate of stretching, i.e., the rate of length change per unit time, often simply scaled by $1/|\mathrm{grad}\,\mathbf{v}|$, measures the efficiency of the mixing process (Figure 8.7). Mechanical boundary layers, such as continental lithosphere or the D" layer in the mantle or bottom water in the ocean, are regions where heterogeneities are efficiently created and preserved. The physics of mixing has recently opened as a new field with much promise for engineering (Ottino, 1989) as well as Earth sciences. The concept of a marble-cake mantle (Allegre and Turcotte, 1986), in which layers of subducted oceanic crust are stretched by mantle convection together with residual peridotite, is currently under intense scrutiny by geochemists.
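The tracking of dyed points can be sketched as follows. This is an illustration under stated assumptions rather than the calculation behind Figure 8.6: the velocity field is taken as the incompressible free-slip field derived from the stream function $\psi = \psi_0 \sin(\pi x/a)\sin(\pi y/b)$ (the overall sign, i.e., the sense of rotation, is an assumption), the initial circle is centred at (0.75, 0.75) with radius 0.15 (hypothetical values), and a fixed-step fourth-order Runge-Kutta scheme is used.

```python
import math

A_CELL = B_CELL = PSI0 = 1.0   # a = b = psi_0 = 1 as in the text

def stream(x, y):
    """Stream function psi; it is constant along each particle path."""
    return PSI0 * math.sin(math.pi * x / A_CELL) * math.sin(math.pi * y / B_CELL)

def velocity(x, y):
    """Free-slip velocity field (vx = -dpsi/dy, vy = dpsi/dx); sign is an assumption."""
    sx, cx = math.sin(math.pi * x / A_CELL), math.cos(math.pi * x / A_CELL)
    sy, cy = math.sin(math.pi * y / B_CELL), math.cos(math.pi * y / B_CELL)
    return (-PSI0 * math.pi / B_CELL * sx * cy, PSI0 * math.pi / A_CELL * cx * sy)

def rk4_step(x, y, dt):
    k1 = velocity(x, y)
    k2 = velocity(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
    k3 = velocity(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
    k4 = velocity(x + dt * k3[0], y + dt * k3[1])
    return (x + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
            y + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)

def advect(points, t_end, dt=0.002):
    """Move every point of the list along the flow until t_end."""
    for _ in range(round(t_end / dt)):
        points = [rk4_step(x, y, dt) for x, y in points]
    return points

def perimeter(points):
    """Length of the closed polygon joining successive points."""
    return sum(math.dist(points[k], points[(k + 1) % len(points)])
               for k in range(len(points)))

# Hypothetical initial circle in the upper right part of the cell.
N = 200
circle = [(0.75 + 0.15 * math.cos(2 * math.pi * k / N),
           0.75 + 0.15 * math.sin(2 * math.pi * k / N)) for k in range(N)]
warped = advect(circle, 0.8)
print(f"perimeter: {perimeter(circle):.3f} at t = 0, {perimeter(warped):.3f} at t = 0.8")
```

Because the stream function is conserved along each trajectory, comparing $\psi$ before and after integration gives a simple accuracy check on the time step, independent of any reference solution.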
Figure 8.6 Convective dispersal of a passive tracer located at t = 0 inside the circle in the upper right corner by the velocity field described by equations (8.3.11).
Figure 8.7 Scaling of the stretching rate by the velocity gradient (the displacements $v_x(y)\,dt$ and $v_x(y + dy)\,dt$ of the two points are shown). For simplicity, the medium is assumed to move in a direction parallel to the x-axis. Two points initially connected to each other by the vector $[0, dy]$ become connected, after a time dt, by the vector $[dv_x, dy]$. $dv_x/dy$ is therefore a measure of the stretching rate in the direction x.
8.3.4 Percolation and infiltration metasomatism
Migration of fluids in a porous matrix with solid-liquid fractionation results in a process much like the chromatographic separation of elements (DeVault, 1943; Korzhinskii, 1970; Hofmann, 1972). This mechanism has recently been revived in the context of mantle metasomatism by Navon and Stolper (1987), Bodinier et al. (1990), and Vasseur et al. (1991), in the context of hydrothermal systems by Lichtner (1985) and, for stable isotopes, by Baumgartner and Rumble (1988). Only a simplified account of this model will be given here. Let $\varphi$ be the porosity of the medium, $\rho_{sol}$ and $\rho_{liq}$ the density of the solid matrix and melt, respectively, and $\mathbf{v}_{liq}$ the fluid velocity relative to the matrix. The reference frame is attached to the solid matrix. We first write that the rate of variation of the amount of material contained in the volume $\Omega$ enclosed by the surface $\Sigma$ equals the total flux across the surface
$$\frac{d}{dt}\iiint_\Omega \left[\varphi\rho_{liq} + (1-\varphi)\rho_{sol}\right] dV = -\iint_\Sigma \varphi\rho_{liq}\,\mathbf{v}_{liq}\cdot d\mathbf{S} \tag{8.3.12}$$
Exchanging the order of differentiation and integration in the left-hand side integral and using the divergence theorem on the right-hand side, we get
$$\iiint_\Omega \frac{\partial}{\partial t}\left[\varphi\rho_{liq} + (1-\varphi)\rho_{sol}\right] dV = -\iiint_\Omega \mathrm{div}\left(\varphi\rho_{liq}\,\mathbf{v}_{liq}\right) dV$$
What is true for an arbitrary volume $\Omega$ is true for the volume element dV, hence we can drop the integration signs and dV on both sides of the equation
$$\frac{\partial}{\partial t}\left[\varphi\rho_{liq} + (1-\varphi)\rho_{sol}\right] = -\mathrm{div}\left(\varphi\rho_{liq}\,\mathbf{v}_{liq}\right) \tag{8.3.13}$$
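which expresses continuity of the fluid and its solid matrix. Let us now calculate the mass balance for an element i which is transported by the fluid and is allowed to react with the solid. $C_{sol}^i$ and $C_{liq}^i$ are the concentrations of element i in the matrix and fluid, respectively. The variation per unit time of the total amount of element i contained in the volume $\Omega$ enclosed by the surface $\Sigma$ is equal to the sum of the fluxes which cross the surface
$$\frac{d}{dt}\iiint_\Omega \left[\varphi\rho_{liq}C_{liq}^i + (1-\varphi)\rho_{sol}C_{sol}^i\right] dV = -\iint_\Sigma \varphi\rho_{liq}C_{liq}^i\,\mathbf{v}_{liq}\cdot d\mathbf{S}$$
Dispersive forces due to the tortuous path of the fluid, matrix deformation, and diffusion are neglected. Taking the same steps as for the continuity equation, we get
$$\frac{\partial}{\partial t}\left[\varphi\rho_{liq}C_{liq}^i + (1-\varphi)\rho_{sol}C_{sol}^i\right] = -\mathrm{div}\left(\varphi\rho_{liq}C_{liq}^i\,\mathbf{v}_{liq}\right)$$
Taking the derivatives of the products, rearranging the left-hand side and expanding the divergence on the right-hand side gives
$$\frac{\partial}{\partial t}\left\{\left[\varphi\rho_{liq} + (1-\varphi)\rho_{sol}\right]C_{liq}^i\right\} + \frac{\partial}{\partial t}\left[(1-\varphi)\rho_{sol}\left(C_{sol}^i - C_{liq}^i\right)\right] = -\varphi\rho_{liq}\,\mathbf{v}_{liq}\cdot\mathrm{grad}\,C_{liq}^i - C_{liq}^i\,\mathrm{div}\left(\varphi\rho_{liq}\,\mathbf{v}_{liq}\right)$$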
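Expanding the left-hand side and rearranging, we obtain
$$C_{liq}^i\,\frac{\partial}{\partial t}\left[\varphi\rho_{liq} + (1-\varphi)\rho_{sol}\right] + \left[\varphi\rho_{liq} + (1-\varphi)\rho_{sol}\right]\frac{\partial C_{liq}^i}{\partial t} + (1-\varphi)\rho_{sol}\,\frac{\partial\left(C_{sol}^i - C_{liq}^i\right)}{\partial t} = \left(C_{liq}^i - C_{sol}^i\right)\frac{\partial\left[(1-\varphi)\rho_{sol}\right]}{\partial t} - \varphi\rho_{liq}\,\mathbf{v}_{liq}\cdot\mathrm{grad}\,C_{liq}^i - C_{liq}^i\,\mathrm{div}\left(\varphi\rho_{liq}\,\mathbf{v}_{liq}\right)$$
From the fluid continuity equation (8.3.13), the first term on the left-hand side cancels out with the last term on the right-hand side, giving the general conservation equation
$$\varphi\rho_{liq}\,\frac{\partial C_{liq}^i}{\partial t} + (1-\varphi)\rho_{sol}\,\frac{\partial C_{sol}^i}{\partial t} = \left(C_{liq}^i - C_{sol}^i\right)\frac{\partial\left[(1-\varphi)\rho_{sol}\right]}{\partial t} - \varphi\rho_{liq}\,\mathbf{v}_{liq}\cdot\mathrm{grad}\,C_{liq}^i \tag{8.3.14}$$
which will prove useful in modeling trace-element partitioning during magma genesis. Further simplification is achieved for constant $\rho_{sol}$ and $\varphi$
$$\varphi\rho_{liq}\,\frac{\partial C_{liq}^i}{\partial t} + (1-\varphi)\rho_{sol}\,\frac{\partial C_{sol}^i}{\partial t} = -\varphi\rho_{liq}\,\mathbf{v}_{liq}\cdot\mathrm{grad}\,C_{liq}^i \tag{8.3.15}$$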
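We assume that $C_{sol}^i$ and $C_{liq}^i$ are related through a thermodynamic relationship, as may be the case for trace elements. In one dimension (x), equation (8.3.15) can be recast as a function of $C_{sol}^i$ only
$$\left[\varphi\rho_{liq}\,\frac{dC_{liq}^i}{dC_{sol}^i} + (1-\varphi)\rho_{sol}\right]\frac{\partial C_{sol}^i}{\partial t} + \varphi\rho_{liq}v_{liq}\,\frac{dC_{liq}^i}{dC_{sol}^i}\,\frac{\partial C_{sol}^i}{\partial x} = 0$$
which we can rewrite as
$$\frac{\partial C_{sol}^i}{\partial t} + v^i\,\frac{\partial C_{sol}^i}{\partial x} = 0$$
where $v^i$ is defined as
$$v^i = \frac{\varphi\rho_{liq}v_{liq}}{\varphi\rho_{liq} + (1-\varphi)\rho_{sol}\,dC_{sol}^i/dC_{liq}^i} \tag{8.3.17}$$
and represents the velocity of the isopleths of $C_{sol}^i$ and $C_{liq}^i$ since, for a parcel of solid moving with the velocity $v^i$,
$$\frac{DC_{sol}^i}{Dt} = \frac{\partial C_{sol}^i}{\partial t} + v^i\,\frac{\partial C_{sol}^i}{\partial x} = 0 \tag{8.3.18}$$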
The evolution of a concentration profile depends on how the velocity changes with concentration. It is left to the reader to show, by taking the derivative of equation (8.3.17) at constant $\varphi$ and $v_{liq}$, that the sign of $dv^i/dC_{liq}^i$ is opposite to the sign of $d^2C_{sol}^i/(dC_{liq}^i)^2$. For a trace element i, the last expression is the derivative of its solid-liquid partition coefficient relative to the concentration in the liquid and therefore measures the deviation from regular solutions (Henry's law). Three cases may accordingly be singled out:
(a) For $dv^i/dC_{liq}^i = 0$ or $d^2C_{sol}^i/(dC_{liq}^i)^2 = 0$ (Henry's law), $v^i$ is constant. This case is investigated in Chapter 9: all the isopleths of species i move at the same velocity and the concentration profile in the solid is simply translated without modification.
(b) For $dv^i/dC_{liq}^i < 0$ or $d^2C_{sol}^i/(dC_{liq}^i)^2 > 0$, isopleths of low concentrations catch up with isopleths of high concentrations: downstream positive gradients build up as sharp metasomatic fronts (Hofmann, 1972), negative gradients flatten.
(c) For $dv^i/dC_{liq}^i > 0$ or $d^2C_{sol}^i/(dC_{liq}^i)^2 < 0$, isopleths of high concentrations catch up with isopleths of low concentrations: downstream positive gradients flatten, negative gradients build up as sharp fronts.
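The dependence of the isopleth velocity on the shape of the solid-liquid relationship can be checked numerically. The sketch below evaluates equation (8.3.17) for two assumed relationships: a Henry's law line $C_{sol} = KC_{liq}$ and a hypothetical saturating (Langmuir-type) curve $C_{sol} = aC_{liq}/(1 + bC_{liq})$, whose second derivative is negative. All parameter values and function names are illustrative assumptions, not data from the text.

```python
def isopleth_velocity(dCsol_dCliq, phi, rho_liq, rho_sol, v_liq):
    """Equation (8.3.17): isopleth velocity for a local slope dC_sol/dC_liq."""
    return phi * rho_liq * v_liq / (phi * rho_liq + (1.0 - phi) * rho_sol * dCsol_dCliq)

# Hypothetical porosity, densities and fluid velocity (arbitrary consistent units)
PHI, RHO_LIQ, RHO_SOL, V_LIQ = 0.01, 2800.0, 3300.0, 1.0

def henry_slope(c_liq, K=0.1):
    """Slope of C_sol = K * C_liq: constant, case (a)."""
    return K

def langmuir_slope(c_liq, a=0.1, b=5.0):
    """Slope of the assumed curve C_sol = a*C_liq/(1 + b*C_liq): decreasing, case (c)."""
    return a / (1.0 + b * c_liq) ** 2

for c in (0.0, 0.1, 0.5, 1.0):
    v_henry = isopleth_velocity(henry_slope(c), PHI, RHO_LIQ, RHO_SOL, V_LIQ)
    v_langmuir = isopleth_velocity(langmuir_slope(c), PHI, RHO_LIQ, RHO_SOL, V_LIQ)
    print(c, v_henry, v_langmuir)
```

For the Henry's law line the velocity is the same at every concentration, whereas for the saturating curve it increases with concentration, which is case (c). A vanishing slope (element not retained by the solid) returns the fluid velocity itself.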
At some point, gradients may become infinite and form a shock wave. This 'breaking' time represented as stage 3 in the cases (b) and (c) of Figure 8.8 depends on the initial distribution and solution properties and is known mathematically as a 'bifurcation' (e.g., Logan, 1987; Strang, 1986). Further evolution results in a
Figure 8.8 Advective propagation of a chemical wave of tracer i moving with a velocity $v^i$ in a wetted porous solid at times t = 1, 2, 3 for different values of $d^2C_{sol}^i/(dC_{liq}^i)^2$: (a) zero (Henry's law), (b) positive, (c) negative. Breaking takes place at t = 3 in cases (b) and (c).
Figure 8.9 A multi-valued relation between x and $C_{sol}^i$ is unstable and produces a concentration discontinuity (front). The hatched areas on both sides of the vertical line must be equal, which defines the position of the front.
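multi-valued function $C_{sol}^i$ over a certain range of x, i.e., for a single position x three distinct values of $C_{sol}^i$ satisfy the conservation equation, which is schematically depicted in Figure 8.9. Such a situation is unstable with regard to fluctuations and a concentration front forms with its position controlled by mass balance constraints (the two hatched areas must be equal). This theory was developed for metasomatic fronts by Korzhinskii (1970) and Hofmann (1972) and the topic is comprehensively covered in the book by Ortoleva (1993). We will now estimate the front velocity using the method of Guy (1984). Let us consider, in one dimension, the conservation of element i in a rock column ($x = x_1$ to $x = x_2$) of unit section which contains a propagating front at the time-dependent position s(t). The discontinuity is handled by breaking the column at x = s(t). The amount of element i in the rock column only changes by fluid exchange through both ends of the column, hence
$$\frac{d}{dt}\int_{x_1}^{s^-}\left[\varphi\rho_{liq}C_{liq}^i + (1-\varphi)\rho_{sol}C_{sol}^i\right]dx + \frac{d}{dt}\int_{s^+}^{x_2}\left[\varphi\rho_{liq}C_{liq}^i + (1-\varphi)\rho_{sol}C_{sol}^i\right]dx = \varphi\rho_{liq}v_{liq}\left[C_{liq}^i(x_1) - C_{liq}^i(x_2)\right]$$
where the - and the + superscripts on s indicate which side of the discontinuity is used as the integration limit. Using Leibniz's rule for differentiating integrals gives
$$\int_{x_1}^{s^-}\frac{\partial}{\partial t}\left[\varphi\rho_{liq}C_{liq}^i + (1-\varphi)\rho_{sol}C_{sol}^i\right]dx + \int_{s^+}^{x_2}\frac{\partial}{\partial t}\left[\varphi\rho_{liq}C_{liq}^i + (1-\varphi)\rho_{sol}C_{sol}^i\right]dx + \left\{\left[\varphi\rho_{liq}C_{liq}^i + (1-\varphi)\rho_{sol}C_{sol}^i\right]_{s^-} - \left[\varphi\rho_{liq}C_{liq}^i + (1-\varphi)\rho_{sol}C_{sol}^i\right]_{s^+}\right\}\frac{ds}{dt} = \varphi\rho_{liq}v_{liq}\left[C_{liq}^i(x_1) - C_{liq}^i(x_2)\right]$$
Let $x_1 \to s^-$ and $x_2 \to s^+$. The two integrals tend to zero while the other terms remain finite because of the concentration jump. The velocity ds/dt of the discontinuity is therefore
$$\frac{ds}{dt} = \frac{\varphi\rho_{liq}v_{liq}\left[C_{liq}^i(s^+) - C_{liq}^i(s^-)\right]}{\varphi\rho_{liq}\left[C_{liq}^i(s^+) - C_{liq}^i(s^-)\right] + (1-\varphi)\rho_{sol}\left[C_{sol}^i(s^+) - C_{sol}^i(s^-)\right]} \tag{8.3.19}$$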
This solution is a generalization of equation (8.3.17), which holds in the absence of a front. Many geological problems dealing with the genesis of ore bodies and oil fields, with the chemistry of aquifers, etc., cannot be handled adequately with a matrix of constant mineralogical composition. Upon water percolation, some mineral reactions take place which affect the chemical and physical properties of the rock. These reactions lead to extremely complicated transport problems which, in the frame of a non-specialized textbook, can only be hinted at. Contrary to chemical elements, molecules or minerals are not necessarily conserved in a closed system because chemical reactions may affect their relative proportions. Every geologist is familiar with weathering of granites increasing the amount of kaolinite at the expense of K-feldspar. For each reaction between N species (ions, molecules, minerals, ...), a simple symbolic form can be written
$$0 \rightleftharpoons \sum_{i=1}^{N} \nu_k^i Z^i$$
where $Z^i$ is the chemical symbol of the ith species and $\nu_k^i$ its stoichiometric coefficient in the kth reaction. When the reaction proceeds, the numbers $n^i$ of molecules i present in the system change by the increment $dn_k^i$ and mass balance requires
$$\frac{dn_k^1}{\nu_k^1} = \frac{dn_k^2}{\nu_k^2} = \cdots = \frac{dn_k^N}{\nu_k^N} = d\xi_k$$
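where $\xi_k$ is the progress variable of the kth reaction. Assuming that species i is involved in R reactions, its conservation equation must therefore contain the reaction term
$$\left(\frac{dn^i}{dt}\right)_{reactions} = \sum_{k=1}^{R} \nu_k^i\,\frac{d\xi_k}{dt} \tag{8.3.20}$$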
where $d\xi_k/dt$ is the rate of the kth reaction. Readers are referred to Bear (1972), Lichtner (1985), and Phillips (1991) for a detailed treatment of this complicated problem with many applications to economic geology.
8.4 Diffusion basics
8.4.1 The diffusion equation
From now on, we assume that the diffusion (= molecular) transport is not negligible, so we need some expression relating the diffusion flux to measurable quantities, e.g., concentrations. Fick's first 'law' assumes a proportionality between the flux $\mathbf{J}^i$ of the species i and a 'force' which is the gradient of the volumic concentration $\rho C^i$
$$\mathbf{J}^i = -\mathcal{D}^i\,\mathrm{grad}\,\rho C^i \tag{8.4.1}$$
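where $\mathcal{D}^i$ is the diffusion coefficient of the species i in the medium under consideration. $\mathcal{D}^i$ is expressed in units of surface per unit time (e.g., m² s⁻¹). If the medium is incompressible, the conservation equation (8.2.5) can be transformed into
$$\frac{\partial C^i}{\partial t} = \mathrm{div}\left(\mathcal{D}^i\,\mathrm{grad}\,C^i\right) - \mathbf{v}\cdot\mathrm{grad}\,C^i + \frac{1}{\rho}\sum\left(\text{reaction source terms of species } i\right) \tag{8.4.2}$$
The first term on the right-hand side can be expanded as
$$\mathrm{div}\left(\mathcal{D}^i\,\mathrm{grad}\,C^i\right) = \mathcal{D}^i\nabla^2 C^i + \mathrm{grad}\,\mathcal{D}^i\cdot\mathrm{grad}\,C^i$$
where the Laplace operator $\nabla^2 = \mathrm{div}(\mathrm{grad}) = \nabla\cdot\nabla$, also noted $\Delta$, and such that
$$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$$
has been introduced. For constant $\mathcal{D}^i$, we obtain the standard diffusion equation
$$\frac{\partial C^i}{\partial t} = \mathcal{D}^i\nabla^2 C^i - \mathbf{v}\cdot\nabla C^i + \frac{1}{\rho}\sum\left(\text{reaction source terms of species } i\right) \tag{8.4.3}$$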
The right-hand side is the sum of three terms describing diffusion, advection, and chemical reaction, respectively. For the one-dimensional equation with x as the space variable, the diffusion equation is a partial differential equation of the first order in time and the second order in x. It therefore requires concentration to be known everywhere at a given time (in general t = 0) and, at any time t > 0, concentration, flux, or a combination of both, to be known at two points (boundary conditions). In the most general case, the diffusion equation is a partial differential equation of the first order in time and the second order in the three space coordinates x, y, z. Concentration or flux conditions valid at any time t > 0 must then be given along the entire boundary. Taking the one-dimensional problem with x as the space variable as an example, boundary conditions can be of three forms:
(i) a known concentration, e.g., $C^i = C_0$ at x = L
(ii) a known flux or, equivalently, a known concentration gradient, e.g., $\partial C^i/\partial x = q$ at x = L
(iii) a mixed relation between flux (gradient) and concentration, e.g., $\partial C^i/\partial x + \alpha C^i + \beta = 0$ at x = L ($\alpha$ and $\beta$ are constants). This last condition is also known as the radiation condition.
8.4.2 The diffusion coefficient
The diffusion coefficient depends on the diffusion process being considered. In self-diffusion, isotopes of the same species are exchanged independently of any other species. This is the case for radioactive tracer diffusion in pure solids. This process is of considerable importance to the understanding of isotopic systems in geochemistry and is amenable to calculations by statistical mechanics. Chemical diffusion involves the interchange of different chemical species, usually in a heterogeneous system, and is a much more complicated phenomenon, poorly described even by sophisticated models. In general, self-diffusion is about an order of magnitude faster than chemical diffusion (Hofmann and Hart, 1978). Diffusion being a thermally activated process, the diffusion coefficient depends on the absolute temperature T according to an Arrhenius law
$$\mathcal{D}^i = \mathcal{D}_0^i \exp\left(-\frac{E_i}{RT}\right) \tag{8.4.4}$$
a dependence usually represented in a plot of $\ln \mathcal{D}^i$ vs 1/T. In this equation, $\mathcal{D}_0^i$ is a pre-exponential constant, R is the gas constant (1.987 cal K⁻¹ mol⁻¹ = 8.3144 J K⁻¹ mol⁻¹) and $E_i$ is the activation energy. Fick's law describes diffusion in ideal or regular solutions, particularly the behavior of trace elements, fairly well. In more complicated systems, it becomes inadequate to use the concentration gradient as the diffusion driving force. The gradient of the chemical potential $\mu_i$ (= partial molar Gibbs energy) of the species i represents the actual energy gradient that drives atoms in one direction or another. As a simple illustration, let us consider a binary solution in which we neglect cross-effects of elements and write that the relative velocity $\mathbf{v}^i$ of one atom of the species i is given by
$$\mathbf{v}^i(x, y, z) = -M^i\,\mathrm{grad}\,\mu_i$$
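where $M^i$ is called the mobility of the species i. Darken (1948) and Darken and Gurry (1953) relate the Fick diffusion coefficient $\mathcal{D}^i$ to the mobility $M^i$ by equating the diffusion flux $\mathbf{J}^i$ to the flux of atoms moving at the relative velocity $\mathbf{v}^i$
$$-\mathcal{D}^i\,\mathrm{grad}\,\rho C^i = \rho C^i\,\mathbf{v}^i = -\rho C^i M^i\,\mathrm{grad}\,\mu_i$$
In one space dimension, neglecting temperature gradients relative to concentration gradients, this can be rewritten
$$\mathcal{D}^i\,\frac{\partial \ln \rho C^i}{\partial x} = M^i\,\frac{\partial \mu_i}{\partial x}$$
Atom fractions $X_i$ relate to volume concentrations through
$$X_i = \frac{\rho C^i}{\sum_j \rho C^j}$$
Neglecting variations in the molar volume makes the denominator constant and therefore the relative variations of concentrations and mole fractions are equal
$$d\ln \rho C^i = d\ln X_i$$
which results in
$$\mathcal{D}^i = M^i\,\frac{\partial \mu_i}{\partial \ln X_i}$$
In a binary mixture, $\mu_i$ can be expressed as
$$\mu_i = G + (1 - X_i)\,\frac{\partial G}{\partial X_i}$$
where G is the free enthalpy of the system (e.g., Swalin, 1962). The coefficient of diffusion is therefore
$$\mathcal{D}^i = M^i X_i (1 - X_i)\,\frac{\partial^2 G}{\partial X_i^2} \tag{8.4.5}$$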
For solutions which present a mixing gap, the locus of points where the second derivative of G vanishes is called the spinodal. Within the spinodal, this second derivative is negative, which results in negative diffusion coefficients or uphill diffusion (Figure 8.10). Even this more elaborate description of ion movements in response to gradients of chemical potential may turn out to be insufficient, in particular when uphill diffusion is active:
(a) Local charge balance must be obeyed, which introduces stringent constraints that are difficult to meet (e.g., Kirkaldy and Young, 1987).
(b) Cross-effects of multiple chemical gradients on the chemical potential are not easy to quantify: simple experimental evidence of cross-diffusion effects is the buildup of concentration gradients for an element 2 having an initially uniform distribution as a result of a concentration gradient in element 1 (Figure 8.11) (Kirkaldy and Young, 1987).
(c) For an identical composition, the chemical potential of a species in a homogeneous system is different from its potential in a heterogeneous system. It has been suggested (Hilliard, 1970) that its appropriate form is
$$\mu_i - 2K\,\frac{\partial^2 C^i}{\partial x^2}$$
where K is the 'gradient-energy coefficient', which results in a more complicated form of the diffusion equation.
Figure 8.10 According to Darken's theory, the sign of the diffusion coefficient changes where the second derivative of the Gibbs function relative to the molar fraction $X_i$ vanishes (spinodal).
Figure 8.11 Evidence of cross-diffusional effects (concentration against distance, initial distributions and distributions at time t). The homogeneous distribution of species 2 (dashed line, top) is perturbed by a coexisting gradient of species 1 (bottom).
8.4.3 The Matano interface
A common method to measure diffusion coefficients consists in welding two samples with different concentrations of the element whose diffusion coefficient is to be determined. Upon heating, asymmetrical diffusion profiles are quite commonly obtained, which show that different species diffuse at different rates. The concentration-dependent diffusion coefficient can be obtained by a method devised by Boltzmann and applied by Matano. The one-dimensional diffusion equation (8.4.2) of a medium at rest and with no reaction reads
$$\frac{\partial C}{\partial t} = \frac{\partial}{\partial x}\left(\mathcal{D}\,\frac{\partial C}{\partial x}\right)$$
where the superscript i is temporarily left out. The rather general similarity method (Logan, 1987; Zwillinger, 1989) uses particular variable transformations in order to reduce partial differential equations to ordinary differential equations. We already implicitly used a linear similarity transformation relating t and x by introducing the substantial derivatives [equations (8.2.6) and (8.3.18)]. For the diffusion equation, the transformation is no longer linear and we use the Boltzmann variable
$$u = \frac{x}{\sqrt{t}} \tag{8.4.6}$$
By the chain rule, the derivatives become
$$\frac{\partial}{\partial t} = \frac{\partial u}{\partial t}\,\frac{d}{du} = -\frac{x}{2t\sqrt{t}}\,\frac{d}{du} \qquad \text{and} \qquad \frac{\partial}{\partial x} = \frac{\partial u}{\partial x}\,\frac{d}{du} = \frac{1}{\sqrt{t}}\,\frac{d}{du}$$
By changing the variables, the partial differential equation of diffusion has been turned into an ordinary differential equation
$$-\frac{u}{2}\,\frac{dC}{du} = \frac{d}{du}\left(\mathcal{D}\,\frac{dC}{du}\right)$$
where simple derivatives are used since the only variable left is u. Alternatively
$$-\frac{1}{2}\,u\,dC = d\left(\mathcal{D}\,\frac{dC}{du}\right) \tag{8.4.7}$$
which can be integrated from one end of the diffusion couple, where concentration is $C_0$ with zero gradient (no flux at the end), to the current concentration C as
$$-\frac{1}{2}\int_{C_0}^{C} u\,dc = \left(\mathcal{D}\,\frac{dc}{du}\right)_C - \left(\mathcal{D}\,\frac{dc}{du}\right)_{C_0} = \left(\mathcal{D}\,\frac{dc}{du}\right)_C$$
where the lower-case c is a dummy integration variable. For a given value of t, u can be replaced by x and the equation rearranged into
$$\frac{1}{2t}\int_{C_0}^{C} x\,dc + \left(\mathcal{D}\,\frac{\partial C}{\partial x}\right)_C = 0$$
Likewise, at the other end where concentration is $C_1$ and the gradient vanishes, mass balance between C and $C_1$ leads to
$$\frac{1}{2t}\int_{C}^{C_1} x\,dc - \left(\mathcal{D}\,\frac{\partial C}{\partial x}\right)_C = 0$$
Adding up these two conservation equations leads to
$$\int_{C_0}^{C_1} x\,dc = 0$$
We now have to make the definition of x more explicit and decide where to locate the origin (the Matano interface) for the conservation condition above to hold. In the laboratory, one would use an arbitrary coordinate X, such as the distance to one end of the experimental device, then $x = X - X_m(t)$ where $X_m(t)$ is the position of the Matano interface. Integrating the material conservation equation above, we get
$$\int_{C_0}^{C_1}\left[X - X_m(t)\right]dc = \int_{C_0}^{C_1} X\,dc - \int_{C_0}^{C_1} X_m(t)\,dc = \int_{C_0}^{C_1} X\,dc - X_m(t)\,(C_1 - C_0) = 0$$
and $X_m(t)$ is found to be the mean distance
$$X_m(t) = \frac{1}{C_1 - C_0}\int_{C_0}^{C_1} X\,dc \tag{8.4.8}$$
i.e., the distance at which the surface hatched in Figure 8.12 is split in half. Finally, the diffusion coefficient is computed from
$$\mathcal{D}(C) = -\frac{\displaystyle\int_{C_0}^{C} x\,dc}{2t\,(\partial C/\partial x)_C} = \frac{\displaystyle\int_{C}^{C_1} x\,dc}{2t\,(\partial C/\partial x)_C} \tag{8.4.9}$$
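Equations (8.4.8) and (8.4.9) translate directly into a short numerical procedure. The sketch below is an illustration under stated assumptions rather than a recommended data-reduction method: it uses trapezoidal integration over concentration and central differences for the gradient, and it is checked on a synthetic error-function profile generated with a known constant diffusion coefficient (dimensionless, hypothetical values). The function name `boltzmann_matano` is an assumption.

```python
import math

def boltzmann_matano(X, C, t):
    """Return (X_m, D) for a monotonic profile C(X) sampled at time t.

    X_m follows equation (8.4.8) and D follows equation (8.4.9), with x = X - X_m.
    D is left as NaN at the two end points and wherever the gradient vanishes.
    """
    n = len(X)
    cum = [0.0] * n                      # running integral of X dc from C[0]
    for k in range(1, n):
        cum[k] = cum[k - 1] + 0.5 * (X[k] + X[k - 1]) * (C[k] - C[k - 1])
    Xm = cum[-1] / (C[-1] - C[0])
    D = [float("nan")] * n
    for k in range(1, n - 1):
        dCdX = (C[k + 1] - C[k - 1]) / (X[k + 1] - X[k - 1])
        if dCdX != 0.0:
            integral = cum[k] - Xm * (C[k] - C[0])   # integral of (X - X_m) dc
            D[k] = -integral / (2.0 * t * dCdX)
    return Xm, D

# Synthetic test profile: constant D_TRUE, initial interface at X0 (assumed values)
D_TRUE, T, X0 = 1.0, 1.0, 3.0
X = [-7.0 + 0.02 * k for k in range(1001)]
C = [0.5 * math.erfc((x - X0) / (2.0 * math.sqrt(D_TRUE * T))) for x in X]

Xm, D = boltzmann_matano(X, C, T)
mid = min(range(len(X)), key=lambda k: abs(X[k] - X0))
print(Xm, D[mid])
```

On this synthetic profile the Matano interface should coincide with the initial interface and the recovered coefficient should be close to the value used to build the profile. With real data, estimates near both ends of the profile divide one small number by another and are expected to be noisy.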
Iqdari and Velde (unpub. data, 1992, see Table 8.2) described experiments of Ce diffusion in apatite soaked in CeCl2 with asymmetric diffusion profiles. For one of their runs carried out at 1100°C for 15 days, and described as an example of a non-linear least-square fit in Section 5.2, it has been found that the relationship between the Ce concentration $C_{Ce}$ and the distance X to the mineral surface is described by
CCe + 0.05
1.93-CCe
We would like to know how the Ce diffusion coefficient varies along the profile.
Table 8.2. Cerium concentrations (%) in apatite soaked in CeCl2 (Iqdari and Velde, unpub. data, 1992). X (µm) is the distance to the interface. The expression for I(X) is given in the text. The diffusion coefficient D (cm² s⁻¹) is calculated by the Matano method.

X (µm)   C_Ce (%)   I(X)    X - X_m (µm)   dX/dC_Ce   D (cm² s⁻¹)
0        1.864      21.2    -15.3           198.1      0
5        1.832      21.3    -10.3            87.9     -1.98 × 10⁻¹³
10       1.638      20.6     -5.3             6.4     -7.02 × 10⁻¹⁴
15       1.227      16.9     -0.3            -2.8      6.00 × 10⁻¹⁴
20       0.600       8.2      4.7            -9.2      2.23 × 10⁻¹³
25       0.250       1.1      9.7           -34.5      6.03 × 10⁻¹³
30       0.152      -1.6     14.7           -73.1      9.55 × 10⁻¹³
35       0.085      -3.8     19.7          -160.5      1.36 × 10⁻¹²
40       0.078      -4.1     24.7          -178.3      1.40 × 10⁻¹²
45       0.046      -5.4     29.7          -314.9      1.43 × 10⁻¹²
50       0.043      -5.6     34.7          -335.4      1.40 × 10⁻¹²
55       0.012      -7.2     39.7          -751.3     -3.51 × 10⁻¹³
60       0.013      -7.2     44.7          -727.7     -2.07 × 10⁻¹³
65       0.012      -7.2     49.7          -751.3     -3.51 × 10⁻¹³
70       0.013      -7.2     54.7          -727.7     -2.07 × 10⁻¹³
Figure 8.12 The Boltzmann-Matano technique. Initially, concentration is $C_0$ to the left of the initial interface located at $x_0$, $C_1$ to the right. The hatched areas between $C_0$ and $C_1$ must be equal, which defines the position of the Matano interface. The framed area represents the numerator of equation (8.4.9). The diffusion coefficient is computed from the same equation.
The expression given for X as a function of C leaves us in trouble at both ends of the diffusion profile. X appropriately tends towards +∞ and -∞ when $C_{Ce}$ tends asymptotically towards -0.05 and 1.93, respectively, which are nearly the extreme concentrations in the profile. This is what we expect from an infinite system. However, the integrals of the rational fractions are simply natural logarithms, which cannot be evaluated for a zero argument and therefore do not converge when evaluated between $C_0$ and $C_1$. We will therefore restrict the calculation to the interval between extreme concentrations, say $C_0 = 1.865$ and $C_1 = 0.012$. The flux at both ends will not be strictly zero, since for these values,
$$\frac{dC}{dX} = 1\Big/\frac{dX}{dC} \neq 0$$
but we can safely assume that the amount of Ce that diffused out of this concentration interval is very small. The apparent shortcoming of this assumption is that estimates of diffusion coefficients near the ends will be unreliable. Assuming that the dimension of the system is infinite in both directions, we first determine the position of the Matano interface from
$$X_m = \frac{1}{C_1 - C_0}\int_{C_0}^{C_1} X\,dc = \frac{1}{0.012 - 1.865}\int_{1.865}^{0.012} X\,dc$$
Inserting the expression for X, we get
$$X_m = \frac{1}{0.012 - 1.865}\,\Big[I(u)\Big]_{1.865}^{0.012}$$
where
$$I(u) = 2.878\,\ln(u + 0.054) + 0.879\,\ln|u - 1.93| + 17u - 2.86u^2$$
and, upon evaluation
$$X_m = \frac{-7.2 - 21.2}{0.012 - 1.865} = 15.3\ \mu\text{m}$$
The derivative of X relative to $C_{Ce}$ is given by
$$\frac{dX}{dC_{Ce}} = -\frac{2.878}{(C_{Ce} + 0.05)^2} + \frac{0.879}{(1.93 - C_{Ce})^2} - 2.86$$
In units of cm² s⁻¹, the Ce diffusion coefficient can be calculated as
$$\mathcal{D}_{Ce} = -\frac{10^{-8}}{2\,(15 \times 24 \times 3600)}\,\Big\{\left[I(C_{Ce}) - I(1.865)\right] - 15.3 \times (C_{Ce} - 1.865)\Big\}\,\frac{dX}{dC_{Ce}}$$
As predicted, the points next to the ends of the profile give unreliable (negative) estimates because the fluxes at the end points themselves become a substantial fraction of the fluxes in the neighborhood. In addition, noise in sections with rather flat gradients becomes a problem. More elaborate non-linear regression techniques should be used to handle this specific problem. Nevertheless, the profile between 15 and 50 µm, where steep Ce variations are observed, shows evidence for substantial changes of $\mathcal{D}_{Ce}$, which seems to hint at much faster diffusion when this element is only present in trace amounts in the apatite lattice.
8.5 Solutions of the diffusion equation: parallel flux
Different techniques are commonly used to solve the diffusion equation (Carslaw and Jaeger, 1959). Analytic solutions can be found by variable separation, Fourier transforms or, more conveniently, Laplace transforms, and other special techniques such as point sources or Green functions. Numerical solutions are calculated for the cases which have no simple analytic solution by finite differences (Mitchell, 1969; Fletcher, 1991), which is the simplest technique to implement, but also by finite elements, particularly useful for complicated geometry (Zienkiewicz, 1977), and collocation methods (Finlayson, 1972). A series of cases relevant to geochemical problems in parallel and radial fluxes will now be described as examples of a few methods.
8.5.1 Parallel flux: the instantaneous point source in the infinite medium
Quite illustrative is the case of a finite amount of a diffusing substance deposited initially at a given position. The Gauss function
$$C = \frac{M}{2\sqrt{\pi\mathcal{D}t}}\,\exp\left[-\frac{(x - x')^2}{4\mathcal{D}t}\right] \tag{8.5.1}$$
where M is a constant to be determined, may be shown by simple substitution of its derivatives to satisfy the one-dimensional diffusion equation with constant diffusion coefficient $\mathcal{D}$
$$\frac{\partial C}{\partial t} = \mathcal{D}\,\frac{\partial^2 C}{\partial x^2} \tag{8.5.2}$$
At t = 0, C is zero everywhere except at x' where it is infinite. The amount of diffusing substance can be found by integrating C from -∞ to +∞
$$\int_{-\infty}^{+\infty} C\,dx = \frac{M}{2\sqrt{\pi\mathcal{D}t}}\int_{-\infty}^{+\infty}\exp\left[-\frac{(x - x')^2}{4\mathcal{D}t}\right]dx$$
Introducing a Boltzmann variable
$$u = \frac{x - x'}{2\sqrt{\mathcal{D}t}} \tag{8.5.3}$$
Figure 8.13 Dispersion of an instantaneous point source [equation (8.5.1)]. A quantity of the diffusing species equivalent to the area under any curve on the diagram is deposited initially at x = 0. The curves are Gauss functions.
it becomes
$$\int_{-\infty}^{+\infty} C\,dx = \frac{M}{\sqrt{\pi}}\int_{-\infty}^{+\infty}\exp(-u^2)\,du$$
From a well-known result of calculus, the definite integral on the right-hand side is $\sqrt{\pi}$ so M is just equal to the quantity of diffusing substance. The present solution is therefore applicable to the case where M grams (or moles) per unit surface are deposited on the plane x = x' at t = 0. In terms of concentration, the initial distribution is an impulse function (point source) centered at x = x' which evolves with time towards a Gaussian distribution with standard deviation $\sqrt{2\mathcal{D}t}$ (Figure 8.13). Since the standard deviation is the square-root of the second moment, it is often stated that the mean squared distance traveled by the diffusing species is $2\mathcal{D}t$. Two important consequences of this equation are:
(i) Identical Boltzmann variables $x/\sqrt{t}$ produce identical concentrations. The solution at x = 0 for t > 0 is identical to the solution at t = ∞ for x > 0. Likewise, the solution at x = ∞ for t > 0 is identical to the solution at t = 0 for x > 0.
(ii) The distance of all the points with identical concentrations (isopleths) to the origin varies with the square-root of time.
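The two properties of the point-source solution quoted above, a total amount equal to M and a mean squared distance equal to $2\mathcal{D}t$, are easy to verify numerically. The following sketch assumes the Gaussian form of equation (8.5.1) and integrates it with the trapezoidal rule on a wide interval. Parameter values and function names are illustrative assumptions.

```python
import math

def point_source(x, t, M=1.0, D=1.0, x0=0.0):
    """Concentration at x and time t > 0 for an amount M released at x0 at t = 0."""
    return M / (2.0 * math.sqrt(math.pi * D * t)) * math.exp(-(x - x0) ** 2 / (4.0 * D * t))

def mass_and_variance(t, M=1.0, D=1.0, half_width=20.0, n=4000):
    """Total amount and mean squared distance by trapezoidal integration."""
    h = 2.0 * half_width / n
    xs = [-half_width + h * k for k in range(n + 1)]
    cs = [point_source(x, t, M, D) for x in xs]
    mass = h * (sum(cs) - 0.5 * (cs[0] + cs[-1]))
    x2c = [x * x * c for x, c in zip(xs, cs)]
    second = h * (sum(x2c) - 0.5 * (x2c[0] + x2c[-1]))
    return mass, second / mass

for t in (0.1, 1.0, 2.0):
    mass, variance = mass_and_variance(t)
    print(t, mass, variance, 2.0 * t)   # the last two columns should agree for D = 1
```

The integration interval must be several standard deviations wide for the tails to be negligible, which is the case here for the times shown.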
8.5.2 Two half-spaces with uniform initial concentrations
The infinite medium with one-dimensional diffusion and constant diffusion coefficient can be treated easily with the point source theory. Let us first assume that two half-spaces with uniform initial concentrations $C_0$ for x < 0 and 0 for x > 0 are brought into contact with each other. The amount of substance distributed per unit surface between x' and x' + dx' is just $C_0\,dx'$. From the previous result, at time t the effect of the point source $C_0\,dx'$ located at x' on the concentration at x will be
$$\frac{C_0\,dx'}{2\sqrt{\pi\mathcal{D}t}}\,\exp\left[-\frac{(x - x')^2}{4\mathcal{D}t}\right]$$
Summing over all the point sources at x' from -∞ to +∞ and noting that the contribution from the half-space x > 0 is zero will give the concentration distribution
$$C = \frac{C_0}{2\sqrt{\pi\mathcal{D}t}}\int_{-\infty}^{0}\exp\left[-\frac{(x - x')^2}{4\mathcal{D}t}\right]dx'$$
Using again the Boltzmann variable change
$$u = \frac{x - x'}{2\sqrt{\mathcal{D}t}} \qquad \text{or} \qquad du = -\frac{dx'}{2\sqrt{\mathcal{D}t}}$$
u changes from +∞ to $x/2\sqrt{\mathcal{D}t}$ and the integral becomes
$$C = \frac{C_0}{\sqrt{\pi}}\int_{x/2\sqrt{\mathcal{D}t}}^{+\infty}\exp(-u^2)\,du$$
Introducing the error function erf x defined as (Appendix 8A)
$$\mathrm{erf}\,x = \frac{2}{\sqrt{\pi}}\int_0^x e^{-\xi^2}\,d\xi$$
and the error function complement erfc x
$$\mathrm{erfc}\,x = 1 - \mathrm{erf}\,x = \frac{2}{\sqrt{\pi}}\int_x^{+\infty} e^{-\xi^2}\,d\xi$$
the solution can be rewritten
$$C = \frac{C_0}{2}\,\mathrm{erfc}\,\frac{x}{2\sqrt{\mathcal{D}t}} \tag{8.5.4}$$
Several curves of C against x have been drawn in Figure 8.14 for different values of the parameter $\mathcal{D}t$. We note that the distribution must remain symmetrical about the point x = 0, $C = C_0/2$. Concentration at the interface is therefore equal to $C_0/2$.
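8.5.3 The infinite medium with a layer of uniform initial concentration
With initial condition $C = C_0$ for -X < x < +X and C = 0 elsewhere, we can assume that the initial distribution is the sum of point sources uniformly distributed between -X and +X. Again, the amount of substance distributed per unit surface between x' and x' + dx' is $C_0\,dx'$. Summing up the contribution of all the point sources such that -X ≤ x' ≤ +X will give the concentration distribution
$$C = \frac{C_0}{2\sqrt{\pi\mathcal{D}t}}\int_{-X}^{+X}\exp\left[-\frac{(x - x')^2}{4\mathcal{D}t}\right]dx'$$
Introducing Boltzmann's variable, we get
$$C = \frac{C_0}{\sqrt{\pi}}\int_{(x-X)/2\sqrt{\mathcal{D}t}}^{(x+X)/2\sqrt{\mathcal{D}t}} e^{-u^2}\,du = \frac{C_0}{\sqrt{\pi}}\left[\int_0^{(x+X)/2\sqrt{\mathcal{D}t}} e^{-u^2}\,du - \int_0^{(x-X)/2\sqrt{\mathcal{D}t}} e^{-u^2}\,du\right]$$
Finally
$$C = \frac{C_0}{2}\left[\mathrm{erf}\,\frac{x + X}{2\sqrt{\mathcal{D}t}} - \mathrm{erf}\,\frac{x - X}{2\sqrt{\mathcal{D}t}}\right] \tag{8.5.6}$$
Defining the dimensionless variables $\xi = x/X$ and $\tau = \mathcal{D}t/X^2$, the solution can be rewritten
$$C = \frac{C_0}{2}\left[\mathrm{erf}\,\frac{\xi + 1}{2\sqrt{\tau}} - \mathrm{erf}\,\frac{\xi - 1}{2\sqrt{\tau}}\right]$$
This solution keeps the remarkably constant value $C \approx C_0/2$ at x = ±X for values of $\mathcal{D}t/X^2 < 0.5$ (Figure 8.15).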
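The statement about the nearly constant interface concentration can be checked directly from the error-function solutions. The short sketch below evaluates the two-half-space solution [equation (8.5.4)] and the layer solution [equation (8.5.6)] with Python's standard `math.erf` and `math.erfc`. The function names and the chosen values of $\mathcal{D}t$ are illustrative assumptions.

```python
import math

def half_spaces(x, Dt, C0=1.0):
    """Two half-spaces: C0 for x < 0 and 0 for x > 0 at t = 0 [equation (8.5.4)]."""
    return 0.5 * C0 * math.erfc(x / (2.0 * math.sqrt(Dt)))

def layer(x, Dt, X=1.0, C0=1.0):
    """Layer of half-thickness X with initial concentration C0 [equation (8.5.6)]."""
    s = 2.0 * math.sqrt(Dt)
    return 0.5 * C0 * (math.erf((x + X) / s) - math.erf((x - X) / s))

# Concentration at the edge of the layer, x = X = 1, for increasing Dt/X^2
for Dt in (0.01, 0.1, 0.25, 0.5, 1.0, 2.0):
    print(Dt, layer(1.0, Dt))
```

At x = X the layer solution reduces to $(C_0/2)\,\mathrm{erf}(X/\sqrt{\mathcal{D}t})$, which explains why the edge concentration stays close to $C_0/2$ as long as $\mathcal{D}t/X^2$ remains small and decreases only once diffusion from the opposite edge is felt.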
8.5.4 The infinite medium: an arbitrary initial distribution
The method of point sources can be extended to any form of the initial distribution $C_0(x)$ in the infinite medium. The amount of substance distributed per unit surface between x' and x' + dx' is just $C_0(x')\,dx'$. Summing over all the point sources at x' from -∞ to +∞ gives the concentration distribution
$$C = \frac{1}{2\sqrt{\pi\mathcal{D}t}}\int_{-\infty}^{+\infty} C_0(x')\,\exp\left[-\frac{(x - x')^2}{4\mathcal{D}t}\right]dx'$$
Figure 8.14 Two half-spaces in contact with each other at x = 0 have the concentration $C_0$ for x < 0 and 0 for x > 0. Interface concentration is constant and equal to $C_0/2$. The curves are labeled for different values of the parameter $\mathcal{D}t$.
Figure 8.15 The infinite medium with a layer of thickness 2X and uniform initial concentration $C_0$ [equation (8.5.6)], plotted against x/X. Interface concentration at x = ±X stays nearly constant and equal to $C_0/2$ for $\mathcal{D}t < 0.5$. The curves are labeled for different values of the parameter $\mathcal{D}t$.
The exponential term which represents the effect of a point source is sometimes called the influence function or Green function of this diffusion problem. The method of sources and sinks easily produces solutions for an infinite medium or for systems of finite dimension when their boundary is kept at zero concentration. Different boundary conditions require a more elaborate formulation (Carslaw and Jaeger, 1959).
8.5.5 The infinite medium with C0(x) being a periodic function of x
A useful result is obtained for the initial distribution $C_0(x)$ given by
$$C_0(x) = A\,\sin 2\pi\frac{x}{\lambda} + B$$
where A is the amplitude and $\lambda$ the wavelength. B is a constant which represents the mean concentration and will be taken equal to zero, which amounts to considering a perturbation around the mean value. A quick look at the partial derivatives in the diffusion equation shows that double differentiation relative to x keeps the sine term and suggests a solution in the form
$$C(x, t) = f(t)\,\sin 2\pi\frac{x}{\lambda}$$
Inserting this expression into the diffusion equation leads to
$$f'(t)\,\sin 2\pi\frac{x}{\lambda} = -\mathcal{D}\,\frac{4\pi^2}{\lambda^2}\,f(t)\,\sin 2\pi\frac{x}{\lambda}$$
or
$$f'(t) = -\mathcal{D}\,\frac{4\pi^2}{\lambda^2}\,f(t)$$
which is integrated into
$$f(t) = \mathrm{const} \times \exp\left(-\frac{4\pi^2\mathcal{D}t}{\lambda^2}\right)$$
The solution is therefore
$$C(x, t) = A\,\sin 2\pi\frac{x}{\lambda}\,\exp\left(-\frac{4\pi^2\mathcal{D}t}{\lambda^2}\right) = C_0(x)\,\exp\left(-\frac{4\pi^2\mathcal{D}t}{\lambda^2}\right) \tag{8.5.8}$$
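How strongly the decay depends on wavelength can be seen by evaluating the exponential factor of equation (8.5.8) for a few wavelengths. The sketch below does so for the sum of three sine waves of unit amplitude. The wavelengths and values of $\mathcal{D}t$ are chosen for illustration, and the function names are assumptions.

```python
import math

def decay_factor(wavelength, Dt):
    """Amplitude reduction exp(-4 pi^2 Dt / lambda^2) of a sine wave [equation (8.5.8)]."""
    return math.exp(-4.0 * math.pi ** 2 * Dt / wavelength ** 2)

def concentration(x, Dt, wavelengths=(1.0, 0.2, 0.1), A=1.0):
    """Sum of decaying sine waves of amplitude A, by superposition of solutions."""
    return sum(A * math.sin(2.0 * math.pi * x / lam) * decay_factor(lam, Dt)
               for lam in wavelengths)

for Dt in (0.0, 0.001, 0.01):
    print(Dt, [decay_factor(lam, Dt) for lam in (1.0, 0.2, 0.1)])
```

Because the exponent scales with $1/\lambda^2$, dividing the wavelength by ten multiplies the decay rate by one hundred, so the shortest wave is almost erased while the longest one is barely affected.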
The shorter the wavelength, the faster its decay. Mineral-scale heterogeneities in rocks disappear long before meter-scale or even larger heterogeneities. This concept can be extended to any arbitrary combination of periodic functions: in Section 2.6, we have already met the idea that any function bounded over an interval can be expanded as a sum of sine and cosine functions. Shorter wavelengths will decay much faster than longer ones: diffusion filters out short-wavelength concentration variations first. This is illustrated in Figure 8.16 for A = 1 and the sum of three sine functions with $\lambda$ = 0.1, 0.2, and 1 at $\mathcal{D}t$ = 0, 0.001, 0.01.
Figure 8.16 The infinite medium with initial concentration $C_0(x) = \sin 2\pi x + \sin 10\pi x + \sin 20\pi x$. The curves are labeled for different values of the parameter $\mathcal{D}t$. Short wavelengths are smoothed much faster than long wavelengths [equation (8.5.8)].
8.5.6 The semi-infinite medium with constant surface concentration
Let us assume parallel flux in a semi-infinite medium bounded by the plane x = 0. Diffusion of a given element takes place from the plane x = 0 kept at concentration $C_{int}$. Introducing a Boltzmann variable u with constant diffusion coefficient such as
$$u = \frac{x}{2\sqrt{\mathcal{D}t}}$$
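equation (8.4.7) is modified as
$$\frac{d^2C}{du^2} = -2u\,\frac{dC}{du} \tag{8.5.9}$$
Equation (8.5.9) is equivalent to
$$\frac{d\left(dC/du\right)}{dC/du} = -2u\,du$$
This is integrated a first time into
$$\frac{dC}{du} = \mathrm{const} \times e^{-u^2}$$
and a second time into
$$C = \alpha\,\mathrm{erf}\,u + \beta$$
where $\alpha$ and $\beta$ are constants to be determined from the boundary and initial conditions. Applying the initial and boundary conditions gives
$$C_0 = \alpha\,\mathrm{erf}\,\infty + \beta = \alpha + \beta, \qquad C_{int} = \alpha\,\mathrm{erf}\,0 + \beta = \beta$$
hence
$$C = (C_0 - C_{int})\,\mathrm{erf}\,\frac{x}{2\sqrt{\mathcal{D}t}} + C_{int}$$
which can be rewritten
$$\frac{C - C_0}{C_{int} - C_0} = \mathrm{erfc}\,\frac{x}{2\sqrt{\mathcal{D}t}} \tag{8.5.10}$$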
Curves corresponding to various values of Q)t have been drawn in Figure 8.17. A standard chemical (or diffusion) boundary layer thickness b can be defined as the
Figure 8.17 Constant surface concentration [equation (8.5.10)]. The curves are labeled for different values of the parameter $\mathcal{D}t$. The framed area is equal to the hatched area and defines the thickness $\delta$ of the boundary layer for $\mathcal{D}t = 0.05$ [equation (8.5.12)].
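mean distance to the interface traveled by the diffusing substance. The total mass $M$ of diffusing substance which has penetrated the interface at time $t$ is given by the area under the curve

$$M = \int_0^{\infty} (C - C_0)\,dx = 2\sqrt{\mathcal{D}t}\,(C_{int} - C_0)\int_0^{\infty} \mathrm{erfc}\,u\,du$$

It is shown in Appendix 8A that

$$\int_0^{+\infty} \mathrm{erfc}\,u\,du = \frac{1}{\sqrt{\pi}}$$

and therefore

$$M = 2(C_{int} - C_0)\sqrt{\frac{\mathcal{D}t}{\pi}} \tag{8.5.11}$$

The boundary layer thickness $\delta$ (Figure 8.17) will be defined as the thickness of a layer with uniform concentration $C_{int}$ which would contain the same amount $M$ of diffusing substance

$$\delta = \frac{M}{C_{int} - C_0} = 2\sqrt{\frac{\mathcal{D}t}{\pi}} \tag{8.5.12}$$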
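As a quick numerical check of equations (8.5.10) to (8.5.12), the following sketch evaluates the error-function profile and compares the area under the curve with the boundary layer thickness. The function names, the integration range `x_max` and the number of trapezoids are illustrative assumptions, not part of the original text.

```python
import math

def concentration(x, Dt, c0, c_int):
    """Equation (8.5.10): concentration at depth x for a given product D*t."""
    u = x / (2.0 * math.sqrt(Dt))
    return c0 + (c_int - c0) * math.erfc(u)

def boundary_layer(Dt):
    """Equation (8.5.12): delta = 2 sqrt(D t / pi)."""
    return 2.0 * math.sqrt(Dt / math.pi)

def excess_mass(Dt, c0, c_int, x_max=10.0, n=20000):
    """Trapezoidal estimate of the integral of (C - C0) between 0 and x_max.

    x_max is assumed large enough for the profile to have returned to C0.
    """
    h = x_max / n
    total = 0.5 * (concentration(0.0, Dt, c0, c_int) - c0)
    total += 0.5 * (concentration(x_max, Dt, c0, c_int) - c0)
    for i in range(1, n):
        total += concentration(i * h, Dt, c0, c_int) - c0
    return total * h

if __name__ == "__main__":
    Dt = 0.05
    print("delta       =", boundary_layer(Dt))
    print("M/(Cint-C0) =", excess_mass(Dt, 0.0, 1.0))
```

With $C_0 = 0$ and $C_{int} = 1$, the two printed numbers should agree, which is the statement made by the equality of the framed and hatched areas in Figure 8.17.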
8.5.7 The slab with uniform initial concentration

When combined with the Fourier expansion of functions, separation of variables is another powerful method of solution, which is particularly useful for systems of finite dimensions. Regardless of boundary conditions, we decompose the solution $C(x,t)$, where the dependence of $C$ on $x$ and $t$ is temporarily emphasized, to the general one-dimensional diffusion equation with constant diffusion coefficient

$$\frac{\partial C(x,t)}{\partial t} = \mathcal{D}\frac{\partial^2 C(x,t)}{\partial x^2}$$

into the product

$$C(x,t) = f(t)\,g(x)$$

Inserting this expression into the diffusion equation gives

$$g(x)\frac{df(t)}{dt} = \mathcal{D}f(t)\frac{d^2g(x)}{dx^2}$$

or

$$\frac{1}{\mathcal{D}f(t)}\frac{df(t)}{dt} = \frac{1}{g(x)}\frac{d^2g(x)}{dx^2}$$
Since the left-hand side of the last equation depends only on $t$ and the right-hand side only on $x$, both must be equal to the same constant, which suggests using an exponential form for $f(t)$ and a trigonometric form for $g(x)$. The diffusion equation is indeed identically verified for

$$C(x,t) = \alpha_n \sin\left(n\pi\frac{x}{X}\right)\exp\left(-n^2\pi^2\mathcal{D}t/X^2\right)$$

where $n$ is an integer and $\alpha_n$ an arbitrary constant. This solution satisfies the condition $C(0,t) = C(X,t) = 0$. Any superposition of solutions with different values of $n$ would also be a solution. The condition at $t = 0$ suggests that the initial concentration distribution can be expanded in a series of sines, e.g.,

$$C(x,0) = \sum_{n=0}^{\infty} \alpha_n \sin\left(n\pi\frac{x}{X}\right)$$

A general solution to the diffusion equation is therefore

$$C(x,t) = \sum_{n=0}^{\infty} \alpha_n \sin\left(n\pi\frac{x}{X}\right)\exp\left(-n^2\pi^2\mathcal{D}t/X^2\right)$$
Such a sine expansion is generally made possible with the condition of zero concentration at $x = 0$ by using an odd function for the initial concentration distribution. A simple example is the initial uniform concentration $C_0$ between 0 and $X$, for which we can assume a fictitious concentration $-C_0$ between $-X$ and 0. Using the results of Chapter 2, the Fourier expansion of the boxcar function which is $C_0$ between 0 and $X$, and 0 at $x = 0$ and $x = X$ is

$$C(x,0) = \frac{4C_0}{\pi}\sum_{n=0}^{\infty}\frac{1}{2n+1}\sin\left[(2n+1)\pi\frac{x}{X}\right] \tag{8.5.13}$$

Concentration at $t$ therefore is

$$C(x,t) = \frac{4C_0}{\pi}\sum_{n=0}^{\infty}\frac{1}{2n+1}\sin\left[(2n+1)\pi\frac{x}{X}\right]\exp\left[-(2n+1)^2\pi^2\mathcal{D}t/X^2\right] \tag{8.5.14}$$
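The series (8.5.14) is easy to evaluate numerically. The sketch below sums its first terms; the function name and the default number of terms are arbitrary choices made here, and many terms are needed when $\mathcal{D}t/X^2$ is small.

```python
import math

def slab_concentration(x, Dt, X=1.0, c0=1.0, n_terms=2000):
    """Partial sum of the sine series (8.5.14); Dt is the product D*t."""
    total = 0.0
    for n in range(n_terms):
        m = 2 * n + 1
        total += (math.sin(m * math.pi * x / X) / m
                  * math.exp(-m * m * math.pi ** 2 * Dt / X ** 2))
    return 4.0 * c0 / math.pi * total

if __name__ == "__main__":
    for Dt in (0.001, 0.01, 0.1):
        profile = [round(slab_concentration(0.1 * i, Dt), 4) for i in range(11)]
        print(Dt, profile)
```

For long times only the first term survives, so the profile tends to a single half sine wave with exponentially decreasing amplitude.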
Formulating the same problem in a symmetrical way, i.e., for a slab with $-h < y < +h$, may occasionally be convenient. Changing $X$ into $2h$ and $x$ into $y + h$ gives

$$C(y,t) = \frac{4C_0}{\pi}\sum_{n=0}^{\infty}\frac{(-1)^n}{2n+1}\cos\left(\frac{2n+1}{2}\pi\frac{y}{h}\right)\exp\left[-\left(\frac{2n+1}{2}\right)^2\pi^2\frac{\mathcal{D}t}{h^2}\right]$$
This solution converges very slowly for values of $\mathcal{D}t/X^2 \ll 1$. Alternative solutions suitable for small values of the time are available from the standard books on diffusion (Carslaw and Jaeger, 1959; Crank, 1976). The amount $M(t)$ of diffusing substance still present in the slab at $t$ is

$$M(t) = \int_0^X C(x,t)\,dx = \frac{4C_0}{\pi}\sum_{n=0}^{\infty}\frac{1}{2n+1}\exp\left[-(2n+1)^2\pi^2\mathcal{D}t/X^2\right]\int_0^X \sin\left[(2n+1)\pi\frac{x}{X}\right]dx$$

The integral can be calculated as

$$\int_0^X \sin\left[(2n+1)\pi\frac{x}{X}\right]dx = \left[-\frac{X}{(2n+1)\pi}\cos\left((2n+1)\pi\frac{x}{X}\right)\right]_0^X = \frac{2X}{(2n+1)\pi} \tag{8.5.15}$$

and the mean concentration $\bar{C}(t) = M(t)/X$ is given by

$$\bar{C}(t) = \frac{8C_0}{\pi^2}\sum_{n=0}^{\infty}\frac{1}{(2n+1)^2}\exp\left[-(2n+1)^2\pi^2\mathcal{D}t/X^2\right] \tag{8.5.16}$$
a solution which converges extremely fast except for small values of $\mathcal{D}t/X^2$. Using the properties of the theta functions, an alternative solution can be computed which converges rapidly for small values of the time (Appendix 8B)

$$\frac{\bar{C}(t)}{C_0} = 1 - \frac{4\sqrt{\mathcal{D}t}}{X}\left[\frac{1}{\sqrt{\pi}} + 2\sum_{n=1}^{\infty}(-1)^n\,\mathrm{ierfc}\frac{nX}{2\sqrt{\mathcal{D}t}}\right] \tag{8.5.17}$$

where ierfc refers to the integral of the complementary error function (Appendix 8A). The solution $\bar{C}(t)/C_0$ has been drawn in Figure 8.18 as a function of the parameter $\mathcal{D}t/X^2$. The approximation

$$\frac{\bar{C}(t)}{C_0} \approx 1 - \frac{4}{\sqrt{\pi}}\frac{\sqrt{\mathcal{D}t}}{X} \tag{8.5.18}$$

gives the fraction left in the mineral at $t$ with four exact digits for loss < 40 percent, and two digits for loss < 80 percent (Figure 8.18).

Using symmetry arguments, the solution for diffusion with no flux at one end can be derived from these equations. Obviously, the concentration profile for zero surface concentration is symmetrical relative to $X/2$, which means that $\partial C/\partial x$ is zero at that point: the flux of diffusing substance through this point is zero. Other combinations of boundary conditions can be found in standard textbooks (Carslaw and Jaeger, 1959; Crank, 1976).

8.5.8 The slab with accumulation of a radiogenic isotope

In K-Ar or zircon U-Pb dating, modeling the loss of radiogenic isotopes by volume diffusion is important. If $P_0$ is the local concentration at $t = 0$ of a radioactive element decaying with constant $\lambda$, a source term exists in the transport equation of the radiogenic element, which is the local rate of accumulation $\lambda P_0 e^{-\lambda t}$. For dual decay,
Figure 8.18 Mean concentration in a slab ($0 < x < X$) as a function of $\mathcal{D}t/X^2$ (panel title: Parallel diffusion in a slab). Heavy line: simple loss and initial concentration $C_0$ [equations (8.5.16) and (8.5.17)]. Dotted line: approximation (8.5.18). Thin lines: initial concentration 0, radiogenic production from a radioactive parent with concentration $C_0$; $w$ is the loss parameter [equation (8.5.22)].
such as K-Ar, this equation should be corrected for the branching ratio. The diffusion coefficient of the radiogenic element is assumed to be constant, thus the one-dimensional diffusion equation reads

$$\frac{\partial C}{\partial t} = \mathcal{D}\frac{\partial^2 C}{\partial x^2} + \lambda P_0 e^{-\lambda t} \tag{8.5.19}$$

At $t = 0$, the slab is supposed to be free of radiogenic element ($C_0 = 0$). At the boundaries $x = 0$ and $x = X$ of the system, concentration is kept at zero. For simplicity, $P_0$ will be assumed to be constant over the mineral. Let us consider in the first place the total concentration $N = C + P_0 e^{-\lambda t}$ of radioactive and radiogenic isotopes. Since there is no loss of the radioactive isotope, the variation of $N$ equals the loss of the radiogenic isotope. In other words

$$\frac{\partial N}{\partial t} = \mathcal{D}\frac{\partial^2 C}{\partial x^2}$$

Since $P_0$ is independent of $x$, this can be rewritten

$$\frac{\partial N}{\partial t} = \mathcal{D}\frac{\partial^2 N}{\partial x^2}$$

with the conditions that $N = P_0$ at $t = 0$ and $N = P_0 e^{-\lambda t}$ at $x = 0$ and $x = X$. Such a formulation is extremely handy when numerical methods such as finite differences are used. Analytic solutions can be arrived at by using Duhamel's principle, briefly outlined in Appendix 8C, which will subsequently be applied to spherical systems. Alternatively, the solution can be worked out through a series of steps similar to those taken for the non-radioactive case. We assume that a solution can be found as a product of a function $f(t)$ and a series of trigonometric functions in $x$ such as
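$$C(x,t) = f(t)\sum_{n=0}^{\infty}\alpha_n \sin\left(n\pi\frac{x}{X}\right)$$

Equation (8.5.19) becomes

$$\frac{df(t)}{dt}\sum_{n=0}^{\infty}\alpha_n\sin\left(n\pi\frac{x}{X}\right) = -\mathcal{D}f(t)\sum_{n=0}^{\infty}\frac{n^2\pi^2}{X^2}\alpha_n\sin\left(n\pi\frac{x}{X}\right) + \lambda P_0 e^{-\lambda t}$$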
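The trouble is now that the source term does not include the sum of sines, so we will use a trick resting on Leibniz's rule for differentiating integrals. A particular solution of the diffusion equation with radiogenic accumulation is

$$C(x,t) = \sum_{n=0}^{\infty}\alpha_n\sin\left(n\pi\frac{x}{X}\right)\int_0^t \lambda P_0 e^{-\lambda u}\exp\left[-n^2\pi^2\mathcal{D}(t-u)/X^2\right]du$$

Using Leibniz's rule, we get

$$\frac{\partial C}{\partial t} = \lambda P_0 e^{-\lambda t}\sum_{n=0}^{\infty}\alpha_n\sin\left(n\pi\frac{x}{X}\right) + \mathcal{D}\frac{\partial^2 C}{\partial x^2}$$

which, inserted into the diffusion equation, amounts to the additional condition

$$\sum_{n=0}^{\infty}\alpha_n\sin\left(n\pi\frac{x}{X}\right) = 1$$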
The function that has such a Fourier expansion while satisfying $C(0,t) = C(X,t) = 0$ is the boxcar function. We found in Section 2.6.2 that the $\alpha_n$ coefficients of this function are 0 for even values and $4/n\pi$ for odd values of $n$. The general solution is therefore

$$C(x,t) = \frac{4}{\pi}\sum_{n=0}^{\infty}\frac{1}{2n+1}\sin\left[(2n+1)\pi\frac{x}{X}\right]\int_0^t\lambda P_0 e^{-\lambda u}\exp\left[-(2n+1)^2\pi^2\frac{\mathcal{D}(t-u)}{X^2}\right]du \tag{8.5.20}$$

Let us evaluate the integral $I_n$ on the right-hand side of equation (8.5.20)

$$I_n = \lambda P_0\exp\left[-(2n+1)^2\pi^2\frac{\mathcal{D}t}{X^2}\right]\int_0^t\exp\left\{\left[(2n+1)^2\pi^2\frac{\mathcal{D}}{X^2}-\lambda\right]u\right\}du$$
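which can be integrated into

$$I_n = P_0\,\frac{\exp\left[-(2n+1)^2\pi^2\mathcal{D}t/X^2\right]-e^{-\lambda t}}{1-(2n+1)^2\pi^2\mathcal{D}/\lambda X^2}$$

Inserting this value into equation (8.5.20) gives

$$C(x,t) = \frac{4P_0}{\pi}\sum_{n=0}^{\infty}\frac{1}{2n+1}\sin\left[(2n+1)\pi\frac{x}{X}\right]\frac{\exp\left[-(2n+1)^2\pi^2\mathcal{D}t/X^2\right]-e^{-\lambda t}}{1-(2n+1)^2\pi^2\mathcal{D}/\lambda X^2}$$

The mean concentration $\bar{C}(t)$ at time $t$ is obtained by integrating this expression from 0 to $X$

$$\bar{C}(t) = \frac{1}{X}\int_0^X C(x,t)\,dx$$

Using equation (8.5.15), we get, in terms of the dimensionless time $\mathcal{D}t/X^2$

$$\bar{C}(t) = \frac{8P_0}{\pi^2}\sum_{n=0}^{\infty}\frac{1}{(2n+1)^2}\,\frac{1}{1-(2n+1)^2\pi^2 w}\left\{\exp\left[-(2n+1)^2\pi^2\frac{\mathcal{D}t}{X^2}\right]-\exp\left(-\frac{1}{w}\frac{\mathcal{D}t}{X^2}\right)\right\} \tag{8.5.21}$$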
where the dimensionless parameter $w$ is defined as

$$w = \frac{\mathcal{D}}{\lambda X^2} \tag{8.5.22}$$

This formula could be made slightly more compact by using Cauchy's Theorem of Residues for complex variables and the Theorem of Mittag-Leffler (e.g., Spiegel, 1973), but, since the advent of reasonably fast desktop computers, this is no longer a critical requirement. For fast decay, all radioactive atoms are rapidly converted into radiogenic atoms. It can be checked that when $\lambda$ goes to large values, i.e., when $w$ vanishes, the second term after the summation sign tends to unity, the second exponential within the braces vanishes, and the expression becomes identical to that for an initial concentration $P_0$, which we established in the previous section. The solution has been drawn for several values of the parameter $w$ in Figure 8.18.
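The fast-decay limit just described can be checked numerically. The sketch below assumes that the mean concentration of the radiogenic isotope, normalized to $P_0$, is $(8/\pi^2)\sum_n \{\exp[-(2n+1)^2\pi^2\tau] - \exp(-\tau/w)\}/\{(2n+1)^2[1-(2n+1)^2\pi^2 w]\}$ with $\tau = \mathcal{D}t/X^2$ and $w = \mathcal{D}/\lambda X^2$, and compares it with the simple-loss series (8.5.16). Function names and the number of terms are illustrative.

```python
import math

def mean_simple_loss(tau, n_terms=200):
    """Series (8.5.16) divided by C0; tau = D t / X^2."""
    total = 0.0
    for n in range(n_terms):
        m2 = (2 * n + 1) ** 2
        total += math.exp(-m2 * math.pi ** 2 * tau) / m2
    return 8.0 / math.pi ** 2 * total

def mean_radiogenic(tau, w, n_terms=200):
    """Mean radiogenic concentration divided by P0 (assumed form, see text).

    w = D / (lambda X^2). The expression is singular if (2n+1)^2 pi^2 w = 1
    for some n; such values of w are not handled here.
    """
    decay = math.exp(-tau / w)          # exp(-lambda t)
    total = 0.0
    for n in range(n_terms):
        m2 = (2 * n + 1) ** 2
        k = m2 * math.pi ** 2
        total += (math.exp(-k * tau) - decay) / (m2 * (1.0 - k * w))
    return 8.0 / math.pi ** 2 * total

if __name__ == "__main__":
    tau = 0.05
    print("simple loss      :", mean_simple_loss(tau))
    for w in (1e-9, 1e-3, 1e-1, 1.0):
        print("radiogenic, w =", w, ":", mean_radiogenic(tau, w))
```

For very small $w$ the two results coincide, while for larger $w$ the radiogenic concentration is bounded by the amount produced, $1 - e^{-\lambda t}$.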
8.5.9 Disequilibrium fractionation during solidification

Let us consider the influence of a solid-liquid interface advancing at a constant velocity on the solid-liquid fractionation of an element $i$. In the case of unidirectional solidification, it is convenient to consider that liquid crosses the immobile interface with an absolute constant velocity $v$, while a solid-liquid fractionation coefficient $K$ is applied to the fractionation of element $i$. Let us assume that the interface is at $x = 0$, the medium being solid for $x < 0$. Liquid fills the half-space $0 < x < \infty$ and concentration is initially $C_0$. Diffusion in the solid is much slower than in the liquid and therefore is neglected. Since the liquid moves towards negative $x$, concentration of a particular element in the liquid obeys the diffusion equation

$$\frac{\partial C}{\partial t} = \mathcal{D}\frac{\partial^2 C}{\partial x^2} + v\frac{\partial C}{\partial x} \tag{8.5.23}$$

where $\mathcal{D}$ is the diffusion coefficient of the species. Mass balance at the interface $x = 0$ requires the equality of the diffusion and advection fluxes on the liquid side with the advection flux on the solid side

$$-\rho_{liq}\mathcal{D}\frac{\partial C_{liq}}{\partial x} - \rho_{liq}vC_{liq} = -\rho_{sol}vC_{sol} \tag{8.5.24}$$

or

$$\frac{\partial C_{liq}}{\partial x} = \left(\frac{\rho_{sol}}{\rho_{liq}}K - 1\right)\frac{v}{\mathcal{D}}C_{liq} = \frac{a}{\delta}C_{liq} \tag{8.5.25}$$
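in which the parameter $a$, corresponding to the parenthesis in the middle term, and the characteristic length $\delta = \mathcal{D}/v$ have been introduced in order to keep the notation compact. In addition, when $x \to \infty$, $C \to C_0$ since the liquid away from the interface does not 'feel' it. An element-dependent characteristic time of the process is $\theta = \mathcal{D}/v^2$. This problem has been considered by Smith et al. (1955) and Hulme (1955), who, using the method of Laplace transforms, found the solution (also listed in Carslaw and Jaeger, 1959, p. 389)

$$\frac{C_{liq}}{C_0} = 1 - \frac{a}{2(1+a)}e^{-\xi}\,\mathrm{erfc}\frac{\xi-\tau}{2\sqrt{\tau}} - \frac{1}{2}\,\mathrm{erfc}\frac{\xi+\tau}{2\sqrt{\tau}} + \frac{1+2a}{2(1+a)}e^{a[\xi+(1+a)\tau]}\,\mathrm{erfc}\frac{\xi+(1+2a)\tau}{2\sqrt{\tau}} \tag{8.5.26}$$

where we have introduced the dimensionless variables

$$\xi = \frac{vx}{\mathcal{D}} = \frac{x}{\delta}, \quad\text{and}\quad \tau = \frac{v^2t}{\mathcal{D}} = \frac{t}{\theta} \tag{8.5.27}$$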
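After some manipulation, concentration of the liquid at the interface $\xi = 0$ becomes

$$\frac{C_{liq}(0)}{C_0} = \frac{1}{2(1+a)}\left\{1 + \mathrm{erf}\frac{\sqrt{\tau}}{2} + (1+2a)\,e^{a(1+a)\tau}\,\mathrm{erfc}\left[\frac{(1+2a)\sqrt{\tau}}{2}\right]\right\} \tag{8.5.28}$$

Let us investigate the long-term asymptotic properties of this solution. Given that

$$a(1+a) = \left(\frac{1+2a}{2}\right)^2 - \frac{1}{4}$$

we write

$$e^{a(1+a)\tau} = e^{-\tau/4}\exp\left[\left(\frac{1+2a}{2}\right)^2\tau\right]$$

which is recombined in a more symmetric way

$$\frac{C_{liq}(0)}{C_0} = \frac{1}{1+a}\left(1 + e^{-\tau/4}\left\{\frac{1+2a}{2}\exp\left[\left(\frac{1+2a}{2}\right)^2\tau\right]\mathrm{erfc}\left[\frac{(1+2a)\sqrt{\tau}}{2}\right] - \frac{1}{2}\,e^{\tau/4}\,\mathrm{erfc}\frac{\sqrt{\tau}}{2}\right\}\right)$$

Using the property of the error function listed in Appendix 8B

$$\exp(u^2)\,\mathrm{erfc}\,u \approx \frac{1}{\sqrt{\pi}\,u}$$

as $u \to \infty$, the asymptotic value when $\tau \to \infty$ becomes

$$\frac{C_{liq}(0)}{C_0} \approx \frac{1}{1+a}\left(1 + e^{-\tau/4}\left\{\frac{1+2a}{2}\,\frac{2}{(1+2a)\sqrt{\pi\tau}} - \frac{1}{2}\,\frac{2}{\sqrt{\pi\tau}}\right\}\right)$$

The term between braces is identically zero so

$$\frac{C_{liq}(0)}{C_0} = \frac{1}{1+a} = \frac{\rho_{liq}}{K\rho_{sol}}$$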
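The build-up of the interface concentration towards its steady-state value can be explored with a few lines of code. The sketch below assumes that the interface solution has the form $C_{liq}(0)/C_0 = \{1 + \mathrm{erf}(\sqrt{\tau}/2) + (1+2a)\,e^{a(1+a)\tau}\,\mathrm{erfc}[(1+2a)\sqrt{\tau}/2]\}/[2(1+a)]$, with $a = \rho_{sol}K/\rho_{liq} - 1$; the function name is a label chosen here.

```python
import math

def interface_concentration(tau, a):
    """C_liq(0)/C0 at dimensionless time tau = v^2 t / D (assumed form, see text).

    For a > 0 the exponential factor overflows when a (1 + a) tau exceeds
    about 700; no rescaling is attempted in this sketch.
    """
    root = math.sqrt(tau)
    third = ((1.0 + 2.0 * a) * math.exp(a * (1.0 + a) * tau)
             * math.erfc((1.0 + 2.0 * a) * root / 2.0))
    return (1.0 + math.erf(root / 2.0) + third) / (2.0 * (1.0 + a))

if __name__ == "__main__":
    # Equal densities assumed, so that a = K - 1.
    for K in (0.1, 0.5, 2.0):
        a = K - 1.0
        values = [round(interface_concentration(t, a), 4) for t in (0, 1, 10, 50)]
        print("K =", K, "steady state =", 1.0 / (1.0 + a), values)
```

The printed values start at 1 and drift towards $1/(1+a)$, more slowly for the smaller partition coefficients.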
The steady-state profile in the liquid is calculated in Section 9.6. The rate at which interface concentration builds up or goes down is shown for various $a$ in Figure 8.19.

Figure 8.19 Evolution of the liquid concentration at the interface with a solid growing at the constant rate $v$ from a solution initially at $C_0$. $K$ is the solid-liquid partition coefficient. Steady-state takes longer to establish for incompatible elements.

Neglecting density change upon solidification and considering incompatible elements ($K \approx 0$), equation (8.5.29) simplifies to an excellent approximation into
or
^T *
(1
2K
e )
( 1
^ ^
e )
(8 53O)
-
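This equation shows that, at constant growth rate, the more incompatible the element, the longer it takes for steady-state to be established. We can therefore expect kinetic disequilibrium between mineral and liquid to be more conspicuous for incompatible than for compatible elements.

8.6 Radial flux and spherical coordinates

8.6.1 Introduction

In the case of flux with spherical symmetry, i.e., with no dependence on the latitude and longitude, gradient and Laplacian operators must be expressed as a function of the radial distance $r$ to the origin

$$r = \sqrt{x^2 + y^2 + z^2}$$

The derivative operator transforms into

$$\frac{\partial}{\partial x} = \frac{\partial r}{\partial x}\frac{\partial}{\partial r} = \frac{x}{r}\frac{\partial}{\partial r}$$

which results in

$$\mathrm{grad}\,C = \frac{x\mathbf{i} + y\mathbf{j} + z\mathbf{k}}{r}\,\frac{\partial C}{\partial r}$$

where $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ are the unit vectors along the $x$, $y$, $z$ axes. The fraction on the right-hand side represents the ratio of the vector with modulus $r$ to the modulus $r$ itself. It therefore represents the unit vector $\mathbf{e}_r$ along the radial direction at the point under consideration

$$\mathrm{grad}\,C = \mathbf{e}_r\frac{\partial C}{\partial r} \tag{8.6.1}$$

From the previous expression of the derivative

$$\frac{\partial^2 C}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{x}{r}\frac{\partial C}{\partial r}\right) = \frac{1}{r}\frac{\partial C}{\partial r} + x\frac{\partial}{\partial x}\left(\frac{1}{r}\frac{\partial C}{\partial r}\right)$$

which can be developed as

$$\frac{\partial^2 C}{\partial x^2} = \frac{1}{r}\frac{\partial C}{\partial r} + \frac{x^2}{r}\frac{\partial}{\partial r}\left(\frac{1}{r}\frac{\partial C}{\partial r}\right)$$

or

$$\frac{\partial^2 C}{\partial x^2} = \frac{1}{r}\frac{\partial C}{\partial r} - \frac{x^2}{r^3}\frac{\partial C}{\partial r} + \frac{x^2}{r^2}\frac{\partial^2 C}{\partial r^2}$$

Let us write the Laplacian explicitly as a sum of second-order derivatives relative to $x$, $y$, and $z$

$$\frac{\partial^2 C}{\partial x^2} + \frac{\partial^2 C}{\partial y^2} + \frac{\partial^2 C}{\partial z^2} = \Delta C = \nabla^2 C = \frac{3}{r}\frac{\partial C}{\partial r} - \frac{x^2+y^2+z^2}{r^3}\frac{\partial C}{\partial r} + \frac{x^2+y^2+z^2}{r^2}\frac{\partial^2 C}{\partial r^2}$$

or

$$\nabla^2 C = \frac{3}{r}\frac{\partial C}{\partial r} - \frac{r^2}{r^3}\frac{\partial C}{\partial r} + \frac{r^2}{r^2}\frac{\partial^2 C}{\partial r^2}$$

Finally

$$\nabla^2 C = \frac{\partial^2 C}{\partial r^2} + \frac{2}{r}\frac{\partial C}{\partial r} \tag{8.6.2}$$

whence the diffusion equation for radial flux with constant diffusion coefficient can be rewritten

$$\frac{\partial C}{\partial t} = \mathcal{D}\left(\frac{\partial^2 C}{\partial r^2} + \frac{2}{r}\frac{\partial C}{\partial r}\right) \tag{8.6.3}$$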
Similar expressions can be derived for cylindrical coordinates.
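The radial form of the Laplacian, $\partial^2 C/\partial r^2 + (2/r)\,\partial C/\partial r$, can be verified by finite differences on any spherically symmetric function. The sketch below uses $C(r) = e^{-r^2}$, an arbitrary test function chosen here, for which the Laplacian is $(4r^2 - 6)e^{-r^2}$; the step size `h` is also an arbitrary choice.

```python
import math

def radial_laplacian(f, r, h=1e-3):
    """Finite-difference estimate of f'' + (2/r) f' for a function of r only."""
    d1 = (f(r + h) - f(r - h)) / (2.0 * h)
    d2 = (f(r + h) - 2.0 * f(r) + f(r - h)) / h ** 2
    return d2 + 2.0 * d1 / r

def cartesian_laplacian(f, x, y, z, h=1e-3):
    """Finite-difference sum of second derivatives along x, y and z."""
    def c(px, py, pz):
        return f(math.sqrt(px * px + py * py + pz * pz))
    neighbours = (c(x + h, y, z) + c(x - h, y, z)
                  + c(x, y + h, z) + c(x, y - h, z)
                  + c(x, y, z + h) + c(x, y, z - h))
    return (neighbours - 6.0 * c(x, y, z)) / h ** 2

def gaussian(r):
    return math.exp(-r * r)

if __name__ == "__main__":
    x, y, z = 0.3, 0.4, 0.5
    r = math.sqrt(x * x + y * y + z * z)
    print(cartesian_laplacian(gaussian, x, y, z))
    print(radial_laplacian(gaussian, r))
    print((4.0 * r * r - 6.0) * math.exp(-r * r))
```

The three printed values should agree to within the truncation error of the finite differences.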
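8.6.2 Radial diffusion in the sphere

In the diffusion equation (8.6.3) with radial flux and constant diffusion coefficient, let us introduce the new variable $u(r,t) = u = Cr$

$$\frac{\partial C}{\partial r} = \frac{1}{r}\frac{\partial u}{\partial r} - \frac{u}{r^2} = \frac{1}{r}\left(\frac{\partial u}{\partial r} - \frac{u}{r}\right)$$

and for the second derivative

$$\frac{\partial^2 C}{\partial r^2} = \frac{1}{r}\frac{\partial^2 u}{\partial r^2} - \frac{1}{r^2}\frac{\partial u}{\partial r} - \frac{1}{r^2}\frac{\partial u}{\partial r} + \frac{2u}{r^3} = \frac{1}{r}\left(\frac{\partial^2 u}{\partial r^2} - \frac{2}{r}\frac{\partial u}{\partial r} + \frac{2u}{r^2}\right)$$

therefore

$$\mathcal{D}\left(\frac{\partial^2 C}{\partial r^2} + \frac{2}{r}\frac{\partial C}{\partial r}\right) = \frac{\mathcal{D}}{r}\left[\frac{\partial^2 u}{\partial r^2} - \frac{2}{r}\frac{\partial u}{\partial r} + \frac{2u}{r^2} + \frac{2}{r}\left(\frac{\partial u}{\partial r} - \frac{u}{r}\right)\right] = \frac{\mathcal{D}}{r}\frac{\partial^2 u}{\partial r^2}$$

The diffusion equation in the new variable $u$ becomes

$$\frac{\partial C}{\partial t} = \frac{1}{r}\frac{\partial u}{\partial t} = \frac{\mathcal{D}}{r}\frac{\partial^2 u}{\partial r^2}$$

or

$$\frac{\partial u}{\partial t} = \mathcal{D}\frac{\partial^2 u}{\partial r^2} \tag{8.6.4}$$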
Let us calculate the concentrations in a sphere with a uniform initial distribution $C_0$ and zero surface concentration. Initial and boundary conditions are $u(r,0) = C_0 r$, $u(a,t) = 0$, $u(0,t) = 0$. The diffusion equation in one dimension (8.6.4) admits

$$u(r,t) = \alpha_n \sin\left(n\pi\frac{r}{a}\right)\exp\left(-n^2\pi^2\frac{\mathcal{D}t}{a^2}\right)$$

as a particular solution which fulfills the conditions at $r = 0$ and $r = a$ for $n$ integer, $\alpha_n$ being a constant to be determined. Any superposition of solutions with different values of $n$ would also be a solution. The condition at $t = 0$ suggests that if the initial concentration distribution can be expanded in a series of sines, e.g.,

$$u(r,0) = \sum_{n=1}^{\infty}\alpha_n \sin\left(n\pi\frac{r}{a}\right)$$

the general solution is

$$u(r,t) = \sum_{n=1}^{\infty}\alpha_n \sin\left(n\pi\frac{r}{a}\right)\exp\left(-n^2\pi^2\frac{\mathcal{D}t}{a^2}\right)$$

In order to make the solution consistent with initial and boundary conditions, we will use for $u(r,0)$ the ramp function defined in Chapter 2. For $0 < r < a$, $u(r,0) = C_0 r$, while $u(a,0) = 0$. Using the results of Section 2.6, the Fourier expansion of this function is

$$u(r,0) = -\frac{2aC_0}{\pi}\sum_{n=1}^{\infty}\frac{(-1)^n}{n}\sin\left(n\pi\frac{r}{a}\right)$$

which gives the general solution

$$C(r,t) = -\frac{2aC_0}{\pi r}\sum_{n=1}^{\infty}\frac{(-1)^n}{n}\sin\left(n\pi\frac{r}{a}\right)\exp\left(-n^2\pi^2\frac{\mathcal{D}t}{a^2}\right) \tag{8.6.5}$$
The amount of diffusing substance $M(t)$ still present in the sphere at $t$ is obtained upon integration of the concentration times the volume ($4\pi r^2\,dr$) of the infinitesimal shell of thickness $dr$

$$M(t) = \int_0^a C(r,t)\,4\pi r^2\,dr = -2C_0\sum_{n=1}^{\infty}(-1)^n\exp\left(-n^2\pi^2\frac{\mathcal{D}t}{a^2}\right)\int_0^a\frac{a}{n\pi r}\sin\left(n\pi\frac{r}{a}\right)4\pi r^2\,dr$$

Let us call $J_n$ the integral on the right-hand side of the last equation. $J_n$ can be integrated by parts as

$$J_n = \frac{4a}{n}\int_0^a r\sin\left(n\pi\frac{r}{a}\right)dr = \frac{4a}{n}\left\{\left[-\frac{a}{n\pi}\,r\cos\left(n\pi\frac{r}{a}\right)\right]_0^a + \frac{a}{n\pi}\int_0^a\cos\left(n\pi\frac{r}{a}\right)dr\right\}$$

Since the last integral on the right-hand side is zero

$$J_n = -\frac{4a^3}{n^2\pi}(-1)^n \tag{8.6.6}$$

which gives $M(t)$ as

$$M(t) = \frac{8C_0a^3}{\pi}\sum_{n=1}^{\infty}\frac{1}{n^2}\exp\left(-n^2\pi^2\frac{\mathcal{D}t}{a^2}\right)$$

The mean concentration of the sphere therefore is

$$\bar{C}(t) = \frac{M(t)}{4\pi a^3/3} = \frac{6C_0}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{n^2}\exp\left(-n^2\pi^2\frac{\mathcal{D}t}{a^2}\right) \tag{8.6.7}$$
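Again, this solution converges very slowly for small extents of loss, i.e., for small values of $\mathcal{D}t/a^2$. In this case, the solution expressed as an error function series should be used (Appendix 8B)

$$\frac{\bar{C}(t)}{C_0} = 1 - \frac{6\sqrt{\mathcal{D}t}}{a}\left[\frac{1}{\sqrt{\pi}} + 2\sum_{n=1}^{\infty}\mathrm{ierfc}\frac{na}{\sqrt{\mathcal{D}t}}\right] + \frac{3\mathcal{D}t}{a^2} \tag{8.6.8}$$

Of course, both equations (8.6.7) and (8.6.8) are valid solutions which only differ by the rate of convergence. Figure 8.20 under the label $\alpha = 0$ illustrates how the solution varies for different values of the parameter $\mathcal{D}t/a^2$. For loss extents < 85 percent, the approximation

$$\frac{\bar{C}(t)}{C_0} \approx 1 - \frac{6}{\sqrt{\pi}}\sqrt{\frac{\mathcal{D}t}{a^2}} + 3\frac{\mathcal{D}t}{a^2} \tag{8.6.9}$$

gives the lost fraction with at least four exact digits.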
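The quality of the short-time approximation can be judged by comparing it with the exponential series for the mean concentration in the sphere. The sketch below assumes the approximation has the form $1 - (6/\sqrt{\pi})\sqrt{\tau} + 3\tau$ with $\tau = \mathcal{D}t/a^2$; function names and the number of series terms are illustrative.

```python
import math

def sphere_mean_series(tau, n_terms=200):
    """(6/pi^2) sum exp(-n^2 pi^2 tau)/n^2, the exponential series for C/C0."""
    total = 0.0
    for n in range(1, n_terms + 1):
        total += math.exp(-n * n * math.pi ** 2 * tau) / (n * n)
    return 6.0 / math.pi ** 2 * total

def sphere_mean_approx(tau):
    """Short-time approximation 1 - 6 sqrt(tau/pi) + 3 tau (assumed form)."""
    return 1.0 - 6.0 * math.sqrt(tau / math.pi) + 3.0 * tau

if __name__ == "__main__":
    for tau in (0.005, 0.03, 0.1, 0.15, 0.3):
        s = sphere_mean_series(tau)
        a = sphere_mean_approx(tau)
        print(f"tau={tau:5.3f}  series={s:.6f}  approx={a:.6f}  diff={a - s:+.1e}")
```

The difference stays negligible until most of the diffusing substance has been lost, then grows quickly as the neglected ierfc terms become significant.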
Figure 8.20 (panel title: Radial diffusion; curves labeled with values of $\alpha$) Mean concentration in a sphere (0
In the case where the surface is kept at $C_s$, the solutions just derived hold with $C - C_s$ and $C_0 - C_s$ written in place of $C$ and $C_0$. In particular, we will subsequently make use of the solution

$$\frac{C(r,t) - C_0}{C_s - C_0} = 1 + \frac{2a}{\pi r}\sum_{n=1}^{\infty}\frac{(-1)^n}{n}\sin\left(n\pi\frac{r}{a}\right)\exp\left(-n^2\pi^2\frac{\mathcal{D}t}{a^2}\right) \tag{8.6.10}$$

Once equation (8.6.10) is integrated over the sphere for $C_0 = 0$, we get the expression

$$\frac{\bar{C}(t)}{C_s} = 1 - \frac{6}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{n^2}\exp\left(-n^2\pi^2\frac{\mathcal{D}t}{a^2}\right) \tag{8.6.11}$$

which is useful to simulate a 'clean' sphere immersed in a 'dirty' liquid. It may be applied to the uptake of trace elements by minerals from liquids or to the sorption of rare gases from a surrounding fluid phase.

8.6.3 Desorption from a sphere into a well-stirred solution

Let us assume that a sphere with radius $a$ is immersed in a liquid of finite volume, e.g., a mineral in a hydrothermal fluid. Diffusion in liquids is normally fast compared to diffusion in solids, so that the liquid can be thought of as homogeneous. Similar conditions would apply to a sphere degassing into a finite enclosure, e.g., for radiogenic argon loss in a closed pore space. Given the diffusion equation with radial flux and constant diffusion coefficient

$$\frac{\partial C}{\partial t} = \mathcal{D}\left(\frac{\partial^2 C}{\partial r^2} + \frac{2}{r}\frac{\partial C}{\partial r}\right)$$
we calculate the evolution of the concentration in a sphere of radius $a$ with uniform initial concentration $C_0$ desorbing the diffusing substance into a well-stirred solution of volume $\omega$ with zero initial concentration. The surface concentration of the sphere is in equilibrium with the solution through a partition coefficient $K$ such as

$$C(a,t) = K\,C_{liq}(t)$$

In addition, mass balance requires that the amount of element leaving the sphere increases the concentration in the surrounding liquid, i.e.,

$$-4\pi a^2\mathcal{D}\left.\frac{\partial C}{\partial r}\right|_{r=a} = \omega\frac{dC_{liq}}{dt} = \frac{\omega}{K}\frac{\partial C(a,t)}{\partial t}$$

This problem requires specific techniques not developed in this chapter, such as Laplace transforms, and the reader interested in the derivation of the solution may refer to the textbook of Crank (1976). Defining $\alpha$ as the final distribution ratio, i.e., the amount of solute contained in the solid divided by the amount contained in the liquid when $t \to \infty$

$$\alpha = \frac{4\pi a^3 K}{3\omega} \tag{8.6.12}$$

the solution is

$$\frac{\bar{C}(t)}{C_0} = \frac{\alpha}{1+\alpha} + 6\sum_{n=1}^{\infty}\frac{\exp\left(-q_n^2\mathcal{D}t/a^2\right)}{9\alpha^2 + 9\alpha + q_n^2} \tag{8.6.13}$$

where $q_n$ is the $n$th non-zero solution of the equation

$$\tan q_n = \frac{3\alpha q_n}{3\alpha + q_n^2} \tag{8.6.14}$$

an equation we can solve numerically by the Newton method described in Section 3.1. Letting $\omega$ increase indefinitely, $\alpha$ decreases to zero: $\alpha = 0$ corresponds to a situation of very small values of $K$ or very large volume of liquid. At equilibrium, most of the species considered is held by the liquid. In this case, the solutions to the equation

$$\tan q_n = 0$$

are simply $n\pi$ and the solution is identical to that derived above for zero surface concentration. If $\alpha$ increases to infinity, e.g., because of a very high partition coefficient, the first term on the right-hand side of equation (8.6.13) tends to unity while the second term vanishes. Negligible amounts of elements are transferred from the sphere into the surrounding medium and concentration stays $C_0$. It will be left to the reader to verify that mass balance is verified at equilibrium for a concentration $\bar{C}(\infty)$ given by

$$\bar{C}(\infty) = \frac{\alpha}{1+\alpha}\,C_0 \tag{8.6.15}$$

The curves of the solution for various values of $\alpha$ have been drawn in Figure 8.20.
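The curves of this kind can be regenerated with a short program. The sketch below assumes that the mean concentration is $\bar{C}/C_0 = \alpha/(1+\alpha) + 6\sum_n \exp(-q_n^2\tau)/(9\alpha^2 + 9\alpha + q_n^2)$, where $\tau = \mathcal{D}t/a^2$ and $q_n$ is the root of $\tan q = 3\alpha q/(3\alpha + q^2)$ lying between $n\pi$ and $n\pi + \pi/2$. Bisection is used here instead of the Newton method, simply because each root is bracketed; all function names are illustrative.

```python
import math

def root(n, alpha, iterations=100):
    """n-th non-zero root of tan q = 3 alpha q / (3 alpha + q^2), by bisection."""
    lo = n * math.pi
    hi = n * math.pi + math.pi / 2.0 - 1e-9   # stay just below the pole of tan

    def f(q):
        return math.tan(q) - 3.0 * alpha * q / (3.0 * alpha + q * q)

    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def mean_concentration(tau, alpha, n_terms=50):
    """Mean concentration in the sphere divided by C0 (assumed form, see text)."""
    total = alpha / (1.0 + alpha)
    for n in range(1, n_terms + 1):
        q = root(n, alpha)
        total += 6.0 * math.exp(-q * q * tau) / (9.0 * alpha ** 2 + 9.0 * alpha + q * q)
    return total

if __name__ == "__main__":
    for alpha in (1e-9, 0.2, 1.0, 5.0):
        row = [round(mean_concentration(t, alpha), 4) for t in (0.01, 0.1, 1.0)]
        print("alpha =", alpha, row)
```

For large times each curve levels off at $\alpha/(1+\alpha)$, and for vanishing $\alpha$ the roots reduce to $n\pi$.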
8.6.4 The sphere with accumulation of a radiogenic isotope

This problem, in which a radiogenic element is allowed to leak out of its host mineral as it forms, has found important applications in geochronology, particularly for the K-Ar method (Wasserburg, 1954) and the U-Pb method (Tilton, 1960; Wasserburg, 1963) with the so-called continuous loss model. The equation for radial diffusion of a radiogenic element in a sphere with radius $a$ and uniform parent isotope concentration $P = P_0$ at $t = 0$ can be written

$$\frac{\partial C}{\partial t} = \mathcal{D}\left(\frac{\partial^2 C}{\partial r^2} + \frac{2}{r}\frac{\partial C}{\partial r}\right) + \lambda P_0 e^{-\lambda t} \tag{8.6.16}$$

where $\lambda$ is the decay constant of the parent isotope. Again, changing variables and using the total concentration $N$ of radiogenic and radioactive isotope such as $N = C + P_0 e^{-\lambda t}$ would lead to the equation

$$\frac{\partial N}{\partial t} = \mathcal{D}\left(\frac{\partial^2 N}{\partial r^2} + \frac{2}{r}\frac{\partial N}{\partial r}\right)$$

where $\mathcal{D}$ is the diffusion coefficient of the daughter isotope. However, we rather change the variable $C$ into $u(r,t) = Cr$ as above, which gives

$$\frac{\partial u}{\partial t} = \mathcal{D}\frac{\partial^2 u}{\partial r^2} + \lambda r P_0 e^{-\lambda t}$$

The same derivation as that used for the accumulation of radiogenic isotope in a slab would lead to the solution, but we will take advantage of this case to fully develop an application of Duhamel's principle (Appendix 8C). The assumption of zero initial and surface concentration of the radiogenic isotope is equivalent to

$$u(r,0) = 0, \qquad u(a,t) = 0$$
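Introducing the concentration deficit due to loss of radiogenic element times $r$ as the new variable $v(r,t)$, not to be confused with a velocity, we obtain

$$v(r,t) = rP_0\left(1 - e^{-\lambda t}\right) - u(r,t) \tag{8.6.17}$$

Since the second derivative of $v(r,t) + u(r,t)$ with respect to $r$ is zero, we can rewrite the diffusion equation as

$$\frac{\partial v(r,t)}{\partial t} = \mathcal{D}\frac{\partial^2 v(r,t)}{\partial r^2}$$

The boundary conditions become

$$v(r,0) = 0, \qquad v(a,t) = aP_0\left(1 - e^{-\lambda t}\right), \qquad v(0,t) = 0 \tag{8.6.18}$$

In order to apply the Duhamel principle, we must retrieve the solution $v(r,t)$ for constant $v(a,t) = v_s(t) = v_s$ from equation (8.6.10). Upon multiplication by $r/a$, the solution reads

$$v(r,t) = v_s\left[\frac{r}{a} + \frac{2}{\pi}\sum_{n=1}^{\infty}\frac{(-1)^n}{n}\sin\left(n\pi\frac{r}{a}\right)\exp\left(-n^2\pi^2\frac{\mathcal{D}t}{a^2}\right)\right]$$

$\tau$ being an arbitrary time, we first define the function $g(r,t-\tau)$ as

$$g(r,t-\tau) = \frac{r}{a} + \frac{2}{\pi}\sum_{n=1}^{\infty}\frac{(-1)^n}{n}\sin\left(n\pi\frac{r}{a}\right)\exp\left[-n^2\pi^2\frac{\mathcal{D}(t-\tau)}{a^2}\right]$$

We then let the surface value $v_s(\tau)$ vary with $\tau$ according to equation (8.6.18). Using Duhamel's principle (Appendix 8C), we find that the solution for the time-variable function $v_s(\tau)$ is

$$v(r,t) = \frac{\partial}{\partial t}\int_0^t v_s(\tau)\,g(r,t-\tau)\,d\tau$$

or, upon changing $t - \tau$ into $\tau$

$$v(r,t) = \frac{\partial}{\partial t}\int_t^0 v_s(t-\tau)\,g(r,\tau)\,(-d\tau) = \frac{\partial}{\partial t}\int_0^t v_s(t-\tau)\,g(r,\tau)\,d\tau$$

Applying Leibniz's rule to the integral on the right-hand side, we get

$$v(r,t) = v_s(0)\,g(r,t) + \int_0^t\frac{\partial v_s(t-\tau)}{\partial t}\,g(r,\tau)\,d\tau$$

which, since $v_s(0) = 0$, reduces to the integral term. From the definition (8.6.18) of $v_s(t)$

$$\frac{\partial v_s(t-\tau)}{\partial t} = \lambda aP_0 e^{-\lambda(t-\tau)}$$

and therefore

$$v(r,t) = \lambda aP_0 e^{-\lambda t}\int_0^t e^{\lambda\tau}g(r,\tau)\,d\tau$$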
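Since

$$\int_0^t e^{\lambda\tau}\exp\left(-n^2\pi^2\frac{\mathcal{D}\tau}{a^2}\right)d\tau = \frac{\exp\left[\left(\lambda - n^2\pi^2\mathcal{D}/a^2\right)t\right] - 1}{\lambda - n^2\pi^2\mathcal{D}/a^2}$$

the solution $v(r,t)$ is

$$v(r,t) = rP_0\left(1 - e^{-\lambda t}\right) + \frac{2aP_0}{\pi}\sum_{n=1}^{\infty}\frac{(-1)^n}{n}\sin\left(n\pi\frac{r}{a}\right)\frac{\exp\left(-n^2\pi^2\mathcal{D}t/a^2\right) - e^{-\lambda t}}{1 - n^2\pi^2 w}$$

From the definition (8.6.17) of $v(r,t)$, we retrieve the concentration $C(r,t)$ as

$$C(r,t) = -\frac{2aP_0}{\pi r}\sum_{n=1}^{\infty}\frac{(-1)^n}{n}\sin\left(n\pi\frac{r}{a}\right)\frac{\exp\left(-n^2\pi^2\mathcal{D}t/a^2\right) - e^{-\lambda t}}{1 - n^2\pi^2 w}$$

where, as before, the dimensionless parameter $w$ is defined as $w = \mathcal{D}/\lambda a^2$. The amount of radiogenic isotope accumulated in the sphere at time $t$ is

$$M(t) = \int_0^a C(r,t)\,4\pi r^2\,dr$$

Using equation (8.6.6), we obtain the mean concentration of the sphere as

$$\bar{C}(t) = \frac{M(t)}{4\pi a^3/3} = \frac{6P_0}{\pi^2}\sum_{n=1}^{\infty}\frac{\exp\left(-n^2\pi^2\mathcal{D}t/a^2\right) - e^{-\lambda t}}{n^2\left(1 - n^2\pi^2 w\right)} \tag{8.6.19}$$

a solution given by Wasserburg (1954). Again the Theorem of Residues could make the formula look marginally better. The solution for small values of $t$ is given by Carslaw and Jaeger (1959, p. 245) but, quite unfortunately, involves error functions with complex arguments and therefore is too complicated to be useful. For fast decay, $\lambda$ is very large and the solution converges towards equation (8.6.7), as required.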
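The superposition idea behind Duhamel's principle can be illustrated numerically on the mean concentration. Daughter atoms produced at time $s$ at the rate $\lambda P_0 e^{-\lambda s}$ are retained at time $t$ in the proportion given by the simple-loss series evaluated at $t - s$, so that the mean concentration is a convolution. The sketch below compares a trapezoidal evaluation of this convolution with the closed form, assumed here to be $(6/\pi^2)\sum_n [\exp(-n^2\pi^2\tau) - \exp(-\tau/w)]/[n^2(1 - n^2\pi^2 w)]$ in units of $P_0$, with $\tau = \mathcal{D}t/a^2$. Function names, step counts and numbers of terms are illustrative choices.

```python
import math

def retained_fraction(tau, n_terms=400):
    """Simple-loss series (6/pi^2) sum exp(-n^2 pi^2 tau)/n^2; equals 1 at tau = 0."""
    if tau <= 0.0:
        return 1.0
    total = 0.0
    for n in range(1, n_terms + 1):
        total += math.exp(-n * n * math.pi ** 2 * tau) / (n * n)
    return 6.0 / math.pi ** 2 * total

def mean_radiogenic_closed(tau, w, n_terms=400):
    """Closed-form mean concentration over P0 (assumed form, see text).

    Singular if n^2 pi^2 w = 1 for some n; not handled in this sketch.
    """
    decay = math.exp(-tau / w)               # exp(-lambda t)
    total = 0.0
    for n in range(1, n_terms + 1):
        k = n * n * math.pi ** 2
        total += (math.exp(-k * tau) - decay) / (n * n * (1.0 - k * w))
    return 6.0 / math.pi ** 2 * total

def mean_radiogenic_convolution(tau, w, steps=2000):
    """Trapezoidal integral of production rate times retained fraction."""
    h = tau / steps

    def integrand(s):
        return math.exp(-s / w) / w * retained_fraction(tau - s)

    total = 0.5 * (integrand(0.0) + integrand(tau))
    for i in range(1, steps):
        total += integrand(i * h)
    return total * h

if __name__ == "__main__":
    tau, w = 0.1, 1.0
    print("closed form :", mean_radiogenic_closed(tau, w))
    print("convolution :", mean_radiogenic_convolution(tau, w))
```

The two numbers should agree to within the accuracy of the trapezoidal rule, which is limited by the square-root behavior of the retained fraction at short times.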
8.7 The diffusion coefficient varies with time
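8.7.1 General

When the diffusion coefficient varies with time, the fundamental transformation commonly used consists in defining a new time variable $\tau$ as

$$\tau = \frac{1}{X^2}\int_0^t \mathcal{D}(t')\,dt' \tag{8.7.1}$$

or

$$d\tau = \frac{\mathcal{D}(t)}{X^2}\,dt$$

where $X$ is a length scale (usually the thickness for a slab or the radius for a sphere). We first use the chain rule to obtain

$$\frac{\partial C}{\partial t} = \frac{d\tau}{dt}\frac{\partial C}{\partial \tau} = \frac{\mathcal{D}(t)}{X^2}\frac{\partial C}{\partial \tau}$$

Defining the space variable $\xi = x/X$, we get

$$\frac{\partial C}{\partial x} = \frac{1}{X}\frac{\partial C}{\partial \xi}, \quad\text{and}\quad \frac{\partial^2 C}{\partial x^2} = \frac{1}{X^2}\frac{\partial^2 C}{\partial \xi^2}$$

The diffusion equation can now be rewritten as

$$\frac{\partial C}{\partial \tau} = \frac{\partial^2 C}{\partial \xi^2} \tag{8.7.2}$$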
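In practice the new time variable is obtained by integrating the diffusion coefficient over the thermal history. The sketch below does this with the trapezoidal rule, taking $\tau$ as the integral of $\mathcal{D}(t)/X^2$ from 0 to $t$. The exponentially decreasing $\mathcal{D}(t) = \mathcal{D}_0 e^{-t/\theta}$ and all numerical values are hypothetical and serve only to allow a comparison with the analytical integral $\mathcal{D}_0\theta(1 - e^{-t/\theta})/X^2$.

```python
import math

def dimensionless_time(D_of_t, t_end, X, steps=10000):
    """Trapezoidal estimate of tau = (1/X^2) * integral of D(t) from 0 to t_end."""
    h = t_end / steps
    total = 0.5 * (D_of_t(0.0) + D_of_t(t_end))
    for i in range(1, steps):
        total += D_of_t(i * h)
    return total * h / X ** 2

if __name__ == "__main__":
    D0, theta, X = 1.0e-3, 2.0, 1.0          # hypothetical values, arbitrary units

    def D(t):
        return D0 * math.exp(-t / theta)

    numeric = dimensionless_time(D, 5.0, X)
    analytic = D0 * theta * (1.0 - math.exp(-5.0 / theta)) / X ** 2
    print(numeric, analytic)
```

Once $\tau$ is known, any of the constant-coefficient solutions of this chapter can be evaluated with $\tau$ in place of $\mathcal{D}t/X^2$.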
All the solutions with constant diffusion coefficients can therefore be used for problems with time-dependent diffusion coefficients upon replacement of $\mathcal{D}t/X^2$ by $\tau$.

An important application of this transformation is the recovery of diffusion coefficients from stepwise heating experiments. Let us imagine minerals which can be considered as spheres of the same radius $a$ and initially containing a uniform concentration $C_0$ of a gas, argon for instance. In a stepwise heating experiment, the experimentalist heats the sample at increasingly higher temperatures. At the $k$th heating step, the diffusion coefficient $\mathcal{D}_k$ remains constant between $t_{k-1}$ and $t_k$. The remaining fraction of gas in the mineral $\bar{C}_k/C_0$, which can be calculated once all the gas is ultimately extracted at the end of the experiment, can be matched with a unique value of $\tau_k$ through

$$\frac{\bar{C}_k}{C_0} = \frac{6}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{n^2}\exp\left(-n^2\pi^2\tau_k\right)$$

or the corresponding expression for small times. Since the diffusion coefficient is constant by steps, we can write

$$\tau_{k+1} = \tau_k + \frac{\mathcal{D}_{k+1}\left(t_{k+1} - t_k\right)}{a^2}$$

and therefore

$$\tau_{k+1} - \tau_k = \frac{\mathcal{D}_{k+1}\left(t_{k+1} - t_k\right)}{a^2} \tag{8.7.3}$$

where $t_{k+1} - t_k$ is the duration of the $(k+1)$th heating step. Plotting $\ln[(\tau_{k+1} - \tau_k)/(t_{k+1} - t_k)]$ vs $1/T$ (Arrhenius plot) should provide a straight line with slope $-E/R$ and intercept $\ln(\mathcal{D}_0/a^2)$. If the duration $\Delta t$ of all steps is the same, plotting $\ln(\tau_{k+1} - \tau_k)$ vs $1/T$ should also provide a straight line with slope $-E/R$ and intercept $\ln(\mathcal{D}_0\Delta t/a^2)$.

Table 8.3. Experimental results on 15415 lunar anorthosite (Turner, 1972). Cumulated fraction $1 - F_k$ of $^{37}$Ar outgassed at each temperature step. $k = 0$ refers to the undegassed state.

Step k      1        2        3        4        5        6        7
t (°C)      600      700      800      900      1000     1200     1400
1 - F_k     0.0076   0.030    0.106    0.320    0.611    0.892    1.000

In order to determine the $^{39}$Ar-$^{40}$Ar age of the 15415 lunar anorthosite, Turner (1972) irradiated the plagioclase from this sample in a fast-neutron producing reactor. In addition to Ar isotopes being produced by neutron reaction, $^{37}$Ar was also produced by a $^{40}$Ca(n,$\alpha$)$^{37}$Ar nuclear reaction. Argon was progressively extracted from the sample during 1 h heating steps at increasing temperature. The fraction of the total $^{37}$Ar extracted from the sample at each temperature is listed in Table 8.3. Draw the Arrhenius plot for $^{37}$Ar diffusion assuming spherical feldspar grains with radius $a$ and homogeneous Ca distribution.

The fraction of total $^{37}$Ar degassed after the $k$th heating step is

$$1 - F_k = \frac{^{37}\mathrm{Ar}_{outgassed,\,k}}{^{37}\mathrm{Ar}_{total}} = 1 - \frac{6}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{n^2}\exp\left(-n^2\pi^2\tau_k\right)$$

where, since it is a stepwise heating protocol

$$\tau_k = \sum_{i=1}^{k}\frac{\mathcal{D}_i\left(t_i - t_{i-1}\right)}{a^2}$$

(with $t_0 = 0$) and, therefore, equation (8.7.3) applies. We have shown in Section 3.1 how to extract $\tau_k$ values from the equation above using a Newton iterative scheme. The last heating step cannot be used since we should then be able to increase temperature to infinity in order to extract 100 percent of the argon present in the crystals. Next, the differences $\tau_{k+1} - \tau_k$ can be calculated (with $\tau_0 = 0$) and, since the duration of each step is constant, their logarithm plotted directly against the reciprocal of the absolute temperature. The complete results (see Albarede, 1978) are listed in Table 8.4 and plotted in Figure 8.21. As commonly observed, the high-temperature step (here 1200 °C) significantly deviates from the linear trend observed at lower temperature. A regression on the points up to 1000 °C gives a slope of $-25\,500$ K, corresponding to an activation energy $E = 25\,500 \times 8.3144 \approx 212$ kJ mol$^{-1}$, and an intercept of 16.9. Albarede (1978) has shown that stepwise heating data can be inverted to recover the spatial distribution of radiogenic, planetary and spallogenic components. This topic was covered in Section 5.6.
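The regression quoted above can be reproduced approximately from the data of Table 8.3. The sketch below departs from the Newton scheme of Section 3.1: because the five steps used lose less than 85 percent of the gas, it inverts the short-time approximation $1 - F = (6/\sqrt{\pi})\sqrt{\tau} - 3\tau$ in closed form, $\sqrt{\pi\tau} = 1 - \sqrt{1 - (1-F)\pi/3}$. This shortcut, the function names and the conversion to kelvins with 273.15 are assumptions made here.

```python
import math

TEMPS_C = [600, 700, 800, 900, 1000]                 # steps 1 to 5 of Table 8.3
RELEASED = [0.0076, 0.030, 0.106, 0.320, 0.611]      # cumulated fraction 1 - F_k

def tau_from_loss(loss):
    """Invert 1 - F = 6 sqrt(tau/pi) - 3 tau (valid for loss below ~85 percent)."""
    root = 1.0 - math.sqrt(1.0 - loss * math.pi / 3.0)   # this is sqrt(pi * tau)
    return root * root / math.pi

def arrhenius_fit(temps_c, released):
    """Least-squares line of ln(tau_k - tau_{k-1}) against 1/T (T in kelvins)."""
    xs, ys = [], []
    previous = 0.0
    for t_c, loss in zip(temps_c, released):
        tau = tau_from_loss(loss)
        xs.append(1.0 / (t_c + 273.15))
        ys.append(math.log(tau - previous))
        previous = tau
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

if __name__ == "__main__":
    slope, intercept = arrhenius_fit(TEMPS_C, RELEASED)
    print("slope (K)      :", slope)
    print("intercept      :", intercept)
    print("E (kJ per mol) :", -slope * 8.3144 / 1000.0)
```

Small differences with the published slope and intercept are to be expected, since the $\tau_k$ values are obtained here from the approximation rather than from the full series.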
Table 8.4. Derivation of the Arrhenius coordinates for the $^{37}$Ar outgassing of the 15415 lunar anorthosite. Temperatures $t$ in °C and $T$ in K.

Step k+1   t (°C)   1000/T   1 - F     ln τ_{k+1}   ln(τ_{k+1} - τ_k)
1          600      1.15     0.0076    -12.20       -12.20
2          700      1.03     0.03      -9.45        -9.52
3          800      0.93     0.106     -6.87        -6.95
4          900      0.85     0.32      -4.52        -4.62
5          1000     0.79     0.611     -2.98        -3.22
6          1200     0.68     0.892     -1.74        -2.08

Figure 8.21 Arrhenius plot of the $^{37}$Ar outgassed from the lunar anorthosite 15415 (Turner, 1972), with $1000/T$ as abscissa. Only steps 1-5 are taken into account in calculating the least-square straight-line parameters.

8.7.2 Cooling ages
The strong temperature dependence of the diffusion coefficient suggests that when a system cools down, chemical equilibration and loss of radiogenic isotopes come to a freeze over a short time interval. The closure temperature of a chemical system is the temperature at which the inward and outward fluxes of atoms or isotopes involved in exchange reactions with surrounding minerals and fluids fall below a critical level. For instance, exchange of oxygen isotopes in an igneous or hydrothermal system continues well below the temperature of crystallization and oxygen thermometers record temperatures that are interpreted as those of the end of isotopic exchanges. The cooling age of a chronometric system, such as $^{40}$K-$^{40}$Ar, is a measure of the time at which temperature drops below the closure temperature for the exchange of radioactive and radiogenic isotopes, here $^{40}$Ar, with the surrounding medium. These concepts have been covered in detail in a classical paper by Dodson (1973) but the account presented here will follow a different line of argument. Dodson admits that the end of chemical and isotopic exchanges involving a labile species ($^{18}$O, $^{40}$Ar, ...) coincides with the diffusion coefficient $\mathcal{D}$ of the radiogenic isotope dropping below a uniquely defined critical value. This critical value is a function of the geometry of the system and of the cooling rate, or equivalently of the rate at which the diffusion coefficient changes with time. In terms of dimensionless parameters, we can write the requirement for a closed spherical mineral with radius $a$ as

$$\frac{\mathcal{D}\theta}{a^2} < \frac{1}{A}$$

where $A$ is a constant depending only on geometry and $\theta$ a time scaling constant to be defined and characteristic of the loss process (the symbol $\tau$ used by Dodson will be avoided as it would collide with the variable defined for time-dependent diffusion coefficients). Dodson chooses the constant $\theta$ as the time necessary to decrease the diffusion coefficient by a factor of $e$, i.e.,

$$\frac{1}{\theta} = -\frac{d\ln\mathcal{D}}{dt}$$

the minus sign being a result of $\mathcal{D}$ decreasing with time, which results in

$$\mathcal{D} < -\frac{a^2}{A}\frac{d\ln\mathcal{D}}{dt}$$

The critical value $\mathcal{D}_c$ of the diffusion coefficient is the value of $\mathcal{D}$ which makes this relationship an equality and is related to the closure temperature $T_c$ through the Arrhenius equation (8.4.4). Therefore

$$\mathcal{D}_0\exp\left(-\frac{E}{RT_c}\right) = -\frac{a^2}{A}\frac{d\ln\mathcal{D}}{dt}$$

or

$$\frac{1}{T_c} = \frac{R}{E}\ln\left(-\frac{A\mathcal{D}_0}{a^2\,d\ln\mathcal{D}/dt}\right)$$

and, finally

$$T_c = \frac{E}{R\ln\left(A\mathcal{D}_0\theta/a^2\right)} \tag{8.7.4}$$

an expression derived by Dodson through a different set of arguments. $\theta$ is treated by Dodson as a constant, which is by no means critical. Since $\theta$ appears in a logarithm,
458
Transport, advection, and diffusion
the closure temperature has only a weak dependence on the cooling rate. He suggests that values of A equal to 55, 27, and 8.7 for the sphere, the cylinder, and the plane sheet, respectively, ensure a system tightly closed to the loss of radiogenic atoms. Actually, a single closure temperature, or even a narrow temperature range, characteristic of the diffusing species in a particular mineral phase independent of the cooling history of the mineral may simply not exist. Whether we are dealing with stable or radiogenic isotopes, a measure of the closedness of a system to an isotopic or atomic species is the relative rate of loss (e.g., the fraction lost per Ma) d In F/dt, where F is the fraction of atoms still present. This quantity is equivalent to the probability for an atom to escape from the system in this unit time and is the reciprocal of its residence time. We are going to relate d In F/dt to the temperature for a system which holds a given amount of atom or isotope and deduce a temperature threshold which separates the open from the closed system. As in Dodson's theory, we admit that a closure temperature would not depend on the rate of radioactive accumulation and carry out the calculation for a stable species, but the present analysis introduces explicitly the rate of loss. We will work out the solution for the case of a spherical mineral and radial diffusion holding initially a homogeneously distributed amount of a labile isotope (e.g., 40Ar). This calculation can be easily extended to other geometry. Given the dimensionless time x defined as
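we have seen (Section 8.6.2) that, for loss extents up to 85 percent, the relationship between $F$ and $\tau$ can be accurately approximated by equation (8.6.9) as

$$F \approx 1 - \frac{6}{\sqrt{\pi}}\sqrt{\tau} + 3\tau$$

Let us solve this relationship for $\tau$. Switching to an equality sign, we write this relationship as a second-degree polynomial in $\sqrt{\tau}$

$$3\tau - \frac{6}{\sqrt{\pi}}\sqrt{\tau} + (1 - F) = 0$$

Both roots are positive; taking the one which vanishes when $F = 1$ (no loss) gives

$$\sqrt{\pi\tau} = 1 - \sqrt{1 - (1-F)\frac{\pi}{3}}$$

By chain rule,

$$\frac{d\ln F}{dt} = \frac{d\ln F}{d\tau}\,\frac{d\tau}{dt} = \frac{dF}{F\,d\tau}\,\frac{\mathcal{D}}{a^2}$$

and therefore

$$\frac{d\ln F}{dt} = \frac{3\left(1 - 1/\sqrt{\pi\tau}\right)}{F}\,\frac{\mathcal{D}}{a^2}$$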
or, substituting for $\sqrt{\pi\tau}$ from above,

$$\frac{d\ln F}{dt} = -\frac{3}{F}\,\frac{\sqrt{1-(1-F)\pi/3}}{1-\sqrt{1-(1-F)\pi/3}}\,\frac{\mathcal{D}_0}{a^2}\exp\left(-\frac{E}{RT}\right)$$

In the manner of Dodson, let us define $A(F)$ as

$$A(F) = \frac{3}{F}\,\frac{\sqrt{1-(1-F)\pi/3}}{1-\sqrt{1-(1-F)\pi/3}}$$

which produces the more compact notation

$$-\frac{d\ln F}{dt} = A(F)\,\frac{\mathcal{D}_0}{a^2}\exp\left(-\frac{E}{RT}\right)$$

a relationship which can be rearranged into

$$T = \frac{E}{R\,\ln\left[\dfrac{A(F)\,\mathcal{D}_0/a^2}{(-d\ln F/dt)}\right]} \qquad (8.7.6)$$

At a given F, we can therefore ascribe a temperature to the maximum relative rate of loss for a system to be closed and a temperature to the minimum relative rate of loss for a system to be open. This interval may be thought of as the closure temperature interval. The relationship (8.7.6) between temperature and the rate of loss has been calculated for amphibole and orthoclase crystals, both 50 µm in diameter, which lost 0.1, 1, 10 and 50 percent of the diffusing species initially present (Figure 8.22). The diffusion constants are those of Harrison (1981) for amphibole and Foland (1974) for orthoclase. For a nearly untouched amphibole which has lost only 0.1 percent of its initial content of labile atoms, the system would switch from a fully open regime of 10 percent loss per million years at 720 K (point A) to a rather tight state of 1 percent loss per billion years at 580 K (point B), i.e., over a temperature range in excess of 100 degrees. A different amphibole that lost 50 percent of its labile atoms would close over approximately a similar temperature range (interval A'-B'), although at temperatures higher by about 80 K than in the 0.1 percent loss case. This is because the average probability for a labile atom to escape from the crystal decreases very quickly as soon as loss has started. Physically, the progressive formation of a depleted diffusion rim around the crystal tends to slow down subsequent loss. Formally, the factor $A(F)$ decreases very fast for small losses, i.e., for F close to 1. The tighter a system, the larger is its closure temperature interval. Although the analysis would certainly be slightly different for the accumulation of radiogenic atoms in a mineral instead of the simple closure to the loss of a stable species, it is intuitively acceptable that if a closure temperature exists, it is the same for a radiogenic isotope (e.g., $^{40}$Ar) and a non-radiogenic isotope (e.g., $^{36}$Ar or $^{39}$Ar) from the same element.
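The relationship between temperature and relative rate of loss can be explored numerically. The following Python sketch evaluates the factor $A(F)$ for a sphere and inverts the Arrhenius-type expression for the temperature, as in equation (8.7.6). The values of `D0`, `E`, and `a` are hypothetical placeholders chosen for illustration only; they are not the constants of Harrison (1981) or Foland (1974), so the printed temperatures are not those of Figure 8.22.

```python
import math

R_GAS = 8.314            # gas constant, J mol^-1 K^-1
SECONDS_PER_MA = 3.156e13

def loss_factor(F):
    """A(F) for a sphere; F is the fraction still present (0.15 < F < 1)."""
    s = math.sqrt(1.0 - (1.0 - F) * math.pi / 3.0)
    return 3.0 / F * s / (1.0 - s)

def relative_loss_rate(T, F, D0, E, a):
    """-d ln F/dt (s^-1) at temperature T (K)."""
    return loss_factor(F) * D0 / a**2 * math.exp(-E / (R_GAS * T))

def temperature_for_loss_rate(rate, F, D0, E, a):
    """Temperature (K) at which the relative loss rate equals `rate` (s^-1)."""
    return E / (R_GAS * math.log(loss_factor(F) * D0 / (a**2 * rate)))

# Hypothetical diffusion parameters (assumptions, not literature values)
D0 = 1.0e-6   # pre-exponential factor, m^2 s^-1
E = 2.5e5     # activation energy, J mol^-1
a = 25e-6     # grain radius, m

open_rate = 0.10 / SECONDS_PER_MA             # 10 percent lost per Ma
closed_rate = 0.01 / (1000 * SECONDS_PER_MA)  # 1 percent lost per Ga

for F in (0.999, 0.5):
    T_open = temperature_for_loss_rate(open_rate, F, D0, E, a)
    T_closed = temperature_for_loss_rate(closed_rate, F, D0, E, a)
    print(f"F = {F}: open above {T_open:.0f} K, closed below {T_closed:.0f} K")
```

Because $A(F)$ enters through a logarithm, even its rapid decrease with progressive loss shifts the temperatures by a limited amount, which is the behavior described in the text.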
Differences may arise because the protective depleted rim cannot form until enough radiogenic atoms have accumulated and therefore the interior of the system has closed. In a global sense, however, closure temperatures seem to depend significantly on the cooling history, and the thermal aspects of the cooling age theory must be applied to geological problems with the utmost care. The modeling of Ar outgassing from K-feldspars assumed to have coexisting domains with different sizes has been recently carried out by Lovera et al. (1989). This method seems very promising for the reconstruction of thermal history and vertical movements in young mountain belts.

Figure 8.22 Closure temperature $T_c$ as a function of the rate of loss (fraction lost per Ma) for a spherical geometry [equation (8.7.6)]. The numbers on the curves are for different fractions lost by the mineral. Amphibole data from Harrison (1981), orthoclase data from Foland (1974). Points A, A', B, B': see text.

8.8 Two useful steady-state solutions

A conservative property is at steady-state when fluxes, sources, and sinks do not change with time. It is not to be confused with equilibrium which is a state with no flux, no source, and no sink. The general transport equation (8.4.3) of element i at steady-state is
and receives a certain number of important applications. Steady-state fractionation of a trace element during crystal growth is described in Chapter 9 and two examples from the hydrous environment will be described below.

8.8.1 Early diagenesis: sulfate reduction

Sediment deposition on the seafloor traps interstitial water. After deposition, complex reactions take place in the sediment, most of them fueled by the decay of organic matter, such as sulfate reduction, denitrification,... Because of fast diffusion rates of most cations in seawater, the presence of interstitial water makes exchange between overlying sedimentary layers a much easier process than if sediment deposition was dry. The book by Berner (1980) is entirely dedicated to these processes and only a short example is given here. Let us consider sulfate reduction by bacterial activity at the expense of decaying solid organic matter. Berner suggests the simplified equation

$$\mathrm{SO_4^{2-}} + 2\,\mathrm{CH_2O} \rightarrow \mathrm{H_2S} + 2\,\mathrm{HCO_3^-}$$

where CH$_2$O represents the organic compounds. Let us call $C^C$ the volume concentration of organic (reduced) carbon per volume of sediment (solid + interstitial liquid), supposed to be locked in the solid fraction: molecular diffusion is neglected ($\mathcal{D}^C = 0$) and organic carbon flux is entirely advective. The transport equation for organic carbon is

$$\frac{\partial[(1-\phi)C^C]}{\partial t} = \mathcal{D}^C\frac{\partial^2[(1-\phi)C^C]}{\partial z^2} - v\,\frac{\partial[(1-\phi)C^C]}{\partial z} + A^C = 0$$

where $\phi$ is the porosity. We further assume that carbon disappears with first-order kinetics, i.e.,

$$A^C = -k(1-\phi)C^C \qquad (8.8.1)$$

where k is the kinetic factor. After simplification, this equation is integrated into

$$C^C = C_0^C\,e^{-kz/v} \qquad (8.8.2)$$

with $C_0^C$ being the surface concentration (mol kg$^{-1}$ of solid sediment) of organic carbon. Reduction of one atom of sulfur in interstitial water requires oxidation of two atoms of organic carbon from the sediment and care must be taken that conservation is written between numbers of atoms. Sulfate is destroyed with first-order kinetics
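Neglecting the movement of water relative to the surrounding sediment, we write the steady-state transport equation in one dimension with burial, i.e., in a medium moving downwards with the burial velocity v

$$\frac{\partial C^i}{\partial t} = \mathcal{D}^i\frac{\partial^2 C^i}{\partial z^2} - v\,\frac{\partial C^i}{\partial z} + A^i = 0 \qquad (8.8.3)$$

In this equation, $C^i$ is the concentration of element i in pore water at depth z below the seafloor and $A^i$ is a reaction (sink and source) term. For reactions involving the oxidation of organic matter, $A^i$ can be evaluated independently. For constant porosity $\phi$, the sulfate transport equation becomes

$$\mathcal{D}^{SO_4}\frac{d^2C^{SO_4}}{dz^2} - v\,\frac{dC^{SO_4}}{dz} = \frac{k}{2}\,\frac{1-\phi}{\phi}\,\rho_{sol}\,C_0^C\,e^{-kz/v}$$

where the symbol for partial derivatives has been dropped, $\rho_{sol}$ is the density of the solid sediment, and the term on the right-hand side represents the sulfate sink. The resolution of the homogeneous differential equation leads to an exponential term in $vz/\mathcal{D}^{SO_4}$. This term being unbounded when $z \to \infty$, its coefficient in the solution is necessarily zero. The exponential on the right-hand side therefore suggests trying a solution in the form

$$C^{SO_4} = \alpha\,e^{-kz/v} + \beta$$

Inserting this expression into the transport equation gives

$$\alpha\left(\mathcal{D}^{SO_4}\frac{k^2}{v^2} + k\right)e^{-kz/v} = \frac{k}{2}\,\frac{1-\phi}{\phi}\,\rho_{sol}\,C_0^C\,e^{-kz/v}$$

which, as expected, cancels the exponential terms. Rearranging

$$\alpha = \frac{v^2}{v^2 + k\mathcal{D}^{SO_4}}\,\frac{1}{2}\,\frac{1-\phi}{\phi}\,\rho_{sol}\,C_0^C$$

$\beta$ is equal to the sulfate concentration $C_\infty^{SO_4}$ deep in the sedimentary pile. It can be determined by making concentration at z = 0 equal to seawater concentration

$$\alpha + \beta = C_{sw}^{SO_4}$$

which finally gives

$$C^{SO_4}(z) = C_{sw}^{SO_4} - \frac{v^2}{v^2 + k\mathcal{D}^{SO_4}}\,\frac{1-\phi}{2\phi}\,\rho_{sol}\,C_0^C\left(1 - e^{-kz/v}\right) \qquad (8.8.4)$$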
Table 8.5. Sulfate concentrations $C^{SO_4}$ (mmol l$^{-1}$) at depth z (cm) in pore waters from the Saanich Inlet, British Columbia (Murray et al., 1978).

z (cm)                  0      1      4      6      9      13     18     30
C^SO4 (mmol l^-1)      25.8   21.7   14.4    9.9    4.8    2.6    0.9    0.1

Figure 8.23 Sulfate concentrations in pore waters as a function of the depth below the water-sediment interface of the Saanich Inlet (Murray et al., 1978). The exponential curve supports the diffusional diagenetic model.
This model requires an excess of sulfate over reducible carbon. Concentrations may be measured in solutions squeezed from sediment cores, diffusion coefficients are known from standard chemical data tables and sedimentation rates determined from $^{14}$C, $^{210}$Pb, or $^{230}$Th dating. Therefore, this model finds its best use in the recovery of the kinetics of organic matter decay. A discussion of this and similar equations and numerical applications may be found in Berner (1980).

Murray et al. (1978) measured sulfate concentrations in pore water from the Saanich Inlet (British Columbia) and obtained the data listed in Table 8.5. Calculate the reaction rate constant and the content of organic carbon in surface sediment using $v = 3.3 \times 10^{-8}$ cm s$^{-1}$, $\mathcal{D}^{SO_4} = 2 \times 10^{-6}$ cm$^2$ s$^{-1}$, $\phi = 0.93$, $\rho_{sol} = 2.7$ kg l$^{-1}$. We assume that $C_\infty^{SO_4}$ is negligible. A plot of $\ln C^{SO_4}$ vs the depth z gives a fairly good straight-line (Figure 8.23) corresponding to

$$C^{SO_4} = 26.6\,e^{-0.184z}$$
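The rate constant can be calculated from the logarithmic slope, which is equal to $k/v$

$$k = 0.184\,v = 0.184 \times 3.3 \times 10^{-8} = 6.1 \times 10^{-9}\ \mathrm{s^{-1}}$$

The pre-exponential term includes the surface concentration of organic carbon $C_0^C$

$$\frac{v^2}{v^2 + k\mathcal{D}^{SO_4}}\,\frac{1}{2}\,\frac{1-\phi}{\phi}\,\rho_{sol}\,C_0^C = 26.6$$

and therefore

$$C_0^C = 26.6 \times 2 \times \frac{v^2 + k\mathcal{D}^{SO_4}}{v^2}\,\frac{\phi}{1-\phi}\,\frac{1}{\rho_{sol}} = 53.2 \times \left(1 + \frac{0.184 \times 200}{3.3}\right) \times \frac{0.93}{0.07} \times \frac{1}{2.7}$$

hence $C_0^C$ = 3200 mmol kg$^{-1}$ = 3.2 x 12 x 10$^{-3}$ g/g = 3.8 weight % organic C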
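The same calculation can be reproduced numerically. The Python sketch below fits a straight line to $\ln C^{SO_4}$ against depth for the eight data pairs of Table 8.5 and then applies the two relations used above. It assumes an unweighted least-squares fit over all the points, which is not necessarily how the line of Figure 8.23 was drawn, so the fitted constants come out close to, but not identical with, 26.6 and 0.184. The function names are illustrative.

```python
import math

# Table 8.5 (Murray et al., 1978): depth in cm, sulfate in mmol per liter
depth_cm = [0, 1, 4, 6, 9, 13, 18, 30]
sulfate_mM = [25.8, 21.7, 14.4, 9.9, 4.8, 2.6, 0.9, 0.1]

def fit_exponential(z, c):
    """Least-squares line through (z, ln c); returns (prefactor, decay constant)."""
    n = len(z)
    y = [math.log(ci) for ci in c]
    z_mean = sum(z) / n
    y_mean = sum(y) / n
    sxx = sum((zi - z_mean) ** 2 for zi in z)
    sxy = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y))
    slope = sxy / sxx
    return math.exp(y_mean - slope * z_mean), -slope

def surface_organic_carbon(prefactor, decay, v, D, phi, rho_sol):
    """Rate constant k (s^-1) and surface organic carbon (mmol per kg of solid)."""
    k = decay * v                      # logarithmic slope equals k / v
    c0 = 2.0 * prefactor * (1.0 + k * D / v**2) * phi / ((1.0 - phi) * rho_sol)
    return k, c0

prefactor, decay = fit_exponential(depth_cm, sulfate_mM)
k, c0 = surface_organic_carbon(prefactor, decay,
                               v=3.3e-8, D=2.0e-6, phi=0.93, rho_sol=2.7)
print(f"C = {prefactor:.1f} exp(-{decay:.3f} z)")
print(f"k = {k:.2e} s^-1, C0 = {c0:.0f} mmol/kg = {c0 * 12e-4:.1f} wt % C")
```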
Murray et al. (1978) found that this value is a factor of 4 larger than what is actually measured and suggest that methane upward diffusion accounts for the missing carbon.

8.8.2 The advection-diffusion model in the water column

Some easily adsorbable metals in the ocean are removed from the water column by falling particles with strong surface reactivity such as oxyhydroxides: this is the case of many transition elements, the rare-earth elements, thorium, ... For the sake of simplicity, first-order adsorption kinetics is commonly assumed. Likewise, dissolved radiocarbon is removed from sea water by radioactive decay: the physical removal process is different, but still $^{14}$C atoms decay with first-order kinetics. On the scale of the ocean, molecular diffusion is an inefficient transfer process. However, turbulent transfer in the water column is commonly described via the same phenomenological (i.e., formal) equation: at a given locality in the ocean, the turbulent or eddy diffusivity $\bar{\mathcal{D}}$ describes how fast eddies are transported. It also measures the efficiency of the transport down the concentration gradient in much the same way as the diffusion coefficient in Fick's law. It is larger by several orders of magnitude and, being associated with bulk material transport, is identical for all the elements. Craig (1969, 1974) proposes a one-dimensional model with first-order removal kinetics

$$\bar{\mathcal{D}}\,\frac{d^2C}{dz^2} + v\,\frac{dC}{dz} - kC = 0 \qquad (8.8.5)$$
where z is the depth below the ocean surface, v the upwelling velocity, and k a kinetic coefficient. The eddy diffusion coefficient is overlined in order to stress its representing a turbulent flow property. For a conservative species, the reaction term is zero. The characteristic equation of the differential equation (8.8.5) has two roots given by

$$r = \frac{-v \pm \sqrt{v^2 + 4k\bar{\mathcal{D}}}}{2\bar{\mathcal{D}}}$$

Defining the mixing length $l_m = \bar{\mathcal{D}}/v$ and the scavenging length $l_s = v/k$, equation (8.8.5) becomes

$$\frac{d}{dz}\left[l_m\frac{dC}{dz} + C\right] = \frac{C}{l_s}$$

where the term between brackets represents the ratio of dissolved flux to upwelling velocity (reduced flux) at depth z. Introducing the parameter $\varepsilon$ such as

$$\varepsilon = \sqrt{1 + \frac{4\,l_m}{l_s}}$$

the roots of the characteristic equation take the simple form

$$r = -\frac{1 \pm \varepsilon}{2\,l_m}$$

and the solution therefore is

$$C(z) = \alpha\exp\left(-\frac{(1-\varepsilon)\,z}{2\,l_m}\right) + \beta\exp\left(-\frac{(1+\varepsilon)\,z}{2\,l_m}\right) \qquad (8.8.6)$$
where $\alpha$ and $\beta$ are two constants to be determined from the boundary conditions. In order to make notation compact we introduce the hyperbolic cosine and sine functions defined as

$$\cosh x = \frac{e^{x} + e^{-x}}{2} \quad \text{and} \quad \sinh x = \frac{e^{x} - e^{-x}}{2}$$

These functions satisfy

$$\cosh 0 = 1, \quad \sinh 0 = 0, \quad (\cosh x)' = \sinh x \quad \text{and} \quad (\sinh x)' = \cosh x$$

After a little manipulation, we get the alternative form

$$C(z) = \exp\left(-\frac{z}{2\,l_m}\right)\left(a\cosh\frac{\varepsilon z}{2\,l_m} + b\sinh\frac{\varepsilon z}{2\,l_m}\right) \qquad (8.8.7)$$
where the constants to be determined from the boundary conditions become $a = \alpha + \beta$ and $b = \alpha - \beta$. Although alternative sets of conditions could be both physically meaningful and tractable, we assume that concentrations are known at the top (z = 0) and the bottom (z = Z) of the scavenged layer, giving

$$C(0) = a$$

and

$$b = \left[C(Z)\exp\frac{Z}{2\,l_m} - C(0)\cosh\frac{\varepsilon Z}{2\,l_m}\right]\Big/\sinh\frac{\varepsilon Z}{2\,l_m}$$

Inserting these values of a and b into the expression (8.8.7), we get

$$C(z) = \frac{C(0)\exp\left(-\dfrac{z}{2\,l_m}\right)\sinh\dfrac{\varepsilon(Z-z)}{2\,l_m} + C(Z)\exp\left(\dfrac{Z-z}{2\,l_m}\right)\sinh\dfrac{\varepsilon z}{2\,l_m}}{\sinh\dfrac{\varepsilon Z}{2\,l_m}} \qquad (8.8.8)$$

where use has been made of the identity

$$\sinh(u - v) = \sinh u\cosh v - \cosh u\sinh v$$

We deduce the general expression of the reduced flux of dissolved element at z
dC(z) 1 dz "
C(0) expj - -^-) cosh f^L-i* - C(Z) exp ^ - -icosh — (8.8.9)
sinh — 2L
The flux J of element carried downwards by the sinking particles (the 'rain') can be estimated by comparison with the flux of dissolved element. We write that, at steady-state, the sum of dissolved and particulate fluxes remains constant, i.e., for two depths zx and z 2 dC — dz
dz
-vC(z
where the — v term stems from the movement being upward, or in a reduced form
dz
dz
+ C(Z) = -
^ Draw the concentration and flux profiles of a species i with surface concentration of 2 and bottom concentration of 10 (arbitrary units). Assume that the mixing length can be obtained from the distribution of conservative quantities, usually salinity or potential temperature. Craig (1969) suggests a value of ~800m in the 4000m-deep Pacific.
Figure 8.24 The advection-diffusion model (Craig, 1974) in a water column of depth Z, mixing length $l_m$, and scavenging length $l_s$. Concentrations in arbitrary units [left, equation (8.8.8)] and fluxes, plotted as the ratio of total dissolved flux to advective flux at Z [right, equation (8.8.9)], in the water column for the $l_s/Z$ values labeled on the curves.
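A concentration profile of the kind shown in Figure 8.24 can be computed with a few lines of Python. The sketch below assumes that the model equation reads $l_m C'' + C' - C/l_s = 0$ with z counted positive downwards, and that the two-point solution has the form $C(z) = [C(0)e^{-z/2l_m}\sinh(\varepsilon(Z-z)/2l_m) + C(Z)e^{(Z-z)/2l_m}\sinh(\varepsilon z/2l_m)]/\sinh(\varepsilon Z/2l_m)$; both are reconstructions, and the sign convention should be checked against the original equations before use. The function names and the finite-difference check are illustrative additions.

```python
import math

def concentration(z, Z, l_m, l_s, c_top, c_bottom):
    """Concentration at depth z for known values at z = 0 and z = Z."""
    eps = math.sqrt(1.0 + 4.0 * l_m / l_s)
    u = 1.0 / (2.0 * l_m)
    top = c_top * math.exp(-u * z) * math.sinh(eps * u * (Z - z))
    bottom = c_bottom * math.exp(u * (Z - z)) * math.sinh(eps * u * z)
    return (top + bottom) / math.sinh(eps * u * Z)

def ode_residual(z, Z, l_m, l_s, c_top, c_bottom, h=1e-4):
    """l_m C'' + C' - C/l_s by central differences (should be close to zero)."""
    c0 = concentration(z, Z, l_m, l_s, c_top, c_bottom)
    cp = concentration(z + h, Z, l_m, l_s, c_top, c_bottom)
    cm = concentration(z - h, Z, l_m, l_s, c_top, c_bottom)
    return l_m * (cp - 2.0 * c0 + cm) / h**2 + (cp - cm) / (2.0 * h) - c0 / l_s

# Depths scaled to Z = 1; l_m/Z = 0.2 as in the exercise, l_s/Z = 0.5
Z, l_m, l_s = 1.0, 0.2, 0.5
for i in range(6):
    z = i * Z / 5
    print(f"z/Z = {z:.1f}  C = {concentration(z, Z, l_m, l_s, 2.0, 10.0):.3f}")
```

Evaluating the residual at a few interior depths is a cheap way of confirming that a transcribed closed-form solution really satisfies the differential equation it is supposed to solve.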
The data impose $l_m/Z = 0.2$. The concentration profiles have been drawn for $l_s/Z$ = 0.1, 0.5, and 10 (Figure 8.24). Also drawn are the fluxes of dissolved species i for the same values of the parameters, which makes it possible to estimate the flux carried by sinking particles. For instance, a quick graphic examination reveals that, for $l_s/Z = 0.5$, the flux of species i reaching the bottom Z with the rain of particles is approximately -0.1 - (-0.7) or 60 percent of the dissolved flux advected at the base. An example of inverse calculation from dissolved Ni concentrations in the Eastern Pacific measured by Bruland (1980) is discussed in Chapter 5. Particularly important in inverting the data is to make sure that $\varepsilon$ is larger than unity, since the rate constant is a positive parameter.

8.9 Simultaneous precipitation and diffusion
When diffusion interacts with crystal growth and nucleation, phenomena of periodic precipitation may appear that have been known historically as Liesegang rings. Bands and concentric rings with alternating mineral abundances are not uncommon in all sorts of geological environments: orbicular structures in plutonic rocks or striated chemical precipitates in sediments give dramatic examples of pattern formation or self-organization. Some mechanisms of pattern formation require competing chemical species with contrasting diffusivity and chemical reactivity. Continuous growth at
one site is rapidly starved by the inability of one of the species, which is part of the precipitate, to move over large distances. In contrast, periodic precipitation requires that the energy for driving the slowly diffusing species up its own gradient over short distances is compensated by the energy released in the phase change. If this condition can be met, precipitation proceeds by matching the long diffusion distance of the fast species with a succession of bands roughly as wide as the diffusion distance of the slow species. Other theories do not explicitly assume the presence of several components and rely on autocatalytic reactions (Flicker and Ross, 1974; Noyes and Field, 1974) or capillarity phenomena (Feinn et al., 1978; Lovett et al., 1978). For decades, many theoretical models of periodic precipitation have been put forward (Wagner, 1950; Prager, 1956; Kahlweit, 1965). Concentration gradients are no longer deemed to be necessary to generate heterogeneities (Flicker and Ross, 1974) while widely accepted theories emphasize the role of capillarity in retarding crystal growth of small particles (Feinn et al., 1978; Lovett et al., 1978; Ortoleva, 1984; Kirkaldy and Young, 1987). The following simple derivation, which shows how unstable behavior may be initiated in a precipitating two-component system, is adapted from Kirkaldy and Young (1987). Let us assume a solid infinite matrix with $C^i$ and $C^j$ being the concentrations of two conservative species i and j. i and j may react to form a compound, e.g., a local precipitate, an oxide,... with fixed concentrations $C_0^i$ and $C_0^j$. $\rho_p$ and $\rho_m$, the densities of the compound and matrix, respectively, are assumed to be constant. The compound is finely dispersed, and we call p its volume fraction. Mass balance of element i requires
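$$\frac{\partial}{\partial t}\iiint_V\left[(1-p)\rho_m C^i + p\,\rho_p C_0^i\right]dV = -\iiint_V \mathrm{div}\,\mathbf{J}^i(x,y,z)\,dV \qquad (8.9.1)$$

Let us assume no advective flux and the diffusive flux to be proportional to the fraction 1 - p of matrix material

$$\mathbf{J}^i(x,y,z) = -(1-p)\,\rho_m\mathcal{D}^i\,\mathrm{grad}\,C^i$$

In a local form, we get the conservation equation

$$(1-p)\rho_m\frac{\partial C^i}{\partial t} + \left(\rho_p C_0^i - \rho_m C^i\right)\frac{\partial p}{\partial t} = (1-p)\rho_m\mathcal{D}^i\nabla^2 C^i - \rho_m\mathcal{D}^i\,\mathrm{grad}\,C^i\cdot\mathrm{grad}\,p$$

or

$$\frac{\partial C^i}{\partial t} = \mathcal{D}^i\nabla^2 C^i - \frac{\rho_p C_0^i - \rho_m C^i}{\rho_m}\,\frac{1}{1-p}\,\frac{\partial p}{\partial t} - \frac{\mathcal{D}^i}{1-p}\,\mathrm{grad}\,C^i\cdot\mathrm{grad}\,p \qquad (8.9.2)$$

At the onset of the precipitation, the product of gradient terms on the right-hand side can be neglected and we obtain

$$\frac{\partial C^i}{\partial t} = \mathcal{D}^i\frac{\partial^2 C^i}{\partial x^2} - \frac{\rho_p C_0^i - \rho_m C^i}{\rho_m}\,\frac{\partial p}{\partial t}$$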
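The rate of precipitation depends on the rate of change in concentrations

$$\frac{\partial p}{\partial t} = \frac{\partial p}{\partial C^i}\,\frac{\partial C^i}{\partial t} + \frac{\partial p}{\partial C^j}\,\frac{\partial C^j}{\partial t}$$

The conservation equation is rearranged into

$$\frac{\partial C^i}{\partial t}\left(1 + \frac{\rho_p C_0^i - \rho_m C^i}{\rho_m}\,\frac{\partial p}{\partial C^i}\right) = \mathcal{D}^i\frac{\partial^2 C^i}{\partial x^2} - \frac{\rho_p C_0^i - \rho_m C^i}{\rho_m}\,\frac{\partial p}{\partial C^j}\,\frac{\partial C^j}{\partial t} \qquad (8.9.3)$$

We define a positive variable $P^i$ describing how changes in matrix concentrations depend on the precipitation increments as

$$P^i = -\frac{\left(\rho_p C_0^i - \rho_m C^i\right)/\rho_m}{\partial C^i/\partial p} \qquad (8.9.4)$$

$P^i$ is positive since precipitation decreases the amount of species i in the matrix. $P^j$ is defined in a similar way. Two elements i and j give the system of conservation equations

$$\frac{\partial C^i}{\partial t} = \frac{\mathcal{D}^i}{1-P^i}\,\frac{\partial^2 C^i}{\partial x^2} + \frac{P^j}{1-P^i}\,\frac{C_0^i}{C_0^j}\,\frac{\partial C^j}{\partial t}$$

$$\frac{\partial C^j}{\partial t} = \frac{\mathcal{D}^j}{1-P^j}\,\frac{\partial^2 C^j}{\partial x^2} + \frac{P^i}{1-P^j}\,\frac{C_0^j}{C_0^i}\,\frac{\partial C^i}{\partial t}$$

which we combine as

$$\frac{\partial C^i}{\partial t}\left[1 - \frac{P^iP^j}{(1-P^i)(1-P^j)}\right] = \frac{\mathcal{D}^i}{1-P^i}\,\frac{\partial^2 C^i}{\partial x^2} + \frac{P^j\mathcal{D}^j}{(1-P^i)(1-P^j)}\,\frac{C_0^i}{C_0^j}\,\frac{\partial^2 C^j}{\partial x^2}$$

We finally get the system of equations

$$\frac{\partial C^i}{\partial t} = \frac{(1-P^j)\,\mathcal{D}^i}{1-P^i-P^j}\,\frac{\partial^2 C^i}{\partial x^2} + \frac{P^j\mathcal{D}^j}{1-P^i-P^j}\,\frac{C_0^i}{C_0^j}\,\frac{\partial^2 C^j}{\partial x^2}$$

$$\frac{\partial C^j}{\partial t} = \frac{P^i\mathcal{D}^i}{1-P^i-P^j}\,\frac{C_0^j}{C_0^i}\,\frac{\partial^2 C^i}{\partial x^2} + \frac{(1-P^i)\,\mathcal{D}^j}{1-P^i-P^j}\,\frac{\partial^2 C^j}{\partial x^2} \qquad (8.9.5)$$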
The theory of linear differential equations indicates that long-term evolution depends on the boundary conditions and the determinant of the coefficients preceding the second spatial derivatives (which can actually be considered as effective diffusion coefficients). Such a system is likely to be highly non-linear. One extreme case, however, is particularly interesting in demonstrating how periodic patterns of precipitation can be arrived at. We assume that (i) species i diffuses very fast and $\partial C^i/\partial p$ is large so that $P^i$ is small and (ii) that species j is much less mobile and $P^j$ is large. The Thomson-Freundlich equation relates the solubility of a particle to its curvature radius (e.g., Swalin, 1962). Using this equation, Kirkaldy and Young (1987) show how periodic precipitation may result from the capillary resistance of the matrix to grow small precipitates. Formally, the conditions read

$$\mathcal{D}^i \gg \mathcal{D}^j \quad \text{and} \quad P^i \ll 1 \ll P^j$$

Under these simplifying assumptions, the system of diffusion equations becomes

$$\frac{\partial C^i}{\partial t} = \mathcal{D}^i\,\frac{\partial^2 C^i}{\partial x^2}, \qquad \frac{\partial C^j}{\partial t} = -\frac{\mathcal{D}^j}{P^j}\,\frac{\partial^2 C^j}{\partial x^2} \qquad (8.9.6)$$

The mobile diluted species i has a normal behavior relative to diffusion, whereas the sluggish major species j undergoes uphill diffusion. Contrary to what would happen for the mobile species i which diffuses down its own concentration gradient, any oscillatory component of a perturbation $\delta C^j = A\sin(2\pi x/\lambda)$ would increase exponentially with time as

$$\delta C^j(x,t) = A\sin\left(2\pi\frac{x}{\lambda}\right)\exp\left(\frac{4\pi^2\mathcal{D}^j\,t}{P^j\lambda^2}\right)$$

where $P^j$ is assumed to be constant. Spinodal-like behavior is to be expected in such a system. The prediction of band spacing in this case depends on the functional dependence of P's on concentrations. Assuming the existence of initial chemical gradients, Kirkaldy and Young (1987) propose a scaling distance much reminiscent of a mean-squared diffusion distance encountered in standard downhill diffusion
(8.9.7)
Given the conditions on Pl and P\ the relation dp ^
Pj
dCj
shows that precipitation oscillations have the same wavelength as concentration waves, which provides a semi-quantitative framework for Liesegang structures. Derivation of band spacing in autocatalytic and capillarity-based models is entirely different and the interested reader should refer to the literature referenced above.
Appendix 8A: The error function
• The error function erf u is defined as

$$\mathrm{erf}\,u = \frac{2}{\sqrt{\pi}}\int_0^u e^{-x^2}\,dx$$

therefore erf 0 = 0, erf ∞ = 1, and erf(-u) = -erf u

• The error function complement erfc u is defined as

$$\mathrm{erfc}\,u = \frac{2}{\sqrt{\pi}}\int_u^{+\infty} e^{-x^2}\,dx = \frac{2}{\sqrt{\pi}}\left(\int_0^{+\infty} e^{-x^2}\,dx - \int_0^u e^{-x^2}\,dx\right) = 1 - \mathrm{erf}\,u$$

with

erfc 0 = 1, erfc ∞ = 0, erfc(-u) = 1 - erf(-u) = 1 + erf u = 2 - erfc u

The functions erf and erfc are depicted in Figure 8A.1. Other important properties of these functions are $u\,\mathrm{erfc}\,u \to 0$ as $u \to \infty$
derfw_
-as u->oo
(8A.2) (8A.3)
2
du ~~ff d erfc —
d erf—
di
di
lI--A__c-»*
• The integral of the error function complement ierfc is defined as

$$\mathrm{ierfc}\,u = \int_u^{\infty}\mathrm{erfc}\,x\,dx \qquad (8A.5)$$
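Figure 8A.1 Graph of the functions erf u and erfc u.

Attention must be paid to the sign in deriving the function ierfc, since using Leibniz's rule

$$\frac{d\,\mathrm{ierfc}\,u}{du} = \frac{d}{du}\int_u^{\infty}\mathrm{erfc}\,x\,dx = \int_u^{\infty}\frac{\partial\,\mathrm{erfc}\,x}{\partial u}\,dx + \mathrm{erfc}\,\infty \times 0 - \mathrm{erfc}\,u \times \frac{du}{du} = -\mathrm{erfc}\,u$$

Using integration by parts and changing the variable $x^2$ into v, we also get the useful result

$$\mathrm{ierfc}\,u = \left[x\,\mathrm{erfc}\,x\right]_u^{\infty} + \frac{1}{\sqrt{\pi}}\int_u^{\infty} e^{-x^2}\,2x\,dx = -u\,\mathrm{erfc}\,u + \frac{1}{\sqrt{\pi}}\int_{u^2}^{\infty} e^{-v}\,dv$$

or

$$\mathrm{ierfc}\,u = \frac{1}{\sqrt{\pi}}\,e^{-u^2} - u\,\mathrm{erfc}\,u \qquad (8A.6)$$

As a particular result, we get for u = 0

$$\int_0^{\infty}\mathrm{erfc}\,x\,dx = \mathrm{ierfc}\,0 = \frac{1}{\sqrt{\pi}}$$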
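The closed form of ierfc is easy to check against its definition as an integral. The Python sketch below compares $e^{-u^2}/\sqrt{\pi} - u\,\mathrm{erfc}\,u$ with a composite Simpson integration of erfc, using only the standard `math` module. The truncation of the upper limit at u + 10 and the number of intervals are arbitrary choices made for this illustration.

```python
import math

def ierfc(u):
    """Closed form: exp(-u^2)/sqrt(pi) - u erfc(u)."""
    return math.exp(-u * u) / math.sqrt(math.pi) - u * math.erfc(u)

def ierfc_numeric(u, span=10.0, n=4000):
    """Composite Simpson integration of erfc(x) from u to u + span (n even)."""
    h = span / n
    total = math.erfc(u) + math.erfc(u + span)
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0
        total += weight * math.erfc(u + i * h)
    return total * h / 3.0

for u in (0.0, 0.5, 1.0, 2.0):
    print(f"u = {u:3.1f}  closed form = {ierfc(u):.8f}  "
          f"integral = {ierfc_numeric(u):.8f}")
```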
Also, using equation (8A.4),
= —pierfc-—
Wterfc—-
and, therefore
(8A.7)
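The error function relates to the normal (Gaussian) probability density function (pdf) $f_X(x)$, which is given by

$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$$

where $\mu$ and $\sigma^2$ are the mean and variance of the normal pdf. This form suggests that the cumulative distribution function F(x), which measures the probability that the variable X is ≤ x, can be expressed in terms of the error function. By definition

$$F(x) = \int_{-\infty}^{x} f_X(X)\,dX$$

Making the variable change

$$u = \frac{X-\mu}{\sigma\sqrt{2}}$$

the expression of the distribution function becomes

$$F(x) = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{x}\exp\left[-\frac{(X-\mu)^2}{2\sigma^2}\right]dX = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{(x-\mu)/\sigma\sqrt{2}} e^{-u^2}\,du$$

The integrals are split at zero in order to make the erf form apparent

$$F(x) = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{0} e^{-u^2}\,du + \frac{1}{\sqrt{\pi}}\int_{0}^{(x-\mu)/\sigma\sqrt{2}} e^{-u^2}\,du$$

and finally

$$F(x) = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right]$$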
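The relation between the normal distribution function and erf, $F(x) = \frac{1}{2}[1 + \mathrm{erf}((x-\mu)/\sigma\sqrt{2})]$, translates directly into code. The short Python sketch below uses `math.erf` from the standard library; the function name `normal_cdf` is an illustrative choice.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Probability that a normal variable of mean mu and variance sigma^2 is <= x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Probability of falling within one and two standard deviations of the mean
within_1 = normal_cdf(1.0) - normal_cdf(-1.0)
within_2 = normal_cdf(2.0) - normal_cdf(-2.0)
print(f"within 1 sigma: {within_1:.4f}, within 2 sigma: {within_2:.4f}")
```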
Appendix 8B: The theta functions
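Theta functions are special functions related to Jacobian elliptic functions (Morse and Feshbach, 1953; Widder, 1975) with special properties that make them extremely useful to calculate solutions to diffusion problems for small values of time. Three of the four theta functions will be used in the present context

$$\theta_2(u, i\pi\tau) = 2\sum_{n=0}^{\infty}\cos[(2n+1)u]\,\exp\left[-\left(n+\tfrac{1}{2}\right)^2\pi^2\tau\right]$$

$$\theta_3(u, i\pi\tau) = 1 + 2\sum_{n=1}^{\infty}\cos 2nu\,\exp(-n^2\pi^2\tau) \qquad (8B.1)$$

$$\theta_4(u, i\pi\tau) = 1 + 2\sum_{n=1}^{\infty}(-1)^n\cos 2nu\,\exp(-n^2\pi^2\tau)$$

where i is the square-root of -1. For k = 1 to 4, these functions satisfy the diffusion equation

$$\frac{\partial\theta_k}{\partial\tau} = \frac{\pi^2}{4}\,\frac{\partial^2\theta_k}{\partial u^2}$$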
The essential transformation properties are (Morse and Feshbach, 1953)
(8B.2) 17IT 7TT
lTTT 7TT
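Let us now calculate some solutions for short values of the time. For a sphere with homogeneous initial concentration and zero surface concentration, we replace $\mathcal{D}t/a^2$ by $\tau$. From equation (8.6.11), the fraction $F_{sph}$ left at $\tau$ is

$$F_{sph} = \frac{6}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{n^2}\,e^{-n^2\pi^2\tau}$$

Let us first observe that when $\tau$ tends to zero, $F_{sph}$ tends to 1 (no loss) and we get the result

$$\sum_{n=1}^{\infty}\frac{1}{n^2} = \frac{\pi^2}{6}$$

Then, we can write

$$\frac{1}{n^2\pi^2}\,e^{-n^2\pi^2\tau} = \frac{1}{n^2\pi^2} - \int_0^{\tau} e^{-n^2\pi^2u}\,du$$

hence

$$F_{sph} = 1 - 6\sum_{n=1}^{\infty}\int_0^{\tau} e^{-n^2\pi^2u}\,du$$

We can exchange the order of summation and integration

$$F_{sph} = 1 - 6\int_0^{\tau}\sum_{n=1}^{\infty} e^{-n^2\pi^2u}\,du$$

Introducing the function $\theta_3$ and using the second transformation rule in equation (8B.2) gives

$$F_{sph} = 1 - 3\int_0^{\tau}\left[\theta_3(0, i\pi u) - 1\right]du = 1 - 3\int_0^{\tau}\left[\frac{1}{\sqrt{\pi u}}\,\theta_3\left(0, \frac{i}{\pi u}\right) - 1\right]du$$

and reverting to the infinite sum, we get

$$F_{sph} = 1 - 3\int_0^{\tau}\left[\frac{1}{\sqrt{\pi u}}\left(1 + 2\sum_{n=1}^{\infty} e^{-n^2/u}\right) - 1\right]du$$

Expanding the last expression, we get

$$F_{sph} = 1 + 3\tau - 3\int_0^{\tau}\frac{du}{\sqrt{\pi u}} - 6\sum_{n=1}^{\infty}\int_0^{\tau}\frac{e^{-n^2/u}}{\sqrt{\pi u}}\,du$$

which can be simplified using equation (8A.7)

$$F_{sph} = 1 + 3\tau - 3\int_0^{\tau}\frac{du}{\sqrt{\pi u}} - 12\sum_{n=1}^{\infty}\int_0^{\tau} d\left(\sqrt{u}\,\mathrm{ierfc}\frac{n}{\sqrt{u}}\right)$$

Integrating each term separately

$$F_{sph} = 1 - \frac{6}{\sqrt{\pi}}\sqrt{\tau} + 3\tau - 12\sqrt{\tau}\sum_{n=1}^{\infty}\mathrm{ierfc}\frac{n}{\sqrt{\tau}}$$

which upon replacing $\tau$ by its value gives equation (8.6.8). The same method can be used in order to obtain a spatial distribution which converges more rapidly for small values of $\mathcal{D}t/a^2$ than equation (8.5.14). For a slab with homogeneous initial concentration and zero surface concentration, the dimensionless variable $\mathcal{D}t/X^2$ is replaced by $\tau$. The fraction left at $\tau$, given by equation (8.5.16), is transformed as in the case of the sphere. $\theta_2$ now appears in place of $\theta_3$ and is replaced by $\theta_4$ through the first of the equations (8B.2).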
Appendix 8C: Duhamel's principle
The basic principles are taken from Zwillinger (1989). Duhamel's principle enables solutions for surface conditions being functions of time to be calculated from solutions with permanent surface conditions. Although this principle is most easily derived through the use of Laplace transforms, more conventional demonstrations, not repeated here, can be found in Sneddon (1957) or Carslaw and Jaeger (1959). Given the diffusion equation I

$$\frac{\partial C(x,t)}{\partial t} = \mathcal{D}\,\frac{\partial^2 C(x,t)}{\partial x^2}$$

with the initial conditions and boundary conditions

$$C(x,0) = 0, \quad C(a,t) = f(t), \quad \text{and} \quad C(b,t) = g(t)$$

we choose instead to solve equation II

$$\frac{\partial C(x,t,\tau)}{\partial t} = \mathcal{D}\,\frac{\partial^2 C(x,t,\tau)}{\partial x^2}$$

with the initial condition unchanged and boundary conditions $C(a,t,\tau) = f(\tau)$ and $C(b,t,\tau) = g(\tau)$, in which $\tau$ is treated as a fixed parameter. The solution of equation I is then

$$C(x,t) = \frac{\partial}{\partial t}\int_0^t C(x, t-\tau, \tau)\,d\tau$$

An example can be found in the section dealing with the diffusion of radiogenic isotope out of a sphere. Duhamel's principle can be extended to cases of surface conditions being functions of both time and space variables and to variable source and sink terms as well (Zwillinger, 1989).
9 Trace elements in magmatic processes
9.1 Introduction
Trace elements are useful tracers of geochemical processes mostly because they are dilute: their behavior depends primarily on the trace element-matrix interaction (e.g., Rb-host feldspar, Sr-calcite) and very little on the trace-trace interaction (e.g., Rb-Rb, Sr-Sr). Consequently, the distribution of trace elements among natural phases largely obeys the linear Henry's law. The modeling of trace elements in various geological environments (magmas, hydrothermal fluids, seawater,...) relies on three different aspects

(a) The total mass of each element distributing among several subsystems such as phases (minerals, melts, fluids) or reservoirs (the 'Depleted Mantle', the 'Lower Crust', the 'Antarctic Bottom Water') that altogether form a closed-system must be preserved. This condition is true regardless of configurations, proportions, physical or chemical state of individual subsystems. This conservation, or mass balance, property is unrelated to the trace character of the element.

(b) The equilibrium distribution of a trace element i between two phases $\alpha$ and $\beta$, which is usually handled through the law of diluted solutions

$$C_\beta^i = K_{\beta/\alpha}^{i}(T,P)\,C_\alpha^i \qquad (9.1.1)$$

where $K_{\beta/\alpha}^{i}(T,P)$ is the temperature-pressure dependent Berthelot-Nernst distribution or partition coefficient. If the composition dependence relative to the major phase constituents is to be emphasized (McIntire, 1963), a major element j and a new partition coefficient $K_{\beta/\alpha}^{\prime\,i}(T,P)$ will be introduced such that

$$\frac{C_\beta^i}{C_\beta^j} = K_{\beta/\alpha}^{\prime\,i}(T,P)\,\frac{C_\alpha^i}{C_\alpha^j} \qquad (9.1.2)$$

This form of the partition coefficient, analogous to that used for Fe-Mg fractionation between olivine and melt (see Chapter 1), is necessary only for the rare cases where trace substitution affects $C_\alpha^j$ and $C_\beta^j$ substantially. A number of reviews (O'Nions and Powell, 1977; Michard, 1989) describe the various sorts of partition coefficients expressed either in mass-fractions, atom fractions, or normalized to a major element and their respective merits. If the discussion is restricted to a narrow range of chemical compositions (e.g., basaltic systems, Irving, 1978, Irving and Frey, 1984), enough experimental information exists on trace-element partitioning to resort to the wonderfully simple equation (9.1.1).

(c) Compatible elements are easily hosted by the structure of the crystallizing minerals ($K_{sol/liq}^{i} > 1$) while incompatible elements are rejected into the liquid ($K_{sol/liq}^{i} \ll 1$).
This chapter will emphasize modeling of the simplest processes which govern magma formation and evolution. Probably none of the natural processes can be fully described by one of these simple models. More likely, these simple processes combine in a quite complex arrangement to form the magmas and their solid products. The burden of the proof increases quickly with increasing model complexity, however attractive a detailed model may be. The extraction of unambiguous quantitative information from a given model demands a considerable amount of experimental, theoretical and computational work. Simple models should therefore be considered first: identifying precisely how and where simple models fail may be much more informative on the physical processes at work than inadequate implementation of a complex model with more independent parameters than can be effectively handled.
9.2 Batch-melting and crystallization
9.2.1 Introduction and forward problem

Batch partial melting will hereafter be understood as equilibrium melting, which is in contrast to fractional melting discussed in Section 9.3.3. The foundation of this model is remarkably simple and was first laid down by Schilling and Winchester (1967). A number of more or less complex modifications enabling useful information to be extracted from the data were later introduced by Gast (1968), Shaw (1970) and Albarede (1983). Bulk equilibrium crystallization of a liquid batch can be handled with equations identical to those for batch-melting. We consider a molten multi-mineral assemblage, referred to as the source, which is presumed to give rise to an erupting magma. $X_j$ is the mass-fraction of each phase j relative to the molten source (and not to the residue). Subscript j = 1 refers to the melt, j = 2,...,n to the n - 1 residual mineral phases. The sum of the $X_j$ over all the n phases is unity. If $C_{liq}^i$ is the concentration of the ith among m elements in the liquid, the concentration of element i in phase j will be $K_j^i C_{liq}^i$. Let $C_0^i$ be the concentration of i in the source prior to melting. Hence, mass balance requires

$$C_0^i = \sum_{j=1}^{n} X_j K_j^i\,C_{liq}^i \qquad (9.2.1)$$

in which a $K_j^i$ value of 1 has to be assumed for the melt (j = liq). Letting F be the molten fraction, the mass-fraction $f_j$ of phase j relative to the residue relates to $X_j$ through

$$f_j = \frac{X_j}{1-F} \qquad (j = 2, \ldots, n)$$

The sum of the $f_j$ over all the n - 1 residual mineral phases is unity. This leads to the equation known as the equilibrium, partial, or batch-melting equation

$$C_{liq}^i = \frac{C_0^i}{F + D_i(1-F)} \qquad (9.2.2)$$
Table 9.1. Mineral-liquid partition coefficients used for the forward modeling of batch-melting.

                          Ni     Cr     Yb     Rb
olivine-liquid             6      1     0.1     0
clinopyroxene-liquid       1      8     0.3     0
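As an illustration of how the batch-melting equation (9.2.2) is used with the coefficients of Table 9.1, the following Python sketch computes a bulk partition coefficient as the mass-fraction-weighted mean of the mineral-liquid coefficients and then the melt concentration. The source composition (2500 ppm Ni, 1500 ppm Cr, 0.2 ppm Yb, 0.01 ppm Rb), the residue (60 percent olivine, 40 percent clinopyroxene) and the 10 percent degree of melting are those of the peridotite example of this section; the function and variable names are illustrative.

```python
def bulk_partition_coefficient(residue, coefficients):
    """Mean of the mineral-liquid coefficients weighted by residue mass-fractions."""
    return sum(residue[mineral] * coefficients[mineral] for mineral in residue)

def batch_melt(c_source, D, F):
    """Batch-melting equation: concentration in the liquid at melt fraction F."""
    return c_source / (F + D * (1.0 - F))

residue = {"olivine": 0.6, "clinopyroxene": 0.4}
partition = {                     # Table 9.1
    "Ni": {"olivine": 6.0, "clinopyroxene": 1.0},
    "Cr": {"olivine": 1.0, "clinopyroxene": 8.0},
    "Yb": {"olivine": 0.1, "clinopyroxene": 0.3},
    "Rb": {"olivine": 0.0, "clinopyroxene": 0.0},
}
source_ppm = {"Ni": 2500.0, "Cr": 1500.0, "Yb": 0.2, "Rb": 0.01}
F = 0.10

melt_ppm = {}
for element, c0 in source_ppm.items():
    D = bulk_partition_coefficient(residue, partition[element])
    melt_ppm[element] = batch_melt(c0, D, F)
    print(f"{element}: D = {D:.2f}, melt = {melt_ppm[element]:.3g} ppm")
```

Compatible elements (Ni, Cr) end up depleted in the melt relative to the source, whereas a perfectly incompatible element such as Rb is enriched by the factor 1/F.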
In equation (9.2.2), Dt is the bulk solid-liquid partition coefficient D>= I
fjK/
(9.2.3)
i.e., the centroid of the mineral-liquid partition coefficients weighted by the corresponding mass-fraction of the minerals in the residue. 4? A peridotite contains 2500 ppm Ni, 1500ppm Cr, 0.2 ppm Yb, and 0.01 ppm Rb. Calculate the concentration of each element in the liquid produced by 10 percent partial melting when a residue containing 60 percent olivine and 40 percent clinopyroxene is left. Assume the partition coefficients given in Table 9.1. From equation (9.2.3), the bulk partition coefficients are DNi = 0.6x 6 + 0.4 x 1 = 4 DCr = 0.6x 1+0.4x8 = 3.8 DYb = 0.6 x 0.1 +0.4 x 0.3 = 0.18 Z)Rb = 0.6x 0 + 0.4x0 = 0 which gives the following melt concentrations from equation (9.2.2) CliqNi = 2500/(0.1 + 4 x 0.9) = 676 ppm CliqCr = 1500/(0.1 + 3.8 x 0.9) = 426 ppm CliqYb = 0.2/(0.1 + 0.18 x 0.9) = 0.763 ppm CliqRb = 0.01/(0.1 + 0 x 0.9) = 0.1 ppm This simple formalism may be applied to natural lavas provided the lava samples have not undergone extensive mineral fractionation. o 9.2.2 Inverse problem: the source composition is known The simplest inverse model consists in finding liquid and solid phase proportions assuming a melt and source composition. This case is depicted in Figure 9.1 and may be modeled quantitatively with no extra assumption. Equation (9.2.1) expresses the fact that, in the m-dimensional composition space, the source composition must be the centroid of melt and residual mineral compositions, each being weighted by the
Trace elements in magmatic processes
Figure 9.1 Inverse partial melting problem in the three-dimensional space of elements 1, 2, 3 when the source is known (axes: element 1, element 2, element 3; plotted points: the liquid and the residual minerals). Projection of the source onto the sample subspace provides the mass-fraction of each phase of the molten source. If one phase is at the origin (sterile phase), every representative point can be shifted by a constant vector.
mass-fraction of the corresponding phase. Once $C_{liq}^i$ and, hence, the composition of each residual phase through the assumption of mineral-liquid partition coefficients, are known, the sample subspace is completely defined by the set of all possible phases, present or not, in the residue. The $m \times n$ matrix $A$ of phase compositions is defined by its current element $a_{ij}$ such that
$$a_{ij} = K_j^i C_{liq}^i$$
and the $n$-vector $x$ and $m$-vector $y$ by their current element $X_j$ and $C_0^i$, respectively. Solving the matrix equation
$$y = Ax \qquad (9.2.4)$$
in the least-square sense amounts to projecting the source-composition vector $y$ onto the sample subspace. The closure equation
$$\sum_{j=1}^{n} x_j = \sum_{j=1}^{n} X_j = 1 \qquad (9.2.5)$$
makes the least-square solutions more complex because one has to resort to the Lagrange multiplier technique for the constraint of equation (9.2.5) to be exactly verified. This system of $m+1$ equations in $n$ unknowns may be solved for $m \geq n-1$. The solution (Albarede, 1983) has been given in its matrix form in Chapter 5. Defining
$x_0$ as the unconstrained solution, i.e.,
$$x_0 = (A^T A)^{-1} A^T y \qquad (9.2.6)$$
the general solution can be put into the easily tractable form
$$x = x_0 + \frac{(A^T A)^{-1} J}{J^T (A^T A)^{-1} J}\,(1 - J^T x_0) \qquad (9.2.7)$$
where $J$ is the $n$-column vector $(1, 1, \ldots, 1)^T$. The denominator of the last equation represents the sum of the terms of the matrix $(A^T A)^{-1}$. A negative value of a phase proportion $x_j$ indicates that the calculation should be restarted after discarding phase $j$ from the residual phases.
Invert the results of the forward partial melting example. From the concentration of each element in the liquid and in the source, we can retrieve the degree of melting and the residual mineralogy. We assume that the liquid contains 676 ppm Ni, 426 ppm Cr, 0.763 ppm Yb and 0.1 ppm Rb, whereas its source composition vector $y$ is (2500, 1500, 0.2, 0.01) in ppm. We will test the assumption that the residuum is composed of olivine and clinopyroxene with the partition coefficients given above. Phase abundances $x$ will be ordered as liquid, olivine and clinopyroxene. Let us compute, as an example, the element in the third row and second column of the matrix $A$
$$a_{32} = K_{ol}^{Yb}\, C_{liq}^{Yb} = 0.1 \times 0.763 = 0.0763$$
The whole matrix $A$ is built in a similar manner, which gives (rows Ni, Cr, Yb, Rb; columns liq, ol, cpx)
$$A = \begin{bmatrix} 1 \times 676 & 6 \times 676 & 1 \times 676 \\ 1 \times 426 & 1 \times 426 & 8 \times 426 \\ 1 \times 0.763 & 0.1 \times 0.763 & 0.3 \times 0.763 \\ 1 \times 0.1 & 0 \times 0.1 & 0 \times 0.1 \end{bmatrix} = \begin{bmatrix} 676 & 4056 & 676 \\ 426 & 426 & 3409 \\ 0.763 & 0.0763 & 0.229 \\ 0.1 & 0.0 & 0.0 \end{bmatrix}$$
hence
$$(A^T A)^{-1} = \begin{bmatrix} 1.854 & -0.2761 & -0.1972 \\ -0.2761 & 0.04112 & 0.02937 \\ -0.1972 & 0.02937 & 0.02098 \end{bmatrix}$$
and therefore
$$x_0 = (A^T A)^{-1} A^T y = \begin{bmatrix} 1.854 & -0.2761 & -0.1972 \\ -0.2761 & 0.04112 & 0.02937 \\ -0.1972 & 0.02937 & 0.02098 \end{bmatrix} \begin{bmatrix} 2328394 \\ 10744340 \\ 6802826 \end{bmatrix} = \begin{bmatrix} 0.10 \\ 0.54 \\ 0.36 \end{bmatrix}$$
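The forward and inverse batch-melting calculations can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's implementation: the helper names (`solve`, `batch_melt`, `invert_batch`) are hypothetical, the normal equations are solved by plain Gaussian elimination rather than by a library routine, and the second inversion uses a perturbed source vector simply to show the closure correction of equation (9.2.7) at work.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

# Table 9.1: (olivine-liquid, clinopyroxene-liquid) partition coefficients.
K = {"Ni": (6.0, 1.0), "Cr": (1.0, 8.0), "Yb": (0.1, 0.3), "Rb": (0.0, 0.0)}
ELEMENTS = ["Ni", "Cr", "Yb", "Rb"]

def batch_melt(c0, F, f):
    """Equations (9.2.2)-(9.2.3) for residue fractions f = (f_ol, f_cpx)."""
    out = []
    for el, c in zip(ELEMENTS, c0):
        D = sum(fj * kj for fj, kj in zip(f, K[el]))
        out.append(c / (F + D * (1.0 - F)))
    return out

def invert_batch(c_liq, y):
    """Return (x0, x): unconstrained (9.2.6) and closure-constrained (9.2.7) solutions."""
    # Columns of A: liquid (K = 1), olivine, clinopyroxene.
    A = [[c, K[el][0] * c, K[el][1] * c] for el, c in zip(ELEMENTS, c_liq)]
    n = 3
    ata = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    aty = [sum(row[i] * yi for row, yi in zip(A, y)) for i in range(n)]
    x0 = solve(ata, aty)
    g = solve(ata, [1.0] * n)        # (A^T A)^-1 J
    lam = (1.0 - sum(x0)) / sum(g)   # (1 - J^T x0) / (J^T (A^T A)^-1 J)
    x = [x0j + lam * gj for x0j, gj in zip(x0, g)]
    return x0, x

source = [2500.0, 1500.0, 0.2, 0.01]
melt = batch_melt(source, 0.10, (0.6, 0.4))
x0_exact, x_exact = invert_batch(melt, source)
# Perturbed source vector (Ni, Cr, Yb, Rb in ppm), used only as an illustration.
x0_pert, x_pert = invert_batch(melt, [2200.0, 1800.0, 0.15, 0.008])
print([round(c, 3) for c in melt])
print([round(v, 3) for v in x0_exact])
print([round(v, 3) for v in x_pert])
```

With error-free data the unconstrained solution already satisfies the closure condition, so the correction term vanishes; with perturbed data the correction redistributes the closure misfit among the phases. Because the Ni and Cr rows dominate the matrix, $A^T A$ is poorly conditioned, and a production code would probably prefer a QR- or SVD-based least-squares routine to the normal equations.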
This result reproduces the original melting conditions with 10 percent liquid while the residue contains 0.54/(0.54 + 0.36), i.e., 60 percent olivine and 0.36/(0.54 + 0.36), i.e., 40 percent clinopyroxene. Since we knew that the model was perfectly obeyed, we could expect that phase abundances sum up to unity.

Invert the results from the example above replacing the source concentration $y$ values with the data 'polluted' by errors Ni = 2200 ppm, Cr = 1800 ppm, Yb = 0.15 ppm, Rb = 0.008 ppm. Following the same steps, we obtain
$$x_0 = \begin{bmatrix} 1.854 & -0.2761 & -0.1972 \\ -0.2761 & 0.04112 & 0.02937 \\ -0.1972 & 0.02937 & 0.02098 \end{bmatrix} \begin{bmatrix} 2253543 \\ 9685966 \\ 7622854 \end{bmatrix} = \begin{bmatrix} 0.0108 \\ 0.4627 \\ 0.4688 \end{bmatrix}$$
$x_0$ does not sum up to unity any more. Indeed
$$J^T (A^T A)^{-1} J = 1.0280$$
We now calculate
$$J^T x_0 = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 0.0108 \\ 0.4627 \\ 0.4688 \end{bmatrix} = 0.942$$
and obtain the normalized solution
$$x = \begin{bmatrix} 0.0108 \\ 0.4627 \\ 0.4688 \end{bmatrix} + \begin{bmatrix} 1.3805/1.0280 \\ -0.2056/1.0280 \\ -0.1469/1.0280 \end{bmatrix} (1 - 0.942) = \begin{bmatrix} 0.09 \\ 0.45 \\ 0.46 \end{bmatrix}$$
i.e., with 9 percent degree of melting, and a residue made of 45/91 = 49 percent olivine plus 46/91 = 51 percent clinopyroxene.

A problem arises whenever a column $j$ of $A$ comprises only zeroes, i.e., for a sterile phase which admits none of the measured elements in its lattice (olivine in basaltic systems, quartz in granitic systems). If there is only one of these phases, the difficulty may be overcome by a simple trick consisting in shifting all the concentrations by a constant vector, which amounts to shifting the disturbing phase away from the origin (e.g., add 1 to all concentrations). If the sterile phases are multiple, like olivine, spinel and orthopyroxene for REE in basaltic systems, the matrix $A$ has multiple singularities, and the inversion fails because the sterile phase contributions cannot be disentangled. For similar reasons, it can be realized that stuffing too many lines representing very incompatible elements into the matrix $A$ results in either redundant or inconsistent information and, in turn, in a singular system even with
9.2.3 Inverse problem: when the source composition is unknown

This case is somewhat more complex but a pictorial feeling of the solution may be obtained quite easily (Albarede, 1983; Albarede and Tamagnan, 1988). We can think of $s$ undifferentiated lava samples originating within a common source such as several basalts produced from the same peridotite at different pressures (e.g., across the spinel-garnet transition as suggested for Mid-Ocean Ridge Basalts by Salters and Hart, 1989) and with different degrees of melting. As in Figure 9.1, each molten source may be represented by a compositional subspace of the complete m-element space. As illustrated in Figure 9.2 for three samples, three elements and two residual phases, finding the common source of a suite of cogenetic lavas therefore requires that the intersection of these sample subspaces be calculated. A unique intersection does not exist in the general case and a solution will be sought in the least-square sense.
Figure 9.2 General solution of the partial melting problem for a suite of cogenetic rocks when the source composition is unknown (M and m are two mineral phases; axes: element 2, element 3). Both mineral phases M and m accept some of the analyzed elements.
Because of the large number of unknowns, it has long been estimated that the partial melting model was heavily underdetermined. In fact this turned out to be true but for the wrong reasons. In most cases, sterile minerals exist, e.g., olivine and orthopyroxene, which make the problem formulation more obscure: there is usually no easily available element entering the lattice of these minerals except for some very compatible ones (Ni, Cr) that are far too affected by fractional crystallization for their concentration to give a reliable indication of the value in the pristine melt. For each sample, the point representing the composition of these sterile minerals falls at the origin (zero concentrations). The origin is therefore common to each sample subspace. If those sample subspaces have two points in common, they must have an
Figure 9.3 General solution of the partial melting problem for a suite of cogenetic rocks when the source composition is unknown. One phase M has a regular behavior, the olivine is sterile and does not contain any of the analyzed elements. The solution is the direction represented by the heavy segment joining the source and the sterile phase (e.g., olivine).
infinite number of common points. In the three-dimensional diagram of Figure 9.3, these common points would define a line segment. For the experienced number-cruncher used to programming the standard batch-melting equations for REE and other incompatible elements, this relation expresses the well-known fact that the degree of melting and the amount of olivine in the residue cannot be constrained independently. An alternative formulation of the inverse partial melting problem therefore may be stated in this way: given a set of samples produced from a homogeneous source including sterile phases, the source composition cannot be uniquely determined, but the direction vector going from the origin through the source composition can. This statement expresses in a geometric way the property that, even if absolute concentrations are unknown, the relative concentrations are fully determined. In terms of REE, the concentration level is unknown but the shape of the pattern (enriched, depleted, ...) may be quantitatively assessed. Therefore, we want to decide which direction, among all possible choices, is common to all sample subspaces, or, at least, which direction represents the best zone of the sample subspaces in a least-square sense. Since a direction can be completely described by its unit vector, we can restrict the solution set to the surface of the unit sphere centered at the origin. Let us call $\hat{y}$ the solution of unitary modulus and $\hat{y}_k$ its projection onto the $k$th sample subspace ($k = 1, \ldots, s$) represented by the matrix $A_k$. It is a simple matter to show that
$$\hat{y}_k = Q_k \hat{y}$$
where $Q_k$ is a symmetric $m \times m$ projector matrix such that
$$Q_k = A_k (A_k^T A_k)^{-1} A_k^T \qquad (9.2.8)$$
Finding the least-square solution reduces to minimizing the sum $S$ of squared deviations $\hat{y}_k - \hat{y}$ between the estimated source solution and its projection onto each sample subspace. Thus, finding the minimum of
$$S = \sum_{k=1}^{s} (\hat{y}_k - \hat{y})^T (\hat{y}_k - \hat{y}) \qquad (9.2.9)$$
is equivalent to finding a unitary estimate $\hat{y}$ which minimizes
$$S = \hat{y}^T M \hat{y}$$
where the symmetric $m \times m$ matrix $M$ is defined by
$$M = \sum_{k=1}^{s} (I_m - Q_k)^T (I_m - Q_k) \qquad (9.2.10)$$
Alternatively, using projector properties
$$M = \sum_{k=1}^{s} (I_m - Q_k)$$
or
$$M = s I_m - \sum_{k=1}^{s} Q_k \qquad (9.2.11)$$
This problem is a standard eigenvalue problem, related to what is known as the Rayleigh quotient (Strang, 1976). $S$ is minimum when $\hat{y}$ is equal to the eigenvector $v_m$ associated with the smallest eigenvalue of $M$. Once the composition is found, the modal parameters of melting (degree of melting, abundance of residual phases) may be determined for each sample. Although the constrained solution described in the analysis of the first case may be safely used, it is found (Albarede and Tamagnan, 1988) much simpler and more significant to avoid constraining the abundance of the sterile phases and to infer them from the difference between unity and the sum of the unconstrained abundances. This solution has very attractive stability properties:

(a) If the partition coefficients of one phase are changed by a constant factor, as, for instance, by doubling the $K_j^i$ of the clinopyroxene, neither the solution nor the modal parameters of melting are changed. It is well-known that REE partition coefficients may vary significantly as a function of temperature or melt composition even for a single mineral (e.g., Irving, 1978) but, usually, these variations are strongly correlated from one element to another. The solution to the partial melting problem therefore will be robust relative to uncertainties on the absolute partition coefficients.
Table 9.2. Melt and mineral fractions in the molten source assumed to calculate concentrations in 5 'melts' used as an example for inverse modeling of partial melting.

           lava 1   lava 2   lava 3   lava 4   lava 5
melt        0.02     0.03     0.04     0.05     0.10
min. 1      0.40     0.20     0.20     0.20     0.10
min. 2      0.20     0.20     0.30     0.30     0.30
Table 9.3. Synthetic example of batch-melting inverse modeling. Assumed source concentrations $C_0^i$ for four arbitrary elements (column 2), mineral 1-liquid and mineral 2-liquid partition coefficients (columns 3 and 4), residual solid-liquid bulk partition coefficients calculated from mineral abundances listed in Table 9.2. Concentration units are arbitrary.

             Assumed values                  D_i calculated from modal compositions
Element i    C_0^i   K_min.1^i   K_min.2^i   lava 1   lava 2   lava 3   lava 4   lava 5
el1           1.      0.00        0.50        0.10     0.10     0.15     0.15     0.15
el2           2.      0.10        0.20        0.08     0.06     0.08     0.08     0.07
el3           3.      0.20        0.10        0.10     0.06     0.07     0.07     0.05
el4           4.      0.50        0.00        0.20     0.10     0.10     0.10     0.05
(b) Fractional crystallization changes very little the relative abundance of very to moderately incompatible elements: for the elements which are not significantly fractionated by mineral separation (e.g., REE, Zr, Ba, Ta, ... in basalts) the solution is therefore nearly insensitive to the degree of fractionation or accumulation.

The best way to convince ourselves that this rather convoluted technique works well is to build a synthetic example that we invert in a second stage. We use four elements ($m = 4$, el1 to el4), five lavas ($s = 5$) for which we assume the melt fraction and residual mineral abundances listed in Table 9.2, and two non-sterile residual minerals (Min1 and Min2) whose partition coefficients are listed in Table 9.3. The assumed source composition is listed in Table 9.3 which also shows the assumed bulk solid-liquid partition coefficients for each lava. Since we know the source composition, partition coefficients and phase abundances in molten sources, we can calculate the synthetic melt and mineral concentrations using equation (9.2.2). The five $4 \times 3$ matrices $A_k$ can be built: the first column of Table 9.4 is made of the melt concentrations ('lavas'). Mineral concentrations in the next two columns are computed from melt concentrations using the appropriate mineral-liquid partition coefficients. High precision is needed to ensure accurate inversion. Now the five individual $4 \times 4$ projector matrices $Q_k$ are calculated from equation (9.2.8) and listed in Table 9.5. We form the $4 \times 4$ matrix $M$ through equation (9.2.11)
and the result is
$$M = \begin{bmatrix} 0.9350 & -1.6104 & 1.0450 & -0.2123 \\ -1.6104 & 2.7953 & -1.8301 & 0.3776 \\ 1.0450 & -1.8301 & 1.2133 & -0.2562 \\ -0.2123 & 0.3776 & -0.2562 & 0.0564 \end{bmatrix}$$
Using Matlab, the eigenvalues of this symmetric matrix have been found to be, in decreasing order, 4.9738, 0.0253, 0.0009, and 0. The corresponding eigenvector matrix $V$ is
$$V = \begin{bmatrix} -0.4314 & -0.5906 & -0.6571 & 0.1826 \\ 0.7496 & 0.1549 & -0.5299 & 0.3651 \\ -0.4916 & 0.6657 & -0.1233 & 0.5477 \\ 0.1018 & -0.4291 & 0.5217 & 0.7303 \end{bmatrix}$$
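The synthetic test can be reproduced with the sketch below, which rebuilds the matrices $A_k$ from Tables 9.2 and 9.3, forms the projectors of equation (9.2.8) and the matrix $M$ of equation (9.2.11), and then extracts the eigenvector of the smallest eigenvalue. Two points are assumptions made here for illustration rather than the text's procedure: the mineral fractions of Table 9.2 are used as residue fractions $f_j$ in equation (9.2.2), which is what reproduces the concentrations of Table 9.4, and the eigenvector is obtained by inverse iteration with a small shift instead of a library eigen-solver. All function names are hypothetical.

```python
from math import sqrt

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

C0 = [1.0, 2.0, 3.0, 4.0]          # source concentrations (Table 9.3)
K1 = [0.0, 0.1, 0.2, 0.5]          # mineral 1-liquid partition coefficients
K2 = [0.5, 0.2, 0.1, 0.0]          # mineral 2-liquid partition coefficients
LAVAS = [(0.02, 0.40, 0.20), (0.03, 0.20, 0.20), (0.04, 0.20, 0.30),
         (0.05, 0.20, 0.30), (0.10, 0.10, 0.30)]   # (F, min. 1, min. 2), Table 9.2

def sample_matrix(F, f1, f2):
    """4 x 3 matrix A_k with columns liquid, mineral 1, mineral 2 (Table 9.4)."""
    rows = []
    for c0, k1, k2 in zip(C0, K1, K2):
        D = f1 * k1 + f2 * k2
        liq = c0 / (F + D * (1.0 - F))
        rows.append([liq, k1 * liq, k2 * liq])
    return rows

def projector(A):
    """Q = A (A^T A)^-1 A^T, built one column at a time."""
    m, n = len(A), len(A[0])
    ata = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    Q = [[0.0] * m for _ in range(m)]
    for c in range(m):
        z = solve(ata, A[c])       # (A^T A)^-1 A^T e_c, since A^T e_c is row c of A
        for r in range(m):
            Q[r][c] = sum(A[r][j] * z[j] for j in range(n))
    return Q

m, s = len(C0), len(LAVAS)
A_list = [sample_matrix(*lava) for lava in LAVAS]
Q_list = [projector(A) for A in A_list]
M = [[(s if i == j else 0.0) - sum(Q[i][j] for Q in Q_list) for j in range(m)]
     for i in range(m)]

# Inverse iteration on M + shift*I converges to the eigenvector of the
# eigenvalue of M closest to -shift, i.e. the smallest one.
shift = 1.0e-4
Ms = [[M[i][j] + (shift if i == j else 0.0) for j in range(m)] for i in range(m)]
y = [1.0] * m
for _ in range(50):
    y = solve(Ms, y)
    norm = sqrt(sum(v * v for v in y))
    y = [v / norm for v in y]
ratios = [v / y[0] for v in y]
print([round(r, 4) for r in ratios])
```

The printed ratios are the relative source concentrations normalized to the first element. The shift keeps the system non-singular even though the smallest eigenvalue of $M$ is zero for error-free synthetic data.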
By dividing each eigenvector component by the smallest of them, we find that the components of the fourth eigenvector $[0.1826, 0.3651, 0.5477, 0.7303]^T$ associated with the smallest eigenvalue (here zero) are in proportion of (1, 2, 3, 4) which is precisely the source composition used to produce the synthetic data (Table 9.3). The capability of this formalism to invert the data to produce relative source concentrations is therefore established. It is left to the reader to show that correct source mineralogical compositions can be retrieved using the procedure outlined in Section 9.2.2.

9.2.4 Shaw's formulation

The batch-melting equation (9.2.2) can be applied to any combination of melt and residual phase proportions but the bulk partition coefficient $D_i$ does not stay an invariant parameter unless the $f_j$ values remain constant with $F$. A melting process keeping the $f_j$ values constant is called modal melting and is in general not representative of thermodynamic equilibrium. Shaw's (1970) main rationale for changing Schilling and Winchester's (1967) equation was to include the assumption of constant phase proportions during eutectic melting in the melting equation. Introducing the phase proportions prior to melting $X_j^0$, the partial melting equation (9.2.1) is recast into
$$C_0^i = C_{liq}^i \left[ F + \sum_j X_j^0 K_j^i - \sum_j (X_j^0 - X_j) K_j^i \right]$$
where the sums run over the mineral phases. Defining $D_i^0$ as the solid-liquid partition coefficient for $F = 0$, i.e.,
$$D_i^0 = \sum_j X_j^0 K_j^i \qquad (9.2.12)$$
and $P_i$ as the partition coefficient for the melt norm relative to source minerals
$$P_i = \sum_j \frac{X_j^0 - X_j}{F} K_j^i \qquad (9.2.13)$$
Table 9.4. The liquid concentrations in the five synthetic melts (column 2) calculated from the batch-melting equation (9.2.2) and the parameters of Tables 9.2 and 9.3 and concentrations in the minerals equilibrated with each melt (columns 3 and 4). The collection of these columns for the $k$th melt sample makes the matrix $A_k$.

             Concentrations
             liquid     min. 1     min. 2
A1:  el1      8.4746     0.0000     4.2373
     el2     20.3252     2.0325     4.0650
     el3     25.4237     5.0848     2.5424
     el4     18.5185     9.2593     0.0000
A2:  el1      7.8740     0.0000     3.9370
     el2     22.6757     2.2676     4.5351
     el3     34.0136     6.8027     3.4014
     el4     31.4961    15.7480     0.0000
A3:  el1      5.4348     0.0000     2.7174
     el2     17.1233     1.7123     3.4247
     el3     27.9851     5.5970     2.7985
     el4     29.4118    14.7059     0.0000
A4:  el1      5.1948     0.0000     2.5974
     el2     15.8730     1.5873     3.1746
     el3     25.7511     5.1502     2.5751
     el4     27.5862    13.7931     0.0000
A5:  el1      4.2553     0.0000     2.1277
     el2     12.2699     1.2270     2.4540
     el3     20.6897     4.1379     2.0690
     el4     27.5862    13.7931     0.0000
we get
$$D_i = \frac{D_i^0 - P_i F}{1 - F} \qquad (9.2.14)$$
and the batch-melting equation becomes
$$C_{liq}^i = \frac{C_0^i}{D_i^0 + F(1 - P_i)} \qquad (9.2.15)$$
$P_i$ is sometimes improperly but illustratively referred to as the partition coefficient of the melt. For eutectic melting, it is expected to remain approximately constant. From this equation expanded by Hertogen and Gijbels (1976) to complex melting
Table 9.5. The five projectors $Q_k$ calculated from Table 9.4 as $Q_k = A_k (A_k^T A_k)^{-1} A_k^T$ [equation (9.2.8)].

              el1       el2       el3       el4
Q1:  el1     0.8800    0.2502   -0.2000    0.0549
     el2     0.2502    0.4785    0.4169   -0.1145
     el3    -0.2000    0.4169    0.6667    0.0915
     el4     0.0549   -0.1145    0.0915    0.9749
Q2:  el1     0.8154    0.3205   -0.2137    0.0462
     el2     0.3205    0.4435    0.3710   -0.0801
     el3    -0.2137    0.3710    0.7527    0.0534
     el4     0.0462   -0.0801    0.0534    0.9885
Q3:  el1     0.7776    0.3530   -0.2160    0.0411
     el2     0.3530    0.4398    0.3428   -0.0652
     el3    -0.2160    0.3428    0.7903    0.0399
     el4     0.0411   -0.0652    0.0399    0.9924
Q4:  el1     0.7886    0.3459   -0.2132    0.0398
     el2     0.3459    0.4340    0.3489   -0.0651
     el3    -0.2132    0.3489    0.7849    0.0402
     el4     0.0398   -0.0651    0.0402    0.9925
Q5:  el1     0.8035    0.3408   -0.2021    0.0303
     el2     0.3408    0.4090    0.3505   -0.0526
     el3    -0.2021    0.3505    0.7922    0.0312
     el4     0.0303   -0.0526    0.0312    0.9953
paths, Treuil and Joron (1975) derived important plots used in identification and inversion techniques. Let us consider a perfectly incompatible element (or 'hygromagmaphile' in the terminology of these authors) which we label with the superscript H. Then
$$C_{liq}^H = \frac{C_0^H}{F}$$
Combining the two equations, they become
$$\frac{C_{liq}^H}{C_{liq}^i} = \frac{C_{liq}^H}{C_0^i} \left[ D_i^0 + F(1 - P_i) \right]$$
or
$$\frac{C_{liq}^H}{C_{liq}^i} = \frac{D_i^0}{C_0^i}\, C_{liq}^H + \frac{C_0^H}{C_0^i}\,(1 - P_i) \qquad (9.2.16)$$
If $P_i$ is constant, this is the equation of a straight line in the $C_{liq}^H/C_{liq}^i$ vs $C_{liq}^H$ diagram, which is the foundation of widely used plots like Th/Ta vs Th aimed at identifying partial melting processes. These plots and alignment parameters were used
by Minster and Allegre (1977) to invert partial melting equations quantitatively. Hofmann and Feigenson (1983) have adopted a slightly modified approach for inverting the trace-element data of the lavas from the Kohala volcano (Hawaii). Unfortunately, the reciprocal inference is not true: observation of alignments in the $C_{liq}^H/C_{liq}^i$ vs $C_{liq}^H$ diagram by no means implies eutectic melting (Albarede, 1983). From the definition (9.2.13) of the $P_i$'s, it is clear that whenever residual phase abundances $X_j$ (or their contribution to the melt $X_j^0 - X_j$) vary linearly with $F$, which also happens for peritectic melting, straight lines will also be obtained although their slope and intercept have a slightly more involved interpretation than that given by the eutectic melting equation. For that reason, Shaw's (1970) equation and the derived inversion method of Hofmann and Feigenson (1983) overconstrain unduly and unnecessarily the phase proportions in the molten source during the generation of the lava suite. The mass balance and equilibrium conditions, and those conditions only, are taken into account comprehensively in the approach of Albarede (1983) and Albarede and Tamagnan (1988) which produces more reliable results. Feigenson and Carr (1993) have recently modified the method of Hofmann and Feigenson (1983). The condition of constant $P_i$ is relieved through Monte-Carlo sampling of the $P_i$ space. In addition, the presence of the same element concentration (in our example, Th) in both the coordinates introduces a strong artificial correlation of no meaning in terms of process. Using a random number generator, e.g., in a spreadsheet, the reader may check that randomly and independently generated values of Ta and Th usually produce quite good correlations in Th/Ta vs Th diagrams. This topic is specifically dealt with in Section 4.2.

As an illustration, we will alter slightly the example calculated above to describe how Shaw's (1970) choice of variables is handled.
A peridotite made of 80 percent olivine and 20 percent clinopyroxene contains 2500 ppm Ni, 1500 ppm Cr, 0.2 ppm Yb, and 0.01 ppm Rb. Calculate the concentration of each element in the melt produced by 10 percent partial melting of this peridotite with a liquid norm of 40 percent olivine and 60 percent clinopyroxene. Assume the partition coefficients listed in Table 9.1 (p. 479). From equation (9.2.12), the bulk solid-liquid partition coefficients $D_i^0$ are
$D_{Ni}^0 = 0.8 \times 6 + 0.2 \times 1 = 5.0$
$D_{Cr}^0 = 0.8 \times 1 + 0.2 \times 8 = 2.4$
$D_{Yb}^0 = 0.8 \times 0.1 + 0.2 \times 0.3 = 0.14$
$D_{Rb}^0 = 0.8 \times 0 + 0.2 \times 0 = 0$
while, from equation (9.2.13), the 'liquid' partition coefficients $P_i$ are computed as
$P_{Ni} = 0.4 \times 6 + 0.6 \times 1 = 3.0$
$P_{Cr} = 0.4 \times 1 + 0.6 \times 8 = 5.2$
$P_{Yb} = 0.4 \times 0.1 + 0.6 \times 0.3 = 0.22$
$P_{Rb} = 0.4 \times 0 + 0.6 \times 0 = 0$
We finally calculate the melt concentrations from equation (9.2.15), as
$C_{liq}^{Ni} = 2500/[5 + 0.1 \times (1 - 3.0)] = 521$ ppm
$C_{liq}^{Cr} = 1500/[2.4 + 0.1 \times (1 - 5.2)] = 758$ ppm
$C_{liq}^{Yb} = 0.2/[0.14 + 0.1 \times (1 - 0.22)] = 0.92$ ppm
$C_{liq}^{Rb} = 0.01/[0 + 0.1 \times (1 - 0)] = 0.10$ ppm
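The arithmetic of this example can be checked with a short Python sketch of equation (9.2.15). The function name `shaw_batch_melt` and the argument names `x0` (source mineral proportions) and `p` (proportions of the minerals entering the melt) are hypothetical labels introduced here, not notation from the text.

```python
# Table 9.1: (olivine-liquid, clinopyroxene-liquid) partition coefficients.
K = {"Ni": (6.0, 1.0), "Cr": (1.0, 8.0), "Yb": (0.1, 0.3), "Rb": (0.0, 0.0)}

def shaw_batch_melt(c0, k, F, x0, p):
    """Melt concentration from equation (9.2.15): C0 / (D0 + F (1 - P))."""
    D0 = sum(xj * kj for xj, kj in zip(x0, k))   # equation (9.2.12)
    P = sum(pj * kj for pj, kj in zip(p, k))     # equation (9.2.13)
    return c0 / (D0 + F * (1.0 - P))

source = {"Ni": 2500.0, "Cr": 1500.0, "Yb": 0.2, "Rb": 0.01}
x0 = (0.8, 0.2)   # olivine, clinopyroxene in the source
p = (0.4, 0.6)    # olivine, clinopyroxene in the liquid norm
F = 0.10

melt = {el: shaw_batch_melt(c, K[el], F, x0, p) for el, c in source.items()}
for el, c in melt.items():
    print(el, round(c, 3))

# Modal melting (p equal to x0) should reduce to equation (9.2.2) with D = D0.
modal = shaw_batch_melt(2500.0, K["Ni"], F, x0, x0)
```

Setting the melting proportions equal to the source proportions is a convenient consistency check, since Shaw's equation must then collapse onto the modal batch-melting equation.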
9.3 Incremental processes

These processes, also referred to as distillation processes, have previously been discussed in Section 1.5, although with little emphasis on the properties of solutions for constant partition coefficients. Henry's law gives trace elements a definite advantage since the differential forms of mass balance equations can usually be integrated when partition coefficients are constant. Concentrations of trace elements in solid and liquid phases during magmatic processes can be described by relatively simple equations, thereby making application to geological examples a reasonably simple task.

9.3.1 Fractional crystallization: forward problem

This model applies whenever a phase is removed progressively from a homogeneous medium with chemical or isotopic fractionation. Simple cases involve the distribution of trace elements during the crystallization of minerals from cooling magmas, but the progressive boiling of hydrothermal solutions or the evaporation of lakes can be treated in the same manner. In this case, concentrations of the parent melt change with the fraction extracted and the removed mineral phases will be spatially zoned. Chemical or isotopic equilibrium between the phases at the time they separate is usually assumed (e.g., Sr is likely to be nearly twice as concentrated in the plagioclase as in the basaltic liquid this mineral crystallizes from) but is not really a prerequisite for having a distillation process, also referred to as a finite-reservoir process. Fractional crystallization with equilibrium at the solid-liquid interface (Figure 9.4) will be considered to set up the fundamental equations. Let us consider the behavior of the $i$th among $m$ trace elements upon partitioning between a homogeneous liquid (labeled liq) and the $n$ phases (labeled $j$) of the cumulate in a system of finite size. These phases are usually considered as mineral phases but liquid trapped in the cumulate can be handled as an additional phase with partition coefficients equal to unity (Greenland, 1970; Albarede, 1976). The solution has been worked out in Section 1.5 and given in its differential form as equation (1.5.3)
$$d \ln C_{liq}^i = (D_i - 1)\, d \ln F \qquad (9.3.1)$$
where $F$ is the fraction of residual liquid and $D_i$ the solid-liquid partition coefficient. No assumption on constant parameters has been made so far. Expressing the bulk solid-liquid partition coefficient as a function of the mineral-liquid partition coefficients $K_j^i$ and mineral fractions $f_j$ in the cumulate through equation (9.2.3), we get the alternative formulation
$$d \ln C_{liq}^i = \sum_{j=1}^{n} f_j (K_j^i - 1)\, d \ln F \qquad (9.3.2)$$
in which use has been made of the closure equation. For invariant cumulate mineralogy, these equations are integrated for constant $D_i$ into the familiar Rayleigh distillation equation of Neuman et al. (1954)
$$C_{liq}^i = C_0^i F^{D_i - 1} \qquad (9.3.3)$$
with $C_0^i$ as the value of $C_{liq}^i$ for parent magma or $F = 1$. The cases $D_i = 0.1$ and $D_i = 5$ are depicted in Figure 9.5. The instantaneous solid withdrawn from the magma has the concentration $C_{sol,inst}^i$ such that
$$C_{sol,inst}^i = D_i C_{liq}^i = D_i C_0^i F^{D_i - 1} \qquad (9.3.4)$$
while the bulk cumulate has a mean concentration $C_{sol}^i$ such that
$$C_0^i = F C_{liq}^i + (1 - F) C_{sol}^i$$
or
$$C_{sol}^i = \frac{C_0^i - F C_{liq}^i}{1 - F} \qquad (9.3.5)$$
An alternative expansion using equation (9.3.3) is
$$C_{sol}^i = C_0^i\, \frac{1 - F^{D_i}}{1 - F} \qquad (9.3.6)$$

Figure 9.4 Fractional crystallization model. The magma is well-stirred and in equilibrium with the last solid crystallized.
The Rayleigh equation (9.3.3) rests on the assumption of constant $D_i$'s. This problem was discussed by Allegre et al. (1977). The mineralogy of the cumulate during fractional crystallization varies nearly step-wise provided phase boundaries remain nearly linear
Figure 9.5 Evolution of the concentration with the fraction crystallized (from right to left) for the fractional crystallization model [Rayleigh equation (9.3.3), heavy line] and two models of magma chamber with periodic recharge, periodic eruption and continuous fractionation [equations (9.4.7) and (9.4.8)]. Horizontal axes: fraction of residual liquid, F, and fraction crystallized. Curves: fractional crystallization; periodic recharge with periodic eruption of differentiated magma, erupted fraction Y = 0.5; periodic recharge with periodic eruption of undifferentiated magma, erupted fraction Y = 0.5.
(compare with Presnall, 1969 for fractional melting), which suggests that bulk partition coefficients remain approximately constant between discontinuities along the liquid line of descent. As discussed in Section 1.5, the Rayleigh equation (9.3.3) shows the property of step-wise linear covariation for a pair of elements $i_1$ and $i_2$ in a $\ln C_{liq}^{i_2}$ vs $\ln C_{liq}^{i_1}$ diagram (Treuil and Joron, 1975; Allegre et al., 1977; Allegre and Minster, 1978). This property is valid for any pure phase which participates in the fractionation process: lavas are most commonly used as representing the liquid phase but Fourcade and Allegre (1981) used hornblende to trace the evolution of REE in fractionating granitic liquids. An important consequence of the power-law expressed by equation (9.3.3) concerns the most incompatible elements, namely those with very low partition coefficients: their concentration varies with the reciprocal of $F$, i.e., very slowly in the early and intermediate stages of the crystallization process. This point will be returned to in Section 9.5. A related problem is the evolution of a trace-element ratio with crystallization
$$\frac{C_{liq}^{i_1}}{C_{liq}^{i_2}} = \frac{C_0^{i_1}}{C_0^{i_2}}\, F^{D_{i_1} - D_{i_2}}$$
Table 9.6. The matrix of partition coefficients used in the fractional crystallization example.

                          Ni     Sr     Yb     Rb
olivine-liquid            15     0.0    0.05    0
clinopyroxene-liquid       1     0.1    0.35    0
plagioclase-liquid         0     2.0    0.25    0
Incompatible-element ratios (e.g., Th/La, Nb/Zr, Ce/Yb in basalts) are therefore expected to be very insensitive to mineral separation from the melt and, for differentiated lavas, can be used as a parameter characteristic of their parent liquid (see below). Langmuir (1989) has investigated the special case where part of the differentiated liquids produced in the boundary-layer is remixed with the inner part of the magma chamber.

A basalt contains 150 ppm Ni, 100 ppm Sr, 3 ppm Yb, and 10 ppm Rb. Calculate the concentration of each element in the residual liquid and in the average cumulate after removal of 20 percent of a cumulate containing 30 percent olivine, 20 percent clinopyroxene and 50 percent plagioclase. Assume the partition coefficients given in Table 9.6. The bulk partition coefficients are calculated from equation (9.2.3) as
$D_{Ni} = 0.3 \times 15 + 0.2 \times 1 + 0.5 \times 0 = 4.7$
$D_{Sr} = 0.3 \times 0 + 0.2 \times 0.1 + 0.5 \times 2 = 1.02$
$D_{Yb} = 0.3 \times 0.05 + 0.2 \times 0.35 + 0.5 \times 0.25 = 0.21$
$D_{Rb} = 0.3 \times 0 + 0.2 \times 0.0 + 0.5 \times 0 = 0$
which give the following residual liquid concentrations through equation (9.3.3)
$C_{liq}^{Ni} = 150(1 - 0.2)^{4.7 - 1} = 65.7$ ppm
$C_{liq}^{Sr} = 100(1 - 0.2)^{1.02 - 1} = 99.6$ ppm
$C_{liq}^{Yb} = 3(1 - 0.2)^{0.21 - 1} = 3.58$ ppm
$C_{liq}^{Rb} = 10(1 - 0.2)^{0 - 1} = 12.5$ ppm
Using equation (9.3.5), the average cumulate concentrations are computed as
$C_{sol}^{Ni} = 150[1 - (1 - 0.2)^{4.7}]/[1 - (1 - 0.2)] = 487$ ppm
$C_{sol}^{Sr} = 100[1 - (1 - 0.2)^{1.02}]/[1 - (1 - 0.2)] = 102$ ppm
$C_{sol}^{Yb} = 3[1 - (1 - 0.2)^{0.21}]/[1 - (1 - 0.2)] = 0.69$ ppm
$C_{sol}^{Rb} = 10[1 - (1 - 0.2)^{0}]/[1 - (1 - 0.2)] = 0$ ppm
Changes in concentrations induced by fractional crystallization are much more visible for compatible than for incompatible elements.
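The forward Rayleigh calculation lends itself to a compact check. The sketch below is a minimal illustration of equations (9.2.3), (9.3.3) and (9.3.6); the function name `rayleigh` and the dictionary layout are choices made here for the example, and `F` denotes the fraction of residual liquid (0.8 after removal of 20 percent cumulate).

```python
# Table 9.6: (olivine, clinopyroxene, plagioclase)-liquid partition coefficients.
K = {"Ni": (15.0, 1.0, 0.0), "Sr": (0.0, 0.1, 2.0),
     "Yb": (0.05, 0.35, 0.25), "Rb": (0.0, 0.0, 0.0)}

def rayleigh(c0, D, F):
    """Residual liquid (9.3.3) and mean cumulate (9.3.6) for residual liquid fraction F."""
    liquid = c0 * F ** (D - 1.0)
    cumulate = c0 * (1.0 - F ** D) / (1.0 - F)
    return liquid, cumulate

parent = {"Ni": 150.0, "Sr": 100.0, "Yb": 3.0, "Rb": 10.0}
modes = (0.3, 0.2, 0.5)   # olivine, clinopyroxene, plagioclase in the cumulate
F = 1.0 - 0.2

results = {}
for el, c0 in parent.items():
    D = sum(f * k for f, k in zip(modes, K[el]))   # equation (9.2.3)
    results[el] = rayleigh(c0, D, F)
    print(el, round(D, 2), [round(v, 2) for v in results[el]])
```

A useful sanity check on any such calculation is the mass balance of equation (9.3.5): the residual liquid and the mean cumulate, weighted by $F$ and $1 - F$, must add up to the parent concentration.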
9.3.2 Fractional crystallization: inverse problem

Given the composition of the parent and residual magmas, and a mineral assemblage, the fraction of melt crystallized and the modal composition of the cumulate can be uniquely determined. An alternate form of the Rayleigh equation is more useful if derivation of a magma $\beta$ from a parent magma $\alpha$ through fractional crystallization is to be tested (Albarede and Provost, 1977). Assuming constant $D_i$ and upon integration of equation (9.3.2), we get
$$\ln \frac{C_\beta^i}{C_\alpha^i} = \sum_{j=1}^{n} (K_j^i - 1)\, f_j \ln \frac{F_\beta}{F_\alpha} \qquad (9.3.7)$$
where the subscripts $\alpha$ and $\beta$ refer to liquids $\alpha$ and $\beta$, respectively. The $m \times n$ matrix $A$ is defined by its current element $a_{ij}$
$$a_{ij} = K_j^i - 1 \qquad (9.3.8)$$
the $n$-vector $x$ of unknowns $x_j$ as
$$x_j = f_j \ln(F_\beta / F_\alpha) \qquad (9.3.9)$$
and the $m$-vector $y$ of data $y_i$ as
$$y_i = \ln(C_\beta^i / C_\alpha^i) \qquad (9.3.10)$$
Provided $m \geq n$, the resulting matrix equation
$$y = Ax$$
may be solved to give the usual least-square solution
$$x = (A^T A)^{-1} A^T y$$
The degree of fractionation $F_\beta / F_\alpha$ is retrieved from
$$\ln(F_\beta / F_\alpha) = \sum_{j=1}^{n} x_j \qquad (9.3.11)$$
whereas the $n$ $f_j$'s are calculated through equation (9.3.9). Finding the primary melt in a differentiation series is an entirely distinct inverse problem. Since the incremental character of mineral removal and elemental fractionation removes any useful closure condition, it is usually possible to imagine a melt more primitive than the least differentiated lava of a magmatic series. This problem is commonly handled with very compatible elements, typically Ni in basaltic systems, that vary extremely fast during magmatic differentiation but stay well-buffered during melting (Treuil and Joron, 1975; Allegre et al., 1977) as discussed in Section 9.5.
The inverse solution to the previous problem will provide a good example. Using equation (9.3.10), the $y$ vector is built from the initial and residual liquid concentrations as
$$y = \begin{bmatrix} \ln(65.7/150) \\ \ln(99.6/100) \\ \ln(3.58/3) \\ \ln(12.5/10) \end{bmatrix} = \begin{bmatrix} -0.826 \\ -0.004 \\ 0.1768 \\ 0.223 \end{bmatrix}$$
whereas, from equation (9.3.8), the matrix $A$ is
$$A = \begin{bmatrix} 15 - 1 & 1 - 1 & 0 - 1 \\ 0 - 1 & 0.1 - 1 & 2 - 1 \\ 0.05 - 1 & 0.35 - 1 & 0.25 - 1 \\ 0 - 1 & 0 - 1 & 0 - 1 \end{bmatrix}$$
With the intermediate steps
$$(A^T A)^{-1} = \begin{bmatrix} 0.007196 & -0.01587 & 0.02946 \\ -0.01587 & 0.5032 & -0.1422 \\ 0.02946 & -0.1422 & 0.4140 \end{bmatrix}$$
and
$$x = (A^T A)^{-1} A^T y = \begin{bmatrix} -0.06694 \\ -0.04463 \\ -0.1116 \end{bmatrix}$$
the final solution is obtained from equation (9.3.11) as
$$F_\beta / F_\alpha = e^{-0.06694 - 0.04463 - 0.1116} = e^{-0.2232} = 0.8$$
Given $\ln 0.8 = -0.2232$, mineral fractions are retrieved through equation (9.3.9)
$$\begin{bmatrix} f_{ol} \\ f_{cpx} \\ f_{plag} \end{bmatrix} = \begin{bmatrix} 0.06694/0.2232 \\ 0.04463/0.2232 \\ 0.1116/0.2232 \end{bmatrix} = \begin{bmatrix} 0.3 \\ 0.2 \\ 0.5 \end{bmatrix}$$
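The inversion just described amounts to an ordinary least-squares solve followed by a normalization. The following Python sketch generates error-free residual liquid concentrations with the Rayleigh equation and then recovers the fractionation ratio and the cumulate mode through equations (9.3.8) to (9.3.11). The function names are hypothetical and the normal equations are solved by plain Gaussian elimination, which is adequate for this small, well-conditioned system but is an implementation choice made here.

```python
from math import exp, log

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

# Table 9.6, rows Ni, Sr, Yb, Rb; columns olivine, clinopyroxene, plagioclase.
K = [[15.0, 1.0, 0.0], [0.0, 0.1, 2.0], [0.05, 0.35, 0.25], [0.0, 0.0, 0.0]]

def invert_fractionation(c_parent, c_residual):
    """Return (F_beta / F_alpha, cumulate mineral fractions) from two liquid compositions."""
    A = [[k - 1.0 for k in row] for row in K]                      # equation (9.3.8)
    y = [log(cb / ca) for ca, cb in zip(c_parent, c_residual)]     # equation (9.3.10)
    n = len(A[0])
    ata = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    aty = [sum(row[i] * yi for row, yi in zip(A, y)) for i in range(n)]
    x = solve(ata, aty)
    ln_ratio = sum(x)                                              # equation (9.3.11)
    return exp(ln_ratio), [xj / ln_ratio for xj in x]              # equation (9.3.9)

parent = [150.0, 100.0, 3.0, 10.0]
true_modes = (0.3, 0.2, 0.5)
D = [sum(f * k for f, k in zip(true_modes, row)) for row in K]
residual = [c * 0.8 ** (d - 1.0) for c, d in zip(parent, D)]       # equation (9.3.3)

F_ratio, modes = invert_fractionation(parent, residual)
print(round(F_ratio, 4), [round(f, 4) for f in modes])
```

Note that the division by `ln_ratio` fails when the two liquids are identical, in which case no fractionation has taken place and the mineral fractions are undetermined.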
9.3.3 Fractional melting

This model of liquid extraction is symmetrical to fractional crystallization and has attracted renewed interest after the demonstration by Johnson et al. (1990) that REE distributions in abyssal peridotite clinopyroxene cannot be accounted for by equilibrium melting processes. The solid is supposed to maintain its chemical homogeneity while liquid is continuously extracted. Only the last drop of liquid is supposed to be in equilibrium with the residue. Concentration in the solid can be retrieved from the general equations developed in Section 1.5. The general differential equation (1.5.28)

$$\frac{d\ln C_{sol}^i}{d\ln(1-F)} = \frac{1}{D_i} - 1 \qquad (9.3.12)$$

was obtained, in which $F$ is the melt fraction and $D_i$ the bulk solid-liquid partition coefficient. Logarithmic plots of concentrations in residual solids which underwent fractional melt extraction at constant $D$'s should define linear arrays. The slope $s_{i1}^{i2}$ in a $\ln C_{sol}^{i2}$ vs $\ln C_{sol}^{i1}$ diagram would be

$$s_{i1}^{i2} = \frac{1/D_{i2} - 1}{1/D_{i1} - 1} \qquad (9.3.13)$$

For very small partition coefficients (incompatible elements), equation (9.3.13) becomes

$$s_{i1}^{i2} \approx \frac{D_{i1}}{D_{i2}}$$

Constant $D$ is probably a good approximation in so far as the degree of melting is significantly smaller than the proportion of the least abundant mineral phase. Integrating the differential equation gives expressions for the solid and the instantaneous liquid in equilibrium with it

$$C_{sol}^i = C_0^i\,(1-F)^{1/D_i - 1} \qquad (9.3.14)$$

and

$$C_{liq}^i = \frac{C_0^i}{D_i}\,(1-F)^{1/D_i - 1} \qquad (9.3.15)$$

In a $\ln C_{liq}^{i2}$ vs $\ln C_{liq}^{i1}$ diagram and for small degrees of melting, the instantaneous liquids would also define a straight line with the same slope as the solid array. Fractional melting processes are even more efficient than equilibrium melting in fractionating incompatible elements for small fractions of melt since

$$\frac{C_{liq}^{i1}}{C_{liq}^{i2}} = \frac{C_0^{i1}}{C_0^{i2}}\,\frac{D_{i2}}{D_{i1}}\,(1-F)^{1/D_{i1} - 1/D_{i2}} \qquad (9.3.16)$$

in which the exponent can be very large (Figure 9.6).
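The contrast between the two melting models can be sketched numerically. The block below is illustrative only: it assumes that the equilibrium model of equation (9.2.2) has the usual batch-melting form $C_{liq} = C_0/[D + F(1-D)]$ (that equation is not reproduced in this section), and it uses $C_{liq} = (C_0/D)(1-F)^{1/D-1}$ for the instantaneous fractional melt. The function names are hypothetical.

```python
def batch_liquid(c0, d, f):
    """Equilibrium (batch) melt; assumed form of equation (9.2.2)."""
    return c0 / (d + f * (1.0 - d))


def fractional_solid(c0, d, f):
    """Residual solid after fractional melting at constant D."""
    return c0 * (1.0 - f) ** (1.0 / d - 1.0)


def fractional_liquid(c0, d, f):
    """Instantaneous melt in equilibrium with the residual solid."""
    return fractional_solid(c0, d, f) / d


# Same partition coefficients as Figure 9.6: one incompatible, one compatible element.
for d in (0.1, 2.0):
    print(f"D = {d}")
    for f in (0.01, 0.05, 0.1, 0.2, 0.3):
        eq = batch_liquid(1.0, d, f)
        fr = fractional_liquid(1.0, d, f)
        print(f"  F = {f:4.2f}  equilibrium = {eq:8.4f}  fractional = {fr:8.4f}")
```

For $D = 0.1$ the two predictions drift apart within a few percent of melting, whereas for $D = 2$ they stay close over the whole range, which is the behavior described in the caption of Figure 9.6.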
Figure 9.6 Comparison of the equilibrium [equation (9.2.2)] and fractional melting [equation (9.3.15)] models for a bulk solid-liquid partition coefficient $D_i$ of 0.1 (top) and 2 (bottom), plotted against the melt fraction $F$. Although the concentrations predicted by the two models diverge rapidly for incompatible elements in instantaneous melts, they remain virtually identical for compatible elements.
If the various melt fractions are collected together in the proportions they are extracted from the source (aggregated melt), the average liquid concentration $\bar{C}_{liq}^i$ is obtained by making the initial concentration equal to the sum of liquid and solid with the appropriate weight-fractions

$$C_0^i = F\,\bar{C}_{liq}^i + (1-F)\,C_{sol}^i \qquad (9.3.17)$$

or

$$\bar{C}_{liq}^i = \frac{C_0^i - (1-F)\,C_{sol}^i}{F} \qquad (9.3.18)$$
Clearly, the averaging process decreases the efficiency of fractionation between incompatible elements. We can again introduce Shaw's $P_i$ variables, which we assume to be constant (eutectic melting), and change variables according to equation (9.2.14). Thereupon, the differential form of the fractional melting equation can be rewritten

$$\frac{d\ln C_{sol}^i}{d\ln(1-F)} = \frac{1}{D_i} - 1$$

or

$$\frac{d\ln C_{sol}^i}{d\ln(1-F)} = \frac{1-F}{D_i^0 - FP_i} - 1$$

The last equation can be rearranged for easier integration into

$$d\ln C_{sol}^i = -\frac{dF}{D_i^0 - FP_i} - d\ln(1-F) = \frac{1}{P_i}\,d\ln\left(1 - \frac{FP_i}{D_i^0}\right) - d\ln(1-F)$$

Integrating between 0 and $F$, we get

$$C_{sol}^i = \frac{C_0^i}{1-F}\left(1 - \frac{FP_i}{D_i^0}\right)^{1/P_i} \qquad (9.3.19)$$

Again, the aggregated liquid can be calculated through equations (9.3.17) and (9.3.18).
In order to retrieve concentrations in the instantaneous solid, the instantaneous residual mineralogy and bulk partition coefficient must be calculated. The difficulty of applying the fractional melting model is the discontinuous character of the melting process (e.g., Presnall, 1969). Whenever a mineral phase is exhausted, the progress of fractional melting requires temperature jumps of expectedly large amplitude and discontinuous variations in melt chemistry which are not in general well-documented in natural examples.

The fractional melting equations can be illustrated with the melting example calculated above. A peridotite made of 80 percent olivine and 20 percent clinopyroxene contains 2500 ppm Ni, 1500 ppm Cr, 0.2 ppm Yb, and 0.01 ppm Rb. Calculate the concentration of each element in the aggregated liquid produced after 10 percent partial melting with a liquid norm of 40 percent olivine and 60 percent clinopyroxene. Assume that the partition coefficients are given by Table 9.1 (p. 479).

The bulk solid-liquid partition coefficients for $F = 0$ are computed from (9.2.12) as

$D_{Ni}^0 = 0.8 \times 6 + 0.2 \times 1 = 5.0$
$D_{Cr}^0 = 0.8 \times 1 + 0.2 \times 8 = 2.4$
$D_{Yb}^0 = 0.8 \times 0.1 + 0.2 \times 0.3 = 0.14$
$D_{Rb}^0 = 0.8 \times 0 + 0.2 \times 0 = 0$

whereas, from (9.2.13), the partition coefficients of the melt mode are

$P_{Ni} = 0.4 \times 6 + 0.6 \times 1 = 3.0$
$P_{Cr} = 0.4 \times 1 + 0.6 \times 8 = 5.2$
$P_{Yb} = 0.4 \times 0.1 + 0.6 \times 0.3 = 0.22$
$P_{Rb} = 0.4 \times 0 + 0.6 \times 0 = 0$

From equation (9.3.19), concentrations in the residual solid are

$$C_{sol}^{Ni} = \frac{C_0^{Ni}}{1-F}\left(1 - \frac{FP_{Ni}}{D_{Ni}^0}\right)^{1/P_{Ni}} = \frac{2500}{1-0.1}\left(1 - \frac{0.1 \times 3.0}{5.0}\right)^{1/3.0} = 2721 \text{ ppm}$$

and, likewise, $C_{sol}^{Cr} = 1590$ ppm, $C_{sol}^{Yb} = 0.10$ ppm, and $C_{sol}^{Rb} = 0$ ppm. Concentrations in the aggregated liquid are given by (9.3.18)

$$\bar{C}_{liq}^{Ni} = \frac{C_0^{Ni} - (1-F)\,C_{sol}^{Ni}}{F} = \frac{2500 - (1-0.1) \times 2721}{0.1} = 511 \text{ ppm}$$

and, likewise, $\bar{C}_{liq}^{Cr} = 690$ ppm, $\bar{C}_{liq}^{Yb} = 1.08$ ppm, and $\bar{C}_{liq}^{Rb} = 0.10$ ppm.
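The worked example can be reproduced with a few lines of code. This is a sketch under stated assumptions rather than the author's calculation: it takes the residual solid as $C_{sol} = C_0\,(1 - FP/D^0)^{1/P}/(1-F)$ and the aggregated liquid as $[C_0 - (1-F)C_{sol}]/F$, uses the mineral/liquid partition coefficients that appear in the products above, and treats the special cases $D^0 = 0$ and $P = 0$ explicitly. All function and variable names are hypothetical.

```python
import math


def bulk_coefficient(proportions, kd):
    """Weighted sum of mineral/liquid partition coefficients."""
    return sum(p * k for p, k in zip(proportions, kd))


def residual_solid(c0, d0, p, f):
    """Residual solid after fractional melting with constant melt-mode coefficient P."""
    if d0 == 0.0:
        return 0.0  # perfectly incompatible element: nothing stays in the solid
    if p == 0.0:
        # limit of (1 - F P / D0)**(1/P) when P tends to zero
        return c0 * math.exp(-f / d0) / (1.0 - f)
    return c0 * (1.0 - f * p / d0) ** (1.0 / p) / (1.0 - f)


def aggregated_liquid(c0, c_sol, f):
    """Mass balance between source, residual solid and pooled melt."""
    return (c0 - (1.0 - f) * c_sol) / f


source_mode = (0.8, 0.2)   # olivine, clinopyroxene in the peridotite
melt_mode = (0.4, 0.6)     # olivine, clinopyroxene entering the melt
F = 0.1
# element: (source concentration in ppm, (Kd olivine, Kd clinopyroxene))
elements = {
    "Ni": (2500.0, (6.0, 1.0)),
    "Cr": (1500.0, (1.0, 8.0)),
    "Yb": (0.2, (0.1, 0.3)),
    "Rb": (0.01, (0.0, 0.0)),
}

results = {}
for name, (c0, kd) in elements.items():
    d0 = bulk_coefficient(source_mode, kd)
    p = bulk_coefficient(melt_mode, kd)
    c_sol = residual_solid(c0, d0, p, F)
    c_liq = aggregated_liquid(c0, c_sol, F)
    results[name] = (c_sol, c_liq)
    print(f"{name}: D0 = {d0:.2f}, P = {p:.2f}, solid = {c_sol:.4g} ppm, liquid = {c_liq:.4g} ppm")
```

The liquid concentrations differ slightly from those quoted in the example because the text rounds the solid concentration before applying the mass balance.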
9.3.4 Continuous melting
Because of presumably finite residual porosity after melting completion, some magma is expected to be left behind. Langmuir et al. (1977) called continuous melting a fractional melting process with residual porosity. These authors have not provided the constitutive equation which can be found, although not quite in a physically consistent form, in McKenzie (1985). This equation has recently appeared in full in Sobolev and Shimizu (1992) under the term of 'critical' melting. Using a method applied by Greenland (1970) and Albarede (1976) to fractional crystallization, the continuous melting equations can be derived from those governing fractional melting by assuming a source with constant volume porosity
xl
and
D,
(9.3.21)
into the fractional melting equations. Mass porosity $\phi$ and volume porosity $\varphi$ relate through

$$\phi = \frac{\varphi\rho_{liq}}{\varphi\rho_{liq} + \rho_{sol}(1-\varphi)} \qquad (9.3.22)$$
and therefore
(j>
psol(l
-
Introducing this expression into (9.3.14), the equations for continuous melting become
where the concentration in the residue (res) is related to that in the liquid and the solid through
For the liquid, we obtain
These equations converge towards those of the fractional melting model for $\varphi \ll D_i$ and, contrary to McKenzie (1985) equation (29), $C_{liq}^i$ tends to $C_0^i$ when porosity $\varphi \to 1$. An example has been drawn in Figure 9.7 for $D_i = 0.001$ and different values of $\varphi$. When the porosity $\varphi$ and the partition coefficient are of the same order of magnitude, large variability is achieved in both the solid and the residue, a point which will be returned to below. Considerable attention has recently been focused on this model, which may explain the fractionation of some strongly incompatible nuclides in the U decay series (McKenzie, 1985; Williams and Gill, 1989; Beattie, 1993).
9.4 Open magmatic systems

The boundaries of open magmatic systems allow differential movements of either liquid or solid phases. Assimilation of solid or liquid material unrelated to the source of the melt is referred to as contamination, e.g., assimilation of granitic crust by basalts. Replenishment refers to the input of fresh magma into a differentiating magmatic body issued from the same source. Although it is usual and, as far as trace elements are concerned, justified to make a distinction between mostly liquid (magma chambers) and mostly solid systems (molten regions), a whole continuum of solid-liquid configurations exists which will be treated collectively under the generic heading of open magmatic systems.
Figure 9.7 The continuous melting model for $D_i = 0.001$ and diverse values of the residual porosity $\varphi$, plotted against the degree of melting $F$. Concentrations in the residue, i.e., solid plus residual melt (top) and the liquid (bottom).
9.4.1 The steady-state magma chamber

A simple version of this model is discussed in a more general context in Chapter 7. A body of magmatic liquid of constant mass $M$ (Figure 9.8) and containing an element $i$ with concentration initially at $C_0^i$ receives a continuous flux $Q$ of fresh liquid with concentration $C_{in}^i$. During the same time interval, an equivalent flux $Q$ is either crystallized or erupted. The cumulate has a concentration $C_{sol}^i$ equal to $D_i$ times the liquid concentration $C_{liq}^i$. Assuming a fraction of suspended crystals $1-F$ in the magma chamber and a fraction $1-\Phi$ of cumulate relative to the total (cumulate + erupted liquid), the budget for the element $i$ can be written as

$$\frac{d}{dt}\left\{M\left[F\,C_{liq}^i + (1-F)\,C_{sol}^i\right]\right\} = Q\,C_{in}^i - Q\left[\Phi\,C_{liq}^i + (1-\Phi)\,C_{sol}^i\right] \qquad (9.4.1)$$

Introducing the solid-liquid partition coefficients $D_i$ and expanding the left-hand side for constant $M$

$$M(1-D_i)\,C_{liq}^i\,\frac{dF}{dt} + M\left[F + (1-F)D_i\right]\frac{dC_{liq}^i}{dt} = Q\,C_{in}^i - Q\left[\Phi + (1-\Phi)D_i\right]C_{liq}^i$$

we get the dynamic equation (see Section 7.2)

$$\left[F + (1-F)D_i\right]\frac{dC_{liq}^i}{dt} + \left\{\frac{Q}{M}\left[\Phi + (1-\Phi)D_i\right] + (1-D_i)\frac{dF}{dt}\right\}C_{liq}^i = \frac{Q}{M}\,C_{in}^i \qquad (9.4.2)$$

Figure 9.8 The steady-state magma chamber: constant $M$ and $Q$ are assumed, the liquid is well-stirred and in equilibrium with the last solid formed.

In general, residence time depends on the rate of crystallization. For the limiting case where $F = 1$ (no suspended crystals), equation (9.4.2) simplifies to

$$\frac{dC_{liq}^i}{dt} + \frac{Q}{M}\left[\Phi + (1-\Phi)D_i\right]C_{liq}^i = \frac{Q}{M}\,C_{in}^i \qquad (9.4.3)$$

Defining $\alpha_i = \Phi + (1-\Phi)D_i$ and the magma residence (flushing) time $\tau_m = M/Q$ gives the equivalent of equation (7.2.7) as

$$\frac{dC_{liq}^i}{dt} = \frac{C_{in}^i - \alpha_i\,C_{liq}^i}{\tau_m} \qquad (9.4.4)$$
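The behavior of the well-stirred chamber can be explored numerically. The sketch below assumes the rate law $dC/dt = (C_{in} - \alpha C)/\tau_m$ with $\alpha = \Phi + (1-\Phi)D$, integrates it with a classical fourth-order Runge-Kutta scheme, and compares the result with the closed-form solution of this linear equation, $C(t) = C_{in}/\alpha + (C_0 - C_{in}/\alpha)\,e^{-\alpha t/\tau_m}$. The parameter values ($\Phi = 0.5$, $\tau_m = 1$, unit concentrations) are arbitrary illustrations, not values from the text.

```python
import math


def alpha(phi_erupted, d):
    """Reactivity parameter: erupted fraction plus cumulate fraction times D."""
    return phi_erupted + (1.0 - phi_erupted) * d


def closed_form(c0, c_in, a, tau, t):
    """Analytical solution of dC/dt = (c_in - a C) / tau with C(0) = c0."""
    c_steady = c_in / a
    return c_steady + (c0 - c_steady) * math.exp(-a * t / tau)


def integrate(c0, c_in, a, tau, t_end, steps=2000):
    """Fourth-order Runge-Kutta integration of the same rate law."""
    def rate(c):
        return (c_in - a * c) / tau

    h = t_end / steps
    c = c0
    for _ in range(steps):
        k1 = rate(c)
        k2 = rate(c + 0.5 * h * k1)
        k3 = rate(c + 0.5 * h * k2)
        k4 = rate(c + h * k3)
        c += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return c


# Hypothetical chamber: half of the output is erupted, residence time of one unit.
PHI, TAU, C0, C_IN = 0.5, 1.0, 1.0, 1.0
for d in (0.1, 5.0):
    a = alpha(PHI, d)
    numeric = integrate(C0, C_IN, a, TAU, t_end=2.0)
    exact = closed_form(C0, C_IN, a, TAU, 2.0)
    print(f"D = {d}: alpha = {a:.2f}, C(t=2) numeric = {numeric:.6f}, "
          f"exact = {exact:.6f}, steady state = {C_IN / a:.4f}")
```

The compatible element ($D = 5$) has essentially reached its steady state after two residence times, while the incompatible one ($D = 0.1$) is still evolving, in line with the shorter residence time of reactive elements.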
The residence time of a trace element is $\tau_m/\alpha_i$: compatible elements can be thought of as reactive and have shorter residence times than inert incompatible elements. As shown in Chapter 7, equation (9.4.4) can be integrated from 0 to $t$ into equation (7.2.12)

$$C_{liq}^i = C_0^i\exp\left(-\frac{\alpha_i t}{\tau_m}\right) + \frac{C_{in}^i}{\alpha_i}\left[1 - \exp\left(-\frac{\alpha_i t}{\tau_m}\right)\right] \qquad (9.4.5)$$

and steady-state concentration is $C_{in}^i/\alpha_i$.

9.4.2 A periodically erupting, periodically refilled magma chamber

This model is a variant of the steady-state magma chamber discussed in the previous section but with periodic input and eruption rate. One version of this model corresponding to a specific sequence of processes has been calculated by O'Hara (1977) and O'Hara and Mathews (1981). A simpler derivation and more complete and physically elucidating solutions have been given by Albarede (1985). For an element $i$, fresh magma input balances output either as erupted lavas or as cumulate

$$(1-F)\,\bar{C}_{sol}^i + Y\,C_{liq}^i = (1-F+Y)\,C_0^i \qquad (9.4.6)$$
where $1-F$ is the fraction of magma crystallized and $Y$ the fraction erupted in each cycle. For O'Hara, eruption takes place before replenishment at the end of the differentiation stage. Inserting equation (9.3.6) into equation (9.4.6) gives

$$\frac{C_{liq}^i}{C_0^i} = \frac{(1-F+Y)\,F^{D_i-1}}{1 - (F-Y)\,F^{D_i-1}} \qquad (9.4.7)$$

Alternatively, eruption may take place after replenishment at the onset of the differentiation stage (Albarede, 1985). Undifferentiated magma is erupted which has the concentration $C_{liq}^i$. Equation (9.3.5) therefore becomes

$$\bar{C}_{sol}^i = C_{liq}^i\,\frac{1 - F^{D_i}}{1-F}$$

which, inserted into equation (9.4.6), gives

$$\frac{C_{liq}^i}{C_0^i} = \frac{1-F+Y}{1 - F^{D_i} + Y} \qquad (9.4.8)$$
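The two eruption scenarios can be checked by brute force. The sketch below is an illustration under assumptions that go beyond the text: $F$ and $Y$ are read as fractions of the chamber mass at the start of each differentiation stage, the liquid follows the fractional crystallization law $C \propto F^{D-1}$ within a stage, and the steady-state expressions are taken as $(1-F+Y)F^{D-1}/[1-(F-Y)F^{D-1}]$ when eruption precedes replenishment and $(1-F+Y)/(1-F^{D}+Y)$ when it follows it. Each cycle is simply repeated until the erupted composition stops changing; the function names are hypothetical.

```python
def erupt_before_refill(f, y, d):
    """Steady-state erupted liquid over input magma, eruption at the end of a stage."""
    g = f ** (d - 1.0)
    return (1.0 - f + y) * g / (1.0 - (f - y) * g)


def erupt_after_refill(f, y, d):
    """Steady-state erupted liquid over input magma, eruption at the onset of a stage."""
    return (1.0 - f + y) / (1.0 - f ** d + y)


def simulate_before(f, y, d, cycles=500):
    """Repeat: crystallize 1 - f, erupt y of evolved liquid, refill with 1 - f + y."""
    c_start = 1.0                                    # input magma concentration is 1
    for _ in range(cycles):
        c_end = c_start * f ** (d - 1.0)             # residual liquid of mass f
        c_start = (f - y) * c_end + (1.0 - f + y)    # mix leftover liquid with input
    return c_start * f ** (d - 1.0)                  # erupted lava is the evolved liquid


def simulate_after(f, y, d, cycles=500):
    """Repeat: erupt y of fresh mixture (mass 1 + y), crystallize 1 - f, refill."""
    c_start = 1.0
    for _ in range(cycles):
        c_end = c_start * f ** (d - 1.0)             # residual liquid of mass f
        c_start = (f * c_end + (1.0 - f + y)) / (1.0 + y)
    return c_start                                   # erupted lava is the fresh mixture


F_RESIDUAL, Y_ERUPTED = 0.7, 0.5                     # hypothetical cycle parameters
for d in (0.1, 5.0):
    print(f"D = {d}: before refill {erupt_before_refill(F_RESIDUAL, Y_ERUPTED, d):.4f} "
          f"(simulated {simulate_before(F_RESIDUAL, Y_ERUPTED, d):.4f}), "
          f"after refill {erupt_after_refill(F_RESIDUAL, Y_ERUPTED, d):.4f} "
          f"(simulated {simulate_after(F_RESIDUAL, Y_ERUPTED, d):.4f})")
```

The first scenario requires $Y \le F$, since the erupted liquid is taken from what remains after crystallization.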
A comparison of the two models described by equations (9.4.7) and (9.4.8) with fractional crystallization for $D_i = 0.1$ and $D_i = 5$ and assuming an erupted fraction $Y$ of 50 percent is shown in Figure 9.5 (p. 493). Use of either equation (9.4.7) or equation (9.4.8) leads to quite different patterns of incompatible and compatible elements (Albarede, 1985; Caroff et al., 1993) which makes it possible to discuss the timing of replenishment and eruption events.

9.4.3 Assimilation-fractional crystallization (AFC)

Thermal and chemical effects of country-rock assimilation on the liquid line of descent of magmas were already known (Bowen, 1956), but considerable development and application flourished from the so-called assimilation-fractional crystallization (AFC) model. The AFC model resembles fractional crystallization with the difference that the magma chamber is continuously contaminated with assimilated country-rocks. The model was initially described for trace-element fractionation by Allegre and Minster (1978) and first applied to the combined systematics of $^{18}$O/$^{16}$O and $^{87}$Sr/$^{86}$Sr in crustally contaminated magmas by Taylor (1980). The constitutive equations were systematically developed for geological purposes by DePaolo (1981) and Taylor and Sheppard (1986) for trace elements and isotopes. The RAFC process resembles AFC but includes magma chamber replenishment. Its constitutive equation is given by DePaolo (1985) and Hagen and Neumann (1990).
The symbols used in Section 1.5 to describe the evolution of element $i$ concentration in the solid and the liquid during fractional crystallization will be kept. Other parameters used in the present derivation are almost identical to those of DePaolo (1981) although reference to time, which is immaterial to the mass balance and equilibrium conditions, has been omitted. Let 'a' be the subscript representing the assimilated material, and assume that the country-rock concentration $C_a^i$ is constant. Mass balance requires

$$dM_{liq} = dM_a - dM_{sol} \quad \text{(bulk material)}$$
$$dm_{liq}^i = dm_a^i - dm_{sol}^i \quad \text{(species } i\text{)} \qquad (9.4.9)$$

whereas the solid-liquid equilibrium fractionation with partition coefficient $D_i$ reads

$$\frac{dm_{sol}^i}{dM_{sol}} = D_i\,\frac{m_{liq}^i}{M_{liq}} \qquad (9.4.10)$$

Dividing the two equations (9.4.9) by each other, we get

$$\frac{dm_{liq}^i}{dM_{liq}} = \frac{dm_a^i - dm_{sol}^i}{dM_a - dM_{sol}}$$

then

$$\frac{dm_{liq}^i}{dM_{liq}} = \frac{dm_a^i}{dM_a}\,\frac{dM_a}{dM_a - dM_{sol}} - \frac{dm_{sol}^i}{dM_{sol}}\,\frac{dM_{sol}}{dM_a - dM_{sol}} \qquad (9.4.11)$$

The ratio $r$ of assimilation and crystallization increments is defined as

$$r = \frac{dM_a}{dM_{sol}} \qquad (9.4.12)$$

Inserting $r$ into equations (9.4.10) and (9.4.11) yields

$$\frac{dm_{liq}^i}{dM_{liq}} = \frac{dm_a^i}{dM_a}\,\frac{r}{r-1} - D_i\,\frac{m_{liq}^i}{M_{liq}}\,\frac{1}{r-1} \qquad (9.4.13)$$

Multiplying equation (9.4.13) by $dM_{liq}/m_{liq}^i$, we get

$$\frac{dm_{liq}^i}{m_{liq}^i} = \frac{C_a^i}{C_{liq}^i}\,\frac{r}{r-1}\,\frac{dM_{liq}}{M_{liq}} - \frac{D_i}{r-1}\,\frac{dM_{liq}}{M_{liq}} \qquad (9.4.14)$$

Inserting the logarithmic differential of concentration into equation (9.4.14) gives

$$\frac{dC_{liq}^i}{C_{liq}^i} = \frac{C_a^i}{C_{liq}^i}\,\frac{r}{r-1}\,\frac{dM_{liq}}{M_{liq}} - \frac{r+D_i-1}{r-1}\,\frac{dM_{liq}}{M_{liq}}$$

Defining the residual melt fraction $F$ relative to the initial amount of magma $M_0$ as $F = M_{liq}/M_0$, we get

$$dC_{liq}^i = C_a^i\,\frac{r}{r-1}\,\frac{dF}{F} - \frac{r+D_i-1}{r-1}\,C_{liq}^i\,\frac{dF}{F}$$

and rearranging

$$\frac{dC_{liq}^i}{d\ln F} = \frac{r}{r-1}\,C_a^i - \frac{r+D_i-1}{r-1}\,C_{liq}^i$$

Defining $z_i$ over the $[1-D_i, 1]$ interval as

$$z_i = \frac{r+D_i-1}{r-1} = 1 - \frac{D_i}{1-r} \qquad (9.4.15)$$

the equation can be rewritten in a ready-to-integrate form as

$$\frac{dC_{liq}^i}{C_{liq}^i - \dfrac{r}{z_i(r-1)}\,C_a^i} = -z_i\,\frac{dF}{F}$$

No assumption on constant parameters has been made so far. If the amount of mineral precipitated is proportional to the amount of assimilated material, e.g., if latent heat is conserved, then $r$ is constant. For the initial liquid state $F = 1$, $C_{liq}^i = C_0^i$ as in Section 9.3.1

$$\frac{C_{liq}^i - \dfrac{r}{z_i(r-1)}\,C_a^i}{C_0^i - \dfrac{r}{z_i(r-1)}\,C_a^i} = F^{-z_i}$$

or equivalently

$$\frac{C_{liq}^i}{C_0^i} = F^{-z_i} + \frac{r}{z_i(r-1)}\,\frac{C_a^i}{C_0^i}\left(1 - F^{-z_i}\right) \qquad (9.4.16)$$

Written in full, this expression becomes

$$\frac{C_{liq}^i}{C_0^i} = F^{-\frac{r+D_i-1}{r-1}} + \frac{r}{r+D_i-1}\,\frac{C_a^i}{C_0^i}\left(1 - F^{-\frac{r+D_i-1}{r-1}}\right) \qquad (9.4.17)$$

Making $a = (1-r)^{-1}$, the equation of Allegre and Minster (1978) for wall-rock dissolution during fractional crystallization is obtained, which is therefore equivalent to the AFC equation (9.4.17) derived by DePaolo (1981). Some combinations of parameters may lead to the reversal of normal fractionation trends. Constant $C_{liq}^i$ is obtained for some critical value $r_c$ of $r$ such that the $F^{-z_i}$ terms cancel out

$$1 - \frac{r_c}{r_c + D_i - 1}\,\frac{C_a^i}{C_0^i} = 0$$

or

$$r_c = \frac{1 - D_i}{1 - C_a^i/C_0^i} \qquad (9.4.18)$$
$r_c$ represents a divide between the fractionation- and the contamination-controlled ranges. For $0 \le r < r_c$ (if $r_c > 0$), concentration changes in the same direction as for simple fractional crystallization. For $r > r_c$, changes are more similar to those expected from a contamination process. The corresponding expression for isotopic (or incompatible-element) ratios is given by DePaolo (1981). Let us label $i1$ and $i2$ the two isotopes of the same element. We further assume that their partition coefficient is identical, as are their $r$ and $z_i$ values. Dividing equation (9.4.17) for isotope $i2$ by the corresponding equation for isotope $i1$, we get
,—1 r
r
^i:« h
ii
r
il
l
q
which, dividing the left-hand side by C liq fl and the right-hand side by C o a , can be recast into r '•
/liq
1--
+ Dt-1 r
il
c5 i l I' 1
'rlCo'
iq
!
^_^1
or
The formula given by Fleck and Criss (1985) is easily arrived at from equation (9.4.19). Extreme values of concentrations occur for the fraction on the right-hand side being equal to 0 (pure contaminant melt) and 1 (no contamination). These relationships show a fairly simple behavior of the AFC model: the isotopic ratio $(C^{i2}/C^{i1})_{liq}$ should be linearly correlated with the inverse of the element concentration $C_{liq}^{i1}$, a property which it shares with all bulk mixing models. Such a linear relationship, initially suggested by Briqueu and Lancelot (1979) from the evidence of a numerical solution, was demonstrated by Fleck and Criss (1985) and Taylor and Sheppard (1986). The present analytical solution will help the reader to work out tests on geological cases. Although inversion techniques can be used (Mantovani and Hawkesworth, 1990), the parameter $r$ can most easily be retrieved from either the slope or the intercept of AFC alignments in the diagram $(C^{i2}/C^{i1})_{liq}$ vs $1/C_{liq}^{i1}$. Extracting the term in $1/C_{liq}^{i1}$ from equation (9.4.19), the slope $s_{i1}^{i2}$ becomes -Cj1 aJ
l--
1-1
r
il
Multiplying this equation by the denominator of the last term on the right-hand side, we get l
-
or 2 cVM M
Defining, in the diagram $(C^{i2}/C^{i1})_{liq}$ vs $1/C_{liq}^{i1}$, the slope $s_m$ of the mixing line between contaminant and initial magma as

$$s_m = \frac{(C^{i2}/C^{i1})_0 - (C^{i2}/C^{i1})_a}{(1/C_0^{i1}) - (1/C_a^{i1})}$$
r is calculated as
(9.4.20)

It is left to the reader as an exercise to demonstrate that r can also be retrieved from the intercept ini2 of the AFC alignment in the same diagram using

Calculate the evolution of the normalized concentration $C_{liq}^{Sr}/C_0^{Sr}$ of Sr in a magma with initial $^{87}$Sr/$^{86}$Sr = 0.703 which fractionates a cumulate with partition coefficient $D_i = 2$ while it assimilates surrounding rocks with a normalized concentration $C_a^{Sr}/C_0^{Sr} = 5$ and $^{87}$Sr/$^{86}$Sr = 0.712. The calculations have been made on a spreadsheet: they consist in setting up a table of $F$ and $r$ values. Figure 9.9 shows the calculated isopleths (constant concentration lines) as they best show the critical phenomenon (steady concentration). From equation (9.4.18), the critical value $r_c$ of $r$ is

$$r_c = \frac{1-2}{1-5} = 0.25$$
Figure 9.9 AFC model for a bulk partition coefficient $D_i = 2$ and $C_a^i/C_0^i = 5$. Subscript 'a' refers to the contaminant. Parameter $r$ is defined in equation (9.4.12). The critical $r$ value $r_c = 0.25$, calculated from equation (9.4.18), separates the fractionation dominant field from the assimilation dominant field. The labels on the curves refer to the values of $C_{liq}^i/C_0^i$.

Figure 9.10 AFC model with isotope ratios as in Figure 9.9. AFC curves at constant $r$ (curves labeled $r$ = 0, 0.1, 0.2, 0.3, and 0.5) are straight lines in a $^{87}$Sr/$^{86}$Sr vs $C_0^{Sr}/C_{liq}^{Sr}$ diagram. The straight lines do not pass through the point representing the contaminant. Open circles indicate 5 percent crystallization increments.
The isotopic results calculated from equations (9.4.16) and (9.4.19) are shown in the $^{87}$Sr/$^{86}$Sr vs $C_0^{Sr}/C_{liq}^{Sr}$ diagram of Figure 9.10: again, we see how the system switches from the crystallization- to the assimilation-dominant regimes when $r$ crosses the critical value $r_c$.
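The spreadsheet calculation of this example can be sketched in code. The block below is an illustration under stated assumptions: it takes the AFC law as $C_{liq}/C_0 = F^{-z} + \frac{r}{r+D-1}\frac{C_a}{C_0}(1-F^{-z})$ with $z = (r+D-1)/(r-1)$, the critical ratio as $r_c = (1-D)/(1-C_a/C_0)$, and obtains the isotopic ratio by applying the same law to each isotope separately with identical $D$ and $r$. The Sr concentration is represented by its reference isotope $^{86}$Sr, which is an approximation introduced here, and all function names are hypothetical.

```python
def afc_exponent(r, d):
    """z = (r + D - 1) / (r - 1); undefined for r = 1."""
    return (r + d - 1.0) / (r - 1.0)


def afc_concentration(f, r, d, ca_over_c0):
    """Liquid concentration normalized to the initial liquid (needs r + D != 1)."""
    fz = f ** (-afc_exponent(r, d))
    return fz + r / (r + d - 1.0) * ca_over_c0 * (1.0 - fz)


def critical_r(d, ca_over_c0):
    """Value of r for which the liquid concentration stays constant."""
    return (1.0 - d) / (1.0 - ca_over_c0)


def afc_isotope_ratio(f, r, d, ca_over_c0, ratio_0, ratio_a):
    """Isotope ratio i2/i1 in the liquid, each isotope following the AFC law."""
    c1 = afc_concentration(f, r, d, ca_over_c0)
    # For isotope i2 the contaminant/magma contrast is scaled by the isotope ratios.
    c2 = ratio_0 * afc_concentration(f, r, d, ca_over_c0 * ratio_a / ratio_0)
    return c2 / c1


D_SR, CA_OVER_C0 = 2.0, 5.0
R0, RA = 0.703, 0.712
print("critical r =", critical_r(D_SR, CA_OVER_C0))
for r in (0.0, 0.1, 0.2, 0.3, 0.5):
    row = []
    for f in (1.0, 0.9, 0.8, 0.7, 0.6):
        c = afc_concentration(f, r, D_SR, CA_OVER_C0)
        ratio = afc_isotope_ratio(f, r, D_SR, CA_OVER_C0, R0, RA)
        row.append(f"{c:.3f}/{ratio:.4f}")
    print(f"r = {r}: " + "  ".join(row))
```

Each printed pair is the normalized Sr concentration followed by $^{87}$Sr/$^{86}$Sr. Below the critical ratio the concentration falls with progressive crystallization, above it the concentration rises, and at constant $r$ the isotope ratio plots on a straight line against the inverse concentration.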
This model describes the displacement of a molten zone of constant length and can be considered as a forerunner of the modern percolation theory. The process itself was first developed in metallurgy as a way of producing high-purity metals and is known as both zone-refining and zone-melting (Pfann, 1952). The original model, brought to geology by Harris (1957), initially dealt with a moving zone of completely molten material. It is of broader geological interest to consider (Figure 9.11) the displacement of a partially molten zone of constant length $L$ (= volume per unit surface) with a volume fraction $\Phi$ of liquid. This zone moves through a medium at concentration $C_0^i(z)$ and leaves a residuum with a volume fraction $\varphi$ of liquid which will be referred to as residual porosity. $\rho_{sol}$ and $\rho_{liq}$ are the densities of the solid matrix and melt, respectively. The molten zone is well-mixed so the reference level for the liquid and solid concentrations is conveniently taken as $z$. $\Phi$, $\varphi$, and the partition coefficients are supposed to be invariant. When the molten zone has proceeded over a length $z$, mass balance requires dL[OPliqCliq' + (1 -
We define the enrichment factors $k_L^i$ and $k_R^i$ as

$$k_L^i = \Phi\rho_{liq} + (1-\Phi)\rho_{sol}D_i \qquad (9.4.23)$$

and

$$k_R^i = \varphi\rho_{liq} + (1-\varphi)\rho_{sol}D_i \qquad (9.4.24)$$
Rearranging equation (9.4.22), we get the general conservation equation ^
^
^
(9.4.25)
The characteristic length over which concentration of element $i$ in the liquid changes by a factor $e$ because of zone melting is $(k_L^i/k_R^i)L$. If the distribution prior to melting is constant and such that $C_0^i(z+L) = C_0^i$ independent of the depth $z$, equation (9.4.25) is integrated as
The value of $C_{liq}^i$ at $z = 0$ can be the concentration of a liquid generated by batch- or fractional melting from the same source or that of an exotic liquid introduced at the bottom of the melting column.

Figure 9.11 The layout of the zone-refining model.

The assumption of $\Phi = 1$ (all liquid zone) and
Pliq L
pliq L
(9.4.27)
When z»L, limit concentration is reached for (9.4.28)
Incompatible elements can achieve very large enrichment in the liquid. Steady-state is achieved over a characteristic length in proportion with $(k_L^i/k_R^i)L$. For small porosities, this length is in the order of $(\Phi/\varphi)L$ for incompatible elements, in the order of $L$ for compatible elements (Figure 9.12). A very small fraction $\varphi$ of liquid left behind therefore has a dramatic effect on both the limit concentration ($\approx C_0^i/\varphi$) and the characteristic length of incompatible elements.

We can further investigate the concentration distribution left behind upon sweeping of the solid by a large number of liquid molten zones, which, for simplicity, we choose of identical thickness. This problem is identical to finding a distribution $C_0^i(z)$ which would be insensitive to the propagation of the molten zone. Equivalently, we can state that the molten zone does not transport any substantial amount of element $i$. The asymptotic solution for an infinitely large number of sweeps has been given by Pfann (1952) in the case of complete melting. We will now derive a solution for constant $L$, $D_i$, $\Phi$, and $\varphi$. Since there is no transport of the element $i$ by the molten zone, the local balance must be that of closed-system partial melting 1 Cz+ j ^\\a
\Z)
—
(9.4.29)
Figure 9.12 The zone-refining model described by simplified equation (9.4.27) for a completely molten zone. Concentration in the solid left behind the zone for different values of the bulk solid-liquid $D_i$, plotted against $z/L$. Steady-state is achieved over distances much shorter for compatible than for incompatible elements.

We recognize in equation (9.4.29) the batch equilibrium melting equation (9.2.2) with the solid concentration averaged over the molten zone as the source composition. What is left behind by the molten zone should both recover the concentration distribution at $C_0^i(z)$ and be in equilibrium with the liquid in the molten zone, i.e.,
Inserting this expression into (9.4.29) leads to the integral equation
iiqI(z)=
zS f +L
(9A3O)
We could take the derivative of this expression and apply Leibniz's rule, but we know that exponentials have the property that their integrals and derivatives are linearly related. We therefore try the solution

$$C_0^i(z) = C_0^i(0)\,e^{z/\ell_i} \qquad (9.4.31)$$
where $\ell_i$ is a constant with the dimension of a length to be determined. $\ell_i$ is the distance over which the concentration of element $i$ changes by a factor $e$. Inserting this expression into the integral equation (9.4.30) gives

$$L\,k_L^i = \ell_i\,k_R^i\left(e^{L/\ell_i} - 1\right)$$

The parameter $\ell_i$ is therefore obtained as a solution of the transcendental equation

$$\frac{k_R^i}{k_L^i} = \frac{L/\ell_i}{e^{L/\ell_i} - 1} \qquad (9.4.32)$$

Figure 9.13 The zone-refining model with an infinite number of passes: determination through equation (9.4.32) of the length $\ell_i$ in the exponential distribution of solid concentrations described by equation (9.4.31). The two fields of the diagram are labeled upstream enrichment and downstream enrichment. Incompatible elements are such that the $k_R^i/k_L^i$ ratio is nearly equal to the ratio of residual porosity to the degree of melting and therefore are efficiently skimmed downstream ($\ell_i \ll L$).
This relationship has been displayed in Figure 9.13. For small values of $\Phi$ and $\varphi$, compatible elements are such that $k_L^i \approx k_R^i$. This means that $|\ell_i| \gg L$ and compatible elements such as Ni, Cr, or Mg are virtually unaffected by zone-refining. Incompatible elements are such that $k_R^i/k_L^i \ll 1$, so that $\ell_i \ll L$. From equation (9.4.31), the slope $s_{i1}^{i2}$ of the solid array in a logarithmic plot of two elements $i1$ and $i2$ is

$$s_{i1}^{i2} = \frac{\ln C_0^{i2}(z) - \ln C_0^{i2}(0)}{\ln C_0^{i1}(z) - \ln C_0^{i1}(0)} = \frac{\ell_{i1}}{\ell_{i2}} \qquad (9.4.33)$$

As for fractional crystallization and fractional melting, element-element plots with a logarithmic scale should show straight lines for the solid as well as for the liquid, since both differ by a constant coefficient. Contrary to fractional crystallization but similar to fractional melting, discussed above, and to percolation, to be presented below, zone-melting is a very powerful process to separate incompatible elements.
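The characteristic length of the exponential distribution can be found numerically. The sketch below rests on a reading of the damaged transcendental equation as $k_R/k_L = x/(e^x - 1)$ with $x = L/\ell$, which is what substituting the trial exponential into a closed-system balance over the interval $[z, z+L]$ gives; that form, the densities (2.7 and 3.4), the liquid fractions ($\Phi = 0.1$, $\varphi = 0.01$) and the function names are assumptions for illustration. Since $x/(e^x - 1)$ decreases monotonically, a plain bisection is sufficient.

```python
import math


def enrichment_factor(liquid_fraction, d, rho_liq, rho_sol):
    """k = (liquid fraction) rho_liq + (1 - liquid fraction) rho_sol D."""
    return liquid_fraction * rho_liq + (1.0 - liquid_fraction) * rho_sol * d


def shape(x):
    """x / (exp(x) - 1), continued by its limit 1 at x = 0."""
    if abs(x) < 1e-12:
        return 1.0
    return x / math.expm1(x)


def solve_l_over_ell(ratio, tol=1e-12):
    """Bisection for x = L / ell such that shape(x) = ratio (ratio > 0)."""
    lo = -(ratio + 1.0)            # shape(lo) exceeds -lo, hence exceeds ratio
    hi = 1.0
    while shape(hi) > ratio:       # shape decreases towards 0 for large x
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if shape(mid) > ratio:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


RHO_LIQ, RHO_SOL = 2.7, 3.4        # assumed specific densities of melt and matrix
ZONE_LIQUID, RESIDUAL = 0.1, 0.01  # assumed liquid fraction of the zone and residual porosity
solutions = {}
for d in (0.001, 5.0):
    k_zone = enrichment_factor(ZONE_LIQUID, d, RHO_LIQ, RHO_SOL)
    k_res = enrichment_factor(RESIDUAL, d, RHO_LIQ, RHO_SOL)
    x = solve_l_over_ell(k_res / k_zone)
    solutions[d] = (k_res / k_zone, x)
    print(f"D = {d}: k_R/k_L = {k_res / k_zone:.4f}, L/ell = {x:.4f}, ell/L = {1.0 / x:.3f}")
```

With these numbers the incompatible element gives a positive $\ell$ shorter than $L$, while the compatible element gives $|\ell|$ several times $L$, consistent with the statement that compatible elements are nearly unaffected.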
9.4.5 Percolation and magma segregation
Percolation models differ from the zone-refining model essentially by the absence of mixing in the liquid, giving the liquid position-dependent properties. A simplified account of these models was described in Chapter 8. We will now provide a reasonably comprehensive account which may prove useful to the demanding reader, and then examine some properties of the chromatographic effect in a simple configuration. Let $\varphi$ be the open volume porosity of the medium, $\rho_{sol}$ and $\rho_{liq}$ the density of the solid matrix and melt, respectively, $\mathbf{v}_{liq}$ the liquid velocity relative to the matrix, and $C_{sol}^i$ and $C_{liq}^i$ the concentration of element $i$ in the matrix and melt, respectively. Let us rewrite equation (8.3.14) as
W
i
u
,
y
CJ-C^
d c h q 88
cpPliq + (l-cp)psoldCj/dCnqi
dt
hq hq
d(cppsol)
w , l q + (l-^)pMldCMlVdCllq'
dt
(9.4.34)
where, in order to keep the problem general, we temporarily keep the ratio $dC_{sol}^i/dC_{liq}^i$ on the denominators. The derivative $d(\varphi\rho_{sol})/dt$ on the right-hand side of (9.4.34) is a source term which corresponds to the rate of matrix conversion into melt. It should be expressed in unit mass per unit volume and unit time. Mass-fractions of melt, however, are more familiar to the geochemist. Using relation (9.3.22) between the mass-fraction $\phi$ of porosity and the volume fraction $\varphi$, the following expression is easily derived
4>
Pliq
which, once inserted into (9.4.34), gives
Significant simplification can be achieved through the rather innocuous assumption of constant p sol . Given the differential form of equation (9.3.22) PliqPso1
dcp = -
we obtain the fundamental transport equation
+
dt
v
l
i
q
g
r a d C
$ + (l-t)dCsoll/dCUql
l
i
x
;
0 + (l-0)dC Ml '/dC liq '
(/> + (l-0)pliq/pSoi dt (9.4.35)
It should be kept in mind that the reference frame is attached to the solid matrix. In order to solve this equation, solid-liquid fractionation for element $i$, the flow field $\mathbf{v}_{liq}$ and the rate of melting must be known. A common assumption for the dependence of $\phi$ on time is that of adiabatic decompression, while buoyancy forces drive the melt out of the solid matrix. One extreme model (McKenzie, 1984) assumes that melt is expelled at a rate controlled by the deformation of the matrix in the gravity field (compaction). By contrast, Ribe (1985) argues that compaction should be negligible and uses Darcy's law of porous flow in a moving medium (upwelling) together with some arbitrary limits on the depth of melting. A simple solution is arrived at for local equilibrium obeying Henry's law and constant porosity. It resembles the solution which was worked out in the advection section of Chapter 8 except that now it is expressed in terms of concentrations in the liquid

$$\frac{\partial C_{liq}^i}{\partial t} + \frac{\phi}{\phi + (1-\phi)D_i}\,\mathbf{v}_{liq}\cdot\mathrm{grad}\,C_{liq}^i = 0 \qquad (9.4.36)$$

An isopleth moves with an apparent rate $\mathbf{v}^i$ such that

$$\mathbf{v}^i = \frac{\phi}{\phi + (1-\phi)D_i}\,\mathbf{v}_{liq} = \varepsilon_i\,\mathbf{v}_{liq} \qquad (9.4.37)$$

where

$$\varepsilon_i = \frac{\phi}{\phi + (1-\phi)D_i} = \frac{\varphi\rho_{liq}}{\varphi\rho_{liq} + (1-\varphi)\rho_{sol}D_i} \qquad (9.4.38)$$

represents the fraction of element $i$ that resides locally in the liquid. The isopleth velocity is identical to the fluid velocity for incompatible elements ($D_i = 0$) and slower for compatible elements. In a one-dimensional frame ($z$-axis), the equation becomes

$$\frac{\partial C_{liq}^i}{\partial t} + v^i\,\frac{\partial C_{liq}^i}{\partial z} = 0 \qquad (9.4.39)$$

where the scalar component $v^i$ of the velocity along $z$ is now used. This is the linear traveling wave equation (e.g., Logan, 1987 and Chapter 8) which admits the general solution $f(z - v^i t)$ where $f$ is a function that depends on the conditions at $t = 0$ only. The concentration distribution at any time $t$ is simply the distribution at $t = 0$ shifted by the distance $v^i t$. For instance, if the initial distribution resembles a normal curve

(9.4.40)

where $z_0$, $\lambda$ and $C_0^i$ are constants, the general solution is

(9.4.41)
If, instead of keeping track of the concentration changes at a fixed level in the matrix, we follow a parcel of melt traveling with the velocity $v_{liq}$ ($z = v_{liq}t$), the concentration
Table 9.7. Ratio $v^i/v_{liq}$ during percolation of a melt with specific density 2.7 through a porous matrix with specific density 3.4 for different values of partition coefficient $D_i$ and porosity $\varphi$. Incompatible elements keep pace with the liquid, compatible elements lag significantly behind.

Porosity    Ratio v^i/v_liq for values of D_i =
            0        0.01     0.1      5
0.005       1.000    0.285    0.038    0.001
0.01        1.000    0.445    0.074    0.002
0.02        1.000    0.618    0.139    0.003
0.05        1.000    0.807    0.295    0.008
0.1         1.000    0.898    0.469    0.017
becomes f
rVi-Wv,. )-7~l2') (9.4.42)
Chromatographic fractionation upon percolation is expected to be an extremely efficient way of changing relative distributions of trace elements. The initial concentration distribution is therefore simply translated at the velocity of the liquid: steady flow and full equilibrium between the liquid and its matrix require that the amount of element transported by the concentration 'wave' is constant. In more realistic cases, either the flow is non-steady due to abrupt changes in fluid advection rate or porosity, or solid-liquid equilibrium is not achieved. These cases may lead to non-linear terms in the chromatographic equation (9.4.35) and unstable behavior. The rather complicated theory of these processes is beyond the scope of the present book.

Discuss the elemental fractionation induced by the percolation of a melt with specific density 2.7 through a porous matrix with specific density 3.4 for different values of $D_i$ and volume porosity $\varphi$. The ratio $v^i/v_{liq}$ is given in Table 9.7, which shows that the smaller the porosity, the more efficient the elemental fractionation. Assuming, as an illustration, that, at $t = 0$, element $i$ is normally distributed as a function of depth with $z_0 = 0$ and $\lambda = 1$, the spatial concentration distribution after a time $t = 2$ has been calculated from equation (9.4.41) and drawn in Figure 9.14 for various values of $D_i$. The more compatible the elements, the more they lag behind. Note the quite efficient separation of incompatible elements. An interesting property of trace-element ratios is their change around the initial value: since Nd is more incompatible than Sm, the Sm/Nd ratio is expected first to decrease and then increase below the initial Sm/Nd value as the liquid progresses in the rock column.
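The entries of Table 9.7 can be recomputed directly. The sketch below assumes that the velocity ratio is the fraction of the element held by the liquid, $\varphi\rho_{liq}/[\varphi\rho_{liq} + (1-\varphi)\rho_{sol}D]$, with the densities of the example (2.7 and 3.4); the function name is hypothetical.

```python
def velocity_ratio(porosity, d, rho_liq=2.7, rho_sol=3.4):
    """Fraction of the element residing in the liquid, taken as v_element / v_liquid."""
    in_liquid = porosity * rho_liq
    in_solid = (1.0 - porosity) * rho_sol * d
    return in_liquid / (in_liquid + in_solid)


porosities = (0.005, 0.01, 0.02, 0.05, 0.1)
coefficients = (0.0, 0.01, 0.1, 5.0)

print("porosity  " + "  ".join(f"D={d:<5}" for d in coefficients))
table = {}
for phi in porosities:
    row = [velocity_ratio(phi, d) for d in coefficients]
    table[phi] = row
    print(f"{phi:<8}  " + "  ".join(f"{v:7.3f}" for v in row))
```

Rounded to three decimals, the printed values match the table, which supports reading the velocity ratio as the liquid-held fraction of the element.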
9.4 Open magmatic systems
517
0.8 o
g o o
0.6
0.4
O
0.2
-2
-1
0
1
Reduced distance, z /A Figure 9.14 Chromatographic separation of elements with the same initial normal concentration (standard length A) and different bulk solid-liquid partition coefficient Dt through migration of a fluid in a medium of constant porosity q> at time t = 2 [equation (9.4.41)]. The pre-1987 literature on this topic is reviewed by Ribe (1987). McKenzie (1984) suggests that, since liquid looses contact with its source very rapidly, melt extraction upon compaction should be modelled by fractional melting. He later emphasized the role of residual porosity and used the continuous melting equations (McKenzie, 1985). Richter (1986) assumes gravitational compaction and through numerical schemes computes the apparent degree of melting that conventional models of batch-melting and fractional melting would hint at. Ribe (1985) recognizes that the batch-melting equation is solution to the steady-state percolation problem. Navon and Stolper (1987) investigate metasomatism by infiltration of basalts in peridotites and take diffusion in the solid into account, while Bodinier et al. (1990) and Vasseur et al. (1991) emphasize the kinetic role of grain-size heterogeneities. McKenzie (1985) and Spiegelman and Elliott (1993) investigate disequilibrium in the uranium series for compaction and steady-state models, respectively. A quite serious problem, however, still obscures most applications of the percolation theory to the transport of magmas. Most major elements, such as Si, Mg, Ca,... can be considered as compatible since their concentration in the peridotite source and the basaltic melt are similar within a factor of « 3 . Equation (9.4.37) indicates, as would equations (8.3.17) and (8.3.19) in the most general case, that major elements are slower than the liquid, especially for small porosities. But, what is the liquid made of, then? The velocity of a medium is the weighted average velocity of its constituents [see equation (8.1.4)]. 
The basalt velocity is that of Si, Mg, Ca, ... weighted by their abundance in the melt. The chemical velocity of the melt thus appears distinct from, and lower than, its physical velocity, which somehow violates common sense. Part of the answer is that, in order for the chromatographic model to apply, the liquid carrier should be chemically inert. If major elements are not to develop fronts in percolating basalts, it must be assumed that $dC^i_{sol}/dC^i_{liq}$ in equation (9.4.34) is virtually zero, although $C^i_{sol}$ is not, i.e., the liquid is well buffered and Henry's law does not apply. The rather stringent assumption behind trace-element transport by percolating magmas is therefore that major elements are inert relative to the porous matrix while trace elements are easily exchanged.
9.5 Which element, which process?
9.5.1 The good use of compatible and incompatible elements
Choosing the elements which are the most informative for a given situation requires some attention. In the context of magma genesis, the trace elements used for fusion processes, usually solid-dominated, should be different from those used for the crystallization processes, which are most commonly liquid-dominated. Let us try to assess how informative the melt concentrations of elements with given partition coefficients are for identifying processes, in particular fractional crystallization at low to moderate extents of fractionation, and partial melting with small melt fractions. We first derive the relative change dC/C (or, equivalently, d ln C) that an element with partition coefficient $D_i$ undergoes when the melt fraction changes by a quantity dF. Taking the differential form of the fractional crystallization equation (the Rayleigh law $C^i_{liq} = C^i_0 F^{D_i - 1}$) leads to

$$\frac{dC^i_{liq}}{C^i_{liq}\,dF} = \frac{D_i - 1}{F} \qquad (9.5.1)$$
For small extents of crystallization, the maximum change, and thereby the most valuable information on F, will be obtained from elements with high $D_i$ (compatible elements) such as Ni in basaltic olivine. Elements with $D_i \ll 1$ (incompatible elements), such as Th, Ba or rare-earth elements in basaltic systems, will provide basically no clue to F variations. In addition, information carried by incompatible elements, which do not fractionate with respect to each other, is entirely redundant. This is better shown by taking the relative change in the ratio of two elements i1 and i2 per increment of crystallization

$$\frac{d\left(C^{i1}_{liq}/C^{i2}_{liq}\right)}{\left(C^{i1}_{liq}/C^{i2}_{liq}\right)dF} = \frac{D_{i1} - D_{i2}}{F} \qquad (9.5.2)$$
In Figure 9.15, the relationship between the fractional change in the elemental ratio and the fraction of residual melt F is plotted for different values of $\Delta D = D_{i1} - D_{i2}$: for partition coefficients less than 0.1, several tens of percent fractionation are needed before a change of a few percent in the ratio becomes visible. Crystal fractionation does not change incompatible-element ratios such as La/Yb, Zr/Nb, ... except in extremely residual melts.
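The weak sensitivity of incompatible-element ratios to crystal fractionation can be checked numerically. The following is a minimal sketch, assuming the Rayleigh law $C^i_{liq} = C^i_0 F^{D_i - 1}$ with constant partition coefficients, so that the ratio $C^{i1}_{liq}/C^{i2}_{liq}$ changes by the factor $F^{D_{i1} - D_{i2}}$. The function names and the numerical values of the partition coefficients are illustrative choices, not taken from the text.

```python
def rayleigh_liquid(F, D, C0=1.0):
    """Melt concentration after fractional crystallization (Rayleigh law).

    F is the fraction of residual melt, D the bulk solid-liquid partition
    coefficient (assumed constant), C0 the initial melt concentration.
    """
    return C0 * F ** (D - 1.0)


def ratio_change(F, D1, D2):
    """Factor by which the melt ratio C1/C2 has changed when a fraction F remains.

    Equal to F**(D1 - D2), i.e. the integrated form of the relative change
    (D1 - D2)/F per increment dF.
    """
    return rayleigh_liquid(F, D1) / rayleigh_liquid(F, D2)


if __name__ == "__main__":
    # Hypothetical pairs of partition coefficients (illustrative values only).
    pairs = {
        "two incompatible elements": (0.01, 0.10),
        "compatible vs incompatible": (5.0, 0.10),
    }
    for label, (D1, D2) in pairs.items():
        for F in (0.9, 0.5, 0.1):
            factor = ratio_change(F, D1, D2)
            print(f"{label}: F = {F:.1f}, ratio multiplied by {factor:.3f}")
```

With these assumed values, removing half of the melt as crystals changes the ratio of the two incompatible elements by only a few percent, whereas the pair involving a compatible element is strongly fractionated after 10 percent crystallization.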
Figure 9.15 Fractionation of two trace elements i1 and i2 during fractional crystallization according to equation (9.5.2), plotted against the fraction of residual melt, F. $\Delta D$ is the difference $D_{i1} - D_{i2}$. Incompatible elements are not fractionated efficiently even for large extents of solid removal.
Taking the log-derivative of the partial melting equation (9.2.2) relative to F leads to

$$\frac{dC^i_{liq}}{C^i_{liq}\,dF} = -\frac{1 - D_i}{F + D_i(1 - F)} \qquad (9.5.3)$$
We will assume small degrees of melting, i.e., $F \ll 1$. Two extreme cases will be considered. For compatible elements ($D_i \gg F$)

$$\frac{dC^i_{liq}}{C^i_{liq}\,dF} \approx -\frac{1 - D_i}{D_i} \qquad (9.5.4)$$
which shows that the concentration of compatible elements does not change much with the degree of melting (buffering). For instance, Ni will be buffered by olivine during mantle melting. In contrast, for incompatible elements ($D_i \ll 1$), concentration changes as

$$\frac{dC^i_{liq}}{C^i_{liq}\,dF} \approx -\frac{1}{F + D_i} \qquad (9.5.5)$$
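The two limiting behaviours can be compared with the full expression (9.5.3). The sketch below assumes that the partial melting equation (9.2.2) is the batch-melting law $C^i_{liq} = C^i_0/[F + D_i(1 - F)]$, which is the form consistent with equation (9.5.3). The function names, the degree of melting and the partition coefficients are illustrative assumptions.

```python
import math


def batch_melt(F, D, C0=1.0):
    """Melt concentration for batch (equilibrium) partial melting (assumed form)."""
    return C0 / (F + D * (1.0 - F))


def log_derivative(F, D):
    """Relative change d ln C / dF of the melt concentration, equation (9.5.3)."""
    return -(1.0 - D) / (F + D * (1.0 - F))


def numerical_log_derivative(F, D, h=1e-6):
    """Centered finite-difference estimate, used only as a cross-check."""
    upper = math.log(batch_melt(F + h, D))
    lower = math.log(batch_melt(F - h, D))
    return (upper - lower) / (2.0 * h)


if __name__ == "__main__":
    F = 0.05  # small degree of melting (illustrative value)
    for D in (10.0, 0.1, 0.001):  # hypothetical partition coefficients
        exact = log_derivative(F, D)
        compatible_limit = -(1.0 - D) / D      # valid when D >> F
        incompatible_limit = -1.0 / (F + D)    # valid when D << 1 and F << 1
        print(f"D = {D}: exact {exact:.3f}, "
              f"-(1-D)/D = {compatible_limit:.3f}, "
              f"-1/(F+D) = {incompatible_limit:.3f}")
```

For the compatible case the relative change stays of order unity whatever F, while for a strongly incompatible element it is of order 1/F, which is the contrast the text draws between buffered and unbuffered elements.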
Incompatible-element concentrations will change very strongly with the degree of melting. For extremely small degrees of melting ($F \lesssim D_i$), incompatible-element ratios will also change, whereas at higher F they will tend to level off. Elements therefore have to be used for what they are good at. In a suite of rocks,
using compatible elements to decipher partial melting processes is futile because: (i) magma differentiation makes the primitive melt concentrations unattainable; (ii) even if this problem can be overcome, information on the degree of melting F is very poor. Similarly, when using incompatible elements to address fractionation processes: (i) subtle changes in the degree of melting which produced each parent magma will overwhelm the variations produced by crystal fractionation; (ii) even if this problem can be overcome, information on the extent of fractionation 1 - F is very poor.

So much for trace elements in melts. One may wonder what we can expect from solids, especially residual rocks. Incompatible elements are, by definition, drained preferentially into the melt and we should be rather suspicious about using these elements in residues and cumulates. When melts are produced upon melting of a source rock or when melts infiltrate through a porous layer, it seems quite unlikely that they may be quantitatively 'wrung out' of the wetted layer. The concept of melt trapped in source rocks has been used by Langmuir et al. (1977) to explain some geochemical features of the basalts from the FAMOUS area in the Atlantic. Residues are expected to recrystallize with the melt left behind (residual porosity). In addition to all the parameters which control melting processes in melts (melting process, melt fraction, residuum mineralogy, ...), the fraction of melt trapped by the source rocks is an extra degree of freedom which complicates the interpretation of incompatible elements in solids. Assuming for element i a solid-liquid partition coefficient $D_i$, the concentration in a hypothetical dry residue of melting or percolation and the concentration in a rock which would have trapped a fraction $f_{tr}$ of interstitial liquid are related by the mass balance

$$\frac{C^i_{rock}}{C^i_{residue}} = 1 - f_{tr} + \frac{f_{tr}}{D_i}$$
Assuming that half a percent liquid is trapped with the residue, equating the concentration in the rock to that in the solid residue will result in a severe bias for incompatible elements with $D_i \approx 0.01$ and in pure nonsense for elements with $D_i < 0.001$. A typical example is represented by rare-earth elements in peridotites. Even separated clinopyroxenes can be suspected to have incorporated most of the REE from whichever trace amounts of liquid happened to be trapped in the cooling rock. If the rest of the minerals do not take any REE, it is left to the reader as an exercise to show that the concentration in clinopyroxene after uptake of incompatible elements is related to that in the clinopyroxene from a liquid-free residual peridotite through

$$\frac{C^i_{cpx+tr}}{C^i_{cpx}} = \frac{1}{f_{cpx} + f_{tr}}\left(f_{cpx} + \frac{f_{tr}}{K^i_{cpx/liq}}\right)$$
where $K^i_{cpx/liq}$ is the clinopyroxene/liquid partition coefficient and $f_{cpx}$ the fraction of clinopyroxene in the rock. For $K^i_{cpx/liq}$ = 0.1, $f_{tr}$ = 0.5 percent, and $f_{cpx}$ = 10 percent, the concentration in the clinopyroxene contaminated by liquid is 43 percent larger than in the initial residual mineral. For $f_{tr}$ = 2 percent and $f_{cpx}$ = 5 percent, the error exceeds 350 percent.
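The orders of magnitude quoted above can be reproduced with a short calculation. This is a minimal sketch under two assumptions that are not spelled out in the text: the whole rock is a simple mixture of dry residue and trapped liquid in equilibrium with it, and the trapped liquid ends up entirely as clinopyroxene, which then hosts all of the element. Function names are hypothetical.

```python
def rock_over_residue(D, f_tr):
    """Whole-rock concentration relative to the melt-free residue.

    Assumed mass balance: (1 - f_tr) of residue plus f_tr of liquid whose
    concentration is that of the residue divided by D.
    """
    return (1.0 - f_tr) + f_tr / D


def cpx_contamination(K, f_cpx, f_tr):
    """Concentration in liquid-contaminated clinopyroxene relative to clean clinopyroxene.

    Assumes the trapped liquid (fraction f_tr) crystallizes as clinopyroxene
    and that no other mineral takes up the element.
    """
    return (f_cpx + f_tr / K) / (f_cpx + f_tr)


if __name__ == "__main__":
    # Bias on whole-rock data for half a percent of trapped liquid.
    for D in (0.1, 0.01, 0.001):
        print(f"D = {D}: rock/residue = {rock_over_residue(D, 0.005):.3f}")

    # The two clinopyroxene cases discussed in the text.
    print(f"case 1: {cpx_contamination(0.1, 0.10, 0.005):.2f}")
    print(f"case 2: {cpx_contamination(0.1, 0.05, 0.02):.2f}")
```

Under these assumptions the first clinopyroxene case gives a ratio of about 1.43, i.e. the 43 percent excess quoted in the text, and the second a ratio of about 3.57, i.e. a contaminated concentration that is roughly 357 percent of the original one.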
9.5.2 Elements and processes
The nature of melting and crystallization processes is largely unknown and much of what is described as field evidence is actually model-dependent. Because observations are dependent on what is assumed to represent a certain type of mantle (e.g., ophiolites), the respective roles of porous flow (McKenzie, 1984) and channel migration (e.g., Nicolas and Jackson, 1982; Nicolas, 1986) are not unambiguously established, and neither is the rheology of a molten mantle rock at high temperature. The role of regional stress in driving melts out of the pores depends on a mechanical model of source rocks and is largely a matter of speculation. For stability reasons, the persistence in the mantle of a liquid phase continuous over large vertical distances is rather problematic. Under certain conditions, solitary waves that propagate regions of high magma-filled porosity upwards are solutions to the percolation equation (Scott and Stevenson, 1986) and can be modeled by zone-refining. Ribe (1985) demonstrated that the equilibrium melting equation is a solution to the porous flow equation with no diffusion term, although the process is not 'batch' melting since melt migrates relative to the matrix. To his own 'surprise', Richter (1986) found that perfect equilibrium partial melting equations replicate quite well the results of an ideal model of melt segregation from a deformable matrix. This is probably because incompatible elements are fractionated by melt segregation for liquid-solid ratios which would also produce serious elemental fractionation in a static melting process. Evidence for fractionated incompatible-element ratios in a melt may suggest either small degrees of melting or aggregation of different melt batches (e.g., O'Hara, 1985), where at least one of them represents a small liquid/solid volume ratio.
For instance, the U-Th fractionation during basalt genesis demonstrated by Th isotope geochemistry (Condomines et al., 1981) hints at extremely dispersed melt in some part of the mantle source, although not necessarily through a percolation process. The equilibrium of melts with residual mantle is also currently under active research. From the U-shaped distributions of rare-earth elements observed in the ultramafic section of ophiolites, Prinzhofer and Allegre (1985) favored a model in which the melts under ridge crests are not equilibrated with the residual solid. Johnson et al. (1990) found that clinopyroxenes from mid-ocean ridge peridotites are too depleted in light REE relative to the heavy REE to be in equilibrium with MORB liquids. They conclude that melting must have been progressive and their observations support a mechanism of fractional melting. Although these findings turn out to be real breakthroughs for understanding melting processes, the utmost care is still necessary in interpreting rare-earth and other incompatible-element distributions in peridotites and their minerals. Likewise, an acceptable picture of magma chambers is available only for those magmas which differentiate at less than a few kilobars (mid-ocean ridge and continental flood basalts). Seismic evidence under Hawaii (Ryan, 1988) provides no more than a blurred image of mechanical events. Even in the best documented cases, animated controversies exist on many of their basic features such as the role of replenishment and crystal settling, persistent zoning, convection, the locus of crystallization and the interpretation of the seismic evidence. Analog models, such as syrup-and-dye-in-tank magma chambers, provide a useful illustration of potential factors. However, scaling of natural processes on man-made material is approximate,
geometry is dependent on ambiguous field observations, observables are few and their interpretation is oriented by rather strong prejudices on the nature and dynamics of magma bodies. Even though they may sound physically less informative, simple models are as illuminating as complicated constructions. Models with a large number of parameters are only about as good as our knowledge of the least-known parameters. Little can be said about the uniqueness of solutions: as mentioned in Chapters 4 and 5, the covariance structure of the parameter space strongly conditions the results. A considerable effort is still to be made in order to describe the range of acceptable 'realistic' models. In this context, Monte-Carlo simulations have not received the attention they deserve. In contrast, when the number of parameters is reasonably small, models can be tested accurately and stumbling blocks identified. Geochemistry offers four types of constraints: (a) elemental mass balance; (b) elemental distribution among various phases; (c) the rate of radioactive element decay; (d) the overwhelming consistency of geochemical patterns among major petrological units (the odds of predicting reasonably well the La/Yb of a normal MORB or the 87Sr/86Sr ratio of a sedimentary carbonate are extremely good). These constraints, however, rarely provide evidence for specific physical processes, but should remain the base of any calculation.
9.6 Disequilibrium fractionation during crystal growth
For all the previous models, equilibrium partitioning of elements among homogeneous phases has been assumed. Crystal growth and melting, however, are disequilibrium processes and distribution of elements between mineral and melt must therefore be considered in kinetic conditions. We may wonder how kinetics, i.e., the combination of crystal growth and diffusional transport, affects trace-element distributions between crystals and liquid. We consider a liquid (x > 0) and a reference frame with x = 0 at the mineral-liquid interface. Liquid therefore seems to move towards negative x values and to freeze upon crossing the interface. Crystal growth rate will be assumed to be constant with modulus v. Fractionation of the element i at the interface will be assumed to take place at equilibrium and be governed by a mineral-liquid partition coefficient $D_i$. Diffusion in the solid is neglected, as is volume change upon solidification. At steady-state and allowing for a negative advection rate, the one-dimensional transport equation reads

$$\mathcal{D}_i\frac{d^2C^i_{liq}}{dx^2} + v\frac{dC^i_{liq}}{dx} = 0 \qquad (9.6.1)$$
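where $\mathcal{D}_i$ is the diffusion coefficient of element i. Mass balance at the interface requires the equality of the diffusion and advection fluxes on the liquid side with the advection flux on the solid side

$$-\mathcal{D}_i\left(\frac{dC^i_{liq}}{dx}\right)_{x=0} - v\,C^i_{liq}(0) = -v\,C^i_{sol}(0) \qquad (9.6.2)$$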
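where, again, the advection rate is negative. Equilibrium fractionation at the interface requires

$$C^i_{sol} = D_i\,C^i_{liq} \qquad (9.6.3)$$

The condition at x = 0 is therefore

$$\left(\frac{dC^i_{liq}}{dx}\right)_{x=0} = \frac{v}{\mathcal{D}_i}\,C^i_{liq}(0)\,(D_i - 1)$$

It is convenient to introduce the new variable

$$u = \frac{dC^i_{liq}}{dx}$$

into the diffusion equation, which leads to the first-order differential equation

$$\mathcal{D}_i\frac{du}{dx} + v\,u = 0$$

Upon two successive integrations, we obtain

$$C^i_{liq} = \alpha\exp\left(-\frac{vx}{\mathcal{D}_i}\right) + \beta$$

where $\alpha$ and $\beta$ are two constants to be determined from the boundary conditions. Taking the derivative and applying the condition at x = 0, it becomes

$$\left(\frac{dC^i_{liq}}{dx}\right)_{x=0} = -\frac{v}{\mathcal{D}_i}\,\alpha = \frac{v}{\mathcal{D}_i}\,(\alpha + \beta)(D_i - 1)$$

Therefore

$$\beta = \alpha\,\frac{D_i}{1 - D_i}$$

and

$$C^i_{liq} = \alpha\left[\exp\left(-\frac{vx}{\mathcal{D}_i}\right) + \frac{D_i}{1 - D_i}\right] \qquad (9.6.4)$$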
One additional boundary condition being needed, two cases have been treated in the literature: (i) Tiller et al. (1953) assume the liquid medium is unbounded and therefore

$$C^i_{liq} \to C^i_0 \quad \text{as} \quad x \to \infty$$

where $C^i_0$ is the concentration in the liquid far from the interface, which results in

$$C^i_{liq} = C^i_0\left[1 + \frac{1 - D_i}{D_i}\exp\left(-\frac{vx}{\mathcal{D}_i}\right)\right] \qquad (9.6.5)$$

Figure 9.16 Kinetic fractionation during crystal growth. Steady-state distribution of melt concentrations in the vicinity of a solid growing at the rate v for trace elements with different solid-liquid fractionation coefficients [equation (9.6.5), Tiller et al. (1953)], plotted against the normalized distance to the interface, $vx/\mathcal{D}_i$. The stippled area indicates the steady-state chemical boundary-layer with thickness $\delta = \mathcal{D}_i/v$.
The concentration profiles for $D_i$ = 0.1 and $D_i$ = 5 have been depicted in Figure 9.16 as a function of the dimensionless distance $vx/\mathcal{D}_i$. Accumulation of incompatible elements and depletion of compatible elements in the vicinity of the interface are the remarkable features of this model. Concentrations at the interface are given by

$$C^i_{liq}(0) = \frac{C^i_0}{D_i}$$

and

$$C^i_{sol}(0) = C^i_0$$
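The steady-state profile of Tiller et al. (1953) is easy to evaluate. The sketch below assumes the solution $C^i_{liq}(x) = C^i_0[1 + (1 - D_i)/D_i \exp(-vx/\mathcal{D}_i)]$, which satisfies equation (9.6.1), the interface conditions and the far-field value $C^i_0$. The growth rate, the diffusivity and the function names are hypothetical values chosen only for illustration.

```python
import math


def tiller_liquid(x, D, v, diffusivity, C0=1.0):
    """Steady-state melt concentration at distance x ahead of the growing crystal."""
    return C0 * (1.0 + (1.0 - D) / D * math.exp(-v * x / diffusivity))


def tiller_solid_at_interface(D, v, diffusivity, C0=1.0):
    """Concentration of the solid forming at x = 0 (equilibrium with the local melt)."""
    return D * tiller_liquid(0.0, D, v, diffusivity, C0)


if __name__ == "__main__":
    v = 1.0e-9             # growth rate in m/s (hypothetical)
    diffusivity = 1.0e-11  # diffusion coefficient in m^2/s (hypothetical)
    delta = diffusivity / v  # chemical boundary-layer thickness in m

    for D in (0.1, 5.0):
        # Sample the profile at 0, 1, ..., 7 boundary-layer thicknesses.
        profile = [tiller_liquid(n * delta, D, v, diffusivity) for n in range(8)]
        print(f"D = {D}: ", [round(c, 3) for c in profile])
        print(f"  solid at interface: {tiller_solid_at_interface(D, v, diffusivity):.3f}")
```

For the incompatible case the melt at the interface is enriched by the factor $1/D_i$, for the compatible case it is depleted, and in both cases the solid forms with the far-field melt concentration.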
At steady-state, solid and liquid far from the interface tend to have the same concentration. Kinetic partitioning therefore brings apparent solid-liquid partition coefficients close to unity and decreases chemical fractionation. The concentration profile in the liquid at distance x from the interface also reads

$$C^i_{liq} - C^i_0 = \left[C^i_{liq}(0) - C^i_0\right]\exp\left(-\frac{vx}{\mathcal{D}_i}\right)$$
A convenient estimate of the anomalous layer thickness (chemical boundary layer) is given by $\delta = \mathcal{D}_i/v$. Indeed, the excess or deficit M of diffusing substance is equal to the area limited by the concentration profile and the initial distribution $C^i_0$ in the liquid

$$M = \int_0^\infty \left(C^i_{liq} - C^i_0\right)dx = \left[C^i_{liq}(0) - C^i_0\right]\int_0^\infty \exp\left(-\frac{vx}{\mathcal{D}_i}\right)dx = \left[C^i_{liq}(0) - C^i_0\right]\delta$$
The length scale $\delta$ is therefore equivalent to the thickness of a layer with uniform concentration $C^i_{liq}(0)$ which would hold the same excess or deficit M of diffusing substance as the growing system at steady-state (Figure 9.16).

(ii) Burton et al. (1953) consider the case where hydrodynamic conditions impose the concentration at a given distance L from the interface (e.g., rotating crystal growth during the industrial making of crystals)

$$C^i_{liq} = C^i_L \quad \text{at} \quad x = L$$

which results in

$$C^i_{liq} = C^i_L\,\frac{D_i + (1 - D_i)\exp(-vx/\mathcal{D}_i)}{D_i + (1 - D_i)\exp(-vL/\mathcal{D}_i)} \qquad (9.6.6)$$

which tends to Tiller et al.'s solution when L increases to infinity. Concentration in the solid at the interface is

$$C^i_{sol}(0) = \frac{D_i\,C^i_L}{D_i + (1 - D_i)\exp(-vL/\mathcal{D}_i)}$$
Since the denominator falls in the range $D_i$ to 1, concentration in the solid is closer to that of the liquid away from the interface than equilibrium fractionation would require. Again, disequilibrium partitioning during crystal growth decreases solid-liquid chemical fractionation.
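The way the solid composition moves from equilibrium towards that of the distant liquid can be illustrated by evaluating the ratio $C^i_{sol}(0)/C^i_L$. The sketch below assumes the Burton et al. (1953) boundary condition described above, which gives $C^i_{sol}(0)/C^i_L = D_i/[D_i + (1 - D_i)\exp(-vL/\mathcal{D}_i)]$. Calling this ratio an effective partition coefficient is a labelling choice made here, and the function name and numerical values are hypothetical.

```python
import math


def effective_partition(D, v, L, diffusivity):
    """Ratio of the solid concentration at the interface to the melt concentration at x = L.

    D is the equilibrium partition coefficient, v the growth rate,
    L the distance at which the melt concentration is imposed,
    diffusivity the diffusion coefficient in the melt.
    """
    return D / (D + (1.0 - D) * math.exp(-v * L / diffusivity))


if __name__ == "__main__":
    diffusivity = 1.0  # dimensionless units: only the group v*L/diffusivity matters
    L = 1.0
    for D in (0.1, 5.0):
        for v in (0.0, 0.5, 1.0, 3.0, 10.0):
            print(f"D = {D}, vL/diffusivity = {v:4.1f}: "
                  f"effective value = {effective_partition(D, v, L, diffusivity):.3f}")
```

When $vL/\mathcal{D}_i$ is small the ratio equals the equilibrium value $D_i$, and when it is large the ratio tends to 1, which is the limit of the unbounded-liquid solution.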
For kinetic disequilibrium partitioning of trace elements, equation (9.6.6) after Burton et al. (1953) is commonly presented as an alternative to equation (9.6.5) due to Tiller et al. (1953) (e.g., Magaritz and Hofmann, 1978; Lasaga, 1981; Shimizu, 1981; Walker and Agee, 1989). However, the relative values of viscosity and chemical diffusivity in common liquids and silicate melts make the momentum boundary-layer (i.e., the liquid film which sticks to the solid) orders of magnitude thicker than the chemical boundary layer. It is therefore quite unlikely that, except for rare cases of transient state, liquid from outside the momentum boundary-layer may encroach on the chemical boundary-layer, i.e., L may actually be taken as infinite. As a simple description of steady-state disequilibrium fractionation, the model of Tiller et al. (1953) has a much better physical rationale. A more elaborate discussion of these processes may be found in Tiller (1991a, b). The transient solution to this problem, briefly described in Section 8.5.9, has been worked out analytically by Smith et al. (1955) and Hulme (1955), whereas Albarede and Bottinga (1972) calculated numerical solutions for the case where a crystal grows out of a finite amount of melt and discussed the geological implications.
References
Ahlberg, J. H., Nilson, E. N. & Walsh, J. L. (1967). The Theory of Splines and their Applications. New York: Academic Press. Albarede, F. (1976). Some trace element relationships amongst liquid and solid phases in the course of the fractional crystallization of magmas. Geochim. Cosmochim. Ada, 40,667-73. Albarede, F. (1978). The recovery of spatial isotope distributions from stepwise degassing data. Earth Planet. Sci. Letters, 39, 387-97. Albarede, F. (1983). Inversion of batch melting equations and the trace element pattern of the mantle. J. Geophys. Res., 88, 10573-83. Albarede, F. (1985). Open magma chambers: regime and trace element evolution. Nature, 318, 356-58. Albarede, F. (1992). How deep do common basalts form and differentiate? J. Geophys. Res., 97, 10997-11009. Albarede, F. (1993). Residence time analysis of geochemical fluctuations in volcanic series. Geochim. Cosmochim. Ada, 57, 615-21. Albarede, F. & Bottinga, Y. (1972). Kinetic disequilibrium between phenocrysts and host lava. Geochim. Cosmochim. Ada, 36, 141-56. Albarede, F. & Brouxel, M. (1987). The Sm/Nd secular evolution of the continental crust and depleted mantle. Earth Planet. Sci. Letters, 82, 25-35. Albarede, F., Michard, A., Minster, J. F. & Michard, G. (1981). 87 Sr/ 86 Sr ratios in hydrothermal waters and deposits from the East Pacific Rise at 21°N. Earth Planet. Sci. Letters, 55, 229-36. Albarede, F. & Provost, A. (1977). Petrological and geochemical mass balance: an algorithm for least-squares fitting and general error analysis. Comp. Sci., 3, 309-26. Albarede, F. & Tamagnan, V. (1988). Modelling the recent evolution of the Piton de la Fournaise volcano, Reunion Island, 1931-1986. J. Petrol, 29, 997-1030. Alibert, C. & Carron, J.-P. (1980). Donnees experimentales sur le diffusion des elements majeurs entre verres ou liquides de compositions basaltique, rhyolitique et phonolitique, entre 9 0 0 C et 1300C, a pression ordinaire. Earth Planet. Sci. Letters, 47, 294-306. Allegre, C. J. 
(1982). Chemical geodynamics. Tectonophys., 81, 109-32. Allegre, C. J., Brevart, O., Dupre, B. & Minster, J.-F. (1980). Isotopic and chemical effects produced in a continuously differentiating convecting earth mantle. Phil. Trans. R. Soc. London, A297, 447-77. Allegre, C. J., Hamelin, B., Provost, A. & Dupre, B. (1987). Topology in isotopic multispace and origin of the mantle chemical heterogeneities. Earth Planet. Sci. Letters, 81, 319-37. Allegre, C. J., Hart, S. R. & Minster, J.-F. (1983). Chemical structure and evolution of the mantle and the continents determined by inversion of Nd and Sr isotopic data, I. Theoretical models. Earth Planet. Sci. Letters, 66, 177-90.
Allegre, C. J., Hart, S. R. & Minster, J.-F. (1983). Chemical structure and evolution of the mantle and the continents determined by inversion of Nd and Sr isotopic data, II. Numerical experiments and discussion. Earth Planet. Sci. Letters, 66, 191-213. Allegre, C. J. & Minster, J.-F. (1978). Quantitative models of trace element behavior in magmatic processes. Earth Planet. Sci. Letters, 38, 1-25. Allegre, C. J. & Rousseau, D. (1984). The growth of the continents through geological time studied by Nd isotope analysis of shales. Earth Planet. Sci. Letters, 67, 19-34. Allegre, C. J., Treuil, M., Minster, J.-F., Minster, B. & Albarede, F. (1977). Sytematic use of trace element in igneous process. Part I fractional crystallization processes in volcanic suites. Contrib. Mineral. Petrol., 60, 57-75. Allegre, C. J. & Turcotte, D. L. (1986). Implication of a two-component marble cake mantle. Nature, 323, 123-27. Anonymous (1982). Theory of Linear Systems. New York: Res. Edu. Ass. Arndt, N. T. (1977). Partitioning of nickel between olivine and ultrabasic and basic komatiitic liquids. Carnegie. Inst. Washington Year Book, 76, 553-57. Arnold, V. I. (1978). Ordinary Differential Equations. Cambridge: MIT Press. Backus, G. & Gilbert, F. (1967). Numerical applications of a formalism for geophysical inverse problems. Geophys. J. R. Astron. Soc, 13, 247-76. Barin, I. & Knacke, O. (1973). Thermochemical Properties of Inorganic Substances. Berlin: Springer-Verlag. Barling, J. & Goldstein, S. L. (1989). Extreme isotopic variations in Heard Island lavas and the nature of mantle reservoirs. Nature, 348, 59-62. Barnola, J. M., Raynaud, D., Korotkevich, Y. S. & Lorius, C. (1987). Vostok ice core provides 160,000-year record of atmospheric CO 2 . Nature, 329, 408-14. Bateman, H. (1910). Solution of a system of differential equations occurring in the theory of radioactive transformations. Proc. Cambridge Phil. Soc, 15, 423-27. Baumgartner, L. P. & Rumble III, D. (1988). 
Transport of stable isotopes: I: Development of a kinetic continuum theory for stable isotope transport. Contrib. Mineral. Petrol., 98, 417-30. Bear, J. (1972). Dynamics of Fluids in Porous Media. New York: Elsevier. Beattie, P. (1993). Uranium-thorium disequilibria and partitioning on melting of garnet peridotite. Nature, 363, 63-5. Benson, S. W. (1982). The Foundations of Chemical Kinetics. Malabar: Krieger. Berger, A. (1988). Milankovitch theory and climate. Rev. Geophys., 26, 624-57. Berger, W. H. & Heath, G. R. (1968). Vertical mixing in pelagic sediments. J. Mar. Res., 26, 134-43. Berner, E. K. & Berner, R. A. (1987). The Global Water Cycle. Englewood Cliffs: Prentice Hall. Berner, R. A. (1980). Early Diagenesis. A Theoretical Approach. Princeton: Princeton, University Press. Berner, R. A., Lasaga, A. C. & Garrels, R. M. (1983). The carbonate-silicate geochemical cycle and its effect on atmospheric carbon dioxide over the past 100 millon years. Amer. J. Sci., 283, 641-83. Bird, R. B., Stewart, W. E. & Lightfoot, E. N. (1960). Transport Phenomena. New York: John Wiley. Bodinier, J. L., Vasseur, G., Vernieres, J., Dupuy., C. & Fabries, J. (1990). Mechanisms of mantle metasomatism: Geochemical evidence from the Lherz orogenic peridotite. J. Petrol., 31, 597-628. Boger, P. D. & Faure, G. (1974). Strontium-isotope stratigraphy of a Red Sea core. Geol., 2, 181-83. Boher, M., Abouchami, W., Michard, A., Albarede, F. & Arndt, N. T. (1992). Crustal growth in West-Africa at 2.1 Ga. J. Geophys. Res., 97, 345-69.
Bowen, N. L. (1956). The Evolution of the Igneous Rocks. Dover. Bracewell, R. (1965). The Fourier Transform and its Applications. New York: McGraw Hill. Brady, J. B. (1975). Reference frames and diffusion coefficients. Amer. J. Sci., 275, 954-83. Brigham, E. O. (1974). The Fast Fourier Transform. Englewood Cliffs: Prentice Hall. Brinkley, S. R. (1946). Note on the conditions of equilibrium for systems of many consituents. / . Chem. Phys., 14, 563-64. Briqueu, L. & Lancelot, J. R. (1979). Rb-Sr systematics and crustal contamination models for calc-alkaline igneous rocks. Earth Planet. Sci. Letters, 43, 385-96. Broecker, W. S. (1974). Chemical Oceanography. New York: Harcourt Brace Jovanovich. Broecker, W. S. & Li, Y.-H. (1970). Interchange of water between the major oceans. J. Geophys. Res., 75, 3545-52. Broecker, W. S. & Peng, T.-H. (1982). Tracers in the Sea. New York: Eldigio. Broecker, W. S. & Takahashi, T. (1978). The relationship between lysocline depth and in situ carbonate ion concentration. Deep Sea Res., 25, 65-95. Brooks, C , Hart, S. R. & Wendt, I. (1972). Realistic use of two-error regression treatments as applied to rubidium-strontium data. Rev. Geophys. Space Phys., 10, 551-77. Brown, E. T , Edmond, J. M , Raisbeck, G. M., Yiou, F., Kurz, M. D. & Brook, E. J. (1991). Examination of surface exposure ages of Antarctica moraines using in situ produced 10 Be and 26A1. Geochim. Cosmochim. Ada, 55, 2269-83. Bruland, K. W. (1980). Oceanographic distributions of cadmium, zinc, nickel, and copper in the North Pacific. Earth Planet. Sci. Letters, 47, 176-98. Bryan, W. B., Finger, L. W. & Chayes, F. (1969). Estimating proportions in petrographic mixing equations by least-square approximations. Science, 163, 926-7. Burton, J. A., Prim, R. C. & Slichter, W. P. (1953). The distribution of solutes in crystals grown from the melt. Part I. Theoretical. / . Chem. Phys., 21, 1987-91. Candela, P. A. (1986). 
Generalized mathematical models for the fractional evolution of vapor from magmas in terrestrial planetary crusts. In Chemistry and Physics of Terrestrial Planets, ed. E. K. Saxena, pp. 362-96. NY: Springer. Caroff, M , Maury, R. C , Leterrier, J., Joron, J.-L., Cotten, J. & Guille, G. (1993). Trace element behavior in the alkali basalt-comenditic trachyte series from Mururoa Atoll, French Polynesia. Lithos, 30, 1-22. Carslaw, H. S. & Jaeger, J. C. (1959). Conduction of Heat in Solids. Oxford: Oxford University Press. Chadam, J. & Ortoleva, P. (1984). Moving interfaces and their stability: Applications to chemical waves and solidification. In Dynamics of Non-Linear Systems, ed. V. Hlavacek, pp. 247-78. New York: Gordon and Breach. Condomines, M., Morand, P. & Allegre, C. J. (1981). 23<>Th-238U ra( fi O active disequilibria in tholeiites from the FAMOUS zone (Mid-Atlantic Ridge, 36°50'N): Th and Sr isotopic geochemistry. Earth Planet. Sci. Letters, 55, 247-56. Corrigan, J. (1991). Inversion of apatite fission track data for thermal history information. J. Geophys. Res., 96, 10347-60. Cortini, M. & Hermes, O. D. (1981). Sr isotopic evidence for a multi-source origin of the potassic magmas in the Neapolitan area (S. Italy). Contrib. Mineral. Petrol., 11, 47-55. Cortini, M. & Scandone, R. (1982). The feeding system of Vesuvius between 1754 and 1944. J. Vole. Geotherm. Res., 12, 393. Cox, K. G., Bell, J. D. & Pankhurst, R. J. (1979). The Interpretation of Igneous Rocks. London: George Allen & Unwin. Cox, K. G., McKenzie, D. & White, R. S. (1993). Melting and Melt Movement in the Earth. Oxford: Oxford Univ. Press. Craig, H. (1961). Isotopic variations in meteoritic waters. Science, 133, 1702-3. Craig, H. (1969). Abyssal carbon and radiocarbon in the Pacific. J. Geophys. Res., 74,5491-506.
Craig, H. (1974). A scavenging model for trace elements in the deep sea. Earth Planet. Sci. Letters, 23, 149-59. Crank, J. (1976). The Mathematics of Diffusion. Oxford: Oxford University Press. Crawford, J. D. (1991). Introduction to bifurcation theory. Rev. Modern Phys., 63, 991-1037. Criss, R. E., Gregory, R. T. & Taylor, H. P., Jr (1987). Kinetic theory of oxygen isotopic exchange between minerals and water. Geochim. Cosmochim. Acta, 51, 1099-108. Dansgaard, W. (1964). Stable isotopes in precipitation. Tellus, 16, 436-68. Darken, L. S. (1948). Diffusion, mobility and their interrelation through free energy in binary metallic systems. Trans. AIME, 174, 184-94. Darken, L. S. & Gurry, R. W. (1953). Physical Chemistry of Metals. New York: McGraw-Hill, de Boor, C. (1978). A Practical Guide to Splines. New York: Springer. Deloule, E., France-Lanord, C. & Albarede, F. (1991). D/H analysis of minerals by ion probe. In Stable Isotope Geochemistry: A Tribute to Sam Epstein, ed. J. Taylor H. P., J. R. O'Neil & I. R. Kaplan, pp. 53-62. San Antonio: The Geochemical Society. Denbigh, K. (1968). The Principles of Chemical Equilibrium. Cambridge: Cambridge University Press. DePaolo, D. J. (1980). Crustal growth and mantle evolution: inferences from models of element transport and Nd and Sr isotopes. Geochim. Cosmochim. Acta, 44, 1185-96. DePaolo, D. J. (1981). Trace-element and isotopic effects of combined wallrock assimilation and fractional crystallization. Earth Planet. Sci. Letters, 53, 189-202. DePaolo, D. J. (1985). Isotopic studies of processes in mafic magma chambers. J. Petrol, 4, 925-51. DePaolo, D. J. & Ingram, B. L. (1985). High-resolution stratigraphy with strontium isotopes. Science, 227, 938-41. De Vault, D. (1943). The theory of chromatography. J. Amer. Chem. Soc, 65, 532^0. Dodson, M. H. (1973). Closure temperature in cooling geochronological and petrological systems. Contrib. Mineral. Petrol., 40, 259-74. Doerner, H. A. & Hoskins, W. M. (1925). 
Coprecipitation of radium and barium sulfates. J. Amer. Chem. Soc, 47, 662-75. Dudewicz, E. J. & Mishra, S. N. (1988). Modern Mathematical Statistics. New York: John Wiley. Dupre, B. & Allegre, C. J. (1983). Pb-Sr isotope variation in Indian Ocean basalts and mixing phenomena. Nature, 303, 142-6. Eberhardt, P., Geiss, J., Graf, H., Gr*gler, N., Krahenbuhl, U., Schwaller, H., Schwarzmiiller, J. & Stettler, A. (1970). Trapped solar wind noble gases, exposure age and K/Ar-age in Apollo 11 lunar fine material. Proc. Apollo 11 Lunar Sci. Conf, 2,1037-70. Faure, G. (1986). Principles of Isotope Geology. New York: John Wiley. Feigenson, M. D. & Carr, M. J. (1993). The source of Central American lavas: inferences from geochemical inverse modeling. Contrib. Mineral. Petrol, 113, 226-34. Feinn, D., Ortoleva, P., Scalf, W., Schmidt, S. & Wolff, M. (1978). Spontaneous pattern formation in precipitating systems. / . Chem. Phys., 69, 27-39. Finlayson, B. A. (1972). The Method of Weighted Residuals and Variational Principles. New York: Academic Press. Fleck, R. J. & Criss, R. E. (1985). Strontium and oxygen isotopic variations in Mesozoic and Tertiary plutons of Central Idaho. Contrib. Mineral. Petrol, 90, 291-308. Fletcher, C. A. J. (1991). Computational Techniques for Fluid Dynamics. Volume I: Fundamental and General Techniques. Berlin: Springer-Verlag. Fletcher, R. (1987). Practical Methods of Optimization. Chichester: John Wiley. Flicker, M. & Ross, J. (1974). Mechanism of chemical instability for periodic precipitation phenomena. / . Chem. Phys., 60, 3458-65. Foland, K. A. (1974). Ar 40 diffusion in homogeneous orthoclase and an interpretation of Ar diffusion in K-feldspars. Geochim. Cosmochim. Acta, 38, 151-66.
Fourcade, S. & Allegre, C. J. (1981). Trace-element behavior in granite genesis: A case study. The calc-alkaline plutonic association from the Querigut complex (Pyrenees, France). Contrib. Mineral. Petrol., 76, 177-95.
Francis, D. (1985). The Baffin Bay lavas and the value of picrites as analogues of primary magmas. Contrib. Mineral. Petrol., 89, 144-54.
Gast, P. W. (1968). Trace element fractionation and the origin of tholeiitic and alkaline magma types. Geochim. Cosmochim. Acta, 32, 1057-86.
Ghiorso, M. S. (1985a). Chemical mass transfer in magmatic processes. I. Thermodynamic relations and numerical algorithms. Contrib. Mineral. Petrol., 90, 107-20.
Ghiorso, M. S. (1985b). Chemical mass transfer in magmatic processes. II. Applications in equilibrium crystallization, fractionation and assimilation. Contrib. Mineral. Petrol., 90, 1021-41.
Grandjean, P. (1989). Les terres rares et la composition isotopique du neodyme dans les phosphates biogenes: traceurs des processus paleo-oceanographiques et sedimentaires. Inst. Natl. Polytechn. Lorraine Ph.D., Nancy.
Gray, P. & Scott, S. K. (1994). Chemical Oscillations and Instabilities. Oxford: Oxford University Press.
Greenland, L. P. (1970). An equation for trace element distribution during magmatic crystallization. Amer. Mineral., 55, 455-65.
Grossman, L. (1972). Condensation in the primitive solar nebula. Geochim. Cosmochim. Acta, 36, 597-619.
Grossman, L. & Larimer, J. W. (1974). Early chemical history of the solar system. Rev. Geophys. Space Phys., 12, 71-101.
Guinasso, N. L., Jr. & Schink, D. R. (1975). Quantitative estimates of biological mixing rates in abyssal sediments. J. Geophys. Res., 80, 3032-43.
Guy, B. (1984). Contribution to the theory of infiltration metasomatic zoning; the formation of sharp fronts: a geometrical model. Bull. Mineral., 107, 93-105.
Hackbusch, W. (1985). Multi-Grid Methods and Applications. New York: Springer.
Hagen, H. & Neumann, E.-R. (1990). Modeling of trace-element distribution in magma chambers using open-system models. Comput. Geosci., 16, 549-56.
Haken, J. (1978). Synergetics. An Introduction. Nonequilibrium Phase Transitions and Self-Organization in Physics, Chemistry and Biology. Berlin: Springer-Verlag.
Hall, A. (1987). Igneous Petrology. Harlow: Longman.
Hamelin, B., Manhes, G., Albarede, F. & Allegre, C. J. (1985). Precise lead isotope measurements by the double spike technique: a reconsideration. Geochim. Cosmochim. Acta, 49, 173-82.
Hamilton, W. C. (1964). Statistics in Physical Science. New York: Ronald.
Harris, P. G. (1957). Zone-refining and the origin of potassic basalts. Geochim. Cosmochim. Acta, 12, 195-208.
Harrison, T. M. (1981). Diffusion of 40Ar in hornblende. Contrib. Mineral. Petrol., 78, 324-31.
Hart, S. R. (1984). A large-scale isotope anomaly in the Southern Hemisphere mantle. Nature, 309, 753-7.
Hart, S. R. & Davis, K. E. (1978). Nickel partitioning between olivine and silicate melt. Earth Planet. Sci. Letters, 40, 203-19.
Hart, S. R., Hauri, E. H., Oschmann, L. A. & Whitehead, J. A. (1992). Mantle plumes and entrainment: isotopic evidence. Science, 256, 517-20.
Hart, S. R. & Zindler, A. (1989). Isotope fractionation laws: A test using calcium. Int. J. Mass Spectr. Ion Proc., 89, 287-301.
Hertogen, J. & Gijbels, R. (1976). Calculation of trace element fractionation during partial melting. Geochim. Cosmochim. Acta, 40, 313-22.
Hilliard, J. E. (1970). Spinodal decomposition. In Phase Transformations, pp. 497-560. Metals Park: Amer. Soc. Metals.
Hirose, K. & Kushiro, I. (1993). Partial melting of dry peridotites at high pressures: Determination of compositions of melts segregated from peridotite using aggregates of diamond. Earth Planet. Sci. Letters, 114, 477-89.
Hodell, D. A. & Cieselski, P. F. (1991). Stable isotopic and carbonate stratigraphy of the Late Pliocene and Pleistocene of Hole 704A: eastern subantarctic South Atlantic. Proc. ODP Sci. Results, 114, 409-35.
Hoel, P. G., Port, S. C. & Stone, C. J. (1971). Introduction to Probability Theory. Boston: Houghton Mifflin.
Hoffman, N. R. A. & McKenzie, D. P. (1985). The destruction of geochemical heterogeneities by differential fluid motions during mantle convection. Geophys. J. R. Astron. Soc., 1985, 163-206.
Hofmann, A. (1971). Fractionation corrections for mixed-isotope spikes of Sr, K, and Pb. Earth Planet. Sci. Letters, 10, 397-402.
Hofmann, A. W. (1972). Chromatographic theory of infiltration metasomatism and its application to feldspars. Amer. J. Sci., 292, 69-80.
Hofmann, A. W. (1988). Chemical differentiation of the Earth: the relationship between mantle, continental crust, and oceanic crust. Earth Planet. Sci. Letters, 90, 297-314.
Hofmann, A. W. & Feigenson, M. D. (1983). Case studies on the origin of basalt: I. Theory and reassessment of Grenada basalts. Contrib. Mineral. Petrol., 84, 382-9.
Hofmann, A. W. & Hart, S. R. (1978). An assessment of local and regional isotopic equilibrium in the mantle. Earth Planet. Sci. Letters, 38, 44-62.
Holland, H. D. (1978). The Chemistry of the Atmospheres and Oceans. New York: Wiley.
Hulme, K. F. (1955). On the distribution of impurity in crystals grown from impure unstirred melt. Proc. Phys. Soc., 68, 393-9.
Irvine, T. N. (1977). Definition of primitive liquid compositions for basic magmas. Carnegie Inst. Washington Year Book, 76, 454-61.
Irving, A. J. (1978). A review of experimental studies of crystal/liquid trace-element partitioning. Geochim. Cosmochim. Acta, 42, 743-70.
Irving, A. J. & Frey, F. A. (1984). Trace-element abundances in megacrysts and their host basalts: Constraints on partition coefficients and megacryst genesis. Geochim. Cosmochim. Acta, 48, 1201-21.
Jackson, D. D. (1972). Interpretation of inaccurate, insufficient and inconsistent data. Geophys. J. R. Astron. Soc., 28, 97-110.
Jacobsen, S. B. & Wasserburg, G. J. (1979). The mean age of mantle and crustal reservoirs. J. Geophys. Res., 84, 7411-27.
Johnson, K. T. M., Dick, H. J. B. & Shimizu, N. (1990). Melting in the oceanic upper mantle: an ion microprobe study of diopsides in abyssal peridotites. J. Geophys. Res., 95, 2661-78.
Johnson, R. A. & Wichern, D. W. (1982). Applied Multivariate Statistical Analysis. Englewood Cliffs: Prentice-Hall.
Jouzel, J., Lorius, C., Petit, J. R., Genthon, C., Barkov, N. I., Kotlyakov, V. M. & Petrov, V. M. (1987). Vostok ice core: a continuous isotope temperature record over the last climatic cycle (160,000 years). Nature, 239, 403-8.
Junge, C. E. (1974). Residence variability of tropospheric trace gases. Tellus, 26, 477-88.
Juteau, M., Michard, A. & Albarede, F. (1986). The Pb-Sr-Nd isotope geochemistry of some recent circum-Mediterranean granites. Contrib. Mineral. Petrol., 92, 331-40.
Kahlweit, M. (1965). The structure of a precipitate as determined by the interplay of nucleation, growth and ageing. Prog. Chem. Solids, 2, 134-74.
Keir, R. S. & Berger, W. H. (1983). Atmospheric CO2 content in the last 120,000 years: The phosphate extraction model. J. Geophys. Res., 88, 6027-38.
Kent, J. T., Watson, G. S. & Onstott, T. C. (1990). Fitting straight lines and planes with an
application to radiometric dating. Earth Planet. Sci. Letters, 97, 1-17.
Kinzler, R. J., Grove, T. L. & Recca, S. I. (1990). An experimental study of the effect of temperature and melt composition on the partitioning of nickel between olivine and silicate melt. Geochim. Cosmochim. Acta, 54, 1255-65.
Kirkaldy, J. S. & Young, D. J. (1987). Diffusion in the Condensed State. London: The Institute of Metals.
Korzhinskii, D. S. (1970). Theory of Metasomatic Zoning. Oxford: Clarendon Press.
Lal, D. (1988). In situ-produced cosmogenic isotopes in terrestrial rocks. Ann. Rev. Earth Planet. Sci., 16, 355-88.
Lancelot, J. R. & Allegre, C. J. (1974). Origin of carbonatitic magmas in the light of Pb-U-Th isotope system. Earth Planet. Sci. Letters, 22, 233-8.
Lanczos, C. (1961). Linear Differential Operators. London: Van Nostrand.
Langmuir, C. H. (1989). Geochemical consequences of in situ crystallization. Nature, 340, 199-205.
Langmuir, C. H., Bender, J. F., Bence, A. E., Hanson, G. N. & Taylor, S. R. (1977). Petrogenesis of basalts from the FAMOUS area: Mid-Atlantic Ridge. Earth Planet. Sci. Letters, 36, 133-56.
Langmuir, C. H., Vocke, R. D., Hanson, G. N. & Hart, S. R. (1978). A general mixing equation with applications to Icelandic basalts. Earth Planet. Sci. Letters, 37, 380-92.
Larimer, J. W. (1967). Chemical fractionations in meteorites - I. Condensation of the elements. Geochim. Cosmochim. Acta, 31, 1215-38.
Lasaga, A. (1980). The kinetic treatment of geochemical cycles. Geochim. Cosmochim. Acta, 44, 815-28.
Lasaga, A. (1981a). Implications of a concentration-dependent growth rate on the boundary layer crystal-melt model. Earth Planet. Sci. Letters, 56, 429-34.
Lasaga, A. C. (1981b). Dynamic treatment of geochemical cycles. In Kinetics of Geochemical Processes, ed. A. C. Lasaga & R. J. Kirkpatrick, pp. 69-110. Washington: Miner. Soc. Amer.
Lasaga, A. C., Berner, R. A. & Garrels, R. M. (1985). An improved geochemical model of atmospheric CO2 fluctuations over the past 100 million years. In The Carbon Cycle and Atmospheric CO2: Natural Variations Archean to Present, ed. E. T. Sundquist & W. S. Broecker, pp. 397-411. Washington: American Geophysical Union.
Leon, S. J. (1990). Linear Algebra with Applications. New York: Maxwell-Macmillan.
Li, Y.-H. (1981). Ultimate removal mechanisms of elements from the ocean. Geochim. Cosmochim. Acta, 45, 1659-64.
Li, Y.-H. (1982). A brief discussion on the mean oceanic residence time of elements. Geochim. Cosmochim. Acta, 46, 2671-5.
Lichtner, P. C. (1985). Continuum model for simultaneous chemical reactions and mass transport in hydrothermal systems. Geochim. Cosmochim. Acta, 49, 779-800.
Lloyd, E. (1980). Handbook of Applicable Mathematics. Vol. II: Probability. Chichester: John Wiley.
Logan, J. D. (1987). Applied Mathematics: A Contemporary Approach. New York: John Wiley.
Lomb, N. R. (1976). Least-squares frequency analysis of unequally spaced data. Astrophys. Space Sci., 39, 447-62.
Lovera, O. M., Richter, F. M. & Harrison, T. M. (1989). The 40Ar/39Ar thermochronometry for slowly cooled samples having a distribution of diffusion domain sizes. J. Geophys. Res., 94, 17917-35.
Lovett, R., Ortoleva, P. & Ross, J. (1978). Kinetic instabilities in first-order phase transitions. J. Chem. Phys., 69, 947-55.
Ludwig, K. R. (1980). Calculation of uncertainties of U-Pb isotope data. Earth Planet. Sci. Letters, 46, 212-20.
Magaritz, M. & Hofmann, A. W. (1978). Diffusion of Sr, Ba, and Na in obsidian. Geochim. Cosmochim. Acta, 42, 595-605.
Mantovani, M. S. M. & Hawkesworth, C. J. (1990). An inversion approach to assimilation and fractional crystallization processes. Contrib. Mineral. Petrol., 105, 289-302.
Marquardt, D. W. (1963). An algorithm for least square estimation of non-linear parameters. SIAM J., 11, 431-41.
McBirney, A. R. (1984). Igneous Petrology. San Francisco: Freeman Cooper.
McCulloch, M. T., Gregory, R. T., Wasserburg, G. J. & Taylor Jr., H. P. (1981). Sm-Nd, Rb-Sr, and 18O/16O isotopic systematics in an ancient crustal section: Evidence from the Samail ophiolite. J. Geophys. Res., 86, 2721-35.
McIntire, W. L. (1963). Trace-element partition coefficients - a review of theory and applications to geology. Geochim. Cosmochim. Acta, 27, 1209-64.
McIntyre, G. A., Brooks, C., Compston, W. & Turek, A. (1966). The statistical assessment of Rb-Sr isochrons. J. Geophys. Res., 71, 5459-68.
McKenzie, D. (1984). The generation and compaction of partially molten rocks. J. Petrol., 25, 713-65.
McKenzie, D. (1985). 230Th-238U disequilibrium and the melting processes beneath ridge axes. Earth Planet. Sci. Letters, 72, 149-57.
Menard, H. W. & Smith, S. M. (1966). Hypsometry of ocean basin provinces. J. Geophys. Res., 71, 4305-25.
Meyer, C. (1977). Petrology, mineralogy and chemistry of KREEP basalt. Phys. Chem. Earth, 10, 239-60.
Michard, A. & Albarede, F. (1986). The REE content of some hydrothermal fluid. Chem. Geol., 55, 51-60.
Michard, A., Gurriet, P., Soudan, M. & Albarede, F. (1985). Nd isotopes in French Phanerozoic shales: external vs. internal aspects of crustal evolution. Geochim. Cosmochim. Acta, 49, 601-10.
Michard, G. (1989). Equilibres Chimiques dans les Eaux Naturelles. Paris: Publisud.
Michel, H. V., Asaro, F., Alvarez, W. & Alvarez, L. W. (1990). Geochemical studies of the Cretaceous-Tertiary boundary in ODP holes 689B and 690C. Proc. ODP Sci. Res., 113, 159-68.
Minster, J.-F. & Allegre, C. J. (1977). Systematic use of trace elements in igneous processes. Part III: inverse problem of partial melting. Contrib. Mineral. Petrol., 68, 37-52.
Minster, J.-F., Ricart, L.-P. & Allegre, C. J. (1979). 87Rb-87Sr geochronology of enstatite meteorites. Earth Planet. Sci. Letters, 42, 333-47.
Mitchell, A. R. (1969). Computational Methods in Partial Differential Equations. London: John Wiley.
Morel, F. M. M. & Hering, J. G. (1993). Principles and Applications of Aquatic Chemistry. New York: John Wiley.
Morse, P. M. & Feshbach, H. (1953). Methods of Theoretical Physics. New York: McGraw-Hill.
Murray, J. W., Grundmanis, V. & Smethie, W. M. (1978). Interstitial water chemistry in the sediments of Saanich Inlet. Geochim. Cosmochim. Acta, 42, 1011-26.
Navon, O. & Stolper, E. (1987). Geochemical consequences of melt percolation: the upper mantle as a chromatographic column. J. Geol., 95, 285-307.
Neuman, H., Mead, J. & Vitaliano, C. J. (1954). Trace-element variation during fractional crystallization as calculated from the distribution law. Geochim. Cosmochim. Acta, 6, 90-100.
Nicolas, A. (1986). A melt extraction model based on structural studies in mantle peridotites. J. Petrol., 27, 999-1022.
Nicolas, A. & Jackson, M. (1982). High-temperature dikes in peridotites: origin by hydraulic fracturing. J. Petrol., 23, 568-82.
Noyes, R. M. & Field, R. J. (1974). Oscillatory chemical reactions. Ann. Rev. Phys. Chem., 25, 95-119.
O'Hara, M. J. (1977). Geochemical evolution during fractional crystallization of a periodically refilled magma chamber. Nature, 266, 503-7.
O'Hara, M. J. (1985). Importance of the 'shape' of the melting regime during partial melting of the mantle. Nature, 314, 58-62.
O'Hara, M. J. & Mathews, R. E. (1981). Geochemical evolution in an advancing, periodically replenished, periodically tapped, continuously fractionated magma chamber. J. Geol. Soc. London, 138, 237-77.
O'Nions, R. K., Evensen, N. M. & Hamilton, P. J. (1979). Geochemical modeling of mantle differentiation and crustal growth. J. Geophys. Res., 84, 6091-101.
O'Nions, R. K. & Powell, R. (1977). The thermodynamics of trace-element distribution. In Thermodynamics in Geology, ed. D. G. Fraser, pp. 349-63. Dordrecht: Reidel.
Ortoleva, P. (1984). The self-organization of Liesegang Bands and other precipitate patterns. In Chemical Instabilities: Applications in Chemistry, Engineering, Geology and Material Science, ed. G. Nicolis & F. Baras, pp. 289-97. Dordrecht: Reidel.
Ortoleva, P. J. (1994). Geochemical Self-Organization. Oxford: Oxford University Press.
Ozisik, M. N. (1968). Boundary Value Problems of Heat Conduction. Scranton: Intern. Textbook. Co.
Ottino, J. M. (1989). The Kinematics of Mixing: Stretching, Chaos, and Transport. Cambridge: Cambridge University Press.
Palmer, M. R. & Edmond, J. M. (1989). The strontium isotope budget of the modern ocean. Earth Planet. Sci. Letters, 92, 11-26.
Papoulis, A. (1984). Probability, Random Variables and Stochastic Processes. Kosaido: McGraw-Hill.
Parker, R. L. (1977). Understanding inverse theory. Ann. Rev. Earth Planet. Sci., 5, 35-64.
Pearce, J. A. & Cann, J. R. (1973). Tectonic setting of basic volcanic rocks determined using trace element analyses. Earth Planet. Sci. Letters, 19, 290-300.
Pearce, T. H. (1978). Olivine fractionation equations for basaltic and ultrabasic liquids. Nature, 276, 771-4.
Pfann, W. G. (1952). Principles of zone-melting. Trans. AIME, 194, 747-53.
Phillips, O. M. (1991). Flow and Reactions in Permeable Rocks. Cambridge: Cambridge University Press.
Prager, S. (1956). Periodic precipitation. J. Chem. Phys., 25, 279-83.
Presnall, D. C. (1969). The geometric analysis of partial fusion. Amer. J. Sci., 267, 1178-94.
Press, W. H., Flannery, B. P., Teukolsky, S. A. & Vetterling, W. T. (1986). Numerical Recipes: The Art of Scientific Computing. Cambridge: Cambridge University Press.
Prinzhofer, A. & Allegre, C. J. (1985). Residual peridotites and the mechanism of partial melting. Earth Planet. Sci. Letters, 74, 251-65.
Procaccia, I. & Ross, J. (1978). Stability and relative stability in reactive systems far from equilibrium. II. Kinetic analysis of relative stability of multiple stationary states. J. Chem. Phys., 67, 5565-71.
Provost, A. (1990). An improved diagram for isochron data. Isot. Geosci., 80, 85-99.
Provost, A. & Allegre, C. J. (1979). Process identification and search for optimal parameters from major-element data. General presentation with emphasis on the fractional crystallization process. Geochim. Cosmochim. Acta, 43, 487-501.
Rayleigh, J. W. S. (1896). Theoretical considerations respecting the separation of gases by diffusion and similar processes. Phil. Mag., 42, 77-107.
Raymo, M. E., Ruddiman, W. F. & Froelich, P. N. (1988). Influence of late Cenozoic mountain building on ocean geochemical cycles. Geology, 16, 649-53.
Reed, M. H. (1982). Calculation of multicomponent chemical equilibria and reaction processes in systems involving minerals, gases and an aqueous phase. Geochim. Cosmochim. Acta, 46, 513-28.
Ribe, N. M. (1985). The generation and composition of partial melts in the earth's mantle. Earth Planet. Sci. Letters, 73, 361-76.
Ribe, N. M. (1987). Theory of melt segregation - A review. J. Volc. Geoth. Res., 33, 241-53.
Richardson, S. M. & McSween, H. Y., Jr (1989). Geochemistry: Pathways and Processes. Englewood Cliffs: Prentice Hall.
Richter, F. M. (1986). Simple models for trace-element fractionation during melt segregation. Earth Planet. Sci. Letters, 11, 333-44.
Richter, F. M., Bowley, D. B. & DePaolo, D. J. (1992). Sr isotope evolution of seawater: the role of tectonics. Earth Planet. Sci. Letters, 109, 11-23.
Richter, F. M. & Ribe, N. M. (1979). On the importance of advection in determining the local isotopic composition of the mantle. Earth Planet. Sci. Letters, 43, 212-22.
Robie, R. A., Hemingway, B. S. & Fisher, J. R. (1978). Thermodynamic properties of minerals and related substances at 298.15 K and 1 bar (10^5 pascals) pressure and at higher temperatures. U.S. Geol. Surv., 1452, 1-456.
Roeder, P. L. & Emslie, R. F. (1970). Olivine-liquid equilibrium. Contrib. Mineral. Petrol., 29, 275-89.
Rosen, J. B. (1961). The gradient projection method for non-linear programming, Part II, Non-linear constraints. J. Soc. Indus. Appl. Math., 9, 414-32.
Ruddiman, W. F. & Glover, L. K. (1972). Vertical mixing of ice-rafted volcanic ash in North-Atlantic sediments. Geol. Soc. Amer. Bull., 83, 2817-36.
Russel, W. A., Papanastassiou, D. A. & Tombrello, T. A. (1978). Ca isotope fractionation on the Earth and other solar system materials. Geochim. Cosmochim. Acta, 42, 1075-90.
Ryan, M. P. (1988). The mechanics and three-dimensional internal structure of active magmatic systems: Kilauea volcano, Hawaii. J. Geophys. Res., 93, 4213-48.
Salters, V. J. M. & Hart, S. R. (1989). The Hf-paradox, and the role of garnet in the MORB source. Nature, 342, 420-22.
Sandwell, D. T. (1987). Biharmonic spline interpolation of GEOS-3 and SEASAT altimeter data. Geophys. Res. Letters, 2, 139-42.
Saxena, S. K. & Eriksson, G. (1983). Theoretical computations of mineral assemblages in pyrolite and lherzolite. J. Petrol., 24, 538-55.
Scargle, J. D. (1982). Studies in astronomical time series analysis. II. Statistical aspects of spectral analysis of unevenly spaced data. Astroph. J., 263, 835-53.
Scheid, F. (1968). Theory and Problems of Numerical Analysis. New York: McGraw-Hill.
Schilling, J.-G. & Winchester, J. W. (1967). Rare-earth fractionation and magmatic processes. In Mantles of Earth and Terrestrial Planets, ed. S. K. Runcorn, pp. 267-83. New York: Interscience.
Scott, D. R. & Stevenson, D. J. (1986). Magma ascent by porous flow. J. Geophys. Res., 91, 9283-96.
Seber, G. A. F. (1984). Multivariate Observations. New York: John Wiley.
Sen, A. & Srivastava, M. (1990). Regression Analysis. Theory, Methods, and Applications. New York: Springer-Verlag.
Shackelton, N. J. & Kennett, J. P. (1975). Paleotemperature history of the Cenozoic and the initiation of Antarctic glaciation: oxygen and carbon isotope analyses in DSDP Sites 277, 279 and 281. In Init. Rept. Deep Sea Drilling Project, pp. 743-55. Washington D.C.: U.S. Government Printing Office.
Shaw, D. M. (1970). Trace-element fractionation during anatexis. Geochim. Cosmochim. Acta, 34, 237-43.
Shimizu, N. (1981). Trace-element incorporation into growing augite phenocryst. Nature, 289, 575-7.
Sleep, N. H. (1976). Segregation of magma from a mostly crystalline mush. Geol. Soc. Amer. Bull., 85, 1225-32.
Smith, W. R. & Missen, R. W. (1982). Chemical Reaction Equilibrium Analysis. New York: John Wiley.
Smith, V. G., Tiller, W. A. & Rutter, J. W. (1955). A mathematical analysis of solute redistribution during solidification. Canad. J. Phys., 33, 723-45.
Sneddon, I. N. (1957). Elements of Partial Differential Equations. New York: McGraw-Hill.
Sobolev, A. V. & Shimizu, N. (1992). Ultra-depleted melts and permeability of oceanic mantle (in Russian). Dokl. Acad. Sci. Russia, 236, 354-69.
Southam, J. R. & Hay, W. W. (1976). Dynamical formulation of Broecker's model for marine cycles of biologically incorporated sediments. Math. Geol., 8, 511-27.
Spiegel, M. R. (1973). Theory and Problems of Complex Variables. New York: McGraw-Hill.
Spiegel, M. R. (1975). Theory and Problems of Probability and Statistics. New York: McGraw-Hill.
Spiegelman, M. & Elliott, T. (1993). Consequences of melt transport for uranium series disequilibrium in young lavas. Earth Planet. Sci. Letters, 118, 1-20.
Stallard, R. F. & Edmond, J. M. (1981). Geochemistry of the Amazon 1. Precipitation chemistry and the marine contribution to the dissolved load at the time of peak discharge. J. Geophys. Res., 86, 9844-58.
Steiger, R. H. & Wasserburg, G. J. (1966). Systematics in the Pb208-Th232, Pb207-U235, and Pb206-U238 systems. J. Geophys. Res., 71, 6065-90.
Strang, G. (1976). Linear Algebra and its Applications. New York: Academic Press.
Strang, G. (1986). Introduction to Applied Mathematics. Wellesley: Wellesley-Cambridge University Press.
Stumm, W. & Morgan, J. J. (1981). Aquatic Chemistry. New York: John Wiley.
Sundquist, E. T. (1985). Geological perspectives on carbon dioxide and the carbon cycle. In The Carbon Cycle and Atmospheric CO2: Natural Variations Archean to Present (AGU Geophys. Monograph 32), ed. E. T. Sundquist & W. S. Broecker, pp. 5-59. Washington: Amer. Geophys. Union.
Swalin, R. A. (1962). Thermodynamics of Solids. New York: John Wiley.
Tarantola, A. (1987). Inverse Problem Theory. Amsterdam: Elsevier.
Tarantola, A. & Valette, B. (1982). Generalized nonlinear inverse problems solved using the least-square criterion. Rev. Geophys. Space Physics, 20, 219-32.
Taylor, H. P., Jr. (1974). The application of oxygen and hydrogen isotope studies to problems of hydrothermal alteration and ore deposition. Econ. Geol., 69, 843-83.
Taylor, H. P., Jr. (1978). Oxygen and hydrogen isotope studies of plutonic granitic rocks. Earth Planet. Sci. Letters, 38, 177-210.
Taylor, H. P., Jr. (1980). The effects of assimilation of country rocks by magmas on 18O/16O and 87Sr/86Sr systematics. Earth Planet. Sci. Letters, 47, 243-54.
Taylor, H. P., Jr. & Sheppard, S. M. F. (1986). Igneous rocks: I. Processes of isotopic fractionation and isotope systematics. In Rev. Mineral. 16: Stable Isotopes in High Temperature Geological Processes, ed. J. W. Valley, H. P. Taylor Jr. & J. R. O'Neil, pp. 227-71. Washington: Mineral. Soc. Amer.
Tiller, W. A., Jackson, K. A., Rutter, K. A. & Chalmers, B. (1953). The redistribution of solute atoms during the solidification of metals. Acta Metall., 1, 428-37.
Tiller, W. A. (1991a). The Science of Crystallization: Macroscopic Phenomenon and Defect Generation. Cambridge: Cambridge University Press.
Tiller, W. A. (1991b). The Science of Crystallization: Microscopic Interfacial Phenomena. Cambridge: Cambridge University Press.
Tilton, G. R. (1960). Volume diffusion as a mechanism for discordant lead ages. J. Geophys. Res., 65, 2933-45.
Treuil, L. & Joron, J.-L. (1975). Utilisation des elements hygromagmatophiles pour la simplification de la modelisation quantitative des processus magmatiques. Exemples de l'Afar et de la dorsale medioatlantique. Soc. Ital. Mineral. Petrol., 31, 125-74.
Turcotte, D. L. & Schubert, G. (1982). Geodynamics. Applications of Continuum Physics to Geological Problems. New York: John Wiley.
Turner, G. (1968). The distribution of potassium and argon in chondrites. In Origin and Distribution of the Elements, ed. L. H. Ahrens, pp. 387-98. London: Pergamon.
Turner, G. (1972). 40Ar-39Ar age and cosmic ray irradiation history of the Apollo 15 anorthosite 15415. Earth Planet. Sci. Letters, 14, 169-75.
Ulmer, P. (1989). The dependence of the Fe2+-Mg cation-partitioning between olivine and basaltic liquid on pressure, temperature and composition: An experimental study to 30 kbars. Contrib. Mineral. Petrol., 101, 261-73.
Van Zeggeren, F. & Storey, S. H. (1970). The Computation of Chemical Equilibria. Cambridge: Cambridge University Press.
Vasseur, G., Vernieres, J. & Bodinier, J. L. (1991). Modelling of trace-element transfer between mantle melt and heterogranular peridotite matrix. In Orogenic Lherzolites and Mantle Processes, ed. M. Menzies, C. Dupuy & A. Nicolas, pp. 41-54. Oxford: Oxford University Press.
Veizer, J. & Jansen, S. L. (1979). Basement and sedimentary recycling and continental evolution. J. Geol., 87, 341-70.
Vidal, P., Dosso, L., Bowden, P. & Lameyre, J. (1979). Strontium isotope geochemistry in syenite-alkaline granite complexes. In Origin and Distribution of the Elements, ed. L. H. Ahrens, pp. 223-31. Oxford: Pergamon.
Vollmer, R. (1976). Rb-Sr and U-Th-Pb systematics of alkaline rocks: the alkaline rocks from Italy. Geochim. Cosmochim. Acta, 40, 283-95.
Wagner, C. (1950). Mathematical analysis of the formation of periodic precipitation. J. Coll. Sci., 5, 85-97.
Walker, D., Agee, C. B. & Zhang, Y. (1988). Fusion curve slope and crystal/liquid buoyancy. J. Geophys. Res., 93, 313-23.
Walker, D., Shibata, T. & DeLong, S. E. (1979). Abyssal tholeiites from the Oceanographer Fracture Zone. Contrib. Mineral. Petrol., 70, 111-25.
Walker, F. W., Parrington, J. R. & Feiner, F. (1989). Nuclide and Isotopes, 14th edition. General Electric.
Walker, J. C. G. (1991). Numerical Adventures with Geochemical Cycles. New York: Oxford University Press.
Walsh, G. R. (1975). Methods of Optimization. New York: John Wiley.
Warren, P. H. (1986). The Bulk-Moon MgO/FeO ratio: A highlands perspective. In Origin of the Moon, ed. W. K. Hartmann, R. J. Phillips & G. J. Taylor, pp. 279-310. Houston: Lunar Planet. Inst.
Warren, P. H. & Wasson, J. T. (1979). The origin of KREEP. Rev. Geophys. Space Phys., 17, 73-88.
Wasserburg, G. J. (1954). Argon40:Potassium40 dating. In Nuclear Geology, ed. H. Faul, pp. 341-9. New York: John Wiley.
Wasserburg, G. J. (1963). Diffusion processes in lead-uranium systems. J. Geophys. Res., 68, 4823-46.
Wasserburg, G. J., Jacobsen, S. B., DePaolo, D. J., McCulloch, M. T. & Wen, T. (1981). Precise determination of Sm/Nd ratios, Sm and Nd isotopic abundances in standard solutions. Geochim. Cosmochim. Acta, 45, 2311-23.
Webster, R. K. (1960). Mass spectrometric isotope dilution analysis. In Methods in Geochemistry, ed. A. A. Smales & L. R. Wager, pp. 202-46. New York: Intersciences.
Wetherill, G. W. (1956). Discordant uranium-lead ages. Trans. Amer. Geophys. Union, 37, 320-26.
Wetherill, G. W., Davis, G. L. & Lee-Hu, C. (1968). Rb-Sr measurements on whole rocks and separated minerals from the Baltimore gneiss, Maryland. Geol. Soc. Amer. Bull., 79, 757-62.
White, W. M. (1985). Sources of oceanic basalts: radiogenic isotopic evidence. Geology, 13, 115-18.
Whitfield, M. & Turner, D. R. (1979). Water-rock partition coefficient and the composition of seawater and river water. Nature, 278, 132-7.
Wiberg, D. M. (1971). State Space and Linear Systems. New York: McGraw-Hill.
Widder, D. V. (1975). The Heat Equation. New York: Academic Press.
Wiggins, R. (1976). Interpolation of digitized curves. Bull. Seism. Soc. Amer., 66, 2077-81.
Wilkinson, J. H. (1965). The Algebraic Eigenvalue Problem. Oxford: Clarendon Press.
Williams, R. W. & Gill, J. B. (1989). Effects of partial melting on the uranium decay series. Geochim. Cosmochim. Acta, 53, 1607-19.
Williamson, J. H. (1968). Least-squares fitting of a straight line. Can. J. Phys., 46, 1845-7.
Wood, B. J. (1987). Thermodynamics of multicomponent systems containing several solid solutions. In Thermodynamic Modeling of Geological Materials: Minerals, Fluids and Melts, ed. I. S. E. Carmichael & H. P. Eugster, pp. 71-95. Washington: Mineral. Soc. Amer.
Wright, T. L. & Doherty, P. C. (1970). A linear programming and least squares computer method for solving petrologic mixing problems. Geol. Soc. Amer. Bull., 81, 1995-2008.
York, D. (1966). Least-squares fitting of a straight line. Can. J. Phys., 44, 1079-89.
York, D. (1969). Least squares fitting of a straight line with correlated errors. Earth Planet. Sci. Letters, 5, 320-24.
Zienkiewicz, O. C. (1977). The Finite Element Method in Engineering Science, 3rd edition. New York: McGraw-Hill.
Zindler, A., Jagoutz, E. & Goldstein, S. (1982). Nd, Sr and Pb isotopic systematics in a three-component mantle: a new perspective. Nature, 298, 519-23.
Zindler, A. W. & Hart, S. R. (1986). Chemical Geodynamics. Ann. Rev. Earth Planet. Sci., 14, 493-571.
Zwillinger, D. (1989). Handbook of Differential Equations. Boston: Academic Press.
Subject index
activation energy 421, 455
activity 319
ADI 165
advection 165, 401, 406
advection-diffusion model 274, 464
AFC (assimilation-fractional crystallization)
  concentrations 505
  identification diagrams 508
  isotopic ratios 507
AFM diagram
  fractionation and mixing 32
alkalinity 395
Ar loss 128, 194, 313, 315, 446, 451, 455
Arrhenius plot 421, 455
asymptote
  mixing hyperbola 19, 264
autonomous system 345
base variable 340
batch melting 478
  density function 192
  forward problem 478
  known source 479
  Shaw's equation 487
  unknown source 483
Berthelot-Nernst partition coefficient 477
bias
  mass see mass discrimination
  statistical 185
bifurcation 364, 417
bioturbation 408
Boltzmann variable 424, 428
boundary conditions 162, 421
boundary layer 436, 525
box see one-box, multiple-box
boxcar function 101, 438
bulk mixing see conservative mixing
bulk partition coefficient 478, 492
C mixing time
  ocean 354
carbonate
  equilibrium 320, 324, 325, 395
  geochemistry 241, 267
  compensation depth 393
CCD see carbonate
Ce (La)-Yb fractionation 135, 216, 234
centered random variable 175
change of random variables 185, 206
characteristic equation 74
chemical equilibrium 318
chi-squared see pdf
clastic sediments 367
closure temperature 457
common dimension expansion
  matrix product 56, 75
compatible element 477
complexes
  solution 328
component 1, 318
  identification by PCA 241, 243
  loading 240
  principal see PCA
concentration-ratio hyperbola 18
  retrieving 26
Concordia 125
confidence interval
  of the mean 196, 211
  of the variance 197
conservative mixing 1
conservative property 401
constrained least-squares
  linear constraint 278
  quadratic constraint 282
constrained minimum 147, 333
contamination (isotopic)
  binary 12, 16, 22
continuity equation 405
continuous inverse model 312
continuous melting 500
control line 114
cooling age 456
cooling rate 457
correlation coefficient 202
correlation matrix 203
  sample 204
cosmogenic nuclides 410
covariance 202
covariance matrix 203, 208
  matrix sample 204, 285
Crank-Nicholson see finite differences, implicit
critical melting see continuous melting
crustal growth 367, 389
cumulate
  control line 114
curvature matrix 139, 147, 299
curvature
  mixing hyperbola 19
damping factor 352
Darken's theory 421
degree of freedom 181, 189, 197
density function see pdf
determinant 58, 73
deviate 199, 233
di-ol-si triangle 65
diagenesis 461
differential equations
  linear, order >1 97
  linear, stability 98
  first order, system of linear 85, 375, 381
differentiation 1
diffusion
  and precipitation 467
  basalt-rhyolite 259
  boundary layer 436
  definition 406, 419
  periodic boundary conditions 434
  radial flux 446
  radioactive decay 439, 451
  semi-infinite medium, parallel flux 428
  slab, parallel flux 437
  solidification 442, 522
  sphere in a well-stirred solution 449
  uphill 422, 470
diffusion coefficient 420
  variable with time 453
diffusivity
  chemical see diffusion coefficient
disequilibrium
  radioactive 88
disequilibrium fractionation
  crystal growth 442, 522
dispersal
  passive tracer 154, 412
dispersion matrix see covariance matrix
distillation see incremental process
distribution function 173
divergence 139
divergence theorem 404
Doerner-Hoskins law 36
dot product 55
Duhamel's principle 451, 476
dynamic systems 344
eddy diffusivity 464
eigencomponents 73, 86, 140, 214, 216, 237, 238, 282, 375, 380
  singular value decomposition (SVD) 75
  symmetric matrix 75
eigenvalues see eigencomponents
eigenvectors see eigencomponents
electroneutrality 320
elemental fractionation 387
ellipsoid 78
end-member 1
entropy 129, 150
equilibrium constant 319, 394, 395, 477
equilibrium in solutions 320
erf 313, 430, 471
erfc see erf
erosion rate 410
error calculation 217, 291, 306
error ellipsoid 80, 206, 212, 215, 285, 306
error function see erf
error propagation 217, 233, 306
estimate 184, 204, 249
estimator 184, 203
Euclidian space 55
expectation 175, 249
expected value see expectation
exponential see pdf
exposure age 410 extremum see minimum, maximum F see pdf feasible set 148, 340 FeO/MgO ratio 12, 20, 39, 126 finite-differences advection term 165 explicit 157 implicit 157 prescribed flux at boundary 162 flushing time 347 flux material 401 species 402 volume 401 FONI (first-order non-isothermal) reactor 361 forcing 345, 380 Fourier series 100 fractional condensation isotopic effects in rain 46 fractional crystallization 35, 114, 126, 491 inverse problem 495 isotopic effects 38 ratios 36, 494 fractional melting 43, 497 aggregated melt 498, 499 density function 192 free variable 340 front velocity 417 function spaces Legendre polynomials 104 spherical harmonics 107 trigonometric functions 99 gamma see pdf Gaussian see pdf (normal) geometric transformation 62 Gershgorin circles 82, 375-378 Gibbs energy minimum 319, 331, 340 global geochemical models 386, 392 gradient projection 307, 334 gradient vector (grad) 138, 445 Gram-Schmidt see orthogonalization Green function 348 heat and mass transfer coupling 361 Hessian see curvature heterogeneities 359, 413 heterogeneous equilibrium 318 homogeneous equilibrium 318, 320 Hotelling's T2 see pdf hydrothermal alteration 87Sr/86Sr in open system 50 δ18O in closed system 23 δ18O in open system 48 hyperbola binary mixing 18, 19, 262 hypsometric curve 393 ice cap isotopic effect of melting 13 ICP-MS 253, 310 ideal gases 331 ierfc see erf incompatible element 477, 489 incremental process 34, 35, 38, 114, 126, 491 ratios 36, 123, 494 independent random variables 201 influence of data 250
inner product see dot product interpolation 132 inverse matrix 60 inverse methods 248 inverse problem see continuous inverse model ion probe 183, 221, 251, 292 Ir pulse 408 isobaric interference see peak stripping isochron 125, 294, 303 isopleth velocity (non-Henry's law behavior) 416 isotope dilution 14, 229, 253 optimal 111 Jacobian 207 K-T boundary 408 KD 36, 39, 126 kernel functions 313 kinetic exchange 93 Lagrange multipliers 149, 279, 282, 295, 332 lakes 350 Laplacian 420, 446 least-square constraints see constrained least-squares criterion 249 errors, linear 288 errors, non-linear 294 hyperbola 262 non-linear 273 plane 257 polynomial 258 straight line 255 Legendre polynomials 104 Leibniz's rule 120, 418, 441 lever rule 5, 16 Liesegang rings 467 linear array see mixing, AFC linear function spaces see function spaces linear programming 148, 340 liquid line of descent 115 loading see component log-log plot 37, 44, 493, 497, 513 magma chamber periodic regime 503 steady-state 502 magma residence time 357, 503 mantle components 28, 243 Mantle Plane 245 marble-cake mantle 413 marginal density function 201, 211 mass action 319 mass balance concentrations 2 ratios 11 mass discrimination (fractionation) 121, 229 mass interference see peak stripping Matano interface 275, 423 matrix inverse 60 operations 53 orthogonal 60, 62 special 53 square-root of 76, 289
subspaces 57 trace 61 matrix exponential 86, 375, 381 maximum 139, 144 mean 175 sample 184, 204 mean square of weighted deviations see MSWD mean squared distance of diffusion 429 metasomatic front 417 metasomatism 414 metric tensor 68 mg# 20, 21 Milankovich see periodogram mineral reaction 9 mineral removal 5, 8, 39 mineralogical matrix 9, 220, 283, 318 minimum 139, 144 mixing 1 binary 3 et seq. bulk 1 concentrations 1 conservative 1 elemental and isotopic ratios 11, 28 hyperbola see hyperbola mantle components 28 ratio-concentration relationship 15 ratio-ratio relationships 18 ratio-ratio ternary 28 retrieving concentration ratio 26 ternary 6 mixing length 465 mixing time 354, 359, 413 mixture ideal gases 331 mobility 421 modal abundances minerals 7, 220, 281 mode 191 moment 175 moment generating function 176 Monte-Carlo simulation 233 moving reference frame 407 MSWD (mean square of weighted deviations) 291 multiple-box model isotopic systems 386 model linear 371 Nd crustal residence age 226, 371 Nd isotopes 22, 28, 225, 226, 229, 366, 389 Newton method 123, 335 Newton-Raphson method 142, 299, 303, 320 Ni-Mg fractionation 41 normal see pdf normalized variables 31 AFM plot 32 oceanic islands 244, 262, 271 ODE see ordinary differential equations olivine fractionation 5, 32, 39, 41, 126 one-box model 345 isotopic ratios 355 non-reactive species 346 periodic input 351 radioactive species 353 reactive species 348 open-system exchange isotopic ratios 47
orbital frequency see periodogram ordinary differential equations Euler method 129 Runge-Kutta method 130, 152 orthogonal functions see function spaces orthogonal matrix 60 orthogonalization 72, 105 outer product 55 outgassing see Ar loss outlier 196, 240 oxygen isotopes 13, 23, 26, 38, 46, 48, 93, 190, 267 P cycle 377 partial differential equation (PDE) finite differences 155 partition coefficient 477 pattern formation 467 Pb isotopes 125, 198, 205, 213, 271, 287, 303 PCA (principal component analysis) 237 component 237 loading 240 PDE (partial differential equation) 155 pdf (probability density function) 173 beta 181 Cauchy 180 chi-squared 181, 188, 189, 197, 206, 209, 289, 291, 301 exponential 178, 189 gamma 180, 187 Hotelling's T2 206, 212 joint multivariate 200, 205 log-normal 179, 189, 199 normal 179, 186, 187, 473 Poisson 181 Snedecor's F 181, 206, 212, 216 Student's t 182, 196, 209, 212 uniform 178 various, relations 183 peak stripping 221, 251, 253, 292 percentile 175 percolation 407, 514 Henry's law 414, 514 non-Henry's law behavior 517 periodogram 264 pH 323, 325, 327, 400 phase chemical 1 phase shift 351 Pi variable 487, 498 picrites 42 point source 428 pooled mean see weighted mean population 184 population dynamics 366 pore water 463 porosity 414, 450 principal component analysis see PCA probability density function see pdf probability distribution function 173 probability ellipsoid see error ellipsoid productivity 393 projection oblique 68 orthogonal 65, 250, 274, 290 propane combustion 335
quadratic form 55 associated ellipsoid 78 quadric 78 radioactive disequilibrium 88 ramp function 103, 447 random variable 173 change of 185, 206 rare-earth elements 216, 221 rate of stretching 413 Rayleigh law 36, 492 reaction mineral 9 reactivity 349 recipe 318, 332 relaxation 345 reservoir see box residence time 347 reactivity 349, 359 river 5 rock 3 root of equations 123, 142 rotation matrix 62 Runge-Kutta method 130, 152 runoff 393 sample 184 scaling matrix 65 scavenging length 465 seawater 13, 355, 394 sediment recycling 367 separation of variables 437 Shaw's Pi see Pi signal drift 310 significance level 196 simplex method 148 singular value decomposition (SVD) see eigencomponents solubility CO2 in seawater 394 minerals in melts 257 solutions equilibrium in 320 spallogenic argon 317 spherical harmonics 107, 269 spline functions 132, 154 Sr isotopes 12, 16, 22, 26, 28, 50, 211, 233, 355, 357, 358, 508 Sr-Ca fractionation brines 39 stability one-box system 360 system of differential equations 98 thermodynamic 117, 143 standard deviation 175 matrix 203 standard error 185, 286 standardized variable 175 statistic 184 statistical distance 284, 286 steady-state 350, 354, 376, 380, 412, 461, 464, 511 steady-state magma chamber 502 steepest-descent 144 sterile phase 482 stoichiometric coefficient 9, 283, 319 Student's t see pdf sulfate reduction 461
SVD (singular value decomposition) see eigencomponents system closed/open 3 system of linear differential equations see differential equations t distribution see pdf tangent equation of 114 Taylor expansion 120 ternary plot 31 theta functions 474 total inverse 307 trace element 477 and magmatic processes 521 choosing the right 518 trace of a matrix 61, 65, 73 tracer dispersal 154, 412 transfer function 353 transition matrix 375 transport equation 405
trapped melt 520, 500 turbulent diffusivity see eddy diffusivity U-Pb dating 125 unit response function 348 uphill diffusion see diffusion variance 175 sample 184 variance-covariance matrix see covariance matrix water-rock ratio 25, 48 weathering 394 weighted-mean 285 zone-refining 510 partially molten zone 510 steady-state 511