Introduction to Geochemical Modeling

Author: Francis Albarède

This book provides a quantitative treatment for a variety of geochemical problems involving mass balance, equilibrium, dynamics, and transport. Numerous applications from igneous and sedimentary environments are presented in the form of problems and their explicit solutions. It will particularly appeal to geochemists who need a proper grounding in the essential modeling methods brought to the field from physics and chemistry. Applications to natural environment make these methods also of interest to the geophysics, physics and chemistry community.

INTRODUCTION TO GEOCHEMICAL MODELING

FRANCIS ALBAREDE Ecole Normale Superieure de Lyon

CAMBRIDGE UNIVERSITY PRESS

Published by the Press Syndicate of the University of Cambridge
The Pitt Building, Trumpington Street, Cambridge CB2 1RP
40 West 20th Street, New York, NY 10011-4211, USA
10 Stamford Road, Oakleigh, Melbourne 3166, Australia

© Cambridge University Press 1995

First published 1995
First paperback edition (with corrections) 1996

A catalogue record for this book is available from the British Library

Library of Congress cataloguing in publication data
Albarede, Francis.
Introduction to geochemical modeling / Francis Albarede.
p. cm.
ISBN 0-521-45451-4
1. Geochemistry - Mathematical models. I. Title.
QE515.A53 1995
553.9'01'5118-dc20 93-49747 CIP

ISBN 0 521 45451 4 hardback
ISBN 0 521 57804 3 paperback

Transferred to digital printing 2002

To Benjamin, Olivier, and Isabelle as a sign of deep affection.

Contents

Foreword
Preface

1 Mass balance, mixing, and fractionation
  1.1 Concentrations as mixing variables
    1.1.1 Basic concepts
    1.1.2 Special case: binary mixing
    1.1.3 Ternary mixing and removal
    1.1.4 The inverse approach
  1.2 Reactional assemblage
  1.3 Working with ratios
    1.3.1 Introduction
    1.3.2 Ratio-concentration relationships in binary mixing
    1.3.3 Ratio-ratio relationships in binary mixing
    1.3.4 Mixing hyperbola: the inverse problem
    1.3.5 Ratio-ratio relationships in ternary mixing
  1.4 Normalized variables
  1.5 Incremental processes (distillation)
    1.5.1 Introduction
    1.5.2 Concentration changes upon closed-system crystallization
    1.5.3 Changes in element and isotope ratios upon closed-system crystallization
    1.5.4 FeO-MgO fractionation during olivine crystallization in basalts
    1.5.5 Elemental fractionation during basalt differentiation
    1.5.6 Fractional melting
    1.5.7 Fractional condensation
    1.5.8 Open-system isotopic exchanges

2 Linear algebra
  2.1 A matrix refresher
    2.1.1 Definitions
    2.1.2 A few rules for matrix manipulation
    2.1.3 The common-dimension expansion of the matrix product
    2.1.4 The subspaces of a matrix
  2.2 Square matrices
    2.2.1 The determinant of a matrix
    2.2.2 The inverse of a matrix
    2.2.3 Orthogonal matrices
    2.2.4 The trace of a matrix
    2.2.5 The fundamental geometric transformations
    2.2.6 The metric tensor and oblique projections
    2.2.7 Gram-Schmidt orthogonalization
  2.3 Eigencomponents
    2.3.1 General
    2.3.2 Computation of eigencomponents
    2.3.3 Eigencomponents of symmetric matrices
  2.4 Quadratic forms and associated quadrics
    2.4.1 Quadrics associated with symmetric matrices
    2.4.2 Gerschgorin's circles theorem
  2.5 Systems of linear differential equations
    2.5.1 First-order linear homogeneous equations
    2.5.2 Linear equations of order higher than one
    2.5.3 Stability of solutions to linear systems of differential equations
  2.6 Linear function spaces
    2.6.1 General
    2.6.2 Fourier series
    2.6.3 Legendre polynomials
    2.6.4 Associated Legendre polynomials
    2.6.5 Spherical harmonics

3 Useful numerical analysis
  3.1 Functions of a single variable
    3.1.1 Derivatives
    3.1.2 Equation of the tangent to a curve
    3.1.3 Leibniz's rule for the derivative of a definite integral
    3.1.4 Taylor series
    3.1.5 Roots of implicit equations and extrema of functions: the Newton method
    3.1.6 Ordinary differential equations: the Euler method
    3.1.7 Ordinary differential equations: the Runge-Kutta method
    3.1.8 Interpolation with spline functions
  3.2 Functions of several variables
    3.2.1 Introduction
    3.2.2 System of implicit non-linear equations: the Newton-Raphson method
    3.2.3 Extrema: the steepest-descent method
    3.2.4 Constrained minimization
    3.2.5 The Runge-Kutta method for a system of differential equations
    3.2.6 Interpolation with spline functions
  3.3 Partial differential equations: the finite differences method
    3.3.1 One-dimensional diffusion problems: general
    3.3.2 More boundary conditions
    3.3.3 A word about advection
    3.3.4 Two space coordinates: the ADI method

4 Probability and statistics
  4.1 A single random variable
    4.1.1 General
    4.1.2 Expectation and moments
    4.1.3 A compendium of some common probability density functions
    4.1.4 Some relationships between fundamental distributions
    4.1.5 Estimators
    4.1.6 Change of variable
    4.1.7 Confidence intervals
    4.1.8 Random deviates
  4.2 Several random variables
    4.2.1 Estimators
    4.2.2 Useful multivariate distributions
    4.2.3 Change of variables
    4.2.4 Confidence region of a sample from a normal population
  4.3 Error propagation and error calculation
    4.3.1 General concepts
    4.3.2 Linear error propagation
    4.3.3 Linearized error propagation for non-linear relationships
    4.3.4 Monte-Carlo simulations
  4.4 Principal component analysis

5 Inverse methods
  5.1 Linear estimates
    5.1.1 General
    5.1.2 The least-square straight line and least-square plane
    5.1.3 Least-square polynomials
    5.1.4 Least-square hyperbola
    5.1.5 The periodogram
    5.1.6 Fitting global data with spherical harmonics
  5.2 Non-linear least-squares
  5.3 Constrained least-squares
    5.3.1 Linear constraints: the closure condition
    5.3.2 Quadratic constraints: mineral reactions
  5.4 Handling errors in least-square problems
    5.4.1 A simple illustration: the weighted mean
    5.4.2 Linear least-square systems
    5.4.3 Non-linear least-square systems: isochrons
  5.5 Gradient projection and the total inverse
  5.6 The continuous inverse model

6 Modeling chemical equilibrium
  6.1 Introduction
  6.2 The Newton-Raphson method applied to solutions
    6.2.1 Homogeneous equilibrium in solutions
    6.2.2 Heterogeneous equilibrium in solutions
    6.2.3 More about scaling
  6.3 Gibbs energy minimization
    6.3.1 Mixtures of ideal gases
    6.3.2 Pure coexisting phases

7 Dynamic systems
  7.1 Introduction
  7.2 Single-variable residence time analysis
    7.2.1 Non-reactive species
    7.2.2 Reactive species
    7.2.3 Radioactive decay and first-order kinetics
    7.2.4 Isotope and trace-element ratios
    7.2.5 Heterogeneities, mixing time, and residence time
    7.2.6 Stability of single-variable systems
    7.2.7 Random geochemical variables
    7.2.8 Population dynamics
  7.3 One element in several interacting reservoirs
    7.3.1 A closed-system 3-box model with concentrations as the variables
    7.3.2 The general box model: an empirical model
    7.3.3 The general box model with forcing terms
  7.4 Several elements in several interacting reservoirs
    7.4.1 Multiple reservoir isotopic systems
    7.4.2 Non-linear coupling of geochemical reservoirs

8 Transport, advection, and diffusion
  8.1 Fluxes
    8.1.1 Basic definitions
  8.2 The divergence theorem and the conservation equations
    8.2.1 The continuity equation
    8.2.2 The general transport equation
  8.3 Advection and percolation
    8.3.1 Effect of bioturbation on concentration profiles in sediments
    8.3.2 Exposure ages and the assessment of erosion rates
    8.3.3 Dispersal of a conservative tracer in a velocity field
    8.3.4 Percolation and infiltration metasomatism
  8.4 Diffusion basics
    8.4.1 The diffusion equation
    8.4.2 The diffusion coefficient
    8.4.3 The Matano interface
  8.5 Solutions of the diffusion equation: parallel flux
    8.5.1 Parallel flux: the instantaneous point source in the infinite medium
    8.5.2 Two half-spaces with uniform initial concentrations
    8.5.3 The infinite medium with a layer of uniform initial concentration
    8.5.4 The infinite medium: an arbitrary initial distribution
    8.5.5 The infinite medium with C0(x) being a periodic function of x
    8.5.6 The semi-infinite medium with constant surface concentration
    8.5.7 The slab with uniform initial concentration
    8.5.8 The slab with accumulation of a radiogenic isotope
    8.5.9 Disequilibrium fractionation during solidification
  8.6 Radial flux and spherical coordinates
    8.6.1 Introduction
    8.6.2 Radial diffusion in the sphere
    8.6.3 Desorption from a sphere into a well-stirred solution
    8.6.4 The sphere with accumulation of a radiogenic isotope
  8.7 The diffusion coefficient varies with time
    8.7.1 General
    8.7.2 Cooling ages
  8.8 Two useful steady-state solutions
    8.8.1 Early diagenesis: sulfate reduction
    8.8.2 The advection-diffusion model in the water column
  8.9 Simultaneous precipitation and diffusion
  Appendix 8A: The error function
  Appendix 8B: The theta functions
  Appendix 8C: Duhamel's principle

9 Trace elements in magmatic processes
  9.1 Introduction
  9.2 Batch-melting and crystallization
    9.2.1 Introduction and forward problem
    9.2.2 Inverse problem: the source composition is known
    9.2.3 Inverse problem: when the source composition is unknown
    9.2.4 Shaw's formulation
  9.3 Incremental processes
    9.3.1 Fractional crystallization: forward problem
    9.3.2 Fractional crystallization: inverse problem
    9.3.3 Fractional melting
    9.3.4 Continuous melting
  9.4 Open magmatic systems
    9.4.1 The steady-state magma chamber
    9.4.2 A periodically erupting, periodically refilled magma chamber
    9.4.3 Assimilation-fractional crystallization (AFC)
    9.4.4 Zone-refining
    9.4.5 Percolation and magma segregation
  9.5 Which element, which process?
    9.5.1 The good use of compatible and incompatible elements
    9.5.2 Elements and processes
  9.6 Disequilibrium fractionation during crystal growth

References
Subject index

Foreword

Since the early days of Goldschmidt or Vernadsky, geochemistry has become a mature science which now plays a central role in the Earth Sciences. More particularly, it has evolved considerably over the last fifty years. From an analytical approach whose goal was to establish the chemistry of the Earth (the compositions of rocks, soils, water, crust, and mantle), geochemistry has become an explanatory science. The chemical and isotopic compositions of various earth materials now make up the data used to build models that explain the formation of the Earth, its evolution, and the genesis of the different terrestrial units: continents, mantle, core, ocean, etc. From a descriptive and qualitative early stage, geochemistry has become explanatory and quantitative. In this new context modeling is a key method. Francis Albarede has been a very active actor in this evolution towards quantitative science. His abundant scientific contributions published in the best international journals are all focussed on the goal of building a quantitative science. He is one of the leading scientists in this area and has now decided to broaden his approach by writing a book on geochemical modeling. This book has no equivalent in the present literature. It explains how we can build mathematical models to explain geochemical observations. This book also gives the vision of Francis Albarede about science. He does not consider a science serious if there is no solid mathematical modeling applied to robust quantitative measurements. In this book, he gives all the techniques used today to model various geochemical phenomena, from isotope geology to mineral thermodynamics, passing through ocean chemistry or trace-element behaviour in volcanic systems. He gives a clear presentation of the different mathematical techniques, which are conveniently assembled, and he also provides numerous actual examples treated quite completely. This book will be useful for researchers and students as well as for teachers.
Claude J. Allegre

Preface

Ever since the age-long committal of Earth Sciences to the hunt for natural resources turned into a largely non-profitable activity and the completion of the rather brutal metamorphosis of plate tectonics gave birth to a more mature and steady management of research in this field, Geochemistry has been undergoing a profound change. With all the excuses and the turmoil gone, the objectives of Geochemistry now join those of all other modern scientific fields: aside from a more accurate description of the world, either past or present, a set of quantitative concepts and rules must be built that will permit the outcome of geological processes and the future of geological systems to be predicted. Behind these vague terms hide such enormous challenges as the prediction of volcanic eruptions, the safety of drinking water, and the evolution of the greenhouse effect, just to mention a few. The meaning of the quantitative approach in natural sciences is still blurred by confusion: there is no more quantification in plotting a few concentrations or isotopic ratios from a table in a geochemical diagram than in coloring a geological map. Obviously, a wealth of high-quality information is still to be gathered through the description and comparison of geochemical measurements, but no more than can be gained from expert field work or from the intelligent practice of the optical microscope. In most cases, observations can be expressed in numbers that we call measurements, while the processes and causes invoked can be parameterized with yet more numbers. However, only a quantitative approach can test whether these processes and causes, parameterized in the most appropriate way, can reproduce the observations. The ultimate quality of a scientific theory is its ability to predict numerical outcomes of natural or artificial processes. Geochemistry should not try to elude this challenge.
Claude Allegre stubbornly passed on to his students the habit of turning his perception of any geological process into equations that could eventually be tested against measurements: the spirit of this book owes him a great deal. The idea of writing a book on geochemical modeling finally emerged during a sabbatical leave at Lamont engineered by Alan Zindler, which gave me the first chance of getting my thoughts organized into a manuscript. The present book, largely dedicated to a practical approach to modeling through a large number of worked examples, is meant as a helper to the many students who knock at their adviser's door asking 'How can I get to this result?'. My motivation grew steadily out of more than twenty years of experience with students and young scientists in Geochemistry, so obviously disarmed when it comes to ascertaining ideas against physical models of natural processes. My own frustration built up by interacting with friends and colleagues who wondered how the equation of a mixing hyperbola is actually derived, or how the diffusion loss equation for a sphere is changed from a long term to a short term expansion. These are just simple examples taken from everyday experience. The same scientists now worship imported isochron programs as sacred items and fearfully shrink in their seats during meetings when a ten-color slide with spherical harmonics expansion is projected on the screen. Literature is flooded with 'well-known' results which no book has ever presented as the logical conclusion of a sequence of statements. Can we blame our students for feeling uncomfortable in using these results and for dodging their application to new situations? A related concern is recent but acute. Geochemistry is becoming home to several good scientists from other fields, such as Geophysics, Physics, and Chemistry. Their command of difficult theoretical tools makes the solution of some complicated problems of igneous and sedimentary geochemistry much more affordable to them than to the vast majority of full-fledged geochemists. If we want our students to keep up with recent conceptual evolution in Geochemistry, our teaching should keep pace. However unrealistic this statement, I wrote the present book with the hope of relieving some of these frustrations which have been mine as well. Many geochemists will find this book much too obfuscated to be useful in everyday scientific life. At the same time, experts and scientists from specialized fields may be skeptical about what they will consider as unrealistically simple situations or methods from a previous century. Nevertheless, I would be more than happy if this book could help just a few students and scientists bridge the gap. The topmost reward would be to find, while snooping around my students' offices, the present book shelved behind more up-to-date treatises and monographs.
My students at the Ecole Normale Superieure in Lyon gave me the first blow by complaining about the limits of the theory and some examples being somewhat childish. This book assumes that the reader is familiar with basic geochemical observations and processes. It deals with some strong points of geochemical modeling: mass conservation and transport, equilibrium, fractionation and dynamics, plus some methods to calculate optimal solutions and some others to test them. It uses largely matrix theory and probabilities but refrains from developing functions of complex variables and integral transforms. Some of the most difficult concepts have been left out, notably chemical waves and pattern formation. Introduction of modern mathematical concepts (fractals, chaos) has been postponed until their usefulness as tools of geochemical prediction is unambiguously established. I have tried to show that the concepts of geochemical modeling may apply to a broad spectrum of geological environments. Although the book has a strong igneous imprint, many applications deal with the oceanic and sedimentary environment. Some readers will be disappointed by finding only little emphasis on microscopic and macroscopic description of geochemical processes. Excellent textbooks exist at all levels that deal with thermodynamics, crystal chemistry, spectroscopy, or the atomistic theory of chemical kinetics. Although not necessarily dedicated to geochemical problems, they are relevant enough for a motivated student to catch up with the state-of-the-art in these fields. The chapters have been organized in such a way that, after basic principles have been introduced (Mass Balance), modeling methods (Linear Algebra, Numerical Analysis, Probability and Statistics) are presented before more specifically geochemical topics (Equilibrium, Dynamic Systems, Transport, Tracer Modeling). This order may not be the most attractive for a textbook on Geochemistry and I perceive the risk that some readers with a strong geological background may be put off by equations before they get the chance of practising real life problems. A minimum dose of mathematical difficulty, though, simply cannot be bypassed. Pretending that we can make use of elaborate models without being reasonably comfortable with the tools would be inappropriate. Other readers with some background in calculus and statistics may simply want to skip Chapters 2-5. Having apologized for what this book does not contain, I will now write a few words about what it does. The quest for reviewers was an almost impossible task. Some of the most competent and good-willing reviewers happened to be too busy to engage in lengthy equation debugging, while others involuntarily became promoted to 'essential' reviewer, a status they never applied for. The students from the Magistere des Sciences de la Terre Rhone-Alpes-Auvergne were arbitrarily assigned the role of guinea-pigs for the applications and showed enough disrespect to their silver-haired Professor to catch many unmistakable mistakes. None of these generous colleagues and students should be held responsible for the remaining errors. The author's e-mail address and World Wide Web home page are provided below. I will be grateful for any comments, inquiries, complaints, and gibes. A reader made the observation that there is hardly enough English between equations for the language to become a real problem. I will nevertheless request the reader's indulgence for the strong flavor of franglais. The solutions of most problems were programmed with MatLab from the MathWorks company, either on Macintosh desk computers or on Sun workstations in the classroom.
Janne Blichert-Toft had the most unpleasant yet the most essential part in this work: notwithstanding this overwhelming intrusion in our private life, she kept fighting inadequate formulations and obscure constructions while patiently reshaping a hopelessly multi-ethnic English. The friendly efficiency of Brian Watts for copyediting is gratefully acknowledged. The following colleagues and friends have taken much of their valuable time to review parts of the manuscript and suggested the introduction of essential ideas: P. Allemand, N. T. Arndt, B. Bourdon, O. Grasset, E. Kaminsky, E. Lewin, G. Michard, A. Provost, J. J. Royer, H. P. Taylor Jr, and G. Vasseur. P. Grandjean, B. Luais, and V. Salters provided data that could be used as a support to some exercises. Finally, many other colleagues helped me find cases, examples, methods, and references, not to forget those who straightened up some misconceptions, through informal discussions. At high risk of being disloyal, I will mention P. Alle, C. J. Allegre, H. Bertrand, J. Blichert-Toft, M. Campillo, M. Chaussidon, C. Chauvel, G. Chazot, M. Condomines, E. Deloule, L. A. Derry, A. W. Hofmann, E. Jagoutz, C. Jaupart, C. E. Lesher, B. Luais, M. A. Mellieres, A. Michard, J. F. Minster, S. M. F. Sheppard, M. Spiegelman, D. Velde, P. Vidal, Y. Zhang, and A. W. Zindler. I also want to acknowledge the enduring support and encouragement of Jean-Michel Caron over the years I was writing this book.
Francis Albarede
Lyon
([email protected])
(http://www.ens-lyon.fr/~albarede/geochemodel.html)

1 Mass balance, mixing, and fractionation

The chemical evolution of geological reservoirs, such as the upper mantle, an oceanic basin, or a magma chamber, results from the competition of two opposing kinds of processes. From a parent system with uniform (or at least smoothly changing) geochemical properties, differentiation processes generate subsystems in which these properties are usually different. Among the differentiation processes, we can mention phase changes such as crystal fractionation and partial melting, mechanical sorting, and biological activity. In the opposite direction, mixing processes tend to combine systems with distinct geochemical properties into more uniform supersystems. Mixing obviously plays a fundamental role in the formation of clastic sedimentary rocks and magmas emplaced on continental crust, while being responsible on a broad scale for the rather simple chemical properties of seawater and the isotopic characteristics of the mantle sources of basalts. This chapter deals with the basic principles of mass conservation associated with mixing and differentiation processes.

1.1. Concentrations as mixing variables

1.1.1 Basic concepts

The term mixing refers to a vast series of processes in which several mineral phases or chemical components are brought together in a multi-phase system (e.g., mixing of sediments) or a multi-component system (e.g., magma or seawater mixing) to form an array of hybrid samples (mixtures). The latter case is often referred to as 'bulk mixing' or 'conservative mixing' in contrast with other selective mixing processes which involve the preliminary sorting of phases or the preferential transfer of some chemical components. A phase is a system with homogeneous chemical properties that stays physically distinct in the mixture, such as quartz in a sediment, and is a term most commonly used for mechanical mixtures. A component loses its physical identity upon mixing, such as the Depleted Mantle component in an oceanic basalt or the North-Atlantic Deep Water in the ocean, and is a term most commonly used for systems that are fluid during mixing. Both terms can be replaced by the mixing-specific term end-member.

A system, subscripted 0, contains several species ($i = 1, \ldots, m$) held in phases ($j = 1, \ldots, n$). Let $M_j$ be the mass of phase $j$ and $m_j^i$ the mass of species $i$ contained in the phase $j$. We refer to species instead of element $i$ because the theory applies equally well to all the isotopes or molecules which are not produced or destroyed during the mixing process. Concentration of species $i$ in phase $j$ is defined as

$$C_j^i = \frac{m_j^i}{M_j}$$

For the bulk material, mass conservation requires

$$M_0 = \sum_{j=1}^{n} M_j$$

where the sum is over all the phases and

$$m_0^i = \sum_{j=1}^{n} m_j^i$$

for element $i$. The proportion $f_j$ of the phase $j$ is such that

$$f_j M_0 = M_j$$

then

$$\frac{m_0^i}{M_0} = \sum_{j=1}^{n} \frac{m_j^i}{M_0}$$

or

$$C_0^i = \sum_{j=1}^{n} \frac{m_j^i}{M_j} \frac{M_j}{M_0}$$

and finally

$$C_0^i = \sum_{j=1}^{n} C_j^i f_j \qquad (1.1.1)$$

with the closure condition

$$\sum_{j=1}^{n} f_j = 1$$

The 'bulk rock' composition vector is a linear combination of the mineral compositions (equivalently, the mixture composition vector is a linear combination of the end-member composition vectors). The non-negative coefficients of this linear combination obey the closure equation. An equivalent statement which will prove to be mathematically convenient is that the bulk rock composition is the centroid of the mineral compositions weighted by the mass fraction of each mineral phase. In this sense, a rock is a pure artifact that comes to existence solely through the sampling process. The chemical properties of the rock represent a local average of mineral chemical properties hammered out of the outcrop by the geologist. This average smooths out all the chemical heterogeneities that are present over characteristic distances significantly smaller than sample size, but not the long-distance variations.

This concept of a rock has important consequences for isotopic dating. Let us imagine that the Rb-Sr system of a granite has been disturbed by a metamorphic event subsequent to granite emplacement. In addition, we assume that, during metamorphism, Rb and Sr have been moving around in each part of the granite and that we can define a characteristic distance $\delta$ for these movements (Figure 1.1), for instance a mean-square distance of diffusion. None of the samples, whatever its size, is a really closed system since the outer shell of thickness $\delta$ has undergone significant exchange with its surroundings during the metamorphism. Clearly, very small systems, smaller than $\delta$, can be considered as completely open, whereas the proportion of the shell in the sample is small for very large systems which can, to any arbitrary precision, be considered to be closed to the metamorphic perturbation. This contrasting behavior is the basis of whole-rock vs mineral Rb-Sr dating in polyorogenic areas (e.g., Wetherill et al., 1968; Faure, 1986).

Figure 1.1 Scaling the sample relative to an arbitrary heterogeneity. Here, the rock is assumed to have a characteristic exchange distance $\delta$. Atoms in the outer shell (stippled) may have moved in or out; inside, all the movements kept the system closed. The size of a rock sample will be scaled for a closed system by minimizing the relative proportion of the shell and will be large. For an open system, it will be taken smaller than $\delta$.
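To put rough numbers on this scaling argument, the sketch below computes the fraction of a sample that lies within the exchange distance of its surface. It assumes a spherical sample, which is a simplification not made in the text, and the function name `shell_fraction` and the lengths used are purely illustrative.

```python
def shell_fraction(radius, delta):
    """Volume fraction of a sphere of given radius lying within
    a distance delta of its surface (assumed geometry: a sphere)."""
    if radius <= 0 or delta < 0:
        raise ValueError("radius must be positive and delta non-negative")
    if delta >= radius:
        # The whole sample is within reach of exchange: fully open system.
        return 1.0
    # Total volume minus the untouched inner sphere, relative to the total.
    return 1.0 - (1.0 - delta / radius) ** 3


if __name__ == "__main__":
    delta = 1.0  # hypothetical exchange distance, arbitrary length unit
    for radius in (1.0, 10.0, 100.0, 1000.0):
        print(f"radius = {radius:7.1f}  open fraction = {shell_fraction(radius, delta):.4f}")
```

Under this assumption the open fraction falls off roughly as 3δ/r for large samples, which is one way to see why whole-rock samples can approach closed-system behavior while individual minerals do not.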

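1.1.2 Special case: binary mixing

The conservative mixing of two components requires linear relationships for every pair of species. We take two end-members $j = \alpha$ and $j = \beta$ and note the bulk system with the subscript 'mix' instead of the subscript 0. The closure equation is

$$f_\alpha + f_\beta = 1$$

while mass balance for species $i$ reads

$$C_{mix}^i = f_\alpha C_\alpha^i + f_\beta C_\beta^i \qquad (1.1.3)$$

which for species $i_1$ and $i_2$ can be rewritten

$$C_{mix}^{i_1} - C_\alpha^{i_1} = f_\beta \left( C_\beta^{i_1} - C_\alpha^{i_1} \right)$$
$$C_{mix}^{i_2} - C_\alpha^{i_2} = f_\beta \left( C_\beta^{i_2} - C_\alpha^{i_2} \right) \qquad (1.1.4)$$

Dividing equations (1.1.4) by each other, the relationship between $C_{mix}^{i_1}$ and $C_{mix}^{i_2}$ and the condition $0 \le f_\beta \le 1$ describes a line segment (Figure 1.2) such that

$$\frac{C_{mix}^{i_2} - C_\alpha^{i_2}}{C_{mix}^{i_1} - C_\alpha^{i_1}} = \frac{C_\beta^{i_2} - C_\alpha^{i_2}}{C_\beta^{i_1} - C_\alpha^{i_1}} \qquad (1.1.5)$$

passing through the points $[C_\alpha^{i_1}, C_\alpha^{i_2}]$ for $f_\beta = 0$ and $[C_\beta^{i_1}, C_\beta^{i_2}]$ for $f_\beta = 1$. The slope $s^{i_1 i_2}$ of a mixing array in a diagram $C_{mix}^{i_2}$ vs $C_{mix}^{i_1}$ is

$$s^{i_1 i_2} = \frac{C_\beta^{i_2} - C_\alpha^{i_2}}{C_\beta^{i_1} - C_\alpha^{i_1}} \qquad (1.1.6)$$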

Figure 1.2 Linear array of mixing between end-members $\alpha$ and $\beta$ as given by equation (1.1.5). $s^{i_1 i_2}$ is the slope of the mixing line.
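The linear mixing array can be checked numerically. The following minimal sketch mixes two end-members for a given fraction of the second one and verifies that the mixture plots on the line whose slope is given by equation (1.1.6). The end-member concentrations are hypothetical values chosen for illustration, and the function names are not taken from the book.

```python
def binary_mix(c_alpha, c_beta, f_beta):
    """Mass balance for two end-members, with f_alpha = 1 - f_beta (closure)."""
    if not 0.0 <= f_beta <= 1.0:
        raise ValueError("f_beta must lie between 0 and 1")
    return {sp: (1.0 - f_beta) * c_alpha[sp] + f_beta * c_beta[sp] for sp in c_alpha}


def mixing_slope(c_alpha, c_beta, i1, i2):
    """Slope of the mixing line in a plot of species i2 against species i1."""
    return (c_beta[i2] - c_alpha[i2]) / (c_beta[i1] - c_alpha[i1])


# Hypothetical end-member concentrations (arbitrary units), for illustration only.
alpha = {"i1": 10.0, "i2": 2.0}
beta = {"i1": 30.0, "i2": 8.0}

if __name__ == "__main__":
    slope = mixing_slope(alpha, beta, "i1", "i2")
    for f in (0.0, 0.25, 0.5, 1.0):
        mix = binary_mix(alpha, beta, f)
        print(f"f_beta = {f:4.2f}  C_i1 = {mix['i1']:5.1f}  C_i2 = {mix['i2']:4.2f}")
    print("slope of the mixing line:", slope)
```

Any mixture produced by `binary_mix` satisfies equation (1.1.5): the ratio of its offsets from the first end-member equals `slope`, whatever the value of the mixing fraction.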

Table 1.1. Compositions of sediments in the river Meurthe and two tributaries and their theoretical mixtures.

              Meurthe   Fave     Mortagne   Downstream   Downstream
              R0        r0       r1         R1           R2
SiO2 (%)      67.45     76.85    63.37      70.74        69.27
Al2O3 (%)     12.76     10.10    10.24      11.83        11.51
Fe2O3 (%)      2.80      2.45     4.20       2.68         2.98
MnO (%)        0.07      0.04     0.11       0.06         0.07
MgO (%)        2.23      0.86     2.00       1.75         1.80
CaO (%)        1.87      0.63     2.07       1.44         1.56
Na2O (%)       2.00      1.18     0.53       1.71         1.48
K2O (%)        3.87      3.92     2.72       3.89         3.65
Ba (ppm)     907       781      529        863          796
Cr (ppm)      74        72       95         73           78
Y (ppm)       41.4      82       25         55.6         49.4

When component and mixture concentrations of any species $i$ are known, mass proportions can be calculated from the 'lever rule'

$$f_\beta = \frac{C_{mix}^i - C_\alpha^i}{C_\beta^i - C_\alpha^i}$$
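As a quick numerical illustration, the lever rule can be applied to the SiO2 values of Table 1.1, taking the Upper Meurthe as one end-member, the Fave as the other, and the first downstream sediment as the mixture. This is a sketch only: the function name `lever_rule` is an illustrative choice, and the rule is undefined when both end-members have the same concentration of the chosen species.

```python
def lever_rule(c_mix, c_alpha, c_beta):
    """Mass fraction of end-member beta in a binary mixture, from one species."""
    if c_beta == c_alpha:
        raise ValueError("end-members are indistinguishable for this species")
    return (c_mix - c_alpha) / (c_beta - c_alpha)


# SiO2 (%) from Table 1.1: Upper Meurthe, Fave, first downstream mixture.
f_fave = lever_rule(c_mix=70.74, c_alpha=67.45, c_beta=76.85)

if __name__ == "__main__":
    print(f"fraction of Fave sediment from SiO2: {f_fave:.3f}")
```

Other species in the table return values close to, but not exactly equal to, this fraction, because the tabulated concentrations are rounded.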

The Meurthe river in North-Eastern France has two major tributaries, the Fave and Mortagne rivers. Let $R$ be the concentration of an element in the main Meurthe river and $r$ that of the same element in its tributaries (Table 1.1, columns 2 and 3). 65 percent of fine-grained sediments from the Upper Meurthe ($R_0$) mix with 35 percent sediment from the Fave ($r_0$) river. At the next confluence, 80 percent of the Meurthe fine-grained sediments mix with 20 percent Mortagne ($r_1$) sediment. Find the composition of the sediments in the Meurthe downstream of each tributary.

Written in a symbolic way, the mass balance equations read

$$R_1 = 0.65 \times R_0 + 0.35 \times r_0$$
$$R_2 = 0.8 \times R_1 + 0.2 \times r_1$$

and the results are shown in columns 4 and 5 of Table 1.1.

The 1887 Mauna Loa lava flow in Hawaii (PM = parent magma) has been found to contain olivine with the composition listed in Table 1.2. Assuming a fixed olivine composition (ol) fo88, calculate that of the residual liquid (RL) upon fractionation of $f_{ol}$ = 5, 10 and 15 percent olivine.

The parent magma is the combination of the residual liquid and olivine, hence

$$C_{PM}^i = f_{ol} C_{ol}^i + (1 - f_{ol}) C_{RL}^i$$

or

$$C_{RL}^i = \frac{C_{PM}^i - f_{ol} C_{ol}^i}{1 - f_{ol}}$$

The results are listed in Table 1.2. Note the constant Na2O/TiO2 ratio: Na and Ti are enriched in the same proportions because they are not present in the olivine.

Table 1.2. Major-element data (%) for a Mauna Loa basalt and olivine phenocrysts. Composition of the residual liquids obtained after removal of olivine fractions $f_{ol}$.

              Mauna Loa 1887   fo88     f_ol = 0.05   f_ol = 0.1   f_ol = 0.15
SiO2          51.63            39.90    52.25         52.93        53.70
TiO2           1.94             0.00     2.04          2.16         2.28
Al2O3         13.12             0.00    13.81         14.58        15.44
FeO           10.80            11.70    10.75         10.70        10.64
MgO            8.53            47.80     6.46          4.17         1.60
CaO            9.97             0.28    10.48         11.05        11.68
Na2O           2.21             0.00     2.33          2.46         2.60
Na2O/TiO2      1.14                      1.14          1.14         1.14
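The two worked problems above (the Meurthe sediments and the removal of olivine from the Mauna Loa basalt) can be reproduced with a few lines of code. The sketch below uses the data of Tables 1.1 and 1.2. The helper names `mix` and `remove_phase` are illustrative choices, and the eighth row of Table 1.1 is taken to be K2O.

```python
def mix(a, b, f_b):
    """Binary conservative mixing: (1 - f_b) * a + f_b * b, species by species."""
    return [(1.0 - f_b) * x + f_b * y for x, y in zip(a, b)]


def remove_phase(parent, phase, f_phase):
    """Residual composition after removing a mass fraction f_phase of a phase."""
    return [(p - f_phase * c) / (1.0 - f_phase) for p, c in zip(parent, phase)]


# Table 1.1: SiO2, Al2O3, Fe2O3, MnO, MgO, CaO, Na2O, K2O (%), then Ba, Cr, Y (ppm).
R0 = [67.45, 12.76, 2.80, 0.07, 2.23, 1.87, 2.00, 3.87, 907, 74, 41.4]  # Upper Meurthe
r0 = [76.85, 10.10, 2.45, 0.04, 0.86, 0.63, 1.18, 3.92, 781, 72, 82]    # Fave
r1 = [63.37, 10.24, 4.20, 0.11, 2.00, 2.07, 0.53, 2.72, 529, 95, 25]    # Mortagne

R1 = mix(R0, r0, 0.35)  # downstream of the Fave
R2 = mix(R1, r1, 0.20)  # downstream of the Mortagne

# Table 1.2: SiO2, TiO2, Al2O3, FeO, MgO, CaO, Na2O (%).
PM = [51.63, 1.94, 13.12, 10.80, 8.53, 9.97, 2.21]  # parent magma
OL = [39.90, 0.00, 0.00, 11.70, 47.80, 0.28, 0.00]  # olivine fo88

residual = {f: remove_phase(PM, OL, f) for f in (0.05, 0.10, 0.15)}

if __name__ == "__main__":
    print("R1:", [round(v, 2) for v in R1])
    print("R2:", [round(v, 2) for v in R2])
    for f, rl in residual.items():
        print(f"f_ol = {f:.2f}:", [round(v, 2) for v in rl],
              "Na2O/TiO2 =", round(rl[6] / rl[1], 2))
```

The Na2O/TiO2 ratio of the residual liquid stays equal to that of the parent magma because both oxides are divided by the same factor $1 - f_{ol}$ and neither is subtracted with the olivine.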

1.1.3 Ternary mixing and removal

At least three species will be necessary to describe the behavior of the three end-members $j = \alpha$, $j = \beta$, and $j = \gamma$. From the closure equation

$$f_\alpha + f_\beta + f_\gamma = 1$$

Mass balance for species $i$ reads

$$C_{mix}^i = f_\alpha C_\alpha^i + f_\beta C_\beta^i + f_\gamma C_\gamma^i$$

which for each of the three species $i = i_1$, $i = i_2$ and $i = i_3$ can be rewritten

$$C_{mix}^{i_1} - C_\alpha^{i_1} = f_\beta \left( C_\beta^{i_1} - C_\alpha^{i_1} \right) + f_\gamma \left( C_\gamma^{i_1} - C_\alpha^{i_1} \right)$$
$$C_{mix}^{i_2} - C_\alpha^{i_2} = f_\beta \left( C_\beta^{i_2} - C_\alpha^{i_2} \right) + f_\gamma \left( C_\gamma^{i_2} - C_\alpha^{i_2} \right)$$
$$C_{mix}^{i_3} - C_\alpha^{i_3} = f_\beta \left( C_\beta^{i_3} - C_\alpha^{i_3} \right) + f_\gamma \left( C_\gamma^{i_3} - C_\alpha^{i_3} \right) \qquad (1.1.8)$$

In the space $C_{mix}^{i_1}$, $C_{mix}^{i_2}$, $C_{mix}^{i_3}$ (Figure 1.3), this is the equation of a triangle whose apices are the points represented by the triplets

$[C_\alpha^{i_1}, C_\alpha^{i_2}, C_\alpha^{i_3}]$ for $f_\alpha = 1$, $f_\beta = 0$, $f_\gamma = 0$
$[C_\beta^{i_1}, C_\beta^{i_2}, C_\beta^{i_3}]$ for $f_\alpha = 0$, $f_\beta = 1$, $f_\gamma = 0$
$[C_\gamma^{i_1}, C_\gamma^{i_2}, C_\gamma^{i_3}]$ for $f_\alpha = 0$, $f_\beta = 0$, $f_\gamma = 1$

Figure 1.3 Triangle of mixing between end-members $\alpha$, $\beta$, and $\gamma$ as given by equation (1.1.8) in a $C^{i_1}$, $C^{i_2}$, $C^{i_3}$ space.

Table 1.3. Major-element composition (%) of the minerals used to calculate the composition of a gabbro with known modal abundances.

          fo85     di       an80
SiO2      40.01    54.69    48.07
Al2O3      0.00     0.00    33.37
FeO       14.35     3.27     0.00
MgO       45.64    16.51     0.00
CaO        0.00    25.52    16.31
Na2O       0.00     0.00     2.25

An olivine gabbro (WR = whole rock) contains 40 percent olivine (ol) fo85, 30 percent diopside (di), and 30 percent plagioclase (pl) an80. From the mineral compositions given in Table 1.3, calculate the whole-rock composition.

Mass balance of SiO2 reads

$$C_{WR}^{SiO_2} = f_{ol} C_{ol}^{SiO_2} + f_{di} C_{di}^{SiO_2} + f_{pl} C_{pl}^{SiO_2} = 0.4 \times 40.01 + 0.3 \times 54.69 + 0.3 \times 48.07 = 46.83$$

Table 1.4. Major-element composition (%) of basalt and phenocrysts used to calculate cumulate (cum) and residual liquid (RL) compositions.

          basalt   fo85     di       an80     cum      RL
SiO2      49.79    40.01    54.69    48.07    48.44    50.13
Al2O3     16.95     0.00     0.00    33.37    16.68    17.02
FeO        8.52    14.35     3.27     0.00     3.85     9.68
MgO        8.59    45.64    16.51     0.00    14.08     7.22
CaO       12.17     0.00    25.52    16.31    15.81    11.26
Na2O       2.61     0.00     0.00     2.25     1.13     2.98

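and similar equations can be written for the other oxides. Anticipating the methods developed in Chapter 2, we can set up all the equivalent equations in matrix form and find

$$\begin{bmatrix} C^{\mathrm{SiO_2}} \\ C^{\mathrm{Al_2O_3}} \\ C^{\mathrm{FeO}} \\ C^{\mathrm{MgO}} \\ C^{\mathrm{CaO}} \\ C^{\mathrm{Na_2O}} \end{bmatrix}_{\mathrm{WR}}
= \begin{bmatrix} 40.01 & 54.69 & 48.07 \\ 0.00 & 0.00 & 33.37 \\ 14.35 & 3.27 & 0.00 \\ 45.64 & 16.51 & 0.00 \\ 0.00 & 25.52 & 16.31 \\ 0.00 & 0.00 & 2.25 \end{bmatrix}
\begin{bmatrix} 0.4 \\ 0.3 \\ 0.3 \end{bmatrix}
= \begin{bmatrix} 46.83 \\ 10.01 \\ 6.72 \\ 23.21 \\ 12.55 \\ 0.68 \end{bmatrix}$$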
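The gabbro calculation is a matrix-vector product between the mineralogical matrix of Table 1.3 and the vector of modal abundances. The following sketch spells it out in plain Python, without any linear-algebra library; the names `MINERALS`, `MODE` and `mix` are illustrative assumptions, not notation from the text.

```python
# Illustrative sketch: whole-rock composition as a weighted sum of mineral
# compositions (Table 1.3), i.e. a matrix-vector product written out by hand.
OXIDES = ["SiO2", "Al2O3", "FeO", "MgO", "CaO", "Na2O"]
MINERALS = {
    "ol": [40.01, 0.00, 14.35, 45.64, 0.00, 0.00],   # fo85
    "di": [54.69, 0.00, 3.27, 16.51, 25.52, 0.00],   # diopside
    "pl": [48.07, 33.37, 0.00, 0.00, 16.31, 2.25],   # an80
}
MODE = {"ol": 0.4, "di": 0.3, "pl": 0.3}             # mass fractions


def mix(compositions, fractions):
    """Return the mixture composition sum_j f_j * C_j, oxide by oxide."""
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("mass fractions must sum to one (closure)")
    n_oxides = len(next(iter(compositions.values())))
    return [sum(fractions[j] * compositions[j][i] for j in fractions)
            for i in range(n_oxides)]


whole_rock = mix(MINERALS, MODE)

if __name__ == "__main__":
    for oxide, value in zip(OXIDES, whole_rock):
        print(f"{oxide:6s} {value:6.2f}")
```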

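20 percent of a cumulate (cum) consisting of 20% olivine, 30% diopside and 50% plagioclase is withdrawn from a mid-ocean ridge basalt (bas). Compositions are given in Table 1.4. Calculate the composition of the residual liquid (RL). Mass balance requires

$$C_{\mathrm{bas}}^{i} = 0.2\,C_{\mathrm{cum}}^{i} + 0.8\,C_{\mathrm{RL}}^{i}$$

or

$$C_{\mathrm{RL}}^{i} = \frac{C_{\mathrm{bas}}^{i} - 0.2\,C_{\mathrm{cum}}^{i}}{0.8}$$

where

$$C_{\mathrm{cum}}^{i} = 0.2\,C_{\mathrm{fo}}^{i} + 0.3\,C_{\mathrm{di}}^{i} + 0.5\,C_{\mathrm{pl}}^{i}$$

The results are given in Table 1.4.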
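The two steps of the cumulate example (build the cumulate from its modal proportions, then subtract it from the basalt) can be chained as in the sketch below. It assumes the compositions of Table 1.4; the variable names are hypothetical.

```python
# Illustrative sketch: composition of a three-phase cumulate and of the
# residual liquid left after withdrawing 20 percent of it from a basalt.
OXIDES = ["SiO2", "Al2O3", "FeO", "MgO", "CaO", "Na2O"]
BASALT = [49.79, 16.95, 8.52, 8.59, 12.17, 2.61]
FO85 = [40.01, 0.00, 14.35, 45.64, 0.00, 0.00]
DI = [54.69, 0.00, 3.27, 16.51, 25.52, 0.00]
AN80 = [48.07, 33.37, 0.00, 0.00, 16.31, 2.25]

F_CUM = 0.2  # mass fraction of cumulate withdrawn from the basalt

# Step 1: cumulate = 20% olivine + 30% diopside + 50% plagioclase.
cumulate = [0.2 * ol + 0.3 * di + 0.5 * pl
            for ol, di, pl in zip(FO85, DI, AN80)]

# Step 2: residual liquid from C_bas = F_CUM * C_cum + (1 - F_CUM) * C_RL.
residual = [(bas - F_CUM * cum) / (1.0 - F_CUM)
            for bas, cum in zip(BASALT, cumulate)]

if __name__ == "__main__":
    for oxide, cum, rl in zip(OXIDES, cumulate, residual):
        print(f"{oxide:6s} cum={cum:6.2f}  RL={rl:6.2f}")
```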


1.1.4 The inverse approach

The following chapters discuss how the previous calculations can be inverted in order to retrieve from mixing arrays: (a) mineral proportions in a rock when the compositions of the rock and minerals are known, (b) end-member proportions in a mixture, (c) mass fractions of phases removed from a parent magma.

The constrained least-squares method is developed in Section 5.3, where a numerical example is treated in detail. Efficient specific algorithms taking errors into account have been developed by Provost and Allègre (1979). The literature abounds in alternative methods. Wright and Doherty (1970) use linear programming methods that are fast and offer an easy implementation of linear constraints, but the structure of the data is not easily perceived and error assessment is inefficiently handled. Principal component analysis (Section 4.4) is more efficient when the end-members are unknown.

1.2 Reactional assemblage

An assemblage of n components or minerals, with $C_j^i$ being the concentration of species i in phase j (i = 1, ..., m; j = 1, ..., n), is said to be reactional when the mineral concentration vectors are not independent. Reactional assemblages are a source of considerable difficulties when dealing with mass balance problems. A simple example is the well-known reactional assemblage made of enstatite (en = MgSiO3), forsterite (fo = Mg2SiO4) and quartz (qz = SiO2). The mineralogical matrix in mole fraction of oxides of this assemblage is given in Table 1.5. As we will see in more detail in Chapter 2, three vectors in a two-dimensional space cannot be independent and there exists an infinity of relationships such that

$$1\,\mathbf{en} - \tfrac{3}{4}\,\mathbf{fo} - \tfrac{1}{4}\,\mathbf{qz} = \mathbf{0}$$

where the coefficients are called stoichiometric. Multiplication of the stoichiometric coefficients by the same arbitrary number gives a new solution. In addition, the same stoichiometric coefficients relate the vectors and their components on each axis (Table 1.5 and Figure 1.4), i.e., their concentrations. In this particular case, we easily check that

$$C_{\mathrm{en}}^{\mathrm{SiO_2}} - \tfrac{3}{4}\,C_{\mathrm{fo}}^{\mathrm{SiO_2}} - \tfrac{1}{4}\,C_{\mathrm{qz}}^{\mathrm{SiO_2}} = \tfrac{1}{2} - \tfrac{3}{4}\times\tfrac{1}{3} - \tfrac{1}{4}\times 1 = 0$$

and

$$C_{\mathrm{en}}^{\mathrm{MgO}} - \tfrac{3}{4}\,C_{\mathrm{fo}}^{\mathrm{MgO}} - \tfrac{1}{4}\,C_{\mathrm{qz}}^{\mathrm{MgO}} = \tfrac{1}{2} - \tfrac{3}{4}\times\tfrac{2}{3} - \tfrac{1}{4}\times 0 = 0$$

Table 1.5. A mineralogical matrix (mole fractions) for the enstatite-forsterite-quartz assemblage.

          en      fo      qz
SiO2      1/2     1/3     1
MgO       1/2     2/3     0

Figure 1.4 A reactional assemblage enstatite-forsterite-quartz in the 2-dimensional space SiO2-MgO. Enstatite can be decomposed as 3/4 forsterite plus 1/4 quartz.

Taking this trivial example a little further, we imagine that we grind together enstatite, forsterite and quartz and mix the powders intimately (at which Nature usually balks). Then we ask ourselves whether a measure of the MgO and SiO2 contents of the mixture combined with a knowledge of mineral compositions can provide an estimate of mineral proportions in the mixture. The answer, of course, is no, because there is an infinity of solutions that we can derive from each other by allowing some forsterite consumption. In that case, thermodynamics is needed to tell us which phase assemblage is actually stable. We know from petrology that either the assemblage (forsterite + enstatite) or the assemblage (enstatite + quartz) should be the end-point of the reaction, not the assemblage (forsterite + quartz). More generally, when a reaction exists among the n minerals or end-members of a mixture, there exist n numbers $\nu_j$ (j = 1, ..., n) such that, for every species i,

$$\sum_{j=1}^{n} \nu_j\,C_j^{i} = 0 \tag{1.2.1}$$

The breakdown of a mixture into its end-members is no longer unique and additional assumptions are needed.
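The non-uniqueness can be checked numerically. The sketch below uses exact fractions and the mole-fraction vectors of Table 1.5: adding any multiple t of the stoichiometric coefficients (1, -3/4, -1/4) to a set of mineral proportions leaves both the bulk composition and the closure condition unchanged. The starting proportions and the value of t are arbitrary choices made for illustration.

```python
# Illustrative sketch: two different modes of the en-fo-qz assemblage that
# give exactly the same bulk SiO2 and MgO mole fractions.
from fractions import Fraction as F

# Columns of Table 1.5: (SiO2, MgO) mole fractions of each mineral.
EN = (F(1, 2), F(1, 2))
FO = (F(1, 3), F(2, 3))
QZ = (F(1), F(0))
NU = (F(1), F(-3, 4), F(-1, 4))   # stoichiometric coefficients: en - 3/4 fo - 1/4 qz = 0


def bulk(proportions):
    """Bulk (SiO2, MgO) of a mixture of en, fo and qz in the given proportions."""
    return tuple(sum(p * mineral[i] for p, mineral in zip(proportions, (EN, FO, QZ)))
                 for i in range(2))


base = (F(1, 2), F(1, 4), F(1, 4))          # arbitrary starting mode (en, fo, qz)
t = F(1, 5)                                 # arbitrary extent of reaction
shifted = tuple(p + t * nu for p, nu in zip(base, NU))

if __name__ == "__main__":
    print("mode 1:", base, "bulk:", bulk(base))
    print("mode 2:", shifted, "bulk:", bulk(shifted))
```

Because the coefficients sum to zero, the shifted proportions still sum to one, so chemistry alone cannot discriminate between the two modes.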


1.3 Working with ratios

1.3.1 Introduction

Basic concepts are decades old and it is difficult to establish the first mention of hyperbolic mixing relationships for ratio-ratio plots with different denominators. An early work on mixing hyperbolae in the U-Th-Pb chronometric system is that of Steiger and Wasserburg (1966), while Vollmer (1976), Langmuir et al. (1978), and Juteau et al. (1986) provide detailed discussions of the mixing parameters. Isotopic ratios have extremely attractive properties and carry information beyond simple mass balance. Ratios involving a radiogenic isotope like 87Sr/86Sr are time-dependent and trace the parent-daughter ratio of the rock or its source. Ratios of stable isotopes such as 18O/16O carry information on equilibration temperature and the isotopic properties of the elusive fluid phase lost by most rocks. In addition, other ratio indexes turn out to be much more informative than plain concentrations. The FeO/MgO ratio of igneous mafic rocks is a very efficient differentiation index that is almost insensitive to crustal contamination. Incompatible-element ratios in lavas are insensitive to the proportion of phenocrysts in the sample. However attractive, ratios should not be handled casually, because in most cases ratio variables have non-linear relationships to other variables. The ratio of two species (or isotopes) with concentrations $C^{i_1}$ and $C^{i_2}$ in a mixture of n components is

$$\left(\frac{C^{i_2}}{C^{i_1}}\right)_{\mathrm{mix}} = \frac{\sum_{j=1}^{n} C_j^{i_2} f_j}{\sum_{j=1}^{n} C_j^{i_1} f_j}$$

Alternatively, this equation can be written

$$\left(\frac{C^{i_2}}{C^{i_1}}\right)_{\mathrm{mix}} = \sum_{j=1}^{n} \varphi_j^{i_1} \left(\frac{C^{i_2}}{C^{i_1}}\right)_{j} \tag{1.3.1}$$

where $\varphi_j^{i_1}$, defined by

$$\varphi_j^{i_1} = \frac{C_j^{i_1} f_j}{\sum_{k=1}^{n} C_k^{i_1} f_k} \tag{1.3.2}$$

is the mass fraction of species i1 that was contributed by the component j to the mixture. Summing equation (1.3.2) over all the phases

$$\sum_{j=1}^{n} \varphi_j^{i_1} = \frac{\sum_{j=1}^{n} C_j^{i_1} f_j}{\sum_{k=1}^{n} C_k^{i_1} f_k} = 1 \tag{1.3.3}$$


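Calculate the FeO/MgO ratio of a rock made of 50% groundmass (gd, 11% FeO, 10% MgO), 30% olivine (ol, 15% FeO, 45% MgO) and 20% clinopyroxene (cpx, 4% FeO, 18% MgO). From equations (1.3.1) and (1.3.2), and using FeO/MgO for concentration ratios, we write

$$\left(\frac{\mathrm{FeO}}{\mathrm{MgO}}\right)_{\mathrm{rock}} = \varphi_{\mathrm{gd}}^{\mathrm{MgO}} \left(\frac{\mathrm{FeO}}{\mathrm{MgO}}\right)_{\mathrm{gd}} + \varphi_{\mathrm{ol}}^{\mathrm{MgO}} \left(\frac{\mathrm{FeO}}{\mathrm{MgO}}\right)_{\mathrm{ol}} + \varphi_{\mathrm{cpx}}^{\mathrm{MgO}} \left(\frac{\mathrm{FeO}}{\mathrm{MgO}}\right)_{\mathrm{cpx}}$$

We first calculate $C_{\mathrm{gd}}^{\mathrm{MgO}} f_{\mathrm{gd}}$ = 10 × 0.5 = 5, $C_{\mathrm{ol}}^{\mathrm{MgO}} f_{\mathrm{ol}}$ = 45 × 0.3 = 13.5, $C_{\mathrm{cpx}}^{\mathrm{MgO}} f_{\mathrm{cpx}}$ = 18 × 0.2 = 3.6 and $C_{\mathrm{rock}}^{\mathrm{MgO}}$ = 5 + 13.5 + 3.6 = 22.1, all in units of percent MgO relative to the mass of rock. We then calculate the φ values as $\varphi_{\mathrm{gd}}^{\mathrm{MgO}}$ = 5/22.1 = 0.226, $\varphi_{\mathrm{ol}}^{\mathrm{MgO}}$ = 13.5/22.1 = 0.611 and $\varphi_{\mathrm{cpx}}^{\mathrm{MgO}}$ = 3.6/22.1 = 0.163, and finally

$$\left(\frac{\mathrm{FeO}}{\mathrm{MgO}}\right)_{\mathrm{rock}} = 0.226 \times \frac{11}{10} + 0.611 \times \frac{15}{45} + 0.163 \times \frac{4}{18} \approx 0.49$$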
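The weighting of phase ratios by the denominator allotments φ of equation (1.3.2) can be sketched as follows, using the groundmass-olivine-clinopyroxene data of the FeO/MgO example. The layout of `PHASES` and the function name are assumptions made for illustration; the result is compared with the direct ratio of bulk FeO to bulk MgO, which must be identical.

```python
# Illustrative sketch: ratio of a mixture as the phi-weighted mean of the
# ratios of its components, phi being the share of the denominator species.
# Each phase: (mass fraction, FeO %, MgO %).
PHASES = {
    "gd":  (0.5, 11.0, 10.0),
    "ol":  (0.3, 15.0, 45.0),
    "cpx": (0.2, 4.0, 18.0),
}


def ratio_of_mixture(phases):
    """Return (phi, ratio) where phi[j] is the share of the denominator in phase j."""
    total_den = sum(f * den for f, _, den in phases.values())
    phi = {name: f * den / total_den for name, (f, _, den) in phases.items()}
    ratio = sum(phi[name] * num / den for name, (_, num, den) in phases.items())
    return phi, ratio


phi, feo_mgo = ratio_of_mixture(PHASES)

if __name__ == "__main__":
    for name, value in phi.items():
        print(f"phi_{name} = {value:.3f}")
    bulk_feo = sum(f * num for f, num, _ in PHASES.values())
    bulk_mgo = sum(f * den for f, _, den in PHASES.values())
    print(f"FeO/MgO from phi weighting: {feo_mgo:.4f}")
    print(f"FeO/MgO from bulk oxides:   {bulk_feo / bulk_mgo:.4f}")
```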

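A basalt (bas) with 400 ppm Sr and a 87Sr/86Sr ratio of 0.704 is contaminated by a gneiss (gn) with 100 ppm Sr and a 87Sr/86Sr ratio of 0.712. The 87Sr/86Sr ratio of the hybrid rock (hyb) is 0.705. Calculate the mass proportion of gneiss assimilated by the basalt. The atomic binary mixing equation reads

$$\left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{hyb}} = \varphi_{\mathrm{bas}}^{^{86}\mathrm{Sr}} \left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{bas}} + \varphi_{\mathrm{gn}}^{^{86}\mathrm{Sr}} \left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{gn}}$$

where the $\varphi^{^{86}\mathrm{Sr}}$ refer to atomic fractions. Since the two 86Sr fractions sum up to unity, we can rewrite this as

$$\left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{hyb}} = \left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{bas}} + \varphi_{\mathrm{gn}}^{^{86}\mathrm{Sr}} \left[\left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{gn}} - \left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{bas}}\right] \tag{1.3.4}$$

and

$$\varphi_{\mathrm{gn}}^{^{86}\mathrm{Sr}} = \frac{(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{hyb}} - (^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{bas}}}{(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{gn}} - (^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{bas}}} = \frac{0.705 - 0.704}{0.712 - 0.704} = \frac{1}{8}$$

Let $a^{86}$ be the number of 86Sr atoms per gram of Sr. If $f_{\mathrm{gn}}$ is the fraction of gneiss and $1 - f_{\mathrm{gn}}$ the fraction of basalt in the hybrid lava, we can write

$$\varphi_{\mathrm{gn}}^{^{86}\mathrm{Sr}} = \frac{100\,f_{\mathrm{gn}}\,a_{\mathrm{gn}}^{86}}{100\,f_{\mathrm{gn}}\,a_{\mathrm{gn}}^{86} + 400\,(1 - f_{\mathrm{gn}})\,a_{\mathrm{bas}}^{86}} = \frac{1}{8}$$

Given the extremely small variations of the Sr isotope compositions (most Sr is 88Sr), $a^{86}$ is virtually identical for each component, hence it can be canceled out so that

$$\frac{100\,f_{\mathrm{gn}}}{100\,f_{\mathrm{gn}} + 400\,(1 - f_{\mathrm{gn}})} = \frac{1}{8}$$

Mass proportions of gneiss and basalt are

$$f_{\mathrm{gn}} = \frac{400}{1100} = 0.364 \qquad f_{\mathrm{bas}} = \frac{700}{1100} = 0.636$$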

The hybrid magma contains proportions of gneiss and basalt in a ratio of 4 to 7.

The δ18O value of modern seawater (sw) is 0‰ while the average δ18O value of polar ice caps is -45‰. Calculate the δ18O value of the 'ice-free' ocean obtained upon melting the polar ice caps, i.e., the bulk value of the hydrosphere. Assume that ice caps hold a fraction $f_{\mathrm{ice}}$ = 2 percent in mass of the terrestrial waters and that other water reservoirs (e.g., ground water) can be neglected. The δ18O of a sample is the relative deviation in permil of its 18O/16O ratio relative to that ratio in a reference material, usually the standard mean ocean water (SMOW)

$$\delta^{18}\mathrm{O} = \left[\frac{(^{18}\mathrm{O}/^{16}\mathrm{O})_{\mathrm{sample}}}{(^{18}\mathrm{O}/^{16}\mathrm{O})_{\mathrm{SMOW}}} - 1\right] \times 1000$$

The atomic binary mixing equation is

$$\left(\frac{^{18}\mathrm{O}}{^{16}\mathrm{O}}\right)_{\mathrm{hydrosphere}} = \varphi_{\mathrm{sw}}^{^{16}\mathrm{O}} \left(\frac{^{18}\mathrm{O}}{^{16}\mathrm{O}}\right)_{\mathrm{sw}} + \varphi_{\mathrm{ice}}^{^{16}\mathrm{O}} \left(\frac{^{18}\mathrm{O}}{^{16}\mathrm{O}}\right)_{\mathrm{ice}}$$

Upon division of the mixing equation by $(^{18}\mathrm{O}/^{16}\mathrm{O})_{\mathrm{SMOW}}$, subtraction of the equality

$$1 = \varphi_{\mathrm{sw}}^{^{16}\mathrm{O}} + \varphi_{\mathrm{ice}}^{^{16}\mathrm{O}}$$

and multiplication by 1000, the mixing equation can be rewritten with the δ notation

$$\delta^{18}\mathrm{O}_{\mathrm{hydrosphere}} = \varphi_{\mathrm{sw}}^{^{16}\mathrm{O}}\,\delta^{18}\mathrm{O}_{\mathrm{sw}} + \varphi_{\mathrm{ice}}^{^{16}\mathrm{O}}\,\delta^{18}\mathrm{O}_{\mathrm{ice}}$$

By definition, the atom fraction of 16O frozen in polar caps is

$$\varphi_{\mathrm{ice}}^{^{16}\mathrm{O}} = \frac{f_{\mathrm{ice}}\,x_{\mathrm{ice}}^{16}}{f_{\mathrm{ice}}\,x_{\mathrm{ice}}^{16} + f_{\mathrm{sw}}\,x_{\mathrm{sw}}^{16}}$$

where $x^{16}$ is the number of 16O atoms per gram of water. Since 16O is by far the most abundant oxygen isotope in water, $x_{\mathrm{ice}}^{16}$ and $x_{\mathrm{sw}}^{16}$ are nearly equal, and therefore

$$\delta^{18}\mathrm{O}_{\mathrm{hydrosphere}} \approx (0) \times 0.98 + (-45) \times 0.02 = -0.9\text{‰}$$

Paleotemperatures obtained from the oxygen isotope composition of marine biogenic calcium carbonate depend on the δ18O of seawater, the calibration being approximately 0.2 delta unit per degree C. Skeletal calcium carbonate of foraminifers from pre-Tertiary sediments formed in a presumably warmer, ice-free ocean and isotopic temperatures should be evaluated with respect to δ18O values of seawater more negative than that of modern seawater. In contrast, calcium carbonate precipitated during the Quaternary glaciations from seawater with more positive δ18O values. Depending on the assumption made on the amount of polar ice, estimated temperatures may differ by up to several degrees C (Shackleton and Kennett, 1975).

Isotope dilution. 0.250 grams of a neodymium spike solution containing 1.5 nmol 150Nd per gram of solution is added to 100 mg of a sample. Measurement on a mass spectrometer indicates that the mixture has a 148Nd/150Nd of 0.092. Calculate the Nd concentration of this sample. Isotopic atomic proportions of the 148 and 150 isotopes are 5.73 and 5.62 percent in natural Nd (molar weight 144.24), and 0.40 and 97.25 percent in the Oak Ridge 150Nd spike. Transposing the mixing equation (1.3.4) derived for 87Sr/86Sr and using the subscripts sa, sp, and mix for sample, spike, and mixture, we write the atomic mixing equation as

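$$\left(\frac{^{148}\mathrm{Nd}}{^{150}\mathrm{Nd}}\right)_{\mathrm{mix}} = \left(\frac{^{148}\mathrm{Nd}}{^{150}\mathrm{Nd}}\right)_{\mathrm{sp}} + \varphi_{\mathrm{sa}}^{^{150}\mathrm{Nd}} \left[\left(\frac{^{148}\mathrm{Nd}}{^{150}\mathrm{Nd}}\right)_{\mathrm{sa}} - \left(\frac{^{148}\mathrm{Nd}}{^{150}\mathrm{Nd}}\right)_{\mathrm{sp}}\right]$$

or

$$\varphi_{\mathrm{sa}}^{^{150}\mathrm{Nd}} = \frac{(^{148}\mathrm{Nd}/^{150}\mathrm{Nd})_{\mathrm{mix}} - (^{148}\mathrm{Nd}/^{150}\mathrm{Nd})_{\mathrm{sp}}}{(^{148}\mathrm{Nd}/^{150}\mathrm{Nd})_{\mathrm{sa}} - (^{148}\mathrm{Nd}/^{150}\mathrm{Nd})_{\mathrm{sp}}}$$

Taking the reciprocal, we can refer to the more appropriate spike/sample ratio

$$\frac{1}{\varphi_{\mathrm{sa}}^{^{150}\mathrm{Nd}}} = \frac{N_{\mathrm{sa}}^{^{150}\mathrm{Nd}} + N_{\mathrm{sp}}^{^{150}\mathrm{Nd}}}{N_{\mathrm{sa}}^{^{150}\mathrm{Nd}}} = 1 + \frac{N_{\mathrm{sp}}^{^{150}\mathrm{Nd}}}{N_{\mathrm{sa}}^{^{150}\mathrm{Nd}}} = \frac{(^{148}\mathrm{Nd}/^{150}\mathrm{Nd})_{\mathrm{sa}} - (^{148}\mathrm{Nd}/^{150}\mathrm{Nd})_{\mathrm{sp}}}{(^{148}\mathrm{Nd}/^{150}\mathrm{Nd})_{\mathrm{mix}} - (^{148}\mathrm{Nd}/^{150}\mathrm{Nd})_{\mathrm{sp}}}$$

where $N_{\mathrm{sa}}^{^{150}\mathrm{Nd}}$ and $N_{\mathrm{sp}}^{^{150}\mathrm{Nd}}$ refer to the number of 150Nd atoms of sample and spike, respectively, present in the mixture. Then

$$\frac{N_{\mathrm{sp}}^{^{150}\mathrm{Nd}}}{N_{\mathrm{sa}}^{^{150}\mathrm{Nd}}} = \frac{(^{148}\mathrm{Nd}/^{150}\mathrm{Nd})_{\mathrm{sa}} - (^{148}\mathrm{Nd}/^{150}\mathrm{Nd})_{\mathrm{sp}}}{(^{148}\mathrm{Nd}/^{150}\mathrm{Nd})_{\mathrm{mix}} - (^{148}\mathrm{Nd}/^{150}\mathrm{Nd})_{\mathrm{sp}}} - 1 = \frac{(^{148}\mathrm{Nd}/^{150}\mathrm{Nd})_{\mathrm{sa}} - (^{148}\mathrm{Nd}/^{150}\mathrm{Nd})_{\mathrm{mix}}}{(^{148}\mathrm{Nd}/^{150}\mathrm{Nd})_{\mathrm{mix}} - (^{148}\mathrm{Nd}/^{150}\mathrm{Nd})_{\mathrm{sp}}}$$

and taking the reciprocal

$$\frac{N_{\mathrm{sa}}^{^{150}\mathrm{Nd}}}{N_{\mathrm{sp}}^{^{150}\mathrm{Nd}}} = \frac{(^{148}\mathrm{Nd}/^{150}\mathrm{Nd})_{\mathrm{mix}} - (^{148}\mathrm{Nd}/^{150}\mathrm{Nd})_{\mathrm{sp}}}{(^{148}\mathrm{Nd}/^{150}\mathrm{Nd})_{\mathrm{sa}} - (^{148}\mathrm{Nd}/^{150}\mathrm{Nd})_{\mathrm{mix}}}$$

In the present case

$$\frac{N_{\mathrm{sa}}^{^{150}\mathrm{Nd}}}{N_{\mathrm{sp}}^{^{150}\mathrm{Nd}}} = \frac{0.092 - 0.40/97.25}{5.73/5.62 - 0.092} = 0.0947$$

The amount of spike added $N_{\mathrm{sp}}^{^{150}\mathrm{Nd}}$ is 0.250 g × 1.5 nmol/g = 0.375 nmol so the amount of sample $N_{\mathrm{sa}}^{^{150}\mathrm{Nd}}$ is 0.375 nmol × 0.0947 = 0.0355 nmol. The amount of Nd in the sample is therefore equal to (0.0355 × 10^-9 mol) × 100%/5.62% = 0.632 × 10^-9 mol Nd or (0.632 × 10^-9 mol) × 144.24 g/mol = 0.091 × 10^-6 g elemental Nd. Since the assay was on 0.1 g sample, the sample Nd concentration is 0.91 ppm.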
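The isotope dilution arithmetic can be wrapped in a small function so that the chain from measured ratio to concentration is explicit. The sketch below takes its default inputs from the statement of the Nd example; the function and parameter names are illustrative assumptions, and the molar weight of the sample Nd is taken as that of natural Nd.

```python
# Illustrative sketch: isotope dilution with a 150Nd spike, following
# N_sa/N_sp = (R_mix - R_sp) / (R_sa - R_mix) with R = 148Nd/150Nd.
def sample_to_spike_150(r_mix, r_sample, r_spike):
    """Ratio of 150Nd atoms from the sample to 150Nd atoms from the spike."""
    return (r_mix - r_spike) / (r_sample - r_mix)


def nd_concentration_ppm(r_mix,
                         spike_mass_g=0.250,        # grams of spike solution added
                         spike_150_nmol_per_g=1.5,  # 150Nd content of the spike solution
                         sample_mass_g=0.100,       # grams of sample
                         ab148_nat=5.73, ab150_nat=5.62,  # natural abundances (%)
                         ab148_sp=0.40, ab150_sp=97.25,   # spike abundances (%)
                         molar_mass=144.24):        # g/mol of natural Nd
    r_sample = ab148_nat / ab150_nat
    r_spike = ab148_sp / ab150_sp
    n_spike_150 = spike_mass_g * spike_150_nmol_per_g             # nmol 150Nd
    n_sample_150 = n_spike_150 * sample_to_spike_150(r_mix, r_sample, r_spike)
    n_sample_nd = n_sample_150 * 100.0 / ab150_nat                # nmol total Nd
    mass_ng = n_sample_nd * molar_mass                            # nmol * g/mol = ng
    return mass_ng / sample_mass_g / 1000.0                       # ng/g -> ppm


if __name__ == "__main__":
    print(f"Nd concentration: {nd_concentration_ppm(0.092):.3f} ppm")
```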

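1.3.2 Ratio-concentration relationships in binary mixing

Equations (1.3.2) and (1.3.3) written for two species i1 and i2 and two components α and β are rearranged as

$$\left(\frac{C^{i_2}}{C^{i_1}}\right)_{\mathrm{mix}} = \left(\frac{C^{i_2}}{C^{i_1}}\right)_{\alpha} + \varphi_\beta^{i_1} \left[\left(\frac{C^{i_2}}{C^{i_1}}\right)_{\beta} - \left(\frac{C^{i_2}}{C^{i_1}}\right)_{\alpha}\right] \tag{1.3.5}$$

where $\varphi_\beta^{i_1}$ is the mass fraction of species i1 in the mixture that was contributed by the component β. Equivalently

$$\left(\frac{C^{i_2}}{C^{i_1}}\right)_{\mathrm{mix}} = \left(\frac{C^{i_2}}{C^{i_1}}\right)_{\alpha} + \left[\left(\frac{C^{i_2}}{C^{i_1}}\right)_{\beta} - \left(\frac{C^{i_2}}{C^{i_1}}\right)_{\alpha}\right] \frac{f_\beta\,C_\beta^{i_1}}{C_{\mathrm{mix}}^{i_1}} \tag{1.3.6}$$

The lever rule for species i1 reads

$$f_\beta = \frac{C_{\mathrm{mix}}^{i_1} - C_\alpha^{i_1}}{C_\beta^{i_1} - C_\alpha^{i_1}}$$

which results in

$$\frac{f_\beta\,C_\beta^{i_1}}{C_{\mathrm{mix}}^{i_1}} = \frac{C_\beta^{i_1}}{C_\beta^{i_1} - C_\alpha^{i_1}} \left(1 - \frac{C_\alpha^{i_1}}{C_{\mathrm{mix}}^{i_1}}\right)$$

We finally get the very useful relationship

$$\left(\frac{C^{i_2}}{C^{i_1}}\right)_{\mathrm{mix}} = \left(\frac{C^{i_2}}{C^{i_1}}\right)_{\alpha} + \frac{C_\beta^{i_1}}{C_\beta^{i_1} - C_\alpha^{i_1}} \left[\left(\frac{C^{i_2}}{C^{i_1}}\right)_{\beta} - \left(\frac{C^{i_2}}{C^{i_1}}\right)_{\alpha}\right] \left(1 - \frac{C_\alpha^{i_1}}{C_{\mathrm{mix}}^{i_1}}\right) \tag{1.3.8}$$

which is a linear relationship in the diagram $(C^{i_2}/C^{i_1})_{\mathrm{mix}}$ vs $1/C_{\mathrm{mix}}^{i_1}$. A symmetric expression can be obtained upon exchange of α and β. Plots such as (A/B) vs (1/B) have had a long history in rare-gas isotope geochemistry (e.g., Eberhardt et al., 1970). Plots of 206Pb/204Pb vs 1/C^Pb or 87Sr/86Sr vs 1/C^Sr are also widely used as mixing identification tools in radiogenic isotope geochemistry (e.g., Lancelot and Allègre, 1974; Boger and Faure, 1974; Faure, 1986). A note of caution: it is shown in Chapter 9 that the assimilation-fractional crystallization (AFC) model results in the same linear relationship, which could point to the wrong end-member if interpreted as a simple mixing relationship.

Going back to our previous mixing example, a basalt with 400 ppm Sr and a 87Sr/86Sr ratio of 0.704 is contaminated by a gneiss with 100 ppm Sr and a 87Sr/86Sr ratio of 0.712. Calculate the locus of possible hybrids, i.e., the mixing line in the 87Sr/86Sr vs 1/C^Sr diagram. The equation of the mixing line is

$$\left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{hyb}} = \left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{bas}} + \frac{[^{86}\mathrm{Sr}]_{\mathrm{gn}}}{[^{86}\mathrm{Sr}]_{\mathrm{gn}} - [^{86}\mathrm{Sr}]_{\mathrm{bas}}} \left[\left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{gn}} - \left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{bas}}\right] \left(1 - \frac{[^{86}\mathrm{Sr}]_{\mathrm{bas}}}{[^{86}\mathrm{Sr}]_{\mathrm{hyb}}}\right)$$

where $[^{86}\mathrm{Sr}]$ indicates the number of 86Sr atoms per gram of rock. We assume that this number is proportional to the Sr concentration in ppm, which, unless unrealistic differences of isotopic composition between end-members occur, is quite innocuous (e.g., Faure, 1986). Hence

$$\left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{hyb}} = \left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{bas}} + \frac{C_{\mathrm{gn}}^{\mathrm{Sr}}}{C_{\mathrm{gn}}^{\mathrm{Sr}} - C_{\mathrm{bas}}^{\mathrm{Sr}}} \left[\left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{gn}} - \left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{bas}}\right] \left(1 - \frac{C_{\mathrm{bas}}^{\mathrm{Sr}}}{C_{\mathrm{hyb}}^{\mathrm{Sr}}}\right)$$

which, in order to see what is going on when Sr concentration in the hybrid tends toward that in the gneiss, can be rewritten

$$\left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{hyb}} = \left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{gn}} + \frac{C_{\mathrm{bas}}^{\mathrm{Sr}}}{C_{\mathrm{bas}}^{\mathrm{Sr}} - C_{\mathrm{gn}}^{\mathrm{Sr}}} \left[\left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{bas}} - \left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{gn}}\right] \left(1 - \frac{C_{\mathrm{gn}}^{\mathrm{Sr}}}{C_{\mathrm{hyb}}^{\mathrm{Sr}}}\right)$$

Inserting the values of the parameters in either equation, we can now build a table for 10 percent mass fraction increments (Table 1.6) which we can plot in Figure 1.5.

Table 1.6. Calculated Sr concentrations (ppm) and 87Sr/86Sr ratios for different fractions of gneiss in the hybrid rock.

f_gn    (87Sr/86Sr)_hyb    1/C_hyb^Sr    C_hyb^Sr
0.0     0.70400            0.00250       400
0.1     0.70422            0.00270       370
0.2     0.70447            0.00294       340
0.3     0.70477            0.00323       310
0.4     0.70514            0.00357       280
0.5     0.70560            0.00400       250
0.6     0.70618            0.00455       220
0.7     0.70695            0.00526       190
0.8     0.70800            0.00625       160
0.9     0.70954            0.00769       130
1.0     0.71200            0.01000       100

Figure 1.5 The linear array of mixing between gneiss and basalt in a plot of 87Sr/86Sr vs the reciprocal of Sr concentration [equation (1.3.8)]. Points are drawn at 10 percent mass fraction increments. Data from Table 1.6.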
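Both directions of the basalt-gneiss calculation, building the mixing line from a gneiss fraction and recovering the gneiss fraction from a measured hybrid ratio, can be sketched as below. Default arguments are the end-member values of the example; the function names are hypothetical, and 86Sr is assumed proportional to total Sr as in the text.

```python
# Illustrative sketch: binary mixing of Sr between a basalt and a gneiss.
def hybrid(f_gn, c_bas=400.0, r_bas=0.704, c_gn=100.0, r_gn=0.712):
    """Return (Sr concentration, 87Sr/86Sr) of a hybrid with gneiss fraction f_gn."""
    c_hyb = f_gn * c_gn + (1.0 - f_gn) * c_bas
    phi_gn = f_gn * c_gn / c_hyb            # share of the hybrid Sr coming from gneiss
    r_hyb = r_bas + phi_gn * (r_gn - r_bas)
    return c_hyb, r_hyb


def gneiss_fraction(r_hyb, c_bas=400.0, r_bas=0.704, c_gn=100.0, r_gn=0.712):
    """Invert the mixing equation: mass fraction of gneiss from the hybrid ratio."""
    phi_gn = (r_hyb - r_bas) / (r_gn - r_bas)
    # phi = f*c_gn / (f*c_gn + (1 - f)*c_bas) solved for f
    return phi_gn * c_bas / (c_gn * (1.0 - phi_gn) + phi_gn * c_bas)


if __name__ == "__main__":
    print(" f_gn   87Sr/86Sr    1/C_Sr    C_Sr")
    for step in range(11):
        f = step / 10.0
        c, r = hybrid(f)
        print(f"{f:5.1f}   {r:.5f}   {1.0 / c:.5f}   {c:5.0f}")
    print(f"gneiss fraction for a hybrid ratio of 0.705: {gneiss_fraction(0.705):.3f}")
```

Plotting the second column against the third reproduces the straight line of equation (1.3.8).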

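1.3.3 Ratio-ratio relationships in binary mixing

We will now consider four distinct species (element, isotope, ion), i1, i2, i3, i4, form the concentration ratios of i2 to i1 and i4 to i3, and assess how the two ratios $C^{i_2}/C^{i_1}$ and $C^{i_4}/C^{i_3}$ vary in the mixture when the two end-members α and β are mixed in variable proportions. Equation (1.3.6) reads

$$\left(\frac{C^{i_2}}{C^{i_1}}\right)_{\mathrm{mix}} = \left(\frac{C^{i_2}}{C^{i_1}}\right)_{\alpha} + \left[\left(\frac{C^{i_2}}{C^{i_1}}\right)_{\beta} - \left(\frac{C^{i_2}}{C^{i_1}}\right)_{\alpha}\right] \frac{f_\beta\,C_\beta^{i_1}}{C_{\mathrm{mix}}^{i_1}}$$

which we rearrange as

$$\left(\frac{C^{i_2}}{C^{i_1}}\right)_{\mathrm{mix}} - \left(\frac{C^{i_2}}{C^{i_1}}\right)_{\alpha} = \left[\left(\frac{C^{i_2}}{C^{i_1}}\right)_{\beta} - \left(\frac{C^{i_2}}{C^{i_1}}\right)_{\alpha}\right] \frac{f_\beta\,C_\beta^{i_1}}{C_{\mathrm{mix}}^{i_1}}$$

The symmetric expression obtained upon exchange of α and β is

$$\left(\frac{C^{i_2}}{C^{i_1}}\right)_{\beta} - \left(\frac{C^{i_2}}{C^{i_1}}\right)_{\mathrm{mix}} = \left[\left(\frac{C^{i_2}}{C^{i_1}}\right)_{\beta} - \left(\frac{C^{i_2}}{C^{i_1}}\right)_{\alpha}\right] \frac{f_\alpha\,C_\alpha^{i_1}}{C_{\mathrm{mix}}^{i_1}}$$

then

$$\frac{(C^{i_2}/C^{i_1})_{\mathrm{mix}} - (C^{i_2}/C^{i_1})_{\alpha}}{(C^{i_2}/C^{i_1})_{\beta} - (C^{i_2}/C^{i_1})_{\mathrm{mix}}} = \frac{f_\beta\,C_\beta^{i_1}}{f_\alpha\,C_\alpha^{i_1}} \tag{1.3.9}$$

Likewise, for the ratio $C^{i_4}/C^{i_3}$

$$\frac{(C^{i_4}/C^{i_3})_{\mathrm{mix}} - (C^{i_4}/C^{i_3})_{\alpha}}{(C^{i_4}/C^{i_3})_{\beta} - (C^{i_4}/C^{i_3})_{\mathrm{mix}}} = \frac{f_\beta\,C_\beta^{i_3}}{f_\alpha\,C_\alpha^{i_3}} \tag{1.3.10}$$

Dividing equation (1.3.10) by equation (1.3.9), and rearranging, we get

$$\frac{(C^{i_4}/C^{i_3})_{\mathrm{mix}} - (C^{i_4}/C^{i_3})_{\alpha}}{(C^{i_4}/C^{i_3})_{\beta} - (C^{i_4}/C^{i_3})_{\mathrm{mix}}} \times \frac{(C^{i_2}/C^{i_1})_{\beta} - (C^{i_2}/C^{i_1})_{\mathrm{mix}}}{(C^{i_2}/C^{i_1})_{\mathrm{mix}} - (C^{i_2}/C^{i_1})_{\alpha}} = \frac{(C^{i_1}/C^{i_3})_{\alpha}}{(C^{i_1}/C^{i_3})_{\beta}} \tag{1.3.11}$$

We define the ratio of 'denominator ratios' q as

$$q = \frac{(C^{i_1}/C^{i_3})_{\beta}}{(C^{i_1}/C^{i_3})_{\alpha}} \tag{1.3.12}$$

and use the simple notation

$$x = C^{i_2}/C^{i_1} \qquad y = C^{i_4}/C^{i_3}$$

Inserting these expressions into (1.3.11) gives

$$\frac{(x^{\mathrm{mix}} - x^{\alpha})(y^{\beta} - y^{\mathrm{mix}})}{(x^{\beta} - x^{\mathrm{mix}})(y^{\mathrm{mix}} - y^{\alpha})} = q$$

which is developed as

$$x^{\mathrm{mix}} y^{\beta} + x^{\alpha} y^{\mathrm{mix}} - x^{\mathrm{mix}} y^{\mathrm{mix}} - x^{\alpha} y^{\beta} = q\,(x^{\mathrm{mix}} y^{\alpha} + x^{\beta} y^{\mathrm{mix}} - x^{\mathrm{mix}} y^{\mathrm{mix}} - x^{\beta} y^{\alpha})$$

and

$$x^{\mathrm{mix}} (y^{\beta} - q y^{\alpha}) + (x^{\alpha} - q x^{\beta})\,y^{\mathrm{mix}} = (1 - q)\,x^{\mathrm{mix}} y^{\mathrm{mix}} + x^{\alpha} y^{\beta} - q\,x^{\beta} y^{\alpha} \tag{1.3.13}$$

If q = 1 (equal concentration ratios of the denominator species, in particular when the denominators are identical or are isotopes from the same chemical species), the quadratic term in $x^{\mathrm{mix}}$ and $y^{\mathrm{mix}}$ vanishes and equation (1.3.13) becomes that of a straight line in the diagram (x, y)

$$x^{\mathrm{mix}} (y^{\beta} - y^{\alpha}) + (x^{\alpha} - x^{\beta})\,y^{\mathrm{mix}} = x^{\alpha} y^{\beta} - x^{\beta} y^{\alpha}$$

If q ≠ 1, equation (1.3.13) is the equation of a hyperbola. A hyperbola is best defined by its asymptotes and curvature. Let us define $x_0$ and $y_0$ as

$$x_0 = \frac{x^{\alpha} - q\,x^{\beta}}{1 - q} \quad \text{and} \quad y_0 = \frac{y^{\beta} - q\,y^{\alpha}}{1 - q} \tag{1.3.14}$$

and γ as

$$\gamma = \frac{x^{\alpha} y^{\beta} - q\,x^{\beta} y^{\alpha}}{1 - q} \tag{1.3.15}$$

We can write

$$x^{\mathrm{mix}} y_0 + x_0\,y^{\mathrm{mix}} = x^{\mathrm{mix}} y^{\mathrm{mix}} + \gamma$$

Introducing the product $x_0 y_0$ on both sides, we obtain

$$x^{\mathrm{mix}} y^{\mathrm{mix}} - x^{\mathrm{mix}} y_0 - x_0\,y^{\mathrm{mix}} + x_0 y_0 = x_0 y_0 - \gamma$$

which takes the simple form

$$(x^{\mathrm{mix}} - x_0)(y^{\mathrm{mix}} - y_0) = x_0 y_0 - \gamma \tag{1.3.16}$$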

Table 1.7a. End-member compositions (%) used for the mixing calculations shown in Figure 1.6.

          FeO    MgO    CaO    Na2O
basalt    10     10     10     2
gneiss     5      1      4     4

Table 1.7b. Some parameters of the mixing hyperbola shown in Figure 1.6.

(FeO/MgO)_bas                             1
(FeO/MgO)_gn                              5
(Na2O/CaO)_bas                            0.2
(Na2O/CaO)_gn                             1
C_gn^MgO / C_bas^MgO                      0.10
C_gn^CaO / C_bas^CaO                      0.40
q = (C^MgO/C^CaO)_gn / (C^MgO/C^CaO)_bas  0.25
x0                                        -0.333
y0                                        1.267
γ                                         1.000
x0 y0 - γ                                 -1.422

$x_0$ and $y_0$ are the coordinates of the asymptotes parallel to the y- and x-axis, respectively, while $(x_0 y_0 - \gamma)$ is the curvature factor (Juteau et al., 1986).

Draw in a diagram Na2O/CaO vs FeO/MgO the curve representing the hybrids (mix = hyb) between a basalt (α = bas) and a gneiss (β = gn), whose compositions are listed in Table 1.7a. Discuss the relationship between CaO and the magnesium number mg#, which is the atomic ratio Mg/(Fe + Mg). Table 1.7b lists the necessary parameters from the appropriate equations. The position of the asymptotes is given by equation (1.3.14)

$$x_0 = \frac{1 - 0.25 \times 5}{1 - 0.25} = -\frac{1}{3} \quad \text{and} \quad y_0 = \frac{1 - 0.25 \times 0.2}{1 - 0.25} = \frac{0.95}{0.75} = 1.267$$

[note that, in equation (1.3.14), $x_0$ and $y_0$ are not symmetrical relative to the end-member parameters] and γ by equation (1.3.15)

$$\gamma = \frac{1 \times 1 - 0.25 \times 5 \times 0.2}{1 - 0.25} = \frac{0.75}{0.75} = 1$$

The ratios in the hybrid rocks are calculated for a few mixing proportions, which is much easier than using the full equation (1.3.13). In Table 1.7c, the φ values are calculated from equation (1.3.2), then the ratios from equation (1.3.5). For example, if we assume $f_{\mathrm{gn}}$ = 0.1, we get

$$\varphi_{\mathrm{gn}}^{\mathrm{MgO}} = \frac{C_{\mathrm{gn}}^{\mathrm{MgO}} f_{\mathrm{gn}}}{C_{\mathrm{bas}}^{\mathrm{MgO}} f_{\mathrm{bas}} + C_{\mathrm{gn}}^{\mathrm{MgO}} f_{\mathrm{gn}}} = \frac{1 \times 0.1}{10 \times 0.9 + 1 \times 0.1} = \frac{0.1}{9.1} = 0.011$$

and

$$\varphi_{\mathrm{gn}}^{\mathrm{CaO}} = \frac{C_{\mathrm{gn}}^{\mathrm{CaO}} f_{\mathrm{gn}}}{C_{\mathrm{bas}}^{\mathrm{CaO}} f_{\mathrm{bas}} + C_{\mathrm{gn}}^{\mathrm{CaO}} f_{\mathrm{gn}}} = \frac{4 \times 0.1}{10 \times 0.9 + 4 \times 0.1} = \frac{0.4}{9.4} = 0.043$$

with

$$\left(\frac{\mathrm{FeO}}{\mathrm{MgO}}\right)_{\mathrm{hyb}} = \frac{9.0}{9.1} \times 1 + \frac{0.1}{9.1} \times 5 = 1.044$$

and

$$\left(\frac{\mathrm{Na_2O}}{\mathrm{CaO}}\right)_{\mathrm{hyb}} = \frac{9.0}{9.4} \times 0.2 + \frac{0.4}{9.4} \times 1 = 0.234$$

Table 1.7c. Results of the mixing calculation shown in Figure 1.6 for selected values of gneiss fractions in the mixture.

f_gn    φ_gn^MgO    (FeO/MgO)_hyb    φ_gn^CaO    (Na2O/CaO)_hyb
0.00    0.000       1.000            0.000       0.200
0.05    0.005       1.021            0.021       0.216
0.10    0.011       1.044            0.043       0.234
0.15    0.017       1.069            0.066       0.253
0.90    0.474       2.895            0.783       0.826
0.95    0.655       3.621            0.884       0.907
1.00    1.000       5.000            1.000       1.000

The complete results are plotted in Figure 1.6. The mg# is a useful parameter in petrologic problems: it varies rapidly with the fractionation of mafic minerals from basalts and is left almost unchanged by assimilation of crustal material, which contains very little Fe and Mg. The two coordinates may be considered as ratios, CaO with a constant denominator, mg# with (Fe + Mg) as the denominator. Mixing and fractionation relationships are therefore not straight lines in such a diagram. The mixing curvature is usually strong as it is a function of the atomic (Fe + Mg) contents in crust and basalt, which are usually extremely different.
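A quick numerical check of the hyperbola parameters is sketched below: the asymptotes and γ are computed from equations (1.3.14) and (1.3.15), and hybrids obtained by direct mass balance on the oxides of Table 1.7a are verified to satisfy equation (1.3.16). The function names and dictionary layout are illustrative assumptions.

```python
# Illustrative sketch: mixing hyperbola in the Na2O/CaO vs FeO/MgO diagram.
BAS = {"FeO": 10.0, "MgO": 10.0, "CaO": 10.0, "Na2O": 2.0}   # end-member alpha
GN = {"FeO": 5.0, "MgO": 1.0, "CaO": 4.0, "Na2O": 4.0}       # end-member beta


def hyperbola_parameters(xa, ya, xb, yb, q):
    """Asymptotes x0, y0 and gamma of equations (1.3.14) and (1.3.15); requires q != 1."""
    x0 = (xa - q * xb) / (1.0 - q)
    y0 = (yb - q * ya) / (1.0 - q)
    gamma = (xa * yb - q * xb * ya) / (1.0 - q)
    return x0, y0, gamma


def hybrid_ratios(f_gn):
    """(FeO/MgO, Na2O/CaO) of a hybrid by direct mass balance on each oxide."""
    mixed = {ox: (1.0 - f_gn) * BAS[ox] + f_gn * GN[ox] for ox in BAS}
    return mixed["FeO"] / mixed["MgO"], mixed["Na2O"] / mixed["CaO"]


# q compares the denominator ratios MgO/CaO of the two end-members.
q = (GN["MgO"] / GN["CaO"]) / (BAS["MgO"] / BAS["CaO"])
x0, y0, gamma = hyperbola_parameters(BAS["FeO"] / BAS["MgO"], BAS["Na2O"] / BAS["CaO"],
                                     GN["FeO"] / GN["MgO"], GN["Na2O"] / GN["CaO"], q)

if __name__ == "__main__":
    print(f"q={q:.2f}  x0={x0:.3f}  y0={y0:.3f}  gamma={gamma:.3f}  "
          f"curvature={x0 * y0 - gamma:.3f}")
    for f in (0.0, 0.05, 0.10, 0.15, 0.90, 0.95, 1.0):
        x, y = hybrid_ratios(f)
        print(f"f_gn={f:.2f}  FeO/MgO={x:.3f}  Na2O/CaO={y:.3f}  "
              f"(x-x0)(y-y0)={(x - x0) * (y - y0):.3f}")
```

The last column stays constant along the array, which is a convenient test that a set of points lies on a single binary mixing hyperbola.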

Figure 1.6 Hyperbolic array of mixing between gneiss and basalt in an Na2O/CaO vs FeO/MgO plot. Data from Table 1.7a. See text for explanations.

Draw on the 143Nd/144Nd vs 87Sr/86Sr diagram the curve representing the hybrids (hyb) of a basalt (α = bas) and a gneiss (β = gn) with the properties listed in Table 1.8a. Table 1.8b lists the necessary parameters from the appropriate equations, first the position of the asymptotes

$$x_0 = \frac{0.703 - 0.2 \times 0.710}{1 - 0.2} = 0.70125 \quad \text{and} \quad y_0 = \frac{0.511 - 0.2 \times 0.513}{1 - 0.2} = 0.5105$$

and then γ as

$$\gamma = \frac{0.703 \times 0.511 - 0.2 \times 0.710 \times 0.513}{1 - 0.2} = 0.357984$$

The φ values and the isotopic ratios are tabulated in Table 1.8c. For example, if we assume $f_{\mathrm{gn}}$ = 0.9, we get

$$\varphi_{\mathrm{gn}}^{\mathrm{Sr}} = \frac{C_{\mathrm{gn}}^{\mathrm{Sr}} f_{\mathrm{gn}}}{C_{\mathrm{bas}}^{\mathrm{Sr}} f_{\mathrm{bas}} + C_{\mathrm{gn}}^{\mathrm{Sr}} f_{\mathrm{gn}}} = \frac{200 \times 0.9}{100 \times 0.1 + 200 \times 0.9} = 0.947$$

and

$$\varphi_{\mathrm{gn}}^{\mathrm{Nd}} = \frac{C_{\mathrm{gn}}^{\mathrm{Nd}} f_{\mathrm{gn}}}{C_{\mathrm{bas}}^{\mathrm{Nd}} f_{\mathrm{bas}} + C_{\mathrm{gn}}^{\mathrm{Nd}} f_{\mathrm{gn}}} = \frac{20 \times 0.9}{2 \times 0.1 + 20 \times 0.9} = 0.989$$

The isotopic ratios of the hybrid are given by

$$\left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{hyb}} = 0.053 \times 0.703 + 0.947 \times 0.710 = 0.70963$$

and

$$\left(\frac{^{143}\mathrm{Nd}}{^{144}\mathrm{Nd}}\right)_{\mathrm{hyb}} = 0.011 \times 0.513 + 0.989 \times 0.511 = 0.51102$$

The complete results are plotted in Figure 1.7.

Table 1.8a. End-member compositions used for the mixing calculations shown in Figure 1.7.

               basalt    gneiss
87Sr/86Sr      0.703     0.710
143Nd/144Nd    0.513     0.511
C^Sr (ppm)     100       200
C^Nd (ppm)     2         20

Table 1.8b. Some parameters of the mixing hyperbola shown in Figure 1.7.

C_gn^Sr / C_bas^Sr                      2
C_gn^Nd / C_bas^Nd                      10
q = (C^Sr/C^Nd)_gn / (C^Sr/C^Nd)_bas    0.2
x0                                      0.701250
y0                                      0.510500
γ                                       0.357984
x0 y0 - γ                               0.000004375

Table 1.8c. Results of the mixing calculation shown in Figure 1.7 for selected values of gneiss fractions in the mixture.

f_gn    φ_gn^Sr    (87Sr/86Sr)_mix    φ_gn^Nd    (143Nd/144Nd)_mix
0.00    0.000      0.703000           0.000      0.513000
0.05    0.095      0.703667           0.345      0.512310
0.10    0.182      0.704273           0.526      0.511947
0.15    0.261      0.704826           0.638      0.511723
0.90    0.947      0.709632           0.989      0.511022
0.95    0.974      0.709821           0.995      0.511010
1.00    1.000      0.710000           1.000      0.511000

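Figure 1.7 A hyperbolic array of mixing between gneiss and basalt in a 143Nd/144Nd vs 87Sr/86Sr plot. Data from Table 1.8a.

We will take an example of closed-system meteoric-hydrothermal alteration of a granitic rock (modified from Taylor, 1978). A granitic rock with δ18O = +8 and δD = -65 is invaded by meteoric water (δ18O = -16, δD = -120). Assume that water-rock oxygen fractionation can be approximated by water-feldspar values (δ18O_feldspar - δ18O_water = +2) and hydrogen fractionation by the water-biotite value (δD_biotite - δD_water = -40). Assume $C_w^{\mathrm{O}}/C_r^{\mathrm{O}}$ = 2, where $C_w^{\mathrm{O}}$ and $C_r^{\mathrm{O}}$ are the number of oxygen atoms per gram of water and rock (ideally the number $a^{16}$ of 16O atoms should be used as before, but the difference is minute and we do not want to introduce confusion with isotopic fractionation factors) and $C_w^{\mathrm{H}}/C_r^{\mathrm{H}}$ = 100. Draw the locus of possible δ18O_rock and δD_biotite for variable proportions of interstitial water. We assume that the 18O/16O ratio of the (water + rock) system is preserved through the hydrothermal reaction (closed system)

$$\left(\frac{^{18}\mathrm{O}}{^{16}\mathrm{O}}\right)_{\mathrm{system}} = \frac{f_w C_w^{\mathrm{O}} \left(\frac{^{18}\mathrm{O}}{^{16}\mathrm{O}}\right)_w^{0} + f_r C_r^{\mathrm{O}} \left(\frac{^{18}\mathrm{O}}{^{16}\mathrm{O}}\right)_r^{0}}{f_w C_w^{\mathrm{O}} + f_r C_r^{\mathrm{O}}} = \frac{f_w C_w^{\mathrm{O}} \left(\frac{^{18}\mathrm{O}}{^{16}\mathrm{O}}\right)_w + f_r C_r^{\mathrm{O}} \left(\frac{^{18}\mathrm{O}}{^{16}\mathrm{O}}\right)_r}{f_w C_w^{\mathrm{O}} + f_r C_r^{\mathrm{O}}}$$

In this equation, $f_w$ and $f_r = 1 - f_w$ are the mass proportions of water and rock in the system, and the superscript 0 refers to the values before reaction. Let us multiply the second equality by the common denominator, divide each term by $(^{18}\mathrm{O}/^{16}\mathrm{O})_{\mathrm{SMOW}}$, subtract $(f_w C_w^{\mathrm{O}} + f_r C_r^{\mathrm{O}})$ and multiply by 1000. We obtain

$$f_w C_w^{\mathrm{O}}\,\delta^{18}\mathrm{O}_w + f_r C_r^{\mathrm{O}}\,\delta^{18}\mathrm{O}_r = f_w C_w^{\mathrm{O}}\,\delta^{18}\mathrm{O}_w^{0} + f_r C_r^{\mathrm{O}}\,\delta^{18}\mathrm{O}_r^{0}$$

in which we recognize the familiar expansion of the mixture as a function of the $\varphi^{\mathrm{O}}$ parameters

$$\varphi_w^{\mathrm{O}}\,\delta^{18}\mathrm{O}_w + \varphi_r^{\mathrm{O}}\,\delta^{18}\mathrm{O}_r = \varphi_w^{\mathrm{O}}\,\delta^{18}\mathrm{O}_w^{0} + \varphi_r^{\mathrm{O}}\,\delta^{18}\mathrm{O}_r^{0}$$

Likewise

$$f_w C_w^{\mathrm{H}}\,\delta\mathrm{D}_w + f_r C_r^{\mathrm{H}}\,\delta\mathrm{D}_r = f_w C_w^{\mathrm{H}}\,\delta\mathrm{D}_w^{0} + f_r C_r^{\mathrm{H}}\,\delta\mathrm{D}_r^{0}$$

Introducing the rock-water fractionation, we get

$$f_w C_w^{\mathrm{H}}\,(\delta\mathrm{D}_r + 40) + f_r C_r^{\mathrm{H}}\,\delta\mathrm{D}_r = f_w C_w^{\mathrm{H}}\,\delta\mathrm{D}_w^{0} + f_r C_r^{\mathrm{H}}\,\delta\mathrm{D}_r^{0}$$

which can be rearranged as

$$\delta\mathrm{D}_r = \varphi_w^{\mathrm{H}}\,(\delta\mathrm{D}_w^{0} - 40) + \varphi_r^{\mathrm{H}}\,\delta\mathrm{D}_r^{0}$$

while the same treatment for oxygen gives

$$\delta^{18}\mathrm{O}_r = \frac{f_w C_w^{\mathrm{O}}\,(\delta^{18}\mathrm{O}_w^{0} + 2) + f_r C_r^{\mathrm{O}}\,\delta^{18}\mathrm{O}_r^{0}}{f_w C_w^{\mathrm{O}} + f_r C_r^{\mathrm{O}}} = \varphi_w^{\mathrm{O}}\,(\delta^{18}\mathrm{O}_w^{0} + 2) + \varphi_r^{\mathrm{O}}\,\delta^{18}\mathrm{O}_r^{0}$$

In these equations, $\varphi_w^{\mathrm{H}}$ bears the usual meaning of the fraction of total hydrogen held by water, with similar definitions for oxygen and rock. The mass unit water/rock ratio widely used by economic geologists is simply $Q = f_w/f_r$. We can write

$$\varphi_w^{\mathrm{O}} = \frac{f_w C_w^{\mathrm{O}}}{f_w C_w^{\mathrm{O}} + f_r C_r^{\mathrm{O}}} = \frac{Q\,(C_w^{\mathrm{O}}/C_r^{\mathrm{O}})}{1 + Q\,(C_w^{\mathrm{O}}/C_r^{\mathrm{O}})} \qquad \varphi_w^{\mathrm{H}} = \frac{f_w C_w^{\mathrm{H}}}{f_w C_w^{\mathrm{H}} + f_r C_r^{\mathrm{H}}} = \frac{Q\,(C_w^{\mathrm{H}}/C_r^{\mathrm{H}})}{1 + Q\,(C_w^{\mathrm{H}}/C_r^{\mathrm{H}})}$$

With the values given for $C_w^{\mathrm{O}}/C_r^{\mathrm{O}}$ and $C_w^{\mathrm{H}}/C_r^{\mathrm{H}}$, a few values of δ18O_rock and δD_biotite have been calculated for a range of Q (Table 1.9) and plotted in Figure 1.8.

Table 1.9. Results of the mixing calculation shown in Figure 1.8 for selected values of the water/rock ratio $Q = f_w/f_r$.

Q       φ_w^O    δ18O_rock    φ_w^H    δD_biotite
0.00    0.000    8.00         0.000    -65.0
0.01    0.020    7.57         0.500    -112.5
0.02    0.038    7.15         0.667    -128.3
0.03    0.057    6.75         0.750    -136.3
0.04    0.074    6.37         0.800    -141.0
0.05    0.091    6.00         0.833    -144.2
0.10    0.167    4.33         0.909    -151.4
0.15    0.231    2.92         0.938    -154.1
0.20    0.286    1.71         0.952    -155.5
0.25    0.333    0.67         0.962    -156.3

Figure 1.8 Closed-system equilibration between biotite, rock and water for oxygen and hydrogen isotopes (from Taylor, 1978, modified). The starting point is the unaltered isotope composition of the granite. Data from Table 1.9. δ18O_rock - δ18O_water = +2 and δD_biotite - δD_water = -40.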
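The closed-system water-rock calculation can be sketched as a single function of the water/rock ratio Q. Defaults are the values of the example (initial δ values, atom ratios of 2 for oxygen and 100 for hydrogen, fractionations of +2 and -40); the function and argument names are hypothetical.

```python
# Illustrative sketch: delta values of rock (oxygen) and biotite (hydrogen)
# after closed-system equilibration with meteoric water at mass ratio Q.
def water_rock(Q,
               d18o_rock0=8.0, dd_rock0=-65.0,       # unaltered granite
               d18o_water0=-16.0, dd_water0=-120.0,  # incoming meteoric water
               c_ratio_o=2.0, c_ratio_h=100.0,       # C_w/C_r for O and H
               frac_o=2.0, frac_h=-40.0):            # delta_rock - delta_water
    # Share of the system's oxygen and hydrogen held by water.
    phi_o = Q * c_ratio_o / (1.0 + Q * c_ratio_o)
    phi_h = Q * c_ratio_h / (1.0 + Q * c_ratio_h)
    # Bulk delta is conserved; water ends up offset from rock by the fractionation.
    d18o_rock = phi_o * (d18o_water0 + frac_o) + (1.0 - phi_o) * d18o_rock0
    dd_biotite = phi_h * (dd_water0 + frac_h) + (1.0 - phi_h) * dd_rock0
    return phi_o, d18o_rock, phi_h, dd_biotite


if __name__ == "__main__":
    print("   Q   phi_O  d18O_rock  phi_H  dD_biotite")
    for Q in (0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.10, 0.15, 0.20, 0.25):
        phi_o, d18o, phi_h, dd = water_rock(Q)
        print(f"{Q:5.2f}  {phi_o:.3f}  {d18o:8.2f}  {phi_h:.3f}  {dd:9.1f}")
```

The very different atom ratios for oxygen and hydrogen explain why δD responds at much lower water/rock ratios than δ18O.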

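1.3.4 Mixing hyperbola: the inverse problem

The simplest inverse approach consists, given the end-member coordinates, in finding the mixing parameters of any individual mixture, i.e., finding the parameter q defined by equation (1.3.12). We will treat this case with an example.

An andesite (and) with 87Sr/86Sr = 0.706 and δ18O = 8 results from the hybridization of a basalt (bas) with 87Sr/86Sr = 0.703 and δ18O = 6 and sediments (sed) with 87Sr/86Sr = 0.713 and δ18O = 16. Effects of radioactive decay on isotopic compositions are neglected. Estimate the ratio q such that

$$q = \frac{(\mathrm{Sr}/\mathrm{O})_{\mathrm{sed}}}{(\mathrm{Sr}/\mathrm{O})_{\mathrm{bas}}}$$

Let us express 87Sr/86Sr in the mixture as a function of the 86Sr allotments φ and the 87Sr/86Sr ratio of each end-member

$$\left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{and}} = \varphi_{\mathrm{bas}}^{^{86}\mathrm{Sr}} \left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{bas}} + \varphi_{\mathrm{sed}}^{^{86}\mathrm{Sr}} \left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{sed}}$$

86Sr conservation requires

$$\varphi_{\mathrm{bas}}^{^{86}\mathrm{Sr}} + \varphi_{\mathrm{sed}}^{^{86}\mathrm{Sr}} = 1$$

and therefore

$$\varphi_{\mathrm{sed}}^{^{86}\mathrm{Sr}} = \frac{(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{and}} - (^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{bas}}}{(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{sed}} - (^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{bas}}}$$

Since effects of radioactive decay on the molar mass can be neglected, $\varphi_{\mathrm{sed}}^{^{86}\mathrm{Sr}}$ can be replaced with an excellent precision by $\varphi_{\mathrm{sed}}^{\mathrm{Sr}}$. Moreover, the definition of φ expressed through equation (1.3.2) as a function of mass fractions f and concentrations $C^{\mathrm{Sr}}$ requires

$$\varphi_{\mathrm{sed}}^{\mathrm{Sr}} = \frac{f_{\mathrm{sed}}\,C_{\mathrm{sed}}^{\mathrm{Sr}}}{f_{\mathrm{bas}}\,C_{\mathrm{bas}}^{\mathrm{Sr}} + f_{\mathrm{sed}}\,C_{\mathrm{sed}}^{\mathrm{Sr}}}$$

with the mass conservation condition

$$f_{\mathrm{bas}} + f_{\mathrm{sed}} = 1$$

The last two equations can be recombined as

$$\left[(1 - f_{\mathrm{sed}})\,C_{\mathrm{bas}}^{\mathrm{Sr}} + f_{\mathrm{sed}}\,C_{\mathrm{sed}}^{\mathrm{Sr}}\right] \varphi_{\mathrm{sed}}^{\mathrm{Sr}} = f_{\mathrm{sed}}\,C_{\mathrm{sed}}^{\mathrm{Sr}}$$

or

$$(1 - f_{\mathrm{sed}})\,C_{\mathrm{bas}}^{\mathrm{Sr}}\,\varphi_{\mathrm{sed}}^{\mathrm{Sr}} = f_{\mathrm{sed}}\,C_{\mathrm{sed}}^{\mathrm{Sr}}\,(1 - \varphi_{\mathrm{sed}}^{\mathrm{Sr}})$$

and therefore

$$\frac{f_{\mathrm{sed}}}{1 - f_{\mathrm{sed}}} = \frac{C_{\mathrm{bas}}^{\mathrm{Sr}}}{C_{\mathrm{sed}}^{\mathrm{Sr}}}\,\frac{\varphi_{\mathrm{sed}}^{\mathrm{Sr}}}{1 - \varphi_{\mathrm{sed}}^{\mathrm{Sr}}}$$

which shows that f values cannot be uniquely computed unless a Sr sediment-basalt concentration ratio is assumed. Likewise, since 16O makes up most of the oxygen present in rocks, we get

$$\varphi_{\mathrm{sed}}^{\mathrm{O}} = \frac{(\delta^{18}\mathrm{O})_{\mathrm{and}} - (\delta^{18}\mathrm{O})_{\mathrm{bas}}}{(\delta^{18}\mathrm{O})_{\mathrm{sed}} - (\delta^{18}\mathrm{O})_{\mathrm{bas}}}$$

and

$$\frac{f_{\mathrm{sed}}}{1 - f_{\mathrm{sed}}} = \frac{C_{\mathrm{bas}}^{\mathrm{O}}}{C_{\mathrm{sed}}^{\mathrm{O}}}\,\frac{\varphi_{\mathrm{sed}}^{\mathrm{O}}}{1 - \varphi_{\mathrm{sed}}^{\mathrm{O}}}$$

The final expression for q is

$$q = \frac{C_{\mathrm{sed}}^{\mathrm{Sr}}/C_{\mathrm{sed}}^{\mathrm{O}}}{C_{\mathrm{bas}}^{\mathrm{Sr}}/C_{\mathrm{bas}}^{\mathrm{O}}} = \frac{(\mathrm{Sr}/\mathrm{O})_{\mathrm{sed}}}{(\mathrm{Sr}/\mathrm{O})_{\mathrm{bas}}} = \frac{\varphi_{\mathrm{sed}}^{\mathrm{Sr}}\,(1 - \varphi_{\mathrm{sed}}^{\mathrm{O}})}{\varphi_{\mathrm{sed}}^{\mathrm{O}}\,(1 - \varphi_{\mathrm{sed}}^{\mathrm{Sr}})}$$

Inserting numbers into the equations we get

$$\varphi_{\mathrm{sed}}^{\mathrm{Sr}} = \frac{0.706 - 0.703}{0.713 - 0.703} = 0.3 \quad \text{and} \quad \varphi_{\mathrm{sed}}^{\mathrm{O}} = \frac{8 - 6}{16 - 6} = 0.2$$

and therefore

$$q = \frac{(\mathrm{Sr}/\mathrm{O})_{\mathrm{sed}}}{(\mathrm{Sr}/\mathrm{O})_{\mathrm{bas}}} = \frac{0.3\,(1 - 0.2)}{0.2\,(1 - 0.3)} = 1.71$$

Spectacular examples of hyperbolic mixing exist in the geochemical literature but one may wonder whether these plots can be used in a quantitative way. Although the positions of end-members of hyperbolic mixing arrays are underdetermined, these arrays offer an inestimable advantage over linear mixing arrays. Because the geochemical data must stay finite, we know that end-members of mixtures must be located on the mixing array between the extreme point of the array and the asymptote, which may eventually provide a tight constraint on their position. In addition, the curvature factor carries valuable information on the relative concentrations of denominator species in the end-members. Mixing hyperbolic trends with a negative slope for a maximum spread in coordinate ratios, and a strong curvature for better positioning of the asymptotes, are the most informative. An example of least-squares estimate aimed at optimizing the knowledge of the mixing parameters from a hyperbolic data array will be presented in Section 5.1.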
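The inversion of the andesite example reduces to two lever-rule allotments and one ratio, as in the sketch below. The helper name `allotment` is hypothetical; the inputs are the isotopic values stated in the example.

```python
# Illustrative sketch: estimating q = (Sr/O)_sed / (Sr/O)_bas from the position
# of one mixture between two known end-members.
def allotment(mix, end_a, end_b):
    """Share of the denominator species contributed by end-member b (lever rule)."""
    return (mix - end_a) / (end_b - end_a)


phi_sr = allotment(0.706, 0.703, 0.713)   # share of Sr coming from the sediment
phi_o = allotment(8.0, 6.0, 16.0)         # share of O coming from the sediment

# The mass ratio f_sed/(1 - f_sed) must be the same for both elements, which gives q.
q = phi_sr * (1.0 - phi_o) / (phi_o * (1.0 - phi_sr))

if __name__ == "__main__":
    print(f"phi_sed(Sr) = {phi_sr:.2f}, phi_sed(O) = {phi_o:.2f}, q = {q:.2f}")
```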

1.3.5 Ratio-ratio relationships in ternary mixing

Ratio-ratio relationships in mixtures involving three or more end-members are no more difficult to handle than binary mixtures. Curved triangles are obtained in binary plots and curious surfaces may be drawn in a 3-D space using 3-D plotting programs. Different concentration ratios may produce widely different triangles and singular points may be created for a particular range of parameters.

In the 143Nd/144Nd vs 87Sr/86Sr plot, draw the mixing triangle at the 20 percent mesh size for a mixture of three mantle components, the depleted mantle (DM), the enriched mantle I (EM I) and the enriched mantle II (EM II). The isotopic ratios and relative concentrations of each component are listed in Table 1.10. We will illustrate the calculation at one particular point, say $f_{\mathrm{DM}}$ = 0.5, $f_{\mathrm{EM\,I}}$ = 0.3, $f_{\mathrm{EM\,II}}$ = 0.2. From equation (1.1.1), the concentrations in the mixture are

$$C_{\mathrm{mix}}^{\mathrm{Sr}} = 40 \times 0.5 + 400 \times 0.3 + 20 \times 0.2 = 144$$

and

$$C_{\mathrm{mix}}^{\mathrm{Nd}} = 5 \times 0.5 + 10 \times 0.3 + 10 \times 0.2 = 7.5$$

Table 1.10. Isotope ratios and elemental concentrations (ppm) used for the three-component mixing calculation shown in Figure 1.9.

Component j    87Sr/86Sr    C_j^Sr    143Nd/144Nd    C_j^Nd
DM             0.703        40        0.5131         5
EM I           0.705        400       0.5118         10
EM II          0.710        20        0.5121         10

Units are immaterial to the result since we will work with concentration ratios. We can therefore calculate the allotment of Sr and Nd to each end-member through the φ factors and compute the isotopic ratios of the mixture from equation (1.3.1) as

$$\left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{mix}} = 0.703 \times \frac{40 \times 0.5}{144} + 0.705 \times \frac{400 \times 0.3}{144} + 0.710 \times \frac{20 \times 0.2}{144}$$

or

$$\left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{mix}} = 0.703 \times 0.139 + 0.705 \times 0.833 + 0.710 \times 0.028 = 0.7049$$

We identify 0.139 as the fraction $\varphi_{\mathrm{DM}}^{^{86}\mathrm{Sr}} \approx \varphi_{\mathrm{DM}}^{\mathrm{Sr}}$ of 86Sr contributed by the component DM, and the like for other end-members. Likewise

$$\left(\frac{^{143}\mathrm{Nd}}{^{144}\mathrm{Nd}}\right)_{\mathrm{mix}} = 0.5131 \times \frac{5 \times 0.5}{7.5} + 0.5118 \times \frac{10 \times 0.3}{7.5} + 0.5121 \times \frac{10 \times 0.2}{7.5}$$

i.e.,

$$\left(\frac{^{143}\mathrm{Nd}}{^{144}\mathrm{Nd}}\right)_{\mathrm{mix}} = 0.5131 \times 0.3333 + 0.5118 \times 0.4000 + 0.5121 \times 0.2667 = 0.51231$$
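The three-component calculation generalizes directly from the binary case: each isotopic ratio of the mixture is the mean of the end-member ratios weighted by the allotment of the corresponding element. The sketch below assumes the data of Table 1.10; the dictionary layout and function name are illustrative, and looping it over a 20 percent grid of fractions would give the nodes of the mixing triangle.

```python
# Illustrative sketch: Sr and Nd isotopic ratios of a ternary mantle mixture.
COMPONENTS = {
    "DM":    {"sr_ratio": 0.703, "c_sr": 40.0,  "nd_ratio": 0.5131, "c_nd": 5.0},
    "EM I":  {"sr_ratio": 0.705, "c_sr": 400.0, "nd_ratio": 0.5118, "c_nd": 10.0},
    "EM II": {"sr_ratio": 0.710, "c_sr": 20.0,  "nd_ratio": 0.5121, "c_nd": 10.0},
}


def ternary_mixture(fractions):
    """Return (C_Sr, C_Nd, 87Sr/86Sr, 143Nd/144Nd) for mass fractions summing to one."""
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("mass fractions must sum to one (closure)")
    c_sr = sum(f * COMPONENTS[k]["c_sr"] for k, f in fractions.items())
    c_nd = sum(f * COMPONENTS[k]["c_nd"] for k, f in fractions.items())
    # Each ratio is weighted by the share of Sr (or Nd) contributed by the component.
    sr = sum(f * COMPONENTS[k]["c_sr"] / c_sr * COMPONENTS[k]["sr_ratio"]
             for k, f in fractions.items())
    nd = sum(f * COMPONENTS[k]["c_nd"] / c_nd * COMPONENTS[k]["nd_ratio"]
             for k, f in fractions.items())
    return c_sr, c_nd, sr, nd


if __name__ == "__main__":
    c_sr, c_nd, sr, nd = ternary_mixture({"DM": 0.5, "EM I": 0.3, "EM II": 0.2})
    print(f"C_Sr={c_sr:.1f}  C_Nd={c_nd:.2f}  87Sr/86Sr={sr:.5f}  143Nd/144Nd={nd:.5f}")
```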

Figure 1.9 shows a mixing hyperbolic triangle drawn for f values at 20 percent intervals. Interesting conclusions may be drawn from this plot:

(a) If analytical data concentrate in only one of the triangles, they will look as if they fan out of the intersection. The intersection of two of the hyperbolic triangle edges therefore may hint at a virtual end-member which actually does not exist.

(b) The points representing mixtures are more likely to fall where the density is larger: for 20 percent fraction lines there are 5 + 4 + 3 + 2 + 1 = 15 small triangles. If the probability of combining any mass fraction of each component were uniform in the mixing process, each triangle would contain 1/15 of the data points. Mixtures would therefore concentrate where the triangles are smaller, i.e., between EM I and the intersection point.

Figure 1.9 A hypothetical hyperbolic triangle for the mixing of three mantle components in the sense of Zindler and Hart (1986), in a 143Nd/144Nd vs 87Sr/86Sr plot. Data from Table 1.10.

1.4 Normalized variables

Similar methods apply when the concentration data are normalized or reduced. In particular, the simple properties relating ratios with identical denominators apply when the denominator is a sum. It is common practice in petrology and geochemistry to plot points in triangular diagrams, which are a special case of a diagram with normalized coordinates. The AFM diagram (e.g., Cox et al., 1979; McBirney, 1984; Hall, 1987) and the Ti-Y-Zr plot of Pearce and Cann (1973) are enlightening examples of descriptive diagrams that have considerable conceptual importance but have received minimal quantitative treatment. The normalized concentration of the species i in the end-member j is defined as

$$\xi_j^i = \frac{C_j^i}{\sum_{k=1}^{m} C_j^k} \qquad (1.4.1)$$

where m is the number of species that enter the normalization. This definition results in the closure property

$$\sum_{i=1}^{m} \xi_j^i = 1 \qquad (1.4.2)$$

The normalized concentration in a mixture of n end-members is defined in a similar way

$$\xi_{\mathrm{mix}}^i = \frac{C_{\mathrm{mix}}^i}{\sum_{k=1}^{m} C_{\mathrm{mix}}^k} = \frac{\sum_{j=1}^{n} f_j C_j^i}{\sum_{k=1}^{m} C_{\mathrm{mix}}^k}$$

hence

$$\xi_{\mathrm{mix}}^i = \sum_{j=1}^{n} \frac{f_j \sum_{k=1}^{m} C_j^k}{\sum_{k=1}^{m} C_{\mathrm{mix}}^k}\, \xi_j^i$$

Finally

$$\xi_{\mathrm{mix}}^i = \sum_{j=1}^{n} \varphi_j^{\Sigma}\, \xi_j^i \qquad (1.4.3)$$

Table 1.11a. Major-element compositions (%) used for the example of olivine fractionation in the AFM diagram of Figure 1.10.

        bas     ol
FeO     10.0    14.35
MgO     12.0    45.64
Na2O    2.5     0.00
K2O     1.0     0.00

where $\varphi_j^{\Sigma}$ is defined as

$$\varphi_j^{\Sigma} = \frac{f_j \sum_{k=1}^{m} C_j^k}{\sum_{k=1}^{m} C_{\mathrm{mix}}^k} \qquad (1.4.4)$$

and represents the mass fraction of the summed concentrations that was contributed by the component j to the mixture. This situation is entirely analogous to the cases of elemental or isotopic ratios which have been discussed above, except that we have artificially created a new 'species' by summing the concentrations of m different species. Actually, any sort of linear combination of elemental concentrations would show the same properties.

These properties have fundamental consequences on mixing relationships. If points on a plane are ascribed to different end-members, like making three end-members the apices of a triangle, mixture coordinates will be represented as linear combinations of end-member coordinates. For instance, binary mixing will be represented by linear segments, ternary mixing by triangles, and the usual mass balance relationships apply, including the lever rule.

Draw on the AFM diagram the curve representing the removal of up to 20 percent olivine Fo85 from a basalt, assuming that olivine composition stays the same during the process. Chemical compositions in percent are listed in Table 1.11a.

Let us subscript the residual liquid 'res'. The calculation is slightly complicated by our dealing with mineral removal, which is opposite to mixing. $\Sigma_{\mathrm{conc}}$ is the sum FeO + MgO + Na2O + K2O. In the AFM triangle, the usage is to label the three normalized coordinates $\xi$ as A = $\xi^{(1)}$ = (Na2O + K2O)/$\Sigma_{\mathrm{conc}}$, F = $\xi^{(2)}$ = FeO/$\Sigma_{\mathrm{conc}}$, and M = $\xi^{(3)}$ = MgO/$\Sigma_{\mathrm{conc}}$. Given a table of residual liquid fractions $f_{\mathrm{res}}$, we first calculate in the second column the change in $\Sigma_{\mathrm{conc}}$ from

$$(\Sigma_{\mathrm{conc}})_{\mathrm{res}} = \frac{(\Sigma_{\mathrm{conc}})_{\mathrm{bas}} - (1 - f_{\mathrm{res}})(\Sigma_{\mathrm{conc}})_{\mathrm{ol}}}{f_{\mathrm{res}}}$$

Table 1.11b. Evolution of the A, F, and M parameters as a function of olivine removal from basalt. Compositions are listed in Table 1.11a.

            (Σconc)        A       F       M
basalt      25.5           0.137   0.392   0.471
olivine     60             0.000   0.239   0.761

f_res   (Σconc)_res   φ_res    A       F       M
1.00    25.50         1.00     0.137   0.392   0.471
0.95    23.68         0.88     0.156   0.413   0.432
0.90    21.67         0.76     0.179   0.439   0.381
0.85    19.41         0.65     0.212   0.476   0.312
0.80    16.87         0.53     0.259   0.528   0.213

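Next, we calculate $\varphi_{\mathrm{res}}^{\Sigma}$ from

$$\varphi_{\mathrm{res}}^{\Sigma} = \frac{f_{\mathrm{res}}\,(\Sigma_{\mathrm{conc}})_{\mathrm{res}}}{(\Sigma_{\mathrm{conc}})_{\mathrm{bas}}}$$

and finally each coordinate $\xi_{\mathrm{res}}^i$ can be calculated from

$$\xi_{\mathrm{res}}^i = \frac{\xi_{\mathrm{bas}}^i - (1 - \varphi_{\mathrm{res}}^{\Sigma})\,\xi_{\mathrm{ol}}^i}{\varphi_{\mathrm{res}}^{\Sigma}}$$

As an example, assume $f_{\mathrm{res}} = 0.8$. We first calculate $(\Sigma_{\mathrm{conc}})_{\mathrm{res}}$ as

$$(\Sigma_{\mathrm{conc}})_{\mathrm{res}} = \frac{25.5 - (1 - 0.8) \times 60}{0.8} = 16.875$$

then $\varphi_{\mathrm{res}}^{\Sigma}$ from equation (1.4.4) as

$$\varphi_{\mathrm{res}}^{\Sigma} = \frac{0.8 \times 16.875}{25.5} = 0.529$$

and, finally, from equation (1.4.3), we compute the AFM coordinates

$$A = \frac{0.137 - (1 - 0.529) \times 0}{0.529} = 0.259$$

$$F = \frac{0.392 - (1 - 0.529) \times 0.239}{0.529} = 0.528$$

$$M = \frac{0.471 - (1 - 0.529) \times 0.761}{0.529} = 0.213$$

The complete results are listed in Table 1.11b and displayed in Figure 1.10.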
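The three steps of this calculation lend themselves to a short script. The sketch below assumes the compositions of Table 1.11a; the names `afm` and `residual_afm` are illustrative. Because it uses the unrounded olivine sum (59.99 rather than 60), the last digit may differ slightly from Table 1.11b.

```python
BASALT = {"FeO": 10.0, "MgO": 12.0, "Na2O": 2.5, "K2O": 1.0}
OLIVINE = {"FeO": 14.35, "MgO": 45.64, "Na2O": 0.0, "K2O": 0.0}


def afm(comp):
    """Normalized (A, F, M) coordinates of a composition given in percent."""
    total = sum(comp.values())
    return ((comp["Na2O"] + comp["K2O"]) / total,
            comp["FeO"] / total,
            comp["MgO"] / total)


def residual_afm(f_res, parent=BASALT, mineral=OLIVINE):
    """Sum of concentrations, phi, and (A, F, M) of the residual liquid."""
    sum_par = sum(parent.values())
    sum_min = sum(mineral.values())
    # Step 1: summed concentrations in the residual liquid
    sum_res = (sum_par - (1.0 - f_res) * sum_min) / f_res
    # Step 2: fraction of the summed concentrations held by the residue
    phi = f_res * sum_res / sum_par
    # Step 3: normalized coordinates, mineral removal being "negative mixing"
    coords = tuple((p - (1.0 - phi) * m) / phi
                   for p, m in zip(afm(parent), afm(mineral)))
    return sum_res, phi, coords


table = {f: residual_afm(f) for f in (1.00, 0.95, 0.90, 0.85, 0.80)}
```

Each entry of `table` holds one row of the liquid line of descent, from which the points of the AFM diagram can be plotted.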

Figure 1.10 The liquid line of descent upon olivine removal from basalt in the AFM diagram (apices A, F, and M) using equations (1.4.3) and (1.4.4). Olivine has a fixed composition (data from Table 1.11a). Each open circle indicates removal of 5 percent olivine.

1.5 Incremental processes (distillation)

1.5.1 Introduction

Although this topic will be largely covered with respect to trace elements in magmatic processes in Chapter 9, incremental processes affect elemental and isotope distributions even when the species considered is not a trace element, and therefore deserve a treatment in which dilution is not critical. During a distillation process, a condition of chemical equilibrium is applied to a state of the system which changes while the process progresses. Everyone is familiar with the principles of thermal distillation of alcoholic solutions in a still. Here, the chemical equilibrium condition is represented by the pressure of alcohol in the vapor phase being proportional to the concentration of alcohol in the liquid, i.e., Raoult's law. Upon progressive condensation of the vapor in a refrigerated side-flask, the residual liquid becomes more and more depleted in alcohol.

Similar fractionation processes take place in the Earth whenever phases separate progressively from each other. A familiar case is the fractional crystallization of basalts. Progressive removal of mineral phases from a parent magma produces a sequence of residual (= differentiated) liquids which altogether form the liquid line of descent (Bowen, 1956). The opposite process, in which fractionating minerals accumulate in specific parts of magmatic systems, creates chemical trends known as control lines. Liquid lines of descent and control lines account for most of the chemical variability in igneous rocks. However difficult the distinction can be, the two sorts of trends should not be confused with each other. For instance, the olivine control lines observed in many basaltic series produce chemical variations that can be confused with the liquid line of descent of picritic magmas. Likewise, vapor expelled from magmas also induces chemical distillation processes which can lead to a large geochemical variability (e.g., Candela, 1986).

Another incremental process consists in the progressive flushing of a porous rock by a fluid. Solid-liquid exchange leads to chemical changes when more fluid is allowed to percolate, which is extremely efficient in producing strong elemental or isotopic fractionation.

The mass balance aspects of distillation and percolation processes are often confused with their kinetics. Transfer of successive mass increments usually takes place over successive time increments, but how the progress variable varies with time is often unknown in natural processes. Fortunately, knowledge of time-dependence is by no means compulsory to achieve the description of the process in terms of mass balance.

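1.5.2 Concentration changes upon closed-system crystallization

A homogeneous mass $M_{\mathrm{liq}}$ of liquid contains the amount $m_{\mathrm{liq}}^i$ of species i. The concentration $C_{\mathrm{liq}}^i$ of species i in the liquid is defined as

$$C_{\mathrm{liq}}^i = \frac{m_{\mathrm{liq}}^i}{M_{\mathrm{liq}}}$$

The fraction F of residual liquid is relative to the mass $M_0$ of parent magma

$$F = \frac{M_{\mathrm{liq}}}{M_0}$$

Taking the differential of $\ln C_{\mathrm{liq}}^i$, we get

$$\frac{\mathrm{d}C_{\mathrm{liq}}^i}{C_{\mathrm{liq}}^i} = \frac{\mathrm{d}m_{\mathrm{liq}}^i}{m_{\mathrm{liq}}^i} - \frac{\mathrm{d}M_{\mathrm{liq}}}{M_{\mathrm{liq}}} = \frac{\mathrm{d}m_{\mathrm{liq}}^i}{m_{\mathrm{liq}}^i} - \frac{\mathrm{d}F}{F}$$

The closed-system condition requires

$$\mathrm{d}m_{\mathrm{liq}}^i + \mathrm{d}m_{\mathrm{sol}}^i = 0, \qquad \mathrm{d}M_{\mathrm{liq}} + \mathrm{d}M_{\mathrm{sol}} = 0 \qquad (1.5.1)$$

Let $D_i$ be the ratio of the concentration $\mathrm{d}m_{\mathrm{sol}}^i/\mathrm{d}M_{\mathrm{sol}}$ of species i in the last increment of precipitated solid to the concentration $C_{\mathrm{liq}}^i$ in the liquid, i.e.,

$$\frac{\mathrm{d}m_{\mathrm{sol}}^i}{\mathrm{d}M_{\mathrm{sol}}} = D_i\,\frac{m_{\mathrm{liq}}^i}{M_{\mathrm{liq}}} \qquad (1.5.2)$$

with no implication that $D_i$ is constant. Applying the mass balance relationships, the last equation also reads

$$\frac{\mathrm{d}m_{\mathrm{liq}}^i}{\mathrm{d}M_{\mathrm{liq}}} = D_i\,\frac{m_{\mathrm{liq}}^i}{M_{\mathrm{liq}}}$$

or, equivalently

$$\frac{\mathrm{d}m_{\mathrm{liq}}^i}{m_{\mathrm{liq}}^i} = D_i\,\frac{\mathrm{d}M_{\mathrm{liq}}}{M_{\mathrm{liq}}}$$

Expressing the concentration leads to

$$\frac{\mathrm{d}C_{\mathrm{liq}}^i}{C_{\mathrm{liq}}^i} = (D_i - 1)\,\frac{\mathrm{d}M_{\mathrm{liq}}}{M_{\mathrm{liq}}}$$

or

$$\mathrm{d}\ln C_{\mathrm{liq}}^i = (D_i - 1)\,\mathrm{d}\ln F \qquad (1.5.3)$$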

This is the fundamental distillation equation, often referred to as the Rayleigh law when in its integrated form (Rayleigh, 1896). Provided $D_i$ is treated as a function of F, this equation applies to the change of any species concentration in the course of phase separation. Liquid-vapor or solid-solid fractionations obey the same formulation.
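Because equation (1.5.3) does not require a constant partition coefficient, it can be integrated numerically for any dependence of D on F. The sketch below is one possible way to do so (midpoint rule in ln F); the function names and the example law D(F) = 2F are purely illustrative assumptions. For constant D it reduces to the familiar integrated form C = C0 F^(D-1).

```python
import math


def rayleigh_constant_d(c0, d, f):
    """Integrated form of d ln C = (D - 1) d ln F for constant D."""
    return c0 * f ** (d - 1.0)


def rayleigh_variable_d(c0, d_of_f, f_end, steps=10000):
    """Integrate d ln C = (D(F) - 1) d ln F from F = 1 down to f_end.

    d_of_f is any callable returning the partition coefficient at a given F.
    The integration uses the midpoint rule on a regular grid in ln F.
    """
    ln_c = math.log(c0)
    dlnf = math.log(f_end) / steps
    for k in range(steps):
        f_mid = math.exp((k + 0.5) * dlnf)
        ln_c += (d_of_f(f_mid) - 1.0) * dlnf
    return math.exp(ln_c)


# Incompatible species (D = 0.1) after 50 percent crystallization
c_const = rayleigh_constant_d(100.0, 0.1, 0.5)
c_check = rayleigh_variable_d(100.0, lambda f: 0.1, 0.5)

# Hypothetical variable coefficient D(F) = 2F, for illustration only
c_var = rayleigh_variable_d(1.0, lambda f: 2.0 * f, 0.5)
```

For the hypothetical law D(F) = 2F the integral can also be done by hand, ln(C/C0) = 2(F - 1) - ln F, which provides a check on the numerical scheme.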

1.5.3 Changes in element and isotope ratios upon closed-system crystallization

If we are dealing with major elements, partition coefficients $D_i$ may be expected to vary with many parameters, including temperature or liquid chemistry. For some elemental pairs, and particularly for isotopes, for which activity coefficients are correlated, there is a better chance, however, that the ratio of two partition coefficients $D_{i1}$ and $D_{i2}$ shows lesser variations. We therefore subtract the distillation equation (1.5.3) for element or isotope i1 from that for i2 as

$$\mathrm{d}\ln\left(\frac{C^{i2}}{C^{i1}}\right)_{\mathrm{liq}} = (D_{i2} - D_{i1})\,\mathrm{d}\ln F$$

or

$$\mathrm{d}\ln\left(\frac{C^{i2}}{C^{i1}}\right)_{\mathrm{liq}} = (D_{i1}^{i2} - 1)\,D_{i1}\,\mathrm{d}\ln F \qquad (1.5.4)$$

where we note $D_{i1}^{i2} = D_{i2}/D_{i1}$ the ratio of the elemental partition coefficients. These equations are known collectively as the Doerner-Hoskins law (Doerner and Hoskins, 1925). Petrologists would commonly label $D_{i1}^{i2}$ with the symbol $K_D$, while stable isotope geochemists could label it $\alpha$. Eliminating $D_{i1}$ with the distillation equation (1.5.3) for i1, we get the two expressions

$$\mathrm{d}\ln\left(\frac{C^{i2}}{C^{i1}}\right)_{\mathrm{liq}} = (D_{i1}^{i2} - 1)\left(\mathrm{d}\ln C_{\mathrm{liq}}^{i1} + \mathrm{d}\ln F\right) \qquad (1.5.5)$$

and

$$\mathrm{d}\ln\left(\frac{C^{i2}}{C^{i1}}\right)_{\mathrm{liq}} = (D_{i1}^{i2} - 1)\,\frac{D_{i1}}{D_{i1} - 1}\,\mathrm{d}\ln C_{\mathrm{liq}}^{i1} \qquad (1.5.6)$$

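We sometimes want to track the change in the concentration of a species i3 as a function of an index ratio i2/i1. Dividing equation (1.5.3) for i = i3 by equation (1.5.4), the appropriate equation is

$$\frac{\mathrm{d}\ln C_{\mathrm{liq}}^{i3}}{\mathrm{d}\ln\left(C^{i2}/C^{i1}\right)_{\mathrm{liq}}} = \frac{D_{i3} - 1}{(D_{i1}^{i2} - 1)\,D_{i1}} \qquad (1.5.7)$$

The new variable z is defined as

$$z = \frac{F\,C_{\mathrm{liq}}^{i1}}{C_0^{i1}}$$

where the subscript 0 in the denominator refers to the parent magma. Taking the log-differential, we get

$$\mathrm{d}\ln z = \mathrm{d}\ln F + \mathrm{d}\ln C_{\mathrm{liq}}^{i1}$$

which, inserted into equation (1.5.5), gives

$$\mathrm{d}\ln\left(\frac{C^{i2}}{C^{i1}}\right)_{\mathrm{liq}} = (D_{i1}^{i2} - 1)\,\mathrm{d}\ln z \qquad (1.5.8)$$

For constant $D_{i1}^{i2}$, equation (1.5.8) is integrated into

$$\left(\frac{C^{i2}}{C^{i1}}\right)_{\mathrm{liq}} = \left(\frac{C^{i2}}{C^{i1}}\right)_0 z^{\,D_{i1}^{i2} - 1} \qquad (1.5.9)$$

Equation (1.5.9) has been derived by Irvine (1977) and Pearce (1978) in the case of olivine fractionation from magmas and is also found in fluid-magma systems (Candela, 1986). Dividing equation (1.5.8) for the pair i3-i1 by the equation for the pair i2-i1 gives

$$\frac{\mathrm{d}\ln\left(C^{i3}/C^{i1}\right)_{\mathrm{liq}}}{\mathrm{d}\ln\left(C^{i2}/C^{i1}\right)_{\mathrm{liq}}} = \frac{D_{i1}^{i3} - 1}{D_{i1}^{i2} - 1} \qquad (1.5.10)$$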

In logarithmic plots, such as $\ln(C^{i3}/C^{i1})$ vs $\ln(C^{i2}/C^{i1})$, the slope of the curve is given by the right-hand side of equation (1.5.10). When $D_{i1}^{i2}$ and $D_{i1}^{i3}$ are constant, so is the slope, and these plots must be linear arrays with the equation

$$\ln\left(\frac{C^{i3}}{C^{i1}}\right)_{\mathrm{liq}} = \ln\left(\frac{C^{i3}}{C^{i1}}\right)_0 + \frac{D_{i1}^{i3} - 1}{D_{i1}^{i2} - 1}\left[\ln\left(\frac{C^{i2}}{C^{i1}}\right)_{\mathrm{liq}} - \ln\left(\frac{C^{i2}}{C^{i1}}\right)_0\right] \qquad (1.5.11)$$

where the subscript 0 refers to the parent magma. Some related equations were derived by Irvine (1977) for the particular case where i3 is an element excluded by olivine, and by Pearce (1978) for i1 being constant. Equation (1.5.11) is actually fairly general. The reader can easily check that, in the same diagram, the solid (cumulate) defines an array parallel to the liquid array and displaced by $\ln D_{i1}^{i2}$ in x and $\ln D_{i1}^{i3}$ in y.

18O/16O fractionation during closed-system crystallization. A magma body precipitates a cumulate whose solid-liquid fractionation factor $\alpha$ for 18O/16O fractionation is 1.0005. What magma fraction should crystallize with distillation effect in order to lower the melt $\delta^{18}$O from 7 to 6?

The fractionation coefficient $\alpha$ is defined as

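$$\alpha = \frac{(^{18}\mathrm{O}/^{16}\mathrm{O})_{\mathrm{sol}}}{(^{18}\mathrm{O}/^{16}\mathrm{O})_{\mathrm{liq}}}$$

When inserted into equation (1.5.5), we get

$$\mathrm{d}\ln\left(\frac{^{18}\mathrm{O}}{^{16}\mathrm{O}}\right)_{\mathrm{liq}} = (\alpha - 1)\left(\mathrm{d}\ln C_{\mathrm{liq}}^{^{16}\mathrm{O}} + \mathrm{d}\ln F\right) \qquad (1.5.12)$$

Since we do not know the solid-liquid oxygen partition coefficient, we must resort to an approximation. Most oxygen atoms are 16O and crystallization rarely changes the total oxygen concentration of the residual silicate melt very significantly. We can assume

$$\mathrm{d}\ln C_{\mathrm{liq}}^{^{16}\mathrm{O}} \approx \mathrm{d}\ln C_{\mathrm{liq}}^{\mathrm{O}} \ll \mathrm{d}\ln F$$

which enables equation (1.5.12) to be integrated into

$$\frac{(^{18}\mathrm{O}/^{16}\mathrm{O})_{\mathrm{liq}}}{(^{18}\mathrm{O}/^{16}\mathrm{O})_0} = F^{\alpha - 1} \qquad (1.5.13)$$

where the subscript 0 refers to the liquid when crystallization starts (F = 1). Turning to the $\delta$ notation, we can reformulate the distillation equation (1.5.13) as

$$\frac{1 + \delta^{18}\mathrm{O}_{\mathrm{liq}}/1000}{1 + \delta^{18}\mathrm{O}_0/1000} = F^{\alpha - 1} \qquad (1.5.14)$$

Inserting the numerical values, we get

$$F^{1.0005 - 1} = \frac{1.006}{1.007}$$

giving F = 0.14 (86 percent crystallization). This equation also applies to the H2O liquid-vapor system.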
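Equation (1.5.14) can be solved for F with one logarithm. The sketch below is a small illustration under the same approximation as in the text (negligible change in total oxygen concentration); the function names are assumptions made for this example.

```python
import math


def residual_fraction(delta0, delta, alpha):
    """Residual liquid fraction F solving
    (1 + delta/1000) / (1 + delta0/1000) = F**(alpha - 1)."""
    ratio = (1.0 + delta / 1000.0) / (1.0 + delta0 / 1000.0)
    return math.exp(math.log(ratio) / (alpha - 1.0))


def delta_liquid(delta0, f, alpha):
    """Inverse problem: delta value of the liquid at residual fraction F."""
    return ((1.0 + delta0 / 1000.0) * f ** (alpha - 1.0) - 1.0) * 1000.0


f_res = residual_fraction(7.0, 6.0, 1.0005)
crystallized = 1.0 - f_res
```

Running `delta_liquid(7.0, f_res, 1.0005)` returns the target value of 6, which checks that the two functions are mutually consistent.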

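Sr/Ca fractionation during anhydrite precipitation from brines. A solution of calcium sulfate contains 20 mmol/kg Ca and 0.2 mmol/kg Sr. Calculate the Sr concentration in the residual brine (res) when 50 percent Ca is removed. Use $D_{\mathrm{Ca}}^{\mathrm{Sr}} = 0.35$.

The distillation equation (1.5.6) reads

$$\mathrm{d}\ln\left(\frac{C^{\mathrm{Sr}}}{C^{\mathrm{Ca}}}\right)_{\mathrm{res}} = (D_{\mathrm{Ca}}^{\mathrm{Sr}} - 1)\,\frac{D_{\mathrm{Ca}}}{D_{\mathrm{Ca}} - 1}\,\mathrm{d}\ln C_{\mathrm{res}}^{\mathrm{Ca}} \qquad (1.5.15)$$

Since Ca is a major element in anhydrite while it is still relatively dilute in the solution, we can assume $D_{\mathrm{Ca}} \gg 1$, hence, upon integration

$$\left(\frac{C^{\mathrm{Sr}}}{C^{\mathrm{Ca}}}\right)_{\mathrm{res}} = \left(\frac{C^{\mathrm{Sr}}}{C^{\mathrm{Ca}}}\right)_0 \left(\frac{C_{\mathrm{res}}^{\mathrm{Ca}}}{C_0^{\mathrm{Ca}}}\right)^{D_{\mathrm{Ca}}^{\mathrm{Sr}} - 1} \qquad (1.5.16)$$

where the subscript 0 refers to the initial brine. Inserting the numerical values into equation (1.5.16) gives

$$\left(\frac{C^{\mathrm{Sr}}}{C^{\mathrm{Ca}}}\right)_{\mathrm{res}} = \frac{0.2}{20} \times 0.5^{\,0.35 - 1} = 0.0157$$

and the requested result

$$C_{\mathrm{res}}^{\mathrm{Sr}} = (20 \times 0.5) \times 0.0157 = 0.157\ \mathrm{mmol/kg}$$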
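The integrated ratio equation used in this example is easy to script. In the sketch below, the Sr/Ca fractionation coefficient of 0.35 is an assumption: it is the value that reproduces the quoted result of 0.157 mmol/kg and should be replaced by the value appropriate to the system studied. The function name is illustrative.

```python
def residual_sr(ca0, sr0, ca_removed, d_sr_ca):
    """Sr concentration in the residual brine after removal of a given
    fraction of Ca, assuming D_Ca >> 1 so that
    (Sr/Ca)_res = (Sr/Ca)_0 * (Ca_res/Ca_0)**(D_Sr/Ca - 1)."""
    ca_res = ca0 * (1.0 - ca_removed)
    ratio_res = (sr0 / ca0) * (ca_res / ca0) ** (d_sr_ca - 1.0)
    return ca_res * ratio_res


# 20 mmol/kg Ca, 0.2 mmol/kg Sr, 50 percent of the Ca removed;
# 0.35 is an assumed fractionation coefficient (see lead-in)
sr_res = residual_sr(20.0, 0.2, 0.5, 0.35)
```

With a coefficient below unity, Sr is less readily taken up than Ca, so the Sr/Ca ratio of the brine rises while the absolute Sr concentration still falls.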

1.5.4 FeO-MgO fractionation during olivine crystallization in basalts

The Fe/Mg fractionation between olivine and basaltic liquids is independent of temperature (Roeder and Emslie, 1970) while its pressure-dependence is quite small (Ulmer, 1989). This is also the case, although to a lesser extent, between other mafic minerals and basaltic liquids (see a compilation in Warren, 1986). This remarkable property makes these two elements invaluable tools for understanding basalt genesis. The approach taken here is that of Albarede (1992). $K_D$ is defined by solid-liquid equilibrium

$$K_D = \frac{C_{\mathrm{sol}}^{\mathrm{FeO}}/C_{\mathrm{sol}}^{\mathrm{MgO}}}{C_{\mathrm{liq}}^{\mathrm{FeO}}/C_{\mathrm{liq}}^{\mathrm{MgO}}} \qquad (1.5.17)$$

which, inserted into equation (1.5.8), gives

$$\mathrm{d}\ln\left(\frac{C_{\mathrm{liq}}^{\mathrm{FeO}}}{C_{\mathrm{liq}}^{\mathrm{MgO}}}\right) = (K_D - 1)\,\mathrm{d}\ln z \qquad (1.5.18)$$

where

$$z = \frac{F\,C_{\mathrm{liq}}^{\mathrm{MgO}}}{C_0^{\mathrm{MgO}}} \qquad (1.5.19)$$

For constant $K_D$, equation (1.5.9) reads

$$\frac{C_{\mathrm{liq}}^{\mathrm{FeO}}}{C_{\mathrm{liq}}^{\mathrm{MgO}}} = \frac{C_0^{\mathrm{FeO}}}{C_0^{\mathrm{MgO}}}\, z^{K_D - 1} \qquad (1.5.20)$$

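Olivine is assumed to be the only mineral precipitating and the fa and fo subscripts refer to pure fayalite and forsterite. Upon crystallization of the incremental amount $\mathrm{d}M_{\mathrm{sol}}$ of cumulate, the incremental amount $\mathrm{d}M_{\mathrm{fa}}$ of fayalite is precipitated, and, since the iron content of fayalite is constant

$$\mathrm{d}M_{\mathrm{fa}} = \frac{\mathrm{d}m_{\mathrm{sol}}^{\mathrm{FeO}}}{C_{\mathrm{fa}}^{\mathrm{FeO}}}$$

with a similar relationship between forsterite and MgO. Therefore, conservation of FeO and MgO requires

$$\mathrm{d}M_{\mathrm{sol}} = \frac{\mathrm{d}m_{\mathrm{sol}}^{\mathrm{FeO}}}{C_{\mathrm{fa}}^{\mathrm{FeO}}} + \frac{\mathrm{d}m_{\mathrm{sol}}^{\mathrm{MgO}}}{C_{\mathrm{fo}}^{\mathrm{MgO}}} \qquad (1.5.21)$$

Factoring the last term on the right-hand side and using the definition (1.5.17) of $K_D$ gives

$$\mathrm{d}M_{\mathrm{sol}} = \frac{\mathrm{d}m_{\mathrm{sol}}^{\mathrm{MgO}}}{C_{\mathrm{fo}}^{\mathrm{MgO}}}\left[1 + K_D\,\frac{C_{\mathrm{fo}}^{\mathrm{MgO}}}{C_{\mathrm{fa}}^{\mathrm{FeO}}}\,\frac{C_{\mathrm{liq}}^{\mathrm{FeO}}}{C_{\mathrm{liq}}^{\mathrm{MgO}}}\right]$$

We note that, from mass balance,

$$\mathrm{d}m_{\mathrm{sol}}^{\mathrm{MgO}} = -\mathrm{d}\left(M_{\mathrm{liq}}\,C_{\mathrm{liq}}^{\mathrm{MgO}}\right), \qquad \mathrm{d}M_{\mathrm{sol}} = -\mathrm{d}M_{\mathrm{liq}}$$

which we insert into equation (1.5.21). Dividing by $M_0$ and using the definition (1.5.19) of z, we get

$$\mathrm{d}F = \frac{C_0^{\mathrm{MgO}}}{C_{\mathrm{fo}}^{\mathrm{MgO}}}\left[1 + K_D\,\frac{C_{\mathrm{fo}}^{\mathrm{MgO}}}{C_{\mathrm{fa}}^{\mathrm{FeO}}}\,\frac{C_{\mathrm{liq}}^{\mathrm{FeO}}}{C_{\mathrm{liq}}^{\mathrm{MgO}}}\right]\mathrm{d}z \qquad (1.5.22)$$

Up to this point, the only assumption is that olivine is the only phase precipitating. Taking the further step that $K_D$ is constant, equation (1.5.22) can be integrated with the help of equation (1.5.20) into

$$F - 1 = \frac{C_0^{\mathrm{MgO}}}{C_{\mathrm{fo}}^{\mathrm{MgO}}}\,(z - 1) + \frac{C_0^{\mathrm{FeO}}}{C_{\mathrm{fa}}^{\mathrm{FeO}}}\left(z^{K_D} - 1\right) \qquad (1.5.23)$$

which is an implicit equation with no solution in closed form. We can now compute exactly the FeO and MgO contents of residual magma corresponding to a given extent of olivine fractionation. Selecting a pair of parent magma values $C_0^{\mathrm{FeO}}$ and $C_0^{\mathrm{MgO}}$ and an olivine fraction crystallized 1 - F, z can be calculated iteratively from equation (1.5.23). Then, using equation (1.5.19) gives $C_{\mathrm{liq}}^{\mathrm{MgO}}$ while equation (1.5.20) gives $C_{\mathrm{liq}}^{\mathrm{FeO}}$. The same equations can be used for reversing olivine precipitation: the parent magma values are to be replaced by those of the residual magmas, F values being higher than unity. Because solving the transcendental equation (1.5.23) requires iterative methods of root finding, the numerical solution to this equation is postponed to Section 3.1.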

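1.5.5 Elemental fractionation during basalt differentiation

From equation (1.5.7), change in the concentration of an element i with the differentiation index (FeO/MgO)$_{\mathrm{liq}}$, written in lieu of $(C^{\mathrm{FeO}}/C^{\mathrm{MgO}})_{\mathrm{liq}}$, is calculated as

$$\frac{\mathrm{d}\ln C_{\mathrm{liq}}^{i}}{\mathrm{d}\ln(\mathrm{FeO/MgO})_{\mathrm{liq}}} = \frac{D_i - 1}{D_{\mathrm{FeO}} - D_{\mathrm{MgO}}} = \frac{D_i - 1}{D_{\mathrm{MgO}}\,(K_D - 1)} \qquad (1.5.24)$$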

Even for a constant $K_D$, calculating the evolution of a species i in the residual melt requires that both $D_i$ and $D_{\mathrm{MgO}}$ are known for each value of (FeO/MgO)$_{\mathrm{liq}}$.

The Baffin Bay picrites (Francis, 1985) show a very good covariation of FeO, MgO, and Ni. Defined from the twelve XRF data listed in Table 1.12, the variables x = ln(FeO/MgO) and y = ln(Ni/MgO) have been fitted by the parabolic equation

$$y = 2.92 - 1.60\,x - 0.670\,x^2$$

Assess the variations of $D_{\mathrm{MgO}}^{\mathrm{Ni}}$ by assuming that $D_{\mathrm{MgO}}^{\mathrm{FeO}} = K_D = 0.3$.

Table 1.12. FeO, MgO (%) and Ni (ppm) concentrations in some Baffin Bay picrites (Francis, 1985). Ni/MgO is in ppm Ni per percent MgO (i.e., the weight ratio multiplied by 10^4).

Sample #   FeO     MgO     Ni    FeO/MgO   Ni/MgO
14         10.39   11.97   278   0.868     23.22
1          10.68   13.58   361   0.786     26.58
64         10.25   14.40   407   0.712     28.26
15         10.77   14.91   410   0.722     27.50
24         10.82   14.97   435   0.723     29.06
56         10.91   15.68   498   0.696     31.76
3          10.77   16.13   517   0.668     32.05
21         10.24   17.91   687   0.572     38.36
4          10.49   20.03   775   0.524     38.69
29         10.40   21.83   930   0.476     42.60
13         10.44   22.13   943   0.472     42.61
19         10.32   21.93   906   0.471     41.31

Figure 1.11 ln(Ni/MgO) vs ln(FeO/MgO) relationship for the Baffin Bay basalts and picrites (Table 1.12 data from Francis, 1985). The curve is a least-square parabolic fit through the data. Since $K_D = D_{\mathrm{MgO}}^{\mathrm{FeO}}$ remains constant, the $D_{\mathrm{MgO}}^{\mathrm{Ni}}$ partition coefficient increases with increasing FeO/MgO [equation (1.5.10)].

We should remember that the Ni/MgO ratio unit is in ppm Ni per percent MgO. When plotted in Figure 1.11, we see that the covariation between the two ratios is smooth but not linear. Given the assumption made of a constant $K_D$, we conclude that $D_{\mathrm{MgO}}^{\mathrm{Ni}}$ varies with differentiation, which is a well-documented observation of experimental petrology (Arndt, 1977; Hart and Davis, 1978; Kinzler et al., 1990). Taking the derivative dy/dx, we obtain the slope of the curve and, from equation (1.5.10), we write

$$\frac{D_{\mathrm{MgO}}^{\mathrm{Ni}} - 1}{D_{\mathrm{MgO}}^{\mathrm{FeO}} - 1} = -1.60 - 1.34\,\ln\frac{\mathrm{FeO}}{\mathrm{MgO}}$$

or

$$D_{\mathrm{MgO}}^{\mathrm{Ni}} = 1 + (0.3 - 1)\left(-1.60 - 1.34\,\ln\frac{\mathrm{FeO}}{\mathrm{MgO}}\right) = 2.12 + 0.94\,\ln\frac{\mathrm{FeO}}{\mathrm{MgO}}$$

$D_{\mathrm{MgO}}^{\mathrm{Ni}}$ increases from 1.5 for a picrite with FeO/MgO = 0.5, to 3.2 for a differentiated basalt with FeO/MgO = 3, values that are consistent with the range found by Arndt (1977) and Hart and Davis (1978).
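The step from the fitted parabola to the partition coefficient can be sketched in a few lines. The code below assumes the fit coefficients and the value $K_D$ = 0.3 quoted in the text; the function names are illustrative.

```python
import math

K_D = 0.3                      # assumed constant FeO/MgO fractionation coefficient
B1, B2 = -1.60, -0.670         # linear and quadratic coefficients of the fit


def fit_slope(x):
    """dy/dx of y = 2.92 - 1.60 x - 0.670 x**2, with x = ln(FeO/MgO)."""
    return B1 + 2.0 * B2 * x


def d_ni_mgo(feo_over_mgo):
    """Ni/MgO fractionation coefficient from the local slope, equation (1.5.10):
    (D_Ni/MgO - 1) / (K_D - 1) = dy/dx."""
    return 1.0 + (K_D - 1.0) * fit_slope(math.log(feo_over_mgo))


d_picrite = d_ni_mgo(0.5)      # picrite
d_basalt = d_ni_mgo(3.0)       # differentiated basalt
```

Note that FeO/MgO = 3 lies well outside the range of the twelve samples (0.47 to 0.87), so the value for the differentiated basalt is an extrapolation of the fit.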

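1.5.6 Fractional melting

The approach just used for fractional crystallization can be transposed immediately to fractional melting, a process by which each packet of melt is withdrawn from the source and thereby prevented from equilibrating with the solid. Again, these equations will be developed in Chapter 9, but the present section emphasizes a representation which does not require constant Berthelot-Nernst partition coefficients, and therefore is more useful for major elements. As a parallel to equation (1.5.2), we define $D_i$ as the ratio of the concentration $m_{\mathrm{sol}}^i/M_{\mathrm{sol}}$ of species i in the residual solid to the concentration $\mathrm{d}m_{\mathrm{liq}}^i/\mathrm{d}M_{\mathrm{liq}}$ in the last increment of liquid

$$\frac{m_{\mathrm{sol}}^i}{M_{\mathrm{sol}}} = D_i\,\frac{\mathrm{d}m_{\mathrm{liq}}^i}{\mathrm{d}M_{\mathrm{liq}}} \qquad (1.5.25)$$

which, again, carries no implication on $D_i$ being constant. Combining with mass balance conditions (1.5.1) gives

$$\frac{m_{\mathrm{sol}}^i}{M_{\mathrm{sol}}} = D_i\,\frac{\mathrm{d}m_{\mathrm{sol}}^i}{\mathrm{d}M_{\mathrm{sol}}}$$

or

$$\frac{\mathrm{d}m_{\mathrm{sol}}^i}{m_{\mathrm{sol}}^i} = \frac{1}{D_i}\,\frac{\mathrm{d}M_{\mathrm{sol}}}{M_{\mathrm{sol}}} \qquad (1.5.26)$$

The logarithmic differential of concentration is, as before

$$\frac{\mathrm{d}C_{\mathrm{sol}}^i}{C_{\mathrm{sol}}^i} = \frac{\mathrm{d}m_{\mathrm{sol}}^i}{m_{\mathrm{sol}}^i} - \frac{\mathrm{d}M_{\mathrm{sol}}}{M_{\mathrm{sol}}}$$

which, inserting equation (1.5.26), becomes

$$\frac{\mathrm{d}C_{\mathrm{sol}}^i}{C_{\mathrm{sol}}^i} = \frac{1 - D_i}{D_i}\,\frac{\mathrm{d}M_{\mathrm{sol}}}{M_{\mathrm{sol}}} \qquad (1.5.27)$$

Introducing F as the molten fraction

$$F = 1 - \frac{M_{\mathrm{sol}}}{M_0}$$

we get

$$\mathrm{d}\ln C_{\mathrm{sol}}^i = \left(\frac{1}{D_i} - 1\right)\mathrm{d}\ln(1 - F) \qquad (1.5.28)$$

which is the equivalent for fractional melting of equation (1.5.3). Subtracting equation (1.5.28) for species i1 from the same equation for species i2 gives

$$\mathrm{d}\ln\left(\frac{C^{i2}}{C^{i1}}\right)_{\mathrm{sol}} = \left(\frac{1}{D_{i2}} - \frac{1}{D_{i1}}\right)\mathrm{d}\ln(1 - F) \qquad (1.5.29)$$

or

$$\mathrm{d}\ln\left(\frac{C^{i2}}{C^{i1}}\right)_{\mathrm{sol}} = \left(\frac{1}{D_{i1}^{i2}} - 1\right)\frac{1}{D_{i1}}\,\mathrm{d}\ln(1 - F) \qquad (1.5.30)$$

From equation (1.5.28), we obtain $1/D_{i1}$ as

$$\frac{1}{D_{i1}} = \frac{\mathrm{d}\ln C_{\mathrm{sol}}^{i1} + \mathrm{d}\ln(1 - F)}{\mathrm{d}\ln(1 - F)}$$

which, inserted into equation (1.5.30), gives

$$\mathrm{d}\ln\left(\frac{C^{i2}}{C^{i1}}\right)_{\mathrm{sol}} = \left(\frac{1}{D_{i1}^{i2}} - 1\right)\left[\mathrm{d}\ln C_{\mathrm{sol}}^{i1} + \mathrm{d}\ln(1 - F)\right] \qquad (1.5.31)$$

If we consider three elements in a $\ln(C^{i3}/C^{i1})_{\mathrm{sol}}$ vs $\ln(C^{i2}/C^{i1})_{\mathrm{sol}}$ plot and divide equation (1.5.31) for the pair i1-i3 by the same equation for the pair i1-i2, we get the slope s of the graph as

$$s = \frac{\mathrm{d}\ln\left(C^{i3}/C^{i1}\right)_{\mathrm{sol}}}{\mathrm{d}\ln\left(C^{i2}/C^{i1}\right)_{\mathrm{sol}}} = \frac{1/D_{i1}^{i3} - 1}{1/D_{i1}^{i2} - 1} \qquad (1.5.32)$$

If $D_{i1}^{i2}$ and $D_{i1}^{i3}$ are constant, the two ratios define a straight line in a logarithmic plot and equation (1.5.32) can be integrated as

$$\frac{\left(C^{i3}/C^{i1}\right)_{\mathrm{sol}}}{\left(C^{i3}/C^{i1}\right)_0} = \left[\frac{\left(C^{i2}/C^{i1}\right)_{\mathrm{sol}}}{\left(C^{i2}/C^{i1}\right)_0}\right]^{s} \qquad (1.5.33)$$

where the 0 subscript now refers to the solid source [compare with equation (1.5.11)]. Because each phase, mineral or liquid, differs from the other by a partition coefficient, they all should follow trends parallel to that defined by equation (1.5.33).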
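For constant coefficients, equations (1.5.28) and (1.5.32) have simple closed forms that can be evaluated directly. The following sketch assumes constant D values, which the text does not require in general; the function names and the numerical values are illustrative.

```python
def residue_concentration(c0, d, f):
    """Concentration in the residual solid after a molten fraction f,
    from integrating d ln C_sol = (1/D - 1) d ln(1 - F) with constant D."""
    return c0 * (1.0 - f) ** (1.0 / d - 1.0)


def log_log_slope(d_i2, d_i3):
    """Slope of ln(C_i3/C_i1) vs ln(C_i2/C_i1) in the residue, equation (1.5.32).
    d_i2 and d_i3 are the coefficients of i2 and i3 relative to i1."""
    return (1.0 / d_i3 - 1.0) / (1.0 / d_i2 - 1.0)


# A strongly incompatible element (hypothetical D = 0.01) is stripped from
# the residue after only 10 percent fractional melting
depleted = residue_concentration(1.0, 0.01, 0.10)

# A compatible reference element (D = 1) keeps its concentration
unchanged = residue_concentration(1.0, 1.0, 0.10)
```

The very fast depletion of incompatible elements in the residue is the signature that distinguishes fractional from batch melting.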

Table 1.13. Major-element data (%) on clinopyroxenes from the mid-ocean ridge peridotites analyzed by Johnson et al. (1990). Samples are from the Vulcan, Bullard, and Bouvet fracture zones (FZ).

Al2O3   FeO    MgO     FeO/MgO   Al2O3/MgO
8.20    3.01   14.95   0.201     0.548
5.68    2.73   15.76   0.173     0.360
5.30    2.75   16.43   0.167     0.323
6.40    2.88   15.67   0.184     0.408
4.98    2.71   16.47   0.165     0.302
3.68    2.39   16.90   0.141     0.218
4.58    2.58   17.46   0.148     0.262
3.39    2.23   17.11   0.130     0.198
3.68    2.39   17.23   0.139     0.214
4.27    2.35   16.89   0.139     0.253

Ion-probe data for trace elements in clinopyroxene from abyssal peridotites collected along the America-Antarctica and South-West Indian ridges led Johnson et al. (1990) to suggest that the peridotites are residues left by a process of fractional melting of the asthenosphere. Use the electron microprobe data for Al2O3, FeO and MgO given in Table 1.13 on peridotites from three fracture zones around the Antarctica-Africa-America triple junction to test the fractional melting hypothesis and, assuming $D_{\mathrm{MgO}}^{\mathrm{FeO}} = 0.3$, calculate the $D_{\mathrm{MgO}}^{\mathrm{Al_2O_3}}$ fractionation coefficient.

Figure 1.12 ln(Al2O3/MgO) vs ln(FeO/MgO) plot of peridotite clinopyroxenes from near the Antarctica-Africa-America triple junction (Table 1.13 data from Johnson et al., 1990). The linear array [equation (1.5.32)] supports the assumption that a homogeneous peridotitic source has experienced fractional melting with $D_{\mathrm{MgO}}^{\mathrm{Al_2O_3}} \approx 0.16$.

When plotted in a ln(Al2O3/MgO) vs ln(FeO/MgO) diagram (Figure 1.12), the clinopyroxene data define a unique trend which is reasonably linear, thereby supporting the hypothesis of fractional melting made by Johnson et al. (1990). In addition, the linear array supports a homogeneous source and rather constant $D_{\mathrm{MgO}}^{\mathrm{FeO}} = 0.3$ and $D_{\mathrm{MgO}}^{\mathrm{Al_2O_3}}$ fractionation coefficients. Standard regression suggests the slope

$$\frac{\mathrm{d}\ln(\mathrm{Al_2O_3/MgO})}{\mathrm{d}\ln(\mathrm{FeO/MgO})} = 2.22$$

and therefore, from equation (1.5.32)

$$\frac{1/D_{\mathrm{MgO}}^{\mathrm{Al_2O_3}} - 1}{1/D_{\mathrm{MgO}}^{\mathrm{FeO}} - 1} = 2.22$$

or

$$\frac{1}{D_{\mathrm{MgO}}^{\mathrm{Al_2O_3}}} = 1 + 2.22\left(\frac{1}{0.3} - 1\right)$$

and finally

$$D_{\mathrm{MgO}}^{\mathrm{Al_2O_3}} \approx 0.16$$
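The regression step can be reproduced from the oxide data of Table 1.13 with an ordinary least-squares fit. The sketch below is only one possible reading of "standard regression" (y regressed on x); other schemes would give somewhat different slopes. The helper `d_from_slope` simply rearranges equation (1.5.32); all names are illustrative.

```python
import math

# Table 1.13 clinopyroxene data (percent)
AL2O3 = [8.20, 5.68, 5.30, 6.40, 4.98, 3.68, 4.58, 3.39, 3.68, 4.27]
FEO = [3.01, 2.73, 2.75, 2.88, 2.71, 2.39, 2.58, 2.23, 2.39, 2.35]
MGO = [14.95, 15.76, 16.43, 15.67, 16.47, 16.90, 17.46, 17.11, 17.23, 16.89]


def ols_slope(xs, ys):
    """Ordinary least-squares slope of y on x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    return s_xy / s_xx


def d_from_slope(slope, d_ref):
    """Solve equation (1.5.32) for the unknown coefficient, given the slope
    and the reference coefficient (here the FeO/MgO coefficient)."""
    return 1.0 / (1.0 + slope * (1.0 / d_ref - 1.0))


x = [math.log(fe / mg) for fe, mg in zip(FEO, MGO)]
y = [math.log(al / mg) for al, mg in zip(AL2O3, MGO)]
slope = ols_slope(x, y)
```

The fitted slope comes out close to the value of 2.22 used in the text; passing it to `d_from_slope` together with the assumed FeO/MgO coefficient gives the Al2O3/MgO coefficient.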

Although less compatible than Mg, Al apparently is still well retained by the residue during melting, which supports the presence of an aluminous phase in the residue as suggested by Johnson et al. (1990). If we knew the Al2O3/MgO ratio of the whole rock prior to melting, it would be possible to calculate that ratio in the last liquid extracted and compare its value with mid-ocean ridge basalt (MORB) values.

1.5.7 Fractional condensation

Distillation processes explain (Dansgaard, 1964) the correlation between $\delta^{18}$O and $\delta$D values of meteoric water found by Craig (1961). Most water vapor forms near the equator by evaporation of warm surface seawater, i.e., from a reservoir which can be assumed to have nearly homogeneous isotope composition of oxygen and hydrogen. When air masses move poleward, progressive cooling of the water vapor and condensation take place, which induces water/vapor isotopic fractionation. Relative to the residual vapor, liquid water is enriched in the heavy isotopes (18O and D) with respect to the light isotopes (16O and H). With increasing latitude, atmospheric moisture therefore becomes more and more depleted in 18O and D by rainfalls.

Fractional condensation can be handled like fractional crystallization with rain water (w) instead of solid, and vapor (v) instead of liquid. Then, the $D_{i1}^{i2}$ fractionation coefficients are related to the isotopic fractionation factors $\alpha$ commonly used in this field through

$$D_{^{16}\mathrm{O}}^{^{18}\mathrm{O}} = \alpha_{\mathrm{O}}^{w/v} = \frac{(^{18}\mathrm{O}/^{16}\mathrm{O})_w}{(^{18}\mathrm{O}/^{16}\mathrm{O})_v} \qquad \text{and} \qquad D_{\mathrm{H}}^{\mathrm{D}} = \alpha_{\mathrm{H}}^{w/v} = \frac{(\mathrm{D/H})_w}{(\mathrm{D/H})_v}$$

Dividing equation (1.5.4) for D/H by the same equation for 18O/16O produces

$$\frac{\mathrm{d}\ln(\mathrm{D/H})_v}{\mathrm{d}\ln(^{18}\mathrm{O}/^{16}\mathrm{O})_v} = \frac{\alpha_{\mathrm{H}}^{w/v} - 1}{\alpha_{\mathrm{O}}^{w/v} - 1}\,\frac{D_{\mathrm{H}}}{D_{^{16}\mathrm{O}}} = \frac{\alpha_{\mathrm{H}}^{w/v} - 1}{\alpha_{\mathrm{O}}^{w/v} - 1}\,\frac{(\mathrm{H}/^{16}\mathrm{O})_w}{(\mathrm{H}/^{16}\mathrm{O})_v}$$

The last term in the right-hand side of the last equality is obviously very close to unity, therefore

$$\frac{\mathrm{d}\ln(\mathrm{D/H})_w}{\mathrm{d}\ln(^{18}\mathrm{O}/^{16}\mathrm{O})_w} = \frac{\alpha_{\mathrm{H}}^{w/v} - 1}{\alpha_{\mathrm{O}}^{w/v} - 1}$$

The $\delta$ notation is more convenient and, since SMOW values, by definition, are constant, we get

$$\frac{\mathrm{d}\ln\left(1 + \delta\mathrm{D}_w/1000\right)}{\mathrm{d}\ln\left(1 + \delta^{18}\mathrm{O}_w/1000\right)} = \frac{\alpha_{\mathrm{H}}^{w/v} - 1}{\alpha_{\mathrm{O}}^{w/v} - 1}$$

The linear expansion of the natural logarithm in a Taylor series $\ln(1 + \varepsilon) \approx \varepsilon$ holds for any small number $\varepsilon$. For small values of $\delta^{18}$O and $\delta$D, we can use the approximation

$$\frac{\mathrm{d}\,\delta\mathrm{D}_w}{\mathrm{d}\,\delta^{18}\mathrm{O}_w} \approx \frac{\alpha_{\mathrm{H}}^{w/v} - 1}{\alpha_{\mathrm{O}}^{w/v} - 1}$$

Craig (1961) found that the slope of the correlation between $\delta^{18}$O and $\delta$D in precipitation is constant and equal to 8, which is reasonably consistent with atmospheric temperatures.

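1.5.8 Open-system isotopic exchanges

In hydrothermal systems, a simple distillation equation for an isotope ratio i2/i1 can be worked out if only i2 is easily exchangeable. This is the case of the 18O/16O ratio because the bulk of oxygen is the isotope 16 and its concentrations are determined by stoichiometry (Taylor, 1978). This may also be the case of Sr when the bulk concentrations are not changed drastically and isotopic exchange is the most visible effect (Albarede et al., 1981; McCulloch et al., 1981). We consider a system of total mass $M_0$ made of porous rock (r) wetted by water (w) with respective mass fractions $f_w$ and $f_r$ which we assume to be constant. The isotope i1 is assumed to be unaffected by water-rock interaction. Change in the amount of isotope i2 held by the whole system takes place by circulation of a fluid incremental amount $\mathrm{d}M_w$ which enters the system with a concentration $C_{\mathrm{in}}^{i2}$ and leaves it with the i2 concentration $C_w^{i2}$ of the interstitial liquid

$$\mathrm{d}\left[M_0\left(f_w C_w^{i2} + f_r C_r^{i2}\right)\right] = C_{\mathrm{in}}^{i2}\,\mathrm{d}M_w - C_w^{i2}\,\mathrm{d}M_w \qquad (1.5.34)$$

Defining the incremental water/rock ratio dQ as

$$\mathrm{d}Q = \frac{\mathrm{d}M_w}{M_0}$$

and dividing equation (1.5.34) by $\mathrm{d}M_w$

$$f_w\,\frac{\mathrm{d}C_w^{i2}}{\mathrm{d}Q} + (1 - f_w)\,\frac{\mathrm{d}C_r^{i2}}{\mathrm{d}Q} = C_{\mathrm{in}}^{i2} - C_w^{i2} \qquad (1.5.35)$$

We finally divide both sides by $C_w^{i1} = C_{\mathrm{in}}^{i1}$ to obtain

$$f_w\,\frac{\mathrm{d}\left(C^{i2}/C^{i1}\right)_w}{\mathrm{d}Q} + (1 - f_w)\,\frac{C_r^{i1}}{C_w^{i1}}\,\frac{\mathrm{d}\left(C^{i2}/C^{i1}\right)_r}{\mathrm{d}Q} = \left(\frac{C^{i2}}{C^{i1}}\right)_{\mathrm{in}} - \left(\frac{C^{i2}}{C^{i1}}\right)_w \qquad (1.5.36)$$

Oxygen isotope exchange during open-system meteoric-hydrothermal alteration of a granitic body (Taylor, 1978). Meteoric water ($\delta^{18}\mathrm{O}_{\mathrm{in}} = -16$) percolates through a granitic body with $\delta^{18}\mathrm{O} = +8$. Assuming that water (w)-rock (r) oxygen fractionation can be approximated by the water-feldspar value ($\delta^{18}\mathrm{O}_{\mathrm{feldspar}} - \delta^{18}\mathrm{O}_{\mathrm{water}} = +2$), draw the locus of possible $\delta^{18}\mathrm{O}_{\mathrm{rock}}$ for variable proportions of interstitial water up to a water/rock ratio of 0.8. Assume the oxygen concentration ratio $C_w^{\mathrm{O}}/C_r^{\mathrm{O}} = 2$.

Letting i1 = 16O and i2 = 18O, equation (1.5.36) becomes

$$f_w\,\frac{\mathrm{d}(^{18}\mathrm{O}/^{16}\mathrm{O})_w}{\mathrm{d}Q} + (1 - f_w)\,\frac{C_r^{\mathrm{O}}}{C_w^{\mathrm{O}}}\,\frac{\mathrm{d}(^{18}\mathrm{O}/^{16}\mathrm{O})_r}{\mathrm{d}Q} \approx \left(\frac{^{18}\mathrm{O}}{^{16}\mathrm{O}}\right)_{\mathrm{in}} - \left(\frac{^{18}\mathrm{O}}{^{16}\mathrm{O}}\right)_w$$

where the $\approx$ sign emphasizes that 16O is considered as immobile. Dividing each term by $(^{18}\mathrm{O}/^{16}\mathrm{O})_{\mathrm{SMOW}}$, subtracting 1 and multiplying by 1000 enables $\delta$ values to be introduced

$$f_w\,\frac{\mathrm{d}\,\delta^{18}\mathrm{O}_w}{\mathrm{d}Q} + (1 - f_w)\,\frac{C_r^{\mathrm{O}}}{C_w^{\mathrm{O}}}\,\frac{\mathrm{d}\,\delta^{18}\mathrm{O}_r}{\mathrm{d}Q} = \delta^{18}\mathrm{O}_{\mathrm{in}} - \delta^{18}\mathrm{O}_w \qquad (1.5.37)$$

Since the difference in $\delta$ values for water and rock is constant, their incremental variation $\mathrm{d}\,\delta^{18}\mathrm{O}$ is identical and

$$\left[f_w + (1 - f_w)\,\frac{C_r^{\mathrm{O}}}{C_w^{\mathrm{O}}}\right]\frac{\mathrm{d}\,\delta^{18}\mathrm{O}_w}{\mathrm{d}Q} = \delta^{18}\mathrm{O}_{\mathrm{in}} - \delta^{18}\mathrm{O}_w$$

This equation can be rearranged for easier integration as

$$\frac{\mathrm{d}\left(\delta^{18}\mathrm{O}_w - \delta^{18}\mathrm{O}_{\mathrm{in}}\right)}{\delta^{18}\mathrm{O}_w - \delta^{18}\mathrm{O}_{\mathrm{in}}} = -\frac{\mathrm{d}Q}{Q_0} \qquad (1.5.38)$$

which can be integrated into

$$\frac{\delta^{18}\mathrm{O}_w - \delta^{18}\mathrm{O}_{\mathrm{in}}}{\delta^{18}\mathrm{O}_w^0 - \delta^{18}\mathrm{O}_{\mathrm{in}}} = e^{-Q/Q_0} \qquad (1.5.39)$$

where

$$Q_0 = f_w + (1 - f_w)\,\frac{C_r^{\mathrm{O}}}{C_w^{\mathrm{O}}}$$

$\delta^{18}\mathrm{O}_{\mathrm{rock}}$ values can now be calculated from

$$\delta^{18}\mathrm{O}_r = 2 + \delta^{18}\mathrm{O}_{\mathrm{in}} + \left(\delta^{18}\mathrm{O}_r^0 - 2 - \delta^{18}\mathrm{O}_{\mathrm{in}}\right)e^{-Q/Q_0} \qquad (1.5.40)$$

a relationship drawn in Figure 1.13.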
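The whole-rock curves of this example can be generated by evaluating equation (1.5.40) for a few values of the interstitial water fraction. The sketch below uses the numerical values of the problem statement as defaults; the function name, the argument names, and the chosen water fractions are illustrative assumptions.

```python
import math


def delta_rock(q, f_w, delta_in=-16.0, delta_rock0=8.0,
               rock_minus_water=2.0, cw_over_cr=2.0):
    """Whole-rock delta-18O after an integrated water/rock ratio q,
    equation (1.5.40), for a mass fraction f_w of interstitial water."""
    q0 = f_w + (1.0 - f_w) / cw_over_cr
    steady = rock_minus_water + delta_in          # value reached for large q
    return steady + (delta_rock0 - steady) * math.exp(-q / q0)


# One curve per assumed water fraction, for Q between 0 and 0.8
q_values = [0.1 * k for k in range(9)]
curves = {f_w: [delta_rock(q, f_w) for q in q_values]
          for f_w in (0.0, 0.1, 0.2)}
```

All curves start at +8 and relax exponentially toward -14, the value of a rock in equilibrium with unmodified meteoric water; a larger water fraction increases $Q_0$ and slows the approach.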

Figure 1.13 Incremental water/rock interaction: change of the whole-rock $\delta^{18}$O with the integrated water/rock ratio $Q = M_{\mathrm{water}}/M_0$ as given by equation (1.5.40). Water-rock fractionation is $\delta^{18}\mathrm{O}_{\mathrm{rock}} - \delta^{18}\mathrm{O}_{\mathrm{water}} = +2$.

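Strontium isotope exchange during open-system hydrothermal alteration of basalts (Albarede et al., 1981). Seawater (sw) with 8 ppm Sr and $(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{sw}} = 0.709$ percolates through mid-ocean ridge basalts (bas) with 120 ppm Sr and 87Sr/86Sr = 0.703, initially. Seawater and basalt react to form a hydrothermal solution (hw) and a hydrothermally altered rock (hr). Sr concentrations are not modified in the process. Calculate the water/rock ratio Q for a hydrothermal solution with $(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{hw}} = 0.704$, assuming an extremely low rock porosity and complete rock-fluid isotopic equilibrium.

The assumption of low rock porosity is equivalent to $f_w \approx 0$, hence, from equation (1.5.36)

$$\frac{C_r^{i1}}{C_w^{i1}}\,\frac{\mathrm{d}\left(C^{i2}/C^{i1}\right)_r}{\mathrm{d}Q} = \left(\frac{C^{i2}}{C^{i1}}\right)_{\mathrm{in}} - \left(\frac{C^{i2}}{C^{i1}}\right)_w$$

We assign the superscripts i1 to 86Sr, i2 to 87Sr, the subscripts r to the hydrothermal rock (hr), w to the hydrothermal solution (hw), and in to inflowing seawater (sw). Since 86Sr and Sr concentrations are nearly exactly proportional quantities, we get

$$\frac{C_{\mathrm{hr}}^{\mathrm{Sr}}}{C_{\mathrm{hw}}^{\mathrm{Sr}}}\,\frac{\mathrm{d}(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{hr}}}{\mathrm{d}Q} = \left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{sw}} - \left(\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}}\right)_{\mathrm{hw}}$$

The Sr isotope composition of the hydrothermally altered basalts (hr) being identical to that of the hydrothermal fluid (hw), we prepare for integration by rewriting

$$\frac{C_{\mathrm{hr}}^{\mathrm{Sr}}}{C_{\mathrm{hw}}^{\mathrm{Sr}}}\,\frac{\mathrm{d}\left[(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{hw}} - (^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{sw}}\right]}{(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{hw}} - (^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{sw}}} = -\mathrm{d}Q$$

which holds true because $(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{sw}}$ is constant. If the fluid Sr concentration stays constant during the water-rock interaction process, so does rock Sr concentration, hence, upon integration

$$\ln\frac{(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{hw}} - (^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{sw}}}{(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{bas}} - (^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{sw}}} = -\frac{C_{\mathrm{hw}}^{\mathrm{Sr}}}{C_{\mathrm{hr}}^{\mathrm{Sr}}}\,Q$$

Rearranging

$$Q = \frac{M_{\mathrm{sw}}}{M_{\mathrm{bas}}} = \frac{C_{\mathrm{hr}}^{\mathrm{Sr}}}{C_{\mathrm{hw}}^{\mathrm{Sr}}}\,\ln\frac{(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{sw}} - (^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{bas}}}{(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{sw}} - (^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{hw}}} \qquad (1.5.43)$$

and inserting numerical values

$$Q = \frac{120}{8}\,\ln\frac{0.709 - 0.703}{0.709 - 0.704} = 2.7$$

How does the system behave for small Q values? Equation (1.5.43) can be rewritten

$$Q = \frac{C_{\mathrm{hr}}^{\mathrm{Sr}}}{C_{\mathrm{hw}}^{\mathrm{Sr}}}\,\ln\left[1 + \frac{(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{hw}} - (^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{bas}}}{(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{sw}} - (^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{hw}}}\right] \qquad (1.5.44)$$

Using once more the linear expansion of the natural logarithm in a Taylor series for the second term between brackets allows us to find an approximation valid for small Q

$$Q \approx \frac{C_{\mathrm{hr}}^{\mathrm{Sr}}}{C_{\mathrm{hw}}^{\mathrm{Sr}}}\,\frac{(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{hw}} - (^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{bas}}}{(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{sw}} - (^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{hw}}} \qquad (1.5.45)$$

Inserting the appropriate numerical values, we obtain Q = 3.0. The reader is urged to check that equation (1.5.45) is equivalent to assuming closed-system fluid-rock equilibrium. Unless the water/rock ratio is quite high, the closed-system approximation is sufficient for practical purposes (Albarede et al., 1981).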
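The open-system estimate of the water/rock ratio and its small-Q (closed-system) approximation can be compared numerically. The sketch below uses the values of the problem statement; the function and argument names are illustrative.

```python
import math


def q_open(c_rock, c_water, r_sw, r_bas, r_hw):
    """Open-system water/rock ratio: logarithmic expression obtained by
    integrating the exchange equation at negligible porosity."""
    return (c_rock / c_water) * math.log((r_sw - r_bas) / (r_sw - r_hw))


def q_closed(c_rock, c_water, r_sw, r_bas, r_hw):
    """First-order (small Q) approximation, equivalent to closed-system
    fluid-rock equilibrium."""
    return (c_rock / c_water) * (r_hw - r_bas) / (r_sw - r_hw)


args = dict(c_rock=120.0, c_water=8.0, r_sw=0.709, r_bas=0.703, r_hw=0.704)
q_o = q_open(**args)
q_c = q_closed(**args)
```

The two estimates differ by roughly 10 percent here, which illustrates why the closed-system approximation is often sufficient unless the water/rock ratio is high.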

2 Linear algebra

The purpose of this Chapter is not to present an exhaustive theory of linear algebra that would take more than a volume by itself to be presented adequately. It is rather to introduce some fundamental aspects of vectors, matrices and orthogonal functions together with the most common difficulties that the reader most probably has encountered in scientific readings, and to provide some simple definitions and examples with geochemical connotations. Many excellent textbooks exist which can complement this introductory chapter, in particular that of Strang (1976).

2.1 A matrix refresher

2.1.1 Definitions

• An n-tuple is an ordered set of n objects.

• A vector space $\mathbb{R}^n$ of dimension n is the set of all the n-tuples, called vectors, of real numbers or scalars.

• In the space $\mathbb{R}^n$, m vectors $(x_1, \ldots, x_m)$ are independent if, given the m scalars $a_i$, the relationship

$$\sum_{i=1}^{m} a_i x_i = 0 \tag{2.1.1}$$

requires that the m $a_i$'s are zero.

• A base of $\mathbb{R}^n$ is a set of n vectors such that any vector in $\mathbb{R}^n$ can be represented by a linear combination of the base vectors.

• A matrix is an array, either square or rectangular, of numbers. It will be noted subsequently with bold-face, upper-case symbols. The real matrix $A_{m \times n}$ has m rows and n columns. A matrix $A_{m \times n}$ also represents either a set of n column-vectors in $\mathbb{R}^m$, or a set of m row-vectors in $\mathbb{R}^n$. A scalar element of the matrix A will be referred to by its row and column index and noted $a_{ij}$ (ith row, jth column), so the array form of the matrix A is

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$

Example:

$$A = \begin{bmatrix} 1 & -1 \\ 0 & 2 \\ 1 & 3 \end{bmatrix}$$

A vector may be considered as a special matrix, such that n = 1, and will be noted with bold-face, lower-case symbols. Example:

$$b = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}$$

• A square matrix has the same number of rows and columns. A square matrix is symmetric if $a_{ij} = a_{ji}$. A diagonal matrix is a square matrix with all off-diagonal entries equal to zero. A diagonal matrix is necessarily symmetric. Example of a diagonal matrix:

$$A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 3 \end{bmatrix}$$

The identity matrix $I_{n \times n}$ (usually written for short $I_n$) is a diagonal matrix with ones on the diagonal. Example:

$$I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

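2.1.2 A few rules for matrix manipulation

• The transpose of the matrix $A_{m \times n}$ is the matrix $B_{n \times m} = A^T$ such that

$$b_{ij} = a_{ji} \tag{2.1.2}$$

Example:

$$A = \begin{bmatrix} 1 & -1 \\ 0 & 2 \\ 1 & 3 \end{bmatrix}, \qquad A^T = \begin{bmatrix} 1 & 0 & 1 \\ -1 & 2 & 3 \end{bmatrix}$$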

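• The addition of matrices $A_{m \times n}$ and $B_{m \times n}$ results in the matrix $C_{m \times n}$, such that

$$C = A + B \iff c_{ij} = a_{ij} + b_{ij} \tag{2.1.3}$$

Example:

$$\text{If } A = \begin{bmatrix} 1 & -1 \\ 0 & 2 \\ 1 & 3 \end{bmatrix} \text{ and } B = \begin{bmatrix} -1 & -1 \\ 2 & -1 \\ 0 & 1 \end{bmatrix}, \text{ then } C = A + B = \begin{bmatrix} 0 & -2 \\ 2 & 1 \\ 1 & 4 \end{bmatrix}$$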

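• The multiplication of the matrix $A_{m \times n}$ by the scalar a results in the matrix $C_{m \times n}$, such that

$$C = aA \iff c_{ij} = a\,a_{ij} \tag{2.1.4}$$

Example:

$$\text{If } A = \begin{bmatrix} 1 & -1 \\ 0 & 2 \\ 1 & 3 \end{bmatrix} \text{ and } a = 2, \text{ then } C = 2A = \begin{bmatrix} 2 & -2 \\ 0 & 4 \\ 2 & 6 \end{bmatrix}$$

• The multiplication of the matrix $A_{m \times n}$ by the matrix $B_{n \times p}$ results in the matrix $C_{m \times p}$ calculated from

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} \tag{2.1.5}$$

Example:

$$\text{If } A = \begin{bmatrix} -1 & 2 & -1 \\ 3 & 1 & 0 \\ -1 & 2 & 1 \end{bmatrix} \text{ and } B = \begin{bmatrix} 1 & -1 \\ 0 & 2 \\ 1 & 3 \end{bmatrix}, \text{ then } C = AB = \begin{bmatrix} -2 & 2 \\ 3 & -1 \\ 0 & 8 \end{bmatrix}$$

For instance, $c_{32} = a_{31}b_{12} + a_{32}b_{22} + a_{33}b_{32} = (-1) \times (-1) + 2 \times 2 + 1 \times 3 = 8$.

• The product of the matrix $A_{m \times n}$ by the vector $x_n$ produces the vector $y_m$. A is therefore a linear operator which associates a vector from $\mathbb{R}^n$ with a vector from $\mathbb{R}^m$. Example: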

1 If A =

-1 and x = \

, then y = Ax =

.0 •

Matrix product is in general not commutative, i.e.,

$$AB \neq BA \tag{2.1.6}$$

but is associative

$$(AB)C = A(BC) \tag{2.1.7}$$

and distributive

$$A(B + C) = AB + AC \tag{2.1.8}$$

Identity matrices represent the unitary elements of matrix multiplication for

$$A_{m \times n} I_n = I_m A_{m \times n} = A_{m \times n} \tag{2.1.9}$$

It is left to the reader to show that, through equation (2.1.5),

$$(AB)^T = B^T A^T \tag{2.1.10}$$

• The dot or inner product of two vectors $x_n$ and $y_n$ is the scalar a, such that

$$a = x \cdot y = x^T y = y^T x = \sum_{i=1}^{n} x_i y_i \tag{2.1.11}$$

Because of its scalar nature, the order of vectors in the dot product can be exchanged.

• The outer product of two vectors $x_m$ and $y_n$ is the matrix $A_{m \times n}$, such that

$$A = x y^T \iff a_{ij} = x_i y_j \tag{2.1.12}$$

The outer product is not commutative.

• Given $x_n$ and a symmetric matrix $A_{n \times n}$, the scalar S, such that

$$S = x^T A x \tag{2.1.13}$$

is called a quadratic form.

• The squared-length $\|x\|^2$ of the vector x makes the vector space $\mathbb{R}^n$ Euclidian if it is given by

$$\|x\|^2 = x^T x \tag{2.1.14}$$

The cosine of the angle $\theta$ between vectors x and y is given by

$$\cos\theta = \frac{x^T y}{\|x\|\,\|y\|} \tag{2.1.15}$$

Example: Given the vectors $x = [1, 2]^T$ and $y = [-1, 1]^T$, we compute their squared-lengths, dot product, and angle as

$$\|x\|^2 = 1 + 4 = 5, \qquad \|y\|^2 = 1 + 1 = 2, \qquad x^T y = -1 + 2 = 1, \qquad \cos\theta = \frac{1}{\sqrt{10}}$$

• Two vectors x and y are orthogonal if their dot product is zero.
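The rules of this section translate almost word for word into code. The sketch below is only an illustration under stated assumptions: matrices are stored as nested Python lists, the helper names (`transpose`, `matmul`, `dot`) are hypothetical, and the numerical values of A and B are illustrative. It checks the product rule (2.1.5), the transposition rule (2.1.10) and the cosine formula (2.1.15).

```python
# Matrices are plain nested lists (one inner list per row); helper names are illustrative.

def transpose(a):
    """Swap rows and columns: b[i][j] = a[j][i]."""
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Row-by-column product: c[i][j] = sum over k of a[i][k] * b[k][j]."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions must agree")
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dot(x, y):
    """Inner product of two vectors of equal length."""
    return sum(xi * yi for xi, yi in zip(x, y))

A = [[-1, 2, -1], [3, 1, 0], [-1, 2, 1]]   # 3 x 3, illustrative values
B = [[1, -1], [0, 2], [1, 3]]              # 3 x 2, illustrative values
C = matmul(A, B)                           # 3 x 2 product

# Transposition reverses the order of a product: (AB)^T = B^T A^T
lhs = transpose(C)
rhs = matmul(transpose(B), transpose(A))

# Squared cosine of the angle between x = [1, 2] and y = [-1, 1]
x, y = [1, 2], [-1, 1]
cos2 = dot(x, y) ** 2 / (dot(x, x) * dot(y, y))
```

Working with the squared cosine avoids a square root, so the comparison with 1/10 is exact up to floating-point rounding.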

2.1.3 The common-dimension expansion of the matrix product

Let us prove a useful alternative expression of the matrix product. Let $e_i$ (i = 1, ..., n) be the column vector whose n coordinates are zero except for the ith which is equal to 1. The n $e_i$'s form a base of the Euclidian space $\mathbb{R}^n$. $Ue_i$ is the ith column of a matrix $U_{m \times n}$ while $e_i^T V$ is the ith row of a matrix $V_{n \times p}$. Outer products such as $e_i e_i^T$ are n × n matrices. From the previous definitions

$$\sum_{i=1}^{n} e_i e_i^T = I_n \tag{2.1.16}$$

where $I_n$ is the identity matrix. For a diagonal matrix $D_{n \times n}$ with diagonal element $d_{ii}$

$$e_i^T D e_i = d_{ii} \tag{2.1.17}$$

while, for $i \neq j$

$$e_i^T D e_j = 0 \tag{2.1.18}$$

We now assume that the matrix $A_{m \times p}$ can be expressed as the product of three matrices $U_{m \times n}$, $D_{n \times n}$, and $V_{n \times p}$

$$A_{m \times p} = U_{m \times n} D_{n \times n} V_{n \times p} \tag{2.1.19}$$

Let us express the matrix A as

$$A = U I_n D I_n V \tag{2.1.20}$$

or, using equation (2.1.16) and keeping the dummy index different for each sum

$$A = U \left( \sum_{i=1}^{n} e_i e_i^T \right) D \left( \sum_{k=1}^{n} e_k e_k^T \right) V$$

Using the distributive property of the matrix product, and since each summation index is independent, we get

$$A = \sum_{i=1}^{n} \sum_{k=1}^{n} (U e_i)(e_i^T D e_k)(e_k^T V)$$

All the terms under the second summation sign with $i \neq k$ vanish and therefore

$$A = \sum_{i=1}^{n} (U e_i)\, d_{ii}\, (e_i^T V) = \sum_{i=1}^{n} d_{ii}\, (\text{ith column of } U)(\text{ith row of } V) \tag{2.1.21}$$

The common-dimension expansion shows that the matrix product can be viewed as a linear combination of the pairwise outer products of U columns and V rows. Example:

$$\begin{bmatrix} 1 & -2 \\ 2 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 0 & -1 \\ 2 & 1 \end{bmatrix} = 2 \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix} \begin{bmatrix} 0 & -1 \end{bmatrix} + 3 \begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix} \begin{bmatrix} 2 & 1 \end{bmatrix} = \begin{bmatrix} -12 & -8 \\ 6 & -1 \\ 6 & 3 \end{bmatrix}$$

a result, of course, which can also be obtained with the usual rule of matrix multiplication.
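The equivalence between the common-dimension expansion (2.1.21) and the usual row-by-column rule is easy to check numerically. The following sketch is illustrative only: the function names `expand` and `triple_product` are hypothetical, the diagonal matrix D is passed as the list `d` of its diagonal elements, and the matrices are small illustrative integers.

```python
def outer(u, v):
    """Outer product u v^T of two vectors, returned as a nested list."""
    return [[ui * vj for vj in v] for ui in u]

def expand(U, d, V):
    """Sum over i of d[i] * (ith column of U)(ith row of V), as in (2.1.21)."""
    m, p = len(U), len(V[0])
    total = [[0] * p for _ in range(m)]
    for i, d_i in enumerate(d):
        column = [row[i] for row in U]          # U e_i
        term = outer(column, V[i])              # (U e_i)(e_i^T V)
        total = [[total[r][c] + d_i * term[r][c] for c in range(p)]
                 for r in range(m)]
    return total

def triple_product(U, d, V):
    """Usual row-by-column evaluation of U D V with D = diag(d)."""
    return [[sum(U[r][k] * d[k] * V[k][c] for k in range(len(d)))
             for c in range(len(V[0]))] for r in range(len(U))]

U = [[1, -2], [2, 1], [0, 1]]   # 3 x 2, illustrative values
d = [2, 3]                      # diagonal of D
V = [[0, -1], [2, 1]]           # 2 x 2, illustrative values

A_expanded = expand(U, d, V)
A_direct = triple_product(U, d, V)
```

Both routes perform the same multiplications; the expansion merely groups them by the common index, which is the form reused later for eigenvector expansions.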

2.1.4 The subspaces of a matrix

Most of the following definitions and theorems form the core of many textbooks in linear algebra. They are given here for reference, but the reader looking for completeness and rigor could refer to Strang (1976) or Leon (1990).

• The span of a set of vectors in $\mathbb{R}^n$ is the subspace of all the linear combinations of these vectors. For instance, the span of one vector is a straight line, the span of two vectors is a plane, and so on.

• The range, or column-space, of a matrix $A_{m \times n}$ is the span in $\mathbb{R}^m$ of the column-vectors of A. The row-space of A is the span in $\mathbb{R}^n$ of the row-vectors of A. Given a vector $x_n$, the product vector y = Ax belongs to the column-space of A. Using the notation of the current elements, the product of the matrix $A_{4 \times 3}$ by the vector $x_3 = [x_1, x_2, x_3]^T$ gives the vector $y = [y_1, y_2, y_3, y_4]^T$ as a linear combination of the column-vectors of A

$$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} = x_1 \begin{bmatrix} a_{11} \\ a_{21} \\ a_{31} \\ a_{41} \end{bmatrix} + x_2 \begin{bmatrix} a_{12} \\ a_{22} \\ a_{32} \\ a_{42} \end{bmatrix} + x_3 \begin{bmatrix} a_{13} \\ a_{23} \\ a_{33} \\ a_{43} \end{bmatrix}$$

The column-space of $A^T$ is identical to the row-space of A. The space of the column-vectors $x_n$ such that $Ax = 0_m$, where $0_m$ is an m × 1 vector of zeroes, is called the nullspace of the matrix A. Any vector from the nullspace is therefore orthogonal to any vector from the row-space. The left nullspace of A is the set of vectors $y_m$ such that $y^T A = 0_n^T$. Any vector from the left nullspace is therefore orthogonal to any vector from the column-space. The left nullspace of A is identical to the nullspace of $A^T$. The rank r of the matrix A is the dimension of its column-space, i.e., the maximum number of independent column-vectors. It can be demonstrated that the ranks of A and $A^T$ are equal. The dimension of the nullspace is n − r, that of the left nullspace m − r.

Example: let us consider a mineralogical assemblage made of enstatite, forsterite, and quartz similar to that discussed in Section 1.2. The matrix $A_{2 \times 3}$ such that

$$A = \begin{array}{c|ccc} & \text{en} & \text{fo} & \text{qz} \\ \hline \mathrm{SiO_2} & 1/2 & 1/3 & 1 \\ \mathrm{MgO} & 1/2 & 2/3 & 0 \end{array}$$

where, for clarity, headings have been added to rows and columns, is called the mineralogical matrix or component matrix of the assemblage. Any geochemical sample formed as a combination of silicon and magnesium oxides can be considered as a vector in either the oxide space or the mineral space. Both oxides and minerals can be actually present or simply be virtual. The normative composition (e.g., the CIPW norm) of a sample is the representation of its composition in terms of virtual minerals. Oxide or mineral components are independent if they form a base of the corresponding space. The number of independent components of the assemblage is therefore the rank of the mineralogical matrix. This is the celebrated Brinkley rule (Brinkley, 1946), quite straightforward for simple artificial assemblages, but of considerable difficulty for complex natural rocks and solutions. If the number of phases exceeds the number of independent components, the mineral phases form a reactional assemblage. Examples of calculations of stoichiometric coefficients from the mineralogical matrix will be presented in Chapter 5. Additional examples will be discussed for solutions in Chapter 6.

Each vector $x_n$ can be decomposed as the sum of a vector from the row-space and a vector in the nullspace. These two vectors are orthogonal. Each vector $y_m$ can be decomposed as the sum of a vector from the column-space and a vector in the left nullspace. These two vectors are orthogonal.

2.2 Square matrices

2.2.1 The determinant of a matrix

• The determinant det A of the square matrix A is most simply defined by the row-wise recurrence formula

$$\det A = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} |A_{ij}| \tag{2.2.1}$$

In this formula, the choice of the row index i is indifferent and $|A_{ij}|$ is the ijth minor, i.e., the determinant of the matrix obtained by deleting the ith row and the jth column of the matrix A. The recurrence relation requires a starting condition: the determinant of a scalar a, i.e., of a matrix of dimension 1 × 1, is a. The product $(-1)^{i+j} |A_{ij}|$ is the cofactor of rank ij. The development can be equally made column-wise, i.e.,

$$\det A = \sum_{i=1}^{n} (-1)^{i+j} a_{ij} |A_{ij}| \tag{2.2.2}$$

Example (row-wise):

$$\det \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = a_{11} a_{22} - a_{12} a_{21}$$

Example (column-wise):

$$\det \begin{bmatrix} 1 & 2 & 0 \\ 0 & 2 & 1 \\ 1 & 3 & 1 \end{bmatrix} = (-1)^{1+1} \times 1 \times \det \begin{bmatrix} 2 & 1 \\ 3 & 1 \end{bmatrix} + (-1)^{2+1} \times 0 \times \det \begin{bmatrix} 2 & 0 \\ 3 & 1 \end{bmatrix} + (-1)^{3+1} \times 1 \times \det \begin{bmatrix} 2 & 0 \\ 2 & 1 \end{bmatrix} = -1 + 0 + 2 = 1$$

Figure 2.1 Hyper-prism built on the column-vectors $(a_1, a_2, a_3)$ of a non-singular matrix $A_{3 \times 3}$. The determinant is equal to the volume of the hyper-prism. When one of the vectors can be expressed as a linear combination of the others, both the volume and the determinant vanish and the matrix A is singular.

det A represents the volume of the hyper-prism made of the column-vectors of the matrix A (Figure 2.1). If $\det A \neq 0$, the column-vectors of A are independent and the matrix is regular. Since the rank of the column-space equals that of the row-space, the row-vectors are also independent. If det A = 0, the column-vectors of A are not linearly independent (nor are the row-vectors) and the matrix is singular. At least one edge-vector of the hyper-prism made of the column-vectors is in the subspace of the remaining edge-vectors: the volume of the hyper-prism vanishes and det A = 0. A few useful properties of the determinants:

$$\det AB = \det A \, \det B \tag{2.2.3}$$

(2.2.4) (2.2.5) (2.2.6)
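The recurrence formula (2.2.1) maps directly onto a recursive function. The sketch below is an illustration rather than a recommended algorithm: it expands along the first row only, its cost grows as n!, and the function names and test matrices are assumptions chosen for the example. It also checks property (2.2.3) on one pair of matrices and shows that a matrix with proportional rows has a vanishing determinant.

```python
def minor(a, i, j):
    """Matrix a with row i and column j deleted."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def det(a):
    """Determinant by cofactor expansion along the first row (cost grows as n!)."""
    if len(a) == 1:                       # starting condition: 1 x 1 matrix
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j))
               for j in range(len(a)))

def matmul(a, b):
    """Row-by-column matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

A = [[1, 2, 0], [0, 2, 1], [1, 3, 1]]     # illustrative regular matrix
B = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]     # illustrative regular matrix
S = [[1, 2, 3], [2, 4, 6], [0, 1, 1]]     # second row = 2 x first row

det_A, det_B, det_S = det(A), det(B), det(S)
det_AB = det(matmul(A, B))                # should equal det_A * det_B
```

For matrices larger than a few rows, elimination-based library routines are far more efficient than this expansion.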

2.2.2 The inverse of a matrix

• Given a square matrix $A_{n \times n}$, the matrices $B_{n \times n}$ and $C_{n \times n}$ such that $AB = I_n$ and $CA = I_n$ must be identical because matrix product is associative

$$C(AB) = (CA)B \tag{2.2.7}$$

B and C are both noted $A^{-1}$, which is called the inverse matrix of A. If A has an inverse, it is said to be regular. If B does not exist, A is said to be singular. The demonstration of the following useful properties will be found in standard textbooks

$$(A^T)^{-1} = (A^{-1})^T \tag{2.2.8}$$

$$(AB)^{-1} = B^{-1} A^{-1} \tag{2.2.9}$$

• The adjoint adj A of the matrix A is the transpose of the matrix of its cofactors. If A is regular

$$A^{-1} = \frac{\operatorname{adj} A}{\det A} \tag{2.2.10}$$

This property is rarely used to calculate inverses for matrices with a dimension in excess of three. Example: The matrix

$$A = \begin{bmatrix} 1 & 2 \\ -1 & 3 \end{bmatrix}$$

has the adjoint

$$\operatorname{adj} A = \begin{bmatrix} 3 & -2 \\ 1 & 1 \end{bmatrix}$$

Its determinant is

$$\det A = 1 \times 3 - 2 \times (-1) = 5$$

and its inverse

$$A^{-1} = \begin{bmatrix} 3/5 & -2/5 \\ 1/5 & 1/5 \end{bmatrix}$$
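Equation (2.2.10) can be turned into a short routine. The following sketch assumes integer entries and uses exact fractions so that the result can be compared without rounding; the function name `inverse_by_adjoint` and the 2 × 2 test matrix are illustrative choices, not part of the text.

```python
from fractions import Fraction

def minor(a, i, j):
    """Matrix a with row i and column j deleted."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def det(a):
    """Determinant by cofactor expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j))
               for j in range(len(a)))

def inverse_by_adjoint(a):
    """Inverse of an integer matrix as adj(a) / det(a), in exact fractions."""
    n, d = len(a), det(a)
    if d == 0:
        raise ValueError("singular matrix: no inverse")
    if n == 1:
        return [[Fraction(1, a[0][0])]]
    cof = [[(-1) ** (i + j) * det(minor(a, i, j)) for j in range(n)]
           for i in range(n)]
    # The adjoint is the transpose of the cofactor matrix
    return [[Fraction(cof[j][i], d) for j in range(n)] for i in range(n)]

A = [[1, 2], [-1, 3]]                     # illustrative matrix, det = 5
A_inv = inverse_by_adjoint(A)

# Check A * A_inv = I
product = [[sum(A[i][k] * A_inv[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
```

As the text notes, this route is practical only for very small matrices; the number of cofactors to evaluate grows quickly with the dimension.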

2.2.3 Orthogonal matrices

A square matrix O is orthogonal if its column-vectors form a set of orthogonal vectors and have unit length, i.e.,

$$O^T O = I \tag{2.2.11}$$

so

$$O^T = O^{-1} \tag{2.2.12}$$

hence

$$O^T O = O O^T = I \tag{2.2.13}$$

which shows that row-vectors are also orthogonal. An orthogonal matrix is always regular because its vectors (either row or column) are orthogonal. Using the properties of determinants and inverse matrices shows that

$$\det O = \det O^T = \det O^{-1} = 1 / \det O \tag{2.2.14}$$

The determinant of an orthogonal matrix is therefore equal to ±1. Example:

$$O = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}$$

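2.2.4 The trace of a matrix

The trace of a square matrix $A_{n \times n}$, denoted tr A, is the sum of its diagonal elements

$$\operatorname{tr} A = \sum_{i=1}^{n} a_{ii} \tag{2.2.15}$$

Straightforward properties of the trace are

$$\operatorname{tr} A^T = \operatorname{tr} A \tag{2.2.16}$$

$$\operatorname{tr}(A \pm B) = \operatorname{tr} A \pm \operatorname{tr} B \tag{2.2.17}$$

while using equation (2.1.5) would show that

$$\operatorname{tr} AB = \operatorname{tr} BA \tag{2.2.18}$$

The trace of a scalar is the scalar itself. Since the inner (dot) product of two vectors $x_n$ and $y_n$ is a scalar, we can write

$$x^T y = \operatorname{tr}(x^T y) = \operatorname{tr}(y x^T) \tag{2.2.19}$$

Replacing y by the product Az gives the useful form

$$x^T A z = \operatorname{tr}(A z x^T) \tag{2.2.20}$$

Example:

$$\operatorname{tr}\left( \begin{bmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{bmatrix} \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix} \right) = \operatorname{tr} \begin{bmatrix} 3/2 & 1/2 \\ 3/2 & 1/2 \end{bmatrix} = \frac{3}{2} + \frac{1}{2} = 2$$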

2.2.5 The fundamental geometric transformations

The square matrix $A_{n \times n}$ transforms the vector $x_n$ into a vector $y_n$ by the product y = Ax. Multiplication by the matrix A associates two vectors from the Euclidian space $\mathbb{R}^n$ and therefore corresponds to a geometric transformation in this space. A is a geometric operator. Non-square matrices would associate vectors from Euclidian spaces with different dimensions. The ordered combination of geometric transformations, such as multiple rotations and projections, can be carried out by multiplying in the right order the vector produced at each stage by the matrix associated with the next transformation.

• If A is the orthogonal matrix O, the product Ox may be viewed as a rotation. The relationship

$$(Ox)^T (Ox) = x^T O^T O x = x^T x \tag{2.2.21}$$

implies that length is preserved (length Ox = length x), which identifies the transformation as a rotation. Example: for plane rotation with angle $\theta$,

$$O = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

If $\theta$ = 45°, the vector (x = 1, y = 0) becomes the vector (x = y = $1/\sqrt{2}$).

Rotation matrices can be defined for an arbitrary number of dimensions. They are particularly useful to examine compositional data in three-dimensional spaces in search for regularities unsuspected in two-dimensional spaces. Commercial software (e.g., Systat™) exists that produces geometric transformations in a convenient way.

Table 2.1 gives SiO2, CaO and K2O data for 2.1 Ga-old granitic rocks from West-Africa (Boher et al., 1992). Plot the standardized data, i.e., the data from which the mean is first subtracted and which are then divided by the standard deviation, in diagrams showing the second (y) vs the first (x) coordinate after rotation around the third coordinate (z) axis of −70 degrees, then after rotation around the first coordinate axis of +30 degrees.

Table 2.1. Major-element data (%) on some 2.1 Ga-old granites from West-Africa (Boher et al., 1992).

Sample #   SiO2    CaO    K2O     Sample #   SiO2    CaO    K2O
 1         65.40   3.30   3.90    17         63.46   3.79   3.60
 2         63.90   3.10   3.75    18         72.90   2.07   2.82
 3         60.71   4.45   3.86    19         61.17   5.47   0.81
 4         61.69   3.44   3.59    20         67.43   3.50   1.82
 5         59.53   4.97   2.87    21         68.96   3.54   1.46
 6         62.55   4.35   3.40    22         75.06   0.89   4.37
 7         66.43   2.76   3.93    23         66.71   3.39   2.66
 8         64.67   2.90   4.41    24         71.90   2.88   2.18
 9         72.60   0.75   5.30    25         70.71   0.54   7.03
10         70.40   1.90   4.93    26         70.12   2.70   2.38
11         71.83   1.27   4.53    27         69.35   2.26   3.65
12         67.43   2.95   3.54    28         69.56   3.17   1.48
13         65.90   4.25   0.83    29         70.95   3.12   1.44
14         64.96   4.59   1.50    30         62.58   4.91   1.93
15         71.95   1.28   5.12    31         74.13   1.09   5.15
16         68.17   3.84   0.58    32         74.26   0.49   5.85

Table 2.1 gives the data row-wise, so we will call this array $X^T$. The mean vector (67.73, 2.93, 3.27) is first subtracted from the data and the result divided by the standard-deviation vector (4.36, 1.37, 1.60). Rotation by −70 degrees around the third axis gives the rotation matrix

$$O_z = \begin{bmatrix} 0.342 & 0.940 & 0 \\ -0.940 & 0.342 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

and the new data matrix will be $X^T O_z$. Rotation by 30 degrees around the first axis gives the rotation matrix

$$O_x = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0.866 & -0.5 \\ 0 & 0.5 & 0.866 \end{bmatrix}$$

and the new data matrix will be $X^T O_z O_x$. The three plots are shown in Figure 2.2: the first rotation shifts representation from the standardized CaO vs SiO2 to an array with very large dispersion. The second rotation around this new x axis produces an unexpected linear correlation which shows that the three variables are tightly related. The original unit vectors in the standardized SiO2, CaO, K2O are also shown in order to keep track of the transformations. They are simply given by the rows of $O_z$ in the first plot, and by the rows of $O_z O_x$ in the second plot.
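The standardization and the two successive rotations can be sketched for a single sample as follows. This is a minimal illustration, assuming the mean and standard-deviation vectors quoted in the text and taking sample 1 of Table 2.1 as input; the function names `rot_z`, `rot_x` and `row_times` are hypothetical. Because the data are stored row-wise, each standardized sample is a row vector multiplied on the right by the rotation matrices.

```python
import math

def rot_z(deg):
    """Rotation matrix about the third axis for an angle in degrees."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(deg):
    """Rotation matrix about the first axis for an angle in degrees."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def row_times(x, m):
    """Row vector times matrix: (x^T M)_j = sum over k of x[k] * m[k][j]."""
    return [sum(x[k] * m[k][j] for k in range(len(x))) for j in range(len(m[0]))]

mean = (67.73, 2.93, 3.27)          # SiO2, CaO, K2O means quoted in the text
std = (4.36, 1.37, 1.60)            # standard deviations quoted in the text
sample = (65.40, 3.30, 3.90)        # sample 1 of Table 2.1

z = [(v - m) / s for v, m, s in zip(sample, mean, std)]   # standardized row

Oz, Ox = rot_z(-70), rot_x(30)
after_first = row_times(z, Oz)             # row of X^T Oz
after_second = row_times(after_first, Ox)  # row of X^T Oz Ox
```

Looping the last three lines over all 32 rows would produce the coordinates plotted in the rotated diagrams; the first two entries of each rotated row are the x and y plotting coordinates.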

[Figure 2.2: three scatter plots, panels (a), (b) and (c). Panel legend: standardized data axes 1: SiO2, 2: CaO, 3: K2O.]

Figure 2.2 (a) Standardized SiO2, CaO and K2O data for 2.1 Ga-old granitic rocks from West-Africa (Boher et al., 1992, Table 2.1). (b) After rotation around the third coordinate axis of −70 degrees, then (c) after rotation around the first coordinate axis of +30 degrees.

• If A is a diagonal matrix, it corresponds to a scaling transformation. Each coordinate of the vector x is scaled by the corresponding diagonal element. Example:

$$\begin{bmatrix} \ldots & 0 \\ 0 & 1/2 \end{bmatrix} \begin{bmatrix} \ldots \\ 1 \end{bmatrix} = \begin{bmatrix} \ldots \\ 1/2 \end{bmatrix}$$

• A projector $P_{m \times m}$ is a square matrix, normally singular, and such that

$$P^T = P \quad \text{(symmetric)} \tag{2.2.22}$$

$$PP = P \quad \text{(idempotent)} \tag{2.2.23}$$

$$\operatorname{tr} P = \operatorname{rank} P \tag{2.2.24}$$

If P is a projector, I − P is a projector onto the orthogonal space since

$$(Px)^T (I - P) x = x^T P^T x - x^T P^T P x = 0$$

Given the m × n rectangular matrix A, the m × m projector $P = A(A^T A)^{-1} A^T$ projects each m-vector y onto the column-space of A. Py can be written

$$Py = A \left[ (A^T A)^{-1} A^T y \right]$$

where the term in brackets is a vector of dimension n. Py is therefore a linear combination of the column-vectors of A. In addition, P is visibly symmetric and idempotent since

$$PP = A (A^T A)^{-1} A^T A (A^T A)^{-1} A^T = A (A^T A)^{-1} A^T = P$$

while the trace of the projector is

$$\operatorname{tr} P = \operatorname{tr}\left[ A (A^T A)^{-1} A^T \right] = \operatorname{tr}\left[ (A^T A)^{-1} A^T A \right] = \operatorname{tr} I_n = n$$

As a special case, projection onto a vector b corresponds to the matrix P, such that

$$P = \frac{b b^T}{b^T b}$$

Find the projection y of the rock composition x given in weight percent in Table 2.2 into the diopside-olivine-silica triangular diagram of Walker et al. (1979). This diagram uses coordinates given by:

plagioclase (pl) = Al2O3 + Na2O + K2O
diopside (di) = CaO − Al2O3 + Na2O + K2O
olivine (ol) = (FeO + MgO + Al2O3 − CaO − Na2O − K2O)/2
silica (si) = SiO2 − (FeO + MgO + Al2O3 + 3CaO + 11Na2O + 11K2O)/2

where oxide symbols represent molar proportions in a projection from the plagioclase apex.

Table 2.2. Major-element composition x (%) of a rock to be projected in the diopside-olivine-silica triangular diagram of Walker et al. (1979). M represents molar weights of oxides.

        SiO2   Al2O3   FeO   MgO   CaO   Na2O   K2O
x^T     49     16      8     13    10    2      1
M       60     102     72    40    56    62     94

The solution to this problem is extremely simple by non-matrix methods, but we will take advantage of its simplicity to give the layout of an approach that can be extended by computer software to any projection. First comes a scaling step that consists in dividing weight percents by the molecular weights given in Table 2.2 to give molar proportions m. Multiplication by 1000 is convenient and results in millimoles per 100 g of rock. With M the diagonal scaling matrix,

$$m = Mx = \operatorname{diag}\left( \tfrac{1000}{60}, \tfrac{1000}{102}, \tfrac{1000}{72}, \tfrac{1000}{40}, \tfrac{1000}{56}, \tfrac{1000}{62}, \tfrac{1000}{94} \right) \begin{bmatrix} 49 \\ 16 \\ 8 \\ 13 \\ 10 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 817 \\ 157 \\ 111 \\ 325 \\ 179 \\ 32 \\ 11 \end{bmatrix}$$

The projection step first consists in recasting the molecular rock composition as linear combinations of the mineral molar coordinates pl-di-ol-si. The results represent a modified rock vector $m^+$ related to m through the matrix Q, such that

$$m^+ = Qm$$

and

$$Q = \begin{array}{c|ccccccc} & \mathrm{SiO_2} & \mathrm{Al_2O_3} & \mathrm{FeO} & \mathrm{MgO} & \mathrm{CaO} & \mathrm{Na_2O} & \mathrm{K_2O} \\ \hline \text{pl} & 0 & 1 & 0 & 0 & 0 & 1 & 1 \\ \text{di} & 0 & -1 & 0 & 0 & 1 & 1 & 1 \\ \text{ol} & 0 & 1/2 & 1/2 & 1/2 & -1/2 & -1/2 & -1/2 \\ \text{si} & 1 & -1/2 & -1/2 & -1/2 & -3/2 & -11/2 & -11/2 \end{array}$$

The modified rock vector $m^+ = (200, 65, 186, 16)^T$ can now be projected onto the diopside-olivine-silica plane through the projector P, such that

$$y = Pm^+ = PQm$$

This projector P is defined by the three vectors di (diopside), ol (olivine), and si (silica) from the same space as $m^+$. The problem is quite simple because

$$A = \begin{array}{ccc|c} 0 & 0 & 0 & \text{pl} \\ 1 & 0 & 0 & \text{di} \\ 0 & 1 & 0 & \text{ol} \\ 0 & 0 & 1 & \text{si} \end{array}$$

The projector P onto the column-space of A is

$$P = A (A^T A)^{-1} A^T = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

which, in this case, leads to $y = [0, 65, 186, 16]^T$. The same result is conveniently arrived at by combining the operators in the proper order and relating the weight fractions to the final projection coordinates through one single matrix product

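$$(PQM)x = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & -9.80 & 0 & 0 & 17.86 & 16.13 & 10.64 \\ 0 & 4.90 & 6.94 & 12.5 & -8.93 & -8.06 & -5.32 \\ 16.67 & -4.90 & -6.94 & -12.5 & -26.79 & -88.71 & -58.51 \end{bmatrix} x = \begin{bmatrix} 0 \\ 65 \\ 186 \\ 16 \end{bmatrix}$$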
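The chain of scaling, recasting and projection can be sketched in a few lines. The code below is only an illustration: the variable names are hypothetical, the input values are those of Table 2.2, and the computation keeps unrounded molar proportions, so its normalized coordinates may differ in the third decimal from values obtained with rounded intermediate results.

```python
x = [49, 16, 8, 13, 10, 2, 1]        # weight percent: SiO2, Al2O3, FeO, MgO, CaO, Na2O, K2O
M = [60, 102, 72, 40, 56, 62, 94]    # molar weights of the same oxides (Table 2.2)

# Scaling step: millimoles per 100 g of rock
m = [1000 * xi / Mi for xi, Mi in zip(x, M)]

# Recasting matrix Q, one row per coordinate: pl, di, ol, si
Q = [
    [0, 1, 0, 0, 0, 1, 1],
    [0, -1, 0, 0, 1, 1, 1],
    [0, 0.5, 0.5, 0.5, -0.5, -0.5, -0.5],
    [1, -0.5, -0.5, -0.5, -1.5, -5.5, -5.5],
]
m_plus = [sum(q * mi for q, mi in zip(row, m)) for row in Q]

# Projection from the plagioclase apex: P = diag(0, 1, 1, 1)
P_diag = [0, 1, 1, 1]
y = [p * v for p, v in zip(P_diag, m_plus)]

# Normalization of the di-ol-si coordinates to unit sum
total = sum(y)
y_norm = [v / total for v in y[1:]]
```

Since P and Q do not depend on the sample, the same lists can be reused for any number of rock analyses, which is the point of writing the projection as a matrix product.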

The matrix PQM can be computed once and used for any further calculation. In order to compute the coordinates $y_{norm}$ of the rock composition in the triangular diagram, the solution y should finally be normalized to 1. The result is

$$y_{norm} = [0.243, 0.697, 0.060]^T$$

Petrological literature is rich in various projections. The reader can try other projections in which Q and P have different forms.

Find the projection of the vector x (1,1) onto the vector b (1,2) and its orthogonal complement. The projector P on b is

$$P = \frac{b b^T}{b^T b} = \begin{bmatrix} 1/5 & 2/5 \\ 2/5 & 4/5 \end{bmatrix}$$

The projection of x onto b is therefore

$$Px = \begin{bmatrix} 1/5 & 2/5 \\ 2/5 & 4/5 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3/5 \\ 6/5 \end{bmatrix}$$

and the orthogonal complement is

$$(I - P)x = \begin{bmatrix} 4/5 & -2/5 \\ -2/5 & 1/5 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2/5 \\ -1/5 \end{bmatrix}$$
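The projection onto a single vector is short enough to verify exactly. This sketch assumes integer components and uses exact fractions; the names `projector_onto` and `apply` are illustrative. It also checks the defining properties of a projector (symmetry, idempotence, trace equal to rank).

```python
from fractions import Fraction

def projector_onto(b):
    """Projector b b^T / (b^T b) onto the direction of an integer vector b."""
    bb = sum(bi * bi for bi in b)
    return [[Fraction(bi * bj, bb) for bj in b] for bi in b]

def apply(mat, x):
    """Matrix-vector product."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in mat]

b = [1, 2]
x = [1, 1]

P = projector_onto(b)
px = apply(P, x)                                 # component of x along b
residual = [xi - pi for xi, pi in zip(x, px)]    # (I - P) x, orthogonal complement

trace = sum(P[i][i] for i in range(len(b)))      # equals the rank, here 1
orthogonality = sum(p * r for p, r in zip(px, residual))
```

Applying `apply(P, px)` a second time returns `px` unchanged, which is the idempotence property (2.2.23) at work.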

2.2.6 The metric tensor and oblique projections

We now return to the very basic concept of vector length in vector spaces. In a Euclidian space spanned by a base of n orthogonal unit vectors $e_i$, the squared length $l^2$ of an n-vector v is the quadratic form given by

$$l^2 = v^T v = v^T I_n v = v^T \left( \sum_{i=1}^{n} e_i e_i^T \right) v = \sum_{i=1}^{n} (e_i^T v)^T (e_i^T v) \tag{2.2.25}$$

The scalar $e_i^T v$ is the ith 'coordinate' $v_i$ of the vector v in the base of the orthogonal vectors $e_i$, i.e.,

$$l^2 = \sum_{i=1}^{n} v_i^2$$

In a Euclidian space, the squared length of a vector is the sum of its squared coordinates. It is quite frequent that a vector is not or cannot be naturally expressed as coordinates in a Euclidian base of orthogonal vectors. An enlightening example is the major-element composition of a rock which can be expressed as a combination of either orthogonal oxide proportions or non-orthogonal mineral proportions. Let us call the new arbitrary unit vectors $\varepsilon_j$ and write the change of coordinates as a set of linear equations

$$\varepsilon_j^T = \sum_{i=1}^{n} b_{ij} e_i^T$$

(from the equations written above, it seems more convenient to deal with the transpose of unit vectors). In a matrix notation, we introduce the n × n matrix E of the general unit vectors $\varepsilon_j$, which is the equivalent of the matrix $I_n$ for the vectors $e_i$, and the matrix B made of the $b_{ij}$ components

$$E^T = I_n B \tag{2.2.26}$$

Clearly $B = E^T$, and given the special nature of the vectors $e_i$

$$b_{ij} = e_i^T \varepsilon_j \tag{2.2.27}$$

We assume that we have shifted to a 'real' new vector base, i.e., the vectors $\varepsilon_j$ are independent and B is non-singular. Now, we can go back to the v coordinates and write that, in a Euclidian space, vector length is an invariant

$$l^2 = v^T v = (B^{-1} v)^T B^T B \, (B^{-1} v)$$

For convenience, we will write $w = B^{-1} v$ and call $w_i$ the ith coordinate of vector w in the new base of the vectors $\varepsilon_j$. We also define the matrix G as

$$G = B^T B \tag{2.2.28}$$

and call it the metric tensor (a tensor is a multi-dimensional array or a generalized matrix which has special properties upon change of variables). It satisfies

$$(\text{length } x)^2 = x^T G x \tag{2.2.29}$$

and is equal to the identity matrix $I_n$ in the Euclidian vector space $\mathbb{R}^n$. The concept of metric tensor becomes central whenever distances and projections are considered, particularly when least-square criteria are used, a point that will be discussed in Chapter 5. Let us ask the frequently raised question of how to find an expression in terms of old coordinates (e.g., oxide proportions) for a projection made in the non-Euclidian space. This could be the case for finding oxide abundances of a basalt composition projected in the Yoder and Tilley tetrahedron, or the oxide abundance of a metamorphic rock composition projected into an ACF diagram assuming that quartz is present.

A rock is made of 0.45 moles SiO2, 0.10 moles CaO, and 0.45 moles MgO, which we will describe with the composition vector v (0.45, 0.10, 0.45). If we consider this rock as being made of virtual forsterite (SiO2, 0, 2MgO), enstatite (2SiO2, 0, 2MgO), and diopside (2SiO2, CaO, MgO), what is the projection vector $\hat{v}$ of the rock composition onto the enstatite-diopside plane?

The mineralogical matrix $B^T$ is obtained by the set of equations relating the unit mineral compositions $\varepsilon$ to oxide unit fractions e

$$\varepsilon(\text{forsterite}) = \tfrac{1}{3} e_{\mathrm{SiO_2}} + 0\, e_{\mathrm{CaO}} + \tfrac{2}{3} e_{\mathrm{MgO}}$$
$$\varepsilon(\text{enstatite}) = \tfrac{1}{2} e_{\mathrm{SiO_2}} + 0\, e_{\mathrm{CaO}} + \tfrac{1}{2} e_{\mathrm{MgO}}$$
$$\varepsilon(\text{diopside}) = \tfrac{1}{2} e_{\mathrm{SiO_2}} + \tfrac{1}{4} e_{\mathrm{CaO}} + \tfrac{1}{4} e_{\mathrm{MgO}}$$

$B^T$ reads as

$$B^T = \begin{bmatrix} 1/3 & 0 & 2/3 \\ 1/2 & 0 & 1/2 \\ 1/2 & 1/4 & 1/4 \end{bmatrix}$$

and therefore

$$G = B^T B = \begin{bmatrix} 1/3 & 0 & 2/3 \\ 1/2 & 0 & 1/2 \\ 1/2 & 1/4 & 1/4 \end{bmatrix} \begin{bmatrix} 1/3 & 1/2 & 1/2 \\ 0 & 0 & 1/4 \\ 2/3 & 1/2 & 1/4 \end{bmatrix} = \begin{bmatrix} 5/9 & 1/2 & 1/3 \\ 1/2 & 1/2 & 3/8 \\ 1/3 & 3/8 & 3/8 \end{bmatrix}$$

It is left to the reader to check that

$$B^{-1} = \begin{bmatrix} -3 & 3 & 3 \\ 4 & -6 & -2 \\ 0 & 4 & 0 \end{bmatrix}$$

and therefore the abundance vector w of virtual forsterite, enstatite, and diopside proportions is given by

$$w = B^{-1} v = \begin{bmatrix} -3 & 3 & 3 \\ 4 & -6 & -2 \\ 0 & 4 & 0 \end{bmatrix} \begin{bmatrix} 0.45 \\ 0.10 \\ 0.45 \end{bmatrix} = \begin{bmatrix} 0.3 \\ 0.3 \\ 0.4 \end{bmatrix}$$

We can check the double equality

$$v^T v = \begin{bmatrix} 0.45 & 0.10 & 0.45 \end{bmatrix} \begin{bmatrix} 0.45 \\ 0.10 \\ 0.45 \end{bmatrix} = 0.415$$

and

$$w^T G w = \begin{bmatrix} 0.3 & 0.3 & 0.4 \end{bmatrix} \begin{bmatrix} 5/9 & 1/2 & 1/3 \\ 1/2 & 1/2 & 3/8 \\ 1/3 & 3/8 & 3/8 \end{bmatrix} \begin{bmatrix} 0.3 \\ 0.3 \\ 0.4 \end{bmatrix} = 0.415$$

In the virtual mineral space, the rock composition is projected onto the plane made by the vectors enstatite $[0, 1, 0]^T$ and diopside $[0, 0, 1]^T$. Although these vectors are not orthogonal in the original oxide composition space, which can be verified by constructing the dot product of rows 2 and 3 in the matrix $B^T$, the particular choice of the projection makes the vectors orthogonal in the transformed space. According to the projector theory developed above, we project the rock composition onto the column-space of the matrix A such that

$$A = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}$$

The projector P, which makes such a projection, is given by $P = A (A^T A)^{-1} A^T$ or

$$P = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

As expected, the projected virtual mineral abundances $\omega$ of the rock are given by

$$\omega = Pw = \begin{bmatrix} 0 \\ 0.3 \\ 0.4 \end{bmatrix}$$

We find the projected rock composition $\hat{v}$ by applying the change of variable backwards

$$\hat{v} = B\omega = \begin{bmatrix} 1/3 & 1/2 & 1/2 \\ 0 & 0 & 1/4 \\ 2/3 & 1/2 & 1/4 \end{bmatrix} \begin{bmatrix} 0 \\ 0.3 \\ 0.4 \end{bmatrix} = \begin{bmatrix} 0.35 \\ 0.10 \\ 0.25 \end{bmatrix}$$

which can be later normalized to 100% oxides. In case this calculation should be repeated over large files, writing up the oblique projector $B P B^{-1}$ explicitly saves computational efforts. By forming the appropriate dot product, we check that the projected composition is not orthogonal to forsterite in the oxide space (oblique projection) for

$$\begin{bmatrix} 1/3 & 0 & 2/3 \end{bmatrix} \begin{bmatrix} 0.35 \\ 0.10 \\ 0.25 \end{bmatrix} = \frac{0.85}{3} \neq 0$$
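The whole sequence (change of base, length check through the metric tensor, projection in mineral space, return to oxide coordinates) can be verified with exact fractions. The sketch below is illustrative: the helper names are hypothetical, the inverse of B is taken as given and checked rather than computed, and the single-step operator is formed as the product of B, P and the inverse of B under the assumption that this is what the oblique projector amounts to.

```python
from fractions import Fraction as F

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

# Columns of B: forsterite, enstatite, diopside in (SiO2, CaO, MgO) coordinates
B = [[F(1, 3), F(1, 2), F(1, 2)],
     [F(0),    F(0),    F(1, 4)],
     [F(2, 3), F(1, 2), F(1, 4)]]
B_inv = [[-3, 3, 3], [4, -6, -2], [0, 4, 0]]   # inverse quoted in the text
P = [[0, 0, 0], [0, 1, 0], [0, 0, 1]]          # keeps enstatite and diopside

v = [F(45, 100), F(10, 100), F(45, 100)]       # oxide composition of the rock
w = matvec(B_inv, v)                           # virtual mineral abundances
G = matmul(transpose(B), B)                    # metric tensor
omega = matvec(P, w)                           # projection in mineral space
v_hat = matvec(B, omega)                       # back to oxide coordinates
oblique = matmul(matmul(B, P), B_inv)          # single-step oblique projector
```

Once `oblique` has been formed, each additional composition requires a single matrix-vector product, which is the saving mentioned in the text for large files.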

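2.2.7 Gram-Schmidt orthogonalization

Given a set of independent vectors $x_1, x_2, \ldots, x_n$, it is requested to form a new set $y_1, y_2, \ldots, y_n$ of orthogonal vectors spanning the same space. The Gram-Schmidt orthogonalization scheme sets

$$y_1 = x_1 \tag{2.2.30}$$

The kth step removes from $x_k$ its components along $y_1, \ldots, y_{k-1}$, e.g., for k = 2

$$y_2 = x_2 - \frac{x_2^T y_1}{y_1^T y_1}\, y_1 \tag{2.2.31}$$

or, in general

$$y_k = x_k - \frac{x_k^T y_1}{y_1^T y_1}\, y_1 - \cdots - \frac{x_k^T y_{k-1}}{y_{k-1}^T y_{k-1}}\, y_{k-1} \tag{2.2.32}$$

This procedure is important for producing orthogonal vectors and functions in numerical analysis. It is not unique since it can be started from any $x_k$.

Construct orthogonal vectors from $x_1 = [1, 2, 0]^T$ and $x_2 = [1, 1, 2]^T$. The first step is

$$y_1 = x_1 = [1, 2, 0]^T$$

Given the products

$$y_1^T y_1 = 5 \quad \text{and} \quad x_2^T y_1 = 3$$

the second step is

$$y_2 = \begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix} - \frac{3}{5} \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix} = \begin{bmatrix} 2/5 \\ -1/5 \\ 2 \end{bmatrix}$$

It is easily verified that $y_1$ and $y_2$ are orthogonal.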
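A direct transcription of equations (2.2.30) to (2.2.32) is given below as a sketch. It assumes independent input vectors with integer or rational components and uses exact fractions; the function name `gram_schmidt` is illustrative. In floating-point work, a modified variant that subtracts projections from the running vector is usually preferred for accuracy, but the classical form mirrors the equations most closely.

```python
from fractions import Fraction

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def gram_schmidt(vectors):
    """Classical Gram-Schmidt: orthogonal (not normalized) vectors spanning the same space."""
    ortho = []
    for x in vectors:
        y = [Fraction(c) for c in x]
        for q in ortho:
            coef = dot(x, q) / dot(q, q)       # x_k^T y_j / y_j^T y_j
            y = [yi - coef * qi for yi, qi in zip(y, q)]
        ortho.append(y)
    return ortho

x1 = [1, 2, 0]
x2 = [1, 1, 2]
y1, y2 = gram_schmidt([x1, x2])
```

Dividing each output vector by its length would give an orthonormal set; the division is left out here so that the results stay rational.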

2.3 Eigencomponents

2.3.1 General

• Most square matrices $A_{n \times n}$ can be factored using a diagonal matrix $\Lambda_n$ and a regular matrix $U_{n \times n}$ in the following forms

$$A = U \Lambda U^{-1} \iff AU = U\Lambda \iff \Lambda = U^{-1} A U \tag{2.3.1}$$

The ith diagonal element $\lambda_i$ of the matrix $\Lambda$ is called the ith eigenvalue of A. The column-vectors $(u_1, \ldots, u_n)$ of U are usually taken (although not necessarily) of unit length and are called the eigenvectors of the matrix A.

• If the inverse $A^{-1}$ of matrix A exists, then it can be decomposed as

$$A^{-1} = U \Lambda^{-1} U^{-1} \tag{2.3.2}$$

A matrix and its inverse therefore have the same eigenvectors but reciprocal eigenvalues. Each pair of eigenvector and eigenvalue (eigencomponents) can be related through a linear system of equations such that

$$A u_i = \lambda_i u_i \tag{2.3.3}$$

Example: … The relationship between $u_2$ and $\lambda_2$ is …

The order of eigencomponents is arbitrary, although they are commonly ranked in the order of increasing or decreasing eigenvalues. Some properties result from those of the determinant, e.g.,

$$\det A = \det U \times \det \Lambda \times \det U^{-1} = \det U \times \det \Lambda / \det U = \prod_{i=1}^{n} \lambda_i \tag{2.3.4}$$

which shows that, for a matrix to be singular, at least one of its eigenvalues must be zero. From the properties of the trace, we get

$$\operatorname{tr} A = \operatorname{tr} U \Lambda U^{-1} = \operatorname{tr} \Lambda U^{-1} U = \operatorname{tr} \Lambda = \sum_{i=1}^{n} \lambda_i \tag{2.3.5}$$

The trace of a matrix is equal to the sum of its eigenvalues.

2.3.2 Computation of eigencomponents

$$A u_i = \lambda_i u_i \iff \det(A - \lambda_i I) = 0 \tag{2.3.6}$$

The second equation is called the characteristic equation of the matrix A. Once $\lambda_i$ is known, the vector $u_i$ is computed as the unit vector solution of the linear system. It is left to the reader to show that the eigenvalues of a 2 × 2 matrix A can be found as a solution of the equation

$$\lambda^2 - (\operatorname{tr} A)\, \lambda + \det A = 0 \tag{2.3.7}$$

Example: the eigenvalues and eigenvectors of the matrix A given by

$$A = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}$$

are obtained by solving the characteristic equation

$$\det \begin{bmatrix} 2 - \lambda & -1 \\ -1 & 2 - \lambda \end{bmatrix} = 0 \quad \text{or} \quad (2 - \lambda)^2 - 1 = 0$$

which has the solutions $\lambda_1 = 3$ and $\lambda_2 = 1$. The x and y coordinates of the eigenvector associated with $\lambda_1$ satisfy

$$2x - y = 3x$$
$$-x + 2y = 3y$$

These equations, which give an eigendirection and not the coordinates of an eigenvector, are equivalent. An eigenvector, usually with unit length, must be chosen along that eigendirection, so that

$$x^2 + y^2 = 1$$

One solution is

$$x = 1/\sqrt{2} \quad \text{and} \quad y = -1/\sqrt{2}$$

For the same eigenvalue, the opposite vector in the same eigendirection is the solution −x, −y, also acceptable. The eigenvector associated with $\lambda_2 = 1$ is

$$u_2 = \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}$$

and finally

$$A = U \Lambda U^{-1} = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix} \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}$$

Efficient computer programs rarely use this method. The numerical algorithms used to search for eigenvalues are countless and difficult to implement without a dramatic loss of accuracy. Use of the best canned routines saves time and produces better results.

2.3.3 Eigencomponents of symmetric matrices

If A is a symmetric matrix, then the matrix U is orthogonal. This can be shown by considering two eigencomponent pairs i and j

$$A u_i = \lambda_i u_i, \qquad A u_j = \lambda_j u_j$$

Multiplying the first equality by the transpose of $u_j$ gives

$$u_j^T A u_i = (A u_j)^T u_i = \lambda_j u_j^T u_i = \lambda_i u_j^T u_i \tag{2.3.8}$$

which is possible only for i = j or orthogonal, unit-length eigenvectors. The eigenvectors $u_i$ make an orthogonal base of unit vectors in the space $\mathbb{R}^n$

$$A = U \Lambda U^{-1} = U \Lambda U^T \iff AU = U\Lambda \iff \Lambda = U^T A U \tag{2.3.9}$$

If A is a symmetric matrix, it can be shown that its eigenvalues are real. Product-matrices such as $A^T A$ or $A A^T$ are special cases of symmetric matrices and will be shown in the next section to have non-negative eigenvalues.

• A symmetric matrix A can usually be factored using the common-dimension expansion of the matrix product (Section 2.1.3). This is known as the singular value decomposition (SVD) of the matrix A. Let $\lambda_i$ and $u_i$ be a pair of associated eigenvalues and eigenvectors. Then equation (2.3.9) can be rewritten, using equation (2.1.21)

$$A = U \Lambda U^T = \sum_{i=1}^{n} \lambda_i u_i u_i^T \tag{2.3.10}$$

A symmetric matrix therefore can be expanded as a weighted sum of the outer product matrices $u_i u_i^T$, the weights being the eigenvalues $\lambda_i$ (Figure 2.3). This will prove to be a useful concept in principal component analysis (Chapter 4). For large matrices (n > 30), the modulus of eigenvalues may span tens of orders of magnitude. In contrast, the eigenvector components are bounded since eigenvectors are of unit length. A large matrix is therefore dominated by the eigencomponents corresponding to large eigenvalues, whereas the eigencomponents corresponding to the smallest eigenvalues are seen mostly as noise.
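For a symmetric 2 × 2 matrix, the characteristic equation and the expansion into weighted outer products can be sketched in a few lines. This is an illustration only, restricted to the 2 × 2 symmetric case: the function name `eigen_sym_2x2` is hypothetical, and the sign of each eigenvector is arbitrary, so it may be the opposite of the one chosen by hand. As the text stresses, library routines should be preferred for anything larger.

```python
import math

def eigen_sym_2x2(a):
    """Eigenvalues and unit eigenvectors of a symmetric matrix [[p, q], [q, r]]."""
    (p, q), (_, r) = a
    if q == 0:                                   # already diagonal
        return [(p, [1.0, 0.0]), (r, [0.0, 1.0])]
    tr, det = p + r, p * r - q * q
    disc = math.sqrt(tr * tr - 4 * det)          # real for a symmetric matrix
    pairs = []
    for lam in ((tr + disc) / 2, (tr - disc) / 2):
        u = [q, lam - p]                         # solves (A - lam I) u = 0
        norm = math.hypot(u[0], u[1])
        pairs.append((lam, [u[0] / norm, u[1] / norm]))
    return pairs

A = [[2, -1], [-1, 2]]
pairs = eigen_sym_2x2(A)

# Rebuild A as the sum of lambda_i * u_i u_i^T
rebuilt = [[sum(lam * u[i] * u[j] for lam, u in pairs) for j in range(2)]
           for i in range(2)]
```

Dropping the term with the smaller eigenvalue from the sum gives the best rank-one approximation of A in this expansion, which is the idea exploited later in principal component analysis.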

Figure 2.3 Singular value decomposition of a matrix A into the weighted sum of the outer product of its eigenvectors.

• A projector is another case of a symmetric matrix. Since it is idempotent, its eigenvalues must be either 1 or 0. Indeed, idempotence relative to eigenvectors $u_i$ implies

$$P(P u_i) = P u_i = \lambda_i u_i$$

Moreover

$$P(P u_i) = \lambda_i P u_i = \lambda_i^2 u_i$$

which requires that any eigenvalue and its square are equal. Such a condition is only true for 0 and 1.

The square-root $A^{1/2}$ of a symmetric matrix A is defined as

$$A^{1/2} = \Lambda^{1/2} U^T \tag{2.3.11}$$

which enables the matrix A to be decomposed as

$$A = (A^{1/2})^T A^{1/2}$$

The elements of $A^{1/2}$ may be complex numbers. $A^{1/2}$ is the product of a simple rotation and a scaling, carried out in that order. Example:

$$\begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}^{1/2} = \begin{bmatrix} \sqrt{3} & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix} = \begin{bmatrix} \sqrt{3}/\sqrt{2} & -\sqrt{3}/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}$$

Describe the geometric transformation applied by multiplying the vector $x = [1, 2]^T$ by the matrix A such that

$$A = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}$$

Multiplication of vector x by matrix A produces the vector y and therefore

$$y = Ax = U \Lambda U^T x$$

Figure 2.4 Decomposition of the product of a vector by a 2 x 2 symmetric matrix $A = U\Lambda U^T$ into a sequence of (i) rotation, (ii) scaling, (iii) rotation. The last and the first rotations are opposite of each other.

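Using the results from the previous exercise, the first transformation $U^Tx$ is a rotation that can be written as

$$U^Tx = \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} -1/\sqrt{2} \\ 3/\sqrt{2} \end{bmatrix}$$

(Figure 2.4). The second transformation scales the resulting vector $U^Tx$

$$\Lambda U^Tx = \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} -1/\sqrt{2} \\ 3/\sqrt{2} \end{bmatrix} = \begin{bmatrix} -3/\sqrt{2} \\ 3/\sqrt{2} \end{bmatrix}$$

The third transformation counter-rotates the vector $\Lambda U^Tx$ back to the initial frame

$$U\Lambda U^Tx = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix} \begin{bmatrix} -3/\sqrt{2} \\ 3/\sqrt{2} \end{bmatrix} = \begin{bmatrix} 0 \\ 3 \end{bmatrix}$$

Example: Let us check the singular value decomposition for the matrix $\begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}$. We must prove the following equality

$$A = \lambda_1 u_1 u_1^T + \lambda_2 u_2 u_2^T = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}$$

With the help of the results from previous exercises, we get

$$3 \begin{bmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{bmatrix} \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \end{bmatrix} + 1 \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix} \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix} = 3 \begin{bmatrix} 1/2 & -1/2 \\ -1/2 & 1/2 \end{bmatrix} + \begin{bmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{bmatrix} = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}$$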
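The expansion of a symmetric matrix into weighted outer products, and the rotation, scaling, counter-rotation reading of the product $Ax$, can be checked numerically. The following is a minimal sketch in plain Python, assuming the eigenpairs of the 2 x 2 example matrix (eigenvalues 3 and 1); the function names `rebuild_matrix` and `apply_in_steps` are illustrative and do not come from the text.

```python
import math

# Eigenpairs of A = [[2, -1], [-1, 2]] as used in the text:
# eigenvalue 3 with u1 = (1, -1)/sqrt(2), eigenvalue 1 with u2 = (1, 1)/sqrt(2).
s = 1.0 / math.sqrt(2.0)
eigenpairs = [(3.0, (s, -s)), (1.0, (s, s))]


def rebuild_matrix(pairs):
    """Sum the outer products u_i u_i^T weighted by the eigenvalues."""
    a = [[0.0, 0.0], [0.0, 0.0]]
    for lam, u in pairs:
        for i in range(2):
            for j in range(2):
                a[i][j] += lam * u[i] * u[j]
    return a


def apply_in_steps(pairs, x):
    """Return U^T x, Lambda U^T x and U Lambda U^T x for a 2-vector x."""
    rotated = [u[0] * x[0] + u[1] * x[1] for _, u in pairs]
    scaled = [lam * r for (lam, _), r in zip(pairs, rotated)]
    back = [sum(scaled[k] * pairs[k][1][i] for k in range(2)) for i in range(2)]
    return rotated, scaled, back


A_rebuilt = rebuild_matrix(eigenpairs)
rotated, scaled, y = apply_in_steps(eigenpairs, (1.0, 2.0))
print(A_rebuilt)
print(rotated, scaled, y)
```

The three intermediate vectors correspond to steps (i) to (iii) of Figure 2.4, so the last one should agree with a direct evaluation of $Ax$.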

2.4 Quadratic forms and associated quadrics

2.4.1 Quadrics associated with symmetric matrices

Given $A_{n \times n}$ a symmetric matrix, the quadratic form $S = x^TAx$ can be rewritten as

$$S = x^TU\Lambda U^Tx = (U^Tx)^T\Lambda(U^Tx) = (\Lambda^{1/2}U^Tx)^T\Lambda^{1/2}U^Tx$$

or, using SVD

$$S = \sum_{i=1}^{n} \lambda_i x^Tu_iu_i^Tx = \sum_{i=1}^{n} \lambda_i (u_i^Tx)^T(u_i^Tx) \qquad (2.4.1)$$

and finally

$$S = \sum_{i=1}^{n} \lambda_i y_i^2 = \sum_{i=1}^{n} z_i^2 \qquad (2.4.2)$$

where

$$y_i = u_i^Tx \qquad (2.4.3)$$

represents the $i$th component of $x$ in the frame of the vectors $u_i$, and

$$z_i = \lambda_i^{1/2}u_i^Tx \qquad (2.4.4)$$

its $i$th component in the frame of the vectors $\lambda_i^{-1/2}u_i$.

• In space $\mathbb{R}^n$, $S$ = constant is the equation of a hyper-quadric whose principal axes are colinear with the eigenvectors $u_i$ and, for a constant equal to one, have a half-length of $\lambda_i^{-1/2}$. The simplest case occurs when all $\lambda_i$ are positive, which happens in particular when $A$ is a product of real matrices such as $B^TB$ or $BB^T$. Then the hyper-quadric is a hyper-ellipsoid and, from the above equations, $S$ is positive whatever the non-zero vector $x$. The matrix $A$ is said to be positive definite. Similarly, the equation

$$x^TA^{-1}x = 1 \qquad (2.4.5)$$

represents the equation of a hyper-quadric whose principal axes are colinear with the eigenvectors $u_i$ and have a half-length of $\lambda_i^{1/2}$. This is easily shown by using the eigencomponent decomposition of the matrix inverse $A^{-1}$. If $A$ is positive definite, the associated quadric will be a hyper-ellipsoid with half-length principal axes equal to $\lambda_i^{1/2}$.

Figure 2.5 The ellipse $2x^2 - 2xy + 2y^2 = 1$. $u_1$ and $u_2$ are the eigenvectors associated with the quadratic form, 3 and 1 the corresponding eigenvalues.

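Given the ellipse (Figure 2.5) defined by the equation

$$2x^2 - 2xy + 2y^2 = 1$$

find the matrix associated with this quadratic form. Find the appropriate change of variable to obtain the equation of a circle. Clearly

$$A = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}$$

since

$$\begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = 2x^2 - 2xy + 2y^2$$

Using the factorization of the symmetric matrix $A$ as $U\Lambda U^T$, we get, upon a change of coordinates

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = U^T \begin{bmatrix} x \\ y \end{bmatrix}$$

which is equivalent to

$$\begin{bmatrix} x \\ y \end{bmatrix} = U \begin{bmatrix} x' \\ y' \end{bmatrix} = x'u_1 + y'u_2$$

In the previous two equations, the old coordinates $x$ and $y$ are set as linear combinations of the eigenvectors. The following equation is therefore arrived at

$$S = 3x'^2 + y'^2 = 1$$

which is the equation of an ellipse with its principal directions colinear with the eigenvectors and with half-axes equal to $1/\sqrt{3}$ and $1/\sqrt{1}$. Changing this equation into the equation of a circle is achieved through

$$\begin{bmatrix} x'' \\ y'' \end{bmatrix} = \Lambda^{1/2}U^T \begin{bmatrix} x \\ y \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} x \\ y \end{bmatrix} = U\Lambda^{-1/2} \begin{bmatrix} x'' \\ y'' \end{bmatrix} = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix} \begin{bmatrix} 1/\sqrt{3} & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x'' \\ y'' \end{bmatrix}$$

The change of variables

$$x = \frac{x''}{\sqrt{6}} + \frac{y''}{\sqrt{2}}, \qquad y = -\frac{x''}{\sqrt{6}} + \frac{y''}{\sqrt{2}}$$

therefore leads to the expression

$$2\left(\frac{x''}{\sqrt{6}} + \frac{y''}{\sqrt{2}}\right)^2 - 2\left(\frac{x''}{\sqrt{6}} + \frac{y''}{\sqrt{2}}\right)\left(-\frac{x''}{\sqrt{6}} + \frac{y''}{\sqrt{2}}\right) + 2\left(-\frac{x''}{\sqrt{6}} + \frac{y''}{\sqrt{2}}\right)^2 = 1$$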

which is easily simplified into

$$x''^2 + y''^2 = 1$$

i.e., the equation of a circle with unit radius and centered at the origin.

Table 2.3 lists some Na and Cl data (in $\mu$mol per liter) on rainwater from the Amazon basin (Stallard and Edmond, 1981).

Table 2.3. Na and Cl concentration ($\mu$mol l$^{-1}$) of rainwater samples from the Amazon area (Stallard and Edmond, 1981).

Na   18.6  14.2  19.5  21.8  20.6  16.2   9.8   9.9  13.2   7.4   9.2   1.6
Cl   25.9  14.0  18.0  24.4  22.7  13.8  10.5   8.4  14.0   5.7  11.5   1.9

Draw the ellipse associated with the quadratic form

$$\begin{bmatrix} x_{Na} - \bar{x}_{Na} & x_{Cl} - \bar{x}_{Cl} \end{bmatrix} S^{-1} \begin{bmatrix} x_{Na} - \bar{x}_{Na} \\ x_{Cl} - \bar{x}_{Cl} \end{bmatrix}$$

where $\bar{x}_{Na}$ and $\bar{x}_{Cl}$ are the mean values of the sample Na and Cl concentrations, respectively, and $S$ is the covariance matrix. The purpose of this exercise is to learn how to draw a probability ellipse from the mean values and covariance matrix, a topic to be further developed in Chapter 4. Na and Cl are incorporated into clouds during evaporation of seawater and are therefore strongly correlated. Let us call $x$ the vector of Na and Cl concentrations, $\bar{x}$ the vector of sample means and $S$ the symmetric, positive-definite covariance matrix, i.e., the 2 x 2 matrix with variances on the diagonal and the covariance between Na and Cl concentrations as off-diagonal terms. The equation of the ellipse to be drawn can be written

$$(x - \bar{x})^TS^{-1}(x - \bar{x}) = 1$$

The covariance matrix is factored using the diagonal matrix $\Lambda$ and the eigenvector matrix $U$ as $U\Lambda U^T$. Since $S$ is symmetric and positive-definite, the eigenvalues are positive and the eigenvectors orthogonal. The inverse $S^{-1}$ of $S$ can be expanded as $U\Lambda^{-1}U^T$ and the transformation

$$z = \Lambda^{-1/2}U^T(x - \bar{x})$$

transforms the ellipse equation into the unit circle equation

$$z^Tz = 1$$

Conversely, given a vector $z$ on the unit circle, we go back to the original coordinates through

$$x = \bar{x} + U\Lambda^{1/2}z$$

The mean values of Na and Cl concentrations are 13.50 and 14.23 and the covariance matrix $S$ is

$$S = \begin{bmatrix} 37.56 & 42.79 \\ 42.79 & 55.09 \end{bmatrix}$$

The eigenvalues can be found by solving equation (2.3.7)

$$\lambda^2 - \lambda(37.56 + 55.09) + (37.56 \times 55.09 - 42.79^2) = \lambda^2 - 92.65\lambda + 238.2 = 0$$

giving $\lambda_1 = 2.643$ and $\lambda_2 = 90.01$. The coordinates $x$ and $y$ of the first eigenvector are found by solving one of the eigenvector equations

$$(37.56 - 2.643)x + 42.79y = 0$$

or

$$y = -\frac{34.92}{42.79}x = -0.816x$$

Combining one of these equations with the condition for a unitary vector

$$x^2 + y^2 = 1 \quad \text{or} \quad x^2(1 + 0.816^2) = 1$$

we get $x = 0.775$ and $y = -0.632$. Proceeding identically with the second pair of eigenvalue and eigenvector, we obtain the factorization

$$S = U\Lambda U^T = \begin{bmatrix} 0.775 & 0.632 \\ -0.632 & 0.775 \end{bmatrix} \begin{bmatrix} 2.643 & 0 \\ 0 & 90.01 \end{bmatrix} \begin{bmatrix} 0.775 & -0.632 \\ 0.632 & 0.775 \end{bmatrix}$$

We now calculate the coordinates of an arbitrary number of points $z_i$ ($i = 1, 2, \ldots$) on the unit circle, most easily by incrementing an arbitrary angle $\varphi_i$ from 0 to $2\pi$ and taking $z_i = (\cos\varphi_i, \sin\varphi_i)^T$. For instance, for $\varphi = \pi/6$, $z = [\sqrt{3}/2, 1/2]^T$ and

$$x = \begin{bmatrix} 13.50 \\ 14.23 \end{bmatrix} + \begin{bmatrix} 0.775 & 0.632 \\ -0.632 & 0.775 \end{bmatrix} \begin{bmatrix} \sqrt{2.643} & 0 \\ 0 & \sqrt{90.01} \end{bmatrix} \begin{bmatrix} \sqrt{3}/2 \\ 1/2 \end{bmatrix} = \begin{bmatrix} 17.59 \\ 17.02 \end{bmatrix}$$

Enough points have been calculated in this way to draw the ellipse shown in Figure 2.6.
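The whole procedure (means, covariance matrix, eigencomponents, points on the ellipse) is short enough to be scripted. The sketch below uses only the Python standard library and assumes that the twelve Na and Cl values of Table 2.3 are paired sample by sample and that the covariance uses the $n - 1$ denominator, which reproduces the means and covariance quoted above. The closed-form 2 x 2 eigen-solver assumes a non-zero covariance term; all function names are illustrative.

```python
import math

# Na and Cl concentrations of Table 2.3, paired sample by sample (assumed pairing).
na = [18.6, 14.2, 19.5, 21.8, 20.6, 16.2, 9.8, 9.9, 13.2, 7.4, 9.2, 1.6]
cl = [25.9, 14.0, 18.0, 24.4, 22.7, 13.8, 10.5, 8.4, 14.0, 5.7, 11.5, 1.9]


def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    """Sample covariance with the n - 1 denominator."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def eigen_symmetric_2x2(s11, s12, s22):
    """Eigenvalues (ascending) and unit eigenvectors of [[s11, s12], [s12, s22]].

    Assumes s12 is not zero, which holds for correlated data.
    """
    half_trace = 0.5 * (s11 + s22)
    root = math.sqrt((0.5 * (s11 - s22)) ** 2 + s12 ** 2)
    pairs = []
    for lam in (half_trace - root, half_trace + root):
        # (s11 - lam) x + s12 y = 0 gives the direction (s12, lam - s11).
        vx, vy = s12, lam - s11
        norm = math.hypot(vx, vy)
        pairs.append((lam, (vx / norm, vy / norm)))
    return pairs


def ellipse_points(center, pairs, n_points=72):
    """Points x = center + U Lambda^(1/2) z for z on the unit circle."""
    (lam1, u1), (lam2, u2) = pairs
    points = []
    for k in range(n_points):
        phi = 2.0 * math.pi * k / n_points
        c1 = math.sqrt(lam1) * math.cos(phi)
        c2 = math.sqrt(lam2) * math.sin(phi)
        points.append((center[0] + c1 * u1[0] + c2 * u2[0],
                       center[1] + c1 * u1[1] + c2 * u2[1]))
    return points


center = (mean(na), mean(cl))
s11, s12, s22 = covariance(na, na), covariance(na, cl), covariance(cl, cl)
pairs = eigen_symmetric_2x2(s11, s12, s22)
points = ellipse_points(center, pairs)
print(center, (s11, s12, s22))
print(pairs)
```

The list `points` can be passed to any plotting tool; each point satisfies the ellipse equation to rounding error.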

Figure 2.6 The ellipse $(x - \bar{x})^TS^{-1}(x - \bar{x}) = 1$ for the Amazon rain Na and Cl data of Stallard and Edmond (1981) listed in Table 2.3. Horizontal axis: Na ($\mu$mol l$^{-1}$).

2.4.2 Gershgorin's circles theorem

This theorem has important implications in the box model theory. It states that every eigenvalue of $A_{n \times n}$, possibly complex, lies in the complex plane inside at least one of the circles centered at the diagonal entry $a_{ii}$ and with a radius equal to the sum $\sum |a_{ij}|$ ($j \neq i$) of the moduli of all the off-diagonal elements of the $i$th row. In order to prove this theorem, let $\lambda$ and $u = [u_1, u_2, \ldots, u_n]^T$ be a pair of eigenvalue and eigenvector of the matrix $A$, which therefore satisfy

$$Au = \lambda u$$

Let $u_i$ be the component of $u$ with the largest absolute value. Then, by the rule of matrix product

$$\lambda u_i = \sum_{j=1}^{n} a_{ij}u_j \qquad (2.4.6)$$

We rewrite the last equality as

$$(\lambda - a_{ii})u_i = \sum_{j \neq i} a_{ij}u_j \quad \text{or} \quad \lambda - a_{ii} = \sum_{j \neq i} a_{ij}\frac{u_j}{u_i} \qquad (2.4.7)$$

Taking the modulus of each side and applying the rule for sums of moduli (triangle inequality) gives

$$|\lambda - a_{ii}| \leq \sum_{j \neq i} |a_{ij}| \left|\frac{u_j}{u_i}\right|$$

Since $u_i$ is the component of $u$ with the largest absolute value, we get

$$|\lambda - a_{ii}| \leq \sum_{j \neq i} |a_{ij}| \qquad (2.4.8)$$

which means that, in the complex plane, the eigenvalue $\lambda$ lies within the circle centered at $a_{ii}$ and with radius $r_i$ such that

$$r_i = \sum_{j \neq i} |a_{ij}| \qquad (2.4.9)$$

The same argument applies to $A^T$ and may be used to calculate Gershgorin's circles with respect to columns instead of rows.

Find the Gershgorin's circles for the matrix $A$, such that

$$A = \begin{bmatrix} -1 & 0 & 1 \\ 0 & 3 & 1 \\ 1 & 1 & 1 \end{bmatrix}$$

Figure 2.7 The Gershgorin's circle theorem. The three eigenvalues of the matrix $A$ are located within the Gershgorin's circles. Axes: real part and imaginary part.

Using commercial software, we find the eigenvalues

$$\lambda_1 = -1.45, \quad \lambda_2 = 1.00, \quad \lambda_3 = 3.45$$

We can check that

|A2-3|
and the corresponding circles are drawn in the complex plane of Figure 2.7.
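The row-wise circles of the theorem are straightforward to compute. The sketch below is a plain Python illustration; the 3 x 3 matrix is an assumption, namely a symmetric matrix chosen because its characteristic polynomial $(\lambda - 1)(\lambda^2 - 2\lambda - 5)$ gives eigenvalues $1$ and $1 \pm \sqrt{6}$, which round to the values $-1.45$, $1.00$ and $3.45$ quoted in the text. The helper names are hypothetical.

```python
def gershgorin_disks(matrix):
    """Return one (center, radius) pair per row of a square matrix."""
    disks = []
    for i, row in enumerate(matrix):
        radius = sum(abs(value) for j, value in enumerate(row) if j != i)
        disks.append((row[i], radius))
    return disks


def in_some_disk(value, disks, tol=1e-9):
    """True when the (possibly complex) value lies inside at least one disk."""
    return any(abs(value - center) <= radius + tol for center, radius in disks)


# Assumed test matrix with eigenvalues 1 and 1 +/- sqrt(6).
A = [[-1.0, 0.0, 1.0],
     [0.0, 3.0, 1.0],
     [1.0, 1.0, 1.0]]
eigenvalues = [1.0 - 6.0 ** 0.5, 1.0, 1.0 + 6.0 ** 0.5]

disks = gershgorin_disks(A)
checks = [in_some_disk(lam, disks) for lam in eigenvalues]
print(disks)    # [(-1.0, 1.0), (3.0, 1.0), (1.0, 2.0)]
print(checks)   # [True, True, True]
```

Applying `gershgorin_disks` to the transposed matrix would give the column-wise circles; for a symmetric matrix both sets coincide.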

2.5 Systems of linear differential equations

2.5.1 First-order linear homogeneous equations

Let us first investigate the properties of equations that involve only first-order derivatives, i.e., equations such that

$$\frac{dx}{dt} = f(t)x + g(t) \qquad (2.5.1)$$

where $f(t)$ and $g(t)$ are functions of $t$ only. $g(t)$ is known as the input or forcing function (see Chapter 7). If $g(t) = 0$, the equation is said to be homogeneous. Given the following system to solve in the $n$ time-dependent unknowns $x_i$, with the $a_{ij}$ being constant coefficients:

$$\frac{dx_1}{dt} = a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n$$

$$\frac{dx_2}{dt} = a_{21}x_1 + a_{22}x_2 + \ldots + a_{2n}x_n$$

$$\vdots$$

$$\frac{dx_n}{dt} = a_{n1}x_1 + a_{n2}x_2 + \ldots + a_{nn}x_n$$

Let us lump the $n$ $x_i$ ($i = 1, 2, \ldots, n$) together into the $n$-column vector $x$ and the $a_{ij}$ coefficients into the matrix $A_{n \times n}$ which is not assumed to be symmetric. We now write the system of equations in the matrix form

$$\frac{dx}{dt} = Ax \qquad (2.5.2)$$

The matrix $A$ is first diagonalized as $U\Lambda U^{-1}$

$$\frac{dx}{dt} = U\Lambda U^{-1}x$$

and, premultiplying by $U^{-1}$, we get

$$U^{-1}\frac{dx}{dt} = \frac{d(U^{-1}x)}{dt} = \Lambda U^{-1}x \qquad (2.5.3)$$

Defining the $n$-vector $y$ as $U^{-1}x$, the matrix equation can be rewritten

$$\frac{dy}{dt} = \Lambda y \qquad (2.5.4)$$

which amounts to the $n$ equations

$$\frac{dy_i}{dt} = \lambda_i y_i \qquad (2.5.5)$$

where $\lambda_i$ are the diagonal entries (eigenvalues) of $\Lambda$. This system has a unique set of solutions ($i = 1, 2, \ldots, n$)

$$y_i = y_i^0 e^{\lambda_i t} \qquad (2.5.6)$$

which we recombine into the matrix form

$$y = e^{\Lambda t}y_0 \qquad (2.5.7)$$

where $y_0 = [y_1^0, \ldots, y_n^0]^T$ is the vector $y$, and $x_0$ the vector $x$, both taken at $t = 0$. $e^{\Lambda t}$ is a diagonal matrix with entries $e^{\lambda_i t}$. Reverting to the $x$ values, the solution is

$$x = Uy = Ue^{\Lambda t}U^{-1}x_0 \qquad (2.5.8)$$

The matrix $Ue^{\Lambda t}U^{-1}$ is usually denoted $e^{At}$. In the common-dimension expansion form (Section 2.1.3), this matrix product reads

$$x = \sum_{i=1}^{n} e^{\lambda_i t}(i\text{th column of } U)(i\text{th row of } U^{-1})x_0 = \sum_{i=1}^{n} e^{\lambda_i t}\mathfrak{A}_i x_0 \qquad (2.5.9)$$

where the matrix $\mathfrak{A}_i$ is defined as

$$\mathfrak{A}_i = (i\text{th column of } U)(i\text{th row of } U^{-1}) \qquad (2.5.10)$$

The relationship between the $\mathfrak{A}_i$'s and the matrix exponential $e^{At}$ is

$$e^{At} = \sum_{i=1}^{n} e^{\lambda_i t}\mathfrak{A}_i \qquad (2.5.11)$$

The final step can be taken by defining the $n \times n$ matrix $Q$ by its $i$th column $\mathfrak{A}_i x_0$. Then the solution becomes

$$x = Q\,[e^{\lambda_1 t}, \ldots, e^{\lambda_n t}]^T \qquad (2.5.12)$$

The solution vector $x$ therefore can be viewed as the weighted sum of $n$ 'components' $e^{\lambda_i t}$ with a corresponding weight matrix $Q = [\mathfrak{A}_i x_0]$ ($i = 1, \ldots, n$). This form is extremely useful to predict the rate of growth or decay for each individual component of the solution.
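The construction of the weight matrix $Q$ can be illustrated on a 2 x 2 system. The sketch below is a plain Python illustration under stated assumptions: the matrix has two distinct real eigenvalues (handled in closed form), and the test case $A = [[2, 0], [1, 1]]$ with $x_0 = (1, -1)^T$ is chosen for illustration. Function names are hypothetical.

```python
import math


def eigen_2x2(a):
    """Eigenvalues and eigenvectors of a real 2 x 2 matrix.

    Assumes two distinct real eigenvalues; complex or repeated
    eigenvalues are deliberately not handled in this sketch.
    """
    (a11, a12), (a21, a22) = a
    half_trace = 0.5 * (a11 + a22)
    disc = half_trace ** 2 - (a11 * a22 - a12 * a21)
    if disc <= 0.0:
        raise ValueError("expected two distinct real eigenvalues")
    root = math.sqrt(disc)
    lams = [half_trace + root, half_trace - root]
    vectors = []
    for lam in lams:
        row1 = (a11 - lam, a12)
        row2 = (a21, a22 - lam)
        # The eigenvector is orthogonal to the rows of (A - lam I);
        # use the row with the larger entries to avoid a null row.
        if abs(row1[0]) + abs(row1[1]) >= abs(row2[0]) + abs(row2[1]):
            vectors.append((row1[1], -row1[0]))
        else:
            vectors.append((-row2[1], row2[0]))
    return lams, vectors


def component_matrix(a, x0):
    """Eigenvalues and the matrix Q whose i-th column is (column i of U)(row i of U^-1) x0."""
    lams, vecs = eigen_2x2(a)
    (u11, u21), (u12, u22) = vecs          # the two columns of U
    det = u11 * u22 - u12 * u21
    inv_rows = [(u22 / det, -u12 / det), (-u21 / det, u11 / det)]
    weights = [r[0] * x0[0] + r[1] * x0[1] for r in inv_rows]
    q = [[vecs[j][i] * weights[j] for j in range(2)] for i in range(2)]
    return lams, q


def solution(lams, q, t):
    """x(t) as the weighted sum of the exponential components."""
    return [sum(q[i][j] * math.exp(lams[j] * t) for j in range(2)) for i in range(2)]


A = [[2.0, 0.0], [1.0, 1.0]]
lams, Q = component_matrix(A, (1.0, -1.0))
x_half = solution(lams, Q, 0.5)
print(lams, Q, x_half)
```

Because $Q$ does not depend on how the eigenvectors are scaled, the closed-form eigenvectors need not be normalized.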

Solve the system

$$\frac{dx_1}{dt} = 2x_1$$

$$\frac{dx_2}{dt} = x_1 + x_2$$

with the initial conditions $x_1 = 1$ and $x_2 = -1$ at $t = 0$. Defining the column vector $x$ as $(x_1, x_2)^T$, we can write

$$\frac{dx}{dt} = \begin{bmatrix} 2 & 0 \\ 1 & 1 \end{bmatrix} x = Ax$$

The matrix $A$ can be factored as

$$A = U\Lambda U^{-1} = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}$$

The solution is

$$x = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} e^{2t} & 0 \\ 0 & e^{t} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \end{bmatrix}$$

or

$$x = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} e^{2t} \\ -2e^{t} \end{bmatrix}$$

The alternative formulation

$$x = \begin{bmatrix} 1 & 0 \\ 1 & -2 \end{bmatrix} \begin{bmatrix} e^{2t} \\ e^{t} \end{bmatrix}$$

or, in its more conventional form

$$x_1 = e^{2t}, \qquad x_2 = e^{2t} - 2e^{t}$$

shows that the two components $(1, 1)$ and $(0, -2)$ of $x$ increase at different rates.

Given a number of nuclides at $t = 0$, calculate the distribution of nuclides in the U-Th decay series at any time $t$. This problem has important applications to the various methods of dating collectively known as radioactive disequilibrium and to the evolution of hazard in nuclear wastes containing mixtures of radioactive nuclides with different half-lives. The natural nuclides $^{238}$U, $^{235}$U, and $^{232}$Th decay to different lead isotopes along a series of radioactive isotopes of different elements (Pa, Ra, Rn, ...). For one specific decay series, let us call $N_i$ the number of atoms of the $i$th nuclide present at time $t$ in the system, which we assume to be closed to any exchange with the surrounding medium, and $\lambda_i$ its decay constant [in (time unit)$^{-1}$, not to be confused with eigenvalues]. Let $n$ be the number of nuclides in the series, i.e., $i = 1$ for $^{238}$U, $^{235}$U, and $^{232}$Th and $i = n$ for the stable lead isotopes 206, 207, and 208. The amount of each nuclide is increased by the decay of its parent isotope and decreased by its decay into the daughter isotope, hence

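$$\frac{dN_i}{dt} = \lambda_{i-1}N_{i-1} - \lambda_i N_i$$

with the $\lambda_{i-1}$ term canceling for $i = 1$ and the $\lambda_i$ term for $i = n$. Although the solution to the system of equations for several nuclides has been known for quite a while (Bateman, 1910), matrix formulation has become the most flexible approach. It is common practice to deal not with number of atoms but with activities $[N_i] = \lambda_i N_i$ (number of decay events per time unit). Therefore, multiplying both sides by $\lambda_i$

$$\frac{d[N_i]}{dt} = \lambda_i[N_{i-1}] - \lambda_i[N_i]$$

or, in a matrix form

$$\frac{dN}{dt} = \begin{bmatrix} -\lambda_1 & 0 & \cdots & 0 & 0 \\ \lambda_2 & -\lambda_2 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & -\lambda_{n-1} & 0 \\ 0 & 0 & \cdots & \lambda_n & -\lambda_n \end{bmatrix} N = AN$$

where $N$ is the $n$-column vector of nuclide activities $[N_1], [N_2], \ldots, [N_n]$. The eigenvalues of the matrix $A$ can be found by solving

$$\det \begin{bmatrix} -\lambda_1 - \mu & 0 & \cdots & 0 & 0 \\ \lambda_2 & -\lambda_2 - \mu & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & -\lambda_{n-1} - \mu & 0 \\ 0 & 0 & \cdots & \lambda_n & -\lambda_n - \mu \end{bmatrix} = 0$$

for the eigenvalue $\mu$. Expanding this determinant along the last column shows that the only product which does not vanish is that along the diagonal. Hence

$$\prod_{i=1}^{n} (-\lambda_i - \mu) = 0$$

and the system eigenvalues $\mu_1, \ldots, \mu_n$ are simply the negative of decay constants $\lambda_1, \lambda_2, \ldots, \lambda_n$. The solution therefore can be written as a linear combination of the negative exponentials

$$[N_i] = \sum_{j=1}^{i} \alpha_i^j e^{-\lambda_j t}$$

where $\alpha_i^j$ are constant terms and the sum is limited to $j \leq i$ because the number of atoms of a given nuclide does not depend on its descendants. For all three radioactive series, the decay of the first nuclide is the rate-limiting step so that $\lambda_1 \ll \lambda_j$, except for the terminal lead isotopes (Faure, 1986). After some typical time $t^*$, such that

$$\frac{1}{\lambda_j} \ll t^* \ll \frac{1}{\lambda_1}$$

for all $j \neq 1$, all exponential terms become negligible relative to the term in $e^{-\lambda_1 t}$. This situation is known as secular equilibrium and requires

$$[N_i] \approx \alpha_i^1 e^{-\lambda_1 t}$$

Taking the derivatives and comparing with the decay equation, we obtain

$$\frac{d[N_i]}{dt} \approx -\lambda_1 \alpha_i^1 e^{-\lambda_1 t} \approx \lambda_i\left(\alpha_{i-1}^1 e^{-\lambda_1 t} - \alpha_i^1 e^{-\lambda_1 t}\right)$$

which can be rearranged into

$$\frac{\alpha_{i-1}^1 - \alpha_i^1}{\alpha_i^1} \approx -\frac{\lambda_1}{\lambda_i} \approx 0$$

This condition is only satisfied when

$$\alpha_{i-1}^1 \approx \alpha_i^1$$

which establishes the well-known result that, at secular equilibrium, activities are equal.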

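Let us take the simple example of the two radioactive isotopes $^{238}$U and $^{234}$U. Let us call $\lambda_{238U}$ and $\lambda_{234U}$ their decay constants, respectively

$$\frac{d\,^{238}U}{dt} = -\lambda_{238U}\,^{238}U$$

$$\frac{d\,^{234}U}{dt} = \lambda_{238U}\,^{238}U - \lambda_{234U}\,^{234}U$$

or, in activities

$$\frac{d[^{238}U]}{dt} = -\lambda_{238U}[^{238}U]$$

$$\frac{d[^{234}U]}{dt} = \lambda_{234U}\left([^{238}U] - [^{234}U]\right)$$

In matrix form, this system of equations becomes

$$\frac{d}{dt} \begin{bmatrix} [^{238}U] \\ [^{234}U] \end{bmatrix} = \begin{bmatrix} -\lambda_{238U} & 0 \\ \lambda_{234U} & -\lambda_{234U} \end{bmatrix} \begin{bmatrix} [^{238}U] \\ [^{234}U] \end{bmatrix}$$

The matrix admits $-\lambda_{238U}$ and $-\lambda_{234U}$ for eigenvalues. Let $V$ be the eigenvector matrix of the matrix on the right-hand side (for obvious reasons, we do not want any confusion with the U chemical symbol). The coordinates of the eigenvector associated with $-\lambda_{238U}$ are determined from the equation

$$\begin{bmatrix} 0 & 0 \\ \lambda_{234U} & \lambda_{238U} - \lambda_{234U} \end{bmatrix} \begin{bmatrix} v_{11} \\ v_{21} \end{bmatrix} = 0$$

which, combined with the normalization condition

$$v_{11}^2 + v_{21}^2 = 1$$

yields

$$v_{11} = \frac{\lambda_{234U} - \lambda_{238U}}{\Delta}, \qquad v_{21} = \frac{\lambda_{234U}}{\Delta}$$

where

$$\Delta = \sqrt{(\lambda_{234U} - \lambda_{238U})^2 + \lambda_{234U}^2}$$

The coordinates of the eigenvector associated with $-\lambda_{234U}$ satisfy

$$\begin{bmatrix} \lambda_{234U} - \lambda_{238U} & 0 \\ \lambda_{234U} & 0 \end{bmatrix} \begin{bmatrix} v_{12} \\ v_{22} \end{bmatrix} = 0$$

or

$$(\lambda_{234U} - \lambda_{238U})v_{12} = 0, \qquad \lambda_{234U}v_{12} = 0$$

which requires $v_{12} = 0$ and $v_{22} = 1$. The eigenvector matrix $V$ is therefore

$$V = \begin{bmatrix} (\lambda_{234U} - \lambda_{238U})/\Delta & 0 \\ \lambda_{234U}/\Delta & 1 \end{bmatrix}$$

It can be verified that the eigenvectors are not orthogonal. Using the standard procedure, this matrix is easily inverted into

$$V^{-1} = \begin{bmatrix} \Delta/(\lambda_{234U} - \lambda_{238U}) & 0 \\ -\lambda_{234U}/(\lambda_{234U} - \lambda_{238U}) & 1 \end{bmatrix}$$

The solution may now be expressed as the sum of the individual components

$$\begin{bmatrix} [^{238}U] \\ [^{234}U] \end{bmatrix} = \begin{bmatrix} 1 \\ \dfrac{\lambda_{234U}}{\lambda_{234U} - \lambda_{238U}} \end{bmatrix} [^{238}U]_0\, e^{-\lambda_{238U}t} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} \left([^{234}U]_0 - \frac{\lambda_{234U}}{\lambda_{234U} - \lambda_{238U}}[^{238}U]_0\right) e^{-\lambda_{234U}t}$$

which is recombined as

$$[^{238}U] = [^{238}U]_0\, e^{-\lambda_{238U}t}$$

$$[^{234}U] = [^{234}U]_0\, e^{-\lambda_{234U}t} + \frac{\lambda_{234U}}{\lambda_{234U} - \lambda_{238U}}[^{238}U]_0\left(e^{-\lambda_{238U}t} - e^{-\lambda_{234U}t}\right)$$

Although this way of deriving a classical result seems rather awkward and time-consuming compared to more direct elimination methods, it has the advantage that it can be extended to any combination of isotopes in a decay series.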

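Apply the previous procedure to the decay scheme

$$^{238}U \rightarrow\, ^{234}U \rightarrow\, ^{230}Th \rightarrow\, ^{226}Ra \rightarrow\, ^{210}Pb$$

using the following decay constants: $\lambda_{238U} = 0.155125 \times 10^{-9}$ a$^{-1}$, $\lambda_{234U} = 2.79 \times 10^{-6}$ a$^{-1}$, $\lambda_{230Th} = 9.21 \times 10^{-6}$ a$^{-1}$, $\lambda_{226Ra} = 4.27 \times 10^{-4}$ a$^{-1}$, and $\lambda_{210Pb} = 3.23 \times 10^{-2}$ a$^{-1}$. Assume $[^{238}U]_0 = 1$, $[^{234}U]_0 = 5$, $[^{230}Th]_0 = 0.2$, $[^{226}Ra]_0 = 0.02$, and $[^{210}Pb]_0 = 2.0$. The choice of activities at $t = 0$ amounts to normalising activities to $[^{238}U]$. Once the matrix $A$

$$A = \begin{bmatrix} -\lambda_{238U} & 0 & 0 & 0 & 0 \\ \lambda_{234U} & -\lambda_{234U} & 0 & 0 & 0 \\ 0 & \lambda_{230Th} & -\lambda_{230Th} & 0 & 0 \\ 0 & 0 & \lambda_{226Ra} & -\lambda_{226Ra} & 0 \\ 0 & 0 & 0 & \lambda_{210Pb} & -\lambda_{210Pb} \end{bmatrix}$$

is built on a computer from the numerical values given above, the following eigenvector matrix $V$ is calculated

$$V = \begin{bmatrix} 0.44719 & 0 & 0 & 0 & 0 \\ 0.44721 & 0.37194 & 0 & 0 & 0 \\ 0.44722 & 0.53357 & 0.56890 & 0 & 0 \\ 0.44722 & 0.53708 & 0.58144 & 0.70239 & 0 \\ 0.44722 & 0.53713 & 0.58161 & 0.71180 & 1 \end{bmatrix}$$

Its inverse $V^{-1}$ is

$$V^{-1} = \begin{bmatrix} 2.236 & 0 & 0 & 0 & 0 \\ -2.689 & 2.689 & 0 & 0 & 0 \\ 0.769 & -2.522 & 1.758 & 0 & 0 \\ -2.064 \times 10^{-4} & 0.0316 & -1.455 & 1.424 & 0 \\ 3.301 \times 10^{-10} & -3.821 \times 10^{-6} & 0.0134 & -1.013 & 1 \end{bmatrix}$$

The eigenvalues are known to be the negative of the decay constants. Finally, the matrix $Q$ is computed as

$$Q = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 1 & 4.0000 & 0 & 0 & 0 \\ 1 & 5.7382 & -6.5383 & 0 & 0 \\ 1 & 5.7760 & -6.6824 & -0.073606 & 0 \\ 1 & 5.7765 & -6.6844 & -0.074592 & 1.9824 \end{bmatrix}$$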
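For a decay chain the eigenvectors need not be obtained from a general-purpose eigen-solver: because $A$ is lower bidiagonal, each right and left eigenvector follows from a two-term recurrence. The sketch below, in plain Python, builds the matrix $Q$ this way from the decay constants and initial activities quoted above. The recurrences and the function names are this sketch's own choices, not the procedure used in the text, and the eigenvectors are scaled to a unit leading entry rather than to unit length, which leaves $Q$ unchanged.

```python
import math

# Decay constants (a^-1) and initial activities quoted in the text,
# ordered 238U, 234U, 230Th, 226Ra, 210Pb.
lam = [0.155125e-9, 2.79e-6, 9.21e-6, 4.27e-4, 3.23e-2]
x0 = [1.0, 5.0, 0.2, 0.02, 2.0]


def right_eigenvector(lam, j):
    """Eigenvector of A for the eigenvalue -lam[j], scaled so that v[j] = 1."""
    n = len(lam)
    v = [0.0] * n
    v[j] = 1.0
    for i in range(j + 1, n):
        # Row i of (A + lam[j] I) v = 0: lam[i] v[i-1] - (lam[i] - lam[j]) v[i] = 0
        v[i] = v[i - 1] * lam[i] / (lam[i] - lam[j])
    return v


def left_eigenvector(lam, j):
    """Row of the inverse eigenvector matrix paired with right_eigenvector(lam, j)."""
    n = len(lam)
    w = [0.0] * n
    w[j] = 1.0
    for k in range(j - 1, -1, -1):
        # Column k of w (A + lam[j] I) = 0: (lam[j] - lam[k]) w[k] + lam[k+1] w[k+1] = 0
        w[k] = -w[k + 1] * lam[k + 1] / (lam[j] - lam[k])
    return w


def component_matrix(lam, x0):
    """Matrix Q whose j-th column weights the component exp(-lam[j] t)."""
    n = len(lam)
    q = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = right_eigenvector(lam, j)
        w = left_eigenvector(lam, j)
        weight = sum(wk * xk for wk, xk in zip(w, x0))
        for i in range(n):
            q[i][j] = v[i] * weight
    return q


def activities(q, lam, t):
    """Activities of all nuclides at time t (years)."""
    n = len(lam)
    return [sum(q[i][j] * math.exp(-lam[j] * t) for j in range(n)) for i in range(n)]


Q = component_matrix(lam, x0)
late = activities(Q, lam, 1.0e7)
for row in Q:
    print(["%.5g" % value for value in row])
print(late)
```

The recurrences require distinct decay constants, which is the case here; evaluating `activities` on a logarithmic grid of times gives curves of the kind shown in Figure 2.8.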

Figure 2.8 Evolution of the activity of the five nuclides $^{238}$U, $^{234}$U, $^{230}$Th, $^{226}$Ra, $^{210}$Pb for the initial activity conditions $[^{238}U]_0 = 1$, $[^{234}U]_0 = 5$, $[^{230}Th]_0 = 0.2$, $[^{226}Ra]_0 = 0.02$, $[^{210}Pb]_0 = 2.0$. Horizontal axis: time (years).

The evolution of activities is shown in Figure 2.8. We can see that about one million years (1 Ma) is required to reach full secular equilibrium.

Kinetic theory of oxygen isotope exchange: a granite is made of two mineral phases, quartz ($i = 1$) and feldspar ($i = 2$), plus interstitial hydrous fluid (w). Initially, $\delta^{18}O_1 = 9$, $\delta^{18}O_2 = 8$ (common late magmatic values) and $\delta^{18}O_w = -5$ (meteoric water). The system is assumed to be closed and the mass fractions $f_1 = 0.3$, $f_2 = 0.6$, and $f_w = 0.1$ of oxygen held in mineral 1, mineral 2, and water relative to the total in the rock do not change during the isotopic exchange process. Equilibrium oxygen isotope fractionation between a mineral ($i = 1, 2$) and the surrounding hydrous fluid depends on a simple function $\alpha_i(T)$ of the temperature

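$$\alpha_i(T) = \frac{(^{18}O/^{16}O)_i}{(^{18}O/^{16}O)_w}$$

Calculate the evolution of $\delta^{18}O$ for each phase assuming that temperature is such that $\alpha_1 = 1.010$, $\alpha_2 = 1.005$ and first-order exchange kinetics apply with time constants $k_1$ and $k_2$ ($k_1/k_2 = 0.2$).

We are going to present a method slightly modified from that of Criss et al. (1987) in order to account for some forms of oxygen isotope disequilibrium among minerals in ultrabasic, metamorphic or hydrothermal assemblages. When a mineral assemblage is invaded by a fluid which is not in equilibrium with it, isotopic exchange takes place in such a way that the whole system tends to a new state of equilibrium. Criss et al. (1987) suggest that approach to equilibrium takes place through first-order kinetics, i.e.,

$$\frac{d}{dt}\left(\frac{^{18}O}{^{16}O}\right)_i = -k_i\left[\left(\frac{^{18}O}{^{16}O}\right)_i - \alpha_i\left(\frac{^{18}O}{^{16}O}\right)_w\right]$$

where the temperature-dependence of $\alpha_i$ is made implicit. Dividing both sides of each equation by $(^{18}O/^{16}O)_{SMOW}$, we write

$$\frac{d}{dt}\left[\frac{(^{18}O/^{16}O)_i}{(^{18}O/^{16}O)_{SMOW}}\right] = -k_i\left[\frac{(^{18}O/^{16}O)_i}{(^{18}O/^{16}O)_{SMOW}} - \alpha_i\frac{(^{18}O/^{16}O)_w}{(^{18}O/^{16}O)_{SMOW}}\right]$$

or, upon multiplication by 1000

$$\frac{d\,\delta^{18}O_i}{dt} = -k_i\left(\delta^{18}O_i - \alpha_i\,\delta^{18}O_w\right) - 1000k_i(1 - \alpha_i)$$

which can be expressed as a deviation from the initial conditions (superscript 0)

$$\frac{d(\delta^{18}O_i - \delta^{18}O_i^0)}{dt} = -k_i\left[(\delta^{18}O_i - \delta^{18}O_i^0) - \alpha_i(\delta^{18}O_w - \delta^{18}O_w^0)\right] - k_i\left(\delta^{18}O_i^0 - \alpha_i\,\delta^{18}O_w^0\right) - 1000k_i(1 - \alpha_i)$$

Finally, we assume no precipitation and no dissolution of solid phases. Closed system and mass conservation imply

$$f_1\left(\frac{^{18}O}{^{16}O}\right)_1 + f_2\left(\frac{^{18}O}{^{16}O}\right)_2 + f_w\left(\frac{^{18}O}{^{16}O}\right)_w = f_1\left(\frac{^{18}O}{^{16}O}\right)_1^0 + f_2\left(\frac{^{18}O}{^{16}O}\right)_2^0 + f_w\left(\frac{^{18}O}{^{16}O}\right)_w^0$$

Dividing all by $(^{18}O/^{16}O)_{SMOW}$, subtracting $f_1 + f_2 + f_w$, and then multiplying by 1000 results in

$$f_1\,\delta^{18}O_1 + f_2\,\delta^{18}O_2 + f_w\,\delta^{18}O_w = f_1\,\delta^{18}O_1^0 + f_2\,\delta^{18}O_2^0 + f_w\,\delta^{18}O_w^0$$

or

$$\delta^{18}O_w - \delta^{18}O_w^0 = -\frac{f_1}{f_w}\left(\delta^{18}O_1 - \delta^{18}O_1^0\right) - \frac{f_2}{f_w}\left(\delta^{18}O_2 - \delta^{18}O_2^0\right)$$

Relabelling the variables ($i = 1, 2$) in such a way that

$$x_i = \delta^{18}O_i - \delta^{18}O_i^0$$

and

$$b_i = -k_i\left[\delta^{18}O_i^0 + 1000(1 - \alpha_i) - \alpha_i\,\delta^{18}O_w^0\right]$$

results in a system of non-homogeneous linear differential equations

$$\frac{dx_1}{dt} = -k_1\left(1 + \frac{\alpha_1 f_1}{f_w}\right)x_1 - k_1\frac{\alpha_1 f_2}{f_w}x_2 + b_1$$

$$\frac{dx_2}{dt} = -k_2\frac{\alpha_2 f_1}{f_w}x_1 - k_2\left(1 + \frac{\alpha_2 f_2}{f_w}\right)x_2 + b_2$$

Let us define the matrix $A$ as

$$A = \begin{bmatrix} -k_1\left(1 + \dfrac{\alpha_1 f_1}{f_w}\right) & -k_1\dfrac{\alpha_1 f_2}{f_w} \\ -k_2\dfrac{\alpha_2 f_1}{f_w} & -k_2\left(1 + \dfrac{\alpha_2 f_2}{f_w}\right) \end{bmatrix}$$

We can write

$$\frac{dx}{dt} = Ax + b$$

Introducing the change of variables

$$z = x + A^{-1}b$$

the system of equations becomes

$$\frac{dz}{dt} = Az$$

which can be solved as a homogeneous system. The solution is

$$z = e^{At}z_0$$

This equation can be reformulated as

$$x = e^{At}(x_0 + A^{-1}b) - A^{-1}b$$

The eigenvalues of $A$ are found from

$$\lambda^2 - (a_{11} + a_{22})\lambda + (a_{11}a_{22} - a_{12}a_{21}) = 0$$

with

$$a_{11} + a_{22} = -k_1\left(1 + \frac{\alpha_1 f_1}{f_w}\right) - k_2\left(1 + \frac{\alpha_2 f_2}{f_w}\right)$$

and

$$a_{11}a_{22} - a_{12}a_{21} = k_1k_2\left(1 + \frac{\alpha_1 f_1}{f_w} + \frac{\alpha_2 f_2}{f_w}\right)$$

Let us express the time in units of $1/k_1$, which amounts to assuming $k_1 = 1$, $k_2 = 5$, and work with a non-dimensional time $\tau = k_1 t$. The matrix $A$ is calculated as

$$A = \begin{bmatrix} -\left(1 + \dfrac{1.010 \times 0.3}{0.1}\right) & -\dfrac{1.010 \times 0.6}{0.1} \\ -5\,\dfrac{1.005 \times 0.3}{0.1} & -5\left(1 + \dfrac{1.005 \times 0.6}{0.1}\right) \end{bmatrix} = \begin{bmatrix} -4.030 & -6.060 \\ -15.075 & -35.150 \end{bmatrix}$$

and the vector $b$ as

$$b = \begin{bmatrix} -1\{9 + 1000(1 - 1.010) - 1.010(-5)\} \\ -5\{8 + 1000(1 - 1.005) - 1.005(-5)\} \end{bmatrix} = \begin{bmatrix} -4.050 \\ -40.125 \end{bmatrix}$$

Therefore

$$A^{-1}b = \begin{bmatrix} -2.004 \\ 2.001 \end{bmatrix}$$

Solving the characteristic equation, we find that the eigenvalues of $A$ are $\lambda_1 = -1.3289$ and $\lambda_2 = -37.851$ with the eigenvector matrix $U$ and its inverse given by

$$U = \begin{bmatrix} 0.9134 & 0.1764 \\ -0.4071 & 0.9843 \end{bmatrix}, \qquad U^{-1} = \begin{bmatrix} 1.0139 & -0.1817 \\ 0.4193 & 0.9408 \end{bmatrix}$$

The matrix exponential is calculated from the common-dimension expansion as

$$e^{A\tau} = e^{\lambda_1\tau}(\text{1st column of } U)(\text{1st row of } U^{-1}) + e^{\lambda_2\tau}(\text{2nd column of } U)(\text{2nd row of } U^{-1})$$

or, inserting numerical values, as

$$e^{A\tau} = e^{-1.3289\tau}\begin{bmatrix} 0.9134 \\ -0.4071 \end{bmatrix}\begin{bmatrix} 1.0139 & -0.1817 \end{bmatrix} + e^{-37.851\tau}\begin{bmatrix} 0.1764 \\ 0.9843 \end{bmatrix}\begin{bmatrix} 0.4193 & 0.9408 \end{bmatrix}$$

Remembering that $x_0$ is the deviation of $\delta^{18}O$ from the values at $t = 0$, i.e., $x_0 = [0, 0]^T$, we get the final solution as

$$x = \begin{bmatrix} -2.188e^{-1.3289\tau} + 0.184e^{-37.851\tau} + 2.004 \\ 0.975e^{-1.3289\tau} + 1.026e^{-37.851\tau} - 2.001 \end{bmatrix}$$

This result can be made explicit as

$$\delta^{18}O_{quartz} = 9 + 2.004 - 2.188e^{-1.3289\tau} + 0.184e^{-37.851\tau}$$

$$\delta^{18}O_{feldsp} = 8 - 2.001 + 0.975e^{-1.3289\tau} + 1.026e^{-37.851\tau}$$

The new equilibrium state can be calculated by letting $t \rightarrow \infty$

$$\delta^{18}O_{quartz} = \delta^{18}O_{quartz}^0 + 2.0 = 11.0$$

$$\delta^{18}O_{feldsp} = \delta^{18}O_{feldsp}^0 - 2.0 = 6.0$$

$\delta^{18}O_w$ can be calculated using the closure equation

$$\delta^{18}O_w = \delta^{18}O_w^0 - \frac{0.3}{0.1}(+2) - \frac{0.6}{0.1}(-2) = -5 - 6 + 12 = +1$$

The reader will check that $\delta^{18}O_{rock}$ (+7) does not change in the process.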
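The numbers of this exercise can be reproduced with a few lines of plain Python. The sketch below assumes the non-homogeneous form $dx/d\tau = Ax + b$ for the deviations $x_i = \delta^{18}O_i - \delta^{18}O_i^0$, with $A$ and $b$ built from the quoted parameters. The matrix exponential is evaluated with Sylvester's formula for two distinct eigenvalues, a technique swapped in here for brevity in place of the eigenvector expansion; all names are illustrative.

```python
import math

# Parameters quoted in the exercise; time is measured in units of 1/k1.
k = [1.0, 5.0]             # k1 and k2, with k1/k2 = 0.2
alpha = [1.010, 1.005]     # mineral-water fractionation factors
f = [0.3, 0.6]             # oxygen mass fractions held in quartz and feldspar
f_w = 0.1                  # oxygen mass fraction held in water
delta0 = [9.0, 8.0]        # initial delta18O of quartz and feldspar
delta_w0 = -5.0            # initial delta18O of water


def build_system():
    """Matrix A and vector b of dx/dtau = A x + b."""
    a = [[-k[i] * alpha[i] * f[j] / f_w for j in range(2)] for i in range(2)]
    for i in range(2):
        a[i][i] -= k[i]
    b = [-k[i] * (delta0[i] + 1000.0 * (1.0 - alpha[i]) - alpha[i] * delta_w0)
         for i in range(2)]
    return a, b


def solve_2x2(a, rhs):
    """Solve a 2 x 2 linear system by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(rhs[0] * a[1][1] - a[0][1] * rhs[1]) / det,
            (a[0][0] * rhs[1] - a[1][0] * rhs[0]) / det]


def eigenvalues_2x2(a):
    """Both eigenvalues, assuming they are real and distinct."""
    half_trace = 0.5 * (a[0][0] + a[1][1])
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = math.sqrt(half_trace ** 2 - det)
    return half_trace + root, half_trace - root


def deviations(a, c, lam, tau):
    """x(tau) = exp(A tau) c - c for x(0) = 0, where c = A^-1 b."""
    l1, l2 = lam
    e1, e2 = math.exp(l1 * tau), math.exp(l2 * tau)
    x = []
    for i in range(2):
        total = 0.0
        for j in range(2):
            ident = 1.0 if i == j else 0.0
            # Sylvester's formula for exp(A tau) with two distinct eigenvalues
            expa_ij = ((a[i][j] - l2 * ident) * e1 - (a[i][j] - l1 * ident) * e2) / (l1 - l2)
            total += expa_ij * c[j]
        x.append(total - c[i])
    return x


A, b = build_system()
c = solve_2x2(A, b)
lam = eigenvalues_2x2(A)
x_late = deviations(A, c, lam, 50.0)
delta_inf = [delta0[i] + x_late[i] for i in range(2)]
delta_w_inf = delta_w0 - sum(f[i] / f_w * x_late[i] for i in range(2))
rock = f[0] * delta_inf[0] + f[1] * delta_inf[1] + f_w * delta_w_inf
print(A, b, lam)
print(delta_inf, delta_w_inf, rock)
```

Evaluating `deviations` over a range of $\tau$ values gives the full exchange trajectory; the bulk-rock value stays constant by construction of the closure condition.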

2.5.2 Linear equations of order higher than one

A simple example will show how higher-degree linear equations reduce to a system of first-order equations.

Solve

$$\frac{d^2x}{dt^2} - 4\frac{dx}{dt} + 3x = 0$$

with the conditions that $x(0) = 1$ and $dx(0)/dt = 0$. Let us make the change of variables $x_1 = x(t)$ and $x_2 = dx/dt$, with $x_1 = 1$ and $x_2 = 0$ at $t = 0$. The second-order differential equation can be transformed into the following system of two first-order equations

$$\frac{dx_1}{dt} = x_2$$

$$\frac{dx_2}{dt} = -3x_1 + 4x_2$$

Defining the matrix $A$ as

$$A = \begin{bmatrix} 0 & 1 \\ -3 & 4 \end{bmatrix}$$

and the vector $x$ as $(x_1, x_2)^T$, the last equation can be recast in matrix form as

$$\frac{dx}{dt} = Ax$$

which can be solved as a system of equations of order one. Finding the eigenvalues of matrix $A$ amounts to solving the characteristic equation of our differential equation

$$\lambda^2 - 4\lambda + 3 = 0$$

$\lambda = 1$ and $\lambda = 3$ are the solutions of this equation. It can be verified that the eigenvector matrix $U$ and its inverse are given by

$$U = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{10} \\ 1/\sqrt{2} & 3/\sqrt{10} \end{bmatrix}, \qquad U^{-1} = \begin{bmatrix} 3/\sqrt{2} & -1/\sqrt{2} \\ -\sqrt{10}/2 & \sqrt{10}/2 \end{bmatrix}$$

Proceeding as before we obtain the solution

$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{10} \\ 1/\sqrt{2} & 3/\sqrt{10} \end{bmatrix} \begin{bmatrix} e^{t} & 0 \\ 0 & e^{3t} \end{bmatrix} \begin{bmatrix} 3/\sqrt{2} & -1/\sqrt{2} \\ -\sqrt{10}/2 & \sqrt{10}/2 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix}$$

which can be rearranged as

$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 3/2 & -1/2 \\ 3/2 & -3/2 \end{bmatrix} \begin{bmatrix} e^{t} \\ e^{3t} \end{bmatrix}$$

It can be checked that

$$x = \frac{3}{2}e^{t} - \frac{1}{2}e^{3t}$$

is the solution which satisfies both the differential equation and the initial conditions.

2.5.3 Stability of solutions to linear systems of differential equations

All the previous examples happened to provide matrices with real eigenvalues but this is by no means required to be a general situation. The reader is referred to advanced textbooks on eigenvalue theory (e.g., Wilkinson, 1965) for the demonstration that the complex eigenvalues of real general (non-symmetric) matrices form pairs of conjugate complex numbers. Let us consider one of these eigenvalues

$\lambda$, which we write in its complex form

$$\lambda = a + ib$$

where $a$ and $b$ are the real and imaginary parts of $\lambda$, respectively. The corresponding time-dependent exponential that appears as a component of the solution to the system of differential equations is

$$e^{\lambda t} = e^{at}e^{ibt}$$

which is the product of a real exponential term by a periodic complex term of modulus unity and period $2\pi/b$. $e^{\lambda t}$ grows exponentially for $a > 0$ and the solution is unstable for large values of time. When $a$ is negative (which will be shown in Chapter 7 to be the case for linearly coupled geochemical reservoirs), $e^{\lambda t}$ decays towards zero after some oscillations if $b$ is not zero. For $a = 0$, the system oscillates endlessly. This state of sustained oscillation is known as a limit cycle and separates stable ($a < 0$) from unstable ($a > 0$) conditions. Many textbooks in applied mathematics offer an excellent discussion of the stability theory for differential equations (e.g., Strang, 1986; Logan, 1987; Zwillinger, 1989).
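The stability criterion only involves the real parts of the eigenvalues, so it can be checked without solving the system. The sketch below, in plain Python, classifies a 2 x 2 system $dx/dt = Ax$; the labels, the tolerance and the last two test matrices are assumptions of this illustration.

```python
import cmath


def eigenvalues_2x2(a):
    """Both eigenvalues of a real 2 x 2 matrix, as complex numbers."""
    (a11, a12), (a21, a22) = a
    half_trace = 0.5 * (a11 + a22)
    det = a11 * a22 - a12 * a21
    root = cmath.sqrt(half_trace ** 2 - det)
    return half_trace + root, half_trace - root


def classify(a, tol=1e-12):
    """Label the solutions of dx/dt = A x from the real parts of the eigenvalues."""
    largest = max(lam.real for lam in eigenvalues_2x2(a))
    if largest > tol:
        return "unstable"
    if largest < -tol:
        return "stable"
    return "neutral"


examples = {
    "second-order example": [[0.0, 1.0], [-3.0, 4.0]],
    "isotope exchange": [[-4.030, -6.060], [-15.075, -35.150]],
    "pure oscillation": [[0.0, 1.0], [-1.0, 0.0]],
    "damped oscillation": [[-0.1, 1.0], [-1.0, -0.1]],
}
labels = {name: classify(matrix) for name, matrix in examples.items()}
print(labels)
```

For the oscillating cases the imaginary part $b$ of the eigenvalue gives the period $2\pi/|b|$ of the oscillation.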

2.6 Linear function spaces

2.6.1 General

The idea of a vector space is usefully extended to an infinite number of dimensions for continuous functions. Given a function $f$ (e.g., $f = \sin x$) and a definition domain (e.g., 0 to $2\pi$), the coordinates of $f = \sin x$ will be the infinite number of values of the function over the definition domain. This definition is consistent with that of Euclidian spaces if a metric is defined. In about the same way as the squared norm of the $n$-vector $x(x_1, x_2, \ldots, x_n)$ is

$$|x|^2 = \sum_{i=1}^{n} x_i^2$$

the squared-modulus of the infinitely dimensional vector $f = \sin x$ over the $[0, 2\pi]$ interval divided in segments of length $\Delta x = 2\pi/n$ is

$$|f|^2 = \lim_{\Delta x \to 0} \sum_{i=1}^{2\pi/\Delta x} \sin^2(i\Delta x)\,\Delta x \qquad (2.6.1)$$

i.e.,

$$|f|^2 = \int_0^{2\pi} \sin^2 x\,dx \qquad (2.6.2)$$

In general, the squared-modulus of the vector function $f(x)$ over the domain $\mathcal{D}$ is

$$|f|^2 = f^Tf = \langle f, f \rangle = \int_{\mathcal{D}} f^2(x)\,dx \qquad (2.6.3)$$

(note the alternative formulation), whereas the dot, or inner product, of the vector functions $f(x)$ and $g(x)$ is

$$f^Tg = \langle f, g \rangle = \int_{\mathcal{D}} f(x)g(x)\,dx \qquad (2.6.4)$$

Two functions $f(x)$ and $g(x)$ are orthogonal over $\mathcal{D}$ if for

$$\langle f, f \rangle \neq 0 \quad \text{and} \quad \langle g, g \rangle \neq 0 \qquad (2.6.5)$$

then

$$\langle f, g \rangle = 0 \qquad (2.6.6)$$
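The limit in equation (2.6.1) suggests a direct numerical check of these inner products. The following is a minimal sketch in plain Python that approximates $\langle f, g \rangle$ by a midpoint sum; the number of segments and the function name are arbitrary choices of this illustration.

```python
import math


def inner_product(f, g, lower, upper, n=20000):
    """Midpoint-rule estimate of the integral of f(x) g(x) over [lower, upper]."""
    dx = (upper - lower) / n
    total = 0.0
    for i in range(n):
        x = lower + (i + 0.5) * dx
        total += f(x) * g(x)
    return total * dx


two_pi = 2.0 * math.pi
norm_sq = inner_product(math.sin, math.sin, 0.0, two_pi)
cross = inner_product(math.sin, lambda x: math.sin(2.0 * x), 0.0, two_pi)
mixed = inner_product(math.sin, math.cos, 0.0, two_pi)
print(norm_sq, cross, mixed)
```

The squared modulus of $\sin x$ over $[0, 2\pi]$ comes out close to $\pi$, while the two cross products vanish to rounding error, which is the numerical signature of orthogonality.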

2.6.2 Fourier series

A widely used example of orthogonal functions is the set of sines and cosines. For example, given any real number $a$, and the function $\sin nx$ for integer values of $n$, $\mathcal{D}$ is equal to $[a, a + 2\pi]$. We can check that

$$\langle \sin nx, \sin nx \rangle = \int_a^{a+2\pi} \sin^2 nx\,dx = \int_a^{a+2\pi} \frac{1 - \cos 2nx}{2}\,dx = \left[\frac{x}{2} - \frac{\sin 2nx}{4n}\right]_a^{a+2\pi} = \pi \qquad (2.6.7)$$

where the subscript value for the quantity in brackets is subtracted from the superscript value. Also, for $n \neq m$,

$$\langle \sin nx, \sin mx \rangle = \int_0^{2\pi} \sin nx \sin mx\,dx = \int_0^{2\pi} \frac{\cos(n - m)x - \cos(n + m)x}{2}\,dx$$

and

$$\langle \sin nx, \sin mx \rangle = \frac{1}{2}\left[\frac{\sin(n - m)x}{n - m}\right]_0^{2\pi} - \frac{1}{2}\left[\frac{\sin(n + m)x}{n + m}\right]_0^{2\pi} = 0 \qquad (2.6.8)$$

Similar orthogonality relationships can be shown for cosines. In addition, sines and cosines are mutually orthogonal, i.e., for any $m$ and $n$

$$\langle \sin nx, \cos mx \rangle = 0 \qquad (2.6.9)$$

Just as a vector is projected as components on orthogonal axes, a given function defined on a given domain can be projected onto an orthogonal set of functions. The Fourier series decomposition of a function $f(x)$ defined over the interval $[-X, X]$ is a convenient example

$$f(x) = \sum_{n=0}^{\infty} \left(a_n \sin n\pi\frac{x}{X} + b_n \cos n\pi\frac{x}{X}\right) \qquad (2.6.10)$$

As in the case of regular vectors, the components $a_n$ and $b_n$ can be found by forming the inner products upon multiplication of the $f(x)$ expansion by the adequate sine or cosine

$$\left\langle f, \sin n\pi\frac{x}{X} \right\rangle = a_n \left\langle \sin n\pi\frac{x}{X}, \sin n\pi\frac{x}{X} \right\rangle \qquad (2.6.11)$$

We note that, with the appropriate variable change, the inner product on the right-hand side of equation (2.6.11) reads

$$\left\langle \sin n\pi\frac{x}{X}, \sin n\pi\frac{x}{X} \right\rangle = \int_{-X}^{X} \sin^2 n\pi\frac{x}{X}\,dx = \frac{X}{n\pi}\int_{-n\pi}^{n\pi} \sin^2 u\,du = \frac{X}{n\pi}\,n\pi = X$$

with an identical result when the integrand is a cosine. We therefore obtain $a_n$ and $b_n$ from

$$a_n = \frac{1}{X}\left\langle f, \sin n\pi\frac{x}{X} \right\rangle, \qquad b_n = \frac{1}{X}\left\langle f, \cos n\pi\frac{x}{X} \right\rangle \qquad (2.6.12)$$

From these last expressions, it is apparent that $a_0$ is null whereas $b_0$ represents the mean value of the function over the interval. We will now work out two examples, the results of which will be used in Chapter 8.

Find the Fourier series expansion of the boxcar function (Figure 2.9) defined over the interval $[-X, X]$ by

$f(x) = 1$ for $0 < x < X$
$f(x) = -1$ for $-X < x < 0$
$f(x) = 0$ for $x = 0$

If this pattern is repeated over all intervals $[(2n - 1)X, (2n + 1)X]$ for integer values of $n$ from $-\infty$ to $+\infty$, the result is a periodic function of period $2X$. $a_n$ is calculated as

$$a_n = \frac{1}{X}\int_{-X}^{0} (-1)\sin n\pi\frac{x}{X}\,dx + \frac{1}{X}\int_{0}^{X} (1)\sin n\pi\frac{x}{X}\,dx$$

Figure 2.9 The periodic boxcar function with period $2X$.

or

$$a_n = \frac{1}{X}\left[\frac{\cos n\pi x/X}{n\pi/X}\right]_{-X}^{0} + \frac{1}{X}\left[-\frac{\cos n\pi x/X}{n\pi/X}\right]_{0}^{X} = \frac{2}{n\pi}(1 - \cos n\pi)$$

Only the odd terms are non-zero and equal to $4(n\pi)^{-1}$. The $b_n$ coefficients such that

$$b_n = \frac{1}{X}\int_{-X}^{0} (-1)\cos n\pi\frac{x}{X}\,dx + \frac{1}{X}\int_{0}^{X} (1)\cos n\pi\frac{x}{X}\,dx$$

can be rewritten

$$b_n = -\frac{1}{X}\int_{-X}^{0} \cos n\pi\frac{x}{X}\,dx + \frac{1}{X}\int_{0}^{X} \cos n\pi\frac{x}{X}\,dx$$

Changing $x$ to $-x$ in the first integral and recognizing that cos is an even function shows that $b_n$ is zero for all $n$. The Fourier series expansion of the boxcar function is therefore

$$f(x) = \sum_{n=1}^{\infty} a_n \sin n\pi\frac{x}{X} = \frac{4}{\pi}\sum_{p=0}^{\infty} \frac{1}{2p + 1}\sin(2p + 1)\pi\frac{x}{X}$$

Figure 2.10 Partial sums of Fourier components of the boxcar function up to $p = 9$. Convergence to the boxcar function is achieved rapidly although 'ears' appear next to discontinuities (Gibbs effect). Horizontal axis: $x/X$.
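The partial sums of the boxcar expansion are easily evaluated. The sketch below, in plain Python, sums the odd sine terms up to a chosen index $p$ and samples the result near the discontinuity; the half-period $X = 1$ and the sampling grid are arbitrary choices of this illustration.

```python
import math


def boxcar_partial_sum(x, half_period, p_max):
    """Sum of the sine terms with odd index 2p + 1 for p = 0 .. p_max."""
    total = 0.0
    for p in range(p_max + 1):
        n = 2 * p + 1
        total += math.sin(n * math.pi * x / half_period) / n
    return 4.0 / math.pi * total


X = 1.0
mid = boxcar_partial_sum(0.5 * X, X, 9)
origin = boxcar_partial_sum(0.0, X, 9)
overshoot = max(boxcar_partial_sum(0.001 * k * X, X, 9) for k in range(1, 500))
print(mid, origin, overshoot)
```

Away from the jump the sum with $p = 9$ is already within a few percent of 1, while the largest sampled value stays noticeably above 1, which is the 'ear' of the Gibbs effect.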

Addition of the first components up to $p = 9$ is shown in Figure 2.10. Reconstruction of the boxcar function is rapid although 'ears' persist next to the edge, a feature characteristic of discontinuities and known as the 'Gibbs effect'.

Find the Fourier series expansion of the ramp function (Figure 2.11) defined over the interval $[-X, X]$ by

$f(x) = x$ for $-X < x < X$
$f(x) = 0$ for $x = \pm X$

Again, reproducing this pattern over intervals $[(2n - 1)X, (2n + 1)X]$ for integer values of $n$ from $-\infty$ to $+\infty$ results in a periodic function with period $2X$. $a_n$ is calculated as

$$a_n = \frac{1}{X}\int_{-X}^{X} x \sin n\pi\frac{x}{X}\,dx$$

Integrating by parts gives

$$\int_{-X}^{X} x \sin n\pi\frac{x}{X}\,dx = -\left[\frac{x\cos n\pi x/X}{n\pi/X}\right]_{-X}^{X} + \int_{-X}^{X} \frac{\cos n\pi x/X}{n\pi/X}\,dx$$

and therefore

$$a_n = \frac{1}{X}\left[-\frac{2X^2(-1)^n}{n\pi} + 0\right] = -\frac{2X(-1)^n}{n\pi}$$

Figure 2.11 The periodic ramp function with period $2X$.

It will be left as an exercise for the reader to show that the cosine terms $b_n$, including the mean value $b_0$, are zero. The Fourier series expansion of the ramp function is therefore

$$f(x) = -\frac{2X}{\pi}\sum_{n=1}^{\infty} \frac{(-1)^n}{n}\sin n\pi\frac{x}{X}$$

2.6.3 Legendre polynomials

The powers of the variable $x$ ($1, x, x^2, \ldots, x^n, \ldots$) are not orthogonal functions over a unique interval. However, particular sets of polynomials present the orthogonality property. A simple and useful example is that of Legendre polynomials. Let us choose over the range $[-1, +1]$ the first two polynomials

$$P_0 = 1 \quad \text{and} \quad P_1 = x$$

$P_0$ and $P_1$ are orthogonal since

$$\langle P_0, P_0 \rangle = \int_{-1}^{1} dx = 2$$

and

$$\langle P_1, P_1 \rangle = \int_{-1}^{1} x^2\,dx = \frac{2}{3}$$

while

$$\langle P_0, P_1 \rangle = \int_{-1}^{1} x\,dx = 0$$

In the space defined by the non-orthogonal 'vectors' $1, x, x^2, \ldots$ let us find the Legendre polynomials of higher degree by Gram-Schmidt orthogonalization. A polynomial $P_2$ is found by removing the components of $x^2$ along $P_0$ and $P_1$ as in equation (2.2.32)

$$P_2 = x^2 - \frac{\langle x^2, P_0 \rangle}{\langle P_0, P_0 \rangle}P_0 - \frac{\langle x^2, P_1 \rangle}{\langle P_1, P_1 \rangle}P_1$$

or

$$P_2 = x^2 - \frac{\int_{-1}^{1} x^2\,dx}{\int_{-1}^{1} dx}\,1 - \frac{\int_{-1}^{1} x^3\,dx}{\int_{-1}^{1} x^2\,dx}\,x$$

Finally

$$P_2 = x^2 - \frac{2/3}{2} - 0 = x^2 - \frac{1}{3}$$

Alternatively, a recursion formula can be used (e.g., Scheid, 1968)

$$(n + 1)P_{n+1} = (2n + 1)xP_n - nP_{n-1}$$

which, from the same seed $P_0 = 1$ and $P_1 = x$ produces the sequence

$$P_2 = \tfrac{1}{2}(3x^2 - 1)$$
$$P_3 = \tfrac{1}{2}(5x^3 - 3x)$$
$$P_4 = \tfrac{1}{8}(35x^4 - 30x^2 + 3)$$
$$P_5 = \tfrac{1}{8}(63x^5 - 70x^3 + 15x) \qquad (2.6.13)$$

Figure 2.12 The first Legendre polynomials.

Because of a different normalization, the coefficients of the parentheses are not identical for the Gram-Schmidt orthogonalization and for the recursion formula. The first Legendre polynomials are depicted in Figure 2.12. It will be checked that a polynomial of order $n$ has exactly $n$ zeroes in the range $[-1, +1]$.
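The three-term recursion lends itself to a few lines of code. The sketch below, in plain Python, evaluates $P_n(x)$ from the seeds $P_0 = 1$ and $P_1 = x$ and checks orthogonality over $[-1, +1]$ with a midpoint sum; the function names and the number of integration steps are choices of this illustration.

```python
def legendre(n, x):
    """P_n(x) from (k + 1) P_{k+1} = (2k + 1) x P_k - k P_{k-1}, with P_0 = 1, P_1 = x."""
    if n == 0:
        return 1.0
    previous, current = 1.0, x
    for k in range(1, n):
        previous, current = current, ((2 * k + 1) * x * current - k * previous) / (k + 1)
    return current


def legendre_inner_product(m, n, steps=2000):
    """Midpoint-rule estimate of the integral of P_m P_n over [-1, 1]."""
    dx = 2.0 / steps
    total = 0.0
    for i in range(steps):
        x = -1.0 + (i + 0.5) * dx
        total += legendre(m, x) * legendre(n, x)
    return total * dx


p2_half = legendre(2, 0.5)      # compare with (3 x^2 - 1) / 2 at x = 0.5
p5_half = legendre(5, 0.5)      # compare with (63 x^5 - 70 x^3 + 15 x) / 8
cross_23 = legendre_inner_product(2, 3)
norm_22 = legendre_inner_product(2, 2)
print(p2_half, p5_half, cross_23, norm_22)
```

With this normalization the squared modulus of $P_n$ is $2/(2n + 1)$, so `norm_22` should be close to 0.4.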

2.6.4 Associated Legendre polynomials

Legendre polynomials are one specific variety of a more extended class of orthogonal polynomials called associated Legendre polynomials. An associated Legendre polynomial $P_l^m(x)$ is defined relative to an ordinary Legendre polynomial $P_l(x)$ through

$$P_l^m(x) = (-1)^m(1 - x^2)^{m/2}\frac{d^mP_l(x)}{dx^m} \qquad (2.6.14)$$

Alternative definitions lack the $(-1)^m$ term. $P_l(x)$ is therefore a concise expression for $P_l^0(x)$. Note that, because of the derivative term in equation (2.6.14), if $m > l$, $P_l^m(x) = 0$. Advanced calculus would show orthogonality properties with

$$\langle P_{l_1}^m, P_{l_2}^m \rangle = 0 \quad \text{for } l_1 \neq l_2$$

and

$$\langle P_l^m, P_l^m \rangle = \frac{2}{2l + 1}\frac{(l + m)!}{(l - m)!}$$

Table 2.4. The first associated Legendre polynomials up to $l = m = 2$.

l    m    P_l^m(x)
0    0    1
1    0    x
1    1    -(1 - x^2)^(1/2)
2    0    1/2 (3x^2 - 1)
2    1    -3 (1 - x^2)^(1/2) x
2    2    3 (1 - x^2)

The numerical generation of associated Legendre polynomials is discussed by Press et al. (1986). These authors use the following recurrence on $l$

$$(l - m)P_l^m(x) = x(2l - 1)P_{l-1}^m(x) - (l + m - 1)P_{l-2}^m(x) \qquad (2.6.15)$$

and the two starting values

$$P_m^m(x) = (-1)^m(2m - 1)!!\,(1 - x^2)^{m/2} \qquad (2.6.16)$$

where the double factorial $n!!$ denotes the product of all the odd integers $\leq n$, and

$$P_{m+1}^m(x) = x(2m + 1)P_m^m(x) \qquad (2.6.17)$$
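The recurrence on $l$ and its two starting values translate directly into code. The following plain Python sketch is written from the three relations above rather than copied from any published routine; the function name and the argument checks are this sketch's own, and the sign convention includes the $(-1)^m$ factor.

```python
import math


def associated_legendre(l, m, x):
    """P_l^m(x) for 0 <= m <= l and |x| <= 1, including the (-1)^m factor."""
    if m < 0 or m > l or abs(x) > 1.0:
        raise ValueError("requires 0 <= m <= l and |x| <= 1")
    # Starting value P_m^m = (-1)^m (2m - 1)!! (1 - x^2)^(m/2)
    pmm = 1.0
    root = math.sqrt((1.0 - x) * (1.0 + x))
    odd = 1.0
    for _ in range(m):
        pmm *= -odd * root
        odd += 2.0
    if l == m:
        return pmm
    # Second starting value P_{m+1}^m = x (2m + 1) P_m^m
    pmmp1 = x * (2 * m + 1) * pmm
    if l == m + 1:
        return pmmp1
    # Upward recurrence on the degree l
    for ll in range(m + 2, l + 1):
        pll = (x * (2 * ll - 1) * pmmp1 - (ll + m - 1) * pmm) / (ll - m)
        pmm, pmmp1 = pmmp1, pll
    return pmmp1


x = 0.5
values = {(l, m): associated_legendre(l, m, x) for l in range(3) for m in range(l + 1)}
print(values)
```

The dictionary `values` holds the six polynomials with $l \leq 2$ evaluated at $x = 0.5$, which can be compared with the closed forms of Table 2.4.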

Examples of ordinary Legendre polynomials with $m = 0$ are

$$P_0^0(x) = 1$$

$$P_1^0(x) = x \times 1 \times P_0^0(x) = x$$

$$(2 - 0)P_2^0(x) = x \times 3 \times P_1^0(x) - 1 \times P_0^0(x) = 3x^2 - 1$$

The first polynomials $P_l^m(x)$ are given in Table 2.4. The number of associated Legendre polynomials up to the order $l$ is $(l + 1)(l + 2)/2$.

2.6.5 Spherical harmonics

Some global problems deal with the variations of some geochemical parameters on the surface of the Earth. Problems of that sort have recently appeared when, for example, the world-wide distribution of the 206 Pb/ 204 Pb and other isotopic ratios


in oceanic basalts permitted Dupre and Allegre (1983) to identify large-scale anomalous regions in the southern hemisphere for which Hart (1984) coined the name of 'Dupal anomaly'. The need to account for these variations with series of functions orthogonal on a sphere meets that encountered by geophysicists when trying to extract the most significant part of the variations in the gravity equipotential (geoid). These problems can be dealt with adequately using spherical harmonics. Spherical harmonics closely resemble normal Fourier harmonics except that they are functions of both the latitude and the longitude instead of the linear abscissa on a standard axis. Bi-dimensional Fourier analysis on a plane exists but is inadequate since the most desirable property of the requested expansion is the orthogonality of its components upon integration over the surface of the Earth, assumed to be spherical for most practical purposes. The standard coordinates on the surface of a sphere of radius r, i.e., the spherical coordinates, are the longitude $\phi$ and the co-latitude $\theta$ (co-latitude is $\pi/2$ minus the latitude). Orthogonality over the surface of a sphere of two distinct functions $f(\phi, \theta)$ and $g(\phi, \theta)$ takes the usual form but in two dimensions

$$\int_{\text{the Earth surface}} f(\phi, \theta)\,g(\phi, \theta)\,dS(\phi, \theta) = 0$$ (2.6.18)

and

$$\int_{\text{the Earth surface}} f^2(\phi, \theta)\,dS(\phi, \theta) = \text{const}$$ (2.6.19)

where the functional dependence of the surface element dS on $\phi$ and $\theta$ is to be determined. From Figure 2.13, we see that the arc length elements $dl_\phi = r \sin\theta\,d\phi$ and $dl_\theta = r\,d\theta$ satisfy the length conditions since (1) integrating $r \sin\theta\,d\phi$ at constant $\theta$ for $\phi$ varying from 0 to $2\pi$ gives the circumference $2\pi r \sin\theta$ of the small circle, and (2) integrating $r\,d\theta$ at constant $\phi$ for $\theta$ varying from $-\pi$ to $\pi$ gives the circumference $2\pi r$ of the large circle. Likewise the surface element $dS = dl_\theta\,dl_\phi$ satisfies the surface condition

$$\int_0^{\pi}\left[\int_0^{2\pi} r \sin\theta\,d\phi\right] r\,d\theta = r^2 \int_0^{2\pi} d\phi \int_0^{\pi} \sin\theta\,d\theta = r^2 \int_0^{2\pi} d\phi \int_{-1}^{1} d(\cos\theta) = r^2 \times 2\pi \times 2 = 4\pi r^2$$ (2.6.20)

which is the surface S of the sphere. Further calculations will be carried out with reference to a sphere of unit radius, which can always be arrived at by proper

normalization. We get

$$S = \int_{\text{whole sphere}} dS = \int_0^{2\pi} d\phi \int_{\theta=0}^{\theta=\pi} d(-\cos\theta) = 4\pi$$

which shows that the surface element is $dS = -d\phi\,d(\cos\theta)$.

Figure 2.13 The system of spherical coordinates: $\phi$ is the longitude and $\pi/2 - \theta$ is the latitude.

As in the case of Fourier components, we will refrain from using complex variables. Instead, we will handle spherical harmonics as two separate sets of orthogonal functions $C_l^m(\phi, \theta)$ and $S_l^m(\phi, \theta)$ such that

Cr(,0)=

(2.6.21)

4TT

(2.6.22)

™^ 9) =

4n

Example: —^ -si 47r(2+l)! i.e.,

/

i 16n

Figure 2.14 The spherical harmonic $S_3^2(\phi, \theta)$, plotted against longitude and latitude.

Figure 2.14 shows the spherical harmonic $S_3^2(\phi, \theta)$. The normalization property of associated Legendre polynomials stated above guarantees that these functions are orthogonal over the surface of a sphere

[2

(2.6.23)

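A bounded function $f(\phi, \theta)$ can be expanded as an infinite series of $C_l^m(\phi, \theta)$ and $S_l^m(\phi, \theta)$ over the surface of the sphere

$$f(\phi, \theta) = \sum_{l=0}^{\infty} \sum_{m=0}^{l} \left[\alpha_{lm}\,C_l^m(\phi, \theta) + \beta_{lm}\,S_l^m(\phi, \theta)\right]$$ (2.6.24)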

where the $\alpha_{lm}$ and $\beta_{lm}$ coefficients are to be found from an integral expression similar to that used for Fourier coefficients. For a given l there is a total of $(l+1)(l+2) - (l+1) = (l+1)^2$ $\alpha_{lm}$ and $\beta_{lm}$ coefficients (since all $\beta_{l0}$ are zero), e.g., 49 coefficients are needed to expand a function in spherical harmonics to the order 6. Likewise, $\alpha_{00}$ is the average value of the function over the sphere. A typical application of spherical harmonics will be given in Chapter 5.

3 Useful numerical analysis

3.1 Functions of a single variable

3.1.1 Derivatives

The derivative of the function f(x) with respect to the independent variable x is the scalar number defined as

$$f'_x(x) = \frac{df(x)}{dx} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$$

$f'_x(x_0)$ represents the slope of the tangent to the curve y = f(x) at $x = x_0$. An extremum, i.e., either a minimum or a maximum, corresponds to a null derivative. The quantity

$$\frac{d^n f(x)}{dx^n}$$

is the derivative of order n of f(x) with respect to the variable x and is obtained by applying the derivation formula n times.

The log derivative of a function f(x) is the derivative of ln f(x), i.e.,

$$\frac{d \ln f(x)}{dx} = \frac{f'_x(x)}{f(x)}$$

The logarithmic differential of the function f(x) is defined as df(x)/f(x). If f(x) can be written as a ratio of two functions g(x)/h(x), then

$$\frac{df(x)}{f(x)} = \frac{dg(x)}{g(x)} - \frac{dh(x)}{h(x)}$$ (3.1.2)

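Example:

$$d \ln\left(\frac{x^2}{1+x}\right) = \frac{d(x^2)}{x^2} - \frac{d(1+x)}{1+x} = \left(\frac{2}{x} - \frac{1}{1+x}\right) dx$$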

Optimum spike addition (Webster, 1960). We now deal with a specific example that can be extended to any isotopic pair. Let us assume that a $^{150}$Nd spike is added to 100 mg of a sample with ca. 10 ppm Nd in order to measure sample neodymium concentrations. Isotopic proportions of the 148 and 150 isotopes are 5.73 and 5.62 percent in natural Nd (molar weight 144.24), and 0.40 and 97.25 percent, respectively, in the Oak Ridge $^{150}$Nd spike. Calculate the optimum amount of a spike containing 12.5 nmol $^{150}$Nd per gram of solution to be added to this sample which minimizes the error on the calculated sample concentration. Assume that sample and spike Nd isotope compositions are perfectly known.

Let us consider the two isotopes $^{148}$Nd and $^{150}$Nd and measure the isotope composition of the spiked mixture. Using the subscripts sa, sp, and m for sample, spike, and mixture, it was shown in Section 1.3 how to calculate the spiking ratio r

$$r = \frac{N_{sa}^{150Nd}}{N_{sp}^{150Nd}} = \frac{\left(\frac{^{148}Nd}{^{150}Nd}\right)_{sp} - \left(\frac{^{148}Nd}{^{150}Nd}\right)_{m}}{\left(\frac{^{148}Nd}{^{150}Nd}\right)_{m} - \left(\frac{^{148}Nd}{^{150}Nd}\right)_{sa}}$$

where $N_{sa}^{150Nd}$ and $N_{sp}^{150Nd}$ refer to the numbers of $^{150}$Nd atoms of sample and spike, respectively, present in the mixture. Using x to represent the $^{148}$Nd/$^{150}$Nd ratio, we get the more compact form

$$r = \frac{x_{sp} - x_m}{x_m - x_{sa}}$$

and try to find the value of $x_m$ which makes the relative error on r a minimum. Assuming that the isotopic compositions $x_{sa}$ and $x_{sp}$ are perfectly known for the spike and sample, the relative error dr/r on the spiking ratio r is

$$\frac{dr}{r} = d \ln r = \frac{d \ln r}{dx_m}\,dx_m = -\frac{x_m (x_{sp} - x_{sa})}{(x_{sp} - x_m)(x_m - x_{sa})}\,\frac{dx_m}{x_m} = \gamma\,\frac{dx_m}{x_m}$$

The relative error on the measurement $dx_m/x_m$ is amplified by the factor $\gamma$ to give the relative error on the spiking ratio dr/r. The amplification term $\gamma$ goes to infinity for the extreme cases $x_m = x_{sa}$ (no spike) and $x_m = x_{sp}$ (no sample). Given that $(x_{sp} - x_{sa})$ is constant, $\gamma$ is minimum for

$$\frac{d}{dx_m}\left[\frac{x_m}{(x_{sp} - x_m)(x_m - x_{sa})}\right] = 0$$

or

$$(x_{sp} - x_m)(x_m - x_{sa}) - x_m(-x_m + x_{sa} + x_{sp} - x_m) = 0$$

which is equivalent to

$$x_m^2 = x_{sa}\,x_{sp}$$ (3.1.3)

Figure 3.1 Optimization of spike addition for the isotope dilution technique: $\gamma$ is the error amplification factor, plotted against the $^{148}$Nd/$^{150}$Nd ratio in the sample-spike mixture, $x_m$.

and hence

$$x_m = \sqrt{x_{sa}\,x_{sp}}$$ (3.1.4)

Relative error on isotope dilution measurements is minimum whenever the mixture isotopic ratio is the geometric mean (the square root of the product) of the spike and sample isotopic ratios. In the present case, $x_{sa}$ = 5.73/5.62 = 1.0196, and $x_{sp}$ = 0.40/97.25 = 0.004113. We can plot the amplification factor modulus $|\gamma|$ as a function of the isotopic ratio in the mixture (Figure 3.1). Minimum amplification is obtained for

$$x_m = \sqrt{1.0196 \times 0.004113} = 0.0648$$

which corresponds to a spiking ratio r = 0.06351. The assay contains approximately

$$\frac{0.100\ \text{g} \times 10 \times 10^{-6}}{144.24\ \text{g mol}^{-1}} = 6.93\ \text{nmol Nd}$$

or 0.0562 x 6.93 nmol = 0.3895 nmol of natural $^{150}$Nd. Addition of

$$\frac{0.3895}{r} = \frac{0.3895}{0.06351} = 6.133\ \text{nmol}\ ^{150}\text{Nd}$$

from the spike, amounting to

$$\frac{6.133\ \text{nmol}\ ^{150}\text{Nd}}{12.5\ \text{nmol}\ ^{150}\text{Nd per gram of solution}} = 0.491\ \text{g}$$

of spike solution, minimizes error propagation on concentration.
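The arithmetic of this example can be retraced with a short script. It is a sketch only: it assumes the compact spiking ratio $r = (x_{sp} - x_m)/(x_m - x_{sa})$ and the optimum $x_m = \sqrt{x_{sa} x_{sp}}$, and all variable names are hypothetical.

```python
import math

# 148Nd/150Nd ratios from the isotopic abundances quoted in the text
x_sa = 5.73 / 5.62        # natural Nd (sample)
x_sp = 0.40 / 97.25       # Oak Ridge 150Nd spike


def gamma(x_m):
    """Error amplification factor, dr/r = gamma * dx_m/x_m."""
    return -x_m * (x_sp - x_sa) / ((x_sp - x_m) * (x_m - x_sa))


# Optimum mixture ratio and the corresponding spiking ratio
x_opt = math.sqrt(x_sa * x_sp)
r_opt = (x_sp - x_opt) / (x_opt - x_sa)     # N_sa(150Nd) / N_sp(150Nd)

# 100 mg of sample at 10 ppm Nd, molar weight 144.24 g/mol
nmol_nd = 0.100 * 10 * 1e-6 / 144.24 * 1e9
nmol_150_sample = 0.0562 * nmol_nd          # 5.62 percent is 150Nd
nmol_150_spike = nmol_150_sample / r_opt
grams_spike = nmol_150_spike / 12.5         # 12.5 nmol 150Nd per gram

print(f"x_m = {x_opt:.4f}, r = {r_opt:.5f}, gamma = {gamma(x_opt):.3f}")
print(f"spike: {nmol_150_spike:.3f} nmol 150Nd = {grams_spike:.3f} g of solution")
```

Evaluating `gamma` on either side of `x_opt` is a quick way to confirm that the amplification factor increases away from the geometric mean.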

3.1.2 Equation of the tangent to a curve

Tangents to curves and surfaces play a key role for a certain number of petrologic or geochemical systems which undergo infinitesimal changes. Finding the compositional changes associated with the segregation of a mineral phase and deciding whether a system is stable relative to small perturbations are problems that commonly need the tangent equations to be found. Given the curve with equation y = f(x), the limit of the chord intersecting the curve in $x_0$ and x is the tangent to the curve at $x = x_0$. The slope s of this tangent is given by

$$s = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \left.\frac{df(x)}{dx}\right|_{x = x_0}$$

while the tangent equation must satisfy

$$\frac{y - f(x_0)}{x - x_0} = s$$

and therefore

$$y = f(x_0) + f'_x(x_0)\,(x - x_0)$$ (3.1.5)

Cumulate control lines vs liquid lines of descent. In a study of the 1931-1986 basaltic eruptions from the Reunion Island (Indian Ocean), Albarede and Tamagnan (1988) found the Ni and Cr concentrations (in ppm) listed in Table 3.1 and shown in Figure 3.2. Samples with Ni > 100 ppm were found to contain large amounts of cumulus olivine. The last five samples of Table 3.1 are picrites. The smooth trend observed for all the rocks is suggestive of a complementary fractionation-accumulation relationship out of a single magma batch. Therefore, it is asked whether the trend observed for Ni and Cr in basalts with less than 100 ppm Ni may be ascribed to the removal of the olivine found in the cumulates. When minerals fractionate from a crystallizing magma, mass balance requires that, in a linear plot of Cr vs Ni, the points representing the composition of the parental

Table 3.1. Ni and Cr concentrations in ppm in Piton de la Fournaise lavas, Reunion Island, 1931-1986 (Albarede and Tamagnan, 1988).

Ni     Cr      Ni     Cr      Ni     Cr      Ni     Cr
55     58      74     191     79     230     144    494
56     76      74     181     80     166     146    448
61     66      74     145     81     189     157    523
62     82      75     224     83     223     203    555
65     150     75     177     88     236     462    903
67     128     77     157     88     248     802    1547
70     106     77     203     96     272     870    1700
71     123     78     184     100    363     890    1635
72     124     79     181     112    380     975    1740
73     160     79     172     118    442

Figure 3.2 Plot in log-log scales of the Ni and Cr concentrations (ppm) in post-1930 basalts (open symbols) and picrites (solid symbols) from the Piton de la Fournaise volcano, Reunion Island, Indian Ocean (Albarede and Tamagnan, 1988).

magma, the residual magma and the average cumulate define a straight line (see Chapter 1 and Figure 3.3). In order to define the composition of the solid in equilibrium with the magma at a given point of its fractionation history, let the residual magma composition approach that of the parental magma. We conclude that the instantaneous cumulate must lie on the tangent to the locus of liquids (the so-called liquid line of descent) at the point representing the magma (Figure 3.3). The tangent line is also the locus of all combinations between the magma and the instantaneous cumulates, i.e., the locus of the cumulative rocks belonging to a magmatic stage with a unique differentiation extent: this is the cumulate control line (by reference to the widely used olivine control line). The model developed below is from Albarede (1976).

Figure 3.3 Schematic residual magma-cumulate relationships for instantaneous and average magmatic products in a linear plot of Ni and Cr concentrations. For mass balance to be obeyed, the instantaneous cumulate must lie on the tangent to the liquid line of descent at the point representing the instantaneous liquid.

In the present case, the question is whether Reunion olivine-rich rocks belong to a cumulate control line of the basaltic rocks. Through a simple log-log regression, we find that the basalt trend (= the liquid line of descent) can be approximated by a power law, a form justified in Sections 1.5 and 9.3.

$$C_{liq}^{Cr} = a\,(C_{liq}^{Ni})^b$$ (3.1.6)

The equation of the tangent to this curve in the (Ni, Cr) plane at $C_{Ni} = C_{liq}^{Ni}$ and $C_{Cr} = C_{liq}^{Cr}$ is

$$C_{Cr} = C_{liq}^{Cr} + \frac{dC_{liq}^{Cr}}{dC_{liq}^{Ni}}\,(C_{Ni} - C_{liq}^{Ni})$$ (3.1.7)

with

$$\frac{dC_{liq}^{Cr}}{dC_{liq}^{Ni}} = ab\,(C_{liq}^{Ni})^{b-1}$$ (3.1.8)

From the last two equations, we get

$$C_{Cr} = C_{liq}^{Cr} + ab\,(C_{liq}^{Ni})^{b-1}\,(C_{Ni} - C_{liq}^{Ni})$$ (3.1.9)

which can be rewritten

$$C_{Cr} - C_{liq}^{Cr} = ab\,\frac{(C_{liq}^{Ni})^b}{C_{liq}^{Ni}}\,(C_{Ni} - C_{liq}^{Ni})$$ (3.1.10)

Finally, comparing with equation (3.1.6), we get

$$\frac{C_{Cr} - C_{liq}^{Cr}}{C_{Ni} - C_{liq}^{Ni}} = b\,\frac{C_{liq}^{Cr}}{C_{liq}^{Ni}}$$ (3.1.11)

The slope of the linear array defined by the olivine-rich lavas (olivine control line) with more than 100 ppm Ni is found by linear regression to be $(C_{Cr} - C_{liq}^{Cr})/(C_{Ni} - C_{liq}^{Ni}) = 1.6$, whence we conclude that the liquid which could be involved in the olivine-rich lavas has a ratio Cr/Ni = 1.6/b = 1.6/2.8 ≈ 0.6, which is well out of the range of the Cr/Ni ratios in basalts (1 to 3). Therefore, olivine-rich lavas are not cumulative rocks genetically related with the basaltic sequence. The large value of b is probably related to the presence of spinel on the liquidus. This does not exclude, however, that they may be cumulates from an earlier differentiation stage with a smaller b value, i.e., before spinel saturation.
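The two regressions behind this argument can be sketched as follows. The data are transcribed from Table 3.1. The use of ordinary least squares, and the choice to put the sample with exactly 100 ppm Ni in neither group, are assumptions of this sketch and may differ from the original treatment, so the numbers should only be expected to land close to b = 2.8 and a slope of 1.6.

```python
import math

# Ni and Cr concentrations (ppm) transcribed from Table 3.1
ni = [55, 56, 61, 62, 65, 67, 70, 71, 72, 73,
      74, 74, 74, 75, 75, 77, 77, 78, 79, 79,
      79, 80, 81, 83, 88, 88, 96, 100, 112, 118,
      144, 146, 157, 203, 462, 802, 870, 890, 975]
cr = [58, 76, 66, 82, 150, 128, 106, 123, 124, 160,
      191, 181, 145, 224, 177, 157, 203, 184, 181, 172,
      230, 166, 189, 223, 236, 248, 272, 363, 380, 442,
      494, 448, 523, 555, 903, 1547, 1700, 1635, 1740]


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


basalts = [(n, c) for n, c in zip(ni, cr) if n < 100]
olivine_rich = [(n, c) for n, c in zip(ni, cr) if n > 100]

# Exponent b of the power law: slope of ln Cr vs ln Ni in the basalts
b = ols_slope([math.log(n) for n, _ in basalts],
              [math.log(c) for _, c in basalts])

# Slope of the olivine control line in linear coordinates
s = ols_slope([n for n, _ in olivine_rich], [c for _, c in olivine_rich])

# Cr/Ni ratio required of the liquid by the tangent condition
cr_ni_liquid = s / b
print(f"b = {b:.2f}, control-line slope = {s:.2f}, Cr/Ni = {cr_ni_liquid:.2f}")
```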
b

sol •

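At first sight, the answer would be $\Delta G = \Delta G_0 = G_{liq}(X_{liq}^b) - G_{sol}(X_{sol}^b)$, but we will show that this is wrong. We assume that $dn = dn^a + dn^b$ moles are transferred from the liquid to the solid phase. Let us assign the symbol $\mu$ to chemical potentials, e.g., $\mu_{liq}^a$ for species a in the liquid. Then

$$dn\,\Delta G = \mu_{sol}^a\,dn_{sol}^a + \mu_{sol}^b\,dn_{sol}^b + \mu_{liq}^a\,dn_{liq}^a + \mu_{liq}^b\,dn_{liq}^b$$

We have assumed that $dn = dn_{sol}^a + dn_{sol}^b$ and therefore

$$dn_{sol}^a = -dn_{liq}^a, \qquad dn_{sol}^b = -dn_{liq}^b$$ (3.1.12)

hence

$$dn\,\Delta G = (\mu_{sol}^a - \mu_{liq}^a)\,dn_{sol}^a + (\mu_{sol}^b - \mu_{liq}^b)\,dn_{sol}^b$$ (3.1.13)

The molar fraction of b in the newly formed solid is defined as

$$X_{sol}^b = \frac{dn_{sol}^b}{dn}$$

with a similar definition for the liquid fractions. The change $\Delta G$ in Gibbs energy upon transfer of dn moles is

$$\Delta G = (\mu_{sol}^a - \mu_{liq}^a)(1 - X_{sol}^b) + (\mu_{sol}^b - \mu_{liq}^b)\,X_{sol}^b$$ (3.1.14)

which, for reasons which will appear later, is rewritten as

$$\Delta G = X_{sol}^b\,\mu_{sol}^b + (1 - X_{sol}^b)\,\mu_{sol}^a - \left[X_{sol}^b\,\mu_{liq}^b + (1 - X_{sol}^b)\,\mu_{liq}^a\right]$$

This equation can be recast into

$$\Delta G = X_{sol}^b\,\mu_{sol}^b + (1 - X_{sol}^b)\,\mu_{sol}^a - \left[X_{liq}^b\,\mu_{liq}^b + (1 - X_{liq}^b)\,\mu_{liq}^a\right]$$
$$\qquad + (X_{liq}^b - X_{sol}^b)(\mu_{liq}^b - \mu_{liq}^a)$$ (3.1.15)

The terms on the right-hand side of the first line represent the difference between the molar free enthalpy of the solid solution at composition $X_{sol}^b$ and that of the liquid solution at composition $X_{liq}^b$. In addition, a standard result for binary systems (e.g., Swalin, 1962) states for G and $\mu$ the following relationship

$$\frac{dG}{dX^b} = \mu^b - \mu^a$$ (3.1.16)

Applying equation (3.1.16) to the liquid results in

$$\Delta G = G_{sol}(X_{sol}^b) - G_{liq}(X_{liq}^b) - (X_{sol}^b - X_{liq}^b)\left.\frac{dG_{liq}}{dX^b}\right|_{X_{liq}^b}$$ (3.1.17)

Let us label with an asterisk the value of the free enthalpy taken at $X_{sol}^b$ along the tangent to the curve of the liquid free enthalpy at $X_{liq}^b$ (Figure 3.4) so that

$$G_{liq}^{*}(X_{sol}^b) = G_{liq}(X_{liq}^b) + (X_{sol}^b - X_{liq}^b)\left.\frac{dG_{liq}}{dX^b}\right|_{X_{liq}^b}$$ (3.1.18)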

Figure 3.4 Change of Gibbs free enthalpy at the onset of crystallization of a solid solution with composition $X_{sol}^b$ from a liquid solution with composition $X_{liq}^b$. The change corresponds to $\Delta G$, not to $\Delta G_0$.

which gives the simpler expression

$$\Delta G = G_{sol}(X_{sol}^b) - G_{liq}^{*}(X_{sol}^b)$$ (3.1.19)

The change in Gibbs free enthalpy is therefore the difference measured at $X_{sol}^b$ between the G value of the solid and that taken along the tangent to the liquid curve (Figure 3.4). At equilibrium, $G_{liq}$ and $G_{sol}$ have a common tangent and therefore $\Delta G = 0$.

Such a calculation can be extended to other molar quantities such as the volume change ($\Delta V$) or the entropy change ($\Delta S$). Unlike $\Delta G$, the quantities $\Delta V$ and $\Delta S$ do not vanish at equilibrium. Although they derived the same tangent rule through a significantly more complicated theory, Walker et al. (1988) showed that in the context of olivine flotation in melts at high pressure, the slope of pressure-temperature (P, T) equilibrium univariant curves given by the Clapeyron rule (e.g., Denbigh, 1968) cannot be used to retrieve the relevant information on the actual change in density upon melting and crystallization in the mantle.

3.1.3 Leibniz's rule for the derivative of a definite integral

Given f(x, t) a function of time and of another variable x, and given the time-dependent limits $\alpha(t)$ and $\beta(t)$, Leibniz's rule states that

$$\frac{d}{dt}\int_{\alpha(t)}^{\beta(t)} f(x, t)\,dx = \int_{\alpha(t)}^{\beta(t)} \frac{\partial f(x, t)}{\partial t}\,dx + f[\beta(t), t]\,\frac{d\beta}{dt} - f[\alpha(t), t]\,\frac{d\alpha}{dt}$$ (3.1.20)

Example:

$$\frac{d}{dt}\int_0^{kt} \sin(2tx)\,dx = \int_0^{kt} 2x \cos(2tx)\,dx + \sin(2kt^2) \times k - \sin(0) \times 0$$

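3.1.4 Taylor series

The Taylor expansion of the function f(x) about the value $x_0$ is

$$f(x) = f(x_0) + (x - x_0)\,f'(x_0) + \frac{(x - x_0)^2}{2!}\,f''(x_0) + \ldots$$

or in compact form

$$f(x) = \sum_{n=0}^{\infty} \frac{(x - x_0)^n}{n!}\,f^{(n)}(x_0)$$ (3.1.21)

where the exclamation mark stands for the factorial expansion. The Taylor-McLaurin expansion of the function f(x) about $x_0 = 0$ is a particular case of equation (3.1.21)

$$f(x) = f(0) + x\,f'(0) + \frac{x^2}{2!}\,f''(0) + \ldots$$

It is particularly useful for approximating some functions in the vicinity of x = 0:

(a) the exponential function

$$e^x = e^0 + x\,e^0 + \frac{x^2}{2!}\,e^0 + \frac{x^3}{3!}\,e^0 + \ldots = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \ldots$$ (3.1.22)

(b) the natural logarithm function

$$\ln(1 + x) = \ln(1 + 0) + \frac{x}{1 + 0} - \frac{x^2}{2!\,(1 + 0)^2} + \frac{2x^3}{3!\,(1 + 0)^3} + \ldots = x - \frac{x^2}{2} + \frac{x^3}{3} - \ldots$$

(c) the power series

$$(1 + x)^a = 1 + ax + \frac{a(a-1)}{2!}\,x^2 + \frac{a(a-1)(a-2)}{3!}\,x^3 + \ldots$$ (3.1.23)

with the particular case for a = -1

$$\frac{1}{1 + x} = 1 - x + x^2 - x^3 + \ldots$$ (3.1.24)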

Isotopic fractionation provides illustrative examples of first-order expansions of unknown functions. In general, the mass spectrometric measurement $r_i^j$ of the ratio between two isotopes of mass $m_i$ and $m_j$ of the same element differs from the natural value $R_i^j$. Only a very small fraction of the original sample produces ions and different processes taking place in different parts of the mass spectrometer act differently on the sensitivity of each isotope. We assume that instrumental isotopic fractionation is mass-dependent.

Equilibrium fractionation. A simple fractionation law, called the linear law (e.g., Hofmann, 1971), relates the measured and natural isotopic ratios through a function $f(\Delta m_i^j)$ of the mass difference $\Delta m_i^j = m_j - m_i$ between the isotopes defining the ratios

$$r_i^j = R_i^j\,f(\Delta m_i^j)$$ (3.1.25)

Following Russell et al. (1978), we write the Taylor expansion of $f(\Delta m_i^j)$ about $\Delta m_i^j = 0$ and get

$$f(\Delta m_i^j) = f(0) + \Delta m_i^j\,f'(0) + \text{higher-order terms}$$

and drop the terms which involve derivatives of order higher than one. As one isotope cannot be fractionated relative to itself ($\Delta m_i^j = 0$), f(0) = 1 and the linear fractionation law reads

$$r_i^j = R_i^j\,(1 + \Delta m_i^j\,\delta)$$ (3.1.26)

where $\delta = f'(0)$ is a constant coefficient called the mass discrimination or mass bias per mass unit. Let us take the example of the $^{148}$Nd/$^{150}$Nd ratio

$$\left(\frac{^{148}Nd}{^{150}Nd}\right)_{measured} = \left(\frac{^{148}Nd}{^{150}Nd}\right)_{natural} (1 - 2\delta)$$

In a ratio-ratio plot (Figure 3.5), typically $r_i^{j_2}$ vs $r_i^{j_1}$, the ratios combine as

$$\frac{r_i^{j_2} - R_i^{j_2}}{r_i^{j_1} - R_i^{j_1}} = \frac{R_i^{j_2}\,\Delta m_i^{j_2}\,\delta}{R_i^{j_1}\,\Delta m_i^{j_1}\,\delta} = \frac{R_i^{j_2}\,\Delta m_i^{j_2}}{R_i^{j_1}\,\Delta m_i^{j_1}}$$

which shows that they are linearly related.

Mass discrimination with distillation effects. Let us assume that the isotope composition of an element is being measured by thermal ionization. This method consists in ionizing the sample atoms by evaporation on a metal filament. Statistical thermodynamics (e.g., Denbigh, 1968) tells us that, while vapor pressure is a function of the molecular weight of the isotope, the fraction evaporated and ionized is a complex function of ionization potential, temperature, work-function of the filament, etc. At a given time, we assume that the ionization parameters are fixed. Therefore, the ionization probability of an isotope i and, equivalently, the proportion of atoms on the filament that eventually ionizes per unit time is a function of $m_i$ only. Let $g(m_i)$ be that proportion. If we call $n_i$ the number of ionized atoms of isotope i, we can write

$$\frac{dn_i}{dt} = -g(m_i)\,n_i$$ (3.1.27)

Figure 3.5 The domain of the linear law for mass-dependent discrimination between two isotopic ratios.

For two isotopes i and j, we combine expressions (3.1.27) as

$$\frac{dn_j}{n_j\,dt} - \frac{dn_i}{n_i\,dt} = \frac{d \ln n_j - d \ln n_i}{dt} = \frac{d \ln r_i^j(t)}{dt} = -\left[g(m_j) - g(m_i)\right]$$

Expanding the function g in a Taylor series to the first order with respect to m, we get the approximation

$$g(m_j) \approx g(m_i) + (m_j - m_i)\,g'_m(m_i)$$

or

$$\frac{d \ln r_i^j(t)}{dt} \approx -(m_j - m_i)\,g'_m(m_i)$$

Upon integration, this equation becomes

$$r_i^j(t) = r_i^j(0)\,\exp\left[-\Delta m_i^j\,g'_m(m_i)\,t\right] = r_i^j(0)\left[\exp\left(-g'_m(m_i)\,t\right)\right]^{\Delta m_i^j}$$ (3.1.28)

where the pre-exponential term represents the isotopic ratio of the first fraction evaporated from the filament. This ratio is at equilibrium with the ratio of the sample initially on the filament. From the earlier discussion, we can express $r_i^j(0)$ as

$$r_i^j(0) = R_i^j\,(1 + \Delta m_i^j\,\delta)$$

Two limiting cases arise depending on the intensity of the distillation effect. If distillation is not important, we can expand the distillation exponential term to the first degree as

$$r_i^j(t) \approx R_i^j\,(1 + \Delta m_i^j\,\delta)\left[1 - \Delta m_i^j\,g'_m(m_i)\,t\right] \approx R_i^j\left\{1 + \Delta m_i^j\left[\delta - g'_m(m_i)\,t\right]\right\}$$

where the second-order terms are neglected. $[\delta - g'_m(m_i)\,t]$ is known as the time-dependent mass fractionation per mass unit. This is the widely used time-dependent linear law of mass fractionation suitable for large samples. For small samples, however, mass fractionation is important and the cumulative effects are dominant so $\delta$ can be neglected, resulting in

$$r_i^j(t) = R_i^j\left[\exp\left(-g'_m(m_i)\,t\right)\right]^{\Delta m_i^j}$$

This relationship is known as the power law of mass fractionation (Wasserburg et al., 1981; Hart and Zindler, 1989). Writing $\alpha(t)$ for $-g'_m(m_i)\,t$, the mass fractionation per mass difference unit, the $^{148}$Nd/$^{150}$Nd ratio ($\Delta m_i^j = -2$) would change with $\alpha(t)$ according to a power law if

3.1.5 Roots of implicit equations and extrema of functions: the Newton method

Some equations such as f(x) = 0 cannot be explicitly solved for x. If multiple solutions are not expected in a narrow range, Newton's method is often simple to implement and has faster convergence than the natural method of interval splitting. The method is recursive and uses the first-order expansion of f(x) in the vicinity of the kth guess

$$f[x^{(k+1)}] \approx f[x^{(k)}] + \left[x^{(k+1)} - x^{(k)}\right] f'[x^{(k)}]$$

Expressing our goal that $f[x^{(k+1)}] = 0$ results in

$$x^{(k+1)} = x^{(k)} - \frac{f[x^{(k)}]}{f'[x^{(k)}]}$$ (3.1.29)

The Newton method can also be used to find the extremum (maximum or minimum) of a function f(x), i.e., the value for which the first derivative f'(x) is zero. The iterative search for the extremum is implemented by a formula derived from equation (3.1.29)

$$x^{(k+1)} = x^{(k)} - \frac{f'[x^{(k)}]}{f''[x^{(k)}]}$$ (3.1.30)

The extremum X is a minimum if the second derivative f''(X) is positive, a maximum if f''(X) is negative.

Find the square root of 2. This amounts to solving

$$f(x) = x^2 - 2 = 0$$

Pretending that we ignore the result, we calculate f'(x) = 2x and take 3 as the initial guess $x^{(0)}$. Hence, $f[x^{(0)}] = 3^2 - 2 = 7$, $f'[x^{(0)}] = 2 \times 3 = 6$ and

$$x^{(1)} = 3 - (7/6) = 1.8333$$

Table 3.2 lists the results for four iterations. After four iterations, the estimate reproduces the true value of $\sqrt{2}$ to the fifth decimal place.

Table 3.2. Calculation of $\sqrt{2}$ as a solution to the equation $x^2 - 2 = 0$ by the Newton method. Compare with the true value of 1.41421.

Step k    x(k)       f[x(k)]    f'[x(k)]   f[x(k)]/f'[x(k)]
0         3.00000    7.00000    6.00000    1.16667
1         1.83333    1.36111    3.66667    0.37121
2         1.46212    0.13780    2.92424    0.04712
3         1.41500    0.00222    2.83000    0.00078
4         1.41421    0.00000    2.82843    0.00000

Find the roots of the equation

$$g(x) = \tan x - \frac{x}{1 + x^2} = 0$$

which appears in connection with diffusion from a sphere into a finite volume (see Chapter 8). x (in radians) is the intersection of the periodic function tan x with the single-valued function $x(1 + x^2)^{-1}$. Since tan x goes to infinity for $x = (2n+1)\pi/2$ ($n = 0, \pm 1, \pm 2, \ldots$), there is one of these intersections per interval $[(2n-1)\pi/2, (2n+1)\pi/2]$. Consequently, we take $2n\pi/2 = n\pi$ as the initial guess for each interval. In addition, let us replace g(x) by f(x) such that

$$f(x) = (1 + x^2) \tan x - x$$

which is easier to handle and has the same roots since $(1 + x^2)$ is strictly positive. The derivative f'(x) is

$$f'(x) = 2x \tan x + (1 + x^2)(1 + \tan^2 x) - 1$$

Let us try for n = +5, i.e., with $5\pi$ = 15.708 as the initial guess $x^{(0)}$. The first step gives

$$f[x^{(0)}] = (1 + 25\pi^2) \times 0 - 5\pi = -15.708$$
$$f'[x^{(0)}] = 2 \times 5\pi \times 0 + (1 + 25\pi^2)(1 + 0^2) - 1 = 246.739$$
$$x^{(1)} = 5\pi - (-15.708/246.739) = 15.772$$

and the second step

$$f[x^{(1)}] = (1 + 15.772^2) \tan 15.772 - 15.772 = 0.1492$$
$$f'[x^{(1)}] = 2 \times 15.772 \times \tan 15.772 + (1 + 15.772^2)(1 + \tan^2 15.772) - 1 = 251.770$$
$$x^{(2)} = 15.772 - (0.1492/251.770) = 15.771$$
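Both worked examples (the square root of 2 and the root of $(1 + x^2)\tan x - x = 0$ near $5\pi$) can be reproduced with a generic Newton iteration, $x \leftarrow x - f(x)/f'(x)$. The helper `newton`, its tolerance and its iteration cap are hypothetical choices made for this sketch.

```python
import math


def newton(f, fprime, x0, tol=1e-10, max_iter=50):
    """Newton iteration x <- x - f(x)/f'(x), stopped when the step is below tol."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:
            break
    return x


# Square root of 2, starting from the guess 3 used in the text
root2 = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 3.0)


# Root of f(x) = (1 + x^2) tan x - x near 5*pi
def f(x):
    return (1.0 + x * x) * math.tan(x) - x


def fprime(x):
    return 2.0 * x * math.tan(x) + (1.0 + x * x) * (1.0 + math.tan(x) ** 2) - 1.0


root_tan = newton(f, fprime, 5.0 * math.pi)
print(f"sqrt(2) = {root2:.5f}, root near 5*pi = {root_tan:.4f}")
```

The derivative is supplied analytically here, as in the text. When it is not available, a secant approximation is a common substitute, at the price of slower convergence.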

The improvement is not significant, so we stop here.

A primary isochron $^{207}$Pb/$^{204}$Pb vs $^{206}$Pb/$^{204}$Pb on a series of rock samples gives a slope of 0.256. Calculate the age T of the isochron. Standard textbooks on chronology (e.g., Faure, 1986) give the slope of a $^{207}$Pb/$^{204}$Pb vs $^{206}$Pb/$^{204}$Pb isochron, so that T is a solution of

$$f(T) = \frac{1}{137.88}\,\frac{e^{\lambda_{235U} T} - 1}{e^{\lambda_{238U} T} - 1} - 0.256 = 0$$

Taking the derivative relative to T, we get

$$f'(T) = \frac{1}{137.88}\,\frac{\lambda_{235U}\,e^{\lambda_{235U} T}\left(e^{\lambda_{238U} T} - 1\right) - \lambda_{238U}\,e^{\lambda_{238U} T}\left(e^{\lambda_{235U} T} - 1\right)}{\left(e^{\lambda_{238U} T} - 1\right)^2}$$

Using $\lambda_{238U}$ = 0.155125 Ga$^{-1}$ and $\lambda_{235U}$ = 0.98485 Ga$^{-1}$ and an initial guess $T^{(0)}$ = 5 Ga, Table 3.3 lists the results of the first five iterations.

Table 3.3. Iterative calculation of the age of an isochron with a slope of 0.256 in the $^{207}$Pb/$^{204}$Pb vs $^{206}$Pb/$^{204}$Pb diagram by the Newton method.

Step k    T(k)       f[T(k)]    f'[T(k)]   f[T(k)]/f'[T(k)]
0         5.00000    0.58927    0.59555    0.98945
1         4.01055    0.17202    0.28647    0.60048
2         3.41007    0.03260    0.18549    0.17574
3         3.23433    0.00197    0.16359    0.01202
4         3.22231    0.00001    0.16219    0.00005
5         3.22226    0.00000    0.16219    0.00000
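The iteration of Table 3.3 can be sketched as below, assuming the isochron-slope function $f(T) = (e^{\lambda_{235U}T} - 1)/[137.88\,(e^{\lambda_{238U}T} - 1)] - 0.256$ and its analytical derivative. The constant and function names are hypothetical.

```python
import math

L238 = 0.155125    # decay constant of 238U, per Ga
L235 = 0.98485     # decay constant of 235U, per Ga
SLOPE = 0.256      # measured 207Pb/204Pb vs 206Pb/204Pb isochron slope


def f(T):
    """Isochron-slope equation, zero at the age T (in Ga)."""
    return (math.exp(L235 * T) - 1.0) / (137.88 * (math.exp(L238 * T) - 1.0)) - SLOPE


def fprime(T):
    """Analytical derivative of f with respect to T."""
    e5, e8 = math.exp(L235 * T), math.exp(L238 * T)
    return (L235 * e5 * (e8 - 1.0) - L238 * e8 * (e5 - 1.0)) / (137.88 * (e8 - 1.0) ** 2)


T = 5.0                        # initial guess, Ga
for k in range(6):
    correction = f(T) / fprime(T)
    print(f"{k}  {T:.5f}  {f(T):.5f}  {fprime(T):.5f}  {correction:.5f}")
    T -= correction

print(f"age = {T:.5f} Ga")
```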

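In a Concordia diagram $^{206}$Pb/$^{238}$U vs $^{207}$Pb/$^{235}$U, a series of zircons give a good alignment with a slope a = 0.043633 and an intercept $\beta$ = 0.094613. Calculate the ages at which this line intersects the Concordia. The equation of the Concordia is (Wetherill, 1956; Faure, 1986)

$$y = e^{\lambda_{238U} T} - 1, \qquad x = e^{\lambda_{235U} T} - 1$$

whereas the straight-line equation reads

$$y - ax - \beta = 0$$

These equations can be combined as

$$f(T) = e^{\lambda_{238U} T} - 1 - a\left(e^{\lambda_{235U} T} - 1\right) - \beta = 0$$

which has for derivative

$$f'(T) = \lambda_{238U}\,e^{\lambda_{238U} T} - a\,\lambda_{235U}\,e^{\lambda_{235U} T}$$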
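A minimal sketch of the search for the two intercepts, assuming $f(T) = e^{\lambda_{238U}T} - 1 - a(e^{\lambda_{235U}T} - 1) - \beta$ with derivative $f'(T) = \lambda_{238U}e^{\lambda_{238U}T} - a\lambda_{235U}e^{\lambda_{235U}T}$, and the decay constants 0.155125 and 0.98485 Ga$^{-1}$. The starting guesses of 5 and 0 Ga are chosen so that each intersection is approached from outside; the helper name and iteration count are hypothetical.

```python
import math

L238, L235 = 0.155125, 0.98485     # decay constants, per Ga
A, BETA = 0.043633, 0.094613       # slope and intercept of the zircon array


def f(T):
    """Distance between the Concordia and the zircon line, zero at the intercepts."""
    return math.exp(L238 * T) - 1.0 - A * (math.exp(L235 * T) - 1.0) - BETA


def fprime(T):
    return L238 * math.exp(L238 * T) - A * L235 * math.exp(L235 * T)


def newton_age(T, n_iter=20):
    """Fixed number of Newton corrections starting from the guess T (Ga)."""
    for _ in range(n_iter):
        T -= f(T) / fprime(T)
    return T


upper = newton_age(5.0)    # approached from older ages
lower = newton_age(0.0)    # approached from younger ages
print(f"intercepts at {lower:.4f} and {upper:.4f} Ga")
```

Because f'(T) changes sign between the two intercepts, a starting guess taken between them could converge to either root, which is why each search is started from outside.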

Using two different initial guesses (5 and 0 Ga) in order to approach the two intersections from different directions, Table 3.4 gives the results of the first iterations towards each intersection. The zircon alignment intersects the Concordia at 1.00 and 2.00 Ga.

Table 3.4. Iterative calculation of the two age intercepts with the Concordia curve for a zircon linear array having a slope a = 0.043633 and an intercept $\beta$ = 0.094613.

Step k    T(k)      f[T(k)]    f'[T(k)]   f[T(k)]/f'[T(k)]
First intercept:
1         5.0000    -4.8824    -5.5755    0.8757
2         4.1243    -1.6891    -2.2017    0.7672
3         3.3571    -0.5581    -0.9113    0.6124
4         2.7447    -0.1715    -0.4039    0.4245
5         2.3202    -0.0465    -0.1999    0.2326
6         2.0875    -0.0095    -0.1213    0.0784
7         2.0091    -0.0009    -0.0990    0.0090
8         2.0001     0.0000    -0.0965    0.0001
9         2.0000     0.0000    -0.0965    0.0000
Second intercept:
1         0.0000    -0.0946     0.1122    -0.8436
2         0.8436    -0.0113     0.0782    -0.1447
3         0.9883    -0.0008     0.0671    -0.0116
4         0.9999     0.0000     0.0661    -0.0001
5         1.0000     0.0000     0.0661     0.0000

A basaltic liquid with an FeO content $C_0^{FeO}$ of 10 percent and an MgO content $C_0^{MgO}$ of 12 percent crystallizes olivine (ol). Calculate the FeO and MgO contents $C_{liq}^{FeO}$ and $C_{liq}^{MgO}$ after 15 percent crystallization. Irvine (1977) has shown (see Section 1.5) that

$$\frac{C_{liq}^{FeO}}{C_0^{FeO}} = \frac{C_{liq}^{MgO}}{C_0^{MgO}}\left(F\,\frac{C_{liq}^{MgO}}{C_0^{MgO}}\right)^{K_D - 1}$$ (3.1.31)

$C_{fa}^{FeO}$ and $C_{fo}^{MgO}$ being the contents of FeO in fayalite and MgO in forsterite, the solution is (Chapter 1 and Albarede, 1992)

$$f(z) = \frac{C_0^{FeO}}{C_{fa}^{FeO}}\left(1 - z^{K_D}\right) + \frac{C_0^{MgO}}{C_{fo}^{MgO}}\,(1 - z) - (1 - F) = 0$$ (3.1.32)

where $K_D$ is the ratio (FeO/MgO)$_{ol}$/(FeO/MgO)$_{liq}$, F is the fraction of residual liquid (here 0.85), and the parameter z is defined as

$$z = F\,\frac{C_{liq}^{MgO}}{C_0^{MgO}}$$ (3.1.33)

Taking the derivative of f(z) with respect to z gives

$$f'(z) = -K_D\,\frac{C_0^{FeO}}{C_{fa}^{FeO}}\,z^{K_D - 1} - \frac{C_0^{MgO}}{C_{fo}^{MgO}}$$ (3.1.34)

Using a table of molar weights, we get

$$C_{fa}^{FeO} = 2 \times 71.85/203.78 = 70.52\ \text{percent}$$
$$C_{fo}^{MgO} = 2 \times 40.31/140.71 = 57.30\ \text{percent}$$

A natural starting value for z is F. The calculation listed in Table 3.5 was stopped once the relative deviation in z was less than 0.001. The final z value of 0.4306 is converted through equation (3.1.33) into

$$C_{liq}^{MgO} = 0.4306 \times 12/0.85 = 6.08\ \text{percent}$$

Table 3.5. Iterative calculation of the z value, equation (3.1.33), for the f(z) = 0 equation (3.1.32).

Step k    z         f(z)       f'(z)
0         0.8500    -0.1121    -0.2556
1         0.4115     0.0054    -0.2867
2         0.4305     0.0000    -0.2842
3         0.4306

and through equation (3.1.31) into

$$C_{liq}^{FeO} = 6.08 \times (10/12) \times (0.4306)^{0.29 - 1} = 9.21\ \text{percent}$$
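The calculation of Table 3.5 can be retraced with the sketch below. Two points are assumptions: the mass-balance form $f(z) = (C_0^{FeO}/C_{fa}^{FeO})(1 - z^{K_D}) + (C_0^{MgO}/C_{fo}^{MgO})(1 - z) - (1 - F)$, which reproduces the tabulated f and f' values, and the value $K_D$ = 0.29, which is read from the exponent used in the final FeO calculation. Variable names are hypothetical.

```python
# Starting liquid composition (weight percent) and residual liquid fraction
C0_FEO, C0_MGO = 10.0, 12.0
F = 0.85
KD = 0.29     # assumed from the exponent (0.29 - 1) used in the final step

# FeO in fayalite and MgO in forsterite (weight percent) from molar weights
CFA_FEO = 100.0 * 2 * 71.85 / 203.78
CFO_MGO = 100.0 * 2 * 40.31 / 140.71


def f(z):
    """Mass balance on the olivine removed; zero at the solution."""
    return (C0_FEO / CFA_FEO * (1.0 - z ** KD)
            + C0_MGO / CFO_MGO * (1.0 - z) - (1.0 - F))


def fprime(z):
    return -KD * C0_FEO / CFA_FEO * z ** (KD - 1.0) - C0_MGO / CFO_MGO


z = F                               # natural starting value
while True:
    dz = f(z) / fprime(z)
    z -= dz
    if abs(dz / z) < 1e-3:          # stop on the relative deviation in z
        break

c_liq_mgo = z * C0_MGO / F
c_liq_feo = c_liq_mgo * (C0_FEO / C0_MGO) * z ** (KD - 1.0)
print(f"z = {z:.4f}, MgO = {c_liq_mgo:.2f} percent, FeO = {c_liq_feo:.2f} percent")
```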

Upon heating for a time t = 1 hr at 800°C, a plagioclase crystal with a radius a = 1 mm has lost 40 percent of the radiogenic argon it initially contained. Calculate the argon diffusion coefficient D at this temperature. Assume that the plagioclase crystal may be considered as a sphere with isotropic diffusion properties and that, initially, radiogenic argon was homogeneously distributed. Let us define the dimensionless number

$$\tau = \frac{D\,t}{a^2}$$

From Section 8.6, the fraction $F(\tau)$ of argon left in the sphere at $\tau$ can be written

$$F(\tau) = \frac{6}{\pi^2} \sum_{n=1}^{\infty} \frac{1}{n^2}\,\exp(-n^2 \pi^2 \tau)$$ (3.1.35)

Hence, the implicit equation to be solved for $\tau$ is

$$f(\tau) = F(\tau) - 0.6 = 0$$

The derivative of $f(\tau)$ is

$$f'(\tau) = -6 \sum_{n=1}^{\infty} \exp(-n^2 \pi^2 \tau)$$ (3.1.36)

The first seven iterations produced from an arbitrary trial value $\tau$ = 0.0001 are listed in Table 3.6. The final result is exact to better than four decimal places. This result allows the diffusion coefficient to be calculated as

$$D = \frac{\tau\,a^2}{t} = \frac{0.0180 \times (0.1\ \text{cm})^2}{3600\ \text{s}} = 5.0 \times 10^{-8}\ \text{cm}^2\,\text{s}^{-1}$$

Table 3.6. Iterative calculation of the $\tau$ value corresponding to a lost fraction of radiogenic argon of 40 percent. Spherical geometry is assumed, equation (3.1.35).

Step k    F(τ)      f'(τ)      f(τ)/f'(τ)   τ
1         0.9664    -166.26                 0.0001
2         0.9664    -166.26    -0.0022      0.0023
3         0.8444    -32.261    -0.0076      0.0099
4         0.6932    -14.028    -0.0066      0.0165
5         0.6145    -10.168    -0.0014      0.0179
6         0.6004    -9.6357    -0.0000      0.0180
7         0.6000    -9.6216    -0.0000      0.0180
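The search for $\tau$ can be sketched as below, assuming the series for a sphere, $F(\tau) = (6/\pi^2)\sum_n n^{-2}\exp(-n^2\pi^2\tau)$, its term-by-term derivative, and $\tau = Dt/a^2$. Truncating the series at 2000 terms is an arbitrary choice that is ample for $\tau \geq 10^{-4}$; the function names are hypothetical.

```python
import math

N_TERMS = 2000        # truncation of the infinite series (assumption)


def fraction_left(tau):
    """Fraction of argon remaining in a sphere at dimensionless time tau."""
    return 6.0 / math.pi ** 2 * sum(
        math.exp(-n * n * math.pi ** 2 * tau) / (n * n)
        for n in range(1, N_TERMS + 1))


def fraction_left_prime(tau):
    """Derivative of the remaining fraction with respect to tau."""
    return -6.0 * sum(math.exp(-n * n * math.pi ** 2 * tau)
                      for n in range(1, N_TERMS + 1))


tau = 1e-4                               # arbitrary trial value
for _ in range(10):
    tau -= (fraction_left(tau) - 0.6) / fraction_left_prime(tau)

# Convert to a diffusion coefficient: a = 1 mm = 0.1 cm, t = 1 hr = 3600 s
D = tau * 0.1 ** 2 / 3600.0              # cm^2 per second
print(f"tau = {tau:.4f}, D = {D:.2e} cm^2/s")
```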

Find numerically the minimum over [0, 1] of the binary entropy function

$$f(x) = x \ln x + (1 - x) \ln(1 - x)$$

The obvious result x = 0.5 can be arrived at in a number of ways. The first and second derivatives are

$$f'(x) = \ln[x/(1 - x)]$$

and

$$f''(x) = x^{-1} + (1 - x)^{-1}$$

With the trial value $x^{(0)}$ = 0.1, we obtain $f'[x^{(0)}]$ = -2.1972, $f''[x^{(0)}]$ = 11.111, and $x^{(1)}$ = 0.2978. The result 0.5000 with four correct decimal places is obtained with $x^{(3)}$.
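The extremum search amounts to applying Newton's iteration to the first derivative, $x \leftarrow x - f'(x)/f''(x)$. A short sketch, with hypothetical function names, for $f(x) = x\ln x + (1-x)\ln(1-x)$:

```python
import math


def fprime(x):
    """First derivative of x ln x + (1 - x) ln(1 - x)."""
    return math.log(x / (1.0 - x))


def fsecond(x):
    """Second derivative, positive on (0, 1): the extremum is a minimum."""
    return 1.0 / x + 1.0 / (1.0 - x)


x = 0.1                       # trial value
history = [x]
for _ in range(3):
    x -= fprime(x) / fsecond(x)
    history.append(x)

print([f"{value:.4f}" for value in history])
```

The second derivative stays positive over the whole interval, which confirms that the point found is a minimum rather than a maximum.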

3.1.6 Ordinary differential equations: the Euler method

Quite commonly, differential equations appear in the form

$$\frac{dy}{dt} = f(t, y)$$ (3.1.37)

and cannot be solved explicitly. We have to resort to one of the many numerical methods of which the simplest versions are given here. The Euler method has little practical value, but forms the basis for most of the more elaborate methods. It consists in a first-order expansion of the derivative. The approximation at step $t^{(n+1)}$ is

$$y^{(n+1)} = y^{(n)} + \left[t^{(n+1)} - t^{(n)}\right] f[t^{(n)}, y^{(n)}]$$ (3.1.38)

Table 3.7. Solution of the differential equation dy/dt = -2ty by the Euler method with a time step of 0.1 (first block) and 0.01 (second block).

Time step 0.1:
t       y         -2ty       True value
0.00    1          0          1
0.10    1.0000    -0.2000     0.9900
0.20    0.9800    -0.3920     0.9608
...
0.90    0.4655    -0.8379     0.4449
1.00    0.3817    -0.7634     0.3679

Time step 0.01:
t       y         -2ty       True value
0       1          0          1
0.01    1.0000    -0.0200     0.9999
0.02    0.9998    -0.0400     0.9996
...
0.19    0.9663    -0.3672     0.9645
0.20    0.9627    -0.3851     0.9608

Solve the equation that results from introducing the Boltzmann variable into the diffusion equation (Chapter 8)

$$y'(t) = -2ty$$

given the initial value y(0) = 1. This equation has an exact solution $y = \exp(-t^2)$. For a rather coarse time step of 0.1, we obtain

$$y(0.1) = y(0) + (0.1 - 0.0)(-2 \times 0 \times 1) = 1$$
$$y(0.2) = y(0.1) + (0.2 - 0.1)(-2 \times 0.1 \times 1) = 0.98$$

Table 3.7 compares the accuracy for two different time step sizes. Obviously, final accuracy depends on the time step chosen and considerable computational effort would be required for a good approximation.
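The Euler update $y^{(n+1)} = y^{(n)} + h\,f[t^{(n)}, y^{(n)}]$ for this equation can be sketched as below. The function name `euler` and the choice to integrate both step sizes up to t = 1 are illustrative assumptions.

```python
import math


def euler(f, t0, y0, h, n_steps):
    """Integrate dy/dt = f(t, y) with n_steps explicit Euler steps of size h."""
    t, y = t0, y0
    for _ in range(n_steps):
        y += h * f(t, y)      # first-order expansion of y about t
        t += h
    return y


def rhs(t, y):
    return -2.0 * t * y


exact = math.exp(-1.0)                        # exact solution at t = 1
y_coarse = euler(rhs, 0.0, 1.0, 0.1, 10)      # time step 0.1
y_fine = euler(rhs, 0.0, 1.0, 0.01, 100)      # time step 0.01
print(f"h=0.1: {y_coarse:.4f}  h=0.01: {y_fine:.4f}  exact: {exact:.4f}")
```

Reducing the step by a factor of ten reduces the error by roughly the same factor, as expected of a first-order method.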

3.1.7 Ordinary differential equations: the Runge-Kutta method Although Press et al (1986) compare the use of the Runge-Kutta method to 'ploughing the fields' and that of high-order predictor-corrector schemes to 'racing on the fast lane with a sports car', we are still dealing with a reliable method easy to implement and quite successful. The Runge-Kutta method is sketched in Figure 3.6 and uses successive approximations of the function on the [t{n) — r(w+1}] interval. Let us define

3.1 Functions of a single variable

131

t Figure 3.6 Numerical solution of ordinary differential equations: sketch of the four steps of the Runge-Kutta method to the order four giving the n + 1 th estimate y ( n + 1 ) from the nth estimate y{n).

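and the intermediate evaluations

$$\begin{aligned}
k_1 &= h^{(n)} f\left[t^{(n)},\; y^{(n)}\right] \\
k_2 &= h^{(n)} f\left[t^{(n)} + \frac{h^{(n)}}{2},\; y^{(n)} + \frac{k_1}{2}\right] \\
k_3 &= h^{(n)} f\left[t^{(n)} + \frac{h^{(n)}}{2},\; y^{(n)} + \frac{k_2}{2}\right] \\
k_4 &= h^{(n)} f\left[t^{(n)} + h^{(n)},\; y^{(n)} + k_3\right]
\end{aligned}$$

where $f(t, y)$ stands for the right-hand side of the differential equation, $y' = f(t, y)$, and $h^{(n)}$ for the time step. Then

$$y^{(n+1)} = y^{(n)} + \frac{k_1 + 2k_2 + 2k_3 + k_4}{6} \tag{3.1.39}$$

This scheme has tight connections with Simpson's rule for numerical integration.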
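The four evaluations and the weighted average translate almost line by line into code. The sketch below is an illustration rather than the author's program; the names `rk4_step` and `rk4` are hypothetical. It is applied to the equation y' = -2ty, y(0) = 1, whose exact solution is exp(-t^2).

```python
import math


def rk4_step(f, t, y, h):
    """One fourth-order Runge-Kutta step for y' = f(t, y)."""
    k1 = h * f(t, y)
    k2 = h * f(t + h / 2, y + k1 / 2)
    k3 = h * f(t + h / 2, y + k2 / 2)
    k4 = h * f(t + h, y + k3)
    return y + (k1 + 2 * k2 + 2 * k3 + k4) / 6


def rk4(f, t0, y0, h, n_steps):
    """Repeat rk4_step with a constant step; returns the lists of t and y."""
    ts, ys = [t0], [y0]
    for i in range(n_steps):
        ys.append(rk4_step(f, ts[-1], ys[-1], h))
        ts.append(t0 + (i + 1) * h)
    return ts, ys


def rhs(t, y):
    """Right-hand side of the test equation y' = -2 t y."""
    return -2.0 * t * y


ts, ys = rk4(rhs, 0.0, 1.0, 0.1, 6)
for t, y in zip(ts, ys):
    print(f"t = {t:.1f}  y = {y:.4f}  exact = {math.exp(-t * t):.4f}")
```

With a step of 0.1, the computed values agree with the exact solution to the four digits printed, in contrast with the Euler result for the same step.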

Let us solve the same equation as above, $y'(t) = -2ty$, with initial value $y(0) = 1$ and a constant time step $t^{(n+1)} - t^{(n)} = 0.1$.

Table 3.8. Solution of the differential equation dy/dt = -2ty by the fourth-order Runge-Kutta method with a time step of 0.1. The last column is exp(-t^2) evaluated at t(n+1).

Step n   t(n)   y(n)     k1        k2        k3        k4        y(n+1)   exp(-t^2)
0        0      1        0.0000    -0.0100   -0.0100   -0.0198   0.9900   0.9900
1        0.1    0.9900   -0.0198   -0.0294   -0.0293   -0.0384   0.9608   0.9608
2        0.2    0.9608   -0.0384   -0.0471   -0.0469   -0.0548   0.9139   0.9139
3        0.3    0.9139   -0.0548   -0.0621   -0.0618   -0.0682   0.8521   0.8521
4        0.4    0.8521   -0.0682   -0.0736   -0.0734   -0.0779   0.7788   0.7788
5        0.5    0.7788   -0.0779   -0.0814   -0.0812   -0.0837   0.6977   0.6977

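Let us show how the method is implemented on the first time step, starting from $y(0) = 1$

$$\begin{aligned}
k_1 &= 0.1 \times \left[-2 \times 0 \times 1\right] = 0 \\
k_2 &= 0.1 \times \left[-2 \times \left(0 + \frac{0.1}{2}\right) \times \left(1 + \frac{0}{2}\right)\right] = -0.01 \\
k_3 &= 0.1 \times \left[-2 \times \left(0 + \frac{0.1}{2}\right) \times \left(1 + \frac{-0.01}{2}\right)\right] = -0.00995 \\
k_4 &= 0.1 \times \left[-2 \times (0 + 0.1) \times (1 - 0.00995)\right] = -0.019801
\end{aligned}$$

hence

$$y(0.1) = 1 + \frac{0 - 2 \times 0.01 - 2 \times 0.00995 - 0.019801}{6} = 0.99005$$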

A few more steps, with results trimmed to the fourth digit, are computed in Table 3.8. The Runge-Kutta method provides a robust and reasonably precise answer to most differential equations.

3.1.8 Interpolation with spline functions

Interpolating discrete data is an old concern of physics but, in the case of numerous data points, the conventional collocation and osculating polynomials are of too high a degree to be really useful (Scheid, 1968). Interpolation of n + 1 data points $y_0, y_1, \ldots, y_n$ tabulated at $x_0, x_1, \ldots, x_n$ for intermediate values of the independent variable x can be done by a number of methods, but one of the simplest and most elegant is the construction of cubic splines. On each interval, the 'data' function will be approximated by a cubic polynomial such that, at their common point, the polynomials of neighboring intervals have identical values, slopes and curvatures (Ahlberg et al., 1967). Let us consider the interval $(x_{i-1}, x_i)$. A third degree polynomial

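has a linearly changing second derivative $y''$, hence

$$y''(x) = \gamma\, y''_{i-1} + (1 - \gamma)\, y''_i \quad \text{for } x_{i-1} \le x \le x_i \tag{3.1.40}$$

where $\gamma$ is a factor depending on x. Solving at the extremities $x_{i-1}$ and $x_i$ of the interval, we get

$$y''(x) = \frac{x_i - x}{x_i - x_{i-1}}\, y''_{i-1} + \left(1 - \frac{x_i - x}{x_i - x_{i-1}}\right) y''_i \quad \text{for } x_{i-1} \le x \le x_i \tag{3.1.41}$$

while over the interval $x_i$ and $x_{i+1}$ the expression becomes

$$y''(x) = \left(1 - \frac{x - x_i}{x_{i+1} - x_i}\right) y''_i + \frac{x - x_i}{x_{i+1} - x_i}\, y''_{i+1} \quad \text{for } x_i \le x \le x_{i+1} \tag{3.1.42}$$

These expressions can be integrated relative to $(x_i - x)$ and $(x - x_i)$, respectively

$$y'(x) = -\frac{1}{2}\frac{(x_i - x)^2}{x_i - x_{i-1}}\, y''_{i-1} - \left[(x_i - x) - \frac{1}{2}\frac{(x_i - x)^2}{x_i - x_{i-1}}\right] y''_i + c_1 \quad \text{for } x_{i-1} \le x \le x_i$$

$$y'(x) = \left[(x - x_i) - \frac{1}{2}\frac{(x - x_i)^2}{x_{i+1} - x_i}\right] y''_i + \frac{1}{2}\frac{(x - x_i)^2}{x_{i+1} - x_i}\, y''_{i+1} + c_2 \quad \text{for } x_i \le x \le x_{i+1}$$

where $c_1$ and $c_2$ are two constants of integration. A second integration gives

$$y(x) = \frac{1}{6}\frac{(x_i - x)^3}{x_i - x_{i-1}}\, y''_{i-1} + \frac{1}{6}\left[3(x_i - x)^2 - \frac{(x_i - x)^3}{x_i - x_{i-1}}\right] y''_i - c_1 (x_i - x) + c_3 \quad \text{for } x_{i-1} \le x \le x_i$$

$$y(x) = \frac{1}{6}\left[3(x - x_i)^2 - \frac{(x - x_i)^3}{x_{i+1} - x_i}\right] y''_i + \frac{1}{6}\frac{(x - x_i)^3}{x_{i+1} - x_i}\, y''_{i+1} + c_2 (x - x_i) + c_4 \quad \text{for } x_i \le x \le x_{i+1}$$

Writing that the first cubic goes through the points $(x_{i-1}, y_{i-1})$ and $(x_i, y_i)$, we get

$$y_{i-1} = \frac{1}{6}(x_i - x_{i-1})^2\, y''_{i-1} + \frac{1}{3}(x_i - x_{i-1})^2\, y''_i - c_1 (x_i - x_{i-1}) + c_3, \qquad y_i = c_3$$

hence

$$y_i'^{(-)} = c_1 = (x_i - x_{i-1})\left(\frac{y''_{i-1}}{6} + \frac{y''_i}{3}\right) + \frac{y_i - y_{i-1}}{x_i - x_{i-1}} \tag{3.1.43}$$

where $y_i'^{(-)}$ is the left-derivative at $x = x_i$. Likewise, writing that the second cubic goes through the points $(x_i, y_i)$ and $(x_{i+1}, y_{i+1})$, we get

$$y_i = c_4, \qquad y_{i+1} = \frac{1}{3}(x_{i+1} - x_i)^2\, y''_i + \frac{1}{6}(x_{i+1} - x_i)^2\, y''_{i+1} + c_2 (x_{i+1} - x_i) + c_4$$

hence

$$y_i'^{(+)} = c_2 = -(x_{i+1} - x_i)\left(\frac{y''_i}{3} + \frac{y''_{i+1}}{6}\right) + \frac{y_{i+1} - y_i}{x_{i+1} - x_i} \tag{3.1.44}$$

where $y_i'^{(+)}$ is the right-derivative at $x = x_i$. The first left- and right-derivatives must be equal for the common point $x = x_i$, hence

$$(x_i - x_{i-1})\left(\frac{y''_{i-1}}{6} + \frac{y''_i}{3}\right) + \frac{y_i - y_{i-1}}{x_i - x_{i-1}} = -(x_{i+1} - x_i)\left(\frac{y''_i}{3} + \frac{y''_{i+1}}{6}\right) + \frac{y_{i+1} - y_i}{x_{i+1} - x_i}$$

which is recombined into

$$(x_i - x_{i-1})\, y''_{i-1} + 2(x_{i+1} - x_{i-1})\, y''_i + (x_{i+1} - x_i)\, y''_{i+1} = 6\left[\frac{y_{i+1} - y_i}{x_{i+1} - x_i} - \frac{y_i - y_{i-1}}{x_i - x_{i-1}}\right]$$

For n + 1 data points, (n - 1) equations such as the one above can be written with (n + 1) unknowns $y''_i$ (i = 0, ..., n). Two additional equations are needed, which most often are end conditions at i = 0 and i = n. The two conditions specifying the slopes

$$y'_0 = -(x_1 - x_0)\left(\frac{y''_0}{3} + \frac{y''_1}{6}\right) + \frac{y_1 - y_0}{x_1 - x_0} \tag{3.1.45}$$

$$y'_n = (x_n - x_{n-1})\left(\frac{y''_{n-1}}{6} + \frac{y''_n}{3}\right) + \frac{y_n - y_{n-1}}{x_n - x_{n-1}} \tag{3.1.46}$$

make it possible to complement the system of equations and solve it for the (n + 1) unknowns $y''_i$ (i = 0, ..., n). In a matrix form, this is written as

$$AM = D \tag{3.1.47}$$

where the current element $a_{ij}$ of the (n + 1) x (n + 1) tri-diagonal matrix A is given by

$$a_{ij} = \begin{cases}
x_i - x_{i-1} & \text{for } j = i - 1 \\
2(x_{i+1} - x_{i-1}) & \text{for } j = i,\; 0 < i < n \\
2(x_1 - x_0) & \text{for } i = j = 0 \\
2(x_n - x_{n-1}) & \text{for } i = j = n \\
x_{i+1} - x_i & \text{for } j = i + 1 \\
0 & \text{otherwise}
\end{cases} \tag{3.1.48}$$

M is the (n + 1)-vector of the unknowns $y''_i$, while D is the (n + 1)-vector with current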

Table 3.9. Chondrite-normalized Ce/Yb ratio of some recent lavas of the Piton de la Fournaise, Reunion Island (Albarede and Tamagnan, 1988).

i          0      1      2      3      4      5      6      7
Year       1948   1953   1956   1966   1972   1975   1981   1985
(Ce/Yb)N   20.9   21.2   22.0   20.8   21.7   22.4   21.3   18.9

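element

$$d_i = \begin{cases}
6\left[\dfrac{y_1 - y_0}{x_1 - x_0} - y'_0\right] & \text{for } i = 0 \\[2ex]
6\left[\dfrac{y_{i+1} - y_i}{x_{i+1} - x_i} - \dfrac{y_i - y_{i-1}}{x_i - x_{i-1}}\right] & \text{for } 0 < i < n \\[2ex]
6\left[y'_n - \dfrac{y_n - y_{n-1}}{x_n - x_{n-1}}\right] & \text{for } i = n
\end{cases} \tag{3.1.49}$$

Once the linear system is solved for the unknowns $y''_i$, the value interpolated at any arbitrary x can be calculated from either formula

$$y(x) = y_i - y_i'^{(-)}(x_i - x) + \frac{1}{2}\, y''_i (x_i - x)^2 - \frac{1}{6}\,\frac{y''_i - y''_{i-1}}{x_i - x_{i-1}}\,(x_i - x)^3 \quad \text{for } x_{i-1} \le x \le x_i \tag{3.1.50}$$

$$y(x) = y_i + y_i'^{(+)}(x - x_i) + \frac{1}{2}\, y''_i (x - x_i)^2 + \frac{1}{6}\,\frac{y''_{i+1} - y''_i}{x_{i+1} - x_i}\,(x - x_i)^3 \quad \text{for } x_i \le x \le x_{i+1} \tag{3.1.51}$$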

where the left- and right-derivatives $y_i'^{(-)}$ and $y_i'^{(+)}$ are given by equations (3.1.43) and (3.1.44), respectively. It is often preferred to choose the values of the derivatives instead of the curvatures as unknowns. In particular, the right-derivatives $y_i'^{(+)}$ at the data points (breakpoints) appear in the standard pp-form (piecewise polynomial) used in many software packages (de Boor, 1978). This transformation is a simple task using the relationship between derivatives and curvatures, preferably in a matrix form.

The Piton de la Fournaise volcano (Indian Ocean) erupts basalts with chemical compositions that change with time. The rare-earth elements have been measured on eight dated historic lavas (Table 3.9 and Figure 3.7, Albarede and Tamagnan, 1988), and chondrite-normalized (Ce/Yb)N ratios over the time interval 1948-1985 are given in Table 3.9. Calculate an annual interpolation of these results.

Let us first build the matrix A of coefficients $a_{ij}$ and the right-hand side vector D with the (admittedly questionable) assumption that the derivatives at the end-points are zero. Matrix A is calculated from equations (3.1.48), e.g., $a_{00} = 2 \times (1953 - 1948) = 10$,

Figure 3.7 Spline interpolation of the chondrite-normalized Ce/Yb ratio in recent lava flows of the Piton de la Fournaise volcano (Albarede and Tamagnan, 1988). The end derivatives are supposed to be zero.

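$a_{01} = 1953 - 1948 = 5$, and so on. Vector D is calculated from equations (3.1.49), e.g.,

$$d_0 = 6\left[\frac{21.2 - 20.9}{1953 - 1948} - 0\right] = 0.36$$

$$d_1 = 6\left[\frac{22.0 - 21.2}{1956 - 1953} - \frac{21.2 - 20.9}{1953 - 1948}\right] = 1.24$$

so we arrive at

$$A = \begin{pmatrix}
10 & 5 & 0 & 0 & 0 & 0 & 0 & 0 \\
5 & 16 & 3 & 0 & 0 & 0 & 0 & 0 \\
0 & 3 & 26 & 10 & 0 & 0 & 0 & 0 \\
0 & 0 & 10 & 32 & 6 & 0 & 0 & 0 \\
0 & 0 & 0 & 6 & 18 & 3 & 0 & 0 \\
0 & 0 & 0 & 0 & 3 & 18 & 6 & 0 \\
0 & 0 & 0 & 0 & 0 & 6 & 20 & 4 \\
0 & 0 & 0 & 0 & 0 & 0 & 4 & 8
\end{pmatrix}
\quad \text{and} \quad
D = \begin{pmatrix} 0.36 \\ 1.24 \\ -2.32 \\ 1.62 \\ 0.50 \\ -2.50 \\ -2.50 \\ 3.60 \end{pmatrix}$$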
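The assembly and solution of the tri-diagonal system can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the author's code: the function names (`thomas`, `spline_curvatures`, `spline_eval`) are hypothetical, the system is solved by the standard forward-elimination and back-substitution scheme for tri-diagonal matrices, and the interpolated value is obtained from the cubic written on the interval to the left of $x_i$, with the left-derivative expressed from the curvatures and the chord slope.

```python
YEARS = [1948, 1953, 1956, 1966, 1972, 1975, 1981, 1985]
RATIOS = [20.9, 21.2, 22.0, 20.8, 21.7, 22.4, 21.3, 18.9]


def thomas(lower, diag, upper, rhs):
    """Solve a tri-diagonal system (forward elimination, back substitution)."""
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        pivot = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / pivot
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / pivot
    sol = [0.0] * n
    sol[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        sol[i] = d[i] - c[i] * sol[i + 1]
    return sol


def spline_curvatures(x, y, slope_0=0.0, slope_n=0.0):
    """Second derivatives at the data points for prescribed end slopes."""
    n = len(x) - 1
    h = [x[i + 1] - x[i] for i in range(n)]            # interval lengths
    s = [(y[i + 1] - y[i]) / h[i] for i in range(n)]   # chord slopes
    lower = [0.0] + h                                  # a(i, i-1)
    upper = h + [0.0]                                  # a(i, i+1)
    diag = [2 * h[0]] + [2 * (h[i - 1] + h[i]) for i in range(1, n)] + [2 * h[-1]]
    rhs = ([6 * (s[0] - slope_0)]
           + [6 * (s[i] - s[i - 1]) for i in range(1, n)]
           + [6 * (slope_n - s[-1])])
    return thomas(lower, diag, upper, rhs)


def spline_eval(x, y, m, xq):
    """Interpolate at xq with the cubic of the interval ending at x[i]."""
    i = 1
    while i < len(x) - 1 and xq > x[i]:
        i += 1
    dx = x[i] - x[i - 1]
    left_slope = (y[i] - y[i - 1]) / dx + dx * (m[i - 1] / 6 + m[i] / 3)
    u = x[i] - xq
    return (y[i] - left_slope * u + 0.5 * m[i] * u ** 2
            - (m[i] - m[i - 1]) / (6 * dx) * u ** 3)


curvatures = spline_curvatures(YEARS, RATIOS)
print([round(v, 4) for v in curvatures])
print(round(spline_eval(YEARS, RATIOS, curvatures, 1960), 2))
```

An annual interpolation is then a matter of calling `spline_eval` for each year between 1948 and 1985.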

Solving the matrix equation AM = D, we get the vector M of unknowns $y''_i$ (i = 0, ..., 7), which enables the left- and right-derivatives to be calculated through equations (3.1.43) and (3.1.44) (Table 3.10). Let us give an example by calculating the (Ce/Yb)N value interpolated for the year 1960 with the first interpolation formula. Clearly i = 3 (1966), so $x_3 - x = 1966 - 1960 = 6$, then $x_3 - x_2 = 1966 - 1956 = 10$. Moreover, $y_3 = 20.8$, $y_3'^{(-)} = -0.0423$ and $y''_3 = 0.0919$. Finally, $y''_3 - y''_2 = 0.0919 - (-0.1371) = 0.2290$, hence

$$y(x) = y_3 - y_3'^{(-)}(x_3 - x) + \frac{1}{2}\, y''_3 (x_3 - x)^2 - \frac{1}{6}\,\frac{y''_3 - y''_2}{x_3 - x_2}\,(x_3 - x)^3$$

which, upon replacement with the actual values, yields

$$y(t = 1960) = 20.8 - (-0.0423) \times 6 + \frac{1}{2}(0.0919) \times 6^2 - \frac{1}{6}\left(\frac{0.2290}{10}\right) \times 6^3 = 21.88$$

Table 3.10. Second derivatives, first left- and right-derivatives calculated for each data point listed in Table 3.9 through equations (3.1.43) to (3.1.46).

i   y''       y'(-)     y'(+)
0   -0.0185   0         0
1   0.1090    0.2262    0.2262
2   -0.1371   0.1840    0.1840
3   0.0919    -0.0423   -0.0423
4   0.0085    0.2589    0.2589
5   -0.0683   0.1693    0.1693
6   -0.2161   -0.6839   -0.6839
7   0.5581    0         0

The calculation can be made for an arbitrary number of points provided their abscissae lie inside the range of x values. Figure 3.7 shows the characteristic features of spline interpolation, a very smooth aspect although with some 'overshooting' problems, i.e., extrema located between the data points. Alternative interpolation schemes are discussed by Wiggins (1976).

3.2 Functions of several variables

3.2.1 Introduction

• Let $f(x_1, x_2, \ldots, x_n)$ be an n-multivariate scalar function of the independent variables $x_1, x_2, \ldots, x_n$. An equivalent notation is $f(\mathbf{x})$, where $\mathbf{x}$ is the vector made of the n variables $x_1, x_2, \ldots, x_n$. The partial derivative of the function $f(\mathbf{x})$ with respect to the variable $x_i$ is defined as

$$\frac{\partial f(\mathbf{x})}{\partial x_i} = \lim_{\Delta x_i \to 0} \frac{f(x_1, x_2, \ldots, x_i + \Delta x_i, \ldots, x_n) - f(x_1, x_2, \ldots, x_i, \ldots, x_n)}{\Delta x_i} \tag{3.2.1}$$

Higher-order partial derivatives can be defined in a similar way. Using equation (3.2.1), we can show the important equality

$$\frac{\partial^2 f(\mathbf{x})}{\partial x\, \partial y} = \frac{\partial^2 f(\mathbf{x})}{\partial y\, \partial x} \tag{3.2.2}$$

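• Let $\mathbf{u}(t)$ be an n-vector with components $(u_1, u_2, \ldots, u_n)$ depending on a single scalar variable t. The derivative of the vector $\mathbf{u}$ with respect to the scalar t is the n-vector defined as

$$\frac{d\mathbf{u}}{dt} = \lim_{\Delta t \to 0} \frac{\mathbf{u}(t + \Delta t) - \mathbf{u}(t)}{\Delta t} \tag{3.2.3}$$

It makes a vector with n components $du_i/dt$ (i = 1, 2, ..., n). Example: the derivative of $\mathbf{u}(t) = [\ldots, \sin t]^T$ with respect to t is the vector $d\mathbf{u}/dt = [\ldots, \cos t]^T$.

The gradient vector grad $f(\mathbf{x})$ of the scalar function $f(\mathbf{x})$ is an n-vector defined as

$$\operatorname{grad} f(\mathbf{x}) = \begin{bmatrix} \partial f(\mathbf{x})/\partial x_1 \\ \partial f(\mathbf{x})/\partial x_2 \\ \vdots \\ \partial f(\mathbf{x})/\partial x_n \end{bmatrix} \tag{3.2.4}$$

The nabla notation $\nabla f(\mathbf{x})$ is commonly used. Example: the gradient vector grad $f(\mathbf{x})$ of the function defined as $f(\mathbf{x}) = x_1 x_2^2 + \sin x_3$ is

$$\operatorname{grad} f(\mathbf{x}) = \begin{bmatrix} x_2^2 \\ 2 x_1 x_2 \\ \cos x_3 \end{bmatrix}$$

As a particular case, the gradient of a scalar quantity obtained as the dot product of the constant column vector $\mathbf{u}$ $(u_1, \ldots, u_n)$ and the column vector $\mathbf{x}$ $(x_1, \ldots, x_n)$ is the vector $\mathbf{u}$ itself, for

$$\operatorname{grad} \mathbf{u}^T \mathbf{x} = \nabla \mathbf{u}^T \mathbf{x} = \begin{bmatrix} \partial \mathbf{u}^T \mathbf{x}/\partial x_1 \\ \partial \mathbf{u}^T \mathbf{x}/\partial x_2 \\ \vdots \\ \partial \mathbf{u}^T \mathbf{x}/\partial x_n \end{bmatrix} = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} = \mathbf{u} \tag{3.2.5}$$

Example: the gradient vector of the scalar $2x_1 + 3x_2$ is the column vector $[2, 3]^T$.

The variation $df(\mathbf{x})$ of an n-multivariate scalar function $f(\mathbf{x})$ along a displacement direction $d\mathbf{x}$ can be written as

$$df(\mathbf{x}) = \sum_{i=1}^{n} \frac{\partial f(\mathbf{x})}{\partial x_i}\, dx_i \tag{3.2.6}$$

where $d\mathbf{x}$ is the n-vector with elements $dx_1, dx_2, \ldots, dx_n$. The variation $df(\mathbf{x})$ therefore has the meaning of the scalar product between the gradient and the displacement vector. Particular displacement vectors $d\mathbf{x}$ generate contour lines with constant $f(\mathbf{x})$ values such that $df(\mathbf{x}) = 0$. The dot product vanishes and the gradient is therefore a vector perpendicular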

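to the contour lines of the function. The gradient vector points uphill. If $df(\mathbf{x})$ is negative, the function $f(\mathbf{x})$ decreases along the direction $d\mathbf{x}$ and the opposite is true if it is positive. Example: at point $(1, 2, \pi/2)$, the direction $[-1, 3, 2]^T$ parallel to the infinitesimal change $d\mathbf{x} = (-dk, 3\,dk, 2\,dk)$ with $dk > 0$ induces a change of the function $f(\mathbf{x}) = x_1 x_2^2 + \sin x_3$ defined previously, such that

$$df(\mathbf{x}) = -dk \times (2^2) + 3\,dk \times (2 \times 1 \times 2) + 2\,dk \times [\cos(\pi/2)] = 8\,dk$$

The direction $[-1, 3, 2]^T$ corresponds to an increasing value of $f(\mathbf{x})$.

• The divergence of a vector $\mathbf{v}$ with components $v_1, v_2, \ldots, v_n$ is the scalar number noted either div $\mathbf{v}$ or, with the nabla notation, $\nabla \cdot \mathbf{v}$, defined as

$$\operatorname{div} \mathbf{v} = \nabla \cdot \mathbf{v} = \sum_{i=1}^{n} \frac{\partial v_i}{\partial x_i} \tag{3.2.7}$$

The Taylor expansion of a bivariate function $f(x, y)$ in the neighborhood of the point $(x_0, y_0)$ is obtained by expanding $f(x, y)$ with respect to x at constant y, then with respect to y at constant x (or the other way around), and can be written as

$$f(x, y) = f(x_0, y_0) + (x - x_0)\frac{\partial f}{\partial x} + (y - y_0)\frac{\partial f}{\partial y} + \frac{1}{2}\left[(x - x_0)^2 \frac{\partial^2 f}{\partial x^2} + 2(x - x_0)(y - y_0)\frac{\partial^2 f}{\partial x\,\partial y} + (y - y_0)^2 \frac{\partial^2 f}{\partial y^2}\right] + \text{higher-order terms}$$

Defining $\Delta\mathbf{x}$ as the column vector with elements $(x - x_0)$ and $(y - y_0)$, an alternative matrix formulation is

$$f(x, y) = f(x_0, y_0) + \Delta\mathbf{x}^T \operatorname{grad} f + \frac{1}{2}\Delta\mathbf{x}^T H \Delta\mathbf{x} + \text{higher-order terms} \tag{3.2.8}$$

where the 2 x 2 symmetric curvature matrix or Hessian H is defined as

$$H = \begin{bmatrix} \dfrac{\partial^2 f(x, y)}{\partial x^2} & \dfrac{\partial^2 f(x, y)}{\partial x\,\partial y} \\[2ex] \dfrac{\partial^2 f(x, y)}{\partial x\,\partial y} & \dfrac{\partial^2 f(x, y)}{\partial y^2} \end{bmatrix} \tag{3.2.9}$$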

An extremum is a minimum if any fluctuation of the coordinates about this point causes the function to increase. It is a maximum if any fluctuation causes the function to decrease. In any other case, we are dealing with a saddle point. We write the fluctuation $\Delta f$ of $f(x, y)$ about the extremum with coordinates $(x^*, y^*)$ as

$$\Delta f = f(x, y) - f(x^*, y^*) \approx \frac{1}{2}\Delta\mathbf{x}^T H \Delta\mathbf{x}$$

In general, H can be diagonalized as

$$H = U \Lambda U^{-1}$$

where $\Lambda$ is the diagonal matrix of real eigenvalues and U an orthogonal and therefore invertible matrix. The fluctuation $\Delta f$ is now written

$$\Delta f \approx \frac{1}{2}\Delta\mathbf{x}^T U \Lambda U^{-1} \Delta\mathbf{x}$$

Introducing the new vector $\mathbf{z}$ such that $\mathbf{z} = U^{-1}\Delta\mathbf{x}$, we obtain

$$\Delta f \approx \frac{1}{2}\mathbf{z}^T \Lambda \mathbf{z} = \frac{1}{2}\operatorname{tr} \Lambda \mathbf{z}\mathbf{z}^T = \frac{1}{2}\left(\lambda_1 z_1^2 + \lambda_2 z_2^2\right)$$

where the properties of the trace of a matrix have been used (Section 2.2.4). If all eigenvalues are positive, the extremum $(x^*, y^*)$ corresponds to a minimum. If they are negative, the extremum corresponds to a maximum. If they are of mixed sign, we are dealing with a saddle point. Figure 3.8 depicts the different cases. These concepts are easily extended to more than two variables.

For $\mathbf{x} = (u, v)$, calculate the Hessian of the function $f(\mathbf{x})$ defined as

$$f(\mathbf{x}) = -\sin u \cos v$$

and map curvature changes. This function is plotted in Figure 3.9 and shows a regular pattern of maxima and minima. Its Hessian matrix is

$$H = \begin{bmatrix} \sin u \cos v & \cos u \sin v \\ \cos u \sin v & \sin u \cos v \end{bmatrix}$$

The two diagonal elements are equal, so that their product $\sin^2 u \cos^2 v$ is never negative, and the eigenvalues of H have the same sign wherever det H is positive. Curvature changes sign whenever the determinant of H vanishes, i.e., for

$$\det H = \sin^2 u \cos^2 v - \cos^2 u \sin^2 v = 0$$

or

$$\det H = (\sin u \cos v - \cos u \sin v) \times (\sin u \cos v + \cos u \sin v) = 0$$

and, finally

$$\det H = \sin(u - v) \times \sin(u + v) = 0$$

which holds for $u - v = k\pi$ or $u + v = k\pi$, k being an integer.

Figure 3.8 Curvature of a function f(x, y). Top: the Hessian H has two negative eigenvalues. Middle: two positive eigenvalues. Bottom: mixed-sign eigenvalues.

Figure 3.9 Plot and curvature of the function f(u, v) = -sin u cos v for …
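The sign test on the eigenvalues can be illustrated numerically. The following Python sketch is an illustration only; the helper names are hypothetical and the eigenvalues of the symmetric 2 x 2 Hessian are obtained from the closed-form expression (mean of the diagonal plus or minus a radius). The function `classify` assumes that the point passed to it is a stationary point of f(u, v) = -sin u cos v.

```python
import math


def hessian(u, v):
    """Hessian elements (h11, h12, h22) of f(u, v) = -sin(u) cos(v)."""
    a = math.sin(u) * math.cos(v)   # d2f/du2 and d2f/dv2
    b = math.cos(u) * math.sin(v)   # d2f/du dv
    return a, b, a


def eigenvalues_sym2(h11, h12, h22):
    """Eigenvalues of a symmetric 2 x 2 matrix, smallest first."""
    mean = 0.5 * (h11 + h22)
    radius = math.hypot(0.5 * (h11 - h22), h12)
    return mean - radius, mean + radius


def classify(u, v, tol=1e-12):
    """Nature of a stationary point from the signs of the eigenvalues."""
    low, high = eigenvalues_sym2(*hessian(u, v))
    if low > tol:
        return "minimum"
    if high < -tol:
        return "maximum"
    if low < -tol and high > tol:
        return "saddle point"
    return "degenerate"


for point in ((math.pi / 2, 0.0), (-math.pi / 2, 0.0), (0.0, math.pi / 2)):
    print(point, classify(*point))
```

For this function the two eigenvalues are sin(u + v) and sin(u - v), so the three points tested fall in the three categories of Figure 3.8.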

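3.2.2 System of implicit non-linear equations: the Newton-Raphson method

The Newton method can be extended to several variables in order to find the zeroes of n functions $f_i$ in n variables $x_i$, which we lump as the vector $\mathbf{x} = [x_1, \ldots, x_n]^T$, i.e., to solve the system of equations

$$f_i(x_1, \ldots, x_n) = f_i(\mathbf{x}) = 0 \tag{3.2.10}$$

for i = 1, ..., n. We start with an initial guess $\mathbf{x}^{(0)} = [x_1^{(0)}, \ldots, x_n^{(0)}]^T$ of $\mathbf{x}$, expand the function to the first order, and make the result equal to zero. We get

$$f_i\left[x_1^{(1)}, \ldots, x_n^{(1)}\right] = f_i\left[x_1^{(0)}, \ldots, x_n^{(0)}\right] + \operatorname{grad} f_i\left[x_1^{(0)}, \ldots, x_n^{(0)}\right] \cdot \Delta\mathbf{x}^{(1)} = 0$$

where $\Delta\mathbf{x}^{(1)}$ is the vector of the increments $x_i^{(1)} - x_i^{(0)}$ (i = 1, ..., n). Lumping together in expanded form the n equations of this type, we obtain

$$\begin{bmatrix} f_1[\mathbf{x}^{(0)}] \\ \vdots \\ f_n[\mathbf{x}^{(0)}] \end{bmatrix} + \begin{bmatrix} \dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\ \vdots & & \vdots \\ \dfrac{\partial f_n}{\partial x_1} & \cdots & \dfrac{\partial f_n}{\partial x_n} \end{bmatrix} \begin{bmatrix} x_1^{(1)} - x_1^{(0)} \\ \vdots \\ x_n^{(1)} - x_n^{(0)} \end{bmatrix} = 0 \tag{3.2.11}$$

Let us call $\mathbf{f}[\mathbf{x}^{(0)}]$ the n-column vector of the n values of $f_i$ calculated at $\mathbf{x}^{(0)}$, and $D[\mathbf{x}^{(0)}]$ the matrix of the derivatives. We assume that $D[\mathbf{x}^{(0)}]$ is non-singular, i.e., that its determinant (the Jacobian of the $\mathbf{x} \to \mathbf{f}$ transformation) is different from zero. The increment $\Delta\mathbf{x}^{(1)}$ is calculated as

$$\Delta\mathbf{x}^{(1)} = -D^{-1}\left[\mathbf{x}^{(0)}\right] \mathbf{f}\left[\mathbf{x}^{(0)}\right] \tag{3.2.12}$$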

and the iteration is repeated as far as needed. This formula extends equation (3.1.29) to multiple dimensions. Although several examples implementing the Newton-Raphson method for the computation of chemical equilibrium are developed in Chapter 6, we will now present some simple applications that illustrate its basic principles.

Let us assume that, at high temperature and ambient pressure, the binary system albite-anorthite (ab-an) is ideal. The temperature $T_f$ and enthalpy $\Delta H_f$ of melting of each component are

$$T_f^{ab} = 1373\ \mathrm{K} \qquad \Delta H_f^{ab} = 64.3\ \mathrm{kJ\,mol^{-1}}$$

$$T_f^{an} = 1830\ \mathrm{K} \qquad \Delta H_f^{an} = 133.0\ \mathrm{kJ\,mol^{-1}}$$

Assuming that $\Delta H_f$ are constant, calculate the composition of the solid and liquid coexisting at T = 1600 K. The variable X referring to mole fractions, equilibrium is achieved when the chemical potentials $\mu$ of each component are equal in each phase

$$\begin{aligned}
\mu_{liq}^{ab}(0) + RT \ln X_{liq}^{ab} &= \mu_{sol}^{ab}(0) + RT \ln X_{sol}^{ab} \\
\mu_{liq}^{an}(0) + RT \ln X_{liq}^{an} &= \mu_{sol}^{an}(0) + RT \ln X_{sol}^{an}
\end{aligned} \tag{3.2.13}$$

where R is the gas constant and $\mu(0)$ the Gibbs energy of the standard state (pure phase or end-member) at the same temperature and pressure. Closure requires the conditions

$$X_{liq}^{ab} + X_{liq}^{an} = 1, \qquad X_{sol}^{ab} + X_{sol}^{an} = 1$$

Equilibrium conditions can be recast into two equations in the two independent variables $X_{liq}^{an}$ and $X_{sol}^{an}$

$$\begin{aligned}
f_1 &= \mu_{liq}^{ab}(0) - \mu_{sol}^{ab}(0) + RT\left[\ln\left(1 - X_{liq}^{an}\right) - \ln\left(1 - X_{sol}^{an}\right)\right] = 0 \\
f_2 &= \mu_{liq}^{an}(0) - \mu_{sol}^{an}(0) + RT\left[\ln X_{liq}^{an} - \ln X_{sol}^{an}\right] = 0
\end{aligned} \tag{3.2.14}$$

We recognize in the first two terms of each equation the Gibbs energy of melting $\Delta G_f$ of each end-member which, assuming that $\Delta H_f$ is constant, can be expressed as

$$\Delta G_f = \Delta H_f \left(1 - \frac{T}{T_f}\right)$$

and therefore

$$\Delta G_f^{ab} = 64.3 \times (1 - 1600/1373) = -10.6\ \mathrm{kJ\,mol^{-1}}$$

$$\Delta G_f^{an} = 133 \times (1 - 1600/1830) = +16.7\ \mathrm{kJ\,mol^{-1}}$$

Let us define $\mathbf{x} = [X_{liq}^{an}, X_{sol}^{an}]^T$ and $\mathbf{f}(\mathbf{x}) = [f_1(\mathbf{x}), f_2(\mathbf{x})]^T$. The matrix of partial derivatives $D(\mathbf{x})$ is

$$D(\mathbf{x}) = \begin{bmatrix} \dfrac{\partial f_1}{\partial X_{liq}^{an}} & \dfrac{\partial f_1}{\partial X_{sol}^{an}} \\[2ex] \dfrac{\partial f_2}{\partial X_{liq}^{an}} & \dfrac{\partial f_2}{\partial X_{sol}^{an}} \end{bmatrix} = RT \begin{bmatrix} -\dfrac{1}{1 - X_{liq}^{an}} & \dfrac{1}{1 - X_{sol}^{an}} \\[2ex] \dfrac{1}{X_{liq}^{an}} & -\dfrac{1}{X_{sol}^{an}} \end{bmatrix} \tag{3.2.15}$$

Let us choose the initial guess $\mathbf{x}^{(0)} = (0.6, 0.3)$. Successive steps produce the results shown in Table 3.11. Figure 3.10 shows the Gibbs free enthalpy of the liquid and solid mixtures together with the final result $\mathbf{x}^{(6)} = (0.17962, 0.63106)$. The last column in Table 3.11 lists the squared modulus $s = \mathbf{f}^T \mathbf{f}$ of the vector $\mathbf{f}$ as a convenient measure of convergence.

Table 3.11. Iterative calculation of the solution to equation (3.2.14) through the Newton-Raphson method. The vector x is the set of the two variables Xliq(an) and Xsol(an), and the initial guess is x(0) = (0.6, 0.3). For each step n, the first line refers to the first component and the second line to the second component.

n   f(n)            D(n)                 -Δx = D^-1 f   x(n+1) = x(n) + Δx   s = fTf
0   -18075          -33257    19004      0.29297        0.30703              9.99 x 10^8
    25937           22171     -44343     -0.43843       0.73843
1   2329.8          -19197    50857      0.16061        0.14642              3.08 x 10^7
    5041.5          43328     -18015     0.10644        0.63199
2   561.32          -15585    36148      -0.02949       0.17591              7.81 x 10^6
    -2738.5         90857     -21049     0.00281        0.62917
3   -7.7393         -16142    35874      -0.00367       0.17958              5.68 x 10^4
    -238.17         75625     -21143     -0.00187       0.63104
4   0.0369          -16215    36055      -0.00004       0.17962              7.84 x 10^0
    -2.7995         74079     -21081     -0.00002       0.63106
5   -1.88 x 10^-6   -16216    36057      5.8 x 10^-9    0.17962              1.41 x 10^-7
    -3.76 x 10^-4   74061     -21080     2.7 x 10^-9    0.63106

Not all calculations converge so nicely, especially when the derivatives of higher order are large and variable. The choice of variables as well as their value at the starting point may turn out to be critical in achieving reasonable convergence.
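The iteration for the albite-anorthite example can be sketched as follows. This Python fragment is an illustration, not the author's program: the value of the gas constant (8.3143 J mol^-1 K^-1) and the function names are assumptions, the residuals are the two ideal-mixing equilibrium conditions written in the variables X(an) of the liquid and of the solid, and the 2 x 2 system for the increment is solved by Cramer's rule. No safeguard is included against iterates leaving the interval (0, 1), which would make the logarithms undefined.

```python
import math

R = 8.3143                                # gas constant, J mol^-1 K^-1 (assumed value)
T = 1600.0                                # temperature, K
DG_AB = 64.3e3 * (1.0 - T / 1373.0)       # Gibbs energy of melting of albite, J mol^-1
DG_AN = 133.0e3 * (1.0 - T / 1830.0)      # Gibbs energy of melting of anorthite, J mol^-1


def residuals(x_liq, x_sol):
    """The two equilibrium conditions, in J mol^-1."""
    f1 = DG_AB + R * T * (math.log(1.0 - x_liq) - math.log(1.0 - x_sol))
    f2 = DG_AN + R * T * (math.log(x_liq) - math.log(x_sol))
    return f1, f2


def jacobian(x_liq, x_sol):
    """Partial derivatives (d11, d12, d21, d22) of the residuals."""
    rt = R * T
    return (-rt / (1.0 - x_liq), rt / (1.0 - x_sol),
            rt / x_liq, -rt / x_sol)


def newton_raphson(x_liq, x_sol, tol=1e-6, max_iter=50):
    """Iterate x <- x - D^-1 f until the squared modulus of f is below tol."""
    for _ in range(max_iter):
        f1, f2 = residuals(x_liq, x_sol)
        if f1 * f1 + f2 * f2 < tol:
            break
        d11, d12, d21, d22 = jacobian(x_liq, x_sol)
        det = d11 * d22 - d12 * d21
        x_liq -= (d22 * f1 - d12 * f2) / det      # first component of D^-1 f
        x_sol -= (-d21 * f1 + d11 * f2) / det     # second component of D^-1 f
    return x_liq, x_sol


print(newton_raphson(0.6, 0.3))
```

Starting from (0.6, 0.3), the iterates follow the same path as in Table 3.11, the small differences in the last digits reflecting the value adopted for R.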

Figure 3.10 Computation of equilibrium concentrations for coexisting solid and liquid solutions.

3.2.3 Extrema: the steepest-descent method

A large category of problems consists in finding the extremum of a function with respect to several variables. Finding the maximum of a function $f(\mathbf{x})$ is equivalent to finding the minimum of $-f(\mathbf{x})$, so the discussion will be restricted to the search for minima. Let us assume that $f(\mathbf{x})$ is a function in the n variables $x_1, x_2, \ldots, x_n$ collected into the vector $\mathbf{x}$. Since the grad $f(\mathbf{x})$ vector points towards the maximum increase of $f(\mathbf{x})$ (Figure 3.11), minimizing $f(\mathbf{x})$ may be iteratively achieved using

$$\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} - \alpha \operatorname{grad} f\left[\mathbf{x}^{(k)}\right] \tag{3.2.20}$$

where $\alpha$ is a constant. The linear search for the optimum value $\alpha_m$ of $\alpha$ is carried out either by bisection or by a more efficient method such as Davidon's cubic interpolation (e.g., Walsh, 1975; Fletcher, 1987). A measure of how fast $f(\mathbf{x})$ decreases from $\mathbf{x}^{(k)}$ to $\mathbf{x}^{(k+1)}$ is the scalar $g^{(k+1)}$, such that

$$g^{(k+1)} = \operatorname{grad} f\left[\mathbf{x}^{(k+1)}\right]^T \operatorname{grad} f\left[\mathbf{x}^{(k)}\right] \tag{3.2.21}$$

Figure 3.11 The steepest-descent method: the search in one direction is discontinued when no further decrease is possible, i.e., when the search direction is parallel to the local contour line. The next step starts in a perpendicular direction, i.e., in the direction opposite to the local gradient.

At the minimum along the kth search direction, the old and the new directions should be orthogonal, i.e.,

$$\operatorname{grad} f\left[\mathbf{x}^{(k+1)}\right]^T \operatorname{grad} f\left[\mathbf{x}^{(k)}\right] = 0 \tag{3.2.22}$$

Calculate by the steepest-descent method the minimum of the function

$$f(\mathbf{x}) = \exp\left(x_1^2 + 2x_2^2\right)$$

using the starting point (+1, -1). The solution is obviously (0, 0). The first four stages of linear search are given in Table 3.12.

Table 3.12. Search for the minimum of the function f(x) = exp(x1^2 + 2 x2^2) by the method of the steepest descent. The scalar α, equation (3.2.20), is estimated by crude linear search; g is the convergence criterion given by equation (3.2.21). Rows marked with an asterisk refer to the minimum along the search direction.

α        x1        x2        f         df/dx1    df/dx2    g
0        1.0       -1.0      20.0855   40.171    -80.342   8069
0.002    0.9197    -0.8393   9.5322
0.004    0.8393    -0.6786   5.0811
0.006    0.7590    -0.5179   3.0422
0.008    0.6786    -0.3573   2.0459
0.010    0.5983    -0.1966   1.5453
0.012    0.5179    -0.0359   1.3111
0.014*   0.4376    0.1248    1.2494    1.0935    0.6236    -6.18
0.016    0.3573    0.2855    1.3373

0        0.4376    0.1248    1.2494    1.0935    0.6236    -6.18
0.05     0.3829    0.0936    1.1784
0.10     0.3283    0.0624    1.1225
0.15     0.2736    0.0312    1.0798
0.20     0.2189    0.0001    1.0491
0.25     0.1642    -0.0311   1.0293
0.30*    0.1096    -0.0623   1.0200    0.2235    -0.2542   0.0859
0.35     0.0549    -0.0935   1.0207

0        0.1096    -0.0623   1.0200    0.2235    -0.2542   0.0859
0.1      0.0872    -0.0369   1.0104
0.2      0.0649    -0.0115   1.0045
0.3*     0.0425    0.0140    1.0022    0.0852    0.0559    0.0048
0.4      0.0202    0.0394    1.0035

0        0.0425    0.0140    1.0022    0.0852    0.0559    0.0048
0.15     0.0297    0.0123    1.0012
0.30     0.0169    0.0130    1.0006
0.45*    0.0042    0.0137    1.0004    0.0083    0.0549    0.0038
0.60     -0.0086   0.0144    1.0005
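A steepest-descent search with a deliberately crude line search can be sketched as follows. This Python fragment is an illustration under stated assumptions, not the procedure used to build Table 3.12: the objective is taken as exp(x1^2 + 2 x2^2), consistent with the tabulated values, and the line search (hypothetical name `crude_line_search`) halves a trial step until f decreases, then marches along the direction with that step until f stops decreasing. Neither bisection nor Davidon's cubic interpolation is attempted.

```python
import math


def f(x1, x2):
    """Objective function; returns inf instead of overflowing far from the minimum."""
    try:
        return math.exp(x1 ** 2 + 2.0 * x2 ** 2)
    except OverflowError:
        return math.inf


def grad(x1, x2):
    """Gradient of f."""
    value = f(x1, x2)
    return 2.0 * x1 * value, 4.0 * x2 * value


def crude_line_search(x, g, step=1.0):
    """Approximate the alpha that minimizes f along the direction -g."""
    f0 = f(*x)

    def along(alpha):
        return f(x[0] - alpha * g[0], x[1] - alpha * g[1])

    while along(step) >= f0:          # shrink until the first trial decreases f
        step *= 0.5
        if step < 1e-16:
            return 0.0
    alpha, best = step, along(step)
    while along(alpha + step) < best:  # march until f stops decreasing
        alpha += step
        best = along(alpha)
    return alpha


def steepest_descent(x, tol=1e-16, max_iter=200):
    """Repeat x <- x - alpha grad f; returns the final point and the f history."""
    history = [f(*x)]
    for _ in range(max_iter):
        g = grad(*x)
        if g[0] ** 2 + g[1] ** 2 < tol:
            break
        alpha = crude_line_search(x, g)
        if alpha == 0.0:
            break
        x = (x[0] - alpha * g[0], x[1] - alpha * g[1])
        history.append(f(*x))
    return x, history


point, history = steepest_descent((1.0, -1.0))
print(point, len(history) - 1)
```

Because each line search is only approximate, the path differs in detail from that of Table 3.12, but the function value decreases at every stage and the search ends close to (0, 0).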

The steepest-descent method does converge towards the expected solution but convergence is slow in the vicinity of the minimum. In order to scale variations, we can use a second-order method. The most straightforward method consists in applying the Newton-Raphson scheme to the gradient vector of the function f to be minimized. Since the gradient is zero at the minimum we can use the updating scheme

$$\operatorname{grad} f\left[\mathbf{x}^{(k+1)}\right] = \operatorname{grad} f\left[\mathbf{x}^{(k)}\right] + H^{(k)}\left[\mathbf{x}^{(k+1)} - \mathbf{x}^{(k)}\right] = 0$$

or

$$\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} - \left[H^{(k)}\right]^{-1} \operatorname{grad} f\left[\mathbf{x}^{(k)}\right] \tag{3.2.23}$$

where $H^{(k)}$ is the curvature matrix (Hessian) of f at step k. Equation (3.2.23) is the extension of equation (3.1.30) to multiple dimensions. As will be seen in Chapter 5, this method is extremely useful for refinement near the minimum but may otherwise run away towards any point where the gradient vector vanishes, such as saddle points or even maxima. This inconvenience may be overridden by using the mixed scheme due to Marquardt (1963), which is written

$$\operatorname{grad} f\left[\mathbf{x}^{(k)}\right] + \left[H^{(k)} + \alpha I_n\right]\left[\mathbf{x}^{(k+1)} - \mathbf{x}^{(k)}\right] = 0$$

where $\alpha$ is a parameter and $I_n$ the (n x n) identity matrix. This scheme allows the user to shift from the most reliable gradient method far from the minimum for large values of $\alpha$ to a fast-converging Newton-Raphson method for small values of $\alpha$ near the minimum (Fletcher, 1987). Other methods, which also have the property of accelerated convergence near the minimum, take advantage of how the gradient varies locally, or, in other words, build up a local second-order approximation to the function f. The most commonly applied methods are those of conjugate gradients due to Fletcher and Reeves and the variable metric methods, e.g., the Davidon-Fletcher-Powell (DFP) and Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithms. These methods require more substantial theoretical developments, which may be found in the books by Walsh (1975), Press et al. (1986), and Fletcher (1987), and are found in most major software packages such as MatLab.

3.2.4 Constrained minimization

Constraints are relations of equality or inequality, which must be exactly obeyed by the unknowns of a model. A familiar example is the mineral abundances in a rock or the end-member proportions in a mixture, which must sum up to unity whatever the errors on the data.

• When the objective function to be minimized is linear, the problem is relatively simple (Figure 3.12) and is the substance of linear programming.
Let us take a simple example in n = 2 dimensions and assume that we are looking for the minimum of a linear function $f(\mathbf{x})$, where $\mathbf{x}$ is the vector $[x_1, x_2]^T$, with the equality constraint $\mathbf{b}^T \mathbf{x} = q$ ($\mathbf{b}$ and q being known constants) and the inequality constraints $x_1 \ge 0$, $x_2 \ge 0$. In geochemistry or thermodynamics, the equality constraints are typically conservation equations, while phase or end-member proportions are usually required to be non-negative. Constant values of $f(\mathbf{x})$ define straight lines in the plane $(x_1, x_2)$. The inequality constraints split the space in two subspaces. The feasible set of solutions is the convex subspace that comprises all the values of the vector $\mathbf{x}$ which satisfy both the equality and inequality constraints. A corner is a vector of n values (a point), an edge is a segment associated with a linear relationship between the n variables. In the case of Figure 3.12, we see that the feasible set is a segment of a straight line. It could be an empty set if the equality constraint was entirely contained in the forbidden subspace of the inequality constraints. The vector corresponding to the minimum, if it exists, may be shown to occupy a corner of the feasible set (e.g., Strang, 1976). The idea of the simplex method is to select any corner of the feasible set and to proceed from corner to corner along the edges in a direction that minimizes the function $f(\mathbf{x})$ until no further reduction can be achieved. A linear programming solution devised by Wright and Doherty (1970) is widely used in the literature for the problem of finding modal abundances, while an example of free energy minimization will be discussed in Chapter 6.

Figure 3.12 Minimum of the function f(x) submitted to the equality constraint bTx = q, and to the inequality constraints x1 ≥ 0, x2 ≥ 0. The feasible set is the segment of the constraint line located in the positive quadrant. The minimum occupies an edge of the feasible set.

When constraints are equality only, the method of Lagrange multipliers is of broad applicability. Let us consider, as in Figure 3.13, a function $f(\mathbf{x})$ to be minimized on which we impose the constraint that $g(\mathbf{x}) = 0$. From the previous discussion on the gradient properties, we know that $-\operatorname{grad} f$ is a vector perpendicular to the curves of constant $f(\mathbf{x})$ pointing towards decreasing values, while grad g is orthogonal to the locus of $\mathbf{x}$ vectors such that $g(\mathbf{x}) = 0$. The orthogonal projection of $-\operatorname{grad} f$ on the constraint $g(\mathbf{x}) = 0$ represents a direction of decreasing $f(\mathbf{x})$. A minimum will be reached once $-\operatorname{grad} f$ and grad g are collinear, i.e.,

$$\operatorname{grad} f + \lambda \operatorname{grad} g = 0 \tag{3.2.24}$$

where $\lambda$ is a constant called a Lagrange multiplier. If more than one constraint should be obeyed, as many Lagrange multipliers as there are constraints should be used. In practice, this procedure amounts to increasing the number of independent variables in the system by as many new variables as there are constraints. Equation (3.2.24) is indeed equivalent to finding the minimum of

$$S = f(\mathbf{x}) + \lambda g(\mathbf{x}) \tag{3.2.25}$$

The derivatives with respect to $\mathbf{x}$ produce the set of equations represented by

$$\operatorname{grad} S = \operatorname{grad} f + \lambda \operatorname{grad} g = 0 \tag{3.2.26}$$

Figure 3.13 Constrained minimization: the minimum of a function f(x) submitted to the constraint g(x) = 0 occurs at M on the constraint subspace, here on the curve g(x) = 0 where ∇f(x) + λ∇g(x) = 0. P is the unconstrained minimum of f(x). This principle is the basis for the method of Lagrange multipliers.

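while the derivative with respect to $\lambda$ gives

$$g(\mathbf{x}) = 0 \tag{3.2.27}$$

which is precisely the constraint which we want to be verified.

Find the minimum of

$$f(x, y) = x^2 + y^2$$

submitted to the constraint

$$x + y = 1$$

This problem amounts to minimizing

$$S = x^2 + y^2 + \lambda (x + y - 1)$$

where $\lambda$ is a Lagrange multiplier, relative to x, y, and $\lambda$. The derivatives relative to each variable cancel at the minimum

$$\frac{\partial S}{\partial x} = 2x + \lambda = 0, \quad \frac{\partial S}{\partial y} = 2y + \lambda = 0, \quad \text{and} \quad \frac{\partial S}{\partial \lambda} = x + y - 1 = 0$$

Adding the first two equations and subtracting the third twice results in $\lambda = -1$, $x = 1/2$, and $y = 1/2$.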
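Since the three conditions are linear in x, y and the multiplier, they can also be handed to a linear solver. The Python sketch below is illustrative only: it assumes the objective x^2 + y^2 and the constraint x + y = 1, which are consistent with the stationarity conditions 2x + λ = 0 and 2y + λ = 0 and with the stated solution, and it uses a small hand-written Gaussian elimination (hypothetical name `solve_linear`) rather than a library routine.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                    # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


# Unknowns ordered as (x, y, lambda):
#   dS/dx      = 2x + lambda = 0
#   dS/dy      = 2y + lambda = 0
#   dS/dlambda = x + y - 1   = 0
A = [[2.0, 0.0, 1.0],
     [0.0, 2.0, 1.0],
     [1.0, 1.0, 0.0]]
b = [0.0, 0.0, 1.0]

x, y, lam = solve_linear(A, b)
print(x, y, lam)
```

The same pattern applies to any quadratic objective with linear equality constraints: one extra unknown and one extra row per constraint.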

Distribution of energy states. According to quantum theory, the energy states $\varepsilon_0, \varepsilon_1, \varepsilon_2, \ldots$ that atoms in a gas, a liquid or a crystal can reach are distinct and have an equal probability of being taken by an atom. Standard textbooks (e.g., Swalin, 1962) show that the entropy S of a population of N atoms, $n_i$ being in the energy state $\varepsilon_i$, is

$$S = -k \sum_i n_i \ln \frac{n_i}{N}$$

where k is the Boltzmann constant. Find the values of $n_i$ for maximum entropy S at constant total energy E. The first constraint is a fixed total number of atoms

$$\sum_i n_i = N \quad \text{or} \quad \sum_i dn_i = 0 \tag{3.2.28}$$

and the second is a fixed total energy

$$\sum_i \varepsilon_i n_i = E \quad \text{or} \quad \sum_i \varepsilon_i\, dn_i = 0 \tag{3.2.29}$$

Let us seek the extremum of the function $S^+$ given by

$$S^+ = S + \alpha\left(\sum_i \varepsilon_i n_i - E\right) + \beta\left(\sum_i n_i - N\right)$$

where $\alpha$ and $\beta$ are two Lagrange multipliers. We first observe that

$$\sum_i n_i\, d\ln\frac{n_i}{N} = \sum_i dn_i = 0$$

and therefore

$$dS = -k \sum_i \ln\frac{n_i}{N}\, dn_i$$

Each derivative relative to $n_0, n_1, n_2, \ldots$ must cancel, so for any state i

$$k \ln\frac{n_i}{N} - \alpha \varepsilon_i - \beta = 0$$

which we rearrange as

$$\frac{n_i}{N} = e^{\beta/k}\, e^{\alpha \varepsilon_i / k}$$

Summing up over all the energy states and applying the constraint (3.2.28), we get

$$1 = e^{\beta/k} \sum_i e^{\alpha \varepsilon_i / k}$$

and therefore

$$\frac{n_i}{N} = \frac{e^{\alpha \varepsilon_i / k}}{\sum_j e^{\alpha \varepsilon_j / k}}$$

Introducing this expression into dS and equating to $\left(\sum_j \varepsilon_j\, dn_j\right)/T$ would show that $\alpha$ is equal to $-T^{-1}$ and produce the familiar Boltzmann distribution.

When the minimum of a function $f(\mathbf{x})$ is sought not algebraically, as in these examples, but numerically, different techniques exist, which are described in a specific and abundant literature. Once more, a simple approach makes use of the minimization properties of the gradient direction. The so-called gradient projection method consists in using the projection of the gradient direction onto the constraint subspace as the search direction (Figure 3.14). It works best with linear constraints and an example will be given in Chapter 6.
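The idea of projecting the gradient onto the constraint subspace can be illustrated on a small problem with one linear constraint. The Python sketch below makes several assumptions that are not in the original text: the objective is x^2 + y^2, the constraint is x + y = 1 (normal vector (1, 1)), the starting point (1, 0) already satisfies the constraint, and a fixed step of 0.25 replaces a proper line search. The function names are hypothetical.

```python
def project(vec, normal):
    """Remove from vec its component along normal (the constraint gradient)."""
    scale = sum(v * n for v, n in zip(vec, normal)) / sum(n * n for n in normal)
    return [v - scale * n for v, n in zip(vec, normal)]


def gradient_projection(x, grad_f, normal, step=0.25, n_iter=60):
    """Move against the projected gradient; x must start on the constraint."""
    for _ in range(n_iter):
        direction = project(grad_f(x), normal)
        x = [xi - step * di for xi, di in zip(x, direction)]
    return x


def grad_f(x):
    """Gradient of the objective x^2 + y^2."""
    return [2.0 * x[0], 2.0 * x[1]]


result = gradient_projection([1.0, 0.0], grad_f, [1.0, 1.0])
print(result)
```

Because the search direction has no component along the constraint normal, every iterate keeps x + y = 1, and the sequence approaches the constrained minimum (1/2, 1/2) obtained with the Lagrange multiplier.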

Figure 3.14 Search direction for the minimum of a function f(x1, x2) submitted to the constraint g(x1, x2) = 0. The optimum direction is the projection of the opposite of the gradient onto the constraint subspace.

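3.2.5 The Runge-Kutta method for a system of differential equations

The single-variable method can be extended to a system comprising any number of differential equations. The Runge-Kutta method is commonly found in software packages. For simplicity, it will be described for a system of only two equations

$$\begin{aligned}
\frac{dx}{dt} &= x' = f(t, x, y) \\
\frac{dy}{dt} &= y' = g(t, x, y)
\end{aligned} \tag{3.2.30}$$

where f and g are two known functions of t, x, and y. Let us define

$$h^{(n)} = t^{(n+1)} - t^{(n)}$$

Given the intermediate evaluations

$$\begin{aligned}
k_1 &= h^{(n)} f\left[t^{(n)}, x^{(n)}, y^{(n)}\right], & l_1 &= h^{(n)} g\left[t^{(n)}, x^{(n)}, y^{(n)}\right] \\
k_2 &= h^{(n)} f\left[t^{(n)} + \frac{h^{(n)}}{2}, x^{(n)} + \frac{k_1}{2}, y^{(n)} + \frac{l_1}{2}\right], & l_2 &= h^{(n)} g\left[t^{(n)} + \frac{h^{(n)}}{2}, x^{(n)} + \frac{k_1}{2}, y^{(n)} + \frac{l_1}{2}\right] \\
k_3 &= h^{(n)} f\left[t^{(n)} + \frac{h^{(n)}}{2}, x^{(n)} + \frac{k_2}{2}, y^{(n)} + \frac{l_2}{2}\right], & l_3 &= h^{(n)} g\left[t^{(n)} + \frac{h^{(n)}}{2}, x^{(n)} + \frac{k_2}{2}, y^{(n)} + \frac{l_2}{2}\right] \\
k_4 &= h^{(n)} f\left[t^{(n)} + h^{(n)}, x^{(n)} + k_3, y^{(n)} + l_3\right], & l_4 &= h^{(n)} g\left[t^{(n)} + h^{(n)}, x^{(n)} + k_3, y^{(n)} + l_3\right]
\end{aligned}$$

the values of x and y at time $t^{(n+1)}$ are

$$x^{(n+1)} = x^{(n)} + \frac{k_1 + 2k_2 + 2k_3 + k_4}{6}, \qquad y^{(n+1)} = y^{(n)} + \frac{l_1 + 2l_2 + 2l_3 + l_4}{6}$$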

Solve the following equation, which appears in connection with some diffusion problems, in C(t) from t = 0 to t = 0.5

$$C'' + 2tC' - 0.5C = 0 \tag{3.2.31}$$

with the conditions C = 1 and C' = 0 at t = 0. Let us use a time step h of 0.1 and define x = C, y = C'. Equation (3.2.31) becomes a system of first-order equations

$$\begin{aligned}
x' &= y \\
y' &= 0.5x - 2ty
\end{aligned} \tag{3.2.32}$$

with the conditions x = 1 and y = 0 at t = 0. With reference to equation (3.2.30), the functions f and g are defined as

$$f(t, x, y) = y, \qquad g(t, x, y) = 0.5x - 2ty \tag{3.2.33}$$

Let us work out the first step in detail using $t^{(0)} = 0$, $x^{(0)} = 1$, $y^{(0)} = 0$

$$\begin{aligned}
k_1 &= 0.1 \times 0 = 0 \\
l_1 &= 0.1 \times (0.5 \times 1 - 2 \times 0 \times 0) = 0.05 \\
k_2 &= 0.1 \times \left(0 + \frac{0.05}{2}\right) = 0.0025 \\
l_2 &= 0.1 \times \left[0.5\left(1 + \frac{0}{2}\right) - 2\left(0 + \frac{0.1}{2}\right)\left(0 + \frac{0.05}{2}\right)\right] = 0.04975 \\
k_3 &= 0.1 \times \left(0 + \frac{0.04975}{2}\right) = 0.002488 \\
l_3 &= 0.1 \times \left[0.5\left(1 + \frac{0.0025}{2}\right) - 2\left(0 + \frac{0.1}{2}\right)\left(0 + \frac{0.04975}{2}\right)\right] = 0.049814 \\
k_4 &= 0.1 \times (0 + 0.049814) = 0.004981 \\
l_4 &= 0.1 \times \left[0.5(1 + 0.002488) - 2(0 + 0.1) \times (0 + 0.049814)\right] = 0.049128
\end{aligned}$$

which gives the x and y values at the next step

$$x^{(1)} = 1 + \frac{0 + 2 \times 0.0025 + 2 \times 0.002488 + 0.004981}{6} = 1.0025$$

$$y^{(1)} = 0 + \frac{0.05 + 2 \times 0.04975 + 2 \times 0.049814 + 0.049128}{6} = 0.0497$$

Table 3.13. The first five iterations (n = 1, ..., 5) of the Runge-Kutta solution to equation (3.2.31) for a time-step of 0.1.

n        0        1        2        3        4        5
t(n)     0        0.1      0.2      0.3      0.4      0.5
x(n)     1        1.0025   1.0099   1.0219   1.0382   1.0582
y(n)     0        0.0497   0.0977   0.1424   0.1824   0.2167
k1       0        0.0050   0.0098   0.0142   0.0182   0.0217
l1       0.0500   0.0491   0.0466   0.0426   0.0373   0.0312
k2       0.0025   0.0074   0.0121   0.0164   0.0201   0.0232
l2       0.0498   0.0480   0.0447   0.0400   0.0343   0.0279
k3       0.0025   0.0074   0.0120   0.0162   0.0200   0.0231
l3       0.0498   0.0481   0.0448   0.0401   0.0345   0.0281
k4       0.0050   0.0098   0.0142   0.0183   0.0217   0.0245
l4       0.0491   0.0466   0.0425   0.0373   0.0312   0.0247
x(n+1)   1.0025   1.0099   1.0219   1.0382   1.0582   1.0813
y(n+1)   0.0497   0.0977   0.1424   0.1824   0.2167   0.2447
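The first step worked out above, and the following ones, can be checked with a short program. This Python sketch is an illustration (the name `rk4_system_step` is hypothetical); it applies the two-equation Runge-Kutta scheme to x' = y, y' = 0.5x - 2ty with x = 1 and y = 0 at t = 0 and a time step of 0.1.

```python
def rk4_system_step(f, g, t, x, y, h):
    """One fourth-order Runge-Kutta step for x' = f(t, x, y), y' = g(t, x, y)."""
    k1 = h * f(t, x, y)
    l1 = h * g(t, x, y)
    k2 = h * f(t + h / 2, x + k1 / 2, y + l1 / 2)
    l2 = h * g(t + h / 2, x + k1 / 2, y + l1 / 2)
    k3 = h * f(t + h / 2, x + k2 / 2, y + l2 / 2)
    l3 = h * g(t + h / 2, x + k2 / 2, y + l2 / 2)
    k4 = h * f(t + h, x + k3, y + l3)
    l4 = h * g(t + h, x + k3, y + l3)
    x_new = x + (k1 + 2 * k2 + 2 * k3 + k4) / 6
    y_new = y + (l1 + 2 * l2 + 2 * l3 + l4) / 6
    return x_new, y_new


def f(t, x, y):
    """x' = y, with x = C and y = C'."""
    return y


def g(t, x, y):
    """y' = 0.5 x - 2 t y, i.e. C'' = 0.5 C - 2 t C'."""
    return 0.5 * x - 2.0 * t * y


h = 0.1
x, y = 1.0, 0.0
for n in range(6):
    x, y = rk4_system_step(f, g, n * h, x, y, h)
    print(f"t = {(n + 1) * h:.1f}  x = {x:.4f}  y = {y:.4f}")
```

Systems of more than two equations are handled in the same way, most conveniently by storing the unknowns and the increments in vectors.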

Table 3.13 gives the results up to t = 0.5.

Convective dispersal of a conservative tracer in a velocity field. This calculation is of major interest for many problems such as the geochemical evolution of the mantle, the mixing time of the ocean, the understanding of magma mixing processes, ... Justification of the equations used is presented in Chapter 8. Solve for t = 0.02, 0.04, 0.06, ... the system of differential equations

$$\frac{dx}{dt} = -\pi \sin \pi x \, \sin\left(\frac{\pi}{2} + \pi y\right), \qquad \frac{dy}{dt} = -\frac{\pi}{2} \cos \pi x \, \cos\left(\frac{\pi}{2} + \pi y\right) \qquad (3.2.34)$$

given x = y = 0.1 at t = 0. It is left to the reader to build Table 3.14.

Table 3.14. The first five iterations of the Runge-Kutta solution to equation (3.2.34) for a time step of 0.02.

n         0         1         2         3         4         5
t(n)      0         0.02      0.04      0.06      0.08      0.10
x(n)      0.1       0.0832    0.0693    0.0578    0.0484    0.0406
y(n)      0.1       0.1097    0.1205    0.1324    0.1455    0.1599
f         -0.9233   -0.7638   -0.6303   -0.5191   -0.4268   -0.3506
g         0.4616    0.5129    0.5670    0.6244    0.6854    0.7501
k1        -0.0185   -0.0153   -0.0126   -0.0104   -0.0085   -0.0070
l1        0.0092    0.0103    0.0113    0.0125    0.0137    0.0150
k2        -0.0167   -0.0138   -0.0114   -0.0094   -0.0077   -0.0063
l2        0.0097    0.0108    0.0119    0.0131    0.0143    0.0157
k3        -0.0169   -0.0139   -0.0115   -0.0095   -0.0078   -0.0064
l3        0.0097    0.0108    0.0119    0.0131    0.0144    0.0157
k4        -0.0153   -0.0126   -0.0104   -0.0085   -0.0070   -0.0057
l4        0.0103    0.0113    0.0125    0.0137    0.0150    0.0164
x(n+1)    0.0832    0.0693    0.0578    0.0484    0.0406    0.0343
y(n+1)    0.1097    0.1205    0.1324    0.1455    0.1599    0.1756

3.2.6 Interpolation with spline functions

When a bi-dimensional table of data with unevenly distributed coordinates is available (e.g., data points on a map), the need for computing interpolated values is frequently encountered. A most common application is the drawing of contour maps (such as isopleths). Several procedures exist but the success of cubic splines makes this technique easily available from software packages. Two-dimensional interpolation is carried out as a successive construction of one-dimensional splines along rows followed by one-dimensional splines along columns (e.g., Press et al., 1986). Other useful spline variants in multiple dimensions are described by Sandwell (1987).

3.3 Partial differential equations: the finite differences method

Partial differential equations (PDE) involve relationships between partial derivatives of a function. One of the most common PDE problems to be solved in geochemistry appears to be the diffusion equation, which, in the jargon of PDE specialists, is said to be parabolic, because of the values of each derivative's coefficients, in contrast with elliptic equations (e.g., the Laplace equation in a square) or hyperbolic equations (e.g., the wave equation). Many methods exist which can solve these problems, some of which appeared in the last 20 years, e.g., the collocation and least-square methods (Finlayson, 1972), the variational finite-element method (Zienkiewicz, 1977) and the multigrid method (Hackbusch, 1985). However, the easiest route is by far the finite difference method (e.g., Mitchell, 1969). Although the computational effort with finite differences is small, the results, surprisingly enough, are often quite satisfactory. In addition, although rather slow, finite differences are easy to learn and easy to implement. The increasing power of computers makes it possible to consider this method as a competitor to the most efficient and sophisticated methods, which are sometimes difficult to master for a non-specialist and always time-consuming to implement. Only for natural objects with irregular geometry, such as plutons, could alternatives to the finite differences result in a net gain in the time and effort spent in solving the problem.

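3.3.1 One-dimensional diffusion problems: general

Let us consider the diffusion of a homogeneously distributed substance out of a slab with unit thickness in the direction x and infinite size in the other two dimensions. C(x, t) being the concentration at time t at a distance x from the origin, the diffusion equation is

$$\frac{\partial C(x,t)}{\partial t} = \mathcal{D}\,\frac{\partial^2 C(x,t)}{\partial x^2} \qquad (3.3.1)$$

where $\mathcal{D}$ is the diffusion coefficient of the substance. The following initial condition holds

$$C(x, 0) = C_0(x)$$

while at the boundaries, the conditions are

$$C(0, t) = C(1, t) = 0$$

The first step is to discretize the (x, t) domain using a uniform mesh

$$x = i\,\Delta x, \quad i = 0, 1, \ldots, n = 1/\Delta x \qquad t = j\,\Delta t, \quad j = 0, 1, \ldots, \infty$$

and call $C_i^j$ the concentration at the mesh point (node) x = iΔx, t = jΔt (here i and j do not have their usual meaning of element and phase). Let us define the central difference operator $\delta_x$ as

$$\delta_x C_i^j = C_{i+1/2}^j - C_{i-1/2}^j$$

From this definition, the second-order difference is

$$\delta_x^2 C_i^j = \delta_x\left(\delta_x C_i^j\right) = \left(C_{i+1}^j - C_i^j\right) - \left(C_i^j - C_{i-1}^j\right)$$

or

$$\delta_x^2 C_i^j = C_{i-1}^j - 2C_i^j + C_{i+1}^j \qquad (3.3.2)$$

In the finite difference schemes, the derivatives are replaced by difference operators, e.g., the first derivative

$$\frac{\partial C(x,t)}{\partial x} \approx \frac{\delta_x C_i^j}{\Delta x} = \frac{C_{i+1/2}^j - C_{i-1/2}^j}{\Delta x} \qquad (3.3.3)$$

and

$$\frac{\partial^2 C(x,t)}{\partial x^2} \approx \frac{1}{\Delta x}\,\delta_x\!\left(\frac{\delta_x C_i^j}{\Delta x}\right) = \frac{\delta_x^2 C_i^j}{\Delta x^2} = \frac{C_{i-1}^j - 2C_i^j + C_{i+1}^j}{\Delta x^2} \qquad (3.3.4)$$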


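The time derivative is evaluated at mid-point between t and t + Δt, i.e.,

$$\frac{\partial C(x,t)}{\partial t} \approx \frac{\delta_t C_i^{j+1/2}}{\Delta t} = \frac{C_i^{j+1} - C_i^j}{\Delta t} \qquad (3.3.5)$$

and the second derivative with respect to x is replaced by a linear combination of the second-order finite differences at t and t + Δt

$$\frac{\partial^2 C(x,t)}{\partial x^2} \approx \theta\,\frac{\delta_x^2 C_i^{j+1}}{\Delta x^2} + (1-\theta)\,\frac{\delta_x^2 C_i^j}{\Delta x^2} \qquad (3.3.6)$$

where θ is a parameter defined over [0, 1]. Rewriting the diffusion equation into a partial difference equation leads to

$$\frac{C_i^{j+1} - C_i^j}{\Delta t} = \mathcal{D}\left[\theta\,\frac{\delta_x^2 C_i^{j+1}}{\Delta x^2} + (1-\theta)\,\frac{\delta_x^2 C_i^j}{\Delta x^2}\right] \qquad (3.3.7)$$

Two well-known schemes can be derived from this equation. For θ = 0, the second-order difference is evaluated at t = jΔt and the scheme is called explicit

$$\frac{C_i^{j+1} - C_i^j}{\Delta t} = \mathcal{D}\,\frac{C_{i-1}^j - 2C_i^j + C_{i+1}^j}{\Delta x^2}$$

The 'molecule', that defines the nodes involved in the calculation for a pair of i, j values, is shown in Figure 3.15. Defining the parameter r as

$$r = \frac{\mathcal{D}\,\Delta t}{\Delta x^2} \qquad (3.3.8)$$

and recombining

$$C_i^{j+1} = r\,C_{i-1}^j + (1 - 2r)\,C_i^j + r\,C_{i+1}^j \qquad (3.3.9)$$

or equivalently, in terms of the central difference operator

$$C_i^{j+1} = \left(1 + r\,\delta_x^2\right) C_i^j$$

For r ≤ 1/2, the explicit scheme is shown to be stable, i.e., does not make errors grow exponentially with time (e.g., Mitchell, 1969). Alternatively, θ = 1/2, i.e., evaluation of the second-order difference as an average of the values at t and t + Δt, leads to the robust Crank-Nicholson scheme

$$C_i^{j+1} - C_i^j = \frac{r}{2}\left(C_{i-1}^{j+1} - 2C_i^{j+1} + C_{i+1}^{j+1} + C_{i-1}^j - 2C_i^j + C_{i+1}^j\right) \qquad (3.3.10)$$

The 'molecule' of the Crank-Nicholson scheme is also shown in Figure 3.15.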


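Equivalently, in terms of the central difference operator $\delta_x$, we can write

$$\left(1 - \frac{r}{2}\,\delta_x^2\right) C_i^{j+1} = \left(1 + \frac{r}{2}\,\delta_x^2\right) C_i^j$$

Multiplication by 2 and recombination leads to the linear system of equations

$$-r\,C_{i-1}^{j+1} + 2(1+r)\,C_i^{j+1} - r\,C_{i+1}^{j+1} = r\,C_{i-1}^j + 2(1-r)\,C_i^j + r\,C_{i+1}^j \qquad (3.3.11)$$

Given the boundary conditions of zero concentration for i = 0 and i = n, the equations can be recast into matrix form

$$\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 & 0\\
-r & 2(1+r) & -r & \cdots & 0 & 0\\
0 & -r & 2(1+r) & \cdots & 0 & 0\\
\vdots & & \ddots & \ddots & & \vdots\\
0 & 0 & \cdots & -r & 2(1+r) & -r\\
0 & 0 & \cdots & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} C_0\\ C_1\\ C_2\\ \vdots\\ C_{n-1}\\ C_n \end{bmatrix}^{j+1}
=
\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 & 0\\
r & 2(1-r) & r & \cdots & 0 & 0\\
0 & r & 2(1-r) & \cdots & 0 & 0\\
\vdots & & \ddots & \ddots & & \vdots\\
0 & 0 & \cdots & r & 2(1-r) & r\\
0 & 0 & \cdots & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} C_0\\ C_1\\ C_2\\ \vdots\\ C_{n-1}\\ C_n \end{bmatrix}^{j}$$

Let us lump the concentrations at the nodes i = 0, 1, 2, ..., n − 1, n into a single vector $c^j$ and define the (n + 1) × (n + 1) matrices $A_n$ and $B_n$ by their current element $a_{ij}$ and $b_{ij}$, respectively, such that, for the inner rows 0 < i < n,

$$a_{ii} = 2(1+r), \qquad b_{ii} = 2(1-r)$$

$$a_{ij} = -r, \qquad b_{ij} = r \qquad \text{for } |i-j| = 1$$

while for the boundary rows $a_{00} = a_{nn} = b_{00} = b_{nn} = 1$, and

$$a_{ij} = 0, \qquad b_{ij} = 0 \qquad \text{otherwise}$$

The matrix form of the Crank-Nicholson implicit scheme becomes

$$A_n\,c^{j+1} = B_n\,c^j$$

or

$$c^{j+1} = A_n^{-1} B_n\,c^j \qquad (3.3.12)$$

which shows how the node concentrations at time t + Δt are calculated from those at time t. Although the Crank-Nicholson scheme is unconditionally stable, use of small r values such as r < 1/2 improves accuracy (Mitchell, 1969).

Figure 3.15 Finite difference 'molecules' for explicit (left) and implicit Crank-Nicholson (right) schemes. In each panel, the nodes i − 1, i, and i + 1 are plotted against time.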

Solve with both the explicit and implicit methods the equation

$$\frac{\partial C(x,t)}{\partial t} = \mathcal{D}\,\frac{\partial^2 C(x,t)}{\partial x^2}$$

with the initial condition C(x, 0) = 1 and the boundary conditions C(0, t) = C(l, t) = 0 for l = 1 cm and $\mathcal{D}$ = 0.005 cm² s⁻¹. For the purpose of illustration, a very coarse mesh size will be used with Δx = 0.25, i.e., n = 5, and Δt = 2.5 s. Therefore

$$r = \frac{\mathcal{D}\,\Delta t}{\Delta x^2} = \frac{0.005 \times 2.5}{0.25^2} = 0.2$$


Figure 3.16 Results of the explicit difference scheme applied to the following diffusion problem: (a) initial concentration equal to unity, (b) end concentrations at x = 0 and x = 1 cm are zero, (c) $\mathcal{D}$ = 0.005 cm² s⁻¹. Length increment is Δx = 0.25, i the number of length increments. Time increment is Δt = 2.5 s, j the number of time increments.

We first implement the explicit scheme. The initial conditions require

$$C_i^0 = 1 \quad \text{for } i = 1, 2, 3, 4 \quad \text{(initial condition)}$$

$$C_0^0 = 0 \quad \text{(boundary condition)}$$

$$C_5^0 = 0 \quad \text{(boundary condition)}$$

Points i = 5, 4, and 3 may be obtained by symmetry from points 0, 1, and 2. Hence at t = Δt = 2.5 s

$$C_0^1 = C_0^0 = 0 \quad \text{(boundary condition)}$$

$$C_1^1 = 0.2 \times C_0^0 + 0.6 \times C_1^0 + 0.2 \times C_2^0 = 0.8$$

$$C_2^1 = 0.2 \times C_1^0 + 0.6 \times C_2^0 + 0.2 \times C_3^0 = 1$$
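The explicit update of equation (3.3.9) takes only a few lines of code. The sketch below assumes the six-node mesh and the value r = 0.2 of this example, with the end nodes held at zero; the name `explicit_step` is an illustrative choice.

```python
def explicit_step(c, r):
    """Advance the concentration profile c by one time step with the
    explicit scheme (3.3.9); the two end nodes are kept at their values."""
    new = c[:]
    for i in range(1, len(c) - 1):
        new[i] = r * c[i - 1] + (1 - 2 * r) * c[i] + r * c[i + 1]
    return new


D, dx, dt = 0.005, 0.25, 2.5        # cm2/s, cm, s
r = D * dt / dx ** 2                # 0.2, within the stability range r <= 1/2

c = [0.0, 1.0, 1.0, 1.0, 1.0, 0.0]  # nodes i = 0, ..., 5 at t = 0
profiles = [c]
for _ in range(4):                  # up to t = 4 dt = 10 s
    c = explicit_step(c, r)
    profiles.append(c)

for j, profile in enumerate(profiles):
    print(j, [round(value, 4) for value in profile])
```

Because each new value depends only on values already known, no linear system has to be solved; the price is the restriction on r.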

The same calculation can be carried on as far as needed. The results up to t = 4Δt = 10 s are depicted in Figure 3.16.

Let us now turn to the implicit Crank-Nicholson method and form the matrix $A_n$, in which 2(1 + r) = 2(1 + 0.2) = 2.4, as

$$A_n = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0\\
-0.2 & 2.4 & -0.2 & 0 & 0 & 0\\
0 & -0.2 & 2.4 & -0.2 & 0 & 0\\
0 & 0 & -0.2 & 2.4 & -0.2 & 0\\
0 & 0 & 0 & -0.2 & 2.4 & -0.2\\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}$$

Likewise, with 2(1 − r) = 2(1 − 0.2) = 1.6,

$$B_n = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0\\
0.2 & 1.6 & 0.2 & 0 & 0 & 0\\
0 & 0.2 & 1.6 & 0.2 & 0 & 0\\
0 & 0 & 0.2 & 1.6 & 0.2 & 0\\
0 & 0 & 0 & 0.2 & 1.6 & 0.2\\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}$$

therefore

$$A_n^{-1} B_n = \begin{bmatrix}
1.0000 & 0 & 0 & 0 & 0 & 0\\
0.1678 & 0.6784 & 0.1409 & 0.0118 & 0.0010 & 0.0001\\
0.0141 & 0.1409 & 0.6902 & 0.1418 & 0.0118 & 0.0012\\
0.0012 & 0.0118 & 0.1418 & 0.6902 & 0.1409 & 0.0141\\
0.0001 & 0.0010 & 0.0118 & 0.1409 & 0.6784 & 0.1678\\
0 & 0 & 0 & 0 & 0 & 1.0000
\end{bmatrix}$$

We can now calculate the approximation $c^1$ for i = 1, 2, 3, 4 at t = Δt as

$$c^1 = A_n^{-1} B_n\,c^0 = \begin{bmatrix}
1.0000 & 0 & 0 & 0 & 0 & 0\\
0.1678 & 0.6784 & 0.1409 & 0.0118 & 0.0010 & 0.0001\\
0.0141 & 0.1409 & 0.6902 & 0.1418 & 0.0118 & 0.0012\\
0.0012 & 0.0118 & 0.1418 & 0.6902 & 0.1409 & 0.0141\\
0.0001 & 0.0010 & 0.0118 & 0.1409 & 0.6784 & 0.1678\\
0 & 0 & 0 & 0 & 0 & 1.0000
\end{bmatrix}
\begin{bmatrix} 0\\ 1\\ 1\\ 1\\ 1\\ 0 \end{bmatrix}
= \begin{bmatrix} 0\\ 0.8321\\ 0.9847\\ 0.9847\\ 0.8321\\ 0 \end{bmatrix}$$


then iterate over successive time steps. The results up to t = 4Δt = 10 s are depicted in Figure 3.17.

Figure 3.17 Same as Figure 3.16 but for the implicit Crank-Nicholson scheme.
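Rather than inverting the matrix $A_n$, the tridiagonal system $A_n c^{j+1} = B_n c^j$ can be solved directly at each step. The sketch below does so with the Thomas algorithm, a standard elimination method for tridiagonal systems that is substituted here for the explicit inverse used in the text; the function names `thomas` and `crank_nicolson_step` are illustrative.

```python
def thomas(sub, diag, sup, rhs):
    """Solve a tridiagonal system; sub[0] and sup[-1] are not used."""
    n = len(rhs)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0] = sup[0] / diag[0]
    dp[0] = rhs[0] / diag[0]
    for k in range(1, n):
        denom = diag[k] - sub[k] * cp[k - 1]
        cp[k] = sup[k] / denom
        dp[k] = (rhs[k] - sub[k] * dp[k - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for k in range(n - 2, -1, -1):
        x[k] = dp[k] - cp[k] * x[k + 1]
    return x


def crank_nicolson_step(c, r):
    """One step of equation (3.3.11) with both end concentrations prescribed."""
    n = len(c)
    sub = [0.0] + [-r] * (n - 2) + [0.0]
    diag = [1.0] + [2 * (1 + r)] * (n - 2) + [1.0]
    sup = [0.0] + [-r] * (n - 2) + [0.0]
    rhs = [c[0]]
    rhs += [r * c[i - 1] + 2 * (1 - r) * c[i] + r * c[i + 1]
            for i in range(1, n - 1)]
    rhs.append(c[-1])
    return thomas(sub, diag, sup, rhs)


r = 0.2
c0 = [0.0, 1.0, 1.0, 1.0, 1.0, 0.0]
c1 = crank_nicolson_step(c0, r)
print([round(value, 4) for value in c1])
```

The direct solve costs a number of operations proportional to the number of nodes, which matters once the mesh is finer than the six nodes used here.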

3.3.2 More boundary conditions

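For boundary conditions which require a prescribed value of the flux instead of concentration, we introduce what is usually called fictitious points. Let us assume that at x = 0, the condition is

$$\frac{\partial C(x,t)}{\partial x} = g_0 \qquad (3.3.13)$$

We introduce the approximation valid for any value of t, hence of j

$$\frac{\partial C(x,t)}{\partial x} \approx \frac{C_1^j - C_{-1}^j}{2\,\Delta x} = g_0 \qquad (3.3.14)$$

where i = −1 refers to the point symmetrical to the mesh point i = 1 relative to the interface. Then we write that the finite difference approximation holds for the mesh point i = 0 at the interface, e.g., for an implicit scheme

$$-r\,C_{-1}^{j+1} + 2(1+r)\,C_0^{j+1} - r\,C_1^{j+1} = r\,C_{-1}^j + 2(1-r)\,C_0^j + r\,C_1^j$$

Combining the two equations, the fictitious point concentrations can be eliminated

$$-r\left(C_1^{j+1} - 2 g_0 \Delta x\right) + 2(1+r)\,C_0^{j+1} - r\,C_1^{j+1} = r\left(C_1^j - 2 g_0 \Delta x\right) + 2(1-r)\,C_0^j + r\,C_1^j$$

or

$$2(1+r)\,C_0^{j+1} - 2r\,C_1^{j+1} = 2(1-r)\,C_0^j + 2r\,C_1^j - 4 r g_0 \Delta x \qquad (3.3.15)$$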


As will be shown below, writing this equation as a matrix equation is rather straightforward. Similar techniques can be used for the so-called 'radiation' boundary conditions which involve linear combinations of concentration and concentration gradient and appear in connection with elemental fractionation between adjacent phases.

Solve with the Crank-Nicholson scheme the diffusion problem described in the worked example in Section 3.3.1

$$\frac{\partial C(x,t)}{\partial t} = \mathcal{D}\,\frac{\partial^2 C(x,t)}{\partial x^2}$$

with C(x, 0) = 1, C(l, t) = 0 for l = 1 cm, and $\mathcal{D}$ = 0.005 cm² s⁻¹, but with no flux at x = 0. Let us keep the same mesh size Δx = 0.25, i.e., n = 5, and Δt = 2.5 s. Let us form the matrices

$$A = \begin{bmatrix}
2(1+r) & -2r & 0 & 0 & 0 & 0\\
-r & 2(1+r) & -r & 0 & 0 & 0\\
0 & -r & 2(1+r) & -r & 0 & 0\\
0 & 0 & -r & 2(1+r) & -r & 0\\
0 & 0 & 0 & -r & 2(1+r) & -r\\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}
= \begin{bmatrix}
2.4 & -0.4 & 0 & 0 & 0 & 0\\
-0.2 & 2.4 & -0.2 & 0 & 0 & 0\\
0 & -0.2 & 2.4 & -0.2 & 0 & 0\\
0 & 0 & -0.2 & 2.4 & -0.2 & 0\\
0 & 0 & 0 & -0.2 & 2.4 & -0.2\\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}$$

and

$$B = \begin{bmatrix}
2(1-r) & 2r & 0 & 0 & 0 & 0\\
r & 2(1-r) & r & 0 & 0 & 0\\
0 & r & 2(1-r) & r & 0 & 0\\
0 & 0 & r & 2(1-r) & r & 0\\
0 & 0 & 0 & r & 2(1-r) & r\\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}
= \begin{bmatrix}
1.6 & 0.4 & 0 & 0 & 0 & 0\\
0.2 & 1.6 & 0.2 & 0 & 0 & 0\\
0 & 0.2 & 1.6 & 0.2 & 0 & 0\\
0 & 0 & 0.2 & 1.6 & 0.2 & 0\\
0 & 0 & 0 & 0.2 & 1.6 & 0.2\\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}$$

and the vector of flux conditions at x = 0

$$v = \begin{bmatrix} -4 r g_0 \Delta x\\ 0\\ 0\\ 0\\ 0\\ 0 \end{bmatrix} = 0$$

The matrix equation reads

$$A\,c^{j+1} = B\,c^j + v \qquad (3.3.16)$$

We obtain

$$A^{-1} B = \begin{bmatrix}
0.6903 & 0.2837 & 0.0238 & 0.0020 & 0.0002 & 0.0000\\
0.1419 & 0.7022 & 0.1429 & 0.0120 & 0.0010 & 0.0001\\
0.0119 & 0.1429 & 0.6904 & 0.1419 & 0.0118 & 0.0012\\
0.0010 & 0.0120 & 0.1419 & 0.6902 & 0.1409 & 0.0141\\
0.0001 & 0.0010 & 0.0118 & 0.1409 & 0.6784 & 0.1678\\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}$$

We can now calculate the approximation $c^1$ at t = Δt

$$c^1 = A^{-1} B\,c^0 = \begin{bmatrix}
0.6903 & 0.2837 & 0.0238 & 0.0020 & 0.0002 & 0.0000\\
0.1419 & 0.7022 & 0.1429 & 0.0120 & 0.0010 & 0.0001\\
0.0119 & 0.1429 & 0.6904 & 0.1419 & 0.0118 & 0.0012\\
0.0010 & 0.0120 & 0.1419 & 0.6902 & 0.1409 & 0.0141\\
0.0001 & 0.0010 & 0.0118 & 0.1409 & 0.6784 & 0.1678\\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} 1\\ 1\\ 1\\ 1\\ 1\\ 0 \end{bmatrix}
= \begin{bmatrix} 1.0000\\ 0.9999\\ 0.9988\\ 0.9859\\ 0.8322\\ 0 \end{bmatrix}$$

The results up to t = 4Δt = 10 s are depicted in Figure 3.18.
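The only change relative to the fixed-concentration case is the first row of the system, which now carries equation (3.3.15). The sketch below, again based on a Thomas elimination rather than on the explicit inverse, is one way of checking the vector $c^1$ obtained above; the names `thomas` and `crank_nicolson_flux_step` are illustrative, and `g0` stands for the prescribed gradient of equation (3.3.13).

```python
def thomas(sub, diag, sup, rhs):
    """Solve a tridiagonal system; sub[0] and sup[-1] are not used."""
    n = len(rhs)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0] = sup[0] / diag[0]
    dp[0] = rhs[0] / diag[0]
    for k in range(1, n):
        denom = diag[k] - sub[k] * cp[k - 1]
        cp[k] = sup[k] / denom
        dp[k] = (rhs[k] - sub[k] * dp[k - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for k in range(n - 2, -1, -1):
        x[k] = dp[k] - cp[k] * x[k + 1]
    return x


def crank_nicolson_flux_step(c, r, dx, g0=0.0):
    """One Crank-Nicholson step with a prescribed gradient g0 at node 0
    (fictitious-point elimination, equation 3.3.15) and a prescribed
    concentration at the last node."""
    n = len(c)
    sub = [0.0] + [-r] * (n - 2) + [0.0]
    diag = [2 * (1 + r)] * (n - 1) + [1.0]
    sup = [-2 * r] + [-r] * (n - 2) + [0.0]
    rhs = [2 * (1 - r) * c[0] + 2 * r * c[1] - 4 * r * g0 * dx]
    rhs += [r * c[i - 1] + 2 * (1 - r) * c[i] + r * c[i + 1]
            for i in range(1, n - 1)]
    rhs.append(c[-1])
    return thomas(sub, diag, sup, rhs)


c0 = [1.0, 1.0, 1.0, 1.0, 1.0, 0.0]   # unit concentration, zero at x = l
c1 = crank_nicolson_flux_step(c0, r=0.2, dx=0.25, g0=0.0)
print([round(value, 4) for value in c1])
```

A non-zero `g0` only modifies the first element of the right-hand side, so imposed fluxes require no other change to the routine.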

Figure 3.18 Same as Figure 3.17 but with no flux at x = 0.

3.3.3 A word about advection

Solving the purely advective equation or even introducing an advection term into the diffusion equation is a source of numerical difficulties. The simplest advection equation of a medium moving at velocity v in one dimension can be written

$$\frac{\partial C(x,t)}{\partial t} = -v\,\frac{\partial C(x,t)}{\partial x} \qquad (3.3.17)$$

Most simple finite difference schemes tend to be either unstable, inaccurate or both. Strang (1986) recommends using Friedrichs scheme

$$\frac{C_i^{j+1} - \tfrac{1}{2}\left(C_{i+1}^j + C_{i-1}^j\right)}{\Delta t} = -v\,\frac{C_{i+1}^j - C_{i-1}^j}{2\,\Delta x} \qquad (3.3.18)$$

With r = vΔt/Δx, the finite difference equation becomes

$$C_i^{j+1} = \frac{1}{2}(1 - r)\,C_{i+1}^j + \frac{1}{2}(1 + r)\,C_{i-1}^j \qquad (3.3.19)$$
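Equation (3.3.19) can be tried on a small example. The sketch below assumes periodic boundaries, a choice made only to keep the illustration self-contained and not part of the text; the name `friedrichs_step` is illustrative. For r = 1 the scheme shifts the profile by exactly one node per step, while for smaller positive r the profile is also smeared, which illustrates the inaccuracy mentioned above.

```python
def friedrichs_step(c, r):
    """One step of equation (3.3.19) on a periodic mesh (assumption)."""
    n = len(c)
    return [0.5 * (1 - r) * c[(i + 1) % n] + 0.5 * (1 + r) * c[(i - 1) % n]
            for i in range(n)]


pulse = [0.0, 1.0, 0.0, 0.0, 0.0]

shifted = friedrichs_step(pulse, r=1.0)   # pure translation by one node
smeared = friedrichs_step(pulse, r=0.5)   # translation plus numerical spreading

print(shifted)
print(smeared)
```

In both cases the sum of the nodal values is unchanged, so the scheme conserves the total amount of tracer on this periodic mesh.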

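The stability condition is −1 ≤ r ≤ +1. For the cases where advection appears as an additional term in the diffusion equation, the reader is referred to the discussion in Chapter 9 of the book by Fletcher (1991).

3.3.4 Two space coordinates: The ADI method

Given a rectangle R (a ≤ x₁ ≤ b, c ≤ x₂ ≤ d) and the two-dimensional diffusion equation

$$\frac{\partial C(x_1, x_2, t)}{\partial t} = \mathcal{D}\left[\frac{\partial^2 C(x_1, x_2, t)}{\partial x_1^2} + \frac{\partial^2 C(x_1, x_2, t)}{\partial x_2^2}\right] \qquad (3.3.20)$$

with the initial condition $C(x_1, x_2, 0) = C_0(x_1, x_2)$ and the boundary conditions

$$C(a, x_2, t) = C_a(x_2, t), \qquad C(b, x_2, t) = C_b(x_2, t)$$

The grid points are i₁Δx (i₁ = 0, 1, ..., n₁), i₂Δx (i₂ = 0, 1, ..., n₂), and jΔt. Explicit difference schemes show poor stability properties (Mitchell, 1969). In terms of the central difference operator, it may be shown that an accurate implicit equation is

$$\left[1 - \frac{r}{2}\left(\delta_{x_1}^2 + \delta_{x_2}^2\right)\right] C_{i_1,i_2}^{j+1} = \left[1 + \frac{r}{2}\left(\delta_{x_1}^2 + \delta_{x_2}^2\right)\right] C_{i_1,i_2}^{j}$$

which is the extension of the Crank-Nicholson implicit formula to two dimensions. The widely used Peaceman-Rachford method deals with the Laplacian operator in two steps. Approximation is carried out on one space derivative with an explicit scheme and an implicit scheme for the other derivative (hence the name of Alternating Direction Implicit or ADI method), which results in the two successive formulas

$$\left(1 - \frac{r}{2}\,\delta_{x_1}^2\right) C_{i_1,i_2}^{j+1/2} = \left(1 + \frac{r}{2}\,\delta_{x_2}^2\right) C_{i_1,i_2}^{j}$$

$$\left(1 - \frac{r}{2}\,\delta_{x_2}^2\right) C_{i_1,i_2}^{j+1} = \left(1 + \frac{r}{2}\,\delta_{x_1}^2\right) C_{i_1,i_2}^{j+1/2}$$

Developing, the two steps can be written

$$-r\,C_{i_1-1,i_2}^{j+1/2} + 2(1+r)\,C_{i_1,i_2}^{j+1/2} - r\,C_{i_1+1,i_2}^{j+1/2} = r\,C_{i_1,i_2-1}^{j} + 2(1-r)\,C_{i_1,i_2}^{j} + r\,C_{i_1,i_2+1}^{j} \qquad (3.3.21)$$

$$-r\,C_{i_1,i_2-1}^{j+1} + 2(1+r)\,C_{i_1,i_2}^{j+1} - r\,C_{i_1,i_2+1}^{j+1} = r\,C_{i_1-1,i_2}^{j+1/2} + 2(1-r)\,C_{i_1,i_2}^{j+1/2} + r\,C_{i_1+1,i_2}^{j+1/2} \qquad (3.3.22)$$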

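The 'molecule' of the difference scheme over the time steps is given in Figure 3.19. The half-step may not have in general the significance of an intermediate time step and is more a matter of computational convenience. For sake of demonstration, let us assume that concentration on the edge is constant with time

$$C_{0,i_2}^{j+1} = C_{0,i_2}^{j+1/2} = C_{0,i_2}^{j}$$

with similar equations for the other three edges.

Figure 3.19 Finite difference 'molecule' for the alternating-directions implicit method (ADI) of solving the two-dimensional diffusion equation.

Let us first compute the condition along the vertical (y) edges for the first half-step. For i₁ = 1 and 1 < i₂ < n₂ − 1, equation (3.3.21) becomes

$$2(1+r)\,C_{1,i_2}^{j+1/2} - r\,C_{2,i_2}^{j+1/2} = r\,C_{1,i_2-1}^{j} + 2(1-r)\,C_{1,i_2}^{j} + r\,C_{1,i_2+1}^{j} + r\,C_{0,i_2}^{j} \qquad (3.3.23)$$

A similar equation holds for i₁ = n₁ − 1 and 1 < i₂ < n₂ − 1

$$-r\,C_{n_1-2,i_2}^{j+1/2} + 2(1+r)\,C_{n_1-1,i_2}^{j+1/2} = r\,C_{n_1-1,i_2-1}^{j} + 2(1-r)\,C_{n_1-1,i_2}^{j} + r\,C_{n_1-1,i_2+1}^{j} + r\,C_{n_1,i_2}^{j} \qquad (3.3.24)$$

and for the horizontal (x) edge. For i₂ = 1 and 1 < i₁ < n₁ − 1, equation (3.3.21) becomes

$$-r\,C_{i_1-1,1}^{j+1/2} + 2(1+r)\,C_{i_1,1}^{j+1/2} - r\,C_{i_1+1,1}^{j+1/2} = r\,C_{i_1,0}^{j} + 2(1-r)\,C_{i_1,1}^{j} + r\,C_{i_1,2}^{j} \qquad (3.3.25)$$

and for i₂ = n₂ − 1,

$$-r\,C_{i_1-1,n_2-1}^{j+1/2} + 2(1+r)\,C_{i_1,n_2-1}^{j+1/2} - r\,C_{i_1+1,n_2-1}^{j+1/2} = r\,C_{i_1,n_2-2}^{j} + 2(1-r)\,C_{i_1,n_2-1}^{j} + r\,C_{i_1,n_2}^{j} \qquad (3.3.26)$$

We now turn to the conditions at the corners. As an example, let us compute equation (3.3.21) of the ADI scheme for i₁ = i₂ = 1

$$2(1+r)\,C_{1,1}^{j+1/2} - r\,C_{2,1}^{j+1/2} = 2(1-r)\,C_{1,1}^{j} + r\,C_{1,2}^{j} + r\,C_{1,0}^{j} + r\,C_{0,1}^{j} \qquad (3.3.27)$$

and for i₁ = n₁ − 1 and i₂ = n₂ − 1

$$-r\,C_{n_1-2,n_2-1}^{j+1/2} + 2(1+r)\,C_{n_1-1,n_2-1}^{j+1/2} = r\,C_{n_1-1,n_2-2}^{j} + 2(1-r)\,C_{n_1-1,n_2-1}^{j} + r\,C_{n_1,n_2-1}^{j} + r\,C_{n_1-1,n_2}^{j} \qquad (3.3.28)$$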

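Let us form the system of equations for i₂ = 2, 3, ..., n₂ − 2

$$\begin{bmatrix}
2(1+r) & -r & 0 & \cdots & 0\\
-r & 2(1+r) & -r & \cdots & 0\\
\vdots & \ddots & \ddots & \ddots & \vdots\\
0 & \cdots & -r & 2(1+r) & -r\\
0 & \cdots & 0 & -r & 2(1+r)
\end{bmatrix}
\begin{bmatrix} C_{1,i_2}\\ C_{2,i_2}\\ \vdots\\ C_{n_1-2,i_2}\\ C_{n_1-1,i_2} \end{bmatrix}^{j+1/2}
= r \begin{bmatrix} C_{1,i_2-1}\\ C_{2,i_2-1}\\ \vdots\\ C_{n_1-2,i_2-1}\\ C_{n_1-1,i_2-1} \end{bmatrix}^{j}
+ 2(1-r) \begin{bmatrix} C_{1,i_2}\\ C_{2,i_2}\\ \vdots\\ C_{n_1-2,i_2}\\ C_{n_1-1,i_2} \end{bmatrix}^{j}
+ r \begin{bmatrix} C_{1,i_2+1}\\ C_{2,i_2+1}\\ \vdots\\ C_{n_1-2,i_2+1}\\ C_{n_1-1,i_2+1} \end{bmatrix}^{j}
+ r \begin{bmatrix} C_{0,i_2}\\ 0\\ \vdots\\ 0\\ C_{n_1,i_2} \end{bmatrix}$$

For i₂ = 1, the system keeps the same tridiagonal matrix on the left-hand side and becomes

$$\begin{bmatrix}
2(1+r) & -r & \cdots & 0\\
-r & 2(1+r) & \ddots & \vdots\\
\vdots & \ddots & \ddots & -r\\
0 & \cdots & -r & 2(1+r)
\end{bmatrix}
\begin{bmatrix} C_{1,1}\\ C_{2,1}\\ \vdots\\ C_{n_1-1,1} \end{bmatrix}^{j+1/2}
= 2(1-r) \begin{bmatrix} C_{1,1}\\ C_{2,1}\\ \vdots\\ C_{n_1-1,1} \end{bmatrix}^{j}
+ r \begin{bmatrix} C_{1,2}\\ C_{2,2}\\ \vdots\\ C_{n_1-1,2} \end{bmatrix}^{j}
+ r \begin{bmatrix} C_{1,0} + C_{0,1}\\ C_{2,0}\\ \vdots\\ C_{n_1-1,0} + C_{n_1,1} \end{bmatrix}$$

while for i₂ = n₂ − 1

$$\begin{bmatrix}
2(1+r) & -r & \cdots & 0\\
-r & 2(1+r) & \ddots & \vdots\\
\vdots & \ddots & \ddots & -r\\
0 & \cdots & -r & 2(1+r)
\end{bmatrix}
\begin{bmatrix} C_{1,n_2-1}\\ C_{2,n_2-1}\\ \vdots\\ C_{n_1-1,n_2-1} \end{bmatrix}^{j+1/2}
= r \begin{bmatrix} C_{1,n_2-2}\\ C_{2,n_2-2}\\ \vdots\\ C_{n_1-1,n_2-2} \end{bmatrix}^{j}
+ 2(1-r) \begin{bmatrix} C_{1,n_2-1}\\ C_{2,n_2-1}\\ \vdots\\ C_{n_1-1,n_2-1} \end{bmatrix}^{j}
+ r \begin{bmatrix} C_{1,n_2} + C_{0,n_2-1}\\ C_{2,n_2}\\ \vdots\\ C_{n_1-1,n_2} + C_{n_1,n_2-1} \end{bmatrix}$$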

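All these equations can be stacked as

$$\begin{bmatrix}
2(1+r) & -r & \cdots & 0\\
-r & 2(1+r) & \ddots & \vdots\\
\vdots & \ddots & \ddots & -r\\
0 & \cdots & -r & 2(1+r)
\end{bmatrix} C^{j+1/2}
= C^{j} \begin{bmatrix}
2(1-r) & r & \cdots & 0\\
r & 2(1-r) & \ddots & \vdots\\
\vdots & \ddots & \ddots & r\\
0 & \cdots & r & 2(1-r)
\end{bmatrix}
+ r \begin{bmatrix}
C_{1,0} + C_{0,1} & C_{0,2} & \cdots & C_{0,n_2-1} + C_{1,n_2}\\
C_{2,0} & 0 & \cdots & C_{2,n_2}\\
\vdots & & & \vdots\\
C_{n_1-1,0} + C_{n_1,1} & C_{n_1,2} & \cdots & C_{n_1,n_2-1} + C_{n_1-1,n_2}
\end{bmatrix}$$

where $C^j$ is the (n₁ − 1) × (n₂ − 1) matrix at time step j of concentration values at all the inner grid points (not including the boundaries). In a compact matrix form, the equation reads

$$A_{n_1-1}\,C^{j+1/2} = C^{j}\,B_{n_2-1} + rH \qquad (3.3.29)$$

In this equation, $A_{n_1-1}$ is an (n₁ − 1) × (n₁ − 1) tridiagonal matrix with 2(1 + r) on the main diagonal and −r on the other two diagonals. $B_{n_2-1}$ is an (n₂ − 1) × (n₂ − 1) tridiagonal matrix with 2(1 − r) on the main diagonal and r on the other two diagonals. H is an (n₁ − 1) × (n₂ − 1) matrix built as indicated: its outer rows and columns hold the values on the adjacent edges, the corner elements being the sum of the two adjacent edge values, and all its other elements are zero.

For the second half-step described by equation (3.3.22), we can write, for i₂ = 2, 3, ..., n₂ − 2,

$$-r \begin{bmatrix} C_{1,i_2-1}\\ \vdots\\ C_{n_1-1,i_2-1} \end{bmatrix}^{j+1}
+ 2(1+r) \begin{bmatrix} C_{1,i_2}\\ \vdots\\ C_{n_1-1,i_2} \end{bmatrix}^{j+1}
- r \begin{bmatrix} C_{1,i_2+1}\\ \vdots\\ C_{n_1-1,i_2+1} \end{bmatrix}^{j+1}
= \begin{bmatrix}
2(1-r) & r & \cdots & 0\\
r & 2(1-r) & \ddots & \vdots\\
\vdots & \ddots & \ddots & r\\
0 & \cdots & r & 2(1-r)
\end{bmatrix}
\begin{bmatrix} C_{1,i_2}\\ \vdots\\ C_{n_1-1,i_2} \end{bmatrix}^{j+1/2}
+ r \begin{bmatrix} C_{0,i_2}\\ 0\\ \vdots\\ 0\\ C_{n_1,i_2} \end{bmatrix}$$

and expressions similar to those of the first half-step for concentrations on the edges. In a matrix form, the system of equations can be written

$$C^{j+1}\,A_{n_2-1} = B_{n_1-1}\,C^{j+1/2} + rH \qquad (3.3.30)$$

In this equation, $A_{n_2-1}$ and $B_{n_1-1}$ are defined as before, while H is an (n₁ − 1) × (n₂ − 1) matrix. Rewriting the expression (3.3.29) for the first half-step as

$$C^{j+1/2} = A_{n_1-1}^{-1}\left(C^{j}\,B_{n_2-1} + rH\right)$$

and inserting into equation (3.3.30), the matrix equation becomes

$$C^{j+1}\,A_{n_2-1} = B_{n_1-1}\,A_{n_1-1}^{-1}\left(C^{j}\,B_{n_2-1} + rH\right) + rH$$

and finally

$$C^{j+1} = \left[B_{n_1-1}\,A_{n_1-1}^{-1}\left(C^{j}\,B_{n_2-1} + rH\right) + rH\right] A_{n_2-1}^{-1} \qquad (3.3.31)$$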

We now get a fairly good feeling that handling boundary conditions in two dimensions is significantly more difficult than in one-dimensional problems. Flux conditions can be applied to the boundaries using the method of fictitious points.

Solve the two-dimensional diffusion equation

$$\frac{\partial C(x,y,t)}{\partial t} = \mathcal{D}\left[\frac{\partial^2 C(x,y,t)}{\partial x^2} + \frac{\partial^2 C(x,y,t)}{\partial y^2}\right]$$

in the rectangle defined by a ≤ x ≤ b = a + L and c ≤ y ≤ d with the initial condition C(x, y, 0) = 0 and the boundary conditions

$$C(a, y, t) = 0, \qquad C(b, y, t) = \frac{d - y}{d - c}, \qquad C(x, c, t) = \frac{x - a}{b - a}, \qquad C(x, d, t) = 0$$

for a diffusion coefficient $\mathcal{D}$ = 0.005 cm² s⁻¹. Let us mesh the rectangle assuming Δx = 0.2, which is equivalent to n₁ = n_y = 4, n₂ = n_x = 5, and Δt = 0.1, i.e., r = 0.0125 (note that x and y are switched relative to the indices n₁ and n₂). We now build the time-independent matrices

$$A_{n_2-1} = \begin{bmatrix}
2.0250 & -0.0125 & 0 & 0\\
-0.0125 & 2.0250 & -0.0125 & 0\\
0 & -0.0125 & 2.0250 & -0.0125\\
0 & 0 & -0.0125 & 2.0250
\end{bmatrix}$$

and

$$B_{n_2-1} = \begin{bmatrix}
1.9750 & 0.0125 & 0 & 0\\
0.0125 & 1.9750 & 0.0125 & 0\\
0 & 0.0125 & 1.9750 & 0.0125\\
0 & 0 & 0.0125 & 1.9750
\end{bmatrix}$$

which are combined into

$$B_{n_2-1}\,A_{n_2-1}^{-1} = \begin{bmatrix}
0.9754 & 0.0122 & 0.0001 & 0.0000\\
0.0122 & 0.9755 & 0.0122 & 0.0001\\
0.0001 & 0.0122 & 0.9755 & 0.0122\\
0.0000 & 0.0001 & 0.0122 & 0.9754
\end{bmatrix}$$

Likewise

$$A_{n_1-1} = \begin{bmatrix}
2.0250 & -0.0125 & 0\\
-0.0125 & 2.0250 & -0.0125\\
0 & -0.0125 & 2.0250
\end{bmatrix}
\qquad \text{and} \qquad
B_{n_1-1} = \begin{bmatrix}
1.9750 & 0.0125 & 0\\
0.0125 & 1.9750 & 0.0125\\
0 & 0.0125 & 1.9750
\end{bmatrix}$$

so

$$B_{n_1-1}\,A_{n_1-1}^{-1} = \begin{bmatrix}
0.9754 & 0.0122 & 0.0001\\
0.0122 & 0.9755 & 0.0122\\
0.0001 & 0.0122 & 0.9754
\end{bmatrix}$$

H is built from the boundary conditions as

$$H = \begin{bmatrix}
0.2000 & 0.4000 & 0.6000 & 1.55\\
0 & 0 & 0 & 0.50\\
0 & 0 & 0 & 0.25
\end{bmatrix}$$

hence the time-independent boundary term

$$r\left(B_{n_1-1}\,A_{n_1-1}^{-1}\,H + H\right) A_{n_2-1}^{-1} = \begin{bmatrix}
0.0025 & 0.0049 & 0.0075 & 0.0190\\
0.0000 & 0.0000 & 0.0001 & 0.0062\\
0.0000 & 0.0000 & 0.0000 & 0.0031
\end{bmatrix}$$

The concentration matrices will be shown as $C^j$ fringed on each edge by the boundary values, i.e., as (n_y + 1) × (n_x + 1) = 5 × 6 matrices, and a symbol with overbar $\bar{C}^j$ will be used. Given the initial concentration matrix

$$\bar{C}^0 = \begin{bmatrix}
0 & 0.2 & 0.4 & 0.6 & 0.8 & 1.0\\
0 & 0 & 0 & 0 & 0 & 0.75\\
0 & 0 & 0 & 0 & 0 & 0.50\\
0 & 0 & 0 & 0 & 0 & 0.25\\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$

we can calculate concentrations at t = Δt

$$\bar{C}^1 = \begin{bmatrix}
0 & 0.2 & 0.4 & 0.6 & 0.8 & 1.00\\
0 & 0.0025 & 0.0049 & 0.0075 & 0.0190 & 0.75\\
0 & 0.0000 & 0.0000 & 0.0001 & 0.0062 & 0.50\\
0 & 0.0000 & 0.0000 & 0.0000 & 0.0031 & 0.25\\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$

t = 2Δt

$$\bar{C}^2 = \begin{bmatrix}
0 & 0.2 & 0.4 & 0.6 & 0.8 & 1.00\\
0 & 0.0049 & 0.0098 & 0.0149 & 0.0372 & 0.75\\
0 & 0.0000 & 0.0001 & 0.0003 & 0.0124 & 0.50\\
0 & 0.0000 & 0.0000 & 0.0001 & 0.0061 & 0.25\\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$

and so forth.
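The two half-steps (3.3.21) and (3.3.22) can also be coded node by node on the fringed matrix, which avoids building H explicitly: the boundary values enter through the neighbours of the edge nodes. The sketch below solves each tridiagonal system by Thomas elimination instead of using the matrix inverses of the text, and assumes time-independent edge values; the names `solve_tridiagonal` and `adi_step` are illustrative.

```python
def solve_tridiagonal(diag, off, rhs):
    """Solve T x = rhs for a tridiagonal T with constant diagonal `diag`
    and constant off-diagonals `off` (Thomas algorithm)."""
    n = len(rhs)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0] = off / diag
    dp[0] = rhs[0] / diag
    for k in range(1, n):
        denom = diag - off * cp[k - 1]
        cp[k] = off / denom
        dp[k] = (rhs[k] - off * dp[k - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for k in range(n - 2, -1, -1):
        x[k] = dp[k] - cp[k] * x[k + 1]
    return x


def adi_step(grid, r):
    """One full ADI time step on a grid fringed by fixed boundary values."""
    n1, n2 = len(grid) - 1, len(grid[0]) - 1
    # First half-step (3.3.21): implicit along i1, explicit along i2.
    half = [row[:] for row in grid]
    for i2 in range(1, n2):
        rhs = [r * grid[i1][i2 - 1] + 2 * (1 - r) * grid[i1][i2]
               + r * grid[i1][i2 + 1] for i1 in range(1, n1)]
        rhs[0] += r * grid[0][i2]      # known edge value at i1 = 0
        rhs[-1] += r * grid[n1][i2]    # known edge value at i1 = n1
        column = solve_tridiagonal(2 * (1 + r), -r, rhs)
        for i1 in range(1, n1):
            half[i1][i2] = column[i1 - 1]
    # Second half-step (3.3.22): implicit along i2, explicit along i1.
    new = [row[:] for row in grid]
    for i1 in range(1, n1):
        rhs = [r * half[i1 - 1][i2] + 2 * (1 - r) * half[i1][i2]
               + r * half[i1 + 1][i2] for i2 in range(1, n2)]
        rhs[0] += r * grid[i1][0]      # known edge value at i2 = 0
        rhs[-1] += r * grid[i1][n2]    # known edge value at i2 = n2
        row = solve_tridiagonal(2 * (1 + r), -r, rhs)
        for i2 in range(1, n2):
            new[i1][i2] = row[i2 - 1]
    return new


r = 0.005 * 0.1 / 0.2 ** 2             # 0.0125
grid0 = [
    [0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.75],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.50],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.25],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
grid1 = adi_step(grid0, r)
for row in grid1:
    print([round(value, 4) for value in row])
```

Calling `adi_step` again on `grid1` gives the next time step, and so forth.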

4 Probability and statistics

4.1 A single random variable

4.1.1 General

In most natural situations, physical and chemical parameters are not defined by a unique deterministic value. Due to our limited comprehension of the natural processes and imperfect analytical procedures (notwithstanding the interaction of the measurement itself with the process investigated), measurements of concentrations, isotopic ratios and other geochemical parameters must be considered as samples taken from an infinite 'reservoir' or population of attainable values. Defining random variables in a rigorous way would require a rather lengthy development of probability spaces and the measure theory which is beyond the scope of this book. For that purpose, the reader is referred to any of the many excellent standard textbooks on probability and statistics (e.g., Hamilton, 1964; Hoel et al., 1971; Lloyd, 1980; Papoulis, 1984; Dudewicz and Mishra, 1988). For most practical purposes, the statistical analysis of geochemical parameters will be restricted to the field of continuous random variables.

• A random variable X is defined by
(a) a continuous domain Ω of definition in ℝ, e.g., ]−∞, +∞[, [0, +∞[, or [−1, +1];
(b) its probability distribution function (sometimes called the cumulative distribution function or, for short, distribution function), i.e., a continuous non-decreasing function F(x) defined for each value x in Ω, such as

$$F(x) = \mathcal{P}\{X \le x\} \qquad (4.1.1)$$

where $\mathcal{P}$ stands for 'probability of'. Strict notation conventions should be observed: an upper-case letter should refer to a random variable, while the same letter in lower case refers to the values taken by this variable. Writing the same relation at x + dx

$$F(x + dx) = \mathcal{P}\{X \le x + dx\}$$

and subtracting, we get

$$F(x + dx) - F(x) = \mathcal{P}\{x < X \le x + dx\}$$

The probability density function f(x) (also the density function or frequency function) is the derivative of the distribution function F(x). These two functions relate through

$$f(x) = \frac{dF(x)}{dx} = \frac{\mathcal{P}\{x < X \le x + dx\}}{dx} \qquad (4.1.2)$$

f(x) therefore has the significance of a probability per unit of X in the neighborhood of the value x. Because F(x) is non-decreasing, its derivative f(x) is non-negative. Conversely, if Ω is the domain ]−∞, +∞[, the cumulative distribution function F(x) relates to the probability density function f(x) through

$$F(x) = \int_{-\infty}^{x} f(u)\,du \qquad (4.1.3)$$

F(x) represents the surface under the curve of the function f(u) up to u = x (Figure 4.1). Since the total probability over the domain is 1

$$\int_{-\infty}^{+\infty} f(u)\,du = 1 \qquad (4.1.4)$$

Figure 4.1 Relationship between the probability density function f(x) of the continuous random variable X and the cumulative distribution function F(x). The shaded area under the curve f(x) up to x₀ is equal to the value of F(x) at x₀.

If p is an integer between 0 and 100, the pth percentile is the value x_p of the random variable X which limits to the right p/100ths of the surface S under the probability curve (Figure 4.2).

4.1.2 Expectation and moments

Let us consider a continuous random variable X defined over a given domain Ω of ℝ, such as [0, +∞[ or ]−∞, +∞[, and let f(x) be its probability density function over Ω. Let g(X) be a function of the random variable X.

• The expectation $\mathcal{E}(g)$ of g(X) is

$$\mathcal{E}(g) = \int_{\Omega} g(x)\,f(x)\,dx \qquad (4.1.5)$$

and is usually denoted with greek letters. Expectation is therefore a linear operator. The expectation has the meaning of a centroid with the 'local' probability f(x)dx as a weight-function.

• Using parentheses to emphasize power functions, the nth moment is the expectation $\mathcal{E}[(X)^n]$ of $X^n$. The mean μ is the first moment

$$\mu = \mathcal{E}(X) = \int_{\Omega} x\,f(x)\,dx \qquad (4.1.6)$$

• The nth central moment is $\mathcal{E}[(X - \mu)^n]$ or $\mathcal{E}\{[X - \mathcal{E}(X)]^n\}$. The variance σ² is the second central moment, i.e.,

$$\mathrm{var}(X) = \sigma^2 = \sigma_X^2 = \mathcal{E}\{[X - \mathcal{E}(X)]^2\} = \mathcal{E}(X^2) - \mathcal{E}(X)\,\mathcal{E}(X) \qquad (4.1.7)$$

• The standard deviation σ is the square-root of the variance and has the same unit as the random variable. A random variable is standardized (or reduced) if its variance is unity and centered if its mean is zero.

Figure 4.2 Definition of the percentile p: the curve is the probability density function f(x) of the continuous random variable X. x_p is the pth percentile when the surface up to x_p represents p percent of the total surface S under the curve.

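Calculate the mean and variance of the uniform density

$$f(x) = \frac{1}{2a} \qquad \text{for } -a \le x \le a$$

The mean is the first moment, i.e.,

$$\mu = \mathcal{E}(X) = \int_{-a}^{+a} x\,\frac{1}{2a}\,dx = \left[\frac{x^2}{4a}\right]_{-a}^{+a} = 0$$

while the variance is the second central moment

$$\sigma^2 = \mathcal{E}(X^2) = \int_{-a}^{+a} x^2\,\frac{1}{2a}\,dx = \left[\frac{x^3}{6a}\right]_{-a}^{+a} = \frac{a^3 - (-a)^3}{6a} = \frac{a^2}{3}$$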
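These two integrals are easily checked numerically. The sketch below evaluates the expectation (4.1.5) with a simple midpoint rule; the helper name `expectation`, the number of sub-intervals and the value a = 2 are arbitrary choices made for the illustration.

```python
def expectation(g, f, lower, upper, n=100000):
    """Midpoint-rule estimate of the integral of g(x) f(x) over [lower, upper]."""
    h = (upper - lower) / n
    total = 0.0
    for k in range(n):
        x = lower + (k + 0.5) * h
        total += g(x) * f(x)
    return total * h


a = 2.0


def density(x):
    # uniform density over [-a, +a]
    return 1.0 / (2.0 * a)


mean = expectation(lambda x: x, density, -a, a)
variance = expectation(lambda x: (x - mean) ** 2, density, -a, a)

print(mean, variance, a ** 2 / 3)
```

The same helper applies to any density defined over a finite interval, provided the integrand is smooth enough for the midpoint rule.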

Find the variance of the Cauchy distribution

$$f(x) = \frac{1}{\pi}\,\frac{1}{1 + x^2}$$

Because f(x) is symmetric about the origin, its mean must be zero. Its variance should be

$$\mathcal{E}(X^2) = \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{x^2}{1 + x^2}\,dx$$

but this integral diverges when x → ∞. The Cauchy distribution has no finite variance.

• The moment generating function $M(t) = \mathcal{E}(e^{tX})$ is particularly useful to compute moments. Expanding the exponential, we get

$$M(t) = \mathcal{E}\left[1 + tX + \frac{t^2}{2!}X^2 + \ldots\right] = 1 + t\,\mathcal{E}(X) + \frac{t^2}{2!}\,\mathcal{E}(X^2) + \ldots$$

Alternatively, M(t) may be expanded in Taylor series in the neighborhood of t = 0

$$M(t) = M(0) + t\,M'(0) + \frac{t^2}{2!}\,M''(0) + \ldots \qquad (4.1.8)$$

where the number of primes indicates the order of the derivative. Comparing the last two equations shows that the moment of order n is equal to the nth derivative of M(t) for t = 0

$$\mathcal{E}(X^n) = \left[\frac{d^n M(t)}{dt^n}\right]_{t=0} \qquad (4.1.9)$$

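Calculate the moment generating function of the general normal distribution f(x) given by

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right]$$

From the definition of the moment generating function, we write

$$M(t) = \int_{-\infty}^{+\infty} e^{tx}\,\frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right] dx$$

Multiplying and dividing the right-hand side by the exponential of μt + σ²t²/2, we get

$$M(t) = \exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right) \int_{-\infty}^{+\infty} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x - \mu)^2 - 2\sigma^2 t (x - \mu) + \sigma^4 t^2}{2\sigma^2}\right] dx$$

which we rearrange as

$$M(t) = \exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right) \int_{-\infty}^{+\infty} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x - \mu - \sigma^2 t)^2}{2\sigma^2}\right] dx$$

The integrand is almost the normal density. Introducing the variable

$$u = \frac{x - \mu - \sigma^2 t}{\sigma}$$

M(t) can be rewritten as

$$M(t) = \exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right) \times \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-u^2/2}\,du$$

On the right-hand side of the multiplication sign, we recognize the integral of the reduced normal density function which sums up to one, and therefore

$$M(t) = \exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right) \qquad (4.1.10)$$

The first and second derivatives are

$$M'(t) = (\mu + \sigma^2 t)\,M(t)$$

$$M''(t) = \sigma^2 M(t) + (\mu + \sigma^2 t)\,M'(t) = \sigma^2 M(t) + (\mu + \sigma^2 t)^2 M(t)$$

The values of M(t) and its first two derivatives at t = 0 are M(0) = 1, M'(0) = μ, and M''(0) = μ² + σ². The mean of the normal density function is therefore

$$\mathcal{E}(X) = M'(0) = \mu \qquad (4.1.11)$$

and its variance

$$\mathcal{E}(X^2) - \mathcal{E}(X)\,\mathcal{E}(X) = M''(0) - [M'(0)]^2 = \mu^2 + \sigma^2 - \mu^2 = \sigma^2 \qquad (4.1.12)$$

two well-known and quite useful results.

4.1.3 A compendium of some common probability density functions

• A random variable is said to be distributed as the uniform distribution if f(x) is given by

$$f(x) = \frac{1}{b - a}, \qquad a \le x \le b \qquad (4.1.13)$$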

and f(x) = 0 otherwise.

• Because of its importance in natural phenomena, e.g., radioactive decay and population dynamics, let us introduce the exponential distribution through an illustration. n atoms are assumed to decay over the time interval [0, θ], each atom having the same probability of decaying at any time in this interval. In other words, the time at which an atom decays is uniformly distributed over [0, θ]. Let us call N(t) the number of decay events between 0 and t. The probability that a single atom has not decayed at time t is 1 − t/θ (Figure 4.3). The probability that none of the n atoms has decayed at time t is

$$\mathcal{P}\{N(t) = 0\} = \left(1 - \frac{t}{\theta}\right)^n \qquad (4.1.14)$$

The probability that the first decay takes place before t is therefore the complement of this expression to one since

$$\mathcal{P}\{N(t) = 0\} + \mathcal{P}\{N(t) \ne 0\} = 1$$

The cumulative distribution function of the time T to the first decay is

$$F(t) = \mathcal{P}\{T \le t\} = \mathcal{P}\{N(t) \ne 0\} = 1 - \mathcal{P}\{N(t) = 0\}$$

and therefore

$$F(t) = 1 - \left(1 - \frac{t}{\theta}\right)^n$$

Figure 4.3 Distribution of events on the time axis in a Poisson process over the time interval [0, θ]: the probability of occurrence for one event in a time interval depends on its duration only (see text).

The corresponding density of probability function f(t) is obtained as the derivative of F(t)

$$f(t) = \frac{dF(t)}{dt} = \frac{n}{\theta}\left(1 - \frac{t}{\theta}\right)^{n-1} \qquad (4.1.15)$$

Let us assume that probability of decay per unit time is constant, i.e.,

$$\lambda = \frac{n}{\theta}$$

We rewrite equation (4.1.15) as

$$f(t) = \lambda\left(1 - \frac{\lambda t}{n}\right)^{n-1}$$

and 'enlarge' our view by increasing the number of decays n → ∞, which because of constant decay rate implies θ → ∞. For finite time t, λt ≪ n and, expanding the logarithm to the first order (see Section 3.1), we get the density of the exponential distribution defined over [0, +∞[

$$f(t) = \lambda \exp\left[(n - 1)\ln\left(1 - \frac{\lambda t}{n}\right)\right] \approx \lambda\,e^{-\lambda t} \qquad (4.1.16)$$
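The passage to the limit can be illustrated numerically by evaluating the density (4.1.15) of the time to the first decay for increasing n at constant λ = n/θ. The sketch below uses arbitrary illustrative values λ = 2 and t = 0.7; the function name `first_decay_density` is hypothetical.

```python
import math


def first_decay_density(t, n, lam):
    """Density (4.1.15) of the time to the first decay for n atoms decaying
    uniformly over [0, theta], with theta chosen so that n / theta = lam."""
    theta = n / lam
    return (n / theta) * (1.0 - t / theta) ** (n - 1)


lam, t = 2.0, 0.7
limit = lam * math.exp(-lam * t)           # exponential density (4.1.16)

for n in (10, 100, 10000, 1000000):
    value = first_decay_density(t, n, lam)
    print(n, value, value - limit)
```

The difference from the exponential density decreases roughly as 1/n, consistent with a first-order expansion of the logarithm.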

The exponential distribution has mean μ = 1/λ and variance σ² = 1/λ².

• The most widely used distribution is the normal distribution (also Gaussian) with the familiar bell shape defined over ]−∞, +∞[. Its most general form is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right] \qquad (4.1.17)$$

with mean μ and variance σ². The moment generating function is given by equation (4.1.10). It is said to be centered when μ = 0 and reduced when σ² = 1.

• A random variable X is distributed as a log-normal distribution if ln X is distributed as a normal distribution. If U is a random normal variable, a log-normal distribution is the distribution of a variable X such as

$$\ln X = \ln \mu + U \ln \sigma_R \qquad (4.1.18)$$

or

$$X = \mu\,\sigma_R^{\,U} \qquad (4.1.19)$$

σ_R can be viewed as a 'relative' standard deviation: for U = ±1, X is multiplied, respectively divided, by σ_R.

Figure 4.4 Gamma probability density functions for scale parameter β = 1 and different shape parameters α = 1, 2, and 5.

• The Cauchy distribution, also bell-shaped, is defined over ]−∞, +∞[ and reads

$$f(x) = \frac{1}{\pi\,(1 + x^2)} \qquad (4.1.20)$$

Its mean is zero but we have demonstrated that it does not have a finite variance.

• Several distributions belong to the family of gamma distributions defined over ]0, +∞[

$$f(x) = \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\,x^{\alpha - 1}\,e^{-x/\beta} \qquad (4.1.21)$$

where α and β are parameters with α and β > 0. α is often referred to as the shape parameter and β as the scale parameter. This function is plotted in Figure 4.4 for β = 1 and α = 1, 2, and 5. Γ(α) is the gamma Eulerian function

$$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha - 1}\,e^{-x}\,dx \qquad (4.1.22)$$

and is also well known in its form of the factorial function for integer values of α

$$\Gamma(\alpha + 1) = \alpha! \qquad (4.1.23)$$

Additional properties of this function are the recursion formula

$$\Gamma(\alpha + 1) = \alpha\,\Gamma(\alpha) \qquad (4.1.24)$$

and the special value

$$\Gamma\left(\tfrac{1}{2}\right) = \sqrt{\pi} \qquad (4.1.25)$$

The mean of the gamma distribution is αβ and its variance αβ². The moment generating function is

$$M(t) = (1 - \beta t)^{-\alpha} \qquad (4.1.26)$$

The parameters α and β can be computed from μ and σ² using

$$\alpha = (\mu/\sigma)^2, \qquad \beta = \sigma^2/\mu \qquad (4.1.27)$$
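Equations (4.1.21) to (4.1.27) translate directly into code with the gamma function of the Python standard library. The sketch below is only an illustration: the names `gamma_parameters` and `gamma_density` are hypothetical, and the values μ = 2, σ² = 1 are arbitrary.

```python
import math


def gamma_parameters(mean, variance):
    """Shape and scale parameters from the mean and variance, equation (4.1.27)."""
    alpha = mean ** 2 / variance
    beta = variance / mean
    return alpha, beta


def gamma_density(x, alpha, beta):
    """Gamma probability density function, equation (4.1.21)."""
    return x ** (alpha - 1) * math.exp(-x / beta) / (beta ** alpha * math.gamma(alpha))


alpha, beta = gamma_parameters(mean=2.0, variance=1.0)
print(alpha, beta)                  # shape and scale
print(alpha * beta)                 # recovered mean
print(alpha * beta ** 2)            # recovered variance

# Properties (4.1.23) to (4.1.25) of the gamma function
print(math.gamma(5), math.factorial(4))
print(math.gamma(0.5) ** 2, math.pi)
```

With α = 1, `gamma_density` reduces to an exponential density with rate 1/β, which provides a quick consistency check of the implementation.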

Setting α = 1 and β = 1/λ, we recognize the exponential distribution of equation (4.1.16).

• For α = ν/2 and β = 2, the gamma distribution becomes the chi-squared (χ²) distribution

$$f(x) = \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\,x^{\nu/2 - 1}\,e^{-x/2} \qquad (4.1.28)$$

with mean ν and variance 2ν. For reasons that will appear later, ν is known as the number of degrees of freedom. The pth percentile of the chi-squared distribution with ν degrees of freedom is denoted $\chi^2_{p,\nu}$.

• A special case of the gamma distribution is obtained by replacing α by n + 1, where n is an integer, and x by λt (this can be easily achieved using the change of variable technique discussed below). The resulting distribution f(x) is

$$f(x) = \frac{1}{n!}\,x^n\,e^{-x} \qquad (4.1.29)$$

and, read as a function of the integer n, is known as the Poisson distribution.

• A random variable X is distributed as the beta distribution over the range [0, 1] if its density function is given by

$$f(x) = \frac{\Gamma(\alpha_1 + \alpha_2)}{\Gamma(\alpha_1)\,\Gamma(\alpha_2)}\,x^{\alpha_1 - 1}\,(1 - x)^{\alpha_2 - 1} \qquad (4.1.30)$$

with the two parameters α₁ and α₂ > 0. This function is plotted in Figure 4.5 for different combinations of α₁ and α₂. Its mean and variance are

$$\mu = \frac{\alpha_1}{\alpha_1 + \alpha_2}, \qquad \sigma^2 = \frac{\alpha_1 \alpha_2}{(\alpha_1 + \alpha_2)^2\,(\alpha_1 + \alpha_2 + 1)} \qquad (4.1.31)$$

The parameters α₁ and α₂ can be computed from μ and σ² using

$$\alpha_1 = \frac{\mu^2 (1 - \mu)}{\sigma^2} - \mu \qquad \text{and} \qquad \alpha_2 = \frac{\mu (1 - \mu)^2}{\sigma^2} - (1 - \mu) \qquad (4.1.32)$$

No simple form of the moment generating function exists. In the special case where α₁ = α₂ = 1, the beta distribution reduces to the uniform distribution over [0, 1].
• Finally, we will frequently refer to Snedecor's F-distribution. A random variable defined over ]0, +∞[ is distributed with the F-distribution with ν₁ and ν₂ degrees of freedom

Figure 4.5 Beta probability density functions for different parameter pairs α₁ and α₂.

when its density function is given by

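$$f(x) = \frac{\Gamma[(\nu_1+\nu_2)/2]}{\Gamma(\nu_1/2)\,\Gamma(\nu_2/2)}\left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2}\frac{x^{\nu_1/2-1}}{\left(1+\nu_1 x/\nu_2\right)^{(\nu_1+\nu_2)/2}} \qquad (4.1.33)$$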

The pth percentile of the F-distribution with ν₁ and ν₂ degrees of freedom is denoted F_{p,ν₁,ν₂}. The order of ν₁ and ν₂ is critical since the F-distribution is not symmetric in these variables. Its mean is

$$\mu = \frac{\nu_2}{\nu_2-2}, \qquad \nu_2 > 2 \qquad (4.1.34)$$

and its variance

$$\sigma^2 = \frac{2\nu_2^2(\nu_1+\nu_2-2)}{\nu_1(\nu_2-2)^2(\nu_2-4)}, \qquad \nu_2 > 4 \qquad (4.1.35)$$

• Making the change of variable x = t² in the F density for ν₁ = 1 and ν₂ = n, we get the Student's t-distribution defined over ]−∞, +∞[

$$f(t) = \frac{\Gamma[(n+1)/2]}{\sqrt{n\pi}\,\Gamma(n/2)}\left(1+\frac{t^2}{n}\right)^{-(n+1)/2} \qquad (4.1.36)$$

Its mean is zero and its variance n/(n − 2). The pth percentile of the t-distribution with ν degrees of freedom is denoted t_{p,ν}. The Student's t-distribution converges rapidly towards the normal distribution: in practice, when ν > 30, the two distributions become indistinguishable.


Useful relations among percentiles are

$$t_{1-p/2,\nu}^{\,2} = F_{1-p,1,\nu} \qquad (4.1.37)$$

and (4.1.38)

4.1.4 Some relationships between fundamental distributions

Although the formal proof of some relationships will be postponed until the necessary background is exposed, it is probably necessary at this point to justify the rather lengthy compendium of distributions of Section 4.1.3.

The exponential distribution with parameter λ is the distribution of waiting times ('distance' in time) between events which take place at a mean rate of λ. It is also the distribution of distances between features which have a uniform probability of occurrence (Poisson process), such as the simplest model of faults on a map. The gamma distribution with parameters n and λ⁻¹, where n is an integer, is the distribution of the waiting time between the first and the nth successive events in a Poisson process. Alternatively, the distribution f(t), such as

$$f(t) = \frac{1}{n!}(\lambda t)^{n}\,e^{-\lambda t} \qquad (4.1.39)$$

represents the probability that n events have occurred over the time t. The associated gamma and Poisson distributions are well suited to describe non-negative quantities that result from the addition of a finite number of units. Detrital input to a sedimentary layer from a particular drainage basin through river transport can be treated as a Poisson process by assuming that small batches of sediments are carried to the sea at random times. Each single layer represents a time interval over which the total sedimentary mass added from that drainage basin may be treated as a gamma variable. The same simple approach can be used to model the distribution of contributions from a magma source to a given batch of magma. Deloule et al. (1991) have shown that the energy spectrum of hydrogen atoms in amphiboles measured by ion probe fits a gamma distribution consistent with the atom being ejected from the sample after absorption of about seven electrons by a neighboring iron atom.

Given two gamma variables X and Y with parameters α_X, β and α_Y, β, respectively, the proportion X/(X + Y) is distributed as a beta distribution with parameters α_X and α_Y. This relationship between the exponential, gamma and beta distributions is useful in handling mass balance problems. Using once again the sedimentary example, if only two basins contribute material to a detrital layer with quantities being described by Poisson processes of identical rate λ but different numbers of batches n_X and n_Y, the proportion of each component is distributed as a beta distribution with parameters n_X and n_Y. Although this model may appear a little simplistic, it is still appealing enough to use the beta distribution for mass fractions in mixtures.
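The gamma-to-beta relationship can be checked with a small Monte Carlo experiment. The sketch below is an illustration under stated assumptions: the function names, the batch numbers (3 and 12), the common scale, and the number of trials are hypothetical choices, and only Python's standard `random` and `statistics` modules are used.

```python
import random
import statistics


def simulate_mass_fraction(n_x, n_y, scale=1.0, trials=20000, seed=0):
    """Draw X/(X + Y) for two gamma variables sharing the same scale parameter."""
    rng = random.Random(seed)
    fractions = []
    for _ in range(trials):
        x = rng.gammavariate(n_x, scale)   # contribution of the first basin
        y = rng.gammavariate(n_y, scale)   # contribution of the second basin
        fractions.append(x / (x + y))
    return fractions


def beta_mean_and_variance(a1, a2):
    """Theoretical mean and variance of a beta distribution with parameters a1, a2."""
    total = a1 + a2
    return a1 / total, a1 * a2 / (total ** 2 * (total + 1.0))


fractions = simulate_mass_fraction(3, 12)
print(statistics.fmean(fractions), statistics.pvariance(fractions))
print(beta_mean_and_variance(3, 12))
```

If the relationship holds, the simulated mean and variance of the mass fraction should approach the theoretical beta values as the number of trials grows; the common scale parameter cancels out of the ratio.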


The physical and conceptual importance of the normal distribution rests on one unique property: the sum of n random variables distributed with almost any arbitrary distribution tends to be distributed as a normal variable when n → ∞ (the Central Limit Theorem). Most processes that result from the addition of numerous elementary processes therefore can be adequately parameterized with normal random variables. On any sort of axis that extends from −∞ to +∞, or when density on the negative side is negligible, most physical or chemical random variables can be represented to a good approximation by a normal density function. The normal distribution can be viewed as a position distribution.

The ratio of two normal random variables with zero mean is distributed as a Cauchy variable. Isotopic ratios such as ²⁰⁶Pb/²⁰⁴Pb and ²⁰⁷Pb/²⁰⁴Pb therefore should not be described as normal variables since ratios of ratios (e.g., ²⁰⁷Pb/²⁰⁶Pb) should be distributed with a consistent distribution. A consistent distribution for isotopic ratios is the log-normal distribution.

The square of the distance between two points with position distributed normally is distributed as the χ² distribution with one degree of freedom. The sum of n such distances is distributed as the χ² distribution with n degrees of freedom. The ratio of squared distances, which is used for instance to test the ratio of variances or the ratio of a squared distance to a variance, is distributed as an F-distribution.

4.1.5 Estimators

The set of all possible outcomes of a measurement considered as a random variable is usually called the population. The parameters of the density function associated with a particular population, e.g., mean or variance, are not physically accessible since their determination would require an infinite number of measurements. A measurement, or more commonly a set of measurements ('points' or 'observations'), produces a finite set of outcomes called a sample.
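Any convenient number describing in a compact form some property of the sample is called a statistic, e.g., the sample mean

$$\bar{x} = \frac{1}{m}\sum_{j=1}^{m} x_j \qquad (4.1.40)$$

where m is the number of observations x_j in the sample, or the sample variance

$$s^2 = \frac{1}{m-1}\sum_{j=1}^{m}(x_j-\bar{x})^2 \qquad (4.1.41)$$

The (m − 1) factor will be dealt with below. An alternative expression for s² is obtained by developing the squared parentheses as

$$s^2 = \frac{1}{m-1}\left[\sum_{j=1}^{m} x_j^2 - m\,\bar{x}^2\right] = \frac{m}{m-1}\left[\frac{1}{m}\sum_{j=1}^{m} x_j^2 - \bar{x}^2\right] \qquad (4.1.42)$$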


where the last term is often read as the mean square minus the squared mean. Although this expression is usually the easiest to implement on a computer, it may lead to devastating roundoff errors.

If the value of a sample statistic θ̂ is used to estimate a parameter θ of the population, this statistic is called an estimator and its value for the sample the estimate. Sample mean x̄ and variance s² are the usual estimators of the population mean μ and variance σ². Estimators are most useful and reliable when they are convergent, i.e.,

$$|\hat{\theta}-\theta| \to 0 \quad \text{when } m \to \infty \qquad (4.1.43)$$

and unbiased, i.e.,

$$\mathcal{E}(\hat{\theta}-\theta) = 0 \qquad (4.1.44)$$

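Sample mean x̄ and variance s² are convergent and unbiased estimators (e.g., Hamilton, 1964), which implies that the so-called empirical variance σ̂² given by

$$\hat{\sigma}^2 = \frac{1}{m}\sum_{j=1}^{m}(x_j-\bar{x})^2 = \frac{m-1}{m}\,s^2 \qquad (4.1.45)$$

has a bias s²/m. σ̂² is nevertheless convergent since the bias tends to zero when m → ∞. If the estimator θ̂ is not biased, we can write its variance σ_θ̂² as

$$\sigma_{\hat{\theta}}^2 = \mathcal{E}\{[\hat{\theta}-\mathcal{E}(\hat{\theta})]^2\} = \mathcal{E}[(\hat{\theta}-\theta)^2] \qquad (4.1.46)$$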
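The roundoff warning attached to the 'mean square minus squared mean' form of the sample variance is easy to reproduce. The following sketch is an illustration only: the function names and the four data values are hypothetical, and the large offset of 10⁹ is chosen to make double-precision cancellation visible.

```python
import math


def variance_two_pass(values):
    """Sample variance from deviations about the mean (division by m - 1)."""
    m = len(values)
    mean = math.fsum(values) / m
    return math.fsum((x - mean) ** 2 for x in values) / (m - 1)


def variance_mean_square(values):
    """Sample variance written as 'mean square minus squared mean'."""
    m = len(values)
    mean = sum(values) / m
    sum_of_squares = sum(x * x for x in values)
    return (sum_of_squares - m * mean * mean) / (m - 1)


def empirical_variance(values):
    """Biased estimator that divides by m instead of m - 1."""
    m = len(values)
    return (m - 1) / m * variance_two_pass(values)


small = [4.0, 7.0, 13.0, 16.0]          # hypothetical data
shifted = [x + 1.0e9 for x in small]    # same spread, large offset

for label, data in (("small", small), ("shifted", shifted)):
    print(label, variance_two_pass(data), variance_mean_square(data))
print("empirical variance:", empirical_variance(small))
```

Both data sets have the same spread, so both formulas should return the same variance. With the large offset, the one-pass form subtracts two numbers near 4 × 10¹⁸ and loses the answer entirely, while the two-pass form is unaffected.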

The distribution of a sample statistic is a sampling distribution. A particularly important result concerns the variance of the sample mean given by

$$\sigma_{\bar{x}}^2 = \mathcal{E}[(\bar{x}-\mu)^2] = \frac{\sigma^2}{m} \qquad (4.1.47)$$

A proof of this statement can be found in Hamilton (1964). The square root of σ_x̄² is usually referred to as the standard error of the mean.

4.1.6 Change of variable

Let φ be a monotonic function and denote by φ⁻¹ its inverse function, which is assumed to be single-valued (i.e., a value of the independent variable is associated with one value of the dependent variable). If the random variable X has the density function f_X(x), then Y = φ(X) has the density function f_Y(y) given by

$$f_Y(y) = f_X[\varphi^{-1}(y)]\left|\frac{d\varphi^{-1}(y)}{dy}\right| \qquad (4.1.48)$$

A more rigorous statement can be found in standard textbooks such as Hoel et al. (1971). In order to demonstrate this relationship, let us call F_X and F_Y the distribution


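functions corresponding to f_X and f_Y, respectively. Let us assume first that φ is monotonically increasing. Since φ is increasing,

$$F_Y(y) = P(Y \le y) = P[X \le \varphi^{-1}(y)] = F_X[\varphi^{-1}(y)]$$

Using the chain rule for differentiation, we get

$$f_Y(y) = \frac{dF_Y}{dy} = \frac{dF_X}{d\varphi^{-1}}\,\frac{d\varphi^{-1}}{dy}$$

Since x = φ⁻¹(y), the first term on the right-hand side is simply the function f_X and therefore

$$f_Y(y) = f_X[\varphi^{-1}(y)]\,\frac{d\varphi^{-1}}{dy}$$

If φ is monotonically decreasing,

$$F_Y(y) = P[X \ge \varphi^{-1}(y)] = 1 - F_X[\varphi^{-1}(y)]$$

and

$$f_Y(y) = -f_X[\varphi^{-1}(y)]\,\frac{d\varphi^{-1}}{dy}$$

Considering the sign of dx/dy, we can combine the two expressions in the form

$$f_Y(y) = f_X[\varphi^{-1}(y)]\left|\frac{dx}{dy}\right| \qquad (4.1.49)$$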
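The change-of-variable rule lends itself to a direct numerical check. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the derivative of the inverse function is approximated by a central difference, and the test case Y = −ln X with X uniform over ]0, 1[ is chosen because φ is decreasing there, so the absolute value matters.

```python
import math


def transformed_density(f_x, phi_inverse, y, step=1.0e-6):
    """Evaluate f_Y(y) = f_X(phi_inverse(y)) * |d phi_inverse / dy| numerically."""
    x = phi_inverse(y)
    # Central-difference approximation of dx/dy.
    dx_dy = (phi_inverse(y + step) - phi_inverse(y - step)) / (2.0 * step)
    return f_x(x) * abs(dx_dy)


def uniform_density(x):
    """Density of a variable uniformly distributed over ]0, 1[."""
    return 1.0 if 0.0 < x < 1.0 else 0.0


def density_of_minus_log(y):
    """Density of Y = -ln X for uniform X; the inverse function is x = exp(-y)."""
    return transformed_density(uniform_density, lambda v: math.exp(-v), y)


for y in (0.5, 1.0, 2.0):
    # The second column is the exponential density exp(-y) for comparison.
    print(y, density_of_minus_log(y), math.exp(-y))
```

The two printed columns should agree to within the accuracy of the finite-difference derivative, which is the expected result since the negative logarithm of a uniform deviate is an exponential deviate.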

If X is a normal random variable with zero mean and unit variance, find the distribution f_Y(y) of the variable Y related to X through the function

$$Y = \sigma X + \mu$$

where σ and μ are two constants (σ > 0). The assumption that x is distributed as

$$f_X(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$$


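gives the result as a straightforward application of equation (4.1.49)

$$f_Y(y) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{(y-\mu)^2}{2\sigma^2}\right]$$

We recognize the non-central normal distribution with mean μ and standard deviation σ.

Another way of handling changes of variables is through the moment generating function. If Z is the sum of two independent random variables X and Y, integration of the two variables under the integral can be carried out independently, hence

$$M_Z(t) = M_{X+Y}(t) = \mathcal{E}[e^{t(X+Y)}] = \mathcal{E}(e^{tX}e^{tY}) = \mathcal{E}(e^{tX})\,\mathcal{E}(e^{tY}) = M_X(t)\,M_Y(t)$$

Consequently, the distribution of the sum Z of two normal variables X and Y with respective moment generating functions

$$M_X(t) = \exp\left(\mu_X t + \frac{\sigma_X^2 t^2}{2}\right) \qquad \text{and} \qquad M_Y(t) = \exp\left(\mu_Y t + \frac{\sigma_Y^2 t^2}{2}\right)$$

is readily identified. From the previous result

$$M_Z(t) = \exp\left[(\mu_X+\mu_Y)\,t + \frac{(\sigma_X^2+\sigma_Y^2)\,t^2}{2}\right] \qquad (4.1.51)$$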

The sum Z of two normal variables X and Y is a normal variable. Its mean is the sum of individual means, its variance the sum of individual variances. If X is a normal variable with mean μ and variance
Likewise, the distribution of the sum Z of two gamma variables X and Y with identical second parameter β and with moment generating functions M_X(t) = (1 − βt)^{−α_X} and M_Y(t) = (1 − βt)^{−α_Y} has interesting additive properties. Again

$$M_Z(t) = M_X(t)\,M_Y(t) = (1-\beta t)^{-(\alpha_X+\alpha_Y)}$$

The sum Z = X + Y is therefore a gamma variable with first parameter α_X + α_Y. A straightforward consequence is that the sum of a χ² variable with m degrees of freedom and a χ² variable with n degrees of freedom is a χ² variable with m + n degrees of freedom.

When φ⁻¹ is not a single-valued function, the range of variation of the random variable X can be split into intervals over which the function has this desirable property. We can use this method to find the distribution of Y = X² where X is a normal random variable with zero mean and unit variance

$$f_X(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$$


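X is defined over ]−∞, +∞[, but one value of Y corresponds to two values of X. We say that φ is double-valued. Let us call F_X(x) and F_Y(y) the distribution functions of X and Y, respectively, and f_Y(y) the density function of Y. By definition

$$F_Y(y) = P(Y \le y) = P(-\sqrt{y} \le X \le \sqrt{y})$$

and

$$P(-\sqrt{y} \le X \le \sqrt{y}) = P(X \le \sqrt{y}) - P(X \le -\sqrt{y})$$

and therefore

$$F_Y(y) = F_X(\sqrt{y}) - F_X(-\sqrt{y})$$

If we apply the chain rule to the definition of the density function f_Y(y) = dF_Y/dy, we get

$$f_Y(y) = \frac{dF_Y}{dy} = \left[\frac{dF_X(\sqrt{y})}{d\sqrt{y}}\right]\frac{1}{2\sqrt{y}} + \left[\frac{dF_X(-\sqrt{y})}{d(-\sqrt{y})}\right]\frac{1}{2\sqrt{y}}$$

Each term in the brackets can be replaced by the appropriate expression of f_X

$$f_Y(y) = \frac{1}{2\sqrt{y}}\left[\frac{1}{\sqrt{2\pi}}\,e^{-y/2} + \frac{1}{\sqrt{2\pi}}\,e^{-y/2}\right]$$

and, finally

$$f_Y(y) = \frac{1}{\sqrt{2\pi y}}\,e^{-y/2}$$

As stated in Section 4.1.4, Y = X² is distributed as a chi-squared distribution with one degree of freedom.

Find the distribution of Y = −ln X for X uniformly distributed as f_X(x) = 1 for 0 ≤ x ≤ 1 and f_X(x) = 0 elsewhere. Applying the rule for the change of variable, we get

$$f_Y(y) = f_X(x)\left|\frac{dx}{dy}\right| = \left|\frac{d\,e^{-y}}{dy}\right|$$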


Y is therefore distributed exponentially since

$$f_Y(y) = e^{-y} \qquad (4.1.54)$$

If u is a uniform deviate, −ln u is an exponential deviate.

Find the distribution of Y = e^X where X is a normal variable with mean μ and variance σ². The inverse function is x = ln y, while it is easily found that dy/dx = e^x = y. The density of probability function g(y) of Y is therefore

$$g(y) = \frac{1}{\sigma y\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{\ln y-\mu}{\sigma}\right)^2\right] \qquad (4.1.55)$$

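which is the log-normal distribution.

Show that, for an m-point sample from a normal population X with mean μ and variance σ², the quantity

$$\frac{(m-1)\,s^2}{\sigma^2} \qquad (4.1.56)$$

where x̄ and s² are the usual sample mean and variance, is distributed as a chi-squared distribution with m − 1 degrees of freedom. In order to prove this statement, we write

$$\sum_{j=1}^{m}(x_j-\bar{x})^2 = \sum_{j=1}^{m}\left[(x_j-\mu) - (\bar{x}-\mu)\right]^2$$

or, factoring the terms independent of j

$$\sum_{j=1}^{m}(x_j-\bar{x})^2 = \sum_{j=1}^{m}(x_j-\mu)^2 + m(\bar{x}-\mu)^2 - 2(\bar{x}-\mu)\sum_{j=1}^{m}(x_j-\mu)$$

Since

$$\sum_{j=1}^{m}(x_j-\mu) = m(\bar{x}-\mu)$$

we obtain

$$\sum_{j=1}^{m}(x_j-\bar{x})^2 = \sum_{j=1}^{m}(x_j-\mu)^2 - m(\bar{x}-\mu)^2$$

Dividing this equation by σ², it becomes

$$\sum_{j=1}^{m}\left(\frac{x_j-\bar{x}}{\sigma}\right)^2 = \sum_{j=1}^{m}\left(\frac{x_j-\mu}{\sigma}\right)^2 - \left(\frac{\bar{x}-\mu}{\sigma/\sqrt{m}}\right)^2$$

or

$$\sum_{j=1}^{m}\left(\frac{x_j-\mu}{\sigma}\right)^2 = \frac{(m-1)\,s^2}{\sigma^2} + \left(\frac{\bar{x}-\mu}{\sigma/\sqrt{m}}\right)^2$$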

The left-hand side is the sum of m squared normal deviates with zero mean and unit variance and is therefore distributed as a chi-squared variable with m degrees of freedom. Referring to the sampling distribution of x̄ given above, the second term on the right-hand side is also distributed as a chi-squared variable with one degree of freedom. Because of the additive property of chi-squared variables, the first term on the right-hand side is distributed as a chi-squared variable with m − 1 degrees of freedom.

The δ¹⁸O of rain water changes with the fraction x of water precipitated from atmospheric vapor according to the law

$$\delta^{18}\mathrm{O} + 1000 = (\delta^{18}\mathrm{O}_0 + 1000)(1-x)^{\alpha-1} \qquad (4.1.58)$$

where the subscript 0 refers to the equatorial value and α is the liquid-vapor ¹⁸O/¹⁶O fractionation coefficient (e.g., Faure, 1986). The values of the random variable X representing the fraction of water precipitated from atmospheric vapor at a certain station are assumed to be distributed as a beta distribution with parameters m and n. The mean of the variable X is determined to be 0.20 and its standard deviation σ = 0.10. Find the probability density of rain water δ¹⁸O at this station assuming δ¹⁸O₀ = 0 and α = 1.0111.

The probability density distribution f_X(x) of the random variable X is a beta distribution with parameters m and n. Therefore

$$f_X(x) = \frac{\Gamma(m+n)}{\Gamma(m)\,\Gamma(n)}\,x^{m-1}(1-x)^{n-1} \qquad (4.1.59)$$


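From equation (4.1.32), the parameters m and n of a beta distribution can be computed from the mean and the variance as

$$m = \frac{\mu^2}{\sigma^2}(1-\mu) - \mu = 3 \qquad \text{and} \qquad n = \frac{\mu(1-\mu)^2}{\sigma^2} - (1-\mu) = 12$$

Using a table of factorial functions, we compute

$$\frac{\Gamma(15)}{\Gamma(3)\,\Gamma(12)} = 1092$$

which gives the probability density distribution of the random variable x

$$f_X(x) = 1092\,x^{2}(1-x)^{11}$$

From equation (4.1.58), we express 1 − x as a function of δ¹⁸O

$$1 - x = \left(1 + \frac{\delta^{18}\mathrm{O}}{1000}\right)^{90}$$

We will need the derivative of x with respect to δ¹⁸O, which reads

$$\frac{dx}{d\,\delta^{18}\mathrm{O}} = -\frac{90}{1000}\left(1 + \frac{\delta^{18}\mathrm{O}}{1000}\right)^{89} \qquad (4.1.60)$$

We now express the probability density distribution of the random variable δ¹⁸O as

$$f_{\delta}(\delta^{18}\mathrm{O}) = f_X(x)\left|\frac{dx}{d\,\delta^{18}\mathrm{O}}\right|$$

or, inserting the appropriate expressions for x and its derivative relative to δ¹⁸O

$$f_{\delta}(\delta^{18}\mathrm{O}) = 1092 \times \frac{90}{1000}\left[1 - \left(1 + \frac{\delta^{18}\mathrm{O}}{1000}\right)^{90}\right]^{2}\left(1 + \frac{\delta^{18}\mathrm{O}}{1000}\right)^{1079}$$

The graph of this distribution is shown on Figure 4.6. Using the power series expansion of Section 3.1 to the first order and assuming

$$\left(1 + \frac{\delta^{18}\mathrm{O}}{1000}\right)^{90} \approx 1 + 0.09\,\delta^{18}\mathrm{O}$$

we find that the variable (−0.09 × δ¹⁸O) is approximately distributed as a beta distribution with parameters 3 and 13. The mode (maximum) of the distribution is at δ¹⁸O ≈ −1.7, its mean at δ¹⁸O ≈ −2.1. In contrast, application of equation (4.1.58)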

Figure 4.6 Calculated probability density function of δ¹⁸O values in rainwater (see text for assumptions and parameters).

to the mean value of x would have given

$$\delta^{18}\mathrm{O} = 1000\,(1-0.2)^{0.0111} - 1000 \approx -2.5$$

which is slightly, although significantly, incorrect.

The concentration C_sol^i of an element i in the residual solid after extraction of a liquid fraction F by fractional melting is given by equation (9.3.14)

$$C_{sol}^{i} = C_0^{i}\,(1-F)^{1/D_i - 1} \qquad (4.1.62)$$

where C_0^i is the concentration in the solid source before melting and D_i is the solid-liquid partition coefficient for element i. The symbol F is used for consistency with other chapters and should not be mistaken as representing a cumulative distribution. For batch partial melting, equation (9.2.2) multiplied by D_i gives

$$C_{sol}^{i} = \frac{C_0^{i}\,D_i}{D_i + F(1-D_i)} \qquad (4.1.63)$$

Assuming that F is a beta random variable with parameters m and n, calculate the probability density of C_sol^i/C_0^i for each model. Assume a mean F of 0.04 and a standard deviation of 0.03 and apply the resulting equations to elements with D_i = 0.005 and D_i = 0.05.

As in the previous exercise, we compute the two parameters of the beta distribution of the random variable as m = 5/3 and n = 40. The probability density function f_F(F) is

$$f_F(F) = 525.4\,F^{2/3}(1-F)^{39} \qquad (4.1.64)$$


which is plotted in Figure 4.7. Then, the probability density function f(C_sol^i/C_0^i) is given by

$$f\left(\frac{C_{sol}^{i}}{C_0^{i}}\right) = f_F(F)\left|\frac{dF}{d(C_{sol}^{i}/C_0^{i})}\right| \qquad (4.1.65)$$

Figure 4.7 Assumed probability density function for the degree of melting F (top). Resulting probability density functions for the reduced solid concentration of element i upon fractional melting (middle) and batch melting (bottom) for different solid-liquid partition coefficients D_i (0.005 and 0.05).


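Let us first express F for fractional melting as a function of C_sol^i/C_0^i

$$F = 1 - \left(\frac{C_{sol}^{i}}{C_0^{i}}\right)^{D_i/(1-D_i)}$$

while its derivative is

$$\frac{dF}{d(C_{sol}^{i}/C_0^{i})} = -\frac{D_i}{1-D_i}\left(\frac{C_{sol}^{i}}{C_0^{i}}\right)^{\frac{D_i}{1-D_i}-1} \qquad (4.1.66)$$

For batch partial melting, F depends on C_sol^i/C_0^i as

$$F = \frac{D_i}{1-D_i}\left(\frac{C_0^{i}}{C_{sol}^{i}} - 1\right)$$

while its derivative is

$$\frac{dF}{d(C_{sol}^{i}/C_0^{i})} = -\frac{D_i}{1-D_i}\left(\frac{C_{sol}^{i}}{C_0^{i}}\right)^{-2} \qquad (4.1.67)$$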

Inserting equations (4.1.64), (4.1.66), and (4.1.67) into equation (4.1.65) will provide the probability density function f(C_sol^i/C_0^i). For fractional melting, for instance, the result is

$$f\left(\frac{C_{sol}^{i}}{C_0^{i}}\right) = 525.4\,\frac{D_i}{1-D_i}\left[1-\left(\frac{C_{sol}^{i}}{C_0^{i}}\right)^{\frac{D_i}{1-D_i}}\right]^{2/3}\left(\frac{C_{sol}^{i}}{C_0^{i}}\right)^{\frac{40D_i}{1-D_i}-1}$$

with a corresponding equation for batch melting. Both distributions have been plotted in Figure 4.7 for the assigned values of D_i. The differences in the probability density functions between the two processes are quite striking for very incompatible elements (D_i = 0.05 and D_i = 0.005): batch melting has curves more 'peaked' than has fractional melting.

The fraction F of a gas, e.g., argon, remaining at time t in a spherical mineral of radius R is given by equation (8.6.7) as

$$F = \frac{6}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{n^2}\exp\left(-n^2\pi^2\,\frac{\mathcal{D}t}{R^2}\right)$$

where 𝒟 is the diffusion coefficient of the gas in the mineral. Calculate the fraction φ of gas remaining at time t in a population of spherical minerals with radii distributed as a gamma distribution with parameters α and β.

Let us first illustrate with a simple example how to handle this problem. If, instead


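of a continuous size distribution, the mineral population was made of, say, 3 size groups of minerals, each with volume V₁, V₂, V₃, then each volume fraction would be f₁, f₂, f₃. At any time, the total fraction outgassed φ(t) could be expressed as the weighted mean of the outgassed fraction for each size fraction

$$\varphi(t) = f_1 F(V_1, t) + f_2 F(V_2, t) + f_3 F(V_3, t)$$

where F(t) has been rewritten F(V, t) to emphasize the dependence on radius and hence volume. Now, for a continuous size distribution, f_V(V)dV is the fraction of the population with volume between V and V + dV. The fraction outgassed φ(t) is therefore

$$\varphi(t) = \int_0^{\infty} F(V, t)\,f_V(V)\,dV$$

The probability density function f_R(R) of the radii is

$$f_R(R) = \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\,R^{\alpha-1}\,e^{-R/\beta} \qquad (4.1.70)$$

where the parameters α and β can be related to the mean μ_R and variance σ_R² through

$$\mu_R = \alpha\beta, \qquad \sigma_R^2 = \alpha\beta^2$$

or

$$\alpha = (\mu_R/\sigma_R)^2 \qquad \text{and} \qquad \beta = \mu_R/\alpha \qquad (4.1.71)$$

Let us introduce the dimensionless variable R/μ_R

$$f_R(R) = \frac{\alpha^{\alpha}}{\mu_R^{\alpha}\,\Gamma(\alpha)}\,R^{\alpha-1}\,e^{-\alpha R/\mu_R} = \frac{\alpha^{\alpha}}{\mu_R\,\Gamma(\alpha)}\left(\frac{R}{\mu_R}\right)^{\alpha-1} e^{-\alpha R/\mu_R} \qquad (4.1.72)$$

The radius R relates to the volume of the sphere by

$$R = \left(\frac{3V}{4\pi}\right)^{1/3} \qquad (4.1.73)$$

with the derivative

$$\frac{dV}{dR} = 4\pi R^2 \qquad (4.1.74)$$

The probability density function of volumes, which we keep expressed as a function of the radii, is

$$f_V(V) = f_R(R)\left|\frac{dR}{dV}\right| = \frac{f_R(R)}{4\pi R^2} \qquad (4.1.75)$$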


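or, inserting equation (4.1.73)

$$f_V(V) = \frac{\alpha^{\alpha}}{4\pi\mu_R^{3}\,\Gamma(\alpha)}\left[\frac{1}{\mu_R}\left(\frac{3V}{4\pi}\right)^{1/3}\right]^{\alpha-3}\exp\left[-\frac{\alpha}{\mu_R}\left(\frac{3V}{4\pi}\right)^{1/3}\right]$$

As a function of the dimensionless variable R/μ_R, the fraction F(R, t) remaining in a sphere of radius R at time t reads

$$F(R, t) = \frac{6}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{n^2}\exp\left[-n^2\pi^2\,\frac{\mathcal{D}t}{\mu_R^2}\left(\frac{\mu_R}{R}\right)^{2}\right]$$

Introducing the dimensionless time τ = 𝒟t/μ_R² and the dimensionless radius x = R/μ_R, the fraction of gas φ(τ) remaining in the population can be expressed as

$$\varphi(\tau) = \frac{6\,\alpha^{\alpha}}{\pi^2\,\Gamma(\alpha)}\sum_{n=1}^{\infty}\frac{1}{n^2}\int_0^{\infty} x^{\alpha-1}\exp\left(-\frac{n^2\pi^2\tau}{x^2} - \alpha x\right)dx \qquad (4.1.77)$$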

This expression looks complicated but can be computed with minimum effort by using numerical quadrature software. Allowing a size distribution for the outgassing of argon from minerals (Turner, 1968) makes it easier to understand why more argon is lost at low temperature from natural crystals than predicted by the uniform size distribution, regardless of mineral geometry.

4.1.7 Confidence intervals

If a random variable X is defined over a continuous domain Ω in ℜ, the unknown mean μ of a sample lies in a known two-sided confidence interval ω = [x_a, x_b] at 100(1 − α) percent, or, equivalently, is known at the α significance level, if

$$P(\mu \in \omega) = 1-\alpha$$

or

$$P(x_a \le \mu \le x_b) = 1-\alpha \qquad (4.1.78)$$

The two-sided confidence interval is limited by the 100α/2 and 100(1 − α/2) percentiles. A commonly used confidence interval is 95 percent (α = 0.05), although for the search of outliers (rogue values produced during data acquisition), larger confidence intervals are occasionally preferred.

An application of the confidence interval concept central to most statistical assessment is the t-test for small normal samples. Let us consider a normally distributed variable X with mean μ and variance σ². It will be demonstrated below that for m observations with sample mean x̄ and variance s², the variable U defined as

$$U = \frac{\bar{x}-\mu}{s/\sqrt{m}}$$


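is distributed as a t-distribution with m − 1 degrees of freedom. At the 100(1 − α) confidence level, we write

$$P\left\{t_{100\alpha/2,m-1} \le \frac{\bar{x}-\mu}{s/\sqrt{m}} \le t_{100(1-\alpha/2),m-1}\right\} = 1-\alpha \qquad (4.1.79)$$

Upon multiplication by s/√m and subtraction of x̄, we get

$$P\left\{-\bar{x} + t_{100\alpha/2,m-1}\frac{s}{\sqrt{m}} \le -\mu \le -\bar{x} + t_{100(1-\alpha/2),m-1}\frac{s}{\sqrt{m}}\right\} = 1-\alpha$$

Multiplying the inequalities in parentheses by −1 changes the 'smaller than' into 'greater than'

$$P\left\{\bar{x} - t_{100\alpha/2,m-1}\frac{s}{\sqrt{m}} \ge \mu \ge \bar{x} - t_{100(1-\alpha/2),m-1}\frac{s}{\sqrt{m}}\right\} = 1-\alpha$$

Since the t-distribution is even,

$$t_{100\alpha/2,m-1} = -t_{100(1-\alpha/2),m-1}$$

and, finally

$$P\left\{\bar{x} - t_{100(1-\alpha/2),m-1}\frac{s}{\sqrt{m}} \le \mu \le \bar{x} + t_{100(1-\alpha/2),m-1}\frac{s}{\sqrt{m}}\right\} = 1-\alpha \qquad (4.1.80)$$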

A widely used α = 5 percent significance level produces a 95 percent confidence interval extending over ±t_{97.5,m−1} s/√m about the mean x̄. For m = 6, 11, 31, and ∞, the t_{97.5,m−1} values are 2.57, 2.23, 2.04, and 1.96, respectively (e.g., Spiegel, 1975). The last figure applies to the academic case of an infinite number of measurements and corresponds to the 95 percent confidence interval for a standard normal distribution. Therefore, the normal approximation of the t-distribution is correct to ≈ 14 percent for m > 10 and to 4 percent for m > 30.

Alternatively, one may associate significance levels with specific intervals around the mean. For large m, intervals of 1 × s/√m, 2 × s/√m, and 3 × s/√m on each side of x̄ correspond to the 31.7, 4.5, and 0.3 percent significance levels, respectively. They limit 68.3, 95.4, and 99.7 percent of the surface enclosed under the density of probability curve. Jargon often refers to these intervals as the 1σ, 2σ, and 3σ intervals of the mean.

The same principle applies to the confidence interval of variances. We find that, keeping the same sample from the same normal distribution, the variable v such as

$$v = (m-1)\,\frac{s^2}{\sigma^2}$$

is distributed as a chi-squared variable with m − 1 degrees of freedom. We therefore write the definition of a two-sided 100(1 − α) percent confidence interval as

$$P\left\{\chi^2_{100\alpha/2,m-1} \le (m-1)\,\frac{s^2}{\sigma^2} \le \chi^2_{100(1-\alpha/2),m-1}\right\} = 1-\alpha \qquad (4.1.81)$$


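Dividing by (m − 1)s² and taking the reciprocal, we get

$$P\left\{\frac{(m-1)\,s^2}{\chi^2_{100\alpha/2,m-1}} \ge \sigma^2 \ge \frac{(m-1)\,s^2}{\chi^2_{100(1-\alpha/2),m-1}}\right\} = 1-\alpha$$

where the inequality signs have been reversed, or, equivalently

$$P\left\{s\sqrt{\frac{m-1}{\chi^2_{100(1-\alpha/2),m-1}}} \le \sigma \le s\sqrt{\frac{m-1}{\chi^2_{100\alpha/2,m-1}}}\right\} = 1-\alpha \qquad (4.1.82)$$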
The ²⁰⁶Pb/²⁰⁴Pb ratios of four samples from a Polynesian island have been determined to be 18.999, 19.091, 19.216, and 19.222. Assuming that these measurements represent a sample from a normal population, find a 95 percent confidence interval for the mean and the standard deviation of the population.

Here m = 4 and the number of degrees of freedom m − 1 = 3. For a two-sided exclusion domain, a 95 percent confidence interval corresponds to α/2 = 2.5 percent on each side. From standard statistical tables (e.g., Spiegel, 1975), we obtain t_{97.5,3} = 3.18. Let us calculate the sample mean

$$\bar{x} = \frac{18.999 + 19.091 + 19.216 + 19.222}{4} = 19.132$$

and variance

$$s^2 = \frac{(18.999-19.132)^2 + (19.091-19.132)^2 + (19.216-19.132)^2 + (19.222-19.132)^2}{3} = 0.0115$$

or s = 0.107. A 95 percent confidence interval for the population mean μ is given by

$$19.132 - 3.18 \times \frac{0.107}{\sqrt{4}} \le \mu \le 19.132 + 3.18 \times \frac{0.107}{\sqrt{4}}$$

or 18.961 ≤ μ ≤ 19.303. Likewise, we found the values χ²_{2.5,3} = 0.216 and χ²_{97.5,3} = 9.35 from the tables, which provides a 95 percent confidence interval for σ as

$$0.107\sqrt{\frac{3}{9.35}} \le \sigma \le 0.107\sqrt{\frac{3}{0.216}}$$

or 0.061 ≤ σ ≤ 0.400.
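The lead isotope example can be reproduced with a few lines of Python. This is a minimal sketch, not a general-purpose routine: the function names are hypothetical, and the percentiles (3.18 for the t-distribution, 0.216 and 9.35 for the chi-squared distribution, all with 3 degrees of freedom) are simply the table values quoted in the text rather than values computed by the code.

```python
import math


def sample_mean_and_std(values):
    """Return the sample mean and the sample standard deviation (m - 1 divisor)."""
    m = len(values)
    mean = math.fsum(values) / m
    s = math.sqrt(math.fsum((x - mean) ** 2 for x in values) / (m - 1))
    return mean, s


def mean_interval(values, t_upper):
    """Two-sided interval for the population mean, given the upper t percentile."""
    mean, s = sample_mean_and_std(values)
    half_width = t_upper * s / math.sqrt(len(values))
    return mean - half_width, mean + half_width


def sigma_interval(values, chi2_lower, chi2_upper):
    """Two-sided interval for the population standard deviation."""
    _, s = sample_mean_and_std(values)
    dof = len(values) - 1
    return s * math.sqrt(dof / chi2_upper), s * math.sqrt(dof / chi2_lower)


ratios = [18.999, 19.091, 19.216, 19.222]   # 206Pb/204Pb values from the text
print(mean_interval(ratios, t_upper=3.18))
print(sigma_interval(ratios, chi2_lower=0.216, chi2_upper=9.35))
```

Note that the larger chi-squared percentile gives the lower bound on σ, which is the reversal of inequalities obtained when taking the reciprocal.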

4.1.8 Random deviates

Models are often best understood relative to the situation they are designed to describe if their constitutive variables are allowed to fluctuate statistically in a realistic way. Once a variable has been assigned a suitable density of probability distribution and the parameters of this distribution have been chosen, the fluctuations can be conveniently produced by using random deviates from statistical tables. A random deviate is a particular value of a standard random variable. Many elementary books in statistics contain tables of deviates from uniform, normal, exponential, ... distributions. Many high-level computation-oriented programming languages (e.g., MatLab) and spreadsheets, such as Microsoft Excel, also contain random number generators. The book by Press et al. (1986) contains software that produces random deviates for the most commonly used probability distributions.

Make a table of 20 'crustal' values of ε_Nd(0), which is assumed to be a normal variable with mean μ = −12 and standard deviation σ = 3.

The assumption is that the random variable U such as

$$U = \frac{\varepsilon_{Nd}(0) - \mu}{\sigma} = \frac{\varepsilon_{Nd}(0) + 12}{3}$$

is a normal variable with mean 0 and standard deviation 1. We first produce 20 random normal deviates u_i for i = 1, ..., 20 (here the MatLab 'rand' function has been used with the option 'normal') and then compute the values [ε_Nd(0)]^(i) using

$$[\varepsilon_{Nd}(0)]^{(i)} = -12 + 3\,u_i$$

Computation from Table 4.1 would produce the satisfactory values x̄ = −11.6 and s = 3.1.

Build a table of 20 values of 'basaltic' Ce concentrations C_Ce assumed to have a log-normal distribution with mean μ = ln 20 ppm and a standard deviation σ = ln 4.

The assumption of a log-normal distribution is that the variable U such as

$$U = \frac{\ln C_{Ce} - \ln 20}{\ln 4}$$

is normally distributed with a mean 0 and a standard deviation 1. In a sense, the log-normal distribution is the normal distribution of relative errors: for u = +1, the concentration equals exp(μ) multiplied by 4, while for u = −1, it is divided by 4. Once the normal deviates u_i are produced independently for the two elements, the concentrations are calculated as

$$\ln (C_{Ce})^{(i)} = \ln 20 + u_i \ln 4$$


Table 4.1. Twenty random crustal values of ε_Nd(0) from a normal population with mean μ = −12 and standard deviation σ = 3 produced by the normal random deviates u_i.

 i     u_i       [ε_Nd(0)]^(i)      i     u_i       [ε_Nd(0)]^(i)
 1    -0.2345    -12.70            11     1.8706     -6.39
 2     1.4525     -7.64            12    -0.4809    -13.44
 3     0.7631     -9.71            13    -0.6921    -14.08
 4     0.1402    -11.58            14    -0.2945    -12.88
 5     1.0192     -8.94            15     1.4001     -7.80
 6    -0.5806    -13.74            16    -1.9289    -17.79
 7     0.9448     -9.17            17    -0.8867    -14.67
 8    -1.7994    -17.40            18     0.0824    -11.76
 9     0.5777    -10.27            19     0.2545    -11.24
10     0.7101     -9.87            20     0.4311    -10.71

Table 4.2. Twenty random basaltic Ce concentrations (in ppm) from a log-normal population with mean μ = ln 20 ppm and standard deviation σ = ln 4 ppm produced by the normal random deviates u_i.

 i     u_i     (C_Ce)^(i)      i     u_i     (C_Ce)^(i)
 1    -1.47      2.59         11    -2.37      0.75
 2     0.27     29.00         12    -0.87      5.99
 3     1.77    231.28         13    -0.70      7.58
 4    -1.77      1.72         14     0.86     65.84
 5     1.63    190.94         15    -1.30      3.32
 6     1.20    104.89         16     1.97    305.30
 7    -1.70      1.89         17    -2.30      0.82
 8    -0.59      8.88         18    -0.99      5.04
 9     2.21    425.51         19     0.11     23.19
10    -0.83      6.35         20     1.38    134.57

or (C_Ce)^(i) = 20 × 4^{u_i} in ppm. Using a generator of normal deviates, we get Table 4.2.
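Both tables can be regenerated with any source of standard normal deviates. The sketch below uses Python's standard `random.gauss` in place of the MatLab generator mentioned in the text, so the deviates (and therefore the tabulated values) will differ from those of Tables 4.1 and 4.2; the function names and the seed are assumptions made for the illustration.

```python
import random


def normal_deviates(count, seed=0):
    """Return `count` deviates from a normal population with mean 0 and variance 1."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(count)]


def epsilon_nd_values(deviates, mean=-12.0, sigma=3.0):
    """Scale standard deviates into normal values: mean + sigma * u."""
    return [mean + sigma * u for u in deviates]


def cerium_concentrations(deviates, median_ppm=20.0, factor=4.0):
    """Scale standard deviates into log-normal values: median * factor**u."""
    return [median_ppm * factor ** u for u in deviates]


u_nd = normal_deviates(20, seed=1)
u_ce = normal_deviates(20, seed=2)   # independent deviates for the second element
for i, (eps, ce) in enumerate(zip(epsilon_nd_values(u_nd),
                                  cerium_concentrations(u_ce)), start=1):
    print(f"{i:2d}  {eps:8.2f}  {ce:9.2f}")
```

Feeding the first deviate of Table 4.1 (−0.2345) to `epsilon_nd_values` gives −12.70, as tabulated. The log-normal values are always positive, which is one practical reason for preferring this distribution for concentrations.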

4.2 Several random variables

• For an n column-vector X of n random variables X = (X¹, X², ..., Xⁿ)ᵀ, and a continuous domain Ω of definition in ℜⁿ, we can define a joint multivariate distribution function


F_X(x¹, x², ..., xⁿ) consistent with the definition of single variable functions

$$F_X(x^1, x^2, \ldots, x^n) = P(X^1 \le x^1, X^2 \le x^2, \ldots, X^n \le x^n) \qquad (4.2.1)$$

The joint multivariate density function f_X(x¹, x², ..., xⁿ) is

$$f_X(x^1, x^2, \ldots, x^n) = \frac{\partial^n F_X}{\partial x^1\,\partial x^2 \cdots \partial x^n} \qquad (4.2.2)$$

If Ω coincides with ℜⁿ, F_X and f_X are related through

$$F_X(x^1, x^2, \ldots, x^n) = \int_{-\infty}^{x^1}\int_{-\infty}^{x^2}\cdots\int_{-\infty}^{x^n} f_X(u^1, u^2, \ldots, u^n)\,du^1\,du^2 \cdots du^n$$

An important concept is the marginal density function which will be better explained with the joint bivariate distribution of the two random variables X and Y and its density f_XY(x, y). The marginal density function f_X^M(x) is the density function for X calculated upon integration of Y over its whole range of variation. If X and Y are defined over ℜ², we get

$$f_X^M(x) = \int_{-\infty}^{+\infty} f_{XY}(x, y)\,dy$$

f_X^M(x)dx measures the weight of the probability 'slice' taken along y at X = x (Figure 4.8). f_X^M(x) is the density function of X regardless of variations of Y. This concept is easily extended to higher dimensions.

Figure 4.8 A bivariate probability density function. The slice parallel to the y axis represents the marginal density f_X^M(x).

Two random variables X and Y are independent if their joint density function f_XY can be factored as a product of two density functions, each involving one variable, e.g.,

$$f_{XY}(x, y) = f_X(x)\,f_Y(y) \qquad (4.2.3)$$


• Given two random variables X and Y, the covariance of X and Y is

$$\mathrm{cov}(X, Y) = \sigma_{XY} = \int\!\!\int [x-\mathcal{E}(X)][y-\mathcal{E}(Y)]\,f_{XY}(x, y)\,dx\,dy \qquad (4.2.4)$$

or

$$\mathrm{cov}(X, Y) = \mathcal{E}\{[X-\mathcal{E}(X)][Y-\mathcal{E}(Y)]\} \qquad (4.2.5)$$

which can be developed as

$$\mathrm{cov}(X, Y) = \mathcal{E}(XY) - \mathcal{E}(X)\,\mathcal{E}(Y) \qquad (4.2.6)$$

From this definition, cov(X, Y) is identical to cov(Y, X). For two independent variables, the definition of expectation shows that

$$\mathcal{E}(XY) = \mathcal{E}(X)\,\mathcal{E}(Y) \qquad (4.2.7)$$

so their covariance is zero. The reciprocal statement is not necessarily true. The correlation coefficient ρ_XY of two random variables X and Y is

$$\rho_{XY} = \frac{\mathrm{cov}(X, Y)}{\sqrt{\mathrm{var}(X)\,\mathrm{var}(Y)}} \qquad (4.2.8)$$

and may be viewed as the covariance standardized between −1 and +1. The correlation coefficient measures the linear dependence between the two variables X and Y. Let us assume that they are perfectly correlated, i.e., Y = aX + b with a and b constant. The linearity of the expectation operator amounts to

$$\mathcal{E}(Y) = a\,\mathcal{E}(X) + b$$

and therefore

$$Y - \mathcal{E}(Y) = a[X - \mathcal{E}(X)]$$

which, using equations (4.1.7) and (4.2.5), results in the relations between expected values var(Y) = a² var(X) and cov(X, Y) = a var(X) and therefore

$$\rho_{XY} = \frac{a\,\mathrm{var}(X)}{\sqrt{\mathrm{var}(X)}\sqrt{a^2\,\mathrm{var}(X)}} = \frac{a}{|a|}$$

ρ_XY is therefore equal to +1 if a > 0 and −1 if a < 0.
• Referring to the n random variables X¹, X², ..., Xⁿ collectively as the vector X, the same


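definitions apply and we can form the covariance (or dispersion) matrix Σ_X as

$$\Sigma_X = \begin{pmatrix} \mathrm{var}(X^1) & \mathrm{cov}(X^1, X^2) & \cdots & \mathrm{cov}(X^1, X^n) \\ \mathrm{cov}(X^2, X^1) & \mathrm{var}(X^2) & \cdots & \mathrm{cov}(X^2, X^n) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{cov}(X^n, X^1) & \mathrm{cov}(X^n, X^2) & \cdots & \mathrm{var}(X^n) \end{pmatrix}$$

No bold face will be used for the subscript X of Σ_X. The current element (i1, i2) of Σ_X can be rewritten as ℰ(X^{i1}X^{i2}) − ℰ(X^{i1})ℰ(X^{i2}). The condensed form of the covariance matrix is obtained by using the outer product defined in Section 2.1

$$\Sigma_X = \mathcal{E}\{[X-\mathcal{E}(X)][X-\mathcal{E}(X)]^T\} \qquad (4.2.9)$$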

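The current element (i1, i2) of the correlation matrix ρ is calculated by the relation (4.2.8). Let the standard-deviation matrix σ be

$$\sigma = \begin{pmatrix} \sqrt{\mathrm{var}(X^1)} & 0 & \cdots & 0 \\ 0 & \sqrt{\mathrm{var}(X^2)} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sqrt{\mathrm{var}(X^n)} \end{pmatrix} \qquad (4.2.10)$$

It can be verified that

$$\Sigma_X = \sigma\rho\sigma \qquad (4.2.11)$$

The concept of covariance matrix can be extended to two distinct vectors of different dimensions, X ∈ ℜ^m and Y ∈ ℜ^n,

$$\mathrm{Cov}(X, Y) = \mathcal{E}\{[X-\mathcal{E}(X)][Y-\mathcal{E}(Y)]^T\}$$

or

$$\mathrm{Cov}(X, Y) = \mathcal{E}(XY^T) - \mathcal{E}(X)\,\mathcal{E}(Y)^T \qquad (4.2.12)$$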

the m × n matrix Cov(X, Y) being no longer symmetric or even square. The m × n correlation matrix Corr(X, Y) is also the matrix of correlation coefficients and is likewise no longer symmetric or even square. If σ_X is the m × m matrix of standard deviations on X and σ_Y is the n × n matrix of standard deviations on Y, we get the relationship equivalent to equation (4.2.11), i.e.,

$$\mathrm{Cov}(X, Y) = \sigma_X\,\mathrm{Corr}(X, Y)\,\sigma_Y \qquad (4.2.13)$$

The converse relationship

$$\mathrm{Corr}(X, Y) = \sigma_X^{-1}\,\mathrm{Cov}(X, Y)\,\sigma_Y^{-1} \qquad (4.2.14)$$

will prove useful for principal component analysis (Section 4.4).

4.2.1 Estimators

For a vector X of n random variables with mean vector μ and n × n symmetric covariance matrix Σ_X, an m-point sample is a matrix X with n rows and m columns.

204

Probability and statistics

We should be aware that the variables now appear row-wise, so that the current element x/ of X is the jth measurement of the ith variable. • The sample mean vector x is a column vector with n elements such as jf = ^ m

(4.2.15)

where Jm is an m column-vector (1, 1, . ..,1)T. • Variances and covariances can be lumped together into the n x n symmetric sample covariance or dispersion matrix S (or £) with current element siU2 such that

Siui2 = si2,n = —

;

(4.2.16)

m-1 where the sum is over all the observations/ We recognize the sample variances for i\ = il and covariances for i\ ^i2. The 'deviation'matrix of X from the mean sample vector is the mxn matrix X—xJT while the sample covariance matrix reads

Jx-*rv-*j
(4217)

m-1 which is the most useful form of this matrix with the least roundoff error. • The symmetric sample correlation matrix R (or p) is similarly defined by its current element m / il —X v E {Xj 7=1

fl

v

S

il,i2

S

il

S

i2

i2 i2 Vv ){Xj —Xv \)

(x/1-?1)2

£(xji2-xi2)2

Calling the diagonal matrix of the sample standard deviations s, R and 5 relate through S=sRs

(4.2.18)

• Likewise, the sample covariance matrix between two vectors JC and y would be y) = x~p-xy1

(4.2.19)

Defining $s_x$ and $s_y$ as the diagonal matrices of sample standard deviations on $x$ and $y$, respectively, the sample correlation matrix would be

$$\operatorname{Corr}(x, y) = s_x^{-1} \operatorname{Cov}(x, y)\, s_y^{-1} \tag{4.2.20}$$

We are dealing with unbiased estimators, so we can write

S(R) = e

which is the basis for statistical assessment in any modeling situation.

(4.2.21)


Table 4.3. Lead isotope ratios of four Polynesian lavas.

206Pb/204Pb   207Pb/204Pb
18.999        15.569
19.091        15.616
19.216        15.621
19.222        15.619
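As a numerical companion to Table 4.3, the following sketch evaluates the estimators of equations (4.2.15) to (4.2.18) on the four samples. It assumes NumPy is available; the variable names are illustrative and not part of the original text.

```python
import numpy as np

# Table 4.3: one row per variable, one column per sample (n = 2, m = 4)
X = np.array([[18.999, 19.091, 19.216, 19.222],   # 206Pb/204Pb
              [15.569, 15.616, 15.621, 15.619]])  # 207Pb/204Pb
n, m = X.shape
J = np.ones(m)                     # the vector J_m of ones

x_bar = X @ J / m                  # sample mean vector, eq. (4.2.15)
D = X - np.outer(x_bar, J)         # deviation matrix X - x_bar J^T
S = D @ D.T / (m - 1)              # sample covariance matrix, eq. (4.2.17)
s = np.sqrt(np.diag(S))            # sample standard deviations
R = S / np.outer(s, s)             # correlation matrix, from S = s R s (4.2.18)

print("mean:", x_bar)
print("standard deviations:", s)
print("covariance matrix:\n", S)
print("correlation coefficient:", R[0, 1])
```

Working with the deviation matrix, rather than with sums of raw products, follows the remark attached to equation (4.2.17) about roundoff error.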

☞ Four samples from a Polynesian island gave the lead isotope compositions given in Table 4.3. Calculate the mean and standard deviation vectors, the covariance matrix and the correlation coefficient between the two isotope ratios.

The sample mean vector is $\bar{x} = [19.132, 15.606]^T$, the standard-deviation vector $s = [0.107, 0.025]^T$, and the covariance matrix $S$ given by

$$S = \begin{bmatrix} 0.011509 & 0.002314 \\ 0.002314 & 0.000621 \end{bmatrix}$$

which, as expected for lead isotopic compositions, indicates a rather strong correlation between the two ratios ($r = 0.87$). ⇔

4.2.2 Useful multivariate distributions

• A normal (gaussian) probability density function in one centered and standardized variable $X$ reads

$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$$

$n$ independent normal centered and standardized variables lumped into an $n$-vector $X$ will be distributed as a multivariate normal distribution with the joint density probability function

$$f(x) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}}\, e^{-x_i^2/2} = \frac{1}{(2\pi)^{n/2}} \exp\left(-\frac{x^T x}{2}\right) \tag{4.2.22}$$

Since the standard deviations are unity and the variables are independent (zero covariance), the covariance matrix of $X$ is the identity matrix $I_n$ and the contours of constant probability in the space $\Re^n$ are given by

$$x^T x = \text{constant}$$

The surfaces of constant probability density are hyper-spheres. In the more general case, the vector $X$ with mean $\mu$ and $n \times n$ covariance matrix $\Sigma_X$ has the 'non-central' joint density of probability (see below)

$$f(x) = \frac{1}{(2\pi)^{n/2} \sqrt{\det \Sigma_X}} \exp\left[-\frac{1}{2} (x - \mu)^T \Sigma_X^{-1} (x - \mu)\right] \tag{4.2.23}$$

Contours of constant probability density in the space $\Re^n$ are such as

$$(x - \mu)^T \Sigma_X^{-1} (x - \mu) = \text{const} \tag{4.2.24}$$

and describe concentric hyper-ellipsoids centered in $\mu$. If $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of $\Sigma_X$, all non-negative, the axes of these hyper-ellipsoids along their eigenvectors have half-lengths proportional to $\sqrt{\lambda_1}, \sqrt{\lambda_2}, \ldots, \sqrt{\lambda_n}$ (see Section 2.4). This is the base of the widely used concept of 'error ellipse'.

Parallel to the case of a single random variable, the mean vector and covariance matrix of random variables involved in a measurement are usually unknown, suggesting the use of their sampling distributions instead. Let us assume that $X$ is a vector of $n$ normally distributed variables with mean $n$-column vector $\mu$ and covariance matrix $\Sigma_X$. A sample of $m$ observations has a mean vector $\bar{x}$ and an $n \times n$ covariance matrix $S$. The properties of the $t$-distribution are extended to $n$ variables by stating that the scalar $m(\bar{x} - \mu)^T S^{-1} (\bar{x} - \mu)$ is distributed as the Hotelling's $T^2$ distribution. The matrix $S/m$ is simply the covariance matrix of the estimate $\bar{x}$. There is no need to tabulate the $T^2$ distribution since the statistic

$$\frac{m - n}{n(m - 1)}\, T^2 = \frac{m - n}{n(m - 1)}\, (\bar{x} - \mu)^T \left(\frac{S}{m}\right)^{-1} (\bar{x} - \mu) \tag{4.2.25}$$

may be shown (e.g., Seber, 1984) to be distributed as a distribution $F$ with $n$ and $m - n$ degrees of freedom. When $m \gg n$, i.e., when the number of measurements largely exceeds the number of variables, the left-hand side of equation (4.2.25) tends towards $T^2/n$. From equation (4.1.38), $T^2$, which we can rewrite $n\,T^2/n$, is therefore distributed as

$$n F_{P, n, \infty} = \chi^2_{P, n} \tag{4.2.26}$$

Since T2 is a chi-squared variable with n degrees of freedom, its mean value should tend to n when the number m of measurements is very large.

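4.2.3 Change of variables

Let $Y = \varphi(X)$ be a one-to-one transformation of the random $n$-vector $X$, with joint density $f_X$, into the random $n$-vector $Y$. The joint density of $Y$ is

$$f_Y(y) = \frac{f_X[\varphi^{-1}(y)]}{|J(y/x)|} \tag{4.2.27}$$

where $J(y/x)$ is the value for $X = x$ and $Y = y$ of the Jacobian of the transformation, i.e.,

$$J(Y/X) = \det \begin{bmatrix} \partial Y_1/\partial X_1 & \cdots & \partial Y_1/\partial X_n \\ \vdots & & \vdots \\ \partial Y_n/\partial X_1 & \cdots & \partial Y_n/\partial X_n \end{bmatrix} \tag{4.2.28}$$

☞ $X$ is an $n$-vector of $n$ independent normal standardized variables, i.e., with zero mean, unit variances and null covariances. Find (i) the density function and (ii) the covariance matrix of the vector $Y$ given by

$$Y = AX + b \tag{4.2.29}$$

where $A \in \Re^{n \times n}$ is a non-singular matrix and $b \in \Re^n$.

(i) As we have seen above, the joint multivariate density function of the vector $X$ is simply the product of the $n$ normal density functions

$$f_X(x) = \frac{1}{(2\pi)^{n/2}} \exp\left(-\frac{x^T x}{2}\right)$$

In the present case, $J$ is simply the determinant $\det A$ of the matrix $A$ and $\varphi^{-1}$ is calculated as

$$x = A^{-1}(y - b)$$

Using equation (4.2.27) that gives the density function of a dependent variable, we obtain

$$f_Y(y) = \frac{1}{(2\pi)^{n/2}\, |\det A|} \exp\left\{-\frac{1}{2} [A^{-1}(y - b)]^T A^{-1} (y - b)\right\}$$

Application of basic rules of matrix manipulation gives

$$[A^{-1}(y - b)]^T = (y - b)^T (A^T)^{-1}$$

and

$$(A^T)^{-1} A^{-1} = (A A^T)^{-1}$$

so that

$$f_Y(y) = \frac{1}{(2\pi)^{n/2}\, |\det A|} \exp\left[-\frac{1}{2} (y - b)^T (A A^T)^{-1} (y - b)\right] \tag{4.2.30}$$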

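(ii) The covariance matrix $\Sigma_X$ of $X$ is defined as

$$\Sigma_X = \mathcal{E}\{[X - \mathcal{E}(X)][X - \mathcal{E}(X)]^T\}$$

Applying the linearity property of the expectation to the change of variable given by equation (4.2.29), we get

$$\mathcal{E}(Y) = A\,\mathcal{E}(X) + b$$

and, upon subtraction from equation (4.2.29)

$$Y - \mathcal{E}(Y) = A\,[X - \mathcal{E}(X)]$$

The transpose of this equality is

$$[Y - \mathcal{E}(Y)]^T = [X - \mathcal{E}(X)]^T A^T$$

Multiplying the last two equations and using the definition of the covariance matrix, we obtain

$$\mathcal{E}\{[Y - \mathcal{E}(Y)][Y - \mathcal{E}(Y)]^T\} = A\,\mathcal{E}\{[X - \mathcal{E}(X)][X - \mathcal{E}(X)]^T\}\,A^T$$

and therefore

$$\Sigma_Y = A \Sigma_X A^T \tag{4.2.31}$$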
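Equation (4.2.31) can be checked by simulation. The sketch below is only an illustration: the matrix `A`, the offset `b`, the sample size and the random seed are arbitrary choices, not values taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed for reproducibility

# Hypothetical non-singular matrix A and offset b (illustrative values only)
A = np.array([[2.0, 0.5],
              [-1.0, 1.5]])
b = np.array([1.0, -3.0])
Sigma_X = np.eye(2)                      # standardized independent variables

# Draw standardized normal vectors X (one column per draw) and form Y = A X + b
X = rng.standard_normal((2, 200_000))
Y = A @ X + b[:, None]

Sigma_Y_theory = A @ Sigma_X @ A.T       # eq. (4.2.31)
Sigma_Y_sample = np.cov(Y)               # rows are variables, divisor m - 1

print("A Sigma_X A^T:\n", Sigma_Y_theory)
print("sample covariance of Y:\n", Sigma_Y_sample)
```

The two matrices should agree to within sampling noise, which shrinks as the number of draws grows.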

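In the present case, $\Sigma_X = I_n$ and therefore

$$\Sigma_Y = A A^T$$

which gives the non-central joint normal density equation (4.2.23) for the vector $Y$

$$f_Y(y) = \frac{1}{(2\pi)^{n/2} \sqrt{\det \Sigma_Y}} \exp\left[-\frac{1}{2} (y - b)^T \Sigma_Y^{-1} (y - b)\right]$$

⇔

☞ A random $n$-vector $X$ has a mean vector $\mu$ and an $n \times n$ covariance matrix $\Sigma_X$. Find the covariance matrix of the normalized vector

$$\xi = \sigma^{-1}(X - \mu) \tag{4.2.32}$$

where $\sigma$ is the standard-deviation matrix of equation (4.2.10).

From equation (4.2.31), we can write

$$\Sigma_\xi = \sigma^{-1} \Sigma_X \sigma^{-1}$$

but since, from equation (4.2.11),

$$\Sigma_X = \sigma \rho \sigma$$

we obtain the useful result

$$\Sigma_\xi = \rho \tag{4.2.33}$$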

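This result does not depend on the vector $\mu$ and can be extended to any origin of the vector $\xi$. The correlation matrix is therefore the covariance matrix of any standard-deviation normalized vector. ⇔

☞ $X$ is a normal random variable with mean $\mu$ and variance $\sigma^2$. Given the set of samples of $m$ observations with mean $\bar{X}$ and variance $S^2$, which will be treated as independent random variables, show that the ratio

$$T = \frac{\bar{X} - \mu}{S/\sqrt{m}} \tag{4.2.34}$$

is distributed as the Student $t$-distribution with $m - 1$ degrees of freedom.

We know from Section 4.1 that $\bar{X}$ is distributed normally with mean $\mu$ and variance $\sigma^2/m$ and therefore $(\bar{X} - \mu)/(\sigma/\sqrt{m})$ is a normal deviate with zero mean and unit variance. In addition, $(m - 1)S^2/\sigma^2$ is distributed as a chi-squared variable with $m - 1$ degrees of freedom. In order to find independent variables from simple distributions, let us transform the value $t$ of $T$ as

$$t = \frac{\bar{x} - \mu}{s/\sqrt{m}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{m}}\,\frac{\sigma}{s} \quad\text{or}\quad t = \frac{(\bar{x} - \mu)/(\sigma/\sqrt{m})}{\sqrt{\dfrac{(m - 1)s^2/\sigma^2}{m - 1}}}$$

Defining

$$U = \frac{\bar{X} - \mu}{\sigma/\sqrt{m}}, \qquad W = \frac{(m - 1)S^2}{\sigma^2}, \qquad\text{and}\qquad v = m - 1 \tag{4.2.35}$$

we infer that $U$ is a normal deviate with zero mean and unit variance, $W$ is a chi-squared variable with $v$ degrees of freedom and $T$ can be recast as

$$T = \frac{U}{\sqrt{W/v}}$$

Since $\bar{X}$ and $S^2$ are independent, so are $U$ and $W$. $U$ is distributed with the density function $f_U(u)$ such that

$$f_U(u) = \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}$$

and $W$ as

$$f_W(w) = \frac{1}{\Gamma(v/2)\, 2^{v/2}}\, w^{v/2 - 1} e^{-w/2}$$

Let us make the change of variables

$$X = \frac{U}{\sqrt{W/v}}, \qquad\text{and}\qquad Y = W$$

The Jacobian $J$ of the transformation is

$$J = \det \begin{bmatrix} \dfrac{\partial X}{\partial U} & \dfrac{\partial X}{\partial W} \\[2mm] \dfrac{\partial Y}{\partial U} & \dfrac{\partial Y}{\partial W} \end{bmatrix} = \det \begin{bmatrix} \dfrac{1}{\sqrt{W/v}} & -\dfrac{U}{2W\sqrt{W/v}} \\[2mm] 0 & 1 \end{bmatrix} = \sqrt{\frac{v}{W}} = \sqrt{\frac{v}{Y}}$$

Since independence of $U$ and $W$ is assumed, the joint distribution function $f_{XY}(x, y)$ is

$$f_{XY}(x, y) = \frac{f_U(u)\, f_W(w)}{|J|}$$

or, expressing $U$ and $W$ as functions of $x$ and $y$

$$f_{XY}(x, y) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2 y}{2v}\right) \frac{y^{v/2 - 1} e^{-y/2}}{\Gamma(v/2)\, 2^{v/2}} \sqrt{\frac{y}{v}}$$

This expression can be rearranged as

$$f_{XY}(x, y) = \frac{1}{\sqrt{2\pi v}\, \Gamma(v/2)\, 2^{v/2}}\; y^{(v - 1)/2} \exp\left[-\frac{y}{2}\left(1 + \frac{x^2}{v}\right)\right]$$

The probability density function of the random variable $X$ is obtained by integrating $f_{XY}(x, y)$ over the whole range of $y$ variations (marginal density)

$$f_X(x) = \int_0^\infty f_{XY}(x, y)\, dy$$

which, after isolation of the constant terms, becomes

$$f_X(x) = \frac{1}{\sqrt{2\pi v}\, \Gamma(v/2)\, 2^{v/2}} \int_0^\infty y^{(v - 1)/2} \exp\left[-\frac{y}{2}\left(1 + \frac{x^2}{v}\right)\right] dy$$

We recognize in the integral a form that is close to the Eulerian gamma function (4.1.22), which becomes more visible by preparing for variable change as

$$f_X(x) = \frac{1}{\sqrt{2\pi v}\, \Gamma(v/2)\, 2^{v/2}} \left[\frac{2}{1 + x^2/v}\right]^{(v - 1)/2} \int_0^\infty \left[\frac{y}{2}\left(1 + \frac{x^2}{v}\right)\right]^{(v - 1)/2} \exp\left[-\frac{y}{2}\left(1 + \frac{x^2}{v}\right)\right] dy$$

Introducing the new variable $z$ as

$$z = \frac{y}{2}\left(1 + \frac{x^2}{v}\right)$$

we obtain the expression

$$f_X(x) = \frac{1}{\sqrt{2\pi v}\, \Gamma(v/2)\, 2^{v/2}} \left[\frac{2}{1 + x^2/v}\right]^{(v + 1)/2} \int_0^\infty z^{(v + 1)/2 - 1} e^{-z}\, dz$$

where the integral equals $\Gamma[(v + 1)/2]$. The final probability density $f_X(x)$ is therefore

$$f_X(x) = \frac{\Gamma[(v + 1)/2]}{\sqrt{\pi v}\, \Gamma(v/2)} \left(1 + \frac{x^2}{v}\right)^{-(v + 1)/2}$$

which we identify as the Student $t$-distribution described by equation (4.1.36) with $v = m - 1$ degrees of freedom.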

The mean $\bar{x}$ of these measurements is 0.710259, while their standard deviation $s$ is 0.0000104 (we take one more digit to keep a reasonable precision on ratios). Let us form the variable $t$ which is meant to represent a specific value taken by the Student-$t$ variable and such that

$$t = \frac{\bar{x} - 0.710250}{s/\sqrt{6 - 1}} = \frac{0.710259 - 0.710250}{0.0000104/\sqrt{5}} = 1.94$$

The Student-$t$ percentile $t_{5,\,97.5}$ is 2.57, so 95 percent of the surface enclosed under the Student distribution curve lies inside the interval $[-2.57, +2.57]$. Since $t$ lies within that interval, we will assume that the mass spectrometer is unbiased for Sr isotope measurements. ⇔

4.2.4 Confidence region of a sample from a normal population

The confidence intervals defined for a single random variable become confidence regions for jointly distributed random variables. In the case of a multivariate normal distribution, the equation of the surface limiting the confidence region of the mean vector will now be shown to be an $n$-dimensional ellipsoid. Let us assume that $X$ is a vector of $n$ normally distributed variables with mean $n$-column vector $\mu$ and covariance matrix $\Sigma_X$. A sample of $m$ observations has a mean vector $\bar{x}$ and an $n \times n$ covariance matrix $S$. We know that the statistic $T^2 = m(\bar{x} - \mu)^T S^{-1} (\bar{x} - \mu)$ is distributed as the Hotelling's $T^2$ distribution and that

$$\frac{m - n}{n(m - 1)}\, T^2 = \frac{m - n}{n(m - 1)}\, m\, (\bar{x} - \mu)^T S^{-1} (\bar{x} - \mu)$$

is distributed as the distribution $F$ with $n$ and $m - n$ degrees of freedom. A $100(1 - \alpha)$ percent confidence interval for $T^2$ is defined as

$$P\left\{\frac{m - n}{n(m - 1)}\, T^2 \le F_{100(1 - \alpha),\, n,\, m - n}\right\} = 1 - \alpha \tag{4.2.36}$$

Because we are dealing with positive numbers, a one-sided confidence condition defines the confidence region. Equation (4.2.36) can be rewritten as

$$P\left\{(\bar{x} - \mu)^T \left(\frac{S}{m}\right)^{-1} (\bar{x} - \mu) \le \frac{n(m - 1)}{m - n}\, F_{100(1 - \alpha),\, n,\, m - n}\right\} = 1 - \alpha \tag{4.2.37}$$

The coordinates $x_{100(1 - \alpha)}(\mu)$ of the confidence region boundary are therefore given by the equation

$$[x_{100(1 - \alpha)}(\mu) - \bar{x}]^T \left(\frac{S}{m}\right)^{-1} [x_{100(1 - \alpha)}(\mu) - \bar{x}] = \frac{n(m - 1)}{m - n}\, F_{100(1 - \alpha),\, n,\, m - n} \tag{4.2.38}$$

or

$$[x_{100(1 - \alpha)}(\mu) - \bar{x}]^T \left[\frac{n(m - 1)}{m - n}\, F_{100(1 - \alpha),\, n,\, m - n}\, \frac{S}{m}\right]^{-1} [x_{100(1 - \alpha)}(\mu) - \bar{x}] = 1 \tag{4.2.39}$$

that is, an ellipsoid centered at $\bar{x}$. If $\lambda_i$ is the $i$th eigenvalue of $S$, the length $d_i$ of the semi-axis along the $i$th eigenvector will be

$$d_i = \sqrt{\lambda_i\, \frac{n(m - 1)}{m(m - n)}\, F_{100(1 - \alpha),\, n,\, m - n}} \tag{4.2.40}$$

For $m \gg n$, i.e., when the sample size greatly exceeds the number of variables, we could write, using equation (4.1.38), a slightly simpler form of the confidence region as

$$P\left\{m\,(\bar{x} - \mu)^T S^{-1} (\bar{x} - \mu) \le \chi^2_{100(1 - \alpha),\, n}\right\} = 1 - \alpha \tag{4.2.41}$$

☞ For the four samples from a Polynesian island considered above, draw the 95 percent confidence region for the mean $\mu$ of lead isotope ratios and compare the results with the individual 95 percent confidence interval for the mean of each ratio.

We found $\bar{x} = [19.132, 15.606]^T$ and the covariance matrix $S$ such that

$$S = \begin{bmatrix} 0.011509 & 0.002314 \\ 0.002314 & 0.000621 \end{bmatrix}$$

The eigenvalues of $S$ are calculated by MatLab to form the diagonal matrix $\Lambda$ as

$$\Lambda = \begin{bmatrix} 1.198 \times 10^{-2} & 0 \\ 0 & 1.497 \times 10^{-4} \end{bmatrix}$$

and its eigenvector matrix as

$$U = \begin{bmatrix} 0.9799 & 0.1996 \\ 0.1996 & -0.9799 \end{bmatrix}$$

It will be easily checked that $U$ is an orthogonal matrix. Moreover, from standard statistical tables

$$\frac{n(m - 1)}{m(m - n)}\, F_{95,\, 2,\, 2} = \frac{2(4 - 1)}{4(4 - 2)} \times 19.0 = 14.25 = 3.77^2$$

The length of the semi-axis is $\sqrt{0.0120} \times 3.77 = 0.412$ along the first eigenvector and $\sqrt{0.000150} \times 3.77 = 0.0462$ along the second eigenvector. We can now draw the 95 percent confidence ellipse using the method outlined in Section 2.4. First, let us write the diagonal form of the matrix $S$ as

$$S = U \Lambda U^T$$

and therefore

$$S^{-1} = U \Lambda^{-1} U^T = (\Lambda^{-1/2} U^T)^T \Lambda^{-1/2} U^T \tag{4.2.42}$$

Inserting this form of $S^{-1}$ into the equation (4.2.39) of the confidence ellipse gives

$$[x_{100(1 - \alpha)}(\mu) - \bar{x}]^T\, \frac{(\Lambda^{-1/2} U^T)^T \Lambda^{-1/2} U^T}{\dfrac{n(m - 1)}{m(m - n)}\, F_{100(1 - \alpha),\, n,\, m - n}}\, [x_{100(1 - \alpha)}(\mu) - \bar{x}] = 1 \tag{4.2.43}$$

Introducing the vector $z$ defined as

$$z = \frac{\Lambda^{-1/2} U^T [x_{100(1 - \alpha)}(\mu) - \bar{x}]}{\sqrt{\dfrac{n(m - 1)}{m(m - n)}\, F_{100(1 - \alpha),\, n,\, m - n}}} \tag{4.2.44}$$

equation (4.2.43) becomes

$$z^T z = 1$$

i.e., the equation of the unit circle. We therefore calculate the coordinates of an arbitrary number of points $z_i$ ($i = 1, 2, \ldots$) on the unit circle. This is most easily done by incrementing an arbitrary angle $\varphi_i$ from 0 to $2\pi$ and taking $z_i = [\cos \varphi_i, \sin \varphi_i]^T$. We next compute the coordinates for the 95 percent confidence ellipse of the mean using the reverse transformation

$$x_{100(1 - \alpha)}(\mu) = \bar{x} + \sqrt{\frac{n(m - 1)}{m(m - n)}\, F_{100(1 - \alpha),\, n,\, m - n}}\;\, U \Lambda^{1/2} z$$

We can now compute the coordinates $x_{95}^{(i)}(\mu)$ and $y_{95}^{(i)}(\mu)$ of the $i$th point on the 95 percent confidence ellipse as

$$\begin{bmatrix} x_{95}^{(i)}(\mu) \\ y_{95}^{(i)}(\mu) \end{bmatrix} = \begin{bmatrix} 19.132 \\ 15.606 \end{bmatrix} + \begin{bmatrix} -0.4043 & 0.0092 \\ -0.0824 & -0.0452 \end{bmatrix} \begin{bmatrix} \cos \varphi_i \\ \sin \varphi_i \end{bmatrix}$$

Table 4.4 gives the coordinates for 12 points of the 95 percent confidence region of the mean $\mu$. The complete ellipse is drawn in Figure 4.9. The 95 percent confidence intervals of the mean of each coordinate $^{206}$Pb/$^{204}$Pb and $^{207}$Pb/$^{204}$Pb are calculated for $m = 4$, i.e., for 3 degrees of freedom. From standard statistical tables, we obtain $t_{95,\,3} = 3.18$ (two-sided). The 95 percent confidence intervals for the mean are therefore

for $^{206}$Pb/$^{204}$Pb

Table 4.4. Contours of the 95 percent confidence ellipse of the mean for the Pb isotope composition of Polynesia lavas given in Table 4.3.

φ_i (deg)   cos φ_i    sin φ_i    206Pb/204Pb   207Pb/204Pb
0            1.0000     0.0000    19.1412       15.5611
30           0.8660     0.5000    18.9378       15.5259
60           0.5000     0.8660    18.7864       15.5123
90           0.0000     1.0000    18.7277       15.5239
120         -0.5000     0.8660    18.7772       15.5575
150         -0.8660     0.5000    18.9219       15.6042
180         -1.0000     0.0000    19.1228       15.6514
210         -0.8660    -0.5000    19.3262       15.6866
240         -0.5000    -0.8660    19.4776       15.7002
270         -0.0000    -1.0000    19.5363       15.6886
300          0.5000    -0.8660    19.4868       15.6550
330          0.8660    -0.5000    19.3421       15.6083
360          1.0000    -0.0000    19.1412       15.5611
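The construction of the confidence ellipse can be sketched numerically as follows. This is an illustration under stated assumptions: NumPy is available, and the percentile `F95 = 19.0` is the tabulated 95 percent value of the F distribution for (2, 2) degrees of freedom, entered by hand. Because the sign and ordering of computed eigenvectors are arbitrary, the contour points may be traversed from a different starting angle than in Table 4.4, although they describe the same ellipse.

```python
import numpy as np

x_bar = np.array([19.132, 15.606])            # sample mean (Table 4.3)
S = np.array([[0.011509, 0.002314],
              [0.002314, 0.000621]])          # sample covariance matrix
n, m = 2, 4                                   # variables, observations
F95 = 19.0   # assumed table value: 95 % F percentile for (n, m - n) = (2, 2)

# Diagonal form S = U Lambda U^T, largest eigenvalue first
lam, U = np.linalg.eigh(S)
lam, U = lam[::-1], U[:, ::-1]

k = n * (m - 1) / (m * (m - n)) * F95         # scaling factor of eq. (4.2.40)
d = np.sqrt(lam * k)                          # semi-axis lengths

# Map points z of the unit circle onto the ellipse (reverse of eq. 4.2.44)
phi = np.radians(np.arange(0, 361, 30))
z = np.vstack([np.cos(phi), np.sin(phi)])
T = np.sqrt(k) * U @ np.diag(np.sqrt(lam))
ellipse = x_bar[:, None] + T @ z              # one column per contour point

print("semi-axes:", d)
print("contour points:\n", ellipse.T)
```

Each column of `ellipse` satisfies equation (4.2.39), which provides a simple consistency check on the transformation matrix.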

Figure 4.9 The 95 percent confidence ellipse of the mean for the Polynesian data listed in Table 4.3 (horizontal axis: 206Pb/204Pb). The horizontal and vertical bars show the 95 percent confidence intervals of the mean calculated independently for each coordinate.

and 15.606±3.18(—^

for

207

Pb/ 204 Pb

These intervals are drawn in Figure 4.9. In the case of correlated variables, the confidence region calculated from the joint distribution is significantly larger than the confidence region calculated from individual distributions. ⇔

Table 4.5. Ce and Yb concentrations (ppm) and Ce/Yb ratios in a geochemical survey of 20 igneous samples.

Ce       Yb     Ce/Yb     Ce       Yb     Ce/Yb
22.07    5.22     4.23    94.34    2.20    42.85
11.83    4.01     2.95    63.78    1.11    57.55
12.31    5.02     2.45    16.47    0.67    24.59
10.75    5.37     2.00     3.83    0.91     4.23
 8.98    0.72    12.48     8.78    0.71    12.29
29.98    1.54    19.42    42.06    0.85    49.65
16.83    0.97    17.29     5.71    2.57     2.22
117.78   1.21    97.00   103.49    1.03   100.72
19.73    1.25    15.84    74.33    0.93    80.00
75.45    1.26    60.06    29.82    0.94    31.59

☞ In a random geochemical survey, Ce and Yb concentrations have been measured in twenty igneous samples. The results in ppm are reported in Table 4.5. Find the 95 percent bivariate confidence regions for the mean of (i) the Ce-Yb pair (ii) the Ce/Yb-Ce pair.

(i) For the Ce and Yb pair, the sample mean is $\bar{x} = [38.42, 1.92]^T$ and the standard deviation vector $s = [36.17, 1.62]^T$, hinting at a strong non-normality. The correlation between the data is weak with a correlation coefficient of $-0.28$. The sample covariance matrix $S$ is

$$S = \begin{bmatrix} 1308 & -16.42 \\ -16.42 & 2.61 \end{bmatrix}$$

with eigencomponents

$$\Lambda = \begin{bmatrix} 1308 & 0 \\ 0 & 2.408 \end{bmatrix} \quad\text{and}\quad U = \begin{bmatrix} -0.9999 & 0.0126 \\ 0.0126 & 0.9999 \end{bmatrix}$$

In agreement with the small correlation coefficient, the eigenvectors are nearly perfectly parallel to the coordinate axes. Using tables, we find

$$\frac{n(m - 1)}{m(m - n)}\, F_{100(1 - \alpha),\, n,\, m - n} = \frac{2(20 - 1)}{20(20 - 2)}\, F_{95,\, 2,\, 18} = 0.106 \times 2.62 = 0.277 = 0.526^2$$

Defining $z$ as a vector $(\cos \varphi_i, \sin \varphi_i)$ of coordinates for a point on the unit circle as above, the transformation formula for the contour of the 95 percent confidence region of $\mu$ is obtained as

$$x_{95}(\mu) = \bar{x} + \sqrt{\frac{n(m - 1)}{m(m - n)}\, F_{95,\, n,\, m - n}}\;\, U \Lambda^{1/2} z$$

or, inserting the values

$$\begin{bmatrix} x_{95}^{(i)}(\mu) \\ y_{95}^{(i)}(\mu) \end{bmatrix} = \begin{bmatrix} 38.42 \\ 1.92 \end{bmatrix} + \begin{bmatrix} -19.0239 & -0.0103 \\ 0.2392 & -0.8162 \end{bmatrix} \begin{bmatrix} \cos \varphi_i \\ \sin \varphi_i \end{bmatrix}$$

(ii) For the Ce and Ce/Yb pair, the sample mean is $\bar{x} = [38.42, 31.97]^T$ and the standard deviation vector $s = [36.17, 32.14]^T$, hinting again at a strong non-normality. In contrast, the correlation between the data is strong, as could be expected from Ce appearing in both variables, with a correlation coefficient of 0.93. The sample covariance matrix $S$ is

$$S = \begin{bmatrix} 1308 & 1080 \\ 1080 & 1033 \end{bmatrix}$$

and has the eigencomponents

$$\Lambda = \begin{bmatrix} 2260 & 0 \\ 0 & 81.5 \end{bmatrix} \quad\text{and}\quad U = \begin{bmatrix} 0.7504 & 0.6610 \\ 0.6610 & -0.7504 \end{bmatrix}$$

The eigenvector coordinates have a rather similar modulus in agreement with the strong correlation coefficient (0.93). $F_{95,\, n,\, m - n}$ is as in (i) and finally

$$\begin{bmatrix} x_{95}^{(i)}(\mu) \\ y_{95}^{(i)}(\mu) \end{bmatrix} = \begin{bmatrix} 38.42 \\ 31.97 \end{bmatrix} + \begin{bmatrix} -18.763 & 3.139 \\ -16.526 & -3.564 \end{bmatrix} \begin{bmatrix} \cos \varphi_i \\ \sin \varphi_i \end{bmatrix}$$

which is the equation of a rather 'steep' and elongated ellipse drawn in Figure 4.10. If $X$ and $Y$ are independent random variables, $f(X, Y)$ and $g(X, Y)$, where $f$ and $g$ are some functions of $X$ and $Y$, may in general be suspected to be correlated to an extent that should always be carefully assessed. ⇔
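The contrast between the two pairs of variables can be reproduced directly from the data of Table 4.5. The sketch below assumes NumPy; the ratio Ce/Yb is recomputed from the two concentration columns rather than copied from the table, so the last digits may differ slightly from the tabulated ratios.

```python
import numpy as np

# Table 4.5: Ce and Yb concentrations (ppm) of the 20 igneous samples
ce = np.array([22.07, 11.83, 12.31, 10.75, 8.98, 29.98, 16.83, 117.78,
               19.73, 75.45, 94.34, 63.78, 16.47, 3.83, 8.78, 42.06,
               5.71, 103.49, 74.33, 29.82])
yb = np.array([5.22, 4.01, 5.02, 5.37, 0.72, 1.54, 0.97, 1.21, 1.25,
               1.26, 2.20, 1.11, 0.67, 0.91, 0.71, 0.85, 2.57, 1.03,
               0.93, 0.94])
ratio = ce / yb                                  # Ce/Yb recomputed from the data

S_ce_yb = np.cov(np.vstack([ce, yb]))            # covariance matrix of case (i)
S_ce_ratio = np.cov(np.vstack([ce, ratio]))      # covariance matrix of case (ii)

r_ce_yb = np.corrcoef(ce, yb)[0, 1]              # weak, negative
r_ce_ratio = np.corrcoef(ce, ratio)[0, 1]        # strong: Ce enters both variables

print("means:", ce.mean(), yb.mean(), ratio.mean())
print("r(Ce, Yb) =", r_ce_yb)
print("r(Ce, Ce/Yb) =", r_ce_ratio)
```

Forming a ratio from one of the variables is enough to turn a weak correlation into a strong one, which is the point made at the end of the example.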

4.3 Error propagation and error calculation

4.3.1 General concepts

We have already met the concept of error propagation a few times when dealing with the change of variable formulas for probability distribution, but let us try to illustrate it with a simple example. We want to measure the diffusion coefficient $\mathcal{D}$ of uranium in a glass by maintaining at a specific temperature and for a specific time $t$ the surface of one long glass rod in contact with a concentrated solution of uranium. We admit without further justification (see Section 8.5) that the depth $x$ of uranium

238

Probability and statistics

zero and their variance is given by

m— 1

(4.4.6)

Now comes the very principle of the principal component analysis. A total variance is now defined as the trace of the matrix $S_x$ or, using a property of the trace of a matrix product given in Section 2.2

$$\operatorname{tr} S_x = \operatorname{tr}(U \Lambda U^T) = \operatorname{tr}(\Lambda U^T U) = \operatorname{tr} \Lambda = \sum_j \lambda_j \tag{4.4.7}$$

and the proportion of that variance explained by the component $k$ is the ratio $p_k$ given by

$$p_k = \frac{\lambda_k}{\sum_j \lambda_j} \tag{4.4.8}$$

Adding variances on different variables at the denominator, e.g. pH and temperature in solutions, does not make much sense and is certainly not invariant upon rescaling. Proportions of explained total variance do not survive a simple change of units! For this reason, PCA is commonly carried out instead on normalized variables $\xi_i$ such as

$$\xi_i = s^{-1}(x_i - \bar{x}) \tag{4.4.9}$$

where $s$ is the diagonal matrix of sample standard deviations. The $n \times m$ matrix $\Xi$ collects the $i = 1, \ldots, m$ normalized measurements $\xi_i$. As we have seen in equation (4.2.33), the covariance matrix of standardized variables is the correlation matrix of the non-standardized variables. Therefore, the $\xi_i$ have the correlation matrix $R$ for covariance matrix. The diagonal form of $R$ is

$$R = V \Delta V^T \tag{4.4.10}$$

where $\Delta$ is the diagonal matrix of eigenvalues $\delta_1, \ldots, \delta_n$ and $V$ the matrix of the orthogonal eigenvectors $v_1, \ldots, v_n$ of $R$. The component $f_{ji}$ of the $i$th vector $x_i$ along the $j$th eigenvector $v_j$ of $R$ is given by

$$f_{ji} = v_j^T \xi_i \tag{4.4.11}$$

and the components can be collected in an $n \times m$ matrix $F$ obtained as before through

$$F = V^T \Xi \tag{4.4.12}$$

or, since $V$ is an orthogonal matrix

$$\Xi = V F \tag{4.4.13}$$

e.g., chemical heterogeneities in the glass or alternate transport processes, pointing to a failure of the simple model. In order to describe these fluctuations, we must therefore calculate the sample statistics of $x$ and $\mathcal{D}$ from the set of measurements. In case (ii), we do not really know which part of the measurement dispersion can be ascribed to fluctuations in the diffusion process and which part comes from the poor measurement reliability. If we could describe the distribution of the random variable $x$, we could proceed as in the previous section by changing the variable and assess the distribution of $\mathcal{D}$. This is however an unlikely situation and only sample statistics, namely the sample mean and variance, are available from the experiment. A confidence interval (spread) on the measurement is identified as the experimental uncertainty on $x$. This uncertainty is then propagated by the techniques described below on the estimate of $\mathcal{D}$, then compared with the observed spread of the measurements. If the propagated experimental uncertainty on $\mathcal{D}$ is significantly larger than the measured confidence interval, we can decide that a unique $\mathcal{D}$ value is consistent with the observations. If the spread on $\mathcal{D}$ is significantly larger than the propagated experimental uncertainty, we can suspect either fluctuations in other parameters or, more seriously, failure of the model. Situations such as (ii) are the most frequent and actually turn out to be quite satisfactory since we can assess the limitations of the measurements relative to fluctuations (or failure) of the model. We now know enough probability and statistics to make an assessment of the calculated errors. We will give some examples of error propagation, both linear and non-linear, using explicit or Monte-Carlo techniques. Examples of decision making (how significant is significant?) will be given in Chapter 5.

4.3.2 Linear error propagation

We have considered in some detail in Section 4.2 the case where the random vector $Y$ of $n$ ancillary or dependent variables relates linearly to those of a vector $X$ of $n$ principal or independent variables (e.g., raw data) with covariance matrix $\Sigma_X$ through the matrix equality

$$Y = AX + b$$

where $X, Y, b \in \Re^n$ and $A, \Sigma_X \in \Re^{n \times n}$. From equation (4.2.31), the covariance matrix $\Sigma_Y$ of $Y$ is equal to

$$\Sigma_Y = A \Sigma_X A^T$$

There is no restriction in the derivation of this relationship that would prevent its extension to cases where $X \in \Re^n$, $\Sigma_X \in \Re^{n \times n}$, $Y$ and $b \in \Re^m$, and $A \in \Re^{m \times n}$ with $m \neq n$. Then, $\Sigma_Y \in \Re^{m \times m}$ and equation (4.2.31) is still valid. Error propagation is achieved by replacing the population parameters by the values estimated by sampling, e.g., $\bar{x}$ for the sample mean

$$\bar{x}_y = A \bar{x}_x + b \tag{4.3.3}$$

and $S$ for the sample covariance matrix

$$S_y = A S_x A^T \tag{4.3.4}$$

where the estimates are subscripted with reference to the corresponding variables. The covariance matrix of the mean vector $\bar{x}_y$ would be derived through a similar expression.

☞ A sample of metamorphic carbonate contains calcite CaCO3, dolomite Ca0.5Mg0.5CO3, and diopside CaMgSi2O6. A chemical analysis on the calcinated (CO2-free) rock indicates the following molar proportions: 0.525 (0.03) CaO, 0.225 (0.01) MgO, and 0.25 (0.02) SiO2 with standard deviations given in parentheses. Find the molar proportions of each mineral in the rock and their standard deviation.

Let us denote $x = [x_{\mathrm{CaO}}, x_{\mathrm{MgO}}, x_{\mathrm{SiO_2}}]^T$ the vector of rock concentrations, and $y = [y_{\mathrm{ca}}, y_{\mathrm{do}}, y_{\mathrm{di}}]^T$ the vector of molar proportions of each mineral. The mineralogical matrix of this rock is given in Table 4.6.

Table 4.6. Mineralogical matrix (molar fractions) of a metamorphic carbonate.

        ca     do     di
CaO     1      1/2    1/4
MgO     0      1/2    1/4
SiO2    0      0      1/2

CaO mass balance between minerals and rocks reads

$$y_{\mathrm{ca}} + \tfrac{1}{2} y_{\mathrm{do}} + \tfrac{1}{4} y_{\mathrm{di}} = 0.525$$

Likewise, for MgO

$$\tfrac{1}{2} y_{\mathrm{do}} + \tfrac{1}{4} y_{\mathrm{di}} = 0.225$$

For SiO2, the equation is

$$\tfrac{1}{2} y_{\mathrm{di}} = 0.250$$

Lumping the three equations in a matrix form, we get

$$\begin{bmatrix} 1 & 1/2 & 1/4 \\ 0 & 1/2 & 1/4 \\ 0 & 0 & 1/2 \end{bmatrix} \begin{bmatrix} y_{\mathrm{ca}} \\ y_{\mathrm{do}} \\ y_{\mathrm{di}} \end{bmatrix} = \begin{bmatrix} x_{\mathrm{CaO}} \\ x_{\mathrm{MgO}} \\ x_{\mathrm{SiO_2}} \end{bmatrix} = \begin{bmatrix} 0.525 \\ 0.225 \\ 0.250 \end{bmatrix}$$

a matrix equation which is inverted as

$$\begin{bmatrix} y_{\mathrm{ca}} \\ y_{\mathrm{do}} \\ y_{\mathrm{di}} \end{bmatrix} = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 2 & -1 \\ 0 & 0 & 2 \end{bmatrix} \begin{bmatrix} 0.525 \\ 0.225 \\ 0.250 \end{bmatrix} = \begin{bmatrix} 0.3 \\ 0.2 \\ 0.5 \end{bmatrix}$$

Defining the matrix $A$ as

$$A = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 2 & -1 \\ 0 & 0 & 2 \end{bmatrix}$$

this equality can be rewritten

$$y = Ax$$

In the absence of further information on correlations, we form the covariance matrix $S_x$ of the data vector $x$ as

$$S_x = \begin{bmatrix} 0.03^2 & 0 & 0 \\ 0 & 0.01^2 & 0 \\ 0 & 0 & 0.02^2 \end{bmatrix}$$

Applying equation (4.3.4) for the linear propagation of errors gives the covariance matrix $S_y$ of the vector $y$

$$S_y = A S_x A^T = \begin{bmatrix} 0.100 & -0.020 & 0 \\ -0.020 & 0.080 & -0.080 \\ 0 & -0.080 & 0.160 \end{bmatrix} \times 10^{-2}$$

The square-root of the diagonal elements gives the standard deviation and the final results are (standard deviations in parentheses)

$$\begin{bmatrix} y_{\mathrm{ca}} \\ y_{\mathrm{do}} \\ y_{\mathrm{di}} \end{bmatrix} = \begin{bmatrix} 0.300\ (0.032) \\ 0.200\ (0.028) \\ 0.500\ (0.040) \end{bmatrix}$$
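The propagation of equation (4.3.4) for this example can be sketched in a few lines. The code assumes NumPy; `B` denotes the mineralogical matrix of Table 4.6 and `A` its inverse, both names being chosen here for illustration.

```python
import numpy as np

# Mineralogical matrix of Table 4.6 (rows: CaO, MgO, SiO2; columns: ca, do, di)
B = np.array([[1.0, 0.5, 0.25],
              [0.0, 0.5, 0.25],
              [0.0, 0.0, 0.50]])
x = np.array([0.525, 0.225, 0.250])      # measured molar proportions of oxides
s_x = np.array([0.03, 0.01, 0.02])       # their standard deviations

A = np.linalg.inv(B)                     # y = A x
y = A @ x                                # mineral proportions

S_x = np.diag(s_x**2)                    # uncorrelated analytical errors
S_y = A @ S_x @ A.T                      # linear propagation, eq. (4.3.4)
s_y = np.sqrt(np.diag(S_y))              # standard deviations of ca, do, di

print("A =\n", A)
print("y =", y)
print("S_y =\n", S_y)
print("s_y =", s_y)
```

Note that the propagation must use the matrix `A` that maps the data onto the unknowns, not the mineralogical matrix `B` itself. The non-zero off-diagonal terms of `S_y` show that the mineral proportions are correlated even though the oxide analyses are not.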

This procedure for propagating errors is not entirely satisfactory since it neglects a source of strong correlation: the phase proportions must sum up to 100 percent even when they are allowed to fluctuate within errors. This point is dealt with in Chapter 5. ⇔

☞ Rare-earth elements in minerals can be measured in situ by ion probe. It is observed that Gd oxide peaks overlap with Yb masses (isobaric interference). The similarity of isotopic abundances makes correction by peak stripping rather imprecise. In an experiment, the following number of counts per second (cps) $I_m$ has been found for each of the following peaks: 87.0 cps at mass 171, 128.0 at 172, 95.6 at 173, and 174.1 at 174. Given the isotopic proportions $a_m^{\mathrm{Yb}}$ and $a_m^{\mathrm{GdO}}$ in the species Yb and Gd$^{16}$O listed in Table 4.7, allot a number $N$ of cps to each species. Standard deviation on each peak is assumed to be equal to the square-root of the number of cps (Poisson statistics). The measurement time is 1 second. Repeated measurements show that intensities of each peak fluctuate with a correlation coefficient of 0.9.

Table 4.7. Atomic abundance of selected isotopes of Yb and GdO at mass m.

m      a_m(Yb)   a_m(GdO)
171    0.143     0.148
172    0.219     0.205
173    0.161     0.157
174    0.318     0.248

A peak is the sum of the contributing species weighted by the isotope abundance of each species on this isotope (interference equation)

$$I_m = a_m^{\mathrm{Yb}} N_{\mathrm{Yb}} + a_m^{\mathrm{GdO}} N_{\mathrm{GdO}}$$

The co variance matrix of/ can be computed from standard-deviations and the unique correlation coefficient through equation (4.2.18) "9.33

0

0

1

0.9

0.9

0.91 r9.33

0

11.31

0

0.9

1

0.9

0.9

0

0.9

0

1JL 0

0

0

9.78

0

0

0

0.9 13.19

0.9

1

.0.9 0.9 0.9

0

0

0 "

11.31

0

0

0

9.78

0

0

0

13.19.

or

5=

- 87.00

105.53

91.20

123.04"

105.53

128.00

110.62

149.24

91.20

110.62

95.60

128.97

L123.04

149.24

128.97

174.00.

Let us write the system of interference equations for the masses 171 and 172 143 0.148~|pVYb 0.219 0.205


The solution is JVYh 1 f-66.19 70.71

47.79T 87.01 _ p 5 8 l - 46.17 JL128.0 J " [_242 J

with the covariance matrix given by linear propagation [equation (4.3.4)]

N

[-66.19 ~[_ 70.71

47.79T 87.00 105.53T-66.19 -46.17JL105.53 128.00JL 70.71

47.79T -46.17J

or 772 -10528

-105281 1372 J

As expected from nearly identical isotopic abundances of Yb and Gd 1 6 O at masses 171 and 172, the standard deviations on N Y b and N G d O are extremely large and errors are strongly correlated (r% — 1.0). A slightly better result is obtained upon replacement of peak 172 by peak 174. The system of interference equations now reads J 171~|

[0.143

0.148~|pVYb 1

/17J

L0.318

0.248 i_NGdo]

or

»U Gd oJ

3601 240J

L

with covariance matrix [ 312 L-2896

-28961 932 J

Errors on NYh and NGdO have been divided by a factor 2 but remain strongly correlated

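4.3.3 Linearized error propagation for non-linear relationships

The approach developed in this section is of considerable practical importance for the assessment of errors on data obtained through a complex reducing procedure from raw measurements (e.g., optical and mass spectrometry), or on variables inferred through complex modeling. Given a relationship between a random variable $X$ with mean $\mu_X$ and variance $\sigma_X^2$ and a dependent variable $Y$ such as

$$Y = \varphi(X) \tag{4.3.5}$$

where $\varphi$ is some known function, we wonder how to estimate the corresponding statistics $\mu_Y$ and $\sigma_Y^2$. Error propagation can be achieved through different means. First, if the density of probability function is known for the variable $X$ (usually a measurement), or if at least a reasonable guess of this function can be arrived at, the density of probability function for the variable $Y$ can occasionally be calculated analytically through equation (4.1.48) provided the function $\varphi$ is simple enough. More generally, $\varphi$ can be expanded to the first order about the mean $\mu_X$

$$Y = \varphi(\mu_X) + \left.\frac{d\varphi(X)}{dX}\right|_{\mu_X} (X - \mu_X) + \text{higher-order terms}$$

Taking the expectation of each side gives

$$\mathcal{E}(Y) = \varphi(\mu_X) + \left.\frac{d\varphi(X)}{dX}\right|_{\mu_X} [\mathcal{E}(X) - \mu_X] + \text{higher-order terms}$$

or, neglecting the terms of order higher than one

$$\mu_Y \approx \varphi(\mu_X) \tag{4.3.6}$$

These equations can be subtracted from each other to give

$$Y - \mu_Y \approx \left.\frac{d\varphi}{dX}\right|_{\mu_X} (X - \mu_X)$$

The expected values of the squares relate through

$$\mathcal{E}[(Y - \mu_Y)^2] \approx \left(\left.\frac{d\varphi}{dX}\right|_{\mu_X}\right)^2 \mathcal{E}[(X - \mu_X)^2]$$

The linear approximation for variance 'propagation' is therefore

$$\sigma_Y^2 \approx \left(\left.\frac{d\varphi}{dX}\right|_{\mu_X}\right)^2 \sigma_X^2 \tag{4.3.7}$$

The useful equations (4.3.6) and (4.3.7), which are valid only for smooth monotonous functions, can be translated into relationships between the corresponding sample statistics

$$\bar{y} \approx \varphi(\bar{x}) \tag{4.3.8}$$

and

$$s_y^2 \approx \left(\left.\frac{d\varphi}{dX}\right|_{\bar{x}}\right)^2 s_x^2 \tag{4.3.9}$$

Applying the equations (4.3.6) to (4.3.9) to highly non-linear functions $\varphi$ is usually inappropriate.

☞ The mean $^{143}$Nd/$^{144}$Nd of a sample has been found to be 0.513114 with a standard deviation of 0.000007. Given a present-day $^{143}$Nd/$^{144}$Nd ratio in chondrites of 0.512638 (an arbitrarily precise estimate), find the standard deviation on the mean $\varepsilon_{\mathrm{Nd}}(0)$ value.

By definition, $\varepsilon_{\mathrm{Nd}}(0)$ is calculated as

$$\varepsilon_{\mathrm{Nd}}(0) = \left[\frac{(^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{\mathrm{sample}}}{(^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{\mathrm{chondrites}}} - 1\right] \times 10^4$$

which, in this case, leads to

$$\varepsilon_{\mathrm{Nd}}(0) = \left(\frac{0.513114}{0.512638} - 1\right) \times 10^4 = 9.29$$

The derivative of the dependent variable $\varepsilon_{\mathrm{Nd}}(0)$ relative to the independent variable $(^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{\mathrm{sample}}$ is

$$\frac{d\,\varepsilon_{\mathrm{Nd}}(0)}{d\,(^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{\mathrm{sample}}} = \frac{10^4}{(^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{\mathrm{chondrites}}}$$

The variances relate to each other through

$$s^2[\varepsilon_{\mathrm{Nd}}(0)] = \left[\frac{10^4}{(^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{\mathrm{chondrites}}}\right]^2 s^2[(^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{\mathrm{sample}}]$$

and, inserting the values, we get

$$s[\varepsilon_{\mathrm{Nd}}(0)] = \frac{10^4}{0.512638} \times 0.000007 = 0.14$$

⇔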
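Equations (4.3.8) and (4.3.9) translate into a very small routine. The following sketch uses only the Python standard library; the function name `propagate_1d` and its arguments are hypothetical names introduced for illustration, and the derivative is supplied analytically by the caller.

```python
def propagate_1d(phi, dphi, x_mean, x_sd):
    """First-order propagation for a single variable.

    Returns the estimates of eq. (4.3.8), y = phi(x_mean), and of
    eq. (4.3.9), s_y = |dphi/dx| * s_x, with the derivative taken at x_mean.
    """
    return phi(x_mean), abs(dphi(x_mean)) * x_sd


CHONDRITES = 0.512638   # present-day 143Nd/144Nd of chondrites (taken as exact)

def epsilon_nd(ratio):
    # Definition of epsilon Nd(0)
    return (ratio / CHONDRITES - 1.0) * 1.0e4

def d_epsilon_nd(ratio):
    # Derivative of epsilon Nd(0) with respect to the sample ratio (a constant)
    return 1.0e4 / CHONDRITES

eps, eps_sd = propagate_1d(epsilon_nd, d_epsilon_nd, 0.513114, 0.000007)
print(f"epsilon Nd(0) = {eps:.2f} +/- {eps_sd:.2f}")
```

Because $\varepsilon_{\mathrm{Nd}}(0)$ is a linear function of the measured ratio, the first-order result is exact in this particular case.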

These relationships will be applied with utmost care for the determination of confidence intervals, especially when the probability density function of the dependent variable y is not symmetrical. If we now consider an n-vector X of n random variables ('data') with mean /i x and covariance matrix L x related to a vector Y of m ancillary variables through i= 1,..., m functions (pt (4.3.10)

226

Probability and statistics

and expand each y component in X about the mean fix j — fiXj) + higher-order terms

Grouping all similar equations in a matrix equality gives dcpi

dcp

dxn (X— nx) + higher-order terms -YJM-

Denoting A the mxn matrix of partial derivatives dcpJdXp the previous equation becomes Y(X)= Y(}ix) + A(X-fix) + higher-order terms Introducing the expectations, we obtain ^[ Y(X)~\ = Y(nx) + AS(X-nx) + higher-order terms or (4-3.11) The approximate propagation formula for the covariance matrix is therefore (4.3.12) We can relate the estimates x and S through the following propagation formulas for the mean (4.3.13) and the variance (4.3.14) The same note of caution applies to strongly non-linear functions j as in the case of a single random variable. & The Nd crustal residence age of a sediment is calculated as the time where the isotopic ratio of the sediment or its igneous protolith had the same 1 4 3 Nd/ 1 4 4 Nd ratio as a model depleted mantle. It is assumed that, once the sedimentary protolith is extracted from the depleted mantle, no further 1 4 7 Sm/ 1 4 4 Nd fractionation

4.3 Error propagation and error calculation

227

takes place. Given a sediment with ( 1 4 3 Nd/ 1 4 4 Nd) s = 0.511 815 (s = 0.000012) and ( 147 Srn/ 144 Nd) s = 0.108 (5 = 0.001) and a model depleted mantle with present-day values of ( 1 4 3 Nd/ 1 4 4 Nd) D M = 0.513 114 and ( 147 Srn/ 144 Nd) DM = 0.222, calculate the Nd crustal residence age of this sediment and its standard deviation. Assume that errors on each ratio are uncorrelated. The decay constant of 147 Sm is X = 0.654 x 10"* * a~ 1 . The equation of radioactive decay for the Sm-Nd system in the sediment reads

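$$\left(\frac{^{143}\mathrm{Nd}}{^{144}\mathrm{Nd}}\right)_S^{T} = \left(\frac{^{143}\mathrm{Nd}}{^{144}\mathrm{Nd}}\right)_S^{0} - \left(\frac{^{147}\mathrm{Sm}}{^{144}\mathrm{Nd}}\right)_S^{0}\left(e^{\lambda T} - 1\right)$$

where the superscripts indicate the geological age (0 for present, T for the age) with a similar equation for the depleted mantle. At time $T = T_{DM}$, the isotopic ratios were equal in both systems, and therefore

$$\left(\frac{^{143}\mathrm{Nd}}{^{144}\mathrm{Nd}}\right)_S^{0} - \left(\frac{^{147}\mathrm{Sm}}{^{144}\mathrm{Nd}}\right)_S^{0}\left(e^{\lambda T_{DM}} - 1\right) = \left(\frac{^{143}\mathrm{Nd}}{^{144}\mathrm{Nd}}\right)_{DM}^{0} - \left(\frac{^{147}\mathrm{Sm}}{^{144}\mathrm{Nd}}\right)_{DM}^{0}\left(e^{\lambda T_{DM}} - 1\right)$$

which gives

$$e^{\lambda T_{DM}} - 1 = \frac{(^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{DM}^{0} - (^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{S}^{0}}{(^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{DM}^{0} - (^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{S}^{0}}$$

$T_{DM}$ is obtained from the equation

$$T_{DM} = \frac{1}{\lambda}\,\ln\left[1 + \frac{(^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{DM}^{0} - (^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{S}^{0}}{(^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{DM}^{0} - (^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{S}^{0}}\right]$$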

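In order to propagate the uncertainties on ($^{143}$Nd/$^{144}$Nd)$_S$ and ($^{147}$Sm/$^{144}$Nd)$_S$ towards $T_{DM}$, we first need to compute the partial derivatives of $T_{DM}$ relative to these two variables. Using the rules of calculus, we get

$$\frac{\partial T_{DM}}{\partial (^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_S^{0}} = \frac{1}{\lambda}\; \frac{\dfrac{-1}{(^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{DM}^{0} - (^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{S}^{0}}}{1 + \dfrac{(^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{DM}^{0} - (^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{S}^{0}}{(^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{DM}^{0} - (^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{S}^{0}}}$$

which, with the help of equation (4.3.16), can be simplified into

$$\frac{\partial T_{DM}}{\partial (^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_S^{0}} = -\frac{1}{\lambda}\; \frac{1 - e^{-\lambda T_{DM}}}{(^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{DM}^{0} - (^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{S}^{0}}$$

Likewise, the derivative of $T_{DM}$ relative to ($^{147}$Sm/$^{144}$Nd)$_S$ is calculated as

$$\frac{\partial T_{DM}}{\partial (^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_S^{0}} = \frac{1}{\lambda}\; \frac{\dfrac{(^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{DM}^{0} - (^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{S}^{0}}{\left[(^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{DM}^{0} - (^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{S}^{0}\right]^2}}{1 + \dfrac{(^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{DM}^{0} - (^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{S}^{0}}{(^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{DM}^{0} - (^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{S}^{0}}}$$

or

$$\frac{\partial T_{DM}}{\partial (^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_S^{0}} = \frac{1}{\lambda}\; \frac{1 - e^{-\lambda T_{DM}}}{(^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{DM}^{0} - (^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{S}^{0}} \qquad (4.3.17)$$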

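Inserting the numerical values gives

$$T_{DM} = \frac{1}{0.654 \times 10^{-11}}\,\ln\left[1 + \frac{0.513114 - 0.511815}{0.222 - 0.108}\right] = 1.732\ \text{Ga}$$

$$\frac{\partial T_{DM}}{\partial (^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_S^{0}} = \frac{1}{0.654 \times 10^{-11}} \times \frac{-0.011266}{0.513114 - 0.511815} = -1326\ \text{Ga}$$

$$\frac{\partial T_{DM}}{\partial (^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_S^{0}} = \frac{1}{0.654 \times 10^{-11}} \times \frac{0.011266}{0.222 - 0.108} = 15.11\ \text{Ga}$$

Summarizing the results, the vector A is

$$A = [-1326,\ 15.11]$$

The covariance matrix of the measured ($^{143}$Nd/$^{144}$Nd)$_S$ and ($^{147}$Sm/$^{144}$Nd)$_S$ is

$$S = \begin{bmatrix} (12 \times 10^{-6})^2 & 0 \\ 0 & (0.001)^2 \end{bmatrix}$$

with zero off-diagonal terms since the variables are uncorrelated. The variance of $T_{DM}$ can now be calculated from equation (4.3.14) as

$$s_{T_{DM}}^2 = A S A^T = (-1326 \times 0.000012)^2 + (15.11 \times 0.001)^2 = 4.81 \times 10^{-4}\ \text{Ga}^2$$

and its standard deviation as

$$s_{T_{DM}} = 0.022\ \text{Ga}$$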
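The arithmetic of this example can be checked numerically. The following Python sketch, which assumes NumPy is available, evaluates the model age, the derivative vector A and the first-order propagated standard deviation of equation (4.3.14). The function and variable names (`t_dm`, `t_dm_gradient`, `S_x`) are illustrative choices and do not come from the text.

```python
import numpy as np

LAMBDA_SM147 = 0.654e-11        # decay constant of 147Sm in 1/a (value from the text)
ND_DM, SM_DM = 0.513114, 0.222  # present-day model depleted mantle ratios

def t_dm(nd_s, sm_s):
    """Model age in years from the present-day ratios of the sediment."""
    return np.log(1.0 + (ND_DM - nd_s) / (SM_DM - sm_s)) / LAMBDA_SM147

def t_dm_gradient(nd_s, sm_s):
    """Partial derivatives [dT/d(143Nd/144Nd), dT/d(147Sm/144Nd)] in years."""
    factor = 1.0 - np.exp(-LAMBDA_SM147 * t_dm(nd_s, sm_s))
    return np.array([-factor / (LAMBDA_SM147 * (ND_DM - nd_s)),
                     factor / (LAMBDA_SM147 * (SM_DM - sm_s))])

nd_s, sm_s = 0.511815, 0.108
S_x = np.diag([0.000012**2, 0.001**2])   # uncorrelated measurement errors

age_ga = t_dm(nd_s, sm_s) / 1e9          # age in Ga
A = t_dm_gradient(nd_s, sm_s) / 1e9      # derivatives in Ga per unit ratio
sd_age_ga = np.sqrt(A @ S_x @ A)         # first-order propagation, A S A^T

print(f"T_DM = {age_ga:.3f} Ga, A = {np.round(A, 2)}, s = {sd_age_ga:.3f} Ga")
```

Because the two error terms contribute almost equally to the variance, improving only one of the two measurements would reduce the age uncertainty by less than a third.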

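This method can be extended to sets of non-linear implicit relationships. Let us assume that the m-vector Y is defined by a set of k = 1, ..., m expressions as a function of the n components of the vector X through

$$\eta_k(Y) = \varphi_k(X) \qquad (4.3.18)$$

In order to expand each expression to the first degree about the mean, we simply calculate the differential of each expression

$$\sum_{j=1}^{m} \frac{\partial \eta_k}{\partial y_j}\,\mathrm{d}y_j = \sum_{j=1}^{n} \frac{\partial \varphi_k}{\partial x_j}\,\mathrm{d}x_j \qquad (4.3.19)$$

Replacing the infinitesimal increments by the deviation from the mean and combining the m differential equations (4.3.19), we get the system

$$\begin{bmatrix} \dfrac{\partial \eta_1}{\partial y_1} & \cdots & \dfrac{\partial \eta_1}{\partial y_m} \\ \vdots & & \vdots \\ \dfrac{\partial \eta_m}{\partial y_1} & \cdots & \dfrac{\partial \eta_m}{\partial y_m} \end{bmatrix} (Y - \mu_Y) = \begin{bmatrix} \dfrac{\partial \varphi_1}{\partial x_1} & \cdots & \dfrac{\partial \varphi_1}{\partial x_n} \\ \vdots & & \vdots \\ \dfrac{\partial \varphi_m}{\partial x_1} & \cdots & \dfrac{\partial \varphi_m}{\partial x_n} \end{bmatrix} (X - \mu_X) + \text{higher-order terms}$$

Noting A the m × n matrix with elements $\partial \varphi_k / \partial x_j$ and B the m × m matrix with elements $\partial \eta_k / \partial y_j$ yields the compact expression

$$B(Y - \mu_Y) = A(X - \mu_X) + \text{higher-order terms}$$

Taking the expected value of each side, we get

$$B[\mu_Y - \mathcal{E}(Y)] = A[\mu_X - \mathcal{E}(X)] + \text{higher-order terms}$$

Neglecting at this point the higher-order terms, multiplying the last equation by its transpose and taking the expectation gives

$$B\,\Sigma_Y B^T = A\,\Sigma_X A^T \qquad (4.3.20)$$

One can assume that the dependent variables Y are defined relative to the variables X by at least n independent equations. Upon pre-multiplication by $B^{-1}$ and post-multiplication by $(B^T)^{-1}$, this equation becomes

$$\Sigma_Y = B^{-1} A\,\Sigma_X (B^{-1} A)^T \qquad (4.3.21)$$

The sample covariance matrices are likewise related through the following relationship

$$S_Y \approx B^{-1} A\,S_X (B^{-1} A)^T \qquad (4.3.22)$$

☞ Isotope dilution with mass fractionation correction. $N_{Nd}^{sp}$ = 0.1 nmol of a $^{150}$Nd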

spike is added to a sample containing $N_{Nd}^{nat}$ mol of natural Nd. The mixture is run on a thermal ionization mass spectrometer. Intensities $I_m$ are measured in volts (V) on the digital voltmeter (DVM) at masses m = 144, 146, and 150 and reported (Table 4.8) with standard deviation together with isotopic abundances in natural Nd and commercial spike $^{150}$Nd. The concentration in the spike solution is thought to be known with a 1σ relative uncertainty of 1 percent and weighing errors are negligible. Tests have shown that intensity variations are correlated with a correlation coefficient of 0.92. Calculate the number of moles of natural Nd present in the run and its standard deviation.

Table 4.8. Atomic abundance of isotopes of mass m in natural ($a_m^{nat}$) and spike ($a_m^{sp}$) Nd. $I_m$ and $s(I_m)$ are the measured intensities (in volts) at mass m and their standard deviations. Ion currents are converted into voltage through a high-value resistor.

m      a_m^nat     a_m^sp      I_m (V)     s(I_m) (V)
144    0.237881    0.005015    1.190409    0.000354
146    0.171726    0.004563    0.857826    0.000212
150    0.056239    0.977890    0.473914    0.000149

The number of moles $N_m$ of mass m is the sum of moles of the contributing species (natural Nd and spike) weighted by the isotopic abundances $a_m^{nat}$ and $a_m^{sp}$ at mass m

$$N_m = a_m^{nat} N_{Nd}^{nat} + a_m^{sp} N_{Nd}^{sp} \qquad (4.3.23)$$

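Thermal ionization has a mass-dependent efficiency and the intensity $I_m$ measured at mass m by the DVM relates to the actual number of moles $N_m$ through the unknown function f(m)

$$N_m = I_m f(m) \qquad (4.3.24)$$

As discussed in Section 3.1, f(m) can be linearly developed in the vicinity of an arbitrary mass value (e.g., 144) and we write

$$f(m) = f_{144}\,[1 - (m - 144)\,\delta] \qquad (4.3.25)$$

where δ is the mass discrimination factor or mass bias. We can therefore rearrange the equations for the three masses as a linear system

$$a_m^{nat}\,\frac{N_{Nd}^{nat}}{f_{144}} + a_m^{sp}\,\frac{N_{Nd}^{sp}}{f_{144}} + (m - 144)\,I_m\,\delta = I_m, \qquad m = 144, 146, 150 \qquad (4.3.26)$$

or

$$\begin{bmatrix} a_{144}^{nat} & a_{144}^{sp} & 0 \\ a_{146}^{nat} & a_{146}^{sp} & 2 I_{146} \\ a_{150}^{nat} & a_{150}^{sp} & 6 I_{150} \end{bmatrix} \begin{bmatrix} N_{Nd}^{nat}/f_{144} \\ N_{Nd}^{sp}/f_{144} \\ \delta \end{bmatrix} = \begin{bmatrix} I_{144} \\ I_{146} \\ I_{150} \end{bmatrix}$$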

which can be solved for the three unknowns $N_{Nd}^{nat}/f_{144}$, $N_{Nd}^{sp}/f_{144}$, and δ. Since $N_{Nd}^{sp}$ is known, the solution of this system of equations will give $f_{144}$ and $N_{Nd}^{nat}$. In order to compute error propagation, we must evaluate the partial derivatives with respect to both the dependent and independent variables. This will be more clearly seen by differentiating equation (4.3.24)

$$\mathrm{d}N_m = f(m)\,\mathrm{d}I_m + I_m\,\mathrm{d}f(m) = a_m^{nat}\,\mathrm{d}N_{Nd}^{nat} + a_m^{sp}\,\mathrm{d}N_{Nd}^{sp}$$

while the differential of f(m) through equation (4.3.25) reads

$$\mathrm{d}f(m) = \mathrm{d}f_{144}\,[1 - (m - 144)\,\delta] - f_{144}\,(m - 144)\,\mathrm{d}\delta$$

The independent variables fixed by the experiment are $I_{144}$, $I_{146}$, $I_{150}$ and $N_{Nd}^{sp}$ (n = 4). The dependent variables calculated from the three equations (4.3.26) are $N_{Nd}^{nat}$, δ, and $f_{144}$ (m = 3). We assemble the terms accordingly and write

$$a_m^{nat}\,\mathrm{d}N_{Nd}^{nat} + I_m f_{144}\,(m - 144)\,\mathrm{d}\delta - I_m\,[1 - (m - 144)\,\delta]\,\mathrm{d}f_{144} = f(m)\,\mathrm{d}I_m - a_m^{sp}\,\mathrm{d}N_{Nd}^{sp}$$

Inserting numerical values, the system

$$\begin{bmatrix} 0.237881 & 0.005015 & 0.000000 \\ 0.171726 & 0.004563 & 1.715652 \\ 0.056239 & 0.977890 & 2.843486 \end{bmatrix} \begin{bmatrix} N_{Nd}^{nat}/f_{144} \\ N_{Nd}^{sp}/f_{144} \\ \delta \end{bmatrix} = \begin{bmatrix} 1.190409 \\ 0.857826 \\ 0.473914 \end{bmatrix}$$

is solved as

$$\begin{bmatrix} N_{Nd}^{nat}/f_{144} \\ N_{Nd}^{sp}/f_{144} \\ \delta \end{bmatrix} = \begin{bmatrix} 5.00 \\ 0.200 \\ -0.00100 \end{bmatrix}$$

i.e., $f_{144}$ = 0.500 nmol V$^{-1}$, $N_{Nd}^{nat}$ = 2.500 nmol, and a mass bias δ of 1.00 permil. The reader will check that changing the reference mass, say to 148, does not change the results. We now define and compute matrix A as

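$$A = \begin{bmatrix} f(144) & 0 & 0 & -a_{144}^{sp} \\ 0 & f(146) & 0 & -a_{146}^{sp} \\ 0 & 0 & f(150) & -a_{150}^{sp} \end{bmatrix} = \begin{bmatrix} 0.5000 & 0.0000 & 0.0000 & -0.005015 \\ 0.0000 & 0.5010 & 0.0000 & -0.004563 \\ 0.0000 & 0.0000 & 0.5030 & -0.97789 \end{bmatrix}$$

and matrix B as

$$B = \begin{bmatrix} a_{144}^{nat} & 0 & -I_{144} \\ a_{146}^{nat} & 2 f_{144} I_{146} & -I_{146}\,f(146)/f_{144} \\ a_{150}^{nat} & 6 f_{144} I_{150} & -I_{150}\,f(150)/f_{144} \end{bmatrix} = \begin{bmatrix} 0.2379 & 0.0000 & -1.1904 \\ 0.1717 & 0.8579 & -0.8595 \\ 0.0562 & 1.4218 & -0.4768 \end{bmatrix}$$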

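which gives the more compact relationship equivalent to equation (4.3.19)

$$B \begin{bmatrix} \mathrm{d}N_{Nd}^{nat} \\ \mathrm{d}\delta \\ \mathrm{d}f_{144} \end{bmatrix} = A \begin{bmatrix} \mathrm{d}I_{144} \\ \mathrm{d}I_{146} \\ \mathrm{d}I_{150} \\ \mathrm{d}N_{Nd}^{sp} \end{bmatrix}$$

After calculation, we get

$$B^{-1}A = \begin{bmatrix} -10.2164 & 21.3087 & -12.9082 & 25.0017 \\ -0.4213 & 0.5850 & -0.0006 & 0.0000 \\ -2.4616 & 4.2581 & -2.5795 & 5.0003 \end{bmatrix}$$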

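The covariance matrix $S_x$ of the independent variables is built from the correlation matrix and the standard deviation of each variable according to equation (4.2.18). Note that the uncertainties on spike addition (fourth variable) are not correlated to those of the intensity measurements. $S_x$ is computed as

$$S_x = \begin{bmatrix} 0.000354 & 0 & 0 & 0 \\ 0 & 0.000212 & 0 & 0 \\ 0 & 0 & 0.000149 & 0 \\ 0 & 0 & 0 & 0.001 \end{bmatrix} \begin{bmatrix} 1 & 0.92 & 0.92 & 0 \\ 0.92 & 1 & 0.92 & 0 \\ 0.92 & 0.92 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 0.000354 & 0 & 0 & 0 \\ 0 & 0.000212 & 0 & 0 \\ 0 & 0 & 0.000149 & 0 \\ 0 & 0 & 0 & 0.001 \end{bmatrix}$$

or

$$S_x = \begin{bmatrix} 1.25 \times 10^{-7} & 6.90 \times 10^{-8} & 4.85 \times 10^{-8} & 0 \\ 6.90 \times 10^{-8} & 4.49 \times 10^{-8} & 2.91 \times 10^{-8} & 0 \\ 4.85 \times 10^{-8} & 2.91 \times 10^{-8} & 2.22 \times 10^{-8} & 0 \\ 0 & 0 & 0 & 1.00 \times 10^{-6} \end{bmatrix}$$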

The covariance matrix $S_y$ of the dependent variables is obtained through equation (4.3.22)

$$S_y = \begin{bmatrix} (0.025)^2 & 0.000000 & 0.000126 \\ 0.000000 & (6 \times 10^{-5})^2 & 0.000000 \\ 0.000126 & 0.000000 & (0.005)^2 \end{bmatrix}$$

The solution with standard deviations quoted in parentheses therefore reads

$N_{Nd}^{nat}$ = 2.500 (0.025) nmol
δ = -0.00100 (0.00006) per mass unit
$f_{144}$ = 0.500 (0.005) nmol V$^{-1}$

A related example of linearized error propagation during the isotope dilution measurement of lead in rock samples using the double-spike technique is given by Hamelin et al. (1985). ⇐
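The isotope dilution calculation lends itself to a short numerical check. The Python sketch below, which assumes NumPy, solves the linear system (4.3.26) with the data of Table 4.8, builds the matrices A and B from the differentiated equations, and propagates the covariance through equation (4.3.22). Variable names such as `q_nat`, `J` and `S_y` are illustrative choices, not notation from the text.

```python
import numpy as np

# Data of Table 4.8 (masses 144, 146, 150)
mass = np.array([144.0, 146.0, 150.0])
a_nat = np.array([0.237881, 0.171726, 0.056239])
a_sp = np.array([0.005015, 0.004563, 0.977890])
I = np.array([1.190409, 0.857826, 0.473914])      # intensities (V)
s_I = np.array([0.000354, 0.000212, 0.000149])    # their standard deviations (V)
N_sp, s_N_sp, r = 0.1, 0.001, 0.92                # spike (nmol), its s.d., correlation

# Linear system (4.3.26) for N_nat/f144, N_sp/f144 and delta
M = np.column_stack([a_nat, a_sp, (mass - 144.0) * I])
q_nat, q_sp, delta = np.linalg.solve(M, I)
f144 = N_sp / q_sp
N_nat = q_nat * f144
f_m = f144 * (1.0 - (mass - 144.0) * delta)       # equation (4.3.25)

# B dY = A dX with Y = (N_nat, delta, f144) and X = (I144, I146, I150, N_sp)
A = np.column_stack([np.diag(f_m), -a_sp])
B = np.column_stack([a_nat, I * f144 * (mass - 144.0), -I * f_m / f144])

# Covariance of X from standard deviations and the correlation matrix
R = np.eye(4)
R[:3, :3] = r + (1.0 - r) * np.eye(3)             # intensities mutually correlated
s_x = np.append(s_I, s_N_sp)
S_x = np.outer(s_x, s_x) * R

J = np.linalg.solve(B, A)                         # B^{-1} A without explicit inverse
S_y = J @ S_x @ J.T                               # equation (4.3.22)
s_N_nat, s_delta, s_f144 = np.sqrt(np.diag(S_y))

print(f"N_nat = {N_nat:.3f} ({s_N_nat:.3f}) nmol")
print(f"delta = {delta:.5f} ({s_delta:.5f}) per mass unit")
print(f"f144  = {f144:.3f} ({s_f144:.3f}) nmol/V")
```

Using `np.linalg.solve(B, A)` rather than forming the inverse of B is a numerical convenience of this sketch; the result is the same matrix $B^{-1}A$.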

4.3.4 Monte-Carlo simulations

In many cases, the functional relationship $\varphi$ is too complicated for the distribution function of the dependent variable(s) to be analytically calculated, yet we do not want to linearize error propagation because we feel such a simplification would result in far too inaccurate results. This is typically the case when the dependent variable is calculated by numerical integration of a differential equation. With the advent of fast desktop computers, the Monte-Carlo error propagation technique has become an easy way of circumventing this difficulty. A large number of random deviates on X (or on the vector $\mathbf{X}$ if we are dealing with a vector of random variables) are generated by software with the appropriate probability density, then $\varphi(X)$ or $\varphi(\mathbf{X})$ are recalculated with the new value(s) and their statistics evaluated directly by the computer. Many deviates (> 200) are usually needed if we do not want the statistics to be biased by the 'computer sampling', e.g., if a Student-t distribution is to be accurately approximated by a normal distribution.

☞ A test of mixing is to be made in a $^{87}$Sr/$^{86}$Sr vs $1/C_{Sr}$ plot. In order to estimate statistically the quality of the alignment, the standard deviation of $1/C_{Sr}$ is needed for each data point. Assuming the mean value of $C_{Sr}$ for a sample is 80 ppm and its standard deviation 20 ppm, estimate through a Monte-Carlo method the mean of the inverse value $1/C_{Sr}$ and its standard deviation.

The procedure consists in producing 500 normal deviates $u_i$, i.e., random numbers normally distributed with zero mean and unit variance. We then compute 500 random values $x_i$ of $C_{Sr}$ normally distributed with mean μ = 80 ppm and variance σ² = 20² ppm² using

$$x_i = \mu + \sigma u_i$$

and then compute the 500 inverse values $y_i = 1/x_i$. In a real experiment with MatLab, the mean $C_{Sr}$ was 79.8 ppm and the standard deviation 20.4 ppm. The computed mean $1/C_{Sr}$ was 0.0136 ppm$^{-1}$ and the standard deviation 0.0045 ppm$^{-1}$. These estimates are significantly biased relative to the linear propagation theory which would predict 0.0125 and 0.0031, respectively. This example shows that linear propagation should be applied with utmost care when a variable depends on another through a strongly non-linear relationship. ⇐
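The experiment is easy to repeat. The Python sketch below assumes NumPy and takes μ = 80 ppm and σ = 20 ppm, the values for which linear propagation gives 1/80 = 0.0125 ppm$^{-1}$ and 20/80² = 0.0031 ppm$^{-1}$ as quoted above. The seed and the variable names are arbitrary choices, so the Monte-Carlo figures will differ in detail from those of the original MatLab run.

```python
import numpy as np

rng = np.random.default_rng(seed=0)      # arbitrary seed, for reproducibility only
mu, sigma, n_draws = 80.0, 20.0, 500     # ppm, ppm, number of deviates

u = rng.standard_normal(n_draws)         # zero mean, unit variance
x = mu + sigma * u                       # simulated C_Sr values
y = 1.0 / x                              # inverse concentrations

mc_mean, mc_sd = y.mean(), y.std(ddof=1)

# First-order (linear) propagation for y = 1/x
lin_mean = 1.0 / mu
lin_sd = sigma / mu**2

print(f"Monte-Carlo: mean = {mc_mean:.4f}, s = {mc_sd:.4f} (1/ppm)")
print(f"Linear:      mean = {lin_mean:.4f}, s = {lin_sd:.4f} (1/ppm)")
```

A normal distribution with these parameters occasionally produces concentrations close to zero, whose inverses dominate the sample mean and standard deviation. This is one way in which the non-linearity shows up, and it is why the Monte-Carlo statistics of 1/x are sensitive to the particular set of deviates.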

☞ We want to assess the extent to which the La/Yb ratio of a basaltic liquid is changed by the fractionation of a clinopyroxene-garnet mineral assemblage. The value of the chondrite-normalized La/Yb ratio of the primary melt is assumed to be 5.0. Clinopyroxene (cpx) and garnet (ga) are the only phases at the liquidus. The mean mineral-liquid partition coefficients are $K_{cpx}^{La}$ = 0.01, $K_{cpx}^{Yb}$ = 0.4, $K_{ga}^{La}$ = 0.005, $K_{ga}^{Yb}$ = 4. They are distributed as log-normal variables: the standard deviation of each ln K is ln 2 (i.e., we assume that individual partition coefficients are, on average, known to within a factor of two). From experience, we also know that the 'shape' of partition coefficient patterns is fairly stable, so we assume the correlation coefficient between errors on ln K's of the same mineral to be r = 0.95. The fraction (1 - F) crystallized is distributed exponentially with a mean value of 0.2. The cumulate is made, on average, of $x_{cpx}$ = 40 percent cpx, this variable being distributed as a log-normal distribution with a standard deviation of 0.1 (10 percent error). From the equations ruling elemental fractionation during progressive removal described in Section 9.3, the relationship between mean values is

$$\left(\frac{\mathrm{La}}{\mathrm{Yb}}\right)_{liq} = \left(\frac{\mathrm{La}}{\mathrm{Yb}}\right)_0 F^{\,D_{La} - D_{Yb}}$$

where D stands for the bulk solid-liquid partition coefficient and the subscript 0 refers to the undifferentiated magma. We first compute from equation (4.3.22) the covariance matrices $S_{\ln K}$ of ln K for each mineral

$$S_{\ln K} = \begin{bmatrix} \ln 2 & 0 \\ 0 & \ln 2 \end{bmatrix} \begin{bmatrix} 1 & 0.95 \\ 0.95 & 1 \end{bmatrix} \begin{bmatrix} \ln 2 & 0 \\ 0 & \ln 2 \end{bmatrix} = \begin{bmatrix} 0.4805 & 0.4564 \\ 0.4564 & 0.4805 \end{bmatrix}$$

The vectors $[\ln K_j^{La}, \ln K_j^{Yb}]^T$ are normal with mean $[\ln 0.01, \ln 0.4]^T$ for j = cpx and $[\ln 0.005, \ln 4]^T$ for j = ga. Let us describe how one random estimate of La/Yb can be computed. The vector u such that

$$u = S_{\ln K}^{-1/2} \begin{bmatrix} \ln K_{cpx}^{La} - \ln 0.01 \\ \ln K_{cpx}^{Yb} - \ln 0.4 \end{bmatrix}$$

is a vector of two uncorrelated normal deviates with zero means and unit variances. We have, indeed, from equation (4.2.31), the equality

$$S_u = S_{\ln K}^{-1/2}\, S_{\ln K}\, \left(S_{\ln K}^{-1/2}\right)^T = I_2$$

which shows that the identity matrix $I_2$ is the covariance matrix of u. The square-root of the matrix $S_{\ln K}$ is calculated as

$$S_{\ln K}^{1/2} = \begin{bmatrix} 0.5615 & 0.4065 \\ 0.4065 & 0.5615 \end{bmatrix}$$
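The matrix square root and its use for generating correlated deviates can be sketched in a few lines of Python. The sketch assumes NumPy and obtains the symmetric square root from the eigen-decomposition of the covariance matrix, which is one of several possible choices (a Cholesky factor would serve equally well for simulation but would not reproduce the symmetric matrix quoted above). The helper `draw_lnK` is a hypothetical name introduced for illustration. The clinopyroxene deviates are those of run 1 in Table 4.9.

```python
import numpy as np

sd_lnK = np.log(2.0)          # standard deviation of each ln K
r = 0.95                      # correlation between ln K(La) and ln K(Yb)
S_lnK = sd_lnK**2 * np.array([[1.0, r], [r, 1.0]])

# Symmetric square root through the eigen-decomposition S = Q diag(w) Q^T
w, Q = np.linalg.eigh(S_lnK)
S_half = Q @ np.diag(np.sqrt(w)) @ Q.T

def draw_lnK(mean_K, u):
    """Map uncorrelated unit normal deviates u onto correlated ln K values."""
    return np.log(mean_K) + S_half @ u

# Clinopyroxene, normal deviates of run 1 in Table 4.9
u_cpx = np.array([-0.3604, -0.0179])
K_cpx = np.exp(draw_lnK(np.array([0.01, 0.4]), u_cpx))

print(np.round(S_half, 4))
print(np.round(K_cpx, 4))
```

In an actual simulation `u_cpx` would be replaced by fresh draws from a normal random number generator for every trial.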

Using a generator of random numbers, we produce a 2-vector u of uncorrelated normal deviates and compute the random vector

$$\begin{bmatrix} \ln K_{cpx}^{La} \\ \ln K_{cpx}^{Yb} \end{bmatrix} = \begin{bmatrix} \ln 0.01 \\ \ln 0.40 \end{bmatrix} + S_{\ln K}^{1/2}\, u$$

We then compute $K_{cpx}^{La}$ and $K_{cpx}^{Yb}$ and proceed similarly for ga. $x_{cpx}$ is calculated in a similar way: if w is a normal deviate, we get

$$\ln x_{cpx} = \ln 0.40 + (\ln 1.1) \times w \quad \text{or} \quad x_{cpx} = 0.40 \times 1.1^{w}$$

while $x_{ga}$ is calculated as $1 - x_{cpx}$. The La bulk solid-liquid partition coefficient $D_{La}$ is calculated as

$$D_{La} = x_{cpx} K_{cpx}^{La} + x_{ga} K_{ga}^{La}$$

with a similar formula for $D_{Yb}$. In order to compute the fraction crystallized (1 - F), we note that, for v being a deviate uniformly distributed between 0 and 1, -ln v is an exponential deviate with unit mean, and -0.2 ln v is an exponential deviate with mean 0.2. The La/Yb ratio can now be calculated and we proceed identically with a different set of random deviates as many times as needed. Table 4.9 gives three cases (i = 1, 2, 3) of such a calculation with u and w standing for normal deviates and v for uniform deviates. Figure 4.11 plots 100 estimates of mineral-liquid partition coefficients and La/Yb ratios.

Table 4.9. La/Yb fractionation by garnet-clinopyroxene removal from a basaltic melt: examples of three Monte-Carlo runs of error propagation. The $u_i$ and $w_i$ deviates are normal deviates, the deviates $v_i$ are uniform. See text for the description of the computed random variables: K represents mineral-liquid partition coefficients, F the fraction of residual melt, x the fraction of a mineral in the cumulate, D bulk solid-liquid partition coefficients.

i    u_1 (cpx)   u_2 (cpx)   K_cpx^La   K_cpx^Yb   u_1 (ga)   u_2 (ga)   K_ga^La   K_ga^Yb
1    -0.3604     -0.0179     0.0081     0.3420      1.6341     0.2622    0.0139     9.0044
2    -0.8877     -1.0016     0.0040     0.1589     -0.6429     1.0279    0.0053     5.4856
3    -0.3251     -0.3350     0.0073     0.2904      0.6876     1.1546    0.0118    10.115

i    v_i      1-F      F        w_i      x_cpx    x_ga     D_La     D_Yb
1    0.4897   0.1428   0.8572   1.6552   0.4684   0.5316   0.0112   4.9473
2    0.6855   0.0755   0.9245   0.6287   0.4247   0.5753   0.0048   3.2234
3    0.6642   0.0818   0.9182   1.5346   0.4630   0.5370   0.0097   5.5664

Figure 4.11 Monte-Carlo simulation (100 trials) of error propagation for La/Yb fractionation in residual melts by clinopyroxene-garnet removal from a basaltic parent magma (see text for parameter description and distributions used). Top: mineral-liquid partition coefficients for La and Yb. Bottom: variations of the La/Yb ratio as a function of the fraction F of residual melt.

4.4 Principal component analysis

$$S_x = U \Lambda U^T \qquad (4.4.1)$$

where Λ is the n × n diagonal matrix of eigenvalues $\lambda_1, \ldots, \lambda_n$ and U is the n × n matrix of orthogonal eigenvectors $u_j$ (j = 1, ..., n), the jth component of the ith observation vector $x_i$ is the scalar $e_{ji}$ such as

$$e_{ji} = u_j^T (x_i - \bar{x}) \qquad (4.4.2)$$

$e_{ji}$ is therefore the coordinate of the ith measurement $x_i$ along the jth eigenvector $u_j$ of $S_x$. From that definition, we can infer the average component along the jth eigenvector to be zero since

$$\frac{1}{m} \sum_{i=1}^{m} e_{ji} = u_j^T \left( \frac{1}{m} \sum_{i=1}^{m} x_i - \bar{x} \right) = 0 \qquad (4.4.3)$$

The n coordinates associated with the n eigenvectors define the vector $e_i$ of the point vector $x_i$ in the new system of coordinates, which is written formally as

$$e_i = U^T (x_i - \bar{x})$$

with the number of vectors $e_i$ being m. Pre-multiplication of the old coordinates $x_i - \bar{x}$ by $U^T$ corresponds to a rotation that makes the new axes correspond to the eigenvectors of the matrix $S_x$ (see Section 2.2). The n × m matrix X therefore produces an n × m matrix E through

$$E = U^T X \qquad (4.4.4)$$

Applying equation (4.3.4) for variable change, we get the component covariance matrix $S_e$ as

$$S_e = U^T S_x U = U^T U \Lambda U^T U = \Lambda \qquad (4.4.5)$$

Since Λ is diagonal, the covariance of the components along the jth eigenvector is zero and their variance is given by

$$s_{e_j}^2 = \frac{1}{m - 1} \sum_{i=1}^{m} e_{ji}^2 = \lambda_j \qquad (4.4.6)$$

Now comes the very principle of the principal component analysis. A total variance is now defined as the trace of the matrix $S_x$ or, using a property of the trace of a matrix product given in Section 2.2

$$\mathrm{tr}\, S_x = \mathrm{tr}(U \Lambda U^T) = \mathrm{tr}(\Lambda U^T U) = \mathrm{tr}\, \Lambda = \sum_{j=1}^{n} \lambda_j \qquad (4.4.7)$$

and the proportion of that variance explained by the component k is the ratio $p_k$ given by

$$p_k = \frac{\lambda_k}{\sum_{j=1}^{n} \lambda_j} \qquad (4.4.8)$$

Adding variances on different variables at the denominator, e.g. pH and temperature in solutions, does not make much sense and is certainly not invariant upon rescaling. Proportions of explained total variance do not survive a simple change of units! For this reason, PCA is commonly carried out instead on normalized variables ξ such as

$$\xi_i = s^{-1} (x_i - \bar{x}) \qquad (4.4.9)$$

where s is the diagonal matrix of sample standard deviations. The n × m matrix Ξ collects the i = 1, ..., m normalized measurements $\xi_i$. As we have seen in equation (4.2.33), the covariance matrix of standardized variables is the correlation matrix of the non-standardized variables. Therefore, the $\xi_i$ have the correlation matrix R for covariance matrix. The diagonal form of R is

$$R = V \Delta V^T \qquad (4.4.10)$$

where Δ is the diagonal matrix of eigenvalues $\delta_1, \ldots, \delta_n$ and V the matrix of the orthogonal eigenvectors $v_1, \ldots, v_n$ of R. The component $f_{ji}$ of the ith vector $\xi_i$ along the jth eigenvector $v_j$ of R is given by

$$f_{ji} = v_j^T \xi_i \qquad (4.4.11)$$

and the components can be collected in an n × m matrix F obtained as before through

$$F = V^T \Xi \qquad (4.4.12)$$

or, since V is an orthogonal matrix

$$\Xi = V F \qquad (4.4.13)$$

Applying equation (4.3.4) for a change of variable, we get the component covariance matrix $S_f$ as

$$S_f = V^T R V = V^T V \Delta V^T V = \Delta \qquad (4.4.14)$$

As for the principal components of the covariance matrix, the principal components of the correlation matrix have zero covariance. In addition, the variance of a component is simply given by the corresponding eigenvalue, i.e.,

$$s_{f_j}^2 = \frac{1}{m - 1} \sum_{i=1}^{m} f_{ji}^2 = \delta_j \qquad (4.4.15)$$

Since the trace of a matrix is invariant upon rotation, we get the sum of the n eigenvalues $\delta_j$ as the trace of the correlation matrix. The dimensionless total variance is therefore

$$\text{total variance} = \sum_{j=1}^{n} \delta_j = \mathrm{tr}\, \Delta = \mathrm{tr}\, R = n \qquad (4.4.16)$$

The proportion of the total variance allotted to the kth component is

$$p_k = \frac{\delta_k}{\sum_{j=1}^{n} \delta_j} = \frac{\delta_k}{n} \qquad (4.4.17)$$

Relative to any other orthogonal reference frame and for any value of j, the total variance which is unaccounted for by the components associated with eigenvalues smaller than $\delta_j$ is minimal (e.g., Johnson and Wichern, 1982). Reduction of the number of independent variables can therefore be achieved by dropping all the components that do not account for a substantial fraction of the total variance, i.e., those with the smallest eigenvalues. The cutoff value depends on where the significance level is set up for the problem under consideration. Let us assume that the components j = 1, ..., k are taken into consideration, while the components from k + 1 to n are considered as 'noise'. We therefore split the n × n matrix V into its most valuable or significant part, the n × k matrix $V_k$ of the eigenvectors associated with the k largest eigenvalues and the 'noise' part, the n × (n - k) matrix $V_k^{\perp}$ of the eigenvectors corresponding to the rest of the eigenvalues. $V_k$ and $V_k^{\perp}$ are made of orthogonal vectors. Likewise, the matrix F is split into a k × m matrix $F_k$ comprising the k significant eigencomponents and a (n - k) × m matrix $F_k^{\perp}$ of noise. Making this split apparent in the matrix equality (4.4.12), it becomes

$$\begin{bmatrix} F_k \\ F_k^{\perp} \end{bmatrix} = \begin{bmatrix} V_k & V_k^{\perp} \end{bmatrix}^T \Xi \qquad (4.4.18)$$

We are interested in the upper part $F_k$ of F and therefore in

$$F_k = V_k^T\, \Xi \qquad (4.4.19)$$

Conversely, the n × m matrix of reduced data Ξ can be considered as the sum of a significant part $\Xi_k$ and a noise part $\Xi_k^{\perp}$, both with the same dimension as Ξ, so equation (4.4.13) now becomes

$$\Xi = \Xi_k + \Xi_k^{\perp} = V_k F_k + V_k^{\perp} F_k^{\perp} \qquad (4.4.20)$$

The n × m matrix $\Xi_k$ of the significant part in the original space of the reduced data is therefore

$$\Xi_k = V_k F_k = V_k V_k^T\, \Xi \qquad (4.4.21)$$

or for the ith reduced vector $\xi_i$

$$\xi_i^k = V_k V_k^T\, \xi_i \qquad (4.4.22)$$

The noise component of the reduced vector is the complement of $\xi_i^k$ to $\xi_i$. The squared modulus of this noise, which is the reduced squared distance $(d_i^k)^2$ of the ith actual measurement to the point represented by its significant principal components, is

$$(d_i^k)^2 = (\xi_i^k - \xi_i)^T (\xi_i^k - \xi_i) \qquad (4.4.23)$$

This method is extremely useful to detect the points that have a large noise component (outliers) and therefore are exceedingly far from the subspace of the significant principal components. For the raw data, the significant part $x_i^k$ of the vector $x_i$ is

$$x_i^k = \bar{x} + s\, \xi_i^k$$

while its noise part is $x_i - x_i^k$.

At this point we can describe most of the variability in the original data set by a small number of linear combinations of the original variables. In particular, once the components have been ordered with decreasing eigenvalues, graphing the first components pairwise will enable the data to be shown along the directions of maximum variability. It is convenient to draw a reference unit circle in the plane (or a unit reference sphere in a three-dimensional space), which shows the locus of points located at a distance of one standard deviation from the mean. In addition, a unit vector (one standard deviation) along each original axis ξ can be drawn, which gives a quick visual feeling of how a plane defined by two components is oriented in the original reference frame, i.e., tells us 'where' the original data axis is in the component plane. If the unit vector of an original data axis lies in the plane of the two components, its representative point must be on the unit circle. If it is orthogonal to the two-component plane, it must project at the center of the circle. From equation (4.4.13), the components of the unit vectors in the original $\xi_i$ space are simply the rows of the matrix V.

The correlation coefficients between the original data and the components, known as the component loadings, are also of great utility as they show which component carries more information on an original data axis. Multiplying equation (4.4.13) by $F^T$, we get

$$\Xi F^T = V F F^T \qquad (4.4.24)$$

Both Ξ, by definition, and F through equation (4.4.12), are centered, i.e., their expectation is a null matrix. Therefore the sample covariance matrix between the reduced data and the components is

$$\overline{\Xi F^T} - \bar{\Xi}\, \bar{F}^T = V \left( \overline{F F^T} - \bar{F}\, \bar{F}^T \right) = V S_f = V \Delta \qquad (4.4.25)$$

where the bar over the matrices refers to the sample mean. The $\xi_i$ have unit variance while Δ is the covariance matrix of the components. From equation (4.2.20), the matrix of sample correlation coefficients is therefore

$$V \Delta\, \Delta^{-1/2} = V \Delta^{1/2} \qquad (4.4.26)$$

The ijth term of this matrix represents the correlation coefficient (loading) between the ith variable and the jth principal component.
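The chain of operations described in this section can be illustrated on synthetic data. The following Python sketch assumes NumPy; the data set (a random mixture of four variables observed 200 times), the seed and all variable names are hypothetical and serve only to show equations (4.4.9) to (4.4.26) at work, with the n × m layout (variables in rows, observations in columns) used in the text.

```python
import numpy as np

rng = np.random.default_rng(seed=1)          # arbitrary seed
n, m = 4, 200                                # variables, observations
mixing = rng.normal(size=(n, n))             # hypothetical mixing matrix
X = mixing @ rng.normal(size=(n, m)) + 10.0  # n x m matrix of raw data

x_bar = X.mean(axis=1, keepdims=True)
s = X.std(axis=1, ddof=1, keepdims=True)
Xi = (X - x_bar) / s                         # normalized data, equation (4.4.9)

R = Xi @ Xi.T / (m - 1)                      # correlation matrix
delta, V = np.linalg.eigh(R)                 # R = V diag(delta) V^T
order = np.argsort(delta)[::-1]              # decreasing eigenvalues
delta, V = delta[order], V[:, order]

F = V.T @ Xi                                 # components, equation (4.4.12)
p = delta / n                                # explained proportions, equation (4.4.17)
loadings = V * np.sqrt(delta)                # V Delta^(1/2), equation (4.4.26)

k = 2                                        # number of components kept
Xi_k = V[:, :k] @ F[:k, :]                   # significant part, equation (4.4.21)
d2 = np.sum((Xi_k - Xi) ** 2, axis=0)        # squared distances, equation (4.4.23)

print(np.round(delta, 4), np.round(100 * p, 2))
```

Points with the largest entries of `d2` are the candidates for outliers in the sense discussed above. Note that the sign of each eigenvector returned by an eigen-solver is arbitrary, so loadings may come out with the opposite sign for a whole component.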

☞ Limestone samples from Coumiac in the Southern French Massif Central have been measured for major elements and carbon species by Grandjean (1989). The data are reported in Table 4.10. Total iron is counted as Fe2O3. Use PCA to suggest the possible mineral components that contribute to the rocks.

Table 4.10. Major-element composition (weight percent) of the Coumiac limestones, South French Massif Central (Grandjean, 1989).

Layer #     SiO2    Al2O3   Fe2O3   CaO     CO2     Org. C
3           1.23    0.48     1.12   53.30   42.34   0.03
4           0.29    0.13     0.39   54.36   43.55   0.02
8           0.28    0.07     0.31   54.50   43.71   0.02
9           2.69    1.68    27.14   36.17   30.84   0.17
11          1.56    0.82     0.44   52.88   41.64   0.04
12          1.52    0.76     3.23   51.74   41.25   0.03
13          2.00    0.65     0.60   53.90   42.03   0.02
14          5.72    2.45     6.90   46.05   37.19   0.05
19          0.72    0.13     0.23   55.24   43.07   0.05
20          2.21    0.30     0.41   54.09   42.18   0.09
21          1.50    0.15     0.38   54.42   42.87   0.09
22          1.75    0.33     0.45   53.91   42.73   0.12
23          0.82    0.35     0.54   54.42   42.77   0.12
24          1.70    0.69     0.37   53.58   42.23   0.04
26          1.79    0.61     2.11   52.86   41.69   0.02
27          3.20    1.57     0.75   51.76   40.62   0.06
28          2.80    1.21     1.65   52.04   40.79   0.04
Mean        1.87    0.73     2.77   52.07   41.26   0.06
Std dev.    1.29    0.66     6.50    4.59    3.07   0.04

The mean and standard deviation are first computed and shown in Table 4.10. Then the correlation matrix R is evaluated by computer as (rows and columns in the order SiO2, Al2O3, Fe2O3, CaO, CO2, Org. C)

$$R = \begin{bmatrix} 1.0000 & 0.9107 & 0.3530 & -0.5461 & -0.6008 & 0.1525 \\ 0.9107 & 1.0000 & 0.5481 & -0.7273 & -0.7637 & 0.1537 \\ 0.3530 & 0.5481 & 1.0000 & -0.9705 & -0.9537 & 0.5949 \\ -0.5461 & -0.7273 & -0.9705 & 1.0000 & 0.9921 & -0.5296 \\ -0.6008 & -0.7637 & -0.9537 & 0.9921 & 1.0000 & -0.5367 \\ 0.1525 & 0.1537 & 0.5949 & -0.5296 & -0.5367 & 1.0000 \end{bmatrix}$$

This matrix shows strong correlations between SiO2 and Al2O3 on one hand, between CaO and CO2 on the other hand, hinting thereby at a detrital and calcite component, respectively. Fe2O3 is strongly anticorrelated with both CaO and CO2. The eigenvalues $\delta_j$ of the matrix R and their percentage $p_j$ of contribution to the total variance are given in Table 4.11, whereas the eigenvector matrix V is given by

$$V = \begin{bmatrix} 0.3432 & 0.5651 & -0.4362 & 0.6054 & -0.0602 & 0.0503 \\ 0.4034 & 0.4808 & -0.0592 & -0.7421 & -0.0329 & 0.2253 \\ 0.4391 & -0.3113 & 0.3565 & 0.2316 & -0.0678 & 0.7246 \\ -0.4722 & 0.1192 & -0.2731 & -0.0223 & 0.6312 & 0.5379 \\ -0.4788 & 0.0785 & -0.1953 & -0.0493 & -0.7692 & 0.3637 \\ 0.2730 & -0.5764 & -0.7526 & -0.1619 & -0.0241 & 0.0067 \end{bmatrix}$$

Table 4.11. Eigenvalues of the correlation matrix of major-element concentrations in Coumiac limestones and percentage of variance explained by each component. Note that the eigenvalues $\delta_j$ sum up to 6.

j                1        2        3        4        5        6
δ_j              4.2329   1.2262   0.4872   0.0470   0.0054   0.0013
p_j (percent)    70.55    20.44    8.12     0.78     0.09     0.02

91 percent of the variance is given by the first two components, so we can restrict our analysis to a plot of component 2 vs component 1 (Figure 4.12). Three groups of variables are identified: the calcitic end-member (CaO and CO2), the detrital end-member (SiO2 and Al2O3), and the organic end-member (organic C and Fe2O3) which suggests that diagenetic pyrite precipitation occurred whenever strong input of organic material made the environment reducing. Why two components if the end-members are three? The closure condition is an additional relationship, so the number of degrees of freedom for mixtures of these three end-members is only two. ⇐

Figure 4.12 Principal component analysis of the major elements in Coumiac limestones. 91 percent of the variance is explained by the first two components. The data can be explained by the combination of three chemical end-members: calcitic (CaO and CO2), detrital (SiO2 and Al2O3), and organic (organic C and Fe2O3). Because of the closure condition these three end-members translate into only two significant components.

☞ The isotopic composition of radiogenic elements in 40 groups of oceanic islands has been compiled by Vincent Salters from the Lamont-Doherty Geological Observatory and is reported in Table 4.12. Find the minimum number of variables to explain at least 90 percent of the variance. Find the deviating islands. Plot the first three components pairwise.

As a reference, the mean and standard deviation of each variable are listed in the appropriate column of Table 4.12. Then, the correlation matrix R is calculated as (rows and columns in the order $^{87}$Sr/$^{86}$Sr, $^{143}$Nd/$^{144}$Nd, $^{206}$Pb/$^{204}$Pb, $^{207}$Pb/$^{204}$Pb, $^{208}$Pb/$^{204}$Pb)

$$R = \begin{bmatrix} 1 & -0.812 & -0.470 & -0.099 & -0.117 \\ -0.812 & 1 & 0.344 & -0.005 & -0.041 \\ -0.470 & 0.344 & 1 & 0.855 & 0.880 \\ -0.099 & -0.005 & 0.855 & 1 & 0.894 \\ -0.117 & -0.041 & 0.880 & 0.894 & 1 \end{bmatrix}$$
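For readers who wish to reproduce the decomposition of this correlation matrix, a minimal Python sketch is given here. It assumes NumPy and uses `numpy.linalg.eigh`, a solver for symmetric matrices; the variable names are illustrative. Since R is entered with three decimals only, the eigenvalues and loadings agree with those tabulated for this example to within rounding, and the sign of each eigenvector is arbitrary.

```python
import numpy as np

labels = ["87Sr/86Sr", "143Nd/144Nd", "206Pb/204Pb", "207Pb/204Pb", "208Pb/204Pb"]
R = np.array([
    [ 1.000, -0.812, -0.470, -0.099, -0.117],
    [-0.812,  1.000,  0.344, -0.005, -0.041],
    [-0.470,  0.344,  1.000,  0.855,  0.880],
    [-0.099, -0.005,  0.855,  1.000,  0.894],
    [-0.117, -0.041,  0.880,  0.894,  1.000],
])

delta, V = np.linalg.eigh(R)             # eigenvalues returned in ascending order
order = np.argsort(delta)[::-1]          # rearrange in decreasing order
delta, V = delta[order], V[:, order]

percent = 100.0 * delta / delta.sum()    # equation (4.4.17)
loadings = V * np.sqrt(delta)            # V Delta^(1/2), equation (4.4.26)

for j in range(len(delta)):
    print(f"component {j + 1}: eigenvalue = {delta[j]:.4f}, {percent[j]:.2f} percent")
for name, row in zip(labels, loadings):
    print(f"{name:12s}", np.round(row[:2], 3))
```

The second loop prints the loadings of each isotopic ratio on the first two components only, which is enough to see which variables control each axis.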

Table 4.12. Mean isotopic data for oceanic islands (courtesy Vincent Salters). The $(d_i^k)^2$ of each observation is its squared distance to the first two-component plane ('the Mantle Plane').

Island group      87Sr/86Sr   143Nd/144Nd   206Pb/204Pb   207Pb/204Pb   208Pb/204Pb   (d_i^k)^2
St Paul-Amst.     0.70373     0.51288       18.879        15.585        39.131        0.0298
Ascension         0.70283     0.51304       19.421        15.612        38.916        0.9158
Australs          0.70311     0.51288       20.533        15.733        39.876        2.3561
Cook-Australs     0.70367     0.51282       20.001        15.673        39.621        0.8359
N Cook            0.70455     0.51278       19.386        15.608        39.342        0.2172
S Cook            0.70370     0.51277       19.743        15.637        39.482        0.3823
Azores            0.70457     0.51281       19.707        15.703        39.810        1.1064
Balleny           0.70294     0.51297       19.752        15.600        39.359        0.7589
Bouvet            0.70369     0.51284       19.445        15.652        39.065        0.1384
Cameroon Line     0.70314     0.51290       20.020        15.672        39.758        1.2111
Cape Verde        0.70341     0.51284       19.254        15.580        39.026        0.0685
Carolines         0.70329     0.51297       18.462        15.489        38.289        1.3745
Christmas         0.70440     0.51269       18.639        15.605        38.742        0.3734
Cocos             0.70303     0.51299       19.234        15.589        38.973        0.5515
Comores           0.70341     0.51282       19.615        15.609        39.479        0.2638
Crozet            0.70400     0.51285       18.929        15.587        39.037        0.0101
Easter            0.70322     0.51290       19.865        15.640        39.670        0.7976
Fernando          0.70411     0.51281       19.409        15.634        39.331        0.1736
Galapagos         0.70312     0.51299       19.076        15.564        38.692        0.6122
Gough             0.70510     0.51254       18.445        15.624        38.99         1.5094
Hawaii            0.70376     0.51293       18.188        15.462        37.899        2.0918
Iceland           0.70311     0.51304       18.453        15.484        38.106        1.9802
Juan Fernandez    0.70366     0.51284       19.121        15.604        38.961        0.0175
Kerguelen         0.70506     0.51266       18.259        15.555        38.646        1.2409
Louisville        0.70358     0.51292       19.271        15.610        38.991        0.1321
Marion            0.70330     0.51293       18.562        15.540        38.367        0.7788
Marquesas         0.70424     0.51280       19.362        15.604        39.258        0.0874
NE Seamounts      0.70337     0.51285       20.155        15.629        39.907        0.9954
Nunivak           0.70290     0.51311       18.588        15.471        38.088        2.5327
Reunion           0.70414     0.51285       18.855        15.580        38.919        0.0505
Rio Grande        0.70478     0.51255       17.619        15.490        38.054        3.0063
Samoa             0.70553     0.51275       18.914        15.607        39.071        0.7729
San Felix         0.70409     0.51261       19.079        15.581        39.029        0.2985
Shimada           0.70484     0.51264       19.046        15.681        39.354        0.9822
Society           0.70478     0.51280       19.128        15.592        38.915        0.1748
St Helena         0.70289     0.51287       20.678        15.763        39.985        3.1309
Trinidade         0.70380     0.51271       19.116        15.601        39.110        0.0547
Tristan           0.70500     0.51255       18.476        15.518        38.867        1.5217
Tuamotus          0.70408     0.51271       18.132        15.490        38.879        0.7974
Walvis            0.70469     0.51254       17.914        15.492        38.472        2.1604
Mean              0.70387     0.51282       19.118        15.594        39.037
Std dev. s        0.00073     0.00014       0.690         0.070         0.528

Table 4.13. Eigenvalues and percentage of explained variance for the oceanic island isotope data of Table 4.12.

j                1        2        3        4        5
δ_j              2.9155   1.7631   0.1871   0.101    0.0333
p_j (percent)    58.31    35.26    3.74     2.02     0.67

The five eigenvalues of R were calculated by computer as the $\delta_j$ listed and rearranged in a decreasing order in Table 4.13 together with the fraction $p_j$ in percent of the total variance explained by each component. The matrix V of the five eigenvectors, in the order of the eigenvalues, is

$$V = \begin{bmatrix} -0.2914 & 0.6122 & 0.6682 & 0.2672 & 0.1492 \\ 0.2150 & -0.6641 & 0.6659 & 0.1918 & -0.1806 \\ 0.5783 & 0.0132 & -0.0127 & 0.1888 & 0.7935 \\ 0.5149 & 0.2991 & 0.2851 & -0.7230 & -0.2036 \\ 0.5190 & 0.3074 & -0.1693 & 0.5775 & -0.5235 \end{bmatrix}$$

More than 93.5 percent of the variance is explained by the first two components, which tells us that two degrees of freedom describe most of the natural isotopic variation with the five chronometers. This observation has led to the concept of the 'Mantle Plane' of Zindler et al. (1982), since a plane is defined by only two independent variables, and has been extensively discussed by Allegre et al. (1987). Figure 4.13 shows that the first component is dominated by lead isotopes which plot next to the unit circle on the right, while the spread along the second component is dominated by the Sr-Nd anticorrelation. From component 3 to 5, the spread is small. The loading matrix V\1/2 given by equation (4.4.26) can be found in Table 4.14. Lead isotopes have strong correlation coefficients on the first component. They are decoupled from Sr and Nd isotopes which strongly correlate and anticorrelate, respectively, with the second component. On a global scale, Pb isotopic variations in oceanic islands seem to be decoupled from Sr and Nd isotopic variations. Let us decide that the cutoff value is k = 2, so the matrix Vk is made of the first two columns of V. As expected, the remaining part of the data, i.e., the components three tofive,which has been formally ascribed to noise, is very small. This is apparent in the last column of Table 4.12, where the (df)2 values have been listed. What Zindler et al. (1982) called the 'distance' of a datum to the Mantle Plane is the square-root of (df)2. Interestingly, many of the deviating points are those for which a HIMU component (Zindler and Hart, 1986) has been recognized (St Helena, Rio Grande, Australs). In order to account for this additional component, it is left to the reader to show that a third component would work adequately. «= What PC A is actually doing in this case in terms of processes is rather inappropriate. There is a consensus for explaining mantle isotopic variability as mixing geochemical

[Figure 4.13: four panels of principal-component plots; plotted variables: 87Sr/86Sr, 143Nd/144Nd, 206Pb/204Pb, 207Pb/204Pb, 208Pb/204Pb]

Figure 4.13 Principal component analysis of the mean isotopic data for oceanic islands (courtesy of Vincent Salters). In the top left corner, the plane of the first two components (the 'Mantle Plane' of Zindler et al., 1982) explains 93 percent of the variance. Component 1 is dominated by lead isotopes, component 2 by Sr and Nd isotopes. Other components are plotted for reference. In the top right corner, the 'Mantle Plane' is viewed sideways along the direction of the second component, so the distance of each point to the plane can be easily seen. In the bottom left corner, it is viewed along the axis of the first component. The bottom right corner shows how little variance is left with components 3 and 4.

entities, inadvertently also called 'mantle components' (e.g., Allegre, 1982; White, 1985; Zindler and Hart, 1986; Hart et al., 1992), which should not be confused with principal components. These mantle components most certainly represent contributions from distinct mantle reservoirs with distinct evolution. However, the process of 'adding-up' mantle components does not produce the linear relationships in ratio-ratio plots that PCA implicitly assumes, but rather generates hyperbolic mixing surfaces (see Chapter 1). The components created by PCA are a convenient simplification of rather questionable significance.


Table 4.14. Correlation coefficients (loadings) between the original reduced data and the components for the oceanic island isotope data of Table 4.12.

                          Component
Ratio             1         2         3         4         5
87Sr/86Sr      -0.4976    0.8130    0.2890    0.0849    0.0273
143Nd/144Nd     0.3670   -0.8818    0.2880    0.0610   -0.0330
206Pb/204Pb     0.9875    0.0175   -0.0055    0.0600    0.1449
207Pb/204Pb     0.8791    0.3972    0.1233   -0.2297   -0.0372
208Pb/204Pb     0.8861    0.4082   -0.0732    0.1835   -0.0956

An additional problem of the PCA is how analytical uncertainties may be taken into account. The normalization step represented by equation (4.4.9) involves estimates of standard deviations. A common choice is the sample standard deviation, although Allegre et al. (1987) consider that a more natural scaling of the variations is the experimental uncertainty on an individual measurement. Both procedures lead to de-dimensionalized data but with an entirely different philosophy, depending on whether emphasis is on the natural or the analytical dispersion. For instance, most of the Ni variations in basalts are due to olivine fractionation, while most of the Mn variations can be ascribed to analytical uncertainties, since the partition coefficient of this element is unity for most femic phases. Normalizing concentrations to the sample standard deviations $s$ results in the $1\sigma$ surface in the data space being represented by a unit reference circle in the component plane, and the analytical uncertainty surface associated with each point is an ellipse. Normalizing concentrations to the sample analytical uncertainty does the opposite: the analytical uncertainty volume associated with each point in the data space becomes a unit circle in the component plane, while the $1\sigma$ surface in the data space is represented by an ellipse.

Inverse methods

Literature abounds with a rich terminology concerning the possible relationships between observations provided by experiment or analysis and parameters, which are the physical quantities needed for a mathematical formulation of a process (the model) to be uniquely determined. A forward problem relates observations to parameters by a relationship such as
$$\text{parameters} = \mathcal{L}^*(\text{observations}) \tag{5.0.1}$$
where $\mathcal{L}^*$ is a matrix, differential, or integral operator and is usually easy to work with. For instance, given some observed (or assumed) mantle source composition, degree of melting, residual mineralogy and fractionation coefficients, calculating the composition of the basaltic melt segregated from the mantle source is a forward problem. Quite commonly, however, we are in situations where such an operator $\mathcal{L}^*$ is not available and, instead, the relationship goes through a known operator $\mathcal{L}$ that relates the parameters to the observations
$$\mathcal{L}(\text{parameters}) = \text{observations}$$
Assuming the operator $\mathcal{L}$ (the model) is known, the inverse problem consists in finding the inverse operator $\mathcal{L}^*$, and therefore the parameters which relate to the observations through equation (5.0.1).

In addition, very few observations are pristine, and basic measurements such as the angular deviation of a needle on a display, the linear expansion of a fluid, or voltages on an electronic device only represent analogs of the observation to be made. These observations are themselves dependent on a model of the measurement process attached to the particular device. For instance, we may assume that the deviation of a needle on a display connected to a resistance is proportional to the number of charged particles received by the resistance. The model of the measurement is usually well constrained and the analyst should be in control of the deterministic part through calibration, working curves, assessment of non-linearity, etc. If the physics of the measurement is correctly understood, the residual deviations from the experimental calibration may be considered as random deviates. Their assessment is an integral part of the measurement protocol and the moments of these random deviations should be known to the analyst and incorporated in the model.

Fitting a parameter-dependent model to a set of observations consists in finding


the set of parameters that is most suitable for bringing the observations and the model as close as possible. Part of this chapter is dedicated to giving the terms 'most suitable' a statistical sense. Given a set of measurements and a set of unknown parameters related to each other by a set of known equations, the least-square method provides the minimum-variance estimate of the parameters. The least-square method is not to be confused with the maximum-likelihood method, which requires the knowledge of probability distributions, although they both result in identical solutions for normally distributed variables.

5.1 Linear estimates

5.1.1 General

Given a matrix $A_{m \times n}$ of known coefficients with $m > n$ and a vector $\bar{y}_m$ of observations or data, an unknown or model vector $x_n$ of parameters is sought which fulfills the condition of the model
$$\bar{y} = Ax \tag{5.1.1}$$
This equation has in general no solution because the data vector $\bar{y}$ should represent a linear combination of the $n$ column vectors $a_1, a_2, \ldots, a_n$ of the matrix $A$, for
$$\bar{y} = x_1 a_1 + x_2 a_2 + \cdots + x_n a_n \tag{5.1.2}$$
$\bar{y}$ therefore is required to lie in the column-space of $A$, a desired property which, at least within a certain precision, is usually not met in practice. We therefore make the assumption that the sample data gathered in vector $\bar{y}$ are only our best estimates of the real (population) values $y$, which justifies the bar on the symbol as representing measured values. This notation contradicts the standard usage, but is consistent with the basic definitions of Chapter 4. Indeed, for an unbiased estimate, we can still write that
$$\mathcal{E}(\bar{y}) = y \tag{5.1.3}$$
The problem is therefore recast as a search for a population vector $y$, of which the measurement $\bar{y}$ is an estimate, and which satisfies the model, i.e.,
$$y = Ax \tag{5.1.4}$$
None of the population parameters $x$ and $y$ can be found since their determination would require the whole range of attainable values to be measured. The least-square criterion provides estimates $\hat{x}$ and $\hat{y}$ of $x$ and $y$, respectively, which also satisfy the model, i.e.,
$$\hat{y} = A\hat{x} \tag{5.1.5}$$
and makes the length of the residual vector $(\bar{y} - \hat{y})$ as small as possible. The least-square solution $\hat{y}$ is simply the orthogonal projection of the data vector $\bar{y}$ onto the column-space of the matrix $A$ (Figure 5.1), which we found in Chapter 2 to be given by
$$\hat{y} = A(A^TA)^{-1}A^T\bar{y} = P\bar{y} \tag{5.1.6}$$

Figure 5.1 The least-square estimate $\hat{y}$ of the solution to equation (5.1.4) is the orthogonal projection of the observation vector $\bar{y}$ onto the column-space $a_1, a_2, \ldots, a_n$ of matrix $A$.

The projector $P$ associated with that projection is $A(A^TA)^{-1}A^T$, with dimension $m \times m$ and rank $n$. Comparing with equation (5.1.5), we obtain the least-square solution as
$$\hat{x} = (A^TA)^{-1}A^T\bar{y} \tag{5.1.7}$$
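Equations (5.1.5) to (5.1.7) can be illustrated with a short NumPy sketch. The matrix `A` and the data vector `y_meas` below are hypothetical numbers chosen only for illustration; they do not come from the text. The sketch computes the estimate, builds the projector explicitly, and prints the properties discussed above (trace equal to the rank $n$, residual orthogonal to the columns of $A$).

```python
import numpy as np

# Hypothetical overdetermined system: m = 4 observations, n = 2 parameters.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y_meas = np.array([0.1, 0.9, 2.1, 2.9])   # measured data (y-bar in the text)

# Least-square solution, equation (5.1.7). lstsq solves the same
# minimization without forming (A^T A)^-1 explicitly.
x_hat, *_ = np.linalg.lstsq(A, y_meas, rcond=None)

# Projector onto the column-space of A and projected data, equation (5.1.6).
P = A @ np.linalg.inv(A.T @ A) @ A.T
y_hat = P @ y_meas

residual = y_meas - y_hat
print("x_hat        =", x_hat)
print("trace(P)     =", np.trace(P))       # equals the rank n of A
print("A^T residual =", A.T @ residual)    # zero within rounding
```

Forming the inverse of $A^TA$ is acceptable for small, well-conditioned problems such as this one; for nearly singular matrices a solver based on orthogonal factorization, such as `np.linalg.lstsq`, is usually the safer choice.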

The least-square solution itself does not depend on the probability distribution of $\bar{y}$: it is simply a minimum-distance estimate. Later in this chapter, it will be shown, however, that its sampling properties are most easily described when the measurements are normally distributed.

Finally, one may ask how a particular datum may influence the results. Various definitions of data importance, leverage, or influence can be used. $P$ measures the importance of each observation, as seen from equation (5.1.6)
$$\hat{y}_i = \sum_{k=1}^{m} p_{ik}\,\bar{y}_k \tag{5.1.8}$$
where $p_{ik}$ is the $ik$th element of the projector $P$. This relationship tells us how much each datum $\bar{y}_k$ participates in the making of the estimate $\hat{y}_i$. Ideally, we would like the matrix $P$ to be as close to a diagonal matrix as possible, which would ensure independent observations. In addition, all $p_{ii}$ ideally should be nearly equal in order for the contribution of each observation to the making of parameters to be equivalent. However, there is probably no better measure of how a particular observation $i$ influences the model than comparing the solution $\hat{x}$ with the solution $\hat{x}(i)$ obtained by leaving out the $i$th datum (e.g., Sen and Srivastava, 1990). The derivation of the change $[\hat{x} - \hat{x}(i)]$ requires some particular results of matrix algebra not included in this book and the reader may refer to these authors for complete derivation of the

Table 5.1. Ion beam intensities $\bar{I}_i$ (mV) at mass $i$, isotopic abundances $a$ of metals and adjusted intensities $\hat{I}_i$. Ion currents are converted into voltage through a high-value resistor.

Mass i   $\bar{I}_i$   $a_i^{Ce}$   $a_i^{Nd}$   $a_i^{Sm}$   $\hat{I}_i$
142        207          0.1113       0.2713       0            207.00
144         62          0            0.2380       0.031         62.29
146         43          0            0.1719       0             42.65
148         26          0            0.0576       0.113         26.10
150         22          0            0.0564       0.074         21.73

following equation
$$\hat{x} - \hat{x}(i) = (A^TA)^{-1}\left[(a^T)_i\right]^T \frac{\bar{y}_i - \hat{y}_i}{1 - p_{ii}} \tag{5.1.9}$$
where $(a^T)_i$ is the $i$th row of the matrix $A$.

Peak-stripping (mass spectrum deconvolution). An ion probe measures the ion current at the masses 142, 144, 146, 148 and 150, which are known to result from the overlapping isotopic signals of Ce, Nd and Sm. The vector $\bar{I}$ (not to be confused with the identity matrix) of measured peak ion currents $\bar{I}_{142}, \bar{I}_{144}, \ldots$ (values in millivolts) is given in Table 5.1. Neglecting instrumental mass fractionation (see Chapter 3), calculate the total elemental signal in millivolts for Ce, Nd, and Sm.

Let $a_{142}^{Ce}$ be the atomic fraction of $^{142}$Ce in Ce and the like for other elements and isotopes. Let us call $I_{Ce}$, $I_{Nd}$ and $I_{Sm}$ the elemental signal, i.e., the total number of millivolts summed over all the isotopes of the same element. Mass balance requires
$$\bar{I}_{142} = a_{142}^{Ce} I_{Ce} + a_{142}^{Nd} I_{Nd} + a_{142}^{Sm} I_{Sm}$$
$$\bar{I}_{144} = a_{144}^{Ce} I_{Ce} + a_{144}^{Nd} I_{Nd} + a_{144}^{Sm} I_{Sm}$$
and so on for each mass. Defining the matrix $A$ with current element $a_{ij}$ we can write
$$\begin{bmatrix} \bar{I}_{142} \\ \bar{I}_{144} \\ \bar{I}_{146} \\ \bar{I}_{148} \\ \bar{I}_{150} \end{bmatrix} = A \begin{bmatrix} I_{Ce} \\ I_{Nd} \\ I_{Sm} \end{bmatrix}$$
Consulting a table of nuclide abundances (Walker et al., 1989) we can build the matrix $A$ out of rows and columns of Table 5.1. Note that abundances do not sum up to


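unity because only a few isotopes were measured. Intermediate results are
$$A^TA = \begin{bmatrix} 0.0124 & 0.0302 & 0.0000 \\ 0.0302 & 0.1663 & 0.0181 \\ 0.0000 & 0.0181 & 0.0192 \end{bmatrix}$$
and
$$(A^TA)^{-1} = \begin{bmatrix} 159.21 & -32.20 & 30.28 \\ -32.20 & 13.208 & -12.421 \\ 30.28 & -12.421 & 63.747 \end{bmatrix}$$
Let us now build the 'stripping' matrix $(A^TA)^{-1}A^T$, which is independent of the mixing proportions
$$(A^TA)^{-1}A^T = \begin{bmatrix} 8.9847 & -6.7242 & -5.5345 & 1.5667 & 0.4245 \\ 0.0000 & 2.7586 & 2.2705 & -0.6427 & -0.1742 \\ 0.0000 & -0.9799 & -2.1351 & 6.4880 & 4.0167 \end{bmatrix}$$
and gives the least-square solution as
$$\hat{x} = \begin{bmatrix} I_{Ce} \\ I_{Nd} \\ I_{Sm} \end{bmatrix} = \begin{bmatrix} 8.9847 & -6.7242 & -5.5345 & 1.5667 & 0.4245 \\ 0.0000 & 2.7586 & 2.2705 & -0.6427 & -0.1742 \\ 0.0000 & -0.9799 & -2.1351 & 6.4880 & 4.0167 \end{bmatrix} \begin{bmatrix} 207 \\ 62 \\ 43 \\ 26 \\ 22 \end{bmatrix} = \begin{bmatrix} 1255 \\ 248 \\ 105 \end{bmatrix}$$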
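The peak-stripping calculation can be reproduced with a few lines of NumPy. This is a minimal sketch using the abundances and intensities of Table 5.1; the variable names are illustrative only. It also evaluates the leave-one-out change of equation (5.1.9) for mass 146 and compares it with a direct refit that omits that mass.

```python
import numpy as np

# Rows: masses 142, 144, 146, 148, 150; columns: Ce, Nd, Sm (Table 5.1).
A = np.array([[0.1113, 0.2713, 0.0],
              [0.0,    0.2380, 0.031],
              [0.0,    0.1719, 0.0],
              [0.0,    0.0576, 0.113],
              [0.0,    0.0564, 0.074]])
I_meas = np.array([207.0, 62.0, 43.0, 26.0, 22.0])   # measured intensities, mV

strip = np.linalg.inv(A.T @ A) @ A.T   # 'stripping' matrix (A^T A)^-1 A^T
x_hat = strip @ I_meas                 # elemental signals I_Ce, I_Nd, I_Sm in mV
P = A @ strip                          # projector
I_adj = P @ I_meas                     # adjusted intensities

# Influence of mass 146 (row index 2), equation (5.1.9) ...
i = 2
change_formula = strip[:, i] * (I_meas[i] - I_adj[i]) / (1.0 - P[i, i])
# ... compared with an explicit refit that leaves this mass out.
x_loo, *_ = np.linalg.lstsq(np.delete(A, i, axis=0),
                            np.delete(I_meas, i), rcond=None)
change_refit = x_hat - x_loo

print("elemental signals (mV):", np.round(x_hat, 1))
print("adjusted intensities  :", np.round(I_adj, 2))
print("trace(P)              :", round(float(np.trace(P)), 4))
print("influence of mass 146 :", np.round(change_formula, 2))
```

Mass 142 cannot be treated in the same way: it is the only observation carrying Ce, so its diagonal term of $P$ equals one and the denominator of equation (5.1.9) vanishes.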

Again, the total elemental signal is in millivolts. The projector $P$ is
$$P = A(A^TA)^{-1}A^T = \begin{bmatrix} 1.0000 & 0 & 0 & 0 & 0 \\ 0 & 0.6262 & 0.4742 & 0.0482 & 0.0831 \\ 0 & 0.4742 & 0.3903 & -0.1105 & -0.0299 \\ 0 & 0.0482 & -0.1105 & 0.6961 & 0.4439 \\ 0 & 0.0831 & -0.0299 & 0.4439 & 0.2874 \end{bmatrix}$$
and the projected vector $\hat{I} = P\bar{I}$ is given in Table 5.1. The diagonal terms $p_{ii}$ sum up to 3 as expected. Examination of these terms shows that mass 142 must be measured as it is the only source of information for Ce. Mass 144, which is the second highest peak for Nd, and mass 148, which is the highest peak for Sm, contain more valuable information on these elements than the smaller


Table 5.2. Influence matrix: each figure represents by how much the variables in the first column change when the observation on the top row is left out.

Mass       142      144      146      148      150
$I_{Ce}$   2875     5.25    -3.16    -0.51     0.16
$I_{Nd}$   0.00    -2.16     1.30     0.21    -0.07
$I_{Sm}$  -0.00     0.77    -1.22    -2.12     1.54

peaks 146 and 150. The same conclusion is arrived at upon calculation of the influence of each observation. The influence vector of the $i$th observation is obtained by multiplying the $i$th column-vector of the matrix $(A^TA)^{-1}A^T$ calculated above by $(\bar{I}_i - \hat{I}_i)$ and dividing by $(1 - p_{ii})$, e.g., for mass 146
$$\frac{[-5.53,\ 2.27,\ -2.14]^T\,[43 - 42.65]}{1 - 0.39}$$
with complete results listed in Table 5.2. For instance, not measuring the mass 146 would decrease Nd intensity by 1.3 mV.

Isotope dilution. A similar technique can be used to achieve deconvolution of mass spectra when isotopic spikes (elements with artificially altered isotope composition) have been added to a sample so as to perform what is known as isotope dilution. When set up in conjunction with Thermal Ionisation Mass Spectrometry (TIMS) or Inductively Coupled Plasma Mass Spectrometry (ICP-MS), isotope dilution is a fairly precise technique for elemental analysis. Mass interferences make the calculation slightly more complicated but the peak-stripping technique is still applicable (Michard and Albarede, 1986). Let us assume that ion currents are measured at masses 140, 142, 143, 145, 146. Each mass 'peak' results from the overlapping isotopic signals of Ce and Nd. As in the previous example, we define the vector $\bar{I}$ of measured ion currents (in millivolts) shown in Table 5.3. The analyst has added 2 µmol of 142-enriched Ce and 1 µmol of 145-enriched Nd Oak Ridge spikes. Determine the amount of natural Ce and Nd present in the run for the mass spectrum of Table 5.3.

Let $a_{142}^{Ce}$ be the atomic fraction of $^{142}$Ce in natural Ce, $b_{142}^{Ce}$ the atomic fraction of $^{142}$Ce in the Ce spike and likewise for the other elements and isotopes. Let us call $I_{Ce}^{nat}$ and $I_{Ce}^{sp}$ the elemental signal in the natural Ce and its spike, respectively, i.e., the total number of millivolts summed over all the isotopes of each element, and likewise for Nd. Mass balance requires
$$\bar{I}_{140} = a_{140}^{Ce} I_{Ce}^{nat} + a_{140}^{Nd} I_{Nd}^{nat} + b_{140}^{Ce} I_{Ce}^{sp} + b_{140}^{Nd} I_{Nd}^{sp}$$
$$\bar{I}_{142} = a_{142}^{Ce} I_{Ce}^{nat} + a_{142}^{Nd} I_{Nd}^{nat} + b_{142}^{Ce} I_{Ce}^{sp} + b_{142}^{Nd} I_{Nd}^{sp}$$


Table 5.3. First column: ion currents $\bar{I}_i$ (mV) measured for a natural-spike mixture of Ce and Nd. The matrix of isotopic abundances $a$, $b$ was taken from the Chart of the Nuclides (Walker et al., 1989) and commercial Oak Ridge data sheets. Last column: adjusted values $\hat{I}_i$ of the ion currents. Ion currents are converted into voltage through a high-value resistor.

                           Natural                   Spike
Mass i   $\bar{I}_i$   $a_i^{Ce}$   $a_i^{Nd}$   $b_i^{Ce}$   $b_i^{Nd}$   $\hat{I}_i$
140        330          0.8843       0            0.0789       0            330.0
142        280          0.1113       0.2713       0.9211       0.0125       280.0
143          6          0            0.1218       0            0.0075         6.24
145         32          0            0.0830       0            0.8967        32.01
146         10          0            0.1719       0            0.0431         9.82

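and likewise for other masses. In matrix form
$$\begin{bmatrix} \bar{I}_{140} \\ \bar{I}_{142} \\ \bar{I}_{143} \\ \bar{I}_{145} \\ \bar{I}_{146} \end{bmatrix} = A \begin{bmatrix} I_{Ce}^{nat} \\ I_{Nd}^{nat} \\ I_{Ce}^{sp} \\ I_{Nd}^{sp} \end{bmatrix}$$
From the data listed in Table 5.3, we calculate
$$A^TA = \begin{bmatrix} 0.7944 & 0.0302 & 0.1723 & 0.0014 \\ 0.0302 & 0.1249 & 0.2499 & 0.0861 \\ 0.1723 & 0.2499 & 0.8547 & 0.0115 \\ 0.0014 & 0.0861 & 0.0115 & 0.8061 \end{bmatrix}$$
and
$$(A^TA)^{-1} = \begin{bmatrix} 1.3328 & 0.6181 & -0.4486 & -0.0619 \\ 0.6181 & 23.3764 & -6.9274 & -2.4000 \\ -0.4486 & -6.9274 & 3.2767 & 0.6942 \\ -0.0619 & -2.4000 & 0.6942 & 1.4871 \end{bmatrix}$$
Let us now build the stripping matrix $(A^TA)^{-1}A^T$, which again is independent of the mixing proportions and will be the same for every mixture that uses the same spike
$$(A^TA)^{-1}A^T = \begin{bmatrix} 1.1432 & -0.0979 & 0.0748 & -0.0042 & 0.1036 \\ 0.0000 & 0.0000 & 2.8292 & -0.2118 & 3.9150 \\ -0.1381 & 1.0975 & -0.8385 & 0.0475 & -1.1609 \\ 0.0000 & 0.0000 & -0.2812 & 1.1343 & -0.3485 \end{bmatrix}$$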


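This leads to the solution (in mV of element)
$$\begin{bmatrix} I_{Ce}^{nat} \\ I_{Nd}^{nat} \\ I_{Ce}^{sp} \\ I_{Nd}^{sp} \end{bmatrix} = \begin{bmatrix} 1.1432 & -0.0979 & 0.0748 & -0.0042 & 0.1036 \\ 0.0000 & 0.0000 & 2.8292 & -0.2118 & 3.9150 \\ -0.1381 & 1.0975 & -0.8385 & 0.0475 & -1.1609 \\ 0.0000 & 0.0000 & -0.2812 & 1.1343 & -0.3485 \end{bmatrix} \begin{bmatrix} 330 \\ 280 \\ 6 \\ 32 \\ 10 \end{bmatrix} = \begin{bmatrix} 351.2 \\ 49.35 \\ 246.6 \\ 31.13 \end{bmatrix}$$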
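The isotope-dilution example follows the same pattern and can be sketched as below, using the abundances and intensities of Table 5.3. The conversion to micromoles assumes, as in the problem statement, that 2 µmol of Ce spike and 1 µmol of Nd spike were added and that natural and spike signals of an element are proportional to their molar amounts; the variable names are illustrative.

```python
import numpy as np

# Rows: masses 140, 142, 143, 145, 146 (Table 5.3).
# Columns: natural Ce, natural Nd, Ce spike, Nd spike.
A = np.array([[0.8843, 0.0,    0.0789, 0.0],
              [0.1113, 0.2713, 0.9211, 0.0125],
              [0.0,    0.1218, 0.0,    0.0075],
              [0.0,    0.0830, 0.0,    0.8967],
              [0.0,    0.1719, 0.0,    0.0431]])
I_meas = np.array([330.0, 280.0, 6.0, 32.0, 10.0])   # measured intensities, mV

# Least-square elemental signals (mV of element).
x_hat, *_ = np.linalg.lstsq(A, I_meas, rcond=None)
ce_nat, nd_nat, ce_sp, nd_sp = x_hat

# Scale the natural/spike signal ratio by the known amount of spike added.
spike_ce_umol, spike_nd_umol = 2.0, 1.0
ce_umol = ce_nat / ce_sp * spike_ce_umol
nd_umol = nd_nat / nd_sp * spike_nd_umol

I_adj = A @ x_hat                                    # adjusted intensities
print("signals (mV):", np.round(x_hat, 2))
print(f"natural Ce: {ce_umol:.2f} umol, natural Nd: {nd_umol:.2f} umol")
print("adjusted intensities:", np.round(I_adj, 2))
```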

The amount of natural element present in the run was (351.2/246.6) × 2 µmol = 2.85 µmol of Ce and (49.35/31.13) × 1 µmol = 1.59 µmol of Nd. One should be careful not to work in mass units because of the widely different molar weights of the natural element and the spike. The $P$ matrix given by
$$P = A(A^TA)^{-1}A^T = \begin{bmatrix} 1.0000 & 0 & 0 & 0 & 0 \\ 0 & 1.0000 & 0 & 0 & 0 \\ 0 & 0 & 0.3425 & -0.0173 & 0.4742 \\ 0 & 0 & -0.0173 & 0.9995 & 0.0125 \\ 0 & 0 & 0.4742 & 0.0125 & 0.6580 \end{bmatrix}$$

and the adjusted values $\hat{I}_i$ (Table 5.3) provide some enlightening information. Masses 140 and 142 must both be measured, since two masses are needed for isotope dilution, in this case for Ce. Their adjusted value is identical to the observed value. Mass 145 is the principal Nd spike and its measurement is therefore compulsory. Should one Nd isotope be dropped, it should be $^{143}$Nd, which is less abundant and hence less informative than $^{146}$Nd.

5.1.2 The least-square straight line and least-square plane

For a straight line, the linear relationship between two data vectors $x_m$ and $y_m$ can be written in three ways
$$y_m = a x_m + b \tag{5.1.10}$$
$$x_m = a y_m + b \tag{5.1.11}$$
$$1 = a x_m + b y_m \tag{5.1.12}$$

Handling these equations is normally done through the least-square method just discussed: on the right-hand side, the unknown vector will be the vector $(a, b)$. The $i$th row of the matrix $A_{m \times 2}$ of the general least-square problem will be made of the $(x_i, 1)$ in the first case, the $(y_i, 1)$ in the second case, and the $(x_i, y_i)$ in the third case. Different results for the slope and the intercept in an $(x, y)$ diagram will be obtained for each case. Variable $y$ in equation (5.1.10) and variable $x$ in equation (5.1.11) are supposed to be known imperfectly. More complicated assumptions are implied by equation (5.1.12). In the jargon of regression techniques, the variable on the left-hand side of any equation is called the dependent variable, that on the right-hand side the independent variable. The least-square solution to equation (5.1.10) seeks adjusted values in the $y$ direction, while for equation (5.1.11), the $x$ direction is assumed. The third equation is known as orthogonal adjustment (Figure 5.2). Many books list the explicit solutions to some of these cases (e.g., Spiegel, 1975), but, with the advent of desktop computers that allow easy implementation of least-square solutions, these cumbersome expressions have become unnecessary. More powerful expressions have been proposed that enhance the visual assessment of error propagation or data influence (Provost, 1990).
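The difference between the first two adjustments can be seen on a small example. The sketch below assumes that the first two forms are $y = ax + b$ and $x = ay + b$, as the row definitions $(x_i, 1)$ and $(y_i, 1)$ imply, and uses hypothetical data that are not from the text. Both lines are fitted with the general least-square machinery and their slopes are compared in the same $(x, y)$ diagram.

```python
import numpy as np

# Hypothetical, imperfectly correlated data.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.3, 1.8, 3.4, 3.7, 5.2, 5.6])
ones = np.ones_like(x)

# y = a x + b: rows (x_i, 1), adjustment in the y direction.
(a_y, b_y), *_ = np.linalg.lstsq(np.column_stack([x, ones]), y, rcond=None)

# x = a y + b: rows (y_i, 1), adjustment in the x direction.
(a_x, b_x), *_ = np.linalg.lstsq(np.column_stack([y, ones]), x, rcond=None)

slope_y_on_x = a_y
slope_x_on_y = 1.0 / a_x      # second line re-expressed as dy/dx

print("slope, adjustment along y:", slope_y_on_x)
print("slope, adjustment along x:", slope_x_on_y)
# Both lines pass through the mean point of the sample.
print("line 1 at mean x:", a_y * x.mean() + b_y, " mean y:", y.mean())
print("line 2 at mean y:", a_x * y.mean() + b_x, " mean x:", x.mean())
```

For positively correlated data the line adjusted along $x$ is always at least as steep as the line adjusted along $y$, with equality only for perfectly collinear points.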

Figure 5.2 Three different ways of adjusting a straight line to a set of observations. Adjustment in the y-direction (top), the x-direction (middle) and orthogonal adjustment (bottom).


It is left as an exercise to the reader to show that the mean point of a sample belongs to each least-square straight line. This fundamental property results in a disturbing feature of samples from loosely correlated variables: the mean point being constant, slopes and intercepts calculated from successive samples of the same population vary significantly, but remain strongly anticorrelated. Least-square straight lines seem to hinge around mean points. It may seem paradoxical, but the more loosely correlated two variables are, the more anticorrelated are the slopes and intercepts derived from least-square (regression) lines! This statistical artifact represents a major risk of misinterpretation.

For a least-square plane, the linear relationship between three data vectors $x_m$, $y_m$ and $z_m$ can also be written in different ways, such as
$$z_m = a x_m + b y_m + c \tag{5.1.13}$$
and
$$1 = a x_m + b y_m + c z_m \tag{5.1.14}$$
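The hinge effect described above for straight lines is easy to reproduce numerically. The following sketch draws repeated samples from a hypothetical population (mean of $x$ far from the origin, $y$ only loosely tied to $x$; all numbers are assumptions made for illustration), fits $y = ax + b$ to each sample, and computes the correlation between the fitted slopes and intercepts.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed for reproducibility
n_samples, n_points = 500, 20
slopes = np.empty(n_samples)
intercepts = np.empty(n_samples)

for k in range(n_samples):
    # Hypothetical population: x centred on 10, large scatter of y about the trend.
    x = rng.normal(10.0, 1.0, n_points)
    y = 0.2 * x + rng.normal(0.0, 1.0, n_points)
    A = np.column_stack([x, np.ones(n_points)])      # rows (x_i, 1)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    slopes[k], intercepts[k] = coef

r = np.corrcoef(slopes, intercepts)[0, 1]
print("correlation between slope and intercept:", r)
```

Because every fitted line passes close to the same mean point, a steeper slope must be compensated by a lower intercept, which is why a slope-intercept anticorrelation across samples should not be read as a geochemical signal.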

Hirose and Kushiro (1993) have determined the composition of basaltic melts segregated from peridotite at 10-30 kbars. Some of the data are listed as molar fractions in Table 5.4. Making the rather crude assumption that the clinopyroxene solubility product $K_{cpx}$ can be written as
$$K_{cpx} = [\mathrm{SiO_2}]^2[\mathrm{MgO}][\mathrm{CaO}] \tag{5.1.15}$$
where [brackets] refer to molar concentrations, determine the dependence of $\ln K_{cpx}$ on temperature $T$ and pressure $P$ for the runs saturated in clinopyroxene. The common thermodynamic expression for that dependence
$$\ln K_{cpx} = -\frac{\Delta H}{RT} + \frac{\Delta S}{R} - \frac{P\,\Delta V}{RT}$$
where $\Delta H$, $\Delta S$, and $\Delta V$ are the enthalpy, entropy, and volume of clinopyroxene solution, respectively (e.g., Denbigh, 1968), is assumed to hold. Check whether the liquids segregated from clinopyroxene-free peridotites are clinopyroxene-undersaturated.

We can make this problem linear by selecting $x = 1000/T$, $y = 1000P/T$ and $z = \ln K_{cpx}$ and write the relationship between thermodynamic quantities as
$$z = ax + by + c$$
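The linearized plane fit can be set up as follows. This is a sketch that takes pressure, temperature and the observed $\ln([\mathrm{SiO_2}]^2[\mathrm{MgO}][\mathrm{CaO}])$ of the 16 clinopyroxene-present runs from Table 5.4; the conversion $T = t + 273$ is an assumption that reproduces the tabulated values of $1000/T$, and the variable names are illustrative.

```python
import numpy as np

# Clinopyroxene-present runs of Table 5.4 (Hirose and Kushiro, 1993).
P_kb = np.array([10, 15, 20, 10, 10, 15, 10, 20,
                 25, 15, 20, 20, 25, 25, 30, 30], dtype=float)
t_C = np.array([1250, 1275, 1350, 1250, 1300, 1300, 1300, 1375,
                1425, 1350, 1375, 1425, 1425, 1450, 1500, 1525], dtype=float)
ln_X = np.array([-6.149, -6.365, -5.993, -6.199, -5.739, -6.349, -5.721, -5.801,
                 -5.695, -5.447, -5.587, -5.401, -5.657, -5.436, -5.551, -5.406])

T_K = t_C + 273.0                 # assumption: T = t + 273
x = 1000.0 / T_K
y = 1000.0 * P_kb / T_K

# z = a x + b y + c: rows (x_i, y_i, 1), unknown vector (a, b, c).
A = np.column_stack([x, y, np.ones_like(x)])
(a, b, c), *_ = np.linalg.lstsq(A, ln_X, rcond=None)
ln_K_fit = A @ np.array([a, b, c])

print(f"a = {a:.2f}, b = {b:.3f}, c = {c:.2f}")
print("fitted ln K_cpx:", np.round(ln_K_fit, 3))
```

The same coefficients can then be applied to the clinopyroxene-absent runs to compare their observed product with the predicted saturation value.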

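We therefore solve in the least-square sense the matrix equation for the 16 clinopyroxene-present runs
$$\begin{bmatrix} 0.657 & 6.566 & 1 \\ 0.646 & 9.690 & 1 \\ \vdots & \vdots & \vdots \\ 0.556 & 16.685 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} -6.149 \\ -6.365 \\ \vdots \\ -5.406 \end{bmatrix}$$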


Table 5.4. Concentrations of selected elements (molar fractions) in melts from peridotite at different temperatures t (°C) and pressures P (kb) (Hirose and Kushiro, 1993). T is the temperature in K. *clinopyroxene-bearing residual assemblage. **spinel-clinopyroxene-bearing residual assemblage. Adjustment of $K_{cpx}$ given by equation (5.1.15) is made on clinopyroxene-bearing residual assemblages. $X_{cpx} < K_{cpx}$ in clinopyroxene-absent runs confirms that equation (5.1.15) predicts clinopyroxene solubility reasonably well.

Run #   P    t      SiO2    MgO     CaO     ln X_cpx   1000/T   1000P/T   ln K_cpx
1**     10   1250   0.466   0.106   0.093   -6.149     0.657     6.566    -6.179
4**     15   1275   0.450   0.111   0.077   -6.365     0.646     9.690    -6.347
7**     20   1350   0.432   0.148   0.090   -5.993     0.616    12.323    -6.026
14**    10   1250   0.466   0.109   0.086   -6.199     0.657     6.566    -6.179
15**    10   1300   0.460   0.137   0.111   -5.739     0.636     6.357    -5.691
18**    15   1300   0.457   0.112   0.075   -6.349     0.636     9.536    -6.100
2*      10   1300   0.456   0.142   0.111   -5.721     0.636     6.357    -5.691
8*      20   1375   0.435   0.164   0.098   -5.801     0.607    12.136    -5.795
10*     25   1425   0.429   0.182   0.100   -5.695     0.589    14.723    -5.733
19*     15   1350   0.448   0.178   0.120   -5.447     0.616     9.242    -5.629
21*     20   1375   0.430   0.188   0.108   -5.587     0.607    12.136    -5.795
22*     20   1425   0.444   0.213   0.108   -5.401     0.589    11.779    -5.354
23*     25   1425   0.438   0.182   0.100   -5.657     0.589    14.723    -5.733
24*     25   1450   0.438   0.214   0.106   -5.436     0.580    14.510    -5.517
25*     30   1500   0.411   0.224   0.103   -5.551     0.564    16.920    -5.465
26*     30   1525   0.424   0.241   0.103   -5.406     0.556    16.685    -5.262
3       10   1350   0.466   0.186   0.085   -5.681     0.616     6.161    -5.233
5       15   1350   0.443   0.153   0.105   -5.759     0.616     9.242    -5.629
6       15   1400   0.444   0.195   0.103   -5.534     0.598     8.966    -5.187
9       20   1425   0.429   0.200   0.100   -5.606     0.589    11.779    -5.354
11      25   1450   0.433   0.212   0.100   -5.532     0.580    14.510    -5.517
12      30   1475   0.420   0.195   0.097   -5.700     0.572    17.162    -5.675
13      30   1500   0.416   0.217   0.094   -5.641     0.564    16.920    -5.465
16      10   1350   0.463   0.182   0.109   -5.455     0.616     6.161    -5.233
17      10   1400   0.470   0.223   0.092   -5.396     0.598     5.977    -4.802
20      15   1400   0.453   0.213   0.104   -5.390     0.598     8.966    -5.187

and obtain $a = -22.10$, $b = -0.129$ and $c = 9.18$. The estimates of $\ln K_{cpx}$ (Table 5.4 and Figure 5.3) correlate well with the measured values. The values of $[\mathrm{SiO_2}]^2[\mathrm{MgO}][\mathrm{CaO}]$ for clinopyroxene-absent runs plot to the left of the linear array in Figure 5.3, which correctly predicts clinopyroxene undersaturation.

5.1.3 Least-square polynomials

$m$ pairs of experimental data $u_i$ and $y_i$ ($i = 1, \ldots, m$) are to be fitted with a polynomial of degree $n - 1$, such as
$$y_i = a_0 + a_1 u_i + a_2 u_i^2 + \cdots + a_{n-1} u_i^{n-1} \tag{5.1.16}$$

Figure 5.3 Correlation between the estimated $K_{cpx} = [\mathrm{SiO_2}]^2[\mathrm{MgO}][\mathrm{CaO}]$ at clinopyroxene saturation and the observed value in melts for cpx-present and cpx-absent runs in Hirose and Kushiro's (1993) experiments on peridotite melting. The straight line calculated from cpx-present runs represents the saturation line.

where the $n$ coefficients $a_j$ are constants to be determined. In matrix form, this can be written
$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix} = \begin{bmatrix} 1 & u_1 & u_1^2 & \cdots & u_1^{n-1} \\ 1 & u_2 & u_2^2 & \cdots & u_2^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & u_m & u_m^2 & \cdots & u_m^{n-1} \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_{n-1} \end{bmatrix}$$
Let us lump together the $m$ observations $y_i$ ($i = 1, \ldots, m$) into a vector $y$, the polynomial coefficients $a_j$ ($j = 0, \ldots, n-1$) into a vector $x$ of unknowns, and define the $(j-1)$th power of the $i$th observable $(u_i)^{j-1}$ as the current term $a_{ij}$ of the matrix $A_{m \times n}$. We now apply the usual method. Polynomials of high degrees tend to generate nearly singular matrices $A$ which result in excessive fluctuations.

In a basalt-rhyolite interdiffusion experiment (Alibert and Carron, 1980), potassium concentrations $C_K$ were measured at arbitrary distances $y$ (in µm) across the interface between rhyolitic and basaltic liquids experimentally heated for 5000 seconds (Table 5.5 and Figure 5.4). In order to determine the diffusion coefficients, a fit of the experimental points with a polynomial is requested. Use the reduced concentration $u_i$ (the fractional deviation of the concentration at $y_i$ from the concentrations in the original liquids) given by
$$u_i = \frac{2C_K^{\,i} - C_K^{bas} - C_K^{rhy}}{C_K^{bas} - C_K^{rhy}}$$
where $C_K^{rhy}$ and $C_K^{bas}$ are the concentrations at the rhyolite and basalt ends of the profile.


Table 5.5. Profile of concentrations $C_K$ (%) in the glass during a rhyolite-basalt diffusion experiment (Alibert and Carron, 1980). $y$ (µm) is an arbitrary distance across the granite-basalt interface, $u$ is a dimensionless concentration normalized to the concentrations at the profile end-points, and $\hat{y}$ is the adjusted distance.

y        C_K     u         $\hat{y}$
0.0      5.00   -1.0000      4.6
10.0     4.99   -0.9940      5.8
26.3     4.79   -0.8657     24.8
34.8     4.37   -0.6090     36.9
42.5     3.78   -0.2388     40.6
50.8     3.26    0.0896     51.5
66.6     2.75    0.4090     67.7
83.1     2.39    0.6299     82.5
99.3     2.16    0.7761     97.1
109.3    2.01    0.8657    110.0
124.1    1.87    0.9552    127.5
134.0    1.82    0.9881    135.4
142.3    1.80    1.0000    138.5

From Table 5.5, we get
$$u = \frac{2C_K}{1.8 - 5.0} - \frac{1.8 + 5.0}{1.8 - 5.0} = -0.625\,C_K + 2.125$$

For $n = 6$, the matrix $A$, with columns $1, u, u^2, u^3, u^4, u^5$, is calculated as
$$A = \begin{bmatrix}
1.000 & -1.000 & 1.000 & -1.000 & 1.000 & -1.000 \\
1.000 & -0.994 & 0.988 & -0.982 & 0.976 & -0.970 \\
1.000 & -0.866 & 0.749 & -0.649 & 0.562 & -0.486 \\
1.000 & -0.609 & 0.371 & -0.226 & 0.138 & -0.084 \\
1.000 & -0.239 & 0.057 & -0.014 & 0.003 & -0.001 \\
1.000 & 0.090 & 0.008 & 0.001 & 0.000 & 0.000 \\
1.000 & 0.409 & 0.167 & 0.068 & 0.028 & 0.011 \\
1.000 & 0.630 & 0.397 & 0.250 & 0.157 & 0.099 \\
1.000 & 0.776 & 0.602 & 0.467 & 0.363 & 0.282 \\
1.000 & 0.866 & 0.749 & 0.649 & 0.562 & 0.486 \\
1.000 & 0.955 & 0.912 & 0.872 & 0.832 & 0.795 \\
1.000 & 0.988 & 0.976 & 0.965 & 0.953 & 0.942 \\
1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000
\end{bmatrix}$$

Figure 5.4 Least-square fit of the K concentration data in the Alibert and Carron (1980) experiment of diffusion at a basalt-rhyolite interface by a polynomial of degree $(n-1)$ with $n = 6$ (top) and $n = 10$ (bottom). The horizontal axis is the reduced concentration $u$. When $n$ increases from 6 to 10, the solution begins oscillating between the data.

and the 'observation' vector $y$ as
$$y = [0.0,\ 10.0,\ 26.3,\ 34.8,\ 42.5,\ 50.8,\ 66.6,\ 83.1,\ 99.3,\ 109.3,\ 124.1,\ 134.0,\ 142.3]^T$$
(Actually, there is as much observation in the matrix $A$ as there is in the vector $y$.) This gives the solution
$$\hat{x} = [47.72,\ 39.05,\ 33.65,\ -29.30,\ -9.802,\ 57.22]^T$$
or, equivalently
$$\hat{y} = 47.72 + 39.05u + 33.65u^2 - 29.30u^3 - 9.802u^4 + 57.22u^5$$

Table 5.6. Isotopic data on Heard Island volcanics (Barling and Goldstein, 1989)

Sample   206Pb/204Pb   87Sr/86Sr
65002    18.527        0.70478
65054    18.656        0.70480
65015    18.796        0.70482
69244    18.776        0.70488
H10      18.189        0.70523
65171    18.211        0.70534
65085    18.110        0.70547
65151    17.953        0.70600
69285    17.790        0.70792

Except for the end points, the fit is acceptable. Leave well alone! Increasing the order of the fit does not improve the results and, at degree 9, the solution begins to oscillate wildly where it is not constrained by the measurements, i.e., between the data points (Figure 5.4).
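The polynomial fit of this example can be sketched with NumPy as follows, using the $u$ and $y$ columns of Table 5.5. The helper `fit_polynomial` is a hypothetical name introduced here for illustration; it builds the matrix of powers with `np.vander` and solves the least-square problem. Comparing the condition numbers for $n = 6$ and $n = 10$ gives a rough idea of why high-degree fits become unstable.

```python
import numpy as np

# Table 5.5: reduced concentration u and distance y (micrometres).
u = np.array([-1.0000, -0.9940, -0.8657, -0.6090, -0.2388, 0.0896, 0.4090,
              0.6299, 0.7761, 0.8657, 0.9552, 0.9881, 1.0000])
y = np.array([0.0, 10.0, 26.3, 34.8, 42.5, 50.8, 66.6, 83.1, 99.3,
              109.3, 124.1, 134.0, 142.3])

def fit_polynomial(u, y, n):
    """Least-square polynomial of degree n - 1.

    Returns the coefficients a_0 ... a_{n-1} and the matrix A of powers.
    """
    A = np.vander(u, N=n, increasing=True)    # columns 1, u, u^2, ..., u^(n-1)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef, A

coef6, A6 = fit_polynomial(u, y, 6)
y_fit6 = A6 @ coef6                            # adjusted distances for n = 6
coef10, A10 = fit_polynomial(u, y, 10)

print("coefficients, n = 6:", np.round(coef6, 2))
print("adjusted y,   n = 6:", np.round(y_fit6, 1))
print("cond(A), n = 6 :", np.linalg.cond(A6))
print("cond(A), n = 10:", np.linalg.cond(A10))
```

Evaluating `A10 @ coef10` on a fine grid of $u$ values, rather than at the data points only, is what reveals the oscillations between the measurements.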

5.1.4 Least-square hyperbola

Given $m$ pairs of experimental data $u_i$ and $v_i$ ($i = 1, \ldots, m$), and a general hyperbola equation
$$(u - u_\infty)(v - v_\infty) = c \tag{5.1.17}$$
where $u_\infty$ and $v_\infty$ are the positions of the asymptotes on the axes and $c$ is a constant characteristic of the curvature, we rewrite this equation for each pair of measurements as
$$u_i v_i = c - u_\infty v_\infty + v_i u_\infty + u_i v_\infty \tag{5.1.18}$$
Then we lump together the $m$ products $u_i v_i$ into a vector $y$, and $c - u_\infty v_\infty$, $u_\infty$, and $v_\infty$ into a 3-vector $x$ of unknowns, and form the $i$th row of the $A_{m \times 3}$ matrix with 1, $v_i$ and $u_i$. Then apply the usual method.

Barling and Goldstein (1989) have measured Pb and Sr isotope compositions in recent lavas from Heard Island (Southern Indian Ocean). They obtained the data listed in Table 5.6. Find the parameters of a least-square mixing hyperbola fitting the observations.

Examination of the data in a $^{87}$Sr/$^{86}$Sr vs $^{206}$Pb/$^{204}$Pb diagram (Figure 5.5) suggests that these samples form a hyperbolic array and therefore represent a suite of mixtures between two end-members. In order to fit a hyperbola to the data, let us build the

Figure 5.5 Least-square mixing hyperbola for the isotopic data on Heard Island of Barling and Goldstein (1989), plotted as $^{87}$Sr/$^{86}$Sr against $^{206}$Pb/$^{204}$Pb. Data from Table 5.6. The $^{87}$Sr/$^{86}$Sr value of the MORB source (≈0.7025) lies below the horizontal asymptote. Asthenosphere and oceanic lithosphere are unlikely source components of Heard Island basalts.

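vectors $x$, $y$ and the matrix $A$. The matrix equality $Ax = y$ reads numerically
$$\begin{bmatrix}
1 & 0.70478 & 18.527 \\
1 & 0.70480 & 18.656 \\
1 & 0.70482 & 18.796 \\
1 & 0.70488 & 18.776 \\
1 & 0.70523 & 18.189 \\
1 & 0.70534 & 18.211 \\
1 & 0.70547 & 18.110 \\
1 & 0.70600 & 17.953 \\
1 & 0.70792 & 17.790
\end{bmatrix}
\begin{bmatrix} c - u_\infty v_\infty \\ u_\infty \\ v_\infty \end{bmatrix}
=
\begin{bmatrix}
18.527 \times 0.70478 \\
18.656 \times 0.70480 \\
18.796 \times 0.70482 \\
18.776 \times 0.70488 \\
18.189 \times 0.70523 \\
18.211 \times 0.70534 \\
18.110 \times 0.70547 \\
17.953 \times 0.70600 \\
17.790 \times 0.70792
\end{bmatrix}
=
\begin{bmatrix}
13.057 \\ 13.149 \\ 13.248 \\ 13.235 \\ 12.827 \\ 12.845 \\ 12.776 \\ 12.675 \\ 12.594
\end{bmatrix}$$
which gives the least-square solution
$$\hat{x} = \begin{bmatrix} -12.450 \\ 17.674 \\ 0.70442 \end{bmatrix}$$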

The mixing hyperbola has been drawn in Figure 5.5. The two asymptotes intersect


the axes at the values
$$^{206}\mathrm{Pb}/^{204}\mathrm{Pb} = 17.674 \qquad ^{87}\mathrm{Sr}/^{86}\mathrm{Sr} = 0.70442$$
which, as discussed by Barling and Goldstein (1989), has the important corollary that a MORB-type mantle source ($^{87}$Sr/$^{86}$Sr < 0.703) is not involved in the genesis of the Heard Island lavas. Keeping the full precision we would find that the curvature factor is $4.2855 \times 10^{-4}$, so the mixing hyperbola has the equation
$$\frac{^{87}\mathrm{Sr}}{^{86}\mathrm{Sr}} = 0.70442 + \frac{4.2855 \times 10^{-4}}{^{206}\mathrm{Pb}/^{204}\mathrm{Pb} - 17.674}$$
We should nevertheless be aware that this method does not guarantee that all points will fall onto the same branch of a hyperbola.

5.1.5 The periodogram

Periodic or nearly periodic variations of geochemical parameters with the time of deposition for sediments are quite commonly observed. Several characteristic frequencies of these variations can be related to the Milankovitch orbital frequencies, which makes the analysis of time series in sedimentary sections an attractive tool of paleoclimatology (Berger, 1988). $\delta^{18}$O and $\delta^{13}$C data in sedimentary carbonates sampled by drill cores offer the best-documented example of geochemical time series: unfortunately, the measurement cannot be triggered arbitrarily by the analyst as it can in many fields of geophysics. It is rather dictated by the existence of a support (a rock of appropriate composition) that carries the signal over some periods of time: the analyst has only loose control over where in a stratigraphic sequence the measurement can be made. Therefore, the wealth of methods dedicated to Fourier analysis usually fails for that sort of problem because measurements are most commonly made at times that are not equally spaced. In fact, the interpolation of the measurements is almost certain to give spurious results.

Given a geochemical variable $y$, $m$ measurements at times $t_1, t_2, \ldots, t_m$ produce the unevenly spaced time series $y_1, y_2, \ldots, y_m$, which we lump together as the vector $y$. In order to find out possible periodicities, Lomb (1976) suggests fitting the data by a sine wave using a least-square criterion. For any arbitrary frequency $f$, the fitting function is written
$$y = a \sin 2\pi f(t - \tau) + b \cos 2\pi f(t - \tau) \tag{5.1.19}$$
where $a$ and $b$ must be determined by least-squares and $\tau$ is a time-shift variable that gives the solution some convenient properties. The fit can be represented by a series of equations linear in $a$ and $b$, such that
$$y_i = a \sin 2\pi f(t_i - \tau) + b \cos 2\pi f(t_i - \tau) \tag{5.1.20}$$


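or in a matrix form

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix} = \begin{bmatrix} \sin 2\pi f(t_1-\tau) & \cos 2\pi f(t_1-\tau) \\ \sin 2\pi f(t_2-\tau) & \cos 2\pi f(t_2-\tau) \\ \vdots & \vdots \\ \sin 2\pi f(t_m-\tau) & \cos 2\pi f(t_m-\tau) \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix}$$

In short, the last equation takes the form $y = Ax$ where $A$ is an $m \times 2$ matrix and $x$ is the column vector made of $a$ and $b$. The minimum variance estimator $\hat{x}$ of $x$ reads

$$\hat{x} = (A^{\mathrm T}A)^{-1}A^{\mathrm T}y$$

The $2 \times 2$ matrix $A^{\mathrm T}A$ is given by

$$A^{\mathrm T}A = \begin{bmatrix} \sum_{i=1}^{m} \sin^2 2\pi f(t_i-\tau) & \sum_{i=1}^{m} \sin 2\pi f(t_i-\tau)\cos 2\pi f(t_i-\tau) \\ \sum_{i=1}^{m} \sin 2\pi f(t_i-\tau)\cos 2\pi f(t_i-\tau) & \sum_{i=1}^{m} \cos^2 2\pi f(t_i-\tau) \end{bmatrix}$$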

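The solution becomes particularly simple and time-invariant if $\tau$ is chosen in such a way that the cross-terms vanish. Using basic trigonometric identities, we cancel the off-diagonal term

$$\sum_{i=1}^{m} \sin 2\pi f(t_i-\tau)\cos 2\pi f(t_i-\tau) = \frac{1}{2}\sum_{i=1}^{m} \sin 4\pi f(t_i-\tau) = \frac{1}{2}\cos 4\pi f\tau \sum_{i=1}^{m} \sin 4\pi f t_i - \frac{1}{2}\sin 4\pi f\tau \sum_{i=1}^{m} \cos 4\pi f t_i = 0 \tag{5.1.21}$$

hence

$$\cos 4\pi f\tau \sum_{i=1}^{m} \sin 4\pi f t_i = \sin 4\pi f\tau \sum_{i=1}^{m} \cos 4\pi f t_i$$

and

$$\tan 4\pi f\tau = \frac{\sum_{i=1}^{m} \sin 4\pi f t_i}{\sum_{i=1}^{m} \cos 4\pi f t_i} \tag{5.1.22}$$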

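Since the vector $A^{\mathrm T}y$ can be written

$$A^{\mathrm T}y = \begin{bmatrix} \sum_{i=1}^{m} y_i \sin 2\pi f(t_i-\tau) \\ \sum_{i=1}^{m} y_i \cos 2\pi f(t_i-\tau) \end{bmatrix}$$

the solution for $a$ and $b$ is given by

$$\hat{a} = \frac{\sum_{i=1}^{m} y_i \sin 2\pi f(t_i-\tau)}{\sum_{i=1}^{m} \sin^2 2\pi f(t_i-\tau)} \tag{5.1.23}$$

and

$$\hat{b} = \frac{\sum_{i=1}^{m} y_i \cos 2\pi f(t_i-\tau)}{\sum_{i=1}^{m} \cos^2 2\pi f(t_i-\tau)} \tag{5.1.24}$$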

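The 'reduction in the sum of squares' is a concept that may a priori look surprising (Lomb, 1976; Scargle, 1982). Nevertheless, its use is supported by the convergence between the reduction in the sum of squares and the familiar power spectrum in Fourier analysis when the data become equally spaced. It is simply the difference $\Delta S(f)$ in the sum of squares before the fit and after the fit for one particular frequency

$$\Delta S(f) = \sum_{i=1}^{m} y_i^2 - \sum_{i=1}^{m} (y_i - \hat{y}_i)^2$$

or, in a vector form,

$$\Delta S(f) = y^{\mathrm T}y - (y-\hat{y})^{\mathrm T}(y-\hat{y})$$

$$\Delta S(f) = y^{\mathrm T}y - y^{\mathrm T}(I-P)^{\mathrm T}(I-P)y$$

where the usual symbol $P$ is used for the projector $A(A^{\mathrm T}A)^{-1}A^{\mathrm T}$. Since projectors are symmetric and idempotent, the reduction $\Delta S(f)$ becomes

$$\Delta S(f) = y^{\mathrm T}y - y^{\mathrm T}(I-P)y = y^{\mathrm T}Py$$

This is written in full,

$$\Delta S(f) = y^{\mathrm T}A(A^{\mathrm T}A)^{-1}A^{\mathrm T}y = (A^{\mathrm T}y)^{\mathrm T}(A^{\mathrm T}A)^{-1}A^{\mathrm T}y = y^{\mathrm T}A\hat{x}$$

which reduces to

$$\Delta S(f) = \frac{\left[\sum_{i=1}^{m} y_i \sin 2\pi f(t_i-\tau)\right]^2}{\sum_{i=1}^{m} \sin^2 2\pi f(t_i-\tau)} + \frac{\left[\sum_{i=1}^{m} y_i \cos 2\pi f(t_i-\tau)\right]^2}{\sum_{i=1}^{m} \cos^2 2\pi f(t_i-\tau)} \tag{5.1.26}$$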

Whenever the frequency $f$ becomes close to a strong periodic component of the measured signal, the terms in brackets tend to add up and the periodogram shows a power peak around $f$. Between these peaks, the terms are not correlated, their sign and amplitude tend to be random, and the sum will be small. Still with reference to Fourier analysis, it is common practice to plot the power $P(f)$ as $\Delta S(f)/2$.
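The recipe of equations (5.1.22) and (5.1.26) is short enough to be written out. The following is a minimal sketch in Python with NumPy, not the author's program: the function name `lomb_power` and the synthetic test signal are assumptions made for illustration. It uses `arctan2` to pick one of the equivalent solutions of (5.1.22); any of them cancels the cross-term and gives the same power.

```python
import numpy as np

def lomb_power(t, y, freqs):
    """Return P(f) = Delta S(f) / 2 for each trial frequency in freqs."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    power = np.empty(len(freqs))
    for k, f in enumerate(freqs):
        # time shift tau that cancels the cross-term, equation (5.1.22)
        tau = np.arctan2(np.sin(4 * np.pi * f * t).sum(),
                         np.cos(4 * np.pi * f * t).sum()) / (4 * np.pi * f)
        s = np.sin(2 * np.pi * f * (t - tau))
        c = np.cos(2 * np.pi * f * (t - tau))
        # reduction in the sum of squares, equation (5.1.26)
        delta_s = (y @ s) ** 2 / (s @ s) + (y @ c) ** 2 / (c @ c)
        power[k] = delta_s / 2.0
    return power

# Hypothetical test signal: 82 unevenly spaced samples of a sine wave, f = 5
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 1.0, 82))
y = np.sin(2 * np.pi * 5.0 * t)
freqs = np.arange(1.0, 20.2, 0.2)
power = lomb_power(t, y, freqs)
f_peak = freqs[np.argmax(power)]
```

For a noise-free sine wave the fit at the true frequency is exact, so the power there reaches its upper bound $y^{\mathrm T}y/2$. With real data the series should first be detrended, as in the worked example of this section.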


Table 5.7. $\delta^{18}$O values (‰) in pelagic foraminifera from hole 704 of the Ocean Drilling Program in the South Atlantic (Hodell and Cieselski, 1991). mbsf: meters below seafloor.

mbsf   δ18O    mbsf   δ18O    mbsf   δ18O    mbsf   δ18O
49.05  3.52    54.75  2.37    60.81  2.22    68.01  3.74
49.26  3.59    55.30  2.94    61.00  2.50    68.31  3.54
49.90  3.62    55.71  2.97    61.30  2.72    68.60  3.44
50.05  3.37    55.90  2.87    61.61  2.76    69.01  2.84
50.25  3.43    56.31  3.19    61.90  3.10    69.21  2.25
50.55  3.40    56.50  3.26    62.24  3.38    69.51  2.71
50.76  3.51    56.80  3.28    62.31  3.24    69.54  2.85
51.11  3.29    57.11  3.17    62.50  3.24    69.81  3.43
51.40  2.68    57.41  2.21    62.71  3.22    70.10  3.57
51.75  2.22    57.74  2.53    63.11  3.13    70.51  3.29
52.05  3.17    57.81  2.36    63.74  3.17    70.71  3.17
52.26  3.38    58.00  2.28    64.50  2.13    71.01  3.31
52.61  3.07    58.30  2.84    65.01  1.57    71.31  3.05
52.86  2.50    58.61  2.82    65.31  1.92    71.60  2.87
52.90  2.79    58.90  3.15    65.60  2.78    71.91  3.24
53.25  2.63    59.31  3.16    66.00  3.25    72.01  3.01
53.55  2.49    59.50  3.03    66.21  3.42    72.21  2.97
53.76  2.39    59.80  2.94    66.81  2.88    72.51  2.72
54.02  2.53    60.11  2.70    67.10  2.24    72.81  2.91
54.11  2.41    60.40  2.42    67.51  2.11
54.40  2.55    60.74  2.44    67.68  2.08

Neogene calcareous sediments were recovered from hole 704 of the Ocean Drilling Program in the South Atlantic. The $\delta^{18}$O of pelagic foraminifera has been analyzed at 82 different depths $z$ reported in meters below seafloor (mbsf) by Hodell and Cieselski (1991) (Table 5.7 and Figure 5.6). For simplicity, we will assume that depth below the sea bottom varies linearly with time. The first step in the calculation consists in removing the long-term drift over the whole period by fitting the data with a parabola and determining a periodogram out of the residuals from this fit. Depth has also been scaled to the range 0 to 1 (reduced depth $z_{red}$) in order to minimize round-off errors. Applying the method shown above, the best-fit parabola is obtained for

$$\delta^{18}\mathrm{O} = 3.2696 - 2.1282\,z_{red} + 2.0372\,z_{red}^2$$

This calculation gives the sequence of reduced $\delta^{18}$O, noted $\delta^{18}\mathrm{O}_{red}$, i.e., the difference between observed and fitted values, which is plotted in Figure 5.7. Table 5.8 gives numerical results for a few data points and $f = 4$. From columns 3 and 4, we could get

$$\sum_{i=1}^{82} \sin 4\pi(4 z_{red}) = -1.8385, \qquad \sum_{i=1}^{82} \cos 4\pi(4 z_{red}) = -2.7100$$

hence

$$\tau = \frac{1}{4\pi \times 4}\arctan\left(\frac{-1.8385}{-2.7100}\right) \approx 0.0119$$

Figure 5.6 $\delta^{18}$O of Neogene pelagic foraminifera from ODP hole 704 in the South Atlantic sampled at different depths $z$ reported in meters below seafloor (Hodell and Cieselski, 1991). Horizontal axis: depth $z$ (mbsf).

Figure 5.7 Same data as in Figure 5.6 but depth has been normalized and a parabolic trend has been removed from the data by least-square adjustment. Horizontal axis: reduced depth $z_{red}$.


Table 5.8. Reduced depth $z_{red}$ and $\delta^{18}$O values after removal of a parabolic trend. Sample calculation for $f = 4$ at selected depths. $y_i = \delta^{18}\mathrm{O}_{red}$, $\xi_1 = 4\pi f z_{red}$, $\xi_2 = 2\pi f(z_{red} - \tau)$.

z_red     y_i       sin ξ1    cos ξ1    sin ξ2    cos ξ2
0.0000    0.2504    0.0000    1.0000   -0.2937    0.9559
0.0088    0.3391    0.4298    0.9029   -0.0758    0.9971
0.0358    0.4240    0.9743   -0.2255    0.5655    0.8247
0.0421    0.1864    0.8553   -0.5182    0.6887    0.7250
...
0.9747   -0.1607   -0.9549    0.2969   -0.8032    0.5957
0.9874   -0.4343   -0.5929    0.8053   -0.5773    0.8166
1.0000   -0.2686   -0.0000    1.0000   -0.2937    0.9559

Multiplying column 2 in Table 5.8 by columns 5 and 6 gives

$$\sum_{i=1}^{82} \delta^{18}\mathrm{O}_{red} \sin 2\pi 4(z_{red}-\tau) = 8.4603, \qquad \sum_{i=1}^{82} \delta^{18}\mathrm{O}_{red} \cos 2\pi 4(z_{red}-\tau) = 1.3883$$

while squaring columns 5 and 6 yields

$$\sum_{i=1}^{82} \sin^2 2\pi 4(z_{red}-\tau) = 42.6374, \qquad \sum_{i=1}^{82} \cos^2 2\pi 4(z_{red}-\tau) = 39.3626$$

Finally

$$P(4) = \frac{1}{2}\left(\frac{8.4603^2}{42.6374} + \frac{1.3883^2}{39.3626}\right) = 0.864$$
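The arithmetic for $f = 4$ is easy to check from the four sums quoted above. The short Python sketch below does so; the variable names are illustrative assumptions. It also verifies that the two sums of squares add up to the number of data points, since each point contributes $\sin^2 + \cos^2 = 1$, and evaluates the time shift from equation (5.1.22) with the two trigonometric sums of the example.

```python
import math

sum_y_sin, sum_y_cos = 8.4603, 1.3883   # sums of y_i*sin and y_i*cos
sum_sin2, sum_cos2 = 42.6374, 39.3626   # sums of sin^2 and cos^2

# sin^2 + cos^2 = 1 for every point, so the sums must total m = 82
n_points = sum_sin2 + sum_cos2

# power at f = 4, i.e. Delta S(4) / 2 from equation (5.1.26)
p4 = 0.5 * (sum_y_sin ** 2 / sum_sin2 + sum_y_cos ** 2 / sum_cos2)

# time shift from equation (5.1.22), principal value of the arctangent
f = 4.0
tau = math.atan((-1.8385) / (-2.7100)) / (4 * math.pi * f)

print(round(n_points, 4), round(p4, 3), round(tau, 4))
```

The value of `tau` obtained this way is consistent with the entries $\sin\xi_2 = -0.2937$ and $\cos\xi_2 = 0.9559$ listed for $z_{red} = 0$ in Table 5.8.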

The calculation has been carried out for frequencies ranging from 1 to 20 with intervals of 0.2 and the results are shown in Figure 5.8. The periodogram shows a strong periodic component at $f \approx 3.5$ and its harmonics at $\approx 7$ and 10.5. The data are distributed over 23.76 meters. Given a sedimentation rate of 63 m Ma$^{-1}$ (Hodell and Cieselski, 1991), the section covers a time interval of about 380 000 years. The present stratigraphic section contains evidence for a $380\,000/3.5 \approx 10^5$ years cyclic component extremely common in the sedimentary record (e.g., Berger, 1988). Although a more detailed discussion does not pertain to this book, the peak width may be shown to be related to the analyzed core length.

Figure 5.8 Periodogram of the reduced data shown in Figure 5.7 (horizontal axis: frequency $f$). The strong peak at ca. $f = 3.5$ and its harmonics at ca. 7.0 and 10.5 correspond to a ca. $10^5$ year periodic component.

5.1.6 Fitting global data with spherical harmonics

Given a finite number of measurements at a given latitude ($90° - \theta$) and longitude $\phi$ on the surface of the Earth, we look for a smooth function that could be fitted to the data and represent their variations to within any desired precision. Spherical harmonics are suitable because they make an orthogonal set of functions which can be expanded to an arbitrary degree (although limited by the number of data points): because of orthogonality, subsequent truncating of the solution to a lesser degree still gives a solution which satisfies the least-square criterion. Let us assume that $n$ measurements $y(\phi_i, \theta_i)$ ($i = 1, 2, \ldots, n$) of one geochemical parameter are to be fitted with spherical harmonics to the degree $l$, i.e., with a function $\hat{y}$, such as

(5.1.27)

Replacing $\phi$ and $\theta$ by the observed value, we get one equation such as equation (5.1.27) for each of the $n$ calculated $\hat{y}_i(\phi_i, \theta_i)$, which we collect as an $n$-vector $\hat{y}$. The $p = (l+1)^2$ unknown coefficients $\alpha_{lm}$ and $\beta_{lm}$ are lumped together as a $p$-vector $x$ of unknowns. Two points make this problem a little difficult: indexing carefully the coefficients $\alpha_{lm}$ and $\beta_{lm}$ in the vector $x$ and finding an efficient routine which enables $C_{lm}(\phi, \theta)$ and $S_{lm}(\phi, \theta)$ to be calculated. The first point is largely a matter of attention, the second can be solved by borrowing a routine from professional packages, e.g., the routine from Press et al. (1986) discussed in Section 2.6. The zero-order sine terms being zero, the individual least-square equation reads
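The indexing problem can be illustrated with a short Python sketch. It is only one possible arrangement, built on assumptions that are not taken from the text: $C_{lm} = P_l^m(\cos\theta)\cos m\phi$ and $S_{lm} = P_l^m(\cos\theta)\sin m\phi$ with unnormalized associated Legendre functions and no Condon-Shortley phase, and the function names `legendre_table` and `design_matrix` are hypothetical. With a different normalization the fitted coefficients change, but the bookkeeping is the same.

```python
import numpy as np

def legendre_table(degree, x):
    """Unnormalized P_l^m(x) for 0 <= m <= l <= degree, stored as P[l, m, i]."""
    x = np.asarray(x, dtype=float)
    P = np.zeros((degree + 1, degree + 1, x.size))
    s = np.sqrt(1.0 - x ** 2)
    P[0, 0] = 1.0
    for m in range(1, degree + 1):          # sectoral terms P_m^m
        P[m, m] = (2 * m - 1) * s * P[m - 1, m - 1]
    for m in range(degree):                 # first off-diagonal P_{m+1}^m
        P[m + 1, m] = (2 * m + 1) * x * P[m, m]
    for m in range(degree + 1):             # upward recurrence in l
        for l in range(m + 2, degree + 1):
            P[l, m] = ((2 * l - 1) * x * P[l - 1, m]
                       - (l + m - 1) * P[l - 2, m]) / (l - m)
    return P

def design_matrix(lat_deg, lon_deg, degree):
    """One row per measurement, one column per cosine or sine term."""
    theta = np.radians(90.0 - np.asarray(lat_deg, dtype=float))  # colatitude
    phi = np.radians(np.asarray(lon_deg, dtype=float))
    P = legendre_table(degree, np.cos(theta))
    cols, labels = [], []
    for l in range(degree + 1):
        for m in range(l + 1):
            cols.append(P[l, m] * np.cos(m * phi))
            labels.append(("C", l, m))
            if m > 0:                       # the m = 0 sine terms vanish
                cols.append(P[l, m] * np.sin(m * phi))
                labels.append(("S", l, m))
    return np.column_stack(cols), labels

# A few hypothetical sampling sites (latitude, longitude in degrees)
lat = [-28.69, 64.43, 0.0, 37.95]
lon = [65.46, -19.73, 10.0, -61.88]
A, labels = design_matrix(lat, lon, 4)
```

The list `labels` records which coefficient each column stands for, so that the solution vector can be unpacked without ambiguity. For degree 4 the matrix has 25 columns, 15 cosine terms and 10 sine terms.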

Table 5.9, compiled by Vincent Salters from the Lamont-Doherty Geological Observatory, lists $^{206}$Pb/$^{204}$Pb data in basalts from different oceanic islands, the location of which is indicated on the map of Figure 5.9. Map the variations of this isotopic ratio with spherical harmonics to the degree 4.

Table 5.9. Average $^{206}$Pb/$^{204}$Pb ratios in basalts from some ocean islands (courtesy Vincent Salters).

Latitude   Longitude   206Pb/204Pb     Latitude   Longitude   206Pb/204Pb
(90°-θ)    (φ)                         (90°-θ)    (φ)
-28.69       65.46     18.879           20.36     -156.68     18.188
 -7.95      -14.37     19.421           64.43      -19.73     18.453
-23.66     -149.37     20.533          -33.62      -78.83     19.121
-21.58     -155.61     20.001          -52.61       72.42     18.259
-19.58     -158.43     19.386          -45.22     -154.40     19.271
-21.51     -159.05     19.743          -46.92       37.75     18.562
 38.50      -28.00     19.707           -9.29     -139.96     19.362
-67.08     -168.88     19.752           37.95      -61.88     20.155
-54.35        3.50     19.445           60.00     -166.00     18.588
  1.32        7.52     20.020          -21.07       55.75     18.855
 15.70      -24.12     19.254          -30.28      -35.28     17.619
  6.93      158.32     18.462          -14.03     -171.36     18.914
-10.50      105.67     18.639          -26.42      -79.98     19.079
  5.54      -87.08     19.234           16.87     -117.47     19.046
-12.04       43.74     19.615          -17.61     -148.91     19.128
-46.45       52.00     18.929          -15.97       -5.72     20.678
-26.47     -105.47     19.865          -20.50      -29.42     19.116
 -3.83      -32.42     19.409          -37.10      -12.28     18.476
 -0.52      -90.72     19.076          -20.07     -130.10     18.132
-40.33      -10.00     18.445          -28.66        2.49     17.914

The matrix $A$ is made of 40 rows and 25 columns, 15 columns for $C_{lm}(\phi_i, \theta_i)$ and 10 columns for $S_{lm}(\phi_i, \theta_i)$. $A$ is too large to be listed. The 15 coefficients $\alpha_{lm}$ and the 10 coefficients $\beta_{lm}$ can be obtained through the standard procedure and are listed in Table 5.10. Contours of equal $^{206}$Pb/$^{204}$Pb can be drawn on a map by making a grid of longitudes and latitudes and inserting the calculated coefficients $\alpha_{lm}$ and $\beta_{lm}$ in equation (5.1.27). The map of Figure 5.9, drawn as a Mercator projection, i.e., using the transformation

$$x = \text{longitude}, \qquad y = \ln \tan\left(\frac{\pi}{4} + \frac{\text{latitude}}{2}\right)$$

shows the characteristic problem of extrapolating functions. The spherical harmonics adjust more tightly in areas where data are abundant by letting themselves vary wildly where the data are missing, e.g., under the continents. The large bumps in areas devoid of data (Australia, North America) should not be taken as features indicative of a geochemical trend.

Figure 5.9 Least-square fit of the $^{206}$Pb/$^{204}$Pb ratios listed in Table 5.9 on basalts from different oceanic islands with spherical harmonics to the degree 4 (Mercator map, 60°S to 60°N). The results are reported as lines of constant values. Results in continental areas are not shown.


Table 5.10. Spherical harmonic expansion of the data listed in Table 5.9.

α_lm
 l    m = 0     m = 1     m = 2     m = 3     m = 4
 0    71.75     0         0         0         0
 1    7.206    -2.854     0         0         0
 2   -0.434     4.316    -8.127     0         0
 3   -6.791     0.810    -7.381     1.228     0
 4    0.1378   -0.899    -1.859    -2.712     4.190

β_lm
 l    m = 0     m = 1     m = 2     m = 3     m = 4
 0    0         0         0         0         0
 1    0        -7.86      0         0         0
 2    0       -13.2       3.665     0         0
 3    0         1.38     -6.310     5.751     0
 4    0         5.82     -0.274     0.335    -0.878

5.2 Non-linear least-squares

When the function to be fitted to data does not depend linearly on the parameters, iterative methods must be used. A slightly modified version of the Newton-Raphson method (Chapter 3) will be used (Hamilton, 1964). Let $\bar{x}$ be the vector of the $n$ unknowns $\bar{x}_j$ and $\bar{y} = f(\bar{x})$ the $m$-vector of 'observable' functions $\bar{y}_i = f_i(\bar{x})$. The analytical form of the functions $f_i(x)$ may be the same or not. Let the vector $y$ represent the $m$ observations $y_i$ of these functions. A vector $\hat{x}$ is sought which minimizes the scalar $c^2$ such that

$$c^2 = \sum_{i=1}^{m} (\hat{y}_i - y_i)^2 \tag{5.2.1}$$

Since we are dealing with a finite sample, we define, as for the linear case, the least-square estimators $\hat{x}$ and $\hat{y}$ of $\bar{x}$ and $\bar{y}$, respectively, as the vectors such as

$$\hat{y} = f(\hat{x}) \tag{5.2.2}$$

and satisfying

$$E(\hat{y}) = \bar{y} \quad \text{and} \quad E(\hat{x}) = \bar{x} \tag{5.2.3}$$

which minimize the scalar $c^2$. Given an initial guess $x^0$ of $\bar{x}$, we expand $f_i(x)$ in a Taylor series about $x^0$

$$f_i(x) \approx f_i(x^0) + \sum_{j=1}^{n} \left(\frac{\partial f_i}{\partial x_j}\right)_{x^0} (x_j - x_j^0) \tag{5.2.4}$$


Table 5.11. Ni concentrations (ng kg$^{-1}$) and depth (m) in the water column from the eastern Pacific (Bruland, 1980).

z      50   100  250  500  750  1050  1200  1500  1800  2000  2500  3000
C_Ni   220  265  360  425  525  575   580   625   630   640   630   640

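we minimize

$$c^2 = \sum_{i=1}^{m} \left[ y_i - f_i(x^0) - \sum_{j=1}^{n} \left(\frac{\partial f_i}{\partial x_j}\right)_{x^0} (x_j - x_j^0) \right]^2 \tag{5.2.5}$$

relative to $x_1, x_2, \ldots, x_n$. This problem is equivalent to solving the linear least-square matrix equation

$$\Delta y = A\,\Delta x \tag{5.2.6}$$

where the current elements of the vectors $\Delta x$ and $\Delta y$ and of matrix $A_{m \times n}$ are

$$\Delta x_j = x_j - x_j^0 \tag{5.2.7}$$

$$\Delta y_i = y_i - f_i(x^0) \tag{5.2.8}$$

$$a_{ij} = \left(\frac{\partial f_i}{\partial x_j}\right)_{x^0}$$

From the initial guess $x^0$, we calculate the $m$ values of $f_i(x^0)$ and their derivatives relative to each $x_j$. Solving the least-square system, we get an improved estimate of $\bar{x}$, which we use as the initial value for the next iteration until the values cease to change significantly. Indicating the $k$th estimate by the superscript $(k)$, we can write

$$y - f(x^{(k)}) = A^{(k)} \left(x^{(k+1)} - x^{(k)}\right) \tag{5.2.10}$$

where $x^{(k)}$ and $A^{(k)}$ are the $k$th estimate of the vector $x$ and matrix $A$, respectively. The least-square iteration scheme is therefore

$$x^{(k+1)} = x^{(k)} + \left(A^{(k)\mathrm T}A^{(k)}\right)^{-1} A^{(k)\mathrm T} \left[y - f(x^{(k)})\right] \tag{5.2.11}$$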
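The iteration can be sketched in a few lines of Python with NumPy. This is an illustration under stated assumptions rather than the author's code: the function name `gauss_newton` is hypothetical, each step is solved with `numpy.linalg.lstsq` instead of forming the normal equations explicitly, and the scheme is applied to the Ni profile of Table 5.11 with the two-exponential model of the worked example that follows (observations multiplied by $\exp(z/2l_m)$, $l_m = 600$ m, starting values $-20$, $+20$ and $1$).

```python
import numpy as np

def gauss_newton(f, jac, x0, y, n_iter=50, tol=1e-10):
    """Repeat x <- x + dx, dx being the least-square solution of jac(x) dx = y - f(x)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        dx, *_ = np.linalg.lstsq(jac(x), y - f(x), rcond=None)
        x = x + dx
        if np.linalg.norm(dx) < tol * (1.0 + np.linalg.norm(x)):
            break
    return x

# Ni profile of Table 5.11
z = np.array([50, 100, 250, 500, 750, 1050, 1200, 1500,
              1800, 2000, 2500, 3000], dtype=float)
c_ni = np.array([220, 265, 360, 425, 525, 575, 580, 625,
                 630, 640, 630, 640], dtype=float)
u = z / (2 * 600.0)          # z / (2 l_m) with l_m = 600 m
y = c_ni * np.exp(u)         # transformed observations

def f(x):
    a, b, e = x
    return a * np.exp(e * u) + b * np.exp(-e * u)

def jac(x):
    a, b, e = x
    return np.column_stack([np.exp(e * u),
                            np.exp(-e * u),
                            u * (a * np.exp(e * u) - b * np.exp(-e * u))])

alpha, beta, eps = gauss_newton(f, jac, [-20.0, 20.0, 1.0], y)
fitted = np.exp(-u) * f([alpha, beta, eps])   # back to concentrations
```

No step-size control is included, so convergence depends on a reasonable starting point. A damped variant would be a natural safeguard for less well-behaved problems.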

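Bruland (1980) has measured Ni concentrations (ng kg$^{-1}$) and depth ($z$ in meters) in the water column from the eastern Pacific (Table 5.11). Find the best set of coefficients for the advection-diffusion model of Craig (1969) to fit these data.

In Section 8.8, we find that the advection-diffusion model of Craig (1969) amounts to a sum of exponentials. The data listed in Table 5.11 are to be fitted by

$$C_{\mathrm{Ni}}(z) = \exp\left(-\frac{z}{2l_m}\right)\left[\alpha \exp\left(\frac{\varepsilon z}{2l_m}\right) + \beta \exp\left(-\frac{\varepsilon z}{2l_m}\right)\right]$$

for $\alpha$, $\beta$, $\varepsilon$, and the mixing length $l_m = 600$ m (Craig, 1969). We define

$$y_i = C_{\mathrm{Ni}}(z_i)\exp\left(\frac{z_i}{2l_m}\right), \qquad f_i = \alpha \exp\left(\frac{\varepsilon z_i}{2l_m}\right) + \beta \exp\left(-\frac{\varepsilon z_i}{2l_m}\right)$$

and the vector $x = (\alpha, \beta, \varepsilon)$. Then we compute

$$a_{i1} = \frac{\partial f_i}{\partial \alpha} = \exp\left(\frac{\varepsilon z_i}{2l_m}\right)$$

$$a_{i2} = \frac{\partial f_i}{\partial \beta} = \exp\left(-\frac{\varepsilon z_i}{2l_m}\right)$$

$$a_{i3} = \frac{\partial f_i}{\partial \varepsilon} = \frac{z_i}{2l_m}\left[\alpha \exp\left(\frac{\varepsilon z_i}{2l_m}\right) - \beta \exp\left(-\frac{\varepsilon z_i}{2l_m}\right)\right]$$

Table 5.12. The first step of non-linear least-square refinement of the parameters $\alpha$, $\beta$, and $\varepsilon$ for selected depths in the water column. The last three columns represent the elements of the matrix $A$.

z_i     y_i        f_i(x0)    1000 a_i1   1000 a_i2   a_i3
  50     229.36     -1.67      1042.55     959.19      -1.67
 100     288.03     -3.34      1086.90     920.04      -3.34
...
2500    5059.65   -158.13      8031.19     124.51    -339.82
3000    7796.80   -242.01     12182.5       82.08    -613.23

Result of the first step: $\alpha^{(1)} = 668.93$, $\beta^{(1)} = -484.93$, $\varepsilon^{(1)} = 1.54$.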

As the initial guess, we chose $\alpha^{(0)} = -20$, $\beta^{(0)} = +20$, $\varepsilon^{(0)} = 1$. Table 5.12 shows some results for the first iteration. For the second iteration, we take the values of $\alpha^{(1)}$, $\beta^{(1)}$ and $\varepsilon^{(1)}$ as the new starting point. The sequence of values taken by $x$ is tabulated in Table 5.13. Convergence is achieved in 6-7 iterations. Convergence is towards $\varepsilon \approx 1$; the closeness of this value to the initial guess is entirely fortuitous. This value is an important indication that scavenging is not very efficient for Ni. The quality of the fit may be seen in Figure 5.10.

Table 5.13. Iterative refinement of the parameters $\alpha^{(k)}$, $\beta^{(k)}$ and $\varepsilon^{(k)}$.

Iter. k   α(k)       β(k)        ε(k)
0         -20          20        1
1         668.93    -484.36      1.5445
2         455.05    -205.11      1.3697
3         537.75    -326.27      1.1193
4         643.61    -456.48      0.9943
5         668.54    -484.41      0.9833
6         668.79    -484.69      0.9834
7         668.79    -484.69      0.9834

Figure 5.10 Adjustment of Ni concentrations (ng kg$^{-1}$) in the water column (data from Bruland, 1980) with the advection-diffusion model of Craig (1969).

Let us calculate parameters for a non-linear fit used in Chapter 8 as an example of the Matano interface technique. In order to determine the diffusion coefficient of Ce in apatite, Iqdari and Velde (unpub. data, 1992) kept natural apatite in CeCl$_2$ at different temperatures for variable durations. In one run, the sample was kept for 15 days at 1100°C and Ce concentrations $C_{\mathrm{Ce}}$ measured from the surface inwards along the c-axis with $X$ being the distance to the surface. Table 5.14 and Figure 5.11 give the analytical results. It is found that, for the sake of Matano integration, these results could be fitted with a six-parameter equation

$$X = \frac{x_2}{C_{\mathrm{Ce}} - x_1} + \frac{x_4}{C_{\mathrm{Ce}} - x_3} + x_5 + x_6 C_{\mathrm{Ce}}$$


Table 5.14. Ce concentrations (%) in apatite immersed in CeCl$_2$ for 15 days at 1100°C as a function of the distance $X$ (µm) to the interface (Iqdari and Velde, unpub. data, 1992).

X     C_Ce      X     C_Ce      X     C_Ce
 0    1.864     25    0.25      50    0.043
 5    1.832     30    0.152     55    0.012
10    1.638     35    0.085     60    0.013
15    1.227     40    0.078     65    0.012
20    0.6       45    0.046     70    0.013

Figure 5.11 Iqdari and Velde's (unpub. data) Ce diffusion experiment on apatite. Adjustment of the distance to the mineral surface as a function of Ce concentrations (horizontal axis: Ce, %).

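Find the vector $x = (x_1, x_2, \ldots, x_6)^{\mathrm T}$ of unknown parameters using $[0, 2, 2, 2, 20, -20]^{\mathrm T}$ as the initial estimate. In this case, the vector $y$ is built out of the $X$ values (0, 5, ..., 70) while each equation differs from the other by the value $C_i^{\mathrm{Ce}}$ of the concentration. The $i$th row of the $15 \times 6$ matrix $A$ of partial derivatives is

$$\frac{\partial X}{\partial x_1} = \frac{x_2}{(C_i^{\mathrm{Ce}} - x_1)^2}, \quad \frac{\partial X}{\partial x_2} = \frac{1}{C_i^{\mathrm{Ce}} - x_1}, \quad \frac{\partial X}{\partial x_3} = \frac{x_4}{(C_i^{\mathrm{Ce}} - x_3)^2}, \quad \frac{\partial X}{\partial x_4} = \frac{1}{C_i^{\mathrm{Ce}} - x_3}, \quad \frac{\partial X}{\partial x_5} = 1, \quad \frac{\partial X}{\partial x_6} = C_i^{\mathrm{Ce}}$$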


Table 5.15. Refinement of the fit parameters used to parameterize the Ce diffusion profile of Figure 5.11 up to iteration 12.

k     x1         x2       x3       x4        x5       x6
0      0         2        2        2         20      -20
1     -0.0061    1.499    1.894   -0.2042    21.00    -6.163
2     -0.0190    1.840    1.870    0.5083    19.78    -4.973
3     -0.0353    2.347    1.873    0.2791    18.21    -4.500
...
11    -0.0500    2.878    1.932    0.8753    16.98    -2.872
12    -0.0500    2.878    1.932    0.8792    16.98    -2.864

Inserting the initial guess into these expressions gives the first estimate of $A^0$ as

$$A^0 = \begin{bmatrix} 0.6 & 0.537 & 108.132 & -7.3529 & 1 & 1.8640 \\ 0.6 & 0.546 & 70.862 & -5.9524 & 1 & 1.8320 \\ 0.7 & 0.611 & 15.262 & -2.7624 & 1 & 1.6380 \\ 1.3 & 0.815 & 3.347 & -1.2937 & 1 & 1.2270 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 13\,888.9 & 83.333 & 0.506 & -0.5030 & 1 & 0.0120 \\ 11\,834.3 & 76.923 & 0.507 & -0.5033 & 1 & 0.0130 \end{bmatrix}$$

and $f(x^0)$ as

$$f(x^0) = [-30.91, -27.45, -17.06, -5.50, \ldots, 185.42, 172.58]^{\mathrm T}$$

The results of the first 12 iterations are listed in Table 5.15. Although total convergence is not achieved after 12 iterations, the fit provided by the parameters at this stage is quite acceptable (Figure 5.11) and the final form of this empirical least-square equation becomes

$$X = \frac{2.878}{C_{\mathrm{Ce}} + 0.05} + \frac{0.879}{C_{\mathrm{Ce}} - 1.93} + 17 - 2.86\,C_{\mathrm{Ce}}$$

5.3 Constrained least-squares

5.3.1 Linear constraints: the closure condition

Since many geochemical units are concentrations or fractions which sum up to unity, let us first demonstrate a useful statement. A vector is normalized when its components sum up to one. A normalized vector $y$ of $\mathbb{R}^m$ satisfies the condition

$$J_m^{\mathrm T} y = 1$$

where $J_m$ is the vector $(1, 1, \ldots, 1)^{\mathrm T}$. Given a matrix $A_{m \times n}$ with normalized column-vectors, i.e., satisfying

$$J_m^{\mathrm T} A = J_n^{\mathrm T} \tag{5.3.2}$$

then any vector $x_n$ such as

$$y = Ax \tag{5.3.3}$$

is also normalized. This can be shown by pre-multiplying the last equality by $J_m^{\mathrm T}$

$$J_m^{\mathrm T} y = J_m^{\mathrm T} A x = J_n^{\mathrm T} x = 1 \tag{5.3.4}$$

In most cases of interest, however, the system represented by equation (5.3.3) is overdetermined and we must enforce the closure condition with a different method. Let us return to a standard mass-balance least-square problem, such as, for instance, calculating the mineral abundances from the whole-rock and mineral chemical compositions. If $x_1, x_2, \ldots, x_n$ are the mineral fractions, which may be lumped together in a vector $x$, the closure condition

$$\sum_{j=1}^{n} x_j = 1$$

may be written

$$x^{\mathrm T} J = J^{\mathrm T} x = 1 \tag{5.3.5}$$

Arranging the whole-rock concentrations for each element $i$ ($i = 1, \ldots, m$) in a vector $y$, and putting the concentration of element $i$ in the phase $j$ at the $i$th row and $j$th column of the matrix $A_{m \times n}$ (mineral matrix), the usual overdetermined system is obtained

$$y = Ax \tag{5.3.6}$$

which has no exact solution. The constrained least-square problem is therefore to find estimates $\hat{x}$ and $\hat{y}$ for which three conditions hold: (i) estimates $\hat{x}$ and $\hat{y}$ fit the model equation (5.3.3); (ii) the distance $(y - A\hat{x})^{\mathrm T}(y - A\hat{x})$ between $y$ and $\hat{y}$ is minimum; and (iii) the constraint equation (5.3.5) is obeyed. Let us define the Lagrange multiplier as $-2\lambda$ and form the function $c^2$ that we want to be minimum

$$c^2 = (y - A\hat{x})^{\mathrm T}(y - A\hat{x}) - 2\lambda(\hat{x}^{\mathrm T}J - 1) \tag{5.3.7}$$


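Since each term in $c^2$ is a scalar and therefore symmetric, equation (5.3.7) may be rewritten as

$$c^2 = y^{\mathrm T}y - 2\hat{x}^{\mathrm T}A^{\mathrm T}y + \hat{x}^{\mathrm T}A^{\mathrm T}A\hat{x} - 2\lambda(\hat{x}^{\mathrm T}J - 1)$$

Let us define the differential of the vector $\hat{x} = (x_1, x_2, \ldots, x_n)^{\mathrm T}$ as the vector $\mathrm{d}\hat{x} = (\mathrm{d}x_1, \mathrm{d}x_2, \ldots, \mathrm{d}x_n)^{\mathrm T}$. Differentiating $c^2$ relative to $\hat{x}$ gives

$$\mathrm{d}c^2 = -2\,\mathrm{d}\hat{x}^{\mathrm T}A^{\mathrm T}y + \mathrm{d}\hat{x}^{\mathrm T}A^{\mathrm T}A\hat{x} + \hat{x}^{\mathrm T}A^{\mathrm T}A\,\mathrm{d}\hat{x} - 2\lambda\,\mathrm{d}\hat{x}^{\mathrm T}J = 0$$

The second and third scalar terms of $\mathrm{d}c^2$ are the transpose of each other and are therefore equal, hence

$$\mathrm{d}c^2 = -2\,\mathrm{d}\hat{x}^{\mathrm T}A^{\mathrm T}y + 2\,\mathrm{d}\hat{x}^{\mathrm T}A^{\mathrm T}A\hat{x} - 2\lambda\,\mathrm{d}\hat{x}^{\mathrm T}J$$

or

$$\mathrm{d}c^2 = 2\,\mathrm{d}\hat{x}^{\mathrm T}(A^{\mathrm T}A\hat{x} - A^{\mathrm T}y - \lambda J) = 0 \tag{5.3.8}$$

Clearly, the only non-trivial solution to this equation is

$$A^{\mathrm T}A\hat{x} = A^{\mathrm T}y + \lambda J \tag{5.3.9}$$

and therefore

$$\hat{x} = (A^{\mathrm T}A)^{-1}A^{\mathrm T}y + \lambda(A^{\mathrm T}A)^{-1}J = \hat{x}_0 + \lambda(A^{\mathrm T}A)^{-1}J \tag{5.3.10}$$

where $\hat{x}_0$ is the unconstrained solution $(A^{\mathrm T}A)^{-1}A^{\mathrm T}y$. It is now easy to calculate $\lambda$ in such a way that the constraint is obeyed. Pre-multiplying by $J^{\mathrm T}$, we get

$$J^{\mathrm T}\hat{x} = J^{\mathrm T}(A^{\mathrm T}A)^{-1}A^{\mathrm T}y + \lambda J^{\mathrm T}(A^{\mathrm T}A)^{-1}J = 1$$

or, since each term in the equation is a scalar

$$\lambda = \frac{1 - J^{\mathrm T}\hat{x}_0}{J^{\mathrm T}(A^{\mathrm T}A)^{-1}J}$$

Note that the denominator is simply the sum of all the terms in the matrix $(A^{\mathrm T}A)^{-1}$. Inserting this expression of $\lambda$ into the solution for $\hat{x}$, we get

$$\hat{x} = \hat{x}_0 + \frac{1 - J^{\mathrm T}\hat{x}_0}{J^{\mathrm T}(A^{\mathrm T}A)^{-1}J}\,(A^{\mathrm T}A)^{-1}J$$

a rather cumbersome expression which nevertheless may turn out to be useful to compute the error matrix on the solution using methods described later in this chapter.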
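The constrained solution takes only a few lines of Python with NumPy. The sketch below is an illustration, not the author's program: the function name `closure_lsq` is hypothetical, and the numbers are the komatiite composition and normative minerals of Table 5.16, used in the worked example that follows. Linear systems are solved with `numpy.linalg.solve` rather than by forming $(A^{\mathrm T}A)^{-1}$ explicitly.

```python
import numpy as np

# Table 5.16 (weight per cent): rows SiO2, TiO2, Al2O3, FeO, MgO, CaO, Na2O
y = np.array([42.0, 0.3, 8.0, 6.5, 22.0, 8.0, 0.5]) / 100.0   # liquid
A = np.array([[41.9, 54.6, 41.5],      # columns: ol, cpx, ga
              [0.07, 0.13, 0.11],
              [0.0, 1.9, 18.0],
              [7.77, 2.22, 7.04],
              [48.5, 15.8, 18.1],
              [0.06, 20.6, 6.7],
              [0.0, 1.44, 0.0]]) / 100.0

def closure_lsq(A, y):
    """Least-square solution of y = A x subject to sum(x) = 1."""
    J = np.ones(A.shape[1])
    N = A.T @ A
    x0 = np.linalg.solve(N, A.T @ y)    # unconstrained solution
    v = np.linalg.solve(N, J)           # (A^T A)^-1 J
    lam = (1.0 - J @ x0) / (J @ v)      # Lagrange multiplier
    return x0, lam, x0 + lam * v        # equation (5.3.10)

x0, lam, x = closure_lsq(A, y)
```

The returned `x` sums to one by construction, whatever the data. The correction `lam * v` is largest for the phase whose abundance is least well constrained by the data.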

Table 5.16. Major-element composition (weight percent) of a komatiitic liquid and normative minerals.

         liq     ol      cpx     ga
SiO2     42.0    41.9    54.6    41.5
TiO2      0.3     0.07    0.13    0.11
Al2O3     8       0       1.9    18
FeO       6.5     7.77    2.22    7.04
MgO      22      48.5    15.8    18.1
CaO       8       0.06   20.6     6.7
Na2O      0.5     0       1.44    0

Let us recast the major-element composition of a komatiitic liquid (liq) into virtual minerals olivine (ol), clinopyroxene (cpx), and garnet (ga) whose compositions are listed in Table 5.16. In the usual notation, the liquid column is the vector $y_7$ while the last three columns form the matrix $A_{7 \times 3}$. After the figures are divided by 100, the constrained solution is built using the following intermediate steps. First, we compute in the usual way the unconstrained solution $\hat{x}_0$ as

$$\hat{x}_0 = \begin{bmatrix} 0.210 \\ 0.267 \\ 0.442 \end{bmatrix}$$

This solution happens not to be normalized since

$$J^{\mathrm T}\hat{x}_0 = 0.919$$

We therefore calculate

$$(A^{\mathrm T}A)^{-1}J = \begin{bmatrix} -0.51 \\ -1.67 \\ 6.46 \end{bmatrix}, \quad \text{and} \quad J^{\mathrm T}(A^{\mathrm T}A)^{-1}J = 4.28$$

which gives the value of the Lagrange multiplier $\lambda$ as

$$\lambda = (1 - 0.919)/4.28 = 0.0188$$

Inserting this value into equation (5.3.10) gives the final solution

$$\hat{x} = \begin{bmatrix} 0.210 \\ 0.267 \\ 0.442 \end{bmatrix} + \begin{bmatrix} -0.010 \\ -0.032 \\ 0.122 \end{bmatrix} = \begin{bmatrix} 0.200 \\ 0.235 \\ 0.564 \end{bmatrix}$$

5.3.2 Quadratic constraints: mineral reactions

Let us consider the problem of finding the stoichiometric coefficients of a mineral reaction. $m$ element concentrations have been measured on $n$ mineral phases of a rock ($C_i^j$, $i = 1, \ldots, m$; $j = 1, \ldots, n$) and it is suspected that the phases are not chemically independent. In other words, we can find $n$ numbers $\nu_j$ ($j = 1, \ldots, n$) such that

$$\sum_{j=1}^{n} \nu_j C_i^j = 0 \tag{5.3.13}$$

Obviously, the trivial solution $\nu_j = 0$ ($j = 1, \ldots, n$) does not fit our needs and we must search for solutions as a constrained problem in which the solution vector is of constant, yet arbitrary, length. In other words, we become interested in the vector with some criterion of best direction regardless of its magnitude, which we may conveniently take as unity. Let us lump the $C_i^j$ coefficients into the $m \times n$ matrix $A$ and the $n$ coefficients $\nu_j$ into the vector $x_n$, hence

$$Ax = 0 \tag{5.3.14}$$

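Obviously, for $m \geq n$, such a system has no exact solution, and, as before, the search is restricted to that for an estimate $\hat{x}$ of $x$. As the solution is only approximate, a residual error vector $\varepsilon$ may be defined such that

$$A\hat{x} = \varepsilon \tag{5.3.15}$$

The least-square criterion suggests minimizing the modulus $\varepsilon^{\mathrm T}\varepsilon$ of this error vector subject to the condition that the modulus $\hat{x}^{\mathrm T}\hat{x}$ of the estimate is unity. The problem is therefore to minimize the sum $c^2$ such as

$$c^2 = \hat{x}^{\mathrm T}A^{\mathrm T}A\hat{x} - \lambda(\hat{x}^{\mathrm T}\hat{x} - 1) \tag{5.3.16}$$

where $\lambda$ is a Lagrange multiplier. Differentiating with regard to $\hat{x}$, one gets

$$\mathrm{d}c^2 = \mathrm{d}\hat{x}^{\mathrm T}A^{\mathrm T}A\hat{x} + \hat{x}^{\mathrm T}A^{\mathrm T}A\,\mathrm{d}\hat{x} - \lambda\,\mathrm{d}\hat{x}^{\mathrm T}\hat{x} - \lambda\hat{x}^{\mathrm T}\mathrm{d}\hat{x} = 0$$

or, since each term is a scalar and hence symmetric

$$\mathrm{d}c^2 = 2\,\mathrm{d}\hat{x}^{\mathrm T}(A^{\mathrm T}A\hat{x} - \lambda\hat{x}) = 0 \tag{5.3.17}$$

The solution to the problem is therefore the solution to the eigenvalue equation

$$A^{\mathrm T}A\hat{x} = \lambda\hat{x} \tag{5.3.18}$$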

and hence

$$c^2 = \lambda\hat{x}^{\mathrm T}\hat{x} - \lambda(\hat{x}^{\mathrm T}\hat{x} - 1) = \lambda \tag{5.3.19}$$

Table 5.17. Molar chemical composition of minerals in the assemblage quartz (qz) - muscovite (ms) - K-feldspar (Kf) - sillimanite (sil) - water (w).

         qz    ms    Kf     sil   w
SiO2     1     6     3      1     0
Al2O3    0     3     1/2    1     0
K2O      0     1     1/2    0     0
H2O      0     2     0      0     1

As the matrix $A^{\mathrm T}A$ is positive semi-definite, i.e., it has non-negative eigenvalues, for $c^2$ to be minimum, the solution $\hat{x}$ must be the eigenvector $u_1$ associated with the smallest eigenvalue $\lambda_1$ of this matrix.

Find stoichiometric coefficients for the assemblage quartz (qz)-muscovite (ms)-K-feldspar (Kf)-sillimanite (sil)-water (w) given in the form of the mineral composition matrix of Table 5.17.

An eigencomponent routine confirms that the matrix $A^{\mathrm T}A$ has one eigenvalue equal to zero. With the minerals taken in the order of Table 5.17 (qz, ms, Kf, sil, w), the corresponding eigenvector is $[0.485, +0.2425, -0.485, -0.485, -0.485]^{\mathrm T}$. It is common practice to use integers as stoichiometric coefficients. This can be achieved by dividing each component by the component of smallest modulus (0.2425), which produces the vector $[2, 1, -2, -2, -2]$ corresponding to the mineral reaction

2 quartz + 1 muscovite ↔ 2 K-feldspar + 2 sillimanite + 2 water
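The eigenvector calculation for this assemblage can be reproduced with a few lines of Python and NumPy. This is a minimal sketch, not the routine used by the author: it relies on `numpy.linalg.eigh`, which returns the eigenvalues of a symmetric matrix in ascending order, and the variable names are illustrative.

```python
import numpy as np

# Table 5.17: rows SiO2, Al2O3, K2O, H2O; columns qz, ms, Kf, sil, w
A = np.array([[1.0, 6.0, 3.0, 1.0, 0.0],
              [0.0, 3.0, 0.5, 1.0, 0.0],
              [0.0, 1.0, 0.5, 0.0, 0.0],
              [0.0, 2.0, 0.0, 0.0, 1.0]])

# eigh handles symmetric matrices; eigenvalues come out in ascending order
eigval, eigvec = np.linalg.eigh(A.T @ A)
x = eigvec[:, 0]                      # eigenvector of the smallest eigenvalue

# scale by the component of smallest modulus to obtain integer coefficients
coeff = x / np.abs(x).min()
if coeff[0] < 0:                      # the sign of an eigenvector is arbitrary
    coeff = -coeff

print(np.round(eigval[0], 10), np.round(coeff, 6))
```

Components of one sign are reactants and those of the other sign are products. Multiplying `A` by `coeff` returns a null vector, which is a direct check of the mass balance for each oxide.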

Given the mineral assemblage quartz (qz)-pyroxene (px)-garnet (ga)-plagioclase (pl) compositions given in Table 5.18, discuss the possible reactions between mineral end-members.

The mineralogical matrix of Table 5.18 is full rank ($n = 4$) and, in terms of the phase rule, does not seem to be a reactive assemblage since the rock contains four minerals for five independent (chemical) components. However, breaking down the last three minerals into end-members: pyroxene as enstatite (en) + diopside (di) + jadeite (jd); garnet as pyrope (py) + grossular (gr); and plagioclase as albite (ab) + anorthite (an) gives the new mineralogical matrix reported in Table 5.19. Three (8 - 5) independent mineralogical reactions exist among these end-members. We can calculate the whole set of possible reactions through a calculation of eigencomponents in all the systems with 5 + 1 = 6 end-members. Indeed, any assemblage with six end-members must involve at least one reaction. There are (8 × 7)/2 = 28 ways of discarding two end-members among eight and the same number of possible univariant reactions given in Table 5.20, each reaction being as legitimate as any other. Since only three reactions can be independent, other reactions can be obtained by linear combinations. For instance, reactions 2, 3, 4, 5, 15, 19, 20, ... are identical, 15 and 17 sum up to 18, and so on. Given reliable models of activity in plagioclase, clinopyroxene, and garnet, and thermodynamic data at various temperatures and pressure, we could devise many temperature-pressure estimates although with only three degrees of freedom.

Table 5.18. Molar chemical composition of minerals in the assemblage quartz (qz) - pyroxene (px) - garnet (ga) - plagioclase (pl).

         qz    px       ga     pl
SiO2     1     2        3      2.5
Al2O3    0     0.125    1      0.75
MgO      0     1        1.2    0
CaO      0     0.5      1.8    0.5
Na2O     0     0.125    0      0.25

Table 5.19. Molar chemical composition of mineral end-members in the assemblage of Table 5.18. See text for abbreviations.

         qz    jd     en    di    py    gr    ab     an
SiO2     1     2      2     2     3     3     3      2
Al2O3    0     1/2    0     0     1     1     1/2    1
MgO      0     0      2     1     3     0     0      0
CaO      0     0      0     1     0     3     0      1
Na2O     0     1/2    0     0     0     0     1/2    0

5.4 Handling errors in least-square problems

As soon as observations are considered as samples of random variables, we must redefine the concepts of distance and projection. Let us consider in three-dimensional space a vector $y$ of one observation of three random variables $Y_1$, $Y_2$, and $Y_3$ with its density of probability function $f_Y$. The statistical distance $c$ of the vector $y$ to another point $\bar{y}$ can be defined by the non-negative scalar $c^2$, which has already been met a few times, e.g., in equations (5.2.1) and (5.3.7), and such that

$$c^2 = -2\ln f_Y(y) + \text{const} \tag{5.4.1}$$

and is a measure of the probability that $y$ is different from $\bar{y}$. The distance of $y$ to the plane


Table 5.20. The complete set of stoichiometric coefficients for the set of end-members described in Table 5.19. These coefficients have been obtained from the eigenvectors associated with the null eigenvalues of the mineralogical matrices formed out of 6 end-members or minerals at a time. No more than three equations from this set can be considered independent.

en

#

qz jd

1 2 3 4 5 6 7 8 9 10 11 12 13 14

0 0 3 -3 -1 -1 0 0 1 1 0 0 -1 -1 0 0 1 1 0 0 0 0 3 -3 0 0 3 -3 -1 0 2 -1 1 -21 0 -3 0 3 0 0 0 3 -3 6 0 -1 -5 0 0 3 -3 1 1 0 0

di py

gr ab an

-i 1 0 0 0 0 1 0 0 0 -1 0 0 0 1 0 0 0 -1 0 -l 1 0 0 -l 1 0 0 -l 0 0 1 1 0 -1 0 -2 -1 0 3 -1 -2 0 3 5/2 4 0 -6 -1 1 0 0 0 0 -1 0

# 15 16 17 18 19 20 21 22 23 24 25 26 27 28

qz jd

en

di py

gr ab an

0 L 0 0 0 0 -1 1 0 I 0 0 0 0 1 [ () 0 0 2 -1 -l 1 () 1L 2 -1 -l 0 -1 0 1L 1L 0 0 0 0 -1 ]L ]L 0 0 0 0 0 -1 1 0 _| ]L () 1 -2 0 1 1 -1 () -1L 1 -2 0 1I I 0 0 0 0 -1 0 * () 3 0 -2 -1 0 3 () ;3 3 0 -2 -1 -3 3 3 () 0 3 -1 -2 0 3 () -:3 0 -3 1 2 3 _3 () () 3 -3 -1 1 0 0 L

defined by the two vectors $a_1$ and $a_2$ is the maximum-likelihood point $\hat{y}$ of the plane, i.e., the point $\hat{y}$ of the plane that makes the function $f_Y(\hat{y})$ maximum.

If $Y_1$, $Y_2$, and $Y_3$ are normally distributed, the constant probability surfaces are ellipsoids centered at $y$ (Figure 5.12) and the statistical projection $\hat{y}$ of $y$ will be defined as the point where the plane is tangent to the innermost probability ellipsoid. Points on the same ellipsoid are by definition at the same statistical distance from $y$. If $S_y$ is the covariance matrix of the vector $y$, the statistical distance $c$ between $y$ and $\hat{y}$ is given by

$$c^2 = (y - \hat{y})^{\mathrm T} S_y^{-1} (y - \hat{y}) \tag{5.4.2}$$
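Equation (5.4.2) is a quadratic form that is straightforward to evaluate numerically. The Python sketch below is illustrative only: the function name `statistical_distance_sq` and the two-dimensional covariance matrix are hypothetical. The system $S_y u = d$ is solved directly, which avoids inverting the covariance matrix.

```python
import numpy as np

def statistical_distance_sq(y, y_ref, S):
    """Squared statistical distance (y - y_ref)^T S^-1 (y - y_ref)."""
    d = np.asarray(y, dtype=float) - np.asarray(y_ref, dtype=float)
    return float(d @ np.linalg.solve(S, d))

# Hypothetical covariance: standard deviations of 2 and 1, no correlation
S = np.array([[4.0, 0.0],
              [0.0, 1.0]])
c2 = statistical_distance_sq([2.0, 1.0], [0.0, 0.0], S)
```

In this example each coordinate lies exactly one standard deviation away from the reference point, so the two contributions to $c^2$ are equal. With an identity covariance matrix the expression reduces to the ordinary squared Euclidean distance.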

Figure 5.12 Statistical projection $\hat{y}$ of the observation vector $y$ onto the plane defined by the vectors $a_1$ and $a_2$. $\hat{y}$ is the point where the plane is tangent to the innermost probability ellipsoid.

5.4.1 A simple illustration: the weighted mean

A vector $x$ of $n$ random variables has been measured $m$ times, the $i$th measurement resulting in an estimate of the mean value $x_i$ and of the covariance matrix $S_i$. A best estimate $\hat{x}$ of the pooled ('weighted') average makes the sum of squared statistical distances to each $x_i$ minimum. The scalar expression

$$c^2 = \sum_{i=1}^{m} (x - x_i)^{\mathrm T} S_i^{-1} (x - x_i) \tag{5.4.3}$$

is therefore minimum for $x = \hat{x}$. Differentiating the scalar $c^2$ for $x$, we obtain

$$2c\,\mathrm{d}c = \sum_{i=1}^{m} \mathrm{d}x^{\mathrm T} S_i^{-1}(x - x_i) + \sum_{i=1}^{m} (x - x_i)^{\mathrm T} S_i^{-1}\,\mathrm{d}x = 2\sum_{i=1}^{m} \mathrm{d}x^{\mathrm T} S_i^{-1}(x - x_i)$$

or, at minimum

$$\sum_{i=1}^{m} S_i^{-1}\hat{x} = \sum_{i=1}^{m} S_i^{-1} x_i \tag{5.4.4}$$

The solution $\hat{x}$ is given by

$$\hat{x} = \left(\sum_{i=1}^{m} S_i^{-1}\right)^{-1} \sum_{i=1}^{m} S_i^{-1} x_i \tag{5.4.5}$$

The covariance (standard error) matrix of $\hat{x}$ is given by

$$S_{\hat{x}} = \left(\sum_{i=1}^{m} S_i^{-1}\right)^{-1} \tag{5.4.6}$$

Defining the weight matrix $W_k$ of the $k$th measurement as

$$W_k = \left(\sum_{i=1}^{m} S_i^{-1}\right)^{-1} S_k^{-1} \tag{5.4.7}$$

we get the compact 'weighted' average

$$\hat{x} = \sum_{i=1}^{m} W_i x_i \tag{5.4.8}$$

As a special case, replacing matrices and vectors by scalars, the case $n = 1$ reduces to the well-known weighted average

$$\hat{x} = \frac{\sum_{i=1}^{m} x_i/s_i^2}{\sum_{i=1}^{m} 1/s_i^2} \tag{5.4.9}$$

and

$$s_{\hat{x}}^2 = \left(\sum_{i=1}^{m} \frac{1}{s_i^2}\right)^{-1} \tag{5.4.10}$$

Table 5.21. Average lead isotope compositions measured over three blocks with standard deviations and correlation coefficients. The last row indicates the weighted mean of the three blocks and its standard deviation and correlation coefficient.

Block #       206Pb/204Pb   s        207Pb/204Pb   s        r
1             18.688        0.007    15.532        0.011    0.908
2             18.695        0.013    15.549        0.015    0.925
3             18.678        0.011    15.522        0.013    0.937
Weighted mean 18.688        0.0051   15.535        0.0068   0.912

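Three blocks of lead isotope measurements on the same lead sample by mass spectrometry have produced the average ratios 206Pb/204Pb and 207Pb/204Pb reported in Table 5.21 together with their standard deviations and correlation coefficients. Calculate the weighted mean and standard error matrix of these three blocks.

From the definition of variance and correlation coefficient and equation (4.2.18), we find that the covariance matrix $S_1$ is

$$S_1 = \begin{bmatrix} 0.007 & 0 \\ 0 & 0.011 \end{bmatrix} \begin{bmatrix} 1 & 0.908 \\ 0.908 & 1 \end{bmatrix} \begin{bmatrix} 0.007 & 0 \\ 0 & 0.011 \end{bmatrix} = 10^{-4} \begin{bmatrix} 0.490 & 0.699 \\ 0.699 & 1.210 \end{bmatrix}$$

Likewise

$$S_2 = 10^{-4} \begin{bmatrix} 1.690 & 1.804 \\ 1.804 & 2.250 \end{bmatrix} \quad \text{and} \quad S_3 = 10^{-4} \begin{bmatrix} 1.210 & 1.340 \\ 1.340 & 1.690 \end{bmatrix}$$

The inverse matrices are calculated as

$$S_1^{-1} = 10^{4} \begin{bmatrix} 11.601 & -6.702 \\ -6.702 & 4.698 \end{bmatrix}, \quad S_2^{-1} = 10^{4} \begin{bmatrix} 4.105 & -3.292 \\ -3.292 & 3.084 \end{bmatrix}, \quad S_3^{-1} = 10^{4} \begin{bmatrix} 6.779 & -5.375 \\ -5.375 & 4.854 \end{bmatrix}$$

and therefore

$$S_1^{-1} + S_2^{-1} + S_3^{-1} = 10^{4} \begin{bmatrix} 22.486 & -15.368 \\ -15.368 & 12.635 \end{bmatrix}$$

The standard error matrix of the weighted average is calculated from equation (5.4.6) as

$$S_{\hat{x}} = 10^{-4} \begin{bmatrix} 0.2637 & 0.3207 \\ 0.3207 & 0.4692 \end{bmatrix}$$

The weight matrices are calculated as

$$W_1 = \begin{bmatrix} 0.9096 & -0.2604 \\ 0.5760 & 0.0551 \end{bmatrix}, \quad W_2 = \begin{bmatrix} 0.0268 & 0.1210 \\ -0.2279 & 0.3913 \end{bmatrix}, \quad W_3 = \begin{bmatrix} 0.0636 & 0.1394 \\ -0.3481 & 0.5536 \end{bmatrix}$$

It can be checked that $W_1$, $W_2$, and $W_3$ sum up to the identity matrix. The weighted average $\hat{x}$ is calculated from equation (5.4.8) as

$$\hat{x} = \begin{bmatrix} 0.9096 & -0.2604 \\ 0.5760 & 0.0551 \end{bmatrix} \begin{bmatrix} 18.688 \\ 15.532 \end{bmatrix} + \begin{bmatrix} 0.0268 & 0.1210 \\ -0.2279 & 0.3913 \end{bmatrix} \begin{bmatrix} 18.695 \\ 15.549 \end{bmatrix} + \begin{bmatrix} 0.0636 & 0.1394 \\ -0.3481 & 0.5536 \end{bmatrix} \begin{bmatrix} 18.678 \\ 15.522 \end{bmatrix}$$

or

$$\hat{x} = \begin{bmatrix} 12.954 \\ 11.619 \end{bmatrix} + \begin{bmatrix} 2.383 \\ 1.824 \end{bmatrix} + \begin{bmatrix} 3.351 \\ 2.093 \end{bmatrix} = \begin{bmatrix} 18.688 \\ 15.535 \end{bmatrix}$$

Due to its smaller uncertainties, the first measurement is dominant. The final results for the weighted average, equation (5.4.5), its standard errors, equation (5.4.6), and the correlation coefficient are reported in the last row of Table 5.21.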
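The worked example above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper `covariance` and the variable names are hypothetical, and the covariance matrices are built from the unrounded products of Table 5.21, so the last digits may differ slightly from the rounded intermediate values in the text.

```python
import numpy as np

def covariance(s1, s2, rho):
    """2x2 covariance matrix from two standard deviations and a correlation."""
    return np.array([[s1**2, rho * s1 * s2],
                     [rho * s1 * s2, s2**2]])

# Table 5.21: block means, standard deviations and correlation coefficients
x_blocks = np.array([[18.688, 15.532],
                     [18.695, 15.549],
                     [18.678, 15.522]])
S_blocks = [covariance(0.007, 0.011, 0.908),
            covariance(0.013, 0.015, 0.925),
            covariance(0.011, 0.013, 0.937)]

S_inv = [np.linalg.inv(S) for S in S_blocks]
S_hat = np.linalg.inv(sum(S_inv))                    # equation (5.4.6)
W = [S_hat @ Si for Si in S_inv]                     # equation (5.4.7)
x_hat = sum(Wk @ xk for Wk, xk in zip(W, x_blocks))  # equation (5.4.8)

sigma = np.sqrt(np.diag(S_hat))                      # standard errors
rho = S_hat[0, 1] / (sigma[0] * sigma[1])            # correlation coefficient
print(x_hat, sigma, rho)
```

The weight matrices are not symmetric, but their sum is the identity matrix, which is a convenient check on the arithmetic.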

5.4.2 Linear least-square systems

We assume that a random variable vector Y of $\Re^m$ (here upper-case is used to indicate not a matrix but an ordered set of m random variables) distributed as a multivariate normal distribution has been measured through an adequate analytical protocol (e.g., CaO concentration, the 87Sr/86Sr ratio, ...). The outcome of this measurement is the data vector $y_m$. Here $y_m$ is the mean of a large number of measurements with expected value $\bar{y}_m$ and has a covariance matrix $S_y$. We know from Chapter 4 that the squared statistical distance $c^2$ between y and $\bar{y}$, such that

$$c^2 = (y - \bar{y})^T S_y^{-1} (y - \bar{y}) \qquad (5.4.11)$$

is then distributed as a chi-squared variable with m degrees of freedom (remember that the vector y has m components). From equation (4.2.41), the smaller the value of $c^2$, the larger the confidence limit 100(1 - α) since

where the value of percentiles increases with (1 - α). As discussed in Section 4.2, the case of a small number of measurements could be handled in a similar way by adopting the appropriate statistics. The solution associated with the maximum probability (Figure 5.12), said to be maximum-likelihood, is given by the unbiased estimates $\hat{x}$ and $\hat{y}$ of $\bar{x}$ and $\bar{y}$ which make the value $\hat{c}^2$ of $c^2$ minimum and equal to

$$\hat{c}^2 = (y - \hat{y})^T S_y^{-1} (y - \hat{y}) \qquad (5.4.13)$$

$\hat{x}$ and $\hat{y}$ also satisfy the model

$$\hat{y} = A\hat{x} \qquad (5.4.14)$$

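$\hat{c}^2$ is known as the weighted sum of squared residuals. We can revert to the standard least-square solution by a change of variable as discussed in Chapter 2. Let us define

$$\psi = S_y^{-1/2} y \qquad (5.4.15)$$

and

$$\hat{\psi} = S_y^{-1/2} \hat{y} \qquad (5.4.16)$$

$\hat{c}^2$ can be written

$$\hat{c}^2 = (\psi - \hat{\psi})^T (\psi - \hat{\psi}) \qquad (5.4.17)$$

Pre-multiplying equation (5.4.14) by $S_y^{-1/2}$, it becomes

$$\hat{\psi} = S_y^{-1/2} \hat{y} = S_y^{-1/2} A \hat{x} = \mathcal{A}\hat{x} \qquad (5.4.18)$$

with the definition of $\mathcal{A}$ being embedded in the last equality. The standard least-square normal equations now apply as

$$\mathcal{A}^T \mathcal{A} \hat{x} = \mathcal{A}^T \psi \qquad (5.4.19)$$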

or, reverting to the initial variables

$$(S_y^{-1/2} A)^T S_y^{-1/2} A \hat{x} = (S_y^{-1/2} A)^T S_y^{-1/2} y \qquad (5.4.20)$$

which is simply equivalent to

$$A^T S_y^{-1} A \hat{x} = A^T S_y^{-1} y \qquad (5.4.21)$$

The least-square solution is

$$\hat{x} = (\mathcal{A}^T \mathcal{A})^{-1} \mathcal{A}^T \psi \qquad (5.4.22)$$

or

$$\hat{x} = (A^T S_y^{-1} A)^{-1} A^T S_y^{-1} y \qquad (5.4.23)$$

and

$$\hat{y} = A (A^T S_y^{-1} A)^{-1} A^T S_y^{-1} y \qquad (5.4.24)$$

with the projector P of $S_y^{-1/2} y$ onto $S_y^{-1/2} \hat{y}$ defined as

$$P = S_y^{-1/2} A (A^T S_y^{-1} A)^{-1} (S_y^{-1/2} A)^T \qquad (5.4.25)$$

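We see that for $S_y = I_m$, the cumbersome equations (5.4.23) and (5.4.24) reduce to the standard least-square solutions. The covariance matrix $S_{\hat{x}}$ on $\hat{x}$ can be obtained through equation (4.3.4)

$$S_{\hat{x}} = [(A^T S_y^{-1} A)^{-1} A^T S_y^{-1}]\, S_y\, [(A^T S_y^{-1} A)^{-1} A^T S_y^{-1}]^T \qquad (5.4.26)$$

or, after simplification

$$S_{\hat{x}} = (A^T S_y^{-1} A)^{-1} \qquad (5.4.27)$$

Again through equations (4.3.4) and (5.4.24), the covariance matrix $S_{\hat{y}}$ on $\hat{y}$ is

$$S_{\hat{y}} = [A (A^T S_y^{-1} A)^{-1} A^T S_y^{-1}]\, S_y\, [A (A^T S_y^{-1} A)^{-1} A^T S_y^{-1}]^T \qquad (5.4.28)$$

and, after simplification

$$S_{\hat{y}} = A (A^T S_y^{-1} A)^{-1} A^T \qquad (5.4.29)$$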

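The n × m covariance matrix cov($\hat{x}$, y) between the model $\hat{x}$ and the observations y can be obtained easily. By linearity

$$\hat{x} - \mathcal{E}(\hat{x}) = (A^T S_y^{-1} A)^{-1} A^T S_y^{-1} [y - \mathcal{E}(y)] \qquad (5.4.30)$$

Multiplying by $[y - \mathcal{E}(y)]^T$ and taking the expectation gives

$$\mathrm{cov}(\hat{x}, y) = (A^T S_y^{-1} A)^{-1} A^T S_y^{-1} S_y = (A^T S_y^{-1} A)^{-1} A^T \qquad (5.4.31)$$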
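Equations (5.4.13) to (5.4.31) translate almost line by line into matrix code. The sketch below is an illustration only: the function name `weighted_least_squares` and the small data set are hypothetical (they do not come from the text), and $S_y^{-1/2}$ is taken as the symmetric inverse square root obtained from an eigen-decomposition, which is one possible choice.

```python
import numpy as np

def weighted_least_squares(A, y, S_y):
    """Least-square solution of y = A x for data with covariance matrix S_y."""
    S_inv = np.linalg.inv(S_y)
    S_x = np.linalg.inv(A.T @ S_inv @ A)           # covariance of x_hat, (5.4.27)
    x_hat = S_x @ A.T @ S_inv @ y                  # solution, (5.4.23)
    y_hat = A @ x_hat                              # adjusted data, (5.4.24)
    S_yhat = A @ S_x @ A.T                         # covariance of y_hat, (5.4.29)
    cov_xy = S_x @ A.T                             # cov(x_hat, y), (5.4.31)
    # Symmetric inverse square root of S_y (assumed choice for S_y^(-1/2))
    vals, vecs = np.linalg.eigh(S_y)
    S_inv_half = vecs @ np.diag(vals**-0.5) @ vecs.T
    P = S_inv_half @ A @ S_x @ A.T @ S_inv_half    # projector, (5.4.25)
    r = y - y_hat
    c2 = r @ S_inv @ r                             # weighted residuals, (5.4.13)
    return x_hat, S_x, y_hat, S_yhat, cov_xy, P, c2

# Hypothetical data: four correlated observations, two unknowns
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.1, 2.9, 5.2, 6.8])
s = np.array([0.1, 0.2, 0.1, 0.2])
R = np.full((4, 4), 0.5)
np.fill_diagonal(R, 1.0)
S_y = np.diag(s) @ R @ np.diag(s)

x_hat, S_x, y_hat, S_yhat, cov_xy, P, c2 = weighted_least_squares(A, y, S_y)
print(x_hat, c2, np.trace(P))
```

Two useful sanity checks are that the trace of P equals the number of unknowns and that P is idempotent (P P = P).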


In order to get the correlation coefficients, this matrix must be pre-multiplied by the inverse of the diagonal matrix having the standard deviations of $\hat{x}$ on the diagonal, and post-multiplied by the inverse of the diagonal matrix having the standard deviations of y on the diagonal.

We will now investigate the sampling properties of the statistic representing the weighted sum of squared residuals $\hat{c}^2$ given by equation (5.4.13). We first observe that the slightly different expression $(\hat{y} - \bar{y})^T S_y^{-1} (y - \hat{y})$ is zero since

$$(\hat{y} - \bar{y})^T S_y^{-1} (y - \hat{y}) = (\hat{y} - \bar{y})^T S_y^{-1/2} S_y^{-1/2} (y - \hat{y}) = (\hat{\psi} - \bar{\psi})^T (\psi - \hat{\psi}) \qquad (5.4.32)$$

where, from the properties of the standard least-square solution, the vector $(\psi - \hat{\psi})$ is orthogonal to the plane containing both $\hat{\psi}$ and $\bar{\psi}$ and therefore orthogonal to $(\hat{\psi} - \bar{\psi})$. Equation (5.4.13) for $\hat{c}^2$ therefore can be rewritten

$$\hat{c}^2 = (y - \bar{y})^T S_y^{-1} (y - \bar{y}) - (\hat{y} - \bar{y})^T S_y^{-1} (\hat{y} - \bar{y})$$

or

$$\hat{c}^2 = (y - \bar{y})^T S_y^{-1} (y - \bar{y}) - (\hat{x} - \bar{x})^T A^T S_y^{-1} A (\hat{x} - \bar{x})$$

and finally

$$\hat{c}^2 = (y - \bar{y})^T S_y^{-1} (y - \bar{y}) - (\hat{x} - \bar{x})^T S_{\hat{x}}^{-1} (\hat{x} - \bar{x}) \qquad (5.4.33)$$

Since the two terms on the right-hand side are approximately distributed as chi-squared variables with m and n degrees of freedom, respectively, $\hat{c}^2$ is therefore distributed as a chi-squared variable with (m - n) degrees of freedom. The expected value of $\hat{c}^2$ is

$$\mathcal{E}(\hat{c}^2) = m - n \qquad (5.4.34)$$

a property which can be shown to hold even when y is not normally distributed (take the expectation of $\hat{c}^2$ and apply the commutativity properties of the trace). A common use is to report the Mean Square of Weighted Deviations (MSWD), i.e., $\hat{c}^2/(m - n)$, whose expectation is 1. The value of the expectation can also be used to scale errors when the covariance matrix $S_y$ of the data vector is unknown. The usual procedure is to assume that $S_y$ can be approximated by

$$S_y \approx \sigma^2 I_m \qquad (5.4.35)$$

where σ is an unknown scalar which is calculated in such a way that $\hat{c}^2$ is actually equal to (m - n). This way of assessing errors is sometimes referred to as error calculation in contrast with the more commendable procedure of error propagation used up to this point. In addition to depriving the user of the capability of testing the model through a test of $\hat{c}^2$ with respect to the physically determined uncertainties, the covariance matrix is assumed to be diagonal (uncorrelated measurements) and absolute uncertainties made equal for all the data. Error calculation should therefore be restricted to problems which do not need critical assessment.

Finally, the assessment of the importance and influence of each observation can be deduced from the 'error-free' case. The matrix of data importance (and a measure of their independence) is still the projector P. The reader can consult Sen and Srivastava (1990) for relevant expressions of their influence.

Table 5.22. Ion beam intensities $I_i$ (mV) at mass i with standard deviations $s(I_i)$ and adjusted intensities $\hat{I}_i$. See Table 5.1 for atomic abundances.

Mass i   I_i    s(I_i)   Î_i
142      207    1.439    205.8
144      62     0.787    61.6
146      43     0.656    42.2
148      26     0.510    25.7
150      22     0.469    21.4

Let us consider the peak-stripping problem solved in Section 5.1.1 when experimental uncertainties are taken into account. Ion probe measurements of the ion current at the masses 142, 144, 146, 148 and 150 are given in Table 5.22 and form the vector I. For the particular setting of counting times, one standard deviation of the mean has been found by previous measurements to equal 10 percent of the square-root of peak height (Poisson statistics). Because noise is largely due to fluctuations of beam intensity, peak heights have also been found to correlate with a correlation coefficient of 0.85. Calculate the total elemental signal in millivolts for Ce, Nd, and Sm, their covariance matrix and discuss whether the linear signal addition is an acceptable hypothesis.

The relevant mass balance equations have been written in Section 5.1.1 and the matrix A is given in Table 5.1. The standard deviations on intensities are calculated, e.g., for mass 142 as $0.1 \times \sqrt{207} = 1.439$, and arranged to form the diagonal matrix S. Then the correlation matrix R is formed out of 1 on the diagonal and 0.85 everywhere else, and, from equation (4.2.18), the covariance matrix $W_I$ is calculated as

$$W_I = S R S$$

which gives

$$W_I = \begin{bmatrix} 2.070 & 0.963 & 0.802 & 0.624 & 0.574 \\ 0.963 & 0.620 & 0.439 & 0.341 & 0.314 \\ 0.802 & 0.439 & 0.430 & 0.284 & 0.261 \\ 0.624 & 0.341 & 0.284 & 0.260 & 0.203 \\ 0.574 & 0.314 & 0.261 & 0.203 & 0.220 \end{bmatrix}$$
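The construction of $W_I$ from standard deviations and a uniform correlation coefficient can be reproduced with a short NumPy sketch. The variable names are illustrative; the intensities are those of Table 5.22.

```python
import numpy as np

# Peak heights (mV) at masses 142, 144, 146, 148, 150 (Table 5.22)
I = np.array([207.0, 62.0, 43.0, 26.0, 22.0])

s = 0.1 * np.sqrt(I)                 # 10 percent of the square root of peak height
S = np.diag(s)                       # diagonal matrix of standard deviations
R = np.full((5, 5), 0.85)            # uniform correlation coefficient
np.fill_diagonal(R, 1.0)

W_I = S @ R @ S                      # covariance matrix of the intensities
print(np.round(W_I, 3))
```

The diagonal of `W_I` holds the variances (1 percent of the peak heights here), and each off-diagonal term is 0.85 times the product of the two standard deviations.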


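We first calculate the covariance matrix of the solution as

$$(A^T W_I^{-1} A)^{-1} = \begin{bmatrix} 46.383 & 8.131 & 11.622 \\ 8.131 & 8.466 & 6.070 \\ 11.622 & 6.070 & 8.821 \end{bmatrix}$$

which gives the propagated standard deviations on $\hat{I}_{Ce}$, $\hat{I}_{Nd}$, and $\hat{I}_{Sm}$ as the vector $[6.81, 2.91, 2.97]^T$. Given the standard-deviation matrix S

$$S = \begin{bmatrix} \sqrt{46.383} & 0 & 0 \\ 0 & \sqrt{8.466} & 0 \\ 0 & 0 & \sqrt{8.821} \end{bmatrix}$$

the correlation matrix ρ between these variables is

$$\rho = S^{-1} (A^T W_I^{-1} A)^{-1} S^{-1} = \begin{bmatrix} 1 & 0.41 & 0.58 \\ 0.41 & 1 & 0.70 \\ 0.58 & 0.70 & 1 \end{bmatrix}$$

The least-square solution $\hat{I}_{Ce}$, $\hat{I}_{Nd}$, and $\hat{I}_{Sm}$ is calculated from equation (5.4.23) as

$$\begin{bmatrix} \hat{I}_{Ce} \\ \hat{I}_{Nd} \\ \hat{I}_{Sm} \end{bmatrix} = \begin{bmatrix} 1250.1 \\ 245.6 \\ 102.1 \end{bmatrix}$$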

Determining the projector P is slightly more tedious ($W_I^{-1/2}$ should be computed through the methods described in Section 2.3) and gives

$$P = \begin{bmatrix} 0.908 & 0.099 & -0.203 & 0.027 & -0.178 \\ 0.099 & 0.776 & 0.377 & 0.146 & -0.025 \\ -0.203 & 0.377 & 0.339 & -0.177 & -0.098 \\ 0.027 & 0.146 & -0.177 & 0.729 & 0.380 \\ -0.178 & -0.025 & -0.098 & 0.380 & 0.248 \end{bmatrix}$$

The diagonal elements of P sum up to three, the number of variables to be determined, and rank the importance of each peak in that determination. The 'adjusted' values that satisfy the model are calculated according to equation (5.4.24) and listed in Table 5.22. The covariance between the three model parameters $\hat{x}$ and the five observations y can be calculated through equation (5.4.31). Their 3 × 5 correlation matrix $R(\hat{x}, y)$ can be shown to be

$$R(\hat{x}, y) = \begin{bmatrix} 0.75 & 0.43 & 0.31 & 0.51 & 0.41 \\ 0.76 & 0.96 & 0.76 & 0.79 & 0.68 \\ 0.69 & 0.73 & 0.54 & 0.89 & 0.71 \end{bmatrix}$$

Larger coefficients show that $\hat{I}_{Nd}$ depends mostly on variations of $I_{144}$ while $\hat{I}_{Sm}$ depends mostly on variations of $I_{148}$. The model itself can be tested against the sum of squared residuals $\hat{c}^2 = 4.01$. If, as a first approximation, we admit that intensities are normally distributed (which may not be too incorrect since all the values seem to be distant from zero by many standard deviations), $\hat{c}^2$ is distributed as a chi-squared variable with 5 - 3 = 2 degrees of freedom. Consulting statistical tables, we find that there is a probability of 0.05 that a chi-squared variable with two degrees of freedom exceeds 5.99, a value much larger than the observed $\hat{c}^2$. We therefore accept at the 95 percent confidence level the hypothesis that the linear signal addition described by the mass balance equations is correct.

5.4.3 Non-linear least-square systems: isochrons

A particular experiment provides observations of n independent variables $X_j$ (j = 1, ..., n) and of a single dependent scalar variable Y which we suspect to be related through a linear relationship such as

$$Y = \sum_{j=1}^{n} \alpha_j X_j + \beta \qquad (5.4.36)$$

where the n $\alpha_j$ and the one β are parameters to be determined by the experiment. m observations are carried out (i = 1, ..., m). For n = 1, we are dealing with a straight line in the (x, y) plane, for n = 2 with a plane in the ($x_1$, $x_2$, y) space, and so on for higher dimensions. Examples of such relationships are countless: Rb-Sr and Sm-Nd isochrons, mixing planes and hyper-planes in multiple elemental or isotopic systems make the best known cases. Kent et al. (1990) list some applications, with emphasis on rare-gas isotopic systems. The variables $x_j$ and y can be either independent, which is nearly the case for Rb-Sr and Sm-Nd isochrons, or strongly correlated as in the Concordia U-Pb plot. The case β = 0 (lines or planes going through the origin) may be derived with little difficulty from the following by setting all the corresponding coefficients to zero. In order to take advantage of a matrix formulation, we define the vector $X_n$ of the n variables $X_j$ and the vector α of the n unknowns $\alpha_j$ and write

$$\alpha^T X_n - Y + \beta = 0 \qquad (5.4.37)$$

We now proceed to m observations. The ith observation provides the estimates $x_{ij}$ of the independent variables $X_j$ and the estimate $y_i$ of the dependent variable Y. The n estimates $x_{ij}$ of the variables $X_j$ provided by this ith observation are lumped together into the vector $x_i$. We assume that the set of the (n + 1) data ($x_i$, $y_i$) associated with the ith observation represent unbiased estimates of the mean ($\bar{x}_i$, $\bar{y}_i$) of a random (n + 1)-vector distributed as a multivariate normal distribution. The unbiased character of the estimates is equivalent to

$$\mathcal{E}(x_{ij}) = \bar{x}_{ij} \qquad (5.4.38)$$

$$\mathcal{E}(y_i) = \bar{y}_i \qquad (5.4.39)$$

$S_i$ is the (n + 1) × (n + 1) covariance matrix of the ith measurement ($x_i$, $y_i$). The jth diagonal term is the variance of $x_{ij}$, while the (n + 1)th diagonal term is the variance of $y_i$. The off-diagonal terms are the corresponding covariance terms. In order to illustrate how the maximum-likelihood expression can be built, let us consider the case n = 1 (only one X), which is the case of a straight line relating X and Y. The expression of $c^2$ is given by

$$c^2 = [(x_1 - \bar{x}_1), (y_1 - \bar{y}_1), \ldots, (x_m - \bar{x}_m), (y_m - \bar{y}_m)] \begin{bmatrix} S_1^{-1} & & \\ & \ddots & \\ & & S_m^{-1} \end{bmatrix} \begin{bmatrix} x_1 - \bar{x}_1 \\ y_1 - \bar{y}_1 \\ \vdots \\ x_m - \bar{x}_m \\ y_m - \bar{y}_m \end{bmatrix} \qquad (5.4.40)$$

where the $S_i^{-1}$ matrices are 2 × 2 blocks on the matrix diagonal. If each observation is the result of a large number of measurements (ideally the mean value of many replicates), $c^2$ should be approximately distributed as a chi-squared variable with 2m degrees of freedom. Off-diagonal blocks are zero because different measurement pairs are supposed to have uncorrelated errors. In this quadratic form, only the terms with matching values of index will be different from zero. The expression of $c^2$ can be written as the sum of the quadratic forms corresponding to each measurement

$$c^2 = \sum_{i=1}^{m} [(x_i - \bar{x}_i), (y_i - \bar{y}_i)]\, S_i^{-1} \begin{bmatrix} x_i - \bar{x}_i \\ y_i - \bar{y}_i \end{bmatrix} \qquad (5.4.41)$$

Generalizing this maximum-likelihood expression to an arbitrary number n of variables gives

$$c^2 = \sum_{i=1}^{m} [(x_i - \bar{x}_i)^T, (y_i - \bar{y}_i)]\, S_i^{-1} \begin{bmatrix} x_i - \bar{x}_i \\ y_i - \bar{y}_i \end{bmatrix} \qquad (5.4.42)$$

Lumping the vector $x_i$ and the scalar $y_i$ together into a single vector $u_i$ of dimension (n + 1), $c^2$ becomes

$$c^2 = \sum_{i=1}^{m} (u_i - \bar{u}_i)^T S_i^{-1} (u_i - \bar{u}_i) \qquad (5.4.43)$$

where the means $\bar{u}_i$ are to satisfy the constraint

$$[\alpha^T, -1]\, \bar{u}_i + \beta = 0 \qquad (5.4.44)$$

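(Note that a scalar behaves as a symmetric matrix.) Because of finite sampling, α and β cannot be evaluated exactly. Instead, we will search for unbiased estimates $\hat{\alpha}$ and $\hat{\beta}$ of α and β together with unbiased estimates $\hat{y}_i$ and $\hat{x}_{ij}$ of $\bar{y}_i$ and $\bar{x}_{ij}$ that satisfy the linear model given by equation (5.4.37) and minimize the maximum-likelihood expression in $\bar{x}_i$ and $\bar{y}_i$. Introducing m Lagrange multipliers $\lambda_i$, one for each linear constraint to be satisfied by each set of variables corresponding to one observation, the hat (^) variables minimize the expression $\gamma^2$ given by

$$\gamma^2 = \sum_{i=1}^{m} (u_i - \bar{u}_i)^T S_i^{-1} (u_i - \bar{u}_i) + 2\sum_{i=1}^{m} \lambda_i \{[\alpha^T, -1]\, \bar{u}_i + \beta\} \qquad (5.4.45)$$

Differentiating $\gamma^2$ with respect to the $\bar{u}_i$, we get

$$d\gamma^2 = -\sum_{i=1}^{m} 2\, d\bar{u}_i^T S_i^{-1} (u_i - \bar{u}_i) + \sum_{i=1}^{m} 2 \lambda_i\, d\bar{u}_i^T \begin{bmatrix} \alpha \\ -1 \end{bmatrix}$$

or, after rearrangement under the same summation sign

$$d\gamma^2 = -2\sum_{i=1}^{m} d\bar{u}_i^T \left\{ S_i^{-1} (u_i - \bar{u}_i) - \lambda_i \begin{bmatrix} \alpha \\ -1 \end{bmatrix} \right\} \qquad (5.4.46)$$

For the sum to be zero regardless of the value taken by the $d\bar{u}_i$, each term in the braces must vanish. Therefore, the values at the minimum, labeled with a hat on the symbol, satisfy

$$S_i^{-1} (u_i - \hat{u}_i) = \lambda_i \begin{bmatrix} \hat{\alpha} \\ -1 \end{bmatrix} \qquad (5.4.47)$$

or

$$\hat{u}_i = u_i - \lambda_i S_i \begin{bmatrix} \hat{\alpha} \\ -1 \end{bmatrix} \qquad (5.4.48)$$

Pre-multiplying the last expression by $[\hat{\alpha}^T, -1]$ and combining with equation (5.4.44) gives

$$[\hat{\alpha}^T, -1]\, \hat{u}_i = [\hat{\alpha}^T, -1]\, u_i - \lambda_i [\hat{\alpha}^T, -1]\, S_i \begin{bmatrix} \hat{\alpha} \\ -1 \end{bmatrix} = -\hat{\beta} \qquad (5.4.49)$$

In order to work with more compact notation, we introduce the weight $w_i$ of the ith observation. $w_i$ is the scalar given for its value at minimum by

$$w_i = [\hat{\alpha}^T, -1]\, S_i \begin{bmatrix} \hat{\alpha} \\ -1 \end{bmatrix} \qquad (5.4.50)$$

Combining equations (5.4.49) and (5.4.50) results in

$$\lambda_i = w_i^{-1} \{[\hat{\alpha}^T, -1]\, u_i + \hat{\beta}\} \qquad (5.4.51)$$

Equation (5.4.48) becomes

$$\hat{u}_i = u_i - w_i^{-1} \{[\hat{\alpha}^T, -1]\, u_i + \hat{\beta}\}\, S_i \begin{bmatrix} \hat{\alpha} \\ -1 \end{bmatrix} \qquad (5.4.52)$$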

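which can be inserted into the expression (5.4.43) of $c^2$ to give its minimum value

$$\hat{c}^2 = \sum_{i=1}^{m} w_i^{-2} \{[\hat{\alpha}^T, -1]\, u_i + \hat{\beta}\}^2\, [\hat{\alpha}^T, -1]\, S_i S_i^{-1} S_i \begin{bmatrix} \hat{\alpha} \\ -1 \end{bmatrix}$$

or after simplification

$$\hat{c}^2 = \sum_{i=1}^{m} w_i^{-1} \{[\hat{\alpha}^T, -1]\, u_i + \hat{\beta}\}^2 \qquad (5.4.53)$$

In a non-matrix form, this expression becomes

$$\hat{c}^2 = \sum_{i=1}^{m} w_i^{-1} \left( y_i - \sum_{j=1}^{n} \hat{\alpha}_j x_{ij} - \hat{\beta} \right)^2 = \sum_{i=1}^{m} w_i^{-1} \delta_i^2 \qquad (5.4.54)$$

with weight $w_i$ defined as

$$w_i = \sum_{j=1}^{n} \sum_{k=1}^{n} \hat{\alpha}_j \hat{\alpha}_k\, \mathrm{cov}(x_{ij}, x_{ik}) - 2\sum_{j=1}^{n} \hat{\alpha}_j\, \mathrm{cov}(x_{ij}, y_i) + \mathrm{var}(y_i) \qquad (5.4.55)$$

and residual $\delta_i$ as

$$\delta_i = [\hat{\alpha}^T, -1]\, u_i + \hat{\beta} = \sum_{j=1}^{n} \hat{\alpha}_j x_{ij} + \hat{\beta} - y_i \qquad (5.4.56)$$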
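For a single observation, the weight (5.4.50), the residual (5.4.56) and the adjusted point (5.4.52) can be evaluated as follows. This is a sketch only: the function name `adjust_observation` is hypothetical, and the numbers are taken from sample 6 of Table 5.23 with the slope and intercept obtained later in this section (Table 5.24).

```python
import numpy as np

def adjust_observation(u, S, alpha, beta):
    """Weight (5.4.50), residual (5.4.56) and adjusted point (5.4.52)."""
    a = np.append(alpha, -1.0)          # the vector [alpha; -1]
    w = a @ S @ a                       # scalar weight w_i
    delta = a @ u + beta                # residual delta_i
    u_hat = u - (delta / w) * (S @ a)   # adjusted (x_i, y_i)
    return w, delta, u_hat

# Sample 6 of Table 5.23: (206Pb/204Pb, 207Pb/204Pb) and its covariance matrix
u = np.array([14.262, 14.923])
S = np.array([[0.014**2, 0.000184],
              [0.000184, 0.015**2]])
w, delta, u_hat = adjust_observation(u, S, np.array([0.21012]), 11.870)
print(w, delta, u_hat)
```

By construction the adjusted point lies exactly on the fitted line, whatever the covariance matrix of the observation.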

Let us define X as the m × n matrix with current element $x_{ij}$ and the vector $y_m$ as the vector with element $y_i$. The m-vector δ of residuals $\delta_i$ is

$$\delta = X\alpha + \beta J_m - y_m \qquad (5.4.57)$$

and differentiates as

$$d\delta = X\,d\alpha + J_m\,d\beta \qquad (5.4.58)$$

$e_i$ being the ith unit vector with (m - 1) null components and the ith element equal to 1, the m × m weight matrix W is defined as

$$W = \sum_{i=1}^{m} e_i w_i e_i^T \qquad (5.4.59)$$

$c^2$ can therefore be written in the compact matrix form

$$c^2 = \delta^T W^{-1} \delta \qquad (5.4.60)$$

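In order to be able to proceed with differentiation, let us partition the covariance matrix of the ith measurement as

$$S_i = \begin{bmatrix} S_i^x & c_i \\ c_i^T & s_i^y \end{bmatrix} \qquad (5.4.61)$$

where $S_i^x$ is the n × n covariance matrix of the n $x_{ij}$ (j = 1, ..., n), $c_i$ the vector of the covariances cov($x_{ij}$, $y_i$), and $s_i^y$ the variance of $y_i$. Using equation (5.4.50), $w_i$ can be expressed as

$$w_i = \alpha^T S_i^x \alpha - 2\alpha^T c_i + s_i^y \qquad (5.4.62)$$

The last equation differentiates as

$$dw_i = 2\,d\alpha^T (S_i^x \alpha - c_i) = 2\,d\alpha^T \varepsilon_i \qquad (5.4.63)$$

where the vector $\varepsilon_i$ defined as

$$\varepsilon_i = S_i^x \alpha - c_i \qquad (5.4.64)$$

differentiates as

$$d\varepsilon_i = S_i^x\, d\alpha \qquad (5.4.65)$$

The differential of $c^2$ is

$$dc^2 = 2\,d\delta^T W^{-1} \delta - \sum_{i=1}^{m} w_i^{-2} \delta_i^2\, dw_i$$

or, after expression (5.4.58) for dδ has been used

$$dc^2 = 2\,d\alpha^T \left[ X^T W^{-1} \delta - \sum_{i=1}^{m} (w_i^{-2} \delta_i^2)\, \varepsilon_i \right] + 2 J_m^T W^{-1} \delta\, d\beta \qquad (5.4.66)$$

Programming being usually easier with matrix forms, we define E as being the m × n matrix made by the m vectors $\varepsilon_i^T$, and $z^{(a,b)}$ an m-vector with $w_i^{-a} \delta_i^b$ in ith position. Equation (5.4.66) can be rearranged as

$$dc^2 = 2\,d\alpha^T \left[ X^T W^{-1} \delta - E^T z^{(2,2)} \right] + 2 J_m^T W^{-1} \delta\, d\beta$$

At the minimum, the partial derivatives of $c^2$ with respect to each variable must be zero, i.e., all the components of the gradient g(α, β) of $c^2$ in the space of the (n + 1) variables α and β must vanish. Canceling the coefficients of $d\alpha^T$ gives the vector of the first n components of g(α, β) through the vector equation

$$[g_1, \ldots, g_n]^T = X^T W^{-1} \delta - \sum_{i=1}^{m} (w_i^{-2} \delta_i^2)\, \varepsilon_i = 0_n \qquad (5.4.67)$$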

while the coefficient of dβ gives the additional relation

$$g_{n+1}(\alpha, \beta) = J_m^T W^{-1} \delta = \sum_{i=1}^{m} w_i^{-1} \delta_i = 0 \qquad (5.4.68)$$

The components of the weighted residual vector $W^{-1}\delta$ therefore sum up to zero. The most natural way of finding the minimum of $c^2$, i.e., of solving the system of equations (5.4.67) and (5.4.68) in the (n + 1) unknowns $\alpha_1, \ldots, \alpha_n$, β, is to use the Newton-Raphson method. In order to implement this method described in Chapter 3, the derivatives of each gradient component $g_i$ (i = 1, ..., n + 1) relative to each unknown are needed. Using the language of optimization, these derivatives are the elements of the (n + 1) × (n + 1) symmetric Hessian matrix H of $c^2$. Let us partition the matrix H as

$$H = \begin{bmatrix} H_{xx} & h_{xy} \\ h_{yx}^T & h_{yy} \end{bmatrix} \qquad (5.4.69)$$

using the n × n submatrix $H_{xx}$, the n-vectors $h_{xy}$ and $h_{yx}$ and the scalar $h_{yy}$. The ijth element of $H_{xx}$ is $\partial g_i/\partial \alpha_j = \partial^2 c^2/\partial \alpha_i \partial \alpha_j$, the ith elements of $h_{xy}$ and $h_{yx}$ are $\partial g_i/\partial \beta$ and $\partial g_{n+1}/\partial \alpha_i$, respectively, both equal to $\partial^2 c^2/\partial \alpha_i \partial \beta$, while $h_{yy}$ is $\partial g_{n+1}/\partial \beta = \partial^2 c^2/\partial \beta^2$. Differentiation of the first term in equation (5.4.67) gives

$$d(X^T W^{-1} \delta) = X^T (dW^{-1})\, \delta + X^T W^{-1} (X\,d\alpha + J_m\,d\beta) \qquad (5.4.70)$$

$e_i^T X$ is the ith row of X so we will refer to it as $x_i^T$ whereas $e_i^T \delta$ is simply $\delta_i$. Calculation of the first term in parentheses on the right-hand side using equation (5.4.63) leads to

$$X^T (dW^{-1})\, \delta = -2\sum_{i=1}^{m} (w_i^{-2} \delta_i)\, x_i \varepsilon_i^T\, d\alpha \qquad (5.4.71)$$

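Differentiation of the last term in equation (5.4.67) gives

$$d\left[\sum_{i=1}^{m} (w_i^{-2} \delta_i^2)\, \varepsilon_i\right] = -4\sum_{i=1}^{m} (w_i^{-3} \delta_i^2)\, \varepsilon_i \varepsilon_i^T\, d\alpha + 2\sum_{i=1}^{m} (w_i^{-2} \delta_i)\, \varepsilon_i e_i^T (X\,d\alpha + J_m\,d\beta) + \sum_{i=1}^{m} (w_i^{-2} \delta_i^2)\, S_i^x\, d\alpha \qquad (5.4.72)$$

where use has been made of equation (5.4.57). The sum of the coefficients of dα in equations (5.4.70) to (5.4.72) are the elements of $H_{xx}$, therefore

$$H_{xx} = X^T W^{-1} X + \sum_{i=1}^{m} \left[ 4 w_i^{-3} \delta_i^2\, \varepsilon_i \varepsilon_i^T - 2 w_i^{-2} \delta_i (x_i \varepsilon_i^T + \varepsilon_i x_i^T) - w_i^{-2} \delta_i^2\, S_i^x \right]$$

or, in a compact form

$$H_{xx} = X^T W^{-1} X + 4 E^T Z^{(3,2)} E - 2\left[E^T Z^{(2,1)} X + X^T Z^{(2,1)} E\right] - \sum_{i=1}^{m} (w_i^{-2} \delta_i^2)\, S_i^x \qquad (5.4.73)$$

where $Z^{(a,b)}$ is an m × m diagonal matrix with $w_i^{-a} \delta_i^b$ in ith position. Because $e_i^T J_m$ is equal to one, the coefficients of dβ give the vector $h_{xy}$ as

$$h_{xy} = X^T W^{-1} J_m - 2\sum_{i=1}^{m} (w_i^{-2} \delta_i)\, \varepsilon_i \qquad (5.4.74)$$

Differentiation of $g_{n+1}(\alpha, \beta)$ in equation (5.4.68) is likewise

$$d(J_m^T W^{-1} \delta) = J_m^T (dW^{-1})\, \delta + J_m^T W^{-1} (X\,d\alpha + J_m\,d\beta) \qquad (5.4.75)$$

For the first term on the right-hand side, we get

$$J_m^T (dW^{-1})\, \delta = -2\sum_{i=1}^{m} w_i^{-2} \delta_i\, \varepsilon_i^T\, d\alpha \qquad (5.4.76)$$

The sum of the coefficients of dα in equations (5.4.75) and (5.4.76) are the elements of $h_{yx}$ which are, as expected, the transpose of $h_{xy}$ given by equation (5.4.74) while the coefficient of dβ in equation (5.4.75) gives the scalar $h_{yy}$

$$h_{yy} = J_m^T W^{-1} J_m \qquad (5.4.77)$$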

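The Newton-Raphson scheme prescribes the updating formula

$$H^{(k)} \begin{bmatrix} \alpha^{(k+1)} - \alpha^{(k)} \\ \beta^{(k+1)} - \beta^{(k)} \end{bmatrix} = -g^{(k)} \qquad (5.4.78)$$

or

$$\begin{bmatrix} \alpha \\ \beta \end{bmatrix}^{(k+1)} = \begin{bmatrix} \alpha \\ \beta \end{bmatrix}^{(k)} - [H^{(k)}]^{-1} g^{(k)} \qquad (5.4.79)$$

In the vicinity of the minimum, H should be positive-definite. This may not be the case everywhere, in which case there is a small but real danger of iterating towards a saddle instead of the minimum. It is therefore highly advisable, especially when the data scatter about the best-fit straight line, plane, or hyper-plane, to use the best possible initial estimate. Most commonly, one of the linear estimates (Section 5.1) will be good enough. The estimates (adjusted values) $\hat{x}_i$ and $\hat{y}_i$ of $\bar{x}_i$ and $\bar{y}_i$ are calculated from equations (5.4.52) and (5.4.56) combined as

$$\hat{u}_i = u_i - w_i^{-1} \delta_i\, S_i \begin{bmatrix} \hat{\alpha} \\ -1 \end{bmatrix} \qquad (5.4.80)$$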


It is estimated that, for reasonably good fits, the linear results hold, i.e., c2 is distributed as a chi-squared variable with (m — n) degrees of freedom and

which makes it possible to test whether the data can be fitted by a linear relationship within a given confidence interval. Moreover, if m ≫ n, the contribution $c_i^2$ of each observation to $c^2$ given by

$$c_i^2 = w_i^{-1} \delta_i^2 \qquad (5.4.81)$$

ideally should be equal and distributed as a chi-squared variable with 'nearly' one degree of freedom. Using statistical tables, we decide that the probability of $c_i^2$ to exceed 3.84 is only 5 percent, so each observation with $c_i^2$ much in excess of that value can be considered as unreliable. For a more elaborate discussion of outlier detection, the reader could refer to Kent et al. (1990).

Now, we would like to attach a variance to the estimates of α and β that make $c^2$ minimum. Given the complex and non-linear nature of the gradient equations (5.4.67) and (5.4.68), we assume for simplicity that $\hat{\alpha}$ and $\hat{\beta}$ are normally distributed and resort to linear propagation in order to retrieve an estimate for the covariance matrix of $\hat{\alpha}$ and $\hat{\beta}$. The covariance matrix $S_{\alpha\beta}$ of the vector ($\hat{\alpha}$, $\hat{\beta}$) is

$$S_{\alpha\beta} = \mathcal{E}\left\{ \begin{bmatrix} d\alpha \\ d\beta \end{bmatrix} [d\alpha^T, d\beta] \right\} \qquad (5.4.82)$$

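where dα and dβ are infinitesimal changes about the expected value. As described in Chapter 4, we let X and y together with α and β change slightly about the minimum value. For δ, we get

$$d\delta = X\,d\alpha + J_m\,d\beta + dX\,\alpha - dy \qquad (5.4.83)$$

The differential of g(α, β)

$$dg = H \begin{bmatrix} d\alpha \\ d\beta \end{bmatrix} + \begin{bmatrix} X^T W^{-1} (dX\,\alpha - dy) + dX^T W^{-1} \delta \\ J_m^T W^{-1} (dX\,\alpha - dy) \end{bmatrix}$$

is zero at the minimum, which results in

$$\hat{H} \begin{bmatrix} d\alpha \\ d\beta \end{bmatrix} = -\begin{bmatrix} \hat{X}^T \hat{W}^{-1} (dX\,\hat{\alpha} - dy) + dX^T \hat{W}^{-1} \hat{\delta} \\ J_m^T \hat{W}^{-1} (dX\,\hat{\alpha} - dy) \end{bmatrix} \qquad (5.4.84)$$

where, as usual, the hat (^) refers to values at minimum that should be treated as constants. The transpose expression is given by

$$[d\alpha^T, d\beta]\, \hat{H}^T = -\left[ (dX\,\hat{\alpha} - dy)^T \hat{W}^{-1} \hat{X} + \hat{\delta}^T \hat{W}^{-1} dX,\ (dX\,\hat{\alpha} - dy)^T \hat{W}^{-1} J_m \right] \qquad (5.4.85)$$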

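In order to obtain $S_{\alpha\beta}$, we will multiply equation (5.4.84) by equation (5.4.85) and take the expected value. What the products mean must be spelled out in a little more detail. The ikth element of the m × m matrix $\mathcal{E}\{[dX\,\hat{\alpha} - dy][dX\,\hat{\alpha} - dy]^T\}$ is

$$\mathcal{E}\left\{ \left( \sum_{j=1}^{n} dx_{ij} \hat{\alpha}_j - dy_i \right) \left( \sum_{j=1}^{n} dx_{kj} \hat{\alpha}_j - dy_k \right) \right\}$$

and, since different measurements were assumed to be unrelated, it is zero for i ≠ k. The ith diagonal element is simply $w_i$ and the matrix

$$\mathcal{E}\{[dX\,\hat{\alpha} - dy][dX\,\hat{\alpha} - dy]^T\} \approx \hat{W} \qquad (5.4.86)$$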

We first observe that 1

£= £

wk-xdkdxkj

The ./7th element of
since the measurements are uncorrelated. Arranging this result in a matrix form gives

Finally, we evaluate the m x n matrix ^[(dXat-dy^d^W'1^. this matrix is

The i/th element of

Uwr'd £ (5.4.88) since expected values vanish for /c / f. The term in brackets is the 7th term of the vector £t = Sfx — ct and the matrix E is therefore a suitable approximation of the matrix g[{dXz-dy){dX* fV~ 1 ^) T ]. Combining equations (5.4.86) to (5.4.88) gives the linear approximation of the matrix fP as

(5.4.89) OJ

0


and the final estimate of S a # as

(5.4.90)

0

i

In the simplest case of a straight line, i.e., for n = 1, all the programs devised over the years to compute isochron parameters and error bars converge toward the same value. As discussed by Kent et al. (1990), most early calculation schemes (York, 1966; York, 1969; McIntyre et al., 1966; Williamson, 1968; Brooks et al., 1972) now look rather awkward and unnecessarily complicated. This situation simply reflects that the need for exhaustive error handling in linear estimation came from geochronology in the first place and not from statistical sciences. However, the output of these early computational schemes should not necessarily be considered incorrect. Similarly, even if not expressed in a full matrix form, Albarede and Provost's (1977) solution to mass-balance equations (β = 0) is a Newton-Raphson scheme in its own right. The matrix-oriented solution which has just been established is based on Kent et al. (1990) for the Newton-Raphson scheme, although the reader is warned about a typo in their equation (2.16). For isochrons, this method gives results indistinguishable from those obtained by Minster et al.'s (1979) implementation of Williamson's scheme. Finally, expressions for the assessment of the influence of individual observations have been given by Kent et al. (1990).

The 206Pb/204Pb and 207Pb/204Pb ratios of 11 Archean granites from Zimbabwe are given in Table 5.23 (courtesy Beatrice Luais) and shown in Figure 5.13. Together with the mean ratio values, the Table shows the in-run standard deviation of the mean of each ratio and of the 207Pb/206Pb ratio. This ratio, being close to unity, is measured with better precision, which introduces a rather strong correlation between the variables x = 206Pb/204Pb and y = 207Pb/204Pb. Check whether these data form a statistically significant alignment. Give the age of the isochron y = αx + β with error limits. In the present case, the vector α reduces to the scalar α.

We first have to calculate the covariance between the 206Pb/204Pb and 207Pb/204Pb ratios out of the variances on 206Pb/204Pb, 207Pb/204Pb and 207Pb/206Pb. Given the ratio r = a/b of two quantities a and b, we first propagate the variance on r from that on a and b. This can be done by taking the log-derivative of the ratio through

$$\frac{dr}{r} = \frac{da}{a} - \frac{db}{b}$$

Squaring and taking the expectation, we get, provided a and b are significantly


Table 5.23. Pb isotope composition of Archean granites from Zimbabwe (courtesy Beatrice Luais). Covariances in the last column are calculated as explained in the text.

Sample #   206Pb/204Pb   s(206Pb/204Pb)   207Pb/204Pb   s(207Pb/204Pb)   s(207Pb/206Pb)   cov(206Pb/204Pb, 207Pb/204Pb)
1          18.073        0.018            15.707        0.016            0.00043          0.000253
2          16.714        0.017            15.341        0.015            0.00046          0.000223
3          33.747        0.034            18.951        0.019            0.00028          0.000567
4          32.376        0.032            18.694        0.019            0.00029          0.000532
5          17.488        0.017            15.576        0.016            0.00045          0.000238
6          14.262        0.014            14.923        0.015            0.00052          0.000184
7          17.579        0.018            15.597        0.016            0.00044          0.000254
8          18.386        0.018            15.712        0.016            0.00043          0.000252
9          15.839        0.016            15.177        0.015            0.00048          0.000210
10         17.398        0.017            15.496        0.015            0.00045          0.000221
11         17.756        0.018            15.552        0.016            0.00044          0.000253

Figure 5.13 206Pb/204Pb and 207Pb/204Pb ratios of Archean granites from Zimbabwe (B. Luais, unpub. data) and the least-square straight-line (isochron). Horizontal axis: 206Pb/204Pb (10 to 35); vertical axis: 207Pb/204Pb (14 to 20). (See text for results.)

different from zero

$$\frac{\mathrm{var}(r)}{r^2} \approx \frac{\mathrm{var}(a)}{a^2} + \frac{\mathrm{var}(b)}{b^2} - 2\,\frac{\mathrm{cov}(a, b)}{ab}$$

which gives the covariance between a and b. Replacing a, b, and r by the appropriate isotopic ratios, it becomes

$$\mathrm{cov}\left(\frac{^{206}Pb}{^{204}Pb}, \frac{^{207}Pb}{^{204}Pb}\right) = \frac{1}{2}\, \frac{^{206}Pb}{^{204}Pb}\, \frac{^{207}Pb}{^{204}Pb} \left[ \frac{\mathrm{var}(^{207}Pb/^{204}Pb)}{(^{207}Pb/^{204}Pb)^2} + \frac{\mathrm{var}(^{206}Pb/^{204}Pb)}{(^{206}Pb/^{204}Pb)^2} - \frac{\mathrm{var}(^{207}Pb/^{206}Pb)}{(^{207}Pb/^{206}Pb)^2} \right]$$

which gives the covariances listed in the last column of Table 5.23. The covariance matrix can now be calculated. For instance $S_1$ is

$$S_1 = \begin{bmatrix} (0.018)^2 & 0.000253 \\ 0.000253 & (0.016)^2 \end{bmatrix}$$

The 11 points of the isochron diagram are drawn in Figure 5.13. Their respective error ellipses (1σ error) could be drawn using the method described in Section 2.4, but errors are too small for the ellipses to be clearly seen. In order to get a good starting value, we could solve the standard unweighted least-square set of 11 equations. Convergence properties are better shown, however, by taking nearly arbitrary starting values, e.g., α = 0.1 and β = 15. Some other choices may not converge, which shows that saddle points of the function $c^2$ distant from the actual minimum do exist. It would be too space-consuming to give every detail of the calculation. For the reader who needs some practice, it may help to have some intermediate results before the iterations start. $\varepsilon_1$ initially equals $0.018^2 \times 0.1 - 0.000253 = -0.000221$. The 11 × 1 matrix E is equal to $0.001 \times [-0.221, -0.194, -0.451, -0.430, -0.209, -0.164, -0.222, -0.219, -0.184, -0.192, -0.221]^T$. The weights $w_i$ are calculated from equation (5.4.50) and the diagonal elements of $W^{-1}$ are equal to [4794, 5456, 3857, 3776, 4731, 5258, 4799, 4787, 5389, 5442, 4794]. The gradient g is initially equal to $[911463, 53218]^T$ and the Hessian matrix H to

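$$H = 10^{7} \times \begin{bmatrix} 2.4826 & 0.1128 \\ 0.1128 & 0.0053 \end{bmatrix}$$

which gives the updated value

$$\begin{bmatrix} \alpha \\ \beta \end{bmatrix}_{new} = \begin{bmatrix} 0.1 \\ 15 \end{bmatrix} - 10^{-3} \times \begin{bmatrix} 0.001174 & -0.02496 \\ -0.02496 & 0.5492 \end{bmatrix} \begin{bmatrix} 911463 \\ 53218 \end{bmatrix} = \begin{bmatrix} 0.3578 \\ 8.5188 \end{bmatrix}$$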
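For readers who want to follow the iterations, the straight-line case (n = 1) of equations (5.4.56), (5.4.62), (5.4.64), (5.4.67), (5.4.68), (5.4.73), (5.4.74) and (5.4.77) can be sketched as below. This is only an illustration under stated assumptions: the function and variable names are hypothetical, the rounded values of Table 5.23 are used as input, and the starting point is the pair α = 0.1, β = 15 quoted above. With these inputs the first update should land close to the values just given.

```python
import numpy as np

# Table 5.23: x = 206Pb/204Pb, y = 207Pb/204Pb, standard deviations, covariances
x = np.array([18.073, 16.714, 33.747, 32.376, 17.488, 14.262,
              17.579, 18.386, 15.839, 17.398, 17.756])
sx = np.array([0.018, 0.017, 0.034, 0.032, 0.017, 0.014,
               0.018, 0.018, 0.016, 0.017, 0.018])
y = np.array([15.707, 15.341, 18.951, 18.694, 15.576, 14.923,
              15.597, 15.712, 15.177, 15.496, 15.552])
sy = np.array([0.016, 0.015, 0.019, 0.019, 0.016, 0.015,
               0.016, 0.016, 0.015, 0.015, 0.016])
cxy = 1e-6 * np.array([253, 223, 567, 532, 238, 184, 254, 252, 210, 221, 253])

def c2_gradient_hessian(a, b):
    """c2, gradient (5.4.67)-(5.4.68) and Hessian (5.4.73)-(5.4.77) for n = 1."""
    w = a**2 * sx**2 - 2.0 * a * cxy + sy**2      # weights, (5.4.62)
    d = a * x + b - y                             # residuals, (5.4.56)
    e = a * sx**2 - cxy                           # epsilon_i, (5.4.64)
    c2 = np.sum(d**2 / w)
    g = np.array([np.sum(x * d / w) - np.sum(d**2 * e / w**2),
                  np.sum(d / w)])
    h_xx = (np.sum(x**2 / w) + 4.0 * np.sum(d**2 * e**2 / w**3)
            - 4.0 * np.sum(d * e * x / w**2) - np.sum(d**2 * sx**2 / w**2))
    h_xy = np.sum(x / w) - 2.0 * np.sum(d * e / w**2)
    h_yy = np.sum(1.0 / w)
    return c2, g, np.array([[h_xx, h_xy], [h_xy, h_yy]])

a, b = 0.1, 15.0                                  # starting values used in the text
history = []
for k in range(50):
    c2, g, H = c2_gradient_hessian(a, b)
    history.append((k, a, b, c2))
    step = np.linalg.solve(H, g)                  # Newton-Raphson step
    a, b = a - step[0], b - step[1]
    if np.max(np.abs(step)) < 1e-10:
        break

c2 = c2_gradient_hessian(a, b)[0]
for row in history:
    print("%2d  %.5f  %.4f  %.2f" % row)
print("slope %.5f  intercept %.3f  c2 %.2f" % (a, b, c2))
```

No safeguard against a non-positive-definite Hessian is included, so other starting values may fail to converge, as the text warns.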

Successive iterations converge extremely fast. After the fifth step, the results hardly change (Table 5.24) which, using the Newton method outlined in Section 3.1, indicates an age of T = 2.9065 Ga. The value of $\hat{c}^2$ = 83.45 can be compared with the 95th percentile of a chi-squared variable with 11 - 2 = 9 degrees of freedom (16.9) and seems much too high for the data set to form a statistically acceptable isochron. Table 5.25 shows that point 6 with a $c_6^2$ value of 19.41 is particularly suspect. Further elaboration on the data is a matter of individual appreciation. Leaving

Table 5.24. Iterative refinement of the isochron parameters (slope α and intercept β) for the lead isotope data listed in Table 5.23.

Iter. k   α         β        c²
0         0.1       15       7472
1         0.3578    8.5188   101882
2         0.26112   10.783   8247
3         0.21983   11.677   327.6
4         0.21056   11.861   83.94
5         0.21012   11.870   83.45
6         0.21012   11.870   83.45

Table 5.25. Partial reduced residuals and adjusted values of isotope compositions for the isotopic data of Table 5.23.

Sample i   c_i² = w_i⁻¹ δ_i²   adjusted 206Pb/204Pb   adjusted 207Pb/204Pb
1          9.587               18.028                 15.658
2          10.16               16.760                 15.392
3          0.586               33.765                 18.965
4          2.762               32.339                 18.665
5          6.077               17.455                 15.538
6          19.41               14.211                 14.856
7          6.822               17.541                 15.556
8          2.724               18.410                 15.738
9          2.678               15.861                 15.203
10         5.320               17.431                 15.532
11         14.50               17.811                 15.612

out the data with high $c_i^2$ values may be justified only if supported by field or microscopic evidence. Multiplying the errors by a common factor, in such a way that $\hat{c}^2$ becomes exactly (m - n), is common practice. Doing so, however, conceals the surrender of the critical test of whether the data do or do not define an isochron. We may rather question the use of the in-run covariance matrix as an estimate of the sampling error: in-run errors can be made almost as small as one wishes by increasing the number of ratios measured, which makes little physical sense. We can instead use the covariance matrix of sample replicates, when available, or the covariance matrix of standard replicates (e.g., NBS 981). Estimating propagated errors requires some more effort of matrix programming. E does not actually change very significantly over the iterations. The covariance

Figure 5.14 The 1σ error ellipse for the 207Pb/204Pb vs 206Pb/204Pb isochron of Figure 5.13, drawn in the plane of the slope α and intercept β. A strong anticorrelation is generally expected between errors on the slope α and intercept β.

matrix is

$$S_{\alpha\beta} = 10^{-3} \begin{bmatrix} 0.0003997 & -0.007857 \\ -0.007857 & 0.16903 \end{bmatrix}$$

which gives $s_\alpha$ = 0.000632 or an $s_T \approx (0.000632/0.210121) \times 2.9065 = 0.0087$ Ga, $s_\beta$ = 0.0130 and a correlation coefficient between α and β of -0.956. The negative correlation reflects the property quoted above that the least-square straight line goes through the mean point. The corresponding error ellipse is drawn in Figure 5.14.
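The conversion of the isochron slope into an age can be sketched with a scalar Newton iteration. The decay constants and the 238U/235U ratio below are not given in this section: they are the conventional values (1.55125e-10 and 9.8485e-10 per year, and 137.88) and are stated here as assumptions, as are the function names.

```python
import math

LAMBDA_238 = 1.55125e-10   # decay constant of 238U in 1/yr (assumed value)
LAMBDA_235 = 9.8485e-10    # decay constant of 235U in 1/yr (assumed value)
U_RATIO = 137.88           # present-day 238U/235U (assumed value)

def pb_pb_slope(t):
    """Slope of the 207Pb/204Pb vs 206Pb/204Pb isochron for an age t (years)."""
    return math.expm1(LAMBDA_235 * t) / (U_RATIO * math.expm1(LAMBDA_238 * t))

def pb_pb_age(slope, t0=3.0e9, tol=1.0, max_iter=50):
    """Solve pb_pb_slope(t) = slope for t by Newton's method."""
    t = t0
    for _ in range(max_iter):
        s = pb_pb_slope(t)
        e5 = math.exp(LAMBDA_235 * t)
        e8 = math.exp(LAMBDA_238 * t)
        # derivative of the slope: slope times its logarithmic derivative
        ds_dt = s * (LAMBDA_235 * e5 / (e5 - 1.0) - LAMBDA_238 * e8 / (e8 - 1.0))
        step = (s - slope) / ds_dt
        t -= step
        if abs(step) < tol:           # converged to better than one year
            break
    return t

age = pb_pb_age(0.210121)
print("age = %.4f Ga" % (age / 1e9))
```

With the slope of Table 5.24, this should return an age close to the 2.9065 Ga quoted in the text.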

5.5 Gradient projection and the total inverse

A more general least-square technique applicable to non-linear problems with non-linear constraints has been put forward by Tarantola and Valette (1982) and is described at length in Tarantola (1987). The basic idea is that we rarely start computing a model without having an idea of what the model parameters should be. In other words, data and model parameters should be treated as a unique set of parameters and assigned a unique covariance matrix, the model then being treated as a set of constraints applied to the set as a whole. Let us consider, for instance, in the $(x, y)$ plane a set of three points $(u_j, v_j)$, $j = 1, 2, 3$, through which we want to fit a straight line with equation $v = au + b$. To the set of six 'data' $u_j$ and $v_j$, we will add the model parameters $a$ and $b$. Let us consider, for simplicity, that all the data have unit variance and uncorrelated errors. We can lump the data $u_j$, $v_j$ and the model parameters $a$ and $b$ together into a single vector $x$ with 6 + 2 = 8 components, for which we think an initial guess $x_0$ is acceptable. In an approach similar to the gradient projection method of Rosen (1961), Tarantola and Valette (1982) assume that finding the total inverse solution amounts to finding an estimate $\hat{x}$ of $x$ which minimizes the squared distance to the


Inverse methods

initial guess $(x - x_0)^T(x - x_0)$ with three constraints, such that

$$v_j - a u_j - b = 0 \quad (j = 1, 2, 3) \qquad (5.5.1)$$

or any equivalent expression relating the components of the vector $x$. Let us start with the illustrative case where we seek to minimize the distance $(x - x_0)^T(x - x_0)$ subject to one constraint which we write $g(x) = 0$. We want to minimize the sum $\chi^2$ such as

$$\chi^2 = (x - x_0)^T(x - x_0) - 2\lambda\, g(x) \qquad (5.5.2)$$

where $\lambda$ is a Lagrange multiplier. Differentiating relative to $x$, one gets

$$d\chi^2 = 2\,dx^T\,(x - x_0 - \lambda\,\mathrm{grad}\,g) \qquad (5.5.3)$$

where use is made of the relationship

$$dg = \sum_i \frac{\partial g}{\partial x_i}\,dx_i = (\mathrm{grad}\,g)^T dx = dx^T\,\mathrm{grad}\,g$$

Therefore, $\chi^2$ is minimum for

$$x - x_0 = \lambda\,\mathrm{grad}\,g \qquad (5.5.4)$$

Left-multiplying by $(\mathrm{grad}\,g)^T$, equation (5.5.4) becomes

$$(\mathrm{grad}\,g)^T(x - x_0) = \lambda\,(\mathrm{grad}\,g)^T\,\mathrm{grad}\,g$$

or

$$\lambda = [(\mathrm{grad}\,g)^T\,\mathrm{grad}\,g]^{-1}(\mathrm{grad}\,g)^T(x - x_0) \qquad (5.5.5)$$

Therefore

$$x - x_0 = \mathrm{grad}\,g\,[(\mathrm{grad}\,g)^T\,\mathrm{grad}\,g]^{-1}(\mathrm{grad}\,g)^T(x - x_0)$$

On the right-hand side, we recognize the projector $P$ onto the gradient of $g$

$$P = \mathrm{grad}\,g\,[(\mathrm{grad}\,g)^T\,\mathrm{grad}\,g]^{-1}(\mathrm{grad}\,g)^T$$

More generally, if there are $n$ data points, $n$ Lagrange multipliers $\lambda_j$ will be needed. Let $m$ be the sum of the number of observations and the number of parameters. The general function $\chi^2$ to minimize will be

$$\chi^2 = (x - x_0)^T(x - x_0) - 2\sum_{j=1}^{n}\lambda_j\,g_j(x)$$

Upon differentiation, we obtain

$$d\chi^2 = 2\,dx^T\Big(x - x_0 - \sum_{j=1}^{n}\lambda_j\,\mathrm{grad}\,g_j(x)\Big) \qquad (5.5.6)$$


Let us rewrite the term under the summation sign in a matrix form

$$\sum_{j=1}^{n}\lambda_j\,\mathrm{grad}\,g_j(x) = F\lambda$$

We now lump the $n$ Lagrange multipliers into the vector $\lambda$, and the $g_j(x)$ into a vector $g(x)$. We further define the $m \times n$ matrix $F$ of partial derivatives by its current term $f_{ij}$, which is the derivative of the $j$th constraint with respect to the $i$th parameter, as

$$f_{ij} = \frac{\partial g_j}{\partial x_i}$$

The minimum of $\chi^2$ is for $\lambda$ such that

$$x - x_0 = F\lambda \qquad (5.5.8)$$

Pre-multiplying by $F^T$ gives

$$F^T(x - x_0) = F^TF\lambda$$

and therefore

$$\lambda = (F^TF)^{-1}F^T(x - x_0) \qquad (5.5.9)$$

Inserting that value of $\lambda$ into equation (5.5.8) gives

$$x - x_0 = F(F^TF)^{-1}F^T(x - x_0)$$

where we recognize the projector $F(F^TF)^{-1}F^T$ onto the subspace of constraint gradients. Since the constraints $g(x) = 0$ are to be satisfied at the minimum, the following form is suggested

$$x - x_0 = F(F^TF)^{-1}[F^T(x - x_0) - g(x)] \qquad (5.5.10)$$

Tarantola and Valette (1982) recommend the fixed-point algorithm with the new step being calculated from the old step as

$$x_{\mathrm{new}} = x_0 + F_{\mathrm{old}}(F_{\mathrm{old}}^TF_{\mathrm{old}})^{-1}[F_{\mathrm{old}}^T(x_{\mathrm{old}} - x_0) - g(x_{\mathrm{old}})] \qquad (5.5.11)$$
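For a single linear constraint, one application of equation (5.5.11) reduces to the orthogonal projection of the initial guess onto the constraint hyperplane. The sketch below illustrates this under stated assumptions: the function name `total_inverse_step`, the constraint $x_1 + x_2 - 1 = 0$ and the guess $(3, 4)$ are hypothetical choices made for the illustration, not taken from the text.

```python
import numpy as np

def total_inverse_step(x_old, x0, g, F):
    """One application of equation (5.5.11).

    g : constraint vector g(x_old), shape (n,)
    F : m x n matrix with F[i, j] = d g_j / d x_i evaluated at x_old
    """
    rhs = F.T @ (x_old - x0) - g
    return x0 + F @ np.linalg.solve(F.T @ F, rhs)

# Hypothetical linear constraint x_1 + x_2 - 1 = 0 and initial guess x0 = (3, 4).
x0 = np.array([3.0, 4.0])
F = np.array([[1.0], [1.0]])        # gradient of the single constraint
g = np.array([x0.sum() - 1.0])      # g(x0) = 6

x1 = total_inverse_step(x0, x0, g, F)

# Projector onto the constraint gradient; it is symmetric and idempotent.
P = F @ np.linalg.inv(F.T @ F) @ F.T
print(x1, P)
```

Because the constraint is linear, the first step already satisfies it; non-linear constraints need the step repeated with $F$ and $g$ re-evaluated each time.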

$x_0$ being the best estimate of the solution which may differ from the initial guess. Implementation of a covariance structure into this numerical scheme is described in Tarantola and Valette (1982). In essence, an a priori covariance structure is assumed for the whole set of observations and parameters, which should be tightened by iterative refinements since we are still dealing with a minimum variance estimate. There are some marked differences between the total inverse of Tarantola and Valette (1982) and other methods, such as those described for isochrons in the previous

Table 5.26. Drift correction during ICP-MS measurements: time and mass-138 intensity (in Mcps) data.

j   t_j (10^3 s)   I_j (Mcps)
1   0.095          0.298988
2   0.210          0.270547
3   0.325          0.250778
4   0.405          0.242391
5   0.510          0.235427

section, which allow both the parameters and the observations to 'drift' toward the minimum value of $\chi^2$. Some information is introduced in the total inverse approach that does not belong in the regular least-square solutions. It is extremely common, for instance, that geological evidence is used to make a strong statement about the age of a granite, while experience tells us that initial Sr isotopic ratios rarely fall outside a given range of values. If we try to make a simplistic statement, both the standard least-square and the total inverse methods amount to projecting the observations on the model subspace (as, e.g., in Figure 5.12 with an arbitrary surface instead of the plane). The difference is that, contrary to least-squares, the total inverse includes any a priori knowledge of the parameters in the observations.

When it comes to the covariance structure, however, problems become acute. Total inversion requires that a joint probability distribution is known for observations and parameters. This is usually not a problem for observations. The covariance structure among the parameters of the model becomes more obscure: how do we estimate the a priori correlation coefficient between age and initial Sr ratio in our isochron example without infringing seriously the objectivity of error assessment? When the a priori covariance structure between the observations and the model parameters is estimated, the chances that we actually resort to unsupported and unjustified speculation become immense. Total inversion must be well understood in order for it not to end up as a formal exercise of consistency between a priori and a posteriori estimates.

Total inverse produces cumbersome sets of equations, especially when errors are taken into account. As usual, not considering errors amounts to taking uncorrelated variables with unit variances, but offers attractive illustrative properties. Examples of application abound in the literature, but do not usually give enough detail for the student to use them as practical illustrative references. For these reasons, a simple illustration with no errors will be presented, and readers interested in a complete treatment should refer to Tarantola (1987).

Signal drift during ICP-MS measurement. Because of interaction between the optics and the ion beam, the sensitivity of the measurement degrades with time. Barium signal at mass 138 (in Mcps, million counts per second, noted $I_{138}$) has been found to decrease with time as given by Table 5.26. It has been found that the rate


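of change of the signal obeys the law

$$\frac{dI_{138}}{dt} = -\lambda\,[I_{138}(t) - I_{138}(\infty)], \quad\text{i.e.,}\quad I_{138}(t) = I_{138}(\infty) + [I_{138}(0) - I_{138}(\infty)]\,e^{-\lambda t}$$

where $I_{138}(0)$, $I_{138}(\infty)$, and $\lambda$ represent constants. Given five measurements of $I_{138}$ at five different times, find $I_{138}(0)$, $I_{138}(\infty)$, and $\lambda$ by total inversion.

The observations are five values of the time and five values of $I_{138}$, while the parameters are three. We thus have a vector of $m = 13$ variables, which we will rank as $t_1$, $I_{138}(t_1)$, ..., $t_5$, $I_{138}(t_5)$, $I_{138}(0)$, $I_{138}(\infty)$, and $\lambda$. There are $n = 5$ relationships ($j = 1, \ldots, 5$) relating intensities to time

$$g_j(x) = I_{138}(t_j) - I_{138}(\infty) - [I_{138}(0) - I_{138}(\infty)]\,e^{-\lambda t_j} = 0$$

The elements of the matrix $F$ are calculated from

$$\frac{\partial g_j}{\partial t_k} = 0 \;(k \neq j), \qquad \frac{\partial g_j}{\partial t_j} = \lambda\,[I_{138}(0) - I_{138}(\infty)]\,e^{-\lambda t_j}, \qquad \frac{\partial g_j}{\partial I_{138}(t_j)} = 1$$

$$\frac{\partial g_j}{\partial I_{138}(0)} = -e^{-\lambda t_j}, \qquad \frac{\partial g_j}{\partial I_{138}(\infty)} = -(1 - e^{-\lambda t_j}), \qquad \frac{\partial g_j}{\partial \lambda} = t_j\,[I_{138}(0) - I_{138}(\infty)]\,e^{-\lambda t_j}$$

The best a priori values of $t_j$ and $I_{138}(t_j)$ clearly are the values for the observations themselves, while some a priori values must be chosen for the parameters. Examination of the data suggests that $I_{138}(0) = 0.35$ Mcps and $I_{138}(\infty) = 0.22$ Mcps are probably not unreasonable. Much of the signal decrease takes place over the measurement interval, so taking $\lambda = 1/100\ \mathrm{s^{-1}}$ (i.e., 10 per $10^3$ s in the time units of Table 5.26) is probably not a bad estimate. We form the initial vector of constraints $g_0(x)$ as

$$g_0(x) = [0.0287,\ 0.0346,\ 0.0257,\ 0.0201,\ 0.0146]^T$$

and the initial matrix $F_0$ of derivatives, printed here in transposed form with one row per constraint, as

$$F_0^T = \begin{bmatrix}
0.5028 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -0.3867 & -0.6133 & 0.0048 \\
0 & 0 & 0.1592 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & -0.1225 & -0.8775 & 0.0033 \\
0 & 0 & 0 & 0 & 0.0504 & 1 & 0 & 0 & 0 & 0 & -0.0388 & -0.9612 & 0.0016 \\
0 & 0 & 0 & 0 & 0 & 0 & 0.0226 & 1 & 0 & 0 & -0.0174 & -0.9826 & 0.0009 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.0079 & 1 & -0.0061 & -0.9939 & 0.0004
\end{bmatrix}$$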

Applying equation (5.5.11) repeatedly gives the successive approximations listed in Table 5.27. Convergence of the calculation is extremely rapid and can be monitored


Table 5.27. Iterative refinement of the time-drift parameters by the total inverse method. Times are in units of 10^3 s, intensities in Mcps.

Iteration #      0         1         2         3
t_1              0.0950    0.0895    0.0897    0.0897
I_1              0.2990    0.2880    0.2878    0.2878
t_2              0.2100    0.2075    0.2077    0.2077
I_2              0.2705    0.2550    0.2549    0.2549
t_3              0.3250    0.3247    0.3247    0.3247
I_3              0.2508    0.2449    0.2449    0.2449
t_4              0.4050    0.4050    0.4050    0.4050
I_4              0.2424    0.2424    0.2424    0.2424
t_5              0.5100    0.5100    0.5100    0.5100
I_5              0.2354    0.2411    0.2411    0.2411
I(0)             0.3500    0.3563    0.3567    0.3567
I(∞)             0.2200    0.2404    0.2404    0.2404
1000 λ           10.000    9.9999    9.9999    9.9999
g^T(x) g(x)      3.3 x 10^-3   5.4 x 10^-8   7.4 x 10^-14   1.8 x 10^-19

by checking the squared modulus of the constraint vector g(x) which should be zero at minimum. It can be checked that, as expected from a projection, different starting values would converge to different solutions.
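The drift example can be reproduced numerically along the following lines. This is a minimal sketch assuming NumPy; the function names `constraints` and `derivatives` are illustrative, the exponential-decay form of the constraints is the one inferred from the printed values of $g_0$ and $F_0$, and time is kept in units of $10^3$ s so that the starting value of $\lambda$ is 10.

```python
import numpy as np

# Table 5.26: time in units of 1000 s and mass-138 intensity in Mcps.
t_obs = np.array([0.095, 0.210, 0.325, 0.405, 0.510])
i_obs = np.array([0.298988, 0.270547, 0.250778, 0.242391, 0.235427])

def constraints(x):
    """g_j = I_j - I_inf - (I_0 - I_inf) exp(-lam t_j) for the 13-vector x."""
    t, i = x[0:10:2], x[1:10:2]
    i0, iinf, lam = x[10:]
    return i - iinf - (i0 - iinf) * np.exp(-lam * t)

def derivatives(x):
    """13 x 5 matrix F with F[i, j] = d g_j / d x_i."""
    t = x[0:10:2]
    i0, iinf, lam = x[10:]
    e = np.exp(-lam * t)
    F = np.zeros((13, 5))
    for j in range(5):
        F[2 * j, j] = lam * (i0 - iinf) * e[j]   # d g_j / d t_j
        F[2 * j + 1, j] = 1.0                    # d g_j / d I_j
    F[10] = -e                                   # d g_j / d I_0
    F[11] = -(1.0 - e)                           # d g_j / d I_inf
    F[12] = t * (i0 - iinf) * e                  # d g_j / d lam
    return F

# A priori vector: observations, then I(0) = 0.35, I(inf) = 0.22, lam = 10.
x0 = np.empty(13)
x0[0:10:2], x0[1:10:2], x0[10:] = t_obs, i_obs, [0.35, 0.22, 10.0]

g_start = constraints(x0)
x = x0.copy()
for _ in range(6):                               # fixed-point iteration (5.5.11)
    g, F = constraints(x), derivatives(x)
    x = x0 + F @ np.linalg.solve(F.T @ F, F.T @ (x - x0) - g)

misfit = constraints(x) @ constraints(x)         # squared modulus of g(x)
print(np.round(x[10:], 4), misfit)
```

Changing the a priori parameter values in `x0` is a quick way to see the dependence on the starting point mentioned in the text.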

In some cases, what we expect to retrieve from observations is not a set of parameters but rather some function describing the continuous variation of a property. In order to illustrate the nature of this problem, we can take a geophysical analog: given many arrival times at various stations of seismic waves propagating from the focus of an earthquake, can we deduce the continuous function relating the seismic velocity to depth in the Earth? That sort of problem has generated an immense literature in geophysics (Backus and Gilbert, 1967; Parker, 1977; Tarantola, 1987), but is not of very common concern in geochemistry. Inferring spatial distributions of rare gases in minerals from stepwise heating data (Albarede, 1978) and inferring a cooling history from fission track data (Corrigan, 1991) are among the very few but illustrative geochemical examples. Given the very special character of this problem, it will be developed from an example on outgassing data.

Let us assume a spherical mineral with radius $R$ which initially contains a gas with concentration $C_0(r)$, $r$ being the radial distance from the center. Upon incremental heating, this gas is lost to the extraction line and at the $i$th heating step, when time is $t_i$, the fraction of initial gas remaining is $f(t_i)$. Loss takes place by radial diffusion with temperature-dependent, hence time-dependent, coefficient $\mathcal{D}(t)$. We assume that the total amount of gas held by the mineral at $t = 0$ is equal to one, i.e., that

$$\left(\frac{4}{3}\pi R^3\right)^{-1}\int_0^R 4\pi r^2\,C_0(r)\,dr = 1 \qquad (5.6.1)$$

5.6 The continuous inverse model


We further change to the dimensionless spatial coordinate $x = r/R$ and introduce the dimensionless time $\tau$ such that

$$\tau = \int_0^t \frac{\mathcal{D}(t')}{R^2}\,dt' \qquad (5.6.2)$$

where $t'$ is a dummy integration variable. The radial distribution of concentration is given for the sphere by Carslaw and Jaeger (1959) as

$$C(x, \tau) = \frac{2}{x}\sum_{k=1}^{\infty}\sin(k\pi x)\,\exp(-k^2\pi^2\tau)\int_0^1 x'\sin(k\pi x')\,C_0(x')\,dx' \qquad (5.6.3)$$

An analogous expression can be found for parallel diffusion in Chapter 8. By integration of this expression over the entire sphere volume the fraction $f(\tau)$ of initial gas remaining at dimensionless time $\tau$ becomes

$$f(\tau) = \frac{6}{\pi}\sum_{k=1}^{\infty}\frac{(-1)^{k+1}}{k}\,\exp(-k^2\pi^2\tau)\int_0^1 x\sin(k\pi x)\,C_0(x)\,dx \qquad (5.6.4)$$

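which allows us to formulate the inverse problem: given a set of fractions $f(\tau_i)$ of initial gas remaining when step $i$ is completed and assuming that the $\tau_i$ values can be estimated, derive the initial distribution $C_0(x)$ of the released gas in the heated mineral. We can exchange the order of summation and integration, and rewrite

$$f(\tau_i) = \int_0^1\left[\frac{6}{\pi}\sum_{k=1}^{\infty}\frac{(-1)^{k+1}}{k}\sin(k\pi x)\,\exp(-k^2\pi^2\tau_i)\right]x\,C_0(x)\,dx \qquad (5.6.5)$$

Let us define the kernel function $K(x, \tau)$ as

$$K(x, \tau) = \frac{2\sqrt{3}}{\pi}\sum_{k=1}^{\infty}\frac{(-1)^{k+1}}{k}\sin(k\pi x)\,\exp(-k^2\pi^2\tau) \qquad (5.6.6)$$

The kernel function could also be made simpler by using

$$K(x, \tau) \approx x\sqrt{3} - \sqrt{3}\left[\mathrm{erf}\,\frac{1 + x}{2\sqrt{\tau}} - \mathrm{erf}\,\frac{1 - x}{2\sqrt{\tau}}\right] \qquad (5.6.7)$$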

where the error function erf is defined in Appendix A of Chapter 8 and the approximation is arrived at by using the properties of the theta function (Chapter 8, Appendix B). Let us define the $n \times n$ symmetric matrix $A$ by its current element $a_{ij}$ such that

$$a_{ij} = \int_0^1 K(x, \tau_i)\,K(x, \tau_j)\,dx = \frac{12}{\pi^2}\int_0^1\sum_{k=1}^{\infty}\frac{1}{k^2}\sin^2(k\pi x)\,\exp[-k^2\pi^2(\tau_i + \tau_j)]\,dx$$

Using the result

$$\int_0^1\sin^2(k\pi x)\,dx = \frac{1}{2}$$

derived in Section 2.6, we get the form

$$a_{ij} = \frac{6}{\pi^2}\sum_{k=1}^{\infty}\frac{1}{k^2}\exp[-k^2\pi^2(\tau_i + \tau_j)] \qquad (5.6.8)$$

In Section 8.6, we derive a useful approximation valid up to 85 percent loss which amounts to

$$a_{ij} \approx 1 - 6\sqrt{\frac{\tau_i + \tau_j}{\pi}} + 3(\tau_i + \tau_j) \qquad (5.6.9)$$

Equation (5.6.5) relating the fraction $f(\tau_i)$ remaining at time $\tau_i$ to the distribution $C_0(x)$ can now be written

$$f(\tau_i) = \int_0^1 K(x, \tau_i)\,[x\sqrt{3}\,C_0(x)]\,dx \qquad (5.6.10)$$
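The agreement between the series (5.6.8) and the short-time approximation (5.6.9) can be checked numerically. The sketch below uses only the Python standard library; the function names and the test value $\tau_i + \tau_j = 0.01$ are assumptions chosen for illustration.

```python
import math

def a_series(tau_sum, n_terms=2000):
    """Equation (5.6.8): (6/pi^2) * sum over k of exp(-k^2 pi^2 tau_sum) / k^2."""
    total = sum(math.exp(-(k * math.pi) ** 2 * tau_sum) / k ** 2
                for k in range(1, n_terms + 1))
    return 6.0 / math.pi ** 2 * total

def a_short_time(tau_sum):
    """Equation (5.6.9): 1 - 6 sqrt(tau_sum / pi) + 3 tau_sum."""
    return 1.0 - 6.0 * math.sqrt(tau_sum / math.pi) + 3.0 * tau_sum

tau_sum = 0.01                       # hypothetical value of tau_i + tau_j
exact = a_series(tau_sum)
approx = a_short_time(tau_sum)
print(exact, approx)

# For a large value the two expressions part ways, since (5.6.9) only
# holds while a substantial fraction of the gas is still present.
print(a_series(0.35), a_short_time(0.35))
```

The second comparison is a reminder to check the loss fraction before relying on the approximation for the late heating steps.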

The critical step of the inverse problem is a projection of the unknown function onto the base formed by the kernel functions. This base is not orthogonal and, because the number of kernel functions is finite, cannot describe perfectly the unknown function. For that reason, we write that the unknown function $x\sqrt{3}\,C_0(x)$ is the sum of its projections with component $u_j$ along the $j$th kernel function plus any orthogonal complement labeled $x\sqrt{3}\,C_\perp(x)$ and belonging to the null-space of the kernel functions

$$x\sqrt{3}\,C_0(x) = \sum_{j=1}^{n}u_j\,K(x, \tau_j) + x\sqrt{3}\,C_\perp(x) = u^TK(x) + x\sqrt{3}\,C_\perp(x) \qquad (5.6.11)$$

where the $n$ scalar components $u_j$ have been lumped into the vector $u$ and the $n$ kernel functions $K(x, \tau_j)$ into the functional vector $K(x)$. The functions $x\sqrt{3}\,C_\perp(x)$ belonging to the null-space of the kernel functions, i.e., satisfying the property

$$\int_0^1 K(x, \tau_i)\,[x\sqrt{3}\,C_\perp(x)]\,dx = 0, \quad i = 1, \ldots, n$$

exist whenever the kernel functions are not linearly independent, which is usually the case when the number of observations is large. Assuming for simplicity that the space of the $K(x, \tau_i)$ is full rank, i.e., that $x\sqrt{3}\,C_\perp(x)$ is zero, we write

$$f(\tau_i) = \int_0^1\sum_{j=1}^{n}u_j\,K(x, \tau_j)\,K(x, \tau_i)\,dx = \sum_{j=1}^{n}u_j\int_0^1 K(x, \tau_j)\,K(x, \tau_i)\,dx \qquad (5.6.12)$$

Table 5.28. Ar outgassing data for lunar anorthosite 15415 (Turner, 1972). F is the remaining fraction of 38Ar at temperature T (°C). τ is derived from 37Ar outgassing data because this isotope is artificially produced by a nuclear reaction between fast neutrons and Ca.

Step #   T (°C)   1 − F    ln τ
1        600      0.0061   −12.20
2        700      0.0282   −9.45
3        800      0.107    −6.87
4        900      0.324    −4.52
5        1000     0.611    −2.98
6        1200     0.891    −1.74

or, in matrix notation

$$f = Au \qquad (5.6.13)$$

Combining with equation (5.6.11) gives the final result

$$x\sqrt{3}\,C_0(x) = K^T(x)\,A^{-1}f \qquad (5.6.14)$$

Discussing practical aspects of the inverse model such as stability, resolution, and statistical assessment would require considerably more development of the subject than presented here. For these important topics, the reader can consult the references quoted above and their enclosed bibliography.

Spatial distribution of spallation Ar in the lunar anorthosite 15415 (Albarede, 1978). Retrieve the 38Ar spatial distribution in plagioclase crystals from the stepwise heating data of Turner (1972), assuming spherical minerals of identical grain size. The fractions lost are listed in Table 5.28 together with the $\tau_i$ values deduced from the diffusion of 37Ar, an isotope produced artificially by neutron activation and supposed to be homogeneously distributed in the minerals (see Chapter 8). The high-temperature steps have been ignored because evidence from 39Ar diffusion does not support simple diffusion loss at this temperature.

We first build the matrix $A$. Let us show how to calculate the element $a_{24}$. We compute $\tau_2 + \tau_4 = e^{-9.45} + e^{-4.52} = 0.01097$, then using equation (5.6.9)

$$a_{24} = 1 - 6\sqrt{\frac{0.01097}{\pi}} + 3 \times 0.01097 = 0.67840$$

This calculation is repeated for the whole matrix and inversion of equation (5.6.13)

produces the vector $u$ as

$$u = A^{-1}f = \begin{bmatrix}
0.9893 & 0.9693 & 0.8938 & 0.6794 & 0.3894 & 0.1083 \\
0.9693 & 0.9580 & 0.8902 & 0.6784 & 0.3891 & 0.1083 \\
0.8938 & 0.8902 & 0.8520 & 0.6661 & 0.3848 & 0.1073 \\
0.6794 & 0.6784 & 0.6661 & 0.5658 & 0.3443 & 0.0977 \\
0.3894 & 0.3891 & 0.3848 & 0.3443 & 0.2258 & 0.0685 \\
0.1083 & 0.1083 & 0.1073 & 0.0977 & 0.0685 & 0.0475
\end{bmatrix}^{-1}
\begin{bmatrix} 0.9939 \\ 0.9718 \\ 0.8930 \\ 0.6760 \\ 0.3890 \\ 0.1090 \end{bmatrix}
= \begin{bmatrix} 1.2251 \\ -0.2386 \\ 0.1331 \\ -0.3002 \\ 0.2531 \\ -0.0046 \end{bmatrix}$$

We now compute the vector $K(x)$ for each value of $x$ that we wish, say $x = 0.995$. This vector is made out of six $K(x, \tau_i)$ components. For $K(x, \tau_4)$, the calculation reads

$$K(0.995, e^{-4.52}) = 0.995\sqrt{3} - \sqrt{3}\left[\mathrm{erf}\,\frac{1 + 0.995}{2\sqrt{e^{-4.52}}} - \mathrm{erf}\,\frac{1 - 0.995}{2\sqrt{e^{-4.52}}}\right]$$

or

$$K(0.995, e^{-4.52}) = 1.723 - 1.732 \times (1 - 0.0270) = 0.0383$$

The complete vector $K(0.995)$ is

$$K(0.995) = [1.5261,\ 0.5294,\ 0.1431,\ 0.0383,\ 0.0131,\ 0.0043]^T$$

which enables us to compute the concentration of 38Ar at $x = 0.995$

$$C_0(0.995) = \frac{K^T(0.995)\,u}{0.995\sqrt{3}} = 1.0179$$
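The whole chain of the worked example (matrix $A$, vector $u$, kernel vector, concentration) might be scripted as below. This is a sketch assuming NumPy and the short-time forms (5.6.7) and (5.6.9); the helper name `kernel` is illustrative. Because $A$ is poorly conditioned and $\ln\tau$ is tabulated to two decimals only, the computed $u$ and $C_0$ may differ somewhat from the printed values.

```python
import math
import numpy as np

# Table 5.28: ln(tau) and fraction lost (1 - F) for the six heating steps.
ln_tau = np.array([-12.20, -9.45, -6.87, -4.52, -2.98, -1.74])
f = 1.0 - np.array([0.0061, 0.0282, 0.107, 0.324, 0.611, 0.891])
tau = np.exp(ln_tau)

# Matrix A from the short-time approximation (5.6.9).
s = tau[:, None] + tau[None, :]
A = 1.0 - 6.0 * np.sqrt(s / np.pi) + 3.0 * s

# Equation (5.6.13): solve A u = f rather than forming the inverse.
u = np.linalg.solve(A, f)

def kernel(x, tau_i):
    """Approximate kernel function of equation (5.6.7)."""
    root = 2.0 * math.sqrt(tau_i)
    return math.sqrt(3.0) * (x - (math.erf((1 + x) / root)
                                  - math.erf((1 - x) / root)))

x = 0.995
K = np.array([kernel(x, t) for t in tau])

# Equation (5.6.11) with the null-space part set to zero.
c0 = K @ u / (x * math.sqrt(3.0))
print(np.round(A[1, 3], 4), np.round(K, 4), c0)
```

Looping over a grid of `x` values with the same `u` gives a profile of the kind plotted in Figure 5.15.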

Figure 5.15 Distribution of spallogenic 38Ar in plagioclase crystals from the lunar anorthosite 15415 analyzed by Turner (1972), plotted against radial distance x = r/R. A population of spherical grains with uniform grain size is assumed. The τ_i values were deduced from the diffusion of 37Ar (Albarede, 1978).


Values at different x produce the distribution represented in Figure 5.15. Because of the spherical geometry, more points have been calculated next to the surface. It is difficult to assess in a few words the statistical and physical significance of this profile. However, it is consistent with a nearly homogeneous distribution of spallogenic 38Ar produced during interaction of the cosmic rays with calcium atoms and a subtle loss in recent time suggested by a rim depleted in 38Ar. Further applications, error handling, and discussions may be found in Albarede (1978).

6 Modeling chemical equilibrium

6.1 Introduction

The purpose of this chapter is to outline the simplest methods of arriving at a description of the distribution of species in mixtures of liquids, gases and solids. Homogeneous equilibrium deals with single phase systems, such as electrolyte solutions (e.g., seawater) or gas mixtures (e.g., a volcanic gas). Heterogeneous equilibrium involves coexisting gaseous, liquid and solid phases. Finding the distribution of components ('speciation') in all the phases of any system requires application of rather simple rules (Van Zeggeren and Storey, 1970):

(a) the mass conservation condition: unless radioactive decay is present, atoms and exchangeable particles like electrons must be conserved;
(b) the minimum Gibbs free energy condition: according to the Second Principle, a system must spontaneously evolve towards the lowest attainable energy condition, either stable or metastable;
(c) the phase rule, which tells us for a given choice of components (either atoms, such as O and Na, or particles such as protons and electrons) and for a number of externally imposed constraints (pressure, temperature, oxygen fugacity, ...) how many phases are present.

Condition (a) has been addressed repeatedly in this book, and we will see from different examples how two contrasting approaches to condition (b) may be used, through either the mass action laws or direct minimization of the Gibbs free energy. Condition (c) is a consequence of (a) and (b) and for a thorough discussion the reader is referred to the books by (i) Van Zeggeren and Storey (1970) and Smith and Missen (1982) for general principles; (ii) Morel and Hering (1993) and Michard (1989) for the case of electrolyte solutions.

Let us define the chemical components as the 'building blocks' describing all the possible chemical species present in the system (Morel and Hering, 1993). The chemical components are the minimal set of atoms, species, or ions that may describe the entire attainable stoichiometry of the system. A desirable property is that these components be independent of each other, i.e., that they form an orthonormal set in composition space. By Morel's method, mass balance breaks down in two steps. First, in much the same way as we constructed a mineralogical matrix for a multi-mineral rock, we can build a component matrix for the solution we are dealing with, including all the possible species. Second, the solution recipe is written down. We may think of producing a solution by dissolving a given quantity (the 'recipe') of NaCl and K₂SO₄,


or making a gas mixture by the combustion of hydrocarbon in air. In these cases, species present in the mixture must add up for each component to the total number of components in the recipe. We can think of imposing carbon dioxide pressure in equilibrium with a solution, which amounts to constraining dissolved H₂CO₃. A similar case arises in rocks where oxygen fugacity is buffered by a mineralogical assemblage, e.g., quartz-fayalite-magnetite.

Given a solution with $m$ species and $n$ independent components, the component matrix $B_{m \times n}$ is usually rectangular with $m \ge n$. In the $n$-dimensional component space, there can be no more than $n$ independent components, and some additional $m - n$ independent relationships must exist between species concentrations. These additional relationships are chemical reactions and, in the same way as there is no unique choice of components, there is no unique set of independent reactions. For instance, the following three reactions

2 H₂O + CO₂ ⇔ CH₄ + 2 O₂

are not independent, since the third is simply the sum of the first two reactions, and there is no best choice of two equations. To each equation, we can associate a change in Gibbs free energy $G$ and an equilibrium constant $K$. In the case of non-ideal mixtures, $K$ relates activities for solutions and fugacities for gases. Let us assume the following reaction between the hypothetical species A, B, and C

$$\nu_A A \Leftrightarrow \nu_B B + \nu_C C$$

where the $\nu$ coefficients are stoichiometric coefficients. At equilibrium, the change in Gibbs free energy $\Delta G = \nu_B\mu_B + \nu_C\mu_C - \nu_A\mu_A$, where $\mu$ stands for the Gibbs free energy of each species, is zero. The total Gibbs free energy of the system must therefore be minimized with as many constraints on $\Delta G$ as there are independent reactions. Alternatively, we may take the mass action approach and write

$$\frac{[B]^{\nu_B}[C]^{\nu_C}}{[A]^{\nu_A}} = K(T, P)$$

where $K(T, P)$ is the equilibrium constant and depends on temperature and pressure. Equivalent expressions exist among fugacities for gaseous systems. Finding the speciation in this case amounts to solving simultaneously the $n$ linear conservation equations and the $m - n$ non-linear mass action equations. The case of activity coefficients in solutions is easily but tediously implemented since well-constrained expressions exist, like those produced by the Debye-Hückel theory for dilute solutions or the Pitzer expressions for concentrated solutions (brines). The interested reader may refer to Michard (1989) for a recent and still reasonably simple account. However simple to handle, activity coefficients introduce analytically cumbersome expressions incompatible with the size of a textbook. Real gas theory demands even more complicated developments.


6.2 The Newton-Raphson method applied to solutions

The Newton-Raphson method consists in solving simultaneously the conservation and mass action equations. Because of its simplicity and rather fast convergence, it is well-fitted to sets of non-linear equations in several unknowns, as described in Chapter 3.

6.2.1 Homogeneous equilibrium in solutions

We will follow the layout of the problem as described by Morel and Hering (1993) and use their conventions. A solution may be defined as a solvent, most frequently water, and solutes, which can be neutral species such as O₂ or, more frequently, ions such as Ca²⁺ or OH⁻. Charged particles, such as electrons and protons, are not present in solutions but nevertheless may be handled in the same way as other charged species (Stumm and Morgan, 1981). Solvent concentration overwhelms solute concentrations. For reasons of roundoff errors due to water being the dominant species, a simplification is introduced for dilute solutions (Morel and Hering, 1993). One unknown and one equation are simultaneously eliminated from the set of conservation equations which make the recipe. The unknown species H₂O is expressed as a function of the unknowns OH⁻ and H⁺, which assigns OH⁻ a coefficient of −1 in the H⁺ column, and the OH⁻ conservation equation (first column in Table 6.1) is left out.

The calcium carbonate system at constant Σ(CO₂). Assuming Σ(CO₂) = 2.2 mmol kg⁻¹ and m(Ca²⁺) = 1.2 mmol kg⁻¹, calculate the concentration of Ca²⁺, HCO₃⁻, CO₃²⁻, OH⁻ and H⁺ in the system. Assume the activity coefficients are unity and neglect H₂CO₃. The second dissociation constant of carbonic acid K₂ is 4.68 × 10⁻¹¹ (pK = 10.33) and the water dissociation constant is 10⁻¹⁴. We assume that no precipitation occurs, although this assumption will later be proved to be inadequate.

Using a non-linear system is a complicated way of solving this particular problem, but this example is quite illustrative and can be extended to any number of components. Although ionized atoms like H⁺ and Ca²⁺ are natural components of the dilute solution under consideration, carbon and oxygen do not appear as such in natural systems. Since the group CO₃²⁻ is not destroyed in any reaction it will therefore be taken as the carbon host. The component matrix B is shown in Table 6.1. As explained above, the H₂O row is subtracted from the OH⁻ row, which is left with −1 in the H⁺ column, which produces the new component matrix of Table 6.2.

The system to be solved involves five unknown concentrations which, for sake of illustration, are made equal to the corresponding activities. An identical number of equations must be found that include component conservation plus a number of mass action laws corresponding to the formation of as many species as the excess of species over components. We first write the recipe, i.e., the mass balance for the components, not including the components of water. Calcium mass balance reads

[Ca²⁺] = 1.2 × 10⁻³

Table 6.1. The full component matrix of the calcium carbonate system.

Species    OH⁻   H⁺    Ca²⁺   CO₃²⁻
H₂O        1     1     0      0
H⁺         0     1     0      0
OH⁻        1     0     0      0
Ca²⁺       0     0     1      0
HCO₃⁻      0     1     0      1
CO₃²⁻      0     0     0      1

Table 6.2. The component matrix of the calcium carbonate system in dilute solutions. This matrix can be derived from Table 6.1 by expressing the unknown species H₂O as a function of OH⁻ and H⁺, then removing the OH⁻ conservation equation, i.e., removing the corresponding column.

Species    H⁺    Ca²⁺   CO₃²⁻
H⁺         1     0      0
OH⁻        −1    0      0
Ca²⁺       0     1      0
HCO₃⁻      1     0      1
CO₃²⁻      0     0      1
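The role of the component matrix in the mass balance can be illustrated with a short sketch, assuming NumPy: the total amount of each component is the product of the transposed matrix of Table 6.2 with the vector of species concentrations. The concentrations below are hypothetical placeholders chosen only to show the product, not a solution of the problem.

```python
import numpy as np

species = ["H+", "OH-", "Ca2+", "HCO3-", "CO3^2-"]
components = ["H+", "Ca2+", "CO3^2-"]

# Table 6.2: one row per species, one column per component.
B = np.array([[ 1, 0, 0],
              [-1, 0, 0],
              [ 0, 1, 0],
              [ 1, 0, 1],
              [ 0, 0, 1]])

# Hypothetical species concentrations (mol/kg), for illustration only.
x = np.array([1e-8, 1e-6, 1.2e-3, 1.1e-3, 1.1e-3])

# Total of each component: sum over species of (coefficient x concentration).
totals = B.T @ x
for name, value in zip(components, totals):
    print(f"TOT {name} = {value:.4e}")

# The three components are independent if B has full column rank.
rank = np.linalg.matrix_rank(B)
```

With these placeholder values the calcium and carbonate totals come out as the recipe values, 1.2 and 2.2 mmol/kg.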

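while neglecting H₂CO₃, carbonate mass balance is

$$[\mathrm{HCO_3^-}] + [\mathrm{CO_3^{2-}}] = 2.2 \times 10^{-3}$$

Conservation of H⁺ and OH⁻ amounts to the electroneutrality condition

$$[\mathrm{H^+}] - [\mathrm{OH^-}] + 2[\mathrm{Ca^{2+}}] - [\mathrm{HCO_3^-}] - 2[\mathrm{CO_3^{2-}}] = 0$$

The species forming upon reactions are first the hydroxyl ion

$$[\mathrm{H^+}][\mathrm{OH^-}] = K_w = 10^{-14}$$

then the hydrogenocarbonate anion

$$\frac{[\mathrm{H^+}][\mathrm{CO_3^{2-}}]}{[\mathrm{HCO_3^-}]} = K_2$$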

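Let us build an $x$ vector out of the five unknowns $x_1 = [\mathrm{H^+}]$, $x_2 = [\mathrm{OH^-}]$, $x_3 = [\mathrm{Ca^{2+}}]$, $x_4 = [\mathrm{HCO_3^-}]$, and $x_5 = [\mathrm{CO_3^{2-}}]$.

The five equations to be solved read (the electroneutrality equation is written first)

$$f_1(x) = x_1 - x_2 + 2x_3 - x_4 - 2x_5$$
$$f_2(x) = x_3 - m(\mathrm{Ca^{2+}})$$
$$f_3(x) = x_4 + x_5 - \Sigma(\mathrm{CO_2})$$
$$f_4(x) = x_1x_2 - K_w$$
$$f_5(x) = (x_1x_5/x_4) - K_2$$

Taking the derivatives of these five equations gives the matrix $D$

$$D = \begin{bmatrix}
1 & -1 & 2 & -1 & -2 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 \\
x_2 & x_1 & 0 & 0 & 0 \\
\dfrac{x_5}{x_4} & 0 & 0 & -\dfrac{x_1x_5}{x_4^2} & \dfrac{x_1}{x_4}
\end{bmatrix}$$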

In order to start the iterative calculation, a first estimate must be made. Although a subsequent section will show how to generate such an acceptable startup, the purpose of this exercise is to show how it works in a blind situation, which means that we do not want to be too smart. Let us assume that pH = 8, and split the carbonate component evenly between HCO₃⁻ and CO₃²⁻. [Ca²⁺] cannot be different from the amount present in the solution. We get the initial estimate, labeled with the superscript (0), as

$$x_1^{(0)} = 10^{-8},\quad x_2^{(0)} = 10^{-6},\quad x_3^{(0)} = 0.0012,\quad x_4^{(0)} = 0.0011,\quad x_5^{(0)} = 0.0011$$

which results in the five equations for the initial components of $f^{(0)}(x)$

$$f_1^{(0)}(x) = 10^{-8} - 10^{-6} + 2 \times 0.0012 - 0.0011 - 2 \times 0.0011 = -9.0099 \times 10^{-4}$$
$$f_2^{(0)}(x) = 0.0012 - 0.0012 = 0$$
$$f_3^{(0)}(x) = 0.0011 + 0.0011 - 0.0022 = 0$$
$$f_4^{(0)}(x) = 10^{-8} \times 10^{-6} - 10^{-14} = 0$$
$$f_5^{(0)}(x) = (10^{-8} \times 0.0011/0.0011) - 4.68 \times 10^{-11} = 0.995 \times 10^{-8}$$

Let us define a goodness-of-fit criterion as the modulus of the residual vector given by

$$s = [f(x)^Tf(x)]^{1/2} = 9.01 \times 10^{-4}$$

Inserting numerical values into the expression for matrix $D^{(0)}$, we obtain

$$D^{(0)} = \begin{bmatrix}
1 & -1 & 2 & -1 & -2 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 \\
10^{-6} & 10^{-8} & 0 & 0 & 0 \\
1 & 0 & 0 & -9.091 \times 10^{-6} & 9.091 \times 10^{-6}
\end{bmatrix}$$

and, using MatLab to compute $D^{-1}$, the increment vector $\Delta x$ as

$$\Delta x = -D^{-1}f(x) = \begin{bmatrix} 6.4167 \times 10^{-9} \\ -6.4167 \times 10^{-7} \\ 0 \\ 9.0034 \times 10^{-4} \\ -9.0034 \times 10^{-4} \end{bmatrix}$$

The first updated vector $x$, with its five components labeled with the superscript (1), is obtained as

$$x_1^{(1)} = 10^{-8} + 0.6417 \times 10^{-8} = 1.6417 \times 10^{-8}$$
$$x_2^{(1)} = 10^{-6} - 0.6417 \times 10^{-6} = 0.3583 \times 10^{-6}$$
$$x_3^{(1)} = 0.0012 + 0 = 0.0012$$
$$x_4^{(1)} = 0.0011 + 9.0034 \times 10^{-4} = 2.0003 \times 10^{-3}$$
$$x_5^{(1)} = 0.0011 - 9.0034 \times 10^{-4} = 1.9966 \times 10^{-4}$$

This calculation is repeated for successive estimates of $x$. At the fifth iteration, the solution obtained is

$$x_1^{(5)} = [\mathrm{H^+}] = 5.225 \times 10^{-10}\ (\mathrm{pH} = 9.28)$$
$$x_2^{(5)} = [\mathrm{OH^-}] = 1.914 \times 10^{-5}$$
$$x_3^{(5)} = [\mathrm{Ca^{2+}}] = 0.0012$$
$$x_4^{(5)} = [\mathrm{HCO_3^-}] = 2.0191 \times 10^{-3}$$
$$x_5^{(5)} = [\mathrm{CO_3^{2-}}] = 1.8086 \times 10^{-4}$$

with the goodness-of-fit parameter

$$s = [f(x)^Tf(x)]^{1/2} = 1.75 \times 10^{-\ldots}$$

In this particular case, convergence is extremely fast.
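The Newton-Raphson iteration of this example can be sketched as follows, assuming NumPy in place of the MatLab session mentioned in the text. The function names `residuals` and `jacobian` and the fixed count of ten iterations are illustrative choices; the equations and the starting point are those listed above.

```python
import numpy as np

KW, K2 = 1e-14, 4.68e-11            # water and second carbonic acid constants
M_CA, SUM_CO2 = 1.2e-3, 2.2e-3      # recipe, in mol/kg

def residuals(x):
    h, oh, ca, hco3, co3 = x
    return np.array([h - oh + 2 * ca - hco3 - 2 * co3,   # electroneutrality
                     ca - M_CA,                          # calcium balance
                     hco3 + co3 - SUM_CO2,               # carbonate balance
                     h * oh - KW,                        # water dissociation
                     h * co3 / hco3 - K2])               # HCO3- dissociation

def jacobian(x):
    h, oh, ca, hco3, co3 = x
    return np.array([[1, -1, 2, -1, -2],
                     [0, 0, 1, 0, 0],
                     [0, 0, 0, 1, 1],
                     [oh, h, 0, 0, 0],
                     [co3 / hco3, 0, 0, -h * co3 / hco3 ** 2, h / hco3]])

# Blind initial estimate: pH = 8, carbonate split evenly.
x = np.array([1e-8, 1e-6, 1.2e-3, 1.1e-3, 1.1e-3])
first_step = -np.linalg.solve(jacobian(x), residuals(x))

for _ in range(10):
    x = x - np.linalg.solve(jacobian(x), residuals(x))

ph = -np.log10(x[0])
s = np.linalg.norm(residuals(x))    # goodness-of-fit criterion
print(first_step, x, ph, s)
```

Solving the linear system at each step, rather than inverting $D$ explicitly, is a common numerical preference and does not change the iterates.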


6.2.2 Heterogeneous equilibrium in solutions

A phase distinct from the solution is also present. We will now consider two simple examples of (i) a solution coexisting with a solid phase and (ii) a solution coexisting with a solid phase at a given pressure of a gas that dissolves and dissociates in the solution.

Equilibrium with precipitation. The previous example calculated carbonate speciation admitting unrestricted solubility of all species. Actually, it is easily verified that the calculated calcium and carbonate concentrations exceed calcium carbonate solubility as measured by the solubility product

$$[\mathrm{Ca^{2+}}][\mathrm{CO_3^{2-}}] = K_s = 10^{-8.35} = 4.47 \times 10^{-9}$$

What is then the actual distribution of dissolved species and the amount of calcium carbonate precipitated from the solution?

Let us add the moles of precipitated calcium carbonate as a sixth variable $x_6$, modify the calcium and carbonate conservation equations $f_2(x)$ and $f_3(x)$ in order to account for solid phase contribution, and use the expression of the solubility product as a sixth equation. The six equations to be solved read

$$f_1(x) = x_1 - x_2 + 2x_3 - x_4 - 2x_5$$
$$f_2(x) = x_3 + x_6 - m(\mathrm{Ca^{2+}})$$
$$f_3(x) = x_4 + x_5 + x_6 - \Sigma(\mathrm{CO_2})$$
$$f_4(x) = x_1x_2 - K_w$$
$$f_5(x) = (x_1x_5/x_4) - K_2$$
$$f_6(x) = x_3x_5 - K_s$$

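The matrix of partial derivatives is obtained by taking the derivatives of the previous equations

$$D = \begin{bmatrix}
1 & -1 & 2 & -1 & -2 & 0 \\
0 & 0 & 1 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 1 & 1 \\
x_2 & x_1 & 0 & 0 & 0 & 0 \\
\dfrac{x_5}{x_4} & 0 & 0 & -\dfrac{x_1x_5}{x_4^2} & \dfrac{x_1}{x_4} & 0 \\
0 & 0 & x_5 & 0 & x_3 & 0
\end{bmatrix}$$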

Let us keep the same initial guess as in the previous example and assume that no solid is present

$$x_1^{(0)} = 10^{-8},\quad x_2^{(0)} = 10^{-6},\quad x_3^{(0)} = 0.0012,\quad x_4^{(0)} = 0.0011,\quad x_5^{(0)} = 0.0011,\quad x_6^{(0)} = 0$$

The initial values of $f_1^{(0)}(x)$ to $f_5^{(0)}(x)$ are left unchanged while the value of $f_6^{(0)}(x)$ is

$$f_6^{(0)}(x) = 0.0012 \times 0.0011 - 4.47 \times 10^{-9} \approx 1.316 \times 10^{-6}$$

The matrix $D^{(0)}$ of partial derivatives can be calculated as

$$D^{(0)} = \begin{bmatrix}
1 & -1 & 2 & -1 & -2 & 0 \\
0 & 0 & 1 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 1 & 1 \\
10^{-6} & 10^{-8} & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & -9.091 \times 10^{-6} & 9.091 \times 10^{-6} & 0 \\
0 & 0 & 0.0011 & 0 & 0.0012 & 0
\end{bmatrix}$$

Instead of using the previous goodness-of-fit criterion, we can examine the magnitude of the 'residual' $f(x)$ components after each iteration. After seven iterations, the component of $f(x)$ with the largest modulus is $f_5(x) = -1.5 \times 10^{-18}$, which is small enough for assuming convergence. The final components of vector $x$ include

$$x_3^{(7)} = [\mathrm{Ca^{2+}}] = 1.005 \times 10^{-3}$$
$$x_4^{(7)} = [\mathrm{HCO_3^-}] = 2.000 \times 10^{-3}$$
$$x_5^{(7)} = [\mathrm{CO_3^{2-}}] = 4.448 \times 10^{-6}$$
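The six-equation system with precipitation can be iterated in the same way. The following is a sketch assuming NumPy, with illustrative function names and a fixed count of fifteen iterations; it reuses the equations, constants, and starting point given in the text, and also reports the pH and the amount of solid, which follow from the converged vector.

```python
import numpy as np

KW, K2, KS = 1e-14, 4.68e-11, 4.47e-9
M_CA, SUM_CO2 = 1.2e-3, 2.2e-3      # recipe, in mol/kg

def residuals(x):
    h, oh, ca, hco3, co3, solid = x
    return np.array([h - oh + 2 * ca - hco3 - 2 * co3,   # electroneutrality
                     ca + solid - M_CA,                  # calcium balance
                     hco3 + co3 + solid - SUM_CO2,       # carbonate balance
                     h * oh - KW,                        # water dissociation
                     h * co3 / hco3 - K2,                # HCO3- dissociation
                     ca * co3 - KS])                     # solubility product

def jacobian(x):
    h, oh, ca, hco3, co3, solid = x
    return np.array([[1, -1, 2, -1, -2, 0],
                     [0, 0, 1, 0, 0, 1],
                     [0, 0, 0, 1, 1, 1],
                     [oh, h, 0, 0, 0, 0],
                     [co3 / hco3, 0, 0, -h * co3 / hco3 ** 2, h / hco3, 0],
                     [0, 0, co3, 0, ca, 0]])

# Same blind start as before, with no solid present.
x = np.array([1e-8, 1e-6, 1.2e-3, 1.1e-3, 1.1e-3, 0.0])
for _ in range(15):
    x = x - np.linalg.solve(jacobian(x), residuals(x))

ph = -np.log10(x[0])
worst = np.max(np.abs(residuals(x)))   # largest residual component
print(x, ph, worst)
```

Monitoring `worst` after each pass mirrors the convergence check on the residual components described above.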

Most noticeable is the large change in pH induced by the precipitation of 0.196 mmol kg$^{-1}$ of calcium carbonate.

The calcium carbonate system at constant CO$_2$ pressure. We take the pressure of atmospheric carbon dioxide $p_{\mathrm{CO_2}} = 3 \times 10^{-4}$ atm and assume calcium carbonate saturation. Calculate the concentration of Ca$^{2+}$, H$_2$CO$_3$, HCO$_3^-$, CO$_3^{2-}$, OH$^-$ and H$^+$ in the system. Again, we assume the activity coefficients are unity. The carbonic acid dissociation constants are

$$\frac{[\mathrm{HCO_3^-}][\mathrm{H^+}]}{[\mathrm{H_2CO_3}]} = K_1 = 4.47 \times 10^{-7}$$

and

$$\frac{[\mathrm{CO_3^{2-}}][\mathrm{H^+}]}{[\mathrm{HCO_3^-}]} = K_2 = 4.68 \times 10^{-11}$$

while solubility of atmospheric carbon dioxide in water equilibrated with atmosphere obeys

$$\frac{[\mathrm{H_2CO_3}]}{p_{\mathrm{CO_2}}} = \alpha = 0.04\ \mathrm{mol\,kg^{-1}\,atm^{-1}}$$

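The assumption of constant CO$_2$ pressure makes the system extremely simple since

$$[\mathrm{H_2CO_3}] = \alpha p_{\mathrm{CO_2}} = 0.04 \times 3 \times 10^{-4} = 1.2 \times 10^{-5}$$

and therefore the conservation of carbonates cannot be written with the equation used in the previous examples. In addition, calcium carbonate saturation requires

$$[\mathrm{Ca^{2+}}][\mathrm{CO_3^{2-}}] = K_S$$

The five unknowns are $x_1 = [\mathrm{H^+}]$, $x_2 = [\mathrm{OH^-}]$, $x_3 = [\mathrm{Ca^{2+}}]$, $x_4 = [\mathrm{HCO_3^-}]$, and $x_5 = [\mathrm{CO_3^{2-}}]$. Let us write the electroneutrality condition, as well as the equations for water and carbonic acid dissociation and finally the saturation condition as

$$\begin{aligned}
f_1(x) &= x_1 - x_2 + 2x_3 - x_4 - 2x_5 \\
f_2(x) &= x_1 x_2 - K_w \\
f_3(x) &= x_1 x_4 - K_1 \alpha p_{\mathrm{CO_2}} \\
f_4(x) &= (x_1 x_5 / x_4) - K_2 \\
f_5(x) &= x_3 x_5 - K_S
\end{aligned}$$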

The algebraic expressions that make the matrix $D$ of partial derivatives are

$$D = \begin{bmatrix}
1 & -1 & 2 & -1 & -2 \\
x_2 & x_1 & 0 & 0 & 0 \\
x_4 & 0 & 0 & x_1 & 0 \\
x_5/x_4 & 0 & 0 & -x_1 x_5/x_4^2 & x_1/x_4 \\
0 & 0 & x_5 & 0 & x_3
\end{bmatrix}$$

We can now start the calculation, which requires an initial guess of the $x$ values. Let us assume a pH value of 8 and therefore

$$x_1^{(0)} = 10^{-8},\quad x_2^{(0)} = 10^{-6}$$

which constrains the hydrogenocarbonate content as

$$x_4^{(0)} = K_1 \alpha p_{\mathrm{CO_2}} / x_1^{(0)} = 0.536 \times 10^{-3}$$

and the carbonate content as

$$x_5^{(0)} = K_2 x_4^{(0)} / [\mathrm{H^+}] = 0.251 \times 10^{-5}$$

The initial calcium guess is calculated from the electroneutrality equation

$$x_3^{(0)} \approx 0.271 \times 10^{-3}$$

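The initial matrix $D^{(0)}$ of partial derivatives is numerically evaluated as

$$D^{(0)} = \begin{bmatrix}
1 & -1 & 2 & -1 & -2 \\
10^{-6} & 10^{-8} & 0 & 0 & 0 \\
5.36 \times 10^{-4} & 0 & 0 & 10^{-8} & 0 \\
4.68 \times 10^{-3} & 0 & 0 & -8.72 \times 10^{-8} & 1.86 \times 10^{-5} \\
0 & 0 & 2.51 \times 10^{-6} & 0 & 2.71 \times 10^{-4}
\end{bmatrix}$$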

The initial residual, i.e., the initial vector of the functions we are trying to cancel, is

$$f^{(0)}(x) = [-9.900 \times 10^{-7},\ 0,\ 0,\ 8.078 \times 10^{-28},\ -3.790 \times 10^{-9}]^T$$

This is no real surprise since the first four equations were assumed to be obeyed. Only the last condition of calcium carbonate saturation is grossly violated. The first increment is obtained in the usual way as $\delta x^{(0)} = [D^{(0)}]^{-1} f^{(0)}$ (to be subtracted from $x^{(0)}$) and gives

$$\delta x^{(0)} = \begin{bmatrix}
1.8517 \times 10^{-8} \\
-1.8517 \times 10^{-6} \\
-5.0736 \times 10^{-4} \\
-9.9326 \times 10^{-4} \\
-9.2969 \times 10^{-6}
\end{bmatrix}$$

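After nine iterations, the calculation converges towards the result

$$x^{(9)} = \begin{bmatrix} [\mathrm{H^+}] \\ [\mathrm{OH^-}] \\ [\mathrm{Ca^{2+}}] \\ [\mathrm{HCO_3^-}] \\ [\mathrm{CO_3^{2-}}] \end{bmatrix} = \begin{bmatrix} 5.35 \times 10^{-9} \\ 1.87 \times 10^{-6} \\ 5.11 \times 10^{-4} \\ 1.00 \times 10^{-3} \\ 8.76 \times 10^{-6} \end{bmatrix}$$

The final residuals are

$$f^{(9)}(x) = \begin{bmatrix} 2.44 \times 10^{-20} \\ -3.42 \times 10^{-22} \\ 8.26 \times 10^{-18} \\ -1.84 \times 10^{-19} \\ -1.99 \times 10^{-16} \end{bmatrix}$$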

Again, this is not the easiest way to work out the solution to this particular problem, but the example illustrates a method that can be extended to much more convoluted situations.
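The constant-pressure example can be reproduced with a few lines of code. The following is only a minimal sketch, not the author's program: it assumes NumPy, takes $K_w = 10^{-14}$ (implied by the initial guess $[\mathrm{H^+}] = 10^{-8}$, $[\mathrm{OH^-}] = 10^{-6}$), writes the first dissociation equation as $x_1 x_4 - K_1 \alpha p_{\mathrm{CO_2}}$ (the form consistent with the printed entries of $D^{(0)}$), and subtracts the increment $D^{-1}f$ from the current estimate. The function names and the stopping rule on the relative step size are illustrative choices.

```python
import numpy as np

# Constants quoted in the text; KW = 1e-14 is an assumption implied by the
# initial guess [H+] = 1e-8, [OH-] = 1e-6.
KW = 1.0e-14
K1 = 4.47e-7
K2 = 4.68e-11
KS = 4.47e-9
H2CO3 = 0.04 * 3.0e-4  # alpha * pCO2 = 1.2e-5 mol/kg


def residuals(x):
    """Electroneutrality, water, two carbonic acid equilibria, saturation."""
    h, oh, ca, hco3, co3 = x
    return np.array([
        h - oh + 2.0 * ca - hco3 - 2.0 * co3,
        h * oh - KW,
        h * hco3 - K1 * H2CO3,
        h * co3 / hco3 - K2,
        ca * co3 - KS,
    ])


def jacobian(x):
    """Matrix D of partial derivatives of the five residuals."""
    h, oh, ca, hco3, co3 = x
    return np.array([
        [1.0, -1.0, 2.0, -1.0, -2.0],
        [oh, h, 0.0, 0.0, 0.0],
        [hco3, 0.0, 0.0, h, 0.0],
        [co3 / hco3, 0.0, 0.0, -h * co3 / hco3**2, h / hco3],
        [0.0, 0.0, co3, 0.0, ca],
    ])


def newton(x0, max_iter=50, rtol=1e-10):
    """Newton-Raphson iteration x <- x - D^{-1} f."""
    x = np.asarray(x0, dtype=float)
    for n_iter in range(1, max_iter + 1):
        dx = np.linalg.solve(jacobian(x), residuals(x))
        x = x - dx
        if np.max(np.abs(dx / x)) < rtol:
            break
    return x, n_iter


# Initial guess built as in the text: pH 8, then HCO3-, CO3-- and Ca2+.
h0 = 1.0e-8
hco3_0 = K1 * H2CO3 / h0
co3_0 = K2 * hco3_0 / h0
x0 = [h0, KW / h0, 0.5 * (hco3_0 + 2.0 * co3_0), hco3_0, co3_0]

x, n_iter = newton(x0)
for name, value in zip(["H+", "OH-", "Ca2+", "HCO3-", "CO3--"], x):
    print(f"{name:6s} {value:.3e}")
```

The printed concentrations can be compared with the converged values quoted above. Because the equations are left unscaled, the Jacobian is poorly conditioned, which is the issue taken up under scaling.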

6.2.3 More about scaling

Because equilibrium constants may often be of widely different orders of magnitude, solving some problems may lead to roundoff errors and poor accuracy.

Mercury-chloride complexes in dilute solutions. This slightly more difficult example will be useful in showing how to handle poorly conditioned systems of equations. It is assumed that mercury chloride HgCl$_2$ is dissolved in pure water with a molality $m = 10^{-5}$ mol kg$^{-1}$. Given the equilibrium constants for chloride complex formation

$$\frac{[\mathrm{Hg^{2+}}][\mathrm{Cl^-}]^2}{[\mathrm{HgCl_2^0}]} = \beta_{\mathrm{Cl}} = 7.244 \times 10^{-14}$$

and for hydroxide complex formation

$$\frac{[\mathrm{Hg^{2+}}][\mathrm{OH^-}]^2}{[\mathrm{Hg(OH)_2^0}]} = \beta_{\mathrm{OH}} = 10^{-21.85}$$

calculate the distribution of species in solution. A convenient choice of components is H$^+$, Hg$^{2+}$ and Cl$^-$ for six species H$^+$, OH$^-$, Hg$^{2+}$, Cl$^-$, HgCl$_2^0$ and Hg(OH)$_2^0$, which gives the component matrix shown in Table 6.3. Let us define the vector $x(x_1, x_2, \ldots, x_6)$ by its components

$$x_1 = [\mathrm{H^+}],\ x_2 = [\mathrm{OH^-}],\ x_3 = [\mathrm{Hg^{2+}}],\ x_4 = [\mathrm{Cl^-}],\ x_5 = [\mathrm{HgCl_2^0}]\ \text{and}\ x_6 = [\mathrm{Hg(OH)_2^0}]$$

The six equations in the six unknowns $x_1, x_2, \ldots, x_6$ are the electroneutrality condition, conservation of mercury and chloride components, plus the three mass action laws corresponding to water dissociation and mercury complexation by Cl$^-$ and OH$^-$. The condition of electroneutrality reads

$$[\mathrm{H^+}] - [\mathrm{OH^-}] + 2[\mathrm{Hg^{2+}}] - [\mathrm{Cl^-}] = 0 \quad\text{or}\quad 1x_1 - 1x_2 + 2x_3 - 1x_4 = 0$$

Conservation of mercury requires

$$[\mathrm{Hg^{2+}}] + [\mathrm{HgCl_2^0}] + [\mathrm{Hg(OH)_2^0}] = 10^{-5} \quad\text{or}\quad 1x_3 + 1x_5 + 1x_6 = m$$

while, for chlorine

$$[\mathrm{Cl^-}] + 2[\mathrm{HgCl_2^0}] = 2 \times 10^{-5} \quad\text{or}\quad 1x_4 + 2x_5 = 2m$$

Table 6.3. The component matrix of mercury chloride in dilute solution.

Species        H+    Hg2+   Cl-
H+              1     0      0
OH-            -1     0      0
Hg2+            0     1      0
Cl-             0     0      1
HgCl2(0)        0     1      2
Hg(OH)2(0)     -2     1      0

As a function of the new variables, water dissociation equilibrium requires

$$x_1 x_2 = K_w = 10^{-14}$$

while Hg complexation by chlorine reads

$$\frac{x_3 x_4^2}{x_5} = \beta_{\mathrm{Cl}}$$

Likewise, for the Hg hydroxylated complex, we get

$$\frac{x_3 x_2^2}{x_6} = \beta_{\mathrm{OH}} = 10^{-21.85}$$

Since the constant terms on the right-hand side of the previous equations vary by some 17 orders of magnitude, we may face serious roundoff errors in estimating $f(x)$. One way of scaling the equations is to divide each conservation equation by the total amount of the corresponding component and each mass action relation by the corresponding equilibrium constant. If the choice of the initial estimate $x^{(0)}$ is not too awkward, we should obtain the six equations as differences between numbers of more similar magnitudes (ideally unity for all but the electroneutrality condition)

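$$\begin{aligned}
f_1(x) &= x_1 - x_2 + 2x_3 - x_4 \\
f_2(x) &= \frac{x_3 + x_5 + x_6}{m} - 1 \\
f_3(x) &= \frac{x_4 + 2x_5}{2m} - 1 \\
f_4(x) &= \frac{x_1 x_2}{K_w} - 1 \\
f_5(x) &= \frac{x_3 x_4^2}{\beta_{\mathrm{Cl}}\, x_5} - 1 \\
f_6(x) &= \frac{x_3 x_2^2}{\beta_{\mathrm{OH}}\, x_6} - 1
\end{aligned}$$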
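The effect of the scaling can be seen by evaluating the equations numerically. The sketch below is only an illustration under stated assumptions: it uses the constants quoted in the text ($m = 10^{-5}$, $K_w = 10^{-14}$, $\beta_{\mathrm{Cl}} = 7.244 \times 10^{-14}$, $\beta_{\mathrm{OH}} = 1.412 \times 10^{-22}$) and the arbitrary trial vector $x = (10^{-7}, 10^{-7}, 10^{-7}, 10^{-5}, 10^{-5}, 10^{-5})$; the function names are hypothetical.

```python
M_TOTAL = 1.0e-5      # molality of dissolved HgCl2
KW = 1.0e-14          # water dissociation constant
BETA_CL = 7.244e-14   # [Hg2+][Cl-]^2 / [HgCl2]
BETA_OH = 1.412e-22   # [Hg2+][OH-]^2 / [Hg(OH)2], i.e. 10**-21.85


def unscaled_residuals(x):
    """Equations written as 'left-hand side minus constant'."""
    h, oh, hg, cl, hgcl2, hgoh2 = x
    return [
        h - oh + 2.0 * hg - cl,
        hg + hgcl2 + hgoh2 - M_TOTAL,
        cl + 2.0 * hgcl2 - 2.0 * M_TOTAL,
        h * oh - KW,
        hg * cl**2 / hgcl2 - BETA_CL,
        hg * oh**2 / hgoh2 - BETA_OH,
    ]


def scaled_residuals(x):
    """Conservation equations divided by totals, mass action by constants."""
    h, oh, hg, cl, hgcl2, hgoh2 = x
    return [
        h - oh + 2.0 * hg - cl,
        (hg + hgcl2 + hgoh2) / M_TOTAL - 1.0,
        (cl + 2.0 * hgcl2) / (2.0 * M_TOTAL) - 1.0,
        h * oh / KW - 1.0,
        hg * cl**2 / (BETA_CL * hgcl2) - 1.0,
        hg * oh**2 / (BETA_OH * hgoh2) - 1.0,
    ]


x_trial = [1e-7, 1e-7, 1e-7, 1e-5, 1e-5, 1e-5]
f_raw = unscaled_residuals(x_trial)
f_scaled = scaled_residuals(x_trial)
for i, (raw, scaled) in enumerate(zip(f_raw, f_scaled), start=1):
    print(f"f{i}: unscaled {raw: .3e}   scaled {scaled: .3e}")
```

In the unscaled form the residuals span many orders of magnitude, so the smallest ones carry almost no weight in a convergence test; after scaling, every equation except electroneutrality measures a relative misfit.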


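The matrix $D$ of derivatives becomes

$$D = \begin{bmatrix}
1 & -1 & 2 & -1 & 0 & 0 \\
0 & 0 & \dfrac{1}{m} & 0 & \dfrac{1}{m} & \dfrac{1}{m} \\
0 & 0 & 0 & \dfrac{1}{2m} & \dfrac{1}{m} & 0 \\
\dfrac{x_2}{K_w} & \dfrac{x_1}{K_w} & 0 & 0 & 0 & 0 \\
0 & 0 & \dfrac{x_4^2}{\beta_{\mathrm{Cl}} x_5} & \dfrac{2 x_3 x_4}{\beta_{\mathrm{Cl}} x_5} & -\dfrac{x_3 x_4^2}{\beta_{\mathrm{Cl}} x_5^2} & 0 \\
0 & \dfrac{2 x_2 x_3}{\beta_{\mathrm{OH}} x_6} & \dfrac{x_2^2}{\beta_{\mathrm{OH}} x_6} & 0 & 0 & -\dfrac{x_3 x_2^2}{\beta_{\mathrm{OH}} x_6^2}
\end{bmatrix}$$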

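Taking the arbitrary initial guess as

$$x_1^{(0)} = 10^{-7},\ x_2^{(0)} = 10^{-7},\ x_3^{(0)} = 10^{-7},\ x_4^{(0)} = 10^{-5},\ x_5^{(0)} = 10^{-5},\ x_6^{(0)} = 10^{-5}$$

and inserting these values into the six expressions for the components of $f(x)$, we obtain

$$\begin{aligned}
f_1^{(0)}(x) &= 10^{-7} - 10^{-7} + 2 \times 10^{-7} - 10^{-5} = -9.8 \times 10^{-6} \\
f_2^{(0)}(x) &= \frac{10^{-7} + 10^{-5} + 10^{-5}}{10^{-5}} - 1 = 1.01 \\
f_3^{(0)}(x) &= \frac{10^{-5} + 2 \times 10^{-5}}{2 \times 10^{-5}} - 1 = 0.5 \\
f_4^{(0)}(x) &= \frac{10^{-7} \times 10^{-7}}{10^{-14}} - 1 = 0 \\
f_5^{(0)}(x) &= \frac{10^{-7}\,(10^{-5})^2}{7.244 \times 10^{-14} \times 10^{-5}} - 1 = 12.8 \\
f_6^{(0)}(x) &= \frac{10^{-7}\,(10^{-7})^2}{1.412 \times 10^{-22} \times 10^{-5}} - 1 = 7.082 \times 10^{5}
\end{aligned}$$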

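and the matrix of partial derivatives

$$D^{(0)} = \begin{bmatrix}
1 & -1 & 2 & -1 & 0 & 0 \\
0 & 0 & 10^{5} & 0 & 10^{5} & 10^{5} \\
0 & 0 & 0 & 0.5 \times 10^{5} & 10^{5} & 0 \\
10^{7} & 10^{7} & 0 & 0 & 0 & 0 \\
0 & 0 & 0.138 \times 10^{9} & 0.276 \times 10^{7} & -0.138 \times 10^{7} & 0 \\
0 & 1.416 \times 10^{13} & 0.708 \times 10^{13} & 0 & 0 & -0.708 \times 10^{11}
\end{bmatrix}$$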

We get the increment $\delta x^{(0)}$ as

$$\delta x^{(0)} = \begin{bmatrix}
-1.445 \times 10^{-7} \\
1.445 \times 10^{-7} \\
-0.905 \times 10^{-7} \\
0.933 \times 10^{-5} \\
0.335 \times 10^{-6} \\
0.986 \times 10^{-5}
\end{bmatrix}$$

and the first updated estimate as

$$\begin{aligned}
x_2^{(1)} &= 10^{-7} - 1.445 \times 10^{-7} = -0.445 \times 10^{-7} \\
x_4^{(1)} &= 10^{-5} - 0.933 \times 10^{-5} = 0.67 \times 10^{-6} \\
x_5^{(1)} &= 10^{-5} - 0.335 \times 10^{-6} = 0.967 \times 10^{-5} \\
x_6^{(1)} &= 10^{-5} - 0.986 \times 10^{-5} = 1.44 \times 10^{-7}
\end{aligned}$$

After 25 iterations, the following result is obtained

$$\begin{aligned}
x_1^{(25)} &= [\mathrm{H^+}] = 1.294 \times 10^{-5}\ (\mathrm{pH} = 4.89) \\
x_4^{(25)} &= [\mathrm{Cl^-}] = 1.294 \times 10^{-5} \\
x_5^{(25)} &= [\mathrm{HgCl_2^0}] = 0.353 \times 10^{-5} \\
x_6^{(25)} &= [\mathrm{Hg(OH)_2^0}] = 0.647 \times 10^{-5}
\end{aligned}$$

Most of the mercury is therefore in the form of chloride and hydroxide complexes. The three 'residual' components of $f(x)$ with the largest deviation, among them $f_3^{(25)}(x)$, are exceedingly small, which indicates excellent convergence since they are to be compared with differences of numbers approximately equal to unity.

6.3 Gibbs energy minimization

6.3.1 Mixtures of ideal gases

We will now take advantage of a slightly different problem, homogeneous equilibrium in gases, in order to illustrate a different numerical approach, the steepest-descent method. At equilibrium, the Gibbs free energy of a system is minimum. Standard thermodynamics (e.g., Denbigh, 1968) states that the Gibbs free energy $G$ of a perfect gas mixture containing $n_j$ moles of species $j$ is given by

$$G = \sum_j n_j \left(\mu_j^0 + RT \ln p_j\right)$$

where $p_j$ is the partial pressure of species $j$, and $\mu_j^0$ the Gibbs free energy of the pure gas at unit pressure. $p_j$ is related to the total pressure $P$ by

$$p_j = \frac{n_j}{N} P$$

Rearranging with the use of logarithm properties

$$G = \sum_j n_j \left(\mu_j^0 + RT \ln P + RT \ln n_j - RT \ln N\right) \qquad (6.3.1)$$

where the total number $N$ of gas moles in the mixture is

$$N = \sum_j n_j$$

Let us define the constant $c_j$ as

$$c_j = \frac{\mu_j^0}{RT} + \ln P$$

and the reduced Gibbs free energy $\mathcal{G}$ as

$$\mathcal{G}(n) = \frac{G}{RT} = \sum_j n_j \left(c_j + \ln n_j - \ln N\right) \qquad (6.3.2)$$

where the dependence on molar composition has been emphasized. In a formal way, our goal is to locate the minimum of $\mathcal{G}(n)$ relative to the vector $n(n_1, n_2, \ldots, n_j)$ subject to the constraint that the conservation equation holds, i.e., that the components in the species add up to the total number of components in the recipe. This task will be best handled by the steepest-descent method described in Chapter 3. The constraints will be handled through the gradient projection method, which is a close equivalent to using the method of Lagrange multipliers described in the same chapter. In addition, better understanding will be achieved if the section on projectors developed in Section 2.2 has been read first. The $i$th component of the gradient of function $\mathcal{G}$ must be equal to the chemical potential $\mu_i$. In order to prove it, we first observe that

$$\frac{\partial \ln N}{\partial n_i} = \frac{1}{N}$$

and therefore

$$\frac{\partial \mathcal{G}}{\partial n_i} = c_i + \ln n_i - \ln N + n_i (1/n_i) - \sum_j n_j (1/N)$$

or

$$\frac{\partial \mathcal{G}}{\partial n_i} = c_i + \ln(n_i/N) = \mu_i \qquad (6.3.4)$$

We note all the derivatives in compact gradient form

$$\nabla \mathcal{G} = c + \ln n - J \ln N = c + \ln n - J \ln J^T n \qquad (6.3.5)$$

where $J$ is the vector whose components are all equal to one. In a matrix form, the system of mass balance equation constraints (component matrix) reads

$$B^T n - q = 0 \qquad (6.3.6)$$

where $B$ is the component matrix and $q$ the recipe of the system. As per Section 3.2, the unconstrained direction of steepest-descent of $\mathcal{G}$ is

$$\tilde{n}^{(k+1)} - n^{(k)} = -\alpha \nabla \mathcal{G}^{(k)} \qquad (6.3.7)$$

where $\tilde{n}^{(k+1)}$ is the unconstrained $(k+1)$th estimate of the vector $n$, and $\alpha$ is a constant to be determined. At the $k$th step, the mass balance constraints (6.3.6) require

$$B^T n^{(k)} - q = 0 \qquad (6.3.8)$$

For the same constraint to be satisfied at step $k+1$, $n^{(k+1)}$ must obey

$$B^T n^{(k+1)} - q = 0$$

This holds true for $n^{(k+1)}$ relating to $\tilde{n}^{(k+1)}$ through

$$n^{(k+1)} = \tilde{n}^{(k+1)} - B (B^T B)^{-1} \left(B^T \tilde{n}^{(k+1)} - q\right) \qquad (6.3.9)$$

which can be checked upon pre-multiplying both sides by $B^T$. What we have actually performed is a projection (Figure 6.1) of the minimization direction on the constraint subspace of the matrix $B$ using the projector $P$, such that

$$P = I - B (B^T B)^{-1} B^T \qquad (6.3.10)$$

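Figure 6.1 Search for the minimum of the Gibbs function $\mathcal{G}$ in a two-component space ($n_{i1}$ and $n_{i2}$ are mole numbers) with the mass conservation constraints $B^T n = q$. The search direction is the projection of the gradient onto the constraint subspace. Minimum is attained when the gradient is orthogonal to the constraint direction, which is the geometrical expression of the Lagrange multiplier methods.

We get the update formula of the $(k+1)$th estimate as a function of the $k$th estimate as

$$n^{(k+1)} = n^{(k)} + \delta n^{(k)} \qquad (6.3.11)$$

where $\delta n^{(k)}$ is the incremental correction to the $k$th vector $n^{(k)}$ to give the $(k+1)$th estimate

$$\delta n^{(k)} = -\alpha P \nabla \mathcal{G}^{(k)} \qquad (6.3.12)$$

$P \nabla \mathcal{G}^{(k)}$ is therefore the direction of constrained minimization. As in the case of Lagrange multipliers, no progress can be made and search will stop when the $(k+1)$th minimization direction $P \nabla \mathcal{G}^{(k+1)}$ is orthogonal to the $k$th minimization direction $P \nabla \mathcal{G}^{(k)}$. A criterion for minimum is when the inner product of these vectors becomes less than an arbitrarily small value. From equation (6.3.5), the gradient of function $\mathcal{G}$ at the $k$th step is

$$\nabla \mathcal{G}^{(k)} = c + \ln n^{(k)} - J \ln J^T n^{(k)} \qquad (6.3.13)$$

The condition for two consecutive minimization directions to be orthogonal is that their inner product $f(\alpha)$ vanishes, i.e.,

$$f(\alpha) = \left[P \nabla \mathcal{G}^{(k)}\right]^T P \nabla \mathcal{G}^{(k+1)} = \left[P \nabla \mathcal{G}^{(k)}\right]^T \nabla \mathcal{G}^{(k+1)} = 0 \qquad (6.3.14)$$

where the second equality holds because the projector $P$ is symmetric and idempotent. $f(\alpha)$ can be expanded as

$$f(\alpha) = \left[P \nabla \mathcal{G}^{(k)}\right]^T \left\{ c + \ln\left[n^{(k)} + \delta n^{(k)}\right] - J \ln J^T \left[n^{(k)} + \delta n^{(k)}\right] \right\} = 0$$

Defining the scalar $r$ as

$$r = J^T P \nabla \mathcal{G}^{(k)}$$

$f(\alpha)$ becomes

$$f(\alpha) = \left[P \nabla \mathcal{G}^{(k)}\right]^T c + \left[P \nabla \mathcal{G}^{(k)}\right]^T \ln\left[n^{(k)} + \delta n^{(k)}\right] - r \ln\left[N^{(k)} - r\alpha\right] \qquad (6.3.15)$$

Since $f(\alpha)$ is a known function of $\alpha$, we will use a Newton step in order to find an approximate root $\alpha^{(1)}$ of $f(\alpha) = 0$. From equation (6.3.12), we know that

$$\frac{d\,\delta n^{(k)}}{d\alpha} = -P \nabla \mathcal{G}^{(k)}$$

Defining $M^{(k)}$ as the diagonal matrix with $n^{(k)} + \delta n^{(k)}$ components on the diagonal, we get

$$\frac{df(\alpha)}{d\alpha} = -\left[P \nabla \mathcal{G}^{(k)}\right]^T \left[M^{(k)}\right]^{-1} P \nabla \mathcal{G}^{(k)} + \frac{r^2}{N^{(k)} - r\alpha}$$

which we apply to $\alpha^{(0)} = 0$ (or, equivalently, $\delta n^{(k)} = 0$) to give $\alpha^{(1)}$

$$\alpha^{(1)} = -\frac{f(\alpha)}{df(\alpha)/d\alpha} = \frac{f(0)}{\left[P \nabla \mathcal{G}^{(k)}\right]^T \left[M^{(k)}\right]^{-1} P \nabla \mathcal{G}^{(k)} - r^2/N^{(k)}} \qquad (6.3.16)$$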

This Newton step can be repeated, but if the function $\mathcal{G}$ is not ill-behaved, it may prove simpler to restart the whole loop and recalculate the local gradient. Finding speciation in a multicomponent system is significantly more complicated when different phases are present. A well-known application is the equilibrium condensation model investigated in detail by several authors in order to reproduce the gross chemical features of the solar system (Larimer, 1967; Grossman, 1972; Grossman and Larimer, 1974). Thermodynamic modelling of mineral assemblages (Saxena and Eriksson, 1983) and of the liquid-line of descent of magmas (Ghiorso, 1985a, b) are other successful applications of this theory. The principal difficulty lies with the requirement of non-negative mole numbers, which requires some specific techniques to be used (Van Zeggeren and Storey, 1970; Smith and Missen, 1982).

Speciation in a gas mixture. Let us work out a case provided by Van Zeggeren and Storey (1970), involving combustion of propane in air in the proportions of one mole of propane (C$_3$H$_8$) and five moles of air (O$_2$ + 4N$_2$) at 40 atm and 2200 K, which provides a nice illustration of how to calculate the production of greenhouse gases by automobiles. Using $\ln 40 = 3.689$ and $RT = 18.292$ kJ mol$^{-1}$, and thermochemical Gibbs free energy at 2200 K from Barin and Knacke (1973), we make Table 6.4. We recognize in the four columns below the components C, O, N and H, the $6 \times 4$ component matrix $B$.

Table 6.4. The component matrix and Gibbs energy of formation of various gaseous species in the C-O-N-H system at 2200 K.

j   Species   C   O   N   H   ΔGf,2200 (kJ mol-1)   c_j
1   CO2       1   2   0   0   -982.6                -50.03
2   N2        0   0   2   0   -498.7                -23.57
3   H2O       0   1   0   2   -752.4                -37.44
4   CO        1   1   0   0   -623.2                -30.38
5   H2        0   0   0   2   -361.8                -16.09
6   O2        0   2   0   0   -531.3                -25.36

The gas recipe with four conservation equations as functions of the mole number of propane $n^0_{\mathrm{C_3H_8}}$ and air $n^0_{\mathrm{air}}$, is for carbon

$$n_{\mathrm{CO_2}} + n_{\mathrm{CO}} = 3 n^0_{\mathrm{C_3H_8}}$$

and for oxygen

$$2 n_{\mathrm{CO_2}} + 1 n_{\mathrm{H_2O}} + 1 n_{\mathrm{CO}} + 2 n_{\mathrm{O_2}} = 2 n^0_{\mathrm{air}}$$

Nitrogen conservation requires

$$2 n_{\mathrm{N_2}} = 8 n^0_{\mathrm{air}}$$

while for hydrogen

$$2 n_{\mathrm{H_2O}} + 2 n_{\mathrm{H_2}} = 8 n^0_{\mathrm{C_3H_8}}$$

We can collect these equations in the compact form of equation (6.3.8)

$$B^T n^{(0)} - q = 0$$

which actually means

$$\begin{bmatrix}
1 & 0 & 0 & 1 & 0 & 0 \\
2 & 0 & 1 & 1 & 0 & 2 \\
0 & 2 & 0 & 0 & 0 & 0 \\
0 & 0 & 2 & 0 & 2 & 0
\end{bmatrix}
\begin{bmatrix} n_1 \\ n_2 \\ n_3 \\ n_4 \\ n_5 \\ n_6 \end{bmatrix}
-
\begin{bmatrix} 3 n^0_{\mathrm{C_3H_8}} \\ 2 n^0_{\mathrm{air}} \\ 8 n^0_{\mathrm{air}} \\ 8 n^0_{\mathrm{C_3H_8}} \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$

Building the projector $P = I - B (B^T B)^{-1} B^T$ requires patience and gives

$$P = \begin{bmatrix}
0.45 & 0.00 & -0.05 & -0.45 & 0.05 & -0.20 \\
0.00 & 0.00 & 0.00 & 0.00 & 0.00 & 0.00 \\
-0.05 & 0.00 & 0.45 & 0.05 & -0.45 & -0.20 \\
-0.45 & 0.00 & 0.05 & 0.45 & -0.05 & 0.20 \\
0.05 & 0.00 & -0.45 & -0.05 & 0.45 & 0.20 \\
-0.20 & 0.00 & -0.20 & 0.20 & 0.20 & 0.20
\end{bmatrix}$$

Numerical conditions are $n^0_{\mathrm{C_3H_8}} = 1$ and $n^0_{\mathrm{air}} = 5$. An acceptable initial guess is to be found that satisfies the constraint of equation (6.3.6). Although an alternative method will be described below, an efficient way to make such an estimate is to complement the matrix $B^T$ and the vector $q$ by arbitrary numbers in order to make a system of linear equations that can be conveniently solved. We like to use an initial guess with only positive values of mole numbers since we will have to take their logarithm. Moreover, such a choice makes more sense. In this particular case, finding an acceptable starting value will not be difficult. We would, however, appreciate a method that can be extended to a large number of species. A convenient trick is to use a random number generator to fill the lower part of the matrix $B^T$ and vector $q$ with random numbers until the solution of the square system has only positive components. In other words, we solve systems such as

$$\begin{bmatrix}
1 & 0 & 0 & 1 & 0 & 0 \\
2 & 0 & 1 & 1 & 0 & 2 \\
0 & 2 & 0 & 0 & 0 & 0 \\
0 & 0 & 2 & 0 & 2 & 0 \\
p & p & p & p & p & p \\
p & p & p & p & p & p
\end{bmatrix}
\begin{bmatrix} n_1 \\ n_2 \\ n_3 \\ n_4 \\ n_5 \\ n_6 \end{bmatrix}
=
\begin{bmatrix} 3 \\ 10 \\ 40 \\ 8 \\ p \\ p \end{bmatrix}$$

where each $p$ position is assigned a different random number until the $n_i$'s are all positive. After a few seconds, a computer may produce the initial guess

$$n^{(0)} = [2.2075,\ 20.000,\ 2.3291,\ 0.7925,\ 1.6709,\ 1.2317]^T$$

which gives $J^T n^{(0)} = N^{(0)} = 28.2317$ and $\ln N^{(0)} = 3.3404$. The components of $\nabla \mathcal{G}^{(0)}$, the gradient of the function $\mathcal{G}$, are

$$\nabla \mathcal{G}^{(0)} = \begin{bmatrix}
-50.03 + \ln 2.2075 - 3.3404 \\
-23.57 + \ln 20.000 - 3.3404 \\
-37.44 + \ln 2.3291 - 3.3404 \\
-30.38 + \ln 0.7925 - 3.3404 \\
-16.09 + \ln 1.6709 - 3.3404 \\
-25.36 + \ln 1.2317 - 3.3404
\end{bmatrix}
=
\begin{bmatrix}
-52.5786 \\ -23.9147 \\ -39.9350 \\ -33.9530 \\ -18.9171 \\ -28.4921
\end{bmatrix}$$

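which gives the estimate of the reduced Gibbs function $\mathcal{G}$ as

$$\mathcal{G}^{(0)} = [n^{(0)}]^T \nabla \mathcal{G}^{(0)} = -781.0\ \mathrm{mol}$$

The projection of the gradient along the constraint is

$$P \nabla \mathcal{G}^{(0)} = [-1.6322,\ 0,\ -2.8284,\ 1.6322,\ 2.8284,\ 2.2303]^T$$

and therefore

$$r = -1.6322 + 0 - 2.8284 + 1.6322 + 2.8284 + 2.2303 = 2.2303$$

The matrix $M^{(0)}$ is

$$M^{(0)} = \begin{bmatrix}
2.2075 & 0 & 0 & 0 & 0 & 0 \\
0 & 20 & 0 & 0 & 0 & 0 \\
0 & 0 & 2.3291 & 0 & 0 & 0 \\
0 & 0 & 0 & 0.7925 & 0 & 0 \\
0 & 0 & 0 & 0 & 1.6709 & 0 \\
0 & 0 & 0 & 0 & 0 & 1.2317
\end{bmatrix}$$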

The inner product between the direction of minimization $\nabla \mathcal{G}^{(0)}$ and the direction of the constraint $P \nabla \mathcal{G}^{(0)}$ is

$$f(0) = [P \nabla \mathcal{G}^{(0)}]^T \nabla \mathcal{G}^{(0)} = (-1.6322)(-52.5786) + (0)(-23.9147) + \ldots = 26.3016$$

In order to calculate $df/d\alpha$ for $\alpha = 0$, we need to calculate

$$[P \nabla \mathcal{G}^{(0)}]^T [M^{(0)}]^{-1} P \nabla \mathcal{G}^{(0)} = \frac{(-1.6322)^2}{2.2075} + \frac{0^2}{20.000} + \frac{(-2.8284)^2}{2.3291} + \ldots$$

hence

$$\frac{df}{d\alpha} = -[P \nabla \mathcal{G}^{(0)}]^T [M^{(0)}]^{-1} P \nabla \mathcal{G}^{(0)} + \frac{(2.2303)^2}{28.2317} = -16.6530$$

The Newton step is

$$\alpha^{(1)} = -\frac{f(\alpha)}{df(\alpha)/d\alpha} = -\frac{26.3016}{-16.6530} = 1.5794$$

which suggests the increment

$$\delta n^{(0)} = -\alpha P \nabla \mathcal{G}^{(0)} = [2.5779,\ 0,\ 4.4671,\ -2.5779,\ -4.4671,\ -3.5225]^T$$

Unfortunately, such an increment $\delta n$ makes some components of $n$ negative. The idea is to use the largest $\alpha$ value that makes all $n$ components non-negative. Logarithms, however, are no more tractable with null than with negative values. We therefore use a value of $\alpha$ which is 95 percent of that value that makes all the $n$ components non-negative, i.e., inclusive of zero. In the present case, the fourth component is the limiting factor and we choose

$$\alpha = 0.95 \times 0.7925 / 1.6322 = 0.4613$$

The first updated vector $n^{(1)}$ is

$$n^{(1)} = \begin{bmatrix}
2.2075 - 0.4613 \times (-1.6322) \\
20.0000 - 0.4613 \times 0 \\
2.3291 - 0.4613 \times (-2.8284) \\
0.7925 - 0.4613 \times 1.6322 \\
1.6709 - 0.4613 \times 2.8284 \\
1.2317 - 0.4613 \times 2.2303
\end{bmatrix}
=
\begin{bmatrix}
2.9604 \\ 20.0000 \\ 3.6337 \\ 0.0396 \\ 0.3663 \\ 0.2030
\end{bmatrix}$$

Table 6.5. Mole fraction of gaseous species $x$ after the seventh iteration. The column heading vZS refers to the results obtained by Van Zeggeren and Storey (1970) for the same system but using the JANAF tables of thermodynamic data.

Species   x(7)     vZS
CO2       0.1084   0.1080
N2        0.7396   0.7387
H2O       0.1473   0.1467
CO        0.0026   0.0029
H2        0.0006   0.0008
O2        0.0016   0.0012
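The first projected-gradient step of this example can be checked numerically. The sketch below is an illustration under stated assumptions, not the program used for the book: it relies on NumPy, takes the component matrix, the constants $c_j$ and the starting vector $n^{(0)}$ from the text, and the variable names are arbitrary.

```python
import numpy as np

# Rows: CO2, N2, H2O, CO, H2, O2; columns: C, O, N, H (Table 6.4).
B = np.array([
    [1, 2, 0, 0],
    [0, 0, 2, 0],
    [0, 1, 0, 2],
    [1, 1, 0, 0],
    [0, 0, 0, 2],
    [0, 2, 0, 0],
], dtype=float)
c = np.array([-50.03, -23.57, -37.44, -30.38, -16.09, -25.36])
n0 = np.array([2.2075, 20.0, 2.3291, 0.7925, 1.6709, 1.2317])

# Projector P = I - B (B^T B)^{-1} B^T onto the constraint subspace.
P = np.eye(6) - B @ np.linalg.solve(B.T @ B, B.T)


def gradient(n):
    """Gradient of the reduced Gibbs function, c + ln n - J ln N."""
    return c + np.log(n) - np.log(n.sum())


grad = gradient(n0)
pg = P @ grad                      # constrained minimization direction
r = pg.sum()                       # r = J^T P grad
f0 = pg @ grad                     # f(0)
dfda = -np.sum(pg**2 / n0) + r**2 / n0.sum()
alpha_newton = -f0 / dfda          # Newton estimate of alpha

# Largest step keeping all mole numbers non-negative, reduced to 95 percent.
decreasing = pg > 1e-12
alpha_max = np.min(n0[decreasing] / pg[decreasing])
alpha = min(alpha_newton, 0.95 * alpha_max)
n1 = n0 - alpha * pg

print("P grad =", np.round(pg, 4))
print("alpha (Newton) =", round(float(alpha_newton), 4))
print("alpha (used)   =", round(float(alpha), 4))
print("n(1) =", np.round(n1, 4))
```

Because the step is taken along $P\nabla\mathcal{G}$, the updated vector still satisfies the four conservation equations, which can be verified with `B.T @ n1`.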

After seven iterations, the inner product between the direction of minimization and the direction of the constraint becomes very small, which shows that we are close to convergence and the solution does not move much from

$$n^{(7)} = [2.9310,\ 20.0000,\ 3.9829,\ 0.0690,\ 0.0171,\ 0.0431]^T$$

The results are listed as mole fractions $x^{(7)}$ in Table 6.5 together with those of Van Zeggeren and Storey (1970). Considering that these authors used different sources of thermochemical data, the agreement of the present results with theirs is fairly good. The total reduced Gibbs free energy at that point is $\mathcal{G} = -791.612$ mol and

$$G = \mathcal{G} RT = -791.612\ \mathrm{mol} \times 18.292\ \mathrm{kJ\,mol^{-1}} = -14\,480\ \mathrm{kJ}$$

For a large number of gaseous species and for real gases, a more powerful method such as the method of conjugate gradients of Fletcher and Reeves (e.g., Fletcher, 1987; Press et al., 1986) would be more efficient.

6.3.2 Pure coexisting phases

The extreme case where the system is made of pure phases (no gas mixture, no solid or liquid solutions) can be handled in a slightly different way with a method that draws on linear programming methods. Given the Gibbs free energy of formation of all possible minerals, the objective for a rock of known composition is to find the mineral abundances that minimize the Gibbs free energy function of the rock. Linear constraints are the conservation equations of each element (or oxide). In addition, mineral abundances cannot be negative. Let us consider a rock at temperature $T$ whose chemical composition $q$ (recipe) is expressed as the vector of all the molar fractions $x^0$ of $s$ elements or oxides. It is assumed that it can be made by an arbitrarily large number $p \geq s$ of mineral phases exclusive of solid solution. $B$ is the component matrix of these minerals for the selected set of elements or oxides. Let $n_j$ be the number of moles of mineral $j$ and $g_j$ its Gibbs free energy of formation $\Delta G_{f,T}$ estimated when formed from either the elements or the oxides. The function to be minimized is the Gibbs free energy $G$ given by

$$G = \sum_{j=1}^{p} n_j g_j = n^T g \qquad (6.3.17)$$

where $n$ and $g$ are vectors with components $n_j$ and $g_j$, respectively. The conservation equations are written in their usual matrix form

$$B^T n = q \qquad (6.3.18)$$

In an $s$-dimensional space, $s$ vectors at most can be independent. At equilibrium, a rock made of $s$ elements cannot consist of more than $s$ minerals, which implies that at least $p - s$ of the $p$ mole numbers are zero. In order to find the set of independent vectors that minimize the energy, we first rearrange the order of variables and split the vector $n$ into two parts. The first part is the vector $n_B$ made of $s$ base variables, and the second part is the vector $n_F$ of $(p - s)$ free variables. Provided the base variables are non-negative, the non-negativity constraints can be satisfied by setting the free variables to zero. For the vector $n$ to be a feasible solution, it should also satisfy the recipe equation, i.e.,

$$B_B^T n_B + B_F^T n_F = q \qquad (6.3.19)$$

where the matrix $B$ has been split into the upper $s \times s$ matrix $B_B$ and the lower $(p - s) \times s$ matrix $B_F$. Assuming that the base variables can be chosen in such a way that $B_B$ is regular, we can write

$$n_B^T = q^T B_B^{-1} - n_F^T B_F B_B^{-1} \qquad (6.3.20)$$

For $n_F = 0$, we immediately get the relationship between $n_B$ and $q$. We now want to change both $n_B$ and $n_F$ in a direction that decreases $G$. More precisely, we will exchange one free for one base variable at a time as long as the Gibbs free energy can be decreased. The last equation can be differentiated as

$$\delta n_B^T = -\delta n_F^T B_F B_B^{-1} \qquad (6.3.21)$$

and $G$ as

$$\delta G = \delta n^T g = [\delta n_B^T,\ \delta n_F^T] \begin{bmatrix} g_B \\ g_F \end{bmatrix} = \delta n_B^T g_B + \delta n_F^T g_F = \delta n_F^T \left(g_F - B_F B_B^{-1} g_B\right) \qquad (6.3.22)$$

where $g_B$ and $g_F$ are the Gibbs free energy of formation associated with the base and free variables, respectively. The vector $g_F - B_F B_B^{-1} g_B$ is the gradient of $G$ with respect to the free variables with the constraint that changes keep the conservation equation satisfied. Each component of $n_F$ is zero and can only be increased. $\delta G$ therefore can be negative only if the constrained gradient $g_F - B_F B_B^{-1} g_B$ has negative components. The most natural policy is to increase the component $i$ of $n_F$ associated with the most negative component of $g_F - B_F B_B^{-1} g_B$. Because the elements of the matrix $B$ are all non-negative, equation (6.3.18) forces at least one element of $n_B$ to decrease when one element of $n_F$ increases. Actually, $n_B$ changes in proportion to the $i$th row of $B_F B_B^{-1}$, which we call $u_i^T$, but with the opposite sign. The constraint that each element of $n_B$ stays non-negative amounts to finding the largest scalar $\alpha$ such that

$$n_B - \alpha u_i \geq 0 \qquad (6.3.23)$$

The first element $k$ of $n_B$ reaching zero is found by selecting the smallest positive component of $n_B/u_i$ (the ratio being understood as the vector obtained by element to element ratio). At this point, the $k$th mineral of $n_B$ is exchanged with the $i$th mineral of $n_F$, which simply amounts to exchanging their corresponding rows in both $B$ and $g$, and the calculation is restarted from the beginning. The calculation stops when $G$ cannot be decreased any further, i.e., when each component of $g_F - B_F B_B^{-1} g_B$ is positive.

A rock in the SiO$_2$-MgO-CaO composition space at 1000 K can consist of the following minerals: periclase (pe), forsterite (fo), enstatite (en), quartz (qz), diopside (di), merwinite (me), larnite (la), and lime (li). The molar compositions of each mineral formula weight are listed in Table 6.6 together with their Gibbs energy of formation $\Delta G^0_{f,1000}$ from the elements given by Robie et al. (1978). Given a rock with molar composition $n^0_{\mathrm{SiO_2}} = 0.45$, $n^0_{\mathrm{MgO}} = 0.45$, and $n^0_{\mathrm{CaO}} = 0.10$, find the stable mineral assemblage.

Table 6.6. The component matrix and Gibbs free energy of formation for various minerals in the system SiO2-MgO-CaO.

#   Mineral           SiO2   MgO   CaO   ΔGf,1000 (kJ mol-1)
1   periclase (pe)    0      1     0     -493.092
2   forsterite (fo)   1      2     0     -1771.526
3   enstatite (en)    1      1     0     -1256.427
4   quartz (qz)       1      0     0     -729.920
5   diopside (di)     2      1     1     -2630.281
6   merwinite (me)    2      1     3     -3810.761
7   larnite (la)      1      0     2     -1926.167
8   lime (li)         0      0     1     -531.007

In a three-component space, three vectors at most can be independent. The vector $n$ of eight variables (mole numbers) is therefore split into a vector $n_B$ of three base variables with non-negative values and a vector $n_F$ of eight minus three = five free variables equal to zero. Likewise, the matrix $B$ is split into a $3 \times 3$ matrix $B_B$ and a $5 \times 3$ matrix $B_F$, while $g$ is split as $g_B$ and $g_F$. Expressing the rock composition with respect to oxide minerals ensures that the components of the base vector $n_B$ combine as non-negative numbers to form the recipe. We therefore use the mineral assemblage qz-pe-li as the starting assemblage and rearrange $B$ and $g$ accordingly, i.e. (the last column gives the mineral number of Table 6.6),

$$B = \begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
1 & 2 & 0 \\
1 & 1 & 0 \\
2 & 1 & 1 \\
2 & 1 & 3 \\
1 & 0 & 2
\end{bmatrix},\quad
g = \begin{bmatrix}
-729.920 \\ -493.092 \\ -531.007 \\ -1771.526 \\ -1256.427 \\ -2630.281 \\ -3810.761 \\ -1926.167
\end{bmatrix}
\begin{matrix} 4 \\ 1 \\ 8 \\ 2 \\ 3 \\ 5 \\ 6 \\ 7 \end{matrix}$$

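The mole numbers of oxide minerals are obtained from equation (6.3.20) as

$$n_B^T = q^T B_B^{-1} = [0.45\ \ 0.45\ \ 0.10] \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = [0.45\ \ 0.45\ \ 0.10]$$

which gives a total Gibbs free energy $G$ of $-603.456$ kJ mol$^{-1}$.

$$B_F B_B^{-1} = \begin{bmatrix} 1 & 2 & 0 \\ 1 & 1 & 0 \\ 2 & 1 & 1 \\ 2 & 1 & 3 \\ 1 & 0 & 2 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 2 & 0 \\ 1 & 1 & 0 \\ 2 & 1 & 1 \\ 2 & 1 & 3 \\ 1 & 0 & 2 \end{bmatrix}$$

We now calculate the constrained gradient of $G$ relative to the free variables making the components of the vector $n_F$

$$g_F - B_F B_B^{-1} g_B = \begin{bmatrix} -1771.526 \\ -1256.427 \\ -2630.281 \\ -3810.761 \\ -1926.167 \end{bmatrix} - \begin{bmatrix} 1 & 2 & 0 \\ 1 & 1 & 0 \\ 2 & 1 & 1 \\ 2 & 1 & 3 \\ 1 & 0 & 2 \end{bmatrix} \begin{bmatrix} -729.920 \\ -493.092 \\ -531.007 \end{bmatrix} = \begin{bmatrix} -55.422 \\ -33.415 \\ -146.342 \\ -264.808 \\ -134.233 \end{bmatrix}$$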

The most negative component is $i = 4$, so the merwinite mole number will be moved from the status of a free variable to that of a base variable. The fourth row $u_4^T$ of $B_F B_B^{-1}$ is $[2, 1, 3]$ and the components of $n_B/u_4$ are $0.45/2 = 0.225$, $0.45/1 = 0.45$, $0.10/3 = 0.033$. The third component ($k = 3$, lime) of $n_B$ is first to reach zero upon increase of merwinite. We therefore exchange the rows assigned to lime and merwinite in the matrix $B$ and the vector $g$ as

$$B = \begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
2 & 1 & 3 \\
1 & 2 & 0 \\
1 & 1 & 0 \\
2 & 1 & 1 \\
0 & 0 & 1 \\
1 & 0 & 2
\end{bmatrix},\quad
g = \begin{bmatrix}
-729.920 \\ -493.092 \\ -3810.761 \\ -1771.526 \\ -1256.427 \\ -2630.281 \\ -531.007 \\ -1926.167
\end{bmatrix}
\begin{matrix} 4 \\ 1 \\ 6 \\ 2 \\ 3 \\ 5 \\ 8 \\ 7 \end{matrix}$$

The procedure is repeated, which produces the successive replacements li ⇒ me ⇒ di, then pe ⇒ fo, and finally qz ⇒ en. At this stage, the mineral assemblage of the rock is $n_{en} = 0.15$, $n_{fo} = 0.10$, $n_{di} = 0.10$, while the constrained gradient $g_F - B_F B_B^{-1} g_B$ along the free variables has only positive components [22.007 (pe); 11.408 (qz); 84.572 (me); 101.519 (li); 80.213 (la)]. This last condition shows that the value of $G = -628.645$ kJ mol$^{-1}$ is minimum.
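The exchange procedure of this example can be sketched in a few lines. This is a minimal illustration, assuming NumPy and the data of Table 6.6; the function name, the tolerance and the iteration cap are hypothetical choices, and no safeguard against degenerate exchanges is included.

```python
import numpy as np

MINERALS = ["pe", "fo", "en", "qz", "di", "me", "la", "li"]
# Columns: SiO2, MgO, CaO (Table 6.6).
B = np.array([
    [0, 1, 0], [1, 2, 0], [1, 1, 0], [1, 0, 0],
    [2, 1, 1], [2, 1, 3], [1, 0, 2], [0, 0, 1],
], dtype=float)
g = np.array([-493.092, -1771.526, -1256.427, -729.920,
              -2630.281, -3810.761, -1926.167, -531.007])
q = np.array([0.45, 0.45, 0.10])


def exchange_minimization(B, g, q, base, tol=1e-9, max_iter=100):
    """Swap one free mineral for one base mineral while G can decrease."""
    base = list(base)
    for _ in range(max_iter):
        free = [j for j in range(len(g)) if j not in base]
        BB_T = B[base].T
        n_base = np.linalg.solve(BB_T, q)          # n_B^T = q^T B_B^{-1}
        U = np.linalg.solve(BB_T, B[free].T).T     # rows of B_F B_B^{-1}
        reduced = g[free] - U @ g[base]            # constrained gradient
        i = int(np.argmin(reduced))
        if reduced[i] >= -tol:
            return base, n_base, reduced           # no further decrease
        u = U[i]
        ratios = np.full(len(base), np.inf)
        positive = u > tol
        ratios[positive] = n_base[positive] / u[positive]
        k = int(np.argmin(ratios))                 # first base variable to vanish
        base[k] = free[i]
    raise RuntimeError("no convergence within max_iter exchanges")


# Start from the oxide assemblage qz-pe-li (indices 3, 0, 7).
base, n_base, reduced = exchange_minimization(B, g, q, base=[3, 0, 7])
assemblage = {MINERALS[j]: float(n) for j, n in zip(base, n_base)}
G = float(n_base @ g[base])
print(assemblage, round(G, 3))
```

The same problem could also be handed to a general linear programming solver; the explicit loop is kept here only because it mirrors the steps described in the text.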

7 Dynamic systems

7.1 Introduction

Dynamics deals with changes in the state of a system with time. We can think of a geological system evolving in response to changes in geological parameters that are not explicitly time-dependent: although trace-element contents in differentiating magmas may change as a function of descriptive parameters that are time-dependent, they can be adequately described by their degree of fractionation. Likewise, the chemistry of clastic sediments with different provenance can be thought of as resulting from a time-dependent process, but most local chemical aspects of these sediments can be handled efficiently using source composition and mixing proportions. These systems are not described as dynamic systems because the time-dependence is not a critical factor in determining the geochemical variable of interest. In contrast, other systems involve characteristic time-scales, such as those introduced in geochemistry through fluxes, which are time-dependent in essence, and through homogenization processes, which need some time to complete. These are real dynamic systems.

Let us first introduce some important definitions with the help of some simple mathematical concepts. Critical aspects of the evolution of a geological system, e.g., the mantle, the ocean, the Phanerozoic clastic sediments, ..., can often be adequately described with a limited set of geochemical variables. These variables, which are typically concentrations, concentration ratios and isotope compositions, evolve in response to change in some parameters, such as the volume of continental crust or the release of carbon dioxide in the atmosphere. We assume that one such variable, which we label $f$, is a function of time and other geochemical parameters. The rate of change in $f$ per unit time can be written

$$\frac{df}{dt} = F(f, x_i, t) \qquad (7.1.1)$$

where $x_i = x_i(t)$ is a time-dependent external parameter (e.g., temperature) and $F = F(f, x_i, t)$ any suitable function. Only one external parameter is needed to illustrate the general behavior of dynamic systems. $f_0$ and $F_0$ are the values of $f$ and $F$ at $t = 0$. A Taylor expansion of $F$ to the first term in the neighborhood of $F_0$ gives

$$\frac{df}{dt} = F_0 + \left(\frac{\partial F}{\partial f}\right)_0 (f - f_0) + \left(\frac{\partial F}{\partial x_i}\right)_0 (x_i - x_{i0}) + \left(\frac{\partial F}{\partial t}\right)_0 t$$

where zero subscripts indicate that derivatives are taken at $t = 0$. For sake of demonstration, we consider that $F$ does not depend explicitly on time (autonomous system) and $x_i$ is constant. The last two terms of the right-hand side vanish and the rate of change equation becomes

$$\frac{df}{dt} = F_0 + \left(\frac{\partial F}{\partial f}\right)_0 (f - f_0)$$

Let $a$ be the derivative $(\partial F/\partial f)_0$, then

$$\frac{df}{dt} = F_0 + a (f - f_0)$$

Changing to a new variable $F_0/a + f - f_0$ leads to the solution

$$f - f_0 = \frac{F_0}{a}\left(e^{at} - 1\right)$$

Only negative values of $a$ lead to physically bounded values of $f$. The reciprocal of $-a$ has the dimension of time and is called the relaxation time of $f$ in the system. (We will see later that for chemical adjustments, this parameter has the meaning of a residence time.) Relaxation time is a measure of how fast an isolated system adjusts to a change in its conditions (i.e., relaxes). $F_0$ is the forcing constant and produces a systematic drift in the chemical state of the system that depends on how the system interacts with its surroundings. In very simple words, forcing terms tell us where the system goes, while relaxation time measures the pace of change. Generally, most systems can have their rate of change described in the general form

$$\tau \frac{df}{dt} + f = h(x_i, t) \qquad (7.1.2)$$

where $\tau$ is the relaxation time, possibly time-dependent, and $h$ a forcing function of time-dependent parameters. $h$ can be deterministic if the function is exactly known for each $t$: this will be the case for the concentrations and isotopic ratios of interacting reservoirs. $h$ can also be stochastic if only its statistical properties such as its mean and variance are known, as in the case of Brownian motion and diffusion. It is known in the latter case under the name of the Langevin equation (Haken, 1978).

7.2 Single-variable residence time analysis

7.2.1 Non-reactive species

A system, which can be thought of as a lake, an ocean basin, a domain of the mantle or of the crust, ..., has a constant volume $V$ in m$^3$ (Figure 7.1). It receives an input $Q$ of material (water, magma, sediments, ...) in m$^3$ a$^{-1}$ and releases an equivalent output. Assuming that the system is well-stirred, $C^i$ represents volumic concentrations in mol m$^{-3}$ of a conservative chemical species $i$. By conservative, it is meant (see Chapter 1) that its concentration in the reservoir can be modified only by processes taking place at the boundaries. Species $i$ can be added to or subtracted from the system by solid, liquid or gaseous input and output, not by chemical reaction or radioactive decay inside the reservoir.

Figure 7.1 Box model for a non-reactive species $i$.

For the sake of illustration, we will consider a water reservoir, whose properties will be labeled 'liq'. Mass balance requires

$$\frac{d\left(V C^i_{\mathrm{liq}}\right)}{dt} = Q C^i_{\mathrm{in}} - Q C^i_{\mathrm{liq}}$$

'liq' and 'in' subscripts refer to the liquid (the reservoir and outlets) and input (upstream) values, respectively. Assuming constant $V$, we get

$$\frac{V}{Q} \frac{dC^i_{\mathrm{liq}}}{dt} + C^i_{\mathrm{liq}} = C^i_{\mathrm{in}} \qquad (7.2.1)$$

which has the form of equation (7.1.2). We will first investigate the evolution of an element in a few simple situations where it is hosted by a single reservoir and its concentration is affected by input and output fluxes and also by its chemical reactivity.

a) The amount $M^i(0)$ of species i is released at t = 0 in a reservoir which is initially free of this species, i.e.,

$$C_{\mathrm{in}}^{i} = 0 \quad \text{for } t > 0$$

This is the simplest of autonomous systems with no 'force' acting on it (pure relaxation). At t = 0, the concentration is

$$C_{\mathrm{liq}}^{i}(0) = \frac{M^i(0)}{V}$$

and, therefore

$$C_{\mathrm{liq}}^{i}(t) = \frac{M^i(0)}{V}\exp\left(-\frac{t}{V/Q}\right) \qquad (7.2.2)$$

Multiplying (7.2.2) by V, the amount $M^i(t)$ left in the reservoir at t is therefore

$$M^i(t) = M^i(0)\exp\left(-\frac{t}{V/Q}\right) \qquad (7.2.3)$$

Between t and t + dt, the outlet of the reservoir loses a quantity $dM^i(t)$ of species i that has 'resided' for a time t in the reservoir, given by

$$dM^i(t) = -Q\,C_{\mathrm{liq}}^{i}(t)\,dt = -M^i(0)\,\frac{Q}{V}\exp\left(-\frac{t}{V/Q}\right)dt$$

The residence time $\bar{t}$ of the species i is the mean age of each mass fraction, i.e.:

$$\bar{t} = \frac{1}{M^i(\infty) - M^i(0)}\int_{M^i(0)}^{M^i(\infty)} t\,dM^i(t) = \int_0^{\infty}\frac{t}{V/Q}\exp\left(-\frac{t}{V/Q}\right)dt$$

since $M^i(\infty) = 0$. Introducing the new variable u = Qt/V, we obtain

$$\bar{t} = \frac{V}{Q}\int_0^{\infty} u\,e^{-u}\,du$$

which is integrated by parts as

$$\bar{t} = \frac{V}{Q}\left[-u\,e^{-u}\right]_0^{\infty} + \frac{V}{Q}\int_0^{\infty} e^{-u}\,du$$

and finally

$$\bar{t} = V/Q \qquad (7.2.4)$$

The relaxation time V/Q is therefore the mean residence time of the species i in the reservoir. This parameter does not depend on the nature of the species as long as the species is non-reacting. In particular, V/Q is also the water residence (or renewal or flushing) time. For this reason, it will be denoted $\tau_H$.

An alternative and illustrative derivation of the residence-time equations involves Dirac delta-distributions. Let us assume that $C_{\mathrm{in}}^{i}(t) = M^i(0)\,\delta(t)/Q$ or, equivalently, that a mass $M^i(0)$ of i is injected into the reservoir at t = 0, since

$$\int_{-\infty}^{+\infty} Q\,C_{\mathrm{in}}^{i}\,dt = M^i(0)\int_{-\infty}^{+\infty}\delta(t)\,dt = M^i(0)$$

The homogeneous system has the solution

$$C_{\mathrm{liq}}^{i}(t) = A(t)\exp\left(-\frac{Q}{V}t\right)$$

where A(t) is a time-dependent parameter. Taking the derivative and comparing with equation (7.2.1) leads to

$$\frac{dA(t)}{dt}\exp\left(-\frac{Q}{V}t\right) = \frac{Q}{V}C_{\mathrm{in}}^{i}(t) = \frac{M^i(0)}{V}\,\delta(t)$$

Again, we make use of the property of the delta-function integral to solve this first-order differential equation as

$$A(t) - A(0) = \frac{M^i(0)}{V}\int_0^{t}\delta(u)\exp\left(\frac{Q}{V}u\right)du = \frac{M^i(0)}{V}\exp\left(\frac{Q}{V}\,0\right) = \frac{M^i(0)}{V}$$

The reservoir being devoid of element i at time t = 0, A(0) is zero and therefore

$$C_{\mathrm{liq}}^{i}(t) = \frac{M^i(0)}{V}\exp\left(-\frac{Q}{V}t\right)$$

which is identical to equation (7.2.2). The exponential is the unit response function of the system, a fairly general concept that shows how transit through the system 'spreads' a unit input signal. More generally, time-dependent input signals can be decomposed as a succession of delta-like input signals of varying intensities and their individual outputs summed up in order to recover the total output signal. The unit response function is similar to the Green functions also hinted at in Chapter 8. However illustrative and elegant this method, most solutions can be derived in a much more straightforward way by calculus techniques.

b) Change in the input concentration at t = 0. The condition is now $C_{\mathrm{in}}^{i} = \text{const}$ for t > 0. Upon integration of equation (7.2.1), we get

$$C_{\mathrm{liq}}^{i}(t) = C_{\mathrm{liq}}^{i}(0)\,e^{-t/\tau_H} + C_{\mathrm{in}}^{i}\left(1 - e^{-t/\tau_H}\right)$$
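The result (7.2.4) can be checked numerically. The following is a minimal Python sketch, not part of the original derivation: it evaluates the pulse response of equation (7.2.2) and estimates the mean age of the mass leaving the reservoir with a midpoint rule. The function names and the values of V and Q are illustrative assumptions.

```python
import math

def pulse_response(m0, volume, flow, t):
    """Concentration at time t after a pulse m0 released at t = 0, equation (7.2.2)."""
    return (m0 / volume) * math.exp(-t * flow / volume)

def mean_residence_time(volume, flow, n_steps=200_000, t_max_factor=40.0):
    """Midpoint-rule estimate of the mean age of the mass leaving the reservoir."""
    tau = volume / flow
    dt = t_max_factor * tau / n_steps
    total = 0.0
    for k in range(n_steps):
        t = (k + 0.5) * dt
        # fraction of the initial mass that leaves between t and t + dt
        leaving = (flow / volume) * math.exp(-t / tau) * dt
        total += t * leaving
    return total

# Illustrative values (assumptions): volume in m^3, flow in m^3 per year
V, Q = 1.25e8, 9.0e7
t_bar = mean_residence_time(V, Q)
print(t_bar, V / Q)  # the two numbers should agree closely
```

The integral is truncated at 40 relaxation times, beyond which the remaining mass is negligible.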

7.2.2 Reactive species

We assume (Figure 7.2) that sedimentation takes place in the reservoir at the rate P and that the species i under consideration is entrained by the sediment with a concentration $C_{\mathrm{sed}}^{i}$

$$\frac{d\left(V C_{\mathrm{liq}}^{i}\right)}{dt} = Q\,C_{\mathrm{in}}^{i} - Q\,C_{\mathrm{liq}}^{i} - P\,C_{\mathrm{sed}}^{i} \qquad (7.2.6)$$

We introduce the solid-liquid partition coefficient $D_i$

$$D_i = \frac{C_{\mathrm{sed}}^{i}}{C_{\mathrm{liq}}^{i}}$$

Residence time and the forcing term become apparent when this equation is rearranged in the form of equation (7.1.2) as

$$\frac{V}{Q}\frac{dC_{\mathrm{liq}}^{i}}{dt} = C_{\mathrm{in}}^{i} - \left(1 + \frac{P D_i}{Q}\right)C_{\mathrm{liq}}^{i}$$

Figure 7.2 Box model for a reactive species i.

which gives the dynamic equation

$$\frac{\tau_H}{\alpha_i}\frac{dC_{\mathrm{liq}}^{i}}{dt} = \frac{C_{\mathrm{in}}^{i}}{\alpha_i} - C_{\mathrm{liq}}^{i} \qquad (7.2.7)$$

The factor $\alpha_i$ defined as

$$\alpha_i = 1 + \frac{P D_i}{Q} \qquad (7.2.8)$$

is a coefficient that measures the reactivity of the element in the reservoir and is equal to unity for a non-reactive species. Both the residence time $\tau_i$,

$$\tau_i = \frac{\tau_H}{\alpha_i} \qquad (7.2.9)$$

and the forcing term are inversely proportional to $\alpha_i$. Note the additivity of inverse residence times

$$\frac{1}{\tau_i} = \frac{Q}{V} + \frac{P}{V}D_i = \frac{1}{\tau_H} + \frac{1}{\tau_{\mathrm{sed}}} \qquad (7.2.10)$$

Residence time and reactivity are strongly correlated through equation (7.2.9). This is true for sea water composition since Whitfield and Turner (1979) showed a rather good correlation between oceanic residence times and seawater-crustal rock partition coefficients, which are taken as a measure of element reactivity in the ocean. Actually, a better estimate of reactivity is given by oceanic suspensions, so Li (1982) suggested using pelagic clay-seawater concentration ratios as a proxy for partition coefficients. The mass balance equation (7.2.7) will now be solved for different cases: (a) a finite amount of the species i is added to the reservoir at t = 0, (b) upstream concentration $C_{\mathrm{in}}^{i}$ is changed at t = 0, and (c) upstream concentration $C_{\mathrm{in}}^{i}$ is a periodic function of time.

a) The amount $M^i(0)$ of species i is released at t = 0 in a reservoir which is initially free of this species, i.e.,

$$C_{\mathrm{in}}^{i} = 0 \quad \text{for } t > 0$$

At t = 0, the concentration is

$$C_{\mathrm{liq}}^{i}(0) = \frac{M^i(0)}{V}$$

Concentration is expressed as above

$$C_{\mathrm{liq}}^{i}(t) = \frac{M^i(0)}{V}\,e^{-t/\tau_i} \qquad (7.2.11)$$

with the mass $M^i(t)$ held by the reservoir being

$$M^i(t) = M^i(0)\,e^{-t/\tau_i}$$

b) The input concentration changes at t = 0. The condition now is $C_{\mathrm{in}}^{i} = \text{const}$ for t > 0, then

$$\frac{\tau_H}{\alpha_i}\frac{d}{dt}\left(C_{\mathrm{liq}}^{i} - \frac{C_{\mathrm{in}}^{i}}{\alpha_i}\right) = -\left(C_{\mathrm{liq}}^{i} - \frac{C_{\mathrm{in}}^{i}}{\alpha_i}\right)$$

which is equivalent to

$$C_{\mathrm{liq}}^{i}(t) = C_{\mathrm{liq}}^{i}(0)\,e^{-t/\tau_i} + \frac{C_{\mathrm{in}}^{i}}{\alpha_i}\left(1 - e^{-t/\tau_i}\right) \qquad (7.2.12)$$

with the steady-state concentration $C_{\mathrm{liq}}^{i}(\infty)$ given by

$$C_{\mathrm{liq}}^{i}(\infty) = \frac{C_{\mathrm{in}}^{i}}{\alpha_i} \qquad (7.2.13)$$
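Equations (7.2.8) to (7.2.13) are straightforward to evaluate. The sketch below is a hedged Python illustration, assuming the reactivity factor takes the form $\alpha_i = 1 + P D_i/Q$ (obtained by substituting $C_{\mathrm{sed}}^{i} = D_i C_{\mathrm{liq}}^{i}$ into the mass balance (7.2.6)). The numerical values are those quoted for cadmium in the Greifensee in the worked example of this section; the function names are hypothetical.

```python
import math

def reactivity(sed_rate, partition_coeff, flow):
    """Reactivity factor alpha = 1 + P*D/Q (assumed form of equation 7.2.8)."""
    return 1.0 + sed_rate * partition_coeff / flow

def step_response(c0, c_in, tau_h, alpha, t):
    """Concentration after a step change of the input, equation (7.2.12)."""
    tau_i = tau_h / alpha  # residence time of the species, equation (7.2.9)
    decay = math.exp(-t / tau_i)
    return c0 * decay + (c_in / alpha) * (1.0 - decay)

# Greifensee values quoted in the text: V (m^3), Q (m^3/a), P (kg/a), D_Cd (m^3/kg)
V, Q, P, D_cd = 1.25e8, 9.0e7, 4.0e7, 65.0
tau_h = V / Q                        # water flushing time, about 1.4 a
alpha_cd = reactivity(P, D_cd, Q)    # close to 30
tau_cd = tau_h / alpha_cd            # Cd residence time in years
print(tau_h, alpha_cd, tau_cd)
```

With these numbers the Cd residence time is a few hundredths of a year, that is, a few weeks, against about 17 months for water.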

Steady-state is established more rapidly for a reactive than for a non-reactive species and the steady-state concentration will be smaller (Figure 7.3).

Cadmium in the Greifensee Lake (Stumm and Morgan, 1981). This Swiss lake has a volume V of $1.25 \times 10^{8}\ \mathrm{m^3}$, water input is $Q = 9 \times 10^{7}\ \mathrm{m^3\,a^{-1}}$, sedimentation rate is $P = 4 \times 10^{7}\ \mathrm{kg\,a^{-1}}$. Cd has a sediment-water partition coefficient $D_{\mathrm{Cd}} = 65\ \mathrm{m^3\,kg^{-1}}$. Calculate the Cd residence time in the lake.

Let us compute the water flushing time

$$\tau_H = \frac{V}{Q} = \frac{1.25 \times 10^{8}}{9 \times 10^{7}} = 1.4\ \mathrm{a}\ (\approx 17\ \text{months})$$

and the reactivity factor from equation (7.2.8)

$$\alpha_{\mathrm{Cd}} = 1 + \frac{P D_{\mathrm{Cd}}}{Q} = 1 + \frac{4 \times 10^{7} \times 65}{9 \times 10^{7}} \approx 30$$

The residence time $\tau_{\mathrm{Cd}} = \tau_H/\alpha_{\mathrm{Cd}}$ and the limiting concentration $C_{\mathrm{in}}^{\mathrm{Cd}}/\alpha_{\mathrm{Cd}}$ are divided by a factor of ~30 relative to a non-reactive case, e.g., chlorine. Entrainment by sediments flushes the excess Cd 30 times faster and decreases Cd steady-state concentration 30 times relative to a sediment-free lake.

Figure 7.3 Comparative evolution of the concentration for a non-reactive species and a reactive species when the input concentration is doubled at t = 0. In this particular case, $\tau_H = 0.2$ is the water residence time in time units, $\alpha_i = 4$ the reactivity coefficient, equation (7.2.8), of the reactive species.

c) Input concentration is a periodic function of time. The input concentration is assumed to take the form

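$$C_{\mathrm{in}}^{i}(t) = C_{\mathrm{in},0}^{i} + \Delta C_{\mathrm{in}}^{i}\sin 2\pi\frac{t}{T}$$

where T is the period and $C_{\mathrm{in},0}^{i}$ and $\Delta C_{\mathrm{in}}^{i}$ are constant terms. The time-dependent forcing term is no longer zero. It seems reasonable to look for a solution in the form

$$C_{\mathrm{liq}}^{i}(t) = C_{\mathrm{liq},0}^{i} + \Delta C_{\mathrm{liq}}^{i}\sin 2\pi\frac{t - \delta t}{T}$$

where $C_{\mathrm{liq},0}^{i}$, $\Delta C_{\mathrm{liq}}^{i}$, and the phase shift $\delta t$ are constants to be determined from the mass balance equation (7.2.7) rewritten as

$$\frac{dC_{\mathrm{liq}}^{i}}{dt} = -\frac{1}{\tau_H}\left[\alpha_i C_{\mathrm{liq},0}^{i} + \alpha_i\,\Delta C_{\mathrm{liq}}^{i}\sin 2\pi\frac{t - \delta t}{T} - C_{\mathrm{in},0}^{i} - \Delta C_{\mathrm{in}}^{i}\sin 2\pi\frac{t}{T}\right] \qquad (7.2.14)$$

We use the identity

$$\sin 2\pi\frac{t - \delta t}{T} = \sin 2\pi\frac{t}{T}\cos 2\pi\frac{\delta t}{T} - \cos 2\pi\frac{t}{T}\sin 2\pi\frac{\delta t}{T}$$

on both sides of equation (7.2.14) and evaluate the derivative on the left-hand side, i.e.,

$$\frac{dC_{\mathrm{liq}}^{i}}{dt} = \frac{2\pi}{T}\,\Delta C_{\mathrm{liq}}^{i}\left[\cos 2\pi\frac{t}{T}\cos 2\pi\frac{\delta t}{T} + \sin 2\pi\frac{t}{T}\sin 2\pi\frac{\delta t}{T}\right]$$

First, the constant terms must cancel out, hence

$$C_{\mathrm{liq},0}^{i} = \frac{C_{\mathrm{in},0}^{i}}{\alpha_i}$$

The terms in $\cos(2\pi t/T)$ on both sides must be equal, hence

$$\frac{2\pi}{T}\,\Delta C_{\mathrm{liq}}^{i}\cos 2\pi\frac{\delta t}{T} = \frac{\alpha_i}{\tau_H}\,\Delta C_{\mathrm{liq}}^{i}\sin 2\pi\frac{\delta t}{T}$$

or

$$\tan 2\pi\frac{\delta t}{T} = \frac{2\pi\tau_H}{\alpha_i T} = \frac{2\pi\tau_i}{T} \qquad (7.2.15)$$

The equality of the sine terms gives

$$\frac{2\pi}{T}\,\Delta C_{\mathrm{liq}}^{i}\sin 2\pi\frac{\delta t}{T} = -\frac{1}{\tau_H}\left[\alpha_i\,\Delta C_{\mathrm{liq}}^{i}\cos 2\pi\frac{\delta t}{T} - \Delta C_{\mathrm{in}}^{i}\right]$$

or

$$\frac{\Delta C_{\mathrm{in}}^{i}}{\Delta C_{\mathrm{liq}}^{i}} = \frac{2\pi\tau_H}{T}\sin 2\pi\frac{\delta t}{T} + \alpha_i\cos 2\pi\frac{\delta t}{T} = \cos 2\pi\frac{\delta t}{T}\left(\frac{2\pi\tau_H}{T}\tan 2\pi\frac{\delta t}{T} + \alpha_i\right) = \alpha_i\cos 2\pi\frac{\delta t}{T}\left(\tan^2 2\pi\frac{\delta t}{T} + 1\right)$$

Using the well-known identity $\cos^2 = (1 + \tan^2)^{-1}$ and taking the reciprocal of each side

$$\frac{\Delta C_{\mathrm{liq}}^{i}}{\Delta C_{\mathrm{in}}^{i}} = \frac{1}{\alpha_i\sqrt{1 + \left(2\pi\tau_i/T\right)^2}}$$

Given that $\alpha_i \geq 1$, these equations show interesting properties of the solution (Figure 7.4). First, the amplitude $\Delta C_{\mathrm{liq}}^{i}$ of concentration fluctuations in the reservoir is damped relative to the amplitude $\Delta C_{\mathrm{in}}^{i}$ of the input fluctuations by a factor which depends on both the residence time $\tau_H$ of the fluid in the reservoir and the reactivity $\alpha_i$ of the element.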
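The phase relation (7.2.15) and the damping factor derived from it lend themselves to a quick numerical evaluation. The following Python sketch is only an illustration under the assumptions of this section (well-stirred reservoir, sinusoidal input). It takes the damping factor to be $1/(\alpha_i\sqrt{1 + (2\pi\tau_i/T)^2})$, which follows from combining the sine-term equality with (7.2.15); the function name and the sample values are hypothetical.

```python
import math

def periodic_response(tau_h, alpha, period):
    """Return (delay, damping) of the reservoir response to a sinusoidal input.

    delay   : phase delay from tan(2*pi*delay/T) = 2*pi*tau_i/T, equation (7.2.15)
    damping : ratio of reservoir to input fluctuation amplitudes
    """
    tau_i = tau_h / alpha
    x = 2.0 * math.pi * tau_i / period
    delay = period * math.atan(x) / (2.0 * math.pi)
    damping = 1.0 / (alpha * math.sqrt(1.0 + x * x))
    return delay, damping

# Sample values (assumptions): tau_H = 0.2, alpha = 4, period = 1, in the same time unit
delay, damping = periodic_response(0.2, 4.0, 1.0)
print(delay, damping)
```

For a very long residence time the delay returned by this function approaches T/4, and for a very short one it approaches zero.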

Figure 7.4 Effect of a periodically changing input concentration $C_{\mathrm{in}}^{i}$ for a species i in a well-stirred reservoir. T is the period. The concentration $C_{\mathrm{liq}}^{i}$ in the reservoir shows fluctuations with the same period T, but delayed by $\delta t$ and damped.

Second, the fluctuation is delayed by a time $\delta t$ which is a function of the residence time $\tau_i$ of the element in the reservoir. For an infinite residence time the argument of the tangent tends towards $\pi/2$ and the delay $\delta t$ towards T/4, while for a short residence time, the delay tends towards zero. As expected, reactive elements respond more rapidly than inert elements. The phase shift and the damping factor relating input to output concentrations represent the phase and modulus of a complex function known as the transfer function of the reservoir. Such a function, however, is most conveniently introduced via Laplace and Fourier transforms. Applications of these geochemical concepts to the dynamics of volcanic sequences can be found in Albarede (1993).

7.2.3 Radioactive decay and first-order kinetics

When species i disappears by either radioactive decay or chemical reaction with first-order kinetics, the mass balance equation must be changed according to

$$\frac{d\left(V C_{\mathrm{liq}}^{i}\right)}{dt} = Q\,C_{\mathrm{in}}^{i} - Q\,C_{\mathrm{liq}}^{i} - P\,C_{\mathrm{sed}}^{i} - \lambda_i V C_{\mathrm{liq}}^{i} \qquad (7.2.17)$$

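where $\lambda_i$ is the decay constant (or kinetic rate coefficient) of the species i. The equation is easily modified into

$$\tau_H\frac{dC_{\mathrm{liq}}^{i}}{dt} = C_{\mathrm{in}}^{i} - \left(\alpha_i + \lambda_i\tau_H\right)C_{\mathrm{liq}}^{i}$$

The equations developed above for stable elements can therefore be worked out for radioactive elements or chemical reactions once the reactivity factor $\alpha_i$ has been changed into $\alpha_i + \lambda_i\tau_H$. The residence time $\tau_i^{*}$ of the element i in the system is now

$$\tau_i^{*} = \frac{\tau_H}{\alpha_i + \lambda_i\tau_H} \qquad (7.2.18)$$

and the limiting or steady-state concentration

$$C_{\mathrm{liq}}^{i}(\infty) = \frac{C_{\mathrm{in}}^{i}}{\alpha_i + \lambda_i\tau_H}$$

For a pair made of a radioactive isotope i and a stable isotope j of the same element (e.g., $^{14}$C/$^{12}$C), it can be safely assumed that $\alpha_i = \alpha_j$. In this case, their ratio at steady-state may be written

$$\left(\frac{C^i}{C^j}\right)_{\mathrm{liq}} = \left(\frac{C^i}{C^j}\right)_{\mathrm{in}}\frac{\alpha_j}{\alpha_j + \lambda_i\tau_H} = \left(\frac{C^i}{C^j}\right)_{\mathrm{in}}\frac{1}{1 + \lambda_i\tau_j}$$

In a well-stirred reservoir at steady-state, we can calculate the residence time of the element from

$$\tau_j = \frac{1}{\lambda_i}\left[\frac{(C^i/C^j)_{\mathrm{in}}}{(C^i/C^j)_{\mathrm{liq}}} - 1\right] \qquad (7.2.19)$$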

Broecker and Li (1970) and Broecker (1974) found that the $^{14}$C/$^{12}$C ratio in the deep ocean was 84 percent of this ratio in the pre-bomb surface ocean. Assuming that surface carbon (dissolved and falling debris) is the only source of deep ocean carbon, calculate the residence time $\tau_{\mathrm{C}}$ of this element in the deep ocean. The $^{14}$C decay constant is $1.2 \times 10^{-4}\ \mathrm{a^{-1}}$.

From equation (7.2.19), we calculate the residence time of $^{14}$C in the deep ocean as

$$\tau_{\mathrm{C}} = \frac{1}{1.2 \times 10^{-4}} \times \left(\frac{1}{0.84} - 1\right) = 1600\ \mathrm{a}$$
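The arithmetic of this example can be reproduced in a couple of lines. The sketch below is a minimal Python illustration, assuming equation (7.2.19) in the form $\tau = (1/\lambda)\,[R_{\mathrm{in}}/R_{\mathrm{liq}} - 1]$, where R stands for the radioactive-to-stable isotope ratio; the function name is hypothetical.

```python
def residence_time_from_ratio(decay_constant, ratio_liq_over_in):
    """Residence time from a steady-state isotope ratio, equation (7.2.19).

    decay_constant    : decay constant of the radioactive isotope (per year)
    ratio_liq_over_in : reservoir ratio divided by input ratio (0.84 for deep-ocean 14C)
    """
    return (1.0 / ratio_liq_over_in - 1.0) / decay_constant

tau_c = residence_time_from_ratio(1.2e-4, 0.84)
print(round(tau_c))  # close to the 1600 a quoted in the text
```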

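$\tau_{\mathrm{C}}$ is also known, somewhat improperly, as the mixing time of the deep ocean.

7.2.4 Isotope and trace-element ratios

Let us consider two reactive species i and j (ions, elements, or isotopes) in a reservoir and their rate of change governed by the equations

$$\frac{dC_{\mathrm{liq}}^{i}}{dt} = -\frac{1}{\tau_H}\left(\alpha_i C_{\mathrm{liq}}^{i} - C_{\mathrm{in}}^{i}\right), \qquad \frac{dC_{\mathrm{liq}}^{j}}{dt} = -\frac{1}{\tau_H}\left(\alpha_j C_{\mathrm{liq}}^{j} - C_{\mathrm{in}}^{j}\right) \qquad (7.2.20)$$

Using the rule of ratio differentiation, the rate of change of the ratio $R = C^i/C^j$ can be written

$$\frac{dR_{\mathrm{liq}}}{dt} = \frac{1}{C_{\mathrm{liq}}^{j}}\frac{dC_{\mathrm{liq}}^{i}}{dt} - \frac{C_{\mathrm{liq}}^{i}}{\left(C_{\mathrm{liq}}^{j}\right)^2}\frac{dC_{\mathrm{liq}}^{j}}{dt} \qquad (7.2.21)$$

Inserting equations (7.2.20) into equation (7.2.21) gives

$$\frac{dR_{\mathrm{liq}}}{dt} = -\frac{1}{\tau_H}\frac{\alpha_i C_{\mathrm{liq}}^{i} - C_{\mathrm{in}}^{i}}{C_{\mathrm{liq}}^{j}} + \frac{1}{\tau_H}\frac{\alpha_j C_{\mathrm{liq}}^{j} - C_{\mathrm{in}}^{j}}{C_{\mathrm{liq}}^{j}}R_{\mathrm{liq}}$$

Upon simplification, this equation becomes

$$\tau_H\frac{dR_{\mathrm{liq}}}{dt} = -\left(\alpha_i - \alpha_j\right)R_{\mathrm{liq}} + \frac{C_{\mathrm{in}}^{j}}{C_{\mathrm{liq}}^{j}}\left(R_{\mathrm{in}} - R_{\mathrm{liq}}\right)$$

Reformulating the rate of change of element j in equation (7.2.20) as

$$\frac{C_{\mathrm{in}}^{j}}{C_{\mathrm{liq}}^{j}} = \alpha_j + \tau_H\frac{d\ln C_{\mathrm{liq}}^{j}}{dt}$$

we get the dynamic equation

$$\frac{\tau_H}{\alpha_i + \tau_H\,d\ln C_{\mathrm{liq}}^{j}/dt}\,\frac{dR_{\mathrm{liq}}}{dt} = -R_{\mathrm{liq}} + \frac{\alpha_j + \tau_H\,d\ln C_{\mathrm{liq}}^{j}/dt}{\alpha_i + \tau_H\,d\ln C_{\mathrm{liq}}^{j}/dt}\,R_{\mathrm{in}} \qquad (7.2.23)$$

where the relaxation and forcing terms have been emphasized. For two isotopes with identical chemical properties $\alpha_i = \alpha_j$, whereas for a ratio of non-reactive elements $\alpha_i = \alpha_j = 1$.

DePaolo and Ingram (1985) found that the $^{87}$Sr/$^{86}$Sr ratio of the ocean has changed almost linearly from 0.7078 to 0.7092 over the last 35 million years. Holland (1978) estimates the oceanic residence time of Sr to be 4 million years. Find the relationship between the rate of change of seawater Sr concentration (presently 8 ppm) and the runoff $^{87}$Sr/$^{86}$Sr ratio.

Let $\alpha_{\mathrm{Sr}}$ be the common value of the reactivity coefficients for both isotopes. Replacing the subscripts 'liq' by 'SW', 'in' by 'runoff' and evaluating, we get the rate of change of $^{87}$Sr/$^{86}$Sr as

$$\frac{d(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{SW}}}{dt} = \frac{0.7092 - 0.7078}{35} = 3.889 \times 10^{-5}\ \mathrm{Ma^{-1}}$$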

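Rewriting equation (7.2.23) as

$$\frac{1}{C_{\mathrm{SW}}^{\mathrm{Sr}}}\frac{dC_{\mathrm{SW}}^{\mathrm{Sr}}}{dt} = -\frac{1}{\tau_H/\alpha_{\mathrm{Sr}}} + \frac{dR_{\mathrm{SW}}/dt}{R_{\mathrm{runoff}} - R_{\mathrm{SW}}}$$

and inserting the numerical values, we get the relative rate of change of Sr concentrations as

$$\frac{1}{C_{\mathrm{SW}}^{\mathrm{Sr}}}\frac{dC_{\mathrm{SW}}^{\mathrm{Sr}}}{dt} = -\frac{1}{4} - \frac{38.89 \times 10^{-6}}{0.7078 + 38.89 \times 10^{-6} \times (35 - t) - (^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{runoff}}}$$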
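The numerical relationship above is easy to tabulate. The following Python sketch is a hedged illustration: it assumes the relative rate of change of seawater Sr takes the form $-1/\tau_{\mathrm{Sr}} - s/(R_{\mathrm{SW}}(t) - R_{\mathrm{runoff}})$, with $R_{\mathrm{SW}}(t) = 0.7078 + s\,(35 - t)$, t in Ma before present, and $s = 38.89 \times 10^{-6}\ \mathrm{Ma^{-1}}$ as quoted in the text. The function and parameter names are hypothetical.

```python
def sr_relative_rate(t_ma_bp, ratio_runoff, tau_sr=4.0,
                     ratio_35=0.7078, slope=38.89e-6):
    """Relative rate of change of seawater Sr concentration (per Ma).

    t_ma_bp      : age in Ma before present (0 to 35)
    ratio_runoff : assumed 87Sr/86Sr ratio of the runoff
    tau_sr       : oceanic residence time of Sr in Ma
    """
    ratio_sw = ratio_35 + slope * (35.0 - t_ma_bp)  # linear seawater record
    return -1.0 / tau_sr - slope / (ratio_sw - ratio_runoff)

for tau in (4.0, 20.0):
    for t in (0.0, 15.0, 30.0):
        print(tau, t, sr_relative_rate(t, 0.711, tau_sr=tau))
```

With a runoff ratio of 0.711 and a residence time of 4 Ma, the present-day value is a depletion of somewhat more than 20 percent per million years; with 20 Ma it drops to a few percent.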

This relationship is drawn in Figure 7.5 for various runoff $^{87}$Sr/$^{86}$Sr ratios at t = 0, 15, and 30 Ma BP. Quite surprisingly, the Sr residence time of 4 Ma requires that seawater Sr concentration should change at a rate in excess of 20 percent per million years, which is extremely unrealistic. Curves were also drawn for $\tau_{\mathrm{Sr}} = 20$ Ma, which gives more acceptable but still quite rapid Sr depletion. This conclusion is insensitive to a particular choice of runoff $^{87}$Sr/$^{86}$Sr value in a reasonable range (0.710-0.712, see Albarede et al., 1981). This result indicates that the assumption of a constant $^{87}$Sr/$^{86}$Sr ratio in the runoff is inadequate.

Figure 7.5 Relative rate of change of Sr concentration in seawater calculated from equation (7.2.23) for the last 35 million years using the Sr isotope data for seawater of DePaolo and Ingram (1985). Calculations are made for t = 0, 15, 30 Ma. Residence time of Sr in the ocean is assumed to be 4 Ma (Holland, 1978, bottom) which gives an unrealistic rate of change for Sr concentration. An alternative residence time of 20 Ma (top) seems more adequate. Horizontal axis: $(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{runoff}}$.

An extreme assumption would be that $^{86}$Sr concentration in seawater stays approximately constant. In other words, $^{86}$Sr would be at steady-state but not $^{87}$Sr, which translates into

$$(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{runoff}} - (^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{SW}} = \frac{\tau_H}{\alpha_{\mathrm{Sr}}}\frac{d(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{SW}}}{dt}$$

or

$$(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{runoff}} = (^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{SW}} + \tau_{\mathrm{Sr}}\frac{d(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{SW}}}{dt}$$

Applications of isotopic box models may be far-reaching: Albarede et al. (1981) have investigated the balance of Sr isotopes in seawater between runoff and ridge crest hydrothermal activity. They deduced a range of estimated values for $^{87}$Sr/$^{86}$Sr in the global river system of 0.7097-0.7113, nearly consistent with the direct estimate of 0.711 by Palmer and Edmond (1989). Raymo et al. (1988) and Richter et al. (1992) suggested that enhancement of continental erosion by the uplift of the Himalayas and Tibetan Plateau explains the modern increase in seawater $^{87}$Sr/$^{86}$Sr ratio. The seawater $^{87}$Sr/$^{86}$Sr record may carry geodynamic information of global importance.

The $^{87}$Sr/$^{86}$Sr ratio of the lavas erupted by Vesuvius has been found by Cortini and Hermes (1981) to decrease linearly from 0.70793 in the 1754 eruption down to 0.70722 in the 1882 eruption. This change is not correlated with a systematic trend in Sr concentrations. Assuming that lavas are erupted from a perfectly mixed reservoir withholding a constant mass of magma at an effusion rate $Q_{\mathrm{out}} = 0.001\ \mathrm{km^3\,a^{-1}}$, estimate the size of this reservoir as a function of the $^{87}$Sr/$^{86}$Sr ratio in the input magma.

Neglecting the variations in Sr concentrations amounts to assuming $\alpha_{\mathrm{Sr}} = 1$. We get the rate of change of $^{87}$Sr/$^{86}$Sr as

$$\frac{dR_{\mathrm{liq}}}{dt} = \frac{d(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{liq}}}{dt} = \frac{0.70722 - 0.70793}{1882 - 1754} = -5.55 \times 10^{-6}\ \mathrm{a^{-1}}$$

The magma residence time can be estimated by recasting equation (7.2.23) as

$$\tau_H = \frac{\alpha_{\mathrm{Sr}}\left(R_{\mathrm{in}} - R_{\mathrm{liq}}\right)}{dR_{\mathrm{liq}}/dt + \left(R_{\mathrm{liq}} - R_{\mathrm{in}}\right)d\ln C_{\mathrm{liq}}^{\mathrm{Sr}}/dt}$$

Inserting the numerical values gives

$$\tau_H = -\frac{R_{\mathrm{liq}} - R_{\mathrm{in}}}{dR_{\mathrm{liq}}/dt} = -\frac{0.70793 - 5.55 \times 10^{-6}(t - 1754) - R_{\mathrm{in}}}{-5.55 \times 10^{-6}}$$

and from the definition of the residence time $\tau_H = V/Q_{\mathrm{out}}$

$$V = Q_{\mathrm{out}}\,\tau_H = 0.001 \times \frac{0.70793 - 5.55 \times 10^{-6}(t - 1754) - R_{\mathrm{in}}}{5.55 \times 10^{-6}}\ \mathrm{km^3}$$
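The reservoir size can be tabulated for a few dates and input ratios. The sketch below is a minimal Python illustration under the assumptions of the example ($\alpha_{\mathrm{Sr}} = 1$, constant Sr concentration, linear decrease of the lava ratio), using $\tau_H = (R_{\mathrm{in}} - R_{\mathrm{liq}})/(dR_{\mathrm{liq}}/dt)$ and $V = Q_{\mathrm{out}}\tau_H$. The function and parameter names are hypothetical.

```python
def reservoir_volume(year, ratio_in, q_out=0.001,
                     ratio_1754=0.70793, rate=-5.55e-6):
    """Volume (km^3) of a well-mixed magma reservoir.

    year     : calendar year of the eruption (1754 to 1882)
    ratio_in : assumed 87Sr/86Sr ratio of the input magma
    q_out    : effusion rate in km^3 per year
    rate     : observed rate of change of the lava ratio (per year)
    """
    ratio_liq = ratio_1754 + rate * (year - 1754)  # linear trend of the lavas
    tau_h = (ratio_in - ratio_liq) / rate          # residence time in years
    return q_out * tau_h

for year in (1760, 1820, 1880):
    print(year, [round(reservoir_volume(year, r), 3) for r in (0.704, 0.705, 0.706)])
```

For input ratios between 0.704 and 0.706 the volumes returned stay below 1 km^3.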

Figure 7.6 (axis residue only): horizontal axis $(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{in}}$, from 0.704 to 0.708.

Figure 9.16 Kinetic fractionation during crystal growth. Steady-state distribution of melt concentrations in the vicinity of a solid growing at the rate v for trace elements with different solid-liquid fractionation coefficients [equation (9.6.5), Tiller et al. (1953)]. The stippled area indicates the steady-state chemical boundary-layer with thickness S = @/v.

This relationship is drawn in Figure 7.6 as a function of the $^{87}$Sr/$^{86}$Sr ratio in the input magma for the dates t = 1760, 1820 and 1880. Provided the initial assumptions on the magmatic regime are valid, the magma chamber is smaller than 1 km$^3$.

Aplitic magma with constant composition is continuously injected into a dyke where it crystallizes as a quartz-feldspar assemblage while the residual liquid is expelled toward the surface. Sr in the dyke is found to be isotopically zoned. We assume that the $^{87}$Sr/$^{86}$Sr ratio of 0.710 measured in the earliest rock in the dyke represents the isotopic ratio of the injected magma. Trace-element partitioning suggests that the injected magma has a $^{87}$Rb/$^{86}$Sr ratio of 10 000 and that the ratio of reactivity coefficients $\alpha_{\mathrm{Rb}}/\alpha_{\mathrm{Sr}}$ was 0.05. We assume a flow-rate high enough for Rb and Sr concentrations to be at steady-state almost instantaneously. It is found that the most-evolved rocks in the dyke have a constant $^{87}$Sr/$^{86}$Sr ratio of 0.720. Estimate the residence time of Rb in the dyke.

Setting constant concentrations in equation (7.2.23), and replacing R by $^{87}$Sr/$^{86}$Sr gives

$$\frac{d(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{liq}}}{dt} = -\frac{(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{liq}} - (^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{in}}}{\tau_H/\alpha_{\mathrm{Sr}}} + \lambda_{^{87}\mathrm{Rb}}\,(^{87}\mathrm{Rb}/^{86}\mathrm{Sr})_{\mathrm{liq}}$$

where the additional term on the right-hand side accounts for radioactive decay at constant $^{86}$Sr. Since concentrations have reached steady-state and neglecting the effect of decay on $^{87}$Rb, combination with equation (7.2.13) gives

$$\frac{d(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{liq}}}{dt} = -\frac{(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{liq}} - (^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{in}}}{\tau_H/\alpha_{\mathrm{Sr}}} + \lambda_{^{87}\mathrm{Rb}}\,\frac{(^{87}\mathrm{Rb}/^{86}\mathrm{Sr})_{\mathrm{in}}}{\alpha_{\mathrm{Rb}}/\alpha_{\mathrm{Sr}}}$$

Since the last differentiates have a constant $^{87}$Sr/$^{86}$Sr ratio, we assume the last equation to be at steady-state and therefore the residence time $\tau_{\mathrm{Rb}} = \tau_H/\alpha_{\mathrm{Rb}}$ of Rb in the dyke is

$$\frac{\tau_H}{\alpha_{\mathrm{Rb}}} = \frac{1}{\lambda_{^{87}\mathrm{Rb}}}\,\frac{(^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{liq}} - (^{87}\mathrm{Sr}/^{86}\mathrm{Sr})_{\mathrm{in}}}{(^{87}\mathrm{Rb}/^{86}\mathrm{Sr})_{\mathrm{in}}} = \frac{1}{1.42 \times 10^{-11}} \times \frac{0.720 - 0.710}{10\,000} \approx 7 \times 10^{4}\ \mathrm{a}$$

Dynamic accumulation of radiogenic $^{87}$Sr in a differentiating magmatic system has been suggested by Vidal et al. (1979) to account for the initial $^{87}$Sr/$^{86}$Sr heterogeneities in concentric granitic intrusions from the Kerguelen islands.
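The Rb residence time of the dyke example reduces to a single expression. The Python sketch below is only an illustration: it assumes the steady-state relation $\tau_{\mathrm{Rb}} = (R_{\mathrm{liq}} - R_{\mathrm{in}})/[\lambda\,(^{87}\mathrm{Rb}/^{86}\mathrm{Sr})_{\mathrm{in}}]$ and takes the conventional $^{87}$Rb decay constant of $1.42 \times 10^{-11}\ \mathrm{a^{-1}}$, which is an assumption here since the exponent is not legible in the text. The function name is hypothetical.

```python
def rb_residence_time(ratio_liq, ratio_in, rb_sr_in, decay_constant=1.42e-11):
    """Residence time of Rb (years) from the steady-state 87Sr/86Sr excess.

    ratio_liq      : 87Sr/86Sr of the most evolved rocks
    ratio_in       : 87Sr/86Sr of the injected magma
    rb_sr_in       : 87Rb/86Sr of the injected magma
    decay_constant : assumed 87Rb decay constant (per year)
    """
    return (ratio_liq - ratio_in) / (decay_constant * rb_sr_in)

tau_rb = rb_residence_time(0.720, 0.710, 1.0e4)
print(tau_rb)  # of the order of 7e4 years
```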

1 zt

This relation shows that relative fluctuations of concentrations about steady-state values are more important for elements with short residence time, i.e., for reactive species. Some elements are more abundant simply because they are chemically inert with respect to the processes taking place in their host reservoir. This is notably the case of N 2 and O 2 in the atmosphere, Na, Cl, Mg in the ocean, Si, Mg, Fe, Ca in the mantle. Atmosphere, seawater, and mantle peridotite are perceived as chemically homogeneous because of the remarkably inert behavior of their major elements. Chemical fluctuations may happen that potentially break the basic assumption of a well-stirred system made earlier in this chapter. This may not be a problem if a stirring process exists, such as thermal convection in the mantle and the atmosphere, or thermo-haline convection in the ocean, that mixes the system down to a certain distance and levels off heterogeneities. The related concept of mixing time, which we just met with carbon in the ocean, is scale-dependent, i.e., sample size dependent. A sample can be scaled in different ways: a sampling bottle in the ocean, a hand specimen for igneous rocks or the height of the melting column for lavas. In solids, the critical size for homogeneity is necessarily larger than mineral grain size and smaller than the system size itself (Figure 7.7). For a given sample size, an element is homogeneously distributed in a system if a suitable dispersion parameter, such as the standard deviation of concentration, falls below a critical level. The time it takes for the size of heterogeneities to decay below the sampling size is the mixing time of the system. An appropriate scale for the mixing time is the reciprocal of the local velocity gradient (see Chapter 8). If the residence time is significantly longer than the mixing time, the system levels off changes faster than they are introduced from the surroundings. 
If the mixing time is longer than the residence time, stirring is slow relative to external perturbations and the system is heterogeneous. The more reactive an element, the more variable is its concentration. The relationship between dispersion and residence time is well-known in the lower atmosphere (troposphere), where concentrations of reactive gases, such as H2O and O3, vary much more than those of inert gases (Junge, 1974). A related observation was made by Hofmann (1988) for the variability of trace elements in mantle-derived rocks: the variability is higher for incompatible than for compatible elements. In the mantle, melts play the role of the scavenger, which is also the role played by particles in the ocean. Incompatible elements, such as Th and La, for which the liquid-solid partition coefficient is higher, vary more than compatible elements such as Mg and Ni.

Figure 7.7 The size of heterogeneities depends on the sample size. In this figure, dots represent lithospheric material dispersed in the mantle. Small samples are more heterogeneous than large samples.

7.2.6 Stability of single-variable systems

When a geochemical variable, e.g., the concentration of an element in a reservoir, is constant with time, the system is said to be at equilibrium, although a better practice in compliance with thermodynamics is to use the term steady-state. We now inquire about the stability of equilibrium, i.e., whether in a given state of equilibrium an arbitrarily small perturbation is going to decay and bring the system back to equilibrium or grow until the system achieves another state of equilibrium. The following derivation and example are largely inspired by Logan (1987). We can usually assume that, in a reservoir, the concentration C of an element is initially at equilibrium and its rate of change obeys a law of the form

$$\frac{dC}{dt} = F(C, \mu) \qquad (7.2.24)$$

where F is a known function of the concentration C and a parameter $\mu$. At t = 0, the system is at equilibrium, i.e., $C = C_0$ and F = 0. What happens if, for a given value of the parameter, concentration is perturbed by a small increment $\delta C$? Letting

$$C = C_0 + \delta C$$

and substituting into the differential equation, we get

$$\frac{d(\delta C)}{dt} = F(C_0 + \delta C, \mu)$$

Expanding F in a Taylor series to the first-order gives the approximation

$$\frac{d(\delta C)}{dt} = F(C_0 + \delta C, \mu) \approx F(C_0, \mu) + \frac{\partial F}{\partial C}(C_0, \mu)\,\delta C \qquad (7.2.25)$$

Because the initial condition of equilibrium requires that the first term on the right-hand side vanishes, this equation simply becomes

$$\frac{d(\delta C)}{dt} = \frac{\partial F}{\partial C}(C_0, \mu)\,\delta C$$

The solution is

$$\delta C(t) = \delta C(0)\exp\left[\frac{\partial F}{\partial C}(C_0, \mu)\,t\right]$$

which shows that for the perturbation to decay, the stability criterion is

$$\frac{\partial F}{\partial C}(C_0, \mu) < 0 \qquad (7.2.26)$$

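The reader is referred to textbooks on differential equations and applied mathematics for a more rigorous and general proof of the stability criterion (e.g., Logan, 1987).

The first-order non-isothermal (FONI) reactor. A continuous, well-stirred magmatic reservoir similar to those discussed above is supposed to be thermally insulated. A dissolved element i precipitates with a temperature-dependent rate of crystallization. Crystallization rate is assumed to obey first-order kinetics with Boltzmann temperature dependence such as

$$\frac{dC_{\mathrm{liq}}^{i}}{dt} = -k\,C_{\mathrm{liq}}^{i}\exp\left(-\frac{E}{\mathcal{R}\,T_{\mathrm{liq}}}\right)$$

where k is a constant, $\mathcal{R}$ the gas constant, and E the activation energy of crystallization. In a system with input and output of liquid

$$V\frac{dC_{\mathrm{liq}}^{i}}{dt} = Q\,C_{\mathrm{in}}^{i} - Q\,C_{\mathrm{liq}}^{i} - k\,V\,C_{\mathrm{liq}}^{i}\exp\left(-\frac{E}{\mathcal{R}\,T_{\mathrm{liq}}}\right)$$

The heat balance follows a similar relationship with the rate of latent heat release in proportion with the amount of element crystallized

$$V c_p\frac{dT_{\mathrm{liq}}}{dt} = Q\,c_p T_{\mathrm{in}} - Q\,c_p T_{\mathrm{liq}} + L\,k\,V\,C_{\mathrm{liq}}^{i}\exp\left(-\frac{E}{\mathcal{R}\,T_{\mathrm{liq}}}\right)$$

where $c_p$ is the heat capacity of the mixture and L the latent heat of crystallization per mass unit of element. Note the plus sign before the last term due to the heat released by crystallization. All parameters besides the descriptive variables $C_{\mathrm{liq}}^{i}$ and $T_{\mathrm{liq}}$ are assumed to be constant. Dividing the first equation by $V C_{\mathrm{in}}^{i}$ and rearranging, we get

$$\frac{d\left(C_{\mathrm{liq}}^{i}/C_{\mathrm{in}}^{i}\right)}{dt} = \frac{Q}{V}\left(1 - \frac{C_{\mathrm{liq}}^{i}}{C_{\mathrm{in}}^{i}}\right) - k\,\frac{C_{\mathrm{liq}}^{i}}{C_{\mathrm{in}}^{i}}\exp\left(-\frac{E/\mathcal{R}T_{\mathrm{in}}}{T_{\mathrm{liq}}/T_{\mathrm{in}}}\right)$$

Once divided by $V c_p T_{\mathrm{in}}$, the second equation becomes

$$\frac{d\left(T_{\mathrm{liq}}/T_{\mathrm{in}}\right)}{dt} = \frac{Q}{V}\left(1 - \frac{T_{\mathrm{liq}}}{T_{\mathrm{in}}}\right) + k\,\frac{L\,C_{\mathrm{in}}^{i}}{c_p T_{\mathrm{in}}}\,\frac{C_{\mathrm{liq}}^{i}}{C_{\mathrm{in}}^{i}}\exp\left(-\frac{E/\mathcal{R}T_{\mathrm{in}}}{T_{\mathrm{liq}}/T_{\mathrm{in}}}\right)$$

Introducing the reduced time t', concentration u and temperature v

$$t' = \frac{Q}{V}t, \qquad u = \frac{C_{\mathrm{liq}}^{i}}{C_{\mathrm{in}}^{i}}, \qquad v = \frac{T_{\mathrm{liq}}}{T_{\mathrm{in}}}$$

and the dimensionless parameters

$$\mu = \frac{Q}{kV}, \qquad \theta = \frac{L\,C_{\mathrm{in}}^{i}}{c_p T_{\mathrm{in}}}, \qquad \gamma = \frac{E}{\mathcal{R}\,T_{\mathrm{in}}}$$

where $\mu$ is equivalent to a reduced flow-rate, $\theta$ measures the strength of latent heat effect, and $\gamma$ is the temperature-dependence of the kinetic factor. The two equations can be rewritten

$$\frac{du}{dt'} = 1 - u - \frac{u}{\mu}\,e^{-\gamma/v}, \qquad \frac{dv}{dt'} = 1 - v + \theta\,\frac{u}{\mu}\,e^{-\gamma/v}$$

Multiplying the first equation by $\theta$ and adding the two equations gives

$$\frac{d(\theta u + v)}{dt'} = 1 + \theta - (\theta u + v)$$

which can be integrated as

$$\theta u + v = 1 + \theta + \left[\theta u_0 + v_0 - (1 + \theta)\right]e^{-t'}$$

where reduced concentration $u_0$ and temperature $v_0$ are the values at t = 0. Following Logan and for sake of illustration, we simply assume that $u_0 = v_0 = 1$, hence

$$v = 1 + \theta(1 - u)$$

We can now discuss the behavior of the reduced concentration u by writing

$$\frac{du}{dt'} = 1 - u - \frac{u}{\mu}\exp\left[-\frac{\gamma}{1 + \theta(1 - u)}\right] = F(u, \mu) \qquad (7.2.27)$$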

The right-hand side function $F(u, \mu)$ is highly non-linear and is contoured in the u, $\mu$ variable space (Figure 7.8) for various values of F. The contours have simply been drawn by ascribing values to the dimensionless parameters ($\theta = 3$ and $\gamma = 1$) and F (-0.06 to +0.06), then calculating $\mu$ from a range of u values. Equilibrium is achieved for F = 0. However, for a given range of the reduced flow-rate $\mu$, approximately from 0.41 to 0.63, a given value of this parameter can be matched with three values of the reduced dissolved concentration u. u is therefore a multiple-valued function of $\mu$, and this state is known as a multiple steady-state.

Figure 7.8 Stability analysis of concentration in a simplified model of adiabatic magma chamber with first-order precipitation kinetics. Contours are those of the function $F(u, \mu)$ (equation 7.2.27). Unstable (actually bi-stable) behavior (hysteresis) is observed around the branch C-C' where the derivative of the function $F(u, \mu)$ relative to the reduced concentration u is positive. Reducing the flux of magma produces a pathway A-B-C-D-E; increasing the flux produces A'-B'-C'-D'-E'. Horizontal axis: reduced flux $\mu$.

These multiple branches are not equivalent. From the contour lines in Figure 7.8, we can deduce which branch is stable, and which branch is not. The middle branch (C-C') lies in a range where, for a given value of $\mu$, F increases with u. The derivative of F with respect to u is therefore positive, which is just the criterion we found for an unstable equilibrium. Any fluctuation of u at constant $\mu$ will drive the system away from the branch C-C'. The opposite holds for the upper and lower branches A-C and A'-C' that lie in a range where F decreases when u increases. The derivative of F with respect to u is therefore negative and any concentration fluctuation around an equilibrium state along these branches dies out rapidly. The branches A-C and A'-C' are stable steady-states.

Let us now find out how the system works. Assume that it starts at a large reduced flow-rate (point A) and reduce the input slowly. Up to the point C, any deviation from the equilibrium curve will die out rapidly. At C, concentration fluctuations become unstable and the system evolves quite rapidly towards D ($\mu$ is fixed) where it finds a stable steady-state. The system has become unstable because reducing the flow-rate enhances crystallization, which through the kinetic factor enhances the rate of precipitation and thereby depletes the residual liquid. The system quenches. Upon reducing the flow-rate further, the stable evolution continues towards point E. If the process is reversed and the flow-rate increased, the system evolves smoothly from A' to C' on a stable branch. At C', the excess heat brought in by a large input of fluid is no longer balanced by the output and is suddenly converted into latent heat. The system 'thaws'. A large fraction of solid is rapidly dissolved up to D' where the system joins a stable branch.
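The multiple steady-states and their stability can be explored numerically. The following Python sketch is an illustration only, assuming the function of equation (7.2.27) in the form $F(u, \mu) = 1 - u - (u/\mu)\exp[-\gamma/(1 + \theta(1 - u))]$. It uses $\theta = 3$, while $\gamma = 7$ is an assumption made for this sketch and not a value taken from the text: with $\gamma = 1$ this form of F has a single zero for every $\mu$, whereas with $\gamma = 7$ three steady-states coexist for $\mu$ roughly between 0.04 and 0.06. Zeros are located by a sign-change scan followed by bisection, and stability is tested with the criterion (7.2.26) using a centered finite difference. All function names are hypothetical.

```python
import math

def F(u, mu, theta=3.0, gamma=7.0):
    """Right-hand side of the reduced equation (assumed form of 7.2.27)."""
    v = 1.0 + theta * (1.0 - u)  # reduced temperature for u0 = v0 = 1
    return 1.0 - u - (u / mu) * math.exp(-gamma / v)

def steady_states(mu, theta=3.0, gamma=7.0, n=2000):
    """Zeros of F(u, mu) for u in [0, 1]: sign-change scan, then bisection."""
    roots = []
    for k in range(n):
        lo, hi = k / n, (k + 1) / n
        f_lo, f_hi = F(lo, mu, theta, gamma), F(hi, mu, theta, gamma)
        if f_lo == 0.0:
            roots.append(lo)
            continue
        if f_lo * f_hi < 0.0:
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if F(lo, mu, theta, gamma) * F(mid, mu, theta, gamma) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    return roots

def is_stable(u, mu, theta=3.0, gamma=7.0, h=1e-6):
    """Stability criterion (7.2.26): dF/du < 0 at the steady-state."""
    dF = (F(u + h, mu, theta, gamma) - F(u - h, mu, theta, gamma)) / (2.0 * h)
    return dF < 0.0

for mu in (0.02, 0.05, 0.10):
    states = steady_states(mu)
    print(mu, [(round(u, 4), is_stable(u, mu)) for u in states])
```

For the intermediate flow-rate the sketch returns three steady-states, the middle one being unstable, which is the behavior described for the branch C-C'.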
The evolution of the system is therefore not reversible and is reminiscent of hysteresis effects in variable magnets. This dual behavior is one of the cases of what is known as a bifurcation. The system has access to competing steady-states and shifts from one state to another in a catastrophic move. The present case is fairly similar to the triggering of ignition in gases (Benson, 1982), in which combustion releases heat that enhances the rate of reaction. At the time of writing, the potential for irregularly fed magma bodies to have a liquid line of descent broken through bifurcation has not been explored. Instabilities due to non-linear interaction between processes of mass and heat conversion are well known in industrial chemical reactors and the interested reader could consult the book by Gray and Scott (1994). In Chapter 8, another case of bifurcation associated with metasomatic fronts is discussed in which the physical foundation of multiple steady-states is substantially different from the present example.

It must be realized that the basic reason for bifurcation is that the function F is multiple-valued and therefore non-linear. Other sources of non-linearity, like auto-catalysis, have been explored systematically and have proven to be the starting point of geochemical catastrophes (e.g., Ortoleva, 1994).

7.2.7 Random geochemical variables

When the geochemical variable is not uniquely determined but is a random variable, we would like to be able to assess how the parameters of the population change

7.2 Single-variable residence time analysis

365

through time. Let c be a geochemical variable (e.g., the concentration of an element) assumed to be a continuous random variable defined over a domain Q and /(c) its density of probability function (e.g., a normal density function)./(c) has the standard properties of a density function, i.e., f(c) ^ 0 everywhere and

I

/(c)dc=l

Given the autonomous evolution equation

$$\frac{\mathrm{d}c}{\mathrm{d}t} = F(c, t)$$

where F is a known function, we inquire about how $f(c)$ changes with time. The problem is a Lagrangian problem actually similar to a conservation problem (see Chapter 8) since probabilities are conservative. Making a simple comparison with frequency histograms, whatever is lost from a frequency bin must be found in other bins since frequencies sum up to unity. The derivative $\mathrm{d}c/\mathrm{d}t$, or identically $F(c, t)$, has the meaning of a velocity along the c-axis and $f(c)F(c)$ the meaning of a probability flux along that axis. Let $c_0$ and $c_0 + \mathrm{d}c$ be two points along the c-axis, where $\mathrm{d}c$ is arbitrarily small. $f(c_0)\,\mathrm{d}c$ represents the probability that c lies between $c_0$ and $c_0 + \mathrm{d}c$. The fraction of the population that enters or leaves this segment at $c_0$ per unit time is $f(c_0)F(c_0)$ while $f(c_0 + \mathrm{d}c)F(c_0 + \mathrm{d}c)$ is the fraction of the population that enters or leaves this segment at $c_0 + \mathrm{d}c$. We write that the rate of change of f in that segment equals the sum of fluxes at both ends

$$\frac{\mathrm{d}\left[f(c_0)\,\mathrm{d}c\right]}{\mathrm{d}t} = -\left[f(c_0 + \mathrm{d}c)F(c_0 + \mathrm{d}c) - f(c_0)F(c_0)\right]$$ (7.2.28)

where the minus sign accounts for flux decreasing the inside probability when counted away from the boundaries. Linearizing the first term in the right-hand side through a Taylor series

$$f(c_0 + \mathrm{d}c)F(c_0 + \mathrm{d}c) \approx f(c_0)F(c_0) + \left[\frac{\partial (fF)}{\partial c}\right]_{c_0} \mathrm{d}c$$

and, switching to an equality sign

$$\frac{\mathrm{d}\left[f(c_0)\,\mathrm{d}c\right]}{\mathrm{d}t} = -\left[\frac{\partial (fF)}{\partial c}\right]_{c_0} \mathrm{d}c$$

This equality holds true for any arbitrarily small segment $\mathrm{d}c$ and any $c_0$, so the following equality is identically true

$$\frac{\partial f}{\partial t} = -\frac{\partial (fF)}{\partial c} = -F\frac{\partial f}{\partial c} - f\frac{\partial F}{\partial c}$$ (7.2.29)
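Equation (7.2.29) can be checked numerically on a case with a known answer. The sketch below assumes radioactive decay, F(c) = -λc, and an initial normal density; both the decay constant and the parameters of the density are hypothetical values chosen for illustration. Under pure decay every value c at time t started at c·exp(λt), so the exact density is f(c, t) = exp(λt)·f0(c·exp(λt)), and the residual ∂f/∂t + ∂(fF)/∂c estimated by central differences should vanish.

```python
import math

LAMBDA = 0.5  # hypothetical decay constant (arbitrary time units)


def velocity(c):
    """Velocity along the c-axis for radioactive decay: dc/dt = -lambda * c."""
    return -LAMBDA * c


def initial_density(c, mean=10.0, sd=1.0):
    """Assumed initial density f0(c): a normal density (illustrative choice)."""
    z = (c - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def density(c, t):
    """Exact density at time t: a value c at time t started at c * exp(lambda * t)."""
    growth = math.exp(LAMBDA * t)
    return growth * initial_density(c * growth)


def residual(c, t, h=1e-5):
    """Central-difference estimate of df/dt + d(fF)/dc, expected to be close to zero."""
    df_dt = (density(c, t + h) - density(c, t - h)) / (2.0 * h)
    flux_plus = density(c + h, t) * velocity(c + h)
    flux_minus = density(c - h, t) * velocity(c - h)
    return df_dt + (flux_plus - flux_minus) / (2.0 * h)


worst_residual = max(abs(residual(c, 1.0)) for c in (4.0, 5.0, 6.0, 7.0, 8.0))
print("largest residual of (7.2.29):", worst_residual)
```

The residual is limited only by the finite-difference step, which suggests that the conservation form of (7.2.29) is the one to use when F is known in closed form.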


In a case where F would contain a stochastic term (e.g., Brownian motion, noise), this equation would lead to the celebrated Fokker-Planck equation with a diffusion (second-order) term. This equation is a partial differential equation whose order depends on the exact form of f and F. Its solution is usually not straightforward and integral transform methods (Laplace or Fourier) are necessary. The method of separation of variables rarely works. Nevertheless, useful information of practical geological importance is apparent in the form taken by this equation. The only density distributions that are time independent must obey

$$fF = f\,\frac{\mathrm{d}c}{\mathrm{d}t} = \text{const}$$

If the process under investigation is radioactivity, for instance, then

$$\frac{\mathrm{d}c}{\mathrm{d}t} = -\lambda c$$

where $\lambda$ is the decay constant and the only steady density function would be proportional to $c^{-1}$. Unfortunately, such a distribution is usually unbounded. Radioactive decay affects the density function of radioactive elements.

Two consequences of this simple analysis are far-reaching. First, the common perception that normal or log-normal functions may be used as catch-all probability density functions is physically untenable since these functions are not time-invariant relative to most geological processes (mixing, differentiation, ...). Second, there is more information on the physics of geological processes contained in the density function of concentrations, ratios, and other geochemical parameters than what is reflected by their mean or variance. Obviously, this information is deeply buried and convoluted, but deserves attention anyway.

7.2.8 Population dynamics

A related probabilistic approach to the evolution of heterogeneous systems consists in splitting the reservoirs into many units that have known geochemical characteristics and known rates of change and handling them collectively. We can relate this method to that of insurance companies which, in order to forecast their profitability and determine customers' contributions, divide the human population into classes defined by individual age, wealth, professional occupation, ... and assign each class, usually on the basis of surveys, a probability of accident, disease, or death. Models predicting the mass- and age-distribution of clastic sediments using a discretization of the geological and orogenic time-scale have been developed by Veizer and Jansen (1979) and applied to Nd crustal residence age by Allegre and Rousseau (1984). An extension of this model to the continuous time-scale was given by Michard et al. (1985) and will be discussed below. For the sake of illustration, we will calculate the Nd isotope composition of continents in a very simple model of crustal evolution.
A newly formed crustal segment results from the addition of both juvenile mantle material and material recycled from the preexisting crustal segments. $M(T, t)$ is the mass of crust formed prior to the time T and still surviving erosion at t. There is no continental crust at t = 0. The mean life of the crust relative to erosion is the constant $\tau$, which means that the probability of a piece of crust to be eroded per unit time is independent of its age and equal to $1/\tau$, therefore

$$\frac{\partial M(T, t)}{\partial t} = -\frac{M(T, t)}{\tau}$$

This can be integrated into

$$M(T, t) = \text{const} \times e^{-t/\tau}$$

Writing this equation for t = T results in

$$M(T, t) = M(T, T)\,e^{-(t-T)/\tau}$$

Next, we assume that juvenile crust is extracted from the mantle at a constant rate g. Therefore

$$M(T, T) = gT$$ (7.2.31)

or

$$M(T, t) = gT\,e^{-(t-T)/\tau}$$ (7.2.32)

The amount $\mathrm{d}M(T, t)$ of crust formed between T and T + dT still surviving at t is

$$\mathrm{d}M(T, t) = \frac{\partial M(T, t)}{\partial T}\,\mathrm{d}T = g\left(1 + \frac{T}{\tau}\right)e^{-(t-T)/\tau}\,\mathrm{d}T$$ (7.2.33)

and since the amount of crust $M(t, t)$ existing at t is gt, the fraction $f(T, t)\,\mathrm{d}T$ of the preserved crust which formed between T and T + dT is

$$f(T, t)\,\mathrm{d}T = \frac{\mathrm{d}M(T, t)}{M(t, t)} = \frac{1}{t}\left(1 + \frac{T}{\tau}\right)e^{-(t-T)/\tau}\,\mathrm{d}T$$

The increment of crust newly formed at t is created at a rate g for the juvenile part, and, for the recycled part, as the negative of the erosion rate. Therefore

$$\left[\frac{\partial M(T, t)}{\partial T}\right]_{T=t} = g + \frac{1}{\tau}\int_0^t \mathrm{d}M(T, t) = g + \frac{M(t, t)}{\tau} = g\left(1 + \frac{t}{\tau}\right)$$ (7.2.34)

Note that the left-hand side has not been expressed as $\mathrm{d}M(t, t)/\mathrm{d}t$ as in Michard et al. (1985), which would incorrectly imply a crustal growth rate, but as a density of probability of the crustal ages for T in the vicinity of t. The integral in the second term represents the eroded components summed over all the class ages [T, T + dT] from T = 0 to T = t.
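As a quick consistency check on the age distribution of the surviving crust, the density f(T, t) = (1/t)(1 + T/τ)exp(-(t - T)/τ) should integrate to unity between T = 0 and T = t, since it is the T-derivative of M(T, t) divided by M(t, t) = gt. The sketch below does this with a plain trapezoidal rule; the values of t and τ are hypothetical and the helper names are not from the text.

```python
import math


def age_density(T, t, tau):
    """f(T, t): fraction of the crust surviving at t that formed per unit time around T."""
    return (1.0 + T / tau) * math.exp(-(t - T) / tau) / t


def trapezoid(func, a, b, n=20000):
    """Composite trapezoidal rule for the integral of func over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h


t_now, tau = 4.5, 2.0  # hypothetical values (Ga)

# Total probability: should be 1 whatever the erosion time constant.
area = trapezoid(lambda T: age_density(T, t_now, tau), 0.0, t_now)

# Fraction of the surviving crust formed during the second half of the history.
young = trapezoid(lambda T: age_density(T, t_now, tau), 0.5 * t_now, t_now)

print("integral of f(T, t):", area)
print("fraction formed after t/2:", young)
```

Shortening τ in this sketch shifts the distribution towards young formation times, which is the behavior expected when erosion and recycling are efficient.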


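We can now turn to isotopic ratios. The Nd isotope composition $^{143}\mathrm{Nd}/^{144}\mathrm{Nd}$ and the $^{147}\mathrm{Sm}/^{144}\mathrm{Nd}$ ratio are noted y and x, respectively, with the m and c subscripts denoting mantle and crust. Using this notation, the chronometric equation of the Sm-Nd closed system reads

$$y(t) = y(T) + x(t)\left[e^{\lambda(t-T)} - 1\right]$$

(note that t and T are times and not ages). For practical purposes, the Sm/Nd ratios may be assumed to be constant and $\lambda t \ll 1$, so the chronometric equation linearly expanded becomes

$$y(t) \approx y(T) + \lambda x\,(t-T)$$ (7.2.35)

The $^{143}\mathrm{Nd}/^{144}\mathrm{Nd}$ ratio at time t of a crust formed at T is

$$y_c(T, t) \approx y_c(T, T) + \lambda x_c\,(t - T)$$ (7.2.36)

$y_c(t, t)$ is the weighted average of the ratio in the juvenile fraction and that in all the recycled fractions contributed by all crust segments formed between 0 and t

$$y_c(t, t) = \frac{g\,y_m(t) + \dfrac{1}{\tau}\displaystyle\int_0^t y_c(T, t)\,\mathrm{d}M(T, t)}{g + \dfrac{1}{\tau}\displaystyle\int_0^t \mathrm{d}M(T, t)} = \frac{g\,y_m(t) + \dfrac{1}{\tau}\displaystyle\int_0^t y_c(T, t)\,\dfrac{\partial M(T, t)}{\partial T}\,\mathrm{d}T}{g + \dfrac{1}{\tau}\displaystyle\int_0^t \dfrac{\partial M(T, t)}{\partial T}\,\mathrm{d}T}$$ (7.2.37)

Multiplying both sides of the last equality by the denominator of the right-hand side and combining the two integrals yields

$$g\left[y_c(t, t) - y_m(t)\right] = \frac{1}{\tau}\int_0^t \left[y_c(T, t) - y_c(t, t)\right]\mathrm{d}M(T, t)$$

It is convenient to use the difference of isotopic compositions between the crust and mantle values at the time the crust forms as a work variable, therefore

$$g\left[y_c(t, t) - y_m(t)\right] = \frac{1}{\tau}\int_0^t \left\{\left[y_c(T, t) - y_m(t)\right] - \left[y_c(t, t) - y_m(t)\right]\right\}\mathrm{d}M(T, t)$$

which we split as

$$g\left[y_c(t, t) - y_m(t)\right] = \frac{1}{\tau}\int_0^t \left[y_c(T, t) - y_m(t)\right]\mathrm{d}M(T, t) - \frac{1}{\tau}\int_0^t \left[y_c(t, t) - y_m(t)\right]\mathrm{d}M(T, t)$$

or equivalently

$$g\left[y_c(t, t) - y_m(t)\right] = \frac{1}{\tau}\int_0^t \left[y_c(T, t) - y_m(t)\right]\mathrm{d}M(T, t) - \frac{1}{\tau}\left[y_c(t, t) - y_m(t)\right]\int_0^t \mathrm{d}M(T, t)$$

The last integral on the right-hand side is just $M(t, t)$, i.e., gt, hence

$$g\left(1 + \frac{t}{\tau}\right)\left[y_c(t, t) - y_m(t)\right] = \frac{1}{\tau}\int_0^t \left[y_c(T, t) - y_m(t)\right]\mathrm{d}M(T, t)$$

Since the y values under the integral sign are the values at t and not at the time T the crustal segment formed, a correction for decay over t - T, applied to both crust and mantle through equations (7.2.35) and (7.2.36), brings us back to the formation time

$$y_c(T, t) - y_m(t) = \left[y_c(T, T) - y_m(T)\right] + \lambda\,(x_c - x_m)(t - T)$$

Defining the new variable u(t) as

$$u(t) = \left(1 + \frac{t}{\tau}\right)e^{t/\tau}\left[y_c(t, t) - y_m(t)\right]$$ (7.2.38)

with u(0) = 0, since mantle and crust are isotopically indistinguishable at t = 0, and inserting $\mathrm{d}M(T, t)$ from equation (7.2.33), we can write

$$u(t) = \frac{1}{\tau}\int_0^t u(T)\,\mathrm{d}T + \frac{\lambda}{\tau}\,(x_c - x_m)\int_0^t e^{T/\tau}\left(1 + \frac{T}{\tau}\right)(t - T)\,\mathrm{d}T$$ (7.2.39)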

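The last integral on the right-hand side of equation (7.2.39) is a standard example of calculus textbooks but we will nevertheless evaluate it through integration by parts. Let J be this integral divided by $\tau$ and expand the product of the different factors

$$J = \frac{1}{\tau}\int_0^t e^{T/\tau}\left[t + (t - \tau)\frac{T}{\tau} - \tau\left(\frac{T}{\tau}\right)^2\right]\mathrm{d}T$$

This form suggests a change of variable $z = T/\tau$, giving

$$J = \int_0^{t/\tau} e^{z}\left[t + (t - \tau)z - \tau z^2\right]\mathrm{d}z$$

The easy route is to calculate this integral iteratively. Defining $I_n$ as

$$I_n = \int z^n e^z\,\mathrm{d}z$$

integration by parts gives

$$I_n = z^n e^z - n\int z^{n-1}e^z\,\mathrm{d}z = z^n e^z - nI_{n-1}$$

which produces the sequence

$$I_0 = e^z, \qquad I_1 = (z - 1)\,e^z, \qquad I_2 = (z^2 - 2z + 2)\,e^z$$

Let us expand the expression under the integral sign of J

$$tI_0 + (t - \tau)I_1 - \tau I_2 = \left[t + (t - \tau)(z - 1) - \tau(z^2 - 2z + 2)\right]e^z + \text{const}$$

We can get J by making the difference of this expression at $z = t/\tau$ and z = 0

$$J = (t - \tau)\,e^{t/\tau} - (-\tau)$$

which is rearranged as

$$J = (t - \tau)\,e^{t/\tau} + \tau = \tau\left[\left(\frac{t}{\tau} - 1\right)e^{t/\tau} + 1\right]$$

Inserting this expression of J into equation (7.2.39) gives

$$u(t) = \frac{1}{\tau}\int_0^t u(T)\,\mathrm{d}T + \lambda\tau\,(x_c - x_m)\left[\left(\frac{t}{\tau} - 1\right)e^{t/\tau} + 1\right]$$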

This integral equation can be transformed into an ordinary differential equation by taking its derivative relative to t and applying Leibniz's rule to the integral

$$u'(t) = \frac{u(t)}{\tau} + \lambda\tau\,(x_c - x_m)\,\frac{\mathrm{d}}{\mathrm{d}t}\left[\left(\frac{t}{\tau} - 1\right)e^{t/\tau} + 1\right]$$

Rearranging and noting that u(0) = 0, we get

$$u'(t) = \frac{u(t)}{\tau} + \frac{\lambda}{\tau}\,(x_c - x_m)\,t\,e^{t/\tau}$$ (7.2.40)

The solution to the homogeneous equation being $e^{t/\tau}$, we can write the solution u(t) to the complete equation as

$$u(t) = f(t)\,e^{t/\tau}$$ (7.2.41)

where f(t) is a function to be determined. Taking the derivative and comparing with equation (7.2.40), we get

$$f'(t)\,e^{t/\tau} + \frac{f(t)}{\tau}\,e^{t/\tau} = \frac{u(t)}{\tau} + \frac{\lambda}{\tau}\,(x_c - x_m)\,t\,e^{t/\tau}$$

or

$$f'(t) = \frac{\lambda}{\tau}\,(x_c - x_m)\,t$$

which has the solution

$$f(t) = \frac{\lambda}{\tau}\,(x_c - x_m)\,\frac{t^2}{2} + \text{const}$$

Inserting this expression into equation (7.2.41), it becomes

$$u(t) = f(t)\,e^{t/\tau} = \frac{\lambda}{\tau}\,(x_c - x_m)\,\frac{t^2}{2}\,e^{t/\tau} + \text{const} \times e^{t/\tau}$$

We can now write explicitly u(t) in terms of the geochemical variable $y_c(t, t)$ through equation (7.2.38). Again, the condition that there is no primordial crust at t = 0 requires that $y_c(t, t)$ and $y_m(t)$ are equal at that time and therefore the constant is zero. Rearranging, we get

$$\left(1 + \frac{t}{\tau}\right)\left[y_c(t, t) - y_m(t)\right] = \frac{\lambda}{\tau}\,(x_c - x_m)\,\frac{t^2}{2}$$

Finally, the expression derived by Michard et al. (1985) becomes

$$y_c(t, t) = y_m(t) + \lambda\,(x_c - x_m)\,\frac{t^2}{2(\tau + t)}$$ (7.2.42)
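The closed form (7.2.42) can be cross-checked against a direct numerical integration of the differential equation (7.2.40). The sketch below is one way of doing so with a fourth-order Runge-Kutta scheme; the erosion time constant and the Sm/Nd ratios of crust and mantle are hypothetical illustration values, and the function names are not from the text. Time is counted in Ga and the decay constant of 147Sm is taken as 6.54 × 10^-3 Ga^-1.

```python
import math

LAM = 6.54e-3        # decay constant of 147Sm in Ga^-1
TAU = 2.0            # hypothetical mean life of the crust against erosion (Ga)
XC, XM = 0.12, 0.21  # hypothetical 147Sm/144Nd of crust and mantle


def rhs(t, u):
    """Right-hand side of equation (7.2.40)."""
    return u / TAU + (LAM / TAU) * (XC - XM) * t * math.exp(t / TAU)


def integrate_u(t_end, steps=4000):
    """Fourth-order Runge-Kutta integration of (7.2.40) starting from u(0) = 0."""
    h = t_end / steps
    t, u = 0.0, 0.0
    for _ in range(steps):
        k1 = rhs(t, u)
        k2 = rhs(t + 0.5 * h, u + 0.5 * h * k1)
        k3 = rhs(t + 0.5 * h, u + 0.5 * h * k2)
        k4 = rhs(t + h, u + h * k3)
        u += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return u


def offset_numeric(t):
    """y_c(t, t) - y_m(t) recovered from u(t) through definition (7.2.38)."""
    return integrate_u(t) * math.exp(-t / TAU) / (1.0 + t / TAU)


def offset_closed_form(t):
    """y_c(t, t) - y_m(t) from equation (7.2.42)."""
    return LAM * (XC - XM) * t * t / (2.0 * (TAU + t))


for t in (1.0, 2.5, 4.5):
    print(t, offset_numeric(t), offset_closed_form(t))
```

With these values the crust-mantle offset is negative, as expected for a crust with a lower Sm/Nd ratio than the mantle.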

The crustal residence age $T_{DM}$ of sediments formed at time t, i.e., the mean age of their continental protolith, is defined as

$$T_{DM} = T_{strat} + \frac{1}{\lambda}\,\frac{y_c(t, t) - y_m(t)}{x_c - x_m} = T_{strat} + \frac{1}{2}\,\frac{t^2}{\tau + t}$$ (7.2.43)

where $T_{strat}$ is the deposition or stratigraphic age. If the characteristic time of erosion is short ($\tau \approx 0$), the crust is well-mixed and $T_{DM}$ is given by

$$T_{DM} \approx T_{strat} + \frac{T_0 - T_{strat}}{2}$$

where $T_0$ is the age of the oldest event of crust formation. If erosion is inefficient and slow ($\tau$ very large), there is no contribution from old to new crustal segments and $T_{DM} \approx T_{strat}$. The newly formed crust is said to be juvenile.

7.3 One element in several interacting reservoirs

Various geological problems deal with systems which, within a good approximation, can be considered as geochemically homogeneous over a certain time-scale. The mantle and the crust over periods of up to $\approx 10^6$ years, the ocean for most elements not involved in biological processes over periods of $\approx 10^3$ years are examples of quite homogeneous systems. These reservoirs can be thought of as 'boxes' with wholesale chemical properties, such as concentrations or 'total standing crop' of an element in the reservoir, and much can be learned about the geochemical evolution of several boxes that are allowed to have chemical exchanges through the rather simple formalism of the 'box model'. Exchange of Sm and Nd between mantle and crust, and of CO2 between the ocean and atmosphere, will be investigated as simple practical examples. The simultaneous handling of multiple reservoirs by systems of equations was initiated by Southam and Hay (1976) and extensively developed by Lasaga (1980, 1981).

7.3.1 A closed-system 3-box model with concentrations as the variables

The layout of such a model is shown schematically in Figure 7.9. Since we are going to deal with only one conservative species, no ambiguity will arise if we drop temporarily the superscript. Let $V_k$ be the volume of the kth reservoir, $Q_{k \to l}$ the flux of material from reservoir (= box) k to reservoir l. There exists one equation per reservoir that describes the conservation of the species

$$\frac{\mathrm{d}(V_1C_1)}{\mathrm{d}t} = -(Q_{1 \to 2} + Q_{1 \to 3})\,C_1 + Q_{2 \to 1}C_2 + Q_{3 \to 1}C_3$$

$$\frac{\mathrm{d}(V_2C_2)}{\mathrm{d}t} = Q_{1 \to 2}C_1 - (Q_{2 \to 1} + Q_{2 \to 3})\,C_2 + Q_{3 \to 2}C_3$$ (7.3.1)

$$\frac{\mathrm{d}(V_3C_3)}{\mathrm{d}t} = Q_{1 \to 3}C_1 + Q_{2 \to 3}C_2 - (Q_{3 \to 1} + Q_{3 \to 2})\,C_3$$

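These three equations are not independent, which can be checked by adding all three and verifying that the total amount of the species is constant

$$\frac{\mathrm{d}}{\mathrm{d}t}\left(V_1C_1 + V_2C_2 + V_3C_3\right) = 0$$ (7.3.2)

Figure 7.9 A three-reservoir model. $V_j$ represents the volume of the reservoir j, $C_j$ the concentration in this reservoir of the element investigated, $Q_{i \to j}$ the material flux from reservoir i to reservoir j.

Dividing each equation by the volume of the corresponding reservoir (assumed constant), we get

$$\frac{\mathrm{d}C_1}{\mathrm{d}t} = -\frac{Q_{1 \to 2} + Q_{1 \to 3}}{V_1}\,C_1 + \frac{Q_{2 \to 1}}{V_1}\,C_2 + \frac{Q_{3 \to 1}}{V_1}\,C_3$$

$$\frac{\mathrm{d}C_2}{\mathrm{d}t} = \frac{Q_{1 \to 2}}{V_2}\,C_1 - \frac{Q_{2 \to 1} + Q_{2 \to 3}}{V_2}\,C_2 + \frac{Q_{3 \to 2}}{V_2}\,C_3$$

$$\frac{\mathrm{d}C_3}{\mathrm{d}t} = \frac{Q_{1 \to 3}}{V_3}\,C_1 + \frac{Q_{2 \to 3}}{V_3}\,C_2 - \frac{Q_{3 \to 1} + Q_{3 \to 2}}{V_3}\,C_3$$

or, in matrix form

$$\frac{\mathrm{d}}{\mathrm{d}t}\begin{bmatrix} C_1 \\ C_2 \\ C_3 \end{bmatrix} = \begin{bmatrix} -\dfrac{Q_{1 \to 2} + Q_{1 \to 3}}{V_1} & \dfrac{Q_{2 \to 1}}{V_1} & \dfrac{Q_{3 \to 1}}{V_1} \\ \dfrac{Q_{1 \to 2}}{V_2} & -\dfrac{Q_{2 \to 1} + Q_{2 \to 3}}{V_2} & \dfrac{Q_{3 \to 2}}{V_2} \\ \dfrac{Q_{1 \to 3}}{V_3} & \dfrac{Q_{2 \to 3}}{V_3} & -\dfrac{Q_{3 \to 1} + Q_{3 \to 2}}{V_3} \end{bmatrix} \begin{bmatrix} C_1 \\ C_2 \\ C_3 \end{bmatrix}$$ (7.3.3)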

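The matrix is singular since $V_1$ multiplied by the first row + $V_2$ multiplied by the second row + $V_3$ multiplied by the third row sums up to zero. An entirely equivalent formulation uses the absolute amounts $V_iC_i$ present in each reservoir instead of concentrations

$$\frac{\mathrm{d}(V_1C_1)}{\mathrm{d}t} = -\frac{Q_{1 \to 2} + Q_{1 \to 3}}{V_1}\,(V_1C_1) + \frac{Q_{2 \to 1}}{V_2}\,(V_2C_2) + \frac{Q_{3 \to 1}}{V_3}\,(V_3C_3)$$

and similarly for the other two reservoirs, with the resulting matrix equality

$$\frac{\mathrm{d}}{\mathrm{d}t}\begin{bmatrix} V_1C_1 \\ V_2C_2 \\ V_3C_3 \end{bmatrix} = \begin{bmatrix} -\dfrac{Q_{1 \to 2} + Q_{1 \to 3}}{V_1} & \dfrac{Q_{2 \to 1}}{V_2} & \dfrac{Q_{3 \to 1}}{V_3} \\ \dfrac{Q_{1 \to 2}}{V_1} & -\dfrac{Q_{2 \to 1} + Q_{2 \to 3}}{V_2} & \dfrac{Q_{3 \to 2}}{V_3} \\ \dfrac{Q_{1 \to 3}}{V_1} & \dfrac{Q_{2 \to 3}}{V_2} & -\dfrac{Q_{3 \to 1} + Q_{3 \to 2}}{V_3} \end{bmatrix} \begin{bmatrix} V_1C_1 \\ V_2C_2 \\ V_3C_3 \end{bmatrix}$$ (7.3.4)

Again, the matrix is singular since the rows sum up to zero.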
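The closed three-box system lends itself to a small numerical experiment. The sketch below builds the matrix of equation (7.3.3) from a table of material fluxes and volumes, checks the singularity argument (the volume-weighted rows add up to zero), and integrates the concentrations with an explicit Euler scheme. All fluxes, volumes and initial concentrations are hypothetical; the fluxes are chosen so that each box receives as much material as it loses, which keeps the volumes constant as the division by V assumes.

```python
def rate_matrix(Q, V):
    """Matrix of dC/dt = A C as in (7.3.3); Q[k][l] is the material flux from box k to box l."""
    n = len(V)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                A[i][i] -= Q[i][j] / V[i]  # losses from box i
                A[i][j] += Q[j][i] / V[i]  # gains of box i from box j
    return A


def euler(A, C, dt, steps):
    """Explicit Euler integration of dC/dt = A C."""
    n = len(C)
    for _ in range(steps):
        rates = [sum(A[i][j] * C[j] for j in range(n)) for i in range(n)]
        C = [C[i] + dt * rates[i] for i in range(n)]
    return C


def inventory(V, C):
    """Total amount of the species, V1*C1 + V2*C2 + V3*C3."""
    return sum(v * c for v, c in zip(V, C))


# Hypothetical volumes and balanced material fluxes (arbitrary units).
V = [10.0, 5.0, 2.0]
Q = [[0.0, 2.0, 1.0],
     [1.0, 0.0, 2.0],
     [2.0, 1.0, 0.0]]
C0 = [1.0, 0.0, 0.0]

A = rate_matrix(Q, V)
weighted_rows = [sum(V[i] * A[i][j] for i in range(3)) for j in range(3)]
C_end = euler(A, C0, dt=0.01, steps=20000)

print("volume-weighted sum of rows:", weighted_rows)
print("final concentrations:", C_end)
print("inventory before and after:", inventory(V, C0), inventory(V, C_end))
```

With balanced material fluxes the three concentrations relax to the same value, the total inventory divided by the total volume.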


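This case can be generalized from three to n geochemical reservoirs using

$$\frac{\mathrm{d}C_i}{\mathrm{d}t} = -\sum_{\substack{j=1 \\ j \neq i}}^{n} \frac{Q_{i \to j}}{V_i}\,C_i + \sum_{\substack{j=1 \\ j \neq i}}^{n} \frac{Q_{j \to i}}{V_i}\,C_j$$ (7.3.5)

where the first summation refers to outputs and the second summation to inputs. Equivalently

$$\frac{\mathrm{d}(V_iC_i)}{\mathrm{d}t} = -\sum_{\substack{j=1 \\ j \neq i}}^{n} \frac{Q_{i \to j}}{V_i}\,(V_iC_i) + \sum_{\substack{j=1 \\ j \neq i}}^{n} \frac{Q_{j \to i}}{V_j}\,(V_jC_j)$$ (7.3.6)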

7.3.2 The general box model: an empirical model

It is often difficult to define precisely the elemental flux from one system to another as the product of a mass flux of a carrier multiplied by a concentration in this carrier. For instance, the flux of carbon from the biosphere to the atmosphere is not adequately represented by a carrier flux since carbon dioxide escapes directly to the air. We therefore have to resort to a direct formulation in terms of total quantities (amounts, e.g., in tons, kilograms or moles) and fluxes. Denoting $M_i$ the total quantity of the species under consideration in the reservoir i and $J_{i \to j}$ the flux of the same species from reservoir i to reservoir j, we note that

$$\left[\frac{\mathrm{d}M_i}{\mathrm{d}t}\right]_{\text{out}} = -\sum_{j \neq i} J_{i \to j} \quad \text{and} \quad \left[\frac{\mathrm{d}M_i}{\mathrm{d}t}\right]_{\text{in}} = \sum_{j \neq i} J_{j \to i}$$ (7.3.7)

which we rewrite

$$\left[\frac{\mathrm{d}M_i}{\mathrm{d}t}\right]_{\text{out}} = -\sum_{j \neq i} \frac{J_{i \to j}}{M_i}\,M_i \quad \text{and} \quad \left[\frac{\mathrm{d}M_i}{\mathrm{d}t}\right]_{\text{in}} = \sum_{j \neq i} \frac{J_{j \to i}}{M_j}\,M_j$$

We note that the ratio

$$k_{i \to j} = \frac{J_{i \to j}}{M_i}$$

is the ratio of a flux to a mass. It has therefore the dimension of an inverse time (e.g., $\mathrm{a}^{-1}$) and it will be further assumed to be constant. This assumption amounts to considering that the time the element spends within a reservoir is controlled by parameters independent of both the standing crop $M_i$ and the various fluxes $J_{i \to j}$ and $J_{j \to i}$. Typically, time constants arise from hydrodynamic conditions or from entrainment by major carrier species other than water, air, .... Such a simple model works well for dilute solutions but is clearly wrong when, for instance, $M_i$ is buffered by solubility conditions or when the fluxes are controlled by non-linear, e.g., autocatalytic effects (Lasaga, 1980). Combining the fluxes gives the mass balance of the species under consideration in the ith reservoir

$$\frac{\mathrm{d}M_i}{\mathrm{d}t} = -\sum_{\substack{j=1 \\ j \neq i}}^{n} k_{i \to j}\,M_i + \sum_{\substack{j=1 \\ j \neq i}}^{n} k_{j \to i}\,M_j$$ (7.3.8)


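Defining the current element of the matrix A by the following expressions

$$a_{ii} = -\sum_{\substack{j=1 \\ j \neq i}}^{n} k_{i \to j} = -\frac{1}{\tau^{(i)}} \quad \text{and} \quad a_{ij} = k_{j \to i}$$

where $\tau^{(i)}$ is the residence time of the considered species in the ith reservoir, and lumping the amounts $M_i$ together into the vector x, the system can be recast into the standard form

$$\frac{\mathrm{d}x}{\mathrm{d}t} = Ax$$ (7.3.9)

When the matrix A is constant, the system of differential equations is linear. This system is solved with the procedure described in Section 2.5. The non-symmetric matrix A is first diagonalized

$$A = U\Lambda U^{-1} = \sum_{i=1}^{n} \lambda_i\,\mathfrak{A}_i$$

where $\mathfrak{A}_i$ is the matrix formed as

$$\mathfrak{A}_i = u_i v_i^{T}$$

with $u_i$ the ith column of U and $v_i^{T}$ the ith row of $U^{-1}$. If the matrix is time-invariant, or equivalently, if residence times are constant, the solution can be calculated as

$$x(t) = e^{At}x_0 = U e^{\Lambda t} U^{-1} x_0 = \sum_{i=1}^{n} e^{\lambda_i t}\,\mathfrak{A}_i\,x_0$$ (7.3.10)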

where $x_0$ is the vector of amounts at t = 0. The matrix exponential $e^{At}$ is known as the transition matrix of the system. Due to the way the matrix is built, this system has very simple properties.

(i) The rows of the matrix A are not linearly independent, since

$$\sum_{i=1}^{n} a_{ij} = 0 \qquad (j = 1, \ldots, n)$$

hence the matrix is singular and has one eigenvalue equal to zero. In other words, in an n-box model, only $n^2 - n$ flux coefficients can be fixed independently.

(ii) Given one zero eigenvalue for A, since the complex eigenvalues of a real matrix are conjugated, a two-reservoir system cannot have a complex eigenvalue. A minimum number of three reservoirs is required for periodic fluctuations. A small number of reservoirs cannot give oscillations of significant amplitude (the reader is urged to make a numerical experiment with random matrices).

(iii) The non-zero eigenvalues have non-positive real parts, a consequence of applying Gershgorin's circle theorem columnwise (see Section 2.4). Indeed, eigenvalues are within the circles centered on the diagonal terms $a_{ii}$ (always negative) and having a radius $r_i$ such that

$$r_i = \sum_{\substack{j=1 \\ j \neq i}}^{n} a_{ji}$$

The modulus notation is omitted for these terms are positive. But as we just saw

$$\sum_{\substack{j=1 \\ j \neq i}}^{n} a_{ji} = -a_{ii}$$

and therefore

$$\left|\lambda - a_{ii}\right| \leq -a_{ii}$$

Since each $a_{ii}$ is negative, this inequality holds true only if the real part of each $\lambda_j$ is negative or zero. As discussed in Section 2.5, the solution is therefore physically stable. When $t \to \infty$ all but the exponential term with the zero eigenvalue tend to zero, possibly after a few oscillations if some eigenvalues are complex. The amounts are relaxing towards the steady-state given by

$$x^{\infty} = \mathfrak{A}_0\,x_0$$ (7.3.11)

where $\mathfrak{A}_0$ is the matrix $\mathfrak{A}_i$ associated with the zero eigenvalue. If $x_0$ corresponds to the unperturbed set of elemental amounts in each reservoir at steady-state, i.e., if $x_0 = x^{\infty}$, then we can write

$$x^{\infty} = \mathfrak{A}_0\,x^{\infty}$$

which is an alternative way of viewing the steady-state values as associated with $\lambda_i = 0$.

The global phosphate system is described in Figure 7.10 (Lasaga, 1980). Table 7.1 gives the amounts held by each reservoir, and Table 7.2 the fluxes between reservoirs. Assuming steady-state, calculate the evolution of the world phosphate system if $10\,000 \times 10^9$ kg of phosphorus from fertilizer (mined from an isolated reservoir) were dumped on land in a short period of time.

An example will show how the $k_{i \to j}$ terms are evaluated

$$k_{5 \to 4} = \frac{J_{5 \to 4}}{M_5} = \frac{1040}{2710} = 0.384\ \mathrm{a}^{-1}$$

Proceeding similarly with the other terms gives the matrix A

$$A = \begin{bmatrix} -5.0 \times 10^{-9} & 9.15 \times 10^{-5} & 0 & 0 & 0 & 1.95 \times 10^{-5} \\ 5.0 \times 10^{-9} & -4.18 \times 10^{-4} & 0.0212 & 0 & 0 & 0 \\ 0 & 3.18 \times 10^{-4} & -0.0212 & 0 & 0 & 0 \\ 0 & 0 & 0 & -7.54 & 0.384 & 0 \\ 0 & 8.50 \times 10^{-6} & 0 & 7.23 & -0.390 & 6.66 \times 10^{-4} \\ 0 & 0 & 0 & 0.304 & 6.53 \times 10^{-3} & -6.85 \times 10^{-4} \end{bmatrix}$$
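A few properties of the phosphorus matrix can be verified directly from its entries. The sketch below re-enters the matrix A and the steady-state amounts of Table 7.1, then checks that each column sums to zero within rounding (closure), that the steady-state vector is close to the null-space of A, and that the negative reciprocals of the diagonal entries give the residence times. Two things are assumptions of this sketch rather than statements of the text: entry (3,2) is taken as 3.18 × 10^-4 so that the land column closes, and the tolerances simply reflect the three significant figures of the printed entries.

```python
# Matrix A of the six-reservoir phosphorus system (units: a^-1).
# Order: sediments, land, terrestrial biota, oceanic biota, surface ocean, deep ocean.
# Assumption: entry (3,2) = 3.18e-4, the value that makes the land column sum to zero.
A = [
    [-5.0e-9, 9.15e-5, 0.0, 0.0, 0.0, 1.95e-5],
    [5.0e-9, -4.18e-4, 0.0212, 0.0, 0.0, 0.0],
    [0.0, 3.18e-4, -0.0212, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, -7.54, 0.384, 0.0],
    [0.0, 8.50e-6, 0.0, 7.23, -0.390, 6.66e-4],
    [0.0, 0.0, 0.0, 0.304, 6.53e-3, -6.85e-4],
]

# Steady-state amounts (10^9 kg P), as listed in Table 7.1.
x_inf = [4.0e9, 2.0e5, 3000.0, 138.0, 2710.0, 8.71e4]
n = len(A)

# Closure: each column should sum to zero, here relative to its diagonal term.
column_closure = [abs(sum(A[i][j] for i in range(n))) / abs(A[j][j]) for j in range(n)]

# Steady state: each net rate (A x)_i, relative to the outflow of reservoir i.
net_rate = [sum(A[i][j] * x_inf[j] for j in range(n)) for i in range(n)]
steady_misfit = [abs(net_rate[i]) / (abs(A[i][i]) * x_inf[i]) for i in range(n)]

# Residence times as negative reciprocals of the diagonal entries (years).
residence_times = [-1.0 / A[i][i] for i in range(n)]

print("relative column sums:", column_closure)
print("relative steady-state misfit:", steady_misfit)
print("residence times (a):", residence_times)
```

The small non-zero residuals come from rounding of the printed coefficients, not from a lack of closure in the underlying model.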

Table 7.1. Amounts of phosphorus ($10^9$ kg P) stored in each reservoir at t = 0 and the initial perturbation.

i    Reservoir            Steady-state $x^{\infty}$    Perturbation $\delta x$
1    Sediments            4 × 10^9                     0
2    Land                 2 × 10^5                     1 × 10^4
3    Terrestrial biota    3000                         0
4    Oceanic biota        138                          0
5    Surface ocean        2710                         0
6    Deep ocean           8.71 × 10^4                  0

Table 7.2. Phosphorus fluxes between the six reservoirs ($10^9$ kg a$^{-1}$). Fluxes not given are assumed to be negligible.

$J_{6 \to 5} = 58$, $J_{5 \to 4} = 1040$, $J_{2 \to 5} = 1.7$

Figure 7.10 The long-term phosphate system (Lasaga, 1980). The six reservoirs are (1) sediments, (2) land, (3) land biota, (4) oceanic biota, (5) surface ocean, and (6) deep ocean. The arrows show the fluxes that are taken into account.

The negative reciprocal of phosphorus residence time in each reservoir is found on the diagonal entries of matrix A (Table 7.3). A is factored giving six eigenvalues and six characteristic times of the system as the negative reciprocal of the eigenvalues (Table 7.4). The six eigen- (column-) vectors form the matrix U given by

$$U = \begin{bmatrix} -6.71 \times 10^{-8} & -6.52 \times 10^{-4} & -2.99 \times 10^{-3} & -6.70 \times 10^{-1} & -7.18 \times 10^{-1} & 0.999999 \\ 0.0 & 0.0 & 7.08 \times 10^{-1} & 7.38 \times 10^{-1} & -4.44 \times 10^{-5} & 5.00 \times 10^{-5} \\ 0.0 & 0.0 & -7.05 \times 10^{-1} & 1.11 \times 10^{-2} & -6.67 \times 10^{-7} & 7.50 \times 10^{-7} \\ -7.20 \times 10^{-1} & -3.52 \times 10^{-2} & 1.56 \times 10^{-3} & -1.04 \times 10^{-4} & 1.07 \times 10^{-3} & 3.45 \times 10^{-8} \\ 6.93 \times 10^{-1} & -6.89 \times 10^{-1} & 3.04 \times 10^{-2} & -2.04 \times 10^{-3} & 2.11 \times 10^{-2} & 6.78 \times 10^{-7} \\ 2.72 \times 10^{-2} & 7.24 \times 10^{-1} & -3.23 \times 10^{-2} & -7.67 \times 10^{-2} & 6.96 \times 10^{-1} & 2.18 \times 10^{-5} \end{bmatrix}$$

Figure 7.11 The long-term phosphate system. Evolution of the amount of phosphorus (in units of $10^9$ kg) held by the six systems described in Figure 7.10. Horizontal axis: time (a), logarithmic scale.

Table 7.3. Residence time of phosphorus in each reservoir.

Reservoir            $\tau_P$ (a)
Sediments            200 × 10^6
Land                 2395
Terrestrial biota    47.2
Oceanic biota        0.133
Surface ocean        2.56
Deep ocean           1459

Table 7.4. Eigenvalues and characteristic times of the six-reservoir system for phosphorus.

i    $\lambda_i$ (a$^{-1}$)    $-1/\lambda_i$ (a)
1    -7.91                     0.126
2    -0.0217                   46.2
3    -0.0215                   46.5
4    -9.85 × 10^-5             10150
5    -1.89 × 10^-5             52900
6    4.50 × 10^-19             ∞

The results are shown graphically in Figure 7.11. As shown by Lasaga (1980), the global relaxation time of the system, which is its longest finite time constant (52 900 a), is shorter than the longest residence time in the system (P in sediments with 200 Ma). It is left to the reader to show that these results can be predicted from Gershgorin's theorem (Section 2.4) applied linewise to matrix A. Matrix A is ill-conditioned (i.e., numerically singular) because its five non-zero eigenvalues vary by more than five orders of magnitude: the condition number $|\lambda_1/\lambda_5|$ is $4.2 \times 10^5$. Eigenvectors associated with nearby eigenvalues are 'wobbly' (Golub and van Loan, 1983). The system 'hesitates' numerically between the truly zero sixth eigenvalue and the second smallest eigenvalue that corresponds to the longest time-constant (52 900 a). In mathematical terms, the null-space of the matrix A has a dimension of two. The long-term behavior of a system is poorly known whenever reservoirs have widely different residence times. In the present case, the sediment reservoir contains most of the total phosphate. P, therefore, spends most of its time in this reservoir and the system reacts as if one extra eigenvalue was approaching zero. As a result, large machine- or software-dependent errors may arise on calculating the eigenvector coefficients associated with the smallest eigenvalues. We now calculate $\mathfrak{A}_0$ as

$$\mathfrak{A}_0 = \begin{bmatrix} 9.9993 \times 10^{-1} & 9.9993 \times 10^{-1} & 9.9993 \times 10^{-1} & 9.9993 \times 10^{-1} & 9.9993 \times 10^{-1} & 9.9993 \times 10^{-1} \\ 4.9996 \times 10^{-5} & 4.9996 \times 10^{-5} & 4.9996 \times 10^{-5} & 4.9996 \times 10^{-5} & 4.9996 \times 10^{-5} & 4.9996 \times 10^{-5} \\ 7.4995 \times 10^{-7} & 7.4995 \times 10^{-7} & 7.4995 \times 10^{-7} & 7.4995 \times 10^{-7} & 7.4995 \times 10^{-7} & 7.4995 \times 10^{-7} \\ 3.4497 \times 10^{-8} & 3.4497 \times 10^{-8} & 3.4497 \times 10^{-8} & 3.4497 \times 10^{-8} & 3.4497 \times 10^{-8} & 3.4497 \times 10^{-8} \\ 6.7745 \times 10^{-7} & 6.7745 \times 10^{-7} & 6.7745 \times 10^{-7} & 6.7745 \times 10^{-7} & 6.7745 \times 10^{-7} & 6.7745 \times 10^{-7} \\ 2.1773 \times 10^{-5} & 2.1773 \times 10^{-5} & 2.1773 \times 10^{-5} & 2.1773 \times 10^{-5} & 2.1773 \times 10^{-5} & 2.1773 \times 10^{-5} \end{bmatrix}$$

It will be checked that

$$\mathfrak{A}_0\,x^{\infty} = x^{\infty}$$

within five decimal places. Dumping fertilizer on land perturbs the initial state of the system, so that we write

$$x_0 = x^{\infty} + \delta x$$

and the new steady-state will be given by

$$\mathfrak{A}_0\,x_0 = \mathfrak{A}_0\left(x^{\infty} + \delta x\right) = x^{\infty} + \mathfrak{A}_0\,\delta x$$

Sundquist (1985) provides an interesting application of these methods to the geological carbon cycle.

7.3.3 The general box model with forcing terms

One way of circumventing the difficulties encountered for systems with widely different time constants is to split the reservoirs into two categories. The first category will comprise the reservoirs with short residence times which will be explicitly required to satisfy the constraints of mass conservation. Reservoirs with long residence times will make up the second category which we will treat as sources and sinks. Equation (7.3.8) will be transformed into

$$\frac{\mathrm{d}M_i}{\mathrm{d}t} = -\sum_{\substack{j=1 \\ j \neq i}}^{n} k_{i \to j}\,M_i + \sum_{\substack{j=1 \\ j \neq i}}^{n} k_{j \to i}\,M_j + \sum_{\substack{j=1 \\ j \neq i}}^{n} \left(J^{*}_{j \to i} - J^{*}_{i \to j}\right)$$ (7.3.12)

where the $J^{*}$ symbols refer to input-output terms which can be functions of time and of all the $M_i$. Defining b as the vector made of the n rightmost summation terms in equation (7.3.12), we rewrite equation (7.3.9) as

$$\frac{\mathrm{d}x}{\mathrm{d}t} = Ax + b$$ (7.3.13)

If b does not depend explicitly on the $M_i$, decoupling reservoirs introduces a forcing term without affecting the short-term relaxation behavior. The diagonal decomposition of A as $U\Lambda U^{-1}$ gives

$$\frac{\mathrm{d}x}{\mathrm{d}t} = U\Lambda U^{-1}x + b$$

or, pre-multiplying by $U^{-1}$

$$\frac{\mathrm{d}\,(U^{-1}x)}{\mathrm{d}t} = \Lambda\,U^{-1}x + U^{-1}b$$ (7.3.14)

Introducing the new variables $y = U^{-1}x$ and $h = U^{-1}b$, we get a system of n scalar equations

$$\frac{\mathrm{d}y_i}{\mathrm{d}t} = \lambda_i\,y_i + h_i$$ (7.3.15)

which is integrated as

$$y_i(t) = e^{\lambda_i t}\,y_i(0) + e^{\lambda_i t}\int_0^t e^{-\lambda_i u}\,h_i(u)\,\mathrm{d}u$$

In matrix form, these n equations read

$$y = e^{\Lambda t}\,y_0 + e^{\Lambda t}\int_0^t e^{-\Lambda u}\,h(u)\,\mathrm{d}u$$

where $y_0 = U^{-1}x_0$, or

$$x = U e^{\Lambda t} U^{-1} x_0 + U e^{\Lambda t}\int_0^t e^{-\Lambda u}\,U^{-1}\,b(u)\,\mathrm{d}u$$

which, using the matrix exponential notation of Section 2.5, becomes

$$x = e^{At}x_0 + \int_0^t e^{A(t-u)}\,b(u)\,\mathrm{d}u$$ (7.3.16)
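Each scalar equation (7.3.15) can be checked independently of the matrix machinery. The sketch below assumes an exponentially decaying forcing h(u) = h0·exp(-u/θ), for which the integral in the solution has a closed form, and compares it with a fourth-order Runge-Kutta integration of dy/dt = λy + h(t). The eigenvalue, initial value and forcing parameters are hypothetical, and the closed form requires λ ≠ -1/θ.

```python
import math


def forcing(u, h0, theta):
    """Assumed forcing term: h(u) = h0 * exp(-u / theta)."""
    return h0 * math.exp(-u / theta)


def closed_form(t, lam, y0, h0, theta):
    """y(t) = exp(lam t) [y0 + integral_0^t exp(-lam u) h(u) du] for the assumed forcing."""
    rate = -lam - 1.0 / theta  # exponent of exp(-lam u) * h(u); must be non-zero
    integral = h0 * (math.exp(rate * t) - 1.0) / rate
    return math.exp(lam * t) * (y0 + integral)


def runge_kutta(t_end, lam, y0, h0, theta, steps=4000):
    """Direct integration of dy/dt = lam * y + h(t), equation (7.3.15)."""
    def rhs(t, y):
        return lam * y + forcing(t, h0, theta)

    step = t_end / steps
    t, y = 0.0, y0
    for _ in range(steps):
        k1 = rhs(t, y)
        k2 = rhs(t + 0.5 * step, y + 0.5 * step * k1)
        k3 = rhs(t + 0.5 * step, y + 0.5 * step * k2)
        k4 = rhs(t + step, y + step * k3)
        y += step * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += step
    return y


# Hypothetical values: one relaxing mode forced by a decaying source.
params = dict(lam=-0.1, y0=1.0, h0=0.6, theta=5.0)
y_exact = closed_form(20.0, **params)
y_numeric = runge_kutta(20.0, **params)
print(y_exact, y_numeric)
```

For a mode with a zero eigenvalue the same formula reduces to the accumulated input, y0 + h0·θ·(1 - exp(-t/θ)).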

Two oceanic basins noted A and B contain initially a mass of sodium $M_A^{\circ}$ and $M_B^{\circ}$. Sodium residence time in each basin is $\tau_A$ and $\tau_B$. Sodium exchange takes place between A and B only (Figure 7.12) and steady-state is assumed. From t = 0, an important mass $M_0$ of evaporite is subject to weathering and brings sodium to basin A with a flux $J_0(t)$ described by the equation

$$J_0(t) = J_0(0)\,e^{-t/\theta}$$ (7.3.17)

where $J_0(0)$ and $\theta$ are constants. Calculate the new steady-state and the transient evolution of Na in the two reservoirs. Assume the following values: $\tau_A = 12$ Ma, $\tau_B = 60$ Ma, $\theta = 5$ Ma, $M_A^{\circ} = 2$, $M_B^{\circ} = 10$, $M_0 = 3$, all the masses being in arbitrary units.

Figure 7.12 Exchange of Na in an open two-reservoir system: the flux $J_0(t)$ of Na weathered from evaporites introduces a forcing term into conservation equations.

Let us first calculate steady-state amounts when all Na weathered from evaporites has been transported to the sea and all the parameters become time-invariant. At steady-state, fluxes between reservoirs must be equal

$$J_{B \to A}^{\infty} = J_{A \to B}^{\infty}$$

which, combined with the conservation condition

$$\Sigma(\mathrm{Na}) = M_A^{\circ} + M_B^{\circ} + M_0 = M_A^{\infty} + M_B^{\infty}$$

gives the successive equalities

$$J_{B \to A}^{\infty} = J_{A \to B}^{\infty} = \frac{M_A^{\infty}}{\tau_A} = \frac{M_B^{\infty}}{\tau_B} = \frac{M_A^{\infty} + M_B^{\infty}}{\tau_A + \tau_B}$$

with the symbol $\infty$ labelling steady-state values. Therefore

$$M_A^{\infty} = \frac{\tau_A}{\tau_A + \tau_B}\left(M_A^{\circ} + M_B^{\circ} + M_0\right) \quad \text{and} \quad M_B^{\infty} = \frac{\tau_B}{\tau_A + \tau_B}\left(M_A^{\circ} + M_B^{\circ} + M_0\right)$$ (7.3.18)

Inserting numerical values gives $M_A^{\infty} = 2.5$ and $M_B^{\infty} = 12.5$ in the same units as the other masses. The total flux of sodium released by evaporites from t = 0 to $t = \infty$ must be equal to $M_0$ and therefore

$$M_0 = \int_0^{\infty} J_0(t)\,\mathrm{d}t = J_0(0)\int_0^{\infty} e^{-t/\theta}\,\mathrm{d}t = J_0(0)\,\theta$$

from which we can calculate $J_0(0)$. In matrix form, mass balance reads

$$\frac{\mathrm{d}}{\mathrm{d}t}\begin{bmatrix} M_A \\ M_B \end{bmatrix} = \begin{bmatrix} -\dfrac{1}{\tau_A} & \dfrac{1}{\tau_B} \\ \dfrac{1}{\tau_A} & -\dfrac{1}{\tau_B} \end{bmatrix}\begin{bmatrix} M_A \\ M_B \end{bmatrix} + \begin{bmatrix} J_0(t) \\ 0 \end{bmatrix}$$ (7.3.19)

or, in the usual form

$$\frac{\mathrm{d}x}{\mathrm{d}t} = Ax + b$$

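The matrix A can be diagonalized as $U\Lambda U^{-1}$. The eigenvalues $\lambda_1$ and $\lambda_2$ are the roots of the equation

$$\left(\frac{1}{\tau_A} + \lambda\right)\left(\frac{1}{\tau_B} + \lambda\right) - \frac{1}{\tau_A\tau_B} = 0$$

i.e.,

$$\lambda_1 = -\left(\frac{1}{\tau_A} + \frac{1}{\tau_B}\right) = -\frac{\tau_A + \tau_B}{\tau_A\tau_B}$$

and, as expected from the closure condition, $\lambda_2 = 0$. The relaxation time $\tau$ of the whole system satisfies

$$\frac{1}{\tau} = \frac{1}{\tau_A} + \frac{1}{\tau_B}$$ (7.3.20)

or $\tau = 10$ Ma. Using the method outlined in Section 2.5, we obtain the eigenvector matrix U and its inverse $U^{-1}$, written here up to an arbitrary scaling of each eigenvector, as

$$U = \begin{bmatrix} 1 & \tau_A \\ -1 & \tau_B \end{bmatrix}$$

and

$$U^{-1} = \frac{1}{\tau_A + \tau_B}\begin{bmatrix} \tau_B & -\tau_A \\ 1 & 1 \end{bmatrix}$$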


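The matrix exponential $e^{At}$ is calculated using the standard rules of matrix multiplication

$$e^{At} = U e^{\Lambda t} U^{-1} = \frac{1}{\tau_A + \tau_B}\begin{bmatrix} 1 & \tau_A \\ -1 & \tau_B \end{bmatrix}\begin{bmatrix} e^{-t/\tau} & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} \tau_B & -\tau_A \\ 1 & 1 \end{bmatrix}$$

which is evaluated as

$$e^{At} = \frac{1}{\tau_A + \tau_B}\begin{bmatrix} \tau_A + \tau_B\,e^{-t/\tau} & \tau_A\left(1 - e^{-t/\tau}\right) \\ \tau_B\left(1 - e^{-t/\tau}\right) & \tau_B + \tau_A\,e^{-t/\tau} \end{bmatrix}$$ (7.3.21)

Preparing for integration of the second term on the right-hand side of equation (7.3.16), we first compute the product $e^{A(t-u)}b(u)$ with

$$b(u) = \frac{M_0}{\theta}\,e^{-u/\theta}\begin{bmatrix} 1 \\ 0 \end{bmatrix}$$

as

$$e^{A(t-u)}\,b(u) = \frac{M_0}{\theta\,(\tau_A + \tau_B)}\,e^{-u/\theta}\begin{bmatrix} \tau_A + \tau_B\,e^{-(t-u)/\tau} \\ \tau_B\left(1 - e^{-(t-u)/\tau}\right) \end{bmatrix}$$

The indefinite integral of $e^{A(t-u)}b(u)$ with respect to u is

$$\int e^{A(t-u)}\,b(u)\,\mathrm{d}u = \frac{M_0}{\tau_A + \tau_B}\begin{bmatrix} -\tau_A\,e^{-u/\theta} + \dfrac{\tau_B\,\tau}{\theta - \tau}\,e^{-u/\theta}\,e^{-(t-u)/\tau} \\ -\tau_B\,e^{-u/\theta} - \dfrac{\tau_B\,\tau}{\theta - \tau}\,e^{-u/\theta}\,e^{-(t-u)/\tau} \end{bmatrix} + \text{const}$$

which, evaluated between 0 and t, gives

$$\int_0^t e^{A(t-u)}\,b(u)\,\mathrm{d}u = \frac{M_0}{\tau_A + \tau_B}\begin{bmatrix} \tau_A\left(1 - e^{-t/\theta}\right) + \dfrac{\tau_B\,\tau}{\theta - \tau}\left(e^{-t/\theta} - e^{-t/\tau}\right) \\ \tau_B\left(1 - e^{-t/\theta}\right) - \dfrac{\tau_B\,\tau}{\theta - \tau}\left(e^{-t/\theta} - e^{-t/\tau}\right) \end{bmatrix}$$ (7.3.22)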


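We can combine the last results for the transient mass of sodium in reservoir A, noting that the initial state is a steady-state of the unforced system, so that $e^{At}x_0 = x_0$

$$M_A(t) = M_A^{\circ} + \frac{M_0}{\tau_A + \tau_B}\left[\tau_A\left(1 - e^{-t/\theta}\right) + \frac{\tau_B\,\tau}{\theta - \tau}\left(e^{-t/\theta} - e^{-t/\tau}\right)\right]$$

and in reservoir B

$$M_B(t) = M_B^{\circ} + \frac{M_0\,\tau_B}{\tau_A + \tau_B}\left[\left(1 - e^{-t/\theta}\right) - \frac{\tau}{\theta - \tau}\left(e^{-t/\theta} - e^{-t/\tau}\right)\right]$$

In order to make the problem dimensionless, we divide the last two equations by $M_A^{\infty} + M_B^{\infty}$, and refer to the fraction of total sodium held at t by reservoirs A, B, and evaporites as $f_A(t)$, $f_B(t)$, and $f_0(t)$, respectively. The solution takes the simple form

$$f_A(t) = \frac{\tau_A}{\tau_A + \tau_B}\left[1 - f_0(0)\,e^{-t/\theta}\right] + f_0(0)\,\frac{\tau_B}{\tau_A + \tau_B}\,\frac{\tau}{\theta - \tau}\left(e^{-t/\theta} - e^{-t/\tau}\right)$$ (7.3.23)

and

$$f_B(t) = \frac{\tau_B}{\tau_A + \tau_B}\left[1 - f_0(0)\,e^{-t/\theta}\right] - f_0(0)\,\frac{\tau_B}{\tau_A + \tau_B}\,\frac{\tau}{\theta - \tau}\left(e^{-t/\theta} - e^{-t/\tau}\right)$$ (7.3.24)

and

$$f_0(t) = f_0(0)\,e^{-t/\theta}$$ (7.3.25)

This transient solution is depicted in Figure 7.13. For more complex dynamic problems, the use of Laplace or Fourier transforms would be particularly recommended (Wiberg, 1971; Anonymous, 1982).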
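The two-basin sodium problem is small enough to be checked end to end. The sketch below evaluates the fractions held by A, B and the evaporites from the closed-form transient, with the fractions written as in the conservation argument of the problem, and compares the result with a direct Runge-Kutta integration of the mass balance (7.3.19). The numerical values are those of the problem statement; the function names and the number of integration steps are arbitrary choices of this sketch.

```python
import math

TAU_A, TAU_B, THETA = 12.0, 60.0, 5.0   # Ma, from the problem statement
MA0, MB0, M0 = 2.0, 10.0, 3.0           # arbitrary mass units
TOTAL = MA0 + MB0 + M0
TAU = TAU_A * TAU_B / (TAU_A + TAU_B)   # relaxation time (7.3.20), 10 Ma
F0_START = M0 / TOTAL                   # fraction initially held by evaporites


def f_evaporite(t):
    """Fraction of total Na still held by evaporites at time t."""
    return F0_START * math.exp(-t / THETA)


def transient(t):
    """Term exchanged between A and B while the system relaxes."""
    weight = F0_START * (TAU_B / (TAU_A + TAU_B)) * (TAU / (THETA - TAU))
    return weight * (math.exp(-t / THETA) - math.exp(-t / TAU))


def f_a(t):
    return TAU_A / (TAU_A + TAU_B) * (1.0 - f_evaporite(t)) + transient(t)


def f_b(t):
    return TAU_B / (TAU_A + TAU_B) * (1.0 - f_evaporite(t)) - transient(t)


def integrate(t_end, steps=2000):
    """Runge-Kutta integration of dM_A/dt and dM_B/dt with the evaporite forcing."""
    def rhs(t, ma, mb):
        j0 = (M0 / THETA) * math.exp(-t / THETA)
        return (-ma / TAU_A + mb / TAU_B + j0, ma / TAU_A - mb / TAU_B)

    h = t_end / steps
    t, ma, mb = 0.0, MA0, MB0
    for _ in range(steps):
        a1, b1 = rhs(t, ma, mb)
        a2, b2 = rhs(t + 0.5 * h, ma + 0.5 * h * a1, mb + 0.5 * h * b1)
        a3, b3 = rhs(t + 0.5 * h, ma + 0.5 * h * a2, mb + 0.5 * h * b2)
        a4, b4 = rhs(t + h, ma + h * a3, mb + h * b3)
        ma += h * (a1 + 2.0 * a2 + 2.0 * a3 + a4) / 6.0
        mb += h * (b1 + 2.0 * b2 + 2.0 * b3 + b4) / 6.0
        t += h
    return ma, mb


ma_20, mb_20 = integrate(20.0)
print("closed form:", f_a(20.0) * TOTAL, f_b(20.0) * TOTAL)
print("numerical:  ", ma_20, mb_20)
print("steady state:", TOTAL * TAU_A / (TAU_A + TAU_B), TOTAL * TAU_B / (TAU_A + TAU_B))
```

Because the evaporite input is faster than the exchange between basins (θ shorter than τ), basin A temporarily holds more sodium than its final share before relaxing towards 2.5 units.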

7.4 Several elements in several interacting reservoirs

This is the most general and promising case, which finds application in all the major geochemical and geodynamic systems. Recent years have witnessed the outpouring of many interesting attempts at geochemical modeling using complicated systematics with application to the long-term evolution of the mantle-crust system and, as a response to environmental worries, the evolution of the ocean-atmosphere system. A crucial tool for a better understanding is the use of isotopic tracers, but the price is the handling of non-linear systematics. The same observation can be made for chemical systems in which reactions introduce non-linear relationships among the geochemical variables. One example of isotopic systems and one example of a chemical system will be selected with simple assumptions made in order to illustrate the potential virtues of this approach. Stability analysis of these non-linear systems is considered beyond the scope of this book and the reader may refer to specialized publications (e.g., Arnold, 1978; Procaccia and Ross, 1978, and the review of Crawford, 1991).

Figure 7.13 Evolution of the fractions $f_A(t)$ and $f_B(t)$ of total Na held by the two reservoirs A and B in the system described in Figure 7.12. $f_0(t)$ is the fraction remaining in evaporites at t. Horizontal axis: time t (Ma), from 0 to 20.

7.4.1 Multiple reservoir isotopic systems

We will now show how the evolution of chemical and isotopic ratios in a multiple-reservoir configuration brings in information about kinetics and chemical fractionation among the reservoirs. That sort of approach goes back to the early days of modern isotope geochemistry (e.g., Jacobsen and Wasserburg, 1979; O'Nions et al., 1979; Allegre et al., 1980; DePaolo, 1980; Allegre et al., 1983a, b) but its usefulness as a geodynamic constraint is often insufficiently perceived because of many sources of indeterminacy. This is a special case of coupling concentrations since a ratio involves two concentration or quantity variables as its numerator and denominator. We will first set up the equations for the number of moles of three species, one stable species N, one radioactive species M, and one radiogenic species N*. For the evolution of a stable isotope ratio, e.g., $^{18}\mathrm{O}/^{16}\mathrm{O}$, the equations are still valid but, of course, without the terms involving radioactive decay. For example, N = $^{144}$Nd, M = $^{147}$Sm and N* = $^{143}$Nd. We denote R the ratio M/N, in this example $^{147}\mathrm{Sm}/^{144}\mathrm{Nd}$, and R* the ratio N*/N, here $^{143}\mathrm{Nd}/^{144}\mathrm{Nd}$. For the stable species N, mass conservation reads

$$\frac{\mathrm{d}N_i}{\mathrm{d}t} = -\sum_{j \neq i} \frac{J^{N}_{i \to j}}{N_i}\,N_i + \sum_{j \neq i} \frac{J^{N}_{j \to i}}{N_j}\,N_j$$ (7.4.1)

Dividing by the total amount of species N in the whole system and introducing the mass fraction (allotment) $x_j^{N}$ of species N hosted by the jth reservoir, we get

$$\frac{\mathrm{d}x_i^{N}}{\mathrm{d}t} = -x_i^{N}\sum_{j \neq i} k^{N}_{i \to j} + \sum_{j \neq i} k^{N}_{j \to i}\,x_j^{N}$$

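For the radioactive species, the mass balance equation contains a radioactive decay term

$$\frac{\mathrm{d}M_i}{\mathrm{d}t} = -\sum_{j \neq i} \frac{J^{M}_{i \to j}}{M_i}\,M_i + \sum_{j \neq i} \frac{J^{M}_{j \to i}}{M_j}\,M_j - \lambda M_i$$ (7.4.2)

where $\lambda$ is the M decay constant, while for the radiogenic species, there is radioactive accumulation

$$\frac{\mathrm{d}N_i^{*}}{\mathrm{d}t} = -\sum_{j \neq i} \frac{J^{N^*}_{i \to j}}{N_i^{*}}\,N_i^{*} + \sum_{j \neq i} \frac{J^{N^*}_{j \to i}}{N_j^{*}}\,N_j^{*} + \lambda M_i$$

The ratio M/N in the ith reservoir changes as

$$\frac{\mathrm{d}R_i}{\mathrm{d}t} = \frac{\mathrm{d}(M_i/N_i)}{\mathrm{d}t} = \frac{1}{N_i}\left(\frac{\mathrm{d}M_i}{\mathrm{d}t} - \frac{M_i}{N_i}\,\frac{\mathrm{d}N_i}{\mathrm{d}t}\right)$$

Inserting the appropriate expressions, we get

$$\frac{\mathrm{d}R_i}{\mathrm{d}t} = \frac{1}{N_i}\left[-\sum_{j \neq i} J^{M}_{i \to j} + \sum_{j \neq i} J^{M}_{j \to i} - \lambda M_i\right] - \frac{M_i}{N_i^2}\left[-\sum_{j \neq i} J^{N}_{i \to j} + \sum_{j \neq i} J^{N}_{j \to i}\right]$$

which is equivalent to

$$\frac{\mathrm{d}R_i}{\mathrm{d}t} = -\sum_{j \neq i}\left(\frac{J^{M}_{i \to j}}{N_i} - \frac{M_i}{N_i}\,\frac{J^{N}_{i \to j}}{N_i}\right) + \sum_{j \neq i}\left(\frac{J^{M}_{j \to i}}{N_i} - \frac{M_i}{N_i}\,\frac{J^{N}_{j \to i}}{N_i}\right) - \lambda\,\frac{M_i}{N_i}$$

and to

$$\frac{\mathrm{d}R_i}{\mathrm{d}t} = -\sum_{j \neq i} \frac{J^{N}_{i \to j}}{N_i}\left(\frac{J^{M}_{i \to j}}{J^{N}_{i \to j}} - \frac{M_i}{N_i}\right) + \sum_{j \neq i} \frac{J^{N}_{j \to i}}{N_i}\left(\frac{J^{M}_{j \to i}}{J^{N}_{j \to i}} - \frac{M_i}{N_i}\right) - \lambda R_i$$

Since M and N are different chemical species, we make fractionation factors apparent in the form

$$\frac{\mathrm{d}R_i}{\mathrm{d}t} = -\sum_{j \neq i} \frac{J^{N}_{i \to j}}{N_i}\,\frac{M_i}{N_i}\left(\frac{J^{M}_{i \to j}/J^{N}_{i \to j}}{M_i/N_i} - 1\right) + \sum_{j \neq i} \frac{J^{N}_{j \to i}}{N_j}\,\frac{N_j}{N_i}\left(\frac{J^{M}_{j \to i}/J^{N}_{j \to i}}{M_j/N_j}\,\frac{M_j}{N_j} - \frac{M_i}{N_i}\right) - \lambda R_i$$

We call the expression

$$D^{M/N}_{i \to j} = \frac{J^{M}_{i \to j}/J^{N}_{i \to j}}{M_i/N_i}$$ (7.4.3)

the coefficient of relative fractionation between M and N upon extraction from reservoir i into reservoir j. It is truly the ratio of M/N in the material extracted towards the reservoir j from the parent reservoir i to M/N in the parent reservoir i.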


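Inserting this new variable, the equation now reads

$$\frac{\mathrm{d}R_i}{\mathrm{d}t} = -R_i\sum_{j \neq i} k^{N}_{i \to j}\left(D^{M/N}_{i \to j} - 1\right) + \sum_{j \neq i} k^{N}_{j \to i}\,\frac{x_j^{N}}{x_i^{N}}\left(D^{M/N}_{j \to i}\,R_j - R_i\right) - \lambda R_i$$ (7.4.4)

where we have introduced the kinetic factors k for N defined in Section 7.3.2. The equation for the ratio R* = N*/N is quite similar, apart from a positive accumulation term. Isotopes of the same element have identical chemical properties, so fractionation factors are unity

$$D^{N^*/N}_{i \to j} = 1$$

and the evolution of the radiogenic ratio is given by

$$\frac{\mathrm{d}R_i^{*}}{\mathrm{d}t} = \sum_{j \neq i} k^{N}_{j \to i}\,\frac{x_j^{N}}{x_i^{N}}\left(R_j^{*} - R_i^{*}\right) + \lambda R_i$$

Summarizing, each set of radioactive-radiogenic pair + stable isotope gives a non-linear set of three equations per reservoir

$$\frac{\mathrm{d}x_i^{N}}{\mathrm{d}t} = -x_i^{N}\sum_{j \neq i} k^{N}_{i \to j} + \sum_{j \neq i} k^{N}_{j \to i}\,x_j^{N}$$

$$\frac{\mathrm{d}R_i}{\mathrm{d}t} = -R_i\sum_{j \neq i} k^{N}_{i \to j}\left(D^{M/N}_{i \to j} - 1\right) + \sum_{j \neq i} k^{N}_{j \to i}\,\frac{x_j^{N}}{x_i^{N}}\left(D^{M/N}_{j \to i}\,R_j - R_i\right) - \lambda R_i$$ (7.4.5)

$$\frac{\mathrm{d}R_i^{*}}{\mathrm{d}t} = \sum_{j \neq i} k^{N}_{j \to i}\,\frac{x_j^{N}}{x_i^{N}}\left(R_j^{*} - R_i^{*}\right) + \lambda R_i$$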
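The forward problem for the set (7.4.5) is easy to explore numerically. The sketch below integrates the three equations per reservoir for a two-box mantle-crust system with a fourth-order Runge-Kutta scheme and checks three bulk properties that the equations should respect: the allotments sum to one, the bulk M/N ratio decays as exp(-λt), and the sum of bulk radioactive and radiogenic ratios is constant. Every kinetic factor, fractionation factor and initial value is hypothetical and serves only to exercise the equations; the reservoir labels "m" and "c" and the function names are also choices of this sketch.

```python
import math

LAM = 6.54e-3  # decay constant of 147Sm in Ga^-1
BOXES = ("m", "c")

# Hypothetical kinetic factors k (Ga^-1) and fractionation factors D, keyed (from, to).
K = {("m", "c"): 0.05, ("c", "m"): 0.30}
D = {("m", "c"): 0.60, ("c", "m"): 1.00}


def derivatives(state):
    """Right-hand sides of equations (7.4.5) for every reservoir."""
    rates = {}
    for i in BOXES:
        x_i, r_i, rs_i = state["x" + i], state["R" + i], state["Rs" + i]
        dx, dr, drs = 0.0, -LAM * r_i, LAM * r_i
        for j in BOXES:
            if j == i:
                continue
            x_j, r_j, rs_j = state["x" + j], state["R" + j], state["Rs" + j]
            dx += -K[i, j] * x_i + K[j, i] * x_j
            dr += -K[i, j] * (D[i, j] - 1.0) * r_i
            dr += K[j, i] * (x_j / x_i) * (D[j, i] * r_j - r_i)
            drs += K[j, i] * (x_j / x_i) * (rs_j - rs_i)
        rates["x" + i], rates["R" + i], rates["Rs" + i] = dx, dr, drs
    return rates


def rk4_step(state, h):
    """One fourth-order Runge-Kutta step on the dictionary of variables."""
    def shifted(slope, factor):
        return {name: state[name] + factor * slope[name] for name in state}

    k1 = derivatives(state)
    k2 = derivatives(shifted(k1, 0.5 * h))
    k3 = derivatives(shifted(k2, 0.5 * h))
    k4 = derivatives(shifted(k3, h))
    return {name: state[name] + h * (k1[name] + 2.0 * k2[name] + 2.0 * k3[name] + k4[name]) / 6.0
            for name in state}


def run(state, t_end, steps):
    h = t_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state


def bulk(state, key):
    """Bulk ratio of the whole system: allotment-weighted mean of the reservoir ratios."""
    return sum(state["x" + i] * state[key + i] for i in BOXES)


# Hypothetical initial state: a small proto-crust with the same ratios as the mantle.
start = {"xm": 0.99, "xc": 0.01, "Rm": 0.20, "Rc": 0.20, "Rsm": 0.507, "Rsc": 0.507}
end = run(start, 4.5, 9000)
print(end)
```

A crust extracted with D below one ends up with a lower M/N ratio than the mantle, and consequently with a less radiogenic R*.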

Solving the forward problem of the isotopic and chemical evolution of n reservoir exchanging a radioactive and its daughter isotope requires the solution of 3n—1 differential equations (the minus one stems from the closure condition). The parameters are n (n — 1) independent flux factors k for the stable isotope N and n (n — 1) independent M/N fractionation factors D. In addition, the n values of /^*, the n values of Ri9 and the n — 1 allotments xtN of the stable isotope among the reservoirs must be assumed at some time, preferably at the beginning of the evolution (e.g., 4.5 Ga ago), or in the modern times, in which case integration is carried out backwards in time. One kind of inverse problem providing useful kinetic information assumes a known evolution of AT, R, and R*, typically their present-day values and derivatives. Another kind of inverse problem with an alternative set of equations replaces the knowledge of the rates of change in equations (7.4.5) by the properties of the bulk system, commonly by assuming some sort of grand average of the whole system, as with the chondritic composition of the bulk silicate Earth. Two reservoirs provide five equations for [2 x 2(2—1)+1], i.e., five unknowns. The two-reservoir configuration can apparently be exactly solved for the kinetic and fractionation factors of each isotopic system, although we are going to find some undeterminacy for the fractionation factors. Three reservoirs provide eight equations per system for

Figure 7.14 The two-reservoir mantle-crust system (reservoirs: continental crust and mantle).

[2 × 3(3 - 1) + 1], i.e., 13 unknowns. The system cannot be solved for the kinetic and fractionation factors unless some assumptions can be made, presumably about some of the six fractionation factors. For a larger number of reservoirs, the problem turns out to be a non-linear set of underconstrained equations. It therefore becomes essential to make a large number of assumptions, including on the chemical fractionation factors.

Calculate the kinetic and fractionation factors for the $^{147}$Sm-$^{143}$Nd system in a simple modern Earth as a two-reservoir mantle-crust (m-c) system (Figure 7.14) as a function of Nd distribution in each reservoir. $R$ is the elemental ratio $^{147}$Sm/$^{144}$Nd and $R^*$ the isotopic ratio $^{143}$Nd/$^{144}$Nd. Assume the following values for the parameters (as used in the numerical application of this problem)

$$R_m = 0.25,\quad R_c = 0.12,\quad R_m^* = 0.5131,\quad R_c^* = 0.5118$$

We further assume that steady-state is achieved for Sm/Nd fractionation between the two reservoirs

$$\frac{dR_m}{dt} = \frac{dR_c}{dt} = 0$$

while examination of geochemical data on modern and ancient rocks suggests the following approximation

$$\frac{dR_m^*}{dt} = s_m\,\lambda_{^{147}\mathrm{Sm}},\qquad \frac{dR_c^*}{dt} = s_c\,\lambda_{^{147}\mathrm{Sm}}$$

with $s_m = 0.22$ and $s_c = 0.17$. The rates of change $s_m$ and $s_c$ of the $^{143}$Nd/$^{144}$Nd ratio are expressed in units of $\lambda_{^{147}\mathrm{Sm}}$ in order to emphasize the difference between the observed values for each reservoir and the value estimated from the curves of Nd isotope secular evolution. This was shown by DePaolo (1980) to be compelling evidence for continuing exchange between mantle and crust. For two reservoirs, the summation signs disappear from the descriptive equations (7.4.5). In addition, we can safely equate the kinetic and fractionation factors of $^{144}$Nd


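with those of elemental Nd. The six conservation equations (7.4.5) read

$$\begin{aligned}
\frac{dx_m^{Nd}}{dt} &= -k_{m\to c}^{Nd}\,x_m^{Nd} + k_{c\to m}^{Nd}\,x_c^{Nd} \\
\frac{dx_c^{Nd}}{dt} &= -k_{c\to m}^{Nd}\,x_c^{Nd} + k_{m\to c}^{Nd}\,x_m^{Nd} \\
\frac{dR_m}{dt} &= k_{m\to c}^{Nd}\left(1 - D_{m\to c}^{Sm/Nd}\right)R_m + k_{c\to m}^{Nd}\,\frac{x_c^{Nd}}{x_m^{Nd}}\left(D_{c\to m}^{Sm/Nd}R_c - R_m\right) - \lambda_{^{147}\mathrm{Sm}}R_m \\
\frac{dR_c}{dt} &= k_{c\to m}^{Nd}\left(1 - D_{c\to m}^{Sm/Nd}\right)R_c + k_{m\to c}^{Nd}\,\frac{x_m^{Nd}}{x_c^{Nd}}\left(D_{m\to c}^{Sm/Nd}R_m - R_c\right) - \lambda_{^{147}\mathrm{Sm}}R_c \\
\frac{dR_m^*}{dt} &= k_{c\to m}^{Nd}\,\frac{x_c^{Nd}}{x_m^{Nd}}\left(R_c^* - R_m^*\right) + \lambda_{^{147}\mathrm{Sm}}R_m \\
\frac{dR_c^*}{dt} &= k_{m\to c}^{Nd}\,\frac{x_m^{Nd}}{x_c^{Nd}}\left(R_m^* - R_c^*\right) + \lambda_{^{147}\mathrm{Sm}}R_c
\end{aligned}$$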

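From the last two equations expressing the rate of change of isotopic ratios, we extract a relationship between the kinetic (flux) constants $k$ and the fraction $x$ of Nd apportioned to each reservoir

$$k_{c\to m}^{Nd}\,\frac{x_c^{Nd}}{x_m^{Nd}} = \frac{s_m - R_m}{R_c^* - R_m^*}\,\lambda_{^{147}\mathrm{Sm}},\qquad k_{m\to c}^{Nd}\,\frac{x_m^{Nd}}{x_c^{Nd}} = \frac{s_c - R_c}{R_m^* - R_c^*}\,\lambda_{^{147}\mathrm{Sm}} \tag{7.4.6}$$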

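For compact notation, we define a dimensionless time as

$$\tau = \lambda_{^{147}\mathrm{Sm}}\,t$$

The rate of change of Sm/Nd fractionation in the mantle (assumed to be at steady-state, which is not inconsistent with observations, see Albarede and Brouxel, 1987), will provide additional relationships for the fractionation factors $D$. Inserting expression (7.4.6) for $k$ values into equations (7.4.5) produces

$$\begin{aligned}
\frac{dR_m}{d\tau} &= \frac{x_c^{Nd}}{x_m^{Nd}}\,\frac{s_c - R_c}{R_m^* - R_c^*}\left(1 - D_{m\to c}^{Sm/Nd}\right)R_m + \frac{s_m - R_m}{R_c^* - R_m^*}\left(D_{c\to m}^{Sm/Nd}R_c - R_m\right) - R_m \\
\frac{dR_c}{d\tau} &= \frac{x_m^{Nd}}{x_c^{Nd}}\,\frac{s_m - R_m}{R_c^* - R_m^*}\left(1 - D_{c\to m}^{Sm/Nd}\right)R_c + \frac{s_c - R_c}{R_m^* - R_c^*}\left(D_{m\to c}^{Sm/Nd}R_m - R_c\right) - R_c
\end{aligned}$$

In order to make apparent only two unknowns related to the partition coefficients, we can write

$$D_{c\to m}^{Sm/Nd}R_c - R_m = (R_c - R_m) - \left(1 - D_{c\to m}^{Sm/Nd}\right)R_c,\qquad D_{m\to c}^{Sm/Nd}R_m - R_c = (R_m - R_c) - \left(1 - D_{m\to c}^{Sm/Nd}\right)R_m$$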


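which results in the system

$$\begin{aligned}
\frac{dR_m}{d\tau} &= \frac{x_c^{Nd}}{x_m^{Nd}}\,\frac{s_c - R_c}{R_m^* - R_c^*}\,R_m\left(1 - D_{m\to c}^{Sm/Nd}\right) - \frac{s_m - R_m}{R_c^* - R_m^*}\,R_c\left(1 - D_{c\to m}^{Sm/Nd}\right) + \frac{s_m - R_m}{R_c^* - R_m^*}\,(R_c - R_m) - R_m \\
\frac{dR_c}{d\tau} &= -\frac{s_c - R_c}{R_m^* - R_c^*}\,R_m\left(1 - D_{m\to c}^{Sm/Nd}\right) + \frac{x_m^{Nd}}{x_c^{Nd}}\,\frac{s_m - R_m}{R_c^* - R_m^*}\,R_c\left(1 - D_{c\to m}^{Sm/Nd}\right) + \frac{s_c - R_c}{R_m^* - R_c^*}\,(R_m - R_c) - R_c
\end{aligned}$$

In a matrix form, the system of differential equations becomes

$$\frac{d}{d\tau}\begin{bmatrix} R_m \\ R_c \end{bmatrix} =
\begin{bmatrix}
\dfrac{x_c^{Nd}}{x_m^{Nd}}\,\dfrac{s_c - R_c}{R_m^* - R_c^*}\,R_m & -\dfrac{s_m - R_m}{R_c^* - R_m^*}\,R_c \\
-\dfrac{s_c - R_c}{R_m^* - R_c^*}\,R_m & \dfrac{x_m^{Nd}}{x_c^{Nd}}\,\dfrac{s_m - R_m}{R_c^* - R_m^*}\,R_c
\end{bmatrix}
\begin{bmatrix} 1 - D_{m\to c}^{Sm/Nd} \\ 1 - D_{c\to m}^{Sm/Nd} \end{bmatrix}
+ \begin{bmatrix}
\dfrac{s_m - R_m}{R_c^* - R_m^*}\,(R_c - R_m) - R_m \\
\dfrac{s_c - R_c}{R_m^* - R_c^*}\,(R_m - R_c) - R_c
\end{bmatrix} \tag{7.4.7}$$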

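The matrix on the right-hand side is singular (its determinant is clearly zero), so that the partition coefficients cannot be independently determined. However, the allotments $x$ of Nd between the two reservoirs can be retrieved by adding the two lines of equation (7.4.7) after multiplication of the first row by $x_m^{Nd}$ and of the second row by $x_c^{Nd}$. The terms involving the partition coefficients cancel out and we get

$$x_m^{Nd}\frac{dR_m}{d\tau} + x_c^{Nd}\frac{dR_c}{d\tau} = x_m^{Nd}\left[\frac{s_m - R_m}{R_c^* - R_m^*}\,(R_c - R_m) - R_m\right] + x_c^{Nd}\left[\frac{s_c - R_c}{R_m^* - R_c^*}\,(R_m - R_c) - R_c\right]$$

or, since both rates of change vanish at steady-state,

$$\frac{x_c^{Nd}}{x_m^{Nd}} = \frac{R_m + \dfrac{R_m - s_m}{R_m^* - R_c^*}\,(R_m - R_c)}{\dfrac{s_c - R_c}{R_m^* - R_c^*}\,(R_m - R_c) - R_c}$$

Inserting the numerical values for the modern Earth, we obtain

$$\frac{x_c^{Nd}}{x_m^{Nd}} = \frac{0.25 + \dfrac{0.25 - 0.22}{0.5131 - 0.5118}\,(0.25 - 0.12)}{\dfrac{0.17 - 0.12}{0.5131 - 0.5118}\,(0.25 - 0.12) - 0.12} = 0.666$$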

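About 40 percent of the Nd held by the mantle + crust system therefore resides in the crust. This estimate is contingent on neither the $(^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{bulk}$ nor the $(^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{bulk}$ ratio of the mantle-crust system (in particular, the chondritic silicate-Earth reference). Interestingly, $(^{143}\mathrm{Nd}/^{144}\mathrm{Nd})_{bulk}$ and $(^{147}\mathrm{Sm}/^{144}\mathrm{Nd})_{bulk}$ can be evaluated from the simple mass balance equations

$$\left(\frac{^{143}\mathrm{Nd}}{^{144}\mathrm{Nd}}\right)_{bulk} = x_c^{Nd}\left(\frac{^{143}\mathrm{Nd}}{^{144}\mathrm{Nd}}\right)_c + x_m^{Nd}\left(\frac{^{143}\mathrm{Nd}}{^{144}\mathrm{Nd}}\right)_m = 0.4 \times 0.5118 + 0.6 \times 0.5131 = 0.5126$$

$$\left(\frac{^{147}\mathrm{Sm}}{^{144}\mathrm{Nd}}\right)_{bulk} = 0.4 \times 0.12 + 0.6 \times 0.25 = 0.198$$

These results fall fairly close to chondritic values, 0.51264 and 0.1967, respectively. The Nd kinetic factors (the reciprocal of residence times) can be computed from equations (7.4.6) as

$$k_{c\to m}^{Nd} = \frac{0.22 - 0.25}{0.5118 - 0.5131}\,\frac{1}{0.666}\,\lambda_{^{147}\mathrm{Sm}} \approx 34.7\,\lambda_{^{147}\mathrm{Sm}}$$

$$k_{m\to c}^{Nd} = \frac{0.17 - 0.12}{0.5131 - 0.5118} \times 0.666\,\lambda_{^{147}\mathrm{Sm}} \approx 25.6\,\lambda_{^{147}\mathrm{Sm}}$$

i.e., with $\lambda_{^{147}\mathrm{Sm}} = 6.54 \times 10^{-12}$ a$^{-1}$, residence times of about 4.4 and 6.0 Ga, which hints at steady-state not being attained over the age of the Earth (4.5 Ga). Finally, if we assume that crustal Sm and Nd are recycled to the mantle without fractionation, i.e., $D_{c\to m}^{Sm/Nd} \approx 1$, we get an estimate of $D_{m\to c}^{Sm/Nd}$ from equation (7.4.7) for the crust as

$$0 = -0.25\,\frac{0.17 - 0.12}{0.5131 - 0.5118}\left(1 - D_{m\to c}^{Sm/Nd}\right) + 0 + (0.25 - 0.12)\,\frac{0.17 - 0.12}{0.5131 - 0.5118} - 0.12$$

and, therefore

$$D_{m\to c}^{Sm/Nd} \approx 0.49$$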
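The arithmetic of this two-reservoir problem is easy to retrace numerically. The short sketch below assumes the parameter values used in the numerical application (0.25 and 0.12 for Sm/Nd, 0.5131 and 0.5118 for the isotopic ratios, 0.22 and 0.17 for the rates of change); the variable names a, b and ratio are shorthand introduced here, not notation from the text, and the kinetic factors are left in units of the decay constant.

```python
# Parameter values of the mantle (m) and crust (c) used in the problem
R_m, R_c = 0.25, 0.12          # 147Sm/144Nd
Rs_m, Rs_c = 0.5131, 0.5118    # 143Nd/144Nd
s_m, s_c = 0.22, 0.17          # rates of change of 143Nd/144Nd, in units of lambda

# Right-hand sides of (7.4.6) divided by lambda
a = (s_m - R_m) / (Rs_c - Rs_m)   # k(c->m) * x_c / x_m
b = (s_c - R_c) / (Rs_m - Rs_c)   # k(m->c) * x_m / x_c

# Nd allotment from the sum of the two steady-state Sm/Nd equations
ratio = (R_m + a * (R_m - R_c)) / (b * (R_m - R_c) - R_c)   # x_c / x_m
x_c = ratio / (1.0 + ratio)
x_m = 1.0 - x_c

# Kinetic factors in units of lambda
k_cm = a / ratio
k_mc = b * ratio

# Fractionation factor m->c, assuming D(c->m) = 1 in the crustal equation
D_mc = R_c * (1.0 + 1.0 / b) / R_m

print(f"x_c/x_m = {ratio:.3f}, x_c = {x_c:.2f}, D(m->c) = {D_mc:.2f}")
```

The crustal fraction comes out near 0.40 and the fractionation factor near 0.49, as in the worked solution.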

If that sort of simplistic model is acceptable, the extent of fractionation hints at rather small degrees of melting in the mantle and/or the presence of a residual phase that fractionates Sm from Nd, probably garnet.

7.4.2 Non-linear coupling of geochemical reservoirs

When different elements in interacting reservoirs are allowed to dissociate, to react with each other to form complexes, and to precipitate, the rates of change of concentrations usually become non-linear functions of these concentrations. Since no unique theory can be established to describe all possible situations, a reasonably simple example amenable to calculation and of geological interest will give a perspective for more complete formulations. The outline of a model for the response of carbonates in a simplified ocean-atmosphere system to a sudden influx of carbon dioxide has been put forward by Broecker and Peng (1982), Berner et al. (1983), and Lasaga et al. (1985). The model presented in Figure 7.15 aims at suggesting tendencies and kinetics in the carbonate system on the 2000-200 000 year scale and will resort to drastic approximations which do not hold if time-scales outside of that range are

Figure 7.15 A simple ocean-atmosphere-continent system (labels: atmosphere, $p_{CO_2}$, runoff Ca$^{2+}$). Pressure of CO$_2$ enhances Ca release from the continental crust (which is assumed to be made of CaSiO$_3$) and controls the depth of calcite saturation. Calcite precipitation is therefore controlled by the hypsometric curve, equation (7.4.8), and $p_{CO_2}$.

considered. Because a well-mixed ocean is assumed, this model is particularly inappropriate for short-term predictions of the possible effects of fossil fuel burning. It may, however, be used to discuss the consequences of the glacial-interglacial changes in CO$_2$ that have been found in polar ice cores (Barnola et al., 1987). Assumptions of the present model are:

(1) The mass $M$ of the ocean (1.37 × 10$^{21}$ kg) and the continental runoff $R$ (3.6 × 10$^{22}$ kg Ma$^{-1}$) are constant over 200 000 years.

(2) Atmospheric CO$_2$ equilibrates instantaneously with dissolved oceanic carbonates. Actually, Broecker and Peng (1982) show that the rate-limiting step for this process is vertical mixing in the ocean, which takes place in approximately 1600 years.

(3) The ocean is chemically homogeneous and, in addition to carbonates, contains Ca$^{2+}$ plus enough inert ions to adjust alkalinity to the values observed in modern seawater (≈ 2.1 × 10$^{-3}$ eq kg$^{-1}$). The inert ions, assigned as Na$^+$ and Cl$^-$ for convenience, are assumed to be time-invariant in the ocean.

(4) CaCO$_3$ surface productivity $P$ by organisms is constant. CaCO$_3$ preservation is proportional to the oceanic surface shallower than a depth $z_s$ which we identify with both the carbonate compensation depth (CCD) and the lysocline. The bathymetry of the oceans is approximated by a normal function with a mean value of 5 km and a standard deviation of 1.25 km. These figures are in reasonable agreement with the hypsometric curve of Menard and Smith (1966). By virtue of the relationship between normal functions and error functions (Chapter 8, Appendix A), the fraction of the ocean surface which is shallower than $z_s$ is therefore given by

$$F(z_s) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{z_s - 5}{1.25\sqrt{2}}\right)\right] \tag{7.4.8}$$


(5) Solubility of calcite increases significantly with increasing pressure. In the present ocean, which contains 10.3 × 10$^{-3}$ mol kg$^{-1}$ Ca$^{2+}$, the depth $z_s$ of CaCO$_3$ saturation is given by $[\mathrm{CO_3^{2-}}] = 90 \times 10^{-6} \times e^{0.16(z_s - 4)}$ (Broecker and Takahashi, 1978). $z_s$ is the depth in km and carbonate concentrations are in mol kg$^{-1}$. Since seawater Ca$^{2+}$ concentration is allowed to change as a result of calcite precipitation and river input, it is more general to state that the saturation product $K_s$ changes as

$$K_s = [\mathrm{Ca^{2+}}][\mathrm{CO_3^{2-}}] = 0.927 \times 10^{-6} \times e^{0.16(z_s - 4)} \tag{7.4.9}$$

(6) Dissolution of carbonate sediments is, probably unduly (Broecker and Peng, 1982), neglected.

(7) Continental crust is assumed to be made of calcium silicate. High CO$_2$ pressure increases weathering and Ca$^{2+}$ in the runoff according to the fictitious reaction

$$\mathrm{CaSiO_3 + CO_2 = SiO_2 + Ca^{2+} + CO_3^{2-}} \tag{7.4.10}$$

A different weathering reaction proposed by Berner et al. (1983) involves HCO$_3^-$ instead, but is entirely equivalent to the present equation when HCO$_3^-$ dissociation is taken into account.

(8) Activity and fugacity coefficients are equal to unity.

(9) The present-day concentrations are taken as steady-state values.

(10) The ocean-atmosphere system contains 35 × 10$^{15}$ kg C, only 2 percent of which resides in the atmosphere (Berner and Berner, 1987).

Assuming that 0.7 × 10$^{15}$ kg reduced C from forests or fossil fuel are nearly instantaneously burned as CO$_2$ at $t = 0$, i.e., that an additional 2 percent oxidized carbon is injected into the ocean-atmosphere system, calculate how Ca$^{2+}$, alkalinity, and $\Sigma(\mathrm{CO_2})$ in the ocean, the depth of Ca-carbonate saturation (lysocline), and the atmospheric $p_{CO_2}$ change over 100 000 years.

Carbonic acid is diprotic and its dissociation may be written with the two equations

$$K_1 = \frac{[\mathrm{H^+}][\mathrm{HCO_3^-}]}{[\mathrm{H_2CO_3}]},\qquad K_2 = \frac{[\mathrm{H^+}][\mathrm{CO_3^{2-}}]}{[\mathrm{HCO_3^-}]} \tag{7.4.11}$$

while dissolution of atmospheric carbon dioxide in ocean and river water is assumed to obey Henry's law with constant solubility coefficient $\alpha$

$$[\mathrm{H_2CO_3}] = \alpha\,p_{CO_2} \tag{7.4.12}$$

Combining the last three equations gives the CO$_2$ pressure as a function of the speciation of carbonates

$$p_{CO_2} = \frac{K_2}{\alpha K_1}\,\frac{[\mathrm{HCO_3^-}]^2}{[\mathrm{CO_3^{2-}}]} \tag{7.4.13}$$


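For pH > p$K_1$, the total of carbonates can be approximated as

$$\Sigma(\mathrm{CO_2}) \approx [\mathrm{HCO_3^-}] + [\mathrm{CO_3^{2-}}] \tag{7.4.14}$$

while, neglecting minor ions H$^+$ and OH$^-$, electrical neutrality demands

$$2[\mathrm{Ca^{2+}}] + [\mathrm{Na^+}] = [\mathrm{HCO_3^-}] + 2[\mathrm{CO_3^{2-}}] + [\mathrm{Cl^-}]$$

For short, we write $p = p_{CO_2}$, $x = [\mathrm{Ca^{2+}}]$, $y = \Sigma(\mathrm{CO_2})$, and label the riverine values with the subscript r. The alkalinity $A$ of a solution (Stumm and Morgan, 1981) is defined as its neutralizing capacity

$$A = 2[\mathrm{Ca^{2+}}] + [\mathrm{Na^+}] - [\mathrm{Cl^-}] = [\mathrm{HCO_3^-}] + 2[\mathrm{CO_3^{2-}}] \tag{7.4.15}$$

Subtracting (7.4.14) from (7.4.15) and turning to the short notation results in

$$[\mathrm{CO_3^{2-}}] = A - \Sigma(\mathrm{CO_2}) = A - y \tag{7.4.16}$$

while, upon subtracting (7.4.15) from twice equation (7.4.14), we get

$$[\mathrm{HCO_3^-}] = 2\Sigma(\mathrm{CO_2}) - A = 2y - A \tag{7.4.17}$$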

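Introducing $K' = \alpha K_1/K_2$, we now express equation (7.4.13) as

$$K'p = \frac{(2y - A)^2}{A - y} \tag{7.4.18}$$

Rearranging, we get the second-order polynomial in $y$

$$4y^2 - (4A - K'p)\,y + A^2 - AK'p = 0$$

which has one non-negative solution (see Broecker and Peng, 1982)

$$y = \frac{1}{8}\left\{4A - K'p + \left[(4A - K'p)^2 - (4A)^2 + 16AK'p\right]^{1/2}\right\} \tag{7.4.19}$$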
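A few lines of code are enough to check the speciation formulas against each other. The sketch below evaluates the root (7.4.19) and the two species from (7.4.16) and (7.4.17); the value K' = 113 mol kg$^{-1}$ atm$^{-1}$ and the modern alkalinity and CO$_2$ pressure are those quoted for this model, while the function names are illustrative only.

```python
import math

K_PRIME = 113.0   # K' = alpha*K1/K2 in mol kg^-1 atm^-1, value quoted for this model


def total_carbonate(alk, p_co2, k_prime=K_PRIME):
    """Sigma(CO2) as the retained root of 4y^2 - (4A - K'p)y + A^2 - AK'p = 0."""
    b = 4.0 * alk - k_prime * p_co2
    disc = b * b - 16.0 * (alk * alk - alk * k_prime * p_co2)
    return (b + math.sqrt(disc)) / 8.0


def speciation(alk, p_co2, k_prime=K_PRIME):
    """Bicarbonate and carbonate concentrations from alkalinity and pCO2."""
    y = total_carbonate(alk, p_co2, k_prime)
    return {"sum_co2": y, "hco3": 2.0 * y - alk, "co3": alk - y}


modern = speciation(alk=2.1e-3, p_co2=3.0e-4)
print(modern)
```

For the modern values the total carbonate comes out close to 2.0 × 10$^{-3}$ mol kg$^{-1}$, and substituting the two species back into (7.4.18) returns K'p, which is a convenient check on the sign of the root.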

The weathering reaction leads to the relationship

$$\frac{[\mathrm{Ca^{2+}}]_r\,[\mathrm{CO_3^{2-}}]_r}{p_{CO_2}} = K^*$$

where $K^*$ is the equilibrium constant. We assume that rivers transport Ca$^{2+}$ but very little other major ion besides carbonates, i.e.,

$$A_r = 2x_r$$


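and therefore

$$K^* p = x_r\,(2x_r - y_r)$$

Combining with equation (7.4.19) rewritten for runoff concentrations

$$y_r = \frac{1}{8}\left\{8x_r - K'p + \left[(8x_r - K'p)^2 - (8x_r)^2 + 32x_rK'p\right]^{1/2}\right\}$$

we get an explicit expression that can be solved for $K^*$ if $x_r$ and $p_{CO_2}$ are known

$$K^* = \frac{x_r}{p}\left\{x_r + \frac{K'p}{8} - \left[\left(\frac{K'p}{8}\right)^2 + \frac{x_rK'p}{4}\right]^{1/2}\right\} \tag{7.4.21}$$

This equation can also be solved for $x_r$ if $p_{CO_2}$ and $K^*$ are known by transformation into a fourth-degree polynomial. It is nevertheless solved faster using a Newton refinement scheme since a reasonably good initial estimate is usually available. The system is controlled by three independent variables, Ca$^{2+}$, $\Sigma(\mathrm{CO_2})$ and alkalinity $A$. In differential form, the rates of change can be written as the difference between the input rate in runoff and output rate by CaCO$_3$ sedimentation

$$\begin{aligned}
M\frac{dx}{dt} &= R\,x_r[p(x,y,A)] - P\,F[z_s(x,y,A)] \\
M\frac{dy}{dt} &= R\,y_r - P\,F[z_s(x,y,A)] \\
M\frac{dA}{dt} &= 2R\,x_r[p(x,y,A)] - 2P\,F[z_s(x,y,A)]
\end{aligned} \tag{7.4.23}$$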

with the initial conditions $x = x_0$, $y = y_0$, $A = A_0$. The rate of Ca and carbonate removal (CaCO$_3$ sedimentation) is equal to the calcite productivity $P$ multiplied by the preservation function $F$. We take $K' = 113$ mol kg$^{-1}$ atm$^{-1}$. At steady-state, which we note with the superscript †, $p^\dagger = 3 \times 10^{-4}$ atm, $x^\dagger = 10.3 \times 10^{-3}$ mol kg$^{-1}$, $x_r^\dagger = 3.6 \times 10^{-4}$ mol kg$^{-1}$. Since in this model CaCO$_3$ is the only sink of carbon dioxide, $x_r^\dagger = y_r^\dagger$ at steady-state, and therefore $y_r = y_r^\dagger = 3.6 \times 10^{-4}$. This is enough to estimate the weathering constant $K^*$ from equation (7.4.21)

$$K^* = \frac{0.00036}{0.0003}\left\{0.00036 + \frac{113 \times 0.0003}{8} - \left[\left(\frac{113 \times 0.0003}{8}\right)^2 + \frac{113 \times 0.0003 \times 0.00036}{4}\right]^{1/2}\right\}$$


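or $K^* = 1.69 \times 10^{-5}$. Under a steady-state CO$_2$ pressure of 3 × 10$^{-4}$ atm, $A^\dagger = 2.1 \times 10^{-3}$ eq kg$^{-1}$. Inserting this value into equation (7.4.19) gives the steady-state $\Sigma(\mathrm{CO_2})$

$$y^\dagger = \frac{1}{8}\left\{4 \times 0.0021 - 113 \times 0.0003 + \left[(4 \times 0.0021 - 113 \times 0.0003)^2 - (4 \times 0.0021)^2 + 16 \times 0.0021 \times 113 \times 0.0003\right]^{1/2}\right\}$$

or

$$y^\dagger = 1.995 \times 10^{-3}\ \mathrm{mol\,kg^{-1}}$$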

We now need to compute the calcite productivity term $P$ from the conditions at steady-state

$$R\,x_r^\dagger = P\,F(z_s^\dagger)$$

which requires the preservation at steady-state to be estimated. Recasting equation (7.4.9) as a function of the variables $x$, $y$, and $A$ gives

$$x\,(A - y) = 0.927 \times 10^{-6} \times e^{0.16(z_s - 4)}$$

and at steady-state, the calcite saturation depth can be computed from

$$z_s^\dagger = 4 + \frac{13.89 + \ln\left[x^\dagger\,(A^\dagger - y^\dagger)\right]}{0.16}$$

Replacing the variables by numerical values, it becomes

$$z_s^\dagger = 4 + \frac{13.89 + \ln\left[10.3 \times 10^{-3}\,(2.100 - 1.995) \times 10^{-3}\right]}{0.16} = 4.9\ \mathrm{km}$$

The fraction of the calcite produced which reaches seafloor at steady-state is

$$F(z_s^\dagger) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{4.9 - 5}{1.25\sqrt{2}}\right)\right] = 0.468$$

which gives calcite production rate as

$$P = \frac{3.6 \times 10^{22} \times 3.6 \times 10^{-4}}{0.468} = 2.77 \times 10^{19}\ \mathrm{mol\,Ma^{-1}}$$

Introducing the time-constants characteristic of seawater renewal by runoff and calcite


entrainment as

$$\tau_R = M/R = 1.37 \times 10^{21}/3.6 \times 10^{22} = 0.0381\ \mathrm{Ma}$$

$$M/P = 1.37 \times 10^{21}/2.77 \times 10^{19} = 49.5\ \mathrm{kg\,Ma\,mol^{-1}}$$

the system of differential equations can finally be written in a non-linear form amenable to calculation

$$\begin{aligned}
\frac{dx}{dt} &= \frac{x_r[p(x,y,A)]}{0.0381} - \frac{F[z_s(x,y,A)]}{49.5} \\
\frac{dy}{dt} &= \frac{3.6 \times 10^{-4}}{0.0381} - \frac{F[z_s(x,y,A)]}{49.5} \\
\frac{dA}{dt} &= 2\,\frac{dx}{dt}
\end{aligned} \tag{7.4.24}$$

At time $t$, the state of the system is defined by a set of variables $x$, $y$, and $A$. The atmospheric $p_{CO_2}$ is calculated through equation (7.4.18), Ca$^{2+}$ in runoff ($x_r$) through equation (7.4.22). The saturation depth $z_s$ is obtained through equation (7.4.9), then the preservation function $F$ through equation (7.4.8). Integration can be carried out numerically using a Runge-Kutta method.
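The chain of evaluations just described (state, then CO$_2$ pressure, runoff Ca, saturation depth, preservation, and finally the rates) lends itself to a compact numerical sketch. The version below is an assumption-laden illustration rather than the original calculation: it uses a hand-written fourth-order Runge-Kutta step with a fixed 100-year step, replaces the Newton refinement by a plain bisection, and recomputes $K^*$ and $M/P$ from the unrounded steady-state values so that the steady state is exactly stationary (hence small differences from the rounded figures in the text). All function names are hypothetical.

```python
import math

K_PRIME = 113.0    # K' in mol kg^-1 atm^-1 (from the text)
TAU_R = 0.0381     # M/R in Ma (from the text)
P_ST, X_ST, XR_ST, A_ST = 3.0e-4, 10.3e-3, 3.6e-4, 2.1e-3   # steady-state values


def total_carbonate(alk, p):
    """Sigma(CO2) from alkalinity and pCO2, equation (7.4.19)."""
    b = 4.0 * alk - K_PRIME * p
    return (b + math.sqrt(b * b - 16.0 * (alk * alk - alk * K_PRIME * p))) / 8.0


def p_co2(y, alk):
    """pCO2 from Sigma(CO2) and alkalinity, equation (7.4.18)."""
    return (2.0 * y - alk) ** 2 / (K_PRIME * (alk - y))


def weathering_quotient(xr, p):
    """[Ca]_r [CO3]_r / pCO2 for runoff with A_r = 2 x_r."""
    return xr * (2.0 * xr - total_carbonate(2.0 * xr, p)) / p


K_STAR = weathering_quotient(XR_ST, P_ST)   # weathering constant K*


def runoff_ca(p):
    """Runoff Ca at a given pCO2, by bisection on the weathering relation."""
    lo, hi = 0.0, 1.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if weathering_quotient(mid, p) < K_STAR:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def saturation_depth(x, y, alk):
    """z_s in km from equation (7.4.9) with [CO3] = A - y."""
    return 4.0 + math.log(x * (alk - y) / 0.927e-6) / 0.16


def preserved_fraction(zs):
    """Fraction of the ocean floor shallower than z_s, equation (7.4.8)."""
    return 0.5 * (1.0 + math.erf((zs - 5.0) / (1.25 * math.sqrt(2.0))))


Y_ST = total_carbonate(A_ST, P_ST)
F_ST = preserved_fraction(saturation_depth(X_ST, Y_ST, A_ST))
TAU_P = TAU_R * F_ST / XR_ST   # M/P recomputed so that R x_r = P F holds exactly


def rates(state):
    """Right-hand sides of equations (7.4.24) for state = (x, y, A)."""
    x, y, alk = state
    removal = preserved_fraction(saturation_depth(x, y, alk)) / TAU_P
    dx = runoff_ca(p_co2(y, alk)) / TAU_R - removal
    dy = XR_ST / TAU_R - removal
    return (dx, dy, 2.0 * dx)


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))
    k1 = rates(state)
    k2 = rates(shifted(k1, 0.5 * dt))
    k3 = rates(shifted(k2, 0.5 * dt))
    k4 = rates(shifted(k3, dt))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate(state, t_end=0.1, dt=1.0e-4):
    """Integrate from t = 0 to t_end (Ma) and return every visited state."""
    history = [state]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt)
        history.append(state)
    return history


# Perturbed start: 2 percent more oxidized carbon at unchanged Ca and alkalinity
history = integrate((X_ST, 1.02 * Y_ST, A_ST))
```

Calling `p_co2(y, A)` on each stored state gives the time series of atmospheric CO$_2$ pressure; the step size was chosen well below the fastest response time of the linearized system, but it remains an assumption that should be checked by halving it.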

Figure 7.16 Medium-term response of the model described in Figure 7.15 to a sudden increase by 2 percent of the total amount of oxidized carbon in the ocean-atmosphere system. Ca (top) and $\Sigma(\mathrm{CO_2})$ (bottom) in the ocean, plotted against time $t$ (0 to 0.1 Ma).

Figure 7.17 Same as Figure 7.16 for $p_{CO_2}$, seawater alkalinity $A$, runoff [Ca$^{2+}$], and the fraction $F$ of precipitated calcite which is preserved on the ocean floor, plotted against time $t$ (0 to 0.1 Ma). It takes a little less than 10 000 years for runoff calcium to neutralize the excess dissolved CO$_2$, but calcite precipitation takes much longer to eliminate Ca and carbonate excess.


The constitutive variables at $t = 0$ are $x = 10.3 \times 10^{-3}$ mol kg$^{-1}$, $y = 1.02 \times 1.995 \times 10^{-3}$ mol kg$^{-1}$, and $A = 2.1 \times 10^{-3}$ eq kg$^{-1}$. The system of differential equations has been solved using commercial Runge-Kutta software (MatLab) up to $t$ = 100 000 years. The major features of this calculation are quite interesting (Figures 7.16 and 7.17). An input of CO$_2$ into the ocean-atmosphere system results in a shallower saturation level and a reduced preservation $F$: more CO$_2$ dissolves, the pH decreases and HCO$_3^-$ dominates over CO$_3^{2-}$. For constant alkalinity, the system accommodates more carbon as bicarbonates. Since poor preservation does not permit evacuation of the excess HCO$_3^-$ and CO$_3^{2-}$, the system must 'wait' until the burst of atmospheric CO$_2$ enhances erosion and drives enough Ca to the sea. Seawater alkalinity is raised in this process, the excess carbonic acid is neutralized and calcite sedimentation can resume. It takes roughly 10 000 years for this neutralization to take place and the lysocline to be restored. It takes another 100 000 years for the system to eliminate the excess calcium and carbonate ions by carbonate sedimentation. Further practical applications of modeling methods will be found in Walker (1991).

8 Transport, advection, and diffusion

8.1 Fluxes

8.1.1 Basic definitions

We already used the concept of conservative process in Chapter 1. In the present context, conservative properties are scalar, vector, or even tensor variables that can be added or subtracted. They can only be modified by exchange with the surroundings across interfaces and by being locally stored in sinks or locally released from sources. Mass and energy are conservative scalar properties, concentration and temperature are not. Momentum is a conservative vector property, velocity in general is not. A flux represents an 'amount' of a conservative property transported in a given time across a boundary, an outlet, or an imaginary surface. It may be a number of cars driving on a highway on Sunday, the mass of water running through a fault zone in a year, the heat flow across the Earth's surface or the vector describing the rate of transport of electromagnetic energy through space. In proper words, we should call flux density whatever flux refers to a unit time and a unit surface and reserve the use of flux for less specific situations. As common usage has unfortunately decided differently, we will have to be careful about the possible ambiguities associated with flux denomination.

Let us consider a medium moving with velocity $\mathbf{v}$ (components $v_x$, $v_y$, $v_z$). A medium with non-zero velocity is said to be advective. Let us first define in the most general way the flux of volume at a point M of the familiar 3D space: this is simply the quantity of volume moving across the unit surface perpendicular to $\mathbf{v}$ per unit time. For an arbitrary surface $\delta S$ next to M and perpendicular to $\mathbf{v}$ (Figure 8.1) and during time $dt$, the volume will be $dV = \delta S\,|\mathbf{v}|\,dt$ and the flux per unit time and unit surface will be the vector $\mathbf{J}_v$ such that

$$\mathbf{J}_v = \frac{\delta S\,\mathbf{v}\,dt}{\delta S\,dt} = \mathbf{v}$$

Figure 8.1 For a surface perpendicular to the displacement, the flux of volume through the element of surface $\delta S$ during the infinitesimal time interval $dt$ is the vector $\mathbf{v}\,\delta S\,dt$. $\mathbf{v}$ therefore is the flux of volume crossing the unit surface per unit time.

In a moving medium, velocity therefore represents the flux of volume. The flux of material $\mathbf{J}$ associated with the movement is the amount of matter carried by the flux of volume. This flux is variable in space and time. If $\rho$ is the local density (in kg m$^{-3}$) of the medium, this flux is simply

$$\mathbf{J} = \mathbf{J}_v\,\rho = \rho\,\mathbf{v}$$

If $C^i$ (in kg/kg) is the local concentration of the species $i$ and if there is no relative movement of different species relative to each other, the flux of the species $i$ will be the vector $\mathbf{J}^i$ such that

$$\mathbf{J}^i = \rho\,\mathbf{v}\,C^i \tag{8.1.1}$$

If species $i$ moves with a velocity $\mathbf{v}^i$ different from that of other species, this expression must be modified as

$$\mathbf{J}^i = \rho_i\,\mathbf{v}^i = \rho\,\mathbf{v}^i\,C^i \tag{8.1.2}$$

where $\rho_i = \rho\,C^i$ is the mass density of component $i$, again in kg m$^{-3}$. The total flux of matter $\mathbf{J}$ is the sum of the fluxes $\mathbf{J}^i$ of its individual components $i$ and therefore

$$\mathbf{J} = \sum_i \mathbf{J}^i \tag{8.1.3}$$

The velocity $\mathbf{v}$ of the moving medium is the mean velocity for all species weighted by their concentration

$$\mathbf{v} = \sum_i C^i\,\mathbf{v}^i \tag{8.1.4}$$

Were other units used for concentrations, namely mole numbers instead of weight,


Figure 8.2 For a surface which is not perpendicular to the displacement, only the component of the velocity perpendicular to the surface, i.e., vn measures the flux.

volume instead of mass, other reference frames should be chosen for the mean velocity (e.g., Brady, 1975). Now let us define the fluxes across an arbitrary surface (Figure 8.2): they simply are the scalar amount of volume, mass, or species $i$ which crosses an arbitrary surface, not necessarily perpendicular to $\mathbf{v}$. This flux, which is here represented by the lower-case letter $j$, is the projection of the vector flux onto the normal to the surface. Since the dot product $\mathbf{v}\cdot\mathbf{n}$ (or $\mathbf{v}^T\mathbf{n}$) is the projection of $\mathbf{v}$ onto the normal to the surface, the flux of volume $j_v$ per unit surface is

$$j_v = \mathbf{v}\cdot\mathbf{n}$$

whereas the flux across the arbitrary surface will be

$$j_v\,dS = \mathbf{v}\cdot\mathbf{n}\,dS = \mathbf{v}\cdot d\mathbf{S}$$

where use is made of the oriented surface $d\mathbf{S}$, a vector defined as the product of the scalar surface $dS$ with the normal $\mathbf{n}$

$$d\mathbf{S} = \mathbf{n}\,dS$$

For a closed surface, the convention is to use a normal oriented outwards. Flux $j$ of matter and flux $j^i$ of species $i$ through $d\mathbf{S}$ will be

$$j\,dS = \rho\,\mathbf{v}\cdot d\mathbf{S}$$

and

$$j^i\,dS = \rho\,C^i\,\mathbf{v}\cdot d\mathbf{S}$$

respectively.
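The projection onto the normal is easy to illustrate numerically. The following sketch computes the amount of a species crossing a small flat surface per unit time as the product of density, concentration, the normal component of velocity, and area; the function names and the numbers (a water-like density, a 1 percent concentration) are purely illustrative assumptions.

```python
import math


def dot(u, w):
    """Dot product of two 3-component vectors."""
    return sum(a * b for a, b in zip(u, w))


def unit(n):
    """Normalize a vector so that it can serve as the surface normal."""
    norm = math.sqrt(dot(n, n))
    return tuple(c / norm for c in n)


def species_flux_through(v, normal, area, rho, conc):
    """Amount of species crossing the surface per unit time: rho * C * (v . n) * dS."""
    return rho * conc * dot(v, unit(normal)) * area


v = (2.0, 0.0, 0.0)   # velocity in m/s (illustrative)
head_on = species_flux_through(v, (1.0, 0.0, 0.0), 3.0, 1000.0, 0.01)
tilted = species_flux_through(v, (math.cos(math.pi / 3), math.sin(math.pi / 3), 0.0),
                              3.0, 1000.0, 0.01)
grazing = species_flux_through(v, (0.0, 1.0, 0.0), 3.0, 1000.0, 0.01)
```

Tilting the normal 60 degrees away from the velocity halves the flux, and a surface parallel to the flow is crossed by nothing, which is the content of the dot product in the expressions above.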

8.2 The divergence theorem and the conservation equations

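Let us consider the material balance in an infinitesimal cubic volume $dV$ around a point M with edges $dx$, $dy$, $dz$, so

$$dV = dx\,dy\,dz$$

The balance of material in the $x$ direction is the difference between the fluxes through the surfaces perpendicular to the $x$ axis (Figure 8.3). This corresponds to the difference

$$\mathbf{J}\!\left(x + \tfrac{dx}{2}, y, z\right)\cdot(\mathbf{i}\,dy\,dz) - \mathbf{J}\!\left(x - \tfrac{dx}{2}, y, z\right)\cdot(\mathbf{i}\,dy\,dz)$$

where notation emphasizes the dependence of the flux on the position variables and unit vectors, or its first-order expansion

$$\left[\mathbf{J}(x,y,z) + \frac{\partial \mathbf{J}(x,y,z)}{\partial x}\frac{dx}{2}\right]\cdot\mathbf{i}\,dy\,dz - \left[\mathbf{J}(x,y,z) - \frac{\partial \mathbf{J}(x,y,z)}{\partial x}\frac{dx}{2}\right]\cdot\mathbf{i}\,dy\,dz$$

This equation is equal to

$$\frac{\partial\left[\mathbf{J}(x,y,z)\cdot\mathbf{i}\right]}{\partial x}\,dx\,dy\,dz$$

The dot product $\mathbf{J}(x,y,z)\cdot\mathbf{i}$ is simply the component $J_x$ of $\mathbf{J}(x,y,z)$ along the $x$ axis. The total change over the volume $dV$ is the sum of the mass transport in the three directions, i.e.,

$$\left(\frac{\partial J_x}{\partial x} + \frac{\partial J_y}{\partial y} + \frac{\partial J_z}{\partial z}\right)dx\,dy\,dz = \operatorname{div}\mathbf{J}(x,y,z)\,dV$$

The divergence of the flux vector is therefore the net rate of accumulation of the quantity which is transported in and out of the volume element $dV$. This can be integrated over an arbitrary volume $\Omega$ limited by the surface $\Sigma$ to give the divergence theorem of Gauss

$$\iiint_\Omega \operatorname{div}\mathbf{J}(x,y,z)\,dV = \iint_\Sigma \mathbf{J}(x,y,z)\cdot d\mathbf{S}$$

8.2.1 The continuity equation

In the real world, matter must be conserved. Let us relate the rate of variation of the mass contained in an arbitrary volume $\Omega$ to the flux across the surface $\Sigma$

$$\frac{d}{dt}\iiint_\Omega \rho\,dV = \left\{\text{sum of all the fluxes across the boundaries}\right\} = -\iint_\Sigma \mathbf{J}(x,y,z)\cdot d\mathbf{S}$$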


Figure 8.3 Material balance in a moving medium through the faces of an infinitesimal cube centered in M.

On the left-hand side, the order of integration relative to space coordinates and derivation relative to time can be interchanged, provided the time derivative is specified to be local at the point M($x$, $y$, $z$): a partial derivative sign is therefore necessary. On the right-hand side, the minus sign is necessary to make a flux vector pointing outwards decrease the mass content of the volume. The divergence theorem can be applied, which results in

$$\iiint_\Omega \frac{\partial \rho}{\partial t}\,dV = -\iiint_\Omega \operatorname{div}\mathbf{J}\,dV$$

Once mass flux is expressed as a function of density and velocity, it becomes

$$\iiint_\Omega \frac{\partial \rho}{\partial t}\,dV = -\iiint_\Omega \operatorname{div}\rho\mathbf{v}\,dV \tag{8.2.1}$$

This equation holds true for any arbitrary volume, and so must be true for the volume element $dV$. We can therefore drop the integration sign and term $dV$ to obtain

$$\frac{\partial \rho}{\partial t} = -\operatorname{div}\rho\mathbf{v} \tag{8.2.2}$$

which is the continuity equation. Velocities in an incompressible (isochoric) medium, i.e., a medium with constant $\rho$, will therefore satisfy

$$\operatorname{div}\mathbf{v} = 0 \tag{8.2.3}$$

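8.2.2 The general transport equation

The elemental conservation equation is laid down on the same principles as the continuity equation. The rate of variation of the amount of element $i$ contained in an arbitrary volume $\Omega$ due to the flux $\mathbf{J}^i(x, y, z)$ across the surface $\Sigma$, and to chemical reactions is

$$\frac{d}{dt}\iiint_\Omega \rho\,C^i\,dV = -\iint_\Sigma \mathbf{J}^i(x,y,z)\cdot d\mathbf{S} + \iiint_\Omega \sum_k A_k^i\,dV$$

where $A_k^i$ is the production (> 0) or consumption (< 0) rate of the species $i$ in the $k$th chemical reaction (in mol or g per unit time and volume). If $A_k^i = 0$ for all $k$, the species is conservative. Note the minus sign on the first right-hand side term. As for the continuity equation, we change the order of derivation and integration on the left-hand side and apply the divergence theorem on the right-hand side

$$\iiint_\Omega \frac{\partial(\rho\,C^i)}{\partial t}\,dV = -\iiint_\Omega \operatorname{div}\mathbf{J}^i\,dV + \iiint_\Omega \sum_k A_k^i\,dV \tag{8.2.4}$$

This equation holds for any arbitrary volume, and so must be true for the volume element $dV$. We can therefore drop the integration signs and the common factor $dV$ in order to obtain

$$\frac{\partial(\rho\,C^i)}{\partial t} = -\operatorname{div}\mathbf{J}^i + \sum_k A_k^i$$

or, introducing the expression of $\mathbf{J}^i(x,y,z)$ as a function of the velocity $\mathbf{v}^i$

$$\frac{\partial(\rho\,C^i)}{\partial t} = -\operatorname{div}\rho\,\mathbf{v}^i C^i + \sum_k A_k^i$$

Let us split the flux term $\rho\,\mathbf{v}^i C^i$ into the purely advective term $\rho\,\mathbf{v}\,C^i$ and a diffusive term $\rho(\mathbf{v}^i - \mathbf{v})C^i$, hereafter also referred to as $\mathcal{J}^i$, describing the movement of the species $i$ relative to the center of mass

$$\frac{\partial(\rho\,C^i)}{\partial t} = -\operatorname{div}\rho(\mathbf{v}^i - \mathbf{v})C^i - \operatorname{div}\rho\,\mathbf{v}\,C^i + \sum_k A_k^i$$

Expanding the derivatives of the left-hand side and the second divergence term on the right-hand side gives

$$C^i\frac{\partial \rho}{\partial t} + \rho\frac{\partial C^i}{\partial t} = -\operatorname{div}\mathcal{J}^i - C^i\operatorname{div}\rho\mathbf{v} - \rho\,\mathbf{v}\cdot\operatorname{grad}C^i + \sum_k A_k^i$$

which, using the continuity equation (8.2.2), can be simplified into

$$\rho\frac{\partial C^i}{\partial t} = -\operatorname{div}\mathcal{J}^i - \rho\,\mathbf{v}\cdot\operatorname{grad}C^i + \sum_k A_k^i \tag{8.2.5}$$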


This is the fundamental transport equation for the species $i$, which does not depend on any assumption other than mass conservation. This equation is valid for a fixed reference frame, such as, using a comparison borrowed from Bird et al. (1960), counting fish in a river from a bridge. Occasionally, the observer wants to track the concentration in a given parcel of matter: say that he is now counting fish from a boat carried by the stream. Let us write the total differential of $C^i$

$$dC^i = \left(\frac{\partial C^i}{\partial t}\right)_{xyz}dt + \left(\frac{\partial C^i}{\partial x}\right)_{yzt}dx + \left(\frac{\partial C^i}{\partial y}\right)_{xzt}dy + \left(\frac{\partial C^i}{\partial z}\right)_{xyt}dz$$

Divide everything by $dt$

$$\frac{dC^i}{dt} = \left(\frac{\partial C^i}{\partial t}\right)_{xyz} + \left(\frac{\partial C^i}{\partial x}\right)_{yzt}\frac{dx}{dt} + \left(\frac{\partial C^i}{\partial y}\right)_{xzt}\frac{dy}{dt} + \left(\frac{\partial C^i}{\partial z}\right)_{xyt}\frac{dz}{dt}$$

If concentrations have to be known at a point moving with an arbitrary velocity, the increment ratios $dx/dt$, $dy/dt$, and $dz/dt$ can be constrained accordingly. Commonly, the velocity will be that of the medium itself (the boat is now drifting freely) and

$$\frac{dx}{dt} = v_x,\qquad \frac{dy}{dt} = v_y,\qquad \frac{dz}{dt} = v_z$$

where $v_x$, $v_y$, and $v_z$ are the components of the rate of displacement $\mathbf{v}$, which amounts to 'pinning' the ratio $dC^i/dt$ to the moving medium. This ratio is usually called the substantial derivative and noted with upper case D

$$\frac{DC^i}{Dt} = \frac{\partial C^i}{\partial t} + v_x\frac{\partial C^i}{\partial x} + v_y\frac{\partial C^i}{\partial y} + v_z\frac{\partial C^i}{\partial z} = \frac{\partial C^i}{\partial t} + \mathbf{v}\cdot\operatorname{grad}C^i$$

The rate of change of $C^i$ along the movement therefore is

$$\rho\frac{DC^i}{Dt} = -\operatorname{div}\mathcal{J}^i + \sum_k A_k^i \tag{8.2.6}$$

8.3 Advection and percolation

In the case of pure advection (no molecular transport), the diffusion term in the general transport equation (8.2.5) is made equal to zero and time-dependent mass balance is expressed as

$$\frac{\partial C^i}{\partial t} = -\mathbf{v}\cdot\operatorname{grad}C^i + \frac{1}{\rho}\sum_k A_k^i \tag{8.3.1}$$


8.3.1 Effect of bioturbation on concentration profiles in sediments

The mechanical activity of burrowing and digging animals, such as worms, mixes the surface layer of deep-sea sediments and therefore smears the stratigraphic record of the chemical changes at the sediment surface. This problem has been investigated quite thoroughly by a number of authors interested in recovering the original history of chemical fluxes (Berger and Heath, 1968; Ruddiman and Glover, 1972; Guinasso and Schink, 1975). The simplest approach considers a perfectly mixed bioturbated layer of thickness $L$ and homogeneous concentration $C^i$. If $v$ is the sedimentation rate, the mass balance condition for element $i$ reads

$$\rho L\frac{dC^i}{dt} = J_0^i(t) - \rho\,v\,C^i$$

where $J_0^i(t)$ is the flux (mol m$^{-2}$ s$^{-1}$) of $i$ at the sediment-water interface. Let us assign arbitrarily a time equal to zero to a layer presently at depth $z_0$ below the bottom of the bioturbated layer. The layer now at depth $z$ was formed at time $t$ such that

$$z = z_0 - \int_0^t v(\tau)\,d\tau$$

For a bioturbated layer of constant thickness $L$, the conservation equation now becomes

$$\rho L\,dC^i = -\left[J_0^i(z) - \rho\,v\,C^i\right]\frac{dz}{v} \tag{8.3.2}$$

Integration of equation (8.3.2) shows that, at constant sedimentation rate, a spike of element flux at $t = 0$ with no subsequent input will be smeared out upwards with a characteristic length $L$ and decrease exponentially (Figure 8.4) according to the law

$$C^i(t) = C^i(0)\,e^{-vt/L} = C^i(0)\,e^{-(z_0 - z)/L} \tag{8.3.3}$$

where $C^i(0)$ is the concentration at $t = 0$ at the bottom of the bioturbated layer. Alternatively, a reduced flux of element $i$ per unit length can be computed from

$$\frac{J_0^i}{\rho\,v} = C^i - L\frac{dC^i}{dz} \tag{8.3.4}$$

Michel et al. (1990) have measured the iridium content in Ocean Drilling Program hole 690 sediments from the Weddell Sea across the Cretaceous-Tertiary (K-T) boundary (Table 8.1). Depth $z$ is relative to an arbitrary level in the core. Compare the reduced iridium flux at each depth down the core assuming the bioturbated layer is either 4 or 8 cm thick. Making the approximation at the $k$th depth level that

Table 8.1. Iridium content (ppt or 10$^{-12}$ g/g) in sediments from the section 15X of the ODP 690 hole (Michel et al., 1990). Depth z (cm) is measured relative to an arbitrary reference layer.

z     C_Ir     z     C_Ir     z     C_Ir
22    128      36    467      50    298
23    221      37    747      51    389
24    163      38    1101     52    248
25    190      39    1487     53    361
26    197      40    1566     54    218
27    126      41    983      55    223
28    164      42    1237     56    297
29    150      43    602      57    292
30    134      44    378      60    174
31    223      45    578      63    252
32    264      46    511      66    126
33    333      47    619      71    111
34    328      48    362      77    84

Figure 8.4 Smearing of a burst in the input of an element i at the surface of the sediment by bioturbation over a layer of constant thickness L (from Ruddiman and Glover, 1972). With time, combined sedimentation at rate v and bioturbation smears the concentration peak up the sedimentary column.

Figure 8.5 Reduced flux of iridium at the K-T boundary in the Weddell Sea assuming a bioturbated layer of constant thickness L = 4 cm and L = 8 cm. L = 0 cm corresponds to the raw data of Michel et al. (1990).


we get the reduced fluxes plotted in Figure 8.5. For L = 0, the original data are obtained. The negative values obtained for L = 4 and L = 8 cm appear to result from both a sharp peak and the assumption of constant L (this effect is akin to the Gibbs effect mentioned in Chapter 2). The values becoming more negative above the Ir peak when L increases suggest that the bioturbated layer has actually decreased dramatically as a result of the K-T catastrophic event, asteroid impact or volcanic eruption, that killed most of the burrowing organisms. The nearly periodic pattern of the Ir flux (10 cm ≈ 10 000 years) is not explained.
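The reduced flux of equation (8.3.4) can be recomputed from the data of Table 8.1 in a few lines. The finite-difference formula is an assumption made for this sketch: the slope at each level is taken as a central difference between the two neighbouring samples (one-sided at both ends of the core), which may differ from the approximation used for Figure 8.5. Depths and thicknesses are in cm and the result is in ppt.

```python
# Table 8.1: depth z (cm) and Ir content (ppt), Michel et al. (1990)
z = [22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
     36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48,
     50, 51, 52, 53, 54, 55, 56, 57, 60, 63, 66, 71, 77]
c_ir = [128, 221, 163, 190, 197, 126, 164, 150, 134, 223, 264, 333, 328,
        467, 747, 1101, 1487, 1566, 983, 1237, 602, 378, 578, 511, 619, 362,
        298, 389, 248, 361, 218, 223, 297, 292, 174, 252, 126, 111, 84]


def reduced_flux(depth, conc, thickness):
    """J0/(rho v) = C - L dC/dz with a central-difference slope (assumed scheme)."""
    n = len(depth)
    out = []
    for k in range(n):
        lo = max(k - 1, 0)
        hi = min(k + 1, n - 1)
        slope = (conc[hi] - conc[lo]) / (depth[hi] - depth[lo])
        out.append(conc[k] - thickness * slope)
    return out


raw = reduced_flux(z, c_ir, 0.0)   # L = 0 returns the measured profile
flux_4 = reduced_flux(z, c_ir, 4.0)
flux_8 = reduced_flux(z, c_ir, 8.0)
```

With this scheme the level at 37 cm, just above the peak, already yields a negative reduced flux for L = 4 cm, and the negative excursion deepens for L = 8 cm, in line with the discussion above.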

8.3.2 Exposure ages and the assessment of erosion rates Cosmogenic isotopes, such as 10Be or 26A1, are created by the interaction of cosmic rays with the Earth surface (Lai, 1988; Brown et a/., 1991). Their measurement in surficial rocks have been suggested to provide a quantitative estimate of erosion rates. Relative to a depth z-axis with the origin kept at the surface, the assumptions of the model are as follows: (a) since the Earth surface moves upwards with respect to the origin, i.e., toward negative values, the material velocity is a negative scalar that will be considered constant and labeled — v; (b) due to the rays being progressively absorbed by the surface material, the production rate decays exponentially with depth from a value of Po at the surface with a characteristic attenuation distance I/a; (c) the cosmogenic isotope decays with a decay constant L Therefore, the rate of change of the concentration C of cosmogenic isotope is the sum of the advective flux plus the production rate minus the decay rate — = v — + P0Qaz-AC dt

(8.3.5)

dz

Integration of this partial differential equation will be easier by pinning the frame to the rock (remembering that the algebraic speed is — v) DC = POQ-^-^-XC

(8.3.6)

Dt

where £ is now the distance to the interface at t = 0, i.e., { = z + vt

(8.3.7)

An equation without exponential would be easier to handle, so we change the variables as

$$u = C e^{\alpha(\zeta - vt)}$$

with the substantial time derivative equal to

$$\frac{DC}{Dt} = \left(\frac{Du}{Dt} + \alpha v u\right) e^{-\alpha(\zeta - vt)}$$


Transport, advection, and diffusion

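The conservation equation simplifies into

$$\frac{Du}{Dt} = P_0 - (\lambda + \alpha v)u$$

which can be rearranged as

$$\frac{D}{Dt}\left[u - \frac{P_0}{\lambda + \alpha v}\right] = -(\lambda + \alpha v)\left[u - \frac{P_0}{\lambda + \alpha v}\right]$$

and integrated into

$$u - \frac{P_0}{\lambda + \alpha v} = \mathrm{const} \times e^{-(\lambda + \alpha v)t} \qquad (8.3.8)$$

Reverting to concentration

$$C e^{\alpha z} = \frac{P_0}{\lambda + \alpha v} + \mathrm{const}(\zeta) \times e^{-(\lambda + \alpha v)t}$$

we apply a condition of constant concentration $C_0$ at all depths at t = 0

$$C_0 e^{\alpha \zeta} = \frac{P_0}{\lambda + \alpha v} + \mathrm{const}(\zeta) \qquad (8.3.9)$$

where we have made a careful distinction between z (depth at t) and $\zeta$ (depth at t = 0). Inserting the value of the constant into equation (8.3.8) gives

$$C e^{\alpha z} = \frac{P_0}{\lambda + \alpha v} + \left[C_0 e^{\alpha \zeta} - \frac{P_0}{\lambda + \alpha v}\right] e^{-(\lambda + \alpha v)t}$$

with the final result, since $\zeta = z + vt$,

$$C = \frac{P_0}{\lambda + \alpha v}\, e^{-\alpha z}\left[1 - e^{-(\lambda + \alpha v)t}\right] + C_0 e^{-\lambda t} \qquad (8.3.10)$$

At steady-state, surface concentration is

$$C = \frac{P_0}{\lambda + \alpha v}$$

8.3.3 Dispersal of a conservative tracer in a velocity field

This is a problem of considerable geodynamic and petrological impact: on the mantle scale, molecular processes are negligible (Hofmann and Hart, 1978) and elements can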


be considered as conservatively (= passively) dispersed by mantle convection (Richter and Ribe, 1979; Hoffman and McKenzie, 1985). Injection of oceanic crust and sediments at subduction zones creates geochemically anomalous zones in the uppermost mantle. These anomalies are entrained by mantle movements. Likewise, contamination of a magmatic body by roof pendants rapidly spreads under the effect of magma convection. Convective velocity fields are fairly complex as they depend on a number of major assumptions on the temperature dependency of rheology. We can illustrate the salient features of convective dispersal by choosing a simple velocity distribution in a rectangular convection cell (0 < x < a, 0 < y < b) such as that associated with the onset of Bénard instability in the conditions of the Boussinesq approximation (e.g., Turcotte and Schubert, 1982). Let us make the calculation for the so-called 'free-slip' conditions, which permit free movement along the boundaries, both vertical and horizontal, as in a convection cell limited by no rigid boundary. From Turcotte and Schubert (1982), we take the velocity field to be

$$\frac{dx}{dt} = -\psi_0 \frac{\pi}{b} \sin\frac{\pi x}{a} \cos\frac{\pi y}{b}, \qquad \frac{dy}{dt} = \psi_0 \frac{\pi}{a} \cos\frac{\pi x}{a} \sin\frac{\pi y}{b} \qquad (8.3.11)$$

where $\psi_0$ is a constant. No material crosses the limits, for $v_x = 0$ at x = 0 and x = a, and $v_y = 0$ at y = 0 and y = b, and velocities are maximum along the limits. Tracer i is conservative, i.e., diffusion is negligible and no reaction changes its local concentration. Its concentration therefore satisfies the advection equation

$$\frac{DC^i}{Dt} = 0$$

In practice, the tracer follows the moving material with the same velocity, like dye transported by a solution. The position of a point 'dyed' by the tracer can be followed by solving the system of differential equations above. As shown in Section 3.2, a numerical Runge-Kutta scheme is appropriate. In Figure 8.6, we assume $a = b = \psi_0 = 1$ and points falling on a circle at t = 0 are tracked by increments of 0.2 until t = 0.8. The warping and stretching of the circle shows how mechanical mixing proceeds: the area enclosed by the curve is preserved, since material is conserved, but its perimeter is stretched considerably. The rate of stretching, i.e., the rate of length change per unit time, often simply scaled by $1/|\mathrm{grad}\,\mathbf{v}|$, measures the efficiency of the mixing process (Figure 8.7). Mechanical boundary layers, such as continental lithosphere or the D″ layer in the mantle or bottom water in the ocean, are regions where heterogeneities are efficiently created and preserved. The physics of mixing has recently emerged as a new field with much promise for engineering (Ottino, 1989) as well as Earth sciences. The concept of a marble-cake mantle (Allègre and Turcotte, 1986), in which layers of subducted oceanic crust are stretched by mantle convection together with residual peridotite, is currently under intense scrutiny by geochemists.
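The tracking procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program used for Figure 8.6: the velocity field is assumed to derive from the stream function $\psi = \psi_0 \sin(\pi x/a)\sin(\pi y/b)$ with $a = b = \psi_0 = 1$, and the circle center, its radius, the number of points, and the time step are arbitrary choices.

```python
import math

A_CELL, B_CELL, PSI0 = 1.0, 1.0, 1.0   # assumed cell width, height, amplitude

def stream(x, y):
    """Assumed stream function of the free-slip cell (constant along a path)."""
    return PSI0 * math.sin(math.pi * x / A_CELL) * math.sin(math.pi * y / B_CELL)

def velocity(x, y):
    """Velocity (vx, vy) = (-dpsi/dy, dpsi/dx); it has no component across the walls."""
    vx = -PSI0 * (math.pi / B_CELL) * math.sin(math.pi * x / A_CELL) * math.cos(math.pi * y / B_CELL)
    vy = PSI0 * (math.pi / A_CELL) * math.cos(math.pi * x / A_CELL) * math.sin(math.pi * y / B_CELL)
    return vx, vy

def rk4_step(x, y, dt):
    """Advance one tracer point by dt with the classical fourth-order Runge-Kutta scheme."""
    k1x, k1y = velocity(x, y)
    k2x, k2y = velocity(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = velocity(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = velocity(x + dt * k3x, y + dt * k3y)
    return (x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0,
            y + dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0)

def track(points, t_end, dt=2.0e-3):
    """Move every point of the list from t = 0 to t = t_end."""
    for _ in range(int(round(t_end / dt))):
        points = [rk4_step(x, y, dt) for (x, y) in points]
    return points

def perimeter(points):
    """Length of the closed polygon joining the points in order."""
    n = len(points)
    return sum(math.hypot(points[(k + 1) % n][0] - points[k][0],
                          points[(k + 1) % n][1] - points[k][1]) for k in range(n))

# Points on a circle in the upper right corner of the cell at t = 0
circle = [(0.75 + 0.15 * math.cos(2.0 * math.pi * k / 200),
           0.75 + 0.15 * math.sin(2.0 * math.pi * k / 200)) for k in range(200)]
warped = track(circle, t_end=0.8)
print("perimeter at t = 0:", perimeter(circle), " at t = 0.8:", perimeter(warped))
```

Because each point stays on its own streamline, comparing `stream` before and after the integration gives a quick check of the numerical accuracy, while the growth of `perimeter` gives a crude measure of the stretching.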

Figure 8.6 Convective dispersal of a passive tracer located at t = 0 inside the circle in the upper right corner by the velocity field described by equations (8.3.11).

Figure 8.7 Scaling of the stretching rate by the velocity gradient. For simplicity, the medium is assumed to move in a direction parallel to the x-axis. Two points initially connected to each other by the vector [0, dy] are displaced by $v_x(y)\,dt$ and $v_x(y + dy)\,dt$, respectively, and become connected, after a time dt, by the vector [$dv_x\,dt$, dy]. $dv_x/dy$ is therefore a measure of the stretching rate in the direction x.

8.3.4 Percolation and infiltration metasomatism

Migration of fluids in a porous matrix with solid-liquid fractionation results in a process very similar to the chromatographic separation of elements (DeVault, 1943; Korzhinskii, 1970; Hofmann, 1972). This mechanism has recently been revived in the context of mantle metasomatism by Navon and Stolper (1987), Bodinier et al. (1990), and Vasseur et al. (1991), in the context of hydrothermal systems by Lichtner (1985) and, for stable isotopes, by Baumgartner and Rumble (1988). Only a simplified account of this model will be given here. Let $\varphi$ be the porosity of the medium, $\rho_{sol}$ and $\rho_{liq}$ the density of the solid matrix and melt, respectively, and $\mathbf{v}_{liq}$ the fluid velocity relative


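to the matrix. The reference frame is attached to the solid matrix. We first write that the rate of variation of the amount of material contained in the volume $\Omega$ enclosed by the surface $\Sigma$ equals the total flux across the surface

$$\frac{d}{dt}\iiint_\Omega \left[\varphi\rho_{liq} + (1-\varphi)\rho_{sol}\right] dV = -\iint_\Sigma \varphi\rho_{liq}\mathbf{v}_{liq}\cdot d\mathbf{S} \qquad (8.3.12)$$

Exchanging the order of derivation and integration on the left-hand side integral and using the divergence theorem on the right-hand side, we get

$$\iiint_\Omega \frac{\partial}{\partial t}\left[\varphi\rho_{liq} + (1-\varphi)\rho_{sol}\right] dV = -\iiint_\Omega \mathrm{div}\left(\varphi\rho_{liq}\mathbf{v}_{liq}\right) dV$$

What is true for an arbitrary volume $\Omega$ is true for the volume element dV, hence we can drop the integration signs and dV on both sides of the equation

$$\frac{\partial}{\partial t}\left[\varphi\rho_{liq} + (1-\varphi)\rho_{sol}\right] = -\mathrm{div}\left(\varphi\rho_{liq}\mathbf{v}_{liq}\right) \qquad (8.3.13)$$

which expresses continuity of the fluid and its solid matrix. Let us now calculate the mass balance for an element i which is transported by the fluid and is allowed to react with the solid. $C_{sol}^i$ and $C_{liq}^i$ are the concentrations of element i in the matrix and fluid, respectively. The variation of the total amount of element i contained in the volume $\Omega$ enclosed by the surface $\Sigma$ per unit time is equal to the sum of fluxes which cross the surface

$$\frac{d}{dt}\iiint_\Omega \left[\varphi\rho_{liq}C_{liq}^i + (1-\varphi)\rho_{sol}C_{sol}^i\right] dV = -\iint_\Sigma \varphi\rho_{liq}C_{liq}^i\mathbf{v}_{liq}\cdot d\mathbf{S}$$

Dispersive forces due to the tortuous path of the fluid, matrix deformation, and diffusion are neglected. Taking the same steps as for the continuity equation, we get

$$\frac{\partial}{\partial t}\left[\varphi\rho_{liq}C_{liq}^i + (1-\varphi)\rho_{sol}C_{sol}^i\right] = -\mathrm{div}\left(\varphi\rho_{liq}C_{liq}^i\mathbf{v}_{liq}\right)$$

Taking the derivatives of the products, rearranging the left-hand side and expanding the divergence on the right-hand side gives

$$\frac{\partial}{\partial t}\left\{\left[\varphi\rho_{liq} + (1-\varphi)\rho_{sol}\right]C_{liq}^i\right\} + \frac{\partial}{\partial t}\left[(1-\varphi)\rho_{sol}\left(C_{sol}^i - C_{liq}^i\right)\right] = -\varphi\rho_{liq}\mathbf{v}_{liq}\cdot\mathrm{grad}\,C_{liq}^i - C_{liq}^i\,\mathrm{div}\left(\varphi\rho_{liq}\mathbf{v}_{liq}\right)$$

Expanding the left-hand side and rearranging, we obtain

$$C_{liq}^i\frac{\partial}{\partial t}\left[\varphi\rho_{liq} + (1-\varphi)\rho_{sol}\right] + \left[\varphi\rho_{liq} + (1-\varphi)\rho_{sol}\right]\frac{\partial C_{liq}^i}{\partial t} + (1-\varphi)\rho_{sol}\frac{\partial\left(C_{sol}^i - C_{liq}^i\right)}{\partial t} = \left(C_{liq}^i - C_{sol}^i\right)\frac{\partial\left[(1-\varphi)\rho_{sol}\right]}{\partial t} - \varphi\rho_{liq}\mathbf{v}_{liq}\cdot\mathrm{grad}\,C_{liq}^i - C_{liq}^i\,\mathrm{div}\left(\varphi\rho_{liq}\mathbf{v}_{liq}\right)$$

From the fluid continuity equation (8.3.13), the first term on the left-hand side cancels out with the last term on the right-hand side, giving the general conservation equation

$$\left[\varphi\rho_{liq} + (1-\varphi)\rho_{sol}\right]\frac{\partial C_{liq}^i}{\partial t} + (1-\varphi)\rho_{sol}\frac{\partial\left(C_{sol}^i - C_{liq}^i\right)}{\partial t} = \left(C_{liq}^i - C_{sol}^i\right)\frac{\partial\left[(1-\varphi)\rho_{sol}\right]}{\partial t} - \varphi\rho_{liq}\mathbf{v}_{liq}\cdot\mathrm{grad}\,C_{liq}^i \qquad (8.3.14)$$

which will prove useful in modeling trace-element partitioning during magma genesis. Further simplification is achieved for constant $\rho_{sol}$ and $\varphi$

$$\varphi\rho_{liq}\frac{\partial C_{liq}^i}{\partial t} + (1-\varphi)\rho_{sol}\frac{\partial C_{sol}^i}{\partial t} = -\varphi\rho_{liq}\mathbf{v}_{liq}\cdot\mathrm{grad}\,C_{liq}^i \qquad (8.3.15)$$

We assume that $C_{sol}^i$ and $C_{liq}^i$ are related through a thermodynamic relationship, as may be the case for trace elements. In one dimension (x), equation (8.3.15) can be recast as a function of $C_{sol}^i$ only

$$\left[\varphi\rho_{liq}\frac{dC_{liq}^i}{dC_{sol}^i} + (1-\varphi)\rho_{sol}\right]\frac{\partial C_{sol}^i}{\partial t} = -\varphi\rho_{liq}v_{liq}\frac{dC_{liq}^i}{dC_{sol}^i}\frac{\partial C_{sol}^i}{\partial x}$$

which we can rewrite as

$$\frac{\partial C_{sol}^i}{\partial t} + v^i\frac{\partial C_{sol}^i}{\partial x} = 0$$

$v^i$ is defined as

$$v^i = \frac{\varphi\rho_{liq}v_{liq}}{\varphi\rho_{liq} + (1-\varphi)\rho_{sol}\dfrac{dC_{sol}^i}{dC_{liq}^i}} \qquad (8.3.17)$$

and represents the velocity of the isopleths $C_{sol}^i$ and $C_{liq}^i$ since, for a point moving with the velocity $v^i$,

$$\frac{DC_{sol}^i}{Dt} = 0 \qquad (8.3.18)$$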

The evolution of a concentration profile depends on how the velocity changes with concentration. It is left to the reader to show by taking the derivative of equation (8.3.17) at constant $\varphi$ and $v_{liq}$ that the sign of $dv^i/dC_{liq}^i$ is opposite to the sign of $d^2C_{sol}^i/(dC_{liq}^i)^2$. For a trace element i, the last expression is the derivative of its


solid-liquid partition coefficient relative to the concentration in the liquid and therefore measures the deviation from regular solutions (Henry's law). Three cases may be accordingly singled out:

(a) For $dv^i/dC_{liq}^i = 0$ or $d^2C_{sol}^i/(dC_{liq}^i)^2 = 0$ (Henry's law), $v^i$ is constant. This case is investigated in Chapter 9: all the isopleths of species i move at the same velocity and the concentration profile in the solid is simply translated without modification.

(b) For $dv^i/dC_{liq}^i < 0$ or $d^2C_{sol}^i/(dC_{liq}^i)^2 > 0$, isopleths of low concentrations catch up with isopleths of high concentrations: downstream positive gradients build up as sharp metasomatic fronts (Hofmann, 1972), negative gradients flatten.

(c) For $dv^i/dC_{liq}^i > 0$ or $d^2C_{sol}^i/(dC_{liq}^i)^2 < 0$, isopleths of high concentrations catch up with isopleths of low concentrations: downstream positive gradients flatten, negative gradients build up as sharp fronts.
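The three cases can be checked numerically from equation (8.3.17). The sketch below is only an illustration: the porosity, densities, and fluid velocity are arbitrary values, and the three solid-liquid relationships (a linear Henry's law, a convex law, and a concave Langmuir-type law) are hypothetical choices made to produce each sign of the second derivative.

```python
def isopleth_velocity(slope, phi=0.01, rho_liq=2800.0, rho_sol=3300.0, v_liq=1.0):
    """Velocity of an isopleth, equation (8.3.17); slope is dC_sol/dC_liq."""
    return phi * rho_liq * v_liq / (phi * rho_liq + (1.0 - phi) * rho_sol * slope)

K = 0.1  # hypothetical partition coefficient at infinite dilution

def slope_henry(c_liq):
    """C_sol = K c: zero second derivative, case (a)."""
    return K

def slope_convex(c_liq):
    """C_sol = K (c + c^2): positive second derivative, case (b)."""
    return K * (1.0 + 2.0 * c_liq)

def slope_concave(c_liq):
    """C_sol = K c / (1 + c): negative second derivative, case (c)."""
    return K / (1.0 + c_liq) ** 2

concentrations = [0.0, 0.5, 1.0, 2.0]
v_henry = [isopleth_velocity(slope_henry(c)) for c in concentrations]
v_convex = [isopleth_velocity(slope_convex(c)) for c in concentrations]
v_concave = [isopleth_velocity(slope_concave(c)) for c in concentrations]

for name, values in (("Henry", v_henry), ("convex", v_convex), ("concave", v_concave)):
    print(name, ["%.4f" % v for v in values])
```

With these assumed laws, the velocity is independent of concentration in the first case, decreases with concentration in the second, and increases with concentration in the third, in agreement with cases (a), (b), and (c).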

At some point, gradients may become infinite and form a shock wave. This 'breaking' time represented as stage 3 in the cases (b) and (c) of Figure 8.8 depends on the initial distribution and solution properties and is known mathematically as a 'bifurcation' (e.g., Logan, 1987; Strang, 1986). Further evolution results in a

Figure 8.8 Advective propagation of a chemical wave of tracer i moving with a velocity $v^i$ in a wetted porous solid at times t = 1, 2, 3 for different values of $d^2C_{sol}^i/(dC_{liq}^i)^2$; panel (a) is Henry's law. Breaking takes place at t = 3 in cases b and c.

Figure 8.9 A multi-valued relation between x and $C_{sol}^i$ is unstable and produces a concentration discontinuity (front). The hatched areas on both sides of the vertical line must be equal, which defines the position of the front.

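multi-valued function $C_{sol}^i$ over a certain range of x, i.e., for a single position x three distinct values of $C_{sol}^i$ satisfy the conservation equation, which is schematically depicted in Figure 8.9. Such a situation is unstable with regard to fluctuations and a concentration front forms with its position controlled by mass balance constraints (the two hatched areas must be equal). This theory was developed for metasomatic fronts by Korzhinskii (1970) and Hofmann (1972) and the topic is comprehensively covered in the book by Ortoleva (1993). We will now estimate the front velocity using the method of Guy (1984). Let us consider, in one dimension, the conservation of element i in a rock column ($x = x_1$ to $x = x_2$) of unit section which contains a propagating front at the time-dependent position s(t). The discontinuity is handled by breaking the column at x = s(t). The amount of element i in the rock column only changes by fluid exchange through both ends of the column, hence

$$\frac{d}{dt}\int_{x_1}^{s^-}\left[\varphi\rho_{liq}C_{liq}^i + (1-\varphi)\rho_{sol}C_{sol}^i\right]dx + \frac{d}{dt}\int_{s^+}^{x_2}\left[\varphi\rho_{liq}C_{liq}^i + (1-\varphi)\rho_{sol}C_{sol}^i\right]dx = -\left[\varphi\rho_{liq}v_{liq}C_{liq}^i\right]_{x_1}^{x_2}$$

where the − and the + superscripts on s indicate which side of the discontinuity is used as the integration limit. Using Leibniz's rule for differentiating integrals gives

$$\int_{x_1}^{s^-}\frac{\partial}{\partial t}\left[\varphi\rho_{liq}C_{liq}^i + (1-\varphi)\rho_{sol}C_{sol}^i\right]dx + \left[\varphi\rho_{liq}C_{liq}^i + (1-\varphi)\rho_{sol}C_{sol}^i\right]_{s^-}\frac{ds}{dt} + \int_{s^+}^{x_2}\frac{\partial}{\partial t}\left[\varphi\rho_{liq}C_{liq}^i + (1-\varphi)\rho_{sol}C_{sol}^i\right]dx - \left[\varphi\rho_{liq}C_{liq}^i + (1-\varphi)\rho_{sol}C_{sol}^i\right]_{s^+}\frac{ds}{dt} = -\left[\varphi\rho_{liq}v_{liq}C_{liq}^i\right]_{x_1}^{x_2}$$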


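Let $x_1 \to s^-$ and $x_2 \to s^+$. The two integrals tend to zero while the other terms remain finite because of the concentration jump. The velocity ds/dt of the discontinuity is therefore

$$\frac{ds}{dt} = \frac{\varphi\rho_{liq}v_{liq}}{\varphi\rho_{liq} + (1-\varphi)\rho_{sol}\dfrac{C_{sol}^i(s^+) - C_{sol}^i(s^-)}{C_{liq}^i(s^+) - C_{liq}^i(s^-)}} \qquad (8.3.19)$$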

This solution is a generalization of equation (8.3.17), which holds in the absence of a front. Many geological problems dealing with the genesis of ore bodies and oil fields, with the chemistry of aquifers, etc., cannot be handled adequately with a matrix of constant mineralogical composition. Upon water percolation, some mineral reactions take place which affect the chemical and physical properties of the rock. These reactions lead to extremely complicated transport problems which, in the frame of a non-specialized textbook, can only be hinted at. Contrary to chemical elements, molecules or minerals are not necessarily conserved in a closed system because chemical reactions may affect their relative proportions. Every geologist is familiar with weathering of granites increasing the amount of kaolinite at the expense of K-feldspar. For each reaction between N species (ions, molecules, minerals, ...), a simple symbolic form can be written

$$0 \rightleftharpoons \sum_{i=1}^{N}\nu_k^i Z^i$$

where $Z^i$ is the chemical symbol of the ith species and $\nu_k^i$ its stoichiometric coefficient in the kth reaction. When the reaction proceeds, the numbers $n^i$ of molecules i present in the system change by the increment $dn_k^i$ and mass balance requires

$$\frac{dn_k^1}{\nu_k^1} = \frac{dn_k^2}{\nu_k^2} = \cdots = \frac{dn_k^N}{\nu_k^N} = d\xi_k$$

where $\xi_k$ is the progress variable of the kth reaction. Assuming that species i is involved in R reactions, its conservation equation therefore must be written

(8.3.20)

where $d\xi_k/dt$ is the rate of the kth reaction. Readers are referred to Bear (1972), Lichtner (1985), and Phillips (1991) for a detailed treatment of this complicated problem with many applications to economic geology.

8.4 Diffusion basics

8.4.1 The diffusion equation

From now on, we assume that the diffusion (= molecular) transport is not negligible, so we need some expression relating the diffusion flux to measurable quantities, e.g.,


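concentrations. Fick's first 'law' assumes a proportionality between the flux $\mathbf{J}^i$ of the species i and a 'force' which is the gradient of the volumic concentration $\rho C^i$

$$\mathbf{J}^i = -\mathcal{D}^i\,\mathrm{grad}\,\rho C^i \qquad (8.4.1)$$

where $\mathcal{D}^i$ is the diffusion coefficient of the species i in the medium under consideration. $\mathcal{D}^i$ is expressed in surface unit per time unit (e.g., m$^2$ s$^{-1}$). If the medium is incompressible, the conservation equation (8.2.5) can be transformed into

$$\frac{\partial C^i}{\partial t} = \mathrm{div}\left(\mathcal{D}^i\,\mathrm{grad}\,C^i\right) - \mathbf{v}\cdot\mathrm{grad}\,C^i + \sum\frac{A^i}{\rho} \qquad (8.4.2)$$

The first term on the right-hand side can be expanded as

$$\mathrm{div}\left(\mathcal{D}^i\,\mathrm{grad}\,C^i\right) = \mathcal{D}^i\nabla^2 C^i + \mathrm{grad}\,\mathcal{D}^i\cdot\mathrm{grad}\,C^i$$

where the Laplace operator $\nabla^2 = \mathrm{div}(\mathrm{grad}) = \nabla\cdot\nabla$, also noted $\Delta$, and such as

$$\nabla^2 C = \frac{\partial^2 C}{\partial x^2} + \frac{\partial^2 C}{\partial y^2} + \frac{\partial^2 C}{\partial z^2}$$

has been introduced. For constant $\mathcal{D}^i$, we obtain the standard diffusion equation

$$\frac{\partial C^i}{\partial t} = \mathcal{D}^i\nabla^2 C^i - \mathbf{v}\cdot\nabla C^i + \sum\frac{A^i}{\rho} \qquad (8.4.3)$$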

The right-hand side is the sum of three terms describing diffusion, advection, and chemical reaction, respectively. For the one-dimensional equation with x as the space variable, the diffusion equation is a partial differential equation of the first order in time and the second order in x. It therefore requires concentration to be known everywhere at a given time (in general t = 0) and, at any time t > 0, concentration, flux, or a combination of both, to be known at two points (boundary conditions). In the most general case, the diffusion equation is a partial differential equation of the first order in time and the second order in the three space coordinates x, y, z. Concentration or flux conditions valid at any time t > 0 must then be given along the entire boundary. Taking the one-dimensional problem with x as the space variable as an example, boundary conditions can be of three forms:

(i) a known concentration, e.g., $C^i = C_0$ at x = L

(ii) a known flux or, equivalently, a known concentration gradient $\partial C^i/\partial x = q$ at x = L

(iii) a mixed relation between flux (gradient) and concentration $\partial C^i/\partial x + \alpha C^i + \beta = 0$ at x = L ($\alpha$ and $\beta$ are constants). This last condition is also known as the radiation condition.
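How such conditions enter a numerical solution can be illustrated with a small explicit finite-difference sketch. It assumes a medium at rest, no reaction, and a constant diffusion coefficient, with a known concentration at x = 0 [condition (i)] and a known gradient q at x = L [condition (ii), imposed through a fictitious node beyond the boundary]. The function name, the grid size, the duration, and the numerical values are arbitrary choices made for this illustration.

```python
def diffuse(c_left, q_right, D=1.0, length=1.0, n=21, t_end=6.0):
    """Explicit scheme for dC/dt = D d2C/dx2 on [0, length].

    Boundary conditions: C = c_left at x = 0 and dC/dx = q_right at x = length.
    The initial concentration is zero everywhere except at x = 0.
    """
    dx = length / (n - 1)
    dt = 0.4 * dx * dx / D            # the explicit scheme requires D dt/dx^2 <= 1/2
    r = D * dt / (dx * dx)
    C = [0.0] * n
    C[0] = c_left
    for _ in range(int(t_end / dt)):
        new = C[:]                    # new[0] keeps the imposed concentration
        for k in range(1, n - 1):
            new[k] = C[k] + r * (C[k + 1] - 2.0 * C[k] + C[k - 1])
        # Fictitious node chosen so that the centered gradient at x = length equals q_right
        ghost = C[n - 2] + 2.0 * dx * q_right
        new[n - 1] = C[n - 1] + r * (ghost - 2.0 * C[n - 1] + C[n - 2])
        C = new
    return C

profile = diffuse(c_left=1.0, q_right=0.5)
sealed = diffuse(c_left=1.0, q_right=0.0)
print("C at x = L with gradient 0.5:", profile[-1])
print("C at x = L with zero flux:   ", sealed[-1])
```

After a long time the profile becomes linear with slope q, so that the concentration at x = L tends to $C_0 + qL$; with a zero flux the whole medium tends to the concentration imposed at x = 0. A radiation condition (iii) could be handled in the same way by expressing the fictitious node from the mixed relation.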

8.4.2 The diffusion coefficient

The diffusion coefficient depends on the diffusion process being considered. In self-diffusion, isotopes of the same species are exchanged independently of any other species. This is the case for radioactive tracer diffusion in pure solids. This process is of considerable importance to the understanding of isotopic systems in geochemistry and is amenable to calculations by statistical mechanics. Chemical diffusion involves the interchange of different chemical species, usually in a heterogeneous system, and is a much more complicated phenomenon, poorly described even by sophisticated models. In general, self-diffusion is about an order of magnitude faster than chemical diffusion (Hofmann and Hart, 1978). Diffusion being a thermally activated process, the diffusion coefficient depends on the absolute temperature T according to an Arrhenius law

$$\mathcal{D}^i = \mathcal{D}_0^i\exp\left(-\frac{E_i}{RT}\right) \qquad (8.4.4)$$

a dependence usually represented in a plot of $\ln\mathcal{D}^i$ vs 1/T. In this equation, $\mathcal{D}_0^i$ is the pre-exponential factor, R is the gas constant (1.987 cal K$^{-1}$ mol$^{-1}$ = 8.3144 J K$^{-1}$ mol$^{-1}$) and $E_i$ is the activation energy. Fick's law describes diffusion in ideal or regular solutions, particularly the behavior of trace elements, fairly well. In more complicated systems, it becomes inadequate to use the concentration gradient as the diffusion driving force. The gradient of the chemical potential $\mu_i$ (= partial molar Gibbs energy) of the species i represents the actual energy gradient that drives atoms in one direction or another. As a simple illustration, let us consider a binary solution in which we neglect cross-effects of elements and write that the relative velocity $\mathbf{v}^i$ of one atom of the species i is given by

$$\mathbf{v}^i(x, y, z) = -M^i\,\mathrm{grad}\,\mu_i$$

where $M^i$ is called the mobility of the species i. Darken (1948) and Darken and Gurry (1953) relate the Fick diffusion coefficient $\mathcal{D}^i$ to the mobility $M^i$ by equating the diffusion flux $\mathbf{J}^i$ to the flux of atoms moving with the relative velocity $\mathbf{v}^i$

$$\mathbf{J}^i = -\mathcal{D}^i\,\mathrm{grad}\,\rho C^i = \rho C^i\mathbf{v}^i = -\rho C^i M^i\,\mathrm{grad}\,\mu_i$$

In one space dimension, neglecting temperature gradients relative to concentration gradients, this can be rewritten

$$\mathcal{D}^i\frac{\partial\rho C^i}{\partial x} = \rho C^i M^i\frac{\partial\mu_i}{\partial x}$$

Atom fractions $X_i$ relate to volume concentrations through

$$X_i = \frac{\rho C^i}{\sum_j\rho C^j}$$

Neglecting variations in the molar volume makes the denominator constant and therefore the relative variations of concentrations and mole fractions are equal

$$d\ln\rho C^i = d\ln X_i$$

which results in

$$\mathcal{D}^i = M^i\frac{\partial\mu_i}{\partial\ln X_i}$$

In a binary mixture, $\mu_i$ can be expressed as

$$\mu_i = G + (1 - X_i)\frac{\partial G}{\partial X_i}$$

where G is the free enthalpy of the system (e.g., Swalin, 1962). The coefficient of diffusion is therefore

$$\mathcal{D}^i = M^i X_i(1 - X_i)\frac{\partial^2 G}{\partial X_i^2} \qquad (8.4.5)$$

For solutions which present a mixing gap, the locus of points where the second derivative of G vanishes is called the spinodal. Within the spinodal, this second derivative is negative, which results in negative diffusion coefficients or uphill diffusion (Figure 8.10). Even this more elaborate description of ion movements in response to gradients of chemical potential may turn out to be insufficient, in particular when uphill diffusion is active:

(a) Local charge balance must be obeyed, which introduces stringent constraints difficult to meet (e.g., Kirkaldy and Young, 1987).

(b) Cross-effects of multiple chemical gradients on the chemical potential are not easy to quantify: simple experimental evidence of the cross-diffusion effects is the buildup of concentration gradients for an element 2 having an initially uniform distribution as a result of a concentration gradient in element 1 (Figure 8.11) (Kirkaldy and Young, 1987).

(c) For an identical composition, the chemical potential of a species in a homogeneous system is different from its potential in a heterogeneous system. It has been suggested (Hilliard, 1970) that its appropriate form is

$$\mu_i - 2K\frac{\partial^2 C^i}{\partial x^2}$$

where K is the 'gradient-energy coefficient', which results in a more complicated form of the diffusion equation.

Figure 8.10 According to Darken's theory, the sign of the diffusion coefficient changes where the second derivative of the Gibbs function relative to the molar fraction $X_i$ vanishes (spinodal).

Figure 8.11 Evidence of cross-diffusional effects (concentration vs distance; initial distributions and distributions at time t). The homogeneous distribution of species 2 (dashed line, top) is perturbed by a coexisting gradient of species 1 (bottom).

8.4.3 The Matano interface

A common method to measure diffusion coefficients consists in welding two samples with different concentrations of the element whose diffusion coefficient is sought. Upon heating, asymmetrical diffusion profiles are quite commonly obtained, which shows that different species diffuse at different rates. The concentration-dependent diffusion coefficient can be obtained by a method devised by Boltzmann and applied by Matano. The one-dimensional diffusion equation (8.4.2) of a medium at rest and with no reaction reads

$$\frac{\partial C}{\partial t} = \frac{\partial}{\partial x}\left(\mathcal{D}\frac{\partial C}{\partial x}\right)$$

where the superscript i is temporarily left out. The rather general similarity method (Logan, 1987; Zwillinger, 1989) uses particular variable transformations in order to reduce partial differential equations to ordinary differential equations. We already used implicitly a linear similarity transformation relating t and x by introducing the substantial derivatives [equations (8.2.6) and (8.3.18)]. For the diffusion equation, the transformation is no longer linear and we use the Boltzmann variable

$$u = \frac{x}{\sqrt{t}} \qquad (8.4.6)$$

By the chain rule, the derivatives become

$$\frac{\partial}{\partial t} = \frac{\partial u}{\partial t}\frac{d}{du} = -\frac{x}{2t\sqrt{t}}\frac{d}{du} \qquad\text{and}\qquad \frac{\partial}{\partial x} = \frac{\partial u}{\partial x}\frac{d}{du} = \frac{1}{\sqrt{t}}\frac{d}{du}$$

By changing the variables, the partial differential equation of diffusion has been turned into an ordinary differential equation

$$-\frac{u}{2}\frac{dC}{du} = \frac{d}{du}\left(\mathcal{D}\frac{dC}{du}\right)$$

where simple derivatives are used since the only variable left is u. Alternatively

$$-\frac{u}{2}\,dC = d\left(\mathcal{D}\frac{dC}{du}\right) \qquad (8.4.7)$$

which can be integrated from one end of the diffusion couple where concentration is $C_0$ with zero gradient (no flux at the end), to the current concentration C as

$$-\frac{1}{2}\int_{C_0}^{C}u\,dc = \int_{C_0}^{C}d\left(\mathcal{D}\frac{dc}{du}\right) = \left(\mathcal{D}\frac{dc}{du}\right)_C - \left(\mathcal{D}\frac{dc}{du}\right)_{C_0} = \left(\mathcal{D}\frac{dc}{du}\right)_C$$

where the lower-case c is a dummy integration variable. For a given value of t, u can be replaced by x and the equation rearranged into

$$\frac{1}{2t}\int_{C_0}^{C}x\,dc + \left(\mathcal{D}\frac{dC}{dx}\right)_C = 0$$


Likewise, at the other end where concentration is $C_1$ and gradient vanishes, mass balance between C and $C_1$ leads to

$$\frac{1}{2t}\int_{C}^{C_1}x\,dc - \left(\mathcal{D}\frac{dC}{dx}\right)_C = 0$$

Adding up these two conservation equations leads to

$$\int_{C_0}^{C_1}x\,dc = 0$$

We now have to make the definition of x more explicit and decide where to locate the origin (the Matano interface) for the conservation condition above to hold. In the laboratory, one would use an arbitrary coordinate X, such as the distance to one end of the experimental device, then $x = X - X_m(t)$ where $X_m(t)$ is the position of the Matano interface. Integrating the material conservation equation above, we get

$$\int_{C_0}^{C_1}\left[X - X_m(t)\right]dc = \int_{C_0}^{C_1}X\,dc - \int_{C_0}^{C_1}X_m(t)\,dc = 0$$

and $X_m(t)$ is found to be the mean distance

$$X_m(t) = \frac{1}{C_1 - C_0}\int_{C_0}^{C_1}X\,dc \qquad (8.4.8)$$

i.e., the distance at which the surface hatched in Figure 8.12 is split in half. Finally, the diffusion coefficient is computed from

$$\mathcal{D}(C) = -\frac{\displaystyle\int_{C_0}^{C}x\,dc}{2t\,(dC/dx)_C} = \frac{\displaystyle\int_{C}^{C_1}x\,dc}{2t\,(dC/dx)_C} \qquad (8.4.9)$$
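The Boltzmann-Matano recipe of equations (8.4.8) and (8.4.9) lends itself to a short numerical sketch. The profile below is synthetic: it is an assumed error-function profile computed for a constant diffusion coefficient equal to 1, with the two samples joined at X = 6, so that the method should simply return that constant and that position. The function name, the trapezoidal integration, and the centered difference used for the gradient are choices made for this illustration.

```python
import math

def boltzmann_matano(X, C, t):
    """Return the Matano interface X_m (8.4.8) and D at the interior nodes (8.4.9)."""
    n = len(X)
    # X_m is the mean of X weighted by the concentration increments (trapezoidal rule)
    weighted = sum(0.5 * (X[k] + X[k + 1]) * (C[k + 1] - C[k]) for k in range(n - 1))
    x_m = weighted / (C[-1] - C[0])
    x = [value - x_m for value in X]
    D = [None] * n                      # end nodes are left undefined
    integral = 0.0                      # running value of the integral of x dc from C0 to C
    for k in range(1, n - 1):
        integral += 0.5 * (x[k - 1] + x[k]) * (C[k] - C[k - 1])
        slope = (C[k + 1] - C[k - 1]) / (x[k + 1] - x[k - 1])   # dC/dx
        if slope != 0.0:
            D[k] = -integral / (2.0 * t * slope)
    return x_m, D

# Synthetic diffusion couple: assumed constant D, duration, and position of the weld
D_TRUE, T_RUN, X_JOIN = 1.0, 1.0, 6.0
X = [0.01 * k for k in range(1201)]
C = [0.5 * math.erfc((value - X_JOIN) / (2.0 * math.sqrt(D_TRUE * T_RUN))) for value in X]

x_m, D = boltzmann_matano(X, C, T_RUN)
print("Matano interface:", x_m)
print("D at the interface:", D[600])
```

As discussed for the cerium example, the estimates degrade toward both ends of the profile, where the gradient and the integral both become very small.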

Iqdari and Velde (unpub. data, 1992, see Table 8.2) described experiments of Ce diffusion in apatite soaked in CeCl2 with asymmetric diffusion profiles. For one of their runs carried out at 1100°C for 15 days, and described as an example of a non-linear least-square fit in Section 5.2, it has been found that the relationship between the Ce concentration $C_{Ce}$ and the distance X to the mineral surface is described by

CCe + 0.05

1.93-CCe

We would like to know how the Ce diffusion coefficient varies along the profile.


Table 8.2. Cerium concentrations (%) in apatite soaked in CeCl2 (Iqdari and Velde, unpub. data, 1992). X (μm) is the distance to the interface. Expression for I(X) is given in the text. Diffusion coefficient D_Ce (cm2 s-1) calculated by the Matano method.

X (μm)   C_Ce (%)   I(X)    X - X_m   dX/dC_Ce   D_Ce (cm2 s-1)
0        1.864      21.2    -15.3     198.1      0
5        1.832      21.3    -10.3     87.9       -1.98 × 10^-13
10       1.638      20.6    -5.3      6.4        -7.02 × 10^-14
15       1.227      16.9    -0.3      -2.8       6.00 × 10^-14
20       0.600      8.2     4.7       -9.2       2.23 × 10^-13
25       0.250      1.1     9.7       -34.5      6.03 × 10^-13
30       0.152      -1.6    14.7      -73.1      9.55 × 10^-13
35       0.085      -3.8    19.7      -160.5     1.36 × 10^-12
40       0.078      -4.1    24.7      -178.3     1.40 × 10^-12
45       0.046      -5.4    29.7      -314.9     1.43 × 10^-12
50       0.043      -5.6    34.7      -335.4     1.40 × 10^-12
55       0.012      -7.2    39.7      -751.3     -3.51 × 10^-13
60       0.013      -7.2    44.7      -727.7     -2.07 × 10^-13
65       0.012      -7.2    49.7      -751.3     -3.51 × 10^-13
70       0.013      -7.2    54.7      -727.7     -2.07 × 10^-13

Figure 8.12 The Boltzmann-Matano technique. Initially, concentration is $C_0$ to the left of the initial interface located at $x_0$, $C_1$ to the right. The hatched areas between $C_0$ and $C_1$ must be equal, which defines the position $x_m$ of the Matano interface. The framed area represents the numerator of equation (8.4.9). The diffusion coefficient is computed from the same equation.


The expression given for X as a function of C leaves us in trouble at both ends of the diffusion profile. X appropriately tends towards $+\infty$ and $-\infty$ when $C_{Ce}$ tends asymptotically towards −0.05 and 1.93, respectively, which are nearly the extreme concentrations in the profile. This is what we expect from an infinite system. However, the integrals of the rational fractions are simply natural logarithms which cannot be evaluated for a zero argument and therefore do not converge when evaluated between $C_0$ and $C_1$. We will therefore restrict the calculation to the interval between extreme concentrations, say $C_0 = 1.865$ and $C_1 = 0.012$. The flux at both ends will not be strictly zero, since for these values,

$$\frac{dC}{dX} = 1\Big/\frac{dX}{dC} \neq 0$$

but we can safely assume that the amount of Ce that diffused out of this concentration interval is very small. The apparent shortcoming of this assumption is that estimates of diffusion coefficients near the ends will be unreliable. Assuming that the dimension of the system is infinite in both directions, we first determine the position of the Matano interface from

$$X_m = \frac{1}{C_1 - C_0}\int_{C_0}^{C_1}X\,dc = \frac{1}{0.012 - 1.865}\int_{1.865}^{0.012}X\,dc$$

Inserting the expression for X, we get

$$X_m = \frac{\left[I(u)\right]_{1.865}^{0.012}}{0.012 - 1.865}$$

where

$$I(u) = 2.878\ln(u + 0.054) + 0.879\ln(1.93 - u) + 17u - 2.86u^2$$

and, upon evaluation

$$X_m = \frac{-7.2 - 21.2}{0.012 - 1.865} = 15.3\ \mu\mathrm{m}$$

The derivative of X relative to $C_{Ce}$ is given by

$$\frac{dX}{dC_{Ce}} = -\frac{2.878}{(C_{Ce} + 0.05)^2} + \frac{0.879}{(1.93 - C_{Ce})^2} - 2.86$$

In units of cm$^2$ s$^{-1}$, the Ce diffusion coefficient can be calculated as

$$\mathcal{D}_{Ce} = -\frac{10^{-8}}{2(15\times 24\times 3600)}\,\frac{dX}{dC_{Ce}}\left\{\left[I(C_{Ce}) - I(1.865)\right] - 15.3\times(C_{Ce} - 1.865)\right\}$$


As predicted, the points next to the ends of the profile give unreliable (negative) estimates because the fluxes at the end points themselves become a substantial fraction of the fluxes in the neighborhood. In addition, noise in sections with rather flat gradients becomes a problem. More elaborate non-linear regression techniques should be used to handle this specific problem. Nevertheless, the profile between 15 and 50 μm, where steep Ce variations are observed, shows evidence for substantial changes of $\mathcal{D}_{Ce}$, which seems to hint at much faster diffusion when this element is only in trace amounts in the apatite lattice.

8.5 Solutions of the diffusion equation: parallel flux

Different techniques are commonly used to solve the diffusion equation (Carslaw and Jaeger, 1959). Analytic solutions can be found by variable separation, Fourier transforms or, more conveniently, Laplace transforms and other special techniques such as point sources or Green functions. Numerical solutions are calculated for the cases which have no simple analytic solution by finite differences (Mitchell, 1969; Fletcher, 1991), which is the simplest technique to implement, but also finite elements, particularly useful for complicated geometry (Zienkiewicz, 1977), and collocation methods (Finlayson, 1972). A series of cases relevant to geochemical problems in parallel and radial fluxes will now be described as examples of a few methods.

8.5.1 Parallel flux: the instantaneous point source in the infinite medium

Quite illustrative is the case of a finite amount of a diffusing substance deposited initially at a given position. The Gauss function

$$C(x, t) = \frac{M}{2\sqrt{\pi\mathcal{D}t}}\exp\left[-\frac{(x - x')^2}{4\mathcal{D}t}\right] \qquad (8.5.1)$$

where M is a constant to be determined, may be shown by simple substitution of its derivatives to satisfy the one-dimensional diffusion equation with constant diffusion coefficient $\mathcal{D}$

$$\frac{\partial C}{\partial t} = \mathcal{D}\frac{\partial^2 C}{\partial x^2} \qquad (8.5.2)$$

At t = 0, C is zero everywhere except at x' where it is infinite. The amount of diffusing substance can be found by integrating C from $-\infty$ to $+\infty$

$$\int_{-\infty}^{+\infty}C\,dx = \frac{M}{2\sqrt{\pi\mathcal{D}t}}\int_{-\infty}^{+\infty}\exp\left[-\frac{(x - x')^2}{4\mathcal{D}t}\right]dx$$

Introducing a Boltzmann variable

$$u = \frac{x - x'}{2\sqrt{\mathcal{D}t}} \qquad (8.5.3)$$

Figure 8.13 Dispersion of an instantaneous point source [equation (8.5.1)]. A quantity of the diffusing species equivalent to the area under any of the curves on the diagram is deposited initially at x = 0. The curves are Gauss functions.

it becomes

$$\int_{-\infty}^{+\infty}C\,dx = \frac{M}{\sqrt{\pi}}\int_{-\infty}^{+\infty}e^{-u^2}\,du$$

From a well-known result of calculus, the definite integral on the right-hand side is $\sqrt{\pi}$ so M is just equal to the quantity of diffusing substance. The present solution is therefore applicable to the case where M grams (or moles) per unit surface is deposited on the plane x = x' at t = 0. In terms of concentration, the initial distribution is an impulse function (point source) centered at x = x' which evolves with time towards a gaussian distribution with standard deviation $\sqrt{2\mathcal{D}t}$ (Figure 8.13). Since the standard deviation is the square-root of the second moment, it is often stated that the mean squared distance traveled by the diffusing species is $2\mathcal{D}t$. Two important consequences of this equation are:

(i) Identical Boltzmann variables $x/\sqrt{t}$ produce identical concentrations. The solution at x = 0 for t > 0 is identical to the solution at t = ∞ for x > 0. Likewise, the solution at x = ∞ for t > 0 is identical to the solution at t = 0 for x > 0.

(ii) The distance of all the points with identical concentrations (isopleths) to the origin varies with the square-root of time.
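The two statements about the amount of substance and the standard deviation can be checked by direct numerical integration of the Gauss function (8.5.1). The sketch below uses a plain trapezoidal rule over a finite interval; the values of M, of the diffusion coefficient, and of the time, as well as the integration bounds, are arbitrary choices made for this illustration.

```python
import math

def point_source(x, t, M=1.0, D=1.0, x0=0.0):
    """Gauss function of equation (8.5.1) for a source M deposited at x0 at t = 0."""
    return M / (2.0 * math.sqrt(math.pi * D * t)) * math.exp(-(x - x0) ** 2 / (4.0 * D * t))

def moments(t, M=1.0, D=1.0, half_width=12.0, n=2401):
    """Amount of substance and variance of the distribution (trapezoidal rule)."""
    dx = 2.0 * half_width / (n - 1)
    xs = [-half_width + k * dx for k in range(n)]
    cs = [point_source(x, t, M, D) for x in xs]
    mass = dx * (sum(cs) - 0.5 * (cs[0] + cs[-1]))
    variance = dx * sum(x * x * c for x, c in zip(xs, cs)) / mass
    return mass, variance

mass, variance = moments(t=0.5, M=3.0, D=1.0)
print("amount of substance:", mass)      # expected to be close to M = 3
print("variance:", variance)             # expected to be close to 2 D t = 1
```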

8.5.2 Two half-spaces with uniform initial concentrations

The infinite medium with one-dimensional diffusion and constant diffusion coefficient can be treated easily with the point source theory. Let us first assume that two half-spaces with uniform initial concentrations $C_0$ for x < 0 and 0 for x > 0 are brought into contact with each other. The amount of substance distributed per unit surface between x' and x' + dx' is just $C_0\,dx'$. From the previous result, at time t the effect of the point source $C_0\,dx'$ located at x' on the concentration at x will be

$$\frac{C_0\,dx'}{2\sqrt{\pi\mathcal{D}t}}\exp\left[-\frac{(x - x')^2}{4\mathcal{D}t}\right]$$

Summing over all the point sources at x' from $-\infty$ to $+\infty$ and noting that the contribution from the half-space x > 0 is zero will give the concentration distribution

$$C = \frac{C_0}{2\sqrt{\pi\mathcal{D}t}}\int_{-\infty}^{0}\exp\left[-\frac{(x - x')^2}{4\mathcal{D}t}\right]dx'$$

Using again the Boltzmann variable change

$$u = \frac{x - x'}{2\sqrt{\mathcal{D}t}} \qquad\text{or}\qquad du = -\frac{dx'}{2\sqrt{\mathcal{D}t}}$$

u changes from $+\infty$ to $x/2\sqrt{\mathcal{D}t}$ and the integral becomes

$$C = \frac{C_0}{\sqrt{\pi}}\int_{x/2\sqrt{\mathcal{D}t}}^{+\infty}\exp(-u^2)\,du$$

Introducing the error function erf x defined as (Appendix 8A)

$$\mathrm{erf}\,x = \frac{2}{\sqrt{\pi}}\int_0^x e^{-\xi^2}\,d\xi$$

and the error function complement erfc x

$$\mathrm{erfc}\,x = 1 - \mathrm{erf}\,x = \frac{2}{\sqrt{\pi}}\int_x^{+\infty}e^{-\xi^2}\,d\xi$$

the solution can be rewritten

$$C = \frac{C_0}{2}\,\mathrm{erfc}\,\frac{x}{2\sqrt{\mathcal{D}t}} \qquad (8.5.4)$$


Several curves of C against x have been drawn in Figure 8.14 for different values of the parameter $\mathcal{D}t$. We note that the distribution must remain symmetrical with respect to the point x = 0, $C = C_0/2$. Concentration at the interface is therefore equal to $C_0/2$.
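The equivalence between the sum of point sources and the closed form (8.5.4) can be verified numerically. In the sketch below, the integral over the sources is truncated at a finite distance and evaluated with the trapezoidal rule; the truncation distance, the number of sources, and the value of $\mathcal{D}t$ are arbitrary choices made for this illustration.

```python
import math

def half_space(x, Dt, C0=1.0):
    """Closed-form solution (8.5.4) for two half-spaces in contact at x = 0."""
    return 0.5 * C0 * math.erfc(x / (2.0 * math.sqrt(Dt)))

def half_space_by_sources(x, Dt, C0=1.0, reach=20.0, n=4001):
    """Same solution obtained by summing point sources C0 dx' located between -reach and 0."""
    dxp = reach / (n - 1)
    total = 0.0
    for k in range(n):
        xp = -reach + k * dxp
        weight = 0.5 if k in (0, n - 1) else 1.0      # trapezoidal end weights
        total += weight * math.exp(-(x - xp) ** 2 / (4.0 * Dt))
    return C0 * dxp * total / (2.0 * math.sqrt(math.pi * Dt))

for x in (-1.0, 0.0, 0.5, 2.0):
    print(x, half_space(x, 1.0), half_space_by_sources(x, 1.0))
```

The two columns agree to within the integration error, and the value at x = 0 is half the initial concentration whatever the value of $\mathcal{D}t$.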

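8.5.3 The infinite medium with a layer of uniform initial concentration

With initial condition $C = C_0$ for −X < x < +X and C = 0 elsewhere, we can assume that the initial distribution is the sum of point sources uniformly distributed between −X and +X. Again, the amount of substance distributed per unit surface between x' and x' + dx' is $C_0\,dx'$. Summing up the contribution of all the point sources such as −X ≤ x' ≤ +X will give the concentration distribution

$$C = \frac{C_0}{2\sqrt{\pi\mathcal{D}t}}\int_{-X}^{+X}\exp\left[-\frac{(x - x')^2}{4\mathcal{D}t}\right]dx'$$

Introducing Boltzmann's variable, we get

$$C = \frac{C_0}{\sqrt{\pi}}\int_{(x-X)/2\sqrt{\mathcal{D}t}}^{(x+X)/2\sqrt{\mathcal{D}t}}e^{-u^2}\,du = \frac{C_0}{\sqrt{\pi}}\left[\int_0^{(x+X)/2\sqrt{\mathcal{D}t}}e^{-u^2}\,du - \int_0^{(x-X)/2\sqrt{\mathcal{D}t}}e^{-u^2}\,du\right]$$

Finally

$$C = \frac{C_0}{2}\left[\mathrm{erf}\,\frac{x + X}{2\sqrt{\mathcal{D}t}} - \mathrm{erf}\,\frac{x - X}{2\sqrt{\mathcal{D}t}}\right] \qquad (8.5.6)$$

Defining the dimensionless variables $\xi = x/X$ and $\tau = \mathcal{D}t/X^2$, the solution can be rewritten

$$C = \frac{C_0}{2}\left[\mathrm{erf}\,\frac{\xi + 1}{2\sqrt{\tau}} - \mathrm{erf}\,\frac{\xi - 1}{2\sqrt{\tau}}\right]$$

This solution keeps the remarkably constant value $C \approx C_0/2$ at x = ±X for values of $\mathcal{D}t/X^2 < 0.5$ (Figure 8.15).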
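The near constancy of the concentration at the edges of the layer is easily verified from the dimensionless form of equation (8.5.6). The short sketch below evaluates the solution at $\xi = 1$ for a few values of $\tau$; the function name and the chosen values of $\tau$ are arbitrary.

```python
import math

def layer(xi, tau, C0=1.0):
    """Dimensionless form of equation (8.5.6): xi = x/X and tau = D t / X^2."""
    scale = 2.0 * math.sqrt(tau)
    return 0.5 * C0 * (math.erf((xi + 1.0) / scale) - math.erf((xi - 1.0) / scale))

# Concentration at the edge of the layer (xi = 1) for increasing values of tau
edge = {tau: layer(1.0, tau) for tau in (0.01, 0.1, 0.5, 2.0)}
for tau in sorted(edge):
    print("tau = %.2f   C/C0 at the edge = %.4f" % (tau, edge[tau]))
```

The edge value stays within a few percent of one half as long as $\tau$ remains below about 0.5 and drops markedly for larger values, once the two diffusion fronts overlap in the layer.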

8.5.4 The infinite medium: an arbitrary initial distribution

The method of point sources can be extended to any form of the initial distribution $C_0(x)$ in the infinite medium. The amount of substance distributed per unit surface between $x'$ and $x' + dx'$ is just $C_0(x')\,dx'$. Summing over all the point sources at $x'$ from $-\infty$ to $+\infty$ gives the concentration distribution

$$C = \frac{1}{2\sqrt{\pi\mathcal{D}t}}\int_{-\infty}^{+\infty} C_0(x')\exp\left[-\frac{(x-x')^2}{4\mathcal{D}t}\right]dx'$$

Figure 8.14 Two half-spaces in contact with each other at $x = 0$ have the concentration $C_0$ for $x < 0$ and 0 for $x > 0$. Interface concentration is constant and equal to $C_0/2$. The curves are labeled for different values of the parameter $\mathcal{D}t$.

Figure 8.15 The infinite medium with a layer of thickness $2X$ and uniform initial concentration $C_0$ [equation (8.5.6)], plotted against $x/X$. Interface concentration at $x = \pm X$ stays nearly constant and equal to $C_0/2$ for $\mathcal{D}t < 0.5$. The curves are labeled for different values of the parameter $\mathcal{D}t$.

Transport, advection, and diffusion

The exponential term which represents the effect of a point source is sometimes called the influence function or Green function of this diffusion problem. The method of sources and sinks easily produces solutions for an infinite medium or for systems of finite dimension when their boundary is kept at zero concentration. Different boundary conditions require a more elaborate formulation (Carslaw and Jaeger, 1959).

8.5.5 The infinite medium with $C_0(x)$ being a periodic function of $x$

A useful result is obtained for the initial distribution $C_0(x)$ given by

$$C_0(x) = A\sin 2\pi\frac{x}{\lambda} + B$$

where $A$ is the amplitude and $\lambda$ the wavelength. $B$ is a constant which represents the mean concentration and will be taken equal to zero, which amounts to considering a perturbation around the mean value. A quick look at the partial derivatives in the diffusion equation shows that double differentiation relative to $x$ keeps the sine term and suggests a solution in the form

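$$C(x,t) = f(t)\sin 2\pi\frac{x}{\lambda}$$

Inserting this expression into the diffusion equation leads to

$$f'(t)\sin 2\pi\frac{x}{\lambda} = -\mathcal{D}\frac{4\pi^2}{\lambda^2}f(t)\sin 2\pi\frac{x}{\lambda}$$

or

$$f'(t) = -\mathcal{D}\frac{4\pi^2}{\lambda^2}f(t)$$

which is integrated into

$$f(t) = \mathrm{const}\times\exp\left(-\frac{4\pi^2\mathcal{D}t}{\lambda^2}\right)$$

The solution is therefore

$$C(x,t) = A\sin 2\pi\frac{x}{\lambda}\,\exp\left(-\frac{4\pi^2\mathcal{D}t}{\lambda^2}\right) = C_0(x)\exp\left(-\frac{4\pi^2\mathcal{D}t}{\lambda^2}\right) \tag{8.5.8}$$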
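The exponential factor multiplying a sine component of wavelength $\lambda$ is $\exp(-4\pi^2\mathcal{D}t/\lambda^2)$. A minimal sketch of how this factor acts on a sum of sines is given below; the function names, and the choice of wavelengths 1, 0.2 and 0.1 (those of the initial distribution quoted in the caption of Figure 8.16), are illustrative assumptions.

```python
import math


def decay_factor(wavelength, Dt):
    """Amplitude ratio C/C0 of a sine component after diffusion (exponential factor of 8.5.8)."""
    return math.exp(-4.0 * math.pi**2 * Dt / wavelength**2)


def periodic_profile(x, Dt, wavelengths=(1.0, 0.2, 0.1), amplitude=1.0):
    """Sum of sine components, each damped by its own decay factor."""
    return sum(
        amplitude * math.sin(2.0 * math.pi * x / lam) * decay_factor(lam, Dt)
        for lam in wavelengths
    )


if __name__ == "__main__":
    for Dt in (0.0, 0.001, 0.01):
        factors = [round(decay_factor(lam, Dt), 4) for lam in (1.0, 0.2, 0.1)]
        print(f"Dt = {Dt:6.3f}  decay factors for lambda = 1, 0.2, 0.1: {factors}")
```

The e-folding time of a component is $\lambda^2/4\pi^2\mathcal{D}$, so a tenfold reduction in wavelength shortens the decay time a hundredfold.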

The shorter the wavelength, the faster its decay. Mineral scale heterogeneities in rocks disappear long before meter-scale or even larger heterogeneities. This concept can be extended to any arbitrary combination of periodic functions: in Section 2.6, we have already met the idea that any function bounded over an interval can be expanded as a sum of sine and cosine functions. Shorter wavelengths will decay much faster than longer ones: diffusion filters out short concentration variations first. This is illustrated in Figure 8.16 for $A = 1$ and the sum of three sine functions with $\lambda = 0.1$, 0.2, and 1 at $\mathcal{D}t = 0$, 0.001, 0.01.

Figure 8.16 The infinite medium with initial concentration $C_0(x) = \sin 2\pi x + \sin 10\pi x + \sin 20\pi x$. The curves are labeled for different values of the parameter $\mathcal{D}t$. Short wavelengths are smoothed much faster than long wavelengths [equation (8.5.8)].

8.5.6 The semi-infinite medium with constant surface concentration

Let us assume parallel flux in a semi-infinite medium bound by the plane $x = 0$. Diffusion of a given element takes place from the plane $x = 0$ kept at concentration $C_{int}$. Introducing a Boltzmann variable $u$ with constant diffusion coefficient such as

$$u = \frac{x}{2\sqrt{\mathcal{D}t}}$$

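equation (8.4.7) is modified as

$$\frac{d^2C}{du^2} = -2u\frac{dC}{du} \tag{8.5.9}$$

Equation (8.5.9) is equivalent to

$$\frac{d\left(dC/du\right)}{dC/du} = -2u\,du$$

This is integrated a first time into

$$\frac{dC}{du} = \mathrm{const}\times e^{-u^2}$$

and a second time into

$$C = \alpha\,\mathrm{erf}\,u + \beta$$

where $\alpha$ and $\beta$ are constants to be determined from the boundary and initial conditions. Applying the initial and boundary conditions gives

$$C_0 = \alpha\,\mathrm{erf}\,\infty + \beta = \alpha + \beta, \qquad C_{int} = \alpha\,\mathrm{erf}\,0 + \beta = \beta$$

hence

$$C = (C_0 - C_{int})\,\mathrm{erf}\,\frac{x}{2\sqrt{\mathcal{D}t}} + C_{int}$$

which can be rewritten

$$\frac{C - C_0}{C_{int} - C_0} = \mathrm{erfc}\,\frac{x}{2\sqrt{\mathcal{D}t}} \tag{8.5.10}$$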

Curves corresponding to various values of $\mathcal{D}t$ have been drawn in Figure 8.17. A standard chemical (or diffusion) boundary layer thickness $\delta$ can be defined as the mean distance to the interface traveled by the diffusing substance. The total mass $M$ of diffusing substance which penetrated the interface at $t$ is given by the surface under the curve

$$M = \int_0^{+\infty}(C - C_0)\,dx = (C_{int} - C_0)\int_0^{+\infty}\mathrm{erfc}\,\frac{x}{2\sqrt{\mathcal{D}t}}\,dx$$

It is shown in Appendix 8A that

$$\int_0^{+\infty}\mathrm{erfc}\,u\,du = \frac{1}{\sqrt{\pi}}$$

and therefore

$$M = 2(C_{int} - C_0)\sqrt{\frac{\mathcal{D}t}{\pi}} \tag{8.5.11}$$

The boundary layer thickness $\delta$ (Figure 8.17) will be defined as the thickness of a layer with uniform concentration $C_{int}$ which would contain the same amount $M$ of diffusing substance

$$\delta = \frac{M}{C_{int} - C_0} = 2\sqrt{\frac{\mathcal{D}t}{\pi}} \tag{8.5.12}$$

Figure 8.17 Constant surface concentration [equation (8.5.10)]. The curves are labeled for different values of the parameter $\mathcal{D}t$. The framed area is equal to the hatched area and defines the thickness $\delta$ of the boundary layer for $\mathcal{D}t = 0.05$ [equation (8.5.12)].
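The equivalence between the area under the erfc profile and a uniform layer of thickness $2\sqrt{\mathcal{D}t/\pi}$ can be verified by brute-force integration. The sketch below uses a plain trapezoidal rule; the function names, the truncation of the integral at twelve diffusion lengths and the number of steps are arbitrary assumptions made for this illustration.

```python
import math


def boundary_layer_thickness(Dt):
    """Boundary layer thickness 2*sqrt(Dt/pi), as in equation (8.5.12)."""
    return 2.0 * math.sqrt(Dt / math.pi)


def penetrated_mass(Dt, c_int=1.0, c0=0.0, n_steps=20000, cutoff=12.0):
    """Trapezoidal integral of (C - C0) over x >= 0 for the erfc profile (8.5.10).

    The integral is truncated at `cutoff` times 2*sqrt(Dt), beyond which
    erfc is negligible in double precision.
    """
    scale = 2.0 * math.sqrt(Dt)
    h = cutoff * scale / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        excess = (c_int - c0) * math.erfc(i * h / scale)
        total += excess if 0 < i < n_steps else 0.5 * excess
    return total * h


if __name__ == "__main__":
    Dt = 0.05
    print("delta from the formula   :", boundary_layer_thickness(Dt))
    print("area under the profile   :", penetrated_mass(Dt))
```

For a unit concentration contrast the two printed numbers should agree, since the mass per unit surface divided by $C_{int} - C_0$ is by definition the boundary layer thickness.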

8.5.7 The slab with uniform initial concentration

When combined with the Fourier expansion of functions, separation of variables is another powerful method of solution which is particularly useful for systems of finite dimensions. Regardless of boundary conditions, we decompose the solution $C(x,t)$, where the dependence of $C$ on $x$ and $t$ is temporarily emphasized, to the general one-dimensional diffusion equation with constant diffusion coefficient

$$\frac{\partial C(x,t)}{\partial t} = \mathcal{D}\frac{\partial^2 C(x,t)}{\partial x^2}$$

into the product

$$C(x,t) = f(t)\,g(x)$$

Inserting this expression into the diffusion equation gives

$$g(x)\frac{df(t)}{dt} = \mathcal{D}f(t)\frac{d^2g(x)}{dx^2}$$

or

$$\frac{1}{\mathcal{D}f(t)}\frac{df(t)}{dt} = \frac{1}{g(x)}\frac{d^2g(x)}{dx^2}$$

Since the left-hand side of the last equation depends only on $t$ and the right-hand side only on $x$, both members must be equal to the same constant, which suggests using an exponential form for $f(t)$ and a trigonometric form for $g(x)$. The diffusion equation is indeed identically verified for

$$C(x,t) = \alpha_n\sin n\pi\frac{x}{X}\,\exp\left(-n^2\pi^2\mathcal{D}t/X^2\right)$$

where $n$ is an integer and $\alpha_n$ an arbitrary constant. This solution satisfies the condition $C(0,t) = C(X,t) = 0$. Any superposition of solutions with different values of $n$ would also be a solution. The condition at $t = 0$ suggests that the initial concentration distribution can be expanded in a series of sines, e.g.,

$$C(x,0) = \sum_{n=0}^{\infty}\alpha_n\sin n\pi\frac{x}{X}$$

A general solution to the diffusion equation is therefore

$$C(x,t) = \sum_{n=0}^{\infty}\alpha_n\sin n\pi\frac{x}{X}\,\exp\left(-n^2\pi^2\mathcal{D}t/X^2\right)$$

Such a sine expansion is generally made possible with the condition of zero concentration at $x = 0$ by using an odd function for the initial concentration distribution. A simple example is for initial uniform concentration $C_0$ between 0 and $X$ for which we can assume a fictitious concentration $-C_0$ between $-X$ and 0. Using the results of Chapter 2, the Fourier expansion of the boxcar function which is $C_0$ between 0 and $X$, and 0 at $x = 0$ and $x = X$ is

$$C_0(x) = \frac{4C_0}{\pi}\sum_{n=0}^{\infty}\frac{1}{2n+1}\sin(2n+1)\pi\frac{x}{X} \tag{8.5.13}$$

Concentration at $t$ therefore is

$$C(x,t) = \frac{4C_0}{\pi}\sum_{n=0}^{\infty}\frac{1}{2n+1}\sin(2n+1)\pi\frac{x}{X}\,\exp\left[-(2n+1)^2\pi^2\mathcal{D}t/X^2\right] \tag{8.5.14}$$
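A truncated version of the sine series for the slab, with odd harmonics weighted by $1/(2n+1)$ and damped by $\exp[-(2n+1)^2\pi^2\mathcal{D}t/X^2]$, can be summed directly. The sketch below is illustrative only; the function name `slab_profile` and the default number of terms are assumptions.

```python
import math


def slab_profile(x, X, Dt, c0=1.0, n_terms=200):
    """Truncated sine series for a slab 0 < x < X initially at c0 with zero surface concentration."""
    total = 0.0
    for n in range(n_terms):
        k = 2 * n + 1
        total += (
            math.sin(k * math.pi * x / X)
            * math.exp(-((k * math.pi) ** 2) * Dt / X**2)
            / k
        )
    return 4.0 * c0 / math.pi * total


if __name__ == "__main__":
    X = 1.0
    for Dt in (0.001, 0.01, 0.1, 0.5):
        profile = [round(slab_profile(x, X, Dt), 4) for x in (0.0, 0.1, 0.25, 0.5)]
        print(f"Dt/X^2 = {Dt:5.3f}  C/C0 at x/X = 0, 0.1, 0.25, 0.5: {profile}")
```

At long times a single term is enough, whereas at short times many harmonics are needed to reproduce the nearly flat central part of the profile.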

Formulating the same problem in a symmetrical way, i.e., for a slab with $-h < x < +h$, may occasionally be convenient. Changing $X$ into $2h$ and $x$ into $(x + h)$ would give

$$C(x,t) = \frac{4C_0}{\pi}\sum_{n=0}^{\infty}\frac{(-1)^n}{2n+1}\cos\left[\frac{2n+1}{2}\pi\frac{x}{h}\right]\exp\left[-\left(\frac{2n+1}{2}\right)^2\pi^2\frac{\mathcal{D}t}{h^2}\right]$$

This solution converges very slowly for values of $\mathcal{D}t/X^2 \ll 1$. Alternative solutions suitable for small values of the time are available from the standard books on diffusion (Carslaw and Jaeger, 1959; Crank, 1976). The amount $M(t)$ of diffusing substance still present in the slab at $t$ is

$$M(t) = \int_0^X C(x,t)\,dx = \frac{4C_0}{\pi}\sum_{n=0}^{\infty}\frac{1}{2n+1}\exp\left[-(2n+1)^2\pi^2\mathcal{D}t/X^2\right]\int_0^X\sin(2n+1)\pi\frac{x}{X}\,dx$$

The integral can be calculated as

$$\int_0^X\sin(2n+1)\pi\frac{x}{X}\,dx = \left[-\frac{X}{(2n+1)\pi}\cos(2n+1)\pi\frac{x}{X}\right]_0^X = \frac{X\left[1-\cos(2n+1)\pi\right]}{(2n+1)\pi} = \frac{2X}{(2n+1)\pi} \tag{8.5.15}$$

and the mean concentration $\bar{C}(t) = M(t)/X$ is given by

$$\frac{\bar{C}(t)}{C_0} = \frac{8}{\pi^2}\sum_{n=0}^{\infty}\frac{1}{(2n+1)^2}\exp\left[-(2n+1)^2\pi^2\mathcal{D}t/X^2\right] \tag{8.5.16}$$

a solution which converges extremely fast except for small values of $\mathcal{D}t/X^2$. Using the properties of the theta functions, an alternative solution can be computed which converges rapidly for small values of the time (Appendix 8B)

$$\frac{\bar{C}(t)}{C_0} = 1 - \frac{4\sqrt{\mathcal{D}t}}{X}\left[\frac{1}{\sqrt{\pi}} + 2\sum_{n=1}^{\infty}(-1)^n\,\mathrm{ierfc}\,\frac{nX}{2\sqrt{\mathcal{D}t}}\right] \tag{8.5.17}$$

where ierfc refers to the integral of the error function complement (Appendix 8A). The solution $\bar{C}(t)/C_0$ has been drawn in Figure 8.18 for various values of the parameter $\mathcal{D}t/X^2$. The approximation (8.5.18) gives the fraction left in the mineral at $t$ with four exact digits for loss < 40 percent, and two digits for loss < 80 percent (Figure 8.18). Using symmetry arguments, the solution for diffusion with no flux at one end can be derived from these equations. Obviously, the concentration profile for zero surface concentration is symmetrical relative to $X/2$, which means that $\partial C/\partial x$ is zero at that point: the flux of diffusing substance through this point is zero. Other combinations of boundary conditions can be found in standard textbooks (Carslaw and Jaeger, 1959; Crank, 1976).

Figure 8.18 Parallel diffusion in a slab: mean concentration in a slab ($0 < x < X$) against $\mathcal{D}t/X^2$. Heavy line: simple loss and initial concentration $C_0$ [equations (8.5.16) and (8.5.17)]. Dotted line: approximation (8.5.18). Thin lines (decay and loss): initial concentration 0, radiogenic production from a radioactive parent with concentration $C_0$, $w$ is the loss parameter [equation (8.5.22)].

8.5.8 The slab with accumulation of a radiogenic isotope

In K-Ar or zircon U-Pb dating, modeling the loss of radiogenic isotopes by volume diffusion is important. If $P_0$ is the local concentration at $t = 0$ of a radioactive element decaying with constant $\lambda$, a source term exists in the transport equation of the radiogenic element which is the local rate of accumulation $\lambda P_0 e^{-\lambda t}$. For dual decay, such as K-Ar, this equation should be corrected for the branching ratio. The diffusion coefficient of the radiogenic element is assumed to be constant, thus the one-dimensional diffusion equation reads

$$\frac{\partial C}{\partial t} = \mathcal{D}\frac{\partial^2 C}{\partial x^2} + \lambda P_0 e^{-\lambda t} \tag{8.5.19}$$

At $t = 0$, the slab is supposed to be free of radiogenic element ($C_0 = 0$). At the boundaries $x = 0$ and $x = X$ of the system, concentration is kept to zero. For simplicity, $P_0$ will be assumed to be constant over the mineral. Let us consider in the first place the total concentration $N = C + P_0 e^{-\lambda t}$ of radioactive and radiogenic isotopes. Since there is no loss of the radioactive isotope, the variation of $N$ equals the loss of the radiogenic isotope. In other words

$$\frac{\partial N}{\partial t} = \mathcal{D}\frac{\partial^2 C}{\partial x^2}$$

Since $P_0$ is independent of $x$, this can be rewritten

$$\frac{\partial N}{\partial t} = \mathcal{D}\frac{\partial^2 N}{\partial x^2}$$

with the conditions that $N = P_0$ at $t = 0$ and $N = P_0 e^{-\lambda t}$ at $x = 0$ and $x = X$. Such a formulation is extremely handy when numerical methods such as finite differences are used. Analytic solutions can be arrived at by using Duhamel's principle, briefly outlined in Appendix 8C, which will be subsequently applied to spherical systems. Alternatively, the solution can be worked out through a series of steps similar to those taken for the non-radioactive case. We assume that a solution can be found as a product of a function $f(t)$ and a series of trigonometric functions in $x$ such as

$$C(x,t) = f(t)\sum_{n=0}^{\infty}\alpha_n\sin n\pi\frac{x}{X}$$

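Equation (8.5.19) becomes

$$f'(t)\sum_{n=0}^{\infty}\alpha_n\sin n\pi\frac{x}{X} = -\mathcal{D}f(t)\sum_{n=0}^{\infty}\frac{n^2\pi^2}{X^2}\alpha_n\sin n\pi\frac{x}{X} + \lambda P_0 e^{-\lambda t}$$

The trouble is now that the source term does not include the sum of sines, so we will use a trick resting on Leibniz's rule for differentiating integrals. A particular solution of the diffusion equation with radiogenic accumulation is

$$C(x,t) = \sum_{n=0}^{\infty}\alpha_n\sin n\pi\frac{x}{X}\int_0^t\lambda P_0 e^{-\lambda u}\exp\left[-n^2\pi^2\mathcal{D}(t-u)/X^2\right]du$$

Using Leibniz's rule, we get

$$\frac{\partial C}{\partial t} = \sum_{n=0}^{\infty}\alpha_n\sin n\pi\frac{x}{X}\left\{\lambda P_0 e^{-\lambda t} - \frac{n^2\pi^2\mathcal{D}}{X^2}\int_0^t\lambda P_0 e^{-\lambda u}\exp\left[-n^2\pi^2\mathcal{D}(t-u)/X^2\right]du\right\}$$

which, inserted into the diffusion equation, amounts to the additional condition

$$\sum_{n=0}^{\infty}\alpha_n\sin n\pi\frac{x}{X} = 1$$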

The function that has such a Fourier expansion while satisfying $C(0,t) = C(X,t) = 0$ is the boxcar function. We found in Section 2.6.2 that the $\alpha_n$ coefficients of this function are 0 for even values and $4/n\pi$ for odd values of $n$. The general solution is therefore

$$C(x,t) = \frac{4}{\pi}\sum_{n=0}^{\infty}\frac{1}{2n+1}\sin(2n+1)\pi\frac{x}{X}\int_0^t\lambda P_0 e^{-\lambda u}\exp\left[-(2n+1)^2\pi^2\frac{\mathcal{D}(t-u)}{X^2}\right]du \tag{8.5.20}$$

Let us evaluate the integral $I_n$ on the right-hand side of equation (8.5.20)

$$I_n = \lambda P_0\exp\left[-(2n+1)^2\pi^2\frac{\mathcal{D}t}{X^2}\right]\int_0^t\exp\left\{\left[(2n+1)^2\pi^2\frac{\mathcal{D}}{X^2} - \lambda\right]u\right\}du$$

which can be integrated into

$$I_n = \lambda P_0\,\frac{e^{-\lambda t} - \exp\left[-(2n+1)^2\pi^2\mathcal{D}t/X^2\right]}{(2n+1)^2\pi^2\mathcal{D}/X^2 - \lambda}$$

Inserting this value into equation (8.5.20) gives

$$C(x,t) = \frac{4\lambda P_0}{\pi}\sum_{n=0}^{\infty}\frac{1}{2n+1}\sin(2n+1)\pi\frac{x}{X}\;\frac{e^{-\lambda t} - \exp\left[-(2n+1)^2\pi^2\mathcal{D}t/X^2\right]}{(2n+1)^2\pi^2\mathcal{D}/X^2 - \lambda}$$

The mean concentration $\bar{C}(t)$ at time $t$ is obtained by integrating this expression from 0 to $X$

$$\bar{C}(t) = \frac{1}{X}\int_0^X C(x,t)\,dx$$

Using equation (8.5.15), we get, in terms of the dimensionless time $\mathcal{D}t/X^2$

$$\frac{\bar{C}(t)}{P_0} = \frac{8}{\pi^2}\sum_{n=0}^{\infty}\frac{1}{(2n+1)^2}\;\frac{\exp\left[-(2n+1)^2\pi^2\dfrac{\mathcal{D}t}{X^2}\right] - \exp\left[-\dfrac{1}{w}\dfrac{\mathcal{D}t}{X^2}\right]}{1 - (2n+1)^2\pi^2 w} \tag{8.5.21}$$

where the dimensionless parameter $w$ is defined as

$$w = \frac{\mathcal{D}}{\lambda X^2} \tag{8.5.22}$$

This formula could be made slightly more compact by using Cauchy's Theorem of Residues for complex variables and the Theorem of Mittag-Leffler (e.g., Spiegel, 1973), but, since the advent of reasonably fast desktop computers, this is no longer a critical requirement. For fast decay, all radioactive atoms are rapidly converted into radiogenic atoms. It can be checked that when $\lambda$ goes to large values, i.e., when $w$ vanishes, the second term after the summation sign tends to unity, the second exponential within the braces vanishes and the expression becomes identical to the case with initial concentration $P_0$ which we have established in the previous section. The solution has been drawn for several values of the parameter $w$ in Figure 8.18.
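The fast-decay limit discussed above can be checked numerically. The sketch below assumes that the mean radiogenic concentration in the slab is the series (8.5.21) written in terms of $\tau = \mathcal{D}t/X^2$ and $w = \mathcal{D}/\lambda X^2$ (so that $\lambda t = \tau/w$), and that simple loss follows the series (8.5.16); function names and the number of terms are illustrative assumptions.

```python
import math


def slab_mean_simple_loss(tau, n_terms=2000):
    """Mean concentration / C0 for simple loss from a slab; tau = D*t/X**2."""
    total = 0.0
    for n in range(n_terms):
        k = 2 * n + 1
        total += math.exp(-((k * math.pi) ** 2) * tau) / k**2
    return 8.0 / math.pi**2 * total


def slab_mean_radiogenic(tau, w, n_terms=2000):
    """Mean radiogenic concentration / P0 with production and loss.

    tau = D*t/X**2 and w = D/(lambda*X**2), hence lambda*t = tau/w.
    A term is singular if (2n+1)**2 * pi**2 * w equals 1 exactly; that
    special case is not handled in this sketch.
    """
    decay = math.exp(-tau / w)
    total = 0.0
    for n in range(n_terms):
        k = 2 * n + 1
        a = (k * math.pi) ** 2
        total += (math.exp(-a * tau) - decay) / (k**2 * (1.0 - a * w))
    return 8.0 / math.pi**2 * total


if __name__ == "__main__":
    tau = 0.05
    print("simple loss          :", slab_mean_simple_loss(tau))
    for w in (1e-9, 1e-3, 1.0):
        print(f"radiogenic, w = {w:g}:", slab_mean_radiogenic(tau, w))
```

For very small $w$ the radiogenic case should reproduce the simple-loss value, while for larger $w$ the retained amount stays below the amount produced, $1 - e^{-\lambda t}$.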

8.5.9 Disequilibrium fractionation during solidification

Let us consider the influence of a solid-liquid interface advancing at a constant velocity on the solid-liquid fractionation of an element $i$. In the case of unidirectional solidification, it is convenient to consider that liquid crosses the immobile interface with an absolute constant velocity $v$, while a solid-liquid fractionation coefficient $K$ is applied to the fractionation of element $i$. Let us assume that the interface is at $x = 0$, the medium being solid for $x < 0$. Liquid fills the half-space $0 < x < \infty$ and concentration is initially $C_0$. Diffusion in the solid is much slower than in the liquid and therefore is neglected. Since the liquid moves towards negative $x$, concentration of a particular element in the liquid obeys the diffusion equation

$$\frac{\partial C_{liq}}{\partial t} = \mathcal{D}\frac{\partial^2 C_{liq}}{\partial x^2} + v\frac{\partial C_{liq}}{\partial x} \tag{8.5.23}$$

where $\mathcal{D}$ is the diffusion coefficient of the species. Mass balance at the interface $x = 0$ requires the equality of the diffusion and advection fluxes on the liquid side with the advection flux on the solid side

$$-\rho_{liq}\mathcal{D}\frac{\partial C_{liq}}{\partial x} - \rho_{liq}vC_{liq} = -\rho_{sol}vC_{sol} \tag{8.5.24}$$

or

$$\frac{\partial C_{liq}}{\partial x} = \left(\frac{\rho_{sol}}{\rho_{liq}}K - 1\right)\frac{v}{\mathcal{D}}C_{liq} = \frac{\alpha}{\delta}C_{liq} \tag{8.5.25}$$

in which the parameter $\alpha$, corresponding to the parenthesis in the middle term, and the characteristic length $\delta = \mathcal{D}/v$ have been introduced in order to keep the notation compact. In addition, when $x \to \infty$, $C \to C_0$ since the liquid away from the interface does not 'feel' it. An element-dependent characteristic time of the process is $\theta = \mathcal{D}/v^2$. This problem has been considered by Smith et al. (1955) and Hulme (1955), who, using the method of Laplace transforms, found the solution (also listed in Carslaw and Jaeger, 1959, p. 389)

$$\frac{C_{liq}}{C_0} = 1 - \frac{\alpha}{2(1+\alpha)}e^{-\xi}\,\mathrm{erfc}\,\frac{\xi-\tau}{2\sqrt{\tau}} - \frac{1}{2}\,\mathrm{erfc}\,\frac{\xi+\tau}{2\sqrt{\tau}} + \frac{1+2\alpha}{2(1+\alpha)}\exp\left\{\alpha\left[\xi + (1+\alpha)\tau\right]\right\}\mathrm{erfc}\,\frac{\xi + (1+2\alpha)\tau}{2\sqrt{\tau}} \tag{8.5.26}$$

where we have introduced the dimensionless variables

$$\xi = \frac{vx}{\mathcal{D}} = \frac{x}{\delta}, \quad\text{and}\quad \tau = \frac{v^2t}{\mathcal{D}} = \frac{t}{\theta} \tag{8.5.27}$$

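After some manipulation, concentration of the liquid at the interface $\xi = 0$ becomes

$$\frac{C_{liq}(0)}{C_0} = \frac{1}{1+\alpha} - \frac{1}{2(1+\alpha)}\,\mathrm{erfc}\,\frac{\sqrt{\tau}}{2} + \frac{1+2\alpha}{2(1+\alpha)}\exp\left[\alpha(1+\alpha)\tau\right]\mathrm{erfc}\left[\frac{1+2\alpha}{2}\sqrt{\tau}\right] \tag{8.5.28}$$

Let us investigate the long-term asymptotic properties of this solution. Given that

$$\left(\frac{1+2\alpha}{2}\right)^2 = \alpha(1+\alpha) + \frac{1}{4}$$

we write

$$\exp\left[\alpha(1+\alpha)\tau\right] = e^{-\tau/4}\exp\left[\left(\frac{1+2\alpha}{2}\right)^2\tau\right]$$

which is recombined in a more symmetric way

$$\frac{C_{liq}(0)}{C_0} = \frac{1}{1+\alpha} + \frac{e^{-\tau/4}}{1+\alpha}\left\{\frac{1+2\alpha}{2}\exp\left[\left(\frac{1+2\alpha}{2}\right)^2\tau\right]\mathrm{erfc}\left[\frac{1+2\alpha}{2}\sqrt{\tau}\right] - \frac{1}{2}\exp\left(\frac{\tau}{4}\right)\mathrm{erfc}\,\frac{\sqrt{\tau}}{2}\right\}$$

Using the property of the error function listed in Appendix 8B

$$\exp(u^2)\,\mathrm{erfc}\,u \approx \frac{1}{\sqrt{\pi}\,u} \quad\text{as } u \to \infty$$

when $\tau \to \infty$, the asymptotic value becomes

$$\frac{C_{liq}(0)}{C_0} = \frac{1}{1+\alpha} + \frac{e^{-\tau/4}}{1+\alpha}\left\{\frac{1+2\alpha}{2}\,\frac{2}{(1+2\alpha)\sqrt{\pi\tau}} - \frac{1}{2}\,\frac{2}{\sqrt{\pi\tau}}\right\}$$

The term between braces is identically zero so

$$\frac{C_{liq}(0)}{C_0} = \frac{1}{1+\alpha} = \frac{\rho_{liq}}{\rho_{sol}K}$$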
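The build-up of the interface concentration towards its steady-state value $1/(1+\alpha)$ can be explored numerically. The sketch below assumes the interface form of the Smith et al. (1955) solution, $C_{liq}(0)/C_0 = [1 - \tfrac{1}{2}\mathrm{erfc}(\sqrt{\tau}/2) + \tfrac{1}{2}(1+2\alpha)\,e^{\alpha(1+\alpha)\tau}\,\mathrm{erfc}((1+2\alpha)\sqrt{\tau}/2)]/(1+\alpha)$, with $\tau = v^2t/\mathcal{D}$ and $\alpha = \rho_{sol}K/\rho_{liq} - 1$; the function name is an assumption, and no precaution is taken against overflow of the exponential at very large $\tau$.

```python
import math


def interface_concentration(tau, alpha):
    """C_liq(0)/C0 at dimensionless time tau = v**2 * t / D.

    alpha = rho_sol * K / rho_liq - 1 must be larger than -1.
    Assumed form: interface value of the Smith et al. (1955) solution.
    """
    if alpha <= -1.0:
        raise ValueError("alpha must be larger than -1")
    s = math.sqrt(tau)
    term_a = 0.5 * math.erfc(0.5 * s)
    term_b = (
        0.5 * (1.0 + 2.0 * alpha)
        * math.exp(alpha * (1.0 + alpha) * tau)
        * math.erfc(0.5 * (1.0 + 2.0 * alpha) * s)
    )
    return (1.0 - term_a + term_b) / (1.0 + alpha)


if __name__ == "__main__":
    # With equal densities, alpha = K - 1: K = 2 (compatible), 0.5 and 0.1 (incompatible).
    for K in (2.0, 0.5, 0.1):
        alpha = K - 1.0
        row = [round(interface_concentration(tau, alpha), 3) for tau in (0.0, 1.0, 10.0, 100.0)]
        print(f"K = {K}: C(0)/C0 at tau = 0, 1, 10, 100 -> {row}, steady state {1.0 / K:.3f}")
```

Under these assumptions the printed values start at unity and drift towards $1/K$, more slowly for the smaller partition coefficients.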

The steady-state profile in the liquid is calculated in Section 9.6. The rate at which interface concentration builds up or goes down is shown for various $\alpha$ in Figure 8.19.

Figure 8.19 Evolution of the liquid concentration at the interface with a solid growing at the constant rate $v$ from a solution initially at $C_0$. $K$ is the solid-liquid partition coefficient. Steady-state takes longer to establish for incompatible elements.

Neglecting density change upon solidification and considering incompatible elements ($K \approx 0$), equation (8.5.29) simplifies to an excellent approximation into

$$\frac{C_{liq}(0)}{C_0} \approx \frac{1}{K}\left[1 - (1-2K)\,e^{-K(1-K)\tau}\right]$$

or

$$\frac{C_{liq}(0)}{C_0} \approx \frac{1}{K}\left(1 - e^{-K\tau}\right) \tag{8.5.30}$$

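This equation shows that, at constant growth rate, the more incompatible the elements, the longer it takes for steady-state to establish. We therefore can expect kinetic disequilibrium between mineral and liquid to be more conspicuous for incompatible than for compatible elements.

8.6 Radial flux and spherical coordinates

8.6.1 Introduction

In the case of flux with spherical symmetry, i.e., with no dependence on the latitude and longitude, gradient and Laplacian operators must be expressed as a function of the radial distance $r$ to the origin

$$r^2 = x^2 + y^2 + z^2$$

The derivative operator transforms into

$$\frac{\partial}{\partial x} = \frac{\partial r}{\partial x}\frac{\partial}{\partial r} = \frac{x}{r}\frac{\partial}{\partial r}$$

which results in

$$\mathrm{grad}\,C = \frac{x\mathbf{i} + y\mathbf{j} + z\mathbf{k}}{r}\,\frac{\partial C}{\partial r}$$

where $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ are the unit vectors along the $x$, $y$, $z$ axes. The fraction on the right-hand side represents the ratio of the vector with modulus $r$ to the modulus $r$ itself. It therefore represents the unit vector $\mathbf{e}_r$ along the radial direction at the point under consideration

$$\mathrm{grad}\,C = \mathbf{e}_r\frac{\partial C}{\partial r} \tag{8.6.1}$$

From the previous expression of the derivative

$$\frac{\partial^2 C}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{x}{r}\frac{\partial C}{\partial r}\right) = \frac{1}{r}\frac{\partial C}{\partial r} + x\frac{\partial}{\partial x}\left(\frac{1}{r}\frac{\partial C}{\partial r}\right)$$

which can be developed as

$$\frac{\partial^2 C}{\partial x^2} = \frac{1}{r}\frac{\partial C}{\partial r} + \frac{x^2}{r}\frac{\partial}{\partial r}\left(\frac{1}{r}\frac{\partial C}{\partial r}\right)$$

or

$$\frac{\partial^2 C}{\partial x^2} = \frac{1}{r}\frac{\partial C}{\partial r} - \frac{x^2}{r^3}\frac{\partial C}{\partial r} + \frac{x^2}{r^2}\frac{\partial^2 C}{\partial r^2}$$

Let us write the Laplacian explicitly as a sum of second-order derivatives relative to $x$, $y$, and $z$

$$\frac{\partial^2 C}{\partial x^2} + \frac{\partial^2 C}{\partial y^2} + \frac{\partial^2 C}{\partial z^2} = \Delta C = \nabla^2 C = \frac{3}{r}\frac{\partial C}{\partial r} - \frac{x^2+y^2+z^2}{r^3}\frac{\partial C}{\partial r} + \frac{x^2+y^2+z^2}{r^2}\frac{\partial^2 C}{\partial r^2}$$

or

$$\nabla^2 C = \frac{3}{r}\frac{\partial C}{\partial r} - \frac{r^2}{r^3}\frac{\partial C}{\partial r} + \frac{r^2}{r^2}\frac{\partial^2 C}{\partial r^2}$$

Finally

$$\nabla^2 C = \frac{\partial^2 C}{\partial r^2} + \frac{2}{r}\frac{\partial C}{\partial r} \tag{8.6.2}$$

whence the diffusion equation for radial flux with constant diffusion coefficient can be rewritten

$$\frac{\partial C}{\partial t} = \mathcal{D}\left(\frac{\partial^2 C}{\partial r^2} + \frac{2}{r}\frac{\partial C}{\partial r}\right) \tag{8.6.3}$$

Similar expressions can be derived for cylindrical coordinates.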

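8.6.2 Radial diffusion in the sphere

In the diffusion equation (8.6.3) with radial flux and constant diffusion coefficient, let us introduce the new variable $u(r,t) = u = Cr$

$$\frac{\partial C}{\partial r} = \frac{1}{r}\frac{\partial u}{\partial r} - \frac{u}{r^2} = \frac{1}{r}\left(\frac{\partial u}{\partial r} - \frac{u}{r}\right)$$

and for the second derivative

$$\frac{\partial^2 C}{\partial r^2} = \frac{1}{r}\frac{\partial^2 u}{\partial r^2} - \frac{1}{r^2}\frac{\partial u}{\partial r} - \frac{1}{r^2}\frac{\partial u}{\partial r} + \frac{2u}{r^3} = \frac{1}{r}\left(\frac{\partial^2 u}{\partial r^2} - \frac{2}{r}\frac{\partial u}{\partial r} + \frac{2u}{r^2}\right)$$

therefore

$$\mathcal{D}\left(\frac{\partial^2 C}{\partial r^2} + \frac{2}{r}\frac{\partial C}{\partial r}\right) = \frac{\mathcal{D}}{r}\left[\frac{\partial^2 u}{\partial r^2} - \frac{2}{r}\frac{\partial u}{\partial r} + \frac{2u}{r^2} + \frac{2}{r}\left(\frac{\partial u}{\partial r} - \frac{u}{r}\right)\right]$$

The diffusion equation in the new variable $u$ becomes

$$\frac{\partial C}{\partial t} = \frac{\mathcal{D}}{r}\frac{\partial^2 u}{\partial r^2}$$

or

$$\frac{\partial u}{\partial t} = \mathcal{D}\frac{\partial^2 u}{\partial r^2} \tag{8.6.4}$$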

Let us calculate the concentrations in a sphere with a uniform initial distribution $C_0$ and zero surface concentration. Initial and boundary conditions are $u(r,0) = C_0r$, $u(a,t) = 0$, $u(0,t) = 0$. The diffusion equation in one dimension (8.6.4) admits

$$u(r,t) = \alpha_n\sin n\pi\frac{r}{a}\,\exp\left(-n^2\pi^2\frac{\mathcal{D}t}{a^2}\right)$$

as a particular solution which fulfills the conditions at $r = 0$ and $r = a$ for $n$ integer, $\alpha_n$ being a constant to be determined. Any superposition of solutions with different values of $n$ would also be a solution. The condition at $t = 0$ suggests that if the initial concentration distribution can be expanded in a series of sines, e.g.,

$$u(r,0) = \sum_{n=1}^{\infty}\alpha_n\sin n\pi\frac{r}{a}$$

the general solution is

$$u(r,t) = \sum_{n=1}^{\infty}\alpha_n\sin n\pi\frac{r}{a}\,\exp\left(-n^2\pi^2\frac{\mathcal{D}t}{a^2}\right)$$

In order to make the solution consistent with initial and boundary conditions, we will use for $u(r,0)$ the ramp function defined in Chapter 2. For $0 < r < a$, $u(r,0) = C_0r$, while $u(a,0) = 0$. Using the results of Section 2.6, the Fourier expansion of this function is

$$u(r,0) = -\frac{2aC_0}{\pi}\sum_{n=1}^{\infty}\frac{(-1)^n}{n}\sin n\pi\frac{r}{a}$$

which gives the general solution

$$C(r,t) = -\frac{2aC_0}{\pi r}\sum_{n=1}^{\infty}\frac{(-1)^n}{n}\sin n\pi\frac{r}{a}\,\exp\left(-n^2\pi^2\frac{\mathcal{D}t}{a^2}\right) \tag{8.6.5}$$

The amount of diffusing substance $M(t)$ still present in the sphere at $t$ is obtained upon integration of the concentration times the volume ($4\pi r^2\,dr$) of the infinitesimal shell of thickness $dr$

$$M(t) = \int_0^a C(r,t)\,4\pi r^2\,dr = -2C_0\sum_{n=1}^{\infty}(-1)^n\exp\left(-n^2\pi^2\frac{\mathcal{D}t}{a^2}\right)\int_0^a\frac{a}{n\pi r}\sin n\pi\frac{r}{a}\,4\pi r^2\,dr$$

Let us call $J_n$ the integral on the right-hand side of the last equation. $J_n$ can be integrated by parts as

$$J_n = \int_0^a\frac{a}{n\pi r}\sin n\pi\frac{r}{a}\,4\pi r^2\,dr = \frac{4a}{n}\int_0^a\sin n\pi\frac{r}{a}\,r\,dr = -\frac{4a}{n}\left[\frac{ar}{n\pi}\cos n\pi\frac{r}{a}\right]_0^a + \frac{4a}{n}\int_0^a\frac{a}{n\pi}\cos n\pi\frac{r}{a}\,dr$$

Since the last integral on the right-hand side is zero

$$J_n = -(-1)^n\frac{4a^3}{\pi n^2} \tag{8.6.6}$$

which gives $M(t)$ as

$$M(t) = \frac{8C_0a^3}{\pi}\sum_{n=1}^{\infty}\frac{1}{n^2}\exp\left(-n^2\pi^2\frac{\mathcal{D}t}{a^2}\right)$$

The mean concentration of the sphere therefore is

$$\bar{C}(t) = \frac{M(t)}{4\pi a^3/3} = \frac{6C_0}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{n^2}\exp\left(-n^2\pi^2\frac{\mathcal{D}t}{a^2}\right) \tag{8.6.7}$$

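Again, this solution converges very slowly for small extents of loss, i.e., for small values of $\mathcal{D}t/a^2$. In this case, the solution expressed as an error function series should be used (Appendix 8B)

$$\frac{\bar{C}(t)}{C_0} = 1 - \frac{6\sqrt{\mathcal{D}t}}{a}\left[\frac{1}{\sqrt{\pi}} + 2\sum_{n=1}^{\infty}\mathrm{ierfc}\,\frac{na}{\sqrt{\mathcal{D}t}}\right] + \frac{3\mathcal{D}t}{a^2} \tag{8.6.8}$$

Of course, both equations (8.6.7) and (8.6.8) are valid solutions which only differ by the rate of convergence. Figure 8.20 under the label $\alpha = 0$ illustrates how the solution varies for different values of the parameter $\mathcal{D}t/a^2$. For loss extents < 85 percent, the approximation

$$1 - \frac{\bar{C}(t)}{C_0} \approx \frac{6}{\sqrt{\pi}}\sqrt{\frac{\mathcal{D}t}{a^2}} - \frac{3\mathcal{D}t}{a^2} \tag{8.6.9}$$

gives the lost fraction with at least four exact digits.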
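The agreement between the full series and the small-time approximation can be examined with a few lines of code. The sketch below assumes that the mean concentration follows the series (8.6.7) and that the approximation is the standard two-term small-time form, $6\sqrt{\tau/\pi} - 3\tau$ with $\tau = \mathcal{D}t/a^2$; function names and the number of series terms are illustrative assumptions.

```python
import math


def sphere_mean_series(tau, n_terms=5000):
    """Mean concentration / C0 for a sphere with zero surface concentration; tau = D*t/a**2."""
    total = 0.0
    for n in range(1, n_terms + 1):
        total += math.exp(-((n * math.pi) ** 2) * tau) / n**2
    return 6.0 / math.pi**2 * total


def sphere_loss_small_time(tau):
    """Two-term small-time approximation of the lost fraction (assumed form)."""
    return 6.0 / math.sqrt(math.pi) * math.sqrt(tau) - 3.0 * tau


if __name__ == "__main__":
    for tau in (1e-4, 1e-3, 1e-2, 5e-2, 0.14):
        exact = 1.0 - sphere_mean_series(tau)
        approx = sphere_loss_small_time(tau)
        print(f"tau = {tau:7.4f}  loss (series) = {exact:.5f}  loss (approx.) = {approx:.5f}")
```

The series needs many terms at small $\tau$, which is precisely the regime where the two-term expression is cheap and accurate.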

Figure 8.20 Radial diffusion: mean concentration in a sphere. The curves are labeled for different values of $\alpha$.

In the case where the surface is kept at $C_s$, the solutions just derived hold with $C - C_s$ and $C_0 - C_s$ written in place of $C$ and $C_0$. In particular, we will subsequently make use of the solution

$$\frac{C(r,t) - C_0}{C_s - C_0} = 1 + \frac{2a}{\pi r}\sum_{n=1}^{\infty}\frac{(-1)^n}{n}\sin n\pi\frac{r}{a}\,\exp\left(-n^2\pi^2\frac{\mathcal{D}t}{a^2}\right) \tag{8.6.10}$$

Once equation (8.6.10) is integrated over the sphere for $C_0 = 0$, we get the expression

$$\frac{\bar{C}(t)}{C_s} = 1 - \frac{6}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{n^2}\exp\left(-n^2\pi^2\frac{\mathcal{D}t}{a^2}\right) \tag{8.6.11}$$

which is useful to simulate a 'clean' sphere immersed in a 'dirty' liquid. It may be applied to the uptake of trace-elements by minerals from liquids or to the sorption of rare-gases from a surrounding fluid phase.

8.6.3 Desorption from a sphere into a well-stirred solution

Let us assume that a sphere with radius $a$ is immersed in a liquid of finite volume, e.g., a mineral in a hydrothermal fluid. Diffusion in liquids is normally fast compared to diffusion in solids, so that the liquid can be thought of as homogeneous. Similar conditions would apply to a sphere degassing into a finite enclosure, e.g., for radiogenic argon loss in a closed pore space. Given the diffusion equation with radial flux and constant diffusion coefficient

$$\frac{\partial C}{\partial t} = \mathcal{D}\left(\frac{\partial^2 C}{\partial r^2} + \frac{2}{r}\frac{\partial C}{\partial r}\right)$$

we calculate the evolution of the concentration in a sphere of radius $a$ with uniform initial concentration $C_0$ desorbing the diffusing substance into a well-stirred solution of volume $\omega$ with zero initial concentration. The surface concentration of the sphere is in equilibrium with the solution through a partition coefficient $K$ such as

$$C(a,t) = K\,C_{liq}(t)$$

In addition, mass balance requires that the amount of element leaving the sphere increases the concentration in the surrounding liquid, i.e.,

$$-4\pi a^2\mathcal{D}\left(\frac{\partial C}{\partial r}\right)_{r=a} = \omega\frac{dC_{liq}}{dt} = \frac{\omega}{K}\frac{\partial C(a,t)}{\partial t}$$

This problem requires specific techniques not developed in this chapter, such as Laplace transforms, and the reader interested in the derivation of the solution may refer to the textbook of Crank (1976). Defining $\alpha$ as the final distribution ratio, i.e., the amount of solute contained in the solid divided by the amount contained in the liquid when $t \to \infty$

$$\alpha = \frac{4\pi a^3K}{3\omega} \tag{8.6.12}$$

the solution is

$$\frac{\bar{C}(t)}{C_0} = \frac{\alpha}{1+\alpha} + 6\sum_{n=1}^{\infty}\frac{\exp\left(-q_n^2\mathcal{D}t/a^2\right)}{9\alpha^2 + 9\alpha + q_n^2} \tag{8.6.13}$$

where $q_n$ is the $n$th solution of the equation

$$\tan q_n = \frac{3\alpha q_n}{3\alpha + q_n^2} \tag{8.6.14}$$

an equation we can solve numerically by the Newton method described in Section 3.1. Letting $\omega$ increase indefinitely, $\alpha$ decreases to zero. $\alpha = 0$ corresponds to a situation of very small values of $K$ or very large volume of liquid. Most of the species considered is held by the liquid. In this case, the solutions to the equation

$$\tan q_n = 0$$

are simply $n\pi$ and the solution is identical to that with no surrounding liquid. If $\alpha$ increases to infinity, e.g., because of a very high partition coefficient, the first term on the right-hand side of equation (8.6.13) tends to unity while the second term vanishes. Negligible amounts of elements are transferred from the sphere into the surrounding medium and concentration stays $C_0$. It will be left to the reader to verify that mass balance is verified at equilibrium for a concentration $\bar{C}(\infty)$ given by

$$\bar{C}(\infty) = \frac{\alpha}{1+\alpha}C_0 \tag{8.6.15}$$

The curves of the solution for various values of $\alpha$ have been drawn in Figure 8.20.
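The transcendental equation for the roots $q_n$ lends itself to a simple numerical treatment. The sketch below is a hedged illustration, not the procedure of Section 3.1: it assumes the root equation $\tan q = 3\alpha q/(3\alpha + q^2)$ and the mean concentration $\alpha/(1+\alpha) + 6\sum\exp(-q_n^2\tau)/(9\alpha^2 + 9\alpha + q_n^2)$ with $\tau = \mathcal{D}t/a^2$ (Crank's solution rewritten with $\alpha$ defined as the solid/liquid distribution ratio), and it uses bisection instead of Newton's method because each interval $[n\pi, n\pi + \pi/2]$ brackets exactly one root. Function names are assumptions.

```python
import math


def roots_q(alpha, n_roots):
    """First n_roots positive roots of tan q = 3*alpha*q / (3*alpha + q**2), alpha > 0."""
    if alpha <= 0.0:
        raise ValueError("alpha must be positive; for alpha = 0 the roots are n*pi")

    def f(q):
        # Same zeros as tan q - 3*alpha*q/(3*alpha + q**2), without the poles of tan.
        return (3.0 * alpha + q * q) * math.sin(q) - 3.0 * alpha * q * math.cos(q)

    roots = []
    for n in range(1, n_roots + 1):
        lo = n * math.pi
        hi = lo + 0.5 * math.pi
        f_lo = f(lo)
        for _ in range(80):  # bisection down to machine precision
            mid = 0.5 * (lo + hi)
            f_mid = f(mid)
            if (f_mid < 0.0) == (f_lo < 0.0):
                lo, f_lo = mid, f_mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots


def sphere_mean_stirred(tau, alpha, n_roots=2000):
    """Mean concentration / C0 of a sphere desorbing into a stirred liquid; tau = D*t/a**2."""
    total = 0.0
    for q in roots_q(alpha, n_roots):
        total += math.exp(-q * q * tau) / (9.0 * alpha**2 + 9.0 * alpha + q * q)
    return alpha / (1.0 + alpha) + 6.0 * total


if __name__ == "__main__":
    for alpha in (0.2, 1.0, 5.0):
        row = [round(sphere_mean_stirred(tau, alpha), 4) for tau in (0.001, 0.01, 0.1, 1.0)]
        print(f"alpha = {alpha}: mean C/C0 at tau = 0.001, 0.01, 0.1, 1 -> {row}")
```

With these assumptions the mean concentration starts close to $C_0$ and relaxes to the equilibrium value $\alpha C_0/(1+\alpha)$; the slow convergence of the series at $\tau = 0$ is the same difficulty met with equation (8.6.7).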

8.6.4 The sphere with accumulation of a radiogenic isotope

This problem, in which a radiogenic element is allowed to leak out of its host mineral as it forms, has found important applications in geochronology, particularly for the K-Ar method (Wasserburg, 1954) and the U-Pb method (Tilton, 1960; Wasserburg, 1963) with the so-called continuous loss model. The equation for radial diffusion of a radiogenic element in a sphere with radius $a$ and uniform parent isotope concentration $P = P_0$ at $t = 0$ can be written

$$\frac{\partial C}{\partial t} = \mathcal{D}\left(\frac{\partial^2 C}{\partial r^2} + \frac{2}{r}\frac{\partial C}{\partial r}\right) + \lambda P_0e^{-\lambda t} \tag{8.6.16}$$

where $\lambda$ is the decay constant of the parent isotope. Again, changing variables and using the total concentration $N$ of radiogenic and radioactive isotope such as $N = C + P_0e^{-\lambda t}$ would lead to the equation

$$\frac{\partial N}{\partial t} = \mathcal{D}\left(\frac{\partial^2 N}{\partial r^2} + \frac{2}{r}\frac{\partial N}{\partial r}\right)$$

where $\mathcal{D}$ is the diffusion coefficient of the daughter isotope. However, we rather change the variable $C$ into $u(r,t) = Cr$ as above, which gives

$$\frac{\partial u}{\partial t} = \mathcal{D}\frac{\partial^2 u}{\partial r^2} + \lambda rP_0e^{-\lambda t}$$

The same derivation as that used for the accumulation of radiogenic isotope in a slab would lead to the solution but we will take advantage of this case to fully develop an application of Duhamel's principle (Appendix 8C). The assumption of zero initial and surface concentration of the radiogenic isotope is equivalent to

$$u(r,0) = 0, \qquad u(a,t) = 0$$

Introducing the concentration deficit due to loss of radiogenic element times $r$ as the new variable $v(r,t)$, not to be confused with a velocity, we obtain

$$v(r,t) = rP_0\left(1 - e^{-\lambda t}\right) - u(r,t) \tag{8.6.17}$$

Since the second derivative of $v(r,t) + u(r,t)$ with respect to $r$ is zero, we can rewrite the diffusion equation as

$$\frac{\partial v(r,t)}{\partial t} = \mathcal{D}\frac{\partial^2 v(r,t)}{\partial r^2}$$

The boundary conditions become

$$v(r,0) = 0$$

$$v(a,t) = aP_0\left(1 - e^{-\lambda t}\right) \tag{8.6.18}$$

$$v(0,t) = 0$$

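In order to apply the Duhamel principle, we must retrieve the solution $v(r,t)$ for constant $v(a,t) = v_s(t) = v_s$ from equation (8.6.10). Upon multiplication by $r/a$, the solution reads

$$v(r,t) = v_s\left[\frac{r}{a} + \frac{2}{\pi}\sum_{n=1}^{\infty}\frac{(-1)^n}{n}\sin n\pi\frac{r}{a}\,\exp\left(-n^2\pi^2\frac{\mathcal{D}t}{a^2}\right)\right]$$

$\tau$ being an arbitrary time, we first define the function $g(r,t-\tau)$ as

$$g(r,t-\tau) = \frac{r}{a} + \frac{2}{\pi}\sum_{n=1}^{\infty}\frac{(-1)^n}{n}\sin n\pi\frac{r}{a}\,\exp\left[-n^2\pi^2\frac{\mathcal{D}(t-\tau)}{a^2}\right]$$

We then let surface value $v_s(\tau)$ vary with $\tau$ according to equation (8.6.18). Using Duhamel's principle (Appendix 8C), we find that the solution for the time variable function $v_s(\tau)$ is

$$v(r,t) = \frac{\partial}{\partial t}\int_0^t v_s(\tau)\,g(r,t-\tau)\,d\tau$$

or upon changing $t - \tau$ in $\tau$

$$v(r,t) = \frac{\partial}{\partial t}\int_t^0 v_s(t-\tau)\,g(r,\tau)\,(-d\tau) = \frac{\partial}{\partial t}\int_0^t v_s(t-\tau)\,g(r,\tau)\,d\tau$$

Applying Leibniz's rule to the integral on the right-hand side, we get

$$v(r,t) = v_s(0)\,g(r,t) + \int_0^t\frac{\partial v_s(t-\tau)}{\partial t}\,g(r,\tau)\,d\tau$$

which, since $v_s(0) = 0$, reduces to the integral term. From the definition (8.6.18) of $v_s(t)$

$$\frac{\partial v_s(t-\tau)}{\partial t} = a\lambda P_0e^{-\lambda(t-\tau)}$$

and therefore

$$v(r,t) = a\lambda P_0\int_0^t e^{-\lambda(t-\tau)}\,g(r,\tau)\,d\tau$$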

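Since

$$\int_0^t e^{-\lambda(t-\tau)}\exp\left(-n^2\pi^2\frac{\mathcal{D}\tau}{a^2}\right)d\tau = \frac{\exp\left(-n^2\pi^2\mathcal{D}t/a^2\right) - e^{-\lambda t}}{\lambda - n^2\pi^2\mathcal{D}/a^2}$$

the solution $v(r,t)$ is

$$v(r,t) = rP_0\left(1 - e^{-\lambda t}\right) + \frac{2aP_0}{\pi}\sum_{n=1}^{\infty}\frac{(-1)^n}{n}\sin n\pi\frac{r}{a}\;\frac{\exp\left(-n^2\pi^2\mathcal{D}t/a^2\right) - e^{-\lambda t}}{1 - n^2\pi^2w}$$

From the definition (8.6.17) of $v(r,t)$, we retrieve the concentration $C(r,t)$ as

$$C(r,t) = -\frac{2aP_0}{\pi r}\sum_{n=1}^{\infty}\frac{(-1)^n}{n}\sin n\pi\frac{r}{a}\;\frac{\exp\left(-n^2\pi^2\mathcal{D}t/a^2\right) - e^{-\lambda t}}{1 - n^2\pi^2w}$$

where, as before, the dimensionless parameter $w$ is defined as $w = \mathcal{D}/\lambda a^2$. The amount of radiogenic isotope accumulated in the sphere at time $t$ is

$$M(t) = \int_0^a C(r,t)\,4\pi r^2\,dr$$

Using equation (8.6.6), we obtain the mean concentration of the sphere as

$$\bar{C}(t) = \frac{M(t)}{4\pi a^3/3} = \frac{6P_0}{\pi^2}\sum_{n=1}^{\infty}\frac{\exp\left(-n^2\pi^2\mathcal{D}t/a^2\right) - e^{-\lambda t}}{n^2\left(1 - n^2\pi^2w\right)} \tag{8.6.19}$$

a solution given by Wasserburg (1954). Again the Theorem of Residues could make the formula look marginally better. The solution for short values of $t$ is given by Carslaw and Jaeger (1959, p. 245) but, quite unfortunately, involves error functions with complex arguments and therefore is too complicated to be useful. For fast decay, $\lambda$ is very large and the solution converges towards equation (8.6.7), as required.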

8.7 The diffusion coefficient varies with time

8.7.1 General

When the diffusion coefficient varies with time, the fundamental transformation commonly used consists in defining a new time variable $\tau$ as

$$\tau = \frac{1}{X^2}\int_0^t\mathcal{D}(t')\,dt' \tag{8.7.1}$$

or

$$d\tau = \frac{\mathcal{D}(t)\,dt}{X^2}$$

where $X$ is a length scale (usually the thickness for a slab or the radius for a sphere). We first use the chain rule to obtain

$$\frac{\partial C}{\partial t} = \frac{d\tau}{dt}\frac{\partial C}{\partial\tau} = \frac{\mathcal{D}(t)}{X^2}\frac{\partial C}{\partial\tau}$$

Defining the space variable $\xi = x/X$, we get

$$\frac{\partial C}{\partial x} = \frac{1}{X}\frac{\partial C}{\partial\xi}, \quad\text{and}\quad \frac{\partial^2 C}{\partial x^2} = \frac{1}{X^2}\frac{\partial^2 C}{\partial\xi^2}$$

The diffusion equation can now be rewritten as

$$\frac{\partial C}{\partial\tau} = \frac{\partial^2 C}{\partial\xi^2} \tag{8.7.2}$$

All the solutions with constant diffusion coefficients can therefore be used for problems with time-dependent diffusion coefficients upon replacement of $\mathcal{D}t/X^2$ by $\tau$. An important application of this transformation is the recovery of diffusion coefficients from stepwise heating experiments. Let us imagine minerals which can be considered as spheres of the same size $a$ and initially containing a uniform concentration $C_0$ of a gas, argon for instance. In a stepwise heating experiment, the experimentalist heats the sample at increasingly higher temperatures. At the $k$th heating step, the diffusion coefficient $\mathcal{D}_k$ remains constant between $t_{k-1}$ and $t_k$. The remaining fraction of gas in the mineral $\bar{C}_k(t)/C_0$, which can be calculated once all the gas is ultimately extracted at the end of the experiment, can be matched with a unique value of $\tau_k$ through

$$\frac{\bar{C}_k}{C_0} = \frac{6}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{n^2}\exp\left(-n^2\pi^2\tau_k\right)$$

or the corresponding expression for small times. Since the diffusion coefficient is constant by steps, we can write

$$\tau_{k+1} = \tau_k + \frac{\mathcal{D}_{k+1}(t_{k+1} - t_k)}{a^2}$$

and therefore

$$\tau_{k+1} - \tau_k = \frac{\mathcal{D}_{k+1}(t_{k+1} - t_k)}{a^2} \tag{8.7.3}$$

where $t_{k+1} - t_k$ is the duration of the $(k+1)$th heating step. Plotting $\ln[(\tau_{k+1} - \tau_k)/(t_{k+1} - t_k)]$ vs $1/T$ (Arrhenius plot) should provide a straight line with slope $-E/R$ and intercept $\ln(\mathcal{D}_0/a^2)$. If the duration $\Delta t$ of all steps is the same, plotting $\ln(\tau_{k+1} - \tau_k)$ vs $1/T$ should also provide a straight line with slope $-E/R$ and intercept $\ln(\mathcal{D}_0\Delta t/a^2)$.

In order to determine the $^{39}$Ar-$^{40}$Ar age of the 15415 lunar anorthosite, Turner (1972) irradiated the plagioclase from this sample in a fast-neutron producing reactor. In addition to Ar isotopes being produced by neutron reaction, $^{37}$Ar was also produced by a $^{40}$Ca(n,$\alpha$)$^{37}$Ar nuclear reaction. Argon was progressively extracted from the sample during 1 h heating steps at increasing temperature. The fraction of the total $^{37}$Ar extracted from the sample at each temperature is listed in Table 8.3. Draw the Arrhenius plot for $^{37}$Ar diffusion assuming spherical feldspar grains with radius $a$ and homogeneous Ca distribution.

Table 8.3. Experimental results on 15415 lunar anorthosite (Turner, 1972). Cumulated fraction $1 - F_k$ of $^{37}$Ar outgassed at each temperature step. $k = 0$ refers to the undegassed state.

Step k       1        2       3       4       5       6       7
t (°C)       600      700     800     900     1000    1200    1400
1 - F_k      0.0076   0.030   0.106   0.320   0.611   0.892   1.000

The fraction of total $^{37}$Ar degassed after the $k$th heating step is

$$\frac{^{37}\mathrm{Ar}_{\mathrm{outgassed},k}}{^{37}\mathrm{Ar}_{\mathrm{total}}} = 1 - F_k = 1 - \frac{6}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{n^2}\exp\left(-n^2\pi^2\tau_k\right)$$

where, since it is a stepwise heating protocol

$$\tau_k = \frac{1}{a^2}\sum_{j=1}^{k}\mathcal{D}_j(t_j - t_{j-1})$$

(with $t_0 = 0$) and, therefore, equation (8.7.3) applies. We have shown in Section 3.1 how to extract $\tau_k$ values from the equation above using a Newton iterative scheme. The last heating step cannot be used since we should then be able to increase temperature to infinity in order to extract 100 percent of the argon present in the crystals. Next, the differences $\tau_{k+1} - \tau_k$ can be calculated (with $\tau_0 = 0$) and, since the duration of each step is constant, their logarithm plotted directly against the reciprocal of the absolute temperature. The complete results (see Albarede, 1978) are listed in Table 8.4 and plotted in Figure 8.21. As commonly observed, the high-temperature step (here 1200 °C) significantly deviates from the linear trend observed at lower temperature. A regression on the points up to 1000 °C gives a slope of -25 500 K corresponding to an activation energy $E = 25\,500 \times 8.3144 \approx 212$ kJ mol$^{-1}$ and an intercept of 16.9. Albarede (1978) has shown that stepwise heating data can be inverted to recover the spatial distribution of radiogenic, planetary and spallogenic components. This topic was covered in Section 5.6.
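The worked example can be reproduced with a short script. The sketch below is an illustration under stated assumptions: the outgassed fractions are those of Table 8.3, the loss from a sphere is computed with the series of equation (8.6.7), $\tau_k$ is found by bisection on $\ln\tau$ (rather than by the Newton scheme of Section 3.1), and the straight line is an ordinary least-squares fit through the first five steps. All function and variable names are hypothetical.

```python
import math

R_GAS = 8.3144  # J mol^-1 K^-1, the value used in the text

# Table 8.3 (Turner, 1972): temperature (deg C) and cumulated outgassed fraction of 37Ar.
TEMPERATURES_C = [600, 700, 800, 900, 1000, 1200]
OUTGASSED = [0.0076, 0.030, 0.106, 0.320, 0.611, 0.892]


def outgassed_fraction(tau, n_terms=3000):
    """Fraction lost from a sphere with zero surface concentration; tau = D*t/a**2."""
    s = sum(math.exp(-((n * math.pi) ** 2) * tau) / n**2 for n in range(1, n_terms + 1))
    return 1.0 - 6.0 / math.pi**2 * s


def invert_tau(fraction, lo=1e-7, hi=5.0):
    """Solve outgassed_fraction(tau) = fraction by bisection on ln(tau)."""
    a, b = math.log(lo), math.log(hi)
    for _ in range(50):
        m = 0.5 * (a + b)
        if outgassed_fraction(math.exp(m)) < fraction:
            a = m
        else:
            b = m
    return math.exp(0.5 * (a + b))


def arrhenius_points():
    """Return (1/T, ln(tau_k - tau_{k-1})) for each step, with tau_0 = 0 and T in kelvin."""
    points, previous = [], 0.0
    for t_c, frac in zip(TEMPERATURES_C, OUTGASSED):
        tau = invert_tau(frac)
        points.append((1.0 / (t_c + 273.15), math.log(tau - previous)))
        previous = tau
    return points


def fit_line(points):
    """Ordinary least-squares slope and intercept."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx


if __name__ == "__main__":
    pts = arrhenius_points()
    slope, intercept = fit_line(pts[:5])  # steps up to 1000 deg C only
    print("slope (K):", round(slope), " intercept:", round(intercept, 2))
    print("activation energy (kJ/mol):", round(-slope * R_GAS / 1000.0, 1))
```

Because the steps all last one hour, the logarithm of the increment of $\tau$ is regressed directly against $1/T$; the slope and intercept should come out close to the values quoted in the text.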


Table 8.4. Derivation of the Arrhenius coordinates for the $^{37}$Ar outgassing of the 15415 lunar anorthosite. Temperatures $t$ in °C and $T$ in K.

Step k+1   t (°C)   1000/T   Fraction outgassed   ln τ(k+1)   ln[τ(k+1) − τ(k)]
1          600      1.15     0.0076               -12.20      -12.20
2          700      1.03     0.03                 -9.45       -9.52
3          800      0.93     0.106                -6.87       -6.95
4          900      0.85     0.32                 -4.52       -4.62
5          1000     0.79     0.611                -2.98       -3.22
6          1200     0.68     0.892                -1.74       -2.08

Figure 8.21 Arrhenius plot (logarithm of the $\tau$ increments vs $1000/T$) of the $^{37}$Ar outgassed from the lunar anorthosite 15415 (Turner, 1972). Only steps 1-5 are taken into account in calculating the least-square straight-line parameters.

8.7.2 Cooling ages

The strong temperature dependence of the diffusion coefficient suggests that when a system cools down, chemical equilibration and loss of radiogenic isotopes freeze over a short time interval. The closure temperature of a chemical system is the temperature at which the inward and outward fluxes of atoms or isotopes involved in exchange reactions with surrounding minerals and fluids fall below a critical level. For instance, exchange of oxygen isotopes in an igneous or hydrothermal system continues well below the temperature of crystallization, and oxygen thermometers record temperatures that are interpreted as those of the end of isotopic exchange. The cooling age of a chronometric system, such as $^{40}$K-$^{40}$Ar, is a measure of the time at which temperature drops below the closure temperature for the exchange of radioactive and radiogenic isotopes, here $^{40}$Ar, with the surrounding medium. These concepts have been covered in detail in a classical paper by Dodson (1973) but the account presented here will follow a different line of argument.

Dodson assumes that the end of chemical and isotopic exchanges involving a labile species ($^{18}$O, $^{40}$Ar, ...) coincides with the diffusion coefficient $\mathcal{D}$ of the radiogenic isotope dropping below a uniquely defined critical value. This critical value is a function of the geometry of the system and of the cooling rate, or equivalently of the rate at which the diffusion coefficient changes with time. In terms of dimensionless parameters, we can write the requirement for a closed spherical mineral with radius $a$ as

$$\frac{\mathcal{D}\theta}{a^2} < \frac{1}{A}$$

where $A$ is a constant depending only on geometry and $\theta$ a time scaling constant to be defined and characteristic of the loss process (the symbol $\tau$ used by Dodson will be avoided as it would collide with the variable defined for time-dependent diffusion coefficients). Dodson chooses the constant $\theta$ as the time necessary to decrease the diffusion coefficient by a factor of e, i.e.,

$$\frac{1}{\theta} = -\frac{d\ln\mathcal{D}}{dt}$$

the minus being a result of $\mathcal{D}$ decreasing with time, which results in

$$\mathcal{D} < -\frac{a^2}{A}\frac{d\ln\mathcal{D}}{dt}$$

The critical value $\mathcal{D}_c$ of the diffusion coefficient is the value of $\mathcal{D}$ which makes this relationship an equality and is related to the closure temperature $T_c$ through the Arrhenius equation (8.4.4). Therefore

$$\mathcal{D}_c = \mathcal{D}_0\exp\left(-\frac{E}{RT_c}\right) = -\frac{a^2}{A}\frac{d\ln\mathcal{D}}{dt}$$

or

$$\frac{1}{T_c} = -\frac{R}{E}\ln\left(-\frac{a^2}{A\mathcal{D}_0}\frac{d\ln\mathcal{D}}{dt}\right)$$

and, finally

$$T_c = \frac{E}{R\,\ln\left(A\mathcal{D}_0\theta/a^2\right)} \qquad (8.7.4)$$

an expression derived by Dodson through a different set of arguments. $\theta$ is treated by Dodson as a constant, which is by no means critical. Since $\theta$ appears in a logarithm, the closure temperature has only a weak dependence on the cooling rate. He suggests that values of $A$ equal to 55, 27, and 8.7 for the sphere, the cylinder, and the plane sheet, respectively, ensure a system tightly closed to the loss of radiogenic atoms.

Actually, a single closure temperature, or even a narrow temperature range, characteristic of the diffusing species in a particular mineral phase independent of the cooling history of the mineral may simply not exist. Whether we are dealing with stable or radiogenic isotopes, a measure of the closedness of a system to an isotopic or atomic species is the relative rate of loss (e.g., the fraction lost per Ma) $d\ln F/dt$, where $F$ is the fraction of atoms still present. This quantity is equivalent to the probability for an atom to escape from the system in this unit time and is the reciprocal of its residence time. We are going to relate $d\ln F/dt$ to the temperature for a system which holds a given amount of atoms or isotopes and deduce a temperature threshold which separates the open from the closed system. As in Dodson's theory, we assume that a closure temperature would not depend on the rate of radioactive accumulation and carry out the calculation for a stable species, but the present analysis introduces explicitly the rate of loss. We will work out the solution for the case of a spherical mineral and radial diffusion holding initially a homogeneously distributed amount of a labile isotope (e.g., $^{40}$Ar). This calculation can be easily extended to other geometries. Given the dimensionless time $\tau$ defined as

$$\tau = \frac{1}{a^2}\int_0^t \mathcal{D}(t')\,dt'$$

we have seen (Section 8.6.2) that, for loss extents up to 85 percent, the relationship between $F$ and $\tau$ can be accurately approximated by equation (8.6.9) as

$$F \approx 1 - 6\sqrt{\frac{\tau}{\pi}} + 3\tau$$

Let us solve this relationship for $\tau$. Switching to an equality sign, we write this relationship as a second-degree polynomial in $\sqrt{\tau}$

$$3\tau - \frac{6}{\sqrt{\pi}}\sqrt{\tau} + (1 - F) = 0$$

Taking the root that vanishes when there is no loss ($F = 1$) gives

$$\sqrt{\pi\tau} = 1 - \sqrt{1 - (1-F)\frac{\pi}{3}}$$

By chain rule,

$$\frac{d\ln F}{dt} = \frac{1}{F}\frac{dF}{d\tau}\frac{d\tau}{dt} = \frac{1}{F}\frac{dF}{d\tau}\frac{\mathcal{D}}{a^2}$$

and therefore

$$\frac{d\ln F}{dt} = \frac{3}{F}\left(1 - \frac{1}{\sqrt{\pi\tau}}\right)\frac{\mathcal{D}}{a^2}$$

or, substituting for $\sqrt{\pi\tau}$ from above,

$$\frac{d\ln F}{dt} = -\frac{3}{F}\,\frac{\sqrt{1-(1-F)\pi/3}}{1-\sqrt{1-(1-F)\pi/3}}\,\frac{\mathcal{D}_0}{a^2}\exp\left(-\frac{E}{RT}\right)$$

In the manner of Dodson, let us define $A(F)$ as

$$A(F) = \frac{3}{F}\,\frac{\sqrt{1-(1-F)\pi/3}}{1-\sqrt{1-(1-F)\pi/3}}$$

which produces the more compact notation

$$-\frac{d\ln F}{dt} = A(F)\,\frac{\mathcal{D}_0}{a^2}\exp\left(-\frac{E}{RT}\right)$$

a relationship which can be rearranged into

$$T = \frac{E}{R\,\ln\left[\dfrac{A(F)\,\mathcal{D}_0/a^2}{-d\ln F/dt}\right]} \qquad (8.7.6)$$

At a given $F$, we can therefore ascribe a temperature to the maximum relative rate of loss for a system to be closed and a temperature to the minimum relative rate of loss for a system to be open. This interval may be thought of as the closure temperature interval. The relationship (8.7.6) between temperature and the rate of loss has been calculated for amphibole and orthoclase crystals, both 50 µm in diameter, which lost 0.1, 1, 10 and 50 percent of the diffusing species initially present (Figure 8.22). The diffusion constants are those of Harrison (1981) for amphibole and Foland (1974) for orthoclase. For a nearly untouched amphibole which has lost only 0.1 percent of its initial content of labile atoms, the system would switch from a fully open regime of 10 percent loss per million years at 720 K (point A) to a rather tight state of 1 percent loss per billion years at 580 K (point B), i.e., over a temperature range in excess of 100 degrees. A different amphibole that lost 50 percent of its labile atoms would close over approximately a similar temperature range (interval A'-B'), although at temperatures about 80 K higher than in the 0.1 percent loss case. This is because the average probability for a labile atom to escape from the crystal decreases very quickly as soon as loss has started. Physically, the progressive formation of a depleted diffusion rim around the crystal tends to slow down subsequent loss. Formally, the factor $A(F)$ decreases very fast for small losses, i.e., $F \approx 1$. The tighter a system, the larger is its closure temperature interval. Although the analysis would certainly be slightly different for the accumulation of radiogenic atoms in a mineral instead of the simple closure to the loss of a stable species, it is intuitively acceptable that if a closure temperature exists, it is the same for a radiogenic isotope (e.g., $^{40}$Ar) and a non-radiogenic isotope (e.g., $^{36}$Ar or $^{39}$Ar) from the same element.
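To get a feeling for equations (8.7.4) and (8.7.6), the sketch below evaluates the geometry factor $A(F)$ and the temperature associated with a given relative rate of loss. The activation energy, frequency factor, and grain radius are hypothetical placeholders chosen for illustration, not the Harrison (1981) or Foland (1974) constants, so the printed temperatures are not expected to reproduce Figure 8.22. The function names are introduced here and are not part of the original text.

```python
import math

R = 8.3144  # gas constant, J mol^-1 K^-1


def geometry_factor(F):
    """A(F) for a sphere, F being the fraction still present (valid for F > 0.15)."""
    s = math.sqrt(1.0 - (1.0 - F) * math.pi / 3.0)
    return 3.0 / F * s / (1.0 - s)


def temperature_for_loss_rate(F, rate, E, D0, a):
    """Equation (8.7.6): temperature (K) at which -dlnF/dt equals `rate` (s^-1)."""
    return E / (R * math.log(geometry_factor(F) * D0 / a**2 / rate))


def dodson_closure_temperature(theta, E, D0, a, A=55.0):
    """Equation (8.7.4); A = 55 is Dodson's value for the sphere."""
    return E / (R * math.log(A * D0 * theta / a**2))


# Hypothetical inputs (assumptions for illustration only)
E = 250e3           # activation energy, J/mol
D0 = 1e-6           # frequency factor, m^2/s
a = 25e-6           # grain radius, m
MA = 3.156e13       # seconds in one million years

for F in (0.999, 0.5):
    T_open = temperature_for_loss_rate(F, 0.1 / MA, E, D0, a)     # 10 % per Ma
    T_closed = temperature_for_loss_rate(F, 1e-5 / MA, E, D0, a)  # 1 % per Ga
    print(f"F = {F}: A(F) = {geometry_factor(F):.1f}, "
          f"open at {T_open:.0f} K, closed at {T_closed:.0f} K")
```

Setting $A = A(F)$ and $\theta$ equal to the reciprocal of the loss rate makes the two functions return the same temperature, which illustrates that the two expressions share the same form.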
Differences may arise because the protective depleted rim cannot form until enough radiogenic atoms have accumulated and therefore the interior of the system has closed. In a global sense, however, closure temperatures seem to depend significantly on the cooling history, and the thermal aspects of the cooling age theory must be applied to geological problems with the utmost care. The modeling of Ar outgassing from K-feldspars assumed to have coexisting domains with different sizes has been recently carried out by Lovera et al. (1989). This method seems very promising for the reconstruction of thermal history and vertical movements in young mountain belts.

Figure 8.22 Closure temperature $T_c$ as a function of the rate of loss (fraction lost per Ma) for a spherical geometry [equation (8.7.6)]. The numbers on the curves are for different fractions lost by the mineral. Amphibole data from Harrison (1981), orthoclase data from Foland (1974). Points A, A', B, B': see text.

8.8 Two useful steady-state solutions

A conservative property is at steady-state when fluxes, sources, and sinks do not change with time. It is not to be confused with equilibrium, which is a state with no flux, no source, and no sink. The general transport equation (8.4.3) of element $i$ at steady-state, i.e., with its time derivative set to zero, receives a certain number of important applications. Steady-state fractionation of a trace element during crystal growth is described in Chapter 9 and two examples from the hydrous environment will be described below.

8.8.1 Early diagenesis: sulfate reduction

Sediment deposition on the seafloor traps interstitial water. After deposition, complex reactions take place in the sediment, most of them fueled by the decay of organic matter, such as sulfate reduction, denitrification, ... Because of fast diffusion rates of most cations in seawater, the presence of interstitial water makes exchange between overlying sedimentary layers a much easier process than if sediment deposition were dry. The book by Berner (1980) is entirely dedicated to these processes and only a short example is given here.

Let us consider sulfate reduction by bacterial activity at the expense of decaying solid organic matter. Berner suggests the simplified equation

$$\mathrm{SO_4^{2-}} + 2\,\mathrm{CH_2O} \rightarrow \mathrm{H_2S} + 2\,\mathrm{HCO_3^-}$$

where CH$_2$O represents the organic compounds. Let us call $C^c$ the volume concentration of organic (reduced) carbon per volume of sediment (solid + interstitial liquid), supposed to be locked in the solid fraction: molecular diffusion is neglected ($\mathcal{D}^c = 0$) and organic carbon flux is entirely advective. The transport equation for organic carbon is

$$\frac{\partial[(1-\phi)C^c]}{\partial t} = \mathcal{D}^c\frac{\partial^2[(1-\phi)C^c]}{\partial z^2} - v\frac{\partial[(1-\phi)C^c]}{\partial z} + A^c = 0$$

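where $\phi$ is the porosity. We further assume that carbon disappears with first-order kinetics, i.e.,

$$A^c = -k(1-\phi)C^c \qquad (8.8.1)$$

where $k$ is the kinetic factor. After simplification, this equation is integrated into

$$C^c = C_0^c\,e^{-kz/v} \qquad (8.8.2)$$

with $C_0^c$ being the surface concentration (mol kg$^{-1}$ of solid sediment) of organic carbon. Reduction of one atom of sulfur in interstitial water requires oxidation of two atoms of organic carbon from the sediment and care must be taken that conservation is written between numbers of atoms. Sulfate is destroyed with first-order kinetics

$$A^{\mathrm{SO_4}} = -\frac{k}{2}\,\frac{1-\phi}{\phi}\,\rho_{sol}\,C^c$$

where $\rho_{sol}$ is the density of the solid fraction, which converts the carbon content per mass of solid into a concentration per volume of pore water. Neglecting the movement of water relative to the surrounding sediment, we write the steady-state transport equation in one dimension with burial, e.g., in a medium moving downwards with the burial velocity $v$

$$\frac{\partial C^i}{\partial t} = \mathcal{D}^i\frac{\partial^2 C^i}{\partial z^2} - v\frac{\partial C^i}{\partial z} + A^i = 0 \qquad (8.8.3)$$

In this equation, $C^i$ is the concentration of element $i$ in pore water at depth $z$ below the seafloor and $A^i$ is a reaction (sink and source) term. For reactions involving the oxidation of organic matter, $A^i$ can be evaluated independently. For constant porosity $\phi$, the sulfate transport equation becomes

$$\mathcal{D}^{\mathrm{SO_4}}\frac{d^2C^{\mathrm{SO_4}}}{dz^2} - v\frac{dC^{\mathrm{SO_4}}}{dz} = \frac{k}{2}\,\frac{1-\phi}{\phi}\,\rho_{sol}\,C_0^c\,e^{-kz/v}$$

where the symbol for partial derivatives has been dropped and the term on the right-hand side represents the sulfate sink. Solving the homogeneous differential equation leads to an exponential term in $vz/\mathcal{D}^{\mathrm{SO_4}}$. This term being unbounded when $z\to\infty$, its coefficient in the solution is necessarily zero. The exponential on the right-hand side therefore suggests trying a solution in the form

$$C^{\mathrm{SO_4}}(z) = \alpha\,e^{-kz/v} + \beta$$

Inserting this expression into the transport equation gives

$$\alpha\left(\mathcal{D}^{\mathrm{SO_4}}\frac{k^2}{v^2} + k\right)e^{-kz/v} = \frac{k}{2}\,\frac{1-\phi}{\phi}\,\rho_{sol}\,C_0^c\,e^{-kz/v}$$

which, as expected, cancels the exponential terms. Rearranging

$$\alpha = \frac{v^2}{v^2 + k\mathcal{D}^{\mathrm{SO_4}}}\,\frac{1-\phi}{2\phi}\,\rho_{sol}\,C_0^c$$

$\beta$ is equal to the sulfate concentration $C_\infty^{\mathrm{SO_4}}$ deep in the sedimentary pile. It can be determined by making concentration at $z = 0$ equal to seawater concentration

$$C_\infty^{\mathrm{SO_4}} = C_{sw}^{\mathrm{SO_4}} - \alpha$$

which finally gives

$$C^{\mathrm{SO_4}}(z) = C_{sw}^{\mathrm{SO_4}} - \frac{v^2}{v^2 + k\mathcal{D}^{\mathrm{SO_4}}}\,\frac{1-\phi}{2\phi}\,\rho_{sol}\,C_0^c\left(1 - e^{-kz/v}\right) \qquad (8.8.4)$$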

Table 8.5. Sulfate concentrations $C^{\mathrm{SO_4}}$ (mmol l$^{-1}$) at depth $z$ (cm) in pore waters from the Saanich Inlet, British Columbia (Murray et al., 1978).

z (cm)               0      1      4      6      9      13     18     30
C(SO4) (mmol/l)      25.8   21.7   14.4   9.9    4.8    2.6    0.9    0.1

Figure 8.23 Sulfate concentrations in pore waters as a function of the depth below the water-sediment interface of the Saanich Inlet (Murray et al., 1978). The exponential curve supports the diffusional diagenetic model.

This model requires an excess of sulfate over reducible carbon. Concentrations may be measured in solutions squeezed from sediment cores, diffusion coefficients are known from standard chemical data tables, and sedimentation rates are determined from $^{14}$C, $^{210}$Pb, or $^{230}$Th dating. Therefore, this model finds its best use in the recovery of the kinetics of organic matter decay. A discussion of this and similar equations and numerical applications may be found in Berner (1980).

Murray et al. (1978) measured sulfate concentrations in pore water from the Saanich Inlet (British Columbia) and obtained the data listed in Table 8.5. Calculate the reaction rate constant and the content of organic carbon in surface sediment using $v = 3.3\times10^{-8}$ cm s$^{-1}$, $\mathcal{D}^{\mathrm{SO_4}} = 2\times10^{-6}$ cm$^2$ s$^{-1}$, $\phi = 0.93$, $\rho_{sol} = 2.7$ kg l$^{-1}$.

We assume that $C_\infty^{\mathrm{SO_4}}$ is negligible. A plot of $\ln C^{\mathrm{SO_4}}$ vs the depth $z$ gives a fairly good straight line (Figure 8.23) corresponding to

$$C^{\mathrm{SO_4}} = 26.6\,e^{-0.184z}$$

The rate constant can be calculated from the logarithmic slope

$$k = 0.184\,v = 0.184\times3.3\times10^{-8} \approx 6.1\times10^{-9}\ \mathrm{s^{-1}}$$

The pre-exponential term includes the surface concentration of organic carbon $C_0^c$

$$\frac{v^2}{v^2 + k\mathcal{D}^{\mathrm{SO_4}}}\,\frac{1-\phi}{2\phi}\,\rho_{sol}\,C_0^c = 26.6$$

and therefore

$$C_0^c = 26.6\times2\times\frac{v^2 + k\mathcal{D}^{\mathrm{SO_4}}}{v^2}\,\frac{\phi}{1-\phi}\,\frac{1}{\rho_{sol}} = 53.2\times\frac{0.93}{0.07}\times\frac{1 + 0.184\times200/3.3}{2.7}$$

hence

$$C_0^c = 3200\ \mathrm{mmol\,kg^{-1}} = 3.2\times12\times10^{-3}\ \mathrm{g/g} = 3.8\ \text{weight \% organic C}$$
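The arithmetic of this problem is easy to reproduce. The sketch below first fits an unweighted straight line to $\ln C^{\mathrm{SO_4}}$ vs $z$ using all eight points of Table 8.5 (an assumption: the text does not say which points or weights were used, so the fitted constants need not match 26.6 and 0.184 exactly), then evaluates $k$ and $C_0^c$ from the values quoted in the text. The function names are introduced here for illustration.

```python
import math

# Table 8.5 (Murray et al., 1978): depth in cm, sulfate in mmol/l
z = [0, 1, 4, 6, 9, 13, 18, 30]
c_so4 = [25.8, 21.7, 14.4, 9.9, 4.8, 2.6, 0.9, 0.1]

v = 3.3e-8       # burial velocity, cm/s
D = 2e-6         # sulfate diffusion coefficient, cm^2/s
phi = 0.93       # porosity
rho_sol = 2.7    # density of the solid fraction, kg/l


def fit_exponential(depth, conc):
    """Least-squares line through ln(conc) vs depth.

    Returns (pre-exponential term, decay constant in cm^-1)."""
    y = [math.log(c) for c in conc]
    n = len(depth)
    mean_x = sum(depth) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in depth)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(depth, y))
    slope = sxy / sxx
    return math.exp(mean_y - slope * mean_x), -slope


def surface_organic_carbon(pre_exp, decay):
    """Rate constant k (s^-1) and surface organic carbon C0 (mmol/kg)."""
    k = decay * v
    c0 = 2.0 * pre_exp * (v**2 + k * D) / v**2 * phi / (1.0 - phi) / rho_sol
    return k, c0


pre_fit, decay_fit = fit_exponential(z, c_so4)
print(f"unweighted fit: C = {pre_fit:.1f} exp(-{decay_fit:.3f} z)")

k_book, c0_book = surface_organic_carbon(26.6, 0.184)
print(f"k = {k_book:.2e} s^-1")
print(f"C0 = {c0_book:.0f} mmol/kg = {c0_book * 12e-3 / 10:.1f} wt % C")
```

The second call uses the constants of the text, so it should land close to the 3200 mmol kg$^{-1}$ quoted above.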

Murray et al. (1978) found that this value is a factor of 4 larger than what is actually measured and suggest that methane upward diffusion accounts for the missing carbon.

8.8.2 The advection-diffusion model in the water column

Some easily adsorbable metals in the ocean are removed from the water column by falling particles with strong surface reactivity such as oxy-hydroxides: this is the case of many transition elements, the rare-earth elements, thorium, ... For the sake of simplicity, first-order adsorption kinetics is commonly assumed. Likewise, dissolved radiocarbon is removed from sea water by radioactive decay: the physical removal process is different, but still $^{14}$C atoms decay with first-order kinetics. On the scale of the ocean, molecular diffusion is an inefficient transfer process. However, turbulent transfer in the water column is commonly described via the same phenomenological (i.e., formal) equation: at a given locality in the ocean, the turbulent or eddy diffusivity $\bar{\mathcal{D}}$ describes how fast eddies are transported. It also measures the efficiency of the transport down the concentration gradient in much the same way as the diffusion coefficient in Fick's law. It is larger by several orders of magnitude and, being associated with bulk material transport, is identical for all the elements. Craig (1969, 1974) proposes a one-dimensional model with first-order removal kinetics

$$\bar{\mathcal{D}}\frac{d^2C}{dz^2} + v\frac{dC}{dz} - kC = 0 \qquad (8.8.5)$$

where $z$ is the depth below the ocean surface, $v$ the upwelling velocity, and $k$ a kinetic coefficient. The eddy diffusion coefficient is overlined in order to stress that it represents a turbulent flow property. For a conservative species, the reaction term is zero. The characteristic equation of the differential equation (8.8.5) has two roots given by

$$r = \frac{-v \pm \sqrt{v^2 + 4k\bar{\mathcal{D}}}}{2\bar{\mathcal{D}}}$$

Defining the mixing length $l_m = \bar{\mathcal{D}}/v$ and the scavenging length $l_s = v/k$, equation (8.8.5) becomes

$$\frac{d}{dz}\left[l_m\frac{dC}{dz} + C\right] = \frac{C}{l_s}$$

where the term between brackets represents the ratio of dissolved flux to upwelling velocity (reduced flux) at depth $z$. Introducing the parameter $\varepsilon$ such that

$$\varepsilon = \sqrt{1 + \frac{4l_m}{l_s}}$$

the roots of the characteristic equation take the simple form

$$r = \frac{-1 \pm \varepsilon}{2l_m}$$

and the solution therefore is

$$C(z) = \alpha\exp\left[-\frac{(1-\varepsilon)z}{2l_m}\right] + \beta\exp\left[-\frac{(1+\varepsilon)z}{2l_m}\right] \qquad (8.8.6)$$

where $\alpha$ and $\beta$ are two constants to be determined from the boundary conditions. In order to make notation compact we introduce the hyperbolic cosine and sine functions defined as

$$\cosh x = \frac{e^x + e^{-x}}{2} \quad\text{and}\quad \sinh x = \frac{e^x - e^{-x}}{2}$$

These functions satisfy

$$\cosh 0 = 1,\qquad \sinh 0 = 0,\qquad (\cosh x)' = \sinh x \quad\text{and}\quad (\sinh x)' = \cosh x$$

After a little manipulation, we get the alternative form

$$C(z) = \exp\left(-\frac{z}{2l_m}\right)\left(a\cosh\frac{\varepsilon z}{2l_m} + b\sinh\frac{\varepsilon z}{2l_m}\right) \qquad (8.8.7)$$

where the constants to be determined from the boundary conditions become $a = \alpha + \beta$ and $b = \alpha - \beta$. Although alternative sets of conditions could be both physically meaningful and tractable, we assume that concentrations are known at the top ($z = 0$) and the bottom ($z = Z$) of the scavenged layer, giving

$$C(0) = a$$

and

$$b = \left[C(Z)\exp\frac{Z}{2l_m} - C(0)\cosh\frac{\varepsilon Z}{2l_m}\right]\Big/\sinh\frac{\varepsilon Z}{2l_m}$$

Inserting these values of $a$ and $b$ into the expression (8.8.7), we get

$$C(z) = \frac{C(0)\exp\left(-\dfrac{z}{2l_m}\right)\sinh\dfrac{\varepsilon(Z-z)}{2l_m} + C(Z)\exp\left(\dfrac{Z-z}{2l_m}\right)\sinh\dfrac{\varepsilon z}{2l_m}}{\sinh\dfrac{\varepsilon Z}{2l_m}} \qquad (8.8.8)$$

where use has been made of the identity

$$\sinh(u - v) = \sinh u\cosh v - \cosh u\sinh v$$

We deduce the general expression of the reduced flux of dissolved element at $z$

$$l_m\frac{dC(z)}{dz} + C(z) = \frac{C(z)}{2} - \frac{\varepsilon}{2}\,\frac{C(0)\exp\left(-\dfrac{z}{2l_m}\right)\cosh\dfrac{\varepsilon(Z-z)}{2l_m} - C(Z)\exp\left(\dfrac{Z-z}{2l_m}\right)\cosh\dfrac{\varepsilon z}{2l_m}}{\sinh\dfrac{\varepsilon Z}{2l_m}} \qquad (8.8.9)$$
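Equations (8.8.8) and (8.8.9) are straightforward to evaluate numerically. The sketch below assumes the sign convention of equation (8.8.5) as written above (depth $z$ positive downwards, upwelling velocity $v$ positive) and works with lengths scaled to the layer thickness, so that $Z = 1$. The function names and the sample values $l_m/Z = 0.2$, $C(0) = 2$, $C(Z) = 10$ are illustrative choices.

```python
import math


def _parameters(l_m, l_s):
    """Return epsilon and the inverse length 1/(2 l_m)."""
    eps = math.sqrt(1.0 + 4.0 * l_m / l_s)
    return eps, 1.0 / (2.0 * l_m)


def concentration(z, Z, l_m, l_s, c_top, c_bottom):
    """Equation (8.8.8): steady-state concentration at depth z."""
    eps, u = _parameters(l_m, l_s)
    top = c_top * math.exp(-u * z) * math.sinh(eps * u * (Z - z))
    bottom = c_bottom * math.exp(u * (Z - z)) * math.sinh(eps * u * z)
    return (top + bottom) / math.sinh(eps * u * Z)


def reduced_flux(z, Z, l_m, l_s, c_top, c_bottom):
    """Equation (8.8.9): l_m dC/dz + C at depth z."""
    eps, u = _parameters(l_m, l_s)
    cosh_terms = (c_top * math.exp(-u * z) * math.cosh(eps * u * (Z - z))
                  - c_bottom * math.exp(u * (Z - z)) * math.cosh(eps * u * z))
    c = concentration(z, Z, l_m, l_s, c_top, c_bottom)
    return 0.5 * c - 0.5 * eps * cosh_terms / math.sinh(eps * u * Z)


Z, L_M, C_TOP, C_BOTTOM = 1.0, 0.2, 2.0, 10.0

for l_s in (0.1, 0.5, 10.0):
    print(f"l_s/Z = {l_s}")
    for i in range(0, 11, 2):
        depth = i / 10 * Z
        c = concentration(depth, Z, L_M, l_s, C_TOP, C_BOTTOM)
        j = reduced_flux(depth, Z, L_M, l_s, C_TOP, C_BOTTOM)
        print(f"  z/Z = {depth:.1f}  C = {c:7.3f}  reduced flux = {j:7.3f}")
```

A quick consistency check is that the profile returns the imposed boundary values at $z = 0$ and $z = Z$, and that the derivative of the reduced flux equals $C/l_s$ everywhere, as required by the reduced form of equation (8.8.5).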

The flux $J$ of element carried downwards by the sinking particles (the 'rain') can be estimated by comparison with the flux of dissolved element. We write that, at steady-state, the sum of dissolved and particulate fluxes remains constant, i.e., for two depths $z_1$ and $z_2$

$$J(z_1) - \bar{\mathcal{D}}\left.\frac{dC}{dz}\right|_{z_1} - vC(z_1) = J(z_2) - \bar{\mathcal{D}}\left.\frac{dC}{dz}\right|_{z_2} - vC(z_2)$$

where the $-v$ term stems from the movement being upward, or in a reduced form

$$\frac{J(z_1)}{v} - \left[l_m\frac{dC}{dz} + C\right]_{z_1} = \frac{J(z_2)}{v} - \left[l_m\frac{dC}{dz} + C\right]_{z_2}$$

Draw the concentration and flux profiles of a species $i$ with surface concentration of 2 and bottom concentration of 10 (arbitrary units). Assume that the mixing length can be obtained from the distribution of conservative quantities, usually salinity or potential temperature. Craig (1969) suggests a value of ~800 m in the 4000 m-deep Pacific.

Figure 8.24 The advection-diffusion model (Craig, 1974) in a water column of depth $Z$, mixing length $l_m$, and scavenging length $l_s$, with $l_m/Z = 0.2$. Concentrations in arbitrary units [left, equation (8.8.8)] and total dissolved flux relative to the advective flux at $Z$ [right, equation (8.8.9)] in the water column for the $l_s/Z$ values labeled on the curves.

The data impose $l_m/Z = 0.2$. The concentration profiles have been drawn for $l_s/Z$ = 0.1, 0.5, and 10 (Figure 8.24). Also drawn are the fluxes of dissolved species $i$ for the same values of the parameters, which makes it possible to estimate the flux carried by sinking particles. For instance, a quick graphic examination reveals that, for $l_s/Z = 0.5$, the flux of species $i$ reaching the bottom $Z$ with the rain of particles is approximately $-0.1-(-0.7)$ or 60 percent of the dissolved flux advected at the base. An example of inverse calculation from dissolved Ni concentrations in the Eastern Pacific measured by Bruland (1980) is discussed in Chapter 5. When inverting the data, it is particularly important to make sure that $\varepsilon$ is larger than unity, since the rate constant is a positive parameter.

8.9 Simultaneous precipitation and diffusion

When diffusion interacts with crystal growth and nucleation, phenomena of periodic precipitation may appear that have been known historically as Liesegang rings. Bands and concentric rings with alternating mineral abundances are not uncommon in all sorts of geological environments: orbicular structures in plutonic rocks or striated chemical precipitates in sediments give dramatic examples of pattern formation or self-organization. Some mechanisms of pattern formation require competing chemical species with contrasting diffusivity and chemical reactivity. Continuous growth at one site is rapidly starved by the inability of one of the species, which is part of the precipitate, to move over large distances. In contrast, periodic precipitation requires that the energy for driving the slowly diffusing species up its own gradient over short distances is compensated by the energy released in the phase change. If this condition can be met, precipitation proceeds by matching the long diffusion distance of the fast species with a succession of bands roughly as wide as the diffusion distance of the slow species. Other theories do not explicitly assume the presence of several components and rely on autocatalytic reactions (Flicker and Ross, 1974; Noyes and Field, 1974) or capillarity phenomena (Feinn et al., 1978; Lovett et al., 1978).

For decades, many theoretical models of periodic precipitation have been put forward (Wagner, 1950; Prager, 1956; Kahlweit, 1965). Concentration gradients are no longer deemed to be necessary to generate heterogeneities (Flicker and Ross, 1974) while widely accepted theories emphasize the role of capillarity in retarding crystal growth of small particles (Feinn et al., 1978; Lovett et al., 1978; Ortoleva, 1984; Kirkaldy and Young, 1987). The following simple derivation, which shows how unstable behavior may be initiated in a precipitating two-component system, is adapted from Kirkaldy and Young (1987).

Let us assume a solid infinite matrix with $C^i$ and $C^j$ being the concentrations of two conservative species $i$ and $j$. $i$ and $j$ may react to form a compound, e.g., a local precipitate, an oxide, ... with fixed concentrations $C_0^i$ and $C_0^j$. $\rho_p$ and $\rho_m$, the densities of the compound and matrix, respectively, are assumed to be constant. The compound is finely dispersed, and we call $p$ its volume fraction. Mass balance of element $i$ requires

$$\frac{d}{dt}\int_V\left[(1-p)\rho_m C^i + p\,\rho_p C_0^i\right]dV = -\int_V \operatorname{div}\mathbf{J}^i\,dV \qquad (8.9.1)$$

Let us assume no advective flux and the diffusive flux to be proportional to the fraction $1 - p$ of matrix material

$$\mathbf{J}^i(x,y,z) = -(1-p)\,\rho_m\mathcal{D}^i\operatorname{grad}C^i$$

In a local form, we get the conservation equation

$$(1-p)\rho_m\frac{\partial C^i}{\partial t} + \left(\rho_p C_0^i - \rho_m C^i\right)\frac{\partial p}{\partial t} = (1-p)\rho_m\mathcal{D}^i\nabla^2C^i - \rho_m\mathcal{D}^i\operatorname{grad}C^i\cdot\operatorname{grad}p$$

or

$$\frac{\partial C^i}{\partial t} + \frac{\rho_p C_0^i - \rho_m C^i}{\rho_m}\,\frac{1}{1-p}\,\frac{\partial p}{\partial t} = \mathcal{D}^i\nabla^2C^i - \frac{\mathcal{D}^i}{1-p}\operatorname{grad}C^i\cdot\operatorname{grad}p \qquad (8.9.2)$$

At the onset of the precipitation, the product of gradient terms on the right-hand side can be neglected and we obtain

$$\frac{\partial C^i}{\partial t} = \mathcal{D}^i\frac{\partial^2C^i}{\partial x^2} - \frac{\rho_p C_0^i - \rho_m C^i}{\rho_m}\,\frac{\partial p}{\partial t}$$

The rate of precipitation depends on the rate of change in concentrations

$$\frac{\partial p}{\partial t} = \frac{\partial p}{\partial C^i}\frac{\partial C^i}{\partial t} + \frac{\partial p}{\partial C^j}\frac{\partial C^j}{\partial t}$$

The conservation equation is rearranged into

$$\frac{\partial C^i}{\partial t}\left(1 + \frac{\rho_p C_0^i - \rho_m C^i}{\rho_m}\,\frac{\partial p}{\partial C^i}\right) = \mathcal{D}^i\frac{\partial^2C^i}{\partial x^2} - \frac{\rho_p C_0^i - \rho_m C^i}{\rho_m}\,\frac{\partial p}{\partial C^j}\frac{\partial C^j}{\partial t} \qquad (8.9.3)$$

We define a positive variable $P^i$ describing how changes in matrix concentrations depend on the precipitation increments as

$$P^i = -\frac{\rho_p C_0^i - \rho_m C^i}{\rho_m}\,\frac{1}{\partial C^i/\partial p} \qquad (8.9.4)$$

$P^i$ is positive since precipitation decreases the amount of species $i$ in the matrix. $P^j$ is defined in a similar way. Two elements $i$ and $j$ give the system of conservation equations

$$\frac{\partial C^i}{\partial t} = \frac{\mathcal{D}^i}{1-P^i}\frac{\partial^2C^i}{\partial x^2} + \frac{P^j}{1-P^i}\,\frac{C_0^i}{C_0^j}\,\frac{\partial C^j}{\partial t}$$

$$\frac{\partial C^j}{\partial t} = \frac{\mathcal{D}^j}{1-P^j}\frac{\partial^2C^j}{\partial x^2} + \frac{P^i}{1-P^j}\,\frac{C_0^j}{C_0^i}\,\frac{\partial C^i}{\partial t}$$

which we combine as

$$\frac{\partial C^i}{\partial t}\left[1 - \frac{P^iP^j}{(1-P^i)(1-P^j)}\right] = \frac{\mathcal{D}^i}{1-P^i}\frac{\partial^2C^i}{\partial x^2} + \frac{P^j}{(1-P^i)(1-P^j)}\,\frac{C_0^i}{C_0^j}\,\mathcal{D}^j\frac{\partial^2C^j}{\partial x^2}$$

We finally get the system of equations

$$\frac{\partial C^i}{\partial t} = \frac{1-P^j}{1-P^i-P^j}\,\mathcal{D}^i\frac{\partial^2C^i}{\partial x^2} + \frac{P^j}{1-P^i-P^j}\,\frac{C_0^i}{C_0^j}\,\mathcal{D}^j\frac{\partial^2C^j}{\partial x^2}$$

$$\frac{\partial C^j}{\partial t} = \frac{P^i}{1-P^i-P^j}\,\frac{C_0^j}{C_0^i}\,\mathcal{D}^i\frac{\partial^2C^i}{\partial x^2} + \frac{1-P^i}{1-P^i-P^j}\,\mathcal{D}^j\frac{\partial^2C^j}{\partial x^2} \qquad (8.9.5)$$

The theory of linear differential equations indicates that long-term evolution depends on the boundary conditions and the determinant of the coefficients preceding the second spatial derivatives (which can actually be considered as effective diffusion coefficients). Such a system is likely to be highly non-linear. One extreme case, however, is particularly interesting in demonstrating how periodic patterns of precipitation can be arrived at. We assume that (i) species $i$ diffuses very fast and $\partial C^i/\partial p$ is large so that $P^i$ is small and (ii) that species $j$ is much less mobile and $P^j$ is large. The Thomson-Freundlich equation relates the solubility of a particle to its curvature radius (e.g., Swalin, 1962). Using this equation, Kirkaldy and Young (1987) show how periodic precipitation may result from the capillary resistance of the matrix to grow small precipitates. Formally, the conditions read

$$\mathcal{D}^i \gg \mathcal{D}^j \quad\text{and}\quad P^i \ll 1 \ll P^j$$

Under these simplifying assumptions, the system of diffusion equations becomes

$$\frac{\partial C^i}{\partial t} = \mathcal{D}^i\frac{\partial^2C^i}{\partial x^2},\qquad \frac{\partial C^j}{\partial t} = -\frac{\mathcal{D}^j}{P^j}\frac{\partial^2C^j}{\partial x^2} \qquad (8.9.6)$$

The mobile diluted species $i$ has a normal behavior relative to diffusion, whereas the sluggish major species $j$ undergoes uphill diffusion. Contrary to what would happen for the mobile species $i$, which diffuses down its own concentration gradient, any oscillatory component of a perturbation $\delta C^j = A\sin(2\pi x/\lambda)$ with wavelength $\lambda$ would increase exponentially with time as

$$\delta C^j(x,t) = A\sin\left(2\pi\frac{x}{\lambda}\right)\exp\left(\frac{4\pi^2\mathcal{D}^j t}{P^j\lambda^2}\right)$$

where $P^j$ is assumed to be constant. Spinodal-like behavior is to be expected in such a system. The prediction of band spacing in this case depends on the functional dependence of the $P$'s on concentrations. Assuming the existence of initial chemical gradients, Kirkaldy and Young (1987) propose a scaling distance much reminiscent of a mean-squared diffusion distance encountered in standard downhill diffusion

(8.9.7)

Given the conditions on $P^i$ and $P^j$, the relation

$$dp \approx \frac{\partial p}{\partial C^j}\,dC^j = -\frac{\rho_m P^j}{\rho_p C_0^j - \rho_m C^j}\,dC^j$$

shows that precipitation oscillations have the same wavelength as concentration waves, which provides a semi-quantitative framework for Liesegang structures. Derivation of band spacing in autocatalytic and capillarity-based models is entirely different and the interested reader should refer to the literature referenced above.
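The unstable behavior implied by the second of equations (8.9.6) can be illustrated with a short numerical sketch. It assumes a constant $P^j$ and a single sinusoidal perturbation of wavelength $\lambda$; the parameter values are arbitrary dimensionless numbers chosen for illustration, and the function names are introduced here.

```python
import math


def growth_rate(wavelength, D_j, P_j):
    """Exponential growth rate of a sinusoidal perturbation of C^j under (8.9.6)."""
    return 4.0 * math.pi**2 * D_j / (P_j * wavelength**2)


def perturbation(x, t, amplitude, wavelength, D_j, P_j):
    """Perturbation of C^j at position x and time t (uphill diffusion)."""
    return (amplitude * math.sin(2.0 * math.pi * x / wavelength)
            * math.exp(growth_rate(wavelength, D_j, P_j) * t))


# Arbitrary dimensionless values (assumptions for illustration only)
D_J, P_J, AMPLITUDE = 1.0e-3, 20.0, 0.01

for wavelength in (0.25, 0.5, 1.0):
    rate = growth_rate(wavelength, D_J, P_J)
    print(f"wavelength = {wavelength}: growth rate = {rate:.4f}, "
          f"e-folding time = {1.0 / rate:.1f}")
```

Because the growth rate varies as $1/\lambda^2$, the shortest wavelengths grow fastest in this linear picture; in a real system the dependence of $P^j$ on concentration and capillarity would be expected to limit this growth.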

Appendix 8A: The error function

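• The error function erf $u$ is defined as

$$\operatorname{erf}u = \frac{2}{\sqrt{\pi}}\int_0^u e^{-x^2}dx$$

therefore erf 0 = 0, erf ∞ = 1, and erf($-u$) = $-$erf $u$.

• The error function complement erfc $u$ is defined as

$$\operatorname{erfc}u = \frac{2}{\sqrt{\pi}}\int_u^{+\infty}e^{-x^2}dx = 1 - \frac{2}{\sqrt{\pi}}\int_0^u e^{-x^2}dx = 1 - \operatorname{erf}u$$

with

$$\operatorname{erfc}0 = 1,\qquad \operatorname{erfc}\infty = 0,\qquad \operatorname{erfc}(-u) = 1 - \operatorname{erf}(-u) = 1 + \operatorname{erf}u = 2 - \operatorname{erfc}u$$

The functions erf and erfc are depicted in Figure 8A.1. Other important properties of these functions are

$$u\operatorname{erfc}u \to 0 \quad\text{as}\quad u\to\infty \qquad (8A.2)$$

$$\frac{d\operatorname{erf}u}{du} = \frac{2}{\sqrt{\pi}}\,e^{-u^2} \qquad (8A.3)$$

$$\frac{\partial}{\partial t}\operatorname{erfc}\frac{x}{2\sqrt{t}} = -\frac{\partial}{\partial t}\operatorname{erf}\frac{x}{2\sqrt{t}} = \frac{x}{2t\sqrt{\pi t}}\,e^{-x^2/4t} \qquad (8A.4)$$

• The integral of the error function complement ierfc is defined as

$$\operatorname{ierfc}u = \int_u^{\infty}\operatorname{erfc}x\,dx \qquad (8A.5)$$

Figure 8A.1 Graph of the functions erf $u$ and erfc $u$.

Attention must be paid to the sign in deriving the function ierfc, since using Leibniz's rule

$$\frac{d\operatorname{ierfc}u}{du} = \frac{d}{du}\int_u^{\infty}\operatorname{erfc}x\,dx = \int_u^{\infty}\frac{\partial\operatorname{erfc}x}{\partial u}\,dx + \operatorname{erfc}\infty\times0 - \operatorname{erfc}u\times\frac{du}{du} = -\operatorname{erfc}u$$

Using integration by parts and changing the variable $x^2$ into $v$, we also get the useful result

$$\operatorname{ierfc}u = \left[x\operatorname{erfc}x\right]_u^{\infty} + \frac{2}{\sqrt{\pi}}\int_u^{\infty}e^{-x^2}x\,dx = -u\operatorname{erfc}u + \frac{1}{\sqrt{\pi}}\int_{u^2}^{\infty}e^{-v}dv$$

or

$$\operatorname{ierfc}u = \frac{1}{\sqrt{\pi}}\,e^{-u^2} - u\operatorname{erfc}u \qquad (8A.6)$$

As a particular result, we get for $u = 0$

$$\int_0^{\infty}\operatorname{erfc}x\,dx = \operatorname{ierfc}0 = \frac{1}{\sqrt{\pi}}$$

Also, by the same chain rule as in equation (8A.4),

$$\frac{\partial}{\partial t}\left[\sqrt{t}\operatorname{ierfc}\frac{x}{2\sqrt{t}}\right] = \frac{1}{2\sqrt{t}}\operatorname{ierfc}\frac{x}{2\sqrt{t}} + \frac{x}{4t}\operatorname{erfc}\frac{x}{2\sqrt{t}}$$

and, therefore

$$\frac{\partial}{\partial t}\left[\sqrt{t}\operatorname{ierfc}\frac{x}{2\sqrt{t}}\right] = \frac{1}{2\sqrt{\pi t}}\,e^{-x^2/4t} \qquad (8A.7)$$

The error function is related to the normal (Gaussian) probability density function (pdf) $f_X(x)$, which is given by

$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$$

where $\mu$ and $\sigma^2$ are the mean and variance of the normal pdf. This form suggests that the cumulative distribution function $F(x)$, which measures the probability that the variable $X$ is $\le x$, can be expressed in terms of the error function. By definition

$$F(x) = \int_{-\infty}^{x}f_X(X)\,dX$$

Making the variable change

$$u = \frac{X-\mu}{\sigma\sqrt{2}}$$

the expression of the distribution function becomes

$$F(x) = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{x}\exp\left[-\frac{(X-\mu)^2}{2\sigma^2}\right]dX = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{(x-\mu)/\sigma\sqrt{2}}e^{-u^2}du$$

The integrals are split at zero in order to make the erf form apparent

$$F(x) = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{0}e^{-u^2}du + \frac{1}{\sqrt{\pi}}\int_0^{(x-\mu)/\sigma\sqrt{2}}e^{-u^2}du$$

and finally

$$F(x) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right]$$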
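The relations of this appendix can be checked numerically with the `math.erf` and `math.erfc` functions of the Python standard library. The sketch below evaluates ierfc from equation (8A.6), compares it with a trapezoidal estimate of the defining integral (8A.5) truncated at an upper bound of 8 (an arbitrary cut-off beyond which erfc is negligible), and expresses the normal cumulative distribution through erf. Function names are chosen here for illustration.

```python
import math


def ierfc(u):
    """Integral of erfc from u to infinity, equation (8A.6)."""
    return math.exp(-u * u) / math.sqrt(math.pi) - u * math.erfc(u)


def ierfc_numeric(u, upper=8.0, n=20000):
    """Trapezoidal estimate of the defining integral (8A.5), truncated at `upper`."""
    h = (upper - u) / n
    total = 0.5 * (math.erfc(u) + math.erfc(upper))
    for i in range(1, n):
        total += math.erfc(u + i * h)
    return total * h


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative distribution function of the normal pdf written with erf."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


for u in (0.0, 0.5, 1.0, 2.0):
    print(f"u = {u}: ierfc = {ierfc(u):.6f}, numerical = {ierfc_numeric(u):.6f}")

print(f"F(mu) = {normal_cdf(3.0, mu=3.0, sigma=2.0):.3f}")
print(f"F(mu + 1.96 sigma) = {normal_cdf(1.96):.4f}")
```

The closed form and the numerical integral should agree to within the accuracy of the trapezoidal rule.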

Appendix 8B: The theta functions

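Theta functions are special functions related to Jacobian elliptic functions (Morse and Feshbach, 1953; Widder, 1975) with special properties that make them extremely useful to calculate solutions to diffusion problems for small values of time. Three of the four theta functions will be used in the present context

$$\theta_2(x, i\pi t) = 2\sum_{n=0}^{\infty}\cos[(2n+1)\pi x]\exp\left[-\left(n+\tfrac{1}{2}\right)^2\pi^2t\right]$$

$$\theta_3(x, i\pi t) = 1 + 2\sum_{n=1}^{\infty}\cos(2n\pi x)\exp(-n^2\pi^2t) \qquad (8B.1)$$

$$\theta_4(x, i\pi t) = 1 + 2\sum_{n=1}^{\infty}(-1)^n\cos(2n\pi x)\exp(-n^2\pi^2t)$$

where $i$ is the square-root of $-1$. For $k$ = 1 to 4, these functions satisfy the diffusion equation

$$\frac{\partial\theta_k}{\partial t} = \frac{1}{4}\frac{\partial^2\theta_k}{\partial x^2}$$

The essential transformation properties are (Morse and Feshbach, 1953)

$$\theta_2(x, i\pi t) = \frac{1}{\sqrt{\pi t}}\,e^{-x^2/t}\,\theta_4\left(\frac{x}{i\pi t}, \frac{i}{\pi t}\right)$$

$$\theta_3(x, i\pi t) = \frac{1}{\sqrt{\pi t}}\,e^{-x^2/t}\,\theta_3\left(\frac{x}{i\pi t}, \frac{i}{\pi t}\right) \qquad (8B.2)$$

$$\theta_4(x, i\pi t) = \frac{1}{\sqrt{\pi t}}\,e^{-x^2/t}\,\theta_2\left(\frac{x}{i\pi t}, \frac{i}{\pi t}\right)$$

Let us now calculate some solutions for short values of the time. For a sphere with homogeneous initial concentration and zero surface concentration, we replace $\mathcal{D}t/a^2$ by $\tau$. From equation (8.6.11), the fraction $F_{sph}$ left at $\tau$ is

$$F_{sph} = \frac{6}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{n^2}\exp(-n^2\pi^2\tau)$$

Let us first observe that when $\tau$ tends to zero, $F_{sph}$ tends to 1 (no loss) and we get the result

$$\sum_{n=1}^{\infty}\frac{1}{n^2} = \frac{\pi^2}{6}$$

Then, we can write

$$\frac{1}{n^2}\left(1 - e^{-n^2\pi^2\tau}\right) = \pi^2\int_0^{\tau}e^{-n^2\pi^2u}\,du$$

hence

$$F_{sph} = 1 - 6\sum_{n=1}^{\infty}\int_0^{\tau}e^{-n^2\pi^2u}\,du$$

We can exchange the order of summation and integration

$$F_{sph} = 1 - 6\int_0^{\tau}\sum_{n=1}^{\infty}e^{-n^2\pi^2u}\,du$$

Introducing the function $\theta_3$ and using the second transformation rule in equation (8B.2) gives

$$F_{sph} = 1 - 3\int_0^{\tau}\left[\theta_3(0, i\pi u) - 1\right]du = 1 - 3\int_0^{\tau}\left[\frac{1}{\sqrt{\pi u}}\,\theta_3\left(0, \frac{i}{\pi u}\right) - 1\right]du$$

and reverting to the infinite sum, we get

$$F_{sph} = 1 - 3\int_0^{\tau}\left[\frac{1}{\sqrt{\pi u}}\left(1 + 2\sum_{n=1}^{\infty}e^{-n^2/u}\right) - 1\right]du$$

Expanding the last expression, we get

$$F_{sph} = 1 - 3\int_0^{\tau}\left(\frac{1}{\sqrt{\pi u}} - 1\right)du - 6\sum_{n=1}^{\infty}\int_0^{\tau}\frac{e^{-n^2/u}}{\sqrt{\pi u}}\,du$$

which can be simplified using equation (8A.7)

$$F_{sph} = 1 - 3\int_0^{\tau}\left(\frac{1}{\sqrt{\pi u}} - 1\right)du - 12\sum_{n=1}^{\infty}\int_0^{\tau}d\left(\sqrt{u}\operatorname{ierfc}\frac{n}{\sqrt{u}}\right)$$

Integrating each term separately

$$F_{sph} = 1 - 6\sqrt{\frac{\tau}{\pi}} + 3\tau - 12\sqrt{\tau}\sum_{n=1}^{\infty}\operatorname{ierfc}\frac{n}{\sqrt{\tau}}$$

which upon replacing $\tau$ by its value gives equation (8.6.8). The same method can be used in order to obtain a spatial distribution which converges more rapidly for small values of $\mathcal{D}t/a^2$ than equation (8.5.14). For a slab with homogeneous initial concentration and zero surface concentration, the dimensionless variable $\mathcal{D}t/X^2$ is replaced by $\tau$. The fraction left at $\tau$, given by equation (8.5.16), is transformed as in the case of the sphere. $\theta_2$ now appears in place of $\theta_3$ and is replaced by $\theta_4$ through the first of the equations (8B.2).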
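The equivalence of the two series for the sphere, and the much faster convergence of the ierfc form at small $\tau$, can be verified numerically. The sketch below is an illustration under the formulas written above; the truncation orders (2000 terms for the exponential series, 5 terms for the ierfc series) are arbitrary choices, and the function names are introduced here.

```python
import math


def ierfc(u):
    """Integral of erfc from u to infinity, equation (8A.6)."""
    return math.exp(-u * u) / math.sqrt(math.pi) - u * math.erfc(u)


def fraction_left_long_time(tau, n_terms=2000):
    """Exponential series for the fraction left in a sphere."""
    total = sum(math.exp(-(n * math.pi) ** 2 * tau) / n**2
                for n in range(1, n_terms + 1))
    return 6.0 / math.pi**2 * total


def fraction_left_short_time(tau, n_terms=5):
    """ierfc series obtained through the theta-function transformation."""
    root = math.sqrt(tau)
    tail = sum(ierfc(n / root) for n in range(1, n_terms + 1))
    return 1.0 - 6.0 * math.sqrt(tau / math.pi) + 3.0 * tau - 12.0 * root * tail


for tau in (0.001, 0.01, 0.1, 0.3):
    long_form = fraction_left_long_time(tau)
    short_form = fraction_left_short_time(tau)
    three_terms = 1.0 - 6.0 * math.sqrt(tau / math.pi) + 3.0 * tau
    print(f"tau = {tau}: exponential series = {long_form:.8f}, "
          f"ierfc series = {short_form:.8f}, first three terms = {three_terms:.8f}")
```

For small $\tau$ the first three terms alone already reproduce the full result, which is the approximation used in Section 8.7.2.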

Appendix 8C: Duhamel's principle

The basic principles are taken from Zwillinger (1989). Duhamel's principle enables solutions for surface conditions that are functions of time to be calculated from solutions with permanent surface conditions. Although this principle is most easily derived through the use of Laplace transforms, more conventional demonstrations, not repeated here, can be found in Sneddon (1957) or Carslaw and Jaeger (1959). Given the diffusion equation I

$$\frac{\partial C(x,t)}{\partial t} = \mathcal{D}\frac{\partial^2C(x,t)}{\partial x^2}$$

with the initial conditions and boundary conditions

$$C(x,0) = 0,\qquad C(a,t) = f(t),\qquad\text{and}\qquad C(b,t) = g(t)$$

we choose instead to solve equation II

$$\frac{\partial C(x,t,\tau)}{\partial t} = \mathcal{D}\frac{\partial^2C(x,t,\tau)}{\partial x^2}$$

with the initial condition unchanged and boundary conditions $C(a,t,\tau) = f(\tau)$ and $C(b,t,\tau) = g(\tau)$, in which $\tau$ is treated as a fixed parameter. The solution of problem I is then

$$C(x,t) = \frac{\partial}{\partial t}\int_0^t C(x, t-\tau, \tau)\,d\tau$$

An example can be found in the section dealing with the diffusion of radiogenic isotope out of a sphere. Duhamel's principle can be extended to cases of surface conditions being functions of both time and space variables and to variable source and sink terms as well (Zwillinger, 1989).


9 Trace elements in magmatic processes

9.1 Introduction

Trace elements are useful tracers of geochemical processes mostly because they are dilute: their behavior depends primarily on the trace element-matrix interaction (e.g., Rb-host feldspar, Sr-calcite) and very little on the trace-trace interaction (e.g., Rb-Rb, Sr-Sr). Consequently, the distribution of trace elements among natural phases largely obeys the linear Henry's law. The modeling of trace elements in various geological environments (magmas, hydrothermal fluids, seawater, ...) relies on three different aspects.

(a) The total mass of each element distributing among several subsystems such as phases (minerals, melts, fluids) or reservoirs (the 'Depleted Mantle', the 'Lower Crust', the 'Antarctic Bottom Water') that altogether form a closed system must be preserved. This condition is true regardless of configurations, proportions, physical or chemical state of individual subsystems. This conservation, or mass balance, property is unrelated to the trace character of the element.

(b) The equilibrium distribution of a trace element $i$ between two phases $\alpha$ and $\beta$, which is usually handled through the law of diluted solutions

$$\frac{C_\beta^i}{C_\alpha^i} = K_{\beta/\alpha}^i(T,P) \qquad (9.1.1)$$

where $K_{\beta/\alpha}^i(T,P)$ is the temperature-pressure dependent Berthelot-Nernst distribution or partition coefficient. If the composition dependence relative to the major phase constituents is to be emphasized (McIntire, 1963), a major element $I$ and a new partition coefficient $K'^{\,i}_{\beta/\alpha}(T,P)$ will be introduced such that

$$\frac{C_\beta^i/C_\beta^I}{C_\alpha^i/C_\alpha^I} = K'^{\,i}_{\beta/\alpha}(T,P) \qquad (9.1.2)$$

This form of the partition coefficient, analogous to that used for Fe-Mg fractionation between olivine and melt (see Chapter 1), is necessary only for the rare cases where trace substitution affects $C_\alpha^I$ and $C_\beta^I$ substantially. A number of reviews (O'Nions and Powell, 1977; Michard, 1989) describe the various sorts of partition coefficients expressed either in mass-fractions, atom fractions, or normalized to a major element and their respective merits. If the discussion is restricted to a narrow range of chemical compositions (e.g., basaltic systems, Irving, 1978, Irving and Frey, 1984), enough experimental information exists on trace-element partitioning to resort to the wonderfully simple equation (9.1.1).

(c) Compatible elements are easily hosted by the structure of the crystallizing minerals ($K^i_{sol/liq} > 1$) while incompatible elements are rejected into the liquid ($K^i_{sol/liq} \ll 1$).


This chapter will emphasize modeling of the simplest processes which govern magma formation and evolution. Probably none of the natural processes can be fully described by one of these simple models. More likely, these simple processes combine in a quite complex arrangement to form the magmas and their solid products. The burden of proof increases quickly with increasing model complexity, however attractive a detailed model may be. The extraction of unambiguous quantitative information from a given model demands a considerable amount of experimental, theoretical and computational work. Simple models should therefore be considered first: identifying precisely how and where simple models fail may be much more informative on the physical processes at work than inadequate implementation of a complex model with more independent parameters than can be effectively handled.

9.2 Batch-melting and crystallization

9.2.1 Introduction and forward problem

Batch partial melting will hereafter be understood as equilibrium melting, in contrast to fractional melting discussed in Section 9.3.3. The foundation of this model is remarkably simple and was first laid down by Schilling and Winchester (1967). A number of more or less complex modifications enabling useful information to be extracted from the data were later introduced by Gast (1968), Shaw (1970) and Albarede (1983). Bulk equilibrium crystallization of a liquid batch can be handled with equations identical to those for batch-melting.

We consider a molten multi-mineral assemblage, referred to as the source, which is presumed to give rise to an erupting magma. $X_j$ is the mass-fraction of each phase $j$ relative to the molten source (and not to the residue). Subscript $j = 1$ refers to the melt, $j = 2, \ldots, n$ to the $n - 1$ residual mineral phases. The sum of the $X_j$ over all the $n$ phases is unity. If $C_{\mathrm{liq}}^{i}$ is the concentration of the $i$th among $m$ elements in the liquid, the concentration of element $i$ in phase $j$ will be $K_j^i C_{\mathrm{liq}}^{i}$. Let $C_0^i$ be the concentration of $i$ in the source prior to melting. Hence, mass balance requires

$$C_0^i = \sum_{j=1}^{n} X_j K_j^i C_{\mathrm{liq}}^{i} \qquad (9.2.1)$$

in which a $K_j^i$ value of 1 has to be assumed for the melt ($j$ = liq). Letting $F$ be the molten fraction, the mass-fraction $f_j$ of phase $j$ relative to the residue relates to $X_j$ through

$$X_j = (1 - F)\,f_j \qquad (j = 2, \ldots, n)$$

The sum of the $f_j$ over all the $n - 1$ residual mineral phases is unity. This leads to the equation known as the equilibrium, partial, or batch-melting equation

$$C_{\mathrm{liq}}^{i} = \frac{C_0^i}{F + D_i(1 - F)} \qquad (9.2.2)$$
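The batch-melting equation lends itself to a few lines of code. The following is a minimal sketch, not the author's implementation: the function and variable names are hypothetical, and the inputs are assumed to be the partition coefficients of Table 9.1 and the peridotite of the worked example of this section (10 percent melting, residue of 60 percent olivine and 40 percent clinopyroxene).

```python
# Forward batch-melting sketch. All names are illustrative (not from the text).

def bulk_partition_coefficient(residue_modes, mineral_kd):
    """Equation (9.2.3): D_i = sum over residual minerals of f_j * K_j^i."""
    return sum(f_j * mineral_kd[mineral] for mineral, f_j in residue_modes.items())


def batch_melt(c0, bulk_d, melt_fraction):
    """Equation (9.2.2): C_liq = C_0 / (F + D (1 - F))."""
    return c0 / (melt_fraction + bulk_d * (1.0 - melt_fraction))


# Assumed inputs: source composition (ppm) and mineral-liquid coefficients.
SOURCE_PPM = {"Ni": 2500.0, "Cr": 1500.0, "Yb": 0.2, "Rb": 0.01}
MINERAL_KD = {
    "Ni": {"olivine": 6.0, "clinopyroxene": 1.0},
    "Cr": {"olivine": 1.0, "clinopyroxene": 8.0},
    "Yb": {"olivine": 0.1, "clinopyroxene": 0.3},
    "Rb": {"olivine": 0.0, "clinopyroxene": 0.0},
}
RESIDUE_MODES = {"olivine": 0.6, "clinopyroxene": 0.4}  # f_j, sums to 1
MELT_FRACTION = 0.10                                     # F

melt_ppm = {}
for element, c0 in SOURCE_PPM.items():
    d_i = bulk_partition_coefficient(RESIDUE_MODES, MINERAL_KD[element])
    melt_ppm[element] = batch_melt(c0, d_i, MELT_FRACTION)
    print(f"{element}: D = {d_i:.2f}, C_liq = {melt_ppm[element]:.3g} ppm")
```

Keeping the bulk coefficient and the melting equation in separate functions makes it easy to swap in another residual mineralogy without touching the mass balance itself.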


Table 9.1. Mineral-liquid partition coefficients used for the forward modeling of batch-melting.

                          Ni     Cr     Yb     Rb
olivine-liquid            6      1      0.1    0
clinopyroxene-liquid      1      8      0.3    0

In equation (9.2.2), $D_i$ is the bulk solid-liquid partition coefficient

$$D_i = \sum_{j=2}^{n} f_j K_j^i \qquad (9.2.3)$$

i.e., the centroid of the mineral-liquid partition coefficients weighted by the corresponding mass-fraction of the minerals in the residue.

A peridotite contains 2500 ppm Ni, 1500 ppm Cr, 0.2 ppm Yb, and 0.01 ppm Rb. Calculate the concentration of each element in the liquid produced by 10 percent partial melting when a residue containing 60 percent olivine and 40 percent clinopyroxene is left. Assume the partition coefficients given in Table 9.1. From equation (9.2.3), the bulk partition coefficients are

$D_{\mathrm{Ni}} = 0.6 \times 6 + 0.4 \times 1 = 4$
$D_{\mathrm{Cr}} = 0.6 \times 1 + 0.4 \times 8 = 3.8$
$D_{\mathrm{Yb}} = 0.6 \times 0.1 + 0.4 \times 0.3 = 0.18$
$D_{\mathrm{Rb}} = 0.6 \times 0 + 0.4 \times 0 = 0$

which gives the following melt concentrations from equation (9.2.2)

$C_{\mathrm{liq}}^{\mathrm{Ni}} = 2500/(0.1 + 4 \times 0.9) = 676$ ppm
$C_{\mathrm{liq}}^{\mathrm{Cr}} = 1500/(0.1 + 3.8 \times 0.9) = 426$ ppm
$C_{\mathrm{liq}}^{\mathrm{Yb}} = 0.2/(0.1 + 0.18 \times 0.9) = 0.763$ ppm
$C_{\mathrm{liq}}^{\mathrm{Rb}} = 0.01/(0.1 + 0 \times 0.9) = 0.1$ ppm

This simple formalism may be applied to natural lavas provided the lava samples have not undergone extensive mineral fractionation.

9.2.2 Inverse problem: the source composition is known

The simplest inverse model consists in finding liquid and solid phase proportions assuming a melt and source composition. This case is depicted in Figure 9.1 and may be modeled quantitatively with no extra assumption. Equation (9.2.1) expresses the fact that, in the $m$-dimensional composition space, the source composition must be the centroid of melt and residual mineral compositions, each being weighted by the mass-fraction of the corresponding phase.

Figure 9.1 Inverse partial melting problem in the three-dimensional space of elements 1, 2, 3 when the source is known. Projection of the source onto the sample subspace provides the mass-fraction of each phase of the molten source. If one phase is at the origin (sterile phase), every representative point can be shifted by a constant vector.

Once $C_{\mathrm{liq}}^{i}$ and, hence, the composition of each residual phase through the assumption of mineral-liquid partition coefficients, are known, the sample subspace is completely defined by the set of all possible phases, present or not, in the residue. The $m \times n$ matrix $A$ of phase compositions is defined by its current element $a_{ij}$, such as

$$a_{ij} = K_j^i C_{\mathrm{liq}}^{i}$$

and the $n$-vector $x$ and $m$-vector $y$ by their current element $X_j$ and $C_0^i$, respectively. Solving the matrix equation

$$y = Ax \qquad (9.2.4)$$

in the least-square sense amounts to projecting the source-composition vector $y$ onto the sample subspace. The closure equation

$$\sum_{j=1}^{n} x_j = \sum_{j=1}^{n} X_j = 1 \qquad (9.2.5)$$

makes the least-square solutions more complex because one has to resort to the Lagrange multiplier technique for the constraint of equation (9.2.5) to be exactly verified. This system of $m + 1$ equations in $n$ unknowns may be solved for $m \geq n - 1$. The solution (Albarede, 1983) has been given in its matrix form in Chapter 5. Defining


$x_0$ as the unconstrained solution, i.e.,

$$x_0 = (A^T A)^{-1} A^T y \qquad (9.2.6)$$

the general solution can be put into the easily tractable form

$$x = x_0 + (A^T A)^{-1} J\,\frac{1 - J^T x_0}{J^T (A^T A)^{-1} J} \qquad (9.2.7)$$

where $J$ is the $n$-column vector $(1, 1, \ldots, 1)$. The denominator of the last equation represents the sum of the terms of the matrix $(A^T A)^{-1}$. A negative value of a phase proportion $x_j$ indicates that the calculation should be restarted after discarding phase $j$ from the residual phases.
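Equations (9.2.6) and (9.2.7) can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the author's code: the helper names are hypothetical, the normal equations are solved by a small hand-written Gaussian elimination, and the matrix $A$ is built from unrounded liquid concentrations generated with the forward model (10 percent melt, residue of 60 percent olivine and 40 percent clinopyroxene, coefficients of Table 9.1).

```python
def solve(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def invert_modes(a, y):
    """Return (x0, x): unconstrained (9.2.6) and closure-constrained (9.2.7) solutions."""
    n = len(a[0])
    ata = [[sum(row[p] * row[q] for row in a) for q in range(n)] for p in range(n)]
    aty = [sum(row[p] * y_i for row, y_i in zip(a, y)) for p in range(n)]
    x0 = solve(ata, aty)                 # (A^T A)^-1 A^T y
    w = solve(ata, [1.0] * n)            # (A^T A)^-1 J
    scale = (1.0 - sum(x0)) / sum(w)     # (1 - J^T x0) / (J^T (A^T A)^-1 J)
    return x0, [x0_j + w_j * scale for x0_j, w_j in zip(x0, w)]


# Assumed data: rows Ni, Cr, Yb, Rb; columns liquid, olivine, clinopyroxene.
KD = [[1.0, 6.0, 1.0], [1.0, 1.0, 8.0], [1.0, 0.1, 0.3], [1.0, 0.0, 0.0]]
SOURCE = [2500.0, 1500.0, 0.2, 0.01]
F_TRUE, F_OL, F_CPX = 0.10, 0.6, 0.4

liquid = []
for k_row, c0 in zip(KD, SOURCE):
    bulk_d = F_OL * k_row[1] + F_CPX * k_row[2]
    liquid.append(c0 / (F_TRUE + bulk_d * (1.0 - F_TRUE)))
A = [[k * c_liq for k in k_row] for k_row, c_liq in zip(KD, liquid)]

x0_exact, x_exact = invert_modes(A, SOURCE)
x0_noisy, x_noisy = invert_modes(A, [2200.0, 1800.0, 0.15, 0.008])
print("exact source :", [round(v, 4) for v in x_exact])
print("noisy source :", [round(v, 4) for v in x_noisy])
```

Solving the normal equations twice, once for $A^T y$ and once for $J$, avoids forming the inverse matrix explicitly; for the small systems considered here this is a matter of convenience rather than necessity.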

Invert the results of the forward partial melting example. From the concentration of each element in the liquid and in the source, we can retrieve the degree of melting and the residual mineralogy. We assume that the liquid contains 676 ppm Ni, 426 ppm Cr, 0.763 ppm Yb and 0.1 ppm Rb, whereas its source composition vector $y$ is (2500, 1500, 0.2, 0.01) in ppm. We will test the assumption that the residuum is composed of olivine and clinopyroxene with the partition coefficients given above. Phase abundances $x$ will be ordered as liquid, olivine and clinopyroxene. Let us compute, as an example, the element in the third row and second column of the matrix $A$

$$a_{32} = K_{\mathrm{ol}}^{\mathrm{Yb}}\,C_{\mathrm{liq}}^{\mathrm{Yb}} = 0.1 \times 0.763 = 0.0763$$

The whole matrix $A$ is built in a similar manner, with rows Ni, Cr, Yb, Rb and columns liq, ol, cpx, which gives

$$A = \begin{bmatrix} 1 \times 676 & 6 \times 676 & 1 \times 676 \\ 1 \times 426 & 1 \times 426 & 8 \times 426 \\ 1 \times 0.763 & 0.1 \times 0.763 & 0.3 \times 0.763 \\ 1 \times 0.1 & 0 \times 0.1 & 0 \times 0.1 \end{bmatrix} = \begin{bmatrix} 676 & 4056 & 676 \\ 426 & 426 & 3409 \\ 0.763 & 0.0763 & 0.229 \\ 0.1 & 0.0 & 0.0 \end{bmatrix}$$

hence

$$(A^T A)^{-1} = \begin{bmatrix} 1.854 & -0.2761 & -0.1972 \\ -0.2761 & 0.04112 & 0.02937 \\ -0.1972 & 0.02937 & 0.02098 \end{bmatrix}$$

and therefore

$$x_0 = \begin{bmatrix} 1.854 & -0.2761 & -0.1972 \\ -0.2761 & 0.04112 & 0.02937 \\ -0.1972 & 0.02937 & 0.02098 \end{bmatrix} \begin{bmatrix} 2\,328\,394 \\ 10\,774\,340 \\ 6\,802\,826 \end{bmatrix} = \begin{bmatrix} 0.10 \\ 0.54 \\ 0.36 \end{bmatrix}$$


This result reproduces the original melting conditions with 10 percent liquid while the residue contains 0.54/(0.54 + 0.36), i.e., 60 percent olivine and 0.36/(0.54 + 0.36), i.e., 40 percent clinopyroxene. Since we knew that the model was perfectly obeyed, we could expect that phase abundances sum up to unity.

Invert the results from the example above replacing the source concentration $y$ values with the data 'polluted' by errors Ni = 2200 ppm, Cr = 1800 ppm, Yb = 0.15 ppm, Rb = 0.008 ppm. Following the same steps, we obtain

$$x_0 = \begin{bmatrix} 1.854 & -0.2761 & -0.1972 \\ -0.2761 & 0.04112 & 0.02937 \\ -0.1972 & 0.02937 & 0.02098 \end{bmatrix} \begin{bmatrix} 2\,253\,543 \\ 9\,685\,966 \\ 7\,622\,854 \end{bmatrix} = \begin{bmatrix} 0.0108 \\ 0.4627 \\ 0.4688 \end{bmatrix}$$

$x_0$ does not sum up to unity any more. Indeed

$$J^T (A^T A)^{-1} J = 1.0280$$

We now calculate

$$J^T x_0 = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 0.0108 \\ 0.4627 \\ 0.4688 \end{bmatrix} = 0.942$$

and obtain the normalized solution

$$x = \begin{bmatrix} 0.0108 \\ 0.4627 \\ 0.4688 \end{bmatrix} + \begin{bmatrix} 1.3805/1.0280 \\ -0.2056/1.0280 \\ -0.1469/1.0280 \end{bmatrix} (1 - 0.942) = \begin{bmatrix} 0.09 \\ 0.45 \\ 0.46 \end{bmatrix}$$

i.e., with 9 percent degree of melting, and a residue made of 45/91 = 49 percent olivine plus 46/91 = 51 percent clinopyroxene.

A problem arises whenever a column $j$ of $A$ comprises only zeroes, i.e., for a sterile phase which admits none of the measured elements in its lattice (olivine in basaltic systems, quartz in granitic systems). If there is only one of these phases, the difficulty may be overcome by a simple trick consisting in shifting all the concentrations by a constant vector, which amounts to shifting the disturbing phase away from the origin (e.g., add 1 to all concentrations). If the sterile phases are multiple, like olivine, spinel and orthopyroxene for REE in basaltic systems, the matrix $A$ has multiple singularities, and the inversion fails because the sterile phase contributions cannot be disentangled. For similar reasons, it can be realized that stuffing too many lines representing very incompatible elements into the matrix $A$ results in either redundant or inconsistent information and, in turn, in a singular system even with


9.2.3 Inverse problem: when the source composition is unknown

This case is somewhat more complex but a pictorial feeling of the solution may be obtained quite easily (Albarede, 1983; Albarede and Tamagnan, 1988). We can think of $s$ undifferentiated lava samples originating within a common source, such as several basalts produced from the same peridotite at different pressures (e.g., across the spinel-garnet transition as suggested for Mid-Ocean Ridge Basalts by Salters and Hart, 1989) and with different degrees of melting. As in Figure 9.1, each molten source may be represented by a compositional subspace of the complete $m$-element space. As illustrated in Figure 9.2 for three samples, three elements and two residual phases, finding the common source of a suite of cogenetic lavas therefore requires that the intersection of these sample subspaces be calculated. A unique intersection does not exist in the general case and a solution will be sought in the least-square sense.

Figure 9.2 General solution of the partial melting problem for a suite of cogenetic rocks when the source composition is unknown (M and m are two mineral phases). Both mineral phases M and m accept some of the analyzed elements.

Because of the large number of unknowns, it has long been estimated that the partial melting model was heavily underdetermined. In fact this turned out to be true but for the wrong reasons. In most cases, sterile minerals exist, e.g., olivine and orthopyroxene, which make the problem formulation more obscure: there is usually no easily available element entering the lattice of these minerals except for some very compatible ones (Ni, Cr) that are far too affected by fractional crystallization for their concentration to give a reliable indication of the value in the pristine melt. For each sample, the point representing the composition of these sterile minerals falls at the origin (zero concentrations). The origin is therefore common to each sample subspace. If those sample subspaces have two points in common, they must have an infinite number of common points. In the three-dimensional diagram of Figure 9.3, these common points would define a line segment. For the experienced number-cruncher used to programming the standard batch-melting equations for REE and other incompatible elements, this relation expresses the well-known fact that the degree of melting and the amount of olivine in the residue cannot be constrained independently.

Figure 9.3 General solution of the partial melting problem for a suite of cogenetic rocks when the source composition is unknown. One phase M has a regular behavior, the olivine is sterile and does not contain any of the analyzed elements. The solution is the direction represented by the heavy segment joining the source and the sterile phase.

An alternative formulation of the inverse partial melting problem therefore may be stated in this way: given a set of samples produced from a homogeneous source including sterile phases, the source composition cannot be uniquely determined, but the direction vector going from the origin through the source composition can. This statement expresses in a geometric way the property that, even if absolute concentrations are unknown, the relative concentrations are fully determined. In terms of REE, the concentration level is unknown but the shape of the pattern (enriched, depleted, ...) may be quantitatively assessed.

Therefore, we want to decide which direction, among all possible choices, is common to all sample subspaces, or, at least, which direction lies closest to all the sample subspaces in a least-square sense. Since a direction can be completely described by its unit vector, we can restrict the solution set to the surface of the unit sphere centered at the origin. Let us call $\hat{y}$ the solution of unitary modulus and $\hat{y}_k$ its projection onto the $k$th sample subspace ($k = 1, \ldots, s$) represented by the matrix $A_k$. It is a simple matter to show that

$$\hat{y}_k = Q_k\,\hat{y}$$


where $Q_k$ is a symmetric $m \times m$ projector matrix such that

$$Q_k = A_k (A_k^T A_k)^{-1} A_k^T \qquad (9.2.8)$$

Finding the least-square solution reduces to minimizing the sum $S$ of squared deviations $\hat{y}_k - \hat{y}$ between the estimated source solution and its projection onto each sample subspace. Thus, finding the minimum of

$$S = \sum_{k=1}^{s} (\hat{y}_k - \hat{y})^T (\hat{y}_k - \hat{y}) \qquad (9.2.9)$$

is equivalent to finding a unitary estimate $\hat{y}$ which minimizes

$$S = \hat{y}^T M \hat{y}$$

where the symmetric $m \times m$ matrix $M$ is defined by

$$M = \sum_{k=1}^{s} (I_m - Q_k)^T (I_m - Q_k) \qquad (9.2.10)$$

Alternatively, using projector properties

$$M = \sum_{k=1}^{s} (I_m - Q_k)$$

or

$$M = s I_m - \sum_{k=1}^{s} Q_k \qquad (9.2.11)$$

This problem is a standard eigenvalue problem, related to what is known as the Rayleigh quotient (Strang, 1976). $S$ is minimum when $\hat{y}$ is equal to the eigenvector $v_m$ associated with the smallest eigenvalue of $M$. Once the composition is found, the modal parameters of melting (degree of melting, abundance of residual phases) may be determined for each sample. Although the constrained solution described in the analysis of the first case may be safely used, it is found (Albarede and Tamagnan, 1988) much simpler and more significant to avoid constraining the abundance of the sterile phases and to infer them from the difference between unity and the sum of the unconstrained abundances. This solution has very attractive stability properties:

(a) If the partition coefficients of one phase are changed by a constant factor, as, for instance, by doubling the $K_j^i$ of the clinopyroxene, neither the solution nor the modal parameters of melting are changed. It is well-known that REE partition coefficients may vary significantly as a function of temperature or melt composition even for a single mineral (e.g., Irving, 1978) but these variations are usually strongly correlated from one element to another. The solution to the partial melting problem therefore will be robust relative to uncertainties on the absolute partition coefficients.

Table 9.2. Melt and mineral fractions in the molten source assumed to calculate concentrations in 5 'melts' used as an example for inverse modeling of partial melting.

            lava 1   lava 2   lava 3   lava 4   lava 5
melt         0.02     0.03     0.04     0.05     0.10
min. 1       0.40     0.20     0.20     0.20     0.10
min. 2       0.20     0.20     0.30     0.30     0.30

Table 9.3. Synthetic example of batch-melting inverse modeling. Assumed source concentrations $C_0^i$ for four arbitrary elements (column 2), mineral 1-liquid and mineral 2-liquid partition coefficients (columns 3 and 4), residual solid-liquid bulk partition coefficients calculated from mineral abundances listed in Table 9.2. Concentration units are arbitrary.

            Assumed values                   D_i calculated from modal compositions
Element i   C_0^i   K_min.1^i   K_min.2^i    lava 1   lava 2   lava 3   lava 4   lava 5
el1         1.      0.00        0.50         0.10     0.10     0.15     0.15     0.15
el2         2.      0.10        0.20         0.08     0.06     0.08     0.08     0.07
el3         3.      0.20        0.10         0.10     0.06     0.07     0.07     0.05
el4         4.      0.50        0.00         0.20     0.10     0.10     0.10     0.05

(b) Fractional crystallization changes very little the relative abundance of very to moderately incompatible elements: for the elements which are not significantly fractionated by mineral separation (e.g., REE, Zr, Ba, Ta, ... in basalts) the solution is therefore nearly insensitive to the degree of fractionation or accumulation.
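The eigenvector formulation of equations (9.2.8) to (9.2.11) can be checked numerically with a short script. The sketch below is an illustration under stated assumptions rather than the published procedure: the inputs are assumed to be those of Tables 9.2 and 9.3, the names are hypothetical, and the smallest eigenvector of $M$ is obtained by inverse iteration with a small arbitrary shift instead of a library eigenvalue routine.

```python
def solve(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


# Assumed synthetic inputs (source, partition coefficients, modal abundances).
C0 = [1.0, 2.0, 3.0, 4.0]
K_MIN1 = [0.0, 0.1, 0.2, 0.5]
K_MIN2 = [0.5, 0.2, 0.1, 0.0]
LAVAS = [(0.02, 0.40, 0.20), (0.03, 0.20, 0.20), (0.04, 0.20, 0.30),
         (0.05, 0.20, 0.30), (0.10, 0.10, 0.30)]  # (melt, min. 1, min. 2)


def sample_matrix(melt, f1, f2):
    """Columns liquid, min. 1, min. 2; liquid from the batch-melting equation (9.2.2)."""
    rows = []
    for c0, k1, k2 in zip(C0, K_MIN1, K_MIN2):
        liq = c0 / (melt + (f1 * k1 + f2 * k2) * (1.0 - melt))
        rows.append([liq, k1 * liq, k2 * liq])
    return rows


def project(a, v):
    """Apply the projector of equation (9.2.8): Q v = A (A^T A)^-1 A^T v."""
    n = len(a[0])
    ata = [[sum(row[p] * row[q] for row in a) for q in range(n)] for p in range(n)]
    atv = [sum(row[p] * v_r for row, v_r in zip(a, v)) for p in range(n)]
    coeff = solve(ata, atv)
    return [sum(a_rj * c_j for a_rj, c_j in zip(row, coeff)) for row in a]


m = len(C0)
M = [[0.0] * m for _ in range(m)]
for lava in LAVAS:
    a_k = sample_matrix(*lava)
    for col in range(m):                       # build I - Q_k column by column
        unit = [1.0 if r == col else 0.0 for r in range(m)]
        q_col = project(a_k, unit)
        for r in range(m):
            M[r][col] += unit[r] - q_col[r]    # equation (9.2.11)

# Inverse iteration on M + shift*I converges to the smallest eigenvector of M.
SHIFT = 1e-6
shifted = [[M[r][c] + (SHIFT if r == c else 0.0) for c in range(m)] for r in range(m)]
y = [1.0] * m
for _ in range(20):
    y = solve(shifted, y)
    norm = sum(v * v for v in y) ** 0.5
    y = [v / norm for v in y]
if y[0] < 0.0:
    y = [-v for v in y]
relative_source = [v / y[0] for v in y]
print("unit eigenvector:", [round(v, 4) for v in y])
print("relative source :", [round(v, 4) for v in relative_source])
```

Only the direction of the source vector is recovered, which is why the result is reported relative to the first element.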

The best way to convince ourselves that this rather convoluted technique works well is to build a synthetic example that we invert in a second stage. We use four elements ($m = 4$, el1 to el4), five lavas ($s = 5$) for which we assume the melt fraction and residual mineral abundances listed in Table 9.2, and two non-sterile residual minerals (Min1 and Min2) whose partition coefficients are listed in Table 9.3. The assumed source composition is listed in Table 9.3, which also shows the assumed bulk solid-liquid partition coefficients for each lava. Since we know the source composition, partition coefficients and phase abundances in molten sources, we can calculate the synthetic melt and mineral concentrations using equation (9.2.2). The five $4 \times 3$ matrices $A_k$ can be built: the first column of Table 9.4 is made of the melt concentrations ('lavas'). Mineral concentrations in the next two columns are computed from melt concentrations using the appropriate mineral-liquid partition coefficients. High precision is needed to ensure accurate inversion. Now the five individual $4 \times 4$ projector matrices $Q_k$ are calculated from equation (9.2.8) and listed in Table 9.5. We form the $4 \times 4$ matrix $M$ through equation (9.2.11)


and the result is

$$M = \begin{bmatrix} 0.9350 & -1.6104 & 1.0450 & -0.2123 \\ -1.6104 & 2.7953 & -1.8301 & 0.3776 \\ 1.0450 & -1.8301 & 1.2133 & -0.2562 \\ -0.2123 & 0.3776 & -0.2562 & 0.0564 \end{bmatrix}$$

Using Matlab, the eigenvalues of this symmetric matrix have been found to be, in decreasing order, 4.9738, 0.0253, 0.0009, and 0. The corresponding eigenvector matrix $V$ is

$$V = \begin{bmatrix} -0.4314 & -0.5906 & -0.6571 & 0.1826 \\ 0.7496 & 0.1549 & -0.5299 & 0.3651 \\ -0.4916 & 0.6657 & -0.1233 & 0.5477 \\ 0.1018 & -0.4291 & 0.5217 & 0.7303 \end{bmatrix}$$

By dividing each eigenvector component by the smallest of them, we find that the components of the fourth eigenvector $[0.1826, 0.3651, 0.5477, 0.7303]^T$ associated with the smallest eigenvalue (here zero) are in proportion of (1, 2, 3, 4), which is precisely the source composition used to produce the synthetic data (Table 9.3). The capability of this formalism to invert the data to produce relative source concentrations is therefore established. It is left to the reader to show that correct source mineralogical compositions can be retrieved using the procedure outlined in Section 9.2.2.

9.2.4 Shaw's formulation

The batch-melting equation (9.2.2) can be applied to any combination of melt and residual phase proportions but the bulk partition coefficient $D_i$ does not stay an invariant parameter unless the $f_j$ values remain constant with $F$. A melting process keeping the $f_j$ values constant is called modal melting and is in general not representative of thermodynamic equilibrium. Shaw's (1970) main rationale for changing Schilling and Winchester's (1967) equation was to include the assumption of constant phase proportions during eutectic melting in the melting equation. Introducing the phase proportions prior to melting $X_j^0$, the partial melting equation (9.2.1) is recast into

$$C_0^i = C_{\mathrm{liq}}^{i}\left[F + \sum_{j=2}^{n} X_j^0 K_j^i - \sum_{j=2}^{n} (X_j^0 - X_j) K_j^i\right]$$

Defining $D_i^0$ as the solid-liquid partition coefficient for $F = 0$, i.e.,

$$D_i^0 = \sum_{j=2}^{n} X_j^0 K_j^i \qquad (9.2.12)$$

and $P_i$ as the partition coefficient for the melt norm relative to source minerals

$$P_i = \sum_{j=2}^{n} \frac{X_j^0 - X_j}{F}\,K_j^i \qquad (9.2.13)$$

Table 9.4. The liquid concentrations in the five synthetic melts (column 2) calculated from the batch-melting equation (9.2.2) and the parameters of Tables 9.2 and 9.3, and concentrations in the minerals equilibrated with each melt (columns 3 and 4). The collection of these columns for the $k$th melt sample makes the matrix $A_k$.

              Concentrations
              liquid     min. 1     min. 2
A1:  el1      8.4746     0.0000     4.2373
     el2     20.3252     2.0325     4.0650
     el3     25.4237     5.0848     2.5424
     el4     18.5185     9.2593     0.0000
A2:  el1      7.8740     0.0000     3.9370
     el2     22.6757     2.2676     4.5351
     el3     34.0136     6.8027     3.4014
     el4     31.4961    15.7480     0.0000
A3:  el1      5.4348     0.0000     2.7174
     el2     17.1233     1.7123     3.4247
     el3     27.9851     5.5970     2.7985
     el4     29.4118    14.7059     0.0000
A4:  el1      5.1948     0.0000     2.5974
     el2     15.8730     1.5873     3.1746
     el3     25.7511     5.1502     2.5751
     el4     27.5862    13.7931     0.0000
A5:  el1      4.2553     0.0000     2.1277
     el2     12.2699     1.2270     2.4540
     el3     20.6897     4.1379     2.0690
     el4     27.5862    13.7931     0.0000

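we get

$$D_i = \frac{D_i^0 - P_i F}{1 - F} \qquad (9.2.14)$$

and the batch-melting equation becomes

$$C_{\mathrm{liq}}^{i} = \frac{C_0^i}{D_i^0 + F(1 - P_i)} \qquad (9.2.15)$$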
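Equation (9.2.15) can be sketched in code as follows. This is a hedged illustration, not the author's implementation: the function names are hypothetical, and the inputs (a source with 80 percent olivine and 20 percent clinopyroxene, a melt norm of 40 percent olivine and 60 percent clinopyroxene, 10 percent melting, coefficients of Table 9.1) are assumed values chosen for the demonstration.

```python
# Shaw-type batch melting sketch. All names are illustrative (not from the text).

def weighted_kd(modes, mineral_kd):
    """Mode-weighted partition coefficient, used for both D_i^0 and P_i."""
    return sum(fraction * mineral_kd[mineral] for mineral, fraction in modes.items())


def shaw_batch_melt(c0, d0, p, melt_fraction):
    """Equation (9.2.15): C_liq = C_0 / (D0 + F (1 - P))."""
    return c0 / (d0 + melt_fraction * (1.0 - p))


SOURCE_PPM = {"Ni": 2500.0, "Cr": 1500.0, "Yb": 0.2, "Rb": 0.01}
MINERAL_KD = {
    "Ni": {"olivine": 6.0, "clinopyroxene": 1.0},
    "Cr": {"olivine": 1.0, "clinopyroxene": 8.0},
    "Yb": {"olivine": 0.1, "clinopyroxene": 0.3},
    "Rb": {"olivine": 0.0, "clinopyroxene": 0.0},
}
SOURCE_MODES = {"olivine": 0.8, "clinopyroxene": 0.2}  # X_j^0 before melting
MELT_NORM = {"olivine": 0.4, "clinopyroxene": 0.6}     # proportions entering the melt
MELT_FRACTION = 0.10

shaw_melt_ppm = {}
for element, c0 in SOURCE_PPM.items():
    d0 = weighted_kd(SOURCE_MODES, MINERAL_KD[element])  # equation (9.2.12)
    p = weighted_kd(MELT_NORM, MINERAL_KD[element])      # equation (9.2.13)
    shaw_melt_ppm[element] = shaw_batch_melt(c0, d0, p, MELT_FRACTION)
    print(f"{element}: D0 = {d0:.2f}, P = {p:.2f}, "
          f"C_liq = {shaw_melt_ppm[element]:.3g} ppm")
```

Setting the melt norm equal to the source modes gives $P_i = D_i^0$, in which case equation (9.2.15) reduces to the modal form (9.2.2).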

Table 9.5. The five projectors $Q_k$ calculated from Table 9.4 as $Q_k = A_k (A_k^T A_k)^{-1} A_k^T$.

               el1        el2        el3        el4
Q1:  el1     0.8800     0.2502    -0.2000     0.0549
     el2     0.2502     0.4785     0.4169    -0.1145
     el3    -0.2000     0.4169     0.6667     0.0915
     el4     0.0549    -0.1145     0.0915     0.9749
Q2:  el1     0.8154     0.3205    -0.2137     0.0462
     el2     0.3205     0.4435     0.3710    -0.0801
     el3    -0.2137     0.3710     0.7527     0.0534
     el4     0.0462    -0.0801     0.0534     0.9885
Q3:  el1     0.7776     0.3530    -0.2160     0.0411
     el2     0.3530     0.4398     0.3428    -0.0652
     el3    -0.2160     0.3428     0.7903     0.0399
     el4     0.0411    -0.0652     0.0399     0.9924
Q4:  el1     0.7886     0.3459    -0.2132     0.0398
     el2     0.3459     0.4340     0.3489    -0.0651
     el3    -0.2132     0.3489     0.7849     0.0402
     el4     0.0398    -0.0651     0.0402     0.9925
Q5:  el1     0.8035     0.3408    -0.2021     0.0303
     el2     0.3408     0.4090     0.3505    -0.0526
     el3    -0.2021     0.3505     0.7922     0.0312
     el4     0.0303    -0.0526     0.0312     0.9953

$P_i$ is sometimes improperly but illustratively referred to as the partition coefficient of the melt. For eutectic melting, it is expected to remain approximately constant. From this equation, expanded by Hertogen and Gijbels (1976) to complex melting paths, Treuil and Joron (1975) derived important plots used in identification and inversion techniques. Let us consider a perfectly incompatible element (or 'hygromagmaphile' in the terminology of these authors) which we label with the superscript $H$. Then

$$C_{\mathrm{liq}}^{H} = \frac{C_0^H}{F} \qquad \text{and} \qquad C_{\mathrm{liq}}^{i} = \frac{C_0^i}{D_i^0 + F(1 - P_i)}$$

Combining the two equations, they become

$$\frac{C_{\mathrm{liq}}^{H}}{C_{\mathrm{liq}}^{i}} = \frac{D_i^0}{C_0^i}\,C_{\mathrm{liq}}^{H} + \frac{C_0^H}{C_0^i}\,(1 - P_i) \qquad (9.2.16)$$

If $P_i$ is constant, this is the equation of a straight line in the $C_{\mathrm{liq}}^{H}/C_{\mathrm{liq}}^{i}$ vs $C_{\mathrm{liq}}^{H}$ diagram, which is the foundation of widely used plots like Th/Ta vs Th aimed at identifying partial melting processes. These plots and alignment parameters were used


by Minster and Allegre (1977) to invert partial melting equations quantitatively. Hofmann and Feigenson (1983) have adopted a slightly modified approach for inverting the trace-element data of the lavas from the Kohala volcano (Hawaii). Unfortunately, the reciprocal inference is not true: observation of alignments in the $C_{\mathrm{liq}}^{H}/C_{\mathrm{liq}}^{i}$ vs $C_{\mathrm{liq}}^{H}$ diagram by no means implies eutectic melting (Albarede, 1983). From the definition (9.2.13) of the $P_i$'s, it is clear that whenever residual phase abundances $X_j$ (or their contribution to the melt $X_j^0 - X_j$) vary linearly with $F$, which also happens for peritectic melting, straight lines will also be obtained although their slope and intercept have a slightly more involved interpretation than that given by the eutectic melting equation. For that reason, Shaw's (1970) equation and the derived inversion method of Hofmann and Feigenson (1983) overconstrain unduly and unnecessarily the phase proportions in the molten source during the generation of the lava suite. The mass balance and equilibrium conditions, and those conditions only, are taken into account comprehensively in the approach of Albarede (1983) and Albarede and Tamagnan (1988), which produces more reliable results. Feigenson and Carr (1993) have recently modified the method of Hofmann and Feigenson (1983). The condition of constant $P_i$ is relaxed through Monte-Carlo sampling of the $P_i$ space. In addition, the presence of the same element concentration (in our example, Th) in both the coordinates introduces a strong artificial correlation of no meaning in terms of process. Using a random number generator, e.g., in a spreadsheet, the reader may check that randomly and independently generated values of Ta and Th usually produce quite good correlations in Th/Ta vs Th diagrams. This topic is specifically dealt with in Section 4.2.

As an illustration, we will alter slightly the example calculated above to describe how Shaw's (1970) choice of variables is handled.
A peridotite made of 80 percent olivine and 20 percent clinopyroxene contains 2500 ppm Ni, 1500 ppm Cr, 0.2 ppm Yb, and 0.01 ppm Rb. Calculate the concentration of each element in the melt produced by 10 percent partial melting of this peridotite with a liquid norm of 40 percent olivine and 60 percent clinopyroxene. Assume the partition coefficients listed in Table 9.1. From equation (9.2.12), the bulk solid-liquid partition coefficients $D_i^0$ are

$D_{\mathrm{Ni}}^0 = 0.8 \times 6 + 0.2 \times 1 = 5.0$
$D_{\mathrm{Cr}}^0 = 0.8 \times 1 + 0.2 \times 8 = 2.4$
$D_{\mathrm{Yb}}^0 = 0.8 \times 0.1 + 0.2 \times 0.3 = 0.14$
$D_{\mathrm{Rb}}^0 = 0.8 \times 0 + 0.2 \times 0 = 0$

while, from equation (9.2.13), the 'liquid' partition coefficients $P_i$ are computed as

$P_{\mathrm{Ni}} = 0.4 \times 6 + 0.6 \times 1 = 3.0$
$P_{\mathrm{Cr}} = 0.4 \times 1 + 0.6 \times 8 = 5.2$
$P_{\mathrm{Yb}} = 0.4 \times 0.1 + 0.6 \times 0.3 = 0.22$
$P_{\mathrm{Rb}} = 0.4 \times 0 + 0.6 \times 0 = 0$

We finally calculate the melt concentrations from equation (9.2.15), as

$C_{\mathrm{liq}}^{\mathrm{Ni}} = 2500/[5 + 0.1 \times (1 - 3.0)] = 521$ ppm
$C_{\mathrm{liq}}^{\mathrm{Cr}} = 1500/[2.4 + 0.1 \times (1 - 5.2)] = 758$ ppm
$C_{\mathrm{liq}}^{\mathrm{Yb}} = 0.2/[0.14 + 0.1 \times (1 - 0.22)] = 0.92$ ppm
$C_{\mathrm{liq}}^{\mathrm{Rb}} = 0.01/[0 + 0.1 \times (1 - 0)] = 0.10$ ppm
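The remark made earlier in this section, that plotting Th/Ta against Th introduces an artificial correlation because Th appears in both coordinates, can be tried out with a short random experiment. The sketch below rests on arbitrary assumptions: both elements are drawn independently from a uniform distribution between 1 and 10 (arbitrary units), the seed and sample size are arbitrary, and the strength of the induced correlation depends on the distributions chosen.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(1995)   # fixed seed so that the run is repeatable
N_SAMPLES = 2000

# Independent synthetic 'concentrations'; no melting process is involved.
th = [rng.uniform(1.0, 10.0) for _ in range(N_SAMPLES)]
ta = [rng.uniform(1.0, 10.0) for _ in range(N_SAMPLES)]
th_over_ta = [t / a for t, a in zip(th, ta)]

r_independent = pearson(th, ta)      # close to zero by construction
r_ratio = pearson(th_over_ta, th)    # clearly positive: Th sits on both axes

print(f"r(Th, Ta)    = {r_independent:+.3f}")
print(f"r(Th/Ta, Th) = {r_ratio:+.3f}")
```

Any positive correlation found in the second line comes only from the shared variable, which is the caution the text raises about reading process information from such diagrams.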

9.3 Incremental processes

These processes, also referred to as distillation processes, have previously been discussed in Section 1.5, although with little emphasis on the properties of solutions for constant partition coefficients. Henry's law gives trace elements a definite advantage since the differential forms of mass balance equations can usually be integrated when partition coefficients are constant. Concentrations of trace elements in solid and liquid phases during magmatic processes can be described by relatively simple equations, thereby making application to geological examples a reasonably simple task.

9.3.1 Fractional crystallization: forward problem

This model applies whenever a phase is removed progressively from a homogeneous medium with chemical or isotopic fractionation. Simple cases involve the distribution of trace elements during the crystallization of minerals from cooling magmas, but the progressive boiling of hydrothermal solutions or the evaporation of lakes can be treated in the same manner. In this case, concentrations of the parent melt change with the fraction extracted and the removed mineral phases will be spatially zoned. Chemical or isotopic equilibrium between the phases at the time they separate is usually assumed (e.g., Sr is likely to be nearly twice as concentrated in the plagioclase as in the basaltic liquid this mineral crystallizes from) but is not really a prerequisite for having a distillation process, also referred to as a finite-reservoir process. Fractional crystallization with equilibrium at the solid-liquid interface (Figure 9.4) will be considered to set up the fundamental equations. Let us consider the behavior of the $i$th among $m$ trace elements upon partitioning between a homogeneous liquid (labeled liq) and the $n$ phases (labeled $j$) of the cumulate in a system of finite size.
These phases are usually considered as mineral phases but liquid trapped in the cumulate can be handled as an additional phase with partition coefficients equal to unity (Greenland, 1970; Albarede, 1976). The solution has been worked out in Section 1.5 and given in its differential form as equation (1.5.3)

$$d\ln C_{\mathrm{liq}}^{i} = (D_i - 1)\,d\ln F \qquad (9.3.1)$$

where $F$ is the fraction of residual liquid and $D_i$ the solid-liquid partition coefficient. No assumption on constant parameters has been made so far. Expressing the bulk solid-liquid partition coefficient as a function of the mineral-liquid partition coefficients $K_j^i$ and mineral fractions $f_j$ in the cumulate through equation (9.2.3), we get the alternative formulation

$$d\ln C_{\mathrm{liq}}^{i} = \left[\sum_{j=1}^{n} f_j (K_j^i - 1)\right] d\ln F \qquad (9.3.2)$$

in which use has been made of the closure equation. For invariant cumulate mineralogy, these equations are integrated for constant $D_i$ into the familiar Rayleigh distillation equation of Neuman et al. (1954)

$$C_{\mathrm{liq}}^{i} = C_0^i\,F^{D_i - 1} \qquad (9.3.3)$$

with $C_0^i$ as the value of $C_{\mathrm{liq}}^{i}$ for the parent magma, or $F = 1$. The cases $D_i = 0.1$ and $D_i = 5$ are depicted in Figure 9.5. The instantaneous solid withdrawn from the magma has the concentration $C_{\mathrm{inst}}^{i}$ such as

$$C_{\mathrm{inst}}^{i} = D_i\,C_{\mathrm{liq}}^{i} = D_i\,C_0^i\,F^{D_i - 1} \qquad (9.3.4)$$

while the bulk cumulate has a mean concentration $C_{\mathrm{sol}}^{i}$ such that

$$(1 - F)\,C_{\mathrm{sol}}^{i} + F\,C_{\mathrm{liq}}^{i} = C_0^i$$

or

$$C_{\mathrm{sol}}^{i} = C_0^i\,\frac{1 - F^{D_i}}{1 - F} \qquad (9.3.5)$$

Figure 9.4 Fractional crystallization model. The magma is well-stirred and in equilibrium with the last solid crystallized.

An alternative expansion using equation (9.3.3) is (9.3.6)

The Rayleigh equation (9.3.3) rests on the assumption of constant $D_i$'s. This problem was discussed by Allegre et al. (1977). The mineralogy of the cumulate during fractional crystallization varies nearly step-wise provided phase boundaries remain nearly linear (compare with Presnall, 1969 for fractional melting), which suggests that bulk partition coefficients remain approximately constant between discontinuities along the liquid line of descent. As discussed in Section 1.5, the Rayleigh equation (9.3.3) shows the property of step-wise linear covariation for a pair of elements $i_1$ and $i_2$ in a $\ln C_{\mathrm{liq}}^{i_2}$ vs $\ln C_{\mathrm{liq}}^{i_1}$ diagram (Treuil and Joron, 1975; Allegre et al., 1977; Allegre and Minster, 1978). This property is valid for any pure phase which participates in the fractionation process: lavas are most commonly used as representing the liquid phase but Fourcade and Allegre (1981) used hornblende to trace the evolution of REE in fractionating granitic liquids.

Figure 9.5 Evolution of the concentration with the fraction crystallized (from right to left) for the fractional crystallization model [Rayleigh equation (9.3.3), heavy line] and two models of magma chamber with periodic recharge, periodic eruption and continuous fractionation [equations (9.4.7) and (9.4.8)]. [Horizontal axes: fraction of residual liquid $F$ (bottom) and fraction crystallized (top); vertical axis: concentration, logarithmic scale. Curves: fractional crystallization; periodic recharge with periodic eruption of differentiated magma, erupted fraction Y = 0.5; periodic recharge with periodic eruption of undifferentiated magma, erupted fraction Y = 0.5.]

An important consequence of the power-law expressed by equation (9.3.3) concerns the most incompatible elements, namely those with very low partition coefficients: their concentration varies with the reciprocal of $F$, i.e., very slowly in the early and intermediate stages of the crystallization process. This point will be returned to in Section 9.5. A related problem is the evolution of a trace-element ratio with crystallization

$$\frac{C_{\mathrm{liq}}^{i_1}}{C_{\mathrm{liq}}^{i_2}} = \frac{C_0^{i_1}}{C_0^{i_2}}\,F^{D_{i_1} - D_{i_2}}$$

494

Table 9.6. The matrix of partition coefficients used in the fractional crystallization example.

                          Ni      Sr      Yb      Rb
olivine-liquid            15      0.0     0.05    0
clinopyroxene-liquid      1       0.1     0.35    0
plagioclase-liquid        0       2.0     0.25    0

Incompatible-element ratios (e.g., Th/La, Nb/Zr, Ce/Yb in basalts) are therefore expected to be very insensitive to mineral separation from the melt and, for differentiated lavas, can be used as a parameter characteristic of their parent liquid (see below). Langmuir (1989) has investigated the special case where part of the differentiated liquids produced in the boundary layer is remixed with the inner part of the magma chamber.

A basalt contains 150 ppm Ni, 100 ppm Sr, 3 ppm Yb, and 10 ppm Rb. Calculate the concentration of each element in the residual liquid and in the average cumulate after removal of 20 percent of a cumulate containing 30 percent olivine, 20 percent clinopyroxene and 50 percent plagioclase. Assume the partition coefficients given in Table 9.6.

The bulk partition coefficients are calculated from equation (9.2.3) as

$$D_{Ni} = 0.3 \times 15 + 0.2 \times 1 + 0.5 \times 0 = 4.7$$
$$D_{Sr} = 0.3 \times 0 + 0.2 \times 0.1 + 0.5 \times 2 = 1.02$$
$$D_{Yb} = 0.3 \times 0.05 + 0.2 \times 0.35 + 0.5 \times 0.25 = 0.21$$
$$D_{Rb} = 0.3 \times 0 + 0.2 \times 0 + 0.5 \times 0 = 0$$

which give the following residual liquid concentrations through equation (9.3.3)

$$C_{liq}^{Ni} = 150\,(1-0.2)^{4.7-1} = 65.7\ \mathrm{ppm}$$
$$C_{liq}^{Sr} = 100\,(1-0.2)^{1.02-1} = 99.6\ \mathrm{ppm}$$
$$C_{liq}^{Yb} = 3\,(1-0.2)^{0.21-1} = 3.58\ \mathrm{ppm}$$
$$C_{liq}^{Rb} = 10\,(1-0.2)^{0-1} = 12.5\ \mathrm{ppm}$$

Using equation (9.3.5), the average cumulate concentrations are computed as

$$\bar{C}_{sol}^{Ni} = 150\,[1-(1-0.2)^{4.7}]/[1-(1-0.2)] = 487\ \mathrm{ppm}$$
$$\bar{C}_{sol}^{Sr} = 100\,[1-(1-0.2)^{1.02}]/[1-(1-0.2)] = 102\ \mathrm{ppm}$$
$$\bar{C}_{sol}^{Yb} = 3\,[1-(1-0.2)^{0.21}]/[1-(1-0.2)] = 0.69\ \mathrm{ppm}$$
$$\bar{C}_{sol}^{Rb} = 10\,[1-(1-0.2)^{0}]/[1-(1-0.2)] = 0$$

Changes in concentrations induced by fractional crystallization are much more visible for compatible than for incompatible elements.
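The arithmetic of this example can be checked with a short script. The sketch below is only an illustration: the function and variable names are invented here, and it assumes the forms of the Rayleigh law and of the average-cumulate equation that are used in the numerical example, $C_{liq} = C_0 F^{D-1}$ and $\bar{C}_{sol} = C_0 (1 - F^{D})/(1 - F)$.

```python
# Mineral-liquid partition coefficients, as listed in Table 9.6.
K = {
    "olivine":       {"Ni": 15.0, "Sr": 0.0, "Yb": 0.05, "Rb": 0.0},
    "clinopyroxene": {"Ni": 1.0,  "Sr": 0.1, "Yb": 0.35, "Rb": 0.0},
    "plagioclase":   {"Ni": 0.0,  "Sr": 2.0, "Yb": 0.25, "Rb": 0.0},
}
MODE = {"olivine": 0.3, "clinopyroxene": 0.2, "plagioclase": 0.5}  # cumulate mode
C0 = {"Ni": 150.0, "Sr": 100.0, "Yb": 3.0, "Rb": 10.0}             # parent liquid, ppm


def bulk_partition_coefficient(element):
    """Mode-weighted sum of the mineral-liquid partition coefficients."""
    return sum(MODE[mineral] * K[mineral][element] for mineral in MODE)


def residual_liquid(c0, d, f):
    """Rayleigh law: liquid concentration at residual melt fraction f."""
    return c0 * f ** (d - 1.0)


def average_cumulate(c0, d, f):
    """Mean concentration of all the solid removed down to melt fraction f."""
    return c0 * (1.0 - f ** d) / (1.0 - f)


F = 1.0 - 0.2  # 20 percent of the magma has crystallized
results = {}
for element, c0 in C0.items():
    d = bulk_partition_coefficient(element)
    results[element] = {
        "D": d,
        "liquid": residual_liquid(c0, d, F),
        "cumulate": average_cumulate(c0, d, F),
    }
    print(f"{element}: D = {d:.2f}, "
          f"liquid = {results[element]['liquid']:.2f} ppm, "
          f"cumulate = {results[element]['cumulate']:.2f} ppm")
```

Changing `MODE` or `F` shows directly how strongly Ni responds to olivine removal while Rb and Yb barely move.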


9.3.2 Fractional crystallization: inverse problem

Given the composition of the parent and residual magmas, and a mineral assemblage, the fraction of melt crystallized and the modal composition of the cumulate can be uniquely determined. An alternate form of the Rayleigh equation is more useful if derivation of a magma $\beta$ from a parent magma $\alpha$ through fractional crystallization is to be tested (Albarede and Provost, 1977). Assuming constant D, and upon integration of equation (9.3.2), we get

$$\ln\frac{C_\beta^{i}}{C_\alpha^{i}} = \sum_{j=1}^{n}\left(K_j^{i}-1\right) f_j \,\ln\frac{F_\beta}{F_\alpha} \tag{9.3.7}$$

where the subscripts $\alpha$ and $\beta$ refer to liquids $\alpha$ and $\beta$, respectively, $K_j^{i}$ is the partition coefficient of element i between mineral j and the liquid, and $f_j$ is the fraction of mineral j in the cumulate. The m × n matrix A is defined by its current element $a_{ij}$

$$a_{ij} = K_j^{i} - 1 \tag{9.3.8}$$

the n-vector x of unknowns $x_j$ as

$$x_j = f_j \ln\frac{F_\beta}{F_\alpha} \tag{9.3.9}$$

and the m-vector y of data $y_i$ as

$$y_i = \ln\left(C_\beta^{i}/C_\alpha^{i}\right) \tag{9.3.10}$$

Provided $m \ge n$, the resulting matrix equation

$$y = Ax$$

may be solved to give the usual least-square solution

$$x = (A^{T}A)^{-1}A^{T}y$$

The degree of fractionation $F_\beta/F_\alpha$ is retrieved from

$$\ln\left(F_\beta/F_\alpha\right) = \sum_{j=1}^{n} x_j \tag{9.3.11}$$

whereas the n $f_j$'s are calculated through equation (9.3.9).

Finding the primary melt in a differentiation series is an entirely distinct inverse problem. Since the incremental character of mineral removal and elemental fractionation removes any useful closure condition, it is usually possible to imagine a melt more primitive than the least differentiated lava of a magmatic series. This problem is commonly handled with very compatible elements, typically Ni in basaltic systems, that vary extremely fast during magmatic differentiation but stay well buffered during melting (Treuil and Joron, 1975; Allegre et al., 1977) as discussed in Section 9.5.

The inverse solution to the previous problem will provide a good example. Using equation (9.3.10), the y vector is built from the initial and residual liquid concentrations as

$$y = \begin{bmatrix} \ln(65.7/150) \\ \ln(99.6/100) \\ \ln(3.58/3) \\ \ln(12.5/10) \end{bmatrix} = \begin{bmatrix} -0.826 \\ -0.004 \\ 0.1768 \\ 0.223 \end{bmatrix}$$

whereas, from equation (9.3.8), the matrix A is

$$A = \begin{bmatrix} 15-1 & 1-1 & 0-1 \\ 0-1 & 0.1-1 & 2-1 \\ 0.05-1 & 0.35-1 & 0.25-1 \\ 0-1 & 0-1 & 0-1 \end{bmatrix}$$

With the intermediate steps

$$(A^{T}A)^{-1} = \begin{bmatrix} 0.007196 & -0.01587 & 0.02946 \\ -0.01587 & 0.5032 & -0.1422 \\ 0.02946 & -0.1422 & 0.4140 \end{bmatrix}$$

and

$$x = (A^{T}A)^{-1}A^{T}y = \begin{bmatrix} -0.06694 \\ -0.04463 \\ -0.1116 \end{bmatrix}$$

the final solution is obtained from equation (9.3.11) as

$$F_\beta/F_\alpha = e^{-0.06694-0.04463-0.1116} = e^{-0.2232} = 0.8$$

Given $\ln 0.8 = -0.2232$, mineral fractions are retrieved through equation (9.3.9)

$$\begin{bmatrix} f_{ol} \\ f_{cpx} \\ f_{plag} \end{bmatrix} = \begin{bmatrix} 0.06694/0.2232 \\ 0.04463/0.2232 \\ 0.1116/0.2232 \end{bmatrix} = \begin{bmatrix} 0.3 \\ 0.2 \\ 0.5 \end{bmatrix}$$
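The least-square inversion above can be reproduced numerically. The following is a minimal sketch in plain Python, assuming the same four elements and three minerals; the helper `solve_linear` and all variable names are introduced here for illustration only. It forms the normal equations $A^{T}A\,x = A^{T}y$ and solves them by Gaussian elimination instead of inverting $A^{T}A$ explicitly.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


# Rows: Ni, Sr, Yb, Rb; columns: olivine, clinopyroxene, plagioclase.
K = [[15.0, 1.0, 0.0],
     [0.0, 0.1, 2.0],
     [0.05, 0.35, 0.25],
     [0.0, 0.0, 0.0]]
c_parent = [150.0, 100.0, 3.0, 10.0]    # ppm
c_residual = [65.7, 99.6, 3.58, 12.5]   # ppm, rounded values from the text

A = [[k - 1.0 for k in row] for row in K]                         # a_ij = K - 1
y = [math.log(cb / ca) for ca, cb in zip(c_parent, c_residual)]   # data vector

m, n = len(A), len(A[0])
ata = [[sum(A[i][p] * A[i][q] for i in range(m)) for q in range(n)]
       for p in range(n)]
aty = [sum(A[i][p] * y[i] for i in range(m)) for p in range(n)]
x = solve_linear(ata, aty)

ln_f = sum(x)                          # ln(F_beta / F_alpha)
f_ratio = math.exp(ln_f)
fractions = [xj / ln_f for xj in x]    # olivine, clinopyroxene, plagioclase

print("F_beta/F_alpha =", round(f_ratio, 3))
print("mineral fractions =", [round(f, 3) for f in fractions])
```

Because the input concentrations are rounded to three significant figures, the recovered fractions match the nominal 0.3, 0.2 and 0.5 only to within a few thousandths.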


9.3.3 Fractional melting

This model of liquid extraction is symmetrical to fractional crystallization and has attracted renewed interest after the demonstration by Johnson et al. (1990) that REE distributions in abyssal peridotite clinopyroxene cannot be accounted for by equilibrium melting processes. The solid is supposed to maintain its chemical homogeneity while liquid is continuously extracted. Only the last drop of liquid is supposed to be in equilibrium with the residue. Concentration in the solid can be retrieved from the general equations developed in Section 1.5. The general differential equation (1.5.28)

$$\frac{d\ln C_{sol}^{i}}{d\ln(1-F)} = \frac{1}{D_i} - 1 \tag{9.3.12}$$

was obtained, in which F is the melt fraction and $D_i$ the bulk solid-liquid partition coefficient. Logarithmic plots of concentrations in residual solids which underwent fractional melt extraction at constant D's should define linear arrays. The slope $s_{i1i2}$ in a $\ln C_{sol}^{i2}$ vs $\ln C_{sol}^{i1}$ diagram would be

$$s_{i1i2} = \frac{1/D_{i2} - 1}{1/D_{i1} - 1} \tag{9.3.13}$$

For very small partition coefficients (incompatible elements), equation (9.3.13) becomes

$$s_{i1i2} \approx D_{i1}/D_{i2}$$

Constant D is probably a good approximation in so far as the degree of melting is significantly smaller than the proportion of the least abundant mineral phase. Integrating the differential equation gives expressions for the solid and the instantaneous liquid in equilibrium with it

$$C_{sol}^{i} = C_0^{i}\,(1-F)^{1/D_i - 1} \tag{9.3.14}$$

and

$$C_{liq}^{i} = \frac{C_0^{i}}{D_i}\,(1-F)^{1/D_i - 1} \tag{9.3.15}$$

In a $\ln C_{liq}^{i2}$ vs $\ln C_{liq}^{i1}$ diagram and for small degrees of melting, the instantaneous liquids would also define a straight line with the same slope as the solid array. Fractional melting processes are even more efficient than equilibrium melting in fractionating incompatible elements for small fractions of melt since, from equation (9.3.15),

$$\frac{C_{liq}^{i1}}{C_{liq}^{i2}} = \frac{C_0^{i1}}{C_0^{i2}}\,\frac{D_{i2}}{D_{i1}}\,(1-F)^{1/D_{i1} - 1/D_{i2}}$$

in which the exponent can be very large (Figure 9.6).
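The contrast between the two melting models can be tabulated with a few lines of code. This is a sketch under stated assumptions: equation (9.2.2) is taken to be the standard batch melting law $C_{liq} = C_0/[D + F(1-D)]$, the partition coefficient is held constant, and the function names are invented here.

```python
def batch_liquid(c0, d, f):
    """Equilibrium (batch) melting, standard form assumed for equation (9.2.2)."""
    return c0 / (d + f * (1.0 - d))


def fractional_solid(c0, d, f):
    """Residual solid after fractional melting at constant D."""
    return c0 * (1.0 - f) ** (1.0 / d - 1.0)


def fractional_instant_liquid(c0, d, f):
    """Last increment of liquid, in equilibrium with the residual solid."""
    return fractional_solid(c0, d, f) / d


def fractional_aggregated_liquid(c0, d, f):
    """All liquid increments pooled together; follows from mass balance."""
    return c0 * (1.0 - (1.0 - f) ** (1.0 / d)) / f


for d in (0.1, 2.0):  # the two partition coefficients of Figure 9.6
    print(f"D = {d}")
    for f in (0.05, 0.10, 0.20, 0.30):
        print(f"  F = {f:.2f}: batch = {batch_liquid(1.0, d, f):.3f}, "
              f"instantaneous = {fractional_instant_liquid(1.0, d, f):.3f}, "
              f"aggregated = {fractional_aggregated_liquid(1.0, d, f):.3f}")
```

With concentrations normalized to the source ($C_0 = 1$), the instantaneous liquid for D = 0.1 falls far below the batch liquid as F grows, whereas the two models stay close for D = 2.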

Figure 9.6 Comparison of the equilibrium [equation (9.2.2)] and fractional melting [equation (9.3.15)] models for a bulk solid-liquid partition coefficient $D_i$ of 0.1 (top) and 2 (bottom), plotted against the melt fraction F. Although the concentrations predicted by the two models diverge rapidly for incompatible elements in instantaneous melts, they remain virtually identical for compatible elements.

If the various melt fractions are collected together in the proportions they are extracted from the source (aggregated melt), the average liquid concentration $\bar{C}_{liq}^{i}$ is obtained by making the initial concentration equal to the sum of liquid and solid with the appropriate weight-fractions

$$C_0^{i} = F\,\bar{C}_{liq}^{i} + (1-F)\,C_{sol}^{i} \tag{9.3.17}$$

or

$$\bar{C}_{liq}^{i} = \frac{C_0^{i} - (1-F)\,C_{sol}^{i}}{F} \tag{9.3.18}$$

Clearly, the averaging process decreases the efficiency of fractionation between incompatible elements. We can again introduce Shaw's $P_i$ variables, which we assume to be constant (eutectic melting), and change variables according to equation (9.2.14). Thereupon, the differential form of the fractional melting equation can be rewritten

$$\frac{d\ln C_{sol}^{i}}{d\ln(1-F)} = \frac{1}{D_i} - 1$$

or

$$\frac{d\ln C_{sol}^{i}}{d\ln(1-F)} = \frac{1-F}{D_i^{0} - FP_i} - 1$$

The last equation can be rearranged for easier integration into

$$d\ln C_{sol}^{i} = -\frac{dF}{D_i^{0} - FP_i} - d\ln(1-F) = \frac{1}{P_i}\,d\ln\left(1 - \frac{FP_i}{D_i^{0}}\right) - d\ln(1-F)$$

Integrating between 0 and F, we get

$$C_{sol}^{i} = C_0^{i}\,\frac{\left(1 - P_iF/D_i^{0}\right)^{1/P_i}}{1-F} \tag{9.3.19}$$

Again, the aggregated liquid can be calculated through equations (9.3.17) and (9.3.18).

In order to retrieve concentrations in the instantaneous solid, the instantaneous residual mineralogy and bulk partition coefficient must be calculated. The difficulty of applying the fractional melting model is the discontinuous character of the melting process (e.g., Presnall, 1969). Whenever a mineral phase is exhausted, the progress of fractional melting requires temperature jumps of expectedly large amplitude and discontinuous variations in melt chemistry which are not in general well documented in natural examples.

The fractional melting equations can be illustrated with the melting example calculated above. A peridotite made of 80 percent olivine and 20 percent clinopyroxene contains 2500 ppm Ni, 1500 ppm Cr, 0.2 ppm Yb, and 0.01 ppm Rb. Calculate the concentration of each element in the aggregated liquid produced after 10 percent partial melting with a liquid norm of 40 percent olivine and 60 percent clinopyroxene. Assume that the partition coefficients are given by Table 9.1 (p. 479).

The bulk solid-liquid partition coefficients for F = 0 are computed from (9.2.12) as

$$D_{Ni}^{0} = 0.8 \times 6 + 0.2 \times 1 = 5.0$$
$$D_{Cr}^{0} = 0.8 \times 1 + 0.2 \times 8 = 2.4$$
$$D_{Yb}^{0} = 0.8 \times 0.1 + 0.2 \times 0.3 = 0.14$$
$$D_{Rb}^{0} = 0.8 \times 0 + 0.2 \times 0 = 0$$

whereas, from (9.2.13), the partition coefficients of the melt mode are

$$P_{Ni} = 0.4 \times 6 + 0.6 \times 1 = 3.0$$
$$P_{Cr} = 0.4 \times 1 + 0.6 \times 8 = 5.2$$
$$P_{Yb} = 0.4 \times 0.1 + 0.6 \times 0.3 = 0.22$$
$$P_{Rb} = 0.4 \times 0 + 0.6 \times 0 = 0$$


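From equation (9.3.19), concentrations in the residual solid are

$$C_{sol}^{Ni} = C_0^{Ni}\,\frac{\left(1 - P_{Ni}F/D_{Ni}^{0}\right)^{1/P_{Ni}}}{1-F} = 2500\,\frac{(1 - 3.0 \times 0.1/5.0)^{1/3.0}}{1-0.1} = 2721\ \mathrm{ppm}$$

and, likewise

$$C_{sol}^{Cr} = 1590\ \mathrm{ppm} \qquad C_{sol}^{Yb} = 0.10\ \mathrm{ppm} \qquad C_{sol}^{Rb} = 0\ \mathrm{ppm}$$

Concentrations in the aggregated liquid are given by (9.3.18)

$$\bar{C}_{liq}^{Ni} = \frac{C_0^{Ni} - (1-F)\,C_{sol}^{Ni}}{F} = \frac{2500 - (1-0.1) \times 2721}{0.1} = 511\ \mathrm{ppm}$$

and, likewise,

$$\bar{C}_{liq}^{Cr} = 690\ \mathrm{ppm} \qquad \bar{C}_{liq}^{Yb} = 1.08\ \mathrm{ppm} \qquad \bar{C}_{liq}^{Rb} = 0.10\ \mathrm{ppm}$$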
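The numbers of this example can be recomputed with the short script below. It is a sketch, not the author's procedure: it assumes that equation (9.3.19) has the form $C_{sol} = C_0 (1 - PF/D^0)^{1/P}/(1-F)$ and that the aggregated liquid follows from the mass balance $\bar{C}_{liq} = [C_0 - (1-F) C_{sol}]/F$. The partition coefficients are those appearing in the arithmetic of the example, and the two special cases ($D^0 = 0$ and $P = 0$) are handled by their limits, which is a choice made here.

```python
import math

# Mineral-liquid partition coefficients as they appear in the example arithmetic.
K = {"olivine":       {"Ni": 6.0, "Cr": 1.0, "Yb": 0.1, "Rb": 0.0},
     "clinopyroxene": {"Ni": 1.0, "Cr": 8.0, "Yb": 0.3, "Rb": 0.0}}
SOURCE_MODE = {"olivine": 0.8, "clinopyroxene": 0.2}   # peridotite
MELT_MODE = {"olivine": 0.4, "clinopyroxene": 0.6}     # liquid norm
C0 = {"Ni": 2500.0, "Cr": 1500.0, "Yb": 0.2, "Rb": 0.01}  # ppm
F = 0.1


def weighted(mode, element):
    """Bulk coefficient for a given set of mineral proportions."""
    return sum(mode[m] * K[m][element] for m in mode)


def residual_solid(c0, d0, p, f):
    """Residual solid for non-modal fractional melting (Shaw's variables)."""
    if d0 == 0.0:
        return 0.0                      # element entirely lost to the first melt
    if p == 0.0:
        return c0 * math.exp(-f / d0) / (1.0 - f)   # limit of the power law
    return c0 * (1.0 - p * f / d0) ** (1.0 / p) / (1.0 - f)


def aggregated_liquid(c0, c_sol, f):
    """Pooled liquid from the mass balance between source, liquid and residue."""
    return (c0 - (1.0 - f) * c_sol) / f


solid, liquid = {}, {}
for element, c0 in C0.items():
    d0 = weighted(SOURCE_MODE, element)
    p = weighted(MELT_MODE, element)
    solid[element] = residual_solid(c0, d0, p, F)
    liquid[element] = aggregated_liquid(c0, solid[element], F)
    print(f"{element}: D0 = {d0:.2f}, P = {p:.2f}, "
          f"solid = {solid[element]:.4g} ppm, liquid = {liquid[element]:.4g} ppm")
```

The liquid concentrations obtained without intermediate rounding differ slightly from the printed values (about 510 instead of 511 ppm for Ni), because the text rounds the solid concentration before applying the mass balance.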

9.3.4 Continuous melting

Because of presumably finite residual porosity after melting completion, some magma is expected to be left behind. Langmuir et al. (1977) called continuous melting a fractional melting process with residual porosity. These authors have not provided the constitutive equation which can be found, although not quite in a physically consistent form, in McKenzie (1985). This equation has recently appeared in full in Sobolev and Shimizu (1992) under the term of 'critical' melting. Using a method applied by Greenland (1970) and Albarede (1976) to fractional crystallization, the continuous melting equations can be derived from those governing fractional melting by assuming a source with constant volume porosity $\varphi$. If $\phi$ is the mass porosity, i.e., the mass ratio of interstitial liquid to the sum of interstitial liquid and residual solid, the partition coefficient $D_i$ in fractional melting equations must be replaced by an effective partition coefficient taking into account a liquid proportion $\phi$ with a partition coefficient of 1. We consequently introduce the changes

xl

and

D,

(9.3.21)


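into the fractional melting equations. Mass and volume porosity relate through

$$\phi = \frac{\rho_{liq}\,\varphi}{\rho_{liq}\,\varphi + \rho_{sol}\,(1-\varphi)} \tag{9.3.22}$$

and therefore

$$\frac{\phi}{1-\phi} = \frac{\rho_{liq}\,\varphi}{\rho_{sol}\,(1-\varphi)}$$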
Introducing this expression into (9.3.14), the equations for continuous melting become

where the concentration in the residue (res) is related to that in the liquid and the solid through

For the liquid, we obtain

These equations converge towards those of the fractional melting model for $\varphi \ll D_i$ and, contrary to McKenzie (1985) equation (29), $C_{liq}^{i}$ tends to $C_0^{i}$ when porosity $\varphi \to 1$. An example has been drawn in Figure 9.7 for $D_i = 0.001$ and different values of $\varphi$. When the porosity $\varphi$ and the partition coefficient are of the same order of magnitude, large variability is achieved in both the solid and the residue, a point to which we will return below. Considerable attention has recently been focused on this model, which may explain the fractionation of some strongly incompatible nuclides in the U decay series (McKenzie, 1985; Williams and Gill, 1989; Beattie, 1993).

9.4 Open magmatic systems

The boundaries of open magmatic systems allow differential movements of either liquid or solid phases. Assimilation of solid or liquid material unrelated to the source of the melt is referred to as contamination, e.g., assimilation of granitic crust by basalts. Replenishment refers to the input of fresh magma into a differentiating magmatic body issued from the same source. Although it is usual and, as far as trace elements are concerned, justified to make a distinction between mostly liquid (magma chambers) and mostly solid systems (molten regions), a whole continuum of solid-liquid configurations exists which will be treated collectively under the generic heading of open magmatic systems.

Figure 9.7 The continuous melting model for $D_i = 0.001$ and diverse values of the residual porosity $\varphi$, plotted against the degree of melting F. Concentrations in the residue, i.e., solid plus residual melt (top) and the liquid (bottom).

9.4.1 The steady-state magma chamber

A simple version of this model is discussed in a more general context in Chapter 7. A body of magmatic liquid of constant mass M (Figure 9.8) and containing an element i with concentration initially at $C_0^{i}$ receives a continuous flux Q of fresh liquid with concentration $C_{in}^{i}$. During the same time interval, an equivalent flux Q is either crystallized or erupted. The cumulate has a concentration $C_{sol}^{i}$ equal to $D_i$ times the liquid concentration $C_{liq}^{i}$. Assuming a fraction of suspended crystals 1 - F in the magma chamber and a fraction $1 - \Phi$ of cumulate relative to the total (cumulate + erupted liquid), the budget for the element i can be written as

$$\frac{d}{dt}\left\{M\left[F\,C_{liq}^{i} + (1-F)\,C_{sol}^{i}\right]\right\} = Q\,C_{in}^{i} - Q\left[\Phi\,C_{liq}^{i} + (1-\Phi)\,C_{sol}^{i}\right] \tag{9.4.1}$$

Introducing the solid-liquid partition coefficients $D_i$ and expanding the left-hand side for constant M:

$$M\left[F + (1-F)D_i\right]\frac{dC_{liq}^{i}}{dt} + M(1-D_i)\,C_{liq}^{i}\,\frac{dF}{dt} = Q\,C_{in}^{i} - Q\left[\Phi + (1-\Phi)D_i\right]C_{liq}^{i}$$

Figure 9.8 The steady-state magma chamber: constant M and Q are assumed, the liquid is well-stirred and in equilibrium with the last solid formed.

we get the dynamic equation (see Section 7.2)

$$\left[F + (1-F)D_i\right]\frac{dC_{liq}^{i}}{dt} = \frac{Q}{M}\,C_{in}^{i} - \left\{\frac{Q}{M}\left[\Phi + (1-\Phi)D_i\right] + (1-D_i)\frac{dF}{dt}\right\}C_{liq}^{i} \tag{9.4.2}$$

In general, residence time depends on the rate of crystallization. For the limiting case where F = 1 (no suspended crystals), equation (9.4.2) simplifies to

$$\frac{dC_{liq}^{i}}{dt} = \frac{Q}{M}\left\{C_{in}^{i} - \left[\Phi + (1-\Phi)D_i\right]C_{liq}^{i}\right\} \tag{9.4.3}$$

Defining $\alpha_i = \Phi + (1-\Phi)D_i$ and the magma residence (flushing) time $\tau_m = M/Q$ gives the equivalent of equation (7.2.7) as

$$\frac{dC_{liq}^{i}}{dt} = \frac{C_{in}^{i} - \alpha_i\,C_{liq}^{i}}{\tau_m} \tag{9.4.4}$$

The residence time of a trace element is $\tau_m/\alpha_i$: compatible elements can be thought of as reactive and have shorter residence times than inert incompatible elements. As shown in Chapter 7, equation (9.4.4) can be integrated from 0 to t into equation (7.2.12)

$$C_{liq}^{i}(t) = C_0^{i}\exp\left(-\frac{\alpha_i t}{\tau_m}\right) + \frac{C_{in}^{i}}{\alpha_i}\left[1 - \exp\left(-\frac{\alpha_i t}{\tau_m}\right)\right] \tag{9.4.5}$$

and steady-state concentration is $C_{in}^{i}/\alpha_i$.

9.4.2 A periodically erupting, periodically refilled magma chamber

This model is a variant of the steady-state magma chamber discussed in the previous section but with periodic input and eruption rate. One version of this model corresponding to a specific sequence of processes has been calculated by O'Hara (1977) and O'Hara and Mathews (1981). A simpler derivation and more complete and physically elucidating solutions have been given by Albarede (1985). For an element i, fresh magma input balances output either as erupted lavas or as cumulate

$$(1-F)\,\bar{C}_{sol}^{i} + Y\,C_{liq}^{i} = (1-F+Y)\,C_0^{i} \tag{9.4.6}$$

where 1 - F is the fraction of magma crystallized and Y the fraction erupted in each cycle. For O'Hara, eruption takes place before replenishment at the end of the differentiation stage. Inserting equation (9.3.6) into equation (9.4.6) gives

$$\frac{C_{liq}^{i}}{C_0^{i}} = \frac{(1-F+Y)\,F^{D_i-1}}{1-(F-Y)\,F^{D_i-1}} \tag{9.4.7}$$

Alternatively, eruption may take place after replenishment at the onset of the differentiation stage (Albarede, 1985). Undifferentiated magma is erupted which has the concentration $C_{liq}^{i}$. Equation (9.3.5) therefore becomes

$$\bar{C}_{sol}^{i} = C_{liq}^{i}\,\frac{1-F^{D_i}}{1-F}$$

which, inserted into equation (9.4.6), gives

$$\frac{C_{liq}^{i}}{C_0^{i}} = \frac{1-F+Y}{1-F^{D_i}+Y} \tag{9.4.8}$$
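The two eruption scenarios can be compared numerically with the Rayleigh law. The sketch below assumes that equations (9.4.7) and (9.4.8) have the forms $(1-F+Y)F^{D-1}/[1-(F-Y)F^{D-1}]$ and $(1-F+Y)/(1-F^{D}+Y)$, which follow from the mass balance (9.4.6) combined with Rayleigh fractionation within each cycle; the function names are invented for the illustration.

```python
def rayleigh(f, d):
    """Closed-system fractional crystallization, C_liq / C_0."""
    return f ** (d - 1.0)


def erupt_differentiated(f, d, y):
    """Eruption at the end of each differentiation stage (O'Hara sequence)."""
    g = f ** (d - 1.0)
    return (1.0 - f + y) * g / (1.0 - (f - y) * g)


def erupt_undifferentiated(f, d, y):
    """Eruption right after replenishment, before differentiation."""
    return (1.0 - f + y) / (1.0 - f ** d + y)


Y = 0.5  # erupted fraction per cycle
for d in (0.1, 5.0):
    print(f"D = {d}")
    for f in (0.9, 0.7, 0.5, 0.3):
        print(f"  F = {f:.1f}: Rayleigh = {rayleigh(f, d):.4f}, "
              f"differentiated = {erupt_differentiated(f, d, Y):.4f}, "
              f"undifferentiated = {erupt_undifferentiated(f, d, Y):.4f}")
```

Both steady-state expressions return $C_{liq}/C_0 = 1$ when no crystallization takes place (F = 1), which is a quick sanity check on the algebra.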

A comparison of the two models described by equations (9.4.7) and (9.4.8) with fractional crystallization for $D_i = 0.1$ and $D_i = 5$ and assuming an erupted fraction Y of 50 percent is shown in Figure 9.5 (p. 493). Use of either equation (9.4.7) or equation (9.4.8) leads to quite different patterns of incompatible and compatible elements (Albarede, 1985; Caroff et al., 1993), which makes it possible to discuss the timing of replenishment and eruption events.

9.4.3 Assimilation-fractional crystallization (AFC)

Thermal and chemical effects of country-rock assimilation on the liquid line of descent of magmas were already known (Bowen, 1956), but considerable development and application flourished from the so-called assimilation-fractional crystallization (AFC) model. The AFC model resembles fractional crystallization with the difference that the magma chamber is continuously contaminated with assimilated country-rocks. The model was initially described for trace-element fractionation by Allegre and Minster (1978) and first applied to the combined systematics of $^{18}$O/$^{16}$O and $^{87}$Sr/$^{86}$Sr in crustally contaminated magmas by Taylor (1980). The constitutive equations were systematically developed for geological purposes by DePaolo (1981) and Taylor and Sheppard (1986) for trace elements and isotopes. The RAFC process resembles AFC but includes magma chamber replenishment. Its constitutive equation is given by DePaolo (1985) and Hagen and Neumann (1990).


The symbols used in Section 1.5 to describe the evolution of element i concentration in the solid and the liquid during fractional crystallization will be kept. Other parameters used in the present derivation are almost identical to those of DePaolo (1981) although reference to time, which is immaterial to the mass balance and equilibrium conditions, has been omitted. Let 'a' be the subscript representing the assimilated material, and assume that the country-rock concentration $C_a^{i}$ is constant. Mass balance requires

$$dM_{liq} = dM_a - dM_{sol} \quad \text{(bulk material)} \qquad dm_{liq}^{i} = dm_a^{i} - dm_{sol}^{i} \quad \text{(species } i\text{)} \tag{9.4.9}$$

whereas the solid-liquid equilibrium fractionation with partition coefficient $D_i$ reads

$$\frac{dm_{sol}^{i}}{dM_{sol}} = D_i\,\frac{m_{liq}^{i}}{M_{liq}} \tag{9.4.10}$$

Dividing the two equations (9.4.9) by each other, we get

$$\frac{dm_{liq}^{i}}{dM_{liq}} = \frac{dm_a^{i} - dm_{sol}^{i}}{dM_a - dM_{sol}}$$

then

$$\frac{dm_{liq}^{i}}{dM_{liq}} = \frac{dm_a^{i}}{dM_a}\,\frac{dM_a}{dM_a - dM_{sol}} - \frac{dm_{sol}^{i}}{dM_{sol}}\,\frac{dM_{sol}}{dM_a - dM_{sol}} \tag{9.4.11}$$

The ratio r of assimilation and crystallization increments is defined as

$$r = \frac{dM_a}{dM_{sol}} \tag{9.4.12}$$

Inserting r into equations (9.4.10) and (9.4.11) yields

$$\frac{dm_{liq}^{i}}{dM_{liq}} = \frac{dm_a^{i}}{dM_a}\,\frac{r}{r-1} - D_i\,\frac{m_{liq}^{i}}{M_{liq}}\,\frac{1}{r-1} \tag{9.4.13}$$

Multiplying equation (9.4.13) by $dM_{liq}/m_{liq}^{i}$, we get

$$\frac{dm_{liq}^{i}}{m_{liq}^{i}} = \frac{dm_a^{i}}{dM_a}\,\frac{dM_{liq}}{m_{liq}^{i}}\,\frac{r}{r-1} - \frac{dM_{liq}}{M_{liq}}\,\frac{D_i}{r-1} \tag{9.4.14}$$

Inserting the logarithmic differential of concentration into equation (9.4.14) gives

$$\frac{dC_{liq}^{i}}{C_{liq}^{i}} = \frac{C_a^{i}}{C_{liq}^{i}}\,\frac{r}{r-1}\,\frac{dM_{liq}}{M_{liq}} - \frac{r+D_i-1}{r-1}\,\frac{dM_{liq}}{M_{liq}}$$

Defining the residual melt fraction F relative to the initial amount of magma $M_0$ as

$$F = M_{liq}/M_0$$

we get

$$dC_{liq}^{i} = C_a^{i}\,\frac{r}{r-1}\,\frac{dF}{F} - C_{liq}^{i}\,\frac{r+D_i-1}{r-1}\,\frac{dF}{F}$$

and rearranging

$$\frac{dC_{liq}^{i}}{d\ln F} = \frac{r}{r-1}\,C_a^{i} - \frac{r+D_i-1}{r-1}\,C_{liq}^{i}$$

Defining $z_i$ over the $[1 - D_i, 1]$ interval as

$$z_i = \frac{r+D_i-1}{r-1} = 1 - \frac{D_i}{1-r} \tag{9.4.15}$$

the equation can be rewritten in a ready-to-integrate form as

$$\frac{dC_{liq}^{i}}{d\ln F} + z_i\,C_{liq}^{i} = \frac{r}{r-1}\,C_a^{i}$$

No assumption on constant parameters has been made so far. If the amount of mineral precipitated is proportional to the amount of assimilated material, e.g., if latent heat is conserved, then r is constant. For the initial liquid state F = 1, $C_{liq}^{i} = C_0^{i}$ as in Section 9.3.1

$$\frac{C_{liq}^{i} - \dfrac{r}{z_i(r-1)}\,C_a^{i}}{C_0^{i} - \dfrac{r}{z_i(r-1)}\,C_a^{i}} = F^{-z_i}$$

or equivalently

$$\frac{C_{liq}^{i}}{C_0^{i}} = F^{-z_i} + \frac{r}{z_i(r-1)}\,\frac{C_a^{i}}{C_0^{i}}\left(1 - F^{-z_i}\right) \tag{9.4.16}$$

Written in full, this expression becomes

$$\frac{C_{liq}^{i}}{C_0^{i}} = F^{-\frac{r+D_i-1}{r-1}} + \frac{r}{r+D_i-1}\,\frac{C_a^{i}}{C_0^{i}}\left(1 - F^{-\frac{r+D_i-1}{r-1}}\right) \tag{9.4.17}$$

Making $a = (1-r)^{-1}$, the equation of Allegre and Minster (1978) for wall-rock dissolution during fractional crystallization is obtained, which is therefore equivalent to the AFC equation (9.4.17) derived by DePaolo (1981). Some combinations of parameters may lead to the reversal of normal fractionation trends. Constant $C_{liq}^{i}$ is obtained for some critical value $r_c$ of r such that the $F^{-z_i}$ terms cancel out

$$\frac{r_c}{r_c+D_i-1}\,\frac{C_a^{i}}{C_0^{i}} = 1$$

or

$$r_c = \frac{1-D_i}{1-C_a^{i}/C_0^{i}} \tag{9.4.18}$$

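$r_c$ represents a divide between the fractionation- and the contamination-controlled ranges. For $0 \le r < r_c$ (if $r_c > 0$), concentration changes in the same direction as for simple fractional crystallization. For $r > r_c$, changes are more similar to those expected from a contamination process. The corresponding expression for isotopic (or incompatible-element) ratios is given by DePaolo (1981). Let us label $i1$ and $i2$ the two isotopes of the same element. We further assume that their partition coefficient is identical, as are their r and $z_i$ values. Dividing equation (9.4.17) for isotope $i2$ by the corresponding equation for isotope $i1$, we get

$$\frac{C_{liq}^{i2} - \dfrac{r}{r+D_i-1}\,C_a^{i2}}{C_{liq}^{i1} - \dfrac{r}{r+D_i-1}\,C_a^{i1}} = \frac{C_0^{i2} - \dfrac{r}{r+D_i-1}\,C_a^{i2}}{C_0^{i1} - \dfrac{r}{r+D_i-1}\,C_a^{i1}}$$

which, dividing the left-hand side by $C_{liq}^{i1}$ and the right-hand side by $C_0^{i1}$, can be recast into

$$\frac{\left(\dfrac{C^{i2}}{C^{i1}}\right)_{liq} - \dfrac{r}{r+D_i-1}\,\dfrac{C_a^{i1}}{C_{liq}^{i1}}\left(\dfrac{C^{i2}}{C^{i1}}\right)_{a}}{1 - \dfrac{r}{r+D_i-1}\,\dfrac{C_a^{i1}}{C_{liq}^{i1}}} = \frac{\left(\dfrac{C^{i2}}{C^{i1}}\right)_{0} - \dfrac{r}{r+D_i-1}\,\dfrac{C_a^{i1}}{C_0^{i1}}\left(\dfrac{C^{i2}}{C^{i1}}\right)_{a}}{1 - \dfrac{r}{r+D_i-1}\,\dfrac{C_a^{i1}}{C_0^{i1}}}$$

or

$$\frac{\left(C^{i2}/C^{i1}\right)_{liq} - \left(C^{i2}/C^{i1}\right)_{a}}{\left(C^{i2}/C^{i1}\right)_{0} - \left(C^{i2}/C^{i1}\right)_{a}} = \frac{1 - \dfrac{r}{r+D_i-1}\,\dfrac{C_a^{i1}}{C_{liq}^{i1}}}{1 - \dfrac{r}{r+D_i-1}\,\dfrac{C_a^{i1}}{C_0^{i1}}} \tag{9.4.19}$$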

The formula given by Fleck and Criss (1985) is easily arrived at from equation (9.4.19). Extreme values of concentrations occur for the fraction on the right-hand side being equal to 0 (pure contaminant melt) and 1 (no contamination). These relationships show a fairly simple behavior of the AFC model: the isotopic ratio $(C^{i2}/C^{i1})_{liq}$ should be linearly correlated with the inverse of the element concentration $C_{liq}^{i1}$, a property which it shares with all bulk mixing models. Such a linear relationship, initially suggested by Briqueu and Lancelot (1979) from the evidence of a numerical solution, was demonstrated by Fleck and Criss (1985) and Taylor and Sheppard (1986). The present analytical solution will help the reader to work out tests on geological cases. Although inversion techniques can be used (Mantovani and Hawkesworth, 1990), the parameter r can most easily be retrieved from either the slope or the intercept of AFC alignments in the diagram $(C^{i2}/C^{i1})_{liq}$ vs $1/C_{liq}^{i1}$. Extracting the term in $1/C_{liq}^{i1}$ from equation (9.4.19), the slope $s_{i1i2}$ becomes

$$s_{i1i2} = -C_a^{i1}\,\frac{r}{r+D_i-1}\,\frac{\left(C^{i2}/C^{i1}\right)_{0} - \left(C^{i2}/C^{i1}\right)_{a}}{1 - \dfrac{r}{r+D_i-1}\,\dfrac{C_a^{i1}}{C_0^{i1}}}$$

Multiplying this equation by the denominator of the last term on the right-hand side, we get

$$s_{i1i2}\left[1 - \frac{r}{r+D_i-1}\,\frac{C_a^{i1}}{C_0^{i1}}\right] = -C_a^{i1}\,\frac{r}{r+D_i-1}\left[\left(C^{i2}/C^{i1}\right)_{0} - \left(C^{i2}/C^{i1}\right)_{a}\right]$$

or

$$\frac{r}{r+D_i-1} = \frac{s_{i1i2}}{C_a^{i1}\left\{s_{i1i2}/C_0^{i1} - \left[\left(C^{i2}/C^{i1}\right)_{0} - \left(C^{i2}/C^{i1}\right)_{a}\right]\right\}}$$

Defining, in the diagram $(C^{i2}/C^{i1})_{liq}$ vs $1/C_{liq}^{i1}$, the slope $s_m$ of the mixing line between contaminant and initial magma as

$$s_m = \frac{\left(C^{i2}/C^{i1}\right)_{0} - \left(C^{i2}/C^{i1}\right)_{a}}{\left(1/C_0^{i1}\right) - \left(1/C_a^{i1}\right)}$$

r is calculated as

$$r = \frac{1-D_i}{1-C_a^{i1}/C_0^{i1}}\,\frac{s_{i1i2}}{s_{i1i2} - s_m} \tag{9.4.20}$$

It is left to the reader as an exercise to demonstrate that r can also be retrieved from the intercept $in_{i1i2}$ of the AFC alignment in the same diagram using

$$r = \frac{(D_i-1)\,C_0^{i1}\left[in_{i1i2} - \left(C^{i2}/C^{i1}\right)_{0}\right]}{C_a^{i1}\left[in_{i1i2} - \left(C^{i2}/C^{i1}\right)_{a}\right] - C_0^{i1}\left[in_{i1i2} - \left(C^{i2}/C^{i1}\right)_{0}\right]}$$

Calculate the evolution of the normalized concentration $C_{liq}^{Sr}/C_0^{Sr}$ of Sr in a magma with initial $^{87}$Sr/$^{86}$Sr = 0.703 which fractionates a cumulate with partition coefficient $D_i = 2$ while it assimilates surrounding rocks with a normalized concentration $C_a^{Sr}/C_0^{Sr} = 5$ and $^{87}$Sr/$^{86}$Sr = 0.712.

The calculations have been made on a spreadsheet: they consist in setting up a table of F and r values. Figure 9.9 shows the calculated isopleths (constant concentration lines) as they best show the critical phenomenon (steady concentration). From equation (9.4.18), the critical value $r_c$ of r is

$$r_c = \frac{1-2}{1-5} = 0.25$$

Figure 9.9 AFC model for a bulk partition coefficient $D_i = 2$ and $C_a^{i}/C_0^{i} = 5$. Subscript 'a' refers to the contaminant. Parameter r is defined in equation (9.4.12). The critical r value $r_c = 0.25$, calculated from equation (9.4.18), separates the fractionation dominant field from the assimilation dominant field. The labels on the curves refer to the values of $C_{liq}^{i}/C_0^{i}$.

Figure 9.10 AFC model with isotope ratios as in Figure 9.9. AFC curves at constant r are straight lines in a $^{87}$Sr/$^{86}$Sr vs $C_0^{Sr}/C_{liq}^{Sr}$ diagram. The straight lines do not pass through the point representing the contaminant. Open circles indicate 5 percent crystallization increments.

The isotopic results calculated from equations (9.4.16) and (9.4.19) are shown in the $^{87}$Sr/$^{86}$Sr vs $C_0^{Sr}/C_{liq}^{Sr}$ diagram of Figure 9.10: again, we see how the system switches from the crystallization-dominant to the assimilation-dominant regime when r crosses the critical value $r_c$.
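The spreadsheet calculation of this example can be mimicked with a few functions. The sketch below is illustrative and rests on assumptions stated here: the AFC equation is taken in the form $C_{liq}/C_0 = F^{-z} + [r/(r+D-1)](C_a/C_0)(1-F^{-z})$ with $z = (r+D-1)/(r-1)$, the isotopic ratio is obtained by applying that equation to each isotope, and `r_from_slope` uses a slope relation $r = r_c\,s/(s - s_m)$ derived for this sketch, which is not claimed to be the printed form of equation (9.4.20). All names are hypothetical.

```python
def z_exponent(r, d):
    """Exponent z of the AFC law; requires r != 1."""
    return (r + d - 1.0) / (r - 1.0)


def afc_concentration(f, r, d, ca_over_c0):
    """Normalized liquid concentration C_liq / C_0 at melt fraction f."""
    if r == 0.0:
        return f ** (d - 1.0)          # no assimilation: Rayleigh law
    z = z_exponent(r, d)
    rho = r / (r + d - 1.0)            # requires r + d != 1
    return f ** (-z) + rho * ca_over_c0 * (1.0 - f ** (-z))


def afc_ratio(f, r, d, ca_over_c0, ratio_0, ratio_a):
    """Isotopic ratio of the liquid, from the AFC law applied to both isotopes."""
    if r == 0.0:
        return ratio_0
    z = z_exponent(r, d)
    c = afc_concentration(f, r, d, ca_over_c0)
    return ratio_a + (ratio_0 - ratio_a) * f ** (-z) / c


def critical_r(d, ca_over_c0):
    """Value of r for which the liquid concentration stays constant."""
    return (1.0 - d) / (1.0 - ca_over_c0)


def r_from_slope(slope, d, ca_over_c0, ratio_0, ratio_a):
    """Recover r from the slope in a ratio vs C_0/C_liq diagram (see lead-in)."""
    s_mix = (ratio_0 - ratio_a) / (1.0 - 1.0 / ca_over_c0)
    return critical_r(d, ca_over_c0) * slope / (slope - s_mix)


D, CA, R0, RA = 2.0, 5.0, 0.703, 0.712   # values of the worked example
print("critical r =", critical_r(D, CA))
for r in (0.0, 0.1, 0.2, 0.3, 0.5):
    row = [(round(afc_concentration(f, r, D, CA), 3),
            round(afc_ratio(f, r, D, CA, R0, RA), 5))
           for f in (0.9, 0.7, 0.5)]
    print(f"r = {r}: (C_liq/C_0, 87Sr/86Sr) at F = 0.9, 0.7, 0.5 ->", row)
```

For r below the critical value the Sr concentration decreases with crystallization, as in simple fractional crystallization of a compatible element, and above it the concentration increases toward the contaminant-controlled limit.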


This model describes the displacement of a molten zone of constant length and can be considered as a forerunner of the modern percolation theory. The process itself was first developed in metallurgy as a way of producing high-purity metals and is known as both zone-refining and zone-melting (Pfann, 1952). The original model, brought to geology by Harris (1957), initially dealt with a moving zone of completely molten material. It is of broader geological interest to consider (Figure 9.11) the displacement of a partially molten zone of constant length L (= volume per unit surface) with a volume fraction $\Phi$ of liquid. This zone moves through a medium at concentration $C_0^{i}(z)$ and leaves a residuum with a volume fraction $\varphi$ of liquid which will be referred to as residual porosity. $\rho_{sol}$ and $\rho_{liq}$ are the densities of the solid matrix and melt, respectively. The molten zone is well-mixed so the reference level for the liquid and solid concentrations is conveniently taken as z. $\Phi$, $\varphi$, and the partition coefficients are supposed to be invariant. When the molten zone has proceeded over a length z, mass balance requires

dL[Φ ρ_liq C_liq^i + (1 - …    (9.4.22)

We define the enrichment factors $k_i^{L}$ and $k_i^{R}$ as

$$k_i^{L} = \Phi\,\rho_{liq} + (1-\Phi)\,\rho_{sol}\,D_i \tag{9.4.23}$$

and

$$k_i^{R} = \varphi\,\rho_{liq} + (1-\varphi)\,\rho_{sol}\,D_i \tag{9.4.24}$$

Rearranging equation (9.4.22), we get the general conservation equation (9.4.25).

The characteristic length over which concentration of element i in the liquid changes by a factor e because of zone melting is $(k_i^{L}/k_i^{R})L$. If the distribution prior to melting is constant and such that $C_0^{i}(z+L) = C_0^{i}$ independent of the depth z, equation (9.4.25) is integrated as

Figure 9.11 The layout of the zone-refining model. [Labels: freezing; molten zone propagates; fresh wall-rock.]

The value of $C_{liq}^{i}$ at z = 0 can be the concentration of a liquid generated by batch or fractional melting from the same source or that of an exotic liquid introduced at the bottom of the melting column. The assumption of $\Phi = 1$ (all liquid zone) and

Pliq L

pliq L

(9.4.27)

When $z \gg L$, limit concentration is reached for

(9.4.28)

Incompatible elements can achieve very large enrichment in the liquid. Steady-state is achieved over a characteristic length in proportion with $(k_i^{L}/k_i^{R})L$. For small porosities, this length is in the order of $(\Phi/\varphi)L$ for incompatible elements, in the order of L for compatible elements (Figure 9.12). A very small fraction $\varphi$ of liquid left behind therefore has a dramatic effect on both the limit concentration ($\approx C_0^{i}/\varphi$) and the characteristic length of incompatible elements. We can further investigate the concentration distribution left behind upon sweeping of the solid by a large number of liquid molten zones, which, for simplicity, we choose of identical thickness. This problem is identical to finding a distribution $C_0^{i}(z)$ which would be insensitive to the propagation of the molten zone. Equivalently, we can state that the molten zone does not transport any substantial amount of element i. The asymptotic solution for an infinitely large number of sweeps has been given by Pfann (1952) in the case of complete melting. We will now derive a solution for constant L, $D_i$, $\Phi$, and $\varphi$. Since there is no transport of the element i by the molten zone, the local balance must be that of closed-system partial melting

(9.4.29)

Figure 9.12 The zone-refining model described by simplified equation (9.4.27) for a completely molten zone. Concentration in the solid left behind the zone, plotted against z/L, for different values of the bulk solid-liquid $D_i$. Steady-state is achieved over distances much shorter for compatible than for incompatible elements.

We recognize in equation (9.4.29) the batch equilibrium melting equation (9.2.2) with the solid concentration averaged over the molten zone as the source composition. What is left behind by the molten zone should both recover the concentration distribution at $C_0^{i}(z)$ and be in equilibrium with the liquid in the molten zone, i.e.,

Inserting this expression into (9.4.29) leads to the integral equation

$$C_{liq}^{i}(z) = \frac{k_i^{R}}{L\,k_i^{L}}\int_{z}^{z+L} C_{liq}^{i}(u)\,du \tag{9.4.30}$$

We could take the derivative of this expression and apply Leibniz's rule, but we know that exponentials have the property that their integrals and derivatives are linearly related. We therefore try the solution

$$C_0^{i}(z) = C_0^{i}(0)\,e^{z/\ell_i} \tag{9.4.31}$$

where $\ell_i$ is a constant with the dimension of a length to be determined. $\ell_i$ is the distance over which the concentration of element i changes by a factor e. Inserting this expression into the integral equation (9.4.30) gives

$$L\,k_i^{L} = k_i^{R}\,\ell_i\left(e^{L/\ell_i} - 1\right)$$

Figure 9.13 The zone-refining model with an infinite number of passes: determination through equation (9.4.32) of the length $\ell_i$ in the exponential distribution of solid concentrations described by equation (9.4.31). Incompatible elements are such that the $k_i^{R}/k_i^{L}$ ratio is nearly equal to the ratio of residual porosity to the degree of melting and therefore are efficiently skimmed downstream ($\ell_i \ll L$). [Fields labeled upstream enrichment and downstream enrichment.]

The parameter $\ell_i$ is therefore obtained as a solution of the transcendental equation

$$\frac{k_i^{L}}{k_i^{R}} = \frac{\ell_i}{L}\left(e^{L/\ell_i} - 1\right) \tag{9.4.32}$$

This relationship has been displayed in Figure 9.13. For small values of $\Phi$ and $\varphi$, compatible elements are such that $k_i^{L} \approx k_i^{R}$. This means that $\ell_i \gg L$ and compatible elements such as Ni, Cr, or Mg are virtually unaffected by zone-refining. Incompatible elements are such that $k_i^{L}/k_i^{R}$ …

$$\frac{\ln C_0^{i2}(z) - \ln C_0^{i2}(0)}{\ln C_0^{i1}(z) - \ln C_0^{i1}(0)} = \frac{\ell_{i1}}{\ell_{i2}} \tag{9.4.33}$$

As for fractional crystallization and fractional melting, element-element plots with a logarithmic scale should show straight lines for the solid as well as for the liquid, since both differ by a constant coefficient. Contrary to fractional crystallization but similar to fractional melting, discussed above, and to percolation, to be presented below, zone-melting is a very powerful process to separate incompatible elements.
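The characteristic length of the multi-pass distribution has no closed form, but it is easily found numerically. The sketch below assumes that the transcendental equation (9.4.32) reads $k^{L}/k^{R} = (\ell/L)(e^{L/\ell} - 1)$ and that the enrichment factors are $k = (\text{porosity})\,\rho_{liq} + (1 - \text{porosity})\,\rho_{sol} D$. The densities, the porosities and all function names are illustrative choices made here, not values from the text.

```python
import math


def enrichment_factor(porosity, d, rho_liq=2.8, rho_sol=3.3):
    """k = porosity * rho_liq + (1 - porosity) * rho_sol * D (densities assumed)."""
    return porosity * rho_liq + (1.0 - porosity) * rho_sol * d


def g(u):
    """(exp(u) - 1) / u, continued by its limit 1 at u = 0; increasing in u."""
    return 1.0 if abs(u) < 1e-12 else math.expm1(u) / u


def length_ratio(kappa, tol=1e-12):
    """Solve kappa = (l/L) (exp(L/l) - 1) for l/L by bisection on u = L/l."""
    if kappa <= 0.0 or kappa == 1.0:
        raise ValueError("kappa must be positive and different from 1")
    if kappa > 1.0:
        lo, hi = 0.0, 1.0
        while g(hi) < kappa:       # expand the bracket toward large positive u
            hi *= 2.0
    else:
        lo, hi = -1.0, 0.0
        while g(lo) > kappa:       # expand the bracket toward large negative u
            lo *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < kappa:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 1.0 / (0.5 * (lo + hi))


ZONE_MELT, RESIDUAL = 0.10, 0.01   # volume fractions of liquid, assumed values
ratios = {}
for d in (0.001, 0.1, 5.0):
    kappa = (enrichment_factor(ZONE_MELT, d) / enrichment_factor(RESIDUAL, d))
    ratios[d] = (kappa, length_ratio(kappa))
    print(f"D = {d}: kL/kR = {kappa:.3f}, l/L = {ratios[d][1]:.3f}")
```

With these assumed values the strongly incompatible element gives a positive $\ell$ shorter than L, while the compatible element gives $|\ell|$ several times L, with a negative sign because $k^{L}/k^{R}$ is then slightly below 1.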

9.4.5 Percolation and magma segregation

Percolation models differ from the zone-refining model essentially by the absence of mixing in the liquid, giving the liquid position-dependent properties. A simplified account of these models was described in Chapter 8. We will now provide a reasonably comprehensive account which may prove useful to the demanding reader, and then examine some properties of the chromatographic effect in a simple configuration. Let $\varphi$ be the open volume porosity of the medium, $\rho_{sol}$ and $\rho_{liq}$ the density of the solid matrix and melt, respectively, $\mathbf{v}_{liq}$ the liquid velocity relative to the matrix, and $C_{sol}^{i}$ and $C_{liq}^{i}$ the concentration of element i in the matrix and melt, respectively. Let us rewrite equation (8.3.14) as

W

i

u

,

y

CJ-C^

d c h q 88

cpPliq + (l-cp)psoldCj/dCnqi

dt

hq hq

d(cppsol)

w , l q + (l-^)pMldCMlVdCllq'

dt

(9.4.34)

where, in order to keep the problem general, we temporarily keep the ratio $dC_{sol}^{i}/dC_{liq}^{i}$ on the denominators. The derivative $\partial(\varphi\rho_{sol})/\partial t$ on the right-hand side of (9.4.34) is a source term which corresponds to the rate of matrix conversion into melt. It should be expressed in unit mass per unit volume and unit time. Mass-fractions of melt, however, are more familiar to the geochemist. Using relation (9.3.22) between the mass-fraction $\phi$ of porosity and the volume fraction $\varphi$, the following expression is easily derived

4>

Pliq
which, once inserted into (9.4.34), gives

Significant simplification can be achieved through the rather innocuous assumption of constant p sol . Given the differential form of equation (9.3.22) PliqPso1

dcp = -

we obtain the fundamental transport equation

+

dt

v

l

i

q

g

r a d C

$ + (l-t)dCsoll/dCUql

l

i

x

;

0 + (l-0)dC Ml '/dC liq '

(/> + (l-0)pliq/pSoi dt (9.4.35)

It should be kept in mind that the reference frame is attached to the solid matrix. In order to solve this equation, the solid-liquid fractionation for element $i$, the flow field $\mathbf{v}_{liq}$ and the rate of melting must be known. A common assumption for the dependence of $\phi$ on time is that of adiabatic decompression, while buoyancy forces drive the melt out of the solid matrix. One extreme model (McKenzie, 1984) assumes that melt is expelled at a rate controlled by the deformation of the matrix in the gravity field (compaction). By contrast, Ribe (1985) argues that compaction should be negligible and uses Darcy's law of porous flow in a moving medium (upwelling) together with some arbitrary limits on the depth of melting. A simple solution is arrived at for local equilibrium obeying Henry's law ($dC_{sol}^i/dC_{liq}^i = D_i$, the bulk solid-liquid partition coefficient) and constant porosity. It resembles the solution which was worked out in the advection section of Chapter 8 except that now it is expressed in terms of concentrations in the liquid

$$\frac{\partial C_{liq}^i}{\partial t} + \frac{\phi}{\phi + (1-\phi)D_i}\,\mathbf{v}_{liq}\cdot\operatorname{grad} C_{liq}^i = 0 \tag{9.4.36}$$

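An isopleth moves with an apparent velocity $\mathbf{v}^i$ such that

$$\mathbf{v}^i = \frac{\phi}{\phi + (1-\phi)D_i}\,\mathbf{v}_{liq} = \varepsilon_i\,\mathbf{v}_{liq} \tag{9.4.37}$$

where

$$\varepsilon_i = \frac{\phi}{\phi + (1-\phi)D_i} \tag{9.4.38}$$

represents the fraction of element $i$ that resides locally in the liquid. The isopleth velocity is identical to the fluid velocity for incompatible elements ($D_i = 0$) and slower for compatible elements. In a one-dimensional frame (z-axis), the equation becomes

$$\frac{\partial C_{liq}^i}{\partial t} + v^i\,\frac{\partial C_{liq}^i}{\partial z} = 0 \tag{9.4.39}$$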

where the scalar component $v^i$ of the velocity along z is now used. This is the linear traveling wave equation (e.g., Logan, 1987 and Chapter 8) which admits the general solution $f(z - v^i t)$ where $f$ is a function that depends on the conditions at t = 0 only. The concentration distribution at any time t is simply the distribution at t = 0 shifted by the distance $v^i t$. For instance, if the initial distribution resembles a normal curve

$$C_{liq}^i(z, 0) = C_0^i \exp\left[-\frac{(z - z_0)^2}{2\lambda^2}\right] \tag{9.4.40}$$

where $z_0$, $\lambda$ and $C_0^i$ are constants, the general solution is

$$C_{liq}^i(z, t) = C_0^i \exp\left[-\frac{(z - z_0 - v^i t)^2}{2\lambda^2}\right] \tag{9.4.41}$$

If, instead of keeping track of the concentration changes at a fixed level in the matrix, we follow a parcel of melt traveling with the velocity $v_{liq}$ ($z = v_{liq} t$), the concentration becomes

$$C_{liq}^i(z) = C_0^i \exp\left\{-\frac{\left[(1 - v^i/v_{liq})\,z - z_0\right]^2}{2\lambda^2}\right\} \tag{9.4.42}$$

Table 9.7. Ratio $v^i/v_{liq}$ during percolation of a melt with specific density 2.7 through a porous matrix with specific density 3.4 for different values of partition coefficient $D_i$ and porosity $\varphi$. Incompatible elements keep pace with the liquid, compatible elements lag significantly behind.

Porosity φ     D_i = 0     D_i = 0.01     D_i = 0.1     D_i = 5
0.005          1.000       0.285          0.038         0.001
0.01           1.000       0.445          0.074         0.002
0.02           1.000       0.618          0.139         0.003
0.05           1.000       0.807          0.295         0.008
0.1            1.000       0.898          0.469         0.017
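The entries of Table 9.7 can be checked numerically. The following is a minimal sketch, assuming that relation (9.3.22) is the usual conversion between volume porosity and mass fraction of melt and that the velocity ratio is the fraction $\varepsilon_i$ of equation (9.4.38); the function names are illustrative only.

```python
def mass_fraction_melt(porosity, rho_liq=2.7, rho_sol=3.4):
    """Mass fraction of melt for a given volume porosity (assumed form of 9.3.22)."""
    return porosity * rho_liq / (porosity * rho_liq + (1.0 - porosity) * rho_sol)


def velocity_ratio(D, porosity, rho_liq=2.7, rho_sol=3.4):
    """Ratio v^i / v_liq, i.e. the fraction of element i held by the liquid (9.4.38)."""
    phi = mass_fraction_melt(porosity, rho_liq, rho_sol)
    return phi / (phi + (1.0 - phi) * D)


porosities = [0.005, 0.01, 0.02, 0.05, 0.1]
partition_coefficients = [0.0, 0.01, 0.1, 5.0]

# One row per porosity, one column per partition coefficient, as in Table 9.7.
computed = {p: [velocity_ratio(D, p) for D in partition_coefficients] for p in porosities}

for p, row in computed.items():
    print(p, " ".join(f"{value:.3f}" for value in row))
```

With the densities quoted in the table caption, the computed ratios agree with the tabulated values to within rounding of the third decimal.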

Chromatographic fractionation upon percolation is expected to be an extremely efficient way of changing relative distributions of trace elements. The initial concentration distribution is therefore simply translated at the velocity of the liquid: steady flow and full equilibrium between the liquid and its matrix require that the amount of element transported by the concentration 'wave' is constant. In more realistic cases, either the flow is non-steady due to abrupt changes in fluid advection rate or porosity, or solid-liquid equilibrium is not achieved. These cases may lead to non-linear terms in the chromatographic equation (9.4.35) and unstable behavior. The rather complicated theory of these processes is beyond the scope of the present book.

Discuss the elemental fractionation induced by the percolation of a melt with specific density 2.7 through a porous matrix with specific density 3.4 for different values of $D_i$ and volume porosity $\varphi$.

The ratio $v^i/v_{liq}$ is given by Table 9.7, which shows that the smaller the porosity, the more efficient the elemental fractionation. Assuming, as an illustration, that, at t = 0, element $i$ is normally distributed as a function of depth with $z_0 = 0$ and $\lambda = 1$, the spatial concentration distribution after a time t = 2 has been calculated from equation (9.4.41) and drawn in Figure 9.14 for various values of $D_i$. The more compatible the elements, the more they lag behind. Note the quite efficient separation of incompatible elements. An interesting property of trace-element ratios is their change around the initial value: since Nd is more incompatible than Sm, the Sm/Nd ratio is expected first to decrease and then increase below the initial Sm/Nd value as the liquid progresses in the rock column.
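The translation of a normal profile described in this problem can be sketched in a few lines. This is only an illustration under stated assumptions: the width convention $2\lambda^2$ in the exponent, the unit liquid velocity and the sample velocity ratio 0.285 (taken from Table 9.7) are choices made here, not values fixed by the text.

```python
import math


def liquid_concentration(z, t, v_i, c0=1.0, z0=0.0, lam=1.0):
    """Normal profile translated at the isopleth velocity v_i (assumed form of 9.4.41)."""
    return c0 * math.exp(-((z - z0 - v_i * t) ** 2) / (2.0 * lam ** 2))


v_liq = 1.0             # assumed liquid velocity (arbitrary units)
t = 2.0                 # time used for Figure 9.14
for ratio in (1.0, 0.285):
    v_i = ratio * v_liq
    peak_position = v_i * t  # the maximum has moved from z0 = 0 to v_i * t
    print(ratio, peak_position, liquid_concentration(peak_position, t, v_i))
```

The profile keeps its shape and height; only the position of the maximum depends on the partition coefficient through $v^i$.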


Figure 9.14 Chromatographic separation of elements with the same initial normal concentration (standard length $\lambda$) and different bulk solid-liquid partition coefficient $D_i$ through migration of a fluid in a medium of constant porosity $\varphi$ at time t = 2 [equation (9.4.41)]. Horizontal axis: reduced distance $z/\lambda$.

The pre-1987 literature on this topic is reviewed by Ribe (1987). McKenzie (1984) suggests that, since liquid loses contact with its source very rapidly, melt extraction upon compaction should be modelled by fractional melting. He later emphasized the role of residual porosity and used the continuous melting equations (McKenzie, 1985). Richter (1986) assumes gravitational compaction and, through numerical schemes, computes the apparent degree of melting that conventional models of batch melting and fractional melting would hint at. Ribe (1985) recognizes that the batch-melting equation is a solution to the steady-state percolation problem. Navon and Stolper (1987) investigate metasomatism by infiltration of basalts in peridotites and take diffusion in the solid into account, while Bodinier et al. (1990) and Vasseur et al. (1991) emphasize the kinetic role of grain-size heterogeneities. McKenzie (1985) and Spiegelman and Elliott (1993) investigate disequilibrium in the uranium series for compaction and steady-state models, respectively.

A quite serious problem, however, still obscures most applications of the percolation theory to the transport of magmas. Most major elements, such as Si, Mg, Ca, ... can be considered as compatible since their concentrations in the peridotite source and the basaltic melt are similar within a factor of ≈ 3. Equation (9.4.37) indicates, as would equations (8.3.17) and (8.3.19) in the most general case, that major elements are slower than the liquid, especially for small porosities. But what is the liquid made of, then? The velocity of a medium is the weighted average velocity of its constituents [see equation (8.1.4)]. The basalt velocity is that of Si, Mg, Ca, ... weighted by their abundance in the melt. Chemical melt velocity appears distinct from, and lower than, its physical velocity, which somehow violates common sense. Part of the answer is that, in order for the chromatographic model to apply, the liquid carrier should be chemically inert. If major elements are not to develop fronts in percolating basalts, it must be assumed that $dC_{sol}^i/dC_{liq}^i$ in equation (9.4.34) is virtually zero, although $C_{sol}^i$ is not, i.e., the liquid is well buffered and Henry's law does not apply. The rather stringent assumption of trace-element transport by percolating magmas is therefore that of major elements being inert relative to the porous matrix while trace elements are easily exchanged.

9.5 Which element, which process?

9.5.1 The good use of compatible and incompatible elements

Choosing the elements which are the most informative for a given situation requires some attention. In the context of magma genesis, the trace elements used for fusion processes, usually solid-dominated, should be different from those used for the crystallization processes that are most commonly liquid-dominated. Let us try to assess how informative the concentrations in melts of elements with given partition coefficients are for identifying processes, in particular fractional crystallization at low to moderate extents of fractionation, and partial melting with small melt fractions. We derive first the relative change dC/C (or, equivalently, d ln C) that an element with partition coefficient D undergoes when the melt fraction changes by a quantity dF. Taking the differential form of the fractional crystallization equation leads to

$$\frac{dC_{liq}^i}{C_{liq}^i\,dF} = \frac{D_i - 1}{F} \tag{9.5.1}$$

For small extents of crystallization, the maximum change, and thereby the most valuable information on F, will be obtained from elements with high $D_i$ (compatible elements) such as Ni in basaltic olivine. Elements with $D_i \ll 1$ (incompatible elements), such as Th, Ba or rare-earth elements in basaltic systems, will provide basically no clue to F variations. In addition, information carried by incompatible elements, which do not fractionate with respect to each other, is entirely redundant. This is better shown by taking the relative change in the ratio of two elements i1 and i2 per increment of crystallization

$$\frac{d\ln\left(C_{liq}^{i1}/C_{liq}^{i2}\right)}{dF} = \frac{D_{i1} - D_{i2}}{F} \tag{9.5.2}$$

In Figure 9.15, the relationship between the fractional change in the elemental ratio and the extent of crystallization F is plotted for different values of $\Delta D = D_{i1} - D_{i2}$: for partition coefficients less than 0.1, several tens of percent fractionation are needed before a change of a few percent in the ratio becomes visible. Crystal fractionation does not change incompatible-element ratios such as La/Yb, Zr/Nb, ... except in extremely residual melts.
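The statement about 'several tens of percent fractionation' can be made concrete by integrating equation (9.5.2), which gives an elemental ratio proportional to $F^{\Delta D}$. The sketch below assumes constant partition coefficients and a positive $\Delta D$; the function names are hypothetical.

```python
def ratio_relative_to_initial(F, delta_D):
    """Elemental ratio divided by its initial value after integrating (9.5.2)."""
    return F ** delta_D


def residual_melt_for_change(relative_change, delta_D):
    """Residual melt fraction F at which the ratio has dropped by relative_change."""
    return (1.0 - relative_change) ** (1.0 / delta_D)


# Extent of crystallization (1 - F) needed for a 5 percent change in the ratio.
for delta_D in (10.0, 1.0, 0.1):
    F = residual_melt_for_change(0.05, delta_D)
    print(delta_D, 1.0 - F)
```

For $\Delta D = 0.1$ roughly 40 percent of the melt has to crystallize before the ratio changes by 5 percent, whereas for $\Delta D = 10$ half a percent of crystallization is enough.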

Figure 9.15 Fractionation of two trace elements i1 and i2 during fractional crystallization according to equation (9.5.2). $\Delta D$ is the difference $D_{i1} - D_{i2}$. Incompatible elements are not fractionated efficiently even for large extents of solid removal. Horizontal axis: fraction of residual melt, F; curves are labeled with values of $\Delta D$ from 10 to 0.01.

Taking the log-derivative of the partial melting equation (9.2.2) relative to F leads to

$$\frac{dC_{liq}^i}{C_{liq}^i\,dF} = -\frac{1 - D_i}{F + D_i(1 - F)} \tag{9.5.3}$$

We will assume small degrees of melting, i.e., $F \ll 1$. Two extreme cases will be considered. For compatible elements ($D_i \gg F$)

$$\frac{dC_{liq}^i}{C_{liq}^i\,dF} \approx -\frac{1 - D_i}{D_i} \tag{9.5.4}$$

which shows that the concentration of compatible elements does not change much with the degree of melting (buffering). For instance, Ni will be buffered by olivine during mantle melting. In contrast, for incompatible elements ($D_i \ll 1$), concentration changes as

$$\frac{dC_{liq}^i}{C_{liq}^i\,dF} \approx -\frac{1}{F + D_i} \tag{9.5.5}$$
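The two limiting forms can be compared with the full log-derivative numerically. The following sketch assumes that equation (9.2.2) is the batch (equilibrium) melting law $C_{liq} = C_0/[F + D(1 - F)]$, which is the form consistent with (9.5.3); the chosen values of F and D are illustrative.

```python
import math


def batch_melt_concentration(F, D, c_source=1.0):
    """Melt concentration for equilibrium melting (assumed form of 9.2.2)."""
    return c_source / (F + D * (1.0 - F))


def log_derivative(F, D):
    """Relative change of the melt concentration per unit F, equation (9.5.3)."""
    return -(1.0 - D) / (F + D * (1.0 - F))


def numerical_log_derivative(F, D, h=1e-6):
    """Centered finite-difference estimate of d ln C / dF, used as a cross-check."""
    upper = math.log(batch_melt_concentration(F + h, D))
    lower = math.log(batch_melt_concentration(F - h, D))
    return (upper - lower) / (2.0 * h)


def compatible_limit(D):
    """Approximation (9.5.4), valid for D much larger than F."""
    return -(1.0 - D) / D


def incompatible_limit(F, D):
    """Approximation taken here for D much smaller than 1."""
    return -1.0 / (F + D)


print(log_derivative(0.01, 5.0), compatible_limit(5.0))
print(log_derivative(0.05, 0.001), incompatible_limit(0.05, 0.001))
```

For a compatible element the concentration changes by less than one percent per percent of melting, while for a strongly incompatible element at F = 0.05 the change is about 20 percent per percent of melting.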

Incompatible-element concentrations will change very strongly with the degree of melting. For extremely small degrees of melting ($F \lesssim D_i$), incompatible-element ratios will also change, whereas at higher F they will tend to level off. Elements have therefore to be used for what they are good at. In a suite of rocks, using compatible elements to decipher partial melting processes is futile because: (i) magma differentiation makes the primitive melt concentrations unattainable; (ii) even if this problem can be overcome, information on the degree of melting F is very poor. Similarly, when using incompatible elements to address fractionation processes: (i) subtle changes in the degree of melting which produced each parent magma will overwhelm the variations produced by crystal fractionation; (ii) even if this problem can be overcome, information on the extent of fractionation 1 - F is very poor.

So much for trace elements in melts. One may wonder what we can expect from solids, especially residual rocks. Incompatible elements are, by definition, drained preferentially into the melt and we should be rather suspicious about using these elements in residues and cumulates. When melts are produced upon melting of a source rock or when melts infiltrate through a porous layer, it seems quite unlikely that they may be quantitatively 'wrung out' of the wetted layer. The concept of melt trapped in source rocks has been used by Langmuir et al. (1977) to explain some geochemical features of the basalts from the FAMOUS area in the Atlantic. Residues are expected to recrystallize with the melt left behind (residual porosity). In addition to all the parameters which control melting processes in melts (melting process, melt fraction, residuum mineralogy, ...), the fraction of melt trapped by the source rocks is an extra degree of freedom which complicates the interpretation of incompatible elements in solids. Assuming for element $i$ a solid-liquid partition coefficient $D_i$, the concentration in a hypothetical dry residue of melting or percolation and the concentration in a rock which would have trapped a fraction $f_{tr}$ of interstitial liquid, are related by

$$\frac{C_{rock}^i}{C_{residue}^i} = 1 - f_{tr} + \frac{f_{tr}}{D_i}$$

Assuming that half a percent liquid is trapped with the residue, equating the concentration in the rock to that in the solid residue will result in a severe bias for incompatible elements with $D_i < 0.01$ and in pure nonsense for elements with $D_i < 0.001$. A typical example is represented by rare-earth elements in peridotites. Even separated clinopyroxenes can be suspected to have incorporated most of the REE from whichever trace amounts of liquid happened to be trapped in the cooling rock. If the rest of the minerals do not take any REE, it is left to the reader as an exercise to show that the concentration in clinopyroxene after uptake of incompatible elements is related to that in the clinopyroxene from a liquid-free residual peridotite through

$$\frac{C_{cpx+tr}^i}{C_{cpx}^i} = \frac{1}{f_{cpx} + f_{tr}}\left(f_{cpx} + \frac{f_{tr}}{K_{cpx/liq}^i}\right)$$

where $K_{cpx/liq}^i$ is the clinopyroxene/liquid partition coefficient. For $K_{cpx/liq}^i = 0.1$, $f_{tr} = 0.5$ percent, and $f_{cpx} = 10$ percent, concentration in the clinopyroxene contaminated by liquid is 43 percent larger than in the initial residual mineral. For $f_{tr} = 2$ percent, and $f_{cpx} = 5$ percent, error exceeds 350 percent.
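The figures quoted for trapped liquid can be reproduced with a short calculation. This is a sketch under the assumption that the two mass balances take the forms used in the functions below (trapped liquid in equilibrium with the residue, and clinopyroxene as the only REE host); both forms and the function names are reconstructions for illustration.

```python
def rock_over_residue(D, f_trapped):
    """Concentration in a rock holding trapped liquid relative to the dry residue."""
    return 1.0 - f_trapped + f_trapped / D


def contaminated_cpx_over_clean(K, f_trapped, f_cpx):
    """Concentration in clinopyroxene after uptake of the trapped liquid budget,
    relative to clinopyroxene from a liquid-free residue."""
    return (f_cpx + f_trapped / K) / (f_cpx + f_trapped)


# Half a percent of trapped liquid in the whole rock.
for D in (0.1, 0.01, 0.001):
    print(D, rock_over_residue(D, 0.005))

# The two clinopyroxene cases quoted in the text.
print(contaminated_cpx_over_clean(0.1, 0.005, 0.10))
print(contaminated_cpx_over_clean(0.1, 0.02, 0.05))
```

Under these assumptions the first clinopyroxene case gives a ratio of about 1.43, i.e. the quoted 43 percent excess, and the second gives a ratio of about 3.6.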


9.5.2 Elements and processes

The nature of melting and crystallization processes is largely unknown and much of what is described as field evidence is actually model-dependent. Because observations are dependent on what is assumed to represent a certain type of mantle (e.g., ophiolites), the respective role of porous flow (McKenzie, 1984) and channel migration (e.g., Nicolas and Jackson, 1982; Nicolas, 1986) is not unambiguously established, neither is the rheology of a molten mantle rock at high temperature. The role of regional stress in driving melts out of the pores depends on a mechanical model of source rocks and is largely a matter of speculation. For stability reasons, the persistence in the mantle of a liquid phase continuous over large vertical distances is rather problematic. Under certain conditions, solitary waves that propagate regions of high magma-filled porosity upwards are solutions to the percolation equation (Scott and Stevenson, 1986) and can be modeled by zone-refining. Ribe (1985) demonstrated that the equilibrium melting equation is a solution to the porous flow equation with no diffusion term, although the process is not 'batch' melting since melt migrates relative to the matrix. To his own 'surprise', Richter (1986) found that perfect equilibrium partial melting equations replicate quite well the results of an ideal model of melt segregation from a deformable matrix. This is probably because incompatible elements are fractionated by melt segregation for liquid-solid ratios which would also produce serious elemental fractionation in a static melting process. Evidence for fractionated incompatible-element ratios in a melt may suggest either small degrees of melting or aggregation of different melt batches (e.g., O'Hara, 1985), where at least one of them represents a small liquid/solid volume ratio.
For instance, the U-Th fractionation during basalt genesis demonstrated by Th isotope geochemistry (Condomines et al., 1981) hints at extremely dispersed melt in some part of the mantle source although not necessarily through a percolation process. The equilibrium of melts with residual mantle is also currently under active research. From the U-shaped distributions of rare-earth elements observed in the ultramafic section of ophiolites, Prinzhofer and Allegre (1985) favored a model in which the melts under ridge crests are not equilibrated with the residual solid. Johnson et al. (1990) found that clinopyroxenes from mid-ocean ridge peridotites are too depleted in light REE relative to the heavy REE to be equilibrated with MORB liquids. They conclude that melting must have been progressive and their observations support a mechanism of fractional melting. Although these findings turn out to be real breakthroughs for understanding melting processes, the utmost care is still necessary in interpreting rare-earth and other incompatible-element distributions in peridotites and their minerals.

Likewise, an acceptable picture of magma chambers is available only for those magmas which differentiate at less than a few kilobars (Mid-Ocean Ridge and Continental Flood basalts). Seismic evidence under Hawaii (Ryan, 1988) provides no more than a blurred image of mechanical events. Even in the best documented cases, animated controversies exist on many of their basic features such as the role of replenishment and crystal settling, persistent zoning, convection, the locus of crystallization and interpretation of the seismic evidence. Analog models, such as syrup-and-dye-in-tank magma chambers, provide a useful illustration of potential factors. However, scaling of natural processes on man-made material is approximate, geometry is dependent on ambiguous field observations, observables are few and their interpretation is oriented by rather strong prejudices on the nature and dynamics of magma bodies.

Even though they may sound physically less informative, simple models are as illuminating as complicated constructions. Models with a large number of parameters are only nearly as good as our knowledge of the least-known parameters. Little can be said about the uniqueness of solutions: as mentioned in Chapters 4 and 5, the covariance structure of the parameter space strongly conditions the results. A considerable effort is still to be made in order to describe the range of acceptable 'realistic' models. In this context, Monte-Carlo simulations have not received the attention they deserve. On the contrary, when the number of parameters is reasonably small, models can be tested accurately and stumbling blocks identified. Geochemistry offers four types of constraints: (a) elemental mass balance; (b) elemental distribution among various phases; (c) the rate of radioactive element decay; (d) the overwhelming consistency of geochemical patterns among major petrological units (the odds of predicting reasonably well the La/Yb of a normal MORB or the 87Sr/86Sr ratio of a sedimentary carbonate are extremely good). These constraints, however, rarely provide evidence for specific physical processes, but should remain the base of any calculation.

9.6 Disequilibrium fractionation during crystal growth

For all the previous models, equilibrium partitioning of elements among homogeneous phases has been assumed. Crystal growth and melting, however, are disequilibrium processes and distribution of elements between mineral and melt must therefore be considered in kinetic conditions. We may wonder how kinetics, i.e., the combination of crystal growth and diffusional transport, affects trace-element distributions between crystals and liquid. We consider a liquid (x > 0) and a reference frame with x = 0 at the mineral-liquid interface. Liquid therefore seems to move towards negative x values and freeze upon interface crossing. Crystal growth rate will be assumed to be constant with modulus v. Fractionation of the element $i$ at the interface will be assumed to take place at equilibrium and be governed by a mineral-liquid partition coefficient $D_i$. Diffusion in the solid is neglected as is volume change upon solidification. At steady-state and allowing for a negative advection rate, the one-dimensional transport equation reads

$$\mathcal{D}_i\,\frac{d^2 C_{liq}^i}{dx^2} + v\,\frac{dC_{liq}^i}{dx} = 0 \tag{9.6.1}$$

where $\mathcal{D}_i$ is the diffusion coefficient of element $i$. Mass balance at the interface requires the equality of the diffusion and advection fluxes on the liquid side with the advection flux on the solid side

$$-\mathcal{D}_i\left(\frac{dC_{liq}^i}{dx}\right)_{x=0} - v\,C_{liq}^i(0) = -v\,C_{sol}^i(0) \tag{9.6.2}$$

where, again, advection rate is negative. Equilibrium fractionation at the interface requires

$$C_{sol}^i(0) = D_i\,C_{liq}^i(0) \tag{9.6.3}$$

The condition at x = 0 is therefore

$$\left(\frac{dC_{liq}^i}{dx}\right)_{x=0} = \frac{v}{\mathcal{D}_i}\,C_{liq}^i(0)\,(D_i - 1)$$

It is convenient to introduce the new variable

$$u = \frac{dC_{liq}^i}{dx}$$

into the diffusion equation which leads to the first-order differential equation

$$\mathcal{D}_i\,\frac{du}{dx} + v\,u = 0$$

Upon two successive integrations, we obtain

$$C_{liq}^i = \alpha\,e^{-vx/\mathcal{D}_i} + \beta$$

where $\alpha$ and $\beta$ are two constants to be determined from the boundary conditions. Taking the derivative and applying the condition at x = 0, it becomes

$$\left(\frac{dC_{liq}^i}{dx}\right)_{x=0} = -\frac{v}{\mathcal{D}_i}\,\alpha = \frac{v}{\mathcal{D}_i}\,(\alpha + \beta)(D_i - 1)$$

Therefore

$$\alpha = \beta\,\frac{1 - D_i}{D_i}$$

and

$$C_{liq}^i = \beta\left[1 + \frac{1 - D_i}{D_i}\,e^{-vx/\mathcal{D}_i}\right] \tag{9.6.4}$$

One additional boundary condition being needed, two cases have been treated in the literature: (i) Tiller et al. (1953) assume the liquid medium is unbounded and therefore

$$C_{liq}^i \to C_0^i \quad \text{for} \quad x \to \infty$$

which results in

$$C_{liq}^i(x) = C_0^i\left[1 + \frac{1 - D_i}{D_i}\,e^{-vx/\mathcal{D}_i}\right] \tag{9.6.5}$$

Figure 9.16 Kinetic fractionation during crystal growth. Steady-state distribution of melt concentrations in the vicinity of a solid growing at the rate v for trace elements with different solid-liquid fractionation coefficients [equation (9.6.5), Tiller et al. (1953)]. The stippled area indicates the steady-state chemical boundary-layer with thickness $\delta = \mathcal{D}/v$. Horizontal axis: normalized distance to interface, $vx/\mathcal{D}$.

The concentration profiles for $D_i = 0.1$ and $D_i = 5$ have been depicted in Figure 9.16 as a function of the dimensionless distance $vx/\mathcal{D}_i$. Accumulation of incompatible elements and depletion of compatible elements in the vicinity of the interface are the remarkable features of this model. Concentrations at the interface are given by

$$C_{liq}^i(0) = C_0^i/D_i \quad \text{and} \quad C_{sol}^i(0) = C_0^i$$

At steady-state, solid and liquid far from the interface tend to have the same concentration. Kinetic partitioning therefore brings solid-liquid partition coefficients close to unity and decreases chemical fractionation. The concentration profile in the liquid at distance x from the interface also reads

$$C_{liq}^i(x) - C_0^i = \left[C_{liq}^i(0) - C_0^i\right]e^{-vx/\mathcal{D}_i}$$

A convenient estimate of the anomalous layer thickness (chemical boundary layer) is given by $\delta = \mathcal{D}_i/v$. Indeed, the excess or deficit M of diffusing substance is equal to the area limited by the concentration profile and the initial distribution $C_0^i$ in the liquid

$$M = \int_0^\infty \left(C_{liq}^i - C_0^i\right)dx = \left[C_{liq}^i(0) - C_0^i\right]\int_0^\infty e^{-vx/\mathcal{D}_i}\,dx = \left[C_{liq}^i(0) - C_0^i\right]\delta$$
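The interface values and the boundary-layer interpretation of $\delta$ can be verified numerically. The sketch below takes the steady-state profile in the form of equation (9.6.5) and uses a plain trapezoidal rule; the unit values chosen for $C_0$, v and the diffusivity are arbitrary assumptions.

```python
import math


def tiller_liquid_profile(x, D, c0=1.0, v=1.0, diffusivity=1.0):
    """Steady-state melt concentration at distance x from the interface (9.6.5)."""
    return c0 * (1.0 + (1.0 - D) / D * math.exp(-v * x / diffusivity))


def excess_in_liquid(D, c0=1.0, v=1.0, diffusivity=1.0, steps=40000, span=40.0):
    """Trapezoidal estimate of M, the integral of (C_liq - C_0) over the liquid,
    truncated at span boundary-layer thicknesses from the interface."""
    delta = diffusivity / v
    h = span * delta / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * (tiller_liquid_profile(k * h, D, c0, v, diffusivity) - c0)
    return total * h


for D in (0.1, 5.0):
    interface = tiller_liquid_profile(0.0, D)
    # M should equal [C_liq(0) - C_0] * delta, with delta = 1 for these unit values.
    print(D, interface, D * interface, excess_in_liquid(D), interface - 1.0)
```

For D = 0.1 the liquid at the interface is enriched tenfold and the excess equals nine times $C_0\delta$; for D = 5 the interface liquid is depleted to 0.2 $C_0$ and the deficit is 0.8 $C_0\delta$.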

The length scale $\delta$ is therefore equivalent to the thickness of a layer with uniform concentration $C_{liq}^i(0)$ and which would hold the same excess or deficit M of diffusing substance as the growing system at steady-state (Figure 9.16). (ii) Burton et al. (1953) consider the case where hydrodynamic conditions impose the concentration at a given distance L from the interface (e.g., rotating crystal growth during the industrial making of crystals)

$$C_{liq}^i = C_L \quad \text{at} \quad x = L$$

which results in

$$C_{liq}^i(x) = C_L\,\frac{D_i + (1 - D_i)\,e^{-vx/\mathcal{D}_i}}{D_i + (1 - D_i)\,e^{-vL/\mathcal{D}_i}} \tag{9.6.6}$$

which tends to Tiller et al.'s solution when L increases to infinity. Concentration in the solid at the interface is

$$C_{sol}^i(0) = \frac{D_i\,C_L}{D_i + (1 - D_i)\,e^{-vL/\mathcal{D}_i}}$$

Since the denominator falls in the range $D_i$ to 1, concentration in the solid is closer to that of the liquid away from the interface than equilibrium fractionation would require. Again, disequilibrium partitioning during crystal growth decreases solid-liquid chemical fractionation.
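The behavior of the solid concentration between the two limits can be illustrated with an effective partition coefficient, defined here (as an assumption for the purpose of the example) as the ratio of the solid concentration at the interface to the imposed concentration $C_L$.

```python
import math


def effective_partition_coefficient(D, v, L, diffusivity):
    """Ratio C_sol(0) / C_L for a concentration imposed at distance L from the interface."""
    return D / (D + (1.0 - D) * math.exp(-v * L / diffusivity))


D = 0.1
for scaled_distance in (0.0, 0.5, 1.0, 3.0, 10.0):
    # scaled_distance stands for v * L / diffusivity, with v = diffusivity = 1
    print(scaled_distance, effective_partition_coefficient(D, 1.0, scaled_distance, 1.0))
```

For L = 0 the equilibrium value $D_i$ is recovered, and for L much larger than $\mathcal{D}_i/v$ the ratio tends to unity, which is the unbounded-liquid limit.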

For kinetic disequilibrium partitioning of trace elements, equation (9.6.6) after Burton et al. (1953) is commonly presented as an alternative to equation (9.6.5) due to Tiller et al. (1953) (e.g., Magaritz and Hofmann, 1978; Lasaga, 1981; Walker and Agee, 1989; Shimizu, 1981). However, the relative values of viscosity and chemical diffusivity in common liquids and silicate melts make the momentum boundary-layer (i.e., the liquid film which sticks to the solid) orders of magnitude thicker than the chemical boundary layer. It is therefore quite unlikely that, except for rare cases of transient state, liquid from outside the momentum boundary-layer may encroach on the chemical boundary-layer, i.e., L may actually be taken as infinite. As a simple description of steady-state disequilibrium fractionation, the model of Tiller et al. (1953) has a much better physical rationale. A more elaborate discussion of these processes may be found in Tiller (1991a, b). The transient solution to this problem, briefly described in Section 8.5.9, has been worked out analytically by Smith et al. (1955) and Hulme (1955), whereas Albarede and Bottinga (1972) calculated numerical solutions for the case where a crystal grows out of a finite amount of melt and discussed the geological implications.

References

Ahlberg, J. H., Nilson, E. N. & Walsh, J. L. (1967). The Theory of Splines and their Applications. New York: Academic Press.
Albarede, F. (1976). Some trace element relationships amongst liquid and solid phases in the course of the fractional crystallization of magmas. Geochim. Cosmochim. Acta, 40, 667-73.
Albarede, F. (1978). The recovery of spatial isotope distributions from stepwise degassing data. Earth Planet. Sci. Letters, 39, 387-97.
Albarede, F. (1983). Inversion of batch melting equations and the trace element pattern of the mantle. J. Geophys. Res., 88, 10573-83.
Albarede, F. (1985). Open magma chambers: regime and trace element evolution. Nature, 318, 356-58.
Albarede, F. (1992). How deep do common basalts form and differentiate? J. Geophys. Res., 97, 10997-11009.
Albarede, F. (1993). Residence time analysis of geochemical fluctuations in volcanic series. Geochim. Cosmochim. Acta, 57, 615-21.
Albarede, F. & Bottinga, Y. (1972). Kinetic disequilibrium between phenocrysts and host lava. Geochim. Cosmochim. Acta, 36, 141-56.
Albarede, F. & Brouxel, M. (1987). The Sm/Nd secular evolution of the continental crust and depleted mantle. Earth Planet. Sci. Letters, 82, 25-35.
Albarede, F., Michard, A., Minster, J. F. & Michard, G. (1981). 87Sr/86Sr ratios in hydrothermal waters and deposits from the East Pacific Rise at 21°N. Earth Planet. Sci. Letters, 55, 229-36.
Albarede, F. & Provost, A. (1977). Petrological and geochemical mass balance: an algorithm for least-squares fitting and general error analysis. Comp. Sci., 3, 309-26.
Albarede, F. & Tamagnan, V. (1988). Modelling the recent evolution of the Piton de la Fournaise volcano, Reunion Island, 1931-1986. J. Petrol., 29, 997-1030.
Alibert, C. & Carron, J.-P. (1980). Donnees experimentales sur le diffusion des elements majeurs entre verres ou liquides de compositions basaltique, rhyolitique et phonolitique, entre 900°C et 1300°C, a pression ordinaire. Earth Planet. Sci. Letters, 47, 294-306.
Allegre, C. J. (1982). Chemical geodynamics. Tectonophys., 81, 109-32.
Allegre, C. J., Brevart, O., Dupre, B. & Minster, J.-F. (1980). Isotopic and chemical effects produced in a continuously differentiating convecting earth mantle. Phil. Trans. R. Soc. London, A297, 447-77.
Allegre, C. J., Hamelin, B., Provost, A. & Dupre, B. (1987). Topology in isotopic multispace and origin of the mantle chemical heterogeneities. Earth Planet. Sci. Letters, 81, 319-37.
Allegre, C. J., Hart, S. R. & Minster, J.-F. (1983). Chemical structure and evolution of the mantle and the continents determined by inversion of Nd and Sr isotopic data, I. Theoretical models. Earth Planet. Sci. Letters, 66, 177-90.

Allegre, C. J., Hart, S. R. & Minster, J.-F. (1983). Chemical structure and evolution of the mantle and the continents determined by inversion of Nd and Sr isotopic data, II. Numerical experiments and discussion. Earth Planet. Sci. Letters, 66, 191-213.
Allegre, C. J. & Minster, J.-F. (1978). Quantitative models of trace element behavior in magmatic processes. Earth Planet. Sci. Letters, 38, 1-25.
Allegre, C. J. & Rousseau, D. (1984). The growth of the continents through geological time studied by Nd isotope analysis of shales. Earth Planet. Sci. Letters, 67, 19-34.
Allegre, C. J., Treuil, M., Minster, J.-F., Minster, B. & Albarede, F. (1977). Systematic use of trace element in igneous process. Part I fractional crystallization processes in volcanic suites. Contrib. Mineral. Petrol., 60, 57-75.
Allegre, C. J. & Turcotte, D. L. (1986). Implication of a two-component marble cake mantle. Nature, 323, 123-27.
Anonymous (1982). Theory of Linear Systems. New York: Res. Edu. Ass.
Arndt, N. T. (1977). Partitioning of nickel between olivine and ultrabasic and basic komatiitic liquids. Carnegie Inst. Washington Year Book, 76, 553-57.
Arnold, V. I. (1978). Ordinary Differential Equations. Cambridge: MIT Press.
Backus, G. & Gilbert, F. (1967). Numerical applications of a formalism for geophysical inverse problems. Geophys. J. R. Astron. Soc., 13, 247-76.
Barin, I. & Knacke, O. (1973). Thermochemical Properties of Inorganic Substances. Berlin: Springer-Verlag.
Barling, J. & Goldstein, S. L. (1989). Extreme isotopic variations in Heard Island lavas and the nature of mantle reservoirs. Nature, 348, 59-62.
Barnola, J. M., Raynaud, D., Korotkevich, Y. S. & Lorius, C. (1987). Vostok ice core provides 160,000-year record of atmospheric CO2. Nature, 329, 408-14.
Bateman, H. (1910). Solution of a system of differential equations occurring in the theory of radioactive transformations. Proc. Cambridge Phil. Soc., 15, 423-27.
Baumgartner, L. P. & Rumble III, D. (1988). Transport of stable isotopes: I: Development of a kinetic continuum theory for stable isotope transport. Contrib. Mineral. Petrol., 98, 417-30.
Bear, J. (1972). Dynamics of Fluids in Porous Media. New York: Elsevier.
Beattie, P. (1993). Uranium-thorium disequilibria and partitioning on melting of garnet peridotite. Nature, 363, 63-5.
Benson, S. W. (1982). The Foundations of Chemical Kinetics. Malabar: Krieger.
Berger, A. (1988). Milankovitch theory and climate. Rev. Geophys., 26, 624-57.
Berger, W. H. & Heath, G. R. (1968). Vertical mixing in pelagic sediments. J. Mar. Res., 26, 134-43.
Berner, E. K. & Berner, R. A. (1987). The Global Water Cycle. Englewood Cliffs: Prentice Hall.
Berner, R. A. (1980). Early Diagenesis. A Theoretical Approach. Princeton: Princeton University Press.
Berner, R. A., Lasaga, A. C. & Garrels, R. M. (1983). The carbonate-silicate geochemical cycle and its effect on atmospheric carbon dioxide over the past 100 million years. Amer. J. Sci., 283, 641-83.
Bird, R. B., Stewart, W. E. & Lightfoot, E. N. (1960). Transport Phenomena. New York: John Wiley.
Bodinier, J. L., Vasseur, G., Vernieres, J., Dupuy, C. & Fabries, J. (1990). Mechanisms of mantle metasomatism: Geochemical evidence from the Lherz orogenic peridotite. J. Petrol., 31, 597-628.
Boger, P. D. & Faure, G. (1974). Strontium-isotope stratigraphy of a Red Sea core. Geol., 2, 181-83.
Boher, M., Abouchami, W., Michard, A., Albarede, F. & Arndt, N. T. (1992). Crustal growth in West-Africa at 2.1 Ga. J. Geophys. Res., 97, 345-69.

528

References

Bowen, N. L. (1956). The Evolution of the Igneous Rocks. Dover. Bracewell, R. (1965). The Fourier Transform and its Applications. New York: McGraw Hill. Brady, J. B. (1975). Reference frames and diffusion coefficients. Amer. J. Sci., 275, 954-83. Brigham, E. O. (1974). The Fast Fourier Transform. Englewood Cliffs: Prentice Hall. Brinkley, S. R. (1946). Note on the conditions of equilibrium for systems of many consituents. / . Chem. Phys., 14, 563-64. Briqueu, L. & Lancelot, J. R. (1979). Rb-Sr systematics and crustal contamination models for calc-alkaline igneous rocks. Earth Planet. Sci. Letters, 43, 385-96. Broecker, W. S. (1974). Chemical Oceanography. New York: Harcourt Brace Jovanovich. Broecker, W. S. & Li, Y.-H. (1970). Interchange of water between the major oceans. J. Geophys. Res., 75, 3545-52. Broecker, W. S. & Peng, T.-H. (1982). Tracers in the Sea. New York: Eldigio. Broecker, W. S. & Takahashi, T. (1978). The relationship between lysocline depth and in situ carbonate ion concentration. Deep Sea Res., 25, 65-95. Brooks, C , Hart, S. R. & Wendt, I. (1972). Realistic use of two-error regression treatments as applied to rubidium-strontium data. Rev. Geophys. Space Phys., 10, 551-77. Brown, E. T , Edmond, J. M , Raisbeck, G. M., Yiou, F., Kurz, M. D. & Brook, E. J. (1991). Examination of surface exposure ages of Antarctica moraines using in situ produced 10 Be and 26A1. Geochim. Cosmochim. Ada, 55, 2269-83. Bruland, K. W. (1980). Oceanographic distributions of cadmium, zinc, nickel, and copper in the North Pacific. Earth Planet. Sci. Letters, 47, 176-98. Bryan, W. B., Finger, L. W. & Chayes, F. (1969). Estimating proportions in petrographic mixing equations by least-square approximations. Science, 163, 926-7. Burton, J. A., Prim, R. C. & Slichter, W. P. (1953). The distribution of solutes in crystals grown from the melt. Part I. Theoretical. / . Chem. Phys., 21, 1987-91. Candela, P. A. (1986). 
Generalized mathematical models for the fractional evolution of vapor from magmas in terrestrial planetary crusts. In Chemistry and Physics of Terrestrial Planets, ed. E. K. Saxena, pp. 362-96. NY: Springer. Caroff, M., Maury, R. C., Leterrier, J., Joron, J.-L., Cotten, J. & Guille, G. (1993). Trace element behavior in the alkali basalt-comenditic trachyte series from Mururoa Atoll, French Polynesia. Lithos, 30, 1-22. Carslaw, H. S. & Jaeger, J. C. (1959). Conduction of Heat in Solids. Oxford: Oxford University Press. Chadam, J. & Ortoleva, P. (1984). Moving interfaces and their stability: Applications to chemical waves and solidification. In Dynamics of Non-Linear Systems, ed. V. Hlavacek, pp. 247-78. New York: Gordon and Breach. Condomines, M., Morand, P. & Allegre, C. J. (1981). 230Th-238U radioactive disequilibria in tholeiites from the FAMOUS zone (Mid-Atlantic Ridge, 36°50'N): Th and Sr isotopic geochemistry. Earth Planet. Sci. Letters, 55, 247-56. Corrigan, J. (1991). Inversion of apatite fission track data for thermal history information. J. Geophys. Res., 96, 10347-60. Cortini, M. & Hermes, O. D. (1981). Sr isotopic evidence for a multi-source origin of the potassic magmas in the Neapolitan area (S. Italy). Contrib. Mineral. Petrol., 11, 47-55. Cortini, M. & Scandone, R. (1982). The feeding system of Vesuvius between 1754 and 1944. J. Volc. Geotherm. Res., 12, 393. Cox, K. G., Bell, J. D. & Pankhurst, R. J. (1979). The Interpretation of Igneous Rocks. London: George Allen & Unwin. Cox, K. G., McKenzie, D. & White, R. S. (1993). Melting and Melt Movement in the Earth. Oxford: Oxford Univ. Press. Craig, H. (1961). Isotopic variations in meteoric waters. Science, 133, 1702-3. Craig, H. (1969). Abyssal carbon and radiocarbon in the Pacific. J. Geophys. Res., 74, 5491-506.

Craig, H. (1974). A scavenging model for trace elements in the deep sea. Earth Planet. Sci. Letters, 23, 149-59. Crank, J. (1976). The Mathematics of Diffusion. Oxford: Oxford University Press. Crawford, J. D. (1991). Introduction to bifurcation theory. Rev. Modern Phys., 63, 991-1037. Criss, R. E., Gregory, R. T. & Taylor, H. P., Jr (1987). Kinetic theory of oxygen isotopic exchange between minerals and water. Geochim. Cosmochim. Acta, 51, 1099-108. Dansgaard, W. (1964). Stable isotopes in precipitation. Tellus, 16, 436-68. Darken, L. S. (1948). Diffusion, mobility and their interrelation through free energy in binary metallic systems. Trans. AIME, 174, 184-94. Darken, L. S. & Gurry, R. W. (1953). Physical Chemistry of Metals. New York: McGraw-Hill. de Boor, C. (1978). A Practical Guide to Splines. New York: Springer. Deloule, E., France-Lanord, C. & Albarede, F. (1991). D/H analysis of minerals by ion probe. In Stable Isotope Geochemistry: A Tribute to Sam Epstein, ed. H. P. Taylor Jr., J. R. O'Neil & I. R. Kaplan, pp. 53-62. San Antonio: The Geochemical Society. Denbigh, K. (1968). The Principles of Chemical Equilibrium. Cambridge: Cambridge University Press. DePaolo, D. J. (1980). Crustal growth and mantle evolution: inferences from models of element transport and Nd and Sr isotopes. Geochim. Cosmochim. Acta, 44, 1185-96. DePaolo, D. J. (1981). Trace-element and isotopic effects of combined wallrock assimilation and fractional crystallization. Earth Planet. Sci. Letters, 53, 189-202. DePaolo, D. J. (1985). Isotopic studies of processes in mafic magma chambers. J. Petrol., 4, 925-51. DePaolo, D. J. & Ingram, B. L. (1985). High-resolution stratigraphy with strontium isotopes. Science, 227, 938-41. De Vault, D. (1943). The theory of chromatography. J. Amer. Chem. Soc., 65, 532-40. Dodson, M. H. (1973). Closure temperature in cooling geochronological and petrological systems. Contrib. Mineral. Petrol., 40, 259-74. Doerner, H. A. & Hoskins, W. M. (1925).
Coprecipitation of radium and barium sulfates. J. Amer. Chem. Soc., 47, 662-75. Dudewicz, E. J. & Mishra, S. N. (1988). Modern Mathematical Statistics. New York: John Wiley. Dupre, B. & Allegre, C. J. (1983). Pb-Sr isotope variation in Indian Ocean basalts and mixing phenomena. Nature, 303, 142-6. Eberhardt, P., Geiss, J., Graf, H., Grögler, N., Krähenbühl, U., Schwaller, H., Schwarzmüller, J. & Stettler, A. (1970). Trapped solar wind noble gases, exposure age and K/Ar-age in Apollo 11 lunar fine material. Proc. Apollo 11 Lunar Sci. Conf., 2, 1037-70. Faure, G. (1986). Principles of Isotope Geology. New York: John Wiley. Feigenson, M. D. & Carr, M. J. (1993). The source of Central American lavas: inferences from geochemical inverse modeling. Contrib. Mineral. Petrol., 113, 226-34. Feinn, D., Ortoleva, P., Scalf, W., Schmidt, S. & Wolff, M. (1978). Spontaneous pattern formation in precipitating systems. J. Chem. Phys., 69, 27-39. Finlayson, B. A. (1972). The Method of Weighted Residuals and Variational Principles. New York: Academic Press. Fleck, R. J. & Criss, R. E. (1985). Strontium and oxygen isotopic variations in Mesozoic and Tertiary plutons of Central Idaho. Contrib. Mineral. Petrol., 90, 291-308. Fletcher, C. A. J. (1991). Computational Techniques for Fluid Dynamics. Volume I: Fundamental and General Techniques. Berlin: Springer-Verlag. Fletcher, R. (1987). Practical Methods of Optimization. Chichester: John Wiley. Flicker, M. & Ross, J. (1974). Mechanism of chemical instability for periodic precipitation phenomena. J. Chem. Phys., 60, 3458-65. Foland, K. A. (1974). Ar40 diffusion in homogeneous orthoclase and an interpretation of Ar diffusion in K-feldspars. Geochim. Cosmochim. Acta, 38, 151-66.

Fourcade, S. & Allegre, C. J. (1981). Trace-element behavior in granite genesis: A case study. The calc-alkaline plutonic association from the Querigut complex (Pyrenees, France). Contrib. Mineral. Petrol., 76, 177-95. Francis, D. (1985). The Baffin Bay lavas and the value of picrites as analogues of primary magmas. Contrib. Mineral. Petrol., 89, 144-54. Gast, P. W. (1968). Trace element fractionation and the origin of tholeiitic and alkaline magma types. Geochim. Cosmochim. Acta, 32, 1057-86. Ghiorso, M. S. (1985a). Chemical mass transfer in magmatic processes. I. Thermodynamic relations and numerical algorithms. Contrib. Mineral. Petrol., 90, 107-20. Ghiorso, M. S. (1985b). Chemical mass transfer in magmatic processes. II. Applications in equilibrium crystallization, fractionation and assimilation. Contrib. Mineral. Petrol., 90, 1021-41. Grandjean, P. (1989). Les terres rares et la composition isotopique du neodyme dans les phosphates biogenes: traceurs des processus paleo-oceanographiques et sedimentaires. Inst. Natl. Polytechn. Lorraine Ph.D., Nancy. Gray, P. & Scott, S. K. (1994). Chemical Oscillations and Instabilities. Oxford: Oxford University Press. Greenland, L. P. (1970). An equation for trace element distribution during magmatic crystallization. Amer. Mineral., 55, 455-65. Grossman, L. (1972). Condensation in the primitive solar nebula. Geochim. Cosmochim. Acta, 36, 597-619. Grossman, L. & Larimer, J. W. (1974). Early chemical history of the solar system. Rev. Geophys. Space Phys., 12, 71-101. Guinasso, N. L., Jr. & Schink, D. R. (1975). Quantitative estimates of biological mixing rates in abyssal sediments. J. Geophys. Res., 80, 3032-43. Guy, B. (1984). Contribution to the theory of infiltration metasomatic zoning; the formation of sharp fronts: a geometrical model. Bull. Mineral., 107, 93-105. Hackbusch, W. (1985). Multi-Grid Methods and Applications. New York: Springer. Hagen, H. & Neumann, E.-R. (1990).
Modeling of trace-element distribution in magma chambers using open-system models. Comput. Geosci., 16, 549-56. Haken, J. (1978). Synergetics. An Introduction. Nonequilibrium Phase Transitions and Self-Organization in Physics, Chemistry and Biology. Berlin: Springer-Verlag. Hall, A. (1987). Igneous Petrology. Harlow: Longman. Hamelin, B., Manhes, G., Albarede, F. & Allegre, C. J. (1985). Precise lead isotope measurements by the double spike technique: a reconsideration. Geochim. Cosmochim. Acta, 49, 173-82. Hamilton, W. C. (1964). Statistics in Physical Science. New York: Ronald. Harris, P. G. (1957). Zone-refining and the origin of potassic basalts. Geochim. Cosmochim. Acta, 12, 195-208. Harrison, T. M. (1981). Diffusion of 40Ar in hornblende. Contrib. Mineral. Petrol., 78, 324-31. Hart, S. R. (1984). A large-scale isotope anomaly in the Southern Hemisphere mantle. Nature, 309, 753-7. Hart, S. R. & Davis, K. E. (1978). Nickel partitioning between olivine and silicate melt. Earth Planet. Sci. Letters, 40, 203-19. Hart, S. R., Hauri, E. H., Oschmann, L. A. & Whitehead, J. A. (1992). Mantle plumes and entrainment: isotopic evidence. Science, 256, 517-20. Hart, S. R. & Zindler, A. (1989). Isotope fractionation laws: A test using calcium. Int. J. Mass Spectr. Ion Proc., 89, 287-301. Hertogen, J. & Gijbels, R. (1976). Calculation of trace element fractionation during partial melting. Geochim. Cosmochim. Acta, 40, 313-22. Hilliard, J. E. (1970). Spinodal decomposition. In Phase Transformations, pp. 497-560. Metals Park: Amer. Soc. Metals.

Hirose, K. & Kushiro, I. (1993). Partial melting of dry peridotites at high pressures: Determination of compositions of melts segregated from peridotite using aggregates of diamond. Earth Planet. Sci. Letters, 114, 477-89. Hodell, D. A. & Cieselski, P. F. (1991). Stable isotopic and carbonate stratigraphy of the Late Pliocene and Pleistocene of Hole 704A: eastern subantarctic South Atlantic. Proc. ODP Sci. Results, 114, 409-35. Hoel, P. G., Port, S. C. & Stone, C. J. (1971). Introduction to Probability Theory. Boston: Houghton Mifflin. Hoffman, N. R. A. & McKenzie, D. P. (1985). The destruction of geochemical heterogeneities by differential fluid motions during mantle convection. Geophys. J. R. Astron. Soc., 1985, 163-206. Hofmann, A. (1971). Fractionation corrections for mixed-isotope spikes of Sr, K, and Pb. Earth Planet. Sci. Letters, 10, 397-402. Hofmann, A. W. (1972). Chromatographic theory of infiltration metasomatism and its application to feldspars. Amer. J. Sci., 292, 69-80. Hofmann, A. W. (1988). Chemical differentiation of the Earth: the relationship between mantle, continental crust, and oceanic crust. Earth Planet. Sci. Letters, 90, 297-314. Hofmann, A. W. & Feigenson, M. D. (1983). Case studies on the origin of basalt: I. Theory and reassessment of Grenada basalts. Contrib. Mineral. Petrol., 84, 382-9. Hofmann, A. W. & Hart, S. R. (1978). An assessment of local and regional isotopic equilibrium in the mantle. Earth Planet. Sci. Letters, 38, 44-62. Holland, H. D. (1978). The Chemistry of the Atmospheres and Oceans. New York: Wiley. Hulme, K. F. (1955). On the distribution of impurity in crystals grown from impure unstirred melt. Proc. Phys. Soc., 68, 393-9. Irvine, T. N. (1977). Definition of primitive liquid compositions for basic magmas. Carnegie Inst. Washington Year Book, 76, 454-61. Irving, A. J. (1978). A review of experimental studies of crystal/liquid trace-element partitioning. Geochim. Cosmochim. Acta, 42, 743-70. Irving, A. J. & Frey, F. A.
(1984). Trace-element abundances in megacrysts and their host basalts: Constraints on partition coefficients and megacryst genesis. Geochim. Cosmochim. Acta, 48, 1201-21. Jackson, D. D. (1972). Interpretation of inaccurate, insufficient and inconsistent data. Geophys. J. R. Astron. Soc., 28, 97-110. Jacobsen, S. B. & Wasserburg, G. J. (1979). The mean age of mantle and crustal reservoirs. J. Geophys. Res., 84, 7411-27. Johnson, K. T. M., Dick, H. J. B. & Shimizu, N. (1990). Melting in the oceanic upper mantle: an ion microprobe study of diopsides in abyssal peridotites. J. Geophys. Res., 95, 2661-78. Johnson, R. A. & Wichern, D. W. (1982). Applied Multivariate Statistical Analysis. Englewood Cliffs: Prentice-Hall. Jouzel, J., Lorius, C., Petit, J. R., Genthon, C., Barkov, N. I., Kotlyakov, V. M. & Petrov, V. M. (1987). Vostok ice core: a continuous isotope temperature record over the last climatic cycle (160,000 years). Nature, 239, 403-8. Junge, C. E. (1974). Residence variability of tropospheric trace gases. Tellus, 26, 477-88. Juteau, M., Michard, A. & Albarede, F. (1986). The Pb-Sr-Nd isotope geochemistry of some recent circum-Mediterranean granites. Contrib. Mineral. Petrol., 92, 331-40. Kahlweit, M. (1965). The structure of a precipitate as determined by the interplay of nucleation, growth and ageing. Prog. Chem. Solids, 2, 134-74. Keir, R. S. & Berger, W. H. (1983). Atmospheric CO2 content in the last 120,000 years: The phosphate extraction model. J. Geophys. Res., 88, 6027-38. Kent, J. T., Watson, G. S. & Onstott, T. C. (1990). Fitting straight lines and planes with an
application to radiometric dating. Earth Planet. Sci. Letters, 97, 1-17. Kinzler, R. J., Grove, T. L. & Recca, S. I. (1990). An experimental study of the effect of temperature and melt composition on the partitioning of nickel between olivine and silicate melt. Geochim. Cosmochim. Acta, 54, 1255-65. Kirkaldy, J. S. & Young, D. J. (1987). Diffusion in the Condensed State. London: The Institute of Metals. Korzhinskii, D. S. (1970). Theory of Metasomatic Zoning. Oxford: Clarendon Press. Lal, D. (1988). In situ-produced cosmogenic isotopes in terrestrial rocks. Ann. Rev. Earth Planet. Sci., 16, 355-88. Lancelot, J. R. & Allegre, C. J. (1974). Origin of carbonatitic magmas in the light of Pb-U-Th isotope system. Earth Planet. Sci. Letters, 22, 233-8. Lanczos, C. (1961). Linear Differential Operators. London: Van Nostrand. Langmuir, C. H. (1989). Geochemical consequences of in situ crystallization. Nature, 340, 199-205. Langmuir, C. H., Bender, J. F., Bence, A. E., Hanson, G. N. & Taylor, S. R. (1977). Petrogenesis of basalts from the FAMOUS area: Mid-Atlantic Ridge. Earth Planet. Sci. Letters, 36, 133-56. Langmuir, C. H., Vocke, R. D., Hanson, G. N. & Hart, S. R. (1978). A general mixing equation with applications to Icelandic basalts. Earth Planet. Sci. Letters, 37, 380-92. Larimer, J. W. (1967). Chemical fractionations in meteorites - I. Condensation of the elements. Geochim. Cosmochim. Acta, 31, 1215-38. Lasaga, A. (1980). The kinetic treatment of geochemical cycles. Geochim. Cosmochim. Acta, 44, 815-28. Lasaga, A. (1981a). Implications of a concentration-dependent growth rate on the boundary layer crystal-melt model. Earth Planet. Sci. Letters, 56, 429-34. Lasaga, A. C. (1981b). Dynamic treatment of geochemical cycles. In Kinetics of Geochemical Processes, ed. A. C. Lasaga & R. J. Kirkpatrick, pp. 69-110. Washington: Miner. Soc. Amer. Lasaga, A. C., Berner, R. A. & Garrels, R. M. (1985).
An improved geochemical model of atmospheric CO2 fluctuations over the past 100 million years. In The Carbon Cycle and Atmospheric CO2: Natural Variations Archean to Present, ed. E. T. Sundquist & W. S. Broecker, pp. 397-411. Washington: American Geophysical Union. Leon, S. J. (1990). Linear Algebra with Applications. New York: Maxwell-Macmillan. Li, Y.-H. (1981). Ultimate removal mechanisms of elements from the ocean. Geochim. Cosmochim. Acta, 45, 1659-64. Li, Y.-H. (1982). A brief discussion on the mean oceanic residence time of elements. Geochim. Cosmochim. Acta, 46, 2671-5. Lichtner, P. C. (1985). Continuum model for simultaneous chemical reactions and mass transport in hydrothermal systems. Geochim. Cosmochim. Acta, 49, 779-800. Lloyd, E. (1980). Handbook of Applicable Mathematics. Vol. II: Probability. Chichester: John Wiley. Logan, J. D. (1987). Applied Mathematics: A Contemporary Approach. New York: John Wiley. Lomb, N. R. (1976). Least-squares frequency analysis of unequally spaced data. Astrophys. Space Sci., 39, 447-62. Lovera, O. M., Richter, F. M. & Harrison, T. M. (1989). The 40Ar/39Ar thermochronometry for slowly cooled samples having a distribution of diffusion domain sizes. J. Geophys. Res., 94, 17917-35. Lovett, R., Ortoleva, P. & Ross, J. (1978). Kinetic instabilities in first-order phase transitions. J. Chem. Phys., 69, 947-55. Ludwig, K. R. (1980). Calculation of uncertainties of U-Pb isotope data. Earth Planet. Sci. Letters, 46, 212-20. Magaritz, M. & Hofmann, A. W. (1978). Diffusion of Sr, Ba, and Na in obsidian. Geochim. Cosmochim. Acta, 42, 595-605.

Mantovani, M. S. M. & Hawkesworth, C. J. (1990). An inversion approach to assimilation and fractional crystallization processes. Contrib. Mineral. Petrol., 105, 289-302. Marquardt, D. W. (1963). An algorithm for least square estimation of non-linear parameters. SIAM J., 11, 431-41. McBirney, A. R. (1984). Igneous Petrology. San Francisco: Freeman Cooper. McCulloch, M. T., Gregory, R. T., Wasserburg, G. J. & Taylor Jr., H. P. (1981). Sm-Nd, Rb-Sr, and 18O/16O isotopic systematics in an ancient crustal section: Evidence from the Samail ophiolite. J. Geophys. Res., 86, 2721-35. McIntire, W. L. (1963). Trace-element partition coefficients: a review of theory and applications to geology. Geochim. Cosmochim. Acta, 27, 1209-64. McIntyre, G. A., Brooks, C., Compston, W. & Turek, A. (1966). The statistical assessment of Rb-Sr isochrons. J. Geophys. Res., 71, 5459-68. McKenzie, D. (1984). The generation and compaction of partially molten rocks. J. Petrol., 25, 713-65. McKenzie, D. (1985). 230Th-238U disequilibrium and the melting processes beneath ridge axes. Earth Planet. Sci. Letters, 72, 149-57. Menard, H. W. & Smith, S. M. (1966). Hypsometry of ocean basin provinces. J. Geophys. Res., 71, 4305-25. Meyer, C. (1977). Petrology, mineralogy and chemistry of KREEP basalt. Phys. Chem. Earth, 10, 239-60. Michard, A. & Albarede, F. (1986). The REE content of some hydrothermal fluid. Chem. Geol., 55, 51-60. Michard, A., Gurriet, P., Soudan, M. & Albarede, F. (1985). Nd isotopes in French Phanerozoic shales: external vs. internal aspects of crustal evolution. Geochim. Cosmochim. Acta, 49, 601-10. Michard, G. (1989). Equilibres Chimiques dans les Eaux Naturelles. Paris: Publisud. Michel, H. V., Asaro, F., Alvarez, W. & Alvarez, L. W. (1990). Geochemical studies of the Cretaceous-Tertiary boundary in ODP holes 689B and 690C. Proc. ODP Sci. Res., 113, 159-68. Minster, J.-F. & Allegre, C. J. (1977). Systematic use of trace elements in igneous processes.
Part III: inverse problem of partial melting. Contrib. Mineral. Petrol., 68, 37-52. Minster, J.-F., Ricart, L.-P. & Allegre, C. J. (1979). 87Rb-87Sr geochronology of enstatite meteorites. Earth Planet. Sci. Letters, 42, 333-47. Mitchell, A. R. (1969). Computational Methods in Partial Differential Equations. London: John Wiley. Morel, F. M. M. & Hering, J. G. (1993). Principles and Applications of Aquatic Chemistry. New York: John Wiley. Morse, P. M. & Feshbach, H. (1953). Methods of Theoretical Physics. New York: McGraw-Hill. Murray, J. W., Grundmanis, V. & Smethie, W. M. (1978). Interstitial water chemistry in the sediments of Saanich Inlet. Geochim. Cosmochim. Acta, 42, 1011-26. Navon, O. & Stolper, E. (1987). Geochemical consequences of melt percolation: the upper mantle as a chromatographic column. J. Geol., 95, 285-307. Neuman, H., Mead, J. & Vitaliano, C. J. (1954). Trace-element variation during fractional crystallization as calculated from the distribution law. Geochim. Cosmochim. Acta, 6, 90-100. Nicolas, A. (1986). A melt extraction model based on structural studies in mantle peridotites. J. Petrol., 27, 999-1022. Nicolas, A. & Jackson, M. (1982). High-temperature dikes in peridotites: origin by hydraulic fracturing. J. Petrol., 23, 568-82. Noyes, R. M. & Field, R. J. (1974). Oscillatory chemical reactions. Ann. Rev. Phys. Chem., 25, 95-119.

O'Hara, M. J. (1977). Geochemical evolution during fractional crystallization of a periodically refilled magma chamber. Nature, 266, 503-7. O'Hara, M. J. (1985). Importance of the 'shape' of the melting regime during partial melting of the mantle. Nature, 314, 58-62. O'Hara, M. J. & Mathews, R. E. (1981). Geochemical evolution in an advancing, periodically replenished, periodically tapped, continuously fractionated magma chamber. J. Geol. Soc. London, 138, 237-77. O'Nions, R. K., Evensen, N. M. & Hamilton, P. J. (1979). Geochemical modeling of mantle differentiation and crustal growth. J. Geophys. Res., 84, 6091-101. O'Nions, R. K. & Powell, R. (1977). The thermodynamics of trace-element distribution. In Thermodynamics in Geology, ed. D. G. Fraser, pp. 349-63. Dordrecht: Reidel. Ortoleva, P. (1984). The self-organization of Liesegang Bands and other precipitate patterns. In Chemical Instabilities: Applications in Chemistry, Engineering, Geology and Material Science, ed. G. Nicolis & F. Baras, pp. 289-97. Dordrecht: Reidel. Ortoleva, P. J. (1994). Geochemical Self-Organization. Oxford: Oxford University Press. Ozisik, M. N. (1968). Boundary Value Problems of Heat Conduction. Scranton: Intern. Textbook. Co. Ottino, J. M. (1989). The Kinematics of Mixing: Stretching, Chaos, and Transport. Cambridge: Cambridge University Press. Palmer, M. R. & Edmond, J. M. (1989). The strontium isotope budget of the modern ocean. Earth Planet. Sci. Letters, 92, 11-26. Papoulis, A. (1984). Probability, Random Variables and Stochastic Processes. Kosaido: McGraw-Hill. Parker, R. L. (1977). Understanding inverse theory. Ann. Rev. Earth Planet. Sci., 5, 35-64. Pearce, J. A. & Cann, J. R. (1973). Tectonic setting of basic volcanic rocks determined using trace element analyses. Earth Planet. Sci. Letters, 19, 290-300. Pearce, T. H. (1978). Olivine fractionation equations for basaltic and ultrabasic liquids. Nature, 276, 771-4. Pfann, W. G. (1952). Principles of zone-melting. Trans.
AIME, 194, 747-53. Phillips, O. M. (1991). Flow and Reactions in Permeable Rocks. Cambridge: Cambridge University Press. Prager, S. (1956). Periodic precipitation. J. Chem. Phys., 25, 279-83. Presnall, D. C. (1969). The geometric analysis of partial fusion. Amer. J. Sci., 267, 1178-94. Press, W. H., Flannery, B. P., Teukolsky, S. A. & Vetterling, W. T. (1986). Numerical Recipes: The Art of Scientific Computing. Cambridge: Cambridge University Press. Prinzhofer, A. & Allegre, C. J. (1985). Residual peridotites and the mechanism of partial melting. Earth Planet. Sci. Letters, 74, 251-65. Procaccia, I. & Ross, J. (1978). Stability and relative stability in reactive systems far from equilibrium. II. Kinetic analysis of relative stability of multiple stationary states. J. Chem. Phys., 67, 5565-71. Provost, A. (1990). An improved diagram for isochron data. Isot. Geosci., 80, 85-99. Provost, A. & Allegre, C. J. (1979). Process identification and search for optimal parameters from major-element data. General presentation with emphasis on the fractional crystallization process. Geochim. Cosmochim. Acta, 43, 487-501. Rayleigh, J. W. S. (1896). Theoretical considerations respecting the separation of gases by diffusion and similar processes. Phil. Mag., 42, 77-107. Raymo, M. E., Ruddiman, W. F. & Froelich, P. N. (1988). Influence of late Cenozoic mountain building on ocean geochemical cycles. Geology, 16, 649-53. Reed, M. H. (1982). Calculation of multicomponent chemical equilibria and reaction processes in systems involving minerals, gases and an aqueous phase. Geochim. Cosmochim. Acta, 46, 513-28.

Ribe, N. M. (1985). The generation and composition of partial melts in the earth's mantle. Earth Planet. Sci. Letters, 73, 361-76. Ribe, N. M. (1987). Theory of melt segregation - A review. J. Volc. Geoth. Res., 33, 241-53. Richardson, S. M. & McSween, H. Y., Jr (1989). Geochemistry: Pathways and Processes. Englewood Cliffs: Prentice Hall. Richter, F. M. (1986). Simple models for trace-element fractionation during melt segregation. Earth Planet. Sci. Letters, 11, 333-44. Richter, F. M., Bowley, D. B. & DePaolo, D. J. (1992). Sr isotope evolution of seawater: the role of tectonics. Earth Planet. Sci. Letters, 109, 11-23. Richter, F. M. & Ribe, N. M. (1979). On the importance of advection in determining the local isotopic composition of the mantle. Earth Planet. Sci. Letters, 43, 212-22. Robie, R. A., Hemingway, B. S. & Fisher, J. R. (1978). Thermodynamic properties of minerals and related substances at 298.15 K and 1 bar (10^5 pascals) pressure and at higher temperatures. U.S. Geol. Surv., 1452, 1-456. Roeder, P. L. & Emslie, R. F. (1970). Olivine-liquid equilibrium. Contrib. Mineral. Petrol., 29, 275-89. Rosen, J. B. (1961). The gradient projection method for non-linear programming, Part II, Non-linear constraints. J. Soc. Indus. Appl. Math., 9, 414-32. Ruddiman, W. F. & Glover, L. K. (1972). Vertical mixing of ice-rafted volcanic ash in North-Atlantic sediments. Geol. Soc. Amer. Bull., 83, 2817-36. Russel, W. A., Papanastassiou, D. A. & Tombrello, T. A. (1978). Ca isotope fractionation on the Earth and other solar system materials. Geochim. Cosmochim. Acta, 42, 1075-90. Ryan, M. P. (1988). The mechanics and three-dimensional internal structure of active magmatic systems: Kilauea volcano, Hawaii. J. Geophys. Res., 93, 4213-48. Salters, V. J. M. & Hart, S. R. (1989). The Hf-paradox, and the role of garnet in the MORB source. Nature, 342, 420-22. Sandwell, D. T. (1987). Biharmonic spline interpolation of GEOS-3 and SEASAT altimeter data. Geophys. Res. Letters, 2, 139-42.
Saxena, S. K. & Eriksson, G. (1983). Theoretical computations of mineral assemblages in pyrolite and lherzolite. J. Petrol., 24, 538-55. Scargle, J. D. (1982). Studies in astronomical time series analysis. II. Statistical aspects of spectral analysis of unevenly spaced data. Astroph. J., 263, 835-53. Scheid, F. (1968). Theory and Problems of Numerical Analysis. New York: McGraw-Hill. Schilling, J.-G. & Winchester, J. W. (1967). Rare-earth fractionation and magmatic processes. In Mantles of Earth and Terrestrial Planets, ed. S. K. Runcorn, pp. 267-83. New York: Interscience. Scott, D. R. & Stevenson, D. J. (1986). Magma ascent by porous flow. J. Geophys. Res., 91, 9283-96. Seber, G. A. F. (1984). Multivariate Observations. New York: John Wiley. Sen, A. & Srivastava, M. (1990). Regression Analysis. Theory, Methods, and Applications. New York: Springer-Verlag. Shackelton, N. J. & Kennett, J. P. (1975). Paleotemperature history of the Cenozoic and the initiation of Antarctic glaciation: oxygen and carbon isotope analyses in DSDP Sites 277, 279 and 281. In Init. Rept. Deep Sea Drilling Project, pp. 743-55. Washington D.C.: U.S. Government Printing Office. Shaw, D. M. (1970). Trace-element fractionation during anatexis. Geochim. Cosmochim. Acta, 34, 237-43. Shimizu, N. (1981). Trace-element incorporation into growing augite phenocryst. Nature, 289, 575-7. Sleep, N. H. (1976). Segregation of magma from a mostly crystalline mush. Geol. Soc. Amer. Bull., 85, 1225-32.

Smith, W. R. & Missen, R. W. (1982). Chemical Reaction Equilibrium Analysis. New York: John Wiley. Smith, V. G., Tiller, W. A. & Rutter, J. W. (1955). A mathematical analysis of solute redistribution during solidification. Canad. J. Phys., 33, 723-45. Sneddon, I. N. (1957). Elements of Partial Differential Equations. New York: McGraw-Hill. Sobolev, A. V. & Shimizu, N. (1992). Ultra-depleted melts and permeability of oceanic mantle (in Russian). Dokl. Acad. Sci. Russia, 236, 354-69. Southam, J. R. & Hay, W. W. (1976). Dynamical formulation of Broecker's model for marine cycles of biologically incorporated sediments. Math. Geol., 8, 511-27. Spiegel, M. R. (1973). Theory and Problems of Complex Variables. New York: McGraw-Hill. Spiegel, M. R. (1975). Theory and Problems of Probability and Statistics. New York: McGraw-Hill. Spiegelman, M. & Elliott, T. (1993). Consequences of melt transport for uranium series disequilibrium in young lavas. Earth Planet. Sci. Letters, 118, 1-20. Stallard, R. F. & Edmond, J. M. (1981). Geochemistry of the Amazon 1. Precipitation chemistry and the marine contribution to the dissolved load at the time of peak discharge. J. Geophys. Res., 86, 9844-58. Steiger, R. H. & Wasserburg, G. J. (1966). Systematics in the Pb208-Th232, Pb207-U235, and Pb206-U238 systems. J. Geophys. Res., 71, 6065-90. Strang, G. (1976). Linear Algebra and its Applications. New York: Academic Press. Strang, G. (1986). Introduction to Applied Mathematics. Wellesley: Wellesley-Cambridge University Press. Stumm, W. & Morgan, J. J. (1981). Aquatic Chemistry. New York: John Wiley. Sundquist, E. T. (1985). Geological perspectives on carbon dioxide and the carbon cycle. In The Carbon Cycle and Atmospheric CO2: Natural Variations Archean to Present (AGU Geophys. Monograph 32), ed. E. T. Sundquist & W. S. Broecker, pp. 5-59. Washington: Amer. Geophys. Union. Swalin, R. A. (1962). Thermodynamics of Solids. New York: John Wiley. Tarantola, A. (1987).
Inverse Problem Theory. Amsterdam: Elsevier. Tarantola, A. & Valette, B. (1982). Generalized nonlinear inverse problems solved using the least-square criterion. Rev. Geophys. Space Physics, 20, 219-32. Taylor, H. P., Jr. (1974). The application of oxygen and hydrogen isotope studies to problems of hydrothermal alteration and ore deposition. Econ. Geol., 69, 843-83. Taylor, H. P., Jr. (1978). Oxygen and hydrogen isotope studies of plutonic granitic rocks. Earth Planet. Sci. Letters, 38, 177-210. Taylor, H. P., Jr. (1980). The effects of assimilation of country rocks by magmas on 18O/16O and 87Sr/86Sr systematics. Earth Planet. Sci. Letters, 47, 243-54. Taylor, H. P., Jr. & Sheppard, S. M. F. (1986). Igneous rocks: I. Processes of isotopic fractionation and isotope systematics. In Rev. Mineral. 16: Stable Isotopes in High Temperature Geological Processes, ed. J. W. Valley, H. P. Taylor Jr. & J. R. O'Neil, pp. 227-71. Washington: Mineral. Soc. Amer. Tiller, W. A., Jackson, K. A., Rutter, K. A. & Chalmers, B. (1953). The redistribution of solute atoms during the solidification of metals. Acta Metall., 1, 428-37. Tiller, W. A. (1991a). The Science of Crystallization: Macroscopic Phenomenon and Defect Generation. Cambridge: Cambridge University Press. Tiller, W. A. (1991b). The Science of Crystallization: Microscopic Interfacial Phenomena. Cambridge: Cambridge University Press. Tilton, G. R. (1960). Volume diffusion as a mechanism for discordant lead ages. J. Geophys. Res., 65, 2933-45. Treuil, L. & Joron, J.-L. (1975). Utilisation des elements hygromagmatophiles pour la simplification de la modelisation quantitative des processus magmatiques. Exemples de l'Afar et de la dorsale medioatlantique. Soc. Ital. Mineral. Petrol., 31, 125-74.

Turcotte, D. L. & Schubert, G. (1982). Geodynamics. Applications of Continuum Physics to Geological Problems. New York: John Wiley. Turner, G. (1968). The distribution of potassium and argon in chondrites. In Origin and Distribution of the Elements, ed. L. H. Ahrens, pp. 387-98. London: Pergamon. Turner, G. (1972). 40Ar-39Ar age and cosmic ray irradiation history of the Apollo 15 anorthosite 15415. Earth Planet. Sci. Letters, 14, 169-75. Ulmer, P. (1989). The dependence of the Fe2+-Mg cation-partitioning between olivine and basaltic liquid on pressure, temperature and composition: An experimental study to 30 kbars. Contrib. Mineral. Petrol., 101, 261-73. Van Zeggeren, F. & Storey, S. H. (1970). The Computation of Chemical Equilibria. Cambridge: Cambridge University Press. Vasseur, G., Vernieres, J. & Bodinier, J. L. (1991). Modelling of trace-element transfer between mantle melt and heterogranular peridotite matrix. In Orogenic Lherzolites and Mantle Processes, ed. M. Menzies, C. Dupuy & A. Nicolas, pp. 41-54. Oxford: Oxford University Press. Veizer, J. & Jansen, S. L. (1979). Basement and sedimentary recycling and continental evolution. J. Geol., 87, 341-70. Vidal, P., Dosso, L., Bowden, P. & Lameyre, J. (1979). Strontium isotope geochemistry in syenite-alkaline granite complexes. In Origin and Distribution of the Elements, ed. L. H. Ahrens, pp. 223-31. Oxford: Pergamon. Vollmer, R. (1976). Rb-Sr and U-Th-Pb systematics of alkaline rocks: the alkaline rocks from Italy. Geochim. Cosmochim. Acta, 40, 283-95. Wagner, C. (1950). Mathematical analysis of the formation of periodic precipitation. J. Coll. Sci., 5, 85-97. Walker, D., Agee, C. B. & Zhang, Y. (1988). Fusion curve slope and crystal/liquid buoyancy. J. Geophys. Res., 93, 313-23. Walker, D., Shibata, T. & DeLong, S. E. (1979). Abyssal tholeiites from the Oceanographer Fracture Zone. Contrib. Mineral. Petrol., 70, 111-25. Walker, F. W., Parrington, J. R. & Feiner, F. (1989).
Nuclide and Isotopes, 14th edition. General Electric. Walker, J. C. G. (1991). Numerical Adventures with Geochemical Cycles. New York: Oxford University Press. Walsh, G. R. (1975). Methods of Optimization. New York: John Wiley. Warren, P. H. (1986). The Bulk-Moon MgO/FeO ratio: A highlands perspective. In Origin of the Moon, ed. W. K. Hartmann, R. J. Phillips & G. J. Taylor, pp. 279-310. Houston: Lunar Planet. Inst. Warren, P. H. & Wasson, J. T. (1979). The origin of KREEP. Rev. Geophys. Space Phys., 17, 73-88. Wasserburg, G. J. (1954). Argon 40 :Potassium 40 dating. In Nuclear Geology, ed. H. Faul, pp. 341-9. New York: John Wiley. Wasserburg, G. J. (1963). Diffusion processes in lead-uranium systems. J. Geophys. Res., 68, 4823^6. Wasserburg, G. J., Jacobsen, S. B , DePaolo, D. J., McCulloch, M. T. & Wen, T. (1981). Precise determination of Sm/Nd ratios, Sm and Nd isotopic abundances in standard solutions. Geochim. Cosmochim. Ada, 45, 2311-23. Webster, R. K. (1960). Mass spectrometric isotope dilution analysis. In Methods in Geochemistry, ed. A. A. Smales & L. R. Wager, pp. 202-46. New York: Intersciences. Wetherill, G. W. (1956). Discordant uranium-lead ages. Trans. Amer. Geophys. Union, 37, 320-26. Wetherill, G. W., Davis, G. L. & Lee-Hu, C. (1968). Rb-Sr measurements on whole rocks and separated minerals from the Baltimore gneiss, Maryland. Geol. Soc. Amer. Bull, 79, 757-62.

White, W. M. (1985). Sources of oceanic basalts: radiogenic isotopic evidence. Geology, 13, 115-18.
Whitfield, M. & Turner, D. R. (1979). Water-rock partition coefficient and the composition of seawater and river water. Nature, 278, 132-7.
Wiberg, D. M. (1971). State Space and Linear Systems. New York: McGraw-Hill.
Widder, D. V. (1975). The Heat Equation. New York: Academic Press.
Wiggins, R. (1976). Interpolation of digitized curves. Bull. Seism. Soc. Amer., 66, 2077-81.
Wilkinson, J. H. (1965). The Algebraic Eigenvalue Problem. Oxford: Clarendon Press.
Williams, R. W. & Gill, J. B. (1989). Effects of partial melting on the uranium decay series. Geochim. Cosmochim. Acta, 53, 1607-19.
Williamson, J. H. (1968). Least-squares fitting of a straight line. Can. J. Phys., 46, 1845-7.
Wood, B. J. (1987). Thermodynamics of multicomponent systems containing several solid solutions. In Thermodynamic Modeling of Geological Materials: Minerals, Fluids and Melts, ed. I. S. E. Carmichael & H. P. Eugster, pp. 71-95. Washington: Mineral. Soc. Amer.
Wright, T. L. & Doherty, P. C. (1970). A linear programming and least squares computer method for solving petrologic mixing problems. Geol. Soc. Amer. Bull., 81, 1995-2008.
York, D. (1966). Least-squares fitting of a straight line. Can. J. Phys., 44, 1079-89.
York, D. (1969). Least squares fitting of a straight line with correlated errors. Earth Planet. Sci. Letters, 5, 320-24.
Zienkiewicz, O. C. (1977). The Finite Element Method in Engineering Science, 3rd edition. New York: McGraw-Hill.
Zindler, A., Jagoutz, E. & Goldstein, S. (1982). Nd, Sr and Pb isotopic systematics in a three-component mantle: a new perspective. Nature, 298, 519-23.
Zindler, A. W. & Hart, S. R. (1986). Chemical Geodynamics. Ann. Rev. Earth Planet. Sci., 14, 493-571.
Zwillinger, D. (1989). Handbook of Differential Equations. Boston: Academic Press.

Subject index

activation energy 421, 455
activity 319
ADI 165
advection 165, 401, 406
advection-diffusion model 274, 464
AFC (assimilation-fractional crystallization)
  concentrations 505
  identification diagrams 508
  isotopic ratios 507
AFM diagram fractionation and mixing 32
alkalinity 395
Ar loss 128, 194, 313, 315, 446, 451, 455
Arrhenius plot 421, 455
asymptote mixing hyperbola 19, 264
autonomous system 345
base variable 340
batch melting 478
  density function 192
  forward problem 478
  known source 479
  Shaw's equation 487
  unknown source 483
Berthelot-Nernst partition coefficient 477
bias
  mass see mass discrimination
  statistical 185
bifurcation 364, 417
bioturbation 408
Boltzmann variable 424, 428
boundary conditions 162, 421
boundary layer 436, 525
box see one-box, multiple-box
boxcar function 101, 438
bulk mixing see conservative mixing
bulk partition coefficient 478, 492
C mixing time ocean 354
carbonate
  equilibrium 320, 324, 325, 395
  geochemistry 241, 267
  compensation depth 393
CCD see carbonate
Ce (La)-Yb fractionation 135, 216, 234
centered random variable 175
change of random variables 185, 206
characteristic equation 74
chemical equilibrium 318
chi-squared see pdf

clastic sediments 367
closure temperature 457
common dimension expansion matrix product 56, 75
compatible element 477
complexes solution 328
component 1, 318
  identification by PCA 241, 243
  loading 240
  principal see PCA
concentration-ratio
  hyperbola 18
  retrieving 26
Concordia 125
confidence interval
  of the mean 196, 211
  of the variance 197
conservative mixing 1
conservative property 401
constrained least-squares
  linear constraint 278
  quadratic constraint 282
constrained minimum 147, 333
contamination (isotopic) binary 12, 16, 22
continuity equation 405
continuous inverse model 312
continuous melting 500
control line 114
cooling age 456
cooling rate 457
correlation coefficient 202
correlation matrix 203
  sample 204
cosmogenic nuclides 410
covariance 202
covariance matrix 203, 208
  matrix sample 204, 285
Crank-Nicholson see finite differences implicit
critical melting see continuous melting
crustal growth 367, 389
cumulate control line 114
curvature matrix 139, 147, 299
curvature mixing hyperbola 19
damping factor 352
Darken's theory 421
degree of freedom 181, 189, 197
density function see pdf
determinant 58, 73
deviate 199, 233

di-ol-si triangle 65
diagenesis 461
differential equations
  linear, order >1 97
  linear, stability 98
  first order, system of linear 85, 375, 381
differentiation 1
diffusion
  and precipitation 467
  basalt-rhyolite 259
  boundary layer 436
  definition 406, 419
  periodic boundary conditions 434
  radial flux 446
  radioactive decay 439, 451
  semi-infinite medium, parallel flux 428
  slab, parallel flux 437
  solidification 442, 522
  sphere in a well-stirred solution 449
  uphill 422, 470
diffusion coefficient 420
  variable with time 453
diffusivity chemical see diffusion coefficient
disequilibrium radioactive 88
disequilibrium fractionation crystal growth 442, 522
dispersal passive tracer 154, 412
dispersion matrix see covariance matrix
distillation see incremental process
distribution function 173
divergence 139
divergence theorem 404
Doerner-Hoskins law 36
dot product 55
Duhamel's principle 451, 476
dynamic systems 344

eddy diffusivity 464
eigencomponents 73, 86, 140, 214, 216, 237, 238, 282, 375, 380
  singular value decomposition (SVD) 75
  symmetric matrix 75
eigenvalues see eigencomponents
eigenvectors see eigencomponents
electroneutrality 320
elemental fractionation 387
ellipsoid 78
end-member 1
entropy 129, 150
equilibrium constant 319, 394, 395, 477
equilibrium in solutions 320
erf 313, 430, 471
erfc see erf
erosion rate 410
error calculation 217, 291, 306
error ellipsoid 80, 206, 212, 215, 285, 306
error function see erf
error propagation 217, 233, 306
estimate 184, 204, 249
estimator 184, 203
Euclidian space 55
expectation 175, 249
expected value see expectation
exponential see pdf

exposure age 410
extremum see minimum, maximum
F see pdf
feasible set 148, 340
FeO/MgO ratio 12, 20, 39, 126
finite-differences
  advection term 165
  explicit 157
  implicit 157
  prescribed flux at boundary 162
flushing time 347
flux
  material 401
  species 402
  volume 401
FONI (first-order non-isothermal) reactor 361
forcing 345, 380
Fourier series 100
fractional condensation isotopic effects in rain 46
fractional crystallization 35, 114, 126, 491
  inverse problem 495
  isotopic effects 38
  ratios 36, 494
fractional melting 43, 497
  aggregated melt 498, 499
  density function 192
free variable 340
front velocity 417
function spaces
  Legendre polynomials 104
  spherical harmonics 107
  trigonometric functions 99
gamma see pdf
Gaussian see pdf (normal)
geometric transformation 62
Gershgorin circles 82, 375, 378
Gibbs energy minimum 319, 331, 340
global geochemical models 386, 392
gradient projection 307, 334
gradient vector (grad) 138, 445
Gram-Schmidt see orthogonalization
Green function 348
heat and mass transfer coupling 361
heterogeneities 359, 413
heterogeneous equilibrium 318
homogeneous equilibrium 318, 320
Hotelling's T2 see pdf
hydrothermal alteration
  87Sr/86Sr in open system 50
  δ18O in closed system 23
  δ18O in open system 48
hyperbola binary mixing 18, 19, 262
hypsometric curve 393
ice cap isotopic effect of melting 13
ICP-MS 253, 310
ideal gases 331
ierfc see erf
incompatible element 477, 489
incremental process 34, 35, 38, 114, 126, 491
  ratios 36, 123, 494
independent random variables 201
influence of data 250

inner product see dot product
interpolation 132
inverse matrix 60
inverse methods 248
inverse problem see continuous inverse model
ion probe 183, 221, 251, 292
Ir pulse 408
isobaric interference see peak stripping
isochron 125, 294, 303
isopleth velocity (non-Henry's law behavior) 416
isotope dilution 14, 229, 253
  optimal 111
Jacobian 207
K-T boundary 408
KD 36, 39, 126
kernel functions 313
kinetic exchange 93
Lagrange multipliers 149, 279, 282, 295, 332
lakes 350
Laplacian 420, 446
least-square
  constraints see constrained least-squares
  criterion 249
  errors, linear 288
  errors, non-linear 294
  hyperbola 262
  non-linear 273
  plane 257
  polynomial 258
  straight line 255
Legendre polynomials 104
Leibniz's rule 120, 418, 441
lever rule 5, 16
Liesegang rings 467
linear array see mixing, AFC
linear function spaces see function spaces
linear programming 148, 340
liquid line of descent 115
loading see component
log-log plot 37, 44, 493, 497, 513
magma chamber
  periodic regime 503
  steady-state 502
magma residence time 357, 503
mantle components 28, 243
Mantle Plane 245
marble-cake mantle 413
marginal density function 201, 211
mass action 319
mass balance
  concentrations 2
  ratios 11
mass discrimination (fractionation) 121, 229
mass interference see peak stripping
Matano interface 275, 423
matrix
  inverse 60
  operations 53
  orthogonal 60, 62
  special 53
  square-root of 76, 289
  subspaces 57
  trace 61
matrix exponential 86, 375, 381
maximum 139, 144
mean 175
  sample 184, 204
mean square of weighted deviations see MSWD
mean squared distance of diffusion 429
metasomatic front 417
metasomatism 414
metric tensor 68
mg# 20, 21
Milankovich see periodogram
mineral reaction 9
mineral removal 5, 8, 39
mineralogical matrix 9, 220, 283, 318
minimum 139, 144
mixing 1
  binary 3 et seq.
  bulk 1
  concentrations 1
  conservative 1
  elemental and isotopic ratios 11, 28
  hyperbola see hyperbola
  mantle components 28
  ratio-concentration relationship 15
  ratio-ratio relationships 18
  ratio-ratio ternary 28
  retrieving concentration ratio 26
  ternary 6
mixing length 465
mixing time 354, 359, 413
mixture ideal gases 331
mobility 421
modal abundances minerals 7, 220, 281
mode 191
moment 175
moment generating function 176
Monte-Carlo simulation 233
moving reference frame 407
MSWD (mean square of weighted deviations) 291
multiple-box
  model isotopic systems 386
  model linear 371
Nd crustal residence age 226, 371
Nd isotopes 22, 28, 225, 226, 229, 366, 389
Newton method 123, 335
Newton-Raphson method 142, 299, 303, 320
Ni-Mg fractionation 41
normal see pdf
normalized variables 31
  AFM plot 32
oceanic islands 244, 262, 271
ODE see ordinary differential equations
olivine fractionation 5, 32, 39, 41, 126
one-box model 345
  isotopic ratios 355
  non-reactive species 346
  periodic input 351
  radioactive species 353
  reactive species 348
open-system exchange isotopic ratios 47

orbital frequency see periodogram
ordinary differential equations
  Euler method 129
  Runge-Kutta method 130, 152
orthogonal functions see function spaces
orthogonal matrix 60
orthogonalization 72, 105
outer product 55
outgassing see Ar loss
outlier 196, 240
oxygen isotopes 13, 23, 26, 38, 46, 48, 93, 190, 267
P cycle 377
partial differential equation (PDE) finite differences 155
partition coefficient 477
pattern formation 467
Pb isotopes 125, 198, 205, 213, 271, 287, 303
PCA (principal component analysis) 237
  component 237
  loading 240
PDE (partial differential equation) 155
pdf (probability density function) 173
  beta 181
  Cauchy 180
  chi-squared 181, 188, 189, 197, 206, 209, 289, 291, 301
  exponential 178, 189
  gamma 180, 187
  Hotelling's T2 206, 212
  joint multivariate 200, 205
  log-normal 179, 189, 199
  normal 179, 186, 187, 473
  Poisson 181
  Snedecor's F 181, 206, 212, 216
  Student's t 182, 196, 209, 212
  uniform 178
  various, relations 183
peak stripping 221, 251, 253, 292
percentile 175
percolation 407, 514
  Henry's law 414, 514
  non-Henry's law behavior 517
periodogram 264
pH 323, 325, 327, 400
phase chemical 1
phase shift 351
Pi variable 487, 498
picrites 42
point source 428
pooled mean see weighted mean
population 184
population dynamics 366
pore water 463
porosity 414, 450
principal component analysis see PCA
probability density function see pdf
probability distribution function 173
probability ellipsoid see error ellipsoid
productivity 393
projection
  oblique 68
  orthogonal 65, 250, 274, 290
propane combustion 335

quadratic form 55
  associated ellipsoid 78
quadric 78
radioactive disequilibrium 88
ramp function 103, 447
random variable 173
  change of 185, 206
rare-earth elements 216, 221
rate of stretching 413
Rayleigh law 36, 492
reaction mineral 9
reactivity 349
recipe 318, 332
relaxation 345
reservoir see box
residence time 347
  reactivity 349, 359
river 5
rock 3
root of equations 123, 142
rotation matrix 62
Runge-Kutta method 130, 152
runoff 393
sample 184
scaling matrix 65
scavenging length 465
seawater 13, 355, 394
sediment recycling 367
separation of variables 437
Shaw's Pi see Pi
signal drift 310
significance level 196
simplex method 148
singular value decomposition (SVD) see eigencomponents
solubility
  CO2 in seawater 394
  minerals in melts 257
solutions equilibrium in 320
spallogenic argon 317
spherical harmonics 107, 269
spline functions 132, 154
Sr isotopes 12, 16, 22, 26, 28, 50, 211, 233, 355, 357, 358, 508
Sr-Ca fractionation brines 39
stability
  one-box system 360
  system of differential equations 98
  thermodynamic 117, 143
standard deviation 175
  matrix 203
standard error 185, 286
standardized variable 175
statistic 184
statistical distance 284, 286
steady-state 350, 354, 376, 380, 412, 461, 464, 511
steady-state magma chamber 502
steepest-descent 144
sterile phase 482
stoichiometric coefficient 9, 283, 319
Student's t see pdf
sulfate reduction 461

SVD (singular value decomposition) see eigencomponents
system closed/open 3
system of linear differential equations see differential equations
t distribution see pdf
tangent equation of 114
Taylor expansion 120
ternary plot 31
theta functions 474
total inverse 307
trace element 477
  and magmatic processes 521
  choosing the right 518
trace of a matrix 61, 65, 73
tracer dispersal 154, 412
transfer function 353
transition matrix 375
transport equation 405

trapped melt 520, 500
turbulent diffusivity see eddy diffusivity
U-Pb dating 125
unit response function 348
uphill diffusion see diffusion
variance 175
  sample 184
variance-covariance matrix see covariance matrix
water-rock ratio 25, 48
weathering 394
weighted-mean 285
zone-refining 510
  partially molten zone 510
  steady-state 511

