QUANTITATIVE SPECTROSCOPY: THEORY AND PRACTICE
BRIAN C. SMITH Spectros Associates Shrewsbury, Massachusetts
ACADEMIC PRESS
An imprint of Elsevier Science
Amsterdam • Boston • London • New York • Oxford • Paris • San Diego • San Francisco • Singapore • Sydney • Tokyo
This book is printed on acid-free paper.
© 2002 Elsevier Science (USA). All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the Publisher.
The appearance of the code at the bottom of the first page of a chapter in this book indicates the Publisher's consent that copies of the chapter may be made for personal or internal use of specific clients. This consent is given on the condition, however, that the copier pay the stated per copy fee through the Copyright Clearance Center, Inc. (222 Rosewood Drive, Danvers, Massachusetts 01923), for copying beyond that permitted by Sections 107 or 108 of the U.S. Copyright Law. This consent does not extend to other kinds of copying, such as copying for general distribution, for advertising or promotional purposes, for creating new collective works, or for resale. Copy fees for pre-2002 chapters are as shown on the title pages. If no fee code appears on the title page, the copy fee is the same as for current chapters. $35.00
Explicit permission from Academic Press is not required to reproduce a maximum of two figures or tables from an Academic Press chapter in another scientific or research publication provided that the material has not been credited to another source and that full credit to the Academic Press chapter is given.
Academic Press, An Elsevier Science Imprint, 525 B Street, Suite 1900, San Diego, California 92101-4495, USA
http://www.academicpress.com
International Standard Book Number: 0 12 650358 3
Typeset by Keyword Publishing Services Ltd, Barking, Essex
Printed in Great Britain by MPG Books Ltd, Bodmin, Cornwall
02 03 04 05 06 07 MP 9 8 7 6 5 4 3 2 1
CONTENTS
Preface
Acknowledgments
Chapter 1 FUNDAMENTALS OF MOLECULAR ABSORPTION SPECTROSCOPY
Chapter 2 SINGLE ANALYTE ANALYSIS
Chapter 3 MULTIPLE COMPONENTS I: LEAST SQUARES METHODS
Chapter 4 MULTIPLE COMPONENTS II: CHEMOMETRIC METHODS AND FACTOR ANALYSIS
Chapter 5 IMPLEMENTING, MAINTAINING, AND FIXING CALIBRATIONS
GLOSSARY
INDEX
PREFACE
The quantitation of the concentrations of molecules in samples has long been an important application of spectroscopy. In the last two decades, advances in algorithms, computers, instruments, and software have led to a burgeoning of interest in this field. These developments mean samples and analytes that were once considered intractable are increasingly yielding usable calibrations. The purpose of this book is to give readers a thorough grounding in the theory and practice of modern quantitative spectroscopic analysis. This book is geared towards anyone using spectroscopic absorbance measurements to determine concentrations in unknown samples, and should appeal to users of infrared, near-infrared, and UV-Vis instruments.
There are several aspects of this book that make it, I believe, a valuable contribution to this field.
1. The balance between theory and practice. Relevant theory is interspersed with practical discussions. Equations are explained at length to make difficult concepts comprehensible. I have attempted to strike a balance between the mathematical and experimental aspects of this field.
2. Example calibrations using real world data. In many chapters actual experimental data are used to generate example calibrations. I believe there is no substitute for real data to make understandable the important concepts of this field.
3. One stop shopping. There are many aspects of quantitative analysis, including instrumentation, software, mathematics, and experimental technique. This book pulls together information from all these fields in one place. Everything you need to know to obtain a fundamental understanding of quantitative spectroscopy is here.
4. This book is written at an introductory level. Increasingly, calibrations are being developed and implemented by people without math degrees. This book is written for them. All technical terms appear in italics, and are defined in the glossary at the end of the book. Much effort has been expended to make difficult concepts understandable to novices.
5. A spectroscopist's perspective. Spectra comprise half the information in a calibration. How the spectra are measured, their quality, and what they mean is important. Being a spectroscopist, I have included topics on the spectroscopic part of quantitative spectroscopy to emphasize its importance.
The book begins with a theory and background chapter. The properties of light and the interaction of light with matter are the first topics introduced. Next, Beer's Law is derived, and the complexities behind the absorptivity (ε) are discussed. At its root, the absorption of light by matter is a quantum mechanical process, and understanding this process is key to understanding calibrations. This is discussed in the appendix to Chapter 1. The appendix shows that many of the features of spectroscopy, including the existence of quantized energy levels and selection rules, are a natural outgrowth of microscopic bound systems. Do not let this scare you off, the appendix is comprehensible even if you have never taken a course in physical chemistry.
The second chapter of the book focuses on single analyte determinations.
It shows how Beer's Law is used in practice, and provides a detailed look at how linear regression and the least squares algorithm are used to generate calibration lines. Calculating statistics that give the accuracy and robustness of a calibration is discussed. The chapter concludes with a discussion of standard methods, and practical tips on how to avoid experimental error in analyses. A simple single analyte example calibration is used throughout to illustrate important concepts.
The third chapter covers the least squares methods (inverse and classical) of obtaining multi-analyte calibrations. A review of matrix algebra is followed by the matrix algebra behind least squares techniques. The ideas of linear regression are extended to multiple analytes. The chapter includes a discussion on the strengths and weaknesses of each of these methods.
A multicomponent system is used to generate example calibrations to illustrate important concepts.
A large part of the rest of the book is devoted to factor analysis (chemometrics). The mathematics of factor analysis algorithms are discussed in words and equations. Then, readers are presented with a process that they should follow to obtain successful chemometric calibrations. The process starts with ascertaining the quality of the raw data used in the calibration, proceeds to checking a calibration for predictive ability, and ends with using independent data to validate a calibration. A multicomponent system is used throughout to generate example calibrations to illustrate important concepts.
Finally, the last chapter discusses the practical aspects of implementing and supporting a calibration, and the theoretical and practical limits of quantitative analysis in general.
There are those who may prefer to treat the instruments and calibration methods of quantitative spectroscopy as a black box. My goal here is to open that box. You might be able to obtain a calibration without the knowledge in this book. However, if you want a calibration that is of high quality, is robust, and is implemented properly, then an understanding of the topics in this book is necessary.
I will be the first to admit that no book is perfect, and that most certainly this book can be improved. Therefore, I welcome your comments and criticisms, and can be contacted at the e-mail address below.
BRIAN C. SMITH, PH.D.
Shrewsbury, MA
bcsmith@spectrosl.com
September 2002
ACKNOWLEDGMENTS
The writing of a book, even by a single author, is never a solitary endeavor. The support, advice, and love of a wide group of people is necessary to bring such creations to fruition. I would first like to thank my reviewers, John Winn, Howard Mark, and Rich Kramer. Their input and constructive criticism have made this a much better book.
This book was developed as part of the Quantitative Spectroscopic Analysis course I teach for my company, Spectros Associates. It has been made better by comments from many of my course attendees. I would like to thank my students for their honesty, and for helping me earn a living.
Software is an overriding aspect of quantitative spectroscopy these days. Obtaining and learning how to use quantitative software packages was key to generating the real world calibrations found in this book. The factor analysis calibrations in this book were generated using the PLSPlus/IQ software package from Galactic Industries of Salem, NH. The people there, including Jamie Duckworth, Sue Kaplan, Tony Nip, and Brie Gervin were most helpful in answering my questions, and in solving my problems with the software. The P-matrix calibration discussed in Chapter 3 was generated using the Optical Solutions Multiple Linear Regression Software Package from Optical Solutions of Folsom, CA. Don Goldman of that company was very helpful in obtaining me a copy of the program, and helping me troubleshoot it. Thank you Don.
There has been a parade of editors in my life during the writing of this book, but they have all contributed to its completion. Emma Roberts was smart enough to see the need for a book such as this, Carla Kinney provided support and encouragement, and Derek Coleman made this book a reality by setting a deadline for me. Thank you all.
Above all, it takes the love, support, and encouragement of one's family to make a book a possibility. I would like to thank the lovely ladies in my life, Marian, Eleanor, and Isabel.
Dedicated to... all my parents
1 FUNDAMENTALS OF MOLECULAR ABSORPTION SPECTROSCOPY
I. Terms and Definitions
This book is devoted to quantitative molecular absorption spectroscopy, one of the most important and common applications of spectroscopy. The word quantitative means we are measuring the concentrations of the chemical species in a sample. The word molecular means that we are interested in the molecules, not the atoms or elements, in a sample. The word absorption means we will use the amount of light absorbed by a sample to quantify the amount of substance present. Thus, the purpose of spectroscopic quantitative analysis in this context is to determine the concentration of a molecule or molecules in a sample using absorbance spectra.
The molecule whose concentration is measured is called the analyte. The peak heights or areas in an analyte's absorbance spectrum are directly proportional to the concentration of the analyte. To establish the correlation between absorbance and concentration, we must take the spectra of standards, samples that contain known concentrations of the analyte. Through a process called calibration, a mathematical model is generated that correlates the absorbances to the known concentrations in the standard samples. Once an accurate calibration is in hand, the spectrum of a sample with an unknown concentration of the analyte, the unknown, is measured. The concentration of the analyte in the unknown sample is predicted by applying the calibration to the absorbances in the unknown spectrum.
Spectroscopic quantitative analyses are based on the assumption that calibrations model the standard and unknown samples equally well. Much of the work involved in obtaining a calibration is done to assure that this is true. Since we can never prove with 100% certainty that a given calibration gives a completely accurate description of an unknown sample, it is proper to say that we predict unknown concentrations rather than calculate them. Concentrations always contain a certain amount of error, and to say that they are calculated means we know their exact value, which is impossible.
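The calibrate-then-predict workflow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the concentrations and absorbances are invented numbers, not data from this book, the function names are hypothetical, and a straight-line least squares fit is assumed as the calibration model.

```python
# Hypothetical standards: known analyte concentrations (arbitrary units)
# and the peak absorbance measured for each. These numbers are invented
# for illustration only; they are not data from this book.
concentrations = [1.0, 2.0, 3.0, 4.0, 5.0]
absorbances = [0.11, 0.20, 0.31, 0.39, 0.50]


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_concentration(absorbance, slope, intercept):
    """Apply the calibration line to the absorbance of an unknown sample."""
    return (absorbance - intercept) / slope


# Calibration: model absorbance as a linear function of concentration.
slope, intercept = fit_line(concentrations, absorbances)

# Prediction: invert the calibration for a hypothetical unknown.
unknown_absorbance = 0.27
predicted = predict_concentration(unknown_absorbance, slope, intercept)
print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"predicted concentration of unknown: {predicted:.2f}")
```

The result is called a prediction rather than a calculation for the reason given in the text: the fitted line carries the error of the standards, so the concentration it returns is an estimate.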
II. The Properties of Light
Spectroscopy is the study of the interaction of light with matter. The fundamental measurement obtained in spectroscopy is a spectrum, which is a plot of measured light intensity versus some property of light. An example of a spectrum, the mid-infrared absorbance spectrum of polystyrene, is seen in Figure 1.1. The Y-axis of this spectrum is in absorbance, which is a measure of how much light a sample absorbs. The X-axis is in wavenumbers (cm⁻¹), a property of light discussed later. An instrument used to measure a spectrum electronically is called a spectrophotometer, or sometimes simply a spectrometer. There are many different kinds of spectrophotometers in the world, and they use many different types of light to obtain spectra. The details of how these instruments work are beyond the scope of this book. Several of the books listed in the bibliography of this chapter are excellent references on how different spectrophotometers work.
Figure 1.1 The mid-infrared spectrum of polystyrene. Note that the X-axis units are wavenumber (cm⁻¹), and that the Y-axis units are absorbance.
To understand how a molecule absorbs light, we must first understand something about the properties of light. Light can be thought of as being a wave or a particle, depending upon the particular property of light under consideration. This wave-particle duality is an inherent feature of light. For now, we will consider light as a wave. Light beams are composed of electric and magnetic waves that undulate in planes perpendicular to each other. Light is properly called electromagnetic radiation because it contains electric and magnetic waves. The light wave traverses through space in a direction defined by the line where the two planes containing the waves intersect. The interaction of the electric wave of light, the electric vector, with matter is what is usually measured to obtain absorbance spectra. The amplitude of the electric vector changes over time and has the form of a sine wave, as shown in Figure 1.2.
One of the properties used to distinguish between different types of light is the light's wavelength. A wavelength is the distance between adjacent crests or troughs of a wave, as seen in Figure 1.2. Wavelength is denoted by the lowercase Greek letter lambda (λ). Different types of light have different wavelengths. For example, infrared radiation is longer in wavelength than visible light, which is longer in wavelength than X-rays.
Another property of a light beam is its frequency. Frequency is denoted by the Greek letter nu (ν), and equals the number of cycles a wave undergoes per second. A cycle is considered complete when a light wave starts at zero and then crosses the X-axis twice. The wave in Figure 1.2 undergoes almost three cycles. Frequency is measured in cycles/s or Hertz (Hz). Frequency is a measure of the number of cycles of a light wave per unit time. The frequency, wavelength, and speed of a light beam are related to each other via the following equation:
$$c = \nu\lambda \tag{1.1}$$
where
c = the speed of light (3 × 10¹⁰ cm/s)
ν = frequency in Hertz (s⁻¹)
λ = wavelength
This equation shows that the product of frequency and wavelength for a light wave is a constant, the speed of light.
Figure 1.2 A plot of the amplitude of the electric vector of a light wave versus time. The arrow denotes the distance between adjacent crests, and is called the wavelength, λ. Note that the polarity of the wave, denoted by the + and − signs, changes over time. Note also the definition of a cycle.
Another property used to describe light is its wavenumber. A wavenumber is defined as the reciprocal of the wavelength as follows:
$$W = 1/\lambda \tag{1.2}$$
where
W = wavenumber
λ = wavelength
If λ is measured in cm, then W is reported as cm⁻¹ or reciprocal centimeters. A wavenumber measures the number of cycles in a light beam per unit length. If we substitute Equation (1.2) into (1.1) and solve for c, we obtain
$$c = \nu/W \tag{1.3}$$
which upon rearranging gives
$$\nu = cW \tag{1.4}$$
Equations (1.1)-(1.4) show that light waves may be described by their frequency, wavelength, or wavenumber. These equations also show that these three quantities are related to each other. Throughout this book, we will usually refer to light waves by their wavenumber. However, at times it will be more convenient to refer to a light beam's frequency or wavelength.
As mentioned above, light can also be thought of as a particle. A particle of light is called a photon. A photon has no mass, but it does have energy. The energy of a photon is directly related to frequency as follows:
$$E = h\nu \tag{1.5}$$
where
E = photon energy in Joules
h = Planck's constant (6.63 × 10⁻³⁴ J s)
ν = frequency in Hertz (s⁻¹)
If we substitute Equation (1.4) into (1.5), we obtain
$$E = hcW \tag{1.6}$$
which shows that photon energy also depends on the wavenumber. Note that high wavenumber light has more energy than low wavenumber light.
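As a small illustration of Equations (1.2), (1.4), and (1.6), the sketch below converts a wavenumber in cm⁻¹ into wavelength, frequency, and photon energy. The function and constant names are hypothetical, and the constants are the rounded values quoted in the text rather than more precise reference values.

```python
C_CM_PER_S = 3.0e10   # speed of light in cm/s, the rounded value used in the text
H_J_S = 6.63e-34      # Planck's constant in J s, as quoted in the text


def wavelength_cm(wavenumber):
    """Equation (1.2) rearranged: wavelength in cm = 1 / W, with W in cm^-1."""
    return 1.0 / wavenumber


def frequency_hz(wavenumber):
    """Equation (1.4): nu = c * W."""
    return C_CM_PER_S * wavenumber


def photon_energy_j(wavenumber):
    """Equation (1.6): E = h * c * W."""
    return H_J_S * C_CM_PER_S * wavenumber


# Tabulate the three quantities for a few example wavenumbers.
for w in (400.0, 4000.0, 14000.0):
    wavelength_um = wavelength_cm(w) * 1.0e4  # 1 cm = 10,000 micrometers
    print(f"W = {w:7.0f} cm^-1  wavelength = {wavelength_um:6.2f} um  "
          f"frequency = {frequency_hz(w):.2e} Hz  "
          f"energy = {photon_energy_j(w):.2e} J")
```

Because frequency and energy are both proportional to W while wavelength is its reciprocal, the printed rows show frequency and energy rising and wavelength falling as the wavenumber increases.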
III. The Electromagnetic Spectrum
There are types of light in addition to the visible light that we can detect with our naked eyes. All the different types of light are called the electromagnetic spectrum. Each different type of light can be characterized by a different frequency, wavelength, wavenumber, or energy. A section of the electromagnetic spectrum is illustrated in Figure 1.3. Note that in reading Figure 1.3 from right to left the frequency, the wavenumber, and the energy increase while wavelength decreases.
Figure 1.3 The electromagnetic spectrum, showing the wavenumber, wavelength, frequency, and energy ranges for different types of light. Also illustrated is how wavelength, wavenumber, frequency, and energy change across the spectrum. [Regions shown, from left (higher wavenumber, frequency, and energy; shorter wavelength) to right (lower wavenumber, frequency, and energy; longer wavelength): visible and ultraviolet, 40,000 to 14,000 cm⁻¹; near-infrared, 14,000 to 4000 cm⁻¹; mid-infrared, 4000 to 400 cm⁻¹; far-infrared, 400 to 4 cm⁻¹; microwaves, < 4 cm⁻¹.]
When performing a quantitative absorption experiment, the first thing to decide upon is what type of light to use in the analysis. The type of light chosen for an analysis affects the types of samples that can be investigated, the sample preparation necessary, the type of instrument that can be used, and ultimately impacts the calibration quality. The types of light listed in Figure 1.3 are the ones most commonly used in quantitative absorption spectroscopy.
The lowest energy light seen in Figure 1.3 is microwaves, which appear below 4 cm⁻¹ (this is the type of radiation used in microwave ovens). When this type of light is absorbed by a molecule, there is an increase in the rotational energy of the molecule. This is why microwave spectroscopy is sometimes called rotational spectroscopy. This technique is typically limited to gases because gas phase molecules are free to rotate, whereas solid and liquid phase molecules are not. Although microwave spectrometers exist, and have been used to perform quantitative analyses, their use is not widespread.
Next in energy above microwaves is far-infrared radiation, found from 400 to 4 cm⁻¹. When molecules absorb far-infrared light, the energy excites vibrational motion of the molecule's bonds. However, far-infrared absorbances are low in energy, and are typically found in heavy molecules such as inorganic and organometallic substances. Most organic molecules do not absorb in the far infrared, limiting the types of molecules this wavenumber range can be used to analyze.
Mid-infrared radiation is found between 4000 and 400 cm⁻¹. Chemical bonds vibrate when they absorb mid-infrared radiation, but with more energy than in the far infrared. Many of the molecules in the universe (of which there are more than 10 million) give usable mid-infrared spectra. Mid-infrared absorbances are intense, and often times only a milligram of material is needed to obtain a spectrum. Additionally, almost every type of material, including solids, liquids, gases, polymers, and semi-solids can have their mid-infrared spectra measured. A disadvantage of the strength of mid-infrared absorbances is that it is easy for a sample to absorb all the light impinging on it at a given wavenumber, making it difficult to record a spectrum.
As a result, a good deal of time consuming manual sample preparation can be involved in making the sample dilute enough or thin
enough so that it absorbs the right amount of light. This problem with mid-infrared samples is called the thickness problem. However, mid-infrared light is commonly used for quantitative analysis. Throughout the rest of this book, the terms mid-infrared and infrared will be used interchangeably.
From 14,000 to 4000 cm⁻¹ lies the near-infrared region of the electromagnetic spectrum. Like in the mid-infrared, molecules vibrate when they absorb near-infrared light, but with higher energy than in the mid-infrared. A disadvantage of near-infrared absorbances is that they are typically 10-100 times weaker than mid-infrared absorbances. This is bad if only a small amount of sample is available. However, near-infrared samples do not suffer from a thickness problem, so sample preparation can be faster and easier compared to mid-infrared spectroscopy. Because of this, and of the high performance of near-infrared spectrometers, there has been an explosion of quantitative applications of near-infrared spectroscopy in the last 30 years.
The highest energy light we will consider in this book is the ultraviolet and visible (UV-Vis). This type of light falls from 40,000 to 14,000 cm⁻¹. When this type of light is absorbed by a molecule, an electronic transition takes place and the light's energy promotes an electron from a lower energy level to a higher energy level. Although UV-Vis absorbances can be intense, only molecules with certain types of chemical bonds absorb UV-Vis light, somewhat limiting the types of molecules that can be analyzed. Despite this limitation, UV-Vis light was historically the first type of light widely used in quantitative analysis, and is still widely used today.
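The wavenumber ranges quoted in this section can be collected into a small lookup, sketched below in Python. The table and function names are hypothetical. Because the text gives neighboring regions a shared endpoint, the sketch assigns a boundary value to the higher-energy region, which is an arbitrary choice made here and not something stated in the book.

```python
# (upper bound, lower bound, name) in cm^-1, following the ranges given in the text.
REGIONS = [
    (40000.0, 14000.0, "ultraviolet-visible"),
    (14000.0, 4000.0, "near-infrared"),
    (4000.0, 400.0, "mid-infrared"),
    (400.0, 4.0, "far-infrared"),
    (4.0, 0.0, "microwave"),
]


def spectral_region(wavenumber):
    """Return the name of the region containing a wavenumber in cm^-1.

    Regions are checked from highest to lowest energy, so a value that sits
    exactly on a shared boundary is assigned to the higher-energy region.
    """
    if wavenumber <= 0:
        raise ValueError("wavenumber must be positive")
    for upper, lower, name in REGIONS:
        if lower <= wavenumber <= upper:
            return name
    return "outside the ranges discussed in the text"


for w in (20000.0, 6000.0, 3000.0, 100.0, 1.0):
    print(f"{w:8.0f} cm^-1 -> {spectral_region(w)}")
```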
IV. Beer's Law
The basis of most quantitative spectroscopic analyses is Beer's law. This law relates the amount of light absorbed by a sample, a spectroscopically observable property, to the concentration of absorbing species in the sample. This relationship is what allows absorbance measurements to be used to predict concentrations. To derive Beer's law, we assume the experimental setup shown in Figure 1.4. We also assume that monochromatic light of wavelength λ impinges upon the sample perpendicular to one of its faces. Before the light enters the sample, it has intensity I₀. Light intensity is defined as the number of photons of light hitting a unit area per unit time. The thickness or pathlength of the material is denoted by L. An infinitesimally thin slab of the absorbing material is denoted by dL. When the light leaves the sample it has intensity I < I₀ due to absorbance of the light by the sample.
Figure 1.4 A sample of absorbent material of pathlength L. Monochromatic light of wavelength λ and initial intensity I₀ impinges on the sample perpendicular to it. An infinitesimally thin slab of the material is denoted by dL. The light leaves the sample with a lower intensity I due to the sample absorption at wavelength λ.
Figure 1.5 An elastic collision between a photon and a methane molecule. Ep is the energy of the photon. [Before collision: a photon with energy Ep and a methane molecule at rest. After collision: the photon still with energy Ep and the methane molecule still at rest.]
One way to think about how molecules absorb light is to consider the interaction as a collision between two particles. Imagine photons of light and molecules as simple particles shaped like spheres. When two particles collide (be they molecules or billiard balls), several different types of collision can occur. The particles may undergo an elastic collision, which is illustrated in Figure 1.5. Before the collision, the incoming photon has energy Ep = hcW as stated in Equation (1.6), and imagine the methane molecule has no kinetic or vibrational energy, and can be considered at rest. By definition, an elastic collision results in no net energy exchange between the molecules. After the collision, the photon still has energy hcW, and the methane molecule is still at rest. The only thing that has changed is the direction of travel of the photon, and to a lesser extent, the position of the methane molecule. Elastic collisions between photons and molecules result in a phenomenon called Rayleigh scattering, in which the direction but not the energy of the photon is changed. The intensity of Rayleigh scattering is proportional to the fourth power of the wavenumber of the photon, W, involved in the scattering. The highest wavenumber light we can see is blue light, so blue light is
Rayleigh scattered more intensely than other colors of light. Molecules in the upper atmosphere preferentially scatter blue light, which is why the sky is blue.
Another type of photon/molecule collision is the inelastic collision. In this type of collision, energy is exchanged between the particles, and they leave the collision with different energies than before the collision. This is illustrated in Figure 1.6. Before the collision, the incoming photon has energy Ep, and the methane molecule is at rest. After the collision, some of the photon's energy is deposited into the molecule as vibrational energy, Ev. The photon's energy after the collision is Ei (i for inelastic). All processes must follow the law of conservation of energy, thus, Ep = Ev + Ei. The concentric circles in Figure 1.6 indicate that the methane molecule is vibrationally excited. Also, note that the direction of the photon has changed after the collision. Inelastic collisions between molecules and photons give rise to a phenomenon called Raman scattering. The amount of energy lost by the photon, Ev, is characteristic of the molecule involved in the scattering. Thus, after the collision the inelastically scattered photon carries chemical information. When the intensity and the wavenumber of these photons are analyzed and plotted, one obtains a Raman spectrum. Raman spectra are similar to infrared spectra in that they measure the vibrational energy levels of a molecule. Raman spectra can be used to quantitate chemical species, but this is beyond the scope of this book.
Figure 1.6 An inelastic collision between a photon and a methane molecule. Energy from the photon is deposited into the molecule as vibrational energy, Ev. The photon leaves the collision with energy Ei. [Before collision: a photon with energy Ep and a methane molecule at rest. After collision: a photon with energy (Ep − Ev) = Ei and a vibrationally excited methane molecule with energy Ev.]
A third thing that can occur when a photon encounters a molecule is a totally inelastic collision. This is illustrated in Figure 1.7. Before the collision, the photon has energy Ep, and the methane molecule is at rest. After the collision the photon has disappeared; all its energy has been absorbed by the molecule leaving it excited. This phenomenon is known as absorbance. The wavenumber of the light absorbed and the intensity with which it is absorbed depends upon the molecule involved in the collision. Thus, chemical information can be derived from a plot of absorbance intensity versus wavenumber, called an absorbance spectrum. Such a spectrum is seen in Figure 1.1.
Figure 1.7 An inelastic collision between a photon and a molecule, known as absorbance. [Before collision: a photon with energy Ep and a methane molecule at rest. After collision: an excited methane molecule with energy Ep.]
A totally inelastic collision between a photon and a molecule results in the disappearance of the photon, and the transfer of all the photon's energy into the molecule. The total amount of light absorbed by a sample is simply equal to the total number of photons that undergo totally inelastic collisions with molecules. The decrease in the number of photons leaving the sample will give rise to an absorbance feature in the spectrum of the sample. Now, photons may also be scattered by macroscopic-size particles such as dust grains, leading to a decrease in photons exiting the sample as well. Experimentally, this will look like absorbance but it is not. Thus, we are tacitly assuming that our sample has no other species present other than molecules that will interact with the light beam.
In Figure 1.4, the change in light intensity across slab dL of the sample will be denoted by dI. The more photons and molecules there are in dL, the more collisions will occur. The number of molecules in dL is simply determined by the analyte concentration, c. The number of photons present is given by the intensity of light in dL, which is simply I. Additionally, the thicker the slab dL, the more photons will be absorbed because there are more molecules encountered. Thus, we can write a proportionality for
the light absorbed in dL as -dIcxIcdL
(1.7)
where

dI = amount of light absorbed in dL
I = intensity of light impinging on dL
c = concentration of absorbing species in dL
dL = thickness of an infinitesimally thin slab of sample

and the negative sign of dI means that the intensity of the light beam decreases as it traverses dL. This proportionality assumes that the change in I across the width of dL is negligible because dL is infinitesimally thin. Thus, we assume that I is the same at all points in dL. Note that the amount of light lost simply depends on the number of analyte molecules present (c), the number of photons present (I), and the sample thickness (dL).

Equation (1.7) tells us what parameters determine the total number of photon-molecule collisions. Remember that molecules and photons can undergo several different types of collisions. Only some fraction of these collisions will be totally inelastic. It is the number of totally inelastic collisions that determines the amount of light absorbed. To calculate the number of these collisions, we must multiply the right-hand side of Equation (1.7) by the fraction of collisions that are totally inelastic. This number is called the absorptivity, and is denoted with the Greek letter epsilon, ε. The absorptivity can also be thought of as a probability: it is the probability that a photon-molecule collision will be totally inelastic. The absorptivity depends upon the identity of the molecule absorbing the light, and the wavelength of light being absorbed. Inserted into Equation (1.7), the absorptivity functions as a proportionality constant, making the formerly qualitative relationship between dI and concentration quantitative. We can then remove the proportionality sign from Equation (1.7) to obtain

$$-dI = \varepsilon\,I\,c\,dL \tag{1.8}$$
where ε = the absorptivity of the analyte molecule at wavelength λ and the other terms have the same meaning as before. We can combine terms and rewrite Equation (1.8) as follows:

$$-\frac{dI}{I} = \varepsilon\,c\,dL \tag{1.9}$$
Since both sides of this equation contain an infinitesimal, we can integrate both sides:

$$-\int_{I_0}^{I} \frac{dI}{I} = \varepsilon c \int_{0}^{L} dL \tag{1.10}$$
The left-hand integration limits in Equation (1.10) mean that we are integrating between the intensities before (I_0) and after (I) the light has passed through the sample. The right-hand integration limits cover the entire thickness of the sample, L. Note that ε and c have both been brought outside the integration sign. This means that we are assuming that the concentration and the absorptivity are not functions of pathlength. In essence, we are assuming that the concentration and absorptivity are the same everywhere in the sample, i.e. the sample is homogeneous. Remembering that the integral of 1/X with respect to X is ln X, we can apply this to Equation (1.10) and evaluate the integrals to obtain

$$\ln(I_0/I) = \varepsilon c L \tag{1.11}$$
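The step from Equation (1.8) to Equation (1.11) can be checked numerically. The following is a minimal sketch, not part of the original derivation: it walks a beam through many thin slabs, removes the amount of light given by Equation (1.8) in each one, and compares the result with the closed form implied by Equation (1.11). The function name and all numerical values are assumptions chosen purely for illustration, and the absorptivity here is the natural-logarithm version.

```python
import math

def transmitted_intensity_slabs(i0, epsilon, conc, length, n_slabs=100_000):
    """Step the beam through n_slabs thin slabs, applying Eq. (1.8) to each."""
    d_l = length / n_slabs          # thickness of one thin slab, dL
    intensity = i0
    for _ in range(n_slabs):
        # Eq. (1.8): light lost in one slab is epsilon * I * c * dL
        intensity -= epsilon * intensity * conc * d_l
    return intensity

# Hypothetical values, for illustration only (epsilon * conc * length = 1)
i0, epsilon, conc, length = 1.0, 50.0, 0.02, 1.0

i_numeric = transmitted_intensity_slabs(i0, epsilon, conc, length)
i_closed = i0 * math.exp(-epsilon * conc * length)   # Eq. (1.11) solved for I

print("slab-by-slab I :", i_numeric)
print("closed-form I  :", i_closed)
print("ln(I0/I)       :", math.log(i0 / i_numeric))
print("epsilon * c * L:", epsilon * conc * length)
```

As the slabs are made thinner, the slab-by-slab result approaches the closed form, which is the sense in which the integration in Equation (1.10) sums the losses over infinitesimally thin slabs.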
Equation (1.11) is one way of stating Beer's law. However, it is traditional to express Beer's law using base 10 logarithms rather than natural logarithms. Recall that ln(10) = 2.303, so that ln(X) = 2.303 log(X). Converting the left-hand side of Equation (1.11) to a base 10 logarithm therefore divides the right-hand side by 2.303. We wrap this factor into the absorptivity and rewrite to obtain

$$\log(I_0/I) = \varepsilon c L \tag{1.12}$$
To simplify Equation (1.12), we define a new quantity called the absorbance, denoted by A and given by

$$A = \log(I_0/I) \tag{1.13}$$
and this allows us to rewrite Beer's law one last time to obtain its final form

$$A = \varepsilon l c \tag{1.14}$$

where

A = absorbance
ε = absorptivity
l = pathlength (the sample thickness, written L in the derivation above)
c = concentration
This is the form in which Beer's law is most commonly expressed, and will be the form used throughout this book. What this equation tells us is that the amount of light absorbed by a sample depends on the concentration of the analyte, the thickness of the sample, and the sample's absorptivity. Also, note that the relationships in Beer's law are linear. For example, doubling the pathlength or concentration of a sample doubles its absorbance. Many spectrometers are capable of measuring spectra with the Y-axis in absorbance units. An example of a spectrum plotted in absorbance units is seen in Figure 1.1. Because the relationship between absorbance and concentration is linear, the peak height or area of an analyte's absorbance band will vary linearly with concentration. Sometimes spectra are plotted with the Y-axis units in transmittance, which is defined as follows

$$T = I/I_0 \tag{1.15}$$
where T = transmittance. The Y-axis values in a transmittance spectrum can vary from 1, when I = I_0, to 0, when I = 0 (this means no light is being passed by the sample). Transmittance measures the fraction of light transmitted by the sample. Spectra are also sometimes plotted with Y-axis units of percent transmission, which is defined as

$$\%T = T \times 100$$

where %T = % transmission. The Y-axis values in a % transmission spectrum can vary from 100%, when I = I_0, to 0%, when I = 0 (when no light is being passed by the sample). Percent transmission measures the percentage of light transmitted by a sample. An example of a spectrum plotted in % transmission units is seen in Figure 1.8. Note that the peaks point down. Rearranging Equation (1.15) shows that

$$1/T = I_0/I \tag{1.16}$$
Substituting Equation (1.16) into Equation (1.13) yields

$$A = \log(1/T) \tag{1.17}$$
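As a hedged illustration of how Equation (1.17) can be used to switch between Y-axis units, the sketch below converts between transmittance, percent transmission, and absorbance. The function names are hypothetical and are not taken from any particular spectroscopic software package.

```python
import math

def absorbance_from_transmittance(t):
    """Eq. (1.17): A = log10(1/T). Requires T > 0."""
    if t <= 0.0:
        raise ValueError("transmittance must be positive to take the logarithm")
    return math.log10(1.0 / t)

def transmittance_from_absorbance(a):
    """Eq. (1.17) solved for T: T = 10**(-A)."""
    return 10.0 ** (-a)

def percent_transmission(t):
    """%T = T x 100."""
    return 100.0 * t

for t in (1.0, 0.5, 0.1, 0.01):
    a = absorbance_from_transmittance(t)
    print(f"T = {t:<5} %T = {percent_transmission(t):6.1f}  A = {a:.3f}")
```

Each factor of ten lost in transmittance adds one absorbance unit, which is why strong bands that look similar in a transmittance plot can differ greatly in absorbance.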
Figure 1.8 The mid-infrared % transmittance spectrum of polystyrene.
Equation (1.17) establishes the relationship between absorbance and transmittance. Most spectroscopic software packages allow you to switch between the two units, and Equation (1.17) is used to make the conversion. If we substitute Beer's law (Eq. (1.14)) into Equation (1.17) we obtain

$$\varepsilon l c = \log(1/T) \tag{1.18}$$
Raising 10 to the power of each side of the equation yields

$$10^{\varepsilon l c} = 1/T \tag{1.19}$$
and rearranging yields the final result:

$$T = 10^{-\varepsilon l c} \tag{1.20}$$
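The contrast between Equations (1.14) and (1.20) is easy to see with numbers. The sketch below is illustrative only; the absorptivity, pathlength, and concentrations are hypothetical values, not data from the text.

```python
def absorbance(epsilon, pathlength, conc):
    """Beer's law, Eq. (1.14): A = epsilon * l * c."""
    return epsilon * pathlength * conc

def transmittance(epsilon, pathlength, conc):
    """Eq. (1.20): T = 10**(-epsilon * l * c)."""
    return 10.0 ** (-absorbance(epsilon, pathlength, conc))

EPSILON = 100.0    # liter/(mole cm), hypothetical
PATHLENGTH = 1.0   # cm, hypothetical

# Each concentration (mole/liter) is double the previous one
for conc in (0.001, 0.002, 0.004, 0.008):
    a = absorbance(EPSILON, PATHLENGTH, conc)
    t = transmittance(EPSILON, PATHLENGTH, conc)
    print(f"c = {conc:.3f}  A = {a:.2f}  T = {t:.3f}")
```

Doubling the concentration doubles A but squares T, so equal concentration steps do not produce equal steps in transmittance.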
Equation (1.20) shows that the relationship between transmittance and concentration is not linear, making transmittance spectra inappropriate for use in quantitative analysis. It is absolutely necessary that you use spectra plotted in absorbance units for quantitative analysis because of the linear relationship between absorbance and concentration. Some further discussion of the absorptivity is in order. The absorptivity is the proportionality constant between concentration and absorbance.
It depends upon the wavenumber of light being absorbed, and the identity of the molecule doing the absorbing. For example, we can write

$$\varepsilon_{\mathrm{H_2O}}^{1700} \neq \varepsilon_{\mathrm{H_2O}}^{1600}$$

where the subscript denotes the molecule and the superscript denotes the wavenumber. This inequality shows that the absorptivity of water depends on the wavenumber, and illustrates that the absorptivity for any molecule varies with the wavenumber. Additionally, we can write

$$\varepsilon_{\mathrm{H_2O}}^{1700} \neq \varepsilon_{\mathrm{Acetone}}^{1700}$$
illustrating that the absorptivity varies with molecule, even for a fixed wavenumber. For a given molecule and wavenumber of light, the absorptivity is a fundamental physical constant of the pure molecule. For example, the absorptivity of pure acetone at 1700 cm⁻¹ is a fundamental property of the molecule like its molecular weight or boiling point. The absorptivity can be thought of as an "inherent absorbance." It measures, in an absolute sense, how strongly a specific molecule absorbs light at a specific wavenumber.

A quick look at Beer's law (Eq. (1.14)) shows that the absorbance is a unitless quantity, so the units on the right-hand side of Beer's law must cancel each other. The product lc has units of (length × concentration), so the units of the absorptivity (ε) must be (length × concentration)⁻¹ for the units in Beer's law to cancel properly. This sometimes leads to the absorptivity being expressed in units such as liter/(mole·centimeter). These units may be hard to understand, but further analysis shows that they make sense. Recall that the absorptivity represents the probability of a photon-molecule collision being totally inelastic. Do the units of the absorptivity express themselves as a probability? Physicists often express probabilities in units of area, and call the quantity a cross section. Imagine throwing a baseball at the broad side of a barn. The probability of your hitting the barn increases the larger its area, and decreases the smaller its area. It all depends upon the size of the target. Additionally, your probability of hitting the barn is higher the closer you are to it, and lower the farther you are from it. In this case, the absolute area of the barn does not change, but the apparent area, how big the barn appears to you, changes with your distance from it. This apparent area is called the apparent cross section of the barn. Simply stated, the apparent cross section measures how large the barn appears to be from where you are standing.

The same concept can be applied to photon-molecule collisions. Molecules that strongly absorb
light undergo totally inelastic collisions with a high probability. From the point of view of a photon, the molecule appears "large," and the interaction between the two has a large apparent cross section. This is measured experimentally as a large absorptivity. Molecules that absorb light weakly have a low probability of undergoing totally inelastic collisions with photons. From the point of view of the photon, the molecule appears "small," and the interaction is characterized by a small cross section. This is measured experimentally as a small absorptivity.

Are the units of the absorptivity consistent with this idea of apparent cross section? It is common to measure concentration in moles/volume. The units of volume are length³. The units of pathlength are length, so the units of the absorptivity are length³/(mole × length). Canceling gives units of length²/mole, where length² is the unit of area. Thus, the absorptivity has units of area/mole when measured in this fashion. Since probability can be expressed in area as well, what the absorptivity measures is the apparent cross section of a mole of analyte molecules with respect to totally inelastic photon collisions. Consequently, this quantity is called the molar absorptivity. The units of the absorptivity do make sense when thought of as a probability. The absorptivity for a single molecule can be calculated by dividing the molar absorptivity by Avogadro's number to obtain an apparent cross section in units of area/molecule.

V. Variables Affecting the Absorbance and Absorptivity

For a calibration to be legitimately applied to standard and unknown samples alike, the absorbance reading for a given concentration of the analyte must be reproducible. To achieve this, an understanding of the variables other than concentration that affect the measured absorbance is essential. The purpose of this section is to discuss the important variables that must be controlled when performing quantitative spectroscopic analyses. This will involve discussing the physical processes behind the absorbance of light by molecules. Though this discussion is mathematical by its very nature, the amount of complex math has been minimized. For the mathematical underpinnings of the concepts presented here, consult Appendix 1 at the back of this chapter, or any of the undergraduate texts on physical chemistry or quantum mechanics cited in the bibliography.
A. THE IMPACT OF TEMPERATURE ON ABSORBANCE
As we all know from personal experience, matter is "clumpy," it comes in discrete pieces, be they people, rocks, molecules, or quarks. It turns out
that energy is also "clumpy" and comes in discrete packets called quanta. Therefore, energy is quantized. A quantum of energy is extremely small, and the fact that energy comes in discrete but small packets is not important for the consideration of the physics of the macroscopic world. However, in the microscopic world of atoms and molecules energy quantization has a huge impact upon how matter and energy interact. The field of physics that deals with the behavior of atoms, molecules, nuclei, and quanta of energy is called quantum mechanics. Quantum mechanics has little to tell us about physical systems that are unbound, i.e. the parts of a system that are free to move at will. Imagine an electron moving through a vacuum or a baseball being thrown by a pitcher as examples of unbound systems. However, any time microscopic particles are bound, such as the electrons and protons in a molecule, their energy levels become quantized. This means that the particles in the system cannot have just any energy, but that only certain specific energies are allowed. The rotational, vibrational, and electronic energies of molecules are quantized. The reason that molecules have discrete spectral bands and absorb light at discrete energies is that their energy levels are quantized. There are a number of things that determine the quantized energy levels of a molecule, including its mass, the type of atoms present, the strength of the chemical bonds, and the arrangement of the atoms in space.

Two hypothetical quantized energy levels for a molecule are shown in Figure 1.9, with energy E_l for the lower level and E_u for the upper level. When a molecule absorbs a photon of light, it is said to make a spectroscopic transition from the lower to the upper energy level, as indicated by the arrow in Figure 1.9. Equation (1.6) gives the energy of a photon of light.
For a photon to be absorbed and give rise to the transition seen in Figure 1.9, the difference in energy between the two levels must equal the energy of the photon.
Figure 1.9 A spectroscopic transition from a lower energy level (E_l) to an upper energy level (E_u) occurs when a molecule absorbs a photon of light, as indicated by the arrow. ψ_l and ψ_u stand for the wavefunctions of the lower and upper energy levels, respectively.
This idea can be summarized in equation form as follows:

$$\Delta E = hcW \tag{1.21}$$
where

ΔE = E_u − E_l
h = Planck's constant (6.63 × 10⁻³⁴ J s)
c = the speed of light (3 × 10¹⁰ cm/s)
W = wavenumber in cm⁻¹

Equation (1.21) holds for any two energy levels in a molecule. A photon can only be absorbed if its energy happens to match a ΔE in a molecule. If the energy of a photon does not match any of the ΔEs in a molecule, it will pass through unabsorbed. Thus, Equation (1.21) gives a necessary but not sufficient condition for molecular absorbance to take place. It tells us that absorbance can take place, but says nothing about the probability of absorbance taking place.

For the transition seen in Figure 1.9 to occur, there must also be molecules present in a sample that have energy E_l. If all the molecules in a sample were somehow excited to energy level E_u (via some process other than light absorption, such as molecular collisions), there would be no molecules available with energy E_l to make the transition from E_l to E_u. As a result, no light would be absorbed. More specifically, it is the difference between the number of molecules in the lower and upper energy levels that determines the number of photons that can be absorbed. In equation form, we would say that

$$A \propto (N_l - N_u) \tag{1.22}$$

or

$$1/A \propto 1/(N_l - N_u) \tag{1.23}$$

where

A = amount of light absorbed
N_u = the number of molecules with energy E_u
N_l = the number of molecules with energy E_l

In essence, the greater the number of molecules in E_l, the more that are available to absorb light of energy ΔE and make the transition to E_u. The fewer the molecules with energy E_l, the fewer are available to absorb light of energy ΔE and make the transition to E_u.
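To give a feel for the size of the energies in Equation (1.21), the following minimal sketch evaluates ΔE = hcW for a few mid-infrared wavenumbers, using the rounded constants quoted with that equation. The function name and the choice of wavenumbers are assumptions made for illustration.

```python
H = 6.63e-34   # Planck's constant in J s, as quoted with Eq. (1.21)
C = 3.0e10     # speed of light in cm/s, as quoted with Eq. (1.21)

def transition_energy_joules(wavenumber_cm):
    """Eq. (1.21): energy gap in joules matched by a photon of wavenumber W (cm^-1)."""
    return H * C * wavenumber_cm

# Wavenumbers chosen only as examples from the mid-infrared region
for w in (700.0, 1700.0, 3000.0):
    print(f"W = {w:6.0f} cm^-1  ->  delta E = {transition_energy_joules(w):.3e} J")
```

Because the speed of light is expressed in cm/s, the wavenumber can be supplied directly in cm⁻¹ and the energy comes out in joules.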
There is a law of physical chemistry, called the Boltzmann distribution, which gives the ratio N_u/N_l, the ratio of the number of molecules in the upper state to the number of molecules in the lower state. The Boltzmann distribution assumes that energy levels are populated due to random thermal processes (collisions). The formula for the Boltzmann distribution is

$$N_u/N_l = e^{-\Delta E/kT} \tag{1.24}$$
where

k = Boltzmann constant, 1.38 × 10⁻²³ J/K
T = temperature in kelvins (K)

The Boltzmann distribution is derived assuming that there is no change in N_u/N_l due to the absorption of light. Note that as temperature is increased, the ratio N_u/N_l increases as more molecules populate the upper state. This leaves fewer molecules of energy E_l available to make the transition from E_l to E_u via light absorption. Also, note that as ΔE increases, the ratio N_u/N_l decreases, as it becomes more difficult for thermal processes to populate high energy states. Solving Equation (1.24) for N_u, we obtain

$$N_u = N_l\,e^{-\Delta E/kT} \tag{1.25}$$
Examination of Equations (1.22) and (1.25) will show that each contains the quantity N_u. By substituting Equation (1.25) into Equation (1.22) we obtain

$$A \propto (N_l - N_l\,e^{-\Delta E/kT})$$

and rearrangement gives

$$A \propto N_l\,(1 - e^{-\Delta E/kT}) \tag{1.26}$$
There are two things in Equation (1.26) worth noting. First, note that A is proportional to N_l, the number of molecules in the lower state. The size of a peak in an absorbance spectrum responds not to the concentration of molecules in a sample, but to the concentration of molecules in the lower state, N_l. Any process that causes N_l to change will affect the absorbance independent of the total concentration of the analyte molecules in the sample. As described above, changes in the temperature affect the relative number of molecules with energies E_l and E_u. This phenomenon is responsible for the
e^(−ΔE/kT) term in Equation (1.26), which shows that as temperature goes up, absorbance goes down. Thus, changes in the temperature can affect the measured absorbance even at constant pathlength and concentration. This means that at a fundamental physical level, absorbance does depend upon the temperature, and that a calibration obtained with samples at one temperature will not necessarily give accurate results for samples at a different temperature. Temperature is one of the most important variables to control when performing quantitative spectroscopic analysis.

The practical impact of Equation (1.26) is typically small. At room temperature (298 K), each molecule in a sample has about 200 cm⁻¹ of thermal energy. If E_l = 0 cm⁻¹ and E_u = 200 cm⁻¹, then N_u will be appreciable because there is enough thermal energy present to populate E_u. Thus, small changes in temperature will have a noticeable effect on N_l. On the other hand, if E_l = 0 cm⁻¹ and E_u = 3000 cm⁻¹, a small change in temperature will have little effect on N_l. It would take a large increase in temperature for there to be enough thermal energy present to populate E_u to any appreciable extent. N_l will be large and N_u will be effectively zero. So, although temperature is an important parameter to control in spectroscopic quantitative analysis, the types of calibrations most sensitive to this effect will be those involving spectroscopic transitions that are low in energy. However, there are other ways in which changes in the temperature can affect calibrations, as will be discussed below.
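The two cases just described (a 200 cm⁻¹ and a 3000 cm⁻¹ transition) can be put into numbers with the Boltzmann distribution of Equation (1.24). The sketch below is a rough illustration that uses the rounded constants quoted in this chapter; the function names are hypothetical, and the last function evaluates the factor 1 − e^(−ΔE/kT) that results when Equation (1.25) is substituted into Equation (1.22).

```python
import math

K_B = 1.38e-23  # Boltzmann constant, J/K
H = 6.63e-34    # Planck's constant, J s
C = 3.0e10      # speed of light, cm/s

def thermal_energy_cm(temp_k):
    """kT expressed in wavenumbers (cm^-1)."""
    return K_B * temp_k / (H * C)

def population_ratio(delta_e_cm, temp_k):
    """Eq. (1.24): N_u/N_l = exp(-dE/kT), with dE given in cm^-1."""
    return math.exp(-delta_e_cm / thermal_energy_cm(temp_k))

def absorbance_factor(delta_e_cm, temp_k):
    """1 - exp(-dE/kT): the temperature-dependent factor multiplying N_l."""
    return 1.0 - population_ratio(delta_e_cm, temp_k)

print(f"kT at 298 K = {thermal_energy_cm(298.0):.0f} cm^-1")
for delta_e in (200.0, 3000.0):
    for temp in (298.0, 308.0):   # a hypothetical 10 K temperature change
        print(f"dE = {delta_e:6.0f} cm^-1, T = {temp:.0f} K: "
              f"N_u/N_l = {population_ratio(delta_e, temp):.3e}, "
              f"factor = {absorbance_factor(delta_e, temp):.6f}")
```

For the low-energy transition the factor shifts visibly with a 10 K change, while for the 3000 cm⁻¹ transition it stays essentially equal to 1, in line with the argument above.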
B. THE IMPACT OF ELECTRONIC STRUCTURE ON THE ABSORPTIVITY
What we have not discussed yet is why some absorbance bands are more intense than others. This phenomenon can be seen in the spectrum of polystyrene in Figure 1.1. Note that the band near 700 cm⁻¹ is more intense than the peaks near 3000 or 1400 cm⁻¹. This means that the spectroscopic transition that gives rise to the peak at 700 cm⁻¹ absorbs photons with a higher probability than the transitions that give rise to the other peaks. Alternatively, the absorptivity of polystyrene at 700 cm⁻¹ is higher than its absorptivity at 3000 or 1400 cm⁻¹. What fundamental physical property causes transitions in the same molecule to absorb light with different intensities? More generally, why do some molecules absorb light more strongly than others?

Recall from above the discussion of quantum mechanics. One of the postulates of quantum mechanics is that for each energy level in a molecule there exists a wavefunction that contains all the information about that energy level. Wavefunctions are typically denoted with the Greek letter psi, ψ. For the spectroscopic transition shown in Figure 1.9, we denote
the wavefunction of the lower energy level by ψ_l, and the wavefunction of the upper energy level by ψ_u. If we could a priori know all the wavefunctions of a molecule, we would know everything there is to know about the molecule. In the real world, this is not usually the case. An important use of spectroscopy is to measure molecular energy levels, and use these to help formulate realistic wavefunctions for molecules. One of the important properties of wavefunctions is that their square gives a probability. For example, the probability of finding an electron at a specific place in a hydrogen atom is equal to the square of the electron's wavefunction at that point in space. Now, a spectroscopic transition involves two different wavefunctions, ψ_l and ψ_u. The quantity that gives the probability that a spectroscopic transition will take place is called the transition probability, and it is denoted by |R^lu|², where the superscripts l and u denote the two levels involved in the transition. The equation that gives the transition probability is

$$|R^{lu}|^2 = \left|\int \psi_l\,\mu\,\psi_u\,d\tau\right|^2 \tag{1.27}$$
where

|R^lu|² = the transition probability in going from state l to state u
ψ_l = the wavefunction for the lower state in the transition
μ = dipole moment operator
ψ_u = the wavefunction for the upper state in the transition

and the dτ indicates that the integral is taken over all space. The quantity being squared on the left-hand side, R^lu, is called the transition moment for the transition between the states l and u. Note that the transition probability depends upon the product of the wavefunctions of the two energy levels involved in a spectroscopic transition. R^lu also depends upon a quantity called the dipole moment operator, μ. A dipole is simply two charges separated by a distance, and a dipole moment is a measure of charge asymmetry. The dipole moment operator for a molecule is calculated from the following equation:

$$\mu = \sum_i q_i r_i \tag{1.28}$$

where

μ = dipole moment operator
q = charge
r = distance
Figure 1.10 The chemical bond in a hydrogen chloride molecule (H-Cl). The δ+ and δ− represent the partial positive and negative charges on the hydrogen and chlorine atoms. The arrow represents the magnitude and direction of the dipole moment for the bond.
and the subscript i denotes that we calculate a dipole moment considering the position of all the charged particles in a molecule. The dipole moment is a vector quantity, having both a magnitude and a direction. Vector quantities can be represented by arrows, where the length of the arrow is proportional to the magnitude, and the arrow point gives the direction. The dipole moment vector for the H-Cl molecule is seen in Figure 1.10. Each bond in a molecule will typically have a dipole moment called the bond dipole. The overall, or net dipole for a molecule, is the vector sum of the bond dipoles. This is expressed in Equation (1.28). For hydrogen chloride, since there is only one bond, the bond dipole and net dipole are the same. For a spectroscopic transition like the one shown in Figure 1.9, the dipole moment operator is related to the change in dipole moment between the lower and upper states. Thus, a molecule's electronic structure, the distribution of electrons and nuclei in space, determines the molecule's net dipole moment, and ultimately impacts the spectroscopic transition probability.

Recall from the derivation of Beer's law that the absorptivity is simply a measure of the probability that a molecule will undergo a totally inelastic collision with a photon. The transition probability measures the probability with which a specific spectroscopic transition will be excited by absorption of a photon. These two quantities are measuring the same thing, the probability of photon absorption. Thus, they are related to each other and we can write

$$\varepsilon \propto |R^{lu}|^2 \tag{1.29}$$
The equation that more precisely relates these two quantities can be found in standard spectroscopic textbooks, such as the ones listed in the bibliography. By making use of Equations (1.27) and (1.28) we can also write

(1.30)

and

$$\varepsilon \propto \mu \tag{1.31}$$
and by applying Beer's law (Eq. (1.14))

$$A \propto \mu \tag{1.32}$$
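Since the dipole moment of Equation (1.28) underlies the proportionalities above, a small numerical sketch of that sum may help. It treats a molecule classically as a set of point charges, which is an assumption made here for illustration. The partial charge and bond length are hypothetical values, loosely modeled on a polar diatomic such as the H-Cl bond of Figure 1.10, and the function names are not from the text.

```python
E_CHARGE = 1.602e-19   # elementary charge, coulombs
DEBYE = 3.336e-30      # one debye, in coulomb-meters

def dipole_moment(charges, positions):
    """Eq. (1.28): vector sum of q_i * r_i over all point charges."""
    mu = [0.0, 0.0, 0.0]
    for q, r in zip(charges, positions):
        for axis in range(3):
            mu[axis] += q * r[axis]
    return tuple(mu)

def magnitude(vector):
    """Length of a 3-component vector."""
    return sum(x * x for x in vector) ** 0.5

delta = 0.18 * E_CHARGE   # hypothetical partial charge, coulombs
bond = 1.27e-10           # hypothetical bond length, meters

# Polar diatomic: +delta on one atom, -delta on the other
diatomic = dipole_moment([+delta, -delta],
                         [(0.0, 0.0, 0.0), (bond, 0.0, 0.0)])

# Symmetric linear triatomic: the two bond dipoles point in opposite directions
triatomic = dipole_moment([-delta, 2.0 * delta, -delta],
                          [(-bond, 0.0, 0.0), (0.0, 0.0, 0.0), (bond, 0.0, 0.0)])

print(f"diatomic  |mu| = {magnitude(diatomic) / DEBYE:.2f} D")
print(f"triatomic |mu| = {magnitude(triatomic) / DEBYE:.2f} D")
```

The second case shows the vector nature of the sum: each bond carries a bond dipole, yet the net dipole of the symmetric arrangement cancels.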
In the jargon of quantum mechanics, the "observable" for a spectroscopic transition, the thing we actually measure, is the absorptivity. The integrated intensity, or peak area, of a spectroscopic transition is called the dipole strength of the transition. When a spectrum with absorptivity on the Y-axis is plotted versus wavenumber on the X-axis, the peak area (dipole strength) of the band is given by

$$D = \int \frac{\varepsilon}{W}\,dW = \int \psi_l\,\mu\,\psi_u\,d\tau \tag{1.33}$$
where

D = dipole strength (peak area)
W = wavenumber

Equations (1.32) and (1.33) are two of the most important relationships in all of spectroscopy. Equation (1.32) connects the absorbance, a spectroscopically observable quantity, to the electronic structure of a molecule. Equation (1.33) relates peak area to the underlying quantum mechanics behind the absorption process. Thus, we now have a picture of how molecules and light interact to give rise to absorption spectra. A more thorough quantum mechanical treatment of the absorption of light by molecules is given in the appendix to this chapter. There it is shown that treating a simple model of chemical bonds naturally gives rise to the quantized energy levels and selection rules that are such a large part of absorption spectroscopy. The appendix is recommended reading for anyone who truly wants to understand what gives rise to absorbance spectra.

Given the impact of electronic structure on the absorptivity, what factors influence the electronic structure of a molecule? Neighboring molecules, particularly in the solid and liquid phases, weakly interact with each other, which can alter the electronic structure of an analyte molecule. A good example of a molecular interaction is the hydrogen bonding that takes place in liquid water, as illustrated in Figure 1.11. The partial negative charge on the oxygen atom of one molecule interacts with the partial positive charge on a hydrogen atom of a second molecule, forming a hydrogen bond. This type of intermolecular interaction affects the electronic structure of the molecules involved. The electronic structure of a lone water molecule is different from that of water molecules with one or
Figure 1.11 The interactions of the nearest neighbor molecules in liquid water. The oxygen atoms have a partial negative charge (δ−) and the hydrogen atoms have a partial positive charge (δ+). This type of interaction is called hydrogen bonding, as indicated by the dotted lines.
two hydrogen bonds. These water molecules are said to live in different chemical environments. The strength and number of interactions between neighboring molecules partly determine a molecule's chemical environment. Molecules in different chemical environments will have different electronic structures. We showed above that the absorptivity of a molecule depends upon the electronic structure, so the absorptivity depends upon the chemical environment. The water molecules in different chemical environments in Figure 1.11 will have slightly different absorptivities, which is another way of saying that they will have slightly different spectra. The absorptivity that is measured for a sample is averaged over all the chemical environments of the molecules in the light beam. We call the overall chemical environment of a sample its matrix. Thus, a sample's matrix affects its absorptivity. What factors influence the matrix of a sample?

1. Composition: The identities of the molecules in a sample determine the molecular interactions present, and hence impact the electronic structure of an analyte molecule. An acetone molecule in pure acetone finds itself in a different chemical environment than an acetone molecule dissolved in water. Acetone would have different absorptivities in these two cases.

2. Concentration: Even if the type of molecules in a sample is controlled, their concentration impacts the chemical environment. In a concentrated solution of acetone in water, an acetone molecule will have on average more acetone molecules surrounding it than in a dilute solution of acetone in water. The acetone molecules in these two solutions will then find themselves in different chemical environments, and consequently have different absorptivities. This change in absorptivity with concentration is why Beer's law plots sometimes show a nonlinear relationship between
absorbance and concentration; the absorptivity has changed over the concentration range being measured.

3. Temperature: Increases in the sample temperature cause the average thermal energy of the molecules in a sample to increase. The molecules in a hot sample move faster and with more energy than in a cold sample. Increased thermal energy also increases the number and energy of molecular collisions, and the consequent pushing and pulling alters the strength of the intermolecular interactions. Ultimately, the electronic structure and the absorptivity are affected. This is in addition to the effect temperature has on calibrations discussed above.

4. Pressure: The pressure of a sample matters because it determines the spacing between the molecules. Molecules under high pressure are forced closer together than molecules under low pressure, impacting the strength of the intermolecular interactions. This ultimately means that electronic structures and absorptivities can change with pressure. For solids and liquids, the impact of pressure on the electronic structure is typically small, and day-to-day atmospheric pressure variations are not normally of concern. For gases, however, pressure is an important variable and should be controlled when possible.

In summary, the sample matrix affects the absorptivity, and ultimately the spectrum of a sample. For a calibration to be truly applicable to both the standard and unknown samples, the sample matrix should be reproduced as well as possible. There is no way of knowing, a priori, how a specific change in the sample matrix will impact the absorptivity of a sample. It therefore makes sense to measure the spectra and perform calibrations under different conditions to get a feel for the range of conditions under which your calibration is accurate. It is never safe to assume that your calibration is good over all the conditions under which you will obtain unknown samples and measure their spectra. It may be necessary to obtain several calibrations, and apply the appropriate one for the circumstances of the unknown sample being examined.
C. SUMMARY
For many of us who perform quantitative spectroscopic analyses, there is a tendency to dismiss the absorptivity as just a proportionality constant. It is easy to believe that its value is determined by the absorbances and concentrations measured, and there is not much thought given to its physical meaning. The whole purpose of this section has been to firmly establish the fact that the absorptivity does have physical meaning, and that
its value can be affected by experimental parameters as summarized in Equations (1.29)-(1.31). When performing a calibration with standard samples, we normally know the pathlength and concentration of the sample, and we measure the absorbance. Thus, the only piece of Beer's law that we do not know for the standard samples is their absorptivity. The entire reason why we have to perform a calibration in the first place is to obtain the absorptivity. Now, the absorptivity is a fundamental physical constant of a molecule. If that is the case, why can one not look up in a table the absorptivities for specific molecules at specific wavenumbers, just like we can look up the melting points of substances? The reason is Equations (1.29)-(1.31). The electronic structure of a molecule can be influenced by a number of variables in the sample and the environment, as we have seen. In a nutshell, the absorptivity is matrix dependent. If the absorptivity were not matrix dependent, we would only have to measure the absorptivity of a pure chemical once at a specific wavenumber. These values could be tabulated, and it would be a matter of simply looking up the absorptivity, sparing us the work of calibration. Sadly, this is not the case. Since the absorptivity is matrix dependent, we must calibrate for each matrix in which the analyte may be found. This is why spectroscopic calibrations have to be performed in the first place, and why it is so important to get them right.

It has been stated elsewhere in this chapter that the fundamental assumption of quantitative spectroscopic analysis is that a calibration describes the unknown samples as well as the standard samples. Since it is the absorptivity we seek when calibrating, we are assuming that the analyte absorptivity is the same in the standard and unknown samples. To ensure that this happens, the sample matrix must be reproducible.
If you control all the variables that impact the absorptivity of your analyte, your calibration will be applicable to unknown samples. If you do not, you will be using the wrong calibration on your unknown samples, and predict erroneous concentrations. Often the failure to understand and control these variables leads to the failure of a calibration.
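The idea that a calibration exists to obtain the absorptivity can be sketched in a few lines. The example below is a simplified illustration, not the calibration procedure of this book: it assumes Beer's law (Eq. (1.14)) holds exactly with a line through the origin, fits the absorptivity by least squares to a set of standards, and then applies it to an unknown. The standards, the unknown's absorbance, and the function names are all hypothetical.

```python
def fit_absorptivity(concs, absorbances, pathlength):
    """Least-squares slope of A versus c through the origin, divided by l."""
    numerator = sum(c * a for c, a in zip(concs, absorbances))
    denominator = sum(c * c for c in concs)
    return numerator / denominator / pathlength

def predict_concentration(absorbance, epsilon, pathlength):
    """Beer's law, Eq. (1.14), solved for c: c = A / (epsilon * l)."""
    return absorbance / (epsilon * pathlength)

# Hypothetical standards: known concentrations and measured absorbances
standard_concs = [0.10, 0.20, 0.30, 0.40]
standard_abs = [0.052, 0.098, 0.151, 0.203]
PATHLENGTH = 1.0   # same (hypothetical) pathlength for standards and unknown

epsilon = fit_absorptivity(standard_concs, standard_abs, PATHLENGTH)
unknown_conc = predict_concentration(0.126, epsilon, PATHLENGTH)

print(f"fitted absorptivity     = {epsilon:.4f}")
print(f"predicted concentration = {unknown_conc:.4f}")
```

The prediction is only as good as the assumption that the fitted absorptivity also applies to the unknown, which is exactly the matrix-reproducibility requirement described above.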
VI. Gas Phase Quantitative Spectroscopic Analysis
Quantitative spectroscopy can be performed on solids, liquids, and gases. In this book, the information presented is general enough to be applied to almost any type of sample. This section, however, will be devoted to the theory behind the quantitative spectroscopy of gases. Gases present a unique case. For gases, a simple equation called the ideal gas law relates the physical properties of a gas to its concentration. This type of equation is called an equation of state because it can be used to calculate the state of
a sample given enough information. Solids and liquids also have equations of state, but they are nowhere near as simple or easily understood as the ideal gas law. The purpose of this section is to combine the ideal gas law with Beer's law to obtain an equation that relates absorbance to the physical properties of a gas. The derivation of the ideal gas law can be found in any of the physical chemistry texts cited in the bibliography. The ideal gas law states
$$PV = nRT \tag{1.34}$$
where
P = pressure
V = volume
n = number of moles of gas
R = universal gas constant (0.082 liter-atmosphere/mole-degree)
T = temperature (normally expressed in kelvins, K)
What Equation (1.34) tells us is that for a constant amount of gas at a given temperature, the product of the pressure and the volume is constant. The gas constant, R, has been found by experiment to be a constant for a wide variety of gases under a variety of conditions. It is, in essence, a proportionality constant. Rearranging the ideal gas law in various ways illustrates the relationship between the different parameters. For example, we can say that
$$V = nRT/P \tag{1.35}$$
which says that as pressure goes up, volume goes down. The more pressure you apply to a gas, the smaller it is going to get, and Equation (1.35) is just a mathematical way of saying the same thing. Also, note in Equation (1.35) that volume and temperature are proportional. This means when a gas is heated it expands. Equation (1.35) is just a mathematical way of expressing something we already know from everyday experience. Although we have not derived the ideal gas law, these ideas should help its form seem sensible. We can rearrange the ideal gas law yet again to obtain
$$n/V = P/RT \tag{1.36}$$
which says that the moles per volume of a gas is proportional to pressure and inversely proportional to temperature. A convenient set of units to measure moles per volume is moles per liter, or moles/liter. These are also the units of concentration commonly used by chemists. Thus, we can rewrite
Equation (1.36) as
$$c = P/RT \tag{1.36a}$$
where c = n/V = concentration.
The concentration of a gas can be calculated by simply knowing its pressure and temperature. Earlier in this chapter we derived Beer's law, namely that
$$A = \varepsilon l c$$
We can rearrange Beer's law to solve for the concentration such that
$$c = A/\varepsilon l \tag{1.37}$$
Both the ideal gas law and Beer's law contain concentration. We can then set Equation (1.36a) equal to Equation (1.37) and obtain
$$A/\varepsilon l = P/RT \tag{1.38}$$
which on rearranging gives
$$A = P\varepsilon l/RT \tag{1.39}$$
Equation (1.39) relates the absorbance, a spectroscopically observable property, to the physical parameters characterizing a gas sample. Simply put, for gases the absorbance depends upon not only the pathlength and the absorptivity, but the pressure and the temperature of the gas as well. This makes sense if we think about it. Remember that concentration is a measure of the number of molecules per unit volume. Equation (1.39) says that absorbance is proportional to pressure. If pressure is increased, molecules are forced closer together, creating more molecules per unit volume or increasing the concentration. An increase in the concentration causes an increase in the absorbance. This means that the total pressure of the standard and unknown samples must be the same for gas phase quantitative analysis. Equation (1.39) also says that absorbance is inversely proportional to temperature. When a gas is heated, it expands. The molecules get further apart, the number of molecules per unit volume goes down, and so we would expect absorbance to go down.
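The proportionalities described above can be checked numerically. The following is a minimal Python sketch, not part of the original text; the absorptivity (50 liter/mole-cm) and the 10 cm pathlength are hypothetical values chosen only for illustration, and the function names are our own.

```python
R = 0.082  # liter-atmosphere/mole-degree, the value quoted in the text


def gas_concentration(pressure_atm, temperature_k):
    """Equation (1.36a): c = P/RT, in moles/liter."""
    return pressure_atm / (R * temperature_k)


def gas_absorbance(pressure_atm, absorptivity, pathlength_cm, temperature_k):
    """Equation (1.39): A = P * epsilon * l / (R * T)."""
    return pressure_atm * absorptivity * pathlength_cm / (R * temperature_k)


# Hypothetical analyte and gas cell (assumed values, not from the text).
EPSILON = 50.0     # liter/mole-cm
PATHLENGTH = 10.0  # cm

a_ref = gas_absorbance(1.0, EPSILON, PATHLENGTH, 298.0)
a_double_p = gas_absorbance(2.0, EPSILON, PATHLENGTH, 298.0)  # pressure doubled
a_hot = gas_absorbance(1.0, EPSILON, PATHLENGTH, 596.0)       # temperature doubled

print("c at 1 atm, 298 K (moles/liter):", gas_concentration(1.0, 298.0))
print("A ratio when P doubles:", a_double_p / a_ref)
print("A ratio when T doubles:", a_hot / a_ref)
```

Doubling the pressure doubles the computed absorbance and doubling the absolute temperature halves it, which is the behavior the paragraph above describes. The sketch assumes the absorptivity itself does not change with pressure or temperature.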
Earlier in this chapter, we discussed at length the temperature dependence of the absorptivity. We noted that Beer's law is derived assuming that the absorptivity does not change with the temperature. However, we saw that the absorptivity does change with the temperature because of its dependence upon a molecule's electronic structure. The conclusion is that temperature needs to be controlled when performing quantitative spectroscopy. This dependence of the absorbance upon temperature is implicit: it is not obvious from looking at Beer's law. In Equation (1.39), the temperature dependence of the absorbance is explicit. Any change in the temperature will produce an immediate change in the absorbance even if the absorptivity does not change with the temperature. This makes using one calibration at multiple temperatures for gases even more inappropriate than it is for solids and liquids. We could call Equation (1.39) "Beer's law for the Gas Phase." It relates the spectroscopic and physical variables of a gas phase analysis to each other. It would be quite difficult to derive a similar expression for liquids or solids. This equation should be foremost in the mind of anyone performing quantitative spectroscopy in the gas phase.
BIBLIOGRAPHY
B. C. Smith, Infrared Spectral Interpretation, CRC Press, Boca Raton, Florida, 1999.
B. C. Smith, Fundamentals of FTIR, CRC Press, Boca Raton, Florida, 1996.
N. Colthup, L. Daly, and S. Wiberley, Introduction to Infrared and Raman Spectroscopy, Academic Press, New York, 1990.
D. A. McQuarrie, Quantum Chemistry, University Science Books, Mill Valley, California, 1983.
P. Griffiths and J. de Haseth, Fourier Transform Infrared Spectrometry, Wiley, New York, 1986.
D. Skoog and D. West, Principles of Instrumental Analysis, Holt, Rinehart, & Winston, New York, 1971.
J. Robinson, Undergraduate Instrumental Analysis, Marcel Dekker, New York, 1995.
D. Peters, J. Hayes, and G. Hieftje, Chemical Separations and Measurements, WB Saunders, Philadelphia, 1974.
M. Diem, Introduction to Modern Vibrational Spectroscopy, Wiley, New York, 1993.
J. M. Hollas, Modern Spectroscopy, Wiley, New York, 1996.
G. Castellan, Physical Chemistry, Addison-Wesley, Menlo Park, 1971.
I. Levine, Physical Chemistry, McGraw-Hill, New York, 1983.
J. Winn, Physical Chemistry, Harper Collins, New York, 1995.
T. Hirschfeld, Quantitative FTIR: A Detailed Look at the Problems Involved, Chapter 6 of FTIR: Applications to Chemical Systems, Vol. 2, edited by J. Ferraro and L. Basile, Academic Press, New York, 1979.
D. Burns and E. Ciurczak, Handbook of Near Infrared Spectroscopy, Harcourt, New York, 1997.
Appendix: The Quantum Mechanics of Light Absorption
The purpose of this appendix is to fully establish the theoretical basis behind the absorption of light by molecules. You do not need to understand the theory of light absorption to run a spectrometer and measure a spectrum. However, to truly understand what a spectrum means, where it comes from, and what variables impact spectroscopic calibrations, a familiarity with this theory is necessary. I have attempted to write this section assuming that the reader is not familiar with quantum mechanics. This is, however, a daunting task given the complexity of the math in this field. Thus, a nodding familiarity with the basics of quantum chemistry will be very helpful in deriving maximum benefit from this section. This appendix will be very focused; only those quantum mechanical concepts that impact directly on the light absorption process will be discussed. For those wishing a more detailed discussion of quantum mechanics, consult any of the physical chemistry texts listed at the end of this section. The discussion in this section owes much to References [1, 2, 4]. A familiarity with the contents of Chapter 1 of this book will be assumed. We will begin the appendix with an introduction to wavefunctions and Schrödinger's equation. We will then proceed to apply the time independent Schrödinger's equation to a simple system, the particle in a box, and derive the energy levels for it. This system is an oversimplified, but useful model for systems such as electrons bound in chemical bonds. We will see that one of the most important aspects of quantum mechanical systems, the quantization of energy levels, is natural for bound microscopic systems. We will then look at doing spectroscopy on the particle in a box, ascertaining how the particle might make the transition from one energy level to another.
We will apply the equation for the transition probability given in Chapter 1, and derive selection rules for the particle in the box very similar to those for vibrational and other kinds of spectroscopy. We will see that the selection rules are a natural outgrowth of the shape of the wavefunction for the particle in the box.
I. Wavefunctions and Schrödinger's Equation
We all know from personal experience that matter is "clumpy." It comes in discrete pieces, be they molecules, atoms, protons, or quarks. It turns out that energy is also "clumpy" and comes in discrete packets called quanta, so energy is said to be quantized. Einstein's famous equation relating matter and energy, $E = mc^2$, tells us that matter and energy are equivalent. Thus, if matter is clumpy, then energy should be clumpy as well. A quantum of energy is extremely small, and the fact that energy comes in discrete
packets is not important in the macroscopic world. However, in the microscopic world of atoms and molecules energy quantization has a huge impact upon how matter and energy interact. The field of physics that deals with the behavior of atoms, molecules, nuclei, and quanta of energy is called quantum mechanics. Quantum mechanics has little to tell us about physical systems that are unbound, i.e. systems whose parts are free to move at will. Imagine an electron moving through a vacuum, or a baseball being thrown by a pitcher, as examples of unbound systems. However, any time microscopic particles are bound, such as the electrons and protons in a molecule, their energy levels become quantized. This means that the particles in a molecule cannot have just any energy; only certain specific energies are allowed. The rotational, vibrational, and electronic energies of molecules are quantized. The reason that molecules have discrete spectral bands is that they absorb light at discrete energies that match those of an energy level difference in the molecule. There are a number of things that determine the quantized energy levels in a molecule. These include the mass and type of atoms in a molecule, the strength of the chemical bonds, and the arrangement of the atoms in space. Two hypothetical energy levels for a molecule are shown in Figure A.1, with energy $E_l$ for the lower level and $E_u$ for the upper level. When a molecule absorbs a photon of light, it is said to make a spectroscopic transition from the lower to the upper energy level, as indicated by the arrow in Figure A.1. One of the postulates of quantum mechanics is that there exists for each energy level in a molecule a wavefunction that contains all the information about that energy level. Wavefunctions are typically denoted with the Greek letter psi, $\psi$. For the spectroscopic transition shown in Figure A.1, we denote the wavefunction of the lower energy level by $\psi_l$, and the
Figure A.1 A spectroscopic transition from a lower energy level ($E_l$) to an upper energy level ($E_u$) that occurs when a molecule absorbs a photon of light. $\psi_l$ and $\psi_u$ stand for the wavefunctions of the lower and upper energy levels, respectively.
wavefunction of the upper energy level by $\psi_u$. If we could a priori know all the wavefunctions of a molecule, we would know everything there is to know about the molecule. In the real world, this is often not the case. Frequently, the form of wavefunctions is assumed, or calculated from spectroscopic data. The simplest way of writing how to determine the energies of a bound system is the general formulation of Schrödinger's equation, which says
$$H\psi = E\psi \tag{A.1}$$
where
H = the Hamiltonian operator
$\psi$ = wavefunction
E = energy
This equation is not as simple as it looks. The symbol H is not a simple variable, but an operator. An operator is a symbol that tells you to do something to whatever follows the symbol. The Hamiltonian operates upon wavefunctions to give energies. For a given quantum mechanical system, it is not necessarily obvious what form the Hamiltonian operator should take. By definition, the Hamiltonian represents the total energy, kinetic plus potential, of a system. More succinctly,
$$H = K + P \tag{A.2}$$
where
K = kinetic energy
P = potential energy
When possible, expressions for the kinetic and potential energy of a system are incorporated into the Hamiltonian. For a one-dimensional system whose states are stationary, i.e. not changing over time, the Schrödinger equation is [1]
$$\left\{-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + P(x)\right\}\psi = E\psi \tag{A.3}$$
where
$\hbar = h/2\pi$
h = Planck's constant
m = mass
x = distance
P = potential energy
The equivalence of Equations (A.l) and (A.3) can be seen by recognizing that the term in curly brackets on the left-hand side of Equation (A.3) is the Hamiltonian operator, H.
II. The Particle in a Box
Perhaps one of the simplest quantum mechanical systems is the one-dimensional particle in a box. This is illustrated in Figure A.2. The box has length L. The walls of the box are infinitely high so the particle cannot escape. This means that the potential energy of the particle is infinite for x not between 0 and L. We will also assume that the potential energy of the particle inside the box is zero. Thus,
$$P = 0 \quad \text{for } 0 < x < L \tag{A.4}$$
and
$$P = \infty \quad \text{for } x < 0 \text{ and } x > L$$
Since the potential energy inside the box is zero, Schrödinger's equation reduces to
$$\left\{-\frac{\hbar^2}{2m}\frac{d^2}{dx^2}\right\}\psi = E\psi \tag{A.5}$$
Figure A.2 A particle in a box of length L with infinitely high potential energy walls on each side at positions X = 0 and X = L.
It may not be obvious at this point, but this simple model has some utility. The particle could be an electron, and the infinitely high walls represent the chemical bond force holding the electron in its orbital. A discussion of the nature of the wavefunction, $\psi$, is in order. In the macroscopic world, the wavefunction describes the amplitude of a wave as a function of position and time. For example, imagine tying a piece of string to a doorknob and shaking it. If we knew the wavefunction for the motion, we could calculate the amplitude of the string at any point at any moment in time. The French physicist de Broglie discovered that when any type of matter moves, it describes a wave. The wavelength of a "matter wave" is given by
$$\lambda = h/p \tag{A.6}$$
where
$\lambda$ = matter wavelength
h = Planck's constant
p = momentum
For macroscopic systems matter waves are small enough to be irrelevant. For example, a 5 oz baseball hurled by Pedro Martinez at 90 mph has a de Broglie wavelength of 1.2 x 10^-34 m. This is a trivial size compared to that of a baseball, and a phenomenon that American League batters need not worry about. For a microscopic particle moving within a bound system, the square of the wavefunction can be thought of as the "amplitude" or "intensity" of its matter wave [2]. It may seem nonsensical for matter to have intensity, but one interpretation of the square of the wavefunction is that it represents the probability of finding the particle at a given location. The "intensity" of a matter wave is simply a probability. For the particle in the box, the probability of finding the particle outside the box is zero, while the probability of finding the particle inside the box is 1 (the particle must be in the box). Thus,
$$\int \psi^2\,dx = 1 \quad \text{for } 0 < x < L \tag{A.7}$$
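The baseball figure quoted above can be reproduced from Equation (A.6) with p = mass x velocity. The following Python sketch is an illustration only; the unit conversion factors are standard values, and the electron speed of 1 x 10^6 m/s is a hypothetical number chosen just to show how different the microscopic case is.

```python
H = 6.626e-34         # Planck's constant, J s
OZ_TO_KG = 0.0283495  # kilograms per ounce
MPH_TO_MS = 0.44704   # meters/second per mile/hour
M_ELECTRON = 9.11e-31 # electron mass, kg


def de_broglie_wavelength(mass_kg, speed_ms):
    """Equation (A.6): lambda = h/p, with momentum p = m*v. Result in meters."""
    return H / (mass_kg * speed_ms)


# The 5 oz, 90 mph baseball from the text.
baseball = de_broglie_wavelength(5 * OZ_TO_KG, 90 * MPH_TO_MS)

# A hypothetical electron moving at 1e6 m/s (assumed speed, for comparison).
electron = de_broglie_wavelength(M_ELECTRON, 1.0e6)

print("baseball wavelength (m):", baseball)
print("electron wavelength (m):", electron)
```

The baseball's wavelength comes out near 1.2 x 10^-34 m, while the electron's is on the order of an angstrom, comparable to bond lengths. That contrast is why matter waves matter for electrons in molecules and not for baseballs.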
Another condition we impose upon the wavefunctions is that they be continuous. A continuous function has values that vary smoothly with X; there are no "jumps" in a continuous function. An inexact, but useful
picture of a continuous function is that it can be drawn without lifting your pencil off the paper you are drawing on [3]. In the case of our particle in the box wavefunctions, since the value of $\psi^2$ outside the box is zero, the value of $\psi^2$ at the walls of the box must equal zero for the wavefunction to be continuous. Else, there would be a "jump" or discontinuity in the value of $\psi^2$ in passing from inside the box to outside the box. In equation form this is expressed as follows:
$$\psi(0) = 0 \tag{A.8}$$
and
$$\psi(L) = 0$$
The conditions in Equations (A.7) and (A.8) are called the boundary conditions for the particle in the box problem. In many problems of this type, the boundary conditions play a large part in determining the form of the solution. Our job now is to solve Equation (A.5), the Schrödinger's equation for the particle in a box. Note that for any function $\psi$ to be a solution to this equation its second derivative (on the left-hand side of the equation) must equal the function times a constant (the right-hand side of the equation). Combinations of sine and cosine waves fit this description. Thus, wavefunctions $\psi$ that solve Equation (A.5) have the form
$$\psi = A\sin(Cx) + B\cos(Cx) \tag{A.9}$$
where
A = constant
B = constant
$C = (2mE)^{1/2}/\hbar$
We now need to impose the boundary conditions on these wavefunctions. One boundary condition is that $\psi(0) = 0$. Recall that sin(0) = 0 and cos(0) = 1, so that $\psi(0) = B$. For the wavefunction to follow the boundary condition, B = 0, which simplifies our solution to
$$\psi = A\sin(Cx) \tag{A.10}$$
Another boundary condition is that $\psi(L) = 0$, thus
$$\psi(L) = A\sin(CL) = 0 \tag{A.11}$$
The sine function has the value of zero at 180°, 360°, 540°, etc., in other words at multiples of 180°. It is convenient to represent angles in units of radians. A 180° angle has $\pi$ radians, a 360° angle contains $2\pi$ radians, etc. Thus, Equation (A.11) is true if
$$CL = n\pi \tag{A.12}$$
where n = 0, 1, 2, 3, ..., which simply depends on the fact that the sine of multiples of $\pi$ radians is zero. Combining Equation (A.12) with $C = (2mE)^{1/2}/\hbar$ gives
$$n\pi = (2mE)^{1/2}L/\hbar \tag{A.13}$$
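We next remember that $\hbar = h/2\pi$, substitute this into Equation (A.13), and rearrange to solve for E as follows:
$$E = h^2n^2/8mL^2 \tag{A.14}$$
where n = 1, 2, 3, ... is the quantum number. The value n = 0 satisfies the boundary conditions mathematically, but it makes the wavefunction zero everywhere in the box, so it does not describe a particle and is excluded.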
This is the equation we have been seeking; it gives the quantized energy levels for the particle in a box. In this equation, n is referred to as a quantum number. Note that the quantization of the energy occurs as a natural outcome of restraining the particle to reside in the box, and assuming that the wavefunction is continuous. No other assumptions or constraints are needed. Now we can also derive the form of the wavefunctions for the particle in the box. We have already stated this form in Equation (A.10). However, if we rearrange Equation (A.12) we obtain
$$C = n\pi/L \tag{A.15}$$
and substituting into (A.10) yields
$$\psi = A\sin(n\pi x/L) \tag{A.16}$$
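To make the results concrete, here is a short Python sketch that evaluates the energy levels of Equation (A.14) and the wavefunctions of Equation (A.16). It is an illustration under stated assumptions: the 1 nm box length is hypothetical, the normalization constant A = (2/L)^(1/2) is the value the appendix quotes from Reference [1], and the function names are our own.

```python
import math

H = 6.626e-34   # Planck's constant, J s
M_E = 9.11e-31  # electron mass, kg


def energy(n, mass_kg, length_m):
    """Equation (A.14): E = n^2 h^2 / (8 m L^2), in joules."""
    return n**2 * H**2 / (8 * mass_kg * length_m**2)


def wavefunction(n, x, length_m):
    """Equation (A.16) with A = (2/L)^(1/2)."""
    return math.sqrt(2 / length_m) * math.sin(n * math.pi * x / length_m)


def norm(n, length_m, steps=1000):
    """Midpoint-rule estimate of the integral of psi^2 from 0 to L (Equation A.7)."""
    dx = length_m / steps
    total = sum(wavefunction(n, (k + 0.5) * dx, length_m) ** 2 for k in range(steps))
    return total * dx


BOX = 1.0e-9  # hypothetical 1 nm box, assumed for illustration

for n in (1, 2, 3, 4):
    print(n, energy(n, M_E, BOX), norm(n, BOX))
```

The printed energies scale as n squared, each wavefunction vanishes at both walls, and the numerical integral of the squared wavefunction comes out at 1, which is the boundary condition of Equation (A.7).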
Figure A.3 The first four wavefunctions and energy values for the particle in a box.
Figure A.4 The chemical structure and bonding pattern of 1,3-butadiene, H2C=CH–CH=CH2.
It can be shown via normalization that $A = (2/L)^{1/2}$ [1]. The first four wavefunctions (n = 1, 2, 3, and 4) and the energy levels for the particle in a box are shown in Figure A.3. Note that the wavefunctions look similar to some of the simple sine wave patterns that can be induced by wiggling a rope or string anchored at one end. The particle in a box models, at least crudely, the electrons in a chemical bond. Consider the structure of the butadiene molecule in Figure A.4. This molecule consists of two C=C bonds connected by a C–C bond. The two double bonds interact with each other via the dumbbell shaped (p) orbitals shown on the right-hand side of Figure A.4. As a result, conjugation takes place and the four electrons in these bonds are free to travel the length of the molecule. Consider this bonding pattern as a particle in a box. The length of the box would be the length of the molecule. Assuming that the molecule is linear (which it is not), the length of the carbon chain would be equal to twice the bond length of a C=C bond plus the bond length of a C–C bond. Typical values for these quantities are 1.35 Å and 1.54 Å respectively, which sum to 4.24 Å. Extending the box by half a C–C bond length (0.77 Å) beyond each end carbon, as is done in Reference [2], gives a box length of 5.78 Å.
Figure A.5 The first three energy levels of the conjugated electrons in the butadiene molecule modeled as a particle in a box. The 4 electrons populate the n = 1 and n = 2 levels. A spectroscopic transition, indicated by the arrow, can take place promoting an electron from the n = 2 to the n = 3 level.
Now, there are in actuality four electrons in this box. There is a rule of quantum mechanics called the Pauli Exclusion Principle that says no more than two electrons can occupy the same orbital. This means that two of the electrons are in the n = 1 energy level, and two of them are in the n = 2 energy level as shown in Figure A.5. The energy levels of the particle in a box are given by Equation (A.14). An electron in the n = 2 level can in theory absorb a photon of light and undergo a spectroscopic transition to the n = 3 level. Thus, the resulting absorbance feature would appear at an energy that is the difference between the n = 2 and n = 3 levels. Using Equation (A.14), this gives
$$\Delta E = (3^2 - 2^2)h^2/8mL^2 = 5h^2/8mL^2 \tag{A.17}$$
where $\Delta E$ = energy difference.
Using L = 5.78 Å and an electron mass of 9.11 x 10^-31 kg gives $\Delta E$ = 45,466 cm⁻¹. In reality, the UV-Vis spectrum of 1,3-butadiene has an absorbance feature at 46,083 cm⁻¹ (217 nm), an error of less than 1.5%. This is not a bad agreement given the simplicity of this model.
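The butadiene estimate can be reproduced with a few lines of Python. This is a sketch rather than the author's own calculation: the values of Planck's constant and the speed of light are standard figures we supply, and the conversion from joules to wavenumbers uses the relation E = hc x (wavenumber).

```python
H = 6.62607e-34        # Planck's constant, J s
C_CM_PER_S = 2.99792e10  # speed of light, cm/s
M_E = 9.11e-31         # electron mass, kg (value used in the text)
BOX_LENGTH = 5.78e-10  # 5.78 angstroms, in meters


def transition_wavenumber(n_lower, n_upper, mass_kg, length_m):
    """Energy gap between two particle-in-a-box levels, expressed in cm^-1.

    Uses Equation (A.14) for each level, so for n = 2 -> 3 the gap is
    the 5 h^2 / (8 m L^2) of Equation (A.17).
    """
    delta_e = (n_upper**2 - n_lower**2) * H**2 / (8 * mass_kg * length_m**2)
    return delta_e / (H * C_CM_PER_S)


predicted = transition_wavenumber(2, 3, M_E, BOX_LENGTH)
observed = 46083.0  # cm^-1, from the text
percent_error = 100 * abs(observed - predicted) / observed

print("predicted (cm^-1):", predicted)
print("percent error:", percent_error)
```

The result should land close to the 45,466 cm⁻¹ quoted in the text. The last few digits depend on how many figures are carried in the physical constants, so a small difference from the printed value is expected.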
III. Transition Probabilities for the Particle in a Box
Recall from Chapter 1 that the absorptivity of a spectral feature depends, ultimately, upon the transition probability which is given by
$$|R^{lu}|^2 = \left|\int \psi_l\,\mu\,\psi_u\,d\tau\right|^2 \tag{A.18}$$
where
$|R^{lu}|^2$ = the transition probability in going from state l to state u
$\psi_l$ = the wavefunction for the lower state in the transition
$\mu$ = dipole moment operator
$\psi_u$ = the wavefunction for the upper state in the transition
If we take the square root of both sides of Equation (A.18), we get an expression for the transition moment, $R^{lu}$, as such
$$R^{lu} = \int \psi_l\,\mu\,\psi_u\,d\tau \tag{A.19}$$
Recall also from Chapter 1 that the dipole moment operator is given by
$$\mu = \sum_i q_i r_i \tag{A.20}$$
where
$\mu$ = dipole moment operator
q = charge
r = distance
i = particle number in a molecule
For the particle in a box problem, there is only one particle so i = 1 in Equation (A.20), and $\mu$ will depend upon the charge and position of the particle as such
$$\mu = eX \tag{A.21}$$
where
e = charge
X = distance
The wavefunctions for the particle in a box are given by Equation (A.16). We can then insert Equations (A.21) and (A.16) into Equation (A.19) to
come up with an equation for the transition moment for the particle in a box as such
$$R^{lu} = (2e/L)\int \sin(l\pi X/L)\,X\,\sin(u\pi X/L)\,dX \tag{A.22}$$
where
l = quantum number for the lower energy state
u = quantum number for the higher energy state
The charge on the particle e and the normalization constant A have been brought outside the integral because they do not depend upon X; since A = (2/L)^(1/2) appears once in each wavefunction, together they give the factor 2/L. Equation (A.22) can be solved by applying known properties of the products and integrals of trigonometric functions [4] to obtain the following:
$$R^{lu} = \frac{eL}{\pi^2}\left\{\frac{\cos[(u-l)\pi] - 1}{(u-l)^2} - \frac{\cos[(u+l)\pi] - 1}{(u+l)^2}\right\} \tag{A.23}$$
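As a cross-check on the transition moment, the Python sketch below evaluates the integral of Equation (A.22) numerically and compares it with the closed form (eL/pi^2) x {[cos((u-l)pi) - 1]/(u-l)^2 - [cos((u+l)pi) - 1]/(u+l)^2}, which follows from integrating Equation (A.22) between 0 and L. The choice of e = 1 and L = 1 (arbitrary units) and the midpoint-rule integration are assumptions made only for illustration.

```python
import math


def transition_moment_closed(l, u, length=1.0, charge=1.0):
    """Closed-form transition moment for the particle in a box (requires u != l)."""
    diff, total = u - l, u + l
    term_diff = (math.cos(diff * math.pi) - 1) / diff**2
    term_sum = (math.cos(total * math.pi) - 1) / total**2
    return charge * length / math.pi**2 * (term_diff - term_sum)


def transition_moment_numeric(l, u, length=1.0, charge=1.0, steps=20000):
    """Midpoint-rule evaluation of Equation (A.22) over 0 <= X <= L."""
    dx = length / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * dx
        total += math.sin(l * math.pi * x / length) * x * math.sin(u * math.pi * x / length)
    return 2 * charge / length * total * dx


for l, u in [(1, 2), (1, 3), (2, 3), (2, 4)]:
    print(l, u, transition_moment_closed(l, u), transition_moment_numeric(l, u))
```

In this sketch the moment comes out as zero (to within rounding) whenever u and l are both odd or both even, and nonzero when one is odd and the other even. For l = 2 and u = 3 the closed form gives -1.92 eL/pi^2.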
We could use Equation (A.23) to calculate the transition moment for any two energy levels for the particle in a box. However, we seek to obtain a more general description of what transitions are allowed by noting when $R^{lu} = 0$. It is a property of cosine waves that
$$\cos(n\pi) = -1 \quad \text{for } n = \text{odd} \tag{A.24}$$
and
$$\cos(n\pi) = 1 \quad \text{for } n = \text{even}$$
We can see from Equation (A.23) that the transition moment for the particle in a box depends upon the quantities [cos((u − l)π) − 1] and [cos((u + l)π) − 1]. For any transition where cos((u − l)π) and cos((u + l)π) both equal 1, both cosine terms in the numerators in Equation (A.23) equal 1, and after subtracting 1 both terms go to zero, giving a transition moment of zero. It turns out that if u and l are both even, or are both odd, this condition is met (i.e. (odd + odd) = even, (odd − odd) = even, (even + even) = even, (even − even) = even). For example, for l = 2 and u = 4, (u + l) and (u − l) are both even, and $R^{lu} = 0$. This type of transition is said to be a forbidden transition because its transition moment is zero. Alternatively, combinations with u odd and l even, or u even and l odd, give cosine terms in the numerators in Equation (A.23) of −1 (i.e. (odd + even) = odd, (odd − even) = odd, (even − odd) = odd). The numerators then each reduce to −2,
and the total value for $R^{lu}$ is nonzero. For example, for l = 2 and u = 3 (the transition seen in Figure A.5) both (l + u) and (l − u) are odd, making $R^{lu}$ nonzero. This type of transition is said to be an allowed transition because the transition moment is nonzero. If $\Delta n = (u - l)$, transitions are forbidden for $\Delta n$ = even and allowed for $\Delta n$ = odd. We can summarize all of this in equation form as follows:
$$R^{lu} = 0 \quad \text{for } \Delta n = \pm 2, \pm 4, \pm 6, \ldots \tag{A.25}$$
and
$$R^{lu} \neq 0 \quad \text{for } \Delta n = \pm 1, \pm 3, \pm 5, \ldots$$
Equations (A.25) are an example of what are called selection rules. These rules determine what types of spectroscopic transitions are allowed in bound systems. Selection rules like those in Equations (A.25) exist for mid-infrared, near-infrared, and UV-Vis spectroscopy. The point of this derivation is that selection rules, like energy quantization, are a natural consequence of bound systems. Many of the important features of absorption spectra are determined by the energy levels of molecules, and the selection rules imposed upon those energy levels. Examination of the particle in the box shows that energy quantization and selection rules arise naturally out of binding microscopic particles; no other conditions or assumptions are necessary. Although this treatment will not accurately predict the actual absorption spectra of real world molecules, it at least gives a feel for what gives rise to important aspects of absorption spectra.
REFERENCES
[1] Max Diem, Introduction to Modern Vibrational Spectroscopy, Wiley, New York, 1993.
[2] Donald McQuarrie, Quantum Chemistry, University Science Books, Mill Valley, California, 1983.
[3] Sherman Stein, Calculus and Analytic Geometry, McGraw-Hill, New York, 1977.
[4] Ira Levine, Physical Chemistry, 2nd Edition, McGraw-Hill, New York, 1983.
SINGLE ANALYTE ANALYSIS
The purpose of this chapter is to provide theoretical and practical information on determining the concentration of a single analyte in a sample. The field of multiple analyte analysis is covered in later chapters. Even if you do not plan to perform single analyte analyses, you should read this chapter. Many of the important concepts of quantitative spectroscopy are best learned using a single analyte as an example. This chapter begins with a discussion of the basics of precision and accuracy, and then moves on to how Beer's law is used to achieve calibrations. Since this usually involves plotting a straight line, a discussion of how to plot and determine the equation for a straight line is given. This leads to a presentation of several important statistics, and how they are used to determine calibration quality and robustness. The chapter concludes with sections on standard methods, experimental protocols, measuring absorbances properly, and avoiding experimental errors.
I. Precision and Accuracy
Before we proceed to a discussion of how spectroscopic calibrations are generated, and how their quality is measured, the ideas of precision and accuracy need to be discussed. These two quantities measure data quality, and are often confused. All measurements contain some level of error, regardless of how well we perform an experiment, or how well our instruments work. For example, we may know the concentration of the analyte in a standard sample to be 60 ± 5%, or the position of an absorbance band to be 1700 ± 2 cm⁻¹ (the number after the ± represents the amount of error in the measurement). Another name for experimental error is noise. Noise is what we try to minimize to maximize the quality of a measurement. Signal is the measurement itself, which is the number before the ±. The best data have maximum signal and minimum noise. A quantity used to measure data quality is the signal-to-noise ratio (SNR)
SNR = Signal/Noise
Note that the higher the SNR, the better the data. Two types of error contribute to uncertainties in measurements. Random error is caused by variables over which we have no control, such as pressure, temperature, fluctuations in the electrical voltage powering an instrument, and sloppiness in the experimental technique of the analyst. A digital clock whose battery is dying and whose display flashes random times would give time measurements with a great deal of random error. The sign of random error is random; it can be either positive or negative. Hence, the ± symbol is used to denote the size of random error. Performing multiple measurements and adding the measurements together can reduce the random error. This is why we frequently perform an experiment multiple times and report the average of the data as the final result. For example, the reason we add scans together in Fourier Transform spectroscopy is to reduce the noise level of the spectrum.
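The benefit of averaging can be illustrated with a small simulation. The Python sketch below is not from the original text; it assumes purely random, uncorrelated (Gaussian) noise, for which the noise in an average of N measurements is expected to fall roughly as the square root of N. The signal of 1.0, noise level of 0.1, and 16 co-added scans are hypothetical values.

```python
import random
import statistics


def averaged_measurement(true_value, noise_sd, n_scans, rng):
    """Average n_scans simulated readings, each with Gaussian random error."""
    readings = [true_value + rng.gauss(0.0, noise_sd) for _ in range(n_scans)]
    return sum(readings) / n_scans


rng = random.Random(0)  # fixed seed so the run is repeatable

# Repeat each experiment many times to estimate the scatter (noise) in the result.
single = [averaged_measurement(1.0, 0.1, 1, rng) for _ in range(2000)]
coadded = [averaged_measurement(1.0, 0.1, 16, rng) for _ in range(2000)]

noise_single = statistics.stdev(single)
noise_coadded = statistics.stdev(coadded)
snr_gain = noise_single / noise_coadded

print("noise, 1 scan:", noise_single)
print("noise, 16 scans averaged:", noise_coadded)
print("SNR improvement:", snr_gain)
```

With 16 scans the improvement in SNR should come out near 4, the square root of 16. Systematic error is not reduced by averaging in this way, since it has the same sign and size in every scan.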
Random error can be reduced, but never eliminated. We can never have complete control over every variable that affects a measurement. Systematic error is error that always has the same sign and magnitude. A digital clock that is 1 hour slow because it was not set for daylight savings time is an example of a systematic error. The clock will always be
in error by the same amount (1 hour) and in the same direction (slow). In order to calculate systematic error, we must have a true reference value to compare to the measured value. For the digital clock example, the actual time is the reference value. Systematic error is also called bias. Unlike random error, systematic error can be eliminated. In the case of the digital clock, its systematic error is removed by setting the clock ahead by 1 hour. Precision is a measure of the size of the random error in a measurement. Precision is often determined by measuring the same quantity several times, then looking at the scatter in the data. Accuracy is a measure of how far away a given measurement is from its true value. Like systematic error, determining accuracy requires the existence of a true value to which to compare the measured value. Accuracy includes the effects of random and systematic errors, and is a more complete description of the quality of a measurement than precision. Precision and accuracy are not the same, as can be visualized by the targets seen in Figure 2.1. Consider the center of the bull's eye to be the true value of a measurement. The left-hand target in Figure 2.1 is the worst case scenario. The data are imprecise, scattered all over the place, and are inaccurate, falling far away from the center of the target. The data in the middle target are somewhat better. The data are precise, forming a tight cluster with little scatter. However, the data are inaccurate because the cluster is centered far from the bull's eye. The distance from the center of the data cluster to the center of the target is a measure of the systematic error in the data. Note that precision is a measure of random noise only, and that data devoid of random noise can still be wrong because of bias. The rightmost target in Figure 2.1 is the ideal situation. The data are precise, forming a tight cluster, and the data are accurate, falling near the bull's eye.
Note that good data have small amounts of both types of error, and that accuracy measures both the systematic and random errors.
Figure 2.1 A visualization of the concepts of precision and accuracy using a target. Panels, left to right: Imprecise & Inaccurate; Precise & Inaccurate; Precise & Accurate.
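The distinction between bias and precision can be made concrete with a few lines of Python. This is only a minimal sketch: the function name `bias_and_precision` and the readings are hypothetical, chosen to mimic the middle target in Figure 2.1 (a tight cluster centered away from the true value).

```python
import statistics

def bias_and_precision(measurements, true_value):
    """Return (bias, precision) for repeated measurements of one quantity.

    bias      = mean of the measurements minus the reference value (systematic error)
    precision = sample standard deviation of the measurements (random error)
    """
    mean = statistics.mean(measurements)
    return mean - true_value, statistics.stdev(measurements)

# Hypothetical readings of a quantity whose true value is 10.0
precise_but_biased = [12.0, 12.1, 11.9, 12.0]
bias, precision = bias_and_precision(precise_but_biased, 10.0)
print(f"bias = {bias:.2f}, precision = {precision:.2f}")
```

The scatter is small while the bias is large, which is the "precise but inaccurate" case. Note that the bias can only be computed because a reference value is supplied.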
II. Calibration and Prediction with Beer's Law
Beer's law was derived in Chapter 1. As a review, we will restate its form here:
$$A = \varepsilon l c \qquad (2.1)$$
where
$A$ = absorbance
$\varepsilon$ = absorptivity
$l$ = pathlength
$c$ = concentration
To obtain a calibration using Beer's law, we must first observe that its form is analogous to the formula for a straight line. The formula for a straight line is
$$Y = mX + b \qquad (2.2)$$
where
$Y$ = Y-axis value of a data point
$m$ = slope of the line
$X$ = X-axis value of a data point
$b$ = Y-intercept of the line (where the line crosses the Y-axis)
Comparison of Equations (2.1) and (2.2) shows that if absorbance is plotted on the Y-axis, and concentration is plotted on the X-axis, a straight line should be obtained whose slope is given by
$$m = \varepsilon l \qquad (2.3)$$
The slope of a line can be measured by choosing two (or preferably more) data points on the line and calculating the difference between their Y-axis values ($\Delta Y$), then calculating the difference between their X-axis values ($\Delta X$). Then, we divide $\Delta Y$ by $\Delta X$ (or put more simply, calculate rise/run) to obtain the slope of the line.
To obtain the data needed to develop a calibration, the absorbances of a series of standard samples are measured. The peak height at a specific wavenumber, or an integrated peak area calculated with fixed limits, is used (see below for more on measuring peak heights and areas). The concentrations are determined by some method other than the one used to obtain the calibration. Frequently, standards are made with measured
Figure 2.2 A calibration line for the analysis of isopropanol (IPA) in water. The Y-axis is the area of the IPA peak centered at 2973 cm⁻¹. The peak area was integrated between 3008 and 2947 cm⁻¹. The X-axis is volume percent IPA in water. The data are contained in Table 2-1. (Plot title: Peak Area Calibration for IPA; fitted line y = 0.165x - 0.6834, R = 0.999.)
masses or volumes of analytes. Alternatively, a method such as titration or chromatography can be used to determine the concentration of the analyte in standard samples. This points out that quantitative spectroscopy is a secondary method of determining concentrations. We cannot calibrate a spectrometer until some primary method of analysis is used to determine the analyte concentration in the standards.
An example of a calibration obtained using Beer's law is seen in Figure 2.2. This figure is a Beer's law plot or calibration line for the determination of isopropanol (IPA, also known as rubbing alcohol) mixed with water. The Y-axis is the area of the IPA peak centered at 2973 cm⁻¹ (the methyl group asymmetric C-H stretch, integration limits 3008-2947 cm⁻¹). The X-axis is the volume percent of IPA in water. This line is the mathematical model relating absorbance to concentration. The data used in making this plot are seen in Table 2-1. An overlay of the peaks analyzed is seen in Figure 2.3. The experimental details of how the spectra were measured are contained in the appendix to this chapter. If a plot of absorbance versus concentration is linear, a system is said to follow Beer's law. Beer's law plots may be nonlinear for a number of reasons as discussed later.
Once a calibration is in hand, the next step is to predict the concentration of the analyte in an unknown sample. We first need to write Beer's law for the unknown sample
$$A_{\mathrm{unk}} = \varepsilon l c_{\mathrm{unk}} \qquad (2.4)$$
TABLE 2-1 Volume % and peak area for calibration line seen in Figure 2.2.
% IPA    IPA Peak Area
9        1
18       2.2
35       4.9
53       8
70       11
Figure 2.3 An overlay of the five peaks whose areas are given in Table 2-1. These data are used in the calibration line seen in Figure 2.2. The dashed vertical line (at 2973 cm⁻¹) shows how the peak maximum changes with the concentration. This is an example of absorptivity drift as discussed in Chapter 1. (X-axis: Wavenumber (cm⁻¹), 2980 to 2940.)
where
$A_{\mathrm{unk}}$ = absorbance of the unknown sample
$\varepsilon l$ = the product of the absorptivity and the pathlength (= the slope of the calibration line)
$c_{\mathrm{unk}}$ = the analyte concentration in the unknown sample
Ultimately, it is the concentration of the analyte in the unknown sample, $c_{\mathrm{unk}}$, in which we are interested. Rearranging Equation (2.4) gives
$$c_{\mathrm{unk}} = A_{\mathrm{unk}} / \varepsilon l$$
From Equation (2.3), $m = \varepsilon l$, giving
$$c_{\mathrm{unk}} = A_{\mathrm{unk}} / m \qquad (2.5)$$
According to Equation (2.5), predicting the concentration of an analyte in an unknown sample is a matter of dividing the measured absorbance of the unknown sample by the slope of the calibration line. Note that to use Equation (2.5) you do not need to know the absorptivity and the pathlength explicitly, just their product.
In comparing Equations (2.1) and (2.2), note that Beer's law does not have a second term. This means that b, the Y-intercept of the line, should equal zero. In other words, the line should pass through the origin. Sometimes plots of absorbance versus concentration have nonzero Y-intercepts. At the origin the analyte concentration is zero, so should not the absorbance be zero as well? Not necessarily; part of the reason is absorbance by interferents. For example, solvent or impurity molecules may absorb at the same wavenumber as the analyte. Noise is also a culprit; the line may not pass exactly through zero due to the noise in the absorbance data. In general, Beer's law plots should not be forced through the origin. In doing so, it is assumed that the absorbance at zero concentration is exactly zero, which is not necessarily the case.
In Chapter 1, there was a discussion of how concentration affects the absorptivity. An example is seen in Figure 2.3. Looking at the figure from top to bottom, the concentration of IPA goes from 70 to 9%. The vertical dashed line in the figure passes through the peak in the top spectrum at 2973 cm⁻¹. At lower IPA concentrations, the peak maximum moves to a higher wavenumber and the peak becomes broader, making it increasingly difficult to determine the peak edges needed to perform an accurate integration. Ultimately, these changes in the spectrum of IPA with concentration were not enough to greatly affect the quality of the calibration. However, Figure 2.3 illustrates that changes in absorptivity with concentration are real, and if the changes are large enough, inaccurate calibrations may result.
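The prediction step of Equation (2.5), and the effect of a nonzero Y-intercept on it, can be sketched in Python. The slope and intercept are the fit values quoted in Figure 2.2. The function name `predict_concentration` is hypothetical, and subtracting the intercept before dividing is an extension of Equation (2.5) assumed here for a line that has not been forced through the origin.

```python
def predict_concentration(absorbance, slope, intercept=0.0):
    """Invert the calibration line A = slope * c + intercept to get c."""
    if slope == 0:
        raise ValueError("slope must be nonzero")
    return (absorbance - intercept) / slope

# Fit quoted in Figure 2.2: peak area = 0.165 * (%IPA) - 0.6834
SLOPE = 0.165
INTERCEPT = -0.6834

area = 4.9  # peak area of the 35% IPA standard in Table 2-1
c_eq25 = predict_concentration(area, SLOPE)             # Equation (2.5), intercept taken as zero
c_line = predict_concentration(area, SLOPE, INTERCEPT)  # uses the fitted intercept
print(f"{c_eq25:.1f} {c_line:.1f}")
```

For this standard, the prediction that uses the fitted intercept lands closer to the known 35% than the one that assumes the line passes through the origin, which is in keeping with the advice not to force Beer's law plots through zero.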
III. Plotting and Analyzing Lines
In single analyte analysis the calibration model is typically a straight line. To understand these calibrations we must learn more about straight lines and their properties. This section will cover how equations for straight lines are calculated from a set of data, and what statistics can be calculated to assess the quality and robustness of a calibration.
A. LINEAR REGRESSION: LEAST SQUARES FITTING
Regression is the process by which a mathematical model is generated relating sets of data. Linear regression assumes that two sets of data have a linear relationship. One set of linear equations used to regress two sets of data and calculate the line relating them is called the least squares equations. The resulting line is sometimes called a least squares fit [1-5]. You have probably heard of the least squares fitting routine, and may have used it to plot lines using various computer programs such as Microsoft Excel®. It is easy to treat least squares fitting routines as black boxes, simply stuffing the data in and getting out calibration lines. However, least squares routines have certain attributes and assumptions associated with them that must be understood to use them properly. It is time we opened the black box.
1. Properties of the Least Squares Fit
It can be shown [1] that the slopes and Y-intercepts calculated using the least squares methods are unbiased. This means no systematic error is introduced into the results by the calculation. This may seem an obvious requirement for a fitting method, but there are fitting methods that introduce bias into the calculated results [1]. The least squares fit will not remove bias already present in a set of data; it will simply not add any bias of its own.
Another property of least squares methods is that they provide the best fit possible to the data. In Figure 2.2, you can see that all of the data points fall a little below or above the calculated line. The distances of the data points from the line represent the error in the least squares fit. It can be shown [1, 2] that a least squares fit provides the line that minimizes the sum of the squared vertical distances between the data and the calculated line. Thus, a least squares fit provides a line with the least possible error in it. The proof of this idea [3], while beyond the scope of this book, is recommended reading because it is relatively easy to follow and provides an interesting insight into how least squares methods work.
Thus, two important properties of least squares fits are that they provide unbiased, minimum error results for the slope and the intercept of lines being fit to the data. It is no wonder then that this method is widely used for regression analysis.
2. Assumptions of the Least Squares Method
There are a number of assumptions made when applying least squares fits to data. A complete understanding of these assumptions is necessary to ensure successful calibrations. Here are some of the assumptions behind the least squares method [1].
1. Perfect Model. In using the least squares method, it is assumed that the equation being used perfectly describes the data. In other words, there are no variables affecting the data outside those represented in the equation describing the data. This puts a huge burden on the analyst. It is up to you to figure out which equation best describes the data in front of you. For single analyte quantitative spectroscopic analysis, the equation being fit is Beer's law. By using Beer's law in a least squares fit, we are assuming that the system follows Beer's law, and that there are no parameters other than the absorptivity, pathlength, and concentration that affect the absorbance. We already know from Chapter 1 that many variables affect the absorbance besides those in Beer's law. If those variables are not controlled, not only will it be inappropriate to use Beer's law, it will be inappropriate to use least squares fitting on the data as well.
2. Linear Model. When performing linear regression, the equation being fit must be linear. Specifically, this means that the coefficients must be linear. In the case of Beer's law, $A = \varepsilon l c$, the coefficients are the absorptivity and the pathlength. It is assumed that these quantities vary linearly with absorbance. If, for example, absorbance happened to be proportional to the square of one of these quantities, it would be inappropriate to use linear regression to analyze the data.
3. No Systematic Error. This is just another way of saying that all the error in the data is random error. Since we are fitting plots of absorbance versus concentration, we are assuming that these data have no systematic error in them. Any bias in the data will produce bias in the line calculated from the data. In reality, it may be impossible to assure that there is absolutely no systematic error in the data. A more realistic assumption is that the systematic error is significantly less than the random error. If this is the case, the least squares methods can usually be applied legitimately.
4. Error in the Y-Axis Data Must Have Certain Properties. This concept is best illustrated with an example. Imagine for the moment that the Y-axis data is concentration, and that you have five standard samples, hence five Y-axis data points. Also, imagine that we use titration to analyze the concentration of each standard, and that we perform nine titrations of each standard under controlled conditions. For each standard, we will determine nine concentrations, and because of random error, there will be some scatter of these concentrations around the average for each standard. Strictly speaking [3], the shape and extent of this scatter has to be the same for each sample to legitimately apply the least squares fitting to the data. For those familiar with statistics, we are saying that the errors for each sample follow a Gaussian distribution, and that the standard deviation for each sample is the same (these concepts are explained later in this chapter).
5. There Is No Error in the X-Axis Data. This is perhaps the most important assumption from a practical point of view. Error free data do not exist, thus this assumption is violated practically any time least squares fitting is performed. When fitting absorbance versus concentration data, we are assuming that there is no error in the concentration measurements, which is patently false. What then should we do? We know that both the absorbances and concentrations have error in them. However, the concentration error is often an order of magnitude greater than the absorbance error. The best way to minimize the damage caused by violating this assumption is to put the data with the least error in it on the X-axis. This means putting absorbance on the X-axis, concentration on the Y-axis, and fitting plots of concentration versus absorbance. This is actually the recommended practice, as will be discussed in more detail later in this chapter.
To summarize, to apply the least squares equations to a problem, conditions are placed upon the equation to be fit, and the error in the data. If these assumptions are met, least squares fitting can be applied legitimately to your data.
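Assumption 4 (equal scatter for every standard) can be examined when replicate measurements are available. The sketch below is only illustrative: the replicate values and the names `scatter_per_standard` and `scatter_ratio` are hypothetical, and comparing the largest and smallest standard deviations is one simple screening idea assumed here, not a formal statistical test.

```python
import statistics

def scatter_per_standard(replicates):
    """Map each standard's name to the sample standard deviation of its replicates."""
    return {name: statistics.stdev(values) for name, values in replicates.items()}

def scatter_ratio(replicates):
    """Largest standard deviation divided by the smallest; near 1 means similar scatter."""
    sds = scatter_per_standard(replicates).values()
    return max(sds) / min(sds)

# Hypothetical replicate titration results (% analyte) for three standards
replicates = {
    "std_10": [9.8, 10.0, 10.2],
    "std_20": [19.8, 20.0, 20.2],
    "std_30": [29.0, 30.0, 31.0],
}
ratio = scatter_ratio(replicates)
print(f"max/min standard deviation ratio = {ratio:.1f}")
```

A ratio well above 1, as for the third hypothetical standard here, suggests that the scatter is not the same for every sample and that the fit should be treated with more caution.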
3. Calculating a Least Squares Line
Table 2-1 contains two sets of data, a series of concentrations and a series of absorbances. There are a number of computer programs readily available today (including Microsoft Excel®) that can be used to calculate the least squares line relating two sets of data. However, it is useful to examine the equations used in this process to understand where calibrations come from. Assuming two sets of data, X and Y, are related to each other by the following equation:
$$Y = mX + b$$
Then the least squares slope of a line fitting this data is given by [1]:
$$m = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_i (X_i - \bar{X})^2} \qquad (2.6)$$
and the least squares Y-intercept is given by
$$b = \bar{Y} - m\bar{X} \qquad (2.7)$$
where
$m$ = slope of the line
$i$ = data point index
$b$ = Y-intercept of the line
$\bar{X}$ = the average of the X-axis values
$\bar{Y}$ = the average of the Y-axis values
The calculations involved in determining the calibration line include subtracting the average X from all X values, the average Y from all Y values, then performing the appropriate summing, multiplication, squaring, and division.
Recall from the previous section that one of the assumptions of the least squares method is that there is no error in the X-axis data. This assumption is best met by plotting concentration versus absorbance. To do this we use what is known as the inverse form of Beer's Law. Rather than writing absorbance as a function of concentration as seen in Equation (2.1), we write concentration as a function of absorbance as such
$$c = A / \varepsilon l \qquad (2.8)$$
Equation (2.8) means that a calibration line plotted using the inverse form of Beer's law would have concentration on the Y-axis, absorbance on the X-axis, and a slope of $1/\varepsilon l$. We would call this graph an inverse Beer's Law plot. Inverse Beer's law plots are convenient because the unknown variable, the concentration, has already been solved for in Equation (2.8) and can be determined directly. There are other advantages of using the inverse form of Beer's law [3] discussed in Chapter 3. An inverse Beer's law plot for the IPA data in Table 2-1 is seen in Figure 2.4.
Using the data in Table 2-1 and applying Equation (2.6), we can calculate the slope of the line seen in Figure 2.4. The data needed to determine the slope are listed in Table 2-2. Using the sums in the bottom row of Table 2-2, the slope is 411/68 or 6.04. To calculate the Y-intercept we use Equation (2.7) and find that b = 37 - (6.04 × 5.4) or 4.39. The final equation for the line plotted in Figure 2.4 is
$$\%\mathrm{IPA} = 6.04\,(\text{Peak Area}) + 4.39 \qquad (2.9)$$
This equation would be used to predict the concentration of IPA in an unknown sample given the unknown sample's peak area. Note that this equation was calculated manually. The equation for this line calculated using Microsoft Excel is seen at the top of Figure 2.4. While the two calculations did not produce identical results, they are very similar, and the differences can probably be explained by the rounding of numbers in the manual calculation.
Figure 2.4 The Inverse Beer's Law plot for the data contained in Table 2-1. This line also acts as a calibration line, relating volume percent IPA to peak area. (Plot title: Inverse Beer's Law Plot for IPA; fitted line y = 6.0528x + 4.1936, R = 0.999; X-axis: Peak Area.)
TABLE 2-2 Least squares slope calculation for the calibration line seen in Figure 2.4. Xs and Ys from Table 2-1. X̄ = 5.4, Ȳ = 37.
Xi (Peak Areas)   Yi (% IPA)   (Xi - X̄)   (Xi - X̄)²   (Yi - Ȳ)   (Xi - X̄)(Yi - Ȳ)
1                 9            -4.4        19.4         -28         123
2.2               18           -3.2        10.2         -19         60.8
4.9               35           -0.5        0.25         -2          1
8                 53           2.6         6.76         16          41.6
11                70           5.6         31.4         33          185
Sums                                       Σ = 68.0                 Σ = 411
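Equations (2.6) and (2.7) translate directly into a few lines of Python. The following is a minimal sketch using the Table 2-1 data with peak area as X and % IPA as Y; the function name `least_squares_line` is hypothetical. Because no intermediate values are rounded, the result should agree with the Excel fit quoted in Figure 2.4 rather than with the manually rounded Equation (2.9).

```python
def least_squares_line(xs, ys):
    """Slope and intercept from Equations (2.6) and (2.7)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    # Numerator and denominator of Equation (2.6)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all X values are identical")
    m = sxy / sxx
    b = y_bar - m * x_bar  # Equation (2.7)
    return m, b

peak_area = [1, 2.2, 4.9, 8, 11]   # X, Table 2-1
pct_ipa = [9, 18, 35, 53, 70]      # Y, Table 2-1
m, b = least_squares_line(peak_area, pct_ipa)
print(f"%IPA = {m:.4f} * area + {b:.4f}")
```

The main source of the difference from the manual result is that the average peak area is 5.42, which Table 2-2 rounds to 5.4.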
B. STATISTICS FOR DETERMINING CALIBRATION QUALITY AND ROBUSTNESS
One way to think about the problem of calculating calibration models is that the variance in the data must be modeled. For our purposes, the variance in a data set is determined by the difference between each individual data point and the average for that set of data. For example, the variance in a set of concentration data would be determined by the difference between each individual concentration and the average concentration. The greater the difference between a set of data points and their average, the greater the variance in the data. Ultimately, variance is a measure of the scatter, or spread, in a data set.
Figure 2.5 An example of variance in a set of data. The "X" represents the average of the data. The outer circle represents the total variance (SST), the inner circle the variance accounted for by the model (SSR). The area between the two circles (torus) is the unmodeled variance (SSE), the variance due to error.
If a calibration model were perfect, it would completely model all the variance in a set of data. For example, if all the data points in Figure 2.4 fell exactly on the calibration line, we would have a perfect model with no unmodeled variance. In reality, the data points in Figure 2.4 do not fall exactly on the line, which means the line has not modeled all the variance in the data. The unmodeled variance can be attributed to error either in the data or in the model.
The idea of variance and how a calibration models it can be visualized by looking at Figure 2.5. This figure is a hypothetical plot of a set of data points. The large bold X in the middle of Figure 2.5 represents the average of the data. There is obviously some scatter in this data. It can be shown [3, 4] that the total variance in a data set, the variance explained by the mathematical model, and the variance due to error are given by the following equations:
$$\mathrm{SST} = \sum_i (Y_i - \bar{Y})^2 \qquad (2.10)$$
$$\mathrm{SSR} = \sum_i (Y'_i - \bar{Y})^2 \qquad (2.11)$$
$$\mathrm{SSE} = \sum_i (Y_i - Y'_i)^2 \qquad (2.12)$$
where
$Y_i$ = actual (measured) Y-axis values
$\bar{Y}$ = the average value of Y
$Y'_i$ = the value of Y predicted by the calibration model
SST = sum of squares total (the total variance)
SSR = sum of squares explained by regression (the variance modeled)
SSE = sum of squares due to error (variance not modeled)
Note that SST is the total variance and depends only upon the data and its average. The area of the outer circle in Figure 2.5 represents the total variance. SSR is the amount of the variance explained by the calibration model (the R stands for regression), and depends upon the predicted values of Y and the average Y. The area of the inner circle in Figure 2.5 represents SSR. SSE is the amount of variance not explained by the calibration model, or the amount of variance due to error. SSE depends on the actual and predicted Y values. It is represented by the area between the two circles in Figure 2.5. The area of this torus (doughnut shape) is what the least squares fits attempt to minimize. A perfect model would have outer and inner circles of equal areas, all variance would be modeled, and SSE would equal zero.
Since a calibration model explains some but not all the variance, the total variance must depend on the sum of the variance modeled and the variance due to error. It can be shown [3, 4] that
$$\mathrm{SST} = \mathrm{SSR} + \mathrm{SSE} \qquad (2.13)$$
Equation (2.13) is another way of saying that the area of the outer circle in Figure 2.5 is equal to the sum of the areas of the inner circle and the torus. Using the data in Tables 2-1 and 2-2, we can calculate SST, SSR, and SSE for the calibration line seen in Figure 2.4. In this example, Y' values are calculated by substituting the measured peak area values for each sample into Equation (2.9). The data used to calculate these values is seen in Table 2-3.
TABLE 2-3 Data used to calculate SST, SSR, and SSE for the %IPA calibration line seen in Figure 2.4. Ȳ = 37.
Y       Y'      (Yi - Ȳ)²          (Y' - Ȳ)²          (Y - Y')²
9       10.4    784                708                1.96
18      17.7    361                372                0.09
35      34      4                  9                  1
53      52.8    256                250                0.04
70      70.9    1089               1149               0.81
Sums            Σ = 2494 (SST)     Σ = 2488 (SSR)     Σ = 3.9 (SSE)
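Equations (2.10) through (2.12) can be evaluated directly from the measured and predicted Y values. The sketch below uses the unrounded least squares fit of the Table 2-1 data (Equations 2.6 and 2.7) instead of the rounded Equation (2.9), so its SSR and SSE will differ slightly from the entries in Table 2-3. The function name `sums_of_squares` is hypothetical.

```python
def sums_of_squares(ys, y_pred):
    """Return (SST, SSR, SSE) per Equations (2.10)-(2.12)."""
    y_bar = sum(ys) / len(ys)
    sst = sum((y - y_bar) ** 2 for y in ys)
    ssr = sum((yp - y_bar) ** 2 for yp in y_pred)
    sse = sum((y - yp) ** 2 for y, yp in zip(ys, y_pred))
    return sst, ssr, sse

peak_area = [1, 2.2, 4.9, 8, 11]   # X, Table 2-1
pct_ipa = [9, 18, 35, 53, 70]      # Y, Table 2-1

# Unrounded least squares fit of %IPA on peak area (Equations 2.6 and 2.7)
x_bar = sum(peak_area) / len(peak_area)
y_bar = sum(pct_ipa) / len(pct_ipa)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(peak_area, pct_ipa))
sxx = sum((x - x_bar) ** 2 for x in peak_area)
m = sxy / sxx
b = y_bar - m * x_bar

y_pred = [m * x + b for x in peak_area]
sst, ssr, sse = sums_of_squares(pct_ipa, y_pred)
print(f"SST = {sst:.1f}, SSR = {ssr:.1f}, SSE = {sse:.2f}, SSR + SSE = {ssr + sse:.1f}")
```

With the unrounded fit, SSR + SSE reproduces SST to within floating-point precision, so any mismatch in a hand calculation of Equation (2.13) points to rounding rather than to a failure of the identity.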
For this calibration model, SST = 2494, SSR = 2488, and SSE = 3.9. Ideally, this data should satisfy Equation (2.13). However, 2488 + 3.9 = 2491.9 or ≈ 2492, and 2494 ≠ 2492. The data do not satisfy Equation (2.13) exactly, but do so within 0.1%. The probable explanation is the accumulation of the round off error in manually performed calculations.
There are a number of very useful statistics that can be calculated from SST, SSR, and SSE that describe the quality and robustness of a calibration model. The best known of these is called the standard deviation, and is calculated from the following equation [1, 3]:
$$\sigma = [\mathrm{SSE}/(n - 1)]^{1/2} \qquad (2.14)$$
where
$\sigma$ = standard deviation
$n$ = number of data points
Note that the standard deviation depends only on SSE, the unmodeled variance, and does not depend upon SST or SSR. The (n - 1) term in the denominator is called the degrees of freedom. This term appears instead of n, the number of data points. The degrees of freedom take account of the fact that it only takes (n - 1) data points to describe a data set containing n points. Here is how to think about it. Imagine we have the following simple algebraic equation:
$$1 + 2 + 3 + x = 10$$
It is easy enough to solve this equation to discover that x = 4. Now, we only need three of the four data points on the left-hand side of the equation to solve this equation, so this equation has three degrees of freedom. The same idea applies to data sets being fit by a model. The standard deviation, then, is a standardized way of expressing the amount of error per data point.
From the data in Table 2-3, we can calculate that $\sigma = [3.9/(5 - 1)]^{1/2}$ or 0.987% IPA (we will round this value up to one from here on out). Note that the standard deviation has the same units as the Y-axis data. The standard deviation is commonly used to refer to the accuracy of the predicted values. For example, if Equation (2.9) were used to predict the volume percent of IPA in an unknown sample, it would be legitimate to say that the predicted value had an accuracy of ±1%.
Another statistic that describes how well a calibration model fits a data set is the correlation coefficient. It is given by [1-4]
$$R = (\mathrm{SSR}/\mathrm{SST})^{1/2} \qquad (2.15)$$
where
$R$ = correlation coefficient
The term "correlation coefficient" is used because this quantity describes how well the model fits, or correlates to, the data. R depends upon the fraction of variance in the data described by the model, SSR. If the model perfectly describes the data, SSR = SST (SSE is 0) and R = 1; all of the variance is modeled. If the model describes none of the variance, SSR = 0 and R = 0. Thus, the correlation coefficient varies between 0 and 1. A value of 1 is for a perfect model; a value of 0 is for the worst possible model. In terms of Figure 2.5, the correlation coefficient depends upon the ratio of the area of the inner circle to the area of the outer circle. The greater the fraction of the total area modeled, the bigger the inner circle, the smaller the torus, and the better the model.
From the data in Table 2-3, the correlation coefficient for the %IPA calibration is $(2488/2494)^{1/2}$ or 0.998. This indicates that the model fits the data very well. Analysts occasionally refer to the quality of their calibrations by the number of "9s" in the correlation coefficient. For example, a calibration with "one 9" would have R > 0.9, a calibration with two "9s" would have R > 0.99, etc. This calibration has R = 0.998, so it has two 9s, but is close to three 9s.
Another useful statistic that can be derived from SST, SSR, and SSE calculations describes the robustness of a calibration. For our purposes, robustness is a measure of how sensitive a calibration is to random error. If a small increase in the random error has a small effect upon the predicted values, a calibration is robust. On the other hand, if a small increase in the random error gives a large effect upon predicted values, a calibration is not robust. The statistic used to measure the calibration robustness is called the F for Regression and is given by the following formula [2]:
$$F = \frac{\mathrm{SSR}/m}{\mathrm{SSE}/(n - m - 1)} \qquad (2.16)$$
where
$F$ = F for regression
$n$ = number of data points
$m$ = number of independent variables
Note that the F for regression depends upon the amount of variance described by the model (SSR) divided by the amount of variance due to error (SSE). You can think of this calculation as a signal-to-noise ratio (SNR). SSR is the amount of variance that can be explained, the signal. SSE is the error, or noise, in the determination. By taking their ratio, we measure how much variance the model explains compared to what it does not explain.
The reason the F for regression is a measure of robustness is as follows. Imagine a calibration where F = 10, or in other words the SNR = 10. If a problem with the spectrometer causes the absolute error in measured absorbances to increase by 1 unit, SSE would increase 1/10 or 10%. On the other hand, if F for a calibration is 1000, an increase in error of 1 unit changes F by 1/1000 or 0.1%. Simply stated, 1 is a smaller percentage of 1000 than it is of 10. A robust calibration with a large F is less sensitive than a fragile calibration with a low F to the same increase in error. In terms of Figure 2.5, the F for Regression depends upon the area of the inner circle divided by the area of the torus. A robust model will have a small torus; a fragile model will have a large torus. Using the data in Table 2-3, the F for regression for the IPA calibration in Figure 2.4 is [2488/(3.9/3)] or 1914.
The variable n in Equation (2.16) is determined by the number of standard samples. As n goes up, so does F. This means that the more standard samples included in a calibration, the higher F becomes and the more robust the calibration will be. The variable m in Equation (2.16), the number of independent variables, is for our purposes a measure of how many different absorbance values in each spectrum are used in a calibration. For the calibration in Figure 2.4, one absorbance band's area was plotted giving a value of m = 1. Since a Beer's law equation can be written for each absorbance in a spectrum used, m also determines the number of equations in a calibration. Note in Equation (2.16) that as m goes up, F goes down. This means the more equations used in a calibration, all else being equal, the less robust the calibration becomes. F contains the quantity SSR/m, a measure of how much variance per equation is modeled. A robust calibration will model the most variance with the fewest equations. A fragile calibration will need many equations to model the same amount of variance.
The measures of calibration quality for the IPA calibration discussed in this section are summarized in Table 2-4. Overall, this is an excellent calibration. Other absorbance bands, including the one at 948 cm⁻¹, were used in trial calibrations. These typically gave standard deviations of two or greater. The key to the success of this calibration was that the peak at 2973 cm⁻¹ was not very intense. This peak's intensity for the most concentrated sample is 0.76. As will be discussed below, ideally only absorbance bands whose intensity is 0.8 or less should be used in calibrations. At higher absorbances, the relationship between absorbance and concentration tends to be nonlinear. One of the reasons Beer's law fails at high concentrations is absorptivity drift. As was stated in Chapter 1, the absorptivity is a function of the chemical environment of a molecule. The greater the concentration range modeled in a calibration, the more the chemical environment changes, and the more probable a measurable absorptivity change will be observed.
TABLE 2-4 Calibration metrics for the IPA calibration.
Metric                         Value
Standard Deviation (σ)         1.0
Correlation Coefficient (R)    0.998
F for Regression               1914
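The three calibration metrics of this section follow directly from SST, SSR, and SSE. The sketch below evaluates Equations (2.14), (2.15), and (2.16) using the sums from Table 2-3 with n = 5 standards and m = 1 independent variable. The function name `calibration_metrics` is hypothetical.

```python
import math

def calibration_metrics(sst, ssr, sse, n, m=1):
    """Standard deviation (2.14), correlation coefficient (2.15), F for regression (2.16)."""
    if n - m - 1 <= 0:
        raise ValueError("need n > m + 1 data points")
    sigma = math.sqrt(sse / (n - 1))
    r = math.sqrt(ssr / sst)
    f = (ssr / m) / (sse / (n - m - 1))
    return sigma, r, f

# Sums taken from Table 2-3
sigma, r, f = calibration_metrics(sst=2494, ssr=2488, sse=3.9, n=5, m=1)
print(f"sigma = {sigma:.3f}, R = {r:.4f}, F = {f:.0f}")
```

Keeping n and m as explicit arguments makes it easy to see how F changes if standards are added (n increases) or more absorbances are used (m increases).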
IV. Methods for Making Standards and Measuring Spectra
It is important when making standards to give some thought to the concentrations of the analytes in the standards. First, the concentration range should be broad enough to bracket the concentration that any unknown sample might exhibit. This prevents applying the calibration to an unknown that falls outside the concentration range of the calibration. Second, the concentration range should be significantly broader than the error in the concentration measurements. For example, if the error in a series of concentration determinations is ±3%, and if standards are prepared with nominal concentrations of 10, 12, 14, and 16%, the concentration error bars for all these standards would overlap and there would be no unique data points. The solution to this problem is to space out the concentrations, so that they are farther apart than the error bars. In this case, standard concentrations of 10, 20, 30, and 40% would work.
A third thing to consider about the analyte concentrations is their distribution. A calibration line is based on data points. The more data points in a specific concentration region, the more accurate the line is in that region. The concentrations plotted in Figure 2.4 are about evenly spaced. This is ideal; it means that the calibration is equally accurate throughout its range. On the other hand, if the standards contained four samples with IPA concentrations below 40%, and one sample at 70%, the calibration would be very accurate below 40% and less accurate at higher concentrations. There are situations where you may not have control over the concentrations in the standard samples, and have to live with whatever concentrations the standards contain. However, if you do have control over the concentrations in the standards, try to space them out.
The next two sections discuss two different standard methods. These are different ways of using standards in creating a calibration.
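The error bar argument above can be turned into a quick check before standards are prepared. This is a minimal sketch under stated assumptions: the function name `standards_are_resolved` is hypothetical, the ±3% error is treated as an absolute error in concentration units applied equally to every standard, and two error bars are taken to overlap whenever the gap between neighbors is no more than twice the error.

```python
def standards_are_resolved(concentrations, error):
    """True if every pair of adjacent standards differs by more than the combined error bars.

    Assumption for this sketch: each concentration carries the same +/- error,
    so two error bars stop overlapping once the gap exceeds 2 * error.
    """
    ordered = sorted(concentrations)
    return all(hi - lo > 2 * error for lo, hi in zip(ordered, ordered[1:]))

print(standards_are_resolved([10, 12, 14, 16], error=3))  # closely spaced set from the text
print(standards_are_resolved([10, 20, 30, 40], error=3))  # spaced-out set from the text
```

The closely spaced set fails the check and the spaced-out set passes, matching the conclusion reached in the text.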
A. EXTERNAL STANDARDS
The method of external standards can be summarized as calibrate now, predict later. It follows the same steps outlined earlier in this chapter for obtaining a caHbration. Standards are made, their spectra obtained, and their absorbances measured. The absorbances and concentrations are plotted to give a caHbration curve such as Figure 2.4. The concentration of unknown samples is determined by obtaining their spectrum, measuring their absorbance, and then using the caHbration Hne to predict the concentration in the unknown sample. The method of external standards is so called because the calibration and prediction steps are performed at different points in time. The caHbration plotted in Figure 2.4 was obtained using this method. The external standards method works fine for many samples, and is relatively simple to perform. However, it does have some drawbacks. It does not take into account fluctuations in random variables over time, such as changes in the instrument performance or sampHng. These variables can have an unknown effect on the measured absorbances, contributing an unknown amount of error to the predicted concentrations. When using the external standards method (in the absence of calibration checking), one can never be 100% sure if a calibration obtained in the past is stiH working weH. A way of insuring the validity of a calibration over time is to check the calibration with a new standard sample. This is done by simply making up a new standard with a known concentration of the analyte, measuring its spectrum, predicting its concentration, then comparing the known and predicted concentrations for the sample. If the concentrations agree reasonably weU, the caHbration is stiH good. However, if the predicted and known concentrations are not in reasonable agreement, the caHbration may no longer be accurate, and the source of variability must be eliminated, or a completely new caHbration may need to be obtained. 
Some users of the external standards method create a new standard sample every time they analyze an unknown sample, and check the calibration at the same time they are predicting the concentration in the unknown sample. This is an excellent way of ensuring the calibration is still accurate.
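The calibrate now, predict later workflow and the check-standard test can be sketched in a few lines of Python. This is only an illustration: the standards, the function names, and the agreement tolerance are assumptions made for the example, since the text asks only for "reasonable" agreement.

```python
import numpy as np

# Illustrative standards (hypothetical values, not data from this chapter)
conc_std = np.array([10.0, 30.0, 50.0, 70.0, 90.0])   # known concentrations, %
abs_std = np.array([0.11, 0.29, 0.52, 0.69, 0.91])    # measured absorbances

# Calibrate now: least-squares line, absorbance = slope * concentration + intercept
slope, intercept = np.polyfit(conc_std, abs_std, 1)


def predict_concentration(absorbance):
    """Predict later: invert the calibration line for a measured absorbance."""
    return (absorbance - intercept) / slope


def calibration_still_valid(known_conc, measured_abs, tolerance):
    """Check the calibration with a freshly prepared standard.

    `tolerance` is the largest acceptable |known - predicted| difference.
    It is an assumed parameter that the analyst must choose.
    """
    return bool(abs(known_conc - predict_concentration(measured_abs)) <= tolerance)


unknown_conc = predict_concentration(0.40)
check_ok = calibration_still_valid(known_conc=40.0, measured_abs=0.405, tolerance=2.0)
```

Running the check standard alongside every unknown, as described above, amounts to calling `calibration_still_valid` each time `predict_concentration` is used.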
B. INTERNAL STANDARDS
The method of internal standards was developed to try to get around the problems of the external standards method. It involves adding a known amount of a known material, called the internal standard, to each standard sample and to each unknown sample. The internal standard should be chemically stable, not react or interact with other molecules in the sample, not interfere with the analyte absorbance, and have a unique absorbance band free of interferences. Once the spectrum is measured, the absorbance of the analyte is divided by the absorbance of the internal standard. The calibration curve is generated by plotting this absorbance ratio versus analyte concentration. For unknown samples, the ratio of the analyte and internal standard absorbances is calculated, and then used with the calibration line to predict the concentration of the analyte in the unknown.

The internal standards method assumes that any unknown source of error will affect the internal standard and analyte absorbances equally. When these absorbances are ratioed, the effects of the unknown variables are canceled, yielding a more accurate analysis. As a result, internal standards analyses are generally more accurate than external standards analyses. However, the internal standards method involves more work and calculations than the external standards method.

The internal standards method is useful for any type of sample where the pathlength is unknown or difficult to reproduce (such as in KBr pellets in FTIR). This is illustrated by first writing Beer's law equations for the analyte and the internal standard. (In this discussion, the subscript "a" stands for the analyte, and the subscript "is" stands for the internal standard.)
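$$A_a = \varepsilon_a l c_a$$

$$A_{is} = \varepsilon_{is} l c_{is}$$

When the two absorbances are ratioed, we obtain the following:

$$\frac{A_a}{A_{is}} = \frac{\varepsilon_a l c_a}{\varepsilon_{is} l c_{is}}$$

The pathlength for the analyte and the internal standard is the same because they are in the same sample. When the ratio is calculated the pathlength cancels, and the result is a pathlength-independent quantity:

$$\frac{A_a}{A_{is}} = \frac{\varepsilon_a c_a}{\varepsilon_{is} c_{is}} \qquad (2.17)$$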
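The cancellation of the pathlength can be checked numerically. The sketch below applies Beer's law to an analyte and an internal standard at two different pathlengths; the absorptivities, concentrations, and pathlengths are hypothetical values chosen only for illustration.

```python
def beer_absorbance(absorptivity, pathlength, concentration):
    """Beer's law: A = epsilon * l * c."""
    return absorptivity * pathlength * concentration


# Hypothetical values, not taken from the text
eps_a, c_a = 2.0, 0.30      # analyte absorptivity and concentration
eps_is, c_is = 1.5, 0.10    # internal standard absorptivity and concentration


def absorbance_ratio(pathlength):
    """Ratio A_a / A_is for a sample measured at the given pathlength."""
    a_analyte = beer_absorbance(eps_a, pathlength, c_a)
    a_internal = beer_absorbance(eps_is, pathlength, c_is)
    return a_analyte / a_internal


ratio_thin = absorbance_ratio(0.01)    # for example, a thin pellet
ratio_thick = absorbance_ratio(0.05)   # a pellet five times thicker
expected = (eps_a * c_a) / (eps_is * c_is)   # right-hand side of Equation (2.17)
```

Both ratios equal the right-hand side of Equation (2.17) to within floating-point rounding, even though the individual absorbances differ by a factor of five.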
A calibration obtained using Equation (2.17) would be pathlength independent, and is an example of how the internal standards method compensates for random variations in experimental parameters.

C. AN EXPERIMENTAL PROTOCOL FOR SINGLE ANALYTE ANALYSES
The following is an experimental protocol that, if followed, will help in the attainment of accurate single analyte calibrations.

Analyzing for a Single Analyte

1. Prepare the standards. Be as accurate as possible since any concentration error in the standards is carried through to generate error in the predicted concentrations of the unknown samples.
2. Take two aliquots of each sample and obtain separate spectra of them. In theory, these two spectra should be identical because they are obtained with the same sample, instrument, and sampling accessory.
3. Subtract the two spectra of each sample using a subtraction factor of one, giving a subtraction result or residual. Look closely at the residual. Two spectra of the same sample should be identical, and should give a residual that is a reasonably flat line containing only noise. Such a residual is seen in the top of Figure 2.6. The only features seen in this residual are noise and a CO2 peak at 2350 cm-1. The bottom residual in Figure 2.6 shows what happens when absorbance does vary between two spectra of the same sample. This spectrum shows analyte peaks, meaning something caused the absorbances to change. Since these are spectra of the same sample, the problem may be caused by the instrument or sampling variability. At a minimum, the two spectra of the standard should be re-measured and a new subtraction performed. It may take a good deal of experimentation and instrument adjustment to control the variable(s) causing the problem.
4. Examine the spectra and choose a band whose height or area changes with concentration (more on this below).
5. Measure the appropriate absorbance and plot the calibration line (see above). Examine the plot to ensure linearity, and then calculate its slope. Use the statistics discussed above to check the quality and robustness of the calibration.
6. Obtain the spectra of the unknown samples, and calculate the unknown concentrations using the calibration curve's slope and the absorbance of the unknown.

If these simple steps are followed, an accurate single analyte calibration can often be obtained.
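The subtraction test in step 3 can be sketched as follows. The spectra here are synthetic (one Gaussian band plus a small deterministic ripple standing in for noise, so the example is reproducible), and the acceptance rule, a residual excursion no larger than `margin` times the noise level, is an assumption made for illustration rather than a criterion from the text. A real residual may also show a CO2 feature, which this sketch ignores.

```python
import numpy as np


def subtraction_residual(spectrum_1, spectrum_2, factor=1.0):
    """Subtract two spectra; the protocol uses a subtraction factor of one."""
    return np.asarray(spectrum_1, dtype=float) - factor * np.asarray(spectrum_2, dtype=float)


def residual_is_flat(residual, noise_peak_to_peak, margin=2.5):
    """Return True if the residual stays within `margin` times the noise level.

    `margin` is a hypothetical acceptance threshold, not a value from the text.
    """
    excursion = residual.max() - residual.min()
    return bool(excursion <= margin * noise_peak_to_peak)


# Synthetic data (assumed, for illustration only)
wavenumber = np.linspace(3500.0, 1000.0, 501)
index = np.arange(wavenumber.size)
band = 0.5 * np.exp(-((wavenumber - 2973.0) / 15.0) ** 2)
ripple_1 = 0.001 * np.sin(0.37 * index)
ripple_2 = 0.001 * np.cos(0.53 * index)
noise_level = 0.002  # assumed peak-to-peak noise of a single spectrum

first_scan = band + ripple_1
repeat_scan = band + ripple_2          # same absorbance, different "noise"
drifted_scan = 1.2 * band + ripple_2   # absorbance changed between scans

good = residual_is_flat(subtraction_residual(first_scan, repeat_scan), noise_level)
bad = residual_is_flat(subtraction_residual(first_scan, drifted_scan), noise_level)
```

Here `good` is True (the residual contains only the ripple), while `bad` is False because the band survives the subtraction, which corresponds to the bottom residual of Figure 2.6.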
[Figure 2.6: two stacked residual spectra, the top one labeled "Noise Only"; x-axis: Wavenumber (cm-1), 3500 to 1000.]
Figure 2.6 Spectral residuals obtained by subtracting two spectra of the same sample. Top: No spectral features are seen, just noise and a CO2 peak. This indicates that there are no unusual sources of absorbance variability in the experiment. Bottom: Real spectral features, pointing up or down, indicate a source of absorbance variability exists in the experiment.
V. Measuring Absorbances Properly

A. PEAK AREAS VERSUS PEAK HEIGHTS
Correctly measuring the absorbance of a spectroscopic feature is more difficult than might first be imagined. You must first ensure that the spectral feature being measured is due to the analyte, and that there are no interfering absorbances from other components. It may appear that reading the absolute peak height directly off a spectrum will give an accurate absorbance. However, the baselines of spectra can drift up and down on the Y-axis, due to things like instrumental problems. As a result, spectra of the same sample measured at different points in time may give different absolute absorbances. To avoid this problem, it is common to draw a baseline connecting the beginning and ending points of a peak, and then measure the peak height or area with respect to that baseline. This is shown in Figure 2.7 for the C≡N stretching band in the mid-infrared spectrum of benzonitrile. Note the vertical lines in Figure 2.7 denoting the beginning and endpoints of the baseline. It is critical that these endpoints be chosen such that they include the entire analyte peak, but no spectral features due to other molecules present. The peak height is measured from the baseline to the top of the peak.
[Figure 2.7: absorbance versus wavenumber (cm-1), 2270 to 2180; peak at 2228 cm-1, Height = 0.445, Area = 6.58.]
Figure 2.7 An example of a spectroscopic absorbance band, the C≡N stretch of benzonitrile. Note the baseline drawn between the peak edges, used to remove baseline drift. Also, note the integration limits of 2270 and 2195 cm-1.
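The baseline construction of Figure 2.7 can be expressed in code. The sketch below draws a straight line between the spectrum points at the two chosen endpoints, subtracts it, and reports the height and the integrated area. The function name, the trapezoid-rule integration, and the synthetic peak (a Gaussian of height 0.4 near 2228 cm-1, not the actual benzonitrile spectrum) are assumptions made for illustration.

```python
import numpy as np


def baseline_corrected_peak(wavenumber, absorbance, start, end):
    """Peak height and area measured from a straight baseline.

    The baseline connects the spectrum points at the two endpoints, so any
    offset or linear drift on the Y-axis is removed from both results.
    """
    x = np.asarray(wavenumber, dtype=float)
    y = np.asarray(absorbance, dtype=float)
    order = np.argsort(x)                 # work in ascending wavenumber
    x, y = x[order], y[order]
    low, high = sorted((start, end))
    inside = (x >= low) & (x <= high)
    xs, ys = x[inside], y[inside]
    baseline = ys[0] + (ys[-1] - ys[0]) * (xs - xs[0]) / (xs[-1] - xs[0])
    corrected = ys - baseline
    height = float(corrected.max())
    # Trapezoid rule between the baseline endpoints
    area = float(np.sum(0.5 * (corrected[1:] + corrected[:-1]) * np.diff(xs)))
    return height, area


# Synthetic peak, with and without a drifting baseline (assumed values)
x = np.arange(2180.0, 2281.0, 1.0)
peak = 0.4 * np.exp(-((x - 2228.0) / 6.0) ** 2)
drift = 0.05 + 1e-4 * (x - 2195.0)

height_flat, area_flat = baseline_corrected_peak(x, peak, 2270, 2195)
height_drift, area_drift = baseline_corrected_peak(x, peak + drift, 2270, 2195)
```

The drifted and undrifted spectra give the same height and area, which is the point of drawing the baseline so that both endpoints touch the spectrum.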
The peak area is measured by integrating the peak between the baseline endpoints. In Figure 2.7 the baseline endpoints were 2270 and 2195 cm-1. By drawing the baseline such that the endpoints both touch the spectrum, spectral drift on the Y-axis is subtracted from the peak height or area calculation. This ultimately gives a more accurate and robust calibration.

Many of us instinctively choose to measure the peak heights at the peak top. Why is it done this way? The peak seen in Figure 2.7 consists of 40 data points. In theory, the absorbance at any of these points could be used to create a calibration since all of these points respond to changes in the analyte concentration. We use the data point at the top of the peak for two reasons. First, assuming the noise level across the peak is constant, the maximum absorbance at the peak top will have the best SNR of all the data points in the peak. Second, the slope of the spectrum at the top of a peak is close to or equal to zero, so a small error in the X-axis (ΔX), such as a shift in the measured wavenumber or the wavelength, will produce only a small shift in the measured absorbance (ΔY). However, on the steep sides of an absorbance band where the slope is large, a small change in the wavenumber or the wavelength means a large change in the absorbance. This is why absorbances measured on the side of the peak are more sensitive to X-axis error than absorbances measured near the top of a peak.

There is an ongoing debate as to whether peak areas or the peak heights provide better calibrations. Here is one author's opinion. When a peak height is measured, only one of the many data points in a peak whose
absorbance is sensitive to the analyte concentration is used in the analysis. This is like putting all your eggs in one basket. If anything causes the accuracy of this one data point to be compromised (an interferent in the unknown sample for example), the entire calibration is thrown off. This calibration would not be robust with respect to these types of absorbance errors. Peak areas are measured by adding the absorbances of many different data points together. For the band seen in Figure 2.7, the peak area is calculated by adding together the absorbance at 40 different wavenumbers. If one of these data points is inaccurate, it is averaged over the other 39 data points that are accurate. By using more data points, the calibration is less sensitive to error in any specific data point. Additionally, by adding the absorbances together, the SNR of the area increases with respect to that of the peak height. In general, peak areas produce more robust and accurate calibrations because they are not as dependent on a single data point, and the raw data has a better SNR. However, there are real world examples of chemical systems where peak height calibrations work better than peak area calibrations. Therefore, the author recommends that when establishing a calibration on a new chemical system, peak areas be tried first. If you have difficulty in getting a good calibration with peak areas, then try peak heights. There is no guarantee that this will fix your calibration, but it might. It is now trivial to measure the peak heights, the peak areas, and generate calibrations using computers. Therefore, it is straightforward to try both types of calibration and choose the one that works best.

It would be useful in this discussion to have some measure of the quality of the set of peak areas or peak heights used in a calibration. So far, we have discussed signals, noise, and SNRs as measures of spectral quality.
Typically, we use the absolute absorbance of a feature in a spectrum as a measure of its signal. How should we characterize the "signal" in a set of peak areas or heights? It is the change in absorbance with concentration, not individual absorbances, that we are interested in when calibrating. Thus, the average ΔA (peak area or peak height) for a spectral set characterizes the signal in the data. The noise in the data is measured as the average of the noise for spectra in the set. We can calculate the SNR for the spectral set plotted in Figure 2.3 using the data seen in Table 2-5. The noise levels in these spectra were measured as the peak-to-peak noise (maximum minus minimum noise value) in the 2600-2500 cm-1 spectral region. This region was chosen because the baseline was flat and free of absorbance bands. The ΔA values were calculated by subtracting sequential values from each other. The averages for ΔPeak Area, ΔPeak Height, and noise are in the bottom row of Table 2-5. The SNR we desire is the average ΔA divided by the average noise level. For the peak area data,
TABLE 2-5
Data used to calculate SNR for the IPA spectral set used in the calibration seen in Figure 2.4.

% IPA     Peak Area            Δ Peak Area    Peak Height     Δ Peak Height    Peak-to-Peak Noise
          (3004-2947 cm-1)                    (2973 cm-1)                      (2600-2500 cm-1)
9         1                    -              0.05            -                0.003
18        2.2                  1.1            0.1             0.05             0.002
35        4.9                  2.7            0.22            0.12             0.0015
53        8                    3.1            0.37            0.15             0.0012
70        11                   3              0.51            0.14             0.001
Average                        2.5                            0.11             0.0017
SNR = 2.5/0.0017 or 1471. For the peak height data, SNR = 0.11/0.0017 or 65. Thus, the peak area data have an SNR about 23 times better than the peak height data. This is to be expected since peak areas involve adding multiple absorbances together, whereas peak heights are a single absorbance. The point is that the higher SNR for the peak area data means calibrations based on this data are more robust with respect to absorbance error than the peak height data.

B. DEALING WITH OVERLAPPED PEAKS
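Before turning to overlapped peaks, the SNR arithmetic for Table 2-5 can be checked with a short script. The sketch uses the rounded averages from the bottom row of the table, as the text does, and separately confirms that the individual rows are consistent with those averages to within rounding. The variable and function names are illustrative.

```python
# Rows of Table 2-5
delta_area = [1.1, 2.7, 3.1, 3.0]
delta_height = [0.05, 0.12, 0.15, 0.14]
noise = [0.003, 0.002, 0.0015, 0.0012, 0.001]

# Rounded averages from the bottom row of Table 2-5
avg_delta_area = 2.5
avg_delta_height = 0.11
avg_noise = 0.0017


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


# SNR = average change in absorbance divided by average noise
snr_area = avg_delta_area / avg_noise
snr_height = avg_delta_height / avg_noise
improvement = snr_area / snr_height

print(round(snr_area), round(snr_height), round(improvement))
```

With the rounded averages this prints 1471, 65, and 23, matching the values quoted in the text. Using the unrounded row averages instead would shift these figures by a few percent.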
In an ideal world, we measure the absorbance of an analyte band that is baseline resolved, and has no interference from bands due to other molecules. The peak in Figure 2.7 is a good example of this. However, there are samples where analyte bands are overlapped with bands due to other molecules. The peak at 2973 cm-1 in Figure 2.8 is an example. It is an IPA peak superimposed upon a much broader and stronger absorbance band due to water. Are usable calibrations possible in such circumstances? The solution to overlapped peaks is judiciously drawing an appropriate baseline for the spectral feature being measured. Only data points whose absorbance is sensitive to the concentration of the analyte should be included in a peak height or area measurement. The advantage of using the peak heights or areas measured with an appropriate baseline is seen in the data in Table 2-6. The table contains nonbaseline corrected peak areas for the IPA peak at 2973 cm-1. The spectra from which these data are taken are seen in Figure 2.3. The calibration line obtained from the data in Table 2-6 is seen in Figure 2.9. We can compare the calibration line in Figure 2.9 to that in
[Figure 2.8: absorbance (up to about 1.2) versus wavenumber (cm-1), 3400 to 2600.]
Figure 2.8 An example of a shoulder peak, and how to draw a baseline to accurately measure its peak height or area.

TABLE 2-6
Nonbaseline corrected peak area and %IPA data.

Nonbaseline Corrected Area    %IPA
16.3                          9
19.9                          18
22.9                          35
23.2                          53
26.2                          70

[Figure 2.9: "IPA Uncorrected Peak Area Calibration"; %IPA (0 to 80) versus Peak Area (15 to 30); fitted line y = 6.3099x - 99.925, R = 0.948.]
Figure 2.9 The calibration line obtained using uncorrected IPA peak areas, based on data in Table 2-6.
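The line and correlation coefficient shown in Figure 2.9 can be reproduced from the data in Table 2-6 with an ordinary least-squares fit. The sketch below uses NumPy's `polyfit` and `corrcoef`; treating %IPA as the dependent variable follows the axes of Figure 2.9, and the variable names are illustrative.

```python
import numpy as np

# Data from Table 2-6
area = np.array([16.3, 19.9, 22.9, 23.2, 26.2])     # nonbaseline corrected peak areas
pct_ipa = np.array([9.0, 18.0, 35.0, 53.0, 70.0])   # %IPA in each standard

# Least-squares line: %IPA = slope * area + intercept
slope, intercept = np.polyfit(area, pct_ipa, 1)

# Correlation coefficient R between area and %IPA
r = np.corrcoef(area, pct_ipa)[0, 1]

print(f"y = {slope:.4f}x {intercept:+.3f}, R = {r:.3f}")
```

The fitted slope, intercept, and R agree with the values printed on Figure 2.9 (6.3099, -99.925, and 0.948).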
[Figure 2.10: absorbance versus wavenumber (cm-1), approximately 1260 to 1200.]
Figure 2.10 A series of overlapped peaks that share a common baseline, and how to draw baseline segments to measure their absorbances.
Figure 2.4. Both calibrations were obtained using the same integration limits and the same spectra; the difference is that the data in Table 2-1 were baseline corrected, while the data in Table 2-6 are not. Recall from earlier in this chapter that the correlation coefficient, R, is a measure of model quality, and that R = 1 is a perfect model. The calibration using baseline corrected peak areas has R = 0.998 as seen in Table 2-4. The model using nonbaseline corrected peak areas has R = 0.948, significantly worse than the baseline corrected model. By judiciously choosing the baseline endpoints that include only the IPA feature, the effect of the interferents and the baseline drift on the absorbance of the analyte has been reduced.

Another instance when it can be difficult to measure the absorbances properly is when a group of peaks overlap and are not baseline resolved, as illustrated in Figure 2.10. To deal with this situation, use the minima between the overlapped peaks as baseline endpoints, as shown in Figure 2.10. Used properly, this approach isolates the absorbances of the analytes, and removes the effects of baseline drift from the measurement.

C. CORRECTING FOR INTERFERENTS
When dealing with the determination of single analytes in a mixture, the analyte band may be overlapped with those of interferent molecules. As long as the interferent band appears as a shoulder on the analyte band and can be somewhat resolved, judicious drawing of baselines can
[Figure 2.11: three stacked spectra versus wavenumber (cm-1), labeled from top to bottom Polystyrene/Lexan Mixture, Lexan, and Pure Polystyrene.]
Figure 2.11 An example of an interferent band totally overlapping with an analyte band. The bottom spectrum is of polystyrene, the interferent. The middle spectrum is of the polycarbonate Lexan®, the analyte. The top spectrum is of a mixture of the two polymers. The dashed lines show that the polystyrene features are almost totally subsumed by the polycarbonate features in the mixture spectrum.
compensate for the interferent absorbance as shown above. However, there are times when the interferent bands totally overlap the analyte bands, making it difficult to draw proper baselines. An example of this is seen in Figure 2.11. The spectra in Figure 2.11 are of polystyrene, the polycarbonate Lexan®, and a mixture of the two. The dashed lines show that the polystyrene bands are almost totally subsumed by the polycarbonate bands. This is why we cannot draw a baseline to remove their absorbance from the Lexan absorbance. How then could we obtain a calibration for Lexan using these features without the polystyrene interference? A technique called interferent correction helps get around this problem. Here is how it works.

First, assume the analyte feature whose absorbance we want to use in a calibration is overlapped with a peak of one interferent molecule. We will call this spectral feature the analytical band. We will further assume that no other molecules but the analyte and the interferent contribute to the absorbance of the analytical band. Next, assume that the interferent has an absorbance band away from the analytical band that is isolated and due only to the interferent, called the interferent band. For this situation,
we can write the following equations:

$$A_t = A_a + A_i \qquad (2.18)$$

$$c_t = c_a + c_i \qquad (2.19)$$

where the indices t, a, and i represent total, analyte, and interferent respectively. Equation (2.18) states that the total absorbance of the analytical band is equal to the absorbance of the analyte plus the absorbance of the interferent. We are taking advantage here of a property of the absorbances called their additivity. This simply means that the total absorbance at any wavenumber is the sum of all the absorbances of all the species present at that wavenumber. Concentrations are also additive, which allows us to write Equation (2.19). Like any other quantitative analysis, we seek $c_a$, the concentration of the analyte. We rewrite Equation (2.19) to solve for $c_a$ to obtain

$$c_a = c_t - c_i \qquad (2.20)$$

which shows that the analyte concentration is simply the total concentration minus the interferent concentration. Using Beer's law we can write the following for the total absorbance:

$$A_t = \varepsilon_t l c_t \qquad (2.21)$$

and rearrange to obtain

$$c_t = \frac{A_t}{\varepsilon_t l} \qquad (2.22)$$

We can apply Beer's law to the interferent concentration similarly to obtain

$$A_i = \varepsilon_i l c_i \qquad (2.23)$$

and rearrange to obtain

$$c_i = \frac{A_i}{\varepsilon_i l} \qquad (2.24)$$

Substituting Equations (2.24) and (2.22) into Equation (2.20) gives

$$c_a = \frac{A_t}{\varepsilon_t l} - \frac{A_i}{\varepsilon_i l} \qquad (2.25)$$
Equation (2.25) gives us a means of correcting for the interferents even if the analyte and interferent features are completely overlapped. Note that the right-hand side of Equation (2.25) has two terms. The first term involves parameters for the analyte and interferent together, the second just the interferent. How would we use Equation (2.25) to predict the analyte concentrations? Recall from earlier in this chapter that the slope of a single analyte calibration line is equal to εl. What we need to do is make up a series of standards with known concentrations of the analyte and the interferent. Once the spectra of these standards are obtained, the absorbance of the analytical band and the interferent band are measured. Then, two independent calibrations can be performed. Using the analytical band and Equation (2.21), the slope of the calibration line is found to be ε_t l. Using the interferent band and Equation (2.23), the slope of the calibration line is found to be ε_i l. Prediction would simply be a matter of measuring A_t and A_i in the unknown spectrum, then using the slopes of the calibration lines and Equation (2.25) to obtain the analyte concentration.

Note from Figure 2.11 that if we did not have spectra of the pure analyte and interferent, we would not have been able to see how the polystyrene spectrum interfered with that of the Lexan spectrum. It is generally a requirement when performing single analyte analyses in mixtures to have pure spectra of all the different molecules present in the unknowns. You then need to closely examine these spectra to look for isolated analyte bands. If none are found, then either judicious drawing of baselines or the interferent technique discussed here needs to be used. If neither of these techniques works, or if there are multiple analytes needing to be determined with multiple interferents, the more powerful calibration algorithms discussed in Chapters 3 and 4 should be used.
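The prediction step built on Equation (2.25) reduces to a one-line calculation once the two calibration slopes are known. The following sketch assumes the slopes ε_t l and ε_i l have already been obtained from the two calibrations; the function name and all numerical values are hypothetical.

```python
def analyte_concentration(a_total, a_interferent, slope_total, slope_interferent):
    """Equation (2.25): c_a = A_t / (eps_t * l) - A_i / (eps_i * l).

    slope_total       -- slope of the analytical band calibration (eps_t * l)
    slope_interferent -- slope of the interferent band calibration (eps_i * l)
    """
    total_conc = a_total / slope_total                      # Equation (2.22)
    interferent_conc = a_interferent / slope_interferent    # Equation (2.24)
    return total_conc - interferent_conc                    # Equation (2.20)


# Hypothetical calibration slopes and unknown-sample absorbances
c_a = analyte_concentration(
    a_total=0.60,            # absorbance of the analytical band
    a_interferent=0.50,      # absorbance of the isolated interferent band
    slope_total=0.02,
    slope_interferent=0.05,
)
```

With these illustrative numbers the total concentration is 30, the interferent concentration is 10, and the predicted analyte concentration is 20, in whatever units the calibrations used.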
VI. Avoiding Experimental Errors

Avoiding errors is the key to obtaining good calibrations. It is convenient to divide error into different categories. Since for a quantitative spectroscopic calibration the two pieces of data are concentrations and absorbances, it is appropriate to speak of concentration errors and absorbance errors. This section will cover how to measure and avoid these types of errors, in addition to discussing general sources of experimental and instrumental errors.
A. SPOTTING OUTLIERS
The concentration error in each sample in a set of standards should be approximately the same. This happens if standards are made up following
an exact, reproducible experimental technique. The errors in the absorbances, which are based on the SNR of the spectrum, should also be approximately the same if careful experimental technique is followed. However, from time to time human or instrumental errors can cause a specific concentration or absorbance to have much more error than average. We will call those data points that have significantly more error than the rest of the data in a model outliers. Outliers can be spotted by subtracting the predicted data from the real data to obtain a residual. The size of the residual measures how well the model predicts a specific data point, such as a standard sample's concentration or absorbance. If the error across a set of data is approximately the same, then the model should predict the properties of each sample with approximately equal accuracy. If a specific sample has an unusually large residual compared to the rest of the data in a set, it may be an outlier. If all the other data points are well predicted, it probably means that the problem is with the data point rather than the model itself. Once an outlier is spotted, the data should be examined to try to determine why the outlier is present. This can possibly lead to removing the data from the calibration. The next two sections discuss how to spot outliers in the concentration and absorbance data.

1. Concentration Outliers
There are many reasons why analyte concentrations contain error, including mistakes in making up standards. Another problem is transcription errors. If the concentration value is recorded incorrectly, or if it is entered incorrectly into the calibration software, it will throw off a calibration. Lastly, reactions between the sample components, evaporation, or sample precipitation can also cause the predicted analyte concentration to be different from the actual. It is therefore important to pay attention to the preparation and condition of every standard sample. A standard whose actual and predicted concentrations lie far apart is called a concentration outlier. A quantity called the concentration residual can be used to spot concentration outliers. The concentration residual for a sample is calculated as follows:

$$R_c = c - c' \qquad (2.26)$$

where
R_c = concentration residual
c = actual concentration
c' = predicted concentration
TABLE 2-7
Actual, predicted, and residual concentration data for the IPA calibration in Figure 2.4.

Sample Number    c     c'      R_c
1                9     10.4    -1.4
2                18    17.7    0.3
3                35    34      1.0
4                53    52.8    0.2
5                70    70.9    -0.9
[Figure 2.12: "Concentration Residuals for IPA Calibration"; concentration residual (-2 to 2) versus Sample #.]
Figure 2.12 A plot of the concentration residuals from the IPA calibration, based on data from Table 2-7.
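The residuals of Table 2-7 follow directly from Equation (2.26). The short sketch below recomputes them from the actual and predicted concentrations in the table; rounding to one decimal place matches the precision of the table, and the variable names are illustrative.

```python
# Actual (c) and predicted (c') concentrations from Table 2-7
actual = [9.0, 18.0, 35.0, 53.0, 70.0]
predicted = [10.4, 17.7, 34.0, 52.8, 70.9]

# Equation (2.26): R_c = c - c', rounded to the precision of the table
residuals = [round(c - c_pred, 1) for c, c_pred in zip(actual, predicted)]

# The mean residual indicates where the "clump" of residuals is centered
mean_residual = sum(residuals) / len(residuals)

print(residuals)
```

This prints [-1.4, 0.3, 1.0, 0.2, -0.9], the R_c column of Table 2-7. The mean residual is -0.16, small compared with the individual residuals.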
Table 2-7 shows the concentration residuals and the actual concentrations for the IPA calibration seen in Figure 2.4. The equation listed in Figure 2.4 was used to predict the concentrations. A scatter plot of the concentration residuals of Table 2-7 is seen in Figure 2.12. Note that the X-axis on this plot is the sample number, and that the Y-axis is in concentration units. The residuals in Figure 2.12 fall in a "clump" around zero. This is normal if all the sources of errors in the concentration values are random. If the clump is centered on a value other than zero, this may indicate a systematic error in the concentration data. The clump should look like a "random scatter"; it should have no structure. If the points fell on a line, or had a recognizable shape to them, it would again indicate that the error sources are not random, and that a systematic error may be present. If one of the residuals falls far outside the clump, it might be an outlier. An example of a residual plot containing a suspected outlier is seen in Figure 2.13. Note the suspected outlier in Figure 2.13 has a residual of
[Figure 2.13: "IPA Concentration Residuals"; concentration residual (0 to -20) versus Sample #; the point at -15.9 is marked "Outlier?".]
Figure 2.13 A plot of concentration residuals containing a possible outlier.
-15.9, while the absolute value of the second largest residual is 1.4. The outlier's residual is more than 10 times that of any other data point. Spotting outliers is not an exact science; it must be done by looking at the shape of a residual plot, and the size of the residuals themselves. However, some rules of thumb are useful. If the absolute value of the residual for a specific datum is greater than 3 times the absolute value of the residual for any other sample, it is suspicious. You should re-examine how the data point was determined and where it came from. It could be a transcription error, a reaction taking place in the sample, or a mistake made in determining the standard's concentration. If a specific reason can be found to explain an outlier, it is legitimate to exclude it from the calibration. However, if no reason explaining a sample's large residual can be found, the judgment and experience of the analyst must come into play to determine what data to include and exclude in a calibration. A large residual may indicate that a sample represents an important source of variation that needs to be modeled. Alternatively, there may be something wrong with the data point but the reason is not identifiable. By the way, the outlier in Figure 2.13 is due to a transcription error.

2. Absorbance Outliers
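The rule of thumb given above for concentration residuals, which applies equally to the spectral residuals treated in this section, can be sketched as a small screening function. The reading of the rule used here (a residual is suspicious if its absolute value exceeds 3 times the largest absolute residual among the other samples) is one interpretation, and the function name is hypothetical. In the first example only the values -15.9 and -1.4 come from the discussion of Figure 2.13; the remaining residuals are illustrative.

```python
def suspicious_indices(residuals, factor=3.0):
    """Return the positions of residuals that exceed `factor` times every other one.

    This flags candidates for closer inspection only; it does not decide
    whether a point should be removed from the calibration.
    """
    flagged = []
    for i, r in enumerate(residuals):
        others = [abs(x) for j, x in enumerate(residuals) if j != i]
        if others and abs(r) > factor * max(others):
            flagged.append(i)
    return flagged


# Residual set containing a suspected outlier (partly illustrative values)
with_outlier = [-1.4, 0.3, -15.9, 0.2, -0.9]

# Residuals of Table 2-7, which contain no outlier
table_2_7 = [-1.4, 0.3, 1.0, 0.2, -0.9]

flagged_outlier = suspicious_indices(with_outlier)
flagged_clean = suspicious_indices(table_2_7)
```

The first set flags position 2 (the -15.9 residual), while the Table 2-7 residuals flag nothing. As the text stresses, a flagged point is a prompt to re-examine the sample, not an automatic exclusion.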
The error (noise) in a spectrum is frequently less than 1%, while concentration error is often greater than 1%. Noisy or false absorbance readings can throw a calibration off just as surely as concentration errors can. There are many sources of spectroscopic errors. Most spectrometers are
quite stable and measure spectral absorbances accurately and reproducibly. However, there are instrument malfunctions that can lead to inaccurate absorbance readings. In addition, incorrect absorbance readings can be caused by failure of the operator to control the variables affecting the measurement of a spectrum. All experimental parameters, such as the number of scans, resolution, and wavenumber range, should be identical for the standards and unknowns. Impurities can also interfere with absorbance measurements. If these impurities absorb at or near the same wavenumber as the analyte, and were not present in the standard samples, they can interfere with the accurate measurement of the analyte absorbances. After obtaining a calibration, it is possible to predict the absorbance of each standard spectrum (just like we can predict the concentrations) and compare it to the actual value. This comparison can be quantitated by calculating the spectral residual, which is given by the following formula:

$$R_s = A - A' \qquad (2.27)$$

where
R_s = spectral residual
A = actual absorbance
A' = predicted absorbance

Equation (2.27) assumes that only one absorbance measurement (be it a peak height or area) was used in the calibration. Table 2-8 contains the actual and predicted peak areas, and spectral residual values for the IPA standards discussed earlier in this chapter. The equation listed in Figure 2.4 was used to predict the peak areas. A scatter plot of the spectral residuals listed in Table 2-8 is shown in Figure 2.14. Note that the absolute values of the spectral residuals in Figure 2.14 are all on the order of 0.1. If one of the spectral residuals had been

TABLE 2-8
Actual, predicted, and residual peak area data for the IPA calibration.

Sample Number    A      A'      R_s
1                1      0.79    0.21
2                2.2    2.28    -0.08
3                4.9    5.09    -0.19
4                8      8.06    -0.06
5                11     10.9    0.1
[Figure 2.14: "IPA Peak Area Spectral Residual Plot"; spectral residual (-0.3 to 0.3) versus Sample #.]
Figure 2.14 Spectral residual scatter plot for the data listed in Table 2-8.
several times larger than the rest, it would be considered a spectral outlier. This would indicate that something about that sample is unusual, such as the presence of an interferent. At a minimum, the outlier spectrum should be looked at and compared to other standard spectra to attempt to identify why it has such a large residual. Please refer to the section above on concentration residuals for further discussion of how to ascertain when a datum is an outlier, and what to do about it.

B. EXPERIMENTAL ERRORS TO AVOID
Following is a list of the common experimental problems that can ruin quantitative analyses. Avoiding these problems is a key to ensuring the accuracy of calibrations.

1. Use standards with concentrations that bracket the expected concentration range of components in your unknowns. Predicted concentrations that fall outside the range used in the calibration are invalid. For example, if your standards contain from 10 to 90% IPA, only unknowns whose concentrations fall between 10 and 90% IPA can be analyzed. Your mathematical model cannot be applied to concentrations greater than 90% or less than 10% because you have no information on the behavior of the model in those concentration ranges. It is always tempting to extrapolate a model into a concentration range where there are no data; it is also always wrong to do so.

2. Use the actual components to be measured when preparing the standards. For example, imagine your job is to develop a calibration to measure the amount of acetone dissolved in water. The unknown samples will come from a factory floor, and are made using 95% pure technical grade acetone. You may have 99% pure reagent grade acetone in your lab. Since as chemists we are taught that "purer is better," you may be tempted to make up the standard samples using the reagent grade acetone. You may even get a perfectly good calibration with the reagent grade material. The problem is that it would be the wrong calibration! The reagent grade calibration would systematically overpredict the concentration of acetone in the unknowns made with the technical grade material by 4%. The technical grade material has 4% more impurities than the reagent grade material, so the acetone in it is 4% more dilute than in the reagent grade material. Also, remember that the fundamental assumption of quantitative spectroscopy is that the absorptivity of the standards and unknowns is the same. Because of the impurities in the technical grade acetone, the analyte absorptivity may be different than in the purer reagent grade material. This absorptivity drift may be enough to cause the reagent grade calibration to fail on the technical grade material. Remember that the calibration line is a mathematical model of your unknowns, and for the model to be accurate, your standards must be as similar to the unknowns as possible.

3. Run standards in random order to ensure that the sample order has no effect on the results. Obtain two spectra of each standard, and subtract them using a subtraction factor of 1.0 to see if a flat line is obtained. If not, it means that a variable in the sample or the experiment has caused two spectra of the same sample to be different. The source of variability must be found and eliminated before attempting to perform a quantitative analysis.

4. Be careful when making the standards. The accuracy of the final results can be no better than the accuracy to which the standards were prepared. If the concentrations in your standards are accurate to 1%, no amount of careful sample preparation, calibration line plotting, or statistical analysis will make the final results any more accurate than 1%. The accuracy with which you know the concentrations in the standards is the fundamental limit on the accuracy of a predicted concentration.

5. The minimum number of standards to run is 2n + 2, where n is the number of components to be analyzed. Thus, for a one component system the minimum number of standards to run is (2 × 1) + 2, or 4. However, running more standards typically will give a better calibration. The more data points that determine the calibration line, the more accurately the position of the line is known, and the more accurate the predictions will be.

6. Cleanliness is crucial! Thoroughly clean everything your sample comes in contact with before performing an analysis. Occasionally obtain the spectrum of the sample cells and sampling accessories without a sample present to check for contamination.
7. Only use absorbance bands that are less than 0.8 absorbance units. Beer's law is usually not obeyed by bands whose absorbances are greater than this. If necessary, dilute the sample or reduce the pathlength to get the absorbance to less than 0.8.

8. Be aware of the chemistry of your sample and sampling device. Make sure your sample components do not react with each other. This can be checked by obtaining spectra of the sample over time and looking for changes in the spectrum. Be aware of the materials that the sample cell is made of, and how they might react with any components in the sample. For example, KBr is a commonly used cell and window material in infrared spectroscopy. Like many salts, KBr is water soluble. Any sample containing liquid water will dissolve KBr windows, damaging your sampling accessory. Knowledge of the properties of cell and window materials should prevent this type of mistake.

9. The physical conditions of the standards and the unknowns, such as temperature and pressure, must be the same when their spectra are acquired. Daily fluctuations in temperature and pressure are difficult to control because of the weather. However, do not intentionally run your standards at room temperature, and then measure an unknown sample spectrum at 150°C in a heated cell. Absorptivity does depend upon the temperature, as stated in Chapter 1.

10. Use the same instrumental parameters, e.g. number of scans, resolution, and spectral range, for both the standards and the unknowns. You do not want changes in scanning parameters contributing to changes in your results. Many instrument software packages allow you to save a set of instrumental parameters on the computer's hard disk. By saving the parameters used when the standards are run, then loading the same parameter set when the unknowns are run, you guarantee that the same parameter set is used for both types of sample.

11. Do as little spectral manipulation as possible. Spectral manipulations alter the absorbances in a spectrum, and are difficult to reproduce from one spectrum to another. In effect, they are another source of variance. If you must manipulate your data before quantitation, all the standard and unknown spectra must be manipulated in exactly the same fashion.

12. Be consistent in the use of cells, windows, crystals, and sampling accessories for your standards and unknowns. For example, KBr and NaCl are both commonly used window materials in infrared spectroscopy, but the two materials have different optical properties. If a calibration is developed with a sample cell containing KBr windows, and unknown spectra are obtained using a cell with NaCl windows, the unmodeled spectral difference may be enough to introduce errors into the predicted concentrations. The solution to this problem is simply to use the KBr windows for both the standard and the unknown spectra.

13. Once a calibration is obtained and is working properly, occasionally check your calibration by running a new standard and comparing the known and calculated concentrations. If the concentrations are similar, the calibration is still good. If the concentrations do not agree, then changes in the sample, sampling accessory, or instrument over time may have contributed to the drift in quantitative results. You must track down the source of variability, eliminate it, and then run a new set of standards and plot a new calibration curve. Daily or weekly checks of the calibration can ensure accurate results, and contribute to the peace of mind of the analyst.

Despite the problems discussed here, thousands of quantitative spectroscopic analyses are performed each day, and quantitative analyses can be made routine. The point to remember is that if you develop the method properly using the information contained here, you will have fewer problems later when the method is actually being used.

C. INSTRUMENTAL DEVIATIONS FROM BEER'S LAW
1. The I₀ Problem
The definition of absorbance in Equation (1.13) shows that it depends on the ratio I₀/I. Recall that I₀ is the light intensity measured with no sample in the spectrometer, and I is the light intensity measured with the sample present. Because of Equation (1.13), spectra can suffer from an "I₀ problem." Every spectrometer that measures absorbance spectra has some way of measuring I₀. The "problem" is measuring an appropriate and accurate I₀. Ideally, the only thing that should be different when I and I₀ are measured is the presence of the sample in the light beam. All differences between I and I₀ are ascribed to the sample. However, if the I₀ used to calculate an absorbance spectrum is different from the true I₀ as a result of I₀ drift, inaccurate absorbances will result. Things that can contribute to I₀ drift include changes in the lamp intensity, dirt and dust on optical components, changes in the spectrometer alignment, and fluctuations in the line voltage powering an instrument.

How then should I₀ be measured? All aspects of the instrument and the spectral measurement need to remain constant between the measurements of I and I₀, including the number of scans, resolution, and apodization function. Additionally, the measurement of I and I₀ should take place as close together in time as possible to minimize I₀ drift. For what are called double beam spectrometers (typically dispersive instruments), I and I₀ are measured at the same time, so there is little opportunity for I₀ drift. For what are called single beam instruments (typically Fourier transform spectrometers such as FTIRs), I and I₀ are measured at different points in time. The longer the time between the measurements of I and I₀, the higher the likelihood that I₀ drift will occur. You should measure the drift of your instrument over time to quantify I₀ drift, and then use this information to determine the appropriate intervals at which to measure I₀.

In essence, the solution to the I₀ problem is reproducibility. Anything and everything that can contribute to changes in the measured light intensity must be controlled to obtain accurate absorbances.

2. Completely Resolving a Spectral Feature
When spectra are displayed, they look like continuous functions, as seen in Figure 1.1. However, spectra are actually composed of discrete points. This is because in most modern spectrometers the data are digitized, which means the absorbance has been sampled at specific wavenumbers rather than as a continuous function of wavenumber. Spectra are displayed by connecting adjacent data points with line segments, and this can be seen if a spectral display is expanded enough. The instrumental resolution used to measure a spectrum determines the number of data points in the spectrum. High resolution spectra contain more data points than low resolution spectra. For example, a 16 cm⁻¹ resolution spectrum nominally contains a data point every 16 cm⁻¹, and a 4 cm⁻¹ resolution spectrum nominally contains a data point every 4 cm⁻¹.

Some samples, particularly gases, have very narrow linewidths. Linewidth is determined by a number of things including temperature, pressure, chemical environment, and the physical state of a sample. Occasionally there will be situations where the inherent bandwidth of a line in a sample spectrum is less than the instrumental resolution used. This situation is illustrated in the top of Figure 2.15. The band is naturally 4 cm⁻¹ wide, and there is only a data point every 4 cm⁻¹. The peak maximum at 3000 cm⁻¹ falls between the wavenumbers where data points are measured, and so the peak height is not properly measured. The absorbance measured would be less than the true absorbance. The correct way to measure a spectrum is seen in the bottom of Figure 2.15. Now there is a data point every 1 cm⁻¹, so the peak maximum at 3000 cm⁻¹ is observed and the measured absorbance is correct.

Figure 2.15 Examples of the effect of resolution on the accuracy of the measured absorbance in spectra where the sample bandwidth is less than the instrumental resolution. "A" denotes absorbance, x denotes a data point, and all numbers are in cm⁻¹. The plotted range is 2994 to 3006 cm⁻¹, with the peak at 3000 cm⁻¹. Top (full width at half height = 4 cm⁻¹, one data point every 4 cm⁻¹): Too few data points were used to properly measure the peak. The absorbance measured as shown does not correspond to the top of the peak, and is too low. Bottom (full width at half height = 4 cm⁻¹, one data point every 1 cm⁻¹): Enough data points were used to measure the peak, and the measured absorbance value is correct.

A peak needs to consist of at least five data points to be characterized properly. This could be accomplished for the peak in Figure 2.15 by increasing the resolution to 2 cm⁻¹. At this resolution there would be data points at 2994, 2996, 2998, 3000, 3002, 3004, and 3006 cm⁻¹, more than enough to characterize the peak. Thus, the instrumental resolution used to measure a spectrum should be numerically smaller (that is, finer) than the linewidth of the peaks in the spectrum. For solids and liquids, which naturally have broad peaks, this is generally not a problem. For gases, where lines can be very narrow, this type of problem can occur.

On the other hand, the world is full of filter based spectrometers that work at very low resolution. They usually do not fully resolve the spectral features of a sample, and yet produce usable calibrations. Because of the phenomenon shown in Figure 2.15, the absorbance measured by these instruments is systematically less than the true absorbance. When absorbance is plotted versus concentration to obtain a calibration line in these situations, the line "bends over" as seen in Figure 2.16. This type of calibration line is usable as long as it is reproducible, but the ideal calibration line seen in Figure 2.16 is preferred.

There are many instrument problems that affect quantitative analyses not discussed here. These are typically a function of specific instrument design, and are beyond the scope of this book. Refer to the spectroscopy books in the bibliography for more information.
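The sampling effect illustrated in Figure 2.15 can be sketched numerically. The short Python example below is only an illustration under stated assumptions: the band is modeled as a Gaussian (the text does not specify a lineshape), its true peak absorbance is set to 1.0, and the function name `gaussian_band` is hypothetical. With a full width at half height of 4 cm⁻¹ and data points every 4 cm⁻¹ that straddle 3000 cm⁻¹, the nearest points sit 2 cm⁻¹ from the peak center, which is exactly the half-height position.

```python
import numpy as np

def gaussian_band(wavenumber, center=3000.0, fwhm=4.0, peak_absorbance=1.0):
    """Absorbance of an assumed Gaussian band with the given full width at half height."""
    return peak_absorbance * np.exp(-4.0 * np.log(2.0) * ((wavenumber - center) / fwhm) ** 2)

# Coarse sampling: one data point every 4 cm^-1, positioned so that 3000 is skipped.
coarse_grid = np.arange(2994.0, 3006.5, 4.0)   # 2994, 2998, 3002, 3006
# Fine sampling: one data point every 1 cm^-1, which includes 3000 itself.
fine_grid = np.arange(2994.0, 3006.5, 1.0)

coarse_max = gaussian_band(coarse_grid).max()
fine_max = gaussian_band(fine_grid).max()

print(f"largest absorbance on the 4 cm^-1 grid: {coarse_max:.3f}")
print(f"largest absorbance on the 1 cm^-1 grid: {fine_max:.3f}")
```

In this idealized case the coarse grid reports half of the true peak absorbance, while the fine grid recovers it. Real spectra will differ in lineshape and in where the data points fall relative to the peak, so the size of the error here should not be read as typical.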
Beer's Law Plot with Negative Deviation
Figure 2.16 Two Beer's law plots of absorbance versus concentration. In the ideal case, Beer's law is followed and the top line is obtained. If the spectral feature whose absorbance is being measured is not fully resolved, the measured absorbance will be systematically low, giving a negative deviation to the plot as seen in the bottom line.
Appendix: Experimental Details Used to Generate Spectroscopic Data

A solution of 70% by volume of IPA in water ("rubbing alcohol") was diluted with tap water to obtain samples containing the volume percents of IPA listed in Table 2-1. Changes in the volume upon mixing were ignored. Infrared spectra of the samples were obtained using 32 scans and 8 cm⁻¹ resolution on a Midac Corp. (Irvine, CA) M series FTIR. The spectrometer was equipped with an air cooled source, Ge on KBr beam splitter, and DTGS detector. A Pike Technologies (Madison, WI) horizontal ATR accessory equipped with a liquid trough plate was used to obtain spectra of the samples. The ATR crystal was made of ZnSe, and had nine "bounces." For one bounce, the depth of penetration was calculated to be approximately 5 microns, for an approximate total pathlength of 45 microns. Spectra were processed using the GRAMS/32 V. 4.04 software package from Galactic Industries (Salem, NH). Peak areas were measured using the "Integrate.AB" Array Basic program within GRAMS.
REFERENCES

[1] D. Albritton, A. Schmeltekopf, and R. Zare, in Molecular Spectroscopy: Modern Research, Volume II, K. Narahari Rao, Ed., Academic Press, New York, 1976.
[2] H. Mark and J. Workman, Statistics in Spectroscopy, Academic Press, New York, 1991.
[3] H. Mark, Principles and Practice of Spectroscopic Calibration, Wiley, New York, 1991.
[4] H. Robbins and J. Van Ryzin, Introduction to Statistics, Science Research Associates, Palo Alto, 1975.
[5] R. Walpole and R. Myers, Probability and Statistics for Engineers and Scientists, Macmillan, New York, 1978.

BIBLIOGRAPHY

J. Duckworth, "Spectroscopic Quantitative Analysis," chapter in Applied Spectroscopy: A Compact Reference for Practitioners, J. Workman and A. Springsteen, Eds., Academic Press, Boston, 1998.
W. George and H. Willis, Eds., Computer Methods in UV, Visible, and IR Spectroscopy, Royal Society of Chemistry, Cambridge, UK, 1990.
H. Freiser, Concepts & Calculations in Analytical Chemistry, CRC Press, Boca Raton, Florida, 1992.
S.V. Compton and D.A.C. Compton, in Practical Sampling Techniques for Infrared Analysis, P. Coleman, Ed., CRC Press, Boca Raton, Florida, 1993.
N. Colthup, L. Daly, and S. Wiberley, Introduction to Infrared and Raman Spectroscopy, Academic Press, Boston, 1990.
D. Shoemaker and C. Garland, Experiments in Physical Chemistry, McGraw-Hill, New York, 1967.
D. Skoog and D. West, Principles of Instrumental Analysis, Holt, Rinehart, & Winston, New York, 1971.
J. Robinson, Undergraduate Instrumental Analysis, Marcel Dekker, New York, 1995.
J. Ferraro and K. Krishnan, Eds., Practical Fourier Transform Infrared Spectroscopy, Academic Press, New York, 1990.
MULTIPLE COMPONENTS I: LEAST SQUARES METHODS
Before discussing multicomponent analysis, we need to review the terms component, analyte, and interferent. A component is any chemical species present in a sample. An analyte is a chemical species whose concentration we are interested in. Interferents are chemical species that are present in a sample whose concentrations are not of interest to us. They are called interferents because their spectra can interfere with the spectra of the analytes. The need to perform multiple component analyses arises when the concentrations of one or more species need to be determined in the presence of one or more interferents. There are a number of ways of attacking the problem of multiple component analysis. The method to choose depends upon the complexity of the spectra, the accuracy and speed of analysis desired, and, to a point, the skills of the analyst. The next two chapters will discuss the different methods of multiple component analysis. This chapter covers what are known as "least squares methods." These methods take many of the linear regression concepts from Chapter 2 and extend them to multiple components.
I. A Data Set for Multicomponent Analysis

One of the goals of this book is to illustrate the field of quantitative spectroscopic analysis using example calibrations obtained with real data. In addition, it is useful to have a single data set with which to compare different calibration algorithms. For the example calibration, four-component standard samples consisting of mixtures of isopropanol (IPA), ethanol (EtOH), acetone, and water were made. The analytes were IPA and EtOH, while acetone and water were the interferents. The amounts of the interferents in each sample were assumed to be unknown, and their concentrations were not included in the calibrations. This is a simple system, but it will illustrate the nuances and complexities of trying to analyze for multiple analytes in the presence of multiple interferents. Nine standard samples were made. Component amounts were measured volumetrically; the impact of mixing upon the solution volume was ignored. The volume percents of IPA and EtOH in each standard are listed in Table 3-1. For the method of independent determination calibration, all nine standards were used. For the P matrix calibration and for all the calibrations in Chapter 4 the standards were split into calibration and validation sets. Standards 1, 2, 4, 5, 8, and 9 comprised the calibration set. Standards 2, 6, and 7 comprised the validation set.
TABLE 3-1
Concentrations, in volume percent, of analytes in example calibrations.

Standard #   IPA     EtOH
1            18.2    0
2            4.6     0.8
3            18.3    1.5
4            27      9.0
5            0       7.5
6            7.7     9.1
7            32      0.8
8            13.8    6.6
9            54.4    3.2
Figure 3.1 The 12 spectra used in example calibrations. (Horizontal axis: wavenumber, 3500 to 1000 cm⁻¹.)
Mid-infrared spectra of the samples were obtained from 4000 to 700 cm⁻¹ using 32 scans and 8 cm⁻¹ resolution on a Midac Corp. (Irvine, CA) M series FTIR. The spectrometer was equipped with an air-cooled source, Ge on KBr beamsplitter, and DTGS detector. A Pike Technologies (Madison, WI) "MIRacle" single bounce ATR accessory equipped with a ZnSe crystal was used. The approximate depth of penetration was 5 microns. Replicate spectra of each sample were measured, and were processed using the GRAMS/32 V. 4.04 software package from Galactic Industries (Salem, NH). The replicate spectra of the calibration set are shown in Figure 3.1. Peak areas were calculated using an Array Basic computer program written by the author. For the P matrix calibration, OS-MLR software from Optical Solutions (Folsom, CA) was used. This package is written in the Array Basic programming language, and runs alongside the GRAMS/32 software package. The calibration set is used for all the subsequent calibrations discussed in this book.
II. The Method of Independent Determination (MID)

The term independent determination means that the spectra of the components are independent of each other. Practically, this means each analyte must have a lone absorbance band free of interference from other species. This method is in many ways an extension of the single component techniques discussed in Chapter 2. However, instead of ending up with one calibration line and equation, there will be a separate calibration line and equation for each analyte. The Beer's law expressions for an independent two analyte system are

\[
\begin{aligned}
A_1 &= \varepsilon_{1a}\, l\, c_a \\
A_2 &= \varepsilon_{2b}\, l\, c_b
\end{aligned}
\tag{3.1}
\]

where
A = absorbance at a given wavenumber
ε = absorptivity at a given wavenumber for a given component
l = pathlength
c = concentration

The subscripts 1 and 2 denote different wavenumbers, and the subscripts a and b refer to the analytes. Note that the absorbance, absorptivity, and concentration in these equations are different. These equations share no parameters, which is why they are independent of each other. Hence the name of the technique, the method of independent determination (MID). If there were spectral overlap for any of the component features, Equations (3.1) would not be adequate to describe the situation, and extra terms including the interferent absorbance would be needed to model the situation properly.

As an example of how to use MID, we will separately generate calibration lines for the determination of IPA and EtOH in the presence of acetone and water, using the calibration set discussed above. When using MID, an important first step is to consult the pure spectra of the components that will be present in the samples. You should look for bands or regions where the analyte absorbs, but where nothing else absorbs. An example of this is shown in Figure 3.2, which contains the pure spectra of water, acetone, EtOH, and IPA, the four components present in the samples analyzed. There are many regions in Figure 3.2 where two or more of the components absorb. However, close inspection shows that IPA has a peak at 948 cm⁻¹ and EtOH has a peak at 1045 cm⁻¹ that appear relatively free of interferences. These peaks are marked by the vertical lines in Figure 3.2, and may be well suited for use as analyte bands in MID.

The next step in MID is to look at the spectra of the standards themselves. Upon mixing, peak positions, intensities, and widths can change, introducing interferences that were not present in the pure spectra. Therefore, it is important to examine closely the wavenumber region of the proposed
Figure 3.2 The pure spectra of water, acetone, ethanol (EtOH), and isopropanol (IPA). These are the four components in the samples analyzed. The vertical lines denote wavenumbers at which the absorbances of IPA and EtOH are relatively free of interference from the other components. (Horizontal axis: wavenumber, 3500 to 1000 cm⁻¹.)

(Figure 3.3 plot labels: EtOH/IPA interferent peak, 1092 cm⁻¹; EtOH peak, 1045 cm⁻¹; IPA peak, 948 cm⁻¹; acetone interferent peak, 902 cm⁻¹. Horizontal axis: wavenumber, 1050 to 900 cm⁻¹; vertical axis: absorbance.)
Figure 3.3 The two analyte bands used to generate calibrations via MID. The EtOH calibration peak is at 1045 cm⁻¹ and the IPA calibration peak is at 948 cm⁻¹. Potential interferent peaks are noted.

analyte bands in the standard spectra. A plot of the standard spectra between 1100 and 900 cm⁻¹ is seen in Figure 3.3. A close examination of Figure 3.3 shows that the EtOH analyte peak at 1045 cm⁻¹ is slightly overlapped with a feature at 1092 cm⁻¹ which is
TABLE 3-2
Volume % and peak areas for IPA calibration line from MID. The calibration line is plotted in Figure 3.4. Peak areas were integrated from 964 to 933 cm⁻¹.

Spectrum #   % IPA   IPA Area (964-933 cm⁻¹)
1            18.2    0.0835
2            18.2    0.0837
3            4.6     -0.0209
4            4.6     -0.0211
5            18.3    0.0754
6            18.3    0.0751
7            27      0.132
8            27      0.132
9            0       -0.0304
10           0       -0.0308
11           7.7     0.0152
12           7.7     0.0136
13           32      0.176
14           32      0.175
15           13.8    0.05
16           13.8    0.047
17           54.4    0.318
18           54.4    0.316
itself an overlap of an IPA and an EtOH peak. The IPA analyte band at 948 cm⁻¹ rides on top of a broader absorbance due to acetone at 902 cm⁻¹. So, neither of these peaks is perfectly free of interfering absorbances, illustrating that even in a relatively simple four-component mixture it is not easy to find analyte bands free of interference. We will calibrate using the IPA peak at 948 cm⁻¹ and the EtOH peak at 1045 cm⁻¹. The raw data, calibration lines, calibration metrics, and residual plots for the IPA MID analysis are seen in Table 3-2 and Figures 3.4 to 3.6. Results for the EtOH analysis are seen in Table 3-3 and Figures 3.7 to 3.9. Calibrations were plotted using the inverse form of Beer's law, where concentration is plotted versus absorbance, for reasons outlined in Chapter 2. The calibration line using the integrated peak intensity for IPA from 964 to 933 cm⁻¹ is seen in Figure 3.4. There are several data points below zero absorbance units. This does not mean that the sample gave off more light than impinged upon it. The negative absorbances are an artifact of the integration wavenumbers chosen. In principle, one should use the same integration limits for all the standard and unknown spectra when
Figure 3.4 IPA calibration line using the method of independent determination for a four-component mixture containing IPA, EtOH, water, and acetone. The peak area from 964 to 933 cm⁻¹ was used. (Plot title: IPA MID Calibration, 4 Component Mixture; fitted line: y = 152.06x + 6.1297, R = 0.998; horizontal axis: peak area, 964-933 cm⁻¹.)

Figure 3.5 Concentration residual plot for IPA. Calibration obtained using MID. (Plot title: IPA MID Concentration Residual Plot; horizontal axis: sample number.)
quantitating. The limits chosen were a compromise aimed to work best on most of the spectra. Unfortunately, the limits chosen lead to "negative absorbances" for a few of the lower concentration samples. These absorbances were used in the calibration as is. The concentration residual and spectral residual plots for the IPA calibration are seen in Figures 3.5 and 3.6. There are no apparent outliers in the concentration data. In the spectral residual plot, the data for spectrum #4 look suspicious; its residual is three times greater than the next highest value. Close inspection of the spectrum, and comparison with a replicate spectrum of the same sample, showed no obvious problems with the data, so this spectrum was included in the calibration. The IPA calibration metrics appear in Table 3-4. The metrics were calculated using
Figure 3.6 Spectral residual plot for IPA. Calibration obtained using MID. (Plot title: IPA MID Spectral Residual Plot; horizontal axis: sample number.)

Figure 3.7 Ethanol calibration line using MID for a four-component mixture containing IPA, EtOH, water, and acetone. The peak area from 1060 to 1010 cm⁻¹ was used. (Plot title: EtOH MID Calibration, 4 Component Mixture; fitted line: y = 35.663x + 2.604, R = 0.987; horizontal axis: peak area, 1060-1010 cm⁻¹.)
equations listed in Chapter 2. The standard deviation, correlation coefficient, and F for regression are all quite good. This indicates that the small overlap of the IPA analyte peak with the nearby acetone peak has not severely damaged the quality of this calibration. Figure 3.7 contains an EtOH calibration line using the integrated peak area from 1060 to 1010 cm⁻¹. Like the IPA calibration in Figure 3.4, the EtOH calibration has several data points below zero absorbance units. Again, this is an artifact of the choice of integration limits, and does not indicate any experimental problems. The concentration residual and spectral residual plots for EtOH appear in Figures 3.8 and 3.9. There are no obvious outliers in these plots, so all 18 spectra are included in the final calibration. The calibration metrics are seen in Table 3-4. The EtOH calibration is not as good as the IPA calibration. There are several possible explanations for this. First, the EtOH concentrations modeled were on
TABLE 3-3
Volume % and peak areas for EtOH calibration line from MID. The calibration line is plotted in Figure 3.7. Peak areas were integrated from 1060 to 1010 cm⁻¹.

Spectrum #   % EtOH   EtOH Area (1060-1010 cm⁻¹)
1            0        -0.0735
2            0        -0.0738
3            0.8      -0.0629
4            0.8      -0.0627
5            1.5      -0.0309
6            1.5      -0.0306
7            9        0.198
8            9        0.197
9            7.5      0.11
10           7.5      0.111
11           9.1      0.171
12           9.1      0.172
13           0.8      -0.042
14           0.8      -0.042
15           6.6      0.107
16           6.6      0.107
17           3.2      0.0414
18           3.2      0.0488
average lower than the IPA concentrations modeled. Thus, the EtOH peaks were probably smaller, noisier, and closer to the detection limit than the IPA peaks. Second, the interferent peak at 1092 cm⁻¹ may have caused more of a problem for the EtOH analyte peak than the interferent band at 902 cm⁻¹ did for the IPA analyte peak. Third, a more judicious choice of integration limits might improve the calibration. The quality of these calibrations shows that MID can work well as long as isolated, well resolved component bands can be found. However, for mixtures of many components it becomes increasingly difficult to find isolated bands for all the analytes. To solve this problem, the methods outlined below and in Chapter 4 can be tried.
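The MID calibration lines can be checked directly against the tabulated data. The sketch below is not the procedure used to produce the book's figures (those were generated with the software described earlier); it simply assumes an ordinary least squares straight-line fit with NumPy, applied to the values in Tables 3-2 and 3-3 in the inverse Beer's law form (concentration versus peak area). The function name `inverse_beer_fit` is hypothetical.

```python
import numpy as np

# Volume % IPA and integrated peak areas (964 to 933 cm^-1), copied from Table 3-2.
ipa_percent = np.array([18.2, 18.2, 4.6, 4.6, 18.3, 18.3, 27, 27, 0, 0,
                        7.7, 7.7, 32, 32, 13.8, 13.8, 54.4, 54.4])
ipa_area = np.array([0.0835, 0.0837, -0.0209, -0.0211, 0.0754, 0.0751,
                     0.132, 0.132, -0.0304, -0.0308, 0.0152, 0.0136,
                     0.176, 0.175, 0.05, 0.047, 0.318, 0.316])

# Volume % EtOH and integrated peak areas (1060 to 1010 cm^-1), copied from Table 3-3.
etoh_percent = np.array([0, 0, 0.8, 0.8, 1.5, 1.5, 9, 9, 7.5, 7.5,
                         9.1, 9.1, 0.8, 0.8, 6.6, 6.6, 3.2, 3.2])
etoh_area = np.array([-0.0735, -0.0738, -0.0629, -0.0627, -0.0309, -0.0306,
                      0.198, 0.197, 0.11, 0.111, 0.171, 0.172,
                      -0.042, -0.042, 0.107, 0.107, 0.0414, 0.0488])

def inverse_beer_fit(area, conc):
    """Fit conc = slope * area + intercept by ordinary least squares.

    Returns the slope, intercept, correlation coefficient, and the
    concentration residuals (known minus predicted concentration).
    """
    slope, intercept = np.polyfit(area, conc, 1)
    r = np.corrcoef(area, conc)[0, 1]
    residuals = conc - (slope * area + intercept)
    return slope, intercept, r, residuals

ipa_slope, ipa_intercept, ipa_r, ipa_resid = inverse_beer_fit(ipa_area, ipa_percent)
etoh_slope, etoh_intercept, etoh_r, etoh_resid = inverse_beer_fit(etoh_area, etoh_percent)

print(f"IPA:  y = {ipa_slope:.2f}x + {ipa_intercept:.3f}, R = {ipa_r:.3f}")
print(f"EtOH: y = {etoh_slope:.2f}x + {etoh_intercept:.3f}, R = {etoh_r:.3f}")
```

Under these assumptions the fitted slopes, intercepts, and correlation coefficients should agree closely with the lines reported in Figures 3.4 and 3.7. The residual arrays can be plotted against spectrum number to give plots analogous to the concentration residual plots.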
Figure 3.8 Concentration residual plot for EtOH. Calibration obtained using MID. (Plot title: EtOH MID Concentration Residual Plot; horizontal axis: sample number.)

Figure 3.9 Spectral residual plot for EtOH. Calibration obtained using MID. (Plot title: EtOH MID Spectral Residual Plot; horizontal axis: sample number.)

TABLE 3-4
Calibration metrics for IPA and EtOH using MID.

Calibration   Standard Deviation (σ)*   Correlation Coefficient (R)   F for Regression
IPA           0.96                      0.998                         4544
EtOH          0.58                      0.987                         622

*Units of Std. Dev. are volume percent of analyte.

III. Simultaneous Determination of Multiple Components

The methods described in the rest of this chapter are used for the simultaneous analysis of multicomponent mixtures. Multicomponent methods can work even if the analyte features in a sample spectrum are overlapped with each other or with interferent absorbances. The basis of all quantitative analyses is Beer's law, which was described in Chapter 2. The following discussion assumes familiarity with Beer's law, single analyte analysis, and the experimental details of performing quantitative analyses described earlier in this book.
A. THE ADDITIVITY OF BEER'S LAW
All multicomponent analyses are based on a property of Beer's law called its additivity. This simply means that the absorbance at a specific wavenumber equals the sum of the absorbances of all chemical species at that wavenumber. This idea is expressed in equation form as follows. For a three species system, where the species are denoted by the subscripts d, e, and f, the total absorbance at a given wavenumber is given by

\[
A_T = A_d + A_e + A_f \tag{3.2}
\]

where
A_T = the total absorbance at a given wavenumber
A_d = the absorbance of component d
A_e = the absorbance of component e
A_f = the absorbance of component f

If each species follows Beer's law, a separate Beer's law expression for each is written and substituted into Equation (3.2) to obtain

\[
A_T = \varepsilon_d\, l\, c_d + \varepsilon_e\, l\, c_e + \varepsilon_f\, l\, c_f \tag{3.3}
\]

where
ε_d, ε_e, ε_f = the absorptivity of component d, e, or f
l = pathlength
c_d, c_e, c_f = the concentration of component d, e, or f

We cannot solve for the three concentrations c_d, c_e, and c_f with just one equation. We need at least three equations to solve for the three unknowns. In the method of simultaneous determinations, measuring the absorbance at three different wavenumbers allows three equations in three unknowns to be written and solved. The equations are given below; components are denoted by subscripts d, e, and f, and the three wavenumbers are denoted by subscripts 1, 2, and 3:

\[
\begin{aligned}
A_1 &= \varepsilon_{1d}\, l\, c_d + \varepsilon_{1e}\, l\, c_e + \varepsilon_{1f}\, l\, c_f \\
A_2 &= \varepsilon_{2d}\, l\, c_d + \varepsilon_{2e}\, l\, c_e + \varepsilon_{2f}\, l\, c_f \\
A_3 &= \varepsilon_{3d}\, l\, c_d + \varepsilon_{3e}\, l\, c_e + \varepsilon_{3f}\, l\, c_f
\end{aligned}
\tag{3.4}
\]
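As a numerical illustration of Equations (3.4), the sketch below sets up three equations in three unknowns and solves them with NumPy. Every number in it is made up for the purpose of the example: the array `K` holds hypothetical products of absorptivity and pathlength, and `true_conc` holds hypothetical concentrations. It is meant only to show that three absorbances measured at three wavenumbers determine three concentrations when the absorptivities are known.

```python
import numpy as np

# Hypothetical (absorptivity x pathlength) values, for illustration only.
# Row i corresponds to wavenumber i (1, 2, 3); column j to component j (d, e, f).
K = np.array([[0.50, 0.10, 0.05],
              [0.08, 0.40, 0.12],
              [0.02, 0.15, 0.60]])

# Made-up concentrations c_d, c_e, c_f used to simulate the measured absorbances.
true_conc = np.array([2.0, 1.0, 3.0])

# Additivity of Beer's law: each absorbance is the sum of the three component terms.
A = K @ true_conc

# Solve the three simultaneous equations for the three concentrations.
solved_conc = np.linalg.solve(K, A)

print("absorbances A1, A2, A3:", A)
print("recovered concentrations:", solved_conc)
```

The solve step recovers the concentrations that were used to generate the absorbances. With real spectra the absorptivities are not known in advance and the absorbances contain noise, which is why the calibration methods in the rest of this chapter are needed.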
Systems of equations such as (3.4) are called simultaneous equations because they share variables, i.e. they depend on each other. In this case, the three concentrations of interest, c_d, c_e, and c_f, appear in all three equations. In theory, these equations can be solved algebraically. However, it is easier to represent and organize these equations using matrix notation and matrix algebra, which are discussed below.

B. INTRODUCTION TO MATRIX ALGEBRA
A matrix is simply a table of numbers. Matrices have specific properties and can be manipulated mathematically. For a review of matrix algebra see Reference [1], or any college level textbook on linear algebra. Here we will introduce only those matrix algebra concepts needed to understand how to generate calibrations simultaneously for multiple components.

An example of a matrix is seen in Table 3-5. This matrix is denoted by the letter Q (upper case letters in bold will be used to denote matrices in this book). Each number in a matrix is called a matrix element. To denote a matrix element, the matrix name is stated with subscripts for the row and column numbers. The number 18 in matrix Q is denoted $Q_{22}$ and is called the "2,2 element" of Q. The number 35 is the "3,1 element" of Q. The size of a matrix is denoted by stating the total number of rows, then the total number of columns. Q is called a 3 x 3 or "three by three" matrix because it has three rows and three columns. Any generic matrix with m rows and n columns is called an "m x n" matrix and is said to have order m x n. If the number of rows equals the number of columns, a matrix is said to be square with order n x n. The matrix Q is square. If a matrix consists of just one column or one row, it is called a vector. Any of the rows or columns of Q by themselves can be considered vectors (vectors will be denoted by bold upper case letters the same as matrices).

TABLE 3-5
An example of a matrix of numbers. This matrix is called Q.

Row #    Column #1    Column #2    Column #3
1        36           17           90
2        24           18           60
3        35           19           88

One of the matrix operations used in generating multicomponent calibrations is called a matrix transpose. This involves swapping the rows and the columns of a matrix. The transpose of Q is seen in Table 3-6. The transpose of a matrix is denoted by adding a superscript T to the matrix's name. The matrix in Table 3-6 is properly denoted $\mathbf{Q}^{T}$, and pronounced "Q transpose."

Another matrix operation used to calculate calibrations is matrix multiplication. Assume for the moment that we wish to multiply two matrices X and Y to give the result matrix Z. We begin with the matrix equation

$$
\mathbf{Z} = \mathbf{X}\mathbf{Y}
\tag{3.6}
$$
Table 3-7 lists the elements of matrices X and Y. For simplicity we have made X and Y the transpose of each other (the operations shown here apply even if two matrices are not the transpose of each other). That is,

$$
\mathbf{X}^{T} = \mathbf{Y} \quad \text{and} \quad \mathbf{Y}^{T} = \mathbf{X}
\tag{3.7}
$$

TABLE 3-6
The transpose of matrix Q, properly denoted $\mathbf{Q}^{T}$.

$$
\mathbf{Q}^{T} =
\begin{bmatrix}
36 & 24 & 35 \\
17 & 18 & 19 \\
90 & 60 & 88
\end{bmatrix}
$$

TABLE 3-7
The matrices X and Y.

$$
\mathbf{X} =
\begin{bmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{bmatrix}
\qquad
\mathbf{Y} =
\begin{bmatrix}
1 & 4 & 7 \\
2 & 5 & 8 \\
3 & 6 & 9
\end{bmatrix}
$$
Remembering that we are calculating Z = XY, matrix multiplication begins by taking the elements of the first row of X and multiplying them by the corresponding elements of the first column of Y. Then these products are added together to produce the 1,1 element of the result matrix Z as seen below.

$$
\begin{aligned}
Z_{11} &= X_{11}Y_{11} + X_{12}Y_{21} + X_{13}Y_{31} \\
Z_{11} &= 1 \times 1 + 2 \times 2 + 3 \times 3 \\
Z_{11} &= 14
\end{aligned}
\tag{3.8}
$$

Square matrices were used here for simplicity's sake. However, matrix multiplication is not restricted to square matrices. Any two matrices can be multiplied together as long as the number of columns in the first matrix equals the number of rows in the second matrix. To put it more mathematically, any matrix X of order m x n and Y of order q x p may be multiplied together if n = q. For any two matrices, X of order m x n and Y of order n x p, the matrix product Z is given by [1]

$$
z_{ij} = \sum_{k=1}^{n} x_{ik}\, y_{kj}
$$

where
i = 1, 2, ..., m
j = 1, 2, ..., p
z, x, y = individual matrix elements of matrices Z, X, and Y respectively.

This was the formula used in Equation (3.8). The rest of Z is constructed by multiplying the appropriate rows and columns and adding the products together. The structure of Z before adding the products together is shown in Table 3-8. The result of multiplying matrices X and Y is seen in Table 3-9. Unlike when multiplying regular numbers ("scalars"), the order in which matrices are multiplied matters. Thus, generally,

$$
\mathbf{X}\mathbf{Y} \neq \mathbf{Y}\mathbf{X}
$$

and care must be taken when multiplying matrices to specify in what order the operations should be performed, and to ensure that the calculations actually follow that order.

TABLE 3-8
The results matrix Z obtained by multiplying matrices X and Y, shown before the products are added together.

$$
\mathbf{Z} =
\begin{bmatrix}
1 + 4 + 9 & 4 + 10 + 18 & 7 + 16 + 27 \\
4 + 10 + 18 & 16 + 25 + 36 & 28 + 40 + 54 \\
7 + 16 + 27 & 28 + 40 + 54 & 49 + 64 + 81
\end{bmatrix}
$$

TABLE 3-9
The results matrix Z obtained by multiplying matrices X and Y.

$$
\mathbf{Z} =
\begin{bmatrix}
14 & 32 & 50 \\
32 & 77 & 122 \\
50 & 122 & 194
\end{bmatrix}
$$

Another matrix operation important in generating multicomponent calibrations is matrix inversion. To understand matrix inverses, we first have to define a special kind of matrix called the identity matrix. This matrix has 1's as its diagonal elements and all other elements are zeros, as seen in Table 3-10.

TABLE 3-10
A 3 x 3 identity matrix.

$$
\mathbf{I} =
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{bmatrix}
$$

The special property of the identity matrix is that when multiplied by any other matrix, it reproduces that matrix. Thus,

$$
\mathbf{X}\mathbf{I} = \mathbf{I}\mathbf{X} = \mathbf{X}
\tag{3.9}
$$

assuming the orders of X and I allow matrix multiplication to be performed as discussed above. Multiplying a matrix by I is analogous to multiplying an individual number by 1.
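The operations described so far can be checked numerically. The short sketch below uses Python with the NumPy library, which is our choice for illustration and not something the text prescribes; the variable names are likewise ours. It re-creates matrix Q from Table 3-5 and matrices X and Y from Table 3-7, then forms the transpose, the product Z = XY, and the identity products of Equation (3.9).

```python
import numpy as np

# Matrix Q from Table 3-5, entered row by row.
Q = np.array([[36, 17, 90],
              [24, 18, 60],
              [35, 19, 88]])

# NumPy counts rows and columns from 0, so the "2,2 element" is Q[1, 1]
# and the "3,1 element" is Q[2, 0].
q22 = Q[1, 1]
q31 = Q[2, 0]

# Transpose: rows and columns are swapped (compare Table 3-6).
Q_T = Q.T

# Matrices X and Y from Table 3-7; Y is the transpose of X.
X = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
Y = X.T

# Matrix product Z = XY (Equation 3.6). The @ operator applies the
# row-times-column rule used in Equation (3.8).
Z = X @ Y

# Reversing the order gives a different matrix: XY is not equal to YX.
Z_reversed = Y @ X

# Identity matrix of Table 3-10: XI = IX = X (Equation 3.9).
I = np.eye(3, dtype=int)

print("Q22 =", q22, " Q31 =", q31)
print("Z = XY:\n", Z)
print("XY equals YX?", np.array_equal(Z, Z_reversed))
print("XI equals X?", np.array_equal(X @ I, X))
```

The printed Z should match Table 3-9, and the comparison of XY with YX illustrates why the order of multiplication must be tracked carefully.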
A matrix inverse is best thought of as a form of matrix division. The inverse of any matrix X is denoted as $\mathbf{X}^{-1}$. The inverse of any matrix X is that matrix that when multiplied by X produces the identity matrix. Thus,

$$
\mathbf{X}\mathbf{X}^{-1} = \mathbf{I}
\tag{3.10}
$$

In this context, multiplying a matrix by its inverse is equivalent to multiplying an individual number by its reciprocal to produce the result of 1. Not all matrices have inverses. There are a number of theorems of matrix algebra [1] that deal with whether matrices have inverses, the properties of matrices that do have inverses, and how to calculate those inverses. A discussion of these theorems is beyond the scope of this book.

C. THE MATRIX FORM OF BEER'S LAW
Drawing the analogy between tables of numbers and matrices, we can rewrite each group of variables in Equations (3.4) as matrices or vectors, using curly brackets to denote them. The three absorbances from Equations (3.4) can be rewritten as the vector A:

$$
\mathbf{A} =
\begin{Bmatrix}
A_1 \\ A_2 \\ A_3
\end{Bmatrix}
$$

The nine absorptivities comprise the matrix E:

$$
\mathbf{E} =
\begin{Bmatrix}
\varepsilon_{1d} & \varepsilon_{1e} & \varepsilon_{1f} \\
\varepsilon_{2d} & \varepsilon_{2e} & \varepsilon_{2f} \\
\varepsilon_{3d} & \varepsilon_{3e} & \varepsilon_{3f}
\end{Bmatrix}
$$

The pathlengths will comprise a vector L, and the concentrations will comprise the vector C:

$$
\mathbf{C} =
\begin{Bmatrix}
c_d \\ c_e \\ c_f
\end{Bmatrix}
$$

Combining these matrix expressions leads to the matrix equation

$$
\begin{Bmatrix}
A_1 \\ A_2 \\ A_3
\end{Bmatrix}
=
\begin{Bmatrix}
\varepsilon_{1d} & \varepsilon_{1e} & \varepsilon_{1f} \\
\varepsilon_{2d} & \varepsilon_{2e} & \varepsilon_{2f} \\
\varepsilon_{3d} & \varepsilon_{3e} & \varepsilon_{3f}
\end{Bmatrix}
\cdot\, l \,\cdot
\begin{Bmatrix}
c_d \\ c_e \\ c_f
\end{Bmatrix}
$$

which is the matrix equivalent of Equations (3.4). These equations can be written more succinctly as

$$
\mathbf{A} = \mathbf{E}\mathbf{L}\mathbf{C}
\tag{3.11}
$$

where
A = the vector of absorbances
E = the matrix of absorptivities
L = the vector of pathlengths
C = the vector of concentrations

Frequently, the pathlength for all the analytes and all the standards is the same, in which case all the elements of L will be the same. In this case, multiplying a matrix by a matrix full of identical elements is equivalent to multiplying all the matrix elements by the same number. Thus, for the case of constant pathlength we can factor out the pathlength and Equation (3.11) reduces to

$$
\mathbf{A} = \mathbf{E}\, l\, \mathbf{C}
$$

where the unbolded, lower case $l$ indicates that the pathlength is a simple constant.

Equation (3.11) is referred to as the matrix form of Beer's law. All the mathematical methods used in multicomponent quantitative analysis start with Equation (3.11). The standards and their spectra supply information for the matrices A, L, and C. Application of an appropriate calibration algorithm gives E. Different calibration algorithms perform matrix algebra upon the data in different ways to generate calibrations. Once a multicomponent calibration is obtained, predicting the unknown concentrations is simply a matter of measuring the absorbances of the unknown sample and applying the calibration to the data.
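To make the constant pathlength form of Equation (3.11) concrete, here is a minimal Python/NumPy sketch. All of the numbers (the absorptivities in E, the pathlength l, and the concentrations in C) are invented for illustration and do not come from any real sample. The sketch computes the absorbance vector, confirms that the first element agrees with the first of Equations (3.4), and then, because this particular E happens to be square and invertible, recovers the concentrations with the matrix inverse of Equation (3.10).

```python
import numpy as np

# Hypothetical absorptivities: rows are wavenumbers 1-3,
# columns are components d, e, and f.
E = np.array([[0.90, 0.10, 0.05],
              [0.20, 0.80, 0.10],
              [0.05, 0.15, 0.70]])

l = 0.1                          # hypothetical constant pathlength
C = np.array([2.0, 1.0, 3.0])    # hypothetical concentrations c_d, c_e, c_f

# Constant pathlength form of Equation (3.11): A = E l C.
A = (E * l) @ C

# The first row written out term by term, as in Equations (3.4).
A1 = E[0, 0] * l * C[0] + E[0, 1] * l * C[1] + E[0, 2] * l * C[2]

# With three absorbances and three components, E*l is square. If it has
# an inverse, multiplying A by that inverse returns the concentrations.
C_recovered = np.linalg.inv(E * l) @ A

print("A =", A)
print("A1 from Equations (3.4) =", A1)
print("recovered C =", C_recovered)
```

In a real analysis E is not known in advance; obtaining it (or an equivalent matrix) from standards is the job of the calibration algorithms discussed later in this chapter.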
D. THE BENEFIT OF USING MANY ABSORBANCES
In multicomponent analyses, we must measure more absorbances to construct more equations to solve for multiple analyte concentrations. From what has been said so far, it appears that for each unknown concentration we need one equation, and this determines the number of absorbances to be measured. This is actually only a minimum condition. There may be hundreds of wavenumbers in a spectrum where one or more components absorb. In theory, any subset of these absorbances could be used to construct a system of equations to comprise a calibration. Is there any advantage to using more than the minimum number of absorbances in a multicomponent calibration?

In Chapter 2, we showed that the signal-to-noise ratio of any measurement is improved by adding measurements together. This occurs because the sign of the random noise is random, and positive and negative noise points cancel each other when added together. Using many absorbances in a calibration provides an averaging effect that can reduce the amount of noise in a calibration. In addition, you can think of the absorbance "signal" as increasing if more absorbance measurements are used. Lastly, as was discussed in Chapter 2, some peaks follow Beer's law more closely than others. Calibrations can be improved by choosing absorbances that follow Beer's law closely.

An illustration of how increasing the number of absorbance measurements improves a calibration can be seen by considering the simple IPA in water calibration discussed in Chapter 2. The raw data for this calibration are in Table 2-1, the peaks used are shown in Figure 2.3, and the calibration line is in Figure 2.2. This calibration used the area of the IPA peak centered at 2973 cm$^{-1}$. A summary of the calibration metrics for this calibration is seen in Table 2-4. Now, imagine that we had chosen a different IPA peak to calibrate with, such as the one centered at 948 cm$^{-1}$. An expansion of this spectral region for the five standard samples of IPA in water is shown in Figure 3.10. Table 3-11 lists the area of this peak and the volume percent IPA in the standard samples. The calibration line plotted using the 968-910 cm$^{-1}$ peak area data is seen in Figure 3.11. Metrics for this calibration, calculated using equations from Chapter 2, are given in Table 3-13. Note that the standard deviation is 2.0, the correlation coefficient is 0.997, and the F for regression is 464. Now, imagine that your boss has tasked you to come up with an IPA quantitative method with a standard deviation of less than two. This particular calibration is not quite good enough. Let us see if using multiple absorbances solves this problem.
Figure 3.10 The 948 cm$^{-1}$ IPA peak for five standard samples of IPA in water. [Plot title: Isopropanol Peak @ 948 cm-1; x-axis: Wavenumber (cm-1), 1000 to 920.]

TABLE 3-11
Volume percent IPA and integrated peak area, 968-910 cm$^{-1}$. Calibration line plotted in Figure 3.11.

%IPA    Area
9       1.6
18      3.6
35      8.1
53      13
70      19
There is an IPA peak centered at 2973 cm$^{-1}$, and its area for the five standard samples is given in Table 2-1 (this peak was used in the example calibrations in Chapter 2). Adding together the areas of the IPA peaks at 2973 and 948 cm$^{-1}$ gives the summed absorbance data seen in Table 3-12. The calibration line for this summed absorbance data is seen in Figure 3.12. The calibration metrics for the 948 cm$^{-1}$ calibration and the summed absorbances calibration are seen in Table 3-13. Note that for all the metrics there is an improvement from using summed absorbances rather than individual absorbances. The improvement is not huge, but note that the standard deviation is now 1.8, which would meet the goal set by your boss of obtaining a calibration with $\sigma < 2$.

Figure 3.11 A calibration line for IPA, peak area measured from 968 to 910 cm$^{-1}$. Raw data in Table 3-11. [Plot title: IPA Calibration: 968-910 cm-1; x-axis: Peak Area; fitted line: y = 3.5155x + 5.1497, R = 0.997.]

TABLE 3-12
The summed peak areas for the IPA calibration seen in Figure 3.12. IPA peak areas from 3008 to 2947 cm$^{-1}$ and 968 to 910 cm$^{-1}$ were added together.

%IPA    Summed Area
9       2.6
18      5.8
35      13
53      21
70      30

Figure 3.12 The calibration line for the summed IPA absorbances. Raw data in Table 3-12. [Plot title: IPA Calibration: Summed Absorbances, 2973 + 948 cm-1 Peaks; x-axis: Summed Peak Area; fitted line: y = 2.2249x + 4.7836, R = 0.998.]

TABLE 3-13
Calibration metrics for the single peak area calibration seen in Figure 3.11, and for the summed peak areas calibration seen in Figure 3.12.

Statistic                          Single Peak Area Calibration    Summed Peak Areas Calibration
Standard Deviation ($\sigma$)      2.0                             1.8
Correlation Coefficient (R)        0.997                           0.998
F for Regression                   464                             555

The reason the summed absorbances calibration is better than the 948 cm$^{-1}$ calibration may be related to Beer's law linearity. The peaks at 948 cm$^{-1}$ range in size up to 1.4 absorbance units (AU). The largest peak used in the 2970 cm$^{-1}$ calibration is 0.8 AU. As stated in Chapter 2, peaks whose absorbance is less than 0.8 AU tend to follow Beer's law better than peaks whose absorbance is greater than 0.8 AU. This may explain why the calibration with a less intense peak is better than a calibration with a bigger peak. Now, the concentration range for this calibration, from 9% to 70% IPA, is rather broad, varying by greater than a factor of seven. The physical reason behind the 2970 cm$^{-1}$ peak following Beer's law better than the 948 cm$^{-1}$ peak may be that the absorptivity at 2970 cm$^{-1}$ is less sensitive to changes in IPA concentration than the absorptivity at 948 cm$^{-1}$. Additionally, using summed peak areas improves the signal-to-noise ratio of the absorbance data compared to using a single peak area.

Because using many absorbances can improve a calibration, it is common to use many absorbances in multicomponent calibrations. However, there are some limits. First, using hundreds of absorbances makes the calibration data set large, and it can be time consuming to perform the calculations to obtain a calibration. However, this is less of an issue as time goes by because of improvements in computer performance. Second, the calibration algorithm plays a role in determining the number of absorbances that can be used. These details are explained below and in Chapter 4. Third, more absorbances are usually better only as long as one or more components absorb at the wavenumbers selected. Inclusion of wavenumbers whose absorbance is pure noise may add error to the calibration and decrease its accuracy. Also, indiscriminately adding wavenumbers to a calibration is not a good idea. The threat is overfitting: generating a calibration with too many equations, some of which only model noise.
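The two calibration lines discussed in this section can be reproduced from the tabulated data. The sketch below is written in Python with NumPy, which is our choice of tool; the helper function `fit_line` is a hypothetical name introduced only for this example. It fits %IPA against peak area for the single peak data of Table 3-11 and the summed data of Table 3-12. Because the tables list rounded areas, only the slope, intercept, and correlation coefficient are computed here; the other metrics in Table 3-13 are not attempted.

```python
import numpy as np

# Volume percent IPA in the five standards (Tables 3-11 and 3-12).
percent_ipa = np.array([9.0, 18.0, 35.0, 53.0, 70.0])

# Peak area, 968-910 cm-1 (Table 3-11).
area_948 = np.array([1.6, 3.6, 8.1, 13.0, 19.0])

# Summed peak areas, 2973 + 948 cm-1 peaks (Table 3-12).
area_summed = np.array([2.6, 5.8, 13.0, 21.0, 30.0])


def fit_line(x, y):
    """Least squares line y = slope * x + intercept, plus the
    correlation coefficient R between x and y."""
    slope, intercept = np.polyfit(x, y, 1)
    r = np.corrcoef(x, y)[0, 1]
    return slope, intercept, r


slope_948, intercept_948, r_948 = fit_line(area_948, percent_ipa)
slope_sum, intercept_sum, r_sum = fit_line(area_summed, percent_ipa)

print(f"948 cm-1 peak: y = {slope_948:.4f}x + {intercept_948:.4f}, R = {r_948:.3f}")
print(f"summed peaks:  y = {slope_sum:.4f}x + {intercept_sum:.4f}, R = {r_sum:.3f}")
```

The fitted lines should agree with the equations shown on Figures 3.11 and 3.12, and the correlation coefficient for the summed data should come out slightly higher than for the single peak.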
E. MULTICOMPONENT CALIBRATIONS: GENERALITIES
The discussion of multicomponent calibration methods that follows will focus on their mathematical and practical differences. The types of calibration techniques available to you will depend on the software you have. In general, the calculations described below are transparent to the user when using a computer; the choice of which technique to use is simply a matter of setting a software parameter. However, an understanding of what the computer is doing to your data is necessary to truly understand how to develop an accurate quantitative method. Be aware that the same high quality lab technique needed for a single component analysis is just as important for multicomponent analysis. The awesome mathematical power of these algorithms will not compensate for poorly made standards, sloppy technique, or lousy spectra. The focus for the rest of this chapter will be on two "least squares" calibration algorithms, the mathematics behind them, and their relative advantages and disadvantages.
IV. The Classical Least Squares (K Matrix) Method

One of the methods of simultaneously determining multiple components in an unknown sample is the Classical Least Squares (CLS) method. Like all multicomponent calibration algorithms, CLS starts with Equation (3.11). Then we write a new equation combining the pathlength and absorptivity matrices into a single matrix called K as follows:

$$
\mathbf{K} = \mathbf{E}\mathbf{L}
\tag{3.12}
$$

where
K = the absorptivity-pathlength matrix
E = the absorptivity matrix
L = the pathlength matrix (or vector)

The classical least squares method is often called the "K matrix" method because of Equation (3.12). Using Equation (3.12), we can rewrite Equation (3.11) as

$$
\mathbf{A} = \mathbf{K}\mathbf{C}
\tag{3.13}
$$
Note that in Equation (3.13) the absorbance is written as a function of concentration; that is, absorbance is the dependent variable and concentration is the independent variable. To obtain a calibration using the CLS method, the values in the K matrix must be determined. This means measuring the absorbances of standards to obtain the data for the A and C matrices. From what we have learned about matrix algebra, it would be tempting to solve Equation (3.13) for K by writing the following:

$$
\mathbf{K} = \mathbf{A}\mathbf{C}^{-1}
\tag{3.14}
$$
where
$\mathbf{C}^{-1}$ = the inverse of the concentration matrix

The problem with Equation (3.14) is that it ignores some of the known properties of matrix inverses. A given matrix may have one inverse, many inverses, or no inverse at all [1]. This makes it difficult to use Equation (3.14) as written. Square matrices are better behaved when it comes to inversion. A square matrix possesses either a unique inverse or no inverse at all. There is a way of "squaring" the concentration matrix, then using its inverse to calculate K. It makes use of the transpose of C, $\mathbf{C}^{T}$, as follows:

$$
\mathbf{K} = \mathbf{A}\mathbf{C}^{T}\left(\mathbf{C}\mathbf{C}^{T}\right)^{-1}
\tag{3.15}
$$
The matrix operation shown in Equation (3.15) does more than square the concentration matrix so its inverse can be found. Equation (3.15) is the matrix definition of a least squares fit. The properties, assumptions, and mathematics of the least squares fit for one component were discussed in Chapter 2. Here we are simply extending its use to multiple components and the solving of simultaneous equations. Everything that was said about the least squares algorithm in Chapter 2 will hold true here for multiple components. Recall that a least squares fit, by definition, provides the best model relating two sets of data to each other, in the sense that it minimizes the sum of the squared residuals. Once K is known, we can predict concentrations in unknown samples using the following equation:

$$
\mathbf{C}_{unk} = \left(\mathbf{K}^{T}\mathbf{K}\right)^{-1}\mathbf{K}^{T}\mathbf{A}_{unk}
\tag{3.16}
$$

where
$\mathbf{C}_{unk}$ = component concentrations in the unknown sample
$\mathbf{A}_{unk}$ = absorbances of the unknown sample

Equation (3.16) is also a least squares fit, and will produce the best possible prediction of C. Note that by calculating C this way, all the unknown concentrations are determined at once.
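Equations (3.15) and (3.16) translate almost directly into code. The following Python/NumPy sketch is only an illustration under stated assumptions: the "true" K matrix, the standard concentrations, the matrix sizes, and the random seed are all invented, the data are noise free, and the layout (one column per standard, one row per wavenumber or component) is our choice so that the dimensions match Equation (3.13).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" K matrix: 6 wavenumbers (rows) x 3 components (columns).
K_true = rng.uniform(0.1, 1.0, size=(6, 3))

# Hypothetical concentrations in 5 standards:
# 3 components (rows) x 5 standards (columns).
C_std = rng.uniform(0.5, 5.0, size=(3, 5))

# Noise-free absorbances of the standards from Equation (3.13): A = KC.
A_std = K_true @ C_std          # 6 wavenumbers x 5 standards

# Calibration, Equation (3.15): K = A C^T (C C^T)^-1.
K = A_std @ C_std.T @ np.linalg.inv(C_std @ C_std.T)

# Prediction, Equation (3.16): C_unk = (K^T K)^-1 K^T A_unk.
c_true = np.array([1.0, 2.0, 3.0])   # hypothetical unknown sample
A_unk = K_true @ c_true
C_unk = np.linalg.inv(K.T @ K) @ K.T @ A_unk

print("largest error in K:    ", np.abs(K - K_true).max())
print("predicted concentrations:", C_unk)
```

With noise-free data the calibration returns the K matrix used to simulate the spectra and the prediction returns the simulated concentrations. With real, noisy spectra the results would be least squares estimates rather than exact values.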
A limitation in using CLS is that the number of absorbances measured must equal or exceed the number of chemical species present in the samples. This is simply another way of stating that there must be at least one equation containing the concentration term for each species. A species whose concentration does not appear in any of the equations in the calibration cannot be modeled.

A. ADVANTAGES AND DISADVANTAGES OF CLS
The K matrix method is useful because the mathematics are straightforward and relatively easy to understand. Additionally, as long as the number of absorbances equals or exceeds the number of species present, you can use as much of the spectrum in the calibration as you wish. Using many absorbances gives an averaging effect that improves the calibration quality as seen above. If many wavenumbers are used in a calibration, the absorptivity matrix E will contain the predicted absorbance spectra of the components. This allows comparison of predicted and actual standard spectra, which can be useful in ascertaining calibration quality.

Despite these advantages, there are a number of problems with the CLS method that prevent its widespread use. Most of the problems stem from the form of the equations themselves. Equations (3.17) show the first three rows of the matrices used in CLS, assuming that there are three components called d, e, and f, and we are measuring absorbances at wavenumbers 1, 2, and 3:

$$
\begin{aligned}
A_1 &= k_{1d} c_d + k_{1e} c_e + k_{1f} c_f \\
A_2 &= k_{2d} c_d + k_{2e} c_e + k_{2f} c_f \\
A_3 &= k_{3d} c_d + k_{3e} c_e + k_{3f} c_f
\end{aligned}
\tag{3.17}
$$

where
k = the K matrix element for a specific component and wavenumber.

These equations are a recasting of Equations (3.4), simply substituting the value of k for the values of the absorptivity and pathlength. These equations only contain concentration terms for the three analytes. We are assuming that all the absorbance at these wavenumbers is due to these three species only. What would happen if there were an interferent molecule, called g, which absorbed at one of the wavenumbers used in the analysis? There are no terms containing the concentration of g in Equations (3.17). A model developed without terms explicitly modeling the presence of g would give inaccurate predictions. To model g, we must know its concentration in the standard samples. Then, we must include terms for the concentration of g in the calibration equations as such:

$$
\begin{aligned}
A_1 &= k_{1d} c_d + k_{1e} c_e + k_{1f} c_f + k_{1g} c_g \\
A_2 &= k_{2d} c_d + k_{2e} c_e + k_{2f} c_f + k_{2g} c_g \\
A_3 &= k_{3d} c_d + k_{3e} c_e + k_{3f} c_f + k_{3g} c_g
\end{aligned}
$$

Only by knowing the concentration of g in the standards, and including its concentration in the calibration, can we accurately account for its interference with the absorbance of the analytes. More generally, to use the K matrix method successfully the concentration of every species in the standards must be known, and the appropriate extra terms must be added to Equations (3.17) to obtain a realistic calibration. It is not always easy to know the concentration of every species in a standard. This is a significant disadvantage of the CLS technique.

Another problem with Equations (3.17) is that the calibration is susceptible to baseline drift. Drifting baselines can cause absorbances to move up and down, and this change is not modeled in the CLS method. As is discussed in Chapter 4, spectral manipulation techniques such as baseline correction and spectral derivatives can be used to correct spectra for baseline problems. However, these techniques do not always solve the problem.

The applications of the K matrix method are limited. Typically, the method is only applied to well-controlled mixtures of pure substances. One area where this method has found success is in the analysis of gas standard mixtures. These samples are mixtures of pure substances, and gas phase samples rarely suffer from baseline drift or other spectral problems; thus CLS is well suited to them. A listing of the advantages and disadvantages of the CLS method is found in Table 3-14.
TABLE 3-14
Advantages and disadvantages of the CLS method.

Advantages
  Mathematics easy to understand
  Can use a large number of wavenumbers, up to the entire spectrum, enhancing model quality
  Possible to obtain predicted pure component spectra

Disadvantages
  Must know concentration of all molecules in standards to obtain an accurate model
  Standard and unknown spectra must be free of baseline effects
  Limited application
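The first disadvantage in Table 3-14 can be demonstrated with a small simulation. In the Python/NumPy sketch below, every number is invented: a four-species system (analytes d, e, f and interferent g) is measured at four wavenumbers, and the helper names `cls_calibrate` and `cls_predict` are hypothetical functions that simply wrap Equations (3.15) and (3.16). The calibration is run twice, once ignoring g and once including its concentration.

```python
import numpy as np

# Hypothetical K matrix: 4 wavenumbers (rows) x species d, e, f, g (columns).
K_full = np.array([[0.9, 0.1, 0.1, 0.0],
                   [0.1, 0.8, 0.2, 0.5],
                   [0.1, 0.2, 0.7, 0.0],
                   [0.3, 0.3, 0.1, 0.4]])

# Hypothetical standards: rows are concentrations of d, e, f, g;
# columns are six standard samples.
C_full = np.array([[1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
                   [2.0, 1.0, 1.0, 3.0, 3.0, 2.0],
                   [3.0, 3.0, 2.0, 2.0, 1.0, 1.0],
                   [0.5, 1.5, 1.0, 2.0, 0.2, 1.2]])

A_std = K_full @ C_full   # noise-free standard absorbances


def cls_calibrate(A, C):
    """Equation (3.15): K = A C^T (C C^T)^-1."""
    return A @ C.T @ np.linalg.inv(C @ C.T)


def cls_predict(K, a_unk):
    """Equation (3.16): C_unk = (K^T K)^-1 K^T A_unk."""
    return np.linalg.inv(K.T @ K) @ K.T @ a_unk


# Hypothetical unknown containing more interferent than any standard.
c_unk = np.array([2.0, 2.0, 2.0, 3.0])
a_unk = K_full @ c_unk

# Calibration that ignores g: only d, e, and f concentrations supplied.
pred_without_g = cls_predict(cls_calibrate(A_std, C_full[:3]), a_unk)

# Calibration that includes the known concentration of g.
pred_with_g = cls_predict(cls_calibrate(A_std, C_full), a_unk)

print("true d, e, f:       ", c_unk[:3])
print("predicted, g ignored:", pred_without_g)
print("predicted, g modeled:", pred_with_g[:3])
```

In this made-up example the predictions are accurate only when the concentration of g is supplied during calibration, which is the point made in the text: CLS needs the concentration of every absorbing species in the standards.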
V. The Inverse Least Squares (P Matrix) Method

A. THE ADVANTAGES OF THE INVERSE BEER'S LAW FORMULATION
Recall from Chapter 2 that for single component analysis the inverse Beer's law formulation is

$$
c = \frac{A}{\varepsilon l}
\tag{3.18}
$$

The practical effect of using the inverse Beer's law formulation is that concentration is plotted versus absorbance rather than plotting absorbance versus concentration. Figures 3.4 and 3.7 are examples of inverse Beer's law plots. The advantages of Equation (3.18) include putting the data with the least amount of error, the absorbances, on the x-axis. Additionally, we calibrate and predict using the same equation. There are further advantages that accrue to the inverse Beer's law approach in the case of multiple components. We will illustrate this point by examining an analyte band that is overlapped with an interferent band. We will use the IPA-EtOH system discussed earlier in this chapter. Ethanol has a nice peak centered at 2974 cm$^{-1}$ that could be used for quantitation. However, IPA has a peak almost on top of it at 2970 cm$^{-1}$. The overlap of these peaks is illustrated by overlaying their pure spectra, seen in Figure 3.13.
Figure 3.13 The overlap of EtOH and IPA peaks near 2970 cm$^{-1}$. [Plot label: Ethanol Peak at 2974 (solid); x-axis: Wavenumber (cm-1), 2980 to 2960.]
We can use Beer's law to write an equation relating absorbance and concentration for the two components at 2970 cm$^{-1}$; we will call the absorbance at this wavenumber $A_1$:

$$
A_1 = \varepsilon_{1e} l c_e + \varepsilon_{1i} l c_i
\tag{3.19}
$$

where the subscript e stands for EtOH and the subscript i stands for IPA (you can also think of i as standing for interferent). Each absorptivity has two subscripts since this parameter depends upon both the molecule and the wavenumber of absorbance. We now need an equation to predict the desired quantity, $c_e$. Rearranging Equation (3.19) gives

$$
c_e = \frac{A_1}{\varepsilon_{1e} l} - \frac{\varepsilon_{1i} c_i}{\varepsilon_{1e}}
\tag{3.20}
$$

Note that in the absence of the interferent IPA, the second term goes to zero. This means the second term in Equation (3.20) corrects for the presence of the interferent molecule's absorbance at wavenumber 1. Also, note that the second term in Equation (3.20) contains the ratio of the absorptivities of the two species at wavenumber 1. To predict $c_e$ using Equation (3.20), we need to find $\varepsilon_{1e}$, $\varepsilon_{1i}$, and $c_i$. In theory, we could obtain $\varepsilon_{1e}$ by calibrating with standards containing known quantities of EtOH but no IPA. Additionally, we could find $\varepsilon_{1i}$ by calibrating with standards containing known amounts of IPA but no EtOH (this assumes that the absorptivities for these components in their pure form stay constant when they are mixed, not necessarily a safe assumption as discussed in Chapter 1). How then would we find $c_i$? If the interferent (IPA in this case) has a lone absorbance band elsewhere in the spectrum, we can use that feature to find $c_i$. Examination of the pure spectra of these compounds shows that IPA has a lone peak at 953 cm$^{-1}$ that does not overlap with any EtOH peaks. This is illustrated in Figure 3.14. We can write a Beer's law equation for the IPA peak at 953 cm$^{-1}$, where $A_2$ will denote the absorbance at 953 cm$^{-1}$:

$$
A_2 = \varepsilon_{2i} l c_i
\tag{3.21}
$$

This is simply the direct Beer's law formula for a single component with no interferences. If we make up a series of standards with known concentrations of EtOH and IPA, we can use Equation (3.21) to develop a calibration to obtain $\varepsilon_{2i} l$.
Figure 3.14 An overlay of the pure spectra of IPA and EtOH, showing the lone IPA peak at 953 cm$^{-1}$. [Plot labels: IPA Peak at 953 cm-1 (dashed); No Ethanol Peaks (solid); x-axis: Wavenumber (cm-1), 990 to 910.]
Therefore, we can use this process to obtain a calibration even in the presence of an interfering absorbance. However, this process involved several calibrations, knowledge of the interferent concentration in the standards, and rewriting of the calibration equation into a prediction equation. This is a complex process involving a lot of hard work. Is there an easier way? The answer, in short, is yes, and it involves using the inverse Beer's law formulation to describe an analyte with an interfering absorbance. Remember that both EtOH and IPA contribute to $A_1$, but that only IPA contributes to $A_2$. We rewrite Equation (3.21) in inverse form as such

$$
c_i = \frac{A_2}{\varepsilon_{2i} l}
\tag{3.22}
$$

then plug it into Equation (3.19) to obtain

$$
A_1 = \varepsilon_{1e} l c_e + \frac{\varepsilon_{1i} l A_2}{\varepsilon_{2i} l}
\tag{3.23}
$$
Solving Equation (3.23) for $c_e$ gives

$$
c_e = \frac{A_1}{\varepsilon_{1e} l} - \frac{\varepsilon_{1i} A_2}{\varepsilon_{1e} \varepsilon_{2i} l}
\tag{3.24}
$$

Note that in the absence of IPA, $A_2$ and thus the second term in Equation (3.24) would go to zero, and the equation would reduce to the inverse Beer's law equation for a single component in the absence of an interferent. Note the absorbances on the right hand side of Equation (3.24) are multiplied by combinations of absorptivities and the pathlength. These are all constants, and can be represented more simply by constants called b as such

$$
b_1 = \frac{1}{\varepsilon_{1e} l}
\qquad
b_2 = \frac{\varepsilon_{1i}}{\varepsilon_{1e} \varepsilon_{2i} l}
$$

then we can simplify Equation (3.24) to obtain

$$
c_e = b_1 A_1 - b_2 A_2
\tag{3.25}
$$
In Equation (3.25), the constants b are known as coefficients. Note that Equation (3.25) does not contain the term $c_i$. To calibrate, we would need standards that contained both EtOH and IPA. However, we would only need to know the concentration of EtOH since Equation (3.25) contains the concentration term $c_e$ and does not contain the term for the IPA concentration, $c_i$. To determine the two coefficients, we would simply need to measure $A_1$ and $A_2$ for several standard samples and perform a calibration. At a minimum, the number of standards must exceed the number of coefficients so that there are enough equations to solve for the coefficients. To predict $c_e$, simply measure the spectrum of the unknown sample and measure the absorbances $A_1$ and $A_2$. Then, using the known values of the coefficients b from the calibration, $c_e$ is calculated using Equation (3.25). We need not know $c_i$ to obtain a calibration.

This inverse Beer's law formulation involves only one calibration, uses the same equation for prediction and calibration, and requires no knowledge of the concentration of the interferent species during calibration or prediction. This is less work than the direct Beer's law formulation discussed above. By folding the absorptivities and pathlength into the coefficients b, we never need to explicitly know their values.

The ideas discussed here carry over to multiple components with multiple interferences. For these systems, there would be multiple equations like (3.25) with multiple terms. As long as every species absorbs at one or more of the wavenumbers selected, and there are enough standards run to give enough data points so that all the coefficients can be determined, in theory a calibration can be obtained. We would only need to know the concentration of the analytes in the standards to predict their concentrations in unknowns. There is never a need to know the concentration of interfering species. Recall that for the K matrix method you must know the concentration of each and every species in a sample in order to obtain a calibration. Since this information is oftentimes difficult or impossible to obtain, the inverse Beer's law approach is often preferable. Simply put, one must measure either many concentrations or many absorbances to compensate for interferents. In general, it is easier to measure many absorbances since most spectra consist of many absorbances. This is the essential advantage of the inverse Beer's law approach. The extension of the inverse Beer's law formulation to multiple components is given below.
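The two-wavenumber argument of this section can be tried out numerically. The Python/NumPy sketch below uses invented values throughout: the absorptivity-pathlength products and the standard concentrations are hypothetical, and the spectra are simulated without noise. The IPA concentrations are used only to simulate the absorbances; the calibration step itself sees only $A_1$, $A_2$, and the EtOH concentrations, as Equation (3.25) requires.

```python
import numpy as np

# Hypothetical absorptivity-pathlength products (epsilon * l).
eps_1e_l = 0.50   # EtOH at wavenumber 1
eps_1i_l = 0.30   # IPA at wavenumber 1 (the interfering absorbance)
eps_2i_l = 0.40   # IPA at wavenumber 2 (EtOH does not absorb here)

# Hypothetical standards. c_i is needed only to simulate the spectra.
c_e = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
c_i = np.array([3.0, 1.0, 4.0, 2.0, 5.0])

A1 = eps_1e_l * c_e + eps_1i_l * c_i   # Equation (3.19)
A2 = eps_2i_l * c_i                    # Equation (3.21)

# Calibrate Equation (3.25), c_e = b1*A1 - b2*A2, by least squares.
# Only A1, A2, and the EtOH concentrations are supplied.
design = np.column_stack([A1, -A2])
(b1, b2), *_ = np.linalg.lstsq(design, c_e, rcond=None)

# Predict EtOH in a hypothetical unknown with an unknown amount of IPA.
A1_unk = eps_1e_l * 2.5 + eps_1i_l * 7.0
A2_unk = eps_2i_l * 7.0
c_e_pred = b1 * A1_unk - b2 * A2_unk

print("b1 =", b1, " b2 =", b2)
print("predicted EtOH concentration =", c_e_pred)
```

For these invented values the fitted coefficients should equal $1/\varepsilon_{1e} l = 2.0$ and $\varepsilon_{1i}/\varepsilon_{1e}\varepsilon_{2i} l = 1.5$, and the EtOH prediction should come out at 2.5 even though the IPA concentration was never given to the calibration.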
B. P MATRIX THEORY
The technique to be presented here goes by several names including the P matrix method, the Inverse Least Squares (ILS) technique, and Multiple Linear Regression (MLR). This technique starts with the matrix expression of Beer's law seen in Equation (3.11). In matrix notation, the inverse form of Equation (3.11) is

$$
\mathbf{C} = (\mathbf{E}\mathbf{L})^{-1}\mathbf{A}
\tag{3.26}
$$

To simplify Equation (3.26) we define a new matrix P as follows:

$$
\mathbf{P} = (\mathbf{E}\mathbf{L})^{-1}
\tag{3.27}
$$

then rewrite Equation (3.26) to obtain the inverse matrix form of Beer's law as such

$$
\mathbf{C} = \mathbf{P}\mathbf{A}
\tag{3.28}
$$
The name "P matrix method" derives from Equation (3.27). Note that in Equation (3.28) concentration is written as a function of absorbance. In other words, concentration is the dependent variable and absorbance is the independent variable. All the benefits of the inverse Beer's law approach discussed in the last section accrue to the P matrix method. To obtain a calibration via ILS means determining the values in the P matrix. As always, this necessitates measuring the absorbances of standard samples, and finding the data for the C and A matrices. As with the CLS method, we cannot solve for P by simply calculating $\mathbf{A}^{-1}$ because of the issues that surround the calculation of matrix inverses. However, we can apply the least squares method to Equation (3.28), and perform a calibration to find P as such

$$
\mathbf{P} = \mathbf{C}\mathbf{A}^{T}\left(\mathbf{A}\mathbf{A}^{T}\right)^{-1}
\tag{3.29}
$$
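A minimal Python/NumPy sketch of Equations (3.28) and (3.29) follows. It is an illustration under assumptions, not a description of any particular software package: the K matrix used to simulate the spectra, the eight standards, the random seed, and the noise-free data are all invented. The simulated samples contain two analytes and one interferent, but only the two analyte concentrations are handed to the calibration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical absorptivity-pathlength values used only to simulate
# spectra: 3 wavenumbers (rows) x 3 species (2 analytes + 1 interferent).
K_true = np.array([[0.8, 0.2, 0.3],
                   [0.1, 0.9, 0.2],
                   [0.2, 0.1, 0.7]])

# Hypothetical concentrations of all 3 species in 8 standards.
C_all = rng.uniform(0.5, 5.0, size=(3, 8))

# Noise-free absorbances: 3 wavenumbers x 8 standards.
A = K_true @ C_all

# Only the two analyte concentrations are treated as known.
C = C_all[:2]

# Calibration, Equation (3.29): P = C A^T (A A^T)^-1.
P = C @ A.T @ np.linalg.inv(A @ A.T)    # 2 analytes x 3 wavenumbers

# Prediction, Equation (3.28): C = PA, for a hypothetical unknown.
c_unk_all = np.array([1.5, 2.5, 4.0])   # last value is the interferent
A_unk = K_true @ c_unk_all
C_pred = P @ A_unk

print("P matrix:\n", P)
print("predicted analyte concentrations:", C_pred)
```

Note that $\mathbf{A}\mathbf{A}^{T}$ is a square matrix with one row and column per wavenumber, so it can only be inverted if there are at least as many standards as wavenumbers. The exact recovery seen here also relies on the simulated data being noise free and on every species absorbing at one or more of the chosen wavenumbers.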
Everything that was said in Chapter 2 about the assumptions and properties of the least squares algorithm will hold true for the calculation of P as well. Once P is known, predicting the concentration matrix, C, is simply a matter of multiplying it with A as shown by Equation (3.28).

C. AN EXAMPLE ILS CALIBRATION
This section contains an example of a real world ILS caHbration. The spectra used are seen in Figure 3.1, and the concentration data are in Table 3-1. Recall that standards 1, 2, 4, 5, 8, and 9 were used in this calibration, standards 2, 6, and 7 will be used in the vaUdation. The samples were mixtures of EtOH, IPA, acetone, and water. The experimental details of how the standards were made up and how the spectra were measured are given earher in this chapter. Although this is a four-component system, there are only two analytes, IPA and EtOH. The concentrations of the other components were assumed to be unknown, and were not included in the caHbration. Two spectra of each calibration sample were obtained for a total of 12 spectra. Replicate spectra were subtracted using a subtraction factor of 1 to insure that there were no extraneous sources of variation in the absorbances. Wavenumber selection is an important aspect of the ILS technique [2, 3]. Unlike the K matrix method, where large parts of the spectrum may be used, with the P matrix method the number of standards must exceed the number of absorbances measured. Since it is never easy to make a large number of standards, the ability of the analyst to make standards limits the number of wavenumbers at which absorbances can be measured. Thus, the wavenumbers used must be chosen carefully so that each component contributes to the absorbance at one or more of the wavenumbers chosen. There are algorithms that can assist the user in the selection of appropriate wavenumbers for use in ILS cahbrations. These algorithms can make the selection of wavenumbers fast and convenient. However, you should always look at the wavenumbers chosen by the software, and closely examine the caHbration metrics obtained using the proposed wavenumbers. These algorithms do make mistakes. For example, the algorithm may pick a wavenumber that is mostly noise, or may simply pick wavenumbers that give poor calibration metrics. 
So, always examine
the results of wavenumber selection algorithms before using these wavenumbers in a final calibration. The software used in this example allowed separate wavenumber selections for each analyte. Initially, the software was allowed to pick the two best wavenumbers anywhere in the entire spectral range from 4000 to 700 cm⁻¹. However, this led to marginal calibrations for both analytes. This is an example of why it is important to inspect the results of wavenumber search algorithms before using the wavenumbers in a final calibration. By restricting the wavenumber search region to 2000 to 700 cm⁻¹, much better calibration results were obtained for the two wavenumbers picked for each analyte. This may be because the 2000 to 700 cm⁻¹ region contains a lot of spectral information. Searches using three or more wavenumbers resulted in unacceptably long calculation times. Ultimately, the IPA calibration wavenumbers selected were 1624 and 945 cm⁻¹, while the EtOH calibration wavenumbers selected were 1168 and 976 cm⁻¹. The wavenumbers used in these calibrations are not necessarily those that a trained spectroscopist (such as the author) would have picked. However, good calibrations were obtained nonetheless.
The plots of the actual versus predicted concentrations for IPA and EtOH using the ILS algorithm are shown in Figures 3.15 and 3.16. The plots are both very good; the data fall on a straight line quite well. As discussed earlier in this book, before implementing a calibration, it is important to examine residual plots to check for outliers (bad data). The P matrix concentration residual plots for IPA and EtOH are seen in Figures 3.17 and 3.18. For the IPA plot the average of the absolute values of the residuals is 0.24. There are no obvious outliers in this plot. A look at Figure 3.18 shows that the residual for sample 11 is noticeably larger than the residuals for the other samples.
The average of the absolute values of the residuals in Figure 3.18 is 0.04, and the absolute value of the residual for sample 11 is 0.137, 3.4 times the average. However, this does not mean that we automatically exclude sample 11 from the calibration. Ideally, there should be a specific reason to exclude outliers. The spectrum of sample 11 was examined closely, and the calibration data were checked for transcription errors. No problems were found, so sample 11 was included in the final calibration because it is possible that this sample is capturing some variance that may be important to model.
Figure 3.15 Predicted versus actual IPA concentration plot from an ILS calibration. 12 replicate spectra of 6 standards, and absorbances at 1624 and 945 cm⁻¹, were used.
Figure 3.16 Predicted versus actual EtOH concentration plot from an ILS calibration. 12 replicate spectra of 6 standards, and absorbances at 1168 and 976 cm⁻¹, were used.
Figure 3.17 Concentration residual plot (residual versus sample number) for the IPA P matrix calibration.
Figure 3.18 Concentration residual plot (residual versus sample number) for the EtOH P matrix calibration. Note that sample 11 is a possible outlier.
A nice feature of ILS is that it uses linear regression techniques similar to those described in Chapter 2, and the same metrics that are used to measure the quality of single component calibrations can be used to measure the quality of multicomponent calibrations. Table 3-15 lists the relevant statistics for the IPA calibration whose plot is seen in Figure 3.15. Table 3-15 also lists the calibration metrics for the IPA calibration obtained using MID from earlier in this chapter. Note that for all the metrics the ILS results are much better than the MID results. This can be partly explained by the advantages of using multiple wavenumbers and of the inverse Beer's law formulation of ILS. Ultimately, it is probably true that ILS does a better job of modeling interferents than MID, which does not model interferents at all.
TABLE 3-15
Comparison of calibration metrics for IPA inverse least squares calibration and MID calibration.
Statistic                      ILS Value    MID Value
Standard Deviation (σ)         0.35         0.96
Correlation Coefficient (R)    0.9999       0.998
F for Regression               15340        4544
Table 3-16 shows a comparison between the metrics for the EtOH calibration obtained using ILS and the metrics for the EtOH calibration using MID. Like the IPA results, the EtOH results using ILS are significantly better than MID across the board. The improvement here is even more impressive than for IPA. Recall that the MID calibration for EtOH was not that good. Again, this improvement with ILS can be attributed to using multiple wavenumbers, the inverse Beer's law formulation, and the ability of ILS to model interferents. Using MID, the IPA calibration was better than the EtOH calibration. Using ILS, they are both modeled well. Overall, at least based on this data set, ILS is a better algorithm for multi-analyte/multi-interferent analysis than MID.
TABLE 3-16
Comparison of calibration metrics for EtOH ILS calibration and MID calibration.
Statistic                      ILS Value    MID Value
Standard Deviation (σ)         0.064        0.58
Correlation Coefficient (R)    0.9999       0.987
F for Regression               16911        622
Note from Tables 3-15 and 3-16 that the ILS standard deviation for EtOH is quite a bit lower than the standard deviation for IPA. This is probably explained by the range of the data for each analyte. The IPA concentrations modeled ranged from 0 to 54%. The EtOH concentrations modeled ranged from 0 to 9%. At high analyte concentrations, it is not uncommon for absorbance to depart from the linear behavior predicted by Beer's law, since Beer's law works best for dilute solutions. It is possible that the highest concentration IPA samples gave absorbances that were not as linear with concentration as the absorbances measured for EtOH. At any rate, the IPA calibration is still excellent. The ILS calibration metrics in Tables 3-15 and 3-16 show that it is possible to use this technique to obtain quality calibrations for multiple analytes in the presence of multiple interferents.
D. VALIDATION AND PREDICTION
Recall from earlier in this book that one of the fundamental assumptions behind spectroscopic quantitative analysis is that a mathematical model (calibration) calculated from a set of standard samples will do an equally good job of describing the standard and unknown samples. All of the plots and metrics described in this book can inform you about how well a calibration does in describing the standard samples. However, calibration metrics do not tell you how well a calibration will do on unknown samples. The idea behind validation is to use a calibration to predict the concentrations of standards that were not used in the calibration. We will call these samples validation samples or a validation set. The actual and predicted concentrations for the validation set are compared to see how well the calibration does in modeling the validation samples. Validating a multicomponent calibration is analogous to checking a single analyte calibration as was discussed in Chapter 2. The validation of a model,
unlike model development, does not involve fancy algorithms or complex mathematics. It simply involves time and effort on the part of the analyst to obtain and measure the spectra of extra standards to ensure that a calibration is trustworthy. You should always, if possible, validate a calibration.
A useful statistic to calculate from the predicted and real concentrations for the validation set is the Standard Error of Prediction (SEP). This statistic is a direct measure of how well a calibration predicts the concentrations in a validation set. It is calculated as follows:
    SEP = \sum_i (C_p - C_a)^2 / (n - 1)    (3.30)
where
    SEP = standard error of prediction
    i = sample number index
    C_p = predicted concentration
    C_a = actual concentration
    n = total number of validation samples
There is a separate SEP for each analyte. Naturally, the lower the SEP, the better the predictive capabilities of a model. It is common to use the standard deviation of a calibration, as seen in Tables 3-15 and 3-16, as the measure of accuracy for predicted concentrations. However, remember that the prediction is being performed on unknown samples. The calibration is performed with samples whose concentrations are included in the model. The validation is performed with samples unknown to the model. The validation step is much more like a prediction than a calibration. Thus, the argument can be made that the SEP, a measure of validation quality, is a better measure of predicted concentration accuracy than the standard deviation.
Recall that for the example data set being used throughout this chapter, three standards were set aside for use as a validation set. Since there are replicate spectra of each standard, this makes a total of six "samples." The example ILS calibration was used to predict the concentrations of IPA and EtOH in these samples. An example of a prediction report is seen in Figure 3.19. This prediction is for the IPA concentration in validation sample 1. The format of such reports is very much dependent upon the software package used. The actual and predicted concentrations of IPA and EtOH in each of the six validation samples are seen in Table 3-17.
    Predictions for Quantitative Analysis
    Data file: c:\GRAMSN\IPADATA\4MIXP5.SPC
    Equation file: c:\GRAMSN\IPADATA\IPA4Mri .TXT
    Results file: c:\GRAMSN\IPADATA\4MIXP5I.RES
    Date = Sat Jun 08 2002    Time = 10:12:03
    Sample    Predicted IPA
              19.009
Figure 3.19 A prediction report from an ILS calibration.
TABLE 3-17
Validation set results for IPA and EtOH using a P matrix calibration.
"Sample" #    IPA Conc. (Real)    IPA Conc. (Predicted)    EtOH Conc. (Real)    EtOH Conc. (Predicted)
1             18.3                19.0                     1.5                  1.41
2             18.3                18.9                     1.5                  1.44
3             7.7                 6.57                     9.1                  9.14
4             7.7                 7.72                     9.1                  9.18
5             32                  32.85                    0.8                  0.78
6             32                  32.31                    0.8                  0.85
This table contains all the information needed to calculate the SEP. Using Equation (3.30) and the data in Table 3-17, SEPs for each analyte were calculated, and the results are listed in Table 3-18. Both analytes are predicted quite well in the validation samples by the ILS calibration. This indicates that this calibration should do well on unknown samples as well.
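As a worked illustration, the sketch below applies Equation (3.30), read as the sum of squared differences between predicted and actual concentrations divided by (n - 1), to the IPA columns of Table 3-17. The function name sep is a hypothetical helper, not part of any software mentioned here. Note that some texts report the square root of this quantity so that the statistic carries concentration units; np.sqrt(ipa_sep) would give that variant.

```python
import numpy as np

def sep(predicted, actual):
    """Equation (3.30) as read above: sum of squared errors over (n - 1)."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    n = predicted.size
    return float(np.sum((predicted - actual) ** 2) / (n - 1))

# IPA columns of Table 3-17 (six validation "samples").
ipa_actual = [18.3, 18.3, 7.7, 7.7, 32, 32]
ipa_predicted = [19.0, 18.9, 6.57, 7.72, 32.85, 32.31]

ipa_sep = sep(ipa_predicted, ipa_actual)
print(round(ipa_sep, 2))   # about 0.59
```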
E. ADVANTAGES AND DISADVANTAGES OF ILS
TABLE 3-18
Validation results for the two example analytes using the ILS technique.
Analyte    SEP
IPA        0.59
EtOH       0.019
The recasting of the matrix form of Beer's law into Equation (3.28) may seem trivial, but it gives rise to the many advantages of ILS. To understand this, we will write out the first few rows of a P matrix, assuming that there are three components called d, e, and f, and absorbances are measured at wavenumbers α, β, and γ.
    C_d = p_{11} A_\alpha + p_{12} A_\beta + p_{13} A_\gamma + ...
    C_e = p_{21} A_\alpha + p_{22} A_\beta + p_{23} A_\gamma + ...    (3.31)
    C_f = p_{31} A_\alpha + p_{32} A_\beta + p_{33} A_\gamma + ...
where p = a P matrix element.
The subscripts for the matrix elements of P simply refer to the row and column number. Each of these elements is a phenomenological constant made up of many absorptivities of different species at different wavenumbers, much like the coefficients b discussed above. It is therefore inaccurate to give the p's subscripts for specific components and wavenumbers. Equations (3.31) tell us that we can think of concentration as being determined by the absorbances (which is exactly what we do when we predict concentrations). The presence of an interferent that absorbs, let us say at wavenumber α, would distort the value of A_α in Equations (3.31). However, by including extra absorbance terms in the equations, the interferent's contribution to the standard spectra can be modeled. Only the concentrations of the analytes need to be known in the standard samples. This is why the P matrix method is more tolerant of impurities than the K matrix method, and this tolerance is its major advantage. This was seen above where a system with two analytes and two interferents of unknown concentration was modeled quite well using ILS. Because of its advantages, P matrix calibrations have found a wide variety of uses in the analysis of complex mixtures. Everything from octane number in gasoline to percent fat in food has been obtained using P matrix calibrations.
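The tolerance to interferents described above can be illustrated with synthetic data. The following is a minimal sketch under stated assumptions: the absorptivity matrix K, the concentration ranges, and the noise level are all hypothetical, and the least squares solve via np.linalg.lstsq is one standard way to obtain P from C = PA, not necessarily the routine used by any given software package.

```python
import numpy as np

rng = np.random.default_rng(1)
n_standards = 12

# Hypothetical concentrations: two analytes (known to the analyst) and
# one interferent (never supplied to the calibration).
analytes = rng.uniform(0, 10, size=(2, n_standards))
interferent = rng.uniform(0, 5, size=(1, n_standards))

# Hypothetical absorptivities at three wavenumbers (rows) for the three
# species (columns). Beer's law gives A = K @ concentrations.
K = np.array([[0.10, 0.02, 0.05],
              [0.01, 0.08, 0.03],
              [0.00, 0.01, 0.09]])
A = K @ np.vstack([analytes, interferent])
A += rng.normal(0, 1e-4, size=A.shape)       # small measurement noise

# ILS model C = P A, solved by least squares for P using only the
# analyte concentrations. lstsq solves A.T @ P.T = C.T.
P = np.linalg.lstsq(A.T, analytes.T, rcond=None)[0].T

predicted = P @ A
max_error = float(np.max(np.abs(predicted - analytes)))
print(P.shape, max_error)
```

Three absorbances are used for two analytes; the extra absorbance term is what lets the fit account for the interferent even though its concentration never enters the calculation.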
The major drawback of the P matrix method is due to the dimensionality of the matrix equations themselves. Specifically, the number of standard samples must equal or exceed the number of absorbance measurements. In a complex mixture, there will be many different components that need to be modeled. This is only possible by measuring the absorbance at many different wavenumbers, but this automatically means running extra standards so that the number of standards meets or exceeds the number of absorbances. Simply put, the advantage of not having to worry about interferents is paid for in the extra work of preparing and analyzing extra standards. Since the number of wavenumbers we can use in a calibration is limited by the number of standards we can acquire, we are limited to using only a small part of a spectrum. Spectra consist of hundreds or thousands of data points, but we will typically have only tens of standards. This means that we can measure absorbances at only tens of wavenumbers. Thus, much of the spectral information is not used, and is wasted. We cannot achieve the averaging effect obtained in CLS, where using large parts of the spectrum helped improve the accuracy of a calibration. Also, it is not possible using ILS to obtain the predicted spectrum of the components in a mixture. To summarize, the real disadvantages of the P matrix method are that it involves running more standards than the K matrix method, and it makes inefficient use of the available data. However, since it is difficult to always account for every component in a sample, P matrix methods are usually the matrix method of choice in multicomponent quantitative analyses. Table 3-19 summarizes the advantages and disadvantages of the ILS method.
TABLE 3-19
Advantages and disadvantages of the ILS method.
Advantages
    Do not need to know the concentration of all the molecules in standard samples
    Interferents, impurities, and baseline effects are handled well
    Wide variety of applications
Disadvantages
    Number of standards must be ≥ number of absorbances, limiting the number of wavenumbers that can be used, and preventing improvement of model via averaging effect
    May be difficult to find optimum set of wavenumbers
    More standards = more work
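The dimensionality limit can be checked numerically. In the sketch below, which uses random numbers in place of real absorbances and a hypothetical helper named normal_matrix_rank, the least squares solution for P requires inverting the square matrix formed from A times its transpose. That matrix has one row per wavenumber, and it can only have full rank when there are at least as many standards as wavenumbers.

```python
import numpy as np

rng = np.random.default_rng(2)

def normal_matrix_rank(n_wavenumbers, n_standards):
    """Rank of A @ A.T for a random (wavenumbers x standards) matrix A."""
    A = rng.normal(size=(n_wavenumbers, n_standards))
    # An explicit tolerance keeps round-off from being counted as rank.
    return int(np.linalg.matrix_rank(A @ A.T, tol=1e-10))

# With 5 wavenumbers, A @ A.T is 5 x 5 and needs rank 5 to be inverted.
rank_enough = normal_matrix_rank(5, 8)   # 8 standards >= 5 absorbances
rank_short = normal_matrix_rank(5, 3)    # only 3 standards
print(rank_enough, rank_short)           # 5 3
```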
REFERENCES
[1] D. Steinberg, Computational Matrix Algebra, McGraw-Hill, New York, 1974.
[2] H. Mark, Principles and Practice of Spectroscopic Calibration, Wiley, New York, 1991.
[3] J. Duckworth, Spectroscopic Quantitative Analysis, chapter in Applied Spectroscopy: A Compact Reference for Practitioners, J. Workman and A. Springsteen, Eds., Academic Press, Boston, 1998.
BIBLIOGRAPHY
P. Griffiths and J. de Haseth, Fourier Transform Infrared Spectrometry, Wiley, New York, 1986.
J. Ferraro and K. Krishnan, Eds., Practical Fourier Transform Infrared Spectroscopy, Academic Press, New York, 1990.
H. Mark and J. Workman, Statistics in Spectroscopy, Academic Press, New York, 1991.
MULTIPLE COMPONENTS II: CHEMOMETRIC METHODS AND FACTOR ANALYSIS
I. Introduction
Many successful analyses have been performed using the K and P matrix methods discussed in Chapter 3. However, these methods do have some disadvantages, which have led to research into other multicomponent quantitative analysis methods. Some recent developments in calibration algorithms fall within the broad field of chemometrics, the application of statistical and mathematical methods to chemical data. A group of chemometric techniques known as factor analysis methods avoids many of the disadvantages of the K and P matrix methods, and has significant advantages of its own. The term factor analysis derives from the use of abstract functions called factors to model the variance in a data set. Each factor attempts to span the maximum amount of variance in the data, regardless of its source. The challenge when using factor analysis
techniques is determining which factors to use in a calibration (see below). You may have heard the terms Principal Components Regression (PCR) and Partial Least Squares (PLS). These are two types of factor analysis techniques. This chapter will introduce you to the fundamental concepts behind PCR and PLS.
A. ADVANTAGES AND DISADVANTAGES OF FACTOR ANALYSIS TECHNIQUES
There are a number of advantages to factor analysis methods as listed in Table 4-1. In Chapter 3, we saw how the Inverse Least Squares (ILS) method was able to accurately quantify analytes in the presence of interferents. Factor analysis techniques enjoy the same advantage. As long as an interferent is present in the standard samples, and its concentration varies as it will in the unknowns, factor analysis can model its presence in the unknowns. However, factor analysis methods cannot model interferents that are not present in standard samples but are present in unknown samples. Simply put, you cannot model what is not present in the standards.
Recall from Chapter 3 that for the P matrix method the number of standards limits the number of wavenumbers at which absorbances are measured. This is not a problem with factor analysis methods. The number of absorbances is not limited by the number of standards. If desired, entire spectra containing thousands of absorbances can be used in a calibration. The advantage of using many absorbances is twofold. First, the more information fed into a calibration, the better it can model reality. Second, by using many absorbances the signal-to-noise ratio (SNR) of a calibration is improved (as pointed out in Chapter 3). The beauty of using the entire spectrum is that, unlike single component methods, you do not have to find a unique, isolated absorbance band for each component. In fact, it is possible to obtain factor analysis calibrations without knowing exactly where the components absorb, as long as they absorb somewhere in the spectral regions that are used in the model. Overlaps and interferences between analytes and interferents are not a problem. This does not mean, however, that you can ignore the spectra entirely. You should still examine the spectra to make sure that they are free of artifacts, and have a good SNR.
TABLE 4-1
The advantages and disadvantages of factor analysis.
Advantages
    Handles impurities/interferents
    Handles noisy data
    No limit on number of absorbances; up to an entire spectrum may be used
Disadvantages
    Math more complex
    Calibration process more complex
Another advantage of factor analysis methods is that they tolerate noisy spectra. This does not mean that you should slack off in your attempts to obtain the highest quality spectra possible. It does mean that if after your best efforts you still have noisy spectra, factor analysis methods are more likely to produce a good calibration than other techniques. Since factors describe variance, if factors modeling only noise are left out of the calibration, the noise level of the calibration is reduced.
Note that all the disadvantages of factor analysis listed in Table 4-1 have one word in common: complexity. The math behind these techniques is so complex that entire books have been written on the subject (see references and bibliography for this chapter). Another disadvantage of factor analysis techniques is the complexity of the calibration process. Many trial calibrations may be needed, varying parameters such as the wavenumber region(s), algorithm, preprocessing, and the number of factors to optimize a calibration.
A key to understanding the application of factor analysis methods is that they are used to correlate many different types of data, not just spectral absorbances and concentrations. For example, factor analysis methods could be used to correlate the taste of a cookie with the amount of fat it contains, or the effect of interest rate changes on the stock market. As long as two sets of data are correlated, it may be possible for factor analysis methods to quantify that correlation and extract a calibration from the data. This greatly expands the applications of factor analysis in spectroscopy. Spectra can be correlated with sample properties other than absorbance; as long as the property in question varies with the spectra of the samples, a calibration might be obtained.
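The earlier point that leaving out noise-only factors lowers the noise level can be illustrated with synthetic spectra. The sketch below is an assumption-laden illustration: it uses the singular value decomposition, one standard way to compute factors and scores, and hypothetical Gaussian bands in place of real spectra. It is not claimed to be the algorithm used by any particular software package.

```python
import numpy as np

rng = np.random.default_rng(3)
n_spectra, n_points = 20, 300

# Hypothetical noise-free spectra built from two underlying components.
x = np.linspace(0, 1, n_points)
components = np.vstack([np.exp(-((x - 0.3) / 0.05) ** 2),
                        np.exp(-((x - 0.6) / 0.08) ** 2)])
amounts = rng.uniform(0, 1, size=(n_spectra, 2))
clean = amounts @ components
noisy = clean + rng.normal(0, 0.02, size=clean.shape)

# Decompose the noisy spectra and keep only the first two factors.
U, s, Vt = np.linalg.svd(noisy, full_matrices=False)
k = 2
rebuilt = (U[:, :k] * s[:k]) @ Vt[:k]

# Compare each version with the noise-free spectra.
rms_noisy = float(np.sqrt(np.mean((noisy - clean) ** 2)))
rms_rebuilt = float(np.sqrt(np.mean((rebuilt - clean) ** 2)))
print(rms_noisy, rms_rebuilt)
```

The spectra rebuilt from two factors sit closer to the noise-free spectra than the noisy measurements do, because most of the noise lives in the discarded factors.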
A classic example of the use of factor analysis methods correlating spectra with sample properties is the measurement of octane number in gasoline [1]. The octane number of a gasoline is determined by its molecular composition. The spectra of gasolines vary with their molecular composition. By using factor analysis methods, these two sets of data have been correlated. As a result, near-infrared spectra of gasolines are now used in refineries to measure octane number. In general, if measuring a sample's spectrum is faster and easier than measuring the desired property directly, a great deal of time and money may be saved by applying factor analysis methods to spectroscopic data instead of measuring sample properties directly.
B. FACTOR ANALYSIS OVERVIEW
Like any calibration algorithm, factor analysis techniques seek to model the variance in a set of data. The sources of variation in spectroscopic or
concentration data can include changes in component and interferent concentration, spectroscopic noise, and baseline variations. In a factor analysis calibration, each factor captures some amount of the variance in the data. The factors are then added together using appropriate weighting factors to try to reproduce the standard spectra. To make all this a little clearer, we will present a concrete example of how adding together spectral "factors" can describe the variance in a spectral data set.
In Chapter 2, we used solutions of isopropanol (IPA) in water in an example calibration. Five standard spectra were measured. The standard samples were made by diluting a 91% IPA in water solution with varying amounts of pure water. In theory, any of the standard spectra could be reproduced by adding together the spectra of the IPA solution and pure water using appropriate weighting factors. For example, one of the samples was made by mixing 5.8 ml of IPA solution and 4.2 ml of water, giving a sample containing 53% IPA by volume. The spectrum of this sample may be reproduced by adding together 0.58 times the IPA spectrum and 0.42 times the water spectrum as such:
    53% IPA Spectrum = (0.58)(IPA Solution Spectrum) + (0.42)(Water Spectrum)    (4.1)
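Equation (4.1) can be sketched numerically. The two "spectra" below are hypothetical Gaussian bands standing in for the measured IPA solution and water spectra, so the band positions and widths carry no chemical meaning; the point is only the weighted sum, and the fact that the weights can be recovered from the mixture by least squares.

```python
import numpy as np

# Hypothetical stand-ins for the two measured spectra (412 points each).
x = np.linspace(4000, 700, 412)
ipa_solution = (np.exp(-((x - 2970) / 40) ** 2)
                + 0.6 * np.exp(-((x - 1130) / 30) ** 2))
water = (np.exp(-((x - 3400) / 200) ** 2)
         + 0.5 * np.exp(-((x - 1640) / 40) ** 2))

# Equation (4.1): weighted sum of the two spectra.
mixture = 0.58 * ipa_solution + 0.42 * water

# Going the other way: estimate the weights from the mixture spectrum.
factors = np.column_stack([ipa_solution, water])       # shape (412, 2)
weights, *_ = np.linalg.lstsq(factors, mixture, rcond=None)
print(np.round(weights, 2))                            # [0.58 0.42]
```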
The spectra used in calculating the 53% IPA spectrum are seen in Figure 4.1. A comparison of the real and calculated spectra of the 53% IPA sample is seen in Figure 4.2. In the parlance of chemometrics, the two spectra shown in Figure 4.1 are factors, and the weighting factors (0.58 and 0.42) are called scores. In theory, all five spectra of the IPA/water standards could be reproduced by multiplying the spectra in Figure 4.1 by appropriate scores and adding them together. For example, the spectrum of the standard sample containing 70% IPA could be reproduced by using weighting factors of 0.77 and 0.23 in Equation (4.1). The real and calculated spectra in Figure 4.2 are similar but not identical; there are real absorbance differences between them. These absorbance differences represent variance in the data not modeled by Equation (4.1). In factor analysis, we would attempt to account for this unmodeled variance by adding more factors to the calculation of the standard spectra. The trick, as we shall see below, is finding the number of factors that does the best job of modeling the variance in the data set.
Figure 4.1 Spectra of a 91% IPA-water mixture (bottom), and of pure water (top), plotted against wavenumber (cm⁻¹). These spectra were used to calculate a spectrum for a sample containing 53% IPA. The spectra in this figure can be thought of as factors. A comparison of the real and calculated spectra is seen in Figure 4.2.
Figure 4.2 An overlay of a real and calculated spectrum for a standard sample containing 53% IPA, plotted against wavenumber (cm⁻¹). The calculated spectrum (dashed) was obtained by taking 0.58 times an IPA solution spectrum and adding to it 0.42 times the spectrum of pure water. The spectra with which the calculation was performed are seen in Figure 4.1.
This example illustrates an important advantage of factor analysis: its ability to describe a large data set using a relatively small number of factors and scores, thus reducing the complexity of the data set. In this example, each spectrum was measured from 4000 to 700 cm⁻¹ at 8 cm⁻¹ resolution for 412 data points per spectrum. A total of (412 × 5) or 2060 data points for all five spectra comprise the standard sample set. We can describe this data set using two factors, and two scores for each spectrum. The two factors have (412 × 2) or 824 data points, and there are a total of (5 × 2) or 10 scores. This is a total of 834 data points, a reduction of almost 2.5 times in the size of the data set. This advantage is even more startling for data sets containing many more samples than this one. The point is that large, complex data sets can be analyzed and reduced so that they can be described by simpler models.
Please realize that this discussion is a special example of what happens in factor analysis. In reality, the factors are not spectra, but are abstract vectors. Each factor attempts to span the maximum amount of variance in the data set, but typically the variance being spanned by a factor is from several root causes. For example, there is not a factor devoted just to variances in IPA or water concentration. The factor analysis algorithm knows nothing about the cause of the variance; it simply tries to span the variance as efficiently as possible. This being said, software programs do allow users to look at the factors calculated for a data set. Examination of factors is sometimes helpful to get a feel for what factors look like, and to ascertain noise levels. However, assigning unique physical meaning to features in factors should be approached with caution.
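The data-reduction bookkeeping in the IPA/water example above can be checked with a few lines of arithmetic. The second calculation, for a set of 100 spectra, is a hypothetical extension added only to show how the saving grows with the number of samples.

```python
n_points, n_spectra, n_factors = 412, 5, 2

raw_values = n_points * n_spectra               # 2060 absorbances in total
factor_values = n_points * n_factors            # 824 values in the two factors
score_values = n_spectra * n_factors            # 10 scores
reduced_values = factor_values + score_values   # 834 values in the reduced description
ratio = raw_values / reduced_values             # about 2.47

# Same bookkeeping for a hypothetical larger set of 100 spectra.
ratio_100 = (n_points * 100) / (n_points * n_factors + 100 * n_factors)

print(raw_values, reduced_values, round(ratio, 2), round(ratio_100, 1))
```

With five spectra the reduction is a factor of about 2.5; with 100 spectra described by the same two factors it would be a factor of about 40.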
C. A DATA SET FOR FACTOR ANALYSIS CALIBRATION AND VALIDATION
The data set used in calibration and validation in this chapter is described at the beginning of Chapter 3. Briefly, nine standard samples containing IPA, ethanol (EtOH), acetone, and water were made. Amounts in standard solutions were measured volumetrically (changes in volume upon mixing were ignored). Six standards were used in calibrations; three standards were set aside for use in validation. Replicate mid-infrared spectra of the nine standard samples were obtained for a total of 18 spectra. The standard spectra and their concentrations are seen in Chapter 3. IPA and EtOH are the analytes, and their concentrations were included in the calibrations. Acetone and water are interferents, and their concentrations were not included in the calibrations. All calibrations discussed in this chapter will be performed using this data set.
II. Factor Analysis Algorithms
In Chapter 3, we represented Beer's law in matrix form. We can represent a factor analysis calibration in matrix form as such:
    A = SF    (4.2)
where
    A = the matrix of absorbances of order n × p
    S = the matrix of scores of order n × f
    F = the matrix of factors of order f × p
    n = the number of standard spectra
    p = the number of absorbances
    f = the number of factors
The order of A is n × p, where n is the number of standard spectra and p is the number of wavenumbers at which absorbances were measured. Each row of matrix A contains a standard spectrum. The rows of matrix S contain the scores for each standard spectrum. Each of these scores is multiplied by the appropriate factor in F to produce A. The order of S is n × f, where f is the number of factors. Each row of F contains a factor, and the order of F is f × p. When we multiply the n × f scores matrix by the f × p factors matrix we obtain the n × p absorbances matrix. This is simply another way of stating that we are representing the standard spectra using factors and scores.
To completely reproduce any set of standard spectra, we would need to include factors that span all the variance in the data. Naturally, this is difficult because of the presence of error in the data set. Our real interest, however, is in how changes in component concentration cause the spectra to change. Factors that simply describe noise are of no interest. A strength of factor analysis is that through judicious choice of the factors used in a calibration, variation in component concentrations can be modeled and some noise can be eliminated from the calibration.
There are several different factor analysis algorithms available in modern software programs. These algorithms vary in the details of how factors and scores are calculated. These mathematical differences have practical implications for the accuracy and utility of these algorithms. The following discussion will touch upon the math behind factor analysis techniques, and will emphasize the practical impact of the use of each algorithm.
A. PRINCIPAL COMPONENTS REGRESSION (PCR)
In principal components analysis (PCA), a set of factors and scores is initially calculated from the standard spectra. Then, the concentration information is combined with the scores in a linear regression calculation (like the ones discussed in Chapter 3) to determine the final calibration coefficients. This combination of PCA and linear regression is known as Principal Components Regression (PCR). In PCR, a principal component is just another name for a factor. The calculation of the principal components
is performed as follows:
1. Find a single factor that spans the maximum possible variance in the set of standard spectra.
2. Calculate the amount of this factor in each standard spectrum. This is its score. There is one score per factor per standard.
3. Multiply the score for each standard spectrum by the factor, and subtract the product from that standard spectrum to produce a residual as such
$$(\text{standard spectrum}) - (\text{score})(\text{factor}) = \text{residual} \qquad (4.3)$$
This step strips the variance spanned by this factor from the data set, leaving behind some amount of unmodeled variance. There is one residual per standard spectrum.
4. Go to step 1 and start the process over using the set of residuals instead of the standard spectra.
The residuals are used in the next iteration of the technique to calculate a new factor, then a new set of scores, and so on. The process is repeated, and more factors and sets of scores are generated, until the user stops the process or until all the variance in the spectra has been modeled. In the end, we obtain a group of factors, and a set of scores for each factor. If we continue to find all of the factors for the standard spectra we will reach a point where the residuals are zero, and Equation (4.3) becomes Equation (4.4) as such
$$(\text{standard spectrum}) - (\text{score})(\text{factor}) = 0 \qquad (4.4)$$
We can rearrange Equation (4.4) to obtain Equation (4.5)
$$\text{standard spectra} = (\text{scores})(\text{factors}) \qquad (4.5)$$
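The four steps above, and the reconstruction in Equation (4.5), can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are made up and noise-free, no mean centering is applied, and the maximum-variance factor in step 1 is found with a singular value decomposition, which is one convenient choice and not necessarily what any particular software package does. The function name `pca_by_deflation` is hypothetical.

```python
import numpy as np

def pca_by_deflation(A, n_factors):
    """Extract factors one at a time, following steps 1-4 in the text.

    A is the (n x p) matrix of standard spectra, one spectrum per row.
    Returns S (n x f) scores, F (f x p) factors, and the final residuals.
    """
    residual = np.array(A, dtype=float)
    scores, factors = [], []
    for _ in range(n_factors):
        # Step 1: the unit-length direction spanning the most remaining
        # variance is the leading right singular vector of the residuals.
        _, _, vt = np.linalg.svd(residual, full_matrices=False)
        factor = vt[0]
        # Step 2: the score is the amount of this factor in each spectrum.
        score = residual @ factor
        # Step 3: strip the variance spanned by this factor (Equation 4.3).
        residual = residual - np.outer(score, factor)
        scores.append(score)
        factors.append(factor)
        # Step 4: loop again, now working on the residuals.
    return np.column_stack(scores), np.vstack(factors), residual

# Made-up data: 6 standards, 50 wavenumbers, built from 2 pure-component bands.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
pure = np.vstack([np.exp(-((x - 0.3) / 0.05) ** 2),
                  np.exp(-((x - 0.7) / 0.08) ** 2)])
conc = rng.uniform(0.1, 1.0, size=(6, 2))
A = conc @ pure                      # noise-free, so two factors suffice

S, F, residual = pca_by_deflation(A, n_factors=2)
print(S.shape, F.shape)              # (6, 2) (2, 50)
print(np.allclose(A, S @ F))         # Equation (4.5): spectra = scores x factors
```

Because the made-up spectra contain only two sources of variation and no noise, two factors leave residuals that are zero to within rounding error. With real spectra the residuals after the last useful factor would contain the noise.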
Equation (4.5) allows us to calculate, or "reconstruct," the entire set of standard spectra using the factors and scores obtained via principal components analysis. Note that Equation (4.5) is just a restatement of Equation (4.2). Up to this point, we have only used the spectra to calculate the factors and scores. The concentrations of the analytes, the second important piece of information needed to obtain a calibration, have not been used. Therefore, the second step of PCR is to relate the scores to the concentrations using calibration coefficients named "B" as follows:
$$C = B S^{T} \qquad (4.6)$$
where
C = the matrix of component concentrations, of order $m \times n$
B = the matrix of calibration coefficients, of order $m \times f$
$S^{T}$ = the transpose of the scores matrix, of order $f \times n$
The quantities $n$, $f$, and $p$ were defined above; $m$ is the number of analytes. The order of C is $m \times n$; each column of C contains the analyte concentrations for a given standard sample. The coefficients matrix B is of order $m \times f$. Each row of B contains the coefficients for a specific analyte, and each column of B contains the coefficients for a specific factor. The matrix $S^{T}$ has dimension $f \times n$ since the untransposed matrix S has order $n \times f$.
A sanity check for the validity of any matrix multiplication is to examine the orders of the matrices being multiplied together. Recall from Chapter 3 that for any two matrices A and B to be multiplied together as AB, the number of columns in A must equal the number of rows in B. In terms of matrix orders, if A is of order $q \times w$, then B must have $w$ rows and have order $w \times r$, where $q$ and $r$ can be any number. The order of the result of the multiplication AB would be $(q \times w)(w \times r) = (q \times r)$. In Equation (4.6), the orders of the matrices being multiplied are $(m \times n) = (m \times f)(f \times n)$. This satisfies the condition that the number of columns in the first matrix equals the number of rows in the second matrix. The multiplication produces an $(m \times n)$ matrix, which is the correct order for the concentration matrix. The columns of $S^{T}$ contain the scores for each standard. These are multiplied by the rows of B, which contain the calibration coefficients for each analyte, to produce the concentration of each analyte in each standard. The form of matrix Equation (4.6) is similar to the matrix equations seen in Chapter 3, where we used the least squares method to solve matrix equations.
Using an ILS approach, we can solve Equation (4.6) for the calibration coefficients matrix B as such
$$B = C S^{TT} \left(S^{T} S^{TT}\right)^{-1}$$
Now, the double transpose of any matrix is simply itself, so $S^{TT} = S$ [6], which simplifies the regression equation to
$$B = C S \left(S^{T} S\right)^{-1} \qquad (4.7)$$
$(S^{T} S)^{-1}$ is a square matrix of order $(f \times f)$. The orders of the matrices in Equation (4.7) are then
$$(m \times f) = (m \times n)(n \times f)(f \times f)$$
so the orders work out, indicating that the multiplication in Equation (4.7) is valid.
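As a quick numerical check of Equations (4.6) and (4.7), the sketch below fabricates a scores matrix and a set of coefficients, builds the concentrations from them, and then recovers the coefficients. All sizes and values are made up for illustration, and the variable names are not from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n, f, m = 8, 3, 2                    # standards, factors, analytes (made-up sizes)

S = rng.normal(size=(n, f))          # scores matrix, n x f
B_true = rng.normal(size=(m, f))     # coefficients used to fabricate C
C = B_true @ S.T                     # Equation (4.6): C = B S^T, m x n

# Equation (4.7): B = C S (S^T S)^-1
StS_inv = np.linalg.inv(S.T @ S)     # f x f
B = C @ S @ StS_inv                  # (m x n)(n x f)(f x f) -> m x f

# The same least squares problem solved without forming an explicit inverse.
B_lstsq = np.linalg.lstsq(S, C.T, rcond=None)[0].T

print(B.shape)                       # (2, 3)
print(np.allclose(B, B_true))        # True for this noise-free example
```

The explicit inverse mirrors the equation as written. In practice a least squares solver such as `np.linalg.lstsq` is usually the safer numerical choice when the scores are nearly collinear.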
The first part of the PCR calibration produced the factors matrix F and the scores matrix S. Equation (4.7) gives the second part of the calibration, the matrix of calibration coefficients B. These three matrices, F, S, and B, comprise a calibration in PCR. Our final goal is to obtain a prediction equation that includes the F, S, and B matrices. We can in theory rearrange Equation (4.2) as such
$$S = A F^{-1}$$
and substitute it into Equation (4.7). However, recall from Chapter 3 that a matrix inverse may or may not exist for a given matrix, which means $F^{-1}$ may not exist. Then how do we obtain a PCR calibration? Fortunately, the factors matrix generated via PCA has a special property: it is an orthogonal matrix. Orthogonal matrices have the property that when multiplied by their transpose, they give the unity (identity) matrix I [6]. Thus, the following holds true for the factors matrix F.
$$F F^{T} = I$$
This allows us to solve Equation (4.2) for the scores matrix as such [5]
$$A F^{T} = S F F^{T}$$
$$A F^{T} = S I$$
$$S = A F^{T} \qquad (4.8)$$
Let us examine the order of the matrices in Equation (4.8) as a sanity check. A is $(n \times p)$, $F^{T}$ is $(p \times f)$, and
$$(n \times f) = (n \times p)(p \times f)$$
which means these two matrices can be multiplied together. Their product is a matrix of order $(n \times f)$, which is the correct order for the scores matrix S. Now, our goal is a prediction equation, which means that we need concentration as the dependent variable. Equation (4.6) fits this description. The problem is that Equation (4.6) contains the term $S^{T}$ while Equation (4.8) contains S. To find $S^{T}$ we simply transpose both sides of Equation (4.8) as such
$$S^{T} = \left(A F^{T}\right)^{T}$$
Now, there is a theorem of matrix algebra that says for any two matrices A and B [6]
$$(AB)^{T} = B^{T} A^{T}$$
Applying this theorem to the equation for $S^{T}$ gives
$$S^{T} = F A^{T}$$
Substituting this into Equation (4.6) gives
$$C = B F A^{T} \qquad (4.9)$$
Once more let us look at the orders of the matrices multiplied together in Equation (4.9), which are
$$(m \times n) = (m \times f)(f \times p)(p \times n)$$
When multiplied out this gives a matrix of order $(m \times n)$, which is the proper order for the concentration matrix C. Thus, the orders of the matrices multiplied in Equation (4.9) are correct. When an unknown spectrum is measured, its absorbances are in the $A^{T}$ matrix in Equation (4.9). Since we know B and F from the calibration, we can predict the unknown concentrations, C, by simply multiplying the appropriate matrices together. Equation (4.9) is the prediction equation for PCR analysis.
The problem with PCR being a two step process is that no concentration information is incorporated into calculating the factors, and they may capture sources of variation caused by changes other than in the component concentrations. For this and other reasons, factor analysis algorithms other than PCR have been developed.
B. THE PARTIAL LEAST SQUARES (PLS) ALGORITHMS
The advantage of the Partial Least Squares (PLS) algorithm is that it is a one step process. The spectral and concentration information are included in the calculation of the factors and of the scores. One way of thinking of the calculation is that a set of factors and scores is generated for the spectra, and a set of factors and scores is generated for the concentrations. The calculation of the factors and scores for the spectroscopic and concentration data looks like this in equation form:
$$A = S_{s} F_{s} \qquad (4.10)$$
where
A = the absorbance matrix
$F_{s}$ = the factors calculated from the spectral data
$S_{s}$ = the spectral scores matrix
which is simply a variation of Equation (4.2), and
$$C = S_{c} F_{c} \qquad (4.11)$$
where
C = the concentration matrix
$S_{c}$ = the concentration scores matrix
$F_{c}$ = the factors calculated from the concentration data
However, all these calculations are performed in tandem, and the two sets of scores are related to each other. The advantage of including concentration information in the factors calculation is that it will help factors better capture variance due to component concentration fluctuations. This is the variance we seek to model.
There are two variants of the PLS algorithm, known as PLS-1 and PLS-2. In PLS-1, a separate set of scores and factors is calculated for each analyte. PLS-2 works more like PCR in that only one set of factors and scores is calculated for the entire data set. The advantage of PLS-1 is that it is in theory more accurate since scores and factors are tuned to each analyte. In general, PLS-1 works better on systems where the concentrations vary widely. PLS-2 gives equivalent results to PLS-1 on systems where the concentrations do not vary widely, and better results when there is correlation between the concentrations.
C. ALGORITHM DISCUSSION AND COMPARISON
Now that we have discussed factor analysis techniques, we are in a position to understand where the advantages of these algorithms come from. Recall that each factor describes a certain amount of variance in a data set. Once factors have been calculated to model all other sources of variability, higher numbered factors model the noise in the data. We will call these noise factors. When determining which factors to include in a calibration, judiciously excluding noise factors excludes some of the noise in the data set from being modeled. This enhances the model quality and accuracy. However, if a factor models noise but also models variations in component concentration, excluding it from a calibration may actually decrease calibration accuracy and quality. The question is how do we distinguish noise
factors from factors that model variations in component concentration. To do this we have to understand more about what factors look like. An example of a couple of factors, calculated using the example data set and the PLS-2 algorithm, is shown in Figure 4.3. The standards used contained IPA, EtOH, water, and acetone. In general, factors are abstract, and it can be dangerous to read physical meaning into them. However, the gross differences between the two factors in Figure 4.3 need an explanation. The left hand factor in Figure 4.3 was the second one calculated. Unlike a spectrum, it contains peaks that point up and down. However, the major features have positions and widths similar to important peaks in the spectra of the standards. This factor is modeling, to some extent, variations in component concentrations. The ninth factor calculated is shown in the right hand side of Figure 4.3. The features here look nothing like spectral features, and in fact, this factor is probably modeling noise. The flat areas in these factors are spectral regions excluded from the calculation. Looking at factors like this can be instructive, but do not read too much into their appearance. Figure 4.3 intentionally shows two factors that look radically different for illustrative purposes. In reality, factors may look very similar, and their appearance by itself would not be enough to help decide which of them, if any, are to be included in a final calibration. Many tools can be used to help decide which factors are to be included in a calibration, as will be discussed below.
Figure 4.3 Two factors calculated using the PLS-2 algorithm and the example data set. The standards used in the calibration contained IPA, EtOH, water, and acetone. Left: The second factor calculated. Note that although it is abstract, it has features reminiscent of the peaks in a spectrum. Right: The ninth factor calculated. It appears to be mostly noise. The flat sections in the factors are spectral regions excluded from the calculation.
An advantage of factor analysis algorithms is that they tolerate interferents. It does not matter whether there are many species with bands overlapping those of the analytes. It does not even matter whether you know the concentrations of all the species in the sample. As long as you know the concentrations of the analytes in the standards, and the standards are truly representative of the unknowns to be analyzed, a factor analysis calibration is possible. How is this accomplished? Variations in interferent concentration are just another source of variance. The factors calculated can model this variance along with variations in analyte concentration. As long as the factors modeling the interferent and component concentrations are included in a calibration, analyte concentrations can be successfully predicted.
Another advantage of factor analysis techniques is that they tolerate spectral artifacts. Examples of artifacts include baseline drift, slope, and curvature caused by changes in the spectrometer or sample. These artifacts are yet another source of variance, and can be modeled using factors. Again, including the factors describing this variance in the model helps reduce their impact on the accuracy of predicted concentrations. However, only artifacts that are stable and present in both the standards and the unknowns can be modeled. Artifacts absent from the standards will not be modeled.
Given that factor analysis algorithms have many advantages, and that we have talked about three of them (PCR, PLS-1, and PLS-2), which algorithm should we use? In general, either PLS algorithm is more accurate than the PCR method. This is because in PLS the concentration data is included in the calculation of the factors.
Recall also that for PLS-1 a separate set of factors and scores is calculated for each component, whereas for PLS-2 there is only one set of factors and scores. In theory, this means that by using more factors and scores to describe a data set, PLS-1 will be more accurate than PLS-2. Because of these ideas, traditionally the accuracy of factor analysis algorithms has been listed as PLS-1 > PLS-2 > PCR [2]. However, there are known exceptions to this dictum. So, which algorithm should you use? Modern software makes it relatively easy to perform test calibrations using all the different algorithms. Once these are in hand, their quality can be checked, and the algorithm that does the best job on your data set can be chosen. In summary, unless onerous amounts of calculation time are involved, try all the algorithms and use the one that works best.
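The tolerance to interferents described in this section can be illustrated with a small PCR calculation that strings together Equations (4.7), (4.8), and (4.9). The sketch below is only a toy under stated assumptions: the band shapes, concentrations, and number of factors are made up, the spectra are noise-free, and a singular value decomposition stands in for the PCA step. The interferent concentrations are deliberately never given to the calibration.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 80)

def band(center, width):
    """Made-up Gaussian absorbance band on the x axis defined above."""
    return np.exp(-((x - center) / width) ** 2)

# Rows: first analyte, second analyte, interferent (overlapping bands).
pure = np.vstack([band(0.35, 0.08), band(0.50, 0.10), band(0.42, 0.15)])

K = rng.uniform(0.05, 0.30, size=(12, 3))   # all concentrations in 12 standards
A = K @ pure                                # standard spectra, n x p
C = K[:, :2].T                              # only the two analytes are known, m x n

# Calibration part 1: factors from the spectra alone (SVD used as the PCA step).
f = 3
_, _, vt = np.linalg.svd(A, full_matrices=False)
F = vt[:f]                                  # f x p, rows orthonormal so F F^T = I
S = A @ F.T                                 # Equation (4.8)

# Calibration part 2: regress concentrations on scores, Equation (4.7).
B = C @ S @ np.linalg.inv(S.T @ S)

# Prediction, Equation (4.9): C = B F A^T for two "unknown" spectra.
K_unknown = np.array([[0.10, 0.20, 0.25],
                      [0.22, 0.08, 0.07]])
A_unknown = K_unknown @ pure
C_pred = B @ F @ A_unknown.T                # m x (number of unknowns)

print(np.round(C_pred.T, 4))                # compare with K_unknown[:, :2]
```

With three factors the analyte concentrations of the unknowns are recovered even though the interferent was never quantified, because its variation is spanned by the factors. Cutting `f` to 2 in this toy example leaves one source of variance unmodeled, and the predictions should degrade accordingly.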
III. Standards Preparation and Training Set Design
A training set is just another name for the spectra and concentration information used to generate a calibration. The name stems from the
fact that we are "training" the algorithm about the correlation between the absorbance and concentration data. Regardless of whether we use the term training set or standard spectra, all of the advice given in Chapter 2 about how to prepare and treat standards still applies. Most importantly, remember that the standards have to mimic the unknowns. Any variance that is not present in the standards but is present in the unknowns will invalidate the predictions. Although factor analysis algorithms are powerful, that is no excuse for collecting poor data. The more painstaking your experimental technique, the better your calibration will be regardless of the algorithm chosen.
The first issue to address in training set design is where the standard samples are going to come from. There are typically two scenarios. First, for single analyte or simple multicomponent systems, aliquots of pure materials can be mixed together to make standards. Second, in process control or situations where complex mixtures must be analyzed, samples may need to be collected and the concentrations determined by some primary method, such as a titration or gravimetric analysis. One important thing to realize is that quantitative spectroscopy is a secondary method of analysis. We cannot simply walk up to a spectrometer, place a sample in it, and immediately obtain quantitative information. We must first calibrate the spectrometer, which means that we have to know the concentration of components in the standards using some other primary means of analysis.
Regardless of how the standards are obtained, it is critical that you know the concentration of analytes in these samples accurately. The SNR (signal-to-noise ratio) of spectroscopic measurements is often hundreds or thousands to one, whereas it is rare to have concentration data with an SNR greater than 100 (i.e., an error of less than 1%).
Thus, the error in the concentration data is often the major source of error in a calibration. Accordingly, it is critical that the concentrations of the standards be known with as much accuracy as possible. Experience has shown that if the concentration error is less than 5%, usable calibrations can be achieved with factor analysis algorithms [2]. Reducing the concentration error to 2 or 1% greatly increases the calibration quality and makes obtaining a calibration easier. On the other hand, for concentration errors greater than 5%, obtaining usable calibrations becomes more difficult. For concentration errors greater than 10%, a usable calibration may not be obtained at all. The maxim to keep in mind is that the more accurate the concentration data, the better.
Once the accuracy of the concentration data is established, the next issue to tackle is the number of standards to analyze. In Chapter 2, the formula "2n + 2" was given for the number of standards to obtain, where n is the number of components in the system (not analytes, components means
separate sources of spectral variation). This equation, however, represents the minimum number of standards needed to mathematically obtain a calibration for a given number of sources of variability. In almost all circumstances, more standards are better. For factor analysis methods, a rule of thumb [2] is that 3n is the minimum number of standards to use; 5n is even better, and 10n is ideal. Granted, as the number of standards increases, the cost and complexity of achieving a calibration increases, and it may not always be possible to use 5n or 10n standards. However, effort expended to increase the number of standards will usually be repaid in accuracy and ease of model building. Ultimately, the common sense and judgment of the analyst must be used to balance the benefit of using many standards with the time and cost of obtaining them and measuring their concentrations.
Another consideration when obtaining standard samples is the concentration range to use. It is critical that you bracket the expected concentration range of your unknown samples with standards. For example, if you are developing a method to analyze for EtOH, and you think your unknown samples will contain EtOH in the range of 5-15%, it would be necessary to have standards containing less than 5% and more than 15% EtOH to bracket these EtOH concentrations. Failure to do so would mean that you would be applying the calibration in a concentration regime where you do not have data. It is always tempting to extrapolate a model into regions where you do not have data. It is also always wrong to do so. A calibration can only be legitimately applied in concentration regimes for which you have calibration data. The only way to extend the concentration range of a model is to measure the spectra of standards in the desired concentration range, then include this information in the model.
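The two rules in the preceding paragraphs, how many standards to prepare and whether they bracket the expected range, reduce to simple arithmetic. The helper functions below are hypothetical names introduced only for illustration, the Chapter 2 minimum is taken as 2n + 2, and the standard concentrations are made up.

```python
def standards_needed(n_components):
    """Numbers of standards suggested in the text for n sources of variation."""
    n = n_components
    return {
        "mathematical minimum (2n + 2)": 2 * n + 2,
        "factor analysis minimum (3n)": 3 * n,
        "better (5n)": 5 * n,
        "ideal (10n)": 10 * n,
    }

def brackets_range(standard_concs, expected_low, expected_high):
    """True if the standards extend below and above the expected unknown range."""
    return min(standard_concs) < expected_low and max(standard_concs) > expected_high

# Four sources of spectral variation, e.g. a four-component mixture.
print(standards_needed(4))

# Hypothetical EtOH standards (percent) for unknowns expected between 5 and 15%.
print(brackets_range([3, 6, 9, 12, 15, 18], 5, 15))   # True
print(brackets_range([5, 8, 11, 14], 5, 15))          # False
```

The second set of standards fails the check because nothing lies below 5% or above 15%, so predictions near the ends of the expected range would be extrapolations.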
Yet another thing to consider when preparing standards is the distribution of the concentrations across the concentration range. Imagine again that you want to develop a calibration for 5-15% EtOH. Does it matter whether most of the standard concentrations are clustered near 5%, near 15%, or near the middle of the range? The answer is yes. Simply put, the model will be most accurate in those concentration ranges where there are the most samples. The concentration range with the most standards will carry more weight in the model than those ranges with fewer data points. For example, if nine standards contained greater than 10% EtOH and only one was less than 10%, the model would be skewed towards predicting higher concentrations better than lower concentrations. Conversely, if nine standards contained less than 10% EtOH, and only one contained more than 10%, the model would be skewed to predicting lower concentrations better than higher concentrations. If we want our model to have equal accuracy throughout the concentration ranges of our components, standard concentrations should ideally be evenly spread throughout the concentration
range. You want an approximately even number of standards at low, medium, and high concentrations. This being said, it is not always practical. When making up standards from pure components you have control over the standard concentrations. However, in process control environments, you may not have control over the concentration values in the samples. In these cases, you cannot choose to evenly spread out your standard concentrations. However, keep in mind that your model will still predict best in concentration ranges where there is the most data, and predict worst in concentration ranges where there is the least data.
When choosing the concentrations of analytes in standards, it is also important to make sure that the concentrations of two or more analytes are not collinear. This means that the concentrations cannot be a multiple of each other. For example, if the concentrations of EtOH in a set of three standards were 1, 2, and 3%, and if the concentrations of IPA in these same standards were 2, 4, and 6%, the standards would be collinear because the IPA concentration is always twice the EtOH concentration. This is a problem because factor analysis algorithms correlate the changes in one data set with the changes in another. If the concentrations of two components are multiples of each other, their relative concentration does not change. There is no variation in this data, and there will be no factors calculated to account for differences between these components. This would lead to a calibration that would be completely unusable except for the unlikely case that the IPA concentration was always twice that of the EtOH concentration. To test for collinearity, plot the actual concentrations of one analyte against those of another. This is illustrated in Figure 4.4, which is a plot of the actual concentrations of two analytes drawn from real data. The plot in Figure 4.4 is a straight line.
Its correlation coefficient (the square root of $R^{2}$) is 0.999, indicating a very good straight line relationship. These two sets of concentrations are collinear. Ideally, a plot of actual versus actual concentration for two components should produce a random scatter of points. A more random actual versus actual concentration plot is shown in Figure 4.5. The correlation coefficient for this plot is 0.17. While not perfectly random (which would give a correlation coefficient of zero), the scatter of concentrations is random enough to allow factor analysis algorithms to model all the variations in component concentrations. A classic way of getting into collinearity trouble is to make a stock solution of concentrated components, then make serial dilutions of this solution. For example, if a stock solution contained three components in concentrations of 30, 20, and 10%, serial dilutions of this sample would always produce samples where the ratio of the three concentrations was 3:2:1. This would mean that there would be no relative changes in concentration, and factor analysis algorithms may have difficulty finding
Figure 4.4 A plot of actual versus actual concentrations for two analytes drawn from real data (plot title: 2mix.tdf.2, $R^{2}$ = 0.997351; x-axis: Actual Concentration (C1)). Since the plot is a straight line, the concentrations are collinear.
Figure 4.5 A plot of actual versus actual concentrations for two analytes drawn from real data (plot title: 4mix2.tdf.1, $R^{2}$ = 0.0313160772; x-axis: Actual Concentration (C1)). The concentrations are well scattered, so these data are not collinear.
any meaningful variation in the data. Ultimately, it is best to randomize the concentrations in your standards to avoid collinearity problems. In addition to everything else discussed in this section, you need to consider how many spectra of each standard to use in your calibration.
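Returning briefly to the collinearity test described above, the correlation coefficient of an actual versus actual concentration plot is easy to compute directly. The sketch below uses the EtOH and IPA values from the example in the text, plus a serial dilution of a 30:20:10 stock whose dilution factors are made up. The function name `correlation` is a hypothetical helper, not part of any package.

```python
from math import sqrt

def correlation(x, y):
    """Pearson correlation coefficient between two concentration lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

etoh = [1.0, 2.0, 3.0]          # percent, from the example in the text
ipa = [2.0, 4.0, 6.0]           # always twice the EtOH concentration
print(correlation(etoh, ipa))   # 1.0, perfectly collinear

# Serial dilutions of a 30:20:10 stock keep the 3:2:1 ratio
# (the dilution factors are assumptions for illustration).
dilutions = [1.0, 0.5, 0.25, 0.125]
component_1 = [30 * d for d in dilutions]
component_3 = [10 * d for d in dilutions]
print(correlation(component_1, component_3))
```

A value near 1 (or near -1) for any pair of analytes signals the collinearity problem described in the text, while a value near zero indicates well scattered concentrations.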
For factor analysis to work properly, you must try to capture all sources of variation in the spectra, including instrument and sampling variation. A good way to capture this variation is to take two or more spectra of each standard and use them in the model. This does not mean placing the standard into the spectrometer, measuring spectrum #1, waiting a moment, and then measuring spectrum #2. To truly model sampling variation, separate aliquots of the standard must be used to measure separate spectra. Therefore, you would place aliquot 1 of the standard in the instrument to obtain spectrum 1. The sample cell would be cleaned, the instrument reset, then spectrum 2 using aliquot 2 of the same standard would be measured. Ideally, you should even randomize the order in which the aliquots of the various samples are measured. In practice, using two or three separate spectra of the same standard in a calibration does a good job of capturing spectroscopic and sampling variance.
It is a good idea to take at least two spectra of separate aliquots of each standard, and subtract them from each other using a subtraction factor of 1. The point of this exercise is to look for uncontrolled variance. In theory, two spectra taken of the same sample in the same instrument should be identical. The subtraction result, called the residual, should be a flat line containing nothing but noise if there are no unusual sources of variation in the experiment. The top residual in Figure 4.6 is a good example of a residual that contains nothing but noise (and a CO2 peak at 2350 cm⁻¹). This residual indicates that the process of taking aliquots and measuring spectra is reproducible and under control. The opposite situation is seen in the residual at the bottom of Figure 4.6. This residual contains what appear to be real spectral features at wavenumbers where the sample absorbs.
Variations in absorbance between the two spectra cause these features, and indicate that the process of sampling and measuring spectra is not under control. It is critical that all such sources of uncontrolled variability be eliminated before proceeding with a calibration.
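The subtraction test just described can be mimicked with simulated spectra. Everything in the sketch below is an assumption made for illustration: the band shape, the noise level, the 5% scale change standing in for a sampling problem, and in particular the `has_sample_features` check, which is a hypothetical numerical stand-in for inspecting the residual by eye.

```python
import math
import random

random.seed(0)
wavenumbers = [1000 + 2 * i for i in range(200)]            # made-up axis, cm-1

def band(w, center=1200.0, width=15.0, height=1.0):
    """Made-up Gaussian absorbance band."""
    return height * math.exp(-((w - center) / width) ** 2)

def measure(scale, noise=0.001):
    """Simulated spectrum of one aliquot; `scale` mimics sampling variation."""
    return [scale * band(w) + random.gauss(0.0, noise) for w in wavenumbers]

def residual(spec_a, spec_b, factor=1.0):
    """Subtract spectrum b from spectrum a using the given subtraction factor."""
    return [a - factor * b for a, b in zip(spec_a, spec_b)]

def has_sample_features(res, threshold=6.0):
    """Hypothetical check: compare the band region with a band-free region."""
    baseline = [r for w, r in zip(wavenumbers, res) if w < 1100]
    in_band = [r for w, r in zip(wavenumbers, res) if 1170 <= w <= 1230]
    noise_level = math.sqrt(sum(r * r for r in baseline) / len(baseline))
    return max(abs(r) for r in in_band) > threshold * noise_level

aliquot_1 = measure(scale=1.00)
aliquot_2 = measure(scale=1.00)     # reproducible sampling
aliquot_3 = measure(scale=1.05)     # e.g. a 5% change in pathlength (assumed)

print(has_sample_features(residual(aliquot_1, aliquot_2)))
print(has_sample_features(residual(aliquot_1, aliquot_3)))
```

With these settings the first residual should contain only noise, while the second should show a band-shaped feature well above the noise and be flagged. On real data a visual inspection of the residual, as in Figure 4.6, remains the more informative test.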
IV. Spectral Preprocessing
After standard spectra are measured, but before a calibration is performed, it is possible to manipulate the spectra using a computer and software to try to improve the calibration. The idea is to reduce spectral problems such as noise and baseline drift, or to attempt to eliminate unwanted spectral features (artifacts). However, as will be seen below, if spectral manipulations are used improperly the data in a spectrum can be damaged or destroyed. This adds unwanted variance to a data set, and can destroy the correlation between absorbance and concentration. Whenever spectra are manipulated, the original data should be kept in case the manipulation creates problems.
Figure 4.6 Spectral residuals obtained by subtracting two spectra of the same sample. Top: no sample features are seen, just noise and a CO2 peak. This indicates no unusual sources of variability in the experiment. Bottom: Real spectral features, pointing up and down, indicate that a source of uncontrolled variation exists in the experiment.
Also, how and why the spectra were manipulated should be recorded for anyone wishing to reproduce the work. This section will cover appropriate and inappropriate uses of spectral manipulations in quantitative analysis.
A. MEAN CENTERING
Mean centering is probably the most frequently used type of spectral preprocessing in factor analysis. It involves taking the average of a set of standard spectra, and then subtracting it from each standard spectrum. The result is that each spectrum is now centered on zero absorbance units. The net effect is to remove the largest source of variation from the data before calibration. In so doing, small changes in the data set will carry more weight in the calibration and in theory make the calibration more sensitive. In addition, by centering, the data points that were far away from the mean are brought closer to the middle. This reduces the impact of outlying data on the calibration, and increases the impact of data that falls closer to the center. This is generally positive since you want the calibration to be most sensitive where most of the data falls. The average spectrum calculated via mean centering is not thrown away. It is treated like any other factor, and is used in the final calibration to reproduce the standard spectra. There is some discussion in the literature
about whether you should mean center your data or not [2]. However, in papers this author has read the data is mean centered more often than not. If your software allows you to turn mean centering on and off, the best approach may simply be to calibrate both ways and see which one produces the better result for your specific data set.
B. SPECTRAL DERIVATIVES
As you may recall from calculus, the slope of any mathematical function can be determined by calculating its derivative. Since a spectrum is a mathematical function, spectral derivatives can be calculated. The derivative of a spectrum can be taken a number of times, producing first derivatives, second derivatives, etc. There are a number of algorithms used to calculate derivatives, the choice of which will depend on the software package you use. The "Savitzky-Golay" algorithm [3] is widely used, and is the algorithm we will use in this book to calculate derivatives. To understand the appearance of spectral derivatives, we need to remember that the first derivative of a function measures slope. The signs of the slopes of a typical absorbance band are noted in Figure 4.7. On the low wavenumber side of the peak the slope is positive. At the very top of the peak, there is a small region where the peak is flat and the slope is zero. On the high wavenumber side of the peak the slope is negative.
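The sign pattern of the slope across a band can be checked numerically. The sketch below uses a made-up Gaussian band and a plain central-difference derivative, which is a simpler stand-in chosen for illustration and is not the Savitzky-Golay algorithm. The function name `first_derivative` is hypothetical.

```python
import math

# Made-up absorbance band centred at 1200 cm-1 on a 2 cm-1 grid.
wavenumbers = [1160 + 2 * i for i in range(41)]          # 1160 ... 1240
absorbance = [math.exp(-((w - 1200) / 12.0) ** 2) for w in wavenumbers]

def first_derivative(x, y):
    """Central-difference slope at the interior points (ends are dropped)."""
    return [(y[i + 1] - y[i - 1]) / (x[i + 1] - x[i - 1])
            for i in range(1, len(y) - 1)]

slope = first_derivative(wavenumbers, absorbance)
interior = wavenumbers[1:-1]

low_side = [s for w, s in zip(interior, slope) if w < 1200]
high_side = [s for w, s in zip(interior, slope) if w > 1200]
at_peak = slope[interior.index(1200)]

print(all(s > 0 for s in low_side))    # True: positive slope below the peak
print(all(s < 0 for s in high_side))   # True: negative slope above the peak
print(abs(at_peak) < 1e-12)            # True: zero slope at the band maximum
```

A simple difference like this amplifies noise in real spectra, which is one reason smoothing derivative algorithms are normally preferred in practice.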
Figure 4.7 The signs of the slope for a typical absorbance band. The slope at low wavenumber is positive, zero at the top of the peak, and negative at high wavenumber.
Figure 4.8 Bottom: The absorbance band seen in Figure 4.7. Top: Its first derivative (Savitzky-Golay algorithm, second order, 5 points). Note that the wavenumber at the top of the absorbance band is the point at which the value of the derivative passes through zero. The first derivative has been plotted as its individual data points for clarity.
This change in slope across the band means that its derivative should have a positive lobe, pass through zero at the peak wavenumber, and then have a negative lobe. This is illustrated in Figure 4.8, which shows the first derivative of the absorbance band seen in Figure 4.7. The derivative seen in Figure 4.8 has a positive lobe at low wavenumber, passes through zero, and has a negative lobe at high wavenumber as expected. The derivative passes through zero at the wavenumber corresponding to the top of the peak. For this reason, derivatives are sometimes used to locate peak positions in spectra. Note that in parts of the spectrum where the baseline is flat, the derivative is zero since the slope of a horizontal line is zero.
Baseline offset is a real problem when measuring spectra. Offset means that the entire baseline of a spectrum is raised above or lowered below zero absorbance units. Variation in baseline levels produces unwanted variation in absorbance measurements. A spectrum with 0.5 AU of offset is shown in Figure 4.9. Many instrumental or sampling problems can cause baseline offset. An excellent way of removing offset is by taking the first derivative of a spectrum. Offset represents a constant value added to all the absorbances in a spectrum. The derivative of a constant is zero, so when a first derivative is calculated the offset is removed, and the baseline of the derivative falls to zero. Figure 4.10 illustrates how the
Figure 4.9 An absorbance spectrum with 0.5 absorbance unit offset.
Figure 4.10 Top: A spectrum containing 0.5 absorbance units of offset. Bottom: The first derivative of this spectrum, which contains no offset and whose baseline is at zero absorbance units (derivative calculated using the Savitzky-Golay algorithm, second order, 5 points).
offset seen in Figure 4.9 is corrected by taking the first derivative of the spectrum. The peak area information in a first derivative is conserved, so the quantitative correspondence between the spectra and concentration still
Figure 4.11 A description of the changes in concavity across an absorbance band.
exists in derivative spectra. So, using first derivatives in a calibration gets rid of baseline offset while retaining the spectral details needed to correlate absorbances and concentrations. The disadvantage of some algorithms used to calculate derivatives is that they introduce noise into the result. This noise decreases calibration accuracy, and can defeat the purpose of using a derivative in the first place. It is the job of the analyst to determine whether the offset eliminated by using derivatives compensates for the spectral noise they might introduce.
In addition to a first derivative, the second derivative of a spectrum can also be calculated. The second derivative can be thought of as the derivative of the first derivative. The second derivative measures a function's concavity, or curvature. The concavity of a typical absorbance band is shown in Figure 4.11. At the edges of a peak, where the peak just begins to rise or settle back to the baseline, the curvature is upward, or positive. On the steep sides of the peak, the concavity passes through a transition point called the inflection point, where the concavity is zero. The top of a peak is concave down, or negative. Because of this behavior, we would expect the second derivative of an absorbance band to have a positive lobe, pass through zero at the first inflection point, have a negative lobe, pass through zero again at the second inflection point, and have a positive lobe before settling back down to the baseline. This is illustrated in Figure 4.12. The bottom of the negative lobe in a second derivative corresponds to the wavenumber of maximum absorbance of the band in the original spectrum. This is easily seen in Figure 4.12, where the bottom of the second
Figure 4.12 Bottom: A typical absorbance band. Top: Its second derivative. Note that there are two positive lobes and one negative lobe in the second derivative. Also, note that the derivative passes through zero at the inflection points. Lastly, note that the bottom of the derivative points to the top of the absorbance peak (second derivative calculated using the Savitzky-Golay algorithm, second order, five data points).
derivative feature is seen pointing at the top of the peak from which the derivative was calculated. This is why second derivative spectra can be used in peak picking, and why second derivatives are used to "pull apart" groups of overlapped peaks. Like first derivatives, second derivatives have zero baseline offset, so the baseline of the second derivative in Figure 4.12 is at zero. Second derivatives also have zero baseline slope, because a straight line has zero concavity (curvature). This is illustrated in Figure 4.13. The original spectrum, shown at the bottom of the figure, slopes upward at low wavenumber. In the second derivative of this spectrum, shown at the top of the figure, there is no slope and the baseline falls at zero. Peak area information is conserved in second derivatives, preserving the correlation between the spectral measurement and concentration values.
The advantage of using second derivatives in a calibration is that both offset and slope are removed from a spectral data set. With first derivatives, just offset is removed. Second derivatives are typically noisier than the spectra from which they are calculated. This introduces inaccuracy into a calibration, and may defeat the purpose of using the second derivative in the first place. It is the job of the analyst to determine whether the offset and slope eliminated compensate for the noise that might be introduced
Figure 4.13 Bottom: An absorbance spectrum containing slope. Top: The second derivative of this spectrum, devoid of slope (second derivative calculated using the Savitzky-Golay algorithm, second order, five data points).
when calculating second derivatives. The ability of the first and second derivatives to remove offset and slope from the spectra makes them useful for reflectance spectra, where baseline slope can be a problem.
C. BASELINE CORRECTION
Baseline correction, like derivatives, can be used to remove both baseline offset and slope from a spectrum. In this technique, a function parallel to the slope or curvature in a spectrum's baseline is drawn, either by the user or by an algorithm. The function is then subtracted from the spectrum to remove the slope and offset. The effectiveness of baseline correction is illustrated in Figure 4.14. At the bottom of the figure is a spectrum containing baseline slope, along with a function drawn parallel to the baseline. The parallel function consists of a series of line segments. The small squares in the figure denote the line segment endpoints. After subtracting this function from the spectrum, the baseline corrected result in the top of Figure 4.14 is obtained. The baseline slope is now gone.
The key to a successful baseline correction is drawing a function that is truly parallel to the slope or curvature in the baseline. In the bottom of Figure 4.15, the function has been intentionally drawn to not parallel the
Figure 4.14 Bottom: A spectrum with baseline slope (labeled "Original Spectrum"), and a function consisting of a series of line segments drawn parallel to it (labeled "Parallel Function"). Top: The baseline correction result obtained after subtracting the parallel function from the original spectrum.
baseline. The top of Figure 4.15 shows that upon subtraction the original spectrum is greatly distorted, showing the damage that baseline correction can cause to a spectrum when used improperly.
The utility of baseline correction as a pretreatment for spectra to be used in factor analysis is limited. First, there is the problem of obtaining the correct parallel function. There are algorithms that will automatically determine the "best" parallel function. However, these do not always work properly, and can end up adding variance to your data. Second, it is rare that the slope and curvature in all the standard spectra are the same. Thus, subtracting the same function from all the spectra will not give the best result for all the spectra in the set. Finally, one could try to optimize the baseline correction manually for all the standard spectra. However, this could be tedious for a large data set.
Derivatives get rid of baseline offset and slope, but do not suffer from the problems of baseline correction. The calculation of a derivative does not involve the use of something as subjective as the parallel function. Also, it is easy to apply the same derivative calculation to an entire set of standard spectra, which means the derivative calculation should not introduce a large amount of variance into the data. The major disadvantage of taking derivatives, as discussed above, is that they can be noisier than the original spectral data. If the SNR of a data set is large, this is not a problem.
Figure 4.15 Bottom: A spectrum with baseline slope, and a poorly drawn "parallel" function. Top: The result of subtracting the poorly drawn parallel function from the spectrum.
However, for a data set with a low SNR, taking derivatives may simply make the data set worse. In these cases, baseline correction may be a more appropriate way to get rid of the baseline offset and slope.
D. SMOOTHING
Smoothing is a spectral manipulation technique used to reduce the noise level in spectra. The details of how smoothing works are described elsewhere [3,4]. The focus here will be on using smoothing to reduce the noise level in standard spectra, and its effect on calibration quality. An example of how smoothing removes noise from a spectrum is shown in Figure 4.16. The bottom spectrum in the figure is a mid-infrared spectrum of a polyethylene bottle. There is noticeable noise, and there are features around 1700 cm-1 that are suggestive of real peaks, but are almost lost in the noise. The result of smoothing this spectrum is shown in the top of Figure 4.16. The noise level is reduced across the entire spectrum. The features at 1700 cm-1 have been pulled out of the background noise level, and are probably real.
The utility of smoothing for quantitative analysis is that by reducing the spectroscopic noise, overall calibration quality is increased. Additionally, the ability of smoothing to "pull" small peaks out of a noisy background can help factor analysis algorithms "see" correlations that might have been missed otherwise.
Figure 4.16 Bottom: A spectrum of a polyethylene bottle with noticeable amounts of noise. Top: The same spectrum smoothed (Savitzky-Golay algorithm, second order, 11 points). Note the decrease in noise level, and the appearance of peaks that were difficult to discern before smoothing.
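The trade-off between noise reduction and peak distortion can be explored with a few lines of code. The sketch below is only an illustration under stated assumptions: it uses a synthetic narrow band with random noise rather than the polyethylene spectrum, the function name savgol_smooth is hypothetical, and the window sizes (5, 11, and 33 points, second order) are chosen for the example. The filter weights come from a least-squares polynomial fit, which is the idea behind the Savitzky-Golay algorithm.

```python
import numpy as np

def savgol_smooth(y, window, polyorder=2):
    """Savitzky-Golay smoothing: least-squares polynomial value at each window center."""
    half = window // 2
    offsets = np.arange(-half, half + 1, dtype=float)
    design = np.vander(offsets, polyorder + 1, increasing=True)
    kernel = np.linalg.pinv(design)[0]          # weights for the fitted center value
    padded = np.pad(y, half, mode="edge")       # repeat end values so length is kept
    return np.convolve(padded, kernel[::-1], mode="valid")

# Hypothetical data: one narrow band plus random noise (not the Figure 4.16 spectrum).
rng = np.random.default_rng(0)
wavenumber = np.arange(1000.0, 1801.0, 1.0)
band = 0.010 * np.exp(-0.5 * ((wavenumber - 1400.0) / 3.0) ** 2)
noisy = band + rng.normal(0.0, 0.001, wavenumber.size)

flat = wavenumber < 1300.0                      # band-free region used to gauge noise
noise_level = {1: noisy[flat].std()}            # key 1 means "no smoothing"
band_height = {1: band.max()}
for window in (5, 11, 33):
    noise_level[window] = savgol_smooth(noisy, window)[flat].std()
    band_height[window] = savgol_smooth(band, window).max()

for window in (1, 5, 11, 33):
    print(f"{window:2d} points: noise {noise_level[window]:.5f}, "
          f"noise-free band height {band_height[window]:.5f}")
```

In this sketch the noise level drops as the window widens, but so does the height of the narrow band, which is the kind of distortion to watch for when choosing the amount of smoothing.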
For all of its advantages, smoothing must be used with caution. The smoothing of a spectrum automatically lowers its resolution. Narrow peaks get broad, and broad peaks get broader. Since resolution determines the information content of a spectrum [4], a smoothed spectrum contains less information than an unsmoothed spectrum. This is the price paid for increasing the SNR of a spectrum. There is such a thing as oversmoothing. This is illustrated in Figure 4.17. The bottom spectrum in Figure 4.17 is the same spectrum of a polyethylene bottle seen in Figure 4.16. The top of Figure 4.17 shows this spectrum extensively smoothed. The noise level is greatly reduced in this spectrum. However, there is significant spectral distortion. The two peaks near 3000 cm-1 have grown together to form a very broad feature, and the sharp peak near 1400 cm-1 has been turned into a hump. This illustrates the danger of smoothing: significant spectral features can be altered or lost if smoothing is not applied properly.
When should smoothing be applied to standard spectra, and how can it be used without damaging the data? In general, smoothing should only be applied to data that are noisy to begin with. With this type of data, the decrease in noise level may compensate for the loss of spectral information. On the other hand, quality standard spectra with a good SNR do not necessarily need to be smoothed to produce a good calibration. The small
Figure 4.17 Bottom: An unsmoothed spectrum of a polyethylene bottle containing noise. Top: The same spectrum, oversmoothed (Savitzky-Golay algorithm, second order, 33 points). Note the distortion in spectral peak shapes in the smoothed spectrum.
reduction in noise level that smoothing gives may not be worth the loss in spectral information. If smoothing must be applied to your standard spectra, experiment with the smoothing parameters first. Increase the amount of smoothing in small, gradual increments, keeping an eye on the noise level, and the shape and width of the narrowest peak in the spectrum. This peak will be the first to show the signs of oversmoothing. Hopefully, you will reach a point where the noise level is noticeably reduced, but spectral distortion is minimized. This would be the proper amount of smoothing to use. Lastly, the same amount and type of smoothing should be applied to all the standard spectra. If each standard spectrum is smoothed differently, the smoothing becomes another source of variance.
E. SPECTRAL PREPROCESSING: SUMMARY AND GUIDANCE
We have discussed four different types of spectral preprocessing in this section: mean centering, spectral derivatives, baseline correction, and smoothing. Recall that mean centering involves subtracting the average spectrum from a set of standard spectra before calibrating. The net effect is to remove a large amount of variance, making the calibration more sensitive to small changes in the component concentration. First and second
derivatives are used to remove the offset and slope, respectively, from a data set. Baseline correction is also used to remove the slope and offset, and smoothing can reduce the noise level in a spectrum. With all these preprocessing routines to choose from, how and when should you use these techniques, if at all?
First, a discussion of what not to do. There is a tendency amongst some workers to automatically manipulate spectra before looking at the data, and before considering the consequences of their actions. All spectral preprocessing routines alter the spectral data in some way. These manipulations can add unwanted variance to the data, and the calibration can end up modeling this variance instead of variance due to changes in the component concentration. At worst, some of these techniques, when improperly applied, can distort spectra, leading to distorted calibrations.
Mean centering is probably the most frequently applied spectral pretreatment. Derivatives are meant to remove offset and slope. If your spectral data are relatively free of offset and slope, there is not necessarily an advantage in using derivatives; remember, they add noise to the spectral data. Baseline correction, because of its ability to distort spectra, should only be used in cases where there is offset, slope, and lots of noise. Baseline correction may be able to remove offset and slope without adding as much noise as derivatives. Smoothing should be used on noisy data. Smoothing low noise spectra can distort good data without removing that much noise. The spectral distortions caused by smoothing should only be tolerated when meaningful noise reduction can be achieved by its use.
The basic message is this. Spectral pretreatments, except perhaps mean centering, work best on problem data. Hopefully, use of spectral pretreatment gets rid of the problem without unduly altering the data. However, it is the job of the analyst to understand how these pretreatments work, how and when it is appropriate to use them, and to make sure that their use does not have a negative effect on the calibration.
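To make the summary above concrete, the following Python sketch applies three of the pretreatments to the same small set of synthetic standards: derivatives, line-segment baseline correction, and mean centering. It is an illustration under stated assumptions, not the procedure of any particular software package. The function names, the single Gaussian band, the offsets and slopes, and the anchor wavenumbers chosen for the line segments are all hypothetical.

```python
import numpy as np
from math import factorial

def savgol_derivative(spectra, spacing, deriv, window=5, polyorder=2):
    """Apply the same Savitzky-Golay derivative to every row (spectrum)."""
    half = window // 2
    offsets = np.arange(-half, half + 1, dtype=float)
    design = np.vander(offsets, polyorder + 1, increasing=True)
    kernel = np.linalg.pinv(design)[deriv] * factorial(deriv) / spacing**deriv
    # mode="valid" drops window // 2 points at each end instead of padding.
    return np.array([np.convolve(row, kernel[::-1], mode="valid") for row in spectra])

def subtract_line_segments(wavenumber, spectra, anchor_wavenumbers):
    """Subtract a baseline made of line segments through user-chosen anchor points."""
    idx = np.searchsorted(wavenumber, anchor_wavenumbers)
    baselines = np.array([np.interp(wavenumber, wavenumber[idx], row[idx])
                          for row in spectra])
    return spectra - baselines

# Hypothetical standards: one band scaled by concentration, each spectrum with
# its own offset and baseline slope.
wavenumber = np.arange(1000.0, 1401.0, 2.0)
band = np.exp(-0.5 * ((wavenumber - 1200.0) / 15.0) ** 2)
concentration = np.array([0.2, 0.5, 0.8])
offset = np.array([0.5, 0.1, 0.3])
slope = np.array([0.0, 2e-4, -1e-4])
spectra = (concentration[:, None] * band + offset[:, None]
           + slope[:, None] * (wavenumber - 1000.0))

first = savgol_derivative(spectra, spacing=2.0, deriv=1)
second = savgol_derivative(spectra, spacing=2.0, deriv=2)
corrected = subtract_line_segments(wavenumber, spectra, [1000.0, 1100.0, 1300.0, 1400.0])
mean_centered = spectra - spectra.mean(axis=0)

print("baseline of 1st derivative:", first[:, 0])    # offset gone, slope remains
print("baseline of 2nd derivative:", second[:, 0])   # offset and slope both gone
print("2nd derivative minimum / concentration:", second.min(axis=1) / concentration)
print("corrected peak heights:", corrected.max(axis=1))
```

Note that the same derivative settings are applied to every spectrum, whereas the line-segment correction depends on where the anchor points are placed. In this noise-free example both approaches work; with real data the choice depends on the noise level, as discussed above.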
F. CHOOSING PROPER SPECTRAL REGIONS
One of the beauties of factor analysis is that it is a whole spectrum technique. This means that some or all of a spectrum can be used in a calibration, and each component does not need to have an isolated absorbance band. As long as there is an absorbance somewhere in the spectrum whose value varies with the concentration of the components, a calibration is possible. What this means in practical terms is that overlap of absorbance bands is not a problem.
Figure 4.18 A set of 12 spectra of EtOH/IPA/water/acetone solutions. The vertical lines denote the wavenumber ranges used in the calibration. These spectra are used in example calibrations seen in this chapter.
Since factor analysis is considered a full spectrum technique, many analysts automatically use the entire spectrum in a calibration. However, this overlooks the opportunity to achieve an improved calibration. Most sets of standard spectra will have some, perhaps many, wavenumbers at which there is little or no useful information. Since these absorbances do not correlate with component concentration, they are a source of noise and are best left out of a calibration. Most software packages allow you to choose the regions of spectra to use in a factor analysis calibration, and by picking these regions judiciously, calibration quality can be optimized. This is illustrated in Figure 4.18. This figure contains the 12 spectra of IPA, EtOH, acetone, and water used in example calibrations in this chapter. Note in Figure 4.18 that there are regions from 4000 to 3700, 2600 to 1850, and 870 to 700 cm-1 where there is just noise or a sloping baseline. There is no point in using these spectral regions in a calibration since there are no features there that correlate to component concentration. On the other hand, the regions where there are obvious absorbance peaks, 3700 to 2600 and 1850 to 870 cm-1, should be included in calibrations because they contain component features. The vertical lines in Figure 4.18 denote the two spectral regions that were used for the calibration. Judicious selection of appropriate spectral regions can improve calibration quality.
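In software, restricting a calibration to chosen regions amounts to keeping only the columns of the data matrix whose wavenumbers fall inside those regions. The Python sketch below illustrates this for the two regions named above (3700 to 2600 and 1850 to 870 cm-1). The function name select_regions, the 2 cm-1 data spacing, and the random placeholder "spectra" are assumptions made for the example; they are not the EtOH/IPA data set.

```python
import numpy as np

def select_regions(wavenumber, spectra, regions):
    """Keep only the data points whose wavenumber lies inside one of `regions`."""
    keep = np.zeros(wavenumber.shape, dtype=bool)
    for high, low in regions:
        keep |= (wavenumber <= high) & (wavenumber >= low)
    return wavenumber[keep], spectra[:, keep]

# Placeholder data: 12 "spectra" from 4000 down to 700 cm-1 at 2 cm-1 spacing.
wavenumber = np.arange(4000.0, 699.0, -2.0)
rng = np.random.default_rng(1)
spectra = rng.random((12, wavenumber.size))

regions = [(3700.0, 2600.0), (1850.0, 870.0)]     # (high, low) limits in cm-1
kept_wn, kept_spectra = select_regions(wavenumber, spectra, regions)

print(f"{wavenumber.size} data points per spectrum before selection, "
      f"{kept_wn.size} after")
```

The reduced matrix kept_spectra would then be passed to the calibration in place of the full spectra.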
V. Cross Validation: Testing Model Quality
After obtaining the standard spectra and performing the spectral preprocessing desired, the next step in obtaining a factor analysis calibration is to perform a cross validation. The purpose of this step is twofold. First, a cross validation can help to spot problem data points such as poor spectra or incorrectly entered concentrations. Second, the quality of trial calibrations can be assessed to assist in selecting the optimum algorithm and number of factors to use in the final calibration. During a cross validation, one or more standard spectra are set aside, and trial calibrations are made using the remaining standards, varying the number of factors used. Afterwards, various metrics assess the quality of the different trial calibrations. Following are the steps in a cross validation.
STEPS IN A CROSS VALIDATION
1. Take one standard spectrum and its concentration data and set it aside. You may choose to set aside more than one standard spectrum at a time, saving calculation time. However, setting aside one spectrum at a time is the most common approach.
2. The software will calculate trial calibrations using the remaining spectra and concentrations, varying the number of factors.
3. The trial calibrations are used to predict the concentrations in the sample(s) set aside.
4. Start over by going to step 1 and setting aside the spectral and concentration information for the next standard sample(s). Calculate trial calibrations leaving this standard out of the calibration and varying the number of factors, and then predict the concentrations.
5. The cross validation is over when all the standards have been left out equally. There will be calculated concentrations for each standard sample to compare to the real concentrations. This comparison is a useful tool for ascertaining model quality.
If there are 12 standard spectra and one is set aside at a time, the algorithm will iterate 12 times. By leaving samples out of the calibration and then predicting their concentrations, we get some idea of how well the calibration will do on unknown samples. A cross validation also provides us with the information needed to spot bad data and pick the right number of factors to calibrate with, as discussed below.
An issue to consider before performing a cross validation is the maximum number of factors to employ. Remember that you want your set of factors to describe all the variance in a data set. In theory, each standard sample could represent a separate source of variability. Therefore, as a
rough guide, the maximum number of factors used in the cross validation should be approximately equal to the total number of standard samples. Consult your software user's guide for any software specific limitations on this number.
Despite the utility of a cross validation, the best test of a calibration is to predict the concentrations in samples that were not used in the calibration. This step is called validation and is discussed below.
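The cross validation steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm of any commercial package: principal component regression (factors from a singular value decomposition followed by least squares) stands in for whichever factor analysis algorithm your software uses, the function names are hypothetical, and the 12 two-component standards are synthetic. The maximum number of factors is capped at five here simply to keep the example short.

```python
import numpy as np

def pcr_fit(spectra, conc, n_factors):
    """Principal component regression: factors from an SVD, then least squares."""
    mean_spec, mean_conc = spectra.mean(axis=0), conc.mean(axis=0)
    centered = spectra - mean_spec
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    factors = vt[:n_factors]                       # one factor per row
    scores = centered @ factors.T                  # one row of scores per standard
    coef, *_ = np.linalg.lstsq(scores, conc - mean_conc, rcond=None)
    return mean_spec, mean_conc, factors, coef

def pcr_predict(model, spectrum):
    mean_spec, mean_conc, factors, coef = model
    return (spectrum - mean_spec) @ factors.T @ coef + mean_conc

def leave_one_out(spectra, conc, max_factors):
    """Return predictions with shape (max_factors, n_standards, n_analytes)."""
    n_standards = spectra.shape[0]
    predicted = np.zeros((max_factors,) + conc.shape)
    for left_out in range(n_standards):            # steps 1 and 4: set one aside
        keep = np.arange(n_standards) != left_out
        for k in range(1, max_factors + 1):        # step 2: vary the factors
            model = pcr_fit(spectra[keep], conc[keep], k)
            # step 3: predict the standard that was set aside
            predicted[k - 1, left_out] = pcr_predict(model, spectra[left_out])
    return predicted

# Synthetic standards: two overlapped bands, 12 mixtures, a little noise.
rng = np.random.default_rng(2)
wavenumber = np.arange(1000.0, 1201.0, 2.0)
pure = np.array([np.exp(-0.5 * ((wavenumber - 1090.0) / 20.0) ** 2),
                 np.exp(-0.5 * ((wavenumber - 1115.0) / 20.0) ** 2)])
conc = rng.uniform(0.1, 1.0, size=(12, 2))
spectra = conc @ pure + rng.normal(0.0, 0.001, (12, wavenumber.size))

predicted = leave_one_out(spectra, conc, max_factors=5)
# Sum of squared prediction errors for each number of factors (step 5).
press = ((predicted - conc) ** 2).sum(axis=(1, 2))
for k, value in enumerate(press, start=1):
    print(f"{k} factor(s): sum of squared prediction errors = {value:.6f}")
```

The array predicted holds a calculated concentration for every standard, analyte, and number of factors, which is the raw material for the residual plots and factor selection discussed below.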
A. DATA SET QUALITY: SPOTTING OUTLIERS
After performing a cross validation, it is tempting to immediately decide upon the right number of factors with which to calibrate. However, we need to ascertain the quality of the data in the training set first. One of the beauties of cross validations is that they provide us with the information needed to spot bad training set data. In any data set, a point that lies far away from the rest of the data is called an outlier. An outlier draws our attention because it is inconsistent with the other data in a set. This inconsistency indicates that there may be something wrong with the data point. Inclusion of a bad data point in a calibration can damage model quality and performance. By spotting an outlier, we can hopefully correct the problem with the data. However, in the absence of an obvious problem with a data point, you have to decide whether to include it in a calibration or not. There are times when outliers represent unusual or important sources of variance which may need to be modeled. The following sections describe how to detect and correct concentration and spectral outliers.
B. CONCENTRATION OUTLIERS
To spot errors in concentration data, or concentration outliers, it is useful to look at a plot called the concentration residual plot. A concentration residual is simply the difference between the actual and predicted concentration for a specific component in a specific standard. It is calculated from the following equation:
$$R_c = C_a - C_p \qquad (4.12)$$
where
$R_c$ = concentration residual
$C_a$ = actual concentration
$C_p$ = predicted concentration
Figure 4.19 An example of a concentration residual plot with concentration residual on the Y-axis and standard number on the X-axis. Note that the data are scattered about zero, and that no residual is significantly greater than the others. This indicates that the calibration does an approximately equally good job of describing all the standard samples. The PLS-2 algorithm and five factors were used.
The size of a concentration residual for a sample is a direct measure of how accurately a model predicts the analyte concentrations in that sample. A concentration residual plot is shown in Figure 4.19, with concentration residual on the Y-axis and sample number on the X-axis. Remember from above that during a cross validation, the number of factors is varied to produce trial calibrations. Each trial calibration produces a set of predicted concentrations, so that each trial calibration will have its own concentration residual plot. Also, there will be a different concentration residual plot for each analyte. For example, you can look at the plot for analyte 1 using three factors, the plot for component 3 using five factors, and so on. The residual plot in Figure 4.19 is for analyte number 1 using five factors and the PLS-2 algorithm.
To spot concentration outliers, we will be interested in the scatter in the concentration residual values. In general, a calibration should predict the concentrations in all standards approximately equally well. On average, the concentration residuals should be about the same, and randomly scatter about zero. On the whole, the data in Figure 4.19 scatter randomly about zero. Also, no residual is significantly different from the rest; they all fall between 1.0 and -1.0. A crude but simple way to measure the scatter in a plot like this is
to calculate the average residual and compare it to the residual for each standard. In the case of a concentration residual plot the points have positive and negative values, so the average of the absolute values of the residuals is used. The average absolute residual in Figure 4.19 is 0.4. None of the individual residuals is more than three times this value. There are more sophisticated ways to measure the scatter in a residual plot [5]. However, these metrics are harder to visualize than the simple average we are calculating here.
Sometimes, before obtaining a calibration, we know something about the size of the concentration error in our data. The average residual for concentration residual plots should be approximately the same order of magnitude as the concentration error. If the average residual is much bigger than the concentration data error, there may be something wrong with the data set, such as the inclusion of outlier(s), or something wrong with the way the model was put together.
A concentration residual plot containing an outlier is seen in Figure 4.20. Sample number 16 in Figure 4.20 should draw your attention. Its residual is significantly greater than that of all the other samples. This means that the model did a fine job of predicting the concentrations for all the standards except #16, for which it did a very poor job. In this circumstance, it is very possible that the problem is with the sample rather than the model, and the
Figure 4.20 A concentration residual plot, with concentration residual on the Y-axis and standard number on the X-axis. Note that most of the data cluster together closely around zero except for sample 16. Its concentration residual is significantly greater than those for the rest of the samples, and it is a possible outlier. The PLS-2 algorithm and five factors were used.
plot indicates that this sample may be an outlier. As above, we calculate the average residual for this plot, which is 1.8. The absolute value of the residual for sample 16 in Figure 4.20 is 10.5, more than 8.7 times the average absolute residual. In general, regardless of how or in what units you measure the scatter in a residual plot, any sample with a residual more than three times the average is suspicious [2, 5].
It would be tempting to look at Figure 4.20 and leave sample 16 out of the calibration. However, there is risk associated with this action. The sample may be different from the others for a legitimate reason; it may represent a source of variance that needs to be included in the model. On the other hand, the sample may be an outlier because there is something wrong with it. Ideally, before excluding outliers from a calibration, you should try to find why the sample is an outlier. There are a number of causes of concentration outliers.
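The three-times-the-average rule of thumb described above can be written out as a short calculation. The following Python sketch is an illustration only: the function name, the threshold argument, and the 18 residual values are hypothetical and are not the data behind Figures 4.19 and 4.20. The residuals follow Equation (4.12), and the average is taken over the absolute values of all the residuals.

```python
import numpy as np

def concentration_residuals(actual, predicted, threshold=3.0):
    """Residuals per Equation (4.12) and the samples exceeding threshold x average."""
    residuals = np.asarray(actual) - np.asarray(predicted)
    average = np.abs(residuals).mean()                 # average absolute residual
    ratios = np.abs(residuals) / average
    suspects = np.flatnonzero(ratios > threshold) + 1  # 1-based sample numbers
    return residuals, average, ratios, suspects

# Made-up residuals for 18 standards; sample 16 is deliberately far from the rest.
made_up = np.array([1.0, -1.5, 2.0, -0.5, -1.0, -2.0, 2.5, -1.5, 1.5,
                    -2.0, 1.5, -1.0, -1.5, -1.0, 2.0, -10.5, -2.5, 2.5])
actual = np.linspace(10.0, 95.0, 18)                   # hypothetical concentrations
predicted = actual - made_up

residuals, average, ratios, suspects = concentration_residuals(actual, predicted)
print(f"average absolute residual: {average:.2f}")
print("suspect sample numbers:", suspects.tolist())
```

A flagged sample is only a candidate outlier. As the text stresses, the reason for the large residual should be investigated before the sample is excluded.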
POSSIBLE CAUSES OF CONCENTRATION OUTLIERS
1. If standards were made up by mixing together pure compounds, an incorrect amount of analyte may have been put into the sample. This could be checked by analyzing the sample via some other technique, or by remaking the standard and seeing how it performs in the cross validation.
2. If your standard concentration values are being determined by some other means (titration, chromatography, gravimetric analysis, etc.), it is possible that a particular sample's analysis is incorrect. If possible, find the offending sample and reanalyze it to see if the new values agree with the initial values. You should retain your standard samples to help solve problems like this.
3. Transcription errors: a key part of obtaining a factor analysis calibration is typing into the software the names of the standard spectra to use and the concentration of analytes in each standard. It is all too easy to type in an incorrect spectrum name or concentration, leading to what is called a transcription error (a glorified name for a typo). Once an outlier has been spotted, the easiest way to find a transcription error is to simply go back and review the spectrum names and concentrations typed into the computer for the calibration.
If one of these or some other explanation can be found to explain why a sample is an outlier, it is then legitimate to exclude the sample from the calibration. Ideally, the problem should be corrected and new information for that standard included in a new calibration. The residual plots in Figures 4.19 and 4.20 come from actual experimental data. The author has to admit, with some embarrassment, that upon investigation sample 16 in Figure 4.20 is
an outlier because of a transcription error on his part. After correcting the transcription error, the residual plot seen in Figure 4.19 was obtained. There is a lesson to be learned from this though. The average residual in Figure 4.20, which contains an outlier, is 1.8. The average residual in Figure 4.19, where there are no outliers, is 0.4. The presence of one bad data point in the calibration increased the average residual 4.5 times. This illustrates the effect of one bad data point on the calibration, and the importance of examining the residual plots to spot suspect data.
C. SPECTRAL OUTLIERS
As with concentration data, we need some way of checking the quality of spectral data points to make sure no bad data makes it into the final calibration. This is done by calculating what is called a spectral residual. The cross validation produces a trial set of factors and scores, and using Equation (4.2) the absorbances in the standard spectra can be calculated. The calculated spectra are called reconstructed spectra because they are constructed (calculated) from the cross validation data. A spectral residual is calculated using the following equation [5]:
$$R_s = \sum_{i=1}^{n} (A_a - A_p)^2 \qquad (4.13)$$
where
$R_s$ = spectral residual
$n$ = number of data points (wavelengths or wavenumbers) in the spectrum
$i$ = index for data point number
$A_a$ = actual absorbance
$A_p$ = predicted absorbance
The spectral residual for a sample is calculated by subtracting the predicted absorbance spectrum from the actual absorbance spectrum for a standard sample to obtain a residual. The residuals are all squared, and then added together. Calculation of the spectral residual allows us to use a number to describe the quality of the fit between the real and calculated spectra. Since the differences between real and calculated absorbances are squared in Equation (4.13), spectral residuals are always positive. A spectral residual plot contains spectral residual on the Y-axis and sample number on the X-axis. An example of such a plot is seen in Figure 4.21. Each trial calibration with a different number of factors will have its own spectral residual plot.
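Equation (4.13) is straightforward to evaluate once reconstructed spectra are available. The Python sketch below is a simplified illustration under stated assumptions. It reconstructs each spectrum from the first few factors and scores of a singular value decomposition of the whole (mean centered) set, whereas a real cross validation would use the factors and scores from its trial calibrations. The function name is hypothetical, and the data are synthetic: 12 two-component mixtures, with a small impurity band added to sample 9 only.

```python
import numpy as np

def spectral_residuals(spectra, n_factors):
    """Equation (4.13) for every spectrum, using an SVD-based reconstruction."""
    mean_spec = spectra.mean(axis=0)
    u, s, vt = np.linalg.svd(spectra - mean_spec, full_matrices=False)
    scores = u[:, :n_factors] * s[:n_factors]           # one row of scores per spectrum
    reconstructed = scores @ vt[:n_factors] + mean_spec
    # Square the point-by-point differences and sum them over each spectrum.
    return ((spectra - reconstructed) ** 2).sum(axis=1)

def gaussian(x, center, width):
    return np.exp(-0.5 * ((x - center) / width) ** 2)

# Synthetic standards: two components everywhere, one impurity in sample 9.
rng = np.random.default_rng(3)
wavenumber = np.arange(1000.0, 1301.0, 1.0)
pure = np.array([gaussian(wavenumber, 1090.0, 20.0),
                 gaussian(wavenumber, 1130.0, 20.0)])
conc = rng.uniform(0.1, 1.0, size=(12, 2))
spectra = conc @ pure + rng.normal(0.0, 0.001, (12, wavenumber.size))
spectra[8] += 0.05 * gaussian(wavenumber, 1250.0, 10.0)  # index 8 is sample 9

residuals = spectral_residuals(spectra, n_factors=2)
ratios = residuals / residuals.mean()
for number, (value, ratio) in enumerate(zip(residuals, ratios), start=1):
    print(f"sample {number:2d}: residual {value:.2e}, residual/average {ratio:.1f}")
```

In this example two factors describe the two components but not the impurity, so the spectrum containing the impurity is reconstructed poorly and its residual stands well above the average.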
Figure 4.21 A spectral residual plot, with spectral residual on the Y-axis and sample number on the X-axis. The predicted spectra were calculated using the PLS-2 algorithm and five factors.
Ideally, a spectral residual plot should be a random scatter of points. The presence of structure in a residual plot may indicate that unmodeled variance is present. The plot in Figure 4.21 is not a perfect scatter; it appears that the residual gets bigger at higher sample number. The author investigated what might have caused this type of structure, but could not find a definitive cause.
Besides looking for structure in residual plots, we also look for outliers. Naturally, any spectrum that is poorly predicted is suspect, and may have something wrong with it. There are a number of ways of determining the center of a scatter of data points like those shown in Figure 4.21 [5]. To make our lives simple, we will simply calculate the average of the spectral residuals. As with concentration outliers, a spectral outlier will usually have a residual three or more times the average residual. The highest residual/average residual ratio in Figure 4.21 is 2.9 for sample 15. An examination of this spectrum found no obvious problems with it, so it was kept in the training data set.
An example of a spectral residual plot containing a definite outlier is shown in Figure 4.22. The average residual in Figure 4.22 is 8.2 x 10^-5, while the residual for sample 17 is 5.22 x 10^-4, or 6.4 times the average. This is much greater than the 3x criterion discussed above, and suggests that sample #17 is an outlier. The more important question is why spectrum 17 is an outlier. It might be tempting to automatically exclude data that is 3 times the average from a calibration, but data should not be excluded from a
Figure 4.22 A spectral residual plot, with spectral residual value on the Y-axis, and sample number on the X-axis. Note the unusually large residual for sample #17. The predicted spectra were calculated using the PLS-2 algorithm and five factors.
calibration arbitrarily. If on investigation a legitimate explanation for a sample being an outlier is found, it should be excluded. However, if no obvious problem with the spectrum can be found, it is possible that the spectrum represents some source of variability that might be important to include in the model. In these cases, it is not as obvious whether a sample should be excluded or not. Spectra 17 and 18 are of the same sample, and a comparison of them shows no glaring differences or errors between them. A review of the data showed no transcription errors. Spectra 17 and 18 have the two largest residuals in Figure 4.22. They are replicate spectra of the sample with the largest concentration of IPA. It is possible that at these high IPA concentrations, the concentration-absorbance relationship is nonlinear, and the model had difficulty predicting the spectra of this sample as a result. Since spectrum 17 may represent an important source of variance, and it appears we understand what that source is, it should probably be included in the final calibration. There are a number of reasons why a spectrum may be an outlier; these include:
POSSIBLE CAUSES OF SPECTRAL OUTLIERS
1. Transcription errors: like concentration residuals, spectral residuals can be caused by transcription errors. These are easily checked by looking at the names of the spectra included in the calibration, and correcting the list if needed.
2. Spectral noise or artifacts: if one spectrum has substantially more noise in it than the others used in a calibration, or if its baseline has substantially more slope or curvature, that can cause a spectrum to be an outlier. This is easily checked by comparing the noise and baseline in the suspect spectrum to the other spectra in the calibration. Remeasuring the spectrum of the standard may be necessary to correct the problem.
3. Unmodeled impurities: factor analysis algorithms are capable of achieving useful calibrations in the absence of knowledge of the concentrations of all the species in a training set. The standards being analyzed in this chapter contain an unknown amount of acetone and water, yet quality calibrations for the two analytes were obtained. A problem arises if one standard sample contains an impurity that the other standards do not. This impurity will be difficult to model, and its presence may cause a spectrum to be flagged as an outlier. Again, comparison of the outlier spectrum with others in the calibration set can be useful in disclosing this type of problem.
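The 3x screening rule used for spectral residuals can be sketched in a few lines of Python. This is only an illustration: the function name `flag_spectral_outliers`, the `threshold` parameter, and the residual values are hypothetical and are not taken from the example data set. A flag marks a spectrum for investigation, not for automatic exclusion.

```python
import numpy as np

def flag_spectral_outliers(residuals, threshold=3.0):
    """Return (ratios, flagged sample numbers) for a set of spectral residuals.

    Each residual is divided by the average residual of the whole set;
    samples whose ratio is at or above `threshold` are flagged for review.
    """
    residuals = np.asarray(residuals, dtype=float)
    ratios = residuals / residuals.mean()
    # Sample numbers are 1-based, matching the residual plots.
    flagged = [i + 1 for i, r in enumerate(ratios) if r >= threshold]
    return ratios, flagged

# Hypothetical residuals for six spectra; the last one is deliberately large.
example = [6.0e-6, 7.5e-6, 5.0e-6, 8.0e-6, 6.5e-6, 5.2e-5]
ratios, flagged = flag_spectral_outliers(example)
print(flagged)
```

Note that a large residual also raises the average it is compared against, so with very few spectra the ratio understates how unusual the suspect spectrum is.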
VI. Calibration: Choosing the Right Number of Factors
Once a cross validation is performed, and you have confirmed that your data set contains high quality information (i.e. no outliers), the next task is to choose the right number of factors to use in your final calibration. This is, in some respects, the most difficult part of factor analysis. Choosing the right number of factors has a large impact on the quality and accuracy of a calibration. Fortunately, there are procedures to follow, and metrics to look at, which make this important process easier. If you have a familiarity with the chemical system you are studying, you should have some feel for the approximate number of factors that will be necessary to characterize your samples. Remember that factors capture sources of variance. At a minimum, you will probably need as many factors in your calibration as there are components. Additionally, noise, baseline problems, or unknown components may cause the number of factors needed to increase. These sources of variance may or may not present a problem depending upon how well they are modeled, and whether they appear in the standard and unknown samples as well. If significantly fewer factors than expected work well, it may mean the concentrations of two or more components are collinear. Many metrics can be used to ascertain the quality of a trial calibration and establish the ideal number of factors to use. We will discuss here those that the author, being a spectroscopist and not a statistician, has found the easiest to understand and interpret. The example calibrations in this section were all performed using the example data set discussed earlier in this chapter.
A. PREDICTED VERSUS ACTUAL CONCENTRATION PLOTS
Thanks to the cross validation, there exist sets of predicted concentrations for the data set that depend upon the number of factors used in the trial calibration. This means that plots of the actual versus predicted concentration for varying numbers of factors are available. These plots are excellent indicators of calibration quality, an example of which is seen in Figure 4.23. There are separate concentration plots for each analyte. Figure 4.23 is for the component EtOH, using one factor and the PLS-2 algorithm. The beauty of these plots is that they are simple to understand. The closer the data are to falling on a straight line, the better the calibration. If the trial calibration did a perfect job of predicting the concentrations of the standards, the points in Figure 4.23 would fall on a perfect straight line, meaning the model accounted for all the variance. Of course, there is some scatter in the data. We can use a statistic first discussed in Chapter 2, the correlation coefficient R, to measure the quality of these plots. In Figure 4.23, the correlation coefficient (which is the square root of the R^2 number seen at the top of the plot) is 0.64. This means that there is not good agreement between the data and the line. The poor quality of this plot
[Figure 4.23 plot. Plot title: R^2 = 0.412376844. X-axis: Actual Concentration (C2).]
Figure 4.23 A plot of the actual versus predicted concentrations for the component EtOH. Note the poor linearity of the plot. Data calculated using one factor and the PLS-2 algorithm.
Figure 4.24 A plot of the actual versus predicted concentrations for the component EtOH. However, in this plot four factors were used to calculate the predicted concentrations. Note the excellent agreement between the actual and predicted concentrations.
is no surprise since only one factor was used to calculate the predicted concentrations. Figure 4.24 contains a better looking predicted versus actual concentration plot. It is for EtOH using four factors. Note how well the data in Figure 4.24 fall on a straight line, showing excellent agreement between actual and predicted concentrations. The correlation coefficient for the plot (the square root of the R^2 number seen at the top of the plot) is 0.9998, confirming the excellent agreement between the two data sets. The point of looking at these plots is to ascertain calibration quality. The optimal number of factors should do a good job of predicting the concentrations of all the analytes. However, there may be times when different analytes need different numbers of factors to be well modeled. In this case, the judgment of the analyst is needed to pick the right number of factors. It is obvious from Figures 4.23 and 4.24 that four factors do a better job of predicting ethanol concentrations in this calibration than one factor does.
B. RECONSTRUCTED SPECTRA
As stated previously, factors and scores can be used to obtain a set of calculated standard spectra. A comparison of these calculated or "reconstructed" spectra to the actual standard spectra can be useful in establishing the quality of a calibration model. A reconstructed spectrum
Figure 4.25 A reconstructed spectrum plot. The real and calculated spectra are labeled. The difference between the two gives the residual. The calculated spectrum was obtained using one factor and the PLS-2 algorithm. The actual spectrum is for sample #12.
plot is shown in Figure 4.25. The spectra were calculated using one factor and the PLS-2 algorithm. The actual spectrum is for sample 12. There is a separate reconstructed spectrum plot for each standard sample, and the plot for each sample will vary depending upon the number of factors used to calculate the predicted spectrum. The X and Y-axis units for these plots are the same as for the spectra used in the calibration. The residual shown in Figure 4.25 is calculated by subtracting the predicted spectrum from the actual spectrum for sample #12. The size of the residual is a measure of how close the fit is between the two spectra. A large residual means there is a poor fit; a small residual means a good fit. The residual in Figure 4.25 is relatively large. There are obvious differences between the real and calculated spectra. This indicates that one factor does not do a good job of reproducing this spectrum. A reconstructed spectrum plot for sample 12 using four factors is seen in Figure 4.26. There is a vast improvement in the fit between real and calculated spectra in this plot compared to Figure 4.25. The real and calculated spectra are almost perfect overlays, and the residual is barely discernible. Expansion of this plot shows real differences between the two spectra, but they are very small, on the same order of magnitude as the noise in the standard spectra. This is a good sign; it means four factors describe the spectra well. This plot indicates that four factors might be a good number to use in an actual calibration.
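The idea of a reconstructed spectrum and its residual can be illustrated with a short sketch. It uses a plain principal component decomposition (via `numpy.linalg.svd`) rather than the PLS-2 algorithm used for the figures, and it runs on synthetic spectra, so the function name `reconstruct` and all of the data are assumptions made for illustration only.

```python
import numpy as np

def reconstruct(spectra, n_factors):
    """Rebuild each spectrum from the first `n_factors` factors and scores."""
    mean = spectra.mean(axis=0)
    centered = spectra - mean
    # Rows of vt are the factors (loadings); u * s gives the scores.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_factors] * s[:n_factors]
    return scores @ vt[:n_factors] + mean

# Synthetic data: 12 "spectra" built from 4 Gaussian bands plus small noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
centers = [0.2, 0.4, 0.6, 0.8]
bands = np.array([np.exp(-((x - c) / 0.05) ** 2) for c in centers])
conc = rng.uniform(0.0, 1.0, size=(12, 4))
spectra = conc @ bands + rng.normal(0.0, 1e-4, size=(12, 200))

for k in (1, 4):
    residual = spectra - reconstruct(spectra, k)   # actual minus calculated
    size = np.sqrt((residual ** 2).sum(axis=1))    # one number per spectrum
    print(k, size.max())
```

With one factor the residual is large for most of these synthetic spectra; with four factors it should drop to roughly the level of the added noise, which mirrors the behavior described for sample 12.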
Figure 4.26 A reconstructed spectrum plot for sample #12 using four factors and the PLS-2 algorithm.
If possible, you should look at all the reconstructed spectrum plots in a calibration. This may prove tedious when using tens or hundreds of spectra, in which case a representative set of reconstructed spectra should be examined. It is important to examine more than one reconstructed spectrum plot because different standard spectra may be best described by different numbers of factors. In the present example, sample 1 was well described by two factors, but it took four factors to do as good a job of describing sample 12. If the latter plot had not been examined, the analyst might have been misled into using two factors in the final calibration instead of four.
C. FACTOR LOADINGS PLOTS
Remember that factors try to account for the variance in a data set. An example factor loadings plot is shown in Figure 4.27, which shows factors 1 and 2 for a PLS-2 calibration using the example data set. Remember that factors are abstract, and can have features pointing up and down. However, the lower numbered factors should model changes in component concentration. It is no surprise then to find features in Figure 4.27 reminiscent of those in the standard spectra. This suggests that these factors are modeling changes in the component concentration. The same calibration that produced the factors seen in Figure 4.27 produced factor #9 seen in Figure 4.28. This factor has little resemblance
Figure 4.27 Factors 1 and 2 from a PLS-2 calibration using the chapter's example data set.
Figure 4.28 Factor #9 from the same trial calibration that produced the factors seen in Figure 4.27. Note the sharp, narrow, noise-like features.
to the other factors, but looks much like spectral noise. This suggests that factor #9 is likely modeling noise, not variations in component concentrations. The conclusion suggested by Figures 4.27 and 4.28 is that factor #2 should be included in the final calibration, while factor #9 should not be included.
Figure 4.29 Factors 6 and 7 from the PLS-2 calibration obtained using the example data.
The difference between factors is not always so stark as shown in Figures 4.27 and 4.28. Frequently, subsequent factors can be quite similar, and trying to choose the final number of factors based on just this information is tricky. This is illustrated in Figure 4.29, which shows factors 6 and 7 from the PLS-2 calibration using the example data set. The two factors in Figure 4.29, though not identical, are similar in that they contain a lot of noise and a few spectral-like features. If it came down to choosing between 6 and 7 factors in the final calibration, and this plot was the only information available, it would be a difficult choice. Fortunately, there are other metrics used to help determine the right number of factors as discussed elsewhere in this section. Using this plot and other information, the author chose four factors to use in the final calibration.
D. THE PRESS PLOT
A very useful tool in establishing the right number of factors to use in a calibration is the PRESS plot. PRESS is an acronym for Predicted Residual Error Sum of Squares. The PRESS value is a direct measure of how well a calibration predicts the concentrations left out during a cross validation. It is calculated from the following formula:
$$\mathrm{PRESS} = \sum_{i}\sum_{j}\left(C_p - C_a\right)^2 \qquad (4.14)$$
where
i = the index for sample number
j = the index for analyte number
Cp = the matrix of predicted concentrations
Ca = the matrix of actual concentrations
For a given number of factors, the differences between the predicted and actual concentrations are squared, and then are summed up for all the components in a standard, then for all the standards. The PRESS value is a single number that summarizes the quality of a given calibration. For the example data set, there are two analytes and six standards (replicate spectra of six standards were used, 6 x 2 = 12 "samples"). A single PRESS value for this system would be calculated as follows. Assume we are using one factor to predict the concentrations. For the first sample left out, the differences between the real and calculated concentrations for IPA and EtOH are calculated, squared, and added together. For the second sample left out, the differences between the real and calculated concentrations for IPA and EtOH are calculated, squared, added together, and this result is added to the result for the first sample. The differences between real and calculated concentrations for sample 3 are calculated, squared, and added to the results for samples 1 and 2, and so on. This process is carried on until all 12 samples are used in the calculation. This calculation produces a single number that is an excellent measure of the predictive ability of a model for a given number of factors. A way to view PRESS data is as a PRESS plot, which is a plot of PRESS value on the Y-axis versus factor number on the X-axis. An example of such a plot is seen in Figure 4.30. This is a PRESS plot for the example data set using the PLS-2 algorithm. Note that the PRESS value falls dramatically for the first few factors, and then levels off. In any factor analysis calibration, the first few factors account for most of the variance in the data, which is why the plot falls rapidly.
The remaining factors oftentimes model small variations in the data, which is why the plot levels off. Frequently, the plot will pass through a minimum PRESS value, and then begin to rise. The PRESS value in Figure 4.30 passes through a minimum at factor #4, and then begins to rise slightly. However, in this particular plot the minimum is difficult to see. A plot of the log of the PRESS value versus factor number can sometimes make it easier to discern the minimum PRESS value. Such a plot is seen in Figure 4.31. The increase in PRESS at higher factor number indicates a lowering of the model quality, and is a strong indication that noise is beginning to be modeled rather than variations in component concentrations. Note that in Figure 4.31 there is an arrow pointing at the number of factors where the
Figure 4.30 A PRESS plot using the example data set and the PLS-2 algorithm. The plot passes through a minimum at four factors.
Figure 4.31 A plot of log(PRESS) versus factor number. The minimum PRESS value at four factors is easier to see here than in Figure 4.30.
PRESS passes through a minimum (four in this case). In general, you want to choose the minimum number of factors that minimizes the PRESS value. For example, if four and five factors gave similar PRESS values, you should choose four factors for your calibration; five factors would overfit the data.
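Equation (4.14) and the rule of choosing the fewest factors that minimize PRESS can be sketched as follows. The function names, the `tolerance` parameter (how close to the minimum counts as "similar"), and the predicted concentrations are all hypothetical; in practice the predictions would come from the cross validation at each trial number of factors.

```python
import numpy as np

def press(c_pred, c_actual):
    """Equation (4.14): squared errors summed over all samples and analytes."""
    diff = np.asarray(c_pred, dtype=float) - np.asarray(c_actual, dtype=float)
    return float((diff ** 2).sum())

def fewest_factors(press_values, tolerance=0.05):
    """Smallest factor count whose PRESS is within `tolerance` (fractional)
    of the minimum PRESS. `press_values[0]` corresponds to one factor."""
    values = np.asarray(press_values, dtype=float)
    limit = values.min() * (1.0 + tolerance)
    return int(np.argmax(values <= limit)) + 1

# Hypothetical actual concentrations: 3 samples (rows) x 2 analytes (columns).
c_actual = np.array([[10.0, 1.0], [20.0, 2.0], [30.0, 3.0]])
# Hypothetical cross-validation predictions for 1, 2, and 3 factors.
predictions = [
    c_actual + np.array([[2.0, 0.5], [-1.5, 0.4], [1.0, -0.6]]),
    c_actual + np.array([[0.2, 0.05], [-0.1, 0.02], [0.1, -0.04]]),
    c_actual + np.array([[0.2, 0.05], [-0.1, 0.03], [0.1, -0.04]]),
]
press_values = [press(p, c_actual) for p in predictions]
print(press_values, fewest_factors(press_values))
```

In this made-up case two and three factors give nearly the same PRESS, so the sketch returns two. As with the PRESS plot itself, the result is one input to the decision, not the decision.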
It may be tempting to just look at the PRESS plot to determine the best number of factors to use in the final calibration. However, this is not a good idea. The PRESS plot should not be used by itself to judge the right number of factors to keep in a calibration. The PRESS value will tell you what number of factors is best for all components in all samples. However, examination of data such as the actual versus predicted concentration plots and reconstructed spectra discussed above is important too. For example, the PRESS plot may show that four factors work best overall for a data set. However, examination of the actual versus predicted concentration plot may show for a specific analyte that five factors work best. You may then decide to use five factors in the final calibration if accuracy for this analyte is more important than for the other analytes. After following the process laid out in this chapter, the author picked four factors to use in the PLS-2 calibration based on the example data set. This calibration will be used for validation and prediction as discussed below. Since it was known there were at least four components in the standard samples, it is gratifying that four factors work best. This means that we have probably done a good job of characterizing the system, and that there are no huge sources of unmodeled variance. If the optimum number of factors had turned out to be significantly different from four, that would have been a surprise, and would have been cause for reexamining the model and the data. Picking the right number of factors to include in a final factor analysis calibration is crucial. The plots discussed here should all be consulted to help you make a final decision. There are even more plots, and more metrics, than discussed here that you may also find useful. No single plot or piece of data will, by itself, let you make the final decision. Knowledge of the data, its noise level, and examination of a wide variety of plots and metrics should be used to optimize your calibration.
VII. Validation
Recall from earlier in this book that one of the fundamental assumptions behind spectroscopic quantitative analysis is that a mathematical model calculated from a set of standard samples will do an equally good job of describing the unknown samples as well. All of the plots and metrics described in this book can inform you about how well a calibration does in describing the standard samples, but there is no way of knowing how well the calibration will do on unknown samples. The purpose of validation is to obtain this information. Validation involves obtaining samples of known concentration that were not used in the calibration.
We will call these validation samples or a validation set. The spectra of the validation samples are obtained, and the concentrations of the components in these samples are predicted using the calibration in hand. Finally, the actual and predicted concentrations are compared to see how well the calibration does in modeling the validation samples. Validating a multicomponent calibration is analogous to checking a single component calibration as was discussed in Chapter 2. The validation of a model, unlike model development, does not involve fancy algorithms or complex mathematics. It simply involves time and effort on the part of the analyst to obtain and measure the spectra of extra samples to ensure that a calibration is trustworthy. Before performing a validation you should consider how many samples to use in the validation. The more validation samples, the better. Ideally you should try to have as many validation samples as calibration standards. What some analysts do is to assemble a number of samples of known analyte concentration, set some aside for validation, and calibrate with the rest. A useful statistic to calculate from the predicted and real concentrations for the validation set is the Standard Error of Prediction. This statistic is a direct measure of how well a calibration predicts the concentrations in a validation set. It is calculated as follows:
$$\mathrm{SEP} = \sum_{i}\left(C_p - C_a\right)^2 / (n - 1) \qquad (4.15)$$
where
SEP = standard error of prediction
i = sample number index
Cp = predicted concentration
Ca = actual concentration
n = total number of validation samples
Note that this formula is very similar to the PRESS formula seen in Equation (4.14). The difference between the two quantities is that SEP is divided by the number of degrees of freedom (n-1), so it is statistically standardized. Additionally, the SEP is specifically calculated for samples that were not used in the original calibration. Naturally, the lower the SEP, the better the predictive capabilities of a model. There is a separate SEP for each analyte. Recall that for the example data set being used throughout this chapter, three standards were set aside for use as a validation set. Since there are replicate spectra of each standard, this makes for six "samples"
TABLE 4-2 Standard Error of Prediction (SEP) for several different calibration algorithms using the example validation data set.
Algorithm    IPA SEP    EtOH SEP
PCR          0.54       0.026
PLS-1        0.50       0.029
PLS-2        0.50       0.026
ILS          0.59       0.019
total. We have not yet decided whether to use PCR, PLS-1, or PLS-2 in the final calibration. We will use the validation set to make this decision. The concentrations of IPA and EtOH using each of the six validation set spectra were predicted using each algorithm. Then, the SEP for each algorithm was calculated using Equation (4.15). The results are seen in Table 4-2. The SEP is the best measure of the predictive ability of a calibration. Thus, if using the PLS-2 algorithm, we could legitimately say that the IPA concentration in unknown samples can be predicted with an accuracy of 0.5 volume percent, and that the EtOH concentrations can be predicted with an accuracy of 0.026 volume percent. From the data in Table 4-2, PLS-2 is the factor analysis algorithm that does best for both analytes, although not by much. All other things being equal, PLS-2 would be the best algorithm for this data set. Also, notice that the SEP for IPA is much larger than for EtOH. This is interesting because EtOH was present in the standards at much lower concentration than IPA, meaning the EtOH absorbances would be smaller and have a smaller SNR as well. However, the real reason the EtOH concentrations are predicted better is probably linearity. The EtOH concentration range was 0-9%. Beer's law works best for dilute solutions, which describes this situation. The IPA concentration in the standards varies from 0 to 54%. This is a much larger concentration range to have to model, and it is possible that Beer's law is not well followed for the higher concentration IPA samples. This may explain the lower SEP for EtOH compared to IPA. The IPA concentration is still well modeled, just not as well modeled as the EtOH concentration. For comparison purposes, the SEP for IPA and EtOH derived from a validation with the ILS technique (Chapter 3) is also seen in Table 4-2. What is interesting is that although ILS is not a factor analysis technique but a "least squares matrix method," the ILS SEP is very close to those obtained with the factor analysis methods. Thus, with the
example data set, we have the good fortune of knowing that four different calibration algorithms perform well on our data. This is not always the case. The example data set was a simple, well behaved chemical system of four different molecules. The absorbance spectra had a high SNR, were free of artifacts, and the concentrations were well known. This is not always the case. For more problematic calibration data sets, greater differences in algorithm performance might be expected. The problem created by the success of all four algorithms will be trying to choose among them for the calibration to put into practice. There is an extra benefit of going through the trouble of analyzing validation samples. If your tests show that the calibration does a good job of predicting concentrations in the validation set, and if the concentrations in the validation samples are known as accurately as in the training set, the spectral and concentration data from the validation set can be folded into the data from the original calibration to obtain a new calibration. This increases the number of samples in your final calibration, hopefully making it more accurate and robust. There are those who may argue that validating a calibration is not worth the trouble because of the extra time, money, and expense involved in doing so. However, the time, money, and expense of recalibrating once a calibration is implemented are even higher. Validations are the best test of the predictive ability and stability of a calibration. Your calibrations should be validated whenever possible.
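Equation (4.15) is simple to compute once the validation predictions are in hand. The sketch below follows the equation as printed; the function name and the concentration values are hypothetical and are not the validation data behind Table 4-2. Many texts report the square root of this quantity so that it carries concentration units, and the optional `root` flag is an assumption added here to cover that convention.

```python
import numpy as np

def sep(c_pred, c_actual, root=False):
    """Standard error of prediction for one analyte, following Equation (4.15).

    Sums the squared prediction errors over the n validation samples and
    divides by the degrees of freedom (n - 1). With root=True the square
    root of that value is returned instead.
    """
    c_pred = np.asarray(c_pred, dtype=float)
    c_actual = np.asarray(c_actual, dtype=float)
    n = c_actual.size
    value = ((c_pred - c_actual) ** 2).sum() / (n - 1)
    return float(np.sqrt(value)) if root else float(value)

# Hypothetical validation set: six spectra, one analyte (volume percent).
actual = [5.0, 5.0, 20.0, 20.0, 40.0, 40.0]
predicted = [5.5, 4.5, 20.5, 19.5, 40.5, 39.5]
print(sep(predicted, actual), sep(predicted, actual, root=True))
```

Because there is a separate SEP for each analyte, the function would be called once per analyte column of the validation results.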
VIII. Prediction
The whole point of obtaining a calibration is to be able to predict the concentration of analytes in unknown samples. After all the work, worry, and stress involved in obtaining a calibration, it turns out that the prediction step is rather anticlimactic. Once you have the calibration, the hard part is over. However, to accurately predict concentrations you still have to follow good experimental practices. The unknown samples need to be treated and prepared for analysis exactly the same as the calibration samples. The spectrum of the unknown must be measured in the same way using the same parameters as the training set. Ideally, the same spectrometer and sampling device should be used for unknowns and standards. The unknown spectrum must be preprocessed the same as the standard spectra. Once the unknown spectrum has been properly measured, the calibration is simply applied to it to generate the predicted concentrations.
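Applying a calibration to an unknown can be sketched as below, assuming (hypothetically) that the calibration stores the mean training spectrum, the mean training concentrations, and a matrix of regression coefficients. These names and the tiny numbers are illustrative only; commercial packages store calibrations in their own formats. The point is that the unknown spectrum passes through exactly the same preprocessing (here, mean centering) as the standards.

```python
import numpy as np

def predict(spectrum, calibration):
    """Apply a stored calibration to one measured spectrum."""
    spectrum = np.asarray(spectrum, dtype=float)
    # Same preprocessing as the training set: subtract the training mean.
    centered = spectrum - calibration["mean_spectrum"]
    return centered @ calibration["coefficients"] + calibration["mean_conc"]

# Hypothetical calibration: 3 spectral points, 2 analytes.
calibration = {
    "mean_spectrum": np.array([0.30, 0.50, 0.20]),
    "mean_conc": np.array([20.0, 2.0]),
    "coefficients": np.array([[50.0, 0.0],
                              [0.0, 10.0],
                              [0.0, 0.0]]),
}
unknown = [0.32, 0.45, 0.20]
print(predict(unknown, calibration))
```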
TABLE 4-3 The prediction report for a calibration using the example data set and the PLS-2 algorithm.
PLSplus/IQ Prediction Report
Date: 06/06/2002
PLS-2, 6 stds, IPA EtOH Analytes
Calibration File: c:\gramsn\ipadata\4mix6samp.cal
Sample: C:\GRAMSN\ipadata\4mixp6.spc
Calibration Constituent    Concentration    M Distance    Limit Tests    Spec. Residual
IPA                                         0.580335      PASS (PPPP)    2.47E-05
EtOH                       1.46
As an example of what a prediction report looks like, the data in Table 4-3 were obtained using the example data set and the PLS-2 algorithm. The prediction report is for one of the validation samples. The predicted concentration for IPA is 18.6%, and the predicted concentration for EtOH is 1.46%. Since this is a validation sample, we know the true concentrations to be 18.3% and 1.5% respectively. Thus, the model works quite well on this sample. Note in the table a column entitled "Spec. Residual." This is the difference between the actual and calculated unknown spectrum. Monitoring the spectral residual during prediction is important. If there are impurities present in the unknown sample that were not present in the standards, the model will not work well on the unknown. This is indicated by a large spectral residual upon prediction. Thus, the spectral residual can help ascertain whether it is appropriate to use a calibration on a specific unknown sample.
REFERENCES
[1] R. DiFoggio, Appl. Spectrosc., 54 (2000) 94A.
[2] R. Kramer, Chemometric Techniques for Quantitative Analysis, Marcel Dekker, New York, 1998.
[3] A. Savitzky and M. Golay, Anal. Chem., 36 (1964) 1627.
[4] Brian C. Smith, Fundamentals of Fourier Transform Infrared Spectroscopy, CRC Press, Boca Raton, Florida, 1996.
[5] J. Duckworth, Spectroscopic Quantitative Analysis, chapter in Applied Spectroscopy: A Compact Reference for Practitioners, J. Workman and A. Springsteen Eds., Academic Press, Boston, 1998.
[6] D. Steinberg, Computational Matrix Algebra, McGraw-Hill, New York, 1974.
BIBLIOGRAPHY
H. Mark and J. Workman, Statistics in Spectroscopy, Academic Press, New York, 1991.
H. Mark, Principles and Practice of Spectroscopic Calibration, Wiley, New York, 1991.
K. Beebe, R. Pell, and M. B. Seasholtz, Chemometrics: A Practical Guide, Wiley, New York, 1998.
H. Robbins and J. Van Ryzin, Introduction to Statistics, Science Research Associates, Palo Alto, California, 1975.
IMPLEMENTING, MAINTAINING, AND FIXING CALIBRATIONS
I. Implementing Calibrations
The whole point of developing a calibration is to predict concentrations in unknown samples. Implementing a calibration simply means the process by which the calibration is put into use. No calibration, particularly a multicomponent calibration, should be placed into service without first performing a validation. A validation is the best measure of the predictive capabilities of a calibration, as discussed in Chapters 3 and 4. Another thing to do before using a calibration is to document it. Record all the hard work you put into developing the calibration. This is necessary for anyone else who may want to repeat your work, and is necessary for you as a memory aid. Once a calibration is in use, you may forget some of the details of how you went about developing the calibration.
If the calibration fails, you will need to revisit the calibration process and check the raw data. It is impossible to do this without a written record of what was done. It is even appropriate to write a memo about the work put into the calibration, and perhaps even publish it in the scientific literature. Oftentimes, the calibration developer is not the same person as the calibration user. The user needs instructions on how to use the calibration, where it came from, how it works, what assumptions were behind it, and what to do in case of trouble. A calibration "user's guide" or operating procedure should be written before putting the calibration in use. The guide should be written from the perspective of the user. The gory mathematical detail of the calibration algorithms probably does not interest the calibration user. What they need is practical guidance, such as what to do if a predicted concentration falls outside the concentration range covered by the model, or what to do if the unknown spectra are really noisy or have sloping baselines.
II. Maintaining Calibrations
Okay, you did it. You have developed a calibration and it has been implemented and is working fine. Does that mean you can now disown it? By no means! Calibration performance and applicability will change over time with changing conditions. There may come a time when your calibration stops predicting concentrations accurately. If you are not alert enough to catch this, you will be creating bad information, which leads to bad decisions and wasted money. You must maintain or monitor a calibration for as long as it is in use. Failure to do so will only lead to trouble later. An important way of maintaining calibrations is to check them. This simply means, from time to time, obtaining one or more new standard samples whose concentrations are known, but were not used in the calibration. Obtain spectra of the new samples, predict their concentrations, and compare to the known concentrations. This is similar to the validation steps discussed at the end of Chapters 3 and 4. A validation is essentially a calibration check performed before implementation. Here, we are talking about a calibration check after the implementation. As with a validation, a calibration check is better if multiple samples are used. Then, a standard error of prediction (SEP, discussed in Chapters 3 and 4) can be calculated. Comparing the SEP before and after implementation is an excellent way of monitoring the health of a calibration. How often should you check your calibration? As often as is practically possible, which will vary greatly with the details of your implementation.
In general, it is probably not necessary to check the calibration every day, unless measurement and sampling conditions fluctuate greatly on a daily basis. On the other hand, once a year is probably not enough. Something is bound to change in the spectrometer or sampling procedure during this time, and you need to know if these variations have impacted calibration accuracy or not.

In addition to calibration checking, there are ongoing things that need to be monitored every time a calibration is used to predict unknown concentrations. First, recall that the applicability of a calibration is limited to the concentration range of analyte(s) used in the calibration. Thus, the range of predicted concentrations should never fall outside the range of calibration concentrations. This is something that should be monitored continually. Part of the standard operating procedure for calibration implementation should be to alert the calibration user to the limits on predicted concentrations, and what to do if an unknown has predicted concentrations that fall outside the accepted range.

Recall from above that we cannot model interferents that were not present in a calibration sample set. If there are chemical species present in the unknown samples that were not present in the calibration samples, this can affect the accuracy of predicted analyte concentrations, and make the calibration unusable. How do we detect the presence of unmodeled impurities in an unknown sample? If you are lucky, your calibration software will predict the spectrum of the unknown sample, allowing the actual and predicted spectra to be compared. A spectral residual for the unknown can then be calculated (see Chapters 2-4). The spectral residuals from the calibration and validation will give you some feel for how big the spectral residuals for the unknowns should be. If the spectral residual for an unknown is much larger than expected, this may indicate the presence of unmodeled chemical species in the unknown.
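As a rough illustration of the spectral residual comparison, the following Python sketch subtracts a predicted (reconstructed) spectrum from a measured one and compares the result with the residuals seen during calibration. Several things here are assumptions rather than anything stated in the text: the residual is summarized as a sum of squared differences, the margin `factor` is arbitrary, and the four-point spectra are invented. Your software may summarize the residual differently.

```python
def spectral_residual(measured, reconstructed):
    """Sum of squared differences between two absorbance spectra.

    Assumption: the residual is summarized as a sum of squares. Commercial
    packages may report a different summary of the same subtraction.
    """
    if len(measured) != len(reconstructed):
        raise ValueError("spectra must have the same number of points")
    return sum((m - r) ** 2 for m, r in zip(measured, reconstructed))

def looks_unmodeled(residual, calibration_residuals, factor=10.0):
    """Flag a residual much larger than any seen during calibration.

    'factor' is a hypothetical safety margin chosen only for illustration.
    """
    return residual > factor * max(calibration_residuals)

# Hypothetical four-point spectra in absorbance units.
measured = [0.10, 0.50, 0.90, 0.30]
reconstructed = [0.10, 0.48, 0.60, 0.30]
residual = spectral_residual(measured, reconstructed)
flagged = looks_unmodeled(residual, calibration_residuals=[0.001, 0.002])
print(f"residual = {residual:.4f}, flagged = {flagged}")
```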
Some software packages even allow you to put in spectral residual limits, above which you will be warned about problem samples. What do you do with an unknown that fails the spectral residual test? First, compare its spectrum to that of the calibration samples. You may be able to spot what the impurity is, and perhaps ensure that it is not present in future samples. If the impurity is going to be present in most or all unknown samples, you need to develop a new calibration with the impurity present so that it can be modeled. An example of what happens when a calibration is applied to an inappropriate unknown is seen in Table 5-1. The PLS-2 calibration developed in Chapter 4 for systems containing IPA, EtOH, acetone, and water was applied to a spectrum of an "unknown" sample containing 91% IPA and 9% water. The unknown sample is significantly different from
TABLE 5-1
Prediction results by applying example PLS-2 calibration from Chapter 4 to a mixture of 91% IPA-9% water.

PLSplus/IQ Prediction Report
Date: 06/08/2002
Calibration File: c:\gramsn\ipadata\4mix6samp.cal
Sample: C:\GRAMSN\ipadata\ipa91.spc
PLS-2, 6 stds, IPA EtOH

                      Analytes
Spectral Residual     IPA          EtOH
14.529198             511.0385     -68.7104
the calibration samples in that it is completely missing acetone and ethanol, and has an IPA concentration almost twice that of the highest IPA concentration used in the calibration. The result of the prediction is that the IPA concentration is 511%, and that the EtOH concentration is -68.7%. Of course, both of these numbers are absurd, and these predicted values by themselves should be enough to flag this sample as being inappropriate for use with this calibration. The spectral residual for this prediction also shows that there is trouble. The spectrum of the 91% IPA-9% water sample is compared to the calibration sample with the highest IPA concentration (Sample #12) in Figure 5.1. The spectra are obviously quite different. This is reflected in the spectral residual for the prediction, which is 14.5. The average spectral residual for the calibration samples (from Chapter 4) is 8.2 x 10~^. The spectral residuals are different by several orders of magnitude, again indicating that this unknown sample should not be used with this calibration.
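The concentration check that catches the sample in Table 5-1 is simple enough to automate. The Python sketch below is one possible form, offered under stated assumptions: the function name is hypothetical, and the calibration ranges shown are placeholders, not the actual ranges of the Chapter 4 standards. Only the two predicted concentrations are taken from Table 5-1.

```python
def screen_concentrations(predicted, calibration_ranges):
    """Return a list of warnings for one unknown sample.

    predicted           dict of analyte name -> predicted concentration (%)
    calibration_ranges  dict of analyte name -> (lowest, highest) standard (%)
    """
    warnings = []
    for analyte, value in predicted.items():
        low, high = calibration_ranges[analyte]
        if value < 0.0 or value > 100.0:
            # A percentage below 0 or above 100 cannot be a real concentration.
            warnings.append(f"{analyte}: {value:.1f}% is physically impossible")
        elif value < low or value > high:
            warnings.append(
                f"{analyte}: {value:.1f}% is outside the calibrated range "
                f"{low:.1f}-{high:.1f}%"
            )
    return warnings

# Predicted concentrations as reported in Table 5-1.
table_5_1 = {"IPA": 511.0385, "EtOH": -68.7104}
# HYPOTHETICAL calibration ranges, used only to make the sketch runnable.
ranges = {"IPA": (0.0, 50.0), "EtOH": (0.0, 50.0)}

for message in screen_concentrations(table_5_1, ranges):
    print(message)
```

A sample that produces any warning should be set aside and examined rather than reported.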
III. Fixing Problem Calibrations

Regardless of how well you develop, implement, and monitor a calibration, it is liable to fail at some point due to factors beyond your control. A calibration "failure" is when your ongoing monitoring program indicates that the accuracy of the calibration has degraded, or that the calibration is no longer applicable to the unknown samples. There are a number of reasons why calibrations may fail, and it may not necessarily be easy to fix the problem. Like any task, it helps to break things down into pieces and work on them separately. It is also helpful to ask the
Figure 5.1 Solid: The spectrum of calibration Sample #12. Dashed: An "unknown" spectrum of a sample containing 91% IPA and 9% water. The horizontal axis is wavenumber (cm⁻¹), from 3500 to 1000. Note the differences between the two spectra.
following two questions:

1. Have any assumptions been violated?
2. What changed?

First, remember that throughout this book a number of assumptions have been made as we developed calibrations. A quick check of those assumptions is in order when a calibration fails, because the failure may be due to a violation of one of those assumptions. Amongst the assumptions to consider are:

1. Beer's law assumes the sample is homogeneous, which means the concentrations and optical properties of the sample are the same at all points. This assumption could be violated by the presence of light scattering particles, inhomogeneity due to poor mixing, or the presence of a temperature or pressure gradient in the sample.
2. When using the least squares methods, there are a number of assumptions made about the model and the data. These are listed in Chapter 2. Reread those assumptions and think about whether they apply to your calibration or not.
3. Since the spectra used in quantitative spectroscopic analysis are usually in absorbance units, it makes sense to revisit how absorbance is calculated. The equation is

$$A = \log(I_0 / I) \qquad (5.1)$$

where
$A$ = absorbance
$I_0$ = intensity of light measured with no sample present
$I$ = intensity of light with sample present

If it has been a long time since $I_0$ was measured, you are assuming $I_0$ is constant over time. This is not always the case; $I_0$ can drift, causing the measured absorbances to be incorrect. Consider measuring a fresh $I_0$ whenever a calibration fails.
4. Going back to the first chapter, remember the fundamental assumption of quantitative analysis: the calibration models the unknowns as well as it models the standard samples. If anything in the unknowns is different from the calibration samples, the calibration will cease to be applicable.

The second question to ask upon calibration failure is: what has changed? As we have seen, there are a myriad of variables that affect quantitative accuracy. A change in just one of them can be enough to throw off a calibration. Since the calibration consists of spectroscopic and concentration data, it is convenient to break sources of variance into those that affect the absorbances, and those that affect the concentrations.

One of the big issues that affect absorbances is $I_0$ drift, as discussed above. Another name for $I_0$ is the background spectrum. For double beam instruments, $I_0$ is measured at the same time as $I$, the sample spectrum. In this case, $I_0$ drift is usually not a problem. However, for single beam instruments (including FTIRs), $I$ and $I_0$ are measured at different points in time. Anything that affects the intensity of the light beam impacts $I_0$. This includes the source, atmosphere inside the instrument, beam splitters, mirrors, sampling devices, and the detector. Causes of $I_0$ drift can include a change in source intensity, dirt on mirrors, or crud collecting in a sample cell. Also, temperature variations can cause subtle changes in optical alignment, impacting light intensity. For single beam instruments, the best way to deal with $I_0$ drift is to measure background spectra as frequently as is practical.
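To get a feel for how much a stale background matters, the following Python sketch applies Equation 5.1 with a stored $I_0$ that no longer matches the instrument. The numbers (a 5% loss of source intensity, a sample transmitting 10% of the light) and the function names are hypothetical, and the logarithm is taken as base 10, which is the usual convention for absorbance.

```python
import math

def absorbance(i0, i):
    """Equation 5.1, A = log10(I0 / I), for a single wavenumber."""
    if i0 <= 0 or i <= 0:
        raise ValueError("intensities must be positive")
    return math.log10(i0 / i)

def absorbance_error_from_drift(fractional_change):
    """Offset added to every absorbance when a stale background is used.

    fractional_change is the change in the true I0 since the stored
    background was measured (for example, -0.05 for a 5% loss of intensity).
    """
    return -math.log10(1.0 + fractional_change)

# Hypothetical numbers: stored background of 1000 counts, a source that has
# since lost 5% of its intensity, and a sample that transmits 10% of the light.
i0_stored = 1000.0
i0_now = i0_stored * 0.95
i_sample = 0.10 * i0_now

true_a = absorbance(i0_now, i_sample)         # with a fresh background
measured_a = absorbance(i0_stored, i_sample)  # with the stale background
print(f"true A = {true_a:.4f}, measured A = {measured_a:.4f}")
print(f"error  = {measured_a - true_a:.4f}")
```

Under these assumptions the error is a constant offset of about 0.022 absorbance units, regardless of how strongly the sample absorbs. In a real spectrum the drift is rarely uniform across wavenumbers, so the distortion can be more complicated than a simple offset.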
When working with lab instruments, the author prefers to measure a new background spectrum before every sample. In process control applications, this is not practical since taking a background spectrum may involve stopping a chemical process, or uninstalling the spectrometer. However, it is still necessary to measure fresh backgrounds occasionally. If it has been months or years since $I_0$ has been measured, you are setting yourself up for calibration failure. Any time a process is shut down, or can be conveniently shut down, measure a new $I_0$.

Another issue for spectra is noise. In general, the noise level of a spectrometer should be stable over time. If a calibration fails, look at the
noise level of the unknown spectra. If the noise level has increased significantly since the calibration was performed, something in the instrument has changed. Again, look at things like source intensity, dirt on mirrors and in sample cells, and optical alignment.

Lastly, spectral artifacts can impact the spectra adversely. For example, many instruments in process applications are sealed from the environment. If that seal is broken, and significant amounts of water vapor and CO2 leak into the spectrometer, this can cause a model to fail. Baseline drift, which is related to $I_0$ drift, is another source of artifacts, along with fluctuations in the power feeding an instrument.

Recall from Chapter 1 that it is assumed that the absorptivity is the same for the standards and the unknowns. Also, recall from Chapter 1 that there are many things that cause the absorptivity to change. These include pressure, temperature, composition, and concentration. If your calibration fails, make sure that all these variables are under control.

Lastly, look at any change made since implementation, no matter how trivial. Changes in operator, equipment, environmental conditions, sample preparation, and the sample cell may be the root of the problem. Every variable is fair game for scrutiny when trying to fix a calibration. If you find an experimental problem and fix it, you might be able to continue to use the old calibration after checking it thoroughly. If after doing all the obvious things your calibration does not work properly, there may be new sources of variance that are not modeled. Then, a new calibration may be needed.

BIBLIOGRAPHY
[1] R. Kramer, Chemometric Techniques for Quantitative Analysis, Marcel Dekker, New York, 1998.
[2] J. Duckworth, "Spectroscopic Quantitative Analysis," chapter in Applied Spectroscopy: A Compact Reference for Practitioners, J. Workman and A. Springsteen, Eds., Academic Press, Boston, 1998.
[3] H. Mark and J. Workman, Statistics in Spectroscopy, Academic Press, New York, 1991.
[4] H. Mark, Principles and Practice of Spectroscopic Calibration, Wiley, New York, 1991.
[5] K. Beebe, R. Pell, and M. B. Seasholtz, Chemometrics: A Practical Guide, Wiley, New York, 1998.
GLOSSARY
Absorbance—The result of a totally inelastic collision between a molecule and a photon.
Absorbance Spectrum—A plot of absorbance versus wavenumber.
Absorptivity—A measure of the absolute amount of light absorbed by a molecule at a given wavenumber.
Accuracy—A measure of how close a measurement is to its true value.
Allowed Transition—A spectroscopic transition that is allowed according to the selection rules.
Analyte—The chemical species of interest in a chemical analysis.
Beer's Law—The equation that relates absorbance to absorptivity, pathlength, and concentration.
Bond Dipole—The dipole moment for a chemical bond.
Calibration—The process by which two or more sets of data are related to each other via a mathematical model.
Calibration line—A mathematical model showing a linear relationship between two sets of data.
Chemical Environment—The immediate surroundings of a molecule.
Chemometrics—Application of mathematical techniques such as factor analysis to chemical data.
Collinear—When two sets of data have a linear relationship with each other.
Component—A chemical species in a sample. A component may be an analyte or an interferent.
Concentration Residual—The result of subtracting actual and predicted concentrations.
Correlation Coefficient—A measure of model quality, it describes how well the model fits the data.
Cross Section—A measure of the probability that two particles will collide with each other. Measured in units of area.
Cross Validation—The process of excluding one or more standard samples from a calibration, then predicting the concentration in the samples left out.
Cycle—When a wave propagates through a complete crest and trough.
Dipole—Two charges separated by a distance.
Dipole Moment—A measure of charge asymmetry. A vector quantity, it is calculated by multiplying charge times distance.
Elastic Collision—When two objects collide and there is no transfer of energy between them.
Electric Vector—The electric part of light.
Electromagnetic Radiation—Another term for light. Light consists of magnetic and electric parts, hence this term.
Electromagnetic Spectrum—A term describing all the different types of light, from radio waves to gamma rays.
Electronic Structure—The distribution of charges in a molecule.
Electronic Transition—When absorption of light causes an electron in a molecule to be promoted from a lower energy level to a higher energy level.
External Standards—A standard method where calibration and prediction are performed at different points in time.
F for Regression—A measure of the robustness of a calibration.
Factor Analysis—A calibration technique that uses abstract vectors called factors to model the variance in a data set.
Factors—Abstract vectors used to model variance in factor analysis.
Far Infrared—Light between 4 and 400 cm⁻¹.
Forbidden Transition—A spectroscopic transition not allowed according to the selection rules.
Frequency—The number of cycles of a light wave per unit time.
Hertz—Units of sec⁻¹, typically used to express frequency.
Hydrogen Bonding—A weak chemical interaction between molecules that typically occurs when hydrogen is bonded to an oxygen or nitrogen atom.
Identity Matrix—A matrix whose only nonzero elements are 1s along the diagonal. The product of the identity matrix and any matrix A is A. It is the matrix equivalent of multiplying by one.
Independent Determination—A technique of multianalyte calibration where separate calibration lines are obtained for each analyte.
Inelastic Collision—When two objects collide and there is a transfer of energy between them.
Interferent—A chemical species present in a sample with the analyte(s). The interferent may absorb at the same place as the analyte, interfering with the analyte's absorbance.
Internal Standard—An inert material added to unknown and standard samples as part of the internal standards method.
Internal Standards—A standard method where an internal standard is added to each standard and unknown sample. The absorbances of the internal standard and analyte are ratioed in an attempt to compensate for random variables.
Inverse Beer's Law plot—A plot of concentration versus absorbance.
Inverse Beer's Law—Rewriting Beer's Law so that concentration is expressed as a function of absorbance.
Least Squares—A method of calculating mathematical models. It can be shown that this method produces models with the least amount of error.
Least Squares fit—A model generated using the least squares method.
Linear Regression—The process of generating a mathematical model, assuming that two sets of data have a linear relationship.
Matrix—1. The environment in a sample, determined by composition, concentration, physical state, etc. 2. A table of numbers.
Matrix Element—An individual number in a table of numbers.
Matrix Inverse—A matrix that, when multiplied by another matrix, gives the identity matrix as a product.
Matrix Inversion—The process of calculating a matrix inverse.
Matrix Order—The size of a matrix. Denoted by first stating the number of rows, then the number of columns.
Matrix Transpose—Interchanging the rows and columns of a matrix.
Mean Centering—The process of subtracting the average spectrum from a set of standard spectra before performing factor analysis.
Microwaves—A type of light found at less than 4 cm⁻¹.
Mid-infrared radiation—Light between 4000 and 400 cm⁻¹.
Molar Absorptivity—A measure of the amount of light absorbed by a mole of molecules at a given wavenumber.
Molecular Absorption—The process by which molecules absorb light.
Near-Infrared—Light from 14,000 to 4000 cm⁻¹.
Net Dipole—The net dipole moment for a molecule.
Noise—Error or static in a measurement.
Orthogonal Matrix—A matrix that, when multiplied by its matrix transpose, produces the identity matrix.
Outliers—Data points that are substantially different from the other points in a data set.
Partial Least Squares—A form of factor analysis where the spectral and concentration data are incorporated into the model in one step.
Pathlength—The thickness of sample encountered by a light beam.
Pauli Exclusion Principle—States that no two electrons can have the same quantum numbers. Essentially, no two electrons can occupy the same space.
Percent Transmittance—Transmittance times 100.
Photon—A particle of light.
Precision—A measure of the random error in a data set.
Predict—The process of determining the concentration(s) of analyte(s) in an unknown sample.
PRESS—An acronym for Predicted Residual Error Sum of Squares. It is calculated from a cross validation, and is a measure of calibration quality.
Principal Component—Another name for a factor.
Principal Components Analysis—A type of factor analysis where one set of data (such as the spectra) is modeled using abstract vectors called principal components.
Principal Components Regression—A form of factor analysis where the spectral and concentration data are modeled in two steps.
Quanta—Several packets of energy.
Quantized—Coming in discrete packets.
Quantum—A packet of energy.
Quantum Mechanics—The field of physics that deals with the behavior of atoms, molecules, nuclei, and quanta of energy.
Quantum Number—Numbers used to denote the energy levels in a quantum mechanical system.
Raman Scattering—The result of an inelastic collision between a molecule and a photon.
Raman Spectrum—A plot of Raman scattering intensity versus wavenumber.
Random error—Error in a measurement due to uncontrolled sources. The sign of random error is random.
Rayleigh Scattering—The result of an elastic collision between a molecule and a photon.
Reconstructed Spectra—Predicted spectra obtained by multiplying a calibration's factors by the appropriate scores.
Regression—The process by which a mathematical model is generated correlating sets of data.
Residual—The result of a subtraction. For example, the result of subtracting real and predicted concentrations, or real and predicted absorbances.
Robustness—A measure of the sensitivity of a calibration to small changes in the random error level.
Schrodinger's Equation—An equation of quantum mechanics that can be used to calculate the energy levels for a quantized system.
Scores—The weighting factors that determine how much of a factor is needed to model a standard spectrum.
Selection Rules—The rules that govern what spectroscopic transitions a molecule can make.
Signal—The quantity being measured; the information in which we are interested.
Signal-to-noise ratio—A measure of data quality. Obtained by dividing the signal by the noise.
Simultaneous Equations—A series of equations that share variables.
Spectral Residual—The result of subtracting real and predicted absorbances.
Spectrometer—See spectrophotometer.
Spectrophotometer—An instrument used to measure a spectrum.
Spectroscopic Transition—The process by which a molecule absorbs light and is promoted from a lower to a higher energy level.
Spectroscopy—The study of the interaction of light with matter.
Spectrum—A plot of light intensity versus some property of light, such as wavelength, wavenumber, or frequency.
Square Matrix—A matrix that has the same number of rows as columns.
Standard Deviation—A measure of the accuracy of a predicted quantity.
Standard Error of Prediction—A quantity that measures how well a calibration predicts the concentrations in a set of validation samples. As such, it is a measure of the predictive ability of a model.
Standard Methods—Ways of making up standard samples.
Standards—Samples that contain known concentration(s) of the analyte(s).
Systematic error—Error that reproducibly has a specific magnitude and a specific direction. The sign of systematic error is not random.
Thickness Problem—In the mid-infrared the absorptivities are so large that samples must be thinned out to be between 1 and 20 microns thick so spectra can be accurately measured.
Totally Inelastic Collision—When two objects collide and there is a complete transfer of energy from one particle to the other.
Training Set—A set of standard samples.
Transition Probability—A measure of the probability of a spectroscopic transition occurring.
Transmittance—A measure of the fraction of light transmitted by a sample.
Unknown—A sample that contains an unknown amount of analyte.
Validation—A way of measuring calibration quality by applying the calibration to a series of standards that were not used in the calibration.
Validation Samples—Samples used in a validation.
Variance—The scatter in a data set.
Vector—A column or row of numbers.
Wavefunction—Contains all the information about a quantized energy level.
Wavelength—The distance between adjacent crests or adjacent troughs of a light wave.
Wavenumber—The number of cycles of a light wave per unit length. Typically measured in units of cm⁻¹.