MODERN TECHNIQUES FOR CIRCULAR DICHROISM AND SYNCHROTRON RADIATION CIRCULAR DICHROISM SPECTROSCOPY
Advances in Biomedical Spectroscopy
Spectroscopic methods play an increasingly important role in studying the molecular details of complex biological systems in health and disease. However, no single spectroscopic method can provide all the desired information on aspects of molecular structure and function in a biological system. Choice of technique will depend on circumstance; some techniques can be carried out both in vivo and in vitro, others not, some have timescales of seconds and others of picoseconds, whilst some require use of a perturbing probe molecule while others do not. Each volume in this series will provide a state-of-the-art account of an individual spectroscopic technique in detail. Theoretical and practical aspects of each technique, as applied to the characterisation of biological and biomedical systems, will be comprehensively covered so as to highlight advantages, disadvantages, practical limitations and future potential. The volumes will be intended for use by research workers in both academic and applied research, and by graduate students working on biological or biomedical problems.
Series Editor: Dr. Parvez I. Haris, De Montfort University, Leicester, United Kingdom
Volume 1
ISSN 1875-0656
Modern Techniques for Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy
Edited by
Bonnie Ann Wallace Department of Crystallography, Birkbeck College, University of London
and
Robert William Janes School of Biological and Chemical Sciences, Queen Mary, University of London
Amsterdam • Berlin • Tokyo • Washington, DC
© 2009 The authors and IOS Press. All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without prior written permission from the publisher.
ISBN 978-1-60750-000-1
Library of Congress Control Number: 2009924844
Publisher: IOS Press BV, Nieuwe Hemweg 6B, 1013 BG Amsterdam, Netherlands; fax: +31 20 687 0019
Distributor in the UK and Ireland: Gazelle Books Services Ltd., White Cross Mills, Hightown, Lancaster LA1 4XS, United Kingdom; fax: +44 1524 63232
Distributor in the USA and Canada: IOS Press, Inc., 4502 Rachael Manor Drive, Fairfax, VA 22032, USA; fax: +1 703 323 3668
LEGAL NOTICE The publisher is not responsible for the use which might be made of the following information. PRINTED IN THE NETHERLANDS
Modern Techniques for Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy B.A. Wallace and R.W. Janes (Eds.) IOS Press, 2009. © 2009 The authors and IOS Press. All rights reserved.
Series Editor’s Preface
The idea of producing a book series on Advances in Biomedical Spectroscopy is linked to the organisation of the First International Conference on Biomedical Spectroscopy (Cardiff, 2001), which was dedicated to the memory of the late Professor Dennis Chapman FRS. He was a rare breed of scientist who did not restrict himself to one particular technique but was always on the lookout for new techniques to use in his research. He truly believed that advances in our understanding of biological processes are intricately linked with the development and application of physical techniques. He was one of the pioneers in the application of different spectroscopic methods for the analysis of biological molecules. In the post-genomic era the need to understand the structure and dynamics of macromolecules, not just single molecules but also their multiple interactions as part of a systems biology approach, is becoming increasingly important. It is therefore not surprising that in recent years several Nobel Prizes have been awarded to scientists who have developed well-established analytical techniques for the study of biological and medical systems. These include Mass Spectrometry, NMR spectroscopy and Magnetic Resonance Imaging. There is no doubt that the development of new analytical techniques and the effective utilisation of existing methods are vital for obtaining a better picture of the molecular details of complex biological systems in both health and disease states. Such progress is important for disease diagnosis and drug discovery processes.
However, the complexity of biological systems is such that no single experimental method can provide information on all aspects of molecular structure and function. There are a large number of spectroscopic methods that can be used in the analysis of biological systems. Some can be used to carry out analysis in both in vivo and in vitro settings whereas others are restricted, at least currently, to one particular environment. The timescales of many of these techniques are different, ranging from minutes to picoseconds. Some require the use of a potentially perturbing probe molecule whereas others do not. Clearly, no single technique is perfect and each has its respective advantages and disadvantages. Consequently, a serious scientist would not be fully satisfied with the analysis of a particular system based on results from a single technique. Ideally, one should use a battery of techniques before drawing a final conclusion.
Considering the wide array of techniques available for analysis of biological systems, producing a single book on one particular spectroscopic technique would not be sufficient to meet the needs of scientists engaged in understanding biological molecules and their interactions. Therefore, I decided to produce a series of books on emerging and established spectroscopic methods to serve the needs of academics, industrial scientists as well as graduate students who are currently using or seeking to use a particular spectroscopic method in their research work. The first volume of the series provides a comprehensive discussion of the state-of-the-art methods in Circular Dichroism spectroscopic analysis of biological systems. The volume is edited by Bonnie Ann Wallace (Birkbeck College, University of London) and Robert William Janes (Queen Mary & Westfield College, University of London). They are pioneers in the development of Synchrotron Radiation-based Circular Dichroism Spectroscopy for biological studies.
I would like to thank the editors of the current book, and the forthcoming books, for their hard work in bringing together leading experts in their field that ultimately results in the production of each volume in the series. Finally, I would like to thank a good friend of mine, Peter Brown, whose encouragement and advice played an important role in my decision to embark on this book series. Parvez I. Haris Leicester, United Kingdom
Preface
We were prompted to develop this book by our experiences in teaching at the United Kingdom/European Union Circular Dichroism (CD) Summer Schools run at Warwick University in the U.K. for the past seven years, and at the BioCD Workshops run at the National Synchrotron Light Source (NSLS) in the United States since 2005. The Warwick schools were organised by Professor Alison Rodger, and the NSLS ones by Professor John Sutherland (of Brookhaven National Lab and East Carolina University). One of us (BAW) was co-director of both of these courses, and both of us (BAW and RWJ) have lectured, demonstrated, and organised experimental exercises for both sets of courses. Indeed, all of the authors of this volume have participated in teaching at one or both of these. The one-week intensive courses were initially aimed at PhD students and postdocs working in the general area of biophysics, to give them a background for undertaking, analysing and interpreting both the established technique of CD spectroscopy, and the newer related methods of Synchrotron Radiation Circular Dichroism (SRCD) and Linear Dichroism (LD) spectroscopies. However, as the courses evolved, both industrial users and senior academics also became “students”. It is on the basis of all of our experiences at these courses, and requests for a permanent record from the students who participated in them (and from other members of their labs who did not attend the workshops), that we decided to compile this volume, which deals with the practical issues and state-of-the-art methods for analyses involved in CD, SRCD, and LD spectroscopic research. This is also a particularly timely endeavour given the emergence of SRCD as an important new tool for structural biology. This is evidenced by the Second International SRCD Meeting held in Beijing, China in 2009 (which followed the First International SRCD Meeting held at Daresbury, U.K. in 2001 that was organised by the two editors of this volume and supported by a grant from the U.K. Biotechnology and Biological Sciences Research Council (BBSRC)). The U.K. versions of the CD course were supported in their first three years by a grant from the U.K. Engineering and Physical Sciences Research Council (EPSRC) to Alison Rodger and BAW; later they were sustained by support from Warwick University and the MOAC Centre (of which Professor Rodger is director) and the E.U. Marie Curie BIOCONTROL network (of which BAW is a partner). The NSLS course was supported by Brookhaven National Lab, through the U.S. Department of Energy. All lecturers freely gave of their time to enable the courses to take place. We thank them and the members of our labs who helped in the running of the courses (especially Dr. Andy Miles and Dr. Lee Whitmore) for their contributions to this volume. The research on CD and SRCD in our labs was supported by grants from the BBSRC. It is hoped that this volume will be a valuable resource for past and future course participants, and especially for other researchers who use, or plan to use, CD and SRCD as part of their structural biology studies.
BAW, RWJ
London, January 2009
Addresses of Contributors
Dr. Benjamin M. Bulheller, School of Chemistry, University of Nottingham, University Park, Nottingham NG7 2RD, U.K.
Dr. Anna E. Hills, Biotechnology Group, National Physical Laboratory, Hampton Road, Teddington, Middlesex TW11 0LW, U.K.
Professor Jonathan D. Hirst, School of Chemistry, University of Nottingham, University Park, Nottingham NG7 2RD, U.K.
Dr. Robert William Janes, School of Biological and Chemical Sciences, Queen Mary, University of London, London E1 4NS, U.K.
Dr. Sharon M. Kelly, Division of Molecular and Cellular Biology, University of Glasgow, Glasgow G12 8QQ, Scotland, U.K.
Dr. Alex Knight, Biotechnology Group, National Physical Laboratory, Hampton Road, Teddington, Middlesex TW11 0LW, U.K.
Dr. Andrew J. Miles, Department of Crystallography, Institute of Structural and Molecular Biology, Birkbeck College, University of London, London WC1E 7HX, U.K.
Professor Nicholas C. Price, Division of Molecular and Cellular Biology, University of Glasgow, Glasgow G12 8QQ, Scotland, U.K.
Dr. Jascindra Ravi, Biotechnology Group, National Physical Laboratory, Hampton Road, Teddington, Middlesex TW11 0LW, U.K.
Professor Alison Rodger, Department of Chemistry, University of Warwick, Coventry CV4 7AL, U.K.
Professor John C. Sutherland, Department of Physics, East Carolina University, Greenville, NC 27858-4353, U.S.A. and Department of Biology, Brookhaven National Laboratory, Upton, N.Y. 11973-5000, U.S.A.
Professor Bonnie Ann Wallace, Department of Crystallography, Institute of Structural and Molecular Biology, Birkbeck College, University of London, London WC1E 7HX, U.K.
Dr. Lee Whitmore, Department of Crystallography, Institute of Structural and Molecular Biology, Birkbeck College, University of London, London WC1E 7HX, U.K.
Contents
Series Editor’s Preface
    Parvez I. Haris
Preface
    B.A. Wallace and R.W. Janes
Addresses of Contributors
An Introduction to Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy
    Robert W. Janes and B.A. Wallace
Measurement of Circular Dichroism and Related Spectroscopies with Conventional and Synchrotron Light Sources: Theory and Instrumentation
    John C. Sutherland
Calibration Techniques for Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy
    Andrew J. Miles and B.A. Wallace
Sample Preparation and Good Practice in Circular Dichroism Spectroscopy
    Sharon M. Kelly and Nicholas C. Price
Sample Preparation and Good Practice in Synchrotron Radiation Circular Dichroism Spectroscopy
    Andrew J. Miles and B.A. Wallace
Reproducible Circular Dichroism Measurements for Biopharmaceutical Applications
    Jascindra Ravi, Anna E. Hills and Alex E. Knight
Synchrotron Radiation Circular Dichroism Spectroscopy: Applications in the Biosciences
    Andrew J. Miles and B.A. Wallace
Linear Dichroism Spectroscopy: Techniques and Applications
    Alison Rodger
Methods of Analysis for Circular Dichroism Spectroscopy of Proteins and the DichroWeb Server
    Lee Whitmore and B.A. Wallace
Reference Datasets for Protein Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopic Analyses
    Robert W. Janes
Ab Initio Calculations for Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy of Proteins
    Benjamin M. Bulheller and Jonathan D. Hirst
The Protein Circular Dichroism Data Bank (PCDDB): A Resource for Data Archiving, Sharing, Validation and Analysis
    B.A. Wallace, Lee Whitmore and Robert W. Janes
Appendix: Selected Website and Monograph References for CD and SRCD of Biomolecules
Author Index
Modern Techniques for Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy B.A. Wallace and R.W. Janes (Eds.) IOS Press, 2009. © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-000-1-1
An Introduction to Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy
Robert W. Janes (1) and B.A. Wallace (2)
(1) School of Biological and Chemical Sciences, Queen Mary, University of London
(2) Department of Crystallography, Birkbeck College, University of London
Abstract. The aim of this chapter is to introduce the techniques of circular dichroism (CD) and synchrotron radiation circular dichroism (SRCD) spectroscopies. A brief account is given of how the spectrum of a molecule is generated as a consequence of excitations resulting in electronic transitions, which for chiral molecules produce differential absorbances for left- and right-handed circularly polarised light. There follows an overview of the important basic principles of CD and of good practice protocols for collecting data, areas which are further developed in subsequent chapters of this book. In addition there are sections describing potential applications of CD for studies of proteins and nucleic acids. The final section illustrates the enhanced capacity inherent in the technique of SRCD spectroscopy for applications in structural biology, again discussed in detail in other chapters.
1. Introduction
The two shells in Figure 1 illustrate the concept of chirality. Each shell is essentially a mirror form of the other, one spiralling to the left, the other to the right, and neither can be superimposed on the other. The interaction of plane-polarised light with chiral molecules is a phenomenon first described more than two centuries ago. The separation of the two mirror crystal forms of tartaric acid by Pasteur and his observation that when in separate solutions they rotated plane-polarised light in equal but opposing directions was a classic study. This led to the development of the technique of Optical Rotatory Dispersion (ORD), which measures how plane-polarised light is rotated by a chiral molecule to differing degrees dependent on the wavelength of that light. Valuable information can be obtained on the configuration of a chiral molecule in solution using this technique. However, although it was widely used to characterise small biological molecules, ORD was superseded by circular dichroism (CD) spectroscopy for the study of macromolecules due primarily to the greater ease of interpretation of the resulting spectra because peaks in the CD spectrum of a molecule correspond to peaks in the absorbance spectrum of that molecule. CD is a technique based not on the rotation of the light per se, but on the differences in the absorption of left- and right- circularly polarised light as it passes through the sample.
Figure 1. Two seashells which are almost mirror images of each other; neither is superimposable on the other. The one on the left spirals upwards anticlockwise from its opening while that on the right spirals upwards clockwise. That on the left is a fossil shell, while that on the right is a modern day shell; both were collected from the same area on the east coast of England.
2. Principles of Circular Dichroism Spectroscopy
This section is aimed at covering the background basics of circular dichroism spectroscopy. More details on the theory and instrumentation can be found in the chapter in this book by Sutherland and in the monographs listed in the Appendix. Although the focus is on protein samples, much of the discussion is applicable to nucleic acids as well.
2.1 The Molecular Origins of Chirality
In many biological systems it is often the asymmetry of the attached chemical groups about a carbon atom that gives rise to chirality. Such is the case for amino acids that form proteins where four non-identical groups usually surround a central ‘alpha’ carbon atom. As proteins are built from units that are chiral, they will interact with left- and right- circularly polarised light to differing extents when such light beams are passed through a protein sample. It is the fact that there is a difference between the interactions that gives rise to circular dichroism; therefore a chiral sample is essential for a CD spectrum to be obtained. In some cases, induced chirality can be detected in (mostly small) non-chiral molecules when they interact with chiral molecules (e.g. drug
binding to proteins or nucleic acids), thereby inducing a chiral environment within the molecule; this type of chirality also leads to a CD signal.
2.2 What is Circular Dichroism?
Circular dichroism is defined as the difference between the absorption of left- and right- circularly polarised light by a chiral molecule. This can be shown in Eqn. (1):
$$\Delta A = A_L - A_R \tag{1}$$
where $A_L$ and $A_R$ are, respectively, the absorption of the left- and right- circularly polarised light by the chiral molecule. Strictly there should be a wavelength dependence in these terms but this is omitted here and in subsequent equations for clarity. For achiral molecules, because $A_L = A_R$, this means that $\Delta A = 0$ at all wavelengths so there is no CD signal (Figure 2). For chiral molecules $A_L$ does not equal $A_R$ at some wavelengths, so the resulting $\Delta A$ will be non-zero and can have either a positive or negative sign, depending on the relative intensities of the left- and right-handed absorbances (Figure 2). The difference is very small (usually on the order of 0.1% or less of the total absorbance), and must be enhanced electronically to be detectable (how this is done is discussed in the chapter on Instrumentation by Sutherland in this book). In addition, many factors have to be considered in order to optimise the data obtained from a CD spectrum; these are outlined below and detailed in subsequent chapters on Good Practice by Kelly and Price and Miles and Wallace. An illustration of how circularly polarised light interacts with chiral molecules is given in Figure 3. A plane-polarised light beam can be considered to consist of two equal and opposing left- and right- circularly polarised light beams. For simplicity we only need consider the electric vectors of these beams and in Figure 3 these vectors are viewed looking down the direction of the light source. When the magnitudes of the vectors are equal then the resultant will trace the linear path of the electric vector amplitude of a plane-polarised light beam. A CD signal is generated when there is a difference in the absorption between the left and right beams when they pass through a chiral sample. This will mean the magnitudes of the vectors will no longer be the same. This difference in magnitudes means that the resultant vector will be that of an elliptically polarised light beam having passed through the sample. Ellipticity is commonly reported as a measure for CD from spectrophotometers and is defined as the angle whose tangent is the minor elliptical axis divided by the major elliptical axis. Therefore a CD signal can be viewed either as an absorption difference as in Eqn. (1) or in terms of the ellipticity produced, although it can be seen that they are related to each other. Ellipticity as a term used for CD arose as a historical consequence of the earlier use of ORD and its inherent measure of rotation of plane-polarised light by chiral samples.
2.3 How is a CD Signal Produced?
This section is aimed at providing an overview of how CD signals arise rather than being a detailed study of CD theory. The reader is referred to the monographs cited in the Appendix to this book for references for more detailed discussions.
Figure 2. This figure shows the differing effects that arise from achiral and chiral samples on the absorption of left- and right- circularly polarised light. $A_L$ and $A_R$ are the absorptions of left- and right- circularly polarised light beams, respectively. The left of the figure for an achiral sample shows that $A_L$ and $A_R$ are always identical, and this is the case across all wavelengths, given as λ; therefore their difference is always zero. On the right of the figure a chiral sample shows that $A_L$ and $A_R$ can be different and this difference is related to the circular dichroism of the sample. For illustrative purposes, the magnitude of the difference is greatly enhanced in this figure; it is usually on the order of 0.001 absorbance units or less.
A chromophore may be defined as a virtually self-contained functional group to which specific electronic transitions can be attributed. A CD signal is produced by excitation of an electron from a ground state orbital to another orbital that is an excited state. For proteins the peptide bond represents the principal chromophore of interest, and the key transitions are from the n to π* (nπ*) and π to π* (ππ*) orbitals as shown in Figure 4 (note that subscripts are omitted in the text for simplicity but included in the figure for completeness). For a single peptide bond in isolation the nπ* transition gives rise to a CD signal at ~220 nm. The ππ* transition is at ~190 nm. The additional chirality features that arise from the three-dimensional secondary structures adopted by the polypeptide backbones modify these basic electronic transitions, giving rise to information about the conformation of the protein, as discussed in more detail in the chapter on Analyses of Proteins by Whitmore and Wallace.
Figure 3. Plane-polarised light can be considered as composed of the sum of two equal and opposing left- and right- circularly polarised light beams. These views are of the electric vector component of the beams looking down the direction of propagation of the light source. a) As the two beams pass through an achiral sample they are equally absorbed and so no CD signal is generated; the summed transmitted beam remains plane-polarised. b) Two circularly polarised beams when passing through a chiral sample will be unequally absorbed and the resultant light beam will trace out an elliptically polarised path. c) Pictorial representation defining the CD unit of ellipticity as detailed in the text. Note that to maintain simplicity no ORD effects are incorporated into this figure.
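The ellipticity pictured in Figure 3c can be tied to the absorbance difference of Eqn. (1) by a short derivation. What follows is a sketch of the standard textbook argument rather than something spelled out in this chapter. The symbols $E_L$ and $E_R$ (electric-vector amplitudes of the transmitted left- and right- circularly polarised components) and $E_0$ (their common incident amplitude) are introduced here purely for illustration, and the sign is chosen so that $\theta$ is positive when $A_L > A_R$.

```latex
\begin{align*}
E_{L,R} &= E_0\,10^{-A_{L,R}/2}
  && \text{(intensity} \propto E^2 \propto 10^{-A}\text{)} \\
\tan\theta &= \frac{E_R - E_L}{E_R + E_L}
  = \tanh\!\left(\frac{\ln 10}{4}\,\Delta A\right)
  && \text{(minor axis over major axis)} \\
\theta\,[\mathrm{rad}] &\approx \frac{\ln 10}{4}\,\Delta A
  && \text{(valid for small } \Delta A\text{)} \\
\theta\,[\mathrm{deg}] &\approx \frac{180}{\pi}\cdot\frac{\ln 10}{4}\,\Delta A
  \approx 32.98\,\Delta A
\end{align*}
```

Under these assumptions an absorbance difference of 0.001 corresponds to roughly 33 millidegrees, which is consistent with instruments reporting ellipticity in millidegrees.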
2.4 Wavelengths of Light Used for CD and SRCD Measurements
The wavelength of light at which a CD spectrum is recorded is standardly reported in nanometer (nm) units, although sometimes synchrotron radiation circular dichroism (SRCD) data is reported as an energy function in electron volts (largely due to the initial development of this method within the physics community where such units are more common when referring to synchrotron radiation). However, in this volume we will adhere to the standard wavelength units in both cases, where visible light spans the range from ~780 nm (the red end of the spectrum) down to ~360 nm, the violet end. From ~360 nm to ~260 nm is the near ultraviolet (UV) region, and from ~260 nm to ~190 nm is the far ultraviolet region. From ~190 nm down to ~120 nm is considered the vacuum UV (VUV) region, and most of the transitions in this region are only accessible using SRCD instruments. Coloured molecules such as the proteins myoglobin or bacteriorhodopsin also absorb light in the visible wavelength region, and if, as in these cases, the chromophores are either chiral themselves or in a chiral environment, they can give rise to CD signals in the visible wavelength region.
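Because SRCD data are sometimes quoted in electron volts, it can help to convert between the two scales. The following is a minimal Python sketch, assuming the usual relation E = hc/λ; the function names and the `REGIONS` table are hypothetical and introduced only for illustration, and the region boundaries are the approximate ("~") values quoted in the text, so assignments near a boundary should not be over-interpreted.

```python
# Photon energy (eV) <-> wavelength (nm), using E = h*c / wavelength.
HC_EV_NM = 1239.841984  # h*c expressed in eV*nm (rounded CODATA value)

# Approximate region boundaries in nm, as quoted in the text (all are "~" values).
REGIONS = [
    (120.0, 190.0, "vacuum UV"),
    (190.0, 260.0, "far UV"),
    (260.0, 360.0, "near UV"),
    (360.0, 780.0, "visible"),
]


def nm_to_ev(wavelength_nm: float) -> float:
    """Convert a wavelength in nanometres to a photon energy in electron volts."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return HC_EV_NM / wavelength_nm


def ev_to_nm(energy_ev: float) -> float:
    """Convert a photon energy in electron volts to a wavelength in nanometres."""
    if energy_ev <= 0:
        raise ValueError("energy must be positive")
    return HC_EV_NM / energy_ev


def spectral_region(wavelength_nm: float) -> str:
    """Name the approximate spectral region (half-open intervals, low <= w < high)."""
    for low, high, name in REGIONS:
        if low <= wavelength_nm < high:
            return name
    return "outside the ranges listed"


if __name__ == "__main__":
    for wavelength in (170.0, 190.0, 222.0, 280.0):
        print(f"{wavelength:.0f} nm = {nm_to_ev(wavelength):.2f} eV "
              f"({spectral_region(wavelength)})")
```

The half-open intervals are a design choice made so that each wavelength falls in exactly one region; the text itself does not assign the boundary values to either side.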
Figure 4. Schematic diagram of the orbital transitions generating CD signals from the amide bonds of the polypeptide backbones of proteins. The ‘n’ is associated with the lone pair orbitals while the ‘πo’ refers to the non-bonding delocalised π-orbitals and the ‘π*’ refers to the anti-bonding orbitals.
However, most CD studies of biomacromolecules are done in the UV region; the aromatic side chains of proteins produce distinct peaks in the near UV region, whereas peaks due to the protein polypeptide backbones tend to dominate the far UV region. Nucleic acids tend to produce peaks over the entire UV range, whereas carbohydrates for the most part absorb only in the VUV wavelength region, and hence have only been studied to any extent now that SRCD instruments have become available.
3. Units of Measurement
CD spectral data are reported in a number of different units, and the equations necessary to convert between them will be defined here. It is notable that there is considerable confusion in the literature as to the definitions because in a number of cases, the wrong equations or terms have been cited.
3.1 Ellipticity
Typically, commercial CD instruments make measurements as ellipticity units, θ, in dimensions of millidegrees. However, these are not the most useful units for comparisons of different samples as their magnitudes differ depending on the amount of protein present in the sample (i.e. the concentration and pathlength cell used to collect the data). Note that this dependence of ellipticity on the amount of protein in the sample is not entirely obvious when its units are expressed only as millidegrees.
3.2 Delta Epsilon (Molar Circular Dichroism)
One of the most commonly used measures of CD is Delta Epsilon. It is defined as:
$$\Delta\varepsilon = \varepsilon_L - \varepsilon_R = \frac{A_L - A_R}{d\,l} \tag{2}$$
where $\varepsilon_L$ and $\varepsilon_R$ are defined as the left and right extinction coefficients, respectively, $l$ is the pathlength for the sample in centimetres, and $d$ the molar concentration of the sample. For macromolecules the extinction coefficients are usually, although not always, defined per amino acid residue. This then defines Delta Epsilon, $\Delta\varepsilon$, which has the dimensions $\mathrm{M^{-1}\,cm^{-1}}$. Delta epsilon is sometimes referred to as molar circular dichroism.
3.3 Mean Residue Ellipticity
The other major parameters used to report CD data are mean residue and molar ellipticities, which can be linked to delta epsilon from Eqn. (3). Mean residue ellipticity, $[\theta]_{MRE}$ or MRE, is defined as:
$$[\theta]_{MRE} = 3298\,\Delta\varepsilon \tag{3}$$
The scalar term linking the MRE and delta epsilon values can be rigorously derived [1]. MRE has the dimensions $\mathrm{degrees\,cm^{2}\,dmol^{-1}\,residue^{-1}}$. Molar ellipticity, $[\theta]$, has dimensions of $\mathrm{degrees\,cm^{2}\,dmol^{-1}}$, and is calculated as above except using extinction coefficients derived from the whole protein rather than per residue. Some confusion has resulted because the same symbol is often used for MRE and molar ellipticity, but the magnitude of the spectrum should be a clue as MRE values tend to be on the order of 20,000, whereas molar ellipticities can be much larger. $[\theta]_{MRE}$ is obtained by dividing $[\theta]$ by the number of residues in the protein (strictly the number of residues in the protein minus one, although for large proteins the difference is negligible). The advantage of reporting the MRE value instead of the molar ellipticity is that this then means the magnitude of the spectrum will be independent of the size (molecular weight) of the protein, so comparisons between large and small proteins are easier. In practical terms, the machine units (ellipticities) measured can be directly converted to mean residue ellipticities by the equation:
$$[\theta]_{MRE} = \frac{\theta \times 0.1 \times \mathrm{MRW}}{c\,l} \tag{4}$$
where $\theta$ is the measured ellipticity in millidegrees, $c$ is the concentration in mg/ml, $l$ is the pathlength in cm and MRW is the mean residue weight of the sample (usually around 110 Daltons), defined as:
$$\mathrm{MRW} = \frac{\mathrm{MW}}{n-1} \tag{5}$$
where the MW is the molecular mass of the protein (in Daltons) and $n$ is the number of residues. The term $n-1$ refers to the number of peptide bonds present in the structure which is almost invariably one less than the number of residues. Conversion of directly measured ellipticity values in millidegrees to either $\Delta\varepsilon$ or $[\theta]_{MRE}$ parameters can enable ready comparisons of CD spectra from proteins
independent of the conditions (concentration, pathlength) under which the data were collected, so most papers in the literature report CD values in one of these units.
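Eqns. (3) to (5) can be chained to take a raw instrument reading to either reporting unit. The Python sketch below is one way to do this under the definitions given above; the function and parameter names are hypothetical, and the sample values in the example (a 151-residue protein of 16,500 Da measured at 0.1 mg/ml in a 0.1 cm cell) are invented purely to exercise the arithmetic.

```python
def mean_residue_weight(molecular_weight_da: float, n_residues: int) -> float:
    """Eqn. (5): MRW = MW / (n - 1), with n - 1 the number of peptide bonds."""
    if n_residues < 2:
        raise ValueError("need at least two residues to form a peptide bond")
    return molecular_weight_da / (n_residues - 1)


def mdeg_to_mre(theta_mdeg: float, mrw: float,
                conc_mg_per_ml: float, pathlength_cm: float) -> float:
    """Eqn. (4): ellipticity in millidegrees -> mean residue ellipticity.

    Concentration is in mg/ml and pathlength in cm, as in the text.
    """
    if conc_mg_per_ml <= 0 or pathlength_cm <= 0:
        raise ValueError("concentration and pathlength must be positive")
    return theta_mdeg * 0.1 * mrw / (conc_mg_per_ml * pathlength_cm)


def mre_to_delta_epsilon(mre: float) -> float:
    """Eqn. (3) rearranged: delta epsilon = MRE / 3298."""
    return mre / 3298.0


if __name__ == "__main__":
    # Invented example values, not taken from the chapter.
    mrw = mean_residue_weight(16500.0, 151)       # 110 Da per residue
    mre = mdeg_to_mre(-20.0, mrw, 0.1, 0.1)       # about -22000
    delta_eps = mre_to_delta_epsilon(mre)         # about -6.67
    print(f"MRW = {mrw:.1f} Da, MRE = {mre:.0f}, delta epsilon = {delta_eps:.2f}")
```

Keeping the three steps as separate functions mirrors the three equations, so a different mean residue weight or a per-molecule (molar) normalisation can be substituted without touching the rest.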
4. Good Practice in Data Collection
As with most experimental techniques there is little chance of salvaging meaningful results from data that are poorly collected for whatever reason. Therefore to gain the maximum possible information from a CD spectrum it is essential to try to optimise the data collection conditions. Kelly and Price and Miles and Wallace discuss the issues relating to these in detail in their chapters on Good Practice, and Ravi, Hills and Knight discuss their relevance to reproducibility standards for industry in the chapter on Biopharmaceutical Applications. Good practice approaches should take into account sample preparation, such as purity and concentration. In addition the solvents and buffers to be used should be considered for both their absorption properties and in terms of their ability to maintain protein integrity. These factors are important as CD is reliant on very small differences in absorption and unknowns that directly affect a measurement will clearly compromise the quality of the results obtained. Another important factor is an accurate knowledge of the pathlength of the sample cell being used, particularly significant (and potentially problematic) when cells with short pathlengths (around 10 microns for example) are used. Miles and Wallace describe methods for accurate determinations of cell pathlengths in the chapter on Calibration. It is notable that the pathlength values cited by the manufacturers of the cells are often in error and should be redetermined by the user. An often overlooked condition that can have serious consequences for the quality of the data is the total absorbance of the sample. Spectral data should be collected only in the wavelength range where the detector is capable of measuring the signal. Too much absorption by the sample or its buffer will mean that insufficient numbers of photons will reach the detector for accurate measurements to be made of the CD signal.
The consequences of ignoring this issue are also discussed in the Good Practice chapters. Finally, the spectrophotometer should be correctly calibrated and maintained on a regular basis. A poorly used or maintained instrument can still give a respectable looking CD spectrum but no meaningful results will be obtained from the data. Additionally, choosing the instrument parameters, for example how long to collect data at any given wavelength to gain the best signal-to-noise ratio, and choice of step size to avoid over-sampling but maximising the spectral definition based on the instrument resolution, is important in optimising the results obtained. These issues are discussed in the chapters on Calibration by Miles and Wallace and on Instrumentation by Sutherland. Ultimately experiments should aim to collect as broad a wavelength range as is necessary to ensure that sufficient information content is present for meaningful interpretation to be made of the data (see the chapter on Analyses of Proteins by Whitmore and Wallace for a discussion of the information content as a function of wavelength).
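The point about total absorbance can be turned into a simple screening step. The sketch below is only an illustration: the function names are hypothetical, and `max_absorbance` is a user-supplied limit that this chapter does not specify, since the appropriate value depends on the instrument and detector. It scans from long to short wavelength and reports the shortest wavelength down to which every point stays within the limit.

```python
def low_wavelength_cutoff(wavelengths_nm, total_absorbance, max_absorbance):
    """Return the shortest wavelength down to which absorbance stays within the limit.

    Scans from the longest wavelength towards the shortest and stops at the
    first point whose total absorbance (sample plus buffer) exceeds
    max_absorbance. Returns None if even the longest wavelength is over the limit.
    """
    if len(wavelengths_nm) != len(total_absorbance):
        raise ValueError("wavelength and absorbance lists must have equal length")
    cutoff = None
    # Sort long -> short so the input order does not matter.
    for wavelength, absorbance in sorted(zip(wavelengths_nm, total_absorbance),
                                         reverse=True):
        if absorbance > max_absorbance:
            break
        cutoff = wavelength
    return cutoff


def truncate_spectrum(wavelengths_nm, cd_signal, cutoff_nm):
    """Keep only (wavelength, CD) points at or above the cutoff wavelength."""
    return [(w, s) for w, s in zip(wavelengths_nm, cd_signal) if w >= cutoff_nm]


if __name__ == "__main__":
    # Invented example values; the limit of 2.0 is an assumption, not a recommendation.
    wavelengths = [260, 250, 240, 230, 220, 210, 200, 190, 180]
    absorbance = [0.1, 0.2, 0.3, 0.5, 0.8, 1.2, 1.9, 2.6, 3.4]
    print(low_wavelength_cutoff(wavelengths, absorbance, max_absorbance=2.0))
```

Stopping at the first violation, rather than simply filtering individual points, is a deliberate choice here: it avoids keeping isolated short-wavelength points that happen to dip below the limit within an otherwise unreliable region.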
R.W. Janes and B.A. Wallace / An Introduction to CD and SRCD Spectroscopy
5. Utilisation and Applications of CD Data in Studies of Proteins
5.1 CD Spectral Features Associated with Different Protein Secondary Structures Within proteins, peptide bonds are not in isolation so this leads to the potential for interactions between them. Polypeptide chains form diverse types of secondary structures that are themselves chiral (such as left-handed or right-handed helices); these features modulate the CD spectra of the protein’s peptide bonds and hence provide information on the net secondary structures of the protein. To illustrate additional features that can arise due to these three-dimensional features, consider the α-helix. Within the conformation of most proteins this type of secondary structure most commonly spirals in a right-handed sense, and as a result adds a further layer of chirality to a CD spectrum. α-Helical structures produce two negative CD peaks at ~222 nm and ~208 nm and a positive peak ~190 nm. A property known as exciton coupling arises as a consequence of a number of electronic transitions of comparable type, namely those of the ππ*, being close in both physical proximity and alignment causing excitations to arise within and between them [2]. Exciton coupling splits the ππ* transition into two components defined relative to the helix axis; the π to π*-parallel (ππ*║) being the negative peak ~208 nm, and the π to π*-perpendicular (ππ*┴), the positive ~190 nm peak (Figure 5). The juxtapositions of peptide bonds within the other types of secondary structure result in their characteristic spectral features. For β-sheet structures, the transitions are at ~215 nm for the n to π* transition, and ~195 nm for the π to π* transition (although the wavelengths and shapes of the spectra for different types of β-sheet structures differ considerably due to the different peptide bond geometries associated with parallel and anti-parallel sheets, and sheets with different twists). 
5.2 Determination of Protein Secondary Structures from CD Data
To a reasonable approximation, just as the overall tertiary structure of a polypeptide chain can be considered to be the sum of all its secondary structure components, so too a CD spectrum can be thought of as the sum of the spectral contributions of each of the secondary structure types present. At each wavelength, the following relationship holds between the CD measurement made and the fractions of secondary structures present:

$$[\theta]_{MRE} = h f_\alpha [\theta]_\alpha + f_\beta [\theta]_\beta + f_T [\theta]_T + f_O [\theta]_O \qquad (6)$$
where the measured mean residue ellipticity, $[\theta]_{MRE}$, is the sum of the mean residue ellipticities $[\theta]_\alpha$, $[\theta]_\beta$, $[\theta]_T$ and $[\theta]_O$ for pure α-helix, β-sheet, β-turn and "other" components, respectively, weighted by the fractional proportions of residues present in these types of secondary structures in the protein (for example $f_\alpha$ is the fraction of helix). It should be noted that "other" is often referred to as "random coil"; however, this is a misnomer, as for most proteins this type of structure is neither random nor coil. Here it is meant to represent the components that are not otherwise defined as one of the standard secondary structure types; these are often now referred to as “natively disordered” or “unfolded” structures. Within most protein structures these will have relatively well-defined topologies but will not have the regular repeating pattern or defined geometry found in one of the canonical secondary structure features. The coefficient h in the equation is included because there is a modified contribution to the CD spectrum associated with the length of the helices found in proteins. The relationship described in Eqn. (6) can be used empirically to determine the secondary structure of an unknown protein from its CD spectrum, if there is information on what the spectral characteristics are for the “pure” types of secondary structures. Two general approaches have been adopted to produce the relevant information: the basis set and reference dataset methods.

Figure 5. SRCD spectra of three proteins, each with a single predominant type of secondary structure component. The solid line is the spectrum of a mainly α-helical protein, the dashed line is that of a β-sheet-rich protein, and the dotted line is that of a protein containing mostly polyproline-II structure. (Axes: MRE against wavelength in nm, 160 to 300 nm.)

5.2.1 Basis Sets
Basis sets represent the mathematically-derived hypothetical CD spectra of the individual types of secondary structure components. Two such basis sets of spectra have been produced by back-calculation of Eqn. (6) using proteins of known secondary structures (from their solved crystal structures) and measured CD data [3, 4]. Although the α-helical spectra are essentially identical in both basis sets, the spectra of the other components differ slightly in their appearances, largely because these components differ in their geometries in different proteins, and the two basis sets were constructed using different protein examples. In general, the basis set approach works well for helical proteins, but does not allow for the significant diversity seen in secondary structural features associated with different types of protein folds.
5.2.2 Reference Datasets
Reference datasets rely on the fact that the secondary structure components of a protein can be determined from analysis of its crystal structure, and on the assumption that the secondary structures of the “unknown” protein to be analysed have geometric and secondary structural characteristics similar to those of the proteins in the reference dataset. Consequently, having more examples of secondary structures from different proteins will increase the accuracy of the analysis when used in a variable selection type of procedure. A reference set of proteins (each of which has a well-determined crystal structure, and for which a good quality CD spectrum is available) is used to create matrices containing the information on the determined secondary structure components for each protein and the CD spectra of the same proteins; this is covered in more detail in the chapter on Reference Datasets by Janes.
5.2.3 Empirical Calculations of Secondary Structure
Because the measured mean residue ellipticity values are wavelength-dependent, Eqn. (6) holds true for every data point across the far UV spectral range, meaning that there will be fifty or more simultaneous equations that can be solved. The number of equations depends on the wavelength range and interval of the data collected. Each equation contains a measured value (the ellipticity), five constants (the four “pure” spectra and h) and only four unknowns (the fractions), so the system is overdetermined and the fractional values can be solved for (simplistically) by a least squares approach using matrix algebra. In practice, more sophisticated algorithms are used (see the Analysis of Proteins chapter by Whitmore and Wallace), but the principle is the same whether the simple basis sets with a least squares method are used or the more flexible and generally more accurate variable selection procedures with the reference datasets are employed.
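The simplistic least squares treatment of Eqn. (6) can be sketched in a few lines. This is a minimal illustration rather than any of the published algorithms: the four "basis spectra" are made-up numbers at six wavelengths (they are not the basis sets of references [3] or [4]), h is taken as 1, and the fractions are neither constrained to be non-negative nor forced to sum to 1. The function and variable names are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def fit_fractions(basis, spectrum):
    """Least squares fractions f minimising |B f - spectrum|^2 via the normal equations."""
    names = list(basis)
    columns = [basis[name] for name in names]
    normal = [[sum(p * q for p, q in zip(ci, cj)) for cj in columns] for ci in columns]
    projected = [sum(p * q for p, q in zip(ci, spectrum)) for ci in columns]
    return dict(zip(names, solve_linear(normal, projected)))


# Hypothetical basis spectra (units of 10^3 deg cm^2 dmol^-1) at six wavelengths.
WAVELENGTHS_NM = [190, 200, 208, 215, 222, 240]
BASIS = {
    "helix": [70.0, 30.0, -33.0, -30.0, -36.0, -5.0],
    "sheet": [30.0, 20.0, -5.0, -18.0, -12.0, -1.0],
    "turn": [-5.0, 15.0, 5.0, -2.0, -1.0, 0.0],
    "other": [-20.0, -35.0, -10.0, -3.0, 2.0, 0.0],
}

# Build a synthetic "measured" spectrum from known fractions using Eqn. (6) with h = 1.
TRUE_FRACTIONS = {"helix": 0.45, "sheet": 0.25, "turn": 0.10, "other": 0.20}
measured = [
    sum(TRUE_FRACTIONS[name] * BASIS[name][i] for name in BASIS)
    for i in range(len(WAVELENGTHS_NM))
]

fitted = fit_fractions(BASIS, measured)
for name, value in fitted.items():
    print(f"{name}: {value:.3f}")
```

Because the synthetic spectrum is noise-free and built from the same basis, the fit simply recovers the input fractions. With real data the spectra of the non-helical components vary between proteins, which is one reason the reference dataset methods are generally preferred.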
5.3 Applications of CD Spectroscopy for the Study of Proteins The resolution of the information produced from a CD spectrum is limited with respect to that produced by other techniques such as X-ray crystallography or NMR spectroscopy because only the total helix, sheet or other secondary structure content is determined (and thus it can not attribute specific secondary structures to specific regions of the molecule). However, this information can be very beneficial in structural biology studies, in part because it can examine a protein in solution (as opposed to crystals) and under more nearly physiological conditions than the low pH/high concentrations required for NMR data collection. In addition, it can examine dynamic changes, requires little material, and can produce results rapidly. If very specific and well-defined questions are posed, it can answer a number of important structural and functional issues regarding a protein. Below are described a few examples of the types of studies in which CD has been shown to be very useful. 5.3.1 As Tests of Molecular Modelling and Structural Integrity Secondary structural information on a novel protein can be used to test tertiary structure models produced by homology, ab initio or other techniques. Because the spectrum will not uniquely correspond to a particular structure, it cannot provide
“proof” that a given model is correct. It can, however, provide a test criterion for elimination of some models from consideration. CD is very useful for monitoring whether an expressed or isolated protein is correctly folded. This can be an important issue for expressed domain constructs of proteins, which may or may not fold correctly in the absence of the other parts of the protein, but which are often produced to facilitate crystallisation or better folded structures suitable for NMR. An additional role for CD can be to examine a mutant protein with respect to the wild type form, to see if the mutation alters the overall fold and integrity of the protein [5]. If a study aims to look for subtle differences in protein function (especially loss of function), but the mutation produces a misfolded protein, CD studies can be important in identifying the structural basis for loss of activity. However, it should be noted that whilst a mutant having a CD spectrum similar to that of the wild-type offers support for correct folding, it is not a guarantee that the structures are the same because several different structures can produce the same CD spectrum. 5.3.2 Ligand/Protein/Drug Binding Very often binding of a natural ligand or drug to a protein slightly alters the secondary structure content from that of the apo-protein. Comparisons of CD spectra taken before and after binding can be used to identify whether binding occurs, for determination of binding constants [6], and to identify the secondary structure components that have been altered through the binding. The latter is a significant increase in information about the system relative to that provided by other methods which monitor binding and can be very advantageous when the tertiary structure might be known for one form but not the other, thereby providing functionally-important information. 
5.3.3 Thermal Stability and Ligand Binding A mutation in a protein that is the causative agent of a genetic disease may give rise to altered physical properties in comparison to the wild-type protein [7]. These types of alterations often affect the stability of the protein and how it responds to being heated. They may be more or less susceptible to changes in temperature than the wild-type protein and this can readily be monitored via “thermal melt” CD experiments. Two methods can be employed: either a full spectrum at different temperatures can be recorded, or data at one single significant peak wavelength, chosen because of the nature of the secondary structure content present in the protein, can be collected to monitor changes in that component. Such thermal studies can aid in defining the mechanism and unfolding pathway (and sometimes even the in vitro folding pathway and mechanism if the protein can be re-folded). An additional use of CD thermal melt experiments is as a very sensitive monitor of ligand binding [8]. Even if the binding does not sufficiently influence the secondary structure to cause a detectable change in the CD spectrum as described in the previous section, if the ligand binding stabilises the protein structure (as is often the case), a shift of the melting point (Tm) of the protein can provide a very sensitive measure for ligand binding.
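As an illustration of the single-wavelength thermal melt, the sketch below estimates a melting temperature as the midpoint of the transition and compares a protein with and without ligand. It assumes an idealised two-state sigmoidal transition with flat baselines that is fully contained in the temperature range scanned; the melting temperatures, transition width and signal values are hypothetical, and real data would normally be fitted with a proper thermodynamic model.

```python
import math


def two_state_signal(temp_c, tm_c, width_c, folded, unfolded):
    """CD signal for an idealised two-state unfolding transition (sigmoid in temperature)."""
    fraction_unfolded = 1.0 / (1.0 + math.exp((tm_c - temp_c) / width_c))
    return folded + (unfolded - folded) * fraction_unfolded


def estimate_tm(temps_c, signals):
    """Temperature at which the signal is halfway between its first and last values."""
    halfway = 0.5 * (signals[0] + signals[-1])
    points = list(zip(temps_c, signals))
    for (t0, s0), (t1, s1) in zip(points, points[1:]):
        if s0 != s1 and (s0 - halfway) * (s1 - halfway) <= 0.0:
            # Linear interpolation between the two points that bracket the midpoint.
            return t0 + (halfway - s0) * (t1 - t0) / (s1 - s0)
    raise ValueError("signal never crosses its midpoint")


# Synthetic melts monitored at one wavelength (e.g. 222 nm), 20 to 90 degrees C in 2 degree steps.
temps = [20.0 + 2.0 * i for i in range(36)]
apo = [two_state_signal(t, 55.0, 3.0, -30.0, -5.0) for t in temps]
bound = [two_state_signal(t, 59.0, 3.0, -30.0, -5.0) for t in temps]

tm_apo = estimate_tm(temps, apo)
tm_bound = estimate_tm(temps, bound)
print(f"Tm without ligand: {tm_apo:.1f} C")
print(f"Tm with ligand:    {tm_bound:.1f} C")
print(f"Shift on binding:  {tm_bound - tm_apo:.1f} C")
```

The quantity of interest for ligand binding is the shift between the two midpoints rather than either absolute value, so even this crude midpoint estimate can reveal stabilisation, provided both melts are collected and analysed in the same way.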
5.3.4 Changes Associated with pH
Some proteins undergo conformational changes comparable to those seen in thermal stability studies when the pH of their surrounding environment is modified. These structural alterations may well be related to their functioning, and can be monitored by titration studies (some commercial instruments even include a titration module to enable this). Local changes induced in side chains and main chains can alter the interactions holding the structure together such that the protein changes its overall conformation; these can readily be monitored either in the near UV for changes involving side chains (tertiary structure), or in the far UV for changes involving secondary structures.
5.3.5 Kinetics (Folding and Reactions)
An extremely effective utilisation of CD is to observe the results of reactions that happen quickly, commonly on the microsecond to millisecond time frame. These are best studied by “stopped flow” techniques, whereby the reaction is initiated by mixing the protein and the ligand at a fixed time before the material is flowed into the sample chamber, at which point the flow is stopped and the data are recorded. Varying the time between initiation of the reaction and when the mixture reaches the sample chamber allows different components or intermediate stages of the reaction to be monitored quite readily. Limitations to such studies are that the changes in signal may be very small and that the changes may occur primarily at wavelengths (below 220 nm) where the signal-to-noise levels are low. Hence time-resolved studies have tended to focus on changes which involve α-helical components, which can be monitored at 222 nm. With the development of SRCD beamlines, stopped flow studies at lower wavelengths (i.e. ~190, 208 or 215 nm) have been enabled, thus permitting monitoring of peaks associated primarily with other types of secondary structures such as β-sheets or disordered structure.
The kinetics of folding and unfolding of proteins in the presence of chemical denaturants can also be monitored using stopped flow, and these can be compared with the kinetic changes monitored by other biophysical techniques such as fluorescence spectroscopy to determine if concerted or cooperative unfolding and/or folding occurs. In addition, temperature-jump initiated unfolding events can examine whether thermal and chemical denaturation processes follow similar pathways and kinetics.
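A kinetic trace of this kind is often summarised by a rate constant. The sketch below assumes, purely for illustration, that the CD signal relaxes as a single exponential towards a known final value, and extracts the rate constant from a straight-line fit to the logarithm of the remaining signal change. The trace is synthetic and noise-free, all numerical values are hypothetical, and real folding reactions may well show several phases.

```python
import math


def fit_rate_constant(times_s, signals, signal_final):
    """Rate constant k from a linear fit of ln|S(t) - S_final| against time."""
    logs = [math.log(abs(s - signal_final)) for s in signals]
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, logs))
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    return -sxy / sxx  # the slope of the line is -k


# Synthetic refolding trace at a single wavelength: signal goes from -5 towards -25.
TRUE_K = 80.0  # per second, hypothetical
times = [0.002 * i for i in range(1, 26)]  # 2 ms to 50 ms
trace = [-25.0 + 20.0 * math.exp(-TRUE_K * t) for t in times]

k_fit = fit_rate_constant(times, trace, signal_final=-25.0)
print(f"fitted rate constant: {k_fit:.1f} per second")
print(f"half-time: {1000.0 * math.log(2.0) / k_fit:.2f} ms")
```

Rate constants obtained in this way from CD can then be compared with those from another probe, such as fluorescence, measured under the same conditions.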
6. Utilisation and Applications of CD Data in Studies of Nucleic Acids
6.1 Nucleotide Spectral Features In nucleic acids it is the bases that are the principal chromophores, having as they do many lone pairs and π bonds within their structures. However, as they are planar, they do not have any intrinsic CD. The CD arises from the asymmetry induced by the linked sugar groups. Most of the electronic transitions associated with the sugar groups themselves, however, fall at wavelengths that are below the detection range of conventional lab-based CD instruments. The CD signals that arise from the bases are dominated by ππ* transitions since the nπ* transitions are of moderate to weak intensity. The CD spectra of the different nucleotides are distinct [9] and include peaks
across the near, far and VUV wavelength ranges. The spectra of mononucleotides are relatively weak, but if two nucleotides are linked together to form a dinucleotide, exciton coupling can occur between the ππ* transitions that are either identical (degenerate) if both nucleotide components are the same, or nearly identical if they are different. The CD spectra from dinucleotides exhibit hyperchromicity, being more intense by roughly an order of magnitude than those from monomers. It is when the bases are linked together in polynucleotides, giving rise to many degenerate interactions between the bases, and they gain the additional characteristics associated with the asymmetric features of secondary structures such as the DNA double helix, that they develop strong CD signals in the UV region of the spectrum [10]. The CD spectral characteristics of the different forms of DNA are distinct [11], due primarily to the different helical parameters resulting in different interactions between the bases in the different forms of double helical DNA. Three major structural forms of DNA have been identified as stable in solution; they are A-DNA, B-DNA and Z-DNA. These differ in either the pitch (the distance along the axis it takes to complete one turn) of the double helix, or the direction of rotation of the double helix along the axis, or both, as well as in the nature of the stacking of the bases. B-DNA is the form first identified by Crick and Watson and is most often found in nature. B-DNA has a positive peak in its CD spectrum at ~280 nm and a negative peak at ~240 nm, both of similar magnitudes, and an intense positive peak at ~190 nm. In contrast, A-DNA has a stronger positive peak at ~270 nm, a negative peak at ~210 nm and a very strong positive peak at ~185 nm.
Z-DNA, which is of the opposite helical sense from both A- and B-DNAs, has a negative peak at ~290 nm and a characteristic intense negative peak at ~195 nm, so its spectrum is more or less a mirror image of that of A- and B-form DNAs. In addition to the double helix forms of DNA, there can exist single or multi-stranded forms as well, which have characteristic spectral “signatures”. Some tertiary structure level arrangements of DNA can still retain relatively simple bonding arrangements within their structures, which can be identified by CD spectroscopy. Holliday junctions, for example, retain a B-DNA-like geometry according to their CD spectra [12]. Recent studies using SRCD spectroscopy have demonstrated the importance of the VUV wavelength region for showing long distance electron coupling [13] as well as the conformation of the sugar moieties [14]. The exciton couplings are more prominent in the VUV region of the spectrum and can span many nucleotide base pairs within an oligonucleotide; within a single chain they can affect as many as eight bases. The related technique of linear dichroism spectroscopy (discussed in this book in the chapter by Rodger) has proven to be an important addition to circular dichroism spectroscopy for examination of extended and orientable molecules such as DNA and RNA.
6.2 Nucleotide-Drug Binding
An important area of nucleic acids research in which CD studies have played an integral role is that of binding of drugs (including anti-tumour drugs) to DNA structures. Many of these drugs, being achiral themselves, do not intrinsically generate a CD signal; however, when they bind to DNA, with its innate chirality associated with its secondary structures, an induced/enhanced CD signal can be generated [15]. These studies can often provide valuable information on the mode of binding of the drug to the DNA, binding constants and drug:DNA mole ratios, and whether or not the binding
is specific to certain base-pair arrangements within the structure. Binding of such agents generally takes place in one of three ways: Groove binding is where the drug binds either into the major groove (the usual mode of binding), or minor groove of the DNA helix. Intercalation is when the drug binds between adjacent bases in one of the chains often forming π-bonding interactions with the bases as it does so. The last is the insertion form, where the drug binds right in between the two strands of the double helix fully inserting itself into the heart of the structure. Each of these forms gives rise to characteristic but different CD spectra.
7. Synchrotron Radiation Circular Dichroism Spectroscopy
Most of this chapter has been concerned with lab-based conventional CD data. This final section introduces SRCD, an emerging technique which extends the capacity to record CD data down into the VUV wavelength range, substantially lower than the wavelengths attainable on conventional CD instruments. Only a brief overview will be presented here; the chapters on Good Practice in SRCD and Applications of SRCD by Miles and Wallace and the chapter on Instrumentation by Sutherland cover the material in greater depth.
7.1 Advantages of SRCD over CD
SRCD, as the name suggests, arises from utilisation of a synchrotron as the light source for CD studies. Conventional CD instruments typically use a Xenon arc lamp as their light source, but the light from these lamps drops in flux by more than two orders of magnitude as the wavelength is decreased from ~260 nm to ~180 nm. Generally, SRCD sources maintain a constant flux from ~260 nm down to at least ~140 nm, which means that measurements to much lower wavelengths are possible. In most cases the flux in a synchrotron beamline is many orders of magnitude higher at all wavelengths than is the flux of the Xenon lamp [16]. This higher flux and the consequential extended wavelength range are the keys to many of the benefits gained using SRCD. A major advantage of the high flux is that it greatly improves the signal-to-noise ratios in the data. Alternatively, the high flux makes it possible to attain signal-to-noise levels comparable to those of a conventional instrument while using very much smaller amounts of sample, or while making very much faster measurements, thus providing a significant benefit if only small amounts of material exist, or if the sample is unstable with time. Both the small sample requirement and the faster data collection also facilitate time-resolved measurements to follow reaction kinetics.
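The trade-off between flux, signal-to-noise and measurement time can be illustrated with a rough calculation. The sketch below assumes that the noise is dominated by photon counting statistics, so that the signal-to-noise ratio scales with the square root of the number of photons detected (flux multiplied by time). That assumption, the hundred-fold flux ratio and the 60 second dwell time are all hypothetical choices for illustration, not properties of any particular beamline.

```python
import math


def snr_gain(flux_ratio):
    """Signal-to-noise improvement at equal measurement time, assuming shot-noise-limited data."""
    return math.sqrt(flux_ratio)


def time_for_equal_snr(flux_ratio, reference_time_s):
    """Measurement time giving the same signal-to-noise as the reference at the higher flux."""
    return reference_time_s / flux_ratio


FLUX_RATIO = 100.0        # hypothetical: higher-flux source is 100 times brighter
REFERENCE_TIME_S = 60.0   # hypothetical averaging time per point on the weaker source

print(f"signal-to-noise gain at equal time: {snr_gain(FLUX_RATIO):.0f}x")
print(f"time for equal signal-to-noise:    {time_for_equal_snr(FLUX_RATIO, REFERENCE_TIME_S):.1f} s")
```

Under these assumptions the extra flux can be spent either on cleaner data in the same time or on the same quality of data in a much shorter time, which is the choice described in the paragraph above.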
A second advantage of the increased flux is that samples can be measured in buffers or other environments (solvents, membranes, detergents) that themselves produce significant (unpolarised) absorbance. Even with a highly absorbing solution, some light does reach the detector in an SRCD so measurements in a wider range of buffer and salt conditions are feasible. A further advantage of the much higher flux in the low wavelength regions is that accurate SRCD measurements can be made in aqueous solutions to wavelengths as low as ~160 nm. The limit in this case is where the water absorption becomes too high for CD data to be measured reliably, although hydrated or dehydrated film samples, where there is minimal or no water present, respectively, can still provide CD data well below this wavelength. In the SRCD-accessible lower wavelength range (the region between
~160 nm and the practical limit of conventional instruments of ~185 nm or 190 nm) there are additional electronic transitions, considered to arise from charge transfer transitions both in proteins and in DNA samples [17, 13]. This increased wavelength range provides a higher information content in the data, which can improve the analyses of the data (see the Analysis chapter by Whitmore and Wallace for a further discussion of this). In addition, it enables examination of molecules which only have transitions in the VUV region, for example some sugar molecules [18] that are otherwise transparent to the CD measurements. 7.2 Applications Unique to SRCD Because of the advantages noted above, SRCD enables a number of types of experiments that would be beyond the capabilities of conventional CD instruments. These are discussed in more detail in the chapter on SRCD Applications by Miles and Wallace and in the review in reference [19]. A few examples of such experiments are highlighted below. 7.2.1 Measurements in Highly Absorbing Solutions (Including Membranes) Proteins often require the use of buffers and/or high salt concentrations or additives in order to retain their solubility and prevent aggregation [15]. Many such buffers absorb light strongly in the far UV wavelength range, and although they do not exhibit a chiral signal, the transparency of the solution to both the left- and right- circularly polarised light in a conventional CD is severely limited. SRCD offers the chance to undertake measurements in higher concentrations of buffers and salts (such as sodium chloride) that absorb strongly [19]. 
Because similar absorbance problems arise from lipids and detergents, SRCD can greatly improve the ability to measure membrane protein samples in bilayers and micelles [20].
7.2.2 Detection of Subtle Differences in Mutants
On occasion, changes in structure between a wild-type and a mutant protein are so subtle that they cannot be measured on a conventional CD instrument, because the overall noise level is comparable to the differences between the spectra, and hence the changes are below the confidence/error levels. With the inherent increase in signal-to-noise offered by SRCD, subtle differences can be identified in such data. For example, in one case a mutation related to disease (the single amino acid change from the wild type to a cataract-forming mutant in the γD-crystallin protein) was undetectable using a lab-based instrument, but the same sample readily showed significant differences associated with an increase in sheet content when examined by SRCD [21].
7.2.3 Detection of Protein-Protein Rigid-Body Interactions
The detection of protein-protein intermolecular interactions has been an important role for conventional CD spectroscopy. It has been possible to detect such interactions in the far UV wavelength region provided they involve a net change in secondary structure of one or both of the two components. However, when the interaction results
in no change in secondary structure content (i.e. rigid body interactions where neither component changes conformation), CD measurements in the far UV region are unable to detect the binding. These sorts of interactions can be observed in SRCD data however, where changes in the lower wavelength (VUV) regions can reflect different types of interactions – resulting either from differences in side chain or backbone mobility or environmental changes affecting the charge-transfer transitions. SRCD was shown to be able to detect the interaction between two proteins, even though that interaction did not involve any change in secondary structure of either component protein [22]. Because all the differences occurred at wavelengths below 190 nm, they were not detectable by conventional CD measurements. Hence SRCD can make these new types of measurements that would otherwise not be possible, expanding the utility of the technique.
8. Summary Clear strengths of CD as a technique are its relative speed and simplicity in producing data that can give important insight into the structure and function of macromolecules in solution. As such it has the capacity to complement the results obtained from other techniques, such as X-ray crystallography and NMR spectroscopy, as well as providing results unobtainable via any other means. This book includes chapters which discuss how data can be optimally collected, analysed and interpreted, as well as discussing new applications of the method in the biosciences.
Acknowledgements The conventional CD and SRCD studies in our labs have been supported by grants from the U.K. Biotechnology and Biological Sciences Research Council.
References
[1] G.D. Fasman (ed.), Circular Dichroism and the Conformational Analysis of Biomolecules, (1996) Plenum Press.
[2] C.R. Cantor and P.R. Schimmel, Biophysical Chemistry, Part II, (1980) W.H. Freeman and Company.
[3] Y.H. Chen and J.T. Yang, A new approach to the calculation of secondary structure of globular proteins by optical rotatory dispersion and circular dichroism, Biochem. Biophys. Res. Comm. 44 (1971) 1285-1291.
[4] S. Brahms and J. Brahms, Determination of protein secondary structure in solution by vacuum ultraviolet circular dichroism, J. Mol. Biol. 138 (1980) 149-178.
[5] D.G. Lawton, C. Longstaff, B.A. Wallace, J. Hill, S.E.C. Leary, R.W. Titball and K.A. Brown, Interactions of the type III secretion pathway proteins LcrV and LcrG from Yersinia pestis are mediated by coiled-coil domains, J. Biol. Chem. 277 (2002) 38714-38722.
[6] Y.C. Chen and B.A. Wallace, Binding of alkaline cations to the pore form of gramicidin, Biophysical J. 71 (1996) 163-170.
[7] R. Maytum and R.W. Janes, Synchrotron radiation circular dichroism spectroscopy reveals a new structural transition in tropomyosin, Biophysical J. 92 (2007) 362a.
[8] A.O. O’Reilly, K. Charalambous, G. Nurani, A.M. Powl and B.A. Wallace, G219S mutagenesis as a means of stabilising conformational flexibility in the bacterial sodium channel NaChBac, Molecular Memb. Biol. 25 (2008) 670-676.
[9] C.A. Sprecher and W.C. Johnson, Jr., Circular dichroism of the nucleic acid monomers, Biopolymers 16 (1977) 2243-2264.
[10] K.E. van Holde, J. Brahms, and A.M. Michelson, Base interactions of nucleotide polymers in aqueous solution, J. Mol. Biol. 12 (1965) 726-739.
[11] J.H. Riazance, W.A. Baase, W.C. Johnson, K. Hall, P. Cruz, and I. Tinoco, Evidence for Z-form RNA by vacuum UV circular dichroism, Nucleic Acids Res. 13 (1985) 4983-4989.
[12] T. Shida, H. Iwasaki, H. Shinagawa and Y. Kyogoku, Characterization and comparison of synthetic immobile and mobile Holliday junctions, J. Biochem. (Japan) 119 (1996) 653-658.
[13] U. Kadhane, A.I.S. Holm, S.V. Hoffmann, and S.B. Nielsen, Strong coupling between adenine nucleobases in DNA single strands revealed by circular dichroism using synchrotron radiation, Phys. Rev. E 77 (2008) 021901.
[14] A.I.S. Holm, E.S. Worm, T. Chakraborty, B.R. Babu, J. Wengel, S.V. Hoffmann and S.B. Nielsen, On the influence of conformational locking of sugar moieties on the absorption and circular dichroism of nucleosides from synchrotron radiation experiments, J. Photochem. Photobiol. A 187 (2007) 293-298.
[15] V. Rajendiran, M. Murali, E. Suresh, M. Palaniandavar, V.S. Periasamy and M.A. Akbarsha, Noncovalent DNA binding and cytotoxicity of certain mixed-ligand ruthenium(II) complexes of 2,2'-dipyridylamine and diimines, Dalton Trans. 16 (2008) 2157-2170.
[16] A.J. Miles, R.W. Janes, A. Brown, D.T. Clarke, J.C. Sutherland, Y. Tao, B.A. Wallace and S.V. Hoffmann, Light flux density threshold at which protein denaturation is induced by synchrotron radiation circular dichroism (SRCD) beamlines, J. Synchrotron Rad. 15 (2008) 420-422.
[17] A.T.B. Gilbert and J.D. Hirst, Charge-transfer transitions in protein circular dichroism spectra, J. Mol. Struct. (Theochem) 675 (2004) 53-60.
[18] E.R. Arndt and E.S. Stevens, Vacuum-ultraviolet circular-dichroism studies of simple saccharides, J. Amer. Chem. Soc. 115 (1993) 7849-7853.
[19] A.J. Miles and B.A. Wallace, Synchrotron radiation circular dichroism spectroscopy of proteins and applications in structural and functional genomics, Chem. Soc. Rev. 35 (2006) 39-51.
[20] B.A. Wallace, J.G. Lees, A.J.W. Orry, A. Lobley and R.W. Janes, Analyses of circular dichroism spectra of membrane proteins, Protein Sci. 12 (2003) 875-884.
[21] P. Evans, K. Wyatt, G.J. Wistow, O.A. Bateman, B.A. Wallace and C. Slingsby, The P23T cataract mutation causes loss of solubility of folded γD-crystallin, J. Mol. Biol. 343 (2004) 435-444.
[22] N.P. Cowieson, A.J. Miles, G. Robin, J.K. Forwood, B. Kobe, J.L. Martin and B.A. Wallace, Evaluating protein:protein complex formation using synchrotron radiation circular dichroism spectroscopy, Proteins: Struct. Func. Bioinform. 70 (2008) 1142-1146.
Modern Techniques for Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy B.A. Wallace and R.W. Janes (Eds.) IOS Press, 2009. © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-000-1-19
Measurement of Circular Dichroism and Related Spectroscopies with Conventional and Synchrotron Light Sources: Theory and Instrumentation
John C. SUTHERLAND 1,2
1 Physics Department, East Carolina University
2 Biology Department, Brookhaven National Laboratory
Abstract. A linearly polarized optical beam that passes through a photoelastic modulator can be used to measure circular dichroism, linear dichroism, optical rotary dispersion, fluorescence detected circular dichroism and fluorescence polarization anisotropy. These parameters, along with the simultaneous measurement of absorption and fluorescence, can be obtained with a conceptually simple single beam spectrometer with only minor adaptations. Practical spectrometers and the components needed to build them using both conventional (Xenon arc) and synchrotron light sources are reviewed. The need to match the components throughout the instrument is discussed and methods of calibrating critical components are described. Potential artifacts associated with sample inhomogeneity are discussed. An explanation for the observation that a linear dichroism signal can appear at the modulation frequency expected for circular dichroism is presented. Spectra demonstrating the ability of a synchrotron source spectrometer to extend the range of wavelengths for the circular dichroism of proteins are presented. Three general classes of practical spectrometers are described based on the components used and the performance achievable. Vendors of polarization-modulation spectrometers and specialized components required to build them are listed, as are existing and planned synchrotron based user facilities.
1. Introduction
This chapter focuses on instrumentation for the measurement of circular dichroism (CD) and related spectroscopic measurements in the visible and ultraviolet (UV) regions of the spectrum, which provide information on molecules of biological interest. Particular attention is devoted to the vacuum ultraviolet (VUV), which begins at roughly 190 nm, because of recent advances permitting measurement of CD of suitably transparent samples to about 120 nm using synchrotron radiation, an experiment referred to as SRCD. I review the extraordinary variety of measurements useful in molecular biophysics that can be performed using a basic CD spectrometer with relatively modest additions or modifications [1-8]. The measurements of the chirality of vibrational transitions, which can be observed at infrared wavelengths or via differential Raman spectroscopy [9], and CD in the X-ray region, also made possible by synchrotron radiation but using vastly different technology, are not treated here, nor are certain aspects of electronic CD
instrumentation that have been the subject of excellent reviews in the past [10-12]. First, I discuss the operation of a basic dichrometer for the measurement of CD. I also describe how a few easily implemented and, in many cases, commercially available extensions permit the measurement of complementary parameters that extend the usefulness of the basic instrument. Such instruments are made possible by: the ability of photomultipliers to detect light over very great ranges of intensities by changing the applied high voltage; the ability of phase-sensitive detection to recover tiny electronic signals of known frequency and phase in the presence of much higher levels of incoherent noise; the ability of a photoelastic modulator (PEM) to alter the polarization of a light beam without changing its intensity; and the ability of computer systems to record multiple data streams. In a later section, I describe practical aspects of the components required to build three distinct classes of such instruments, stressing how the properties of one component can influence the choice or performance of other components and the resulting instrument. CD and absorption spectra of a protein in an aqueous environment and in dehydrated films are presented that demonstrate the capabilities made possible by synchrotron radiation and the limitations in spectral range caused by water and other solutes such as NaCl.
2. Theory of Measurements
2.1 The DichroFluoroSpectroPhotometer: A Conceptual Multifunction Single Beam Spectrometer
Figure 1 is a schematic diagram of the optical components of a basic CD spectrometer modified by the addition of a second detector to record light emitted from the sample. Broad spectrum or “white” light from a suitable source passes through a monochromator, a linear polarizer (if necessary) and a polarization modulator. The beam then passes through the sample before reaching a detector, which is usually a photomultiplier for visible and UV light. I shall suppose that nearly monochromatic incident light of wavelength λ is traveling in the positive direction along the x axis and that the x and y axes define a horizontal plane. The positive z axis is defined as the vertical direction and the coordinate system is presumed to be right handed. The stress axis of the modulator is oriented in the vertical plane midway between the y and z axes. The terms “horizontal” and “vertical” are appropriate for most practical instruments, but do not have to be taken literally. If fluorescence (or more generally luminescence) is to be observed, it will be for light traveling away from the sample parallel to the y axis, as shown in Figure 1. For the measurement of CD, the polarization of the incident light could be in any direction in the y-z plane. In practice, many dichrometers and fluorometers are designed such that the direction of polarization is vertical, i.e. parallel to the z axis, and most previous descriptions of such instruments have assumed this orientation. However, horizontal incident polarization is a natural choice for synchrotron radiation based dichrometers and is also encountered in instruments from some commercial vendors. Thus, I will analyze the operation of the conceptual spectrometer assuming horizontal polarization and present results appropriate for vertical polarization only when they differ.
[Figure 1 diagram. Labelled components: Light Source, Monochromator, Polarizer, Photoelastic Modulator, Sample Holder on the Sample Platform, Transmission Detector and Emission Detector; incident light hν; unit vectors iˆ, ˆj and kˆ.]
Figure 1. Schematic pseudo-perspective plan view showing the essential components of a single beam dichrometer for the measurement of CD and other parameters. The transmission detector can either be moved to the emission position or a second detector provided for that role, typically with enhanced long wavelength sensitivity. The sample holder is usually mounted on a platform that is kinematically located to permit rapid interchange of various configurations, e.g., for cylindrical, rectangular or multiple sample cells. The platform and the components mounted upon it can transform the basic spectrometer to measure several different optical parameters. The unit vectors iˆ, ˆj and kˆ indicate the orientation of the x, y and z axes. In some instruments, a lens or mirror focuses the light emerging from the exit slit of the monochromator at the center of the sample.
2.2 Characterization of the Optical Beam
Stokes vectors and Mueller matrices are a convenient way of representing the intensities of the light moving through the system and the effects of various optical elements. As noted above, we assume that the light incident on the PEM is polarized in the horizontal plane and moving in the positive x direction and that the stress axis of the PEM makes an angle of π/4 with respect to that plane. At a particular instant, the PEM generates a phase shift or “retardation” of δ radians between the components polarized parallel and perpendicular to its stress axis. Figure 2 shows the Stokes vectors representing the light incident on the PEM and emerging from it, along with the Mueller matrix describing the PEM.¹ Stokes vectors only record the differences in the intensities of the horizontally and vertically polarized intensities (in the second element) and the left and right circularly polarized intensities (in the fourth element). The sums of these quantities are recorded in the first element. Shindo [13] has provided a description of polarization-modulation instruments based entirely on Mueller matrices and Stokes vectors. The discussion provided below does not use this canonical approach, but focuses on the individual quantities of interest.
1 In the mathematical expressions used in this chapter, function arguments are enclosed in square brackets when it is necessary to delineate them explicitly, vectors are denoted by an arrow above the symbol, average values are denoted by a bar above the symbol and parentheses are reserved for grouping expressions or for enclosing the explicit elements of a vector or matrix. All angles are in radians unless otherwise indicated. In expressions with dual subscripts and arithmetic signs, the first subscript corresponds to the upper sign.
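\[
\vec{I}_H = I_0 \begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \qquad
M_{PEM} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\delta & 0 & \sin\delta \\ 0 & 0 & 1 & 0 \\ 0 & -\sin\delta & 0 & \cos\delta \end{pmatrix}, \qquad
\vec{I}_{PEM} = M_{PEM}\,\vec{I}_H = I_0 \begin{pmatrix} 1 \\ \cos\delta \\ 0 \\ -\sin\delta \end{pmatrix}
\]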
Figure 2. Stokes vector representing the horizontally polarized light incident on the PEM, $\vec{I}_H$, the Mueller matrix describing the PEM, $M_{PEM}$ [14], and the Stokes vector for the light emerging from the PEM and incident on the sample, $\vec{I}_{PEM}$. Note that the intensities are wavelength dependent. The phase shift induced by the modulator varies as a function of time and thus can be written δ[t]. As expected, substituting the values –π, –π/2, 0, π/2 and π into the expression for $\vec{I}_{PEM}$ returns the Stokes vectors for vertical, left circular, horizontal, right circular and vertical polarization, respectively.
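The substitutions listed in the caption of Figure 2 can be checked numerically. The following is a minimal sketch in plain Python, not part of any instrument software; the function names `pem_mueller`, `apply` and `stokes_after_pem` are illustrative assumptions, and the matrix is simply the one given in Figure 2.

```python
import math

def pem_mueller(delta):
    """Mueller matrix of the PEM as written in Figure 2 (stress axis at pi/4)."""
    c, s = math.cos(delta), math.sin(delta)
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, c, 0.0, s],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, -s, 0.0, c],
    ]

def apply(matrix, stokes):
    """Multiply a 4x4 Mueller matrix by a 4-element Stokes vector."""
    return [sum(m * s for m, s in zip(row, stokes)) for row in matrix]

I0 = 1.0
HORIZONTAL = [I0, I0, 0.0, 0.0]  # Stokes vector of the horizontally polarized input

def stokes_after_pem(delta):
    """Stokes vector of the beam leaving the PEM for retardation delta."""
    return apply(pem_mueller(delta), HORIZONTAL)

# Tabulate the five retardations named in the caption of Figure 2.
for label, delta in [("-pi", -math.pi), ("-pi/2", -math.pi / 2), ("0", 0.0),
                     ("pi/2", math.pi / 2), ("pi", math.pi)]:
    vector = [round(x, 12) + 0.0 for x in stokes_after_pem(delta)]
    print(f"delta = {label:>5}: {vector}")
```

In this convention the second element is +1 for horizontal and -1 for vertical light, and the fourth element equals (IL - IR)/I0, so it is +1 for left circular light.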
To determine the effects of dichroic absorptions, the actual intensities of these components are needed. In the case of CD, we know from the Stokes vector that IR + IL = I0 and IR – IL = I0 sin δ, which yield the values for IL and IR shown in Eqn. (1). If the incident light is vertically polarized, the signs in Eqn. (1) are reversed. Similar reasoning leads to the expressions for IH and IV shown in the same equation. CD and LD are usually defined as the differences between the corresponding decadic absorptions, e.g., ΔACD = AL – AR, but in derivations it is convenient to use the corresponding Eulerian absorptions, aL/R. Definitions of these parameters in terms of the incident and transmitted intensities and their relationship to each other are shown in Eqn. (2). For both the linear and circular terms, ā is the average of the two components and Δa is their difference, as shown in Eqn. (3). Similar relationships exist for the corresponding decadic absorptions, Ā and ΔA.
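\[ I_{L/R} = \frac{I_0}{2}\,(1 \mp \sin\delta) \quad\text{and}\quad I_{H/V} = \frac{I_0}{2}\,(1 \pm \cos\delta) \tag{1} \]
\[ I = I_0\,10^{-A} = I_0\,e^{-a} \;\Rightarrow\; a = A \log_e 10 \approx 2.303\,A \tag{2} \]
\[ a_{L/R} = \bar{a} \pm \frac{\Delta a_{CD}}{2} \quad\text{and}\quad a_{H/V} = \bar{a} \pm \frac{\Delta a_{LD}}{2} \tag{3} \]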
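Eqns. (1) to (3) translate directly into a few helper functions. The sketch below assumes horizontally polarized light incident on the PEM; the function names and the numerical values in the example (a mean decadic absorption of 1.0 and a CD of 1e-4) are hypothetical and chosen only for illustration.

```python
import math

LN10 = math.log(10.0)

def eulerian_from_decadic(A):
    """Eqn. (2): a = A * ln(10), approximately 2.303 * A."""
    return A * LN10

def circular_intensities(I0, delta):
    """Eqn. (1), horizontal incident polarization: returns (I_L, I_R)."""
    return 0.5 * I0 * (1.0 - math.sin(delta)), 0.5 * I0 * (1.0 + math.sin(delta))

def linear_intensities(I0, delta):
    """Eqn. (1), horizontal incident polarization: returns (I_H, I_V)."""
    return 0.5 * I0 * (1.0 + math.cos(delta)), 0.5 * I0 * (1.0 - math.cos(delta))

def split_absorption(a_mean, delta_a):
    """Eqn. (3): returns the (upper-sign, lower-sign) components, e.g. (a_L, a_R)."""
    return a_mean + 0.5 * delta_a, a_mean - 0.5 * delta_a

# Hypothetical sample: mean decadic absorption 1.0, decadic CD of 1e-4.
a_mean = eulerian_from_decadic(1.0)
delta_a_cd = eulerian_from_decadic(1.0e-4)
a_left, a_right = split_absorption(a_mean, delta_a_cd)
print("a_L, a_R =", a_left, a_right)
print("I_L, I_R at delta = pi/2:", circular_intensities(1.0, math.pi / 2))
```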
The phase difference produced by a PEM is a sinusoidal function of time, i.e., δ = δ0 sin ωt, where δ0 is the amplitude of the phase shift and ω is the angular frequency of the PEM. (Note that ω = 2π f, where f is the frequency of the PEM, typically 50 kHz.) In evaluating the intensities emerging from the sample, it is useful to expand the sine and cosine functions in terms of Bessel functions of the first kind [15], as shown in Eqn. (4).
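\[
\begin{aligned}
\cos[\delta_0 \sin\omega t] &= J_0[\delta_0] + 2 J_2[\delta_0]\cos 2\omega t + 2 J_4[\delta_0]\cos 4\omega t + \dots \\
\sin[\delta_0 \sin\omega t] &= 2 J_1[\delta_0]\sin\omega t + 2 J_3[\delta_0]\sin 3\omega t + \dots
\end{aligned}
\tag{4}
\]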
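The expansion in Eqn. (4) can be verified numerically. The sketch below evaluates the Bessel functions from their standard power series, which is an implementation choice made here to stay within the Python standard library; the helper names are illustrative, and the truncation orders are arbitrary but ample for the phase amplitudes discussed in this chapter.

```python
import math

def bessel_j(n, x, terms=30):
    """Bessel function of the first kind J_n[x], integer n >= 0, from its power series."""
    half = 0.5 * x
    return sum((-1) ** k * half ** (2 * k + n)
               / (math.factorial(k) * math.factorial(n + k))
               for k in range(terms))

def cos_series(delta0, wt, max_order=12):
    """Right-hand side of the cosine line of Eqn. (4), truncated at max_order."""
    total = bessel_j(0, delta0)
    for n in range(2, max_order + 1, 2):
        total += 2.0 * bessel_j(n, delta0) * math.cos(n * wt)
    return total

def sin_series(delta0, wt, max_order=11):
    """Right-hand side of the sine line of Eqn. (4), truncated at max_order."""
    return sum(2.0 * bessel_j(n, delta0) * math.sin(n * wt)
               for n in range(1, max_order + 1, 2))

def max_expansion_error(delta0, points=200):
    """Largest difference between both sides of Eqn. (4) over one modulator period."""
    worst = 0.0
    for k in range(points):
        wt = 2.0 * math.pi * k / points
        delta = delta0 * math.sin(wt)
        worst = max(worst,
                    abs(math.cos(delta) - cos_series(delta0, wt)),
                    abs(math.sin(delta) - sin_series(delta0, wt)))
    return worst

print("largest error for delta0 = pi/2:", max_expansion_error(math.pi / 2))
```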
2.3 Transmission Experiments
First, we consider experiments in which the light passes through the sample and is incident on the transmission detector. There are two classes of such experiments. One class detects the absorption of light. The other detects differences in the refractive index. All transmission detected experiments consist of pairs of experiments, one from each class. Each pair of experiments is coupled by Kramers-Kronig (KK) integral transforms [16] that permit the calculation of one parameter at each wavelength provided that the conjugate parameter is known for all wavelengths. In practice, this condition is not as restrictive as it might appear. With the exception of Optical Rotary Dispersion, we consider only the absorption experiments, but will indicate the corresponding KK conjugate.
2.3.1 Circular Dichroism (CD)
CD is the difference in the absorption of left- and right-circularly polarized light due to inherent asymmetry of the absorbing species, the “chromophore”. The KK transform of CD is optical rotation, which is discussed in section 2.3.4. The basic design of most CD spectrometers has changed little since the pioneering work of Grosjean and Legrand [17], although there have been numerous improvements in the individual components. When using a PEM, the total intensity of light leaving the sample and arriving at the transmission detector is given in Eqn. (5), where the two circularly polarized components are attenuated by their respective Eulerian transmission functions factored into average and differential absorptions, and the phase shift term is replaced by the lowest frequency term in the Bessel function expansion. In arriving at the expression shown on the right hand side of Eqn. (5), each exponential term containing ΔaCD is expanded in a Taylor series and only the lowest order term retained. This is an excellent approximation for small absorption differences (ΔaCD < 0.1) because the even order terms in the series, 1 and (ΔaCD)², are eliminated by cancellation from the corresponding terms in the other exponential, which appears with the opposite sign. For vertically polarized light incident on the PEM, the sign appearing in Eqn. (5) is negative. Note that if either the CD is zero or if we average over many cycles of the modulator, the intensity reaching the detector is just what is expected for unpolarized absorption.
\[ I[t] = I_L\,e^{-a_L} + I_R\,e^{-a_R} \;\Rightarrow\; I_0\,e^{-\bar{a}}\left(1 + \Delta a_{CD}\,J_1[\delta_0]\sin\omega t + \dots\right) \tag{5} \]
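How good the first-order approximation in Eqn. (5) is can be estimated by comparing it with the exact intensity built from Eqns. (1) and (3). The sketch below projects the exact, normalized intensity onto sin ωt over one modulator period and compares the result with ΔaCD J1[δ0]. The function names, the sampling density and the two test values of ΔaCD are assumptions made for illustration.

```python
import math

def bessel_j(n, x, terms=30):
    """Bessel function of the first kind J_n[x], integer n >= 0, from its power series."""
    half = 0.5 * x
    return sum((-1) ** k * half ** (2 * k + n)
               / (math.factorial(k) * math.factorial(n + k))
               for k in range(terms))

def normalized_intensity(delta_a, delta0, wt):
    """Exact I[t] / (I0 exp(-a_mean)) from Eqns. (1) and (3), horizontal input."""
    delta = delta0 * math.sin(wt)
    i_left = 0.5 * (1.0 - math.sin(delta))
    i_right = 0.5 * (1.0 + math.sin(delta))
    # a_L = a_mean + delta_a/2 and a_R = a_mean - delta_a/2; exp(-a_mean) is factored out.
    return i_left * math.exp(-0.5 * delta_a) + i_right * math.exp(0.5 * delta_a)

def fundamental_amplitude(delta_a, delta0, samples=4096):
    """Amplitude of the sin(wt) component, by projection over one modulator period."""
    total = 0.0
    for k in range(samples):
        wt = 2.0 * math.pi * k / samples
        total += normalized_intensity(delta_a, delta0, wt) * math.sin(wt)
    return 2.0 * total / samples

delta0 = math.pi / 2
for delta_a in (1.0e-3, 1.0e-1):
    exact = fundamental_amplitude(delta_a, delta0)
    first_order = delta_a * bessel_j(1, delta0)
    print(f"delta_a = {delta_a:g}: exact / first order = {exact / first_order:.6f}")
```

The ratio stays very close to one even at ΔaCD = 0.1, in line with the statement that only the lowest order term needs to be retained.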
From the experimental perspective, the intensities of light arriving at the transmission detector generate a photocurrent that is converted to a time varying voltage, v[t], that can be described as the sum of a time constant voltage, v0, plus components modulated at the frequency of the PEM and each of its harmonics, as shown in Eqn. (6). For each of the frequency modulated terms we specify both an
amplitude (in volts) and a phase, ψ. Note that these phases refer to processing of electronic signals by the lock-in amplifier, and are not related to the phases used to describe the optical intensities modulated by the PEM.
\[ v[t] = v_0 + v_\omega \sin[\omega t + \psi_\omega] + v_{2\omega}\sin[2\omega t + \psi_{2\omega}] + v_{3\omega}\sin[3\omega t + \psi_{3\omega}] + \dots \tag{6} \]
A typical electronic circuit used to analyze the signals from the detector in all of the experiments described below is shown in Figure 3. A synchronous detector, also called a “lock-in amplifier”, uses a reference signal supplied by the PEM control circuit to measure the amplitude of the fundamental, vω, or the first harmonic, v2ω, for specified values of the corresponding phase. (The outputs from many lock-in amplifiers are proportional to the root-mean-square of the modulated signal, but here I shall assume that the true amplitude of the modulated signal is returned or subsequently calculated.) Comparing the ratio of the terms modulated at angular frequency ω to the corresponding time invariant terms gives the ratio shown in Eqn. (7), where the factor of –1 reflects choosing ψω= π, which gives the same result obtained for vertically polarized light incident on the PEM and ψω= 0. Solving for the CD and converting to decadic absorption results in the expression shown on the right hand side of Eqn. (7). For historical reasons, CD is also reported, particularly in the biochemical literature, as “ellipticity”, θ, which is directly proportional to ΔACD.
\[ \frac{v_\omega}{v_0} = -\Delta a_{CD}\,J_1[\delta_0] \;\Rightarrow\; \Delta A_{CD} \equiv A_L - A_R = \frac{-1}{\ln[10]\,J_1[\delta_0]}\,\frac{v_\omega}{v_0} \tag{7} \]
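The chain from detector voltage to ΔACD can be illustrated with a simulated record. The sketch below is a software stand-in for a lock-in amplifier, not a description of any particular instrument: it generates a hypothetical voltage with v0 = 1 V, a modulated term following Eqn. (5) and Gaussian noise, demodulates it at the reference phase ψω = π, and applies Eqn. (7). All names, the sampling parameters and the noise level are assumptions.

```python
import math
import random

LN10 = math.log(10.0)

def bessel_j(n, x, terms=30):
    """Bessel function of the first kind J_n[x], integer n >= 0, from its power series."""
    half = 0.5 * x
    return sum((-1) ** k * half ** (2 * k + n)
               / (math.factorial(k) * math.factorial(n + k))
               for k in range(terms))

def synthetic_voltage(delta_A, delta0, samples_per_period, periods, noise_rms, seed=0):
    """Hypothetical detector record: v0 = 1 V, Eqn. (5) modulation, Gaussian noise."""
    rng = random.Random(seed)
    amplitude = delta_A * LN10 * bessel_j(1, delta0)
    record = []
    for k in range(samples_per_period * periods):
        wt = 2.0 * math.pi * k / samples_per_period
        record.append(1.0 + amplitude * math.sin(wt) + rng.gauss(0.0, noise_rms))
    return record

def lock_in(record, harmonic, phase, samples_per_period):
    """Return (v0, v_n): the mean voltage and the true amplitude (not the rms value)
    of the component sin[n w t + phase]; the record must span whole periods."""
    count = len(record)
    v0 = sum(record) / count
    acc = 0.0
    for k, v in enumerate(record):
        wt = 2.0 * math.pi * k / samples_per_period
        acc += v * math.sin(harmonic * wt + phase)
    return v0, 2.0 * acc / count

def cd_from_lock_in(v_w, v0, delta0):
    """Eqn. (7): decadic CD from the ratio v_w / v0 measured with psi = pi."""
    return -v_w / (v0 * LN10 * bessel_j(1, delta0))

record = synthetic_voltage(1.0e-3, math.pi / 2, 32, 2000, noise_rms=0.005)
v0, v_w = lock_in(record, 1, math.pi, 32)
print("recovered CD:", cd_from_lock_in(v_w, v0, math.pi / 2))
```

In this simulation the noise on each sample is several times larger than the modulated amplitude, yet averaging over 2,000 modulator periods recovers the CD to within a few percent, which is the essential benefit of phase-sensitive detection.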
Measurement of CD is thus a ratio experiment, but unlike unpolarized absorption, the ratio involves two quantities of the beam that has passed through only a single sample. Statements to the effect that the CD is obtained from the vω/v0 ratio for δ0 = π/2 are found frequently in both product literature and scientific publications. This choice of PEM phase, while sufficient, is not a necessary condition for extracting the CD, as demonstrated by Eqn. (7) and previously mentioned by Kemp [18] and others. In principle, any value of δ0 other than those that cause J1[δ0] to equal zero is acceptable. However, the calibration constant will change accordingly, as indicated by the values of J1[δ0] plotted in Figure 4. The maximum experimental signal (value of vω), and hence the minimum calibration constant, are found for the maximum in J1 at δ0 = 0.586 π (106°) for which J1 = 0.582, whereas J1[π/2] = 0.567. This flexibility in the choice of the PEM phase is useful in designing experiments for the simultaneous extraction of multiple parameters from a single experiment. Traditionally, CD instruments have been calibrated empirically using a transfer standard such as d-10-camphorsulfonic acid. This avoids the necessity of actually determining the value of δ0 or the absolute gain of the lock-in amplifier.
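The dependence of the calibration constant on δ0 can be made concrete by locating the maximum of J1 numerically. The sketch below uses the standard identity dJ1/dx = (J0 - J2)/2 and a simple bisection; the function names and the search bracket of 1 to 3 radians are illustrative choices, and the calibration constant printed is the factor 1/(ln[10] J1[δ0]) from Eqn. (7).

```python
import math

def bessel_j(n, x, terms=30):
    """Bessel function of the first kind J_n[x], integer n >= 0, from its power series."""
    half = 0.5 * x
    return sum((-1) ** k * half ** (2 * k + n)
               / (math.factorial(k) * math.factorial(n + k))
               for k in range(terms))

def j1_slope(x):
    """Derivative of J1, using dJ1/dx = (J0[x] - J2[x]) / 2."""
    return 0.5 * (bessel_j(0, x) - bessel_j(2, x))

def bisect(f, lo, hi, tol=1e-12):
    """Root of f between lo and hi, assuming f changes sign exactly once there."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def calibration_constant(delta0):
    """Factor multiplying -v_w / v0 in Eqn. (7)."""
    return 1.0 / (math.log(10.0) * bessel_j(1, delta0))

delta_max = bisect(j1_slope, 1.0, 3.0)
for name, d in (("pi/2", math.pi / 2), ("J1 maximum", delta_max)):
    print(f"{name}: delta0 = {d:.4f} rad, J1 = {bessel_j(1, d):.4f}, "
          f"calibration constant = {calibration_constant(d):.4f}")
```

The two settings differ by less than three percent in J1, so the choice between them can be made on other grounds, such as the simultaneous measurements discussed later in this section.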
[Figure 3 diagram. Labelled blocks: PEM head, PEM Control (δ0 control, reference at ω), PM with i/v converter, HV supply (gain 100), Comparator, Lock-in (sig and ref inputs, ADC inputs, DAC output, GPIB), Stored Beam Current; signals v[t], v0 and vnω.]
Figure 3. Schematic diagram of an electronic circuit used to extract the parameters required to measure CD and the other experiments performed with a PEM. The configuration shown here is similar to the U11 beamline at the National Synchrotron Light Source at Brookhaven National Laboratory. Light passing through the PEM and a sample is incident on the photocathode of a photomultiplier (PM). The instantaneous light intensity is transformed to a current by the PM. A current-to-voltage converter (i/v) inside the PM housing generates a voltage proportional to the light intensity. For experiments using a PEM operating at a frequency of 50 kHz, the roll-off frequency of the i/v converter should be greater than 200 kHz. The output from the i/v goes to both the ac coupled signal input of a lock-in amplifier and also to a custom dc coupled voltage comparator circuit. The comparator keeps the time average voltage, v0, from the PM essentially constant by generating an electrical signal that controls the high voltage, V, which energizes the PM. This power supply functions like an amplifier with a gain of 100, i.e., an input of 10 V yields an output of 1,000 V. A reference signal from the PEM control permits the lock-in to recover signals at either the frequency of the PEM or at twice that frequency. The output of the lock-in is displayed in both analog and digital form on the front of the instrument and transferred across a general purpose interface bus (GPIB) to the end-station computer. The lock-in has auxiliary analog-to-digital (ADC) inputs that monitor the voltage applied to the PM and the current present in the storage ring. This information is used to compute pseudo-absorption, which permits simultaneous measurement of CD and absorption, as discussed in section 2.3.5. 
Knowledge of the stored current, a surrogate for I0, or some other indication of this parameter, is necessary for measurement of pseudo absorption in a synchrotron-based instrument because the current circulating in most electron storage rings decreases slowly between periodic injections. An ADC could also be used to measure the time average current from the PM to ensure that it remains constant. The lock-in also has digital-to-analog outputs that we use to set the amplitude of the phase retardation produced by the PEM, although modern PEMs can also be controlled directly through a GPIB or a serial interface.
2.3.1.1 Magnetic Circular Dichroism (MCD)
Adding a magnetic field parallel (or anti-parallel) to the direction of the light beam as it passes through the sample can produce an additional CD signal. Thus MCD is the simplest “add-on” capability from the viewpoint of the analysis of the optical system, being identical to that for CD given above. It also increases the usefulness of the spectrometer, as it provides information entirely complementary to the inherent or “natural” CD. Ironically, MCD as usually implemented may also be the most disruptive add-on capability in terms of the design and operation of the spectrometer. However, recent developments in permanent magnet technology, discussed in section 3.13, may facilitate the wider use of MCD and thus increase the versatility of dichrometers. The Kramers-Kronig transform of MCD is the Faraday rotation spectrum.
[Figure 4 plot. Curves J0, J1 and J2 versus δ0 from 0 to 2π radians (lower axis) and 0° to 360° (upper axis); vertical axis from −1/3 to 1; asterisks mark the phases labelled δ0*, rH* and rV*.]
Figure 4. Values of the three lowest order Bessel functions of the first kind as a function of their dimensionless argument. As applied to a photoelastic modulator, the argument is the maximum phase retardation, δ0, expressed in radians. The corresponding phase angles expressed in degrees are shown on the upper horizontal axis. The “magic phase” for transmission experiments is the phase that causes the lowest order Bessel function, J0, to equal zero. The two magic phases that apply to fluorescence polarization anisotropy are the values that cause J0 to equal ± ⅓ for light incident on the PEM polarized vertically or horizontally, respectively. The amplitude of the light intensity modulated at the frequency of the PEM is proportional to J1, while that at twice the modulator’s frequency is proportional to J2.
The observed CD is the sum of the inherent or natural CD, ΔACD, (if any) and the magnetic field induced CD, ΔAMCD. As indicated in Eqn. (8), reversing the sign of the field, reverses the contribution of the MCD to the observed signal. The MCD is linearly proportional to the intensity of the applied magnetic field, H, which in the SI system is reported in tesla (T). Obtaining the MCD spectrum, ΔAMCD, in the presence of a non-zero CD “base line” signal requires recording the spectrum of the sample twice with different magnetic fields applied, typically either H and 0 or H and –H. If the baseline of the instrument is sensitive to the field, obtaining an MCD spectrum requires the linear summation of four values at each wavelength. As with absorption and natural CD, the MCD can be expressed as the product of parameters that separate the contributions of different effects to the observed signals. As shown in Eqn. (9), this involves the product of the concentration of absorber, c, pathlength, l, magnetic field, H, and the differential extinction coefficient, ΔεMCD, which expresses the magnitude of the response to the external magnetic field as a function of wavelength. (An alternate approach is to include H in Eqn. (8) instead of Eqn. (9), but the meaning and units of ΔεMCD remain unchanged.) MCD is complementary to natural CD in that all absorbers, including nonchiral absorbers, can give signals. Indeed, some of the strongest signals are from highly symmetric molecules, which exhibit no natural CD. MCD is used most frequently in detailed studies of the electronic structure of small molecules [19]. Its primary uses in molecular biophysics are in specialized studies of metalloproteins,
mostly in the visible and near infrared regions where the metal and its ligands absorb. MCD is also useful for detecting multiple transitions within a single absorption envelope, which can facilitate the interpretation of CD, unpolarized absorption and other spectral measurements, e.g., [20, 21]. It can also detect and quantify certain species with large MCD signatures in the presence of other absorbers with lower MCD signals; tryptophan is a prominent example [22-24]. MCD data were the first type of CD data to be recorded with synchrotron radiation [25].
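\[ \Delta A_{+H} = \Delta A_{CD} + \Delta A_{MCD}, \qquad \Delta A_{-H} = \Delta A_{CD} - \Delta A_{MCD} \tag{8} \]
\[ \Delta A_{MCD} = \Delta\varepsilon_{MCD}\; c\; l\; H \tag{9} \]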
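Eqns. (8) and (9) imply a simple recipe for separating natural CD from MCD when spectra are recorded at +H and –H. The sketch below assumes spectra stored as equal-length lists sampled at the same wavelengths; the function names and the numbers in the example are hypothetical, and the four-spectrum variant is one plausible reading of the "linear summation of four values" needed when the instrument baseline depends on the field.

```python
def separate_cd_and_mcd(dA_plus_H, dA_minus_H):
    """Eqn. (8): half-sum gives the natural CD, half-difference gives the MCD."""
    cd = [0.5 * (p + m) for p, m in zip(dA_plus_H, dA_minus_H)]
    mcd = [0.5 * (p - m) for p, m in zip(dA_plus_H, dA_minus_H)]
    return cd, mcd

def mcd_with_baseline(sample_plus, sample_minus, base_plus, base_minus):
    """MCD from four spectra when the baseline itself depends on the field."""
    return [0.5 * ((sp - bp) - (sm - bm))
            for sp, sm, bp, bm in zip(sample_plus, sample_minus, base_plus, base_minus)]

def delta_epsilon_mcd(dA_mcd, concentration, pathlength, field):
    """Eqn. (9) solved for the differential extinction coefficient."""
    return [x / (concentration * pathlength * field) for x in dA_mcd]

# Hypothetical three-point spectra (decadic absorption differences).
plus_H = [3.0e-4, 1.0e-4, -2.0e-4]
minus_H = [1.0e-4, 1.0e-4, 4.0e-4]
cd, mcd = separate_cd_and_mcd(plus_H, minus_H)
print("natural CD:", cd)
print("MCD:", mcd)
# Hypothetical sample: 1e-4 M, 1 cm path, 1 T field.
print("delta epsilon:", delta_epsilon_mcd(mcd, 1.0e-4, 1.0, 1.0))
```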
2.3.2 Linear Dichroism (LD)
LD is defined as the difference in the absorption of a sample measured with light polarized parallel to an experimentally defined direction minus the absorption of light polarized perpendicular to that direction. The difference can be either inherent in the structure of the material or induced by some external agent. For the spectrometer shown in Figure 1, the preferred directions would be horizontal and vertical. A potentially powerful but underused spectral measurement, LD is particularly valuable for obtaining information on the relative orientations of molecules in macromolecular complexes. Methods for aligning molecules in solution and the interpretation of LD data are discussed in detail elsewhere in this volume [26]. The recent commercial availability of Couette cells that orient macromolecular complexes using shear flow [27-29] should result in greatly increased use of LD in molecular biophysics. The configuration of the sample platform of the basic dichrometer configured to measure LD of samples aligned by flow in a Couette cell is shown in Figure 5. Other methods for aligning molecules in solution are mentioned in the section on practical components and instruments. Birefringence, either inherent, as encountered in the crystals used to make polarizers, or induced, as encountered in a photoelastic modulator or in filamentous proteins or nucleic acids oriented by shear, is the KK transform of LD. The expression for the time dependent total intensity for light transmitted through a sample exhibiting LD is shown in Eqn. (10). The derivation proceeds as for CD, except that the expressions for the time varying components of the linear beam shown in Eqn. (1) require that we use the Bessel function expansion for cos δ shown in Eqn. (4), resulting in the expression on the right hand side of Eqn. (10).
\[ I[t] = I_H\,e^{-a_H} + I_V\,e^{-a_V} \;\Rightarrow\; I_0\,e^{-\bar{a}}\left(1 - \frac{\Delta a_{LD}}{2}\,J_0[\delta_0] - \Delta a_{LD}\,J_2[\delta_0]\cos 2\omega t + \dots\right) \tag{10} \]
[Figure 5 diagram. Labelled components: Polarizer, Photoelastic Modulator, Flow LD Sample Platform and Transmission Detector.]
Figure 5. Configuration of the sample platform of a dichrometer required for the measurement of LD with a Couette cell. The sample containing solution is located between an inner rod and outer cylinder, one of which rotates. The rotation produces a linear shear gradient across the fluid in the gap, which is represented as lightly shaded. The gap is typically much smaller in relation to the other components than represented here. Additional lenses may be required to insure that the transmitted beam goes through the sample in the Couette cell, especially for instruments in which an image of the exit slit of the monochromator is not formed at the position of the sample. Practical designs have focused on reducing the size of the cell, thereby reducing the quantity of sample required for a measurement.
Statements to the effect that the LD is obtained from the v2ω/v0 ratio for δ0 = π are found frequently in both product literature and scientific publications. This is not rigorously true because the expression proportional to v0 is a sum that contains a minor term that is proportional to ΔaLD. This may not cause significant errors because the effect is small in most cases. However, LD absorption differences can be much bigger than those observed for CD. Fortunately, a tidy solution is available. If we choose a value of δ0 such that J0[δ0] = 0, the extra time-invariant term vanishes and the decadic LD is given by the expression shown in Eqn. (11). By analogy to the situation for fluorescence polarization anisotropy, discussed in section 2.4.1, I call the smallest phase amplitude that causes J0 to vanish the “magic phase” for transmission and represent it with the symbol δ0*; its value is approximately 0.765 π radians (138°) (see Figure 4) and the corresponding amplitude is J2[δ0*] = 0.432. Methods for calibrating the amplitude of the phase shift produced by the PEM to achieve these values are described in section 3.6.1. Use of this PEM phase amplitude to eliminate J0 terms has been employed previously in experiments such as the measurement of strain-induced retardations in which the additional static term cannot be ignored.
\[ \Delta A_{LD} = \frac{-1}{\ln[10]\,J_2[\delta_0^*]}\,\frac{v_{2\omega}}{v_0} \approx -\frac{v_{2\omega}}{v_0} \tag{11} \]
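The magic phase and the size of the error it removes can both be estimated numerically. The sketch below finds the first zero of J0 by bisection, evaluates J2 there, and uses Eqn. (10) to estimate the fractional error made when the v2ω/v0 ratio is used at δ0 = π without correcting the steady term. The function names, the search bracket and the example value ΔALD = 0.1 are assumptions chosen for illustration.

```python
import math

LN10 = math.log(10.0)

def bessel_j(n, x, terms=30):
    """Bessel function of the first kind J_n[x], integer n >= 0, from its power series."""
    half = 0.5 * x
    return sum((-1) ** k * half ** (2 * k + n)
               / (math.factorial(k) * math.factorial(n + k))
               for k in range(terms))

def bisect(f, lo, hi, tol=1e-12):
    """Root of f between lo and hi, assuming f changes sign exactly once there."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ld_ratio_bias(delta_A_ld, delta0):
    """Fractional error in LD when v_2w / v0 is used without correcting v0.
    From Eqn. (10), v0 is proportional to 1 - (delta_a / 2) * J0[delta0]."""
    delta_a = delta_A_ld * LN10
    return 1.0 / (1.0 - 0.5 * delta_a * bessel_j(0, delta0)) - 1.0

delta_magic = bisect(lambda x: bessel_j(0, x), 2.0, 3.0)
j2_magic = bessel_j(2, delta_magic)
print(f"magic phase = {delta_magic / math.pi:.4f} pi, J2 = {j2_magic:.4f}")
print(f"ln[10] * J2 at the magic phase = {LN10 * j2_magic:.4f}")
print(f"bias at delta0 = pi for LD of 0.1: {ld_ratio_bias(0.1, math.pi):+.4f}")
print(f"bias at the magic phase: {ld_ratio_bias(0.1, delta_magic):+.2e}")
```

The product ln[10] J2[δ0*] is within one percent of unity, which is why the right-hand approximation in Eqn. (11) holds, while at δ0 = π a decadic LD of 0.1 would be misestimated by a few percent.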
2.3.3 LD Signals can Distort a CD Spectrum
The preceding sections indicate that the CD signal appears at frequency ω while the LD signal appears at frequency 2ω, so the two can be separated by a frequency-specific device such as a lock-in amplifier. However, it is observed in experiments that the CD spectrum of a sample exhibiting LD can be distorted by some of the 2ω signal “leaking” into the ω channel. The reverse is usually less of a problem because of the typically lower amplitude of the CD signal. Thus, it is usually desirable to avoid measuring the CD of a sample if it exhibits LD. A possible cause of the CD-LD coupling is discussed in section 3.6.2.
2.3.4 Optical Rotation and Optical Rotary Dispersion (ORD)
Optical rotation measures the rotation of the plane of polarization of a light beam by a sample resulting from the differences in the refractive indices of left and right circularly polarized light. Optical rotary dispersion (ORD) is optical rotation presented as a function of wavelength or photon energy. ORD is thus the Kramers-Kronig transform of CD; the two types of measurements, in principle, contain the same information. ORD, and its predecessor, optical rotation, were the original chiral spectroscopies, as instruments to make such measurements were available before the advent of those to measure CD. However, ORD spectra are usually more difficult to interpret than the corresponding CD and thus relatively few ORD data are reported in the current literature. Nevertheless, it may be necessary to measure ORD for comparisons with older data and for quality control purposes when legacy standards are expressed in terms of rotations. Earlier generations of ORD instruments were quite different from modern dichrometers and ORD accessories using these approaches are available from some vendors.
Fortunately, it is rather simple to modify a basic dichrometer of the type depicted in Figure 1 to measure ORD by placing a polarizer, sometimes called an analyzer, between the sample and the detector [1, 30], as illustrated in Figure 6. The transmission axis of the analyzer is adjusted to be parallel (or perpendicular) to the stress axis of the PEM, i.e., at an angle of π/4 from the horizontal for the configuration shown in Figure 1. Oakberg described an alternate configuration in which the sample is placed between the first polarizer and the PEM [31]. Here, I will focus on the configuration described above, which requires the least rearrangement of the basic dichrometer. The beam leaving the PEM will be described as the sum of vertically and horizontally polarized components, IV and IH as represented in Eqn. (1), just as we did for LD. The sample may have two effects on this beam. The intensity may be attenuated by absorption. As for CD, the attenuation of the linear components is presumed equal, i.e., AV = AH = Ā. In addition, a chiral sample causes a rotation of α radians for each of the linearly polarized beams. By definition, the angle is positive for a clockwise rotation as viewed looking into the beam. Thus modified, the beam enters the analyzer. The transmission of a linearly polarized beam through a linear polarizer goes as the square of the cosine of the angle between the plane of polarization of the light beam and the transmission axis of the polarizer. Thus, in the absence of a chiral sample exactly one half of both IH and IV will be transmitted by the polarizer because it makes an angle of π/4 with each polarized beam.
J.C. Sutherland / Measurement of Circular Dichroism and Related Spectroscopies
[Figure 6 diagram labels: Polarizer, Photoelastic Modulator, Sample Holder, Analyzer, Transmission Detector, ORD Sample Platform]
Figure 6. Configuration of the sample platform of a PEM dichrometer modified to measure ORD [18]. An additional lens may be required to ensure that the transmitted beam goes through the analyzer. Oakberg recommends that the analyzer should be mounted in a precision rotator so that its angular orientation with respect to the polarizer can be set precisely [32].
Because of the rotations produced by a chiral sample, the angles between the planes of polarization of the light and the axis of the polarizer are now π/4 ± α, where the plus sign is for the horizontal polarization. That is, the (originally) vertically polarized component makes a smaller angle and the (originally) horizontally polarized component makes a larger angle with the pass axis of the polarizer, leading to increased and reduced transmission respectively. The transmission of the two linearly polarized components transmitted by the analyzer is thus given by TH/V = (cos[π/4 ± α])² → (1 ± sin 2α)/2. The intensity of the light reaching the detector as a function of time is given by the expression in Eqn. (12). In the rightmost expression, the pesky J0 term was eliminated by invoking the same magic phase as used for LD. The value of the ORD (in radians) is obtained from experimentally determined values as shown in Eqn. (13). For small values, the inverse sine is approximately equal to its argument, but this is not as robust an approximation as those used for the terms in exponential series that appear in the expressions for CD and LD. Also, because almost all modern dichrometers are interfaced to a computer, there is no compelling reason to resort to this approximation.
$$I[t] = I_H e^{-a} T_H + I_V e^{-a} T_V \Rightarrow \frac{I_0 e^{-a}}{2}\left(1 + 2 J_2[\delta_0^{*}] \sin 2\alpha \cos 2\omega t\right) \qquad (12)$$

$$\alpha = \frac{1}{2}\sin^{-1}\left[\frac{1}{2 J_2[\delta_0^{*T}]}\,\frac{v_{2\omega}}{v_0}\right] \qquad (13)$$
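The conversion in Eqn. (13) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the Bessel function is evaluated from its power series, and the magic phase is taken to be the first zero of J0 (about 2.405 rad), which is what eliminating the J0 term implies. The sketch also compares the exact inverse sine with the small-angle shortcut discussed above.

```python
import math


def bessel_j(n, x, terms=40):
    """Bessel function of the first kind, J_n(x), from its power series."""
    return sum(
        (-1) ** k / (math.factorial(k) * math.factorial(n + k)) * (x / 2.0) ** (2 * k + n)
        for k in range(terms)
    )


def ord_rotation(v2w, v0, delta0):
    """Rotation angle alpha (radians) from the ratio v2w/v0, following Eqn. (13)."""
    arg = v2w / (2.0 * bessel_j(2, delta0) * v0)
    if abs(arg) > 1.0:
        raise ValueError("v2w/v0 is too large for the chosen phase amplitude")
    return 0.5 * math.asin(arg)


# Assumption: the magic phase is the first zero of J0 (about 2.405 rad).
DELTA0 = 2.405

# Synthetic check: build the ratio implied by Eqn. (12) for a known rotation.
alpha_true = math.radians(10.0)
ratio = 2.0 * bessel_j(2, DELTA0) * math.sin(2.0 * alpha_true)

alpha_exact = ord_rotation(ratio, 1.0, DELTA0)
alpha_small = ratio / (4.0 * bessel_j(2, DELTA0))  # small-angle shortcut, sin(2a) ~ 2a

print(math.degrees(alpha_exact), math.degrees(alpha_small))
```

For a rotation as large as 10° the shortcut underestimates the angle noticeably, which is one reason to keep the inverse sine when a computer does the arithmetic anyway.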
2.3.5 Simultaneous Absorption
A measurement of the (mean) absorption of the sample, Ā, complements all of the differential measurements described above, as well as most of the emission experiments described below. A variety of arrangements involving additional optical components have been developed to permit the measurement of absorption using a dichrometer, but they disrupt the inherent simplicity of the optics of the basic instrument. However, a simpler approach is available that does not require additional optics and leverages the fundamental aspect of the electronic processing of the dichroic signal illustrated in Figure 3. This approach was first employed to monitor CD of samples eluting from a column as a function of sample temperature but at a single wavelength [33] and was later extended to include the ability to record absorption as a function of wavelength even when the intensity of the light source changes slowly with time, as happens with most synchrotron sources [3]. The key observation is that the comparator circuit shown in Figure 3 completes a servo circuit that functions to keep the time average output of the photomultiplier, v0, constant. As shown in Eqn. (14), v0 can be represented as the product of the time-average intensity of the light reaching the photomultiplier, Ī, times the gain of the photomultiplier and its associated current-to-voltage converter, G[V]. The gain is determined by the high voltage, V, directed to the photomultiplier dynode chain by the comparator circuit. As written, G is a dimensioned quantity with units of volts per unit photon flux, which makes it a function of the wavelength of the incident radiation. However, we are only concerned with comparing the gain required to yield the same output voltage for two different fluxes at the same wavelength, which is only a function of the applied voltage, V.
Suppose that we first place a sample with absorption AS in the beam, for which the servo circuit generates a voltage of VS,λ at wavelength λ to satisfy the condition of constant time-average signal. Also assume that the incident intensity was IS. At some earlier or later time, we measure a reference sample or “blank” which is identical to the sample except for the material of interest. For this sample, all of the “S” subscripts are changed to “B”. However, the servo circuit ensures that the time average output from the photomultiplier is the same. Thus, we equate the two expressions, which both equal v0 and hence equal each other. Taking the common logarithm of each side and solving for the absorption difference yields the expression shown in Eqn. (15). The relevant parameters for both sample and reference can be collected at the same time the CD of the sample and the blank are recorded and combined to form the corresponding pseudo-absorptions, as defined in Eqn. (16). It follows that the difference in absorption between sample and blank is just the difference in the corresponding pseudo-absorptions, as shown in Eqn. (17).
$$v_0 = G[V]\,\bar{I} = G[V]\, I_0\, 10^{-A} \qquad (14)$$

$$A_S - A_B = \log_{10} G[V_S] + \log_{10} I_S - \left(\log_{10} G[V_B] + \log_{10} I_B\right) \qquad (15)$$

$$pA_S = \log_{10} G[V_S] + \log_{10} I_S \quad \text{and} \quad pA_B = \log_{10} G[V_B] + \log_{10} I_B \qquad (16)$$

$$A_S - A_B = pA_S - pA_B \qquad (17)$$
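The bookkeeping behind Eqns. (14) to (17) can be illustrated with a short Python sketch. Everything here is hypothetical scaffolding: `servo_gain` simply solves Eqn. (14) for the gain that holds v0 constant, and the numerical values are invented. In a real instrument the gain would instead be obtained from the recorded high voltage through a calibration of G[V].

```python
import math


def servo_gain(v0, incident_intensity, absorbance):
    """Gain the servo must supply so that v0 = G * I0 * 10**(-A), per Eqn. (14)."""
    return v0 / (incident_intensity * 10.0 ** (-absorbance))


def pseudo_absorption(gain, incident_intensity):
    """Pseudo-absorption pA = log10(G) + log10(I), per Eqn. (16)."""
    return math.log10(gain) + math.log10(incident_intensity)


# Invented example: the source intensity drifts between the two scans.
V0 = 1.0                          # servo set point (arbitrary units)
A_SAMPLE, I_SAMPLE = 0.80, 100.0  # sample absorbance, incident intensity during sample scan
A_BLANK, I_BLANK = 0.05, 120.0    # blank absorbance, incident intensity during blank scan

g_sample = servo_gain(V0, I_SAMPLE, A_SAMPLE)
g_blank = servo_gain(V0, I_BLANK, A_BLANK)

delta_A = pseudo_absorption(g_sample, I_SAMPLE) - pseudo_absorption(g_blank, I_BLANK)
print(round(delta_A, 6))  # 0.75, i.e. A_SAMPLE - A_BLANK
```

The drift in incident intensity cancels because it enters each pseudo-absorption explicitly, which is the point of recording it alongside the gain.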
2.4 Emission Experiments
The emission detector in the prototype spectrometer can be used either to monitor the interaction of the incident beam with the sample or to characterize the sample based on the properties of the emitted light. Some experiments depend on both. Here I discuss only those experiments that can be performed using the configuration shown in Figure 1 along with modifications that can be implemented with passive components, e.g., lenses and filters, mounted on the sample platform as shown in Figure 7. Experiments requiring the relocation of the PEM have been described elsewhere [9, 14, 34] and will not be treated here. Fluorescence detected CD [6] and MCD [35] were the first experiments to use the configuration shown in Figure 7, but fluorescence polarization anisotropy is used far more frequently in biophysical spectroscopy. In addition, the analysis of the anisotropy experiment is useful in understanding artifacts that appear in fluorescence detected CD, and thus anisotropy is treated first.
2.4.1 Fluorescence Polarization Anisotropy
When a molecule is excited with linearly polarized light, the intensity of fluorescence polarized parallel to the polarization of the exciting beam may be greater or less than the intensity with perpendicular polarization. This occurs when the time required to randomize the orientation of the fluorophore is greater than the mean time between absorption and emission. For example, tryptophan embedded in a protein is constrained compared to free solution. Thus the degree of linear polarization of fluorescence provides information on the environment of the fluorophore on a time scale of nanoseconds. Fluorescence polarization and CD are the most used applications of polarized light in biophysical spectroscopy. The degree of polarization can be quantified in either of two ways, as shown in Eqs. (18) and (19), where Fp is the intensity of fluorescence polarized parallel and Fs is the intensity of fluorescence polarized perpendicular to that of the exciting light. If the sample is isotropic (LD is zero), the fraction of the excitation beam absorbed in the sample is 1 − e^(−a) for both polarizations and thus disappears from the anisotropy because it appears in each term. If the intensity of the incident light is also the same for both polarizations, it too disappears from the ratios. Thus, the degree of polarization of the fluorescence can
[Figure 7 diagram labels: Polarizer, Photoelastic Modulator, Sample Holder, Lens, Filter, Emission Detector, Fluorescence Sample Platform]
Figure 7. Configuration of the sample platform of a dichrometer in “right angle” geometry for fluorescence polarization anisotropy (r), FDCD and the measurement of total fluorescence intensity. A short-wavelength cutoff filter prevents scattered light from reaching the detector. The exclusion of scattered light is critical because scattered light is strongly polarized and thus can easily distort measurements of polarization anisotropy [36]. A lens can increase the intensity of the fluorescent light that reaches the detector without having to move the emission detector too close to the sample. A slightly more complicated sample platform could be used to measure “front-face” fluorescence, which may be useful for suppressing linear-polarization artifacts in FDCD (sect. 2.4.3). A more complicated instrument can include a monochromator or spectrograph to analyze the wavelength distribution of the light emerging from the sample. To avoid interference with the emission detector shown in this figure, it could be positioned on the opposite side of the sample, i.e., along the negative y axis. Because synchrotron radiation is inherently pulsed at a high frequency, it is possible to measure time resolved fluorescence [37] in addition to CD, LD, absorption and time average polarization anisotropy in such an instrument. However, the configuration shown here is superior for determining fluorescence anisotropy because monochromators and spectrographs usually exhibit polarization sensitive throughput that changes with wavelength.
be expressed as the ratios of the quantum yields for emission parallel and perpendicular to the polarization of the exciting beam, as indicated in Eqn. (18). The polarization anisotropy ratio, r, is generally considered a better description because there are always two axes that are perpendicular and one that is parallel to the direction of polarization of the excitation. (P and r are not linearly independent so one can be calculated from the other.) Numerous experimental configurations have been developed to quantify these parameters. Most involve one polarizer in the incident beam and another between the sample and the emission detector or detectors [36]. The degree of automation of such multi-polarizer configurations has increased over time. The potential for measuring fluorescence polarization with a PEM was recognized early on [2, 14, 34], but these approaches required placing the PEM between the sample and the emission detector and adding a second polarizer. Such configurations made it difficult to adapt commercial dichrometers to measure fluorescence polarization. Recently, Ives DuPont and his colleagues introduced an elegant approach to measuring fluorescence polarization anisotropy using the basic configuration shown in Figure 1 [7]. DuPont’s critical observation is that when the exciting light is vertically polarized, the intensity reaching a 90° detector is the sum of the fluorescence polarized parallel to the polarization of excitation plus the fluorescence polarized perpendicular to that direction, i.e., FV = Fp + Fs, while for horizontally polarized incident light, all of the intensity reaching the emission detector is polarized perpendicular to the direction of excitation, i.e., FH = 2Fs. Thus
changing the linear polarization of the incident light while recording the total intensities reaching the emission detector provides enough information to calculate the anisotropy of the fluorescence. Reference [7] outlined the concept and provided experimental results derived from a PEM based implementation. The details of the operation of the PEM required to record fluorescence polarization anisotropy were published soon thereafter [38]. Assume that the sample is isotropic so that aV = aH = a, that the quantum yields for fluorescence for emission parallel and perpendicular to the polarization of the exciting light are φp and φs respectively, and that the emission detector has an unobstructed view of the entire sample through which the excitation beam passes. The recorded intensities of fluorescence for vertically and horizontally polarized incident light as a function of time are thus FV = (φp + φs)(1 − e^(−a))IV and FH = 2φs(1 − e^(−a))IH. The intensity of the fluorescence reaching the detector as a function of time is shown in Eqn. (20). The modulated amplitude of the fluorescence shown in Eqn. (20) is proportional to φp − φs. The time invariant part of the fluorescence can be made proportional to φp + 2φs by choosing δ0 such that J0[δ0] = −⅓, as shown in Eqn. (21). I refer to this value of the phase amplitude as the magic phase for fluorescence polarization anisotropy with horizontal incident polarization and denote it by the symbol δ0*rH. Comparing the modulated and steady-state components of the expression for fluorescence with the expression for r shown in Eqn. (19) and rearranging gives the expression for r shown in Eqn. (22), where v0 and v2ω are the same measured voltages indicated in Eqn. (6). The value of this magic phase is approximately 1.035π (186°) and the corresponding value of J2 is 0.481, which is near its maximum value. The corresponding situation for vertical incident polarization was described previously [38]. The magic phase for fluorescence polarization anisotropy in that configuration, δ0*rV, causes J0 to equal +⅓. The value is approximately 0.577π (104°), and the corresponding value of J2 is 0.309. This magic phase is close to the optimal for measuring CD. The expression for r is the same as for horizontal excitation except for a sign, which reflects a difference in phase. Because of the flexibility in selecting the phase for the measurement of CD, either vertical or horizontal incident polarization can be used for the simultaneous measurement of CD and fluorescence polarization anisotropy.
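$$P \equiv \frac{F_p - F_s}{F_p + F_s} \Rightarrow \frac{\phi_p - \phi_s}{\phi_p + \phi_s} \qquad (18)$$

$$r \equiv \frac{F_p - F_s}{F_p + 2F_s} \Rightarrow \frac{\phi_p - \phi_s}{\phi_p + 2\phi_s} \qquad (19)$$

$$F[t] = \left(1 - e^{-a}\right)\frac{I_0}{2}\Big((\phi_p + \phi_s)(1 - \cos\delta_0) + 2\phi_s(1 + \cos\delta_0)\Big) \Rightarrow \left(1 - e^{-a}\right)\frac{I_0}{2}\Big(\phi_p(1 - J_0[\delta_0]) + \phi_s(3 + J_0[\delta_0]) - 2(\phi_p - \phi_s)\, J_2[\delta_0] \cos 2\omega t\Big) \qquad (20)$$

$$F[t] = \left(1 - e^{-a}\right)\frac{I_0}{2}\left(\frac{4}{3}(\phi_p + 2\phi_s) - 2(\phi_p - \phi_s)\, J_2[\delta_0^{*rH}] \cos 2\omega t\right) \qquad (21)$$

$$r = \frac{-2}{3 J_2[\delta_0^{*rH}]}\,\frac{v_{2\omega}}{v_0} \Rightarrow -1.386\,\frac{v_{2\omega}}{v_0} \qquad (22)$$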
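The numbers quoted for the magic phase can be checked with a short Python sketch. It is illustrative only: the function names are hypothetical, J0 and J2 are evaluated from their power series, and the root of J0[δ0] = −⅓ is located by bisection between the first zero of J0 and the first zero of J1, where J0 decreases monotonically. The synthetic signal at the end uses invented quantum yields and the amplitudes read off Eqn. (21).

```python
import math


def bessel_j(n, x, terms=40):
    """Bessel function of the first kind, J_n(x), from its power series."""
    return sum(
        (-1) ** k / (math.factorial(k) * math.factorial(n + k)) * (x / 2.0) ** (2 * k + n)
        for k in range(terms)
    )


def find_phase(target_j0, lo=2.405, hi=3.832, iterations=60):
    """Bisection for J0(delta) = target on an interval where J0 decreases monotonically."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if bessel_j(0, mid) > target_j0:
            lo = mid  # J0 still too large, so the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


def anisotropy(v2w, v0, j2):
    """Anisotropy r from the measured ratio v2w/v0, following Eqn. (22)."""
    return -2.0 / (3.0 * j2) * v2w / v0


delta_rH = find_phase(-1.0 / 3.0)
j2_rH = bessel_j(2, delta_rH)
factor = 2.0 / (3.0 * j2_rH)
print(delta_rH / math.pi, j2_rH, factor)

# Synthetic signal with invented quantum yields, using the amplitudes in Eqn. (21).
phi_p, phi_s = 0.30, 0.15
v0 = (4.0 / 3.0) * (phi_p + 2.0 * phi_s)
v2w = -2.0 * (phi_p - phi_s) * j2_rH
r = anisotropy(v2w, v0, j2_rH)
print(r, (phi_p - phi_s) / (phi_p + 2.0 * phi_s))
```

The recovered r matches the definition in Eqn. (19) for the invented yields, and the printed phase, J2 and numerical factor can be compared with the values of about 1.035π, 0.481 and 1.386 quoted in the text.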
2.4.2 Simultaneous Fluorescence Intensity
Just as it is desirable to measure the total absorption along with the differential parameters CD or LD, it is useful to measure the total fluorescence along with the fluorescence polarization anisotropy. Eqs. (21) and (22) show that this is even easier than the corresponding absorption measurement because v0, which is proportional to the time-average fluorescence F̄, is an inherent part of the measurement of r. Indeed, setting the amplitude of the phase retardation as indicated above has the same effect as placing a linear polarizer oriented at the “magic angle” (≈ 55°) between the sample and detector in a conventional fluorometer [36], i.e., assuming vertically polarized excitation, the perpendicular component of the time-average fluorescence is weighted twice as heavily as the parallel component. Use of a PEM to achieve proportional weightings has the advantage of increasing the signal by about a factor of 2 because the second polarizer is eliminated, and it makes the measurement of r possible for both vertically and horizontally polarized incident light.
2.4.3 Fluorescence Detected CD and MCD
When first reported in the mid 1970s, FDCD [6] and FDMCD [35] generated considerable excitement based on the expectation that the CD signal of a single fluorescent component could be extracted from a complex mixture of absorbing species as frequently found in proteins, nucleic acids and their complexes. Several factors have frustrated these expectations. The FDCD/FDMCD signals are frequently very noisy because the CD signals are inherently small and far fewer photons reach the detector in an FDCD experiment than in transmission-detected CD. The major loss mechanisms are the quantum yield for fluorescence of the fluorophore and the fact that most of the emitted photons never reach the detector. The quantum yield for even a “good” fluorophore may be only 10% and even for aggressive collection optics operating at l/d = 1, 95% of the emitted
photons will not reach the detector (Figure 9 and section 3.1). FDCD has found applications in certain systems where the quantum yield for fluorescence is high [39] and fluorescence can be collected very efficiently with an enveloping elliptical mirror system [40]. The measurement of FDCD/FDMCD is actually more complicated than transmission-detected CD because extraction of the signal requires a calculation involving both the modulated and time average fluorescence plus the absorption spectrum. This is illustrated in Eqs. (23) and (24), where φ is the quantum yield of fluorescence – assumed equal for both circular polarizations – and the other symbols are as used previously or, in the case of the fluorescence magnitudes, the direct analogies of the corresponding transmitted intensities. Eqn. (23) indicates the approach. It is assumed that the emission detector can receive light generated in all parts of the sample with equal efficiency and that only the fluorophore absorbs left- and right circularly polarized light to different extents. Note that as the absorption of the chiral fluorophore increases, the time average fluorescence continues to increase, but ΔFω peaks and then returns asymptotically to zero.
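$$F[t] = \phi\left(1 - e^{-a_L}\right) I_L + \phi\left(1 - e^{-a_R}\right) I_R \Rightarrow I_0\,\phi\left(1 - e^{-a} - J_1[\delta_0]\, e^{-a}\, \Delta a_{CD} \sin \omega t\right) \qquad (23)$$

$$\Delta A_{CD} = \frac{\left(1 - 10^{-A}\right)}{J_1[\delta_0]\, \ln 10\; 10^{-A}}\,\frac{v_{\omega}}{v_0} \qquad (24)$$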
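A minimal Python sketch of the FDCD conversion in Eqn. (24) follows, together with a check of the remark that the modulated fluorescence peaks and then falls as absorption increases. The function names and numbers are hypothetical, the value of J1[δ0] is merely illustrative, the sign convention follows Eqn. (24) as printed, and the peak search assumes a fixed dissymmetry ratio g = Δa/a so that the modulated amplitude scales as a·e^(−a).

```python
import math


def fdcd_delta_A(v_ratio, absorbance, j1):
    """Delta A (CD) from v_omega/v0, mean absorbance A and J1[delta0], as in Eqn. (24)."""
    t = 10.0 ** (-absorbance)  # transmitted fraction
    return (1.0 - t) / (j1 * math.log(10.0) * t) * v_ratio


def modulated_amplitude(a, g):
    """Relative size of the omega-modulated term, exp(-a) * delta_a, with delta_a = g * a (assumed)."""
    return math.exp(-a) * g * a


def mean_fluorescence(a):
    """Relative time-average fluorescence, 1 - exp(-a)."""
    return 1.0 - math.exp(-a)


# Round trip with invented values: forward ratio implied by Eqn. (24), then inversion.
J1 = 0.5                    # illustrative value only
A, dA_true = 0.5, 1.0e-4
t = 10.0 ** (-A)
v_ratio = J1 * math.log(10.0) * t * dA_true / (1.0 - t)
dA_back = fdcd_delta_A(v_ratio, A, J1)

# Scan the Napierian absorbance a and locate the peak of the modulated signal.
grid = [0.05 * i for i in range(1, 101)]
a_peak = max(grid, key=lambda a: modulated_amplitude(a, 1.0e-3))
print(dA_back, a_peak, mean_fluorescence(a_peak))
```

Under the fixed-g assumption the modulated term peaks near a = 1 while the time-average fluorescence keeps rising, consistent with the behavior described for Eqn. (23).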
The interpretation of FDCD data can be further complicated by the fact that modulation of the fluorescence at the frequency of the PEM can be due to the CD of the fluorophore, the CD of non-fluorescent components of the sample, or a combination of the two [35], with the fluorescent and non-fluorescent components contributing to ΔFω with opposite signs relative to their CD (measured in absorption). Finally, as noted above, if the fluorophore is not free to reorient during the lifetime of the excited state, the fluorescence will be partially linearly polarized. Using the expressions for the sum of the quantum yields described above for polarization anisotropy in place of the single quantum yield used in this derivation (because both IL and IR can be considered the sum of equally intense horizontal and vertical components), we find additional signals at angular frequency 2ω. However, the recorded FDCD can be subject to artifacts due to the 2ω signal “leaking” into the ω channel and distorting the FDCD data, the same type of artifact that can occur in CD measured for a sample that exhibits LD. A plausible explanation of this effect is discussed in section 3.6.2. An interesting experimental approach to avoiding this problem is the use of an enveloping elliptical mirror that captures a very large fraction of the fluorescence [40]. In the limit that all of the emitted light could be captured, the intensity could not depend on the degree of polarization of the fluorescence. A simpler way to suppress linear polarization effects is to view the fluorescence emitted along the x axis, i.e., front-face fluorescence geometry.
2.5 Inhomogeneous Distribution of Absorbers and Scattering Can Distort Spectra
The derivations presented above are based on the exponential decrease in intensity of a light beam, either polarized or unpolarized, as it traverses the sample, i.e., Eqn. (2). However, the derivation of Eqn. (2) assumes that the distribution of absorbing species is completely random. The localization of chromophores within substructures within a sample, e.g., proteins restricted to membrane fragments dissolved or suspended in solution, can result in the failure of Eqn. (2) to accurately describe the intensity profile of a beam of light passing through the sample, resulting in the departure of experimental results from the expressions derived above. This effect is sometimes referred to as “Absorption Flattening” because absorption bands are truncated, but CD bands can also experience wavelength shifts in addition to reductions in amplitude. Of course, the magnitude of the problem will depend on the degree of the clustering of absorbers, among other factors, but it can become significant in the measurement of CD [41] and other spectral parameters in some circumstances. Recently Castiglioni and his colleagues [42-44] have reviewed previous work in this area and proposed empirical approaches to correct spectra for such effects. The above derivations also assume that absorption is the only mechanism for attenuating the light beam passing through a sample. But scattering can also attenuate the beam, particularly in the presence of membrane fragments or other heterogeneous components. Instrumental approaches to minimize artifacts arising from light scattering are discussed in section 3.8. It is interesting to note that the effects of absorption flattening and light scattering on unpolarized absorption spectra are opposite in sign, although they differ strongly in their dependence on wavelength.
2.6 Light Scattering as a Monitor of Molecular Aggregation
While light scattering is frequently regarded as an artifact that may interfere with the measurement of CD, it can also be used to monitor the degree of aggregation in a sample. Most scattering will result in light polarized parallel to that of the incident beam. When the sample is viewed by a detector located on the positive y axis, the time course will follow the intensity of the light polarized in the z direction and will appear at a frequency of 2ω. Thus, in a variety of experiments, the emission detector can follow the degree of aggregation of a sample while the transmission detector records CD as a function of wavelength, time, temperature or some other independent parameter.
3. Practical Instruments and their Components
All instruments represent some combination of compromises or “trade-offs” involving performance parameters, components and cost. The selection of one component will influence the choices available for others. The issues involved in the selection of many basic components have been discussed in detail elsewhere, see e.g. [45].
3.1 Characterizing Optical Beams: Focal Ratios and Solid Angles
Each component of an optical system is characterized by the limiting size, shape and divergence of the optical beam it can accommodate. These parameters should be mutually consistent for the various components in the system. This is easiest to understand in terms of a limiting circular aperture.
3.1.1 Circular Apertures
There are several ways of quantifying the optical characteristics of a system consisting of a focusing lens or mirror in the case of a circular aperture. The easiest to measure is the ratio l/d, where these quantities are defined in Figure 8. In the case of collecting light emitted in all directions from a source, a lower ratio means that a larger fraction of the light is being collected. A related parameter is the half angle of the cone, θ = arctan[d/(2l)], or the numerical aperture, which in air or vacuum equals sin θ. The acceptance angles of crystal polarizers are often expressed in terms of twice this angle. In some ways, the most useful description of this optical system is the solid angle of the cone, Ω, a dimensionless number generally expressed for clarity in “steradians”. For a circular aperture, Ω is the area of the aperture projected on a sphere of radius r divided by r², where r = (l² + d²/4)^½. For a circular aperture, Ω can be calculated from θ or the l/d ratio as shown in Eqn. (25), where θ̃ represents the variable angle of integration. The fraction of the light incident on a circular aperture from a uniformly omnidirectional source is given by Ω/4π. Its relation to l/d for a circular aperture is shown in Figure 9. If the focal length of the lens/mirror is equal to l, the l/d ratio is called a focal ratio, f/#. This implies that the optical element is forming an image of the source point at infinity or forming an image of an infinitely distant object.
This condition exists between the slits and mirrors of a Czerny-Turner monochromator (Figure 13), but not for the source collection optics shown in Figure 10 or the curved grating of the monochromator shown in Figure 14. Nevertheless, the term “focal ratio” is sometimes applied to the l/d ratio even when l is not the focal length of the optical element.
[Figure 8 diagram labels: source or focus, r, θ, d, l]
Figure 8. Definitions of the parameters used to quantify the light-gathering ability of a circular optical element such as a lens or mirror. The distance of the source object (or image) from the plane of the optical element is represented by l, the distance from the source to a point on the edge of the circular element is r, and the diameter by d. The half angle of the cone of light accepted by the lens/mirror is θ and the solid angle subtended by the cone of light is represented by Ω, although it cannot be shown in these orthographic projections.
$$\Omega = \frac{1}{r^2}\int_{\tilde{\theta}=0}^{\theta} 2\pi r^2 \sin\tilde{\theta}\, d\tilde{\theta} \Rightarrow 2\pi\left(1 - \cos\theta\right) \Rightarrow 2\pi\left(1 - \left(\sqrt{1 + \frac{1}{4\,(l/d)^2}}\,\right)^{-1}\right) \qquad (25)$$
[Figure 9 plot: vertical axis, Fraction of Light Collected (logarithmic, 0.005 to 0.5); horizontal axis, l/d (0 to 10)]
Figure 9. Fraction of the light collected from a uniform omnidirectional source by a circular aperture as a function of the l/d ratio of the optical system. Calculated from Ω/4π using Eqn. (25).
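The relation between l/d and the collected fraction Ω/4π in Eqn. (25) is easy to tabulate. The Python sketch below is illustrative, with hypothetical function names; it evaluates the half angle, the solid angle and the collected fraction for a few l/d values, including the l/d = 1 and l/d = 1.36 cases mentioned in the text for source collection optics.

```python
import math


def half_angle(l_over_d):
    """Half angle of the acceptance cone, theta = arctan[d / (2 l)], in radians."""
    return math.atan(1.0 / (2.0 * l_over_d))


def solid_angle(l_over_d):
    """Solid angle of the cone in steradians, 2*pi*(1 - cos(theta)), per Eqn. (25)."""
    return 2.0 * math.pi * (1.0 - math.cos(half_angle(l_over_d)))


def collected_fraction(l_over_d):
    """Fraction of light from a uniform omnidirectional source entering the aperture."""
    return solid_angle(l_over_d) / (4.0 * math.pi)


for ratio in (1.0, 1.36, 4.0, 10.0):
    print(ratio,
          round(math.degrees(half_angle(ratio)), 2),
          round(solid_angle(ratio), 4),
          round(collected_fraction(ratio), 4))
```

For l/d = 1 the fraction comes out near 5%, and for l/d = 1.36 near 3%, in line with the figures quoted for the mirror and lens collection systems.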
3.1.2 Rectangular Apertures
Most gratings and prisms are rectangular rather than circular, so the optical beam corresponding to the cone in the previous paragraph is a quadrilateral pyramid. The focal ratio of such a system is sometimes computed using the diagonal of the rectangle in place of the diameter of the circle. A more realistic approach that can be used when the rectangle is not too far from a square is to calculate l/d using the “effective” diameter of a circle having the same area as the rectangular aperture, i.e., de = (4hw/π)^½, where h and w are the height and width of the rectangular aperture. It is easy to calculate separate half angles for the two axes of the rectangle, and the solid angle can be approximated using the effective diameter as long as the aperture is reasonably close to a square, as is usual for optical systems associated with conventional light sources. Synchrotron sources produce quite different beams compared to conventional sources. While not nearly as collimated as the light from a laser, the light emitted from a synchrotron diverges with a high f/#. For the beamlines used for circular dichroism, the vertical divergence is typically much smaller than that in the horizontal plane. For large f numbers, the length of an arc approaches that of its chord, so the solid angle subtended closely approximates the product of the vertical and horizontal acceptance angles. A lens or mirror can easily increase the l/d ratio of an image cone compared to the corresponding object cone. However, the ratio of the size of the image to that of the object always changes by the same factor. Thus, for example, imaging the exit slit of a monochromator on a polarizer with a more limited acceptance cone can easily be performed, but at the cost (literally) of requiring a physically larger polarizer to
accept the beam. Thus, the f/# of the monochromator is usually chosen to be compatible with the acceptance angle of the polarizer. Similarly, a low f/# lens or mirror will collect more of the light from an omnidirectional source and match it to the f/# of a monochromator. However, beyond a certain point, the increased size of the image of the source points overfills the length of the entrance slit of the monochromator, so the additional light collected from the source is effectively lost. The directional nature and high f/#s typical of synchrotron radiation, in contrast, make it possible to direct more of the radiation onto the dispersing element of a properly chosen monochromator.
3.2 Light Sources
The choice of the light source, the first component in the optical train (Figure 1), is particularly important in influencing the selection of subsequent components. Conventional sources like Xenon arcs or quartz-halogen lamps, which approximate black-bodies, have maximal intensity in the visible and decrease in intensity in the ultraviolet. Electron storage rings that are operated as light sources, in contrast, emit maximally in the X-ray region; their emission spectrum increases monotonically with decreasing wavelength throughout the visible and ultraviolet. Thus, the short wavelength limit of synchrotron radiation sources used for SRCD is always determined by other components inserted into the beam. Conventional and synchrotron sources also differ in the way light is collected from the source and processed as it proceeds through the instrument.
3.2.1 Xenon Short Arcs and Other Conventional Sources
Most conventional sources radiate uniformly in all directions. Thus the number of photons collected depends on the solid angle that the first optical element subtends with respect to the source.
However, the resulting beam must be optically matched to the optics and size of the monochromator, which limits the size of the solid angle, and thus the flux, collected. Typical configurations for a conventional source and the associated collection optics are shown in Figure 10. An approximately cone shaped beam of light is collected from the source and focused onto the entrance slit such that the solid angle is matched to the internal optics of the monochromator. The primary mirror shown in the left hand side of Figure 10 must be an off-axis ellipsoid to avoid substantial distortion of the image of the source. This collection system has an f/# = 1, so the primary mirror collects about 5% of the light emitted by the lamp. The rear mirror approximately doubles this value. On the right hand side of Figure 10 a lens is the primary optical element. The system shown has f/# = 1.36, so the lens collects approximately 3% of the total light generated by the source and the rear reflector approximately doubles this value.
3.2.2 Synchrotron Radiation
Synchrotron radiation is limited in the solid angle emitted, which has significant implications for the choice of the monochromator and subsequent components. In addition, there are significant differences between the range of angles in the vertical compared to the horizontal. The range of useful intensities in the vertical is limited in the UV to a total of about 10 mRadians (~ 0.6°) oriented symmetrically about the orbital plane of the electron beam. This is due to the inherent nature of the
Figure 10. Plan views of typical collection optics for a conventional source such as a Xenon arc or quartz halogen lamp. Left: An off axis ellipsoidal mirror collecting light from a source (filled circle) and focuses it onto the entrance slit of a monochromator. The angular convergence of the focused beam should be such that the light just fills the first optical element inside the monochromator. A less convergent beam will under fill a grating and result in less than the optimal separation of adjacent wavelengths, while a more convergent beam will overfill the first optic and result in a portion of the beam being lost, increasing out of band light in the monochromatic beam. The distance from the source to the collecting mirror is small to intercept a larger solid angle of the light radiated from the source. The longer distance from this mirror to the entrance slit results in an enlarged image of the source at the entrance slit (shaded semicircle). The diameter of the magnified source image should not exceed the long dimension of the slit. A spherical mirror located on the opposite side of the arc at a distance equal to its radius or curvature focuses an equal solid angle back through the lamp and onto the collection mirror, hence approximately doubling the intensity of the beam reaching the monochromator. Right: The off axis collection mirror can be replaced by one or mores lenses, which creates a linear system, but results in both chromatic and spherical aberrations. This is a disadvantage for fixed lenses because the focus point changes with wavelength and thus is not always on the entrance slit. However, the position of the lens can be adjusted as the monochromator scans, in which case the chromatic aberration can improve spectral purity, particularly in the ultraviolet [46]. Translating the collection lens is made much easier in contemporary instruments that are controlled by a computer. 
Lens-based source optics are encountered more frequently in fluorometer-type dichrometers that use monochromators that accept larger solid angles and hence have lower focal ratios, e.g., [2]. Practical designs require that high-pressure arcs be enclosed in a sturdy housing to limit the extent of damage in the event of an explosion. The positions of either the mirrors and lenses or the lamp must be adjustable so that the image of the source is focused properly. Prealigned modular lamp-optics assemblies are an alternative possibility. All non-vacuum monochromators and source compartments should be purged with dry nitrogen gas.
radiation. The horizontal acceptance, in contrast, is limited by the design of the vacuum system of the storage ring. For example, beamline U11 at the NSLS receives 55 mrad (~ 3°) of radiation from a bending magnet, which is large compared to the horizontal acceptance of most X-ray beamlines. Thus, the synchrotron radiation beam is highly collimated compared to a conventional UV source, although far from the collimation of a laser. The synchrotron radiation beam can be demagnified and still focused onto the entrance slit of a high-focal-ratio monochromator. This, and the fact that the intensity of synchrotron radiation increases as wavelength decreases, helps reduce out-of-band radiation in the beam exiting the monochromator, hence reducing the need for double monochromators that reduce UV intensities. The performance of a single monochromator SRCD instrument for measuring the CD and absorption of a protein in a short pathlength cell and an aqueous environment is shown in Figure 11, while the CD of a myoglobin film on CaF2 is shown in Figure 12. The spectrum for the sample in solution is truncated at 170 nm due to the absorption of water, but the CD of the film extends to 140 nm, revealing additional spectral features. Extension to about 120 nm is possible with a vacuum sample chamber.
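As a rough check on the numbers quoted above, the sketch below converts the 55 mrad horizontal acceptance of beamline U11 into degrees and into the focal ratio of an equivalent cone of light. The relation N = 1/(2 tan(θ/2)) for a cone of full angle θ is ordinary geometrical optics; treating the horizontal fan of radiation as such a cone, and the function names, are assumptions made only for illustration.

```python
import math

def mrad_to_degrees(angle_mrad):
    """Convert an angle from milliradians to degrees."""
    return math.degrees(angle_mrad / 1000.0)

def focal_ratio_from_full_angle(angle_mrad):
    """Focal ratio (f-number) of a cone of light with the given full angle."""
    half_angle_rad = angle_mrad / 2000.0
    return 1.0 / (2.0 * math.tan(half_angle_rad))

# Horizontal acceptance quoted in the text for beamline U11.
u11_acceptance_mrad = 55.0
print(f"{u11_acceptance_mrad} mrad = {mrad_to_degrees(u11_acceptance_mrad):.2f} degrees")
print(f"equivalent focal ratio: f/{focal_ratio_from_full_angle(u11_acceptance_mrad):.1f}")
```

Even this comparatively wide synchrotron fan corresponds to a high focal ratio, which is consistent with the statement that the beam can be matched to a high-focal-ratio monochromator.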
[Figure 11 plot: circular dichroism (absorption units) and absorption versus wavelength (nm), 160 to 260 nm.]
Figure 11. The SRCD (absorption units) and absorption of a myoglobin sample dissolved in water in a cell with CaF2 windows and a 6 μm pathlength. Three CD spectra of the sample overlap for all wavelengths greater than 170 nm (vertical dashed line), demonstrating the reproducibility of the instrument. The solid baseline is a single scan of the same cell containing water. The dotted line is the absorption of the myoglobin, obtained by subtracting the absorption of the water-filled cuvette from the pseudoabsorptions acquired simultaneously with the CD data as described in section 2.3.5. The photomultiplier reached its maximum voltage at 170 nm for both the myoglobin sample and the water blank. At shorter wavelengths, the servo-loop could not keep v0 constant and both CD and absorption data are invalid.
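The 6 μm pathlength of the cell used for Figure 11 matters because transmission falls exponentially with pathlength. The following minimal sketch applies the Beer-Lambert law to a few pathlengths. The absorbance per centimeter used here is a hypothetical round number chosen for illustration, not a measured value for water or any buffer.

```python
def transmitted_fraction(absorbance_per_cm, pathlength_cm):
    """Beer-Lambert law: fraction of light transmitted through a layer."""
    return 10.0 ** (-absorbance_per_cm * pathlength_cm)

# Hypothetical solvent absorbance per cm at a far-UV wavelength (illustrative only).
ABSORBANCE_PER_CM = 1000.0

for label, path_cm in [("6 um", 6e-4), ("100 um", 1e-2), ("1 cm", 1.0)]:
    absorbance = ABSORBANCE_PER_CM * path_cm
    fraction = transmitted_fraction(ABSORBANCE_PER_CM, path_cm)
    print(f"{label:>6}: absorbance {absorbance:8.2f}, transmitted fraction {fraction:.3e}")
```

With these assumed numbers a micron-scale cell still passes a usable fraction of the beam, whereas a 1 cm cell is effectively opaque.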
3.3 Windows and Mirrors

The short wavelength limit of dichrometers used in molecular biophysics is determined by the most restrictive absorbing material that the light passes through. As used here, “window” applies to any material the optical beam must pass through. The envelope of a high-pressure Xenon arc, the window separating the ultra-high vacuum (UHV) of a synchrotron ring from the atmospheric pressure of a sample compartment and the window of a photomultiplier are all obvious examples. However, the transmission of other components must also be considered, e.g., crystal polarizers, the optical element of a PEM, the windows of a cuvette, any gas filling the sample compartment or monochromator and, most importantly, the sample being studied. The ideal situation is for the sample itself to be the wavelength-limiting element and for all other components to provide the widest practical margins so that the maximum information is obtainable. As we move to shorter wavelengths, the number of transmitting materials decreases and thus costs usually increase. Transmission decreases exponentially with pathlength, so the short-wavelength limit of transmission of any material is always related to its thickness. Thus, high quality CD spectra can be recorded easily for aqueous samples down to at least 170 nm (Figure 11), but only if the pathlength is a few microns; 1 cm of water is opaque below ~ 190 nm. Conventional source dichrometers usually rely on quartz-based transmission optics (arc envelope, polarizer, PEM, cuvette and photomultiplier) and the practical cut-off for quartz, both crystalline and amorphous, is near 160 nm. Thus the performance of such
[Figure 12 plots, upper and lower panels: circular dichroism (absorption units) and absorption versus wavelength (nm), 140 to 260 nm.]
Figure 12. Upper: The SRCD (absorption units) of a sample of Myoglobin in 1 M NaCl. The high absorption of the solution limits CD data to longer wavelengths. The data were acquired at 1 nm increments with a 1 s analog time constant and 4 s dwell time at each wavelength. The spectral bandwidth was 0.25 nm. Lower: The SRCD (absorption units) of a film of myoglobin deposited on a CaF2 plate as a function of wavelength. The protein had been dissolved in trifluoroethanol. Two CD and pseudoabsorption spectra were averaged before the corresponding no-sample baselines were subtracted. The data were acquired at 1 nm increments with a 1 s analog time constant and 4 s dwell time at each wavelength. The spectral bandwidth was 0.25 nm. Spectra have been truncated at 142 nm near the start of the N2 absorption bands. All spectra collected on beamline U11 at the NSLS; none of the spectra shown in this figure have been smoothed. Wavelength axes are to the same scale to emphasize the difference in spectral range.
instruments is declining in the very region where the experiment requires the most light. Most synchrotron source instruments, in contrast, use metal fluoride window materials. The 10% transmission cutoff at 300 K is 104 nm for LiF, 116 nm for MgF2 and 122 nm for CaF2 [47]. With N2-purged atmospheric-pressure sample compartments, these instruments can operate down to about 140 nm, providing a clear margin for extracting the maximum data from aqueous samples. All-vacuum dichrometers with CaF2 PEMs can reach ≈ 120 nm. However, the fluorides are mechanically less robust than quartz. LiF forms color centers when exposed to UV
that it absorbs. One reason that most SRCD beamlines use UHV monochromators is to allow a LiF or CaF2 vacuum window to be placed downstream of the monochromator and thus not exposed to the “white” beam of synchrotron radiation. Mirrors and reflective gratings designed for operation in the VUV are usually coated with a layer of vacuum-evaporated aluminum that is overcoated with a layer of vacuum-evaporated MgF2, which prevents oxidation of the aluminum. The reflectivity is high above 200 nm, but decreases in the VUV. Thus instruments designed to operate in the VUV are usually designed to minimize the number of reflections.

3.4 Monochromators

Almost all instruments designed to record CD, LD or ORD combine a broad spectrum light source with a monochromator that selectively transmits a narrow band of wavelengths and strongly rejects wavelengths outside of the selected region. There are many different types of monochromators, the properties of which make them compatible with different sources and play a major role in defining the performance of the instrument.

3.4.1 Types

Monochromators used in modern dichrometers can change the center of the selected spectral band (scan through the spectrum) under computer control. Important parameters in selecting a monochromator for use in a dichrometer are its spectral range, resolution, stray (out of band) light characteristics, throughput, focal ratio, dispersive element[s], and configuration of the optical components. In particular, the focal ratio of the monochromator should be selected to be compatible with the source of the radiation and with the optical elements that follow it. Monochromators are classified by the configuration of the optical components, or type of “mount”, and these are frequently named for their inventor[s]. Here we focus only on the types of monochromators that are used in CD instruments.
3.4.1.1 Plane grating monochromators

Czerny-Turner monochromators have become the dominant mounting used in the visible and UV. As shown in Figure 13, they employ separate optical elements to collimate, disperse and focus the light. Figure 13 shows an “M” configuration in which the entrance and exit beams are parallel. Moving the two mirrors closer together results in a “W” configuration, in which the entrance and exit beams are nonparallel. This is more expensive to manufacture but reduces stray light and provides more separation of external optical components. The first synchrotron beamline built expressly for SRCD used a single W configuration Czerny-Turner vacuum monochromator [48]. An even greater angular separation is achieved with the McPherson Criss-Cross Czerny-Turner vacuum monochromator in which the entrance and exit beams are separated by 44˚. One of the conventional source VUV CD instruments incorporated this monochromator [49]. Ebert-Fastie monochromators have the same number of reflections as Czerny-Turner monochromators, but use only one larger spherical mirror for both collimation and focusing. This results in a mechanically robust monochromator well suited for deployment in extreme environments, but with more distortion of the beam. Grating
Figure 13. Upper: Plan view of an M configuration Czerny-Turner plane grating monochromator. The optical elements, in the order that they interact with an incident beam of polychromatic light, are: entrance slit, Se, spherical collimating mirror, Mc, plane grating, Gp, spherical focusing mirror, Mf, and exit slit, Sx. The selected wavelength is chosen by rotating the grating about a vertical axis that passes through its front surface. One or more additional mirrors or lenses are usually required to form an image of the light source on the entrance slit (Figure 10). Lower: Plan view of two Czerny-Turner grating monochromators joined to share a common intermediate slit. This design eliminates additional folding mirrors. Both can use either gratings or prisms as the dispersive elements, or one of each. The space shown in the lower right quadrant of the figure is a convenient location for the polarizer/modulator/sample compartment. Two commercial CD spectrometers use double prism monochromators in this configuration.
monochromators exhibit two characteristic artifacts: multiple order transmission and Wood's anomalies. If the orientation of the grating is such that wavelength λ is transmitted, then wavelengths λ/2, λ/3, … will also appear in the transmitted beam if present in the spectrum from the source. In the UV region, second or higher order radiation is easily controlled by inserting a short-wavelength cutoff filter into the beam. Indeed, the optical train usually contains a component that acts as a cutoff filter. Examples are the synthetic quartz envelope of a Xenon arc or the UHV window of a synchrotron beamline. The sample can also act as a cutoff filter. No additional filter is needed for wavelengths up to twice these intrinsic cutoff
wavelengths, usually including the entire spectral range needed to characterize proteins in solution. Wood's anomalies can cause rapid, polarization-dependent changes in the intensity transmitted by a grating monochromator as a function of wavelength. It is thus important to be careful about any assumptions, made in the design of an instrument or the interpretation of data, that involve the intensity or the polarization of the light emerging from the monochromator. Xenon arcs produce more light in the visible and near UV than in the far and vacuum UV, which makes stray light problematic. Thus, many such instruments require two inline monochromators to suppress out-of-band light. Double monochromators can be formed by coupling two single monochromators in series using additional plane mirrors to transfer the output of the first stage to the input of the second. However, it is possible to build a double Czerny-Turner monochromator without additional mirrors, a desirable feature in the far and vacuum UV where the reflectivity of normal incidence mirrors decreases and is also subject to degradation due to surface contamination. Such a configuration is shown in the lower panel of Figure 13. An alternative approach is to restrict the spectral response of the detector, as discussed in section 3.8.

3.4.1.2 Prism and Prism-grating Hybrid Monochromators

A prism can also be the dispersive element in a Czerny-Turner monochromator. This makes it more difficult to calibrate the wavelength drive, but eliminates the need for filters to remove second order diffracted light. The dispersion (the separation of adjacent wavelengths) of a prism monochromator increases rapidly with decreasing wavelength in the UV. This makes it possible to open the slits to increase throughput while maintaining constant spectral band width (the range of wavelengths within the “monochromatic” exit beam).
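The trade-off between slit width and spectral band width can be sketched with the usual approximation that the band width equals the slit width multiplied by the reciprocal linear dispersion at the exit slit. The dispersion values below are hypothetical numbers chosen only to mimic a prism whose dispersion rises steeply toward the UV; they do not describe any particular instrument.

```python
def slit_width_for_bandwidth(bandwidth_nm, reciprocal_dispersion_nm_per_mm):
    """Slit width (mm) giving the requested spectral band width (nm)."""
    return bandwidth_nm / reciprocal_dispersion_nm_per_mm

# Hypothetical reciprocal linear dispersion of a prism monochromator (nm per mm).
# Smaller values mean adjacent wavelengths are spread further apart at the slit.
assumed_dispersion = {180: 0.5, 250: 2.0, 400: 10.0, 600: 30.0}

target_bandwidth_nm = 1.0
for wavelength_nm, dispersion in assumed_dispersion.items():
    width_mm = slit_width_for_bandwidth(target_bandwidth_nm, dispersion)
    print(f"{wavelength_nm} nm: slit width {width_mm:.3f} mm for a {target_bandwidth_nm} nm band")
```

Under these assumptions the slit can be opened widest in the far UV, where light is scarcest, while the band width stays fixed.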
On the other hand, prism monochromators have much lower dispersion in the visible and near infrared, which limits achievable spectral band width. Prisms can be made of either optically isotropic materials, such as amorphous silica or CaF2, or birefringent crystals, such as quartz or MgF2. The latter, when employed in a double monochromator, eliminate the need for the separate polarizer shown in Figure 1. For double prism monochromators, there are thus three possibilities; each is used in a presently-available commercial dichrometer. If both prisms are isotropic, the external polarizer is still required. If one or both of the prisms are birefringent, there is no need for an external polarizer. The use of two birefringent prisms further complicates the wavelength drive mechanism, but provides better spatial separation of the two orthogonally polarized beams focused on the exit plane of the second stage of the monochromator. This translates into better polarization purity of the exiting beam, particularly in the far UV where the slits are frequently wide open to provide the highest possible intensity. The hybrid prism-grating double monochromators used in Cary 14 and 17 spectrophotometers combined a Czerny-Turner prism monochromator with a following Czerny-Turner grating monochromator. The prism stage increased dispersion in the UV and eliminated the need for order-sorting filters, while the grating stage ensured good dispersion at longer wavelengths. The McPherson 608 Fastie-Ebert Prism Monochromator can be mounted on the entrance slit of a number of single monochromators from that firm to achieve a similar configuration. The Cary monochromators have been used in both laboratory-built, e.g. [50], and commercially-available dichrometers, resulting in flexible instruments for use across a broad spectral range. However, their performance in the far- and vacuum UV is
not as good as the best double-prism or non-planar grating (see below) monochromators optimized for that spectral region.

3.4.1.3 Non-planar Grating Monochromators

Below about 200 nm, the reflectivity of aluminized mirrors and gratings begins to decline and degradation of reflectivity due to surface contamination becomes more important. Thus instruments designed to operate in the vacuum UV are configured to eliminate non-grazing incidence reflections whenever possible. Combining the focusing and dispersive elements eliminates the two spherical mirrors of a Czerny-Turner monochromator. In synchrotron-based instruments, additional reflections required to focus the beam on the entrance slit of the monochromator can also be eliminated by using the electron beam in the storage ring as a virtual entrance slit. Typically, one (usually water-cooled) folding mirror is still employed to absorb the heat load generated by the “white” synchrotron beam and to ensure that the photon beam reaching the experimental station is horizontal and (usually) traveling away from the storage ring. The early conventional-source VUV CD instruments used McPherson 15˚ vacuum monochromators with a spherical grating and a nominal focal length of 1 m [51-53]. The SRCD beamline at the Hiroshima Synchrotron Radiation Center uses a UHV version of this monochromator [54]. With this mounting, scanning through the spectrum requires both rotating and translating the grating, resulting in a slight shift in the direction of the exiting beam. The beamline used by Snyder and Rowe to measure MCD at the Tantalus synchrotron used a Seya-Namioka UHV monochromator [25], as do the SRCD beamlines at the Beijing Synchrotron Radiation Facility and the National Synchrotron Radiation Research Center on Taiwan [55]. Beamline U11 at the NSLS employs a modified Wadsworth monochromator, which combines a spherical, water-cooled mirror with a spherical grating [56].
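The benefit of removing reflections can be illustrated by treating the throughput of the optical train as the product of the reflectivities of its surfaces. The reflectivities below are hypothetical round numbers, and for simplicity the grating is given the same efficiency as a mirror; real coatings and gratings will differ.

```python
def reflective_throughput(reflectivity, n_reflections):
    """Fraction of light surviving n reflections of equal reflectivity."""
    return reflectivity ** n_reflections

# Hypothetical per-surface reflectivities (illustrative only).
assumed_reflectivity = {"near UV": 0.90, "vacuum UV": 0.60}

# Number of reflecting surfaces, counting the grating as one reflection.
layouts = {
    "collection mirror + Czerny-Turner (2 mirrors + grating)": 4,
    "folding mirror + focusing grating": 2,
}

for region, r in assumed_reflectivity.items():
    for name, n in layouts.items():
        print(f"{region:>9}, {name}: throughput {reflective_throughput(r, n):.2f}")
```

With these assumed values the penalty for two extra reflections is modest in the near UV but roughly a factor of three in the vacuum UV.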
The advent of aberration-corrected holographic gratings that can be formed on non-planar surfaces has greatly increased the available configurations. Figure 14 illustrates the orientation of the optical components in the SRCD beamlines at the Synchrotron Radiation Source, Daresbury, UK [57] and at the ISA ring at Aarhus University, Denmark [55], among others.

3.4.2 Wavelength Calibration

Wavelength standards for the far- and vacuum ultraviolet are surprisingly easy to obtain, and standards for the near UV and visible are well established. A simple approach with a CD instrument is to measure the high voltage applied to the photomultiplier by the servo system to maintain a constant time-average signal. Peaks in this spectrum correspond to absorption maxima, the wavelengths of which are known to high precision for easily obtainable materials. For the visible spectrum and longer wavelength UV, holmium oxide is a convenient standard. In the vacuum UV, the absorption spectrum of oxygen in the air can be used in those instruments with adequate resolving power. (The air should be in the sample compartment only; non-vacuum monochromators should always be purged with dry nitrogen to preserve optical surfaces. Replacing the air with pure oxygen is not necessary and is potentially dangerous.) Alternatively, the absorption of NH3 in a closed cell can be employed in roughly the same spectral range. Further into the vacuum UV, the absorption of nitrogen provides a convenient group of
Figure 14. Elevation view (not to scale) of a beamline for CD spectroscopy using synchrotron radiation from a storage ring. The vertical magnetic field of a bending magnet causes the electron beam to bend along a curved trajectory that lies in a horizontal plane, i.e., out of the plane of the page. A water-cooled mirror absorbs most of the X-rays and reflects the remaining UV beam vertically to an ellipsoidal or toroidal diffraction grating. The grating disperses and focuses the light in the vertical plane. A horizontal slit, located at the same height as the grating, selects the waveband transmitted to the sample compartment, which is predominantly polarized in the horizontal plane. An ultra-high vacuum (UHV) window made of LiF or CaF2 separates the UHV monochromator from the lower vacuum or atmospheric pressure of the sample compartment. (The walls of the ultra-high vacuum chamber are not shown in this figure.) There are only two reflections between the source of the light and the UHV window, and the window is not exposed to the damaging effects of the broad spectrum UV light that is reflected by the first mirror. In synchrotron storage rings with higher energy electrons, this simple design may be difficult to implement because of the increased need for radiation shielding, which usually requires a greater distance between the source and the grating. Using the electron beam as the entrance slit was greatly facilitated by the advent of electronic feedback systems that keep the orbit of the electron beam fixed in space, because a vertical displacement of the electron beam results in an unnoticed shift in the wavelength calibration of the monochromator.
wavelength standards. These nitrogen absorption peaks lie at wavelengths less than 145 nm, and thus do not interfere with CD measurements for samples in aqueous solution in an N2 purged sample compartment. They may become an issue when films are studied. Calibration data for the monochromator at beamline U11 at the NSLS is shown in Figure 15 and the literature values for each peak are listed in Table 1.

Table 1. Wavelengths (nm) used to calibrate the Wadsworth monochromator at NSLS beamline U11, as shown in Figure 15. The holmium oxide standard was in a sealed quartz cuvette containing holmium oxide (4%) in perchloric acid (10%) (Starna Scientific Limited, Essex, UK).

Nitrogen [58]   Oxygen [59]   Holmium Oxide [60]
145.0           194.7         287.19
141.6           192.4         278.13
138.4           190.3         249.87
135.4           188.2         241.12
132.5           186.4
129.8           184.7
127.3           183.1
124.9
[Figure 15 plot: photomultiplier (PM) voltage and wavelength error (nm) versus wavelength (nm), 120 to 300 nm, with regions labelled N2, O2 and Ho2O3.]
Figure 15. Wavelength calibration of the monochromator at beamline U11 at the NSLS in the far and vacuum UV. The voltage applied to the photomultiplier to maintain constant time-average current was recorded for three situations: (right) a cuvette containing holmium oxide; (center) no sample, but the sample compartment was not purged and thus contained oxygen; (left) the sample compartment was purged with dry nitrogen. Wavelength accuracy is better than ± 0.2 nm over the range of wavelengths employed in studies of proteins and nucleic acids. However, these data reveal a systematic trend, which should be correctable in software, thus further improving wavelength accuracy.
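A software correction of the kind suggested in the caption could, for instance, fit a straight line to the wavelength error as a function of the instrument reading and subtract it. In the sketch below the reference wavelengths are a subset of Table 1, but the instrument readings are hypothetical values invented for illustration, not the data of Figure 15; the linear error model and all function names are likewise assumptions.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Literature peak positions (nm), a subset of Table 1.
reference_nm = [145.0, 135.4, 124.9, 194.7, 188.2, 183.1, 287.19, 241.12]
# Hypothetical instrument readings (nm) for the same peaks; NOT measured data.
observed_nm = [145.10, 135.52, 125.04, 194.74, 188.25, 183.16, 287.11, 241.10]

errors_nm = [obs - ref for obs, ref in zip(observed_nm, reference_nm)]
slope, intercept = fit_line(observed_nm, errors_nm)

def corrected_wavelength(reading_nm):
    """Subtract the fitted systematic error from an instrument reading."""
    return reading_nm - (slope * reading_nm + intercept)

residuals_nm = [corrected_wavelength(o) - r for o, r in zip(observed_nm, reference_nm)]
print(f"largest raw error:       {max(abs(e) for e in errors_nm):.3f} nm")
print(f"largest corrected error: {max(abs(e) for e in residuals_nm):.3f} nm")
```

A higher-order model could be substituted if the measured trend were clearly nonlinear; a straight line is used here only because it is the simplest case.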
3.5 Polarizers

Producing linearly polarized light is necessary for all of the experiments that use a photoelastic modulator. This discussion is limited to those types of polarizer that have been used in dichrometers operating in the visible and ultraviolet. The list is surprisingly long. In some dichrometers, the polarizing elements are combined with other components; an example is the birefringent prisms used in the double monochromators of instruments from some commercial vendors.

3.5.1 Intrinsically Polarized Source

Conceptually, the simplest approach is to use a source that generates linearly polarized light. Synchrotron radiation generated exactly in the orbital plane of a storage ring is linearly polarized and was used to measure MCD at the Tantalus storage ring in Wisconsin [25] and CD at the Synchrotron Radiation Source at the Daresbury Laboratory in the UK [57] and other facilities. In the X-ray region, essentially all of the radiation is linearly polarized, but as the wavelength of the light increases, the degree of linear polarization emitted out of plane decreases. Thus, for UV radiation with wavelengths greater than 100 nm, care must be exercised to reject light that is even slightly (more than a few milliradians) out of plane [61].

3.5.2 Dichroic Polarizers

The best known dichroic polarizers are the various forms of Polaroid films. They consist of absorbing molecules embedded in a plastic matrix that are oriented by stretching in a single direction during manufacture. These polarizers are inexpensive, have a wide acceptance angle and produce a degree of polarization
adequate for the measurement of CD. Unfortunately, their spectral range is limited by the absorption spectrum of the embedded molecules at long wavelengths and the absorption of the plastic matrix at short wavelengths. Their opacity for wavelengths less than about 300 nm greatly reduces their usefulness in molecular biophysics. Photoelastic modulators from Hinds Instruments traditionally ship with Polaroid polarizers mounted on both sides of the optical element.

3.5.3 Birefringent Polarizers

Birefringent crystals such as calcite, BBO, quartz and MgF2 exhibit different indices of refraction along different axes, and can thus separate a beam of light into two orthogonally polarized beams separated spatially. Calcite, a particular structural form, or polymorph, of CaCO3, and BBO (α-BaB2O4) have large differences in refractive index that permit larger acceptance angles and can be used in Glan-Taylor and Glan-Thompson configurations, in which one polarized beam passes straight through the polarizer and the other is deflected by total internal reflection and is either absorbed or passes out the side of the polarizer housing, where it can be used to monitor beam intensity. BBO Glan-Taylor polarizers can be used down to about 200 nm. Glan-Taylor polarizers have been used in multi-function fluorometer-dichrometers that have monochromators with relatively low focal ratios [2]. The chief disadvantage of Glan polarizers is that they limit the ability to measure CD in the peptide absorption region. Quartz and MgF2 provide smaller differences in refractive index, but can be used in Rochon and Wollaston geometries to shorter wavelengths. Both beams emerge from the same face of the polarizer in these devices, so care must be taken to ensure that only one beam reaches the detector. Rochon polarizers are usually preferred because one beam passes straight through the polarizer so the optical system remains axial, although Wollaston polarizers give twice the angular separation.
3.5.4 Reflection Polarizers

Pile-of-plates polarizers are used in some commercial dichrometers to enhance the degree of linear polarization of light from a double prism monochromator with birefringent prisms. They use the fact that light polarized perpendicular to the plane of incidence at Brewster’s angle is partially reflected, but the parallel polarized component is not reflected. Making several such reflections can thus progressively eliminate the partially reflected component, as shown schematically in Figure 16. The chevron pattern is necessary because passage through each plate displaces the transmitted beam slightly. The second set of plates further increases the polarization and also reverses the displacement. The commercial version uses thin quartz plates, but similar polarizers have been fabricated from LiF, which provides the shortest wavelength transmission available [62]. Eugene Stevens (nee Pysh) used a sheet of biotite, a naturally occurring material with a sheet-like structure similar to mica, as a polarizer in one of the first conventional source VUV dichrometers [53]. This requires only a single reflection, but the throughput is low and the resulting optical path is nonlinear. Horton et al. [63] described a polarizer that uses triple reflections from gold surfaces, which permits a linear optical system and operates to wavelengths less than the LiF cutoff, but throughput is low.
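The action of a pile-of-plates polarizer can be sketched with the Fresnel equations. The code below computes Brewster's angle and the reflectance of the perpendicular (s) component at that angle, then estimates the degree of polarization of the transmitted beam for initially unpolarized light. The refractive index of 1.5 is an assumed round value (the index of quartz varies with wavelength), and absorption and multiple reflections within the plates are ignored.

```python
import math

def brewster_angle_deg(n):
    """Brewster's angle (degrees) for light entering a medium of index n from air."""
    return math.degrees(math.atan(n))

def s_reflectance_at_brewster(n):
    """Fresnel power reflectance of the s component at Brewster's angle."""
    theta_i = math.atan(n)
    theta_t = math.asin(math.sin(theta_i) / n)  # Snell's law
    return (math.sin(theta_i - theta_t) / math.sin(theta_i + theta_t)) ** 2

def polarization_after_plates(n, n_plates):
    """Degree of polarization of initially unpolarized light after n_plates plates.

    Each plate contributes two surfaces. The p component is fully transmitted
    at Brewster's angle; multiple reflections and absorption are ignored.
    """
    t_s = (1.0 - s_reflectance_at_brewster(n)) ** (2 * n_plates)
    t_p = 1.0
    return (t_p - t_s) / (t_p + t_s)

N_ASSUMED = 1.5  # hypothetical refractive index, for illustration only
print(f"Brewster angle: {brewster_angle_deg(N_ASSUMED):.2f} degrees")
for plates in (1, 4, 8):
    print(f"{plates} plates: degree of polarization {polarization_after_plates(N_ASSUMED, plates):.3f}")
```

Because each surface removes only a fraction of the s component, several plates are needed before the transmitted beam is strongly polarized, which is consistent with the stacked arrangement described above.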
Figure 16. Polarizers used in CD instruments. Glan-Taylor polarizers can operate to about 200 nm. Rochon polarizers, along with double monochromators that use birefringent prisms, are usually employed for dichrometers designed to operate in the VUV.
3.6 Polarization Modulators

The first generation of dichrometers, beginning with Grosjean and Legrand [17], modulated the polarization of the photon beam with Pockels cells, which are voltage-controlled wave plates. Pockels cells are extraordinarily useful devices for some purposes and are used extensively in laser-driven experiments. Because they are not resonant devices, they can be used as very fast switches and the waveform can be tailored to the requirements of an experiment, e.g., a square wave. As modulators in the sort of spectrometer shown in Figure 1, however, they have significant limitations: very limited transmission in the vacuum ultraviolet, inability to operate in the near infrared, extremely limited angular acceptance and the requirement for high driving voltages. Drake [10] presented a detailed description of the construction and operation of these devices. As the use of CD grew during the 1960s, so did dissatisfaction with the Pockels cell. Several investigators explored alternative approaches [18, 64-66], leading to what we now call photoelastic modulators (PEMs). The approach developed by Jasperson and Schnatterly [65] and Kemp [18] rapidly became the modulator of choice in most dichrometers and many other types of optical instruments. All PEMs consist of an optical element for which the index of refraction is normally isotropic. The material can be amorphous quartz or isotropic crystals such as CaF2 and LiF. A stress applied along an axis perpendicular to the direction of a photon beam results in strains that change the refractive index parallel to the direction of the stress relative to the refractive indices in the perpendicular direction. When the stress is a compression, the refractive index is greater for light polarized parallel to the stress axis, while tension results in the opposite change.
This follows because compression reduces the distance between the atoms of the material in the direction parallel to the stress while increasing their separation in the orthogonal plane, as shown schematically in Figure 17. The waistline bulge can be demonstrated by placing a rectangular rubber eraser in a vice and is quantitatively described in a tensor analysis of the problem. Some of the early devices used piezo-electric transducers to drive the stress on the optical element. This provided control of the frequency of the modulator, but tended to produce undesirable residual strains. The PEM described by Kemp [18] bonded the optical element to a quartz crystal that serves as the frequency determining element in an oscillator circuit. The mechanical distortions of the quartz are coupled mechanically to the optical element which vibrates at their common resonant frequency. The joined components are held in position in the modulator housing by soft elastomer supports that isolate their vibrations. The PEMs manufactured by JASCO use a different mounting system [13] and include a heater to maintain the housing at a constant temperature slightly above ambient.
Figure 17. Optical element of a PEM in the orientation used in the spectrometer shown in Figure 1 showing the deformations that occur under compression and extension. The dotted lines for the stressed modulators show the shape of the unstressed element. Phase retardation occurs because of the difference in refractive index in the direction parallel and perpendicular to the stress axis. The stress-induced changes in shape are greatly exaggerated for purposes of illustration.
3.6.1 Calibration of the Phase Amplitude of a Photoelastic Modulator

The voltage supplied to the PEM control circuit required to produce half-wave (δ0 = π) retardation can be determined as shown in Figure 18 and is found to be a linear function of wavelength, as shown in Figure 19. However, a line fit to these data does not pass through the origin, contrary to what was originally indicated by Kemp [18] and former product literature. Rather, the extrapolation of the data crosses the wavelength axis at λ0, which is less than 100 nm and hence out of the usable range of the modulator. If a straight line given by Vπ = mπ λ + b is fit to the experimental data, then λ0 = –b/mπ and the voltage program needed to produce other values of δ0 is given by Vfπ = f mπ (λ – λ0), where f is the desired fraction of half-wave retardation. Modulators with fused silica or LiF optical elements exhibit half-wave retardation functions with different slopes and intercepts [67]. Operating a modulator in a vacuum environment changes the calibration function slightly compared to atmospheric pressure. The present generation of photoelastic modulators from Hinds Instruments has been engineered to incorporate the programming information described above while shielding the user from the complexities. However, low-level control is still possible and would be needed to achieve phase retardations other than half- and quarter-wave.

3.6.2 CD–LD Crosstalk

Attempts to measure CD when the sample exhibits LD, or FDCD when the fluorescence is also linearly polarized, can result in artifacts due to the linear signal, which should be observed at 2ω, “leaking” into the circular polarization signal at frequency ω. One plausible explanation is that this coupling is due to an additional static phase shift, δs, introduced by residual strain in the PEM or other optical element. This causes δ[t] = δs + δ0 sin ωt. Invoking the expressions for the sine and cosine for the sum of two angles leads to the expressions in Eqs.
(26) and (27). Proceeding with the derivation of the expression for the intensity of the light reaching the detector similar to Eqs. (5) and (10), results in Eqs. (28) and (29),
J.C. Sutherland / Measurement of Circular Dichroism and Related Spectroscopies
which demonstrate that, when δs is nonzero, CD also generates a signal at angular frequency 2ω while LD also generates a signal at angular frequency ω. Note that if either δs or ΔaLD is zero, the interfering signal vanishes and Eqn. (26) reverts to Eqn. (5). A corresponding result holds for Eqn. (29). LD interference in the measurement of CD is usually the greatest concern. Strain in optical elements downstream of the modulator, such as the cuvette windows and the window of the detector, is also capable of inducing false signals, but the PEM is particularly important because any static shift shows up in the argument of the sine or cosine function.
[Figure 18: plot of detector output (volts) versus time.]
Figure 18. Detector output versus time with an analyzer exactly orthogonal to the polarization of the incident beam. The “flat-top” pattern is observed in a stress-free modulator when the phase amplitude, δ0, is exactly π. The single-peak pattern (• • • •) is for a 10% lower value of δ0 while the double peaked curve (- - - -) is for a value 10% larger. This is a mathematical simulation of the patterns that are observed experimentally with a dc coupled oscilloscope monitoring the signal from the transmission detector.
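The three patterns described in the Figure 18 caption can be reproduced with a few lines of Python. This is a minimal sketch, not the simulation used for the figure: it assumes the standard result for an ideal, stress-free retarder at 45° between crossed polarizers, I/I0 = sin²(δ/2) with δ = δ0 sin ωt, which the text does not state explicitly. The function names are illustrative.

```python
import math

def crossed_polarizer_signal(delta0, n_points=200):
    """Normalized detector output over one PEM cycle.

    Assumption (not stated in the text): an ideal retarder at 45 degrees
    between crossed polarizers transmits I/I0 = sin^2(delta/2), with
    delta(t) = delta0 * sin(omega * t).
    """
    signal = []
    for k in range(n_points):
        phase = 2.0 * math.pi * k / n_points      # omega * t over one period
        delta = delta0 * math.sin(phase)
        signal.append(math.sin(delta / 2.0) ** 2)
    return signal

def count_peaks(signal):
    """Count strict local maxima, treating the trace as periodic."""
    n = len(signal)
    return sum(
        1 for i in range(n)
        if signal[i] > signal[i - 1] and signal[i] > signal[(i + 1) % n]
    )

flat_top = crossed_polarizer_signal(math.pi)           # delta0 exactly pi
single_peak = crossed_polarizer_signal(0.9 * math.pi)  # 10% low
double_peak = crossed_polarizer_signal(1.1 * math.pi)  # 10% high
```

Under this assumption the trace for δ0 = π just reaches full transmission with zero slope and zero curvature at the top, the 10% low setting never reaches it, and the 10% high setting overshoots and dips in the middle of each half cycle, which is the behaviour one would look for on the oscilloscope when adjusting the control voltage.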
[Figure 19: plot of control voltage for δ0 = π (volts) versus wavelength (nm).]
Figure 19. Programming voltage producing half-wave retardation versus wavelength for a quartz PEM in a conventional source spectrometer [2], two CaF2 photoelastic modulators operating on beamlines U9B and U11, and a LiF modulator on beamline U9B at the National Synchrotron Light Source. All modulators were operating at atmospheric pressure. Operating in a vacuum slightly changes the calibration of a modulator.
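The calibration procedure of Section 3.6.1 (fit Vπ = mπ λ + b, take λ0 = –b/mπ, then program Vfπ = f mπ (λ – λ0)) can be sketched as follows. The calibration points are synthetic, generated from an assumed slope and intercept purely for illustration; they are not the data of Figure 19, and the function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = m*x + b; returns (m, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    m = sxy / sxx
    return m, mean_y - m * mean_x

def control_voltage(wavelength_nm, fraction, m_pi, lambda0_nm):
    """V_f_pi = f * m_pi * (lambda - lambda0), as in Section 3.6.1."""
    return fraction * m_pi * (wavelength_nm - lambda0_nm)

# Synthetic half-wave calibration points (NOT measured data): generated from
# an assumed slope of 0.012 V/nm and an assumed lambda0 of 60 nm.
wavelengths_nm = [200.0, 300.0, 400.0, 500.0, 600.0]
v_half_wave = [0.012 * (w - 60.0) for w in wavelengths_nm]

m_pi, b = fit_line(wavelengths_nm, v_half_wave)
lambda0_nm = -b / m_pi                      # where the fitted line crosses V = 0

# Quarter-wave retardation is half of half-wave, so f = 0.5.
v_quarter_250 = control_voltage(250.0, 0.5, m_pi, lambda0_nm)
```

Using the intercept λ0 rather than assuming the line passes through the origin is what keeps the programmed retardation correct at short wavelengths, where (λ – λ0) and λ differ most in relative terms.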
\[ \sin\delta[t] = \sin\delta_s \cos[\delta_0 \sin\omega t] + \cos\delta_s \sin[\delta_0 \sin\omega t] \tag{26} \]

\[ \cos\delta[t] = \cos\delta_s \cos[\delta_0 \sin\omega t] - \sin\delta_s \sin[\delta_0 \sin\omega t] \tag{27} \]

\[ I[t] = I_0 e^{-a}\left(1 + \frac{\Delta a_{CD}\, J_0[\delta_0] \sin\delta_s}{2} + \Delta a_{CD}\, J_1[\delta_0] \cos\delta_s \sin\omega t + \Delta a_{CD}\, J_2[\delta_0] \sin\delta_s \cos 2\omega t\right) \tag{28} \]

\[ I[t] = I_0 e^{-a}\left(1 - \frac{\Delta a_{LD}\, J_0[\delta_0] \cos\delta_s}{2} + \Delta a_{LD}\, J_1[\delta_0] \sin\delta_s \sin\omega t - \Delta a_{LD}\, J_2[\delta_0] \cos\delta_s \cos 2\omega t\right) \tag{29} \]
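The harmonic content predicted by Eqs. (28) and (29) can be checked numerically. The sketch below assumes the small-signal intensities I/(I0 e^(–a)) = 1 + (ΔaCD/2) sin δ for CD and 1 – (ΔaLD/2) cos δ for LD, which are the forms that reproduce Eqs. (28) and (29) term by term; Eqs. (5) and (10) themselves are not shown in this section. The Bessel functions are evaluated from Bessel's integral so that only the standard library is needed. All names and parameter values are illustrative.

```python
import math

def bessel_j(order, x, steps=2000):
    """J_n(x) = (1/pi) * integral_0^pi cos(n*tau - x*sin(tau)) d tau (midpoint rule)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        tau = (k + 0.5) * h
        total += math.cos(order * tau - x * math.sin(tau))
    return total * h / math.pi

def harmonics(trace):
    """Return (dc, sin(wt) amplitude, cos(2wt) amplitude) for one full period."""
    n = len(trace)
    dc = sum(trace) / n
    sin1 = 2.0 / n * sum(v * math.sin(2.0 * math.pi * k / n) for k, v in enumerate(trace))
    cos2 = 2.0 / n * sum(v * math.cos(4.0 * math.pi * k / n) for k, v in enumerate(trace))
    return dc, sin1, cos2

def normalized_intensity(kind, delta_a, delta0, delta_s, n=4096):
    """I(t) / (I0 e^-a) with delta(t) = delta_s + delta0 * sin(wt); kind is 'CD' or 'LD'."""
    trace = []
    for k in range(n):
        delta = delta_s + delta0 * math.sin(2.0 * math.pi * k / n)
        if kind == "CD":
            trace.append(1.0 + 0.5 * delta_a * math.sin(delta))   # assumed CD form
        else:
            trace.append(1.0 - 0.5 * delta_a * math.cos(delta))   # assumed LD form
    return trace

# Illustrative values: pure LD sample, small static retardation in the PEM.
delta0, delta_s, delta_a = 1.8, 0.1, 1e-3
_, ld_leak_at_omega, _ = harmonics(normalized_intensity("LD", delta_a, delta0, delta_s))
predicted_leak = delta_a * bessel_j(1, delta0) * math.sin(delta_s)   # Eq. (29), sin(wt) term
```

The leak at ω scales with sin δs, so for a sample whose LD is much larger than its CD even a small static retardation can produce a false CD signal comparable to the true one.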
3.7 Sample Cells and Holders

Cylindrical cells (cuvettes) are preferred for most CD experiments because they tend to have lower residual strain. But cylindrical cells are awkward for detection of fluorescence emitted perpendicular to the incident beam. Thus rectangular cells with four sides designed as windows are appropriate for experiments involving fluorescence. For measurements at short wavelengths, very short pathlengths are usually required and dismountable cells have become popular, in part because of the difficulties inherent in cleaning conventional short-pathlength cells. Active control of sample temperature over much of the range where water is a liquid has become common. Cryostats can extend the range of accessible temperatures if required. Some specialized MCD studies routinely control sample temperatures to just above absolute zero.

3.7.1 Materials

For conventional source dichrometers, sample cells are usually made of amorphous silica, while for the shortest wavelength penetration, CaF2 has become the material of choice [68].

3.7.2 Calibration of Optical Path

Determination of the pathlength of a sample cell (cuvette), either fixed path or dismountable variable-path, becomes a critical issue for the cells required for short-wavelength CD. A 5 μm error in pathlength is of no concern for a cell with a nominal pathlength of one cm, but very important if the nominal pathlength is 10 μm. Thus, the pathlength of such cells should be measured. One approach is to measure the absorption spectrum of a concentrated solution in the short-pathlength cell and the absorption spectrum of an appropriately diluted solution in a longer-pathlength cell, where the percentage error in the path is much less. An important control is to show that the shape of the spectrum does not change, which is easily done by showing that the log of the absorption spectra can be scaled to overlap. Another approach that works with very short pathlength cells is based on the observation that the intensity of light transmitted through an empty cell can exhibit an interference pattern, as illustrated in Figure 20. Suppose that Δx is the distance between the inside of the front and rear windows of the cell. A peak in the intensity resulting from constructive interference, which corresponds to a trough in the apparent absorption, occurs at some wavelength λi when 2Δx = mi λi, where mi is an integer. Suppose we choose two such wavelengths, λ1 and λ2, such that λ2 > λ1 and define Δm12 = m1 – m2, which is a positive integer determined by counting the number of troughs that λ1 is displaced from λ2. Note that λ1 is located at trough number zero. Solving for Δx results in the expression shown in Eqn. (30).
\[ \Delta x = \frac{\Delta m_{12}\, \lambda_1 \lambda_2}{2(\lambda_2 - \lambda_1)} \tag{30} \]
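As a worked instance of Eqn. (30), the short sketch below evaluates the pathlength from the trough positions and trough count quoted in the caption of Figure 20 (λ1 = 401.25 nm, λ2 = 595.5 nm, Δm12 = 39). The function name is illustrative.

```python
def pathlength_nm(lambda1_nm, lambda2_nm, delta_m12):
    """Eqn. (30): cell pathlength from two interference troughs.

    lambda1_nm < lambda2_nm are trough wavelengths; delta_m12 is the number
    of troughs separating them (m1 - m2, a positive integer).
    """
    if lambda2_nm <= lambda1_nm:
        raise ValueError("lambda2 must be longer than lambda1")
    return delta_m12 * lambda1_nm * lambda2_nm / (2.0 * (lambda2_nm - lambda1_nm))

# Values quoted in the Figure 20 caption.
dx_nm = pathlength_nm(401.25, 595.5, 39)
excess_over_nominal = dx_nm / 20000.0 - 1.0    # nominal pathlength is 20 um
```

With these inputs the result rounds to 23,987 nm, about 20% above the nominal 20 μm, in agreement with the caption.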
Cell pathlength calibrations can be performed using a spectrophotometer or with the same dichrometer used to measure CD, as illustrated by the data in Figure 20. Using the dichrometer has the advantage that the position and size of the beam are exactly the same for the calibration and the measurement of the CD spectrum. In dichrometers not configured to measure absorption, the voltage applied to the photomultiplier can be used instead. From the interference pattern shown in Figure 20, the pathlength is found to be 24 μm, which is 20% larger than the nominal value provided by the manufacturer. This is an unacceptable error for quantitative analysis such as calculating the secondary structure of a protein. The scaled drawing on the right hand side of Figure 20 illustrates the challenge inherent in achieving tolerances on a micron scale for relatively large objects. Obtaining a good interference pattern requires that the spectral bandwidth of the light is less than the separation of adjacent peaks and troughs; that the cuvette is positioned exactly perpendicular to the incident beam; and that the longitudinal coherence of the light is greater than twice the cell pathlength. Obtaining an interference pattern also provides evidence that the front and rear faces of the cuvette are parallel. An expression for the separation of adjacent extrema in the interference pattern is obtained from Eqn. (30) by taking λ1λ2 = λ², λ2 – λ1 = Δλ and Δm = ½, leading to the expression shown in Eqn. (31), which is plotted in Figure 20. The nominal spectral bandwidth of the monochromator for the 0.1 mm slit width used to record this interference pattern was 0.17 nm, which is more than adequate. However, the absorbance was recorded every 0.5 nm, which provides only four points between adjacent extrema at the short wavelength end of the spectrum; more closely spaced data are desirable.
The decrease in the amplitude of the interference pattern as a function of wavelength is due to the finite longitudinal coherence length of the light beam. This phenomenon limits the maximum pathlength that can be determined using the interference method, but, along with the spectral bandwidth requirement, ensures that we do not have to be concerned with interference effects involving the exterior surfaces of the windows of the cell because the relevant values of Δx are 50 or 100 times larger, resulting in correspondingly smaller spectral bandwidth requirements. Interference effects are not observed when the CD and absorbance spectra of samples are recorded, both because the refractive index of the solution is closer to that of the window materials, hence reducing the intensity of reflected light, and because the monochromator is typically operated with a spectral bandwidth roughly ten times greater than that used to record an interference pattern.
[Figure 20: plot of apparent absorption and Δλ (nm) versus wavelength (nm), with troughs λ1 and λ2 marked; scale drawing of a 0.02 mm pathlength cell between 1 mm thick front and rear windows.]
Figure 20. Apparent absorption of the light traversing an empty cuvette with a nominal pathlength of 20 μm. Data were recorded at the CD end station on beamline U9B at the NSLS. The pseudo-absorption (pA) recorded in the absence of the cuvette was subtracted from the pA recorded with the empty cuvette in the beam to obtain the apparent absorption spectrum. From these data we find λ1 = 401.25 nm (by interpolation), λ2 = 595.5 nm and Δm12 = 39, so that Δx = 23,987 nm or 24 μm, a value 20% larger than the nominal pathlength. The diagram at the right hand side of this figure shows the relative size of a 20 μm pathlength cell in relation to 1 mm thick front and rear windows.
\[ \Delta\lambda = \frac{\lambda^2}{4\,\Delta x} \tag{31} \]
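Eqn. (31) makes it easy to check, before recording an interference pattern, whether the monochromator bandwidth and the wavelength step are adequate. The sketch below uses the pathlength from Figure 20 together with the 0.17 nm bandwidth and 0.5 nm step quoted in the text; the function name is illustrative.

```python
def extrema_spacing_nm(wavelength_nm, pathlength_nm):
    """Eqn. (31): separation between an adjacent peak and trough."""
    return wavelength_nm ** 2 / (4.0 * pathlength_nm)

PATHLENGTH_NM = 23987.0     # from Figure 20
BANDWIDTH_NM = 0.17         # spectral bandwidth quoted in the text
STEP_NM = 0.5               # wavelength increment quoted in the text

spacing_400 = extrema_spacing_nm(400.0, PATHLENGTH_NM)   # short-wavelength end
spacing_600 = extrema_spacing_nm(600.0, PATHLENGTH_NM)   # long-wavelength end

bandwidth_ok = BANDWIDTH_NM < spacing_400
steps_between_extrema = spacing_400 / STEP_NM
```

At 400 nm the extrema are separated by roughly 1.7 nm, so a 0.5 nm increment spans each peak-to-trough interval in only a little over three steps, consistent with the remark above that more closely spaced data are desirable.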
3.8 Detectors

Photomultipliers have been the detectors of choice throughout the visible and ultraviolet for CD and related experiments that record signals from only one wavelength or wavelength band at a time. They provide high sensitivity and low noise, and the sensitivity can be changed over four or five orders of magnitude by simply changing the voltage applied over the dynode chain between the photocathode and anode, thus simplifying the determination of the ac/dc ratio (section 3.10). End-window tubes have become ubiquitous among both commercial and bespoke instruments because their flat window reduces certain artifacts compared to side-window tubes. Most laboratory-based instruments use photomultipliers with synthetic silica windows, which have short-wavelength cutoffs similar to those of xenon arc envelopes and other components of such systems. SRCD instruments tend to use photomultipliers with CaF2, MgF2 or LiF end windows, which transmit progressively further into the VUV. MgF2 is birefringent and so the window should be cut so that the two crystal axes in the plane of the window have the same refractive indices. LiF has the best VUV penetration, but is the least robust of these materials. Some designs result in an exposed metal ring at the potential of the photocathode, which is problematic as arcing can occur under some conditions, potentially damaging the detector. One way of dealing with this hazard is to operate the photocathode at ground potential and ac-couple the signal from the anode, which is at a high positive potential, to the detection system [61]. Snyder and Rowe [25] used a phosphor screen in front of a conventional photomultiplier to detect the beam of synchrotron radiation. This standard VUV technique extends the range of detectable wavelengths below the LiF absorption limit, but is much less sensitive than using a transparent window because much of the fluorescence does not reach the detector. However, there is little point in extending the spectral range of a CD experiment using a PEM to wavelengths less than the LiF cutoff because the transmission of the PEM will determine the short-wavelength limit. A useful method of suppressing out-of-band light is to employ a photomultiplier that responds only to short-wavelength UV, thus suppressing interference from visible and longer-wavelength UV. This has the added benefit of suppressing “dark current”, the signal generated when no light is incident on the photocathode, because of the higher photoelectric work function characteristic of the materials used for such photocathodes. Dark current tends to increase as the detector becomes sensitive to longer-wavelength light that has lower energy per quantum.
Dark current also increases as the voltage applied across the dynode chain of the photomultiplier increases, possibly invalidating the use of a high-voltage servo circuit to hold the time-invariant output of the photomultiplier constant in a CD experiment. Dichrometers designed to operate in the far red and infrared, where detectors are usually noisier, have resorted to chopping (mechanically interrupting) the light beam, sometimes resulting in the need for three lock-in amplifiers in the CD detection system [69, 70]. Both of these instruments were designed to operate in the near infrared and used solid-state detectors rather than photomultipliers. One vendor has recently introduced a dichrometer for the visible and UV that also uses solid-state detectors, the ac/dc ratio issue having been rendered tractable by a variety of modern analog and digital electronics. A photomultiplier or other detector does not measure the absorption of a sample directly (as implied in some product literature), but rather it generates a current that is converted to a voltage that should be linearly proportional to the intensity of the light incident on the device. To obtain absorption, we compare the signal recorded at a given wavelength with the sample in the beam, which is proportional to the value of I in Eqn. (2), with the signal recorded with an appropriate “blank” or reference sample, which is proportional to the value of I0 in the same equation. However, anything that modifies the intensity reaching the detector will be recorded as absorption. This is the reason, for example, that the interference spectrum presented in Figure 20 is referred to as “apparent absorption”. A more frequently encountered form of apparent absorption results from light scattering, a phenomenon that may deflect the direction of an incident photon without actually absorbing it. If the angular deflection is sufficiently large, the photon will not reach the detector and will appear to have been absorbed.
If the scattering cross section is insensitive to the polarization of the light, this will not affect the measurement of CD because only photons reaching the detector are used to compute the AC/DC ratio. Unfortunately, scattering from samples that exhibit CD frequently has polarization-dependent cross sections [71, 72], which can lead to artifacts in a measured CD spectrum. Samples involving proteins embedded in membrane fragments are particularly vulnerable to scattering artifacts. A good indicator of the presence of scattering is the appearance of apparent absorption or CD at wavelengths longer than those that should be absorbed by a sample. Because most scattered light in the samples most frequently studied in molecular biophysics is deflected through a relatively small angle (the scattering is predominantly in the “forward direction”), scattering artifacts can be reduced by placing the detector close to the sample so that it subtends a larger solid angle with respect to the scattering source points within the sample. Reducing the sample-detector distance is particularly important when using CD to determine the secondary structure of proteins embedded in membrane fragments, and some vendors provide this option. However, there are disadvantages to short sample-detector distances that should be considered. Components such as thermal regulators or cryostats surrounding the sample may block access or interfere with the operation of the photomultiplier, as do, for example, the fringing magnetic fields in an MCD experiment. If the sample happens to be fluorescent, a larger fraction of the light reaching the detector will be due to fluorescence if the detector intercepts a larger solid angle. One approach to addressing these issues was to mount the detector on a translatable platform so that the position of the photocathode relative to the sample can be adjusted from as little as a centimeter to roughly half a meter [50]. At least one commercially available dichrometer provides this capability.
However, this arrangement is awkward for VUV CD, where the sample should be contained in a vacuum or nitrogen-purged environment of the smallest reasonable volume. Another approach is to almost completely surround the sample with a solution containing a highly fluorescent compound, i.e., a “fluoroscat” cell [73], or to use an integrating sphere. Neither has achieved widespread use, presumably due to reductions in sensitivity resulting from fewer photons reaching the detector. Thus, the optimum sample-detector distance may differ for different experiments and should be addressed for each situation.

3.9 Lock-in Amplifiers

The critical electronic signal processing component that makes CD and related measurements possible is a phase-sensitive detector, or “lock-in” amplifier. These devices permit the measurement of tiny ac signals of known frequency and phase (a needle) in the presence of much higher levels of random noise (the haystack). The critical operations that make such feats possible are diagrammed in Figure 21. Earlier generations of lock-ins were analog devices, but, like many contemporary signal processing devices, modern lock-ins make extensive use of digital signal processing techniques. As noted above, modern lock-ins also incorporate computer interfaces and ancillary digital and analog input and output ports so that they can serve as the heart of the data acquisition and control system for the entire spectrometer. Alternative approaches to recovering the AC signal of interest include the use of fast sample-and-hold circuits.
[Figure 21: block diagram with example waveforms. Signal path: signal in, capacitor, tuned filter & amplifier, multiplier, low-pass filter, signal out. Reference path: reference in, amplifier-limiter, multiplier.]
Figure 21. Schematic diagram of a phase-sensitive detector or “lock-in amplifier”. The signal from the photomultiplier consists of a time-average component plus both the time-varying signal of interest and significant random noise. Typically, the amplitude of the noise from a photomultiplier goes as the square root of the time-average signal, while the signal of interest is orders of magnitude smaller. A capacitor blocks the time-average component of the input signal and an amplifier tuned to the desired signal frequency rejects much of the random noise while increasing the amplitude of the signal. A separate reference channel accepts the signal from the PEM control and converts it to a square wave which alternates between some positive voltage and its negative value. These two signals are multiplied or “mixed”. The signal channel is, in effect, rectified synchronously. This rectified signal is sent through a low-pass filter which smooths it, producing an output that follows the amplitude of the time-varying signal of interest. Real lock-ins also have provisions for shifting the phase of the reference signal to ensure that it is synchronous with the desired modulated signal, as well as for changing the overall amplification to accommodate a range of signal levels.
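The chain of operations in Figure 21 can be imitated in software. The sketch below is a simplified illustration under stated assumptions, not a model of any particular instrument: the tuned amplifier stage is omitted, the capacitor is replaced by subtraction of the record mean, the low-pass filter is a plain average over the whole record, and the detector trace is synthetic. All names and parameter values are hypothetical.

```python
import math
import random

def simulate_detector(n_periods, samples_per_period, dc, amplitude, noise_rms, seed=0):
    """Synthetic trace: DC level + small sine at the PEM frequency + Gaussian noise.

    Also returns the square-wave reference (+1/-1), i.e. what the
    amplifier-limiter would produce from the PEM reference signal.
    """
    rng = random.Random(seed)
    trace, reference = [], []
    for k in range(n_periods * samples_per_period):
        phase = 2.0 * math.pi * (k + 0.5) / samples_per_period
        trace.append(dc + amplitude * math.sin(phase) + rng.gauss(0.0, noise_rms))
        reference.append(1.0 if math.sin(phase) >= 0.0 else -1.0)
    return trace, reference

def lock_in(trace, reference):
    """Recover the amplitude of the component synchronous with the reference."""
    mean = sum(trace) / len(trace)                                 # "capacitor": block DC
    mixed = [(v - mean) * r for v, r in zip(trace, reference)]     # multiplier
    rectified_mean = sum(mixed) / len(mixed)                       # low-pass (average)
    return rectified_mean * math.pi / 2.0      # mean of a rectified sine is 2/pi of its amplitude

if __name__ == "__main__":
    # A 0.001 amplitude signal under noise ten times larger, on a DC level of 1.
    trace, reference = simulate_detector(2000, 100, 1.0, 1e-3, 1e-2, seed=1)
    print("recovered amplitude:", lock_in(trace, reference))
```

Because the noise is uncorrelated with the reference, its contribution to the average falls as the square root of the number of samples, which is the software analogue of lengthening the time constant of the low-pass filter.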
3.10 AC/DC Ratio Measurements

The experiments to measure CD, LD, ORD, FDCD and fluorescence polarization anisotropy all require that the ratios of two electronic signals be calculated. From our vantage point after the electronics and computer revolutions of the late twentieth century, this would not appear to be a difficult task. Such was not the case in the late 1950s when the first modern CD instruments were being built [17]. The clever solution to the ratio problem was an analog circuit that controls the high voltage supplied to the photomultiplier, and hence the gain of the photomultiplier, so that the time-average current remains constant as the monochromator scans across the spectrum. Since the modulated or “AC” signal extracted by the lock-in amplifier experiences the same gain as the time-average or “DC” signal, the output of the lock-in is proportional to the desired ratio. In instruments with prism monochromators, this can be combined with a mechanism to open the monochromator slits as the wavelength decreases. This works because the dispersion of a prism monochromator increases with decreasing wavelength. Thus opening the slits can increase the intensity of the light reaching the sample while maintaining a near-constant spectral bandwidth. Other approaches were found for spectrometers operating in the infrared beyond the range of photomultipliers. Drake reviewed a variety of such methods and presented designs for some of the analog circuits that have been used to control the high voltage supplied to the photomultipliers [10]. Controlling photomultiplier gain is problematic in those experiments that involve fluorescence, as situations where no fluorescence is observed cause the control circuit to increase the voltage to the maximum permitted without generating an adequate DC signal. With modern, computer-controlled instruments, a solution to recording signals in emission mode is to hold the gain of the detector constant and record the AC and DC signals separately. Simple comparisons performed in software can then determine when a usable signal was recorded and compute the desired ratio.

3.10.1 DeSa Direct (Digital) Ratio Determination

An approach to extracting CD from the optical signals developed by Richard DeSa and marketed by Olis, Inc. deserves special mention. It has never been described in the refereed scientific literature (R. DeSa, personal communication) and certain aspects of the method alluded to in product literature seem to have created some uncertainty as to how the system functions. The configuration of optical components is similar to that shown in Figure 1, except that both the ordinary and the extraordinary beams emerging from a Rochon polarizer pass through the PEM. Both beams pass through the sample, positioned just after the PEM, and the transmitted intensities are recorded by separate transmission detectors. These detectors are positioned some distance behind the sample to permit adequate spatial separation of the two beams. Alternatively, the diverging beams may pass through two different samples, positioned just before the detectors, so that the CD of both samples can be recorded simultaneously. After emerging from the PEM, the polarization states of the two beams will be orthogonal; when one is left circularly polarized, the other will be right circularly polarized, etc. However, their intensities will usually be different, because light emerging from the monochromator usually is partially polarized. This configuration has led some to conclude, incorrectly, that in the direct subtraction approach, the CD is computed from measured differences between the two beams in somewhat the same manner as in a dual-beam spectrophotometer.
In fact, the CD is obtained independently for the two channels from the intensities recorded by the two detectors as described by Eqn. (7). The “direct subtraction” aspect of the process applies to the manner in which the vω/v0 ratio is obtained, which, in the DeSa instruments, also differs from other approaches. Instead of obtaining vω with a lock-in amplifier while using an analog control circuit to keep v0 constant, the voltage generated by each photomultiplier is digitized by a 16-bit analog-to-digital converter (ADC) located on an Input/Output card in a host computer. The conversion rate of the ADC is about 10 times the frequency of the PEM, but it is not synchronized with the PEM reference signal. Instead, the 16-bit ADC output is part of a 32-bit word read by the host computer. The remaining 16 bits are used to input data on the status of other aspects of the system, including the state of the reference signal from the PEM controller. The time-average voltage, v0, can be generated just by averaging the digital values read by the ADC. Properly combining the ADC data to obtain vω is performed by software in the host system. This is the “direct” or “digital” subtraction referred to in product literature. However, control circuits still vary the high voltage applied to each photomultiplier so that the ADCs always read values near the top of their input range to maintain adequate resolution. A 16-bit ADC has a resolution of 1 part in 2^16 = 65,536. Averaging probably extends the available resolution to better than one part in 10^5, thus determining the limiting precision of a CD measurement using this arrangement.
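The text notes that the actual algorithm for combining the ADC data has not been published, so the following is only one hypothetical way such a computation could be organized; it should not be read as a description of the Olis software. It assumes a 32-bit word whose low 16 bits hold the ADC code and whose bit 16 holds the state of the PEM reference (an invented layout), and it estimates vω from the difference between the mean readings in the two reference half cycles. The test data are synthetic.

```python
import math

ADC_BITS = 16
ADC_MAX = (1 << ADC_BITS) - 1          # 65535, i.e. 1 part in 2**16 resolution
REFERENCE_BIT = 1 << 16                # hypothetical: lowest status bit = PEM reference state

def pack_word(adc_code, reference_high):
    """Build a 32-bit word in the assumed layout."""
    return (REFERENCE_BIT if reference_high else 0) | (adc_code & ADC_MAX)

def unpack_word(word):
    """Split a 32-bit word into (ADC code, reference state)."""
    return word & ADC_MAX, bool(word & REFERENCE_BIT)

def ac_dc_ratio(words):
    """Estimate v_omega / v_0 from unsynchronized samples tagged with the reference state."""
    high, low = [], []
    for word in words:
        code, reference_high = unpack_word(word)
        (high if reference_high else low).append(code)
    v0 = (sum(high) + sum(low)) / (len(high) + len(low))            # time average
    half_difference = 0.5 * (sum(high) / len(high) - sum(low) / len(low))
    v_omega = half_difference * math.pi / 2.0    # mean of a half sine is 2/pi of its amplitude
    return v_omega / v0

def synthetic_words(n_samples, ratio, samples_per_cycle=10.37, level=0.9):
    """Synthetic ADC stream: about 10 samples per PEM cycle, not synchronized with it."""
    words = []
    for k in range(n_samples):
        s = math.sin(2.0 * math.pi * k / samples_per_cycle)
        code = round(level * ADC_MAX * (1.0 + ratio * s))
        words.append(pack_word(code, s >= 0.0))
    return words
```

In this sketch the averaging over many samples is what recovers a ratio of 10^-3 to a few percent even though each individual reading is quantized to one part in 65,536, which illustrates the remark above about averaging extending the available resolution.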
3.11 Simultaneous Absorption

Eqn. (17) shows that determining the absorption of a sample relative to an appropriate blank involves determining the pseudo-absorption of each, as defined in Eqn. (16). Because we are dealing with the differences of logarithms, we can replace both the gains and the intensities by parameters proportional to those quantities. Since we are effectively taking the ratio of two quantities at the same wavelength, the proportionality constants do not have to be constant in wavelength. The gain term is easily obtained using the fact that the logarithm of the gain of a photomultiplier is almost a linear function of the logarithm of the applied voltage, V. This relationship is easily mapped using the fact that the high-voltage servo circuit of a dichrometer keeps the time-average output of the photomultiplier constant. Eqn. (14) indicates that if the intensity incident on the photomultiplier decreases by some factor, e.g., 10, the gain must increase by the same factor. The incident beam can be attenuated either by inserting a filter of known absorption at a given wavelength or by adjusting the slit width to attenuate the beam. The latter requires the ability to disable the servo mechanism (“open the loop”) and control the high voltage manually while reading the output signal voltage. The procedure is repeated with a greater attenuation factor, e.g., 10, 100, 1000, … and the corresponding voltages required to maintain a constant detector signal are recorded. The resulting data points are fit to a second-order polynomial as shown in Eqn. (32). The coefficients, which are specific for a particular photomultiplier, along with the value of V recorded at each wavelength for a particular sample, are used to compute the gain term in the pseudo-absorption (Eqn. (16)). Data demonstrating the relationship described in Eqn. (32) have been published elsewhere [3, 61]. Arvinte et al. [8] describe how the structure and operation of a photomultiplier influence the relationship between gain and applied high voltage. However, they do not include the empirical quadratic term shown in Eqn. (32), which is necessary for accurate measurement of absorption over a range of wavelengths.

The second term in the definition of pseudo-absorption involves the intensity of the incident light (Eqn. (16)). For SRCD instruments, the current circulating in the storage ring at the time of the respective measurement works nicely provided there are no changes in the optical train such as a change in the width of a slit. Recording this or some other measure of the incident intensity is important for SRCD instruments because the current in most storage rings decreases slowly with time. Fortunately, most storage rings make available to their experimental stations a voltage or other signal proportional to ring current. Figure 3 shows how this signal can be sampled by the auxiliary analog inputs of the same lock-in amplifier that is the heart of the data acquisition system of most SRCD instruments. Other schemes for acquiring a suitable measure of the incident intensity could involve detection of the extraordinary beam from a Wollaston or Rochon polarizer (Figure 16) or reflecting a small fraction of the actual incident beam with a transparent plate set at an angle, typically 45 degrees, to the main beam, a standard arrangement in fluorometers. In both cases an auxiliary detector is required. The sampled-beam approach is particularly attractive for multi-mode instruments because the auxiliary detector can be a calibrated photodiode or employ a quantum screen and thus be useful for recording intensity-corrected fluorescence excitation spectra. However, correcting the fluorescence for the intensity of the exciting light is more demanding in that the measured parameter must be proportional to the exciting intensity and either the proportionality constant must be constant for all the wavelengths of interest, e.g., a quantum screen, or the proportionality constant must be known as a function of wavelength. Obtaining an actual measurement of the incident intensity can even be omitted if the radiance of the light source is deemed sufficiently stable over time, an approach used in at least one commercial dichrometer. The multimode spectrometer proposed by Arvinte et al. [8] also made this assumption.
[Figure 22: plot of log10 G (gain from 1 to 100k) versus applied voltage V (200 to 1200).]
Figure 22. Log10 G vs V for an Electron Tubes Model 9402 MgF2 window photomultiplier currently in use on beamline U11 at the NSLS. The solid squares are the experimentally determined values. The solid curve was determined by least-squares fitting of a second-degree polynomial to the experimental data. Small variations between individual photomultipliers, even though they are the same model from the same manufacturer, mean that each tube must be calibrated before use and periodically thereafter. In this figure and in Eqn. (32) we are ignoring the fact that for a grounded-anode photomultiplier, the voltages are actually negative with respect to ground.
\[ \log_{10} G[V] = c_0 + c_1 V + c_2 V^2 \tag{32} \]
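The calibration described in Section 3.11 amounts to a least-squares fit of Eqn. (32), after which the fitted polynomial converts a recorded high voltage into a gain term. The sketch below follows Eqn. (32) as printed (a polynomial in V) and uses synthetic calibration points generated from assumed coefficients; they are not the data of Figure 22. The helper that converts two servo voltages into an absorbance additionally assumes that the servo holds the detector output constant and that the incident intensity is the same for sample and blank. All names are illustrative.

```python
def fit_quadratic(volts, log_gain):
    """Least-squares fit of log10 G = c0 + c1*V + c2*V**2; returns (c0, c1, c2).

    V is rescaled to kilovolts internally to keep the normal equations
    well conditioned, then the coefficients are converted back to volts.
    """
    xs = [v / 1000.0 for v in volts]
    s = [sum(x ** p for x in xs) for p in range(5)]
    r = [sum(y * x ** p for x, y in zip(xs, log_gain)) for p in range(3)]
    a = [[s[i + j] for j in range(3)] + [r[i]] for i in range(3)]   # augmented matrix
    for col in range(3):                                            # Gaussian elimination
        pivot = max(range(col, 3), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, 3):
            factor = a[row][col] / a[col][col]
            for k in range(col, 4):
                a[row][k] -= factor * a[col][k]
    coeffs = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):                                             # back substitution
        acc = a[i][3] - sum(a[i][j] * coeffs[j] for j in range(i + 1, 3))
        coeffs[i] = acc / a[i][i]
    return coeffs[0], coeffs[1] / 1e3, coeffs[2] / 1e6

def log10_gain(v, c):
    """Evaluate Eqn. (32) for coefficients c = (c0, c1, c2)."""
    return c[0] + c[1] * v + c[2] * v * v

def absorbance_from_voltages(v_sample, v_blank, c):
    """With the servo holding the output constant, I_blank/I_sample = G_sample/G_blank."""
    return log10_gain(v_sample, c) - log10_gain(v_blank, c)

# Synthetic calibration (NOT measured data), from assumed coefficients.
ASSUMED = (-1.0, 8.0e-3, -2.5e-6)
cal_volts = [200.0, 400.0, 600.0, 800.0, 1000.0, 1200.0]
cal_log_gain = [log10_gain(v, ASSUMED) for v in cal_volts]
fitted = fit_quadratic(cal_volts, cal_log_gain)
```

Because only differences of log gain enter the absorbance, the constant c0 cancels, which is consistent with the statement that the gain may be replaced by any parameter proportional to it.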
3.12 Computer Systems

All modern dichrometers are coupled to a computer that both provides the user interface for operation of the instrument and acquires, displays and stores the spectral data. They can be classified as one of three different types based upon the degree of integration between the computer system and the dichrometer.

3.12.1 Weakly Coupled

Weakly coupled systems consist of a largely self-contained dichrometer coupled to a computer by some standard interface, such as serial (RS-232), GPIB (IEEE-488) or high-speed serial (USB). A program running in the computer sends high-level commands to the dichrometer. Dedicated circuits within the dichrometer perform the details of controlling the wavelength of the monochromator, the voltage applied to the photomultiplier and the width of the slits, and of recording spectral data from the signals received by the optical detector[s]. Such instruments are often the lineal descendants of the stand-alone instruments of earlier times. They tend to cost more to manufacture because they contain more subsystems designed specifically for the particular instrument. The significant advantage of this approach is that it permits great flexibility in the choice of the computer system; any combination of computer hardware and operating system that can run the control program and has the standard interface can do the job. Vendors can share the high-level commands required to operate the various functions of the instrument, without revealing the underlying details. Thus, users who are so inclined can write their own data collection programs, which facilitates integrating spectral data with laboratory-wide data storage schemes.

3.12.2 Strongly Coupled

In the 1980s it was realized that many of the dedicated electronic components of scientific instruments, including CD spectrophotometers, could be replaced by a combination of general-purpose interface circuits and instrument-specific software. This combination reduces the cost of development because software is easier to “build” and test than dedicated hardware. The cost of manufacturing an instrument is also reduced because general-purpose I/O boards are produced in larger quantities and hence are less expensive. These very positive aspects are tempered by some equally significant disadvantages, which have emerged over the past two decades. The principal disadvantage is that the useful lifetime of the instrument tends to be determined by the computer hardware and software because they have evolved, and presumably will continue to evolve, much faster than the more expensive optical components of the instruments. Computer components have finite lifetimes. Even if you can still find a replacement computer that uses an ISA bus, you may have difficulty finding the specialized I/O cards your instrument uses.
Upgrading to a later operating system is problematic because software drivers for the embedded I/O cards may not be available. The next model of the instrument from a manufacturer may use different I/O cards, or the control program may run on a different operating system. In either case, the new software may not be backward compatible, although the vendor will be delighted to sell you an entirely new instrument. Even sticking with the old computer system may be unworkable, as the cyber security polizei at some institutions may forbid computers with “unsupported” operating systems on the local area network. An additional problem is that vendors tend to be far less inclined to share their source code, as opposed to a high-level command set, with end users. Thus, customization may be more difficult.

3.12.3 Modular Systems

These are largely the province of laboratory-built instruments such as those found at most synchrotron light sources. For example, commercially available lock-in amplifiers like the one shown in Figure 3 can be interfaced to a control computer via one of the standard interfaces such as the GPIB (IEEE-488) and supply most of the I/O functions needed to control the instrument, except possibly for setting the wavelength of the monochromator. Optionally, the PEM controller may be interfaced to the computer with a standard interface independent of the lock-in. Developing the control software becomes an important part of building the instrument. Because the optics of such a custom instrument may outlive many
generations of computer and instrumentation technology, care should be taken to virtualize all aspects of the instrument. For example, it will be necessary to read the output of the lock-in amplifier at many points within the overall control program. Rather than build the specific command for a particular lock-in into each such instance, a function should be written to execute the command for a particular device. The function is called whenever the program needs to read the lock-in output. If a new lock-in is purchased, only the device-specific subroutine should have to be rewritten.

3.13 Magnets

Measurements of MCD and other field-induced spectra require that the sample be located in a magnetic field. Three types of magnets are available, with drastically different performance characteristics and impacts on the operation of the other components of the spectrometer.

Safety Note: All of the magnets used for MCD and related experiments generate intense fields that can exert strong forces on, and may rapidly accelerate, objects containing ferrous metals. They may also interfere with medical implants, the magnetic strips on various types of identification and credit cards, and computer data storage devices. All magnets must be treated with appropriate caution.

Superconducting Magnets offer fields up to around 10 T and thus provide the largest MCD signals. They are also the most disruptive to the operation of the instrument and the most difficult and expensive to operate and maintain. Photomultipliers are adversely affected by magnetic fields and PEM control circuits may also be affected. All ferromagnetic components must be carefully secured, and some mechanical slit control mechanisms can be “frozen” by the fringing fields of a superconducting magnet.
Ramping the field on and off to separate CD and MCD signals greatly increases the loss of liquid helium; however, instruments have been reported that rotate the magnet to reverse the direction of the magnetic field [50].

Electromagnets are much easier and less expensive to operate, and the field can be turned off or reversed electrically for separation of CD and MCD data. The return path for the magnetic flux provided by an iron yoke greatly diminishes fringing fields, reducing interference with other components. Compact units have been designed to fit in enlarged sample compartments of commercially available CD instruments. Fields exceeding 2 T have been achieved with large electromagnets, although maximum field strengths of about 1.5 T are more usual for the magnets typically incorporated into spectrometers [2, 50].

Until recently, the weaker magnetic fields that can be generated by permanent magnets have made them a marginal choice for MCD experiments. However, the availability of rare earth (e.g., neodymium-iron-boron: Nd2Fe14B) permanent magnets permits construction of compact and relatively inexpensive magnets for MCD that will fit on a typical sample platform. The field cannot be turned off but, just as with the superconducting magnet mentioned above, it can be reversed by rotating the magnet by half a turn. Fields around 1.5 T should be achievable and stray fields should be small. Such systems may bring basic MCD capabilities to the typical biophysics laboratory. Richard DeSa designed such a magnet system that produces a peak field of 1.4 T and is available for CD spectrometers from Olis and other vendors.
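The field-reversal strategy described above can be illustrated with a short sketch. It assumes that the natural CD is independent of the field direction whereas the MCD changes sign when the field is reversed, so that the half-sum and half-difference of spectra recorded with the two field directions give the CD and MCD, respectively. The function name and the numerical values are hypothetical and serve only as an illustration.

```python
def separate_cd_mcd(signal_field_plus, signal_field_minus):
    """Split spectra recorded at +B and -B into natural CD and MCD.

    Illustrative model (an assumption, not a vendor routine):
        signal(+B) = CD + MCD
        signal(-B) = CD - MCD
    Both inputs must be sampled at the same wavelengths.
    """
    if len(signal_field_plus) != len(signal_field_minus):
        raise ValueError("spectra must have the same number of points")
    pairs = list(zip(signal_field_plus, signal_field_minus))
    cd = [(plus + minus) / 2.0 for plus, minus in pairs]
    mcd = [(plus - minus) / 2.0 for plus, minus in pairs]
    return cd, mcd


# Hypothetical three-point spectra (arbitrary units) built from known parts.
true_cd = [1.0, -2.0, 0.5]
true_mcd = [0.25, 0.5, -1.0]
plus = [c + m for c, m in zip(true_cd, true_mcd)]
minus = [c - m for c, m in zip(true_cd, true_mcd)]

cd, mcd = separate_cd_mcd(plus, minus)
print(cd)   # recovered natural CD
print(mcd)  # recovered MCD
```

With a permanent or superconducting magnet that is reversed by rotation, the two inputs would correspond to the two orientations of the magnet.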
Magnetic fields can also be used to align samples for LD when many chromophores are rigidly attached to extended structures such as membranes. To accommodate such experiments, the magnet must be configured with optical access perpendicular to the direction of the field. All three types of magnets can be designed to provide side-view capabilities instead of, or in addition to, optical access parallel to the field.

3.14 Linear Dichroism

The renaissance we are experiencing in LD derives from the commercial availability of Couette cells that are easily installed in commercial dichrometers as well as laboratory-built instruments, require very modest sample volumes and have good transmission far into the UV. These are described in greater detail elsewhere in this volume [26]. The small sample size may require the addition of lenses on either side of the Couette cell shown in Figure 5 to focus the incident light beam into the sample and then on to the detector. Couette cells have the additional advantage of producing a uniform shear across the sample, unlike the parallel plate and capillary flow cells used previously [74, 75]. Other approaches to orienting chromophores for LD studies include formation of crystals, stroking a thin layer of a polymer-containing solution with a delicate brush [76], embedding molecules in a plastic matrix that is later stretched, and applying electric or magnetic fields. Electric fields are limited to transient observations because of complications from electrophoresis and electrochemistry, and are unsuitable for the modulation methods emphasized here. Magnetic field orientation of biological macromolecules can be useful in some very limited circumstances. A protein containing secondary structure such as alpha helices possesses a finite magnetic moment and thus is in a lower energy state when its magnetic moment is parallel to an applied field.
However, the interaction energy of a single protein is small compared to kT at room temperature, so the orientation of proteins in solution remains essentially isotropic. When proteins are embedded in a membrane, however, their magnetic dipole moments are effectively coupled, because the alpha helices are usually oriented perpendicular to the plane of the membrane. Thus a magnetic field can partially orient membrane fragments containing embedded proteins, and LD can be observed for the directions parallel and perpendicular to the field direction. For such an experiment using the spectrometer shown in Figure 1, the magnetic field would be directed along either the y or the z axis. Note that the CD, MCD and absorption of an orientable sample will be influenced by the application of a field parallel to the direction of the light (the x axis in Figure 1) because the distribution of the orientations of the transition dipoles of the chromophores is no longer homogeneous.

3.15 Detection of Fluorescence and Scattered Light

The fittings to mount an emission detector are available on several commercial dichrometers and are easily included in laboratory-built instruments. The minimal additional components usually consist of a lens to increase the solid angle of emitted radiation that reaches the detector, thus increasing signal strength. A cut-off filter in the emission optical train is necessary to prevent scattered light at the wavelength of the excitation beam from reaching the emission detector. This is doubly important in measurements of fluorescence polarization anisotropy because the scattered light will usually be linearly polarized. A useful refinement is to have multiple filters
with different cut-off wavelengths that can be positioned remotely. Alternatively, band-pass filters can be used to record scattered light to the exclusion of fluorescence when monitoring molecular aggregation. If fluorescence is weak or not present, no additional filter is needed to record the scattered light.
4. Simultaneous Measurements

There are two reasons why the ability to measure two or more parameters simultaneously is desirable. The first, and obvious, rationale is that it may be advantageous to acquire two or more distinct types of data during a single scan. Measuring the absorption and CD of a sample is one important example. The other rationale is that if we can demonstrate the ability to perform two (or more) measurements simultaneously, then we can switch between them seamlessly. This holds even if it would be rare to actually record both parameters at the same time. The combination of CD and fluorescence polarization anisotropy (using the method described in section 2.4) exemplifies this class because simultaneous measurement requires two detectors, two high-voltage power supplies and two lock-in amplifiers. Thus no reconfiguration of the instrument is required when the experiment changes. Of course, certain other combinations are either problematic or mutually exclusive; CD and LD, or LD and fluorescence polarization anisotropy, are examples. However, at beamline U11 at the NSLS, we routinely record the 100 kHz LD signal in addition to the 50 kHz CD signal as a quality assurance strategy, i.e. to check for the existence of an LD signal that might compromise the CD data.
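The simultaneous recording of the 50 kHz and 100 kHz components described above can be sketched in software as two digital lock-in demodulations applied to one detector signal. This is an illustration only, not the electronics used at U11: the sampling rate, signal amplitudes and function names are assumptions, and the synthetic detector signal places the CD-like term in phase with a sine reference at the modulator frequency and the LD-like term in phase with a cosine reference at twice that frequency.

```python
import math


def demodulate(samples, sample_rate_hz, ref_hz):
    """Return (in_phase, quadrature) amplitudes of `samples` at `ref_hz`.

    The record should span a whole number of reference periods so that
    the DC level and the other harmonic average to zero.
    """
    n = len(samples)
    in_phase = 0.0
    quadrature = 0.0
    for i, value in enumerate(samples):
        phase = 2.0 * math.pi * ref_hz * i / sample_rate_hz
        in_phase += value * math.sin(phase)
        quadrature += value * math.cos(phase)
    # The mean of sin^2 (or cos^2) is 1/2, hence the factor of 2.
    return 2.0 * in_phase / n, 2.0 * quadrature / n


# Assumed acquisition parameters (hypothetical, for illustration only).
SAMPLE_RATE_HZ = 2_000_000
PEM_HZ = 50_000
N_SAMPLES = 2_000            # 1 ms: 50 periods at 50 kHz, 100 at 100 kHz

# Synthetic detector signal: DC level plus small components at f and 2f.
DC_LEVEL = 1.0
AMP_F = 2.0e-4               # stands in for the CD term at 50 kHz
AMP_2F = 5.0e-5              # stands in for the LD term at 100 kHz
signal = []
for i in range(N_SAMPLES):
    t = i / SAMPLE_RATE_HZ
    signal.append(
        DC_LEVEL
        + AMP_F * math.sin(2.0 * math.pi * PEM_HZ * t)
        + AMP_2F * math.cos(2.0 * math.pi * 2 * PEM_HZ * t)
    )

cd_x, cd_y = demodulate(signal, SAMPLE_RATE_HZ, PEM_HZ)
ld_x, ld_y = demodulate(signal, SAMPLE_RATE_HZ, 2 * PEM_HZ)
print(f"f  component (in phase):   {cd_x:.3e}")
print(f"2f component (quadrature): {ld_y:.3e}")
```

In this picture, a non-zero component at twice the modulator frequency flags linear dichroism that could contaminate the CD signal, which is the quality check described above.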
5. Three Categories of Instruments

The preceding sections stressed the broad range of measurements that can be performed with a single beam dichrometer equipped with a photoelastic modulator. Building actual instruments involves numerous choices for the components, which depend on both the intended purpose and cost. While acknowledging the limitations inherent in any classification scheme, I have chosen to consider three classes of instruments.

Class 1: These are conventional source instruments with monochromators and other optical components that operate into the UV to less than 200 nm. Traditional dichrometers, some with added fluorescence capabilities [5], are in this group. They are usually more expensive than the Class 2 instruments and represent the vast majority of the instruments in use today. They employ xenon arc light sources and most have double prism or prism/grating monochromators with f/# ≥ 8, which provide excellent stray light rejection, but significantly lower throughput compared to the “faster” monochromators typical of Class 2. Typically they employ quartz polarizers, either in the prism or prisms of the monochromator, or as a separate Rochon polarizer, and amorphous quartz photoelastic modulators. Photomultipliers with amorphous quartz end windows and photocathodes sensitive to light spanning the UV and most of the visible are also typical, as they are designed as general-purpose instruments.

Class 2: These are conventional source instruments with optical components similar to those found in fluorometers. They perform best in the visible and longer wavelength UV and are the least expensive class of instruments. The
monochromators have lower f/# and hence source optics can collect a larger solid angle from the source and still couple efficiently into the monochromator. Thus, they use Glan-Thompson polarizers with larger acceptance angles. These instruments serve well as process monitors, in time-resolved (stopped flow) experiments and in experiments involving fluorescence. One of the early “multiparameter” spectrometers used low f/# monochromators and is thus a member of this class [2]. However, the lack of deep UV penetration and the inability to probe protein secondary structure effectively using CD are significant limitations in many biophysical environments.

Class 3: Synchrotron source instruments are optimized for the far and vacuum UV. Although assembled using commercially available components, all are custom built to a greater or lesser degree. Typically, they are designed and built by the scientists and engineers who operate the beamline, are based on ultra-high vacuum (UHV) monochromators, do not have fluorescence capabilities and are operated as user facilities. Windows, polarizers (when used), PEMs and sample cells are usually made of CaF2, MgF2 or LiF, as appropriate. The custom-built instruments that led the way in exploring VUV CD using H2 discharge conventional sources [49, 51-53, 77-79] shared some of the characteristics of the synchrotron-based instruments, but all have now been retired.
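The throughput difference between the f/# ≥ 8 monochromators of Class 1 and the “faster” monochromators of Class 2 can be estimated from simple geometry. The sketch below assumes a circular aperture, for which the acceptance cone has a half-angle θ with tan θ = 1/(2 f/#) and subtends a solid angle 2π(1 − cos θ). The f/4 value used for the Class 2 case is a hypothetical example, since no specific value is quoted above, and the function name is likewise illustrative.

```python
import math


def acceptance_solid_angle(f_number):
    """Solid angle (steradians) of the cone accepted at a given f/#.

    Assumes a circular aperture: tan(half_angle) = 1 / (2 * f_number).
    """
    half_angle = math.atan(1.0 / (2.0 * f_number))
    return 2.0 * math.pi * (1.0 - math.cos(half_angle))


slow = acceptance_solid_angle(8.0)   # f/8, the Class 1 limit quoted above
fast = acceptance_solid_angle(4.0)   # f/4, hypothetical Class 2 value
print(f"f/8: {slow:.4f} sr, f/4: {fast:.4f} sr, ratio: {fast / slow:.2f}")
```

Because the solid angle scales approximately as the inverse square of the f/#, halving the f/# increases the collected solid angle roughly fourfold.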
6. Resources

Table 2 lists vendors of CD and related instruments and components. Components with major applications not involving CD and related experiments, e.g., monochromators, photomultipliers, polarizers and lock-in amplifiers, are not included. Table 3 lists facilities for SRCD; most are operated as user facilities. Both lists are believed to be accurate as of late 2008.

Table 2. Vendors of CD spectrometers and related specialized components.

Vendor | URL | Products
Applied Photophysics, Ltd., Leatherhead, Surrey, UK | http://www.photophysics.com/ | CD spectrometers and accessories
AVIV Biomedical, Inc., Lakewood, New Jersey, USA | http://www.avivbiomedical.com/ | CD spectrometers and accessories
Bio-Logic SAS, Claix, France | http://www.bio-logic.info | CD systems and accessories, especially for rapid kinetics
Crystal Precision Optics, Rugby, Warwickshire, UK | http://www.crystal-optics.u-net.com/ | Couette cells for Linear Dichroism
Hinds Instruments, Inc., Hillsboro, Oregon, USA | http://www.hindsinstruments.com/ | Photoelastic modulators and accessories
JASCO, Inc. (offices world wide) | http://www.jascoinc.com/ | CD spectrometers and accessories
Olis, Inc., Bogart, Georgia, USA | http://www.olisweb.com/ | CD spectrometers and accessories
Table 3. Facilities for UV SRCD: operating, under development, being planned, proposed or recently retired as of December 2008.

Facility | Country | URL
Association of National Research Centers (development) | Germany | http://ankaweb.fzk.de/
Australian Synchrotron (proposed) | Australia | http://www.synchrotron.org.au/content.asp?Document_ID=494
Beijing Synchrotron Radiation Facility (operating) | China | http://www.ihep.ac.cn/bsrf/english/facility/html/VUV.htm
Berliner Elektronenspeicherring-Gesellschaft für Synchrotronstrahlung (operational) | Germany | http://www.srs.dl.ac.uk/VUV/CD/new12.html
Diamond Light Source (commissioned late 2008/early 2009) | UK | http://www.diamond.ac.uk/Beamlines/Beamlineplan/B23/default.htm
Hiroshima Synchrotron Radiation Center (operational) | Japan | http://www.hsrc.hiroshima-u.ac.jp/english/bl15.htm
Institute for Storage Ring Facilities (2 CD beamlines, operational) | Denmark | http://www.isa.au.dk/facilities/astrid/beamlines/uv1/uv1.html ; http://www.isa.au.dk/facilities/astrid/beamlines/cd1/cd1.html
National Synchrotron Light Source (2 CD beamlines, operational) | USA | http://www.nsls.bnl.gov/beamlines/beamline.asp?blid=U11 ; http://www.nsls.bnl.gov/beamlines/beamline.asp?blid=U9B
National Synchrotron Radiation Laboratory (operational) | China | http://www.nsrl.ustc.edu.cn/en/
National Synchrotron Radiation Research Center (operational) | Taiwan | http://140.110.203.42/bldoc/04BSNM.htm
SOLEIL (operational, multiexperimental beamline) | France | http://www.synchrotron-soleil.fr/portal/page/portal/Recherche/LignesLumiere/DISCO
Synchrotron Radiation Center (Aladdin) (MCD, multiexperiment beamline) | USA | http://www.src.wisc.edu/facility/list/port_051.pdf
Synchrotron Radiation Source, Daresbury Lab. (decommissioned in 2008) | UK | http://www.srs.dl.ac.uk/VUV/CD/new12.html
Acknowledgments

I thank John Trunk for recording the data shown in several of the figures and Ettore Castiglioni, Università di Brescia, for helpful comments. Preparation of this article was supported by East Carolina University and the Office of Biological and Environmental Research, U.S. Department of Energy, which also supports operation of beamlines U9B and U11 at the National Synchrotron Light Source at Brookhaven National Laboratory. The NSLS is supported by the Office of Basic Energy Sciences, USDOE.
References

[1] J.C. Kemp, Basic laboratory setup for various measurements possible with the photoelastic modulator, Application Note, Hinds Instruments, Inc., Hillsboro, Oregon (1975).
[2] J.C. Sutherland, G.D. Cimino, and J.T. Lowe, An emission and polarization spectrometer for biophysical spectroscopy, Rev. Sci. Instrum. 47 (1976) 358-360.
[3] J.C. Sutherland, P.C. Keck, K.P. Griffin, and P.Z. Takacs, Simultaneous measurement of absorption and circular dichroism in a synchrotron spectrometer, Nuc. Instr. Meth. 195 (1982) 375-379.
[4] G. Ramsay and M. Eftink, A multidimensional spectrophotometer for monitoring thermal unfolding transitions of macromolecules, Biophys. J. 66 (1994) 516-523.
[5] G. Ramsay, R. Ionescu, and M. Eftink, Modified spectrophotometer for multi-dimensional circular dichroism/fluorescence data acquisition in titration experiments. Application to the pH and guanidine-HCl induced unfolding of apomyoglobin, Biophys. J. 69 (1995) 701-707.
[6] D.H. Turner, I. Tinoco, and M. Maestre, Fluorescence detected circular dichroism, J. Am. Chem. Soc. 96 (1975) 4340-4342.
[7] D. Canet, K. Doering, C.M. Dobson, and Y. Dupont, High sensitivity fluorescence anisotropy detection of protein folding events: Application to alpha-lactalbumin, Biophys. J. 80 (2001) 99037-.
[8] T. Arvinte, T.T. Bui, A.A. Dahab, B. Demeule, A.F. Drake, D. Elhag, and P. King, The multi-mode polarization modulation spectrometer. Part 1: Simultaneous detection of absorption, turbidity, and optical activity, Anal. Biochem. 332 (2004) 46-57.
[9] L.D. Barron, Molecular Light Scattering and Optical Activity, Second edition, Cambridge University Press (2004).
[10] A.F. Drake, Polarisation modulation - the measurement of linear and circular dichroism, J. App. Phys. E 19 (1986) 170-181.
[11] W.C. Johnson, Jr., Protein secondary structure and circular dichroism: A practical guide, Proteins: Struct. Funct. Genet. 7 (1990) 205-214.
[12] A. Rodger and B. Nordén, Circular Dichroism and Linear Dichroism, Oxford University Press, Oxford (1997).
[13] Y. Shindo, Applications of polarization modulation technique in polymer science, Optic. Eng. 34 (1995) 3369-3384.
[14] J.E. Wampler and R.J. DeSa, Recording polarization of fluorescence spectrometer - A unique application of piezoelectric birefringence modulation, Anal. Chem. 46 (1974) 563-567.
[15] A. Gray and G.B. Mathews, Bessel Functions and their Applications to Physics, MacMillan and Co., London (1895).
[16] C.F. Bohren and D.R. Huffman, Absorption and Scattering of Light by Small Particles, John Wiley and Sons, New York (1983).
[17] M. Grosjean and M. Legrand, Polarimetrie. Appareil de mesure du dichroïsme circulaire dans le visible et l'ultraviolet, Compt. Rend. 251 (1960) 2150-2153.
[18] J.C. Kemp, Piezo-optical birefringence modulators: New use for a long-known effect, J. Opt. Soc. Am. 59 (1969) 950-954.
[19] W.R. Mason, A Practical Guide to Magnetic Circular Dichroism Spectroscopy, Wiley, New York (2007).
[20] J.C. Sutherland, J.F. Duval, and K.P. Griffin, Magnetic circular dichroism of netropsin and natural circular dichroism of the netropsin-DNA complex, Biochemistry 17 (1978) 5088-5091.
[21] J.C. Sutherland and K. Griffin, Magnetic circular dichroism of adenine, hypoxanthine, and guanosine 5'-diphosphate to 180 nm, Biopolymers 23 (1984) 2715-2724.
[22] G. Barth, E. Bunnerberg, and C. Djerassi, Magnetic circular dichroism studies. XIX. The determination of the tryptophan:tyrosine ratio in proteins, Anal. Biochem. 48 (1972) 471-479.
[23] B. Holmquist and B.L. Vallee, Tryptophan quantitation by magnetic circular dichroism in native and modified proteins, Biochemistry 12 (1973) 4409-4417.
[24] J.C. Sutherland, I. Salmeen, A.S. Sun, and M.P. Klein, Ferredoxin, the uses of natural and magnetic circular dichroism in a multi-chromophore system, Biochim. Biophys. Acta 263 (1972) .
[25] P.A. Snyder and E.M. Rowe, The first use of synchrotron radiation for vacuum ultraviolet circular dichroism measurements, Nuc. Instr. Meth. 172 (1980) 345-349.
[26] A. Rodger, Linear Dichroism Techniques, in: Modern Techniques for Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy, B.A. Wallace and R.W. Janes, Editors, IOS Press, Amsterdam (2009).
[27] R. Marrington, T.R. Dafforn, D.J. Halsall, and A. Rodger, Micro-volume couette flow sample orientation for absorbance and fluorescence linear dichroism, Biophys. J. 87 (2004) 2002-12.
[28] R. Marrington, T.R. Dafforn, D.J. Halsall, J.I. MacDonald, M. Hicks, and A. Rodger, Validation of new microvolume Couette flow linear dichroism cells, Analyst 130 (2005) 1608-1616.
[29] A. Rodger, R. Marrington, M.A. Geeves, M. Hicks, L. de Alwis, D.J. Halsall, and T.R. Dafforn, Looking at long molecules in solution: what happens when they are subjected to Couette flow?, Phys. Chem. Chem. Phys. 8 (2006) 3161-3171.
[30] Y. Shindo, H. Hayakawa, and M. Sudani, The optical rotatory dispersion and linear dichroism option of the JASCO J-500 A-Type CD spectropolarimeter, App. Spectroscopy 43 (1989) 1471-1475.
[31] T.C. Oakberg, Polarimetry: Optical rotation, Application Note, Hinds Instruments, Inc., Hillsboro, Oregon (2005).
[32] T.C. Oakberg, Linear Birefringence and Optical Rotation, Applications Note, Hinds Instruments, Inc., Hillsboro, Oregon (2002).
[33] R. Mandel and G.D. Fasman, Thermal denaturation of DNA and DNA:polypeptide complexes. Simultaneous absorption and circular dichroism measurements, Biochem. Biophys. Res. Comm. 59 (1974) 672-9.
[34] K.W. Hipps and G.A. Crosby, Applications of the photoelastic modulator to polarization spectroscopy, J. Phys. Chem. 83 (1979) 555-562.
[35] J.C. Sutherland and H. Low, Fluorescence-detected magnetic circular dichroism of fluorescent and nonfluorescent molecules, Proc. Natl. Acad. Sci. U.S.A. 73 (1976) 276-280.
[36] J.R. Lakowicz, Principles of Fluorescence Spectroscopy, Second edition, Kluwer Academic Publishers, Inc., New York (1999).
[37] L.A. Kelly, J.G. Trunk, and J.C. Sutherland, Time resolved fluorescence polarization measurements for entire emission spectra with a resistive-anode, single-photon-counting detector: The fluorescence omnilyzer, Rev. Sci. Instrum. 68 (1997) 2279-2286.
[38] J.C. Sutherland, Simultaneous measurement of circular dichroism and fluorescence polarization anisotropy, in: Clinical Diagnostic Systems: Technologies and Instrumentation: SPIE (2002).
[39] K. Tanaka, G. Pescitelli, K. Nakanishi, and N. Berova, Fluorescence detected exciton coupled circular dichroism: Development of new fluorescent reporter groups for structural studies, Monatshefte für Chemie 59 (2005) 121-125.
[40] T. Nehira, K. Tanaka, T. Takakuwa, C. Ohshima, H. Masago, G. Pescitelli, A. Wada, and N. Berova, Development of a universal ellipsoidal mirror device for fluorescence detected circular dichroism: Elimination of polarization artifacts, App. Spectroscopy 59 (2005) 121-5.
[41] B.A. Wallace and C.L. Teeters, Differential absorption flattening optical effects are significant in the circular dichroism spectra of large membrane fragments, Biochemistry 26 (1987) 65-70.
[42] E. Castiglioni, S. Abbate, G. Longhi, and R. Gangemi, Wavelength shifts in solid-state circular dichroism spectra: A possible explanation, Chirality 19 (2007) 491-6.
[43] E. Castiglioni, S. Abbate, G. Longhi, R. Gangemi, R. Lauceri, and R. Purrello, Absorption flattening as one cause of distortion of circular dichroism spectra of Delta-RuPhen3.H2TPPS complex, Chirality 19 (2007) 642-646.
[44] E. Castiglioni, F. Lebon, G. Longhi, R. Gangemi, and S. Abbate, An operative approach to correct CD spectra distortions due to absorption flattening, Chirality 20 (2008) 1047-1052.
[45] J.F. James and R.S. Sternberg, The Design of Optical Spectrometers, Chapman and Hall, Ltd., London (1969).
[46] H.E. Johns and M. Rauth, Theory and design of high intensity UV monochromators for photobiology and photochemistry, Photochem. Photobiol. 4 (1965) 673-692.
[47] W.R. Hunter and S.A. Malo, The temperature dependence of the short wavelength transmittance limit of vacuum ultraviolet window materials. I. Experiment, J. Phys. Chem. Solids 30 (1969) 2739-2745.
[48] J.C. Sutherland, E.J. Desmond, and P.Z. Takacs, Versatile spectrometer for experiments using synchrotron radiation at wavelengths greater than 100 nm, Nuc. Instr. Meth. 172 (1980) 195-199.
[49] C.A. Bush, S. Ralapati, and A. Duben, Computer-controlled vacuum ultraviolet circular dichroism spectrometer with Fourier digital data smoothing, Anal. Chem. 53 (1981) 1140-1142.
[50] J.C. Sutherland, L.E. Vickery, and M.P. Klein, A spectrometer for the measurement of magnetic and natural circular dichroism, Rev. Sci. Instrum. 45 (1974) 1089-1093.
[51] O. Schnepp, E.F. Pearson, and E. Sharman, The measurement of circular dichroism in the vacuum ultraviolet, Rev. Sci. Instrum. 41 (1970) 1136-1141.
[52] W.C. Johnson, Jr., A circular dichroism spectrometer for the vacuum ultraviolet, Rev. Sci. Instrum. 42 (1971) 1283-1286.
[53] E. Pysh, Optical activity in the vacuum ultraviolet, Ann. Rev. Biophys. Bioeng. 5 (1974) 63-75.
[54] N. Ojima, K. Sakai, K. Matsuo, T. Matsui, T. Fukazawa, H. Namatame, M. Taniguchi, and K. Gekko, Vacuum-ultraviolet circular dichroism spectrophotometer using synchrotron radiation: Optical system and on-line performance, Chem. Lett. 30 (2001) 522-523.
[55] A.J. Miles, S.V. Hoffmann, Y. Tao, R.W. Janes, and B.A. Wallace, Synchrotron radiation circular dichroism (SRCD) spectroscopy: New beamlines and new applications in biology, Spectroscopy (2007) 245-255.
[56] M.R. Howells, Theory of a modified Wadsworth monochromator matched to a low energy storage ring source, Nuc. Instr. Meth. 195 (1982) 215-222.
[57] D.T. Clarke and G.R. Jones, CD12: A new high-flux beamline for vacuum-ultraviolet circular dichroism on the SRS at Daresbury, J. Synchrotron Rad. 11 (2004) 142-149.
[58] W.W. Watson and P.G. Koontz, Nitrogen molecular spectra in the vacuum ultraviolet, Physical Rev. 46 (1934) 32-37.
[59] K. Yoshino, J.R. Esmond, A.S.-C. Cheung, D.E. Freeman, and W.H. Parkinson, High resolution absorption cross sections in the transmission window region of the Schumann-Runge bands and Herzberg continuum of O2, Planetary Space Sci. 40 (1992) 185-192.
[60] J.C. Travis, J.C. Acosta, G. Andor, J. Bastie, P. Blattner, C.J. Chunnilall, S.C. Crosson, D.L. Duewer, E.A. Early, F. Hengstberger, C.-S. Kim, L. Liedquist, L.A.G. Monard, S. Nevas, A. Mito, M. Nilsson, M. Noel, A.C. Rodriguez, A. Ruiz, A. Schirmacher, M.V. Smith, G. Valencia, N. van Tonder, and J. Zwinkels, Intrinsic wavelength standard absorption bands in holmium oxide solution for UV/visible molecular absorption spectrophotometry, J. Phys. Chem. Ref. Data 34 (2005) 41-56.
[61] J.C. Sutherland, Circular dichroism using synchrotron radiation: from ultraviolet to X-rays, in: Circular Dichroism and the Conformational Analysis of Biomolecules, G.D. Fasman, Editor, Plenum Press, New York (1996).
[62] D.C. Hinson and J.R. Stevenson, A lithium fluoride pile of plates polarizer for the vacuum ultraviolet, J. Opt. Soc. Am. 56 (1966) 408.
[63] V.G. Horton, E.T. Arakawa, R.N. Hamm, and M.W. Williams, A triple reflection polarizer for use in the vacuum ultraviolet, App. Opt. 8 (1969) 667-670.
[64] M. Billardon and J. Badoz, Modulateur de biréfringence, Compt. Rend. Acad. Sci. Paris 262 (1966) 1672-1675.
[65] S.N. Jasperson and S.E. Schnatterly, An improved method for high reflectivity ellipsometry based on a new polarization modulation technique, Rev. Sci. Instrum. 40 (1969) 761-767.
[66] L.F. Mollenauer, D. Downie, H. Engstrom, and W.B. Grant, Stress plate optical modulator for circular dichroism measurements, App. Opt. 8 (1969) 661-665.
[67] T.C. Oakberg, J. Trunk, and J.C. Sutherland, Calibration of photoelastic modulators in the vacuum UV, Proc. SPIE 4133 (2000) 101-111.
[68] F. Wien and B.A. Wallace, Calcium fluoride micro cells for synchrotron radiation circular dichroism spectroscopy, App. Spectroscopy 59 (2005) 1109-1113.
[69] G.A. Osborne, J.C. Cheng, and P.J. Stephens, A near-infrared circular dichroism and magnetic circular dichroism instrument, Rev. Sci. Instrum. 44 (1973) 10-15.
[70] J.M. Olson, J.T. Trunk, and J.C. Sutherland, Circular dichroism of the 1300 nm band of oxidized reaction centers from Rhodopseudomonas viridis, Biochemistry 24 (1985) 4495-4497.
[71] I. Tinoco, W. Mickols, M.F. Maestre, and C. Bustamante, Circular intensity differential scattering, Ann. Rev. Biophys. Bioeng. 16 (1987) 319-349.
[72] C. Bustamante, I. Tinoco, and M.F. Maestre, Circular differential scattering can be an important part of the circular dichroism of macromolecules, Proc. Natl. Acad. Sci. U.S.A. 80 (1983) 3568-3572.
[73] B.P. Dorman and M.F. Maestre, Experimental differential light-scattering correction to the circular dichroism of bacteriophage T2, Proc. Natl. Acad. Sci. U.S.A. 70 (1973) 255-259.
[74] R.L. O'Brien, J.L. Allison, and F.E. Hahn, Evidence for intercalation of chloroquine into DNA, Biochim. Biophys. Acta 129 (1966) 622-624.
[75] L.S. Lerman, The structure of the DNA-acridine complex, Proc. Natl. Acad. Sci. U.S.A. 49 (1963) 94-102.
[76] R.B. Setlow and E.C. Pollard, Molecular Biophysics, Addison Wesley, Reading, MA (1962).
[77] S. Feinleib and F.A. Bovey, Vapour-phase vacuum-ultraviolet circular-dichroism spectrum of (+)-3-methylcyclopentanone, Chem. Comm. (1968) 978-979.
[78] S. Brahms and J. Brahms, Determination of protein secondary structure in solution by vacuum ultraviolet circular dichroism, J. Mol. Biol. 138 (1980) 149-178.
[79] S. Brahms, J. Brahms, G. Spach, and A. Brack, Identification of beta, beta-turns and unordered conformations in polypeptide chains by vacuum ultraviolet circular dichroism, Proc. Nat. Acad. Sci. U.S.A. 74 (1977) 3208-3212.
Modern Techniques for Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy B.A. Wallace and R.W. Janes (Eds.) IOS Press, 2009. © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-000-1-73
Calibration Techniques for Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy

Andrew J. Miles and B.A. Wallace
Department of Crystallography, Birkbeck College, University of London
Abstract. Good practice in basic research and industrial applications necessitates the accurate and regular calibration of circular dichroism instruments and synchrotron radiation circular dichroism beamlines for magnitude, optical rotation and wavelength. If spectra obtained in different laboratories are to be comparable and usable with standard reference datasets for secondary structure analyses, then instruments must be standardized, enabling cross-validation of the spectra produced. In this chapter calibration methods are discussed along with techniques for measuring other relevant parameters such as sample cell pathlengths and protein concentrations. Accurate knowledge of the values of these parameters is important for producing correct spectral magnitudes, which are, in turn, essential for correct secondary structural analyses.
1. Introduction

The circular dichroism (CD) signals from biomolecules are extremely small and therefore it is imperative that close attention be paid to experimental conditions in order to avoid artifacts that can easily be misinterpreted as accurate data. However, regardless of strict adherence to good practices and protocols, it is generally observed that any two instruments, be they conventional lab-based CD machines or synchrotron radiation circular dichroism (SRCD) beamlines, will often produce slightly different spectra for the same sample under comparable conditions. This is due to small disparities between the polarisation of the incident light from each spectropolarimeter, along with a host of variables that lie between the light source and the final output; these are most commonly manifest as differences in spectral magnitude or wavelength. Since secondary structure analysis programs are sensitive to peak magnitude and wavelength [1], it is essential to calibrate any instrument before collecting data. There are two issues: internal calibration (which adjusts the instrument to an internal standard), and cross-calibration against a defined set of parameters. In principle, if the former is correct then so too should be the latter; however, this is rarely the case. Cross-calibration of spectral magnitude and polarisation can be achieved using standards that give rise to peaks of known molar ellipticity values. The ratios between the experimentally-measured values and the established values of the standard can then be used to correct the sample spectra so that, in principle, the same results can be produced from any instrument. Similarly, standards with sharp peaks can be used to correct for shifts in the wavelength dimension.
A.J. Miles and B.A. Wallace / Calibration Techniques for CD and SRCD Spectroscopy
Once the spectrum is obtained, the spectral magnitude in standard units, either as mean residue ellipticity, [θ]MRE, or molar circular dichroism, Δε, can be calculated using Eqn. (1):

$$[\theta]_{MRE} = \frac{\theta \times 0.1 \times MRW}{c\,l} = 3298\,\Delta\varepsilon \qquad (1)$$

where θ is the measured ellipticity (in millidegrees) produced by the instrument, c is the concentration in mg/ml, l is the pathlength of the sample cell in cm, and MRW is the mean residue weight of the sample (usually around 110 Daltons), defined as:

$$MRW = \frac{MW}{n-1} \qquad (2)$$

where MW is the molecular mass of the protein (in Daltons) and n is the number of residues. The term n-1 refers to the number of peptide bonds present in the structure. It is obvious that not only is Δε dependent on accurate measurements of θ, but also on the correct determination of the sample concentration and cell pathlength. The latter is especially problematic if short pathlength optical cells (< 0.001 cm) are used since the pathlengths cited by the manufacturers may be in error by as much as 50% [2]. This chapter will explore methods for calibrating the magnitudes and wavelengths in conventional CD and synchrotron radiation circular dichroism instruments as well as techniques for measuring the pathlengths of sample cells and means of accurately determining protein concentrations.
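As an illustration of Eqns. (1) and (2), the conversion from raw ellipticity to standard units might be sketched in Python as follows. The function names and the example protein (153 residues, 17,000 Da, 1 mg/ml in a 0.01 cm cell) are hypothetical and are not taken from the chapter.

```python
def mean_residue_weight(mw_daltons, n_residues):
    """Eqn. (2): molecular mass divided by the number of peptide bonds."""
    return mw_daltons / (n_residues - 1)


def mean_residue_ellipticity(theta_mdeg, conc_mg_ml, path_cm, mrw):
    """Eqn. (1): mean residue ellipticity from theta in millidegrees,
    concentration in mg/ml and pathlength in cm."""
    return (theta_mdeg * 0.1 * mrw) / (conc_mg_ml * path_cm)


def delta_epsilon(theta_mre):
    """Eqn. (1): delta epsilon from the mean residue ellipticity."""
    return theta_mre / 3298.0


# Hypothetical example values (not from the chapter)
mrw = mean_residue_weight(17000.0, 153)
mre = mean_residue_ellipticity(-20.0, 1.0, 0.01, mrw)
print(mrw, mre, delta_epsilon(mre))
```

Because c and l appear in the denominator, any fractional error in concentration or pathlength carries straight through to the calculated magnitude, which is why the later sections on pathlength and concentration matter.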
2. Spectral Magnitude

2.1. Materials

The CD signals or spectral magnitude/optical rotation of CD instruments are typically calibrated at a single wavelength, 290.5 nm, using either D-(+)-10-camphorsulphonic acid (CSA) [3,4] or ammonium camphorsulphonate (ACS) [5]. The latter compound, while less hygroscopic and therefore easier to handle, is more expensive and less readily available. For accuracy, however, calibrations should be done at more than one wavelength [3,6] in order to cover the full wavelength range measured in a protein spectrum, and to demonstrate the linearity (or wavelength-dependence) of the instrument response. Other candidates that have been proposed as standards are the lower wavelength (192.5 nm) CSA peak [3] and pantolactone (PL) [7], which has a peak at 219 nm. Cobalt (III) tris-ethylenediamine (Co(en)), with a maximum at 490 nm [8], provides a good standard for the visible region (Figure 1). A multi-standard calibration method using a polynomial fit has previously been proposed [4] as a means of producing a highly accurate calibration. However, pantolactone is hygroscopic and has no convenient absorption peak for measuring the concentration, and Co(en) is not commercially available. Therefore this section will concentrate on the description of a two-point calibration using CSA, which we have found to be similarly accurate for UV wavelengths to the four-standard procedure. As a note, the procedure can also be applied if ACS is used instead of CSA.
2.2. Calibrating Instruments using CSA
Solid CSA (highest purity) can be obtained from suppliers such as Sigma-Aldrich and should be stored in a dry, dark place because it is light-sensitive and hygroscopic. Solutions should be made just prior to use and stored at 4 °C in the dark for no longer than two weeks. Because of its hygroscopic nature, after a solution has been prepared gravimetrically, its concentration should be confirmed from its A285 using the recently re-determined molar extinction coefficient of 34.6 M⁻¹ cm⁻¹ [12]. CSA has two CD peaks in the ultraviolet wavelength region: the one at 290.5 nm has an absolute Δε value of +2.36 [9]. The second, a negative peak at 191.5 nm, is at the lower wavelength limit for measurement by conventional CD and ORD instruments. The value of -4.72 has generally been accepted for this peak [3], which produces a ratio of the 191.5/290.5 peak magnitudes of 2.00. These values were used when calibrating the new CD reference dataset (SP175) [10] used for analysis by the DichroWeb online secondary structure analysis server [11]. Therefore, if that reference dataset is to be used for spectral analyses, the instrument should be cross-calibrated to those values accordingly.
Figure 1. CD Spectra of three standards used for calibrations of magnitude/optical rotation: CSA (thick line), Co(en) (thin line) and PL (dashed line). [Plot: Δε (M⁻¹ cm⁻¹) versus wavelength (nm), 170 to 570 nm.]
When using CSA as a standard, it is important that the concentration of the solution is accurately determined prior to making the CD calibration measurements. The optimal concentration for a two-point calibration was found to be ~6 mg/ml, where an extinction coefficient of 34.6 M⁻¹ cm⁻¹ [12] at 285 nm gives rise to an absorption peak of 0.89 when measured in a 1 cm pathlength cell. This is well within the region where concentration and absorption show a linear relationship, it is large enough for an accurate reading to be recorded, and deviations from the documented pathlengths for cells of this size are usually negligible. Then, it is necessary to consider conditions that enable accurate collection of both the high and low wavelength peaks using the same sample. The estimate of the best
concentration to use for the CD calibration was obtained as follows: the expected magnitude (in millidegrees) of a CD peak due to any small optically active moiety can be calculated from the concentration and the Δε at a given wavelength by the following equation, which follows from Eqn. (1):

$$\theta = \frac{32980\,\Delta\varepsilon\,c\,l}{MW} \qquad (3)$$
where c is the concentration in mg/ml, l is the cell pathlength in cm and MW is the molecular weight. At 6 mg/ml the CD spectrum of CSA (MW = 232.3), measured in a 0.01 cm cell, has a calculated positive peak of 20.10 mdeg at 290.5 nm and a negative peak at 192.5 nm of around -40.2 mdeg. This is sufficiently large for accurate measurements to be made using a cell with a pathlength that deviates minimally from that documented by the manufacturer. Note also that by using this protocol, the absorption of CSA at 191.5 nm is not so high as to cause distortion of the CD peak.

The CD spectrum of CSA is temperature-sensitive and must be measured at 25 °C if the above values for the CSA extinction coefficients are to be used. It is important that a baseline spectrum of water is also collected in the same cell, and that the "flat" region of the spectrum between the two peaks not be relied on for setting the zero point. With the above concentration and pathlength, the step size or interval used should be 0.5 nm to produce the appropriate spectral resolution to detect the peak accurately. When using step mode, the averaging time should be between 1 and 3 seconds, depending on the instrument. For instruments that perform continuous scans (e.g. Jasco) a scan speed of 50 nm/min is adequate. As discussed in the chapter on Good Practice by Kelly and Price, a scan rate that is too rapid will cause an apparent shift in peak position, and a useful guideline is:

scan rate × time constant < bandwidth < W/10

where W is the width at half the height of the spectral feature. Once the measurements of the standards have been made on a given instrument, a calibration curve can be calculated for use in correcting any spectra measured on that instrument.
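The concentration check at 285 nm and the expected CSA peak magnitudes can be sketched as below. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the factor 32980 is an assumption chosen so that θ is returned in millidegrees with c in mg/ml and l in cm, which is the combination that reproduces the 20.10 mdeg and ~40.2 mdeg values quoted in the text.

```python
CSA_MW = 232.3           # molecular weight of CSA, from the text
CSA_EPSILON_285 = 34.6   # M^-1 cm^-1 at 285 nm, from the text [12]


def csa_conc_mg_ml(a285, path_cm=1.0):
    """Beer-Lambert law: CSA concentration (mg/ml) from absorbance at 285 nm."""
    molar = a285 / (CSA_EPSILON_285 * path_cm)  # mol/l
    return molar * CSA_MW                       # g/l, numerically equal to mg/ml


def expected_theta_mdeg(delta_eps, conc_mg_ml, path_cm, mw):
    """Expected ellipticity in mdeg (assumed factor 32980; see lead-in)."""
    return 32980.0 * delta_eps * conc_mg_ml * path_cm / mw


# 6 mg/ml CSA in a 0.01 cm cell, with the literature delta epsilon values
peak_290 = expected_theta_mdeg(2.36, 6.0, 0.01, CSA_MW)
peak_191 = expected_theta_mdeg(-4.72, 6.0, 0.01, CSA_MW)
print(csa_conc_mg_ml(0.89), peak_290, peak_191)
```

In practice the concentration returned by `csa_conc_mg_ml` (rather than the gravimetric value) would be the one passed to `expected_theta_mdeg`, since CSA is hygroscopic.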
Conventional machines will remain stable for hundreds of lamp hours and may only need recalibrating if there is a modification to the optics or when the lamp is replaced; however, SRCD instruments should be calibrated following each beam injection to compensate for any change in the beam position. To produce the calibration curve, the following procedure is used. For each peak, the literature value of delta epsilon is divided by the delta epsilon value measured on the instrument. A second-order polynomial fit (in the case of 3 or more standards) or a linear fit (in the case of 2 standards) is then produced and extrapolated to cover the wavelength range of the protein spectrum. The resulting curve gives a correction factor, designated Rλ, as a function of wavelength. Multiplying each data point of any subsequent spectrum collected on that same instrument by its corresponding value of Rλ produces the cross-calibrated spectrum [4]. This method is facilitated by using the processing software CDTool [13] or it can be performed using spreadsheet software such as Microsoft Excel.
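A two-point version of this procedure can be sketched in a few lines of Python. The sketch assumes that Rλ is the literature value divided by the measured value, so that multiplying a measured spectrum by Rλ brings the standard back to its literature magnitude. The function names are hypothetical, and the measured values are those listed for instrument cCD1 in Table 1.

```python
def correction_ratios(standards):
    """Return (wavelength, R) pairs with R = literature / measured."""
    return [(wl, lit / meas) for wl, lit, meas in standards]


def linear_fit(points):
    """Straight line through exactly two (x, y) points: (slope, intercept)."""
    (x1, y1), (x2, y2) = points
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def calibrate(spectrum, slope, intercept):
    """Multiply each (wavelength, signal) point by R at that wavelength."""
    return [(wl, cd * (slope * wl + intercept)) for wl, cd in spectrum]


# (wavelength in nm, literature delta epsilon, delta epsilon measured on cCD1)
csa = [(191.5, -4.72, -5.26), (290.5, 2.36, 2.46)]
slope, intercept = linear_fit(correction_ratios(csa))

# Sanity check: calibrating the CSA peaks themselves recovers the literature values
print(calibrate([(191.5, -5.26), (290.5, 2.46)], slope, intercept))
```

With three or more standards the straight line would be replaced by a second-order polynomial fit, as described in the text.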
2.3. Spectral Calibration using the CDTool Software

When the CSA spectrum is opened in the CDTool software, the magnitude of the peaks at ~290.5 nm and ~191.5 nm can be determined by placing the cursor on each peak, which will result in the CD values being displayed in the bottom left hand corner of the window. For higher precision, the "zoom" function can be selected from the "Plot" menu at the top to enlarge the peak region. To create a calibration curve, "Calculate new calibration curve" is selected from the "Calibration" menu and the dialog box shown in Figure 2 (inset) will open. The experimental and theoretical values of the two peaks (see Table 1, in which the ellipticities have been converted to Δε values) can be entered and, upon closing the dialog box, the calibration curve will be displayed in the plot window, overlaying the original CSA spectrum (Figure 2). The spectrum or spectra to be calibrated are then read into the programme and the "Calibrate by polynomial" function is chosen from the "Calibration" menu, thus producing a calibrated spectrum that can be saved. The header information in the saved files will include the information on the files used for the calibration.
Figure 2. (inset) Dialogue box from CDTool software [13], where the theoretical and experimental values of the two CSA peaks are entered. (main figure) Screen shot of CDTool main window and menus showing the CSA spectrum (solid line) and calculated curve (dotted line) that will be applied to other spectra for calibrations.
2.4. Spectral Calibration using Spreadsheets

A spreadsheet can be used to create a calibration curve by plotting the Rλ values for the two CSA peaks against wavelength, and fitting a second order polynomial (in the case of 3 or more standards) or linear (in the case of 2 standards) trendline to the data points (Figure 3). Next, the protein CD data are pasted into the spreadsheet with the wavelength in the first column, the CD data in the second column and the formula for the trendline in the third column (in the example from Figure 3 this would be "=((0.0006*A1)+0.786)"). The fourth column contains the product of the data in the second column and the third column, "=(B1*C1)". A plot of wavelength (column 1) against calibrated data (column 4) will produce the calibrated spectrum.
Figure 3. CSA calibration curve produced by a spreadsheet-based procedure.
Table 1. Values of Δε for standard compounds measured on four CD and three SRCD instruments, compared to the literature values. The ratios of the CSA peaks are included in the bottom row (adapted from [4]).

                                      Instrument ID
Compound    Peak (nm)  Lit. value   cCD1    cCD2    cCD3    cCD4    SRCD1   SRCD2   SRCD3
CSA         191        -4.72 [3]    -5.26   -4.70   -4.76   -4.84   -5.21   -3.90   -4.35
PL          219        -4.90 [7]    -5.18   -4.80   -4.43   -4.90   -5.10   -       -4.39
CSA         290         2.36 [9]     2.46    2.40    2.43    2.42    2.37    1.99    2.18
Co(en)      490         1.89 [8]     1.78    1.94    1.89    1.77   -       -       -
CSA ratio              -2.00 [4]    -2.16   -1.96   -1.96   -2.00   -2.20   -1.90   -2.00
2.5. The Effects of Calibration on Spectra and Secondary Structure Analyses

Figure 4 (top) shows the spectra of the same sample of the protein subtilisin obtained under similar conditions on two different CD instruments. The differences between the two uncalibrated spectra are obvious between 208 and 220 nm. However, after being calibrated by their respective CSA spectra, they match perfectly (Figure 4 (bottom)). Table 2 shows the calculated secondary structures of the uncalibrated and calibrated subtilisin spectra. The calibrated spectrum produces results considerably closer to the secondary structure content of the crystal structure (Protein Data Bank code 1sca) as determined by the DSSP [15] algorithm.
Figure 4. Top: two spectra of the same sample of the protein subtilisin measured at two different SRCD beamlines. Bottom: the same two spectra after calibration using CDTool, as described in the text. [Both panels: θ (mdeg) versus wavelength (nm), 180 to 280 nm.]
Table 2. Secondary structure analysis of uncalibrated and calibrated spectra of subtilisin. Analyses were carried out using the DICHROWEB server [11] utilizing the CONTINLL [14] algorithm and dataset SP175 [10]. "X-ray" is the value obtained from the crystal structure PDB file using the DSSP [15] algorithm.

Type of Structure   Uncalibrated   Calibrated   X-ray
% Helix             33             30           30
% Sheet             13             15           17
3. Wavelength Calibration

Another type of instrument error can arise from misalignment or miscalibration of the instrument monochromator, which can lead to apparent shifts of wavelength that are manifest as differences in peak positions. These too will have a significant effect on empirical analyses of protein spectra. A good commercially-available wavelength standard is a sealed cuvette containing a solution of holmium oxide (4%) in perchloric acid (10%), which gives rise to a series of sharp peaks that can be seen in the HT signal over the wavelength range from 650 nm to 219 nm. However, this standard is expensive and a cheaper alternative is a certified holmium oxide filter (Figure 5). When using this, the instrument should be set to a bandwidth of 1 nm with a step size of 0.1 nm. Alternatively, if a spectral resolution of at least 0.1 nm can be achieved, benzene vapour, which is readily available in most laboratories, produces a useful series of sharp peaks between 270 nm and 230 nm (Figure 6) [16]. In this case, one drop of benzene is introduced into a 1 cm pathlength cell, which is then sealed, producing benzene vapour. In the VUV region, the sharp peaks from the nitrogen gas used for purging the instrument can be used as a standard.

As an indication of the level of variability that may be seen, Table 3 shows the peak positions (in nm), as measured on different instruments for both benzene vapour and a certified holmium oxide filter (Hellma), as well as the literature values for these peaks. The position of the positive π → π* exciton peak of the protein myoglobin is also included for comparison. Table 4 shows that the effects of wavelength errors on the calculated secondary structures can be significant [2]. If the peaks are found to be shifted from the standard values, the wavelength scale can be easily adjusted using the instrument software or, if the data have already been collected, the spectra can be adjusted by X-axis shifts in CDTool.
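For data that have already been collected, a constant X-axis shift of this kind could be estimated and applied along the following lines. This is only a sketch with hypothetical function names; it uses the holmium oxide filter peaks listed for instrument cCD1 in Table 3 and assumes that a single average offset is an acceptable approximation, even though the individual offsets in that example grow with wavelength.

```python
def mean_wavelength_offset(literature_nm, measured_nm):
    """Mean of (literature - measured) over matched peaks, in nm."""
    diffs = [lit - meas for lit, meas in zip(literature_nm, measured_nm)]
    return sum(diffs) / len(diffs)


def shift_spectrum(spectrum, offset_nm):
    """Apply a constant x-axis shift to (wavelength, signal) pairs."""
    return [(wl + offset_nm, y) for wl, y in spectrum]


# Holmium oxide filter peaks from Table 3: literature values and instrument cCD1
literature = [279.2, 360.9, 453.7]
ccd1 = [278.1, 359.4, 451.7]
offset = mean_wavelength_offset(literature, ccd1)

# Shift a (hypothetical) two-point spectrum recorded on the same instrument
print(offset, shift_spectrum([(278.1, 1.0), (359.4, 0.8)], offset))
```

Where the offsets clearly vary across the range, restricting the calibration to peaks near the region of interest (for example the benzene vapour peaks for far UV protein work) is likely to be the better choice.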
Figure 5. HT spectrum of a certified holmium oxide standard (Hellma) with the positions of the major peaks indicated by the arrows. [Plot: absorbance versus wavelength (nm), 250 to 650 nm; labelled peaks at 279.02, 360.9, 453.65, 536.39 and 637.58 nm.]
Figure 6. HT spectrum of benzene vapour (measured on beamline UV1 at ISA). [Plot: HT versus wavelength (nm), 240 to 270 nm; labelled peaks at 241.7, 253.0 and 260.1 nm.]
Table 3. Wavelength calibration study showing the peak positions (in nm) measured on different instruments for benzene vapour and a certified holmium oxide filter, compared with literature values. The peak position of the positive π → π* exciton peak of the protein myoglobin is also included for comparison (adapted from [2]).

Literature     Values measured for each instrument
value (nm)     cCD1    cCD2    cCD3    cCD4    SRCD1   SRCD2   SRCD3
Benzene vapour:
241.7          240.1   241.8   241.6   241.8   -       241.7   242.7
253.0          252.0   253.1   252.9   253.1   -       253.0   253.7
261.1          259.1   260.1   260.0   260.0   -       260.0   260.8
Holmium oxide filter:
279.2          278.1   279.2   279.0   280.0   279.1   279.3   -
360.9          359.4   361.6   361.0   362.0   360.7   -       -
453.7          451.7   454.0   454.0   455.0   -       -       -
Myoglobin:
               191.6   192.2   192.0   192.6   192.0   192.2   193.0
Table 4. The effect of (artificial) wavelength shifts on calculated secondary structure contents for several proteins. This table illustrates that even very small wavelength calibration errors can have significant effects on the analyses (adapted from [2]).

Protein          Wavelength shift   Calculated    Calculated    X-ray       X-ray
                 applied (nm)       helix (%)     sheet (%)     helix (%)   sheet (%)
Myoglobin          0                74            0             74          0
                  +1                77            0
                  -1                69            1
Serum albumin      0                72            1             72          0
                  +1                72            2
                  -1                70            1
Lysozyme           0                38            14            40          6
                  +1                39            15
                  -1                36            13
Ceruloplasmin      0                2             34            1           35
                  +1                5             36
                  -1                3             32
Avidin             0                1             50            7           50
                  +1                1             49
                  -1                1             48
4. Determination of Sample Cell Optical Pathlength

If protein concentrations of >1 mg/ml can be achieved, it may be desirable to use sample cells with optical pathlengths of less than 0.1 mm to obtain spectra with the lowest possible wavelength cut-off. This is because the cut-off limit in SRCD instruments (assuming that appropriate non-absorbing buffers and optically-transparent cells made of CaF2 are used) is ultimately determined by the very large absorbance of water, which has a peak maximum at ~168 nm. By using shorter pathlengths, the amount of water present in the beam, and hence its absorbance, is minimised. This enables the collection of lower wavelength data. Rectangular or circular demountable Suprasil quartz cells are sold with pathlengths as low as 0.005 cm, 0.002 cm and 0.001 cm and some manufacturers will produce customized pathlengths to order. Short pathlength, very low volume, calcium fluoride cells, with a better absorption profile than quartz cells, have been developed and are particularly useful for SRCD experiments or where minimal amounts of sample are available [17], since they can require as little as 1 microlitre of solution.

There are a number of issues that must be addressed when using short pathlength cells, however. Firstly, quartz and calcium fluoride crystals are inherently birefringent and although the crystals are cut on planes so that the phenomenon is minimized, a small
amount of distortion may occur as illustrated in Figure 7, where the baseline is altered significantly by rotating one plate of the cell by 45°. It follows that the two halves of a demountable cell must be oriented identically with respect to each other and the beam when measuring a sample spectrum and its baseline spectrum, to prevent the distortion of the processed spectrum illustrated by Figure 8.
Figure 7. Two baseline spectra acquired using the same sample in the same calcium fluoride demountable cell. The position of the cell used to obtain the first spectrum was then rotated by 45 degrees to obtain the second spectrum. [Plot: θ (mdeg) versus wavelength (nm), 175 to 275 nm.]
Figure 8. Spectra showing the consequences of using erroneous baselines. The same CD spectrum of the protein sample was used in both, except that the two baselines shown in Figure 7 were subtracted, respectively, from the sample spectrum, producing different net spectra. [Plot: θ (mdeg) versus wavelength (nm), 175 to 275 nm.]
Secondly, the cell pathlength cited by the manufacturer may not be accurate for short pathlength cells, with errors of up to 50% having been reported [2]. Obviously such errors in pathlength lead to errors in calculated spectral magnitudes. Accurate measurement of short cell pathlengths (<0.001 cm) can be achieved using the interference fringe method as follows. The cell holder and empty cell are placed in a UV/visible spectrophotometer and a transmission spectrum is acquired over the wavelength range from 800 nm to 500 nm with a step size of 0.2 nm and a bandwidth of 1 nm. A small proportion of the collimated monochromatic light passing through the empty cell will be retarded by double reflection from the internal surfaces of the cell, causing an interference pattern in the spectrum (Figure 9). The distance between consecutive peaks or fringes in the pattern is inversely proportional to the cell pathlength, and an accurate value can be obtained by choosing a high wavelength fringe designated W1 and a lower wavelength fringe designated W2, with 10 to 20 peaks between them, and applying Eqn. (4):

$$pl = \frac{n\,W_1 W_2}{2\,(W_1 - W_2) \times 1000} \qquad (4)$$

where pl is the pathlength in microns, W1 and W2 are the wavelengths of the fringes in nm and n is the number of fringes counted from W1 to W2 where, for W1, n = 0 [2]. In some spectrophotometers, it may be necessary to manipulate the cell so that it is absolutely perpendicular to the beam before a good fringe pattern with at least two transmission units between peaks and troughs can be obtained.
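The fringe calculation of Eqn. (4) can be sketched as a small function. The name `pathlength_microns` and the example fringe positions are hypothetical; the absolute wavelength difference is used so that the result is positive whichever fringe is labelled W1, and an empty (air-filled) cell is assumed, as in the procedure above.

```python
def pathlength_microns(w1_nm, w2_nm, n_fringes):
    """Pathlength (microns) of an empty cell from two interference fringes.

    w1_nm, w2_nm: wavelengths of the two chosen fringe maxima, in nm.
    n_fringes: number of fringes counted from one to the other,
               with the starting fringe counted as 0.
    """
    high, low = max(w1_nm, w2_nm), min(w1_nm, w2_nm)
    return n_fringes * high * low / (2.0 * (high - low) * 1000.0)


# Hypothetical example: maxima at 800 nm and 500 nm, 15 fringes apart
print(pathlength_microns(800.0, 500.0, 15))
```

Comparing the value returned with the nominal pathlength gives the correction to be used for l in Eqn. (1).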
Figure 9. Interference fringe pattern obtained by scanning an empty sample cell (with a nominal path length of 0.001 cm) using a UV/Visible spectrometer in the transmission mode [2]. [Plot: % transmission versus wavelength (nm), 500 to 800 nm, with fringes W1 and W2 marked.]
Thirdly, in order to ensure the reproducibility of the measurements, it is good practice to load demountable cells with the minimum sample volume necessary to fill the cell, but also to prevent a layer of liquid forming between the cell flanges, which can result in a significant change in optical pathlength and in the reproducibility of measurements. One good way to ensure reproducibility is to keep a list of the precise volume needed to fill each cell, and to adhere strictly to those values. To do this, it will be necessary to give each cell an identifying code, which can be scratched into the fritted glass region on the outside of the cell with a diamond pen. It can also be useful to make fiducial marks on the cell that can help with maintaining consistent orientations of the plates with respect to each other and with respect to the cell holder. It is useful to note the cell identity on the log sheets recorded for each experiment, so that the exact conditions can be reproduced at a later date, and also to provide a record of the cell pathlength used for when the data are processed and analysed.
5. Protein Concentration Determination

If the procedures described above are followed, the largest remaining source of error will reside in the protein concentration determination. A number of available methods for accurate protein concentration measurements have been described in the chapter on Good Practice by Kelly and Price. In our experience, for soluble proteins of known sequence, the preferred method is quantitative amino acid analysis (QAA). For SRCD, this is particularly advantageous as it only requires two or more samples containing only 300 picomoles of protein. It is unrealistic to examine the protein by SRCD using low volume (for instance 1 microlitre) cells and then use a protein concentration determination method that requires hundreds of microlitres of sample. Unfortunately QAA is performed in specialised laboratories, sometimes at a considerable cost. Furthermore, QAA is generally not possible for membrane proteins since samples containing detergents and phospholipids can damage the chromatography columns used for the analyses.

A more economical, if slightly less reliable, choice that can be used for both soluble proteins and membrane proteins in detergent micelles is to measure the absorbance at 280 nm using a microspectrophotometer such as the Nanodrop, which only requires 2 microlitres per measurement and gives accurate measurements even for samples with concentrations greater than 0.5 mg/ml. To determine the concentration using the A280 method in an ordinary spectrophotometer, the sample should ideally be diluted to give an absorbance of between 0.5 and 1.2 in a 1 cm cuvette. This may not be possible when only small amounts of protein are available; however, a well-calibrated UV spectrophotometer should give reasonably accurate values at much lower concentrations, and narrow aperture cells are available which reduce the sample volume required. Also, 1 mm and 0.2 mm pathlength cuvettes equipped with a series of mirrors to focus the incident beam onto a 2 μl drop have been designed for use with standard spectrophotometers and are currently available from Hellma UK Ltd.
Figure 10. Absorbance spectrum of myoglobin at a concentration of 0.7 mg/ml in a 1 cm pathlength cell. Unlike most (colourless) proteins, the spectrum is not zero above 300 nm. [Plot: absorbance versus wavelength (nm), 250 to 350 nm.]

Figure 11. A plot of the Log10 absorbance versus Log10 wavelength allows a linear fit (thick line) to the spectrum (thin line) above 310 nm that can be extrapolated to eliminate the contribution to absorbance at 280 nm by the porphyrin group, which has an absorbance peak at ~400 nm [18]. [Plot: Log10 absorbance versus Log10 wavelength (nm), 2.45 to 2.55.]
If the extinction coefficient at 280 nm is not accurately known for the protein of interest, the sequence can be entered into the ProtParam Tools website located at http://www.expasy.org/tools/protparam.html, where the value for the protein in water, with or without cysteine bridges, can be calculated [19]. Alternatively the extinction coefficient of the protein can be calculated manually from Eqn. (5):

$$\varepsilon_{protein} = N_{Tyr}\,\varepsilon_{Tyr} + N_{Trp}\,\varepsilon_{Trp} + N_{Cys}\,\varepsilon_{Cys} \qquad (5)$$

where, for proteins in water measured at 280 nm, εTyr, the extinction coefficient per tyrosine, is 1490 M⁻¹ cm⁻¹, εTrp, per tryptophan, is 5500 M⁻¹ cm⁻¹, and εCys, per cystine (disulphide bridge), is 125 M⁻¹ cm⁻¹. The use of the calculated extinction coefficients should be reasonably accurate for proteins containing tryptophans but may be prone to significant errors for samples which do not contain tryptophans, and indeed its use is not practical when that is the case [19].

A disadvantage of the A280 method is its limited utility for proteins which contain co-factors that absorb in the visible wavelength range, such as porphyrin (Figure 10), or for samples containing suspended particles that cause light scattering above 310 nm. The latter case may include micelles or membrane particles and is therefore a further hindrance to obtaining accurate data for membrane proteins. If the problem proves intractable, Wallace and Mao [18] have suggested plotting the log of the absorbance against the log of the wavelength in the non-absorbing regions above 310 nm and extrapolating the line into the absorbing regions (but not below ~280 nm) to determine the contribution to absorbance by light scattering at these wavelengths (as shown in Figures 10 and 11).
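Both the manual extinction coefficient calculation of Eqn. (5) and the log-log extrapolation can be sketched in Python. This is an illustration under stated assumptions rather than a published implementation: the function names are hypothetical, the value of 125 M⁻¹ cm⁻¹ per cystine is the one used by ProtParam and is assumed here, and the scattering contribution is assumed to follow a straight line in the log-log plot.

```python
import math

# M^-1 cm^-1 at 280 nm; the per-cystine value is an assumption (see lead-in)
EPS_TYR, EPS_TRP, EPS_CYSTINE = 1490.0, 5500.0, 125.0


def extinction_280(n_tyr, n_trp, n_cystine=0):
    """Eqn. (5): molar extinction coefficient of the protein at 280 nm."""
    return n_tyr * EPS_TYR + n_trp * EPS_TRP + n_cystine * EPS_CYSTINE


def conc_mg_ml(a280, epsilon, mw, path_cm=1.0):
    """Beer-Lambert law: protein concentration in mg/ml."""
    return a280 / (epsilon * path_cm) * mw


def scatter_at(target_nm, wavelengths_nm, absorbances):
    """Least-squares line through log10(A) versus log10(wavelength) for the
    non-absorbing region, extrapolated to target_nm (all A must be > 0)."""
    xs = [math.log10(w) for w in wavelengths_nm]
    ys = [math.log10(a) for a in absorbances]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 10 ** (slope * math.log10(target_nm) + intercept)
```

A scatter-corrected reading would then be the measured A280 minus `scatter_at(280.0, ...)`, with the fit restricted to points above 310 nm, and that corrected value passed to `conc_mg_ml`.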
6. Temperature Calibration

CD measurements are often obtained at room temperature, 4 °C or 25 °C. However, where temperature sensitivity is an issue or if the thermal stability of a protein is to be determined by raising the temperature until it denatures (i.e. a "thermal melt"), the temperature of the sample must be known with a high degree of accuracy. Instruments that are equipped with a Peltier system can be programmed to sustain a set temperature for any length of time. However, heat is not always efficiently conducted to or from the sample and, at the extremes, there is often a discrepancy between the set temperature and the sample temperature. To obtain accurate values it may be necessary to create a calibration curve (Figure 12), and one way that this can be done is by introducing a micro thermocouple into a cuvette containing a test sample. The measured temperature can be compared to the set temperature indicated by the CD instrument software, and a thermal calibration curve produced that can be applied to all temperature measurements that are made.
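One simple way of applying such a thermal calibration curve is linear interpolation between the measured calibration points, sketched below. The function name and the calibration values are hypothetical and are not the data of Figure 12.

```python
def actual_temperature(set_point, calibration):
    """Linearly interpolate the sample temperature for a given set point.

    calibration: list of (set point, measured sample temperature) pairs
    obtained with the thermocouple, in degrees C.
    """
    pts = sorted(calibration)
    if len(pts) < 2:
        raise ValueError("at least two calibration points are needed")
    if not pts[0][0] <= set_point <= pts[-1][0]:
        raise ValueError("set point outside the calibrated range")
    for (s0, t0), (s1, t1) in zip(pts, pts[1:]):
        if s0 <= set_point <= s1:
            return t0 + (t1 - t0) * (set_point - s0) / (s1 - s0)


# Hypothetical calibration points (set point, measured sample temperature)
points = [(10.0, 11.5), (20.0, 20.5), (40.0, 38.5), (60.0, 56.0), (80.0, 73.5)]
print(actual_temperature(50.0, points))
```

In a thermal melt, each set point on the temperature axis would be replaced by the interpolated sample temperature before the transition midpoint is determined.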
Figure 12. Calibration of the temperature controller at the CD12 SRCD beamline. The solid line passes through the set point temperatures while the dotted line passes through the temperature of the sample measured after 15 minutes equilibration at each point. [Plot: actual temperature (°C) versus set point temperature (°C), 0 to 100 °C.]
7. Summary There can be a significant variation in the spectra of the same protein sample measured on different instruments due to differences in instrument calibration [2,4,20]. This leads to inconsistencies in the calculated secondary structures derived from the spectra and it is therefore good-practice to calibrate CD and SRCD instruments regularly. Of greatest concern are disparities in CD magnitude/optical rotation. A test set of experiments by the authors revealed that even when the same person examined the same sample of the protein myoglobin in the same sample cell (to eliminate a number of systematic errors) very different spectra were produced using three different conventional CD and four different SRCD instruments. As a result, the calculated helical content of the protein ranged from 44 to 83%, as opposed to the crystallographic value of 75% [4]. A similarly large range of spectra was documented by the National Physical Laboratory when they sent identical samples to several laboratories around the U.K. [20]. These studies both emphasise the need for careful cross-calibration of CD instruments, both for research in academic labs, and for product quality control in biopharmaceutical industrial labs. Further errors can be introduced by using incorrect values for cell pathlength or protein concentration. For optical cells with pathlengths >0.01 cm, the values cited by the manufacturers are usually correct to within 1 or 2 μm which is sufficiently accurate, however deviations of this size become increasingly significant as the cell pathlength decreases (1 or 2 μm errors in a pathlength of 10 μm will produce very distorted magnitude values). Pathlengths of ≤0.001 cm can be accurately measured by interferometry using a basic UV/visible spectrophotometer. Careful sample loading that ensures there is no liquid between the flanges of demountable cells is also necessary to produce reproducible pathlengths. 
Both the present authors and the National Physical Laboratory report [20] concluded that users can also introduce reproducibility errors if they are not careful to load cells in the same orientation when measuring sample and baseline spectra, and both suggested that such errors can be reduced if loading of a sample and its cognate baseline is restricted to a single user.
A.J. Miles and B.A. Wallace / Calibration Techniques for CD and SRCD Spectroscopy
Accurate protein concentration determinations are critical for accurate calculations of spectral magnitudes. Quantitative amino acid analysis is the most accurate method for determining protein concentration; however, it may be restricted by unavailability, by the presence of detergents or lipids in the sample, or by cost. The next best method is probably the A280 measurement, although this may need to be calibrated since it can vary with buffer conditions and with whether the protein is in an aqueous solution. In any case, it is important not to use colorimetric assays or rely on gravimetric measurements, as these rarely produce accurate values.
Other peripheral measurements such as temperature also need to be calibrated, since the thermal conductance of cells and samples is not the same as that of the surrounding cell holder and components of the Peltier system. If accurate measurements of TM are to be produced by thermal melt experiments, then each data point must be corrected using the thermal controller calibration curve.
In summary, it is important that users undertake careful and regular calibration of both instruments and associated parameters, as these are used to process the spectral data, producing the spectra in delta epsilon or mean residue ellipticity units, which in turn will be used for secondary structure or other analyses.
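The temperature correction mentioned in this summary can be sketched in a few lines of Python. This is only an illustration: it assumes linear interpolation between calibration points, and the calibration pairs below are hypothetical placeholders, not values read from Figure 12. The function name `corrected_temperature` is likewise invented for this example.

```python
from bisect import bisect_right


def corrected_temperature(set_point_c, calibration):
    """Estimate the true sample temperature for a controller set point.

    calibration: list of (set_point_C, measured_sample_C) pairs, sorted by
    set point, taken from the instrument's own calibration curve.
    """
    if len(calibration) < 2:
        raise ValueError("need at least two calibration points")
    set_points = [p[0] for p in calibration]
    measured = [p[1] for p in calibration]
    if not set_points[0] <= set_point_c <= set_points[-1]:
        raise ValueError("set point lies outside the calibrated range")
    # Index of the calibration point just above the requested set point.
    hi = min(bisect_right(set_points, set_point_c), len(set_points) - 1)
    lo = hi - 1
    fraction = (set_point_c - set_points[lo]) / (set_points[hi] - set_points[lo])
    return measured[lo] + fraction * (measured[hi] - measured[lo])


# Hypothetical calibration data (illustrative only, not from the chapter).
EXAMPLE_CALIBRATION = [(20.0, 20.0), (40.0, 39.0), (60.0, 58.0), (80.0, 77.0)]

if __name__ == "__main__":
    for set_point in (20.0, 50.0, 80.0):
        sample = corrected_temperature(set_point, EXAMPLE_CALIBRATION)
        print(f"set point {set_point:.1f} C -> sample {sample:.2f} C")
```

Each temperature in a thermal melt would be passed through such a function before fitting, so that the fitted TM refers to the sample rather than to the controller.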
Acknowledgements
This work was supported by grants from the U.K. Biotechnology and Biological Sciences Research Council to BAW. Beamtime at the SRS, Daresbury was supported by a programme mode access grant to BAW and R.W. Janes (Queen Mary, University of London). Beamtime access at ISA was enabled by the European Community Research Infrastructure Action under the FP6.
References
[1] A.J. Miles, L. Whitmore and B.A. Wallace, Spectral magnitude effects on the analyses of secondary structure from circular dichroism spectroscopic data, Protein Sci. 14 (2005) 368-374.
[2] A.J. Miles, F. Wien, J.G. Lees and B.A. Wallace, Calibration and standardisation of synchrotron radiation circular dichroism and conventional circular dichroism spectrophotometers. Part 2: Factors affecting wavelength and ellipticity measurements, Spectroscopy 19 (2005) 43-51.
[3] G.C. Chen and J.T. Yang, Two-point calibration of circular dichrometer with d-10-camphorsulfonic acid, Anal. Lett. 10 (1977) 1195-1207.
[4] A.J. Miles, F. Wien, J.G. Lees, A. Rodger, R.W. Janes and B.A. Wallace, Calibration and standardisation of synchrotron radiation circular dichroism and conventional circular dichroism spectrophotometers, Spectroscopy 17 (2003) 653-661.
[5] T. Takakuwa, T. Konno and H. Meguro, A new standard substance for calibration of circular dichroism ammonium d-10-camphorsulfonate, Anal. Sci. 1 (1985) 215-218.
[6] K. Tuzimura, T. Konno, H. Meguro, M. Hatano, T. Murakimi, K. Kashiwabara, K. Saito, Y. Kondo and T.M. Suzuki, A critical study of the measurement and calibration of circular dichroism, Anal. Biochem. 81 (1977) 167-174.
[7] T. Konno, H. Meguro and K. Tuzimura, D-Pantolactone as a circular dichroism (CD) calibration, Anal. Biochem. 67 (1975) 226-232.
[8] A.J. McCaffery and S.F. Mason, The electronic spectra, optical rotatory power and absolute configuration of metal complexes: The dextro-tris (ethylenediamine) cobalt (III) ion, Mol. Phys. 6 (1963) 359-371.
[9] P.H. Schippers and H.P.J.M. Dekkers, Direct determination of absolute circular dichroism data and calibration of commercial instruments, Anal. Chem. 53 (1981) 778-782.
[10] J.G. Lees, F. Wien, A.J. Miles and B.A. Wallace, A reference database for circular dichroism spectroscopy covering fold and secondary structure space, Bioinformatics 22 (2006) 1955-1962.
[11] L. Whitmore and B.A. Wallace, DICHROWEB, An online server for protein secondary structure analyses from circular dichroism spectroscopic data, Nucleic Acids Res. 32 (2004) W668-673.
[12] A.J. Miles, F. Wien and B.A. Wallace, Redetermination of the extinction coefficient of camphor-βsulfonic acid, a calibration standard for circular dichroism spectroscopy, Anal. Biochem. 335 (2004) 338-339.
[13] J.G. Lees, B.R. Smith, F. Wien, A.J. Miles and B.A. Wallace, CDtool - an integrated software package for circular dichroism spectroscopic data processing, analysis, and archiving, Anal. Biochem. 332 (2004) 285-289.
[14] S.W. Provencher and J. Glockner, Estimation of globular protein secondary structure from circular dichroism, Biochemistry 20 (1981) 33-37.
[15] W. Kabsch and C. Sander, Dictionary of protein secondary structure: Pattern recognition of hydrogen-bonded and geometrical features, Biopolymers 22 (1983) 2577-2637.
[16] F.M. Garforth and C.K. Ingold, Excited states of benzene. Part II. Analysis of the first ultraviolet band system of the absorption spectrum of benzene, J. Chem. Soc. (1948) 417-427.
[17] F. Wien and B.A. Wallace, Calcium fluoride micro cells for synchrotron radiation circular dichroism spectroscopy, App. Spectroscopy 59 (2005) 1109-1113.
[18] B.A. Wallace and D. Mao, Circular dichroism analyses of membrane proteins: An examination of differential light scattering and absorption flattening effects in large membrane vesicles and membrane sheets, Anal. Biochem. 142 (1984) 317-328.
[19] E. Gasteiger, A. Gattiker, C. Hoogland, I. Ivanyi, R.D. Appel and A. Bairoch, ExPASy: The proteomics server for in-depth protein knowledge and analysis, Nucleic Acids Res. 31 (2003) 3784-3788.
[20] C. Jones, D. Schiffmann, A. Knight and S. Windsor, Val-CiD Best Practice Guide: CD spectroscopy for the quality control of biopharmaceuticals, NPL report DQL-AS 008 (2004).
[21] W.C. Johnson, Jr., Protein secondary structure and circular dichroism – A practical guide, Proteins: Struct. Funct. Genet. 7 (1990) 205-214.
Modern Techniques for Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy B.A. Wallace and R.W. Janes (Eds.) IOS Press, 2009. © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-000-1-91
Sample Preparation and Good Practice in Circular Dichroism Spectroscopy
Sharon M. Kelly and Nicholas C. Price
Division of Molecular and Cellular Biology, University of Glasgow
Abstract. This chapter aims to set out the guidelines for obtaining reliable CD data when studying molecules of biological interest. The main focus will be on the study of protein samples, but the points about sample purity and characterisation, attention to the solvent system employed, the proper use of the instrument and appropriate methods of data handling apply to all samples, whether they are small molecules, such as drugs or other ligands, or macromolecules, such as nucleic acids or their fragments. More details on most of these aspects are given in the review by Kelly et al. [1].
1. Introduction: The Importance of Good Practice
Circular Dichroism (CD) spectroscopy has marked advantages as a structural technique in terms of the economy of time and amount of sample required, the ability to explore the structure of the sample under a wide variety of conditions (e.g. concentration, pH, temperature, ionic strength) and to assess the rate and extent of structural changes in the sample. Measuring the CD of biological samples almost always involves the recording of extremely small signals; the difference in absorbance between the left- and right-circularly polarised light beams (ΔA = AL – AR) is of the order of 0.001 to 0.0001. This makes it necessary to pay very close attention to the choice of experimental conditions in order to obtain accurate data. If this is not done, it is possible to obtain results which are superficially appealing, but which on close inspection cannot be considered reliable. In addition to making sure that the protein is of high purity and of known concentration, it is important that the CD measurements are recorded under appropriate conditions using instruments (spectropolarimeters) and cells which have been properly calibrated. This Chapter will deal with the main aspects of good practice in conventional CD and should be read in conjunction with later Chapters in this book which explore certain aspects in more detail.
2. Preparation of Protein Samples
The majority of protein samples are produced by heterologous over-expression in a suitable host system, such as the Gram-negative bacterium Escherichia coli, or the lower eukaryote methylotrophic yeast Pichia pastoris. Many current approaches employ suitable tags, i.e. the gene encoding the protein of interest is linked with that encoding another protein such as glutathione-S-transferase (GST) or maltose-binding protein (MBP) or a short stretch (usually 6) of histidine residues. Affinity methods involving immobilised binding partners (glutathione for GST, maltose for
S.M. Kelly and N.C. Price / Sample Preparation and Good Practice in CD Spectroscopy
MBP and chelated Ni(II) ions for the hexa-His tag) can be used to provide very effective purification of the protein of interest. For CD studies of the target protein, it is normally necessary to remove the linked tag protein by means of a suitable protease, since the tag will contribute to the observed CD signals. In the case of a small tag such as hexa-His, it may not be necessary to remove the tag, but it is important to establish whether or not the presence of the tag affects the function and/or stability of the protein of interest.
Whichever method is used, it is important to use a combination of purification methods (e.g. ion-exchange and hydrophobic interaction chromatography, affinity chromatography or gel filtration) which will generate protein suitable for CD work. The criteria are that (a) the protein should be at least 95% pure, (b) the protein should be free from nucleic acids and their fragments, and (c) the protein should be free from aggregates. The purity is readily checked by SDS-PAGE using Coomassie Blue or silver staining. The absence of nucleic acids and aggregated material can be established by recording the absorption spectrum over the range from 400 nm to 220 nm. Proteins generally absorb maximally at 280 nm, whereas nucleic acids absorb maximally at 260 nm, so that the ratio A280/A260 is a useful indication of purity (the ratio is typically about 1.7 for proteins and about 0.6 for nucleic acids). The absence of aggregated material is indicated by a flat baseline in the absorption spectrum over the range 400 nm to 310 nm; if aggregates are present, the baseline will increase progressively towards the lower wavelengths in this range. Aggregated proteins can cause optical artefacts in CD such as differential light scattering and absorption flattening, which can be corrected for by various experimental and mathematical procedures [2-4].
It is usually a much better approach to see whether the aggregates can be removed by centrifugation (5000 g; 5 min) in a bench-top centrifuge, or by filtration through a 0.2 μm Millipore filter.
The identity and the occurrence of the correct post-translational modifications of the protein can be established by mass spectrometry. The biological activity of the protein can be checked by performing a suitable assay (such as enzyme activity, binding affinity etc.). It is important that the protein is stable under the conditions of data collection, for example in the buffer or solvent system used, and is not subject to photodamage. This can be easily tested for if there is a convenient functional assay available.
2.1. Determination of Protein Concentration
An accurate determination of protein concentration is important for calculating the values of parameters on a molar basis, for example the specific activity of an enzyme (in terms of the turnover number, kcat), the number of ligand binding sites, and in the present context, the molar CD signals in terms of absorbance or ellipticity (see section 4). Accurate knowledge of the molar CD signals of a protein in the far UV is vital for the reliable estimation of secondary structure content (see the chapter on Analyses of Proteins by Whitmore and Wallace). Five of the main methods which can be used to determine protein concentration are listed below. More details on these are given in Price [5].
• The biuret method depends on the formation of a purple complex between Cu(II) ions and adjacent peptide bonds in a protein under alkaline conditions. Because peptide bonds occur with very similar frequencies in different proteins, there is a fairly uniform response for different proteins. However, the method requires large amounts of sample (0.5 to 5 mg).
• The Lowry method is based on the reduction of the phosphomolybdic tungstic mixed acid chromogen (in the Folin-Ciocalteu reagent) by a protein to give a blue product. There is a rapid reaction with certain side chains (notably Tyr and Trp), and a slower reaction with the Cu(II) chelates of the peptide bonds. Although the Lowry method is sensitive (5 to 100 μg sample), different proteins give different responses and a large number of substances can interfere.
• The bicinchoninic acid (BCA) method is based on the ability of BCA to combine with Cu(I) ions (produced by reduction of Cu(II) by a protein under alkaline conditions) to give a yellow-green product. The method is convenient and sensitive (1 to 100 μg sample), but the response given depends markedly on the time and temperature of the incubation, so it is important to standardise conditions carefully.
• The Coomassie Blue binding method (for example the Bradford method; marketed as the BioRad method) relies on the change in the absorption spectrum of the dye when it binds to proteins under acid conditions (from orange-red to blue). Although the method is convenient and sensitive (1 to 100 μg sample), the interaction with different proteins depends on the numbers of basic side chains (principally Arg) present, but also to some extent on the numbers of non-polar side chains. This results in there being different responses for different proteins.
• The most convenient (and non-destructive) method depends on the absorbance of a protein at 280 nm (assuming that the protein contains at least some Tyr and Trp).
The expected A280 value for a 1 mg/ml solution of protein in 6 M GdmCl (i.e. under denaturing conditions) can be calculated from the following formula (Eqn. (1)), which is based on the absorbance properties of these amino acids as well as a small contribution from any disulphide bonds [6]. It should be noted that these bonds are rarely found in intracellular proteins and that, in any case, this term makes only a very small contribution to the calculated absorbance.
A280 (1 mg/ml; 1 cm) = (5690nW + 1280nY + 60nC)/M    (1)
where nW, nY and nC are the numbers of Trp, Tyr and Cys per polypeptide chain and M is the molecular mass (in Da). (Note that the coefficient of nC in the equation given by Gill and von Hippel [6] has been halved since their value refers to the disulphide-bonded cystine. If there are no disulphide bonds in the protein, the term in nC is neglected.) The calculated value of A280 (1 mg/ml; 1 cm) is available from analytical tools such as ProtParam within the ExPASy system (http://us.expasy.org/tools/protparam.html) for any given protein stored in the Swiss-Prot or TrEMBL databases or for a user-entered sequence. Two sets of calculated A280 values are given, one assuming that all the cysteine side chains form disulphide bonds, the other assuming that none do. The calculated value of A280 is valid for a
homogeneous sample of protein, provided that ALL the following conditions are met [1]:
• There is no contribution from light scattering
• There is no other chromophore (e.g. cofactor) in the protein
• There is no other absorbing contaminant, e.g. nucleic acids
• A correction is applied for the difference between native and denatured protein
2.1.1 Light Scattering
It is important to check the absorption spectrum of the protein sample, ideally over the range from 400 nm to 240 nm. If there is a gradually increasing baseline (i.e. apparent absorbance) as the wavelength decreases from 400 nm to 310 nm, this indicates that light scattering is a problem; it is assumed that no cofactors such as flavin mononucleotide (FMN) or pyridoxal-5′-phosphate, which absorb in this range, are present. If light scattering is a problem, the solution can be centrifuged or filtered to try to eliminate or minimise the problem. If these approaches are not successful, it is possible to correct for the actual contribution of light scattering to the observed A280 by use of a log/log plot or use of inbuilt software in the spectrophotometer.
2.1.2 Interference from Other Chromophores
The presence of nucleic acids can be inferred from the A280/A260 ratio of the sample or by the use of dyes which bind specifically to nucleic acids. Nucleic acid contaminants can be removed by treatment of the cell extract with the appropriate nuclease (e.g. DNase for DNA or its fragments). Some non-protein chromophores such as haem groups absorb at 280 nm and their contribution has to be taken into account if A280 values are being used to determine the protein concentration.
2.1.3 Correction to Native Conditions
The calculated value of the A280 for a protein refers to the unfolded protein, i.e.
where the chromophore side chains are fully exposed to solvent as they are in small model compounds; it should be noted that the spectral properties of the chromophores will depend to a small extent on the polarity of the environment. The correction can be made, i.e. the ratio of the A280 values for native and denatured protein can be determined, by performing parallel dilutions, e.g. (a) 0.25 ml of native protein in buffer is diluted into 0.75 ml buffer (native) and (b) 0.25 ml of native protein in buffer is diluted into 0.75 ml 8 M GdmCl in buffer (denatured). The experimentally determined ratio of the A280 values of native and denatured proteins generally lies between 0.9 and 1.1 and can be used to correct the calculated A280 to give the appropriate value for native protein.
2.2 Recommended Method for Accurate Determination of Protein Concentration
For accurate determination of protein concentration, the method of choice is quantitative amino acid analysis [1]. Although a number of amino acids are not released in quantitative fashion by hydrolysis in 6 M HCl (for example, Thr and Ser are partially destroyed, Trp is largely destroyed, and bulky amino acids such as Ile and Val may not be fully released), the method can be made reliable by measuring the amounts of a number of stable, abundant amino acids (e.g. Ala, Arg, Lys, Tyr etc.) in a sample relative to a known amount of an added internal amino acid standard such as
norleucine. The relative amounts of the stable, abundant amino acids should correspond to those predicted from the amino acid composition of the protein. The composition can be inferred from the sequence, which will be known if an over-expressed protein is being purified, or may have been determined directly for the wild-type protein. The correct relative amounts of the amino acids provide an additional check on the authenticity of the protein sample. From the measured amounts of the abundant amino acids, the number of moles of protein can be determined and hence the mass of protein calculated. Although this method is cumbersome, it can be used to calibrate a more readily applied everyday method for the protein in question, e.g. A280 measurements, dye binding, BCA assay etc.
A relatively small number of proteins contain cofactors which absorb in the near UV or visible region of the spectrum; this absorption can be used as the basis for estimating the protein concentration, e.g. cytochrome P450 is estimated on the basis of absorbance of the haem group at 450 nm in the reduced form of the enzyme complexed with CO [7]. It should be noted that this method assumes that there is stoichiometric occupation of the cofactor binding sites in the protein, which may not always be the case with proteins over-expressed in heterologous systems.
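As an illustration of how Eqn. (1) and the native/denatured correction of section 2.1.3 combine, the following Python sketch computes the expected A280 of a 1 mg/ml solution and then converts an observed A280 into a concentration. The function names and the example protein composition are assumptions made for this sketch; the coefficients are those of Eqn. (1).

```python
def a280_denatured(n_trp, n_tyr, n_cys_disulphide, mass_da):
    """Eqn. (1): A280 of a 1 mg/ml solution (1 cm path) in 6 M GdmCl.

    n_cys_disulphide is the number of Cys residues involved in disulphide
    bonds; pass 0 if the protein has no disulphide bonds.
    """
    return (5690 * n_trp + 1280 * n_tyr + 60 * n_cys_disulphide) / mass_da


def concentration_mg_per_ml(a280_observed, a280_calculated,
                            native_to_denatured_ratio=1.0, pathlength_cm=1.0):
    """Convert an observed A280 of native protein into mg/ml.

    native_to_denatured_ratio is the experimentally determined ratio
    A280(native)/A280(denatured), generally between 0.9 and 1.1.
    """
    a280_native = a280_calculated * native_to_denatured_ratio
    return a280_observed / (a280_native * pathlength_cm)


if __name__ == "__main__":
    # Hypothetical protein: 4 Trp, 4 Tyr, 8 Cys in disulphides, 14200 Da.
    calc = a280_denatured(4, 4, 8, 14200.0)
    conc = concentration_mg_per_ml(0.40, calc, native_to_denatured_ratio=1.02)
    print(f"calculated A280 (1 mg/ml, denatured): {calc:.3f}")
    print(f"estimated concentration: {conc:.3f} mg/ml")
```

The sketch does not address light scattering or additional chromophores; those conditions still have to be checked experimentally as described above.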
3. Choice of Solvent/Buffer System
The aim in recording CD spectra of good quality is to keep the overall absorbance within reasonable bounds (less than about 1). If the absorbance is too high, the relatively small photon flux reaching the detector will lead to high errors in measuring the small value of AL – AR, and consequent high levels of noise relative to the signal. (In practice this is manifested as an increase in the High Tension (HT) voltage applied to the photomultiplier detector, which will then be working outside its linear response range.) Since the overall absorbance of the sample arises from the contributions of the sample itself and the solvent system, it is clear that the preferred situation is to keep the former as high and the latter as low as possible. When dealing with proteins, it is often necessary not only to use a buffer system but also to maintain the ionic strength or to add reducing or chelating agents in order to improve the stability of the protein; hence there may need to be a compromise between these 2 factors.
Problems can be particularly acute in the far UV region, especially below 200 nm. As indicated in Table 1, chloride and carboxylate ions absorb strongly below 195 nm. High concentrations of chloride ions arising from high concentrations of Tris/HCl buffers or from the saline in phosphate buffered saline (PBS) should be avoided. In order to maintain ionic strength, fluoride or sulphate ions can be used instead of chloride. Buffers based on phosphate, borate or Tris are suitable for maintaining pH values in the range from about 6.5 to 9.5; however, buffers in the pH range from 4 to 6 are usually based on carboxylate groups (e.g. acetate) or on piperazine sulphonic acid derivatives (e.g. HEPES) and can only be used at low concentrations, ideally ≤ 10 mM. Imidazole can lead to particular problems as it is widely used to elute His-tagged proteins from immobilised Ni(II) columns, often at high concentrations (up to 500 mM).
Imidazole has an absorption maximum at 207.5 nm, with an absorption coefficient of 3500 M⁻¹ cm⁻¹. Thus a solution of 50 mM imidazole in a cell of pathlength 0.02 cm (typical for far UV CD studies) has an absorbance at 207.5 nm of 3.5, which is far too high for meaningful CD data to be collected. It is very important to remove as much imidazole as possible by either extensive dialysis or gel permeation chromatography.
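The imidazole figure above is a direct application of the Beer-Lambert law, which can be sketched as follows. The absorption coefficient is the value quoted in the text; the absorbance budget of 0.1 allotted to imidazole in the second calculation is an arbitrary assumption chosen only to show how a tolerable residual concentration might be estimated.

```python
def absorbance(epsilon_per_m_cm, conc_molar, pathlength_cm):
    """Beer-Lambert law: A = epsilon * c * d."""
    return epsilon_per_m_cm * conc_molar * pathlength_cm


def max_concentration_molar(epsilon_per_m_cm, pathlength_cm, absorbance_budget):
    """Highest concentration that keeps a component within an absorbance budget."""
    return absorbance_budget / (epsilon_per_m_cm * pathlength_cm)


IMIDAZOLE_EPSILON_207_5 = 3500.0  # M^-1 cm^-1, value quoted in the text

if __name__ == "__main__":
    a = absorbance(IMIDAZOLE_EPSILON_207_5, 0.050, 0.02)
    print(f"50 mM imidazole, 0.02 cm cell: A(207.5 nm) = {a:.2f}")
    # Assumed budget of 0.1 absorbance units for imidazole (illustrative only).
    c_max = max_concentration_molar(IMIDAZOLE_EPSILON_207_5, 0.02, 0.1)
    print(f"imidazole should be below about {c_max * 1000:.1f} mM")
```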
Table 1. Absorption properties of selected buffer components in the far UV. Data are adapted from [1]. Values are absorbances of a 50 mM solution in a 0.02 cm pathlength cell.

Component              180 nm    190 nm    200 nm    210 nm
NaCl                   >0.5      >0.5      0.02      0
NaF                    0         0         0         0
Na borate (pH 9.1)     0.3       0.09      0         0
Na2HPO4                >0.5      0.3       0.05      0
NaH2PO4                0.15      0.01      0         0
Na acetate             >0.5      >0.5      0.17      0.03
Tris/H2SO4 (pH 8.0)    >0.5      0.24      0.13      0.02
HEPES/Na+ (pH 7.5)     >0.5      >0.5      0.5       0.37
Organic solvents are rarely used in protein studies except when dealing with some very non-polar integral membrane proteins. However, such solvents are often employed to dissolve small molecules, such as natural products or drugs. Solvents containing C-Cl bonds absorb strongly below 230 nm, and the useful dipolar, aprotic solvents dimethylsulphoxide and dimethylformamide cannot be used below 240 nm and 250 nm, respectively. Acetonitrile, ethanol, methanol and trifluoroethanol can be used down to 190 nm, provided they are of high purity. In all cases, it is essential to run a blank spectrum with buffer or solvent system alone in order to check that the absorbance and hence HT voltage is not excessive. If problems are encountered, it may be possible to use cells of shorter pathlength, provided that the concentration of the sample can be increased, or lower the concentration of the species which is causing the problem (e.g. chloride ions). If these approaches are not feasible, a possibility is to use synchrotron radiation CD (SRCD) as described in the chapters by Miles and Wallace; SRCD has a vastly increased photon flux in the spectral range below 200 nm and so can be employed to extend the wavelength range deeper into the far UV.
4. Units of CD Data
The units in which CD results are reported can cause a good deal of confusion, so we feel that a discussion of this topic is merited in this Chapter. CD data can be presented in terms of either ellipticity (θ, in degrees or millidegrees) or differential absorbance (ΔA). In both cases, the actual readings are usually scaled to the molar values of these quantities. In the case of polymers such as proteins, the values of far UV CD signals (which arise from the peptide bonds) are normally scaled to the molar concentration of the repeating unit, i.e. the peptide bond in the case of proteins. The mean residue weight (MRW) for the peptide bond repeating unit of a protein is calculated by dividing the molecular mass (in Da) by n – 1, where n is the number of amino acids in the polypeptide chain. For most proteins, the MRW is in the range 110 ± 5 Da.
The mean residue ellipticity at a wavelength λ ([θ]MRE,λ) is given by Eqn. (2):
[θ]MRE,λ = (MRW × θλ)/(10 × d × c)    (2)
where θλ is the observed ellipticity in degrees at a wavelength λ, d is the cell pathlength in cm and c is the concentration in g/mL. If the molar concentration (m) of the solute is known, the molar ellipticity at wavelength λ ([θ]molar,λ) is given by Eqn. (3):
[θ]molar,λ = (100 × θλ)/(m × d)    (3)
where θλ and d have the same meanings as in the definition of mean residue ellipticity. The units of mean residue ellipticity and molar ellipticity are deg cm² dmol⁻¹. It is usual to express CD data in absorbance units in terms of the molar differential absorption coefficient, Δε, as in Eqn. (4):
Δε = ΔA/(m × d)    (4)
where m is the molar concentration and d is the pathlength in cm. The units of Δε are M⁻¹ cm⁻¹. There is a simple numerical relationship between [θ]MRE and Δε, provided that both refer to the same molar unit (Eqn. (5)):
[θ]MRE = 3298 × Δε    (5)
Whereas virtually all studies of the far UV CD of proteins refer to the concentration of the repeating unit, i.e. the peptide bond, there is considerable debate about the appropriate choice of the molar unit for near UV and visible CD studies. The molar unit could correspond to the peptide bond or to the whole polypeptide chain; there are arguments in favour of both approaches. In reporting any results, it is essential to specify which molar unit is being used.
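The unit conversions of Eqns. (2) to (5) can be collected in a short Python sketch. The function names are invented for illustration, and the example reading (a signal of –10 mdeg for a 0.2 mg/mL protein solution in a 0.02 cm cell, with an MRW of 110 Da) is hypothetical.

```python
def mean_residue_weight(mass_da, n_residues):
    """MRW: molecular mass divided by the number of peptide bonds (n - 1)."""
    return mass_da / (n_residues - 1)


def mean_residue_ellipticity(theta_deg, mrw, pathlength_cm, conc_g_per_ml):
    """Eqn. (2): [theta]MRE in deg cm^2 dmol^-1."""
    return (mrw * theta_deg) / (10.0 * pathlength_cm * conc_g_per_ml)


def molar_ellipticity(theta_deg, molar_conc, pathlength_cm):
    """Eqn. (3): [theta]molar in deg cm^2 dmol^-1."""
    return (100.0 * theta_deg) / (molar_conc * pathlength_cm)


def delta_epsilon(delta_a, molar_conc, pathlength_cm):
    """Eqn. (4): molar differential absorption coefficient in M^-1 cm^-1."""
    return delta_a / (molar_conc * pathlength_cm)


def delta_epsilon_from_ellipticity(ellipticity_deg_cm2_dmol):
    """Eqn. (5) rearranged: delta epsilon = [theta] / 3298."""
    return ellipticity_deg_cm2_dmol / 3298.0


if __name__ == "__main__":
    theta_deg = -10.0e-3          # observed signal of -10 mdeg, in degrees
    mrw = 110.0                   # Da, typical value
    d_cm = 0.02
    c_g_per_ml = 0.2e-3           # 0.2 mg/mL
    mre = mean_residue_ellipticity(theta_deg, mrw, d_cm, c_g_per_ml)
    print(f"[theta]MRE = {mre:.0f} deg cm2 dmol-1")
    print(f"delta epsilon = {delta_epsilon_from_ellipticity(mre):.2f} M-1 cm-1")
```

Eqn. (2) is equivalent to Eqn. (3) evaluated with the molar concentration of residues, m = 1000 × c/MRW, which provides a convenient consistency check on any implementation.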
5. Calibration of Spectropolarimeter
It is important that the spectropolarimeter is regularly (ideally on a weekly basis) checked and calibrated, particularly with regard to wavelength and signal magnitude. This topic is discussed in more detail in the chapter on Calibration by Miles and Wallace, but some of the key points are given here.
Wavelength calibration requires the use of well-characterised spectroscopic standards; these include rare earth element filters such as holmium oxide (peaks at 279.4 nm, 361.0 nm and 453.7 nm) or sealed cells containing benzene vapour (peaks at 241.7 nm, 253.0 nm and 260.1 nm). For studies in the near and far UV regions, a convenient standard is a solution of 1S-(+)-10-camphorsulphonic acid (CSA), which has peaks at 290.5 nm and 192.5 nm (see below). It is good practice to make sure that wavelength calibration is performed in the spectral region where CD data are to be recorded [1,8].
Several compounds have been proposed as standards for magnitude calibration of spectropolarimeters; again this should be done in the spectral region where CD data are
to be recorded [1]. CSA is a very widely used standard for studies on proteins since its 2 peaks (at 290.5 nm and 192.5 nm) are in the near UV and far UV regions respectively. The molecular mass of the free acid form of CSA is 232.3 Da and solutions for calibration are typically made up at a concentration of 0.06% (w/v), i.e. 0.6 mg/mL in distilled water. Since the absorption coefficient of CSA at 285 nm is 34.6 ± 0.2 M⁻¹ cm⁻¹ [9], the absorbance of this solution in a cell of 1 cm pathlength should be 0.0894. The ellipticity of this solution in a cell of 1 cm pathlength should be +202 mdeg at 290.5 nm and –420 mdeg at 192.5 nm. In measuring the ellipticity of CSA, it should be noted that a shorter pathlength cell should be used to resolve the 192.5 nm peak and that the readings should be corrected for the solvent blank. The ratio of the magnitudes of the signals for CSA at the 2 wavelengths (i.e. θ192.5/θ290.5) should be 2.05 or greater. The free acid form of CSA is hygroscopic, so it is important to check spectroscopically the concentration of any solution made up by weight. Alternatively the ammonium salt of CSA (molecular mass 249.3 Da) can be used; a 0.06% (w/v) solution of this compound has an absorbance at 285 nm of 0.0833 in a 1 cm pathlength cell; the ellipticities at 290.5 nm and 192.5 nm in a cell of pathlength 1 cm would be +188 mdeg and –391 mdeg respectively. Because the ellipticity of CSA solutions is sensitive to temperature (over the range 5 °C to 30 °C there is a 2.5% decline in the signal), it is important to perform the calibration at a standard temperature, for example 20 °C. It is also recommended that, between calibration operations, the standard solutions of CSA are stored in the dark at 4 °C for a period not exceeding 4 weeks.
Pantolactone (molecular mass 130.15 Da) has also been recommended for calibrating spectropolarimeters; it has a strong band at 219 nm. For (R)-(–)-pantolactone, the molar ellipticity at 219 nm is –16160 deg cm² dmol⁻¹; hence a 0.015% (w/v) (i.e. 0.15 mg/mL) solution will have an ellipticity at 219 nm of –186 mdeg in a cell of 1 cm pathlength [1].
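The expected readings quoted in this section follow from the Beer-Lambert law and from Eqn. (3) rearranged for the observed ellipticity. The Python sketch below reproduces them from the constants given in the text; the function names, and the use of a measured A285 to correct a nominal CSA concentration, are assumptions made for illustration.

```python
def expected_absorbance(conc_mg_per_ml, mass_da, epsilon_per_m_cm, pathlength_cm):
    """Beer-Lambert absorbance; mg/mL equals g/L, so conc/mass gives mol/L."""
    molar = conc_mg_per_ml / mass_da
    return epsilon_per_m_cm * molar * pathlength_cm


def expected_ellipticity_mdeg(molar_ellipticity, conc_mg_per_ml, mass_da,
                              pathlength_cm):
    """Eqn. (3) rearranged: theta(deg) = [theta] * m * d / 100, then to mdeg."""
    molar = conc_mg_per_ml / mass_da
    return molar_ellipticity * molar * pathlength_cm / 100.0 * 1000.0


def corrected_concentration(nominal_mg_per_ml, a_measured, a_expected):
    """Scale a concentration made up by weight using the measured absorbance."""
    return nominal_mg_per_ml * a_measured / a_expected


if __name__ == "__main__":
    a_csa = expected_absorbance(0.6, 232.3, 34.6, 1.0)
    a_ammonium = expected_absorbance(0.6, 249.3, 34.6, 1.0)
    theta_panto = expected_ellipticity_mdeg(-16160.0, 0.15, 130.15, 1.0)
    print(f"CSA free acid, A285: {a_csa:.4f}")
    print(f"CSA ammonium salt, A285: {a_ammonium:.4f}")
    print(f"pantolactone, ellipticity at 219 nm: {theta_panto:.0f} mdeg")
    # Hypothetical measured A285 of 0.0850 for a nominal 0.6 mg/mL CSA solution.
    print(f"actual CSA: {corrected_concentration(0.6, 0.0850, a_csa):.3f} mg/mL")
```

The same scaling by molecular mass (232.3/249.3) relates the ellipticities quoted for the free acid to those quoted for the ammonium salt.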
6. Calibration of Cell Pathlength
The cells used for CD work should be manufactured to an extremely high standard. Suprasil (quartz) cells can be used for measurements down to about 165 nm and are usually made to order. It is advisable that they are certified as “strain free for polarimetry” to avoid optical artefacts. For each sample being run, it is important to record a blank spectrum with the buffer or solvent in the same cell in the same orientation. Refillable cells are available commercially with pathlengths in the range 0.01 cm to 10 cm. Shorter pathlength cells (e.g. 0.001 cm (10 μm) or even shorter) are of a demountable type, consisting of 2 quartz plates, one of which is machined out to the required depth. It is important to check the actual pathlength of the cell used; this is especially important in the case of very short pathlength cells, where there can be considerable (up to 50%) deviations from the manufacturer’s values [10]. As described in more detail in the chapter on Calibration Techniques by Miles and Wallace, there are 2 main methods for establishing the pathlengths, based either on interference patterns caused by reflection of the incident light from the internal surfaces of the cells, or on the absorbance of standard solutions such as potassium chromate (formula mass 194.2; absorption coefficient at 372 nm in 0.05 M KOH is 4830 M⁻¹ cm⁻¹). The absorbance of
S.M. Kelly and N.C. Price / Sample Preparation and Good Practice in CD Spectroscopy
a 0.1 M solution of potassium chromate (19.42 g/L) in a cell of 10 μm pathlength would be 0.483 [1].
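As a minimal sketch of the absorbance method, the pathlength of a short cell can be back-calculated from the measured absorbance of a chromate standard. The function names and the example reading of 0.531 are hypothetical; only the absorption coefficient and the 0.1 M example come from the text.

```python
# Illustrative pathlength check using the potassium chromate standard.
EPSILON_CHROMATE_372 = 4830.0  # M-1 cm-1 at 372 nm in 0.05 M KOH (from the text)

def pathlength_um(absorbance_372, molar_conc):
    """Return the cell pathlength in micrometres from A = epsilon * c * l."""
    pathlength_cm = absorbance_372 / (EPSILON_CHROMATE_372 * molar_conc)
    return pathlength_cm * 1.0e4  # 1 cm = 10,000 micrometres

def percent_deviation(measured_um, nominal_um):
    """Deviation of the measured pathlength from the manufacturer's value."""
    return 100.0 * (measured_um - nominal_um) / nominal_um

# The worked example in the text: 0.1 M chromate, A = 0.483 -> 10 micrometres.
nominal_check = pathlength_um(0.483, 0.1)

# A hypothetical reading from a nominally 10 micrometre cell.
measured = pathlength_um(0.531, 0.1)
deviation = percent_deviation(measured, 10.0)

print(f"Worked example: {nominal_check:.2f} um")
print(f"Hypothetical cell: {measured:.2f} um ({deviation:+.1f}% from nominal)")
```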
7. Choice of Instrumental Parameters
The four parameters which can be adjusted to obtain spectra are bandwidth, time constant (or dwell time), scan rate and number of scans. The bandwidth measures the precision with which light of a specified wavelength is chosen by a monochromator. An increase in the bandwidth allows more light to fall on the sample and hence on the detector, which would facilitate more accurate measurements but would decrease the ability to resolve features in the spectrum. For routine CD studies the bandwidth should be less than or equal to 1 nm; however to resolve fine structure, particularly in the near UV region, smaller bandwidths should be used. The time constant is a measure of the period over which the CD data are averaged; its precise meaning will depend on the design of the spectropolarimeter (the terms response time or dwell time have analogous meanings). The scan rate is the rate at which the wavelength is changed; most spectropolarimeters will offer scan rates between 10 nm/min and 1000 nm/min (or even greater). The exact choice of scan rate will depend on how quickly data need to be collected, but should conform to the set of guidelines shown as Eqn. (6) [11]:

Scan rate × time constant < bandwidth < (spectral feature width)/10    (6)
For example, if we wished to resolve a spectral feature 5 nm wide, the bandwidth should not exceed 0.5 nm. If a time constant of 2 s is chosen, a scan rate of 10 nm/min (0.167 nm/s) is acceptable, since the product (scan rate × time constant) is 0.333 nm; use of 20 nm/min would give a product of 0.667 nm, which exceeds the bandwidth and is too large to allow the spectral feature to be resolved properly. Any departure from these guidelines (for example if it were necessary to use a fast scan rate) should only be made if it can be shown that there is no significant distortion of the spectrum obtained. Increasing the number of scans which are averaged will improve the spectrum, since the signal/noise ratio is proportional to the square root of the number of scans. However, it must be ensured that the additional time required to accumulate the scans, during which the sample is exposed to a high intensity light source, does not lead to damage to the sample. This factor is not generally a major problem with conventional CD, but can be significant in SRCD with the very intense light sources used. However, adjustment of the various parameters can only improve the quality of the CD spectra within certain limits; it is important to ensure that the guidelines concerning protein concentration, cell pathlength and choice of solvent or buffer system are observed. Illustrations of the way in which the choice of experimental conditions and instrumental settings can affect the quality of CD spectra are provided in Figures 1 to 8. All spectra are recorded on a sample of bovine α-lactalbumin, a 14 kDa protein with 4 of each of the aromatic amino acids Phe, Tyr and Trp. In each figure the upper trace shows the CD signal and the lower trace the corresponding High Tension voltage.
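The guidelines of Eqn. (6) and the square-root dependence of signal/noise on the number of scans are easy to check before committing instrument time. The following Python sketch is one way of doing so; the function names are illustrative, and the bandwidth test uses "less than or equal to" to match the wording "should not exceed" in the worked example.

```python
import math

def check_scan_settings(scan_rate_nm_per_min, time_constant_s,
                        bandwidth_nm, feature_width_nm):
    """Evaluate both conditions of Eqn. (6) for a proposed set of parameters."""
    # Convert the scan rate to nm/s before multiplying by the time constant.
    product_nm = scan_rate_nm_per_min / 60.0 * time_constant_s
    return {
        "product_nm": product_nm,
        "rate_ok": product_nm < bandwidth_nm,                   # first condition
        "bandwidth_ok": bandwidth_nm <= feature_width_nm / 10.0,  # second condition
    }

def relative_signal_to_noise(n_scans):
    """Signal/noise gain relative to a single scan (proportional to sqrt(n))."""
    return math.sqrt(n_scans)

# The worked example from the text: 5 nm feature, 0.5 nm bandwidth, 2 s time constant.
settings_slow = check_scan_settings(10, 2, 0.5, 5)   # product 0.333 nm, acceptable
settings_fast = check_scan_settings(20, 2, 0.5, 5)   # product 0.667 nm, too fast

print(settings_slow)
print(settings_fast)
print(f"Gain from averaging 3 scans: {relative_signal_to_noise(3):.2f}x")
```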
Figures 1 to 5 show the near UV CD spectra of α-lactalbumin over the range from 250 to 320 nm as raw data (i.e. mdeg). In all cases, the protein concentration is 2.27 mg/ml and the cell pathlength 0.2 cm. Signals in the wavelength ranges 255 to 270 nm, 275 to 285 nm and 285 to 305 nm can be generally ascribed to contributions of Phe, Tyr and Trp side chains respectively [1]. Thus the near UV CD spectrum of α-lactalbumin consists of a broad envelope due to the Phe and Tyr side chains, with distinct narrow peaks at 287 nm and 293 nm characteristic of Trp. Between Figures 1 and 3, it is clear that there is an improvement in the signal to noise ratio, as the response time is increased from 0.5 s (Figure 1) to 2 s (Figure 2) and then the number of scans is increased from 1 to 3 (Figure 3). In all these cases, the product of the scan rate and time constant is below the bandwidth (1 nm), so the first condition of Eqn. (6) is satisfied. In Figure 4, it is clear that increasing the bandwidth from 1 to 3 nm has compromised the sharpness of the peaks from the Trp side chains at 287 nm and 293 nm; clearly at the higher bandwidth the second condition of Eqn. (6) is no longer satisfied. Finally, Figure 5 shows that the effect of increasing the scan rate from 10 nm/min to 200 nm/min is to distort the shape of the spectrum, especially in the Trp region. At the high scan rate, the first condition of Eqn. (6) is no longer satisfied. Figures 6 to 8 show the far UV CD spectra of α-lactalbumin over the range from 180 to 260 nm expressed as mean residue ellipticity units (deg cm² dmol⁻¹). The spectra are characteristic of a protein with a significant α-helical content. In Figure 6, spectra are recorded at two different protein concentrations, 0.45 mg/ml and 2.27 mg/ml, in a cell of pathlength 0.02 cm. The buffer is 50 mM sodium phosphate, pH 7.5. It is clear that when the raw data are corrected for concentration, there is no difference between the two spectra down to 196 nm, i.e.
when the High Tension voltage is below about 600 V. However, at lower wavelengths, the high absorbance of the more concentrated sample leads to a marked increase in the High Tension voltage and there is an increasing divergence between the two spectra. Finally, when the voltage is above 800 V (i.e. below 185 nm), there is a collapse of the CD signal. These data emphasise the need to keep a careful check on the High Tension trace and only record CD data when the voltage is kept below the chosen threshold value. (This is normally 600 V, but the actual value should be established for the spectropolarimeter being used). Figures 7 and 8 show the effects of adding strongly absorbing species to the phosphate buffer, namely 150 mM chloride ions and 200 mM imidazole respectively. Chloride ions cause problems below 195 nm, with a large increase in noise and collapse of the CD signal. This is important since NaCl is often added to stabilise protein solutions. Imidazole causes even greater problems; as shown in Figure 8 in the presence of 200 mM imidazole, CD data can only be collected at wavelengths above 230 nm. This is important since high concentrations of imidazole are often used to elute His-tagged proteins from immobilised metal columns. Under such circumstances, it is necessary either to try to remove as much imidazole as possible, e.g. by extensive dialysis, or to use very short pathlength cells. The latter would require the protein concentration to be high in order to observe suitable CD signals.
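The practice of only accepting CD data recorded while the High Tension voltage is below a chosen threshold can be sketched in a few lines of Python. This is an illustration under stated assumptions: the data values are invented, the function name is hypothetical, and the default threshold of 600 V is the typical value quoted above, which should be replaced by the value established for the instrument in use.

```python
def trusted_region(wavelengths_nm, cd_values, ht_volts, ht_limit=600.0):
    """Return (wavelength, CD) pairs from the long-wavelength end of the scan
    down to, but not including, the first point where the HT exceeds the limit.
    """
    # Work from high to low wavelength, as a far UV scan is normally run.
    points = sorted(zip(wavelengths_nm, cd_values, ht_volts), reverse=True)
    kept = []
    for wavelength, cd, ht in points:
        if ht > ht_limit:
            break  # everything at lower wavelength is treated as unreliable
        kept.append((wavelength, cd))
    return kept

# Invented example data (not taken from Figures 6 to 8).
wavelengths = [180.0, 185.0, 190.0, 196.0, 200.0, 210.0]
cd_signal = [2.0, 9.0, 14.0, 12.0, 6.0, -8.0]
ht_trace = [900.0, 820.0, 700.0, 590.0, 450.0, 380.0]

reliable = trusted_region(wavelengths, cd_signal, ht_trace)
lowest_reliable_nm = reliable[-1][0]
print(f"Data accepted down to {lowest_reliable_nm} nm")
```

Truncating at the first exceedance, rather than simply masking individual points, reflects the observation that the HT voltage rises steadily towards lower wavelengths as the absorbance increases.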
8. Good Practice in Equipment Maintenance
The spectropolarimeter should be regularly calibrated and serviced in accordance with the manufacturer’s recommendations. It should be located in a room which is free from
excessive vibration or atmospheric dust. Because the high intensity light source will convert O2 to O3, which damages the optics, the spectropolarimeter should be purged with O2-free N2 at a flow rate of about 10 L/min prior to switching on the light source. It is advisable to allow the instrument 30 min to warm up and achieve stability before recording data. During operation, the flow rate of the N2 should be maintained; any remaining O2 will absorb strongly below 190 nm. The light source and optical system should be checked regularly and any corrective action taken when necessary. Most Xe arc lamps have a lifetime of about 1000 h before performance deteriorates (detected by an increase in the High Tension voltage). The first mirror in the optical system (which gathers the light from the source) is most rapidly degraded and may require replacement or cleaning. The need for this is indicated by a decrease in the signal/noise ratio. The cells should be handled and stored carefully to avoid mechanical or thermal damage. Traces of protein deposits should be removed using either proprietary cleaning formulations or concentrated HNO3 (note use of the latter requires considerable care since it is corrosive and a powerful oxidising agent). A typical procedure might involve a short (1-2 min) treatment with concentrated HNO3 followed by thorough washing with high purity water, then with ethanol, followed by drying with a vacuum pump.
9. Good Practice in Data Recording and Analysis
A good system of record keeping is essential for efficient data storage and retrieval. This is particularly important in cases when spectra are recorded before the concentration of protein is known accurately, so that conversion of the data into appropriate molar units can be carried out later. An integrated software package (CDtool) has been developed to allow more streamlined processing and archiving of data [12]. This is required to establish a data audit trail, which is an integral part of good laboratory practice procedures. Currently, a Protein Circular Dichroism Data Bank (PCDDB) is being developed, the aim of which is to provide open access and archiving facilities for validated CD spectra in a manner analogous to that of the well-established Protein Data Bank for X-ray crystallography and NMR structural data. The analysis of far UV CD data to give estimates of secondary structure is described in detail in the chapter on Analyses by Whitmore and Wallace. Several algorithms are available, which differ in terms of the computational procedures and in the data sets of reference proteins employed. In assessing the estimates obtained by the different procedures, the following points should be kept in mind [1,13,14]. The goodness of fit parameter NRMSD (normalised root mean square deviation), which is a measure of how closely the calculated spectrum matches the experimental one, should be less than 0.1 or ideally less than 0.05. NRMSD is defined by [Σ(θexp − θcal)² / Σ(θexp)²]^(1/2), where θexp and θcal are the experimental and calculated ellipticity values at a particular wavelength and the sums run over all wavelengths; the value of the NRMSD ranges from 0 (a perfect fit) to 1 (no fit whatsoever).
The plots of the calculated and experimental spectra should be compared to check for systematic, rather than random, differences. Systematic differences could indicate the use of an inappropriate reference data set, for example one derived from soluble proteins being used to analyse experimental data from membrane proteins. The results of using different algorithms should be compared. If very similar estimates are obtained from a number of procedures, such consensus estimates can be considered more reliable. In addition, the R value (if available) is a measure of the appropriateness of the analysis. It is defined as the sum of the differences (irrespective of sign) between the fractions of each of the secondary structural features (helix, sheet, turn) derived from CD and X-ray analyses. A low value of R (less than 0.1) indicates that the analysis is appropriate and that the data set of reference proteins is suitable for the protein being analysed.
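Both quality measures described above are simple to compute once the experimental and calculated spectra (or the CD-derived and X-ray-derived fractions) are available. The Python sketch below follows the definitions given in the text; the function names and the example fractions are illustrative, and analysis packages may report these quantities under their own conventions.

```python
import math

def nrmsd(theta_exp, theta_cal):
    """Normalised root mean square deviation between two spectra sampled at
    the same wavelengths: sqrt( sum((exp - cal)^2) / sum(exp^2) )."""
    numerator = sum((e - c) ** 2 for e, c in zip(theta_exp, theta_cal))
    denominator = sum(e ** 2 for e in theta_exp)
    return math.sqrt(numerator / denominator)

def r_value(cd_fractions, xray_fractions):
    """Sum of absolute differences between CD-derived and X-ray-derived
    secondary structure fractions (helix, sheet, turn)."""
    return sum(abs(cd_fractions[key] - xray_fractions[key])
               for key in xray_fractions)

# Invented example values for illustration only.
experimental = [10.0, -5.0, -8.0, -7.5]
calculated = [9.6, -5.2, -7.7, -7.6]
fit_quality = nrmsd(experimental, calculated)

cd_estimate = {"helix": 0.40, "sheet": 0.20, "turn": 0.15}
xray_values = {"helix": 0.45, "sheet": 0.18, "turn": 0.15}
r = r_value(cd_estimate, xray_values)

print(f"NRMSD = {fit_quality:.3f} (acceptable if below 0.1)")
print(f"R     = {r:.2f} (appropriate if below 0.1)")
```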
Figure 1. The near UV CD spectrum of α-lactalbumin. The protein was at a concentration of 2.27 mg/ml in 50 mM sodium phosphate buffer, pH 7.5 in a cell of pathlength 0.2 cm. The spectrum was recorded at 20 °C using the following instrumental settings: bandwidth, 1 nm; time constant, 0.5 s; scan rate, 50 nm/min; 1 scan. The upper panel shows the CD spectrum as raw data and the lower panel shows the corresponding High Tension voltage trace.
Figure 2. The near UV CD spectrum of α-lactalbumin. The spectrum was recorded under the same conditions as for Figure 1, except that the time constant was 2 s and the scan rate 10 nm/min.
Figure 3. The near UV CD spectrum of α-lactalbumin. The spectrum was recorded under the same conditions as for Figure 2, except that 3 scans were accumulated.
Figure 4. The near UV CD spectrum of α-lactalbumin. The spectrum was recorded under the same conditions as for Figure 3, except that the bandwidth was 1 nm (solid line) or 3 nm (dashed line).
Figure 5. The near UV CD spectrum of α-lactalbumin. The spectrum was recorded under the same conditions as for Figure 3, except that the scan rate was 10 nm/min (solid line) or 200 nm/min (dashed line).
Figure 6. The far UV CD spectrum of α-lactalbumin. The protein was at a concentration of 0.45 mg/ml (solid line) or 2.27 mg/ml (dashed line) in 50 mM sodium phosphate buffer, pH 7.5 in a cell of pathlength 0.02 cm. The spectrum was recorded at 20 °C using the following instrumental settings: bandwidth, 1 nm; time constant, 0.5 s; scan rate, 50 nm/min; 8 scans accumulated. The upper panel shows the CD spectra expressed as mean residue ellipticities in units of deg cm² dmol⁻¹ and the lower panel shows the corresponding High Tension voltage traces.
Figure 7. The far UV CD spectrum of α-lactalbumin. The protein was at a concentration of 0.45 mg/ml and spectra were recorded under the conditions described for Figure 6. The solid and dashed lines refer to spectra recorded in 50 mM sodium phosphate buffer, pH 7.5 or 50 mM sodium phosphate buffer, pH 7.5 to which 150 mM NaCl has been added respectively.
Figure 8. The far UV CD spectrum of α-lactalbumin. The protein was at a concentration of 0.45 mg/ml and spectra were recorded under the conditions described for Figure 6. The solid and dashed lines refer to spectra recorded in 50 mM sodium phosphate buffer, pH 7.5 or 50 mM sodium phosphate buffer, pH 7.5 to which 200 mM imidazole has been added respectively.
References
[1] Kelly, S.M., Jess, T.J. and Price, N.C., How to study proteins by circular dichroism, Biochim. Biophys. Acta 1751 (2005) 119-139.
[2] Wallace, B.A. and Mao, D., Circular dichroism analyses of membrane proteins: an examination of differential light scattering and absorption flattening effects in large membrane vesicles and membrane sheets, Anal. Biochem. 142 (1984) 317-328.
[3] Wallace, B.A. and Teeters, C.L., Differential absorption flattening effects are significant in the circular dichroism spectra of large membrane fragments, Biochemistry 26 (1987) 65-70.
[4] Ganesan, A., Price, N.C., Kelly, S.M., Petry, I., Moore, B.D. and Halling, P.J., Circular dichroism studies of subtilisin Carlsberg immobilised on micron sized silica particles, Biochim. Biophys. Acta 1764 (2006) 1119-1125.
[5] Price, N.C., The determination of protein concentration, in: Engel, P.C. (ed.), Enzymology Labfax, Bios Scientific Publishers, Oxford (1996) pp. 34-41.
[6] Gill, S.C. and von Hippel, P.H., Calculation of protein extinction coefficients from amino acid sequence data, Anal. Biochem. 182 (1989) 319-326.
[7] Omura, T. and Sato, R., A new cytochrome in liver microsomes, J. Biol. Chem. 237 (1962) 1375-1376.
[8] Miles, A.J., Wien, F., Lees, J.G., Rodger, A., Janes, R.W. and Wallace, B.A., Calibration and standardisation of synchrotron radiation circular dichroism and conventional circular dichroism spectrophotometers, Spectroscopy 17 (2003) 653-661.
[9] Miles, A.J., Wien, F. and Wallace, B.A., Redetermination of the extinction coefficient of camphor-10-sulphonic acid, a calibration standard for circular dichroism spectroscopy, Anal. Biochem. 335 (2004) 338-339.
[10] Miles, A.J., Wien, F., Lees, J.G. and Wallace, B.A., Calibration and standardisation of synchrotron radiation circular dichroism and conventional circular dichroism spectrophotometers. Part 2: Factors affecting magnitude and wavelength, Spectroscopy 19 (2005) 43-51.
[11] Jones, C., Schiffmann, D., Knight, A. and Windsor, S., Val-CID best practice guide: CD spectroscopy for the quality control of biopharmaceuticals, National Physical Laboratory, Teddington, U.K., Crown Copyright (2004) (ISSN 1744-0602).
[12] Lees, J.G., Smith, B.R., Wien, F., Miles, A.J. and Wallace, B.A., CDtool – an integrated software package for circular dichroism spectroscopic data processing, analysis and archiving, Anal. Biochem. 332 (2004) 285-289.
[13] Sreerama, N. and Woody, R.W., Computation and analysis of protein circular dichroism spectra, Methods Enzymol. 383 (2004) 318-351.
[14] Whitmore, L. and Wallace, B.A., DICHROWEB, an online server for protein secondary structure analyses from circular dichroism spectroscopic data, Nucleic Acids Res. 32 (2004) W668-673.
Modern Techniques for Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy B.A. Wallace and R.W. Janes (Eds.) IOS Press, 2009. © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-000-1-108
Sample Preparation and Good Practice in Synchrotron Radiation Circular Dichroism Spectroscopy
Andrew J. Miles and B.A. Wallace
Department of Crystallography, Birkbeck College, University of London
Abstract. The development of synchrotron radiation circular dichroism (SRCD) spectroscopy, which uses the intense light of a synchrotron beam, has greatly expanded the utility of circular dichroism (CD), enabling the measurement of lower wavelength data containing more electronic transitions and hence more structural information. Furthermore the higher signal-to-noise and the ability to do faster measurements facilitate high-throughput data collection using smaller samples. In general the good practice protocols required for conventional CD studies also apply to SRCD studies; however there are additional good practice issues unique to SRCD spectroscopy and these are covered in this chapter.
1. Introduction
The technique of synchrotron radiation circular dichroism (SRCD) spectroscopy was first described more than 25 years ago [1, 2] but until relatively recently there have only been a few working SRCD beamlines in existence. This is in large part because many of the proof-of-principle studies essential for their application to biological systems were not done until the new millennium [3, 4, 5]. Since that time, there have been a number of exemplification studies which have shown the value of the method in the life sciences. Consequently SRCD has gained the attention of the structural biology community, and as a result a number of new SRCD beamlines have been built and additional ones are currently under construction worldwide. The light flux from a synchrotron radiation source is far more intense (three orders of magnitude or more) than that generated by the Xenon arc lamps typically used in conventional circular dichroism (CD) instruments. Furthermore, the synchrotron light flux remains relatively constant throughout the far ultraviolet (UV) and vacuum ultraviolet (VUV) wavelength regions, where conventional lamp flux drops off dramatically as a function of decreasing wavelength. This gives rise to a number of practical advantages for SRCD spectroscopy, including a higher signal-to-noise ratio that enables the collection of spectra using shorter averaging times or lower sample concentrations, and the ability to collect data to wavelengths below the practical limits of a conventional CD instrument; these low wavelength data include additional electronic transitions which provide additional structural information [3, 4]. The Good Practice aspects of data collection described for conventional CD spectroscopy in the chapter by Kelly and Price in this volume also apply to SRCD spectroscopy; however a number of additional issues that are unique to the SRCD method are addressed in this chapter.
A.J. Miles and B.A. Wallace / Sample Preparation and Good Practice in SRCD Spectroscopy
[Figure 1 schematic. Labelled components: beam from synchrotron, beam position monitor, plane mirror, vacuum valves, toroidal grating, exit slit, LiF window, PEM, sample chamber, detector. Scale label: 8 meters.]
Figure 1. Layout of Station CD12 at the SRS Daresbury UK, showing the major components (Modified from [6]).
2. Instrumentation
Synchrotron radiation, emitted by electrons as they are accelerated at velocities close to the speed of light, includes the gamut of electromagnetic radiation from radio frequencies to hard X-rays. The whole UV wavelength range is therefore available for electronic spectroscopy, including the VUV that is not available at a useful intensity from the Xenon arc lamps used in conventional CD instruments. At an SRCD beamline UV light is extracted at a bending magnet and is reflected from a cooled collimating mirror onto a monochromator, which refocuses selected wavelengths through an exit slit, to the photoelastic modulator (PEM) and then into the sample chamber (Figure 1). Synchrotron radiation is emitted from the bending magnet in a cone, with linearly-polarised light at the center becoming increasingly elliptically-polarised towards the edges, with a corresponding decrease in the intensity. Light at the PEM should be 100% linearly polarised. Selecting the linearly-polarised component, or equal but opposite amounts of elliptically-polarised light from either side (to produce linearly-polarised light), theoretically negates the need for the linear polariser used in conventional CD instruments. Nevertheless many beamlines have now found it advantageous to incorporate one because the beam can move relative to the selection baffles, thereby skewing the polarisation of the final beam. The entire instrument, up to the sample chamber, is maintained under high vacuum to prevent a decrease in light flux from absorption at wavelengths below 200 nm by atmospheric oxygen, and to eliminate reactive oxygen species that can damage the optical elements. The sample chamber is designed to be quickly flushed with nitrogen (to eliminate oxygen) and the sample cell is located as close to the detector as possible so that any light scattered by the sample is captured. Detectors vary but are normally sensitive to wavelengths from 350 nm to 120 nm.
All windows and the PEM are constructed from materials that are as transparent as possible in the VUV wavelength region; these include CaF2, MgF2 and LiF, as opposed to quartz, which begins to absorb substantially below 200 nm and becomes effectively opaque at 160 nm.
All currently operational beamlines include these elements although the details of their designs differ [6-11]. The technical aspects of SRCD beamline design are described in more detail in the chapter on Instrumentation by Sutherland.
3. Advantages of SRCD Spectroscopy over Conventional CD Spectroscopy
Xenon arc lamps produce a light flux of ~10¹⁰ photons s⁻¹ at wavelengths > 220 nm [6]; the flux rapidly decreases at lower wavelengths, thereby limiting the acquisition of reliable data to, at best, ~185 nm and in many cases to only ~190 or 200 nm in conventional CD instruments. A synchrotron light source can be several orders of magnitude brighter than a Xenon lamp at 200 nm (Table 1) and the flux remains relatively constant well into the VUV wavelength range [6]. Consequently the signal-to-noise ratio in SRCD is higher and the spectra of proteins in aqueous solution can be accurately acquired to ~170 nm (Figure 2). At wavelengths below 175 nm, the absorbance of the water solvent becomes substantial, and this is what sets the practical low wavelength limit for aqueous samples; partially hydrated samples or samples dissolved in non-absorbing organic solvents can be measured to wavelengths as low as 130 nm. The aim is to obtain as much low wavelength data as possible, as the VUV data incorporate charge transfer transitions at around 180 nm [12] and the high energy πNB→π* transition at 140 nm [13], thus providing significantly more information than can be obtained from the far UV wavelength region alone. Also, reliable low wavelength data greatly improve the accuracy of secondary structure analyses, especially of beta-sheet rich proteins [14, 15], as described in the Analysis chapter of this book by Whitmore and Wallace. The high signal-to-noise levels characteristic of SRCD enable the acquisition of good data using shorter averaging times and fewer repeat scans than required to obtain data of a similar quality using CD instruments; therefore data collection can be considerably faster [5]. The high quality data are especially beneficial when comparing spectra with small differences that may be indistinguishable from variations due to noise in spectra measured on conventional CD instruments.
An example, illustrated by Figure 3, is the difference between the wild-type human eye lens protein, γD-crystallin and its P23T mutant that causes cataracts. A small variation between the two proteins, seen as a shoulder centered at 208 nm, although barely discernable relative to the noise (error levels) on a conventional instrument, is unequivocal and significant in the SRCD spectrum, which has much lower noise levels [14]. Similarly, Figure 4 presents a comparison of the spectra of a latexin:carboxypeptidase complex collected on both CD and SRCD instruments, demonstrating the benefits of the higher signal-to-noise ratio for detecting subtle changes [15] upon complex formation. In this case the SRCD experiment detected a significant difference between the measured spectrum of the complex and the spectrum calculated for the two proteins if they did not interact. This difference, which largely occurs below 190 nm, is not distinguishable from the noise in the CD spectra.
4. Sample Preparation
Data collection at a synchrotron beamline as opposed to an in-house conventional CD instrument has the added complication of transporting materials to the data collection site. Since this often involves air travel, it is prudent to make enquiries about equipment and consumables available at the beamline well in advance. Because of regulations regarding what may be taken by air travelers in hand and checked luggage, it is often best to send the samples to the beamlines in advance. Cold and frozen samples can be sent by courier, having first established the stability of the samples and necessary storage conditions.
[Figure 2 plot: CD (mdeg), from about -30 to 50, versus Wavelength (nm), from 170 to 270.]
Figure 2. Comparison of raw (unsmoothed) data (single scans) of the same sample collected on a conventional CD instrument (noisy signals) and an SRCD beamline (smooth signals), showing the improved signal-to-noise levels in the SRCD spectra, as well as the lower wavelength limits achievable with SRCD spectroscopy. The spectra are from myoglobin (large spectrum) and concanavalin A (small spectrum). (Adapted from [5]).
Before use, protein solutions should be degassed under vacuum to remove any dissolved oxygen, which both absorbs at the lower wavelengths available to SRCD and may form bubbles as it outgasses during data collection; this is a problem that can be exacerbated when the sample is subjected to the pressure changes inherent to air travel. The degassing procedure may not be appropriate for membrane proteins in detergent or lipids since degassing may create bubbles due to the surfactant properties of the detergents. After degassing, a bench-top centrifuge should be used to eliminate any suspended particulates, insoluble material or aggregates that can result in light scattering artifacts. Even if the protein concentration was determined prior to transport, it should be re-measured immediately before running an SRCD spectrum because precipitation and evaporation can occur en route. If this is not possible at the time of data collection, then two or more aliquots should be removed for future duplicate
quantitative amino acid analyses or spectroscopic determinations of concentrations. Methods for accurate protein concentration determination are described in the chapter on Good Practice by Kelly and Price.
[Figure 3 plots (two panels, top and bottom): Δε, from -4 to 6, versus Wavelength (nm), from 170 to 250.]
Figure 3. Comparison of (top) conventional CD spectrum and (bottom) SRCD spectrum of wild-type γD-crystallin (thick lines) and P23T mutant (thin lines). Globally, the spectra are very similar. A shoulder centered at 208 nm is more pronounced in P23T than in wild-type. In the SRCD spectrum, the difference exceeds the error bars representing one standard deviation between repeated scans of each sample, suggesting the difference is significant; however, in the CD spectrum, the difference is smaller than the reproducibility levels, and so could not be considered to be a significant difference. (Adapted from [14]).
[Figure 4 plots (two panels, top and bottom): Δε, from -4 to 6, versus Wavelength (nm), from 170 to 250.]
Figure 4. Comparison of (top) conventional CD spectra and (bottom) SRCD spectra of the complex between latexin and carboxypeptidase A. The observed spectrum of the complex is indicated in each case by the thick lines, and the calculated spectrum derived from the spectra of the two proteins measured in isolation is indicated by the thin lines. Error bars plotted on the spectra correspond to one standard deviation of the measurements. In the CD case, the differences are roughly comparable to the error levels at all wavelengths, whereas in the SRCD case, they are consistently larger than the error bars (which are very small) at wavelengths in the VUV range below around 190 nm. (Adapted from [15]).
5. Collecting Spectra - Practicalities
In most cases a preliminary study should be carried out on a conventional CD instrument to establish sample conditions for optimal data collection so that valuable beamtime is not wasted. The SRCD beamline should be calibrated immediately following each beamfill to guard against systematic errors that can arise from changes in the beam characteristics following each injection. Calibration procedures are described in detail in the Calibration chapter by the present authors. A rapid preliminary scan of the sample using a step size of 2 nm and/or a short averaging time may be useful for optimising parameters for data collection including protein concentration and cell pathlength, averaging time, number of scans, wavelength interval (spectral resolution), and wavelength cutoff limit. Usually good steady-state data can be acquired in the far UV and VUV using a step size of 1 nm, a bandwidth of 0.5 nm, and an averaging time of between 0.2 and 3 seconds. Note that both bandwidth and averaging time are instrument dependent. It is good practice to average at least three consecutive scans each of the sample and baseline, thereby ensuring reproducibility and allowing the calculation of error bars [5]. The three individual spectra and the high tension (HT) spectra collected at the same time should be overlaid to determine if there has been any systematic change (i.e. precipitation, aggregation, denaturation) of the sample whilst it was in the beam. Such changes can be detected as either increases or decreases in the HT values at a low wavelength (usually that wavelength is at or near the cutoff limit of the data). The low wavelength cutoff for aqueous samples containing UV-transparent buffers in a 10 μm pathlength quartz cell is approximately 172 nm; a 50 μm cell will cut off at 175 nm and a 100 μm cell will allow data collection to only 177 nm.
Calcium fluoride cells, having a better absorption profile than those made from quartz, allow data acquisition to 169 nm if the pathlength is less than 20 μm. In all cases it is judicious to scan to a wavelength at least 3 nm below the actual data cut-off point to accommodate data smoothing algorithms. It is also recommended that data be collected to wavelengths greater than 260 nm at the high wavelength end, where there is no significant protein CD signal. A mismatch between the protein spectrum and baseline spectrum in this region usually indicates a difference in cell orientation between measurements; in such cases both spectra should be re-measured. If this is not possible due to lack of sample or time, the subtracted spectrum can be zeroed between 267 and 273 nm, where there is no protein signal. Spectra that are not zeroed in this wavelength region will have a Y-axis offset that will have a detrimental effect on calculations of secondary structure (discussed in the Analysis chapter by Whitmore and Wallace).
5.1 Low Wavelength Cut-off Limit
As the sample absorbance fluctuates, the signal relayed from the detector to the processing electronics is held constant by altering the voltage fed into the detector. This gives rise to the high tension signal (otherwise known as the dynode voltage), which is proportional to absorbance. This is usually collected and stored simultaneously with the CD spectrum. A usable signal is maintained until the sample absorption rises to a level where the light available at the detector is so low that the HT voltage can no longer compensate for the lack of photons. For each beamline the limiting HT should be
established. Usually at the low wavelength cutoff, the HT signal will rise rapidly and the CD signal may suddenly change direction (see Figure 5 for an example of this). Sometimes, however, it is less obvious at which wavelength the CD signal has become unreliable, as the signal may look perfectly good but be suppressed by the lack of available photons. Figure 6 shows an example in which two spectra of the same protein sample were measured in cells of different pathlength. The dashed lines are the SRCD and HT signals for the sample in a 20 μm pathlength cell, and the solid lines are the SRCD and HT signals for the sample in a 6 μm pathlength cell; the spectrum from the shorter pathlength cell has been scaled to the other spectrum at 222 nm to enable facile comparison. Although the signal in the shorter pathlength cell is smaller, and thus has a lower signal-to-noise ratio, it is actually the “correct” one. This is because the CD signal becomes unreliable when the HT exceeds 800 mV, which, for the spectrum obtained in the longer cell, occurs at around 195 nm. Below that wavelength the CD signal is smooth and appears to be dependable. However, when a spectrum of the same sample is obtained in the shorter cell, it becomes apparent that the 190 nm peak of the first spectrum had been distorted (decreased) by the high sample absorption and is much less intense than it should be. This produces an erroneous spectrum. It is therefore advisable to measure the spectrum of any sample twice, either in two different pathlength cells or at two dilutions, to test whether the absorbance is too high, and to balance the tradeoff between data quality (signal-to-noise) and obtaining the lowest wavelength data possible.
As a rule of thumb, to obtain an accurate signal at 190 nm the amide absorption should give rise to an HT maximum that is below the nominal defined cutoff point by at least 10%, bearing in mind that highly absorbing species in the buffer may not permit this under any conditions.
[Figure: Δε and HT (mV) plotted against wavelength (nm).]
Figure 5. SRCD and HT spectra of two proteins collected at station CD12. The larger spectrum is from the mostly helical protein myoglobin; the smaller spectrum is from the mostly sheet protein concanavalin A. For each protein the concentration was 8 mg/ml and the cell pathlength 10 μm. The low wavelength cut-off is indicated by the arrows where the sudden increase in the HT signal (dashed lines) corresponds to a sudden change in direction of the CD signal (solid lines). (Adapted from [5]).
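The HT criteria above lend themselves to a simple automated check. The following Python sketch finds the lowest wavelength at which the HT is still within a limiting value, and applies the 10% rule of thumb to the amide region. The function names, the 185-195 nm window taken to represent the amide peak, and the numerical limit passed in are all assumptions for illustration; the limiting HT has to be established separately for each beamline.

```python
import numpy as np

def low_wavelength_cutoff(wavelength_nm, ht, ht_limit):
    """Lowest wavelength down to which HT stays at or below ht_limit.

    Walks from the longest wavelength downwards and stops at the first point
    where HT exceeds the limit. Returns None if even the longest-wavelength
    point is above the limit.
    """
    wl = np.asarray(wavelength_nm, dtype=float)
    ht = np.asarray(ht, dtype=float)
    cutoff = None
    for i in np.argsort(wl)[::-1]:      # indices from long to short wavelength
        if ht[i] > ht_limit:
            break                        # data below this point are unreliable
        cutoff = float(wl[i])
    return cutoff

def amide_peak_ok(wavelength_nm, ht, ht_limit, window=(185.0, 195.0), margin=0.10):
    """True if the HT maximum inside `window` is at least `margin` below ht_limit."""
    wl = np.asarray(wavelength_nm, dtype=float)
    ht = np.asarray(ht, dtype=float)
    mask = (wl >= window[0]) & (wl <= window[1])
    if not mask.any():
        raise ValueError("no data points inside the window")
    return bool(ht[mask].max() <= (1.0 - margin) * ht_limit)
```

Because the recommendation is to scan at least 3 nm below the usable cut-off, the value returned by `low_wavelength_cutoff` is best read as the limit for reporting data, not as the end point of the scan.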
[Figure: Δε and HT (mV) plotted against wavelength (nm).]
Figure 6. SRCD and HT spectra of the same protein sample (myoglobin) collected in a 20 μm pathlength cell (dashed line) and a 6 μm pathlength cell (solid line). The spectra were scaled to the same ellipticity values at 222 nm but the 190 nm peak in the sample with higher absorption is depressed with respect to that with lower absorption, clearly illustrating the effect of signal distortion where there is high absorption below 200 nm as indicated by the HT signal (thin lines for these samples, dotted and solid, respectively).
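The comparison shown in Figure 6 can be reproduced numerically by scaling the short-pathlength spectrum to the long-pathlength one at 222 nm and then comparing the two at 190 nm. The sketch below is one way this might be done in Python; the function names are hypothetical, both spectra are assumed to lie on the same wavelength grid, and the nearest grid point is used in place of interpolation.

```python
import numpy as np

def scale_to_reference(wavelength_nm, spectrum, reference, at_nm=222.0):
    """Scale `spectrum` so that it matches `reference` at the wavelength `at_nm`."""
    wl = np.asarray(wavelength_nm, dtype=float)
    spec = np.asarray(spectrum, dtype=float)
    ref = np.asarray(reference, dtype=float)
    i = int(np.argmin(np.abs(wl - at_nm)))   # nearest grid point to at_nm
    if spec[i] == 0:
        raise ValueError("cannot scale: spectrum is zero at the chosen wavelength")
    return spec * (ref[i] / spec[i])

def relative_suppression(wavelength_nm, long_path, short_path,
                         at_nm=190.0, scale_nm=222.0):
    """Fractional deficit of the long-path signal at `at_nm`, relative to the
    short-path signal after scaling the latter at `scale_nm`."""
    wl = np.asarray(wavelength_nm, dtype=float)
    long_path = np.asarray(long_path, dtype=float)
    scaled = scale_to_reference(wl, short_path, long_path, at_nm=scale_nm)
    i = int(np.argmin(np.abs(wl - at_nm)))
    return float(1.0 - long_path[i] / scaled[i])
```

A value near zero suggests that the two cells agree at 190 nm; a clearly positive value indicates that the peak in the longer cell has been depressed, as in Figure 6.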
[Figure: CD (mdeg) and HT (mV) plotted against wavelength (nm).]
Figure 7. SRCD (thin solid line) and HT (thin dashed line) spectra of lysozyme in 150 mM NaCl, 20 mM phosphate, pH 7.0. The low wavelength cut-off limit of this sample is ~187 nm. The same sample in 150 mM NaF, 20 mM phosphate, pH 7.0 (thick lines) has a low wavelength cut-off of ~170 nm. (Adapted from [5]).
[Figure: CD (mdeg) and HT (mV) plotted against wavelength (nm).]
Figure 8. SRCD and HT spectra of lysozyme in: 50 mM Tris HCl, pH 7.0 (thin lines) and 50 mM Tris sulphate, pH 7.0 (thick lines). (Adapted from [5]).
5.2 Selection of Buffer Systems
Salts and buffers can absorb light, even if non-chiral. Their absorbance will lower the signal-to-noise ratio and reduce the quality of the data because fewer photons reach the detector. Therefore, if possible, any compounds that absorb strongly in the wavelength region of interest should be substituted by more transparent ones. For example, solutions containing chloride ions are notably opaque in the VUV and low far UV wavelength regions at concentrations above 50 mM, even in small pathlength cells. If high ionic strength is required for protein stability, an alternative to NaCl is the corresponding fluoride salt, NaF (Figure 7). However, before use it must be ascertained whether the fluoride ions have any detrimental effect on the protein or other constituents of the sample (e.g. metal ions react strongly with fluoride). Similarly, buffers can sometimes be acidified with sulphuric acid rather than HCl (Figure 8) to improve penetration into the VUV. Phosphate and borate buffers are much less problematic than buffers containing compounds with carbonyl or amide groups, such as glycine or HEPES. For membrane proteins, many commonly used detergents such as octyl glucoside, dodecyl maltoside, C12E8, LDAO, lysolecithin and deoxycholate can be used at low concentrations; however, some detergents such as CHAPS or Triton X100 absorb strongly below 210 nm and should not be used for CD studies.
Likewise, absorbing additives such as β-mercaptoethanol, dithiothreitol and sodium azide should be avoided. Chaotropic reagents such as guanidine HCl and especially urea have an adverse absorption profile and, at the high concentrations generally used to unfold proteins, it may not be possible to monitor the low wavelength peaks in their presence; however, the denaturing detergent sodium dodecylsulphate (SDS) is usually not a problem with respect to absorbance. Imidazole, which is often used in purification procedures to isolate expressed proteins with His tags, absorbs strongly and should be removed before examining the protein by CD or SRCD.
5.3 Selection and Loading of Sample Cells
Sample cells should be calibrated by the methods described in the Calibration chapter by the present authors, as the actual pathlengths (especially for short pathlength cells) can differ significantly from those cited by the manufacturers. Usually circular demountable cells are used for SRCD for several reasons: 1) they can be obtained with very short pathlengths, which are particularly helpful in eliminating water absorption (see above), 2) they use small volumes of material, which can often be an advantage for studies of scarce or precious proteins, 3) they can easily be cleaned, and 4) the circular geometry is more suitable to SRCD than the rectangular shape often used in conventional CD instruments. The cells are a better match with beam shape and scattering samples can produce isotropic scattering profiles by scattered light intercepting the cell edge in some directions before reaching the detector. It is particularly useful to have an adaptor on conventional CD instruments that will also permit the use of circular cells, so that the sample can be examined by CD beforehand in the same cell and under the same conditions as will be used in the SRCD beamline.
Figure 9. SRCD cell holder designed to accommodate demountable cylindrical short pathlength cells.
It is imperative that cells are loaded so that they always have the same orientation with respect to the beam, and that the two plates of a demountable cell are aligned with each other and placed in the cell holder with the same orientation each time. This will ensure that the baseline and sample spectra exhibit the same birefringence, which will cancel out upon baseline subtraction; to this end it is recommended that the same person load the sample and the baseline to ensure systematic reproducibility. Figure 9 shows a cell holder design, adopted by most SRCD beamlines, that permits highly reproducible loadings of the cell [16]. The cells are inserted between the PTFE washers (using different numbers of washers to compensate for the overall
thickness of cells with different pathlengths). The lid can be screwed down to provide the same symmetric pressure on the cell each time, whilst the PTFE washers prevent the cell from rotating with the lid. Calcium fluoride cells can be especially problematic with respect to orientation and loading, since they are more intrinsically birefringent than quartz cells and also have a lower friction coefficient, which means the two halves are more liable to move with respect to the cell holder and each other when the cell holder lid is tightened. However, they do have the advantage of a good absorption profile, and samples can be easily recovered from their wetted surfaces.
6. Irradiation (Sample Heating) Effects at SRCD Beamlines
When the first SRCD beamlines became operational, proof-of-principle experiments were done to examine whether the high intensity of the synchrotron UV light would create destructive free radicals in aqueous solution. It was demonstrated on two lower energy beamlines, 3.1 at SRS Daresbury, UK (now decommissioned) and UV1, ISA Aarhus, Denmark, that the beam caused no change in protein integrity (as monitored by mass spectrometry and SDS-PAGE) or in the protein spectrum following extended exposure of up to 24 hours at low wavelengths [17,18]. However, the light flux from a beamline such as CD12 (the replacement for station 3.1 at the SRS, Daresbury) was some 2 to 3 orders of magnitude higher than that of the earlier beamline. It became immediately apparent from the reduction in spectral intensity and change in spectral shape between consecutive scans that protein samples were being affected (Figure 10 top). An extensive study of the phenomenon demonstrated that, for each protein studied, the integrity of the peptide chain was not compromised by the beam, and that the change in signal was likely to be caused by thermal denaturation (unfolding) of the protein rather than by free radical damage. The proposed mechanism, now generally accepted, was that this was due to the heating of water molecules internally bound to the protein [18], since there is a strong water absorption peak around 170 nm. In some cases the proteins can refold to their native conformation when removed from the beam. The problem can be minimised if the cell is reloaded with a new sample for each repeat scan, but this can be costly in terms of both time and sample. However, to record only a single spectrum so as not to see the problem is NOT an acceptable solution, as the availability of repeat scans is important for assessing the reproducibility and error levels in the data.
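One way to make the scan-to-scan comparison quantitative is to track the signal at a single wavelength across consecutive scans and report the first scan at which it has departed from the first scan by more than a chosen fraction. The Python sketch below illustrates the idea; the function name and the 5% tolerance are assumptions for illustration and are not values taken from the studies cited.

```python
import numpy as np

def first_decayed_scan(scans, index, tolerance=0.05):
    """1-based number of the first scan whose signal at `index` differs from
    scan 1 by more than `tolerance` (as a fraction); None if no scan does.

    scans: array-like of shape (n_scans, n_points), in order of acquisition.
    """
    scans = np.asarray(scans, dtype=float)
    first = scans[0, index]
    if first == 0:
        raise ValueError("reference signal is zero at the chosen index")
    change = np.abs(scans[:, index] - first) / abs(first)
    over = np.nonzero(change > tolerance)[0]
    return int(over[0]) + 1 if over.size else None
```

In practice the tolerance would need to be set relative to the noise level of the individual scans, so that ordinary scan-to-scan scatter is not mistaken for denaturation.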
Also, the beamline will be unsuitable for experiments that require a sample to remain in the beam for extended periods; for example, thermal melt or other time course experiments. Thermal damage tends not to be a problem for lower energy synchrotrons or lower flux beamlines such as those at ISA (Denmark) and BSRF (China), where in both cases it was possible to obtain 20 or more consecutive scans of a particularly sensitive protein with no effect on the integrity of the protein [11,19], as shown in Figure 10 (bottom). A survey of the light flux and the length of beam exposure tolerated by a particularly sensitive protein at a number of the SRCD stations currently in use (Table 1) demonstrated that the problem can be avoided if the flux density of the beam at the sample is not above 0.4 × 10¹¹ photons s⁻¹ mm⁻² [19]. If the flux is slightly higher, the flux density (flux per unit area) can be reduced by increasing the spot size with no deleterious effect on the signal-to-noise ratio or, as a study on CD12 at the SRS Daresbury indicated, reducing the slit size can alleviate the problem, albeit at the cost of signal-to-noise [20]. New beamlines at third-generation
high energy sources such as DIAMOND and Soleil (beamlines under construction at time of writing) will need to consider this issue in their design.
[Figure: two panels (top, bottom), each showing Δε plotted against wavelength (nm).]
Figure 10. (top) 25 consecutive scans of human serum albumin measured on CD12, SRS Daresbury where thermal effects of the beam are evident from the changes in the curves. (bottom) The first and 25th scan of human serum albumin measured at UV1, ISA where, by contrast, the beam has no discernible effect on the protein spectrum. (Adapted from [18]).
Table 1. Flux, intensity, spot size and characteristics of a number of SRCD beamlines. The effects of these beams on the lifetime of the protein human serum albumin (HSA) are indicated in the last two lines (Adapted from [19]).
SRCD Beamline                                            | UV1   | 4B8  | CD1  | U11  | CD12
Location (synchrotron)                                   | ISA   | BSRF | ISA  | NSLS | SRS
Slit size (mm)                                           | 0.5   | 4.2  | 0.4  | 1.0  | 4.4
Horizontal beam size at slit (mm)                        | 5     | 40   | 5    | 5    | 10
Spectral bandwidth (nm)                                  | 0.5   | 1.0  | 0.6  | 0.32 | 1.0
Scan speed (nm min⁻¹)                                    | 22    | 12   | 22   | 15   | 16
Max flux @ 200 nm (× 10¹¹ ph s⁻¹)                        | 1.5   | 0.6  | 4.5  | 30   | >250
Spot size at sample (mm²)                                | 10    | 6    | 12   | 4    | 25
Flux density (× 10¹¹ ph s⁻¹ mm⁻²)                        | 0.125 | 0.1  | 0.37 | 7.5  | 20
Time HSA remains stable (minutes)                        | >3600 | >180 | 35   | ~20  | <12
Number of scans before significant signal decay observed | 25+   | 14+  | 7    | 2    | 1
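The flux density entries in Table 1 are, in effect, the maximum flux divided by the spot size at the sample, and the same arithmetic shows how large the spot would need to be to stay at or below the 0.4 × 10¹¹ photons s⁻¹ mm⁻² limit quoted in the text. The short Python sketch below works through this; the function names are hypothetical and the input values are copied from Table 1. CD12 is left out because its flux is given only as a lower bound (>250). For UV1 the simple quotient is 0.15 whereas the table lists 0.125; the reason for the difference is not given here.

```python
THRESHOLD = 0.4  # x 10^11 photons s^-1 mm^-2, the limit quoted in the text [19]

# (max flux at 200 nm in 10^11 ph/s, spot size at sample in mm^2), from Table 1
BEAMLINES = {
    "UV1": (1.5, 10.0),
    "4B8": (0.6, 6.0),
    "CD1": (4.5, 12.0),
    "U11": (30.0, 4.0),
}

def flux_density(flux, spot_mm2):
    """Flux per unit area, in the same 10^11 ph s^-1 mm^-2 units as Table 1."""
    return flux / spot_mm2

def spot_size_for_threshold(flux, threshold=THRESHOLD):
    """Smallest spot area (mm^2) that keeps the flux density at or below threshold."""
    return flux / threshold

def exceeds_threshold(flux, spot_mm2, threshold=THRESHOLD):
    """True if the flux density at the sample is above the threshold."""
    return flux_density(flux, spot_mm2) > threshold
```

For example, `spot_size_for_threshold(30.0)` indicates that the U11 flux in Table 1 would have to be spread over roughly 75 mm² to reach the threshold, compared with the 4 mm² spot listed.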
7. Analyses of SRCD Data
Previously available protein reference datasets used for the analysis of protein secondary structures (discussed in the chapter on Reference datasets by Janes) do not contain the extra low wavelength data available to SRCD, and so do not permit users to take advantage of the extra information present in these spectra. Hence a new database of 71 proteins with a low wavelength cutoff of 175 nm (called SP175) [21] has been created for analyses of SRCD data; it is available at the DICHROWEB online calculation server (discussed in more detail in the Analysis chapter by Whitmore and Wallace) [22]. SP175 has been designed to include a broad range of secondary structures and folds, and is based on high quality crystal structures and SRCD data. Cross validation studies have shown it to out-perform other data sets currently in the public domain [21]; the improvement is especially notable for beta-sheet rich proteins.
8. Kinetic Experiments and Other New Techniques
At the present time facilities are being developed to enable measurements of fast kinetic processes such as protein folding at several SRCD beamlines. As a result of the improved signal-to-noise ratios (due to the higher flux using synchrotron light), it should be possible to measure smaller samples, resolve faster processes, and monitor changes at lower wavelengths than can be achieved using stopped-flow attachments on conventional CD instruments. In the future, it may also be possible to use the whole spectrum of light (“white light”) with new detectors that can make measurements at multiple wavelengths simultaneously [23]. This will enable monitoring of very fast conformational changes when combined with improved stopped-flow and temperature jump technologies for initiating the dynamic processes to be monitored.
Other technologies being developed include the simultaneous measurement of SRCD, fluorescence and absorbance [1], and also synchrotron radiation-based linear dichroism (LD) spectroscopy (described in the chapter in this volume by Rodger), which may be a particularly useful technique for monitoring samples that are either intrinsically elongated or that can be aligned. For the latter, a micro Couette LD cell, in which samples are oriented by the viscous drag generated in a solution flowing in a narrow gap between the walls of a rotating cylindrical cell, has been designed at the University of Warwick [24] and successfully tested on station UV1 at ISA Aarhus. Comparisons of flow-oriented samples by LD and oriented films by SRCD are proving useful for defining the orientations of peptides in membrane systems [25].
9. Practical Aspects in the Use of Synchrotrons
Applications for SRCD beamtime must often be made six months or more in advance of the beamtime slot provided. This allows plenty of time to complete trial experiments on a conventional CD instrument and to find optimal conditions for sample preparation, so that the SRCD beamtime can be used efficiently, but it can be a detriment to obtaining timely results. Consequently, several SRCD facilities have instigated “rapid access” models, with much shorter lead times, for short and preliminary proof-of-concept experiments. The availability of auxiliary apparatus such as centrifuges, sonicators, microscopes, vacuum desiccators and pumps, absorption spectrophotometers, sample cells, etc. should be ascertained before traveling, as it may be necessary to take some portable equipment to the beamline. It is advisable to arrive at the facility a day before the allotted beamtime begins so that safety induction and registration procedures and initial sample preparations can be completed, leaving the first morning free for doing experiments on the beamline. Most synchrotrons have one or two beam fills each day; the times will be posted on the facility website or on monitor screens. It is important to note these to ensure that the sample and baseline spectra for a given experiment are both obtained within the same beamfill, thus minimizing systematic errors. Once accustomed to the instrument, it may be efficient to divide the team of experimenters to work in shifts (depending on how many users there are and how experienced they are) to utilize the whole of the allotted time, as synchrotrons usually operate 24 hours a day, 7 days a week.
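The requirement that sample and baseline spectra fall within the same beamfill is easy to check when planning or logging measurements. The Python sketch below uses only the standard library; the function names and the example times are hypothetical, and the fill times themselves would come from the facility schedule.

```python
from bisect import bisect_right
from datetime import datetime

def fill_index(sorted_fill_times, t):
    """Index of the most recent beamfill at or before time t (-1 if none).

    sorted_fill_times must be in ascending order.
    """
    return bisect_right(sorted_fill_times, t) - 1

def same_beamfill(fill_times, sample_time, baseline_time):
    """True if both measurements were started after the same fill and before
    the next one. Measurements made before the first listed fill return False,
    because the fill they belong to is unknown."""
    fills = sorted(fill_times)
    i = fill_index(fills, sample_time)
    return i >= 0 and i == fill_index(fills, baseline_time)

# Example with made-up times: two fills in one day.
fills = [datetime(2008, 12, 1, 8, 0), datetime(2008, 12, 1, 20, 0)]
ok = same_beamfill(fills, datetime(2008, 12, 1, 9, 0), datetime(2008, 12, 1, 11, 30))
```

This checks only the start times; a scan that straddles an injection would need its end time tested in the same way.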
10. SRCD Facilities
At the time of writing (December, 2008) there are more than forty synchrotron storage rings in operation, with a further twenty or more currently under construction worldwide. At present there are SRCD beamlines at the NSLS Brookhaven (U.S.A.), ISA in Aarhus (Denmark), HSRC/HiSOR in Hiroshima (Japan), Bessy2 in Berlin (Germany), the BSRF in Beijing (China), the NSRL in Hefei (China), and the NSRRC in Taipei (Taiwan). Other beamlines are being developed at Diamond (U.K.), which will be the replacement for the SRS Daresbury (U.K.), at Soleil (France) and at the Melbourne Synchrotron (Australia), together with an additional beamline (CD1) at ISA in Denmark. CD12, formerly at the SRS Daresbury, is being relocated to ANKA (Germany), and other SRCD beamlines are in the planning stages at sites worldwide. Details of the specifications of the
beamlines are described in the chapter on Instrumentation by Sutherland in this volume.
Acknowledgements
This work was supported by grants from the U.K. Biotechnology and Biological Sciences Research Council to BAW. Beamtime at the SRS, Daresbury was supported by a Programme Mode Access grant to BAW and R.W. Janes (Queen Mary, University of London). Beamtime access at ISA was enabled by the European Community – Research Infrastructure Action under the FP6. Beamtime access to the 4B8 beamline was enabled by a grant to BAW and RWJ from the BSRF, and beamtime access at the National Synchrotron Light Source, Brookhaven National Laboratory was supported by the U.S. Department of Energy, Division of Materials Sciences and Division of Chemical Sciences, under Contract No. DE-AC02-98CH10886.
References
[1] J.C. Sutherland, E.J. Desmond and P.Z. Takacs, Versatile spectrometer for experiments using synchrotron radiation at wavelengths greater than 100 nm, Nucl. Instr. Meth. 172 (1980) 195-199.
[2] P.A. Snyder and E.M. Rowe, The first use of synchrotron radiation for vacuum ultraviolet circular dichroism measurements, Nucl. Instr. Meth. 172 (1980) 345-349.
[3] B.A. Wallace, Synchrotron radiation circular-dichroism spectroscopy as a tool for investigating protein structures, J. Synchrotron Rad. 7 (2000) 289-295.
[4] B.A. Wallace and R.W. Janes, Synchrotron radiation circular dichroism spectroscopy of proteins: Secondary structure, fold recognition and structural genomics, Curr. Opin. Chem. Biol. 5 (2001) 567-571.
[5] A.J. Miles and B.A. Wallace, Synchrotron radiation circular dichroism spectroscopy of proteins and applications in structural and functional genomics, Chem. Soc. Rev. 35 (2006) 39-51.
[6] D.T. Clarke and G.R. Jones, CD12: A new high-flux beamline for ultraviolet and vacuum-ultraviolet circular dichroism on the SRS, Daresbury, J. Synchrotron Rad. 11 (2004) 142-149.
[7] D.T. Clarke, M.A. Bowler, B.D. Fell, J.V. Flaherty, A.F. Grant, G.R. Jones, M.L. Martin-Fernandez, D.A. Shaw, B. Todd, B.A. Wallace and E. Towns-Andrews, A high aperture beamline for vacuum ultraviolet circular dichroism on the SRS, Synchrotron Rad. News 13 (2000) 21-27.
[8] N. Ojima, K. Sakai, T. Fukazawa and K. Gekko, Vacuum-ultraviolet circular dichroism spectrophotometer using synchrotron radiation: Optical system and off-line performance, Chem. Lett. 7 (2000) 832-833.
[9] J. Qian, Y.L. Yan and Y. Tao, Design and calibration of the monochromator in 3B1B beamline, High Energy Phys. Nucl. Phys. (Chinese Edition) 27 (2003) 125-128.
[10] S. Miron, M. Refregier, A.M. Gilles and J.-C. Maurizot, New synchrotron radiation circular dichroism end-station on DISCO beamline at SOLEIL synchrotron for biomolecular analysis, Biochim. Biophys. Acta 1724 (2005) 425-431.
[11] A.J. Miles, S.V. Hoffmann, Y. Tao, R.W. Janes and B.A. Wallace, Synchrotron radiation circular dichroism: New beamlines and new applications in biology, Spectroscopy 21 (2007) 245-255.
[12] A.T.B. Gilbert and J.D. Hirst, Charge-transfer transitions in protein circular dichroism spectra, J. Mol. Struct. (Theochem) 675 (2004) 53-60.
[13] L.B. Clark, Polarization assignments in the vacuum UV spectra of the primary amide, carboxyl and peptide groups, J. Am. Chem. Soc. 117 (1995) 7974-7986.
[14] P. Evans, K. Wyatt, G.J. Wistow, O.A. Bateman, B.A. Wallace and C. Slingsby, The P23T cataract mutation causes loss of solubility of folded γ-D-crystallin, J. Mol. Biol. 343 (2004) 435-444.
[15] N.P. Cowieson, A.J. Miles, R.J. Gautier, K. Forwood, B. Kobe, J.L. Martin and B.A. Wallace, Evaluating protein:protein complex formation using synchrotron radiation circular dichroism spectroscopy, Proteins: Struct. Func. Bioinform. 70 (2007) 1142-1146.
[16] F. Wien and B.A. Wallace, Calcium fluoride micro cells for synchrotron radiation circular dichroism spectroscopy, App. Spectroscopy 59 (2005) 1109-1113.
[17] A.J.W. Orry, R.W. Janes, R. Sarra, M.R. Hanlon and B.A. Wallace, Synchrotron radiation circular dichroism spectroscopy: Vacuum ultraviolet irradiation does not damage protein integrity, J. Synchrotron Rad. 8 (2001) 1027-1029.
[18] F. Wien, A.J. Miles, J.G. Lees, S.V. Hoffmann and B.A. Wallace, VUV irradiation effects on proteins in high-flux synchrotron radiation circular dichroism spectroscopy, J. Synchrotron Rad. 12 (2005) 517-523.
[19] A.J. Miles, R.W. Janes, A. Brown, D.T. Clarke, J.C. Sutherland, Y. Tao, B.A. Wallace and S.V. Hoffmann, Light intensity threshold at which protein denaturation is induced by synchrotron radiation circular dichroism (SRCD) beamlines, J. Synchrotron Rad. 15 (2008) 420-422.
[20] R.W. Janes and A.L. Cuff, Overcoming protein denaturation caused by irradiation in a high-flux synchrotron radiation circular dichroism beamline, J. Synchrotron Rad. 12 (2005) 524-529.
[21] J.G. Lees, F. Wien, A.J. Miles and B.A. Wallace, A reference database for circular dichroism spectroscopy covering fold and secondary structure space, Bioinformatics 22 (2006) 1955-1962.
[22] L. Whitmore and B.A. Wallace, DICHROWEB, An online server for protein secondary structure analyses from circular dichroism spectroscopic data, Nucleic Acids Res. 32 (2004) W668-673.
[23] S. Manolopoulos, D. Clarke, G. Derbyshire, G. Jones, P. Read and M. Torbet, A new multichannel detector for proteomics studies and circular dichroism, Nucl. Instr. Meth. Phys. Res. 531 (2004) 302-306.
[24] R. Marrington, T.R. Dafforn, D.J. Halsall and A. Rodger, Micro volume Couette flow sample orientation for absorbance and fluorescence linear dichroism, Biophysical J. 87 (2004) 2002-2012.
[25] B. Perrone, A.J. Miles, B. Bechinger, S.V. Hoffmann and B.A. Wallace, Oriented synchrotron radiation circular dichroism and linear dichroism spectroscopy of peptides in model membranes, Biophysical J. (2009), in press.
Modern Techniques for Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy B.A. Wallace and R.W. Janes (Eds.) IOS Press, 2009. © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-000-1-125
Reproducible Circular Dichroism Measurements for Biopharmaceutical Applications
Jascindra Ravi, Anna E. Hills and Alex E. Knight
Biotechnology Group, National Physical Laboratory
Abstract. Circular Dichroism (CD) spectroscopy is an important technique for measuring the structure of protein biopharmaceutical products, which are increasingly important to the pharmaceutical industry. Its application tends to be rather different to pure research, in that it is more important to compare spectra of different samples or batches of material. However, this requires that spectra are obtained that are comparable, and that there exist objective methods for comparing spectra. We review the issues involved in obtaining good quality spectra, including instrument calibration, reference materials, cell pathlength, protein concentration and instrument maintenance, and the progress that is being made in these areas. We also discuss the available approaches for the objective comparison of CD spectra.
1. Introduction
This chapter differs from most others in this book in that it is primarily concerned with an industrial application of CD – the quality control of biopharmaceuticals – rather than its use in “pure” research. However, much of the discussion is also relevant to making reproducible measurements of protein structure in other contexts, such as academic research, particularly where results will be stored in databases or compared with results from other laboratories. We begin by discussing the importance of biopharmaceuticals, and the requirements for the use of CD in this context. Then we discuss why reproducibility in CD is not a trivial issue, and how this problem is being addressed; we also make some brief recommendations on getting the most reproducible CD data. Finally, we briefly present some chemometric approaches to data analysis that might be useful in this context.
2. Biotechnology and Biopharmaceuticals
2.1. What are Biopharmaceuticals?
Biological molecules as therapeutic agents have proved to be an important and growing sector in the pharmaceutical industry since the approval of the first human recombinant protein, insulin, in 1982. Advances in recombinant DNA technology over the last twenty-five to thirty years have led to an increase in the types of biotechnology
products (biopharmaceuticals) seeking regulatory approval, such as vaccines, peptide and protein therapeutics, DNA and gene therapy products. This chapter is principally focussed on protein biopharmaceutical products. Spectroscopic techniques, such as CD, have an important place in the suite of analytical methods that can be used to characterise these large, complex molecules where the biological activity depends on the primary, secondary, tertiary and quaternary structure. For most biotechnology products, there is a regulatory requirement that the biological activity or “potency” of individual batches is tested upon release of a batch; the product is not assessed solely by physicochemical techniques to determine its activity and a biological assay is required that reflects the mechanism of action in the body. However, in a few cases, there may be adequate data to link the physicochemical attributes to functional activity. Recombinant insulin and growth hormone products are well characterised enough by physicochemical methods that it is unnecessary to test each batch by bioassay (although some regional differences in regulatory requirements exist). With advances in analytical instrumentation to study the structural components of biopharmaceuticals, this approach may also provide insights for the development of similar products. Spectroscopic techniques are also used to demonstrate batch-to-batch consistency and to demonstrate that the products’ higher order structure is stable over the stated shelf-life. In addition to characterisation of purified product, spectroscopic methods can be applied to monitor “upstream” bioprocessing steps (fermentations) or subsequent purification stages. These are attractive technologies as they allow non-invasive, non-destructive monitoring during production, which can help identify failures earlier than at end product testing.
2.2. Regulatory Drivers
Characterisation of a biopharmaceutical starts early in the development process (see Figure 1), where knowledge of structural conformations or drug-ligand interactions may aid drug discovery and candidate selection. Early characterisation benefits product development by providing supporting data to demonstrate the comparability of material used throughout development. As the production process is optimized or scaled up, small changes in the process may affect the structure of the product. A set of guidelines exists for the procedures that are appropriate for the development and registration of biopharmaceuticals, set out by the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH). ICH makes recommendations on ways to achieve greater harmonisation between the regulatory regions in the interpretation and application of technical guidelines and requirements for product registration. Guideline Q6B specifically covers “Test Procedures and Acceptance Criteria for Biotechnological/Biological Products”, including the physicochemical properties that should be studied [1]. Within this section, some spectroscopic techniques are specified:
J. Ravi et al. / Reproducible Circular Dichroism Measurements for Biopharmaceutical Applications

[Figure 1: flow diagram of development stages. Research and Discovery; Pre-clinical development; Clinical Development and testing (Phase I, Phase II, Phase III); Registration and approval; Commercial Manufacture. Annotation: CD confirms structural identity of protein at every stage.]
Figure 1. Stages of drug development. Schematic view of the phases of development of a biopharmaceutical, showing the role of CD throughout the process. Here CD is used to show that the three-dimensional structure of different samples of the protein is the same. Since the structure of the protein is important for its potency and safety, it is critical to establish that the structure of a batch destined for use in patients is the same (as near as can practically be determined) to that used in the clinical trials.
“Spectroscopic profiles. The ultraviolet and visible absorption spectra are determined as appropriate. The higher-order structure of the product is examined using procedures such as circular dichroism, nuclear magnetic resonance (NMR), or other suitable techniques, as appropriate.”

Circular dichroism is therefore identified by name in the key regulatory document covering this area.

2.3. Determination of Biopharmaceutical Structure

The regulatory requirements on quality control of the structure of biopharmaceuticals exist because of the vital importance of the three-dimensional structure of the molecule. Incorrectly folded molecules may lack the potency of the correct product, and may be less stable, for example, through being prone to aggregation. Aggregated products in particular are of concern as they are often more immunogenic than the soluble product, and this can lead to an immune response either to the drug or to related host molecules, with potentially severe consequences for the patient.

The theory of CD is discussed in depth elsewhere in this book, so we need not concern ourselves with the details here. Both far and near ultraviolet (UV) spectra are useful tools for biopharmaceutical quality control. Far-UV spectra (180-260 nm) are due to absorption by peptide bonds, and can thus yield information about the overall secondary structure of the molecule. CD in the near-UV region (240-360 nm) is principally due to the amino acid side chains. These spectra provide a fingerprint of the tertiary structure, including disulphide bridges. Sensitivity to structural changes makes CD very useful for comparative purposes, particularly for batch-to-batch analysis or in accelerated stability trials, where the molecule is exposed to extremes of temperature or pH in order to identify degradation pathways that the molecule may undergo.
The major advantages of CD are the speed of the analysis (minutes, compared to hours), and that it is non-destructive and uses relatively inexpensive equipment. Contrast this with NMR, where the cost and complexity are much greater.
[Figure 2: two workflow panels. Upper panel: Far UV Spectrum; Database of spectra of proteins of known structure; Secondary Structure Prediction. Lower panel: Far or Near UV Spectrum; Reference spectra of protein being analysed; Structural similarity?]
Figure 2. Comparative approach for CD. A simplified view of CD workflows in research (upper panel) and in biopharmaceutical analysis (lower panel). In research applications, the far UV spectrum of a protein is often determined so as to enable a prediction of its secondary structure to be made, using databases of spectra from proteins of known structure, and a variety of algorithms. However, in the biopharmaceutical context it is more important to know whether a sample under analysis is effectively the same as previous samples, such as those used in the clinical trials. Here, a statistically robust yes/no answer is required.
3. Using CD for Quality Control

3.1. Contrast with Research Applications

From the foregoing discussion, it should be apparent that CD in this industrial application is used rather differently from its typical applications in research (see Figure 2). The purpose of the measurement is not always to obtain a priori information about the structure of the molecule, or to monitor an experimentally-induced structural change, but often is to compare the spectral profiles of different samples of the protein, which may be measured in different places and times over a timescale of years, or even decades.

For example, a conventional approach to the characterisation of proteins is to apply one of a variety of deconvolution algorithms to obtain information about the secondary structure of the protein. This approach is typically not useful for biopharmaceuticals, where the ‘canonical’ or correctly folded structure may have been determined in great detail by X-ray crystallography, NMR spectroscopy and a variety of other approaches. Furthermore, the deconvolution algorithms do not provide a reliable way of comparing spectra. For example, there may not be a unique solution to the deconvolution of a particular spectrum, or more than one spectral shape may give rise to a similar result. In many proteins the near UV spectrum may return a more detailed and useful spectral “fingerprint”, although the usefulness and structural sensitivity will depend on the side chains present in the protein and their sensitivity to structural changes in the protein.

So it is clear that for CD spectroscopy to be useful in this context, there are two key requirements:

1. That the spectra obtained at different times and places, on different instruments and by different operators, are consistent and therefore comparable. If the spectra are not comparable, any spectral differences that are observed cannot be assumed to come from changes in the protein structure, but may instead result from errors in the measurement process.
2. That there exist objective methods for the comparison of the spectra, that are easy to apply and interpret, but have sufficient rigour that they will be acceptable for regulatory purposes.

We discuss the barriers to meeting these requirements in the following sections.
4. Getting Comparable Data

4.1. Comparability and Traceability

As we have shown above, in a quality control environment we do not need to interpret CD spectra; we simply need to compare spectra to a reference data set. Before we can compare them, they need to be comparable. Why is this not a trivial matter, and what does this imply about the approaches that CD users will need to adopt? There may be opportunities to apply some principles from the world of measurement science, or metrology.

Any time we want to compare two measurements to determine if they are the same, we need to know something about the uncertainty of the measurements. This is because in virtually every case measurements will contain some degree of error, or to put it another way, there is a degree of uncertainty about the true value of whatever it is we are trying to measure. If we don’t know how big this uncertainty is, we don’t really know whether two measurements are really different or just appear to be different because of the errors in the measurement. Let’s take a prosaic example of measuring the length of some objects with a ruler. If we know that our measurements have an uncertainty of ±1 mm, then we can be pretty confident that measurements of 15.2 mm and 16.8 mm really are different. However, we can’t be sure about the difference between 15.2 mm and 14.8 mm.

Measurements in the life sciences frequently suffer from proportionately more error than measurements in the physical sciences. Perhaps perversely, consideration of such measurement errors is rare, often because we are only interested in a qualitative result. However, when our measurements concern the quality and safety of a product that is to be used to treat a patient, we need to be somewhat more rigorous.

Understanding the uncertainties involved in a measurement such as CD is, however, quite complex. Because of the small size of the CD effect, the instrumentation is rather complex and therefore a number of sources of error may arise from the instrumentation itself. Since most CD instruments require calibration, we must also account for the error introduced by this process, and by any variability or uncertainty in the calibrant itself. The concentration of the protein being measured may seem to be a trivial matter, but quite large errors may be encountered in using various common methods [2] and this in turn can lead to significant errors in the CD measurement. Finally, we must take into account errors in the pathlength of the cell in which the measurement is made [3-5]. Some of these sources of error are easier to investigate than others.
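The ruler example above can be written as a short calculation. The following is a minimal sketch, not a prescribed procedure: it assumes that the ±1 mm figure applies independently to each measurement and that the two uncertainties can be combined in quadrature, neither of which is stated in the text. The function name `clearly_different` is hypothetical.

```python
import math


def clearly_different(a, b, u_a, u_b):
    """Return True if |a - b| exceeds the combined uncertainty.

    Assumption: u_a and u_b are independent and are combined in
    quadrature (root sum of squares).
    """
    combined = math.hypot(u_a, u_b)
    return abs(a - b) > combined


# The two comparisons from the ruler example (values in mm).
print(clearly_different(15.2, 16.8, 1.0, 1.0))
print(clearly_different(15.2, 14.8, 1.0, 1.0))
```

Under these assumptions the first pair is judged different and the second is not, which matches the informal reasoning in the text. A stricter criterion (for example, requiring the difference to exceed twice the combined uncertainty) would change the first verdict, which is exactly why the size of the uncertainty has to be known before measurements are compared.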
One of the most fundamental problems in the area is that there is no way of determining objectively what is the “right” or “true” value of CD for any given sample. The majority of instruments work by causing the light passing through the sample to oscillate between left- and right-handed circularly polarised light at high frequency [6]. If the sample exhibits circular dichroism, the two polarisations will be absorbed to different extents, and the intensity of the light hitting the detector will oscillate at the same frequency. This oscillation is amplified by a frequency selective (“lock-in”) amplifier. The instrument calibration is generally set by adjusting the gain of this amplifier. A “standard” sample is placed in the instrument and the gain is adjusted until the instrument reads the “correct” value. However, it is not certain that this value is correct (see the section on reference materials, below). Many other aspects of the CD instrumentation also give rise to errors that are difficult to quantify.

The ideal for any measurement is that it should be traceable in some way. This means that a measured value can be traced back, with known uncertainties, to some sort of higher reference point or points, such as those embodied in the SI¹ itself, which may be fundamental constants (or, in the case of mass, the standard kilogram artefact) [7]. Unfortunately this is not yet the case for CD. Where traceability to the SI is not possible, traceability to certified reference materials (ideally according to ISO 17025 [8]), or to consensus standards, might be appropriate². Why is traceability important? Because it is the most robust way of ensuring measurement comparability.

¹ The International System of Units (Système International d’Unités) [7].
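The consequence of calibrating by gain adjustment can be illustrated with a toy simulation. This is a deliberately simplified sketch and not a model of any real instrument: it assumes the detector signal is a steady level plus a pure sine wave at the modulation frequency, and all numbers (frequencies, amplitudes, the accepted value of the standard) are hypothetical.

```python
import math

F_MOD = 50_000.0        # modulation frequency in Hz (illustrative)
F_SAMPLE = 1_000_000.0  # sampling rate in Hz (illustrative)
N_SAMPLES = 2000        # exactly 100 modulation periods


def detector_signal(dc, ac):
    """Simplified detector output: steady level plus a small oscillation."""
    times = [k / F_SAMPLE for k in range(N_SAMPLES)]
    samples = [dc + ac * math.sin(2.0 * math.pi * F_MOD * t) for t in times]
    return samples, times


def lock_in_ratio(samples, times):
    """Recover the amplitude at F_MOD as a fraction of the mean level."""
    n = len(samples)
    dc = sum(samples) / n
    # Multiply by the reference sine and average: only the component at
    # F_MOD survives; the factor 2 restores its amplitude.
    ac = (2.0 / n) * sum(
        s * math.sin(2.0 * math.pi * F_MOD * t) for s, t in zip(samples, times)
    )
    return ac / dc


# Calibration: choose the gain so that the standard reads its accepted value.
ACCEPTED_STANDARD_MDEG = 100.0  # hypothetical accepted value
raw_standard = lock_in_ratio(*detector_signal(1.0, 4.0e-4))
gain = ACCEPTED_STANDARD_MDEG / raw_standard

# Measure a sample whose oscillation is a quarter of the standard's.
raw_sample = lock_in_ratio(*detector_signal(1.0, 1.0e-4))
reading_mdeg = gain * raw_sample

# If the accepted value were 2% too high, the reading would be 2% too high.
biased_gain = 1.02 * ACCEPTED_STANDARD_MDEG / raw_standard
biased_reading_mdeg = biased_gain * raw_sample
print(round(reading_mdeg, 3), round(biased_reading_mdeg, 3))
```

In this sketch every reading scales directly with the value assigned to the standard, so any error in that value is carried into all subsequent measurements without being visible in the data.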
[Figure 3: two panels (upper and lower), each plotting CD (mdeg), from -15 to 20, against Wavelength (nm), from 180 to 260.]
Figure 3. Problems encountered in obtaining comparable data. An inter-laboratory study was carried out amongst UK laboratories to investigate whether it was possible to obtain comparable CD data [9, 14]. The results obtained in the reference lab were consistent (lower panel). However, considerable variability was observed among the data returned by participants (upper panel). This variation was found to be attributable to a variety of causes (see main text).
² In biopharmaceutical laboratories, a biological reference standard from the manufacturer is often the only material available to show some level of traceability throughout the development process.
4.2. Issues Encountered in Practice

What are the main barriers to, and problems encountered in, obtaining comparable CD spectra in a typical laboratory? To investigate what issues occurred in practice, we organised an inter-laboratory study of labs in the UK which routinely performed CD measurements [9]. This work clearly identified a number of factors that influenced the comparability of measurements:

1. The study participants typically did not determine the pathlengths of their cells. Without this information, it is impossible to meaningfully compare spectra that have been measured in different cells. This problem is most significant in the Far UV, where cells of very short pathlength are typically used and the variability in pathlength is proportionately larger.
2. The intensity calibration of the instruments used in the study varied significantly.
3. The wavelength calibration of the instruments varied significantly.
4. Some instruments produced data that were notably more “noisy” than others. This can be due to a number of factors, including the performance of the lamp and degradation of the mirrors, for example.
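The effect of an undetermined pathlength can be made concrete with the widely used conversion from observed ellipticity to mean residue ellipticity. The conversion itself is not given in this chapter, and the values below (signal, mean residue weight, concentration and both pathlengths) are hypothetical, chosen only to show how a pathlength error propagates.

```python
def mean_residue_ellipticity(theta_mdeg, mean_residue_weight, conc_mg_per_ml, pathlength_cm):
    """Convert observed ellipticity (mdeg) to mean residue ellipticity.

    Result is in deg cm^2 dmol^-1. The normalised value is inversely
    proportional to both concentration and pathlength.
    """
    return theta_mdeg * mean_residue_weight / (10.0 * pathlength_cm * conc_mg_per_ml)


THETA_MDEG = -20.0   # hypothetical observed signal
MRW = 110.0          # hypothetical mean residue weight (Da)
CONC = 0.2           # hypothetical concentration (mg/mL)

nominal = mean_residue_ellipticity(THETA_MDEG, MRW, CONC, pathlength_cm=0.10)
actual = mean_residue_ellipticity(THETA_MDEG, MRW, CONC, pathlength_cm=0.11)

# Relative error introduced by trusting the nominal pathlength.
relative_error = (nominal - actual) / actual
print(nominal, actual, round(relative_error, 3))
```

With these illustrative numbers, a cell that is 10% longer than its nominal pathlength produces a normalised value that is 10% too large in magnitude, an error that is invisible unless the pathlength has been measured.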
Other errors we have observed include the use of incorrect cell pathlengths. It is worth noting that in this study, protein concentration was not an issue because all participants measured aliquots of the same solutions. Many of the issues described above are a result of poor practice, poor instrument maintenance, or both. There have been a number of attempts to improve the state of knowledge of CD users. For example, guides to best practice have been produced [4, 5], such as those contained in the present volume, and training courses have been run. We believe that measures such as these can markedly improve the quality of CD data, and we briefly outline some of our recommendations below.

4.3. Good Practice Recommendations

Here we summarise some of the key recommendations for obtaining comparable, good quality CD spectra. For further details, see chapters elsewhere in this book, our Good Practice Guide [5] or other recommendations [4].

• CD instruments should be well maintained, and serviced regularly; the performance of the instrument should be monitored between service visits.
• CD instruments should be calibrated regularly using the best available standards and procedures, for both wavelength and CD.
• CD cells are critical for the accurate measurement of CD spectra. It is vital that the correct pathlength be used for the sample in question. The pathlength should be accurately determined. Thorough cleaning and the measurement of baseline spectra are also vital.
• The concentration of the protein in the sample should be accurately determined. The absorbance of the sample should be in the most appropriate range to achieve a high signal-to-noise ratio. Strongly absorbing or chiral components in buffers, or light scattering, should be avoided. The buffer conditions and temperature should be tightly controlled.
• Appropriate instrument parameters should be selected to obtain the best quality data possible in a reasonable length of time. Issues such as scan speeds and slit widths are particularly important here.
• Excessive smoothing and other data manipulations should be avoided.
4.4. Improvements in CD Standards

As indicated above, a significant cause of variability in CD measurements is the reference materials that are used in instrument calibration and performance testing. Most CD instruments are routinely calibrated with either d-10-camphorsulfonic acid [10] (CSA) or its ammonium salt (ammonium d-10-camphorsulfonate, ACS) [11]. Both compounds dissolve to give the same ion in solution, which gives calibration peaks at 290.5 nm and 192.5 nm. However, these materials have a number of limitations:

1. The reference values to which users calibrate their instruments are consensus literature values. It is not clear whether these values are correct and therefore measurements made using these standards are not traceable.
2. There is no source of certified reference material for either of these compounds. Therefore each user (including those who originally set the reference values) is using material of unknown, and possibly variable, purity.
3. Typically instruments are calibrated using the 290.5 nm peak only. However, it is not certain whether instrument performance is consistent across the range from the far UV limit to the visible limit, and this is also likely to vary between makes and models of instrument. Miles et al. [12] showed that calibration at multiple wavelengths could improve the comparability of CD spectra between laboratories.
4. These materials are typically only available in one of two enantiomeric forms. This is unfortunate, as it is not necessarily the case that an instrument’s sensitivity is symmetrical about zero.
5. The materials have limited stability, especially if they are stored incorrectly; therefore fresh solutions must be made up at regular intervals [13].
To address these issues, Damianoglou et al. [14] recently developed a new candidate reference material which overcomes many of these limitations. The spectral characteristics are compared with ACS in Figure 4. This reference material is available in both enantiomeric forms, and produces nine Gaussian peaks across the spectral range from the far UV to the visible. The material is also extremely stable. It is hoped that the material will become commercially available in the near future. However, while this material has many advantages, there is more work to be done in the area. Most significantly, the material is no more traceable than ACS (in fact it is only traceable to ACS at present). To make this material traceable would require it to be characterised on a reference instrument capable of making traceable CD measurements.
[Figure 4: two panels. Top: ACS, CD (mdeg), from -250 to 200, against Wavelength (nm), from 200 to 360. Bottom: Co(EDDS), with S,S enantiomer, R,R enantiomer and racemic mixture, CD (mdeg), from -60 to 60, against Wavelength (nm), from 200 to 800.]
Figure 4. Comparison of CD calibration standards. This figure compares a commonly used CD calibration standard (ACS, top) with a newly developed standard compound, Co(EDDS) (bottom) [14]. ACS exhibits one useful peak and is normally only available in one enantiomer. Co(EDDS) is available in two enantiomeric forms which give equal and opposite spectra, providing a valuable check of instrument performance. The spectrum effectively provides nine Gaussian peaks, which can be used to check instrument calibration from the far UV to the visible.
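The “equal and opposite” check that a two-enantiomer standard makes possible can be sketched very simply. The example assumes both enantiomers are measured at the same concentration and pathlength on the same wavelength grid; the function name and the spectral values are hypothetical and are not taken from the Co(EDDS) data.

```python
def max_asymmetry(spectrum_a, spectrum_b):
    """Largest absolute value of the point-by-point sum of two spectra.

    For ideal enantiomer spectra measured under identical conditions on a
    symmetrical instrument, the sum would be zero at every wavelength.
    """
    return max(abs(a + b) for a, b in zip(spectrum_a, spectrum_b))


# Hypothetical CD values (mdeg) at four wavelengths for each enantiomer.
ss_enantiomer = [40.0, -25.0, 12.0, -30.0]
rr_enantiomer = [-39.0, 24.5, -11.8, 29.0]

asymmetry = max_asymmetry(ss_enantiomer, rr_enantiomer)
print(asymmetry)
```

A result that is large compared with the noise level would suggest either that the two solutions differ (concentration or purity) or that the instrument does not respond symmetrically about zero; this sketch cannot distinguish between those causes.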
4.5. Complementary Techniques

CD is not the only technique that could be used to obtain quality control information about protein structure. There are a variety of other techniques which can be used in this context, although many are not as well established as CD [15]. In principle, using other such techniques in combination with CD could provide more detailed information and improve confidence in the results obtained. The most mature technique in this area is ATR-FTIR (Attenuated Total Reflectance Fourier Transform Infra-Red spectroscopy). Here the shape of the amide I band provides information about the preponderance of α and β structures in a protein. Unfortunately, this band falls in a spectral region where water also strongly absorbs, which complicates analysis of biopharmaceuticals in aqueous formulations. This means that extremely short pathlengths (typically 6 µm) are required, and the water background must be accurately subtracted from protein spectra. The former is often addressed by using the attenuated total reflectance sampling method, which relies on total internal reflection rather than transmission through the sample. Unfortunately, we have found that this technique is currently less reproducible than CD [15], but it is still useful as an adjunct and in situations where CD does not work well, for example in the presence of compounds with strong UV absorption or light scattering.
5. Spectral Comparisons

In addition to comparability of spectra, we also noted that objective methods for spectral comparisons are required. Why is this? Consider two successive spectra of the same sample, in the same cell, acquired on the same instrument. Even these spectra will not be identical, due to the presence of noise from the instrumentation. This is even more true for spectra acquired from different samples, where all the sources of error that have been described previously will also come into play. So how can we distinguish whether spectra are genuinely different from each other?

Often the approach that is taken is simply to overlay the spectra (typically of a test sample and an in-house reference material) and make a judgement as to whether they are essentially the same. Unfortunately, this approach requires an expert eye, and is in any case rather subjective. A more powerful approach, which does not require a complete understanding of all the experimental uncertainties, is to use statistics. If many spectra are acquired from a “reference” sample, we will acquire information not only about the typical (or average) spectral characteristics, but also about the magnitude of variation from that spectrum, which is likely to vary with wavelength. We can then set limits, based on that variability, and say that any spectrum, measured under the same conditions, which falls outside these limits is different from the reference sample. Bierau et al. have recently described such an approach, in which deviations of more than (for example) three standard deviations from the mean spectrum would cause spectra to be classified as non-equivalent [16].

A related approach, which also makes use of the statistical variability of measurements, is to use multivariate statistical methods such as Principal Component Analysis (PCA).
This approach is often used to analyse data where many different variables have been measured, some of which may exhibit a degree of correlation. The power of this approach is that it makes use of the structure within the data, rather than relying on any input from the user. Extensions of the PCA approach, such as SIMCA (Soft Independent Modelling by Class Analogy), then allow spectra to be classified into one or more classes. One of the advantages of such an approach is that in principle, data from many methods can be combined into one analysis.

An example of the application of PCA and SIMCA is given in Figure 5. The raw data, a set of near UV CD spectra from three different protein samples, are presented in Figure 5a. Such a plot, which contains many overlaid spectra, is almost impossible to interpret by eye, even with the aid of colour coding. Each spectrum consists of a set of measurements of CD at a series of different wavelengths. In the language of PCA, these measurements are called variables and a complete spectrum is referred to as an observation. It is immediately obvious that some of the variables will tend to be correlated with each other; neighbouring wavelength points will be strongly correlated, for example. What PCA does is to calculate weighted combinations of these variables that explain the variability within the data. These are known as the principal components. How many of these are required to explain the data depends on its complexity; often surprisingly few are necessary. For the data set shown, two were sufficient.

The data can now be represented by plotting each observation as a point at coordinates based on the values of its principal components, or scores. Such a scores plot is shown in Figure 5b. It can be seen that the spectra of the three different samples form tight clusters with a few possible outliers. Such a plot provides a much easier overview of the data set than the many overlaid spectra. We can also plot the weightings (or loadings as they are called), which make up the principal components (see Figure 5c).
It should immediately be obvious to the eye that these are directly related to the CD spectra that were input into the analysis – not really a surprising conclusion, but this gives us confidence that the analysis is meaningful. How can we use this type of analysis to decide whether a spectrum differs from our reference sample (or samples)? This is where the SIMCA method comes in. Here we build a PCA model for each of our different samples, and then compare each spectrum to all of these models. We can calculate a statistic – known as the distance to model – for each spectrum. We can also calculate, for each model, a threshold distance. This is like a statistical significance test. If the distance is less than the threshold, our spectrum is not significantly different from the model, and vice versa. By such an approach, we can classify a new spectrum into one – or more – or even none – of our models, with (say) a 95% or 99% confidence. This is illustrated in Figure 5d, where we have plotted the distances for a set of spectra from two models, and also the threshold values for the models (shown by the grey lines). The graph has four quadrants; spectra that fall in the top left quadrant are not consistent with either of our models (in this case they are probably outliers). Spectra in the bottom left are consistent with either model, whereas spectra in the other two quadrants are consistent with one or the other. Although building the models requires a degree of knowledge, in principle using them in an analysis could be made relatively automated and therefore suitable for routine QC applications.
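Of the approaches described in this section, the simplest to sketch is the comparison against limits set at a chosen number of standard deviations either side of the mean reference spectrum. The following is an illustration of that general idea only; it is not the procedure of Bierau et al. [16], the function names are hypothetical, and the data are invented four-point “spectra” rather than real measurements.

```python
import statistics


def build_limits(reference_spectra, k=3.0):
    """Per-wavelength limits: mean - k*SD and mean + k*SD of the reference set."""
    columns = list(zip(*reference_spectra))
    means = [statistics.mean(c) for c in columns]
    sds = [statistics.stdev(c) for c in columns]  # sample standard deviation
    lower = [m - k * s for m, s in zip(means, sds)]
    upper = [m + k * s for m, s in zip(means, sds)]
    return lower, upper


def points_outside(spectrum, lower, upper):
    """Indices of wavelength points at which the spectrum leaves the limits."""
    return [
        i for i, (y, lo, hi) in enumerate(zip(spectrum, lower, upper))
        if not lo <= y <= hi
    ]


# Hypothetical repeat measurements of a reference sample (mdeg).
reference = [
    [10.0, 5.0, -8.0, -12.0],
    [10.4, 5.2, -8.3, -11.8],
    [9.8, 4.9, -7.9, -12.1],
    [10.1, 5.1, -8.1, -12.2],
    [9.7, 4.8, -7.7, -11.9],
]
lower, upper = build_limits(reference, k=3.0)

consistent = points_outside([10.2, 5.1, -8.2, -12.0], lower, upper)
deviant = points_outside([10.2, 5.1, -6.5, -12.0], lower, upper)
print(consistent, deviant)
```

A real application would need many more reference spectra than this, and some rule for how many out-of-limit points are tolerated, since with hundreds of wavelength points an occasional excursion is expected by chance.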
[Figure 5: four panels. Panel a: overlaid spectra, CD (mdeg), from -60 to 10, against Wavelength (nm), from 240 to 360. Panel b: scores plot of t[2] against t[1] for samples 1, 2 and 3. Panel c: Loading for p[1] and p[2] against Variable Number, from 0 to 1200. Panel d: Distance to model for Sample 2 against Distance to model for Sample 1, for samples 1 and 3, with D-Crit(0.05) threshold lines.]

Figure 5. Examples of PCA and SIMCA analysis of CD spectra. This figure illustrates the analysis of a set of CD spectra by multivariate data analysis approaches. Panel a) shows the raw data; the CD spectra from multiple measurements of three protein samples, two of which are similar to each other. In this format the results are difficult to interpret. Panel b) shows a PCA scores plot. In this plot each point corresponds to one of the spectra; similar spectra are clustered together. The three different protein samples are readily visualised. Panel c) shows the PCA loadings plot, which indicates the composition of the first two principal components. These are obviously related to the CD spectra of the different proteins. Panel d) shows how spectra can be discriminated into classes using the SIMCA method. This is known as a Coomans plot and shows how the spectra can be distinguished into classes by comparison with PCA models. The grey lines divide the plot into quadrants. Spectra below the horizontal line are identified (with 95% confidence) as belonging to sample 3. Spectra to the left of the vertical line are identified as belonging to sample 1. Spectra in the bottom left quadrant cannot be distinguished and samples in the top left quadrant appear to differ from both samples and are therefore outliers.
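The idea of a PCA model and a “distance to model” can be sketched in a few lines. This is a toy illustration under several assumptions, not the analysis used to produce Figure 5: it uses a single principal component found by power iteration, one common degrees-of-freedom convention for the residual standard deviations, and synthetic single-band “spectra” with invented noise levels. A real SIMCA analysis would also derive a critical distance from an F-distribution, which is omitted here.

```python
import math
import random

WAVELENGTHS = list(range(250, 322, 2))  # hypothetical near-UV grid (nm)


def _residual_ss(centred, loading):
    """Sum of squares left after removing the component along the loading."""
    score = sum(x * l for x, l in zip(centred, loading))
    return sum((x - score * l) ** 2 for x, l in zip(centred, loading))


def fit_model(spectra, iterations=100, seed=0):
    """Fit a one-component PCA model to a list of equal-length spectra."""
    n, p = len(spectra), len(spectra[0])
    mean = [sum(s[j] for s in spectra) / n for j in range(p)]
    centred = [[s[j] - mean[j] for j in range(p)] for s in spectra]
    # Power iteration converges to the first loading vector.
    rng = random.Random(seed)
    loading = [rng.uniform(-1.0, 1.0) for _ in range(p)]
    for _ in range(iterations):
        scores = [sum(x * l for x, l in zip(row, loading)) for row in centred]
        w = [sum(scores[i] * centred[i][j] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        loading = [x / norm for x in w]
    # Pooled residual standard deviation of the training spectra
    # (assumed convention: (n - A - 1)(p - A) degrees of freedom, A = 1).
    total_ss = sum(_residual_ss(row, loading) for row in centred)
    s0 = math.sqrt(total_ss / ((n - 2) * (p - 1)))
    return {"mean": mean, "loading": loading, "s0": s0}


def distance_to_model(spectrum, model):
    """Residual SD of one spectrum relative to the model's pooled residual SD."""
    centred = [x - m for x, m in zip(spectrum, model["mean"])]
    si = math.sqrt(_residual_ss(centred, model["loading"]) / (len(spectrum) - 1))
    return si / model["s0"]


def synthetic_spectrum(rng, centre):
    """Invented band: one Gaussian with amplitude jitter plus random noise."""
    amplitude = -40.0 * rng.gauss(1.0, 0.05)
    return [
        amplitude * math.exp(-((w - centre) / 10.0) ** 2) + rng.gauss(0.0, 0.5)
        for w in WAVELENGTHS
    ]


rng = random.Random(1)
training = [synthetic_spectrum(rng, 280.0) for _ in range(20)]
model = fit_model(training)

d_same = distance_to_model(synthetic_spectrum(rng, 280.0), model)
d_other = distance_to_model(synthetic_spectrum(rng, 290.0), model)
print(round(d_same, 2), round(d_other, 2))
```

A new spectrum of the same kind as the training set gives a relative distance close to one, whereas a spectrum with a shifted band gives a much larger value. Plotting such distances to two different models against each other gives the kind of plot shown in panel d).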
6. Conclusions

Circular dichroism is a valuable technique for both research and industrial applications, such as biopharmaceutical quality control. However, in both applications its usefulness is limited by problems in obtaining comparable spectra and in comparing the spectra objectively. We have highlighted a number of important issues that can affect the quality of CD spectra, such as calibration, pathlength measurement, protein concentration, and instrument maintenance. Some of these issues can be addressed through good practice. Others, such as the provision of traceable, certified reference materials, are more challenging and remain to be resolved, although progress is being made. Spectra can be compared and classified through a variety of approaches. Here we have mainly discussed multivariate approaches such as PCA and SIMCA, and shown how they can be used to compare spectra to reference data sets. However, such approaches have yet to be widely adopted in either industrial or academic circles. Objective comparison would also be helped by a better understanding of the uncertainties inherent in CD measurements, rather than relying purely on statistical analysis of variability.
References

[1] Specifications: Test Procedures and Acceptance Criteria for Biotechnological/Biological Products (Q6B). 1999, International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use. 20.
[2] J.E. Noble, A.E. Knight, A.J. Reason, A. Di Matola and M.J. Bailey, A comparison of protein quantitation assays for biopharmaceutical applications, Mol. Biotech. 37 (2007) 99-111.
[3] A.J. Miles, F. Wien, J.G. Lees and B.A. Wallace, Calibration and standardisation of synchrotron radiation and conventional circular dichroism spectrometers. Part 2: Factors affecting magnitude and wavelength, Spectroscopy 19 (2005) 43-51.
[4] S.M. Kelly, T.J. Jess and N.C. Price, How to study proteins by circular dichroism, Biochim. Biophys. Acta 1751 (2005) 119-139.
[5] C. Jones, D. Schiffmann, A. Knight and S. Windsor, Val-CiD best practice guide: CD spectroscopy for the quality control of biopharmaceuticals, NPL report DQL-AS 008 (2004) National Physical Laboratory, Teddington, U.K.
[6] W.C. Johnson, Jr., Circular dichroism instrumentation, in: Circular Dichroism and the Conformational Analysis of Biomolecules, G.D. Fasman, Editor. (1996) Plenum Press: New York. p. 635-652.
[7] BIPM, The International System of Units (SI). 8th ed. (2006) Paris: Bureau International des Poids et Mesures.
[8] EN ISO/IEC 17025: (2005) General requirements for the competence of testing and calibration laboratories.
[9] D.A. Schiffmann, R.E. Yardley, D.M. Butterfield, A.E. Knight, S.A. Windsor and C. Jones, Val-CiD best practice guide appendix A: CD spectroscopy: an inter-laboratory study, (2004) National Physical Laboratory, Teddington, U.K.
[10] G.C. Chen and J.T. Yang, Two-point calibration of circular dichrometer with d-10-camphorsulfonic acid, Anal. Lett. 10 (1977) 1195-1207.
[11] T. Takakuwa, T. Konno and H. Meguro, A new standard substance for calibration of circular dichroism: ammonium d-10-camphorsulfonate, Anal. Sci. 1 (1985) 215-218.
[12] A.J. Miles, F. Wien, J.G. Lees, A. Rodger, R.W. Janes and B.A. Wallace, Calibration and standardisation of synchrotron radiation circular dichroism and conventional circular dichroism spectrophotometers, Spectroscopy 17 (2003) 653-661.
[13] O.C. Vives, A.B. Llordés, R. Marrington, R. Yardley, D.A. Schiffmann, A.E. Knight, S.A. Windsor, A. Rodger and C. Jones, Val-CiD Appendix B: The use of chemical calibrants in circular dichroism spectrometers, (2004) National Physical Laboratory, Teddington, U.K.
[14] A. Damianoglou, E. Crust, M. Hicks, S. Howson, A. Knight, J. Ravi, P. Scott and A. Rodger, A new reference material for UV-visible circular dichroism spectroscopy, Chirality 20 (2008) 1029-1038.
[15] J. Ravi, D. LePevelen, G. Tranter, M.J.A. Bailey and A.E. Knight, Higher-order protein structure measurements for biopharmaceutical quality control, (2007) National Physical Laboratory, Teddington, U.K.
[16] H. Bierau, G.E. Tranter, D.D. LePevelen, C.E. Giartosio and C.S. Jones, Higher-order structure comparison of proteins derived from different clones or processes, BioProcess International 6 (2008) 52-59.
Modern Techniques for Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy B.A. Wallace and R.W. Janes (Eds.) IOS Press, 2009. © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-000-1-141
Synchrotron Radiation Circular Dichroism Spectroscopy: Applications in the Biosciences
Andrew J. Miles and B.A. Wallace
Department of Crystallography, Birkbeck College, University of London
Abstract. Synchrotron Radiation Circular Dichroism (SRCD) spectroscopy is enabling a number of new applications in biology because of its ability to measure data at lower wavelengths, its lower requirement for sample quantities, its higher signal-to-noise levels and its ability to examine proteins under a wide range of conditions. Advances in SRCD instrumentation, data collection methodologies, data processing, reference datasets and methods of analysis have led to new applications of SRCD for examining interesting biological questions. Although most of the SRCD studies to date have focused on proteins, the technique has been successfully applied in the study of other macromolecules including DNA and carbohydrates. This chapter briefly discusses a number of examples of studies that have been undertaken with this new methodology.
1. Introduction
The technical aspects of data collection and methods specifically related to synchrotron radiation circular dichroism (SRCD) spectroscopy have been discussed at length in the chapter on Good Practice in SRCD by Miles and Wallace in this volume. This chapter focuses on recent applications of circular dichroism (CD) spectroscopy to biological systems which have been enabled by the use of synchrotron radiation as the light source for the measurements. The advantage of SRCD over conventional CD using lab-based commercial instruments arises from the nature of the synchrotron light, which is not only many orders of magnitude brighter than that generated by the xenon arc lamps used in conventional instruments, but also provides a nearly constant flux level over a very wide range of wavelengths, extending into the much lower wavelength vacuum ultraviolet (VUV) range. Compared to conventional CD data, SRCD data obtained on the same sample have a much higher signal-to-noise level, thus enabling the accurate detection of smaller conformational changes. They also have a higher information content due to the increased number of electronic transitions detectable in the low wavelength region (the importance of this and its consequences for secondary structure determinations are discussed in more detail in the chapter on Analyses by Whitmore and Wallace in this volume). Conversely, with SRCD a similar signal-to-noise level can be achieved for a much smaller amount of sample than is feasible with conventional CD, which is an advantage for scarce biological samples such as membrane proteins and macromolecular complexes. Furthermore, shorter averaging times mean that data collection can be much quicker, indeed with the potential to enable time-resolved
measurements for folding and enzyme kinetics. This chapter briefly summarises some of the recent structural biology studies enabled by this technique (emphasising studies done since 2001 with modern SRCD beamlines) and also predicts future applications and developments.
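The trade-off between flux, averaging time and signal-to-noise described in the Introduction can be made concrete with a small calculation. The sketch below assumes, as a simplification that is not stated in this chapter, that the measurement is limited by photon shot noise, so that signal-to-noise scales with the square root of the number of photons detected. The function names and the flux ratio of 100 are purely illustrative and are not measured values for any beamline or instrument.

```python
import math


def snr_gain(flux_ratio: float, time_ratio: float = 1.0) -> float:
    """Relative signal-to-noise of source B versus source A.

    Assumes a shot-noise-limited measurement, so S/N scales with the
    square root of detected photons (flux multiplied by averaging time).
    flux_ratio = flux_B / flux_A, time_ratio = time_B / time_A.
    """
    if flux_ratio <= 0 or time_ratio <= 0:
        raise ValueError("ratios must be positive")
    return math.sqrt(flux_ratio * time_ratio)


def time_ratio_for_equal_snr(flux_ratio: float) -> float:
    """Averaging time on source B, relative to source A, for the same S/N."""
    if flux_ratio <= 0:
        raise ValueError("flux ratio must be positive")
    return 1.0 / flux_ratio


# Hypothetical example: a source delivering 100 times more photons.
illustrative_flux_ratio = 100.0
print(snr_gain(illustrative_flux_ratio))                  # S/N gain at equal averaging time
print(time_ratio_for_equal_snr(illustrative_flux_ratio))  # relative time for equal S/N
```

Under this assumption the same gain in photons can be spent in three ways: better signal-to-noise, shorter averaging time, or less sample. Real instruments have additional noise sources, so the scaling should be read as an upper bound rather than a prediction.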
2. New Applications for Proteins
2.1 Detection of Small Conformational Differences
SRCD has permitted the identification of very subtle differences not detectable by conventional CD, including the differences between wild type and cataract-causing mutants of human eye lens proteins [1], differences between metmyoglobin in aqueous and helix-promoting organic solvents [2], changes associated with protein-drug binding [3, 4], and the effects of deglycosylation of glycoproteins on their structural integrity [5]. The difference in spectral characteristics (and therefore secondary structure) between wild type human eye lens γD-crystallin and a mutant with a single-site change from proline to threonine at residue 23 (P23T) was undetectable using conventional CD instrumentation, but is clearly seen in the SRCD spectra as the appearance of a new shoulder centered at ~208 nm in the mutant protein [1]. This corresponds to a change in conformation of more than just the single amino acid which is the site of the mutation, and is estimated to arise from approximately six amino acids becoming more sheet-like. This corresponds well with molecular modelling studies that suggest this mutation extends the beta-strand in the region of the mutation by several residues, thus changing the solubility characteristics of the protein (and also likely leading to the formation of the insoluble cataract in vivo). Interestingly, the spectrum of an artificial P23S mutant with a serine instead of a threonine at the same site does not produce the same spectral characteristics as the cataract mutant. This dispelled the simple conjecture that the reason for the structural difference was the replacement of the sheet-breaking proline imino acid by a sheet-favoring amino acid. This study was important because although there is a crystal structure for the wild type protein, there is no equivalent crystal structure for the mutant, so SRCD provided the only real insight into structural differences in the disease-associated mutant of this protein.
Another example of the use of SRCD to detect conformational changes potentially associated with biological activity was the examination of the structures of both intact voltage-gated sodium channels from electric eels and of deglycosylated channels from the same source [5]. In vivo the channels are highly glycosylated (approximately one third of their molecular weight is sugar), but the sugar components of these and other proteins are usually either ignored (they generally have no detectable signal in the near or far UV regions of the spectrum) or the carbohydrates tend to be removed for ease of biophysical or crystallographic characterisation. However, it was important to have a method which could examine if the removal of the sugars (many of which carried charges) would influence the overall structure of the protein, as had been anticipated. A comparison of native and deglycosylated protein was not possible with conventional CD because all of the sugar signals fall at low wavelengths so removal of the sugar contributions could not be verified. However, such studies were possible with SRCD because the signals attributable to both the protein and sugar could be detected and deconvoluted. Once the sugar was removed, it was possible to see that the protein structure remained intact, a result that was confirmed by functional conductance studies
that showed the modified protein was still active. Hence, this study demonstrated that SRCD can have value for comparative studies of sugar-free and glycosylated proteins, although such studies are not possible using conventional CD spectroscopy.
2.2 Detection of Quaternary Interactions
CD studies of macromolecular complexes have traditionally examined spectral changes in the far UV wavelength range for secondary structure effects, and changes in the near UV wavelength region for tertiary structural effects. With the availability of very low wavelength SRCD data, it has now become possible to examine other structural differences (such as quaternary effects) due to changes in the charge transfer transitions in the VUV region. An example of this was the identification of macromolecular complex formation between two proteins even when no secondary structural changes had occurred [6]. Comparisons of the individual crystal structures of carboxypeptidase A and latexin with their crystal structures in a complex suggested that the binding mode was as rigid bodies and did not involve any secondary structural changes. Using a lab-based CD instrument, not surprisingly, no significant spectral changes were seen in the wavelength region above 190 nm; hence conventional CD spectroscopy could not detect the complex formation. However, when the sum of the SRCD spectra of the individual proteins was compared with the spectrum of the complex, significant differences were found in the VUV region below 190 nm, thus enabling the detection of complex formation even when it involved only rigid body interactions. This adds SRCD to the arsenal of biophysical techniques that can be used to monitor and quantify macromolecular interactions in solution.
2.3 Measurements of Folding and Unfolding
Another area in which SRCD is proving useful is in thermal studies of protein folding and unfolding. Because the low wavelength data enable measurements of additional transitions, and the high flux permits measurements in high ionic strength solutions, it has been possible, for example, to detect a heretofore unseen folding intermediate with a signature spectrum in the low wavelength range for tropomyosin; most interestingly this transition exhibits a difference between native protein and a mutant version of the protein which causes the disease Familial Hypertrophic Cardiomyopathy [7], and the differences are most apparent at physiological temperatures (~35 °C). Thermal studies have also been used to examine the chaperone-induced stabilisation of folding intermediates in the crystallin family of eye lens proteins [8]. The subtle differences were not detectable with any confidence with conventional CD instruments, but did produce changes that were detectable by SRCD. Most recently [9] SRCD thermal denaturation studies have been used to examine voltage-gated sodium ion channels in various detergents. A significant advantage of the SRCD studies is that they can accurately monitor changes not only in the higher wavelength peaks around 222 nm, but also in the peaks just below 190 nm, which are often below the limit of detection for conventional CD measurements. Whilst conventional CD instruments can make steady state measurements around 190 nm, the data in this region tend to be noisy; this means that unfolding studies in which the changes are small cannot usually be monitored at this wavelength. In SRCD spectra, however, the signal-to-noise levels at 190 nm are roughly comparable to those at 222 nm, so monitoring unfolding is facile. The value in being able to examine
changes at both wavelengths is that by following whether the losses in magnitudes of the 190 and 222 nm peaks parallel each other, it can be determined if there is a cooperative unfolding of all the secondary structural elements. Indeed, in the case of sodium channels, it was found that not only did the Tm values in the different detergents vary significantly, but also that the apparent pathway/mechanism was different in the different detergents. Other studies on beta-sheet rich proteins or proteins with natively unfolded features also benefit from being able to monitor low wavelength peaks, as these include contributions from sheet and disordered structures.
SRCD studies have added value for chemical unfolding studies for a different reason. In many cases, the chemical used to induce the unfolding (e.g. urea or guanidine hydrochloride) itself tends to absorb, and thus limits either the amount of denaturant that can be used or the number of transitions that can be monitored (usually only the 222 nm peak due to helical structure). Although high concentrations of urea can still be problematical, reasonable measurements of the 208 nm (sheet) and 222 nm (helix) peaks are possible with SRCD in guanidine solutions.
2.4 Other Applications for Proteins
Other uses of SRCD have included serving as an assay for whether expressed proteins or protein domains are properly folded and whether homology models for domains are realistic [10], and as a means of examining proteins in organic solvents that absorb at wavelengths in the near and far UV regions [11]. SRCD has also been shown to be beneficial for examining physically challenging systems for which scattering artefacts can predominate in conventional CD measurements, including fibrous proteins where SRCD has been used to follow the conversion of spider silk liquid to solid fibres [12].
A particularly fruitful new use of SRCD is in combination with other techniques that provide complementary information. For example, two synchrotron techniques, SRCD and small-angle X-ray scattering (SAXS), have been combined to examine in parallel both secondary structures and tertiary structures of individual proteins and complexes [13–15]. Another example is a recent study in which kinetic SRCD studies were combined with single molecule fluorescence to follow the dynamics of an ensemble of collapsed unfolded proteins [16].
Whilst not a comprehensive list of all possible types of applications, the above examples demonstrate the wide range of new structural biology investigations that have employed SRCD, many of which would not have been possible with conventional circular dichroism spectroscopy using lab-based instruments.
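The comparison of thermal melting curves at 190 and 222 nm described in Section 2.3 can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the analysis used in the cited studies: it treats the first and last points of each curve as the folded and unfolded baselines, assumes the signal changes monotonically, and estimates a midpoint temperature by linear interpolation rather than by a thermodynamic fit. All function names and all data values are hypothetical.

```python
def fraction_remaining(signal):
    """Scale a melting curve so its first point is 1 and its last point is 0.

    Assumption: the first and last measurements stand in for the fully
    folded and fully unfolded baselines.
    """
    start, end = signal[0], signal[-1]
    if start == end:
        raise ValueError("curve shows no net change")
    return [(value - end) / (start - end) for value in signal]


def midpoint_temperature(temps, signal):
    """Temperature at which half of the signal change has occurred.

    Uses linear interpolation between the two points that bracket 0.5.
    """
    frac = fraction_remaining(signal)
    for i in range(len(frac) - 1):
        f0, f1 = frac[i], frac[i + 1]
        if f0 >= 0.5 >= f1 and f0 != f1:
            step = temps[i + 1] - temps[i]
            return temps[i] + (f0 - 0.5) / (f0 - f1) * step
    raise ValueError("curve never crosses the midpoint")


def max_divergence(signal_a, signal_b):
    """Largest difference between two normalised melting curves.

    A value near zero means the two peaks are lost in parallel.
    """
    frac_a = fraction_remaining(signal_a)
    frac_b = fraction_remaining(signal_b)
    return max(abs(a - b) for a, b in zip(frac_a, frac_b))


# Hypothetical data (arbitrary units), invented purely for illustration.
temps = [20, 30, 40, 50, 60, 70, 80]
cd_222 = [-20, -19, -16, -10, -4, -1, 0]
cd_190_parallel = [40, 38, 32, 20, 8, 2, 0]   # tracks the 222 nm peak
cd_190_early = [40, 32, 20, 12, 6, 2, 0]      # lost at lower temperature

print(midpoint_temperature(temps, cd_222))
print(midpoint_temperature(temps, cd_190_early))
print(max_divergence(cd_222, cd_190_parallel))
print(max_divergence(cd_222, cd_190_early))
```

If the normalised 190 and 222 nm curves overlay, the data are consistent with a single cooperative transition; a clear divergence, or different midpoints, suggests that different structural elements are lost at different temperatures. What counts as a significant divergence depends on the noise in the data and is not addressed by this sketch.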
3. New Applications for Other Macromolecules
Although the majority of the SRCD studies to date have been on proteins, it is clear that CD studies of other types of macromolecules will also benefit from the low wavelength data available in SRCD spectra. Conventional CD studies have been important for identifying conformations of DNA molecules, especially from the near UV region of the spectra. Early SRCD studies showed the presence of additional bands below 180 nm [17]. More recent SRCD studies of nucleosides and nucleotides have shown that the bases apparently contribute more to the VUV spectra than the sugars, but that modifications to the sugars and their structural consequences can also be
detected in this wavelength range. Indeed, the effects of pH and temperature suggested the VUV bands are strongly sensitive to structural modification and chemical environment [18]. Other studies have shown that whilst carbohydrates have low CD signals in the far UV wavelength region (an advantage for conventional CD studies that tend to ignore sugar components of glycoproteins), they do tend to have transitions that give rise to significant signals in the VUV region. Their peak positions, signs and magnitudes are indicative of the types and configurations of the sugars present both on their own and as part of glycoproteins [19–21], and could be used in the future to identify components of complex sugar samples. SRCD data have also been used to examine the formation of protein-sugar complexes [23].
4. Potential Future Applications
4.1 Structural Genomics: Fold Recognition and Target Selection
There are vast quantities of DNA sequence data being generated by genome projects; it is possible to predict the fold and function of proteins resulting from many of the open reading frames (ORFs) from a particular organism by comparison with well-characterized homologues in other organisms. However, there are many ORFs that have no structurally-defined homologues and many of these proteins will be comprised of new fold motifs. One of the goals of structural genomics is to identify examples of all types of protein folds, and then use that information to improve the prediction of related structures. Of the ~1000 unique folds estimated to exist, examples of roughly half of them have already been seen. But the number of candidates for the remaining unique folds from the ORFs found in all the genomes sequenced to date is vast. Although the three-dimensional structural determination of proteins as a part of structural genomics programmes has been highly productive, X-ray diffraction studies have generally been limited to proteins that easily crystallize, and NMR studies have been restricted to relatively low molecular weight proteins. Both methods are time-consuming and there is a bias in solved structures towards proteins that are more amenable to these techniques; therefore a rapid screening method to find potential candidates for new folds to be subjected to structural analysis is required, and SRCD may provide the solution. This may be possible because, for example, proteins with similar beta-sheet contents but different folds produce similar far UV spectra, but significantly different VUV spectra, suggesting that SRCD may provide a means of identifying fold motifs or supersecondary structures. Another potential use of SRCD in structural genomics is suggested by the proteomics study of Mycoplasma genitalium [24].
In this organism, which has a small genome, potential proteins of unknown function were selected based on bioinformatics studies, and then expressed and screened for secondary structure characteristics and thermodynamic properties using CD. Thermostable and structured proteins were rapidly distinguished from proteins that were unstable or unstructured and those that may require partners for stability were identified. With SRCD, because of the rapidity of measurements and low requirements for samples, this sort of study could be extended to larger genomes and less readily-expressed proteins.
4.2 Structural Genomics: Membrane Proteins
There has been much interest in the use of CD for studies of membrane proteins, as these proteins are notoriously difficult to crystallize and are generally unsuitable for NMR due to their size and low solubility. As a result, membrane proteins are significantly underrepresented in the structures deposited in the Protein Data Bank (PDB) [25]. This is unfortunate since membrane proteins are the primary targets of many drugs and represent between 20 and 35% of all ORFs in most genomes. Membrane proteins, however, present a number of challenges for studies by CD spectroscopy because they are embedded in either detergent micelles or lipid vesicles. The first issue is that the detergents and/or lipid molecules needed to maintain the membrane proteins in solution tend to absorb UV light quite strongly because of the chromophores they contain, and because on a molar basis, they tend to be in vast abundance relative to the protein component under study (i.e. molar ratios of >>100:1). Even though their absorbance is not chiral, and thus they do not produce a CD signal, it does mean that in their presence the amount of light of either handedness reaching the detector is low, so the signal-to-noise for conventional CD studies will be low. The high flux available in SRCD means that this is not a significant problem. The second issue is that membrane particles containing membrane proteins, either large unilamellar or multilamellar vesicles, tend to scatter light significantly, which can lead to artefacts in the data collected [26] if the detection geometry of the instrument is such that it does not collect the scattered light. This is often the case in conventional CD instrument designs. However, most SRCD beamlines have been designed with wide angle detection in mind, thereby ameliorating this problem. A third issue is that membrane protein particles tend to exhibit absorption flattening because in order to increase the signal of the protein (see above), experimentally the lipid-to-protein ratio is often minimised. This results in a well-studied distortion of the spectrum [26], although at the high end limit of lipid concentrations, where only one protein is present per vesicle, the effect goes to zero. Because SRCD can be done in samples with much higher lipid-to-protein ratios (because the signal of the protein is well-detected above the background), this issue too can be eliminated by use of SRCD. Finally, the wavelength shift effects seen for membrane proteins because they are embedded in a non-aqueous (hydrophobic) environment [27] can be mitigated using a new reference database created with SRCD data from membrane proteins (see the Analysis chapter in this volume by Whitmore and Wallace). Thus, the potential problems associated with CD studies of membrane proteins can be eliminated or at least minimised by the use of SRCD.
4.3 Structural Genomics: Macromolecular Complexes
Most structural genomics programmes to date have focused on single protein structures; however it is clear from a wide range of proteomics and binding studies that a very significant proportion of proteins exist not as isolated monomers but as macromolecular complexes in vivo. SRCD is a good method for characterising conformational changes associated with macromolecular interactions, either with other proteins, nucleic acids or lipid molecules. For example, it is now known that many isolated proteins have "natively unfolded" structures. When bound to other macromolecules, the disordered regions tend to refold to form regular secondary
structures. SRCD is especially suitable for these types of studies as it permits monitoring of changes in the low VUV wavelength regions, where unordered structures typically have significant spectra (as opposed to the far UV region where their spectra are small and nondescript). An example of the use of SRCD for these types of studies is the work on the surface-associated SHERP protein from Leishmania [28]. CD studies showed the protein was unfolded in the absence of a partner molecule, but SRCD studies were able to identify specific types of detergents and lipids which induced the protein to fold into a mostly helical structure. With this information in hand, it was then possible to undertake NMR studies of the complex of the protein with one of these partner molecules to determine its detailed conformation. Thus, SRCD can be used in conjunction with other techniques to identify conditions and macromolecular partners necessary to induce stable structures.
4.4 The Protein Circular Dichroism Data Bank (PCDDB)
A further new development that may aid interpretation and analyses is the Protein Circular Dichroism Data Bank (PCDDB) [29], described in this volume in the chapter by Wallace et al. This is a public archive being created for SRCD data measured at all beamlines, as well as CD spectra from conventional instruments. It should enable a range of new bioinformatics and structural biology studies. It is an undertaking involving many of the SRCD beamlines as mirror deposition and access sites, and is a demonstration of cross-cooperation within the growing SRCD community.
In addition to the obvious advantage of accessibility to data for published validated spectra, it will have the added advantage of providing a ready source of spectra that can be used to create both broader-based (like SP175) [30] and more narrowly-focused (like CRYST175) [31] reference datasets to further improve analyses, as described in the chapter on Reference Datasets by Janes and the chapter on Analyses by Whitmore and Wallace.
5. Summary
In summary, the availability of new beamlines, analysis tools, and of proof-of-principle studies means there is a bright future for SRCD studies, with many new applications possible in biology.
Acknowledgements
Our SRCD studies were supported by grants from the U.K. Biotechnology and Biological Sciences Research Council and beamtime grants from the SRS Daresbury, Astrid, BSRF, and NSLS synchrotrons.
References
[1] P. Evans, K. Wyatt, G.J. Wistow, O.A. Bateman, B.A. Wallace and C. Slingsby, The P23T cataract mutation causes loss of solubility of folded γD-crystallin, J. Mol. Biol. 343 (2004) 436–444.
[2] P.W. Thulstrup, J. Brask, K.J. Jensen and E. Larsen, Synchrotron radiation circular dichroism spectroscopy applied to metmyoglobin and a 4-alpha-helix bundle carboprotein, Biopolymers 78 (2005) 46–52.
[3] G.R. Jones and D.T. Clarke, Applications of extended ultra-violet circular dichroism spectroscopy in biology and medicine, Faraday Discuss. 126 (2004) 223–236.
[4] B.A. Wallace and R.W. Janes, Circular dichroism and synchrotron radiation circular dichroism spectroscopy: Tools for drug discovery, Biochem. Soc. Trans. 31 (2003) 631–633.
[5] N.B. Cronin, A. O'Reilly, H. Duclohier and B.A. Wallace, Effects of deglycosylation of sodium channels on their structure and function, Biochemistry 44 (2005) 441–449.
[6] N.P. Cowieson, A.J. Miles, R.J. Gautier, K. Forwood, B. Kobe, J.L. Martin and B.A. Wallace, Evaluating protein:protein complex formation using synchrotron radiation circular dichroism spectroscopy, Proteins: Struct. Funct. Bioinform. 70 (2007) 1142–1146.
[7] R. Maytum and R.W. Janes, Synchrotron radiation circular dichroism spectroscopy reveals a new structural transition in tropomyosin, Biophysical J. 92 (2007) 362a.
[8] P. Evans, C. Slingsby and B.A. Wallace, Association of partially-folded lens βB2-crystallins with the α-crystallin molecular chaperone, Biochem. J. 408 (2008) 691–699.
[9] E. McCusker and B.A. Wallace, Expression, purification and biophysical characterization of a superfamily of prokaryotic voltage-gated sodium channels, Biophysical J. (2009) in press.
[10] M.W. Richards, M.R. Hanlon, N.S. Berrow, A. Butcher, A.C. Dolphin and B.A. Wallace, Synchrotron radiation circular dichroism (SRCD) and CD spectroscopic studies of the voltage-dependent calcium channel beta subunit and its domains, Biophysical J. 82 (2002) 456a.
[11] R.W. McCabe, A. Rodger and A. Taylor, A study of the secondary structure of Candida antarctica lipase B using synchrotron radiation circular dichroism measurements, Enzyme Microb. Technol. 36 (2005) 70–74.
[12] C. Dicko, D. Knight, J.M. Kenney and F. Vollrath, Structural conformation of spidroin in solution: A synchrotron radiation circular dichroism study, Biomacromolecules 5 (2004) 758–767.
[13] W.A. Stanley, A. Sokolova, A. Brown, D.T. Clarke, M. Wilmanns and D.I. Svergun, Synergistic use of synchrotron radiation techniques for biological samples in solution: A case study on protein-ligand recognition by the peroxisomal import receptor Pex5p, J. Synchrotron Rad. 11 (2004) 490–496.
[14] D.J. Scott, J.G. Grossmann, J.R.H. Tames, O. Byron, K.S. Wilson and B.R. Otto, Low resolution solution structure of the apo form of Escherichia coli haemoglobin protease Hbp, J. Mol. Biol. 315 (2002) 1179–1187.
[15] J.G. Grossmann, J.F. Hall, L.D. Kanbi and S.S. Hasnain, The N-terminal extension of rusticyanin is not responsible for its acid stability, Biochemistry 41 (2002) 3613–3619.
[16] A. Hoffmann, A. Kane, D. Nettels, D.E. Hertzog, P. Baumgartel, J. Lengefeld, G. Reichardt, D.A. Horsley, R. Seckler, O. Bakajin and B. Schuler, Mapping protein collapse with single-molecule fluorescence and kinetic synchrotron radiation circular dichroism spectroscopy, Proc. Natl. Acad. Sci. U.S.A. 104 (2007) 105–110.
[17] K.H. Johnson, D.M. Gray and J.C. Sutherland, Vacuum UV CD spectra of homopolymer duplexes and triplexes containing AT or AU base-pairs, Nucleic Acids Res. 19 (1991) 2275–2280.
[18] S.B. Nielsen, T. Chakraborty and S.V. Hoffmann, Synchrotron radiation circular dichroism spectroscopy of ribose and deoxyribose sugars, adenosine, AMP and dAMP nucleotides, ChemPhysChem 6 (2005) 2619–2624.
[19] E.P. Stroyan and E.S. Stevens, An improved model for calculating the optical rotation of simple saccharides, Carbohydr. Res. 327 (2000) 447–453.
[20] K. Matsuo and K. Gekko, Vacuum-ultraviolet circular dichroism study of saccharides by synchrotron radiation spectrophotometry, Carbohydr. Res. 339 (2004) 591–597.
[21] A.I.S. Holm, E.S. Worm, T. Chakraborty, B.R. Babu, J. Wengel, S.V. Hoffmann and S.B. Nielsen, On the influence of conformational locking of sugar moieties on the absorption and circular dichroism of nucleosides from synchrotron radiation experiments, J. Photochem. Photobiol. A 187 (2007) 293–298.
[22] H.L. Bagger, S.V. Hoffmann, C.C. Fuglsang and P. Westh, Glycoprotein-surfactant interactions: A calorimetric and spectroscopic investigation of the phytase-SDS system, Biophys. Chem. 129 (2007) 251–258.
[23] E.A. Yates, C.J. Terry, C. Rees, T.R. Rudd, L. Duchesne, M.A. Skidmore, R. Levy, N.T.K. Thanh, R.J. Nichols, D.T. Clarke and D.G. Fernig, Protein-GAG interactions: New surface-based techniques, spectroscopies and nanotechnology probes, Biochem. Soc. Trans. 34 (2006) 427–430.
[24] S. Balasubramanian, T. Schneider, M. Gerstein and L. Regan, Proteomics of Mycoplasma genitalium: Identification and characterization of unannotated and atypical proteins in a small model genome, Nucleic Acids Res. 28 (2000) 3075–3082.
[25] H.M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T.N. Bhat, H. Weissig, I.N. Shindyalov and P.E. Bourne, The protein data bank, Nucleic Acids Res. 28 (2000) 235–242.
[26] B.A. Wallace and D. Mao, Circular dichroism analyses of membrane proteins: An examination of differential light scattering and absorption flattening effects in large membrane vesicles and membrane sheets, Anal. Biochem. 142 (1984) 317–328.
[27] B.A. Wallace, J.G. Lees, A.J.W. Orry, A. Lobley and R.W. Janes, Analyses of circular dichroism spectra of membrane proteins, Protein Sci. 12 (2003) 875–884.
[28] C. Guerra-Giraldez, B. Moore, B. Neves, B.A. Wallace, D.I. Svergun, K.A. Brown and D.F. Smith, Structural and functional analysis of the Leishmania infective stage-specific protein, SHERP, 3rd International Congress on Leishmania and Leishmaniasis Abstracts (2005).
[29] B.A. Wallace, L. Whitmore and R.W. Janes, The protein circular dichroism data bank (PCDDB): A bioinformatics and spectroscopic resource, Proteins: Struct. Funct. Bioinform. 62 (2006) 1–3.
[30] J.G. Lees, F. Wien, A.J. Miles and B.A. Wallace, A reference database for circular dichroism spectroscopy covering fold and secondary structure space, Bioinformatics 22 (2006) 1955–1962.
[31] P. Evans, O. Bateman, C. Slingsby and B.A. Wallace, A reference dataset for circular dichroism spectroscopy tailored for the βγ-crystallin lens proteins, Experimental Eye Res. 84 (2007) 1001–1008.
Modern Techniques for Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy B.A. Wallace and R.W. Janes (Eds.) IOS Press, 2009. © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-000-1-150
Linear Dichroism Spectroscopy: Techniques and Applications
Alison Rodger
Department of Chemistry, University of Warwick
Abstract. Linear dichroism is the difference in absorbance of light polarized perpendicular and parallel to an orientation direction. It is ideal for determining the relative orientations of transitions within molecules and hence of the subunits of biomacromolecular systems.
1. Introduction
A simple modification to a circular dichroism (CD) instrument enables it to be used for the technique of linear dichroism. One method is to insert a quarter wave plate in the light beam of a CD spectropolarimeter; this produces alternating beams of linearly polarized light. An alternative method, which is implemented on some machines, is a software change to LD mode, which doubles the voltage across the photoelastic modulator to achieve the same goal. The resulting spectrum is a linear dichroism (LD) spectrum. LD is formally defined to be the difference in absorbance of light polarized parallel to an orientation direction and light polarized perpendicular to that direction:
$$\mathrm{LD} = A_{\parallel} - A_{\perp} \qquad (1)$$
For uniaxial samples this may be written in terms of the reduced LD:
$$\mathrm{LD}^{r} = \frac{\mathrm{LD}}{A_{\mathrm{iso}}} = \frac{3}{2}\, S \left(3\cos^{2}\alpha - 1\right) \qquad (2)$$
where $S$ is the orientation parameter, $A_{\mathrm{iso}}$ is the isotropic absorbance of the same sample in the same pathlength, and $\alpha$ is the angle between the orientation direction and the transition polarization direction, as illustrated in Figure 1.
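Figure 1. The definition of the angles in LD. The schematic shows the transition moment μ at an angle α to the orientation axis z, for the LD > 0 and LD < 0 cases, with LD = (3/2) S A_isotropic (3cos²α − 1).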
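The angular dependence in Equation 2 can be made concrete with a few lines of Python. This is a minimal sketch rather than part of the original chapter: the function name `reduced_ld` and the constant `MAGIC_ANGLE_DEG` are illustrative choices, and S is simply passed in as a number.

```python
import math


def reduced_ld(S, alpha_deg):
    """Equation 2: LDr = (3/2) * S * (3*cos^2(alpha) - 1), alpha in degrees."""
    c = math.cos(math.radians(alpha_deg))
    return 1.5 * S * (3.0 * c * c - 1.0)


# Angle at which 3*cos^2(alpha) = 1, so the reduced LD vanishes for any S.
MAGIC_ANGLE_DEG = math.degrees(math.acos(1.0 / math.sqrt(3.0)))

if __name__ == "__main__":
    # Perfect orientation (S = 1) is used purely for illustration.
    for alpha in (0.0, MAGIC_ANGLE_DEG, 90.0):
        print(f"alpha = {alpha:5.1f} deg -> LDr = {reduced_ld(1.0, alpha):.3f}")
```

The sign change near 54.7° is why a transition lying close to that angle gives little or no LD even in a well oriented sample.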
Unless the sample is oriented the LD signal will be zero, so much of the challenge of LD spectroscopy is to orient the sample in a way that provides useful information. There are many ways of orienting molecular samples ranging from mounting a single crystal to aligning polymers in a shear flow system. In this chapter the focus will be on small molecules oriented in stretched films and on those molecules that can be oriented in shear flow, including polymers (DNA and fibrous proteins), and membrane proteins (bound to flow distorted liposomes). Most of the discussion can be transferred across to other methods of orientation such as squeezed gels, crystals, liquid crystals, magnetic fields, and electric fields [1, 2, 3, 4, 5]. The content of this chapter includes how to align molecules in films and in flow and then covers a range of applications which illustrate the type of data that can be collected and the conclusions that can be drawn. Some of the current experimental and analysis challenges are outlined.
2. Sample Orientation

2.1 Stretched Film Orientation of Small Molecules

For a molecule to be aligned in a stretched film it must be either an integral part of the film or associated sufficiently strongly that when the film is stretched the molecule follows the film alignment axis. In practice one of two films enables one to prepare aligned films of most molecules: polyethylene for non-polar molecules and polyvinyl alcohol for polar molecules.

2.1.1 Alignment in Polyethylene Films

Polyethylene (PE) is microcrystalline and when it is mechanically stretched along the manufacturer’s stretch direction a molecular orienting environment is produced. PE is well suited for orienting non-polar molecules for spectroscopy as it is transparent in the UV (above 200 nm), in the visible, and in the infrared regions. The key to success with PE film LD is the choice of PE and the degree of stretching. A good source of PE is the plastic bags supplied with micropipette tips. However, to stretch them you need to have some kind of mechanical stretcher into which the film can be fixed. In the absence of a film stretcher to stretch and hold the PE, magazine wrappers made of PE can usually be stretched by hand and fixed, using Blu-Tack, onto one wall of the sample compartment so that the light beam passes through the film. By convention the parallel direction of the polarized light is usually taken to be horizontal, so the stretch direction of the film should be aligned horizontally. It is advisable not to stretch too close to the breaking point of the polymer since the film has a tendency to become opaque and to rip suddenly. With a film stretcher a stretch factor of ~3 is fairly straightforward. One can add molecules to the polyethylene film either before or after stretching and get the same spectrum. The most efficient protocol is to first stretch the film, then measure the baseline spectrum (see below).
After the baseline has been collected, introduce the analyte into the stretched film by adding droplets of a solution containing the analyte in cyclohexane, chloroform or dichloromethane to the surface of the film. Allow the solvent to evaporate between drops. Since the solvent may also enter and swell the PE and is itself aligned when the film is stretched, the baseline must be measured on a film that has been treated with solvent in the same way as the sample
film will be. You should also endeavour to collect the sample spectrum on the same part of the film as the baseline.

2.1.2 Polyvinyl Alcohol Films

Polyvinyl alcohol (PVA) is a fairly universal host for polar molecules; the film is transparent in the UV (above 200 nm) and visible regions of the spectrum, though it has a strong absorption over large regions of the infrared [6]. For small molecules it has to be used in a dry form (less than a few percent water). PVA films are more difficult to prepare than PE films; however, the quality of data is often better. To prepare a PVA film, one mixes well-hydrolysed commercial PVA powder in cold water (10% w/v) to make a slurry, which is then heated to near boiling to form a viscous solution. The sample solution (typically ~5 mM in water, but the aim is to have the final film with an absorbance maximum between 0.1 and 1) is then added to half of the PVA solution, and the mixture is cast onto a glass plate and left to dry. The same volume of water is then added to the remainder of the solution, which is also cast onto a glass plate and left to dry (this typically takes one to three days in a well-ventilated dust-free place) to make a baseline film. Finally the films are stretched by the same factor (typically ~2) at an elevated temperature by holding the films in the hot air from a hair dryer as they are being stretched. PVA films are quite brittle and one really needs to use a well-engineered film stretcher. It is also advisable to stretch a safe amount first and measure the spectra before trying a larger stretch. The greater the stretch factor the greater the LD signal magnitude, until of course the film breaks.
2.1.3 Film Linear Dichroism of a Bimetallo Triple Helicate in PVA Film

When a molecule is oriented in a stretched film, the molecules adsorbed into the film are not all lined up as in a crystal, so the linear dichroism of such a molecule, even if we make the not entirely correct assumption that the orientation is uniform about the stretch direction, is for a transition polarized along the stretch direction

$$LD = 3 A_{iso}\, S(\theta) \qquad (3)$$

and for one perpendicular to the stretch direction

$$LD = -\frac{3}{2} A_{iso}\, S(\theta) \qquad (4)$$

where Aiso is the absorbance of an unoriented sample. The factor of three arises because of the averaging in 3 dimensions of the unoriented sample. S(θ) describes the orientational distribution of the long axis of the molecules in the film. More generally, for a transition polarized at an angle α to the stretch direction

$$LD = 3 A_{iso}\, O(\alpha)\, S(\theta) = 3 A_{iso} \left\{ \frac{1}{2}\left(3\cos^{2}\alpha - 1\right) \right\} \times \left\{ \frac{1}{2}\left(3\cos^{2}\theta - 1\right) \right\} \qquad (5)$$

To be really quantitative in our analysis we therefore need to determine S.
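Equations 3 to 5 are simple enough to evaluate directly, and Equation 4 can be inverted to estimate S from a band that is assumed to be purely perpendicular to the stretch direction. The sketch below is illustrative only: the function names are hypothetical, S is supplied as a single number rather than computed from a distribution of θ, and the LD and Aiso values in the example are invented for the arithmetic. If the band in fact has some parallel character, the true S is larger, so the inverted value is a lower bound.

```python
import math


def orientation_factor(angle_deg):
    """(1/2)(3*cos^2(angle) - 1): the form shared by O(alpha) and S(theta) in Equation 5."""
    c = math.cos(math.radians(angle_deg))
    return 0.5 * (3.0 * c * c - 1.0)


def film_ld(a_iso, alpha_deg, S):
    """Equation 5 with S given as a number: LD = 3 * Aiso * O(alpha) * S."""
    return 3.0 * a_iso * orientation_factor(alpha_deg) * S


def s_from_perpendicular_band(ld, a_iso):
    """Invert Equation 4, LD = -(3/2) * Aiso * S, for a band assumed purely perpendicular."""
    return -2.0 * ld / (3.0 * a_iso)


if __name__ == "__main__":
    # Hypothetical readings at one wavelength (not taken from Figure 2).
    ld, a_iso = -0.06, 0.40
    s_min = s_from_perpendicular_band(ld, a_iso)
    print(f"lower bound on S: {s_min:.3f}")
    # Consistency check: feeding S back into Equation 5 at alpha = 90 deg returns the LD.
    print(f"LD recomputed:    {film_ld(a_iso, 90.0, s_min):.3f}")
```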
Figure 2. (a) A bis iron tris-chelate helicate [Fe2L3]4+ of empirical formula: [Fe2(C25H20N4)3]Cl4. Hydrogens have been omitted for clarity. (b) Absorbance, LD and LDr spectra for [Fe2L3]4+ in PVA film prepared as described in the text. (c) Component spectra derived as outlined in the text. Labels in the original figure: panel (a) marks the z long axis and the x-y short axis; panel (b) plots film LD, UV-Vis absorbance and LDr against wavelength / nm; panel (c) plots the z and y component absorbances, Az(1)–Az(5) and Ay(1)–Ay(5), against wavelength / nm.
The film LD spectrum of a tetracationic iron triple helicate is shown in Figure 2 [7]. If we assume (reasonably) that the molecule orientation axis is the long axis of the triple helix and that the orientation is uniaxial (due to the x-y degeneracy of the metal complex itself), then the negative sign of the LD from 500–600 nm indicates that the transitions here are predominantly short axis (i.e. ⊥) polarized. By assuming they are purely ⊥ polarized we get a lower bound on the orientation parameter. Component polarized spectra with incrementally increased orientation parameters are given in Figure 2c. The in-ligand transitions below 400 nm have slightly more // than ⊥ character. The “best looking” spectra with no negative values are chosen as the component spectra.

2.2 Flow Orientation of Macromolecules

The shear forces exerted on a polymeric molecule such as DNA that is dissolved in water and flowed past a stationary surface at about 1 m/s give sufficient orientation to ensure that A// and A⊥ are different. If the walls are quartz then we can measure LD spectra in the visible and UV regions. The problem that arises if the shear flow is provided by a linear flow-through system such as an HPLC pump is that it is very
expensive on sample. Wada in 1964 [8, 9] solved this issue with the invention of a Couette flow cell where the sample is endlessly flowed between two cylinders, one of which rotates and one of which is stationary. Much more recently we developed a microvolume Couette flow cell that requires less than 50 µL of sample rather than the millilitres required by previous Couette cells [10, 11]. As a result more than one experiment on expensive proteins such as the cytoskeletal proteins tubulin, actin and FtsZ can be performed. Figure 3 illustrates a standard and a microvolume Couette flow cell. A compromise between linear flow-through systems and Couette flow is a cycling linear flow design.
Figure 3. Large volume and microvolume Couette flow cells. Labels in the original figure: lid; housing; 100 mm; 500 μm path length; quartz rod, 2.4 mm; quartz capillary, OD 5 mm, ID 2.9 mm.
2.2.1 Practical Considerations

Assuming one has a spectrometer able to do LD measurements, one then needs a cell that will flow orient the samples. The easiest approach is to use a Couette cell; until relatively recently one had to build one’s own, but they are now commercially available. A large volume cell [1] of the kind illustrated in Figure 3 may be loaded using a micropipette (usually of 1 mL volume) with sufficient sample to ensure the meniscus is above the window through which the light passes. It is easy to add additional material such as a DNA binding ligand to measure a titration series. The microvolume option [11,12], as the name implies, requires less sample, but it is more difficult to ensure that no air bubble is trapped in the light beam. Adding material after the initial loading is also not as straightforward as with the larger cell. When collecting data one must ensure that the sample absorbance is not too high. If the photomultiplier tube is not receiving many photons it becomes unreliable. As with CD the ideal absorbance is about 1 but a range from 0.1–2 is appropriate. The approach of ensuring that the voltage on the photomultiplier tube is, e.g., less than 600 V is not always good enough. Almost by definition the samples that one is using for LD scatter light. Especially at lower wavelengths, where xenon lamp instruments are struggling for light intensity, this means that a significant percentage of the light reaching the photomultiplier tube from a light scattering sample may be scattered light, giving one a false sense of security. It is important to check that the Beer–Lambert law is being followed by diluting the sample and checking the spectrum. False maxima are otherwise observed, as illustrated in Figure 4 [13].
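The dilution check described above can be scripted. The following is a minimal sketch under stated assumptions, not a procedure from the chapter: the function name `beer_lambert_check`, the 5% tolerance and the example readings are all hypothetical. It relies only on the idea that, if the Beer–Lambert law holds, the signal at a fixed wavelength divided by the concentration is the same at every dilution.

```python
def beer_lambert_check(concentrations, signals, tolerance=0.05):
    """Compare signal/concentration across a dilution series.

    concentrations: positive sample concentrations (any consistent unit).
    signals: LD (or absorbance) at one fixed wavelength, one per concentration.
    tolerance: assumed acceptable relative spread of the ratios (hypothetical default).
    Returns (ratios, spread, ok), where spread = (max - min) / |mean| of the ratios.
    """
    ratios = [s / c for c, s in zip(concentrations, signals)]
    mean_ratio = sum(ratios) / len(ratios)
    spread = (max(ratios) - min(ratios)) / abs(mean_ratio)
    return ratios, spread, spread <= tolerance


if __name__ == "__main__":
    # Invented readings: the signal falls behind at the highest concentrations,
    # as it would when too little light reaches the detector.
    conc_uM = [10.0, 20.0, 40.0, 80.0]
    ld = [0.010, 0.020, 0.038, 0.060]
    ratios, spread, ok = beer_lambert_check(conc_uM, ld)
    print("signal per unit concentration:", [f"{r:.5f}" for r in ratios])
    print(f"relative spread = {spread:.2f}; Beer-Lambert obeyed within tolerance: {ok}")
```

Repeating the check at several wavelengths, including the short-wavelength end of the scan, is what would reveal a false maximum of the kind shown in Figure 4.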
Figure 4. Far UV LD spectra showing the apparent shift to shorter wavelength of the maximum signal as the concentration of F-actin is reduced. F-actin concentrations 93, 74, 62, 53 and 12 µM (the true spectrum, solid line). Axes: LD / absorbance units against wavelength / nm (200–320 nm).
Light scattering also gives a contribution to the measured signal which is apparent as a sloping baseline. Nordh et al. [14] showed that a simple empirical correction as illustrated in Figure 5 often can be used to give the true absorbance LD [15].
$$LD_{total} = LD_{A} + LD_{\tau}, \qquad LD_{\tau}(\lambda) = a\lambda^{-k}$$

where λ is the wavelength, k = 3.5, and a is determined by rescaling the curve at 320 nm.

Figure 5. A method of light scattering correction applied to a polymerised tubulin LD spectrum: the experimental data (–––); the calculated turbidity LD, using a k value of 3.5, with a determined by rescaling the curve at 320 nm where there is no intrinsic absorbance (----); and the corrected data (– – –). Axes: LD / absorbance units against wavelength / nm (240–340 nm).
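The empirical correction in Figure 5 amounts to fitting a single scale factor and subtracting a power-law curve. The sketch below shows one way to code it; it is an illustration under stated assumptions, not the authors' software. The function names are hypothetical, k = 3.5 and the 320 nm reference are taken from the figure, and the method assumes the sample has no intrinsic absorbance at the reference wavelength. The spectrum in the example is synthetic.

```python
def turbidity_ld(wavelength_nm, a, k=3.5):
    """Scattering contribution LD_tau(lambda) = a * lambda**(-k)."""
    return a * wavelength_nm ** (-k)


def correct_for_scattering(wavelengths_nm, ld_total, ref_nm=320.0, k=3.5):
    """Subtract a * lambda**(-k), with a fixed so the curve matches the data at ref_nm.

    Assumes all signal at the reference wavelength is scattering (no intrinsic absorbance).
    Returns (a, corrected_spectrum).
    """
    # Use the measured point closest to the reference wavelength.
    i_ref = min(range(len(wavelengths_nm)),
                key=lambda i: abs(wavelengths_nm[i] - ref_nm))
    a = ld_total[i_ref] * wavelengths_nm[i_ref] ** k
    corrected = [ld - turbidity_ld(w, a, k) for w, ld in zip(wavelengths_nm, ld_total)]
    return a, corrected


if __name__ == "__main__":
    # Synthetic example: an invented absorbance LD plus a scattering background.
    wl = [240.0, 260.0, 280.0, 300.0, 320.0]
    true_ld = [0.030, 0.050, 0.040, 0.010, 0.000]
    a_true = 0.02 * 320.0 ** 3.5
    measured = [t + turbidity_ld(w, a_true) for w, t in zip(wl, true_ld)]
    a_fit, corrected = correct_for_scattering(wl, measured)
    for w, m, c in zip(wl, measured, corrected):
        print(f"{w:.0f} nm  measured {m:.4f}  corrected {c:.4f}")
```

Because the background grows steeply towards short wavelengths, an error in the reference point is amplified at 240 nm, so the choice of a reference wavelength free of absorbance matters.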
3. Flow Linear Dichroism Applications

Flow linear dichroism measurements are relatively easy to perform (if one has a Couette flow cell and a CD machine adapted for LD) and they yield structural information that is often impossible to get by any other technique. Flow LD is suited to the kinds of molecules that techniques such as X-ray crystallography and NMR are not, namely long DNAs, fibrous proteins, membrane proteins in liposomes as well as other
intriguing molecular systems such as carbon nanotubes. The information one can obtain relates to relative orientations of parts of the molecular system. The simplest system to use is DNA in aqueous solution.
3.1 DNA

Most of the flow LD literature relates to DNA and DNA/small molecule systems, reflecting the research interests of laboratories who had LD cells and also the comparatively ready availability of long DNA. From a spectroscopic point of view DNA is a spiral staircase of aromatic molecules, the nucleotide bases. In the most common form of DNA, B-DNA, the staircase is right handed and the steps are perpendicular to the helix axis, which is also the orientation axis.
Figure 6. Linear dichroism in a 1 mm pathlength cell of calf thymus DNA with and without (a) ethidium bromide bound DNA (200 µM, 20 mM NaCl, pH=7); (b) Hoechst 33258 (50 µM) with calf thymus DNA (1000 µM, 20 mM NaCl, pH=7). Also shown is Hoechst absorbance and DNA + Hoechst absorbance in a 1 cm pathlength cuvette. Axes: (a) LD / absorbance units against wavelength / nm (200–600 nm) for [ethidium bromide] from 0 to 50 μM in 5 μM steps; (b) LD (left axis) and absorbance (right axis) against wavelength / nm (200–500 nm) for DNA, DNA + Hoechst and Hoechst.
As the π→π* transitions of the aromatic bases are all polarized within the plane of the steps of the staircase, the absorbance is perpendicular to the orientation axis, so we expect the LD spectrum to be very similar to an upside-down DNA absorbance spectrum. As Figure 6 shows this is indeed the case. The magnitude of the LD depends on how fast the solution is flowing or equivalently how fast the Couette cell is spinning. As long as the flow remains laminar, faster is usually better. When we add small molecule ligands to the DNA solution we expect to see them in the LD spectrum if and only if they are bound to the DNA. Ethidium bromide (Figure 6) is often used as a DNA stain; it is a planar aromatic molecule that intercalates between the DNA base pairs. Its LD signals are therefore of the same sign as those of the DNA bases. If one takes the ratio of the LD and the absorbance (to give the reduced LD, LDr) at 500 nm where only ethidium absorbs and at 260 nm where both DNA and ethidium have signals, one finds that the ethidium region signal is slightly larger in magnitude even though both are nominally perpendicular to the helix axis. This is the result of the stiffening of the DNA induced by the intercalating ethidium. Hoechst, another DNA stain, is by way of contrast a minor groove binder and so its long axis polarized transitions are expected to lie at ~45° to the helix axis (Figure 6). Referring back to Equation 2 and using the data in Figure 6b, assuming the DNA bases lie at an angle of 86° to the helix axis, it follows that S = 0.030, and the long axis of Hoechst actually makes an angle of 47° with the DNA helix axis. Thus, for new ligands the orientation of the ligand on DNA can be determined once the polarizations of transitions within the ligand and the orientation parameter of the DNA are known. It is essential for such calculations that we have a value for S.
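The two-step calculation just described (first S from a band of known polarization, then the angle of an unknown transition) is a direct inversion of Equation 2. The sketch below is illustrative: the function names are hypothetical, and the LDr values in the example are not read from Figure 6 but were chosen so that the arithmetic reproduces the S ≈ 0.030 and ~47° quoted in the text.

```python
import math


def orientation_parameter(ldr, alpha_deg):
    """Invert Equation 2 for S, given LDr of a transition at a known angle alpha."""
    c = math.cos(math.radians(alpha_deg))
    return ldr / (1.5 * (3.0 * c * c - 1.0))


def transition_angle(ldr, S):
    """Invert Equation 2 for alpha (degrees), given LDr and the orientation parameter S."""
    cos_sq = (ldr / (1.5 * S) + 1.0) / 3.0
    if not 0.0 <= cos_sq <= 1.0:
        raise ValueError("LDr and S are inconsistent with Equation 2")
    return math.degrees(math.acos(math.sqrt(cos_sq)))


if __name__ == "__main__":
    # Illustrative reduced LD values (assumptions, not measured data).
    ldr_dna_260 = -0.0443      # DNA band, bases assumed at 86 deg to the helix axis
    ldr_ligand = 0.0178        # long-axis polarized ligand band
    S = orientation_parameter(ldr_dna_260, 86.0)
    alpha = transition_angle(ldr_ligand, S)
    print(f"S = {S:.3f}")
    print(f"ligand long axis at {alpha:.0f} deg to the helix axis")
```

The `ValueError` branch is a reminder that an LDr larger than 3S or smaller than −1.5S cannot arise from Equation 2, which usually signals that the assumed S does not apply to the ligand-bound sample.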
Many DNA-binding ligands have absorbance at 260 nm, so their LD overlays that of the DNA, meaning we cannot simply use the DNA spectral region. Hoping S is the same for DNA with and without a ligand may be the only option in such a case; however, this can be dangerous as Figure 7 shows. In this case the bimetallo iron helicate (illustrated in Figure 2) binds to the DNA and also intramolecularly coils it up.
Figure 7. The LD of a tetracationic di-iron triple helicate binding with calf thymus DNA showing the effect on the LD if the ligand bends the DNA. ct-DNA (500 μM, 20 mM NaCl, 1 mM sodium cacodylate buffer pH=7) and ligand ratios are shown on the figure. Axes: LD / absorbance units against wavelength / nm (200–700 nm) for ct-DNA alone and at ratios of 500:1, 200:1, 160:1, 120:1, 100:1, 80:1, 40:1, 20:1 and 10:1.
3.2 Cytoskeletal Proteins

The cytoskeleton in both prokaryotic and eukaryotic cells is dependent on the rapid assembly and disassembly of polymers whose monomeric units are themselves folded proteins. The structures of the monomers change little if at all when they form a fibre. The possibilities of LD spectroscopy can be illustrated by looking at FtsZ, the bacterial homologue of tubulin, and at tubulin itself.
Figure 8. Top: FtsZ (11 µM) polymerisation in the presence of MgCl2 (10 mM) and varying amounts of Ca2+ (50 mM MES buffer pH 6.5, 50 mM KCl, 0.1 mM EDTA and GTP (0.2 mM)). The GTP region is expanded in the insert. Bottom: Schematic of guanine reorientation upon addition of calcium. Labels in the original figure: LD / absorbance units against wavelength / nm for [Ca2+] = 0, 1, 3, 5 and 10 mM, with LD260 nm > 0 and LD280 nm < 0 marked in the insert; schematic panels (a) and (b) show FtsZ monomers along the fibre axis (molecular orientation axis) and the guanine long axis (244 nm) and short axis (278 nm) transitions.
FtsZ forms the so-called Z-ring which contracts and pulls the cell membrane in to enable one cell to divide into two daughter cells. FtsZ polymerises to protofilaments on the seconds timescale when GTP (guanosine triphosphate) and Mg2+ are present. The protofilament LD spectrum is given in the 0 mM Ca2+ spectrum of Figure 8. Upon
addition of Ca2+ the FtsZ protofilaments bundle together [16]. The bundling process results in the guanine chromophores of the GTPs (which are between each pair of monomer units) tilting as shown by the positive peak that appears at 260 nm. By considering the polarizations of the guanine transitions, one can deduce that the G in the protofilaments is more or less perpendicular to the fibre axis, whereas in the bundles the G’s have been tilted as suggested in the schematic of Figure 8. (Note this FtsZ has no tryptophans to interfere with this region of the spectrum). The protein YgfE performs the same role as the calcium ions but at more biologically realistic concentrations [17].
Figure 9. (a) Capillary LD at two absorbance wavelengths of tubulin polymers (28 μM monomer) at 37 ºC using a thermostatted LD capillary Couette cell. (b) LD at 282 nm, 232 nm of tubulin (28 μM) before and after addition of taxol (final concentrations 25.5 μM and 18.2 μM respectively) at 37 ºC using a thermostatted LD capillary Couette cell. (c) LD spectra at 282 nm, 232 nm of tubulin (28 μM) before and after addition of colchicine (final concentrations 25.5 μM and 18.2 μM respectively) at 37 ºC using a thermostatted LD capillary Couette cell. Axes: LD / absorbance units against time / minutes in each panel; the traces are labelled LD 280 nm and LD 250 nm, and one panel marks the predicted extrapolation of the tubulin alone curve.
Because monomeric FtsZ and tubulin have no flow LD signal, LD is the ideal technique to follow the kinetics of fibre assembly and disassembly as illustrated in Figure 9. Tubulin is a dimeric protein composed of two subunits that assembles into microtubule structures. As with FtsZ there is a GTP between each monomer. Rather
than forming a Z-ring, tubulin polymers radiate from the centrosome of a cell to attachment sites just under the cell membrane and from mitotic spindles to chromosomes undergoing separation during cell division. Microtubules also play a role in moving cells and organelles and interact with motor proteins. They are thus very attractive drug targets. Figure 9 shows the kinetics of polymerisation and depolymerisation of tubulin [18]. That the rates are the same in the aromatic region (280 nm) and backbone region (237 nm) tells us that the backbone and side chains are adopting their final orientation simultaneously. The high concentrations of tubulin needed to initiate polymerisation mean that the data are only reliable down to about 235 nm. The polymerisation in the absence (left hand side) and presence of taxol illustrates the stabilisation effect of taxol on the microtubules. Colchicine (which is used in the treatment of gout) by way of contrast results in tubulin depolymerising. The actual mechanism is inhibition of microtubule formation by binding to tubulin monomers. Because microtubules are constantly polymerising at one end and depolymerising at the other as long as a supply of GTP is present, the net effect of colchicine is to cause a rapid depolymerisation by preventing the polymerisation reaction but allowing the depolymerisation one.

3.3 Membrane Proteins and Peptides

Membrane proteins and peptides are peripherally associated with or embedded within the cell’s bilayer membranes. They play key roles in cells and are directly linked to viral infection, cancer, diabetes, and heart disease, to name a few. They are therefore important drug targets and also drug candidates, but we have only a very limited understanding of their structure, function, and intermolecular interactions (of the order of 100 membrane protein structures compared with more than 30,000 soluble protein structures solved).
Membrane proteins have been studied by LD for a long time using more-or-less dried films and squeezed gel methodologies [5]. Nordén et al. [19] opened up the possibilities for lipid LD by showing that small molecules such as pyrene could be flow oriented in unilamellar liposomes, which are model membrane systems with a single bilayer of lipid enclosing a central space. Rodger et al. [20] showed this could be extended to proteins and peptides, and since then we have been developing the use of LD of peptides and proteins inserted in liposomes as a method for probing their structure and kinetics of insertion. The possibilities are illustrated below with reference to bacteriorhodopsin, probably the most studied membrane protein, and gramicidin, a similarly popular membrane inserting peptide.
Figure 10. Schematic of liposomes distorted in shear flow. Labels in the original figure: liposome; shear deformed liposome; model of liposome; normal z, Z.
The geometry of the liposome experiment is different from that of long polymers since the flow direction creates a long axis of the liposome (Figure 10) but the membrane normal is the molecular orientation direction. The equation analogous to the uniaxial LD Equation for liposome systems is therefore:
$$LD^{r} = \frac{LD}{A_{iso}} = \frac{3S}{4}\left(1 - 3\cos^{2}\beta\right) \qquad (6)$$
where β is the angle between the lipid normal and the long axis of the molecule as illustrated in Figure 10.

3.3.1 Bacteriorhodopsin

Bacteriorhodopsin (BR) is a membrane protein found in the purple membrane of Halobacteria [21]. It is a 248 residue protein and includes a covalently bound retinal chromophore (Figure 11). Each BR has 7 transmembrane helices, 3 of which in the crystal have their axes at ~70° to the lipids while the remaining 4 are parallel to the lipids [22]. The long axis of the retinal lies at ~69° to the lipids in the crystal.
Figure 11. (a) All-trans retinal converted to the Schiff base. The vector represents the transition dipole moment of the 570 nm transition of retinal. (b) Tryptophan and its transition polarizations (labelled Lb 287 nm, La 265 nm and Bb 220 nm in the original figure). (c) Spectra of bacteriorhodopsin (0.2 mg/mL) added to a liposome solution (0.5 mg/mL). (i) Absorption (dashed line, 1 mm pathlength, baseline: liposome absorption spectrum) and LD (solid line, 0.5 mm pathlength, baseline: LD spectrum of sample without rotation); (ii) CD (1 mm pathlength, baseline: liposome CD spectrum). Axes: (i) LD and absorbance/100 against wavelength / nm (200–700 nm); (ii) CD / mdeg against wavelength / nm (200–320 nm).
A flow LD spectrum of BR inserted into liposomes is shown in Figure 11c [23]. The 570 nm peak is due to a long axis polarized transition of the retinal chromophore; the broad peak in the near UV region (260–290 nm) is due to the transitions of the protein aromatic side chains, dominated by the indole chromophore of the tryptophan residues; and the peak observed in the far UV region (~220–230 nm) is due to the peptide n→π* transition of the amide groups and, at shorter wavelengths, to the π→π* transitions. If we assume the retinal is at 69° to the lipids, we can use Equation 6 and the absorbance of the sample to determine that S ~ 0.05. It then follows that the tryptophan transitions are at β(La, 270 nm) ~ 60º and β(Lb, 287 nm) ~ 65º, which is consistent with the fact that the retinal is sandwiched by tryptophan residues in the X-ray structure [21]. The protein backbone LD spectrum shows a positive maximum at 220 nm (n→π*) and a negative maximum at ~ 213 nm (π→π*), from which it follows that the n→π* transition (which is polarized perpendicular to the α-helix long axis) is ~ 58º from the average lipid direction. Thus the average orientation of the transmembrane helices is ~ 30º from the membrane normal. This value suggests that the protein is less rigidly held in a liposome than when dried or crystallised.
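The liposome analysis follows the same pattern as the DNA case, but with Equation 6 in place of Equation 2. The sketch below is a hedged illustration: the function names are hypothetical, and the example does a round trip from the S ~ 0.05 and 69° quoted in the text rather than using measured LDr values.

```python
import math


def liposome_reduced_ld(S, beta_deg):
    """Equation 6: LDr = (3S/4) * (1 - 3*cos^2(beta)), beta in degrees."""
    c = math.cos(math.radians(beta_deg))
    return 0.75 * S * (1.0 - 3.0 * c * c)


def liposome_S(ldr, beta_deg):
    """Invert Equation 6 for S, given LDr of a transition at a known angle beta."""
    c = math.cos(math.radians(beta_deg))
    return ldr / (0.75 * (1.0 - 3.0 * c * c))


def liposome_beta(ldr, S):
    """Invert Equation 6 for beta (degrees), given LDr and S."""
    cos_sq = (1.0 - ldr / (0.75 * S)) / 3.0
    if not 0.0 <= cos_sq <= 1.0:
        raise ValueError("LDr and S are inconsistent with Equation 6")
    return math.degrees(math.acos(math.sqrt(cos_sq)))


if __name__ == "__main__":
    # Round trip using the values quoted in the text (S ~ 0.05, retinal at 69 deg).
    ldr_retinal = liposome_reduced_ld(0.05, 69.0)
    S = liposome_S(ldr_retinal, 69.0)
    print(f"LDr expected for retinal: {ldr_retinal:.4f}; recovered S = {S:.3f}")
    # With S fixed, any other band's LDr converts to an angle in the same way.
    print(f"beta for LDr = 0.0059: {liposome_beta(0.0059, S):.0f} deg")
```

Note the sign convention relative to Equation 2: in liposomes a transition lying near the membrane normal (small β) gives negative LD, while one lying in the plane of the membrane gives positive LD.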
Figure 12. CD and LD spectra of gramicidin (100 μg mL−1 in 10% v/v TFE / water, with 1.8 mg mL−1 PC) insertion into PC liposomes. Spectra were measured every 2 minutes, but plotted every 12 minutes. The thick dashed line is a spectrum from the same sample after 24 hours. The CD pathlength was 1 mm (cuvette) and the LD pathlength was 0.5 mm. Kinetic plots are overlaid below; the folding and insertion happen on the same time scale. Axes: CD / mdeg (220–250 nm) and LD / Δ Absorbance (220–320 nm) against wavelength / nm; mean CD signal 228–230 nm / mdeg and mean LD signal 226–228 nm / Δ Absorbance against time / minutes.
3.3.2 Gramicidin Insertion

Measuring the kinetics of spontaneous insertion of peptides into membranes is often carried out using techniques such as fluorescence and circular dichroism (CD) spectroscopy. If the events are fast, then combining these techniques with stopped flow enables one to follow events down to millisecond timescales. CD and fluorescence approaches rely respectively on the signal changing in response to a change in the fold of the peptide and a change in the environment (usually polar to non-polar). It is usually assumed that the polar to non-polar environment change corresponds to insertion into the membrane, though this is not necessarily the case. LD by way of contrast gives a direct readout on the orientation change of the peptide which occurs upon insertion into the membrane. The first case of LD being used for this purpose was to follow the insertion of gramicidin A (a linear antibiotic pentadecapeptide from Bacillus brevis) into a membrane. Fluorescence has been used to conclude that gramicidin insertion occurred on the ms–s timescale. It turns out, however, that the fluorescence data had missed a step in the insertion process because the fluorescence is the same for an unfolded peptide with tryptophans inserted into the membrane and for a folded peptide properly inserted. The CD and LD spectra of Figure 12 clearly show that the folding and insertion of the peptide happen simultaneously, but on a longer timescale than simple tryptophan insertion.

References

[1] A. Rodger, Linear dichroism, Methods Enzymol. 226 (1993) 232-258.
[2] A. Rodger and B. Nordén, Circular Dichroism and Linear Dichroism, Oxford University Press, (1997).
[3] B. Nordén, M. Kubista and T. Kurucsev, Linear dichroism spectroscopy of nucleic-acids, Q. Rev. Biophys. 25 (1992) 51-170.
[4] B. Nordén, Applications of Linear Dichroism Spectroscopy, Appl. Spectroscopy Rev. 14 (1978) 157-248.
[5] H. van Amerongen and R. van Grondelle, Orientation of the bases of the single-stranded DNA and polynucleotides in complexes formed with the gene 32 protein of bacteriophage T4. A linear dichroism study, J. Mol. Biol. 209 (1989) 433-445.
[6] Y. Matsuoka and B. Nordén, Linear dichroism studies of nucleic acid bases in stretched poly(vinyl alcohol) film. Molecular orientation and electronic transition moment directions, J. Phys. Chem. 86 (1982) 1378-1386.
[7] A. Rodger, K.J. Sanders, M.J. Hannon, I. Meistermann, A. Parkinson, D.S. Vidler and I.S. Haworth, DNA structure control by polycationic species: polyamines, cobalt ammines, and di-metallo transition metal chelates, Chirality 12 (2000) 221-236.
[8] A. Wada, Chain regularity and flow dichroism of deoxyribonucleic acids in solution, Biopolymers 2 (1964) 361-380.
[9] A. Wada, Dichroic spectra of biopolymers oriented by flow, App. Spectros. Rev. 6 (1972) 1-30.
[10] R. Marrington, T.R. Dafforn, D.J. Halsall and A. Rodger, Micro volume Couette flow sample orientation for absorbance and fluorescence linear dichroism, Biophysical J. 87 (2004) 2002-2012.
[11] R. Marrington, T.R. Dafforn, D.J. Halsall, M.R. Hicks and A. Rodger, Validation of new microvolume Couette flow linear dichroism cells, Analyst 130 (2005) 1608-1616.
[12] R. Marrington, T.R. Dafforn, D.J. Halsall and A. Rodger, Micro volume Couette flow sample orientation for absorbance and fluorescence linear dichroism, Biophysical J. 87 (2004) 2002-2012.
[13] A. Rodger, R. Marrington, M.A. Geeves, M.R. Hicks, L. de Alwis, D.J. Halsall and T.R. Dafforn, Looking at long molecules in solution: what happens when they are subjected to Couette flow? Phys. Chem. Chem. Phys. 8 (2006) 3161-3171.
[14] J. Nordh, J. Deinum and B. Nordén, Flow orientation of brain microtubules studied by linear dichroism, European Biophysics J. 14 (1986) 113-122.
[15] R. Marrington, M. Seymour and A. Rodger, A new method for fibrous protein analysis illustrated by application to tubulin microtubule polymerisation and depolymerisation, Chirality 18 (2006) 680-690.
[16] R. Marrington, E. Small, A. Rodger, T.R. Dafforn and S.G. Addinall, FtsZ fibre bundling is triggered by a calcium-induced conformational change in bound GTP, J. Biol. Chem. 47 (2004) 48821-48829.
[17] E. Small, R. Marrington, A. Rodger, D.J. Scott, K. Sloan, D. Roper, T.R. Dafforn and S.G. Addinall, FtsZ polymer-bundling by the Escherichia coli ZapA orthologue YgfE involves a conformational change in bound GTP, J. Mol. Biol. 369 (2007) 211-221.
[18] R. Marrington, M. Seymour and A. Rodger, A new method for fibrous protein analysis illustrated by application to tubulin microtubule polymerisation and depolymerisation, Chirality 18 (2006) 680-690.
[19] M. Ardhammar, N. Mikati and B. Nordén, Chromophore orientation in liposome membranes probed with flow linear dichroism, J. Amer. Chem. Soc. 120 (1998) 9957-9958.
[20] A. Rodger, J. Rajendra, R. Marrington, M. Ardhammar, B. Nordén, J.D. Hirst, A.T.B. Gilbert, T.R. Dafforn, D.J. Halsall, C.A. Woolhead, C. Robinson, T.J. Pinheiro, J. Kazlauskaite, M. Seymour, N. Perez and M.J. Hannon, Flow oriented linear dichroism to probe protein orientation in membrane environments, Phys. Chem. Chem. Phys. 4 (2002) 4051-4057.
[21] D. Oesterhelt and W. Stoeckenius, Rhodopsin-like protein from the purple membrane of Halobacterium halobium, Nature New Biol. 233 (1971) 149-152.
[22] R. Henderson, J.M. Baldwin, T.A. Ceska, F. Zemlin, E. Beckmann and K.H. Downing, Model for the structure of bacteriorhodopsin based on high-resolution electron cryo-microscopy, J. Mol. Biol. 213 (1990) 899-929.
[23] J. Rajendra, A. Damianoglou, M.R. Hicks, P. Booth, P.M. Rodger and A. Rodger, Quantitation of protein orientation in flow-oriented unilamellar liposomes by linear dichroism, Chem. Phys. 326 (2006) 210-220.
Modern Techniques for Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy B.A. Wallace and R.W. Janes (Eds.) IOS Press, 2009. © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-000-1-165
Methods of Analysis for Circular Dichroism Spectroscopy of Proteins and the DichroWeb Server Lee Whitmore and B.A. Wallace Department of Crystallography, Birkbeck College, University of London
Abstract: Circular dichroism spectroscopy is a very useful method for the analysis of the secondary structures of proteins. This chapter summarises the current (and past) algorithms, methodologies and reference datasets developed for such analyses, including the online analysis server DichroWeb. The chapter also includes sections on practical approaches to such analyses with existing software, and potential sources of errors in analyses and how they can be avoided.
1. Introduction Analyses of protein secondary structures from circular dichroism (CD) data are based on the notion that peptides and proteins in different conformations produce spectra with different characteristics. Greenfield and Fasman [1] demonstrated that the spectra of polylysine were different when the polypeptide was examined at different pH values. At those different pHs, the polypeptide adopted either mostly helical, mostly sheet, or “random coil” conformations. These different types of secondary structure arise from different peptide backbone φ,ψ angles, resulting in the n→π* and π→π* transitions in the far ultraviolet (UV) wavelength region producing spectral peaks of different magnitudes and centred at different wavelengths, depending on the backbone angles.

There are several ways that these spectral data may be used to provide information on the secondary structure of a protein. The methods broadly fall into three categories: visual inspection, empirical analysis and ab initio analysis. The ab initio-type analyses are described in the chapter in this book by Bulheller and Hirst and so will not be discussed further here.

Visual inspection is the most basic method of analysis, but can be adequate for qualitative identification of proteins that contain mostly one type of secondary structure, as shown in Figure 1. Mainly alpha-helical proteins exhibit a large positive peak around 190 nm and two smaller (approximately half-height) negative peaks at around 208 nm and 222 nm, whilst mainly beta-sheet proteins have a considerably smaller positive peak (~20% of the magnitude of that of a mostly helical structure), which is red-shifted compared to that for a helix, and a single negative peak at higher wavelengths (which is considerably smaller than either of the negative peaks in the helix spectrum).
Unfolded (denatured) proteins and proteins that are mostly composed of polyproline (PPII) structures (like collagen) give rise to a spectrum that is mostly distinguished by a large negative peak at around 200 nm, and a very small positive peak in the region above 210 nm.
Figure 1. Examples of CD spectra of proteins with mostly-helical, mostly-sheet, and mostly-polyproline-II type structures, showing the spectral characteristics of these types of secondary structures. The spectrum of the mostly-helical protein, myoglobin (black) has a much larger magnitude at all wavelengths than does that of the mostly-sheet protein, concanavalin A (grey). The spectrum of collagen (dotted) is of a protein that contains a large amount of PPII structure and has a large negative peak at wavelengths below 200 nm, making it very distinguishable from the other two. “Unordered” proteins also give rise to a negative peak in this region, but are usually blue-shifted with respect to the PPII spectrum. (Adapted from Miles and Wallace [3]).
A simple method [1, 2] was developed for quantitating the helical content alone, based on the observation that at around either 208 or 222 nm only helical structures contribute significantly to the spectrum. Calculations are done by comparing the magnitude of the spectrum at this wavelength with the magnitude expected for a (theoretically) 100% helical structure. However, this method is unreliable, not only because some types of beta-sheet structures contribute at these wavelengths, but primarily because the magnitude of a single peak is the least well-defined characteristic of a spectrum, owing to possible errors in protein concentration, pathlength determination and instrument calibration (discussed in detail in the chapter on Calibration by Miles and Wallace in this volume). This method has been widely used for characterising peptides but, ironically, it is even more problematic for molecules such as small flexible peptides, which exist as an equilibrium mixture of very different conformations in solution. As the magnitude is an average of the magnitudes of the mixture of conformations present, if the peptide is mostly helical (and hence has a large 222 nm peak), but there are also unfolded species present which have no peak at this wavelength, the net spectrum will have an intermediate magnitude, and will not reflect the conformation of either of the forms present.

More accurate methods have since been developed which rely not on a single data point, but on the magnitudes and spectral positions of peaks over a wide range of wavelengths in the far UV spectral region. This chapter will concentrate on the methodologies that have been developed to apply empirical analyses to the far UV CD spectra of proteins. These techniques use spectra derived from proteins of known (crystal) structures either to create basis (representative) spectra for the principal types of secondary structures or to create a reference dataset of spectra for variable selection protocols. Based on proportional combinations of these reference spectra, the spectrum of an unknown protein can be reconstructed; the calculated secondary structure is then derived from the secondary structures of the components used to reconstruct the unknown spectrum, in the relative proportions that they contributed to the reconstructed spectrum. These empirical methods have been shown to produce good estimates of secondary structure composition for proteins containing standard types of secondary structures.
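The limitation of the single-wavelength estimate described earlier in this section can be illustrated numerically. The sketch below is not taken from references [1, 2]; the function names and the value used for a fully helical signal at 222 nm are hypothetical and chosen only for illustration, and the unfolded form is assumed to contribute zero signal at this wavelength.

```python
def helix_fraction(signal_observed, signal_full_helix):
    """Single-wavelength estimate: observed signal as a fraction of a 100% helical signal."""
    return signal_observed / signal_full_helix


def mixture_signal(populations):
    """Population-weighted average signal of a conformational mixture.

    populations: list of (population_fraction, signal) pairs; fractions must sum to 1.
    """
    total = sum(p for p, _ in populations)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("population fractions must sum to 1")
    return sum(p * s for p, s in populations)


# Hypothetical signal of a fully helical peptide at 222 nm
# (mean residue ellipticity, deg cm^2 dmol^-1). Illustrative value only.
FULL_HELIX_222 = -36000.0

# Case 1: every molecule is half helical all of the time.
uniform = mixture_signal([(1.0, 0.5 * FULL_HELIX_222)])

# Case 2: half the molecules are fully helical and half are unfolded
# (the unfolded form is assumed to have zero signal at 222 nm).
two_state = mixture_signal([(0.5, FULL_HELIX_222), (0.5, 0.0)])

print(helix_fraction(uniform, FULL_HELIX_222))
print(helix_fraction(two_state, FULL_HELIX_222))
```

Both cases give the same observed magnitude and therefore the same apparent helix content, so the single-wavelength value cannot distinguish between them.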
2. Information content of CD spectra The information that can be gleaned from a spectrum is dependent upon the number of spectral transitions that can be observed. For proteins, in the far UV region, the peptide backbone gives rise to electronic transitions between the n and π* and the π and π* orbitals; in addition, at lower wavelengths (in the vacuum UV (VUV) region, accessible only when using synchrotron radiation circular dichroism (SRCD); see the chapters in this book on SRCD by Miles and Wallace) charge transfer transitions between peptide backbones become evident. As a result, the broader the range of wavelengths measured in a spectrum, and hence the larger the number of transitions, the higher the information content [4, 5].

Using principal component analyses it is possible to determine the information content as a function of the wavelength range in a CD spectrum. Toumadje et al. [4] showed that only two eigenvectors are necessary to describe spectra that include data only down to 200 nm, and between three and four eigenvectors for data that extend to 190 nm (the high wavelength end of the data in each case being 240 nm or longer). This emphasises the potential for over-interpretation of CD data using some automated algorithms. If the secondary structure reportedly derived from a spectrum which only includes data down to 200 nm is divided into more than two structural types (e.g. helix, sheet, random), it has been divided into more components than the data would support (i.e. it is an over-interpretation). Indeed, for data to 200 nm, the only division should be how much structure is helical and how much is not helical. Similar analyses have suggested that if VUV data are included, as many as eight principal components can be present [6], and because the different secondary structural types are not linearly independent [7], even more than this number of structural types can be analysed for using SRCD data [4-6].
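The eigenvector counting described above can be written compactly. What follows is a generic sketch of principal component counting by singular value decomposition, given only for orientation; the symbols $C$, $k$ and $\sigma_{\mathrm{noise}}$ are notation introduced for this illustration, and the procedure used in reference [4] may differ in detail. Here $C$ is the matrix whose $n$ columns are the spectra of $n$ proteins sampled at $m$ wavelengths.

```latex
\begin{align}
C &= U \Sigma V^{T}, \qquad \Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \ldots), \qquad \sigma_1 \ge \sigma_2 \ge \cdots \ge 0 \\
k &= \#\{\, i : \sigma_i > \sigma_{\mathrm{noise}} \,\} \\
C &\approx \sum_{i=1}^{k} \sigma_i \, u_i v_i^{T}
\end{align}
```

In this notation, the findings quoted above correspond to $k = 2$ for data truncated at 200 nm and $k$ between 3 and 4 for data extending to 190 nm [4]. Truncating the low-wavelength rows of $C$ removes transitions from the data and so reduces the number of singular values that rise above the noise.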
3. Analysis Methods A number of different methods for analysing protein circular dichroism data have been developed over the past fifty years [8-22]. The earliest methods were based on simple least squares fitting to ‘representative reference spectra’ for the various types of secondary structure [8]. These methods were subject to problems that were addressed in later modifications. For example, it was soon realised that when calculations were made with no constraints, results could be generated where the sum of the secondary structure fractions was not close to 1.0, or where some structural fractions were calculated to have negative values. In both of these cases, the results make no chemical sense (despite making mathematical sense). They arise primarily because of either errors in input spectral magnitudes or because the reference dataset used was not appropriate for the protein analysed. As a result, ridge regression methods led to constraints being applied to the calculations to ensure that only solutions which summed to approximately 1.0 and which contained no negative fractions were allowed, but in doing so, the calculated secondary structural values were often distorted. An alternative to applying the constraint that the fractions had to sum to 1.0 was the normalised least squares method, which placed more emphasis on spectral shape than on magnitude, and hence was able to produce concentration-independent analyses [13]. This had a significant experimental advantage because it did not depend on the ability to determine precisely accurate protein concentrations (see the chapter on Good Practice by Kelly and Price in this book for a description of suitably accurate methods for protein concentration determinations). Unfortunately, that software was only available on a computer operating system/compiler that is outdated and no longer generally available.
All other methods, old and new, are highly reliant on an accurate measure of the concentration of the protein in solution as this is directly related to the magnitude of the spectrum. The least squares methods were also limited in accuracy because they were based on the assumption that data at all wavelengths provide equal information content. This is clearly not the case as data at wavelengths higher than 240 nm have virtually no information from the amide bond transitions and thus are much less important than the data below 240 nm; modern algorithms take this into account. Newer methods are based on principal component analysis procedures (including singular value decomposition (SVD) and principal component-factor analysis), iterative tuning methods (where a weight co-efficient is iteratively tuned to match each structure type, such as with the convex constraints method), and neural network methods (including direct neural network techniques and indirect methods where neural networks are applied to clustered sub-groups of proteins such as the case of the matrix descriptors method). In addition, two other strategies can be overlaid on these methods, self-consistency (where the experimental spectrum is included in the reference data and is updated by the solution each iteration until the solution no longer changes) and variable selection (where multiple solutions are found, each using all but one of the spectra in the reference dataset). Table 1 lists algorithms developed for protein secondary structure analyses, the methods they employ and where they can be obtained, either for downloading or for use via an online server.
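The behaviour of an unconstrained least squares fit can be sketched with a few lines of code. This is a minimal illustration and not the implementation of any algorithm listed in Table 1: the basis spectra are invented numbers rather than values from a reference dataset, and the helper names `solve` and `fit_fractions` are hypothetical.

```python
def solve(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_fractions(basis, spectrum):
    """Unconstrained least squares fit of a spectrum as a sum of basis spectra."""
    names = list(basis)
    # Normal equations: (B^T B) f = B^T y
    gram = [[sum(p * q for p, q in zip(basis[i], basis[j])) for j in names] for i in names]
    rhs = [sum(p * y for p, y in zip(basis[i], spectrum)) for i in names]
    return dict(zip(names, solve(gram, rhs)))


# Hypothetical basis spectra (delta epsilon) at 190, 200, 208, 215, 222 and 230 nm.
# The numbers are invented for illustration only.
BASIS = {
    "helix": [14.0, 3.0, -7.0, -6.5, -7.0, -3.5],
    "sheet": [3.0, 3.5, -0.5, -2.5, -2.0, -0.8],
    "other": [-1.0, -5.0, -2.0, -0.5, 0.3, 0.1],
}
TRUE = {"helix": 0.5, "sheet": 0.3, "other": 0.2}

# Synthetic "experimental" spectrum built from known fractions.
spectrum = [sum(TRUE[k] * BASIS[k][i] for k in BASIS) for i in range(6)]

exact = fit_fractions(BASIS, spectrum)
# The same spectrum with a magnitude 20% too small
# (for example, because the concentration was overestimated).
scaled = fit_fractions(BASIS, [0.8 * y for y in spectrum])

print({k: round(v, 3) for k, v in exact.items()}, round(sum(exact.values()), 3))
print({k: round(v, 3) for k, v in scaled.items()}, round(sum(scaled.values()), 3))
```

In this idealised, noise-free case the magnitude error shrinks every fraction by the same factor, so the fractions no longer sum to 1.0. Forcing the sum back to 1.0 within the fit then has to change individual fractions, which is one way to see why constrained solutions can be distorted when the input magnitude is wrong.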
Table 1. CD analysis methods

| Algorithm | Method | Reference | Downloadable/Obtainable from | Download operating system | Online servers |
|---|---|---|---|---|---|
| Individual algorithms: | | | | | |
| CDNN | NN | [15] | [email protected]* | | |
| CDSSTR | SVD & VS | [11] | http://lamar.colostate.edu/~sreeram/CDPro/main.html | Windows and Linux | http://dichroweb.cryst.bbk.ac.uk |
| Contin (CD Version) | RR | | http://s-provencher.com/pages/contin-cd.shtml | | |
| ContinLL | RR | [9] | http://lamar.colostate.edu/~sreeram/CDPro/main.html | Windows and Linux | http://dichroweb.cryst.bbk.ac.uk |
| K2d | NN | [17] | ftp://ftp.bork.embl.de/pub/software/Andrade | Windows | http://www.emblheidelberg.de/~andrade/k2d/ and http://dichroweb.cryst.bbk.ac.uk |
| K2D2 | NN | [22] | | | http://www.ogic.ca/projects/k2d2/ |
| Lincomb | LS | [14] | http://www2.umdnj.edu/cdrwjweb/ | Windows | |
| Selcon3 | SVD & VS | [16] | http://lamar.colostate.edu/~sreeram/CDPro/main.html | Windows and Linux | http://dichroweb.cryst.bbk.ac.uk |
| Selmat3 | SVD & VS | [21] | http://www.qmul.ac.uk/~ugbt760/janes/algorithms.htm | Windows and Linux | |
| SOMCD | NN | [20] | | | http://geneura.ugr.es/cgi-bin/somcd/som.cgi?start=1 |
| Super3 | LS | [13] | [email protected] | VAX/VMS | |
| VARSLC | SVD & VS | [12] | | Windows and Linux | http://dichroweb.cryst.bbk.ac.uk |
| Compilations of algorithms: | | | | | |
| Dichroprot | Several | [37] | http://dicroprot-pbil.ibcp.fr/ | Windows (32 bit version only) | |
| DichroWeb | Several | [29-31] | | | http://dichroweb.cryst.bbk.ac.uk |

Abbreviations: LS, least squares; NN, neural network; PCA, principal component analysis; RR, ridge regression; SVD, singular value decomposition; VS, variable selection.
* This software may no longer be freely available.
4. Reference Datasets The quality of any empirical analysis method is highly dependent upon the reference datasets available for the calculations or used in training the neural network. Until recently, the only standard reference datasets publicly available were those produced by a number of different authors [8, 18, 23, 24] – some more than 20 years ago – and compiled in the CDPro package [19]. They contained different numbers of proteins (ranging from 17 to 48 proteins) and were based on crystal structures refined a number of years ago, some of which had less than ideal geometries [25]. They are discussed in detail in the chapter by Janes on Reference Datasets. A major practical consideration in the choice of reference dataset for a given spectral analysis is that it contains proteins with similar structural characteristics to the protein to be analysed. If the protein to be analysed has different structural characteristics from all the reference proteins, it is unlikely to be correctly analysed. For example, if the protein to be analysed contains some polyproline II secondary structure, but the reference dataset does not, the analysis will not be correct. However, if the reference dataset contains a protein with polyproline II structure (for example: collagen), then the analysis will likely succeed. If the structure of the protein is unknown (as it is in most cases, since that is generally why the analysis is being conducted) then the most prudent course of action is that all of the available data sets be tested; and the one that gives the best fit and a sum that is close to 1.0 is likely to be the most correct. However, it must be kept in mind that a good fit is a necessary but not sufficient condition for correctness (see section 5.1.1 below). 
Comparisons of the results obtained with the different reference datasets and all the available algorithms suggest that the choice of reference data has a greater impact on the quality of the analysis than the choice of methodology. This means that the empirical methods will produce better results if the reference datasets maximally cover secondary structure and fold space (i.e. have the most possible types of different folds and examples of all types of secondary structures, even ones found only occasionally). As a result, a new bioinformatics-designed reference dataset (SP175) has recently been produced [6] which contains a larger number of proteins (71) but, more importantly, effectively covers fold and secondary structure space, includes the largest wavelength range possible (down to 175 nm), and uses highly refined, good quality crystal structures. That dataset is highly effective for analysing most proteins, and generally produces more accurate results and better fits than the older, more limited reference datasets.

In addition, new, more specialised datasets have been developed for specific protein types. For instance, the β,γ crystallin proteins have a mostly beta-sheet structure folded as a double Greek key with uniquely strained torsional angles that produce spectra with unusual features. As a result, these proteins are not adequately analysed by the standard reference datasets, so a specialised crystallins dataset has been produced [26] for the specific analysis of this type of protein. Very recently, a membrane protein reference dataset (MSP175) has been produced for the analysis of membrane proteins, whose spectral peaks are shifted relative to those of soluble proteins [27] owing to the low dielectric constant of the hydrophobic membrane milieu; this reference dataset produces superior analyses for membrane proteins, and especially for membrane proteins with beta barrel folds.
A further consideration in selecting a reference dataset is the choice of secondary structure component types into which the reference data have been divided. These have been defined for each reference dataset. For instance, the most common division of secondary structure types amongst reference data is: alpha-helix (regular αR and distorted αD), beta-sheet (regular βR and distorted βD), turns and “unclassified” or “other” structures. The helices are as defined by the DSSP algorithm [28], with the last two residues at each end of an α- or 3₁₀-helix being considered the αD fraction, since those residues tend to have slightly deviant phi, psi angles. The αD fraction also includes cases where the helix would be less than four residues in length. Any remaining α-helix or 3₁₀-helix is included in the αR fraction. One residue at each end of a β-sheet is defined as being in the βD fraction, as well as any case where the β-sheet would be less than two residues in length. Any remaining β-sheet residues are defined as the βR fraction. Turns are defined as two or more consecutive ‘T’ or ‘S’ assignments from DSSP. Any remaining residues not fitting in any of the above classes are designated as ‘other’. Alternative classifications [21] do not divide the helix and sheet into regular and distorted, and so may be easier to reconcile with crystallographic definitions. In addition, some datasets include polyproline II, different turns or 3₁₀ helices as separate classes. The latter are, however, often not well characterised in the analyses as there are few examples of these types of secondary structures in the proteins in the reference datasets. Nevertheless, if there is reason to believe that the protein being analysed is likely to have either 3₁₀ or polyproline-like structures, then use of datasets divided in this way may be more suitable.
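The six-class division described above can be sketched as a small routine operating on a string of per-residue DSSP codes. This is one reading of the rules in the text rather than the code used to build any reference dataset. It assumes that ‘H’ and ‘G’ are the DSSP codes for α- and 3₁₀-helix and ‘E’ the code for β-strand, that adjacent ‘H’ and ‘G’ residues form one helical segment, that four residues per helical segment (two at each end) and two per strand (one at each end) are counted as distorted, and that all other codes, including isolated ‘T’ or ‘S’ residues, fall into ‘other’. The function name and output keys are hypothetical.

```python
from itertools import groupby

HELIX = set("HG")  # DSSP alpha-helix and 3-10 helix
SHEET = set("E")   # DSSP extended strand
TURN = set("TS")   # DSSP turn and bend


def _kind(code):
    """Map a single DSSP code onto a coarse structural kind."""
    if code in HELIX:
        return "helix"
    if code in SHEET:
        return "sheet"
    if code in TURN:
        return "turn"
    return "other"


def classify(dssp):
    """Return the fractions of aR, aD, bR, bD, turn and other for a DSSP string."""
    if not dssp:
        raise ValueError("empty DSSP string")
    counts = dict.fromkeys(["aR", "aD", "bR", "bD", "turn", "other"], 0)
    for kind, run in groupby(dssp, key=_kind):
        n = len(list(run))
        if kind == "helix":
            distorted = min(n, 4)  # two residues at each end; short helices are all distorted
            counts["aD"] += distorted
            counts["aR"] += n - distorted
        elif kind == "sheet":
            distorted = min(n, 2)  # one residue at each end; single-residue strands are distorted
            counts["bD"] += distorted
            counts["bR"] += n - distorted
        elif kind == "turn" and n >= 2:
            counts["turn"] += n    # two or more consecutive T or S assignments
        else:
            counts["other"] += n   # everything else, including isolated T or S
    return {name: value / len(dssp) for name, value in counts.items()}


# Invented 20-residue example: a 10-residue helix, a turn, a 5-residue strand,
# an isolated bend and two unassigned residues.
example = "HHHHHHHHHH" + "TT" + "EEEEE" + "S" + "--"
fractions = classify(example)
print(fractions)
```

Because the distorted classes are defined per segment, two proteins with the same total helix content can have quite different αR and αD fractions if one has a few long helices and the other many short ones.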
5. Analyses using the DichroWeb Online Server In order to provide a convenient and easily accessible interface to a wide range of different deconvolution algorithms and reference datasets, the DichroWeb [29-31] web server was developed for online secondary structure analyses. The algorithms available at the DichroWeb site (http://dichroweb.cryst.bbk.ac.uk) are listed in Table 2. Since its release in 2001 the site has provided nearly 200,000 analyses for approximately 2000 scientists around the globe. Access is via an ID and password that are free to users at academic institutions and research institutes for non-commercial applications. The advantage to the user of an online service is that it removes the need to obtain, compile, install and update software on their own computers, where there can be substantial issues of operating system compatibility. By comparison, DichroWeb can be accessed with any reasonably modern web browser and on the vast majority of computer operating systems. The availability of multiple methods and reference datasets and a consistent goodness-of-fit parameter for all analyses, accessible at DichroWeb through a single interface, makes it easy for the user to compare results from different calculations on the same data. In addition, because the website accepts a large number of different input formats and units (see Table 2), there is no need to reformat the data before submitting it to the server. By contrast, most of the algorithms available for download require input in their own format, in specific units, in a specific order, and using specific intervals for data collection. This has all been standardised in DichroWeb. The various DichroWeb input and output formats and variable analysis parameters are listed in Table 2. The use of DichroWeb and interpretations of the resulting analyses are discussed in the next sections of this chapter. Appendix A contains a list of questions commonly asked by users and their answers.
Table 2. Input and output variables available in the DichroWeb calculation server

| Variable | Options |
|---|---|
| Input | |
| File format | Selection from the following generic and proprietary formats: Applied Photophysics (ascii only); Aviv (several types, ascii only); BP (two- and four-column data); DRS (SRS Synchrotron, UK); Free Format (two- or four-column data, no header); JASCO (several types, ascii only); SDS (NSLS Synchrotron, USA); YY (single continuous text file, CD values only) |
| Units | Selection from/conversion to: delta epsilon; mean residue ellipticity; millidegrees; DRS units (counts) |
| Data | Wavelength range; wavelength interval; wavelength order; user-defined wavelength cutoff limit |
| Analysis program | Selection from 5 deconvolution algorithms: CDSSTR; ContinLL; K2d; Selcon3; VARSLC |
| Reference datasets | Selection from 10 reference sets (sets 1-7 from reference [19]): Set 1: 29 proteins; Set 2: 22 proteins; Set 3: 37 proteins; Set 4: 43 proteins; Set 5: 17 proteins; Set 6: 42 proteins; Set 7: 48 proteins; SP175 (full range): 71 proteins from reference [6]; SP175 (truncated to 190 nm): 71 proteins from reference [6]; CRYST175: 9 proteins from reference [26] |
| Scaling factor | Numeric multiplier between 0.5 and 1.5, applied to correct for magnitude errors |
| Output | |
| Results table | Secondary structure assignments; NRMSD values; R values |
| Plots | Graph containing the experimental and calculated spectra and the difference spectrum at each data point |
| Files | Experimental and calculated spectra in downloadable format |
5.1 Assessing the Quality of the Results

5.1.1 The Goodness-of-Fit Parameter A useful way of assessing the quality of an analysis is to determine the correspondence between the experimental data and the back-calculated spectrum arising from the “best” calculated secondary structure for a given analysis. A commonly used parameter for this is the normalised root mean square deviation (NRMSD) [32], defined as:

$$\mathrm{NRMSD} = \left[ \frac{\sum (\theta_{exp} - \theta_{cal})^{2}}{\sum (\theta_{exp})^{2}} \right]^{1/2}$$

where the sums run over all wavelengths, and θexp and θcal are, respectively, the experimental ellipticities and the ellipticities of the back-calculated spectrum for the derived structure. This is more or less equivalent to the “R-factor” in a crystallographic analysis. In general it has been found that when the NRMSD is less than 0.10, the fit is good and the analysis is successful. If the value is between 0.10 and 0.20, the calculated secondary structure is expected to be generally similar to that found in the protein (i.e. mostly helical or mostly sheet), but if the value is greater than 0.20, in general the analysis can be considered to have failed and the secondary structure is unlikely to be correct. That is often because the reference dataset used was inappropriate, so a different reference dataset may be more suitable. However, it is essential to note that because the problem is not uniquely defined, a low NRMSD value is a necessary but not a sufficient condition for an accurate analysis: a result with a low NRMSD could still be incorrect, although this is somewhat unlikely.

5.1.2 Comparison of Plots DichroWeb produces not only the NRMSD value for the solution, but also overlaid plots of the experimental and calculated spectra and the difference spectrum. Visual inspection of the plots can be very informative, and can suggest systematic errors contributing to poor analyses.
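The NRMSD of section 5.1.1 and the difference spectrum of section 5.1.2 are simple to compute once the experimental and back-calculated spectra are available on the same wavelength grid. The following is a minimal sketch, not DichroWeb's own code; the function names and the example values are hypothetical, and the assignment of the exact boundary values 0.10 and 0.20 to the higher band is a choice made here because the text does not specify it.

```python
import math


def nrmsd(theta_exp, theta_cal):
    """Normalised root mean square deviation between experimental and calculated spectra."""
    if len(theta_exp) != len(theta_cal) or not theta_exp:
        raise ValueError("spectra must be non-empty and of equal length")
    numerator = sum((e - c) ** 2 for e, c in zip(theta_exp, theta_cal))
    denominator = sum(e ** 2 for e in theta_exp)
    return math.sqrt(numerator / denominator)


def difference_spectrum(theta_exp, theta_cal):
    """Point-by-point difference (experimental minus calculated)."""
    return [e - c for e, c in zip(theta_exp, theta_cal)]


def interpret(value):
    """Apply the rule-of-thumb thresholds quoted in the text."""
    if value < 0.10:
        return "good fit"
    if value < 0.20:
        return "broadly similar secondary structure expected"
    return "analysis likely failed"


# Invented example values on a common wavelength grid.
experimental = [10.0, 2.0, -6.0, -5.0, -6.0, -2.0]
calculated = [9.5, 2.4, -5.8, -5.3, -5.7, -2.1]

value = nrmsd(experimental, calculated)
print(round(value, 4), interpret(value))
print(difference_spectrum(experimental, calculated))
```

A single number such as the NRMSD cannot show where the misfit occurs, which is why the difference spectrum is worth inspecting alongside it.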
For example, if the experimental and calculated plots are offset by one or a few nm (most clearly seen in the difference plots, which show adjacent positive and negative peaks), this would suggest that there could be a wavelength calibration problem with the instrument on which the data were collected, a data processing error, or that the protein has different spectral characteristics from those in the reference dataset (e.g. a membrane protein analysed with a soluble protein reference dataset). If the shapes of the curves are completely different, e.g. the spectrum looks helical but the analysis is primarily beta sheet and the calculated spectrum has the shape of a mostly beta protein, this would suggest that the magnitude of the input spectrum is in error. This sort of diagnosis is not possible from the goodness-of-fit parameter alone, but examination of both the NRMSD parameter and the plots together can lead to improved analyses.

5.2 The Effects of Spectral Magnitude Variation Analyses of CD data are especially sensitive to errors in the protein concentration and sample cell pathlength parameters used to produce the experimental spectrum, which result in errors in spectral magnitude. Systematic variations in the magnitudes of spectra, made by varying the scale factor applied [33], have shown that the magnitude is the single most important parameter for accurate analyses. Because the spectra of helical structures are much larger than those of beta sheet structures, a very obvious consequence of having the spectral magnitudes wrong is that the analyses tend to overemphasise the helix content if the spectrum is too large and overcalculate the sheet content if the spectrum is too small. Experimental conditions leading to these errors, and the importance of very accurate measurements of protein concentration and pathlength, are discussed in the two chapters on Good Practice by Kelly and Price and by Miles and Wallace. To test if this is the reason for a poor analysis, DichroWeb allows the user to apply a “scale factor” to their data to examine the effects of changing the spectral magnitude. Whilst it is highly important to use accurate magnitude values in the first place, and scaling should not be considered an adequate substitute for proper measurements, it has been noted that in most cases tested [33], the NRMSD calculated for an analysis is at its minimum when the magnitude scaling is correct. Thus scale factor variation can help in improving the analyses obtained.

5.3 The Effects of Spectral Wavelength Coverage As noted in section 2 of this chapter, the information content of a spectrum is related to the lower wavelength limit of the data included in the analyses. In creating DichroWeb, it was decided to prevent users from making one of the most fundamental errors: overinterpretation (under parameterisation) of their data. Although the algorithms will mathematically produce solutions for as many as seven different secondary structural types if data are only available to 200 nm, as discussed above, such a spectrum actually only includes sufficient data to define two unique types of secondary structure. Hence, for all the algorithms (except the neural network K2d), the user must provide data down to 190 nm in order for results to be provided.
(K2d will produce results for data only down to 200 nm, but in our experience those analyses are unlikely to be at all accurate or useful). The lower the wavelength to which data are available, the more choices the user has of reference datasets. In addition, it has been shown [6] that the inclusion of very low wavelength (VUV) data greatly improves the accuracy of the analyses, especially for alpha/beta mixed proteins or beta-sheet rich proteins. This is because in the far UV region, the spectra arising from helical secondary structures are of much greater magnitude than those arising from sheet structures, hence the helical contributions tend to “swamp” the sheet contributions. However, in the VUV region [3] helical and sheet structures give rise to spectra of different signs, and hence their relative contributions can be more accurately determined. As a result, the most accurate analyses will in general arise if the lowest wavelength coverage possible, for instance SRCD data, is included in the analysis.

5.4 The Effects of Experimental Data Quality A vital consideration for obtaining a quality analysis is obtaining quality experimental data. The main issues in terms of analysis are whether the peak magnitudes and positions have been recorded correctly, whether the sample and baseline spectra “match” in the wavelength region where there is no protein signal (around 263 to 270 nm), and the signal-to-noise levels of the data. The issues of calibration of peak magnitudes and positions, and the effects of such errors on calculated structures, are discussed in the Calibration chapter by Miles and Wallace.
The issue of whether the sample and baseline match is a critical one for analyses, because if they do not, the net spectrum will be offset in the Y direction with respect to the spectra in the reference datasets. A mismatch could occur if the solution used to collect the baseline does not contain all the non-protein components present in the sample, or if the sample cells have not been placed in exactly the same orientation with respect to the beam when measuring the sample and baseline. The best way to correct an offset baseline is to recollect the sample and baseline spectra, but if this is not possible, a correction can be made by “zeroing” the net spectrum over the appropriate wavelength region by addition of an offset constant.

With respect to signal-to-noise levels, the deconvolution methods can cope with small amounts of noise in a spectrum, provided that the noise does not alter the wavelengths at which the peaks occur or their heights. As a useful rule of thumb, good data should have maximum noise levels less than 2% of the peak maxima. The signal-to-noise level can be improved by averaging a number of separately collected spectra (this is better practice than allowing the instrument to average the spectra, as it allows the user to calculate the standard deviations, and hence the error or noise levels, in the measurements).

Finally, it is important to ascertain that the spectra have been collected over an absorbance range where the photomultiplier is capable of accurate measurements. This can be ascertained from the high tension (HT), dynode or high voltage spectrum that can be collected simultaneously with the CD spectrum. When the HT value exceeds the maximum defined for that instrument, any CD reading that is obtained will be incorrect. This issue is discussed in more detail in the Good Practice chapter by Kelly and Price and in the chapter on Sample Preparation and Good Practice in SRCD by Miles and Wallace.
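The three checks discussed above (averaging separate scans, zeroing the net spectrum, and locating the wavelength below which the HT is too high) can be sketched as follows. This is an illustration under stated assumptions rather than a prescribed procedure: the function names, the example data and the HT limit of 600 are hypothetical, the zeroing window is taken from the 263 to 270 nm region mentioned in the text, and the 2% rule is interpreted here as the largest per-point standard deviation compared with the largest peak magnitude.

```python
def average_scans(scans):
    """Average separately collected scans; return (means, sample standard deviations)."""
    n = len(scans)
    if n < 2:
        raise ValueError("need at least two scans to estimate the noise")
    means, sds = [], []
    for column in zip(*scans):
        mean = sum(column) / n
        variance = sum((v - mean) ** 2 for v in column) / (n - 1)
        means.append(mean)
        sds.append(variance ** 0.5)
    return means, sds


def zero_baseline(wavelengths, signal, window=(263.0, 270.0)):
    """Subtract the mean signal in a region where no protein signal is expected."""
    lo, hi = window
    reference = [s for w, s in zip(wavelengths, signal) if lo <= w <= hi]
    if not reference:
        raise ValueError("no data points inside the zeroing window")
    offset = sum(reference) / len(reference)
    return [s - offset for s in signal]


def low_wavelength_cutoff(wavelengths, ht, ht_max):
    """Lowest wavelength reached from the long-wavelength end before HT exceeds ht_max."""
    cutoff = None
    for w, h in sorted(zip(wavelengths, ht), reverse=True):
        if h > ht_max:
            break
        cutoff = w
    return cutoff


def noise_ok(means, sds, limit=0.02):
    """Rule of thumb: largest noise estimate below `limit` times the largest peak magnitude."""
    return max(sds) < limit * max(abs(m) for m in means)


# Invented example: three scans (millidegrees) and one HT trace.
wl = [270.0, 265.0, 240.0, 222.0, 208.0, 190.0, 185.0]
scans = [
    [0.30, 0.30, -0.70, -9.70, -9.90, 20.30, 12.30],
    [0.20, 0.20, -0.80, -9.80, -9.70, 20.10, 11.90],
    [0.10, 0.10, -0.60, -9.90, -9.80, 20.20, 14.80],
]
ht = [300.0, 305.0, 330.0, 360.0, 400.0, 520.0, 640.0]
HT_MAX = 600.0  # hypothetical instrument limit

means, sds = average_scans(scans)
zeroed = zero_baseline(wl, means)
cutoff = low_wavelength_cutoff(wl, ht, HT_MAX)

keep = [i for i, w in enumerate(wl) if w >= cutoff]
full_ok = noise_ok(zeroed, sds)
trimmed_ok = noise_ok([zeroed[i] for i in keep], [sds[i] for i in keep])
print(cutoff, full_ok, trimmed_ok)
```

In this invented example the point at 185 nm, where the HT exceeds the assumed limit, is also the one with the largest scatter between scans, so excluding data below the cutoff brings the remaining spectrum within the 2% rule of thumb.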
If problematic data are obtained at low wavelengths, this can be ameliorated by using the input parameter in DichroWeb that enables the user to define a wavelength in the dataset below which data should not be included in the analyses.

5.5 Comparisons of Results Obtained with Different Algorithms As noted above, the accuracy of analyses seems to be more dependent on the use of an appropriate reference dataset than on the use of a particular algorithm. However, the DichroWeb server readily enables serial analyses with different algorithms and the same dataset, so comparisons can be made. In our experience, although the CDSSTR algorithm tends to produce the lowest NRMSD values (probably owing to the larger number of parameters fitted), this does not correlate with producing the most correct solution. Although all methods tend to produce similar results if the same reference dataset is used, in our hands the ContinLL method produces the most accurate results (when compared against the crystal structures), as long as the concentration and pathlength values used in the calculation are highly accurate.

5.6 The Effects of Inappropriate Samples A major cause of inaccurate analyses is protein purity, or lack thereof. For instance, if a protein were only 80% pure (what biochemists often consider “pure”, i.e. no other significant band visible on a gel), then if the calculated helical content is 50%, the actual helical content of the protein of interest could be anywhere between 38% and 63%, a result that is effectively meaningless. In addition, impurities in the sample can affect the calculated concentration of the target protein and could introduce extra chromophores to the spectrum, also distorting the result.
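The purity example above follows from a simple mass balance: the observed helix content is the purity-weighted average of the helix content of the target protein and that of the impurities, and the two extremes are impurities that are entirely helical or entirely non-helical. The sketch below works through that arithmetic; the function name is hypothetical and the calculation assumes that purity is expressed as the fraction of residues belonging to the target protein.

```python
def helix_bounds(observed_helix, purity):
    """Range of target-protein helix content consistent with an observed value.

    observed = purity * target + (1 - purity) * impurity, with impurity helix
    content anywhere between 0 and 1.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in the interval (0, 1]")
    impurity = 1.0 - purity
    lowest = (observed_helix - impurity * 1.0) / purity   # impurities entirely helical
    highest = (observed_helix - impurity * 0.0) / purity  # impurities contain no helix
    # Fractions cannot fall outside the physically meaningful range.
    return max(0.0, lowest), min(1.0, highest)


low, high = helix_bounds(observed_helix=0.50, purity=0.80)
print(f"{low:.3f} to {high:.3f}")
```

For an observed helix content of 50% at 80% purity this gives 0.375 to 0.625, which corresponds to the range of approximately 38% to 63% quoted in the text.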
L. Whitmore and B.A. Wallace / Methods of Analysis for Circular Dichroism Spectroscopy
Another cause of inaccuracy is the application of a method using, or trained on, reference data from a particular class of molecules to analyse a compound from another class. For instance, there is as little point in applying a neural network that has been trained on globular protein spectra to an experimental spectrum from a membrane protein as there would be in analysing the spectrum of a nucleic acid with this methodology. Although the differences between the spectra of nucleic acids and proteins are very obvious, it is less clear but similarly important that the CD spectra of globular proteins, membrane proteins and peptides are also subtly distinct.
The analysis of peptides using reference datasets derived from globular proteins can also be highly problematic, although the reasons are less obvious. Firstly, peptides tend to be small molecules in which all or most of the backbone peptide bonds are exposed to solvent. In globular proteins, by contrast, a large number of the peptide bonds are buried and thus may experience an environment of different dielectric constant that may influence the spectral properties. Secondly, peptides tend to have less persistent and less well-defined structures. Unless held together by disulphide bonds, as in some toxin peptides, they tend to be flexible and dynamic in solution, with their CD spectrum representing an equilibrium mixture of folded and unfolded species. Hence the spectrum, a sum of all species present, will have contributions from the folded structures such as helices, plus contributions from the “unfolded” or “other” structures, which have magnitudes near zero at most wavelengths. Only the spectral average is detected, which will have a magnitude much lower than the equivalent spectrum of a fully folded peptide, but it will not be possible to distinguish whether the peptide has 50% helical structure all the time or whether, for instance, it is a mixture of conformations with 50% of the population being an all-helical peptide and 50% being completely unfolded peptide. The same issue could be raised for functional equilibrium states of globular proteins, but for the most part the equilibrium differences in protein conformations are very small. Finally, it should be remembered that the reference and training datasets were derived for globular proteins in aqueous solution; whilst there might be some temptation to apply them to the analysis of peptides and proteins in non-aqueous solution, they are unlikely to be successful because the spectral characteristics of the peptide bonds in these different environments are likely to be significantly different [34, 35], rendering the reference data totally inappropriate for such analyses.
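The ambiguity between a peptide that is always 50% helical and a 50:50 mixture of fully helical and fully unfolded molecules can be illustrated with a population-weighted average. The sketch below assumes that the observed spectrum is a linear sum over species and, following the text, treats the unfolded contribution as zero; the "helix" values are invented placeholders in arbitrary units, not measured data.

```python
def ensemble_spectrum(populations, spectra):
    """Population-weighted average of per-species spectra."""
    if abs(sum(populations) - 1.0) > 1e-9:
        raise ValueError("populations must sum to 1")
    n_points = len(spectra[0])
    return [
        sum(p * s[i] for p, s in zip(populations, spectra))
        for i in range(n_points)
    ]


# Placeholder spectra at three wavelengths (arbitrary units, illustrative only).
all_helix = [60.0, -30.0, -32.0]
unfolded = [0.0, 0.0, 0.0]  # "near zero at most wavelengths", taken as zero here

# Case 1: every molecule is half helical, half "other", all of the time.
half_helical_molecule = [0.5 * h + 0.5 * u for h, u in zip(all_helix, unfolded)]
case_1 = ensemble_spectrum([1.0], [half_helical_molecule])

# Case 2: half of the molecules are fully helical, half are fully unfolded.
case_2 = ensemble_spectrum([0.5, 0.5], [all_helix, unfolded])

print("spectra identical:", case_1 == case_2)
```

Under these assumptions the two cases give the same averaged spectrum, so a deconvolution of that spectrum cannot tell them apart.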
6. Summary
There are a number of methods and reference datasets that can be useful for the analysis of the secondary structures of proteins from their CD spectra, most of which are included in the user-friendly online DichroWeb analysis server. However, much care needs to be taken in the analyses in order to obtain accurate and useful results.
Acknowledgments DichroWeb was created, and is upgraded and maintained, with the support of grants from the U.K. Biotechnology and Biological Sciences Research Council.
Appendix A: Common errors and problems encountered in analyses and how to avoid them
• The calculated spectrum has a very large/small magnitude relative to the experimental spectrum.
There is a possible error in the units, pathlength or concentration data, or in the calibration of the instrument.
• The analysis solution has a poorly fitting calculated spectrum.
The most common reason is that few of the proteins in the reference dataset resemble the protein producing the experimental spectrum. The first thing to do is to try alternative reference datasets, to see if any produce a more closely fitting calculated spectrum. The next thing is to check whether there is an error in the protein concentration. Figure 2 shows an example of this: Figure 2a is a spectrum of myoglobin (high helix content) analysed using the correct concentration, and Figure 2b is a spectrum of myoglobin produced using five times the correct protein concentration. Due to the low magnitude of the latter, the spectrum appears to contain characteristics of sheet structure rather than helix and gives a poor analysis.
Figure 2a. Myoglobin (PDB code 1ymb, 73% helical as defined by a DSSP calculation from the crystal structure) spectrum analysed with the SP175 reference dataset and ContinLL, accessed through DichroWeb. This is the spectrum produced using the correct concentration (scale factor of 1.0), and results in a calculated structure composed of 72% total helix, 1% total sheet, 10% turns and 17% other, with an NRMSD value of 0.030.
Figure 2b. Myoglobin spectrum produced using five times the actual concentration (scale factor of 0.2), analysed as above with the SP175 reference dataset and ContinLL, accessed through DichroWeb. The analysis returns 21% total helix, 27% total sheet, 15% turns and 38% other, with an NRMSD value of 0.132.
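The effect shown in Figures 2a and 2b arises because the scaled spectral magnitude is inversely proportional to the concentration entered. A minimal sketch is given below, assuming the commonly used conversion of ellipticity in millidegrees to mean residue ellipticity, [θ] = θ × MRW / (10 × l × c), with l in cm and c in mg/mL, and Δε = [θ] / 3298. The numerical inputs (signal, mean residue weight, pathlength) are hypothetical values chosen only to show the scaling.

```python
def mean_residue_ellipticity(theta_mdeg, mean_residue_weight, pathlength_cm, conc_mg_per_ml):
    """Mean residue ellipticity in deg cm^2 dmol^-1 from a reading in mdeg."""
    return theta_mdeg * mean_residue_weight / (10.0 * pathlength_cm * conc_mg_per_ml)


def delta_epsilon(mre):
    """Convert mean residue ellipticity to delta epsilon (M^-1 cm^-1)."""
    return mre / 3298.0


# Hypothetical measurement: -20 mdeg, MRW 110 Da, 0.01 cm cell.
theta, mrw, path = -20.0, 110.0, 0.01

correct = mean_residue_ellipticity(theta, mrw, path, conc_mg_per_ml=1.0)
wrong = mean_residue_ellipticity(theta, mrw, path, conc_mg_per_ml=5.0)

# Entering five times the true concentration scales the spectrum by 0.2.
print("scale factor:", wrong / correct)
print("delta epsilon (correct, wrong):", delta_epsilon(correct), delta_epsilon(wrong))
```

Because every wavelength is scaled by the same factor, the peak positions are unchanged but the magnitudes no longer match those of helical proteins in the reference dataset.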
• Different analysis methods provide different structure estimations.
Comparisons of different methods can provide an indication of whether the analyses are reliable and the reference datasets appropriate. Almost invariably the helical values produced with different algorithms and datasets will be very similar, but the sheet values may differ considerably. It is often helpful to produce analyses using all of the methods, and then calculate the average values and standard deviations for each type of secondary structure as an indication of the reliability of the analyses.
• The analysis solution has a high NRMSD value.
Re-try the analysis with another deconvolution method and/or select a more appropriate reference dataset. For noisy spectra, collect more scans for averaging if possible and/or consider applying smoothing. Alternatively, check the zero point: the values between ~267 nm and 273 nm should be zero; if they are not, shift the spectrum in the Y direction (the CDtool program from the Wallace lab [36] can help you do this) and reanalyse. Note that a low NRMSD value is a necessary but not sufficient condition for a correct analysis, and the algorithm with the lowest NRMSD is not always the most correct. This consideration is highlighted in Figure 3, which shows that the CDSSTR algorithm produces a solution with a lower NRMSD value for the myoglobin spectrum featured above, but with a slightly less accurate analysis than ContinLL, which is probably the most accurate method when compared to the crystal structure data.
Figure 3. Myoglobin spectrum (as above) analysed with the SP175 reference dataset and CDSSTR, accessed through DichroWeb. The analysis returned 76% total helix, 2% total sheet, 8% turns and 14% other, with an NRMSD value of 0.006.
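Two of the checks discussed in this appendix, the goodness-of-fit value and the spread between methods, can be sketched as follows. The NRMSD function assumes the usual definition, the square root of Σ(θexp − θcalc)² / Σ(θexp)²; the per-method percentages are those reported in Figures 2a and 3. The function and variable names are illustrative, and with only two methods the standard deviation is no more than a rough indication.

```python
from math import sqrt
from statistics import mean, stdev


def nrmsd(experimental, calculated):
    """Normalised root mean square deviation between two spectra."""
    numerator = sum((e - c) ** 2 for e, c in zip(experimental, calculated))
    denominator = sum(e ** 2 for e in experimental)
    return sqrt(numerator / denominator)


# Secondary structure percentages for myoglobin with SP175 (Figures 2a and 3).
results = {
    "ContinLL": {"helix": 72, "sheet": 1, "turns": 10, "other": 17},
    "CDSSTR": {"helix": 76, "sheet": 2, "turns": 8, "other": 14},
}

# Mean and sample standard deviation of each structure type across methods.
summary = {}
for structure in ("helix", "sheet", "turns", "other"):
    values = [method[structure] for method in results.values()]
    summary[structure] = (mean(values), stdev(values))

for structure, (avg, sd) in summary.items():
    print(f"{structure}: {avg:.1f} +/- {sd:.1f} %")
```

A small spread for helix alongside a larger relative spread for sheet would be consistent with the behaviour described in the text.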
• The analysis results contain secondary structure components I know not to be present.
The algorithms will attempt to assign structure components to each type of secondary structure that is described in their reference dataset. A solution could be to select a more appropriate reference dataset to avoid obtaining values for irrelevant structure components. Alternatively, consider whether the result makes physical sense: for example, if a 200 residue protein is analysed and the result suggests that 1% of the structure is alpha helical, then we have a nonsense solution, as 1% equals two amino acids and making a single turn of helix requires four amino acids.
• The analysis provides no solutions.
There are possibly errors in the input files or parameters, so you should check each value and make sure that the initial and final wavelength values of the spectra are input in the correct order (either high to low or vice versa, depending on your files) and that the input wavelength interval corresponds to that present in the data file. Check with the “preview” function that the spectrum looks correct.
• The data are very noisy and the calculated solution appears to be wrong.
Noisy data will produce results, but the noise will contribute to the apparent characteristics of the spectra, and the results will not be correct (Figure 4). The data should be re-collected using a higher protein concentration or longer pathlength (if the HT permits), or multiple scans should be collected and averaged.
Figure 4. Spectrum of myoglobin with a low signal-to-noise level. The analysis suggests the helix content is 58% instead of the correct value of 73%.
• A protein known to be highly helical analyses with significant sheet content.
Check the spectrum for a truncated signal in the lower wavelengths (below 200 nm). If the HT value was too high during data collection, the large peak around 190 nm characteristic of a helix can be mis-recorded as one with a lower magnitude, more characteristic of sheet-like structures. The HT spectrum should be recorded simultaneously with the CD spectrum and checked before analyses are run (see the chapter in this book on Good Practice by Miles and Wallace). An alternative explanation is that the magnitude of the whole spectrum is in error (as described above).
• A really low NRMSD has been obtained but the result is clearly wrong, by comparison with data from other sources.
The mathematics of the analysis packages will attempt to provide a solution according to the data they have and the reference/training data. It may be the case that the methods can sometimes fit noisy spectra very well, giving low NRMSD values but nonsensical solutions. Visual inspection of the experimental spectrum should be undertaken in all cases and high signal-to-noise data should be used whenever possible.
• I only have data down to 200 nm but the website requires data to 190 nm. What should I do?
Recollect the data under different conditions which enable you to capture the low wavelength data. This may require changing protein concentration, pathlength, or buffer/additive conditions, or a combination of these parameters.
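Several of the checks listed in this appendix can be run on a data file before it is submitted for analysis. The sketch below is one possible arrangement, not a description of how DichroWeb validates its input: the function names are hypothetical, and the HT limit is instrument-specific, so the value used in the example is a placeholder.

```python
def grid_matches(wavelengths, start, end, step):
    """True if the data match the declared first/last wavelength and interval."""
    if not wavelengths or wavelengths[0] != start or wavelengths[-1] != end:
        return False
    direction = 1 if end > start else -1
    return all(
        abs((b - a) - direction * step) < 1e-6
        for a, b in zip(wavelengths, wavelengths[1:])
    )


def lowest_usable_wavelength(wavelengths, ht, ht_max):
    """Lowest wavelength reached before the HT first exceeds ht_max.

    Scans from high to low wavelength; returns None if even the first
    point is above the limit.
    """
    usable = None
    for wavelength, value in sorted(zip(wavelengths, ht), reverse=True):
        if value > ht_max:
            break
        usable = wavelength
    return usable


def implausible_helix(helix_fraction, n_residues, residues_per_turn=4):
    """Flag a non-zero helix content smaller than a single helical turn."""
    return 0 < helix_fraction * n_residues < residues_per_turn


wavelengths = [200.0, 195.0, 190.0, 185.0]
ht = [300.0, 400.0, 550.0, 700.0]  # invented HT readings

print(grid_matches(wavelengths, start=200.0, end=185.0, step=5.0))
print(lowest_usable_wavelength(wavelengths, ht, ht_max=600.0))  # placeholder limit
print(implausible_helix(0.01, 200))  # 1% of 200 residues is two residues
```

The wavelength returned by the HT check could then be used as the low wavelength cutoff described in the main text.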
References
[1] N. Greenfield and G.D. Fasman, Computed circular dichroism spectra for the evaluation of protein conformation, Biochemistry 8 (1969) 4108–4116.
[2] J.M. Scholtz, H. Qian, E.J. York, J.M. Stewart and R.L. Baldwin, Parameters of helix-coil transition theory for alanine-based peptides of varying chain lengths in water, Biopolymers 31 (1991) 1463–1470.
[3] A.J. Miles and B.A. Wallace, Synchrotron radiation circular dichroism spectroscopy of proteins and applications in structural and functional genomics, Chem. Soc. Reviews 35 (2006) 39–51.
[4] A. Toumadje, S.W. Alcorn and W.C. Johnson Jr., Extending CD spectra of proteins to 168 nm improves the analysis for secondary structures, Anal. Biochem. 200 (1992) 321–331.
[5] B.A. Wallace and R.W. Janes, Synchrotron radiation circular dichroism spectroscopy of proteins: Secondary structure, fold recognition and structural genomics, Curr. Opin. Chemical Biology 5 (2001) 567–571.
[6] J.G. Lees, A.J. Miles, F. Wien and B.A. Wallace, A reference database for circular dichroism spectroscopy covering fold and secondary structure space, Bioinformatics 22 (2006) 1955–1962.
[7] P. Pancoska, M. Blasek and T.A. Keiderling, Relationships between secondary structure fractions for globular proteins. Neural network analyses of crystallographic datasets, Biochemistry 31 (1992) 10250–10257.
[8] C.T. Chang, C.S. Wu and J.T. Yang, Circular dichroism analysis of protein conformation: Inclusion of β-turns, Anal. Biochem. 91 (1978) 13–31.
[9] S.W. Provencher and J. Glockner, Estimation of globular protein secondary structure from circular dichroism, Biochemistry 20 (1981) 33–37.
[10] J.P. Hennessey Jr. and W.C. Johnson Jr., Information content in the circular dichroism of proteins, Biochemistry 20 (1981) 1085–1094.
[11] L.A. Compton and W.C. Johnson Jr., Analysis of protein circular dichroism spectra for secondary structure using a simple matrix multiplication, Anal. Biochem. 155 (1986) 155–167.
[12] P. Manavalan and W.C. Johnson Jr., Variable selection method improves the prediction of protein secondary structure from circular dichroism spectra, Anal. Biochem. 167 (1987) 76–85.
[13] B.A. Wallace and C.L. Teeters, Differential absorption flattening optical effects are significant in the circular dichroism spectra of large membrane fragments, Biochemistry 26 (1987) 65–70.
[14] A. Perczel, K. Park and G.D. Fasman, Analysis of the circular dichroism spectrum of proteins using the convex constraint algorithm: A practical guide, Anal. Biochem. 203 (1992) 83–93.
[15] G. Böhm, R. Muhr and R. Jaenicke, Quantitative analysis of protein far UV circular dichroism spectra by neural networks, Protein Eng. 5 (1992) 191–195.
[16] N. Sreerama and R.W. Woody, A self-consistent method for the analysis of protein secondary structure from circular dichroism, Anal. Biochem. 209 (1993) 32–44.
[17] M.A. Andrade, P. Chacón, J.J. Merelo and F. Morán, Evaluation of secondary structure of proteins from UV circular dichroism using an unsupervised learning neural network, Protein Eng. 6 (1993) 383–390.
[18] N. Sreerama, S.Y. Venyaminov and R.W. Woody, Estimation of the number of α-helical and β-strand segments in proteins using circular dichroism spectroscopy, Protein Sci. 8 (1999) 370–380.
[19] N. Sreerama and R.W. Woody, Estimation of protein secondary structure from CD spectra: Comparison of CONTIN, SELCON and CDSSTR methods with an expanded reference set, Anal. Biochem. 282 (2000) 252–260.
[20] P. Unneberg, J.J. Merelo, P. Chacón and F. Morán, SOMCD: Method for evaluating protein secondary structure from UV circular dichroism spectra, Proteins: Struct., Funct., Genet. 42 (2001) 460–470.
[21] J.G. Lees, A.J. Miles, R.W. Janes and B.A. Wallace, Optimisation and development of novel methodologies for secondary structure prediction from circular dichroism spectra, BMC Bioinformatics 7 (2006) 507–517.
[22] C. Perez-Iratxeta and M.A. Andrade-Navarro, K2D2: Estimation of protein secondary structure from circular dichroism spectra, BMC Structural Biology 8 (2008) 25.
[23] P.L. Privalov, E.I. Tiktopulo, S.Y. Venyaminov, Y.V. Griko, G.I. Makhatadze and N.N. Khechinashvili, Heat capacity and conformation of proteins in the denatured state, J. Mol. Biol. 204 (1989) 737–750.
[24] P. Pancoska, E. Bitto, V. Janota, M. Urbanova, V.P. Gupta and T.A. Keiderling, Comparison of and limits of accuracy for statistical analyses of vibrational and electronic circular dichroism spectra in terms of correlations to and predictions of protein secondary structure, Protein Sci. 4 (1995) 1384–1401.
[25] R.W. Janes, Bioinformatics analyses of circular dichroism protein reference databases, Bioinformatics 21 (2005) 4230–4238.
[26] P. Evans, O.A. Bateman, C. Slingsby and B.A. Wallace, A reference dataset for circular dichroism spectroscopy tailored for the alpha-crystallin lens proteins, Experimental Eye Research 84 (2007) 1001–1008.
[27] B.A. Wallace, J.G. Lees, A.J.W. Orry, A. Lobley and R.W. Janes, Analyses of circular dichroism spectra of membrane proteins, Protein Sci. 12 (2003) 875–884.
[28] W. Kabsch and C. Sander, Dictionary of protein secondary structure: Pattern recognition of hydrogen-bonded and geometrical features, Biopolymers 22 (1983) 2577–2637.
[29] A. Lobley, L. Whitmore and B.A. Wallace, DICHROWEB: an interactive website for the analysis of protein secondary structure from circular dichroism spectra, Bioinformatics 18 (2002) 211–212.
[30] L. Whitmore and B.A. Wallace, DICHROWEB, An online server for protein secondary structure analyses from circular dichroism spectroscopic data, Nucleic Acids Res. 32 (2004) W668–W673.
[31] L. Whitmore and B.A. Wallace, Protein secondary structure analyses from circular dichroism spectroscopy: methods and reference databases, Biopolymers 89 (2008) 392–400.
[32] D. Mao, E. Wachter and B.A. Wallace, Folding of the mitochondrial proton adenosine triphosphatase proteolipid channel in phospholipid vesicles, Biochemistry 21 (1982) 4960–4968.
[33] A.J. Miles, L. Whitmore and B.A. Wallace, Spectral magnitude effects on the analyses of secondary structure from circular dichroism spectroscopic data, Protein Sci. 14 (2005) 368–374.
[34] M. Cascio and B.A. Wallace, Red- and blue-shifting in the circular dichroism spectra of polypeptides due to dipole effects, Prot. Pept. Lett. 1 (1994) 136–140.
[35] Y. Chen and B.A. Wallace, Secondary solvent effects on the circular dichroism spectra of polypeptides: Influence of polarisation effects on the far ultraviolet spectra of alamethicin, Biophysical Chem. 65 (1997) 65–74.
[36] J.G. Lees, B.R. Smith, F. Wien, A.J. Miles and B.A. Wallace, CDtool - An integrated software package for circular dichroism spectroscopic data processing, analysis and archiving, Anal. Biochem. 332 (2004) 285–289.
[37] G. Deleage and C. Geourjon, An interactive graphic program for calculating the secondary structure content of proteins from circular dichroism spectrum, Comput. Appl. Biosci. 2 (1993) 197–199.
Modern Techniques for Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy B.A. Wallace and R.W. Janes (Eds.) IOS Press, 2009. © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-000-1-183
Reference Datasets for Protein Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopic Analyses Robert W. Janes School of Biological and Chemical Sciences, Queen Mary, University of London
Abstract. In this chapter the contents and compositions of existing reference datasets used for the analysis of protein secondary structures from circular dichroism (CD) data are described. It includes discussions of the quality of the CD data and the crystal structures that are used. Early reference datasets are compared with a new dataset, SP175, which was specifically created based on bioinformatics assessments of protein secondary structure and fold space. Finally, the issues that need to be considered when creating a new reference dataset are discussed.
1. Introduction
As circular dichroism (CD) spectroscopy developed as a research technique for studying proteins, it became clear that there was a correlation between the spectrum obtained and the secondary structure components present in the protein. This was emphasised when proteins, similar in no other way than their known proportions of secondary structure content, gave rise to comparable CD spectra. As discussed in the Introduction chapter by Janes and Wallace in this book, the first attempts to obtain information on the secondary structural content of a protein from its CD spectrum used “basis spectra”. These proved to be of use in determining quantities of the major regular forms of secondary structures, α-helices, β-sheets, β-turns, and “other” (then termed “random coil”), and they are still useful for back-calculating a CD spectrum from crystal structure coordinates for comparison with an experimentally obtained spectrum. However, the basis sets were not useful for determining other components, such as polyproline-II or 3₁₀ helices, since there were insufficient examples of these types of secondary structures in the available crystal structures to enable accurate extrapolation to a spectrum representing each as a unique component. In addition, because there are significant variations in the canonical secondary structures in proteins with different folds, these “generic” spectra were often not useful for proteins with certain types of supersecondary structures. A new approach to determining the secondary structure content from a CD spectrum was therefore developed, that of using reference datasets containing a wide range of individual protein spectra. The proteins used to create the reference datasets all had known secondary structure contents derived from their crystal structures. How these datasets are used in combination with different types of algorithms to produce calculated secondary structures is discussed in the Analysis chapter by Whitmore and Wallace. This chapter will focus on the reference datasets themselves.
R.W. Janes / Reference Datasets for Protein CD and SRCD Spectroscopic Analyses
2. Early Reference Datasets
It is important to consider the components and quality of the early reference datasets. Some contain minor errors or were limited in the breadth of structures they included. This was due to the limited availability of sufficient quantities of purified protein samples, the quality of the CD data collected on a variety of machines, and the limited numbers and sometimes the quality of available solved protein crystal structures when these datasets were constructed. In spite of this, the mathematical procedures that have been developed to derive secondary structure content through use of these datasets are rigorous and of the highest quality, and generally produce very good answers as a result. By the late 1960s/early 1970s, when CD spectra of protein samples began to be collected, there was a limited range of CD instruments commercially available, and a variety of ways in which they measured CD data [1]. The lowest wavelengths they could realistically achieve were limited by instrument designs, and for the most part they had a lower wavelength cutoff around 190 nm [1], depending on the sample and spectral conditions used. Calibration standards (if used) and procedures varied between labs. All spectra collected within a research laboratory for a given set of proteins would most likely be internally calibrated/standardised, as they were obtained on the one machine. On occasions, however, the same proteins even gave rise to CD spectra with different appearances when collected in different labs [2]. A number of labs produced small reference datasets for the analyses of proteins, but most were limited in the number of proteins they included. Various reference datasets were later amalgamated in an important initiative by Sreerama and Woody [3], because this considerably broadened the coverage of the range of secondary structure components present, the “secondary structure space”, which in turn usually enhanced the accuracy of the derived analyses. At the same time this did introduce some potential problems, as reference sets that were internally calibrated were now linked with other sets that were equally internally calibrated, but not necessarily by the same means. In addition, where the different groups had collected CD spectra for the same protein that differed in appearance, the duplicates were discarded, but there is no report as to why one spectrum was chosen to represent that protein over any of the other equally possible spectra. As a consequence of amalgamation, the reference datasets, while broadening their secondary structure coverage, lost something of their internal consistency [2]. Although some errors and anomalies were introduced into the datasets, in the main amalgamation to improve the extent of coverage of secondary structures present within them did enhance their overall quality, and these reference datasets have endured as valuable tools for many years. Many of these reference datasets were made publicly available through the CDPro suite of programs [3], and are now accessible through the DichroWeb [4] online analysis server, which includes a range of analysis methods for determining secondary structure components from protein CD spectra.
3. Modern Reference Datasets SP175 and CRYST175
In 2006 the Wallace lab created a new soluble proteins dataset, SP175, that was specifically designed using bioinformatics techniques to broadly cover secondary structural types, including geometric variants, and to incorporate proteins that extensively covered fold space [5]. That reference dataset contains data from 71 proteins. The spectra for this dataset were in general collected on both laboratory-based CD and synchrotron radiation circular dichroism (SRCD) machines (all of which had been carefully cross-calibrated) to ensure no instrument bias was present. All the crystal structures of the proteins used had been analysed using PROCHECK [6] and other geometry-checking algorithms (see below) to establish their quality, and the spectra include data to a common minimum lower wavelength limit of 175 nm (and many go beyond this). In general, secondary structure determinations of soluble globular proteins using this reference set are more accurate than those calculated using the other datasets currently available [5]. A further, smaller specialist dataset is available, CRYST175, which also contains data to 175 nm as its lowest common wavelength [7]. There are nine members of this dataset, and all are proteins in the β,γ crystallin family. This group of proteins produces unusual CD spectral features, as they have a beta-sheet double Greek key motif structure which has strained torsion angles. Having a small focused dataset of this kind enables more accurate information to be obtained on the secondary structure content of novel members of the family which have similar geometries. As discussed in the chapter in this volume on Analyses by Whitmore and Wallace, because none of the current datasets incorporate either membrane proteins or fibrous (non-globular) proteins, accurate analyses of these types of samples still await the availability of new reference datasets [8].
4. Important Dataset Characteristics
4.1 Protein Component Characteristics
4.1.1 Coverage of Secondary Structure and Fold Space
At the time of creation of the early datasets, the number of protein structures in the Protein Data Bank (PDB) was much more limited than now. In addition, access to sufficient quantities of proteins for CD studies was also a limiting factor, so the early datasets often contained whatever proteins were available, which means that their coverage of secondary structural space (i.e. alpha-helix/beta-sheet content, as shown in Figure 1) was not comprehensive. There were significant gaps in the coverage of the percentages of secondary structure content, meaning that many of the datasets tended to have “clustered areas” of percentage α-helical content while other areas of secondary structure space were completely devoid of examples. In addition, in some cases, because the reference dataset included a number of homologous proteins, the coverage was not as broad as would have been presumed from the number of proteins in the dataset. The actual issue is how many “unique” structures are present in the reference dataset rather than the total number of structures. A further limitation within early reference datasets was in the coverage of “fold space”. It is now acknowledged that the overall topology of a protein can be categorised into specific arrangements of closely associated secondary structure features that can be separated into specific types of supersecondary structure
Figure 1. Distribution of proteins in “structure space”: plot of alpha-helix versus beta-sheet contents (axes: alpha-helix and beta-sheet content, each running from 0 to 1). The grey diamonds are all proteins in a non-redundant dataset derived from the full Protein Data Bank content. The open black circles are from reference dataset 4 and the closed black circles are from SP175 (adapted from [5]).
classifications (the CATH and SCOP databases are examples [9, 10]). Such classification concepts are the result of developments in bioinformatics and were not available at the time the original datasets were produced. As a result, the coverage of those datasets was limited by the lack of different types of available protein structures, and they also included a number of proteins from the same fold family, which lowered the diversity of structural information available. SP175 was specifically designed to maximise the coverage of both secondary structure space and fold space. Using bioinformatics tools, a set of proteins was selected to include representatives of the nine superfolds, and broad coverage within the CATH [9] class, architecture and topology levels of identification. Obviously, as more proteins and crystal structures become available, the coverage will be able to be even more comprehensive in future reference datasets.
4.1.2 Purity and Orthologues
It is clear that the protein samples used to produce the CD spectra included in the datasets must be of high purity, properly folded and functional. There is little information available concerning the quality of the proteins in the early datasets, and although SP175 includes proteins with purities of > 92% (most > 95%), neither it nor any of the other reference datasets assays them for function, a test of proper folding (albeit a daunting task for this number of different types of proteins). Another issue generally not addressed is whether under the conditions for data collection (usually in low ionic strength buffers to enable collection of low wavelength data) the proteins adopt the
same conformations as in the (generally) high ionic strength conditions present in the crystals. SP175 did compare some, but not all, of the proteins it used under both these types of conditions. Another consideration is whether the proteins used for the CD spectra are the same as those in the crystal structures, with respect to biological source and expressed construct (i.e. truncated or mutant versions used to produce the crystal structures must match the protein used to collect the CD data). For instance, some datasets have been produced that include CD spectra of orthologues (i.e. similar proteins but perhaps from other species) of those used to produce the crystal structures [11, 12], because there was no crystal structure available for the actual protein used. This could introduce discrepancies that could be detrimental to the overall quality of the dataset.
4.2 X-ray Structures Characteristics
The solving of a crystal structure in the early 1970s took considerable effort, as computer processing capacity was restrictive and there were also a limited number of software packages that could be used for structure determination and refinement. The Protein Data Bank, a site for the deposition of data on solved protein structures, was created in 1971, and since that time an ever-increasing number of structures have been stored in this resource. However, only in the early 1990s was software created that aimed to establish the quality of the determined protein structures, and an entire research “industry” has developed around this issue. Valuable packages such as PROCHECK [6], WHATIF [13], WHAT_CHECK [14], and Molprobity [15] are now available for assessing the geometric characteristics of solved crystal structures. It is of little surprise that some of the few determined structures that were available for use in early reference datasets were not of the quality expected for structures determined today.
There are numerous validation and quality-determining parameters that are now considered when looking at the overall extent of correctness of protein structures. One major consideration is whether certain polypeptide backbone geometries are favourable, or even possible. Residues in proteins are not freely able to rotate because clashes would occur between side-chains or main-chains in the adjacent sequence; steric hindrance therefore means that residues must occupy certain defined spatial geometries and no others are possible. The space that is available to residues was determined by Ramachandran [16], who defined two angular variables, φ and ψ, that pertain to the relative orientations of the pre- and post-peptide bond planes about the central alpha carbon for any residue in the protein chain. The exact details of these definitions and their usage are not of direct importance here; however, a Ramachandran plot (φ versus ψ angles) of these values is important when considering the quality of the determined structure. The simplest way to explain the Ramachandran plot is to use the terms defined in one of the most widely used structure validation packages available, PROCHECK [6]. Certain areas of the Ramachandran plot are fully accessible to residues; no steric hindrance occurs in these regions and they may be termed “fully allowed” regions. Surrounding these are areas that are possible with only minor flexing of bond angles or bond lengths. Either limited or no steric hindrance would result between adjacent residues, and these are termed “additionally allowed” regions. Around these additional regions are areas where it is unlikely that residues would be found in a Ramachandran plot, as to be present there would require significant straining of bond angles or changes in bond lengths, and these are termed “generously
R.W. Janes / Reference Datasets for Protein CD and SRCD Spectroscopic Analyses
allowed” regions. Any residues in a structure whose φ/ψ angles occur in these regions are identified in PROCHECK as a cause for some concern, as their positions may be incorrect. The final, remaining regions that surround the generous areas are where no residues should be located, since it is practically impossible for there to be anything but steric hindrance in these regions; they are termed “disallowed regions” in PROCHECK, and residues in them are also flagged as likely to be incorrect. A number of the older reference datasets used crystal structures that do contain some errors in parts of their conformations, as identified by PROCHECK analyses [2]. As an example, of the 32 structures in the original SELCON3 reference dataset [17, 18], which are currently used in a number of the existing datasets (Table 1), half contain residues that fall in the disallowed region of the Ramachandran plot [2]. Fortunately most of these have a very limited number of residues in this region, and it is entirely possible that such a position represents an average between two unresolvable structural conformations for the given residue. However, some structures do have a very significant number of residues in disallowed regions of the Ramachandran plot, which would result in problems with their conformations and hence errors in calculations of their secondary structure content [2]. The resolution of a crystal structure can be loosely equated with the ability to determine accurate atomic positions within a protein; the larger the value, the less accurate are the locations of the atoms within the structure. As a guide, structures with a resolution of ~2.4 Å and lower in value would be approaching atomic resolution when considering only protein structures. Many of the proteins used in the reference datasets have poor resolution data, due again to the lack of structures available at the time or the inability to produce well-ordered crystals of the proteins.
Of these same 32 structures, 10 have a resolution poorer than or equal to 2.4 Å; many of these are nearer to 3.0 Å, and one is poorer than 3.0 Å. This lack of resolution can introduce errors into structure solutions, simply because it is sometimes not possible to trace the accurate positions of the main chain of the protein, which can in turn affect the proportions of the calculated secondary structure content [2]. A further problem for many of the proteins used in the early datasets was “missing residues” in the solved structure. Although these residues are actually present in the protein that produced the crystal structure, they cannot be observed because they are flexible and adopt multiple conformations or positions within the “solid” crystal (protein crystals retain a large proportion of water in their structures, sometimes as much as 70%). In a number of the reference datasets it is unclear how these residues were dealt with regarding their contributions to the reported secondary structure content; they could either have been added to the "random coil" (now "other") total content, or simply discarded from the total values. It is arguable that the unresolved residues should be added to the "other" category of secondary structure (as is done in SP175), since it is unlikely that they form any persistent recognised type of secondary structure in solution. However, it is possible that they do adopt regular secondary structures at least part of the time and do contribute to the spectral signature, which would not have been taken into account in the analyses by defining them as “other”. Hence having a significant number of residues missing is likely to be detrimental to the dataset. Of the 32 proteins in the early reference datasets used as an example here, six have more than 3% of their structures undetermined [2]. In an ideal case it would obviously be preferred that this percentage was zero, although this is relatively
Table 1. Summary of reference datasets available for use in the DichroWeb online analysis server.
Reference Set | Wavelength range (nm) | No. of Proteins | Author *
Set 1 | 178 - 260 | 29 | Johnson
Set 2 | 178 - 260 | 22 | Johnson
Set 3 | 185 - 240 | 37 | 29 Johnson; 3 Venyaminov; 5 Pancoska & Keiderling
Set 4 | 190 - 240 | 43 | 32 from SELCON3; 6 Provencher & Glöckner; 5 Pancoska & Keiderling
Set 5 | 178 - 260 | 17 | Johnson
Set 6 | 185 - 240 | 42 | 32 from SELCON3; 5 Pancoska & Keiderling; 5 denatured
Set 7 | 190 - 240 | 48 | 32 from SELCON3; 6 Provencher & Glöckner; 5 Pancoska & Keiderling; 5 denatured
SP175 | 175 - 240 | 71 | Lees et al
CRYST175 | 175 - 240 | 9 | Evans et al
* The reference datasets are constructed from source material as various combinations from the following references [3, 11, 17, 19, and 20], and as stand-alone datasets from [5, 7].
unrealistic as a large number of protein crystal structures have at least some missing residues [2]. Table 2 compares which proteins are present in the various reference datasets, and Table 3 lists the individual proteins included in SP175 and their characteristics.
4.3 Spectral Characteristics
4.3.1 Spectral Calibrations
As discussed in the chapter on Calibration by Miles and Wallace, if the reference datasets are to be used for analyses of an unknown protein spectrum, it is crucial not only that the protein spectra in the reference dataset are internally calibrated to the same standards, but also that the unknown protein spectrum is cross-calibrated in the same way. The issues are instrument calibration for wavelength and magnitude, protein concentration determinations and sample cell pathlength measurements [21]. The ways in which such calibrations can be accomplished are discussed in detail in that chapter.
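Why concentration and pathlength matter can be seen from the standard conversion of measured ellipticity into per-residue units, in which both quantities appear in the denominator. The following is a minimal Python sketch rather than anything taken from the chapter: the function name, the sample values and the mean residue weight of 110 Da are illustrative assumptions.

```python
def delta_epsilon(theta_mdeg, conc_mg_per_ml, path_cm, mean_residue_weight):
    """Convert ellipticity (mdeg) to per-residue delta epsilon (M^-1 cm^-1).

    Uses the standard relations:
      mean residue ellipticity = theta * MRW / (10 * pathlength * concentration)
      delta epsilon            = mean residue ellipticity / 3298
    """
    mre = theta_mdeg * mean_residue_weight / (10.0 * path_cm * conc_mg_per_ml)
    return mre / 3298.0


# Hypothetical measurement: -20 mdeg, 1.0 mg/ml, 0.01 cm cell, MRW of 110 Da.
true_value = delta_epsilon(-20.0, 1.0, 0.01, 110.0)

# Same raw signal, but the concentration has been over-estimated by 10 %.
biased = delta_epsilon(-20.0, 1.1, 0.01, 110.0)

# The whole spectrum is scaled by 1/1.1, i.e. its magnitude is about 9 % too small.
relative_error = biased / true_value - 1.0
print(f"true = {true_value:.3f}, biased = {biased:.3f}, error = {relative_error:.1%}")
```

Because the error is a uniform scaling of the spectrum, it cannot be detected from the spectral shape alone, which is why the reference spectra and the unknown spectrum need to share the same calibration.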
4.3.2 Spectral Wavelength Ranges
As has been indicated in the Analysis chapter by Whitmore and Wallace, extending the wavelength range of the collected CD data into the VUV region of the electromagnetic spectrum, which is available in SRCD data, increases the information content that is present in the data. This in turn means that an increased number of secondary structure components can be determined from that data, and with increased accuracy in the percentages obtained [22, 23]. The wavelength range for CD data in early reference datasets was often limited by the design of the spectrophotometer. As a result, different ranges of data were collected for different datasets, and this also led to potential differences in the accuracy of the secondary structure content that could be determined from them. When different sets were then amalgamated, only the region where their ranges overlapped provided viable data for the whole reference set. On occasions this meant that sets collected over a longer wavelength range than others in the amalgamated set had to be truncated, as can be seen in Table 1. Hence, whilst amalgamation increased the potential coverage of some types of secondary structure components, on occasions it actually decreased the number of different types of components that could be derived from the data, because the available information content had decreased [22, 23]. Table 1 lists the numbers of proteins in the reference datasets 1-7 and in the SP175 and CRYST175 datasets available for use in the DichroWeb online analysis server [4]. Table 2 compares which proteins are present in the various reference datasets.
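The truncation that accompanies amalgamation amounts to taking the intersection of the wavelength ranges of the contributing sets. The sketch below illustrates this in Python; the function names and the example spectrum are hypothetical, and the three ranges are simply values of the kind listed in Table 1, combined here only for illustration.

```python
def common_range(ranges):
    """Return the (low, high) wavelength window shared by every dataset."""
    low = max(start for start, _ in ranges)
    high = min(end for _, end in ranges)
    if low > high:
        raise ValueError("datasets share no common wavelength range")
    return low, high


def truncate(spectrum, low, high):
    """Keep only the (wavelength, value) points inside the shared window."""
    return [(wl, value) for wl, value in spectrum if low <= wl <= high]


# Wavelength ranges in nm; pairing these particular ranges is illustrative.
source_ranges = [(178, 260), (185, 240), (190, 240)]
low, high = common_range(source_ranges)

# Hypothetical spectrum sampled every 10 nm from 180 to 260 nm (values are dummies).
spectrum = [(wl, 0.0) for wl in range(180, 261, 10)]
kept = truncate(spectrum, low, high)
print(low, high, [wl for wl, _ in kept])
```

The set with the widest range loses its data below 190 nm and above 240 nm, which is the loss of information content described in the text.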
5. Utilisation of the Reference Datasets
A number of empirical approaches have been developed to utilise the reference datasets to determine secondary structure content from CD spectra. Despite the concerns listed above for some of the datasets, the mathematical procedures that use them for secondary structure content determination are rigorous and yield remarkably good results. These procedures include (in order of their development) linear least-squares [24, 25], parameterized fit [20], singular-value decomposition [22], non-linear least-squares [26] and self-consistent variable selection methods [3, 17, 27], as well as neural-network based approaches [28, 29]. As computing capacity has increased, the newer methods for determining content reflect this in their complexity and their need for such resources. A number of these, together with a number of newer methods, were assessed for their quality in determining secondary structure content [29]. It was concluded that the primary reasons for more accurate content determination were extending the information content into the lower wavelength region and, at the same time, covering secondary structure space more effectively by increasing the number of proteins in the reference dataset. These two criteria have been satisfied for the most part by the SP175 dataset [5]. However, there is no possibility of gaining an accurate determination of secondary structure content from the CD spectrum of a truly novel protein when using a reference dataset that contains only limited or no examples of the secondary structures found in that protein. The choice of reference dataset is therefore of major importance in determining the secondary structure content with any degree of accuracy. By far the most complete coverage of secondary and tertiary structure space is provided by the SP175 dataset, and this should normally be the initial dataset of choice for
secondary structure determination from a CD (or SRCD) spectrum. However, it may also be useful to test the other datasets, which may be more suitable for specific types of proteins. For example, for natively unfolded proteins, reference datasets 6 and 7 may be most useful as they contain spectra from denatured proteins. It is entirely reasonable to report the results from the reference dataset that gives the best quality fits; to do so is not a “fudge” but rather an indication that the unknown protein has characteristics that are better represented by the proteins in that particular dataset.
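The simplest of the approaches listed in Section 5, linear least-squares, and the idea of comparing fit quality between reference sets can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of any particular package: the basis spectra are made-up Gaussian bands rather than real reference data, the fit is unconstrained, the function names are hypothetical, and the normalised root-mean-square deviation (NRMSD) is used here as one common goodness-of-fit measure.

```python
import numpy as np


def fit_fractions(basis, spectrum):
    """Unconstrained linear least-squares fit of a spectrum to basis spectra.

    basis has shape (n_wavelengths, n_components); one fraction per column is returned.
    """
    coeffs, *_ = np.linalg.lstsq(basis, spectrum, rcond=None)
    return coeffs


def nrmsd(experimental, calculated):
    """Normalised root-mean-square deviation between two spectra."""
    residual = experimental - calculated
    return float(np.sqrt(np.sum(residual ** 2) / np.sum(experimental ** 2)))


wavelengths = np.arange(190.0, 241.0, 1.0)


def band(centre, width, amplitude):
    """Made-up Gaussian band; it stands in for real spectral features."""
    return amplitude * np.exp(-((wavelengths - centre) / width) ** 2)


# Three synthetic "component" spectra (purely illustrative shapes).
basis = np.column_stack([
    band(193, 6, 1.0) + band(208, 6, -0.5) + band(222, 7, -0.5),
    band(198, 7, 0.6) + band(217, 8, -0.4),
    band(200, 8, -0.7),
])

# Synthetic "unknown" spectrum built from known fractions.
true_fractions = np.array([0.5, 0.3, 0.2])
spectrum = basis @ true_fractions

# A basis that covers all components recovers the fractions.
fractions = fit_fractions(basis, spectrum)
fit_full = nrmsd(spectrum, basis @ fractions)

# A basis lacking one component fits the same spectrum less well.
partial = basis[:, :2]
fit_partial = nrmsd(spectrum, partial @ fit_fractions(partial, spectrum))

print(np.round(fractions, 3), fit_full, fit_partial)
```

The poorer fit obtained with the incomplete basis mirrors the point made above: a reference set that lacks examples of a structural type present in the unknown protein cannot describe its spectrum well, and comparing fit quality across reference sets is one way to detect this.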
6. Considerations When Creating a New Reference Dataset
While it is the SP175 dataset that provides the most extensive general coverage of both secondary structure and fold space, there may be occasions where it would be best to focus on a family of proteins with the aim of enhancing information on novel members of that family. The CRYST175 dataset is an example of a dataset focused on a specific set of proteins, here the β,γ crystallin family [7]. Because this family of proteins has beta-sheet structures with unusual geometric characteristics, members of the family are not well analysed with any of the standard datasets. When new members are found, their CD (or SRCD) spectra can be more readily evaluated for secondary structure and fold information because of the existence of this focused set. Use of CRYST175 enables subtle differences between mutants or homologues to be identified which would otherwise be unobservable from analyses with a more diverse general dataset. Thus for specific purposes focused reference datasets can be more informative. New reference datasets can be created using the CDtool program [30]. However, in order to gain the best results it is necessary to create a dataset from the best source information. There are many aspects that should be considered, and the remaining sections focus on key points to look for when considering the crystal structures and CD spectral data to be used.
6.1 Utilisation of Quality Crystal Structures
Protein crystal structures solved after ~1995 have predominantly been subject to checking of their structure quality, usually prior to their publication or release by the Protein Data Bank. PROCHECK is a typical package used to evaluate such quality and it is widely used as a validation tool. The output from this package is extensive. It includes the percentage of protein residues that lie in the most favoured regions of the Ramachandran plot; a value of 90% or over indicates a quality structure in this respect.
Distortions in bond lengths, angles and torsion angles are also noted. These terms are collected together into a single parameter, the G-factor, that provides an overall “evaluation term”; structures for which this value is greater than -0.5 are acceptable. There are other aspects of a solved structure that should be considered, especially where two structures of the same source protein exist. The resolution of the solved structure should be taken into account; a good general rule is that the lower the value, the more likely it is to be the better structure. Wherever possible, structures with resolutions better than 2.2 Å should be used, as this is a resolution at which polypeptide backbones are well-defined and the secondary structure can therefore be determined from them with a greater degree of accuracy. Deposited protein crystal structures include a value for the “R-factor” (loosely related to the “remaining” difference between the observed and calculated structure
data), and again here, the lower the value the better the structure. It should be noted that more recent structures also include a related term, the “R-free”, and having this term quoted suggests that a structure has been solved more recently than one without such a term. The termini of proteins are not always well resolved, usually because they lack interactions, such as hydrogen bonds, with the remainder of the protein that would hold them in a given position. As a result terminal residues are often unresolvable in the structure and can be classified as “missing”. When choosing structures for a reference dataset the extent of missing residues, if any, should be taken into consideration. Preferably there should be no missing residues. Certainly missing residues within the body of the protein, in loops for example, are to be avoided, as these could well mask secondary structure features that are incorporated in the “flexible loop”. Flexibility of this kind may be essential to the protein function, but it often means that areas adjacent to the missing region are poorly defined crystallographically as well, all potentially leading to errors in the reference dataset. Taking these factors into consideration when choosing the crystal structures will ensure that the structure side of the reference dataset has been optimised as far as is possible.
6.2 Collection of Accurate and Calibrated CD Spectra
The crystal structures are only one component of any reference dataset; the CD spectra are the remaining, essential part. It is clear from what has been discussed in this and other chapters in this book that the CD data need to be collected as accurately as possible. One possibly obvious but important point should be made: the CD data should be collected from the same protein from the same species as that of the crystal structure.
Some members of the early reference sets equated protein crystal structures from one species with CD data from the same protein but from a different species, primarily because at the time it was considered that such proteins would be identical in their structural features. Today we know this not to be so, and that CD data from the same protein from different species are subtly different. When collecting CD data for a reference dataset, consideration should be given to the wavelength range that can be achieved. In some cases a balance must be struck between including buffers that better match the crystallisation conditions but which are absorbing (see the chapter by Miles and Wallace on Good Practice) and the desire for the lowest wavelength data possible. Extending the data into the VUV wavelength range by using SRCD instruments to collect the data will enhance the quality of the evaluated secondary structure content, especially for beta-sheet rich proteins, which tend to exhibit considerable variation [31].
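The structure-selection guidelines of Section 6.1 can be expressed as a simple filter. The sketch below is illustrative only: the class and function names are hypothetical, the guideline values (resolution better than 2.2 Å, G-factor greater than -0.5, no missing residues) are treated as hard cut-offs purely for demonstration, and the Ramachandran criterion is omitted because the percentage of residues in most favoured regions is not tabulated in this chapter. The example entries are taken from Table 3.

```python
from dataclasses import dataclass


@dataclass
class Structure:
    name: str
    pdb: str
    resolution: float    # in Angstrom; lower is better
    g_factor: float      # PROCHECK overall G-factor
    pct_missing: float   # percentage of residues unseen in the crystal structure


def passes(s, max_resolution=2.2, min_g_factor=-0.5, max_missing=0.0):
    """Apply the Section 6.1 guideline values as strict thresholds (an assumption)."""
    return (
        s.resolution < max_resolution
        and s.g_factor > min_g_factor
        and s.pct_missing <= max_missing
    )


# Values as listed in Table 3.
candidates = [
    Structure("Concanavalin A", "1nls", 0.9, -0.27, 0),
    Structure("Catalase", "1f4j", 3.0, 0.24, 0),          # resolution too low
    Structure("Papain", "1ppn", 1.6, -0.84, 0),           # G-factor below -0.5
    Structure("beta-Lactoglobulin", "1b8e", 2.0, -0.27, 6),  # 6 % missing
    Structure("Ubiquitin", "1ubi", 1.8, 0.19, 0),
]

selected = [s for s in candidates if passes(s)]
print([s.pdb for s in selected])
```

In practice such thresholds are guidelines rather than absolute rules, and they may need to be relaxed when no better structure of a protein is available; the default arguments make that trade-off explicit.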
7. Summary
There exist a number of reference datasets available for use in the determination of protein secondary structural content from a CD spectrum. As well as incorporating a number of different calculation algorithms, the DichroWeb server incorporates all the reference datasets that are currently publicly available. There are issues with some of the proteins incorporated in some of these datasets; however, the algorithms developed, especially those using variable selection, can ameliorate these issues and produce good results using almost any of the reference datasets. The newest of the general reference
datasets is SP175, and this has been demonstrated to be optimal at providing the secondary structure content for many diverse types of proteins, due to its coverage of secondary structure and fold space and its extended wavelength range. For gaining more subtle information about a specific family of proteins, creation of a focused reference dataset is one possible course of action. CRYST175, a dataset containing family members of the β,γ crystallins, exemplifies this point. However, only quality protein structures and CD data should be used in the creation of such reference datasets, and this chapter has identified the features that need to be considered in order to produce good reference data.
Acknowledgments
Much of the material reported on the analyses of reference datasets in this chapter came from work originally supported by a grant from the U.K. Biotechnology and Biological Sciences Research Council. I would like to thank Prof. B.A. Wallace of Birkbeck College, University of London, for helpful discussions.
Table 2. The component proteins in the reference datasets available for use in the DichroWeb online analysis server. A grey circle indicates the presence of the protein in the reference dataset. A black circle indicates the presence of a protein in SP175 for which the PDB structure used was different from that used in the other datasets.
Protein codes †
Set 1 | Set 2 | Set 3 | Set 4 | Set 5 | Set 6 | Set 7 | SP175
[Table body: the circles marking dataset membership for each protein are not reproduced. Rows, by protein code: AAMY, ABNG, ACY5, ACY9, ADH, ADK, ALDO, APP, APRT, AVDN, AZU, BAMY, BB2C, BGAL, BLAC, BNJN, BPTI, CA1, CA2, CAL, CAT, CER, CHYG, CHYT, CITS, COLA, CONA, CPA, CPHY, CYTC, DHQ1, DHQ2, DNA1, ECOR, ELAS, FERD, FLVD, GBC, GDC, GEC, GFP, GLOX, GLUD, GPB, GPD, GRS, GSCR, HAL, HBN, HGDC, HMRT, HSA, IFBP, IGG, INS, JAC, LACF, LDH, LEP, LLEC, LYS, MBH, MBW, MON, NMRA, NUCL, OVAL, OVOT, OX20, PAP, PARV, PELC, PGEN, PGK, PGLU, PGM, PLA2, PLEC, PNMT, PRAL, PROX, PYK, RHD, RNAS, RUBR, SN06, SN70, SOD, STI, STRP, SUBA, SUBB, SUBN, T4LS, THAU, THML, TNF, TPI, TRPN, UBIQ.]
† Identities of proteins referred to in the “Protein codes” column are: AAMY - α-Amylase; ABNG - α-Bungarotoxin; ACY5 - Apo-Cytochrome C (5C) denatured; ACY9 - Apo-Cytochrome C (90C) denatured; ADH - Alcohol Dehydrogenase; ADK - Adenylate Kinase; ALDO - Aldolase; APP - Alkaline Phosphatase; APRT - Aprotinin; AVDN - Avidin; AZU - Azurin; BAMY - β-Amylase; BB2C - β-B2-Crystallin (bovine); BGAL - β-Galactosidase; BLAC - β-Lactoglobulin; BNJN - Bence Jones Protein; BPTI - Bovine Pancreatic Trypsin Inhibitor; CA1 - Carbonic Anhydrase I; CA2 - Carbonic Anhydrase II; CAL - Calmodulin; CAT - Catalase; CER - Ceruloplasmin; CHYG - α-Chymotrypsinogen; CHYT - α-Chymotrypsin; CITS - Citrate Synthase; COLA - Colicin A; CONA - Concanavalin A; CPA - Carboxypeptidase A (bovine); CPHY - c-Phycocyanin; CYTC - Cytochrome C; DHQ1 - Dehydroquinase-type 1; DHQ2 - Dehydroquinase-type 2; DNA1 - DNase-I; ECOR - EcoR1 Endonuclease; ELAS - Elastase; FERD - Ferredoxin; FLVD - Flavodoxin; GBC - γ-B-Crystallin (bovine); GDC - γ-D-Crystallin (bovine); GEC - γ-E-Crystallin (bovine); GFP - Green Fluorescent Protein; GLOX - Glucose Oxidase; GLUD - Glutamate Dehydrogenase; GPB - Glycogen Phosphorylase-b; GPD - Glyceraldehyde 3-p Dehydrogenase; GRS - Glutathione Reductase; GSCR - γ-S-Crystallin (C-terminus); HAL - Haloalkane Dehalogenase; HBN - Haemoglobin; HGDC - γ-D-Crystallin (human); HMRT - Hemerythrin; HSA - Human Serum Albumin; IFBP - Rat Intestinal Fatty Acid Binding Protein; IGG - IgG; INS - Insulin; JAC - Jacalin; LACF - Lactoferrin; LDH - Lactate Dehydrogenase; LEP - Leptin; LLEC - Lentil Lectin; LYS - Lysozyme; MBH - Myoglobin (Horse); MBW - Myoglobin (Sperm Whale); MON - Monellin; NMRA - Nitrogen Metabolite Repression Regulator NmrA; NUCL - Nuclease; OVAL - Ovalbumin; OVOT - Ovotransferrin; OX20 - Ribonuclease (20C) denatured; PAP - Papain; PARV - Parvalbumin; PELC - Pectate Lyase C; PGEN - Pepsinogen; PGK - Phosphoglycerate Kinase; PGLU - Polyglutamic Acid; PGM - Phosphoglucomutase; PLA2 - Phospholipase-A2; PLEC - Pea Lectin; PNMT - Phenylethanolamine N-Methyltransferase; PRAL - Prealbumin; PROX - Peroxidase; PYK - Pyruvate Kinase; RHD - Rhodanese; RNAS - Ribonuclease A; RUBR - Rubredoxin; SN06 - Staphylococcal Nuclease (6C) denatured; SN70 - Staphylococcal Nuclease (70C) denatured; SOD - Superoxide Dismutase (Cu, Zn); STI - Trypsin Inhibitor; STRP - Streptavidin; SUBA - Subtilisin A; SUBB - Subtilisin BPN'; SUBN - Subtilisin Novo; T4LS - T4 Lysozyme; THAU - Thaumatin; THML - Thermolysin; TNF - Tumor Necrosis Factor; TPI - Triose Phosphate Isomerase; TRPN - Trypsin; UBIQ - Ubiquitin.
The sets are divided into different classes of secondary structural types, as follows:
Set 1 - α-Helix(Regular), α-Helix(Distorted), β-Sheet(Regular), β-Sheet(Distorted), Turn, Other
Set 2 - α-Helix, 3₁₀-Helix, β-Sheet, Turn, Polyproline-II, Other
Set 3 - α-Helix(Regular), α-Helix(Distorted), β-Sheet(Regular), β-Sheet(Distorted), Turn, Other
Set 4 - α-Helix(Regular), α-Helix(Distorted), β-Sheet(Regular), β-Sheet(Distorted), Turn, Other
Set 5 - α-Helix, β-Sheet, Turn, Polyproline-II, Other
Set 6 - α-Helix(Regular), α-Helix(Distorted), β-Sheet(Regular), β-Sheet(Distorted), Turn, Other
Set 7 - α-Helix(Regular), α-Helix(Distorted), β-Sheet(Regular), β-Sheet(Distorted), Turn, Other
SP175 - α-Helix(Regular), α-Helix(Distorted), β-Sheet(Regular), β-Sheet(Distorted), Turn, Other
Table 3. Proteins in the SP175 dataset available at the DichroWeb online analysis server, together with their Protein Data Bank (PDB) codes, the resolution and R-factor of the relevant crystal structure, the quality of the crystal structure as given by the G-factor in PROCHECK, and the percentage of amino acids missing (unseen) in the crystal structure. Adapted from [5].
Protein | PDB Code | Resolution (Å) | R-Factor | G-Factor | Percentage missing
Aldolase | 1ado | 1.9 | 0.16 | 0.35 | 0
Alkaline Phosphatase | 1ed9 | 1.8 | 0.20 | -0.18 | 0
α-Amylase | 1vjs | 1.7 | 0.20 | 0.23 | 0
α-Bungarotoxin | 1hc9 | 1.8 | 0.20 | 0.28 | 1
α-Chymotrypsin | 5cha | 1.7 | 0.18 | -0.82 | 2
α-Chymotrypsinogen | 2cga | 1.8 | 0.17 | -0.50 | 0
Aprotinin | 5pti | 1.0 | 0.20 | -0.58 | 0
Avidin | 1rav | 2.2 | 0.17 | -0.36 | 0
β-Amylase | 1fa2 | 2.3 | 0.21 | 0.23 | 0
βB2-Crystallin | 2bb2 | 2.1 | 0.19 | -0.66 | 0
β-Galactosidase | 1bgl | 2.5 | 0.17 | -0.42 | 0
β-Lactoglobulin | 1b8e | 2.0 | 0.19 | -0.27 | 6
Carbonic Anhydrase-I (Human) | 1hcb | 1.6 | 0.18 | -0.50 | 0
Carbonic Anhydrase-II | 1ca2 | 2.0 | 0.17 | -0.42 | 1
Carboxypeptidase-A | 5cpa | 1.5 | 0.19 | -0.68 | 0
Calmodulin | 1lin | 2.0 | 0.22 | 0.42 | 0
Catalase | 1f4j | 3.0 | 0.20 | 0.24 | 0
Ceruloplasmin | 1kcw | 3.0 | 0.22 | -0.11 | 0
Citrate Synthase | 2cts | 2.0 | 0.16 | -0.37 | 0
Concanavalin A | 1nls | 0.9 | 0.13 | -0.27 | 0
c-Phycocyanin | 1ha7 | 2.2 | 0.19 | 0.50 | 0
Cytochrome-c | 1hrc | 1.9 | 0.18 | -0.29 | 0
Dehydroquinase-typeI | 1qfe | 2.1 | 0.20 | 0.34 | 0
Dehydroquinase-typeII | 2dhq | 2.0 | 0.15 | 0.03 | 7
DNase-I | 3dni | 2.0 | 0.18 | -0.48 | 1
Elastase | 3est | 1.7 | 0.17 | -0.36 | 0
Ferredoxin | 2fdn | 0.9 | 0.10 | -0.27 | 0
γ-B-Crystallin | 4gcr | 1.5 | 0.18 | -0.49 | 0
γ-D-Crystallin (Bovine) | 1elp | 2.0 | 0.20 | -0.12 | 0
γ-D-Crystallin (Human) | 1hk0 | 1.3 | 0.17 | -0.01 | 0
γ-E-Crystallin | 1m8u | 1.7 | 0.19 | 0.24 | 0
γ-S-Crystallin (C-term) | 1ha4 | 2.4 | 0.22 | 0.20 | 0
Glucose Oxidase | 1cf3 | 1.9 | 0.19 | 0.09 | 0
Glutamate Dehydrogenase | 1hwx | 2.5 | 0.17 | 0.18 | 0
Glycogen Phosphorylase-b | 1gpb | 1.9 | 0.19 | -0.54 | 0
Haloalkane Dehalogenase | 1bn6 | 1.5 | 0.17 | -0.12 | 1
Haemoglobin | 1hda | 2.2 | 0.19 | 0.12 | 0
Human Serum Albumin | 1n5u | 1.9 | 0.24 | 0.44 | 0
IgG | 1igt | 2.8 | 0.21 | 0.05 | 0
Insulin | 1trz | 1.6 | 0.17 | -0.39 | 0
Jacalin | 1ku8 | 1.8 | 0.19 | 0.13 | 1
Lactoferrin | 1blf | 2.8 | 0.23 | 0.03 | 0
Lentil Lectin | 1les | 1.9 | 0.19 | -0.12 | 0
Leptin | 1ax8 | 2.4 | 0.19 | 0.36 | 0
Lysozyme | 193l | 1.3 | 0.18 | 0.33 | 0
Monellin | 1mol | 1.7 | 0.17 | -0.12 | 0
Myoglobin (Horse) | 1ymb | 1.9 | 0.16 | -0.61 | 0
Myoglobin (Sperm Whale) | 1a6m | 1.0 | 0.13 | 0.11 | 0
Nitrogen Metabolite Repression Regulator NmrA | 1k6j | 1.8 | 0.17 | 0.31 | 5
Ovalbumin | 1ova | 2.0 | 0.17 | -0.51 | 0
Ovotransferrin | 1dot | 2.4 | 0.23 | -1.12 | 0
Papain | 1ppn | 1.6 | 0.16 | -0.84 | 0
Pea Lectin | 1ofs | 1.8 | 0.17 | 0.17 | 2
Pectate Lyase C | 1air | 2.2 | 0.17 | 0.21 | 0
Pepsinogen | 2psg | 1.8 | 0.16 | -0.40 | 0
Peroxidase | 7atj | 1.5 | 0.16 | 0.04 | 0
Phosphoglucomutase | 3pmg | 2.4 | 0.16 | -0.23 | 0
Phosphoglycerate Kinase | 3pgk | 2.5 | - | -0.56 | 0
Phospholipase-A2 | 1une | 1.5 | 0.18 | 0.33 | 0
Phenylethanolamine N-Methyltransferase | 1hnn | 2.4 | 0.23 | 0.29 | 6
Pyruvate Kinase | 1a49 | 2.1 | 0.20 | -0.21 | 2
Rhodanese | 1rhs | 1.4 | 0.17 | -0.29 | 0
Ribonuclease A | 3rn3 | 1.5 | 0.22 | -0.33 | 0
Rubredoxin | 1r0i | 1.5 | 0.15 | 0.09 | 0
Streptavidin | 1stp | 2.6 | 0.22 | -0.16 | 0
Subtilisin A | 1sca | 2.0 | 0.16 | -0.18 | 0
Superoxide Dismutase (Cu,Zn) | 1cbj | 1.7 | 0.19 | -0.03 | 0
Thaumatin | 1thw | 1.8 | 0.18 | -0.26 | 0
Triose Phosphate Isomerase | 7tim | 1.9 | 0.18 | -0.20 | 0
Trypsin Inhibitor | 1avu | 2.3 | 0.20 | -0.27 | 4
Ubiquitin | 1ubi | 1.8 | 0.17 | 0.19 | 0
References
[1] G.D. Fasman (ed.), Circular Dichroism and the Conformational Analysis of Biomolecules, (1996) Plenum Press.
[2] R.W. Janes, Bioinformatics analyses of circular dichroism protein reference databases, Bioinformatics 21 (2005) 4230-4238.
[3] N. Sreerama and R.W. Woody, Estimation of protein secondary structure from CD spectra: Comparison of CONTIN, SELCON and CDSSTR methods with an expanded reference set, Anal. Biochemistry 282 (2000) 252-260.
[4] L. Whitmore and B.A. Wallace, Protein secondary structure analyses from circular dichroism spectroscopy: methods and reference databases, Biopolymers 89 (2008) 392-400.
[5] J.G. Lees, A.J. Miles, F. Wien and B.A. Wallace, A reference database for circular dichroism spectroscopy covering fold and secondary structure space, Bioinformatics 22 (2006) 1955-1962.
[6] R.A. Laskowski, M.W. MacArthur, D.S. Moss and J.M. Thornton, PROCHECK - A program to check the stereochemical quality of protein structures, J. App. Cryst. 26 (1993) 283-291.
[7] P. Evans, O.A. Bateman, C. Slingsby and B.A. Wallace, A reference dataset for circular dichroism spectroscopy tailored for the βγ-crystallin lens proteins, Experimental Eye Res. 84 (2007) 1001-1008.
[8] B.A. Wallace, J.G. Lees, A.J.W. Orry, A. Lobley and R.W. Janes, Analyses of circular dichroism spectra of membrane proteins, Protein Sci. 12 (2003) 875-884.
[9] C.A. Orengo, A.D. Michie, S. Jones, D.T. Jones, M.B. Swindells and J.M. Thornton, CATH - a hierarchic classification of protein domain structures, Structure 5 (1997) 1093-1108.
[10] A.G. Murzin, S.E. Brenner, T. Hubbard and C. Chothia, SCOP: A structural classification of proteins database for the investigation of sequences and structures, J. Mol. Biol. 247 (1995) 536-540.
[11] P. Pancoska, E. Bitto, V. Janota, M. Urbanova, V.P. Gupta and T.A. Keiderling, Comparison of and limits of accuracy for statistical analyses of vibrational and electronic circular dichroism spectra in terms of correlations to and predictions of protein secondary structure, Protein Science 4 (1995) 1384-1401.
[12] K.A. Oberg, J-M. Ruysschaert and E. Goormaghtigh, Rationally selected basis proteins: A new approach to selecting proteins for secondary structure analysis, Protein Sci. 12 (2003) 2015-2031.
[13] G. Vriend, WHATIF - A molecular modeling and drug design program, J. Mol. Graph. 8 (1990) 52-56.
[14] R.W.W. Hooft, G. Vriend, C. Sander and E.E. Abola, WHAT_CHECK. Errors in protein structures, Nature 381 (1996) 272-272.
[15] I.W. Davis, L.W. Murray, J.S. Richardson and D.C. Richardson, MolProbity: Structure validation and all-atom contact analysis for nucleic acids and their complexes, Nucleic Acids Res. 32 (2004) W615-W619.
[16] G.N. Ramachandran, C. Ramakrishnan and V. Sasisekharan, Stereochemistry of polypeptide chain configurations, J. Mol. Biol. 7 (1963) 95-99.
[17] N. Sreerama and R.W. Woody, A self-consistent method for the analysis of protein secondary structure from circular-dichroism, Anal. Biochemistry 209 (1993) 32-44.
[18] N. Sreerama, S.Y. Venyaminov and R.W. Woody, Estimation of the number of alpha-helical and beta-strand segments in proteins using circular dichroism spectroscopy, Protein Sci. 8 (1999) 370-380.
[19] C.T. Chang, C.S.C. Wu and J.T. Yang, Circular dichroic analysis of protein conformation - inclusion of beta-turns, Anal. Biochemistry 91 (1978) 13-31.
[20] S.W. Provencher and J. Glöckner, Estimation of globular protein secondary structure from circular dichroism, Biochemistry 20 (1981) 33-37.
[21] A.J. Miles and B.A. Wallace, Synchrotron radiation circular dichroism spectroscopy of proteins and applications in structural and functional genomics, Chem. Soc. Reviews 35 (2006) 39-51.
[22] J.P. Hennessey, Jr. and W.C. Johnson, Jr., Information content in the circular dichroism of proteins, Biochemistry 20 (1981) 1085-1094.
[23] B.A. Wallace and R.W. Janes, Synchrotron radiation circular dichroism spectroscopy of proteins: Secondary structure, fold recognition and structural genomics, Curr. Opin. Chemical Biology 5 (2001) 567-571.
[24] Y.H. Chen and J.T. Yang, A new approach to the calculation of secondary structures of globular proteins by optical rotatory dispersion and circular dichroism, Biochem. Biophys. Res. Commun. 44 (1971) 1285-1291.
[25] S. Brahms and J. Brahms, Determination of protein secondary structure in solution by vacuum ultraviolet circular dichroism, J. Mol. Biol. 138 (1980) 149-178.
[26] B.A. Wallace and C.L. Teeters, Differential absorption flattening optical effects are significant in the circular-dichroism spectra of large membrane-fragments, Biochemistry 26 (1987) 65-70.
[27] W.C. Johnson, Analyzing protein circular dichroism spectra for accurate secondary structures, Proteins: Struct. Funct. Genet. 35 (1999) 307-312.
[28] C. Perez-Iratxeta and M.A. Andrade-Navarro, K2D2: Estimation of protein secondary structure from circular dichroism spectra, BMC Structural Biology 8 (2008) 25.
[29] J.G. Lees, A.J. Miles, R.W. Janes and B.A. Wallace, Optimisation and development of novel methodologies for secondary structure prediction from circular dichroism spectra, BMC Bioinformatics 7 (2006) 507-517.
[30] J.G. Lees, B.R. Smith, F. Wien, A.J. Miles and B.A. Wallace, CDtool - An integrated software package for circular dichroism spectroscopic data processing, analysis and archiving, Anal. Biochemistry 332 (2004) 285-289.
[31] B.A. Wallace, F. Wien, A.J. Miles, J.G. Lees, S.V. Hoffman, P. Evans, G.J. Wistow and C. Slingsby, Biomedical applications of synchrotron radiation circular dichroism spectroscopy: Identification of mutant proteins associated with disease and development of a reference database for fold motifs, Faraday Disc. 17 (2004) 653-661.
Modern Techniques for Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy B.A. Wallace and R.W. Janes (Eds.) IOS Press, 2009. © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-000-1-202
Ab Initio Calculations for Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy of Proteins
Benjamin M. Bulheller and Jonathan D. Hirst
School of Chemistry, University of Nottingham
Abstract. Circular dichroism (CD) and synchrotron radiation circular dichroism (SRCD) spectroscopy are widely used techniques for secondary structure determination of proteins. Using the matrix method, protein CD can be calculated from first principles, with an accuracy which is almost quantitative for helical proteins. Thus, CD calculations and experimental data can be used in conjunction to aid structure analysis.
1. Introduction
Ever since the phenomenon of chirality was discovered by Pasteur in 1848 [1], scientists have tried to relate macroscopic chiral effects back to their origin at the molecular level. These effects include, for example, optical activity (the rotation of the polarization plane of linearly polarized light) and circular dichroism (CD), the difference in absorption of left and right circularly polarized light. These phenomena can be explained by the presence of an unsymmetrically substituted atom, a stereo centre. Two enantiomers, exact mirror images of a molecule, show optical rotations of the same magnitude but opposite sign, and their CD spectra are mirror images [2]. By measuring the CD spectrum it is thus possible to determine the absolute configuration of a molecule. However, the obvious catch is that one has to know which mirror image of the spectrum corresponds to which configuration. One way to solve this problem is a stereoselective synthesis of the molecule from precursors of known configuration. This can be difficult and time-consuming depending on the size of the system and may be further complicated if there is more than one stereo centre.
Apart from the determination of the absolute configuration, CD spectroscopy is also widely used because of its sensitivity to the secondary structure of proteins. The three-dimensional structure of a protein can be determined via NMR experiments and X-ray diffraction measurements. Both methods require a rather large amount of substance and the latter is dependent on the ability to grow crystals of the protein. Although CD spectroscopy lacks the spatial resolution of the two mentioned methodologies, it needs only
Figure 1. Valence molecular orbital diagram of the amide group and associated electronic transitions in the far-UV. Orbital labels: π* (antibonding π orbital), πnb (nonbonding π orbital), πb (bonding π orbital), n and n′ (oxygen lone pairs). Transitions: πnb → π* ≈ 190 nm; n → π* ≈ 222 nm.
very small concentrations, is not limited to a maximum molecule size and the measurements can be performed rapidly compared to NMR and X-ray diffraction. The CD spectrum can be used to verify that the conformation of the protein has not changed (e.g. during the course of an experiment) and to determine the approximate fraction of secondary structure types like α-helices and β-strands.
For both applications, the elucidation of the absolute configuration [3–7] and the estimation of secondary structure content, the theoretical prediction of the spectrum can be a powerful tool to help with the interpretation of the experimental spectrum. Therefore, there has been long-standing interest in developing a theoretical framework to calculate the CD spectrum for a given molecular structure.
In 1915, the foundation of the theory describing optical rotatory power was developed by Born [8], which led to the formulation of the Rosenfeld equation in 1928 [9]. The latter allows one to calculate the rotational strength of a transition, which is related to the intensity in the CD spectrum. One of the milestones for polypeptides was the calculation of the optical rotation of a helical peptide by Moffitt [10, 11] using exciton theory in 1956 and by Fitts and Kirkwood [12] using polarisability theory in the same year. In the following decade, three peptide electronic bands were resolved and attributed to the n → π* and π → π* transitions by Doty [13]. Further improvements of these fundamental theories and methodologies and the exponential increase in computational power have greatly improved the possibilities of calculating CD spectra in the last half century. The following sections introduce the methods used to calculate CD from a structure.
2. Circular Dichroism Spectra of Proteins
In the infrared region CD provides information on vibrational transitions [14–18], while in the ultraviolet electronic excitations are the cause of the spectrum. Electronic CD (henceforth simply CD) is more commonly used and is the focus in this chapter. The most important chromophore in peptides is the amide bond, which possesses three π molecular orbitals and two lone pairs on the oxygen atom of the carbonyl group. Those
Figure 2. Characteristic CD spectra of the main secondary structure types (curves: α helical, α/β mixture, β-I, β-II). [θ] is the mean residue ellipticity. Axes: wavelength λ / nm (ticks at 195, 210, 225) against [θ] / deg cm² dmol⁻¹ (ticks from −25000 to 50000).
frontier orbitals are shown in Figure 1. In order to induce an excitation, the incident light needs to have the wavelength corresponding to the energy difference between the ground state and the excited state. The bands in the far-UV (190–250 nm) are caused by electronic transitions from the nonbonding π orbital, πnb, and the oxygen lone pair, n, into the antibonding π orbital, π*. A different arrangement of the groups (another conformation) changes the overlap of the molecular orbitals and their energy levels, while some conformations permit a more constructive interaction than others (affecting the intensity). These effects are the cause of the sensitivity of CD spectroscopy to secondary structure.
Figure 2 shows the characteristic curve shapes caused by the most important structure types. Proteins that are mostly α-helical have intense spectra with a positive band at 190 nm and a pronounced double minimum at 208 and 222 nm. The bands at 190 and 208 nm are caused by the exciton splitting of the πnb → π* transition, while the band at 222 nm originates from the n → π* transition. Regular β-sheet proteins with long and aligned strands (referred to as β-I) exhibit less intense bands and the maximum at 190 nm is slightly red-shifted. Rather unordered proteins (random coil structures) show a CD spectrum different from the α-helical and β-I proteins. The same is observed for a special type of β-rich proteins, referred to as β-II [19], which contain only short strands that are not rigidly aligned. Unordered and β-II type proteins exhibit a similar CD spectrum, showing a negative band at 200 nm and almost no positive bands [20, 21]. These types are still a challenge for CD calculations.
To determine the relative proportion of the secondary structure elements in the system, the far-UV spectrum can be regarded as a sum of fractional multiples of reference spectra for each type of secondary structure [2,22,23].
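The idea of treating a spectrum as a sum of fractional multiples of reference spectra can be sketched as a linear least-squares problem. The example below is only an illustration under stated assumptions: the three "reference spectra" are synthetic Gaussian shapes invented for this sketch (not measured data), the function names `band` and `fit_fractions` are hypothetical, and the fit is unconstrained, whereas practical analysis programs usually add constraints such as non-negative fractions.

```python
import numpy as np

wavelengths = np.arange(190.0, 251.0, 1.0)  # nm

def band(center, width, height):
    """Gaussian band on the shared wavelength grid (synthetic shape only)."""
    return height * np.exp(-0.5 * ((wavelengths - center) / width) ** 2)

# Synthetic stand-ins for reference spectra; these are NOT measured data.
reference = np.column_stack([
    band(192, 6, 60000) + band(208, 6, -30000) + band(222, 7, -32000),  # helix-like
    band(196, 6, 25000) + band(217, 8, -15000),                          # sheet-like
    band(199, 6, -35000),                                                # unordered-like
])

def fit_fractions(spectrum, reference_spectra):
    """Least-squares coefficients f such that reference_spectra @ f ~ spectrum."""
    fractions, *_ = np.linalg.lstsq(reference_spectra, spectrum, rcond=None)
    return fractions

# Build a mixture with known fractions and check that the fit recovers them.
true_fractions = np.array([0.5, 0.3, 0.2])
mixture = reference @ true_fractions
print(np.round(fit_fractions(mixture, reference), 3))
```

Because the mixture is constructed exactly from the reference columns, the fit returns the input fractions; with real, noisy spectra the recovered values depend strongly on the quality and diversity of the reference set.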
The mean residue molar ellipticity at 222 nm, $[\theta]_{222}$, is a measure of the average fractional helicity, $f_H$, which can therefore be determined directly from the spectrum:
\[ f_H = \frac{[\theta]_{222}}{[\theta_{H\infty}]_{222} \cdot \left(1 - \frac{k}{N}\right)} \tag{1} \]
N is the number of residues in the peptide, k an end-effect factor (≈ 3) and $[\theta_{H\infty}]_{222}$ is the ellipticity of an ideal helix of infinite length, estimates of which range from −33,000 deg cm² dmol⁻¹ [24] to −44,000 deg cm² dmol⁻¹ [25]. It has to be borne in mind that the mean residue molar ellipticity at 222 nm can be influenced by several factors [26–28], but in general the relative proportion of the contained secondary structure types can be calculated from the far-UV CD spectrum.
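Eq. (1) is simple enough to evaluate directly, and doing so for both quoted limits of the infinite-helix ellipticity shows how sensitive the estimate is to that choice. The sketch below assumes a hypothetical 20-residue peptide with an illustrative measured value; the function name `fractional_helicity` is not taken from any published program.

```python
def fractional_helicity(theta_222, n_residues, theta_h_inf, k=3.0):
    """Estimate the fractional helicity f_H from Eq. (1).

    theta_222   : mean residue ellipticity at 222 nm (deg cm^2 dmol^-1)
    n_residues  : number of residues N in the peptide
    theta_h_inf : ellipticity of an ideal helix of infinite length
    k           : end-effect factor (approximately 3)
    """
    if n_residues <= k:
        raise ValueError("N must be larger than the end-effect factor k")
    return theta_222 / (theta_h_inf * (1.0 - k / n_residues))


# Hypothetical peptide: N = 20, [theta]_222 = -17000 deg cm^2 dmol^-1.
for theta_inf in (-33000.0, -44000.0):
    f_h = fractional_helicity(-17000.0, 20, theta_inf)
    print(f"[theta]_Hinf = {theta_inf:8.0f}  ->  f_H = {f_h:.2f}")
```

For this illustrative input the two limits give clearly different helicities, which is one reason why the reference value used should always be reported alongside the estimate.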
3. Theory
A molecule consists of its nuclei and the electrons which populate the molecular orbitals. An electronic transition, e.g. by an incident light beam, causes an electron to be excited into orbitals of higher energy, which changes the structure of the orbitals. The wavefunction describes the molecular orbitals of the system which in turn represent a certain state of the molecule. This can be either the ground state or an excited state after perturbation by light. The wavefunction, therefore, contains information about the positions and spins of all nuclei and electrons. If the wavefunction is known then all measurable quantities can be calculated by evaluating different operators on this function. A quantum mechanical operator, such as the Hamiltonian operator, accounts for observable quantities like the energy or the dipole moment of the system. The actual expectation value of this measurement (e.g. of the energy or the dipole moment) can be determined by calculating the average value of the operator evaluated on that state.
If an operator is applied to a function, it simply modifies it and the result is another function. For example, the function sin x may be changed into cos x by an operator; in this case the operator is differentiation with respect to x. In some cases, the resulting function is the same as the initial function multiplied by a constant (i.e. the operator could simply be multiplication by a constant). Such functions are then called eigenfunctions of this operator, and the constant is the eigenvalue. When applying operators to wavefunctions, eigenvalue equations are a commonly encountered problem because quantum theory uses both matrices and differential equations. The most important example of an eigenvalue equation is the time-independent Schrödinger equation:
\[ \hat{H}\psi^k = E^k \psi^k \tag{2} \]
The Hamiltonian $\hat{H}$ is an operator applied to the wavefunction $\psi^k$, which represents a state k of the molecule. If $\hat{H}$ is a matrix, this eigenvalue problem can be solved by a diagonalization of this matrix, yielding the eigenvalues $E^k$, the energies which solve this equation. In the case of CD, the eigenvalues are electronic transition energies. A given state can exhibit several transitions, each depending on the specific energy provided by the incident light.
Light consists of an electric and a magnetic field, oscillating at right angles to each other. Both fields interact with the molecular orbitals and may provoke an electronic excitation, that is a relocation of charge within the chromophore caused by this interaction. During this transition the two fields give rise to an electric and a magnetic dipole moment, which possess fixed directions with respect to the geometry of the group. The electric field causes a linear displacement of charge and, in the case of the peptide bond, the electric transition dipole moment, μ, is directed along the carbonyl bond, while the magnetic field provokes a circulation of charge. The magnetic transition dipole moment, m, can be envisioned as a light-induced rotation of charge around the oxygen atom. Both effects happen simultaneously and the combination of a linear and circular displacement leads to a helical motion of charge, which is the cause of the different interaction with left and right circularly polarized light [22]. A chiral solution possesses different refractive indices for these two types of light. The beams travel at different speeds and their extinction coefficients are different at each wavelength. This difference is measured as the differential absorbance Δε during the CD experiment:
\[ \Delta\varepsilon = \varepsilon_L - \varepsilon_R \tag{3} \]
Each transition caused by the light affects the absorbance and thus contributes to Δε. The observable of this interaction for an excitation from the ground state 0 to the excited state k is the rotational strength, $R_{0k}$. Using the Rosenfeld equation [9], $R_{0k}$ can be calculated from the transition dipole moments μ and m:
\[ R_{0k} = \mathrm{Im}\left( \langle \psi^0 | \vec{\mu} | \psi^k \rangle \cdot \langle \psi^k | \vec{m} | \psi^0 \rangle \right), \tag{4} \]
where $\psi^0$ and $\psi^k$ denote the wavefunctions of the ground state and the excited state, respectively. The quantum mechanical operator for the magnetic transition moment involves the imaginary number i, so that the magnetic transition moment is an imaginary quantity. The real part of the result of Eq. (4) is therefore zero and the imaginary part has to be taken, which is denoted by the prefix Im. The angle bracket syntax (Dirac or bra-ket notation) denotes the integral over the wavefunctions of the ground and excited states using the respective transition dipole moment operator.
The information necessary to calculate the transition energies of electronic excitations is, therefore, the wavefunction, which requires the solution of the Schrödinger equation. The computational effort to achieve this scales rapidly with the number of atoms. Hence it is vital to utilize approximations to make calculations on larger systems tractable. For the calculation of optical spectra of proteins, several methods have been developed to cope with the size of these systems. The general approach is to divide the system into independent groups, which are first treated individually, after which their mutual interactions are calculated. Groups may be considered independent if the orbital overlap between the groups is minimal and electron exchange negligible. For optical spectra, the groups of interest are chromophores which interact with the incident light, for example the peptide bonds of proteins or aromatic systems.
The dipole interaction model [29–32], formulated by DeVoe [33,34] and extended by Applequist [35,36], regards chromophores as point dipole oscillators which interact with each other and the applied electromagnetic field. The model requires the polarisability of each atom forming the group, that is, its tendency to respond to the perturbation by an external electric field.
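As a small numerical illustration of the Rosenfeld equation, Eq. (4), the sketch below evaluates the imaginary part of the scalar product of an electric and a magnetic transition dipole moment. The vectors are invented values in arbitrary units, chosen only to show how the sign of the rotational strength follows the relative orientation of the two moments; the function name `rotational_strength` is hypothetical.

```python
def rotational_strength(mu, m):
    """Rosenfeld equation, Eq. (4): R = Im(mu . m).

    mu : electric transition dipole moment, three (real) components
    m  : magnetic transition dipole moment, three (imaginary) components
    """
    dot = sum(complex(a) * complex(b) for a, b in zip(mu, m))
    return dot.imag


# Illustrative vectors in arbitrary units (not from any real chromophore).
mu = (1.0, 0.0, 0.0)                 # electric moment along x
m_parallel = (0.5j, 0.0, 0.0)        # magnetic moment parallel to mu
m_antiparallel = (-0.5j, 0.0, 0.0)   # mirror-image arrangement
m_perpendicular = (0.0, 0.5j, 0.0)   # magnetic moment perpendicular to mu

print(rotational_strength(mu, m_parallel))       # positive
print(rotational_strength(mu, m_antiparallel))   # negative, same magnitude
print(rotational_strength(mu, m_perpendicular))  # zero
```

Parallel moments correspond to the helical charge motion described above; reversing one of them mirrors the helix and flips the sign, and perpendicular moments give no rotational strength.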
Another methodology using the independent group approach without the need for polarisabilities is the matrix method [37]. It has become popular for CD calculations on systems such as proteins and crystals. Employing ab initio results of the single groups, the matrix method can be used to calculate CD from first principles with some success. Therefore, it shall be discussed in detail.
4. The Matrix Method
In 1962, Tinoco developed a method to compute protein CD [38], which involved perturbation theory. It was superseded by the matrix method, which improved the original procedure by solving the eigenvalue problem via a matrix diagonalization. This is more accurate than perturbation theory and easily implementable in computer programs. The general approach of both methods is to regard the chromophores of the system separately. The electrons are thus localized on the individual groups and may only be excited into higher energetic states of the same chromophore. Since the excitations are an interaction between different states, each of those states needs to be described separately. The wavefunction of state k of a protein with M chromophores and $n_i$ states on the respective group is considered as the sum of electronic configurations of the independent chromophores:
\[ \psi^k = \sum_{i}^{M} \sum_{a}^{n_i} c^k_{ia} \Phi_{ia} \tag{5} \]
$c^k_{ia}$ are expansion coefficients, which for example account for constructive (in-phase) or destructive (out-of-phase) interaction of states. $\Phi_{ia}$ are electronic configurations, that is, products of wavefunctions of the single chromophores where only one chromophore i is allowed to be in an excited state a:
\[ \Phi_{ia} = \varphi_{10} \ldots \varphi_{ia} \ldots \varphi_{j0} \ldots \varphi_{M0} \tag{6} \]
where $\varphi_{ia}$ denotes the wavefunction of chromophore i after the transition 0 → a. Since in this formulation only one group may exist in the excited state a, all others are in the ground state 0:
\[ \psi^0 = \varphi_{10} \ldots \varphi_{i0} \ldots \varphi_{j0} \ldots \varphi_{M0} \tag{7} \]
For a given protein it is therefore possible to create equations for its wavefunctions, $\psi^k$, by considering its chromophores and the electronic states of these. Solving the Schrödinger equation using these wavefunctions yields the energies, $E^k$, of each electronic excited state:
\[ \hat{H}\psi^k = E^k \psi^k \tag{8} \]
To solve Eq. (8) the Hamiltonian $\hat{H}$ has to be constructed and, because the system is regarded as a sum of individual chromophores, $\hat{H}$ itself is a sum of the local Hamiltonians, $\hat{H}_i$, of these groups. As the chromophores interact with each other, the intergroup potentials, $\hat{V}_{ij}$, also have to be accounted for:
\[ \hat{H} = \sum_{i=1}^{M} \hat{H}_i + \sum_{i=1}^{M-1} \sum_{j=i+1}^{M} \hat{V}_{ij} \tag{9} \]
The interaction terms between the groups are the actual cause for the dependency of protein CD spectra on secondary structure. These interactions are generally assumed to be electrostatic in nature and the terms Vij are integrals of charge densities over the whole system. These integrals involve all interacting groups in their ground and excited states
Figure 3. Monopole distributions for the n → π* transition (left) and π → π* transition (right).
and would become computationally prohibitive for proteins. Therefore, the monopole-monopole approximation is commonly used to make the calculations practicable. The permanent and transition charge densities are approximated by point charges and the integrals can be replaced by a sum of the Coulomb interactions of these monopoles. Hence, the form of the interaction terms in Eq. (9) is
\[ V_{i0a;j0b} = \frac{1}{4\pi\varepsilon_0} \sum_{s=1}^{N_s} \sum_{t=1}^{N_t} \frac{q_s q_t}{r_{st}} \tag{10} \]
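The double sum in Eq. (10) translates almost literally into code. The following is a minimal sketch, assuming charges given in units of the elementary charge and positions in Å so that the result is in eV; the two monopole sets are invented placeholders (each a pair of opposite charges) and do not correspond to any published parameter set.

```python
import math

COULOMB_EV_ANGSTROM = 14.3996  # e^2 / (4 pi eps0) expressed in eV * Angstrom


def monopole_interaction(monopoles_i, monopoles_j):
    """Eq. (10): Coulomb interaction between two sets of point charges.

    Each monopole is a tuple (charge in e, (x, y, z) in Angstrom).
    Returns the interaction energy in eV.
    """
    total = 0.0
    for q_s, r_s in monopoles_i:
        for q_t, r_t in monopoles_j:
            total += q_s * q_t / math.dist(r_s, r_t)
    return COULOMB_EV_ANGSTROM * total


# Hypothetical monopole sets: two small charge pairs 5 Angstrom apart.
group_i = [(+0.1, (0.0, 0.0, 0.0)), (-0.1, (1.0, 0.0, 0.0))]
head_to_tail = [(+0.1, (5.0, 0.0, 0.0)), (-0.1, (6.0, 0.0, 0.0))]
side_by_side = [(+0.1, (0.0, 5.0, 0.0)), (-0.1, (1.0, 5.0, 0.0))]

print(f"head-to-tail: {monopole_interaction(group_i, head_to_tail):+.5f} eV")
print(f"side-by-side: {monopole_interaction(group_i, side_by_side):+.5f} eV")
```

The two arrangements give interaction energies of opposite sign, which illustrates how the geometry of the groups, and hence the secondary structure, enters the Hamiltonian through these terms.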
$N_s$ and $N_t$ are the numbers of monopole charges, q, on the chromophores i and j. The distribution and charges of these monopoles are therefore the crucial parameters for matrix method calculations. For each electronic transition a set of monopoles represents the electrostatic field of the nuclei and electrons, and should reproduce well the magnitudes and orientations of the transition dipole moments. Figure 3 shows possible charge distributions for the n → π* and π → π* states, but many distributions are possible, as long as the electrostatic potential is reproduced. Matrix method CD calculations are based solely on these point charges fitted to the 3D structure of the protein.
To calculate the excitation energies and wavefunction coefficients, $c^k_{ia}$, it is convenient to convert Eq. (9) into a matrix formalism. The matrix is constructed from the transition energies of each transition on each group as diagonal elements and the interaction terms, $V_{ij}$, as off-diagonal elements. Considering, for example, a dipeptide with two transitions per group, the Hamiltonian matrix (derived from Eq. (9)) would take the form
\[
\hat{H} =
\begin{pmatrix}
E_{1n\pi^*} & V_{1n\pi^*;1\pi\pi^*} & V_{1n\pi^*;2n\pi^*} & V_{1n\pi^*;2\pi\pi^*} \\
V_{1n\pi^*;1\pi\pi^*} & E_{1\pi\pi^*} & V_{1\pi\pi^*;2n\pi^*} & V_{1\pi\pi^*;2\pi\pi^*} \\
V_{1n\pi^*;2n\pi^*} & V_{1\pi\pi^*;2n\pi^*} & E_{2n\pi^*} & V_{2n\pi^*;2\pi\pi^*} \\
V_{1n\pi^*;2\pi\pi^*} & V_{1\pi\pi^*;2\pi\pi^*} & V_{2n\pi^*;2\pi\pi^*} & E_{2\pi\pi^*}
\end{pmatrix}
\tag{11}
\]
The diagonalization of the matrix is done via a unitary transformation. This procedure finds a matrix U which transforms $\hat{H}$ into a diagonal matrix (i.e. one in which all the off-diagonal elements are zero):
\[ U^{-1} \cdot \hat{H} \cdot U = H_{\mathrm{diag}} \tag{12} \]
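The construction and diagonalization in Eqs. (11) and (12) can be sketched with NumPy's symmetric eigensolver. All numbers below are hypothetical: the local energies are chosen to lie near 222 nm and 190 nm, and the interaction terms are invented small values, so the output illustrates the procedure rather than any real dipeptide.

```python
import numpy as np

labels = ["1 n-pi*", "1 pi-pi*", "2 n-pi*", "2 pi-pi*"]

# Hypothetical local transition energies in eV (about 222 nm and 190 nm).
E_NPI, E_PIPI = 5.58, 6.53

# Hypothetical interaction terms in eV, keyed by pairs of basis-state indices.
couplings = {(0, 1): 0.02, (0, 2): 0.005, (0, 3): 0.03,
             (1, 2): -0.025, (1, 3): 0.08, (2, 3): 0.02}

# Eq. (11): transition energies on the diagonal, interactions off the diagonal.
H = np.diag([E_NPI, E_PIPI, E_NPI, E_PIPI])
for (a, b), v in couplings.items():
    H[a, b] = H[b, a] = v            # the Hamiltonian matrix is symmetric

# Eq. (12): eigh returns the eigenvalues and the matrix U of eigenvectors.
energies, U = np.linalg.eigh(H)      # columns of U are the eigenvectors
H_diag = U.T @ H @ U                 # U^-1 equals U^T for a real orthogonal U

HC_EV_NM = 1239.84                   # h*c in eV nm, converts energy to wavelength
for k, e in enumerate(energies):
    dominant = labels[int(np.argmax(np.abs(U[:, k])))]
    print(f"state {k}: E = {e:.3f} eV, lambda = {HC_EV_NM / e:.1f} nm, "
          f"mainly {dominant}")
```

With non-zero coupling the two formerly degenerate π → π* energies split, which is the exciton splitting referred to earlier in the chapter.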
The diagonal elements of $H_{\mathrm{diag}}$ are the eigenvalues of this diagonalization and represent the transition energies of the interacting system. The matrix U is formed by the respective eigenvectors, which are the expansion coefficients, $c^k_{ia}$. With these, the initial electric transition dipole moments, $\vec{\mu}^{0a}$, and magnetic transition dipole moments, $\vec{m}^{0a}$, of the single groups can be transformed into the vectors of the whole, interacting system ($\vec{\mu}_i$ and $\vec{m}_i$ respectively):
\[ \vec{\mu}_i = \sum_a U_{ai}\,\vec{\mu}^{0a}, \qquad \vec{m}_i = \sum_a U_{ai}\,\vec{m}^{0a} \tag{13} \]
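The transformation in Eq. (13), followed by the Rosenfeld equation, can be illustrated with a two-state toy system. This is a sketch under stated assumptions: the Hamiltonian and the local moments are invented values in arbitrary units, and the magnetic moments are supplied directly, without the origin-dependent contributions a real calculation would have to include.

```python
import numpy as np

# Hypothetical two-state Hamiltonian (eV): degenerate transitions, coupling 0.1.
H = np.array([[6.50, 0.10],
              [0.10, 6.50]])

# Invented local transition moments, one row per basis state (arbitrary units).
mu_local = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])
m_local = 1j * np.array([[0.0, 0.2, 0.0],
                         [0.2, 0.0, 0.0]])

energies, U = np.linalg.eigh(H)

# Eq. (13): row k of the result is sum_a U[a, k] * (local moment of state a).
mu_sys = U.T @ mu_local
m_sys = U.T @ m_local

# Eq. (4): rotational strength of each state of the interacting system.
R = np.imag(np.sum(mu_sys * m_sys, axis=1))

for e, r in zip(energies, R):
    print(f"E = {e:.2f} eV   R = {r:+.2f}")
```

In this toy case neither local transition has any rotational strength on its own; the coupling alone produces a pair of bands with opposite signs whose rotational strengths sum to zero.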
The steps yield, for each transition, an energy and a set of transition dipole moments. While the energy can be trivially converted into the wavelength of the transition, the respective rotational strength needs to be calculated from the electric and magnetic transition dipole moments using the Rosenfeld equation, Eq. (4) [9]. The first result is a line spectrum, showing sharp lines for each transition. In an experimental spectrum, however, the transitions are broadened due to the uncertainty principle and interactions with other chromophores and the solvent. The observed bands are of approximately Gaussian shape and therefore the calculated line spectrum is convolved with a Gaussian function. The band width is set to a fixed value for all transitions; 12.5 nm has proven to give the best results when comparing to experimental spectra.
While one might wish to calculate as much as possible ab initio, this is currently too challenging for whole proteins, but the independent chromophores considered in the matrix method are of a feasible size for high-level ab initio calculations. N-methylacetamide (NMA) is a model compound representing a single peptide bond [39]. The electronic spectrum of NMA was calculated using the complete-active-space self-consistent-field method [40] implemented within a self-consistent reaction field (CASSCF/SCRF) [41–43] and combined with multi-configurational second-order perturbation theory (CASPT2-RF) [44, 45]. For each state a set of monopoles was determined by fitting their electrostatic potential to the ab initio potential for that state in a least-squares sense. The same was done for a number of aromatic side chains [46] and selected charge-transfer chromophores [47]. A set of 32 monopoles at a distance of 0.1 Å from the C, N, O and H atoms was used for the n → π* transition (Figure 3) and 20 monopoles for the π → π* transition, with a charge placed at each atom centre and four around it at a distance of 0.05 Å each.
A special type of transition does not happen within a single group like a peptide bond, but involves a charge-transfer from one group to a neighbouring group. Although matrix method calculations assume negligible interactions between groups, two adjacent peptide groups can be considered as one group and the electrostatic potential of the transition calculated for the charge-transfer transition. In addition to the local transitions there are in fact four additional states for transitions from an orbital of one of the peptide groups into the π* orbital of the other group. Two transitions involve an excitation from a lone pair orbital (n1 → π2* and n2 → π1*), while two consider a transition from a π orbital (π1 → π2* and π2 → π1*).
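The broadening step described above, in which each line of the calculated spectrum is replaced by a Gaussian band, can be sketched as follows. Two assumptions are made loudly here: the 12.5 nm band width is interpreted as a full width at half maximum (the text does not specify the definition), and the line positions and strengths are invented values in arbitrary units. The function name `broaden` is hypothetical.

```python
import math


def broaden(lines, wavelengths, fwhm=12.5):
    """Sum of Gaussian bands, one per (wavelength, strength) line.

    fwhm is treated as the full width at half maximum in nm; this
    interpretation of the quoted band width is an assumption.
    """
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    spectrum = []
    for lam in wavelengths:
        total = 0.0
        for lam0, strength in lines:
            total += strength * math.exp(-0.5 * ((lam - lam0) / sigma) ** 2)
        spectrum.append(total)
    return spectrum


# Hypothetical line spectrum (nm, arbitrary units), loosely helix-like in sign.
line_spectrum = [(190.0, 1.0), (208.0, -0.5), (222.0, -0.4)]
grid = [180.0 + 5.0 * i for i in range(15)]   # 180 to 250 nm

for lam, value in zip(grid, broaden(line_spectrum, grid)):
    print(f"{lam:5.0f} nm  {value:+.3f}")
```

Because neighbouring bands of opposite sign overlap and partly cancel, the broadened curve can differ markedly from the underlying line spectrum, which is why the choice of band width affects the comparison with experiment.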
Figure 4. Comparison of calculation with experiment: horse myoglobin, mainly α-helical (left) and γ-B-crystallin, mainly β-strand (right).
5. Comparing Theory and Experiment
To assess the quality of the matrix method calculations, the calculated and experimental spectra of a representative set of proteins have to be compared and analyzed in order to get a statistically meaningful result. A useful statistic is the Spearman rank correlation coefficient [48] between the experimental and calculated intensity at each wavelength, taken over all proteins in the set. The coefficient is a measure of the strength of the association between the experimental and calculated spectra and ranges from 1 (perfect correlation) through 0 (no correlation) to −1 (inverse correlation). The Spearman rank correlation is a nonparametric statistic which does not require any assumptions about the probability distribution of the data, which makes it more robust than parametric methods when that distribution is unknown.
The accuracy of the calculations was assessed on a set of 47 proteins in the far-UV [49] and a collection of 31 spectra recorded down to 161 nm allowed an analysis in the deep-UV [47]. With conventional CD spectrometers it is not possible to reach far below 180 nm and the deep-UV is only accessible with Synchrotron Radiation CD (SRCD) [50–53]. The extended range provides additional features and therefore more information about the secondary structure than spectra recorded using conventional spectrometers [54–56]. The analysis of unknown spectra is supported by comparison with the spectra of proteins with known secondary structure and, in order to create a data bank for such analyses, 71 proteins were selected for their diversity of secondary structure and recorded from 240 nm down to 175 nm [57,58]. About 18% of the proteins in this set are mainly α-helical, 39% show mainly β-strand structures, 34% are αβ mixtures and the rest contain multiple classes or are unstructured.
Figure 4, left, compares the matrix method calculations with experiment for a typical α-helical protein. The computed spectra of this class show the best correlation with experiment. The intensity at 222 nm is reproduced almost quantitatively. All α-helical proteins in this set possess a positive intensity around 175 nm and a considerable improvement of the correlation is shown if charge-transfer and side chain transitions are taken into account. The major part of this improvement is caused by the consideration of charge-transfer, which reproduces the positive intensity compared to the negative band for the backbone-only calculation.
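The Spearman rank correlation used for these comparisons is the Pearson correlation of the ranks of the two data sets. The sketch below implements it with the standard library only, using average ranks for ties; the intensity values are invented placeholders standing in for experimental and calculated intensities of several proteins at one wavelength, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for idx in order[pos:end + 1]:
            ranks[idx] = mean_rank
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


# Invented intensities at one wavelength for five hypothetical proteins.
experimental = [-30000.0, -12000.0, -8000.0, -21000.0, -3000.0]
calculated = [-41000.0, -9000.0, -10000.0, -25000.0, -1000.0]
print(f"rho = {spearman(experimental, calculated):.2f}")
```

Only the ordering of the values matters, so a calculation that systematically overestimates intensities can still score highly as long as it ranks the proteins correctly.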
Figure 5. Spearman rank correlation coefficient between the experimental and calculated intensity at a particular wavelength. A set of 71 proteins was calculated taking only backbone chromophores into account (dotted) and considering charge-transfer, aromatic side chains and backbone chromophores (solid).
Table 1. Spearman rank correlation coefficients between the calculated and experimental spectra considering different electronic excitations.
Chromophores considered          | 175 nm | 190 nm | 200 nm | 222 nm
peptide backbone                 |  0.12  |  0.79  | –0.16  |  0.91
+ charge-transfer                |  0.73  |  0.78  | –0.26  |  0.89
+ charge-transfer + side chains  |  0.79  |  0.75  | –0.09  |  0.88
The right panel of Figure 4 shows a β-I-class protein. Compared to α-helical proteins the intensity is much smaller and the band at 208 nm is barely resolved. The correlation with the calculations is worse for this class than for helical proteins. All β proteins in the set show a negative intensity at 175 nm and the consideration of the charge-transfer and side chain transitions again improves the correlation.
Figure 5 shows the Spearman rank correlation coefficients at each wavelength considering only backbone chromophores and taking into account charge-transfer and side chains. The exact coefficients at several important wavelengths are given in Table 1. Between 210 and 230 nm there is a very good correlation, between 0.8 and 0.9. This is especially important since the intensity at 222 nm is used to determine the helical content of the protein, and the coefficient of about 0.9 indicates an almost quantitative calculation of this intensity. The region between 190 and 210 nm is a challenge since the CD intensity changes sign around 200 nm, where the gradient of the curve is greatest. So even though the absolute error in the wavelength of the zero crossing is only 2 to 3 nm, the error in intensity is very large due to the gradient. The consideration of aromatic side chains yields a slight improvement.
Figure 6. Atom labels for aromatic side chains.
In the vacuum-UV at 175 nm the backbone only calculations show a correlation coefficient of 0.12 and hence almost no agreement. However, the inclusion of side chains and charge-transfer yields an improvement to 0.79, the major part of this being caused by charge-transfer transitions. This shows that the spectral features provided by SRCD measurements can be modelled by matrix method calculations.
6. Performing Matrix Method Calculations using DichroCalc
Having explained the theory behind circular dichroism calculations using the matrix method, we conclude this chapter by describing a practical means for a nonspecialist to perform such calculations. A freely available web interface, DichroCalc, allows one to set up matrix method calculations online and sends the results back via email. DichroCalc can be reached at http://comp.chem.nottingham.ac.uk [59]. To submit a job, a Protein Data Bank (PDB) file can be retrieved via its PDB code from the RCSB Protein Data Bank [60] or directly uploaded. To conduct calculations on a number of files, an archive containing them may be submitted.
The input files for DichroCalc are PDB files according to the standards of the RCSB Protein Data Bank. For the files to be interpreted correctly, the ATOM section is most important (the header and footer information is discarded, as are the HETATM lines). The chromophores for the calculation are assigned using the atom types of the respective group. A peptide bond, for example, is designated by the atom labels C, N and O; a charge-transfer group consists of two peptide bonds. The atom labels of aromatic side chains are shown in Figure 6. The program returns the chromophore assignments in the log files so the user can verify the input data for the calculations. All available chromophores can be selected individually to test, for example, the influence of phenylalanine or the charge-transfer chromophores. All calculated spectra are automatically plotted as postscript files (including the experimental spectrum, if provided).
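To make the assignment step more concrete, the sketch below shows one way peptide groups could be collected from the ATOM records of a PDB file using the fixed-column layout of the PDB format. It is an illustration only and not DichroCalc's actual code: the function names, the pairing rule (C and O of one residue with N of the next residue in the same chain) and the sample coordinates are all assumptions made for this example.

```python
def atom_line(record, serial, name, res, chain, resseq, x, y, z):
    """Format a minimal PDB-style coordinate line (helper for the example)."""
    return (f"{record:<6}{serial:>5}  {name:<3} {res:>3} {chain}{resseq:>4}"
            f"    {x:8.3f}{y:8.3f}{z:8.3f}")


def peptide_groups(pdb_lines):
    """Collect C, O of residue i and N of residue i+1 from ATOM records."""
    backbone = {}   # (chain, residue number) -> {atom name: (x, y, z)}
    order = []      # residues in order of first appearance
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue                      # HETATM and header lines are skipped
        name = line[12:16].strip()
        if name not in ("C", "O", "N"):
            continue
        key = (line[21], int(line[22:26]))
        if key not in backbone:
            backbone[key] = {}
            order.append(key)
        backbone[key][name] = (float(line[30:38]), float(line[38:46]),
                               float(line[46:54]))
    groups = []
    for here, nxt in zip(order, order[1:]):
        if here[0] != nxt[0]:
            continue                      # do not pair residues across chains
        a, b = backbone[here], backbone[nxt]
        if "C" in a and "O" in a and "N" in b:
            groups.append({"residues": (here, nxt),
                           "C": a["C"], "O": a["O"], "N": b["N"]})
    return groups


# Hypothetical three-residue fragment plus one water molecule (HETATM).
sample = []
serial = 1
for resseq in (1, 2, 3):
    for i, name in enumerate(("N", "CA", "C", "O")):
        sample.append(atom_line("ATOM", serial, name, "ALA", "A", resseq,
                                3.8 * resseq + 0.5 * i, 1.0 * i, 0.0))
        serial += 1
sample.append(atom_line("HETATM", serial, "O", "HOH", "A", 4, 20.0, 0.0, 0.0))

for group in peptide_groups(sample):
    print(group["residues"], group["C"], group["O"], group["N"])
```

For the three-residue fragment the function reports two peptide groups, and the water molecule is ignored because its record is a HETATM line.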
7. Summary
Circular dichroism spectra in the ultraviolet region provide information about protein folding. The method is very quick and not restricted by the size of the protein. These advantages and the importance of the determination of protein secondary structure have fuelled the interest in the theoretical prediction of CD spectra. The matrix method in conjunction with the monopole-monopole approximation makes it possible to tackle large systems such as proteins and calculate their CD spectra. Matrix method parameters derived from ab initio calculations are available for peptide bonds, aromatic side chain groups and charge-transfer chromophores, and a correlation between calculation and experiment of up to 0.9 is reached at 222 nm. Although not fully quantitative, such calculations can be useful, for example in the context of structural modelling.
References
[1] L. Pasteur, Mémoire sur la relation qui peut exister entre la forme cristalline et la composition chimique, et sur la cause de la polarisation rotatoire (note on the relationship of crystalline form to chemical composition, and on the cause of rotatory polarization), C. R. Acad. Sci. 26 (1848) 535–539.
[2] N. Berova, K. Nakanishi and R.W. Woody, eds., Circular Dichroism: Principles and Applications, Wiley-VCH, New York (2000).
[3] K.M. Specht, J. Nam, D.M. Ho, N. Berova, R.K. Kondru, D.N. Beratan, P. Wipf, R.A. Pascal and D. Kahne, Determining absolute configuration in flexible molecules: A case study, J. Am. Chem. Soc. 123 (2001) 8961–8966.
[4] K. Tanaka, Y. Itagaki, M. Satake, H. Naoki, T. Yasumoto, K. Nakanishi and N. Berova, Three challenges toward the assignment of absolute configuration of gymnocin-B, J. Am. Chem. Soc. 127 (2005) 9561–9570.
[5] P. Butz, G.E. Tranter and J.P. Simons, Molecular conformation in the gas phase and in solution, PhysChemComm 5 (2002) 91–93.
[6] F. Furche, R. Ahlrichs, C. Wachsmann, E. Weber, A. Sobanski, F. Vogtle and S. Grimme, Circular dichroism of helicenes investigated by time-dependent density functional theory, J. Am. Chem. Soc. 122 (2000) 1717–1724.
[7] M. Schreiber, R. Vahrenhorst, V. Buss and M.P. Fülscher, Ab initio (CASPT2) excited state calculations, including circular dichroism, of helically twisted cyanine dyes, Chirality 13 (2001) 571–576.
[8] M. Born, The natural optical activity of liquids and gases, Physikal. Z. 16 (1915) 251–258.
[9] L. Rosenfeld, Quantenmechanische Theorie der natürlichen optischen Aktivität von Flüssigkeiten und Gasen, Z. Phys. 52 (1928) 161–174.
[10] W. Moffitt, Optical rotatory dispersion of helical polymers, J. Chem. Phys. 25 (1956) 467–478.
[11] W. Moffitt, The optical rotatory dispersion of simple polypeptides, Proc. Natl. Acad. Sci. U. S. A. 42 (1956) 736–746.
[12] D.D. Fitts and J.G. Kirkwood, The optical rotatory power of helical molecules, Proc. Natl. Acad. Sci. U. S. A. 42 (1956) 33–36.
[13] G. Holzwarth and P. Doty, Ultraviolet circular dichroism of polypeptides, J. Am. Chem. Soc. 87 (1965) 218–228.
[14] L. Nafie, J.C. Cheng and P.J. Stephens, Vibrational circular dichroism of 2,2,2-trifluoro-1-phenylethanol, J. Am. Chem. Soc. 97 (1975) 3842–3843.
[15] L. Nafie and M. Diem, Optical activity in vibrational transitions: Vibrational circular dichroism and Raman optical activity, Acc. Chem. Res. 12 (1979) 296–302.
[16] T.A. Keiderling, Vibrational circular dichroism, Appl. Spectrosc. Rev. 17 (1981) 189–226.
[17] L.D. Barron, Molecular Light Scattering and Optical Activity, Cambridge University Press, Cambridge (1982).
[18] G. Holzwarth, E.C. Hsu, H.S. Mosher, T.R. Faulkner and A. Moscowitz, Infrared circular dichroism of carbon-hydrogen and carbon-deuterium stretching modes. Observations, J. Am. Chem. Soc. 96 (1974) 251–252.
[19] P. Manavalan and W.C. Johnson, Sensitivity of circular dichroism to protein tertiary structure class, Nature 305 (1983) 831–832.
[20] N. Sreerama and R.W. Woody, Poly(Pro)II helices in globular proteins: identification and circular dichroic analysis, Biochemistry 33 (1994) 10022–10025.
[21] N. Sreerama and R.W. Woody, Structural composition of βI- and βII-proteins, Protein Sci. 12 (2003) 384–388.
[22] C.R. Cantor and P.R. Schimmel, Biophysical Chemistry, W. H. Freeman, San Francisco (1980).
[23] W.C. Johnson, Protein secondary structure and circular dichroism - a practical guide, Proteins: Struct., Funct., Genet. 7 (1990) 205–214.
[24] J.T. Yang, C.S.C. Wu and H.M. Martinez, Calculation of protein conformation from circular dichroism, Methods Enzymol. 130 (1986) 208–269.
[25] P.Z. Luo and R.L. Baldwin, Mechanism of helix induction by trifluoroethanol: A framework for extrapolating the helix-forming properties of peptides from trifluoroethanol/water mixtures back to water, Biochemistry 36 (1997) 8413–8421.
[26] C.D. Andrew, S. Bhattacharjee, N. Kokkoni, J.D. Hirst, G.R. Jones and A.J. Doig, Stabilizing interactions between aromatic and basic side chains in α-helical peptides and proteins. Tyrosine effects on helix circular dichroism, J. Am. Chem. Soc. 124 (2002) 12706–12714.
[27] Z. Dang and J.D. Hirst, Short hydrogen bonds, circular dichroism, and over-estimates of peptide helicity, Angew. Chem. Int. Ed. 40 (2001) 3619–3621.
[28] S. Bhattacharjee, G. Toth, S. Lovas and J.D. Hirst, Influence of tyrosine on the electronic circular dichroism of helical peptides, J. Phys. Chem. B 107 (2003) 8682–8688.
[29] B.K. Sathyanarayana and J. Applequist, Theoretical π-π* absorption and circular dichroic spectra of cyclic dipeptides, Int. J. Pept. Protein Res. 26 (1985) 518–527.
[30] K.A. Bode and J.
Applequist, Improved theoretical π → π ∗ absorption and circular dichroic spectra of helical polypeptides using new polarizabilities of atoms and nc’o chromophores, J. Phys. Chem. 100 (1996) 17825–17834. [31] K.L. Carlson, S.L. Lowe, M.R. Hoffmann and K.A. Thomasson, Theoretical uv circular dichroism of cyclo(l-proline-l-proline), J. Phys. Chem. A 110 (2006) 1925–1933. [32] S.L. Lowe, R.R. Pandey, J. Czlapinski, G. Kie-Adams, M.R. Hoffmann, K.A. Thomasson and K.S. Pierce, Dipole interaction model predicted π → π ∗ circular dichroism of cyclo(l-pro)3 using structures created by semi-empirical, ab initio, and molecular mechanics methods, J. Pept. Res. 61 (2002) 189–201. [33] H. DeVoe, Optical properties of molecular aggregates. 1. Classical model of electronic absorption + refraction, J. Chem. Phys. 41 (1964) 393–400. [34] H. DeVoe, Optical properties of molecular aggregates. 2. Classical theory of refraction absorption and optical activity of solutions and crystals, J. Chem. Phys. 43 (1965) 3199–3208. [35] J. Applequist, Full polarizability treatment of the π → π ∗ absorption and circular dichroic spectra of α-helical polypeptides, J. Chem. Phys. 71 (1979) 4332–4338. [36] J. Applequist, K.R. Sundberg, M.L. Olson and L.C. Weiss, Normal mode treatment of optical properties of a classical coupled dipole oscillator system with lorentzian band shapes, J. Chem. Phys. 70 (1979) 1240–1246. [37] P.M. Bayley, E.B. Nielsen and J.A. Schellman, The rotatory properties of molecules containing two peptide groups, J. Phys. Chem. 73 (1969) 228–243. [38] I. Tinoco, Theoretical aspects of optical activity. 2. polymers, Adv. Chem. Phys. 4 (1962) 113–160. [39] N.A. Besley and J.D. Hirst, Theoretical studies toward quantitative protein circular dichroism calculations, J. Am. Chem. Soc. 121 (1999) 9636–9644. [40] B.O. Roos, P.R. Taylor and P.E.M. Siegbahn, A complete active space scf method (casscf) using a density matrix formulated super-ci approach, Chem. Phys. 48 (1980) 157–173. 
[41] G. Karlström, A new approach to the modeling of dielectric media effects in ab initio quantum chemical calculations, J. Phys. Chem. 92 (1988) 1315–1318. [42] G. Karlström, Electronic structure of hf- and hcl- in condensed phases studied by a casscf dielectric cavity model, J. Phys. Chem. 93 (1989) 4952–4955. [43] L. Serrano-Andrés and M.P. Fülscher, Theoretical study of the electronic spectroscopy of peptides 1. the peptidic bond: Primary, secondary, and tertiary amides, J. Am. Chem. Soc. 118 (1996) 12190–12199. [44] A. Bernhardsson, R. Lindh, G. Karlström and B.O. Roos, Direct self-consistent reaction field with pauli repulsion: Solvation effects on methylene peroxide, Chem. Phys. Lett. 251 (1996) 141–149. [45] L. Serrano-Andrés, M.P. Fülscher and G. Karlström, Solvent effects on electronic spectra studied by
B.M. Bulheller and J.D. Hirst / Ab Initio Calculations for CD and SRCD Spectroscopy of Proteins 215
multiconfigurational perturbation theory, Int. J. Quantum Chem. 65 (1997) 167–181. [46] D.M. Rogers and J.D. Hirst, Ab initio study of aromatic side chains of amino acids in gas phase and solution, J. Phys. Chem. A 107 (2003) 11191–11200. [47] M.T. Oakley and J.D. Hirst, Charge-transfer transitions in protein circular dichroism calculations, J. Am. Chem. Soc. 128 (2006) 12414–12415. [48] W.H. Press, B.P. Flannery, S.A. Teukolsky and W.T. Vetterling, Numerical Recipes in FORTRAN: The Art of Scientific Computing, Cambridge University Press, Cambridge (1992). [49] J.D. Hirst, K. Colella and A.T.B. Gilbert, Electronic circular dichroism of proteins from first-principles calculations, J. Phys. Chem. B 107 (2003) 11813–11819. [50] J.C. Sutherland, E.J. Desmond and P.Z. Takacs, Versatile spectrometer for experiments using synchrotron radiation at wavelengths greater than 100 nm, Nuclear Instruments & Methods 172 (1980) 195–199. [51] B.A. Wallace, Synchrotron radiation circular dichroism spectroscopy as a tool for investigating protein structures, J. Synchrotron Radiat. 7 (2000) 289–295. [52] R.W. Janes, Bioinformatics analyses of circular dichroism protein reference databases, Bioinformatics 21 (2005) 4230–4238. [53] J. Šebek, B. Gyurcsik, J. Šebestík, Z. Kejík, L. Bednárová and P. Bouˇr, Interpretation of synchrotron radiation circular dichroism spectra of anionic, cationic, and zwitterionic dialanine forms, J. Phys. Chem. A 111 (2007) 2750–2760. [54] B.A. Wallace, Conformational changes by synchrotron radiation circular dichroism spectroscopy, Nature Struct. Biol. 7 (2000) 708–709. [55] B.A. Wallace and R.W. Janes, Synchrotron radiation circular dichroism spectroscopy of proteins: secondary structure, fold recognition and structural genomics, Curr. Opin. Chem. Biol. 5 (2001) 567–571. [56] A.J. Miles and B.A. Wallace, Synchrotron radiation circular dichroism spectroscopy of proteins and applications in structural and functional genomics, Chem. Soc. Rev. 35 (2006) 39–51. 
[57] J.G. Lees, A.J. Miles and B.A. Wallace, A reference database for circular dichroism spectroscopy covering fold and secondary structure space, Bioinformatics 22 (2006) 1955–1962. [58] B.M. Bulheller, A.J. Miles, B.A. Wallace and J.D. Hirst, Charge-transfer transitions in the vacuumultraviolet of protein circular dichroism spectra, J. Phys. Chem. B 112 (2008) 1866–1874. [59] B.M. Bulheller, A. Rodger and J.D. Hirst, Circular and linear dichroism of proteins, Phys. Chem. Chem. Phys. 9 (2007) 2020–2025. [60] H.M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. Bhat, H. Weissig, I.N. Shindyalov and P.E. Bourne, The protein data bank, Nucleic Acids Res. 28 (2000) 235–242.
216
Modern Techniques for Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy B.A. Wallace and R.W. Janes (Eds.) IOS Press, 2009. © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-000-1-216
The Protein Circular Dichroism Data Bank (PCDDB): A Resource for Data Archiving, Sharing, Validation and Analysis
B.A. WALLACE¹, Lee WHITMORE¹, and Robert W. JANES²
¹ Department of Crystallography, Birkbeck College, University of London
² School of Biological and Chemical Sciences, Queen Mary, University of London
Abstract. The Protein Circular Dichroism Data Bank is a deposition site and user-friendly archive of circular dichroism (CD) and synchrotron radiation circular dichroism (SRCD) spectra, with associated software for validation, analysis and searching. It aims to provide a publicly accessible resource for structural biology, biotechnology and bioinformatics that enables data sharing and data mining of validated CD data. It contains both spectral data and the associated metadata describing the sample and experimental conditions, with links to sequence and crystallographic databases. It is located online at: http://pcddb.cryst.bbk.ac.uk. Its functions, contents, structure and features are described in detail in Wallace et al., 2006 [1] and Whitmore et al., 2006 [2].
1. Introduction
The exemplar for the establishment of this data bank was the long-established and highly accessed Protein Data Bank (PDB) [3], created in 1971 as a resource and deposition site for protein crystallographic data. As the field of crystallography matured, it became obvious that there needed to be a public repository for the structural data that could be easily and freely accessed. The PDB was developed owing to the growing interest in protein crystal structures among a wide range of scientists, not only crystallographers and structural biologists. When it was first created, it contained only seven protein structures. By mid-2008, it had grown to contain more than 40,000 protein crystal structures. Its remit was expanded to include NMR spectroscopic structures in 1989, and by 1997 electron microscopy structures had also been added. By August 2008, more than 6000 protein NMR structures and more than 100 EM structures were held in the data bank. It now includes not only protein, but also nucleic acid structures and a growing number of macromolecular complexes. In the early 1990s a number of crystallographic validation protocols/software programmes [4, 5] were created to test the structures based on their PDB coordinates and diffraction data; implementation of these in conjunction with depositions helped ensure and improve the quality of the deposited data.
Other specialist structural data banks exist specifically for NMR data (the Biological Magnetic Resonance Bank – BMRB [6]) and electron microscopy data (the EMDB) [7], as well as the relatively new PRIDE data bank for mass spectrometry/proteomics [8]. These data banks are in addition to the protein sequence data banks that, as a result of various genome projects, now contain a vast number of
entries. Uni-Prot [9], which combines the original Swiss-Prot, TrEMBL and PIR-PSD protein sequence data banks [10, 11], had approximately 400,000 entries by mid-2008. But CD spectroscopy, a major tool in use by a large number of academic and industrial structural biology labs, has never had a data bank enabling access to primary data.
In addition to these primary data banks, many databases of secondary (derived) information, such as the CATH [12] and SCOP [13] protein fold databases, currently exist and are also important resources for the life sciences community. Derived databases for CD include the reference datasets used for secondary structure analyses [14–16].
At present there is no central resource or means of public archival access to published CD data files of any kind. Sometimes spectra of individual proteins can be obtained by directly contacting the author of a paper, but often even the producing lab can lose track of the primary data when a researcher leaves the lab, or when the media used for computer storage become outdated. In the past, researchers often resorted to photocopying or scanning figures from published papers and then digitising them manually in order to produce files for comparisons, clearly neither an efficient nor an accurate procedure.
Hence the Protein Circular Dichroism Data Bank (PCDDB) was created to fill this gap in the provision of structural biology information, and to help meet the data sharing requirements of U.K. research councils [17] and other funding bodies [18]. Furthermore, the validation tools (collectively known as “VALIDICHRO”) accompanying the PCDDB are aimed at providing, for the first time, a means of ensuring the integrity of the data in the data bank, as well as a framework for improvement of standards in CD data collection in general. This is much in the manner of what the PROCHECK and WHATIF validation software [4, 5] have done for crystal data depositions and structural quality.
These validation tools act not only as checks for PCDDB data but are also in line with international guidelines for using CD to characterise human pharmaceuticals [19] and U.K. guidelines for establishing quality control standards for CD in industry [20].
For historical reasons, because the first macromolecular crystal structures deposited were all proteins, the crystallographic data bank was called the “Protein” data bank. However, despite its name, the PDB contains information on a wide range of macromolecules (including nucleic acids and carbohydrates), not just proteins. Likewise, the PCDDB accepts spectral data of biomacromolecules other than proteins, but it was so named in order to emphasise the connection and parallels with the PDB and its philosophy of data deposition, accessibility and validation.
2. Contents/Design of Data Bank
2.1 Contents
Each entry includes information on: 1) the sample (including links to sequence and structure data banks where applicable), 2) the experimental conditions (including methods and parameters for concentration determinations, assays of purity, spectral conditions and parameters), 3) the spectra (including CD and HT/high voltage/dynode voltage spectra), 4) the optional but highly recommended calibration parameters and spectra, 5) information on data processing (including parameters used in calculations of spectral magnitude), 6) secondary structure analyses where applicable (either provided
by the user or linked to the DichroWeb [21] server), 7) citations/references to the literature, including links to the cited paper abstracts, and where available either open access files or pdf reprints provided by the depositor, and 8) details about the depositor. It incorporates raw and processed CD and HT (or dynode voltage) data in downloadable formats, as well as downloadable images of each of the spectra. The PDB was used as an initial guide to some of the sample parameters, to ease cross-referencing between it and the PCDDB. The principal parameters currently included are listed in Table 1. The first depositions to the data bank were the individual spectra that comprise the SP175 [15] and CRYST175 [16] reference datasets used for secondary structure analyses in the DichroWeb server [21].
2.2 Input Formats
Our aim was to make deposition as simple as possible in order to encourage the addition of as many spectra as possible to the data bank. To achieve this, the website deposition area accepts a wide range of input file formats, including those of the major commercial instruments, all the operational SRCD beamlines, the outputs of the CDTOOLS processing programme [22], the output of the popular DichroWeb analysis website [21], and various generic two- and three-column text files. For files with rich headers (i.e. from CDTOOLS or instrument formats), upon upload the PCDDB software reads and data mines the files for as much extra information as possible, to limit the amount of typing of experimental and processing details. Again to simplify the deposition procedure, as many other input fields as possible are presented as dropdown lists so that the entering of typed data is kept to a minimum. The depositor can also adapt a previous deposition to make a template that can be called up and simply modified to include the specific parameters that differ for individual spectra.
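To illustrate what reading one of the generic two- or three-column text files mentioned above might involve, the following is a minimal Python sketch. It is not the PCDDB parser: the function name, the tolerated delimiters (whitespace or commas) and the rule that any non-numeric row is treated as header text are all assumptions made for illustration, and no meaning is assigned to the optional third column.

```python
from typing import List, NamedTuple, Optional


class SpectrumPoint(NamedTuple):
    wavelength_nm: float
    cd: float
    extra: Optional[float]  # third column if present; its meaning is not assumed here


def read_generic_columns(text: str) -> List[SpectrumPoint]:
    """Parse a generic 2- or 3-column text spectrum (hypothetical helper)."""
    points = []
    for line in text.splitlines():
        # Assumption: columns are separated by whitespace and/or commas.
        fields = line.replace(",", " ").split()
        if len(fields) not in (2, 3):
            continue  # blank line or a row that is not spectral data
        try:
            values = [float(field) for field in fields]
        except ValueError:
            continue  # header or comment text, skipped rather than mined
        extra = values[2] if len(values) == 3 else None
        points.append(SpectrumPoint(values[0], values[1], extra))
    return points


if __name__ == "__main__":
    demo = "# illustrative file\nwavelength cd\n260 0.1\n259, -0.2, 310\n"
    for point in read_generic_columns(demo):
        print(point)
```

A real deposition pipeline would, in addition, mine the skipped header lines for metadata, which this sketch deliberately leaves out.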
2.3 Access to the Data Bank
The data bank is accessible online through the main PCDDB website, enabling both the deposition and search/download functions; it is located at: http://pcddb.cryst.bbk.ac.uk. The data bank acronym was chosen to be a Googlewhack and thus if “PCDDB” is entered into search engines it should send the user to the correct site. Access to the site is free and does not require registration for searching/downloading, although users can register if they wish to store searches, create scripts for specific searches, set up automated update searches or subscribe to email alerts and opt-in informational newsletters. Depositors must create an account; this account will be used for identification purposes and communication about validation, completion and release of entries, to store and retrieve partially submitted records, and to create/store templates for future depositions.
2.4 PCDDB Website Design
The technical aspects of the website and data bank designs have been described [2], but briefly, the site has three sections: Deposition, Search and Supplementary Information. The Deposition section of the website features the data deposition interface, the data validation tools and an interface for storing and retrieving partially completed
Table 1. Parameters/Files Present in a PCDDB Data Bank Entry. (+ indicates the parameter/file is optional, all other parameters/files are required; italics indicates a choice is provided to author or the information is data-mined from the spectral file, all other parameters are input by authors; files to be downloaded are indicated by *)

Section 1: Sample
Protein Name
Alternative Protein Names+
Uni-Prot code
PDB code+
Source Organism
Natural Source or Expressed
Expression System
Expression tags (if any)+
Changes to standard sequence+
Ligands Present+
Macromolecular Partners Present+

Section 2: Experimental Conditions
Protein Concentration
Concentration Quantification Method
Protein Purity (%)+
Purity Quantification Method+
Buffer Contents
Sample Cell Pathlength (cm)
Pathlength Determination Method
Sample Cell Type
CD Instrument or SRCD beamline
Data Collection date
Temperature
Nitrogen flush or vacuum
Dwell or Averaging time, seconds+
Detector Acceptance Angle (Scattering Angle)+

Section 3: Spectra
Local Spectrum Identifier+
File Format
Software Used for Data Collection
Spectral Units
Number of repeat scans
Maximum (highest) wavelength, nm
Minimum (lowest) wavelength, nm
Wavelength interval, nm
Processed spectrum*
Processed Spectrum with error bars+*
Raw Spectrum and Baseline Files+*
HT / High Voltage / Dynode Spectrum File+*
Low wavelength cutoff, nm+
Criteria for low wavelength cutoff+

Section 4: Calibration Details+
Calibration Standard Used+
Standard Concentration
Standard Pathlength
Date Standard Measured+
Temperature+
CSA Ratio (192.5/290.0)+
Calculated Molar Ellipticity (at 285 nm)+

Section 5: Data Processing
Protein Molecular Weight
Number of Amino Acids
Mean Residue Weight value used
Data Processing Software
Smoothing Yes/No
Number of Smoothing Points+
Zeroing Point

Section 6: Calculated Secondary Structure+
Calculation Software Used
Secondary Structure Algorithm
Reference Dataset
NRMSD+
Secondary Structure Values

Section 7: Additional Information
Keyword/phrase [up to 10]
SCOP or CATH classification+
Publication Details
.pdf file upload of citation
Medline Entry
Open Access website+

Section 8: Depositor Information (note: contact details required but optional in released version)
Depositor Name and Address

depositions. The Search section of the website features a set of pre-defined commonly used search patterns such as search by protein name, author name, keyword, deposition date or identifier. These searches are simple to use and require minimal knowledge of the structure of the data bank. More specific searches can be tailored for the users’ needs in the Advanced Search feature, and will enable search by, for example, secondary structure content, sequence, wavelength range, wavelength minimum and maximum, and other parameters that may be of use for bioinformatics and theoretical studies. All of the validated data and the record-metadata attached to validated records are searchable. The Supplementary Information contains links to other websites and sections on information about the technique of circular dichroism spectroscopy, and literature citations to methods and the data bank. It also includes the user account creation and configuration interfaces, a site map, a glossary, the terms and conditions for usage, and contact details.
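An Advanced Search of the kind described above (for example, by secondary structure content and wavelength range) can be pictured as a parameterised query over the entry metadata. The sketch below uses Python's standard sqlite3 module with an in-memory database. The table name, column names and the three rows are hypothetical placeholders invented for illustration; they do not reflect the actual PCDDB schema, identifiers or contents.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with placeholder rows (not real PCDDB data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entries ("
        " entry_id TEXT PRIMARY KEY,"
        " protein_name TEXT,"
        " helix_fraction REAL,"      # hypothetical column
        " min_wavelength_nm REAL)"   # hypothetical column
    )
    conn.executemany(
        "INSERT INTO entries VALUES (?, ?, ?, ?)",
        [
            ("DEMO0001", "placeholder protein A", 0.70, 175.0),
            ("DEMO0002", "placeholder protein B", 0.10, 190.0),
            ("DEMO0003", "placeholder protein C", 0.55, 180.0),
        ],
    )
    return conn


def search_entries(conn, min_helix, max_low_wavelength_nm):
    """Return ids of entries that are helix-rich and reach a low enough wavelength."""
    rows = conn.execute(
        "SELECT entry_id FROM entries"
        " WHERE helix_fraction >= ? AND min_wavelength_nm <= ?"
        " ORDER BY entry_id",
        (min_helix, max_low_wavelength_nm),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    connection = build_demo_db()
    print(search_entries(connection, 0.5, 180.0))
```

Passing the search limits as bound parameters, rather than formatting them into the query string, is the usual way to keep user-defined search terms from altering the query itself.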
3. Additional Software and Functions
To enhance the utility of the PCDDB, the website includes additional software enabling a wide range of related analyses of the CD data.
3.1 Validation (VALIDICHRO)
CD is now being used by many labs around the world. A large proportion of these follow “good practice” in their measurements; however, there are some that are not as
thorough. Hence, to coincide with the release of the PCDDB, it seemed an opportune time for the establishment of validation procedures. Validation will ensure the quality of data deposited in the PCDDB, enable more accurate use of the existing standard analysis methods and reference datasets for CD secondary structure analyses, and establish standards for both academic research labs and industrial quality assurance analyses. When the technique of protein crystallography “matured”, crystallographic validation software programs such as PROCHECK [4], WHATIF [5], WHAT_CHECK [23], and Molprobity [24] were developed and put into use, which enabled examination of the quality of the data deposited in the Protein Data Bank (PDB). Those validation software programs also led to improved good practice in the creation of the data. They are now available to depositors of data to the PDB through the ADIT Validation Server and as standalone programmes for crystallographers for use in checking the quality of structures and data prior to publication. This type of quality control is something that has been completely missing from CD data collection and hence some publications have suffered from erroneous or poor data as a result. To ensure the integrity of the data bank as a source of structural information for data mining, it was necessary to develop validation procedures and software for checking the data to be deposited. VALIDICHRO is a software package which tests spectral integrity, processing, standardisation/calibration and completeness of metadata. The validation criteria included were based on our examination of data collected on a wide range of CD and SRCD instruments and, most importantly, on our experience with DichroWeb users. Feedback from DichroWeb showed us many examples of what types of errors could be and were being made in data collection, processing and calculations.
The criteria were adopted after discussions with the International Scientific Advisory Board of the PCDDB [25]. These include parameters that can be checked and recalculated automatically by the software (such as spectral magnitude values given the instrument parameters and experimental parameters such as concentration and pathlength), as well as parameters such as calibration values and “good practice”-defined data collection parameters [26, 27], including those identified in the National Physical Laboratory cross-validation study [8]. In addition, parameters that fall well outside known physical extremes for values are targeted. Validation must maintain a careful balance between too strict (which would put off depositors) and too lenient (where the data would not be useful). Consequently, one important decision made was that some of the (critical) outliers would result in rejection of an entry but other less critical ones would result in a “flag”. There are thus three ratings for each criterion: suitable, flag (not good, but in acceptable range), reject (outside acceptable range). For spectra to be deposited in the PCDDB, no parameter should have a reject rating. In addition to the rating, a description is provided to the depositor of why it was given that rating and what the acceptable ranges for the parameter were. A “health check” report on the validation is included in the metadata attached to each data bank spectrum so that the user can decide on the utility of the data.
3.2 Spectral Matching (DICHROMATCH)
DICHROMATCH, a spectrum pattern recognition search programme for identifying related proteins based on their spectral properties, will be included in upcoming versions of the PCDDB website.
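Returning to the validation step of Section 3.1, the two ideas described there (automatic recalculation of spectral magnitude from concentration and pathlength, and a three-level rating per criterion) can be sketched in a few lines of Python. This is an illustration only, not the VALIDICHRO code. The conversion used is the standard mean residue ellipticity relation, [θ]MRW = θ(mdeg) × MRW / (10 × c(mg/ml) × l(cm)); the function names and the numerical ranges in the example are hypothetical and are not the criteria adopted for the PCDDB.

```python
def mean_residue_ellipticity(theta_mdeg, mean_residue_weight, conc_mg_ml, path_cm):
    """Standard conversion of observed ellipticity (mdeg) to deg cm^2 dmol^-1."""
    return theta_mdeg * mean_residue_weight / (10.0 * conc_mg_ml * path_cm)


def rate(value, suitable_range, acceptable_range):
    """Three-level rating: 'suitable', 'flag' or 'reject'.

    Both ranges are (low, high) tuples; the limits passed in are supplied by
    the caller and are hypothetical in this sketch.
    """
    low_s, high_s = suitable_range
    low_a, high_a = acceptable_range
    if low_s <= value <= high_s:
        return "suitable"
    if low_a <= value <= high_a:
        return "flag"  # not good, but still within the acceptable range
    return "reject"    # outside the acceptable range


def can_deposit(ratings):
    """An entry is depositable only if no criterion is rated 'reject'."""
    return "reject" not in ratings


if __name__ == "__main__":
    # Example numbers are invented for illustration.
    print(mean_residue_ellipticity(-20.0, 110.0, 1.0, 0.01))
    ratings = [rate(5, (0, 10), (-5, 20)), rate(15, (0, 10), (-5, 20))]
    print(ratings, can_deposit(ratings))
```

Keeping the rating separate from the deposit decision mirrors the text: a flag is reported to the depositor and stored in the “health check”, while only a reject blocks the entry.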
3.3 Back-Calculation of Spectra (PDB2CD)
PDB2CD is a programme in development to provide back-calculations of theoretical CD spectra based on the secondary structures of a given protein as defined from its PDB coordinates. Using the DSSP [28] algorithm for defining secondary structural elements from the crystal coordinates, in conjunction with the reference datasets of protein CD spectra, the basis spectra corresponding to different secondary structural types are scaled by the relative amounts of each secondary structure in the protein of interest to produce a “theoretical” spectrum of the protein. This can be compared with the actual spectrum, as a test of the environmental factors affecting spectral properties, or can be used as a comparison with experimental spectra collected for proposed or actual homologues of the protein whose spectrum was calculated.
3.4 Search Tools
Searching tools include both the ability to use basic (and advanced) text categories for searching [2] through the website and a user interface which accommodates both these pre-defined search terms and also user-defined search terms, including SQL-like queries. Tailored searches allow more advanced users to write and save scripts in whatever language they prefer.
3.5 Other Analytical Software
It is envisaged that other online analytical tools will be included on the website, both ones produced by the PCDDB project, and ones created by others, provided that they interface with the data bank entries and formats. Examples are: similarity functions (including auto-correlation methods), cluster analysis methods for identifying related spectra [15], principal component analyses and support vector machine tools for identifying component spectra of mixtures and thermal melts [22], as well as utility programs such as binding constant determination software. It will provide links to other CD software/websites such as DichroWeb, CDTOOLS, and ab initio calculations [30].
It will also include the ability to download selected files and create new user-defined reference datasets [29] for use with DichroWeb or other analysis programs.
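The back-calculation idea of Section 3.3, in which basis spectra are scaled by the relative amounts of each secondary structure and summed, can be sketched as a weighted linear combination. The Python below is a minimal illustration under stated assumptions, not the PDB2CD implementation: the function names are hypothetical, the basis spectra in the example are made-up numbers at two arbitrary wavelengths rather than values from any reference dataset, and the use of the commonly defined NRMSD, sqrt(Σ(θexp − θcalc)² / Σθexp²), as the comparison measure is an assumption.

```python
import math


def back_calculate(fractions, basis_spectra):
    """Sum basis spectra weighted by secondary structure fractions.

    fractions: dict mapping structure type to its fraction in the protein.
    basis_spectra: dict mapping structure type to a list of values on a
    common wavelength grid (all lists must have the same length).
    """
    n_points = len(next(iter(basis_spectra.values())))
    spectrum = [0.0] * n_points
    for structure_type, fraction in fractions.items():
        for i, value in enumerate(basis_spectra[structure_type]):
            spectrum[i] += fraction * value
    return spectrum


def nrmsd(experimental, calculated):
    """Normalised root mean square deviation between two spectra."""
    numerator = sum((e - c) ** 2 for e, c in zip(experimental, calculated))
    denominator = sum(e ** 2 for e in experimental)
    return math.sqrt(numerator / denominator)


if __name__ == "__main__":
    # Placeholder basis spectra (invented values, two wavelength points).
    basis = {"helix": [10.0, -4.0], "sheet": [2.0, -2.0], "other": [-2.0, 0.0]}
    fractions = {"helix": 0.5, "sheet": 0.25, "other": 0.25}
    theoretical = back_calculate(fractions, basis)
    print(theoretical, nrmsd([5.0, -2.5], theoretical))
```

In practice the fractions would come from a DSSP assignment of the PDB coordinates and the basis spectra from a reference dataset, both of which are outside the scope of this sketch.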
4. Advantages for Users
4.1 Depositors
The major advantage for the depositor is that it is a permanent repository for their data that will not be dependent on their retaining it in a specific format and on media that remain current. Often historic spectral data has been archived locally on media that have become outdated (such as reel-to-reel tapes, floppy disks, exabyte tapes, etc.) and can no longer be read by modern computing equipment, and metadata describing the samples or spectral conditions is only available in lab notebooks that may be difficult to locate or lost as a student or postdoc leaves the lab. With recent directives from research councils [17, 18] for sharing data or long term retention of data this is
becoming a burden for researchers that can be ameliorated by deposition of the spectral and metadata in the data bank. The PCDDB thus serves two purposes for the researcher with respect to funding councils: it makes the data publicly available, fulfilling data-sharing requirements, and provides retained/backed up data on current media, fulfilling long term data storage requirements. A significant advantage for the depositor is that others will be able to access the data at will and by simple searches find the file of interest, as well as the details of the originating lab. It will therefore provide an additional means of obtaining credit for the work done, as others who access/use the data will be expected to acknowledge/cite the original papers linked to the data. Through directed searches it will also provide a means for other structural biologists to easily identify labs working on a particular protein/in a particular area and so be able to contact the originating authors if they are interested in future collaborations.
4.2 Spectroscopists/Computational Chemists
For other spectroscopists, the PCDDB will provide ready access to a wealth of spectra, both preventing duplication of effort, and as a means of comparison with spectra they collect of known or unknown proteins. It will aid in the development of algorithms for protein classification and, for computational chemists, provide the experimental basis for testing ab initio calculation methods. It will also contain a broad range of spectra of related proteins from which spectral features can be extracted. The data bank will also enable creation of new tools for spectral analyses. Using, for instance, the “create reference database” function, the user can select a group of related or unrelated protein spectra to make specific tailored reference datasets for analyses of certain classes of proteins (e.g., a particular protein fold) or supersecondary structure (e.g.
coiled-coils) or for particular spectral characteristics (e.g. membrane proteins or polyproline II-rich proteins).
4.3 Structural Biologists and Bioinformaticists
The data bank will provide a ready source for data-mining spectral and secondary structure data of both proteins whose crystal structures are known, and also of novel proteins whose structures are unknown. Searching/similarity matching tools will supply them with an objective tool for identifying close homologues. It will also contain spectra obtained under different experimental conditions, allowing examination of the effects of environment on protein structures. Also, the inclusion of thermal melt and other protein folding/unfolding studies will enable analyses of folding intermediates as well as natively unfolded proteins, for which crystal structures are not available.
5. Possible Uses for the PCDDB
The potential uses for the PCDDB are manifold. In addition, just as was found after the establishment of the PDB, many unforeseen uses are likely to be conceived once the data are available. Listed below are a few examples of uses we propose:
5.1 Development and Testing of New Methodologies
• Testing of new algorithms developed for calculating secondary structure of proteins.
• Testing of existing methods for calculating secondary structures using spectra of known homologues (especially for proteins with “atypical” structures).
• Development and testing of methods for identifying protein folds based on spectral characteristics [15]. This could be used to help assign/identify a function based on detection of spectral nearest neighbours with known functions.
• Testing systems for first-principles (theoretical) methods for computing CD and UV spectra of proteins [30].
• Testing the design and output of SRCD instruments, including those currently being developed in the UK, France, Germany, Australia, Taiwan, and China, and upgrades at existing beamlines in Germany, the USA and Denmark.
5.2 Availability of Information
• Availability of examples of spectra of proteins containing less common types of secondary structures, such as PPII, 3₁₀ helices, and various types of turns.
• Availability of a specific spectrum of a protein for identification purposes.
• Availability of examples of many spectra from a wide range of protein structural and fold types to help identify the structure of an unknown protein (perhaps one whose structure had been predicted by homology modelling and whose spectrum had been back-calculated from the model coordinates).
• Availability of spectra of different types of beta-sheet-rich proteins (because of the wide variation in beta-sheet protein spectra), which could potentially lead to methods for identifying spectral characteristics arising from such features as sheet twist [31].
• Availability of spectra that are not easy to access, either because they were published in less available (or non-open-access) journals or because they are not obviously contained in published papers (i.e. ones that would require sophisticated text data-mining to find).
• Availability that would obviate duplication of effort and the need to produce proteins or peptides in order to re-collect spectra of already published samples for direct comparisons.
• Availability of spectra that would enable comparisons of a known native protein with a new mutant, looking for correct folding and conformational differences [32].
• Availability of protein spectra under different conditions for examination of the effects of environment on the protein structure, including the effects of solvent dielectric constant on spectral peak positions [33].
• Availability of membrane protein spectra to enable examination of spectral characteristics associated with the hydrophobic environment of a lipid bilayer [35].
• Availability of the spectra of components of a complex for comparison of their structures when associated in the complex [34].
• Availability of spectra of natively unfolded proteins, a category of potentially biologically important structures, for which crystallographic and NMR structures are not available.
• Availability of a series of spectra (i.e. thermal and denaturant unfolding) to enable analysis of cooperativity of unfolding of an individual protein and comparison of folding processes across a range of similar and different types of protein folds.
• Availability of spectra covering a range of protein architectures and folds for use in bioinformatics, especially in cluster analyses.
• Availability of new validated spectra for enhancement of existing secondary structure reference datasets with more diverse structural types.
6. Summary
The PCDDB is a new tool for CD spectroscopy with uses in a wide range of applications, and as a permanent archive of freely accessible data that will also provide a means of fulfilling data sharing requirements. It is our intention to eventually expand the data bank to include other spectroscopic methods such as FTIR, Raman, Raman Optical Activity, and vibrational CD.
Acknowledgements
We thank the members of the PCDDB International Scientific Advisory Board and the Technical Advisory Board for their insight, advice and effort in helping make this project successful, and the U.K. Biotechnology and Biological Sciences Research Council (BBSRC), through the Tools and Resources programme, for funding its development and curation. We thank members of the Wallace Lab, especially Dr. Andrew Miles, for help with the initial (alpha) deposition testing phase.
References
[1] B.A. Wallace, L. Whitmore and R.W. Janes, The protein circular dichroism data bank (PCDDB): A bioinformatics and spectroscopic resource, Proteins: Struct. Funct. Bioinform. 62 (2006) 1–3.
[2] L. Whitmore, R.W. Janes and B.A. Wallace, Protein circular dichroism data bank (PCDDB): Data bank and website design, Chirality 18 (2006) 426–429.
[3] H.M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T.N. Bhat, H. Weissig, I.N. Shindyalov and P.E. Bourne, The Protein Data Bank, Nucleic Acids Res. 28 (2000) 235–242.
[4] R.A. Laskowski, M.W. MacArthur, D.S. Moss and J.M. Thornton, PROCHECK - A program to check the stereochemical quality of protein structures, J. Appl. Cryst. 26 (1993) 283–291.
[5] G. Vriend, WHATIF - A molecular modeling and drug design program, J. Mol. Graph. 8 (1990) 52–56.
[6] B.R. Seavey, E.A. Farr, W.M. Westler and J.L. Markley, A relational database for sequence-specific protein NMR data, J. Biomolecular NMR 1 (1991) 217–236.
[7] M. Tagari, R. Newman, M. Chagoyen, J.M. Carazo and K. Henrick, New electron microscopy database and deposition system, Trends Biochem. Sci. 27 (2002) 589.
[8] P. Jones, R. Cote, L. Martens, A. Quinn, C. Taylor, W. Derache, H. Hermjakob and R. Apweiler, PRIDE: A public repository of protein and peptide identifications for the proteomics community, Nucleic Acids Res. 1 (2006) D659–D663.
[9] The UniProt Consortium, The universal protein resource (UniProt), Nucleic Acids Res. 36 (2008) D190–D195.
[10] B. Boeckmann, A. Bairoch, R. Apweiler, M.C. Blatter, A. Estreicher, E. Gasteiger, M.J. Martin, K. Michoud, C. O'Donovan, I. Phan, S. Pilbout and M. Schneider, The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003, Nucleic Acids Res. 31 (2003) 365–370.
[11] H. Huang, Z.Z. Hu, B.E. Suzek and C.H. Wu, The PIR integrated protein databases and data retrieval system, Data Science 3 (2004) 163–174.
[12] C.A. Orengo, A.D. Michie, S. Jones, D.T. Jones, M.B. Swindells and J.M. Thornton, CATH - A hierarchic classification of protein domain structures, Structure 5 (1997) 1093–1108.
[13] A.G. Murzin, S.E. Brenner, T. Hubbard and C. Chothia, SCOP: a structural classification of proteins database for the investigation of sequences and structures, J. Mol. Biol. 247 (1995) 536–540.
[14] N. Sreerama and R.W. Woody, Estimation of protein secondary structure from circular dichroism spectra: Comparison of CONTIN, SELCON, and CDSSTR methods with an expanded reference set, Anal. Biochem. 287 (2000) 252–260.
[15] J.G. Lees, A.J. Miles, F. Wien and B.A. Wallace, A reference database for circular dichroism spectroscopy covering fold and secondary structure space, Bioinformatics 22 (2006) 1955–1962.
[16] P. Evans, O.A. Bateman, C. Slingsby and B.A. Wallace, A reference dataset for circular dichroism spectroscopy tailored for the βγ-crystallin lens proteins, Experimental Eye Research 84 (2007) 1001–1008.
[17] BBSRC, Consultation Document on Data Sharing Policy (2006).
[18] NIH, Sharing research data. Notice NOT-OD-03-032 (2003).
[19] Guideline Q6B, International Conference on harmonisation of technical requirements for registration of pharmaceuticals for human use, FDA register 64FR (1999) page 44928.
[20] C. Jones, D. Schiffmann, A. Knight and S. Windsor, Val-CiD Best Practice Guide: CD spectroscopy for the quality control of biopharmaceuticals, NPL report DQL-AS 008 (2004).
[21] L. Whitmore and B.A. Wallace, DICHROWEB, an online server for protein secondary structure analyses from circular dichroism spectroscopic data, Nucleic Acids Res. 32 (2004) W668–673.
[22] J.G. Lees, B.R. Smith, F. Wien, A.J. Miles and B.A. Wallace, CDtool – An integrated software package for circular dichroism spectroscopic data processing, analysis and archiving, Anal. Biochem. 332 (2004) 285–289.
[23] R.W.W. Hooft, G. Vriend, C. Sander and E.E. Abola, WHAT_CHECK. Errors in protein structures, Nature 381 (1996) 272.
[24] I.W. Davis, L.W. Murray, J.S. Richardson and D.C. Richardson, MolProbity: structure validation and all-atom contact analysis for nucleic acids and their complexes, Nucleic Acids Res. 32 (2004) W615–W619.
[25] B.A. Wallace and R.W. Janes, International workshop on the protein circular dichroism data bank, Synchrotron Radiation News 18 (2005) 20–21.
[26] A.J. Miles, F. Wien, J.G. Lees, A. Rodger, R.W. Janes and B.A. Wallace, Calibration and standardisation of synchrotron radiation circular dichroism (SRCD) amplitudes and conventional circular dichroism (CD) spectrophotometers, Spectroscopy 17 (2003) 653–661.
[27] A.J. Miles, F. Wien, J.G. Lees and B.A. Wallace, Calibration and standardisation of synchrotron radiation and conventional circular dichroism spectrometers. Part 2: Factors affecting magnitude and wavelength, Spectroscopy 19 (2005) 43–51.
[28] W. Kabsch and C. Sander, Dictionary of protein secondary structure - pattern-recognition of hydrogen-bonded and geometrical features, Biopolymers 22 (1983) 2577–2637.
[29] L. Whitmore and B.A. Wallace, Protein secondary structure analyses from circular dichroism spectroscopy: methods and reference databases, Biopolymers 70 (2007) 1142–1146.
[30] B. Bulheller, A.J. Miles, B.A. Wallace and J. Hirst, Charge-transfer transitions in the vacuum ultraviolet of protein circular dichroism spectra, J. Phys. Chem. B 112 (2008) 1866–1874.
[31] B.A. Wallace, F. Wien, A.J. Miles, J.G. Lees, S.V. Hoffman, P. Evans, G.J. Wistow and C. Slingsby, Biomedical applications of synchrotron radiation circular dichroism spectroscopy: Identification of mutant proteins associated with disease and development of a reference database for fold motifs, Faraday Discussions 17 (2004) 653–661.
[32] P. Evans, K. Wyatt, G.J. Wistow, O.A. Bateman, B.A. Wallace and C. Slingsby, The P23T cataract mutation causes loss of solubility of folded γD-crystallin, J. Mol. Biol. 343 (2004) 436–444.
[33] M. Cascio and B.A. Wallace, Effects of local environment on the circular dichroism spectra of polypeptides, Anal. Biochem. 227 (1995) 90–100.
[34] N.P. Cowieson, A.J. Miles, G. Robin, J.K. Forwood, B. Kobe, J.L. Martin and B.A. Wallace, Evaluating protein:protein complex formation using synchrotron radiation circular dichroism spectroscopy, Proteins: Struct. Funct. Bioinform. 70 (2008) 1142–1146.
[35] B.A. Wallace, J. Lees, A.J.W. Orry, A. Lobley and R.W. Janes, Analyses of circular dichroism spectra of membrane proteins, Protein Sci. 12 (2003) 875–884.
Modern Techniques for Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy B.A. Wallace and R.W. Janes (Eds.) IOS Press, 2009. © 2009 The authors and IOS Press. All rights reserved.
Appendix: Selected Website and Monograph References for CD and SRCD of Biomolecules
1. Websites (as of December 2008)
DICHROWEB: http://dichroweb.cryst.bbk.ac.uk/
CDTOOLS: http://cdtools.cryst.bbk.ac.uk/
Beamline websites:
BSRF: http://www.ihep.ac.cn/bsrf/english/facility/html/VUV.htm
NSLS: http://www.biology.bnl.gov/u9b/u9b.html
      http://www.nsls.bnl.gov/beamlines/beamline.asp?blid=U11
SRS: http://www.srs.dl.ac.uk/vuv/CD/cpmsd.html
Diamond: http://www.diamond.ac.uk/Beamlines/Beamlineplan/B23/index.htm
ISA: http://www.isa.au.dk/facilities/astrid/beamlines/uv1/uv1.html
Soleil: http://www.synchrotron-soleil.fr/Recherche/LignesLumiere/DISCO
NSRRC: http://portal.nsrrc.org.tw/news/news.php?mode=view&id=1523
BESSY2: http://www.bessy.de/front_content.php?idcatart=1021
NSRL: http://www.nsrl.ustc.edu.cn/en/pages/facilities/
HiSOR: http://www.hsrc.hiroshima-u.ac.jp/english/bl15.htm
AS: http://www.synchrotron.vic.gov.au/content.asp?Document_ID=494
ANKA: http://ankaweb.fzk.de/
General Introductions:
http://www.britishbiophysics.org.uk/what-is/cd/cd.html
http://en.wikipedia.org/wiki/Circular_dichroism
http://www.cryst.bbk.ac.uk/PPS2/course/section8/ss-960531_21.html
http://www2.umdnj.edu/cdrwjweb/
http://www.ruppweb.org/cd/cdtutorial.htm
http://www.srs.ac.uk/summer-school/talks/DaveClark/summer_school_04.pdf
http://www.protein.iastate.edu/circular_dichroism.html
http://www.imb-jena.de/ImgLibDoc/cd/index.htm
Animated electromagnetic waves:
http://www.enzim.hu/~szia/cddemo/edemo1.htm
2. Books
Berova, N., Nakanishi, K. and Woody, R.W. (eds) 2nd Edition (2000) Circular Dichroism: Principles and Applications. John Wiley and Sons. ISBN-13: 978-0-471-33003-5
Cantor, C.R. and Schimmel, P.R. (1980) Biophysical Chemistry, Part II. W.H. Freeman and Company. ISBN: 0-7167-1189-3
Fasman, G.D. (ed) (1996) Circular Dichroism and the Conformational Analysis of Biomolecules. Plenum Press. (now out of print, but generally available) ISBN: 0-306-45742-5
Rodger, A. and Norden, B. (1997) Circular Dichroism and Linear Dichroism. Oxford University Press. (new edition available soon) ISBN-13: 978-0-198-55897-2
van Holde, K.E., Johnson, W.C., and Ho, P.S. 2nd Edition (2006) Principles of Physical Biochemistry. Pearson Prentice Hall. ISBN: 0-13-201744-X
Author Index
Bulheller, B.M.: 202
Haris, P.I.: v
Hills, A.E.: 125
Hirst, J.D.: 202
Janes, R.W.: vii, 1, 183, 216
Kelly, S.M.: 91
Knight, A.E.: 125
Miles, A.J.: 73, 108, 141
Price, N.C.: 91
Ravi, J.: 125
Rodger, A.: 150
Sutherland, J.C.: 19
Wallace, B.A.: vii, 1, 73, 108, 141, 165, 216
Whitmore, L.: 165, 216