TECHNIQUES IN PROTEIN CHEMISTRY VI
TECHNIQUES IN PROTEIN CHEMISTRY VI Edited by
John W. Crabb W. Alton Jones Cell Science Center, Inc. Lake Placid, New York
ACADEMIC PRESS San Diego New York Boston London Sydney Tokyo Toronto
Academic Press Rapid Manuscript Reproduction
This book is printed on acid-free paper. Copyright © 1995 by ACADEMIC PRESS, INC. All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Academic Press, Inc. A Division of Harcourt Brace & Company 525 B Street, Suite 1900, San Diego, California 92101-4495 United Kingdom Edition published by Academic Press Limited 24-28 Oval Road, London NW1 7DX
Library of Congress Card Catalog Number: 94-230592 International Standard Book Number: 0-12-194712-2 (case) International Standard Book Number: 0-12-194713-0 (comb)
PRINTED IN THE UNITED STATES OF AMERICA 95 96 97 98 99 00 EB 9 8 7 6 5 4 3 2 1
Contents

Foreword
Preface
Acknowledgments
Section I Mass Spectrometry of Peptides and Proteins

The Use of a Volatile N-Terminal Degradation Reagent for Rapid, High-Sensitivity Sequence Analysis of Peptides by Generation of Sequence Ladders
M. Bartlet-Jones, W. A. Jeffery, H. F. Hansen, and D. J. C. Pappin

Investigation of Polyethylene Membranes as Potential Sample Supports for Linking SDS-PAGE with MALDI-TOF MS for the Mass Measurement of Proteins
James A. Blackledge and Anthony J. Alexander

Comparison of ESI-MS, LSIMS, and MALDI-TOF-MS for the Primary Structure Analysis of a Monoclonal Antibody
Leticia Cano, Kristine M. Swiderek, and John E. Shively

MS Based Scanning Methodologies Applied to Conus Venom
A. Grey Craig, Wolfgang H. Fischer, Jean E. Rivier, J. Michael McIntosh, and William R. Gray

Direct Coupling of an Automated 2-Dimensional Microcolumn Affinity Chromatography-Capillary HPLC System with Mass Spectrometry for Biomolecule Analysis
D. B. Kassel, T. G. Consler, M. Shalaby, P. Sekhri, N. Gordon, and T. Nadler

Edman Degradation and MALDI Sequencing Enables N- and C-Terminal Sequence Analysis of Peptides
Roland Kellner, Gert Talbo, Tony Houthaeve, and Matthias Mann

Identification of the Amino Terminal Peptide of N-Terminally Blocked Proteins by Differential Deutero-Acetylation Using LC/MS Techniques
Craig D. Thulin and Kenneth A. Walsh
Section II Analysis of Posttranslational Processing Events

HPAEC-PAD Analysis of Monoclonal Antibody Glycosylation
Jeffrey Rohrer, Jim Thayer, Nebojsa Avdalovic, and Michael Weitzhandler

Carbohydrate Structure Characterization of Two Soluble Forms of a Ligand for the ECK Receptor Tyrosine Kinase
Christi L. Clogston, Patricia L. Derby, Robert Toso, James D. Skrine, Ming Zhang, Vann Parker, G. Michael Fox, Timothy D. Bartley, and Hsieng S. Lu

Characterization of Individual N- and O-Linked Glycosylation Sites Using Edman Degradation
A. A. Gooley, N. H. Packer, A. Pisano, J. W. Redmond, K. L. Williams, A. Jones, M. Loughnan, and P. F. Alewood

The Unexpected Presence of Hydroxylysine in Noncollagenous Proteins
Michael S. Molony, Shiaw-Lin Wu, Lene K. Keyt, and Reed J. Harris

Isolation of Escherichia coli Synthesized Recombinant Proteins that Contain ε-N-Acetyllysine
Bernard N. Violand, Michael R. Schlittler, Cory Q. Lawson, James F. Kane, Ned R. Siegel, Christine E. Smith, and Kevin L. Duffin

LC-MS Methods for Selective Detection of Posttranslational Modifications in Proteins: Glycosylation, Phosphorylation, Sulfation, and Acylation
Mark F. Bean, Roland S. Annan, Mark E. Hemling, Mary Mentzer, Michael J. Huddleston, and Steven A. Carr

Identification of Phosphorylation Sites by Edman Degradation
John D. Shannon and Jay W. Fox

Determination of the Disulfide Bonds of Human Macrophage Chemoattractant Protein-1 Using a Gas Phase Sequencer
Ramnath Seetharam, Jeanne L Corman, and Shubhada M. Kamerkar
Section III Protein Sequencing and Amino Acid Analysis

Enzymatic Digestion of PVDF-Bound Proteins in the Presence of Glucopyranoside Detergents: Applicability to Mass Spectrometry
Joseph Fernandez, Farzin Gharahdaghi, and Sheenah M. Mische

In-Gel Digestion of SDS PAGE-Separated Proteins: Observations from Internal Sequencing of 25 Proteins
Kenneth R. Williams and Kathryn L. Stone

Peptide Mapping at the 1 μg Level: In-Gel vs. PVDF Digestion Techniques
Lee Anne Merewether, Christi L. Clogston, Scott D. Patterson, and Hsieng S. Lu

Enzymatic Digestion of Proteins in Zinc Chloride and Ponceau S Stained Gels
Sharleen Zhou and Arie Admon

Direct Collection Onto Zitex and PVDF for Edman Sequencing: Elimination of Polybrene
William A. Burkhart, Mary B. Moyer, Wanda M. Bodnar, Anita M. Everson, Violeta G. Valladares, and Jerome M. Bailey

Minimizing N-to-O Shift in Edman Sequencing
William H. Vensel and George E. Tarr

The Hydrolysis Process and the Quality of Amino Acid Analysis: ABRF-94AAA Collaborative Trial
K. Ümit Yüksel, Thomas T. Andersen, Izydor Apostol, Jay W. Fox, Raymond J. Paxton, and Daniel J. Strydom

A New Reagent for Cleaving at Cystine Residues
C. Mitchell, L. Hinman, L. Miller, and P. C. Andrews

Protein Sequence Analysis Using Microbore PTH Separations
Michael F. Rohde, Christi Clogston, Lee Anne Merewether, Patricia Derby, and Kerry D. Nugent

Assignment of Cysteine and Tryptophan Residues during Protein Sequencing: Results of ABRF-94SEQ
Jay Gambee, Philip C. Andrews, Karen DeJongh, Greg Grant, Barbara Merrill, Sheenah Mische, and John Rush

Automated C-Terminal Protein Sequence Analysis Using the Hewlett-Packard G1009A C-Terminal Protein Sequencing System
Chad G. Miller, David H. Hawke, Jacqueline Tso, and Sherrell Early

Applications Using an Alkylation Method for Carboxy-Terminal Protein Sequencing
MeriLisa Bozzini, Jindong Zhao, Pau-Miau Yuan, Doreen Ciolek, Yu-Ching Pan, John Norton, Daniel R. Marshak, and Victoria L. Boyd

C-Terminal Sequence Analysis of Polypeptides Containing C-Terminal Proline
Jerome M. Bailey, Oanh Tu, Gilbert Issai, and John E. Shively
Section IV Peptide and Protein Separations and Other Methods

High Sensitivity Detection of Tryptic Digests Using Derivatization and Fluorescence Detection
Steven A. Cohen, Igor Mechnikov, and Patricia Young

Reagents for Rapid Reduction of Disulfide Bonds in Proteins
Rajeeva Singh and George M. Whitesides

Strategies for the Removal of Ionic and Nonionic Detergents from Protein and Peptide Mixtures for On- and Off-Line Liquid Chromatography Mass Spectrometry (LCMS)
Kristine M. Swiderek, Michael L. Klein, Stanley A. Hefta, and John E. Shively

Online Preparation of Complex Biological Samples prior to Analysis by HPLC, LC/MS, and/or Protein Sequencing
Ken Stoney and Kerry Nugent

Methods for the Purification and Characterization of Calcium-Binding Proteins from Retina
Arthur S. Polans, Krzysztof Palczewski, Wojciech A. Gorczyca, and John W. Crabb

Evidence for the Presence of α-Bungarotoxin in Venom-Derived κ-Bungarotoxin
James J. Fiordalisi and Gregory A. Grant

Progress in the Development of Solvent and Chromatography Systems Appropriate for Bitopic Membrane Proteins
Song-Jae Kil, Lisa M. Oleksa, Geoffrey C. Landis, and Charles R. Sanders II

Rapid Separation of Proteins and Peptides Using Conventional Silica-Based Supports: Identification of 2-D Gel Proteins following In-Gel Proteolysis
Robert L. Moritz, James Eddes, Hong Ji, Gavin E. Reid, and Richard J. Simpson
Section V Mutagenesis and Protein Design

Studying α-Helix and β-Sheet Formation in Small Proteins
Catherine K. Smith, Mary Munson, and Lynne Regan

Circular Permutation of RNase T1 through PCR Based Site-Directed Mutagenesis
Jane M. Kuo, Leisha S. Mullins, James B. Garrett, and Frank M. Raushel

E. coli-Expressed Human Neurotrophin-3: Characterization of a C-Terminal Extended Product
John O. Hui, Shi-Yuan Meng, Vishwanatham Katta, Larry Tsai, Michael F. Rohde, and Mitsuru Haniu

Spectral Enhancement of Recombinant Proteins with Tryptophan Analogs: The Soluble Domain of Human Tissue Factor
C. A. Hasselbacher, R. Rusinova, E. Rusinova, and J. B. Alexander Ross

Utilization of Partial Reactions, Side Reactions, and Chemical Rescue to Analyze Site-Directed Mutants of Ribulose 1,5-Bisphosphate (RuBP) Carboxylase/Oxygenase (Rubisco)
Mark R. Harpel, Engin H. Serpersu, and Fred C. Hartman

Probing the Roles of Conserved Histidine Residues in β-Galactosidase (E. coli) Using Site Directed Mutagenesis and Transition State Analog Inhibition
Nathan J. Roth, Katherine Y. N. Wong, and Reuben E. Huber
Section VI Analysis of Protein Interactions

Rapid in Vitro Assembly of Class I Major Histocompatibility Complex
Nicholas J. Papadopoulos, James C. Sacchettini, Stanley G. Nathenson, and Ruth Hogue Angeletti

Peptide Models of bZIP Proteins: Quantitative Analysis of DNA Affinity and Specificity
Steven J. Metallo and Alanna Schepartz

Applying Affinity Coelectrophoresis to the Study of Nonspecific, DNA Binding Peptides
Michael L. Nedved and Gregory R. Moe

Investigating Calmodulin-Target Sequence Interactions Using Mutant Proteins and Synthetic Target Peptides
Wendy A. Findlay, Stephen R. Martin, and Peter M. Bayley

Interactions of Bacterial Cell-Surface Proteins with Antibodies: A Versatile Set of Protein-Protein Interactions
Gordon C. K. Roberts, Lu-Yun Lian, Igor L. Barsukov, Jeremy P. Derrick, Koichi Kato, and Yoji Arata

Studies of Cytokine-Cytokine Receptor Interactions: Influence of Ligand Dimerization
Larry D. Ward, Geoffrey J. Howlett, Robert L. Moritz, Annet Hammacher, Kiyoshi Yasukawa, and Richard J. Simpson

New High Sensitivity Sedimentation Methods: Application to the Analysis of the Assembly of Bacteriophage P22
Walter F. Stafford III, Sen Liu, and Peter E. Prevelige, Jr.
Section VII Protein Conformation and Folding

Cyanogen as a Conformational Probe
Richard A. Day, Amy Hignite, and Warren E. Gooden

Evaluation of Interactions between Residues in α-Helices by Exhaustive Conformational Search
Trevor P. Creamer, Rajgopal Srinivasan, and George D. Rose

Design, Synthesis, and Characterization of a Water-Soluble β-Sheet Peptide
David S. Wishart, Les H. Kondejewski, Paul D. Semchuk, Cyril M. Kay, Robert S. Hodges, and Brian D. Sykes

Automated Analysis of Protein Folding
Richard A. Smith, Jack Henkin, and Thomas F. Holzman

HSP70 Protein Complexes: Their Characterization by Size-Exclusion HPLC
Daniel R. Palleros, Li Shi, Katherine Reid, and Anthony Fink

Methods for Collecting and Analyzing Attenuated Total Reflectance FTIR Spectra of Proteins in Solution
Keith A. Oberg and Anthony L. Fink
Section VIII NMR Analysis of Peptides and Proteins

¹⁹F NMR Studies of Fluorinated Sugars Binding to the Glucose and Galactose Receptor
Linda A. Luck

Heteronuclear Gradient-Enhanced NMR for the Study of 20-30 kDa Proteins: Application to Human Carbonic Anhydrase II
Ronald A. Venters and Leonard D. Spicer

Toward the Solution Structure of Large (>30 kDa) Proteins and Macromolecular Complexes
Cheryl H. Arrowsmith, Weontae Lee, Matthew Revington, Toshio Yamazaki, and Lewis E. Kay

Solution Structures of Horse Ferro- and Ferricytochrome c Using 2D and 3D ¹H NMR and Restrained Simulated Annealing
Phoebe X. Qi, Ernesto J. Fuentes, Robert A. Beckman, Deena L. Di Stefano, and A. Joshua Wand

NMR Relaxation Methods to Study Ligand-Receptor Interactions
David W. Hoyt, Jian-Jun Wang, and Brian D. Sykes
Section IX Peptide Synthesis

Application of 2-Chlorotrityl Resin: Simultaneous Synthesis of Peptides Which Differ in the C-Termini
Anita L. Hong, Tin T. Le, and Trung Phan

Correlation of Cleavage Techniques with Side-Reactions following Solid-Phase Peptide Synthesis
Gregg B. Fields, Ruth H. Angeletti, Lynda F. Bonewald, William T. Moore, Alan J. Smith, John T. Stults, and Lynn C. Williams

Protein Synthesis on a Solid Support Using Fragment Condensation
Siegfried Brandtner and Christian Griesinger

Characterization of a Side Reaction Using Stepwise Detection in Peptide Synthesis with Fmoc Chemistry
Yan Yang, William V. Sweeney, Susanna Thornqvist, Klaus Schneider, Brian T. Chait, and James P. Tam

Erratum

High Sensitivity Peptide Sequence Analysis Using in Situ Proteolysis on High Retention PVDF Membranes and a Biphasic Reaction Column Sequencer
Sandra Best, David F. Reim, Jacek Mozdzanowski, and David W. Speicher

Index
Foreword
I express my sincere thanks to John W. Crabb for editing this outstanding volume of Techniques in Protein Chemistry. Last year John edited Volume V, which was lauded as a useful "bench-top" manual. Volume VI promises timely information for many scientists. Organizing the topics and contacting the potential authors of papers for each volume are difficult tasks, but crucial for the success of the book. John and his associates were diligent; they collected an impressive array of papers on diverse subjects of interest to the broad membership of the Society. Most of the papers in Volume VI were obtained from the poster sessions as has been our custom. The Eighth Symposium of the Protein Society, held in San Diego July 9-13, 1994, was very well attended, with almost 1200 scientists and 630 exhibitors. Six hundred and fifty posters were presented and 66 companies displayed their products. We expect next year's symposium to be as successful. The ABRF annual meeting will be held in Boston, Massachusetts, July 7-8, 1995, immediately followed by the Ninth Symposium of the Protein Society (July 8-12). The next volume of Techniques, to be edited by Daniel R. Marshak, will be based on the scientific content of these meetings.
Joseph J. Villafranca President The Protein Society
Preface
As in previous volumes of this series, Techniques in Protein Chemistry VI highlights current methods in peptide and protein chemistry. Contributions were selected from over 650 abstracts submitted for presentations at the eighth annual symposium of the Protein Society held in San Diego in July 1994. The authorship is international, with contributions from Australia, Canada, England, Germany, Israel, Japan, and the United States. Chapters focus on mass spectrometry, sequence and amino acid analysis, separations, protein folding and conformation, peptide and protein NMR, and peptide synthesis. In addition, the mutagenesis and protein design section has been expanded and a new section addresses analysis of protein interactions. Very special thanks are due the associate editors who have helped make both Techniques V and VI high-quality resource books. This year's associate editors, all of whom assisted with last year's volume, include Richard M. Caprioli, Gerald M. Carlson, Steven A. Carr, Gregory A. Grant, Michael F. Rohde, David W. Speicher, Leonard D. Spicer, and Kenneth R. Williams. I am also grateful for contributions to the editorial process from William Seifert and Dan Marshak and for the timely assistance of Shirley Light of Academic Press and my secretary, Valerie Oliver. Essential support was provided by the president of the Protein Society (Joe Villafranca) and the secretary/treasurer (George Rose). Finally, I thank all the authors for their cooperation in meeting deadlines and providing their up-to-date methodology.
John W. Crabb
Acknowledgments

The Protein Society acknowledges with thanks the following organizations who through their support of the Society's program goals contributed in a meaningful way to the eighth annual symposium and thus to this volume.

Amgen Inc.
Applied Biosystems Inc., Division of Perkin-Elmer
Autodesk Inc.
Beckman Instruments, Inc.
Biosym Technologies, Inc.
Bristol-Myers Squibb
Brookhaven National Laboratory
Dionex Corporation
Du Pont Merck Pharmaceutical Co.
Finnigan MAT
Fisons Instruments
Hewlett-Packard Co.
Merck Sharp & Dohme Research Laboratories
Michrom BioResources, Inc.
Millipore Corporation
Oxford Molecular, Inc.
PerSeptive Biosystems, Inc.
Pharmacia Biotechnology, Inc.
Pickering Laboratories, Inc.
Promega Corporation
Rainin Instrument Co.
Supelco, Inc.
Vestec Corporation
VYDAC
SECTION I Mass Spectrometry of Peptides and Proteins
The Use of a Volatile N-Terminal Degradation Reagent for Rapid, High-Sensitivity Sequence Analysis of Peptides by Generation of Sequence Ladders

M. Bartlet-Jones, W.A. Jeffery, H.F. Hansen, and D.J.C. Pappin
Protein Sequencing Laboratory, Imperial Cancer Research Fund, PO Box 123, Lincoln's Inn Fields, London WC2A 3PX, UK

I. Introduction

For more than 25 years, automated Edman chemistry [1,2] has remained the favoured method for routine protein sequence analysis. Several limitations, however, have never been overcome. The procedure is inherently slow and does not allow direct identification of many posttranslational modifications. In addition, current detection limits are only at the level of hundreds of femtomoles [3]. Large-format 2D-electrophoresis systems now make it possible to resolve several thousand proteins from whole-cell lysates in the low- to upper-femtomole range [4,5], and more versatile and sensitive methods of protein sequencing are needed to meet analytical problems of this scale.

The recent introduction of matrix-assisted laser desorption/ionisation (MALDI) time-of-flight mass spectrometers [6] has led to the rapid analysis (at high sensitivity) of peptide mixtures. Using this technology, Chait et al. [7] developed a novel sequencing strategy involving the production of a nested set of peptides, each peptide differing from its precursor by the loss of one amino acid. The peptide mixtures were analysed by MALDI time-of-flight mass spectrometry, with the generated 'ladder' sequence read directly from the mass spectrum in a single step. The nested set of peptides was generated using phenylisothiocyanate (PITC) in the presence of a small molar ratio of phenylisocyanate (PIC). Repeated reaction cycles generated a ragged-end set of peptides terminated with non-cleavable phenylureas. The potential advantages of this approach were the speed of analysis (minutes), femtomole detection limits and the ability to process samples in parallel.
TECHNIQUES IN PROTEIN CHEMISTRY VI Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.
The described PITC:PIC procedure was essentially derived from early, manual degradation protocols [8,9]. The need to remove excess PITC, buffer and thiourea by-products by repeated solvent extraction remains a major cause of peptide loss and limits the manual processing of large numbers of samples. In order to achieve the routine sequencing of peptides at the low picomole and femtomole levels it is desirable to keep the number of manipulations to a minimum. Any transfer or extraction of sample may lead to loss of peptide. It is also necessary to eliminate the accumulation of contaminating by-products which cause suppression of sample signal. This led us to the working hypothesis that all reagents and by-products of the chemistry should be volatile in order to minimise work-up procedures which result in peptide loss. For this purpose we synthesised a novel, volatile isothiocyanate (trifluoroethyl isothiocyanate; TFEITC) that demonstrated the necessary characteristics.

II. Materials and Methods

Reactions were performed in Hewlett Packard 100 μl glass minivials with 8 mm cap seals. Vacuum systems comprised 2-stage Edwards E2M8 pumps coupled to dual-trap assemblies composed of -70°C refrigerated solvent traps (V.A. Howe Ltd.) and ACE no. 15 solvent traps cooled with dry ice. Separate, independent vacuum systems and desiccators were used for removal of acid or base. Synthetic peptide CD28-3PY (phosphorylated) was a gift from Miss N. O'Reilly and Miss E. Li (ICRF). Human synthetic [Glu1]-fibrinopeptide B was purchased from Sigma Chemical Co. and 12.5% w/v aqueous trimethylamine (protein sequencing grade) from Applied Biosystems. Toluenesulphonic acid monohydrate, alpha-cyano-4-hydroxycinnamic acid and trifluoroethanol (99%+) were obtained from Aldrich; trifluoroacetic acid (HPLC/Spectro grade) and heptafluorobutyric acid (anhydrous) from Pierce. The TFA was redistilled from 2-aminoethanethiol (1 g/L). The HFBA was used as received.
Stock solutions of trimethylammonium bicarbonate buffer were prepared by bubbling dry carbon dioxide into 12.5% w/v aq. trimethylamine until the required pH (8.5) was reached. These stock solutions were stable (weeks) if stored at 0-4°C in a sealed container under nitrogen. Sequencing coupling buffer was prepared from trifluoroethanol : water : 12.5% trimethylammonium bicarbonate pH 8.5 (5:4:1 v/v). Trifluoroethylisothiocyanate (TFEITC; I) and N-hydroxysuccinimidyl-6-trimethylammoniumhexanoate (a C-5 linker quaternary-NHS ester; II) were synthesised in these laboratories [10]. The TFEITC reagent was generally used and stored as a 10% v/v solution in acetonitrile (stable for weeks at room temperature).
[Chemical structures: (I) trifluoroethyl isothiocyanate (TFEITC), F3C-CH2-NCS; (II) the C-5 linker quaternary-NHS ester.]
A. Ladder Generation

Peptides were dissolved in coupling buffer in sufficient volume to give n+1 aliquots of 2.5 μl, where n is the number of required cycles (i.e. a peptide requiring sequencing through 5 cycles was dissolved in 15 μl). An aliquot of 2.5 μl was pipetted into a minivial and a further 2.5 μl of 10% v/v TFEITC in acetonitrile immediately added. The vial was sealed, heated at 80°C for 5 min, then diluted with 5 μl of water. Solvent, excess reagent and coupling base were evaporated under vacuum for 10 min (4×10⁻² mbar) in the presence of toluenesulphonic acid and phosphorus pentoxide. Cleavage was initiated by the addition of 2.5 μl of either trifluoroacetic acid (TFA) or heptafluorobutyric acid (HFBA), sealing the vial and again heating to 80°C for 5 min. Acid was removed in vacuo (4×10⁻² mbar for 10 min) in a separate vacuum system using sodium hydroxide. This procedure was repeated n times. Following the final cleavage step, the last aliquot of peptide was added, followed by the addition of 5 μl of water, and the solution dried under high vacuum (base vacuum system) for a minimum of 1 hour.

B. Derivatisation with quaternary amines

Peptide (20-50 fmol in 0.5 μl) was pipetted onto a target slide and allowed to air-dry for 5 minutes. The slide was then cooled on ice and 0.5 μl of a pre-cooled, freshly prepared solution of 0.25% w/v C-5 quaternary N-hydroxysuccinimide ester dissolved in 12% w/v trimethylammonium bicarbonate (pH 8.5) was added and left over ice for a further 10 min. The slide was then allowed to warm to room temperature, dried under high vacuum (15 min) and matrix added for MS analysis as described below.

C. MS analysis of samples

Peptide samples or ladder mixtures were dissolved in 50% aq. acetonitrile/0.1% TFA (3-5 μl) and sonicated for 5 min to aid solvation. Small aliquots (0.3-0.5 μl) were pipetted onto a target slide and allowed to air-dry (approximately 5 min). One 0.3 μl aliquot of matrix solution
(1% w/v alpha-cyano-4-hydroxycinnamic acid in 50% aq. acetonitrile/0.1% v/v TFA) was finally applied to the dried sample and again allowed to air-dry. Peptide spectra were obtained using a Finnigan MAT LaserMat mass spectrometer essentially as described by Mock et al. [11].

III. Results and Discussion

In order to maintain adequate repetitive sequencing yields, reagent coupling and cleavage kinetics should be similar to those of PITC. Short-chain alkyl isothiocyanates (e.g. methyl or ethyl isothiocyanate) are characterised by coupling rates some 5-10 times slower than PITC [9]. In the case of the TFEITC reagent, the strong electron-withdrawing effect of the trifluoromethyl group enhances nucleophilic attack, giving reaction kinetics only 50% slower than PITC. The single methylene spacer between the trifluoromethyl and the isothiocyanate group also enhances the cleavage kinetics, making cleavage faster than with PITC. Finally, the low boiling point of this reagent (approx. 94°C) allows rapid removal under moderate vacuum.

In the procedure described in this manuscript, the nested set of peptides was generated simply by adding fresh peptide to each cycle and driving both the coupling and cleavage chemistry to completion. No additional reagents were required to act as chain terminators. The process is summarised in Figure 1 and as follows:

Figure 1: Flow-diagram of TFEITC degradation protocol. [Diagram not reproduced. Recoverable steps: add peptide in coupling buffer; add volatile isothiocyanate (TFEITC); couple (5 min, 80°C); dry under vacuum (10 min); second heating step (5 min, 80°C); dry under vacuum (10 min).]
Figure 2: A ladder sequence through 6 cycles of a synthetic peptide (CD28-3PY) carried out on a total of 17.5 picomoles of starting peptide. The peptide is especially interesting in that it contains both a proline and phosphorylated tyrosine residue. Both residues undergo the sequencing chemistry satisfactorily. [Spectrum not reproduced; x-axis: mass (m/z), approximately 900-2600.]
Figure 3: A ladder sequence through 6 cycles of [Glu1] fibrinopeptide B carried out on a total of 17.5 picomoles of starting peptide. [Spectrum not reproduced; x-axis: mass (m/z), approximately 950-1700.]
In cycle 1, all added peptide is shortened by one residue to give peptide-1. In cycle 2, this peptide quantitatively loses a second residue to become peptide-2 and the freshly added peptide becomes peptide-1. This process is repeated for the required number of cycles. The main practical requirement is that the starting peptide is dissolved in a defined volume of coupling buffer, dependent only on the number of cycles required and the volume applied per cycle. As all excess reagent, buffer and volatile reaction by-products are removed under vacuum, there are no extractive or transfer losses. It is important that all coupling base is removed after each coupling step so that salt formation with the cleavage acid is minimised. Residual trimethylamine or the conjugate TFA salt can cause significant suppression of signal in the mass spectrometer. For this reason two independent vacuum systems containing suitable trapping agents were used for the removal of acid or base. Results obtained on a number of peptide samples are shown in Figures 2-4. Ladder sequences of 5-6 residues were successfully generated on test peptides starting with as little as 600 fmol of material. One example peptide (Fig. 2) is of particular interest in that an internal phosphotyrosine residue was identified using only 17.5 picomoles of starting peptide. There was no evidence of loss of phosphate from the modified residue. This and other examples (Figs. 3 and 4) demonstrate practical ladder sequencing at levels between 10 and 200 times more sensitive than reported in the initial work of Chait et al. [7]. Proline and hydroxy-amino acids (often problematic for the standard Edman chemistry) have presented few problems with the TFEITC reagent (an example of sequence through proline is shown in Figure 2). In some trial experiments (not shown) up to two dozen samples were processed simultaneously. 
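The bookkeeping behind this nested set can be illustrated with a minimal sketch. It assumes ideal chemistry (every coupling and cleavage goes to completion) and ignores side-chain modifications such as the TFEITC labelling of lysine; the function names and the example peptide string are hypothetical and are not taken from the original work.

```python
def ladder_volume_ul(n_cycles, aliquot_ul=2.5):
    """Coupling-buffer volume needed: n + 1 aliquots of `aliquot_ul` each."""
    return (n_cycles + 1) * aliquot_ul


def generate_ladder(peptide, n_cycles):
    """Simulate the add-fresh-peptide protocol under ideal (complete) chemistry.

    In each cycle one aliquot of intact peptide is added, then every species
    in the vial loses its N-terminal residue. After the final cleavage one
    last aliquot of intact peptide is added and left uncleaved.
    """
    if n_cycles >= len(peptide):
        raise ValueError("n_cycles must be smaller than the peptide length")
    vial = []
    for _ in range(n_cycles):
        vial.append(peptide)              # fresh aliquot of starting peptide
        vial = [p[1:] for p in vial]      # quantitative loss of one residue
    vial.append(peptide)                  # final aliquot, not degraded
    return sorted(vial, key=len, reverse=True)


if __name__ == "__main__":
    print(ladder_volume_ul(5))                  # 15.0
    print(generate_ladder("ACDEFGHIK", 3))      # 4 species, full length down to -3
```

For n cycles the vial ends up holding n + 1 species (the intact peptide plus peptide-1 to peptide-n), which is why n + 1 aliquots of starting material are required.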
One advantage of generating a sequence without the use of chain-terminating reagents is that the terminal amino group is retained. In contrast to the Chait procedure, where all positive charges may be replaced by neutral phenylureas or thioureas, retention of the N-terminal primary amino group improves the ionisation efficiency of the resultant peptides. Other problems associated with the PITC:PIC procedure include the labelling of internal lysine residues with both isothiocyanate and isocyanate groups (yielding products of different mass) and side-reactions associated with N-terminal phenylureas. Retention of the terminal amine also allows the potential for further modification of the peptide with sensitivity-enhancing molecules such as the C-5 alkyl quaternary ammonium activated ester developed in this laboratory (Figure 5). N-terminal derivatisation of peptides with 'fixed-charge' compounds including quaternary amines, phosphonium and pyridinium ions has already been shown to simplify the daughter-ion spectra of peptides produced by high-energy collision-induced fragmentation [12,13]. It is possible that the activated quaternary NHS esters reported here may be useful for facile modification of peptides to aid interpretation of CAD spectra for sequence analysis by such MS/MS techniques.

Figure 4: A ladder sequence through 5 cycles of [Glu1] fibrinopeptide B carried out on a total of 600 femtomoles of starting peptide.

Figure 5: Quantitative derivatisation of 50 fmol [Glu1] fibrinopeptide B with a C-5 quaternary linked tag. The derivatisation was performed on the peptide sample already spotted onto the target slide. [Spectrum not reproduced; x-axis approximately 1000-4800.]

Table I: Monoisotopic residue masses

Amino acid   Residue mass   Amino acid   Residue mass
Gly           57.02         Asp          115.03
Ala           71.04         Lys          128.09
Ser           87.03         Gln          128.06
Pro           97.05         Glu          129.04
Val           99.07         Met          131.04
Thr          101.05         His          137.06
Cys          103.01         Phe          147.07
Ile          113.08         Arg          156.10
Leu          113.08         Tyr          163.06
Asn          114.04         Trp          186.08
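Reading a ladder amounts to matching the mass difference between adjacent peaks against the residue masses of Table I. The following is a minimal sketch of that lookup, not the authors' software; the function names, the tolerance value and the example peak list are hypothetical. Modified residues (for example TFEITC-labelled lysine or phosphotyrosine) are not modelled.

```python
# Monoisotopic residue masses (Da), as listed in Table I.
RESIDUE_MASS = {
    "Gly": 57.02, "Ala": 71.04, "Ser": 87.03, "Pro": 97.05, "Val": 99.07,
    "Thr": 101.05, "Cys": 103.01, "Ile": 113.08, "Leu": 113.08, "Asn": 114.04,
    "Asp": 115.03, "Lys": 128.09, "Gln": 128.06, "Glu": 129.04, "Met": 131.04,
    "His": 137.06, "Phe": 147.07, "Arg": 156.10, "Tyr": 163.06, "Trp": 186.08,
}


def candidates(delta, tol=0.5):
    """Residues whose mass lies within `tol` Da of a peak-to-peak difference."""
    return sorted(r for r, m in RESIDUE_MASS.items() if abs(m - delta) <= tol)


def read_ladder(peak_masses, tol=0.5):
    """Return, per cycle, the residues consistent with each mass difference.

    Peaks are sorted from heaviest (intact peptide) to lightest, so the
    result runs from the N-terminus inwards.
    """
    ordered = sorted(peak_masses, reverse=True)
    return [candidates(a - b, tol) for a, b in zip(ordered, ordered[1:])]


if __name__ == "__main__":
    # Hypothetical peak list: an arbitrary 1000.00 Da species losing
    # Gly (57.02), then Leu/Ile (113.08), then Gln (128.06).
    peaks = [1000.00, 942.98, 829.90, 701.84]
    for cycle, residues in enumerate(read_ladder(peaks), start=1):
        print(cycle, residues)
```

Returning a list of candidates rather than a single residue keeps ambiguous calls visible: at a 0.5 Da tolerance the second difference matches both Ile and Leu, and the third matches both Gln and Lys.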
The principal practical problems that remain are due to limitations in current instrumentation. Residues L and I are isomers and therefore indistinguishable using this procedure. Residues K, Q and E have very similar masses (see Table I), although lysine side chains are modified by the TFEITC reagent and increase in mass by 141 Daltons. With TOF instruments yielding mass resolution below approximately 300, acidic residues and their corresponding amides (E and Q, D and N) are only resolved by repeated mass analysis following chemical modification (e.g. esterification). Such limitations are entirely instrument-related, and not relevant to the demonstration of the degradation chemistry reported here. Future developments in instrumentation (particularly with respect to resolution) are required to overcome these limitations.

IV. Summary

The primary aim of this work was to explore sequencing strategies capable of rapid analysis of proteins, possibly recovered from 2D-electrophoresis gels. For this purpose, the chemistry needed to be adaptable to multiple samples and sensitive enough to work in the femtomole range. The described TFEITC chemistry is showing early signs of meeting these criteria. The demonstration, on a low-picomole scale, that a phosphorylated tyrosine residue could be directly identified makes this a potentially powerful tool for the identification of this and other sites of post-translational modification. The inherent simplicity of the process should also allow for easy automation to permit rapid processing of samples in parallel.
Acknowledgments

This work was supported by the ICRF. Some aspects of the work were presented at the 42nd ASMS Conference on Mass Spectrometry and Allied Topics, May 29-June 3 1994, Chicago, IL.

References

1) Edman, P. and Begg, G. (1967) Eur. J. Biochem. 1, 80-91
2) Hewick, R.M. et al. (1981) J. Biol. Chem. 256, 7990-7997
3) Totty, N.F. et al. (1992) Protein Sci. 1, 1215-1224
4) O'Farrell, P. (1975) J. Biol. Chem. 250, 4007-4021
5) Patton, W.F. et al. (1990) Biotechniques 8, 518-527
6) Karas, M. and Hillenkamp, F. (1988) Anal. Chem. 60, 2299-2301
7) Chait, B.T. et al. (1993) Science 262, 89-92
8) Tarr, G.E. (1986) in Methods of Protein Microcharacterization (J.E. Shively, Ed.), pp. 155-194, Humana Press
9) Tarr, G.E. (1977) Methods Enzymol. 47, 335-357
10) Bartlet-Jones, M. et al. (1994) Rapid Commun. Mass Spectrom., in press
11) Mock, K.K. et al. (1992) Rapid Commun. Mass Spectrom. 6, 233-238
12) Vath, J.E. and Biemann, K. (1990) Int. J. Mass Spectrom. and Ion Processes 100, 287-299
13) Stults, J.T. et al. (1993) Anal. Chem. 65, 1703-1708
Investigation of Polyethylene Membranes as Potential Sample Supports for Linking SDS-PAGE with MALDI-TOF MS for the Mass Measurement of Proteins
James A. Blackledge and Anthony J. Alexander
Bristol-Myers Squibb, Analytical Research & Development, Pharmaceutical Research Institute, P.O. Box 4755, Syracuse, NY 13221-4755
I. Introduction
Sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) is a classic technique used for the separation and molecular weight (MW) determination of biomolecules. Unfortunately, the technique gives only a rough estimate of MW, with errors of ±5-10% being typical. Additionally, it can be subject to systematic errors if the species under investigation has different electrophoretic migration behavior than the MW markers. Matrix assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) routinely gives MW values with an accuracy of ±0.1% or better, and has become increasingly popular for the mass measurement of biopolymers [1]. The technique is simple, rugged, has a mass range in excess of 200,000 Da, and is extremely sensitive, requiring low nanomole to picomole amounts of material. Additionally, the technique is relatively insensitive to the presence of various salts and buffers that are often associated with the isolation of biomolecules.

As the separated protein bands in an SDS-PAGE gel are typically transferred (electroblotted) onto a supporting membrane prior to further analysis, most of the effort to combine the resolving power of SDS-PAGE with the mass accuracy of MALDI-MS has focused on desorbing proteins directly from the transfer membrane. Typical membranes currently employed are nitrocellulose [2], nylon [3], and polyvinylidene difluoride (PVDF) [4,5]. While these have been well developed for the purpose of electroblotting protein bands, their use as sample supports for MALDI-MS is still somewhat problematic. With a UV laser, useful MALDI mass spectra have so far only been obtained for low MW peptides and proteins (<24,000 Da) [3,4]. While these results are encouraging, this effective limitation on mass range clearly does not permit full exploitation of the mass measuring potential of TOF-MS.
Electrospray ionization (ESI) also operates very effectively over this mass range, and methods for extracting SDS-PAGE protein bands for subsequent ESI-MS analysis have been reported in the literature [6]. Access to higher molecular weights, up to bovine serum albumin (BSA) at 66,430 Da, has been achieved with PVDF by employing an IR laser for desorption and ionization [5]. However, the use of IR lasers is undesirable, as their high shot-to-shot variability in output power results in signal irreproducibility. Additionally, due to the greater abrading power of the IR laser, only two to three spectra can typically be acquired from the same sample spot.

We report here preliminary investigations into the application of a novel membrane material, a porous polyethylene (PE), for use as a surface to link SDS-PAGE with MALDI-TOF MS. Using this material, good quality spectra, in many cases rivaling those of standard sample preparations, can be acquired over a wide mass range using a standard nitrogen UV laser. Additionally, spectra acquired from samples immobilized on PE membrane exhibit interesting qualitative differences compared to those acquired from stainless steel sample stages.
II. Materials and Methods
MALDI mass spectra were acquired on a Bruker Reflex mass spectrometer equipped with a 337 nm nitrogen laser and a multiple sample stage source. The spectra were acquired in linear mode, represent the sum of 20 laser shots, and are unsmoothed. All ions were desorbed at a laser power just above threshold, at an ion extraction voltage of 30 kV. The matrix used was a saturated solution of 3,5-dimethoxy-4-hydroxycinnamic acid (sinapinic acid, Aldrich Chemicals) in a 1:1 water/acetonitrile solution. Low-mass matrix ions were deflected by the application of a voltage pulse.

Five millimeter diameter disks of a porous polyethylene membrane, used as a window material in disposable FT-IR cards manufactured by 3M Corp. (Fisher Cat. # 14385-861), were prepared by pre-wetting with 2 µl of methanol. A 1 µl aliquot of the aqueous protein solution was then added, and the membrane was allowed to air dry at room temperature before addition of 2 µl of matrix solution and final air drying. For those samples that were washed, after the protein solution had dried the membrane spot was vortexed in an aqueous 50% methanol solution for 30 seconds and air dried before addition of the matrix solution. Standard sample stages (stainless steel) were masked and sprayed with an aerosol adhesive. After evaporation of residual solvent, the membrane spots were affixed with gentle pressure.

All protein samples were acquired from Sigma Chemical Company (St. Louis, MO) and were used without further purification. External mass calibration was used, with peak centroiding at the 80% level, except where otherwise indicated. Sample concentrations are expressed as pmol/mm² of crystallized material, in order to account for the larger spread of the sample spots on PE (~4 mm) compared to the steel stages (~1.5 mm).
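The surface loading convention described above can be sketched as follows. This is a minimal illustration that assumes circular, uniformly covered spots with the approximate diameters quoted in the text; the 10 pmol amount is a hypothetical example, not a value from this work.

```python
import math


def spot_area_mm2(diameter_mm):
    """Area of a circular sample spot (assumption: the spot is circular)."""
    return math.pi * (diameter_mm / 2.0) ** 2


def surface_loading(amount_pmol, diameter_mm):
    """Surface loading in pmol/mm^2, assuming uniform coverage of the spot."""
    return amount_pmol / spot_area_mm2(diameter_mm)


if __name__ == "__main__":
    amount_pmol = 10.0  # hypothetical amount applied to each support
    for label, diameter in (("PE membrane", 4.0), ("steel stage", 1.5)):
        loading = surface_loading(amount_pmol, diameter)
        print(f"{label}: {spot_area_mm2(diameter):.2f} mm^2, {loading:.2f} pmol/mm^2")
```

Because area scales with the square of the diameter, the same amount of protein is spread roughly seven times more thinly on a 4 mm spot than on a 1.5 mm spot, which is why a per-area unit is the fairer basis for comparison.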
III. Results
Initial results using PVDF (Immobilon P, Millipore Corp., Bedford, MA) membranes were comparable to previously published UV-MALDI results using this material [4,5]; however, they were clearly inferior to results obtained on stainless steel sample stages (data not shown). More importantly for our work, Immobilon P gave extremely poor quality ion signals for high-mass proteins, as did nitrocellulose. Therefore, rather than attempting to adapt popular electroblotting media to the requirements of UV MALDI-MS, we instead chose to seek out a novel support medium (PE) that gave excellent MALDI-MS results, and then subsequently adapt that material to the requirements of electroblotting.
A. Spectral Quality of MALDI Analysis from PE
MALDI spectra that are acquired from PE membranes do not show any discernible loss in spectral quality relative to results obtained from standard preparations. This is illustrated in Figure 1, which compares the spectrum of bovine serum albumin (BSA) acquired from a stainless steel sample stage (top) to the same sample acquired from a PE membrane (bottom). Note that the spectrum acquired from PE has a more intense singly-charged ion relative to the doubly-charged species, a trend which we have observed consistently with a variety of high-mass proteins. The peak width is very similar in both spectra, with the resolution of the singly-charged ion at m/z ~66,400 being about 60 (m/Δm, FWHM). Also, there is no discernible degradation in peak shape which could lead to subsequent centroiding errors in mass assignment. The signal-to-noise (S/N) ratio is very similar in both spectra, indicating that there is no significant loss of sensitivity. Furthermore, excellent spectra could be acquired from anywhere within a large radius on the blot, whereas with standard sample preparations one must often search to find the most suitable sampling spot.

B. Practical Mass Range of MALDI Analysis from PE
The three spectra displayed in Figure 2a-c were acquired from samples blotted onto the PE membrane, and encompass a mass range of 5,734 Da (bovine insulin) to over 133,000 Da (bovine albumin dimer, BAD). The quality of these spectra, particularly for BAD (Figure 2c), rivals that of spectra obtained under standard conditions from metal sample stages. The high ion intensities and low background show the utility of PE as a support medium for UV-MALDI analysis of proteins over a wide range of molecular weights.
C. Washing of Samples Immobilized onto PE
An additional advantage of desorbing proteins directly from binding membranes is the ability to wash bound proteins free of contaminants that typically suppress ionization [3]. Although MALDI is relatively tolerant of the presence of salts and buffers, it is extremely susceptible to the presence of surfactants such as sodium dodecyl sulfate (SDS), with spectral degradation occurring at levels as low as 0.01% [7]. Figure 3 demonstrates the ability of PE membrane to selectively retain the bound protein sample, while washing away SDS at the 0.73% level to the extent that ionization is completely restored.
Figure 1. Comparison of BSA spectra acquired from a stainless steel sample stage (top) and from a PE membrane (bottom); horizontal axis, m/z. Protein concentrations are 1.5 pmol/mm² and 1.0 pmol/mm², respectively.
D. Mass Accuracy and Reproducibility of Samples Immobilized onto PE
Due to alterations in sample position and lag times in ion desorption when membranes are employed for MALDI analysis, the standards used for mass calibration must also be desorbed from the membrane [3,4]. The shot-to-shot reproducibility and accuracy of calibration were investigated by first calibrating the instrument with BSA blotted onto PE, then applying that calibration to other membrane-bound preparations of BSA on the multiple sample stage. Replicate measurements, each the sum of 20 laser shots, typically had a standard deviation of ±98 Da (0.15%) and were routinely within ±26 Da (0.04%) of the accepted mass (n=8).

Typically in MALDI-MS, improved mass accuracy can be achieved by employing internal calibration. However, we have found that at higher masses (> ca. 30,000 Da) this becomes progressively less effective, as ionization of one species often results in near total suppression of the other, or in broad, poorly-resolved ion signals. Sample desorption/ionization from PE does not appear to be as susceptible to such effects. This is demonstrated in Figure 4 by the spectrum of a chimeric monoclonal antibody (BR96) which has been internally mass calibrated by co-addition of BAD. Excellent ion intensity and peak shape are maintained for both the singly- and doubly-charged species of both proteins.
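The percentages quoted for reproducibility and accuracy follow directly from the BSA mass given in the Introduction (66,430 Da). A short check of that arithmetic, offered only as an illustration, is:

```python
BSA_MASS_DA = 66430.0  # accepted BSA mass as quoted in the Introduction


def percent_of_mass(deviation_da, mass_da=BSA_MASS_DA):
    """Express a mass deviation in Da as a percentage of the reference mass."""
    return 100.0 * deviation_da / mass_da


if __name__ == "__main__":
    for label, deviation in (("standard deviation", 98.0), ("accuracy", 26.0)):
        print(f"{label}: {deviation:.0f} Da = {percent_of_mass(deviation):.3f}%")
```

Rounded to two decimal places, these values reproduce the 0.15% and 0.04% figures given above.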
Figure 2. Utility of PE membrane over a large mass range. Spectra of bovine insulin (0.4 pmol/mm²), horse heart myoglobin (0.4 pmol/mm²), and bovine serum albumin dimer (1 pmol/mm²) are displayed from top to bottom (panels a-c; horizontal axis, m/z). All spectra are the summation of 20 shots and are unsmoothed.
Figure 3. Washing of protein bound to PE membrane (horizontal axis, m/z). The top spectrum is of control BSA applied to PE membrane. The middle spectrum is BSA in 0.73% SDS immobilized on PE membrane. The bottom spectrum is BSA in 0.73% SDS immobilized on PE membrane, then vortexed in 50% methanol for 30 seconds prior to the addition of matrix. All spectra were acquired at a protein load of 1 pmol/mm².
Figure 4. Internal standardization of high-mass samples (vertical axis, a.i.; horizontal axis, m/z). The spectrum of the chimeric monoclonal antibody, BR96 (0.64 pmol/mm²), was mass calibrated using the singly and doubly charged species of BAD, which was present as an internal standard (0.5 pmol/mm²). Asterisks indicate BAD calibration ions.
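Internal calibration with two calibrant ions, as in Figure 4, can be sketched with the usual linear time-of-flight relation, in which the square root of m/z is taken to be linear in flight time. The sketch below is not the instrument's calibration routine: the flight times are hypothetical, and the calibrant m/z values are assumptions derived from twice the BSA mass quoted in the Introduction.

```python
import math


def fit_two_point(t1, mz1, t2, mz2):
    """Fit sqrt(m/z) = a * t + b through two calibrant ions."""
    a = (math.sqrt(mz2) - math.sqrt(mz1)) / (t2 - t1)
    b = math.sqrt(mz1) - a * t1
    return a, b


def time_to_mz(t, a, b):
    """Convert a flight time to m/z with the fitted constants."""
    return (a * t + b) ** 2


if __name__ == "__main__":
    # Assumed calibrant m/z values for BAD (dimer mass taken as 2 x 66,430 Da).
    mz_doubly, mz_singly = 66431.0, 132861.0
    # Hypothetical flight times in microseconds, for illustration only.
    t_doubly, t_singly = 160.0, 226.0
    a, b = fit_two_point(t_doubly, mz_doubly, t_singly, mz_singly)
    t_unknown = 240.0  # hypothetical flight time of an analyte ion
    print(f"m/z at t = {t_unknown} us: {time_to_mz(t_unknown, a, b):.0f}")
```

Because both calibrant ions are desorbed from the same spot as the analyte, any offset in flight time caused by the membrane affects all ions alike and is absorbed into the fitted constants.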
IV. Conclusion
Spectra acquired from PE membranes are equal to or better in quality than those acquired from metal sample stages under standard sample preparation conditions. The PE membrane provides access to higher molecular weights than the more common transfer membrane materials (PVDF, nylon, and nitrocellulose). This permits the mass analysis of the large proteins for which MALDI-TOF MS is ideally suited. Mass accuracy and reproducibility approach those obtained with standard sample preparations. Furthermore, the use of PE reduces the severe ion suppression effects typically observed in the MALDI analysis of high-mass mixtures. This also permits more accurate mass measurements to be made via the use of internal calibration. While it remains to be shown that proteins can be desorbed from PE membranes following the electrotransfer of bands from SDS-PAGE gels, results to date are very encouraging.
References
1. Hillenkamp, F., Karas, M., Beavis, R.C., and Chait, B.T.; Anal. Chem. 63, 1193A-1203A (1991).
2. Klarskov, K. and Roepstorff, P.; Biol. Mass Spectrom. 22, 433-440 (1993).
3. Zaluzek, E.J., Gage, D.A., Allison, J., and Watson, J.T.; J. Amer. Soc. Mass Spectrom. 5 (1994).
4. Vestling, M.M. and Fenselau, C.; Anal. Chem. 66, 471-477 (1994).
5. Strupat, K., Karas, M., Hillenkamp, F., Eckerskorn, C., and Lottspeich, F.; Anal. Chem. 66, 464-470 (1994).
6. Le Maire, M., Deschamps, S., Moller, J.V., Le Caer, J.P., and Rossier, J.; Anal. Biochem. 214, 50-57 (1993).
7. Mock, K.K., Sutton, C.W., and Cottrell, J.S.; Rapid Commun. Mass Spectrom. 6, 233-238 (1992).
Comparison of ESI-MS, LSIMS, and MALDI-TOF-MS for the Primary Structure Analysis of a Monoclonal Antibody
Leticia Cano, Kristine M. Swiderek, and John E. Shively
Division of Immunology, Beckman Research Institute, City of Hope, Duarte, CA 91010

Abbreviations: HPLC, high performance liquid chromatography; LSIMS, liquid secondary ion mass spectrometry; MALDI-TOF, matrix assisted laser desorption ionization/time of flight; ESI, electrospray ionization; PVP, polyvinylpyrrolidone; TFA, trifluoroacetic acid.
I. Introduction
Confirmation or verification of a known protein sequence is a common task in protein structural analysis. Examples include recombinant proteins whose sequence must be verified to demonstrate that the correct product was made, and proteins that have been isolated from natural sources and for which the cDNA sequence is known. Monoclonal antibodies belong to the latter class. Antibodies often are cloned and converted to a variety of engineered constructs for use in in vivo imaging and therapy. In these applications it is absolutely essential that the protein chemist confirm that the protein sequence and the cDNA-predicted sequence agree before investing in costly and time-consuming antibody engineering projects. In our lab we have been interested in anti-carcinoembryonic antigen (CEA) antibodies, which have excellent tumor targeting properties for imaging and therapy of solid tumors of the colon, lung, and breast.

Antibodies of the IgG class are composed of heavy and light chains of mass 50 kDa and 25 kDa, respectively. Each contains disulfide bonds which must be reduced and alkylated in order to obtain complete peptide maps and structural information. While it is relatively easy to isolate and map light chains, the heavy chains are often hydrophobic and difficult to analyze. For this reason, we have chosen to separate the chains on an SDS gel (reduction and alkylation are performed prior to loading the samples on the gel), electrotransfer to membrane (nitrocellulose in this case), and perform an on-blot digest with trypsin. Since this approach is commonly used to sequence a large variety of proteins, the application to a monoclonal antibody should be of general interest.

Mass analysis of peptide fragments from a protein of "known sequence" is the method of choice for speed and accuracy. However, at least three major options are available for mass analysis, each varying in its merits. The three most widely used techniques are LSIMS, MALDI-TOF-MS, and ESI-MS. In LSIMS and MALDI, individual peptide fractions are analyzed by mixing with a matrix followed by ionization and mass analysis, while in ESI the samples can be separated on-line by LC, eliminating the need for individual peak collection. In this report we compare each of the techniques for the analysis of the heavy and light chains of the anti-CEA antibody CEA.11 H5 (1).
II. Materials and Methods
The cDNAs for the heavy and light chain variable regions were cloned using consensus PCR primers. The variable sequences were appended to the consensus sequences for murine constant regions of the light chain (kappa isotype) and heavy chain (gamma-1 isotype). The predicted masses for the heavy and light chain tryptic fragments were obtained from the program MacProMass (2).

The anti-CEA antibody CEA.11 H5 (1 nmole) was reduced (1 µl β-mercaptoethanol), S-alkylated (1 ml 4-vinylpyridine), electrophoresed on SDS-PAGE, blotted onto nitrocellulose, and stained with Ponceau S. The bands corresponding to the light and heavy chains (25 kDa and 50 kDa, respectively) were excised, blocked with PVP-360 (0.25% in 10% acetic acid for 20 minutes), and digested with trypsin (2 µg, 37 °C overnight). The protocol is similar to that described by Henzel et al. (3).

A 10% aliquot of the digestion mixture was analyzed on a Vydac C18 250 µm ID fused silica capillary column connected on-line to ESI-MS (4). A linear gradient of 2% B to 92% B in 45 minutes using solvent A (0.1% TFA) and solvent B (90% acetonitrile, 0.07% TFA) was used with a flow rate of 2 µl/min. Sample elution was monitored by UV detection at 200 nm. Mass spectra were recorded in the positive ion mode using a TSQ-700 triple quadrupole instrument (Finnigan-MAT, San Jose, CA) with an electrospray ion source operating at atmospheric pressure. Scans were continuously acquired every three seconds between m/z 500 and 2000 in the centroid mode.

The remainder of the mixture was separated by RP-HPLC on a Vydac C18 530 µm ID fused silica capillary column. Capillary LC was carried out using a model 140B Applied Biosystems HPLC and a Rheodyne injection valve with a 100 µl injection loop. A linear gradient of 2% B to 70% B in 60 minutes using solvent A (0.1% TFA) and solvent B (90% acetonitrile, 0.07% TFA) was used with a flow rate of 20 µl/min. Fractions corresponding to all UV-absorbing peaks were hand collected.
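The two linear gradients described above can be expressed as a simple interpolation. The sketch below is illustrative and assumes that solvent A contains no acetonitrile and that gradient delay volume is ignored; the function names are hypothetical.

```python
def percent_b(t_min, start_b, end_b, duration_min):
    """Percentage of solvent B at time t for a linear gradient."""
    if not 0.0 <= t_min <= duration_min:
        raise ValueError("time lies outside the gradient")
    return start_b + (end_b - start_b) * t_min / duration_min


def percent_acetonitrile(pct_b, acetonitrile_in_b=90.0):
    """Acetonitrile content of the eluent (assumes none in solvent A)."""
    return pct_b * acetonitrile_in_b / 100.0


if __name__ == "__main__":
    # On-line LC/ESI-MS gradient: 2% B to 92% B in 45 minutes.
    for t in (0.0, 22.5, 45.0):
        b = percent_b(t, 2.0, 92.0, 45.0)
        print(f"t = {t:4.1f} min: {b:.1f}% B, {percent_acetonitrile(b):.1f}% acetonitrile")
```

The same functions apply to the preparative capillary run by substituting 2% B to 70% B over 60 minutes.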
Figure 1. Mass spectra of a tryptic peptide from the heavy chain of monoclonal antibody CEA11.H5. A. ESI spectrum showing the doubly charged (655.6) and singly charged (1310.6) ions. B. LSIMS spectrum for HPLC fraction #50 (m/z 1310.1). C. MALDI spectrum for the same fraction (m/z 1310, 2258, 2906, and 4118). An external standard was used for calibration (bovine insulin).
About 1 µl of each fraction was analyzed by LSIMS using a thioglycerol (3-mercapto-1,2-propanediol) matrix. Mass spectra were recorded in the positive ion mode using a TSQ-700 triple quadrupole instrument (Finnigan-MAT, San Jose, CA) equipped with an 8 keV cesium ion gun (Phrasor Scientific, Inc., Duarte, CA). Scans were continuously acquired every seven seconds between m/z 400 and 4000 in the centroid mode.

Approximately 0.5 µl of each fraction was analyzed on a Kratos Kompact III TOF instrument. Samples were prepared using α-cyano-4-hydroxycinnamic acid as matrix. The sample wells were prespotted with matrix dissolved in acetone, dried, and respotted with a 1:1 solution of peptide and matrix dissolved in 30% acetonitrile/0.01% TFA.

Microsequence analysis was performed on samples spotted onto PVDF membranes in a continuous flow reactor and sequenced on a City of Hope-built sequencer (5).
III. Results and Discussion
The sample (1 nmole) was reduced with DTT, S-alkylated with 2-vinylpyridine and run directly on an SDS gel. After electrotransfer to nitrocellulose, the bands were stained, excised, and digested with trypsin according to Henzel et al. (3), with the modification that PVP360 was used in place of PVP40 (Henzel, personal communication). An aliquot was subjected to ESI on LC/MS. The remainder was separated by capillary LC, and the peaks were collected and analyzed by LSIMS and MALDI.

For the heavy chain, over 40 peaks were identified by all three techniques. Figure 1 illustrates typical spectra obtained by each of these methods for one of the peptides from the heavy chain (H78-88, NFLSLQMTSLR). The ESI spectrum (Figure 1A) identifies the peptide at scans 415-417 as doubly charged and singly charged species. The mass corresponds to the expected average mass (1310.56). The LSIMS and MALDI spectra were taken from RP-HPLC fraction #50. The LSIMS spectrum (Figure 1B) shows a predominant peak at mass 1310.1 with traces of other peptides at masses 1855, 2260, and 2909. The MALDI spectrum shows the same peaks, in addition to a peak at 4118, but in different intensities. The intensity differences almost certainly reflect the sample suppression and enhancement effects inherent to the matrix and ionization differences in the two techniques. Since the peak intensities do not necessarily reflect their actual abundance in the sample, no comment can be made on this issue. Although the ESI spectrum is simple, it cannot be directly compared to the LSIMS and MALDI peaks, which were collected from a separate HPLC run.

Each of the tryptic fragments for the heavy and light chains was compared to the predicted masses (see Methods). A correct match was found for 199/214 for the light chain, and 349/446 for the heavy chain. Unidentified peaks were subjected to microsequence analysis. Several of the unidentified peaks corresponded to single amino acid substitutions compared to the consensus sequences for the murine constant regions.

One of the unidentified peaks corresponded to a glycosylated peptide. The heavy chain (gamma-1 isotype) has a single glycosylation site at Asn-293. The expected mass for this peptide (1158.2, without carbohydrate) was not observed in any of the three analyses; however, an unidentified peak of mass 2603.1 was identified as the glycopeptide by microsequence analysis (EEQF-STFR). The blank at cycle 5 corresponds to Asn, predicted to be glycosylated by the recognition sequence Asn-Xxx-Ser/Thr. The ESI spectrum for this peptide (Figure 2A) gives a mass of 2603 (calculated from the doubly charged species at m/z 1302.9). The LSIMS spectrum, although weak, confirms the mass at 2603 (Figure 2B). The glycopeptide was also identified by MALDI (Figure 2C). The mass difference between the unglycosylated and glycosylated peptide, 1445, probably corresponds to GlcNAc(Fuc)GlcNAc(Man)3(GlcNAc)2, the identical glycopeptide we identified in the monoclonal antibody T84.65 (6).

Figure 2. Mass spectra of a glycopeptide from the heavy chain of monoclonal antibody CEA11.H5. A. ESI spectrum showing the glycopeptide doubly charged ion at 1302.9. The peak at 539.7 is an unrelated peak. B. LSIMS spectrum for the glycopeptide (m/z at 2603.1). C. MALDI spectrum for the glycopeptide (m/z at 2606). The external calibrant was bovine insulin (5734).

Several peaks were observed which did not correspond to peptides. These peaks were due to the mandatory blocking agent used to prevent trypsin from adsorbing to the membrane. In our initial trials we used reduced Triton X-100 as the blocking agent as described by Fernandez et al. (7). This agent has been recommended because it has little or no UV absorbance and does not interfere with microsequence analysis. However, this detergent gives an impressive series of ions interfering with ESI analysis, as shown in Figure 3A. The peaks observed from scan 160-230 are all part of the reduced Triton series, as evidenced by a closely spaced series with a mass difference of 44. These peaks efficiently suppressed peptides eluting in the same region of the gradient, and rendered this analysis practically useless. This problem was overcome by using PVP360 (Figure 3B). This high molecular weight polymer elutes late in the gradient, away from all but the most hydrophobic peptides. It should be noted that even PVP40 (3) can cause interference in ESI analysis.

In spite of the problem encountered with reduced Triton X-100 in ESI, some peptides could be analyzed. In the scan region 260-263 on ESI, a peptide peak (m/z 1138, MH₂²⁺) was identified amidst the detergent peaks (Figure 4A). The peptide peak (mass 2274) corresponded to H303-322 (SVSELPIMHQDWLNGKEFK), where the Met is oxidized to the sulfone. The detergent peaks at 994 and 1038, and at 1362 and 1406, can be identified by their inclusion in the expected series, including the mass difference of 44 for each pair. The H303-322 peptide was also observed by LSIMS with a much reduced intensity compared to the detergent peaks (Figure 4B). However, the peak on MALDI had an excellent intensity with little or no interference from the detergent (Figure 4C). This example also points out the possibility of observing oxidized Met residues. In this example, the unoxidized version of the peptide was not observed.
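The mass arithmetic used in this section can be checked with a short sketch. It is an illustration rather than the procedure used here: the atomic weights are rounded standard values, the proton mass is taken as 1.00728 Da, and the glycan is treated as the sum of four N-acetylhexosamine, three hexose, and one deoxyhexose residues, which is our reading of the composition quoted above.

```python
PROTON_DA = 1.00728  # mass of a proton in Da
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def neutral_mass(mz, charge):
    """Neutral mass from the m/z of an [M + zH]z+ ion."""
    return charge * (mz - PROTON_DA)


def formula_mass(c, h, n, o):
    """Average mass of a C/H/N/O formula from rounded atomic weights."""
    w = ATOMIC_WEIGHT
    return c * w["C"] + h * w["H"] + n * w["N"] + o * w["O"]


# Residue formulas (monosaccharide minus water).
HEXNAC = formula_mass(8, 13, 1, 5)      # GlcNAc residue
HEXOSE = formula_mass(6, 10, 0, 5)      # Man residue
DEOXYHEXOSE = formula_mass(6, 10, 0, 4)  # Fuc residue
ETHYLENE_OXIDE = formula_mass(2, 4, 0, 1)  # repeat unit of Triton-type detergents

GLYCAN_MASS = 4 * HEXNAC + 3 * HEXOSE + 1 * DEOXYHEXOSE

if __name__ == "__main__":
    for mz in (655.6, 1302.9, 1138.0):
        print(f"m/z {mz} (2+): neutral mass {neutral_mass(mz, 2):.1f}")
    print(f"glycan mass: {GLYCAN_MASS:.1f}; observed difference: {2603.1 - 1158.2:.1f}")
    print(f"ethylene oxide unit: {ETHYLENE_OXIDE:.2f}; pair spacing: {1038 - 994}, {1406 - 1362}")
```

The computed glycan mass agrees with the observed 1445 difference to within the accuracy of the measurements, and the 44 unit spacing of the detergent peaks matches one ethylene oxide repeat.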
IV. Summary
After the analysis of 87 peptides by three methods, it was possible to account for 78% of the sequence of the heavy chain, and 93% of the light chain (Figure 5). The regions missed correspond to either very large or very small tryptic fragments. This problem could be overcome by performing a second map using an enzyme with a different specificity, but this would have increased the amount of work. Of the three methods, ESI and MALDI have no problems with the analysis of very large peptides (mass > 3000). This is an obvious limitation of LSIMS. Spectra containing multiply charged ions in ESI become problematic when the singly charged states are missing and multiple species are present. This problem was illustrated in Figure 4A. MALDI was the simplest of the techniques, but required careful attention to calibration, usually requiring internal calibrants to obtain accurate masses. While this may seem a small annoyance, it often caused problems when the internal calibrants suppressed the peptide peaks.

On-line LC/MS (ESI) would have emerged as the clear favorite if it could have identified all of the peaks in a single run. Clearly this was not the case. Many peaks that should have been observed were not, and many unidentified peaks arose in their place. Some corresponded to multiply charged species of known peaks, and some corresponded to peptide variants at the level of an amino acid substitution, glycosylation (Figure 2), or methionine oxidation (Figure 4). In general, unidentified peaks had to be sequenced in their entirety to verify the nature of the mass variance. In all cases, this led to a correct assignment. Another problem was unexpected proteolytic cleavages. This occurred for 22 peptides and was due to chymotryptic cleavages. In spite of the use of Promega sequencing grade modified trypsin, this remains a rare but real issue in mass assignments by this method.

The effect of detergent on LSIMS and ESI is severe. MALDI is clearly the method of choice for samples that include detergents, but otherwise detergents should be avoided. A final comment can be made on mass analysis of peptides blotted and digested from SDS gels: it is a powerful, general approach, but requires much time and effort in sorting out the data no matter which mass spectrometric approach is used.

Figure 3. Effect of detergent on the tryptic map of the heavy chain of monoclonal antibody CEA11.H5 (horizontal axis, scan number). A. ESI base peak chromatogram for a sample containing 1% reduced Triton X-100. B. ESI base peak chromatogram for an equivalent sample containing 0.25% PVP360.

Figure 4. Mass spectra of a tryptic peptide from the detergent-containing digest. A. ESI spectrum showing the doubly charged peak at 1138.0. B. LSIMS spectrum for the same peptide (m/z 2273.8). C. MALDI spectrum for the same peptide (m/z 2274). The internal calibrant was bovine insulin (5734).

Figure 5. Summary of mass analysis for tryptic peptides from the heavy and light chains of monoclonal antibody CEA11.H5 (sequence maps annotated with the observed peptide masses). A. Heavy chain. B. Light chain.
Acknowledgments This work was supported by the City of Hope Cancer Center Grant from NCI, CA 33572, and by NCI grant CA 43904.
References
1. Wagener, C., Yang, Y.H.J., Crawford, F.G., and Shively, J.E. (1983) Immunology 130, 2308-2315.
2. Lee, T.D., and Vemuri, S. (1989) Proc. 37th ASMS Conf. Mass Spec. 352-353.
3. Henzel, W.J., Billeci, T.M., Stults, J.T., Wong, S.C., Grimley, C., and Watanabe, C. (1993) Proc. Natl. Acad. Sci. USA 90, 5011-5015.
4. Davis, M.T., and Lee, T.D. (1992) Protein Science 1, 935-944.
5. Calaycay, J., Rusnak, M., and Shively, J.E. (1991) Anal. Biochem. 192, 23-31.
6. Shively, J.E., Paxton, R.J., and Lee, T.D. (1989) TIBS 14, 246-252.
7. Fernandez, J., DeMott, M., Atherton, D., and Mische, S. (1992) Anal. Biochem. 201, 255-264.
MS Based Scanning Methodologies Applied to Conus Venom
A. Grey Craig, Wolfgang H. Fischer, Jean E. Rivier, J. Michael McIntosh,† and William R. Gray†
The Clayton Foundation Laboratories for Peptide Biology, The Salk Institute, San Diego, CA 92138-9216 and †Departments of Psychiatry and Biology, University of Utah, Salt Lake City, UT 84112
I. Introduction
Known biologically active agents in the venom produced by marine cone snails (Conus) are small, highly constrained and specialized peptides. These venoms are a rich source of unique neuroactive molecules (1). Although the venoms from different species of cone snails may contain homologous peptides (e.g. both C. geographus and C. striatus make peptides targeted to acetylcholine receptors and voltage sensitive calcium channels), they may also contain a number of distinct specialized peptides (e.g. the activity of conantokin-G of C. geographus, which targets NMDA receptors, has not been observed in C. striatus venom) (1). We describe a number of strategies (including derivatizations) that allow the identification of as yet uncharacterized toxins in these venoms. The extent of the challenge lies in the fact that most Conus venoms are complex mixtures containing over 100 peptides.

With the advent of very sensitive ionization techniques such as matrix assisted laser desorption (MALDI) coupled with time-of-flight (TOF) mass analysis, measurement of the intact mass of peptides at sub-pmol levels has become a reality (2), and we have taken advantage of this for the systematic screening of HPLC fractions. Partial sequence information can be obtained by carrying out enzymatic hydrolysis with exoproteinases (e.g. carboxypeptidases and aminopeptidases) (3, 4). More recently, MALDI has been used to measure metastable decomposition occurring in the first field free region of a reflectron TOF instrument (referred to as post source decay (PSD)) with only marginally more sample (5-7). While significantly less material is required for MALDI than for UV detection, chemical sequencing or amino acid analysis, the nature of the information derived from MALDI spectra is also different. Clearly, obtaining unambiguous composition or sequence information is not a simple task.
This is because (i) the mass accuracy of MALDI-TOF measurements is generally lower than that of liquid secondary ionization (LSI) with a magnetic sector mass spectrometer, (ii) enzymatic sequencing is affected by the varying rates of cleavage at different amino acid residues and the reduced activity of most enzymes towards particular residues (e.g., tyrosine and proline) and (iii) fragmentation information from PSD suffers from the ambiguities of assigning fragment ions as being derived from the N-terminus, C-terminus or "internal sequence".
We have applied scanning methodologies using MALDI-TOF mass spectrometry to partially purified venom from C. striatus and C. ermineus. We have carried out specific derivatizations in order to deduce composition and sequence information. Together with an intact mass, these measurements are used to determine whether an ionized species observed in the MALDI mass spectrum corresponds with the intact protonated molecule of a previously characterized conotoxin. The information obtained from derivatizations is also important when the ionized species does not correspond with the intact mass of peptides of known sequence. In that case, post source decay of the native and derivatized species may help assign the fragment ions.
TECHNIQUES IN PROTEIN CHEMISTRY VI Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.
II. Materials and Methods
MALDI mass spectra were measured with a Bruker Reflex time-of-flight mass spectrometer fitted with a gridless reflectron energy analyzer and a nitrogen laser. Accelerating and reflectron voltages of +31 kV and +30 kV were employed unless otherwise specified. Typically, the amount of sample necessary for MALDI analysis was 100 fold less than that used for LSI analysis. All MALDI samples were prepared in six or more different sample preparation formats including three different UV absorbing matrices (α-cyano-4-hydroxycinnamic acid, sinapinic acid and 2,5-dihydroxybenzoic acid) and two methods of preparation. In the first method, peptide solution was pre-mixed with a solution of each matrix prior to application onto the probe tip (see refs. (8-10) for preparation of matrix solutions). In the second method, a solution of the matrix in acetone was dried on the probe tip and then a separate sample of peptide was applied and left to dry onto the matrix (11). As noted previously, the second preparation was found to give more reliable analyses of samples which required a rinse with H2O (12, 13). No ions were observed with any other sample preparation that were not also observed using a combination of the second procedure, α-cyano-4-hydroxycinnamic acid as the matrix, and rinsing of the samples. All tabulated data are for samples prepared in this manner.
LSIMS spectra were measured with a JEOL JMS HX110 mass spectrometer fitted with a Cs+ ion gun. An accelerating voltage of +10 kV and Cs+ ion gun voltage of +30 kV were employed. An electric field scan over a narrow mass range was used to measure segments of the mass spectrum corresponding with appropriate regions of the MALDI mass spectrum. The samples, prior to the dilution used for MALDI analysis (1 µl; 100 pmol; 0.1% TFA solution), were added directly to a 1:1 mixture of m-nitrobenzyl alcohol and glycerol.
The mass accuracy of LSIMS for measurement of the unresolved isotopic cluster is typically within 100 p.p.m. of the calculated average [M+H]+ mass. The accuracy of the observed masses listed in Tables I and II for the MALDI mass spectra benefited from the use of a reflectron instrument, which generally reduced the deviation observed between spectra measured under different experimental conditions (e.g. different matrix or laser power) from ±1000 p.p.m. to ±300 p.p.m. Reflecting this level of mass accuracy, we present the MALDI measurements with only 4 significant figures, compared with 5 significant figures for the LSIMS measurements. For calculation of possible amino acid substitutions, the 20 most common amino acids were used together with γ-carboxyglutamate (Gla) and hydroxyproline (Hyp), which are commonly found in conotoxins (14).
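To make the p.p.m. figures quoted above concrete, the following minimal sketch converts between a relative mass accuracy and an absolute window in Da. The function names and the example m/z of 2500 are illustrative assumptions, not part of the original work.

```python
def ppm_error(observed_mz: float, calculated_mz: float) -> float:
    """Signed deviation of an observed mass from its calculated value, in p.p.m."""
    return (observed_mz - calculated_mz) / calculated_mz * 1e6


def ppm_to_da(mz: float, ppm: float) -> float:
    """Half-width in Da of a +/- ppm tolerance at a given m/z."""
    return mz * ppm / 1e6


if __name__ == "__main__":
    # Illustrative m/z only (assumption): a peptide near 2500 Da.
    for tolerance_ppm in (100, 300, 1000):
        window = ppm_to_da(2500.0, tolerance_ppm)
        print(f"{tolerance_ppm} p.p.m. at m/z 2500 corresponds to +/-{window:.2f} Da")
```

At this mass a deviation of several hundred p.p.m. approaches 1 Da, which is consistent with reporting the MALDI values to fewer significant figures than the LSIMS values.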
Peptide Modification: Iodination was carried out on a stainless steel probe target by adding 0.1% aq. I2 (1 µl) to the dried peptide (ca. 1 pmol). The reaction was stopped after 1 minute by addition of ascorbic acid and the MALDI matrix, α-cyano cinnamic acid, in excess. Esterification with ethanol was carried out using the method of Hunt et al. (15), where an acetyl chloride and ethanol solution (1:6, v:v) was added (5 µl) to the peptide dried in a microcentrifuge tube (ca. 1 pmol). After incubation for 15 minutes at room temperature, a 2 mM β-mercaptoethanol (in ethanol) solution was added (1 µl) and the mixture was dried. The matrix, α-cyano-4-hydroxycinnamic acid (2 µl), was added to the micro tube and after 5 minutes 1 µl of this matrix was removed and applied to a target.
III. Results and Discussion
Figure 1 shows the HPLC profile of semi-purified venom from C. striatus (fractions labeled 3, 5-18, 20 and 22). A summary of the observed masses in the MALDI and LSI mass spectra for each of these fractions is given in Table I. A database of known conotoxins was searched for correspondence (±3 Da) with the observed masses: "matches" are scored irrespective of whether the peptide in question was originally isolated from the particular venom. Figure 2 shows the HPLC profile of semi-purified venom from C. ermineus (fractions labeled 4a, 4b, 5-11 and 14). A summary of the masses of the major species observed in the LSIMS and MALDI mass spectra for each of these fractions is given in Table II.
Our finding that the preparation of α-cyano-4-hydroxycinnamic acid in acetone and subsequent rinsing (see Materials & Methods) produced all ion species which were observed with a variety of other MALDI procedures is important for further scanning of the Conus venoms for novel conotoxins. Generally, we observe at least one major species in the MALDI mass spectra corresponding to each HPLC component. The increased mass accuracy available when the instrument was operated in the reflectron mode was important for the analysis carried out. For example, fractions 4b and 7 or fractions 5 and 8 from C. ermineus appeared to be the same species when measured with the instrument operated in the linear mode. Only in the reflectron mode were we able to reliably distinguish the masses of each species.
The high sensitivity of MALDI-TOF is particularly important for the analysis of native peptides such as conotoxins, where often the venom of many milkings must be collected to obtain sufficient material for sequence analysis. The increased sensitivity of MALDI over LSIMS is illustrated in the analysis of fraction 5 from C. striatus venom (see Table I).
Despite the two orders of magnitude difference in the amount of material consumed in the LSI experiment we did not discern any intact species in fraction 5, whereas the MALDI measurement yielded useful information. However, the comparisons in Tables I and II reveal that some components may be detected by LSIMS but not observed in the MALDI mass spectrum (measured with any of the matrices or sample preparation methods). The contrary is most likely more prevalent, i.e. that a large number of the species detected by MALDI with one or more of the matrices are difficult species to ionize with LSIMS.
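The ±3 Da database lookup described above can be sketched as follows. This is a minimal illustration, not the authors' software: the dictionary holds only the calculated average [M+H]+ values that appear in Tables I and II, and the function name and data structure are assumptions.

```python
# Calculated average [M+H]+ values as listed in Tables I and II.
# A real conotoxin database would hold many more entries.
KNOWN_CONOTOXINS = {
    "SVIB": 2739.5,
    "SVIA": 2494.9,
    "SI": 1354.6,
    "GVIB": 3095.4,
}


def find_matches(observed_mz, database=KNOWN_CONOTOXINS, tolerance_da=3.0):
    """Return names of known peptides whose calculated mass lies within
    +/- tolerance_da of the observed mass, regardless of source species."""
    return sorted(
        name
        for name, calculated in database.items()
        if abs(observed_mz - calculated) <= tolerance_da
    )


if __name__ == "__main__":
    for mz in (2739.0, 2521.0, 3098.0):
        print(mz, find_matches(mz))
```

As the GVIB case discussed in the text shows, a hit at this tolerance is only a candidate assignment and still needs confirmation by derivatization or sequencing.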
Table I. Observed masses in the LSIMS and MALDI mass spectra of fractions of C. striatus
Columns: Fr. | match | Calc. mass¹ | MALDI Obs. mass (m/z) | LSIMS Obs. mass (m/z)
2740.2 SVIB 2739.5 2739† NO 5 NO 2579 2544 2494.9 6 SVIA 2494.9 2494† 2521.4 7 2521 1241.5 1791.2 NO 8 SII 1792.0 1240 1794† 1813 1814 2786 1354.4 1791.3 NO 10 SI NO 1354.6 1354† 2782.7 NO 11 sm 2498 1456.7 1457† 2497.5 12 2500 4099 9218 1396.4 NO 4098.6 NA 1397 13 2498 3886 3940 NA NA NA NA 1367 14 4898 4952 4968 NO NO 4947.0 4965.9 4882 15 4084 4100 4792 NA 4082.0 4098.5 NA 3175 16 2498 3924 4758 2176.6 2497.4 NO NA 17 4743 5025 3938 NO 18 3782 3713 3778.1 NO 20 3418 NO NO 3348 3400.0 3416.0 3432.4 122 1 INA
¹ Calculated average [M+H]+ mass. NO indicates corresponding ion in MALDI or LSIMS spectrum not observed. NA indicates not analyzed. † indicates the obs. species which matched.
It is apparent from Table I that masses of several previously known peptides from C. striatus correspond to those found for major UV absorbance peaks. Similarly, Table II shows that the C. ermineus venom contained a peptide in fraction 7 that matched the mass of conotoxin GVIB from C. geographus — in this case, the peptide has since been analyzed and found to be completely unrelated to GVIB, whereas the putative match to SI in fraction 9 has been confirmed with chemical sequencing and mass spectrometry.
Figure 1. UV trace of HPLC of C. striatus venom. (Horizontal axis: time (min).)
Table II. Observed masses in the LSIMS and MALDI mass spectra of C. ermineus fractions
Columns: Fr. | match | Calc. mass¹ | MALDI Obs. mass (m/z) | LSIMS Obs. mass (m/z)
3451.1 NO NO 2513 3105 NO 1 12496.1 [2497'" ^ 3101 3100.7 4b 3085 5 3099.6 NA 2111 NO 2094 2495.4 NA 3085 NA 6 7 GVIB 3095.4 3098† 3096.2 3082 3082.3 8 1792 1944 3068 1354.0 1790.8 1943.6 3065.6 SI 9 1354.6 1353† 2094.2 NA 2094 3047 3558 NO 1803 NA 10 1397.2 2765.6 2781.4 2766 2781 1398 11 12369.9 NA 3512 3528 [2370 NA 1 14
¹ Calculated average [M+H]+ mass. NO indicates corresponding ion in MALDI or LSIMS spectrum not observed. NA indicates not analyzed. † indicates the obs. species which matched.
From these results it is clear that useful information can be obtained from MALDI, but that it cannot be used directly to establish the identity of peptides; our ultimate aim is to obtain sequence information from these fractions. Towards that goal we are currently developing protocols that allow reduction of cysteine residues, alkylation and MALDI measurement without the need for further purification (19). In the simplest version, measurement of the peptide before and after modification reveals the number of disulfide bridges present in the peptide. With the linear alkylated peptide, we can more easily interpret the metastable decomposition occurring in the first field free region of a time-of-flight instrument to measure the fragment ions. This protocol is shown in Figure 3 for reduced and S-carboxamidomethylated (Cam) conotoxin GIA (H-Glu-Cam-Cam-Asn-Pro-Ala-Cam-Gly-Arg-His-Tyr-Ser-Cam-Gly-Lys-NH2) (16, 17). The spectrum shown in Figure 3 was obtained from a derivatization of 10 pmol of peptide in which 1 pmol of peptide was applied to the target. The PSD spectrum is a composite of scans measured at reflectron voltages between 1.25 and 29.9 kV: the total ion current, and therefore the baseline noise, varies between individual scans. The 'b' and 'y+2' type fragment ions (18) are the most prolific series observed for this peptide and are therefore identified in Figure 3. However, significantly more sequence information is present in this spectrum (19).
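The disulfide count obtained from the mass measured before and after reduction and alkylation can be sketched as below. This assumes that every cysteine is paired in a disulfide and that alkylation is complete; each bridge then gains two hydrogens on reduction and two carboxamidomethyl groups (net C2H3NO per cysteine). The function name, tolerance and example masses are hypothetical.

```python
# Average atomic masses (Da)
H, C, N, O = 1.008, 12.011, 14.007, 15.999

# Net gain per cysteine on S-carboxamidomethylation: C2H3NO (about 57.05 Da)
CAM_GAIN = 2 * C + 3 * H + N + O

# Per disulfide bridge: two cysteines, each reduced (+H) and alkylated (+C2H3NO)
SHIFT_PER_BRIDGE = 2 * (H + CAM_GAIN)


def disulfide_bridges(native_mz, alkylated_mz, tolerance_da=1.0):
    """Estimate the number of disulfide bridges from the [M+H]+ shift
    between the native and the reduced, alkylated peptide."""
    shift = alkylated_mz - native_mz
    count = round(shift / SHIFT_PER_BRIDGE)
    if count < 0 or abs(shift - count * SHIFT_PER_BRIDGE) > tolerance_da:
        raise ValueError(f"shift of {shift:.1f} Da is not a whole number of bridges")
    return count


if __name__ == "__main__":
    # Hypothetical masses: a peptide with four Cam residues, as in the
    # GIA example, should report two bridges.
    print(disulfide_bridges(1437.6, 1437.6 + 232.2))
```

A shift that is not close to a multiple of about 116 Da would suggest free thiols or incomplete alkylation, so the sketch raises an error in that case.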
Figure 2. UV trace of HPLC of C. ermineus venom. (Horizontal axis: time (min).)
At present it is impractical to sequence every peptide, given the complexity of the venoms. A strategy that directs the sequencing effort to selected peptides can be based on the rapidly growing number of conotoxin sequences which have previously been determined. In 1991, there were over 70 conotoxin peptides characterized from over 10 species (1); this number is now above 200, and growing rapidly with the acquisition of sequences from DNA cloning (20). As described above, we used a database of conotoxin sequences to assign fractions 3, 6, 8 and 11 of C. striatus venom as possibly containing peptides corresponding to SVIB, SVIA, SI and SIB respectively. More accurate mass measurement with LSIMS confirmed that the intact mass was consistent with this assignment in 3 of these cases (no signal was observed with LSIMS in one case). This type of database scanning is also being employed in reverse, to search among venom fractions for candidate peptides to match predicted translation products corresponding to cloned cDNAs.
The Conus venoms often contain several minor sequence variants of toxins, arising from genetic polymorphisms, multi-gene families, and variation in post-translational processing. When the mass difference between two closely related peptides (in terms of HPLC retention times and mass) corresponds to a single amino acid substitution, a simple experiment may suffice to choose among alternatives. Consider for example fractions 5 and 7 from C. striatus venom, satellites of the major fraction tentatively identified as conotoxin SVIA (Fig. 1 and Table I). The mass difference of 50±1 Da between peptides in fractions 5 and 6 could be explained by any of the following changes: (i) Tyr to either Hyp, Leu or Ile (50 Da); (ii) His to Ser (50 Da); (iii) Phe to Pro (50 Da); or (iv) Trp to His (49 Da). Although options (ii) to (iv) are formally possible, SVIA does not contain His, Phe, or Trp, so option (i) would be favored.
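The enumeration of single-residue substitutions that could explain a given mass difference can be sketched as follows. The residue masses are standard average values; Hyp and Gla are derived here as Pro + O and Glu + CO2. The function name and the ±1 Da default tolerance are assumptions made for illustration.

```python
from itertools import permutations

# Average residue masses (Da) for the 20 common amino acids plus Hyp and Gla
RESIDUE_MASS = {
    "Gly": 57.0519, "Ala": 71.0788, "Ser": 87.0782, "Pro": 97.1167,
    "Val": 99.1326, "Thr": 101.1051, "Cys": 103.1388, "Leu": 113.1594,
    "Ile": 113.1594, "Asn": 114.1038, "Asp": 115.0886, "Gln": 128.1307,
    "Lys": 128.1741, "Glu": 129.1155, "Met": 131.1926, "His": 137.1411,
    "Phe": 147.1766, "Arg": 156.1875, "Tyr": 163.1760, "Trp": 186.2132,
    "Hyp": 97.1167 + 15.9994,   # hydroxyproline = Pro + O
    "Gla": 129.1155 + 44.0095,  # gamma-carboxyglutamate = Glu + CO2
}


def substitutions(delta_da, tolerance_da=1.0):
    """List (heavier, lighter, difference) residue pairs whose mass
    difference lies within tolerance_da of delta_da, closest first."""
    hits = []
    for heavier, lighter in permutations(RESIDUE_MASS, 2):
        diff = RESIDUE_MASS[heavier] - RESIDUE_MASS[lighter]
        if abs(diff - delta_da) <= tolerance_da:
            hits.append((heavier, lighter, round(diff, 2)))
    return sorted(hits, key=lambda hit: abs(hit[2] - delta_da))


if __name__ == "__main__":
    for heavier, lighter, diff in substitutions(50.0):
        print(f"{heavier} -> {lighter}: {diff} Da")
```

The output should include the candidates listed in the text; any additional pair reported at this tolerance would need to be weighed against the known sequence in the same way as options (ii) to (iv).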
In order to test this directly, we iodinated small samples of approx. 2 pmol each of fractions 5 and 6. After treatment with I2, fraction 6 was shifted towards higher mass by 126 Da. This shift was consistent with the presence of a tyrosine residue in this peptide (we have determined that iodination under these conditions modifies tyrosine but not histidine residues (21)).
Figure 3. The PSD spectrum (200-1600 Da) of alkylated Conotoxin GIA. (Horizontal axis: m/z.)
In contrast, fraction 5 was not modified by this protocol, indicating that the mass difference of 50 Da could be attributed to the tyrosine residue of the peptide in fraction 6 being replaced by either hydroxyproline, leucine or isoleucine in the peptide in fraction 5. Similarly, assuming that SVIA is the major component in fraction 6, the additional 27±1 Da of the peptide in fraction 7 could be attributed to a change of Ser to (Leu, Ile or Hyp), or of Lys to Arg. Esterification of fractions 6 and 7 verified that neither component contains acidic groups, which was consistent with C-terminal amidation of SVIA. These derivatizations are highly selective, and may thus allow PSD measurements to be carried out on peptides after modification. Such a protocol would significantly enhance our ability to derive sequence information from PSD spectra, because the mass shifts observed in fragments help locate the particular residue within the peptide, and also confirm assignments of fragments as arising from N- or C-terminal regions. In addition to derivatizations that may modify the C- and N-termini and the derivatization of tyrosine residues, we have carried out oxidation of methionine residues with sufficient specificity to enable measurement of PSD spectra.
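The interpretation of the iodination and esterification shifts discussed above can be sketched as a count of modified groups. The per-group shifts are computed from average atomic masses (one H replaced by I, or a net gain of C2H4 per ethyl ester); the function name, the tolerance and the example masses are hypothetical, and the count refers to incorporated groups rather than to residues.

```python
MASS_SHIFT_PER_GROUP = {
    # one hydrogen replaced by one iodine (about 125.9 Da)
    "iodination": 126.904 - 1.008,
    # ethyl ester formation: net gain of C2H4 per carboxyl group (about 28.05 Da)
    "ethyl_esterification": 2 * 12.011 + 4 * 1.008,
}


def groups_modified(native_mz, derivatized_mz, reaction, tolerance_da=1.0):
    """Number of groups modified, inferred from the mass shift between
    the native and the derivatized peptide."""
    per_group = MASS_SHIFT_PER_GROUP[reaction]
    shift = derivatized_mz - native_mz
    count = round(shift / per_group)
    if count < 0 or abs(shift - count * per_group) > tolerance_da:
        raise ValueError(f"shift of {shift:.1f} Da does not fit {reaction}")
    return count


if __name__ == "__main__":
    # Hypothetical masses: a 126 Da shift after I2 treatment, and no
    # shift after esterification (no free carboxyl groups).
    print(groups_modified(2494.0, 2620.0, "iodination"))
    print(groups_modified(2494.0, 2494.0, "ethyl_esterification"))
```

An unchanged mass after esterification returns zero, which corresponds to the observation that neither component contains acidic groups.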
IV. Conclusion
We have gained an appreciation for the ionization bias observed between MALDI and LSIMS. The utilization of PSD to identify known peptides and provide sequence information has been investigated for conotoxins. This approach to obtaining sequence information on novel peptides is attractive because of the low amount of material required. A number of mass spectrometric based derivatizations have been used to scan fractions of venoms in order to characterize peptides of interest. For closely related components (based on HPLC retention time and mass), the small scale derivatization schemes can be used to test hypotheses about peptides with otherwise novel masses (i.e. which may be homologs). The mass accuracy of the TOF technique, with a gridless reflectron, was important for identifying these substitutions.
Acknowledgments
We would like to thank Drs. B. Olivera and L. J. Cruz, University of Utah, for stimulating discussions. This work was supported by the National Institutes of Health (K20MH00929, 1S10RR-8425, HD-13527, DK-26741, CA-54418, HL41910, GM-48677) and supported in part by the Foundation for Medical Research, Inc. (AGC and WHF).
References
1. B. M. Olivera, J. Rivier, J. K. Scott, D. R. Hillyard and L. J. Cruz (1991) J. Biol. Chem. 266, 22067.
2. M. Karas, A. Ingendoh, U. Bahr and F. Hillenkamp (1989) Biomed. Mass Spectrom. 18, 841.
3. M. Schar, K. O. Börnsen and E. Gassmann (1991) Rapid Commun. Mass Spectrom. 5, 319.
4. A. S. Woods, W. Gibson and R. J. Cotter (1994). In "Time of Flight Mass Spectrometry" (R. J. Cotter, ed.) ACS, Washington D.C.
5. B. Spengler, D. Kirsch, R. Kaufmann and E. Jaeger (1992) Rapid Commun. Mass Spectrom. 6, 105.
6. M. C. Huberty, J. E. Vath, W. Yu and S. A. Martin (1993) Anal. Chem. 65, 2791.
7. W. Yu, J. E. Vath, M. C. Huberty and S. A. Martin (1993) Anal. Chem. 65, 3015.
8. R. C. Beavis and B. T. Chait (1992) Org. Mass Spectrom. 27, 156.
9. R. C. Beavis and B. T. Chait (1989) Rapid Commun. Mass Spectrom. 3, 432.
10. K. Strupat, M. Karas and F. Hillenkamp (1991) Int. J. Mass Spectrom. Ion Proc. 111, 89.
11. O. Vorm, P. Roepstorff and M. Mann (1994). 42nd ASMS Conference on Mass Spectrometry and Allied Topics, Chicago, IL, May 29-June 3, 1994.
12. O. Vorm and M. Mann (1994) J. Am. Soc. Mass Spectrom., in press.
13. R. C. Beavis and F. Xiang (1994). 42nd ASMS Conference on Mass Spectrometry and Allied Topics, Chicago, IL, May 29-June 3, 1994.
14. B. M. Olivera, W. R. Gray, R. Zeikus, J. M. McIntosh, J. Varga, J. Rivier, V. de Santos and L. J. Cruz (1985) Science 230, 1338.
15. D. Hunt, J. R. Yates III, J. Shabanowitz, S. Winston and C. R. Hauer (1986) Proc. Natl. Acad. Sci. USA 83, 6233.
16. W. R. Gray, A. Luque, B. M. Olivera, J. Barrett and L. J. Cruz (1981) J. Biol. Chem. 256, 4734.
17. L. J. Cruz, W. R. Gray and B. M. Olivera (1978) Arch. Biochem. Biophys. 190, 539.
18. P. Roepstorff and J. Fohlman (1984) Biomed. Mass Spectrom. 11, 601.
19. A. G. Craig, W. H. Fischer, W. R. Gray, J. Dykert, J. E. Rivier (unpublished results).
20. D. R. Hillyard, B. M. Olivera, S. Woodward, G. P. Corpuz, W. R. Gray, C. R. Ramilo and L. J. Cruz (1989) Biochemistry 28, 358.
21. A. G. Craig, J. E. Rivier, W. R. Gray and W. H. Fischer (1994). 42nd ASMS Conference on Mass Spectrometry and Allied Topics, Chicago, IL, May 29-June 3, 1994.
DIRECT COUPLING OF AN AUTOMATED 2-DIMENSIONAL MICROCOLUMN AFFINITY CHROMATOGRAPHY-CAPILLARY HPLC SYSTEM WITH MASS SPECTROMETRY FOR BIOMOLECULE ANALYSIS
D. B. Kassel^, T. G. Consler^, M. Shalaby^, P. Sekhri^, N. Gordon^ and T. Nadler2
^Glaxo Res. Inst., 5 Moore Drive, RTP, NC 27709 and ^PerSeptive Biosystems, 11 Sidney St., Cambridge, MA 01960
I. INTRODUCTION
Two-dimensional (2-D) separations provide the possibility for exquisite resolution of complex mixtures. A benefit of the direct coupling of 2-D chromatographic methods is that sample handling and transfer steps can be virtually eliminated. This is critical when attempting to isolate and identify analytes at very low detection levels. One use of a 2-D separation scheme is to selectively identify "active" components in a compound library. The library may be naturally occurring, such as a cellular lysate or fermentation broth. Alternatively, it may be a pool of synthetic compounds, or a set of enzymatically or chemically modified compounds. Identification of "active" ligands relies on the ability of individual components from the molecular mixture to bind with high affinity to a target molecule. Like many groups, we have been interested in identifying protein-protein interactions involved in signal transduction (1-3). Because ultra-high sensitivity is required to isolate and characterize these interactions, we have initiated the use of a microanalytical immunological system which uses an affinity based binding site selection as the first dimension followed by a separation in the second dimension by reverse phase HPLC. The identification of specifically selected molecules is facilitated by the use of an electrospray ionization mass spectrometer as an on-line detection device. Songyang et al. demonstrated that SH2-specific phosphopeptides could be selected from randomly synthesized peptide libraries. Using glutathione affinity resin coupled with GST-SH2 domain fusion proteins, high affinity phosphopeptides were selectively bound and eluted in a rapid on/off assay. Bound ligands were isolated by elution from the affinity resin with phenylphosphate. Consensus sequences of the high affinity peptides were identified by gas phase Edman degradation (4,5). Cantley et al.
demonstrated that various tyrosine kinase peptide substrates can be trapped and identified using a similar approach, with the exception that Fe3+ chelation chromatography is used as the affinity selection for phosphopeptides (6). Both of the above studies were limited in that they identified a consensus sequence of the highest affinity peptides from the library, not individual components. In addition, these approaches require that the peptide ligands have free amino termini and that their component residues are amenable to identification by Edman sequencing. We have developed an analogous, but more robust system which is not necessarily constrained by the aforementioned limitations. The obvious extension has been to couple an affinity-based separation with mass spectrometry. Hutchens et al. have shown that affinity probe surfaces can be used to capture specific protein ligands allowing detection by laser desorption mass spectrometry (7). The limitations to their technique have been that the surface area for ligand capture is quite small and salt (or detergent) contaminants are still problematic. Perfusive affinity resins, on the other hand, provide a tremendous surface area for binding. The nature and composition of the solvents required for affinity chromatography, however, are not directly compatible with mass spectrometric analysis. The coupling of micro column affinity chromatography with capillary RP-HPLC/ESI-MS should permit a highly sensitive and highly selective approach to decoding complex mixtures. Using an automated 2-D system which allows for rapid column and solvent switching capabilities, we have assessed the feasibility of coupling a variety of affinity chromatography methods in-line with HPLC/ESI-MS. Applications include flow-through enzyme digestions of proteins on immobilized trypsin cartridges, binding and elution of phosphotyrosine containing synthetic peptides on micro-column anti-phosphotyrosine antibody resins, and binding and elution of peptide ligands for the SH2 domain of pp60c-src on an avidin-biotinyl-SH2 affinity surface capillary column.
II. EXPERIMENTAL
Automated 2-dimensional chromatographic system
An Integral™ micro-analytical workstation was used for all affinity chromatography/HPLC/ESI-MS analyses. The workstation consists of three 10-port injector valves which have been configured for 2-dimensional separations.
In order to capture either unbound or bound species from the affinity resins on the Poros™ R2/H reverse phase resins, it was necessary to plumb in a mixing tee just prior to column 2 (i.e., the reverse-phase column) and add an equal volume of aqueous 0.1%-0.2% TFA to the volume of liquid displaced from column 1 (i.e., the affinity column) to adjust the ionic strength and ion pairing capabilities and permit binding of the more hydrophilic peptides. In the absence of this post-column mixing tee, binding capacity of the RP-HPLC column was compromised.
Mass spectrometer conditions
A PE-Sciex API-III IonSpray™ mass spectrometer (PE-Sciex, Thornhill, Ontario, Canada) was used to acquire all mass spectra. The Sciex API-III mass spectrometer and Integral micro analytical workstation were coupled through a 1 m piece of fused silica tubing (75 µm i.d.) at the exit of the capillary flow cell detector using a 250 µm i.d. teflon sleeve as described previously (8). The source
needle assembly was aligned slightly off-axis of the entrance aperture to permit high flow rates (i.e., 100 µl/min). The mass spectrometer was scanned from 500-1500 Da in 2 sec using a 1.0 msec dwell time and a 0.5 Da step size. The ion multiplier was -4200 V, the orifice potential was 70 V and the resolution was 1000.
Rapid flow-through digestion of proteins using trypsin micro-columns
The SH2 domain of pp60c-src was purified from E. coli cells using a T7 expression plasmid and isolated following the procedures of Willard et al. (9). A 20 µl aliquot of purified SH2 corresponding to 0.1 nmole of protein (in 350 mM NaCl, 1 mM DTT, 50 mM HEPES, pH 8.0) was loaded at a flow rate of 20 µl/min (corresponding to roughly 0.5 column volumes/min) onto a 750 µm x 10 cm immobilized trypsin column (Poroszyme™) equilibrated in 50 mM NH4HCO3, pH 8.5. The total residence time of protein on this column was less than 2 minutes! Effluent from the digest column was trapped at the head of a Poros R2/H capillary column using the post-column mixing tee as described above. Solvent lines were purged with reverse phase buffers and the digestion products were eluted from the column using a linear gradient of 1% to 31% buffer B in 10 min.
Preparation of micro-affinity anti-phosphotyrosine antibody column
An IgG-2a monoclonal antiphosphotyrosine (P-Tyr) antibody was grown as an ascitic mouse tumor and purified to homogeneity by Protein A fast flow sepharose chromatography to a concentration of 11.3 mg/ml. Cross-linking of the antibody to a Protein G ID sensor cartridge™ was accomplished by passing 1 ml of P-Tyr antibody over the ID cartridge at a flow rate of 0.5 ml/min. Cross-linking reagent (dimethylpimelimidate) was added in 7 x 2 ml aliquots at 0.5 ml/min flow rate. Upon completion of cross-linking, excess reagent was removed by addition of 2 x 2 ml aliquots of quenching reagent (ethanolamine).
The cross-linked P-Tyr antibody column was then washed with 10 column volumes of loading buffer (20 mM Tris in 150 mM NaCl, pH 7.4) and elution buffer (12 mM HCl in 150 mM NaCl, pH 2.5), and the wash was repeated until a stable baseline 280 nm absorbance was achieved.
Binding and elution of phosphopeptides from P-Tyr antibody micro-column
Peptide Mixture I contained AcY*EEIE (1), LIEDNEY*TAR (2), TSTEPQY*EEIENL (3), TSTEPQYEEIENL (4), and PTFEYLQAFLEDYFTSTEPQY*QPGENL (5) and was synthesized either in-house (M. Rodriguez, Glaxo Research Institute) or by Zeneca. Peptides were dissolved in loading buffer and their concentration adjusted to 30 µM each. A 10 µl aliquot was loaded onto the P-Tyr antibody column at a flow rate of 50 µl/min and washed for a total of 5 minutes to minimize non-specific binding. The column effluent was trapped at the head of a Poros R2/H capillary column. Specifically bound phosphopeptides were eluted from the P-Tyr antibody using 0.2% aqueous TFA and trapped at the head of the Poros R2/H column. Buffer A was aqueous 0.1% TFA and Buffer B was 90/10 MeCN/H2O containing 0.1% TFA. A gradient of 1% to 31% B in 10 min and 31% to 61% B in 5 min was used to separate the peptides.
Preparation of micro-affinity Avidin-Biotinyl-SH2 column
A disposable micro-affinity avidin cartridge was prepared by slurry packing bulk B/A Poros resin into a 500 µm x 5 cm piece of PEEK tubing fitted with 0.062" x 0.028" 2 µm stainless steel frits at both ends of the column. Biotinyl-SH2 was expressed in E. coli cells using a T7 expression plasmid and purified as described by Consler et al. (10). An aliquot corresponding to approximately 2.5 nmoles of Biotinyl-SH2 was loaded onto the avidin micro affinity column at a flow rate of 2 column volumes/min.
Binding and elution of SH2 ligands from SH2 Affinity Column
Phosphopeptide Mixture II, containing a previously determined high affinity ligand for the SH2 domain of pp60c-src (TSTEPQY*EEIENL, MW=1633, IC50=1.5 µM) and its non-phosphorylated isoform (TSTEPQYEEIENL, MW=1553, IC50=1 mM), was prepared and diluted to a final concentration of 30 µM each. A total of 600 pmole of the mixture was loaded onto the SH2 micro affinity column at a flow rate of 20 µl/min and washed through the column for a total of 5 min. Unbound ligand was trapped on the RP-HPLC capillary column coupled in-line between the affinity column and the IonSpray™ mass spectrometer. Unbound peptides were eluted from the column using the same gradient as above. Specifically bound ligand was competed off the SH2 affinity column using a 4 nmole "plug" of the high affinity ligand, AcY*EEIE (MW=804, IC50=1.5 µM).
III. RESULTS AND DISCUSSION
Molecular weight mapping of proteins by mass spectrometry is a powerful tool that allows for the identification of post-translational modifications, including glycosylation, phosphorylation, amino-terminal acetylation, truncation and myristoylation, to name only a few. One of the rate limiting steps in the mapping of proteins has been the digestion itself.
Typically, we have either purified the protein by HPLC prior to reduction, alkylation and enzymatic digestion to remove potential enzyme interferents or, alternatively, bound the protein to a reverse-phase HP sequencing cartridge support and incubated it with an enzyme cocktail (11). Purification of proteins by HPLC often gives rise to significant sample losses. Digestions on sequencing cartridges are, in general,
Figure 1. 2 min enzymatic digestion of SH2 on 750 µm i.d. Immobilized Trypsin Column coupled to HPLC/ESI/MS. (UV chromatogram; horizontal axis: time (minutes).)
quite long (> 24 hours) and incomplete. Recently, we have evaluated immobilized trypsin columns. Because a large amount of enzyme can be coupled to the resin (due to the particle's perfusive nature), enzyme digests can be performed extremely rapidly, as has been shown by Maleknia and co-workers (12). Using the Integral™, it has been possible to couple the digestion of proteins on these enzyme digest cartridges with a "trapping" reverse-phase HPLC column coupled to an electrospray mass spectrometer. The results of such an experiment are depicted in Figure 1. Digestion of SH2 was carried out in a flow-stream of pH 8.5 NH4HCO3 buffer for 2 min. The TIC chromatogram was also recorded (data not shown). Analysis of the electrospray mass spectra showed that the digestion was complete (no intact protein was observed). Underlined sequences (below) represent those tryptic fragments observed in the LC/ESI/MS analysis. Peptides that were not accounted for had molecular weights below the m/z range scanned.
SH2 SEQUENCE
MDSIQAEEWYFGKITRRESERLLLNAENPRGTFLVRESET
TKGAYCLSVSDFDNAKGLNVKHYKIRKLDSGGFYITSRT
OFNSLOOLVAYYSKHADGLCHRLTTVCP
We have been interested equally in developing other immunoaffinity-based chromatographic methods that can be coupled directly with HPLC and mass spectrometry. Phosphorylation events govern many of the interactions between proteins involved in signal transduction (13) and cell cycle events (14). Western blot analyses that use anti-P-Tyr antibodies are commonly employed to identify the proteins involved in signal transduction. Our aim has been to use micro-affinity (250-750 µm i.d.) P-Tyr antibody columns with flow rates compatible with reverse-phase capillary HPLC/ESI/MS for the purpose of selectively binding phosphorylated peptides and proteins from complex mixtures and detecting them with high sensitivity by mass spectrometry.
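The tryptic mapping of SH2 described earlier in this section, including the remark that some peptides fall below the scanned m/z range, can be illustrated with a minimal in-silico digest. This sketch assumes the common rule that trypsin cleaves C-terminal to Lys or Arg except before Pro, uses standard average residue masses, and takes only the first 21 residues of the printed sequence as an example; the function names are hypothetical.

```python
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153
PROTON = 1.0073


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])
    return peptides


def mz(peptide, charge=1):
    """Average m/z of the [M + zH]z+ ion of an unmodified peptide."""
    mass = sum(AVERAGE_RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (mass + charge * PROTON) / charge


if __name__ == "__main__":
    # First 21 residues of the SH2 sequence as printed in the text.
    for peptide in tryptic_digest("MDSIQAEEWYFGKITRRESER"):
        ions = ", ".join(f"{mz(peptide, z):.1f} ({z}+)" for z in (1, 2))
        print(peptide, ions)
```

Comparing each ion against the 500-1500 scan window shows why short fragments such as a tripeptide would go unobserved, while a longer fragment may appear only as a multiply charged ion.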
Figure 2 illustrates the results of a binding assay using model Peptide Mixture I with the P-Tyr Ab cross-linked to a micro-affinity Protein G column. The TIC chromatogram for Peptide Mixture I analyzed solely by capillary HPLC/ESI/MS is shown in Figure 2a. Binding of the mixture to the affinity column was achieved in less than 2 minutes. Analysis of the "flow-through" material is shown in Figure 2b. Only the non-phosphorylated peptide was observed. Bound peptides were acid eluted from the affinity column, trapped on the capillary HPLC column and mass analyzed by ESI-MS. Figure 2c shows excellent recovery of the 4 phosphopeptides. The retention times were affected following elution onto the reverse-phase column. This could be attributed to altered ion pairing resulting from the highly acidic solution used to displace the peptides from the affinity resin. The electrospray mass spectra of peptides 1 and 2 are shown in Figure 2d and are representative of the quality of data observed from this 2-dimensional analysis. The results suggest that this approach should be particularly useful for identifying, for example, preferred substrates for serine, threonine or tyrosine kinases. Attempts were made to identify preferred peptide substrates for c-src from complex peptide libraries, but initial attempts to completely eliminate non-specific binding were unsuccessful. Currently, we are evaluating parameters such as incubation time as well as other means for immobilizing the capture ligand, such as through the use of biotinylated tags.
Figure 2. HPLC UV Chromatogram for Peptide Mixture I (A) prior to and (B) following binding to P-Tyr Ab column. (C) TIC chromatogram following elution of "bound" ligands from Ab column; (D) Representative ESI mass spectra.

In an analogous manner we have evaluated micro affinity chromatography for identifying ligands for the SH2 domain of c-src. The c-src SH2 domain has been well characterized and is known to bind phosphotyrosine-containing peptides with high specificity and affinity. Construction of a capillary SH2 affinity column was achieved as described previously. Peptide Mixture II was incubated in a flow-based format through the Src SH2 affinity column. The results are summarized in Figure 3a-3b. Figure 3a shows the mixture analyzed by capillary HPLC/MS operating the Integral™ in the column-2 only mode. Figure 3b shows the TIC chromatogram and mass spectrum (insert) obtained as a result of incubation and trapping of unbound material on the reverse phase column (operating the Integral™ in the column 1/column 2 mode). Only the nonphosphorylated, weak affinity peptide is observed in the flow-through, consistent with prediction. The bound, high affinity ligand for the Src SH2 affinity column was displaced using the competing high affinity ligand, AcY*EEIE. Figure 3c shows that the phosphopeptide, TSTEPQY*EEIENL, was completely liberated
from the resin. Some of the competing ligand, AcY*EEIE, was also observed, as shown in Figure 3c. This could be explained by the fact that a 4 nmole injection of the competing ligand was made and the micro-affinity column contained at most 2.5 nmoles of binding sites. Figure 3d shows that prolonged washing of the column with cold TBS (> 10 column volumes) was sufficient to remove all remaining bound ligand. This is consistent with the fact that many of these peptide ligands have reasonably fast off-rates, a necessary consideration in identifying the optimal flow rate for binding in a flow-based affinity assay such as the one described here. Importantly, this suggests that the Src SH2 affinity column could be recycled for multiple screening analyses.
Figure 3. LC/MS TIC Chromatogram for Peptide Mixture II (A) Prior to and (B) Following incubation with SH2 affinity column. (C) Following displacement of "bound" TSTEPQY*EEIENL using a 4 nmole "plug" of AcY*EEIE and (D) following displacement of bound AcY*EEIE by washing with cold TBS.
IV. CONCLUSIONS
We have demonstrated that 2-D separations based principally upon microaffinity chromatography and RP-HPLC are readily coupled with electrospray ionization mass spectrometry. Using a number of model systems the following principles were demonstrated: 1) Extremely fast enzyme digests could be performed in situ and could be coupled directly with RP-HPLC/MS to provide protein molecular weight maps; 2) Anti-phosphotyrosine antibodies immobilized to micro affinity Protein G resins could be used to bind phosphopeptides from complex mixtures, which could then be detected by electrospray ionization mass spectrometry; and 3) SH2 domains immobilized to streptavidin resins through biotinylated tags were capable of selectively binding high affinity ligands, which could then be identified readily by mass spectrometry.

ACKNOWLEDGMENTS
The authors are grateful to D. Weigl (Glaxo Research Institute) for purification of the phosphotyrosine antibody. The authors also wish to acknowledge J. Mark for providing immobilized trypsin columns. Finally, the authors wish to thank M. Rodriguez, D. Kinder, M. Green and J. Berman (all of Glaxo Research Institute) for providing some of the synthetic peptides used in this study.

VI. REFERENCES
1. Koch, C.A., Anderson, D., Moran, M.F., Ellis, C. and Pawson, T. (1992) Science 252, 668-674.
2. Koch, C.A., Moran, M.F., Anderson, D., Liu, X., Mbamalu, G. and Pawson, T. (1992) Molec. Cell Biol. 12 (3), 1366-1374.
3. Pawson, T. and Gish, G.D. (1992) Cell 71, 359-362.
4. Songyang, Z. et al. (1993) Cell 72, 767-778.
5. Marengere, L.E.M., Songyang, Z., Gish, G.D., Schaller, M.D., Parsons, J.T., Stern, M.J., Cantley, L.C. and Pawson, T. (1994) Nature 369, 502-505.
6. Cantley, L.C. and Songyang, Z. Proceedings of the American Society of Biochemistry and Molecular Biology, Wash., D.C., 1994. Book of Abstracts.
7. Hutchens, T.W. and Yip, T.-T. (1993) Rapid Commun. Mass Spectrom. 7, 576-580.
8. Kassel, D.B., Musselman, B.D. and Smith, J.A. (1991) Anal. Chem. 63, 1091-1096.
9. Knight, W.B. et al., Glaxo Research Institute, manuscript in preparation.
10. Consler, T.G., Glaxo Research Institute, unpublished results.
11. Burkhart, W. (1993) in "Techniques in Protein Chemistry IV," Academic Press, Inc., R. Hogue Angeletti, Ed., pp. 399-406.
12. Maleknia, S., Dixon, J.D., Mark, J. and Afeyan, N.B. (1994) 42nd ASMS Conference on Mass Spectrometry and Allied Topics, Chicago, IL, Abstract.
13. Zhu, G., Decker, S.J., Maclean, D., McNamara, D.J., Singh, J., Sawyer, T.K. and Saltiel, A.R. (1994) Oncogene 9, 1379-1385.
14. Taylor, S.J. and Shalloway, D. (1994) Nature 368, 867-870.
Edman Degradation and MALDI Sequencing Enables N- and C-Terminal Sequence Analysis of Peptides
Roland Kellner¹, Gert Talbo, Tony Houthaeve, and Matthias Mann
European Molecular Biology Laboratory, D-69012 Heidelberg, Germany
I. Introduction
In recent applications of protein characterization we focused on the in matrix digestion of samples and automated Edman degradation of the resulting peptide fragments [1]. This strategy proved to be advantageous with demanding biological samples like membrane proteins and cell signalling components, and we could successfully characterize proteins with starting amounts down to ca. 25 pmol sample [2-5]. However, inherent limitations of the Edman chemistry often cause ambiguous results in the low picomole range. The signals of the first amino acid residue(s) are often overlapped by background signals; residues like tryptophan and cysteine are hardly detected; and the C-terminal end of a peptide may not be identified due to sample washout. The molecular weight information from mass analysis can be used either to confirm results from Edman sequencing or to decide minor ambiguities. However, frequently more than one peak appears in the mass spectrum, or more than one amino acid is unassigned. This makes it impossible to correlate the Edman and the MS data by the measured mass alone. Recently, fragmentation by post-source decay (PSD) was introduced as a technique to obtain structural information in MALDI MS [6,7]. We describe here the combined use of automated Edman sequencing and MALDI sequencing for the determination of proteolytic peptide fragments in the low picomole range.

¹Present address: Institute for Physiological Chemistry and Pathobiochemistry, Johannes-Gutenberg-University, D-55099 Mainz, Germany
II. Materials and Methods
A. Sample Preparation
Proteins were digested in the gel matrix as described [5]. Briefly, after separation by 2D-gel electrophoresis and staining with Coomassie Blue R250, the protein spots were excised and thoroughly washed. 1 µg protease (e.g. trypsin or chymotrypsin, sequencing grade, Boehringer Mannheim, Germany) dissolved in 100 µl 100 mM NH4HCO3, pH 8.0 and 0.5 mM CaCl2 was added and digestion was performed at 37°C overnight. Peptide fragments were then extracted from the gel slice using 3 x 100 µl 70% trifluoroacetic acid in water and 3 x 100 µl trifluoroacetic acid / acetonitrile 1:1. The supernatants were combined and concentrated. The peptides were separated by RP-HPLC (Vydac C18 218TP 1.6 x 250 mm, 120 µl/min) and peak fractions were collected manually.

B. Amino Acid Sequence Analysis
A major aliquot of a peak fraction (90%) was subjected to automated Edman degradation (model 477, Applied Biosystems). Typically, fractions of ca. 60 µl were applied to a polybrene-coated glass fibre filter and sequenced.

C. Mass Analysis
About 10% of the sample was concentrated in a vacuum centrifuge to give an estimated concentration of 1 pmol/µl if possible. Samples were not dried down completely to avoid sample loss. The MALDI matrix was a saturated solution of α-cyano-4-hydroxycinnamic acid dissolved in water/acetonitrile (7:3) [8]. A 0.5-µl aliquot of the matrix solution was placed on the stainless steel probe and mixed with an equal volume of sample solution. The mixture was left to dry at room temperature prior to introduction into the mass spectrometer. A new sample preparation method which decouples matrix surface preparation and sample handling was also used [9]. Mass spectra were acquired using a time-of-flight MALDI mass spectrometer (Bruker REFLEX, Bruker-Franzen, Bremen, Germany) equipped with a reflector. The ion signals were monitored by a LeCroy 9450 digital oscilloscope (400 MHz sampling rate) and the spectra were transferred to a Macintosh Quadra 950, where sets of data were averaged. The acceleration voltage was set to 23 kV and for the stable ion measurements the reflector voltage was set to 26 kV. Depending on the signal-to-noise ratio, each spectrum was the average of 50 to 200 single shot spectra acquired in sets of 5 shots. The mass spectrometer was equipped with a set of short deflection plates and fast pulse electronics. Together they allow selection of a small mass range of interest in a mixture. After the stable spectrum had been obtained in the linear mode, a pulse window was set around the precursor ion of interest, deflecting all other ions.
Metastable reflector spectra were obtained by the method of Spengler and Kaufmann [7] as modified for a gridless reflector [10]. Reflector voltage steps were chosen to result in overlapping mass ranges for metastable ions (see below). Calibration of the metastable spectra was performed as described [10]. Metastable spectra were interpreted manually using the observation that low energy fragments, especially A, B and Y ions, are predominant. Y ions can often be identified by a simultaneous loss of 17 Da.
III. Results and Discussion
Previously, we gave a short description of the principle of MALDI sequencing [11]. In the following we will briefly expand on the basics of the process as applied to peptide sequencing. A time-of-flight instrument offers several different modes for mass analysis: linear, reflector and reflector metastable mode (MALDI sequencing). In the linear mode, ions are measured at the end of the flight tube. Alternatively, the flight direction of ions can be reversed by a reflector field and the ions are measured at a second detector. The reflector mode compensates for some of the energy spread and, therefore, increases resolution and mass accuracy. In all modes ions are created directly at or above the target surface by the interaction of the laser beam with the matrix material. Normally, ions will be formed whose mass corresponds to the complete molecular weight of the analyte molecule plus the charge agent. The molecular ions created may be stable or they may fragment before detection. Particularly weak bonds may result in fragmentation on the surface of the target, the so-called 'prompt fragmentation'. In the linear mode these fragments show up at distinct flight times in the MS spectrum. However, if the fragmentation occurs after the acceleration region, where the ions have acquired their full kinetic energy, those fragments will reach the detector at about the same time as their stable counterparts. Then there will be only one peak per component even though there may be a significant degree of fragmentation [12]. The reflector mode is somewhat more complex. The reflector constitutes an energy analyzer, which time focuses ions that have the full acceleration voltage. However, it will separate fragment ions that have the same velocity, because they originate from the same parent ion, but different mass and hence different energy. Light fragment ions will not penetrate as far into the reflector and will appear earlier in the time-of-flight spectrum.
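The energy argument above can be made concrete with a short calculation. This is a minimal sketch under stated assumptions, not the instrument's calibration routine: ions are singly charged, the decay releases negligible kinetic energy (so a fragment keeps the precursor's velocity), and the reflector is idealised so that the voltage needed to turn an ion around scales with its kinetic energy. The 23 kV and 26 kV settings come from the Methods; the precursor and fragment masses are the labelled values of the chymotryptic example peptide, and the function names are illustrative.

```python
def fragment_energy_kev(accel_kv, precursor_mass, fragment_mass):
    """Kinetic energy (keV) of a singly charged metastable fragment.

    Same velocity as the precursor means the energy scales with the
    mass ratio fragment/precursor.
    """
    return accel_kv * fragment_mass / precursor_mass


def scaled_reflector_kv(reflector_kv, precursor_mass, fragment_mass):
    """Idealised reflector voltage at which the fragment penetrates as
    deeply as the precursor does at the full reflector voltage."""
    return reflector_kv * fragment_mass / precursor_mass


ACCEL_KV = 23.0       # acceleration voltage from the Methods
REFLECTOR_KV = 26.0   # reflector voltage for stable ions from the Methods
PRECURSOR = 1434.74   # MH+ of the example peptide
FRAGMENT = 687.34     # one labelled fragment of that peptide

energy = fragment_energy_kev(ACCEL_KV, PRECURSOR, FRAGMENT)
reflector = scaled_reflector_kv(REFLECTOR_KV, PRECURSOR, FRAGMENT)
print(f"fragment energy  : {energy:.2f} keV of {ACCEL_KV:.0f} keV")
print(f"reflector setting: {reflector:.2f} kV of {REFLECTOR_KV:.0f} kV")
```

Under these assumptions a fragment of roughly half the precursor mass carries roughly half the energy, which is why the reflector voltage has to be lowered in steps to bring successive fragment mass ranges into focus.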
After proper calibration, correct masses can be assigned to a segment of the metastable ion spectrum [13]. Procedures for calibration of metastable fragmentation spectra are now available for most commercial MALDI mass spectrometers. By stepping the reflector voltage the whole product ion mass range can be covered. In the same measurement, a signal for the corresponding neutral fragments can be obtained on the linear detector. The presence of the neutral fragment peak ensures that peptide precursor ions have been formed. This is helpful in mass ranges where no fragment ions are produced. The intensity of the neutral peak is related to the degree of fragmentation of the sample and may help in determining the level of laser irradiance to apply.

A. Examples for the identification of proteolytic peptide fragments by Edman degradation and MALDI sequencing
1. A 14 kDa vesicular membrane protein which had been the subject of study for two years was separated on a 2D-gel. Poor staining was achieved with Coomassie Blue. Nevertheless, we excised that spot because of the low abundance of the sample. The position of the protein in the gel was determined by comparison to an earlier, silver-stained gel which had given a more intensive staining with less sample. Chymotrypsin was chosen as protease because trypsin failed to cleave this membrane protein in a first attempt. The peak intensity after HPLC separation of the digestion mixture showed the low sample amount (Figure 1). Ninety percent of a peak fraction was applied to the sequencer. The following sequence information for the 12-residue peptide KQYHENISAVVF could be determined by Edman degradation in the range of 1.5 pmol (Tyr3: 1.4 pmol). It contained several ambiguities, namely His4, Ser8 and Val10 (Figure 2). Simultaneously, an aliquot of the fractionated peptide (5 µl = ca. 10%) was used for MALDI MS. The reflector spectra showed a monoisotopic molecular weight of 1434.7 Da, which means an accuracy of 40 ppm (Figure 3). Furthermore, MALDI sequencing resulted in the peak pattern shown in Figure 4. The fragmentation pattern clearly identified a Ser residue at position 8 and Val at position 10 in the molecule. Together with the overall mass a His residue at position 4 could be derived. Ile and Leu were already distinguished by Edman sequencing.
The combined interpretation of these results gave the unequivocal determination of this chymotryptic peptide fragment.
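The way mass data resolve such ambiguities can be sketched numerically. The block below is an illustration under stated assumptions, not the authors' interpretation procedure: it uses standard monoisotopic residue masses, computes the protonated mass of KQYHENISAVVF, and builds the N-terminal b-ion ladder, in which the spacing between neighbouring ions equals one residue mass. The function names are illustrative.

```python
# Monoisotopic residue masses in Da (standard values).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.01056, 1.00728


def mh_plus(seq):
    """Monoisotopic mass of the singly protonated peptide."""
    return sum(MONO[aa] for aa in seq) + WATER + PROTON


def b_ions(seq):
    """Singly charged b-ion masses b1 .. b(n-1)."""
    total, ladder = PROTON, []
    for aa in seq[:-1]:
        total += MONO[aa]
        ladder.append(total)
    return ladder


PEPTIDE = "KQYHENISAVVF"
mh = mh_plus(PEPTIDE)
ladder = b_ions(PEPTIDE)
print(f"MH+ = {mh:.2f}")
print(f"40 ppm at this mass = {40e-6 * mh:.3f} Da")

# Neighbouring b ions differ by exactly one residue mass, so the
# spacing around positions 8 and 10 distinguishes Ser and Val.
for i in range(1, len(ladder)):
    print(f"b{i + 1} - b{i} = {ladder[i] - ladder[i - 1]:8.3f}  ({PEPTIDE[i]})")
```

The computed MH+ agrees with the 1434.74 label of the reflector spectrum, and a spacing of about 87.03 or 99.07 Da in the ladder points to Ser or Val respectively.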
Figure 1. In matrix chymotryptic digestion of a 14 kDa vesicular membrane protein separated on RP-HPLC and UV-detection at 214 nm (solid line). The peaks were fractionated manually. From the same 2D-gel a blank gel piece was treated simultaneously and used as a comparison to eliminate background peaks (dashed line).
Figure 2. Edman degradation of a chymotryptic peak fraction. Only the finally assigned PTH-amino acids of the 12-residue peptide are labelled (see text).

Figure 3. Reflector MALDI MS was performed on an aliquot of the chymotryptic peak fraction (Figure 2). The monoisotopic peak could be determined via an internal standard to an accuracy of 40 ppm. Spectrum label: KQYHENISAVVF, MH+ = 1434.74 amu.
Figure 4. MALDI sequencing of a chymotryptic peptide fragment. The overlapping mass ranges for metastable ions are due to steps in the reflector voltage. Labels in the figure: sequence K-Q-Y-H-E-N-I-S-A-V-V-F; fragment masses 687.34, 801.38, 914.46, 1001.50, 1072.53, 1171.60 and 1271.67; vertical axis, Signal (mV).
Figure 5. Reflector MALDI MS. The tryptic peptide ALLNNSHFYHLAHGKDFASR has a calculated molecular weight of [M+H]+ = 2299.56 Da.
2. In another example, a tryptic fragment of a 52 kDa protein was subjected to Edman degradation and a 15-residue peptide was identified. It contained two ambiguities: ALLNNSH(F/Y)YHLA(H)GK; Lys15 was assumed to be the C-terminus of the tryptic fragment. However, the molecular weight was determined by MALDI to be 2299.4 Da (Figure 5), which implied that the peptide was larger than 15 residues. Therefore, the component was further analysed by MALDI sequencing. Interpretation of several fragmentation spectra identified position 8 to be a Phe, and position 13 could be confirmed to be His; the fragment ions were in agreement with the Edman result (Figure 6A and 6B). A database search showed strong homology to a known protein and two amino acid exchanges were identified: Phe8 instead of Tyr, and Leu11 instead of Met. In the protein sequence the 15-residue fragment is followed by the tryptic peptide DFASR. The calculated mass of the extended 20-residue peptide agrees with the measured molecular weight of 2299.4 Da. MALDI sequencing was particularly important to determine and verify the amino acid exchanges between the studied protein and the sequence obtained from the database.
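The mass argument in this example can be checked with a few lines of arithmetic. This is a sketch under the assumption that the calculated value quoted for the 20-residue peptide (2299.56) is an average, not monoisotopic, mass; the residue values are standard average masses and the function name is illustrative.

```python
# Average residue masses in Da (standard values).
AVG = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_AVG, PROTON = 18.0153, 1.0073


def mh_plus_avg(seq):
    """Average mass of the singly protonated peptide."""
    return sum(AVG[aa] for aa in seq) + WATER_AVG + PROTON


EDMAN_15 = "ALLNNSHFYHLAHGK"      # peptide read by Edman degradation
EXTENSION = "DFASR"               # following tryptic peptide in the database
MEASURED = 2299.4                 # MALDI value from the text

mh15 = mh_plus_avg(EDMAN_15)
mh20 = mh_plus_avg(EDMAN_15 + EXTENSION)
print(f"15-mer MH+ = {mh15:.2f}  (measured minus 15-mer: {MEASURED - mh15:.2f})")
print(f"20-mer MH+ = {mh20:.2f}  (measured minus 20-mer: {MEASURED - mh20:.2f})")
```

The 15-residue peptide alone falls several hundred daltons short of the measured value, while appending DFASR (one missed cleavage after Lys15) brings the calculated mass to within a few tenths of a dalton of it.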
Figure 6. The molecular ion peak (Figure 5) was subjected to MALDI sequencing. The mass range from 650 - 1000 Da, in A, displays the Y series LAHGK and the B series HF; in B, the Y and the B series YHLA from 1000 - 1350 Da are shown.
IV. Conclusions
Automated Edman degradation or MALDI sequencing can in principle yield the complete sequence of peptides. However, the amounts of peptide available for sequencing in demanding biological problems are very low. Both methods need to be operated at their highest possible performance, and their inherent limitations must be considered. A combined application of both methods is feasible because of the very low sample consumption of MALDI MS. As shown in our examples, ambiguous sequence calls could be clarified by the complementary information. Applied to proteolytic peptide fragments, MALDI sequencing often resulted in metastable ions from the C-terminal part. In this way the N-terminal sequence is obtained by Edman degradation, the molecular weight determines the overall size of the fragment, and ambiguous amino acid residues are identified by MALDI sequencing.
References
1. Kellner, R., Houthaeve, T., Kurzchalia, T.V., Dupree, P., Simons, K. (1992) J. Prot. Chem. 11, 356.
2. Kurzchalia, T.V., Dupree, P., Parton, R., Kellner, R., Lehnert, M., Simons, K. (1992) J. Cell Biol. 118, 1003-1014.
3. Emans, N., Gorvel, J.P., Walter, C., Gerke, V., Kellner, R., Griffith, G., Gruenberg, J. (1992) J. Cell Biol. 120, 1351-1369.
4. Kurzchalia, T.V., Gorvel, J.P., Dupree, P., Parton, R., Kellner, R., Houthaeve, T., Gruenberg, J., Simons, K. (1992) J. Biol. Chem. 267, 18419-18423.
5. Fiedler, K., Parton, R.G., Kellner, R., Etzold, T., Simons, K. (1994) EMBO J. 13, 1729-1740.
6. Spengler, B., Kirsch, D., Kaufmann, R., Jaeger, E. (1992) Rapid Commun. Mass Spectrom. 6, 105-108.
7. Kaufmann, R., Spengler, B., Lutzenkirchen, F. (1993) Rapid Commun. Mass Spectrom. 7, 902-910.
8. Beavis, R., Chait, B. (1989) Rapid Commun. Mass Spectrom. 3, 432-435.
9. Vorm, O., Roepstorff, P., Mann, M. (1994) Anal. Chem., in press.
10. Vorm, O., Talbo, G., Mortensen, P., Mann, M. (1994) personal communication.
11. Talbo, G., Mann, M. (1994) In: Techniques in Protein Chemistry V (J.W. Crabb, ed.), p. 105-114, Academic Press, San Diego.
12. Spengler, B., Kirsch, D., Kaufmann, R. (1992) J. Phys. Chem. 96, 9678-9684.
13. Vorm, O., Talbo, G., Mann, M. (1994) personal communication.
Identification of the Amino Terminal Peptide of N-terminally Blocked Proteins by Differential Deutero-Acetylation Using LC/MS Techniques
Craig D. Thulin and Kenneth A. Walsh
Dept. of Biochemistry, University of Washington, Seattle, WA 98195
I. Introduction More than 80% of eukaryotic proteins are blocked at their amino terminus, often by N-acetylation (1, 2). This prevents direct analysis of the primary structure of many proteins by standard techniques. Instead, blocked proteins are usually digested with a high-specificity protease and the resultant peptides separated chromatographically and sequenced in turn. The one peptide which is refractory to sequencing is then assumed to be the amino terminal peptide, and amino acid analysis yields its composition (see 3, 4 for examples). Chemical analysis of the hydrolytic products, e.g. by gas chromatography, can reveal the nature of the blocking group (4). Alternately, proteins or peptides can be deblocked using specific enzymes or chemical treatment (5), although success is variable. More recently, mass spectrometry has been applied to the identification of blocked N-terminal peptides and tandem MS/MS techniques to the characterization of the blocking moiety (see 6, 7 and 8). Until now no facile method has been available to identify the amino terminal peptide in a mixture. Although LC/MS does allow the analysis of a complex mixture of peptides, one must examine every component in order to identify the amino terminal peptide, even when one knows the amino acid sequence and suspects the nature of the blocking group. We report a method that facilitates the identification of the blocked amino terminal peptide in a digest of a blocked protein. This method can be applied even if the amino acid sequence and blocking group are unknown. In this method, the amino groups on lysine side chains are first deuteroacetylated, then the protein is digested with a high-specificity protease. The digestion mixture is divided and half of it retreated with deuterated acetic anhydride. Only one peptide (from the blocked N-terminus) should not be altered by the second deutero-acetylation; it is identified by comparing LC/MS patterns of the treated and the untreated digest. 
The use of tandem mass spectrometry (MS/MS) then provides the sequence of that peptide and the identity of the blocking group.
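The logic of the method rests on two mass shifts, which can be written down explicitly. The following is a minimal sketch using standard monoisotopic atomic masses; it assumes complete lysine modification before digestion, so that each internal peptide gains exactly one deuterated acetyl group (on its new α-amino group) in the second treatment. The names are illustrative.

```python
H_MASS = 1.007825   # 1H
D_MASS = 2.014102   # 2H

# Net mass added when an amine is acetylated (C2H2O).
ACETYL = 42.010565
# Net mass added by a trideutero-acetyl group (CD3CO in place of CH3CO).
D3_ACETYL = ACETYL + 3 * (D_MASS - H_MASS)


def reacetylation_shift(has_free_alpha_amine):
    """Mass change of a peptide in the second deutero-acetylation,
    assuming all lysine side chains were already modified."""
    return D3_ACETYL if has_free_alpha_amine else 0.0


print(f"natural acetyl       : +{ACETYL:.4f} Da")
print(f"deuterated acetyl    : +{D3_ACETYL:.4f} Da")
print(f"label difference     : {D3_ACETYL - ACETYL:.4f} Da")
print(f"internal peptide     : {reacetylation_shift(True):+.4f} Da")
print(f"blocked N-term. pept.: {reacetylation_shift(False):+.4f} Da")
```

Because the deuterated group is about 3 Da heavier than a natural acetyl group, an acetyl group introduced by the reagent can be told apart from one already present on the native protein.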
II. Materials and Methods
A. Modification of Lysine Side-chains
Acetylation was performed using a modification of the method of Fraenkel-Conrat (9). Horse cytochrome-C (Sigma) was dissolved in 100 mM Tris pH 8.0 to give a concentration of 0.25 mg/ml. To approximately 2 nmol of protein (100 µl)
was added an equal volume of 400 mM sodium acetate. The protein solution was cooled in an ice bath and 0.5 µl of deuterated acetic anhydride (99+ atom % D; a generous gift of Dr. Hiroshi Ohguru) was added every 20 minutes for one hour. The modified protein was desalted with a POROS R1/M 2.1 mm x 30 mm reverse-phase column (PerSeptive Biosystems, Inc.) using a step gradient from 0 to 80% acetonitrile in 0.05% trifluoroacetic acid (TFA). Fractions were collected and lyophilized.
B. Proteolytic Digestion of Modified Protein
The modified protein was redissolved in 100 µl 100 mM Tris pH 8.0 and 4 µl of a 0.1 mg/ml solution of chymotrypsin (to give approx. 1:100 enzyme:protein ratio) was added. Digestion was carried out at 37°C for 18 hours.
C. Re-acetylation (of New Amino Termini)
To one half of the digest was added 50 µl of 100 mM Tris pH 8.0 and 100 µl of 400 mM sodium acetate. This mixture was then placed on ice and three additions of 0.5 µl of deuterated acetic anhydride were made over one hour as before.
D. Identification of Amino Terminal Peptide
Both the re-acetylated and the untreated digest were analyzed by liquid chromatography/mass spectrometry using an Applied Biosystems Model 140A HPLC with an Upchurch 2 mm C18 reverse-phase column and a PE Sciex API III triple-quadrupole ionspray mass spectrometer. At a flow rate of 200 µl/min the chromatography was developed with 0.05% TFA and a gradient of 0 to 60% acetonitrile, 0.03% TFA over 30 min. Ten percent of the HPLC effluent was directed to the mass spectrometer; the remainder was directed to an ABI Model 785A UV detector and fractions were collected by hand. Comparison of the LC/MS data before and after re-acetylation sought a single peptide that did not change mass or mobility. Data analysis was performed using the MacSpec software from PE Sciex, as well as an in-house program, Sherpa, written by J. Alex Taylor, which identifies and relates peptide m/z values in an LC/MS experiment to masses predicted from a given protein sequence. MacBioSpec (PE Sciex) was also used to predict masses for some modified peptides. Fractions containing ions of interest were infused into the mass spectrometer at 1.7 µl/min, and ions selected in the first quadrupole were analyzed by interpreting collisionally induced dissociation (CID) spectra. MacBioSpec was used to generate lists of expected mass spectral fragments for comparison to the observed data.
E. In situ derivatization on a Cationic PVDF Membrane
A large protein, rabbit glycogen phosphorylase b (97 kD), was not soluble at high enough protein concentration for the procedures used for cytochrome-C. To overcome this limitation, 20 µl of 9 M urea, 50 mM Tris pH 8 containing 10 mg/ml rabbit glycogen phosphorylase b (a gift from the laboratory of Dr. Edmond Fischer) was subjected to electrophoresis on a 4-16% gradient SDS polyacrylamide gel, blotted to Immobilon™-CD (Millipore Corp.), a cationic PVDF membrane, and stained with Immobilon-CD Stain according to the manufacturer's protocol. Stained bands were excised and cut into 1 mm² pieces. One hundred µl of 100 mM Tris pH 8.0 and 2 M urea were added to the
membrane pieces, followed by 100 µl of 400 mM sodium acetate. This was then cooled in an ice bath and 0.5 µl of deuterated acetic anhydride was added every 20 minutes for one hour. The supernatant and three 1 ml washes with distilled water were decanted and discarded. Subsequent digestion conditions were based on those of Hess et al. (10) as follows: 50 µl of 100 mM Tris pH 8, 1 M NaCl, 10% (v/v) acetonitrile, with 0.1 mg/ml chymotrypsin was added to the membrane pieces and incubated overnight at 37°C. Five µl of 9 M urea, 50 mM Tris pH 8 plus 1 µl of 1 mg/ml chymotrypsin was added, and the digest was incubated another 5 hours. Alternative treatment with trypsin used the same Tris, NaCl, acetonitrile mixture but with 2 mM CaCl2 and 0.01 mg/ml trypsin. After digesting overnight, 5 µl of 9 M urea, 50 mM Tris pH 8 and 0.25 µl of 2 mg/ml trypsin was added and the digestion incubated another 5 hours. In either case, the supernatant was then decanted and combined with 50 µl 100 mM Tris pH 8. The resulting 100 µl was divided in half, and one half was re-deutero-acetylated as in step C above. Analysis was conducted as previously indicated.
III. Results and Discussion
Cytochrome-C and glycogen phosphorylase b were chosen as model proteins for this study. Both have been well characterized and in both cases the amino termini
Figure 1. Comparison of contour plots of LC/MS of digest of modified cytochrome-C before (a) and after (b) re-acetylation with deuterated reagent. Axes: elution time (min) against m/z. Contour plots were printed in black and white mode to improve the signal-to-noise ratio seen in the figure. Peaks from the digest before re-acetylation are circled and the circles transferred to the plot of the digest after re-acetylation for comparison. Note that only two significant peaks are found in both profiles (large arrows). The shared peak indicated by the smaller arrow results from incomplete acetylation (see text).
are known to be acetylated (4,11). The protein was first deutero-acetylated so that subsequent deutero-acetylation of free amino groups after proteolytic digestion would alter the behavior of all peptides except the amino terminal peptide. In the case of cytochrome-C, the protein was modified in solution. Tryptic digestion of the modified cytochrome-C would have resulted in only three large fragments; hence chymotrypsin, which is predicted to cleave primarily at the nine aromatic residues of the protein, was chosen for proteolysis. LC/MS analysis of the digest revealed that the peptides were modified as expected (data not shown). No peptides containing deutero-acetylated tyrosine were discovered. All modified peptides were homogeneous in mass, indicating complete and specific modification without incorporation of natural isotope acetyl groups. After cleavage with the protease, half of each digest was re-acetylated at the new amino termini with deuterated acetic anhydride. This ensured that any acetyl group added during this modification could be distinguished by mass spectrometry from naturally occurring acetyl groups in the native protein. The digests before and after re-acetylation were then compared by reverse phase HPLC coupled with ion spray mass spectrometry. A stream splitter in line after the column diverted 90% of the effluent into a UV detector, and fractions were collected for subsequent analysis. The two-dimensional separation (separation by time of elution from the column in one dimension, separation by mass-to-charge ratio in the other) was printed in the form of contour plots for each of the experiments. Comparison of the contour plot of a chymotryptic digest of cytochrome-C with that of the re-deutero-acetylated chymotryptic digest revealed two major and two minor peptides that had unchanged mobility and mass (Fig. 1). Each of these was investigated as the possible amino terminal peptide.
One of the two major shared species eluted at about 21 minutes. It had a mass-
Figure 2. Collisionally induced dissociation (CID) spectra with doubly charged ions of a) m/z=649.4 and b) m/z=567.1. These spectra show that the ion at m/z=567.1 is the b9 fragment of that at m/z=649.4. The CID spectrum of the latter ion was consistent with the sequence of the known N-terminal peptide from cytochrome-C, namely Ac-Gly-Asp-Val-Glu-Lys*-Gly-Lys*-Lys*-Ile-Phe, where the Lys* residues have a deuterated acetylation. Ions labelled with an arrow pointing right are b fragment ions, those labelled with an arrow pointing left are y fragment ions.
to-charge ratio (m/z) of 556.3 in both LC/MS contour profiles. This m/z corresponds to that of leucine enkephalin, which had been used as a standard in the optimization of the mass spectrometer for LC/MS experiments, and was seen in blank runs (data not shown). The identity was confirmed by tandem mass spectrometry (data not shown). Leucine enkephalin thus serves as an internal standard that ensures alignment of the two contour plots. The most intense ion common to both plots eluted at about 18 minutes as the doubly charged form of a peptide of molecular weight 1297. Tandem mass spectrometry of this doubly charged ion (m/z 649.4) gave the CID spectrum seen in Figure 2. The sequence of b-ions is complete between the ion at m/z 100 (which corresponds to acetyl-Gly) and the ion at m/z 1132 (which corresponds to acetyl-Gly-Asp-Val-Glu-Lys-Gly-Lys-Lys-Ile, where the lysyl residues are deutero-acetylated). The sequence of y-ions is complete between the ion at m/z 166 (Phe-COOH) and the ion at m/z 985 (Glu-Lys-Gly-Lys-Lys-Ile-Phe-COOH, with deutero-acetylated lysines as above). The mass of the intact peptide and the CID fragment spectrum are consistent with this being the amino terminus of the protein, with the amino terminus bearing the only non-deuterated acetyl group. A less intense ion, with a mass-to-charge ratio of 567.1, coeluted with those identified to be from the amino terminus. It was found to give a CID spectrum very similar to that of the amino terminal peptide (Fig. 2), except that the y-ions, representing the carboxy terminus, are different. This species proved to be a fragment of the amino terminal peptide (equivalent to the b9 fragment ion) that was generated in the nebulizer. This interpretation was in accord with the observation that it increased, relative to the ion of m/z 649.4, as the orifice voltage was increased. One other minor shared species eluted at about 19 minutes and has a mass-to-charge ratio of 627.1.
This corresponds to the (M+2H)++ of the amino terminal peptide minus one deuterated acetylation. As there are three lysines in this peptide, two being adjacent to one another, this may explain the incomplete modification. In the case of proteins that are not soluble under the conditions used for acetylation of side chain amino groups, SDS-PAGE is employed, followed by transfer to a cationic PVDF membrane, in situ modification of the lysines, and proteolysis. The resulting peptides are recovered from the membrane and the •
Figure 3. Contour plot of an LC/MS of a chymotryptic digest of modified phosphorylase b, printed in black and white mode (axes: m/z, 400 to 1000, versus elution time in min). Circles indicate the positions of peaks from a contour plot of the re-acetylated digest for comparison. The two significant shared peaks (indicated by arrows) are the N-terminal blocked peptide (at m/z=514 and 13 min elution time) and the leucine enkephalin internal standard (at m/z=556 and 21 min elution time).
Craig D. Thulin and Kenneth A. Walsh
Figure 4. CID spectrum of the N-terminal chymotryptic peptide from phosphorylase (x-axis: m/z, 100 to 600). All labelled ions are b fragment ions, except the ion at m/z=229, which is the y2 ion, and that at m/z=514, which is the parent ion. The spectrum is consistent with the known sequence, namely Ac-Ser-Arg-Pro-Leu (indicating that the chymotrypsin cleaved at a leucine residue).
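The fragment assignments in Figure 4 can be checked by simple arithmetic. The following is a minimal sketch, assuming standard monoisotopic residue masses; the function and constant names are illustrative and are not part of the original work. It sums residue masses from the acetylated amino terminus (b ions) and from the carboxy terminus (y ions) of Ac-Ser-Arg-Pro-Leu.

```python
# Monoisotopic residue masses (Da) for the four residues of Ac-Ser-Arg-Pro-Leu.
RESIDUE = {"S": 87.03203, "R": 156.10111, "P": 97.05276, "L": 113.08406}
ACETYL = 42.01057   # net mass added by N-terminal acetylation (C2H2O)
WATER = 18.01056
PROTON = 1.00728


def fragment_ions(seq, n_term_mod=0.0):
    """Return singly charged b ions (b1..bn) and y ions (y1..y(n-1))."""
    masses = [RESIDUE[r] for r in seq]
    b_ions, running = [], n_term_mod
    for m in masses:                      # build up from the N-terminus
        running += m
        b_ions.append(running + PROTON)
    y_ions, running = [], WATER
    for m in reversed(masses[1:]):        # build up from the C-terminus
        running += m
        y_ions.append(running + PROTON)
    return b_ions, y_ions


b_ions, y_ions = fragment_ions("SRPL", n_term_mod=ACETYL)
parent = sum(RESIDUE[r] for r in "SRPL") + ACETYL + WATER + PROTON

print([round(m, 1) for m in b_ions])
print([round(m, 1) for m in y_ions])
print(round(parent, 1))
```

Under these assumptions the calculated b2, b4, y2, and parent ions round to the nominal values labelled in the spectrum (286, 496, 229, and 514).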
analysis can proceed as outlined above. This modified procedure also allows the partial purification of the protein in the initial step. Phosphorylase was analyzed in this way, derivatizing side chain amino groups in situ on a cationic PVDF membrane. The modified protein was then digested on the membrane with either chymotrypsin or trypsin and the solubilized peptides examined as above. Comparison of the contour plot from the chymotryptic digest of phosphorylase to that after re-acetylation revealed only two ions shared by both (Fig. 3). One peptide eluted at about 20 minutes with an m/z of 556.6, again representing the leucine enkephalin internal standard. The other shared ion eluted at about 12 minutes with m/z 514.5. Tandem mass spectrometry of this ion gave a CID spectrum (Fig. 4) consistent with the amino terminal four amino acids, including the acetylated (natural abundance) amino terminal serine. This indicates that the chymotrypsin cleaved at the fourth residue, a leucine. Comparison of the contour plots of trypsin-treated phosphorylase before and after re-acetylation shows two significant shared ions (Fig. 5). One of these is the leucine enkephalin ion seen in all of these experiments. The other elutes at about 13 minutes and has a mass-to-charge ratio of 652.1. Tandem mass spectrometry (Fig. 6) identified this ion as the (M+2H)2+ of the amino terminal peptide (Ac-SRPLSDQEK*R), MW 1302, with acetylated N-terminus and one modified lysine (note that phosphorylase b is not phosphorylated at Ser 5). Concern may arise that elimination of the basic lysine residues, especially with amino terminally blocked peptides, might preclude detection by positive ion mode mass spectrometry. However, of the seven chymotryptic peptides from cytochrome c that contain no basic residues other than lysine, all seven were detected in the LC/MS experiment, even after re-acetylation. Only one of these peptides (mass 2094) exhibited a weak signal.
Ions with two and even three positive charges were visible with these peptides. These results show that the amino terminal peptide of N-terminally blocked proteins can be detected without analyzing every ion produced in an LC/MS experiment. A somewhat related method was developed more than ten years ago by Kaplan and Oda (12). In their procedure free amino groups on side chains were
Figure 5. Contour plot of an LC/MS of a tryptic digest of modified phosphorylase b, printed in black and white mode (axes: m/z, 400 to 1000, versus elution time in min). Circles indicate the positions of peaks from a contour plot of the re-acetylated digest for comparison. The two significant shared peaks (indicated by arrows) are the N-terminal blocked peptide (at m/z=652 and 13 min elution time) and the leucine enkephalin internal standard.

citraconylated before proteolysis, and then the new amino termini were dinitrophenylated. After subsequent de-citraconylation, the dinitrophenylpeptides were adsorbed on a polystyrene column, and the amino terminal peptide flowed through the column. Additional steps were required when histidine or tyrosine was in the N-terminal peptide. In the present method we also exploit the unique resistance of the blocked amino terminal peptide to modification but use LC/MS techniques. It is not necessary to purify the amino terminal peptide; the separation in two dimensions (elution time and m/z) by LC/MS is sufficient to identify this peptide in the mixture. The high resolution of mass analysis assures distinction between naturally occurring acetylation (with naturally abundant isotopes) and deuterated acetyl groups introduced in the procedure. Modification with different chemical moieties and deblocking of lysine side chains are unnecessary.
Figure 6. CID spectrum of the N-terminal tryptic peptide from phosphorylase b. The spectrum is consistent with the known sequence, namely Ac-Ser-Arg-Pro-Leu-Ser-Asp-Gln-Glu-Lys*-Arg, where the Lys* has a deuterated acetylation. Ions labelled with an arrow pointing right are b fragment ions; those labelled with an arrow pointing left are y fragment ions. The ion at m/z=652 is the doubly charged parent ion. (Note that trypsin does not cleave at Arg-Pro.)
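The parent ion masses quoted in the text follow from the residue masses plus one natural-abundance acetyl group at the amino terminus and one deuterated acetyl group per lysine. The sketch below is an illustration under stated assumptions: monoisotopic masses are used, the deuterated acetyl increment is taken as that of CD3CO replacing H, and all names are hypothetical rather than taken from the authors' software.

```python
# Monoisotopic residue masses (Da); standard values.
RESIDUE = {
    "G": 57.02146, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "L": 113.08406, "I": 113.08406, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "F": 147.06841, "R": 156.10111,
}
WATER = 18.01056
PROTON = 1.00728
ACETYL = 42.01057                                 # CH3CO replacing H
D3_ACETYL = ACETYL + 3 * (2.014102 - 1.007825)    # CD3CO replacing H


def peptide_mass(seq, n_acetyl=0, n_d3_acetyl=0):
    """Neutral mass of a peptide carrying the given numbers of acetyl groups."""
    base = sum(RESIDUE[r] for r in seq) + WATER
    return base + n_acetyl * ACETYL + n_d3_acetyl * D3_ACETYL


def mz(mass, charge):
    """m/z of the (M + zH)z+ ion."""
    return (mass + charge * PROTON) / charge


# Ac-GDVEKGKKIF with three deutero-acetylated lysines (first peptide in the text).
first_peptide = peptide_mass("GDVEKGKKIF", n_acetyl=1, n_d3_acetyl=3)
# Ac-SRPLSDQEK*R with one deutero-acetylated lysine (Figure 6).
tryptic_peptide = peptide_mass("SRPLSDQEKR", n_acetyl=1, n_d3_acetyl=1)

print(round(first_peptide), round(mz(first_peptide, 2), 1))
print(round(tryptic_peptide), round(mz(tryptic_peptide, 2), 1))
print(round(D3_ACETYL - ACETYL, 3))   # mass difference per acetyl group
```

The difference between the two acetyl groups, about 3 Da per group, is what lets the naturally blocked amino terminus be told apart from amino groups acetylated during the procedure.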
Visentin and Kaplan (13) also employed a strategy similar to the present method but using [1-14C]- and [3H]acetic anhydride. The present method allows the analysis to be accomplished without resorting to radioisotopes. The method described in this report has recently been employed in the study of two amino terminally blocked proteins wherein the amino terminus had not previously been identified (Thulin and Walsh, in preparation; and Presland, Kimball, Thulin and Dale, in preparation).
IV. Conclusion

Comparison by LC/MS of enzymatic digests before and after modification with deuterated acetic anhydride allows facile identification of the amino terminal peptide of amino terminally blocked proteins. Subsequent MS/MS analysis establishes the specific nature of the blocking group and the sequence of the peptide. The chemical modifications involved (before digestion, to block free side chain amino groups, and after digestion, to modify new amino termini) are simple and effective. These procedures can be done in solution or after the protein has been separated by SDS-PAGE and blotted to a membrane. Modern mass spectrometric techniques greatly simplify this kind of analysis when compared with classical approaches to the same problem.
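The comparison of the two contour plots amounts to finding peaks that keep both their elution time and their m/z after re-acetylation. A minimal sketch of that comparison follows. The tolerances, the function name, and every peak other than the two shared ions reported for the chymotryptic digest of phosphorylase b are hypothetical values chosen for illustration only.

```python
def shared_peaks(before, after, rt_tol=1.0, mz_tol=0.5):
    """Return peaks of `before` that reappear in `after` at the same
    elution time (min) and m/z, within the given tolerances."""
    shared = []
    for rt_a, mz_a in before:
        for rt_b, mz_b in after:
            if abs(rt_a - rt_b) <= rt_tol and abs(mz_a - mz_b) <= mz_tol:
                shared.append((rt_a, mz_a))
                break
    return shared


# (elution time, m/z) pairs. Only the 514.5 and 556.6 ions come from the text;
# the remaining peaks are invented to stand in for internal peptides.
digest = [(9.3, 430.2), (12.0, 514.5), (15.8, 702.9), (20.0, 556.6)]
reacetylated = [(11.1, 475.2), (12.1, 514.5), (17.6, 725.4), (20.1, 556.6)]

print(shared_peaks(digest, reacetylated))
```

Peptides with a free amino terminus gain a deuterated acetyl group and shift in both dimensions, so only the blocked amino terminal peptide and the internal standard survive the filter.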
Acknowledgment

C.D. Thulin was supported by Public Health Service National Research Service Award T32 GM07270 from the National Institute of General Medical Sciences.
References

1. Brown, J.L. and W.K. Roberts (1975) J. Biol. Chem. 251(4); 1009-1014.
2. Brown, J.L. (1979) J. Biol. Chem. 254(5); 1447-1449.
3. Resing, K.A., K.A. Walsh, J. Haugen-Scofield, and B.A. Dale (1989) J. Biol. Chem. 264(3); 1837-1845.
4. Margoliash, E., E.L. Smith, G. Kreil, and H. Tuppy (1961) Nature 192; 1121-1127.
5. Tsunasawa, S. and H. Hirano (1993) in Methods in Protein Sequence Analysis, ed. by K. Imahori and F. Sakiyama, Plenum Press, NY, pp 45-53.
6. Labdon, J.E., E. Nieves, and U.K. Schubart (1992) J. Biol. Chem. 267(5); 3506-3513.
7. Gibson, B.W., A.M. Falick, J.J. Lipka, and L.A. Waskell (1990) J. Protein Chem. 9(6); 695-703.
8. Anderegg, R.J., S.A. Carr, I.Y. Huang, R.A. Hiipakka, C.S. Chang, and S.T. Liao (1988) Biochem. 27(12); 4214-4221.
9. Fraenkel-Conrat, H. (1957) Methods in Enzymology 4; 247-269.
10. Hess, D., T.C. Covey, R. Winz, R.W. Brownsey, and R. Aebersold (1993) Protein Science 2; 1342-1351.
11. Koide, A., K. Titani, L.H. Ericsson, S. Kumar, H. Neurath, and K.A. Walsh (1978) Biochem. 17; 5657-5572.
12. Kaplan, H. and G. Oda (1983) Anal. Biochem. 132; 384-388.
13. Visentin, L.P. and H. Kaplan (1975) Biochem. 14(3); 463-468.
SECTION II Analysis of Posttranslational Processing Events
HPAEC-PAD Analysis of Monoclonal Antibody Glycosylation
Jeffrey Rohrer, Jim Thayer, Nebojsa Avdalovic, and Michael Weitzhandler
Dionex Corporation, Sunnyvale, CA 94088

I. Introduction
Virtually all antibodies are glycoproteins that contain 2-3% carbohydrate by mass. The carbohydrate of IgG MAbs consists mainly of complex biantennary N-linked oligosaccharide chains. O-glycosylation has also been documented in the constant region hinge domain of mouse IgG2b (1). Glycosylation of immunoglobulins has been shown to have significant effects on their effector functions, stability, and serum half-life (2,3). Human and mouse IgG are N-glycosylated on each heavy chain in the constant region CH2 domain at Asn-297 (4). Glycosylation at this site affects Fc receptor binding and complement activation (2). Variable region glycosylation has also been reported (5-9), with differing effects on binding affinity (10-11). Sialylation of MAb oligosaccharides has been shown to be associated with decreased solubility for monoclonal IgMs and IgGs (12-15). Finally, the presence of α-galactosylation in murine IgGs has recently been documented and was postulated to affect clearance because it is known that 1% of circulating antibodies in humans are directed against this epitope (16). Because of these potential effects of glycosylation on MAb pharmaceutical efficacy, it is important to evaluate MAb glycosylation prior to entering clinical trials. Minimally, this analysis should determine whether the glycosylation of the product destined for clinical administration is comparable to that of a reference product, and whether the production process achieves a reproducible glycosylation pattern of the MAb. Monosaccharide and oligosaccharide analysis using HPAEC-PAD are two of the methods biopharmaceutical manufacturers use to characterize glycan structures and to monitor the lot-to-lot consistency of therapeutic glycoproteins. A strategy for the mapping of N-glycans by HPAEC-PAD was recently reported (17).
In this chapter, these methods were used in conjunction with oligosaccharide standards and endo- and exoglycosidases to identify the oligosaccharide structures present in MAb MY9-6.
TECHNIQUES IN PROTEIN CHEMISTRY VI Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.
Jeffrey Rohrer et al.

II. Methods
A. Chromatography
Reagents, a Dionex BioLC system, and conditions used for monosaccharide and oligosaccharide analysis were as described (18). MAb MY9-6 is an ascites-derived murine monoclonal IgG provided by Dr. Mark Hardy, ImmunoGen (Norwood, MA). The MAb was purified by Protein A chromatography. The UV detector configured after the electrochemical detector was a Dionex VDM2 (variable wavelength detector). Peaks were monitored at 215 nm. The sample pretreatment resin used in this study was derived from OnGuard-A cartridges available from Dionex (Sunnyvale, CA). Amino acids used in this study were obtained from Pierce (Rockford, IL).

III. Results and Discussion
A. Monosaccharide Analysis
An accurate molar ratio of composite sugars relative to protein 1) provides a basis for further structural elucidation of glycoproteins, 2) provides direct evidence that the polypeptide is glycosylated, 3) suggests classes of oligosaccharide chains, and 4) may serve as a measure of production consistency for therapeutic recombinant glycoproteins (19). Glycoproteins with low percentages of glycosylation represent a challenge for monosaccharide analysis. When there are large molar ratios of peptides and amino acids relative to monosaccharides (e.g., a glycoprotein with < 5% glycosylation), monosaccharide separation and detection can be compromised by coelution of amino acids or peptides from hydrolyzed proteins. In this work, we focused on analysis of glycoproteins with low percentages of glycosylation (MAbs) and the use of sample pretreatment and internal standards to improve monosaccharide quantification. To analyze potential interference of amino acids in monosaccharide analysis, each of the 20 amino acids (10 µg each, injected separately) was subjected to the chromatography conditions used for separating, detecting, and quantifying monosaccharides. In addition to PAD detection, we monitored UV absorbance at 215 nm after the electrochemical detector to verify amino acid electrochemical detection. Ten amino acids (R, K, Q, V, N, A, I, L, T and C) eluted between 2 and 25 min and were both PAD and UV active. Of these ten, two amino acids could potentially interfere with monosaccharide analysis. Glutamine was found to elute as a shoulder on mannose. However, acid hydrolysis conditions used to release monosaccharides from glycoproteins likely would oxidize glutamine. Lysine was found to elute very close to rhamnose, a monosaccharide used here as an internal standard (18). The other eight amino acids eluted either before fucose (< 5 min) or after mannose (> 20 min).
The remaining ten amino acids eluted while washing the column (25 to 35 min) or remained bound to the column.
HPAEC-PAD Analysis of Protein Glycosylation
We evaluated peptide interference by using UV detection after PAD detection to identify interfering peptides/amino acids in the MAb hydrolysates. The results (Fig. 1) show that at the levels of hydrolysate used to give quantifiable monosaccharide responses (16.7 µg injected), there is little UV response in the region of the chromatogram where monosaccharides elute. UV detectable peptides/amino acids were found to elute near the column void and after 20 min. OnGuard-A resin (50 mg), a microporous strong anion exchanger in the bicarbonate form, was used to remove peptides/amino acids (20). Under the conditions used, we determined that monosaccharides do not bind to OnGuard-A resin (data not shown). Comparison of Fig. 1A with 1B and of Fig. 1C with 1D reveals the results of sample pretreatment with the anion exchanger. Clearly, the sample pretreatment removed UV and PAD active components that eluted after 20 minutes. Additionally, the monosaccharide peaks appeared to be "cleaned up" after this treatment, as evidenced by a more Gaussian shape. OnGuard-A treatment will remove glutamine but not lysine (data not shown). Monosaccharide composition analysis of the intact MAb revealed monosaccharide ratios consistent with the presence of lactosamine type, fucosylated biantennary oligosaccharides with less than complete galactosylation (gal : man is < 2 : 3) (Table 1). The absence of galactosamine indicates the absence of O-linked glycosylation. HPAEC-PAD monosaccharide analysis of hydrolyzed heavy and light chain bands verified predominantly heavy chain glycosylation (18,21). Comparison of monosaccharide compositions of PNGase F treated heavy chain bands with corresponding untreated heavy chain bands revealed essentially complete deglycosylation (>90%) by PNGase F (18). We determined MY9-6 monosaccharides with two amounts of injected protein (4.16 and 16.7 µg) with similar results. This analysis shows that with higher amounts of injected glycoprotein hydrolysate, there is a greater need for internal standard correction due to electrode poisoning (20). This poisoning is presumed to be primarily due to the amino acids and peptides not removed by the OnGuard-A resin.

B. Oligosaccharide Mapping
Glycosylation of the MY9-6 preparation was investigated by HPAEC-PAD oligosaccharide mapping after release of the N-linked structures by PNGase F. Oligosaccharide peak 1 (Fig. 2A) has a retention time identical to a fucosylated agalactosyl biantennary oligosaccharide standard (Table 2, structure 1). Peak 3 (Fig. 2A) has a retention time identical to a fucosylated, fully galactosylated biantennary oligosaccharide standard (Table 2, structure 3). Oligosaccharide peak 2 (Fig. 2A) has a retention time intermediate between peaks 1 and 3, perhaps suggestive of a fucosylated, monogalactosylated biantennary oligosaccharide structure (Table 2, structure 2). Alternatively, the retention time (15.3 min) of peak 2 is nearly identical to an agalactosyl biantennary oligosaccharide standard
Figure 1. Monosaccharide analysis of MAb MY9-6 (x-axis: time in min). CarboPac PA1 chromatography of 2M TFA hydrolysates (A & B) and 6M HCl hydrolysates (C & D). Chromatography of hydrolysates without prior OnGuard A sample pretreatment (A & C) or with prior OnGuard A sample pretreatment (B & D). Peaks are as follows: 1. fucose; 2. rhamnose; 3. mannosamine; 4. glucosamine; 5. galactose; 6. glucose; 7. mannose. Insets show UV absorbance monitored at 215 nm.

Table 1. Monosaccharide Analysis of Monoclonal Antibody MY9-6

                     Residues Monosaccharide/Mole MY9-6
Amount Hydrolyzed    GlcN**         GalN**    Fuc*           Gal*           Man*
4.16 µg              4.22 (3.76)    0 (0)     1.97 (0.99)    1.14 (1.05)    3.56 (3.30)
                     [3.54]                   [0.93]         [0.97]         [3.0]
16.7 µg              3.85 (3.23)    0 (0)     0.93 (0.55)    0.86 (0.50)    2.64 (1.55)
                     [4.37]                   [1.06]         [0.97]         [3.0]

*  = determined by 2M TFA, 4h, 100°C.
** = determined by 6M HCl, 4h, 100°C.
() = values prior to internal standard correction (rhamnose and mannosamine were the internal standards for the 2M TFA and 6M HCl hydrolysates, respectively).
[] = values normalized to man = 3.
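The bracketed values in Table 1 are obtained by scaling each row so that mannose equals 3, the number of mannose residues in a biantennary core. The sketch below illustrates that scaling with the internal-standard-corrected values of the 16.7 µg row as read from Table 1; the assignment of values to sugars reflects our reading of the table, and the function name is illustrative.

```python
def normalize_to_mannose(residues, man_target=3.0):
    """Scale residues/mole so that mannose equals `man_target`."""
    factor = man_target / residues["Man"]
    return {sugar: value * factor for sugar, value in residues.items()}


# Internal-standard-corrected residues per mole MY9-6, 16.7 µg hydrolyzed.
row_16_7 = {"GlcN": 3.85, "GalN": 0.0, "Fuc": 0.93, "Gal": 0.86, "Man": 2.64}

normalized = normalize_to_mannose(row_16_7)
for sugar, value in normalized.items():
    print(f"{sugar}: {value:.2f}")
```

The normalized galactose value of about 1 per 3 mannose is the basis for the statement that gal : man is less than 2 : 3, that is, that galactosylation is incomplete.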
(15.2 min; structure not shown). Thus peak 2 may contain either or both of the aforementioned structures. MAb MY9-6 also possessed charged oligosaccharides (peaks under 4, Fig. 2A). To further elucidate the identities of these oligosaccharide peaks, endoglycosidase treatment (Endo F2, Endo H) was used to classify the oligosaccharides. Exoglycosidase treatment (neuraminidase, β-galactosidase, β-N-acetylhexosaminidase) of the PNGase F released structures was then used to substantiate the preliminary identifications.
Endo F2 Treatment

Endo F2, a recently described endoglycosidase which cleaves predominantly biantennary oligosaccharides (23), was used to evaluate whether the released N-linked oligosaccharides from MAb MY9-6 were biantennary-type chains, as has been reported for mouse and human IgGs (24,25). More highly branched structures have also been reported (26). Endo F2 differs from the amidase PNGase F not only in its more restricted specificity but also in that it releases oligosaccharides with only half of their chitobiose core (PNGase F releases all types of N-linked oligosaccharides with their chitobiose core intact). A released N-linked oligosaccharide with a complete chitobiose elutes earlier in HPAEC-PAD than the identical oligosaccharide with half its chitobiose; thus biantennary oligosaccharides released by Endo F2 are expected to elute later than the identical oligosaccharide released by PNGase F (27). Also, if fucose is attached at the reducing end GlcNAc, Endo F2 treatment would leave the fucose bound to the reducing end GlcNAc still attached to the polypeptide. Because the presence of core fucosylation reduces retention times of oligosaccharides on HPAEC-PAD, Endo F2 release of oligosaccharides without the reducing end core fucosylated GlcNAc would result in a further increase in retention time for the product (when compared to the identical oligosaccharide released by PNGase F). Endo F2 digestion of agalactosylated biantennary structures either with or without core fucosylation would give an identical product. Thus if peak 2 is an agalactosylated biantennary structure (no core fucose), Endo F2 digestion of the three neutral oligosaccharides would result in two products, an agalactosylated and a digalactosylated biantennary oligosaccharide, both with half of their chitobiose. Alternatively, if peak 2 is a monogalactosylated, core fucosylated biantennary oligosaccharide, Endo F2 treatment of the three neutral oligosaccharides would result in three products: an agalactosylated, a monogalactosylated, and a digalactosylated biantennary oligosaccharide, each with half of its chitobiose. Comparing PNGase F vs. Endo F2 digestions of MAb MY9-6 revealed similar maps with three Endo F2 released oligosaccharide products, all with somewhat longer retention times when compared to the PNGase F released oligosaccharides (compare Fig. 2A and 2B). Ratios of peak areas of PNGase F neutral peaks 1, 2, and 3 were 49% : 35% : 10%. Similarly, ratios of peak areas (expressed as percentage of total oligosaccharide peak areas) of the Endo F2 peaks 1, 2, and 3 were 53% : 35% : 7%, respectively. These results confirm that the major N-linked oligosaccharides present in MAb MY9-6 were predominantly biantennary oligosaccharides. These results also support identification of peak 2 as a monogalactosylated biantennary oligosaccharide with core fucosylation.
Figure 2. Oligosaccharide mapping of MAb MY9-6 (x-axis: time in min). CarboPac PA100 chromatography of enzyme digests of MAb MY9-6. Panels: A, PNGase F; B, Endo F2; C, Endo H; D, Neuraminidase; E, β-Galactosidase; F, β-N-Acetylhexosaminidase. In Panels A, B, and C the substrate was 100 µg of MAb MY9-6. In Panels D, E, and F the substrate was a PNGase F digest derived from 100 µg of MAb MY9-6.

Table 2. Some Potential Oligosaccharide Structures for MAb MY9-6

1.  GlcNAc(β1,2)Man(α1,6)                    Fuc(α1,6)
                         \                       |
                          Man(β1,4)GlcNAc(β1,4)GlcNAc
                         /
    GlcNAc(β1,2)Man(α1,3)

2.  As structure 1, with Gal(β1,4) on the GlcNAc of one of the two antennae:

    GlcNAc(β1,2)Man(α1,6)                    Fuc(α1,6)
                         \                       |
                          Man(β1,4)GlcNAc(β1,4)GlcNAc
                         /
    GlcNAc(β1,2)Man(α1,3)

3.  Gal(β1,4)GlcNAc(β1,2)Man(α1,6)                    Fuc(α1,6)
                                  \                       |
                                   Man(β1,4)GlcNAc(β1,4)GlcNAc
                                  /
    Gal(β1,4)GlcNAc(β1,2)Man(α1,3)
Endo H Treatment

Endo H treatment of MAb MY9-6 was used to evaluate whether oligomannosidic or hybrid structures were present (22). Figure 2C shows that no oligomannosidic or hybrid structures were released from the monoclonal preparations by Endo H treatment. Ribonuclease B was used as a positive control (data not shown). Hence, this antibody did not contain high mannose or hybrid type oligosaccharides.
Neuraminidase (Arthrobacter ureafaciens) Treatment

Neuraminidase treatment was used to evaluate whether the oligosaccharides identified as peaks under #4 in Fig. 2, Panel A, in the PNGase F maps of the MY9-6 IgG preparation were modified with sialic acid. HPAEC-PAD was used to distinguish between the different forms of sialic acid (28). Treatment of half a PNGase F digest of MAb MY9-6 with neuraminidase resulted in the disappearance of the oligosaccharide peaks migrating at 40-42 min (compare Fig. 2A and 2D), confirming the presence of sialic acid. Concomitantly, a new single peak appeared at 43 min, with a retention time identical to that of N-glycolylneuraminic acid (Neu5Gc). The presence of Neu5Gc in mouse IgG has been reported (24). Additionally, upon neuraminidase treatment there was a 26% increase in the area of peak 2 and a 123% increase in the area of peak 3. These results suggest Neu5Gc sialylation of structures in peaks 2 and 3.

β-Galactosidase (Diplococcus pneumoniae) Treatment
β-Galactosidase treatment of the PNGase F treated IgG preparations was used to assess putative galactosylation differences between peak 1 (which we believe to be an agalactosyl fucosylated biantennary oligosaccharide; Table 2, structure 1), peak 2 (which we believe to be a monogalactosylated fucosylated biantennary oligosaccharide; Table 2, structure 2), and peak 3 (which we believe to be a fully galactosylated, fucosylated biantennary oligosaccharide; Table 2, structure 3). If the proposed identifications are correct, β-galactosidase treatment would be expected to convert peaks 2 and 3 to peak 1. Additionally, such a result would confirm a β1,4 linkage for galactose in the galactosylated oligosaccharides because the Diplococcus enzyme is specific for this linkage. Treatment of the PNGase F digest of MAb MY9-6 with the Diplococcus enzyme resulted in complete disappearance of peak 3 (Fig. 2E). The area of peak 2 was reduced by 86%, while peak 1 exhibited an increase in peak area. With β-galactosidase treatment, a new peak appears at 4 min, which is likely the released galactose (Fig. 2E). New peaks between 10 and 13 min were not expected but may represent the product of contaminating hexosaminidase activity in the β-galactosidase preparation (see below for expected products of a hexosaminidase digest).

β-N-Acetylhexosaminidase (Jack bean) Treatment
The expected products of a hexosaminidase digest of structure 1 would be FucMan3GlcNAc2 and the released monosaccharide, N-acetylglucosamine. Treatment of the PNGase F digest of MAb MY9-6 with the Jack bean enzyme (Fig. 2F) resulted in complete disappearance of peak 1. A new peak with a retention time identical to a FucMan3GlcNAc2 standard (between 9 and 10 min) was observed (Fig. 2D). Additionally, a peak seen at 5 min upon hexosaminidase treatment presumably represents the released GlcNAc (Fig. 2F).
The expected products of a hexosaminidase digest of structure 2 would be two isomers in which either the Man 1-3 or the Man 1-6 antenna is terminated with Gal-GlcNAc-Man (no terminal GlcNAc, and thus not susceptible to hexosaminidase), while the second antenna is newly terminated with Man after release of the exposed terminal GlcNAc by the hexosaminidase. These two isomers would likely have retention times shorter than structure 2 (Table 2). Because there are no commercially available standards corresponding to these structures, identification by chromatographic retention time alone would not be possible. Use of a mannosidase that could distinguish terminal mannose on the Man 1-3 or Man 1-6 antenna (29) could confirm the proposed structures. Hexosaminidase treatment of MAb MY9-6 reduced the area of peak 2 by 51% (Fig. 2F). Additionally, several new peaks appeared between 11 and 13 min (Fig. 2F). Susceptibility of peak 2 to digestion separately by β-galactosidase and hexosaminidase further confirms that this oligosaccharide is structure 2, monogalactosylated with the ungalactosylated antenna terminated with GlcNAc. The Jack bean enzyme would not be expected to digest the fully galactosylated structure 3 (Table 2), and peak 3, which we believe is structure 3 (Fig. 2F; between 16 and 17 min), was not reduced in peak area upon hexosaminidase treatment. In summary, HPAEC-PAD monosaccharide analysis and oligosaccharide mapping, when used in conjunction with oligosaccharide standards and endo- and exoglycosidases of well defined specificities, can be used to identify the oligosaccharide structures present in oligosaccharide maps of IgG preparations, as well as to monitor the lot-to-lot consistency of the production of therapeutic glycoproteins (30). HPAEC-PAD, which employs automated chromatography with rugged, high resolution pellicular anion exchange columns (stable from pH 0-14 and at pressures up to 3000 psi) and direct detection of carbohydrates by PAD (thus eliminating the need for derivatization) at picomole sensitivities, offers the advantages of convenience, durability, high resolution (separation of branch, linkage, and positional isomers [31]), and faster speed of analysis (flow rates of 1 ml/min vs. 30 µl/min) when compared to traditional gel filtration methods for separating and detecting carbohydrates.
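The logic of the exoglycosidase experiments can be summarized with a small model. In the sketch below, which is a simplification of our own and not part of the analytical method, each biantennary structure of Table 2 is represented only by its two antennae, listed from the outermost residue inward; the common fucosylated core is omitted, and an exoglycosidase is modelled as removing its target sugar only where that sugar is terminal.

```python
def digest(antennae, target):
    """Remove `target` from every antenna on which it is the terminal residue."""
    return tuple(
        arm[1:] if arm and arm[0] == target else arm
        for arm in antennae
    )


# Antennae of structures 1, 2, and 3 in Table 2 (outermost residue first).
STRUCTURE_1 = (("GlcNAc", "Man"), ("GlcNAc", "Man"))
STRUCTURE_2 = (("Gal", "GlcNAc", "Man"), ("GlcNAc", "Man"))
STRUCTURE_3 = (("Gal", "GlcNAc", "Man"), ("Gal", "GlcNAc", "Man"))

# β-Galactosidase converts structures 2 and 3 to structure 1.
print(digest(STRUCTURE_2, "Gal") == STRUCTURE_1)
print(digest(STRUCTURE_3, "Gal") == STRUCTURE_1)

# Hexosaminidase trims structure 1 fully, structure 2 on one antenna only,
# and leaves the fully galactosylated structure 3 untouched.
print(digest(STRUCTURE_1, "GlcNAc"))
print(digest(STRUCTURE_2, "GlcNAc"))
print(digest(STRUCTURE_3, "GlcNAc") == STRUCTURE_3)
```

The model reproduces the expected pattern: peaks 2 and 3 collapse into peak 1 with β-galactosidase, while hexosaminidase removes peak 1, partly alters peak 2, and leaves peak 3 unchanged.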
References

1. Kim, H.; Yamaguchi, Y.; Masuda, K.; Matsunaga, C.; Yamamoto, K.; Irimura, T.; Takahashi, N.; Kato, K.; and Arata, Y. J. Biol. Chem. 1994, 269, 12345-12350.
2. Nose, M.; and Wigzell, H. Proc. Natl. Acad. Sci. USA 1983, 80, 6632-6636.
3. Tao, M.H.; and Morrison, S.L. J. Immunol. 1989, 143, 2595-2601.
4. Sutton, B.J.; and Phillips, D.C. Biochem. Soc. Trans. 1983, 11, 130-132.
5. Sox, H.C.; and Hood, L. Proc. Natl. Acad. Sci. 1970, 66, 975-982.
6. Spiegelberg, H.; Abel, C.; Fishkin, B.; Grey, H. Biochemistry 1970, 9, 4217-4223.
7. Savvidou, G.; Klein, M.; Grey, A.A.; Dorrington, K.J.; Carver, J.P. Biochemistry 1984, 23, 3736-3740.
8. Taniguchi, T.; Mizuochi, T.; Beale, M.; Dwek, R.A.; Rademacher, T.W.; Kobata, A. Biochemistry 1985, 24, 5551-5557.
9. Arvieux, J.; Willis, A.C.; Williams, A.F. Mol. Immunol. 1986, 23, 983-990.
10. Wallick, S.C.; Kabat, E.A.; Morrison, S.L. J. Exp. Med. 1988, 168, 1099-1109.
11. Co, M.S.; Scheinberg, D.A.; Avdalovic, N.M.; McGraw, K.; Vasquez, M.; Caron, P.C.; and Queen, C. Mol. Immunol. 1993, 30, 1361-1367.
12. Tsai, C.M.; Zopf, D.A.; Yu, R.K.; Wistar, R.; Ginsburg, V. Proc. Natl. Acad. Sci. 1977, 74, 4591-4594.
13. Weber, R.J.; Clem, L.W. J. Immunol. 1981, 127, 300-305.
14. Lawson, E.Q.; Hedlund, B.E.; Ericson, M.E.; Mood, D.A.; Litman, G.W.; Middaugh, R. Arch. Biochem. Biophys. 1983, 220, 572-575.
15. Middaugh, C.R. and Litman, G.W. J. Biol. Chem. 1990, 262, 3671-3673.
16. Borrebaeck, C.A.K.; Malmborg, A.; and Ohlin, M. Immunol. Today 1993, 14, 477-479.
17. Hermentin, P.; Witzel, R.; Vliegenthart, J.F.G.; Kamerling, J.P.; Nimtz, M.; and Conradt, H.S. Anal. Biochem. 1992, 203, 281-289.
18. Weitzhandler, M.; Hardy, M.; Co, M.S.; and Avdalovic, N. J. Pharm. Sci. 1994, in press.
19. Townsend, R. Quantitative Monosaccharide Analysis of Glycoproteins Using HPLC, 1994, in Chromatography in Biotechnology, editors Horvath, C. and Ettre, L.S. ACS, Washington DC, ACS Symposium Series 529.
20. Rohrer, J.S.; Weitzhandler, M.; and Avdalovic, N.A. Glycobiology 1994, 4, 91.
21. Weitzhandler, M.; Kadlecek, D.; Avdalovic, N.; Forte, J.G.; Townsend, R.R. J. Biol. Chem. 1993, 268, 5121-5130.
22. Tandai, M.; Endo, T.; Sasaki, S.; Masuho, Y.; Kochibe, N.; and Kobata, A. Arch. Biochem. Biophys. 1991, 291, 339-348.
23. Tarentino, A.L.; Quinones, G.; Schrader, W.P.; Changchien, L.; and Plummer, T.H. J. Biol. Chem. 1992, 267, 3868-3872.
24. Kobata, A. Glycobiology 1990, 1, 5-8.
25. Mizuochi, T.; Hamako, J.; and Titani, K. Arch. Biochem. Biophys. 1987, 257, 387-394.
26. Krotkiewski, H.; Gronberg, G.; Krotkiewska, B.; Nilsson, B.; and Svensson, S. J. Biol. Chem. 1990, 265, 20195-20201.
27. Basa, L.J. and Spellman, M.W. J. Chromatography 1990, 499, 205-222.
28. Anderson, D.; Goochee, C.; Cooper, G.; and Weitzhandler, M. Glycobiology 1994, in press.
29. Amano, J. and Kobata, A. J. Biochem. 1986, 99, 1645-1654.
30. Bhat, U.R. and Helgeson, E.A. Am. Biotech. Lab. January 1994, 16.
31. Hardy, M.R. and Townsend, R.R. Proc. Natl. Acad. Sci. 1988, 85, 3289-3293.
Carbohydrate Structure Characterization of Two Soluble Forms of a Ligand for the ECK Receptor Tyrosine Kinase
Christi L. Clogston, Patricia L. Derby, Robert Toso, James D. Skxine, Ming Zhang, Vann Parker, G. Michael Fox, Timothy D. Hartley, and Hsieng S. Lu
Amgen, Amgen Center, Thousand Oaks, CA 91320-1789
I. INTRODUCTION Human B61 was originally found as the product of an immediate-early response gene induced by treatment of human umbilical vein epithelial cells with TOT-a (1) and subsequently identified as a ligand for the receptor tyrosine kinase ECK (2). The gene encodes a 187 amino acid polypeptide chain with a hydrophobic C-terminus presumably connected to a membrane-bound GPI anchor. Cells transfected with the gene coding for all 187 amino acids secreted two main soluble forms with differing C-terminal lengths (150 and 159 amino acids) and a membrane-bound form (4,5). After a B61^^^ (the identical 150 amino acids in one of the above described soluble forms) gene construct was transfected into CHO cells, two soluble forms are observed by SDS-PAGE, with estimated Mr of 22,000 and 24,000. The proteins terminate at Ala^^o with a theoretical Mr of 17,788. The two forms of r-HuB61i^^ are glycosylated at Asn^ and Thr^^^, which accounts for the larger estimated molecular weights. After isolation and identification of the peptides containing these glycosylation sites, high pH anion exchange chromatography, matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry, and electrospray ionization mass spectrometry (ES-MS) were utilized to evaluate carbohydrate microheterogeneity. II. METHODS Expression of recombinant human B61^^^ in CHO cells was achieved using the expression vector pDSVRa2 (3) carrying a B61 gene encoding the leader sequence plus the coding sequence of 150 amino acids. Purification of recombinant B61 derived from CHO cells was performed as described (6). N-terminal sequence determination of the intact protein and isolated peptides was performed by automated Edman degradation using a Hewlett Packard G1005A protein sequencer (7,8). Automated C-terminal sequence analysis was done in collaboration with Perkin Elmer-Applied Biosystems Division (9-12). 
Reversed-phase HPLC of the isolated r-HuB61¹⁵⁰ was performed with a SynChrom C4 widepore column (300 Å, 4.6 x 250 mm) using a Hewlett Packard 1090M LC system equipped with a diode array detector and Chemstation. Peptide mapping was performed by digesting aliquots of B61 (154 µg of the 22KD form and 237 µg of the 24KD form in 100-200 µL) in 0.1 M CHAPS/PBS (pH 7.2) with endoproteinase Asp-N as described (4). Peptides
were collected manually, dried in a Speed Vac, and stored at -IQfiC until further analysis. Just prior to N-terminal sequence analysis or MALDI-TOF, the dried peptide fractions were reconstituted in HPLC grade H2O (Burdick & Jackson). Glycosidase digestion of isolated N-linked glycosylated peptides by neuraminidase and N-glycanase was performed as described (13,14). High pH anion exchange chromatography of N-glycanase cleaved oligosaccharides (13) and neuraminidase digested asialo oligosaccharides was performed as described (14).
MALDI-TOF was done on a Kratos Kompact MALDI III mass spectrometer fitted with a standard 337 nm nitrogen laser and operated in the linear mode at an accelerating voltage of 20 kV. The matrix used was α-cyano-4-hydroxycinnamic acid (33 mM in acetonitrile/methanol, premade from BRS) at a ratio of 1:1 with purified peptide samples. MALDI sample slides were loaded with 0.5-1.0 µL of matrix/sample mixture (estimated 1-10 pmol peptide). The data were reprocessed using the Kratos software provided with the instrument. Theoretical masses were determined by utilizing a spreadsheet in which individual peptide masses were added to all possible carbohydrate forms; these masses were then compared to the observed masses to identify structures consistent with the mass results obtained.
Molecular masses of the purified peptides were determined by a Finnigan SSQ710C or a Sciex API III electrospray mass spectrometer operating in single quadrupole mode. The samples were dissolved in a mixture of H2O/MeOH/formic acid (50:50:3 volume ratio) and introduced by flow injection into the same solvent stream at 30 µL/min. For normal molecular weight measurements an orifice potential of 70 V was used. In the Sciex instrument this orifice potential affects the desolvation and also determines the extent of collisional activation. The analysis of glycopeptides was achieved by stepped orifice scans (15).
In this mode a higher orifice potential (120 V to 140 V) was used while scanning the low mass region (m/z = 150-440) and a lower orifice potential (60 to 100 V ramp) was used while scanning the high mass region (m/z = 550-2400). SDS-polyacrylamide gel electrophoresis was carried out under reducing conditions according to Laemmli (16).
III. RESULTS AND DISCUSSION
A. N-terminal sequence analysis of r-HuB61¹⁵⁰ forms
N-terminal sequencing of both purified B61 molecular weight forms gave the predicted N-terminal sequence D-R-H-T-V-F-W-(X)-S-S-N-P-K-F-R-N-E-, etc., where (X) is predicted by the gene sequence to be N. To determine the C-terminus of the secreted polypeptide, the C-terminal peptides from both endoproteinase Asp-N digestions were isolated and analyzed. Figure 1 shows the reversed-phase HPLC separation of peptides in the endoproteinase Asp-N digest. Several peptides at 58-64 minutes were found to have the disulfide-linked sequences D-R-(*)-L-R-L-K-V-T-V-S-G-K-I-(Z)-H-S-P-Q-A-H-V-N-P-Q-E-K-R-L-A-A and D-A-A-M-E-Q-Y-I-L-Y-L-V-E-H-E-E-Y-Q-L-(*)-Q-P-Q-S-K, where (*), (X), or (Z) denotes an unassigned residue. These sequences correspond to positions 120-150 and 43-67 of human B61¹⁵⁰. The human B61 gene encodes Thr at the position indicated as (Z), and the lack of an assignment here could reflect the presence of O-linked carbohydrate. The human B61 gene encodes Cys at the positions indicated as (*).
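The blank at cycle (X) can be cross-checked against the sequence itself. The following is a minimal sketch, assuming the usual N-glycosylation sequon Asn-Xaa-Ser/Thr with Pro excluded at Xaa (a common convention, not something stated in this paper); the function name `find_sequons` is hypothetical.

```python
import re

# Sequon Asn-Xaa-Ser/Thr. Excluding Pro at Xaa is an assumption (common
# convention), not a rule taken from the text above.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues that begin a sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence)]


# N-terminal residues reported above, with (X) written as the gene-predicted Asn.
n_terminal = "DRHTVFWNSSNPKFRNE"
print(find_sequons(n_terminal))
```

Under these assumptions only the Asn at the unassigned position is flagged, which is consistent with a glycosylated residue giving no PTH signal at that cycle.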
B. C-terminal sequence analysis of r-HuB61¹⁵⁰ forms
C-terminal sequencing of both purified B61 forms produced the predicted C-terminal sequence -L-A-A-COOH, indicating that both forms did terminate at Ala¹⁵⁰ and that the apparent molecular weight difference was probably due to differences in glycosylation.

Figure 1. Endoproteinase Asp-N peptide mapping of r-HuB61 24KD (chromatogram A) and 22KD (chromatogram B) forms.
Figure 2. SDS-PAGE of r-HuB61¹⁵⁰ forms. Lanes 1,2: E. coli control, N- and O-glycanase treated and untreated. Lane 3: N- and O-glycanase treated. Lane 4: N-glycanase treated. Lane 5: untreated. Lane 6: r-HuB61 treated with N- and O-glycanase. Lane 7: r-HuB61 untreated.
Figure 3. MALDI-TOF mass spectrometric analysis of Asp-N peptide 19.4 from 24KD r-HuB61¹⁵⁰.

C. De-glycosylation and molecular weight of r-HuB61
When the CHO-cell secreted, purified r-HuB61¹⁵⁰ was subjected to SDS-polyacrylamide gel electrophoresis under non-reducing conditions, two bands with Mr ~22,000 and Mr ~24,000 were detected (Fig. 2, lane 5), the former being more prominent. After treatment with neuraminidase, O-glycanase and N-glycanase to remove O-linked and N-linked carbohydrates, the r-HuB61¹⁵⁰ migrated with an apparent Mr ~18,000 (Fig. 2, lanes 3,4). Lanes 1 and 2 contain a bacterially expressed form of the molecule, which is not glycosylated.
D. N-linked glycopeptides of r-HuB61¹⁵⁰ 24KD form
Table 1 summarizes the carbohydrate structural data deduced from the three detection methodologies, MALDI-TOF, ES-MS, and high pH anion exchange chromatography. The N-glycosylation site proved to be occupied with several structures. The glycopeptides were partially separated as five peptides, at retention times of 19.4, 20.3, 20.7, 20.9, and 21.3 minutes, and all of these glycopeptides contained the same amino acid sequence, N-terminal amino acids 1-17. By mass spectrometric analysis, Asp-N peptide 19.4 (Figure 3) contained primarily one high mannose structure containing 5 mannose units. Asp-N peptide 20.3 produced mass spectra (Figures 4,5) that are consistent with high mannose structures with 4, 5, and 8 mannose units.

Figure 4. ES-MS analysis of Asp-N 20.3 peptide from 24KD r-HuB61¹⁵⁰.
Figure 5. MALDI-TOF mass spectrometric analysis of Asp-N 20.3 peptide from 24KD r-HuB61¹⁵⁰.
Hybrid oligosaccharides having zero and one sialic acid were also present. Biantennary structures having one sialic acid, both with and without fucose, and a triantennary structure were detected as well. Asp-N peptide 20.7 contained many of the same structures (Figures 6,7) as in peptide 20.3 with the addition of a biantennary with no sialic acid and a hybrid structure. Asp-N peptide 20.9 (Figure 8) contained many of the same structures in the above peptides with the addition of a biantennary structure with two sialic acids. Many of the same structures were detected in most of the peptides due to the poor chromatographic resolution of these peptides in the peptide map.
Figure 6. ES-MS analysis of Asp-N 20.7 peptide from 24KD r-HuB61¹⁵⁰.
Figure 7. MALDI-TOF mass spectrometric analysis of Asp-N 20.7 peptide from 24KD r-HuB61¹⁵⁰.
E. N-linked glycopeptides of r-HuB61¹⁵⁰ 22KD form
Compared to the 24KD form, the 22KD form lacks most of the glycopeptides detected. The only peptide found to be N-glycosylated was Asp-N 20.3. By high pH anion exchange chromatography this peptide contained essentially the same carbohydrate structures as its 24KD analog, but it appeared to be less heterogeneous by mass spectrometry (Figure 9). The other significant difference found in this form is contained in Asp-N peptide 63.5, a large peak containing the C-terminal disulfide-linked peptides. The expected peptide mass for this complex is 6479.3 while the observed mass was 6468.1, indicating that the majority of the 22KD material is not O-glycosylated.
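The reasoning behind the last sentence can be made explicit with a short sketch. It assumes a hypothetical list of mucin-type O-glycan candidates and an illustrative relative mass tolerance of 0.3% for linear-mode MALDI-TOF; neither the candidate list nor the tolerance is taken from the paper.

```python
# Average residue masses (Da) of common O-glycan building blocks.
HEXNAC, HEX, NEUAC = 203.1925, 162.1406, 291.2546

# Hypothetical candidates; the glycans actually present are not specified here.
CANDIDATES = {
    "unglycosylated": 0.0,
    "HexNAc": HEXNAC,
    "HexNAc-Hex": HEXNAC + HEX,
    "HexNAc-Hex + 1 NeuAc": HEXNAC + HEX + NEUAC,
    "HexNAc-Hex + 2 NeuAc": HEXNAC + HEX + 2 * NEUAC,
}


def consistent_forms(expected_mass, observed_mass, rel_tol=0.003):
    """Return candidate names whose predicted mass is within rel_tol of observed.

    rel_tol is an assumed instrument tolerance, not a value from the paper.
    """
    matches = []
    for name, added_mass in CANDIDATES.items():
        predicted = expected_mass + added_mass
        if abs(observed_mass - predicted) <= rel_tol * predicted:
            matches.append(name)
    return matches


# Expected (unglycosylated) and observed masses of Asp-N peptide 63.5, 22KD form.
print(consistent_forms(6479.3, 6468.1))
```

With these inputs only the unglycosylated candidate falls inside the tolerance, because even a single HexNAc would shift the mass by about 203.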
Figure 8. MALDI-TOF mass spectrometric analysis of Asp-N peptide 20.9 from 24KD r-HuB61¹⁵⁰.
Figure 9. ES-MS analysis of Asp-N peptide 20.3 from 22KD r-HuB61¹⁵⁰.
Table 1. Summary of carbohydrate analysis results

CHO r-HuB61 24KD glycoform
Peptide RT  Peptide mass  MH+ Obs.  MH+ Calc.*  Carbohydrate type
19.4        2134.0        3348.3    3351.1      high mannose
20.3        2134.0        3189.7    3189.0      high mannose
                          3353.8    3351.1      high mannose
                          3547.5    3554.3      hybrid
                          3845.7    3837.5      high mannose
                          4013.7    4007.7      hybrid + SA
                          4049.3    4048.7      biantennary + SA
                          4141.5    4123.8      triantennary
                          4194.0    4194.9      biantennary + SA + fucose
                          ND        4268.9      triantennary + fucose
                          ND        4925.5      tetraantennary + fucose
20.7        2134.0        3186.2    3189.0      high mannose
                          3348.3    3351.1      high mannose
                          3549.9    3554.3      hybrid
                          3717.4    3716.4      hybrid
                          3762.2    3758.5      biantennary
                          3838.6    3837.5      high mannose
                          4008.7    4007.7      hybrid + SA
                          4044.2    4048.7      biantennary + SA
                          4141.5    4123.8      triantennary
                          4194.5    4194.9      biantennary + SA + fucose
20.9        2134.0        3754.1    3758.5      biantennary
                          3896.7    3904.6      biantennary + fucose
                          4048.7    4049.7      biantennary + SA
                          ND        4123.8      triantennary
                          4340.6    4341.0      biantennary + 2 SA

CHO r-HuB61 22KD glycoform
Peptide RT  Peptide mass  MH+ Obs.  MH+ Calc.*  Carbohydrate type
20.3        2134.0        3189.3    3189.0      high mannose
                          3351.0    3351.1      high mannose
                          3513.1    3514.2      high mannose
                          4045.5    4048.7      biantennary + SA
                          ND        3903.6      biantennary + fucose
                          ND        4268.9      triantennary + fucose
                          ND        4634.2      tetraantennary + fucose

* Calc. MH+ based on ES-MS analyses. NA = not analyzed. ND = not detected.
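The spreadsheet approach described in Methods (peptide mass plus every candidate carbohydrate form, compared with the observed mass) can be sketched as follows. This is an illustration only: the residue masses are standard average values, while the composition assigned to each named structure, the 5 Da matching tolerance, and the function names are assumptions rather than details from the paper.

```python
# Average monosaccharide residue masses in Da (free sugar minus water).
RESIDUE_MASS = {"Hex": 162.1406, "HexNAc": 203.1925, "dHex": 146.1412, "NeuAc": 291.2546}

PEPTIDE_MASS = 2134.0  # peptide mass listed in Table 1

# Assumed compositions: high mannose taken as Man(n)GlcNAc2, n = 4..9.
GLYCANS = {f"Man{n}GlcNAc2": {"Hex": n, "HexNAc": 2} for n in range(4, 10)}
GLYCANS["biantennary + SA"] = {"Hex": 5, "HexNAc": 4, "NeuAc": 1}
GLYCANS["biantennary + SA + fucose"] = {"Hex": 5, "HexNAc": 4, "NeuAc": 1, "dHex": 1}


def glycopeptide_mass(peptide_mass, composition):
    """Add the residue mass of every sugar in the composition to the peptide mass."""
    return peptide_mass + sum(RESIDUE_MASS[sugar] * n for sugar, n in composition.items())


def match_observed(observed, tolerance=5.0):
    """Return (name, calculated mass) pairs within `tolerance` Da of `observed`."""
    hits = []
    for name, composition in GLYCANS.items():
        calculated = glycopeptide_mass(PEPTIDE_MASS, composition)
        if abs(observed - calculated) <= tolerance:
            hits.append((name, round(calculated, 1)))
    return hits


print(match_observed(3348.3))  # observed value for Asp-N peptide 19.4
print(match_observed(4194.0))  # observed value for Asp-N peptide 20.3
```

With these assumed compositions, the sums for the five-mannose and eight-mannose structures agree with the corresponding calculated entries in Table 1 to within rounding.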
IV. CONCLUSIONS
Characterization of glycoproteins requires application of several chromatographic and mass spectrometric analytical techniques. High pH anion exchange chromatography of oligosaccharides and asialo oligosaccharides confirmed potential structures elucidated by the two mass spectrometric methods. MALDI-TOF mass spectrometry generally gave good results for the glycopeptides at sub-picomole to low-picomole levels. It was possible to analyze nearly all of the peptides using this technique. The purified glycopeptides were also analyzed by ES-MS, which in several cases produced ion populations different from those observed in the MALDI-TOF spectra. For instance (see Table 1), ES-MS did not detect the triantennary structure that was observed by MALDI-TOF and high pH anion exchange. Based on the mass spectrometric and high pH anion exchange chromatographic results, the major difference between the two r-HuB61¹⁵⁰ forms secreted from CHO cells is in the type of N-linked glycosylation at Asn⁸. The 22KD form produced only one peptide that contained N-linked glycosylation, while the 24KD form produced several peptides containing many diverse carbohydrate structures. The N-site of the 22KD r-HuB61¹⁵⁰ form is occupied primarily with high mannose type structures. The N-site of the 24KD r-HuB61¹⁵⁰ form is occupied with high mannose, hybrid and complex type structures, with some structures containing fucose as well. The O-site, Thr¹³⁴, is occupied by mucin-type glycosylation, with the 22KD form being very little glycosylated at this site, while the 24KD form is fully occupied. The partial glycosylation pattern observed in this study is also common to other recombinant glycoproteins produced in CHO cells, such as stem cell factor (17).
ACKNOWLEDGMENTS
The authors would like to thank Vish Katta for the electrospray mass spectrometric analyses and Michael F. Rohde for helpful discussions.
REFERENCES
1. Holzman, et al. (1990) Mol. Cell. Biol. 10:5830-5838.
2. Bartley, et al. (1994) Nature 368:558-560.
3. DeClerck, Y.A., et al. (1991) J. Biol. Chem. 266:3893-3899.
4. Clogston, C.L., et al. (1993) presented at the 7th Symposium of the Protein Society.
5. Merewether, L.A., et al. (1993) presentation at the 13th International Symposium for the HPLC of Proteins, Peptides, and Polynucleotides, San Francisco, CA.
6. Toso, R., et al., manuscript in preparation.
7. Bente, H.B., et al. (1990) presented at the 4th Symposium of the Protein Society.
8. Miller, C.G., et al. (1990) presented at the 4th Symposium of the Protein Society.
9. Boyd, V.M., et al. (1992) Anal. Biochem. 206:344-352.
10. Boyd, V.M., et al. (1992) presented at the 6th Symposium of the Protein Society.
11. Guga, P.J., et al. (1993) presented at the 7th Symposium of the Protein Society.
12. Bozzini, M., et al. (1993) presented at the 7th Symposium of the Protein Society.
13. Derby, P.L., et al. (1993) p. 161-168 in Techniques in Protein Chemistry IV, Ruth Hogue Angeletti, ed.
14. Derby, P.L., et al. (1994) p. 89-96 in Techniques in Protein Chemistry V, John W. Crabb, ed.
15. Carr, S.A., et al. (1993) Protein Science 2:183-196.
16. Laemmli, U.K. (1970) Nature 227:680-685.
17. Lu, H.S., et al. (1992) Arch. Biochem. Biophys. 298:150-158.
CHARACTERISATION OF INDIVIDUAL N- AND O- LINKED GLYCOSYLATION SITES USING EDMAN DEGRADATION A. A. Gooley, N.H. Packer, A. Pisano, J.W. Redmond, and K.L. Williams Macquarie University Centre for Analytical Biotechnology, Macquarie University, Sydney, NSW, 2109, Australia A. Jones, M. Loughnan, and P.P. Alewood The Centre for Drug Design and Development, University of Queensland, St. Lucia, Brisbane, QLD, 4072, Australia
I. INTRODUCTION
Many glycosylation sites have been identified by adsorption-phase Edman degradation, although most have been inferred by the absence of the glycosylated phenylthiohydantoin (PTH)-amino acid in the amino acid analyser [1]. This is due to the configuration of most manual and automated Edman degradation sequenators [2-4], where the glycosylated anilinothiazolinone (ATZ)-amino acid is not transferred in the solvent ethyl acetate. Generally, the cDNA sequence is known for the protein being sequenced and N-linked glycosylation is identified by the occurrence of the motif Asn-Xaa-Ser/Thr, whereas O-glycosylation is assumed if the sequence predicts Ser or Thr for the blank cycle observed during adsorption-phase Edman degradation [1]. The problem with this type of analysis is that medium to low levels of glycosylation will not be detected, since even minor positive detection of PTH-Ser/Thr is acceptable as a positive identification. We have developed simple techniques for determining the sites of N- and O-glycosylation as part of Edman degradation in the automated solid-phase microsequencing of glycoproteins [5-7]. Sites of O- and N-linked glycosylation can be positively identified during solid-phase Edman degradation if polar solvents, such as anhydrous TFA, are used to extract the cleaved amino acid from the reaction cartridge. Unlike adsorption-phase Edman degradation, the use of polar solvents allows the extraction of the
polar modified amino acid derivatives, such as glycosylated amino acids. Hence, solid-phase Edman degradation provides the ideal chemical method for the purification of individual glycoforms, as it is not limited by the problems associated with endoglycosidase specificity. Since Edman degradation results in the sequential removal of amino acids from the N-terminus, we have recently shown that it is possible to collect glycoamino acids released during Edman degradation and subject them to monosaccharide and ionspray mass spectrometry analysis [8].
II. MATERIALS AND METHODS
A. Glycoproteins
We have analysed several different glycoproteins, including those available as commercial preparations: human Glycophorin A (Sigma G 9266) [6] and bovine κ-casein macroglycopeptide (Sigma C 7278) are both inexpensive sources of O-glycosylated domains, and the trypsin inhibitor ovomucoid (Sigma) is a source of N-linked oligosaccharides. Both Glycophorin A and ovomucoid contain glycosylated amino acids in the first ten amino acids from the N-terminus. Other sources of glycoprotein include human Casebrook serum albumin (Asp494→Asn, which creates an N-linked glycosylation site) [8] and the Dictyostelium recombinant glycoprotein PsA, which contains O-linked GlcNAc [5,7].
B. Covalent attachment of glycoproteins to Sequelon-DITC and -AA
Glycopeptides generated by endoproteinase Lys-C digests are optimal for solid-phase sequencing as they are coupled to Sequelon-DITC™ via their C-terminal ε-amino group. However, not all glycoproteins will contain a convenient Lys for this strategy. A more generic approach is to attach peptides via the α-carboxyl group to Sequelon-AA™, an arylamine activated PVDF membrane disk, using water soluble carbodiimide [9]. The covalent attachment of the glycopeptides is carried out at 4°C to increase coupling yield [10].
One precaution that is necessary with Sequelon-AA™ immobilisation is to desialylate glycopeptides, because in addition to the amino acid side chain and C-terminal carboxyls, the terminal sialic acid carboxyl groups of the oligosaccharide also form amide bonds with the arylamine membrane, hence immobilising the ATZ-sialylamino acid [8]. Glycopeptides/proteins are desialylated in 200 µL of 0.1 M trifluoroacetic acid (TFA) and incubated at 80°C for 40 min. Most of the TFA is removed in the vacuum centrifuge but it is not necessary to remove all traces before covalent immobilisation. For glycosylation site identification as little as 50-100 pmol of glycoprotein can be used for covalent attachment. However, for analysis of the oligosaccharide attached to individual glycoamino acids approximately 1-2
nmol of glycoprotein/glycopeptide was found necessary to obtain 400-800 pmol of PTH-glycoamino acid.
C. Solid-phase Edman degradation
The covalently bound glycopeptides were subjected to automated solid-phase Edman degradation on the MilliGen ProSequencer™ 6600 using the standard program supplied by the manufacturer. The PTH-derivatives of amino acids and glycoamino acids were transferred directly from the conversion flask to a Waters 600 LC system equipped with a 490E multiwavelength detector. PTH-amino acids and PTH-glycoamino acids were separated on a Waters SequeTag™ (4 µm, 3.9 mm x 300 mm) C18 reversed phase column using the manufacturer's recommended gradient program. A 2 mM formic acid buffer system (which is essentially glucose-free) was used in place of the 35 mM ammonium acetate pH 3.8 buffer system recommended by the manufacturer.
D. Analysis of the monosaccharide composition of PTH-glycoamino acids
The PTH-glycoamino acids collected from the MilliGen ProSequencer (approximately 400 pmol) were hydrolysed in 2 M TFA at 100 °C for 4 h. After vacuum evaporation of the acid, the liberated monosaccharides were analysed by High Performance Anion Exchange Chromatography (HPAEC) using a CarboPac PA1™ column (4 mm x 250 mm, Dionex Corp., USA) on a Waters 625 LC system and Waters 464 pulsed amperometric electrochemical detector.
E. Electrospray Ionisation Mass Spectrometry of PTH-glycoamino acids
Mass spectra were acquired on a Perkin Elmer/Sciex API III triple quadrupole mass spectrometer (PE/Sciex, Ontario, Canada), equipped with an ionspray atmospheric pressure ionization source.
Samples of PTH-Asn(Sac) (200 pmol in 50 µL) were flow injected into a moving solvent [10 µL min⁻¹; 50% (v/v) acetonitrile, 0.5% (v/v) TFA], whereas PTH-Thr(Sac) was analysed by liquid chromatography mass spectrometry (LCMS) on an Aquapore RP-300 C8 column (7 µm, 100 x 2.1 mm) using a 0.2% (v/v) formic acid buffer system and an acetonitrile gradient. The flow injection and LC were coupled directly to the ionization source via a fused silica capillary interface (50 µm i.d. x 50 cm length). Sample droplets were ionized at a positive potential of 5 kV and entered the analyser through an interface plate and subsequently through an orifice (100-120 µm diameter) at a potential of 80 V (a sufficient potential to induce a limited amount of dissociation within the molecule). Full scan spectra were acquired over the mass range 400 to 2200 daltons with a scan step size of 0.1 dalton.
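As a rough guide to where a PTH-glycoamino acid of a given mass should appear within this scan window, the m/z of its protonated ions can be estimated. The sketch below is illustrative: the function names and the limit of two charges are assumptions, and only simple protonation is considered.

```python
PROTON = 1.00728  # mass of a proton in Da


def mz(mass, charge):
    """m/z of a molecule of the given neutral mass carrying `charge` protons."""
    return (mass + charge * PROTON) / charge


def ions_in_scan(mass, low=400.0, high=2200.0, max_charge=2):
    """Map charge state -> m/z (one decimal) for ions inside the scan window.

    The window defaults to the 400-2200 range stated above; max_charge=2 is
    an assumption made for illustration.
    """
    return {
        z: round(mz(mass, z), 1)
        for z in range(1, max_charge + 1)
        if low <= mz(mass, z) <= high
    }


print(ions_in_scan(601.5))   # mass reported in the Results for PTH-Thr(Sac)
print(ions_in_scan(1872.9))  # mass reported in the Results for PTH-Asn(Sac)
```

For the smaller glycoamino acid only the singly protonated ion falls inside the window, whereas the larger one could in principle be seen in both charge states.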
III. RESULTS
Figure 1 represents the current strategy we use to characterise PTH-glycoamino acids. All of the glycopeptides shown in this figure were desialylated and immobilised onto Sequelon-AA™ PVDF membranes at 4 °C [10]. Desialylation of the glycopeptide was necessary as we have found that the terminal sialic acid carboxyl groups of the oligosaccharide can also form an amide bond with the arylamine membrane, hence immobilising the ATZ-glycoamino acid on the PVDF disk [8]. The buffer conditions for PTH-glycoamino acid chromatography were modified for two reasons: i) preliminary compositional analysis on the recovered PTH-glycoamino acid resulted in a high glucose contamination, which was traced to the ammonium acetate buffer recommended by Millipore for PTH-amino acid analysis, and ii) to provide an improved separation of the PTH-glycoamino acids from their parent amino acids PTH-Asn/Thr/Ser in a separate "glycoamino acid space". An alternative buffer system using triethylamine phosphate has recently been described by Strydom [11], which also provides an early chromatographic space for the elution of hydrophilic post-translationally modified PTH-amino acids. However, we have not yet established whether this buffer system has a low glucose content compatible with monosaccharide compositional analysis.
A. The oligosaccharide is degraded slowly during repeated cycles of Edman degradation
After 10 cycles of Edman degradation of the Casebrook albumin tryptic peptide Arg485-Lys500, there is chromatographic evidence for heterogeneity. This is indicative of some degradation of the oligosaccharide, with an increase in the yield of PTH-Asn(Sac) II after 10 cycles (14% of total yield) compared to PTH-Asn(Sac) II (5% of total yield) after 2 cycles of the Casebrook albumin V8 peptide Val493-Glu495 (Fig. 1 Aii and iii).
However, the combined yields of the major (I) and minor (II) peaks for PTH-Asn(Sac) from cycle 2 (V8 glycopeptide) and cycle 10 (tryptic glycopeptide) were essentially identical. The major PTH-Asn(Sac) peaks from the Casebrook albumin glycopeptides (Fig. 1A) and PTH-Thr(Sac) peaks of the κ-casein peptide Val139-Thr145 (Fig. 1B) were collected and subjected to compositional analysis by HPAEC-PAD (Fig. 1C and D respectively). The observed compositions were consistent with the presence of a complex biantennary oligosaccharide for PTH-Asn(Sac), GlcNAc4:Man3:Gal2, and the disaccharide GalNAc:Gal for PTH-Thr(Sac) (see Table I). Hence, much of the desialylated oligosaccharide structure remains intact on the glycosylated Asn and Thr during as many as 10 cycles of Edman degradation. The pattern of glycosylated PTH-Thr(Sac) peaks (Fig. 1Bii) is identical to that observed in the rat CD8α hinge peptide [5] and glycophorin A [6], with two major peaks eluting well before normal PTH-Thr. This characteristic pair of peaks probably represents diastereomeric forms of PTH-Thr(Sac), a similar pattern to that obtained for PTH-β-methyl-S-ethyl-cysteine, the ethanethiol adduct of β-eliminated phosphothreonine [12].

Figure 1. A and B, HPLC chromatograms of PTH-Asn(Sac) released after solid-phase Edman degradation of the Casebrook endoproteinase Glu-C fragment Val493-Glu495 (Aii) and tryptic fragment Arg485-Lys500 (Aiii). PTH-Thr(Sac) released after solid-phase Edman degradation of the κ-casein peptide, Val139-Thr145, is shown in Bii. Chromatography conditions were: solvent A, 2 mM formic acid; solvent B, 100% acetonitrile. The flow rate was 0.7 ml min⁻¹ and the column oven temperature 50°C. The gradient for PTH-amino acid separation was that recommended by the manufacturer. Ai and Bi are 35 pmol PTH-amino acid standard chromatograms. C and D, high performance anion exchange chromatograms using pulsed amperometric detection of monosaccharides released by 2 M TFA hydrolysis of peak I, 400 pmol of PTH-Asn494(Sac) released at cycle 2 of the V8 peptide (C) and both PTH-Thr(Sac) peaks (D). The sugars were eluted isocratically with 15 mM NaOH and post-column addition of 0.4 M NaOH, and identified by comparison with standards. An internal standard of 2-deoxyglucose was used for quantitation. Abbreviations are GlcNH2 (glucosamine), Gal (galactose), Glc (glucose), Man (mannose).

Table I. Monosaccharide composition of PTH-glycoamino acids

                    Composition mol/mol (a)
Sugar constituent   Casebrook albumin   Casebrook albumin   κ-Casein
                    Asn494(Sac) (b)     Asn494(Sac) (c)     Thr142(Sac)
glucosamine         3.8                 3.2                 0
galactosamine       0                   0                   1.0
galactose           2.2                 1.7                 0.9
mannose             2.6                 2.9                 0

(a) Normalised on the amount (≈400 pmol) of PTH-Xaa(Sac) collected. Monosaccharide composition was quantified by the inclusion of 1 µg of the internal standard deoxyglucose.
(b) Peak I of the PTH-Asn(Sac) recovered from cycle 2 of the V8 glycopeptide.
(c) Peak I of the PTH-Asn(Sac) recovered from cycle 10 of the tryptic glycopeptide.

B. Ionspray mass spectrometry of PTH-glycoamino acids
Additional evidence concerning the nature of the oligosaccharide attached to Casebrook albumin PTH-Asn494 and κ-casein PTH-Thr142 was obtained by ionspray mass spectrometry. The determined mass for PTH-Asn-GlcNAc4:Man3:Gal2 was 1,872.9 daltons (Fig. 2A, 1,872.8 daltons expected) and for PTH-Thr-GalNAc:Gal 601.5 daltons (Fig. 2B, 601.4 expected). Limited structural information was obtained by increasing the orifice potential; for example, the 1507.0 ion (Fig. 2A) results from the loss of a single hexosamine-hexose (366 daltons) and the 440.4 ion (Fig. 2B) results from the loss of a hexose (162 daltons). However, interpretation of the spectrum is difficult because of the likelihood that some of the fragment ions arose from products of degradation during Edman sequencing.
IV. CONCLUSION
The method described in this paper makes possible the analysis of the sugars attached to a specific amino acid. The oligosaccharide is released by solid-phase Edman degradation as a labelled reducing terminal sugar, the PTH-glycoamino acid. This provides an excellent chromophore in the UV range (269 nm max), which should prove useful for the separation of heterogeneous oligosaccharides on a single amino acid. There are several excellent methods for the analysis of N- and O-linked glycosylation sites, particularly LC-ionspray MS analysis of proteolytically derived glycopeptides identified by the diagnostic sugar oxonium ions [13]. However, solid-phase Edman degradation of glycopeptides which contain a domain of clustered glycosylation sites is the method of choice for precise glycosylation site identification and characterization. Mass spectral analysis alone will not provide specific structural information regarding the composition of hexoses and hexosamines, and techniques such as acid hydrolysis of the oligosaccharide and subsequent identification of the sugars by HPAEC or GC-MS are necessary for complete carbohydrate analysis. Clustered glycosylation sites are typical of O-glycosylated proteins such as glycophorin A [6], CD8α [5], GPIbα [14] and the mucins [15]. We have previously demonstrated the efficiency of solid-phase Edman degradation by sequencing through the N-terminal domain of the "mucin-like" red blood cell glycoprotein glycophorin A; in that study we positively identified 1 N-linked and 16 O-linked amino acids in the 60 amino acids sequenced [6]. As demonstrated here, solid-phase Edman degradation, in combination with techniques of carbohydrate analysis such as HPAEC and ionspray mass spectrometry, will allow a new approach to the characterization of heavily glycosylated proteins previously thought intractable for protein chemistry studies.

Figure 2. The reconstructed ion-spray spectra for PTH-Asn494(Sac) (A) and for PTH-Thr142(Sac) (B). PTH-Asn(Sac) was released at cycle 10 from the peptide Arg485-Lys500 (see Fig. 1 Aiii); mass of PTH-Asn(Sac) = 1872.9, PTH-Asn = 249.3; x-axis is adjusted to molecular mass. Both PTH-Thr(Sac) peaks released at cycle 4 from the peptide Val139-Thr145 were analysed (see Fig. 1 Bii); mass of PTH-Thr(Sac) = 602.5, PTH-Thr = 236.5; x-axis is mass/charge.

ACKNOWLEDGMENTS
MUCAB research on Glycobiology was supported by an ARC Program grant to KLW and an NH&MRC grant to KLW and AAG. We thank
Millipore Australia for support of our solid-phase sequencing program. This paper summarises recent work in our group and elements of it (including Fig. 1 A and C) have been published previously [8].
REFERENCES
1. Elhammer AP, Poorman RA, Brown E, Maggiora LL, Hoogerheide JG and Kezdy FJ (1993) J. Biol. Chem. 268, 10029-10038.
2. Edman P and Begg G (1967) A protein sequenator. Eur. J. Biochem. 1, 80-91.
3. Hewick RM, Hunkapiller MW, Hood LE and Dreyer WJ (1983) High sensitivity sequencing with a gas-phase sequenator. Methods Enzymol. 91, 399-413.
4. Tarr GE (1986) Manual Edman sequencing system. In: Shively JE (ed.) Methods of protein microcharacterisation. A practical handbook. Humana Press, USA, pp 155-193.
5. Gooley AA, Classon BJ, Marschalek R, Williams KL (1991) Biochem. Biophys. Res. Commun. 178, 1194-1201.
6. Pisano A, Redmond JW, Williams KL, Gooley AA (1993) Glycobiology 3, 429-435.
7. Gooley AA and Williams KL (1994) Glycobiology 4 (in press).
8. Gooley AA, Pisano A, Packer NH, Ball M, Jones A, Alewood PF, Redmond JW and Williams KL (1994) Glycoconjugate J. (in press).
9. Coull JM, Pappin DJC, Mark J, Aebersold R, Koester H (1991) Analytical Biochem. 194, 110-120.
10. Laursen RA, Lee TT, Dixon JD, Liang S-P (1991) In: Jörnvall H, Höög J-O and Gustavsson A-M (eds) Methods in Protein Sequence Analysis. Birkhäuser Verlag, Switzerland, pp. 47-54.
11. Strydom DJ (1994) J. Chromatogr. 662, 227-233.
12. Meyer HE, Eisermann B, Heber M, Hoffmann-Posorske E, Korte H, Weigt C, Wegner A, Hutton T, Donella-Deana A and Perich JW (1993) FASEB J. 7, 776-782.
13. Carr SA, Huddleston MJ and Bean MF (1993) Prot. Sci. 2, 183-196.
14. Lopez JA, Chung DW, Fujikawa K, Hagen FS, Papayannopoulou T and Roth GJ (1987) Proc. Natl. Acad. Sci. USA 84, 5615-5619.
15. Carraway KL and Hull SR (1991) Glycobiology 1, 131-138.
The Unexpected Presence of Hydroxylysine In Non-Collagenous Proteins Michael S. Molony Shiaw-Lin Wu Lene K. Keyt Reed J. Harris Analytical Chemistry Department Genentech, Inc., So- San Francisco, CA 94080
I. Introduction
5-Hydroxylysine (Hyl) is a common modification of collagens and collagen-like domains of the C1q subcomponent of complement, acetylcholinesterase, pulmonary surfactant apoproteins, mannose binding proteins, types I and II macrophage receptors, and bovine conglutinin [reviewed by Kivirikko et al. (1)]. Partial hydroxylation of a lysine residue in a non-collagenous protein (angler fish somatostatin) has also been reported (2). Lysine residues are modified by a hydroxylase enzyme complex that recognizes an available Xaa-Lys-Gly consensus sequence on the surface of the molecule. In collagens, Hyl residues are generally utilized for cross-linking of the triple helix structure or glycosylated by the attachment of glucosylgalactose glycans (1). During detailed characterization of a tryptic map of recombinant human tissue plasminogen activator (rtPA), a minor peak was resolved whose mass was not consistent with the expected set of tryptic peptides (L. Keyt and S.-L. Wu, unpublished observation). N-terminal sequence analysis of this fraction showed that it had a sequence containing residues 276-296 of rtPA with Lys277 present as Hyl. We developed a modified amino acid analysis program capable of detecting hydroxylysine at levels down to 0.05 residues/mol to determine the distribution of Hyl in rtPA as well as in some other proteins derived from mammalian cells. Hyl was detected by amino acid analysis in a recombinant soluble form of the human CD4 receptor (rCD4) and a related chimeric protein (rCD4-IgG). Tryptic peptides were collected from these proteins and the Hyl sites were identified by N-terminal sequence
91
92
Michael S. Molony et al.
analysis. Hydroxylysine modification sites in tPA, rCD4 and rCD4-IgG were found at surface-accessible Lys-Gly sites in these noncoUagenous proteins.
II. Materials and Methods

Samples included rCD4 (3), rCD4-IgG (4), rgp120 (5), rtPA (Activase®) and rhuMAb HER2 (6), all of which were purified from transfected Chinese hamster ovary (CHO) cells. Human rtPA from transfected 293 (human embryonic kidney) cells was prepared as described (7). Bowes melanoma tPA was obtained from American Diagnostica. Human plasma factor IX was kindly provided by Kenneth J. Smith (8). Recombinant human growth hormone (Protropin®), isolated from transfected E. coli (10), was included as a negative control. Approximately 40-80 µg of each protein were hydrolyzed in triplicate (in 6 N HCl for 24 hours at 110 °C, in vacuo) and analyzed on a Beckman 6300 amino acid analyzer using the program given in Table I.

Table I. Hydroxylysine amino acid analysis program for a Beckman 6300

Time (min)   Event
0.0          Sample injection
12.0         Start temperature gradient (to 50 °C)
41.0         Change buffer to Li-B
60.0         Start temperature gradient (to 71 °C)
80.0         Change buffer to Li-C
120.0        Reagent pump: water
122.0        Change buffer to Li-R
124.0        Change buffer to Li-A
130.0        Reagent pump: ninhydrin
132.0        Temperature return to 38 °C
145.0        Recycle (next sample injection)

The system was equilibrated with a column temperature of 38 °C and flow rates of 20 mL/hr and 10 mL/hr for the buffer and reagent pumps, respectively. Temperature gradients changed at 1.5 °C/min.
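The timing of the two temperature ramps in Table I follows from the stated ramp rate. The short Python sketch below works through that arithmetic; it assumes each ramp is linear and begins exactly at the listed event time, and the function name is ours rather than anything from the analyzer's software.

```python
# Ramp rate stated in the footnote to Table I.
RAMP_RATE_C_PER_MIN = 1.5


def ramp_end_time(start_min, start_temp_c, target_temp_c, rate=RAMP_RATE_C_PER_MIN):
    """Time (min) at which a linear ramp started at start_min reaches target_temp_c."""
    return start_min + abs(target_temp_c - start_temp_c) / rate


# First ramp: 38 C (equilibration temperature) to 50 C, started at 12.0 min.
first_ramp_done = ramp_end_time(12.0, 38.0, 50.0)
# Second ramp: 50 C to 71 C, started at 60.0 min.
second_ramp_done = ramp_end_time(60.0, 50.0, 71.0)

print(first_ramp_done, second_ramp_done)
```

Under these assumptions the column reaches 50 °C well before the change to Li-B at 41.0 min, and reaches 71 °C before the change to Li-C at 80.0 min.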
Lithium citrate buffers (Li-A, Li-B, and Li-C) and a lithium cation exchange column (20 x 4.6 cm) were purchased from Beckman. The amino acid standard was prepared from the Beckman standard with the addition of Hyl/allo-Hyl (from Sigma). Optimization of the method with respect to temperature and buffer change times was required to adequately resolve Hyl from ammonia. Hyl is partially converted to allo-Hyl during hydrolysis. The peak areas for Hyl and allo-Hyl were combined for standards and samples.

Samples of rtPA, rCD4 and rCD4-IgG were S-carboxymethylated and digested with trypsin as described (10-12). The rtPA digestion products were chromatographed using a Hewlett Packard 1090M HPLC system with a Nucleosil C18 (4.6 x 250 mm, 5 µm) column that was operated at 1.0 mL/min and 40 °C. The column was equilibrated with 100% solvent A (50 mM sodium phosphate at pH 2.85); a gradient of 0 to 30% solvent B (acetonitrile; Burdick & Jackson) was developed over 90 min, then the gradient went from 30 to 60% solvent B from 90 to 120 min. Collected peak fractions were rechromatographed using the same instrument and conditions, except that solvents A and B were 0.06% TFA in water and acetonitrile, respectively, with a gradient of 0-80% solvent B in 80 min that started 4 min after sample injection. rCD4 and rCD4-IgG tryptic peptide fractions were prepared as described (11,12). N-terminal sequence analyses were performed using ABI 477/120A or 473A protein sequencers. For mass spectrometric analyses, aliquots of the collected peak fractions were also concentrated in a Savant SpeedVac to ~20 pmol/µL and infused using a Harvard syringe pump into a PE SCIEX API III electrospray mass spectrometer (ESI-MS) operating in the positive mode with an orifice potential of 80 volts.
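The two-segment phosphate/acetonitrile gradient used for the rtPA tryptic map can be expressed as a piecewise-linear function of time. The following Python sketch is illustrative only: the function name is hypothetical, and holding the composition at 60% B after 120 min is an assumption, since the text does not say what follows the gradient.

```python
def percent_b(t_min):
    """Return % solvent B at t_min for the phosphate/ACN tryptic-map gradient."""
    # Breakpoints from the text: 0% B at 0 min, 30% B at 90 min, 60% B at 120 min.
    points = [(0.0, 0.0), (90.0, 30.0), (120.0, 60.0)]
    if t_min <= points[0][0]:
        return points[0][1]
    for (t0, b0), (t1, b1) in zip(points, points[1:]):
        if t_min <= t1:
            # Linear interpolation within the current segment.
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    return points[-1][1]  # assumption: hold at the final composition


print(percent_b(45.0))   # midpoint of the first segment
print(percent_b(105.0))  # midpoint of the second segment
```

The second segment is three times steeper than the first (1.0 versus 0.33 % B per minute), which is why late-eluting peptides are compressed relative to the early part of the map.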
III. Results

A minor peak fraction from a phosphate/ACN-based tryptic map of CHO-expressed tPA was desalted by rechromatography using a TFA/ACN-based RP-HPLC system (Fig. 1). Two forms of a peptide containing residues 276-296 were obtained; the earlier eluting form had a mass of 2257.2 amu, which was ~16 amu higher than the tailing fraction that had a mass in agreement with the expected 276-296 peptide mass of 2241.6. N-terminal sequence analysis of the +16 amu form revealed the presence of Hyl at position 277 of a tPA tryptic peptide containing residues 276-296 (Ile-Hyl-Gly-Gly-Leu-Phe-Ala-Asp-Ile-Ala-Ser-His-Pro-Trp-Gln-Ala-Ala-Ile-Phe-Ala-Lys).
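The reasoning from an observed mass difference to a candidate modification can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the lookup table contains only two average-mass shifts that we supply (one oxygen atom, and one acetyl group as C2H2O), and the 0.5 amu tolerance is an arbitrary assumption.

```python
# Average mass shifts in amu; values supplied for illustration.
MODIFICATION_SHIFTS = {
    "hydroxylation (+O)": 15.999,
    "acetylation (+C2H2O)": 42.037,
}


def match_shift(observed_mass, expected_mass, tolerance=0.5):
    """Return the mass difference and the modifications consistent with it."""
    delta = observed_mass - expected_mass
    hits = [name for name, shift in MODIFICATION_SHIFTS.items()
            if abs(delta - shift) <= tolerance]
    return delta, hits


# Masses reported for the two forms of the rtPA 276-296 peptide.
delta, hits = match_shift(2257.2, 2241.6)
print(round(delta, 1), hits)
```

With the reported masses, only the hydroxylation entry falls within the tolerance, which agrees with the Hyl assignment made by sequencing.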
[Figure 1: panel A, phosphate/ACN tryptic map with the rtPA:276-296 fraction marked, x-axis Time (min), 20-120 min; panel B, rechromatography resolving the Hyl277 and Lys277 forms of the peptide, x-axis Time (min), 50-56 min.]

Figure 1. RP-HPLC purification of rtPA:276-296 peak fraction. rtPA:276-296 fractions (indicated by the asterisk in Figure 1A) from several injections were combined and re-chromatographed using the TFA/H2O/ACN system as shown in Figure 1B.
Hyl was assigned by the elution position of the PTH-amino acid [between Val and DPTU (13)] as shown in Fig. 2; this assignment was confirmed by sequencing a Hyl standard (not shown). Hyl277 is found within a Xaa-Lys-Gly sequence, consistent with other known hydroxylysine modification sites.
[Figure 2: PTH-amino acid chromatograms for sequencer cycles 1-3. Cycle 1: Ile-276; Cycle 2: Hyl-277 (Hyl, 257 pmol); Cycle 3: Gly-278 (Gly, 622 pmol). x-axis: Retention Time (minutes).]

Figure 2. N-terminal sequence analysis of rtPA:276-296 peptide with Hyl277. PTH-amino acid elution positions are given above the time axis; PTH-Hyl is indicated by the arrow. Quantitation of PTH-Hyl assumes the same peak height per pmol as for PTH-(ε-PTC)-Lys.
The Hyl277 peptide's observed mass (2257.2) was 15.6 amu higher than expected for an unmodified 276-296 peptide (2241.6 amu), consistent with lysine hydroxylation, which would be expected to add 16.0 amu. Partial hydroxylation of Lys277 in melanoma tPA and 293 cell-expressed rtPA was also demonstrated by identical tryptic mapping/mass spectrometry experiments (data not shown).

Hyl co-elutes with histidine during conventional sodium citrate amino acid analysis. We developed a lithium citrate-based amino acid analysis method capable of resolving Hyl to screen tPA and other proteins for this modification. An amino acid standard containing Hyl (and allo-Hyl) was prepared. Resolution of Hyl from His and NH3 using the lithium citrate buffer system is shown in Fig. 3. The extended second buffer time (resulting in a blank region after Phe) was required to resolve Hyl from NH3. The Hyl modification was partial at one site, so it was necessary to prepare nanomole quantities of protein hydrolysates for accurate quantitation of Hyl in the 100-200 pmol range.

We analyzed several rtPA lots, type I and type II rtPA, melanoma tPA and several other recombinant proteins (Table II). All rtPA samples showed equivalent levels of Hyl (0.05-0.21 mol Hyl/mol protein), including the type I (glycosylated at Asn117, Asn184 and Asn448) and type II (glycosylated at Asn117 and Asn448 only) forms of rtPA. Hyl was also found in rCD4 and rCD4-IgG at levels of 0.08-0.10 mol/mol protein. Factor IX, rgp120 and rhuMAb HER2 did not show detectable Hyl despite having 2, 4 and 5 Lys-Gly sequences, respectively. rhGH derived from E. coli was included as a negative control, as it would not be expected to have Hyl due to its expression in a non-mammalian host and its lack of any Xaa-Lys-Gly sequences.

Table II. Hyl levels as measured by amino acid analysis

Sample          mol Hyl/mol   # Lys-Gly sites   Source/Cell Line
rtPA            0.05-0.21     1                 transfected CHO
rtPA, type I    0.12          1                 transfected CHO
rtPA, type II   0.15          1                 transfected CHO
tPA (Bowes)     0.24          1                 melanoma
factor IX       0.00          2                 human plasma
rCD4            0.08          2                 transfected CHO
rCD4-IgG        0.10          4                 transfected CHO
rgp120          0.00          4                 transfected CHO
rhuMAb HER2     0.00          5                 transfected CHO
rhGH            0.00          0                 transfected E. coli
[Figure 3: lithium citrate amino acid analysis chromatogram; peaks are labeled with one-letter amino acid codes, NH3 and Hyl; x-axis: Time (min).]

Figure 3. Amino acid analysis program for the sensitive detection of Hyl. Hydrolysate of 38 µg (0.65 nmol) rtPA having 131 pmol (0.20 residues/mol) of Hyl.
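The residues/mol value in the Figure 3 caption is simply the ratio of Hyl recovered to protein hydrolyzed. The Python sketch below reproduces that arithmetic and, as a further illustration, converts the 0.05 residues/mol detection limit quoted in the Introduction into picomoles for the same protein load. The function names are ours.

```python
def residues_per_mol(analyte_pmol, protein_nmol):
    """Moles of a residue recovered per mole of protein hydrolyzed."""
    return analyte_pmol / (protein_nmol * 1000.0)  # convert nmol to pmol


def pmol_at_occupancy(occupancy, protein_nmol):
    """Picomoles of a residue expected at a given residues/mol occupancy."""
    return occupancy * protein_nmol * 1000.0


# Values from the Figure 3 caption: 131 pmol Hyl in 0.65 nmol rtPA.
hyl_level = residues_per_mol(131.0, 0.65)
# Detection limit from the Introduction: 0.05 residues/mol.
limit_pmol = pmol_at_occupancy(0.05, 0.65)

print(round(hyl_level, 2), round(limit_pmol, 1))
```

At this load the stated detection limit corresponds to a few tens of picomoles of Hyl, which is why nanomole quantities of hydrolysate were needed for a partially occupied site.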
There are two Xaa-Lys-Gly sequences in rCD4 and rCD4-IgG, at Lys8 and Lys46. Samples of rCD4 and rCD4-IgG were subjected to tryptic mapping, and peptides thought to contain either Lys8 or Lys46 residues were sequenced. Both rCD4 and rCD4-IgG showed evidence for Hyl at one site (Lys46) within a region that has the identical sequence in the two proteins. In each case, the modification site was found in a peptide containing residues 36-50 (Ile-Leu-Gly-Asn-Glu-Gly-Ser-Phe-Leu-Thr-Hyl-Gly-Phe-Ser-Lys).

IV. Discussion

We have identified a known posttranslational modification, hydroxylysine, in unexpected places. This modification is only partial, typically 5-25%, and occurs at one site only in tPA (at Lys277), rCD4 (at Lys46) and rCD4-IgG (at Lys46). All of the observed Hyl sites were found in Xaa-Hyl-Gly sequences, as is the case for the known Hyl sites in collagens. The modification appears to be cell line-independent, as it was found in tPA from Bowes melanoma cells as well as rtPA from transfected CHO and 293 cells. The Hyl modifications were overlooked in earlier characterization efforts (10-12, 14) because of the co-elution of Hyl with numerous His residues during amino acid analysis and also because the Hyl residues were found in minor tryptic peptides that did not result from cleavage at the expected sites. All of the observed sites are in surface-accessible regions of their respective molecules. Lys277 is only two residues from the (Arg275) plasmin cleavage site of tPA, while crystal structures obtained for CD4 indicated that Lys46 (but not Lys8) is found at the surface in the gp120-binding region (15). These findings suggest that surface accessibility could play a role in the extent of Lys hydroxylation. Partial hydroxylation of the Lys residues will make it difficult to determine the biological significance of this modification in non-collagenous proteins.
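Locating candidate Xaa-Lys-Gly sites in a sequence is a simple pattern scan. The Python sketch below applies it to the two modified tryptic peptides reported above, written in one-letter code with Hyl entered as its unmodified parent K. The function name is hypothetical, and the scan says nothing about surface accessibility, which the Discussion identifies as the other requirement.

```python
def lys_gly_sites(sequence, first_residue_number=1):
    """Return residue numbers of Lys in Xaa-Lys-Gly motifs (one-letter code)."""
    sites = []
    # Start at index 1 so that a preceding residue (Xaa) always exists.
    for i in range(1, len(sequence) - 1):
        if sequence[i] == "K" and sequence[i + 1] == "G":
            sites.append(first_residue_number + i)
    return sites


# rtPA residues 276-296 and CD4 residues 36-50, as given in the Results.
RTPA_276_296 = "IKGGLFADIASHPWQAAIFAK"
CD4_36_50 = "ILGNEGSFLTKGFSK"

print(lys_gly_sites(RTPA_276_296, 276))
print(lys_gly_sites(CD4_36_50, 36))
```

The scan returns residue 277 for the rtPA peptide and residue 46 for the CD4 peptide, matching the Hyl positions found by sequencing; the C-terminal lysines are not reported because no Gly follows them within the peptide.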
Acknowledgments

The authors thank Long Truong and Victor Ling for ESI-MS analyses, and Dr. John Ogez for providing the type I and type II rtPA. This work was initiated by a careful examination of a tryptic map of rtPA by Barb Sheng, Kathy Powell and Dorinne Tsuchiya.

References
1. Kivirikko, K. I., Myllyla, R. and Pihlajaniemi, T. In "Posttranslational Modifications of Proteins" (Harding, J. J. and Crabbe, M. J., eds.) 1992 pp. 1-51. CRC Press, Boca Raton.
2. Andrews, P. C., Hawke, D., Shively, J. E. and Dixon, J. E. (1984) J. Biol. Chem. 15021-15024.
3. Smith, D. H., Byrn, R. A., Marsters, S. A., Gregory, T., Groopman, J. E. and Capon, D. J. (1987) Science 238, 1704-1707.
4. Capon, D. J., Chamow, S. M., Mordenti, J., Marsters, S. A., Gregory, T., Mitsuya, H., Byrn, R. A., Lucas, C., Wurm, F. M., Groopman, J. E., Broder, S. and Smith, D. H. (1989) Nature 337, 525-531.
5. Lasky, L. A., Nakamura, G., Smith, D. H., Fennie, C., Shimasaki, C., Patzer, E., Berman, P., Gregory, T. and Capon, D. J. (1987) Cell 50, 975-985.
6. Carter, P., Presta, L., Gorman, C. M., Ridgway, J. B. B., Henner, D., Wong, W. L. T., Rowland, A. M., Kotts, C., Carver, M. E. and Shepard, H. M. (1992) Proc. Natl. Acad. Sci. USA 89, 4285-4289.
7. Higgins, D. L., Lamb, M. C., Young, S. L., Powers, D. B. and Anderson, S. (1990) Thromb. Res. 57, 527-539.
8. Smith, K. J. (1988) Blood 72, 1269-1277.
9. Olson, K. C., Fenno, J., Lin, N., Harkins, R. N., Snider, B., Kohr, W. H., Ross, M. J., Fodge, D., Prender, G. and Stebbing, N. (1981) Nature 239, 408-411.
10. Chloupek, R. C., Harris, R. J., Leonard, C. K., Keck, R. G., Keyt, B. A., Spellman, M. W., Jones, A. J. S. and Hancock, W. S. (1989) J. Chrom. 463, 375-396.
11. Harris, R. J., Wagner, K. L. and Spellman, M. W. (1990) Eur. J. Biochem. 194, 611-620.
12. Harris, R. J., Chamow, S. M., Gregory, T. J. and Spellman, M. W. (1990) Eur. J. Biochem. 188, 291-300.
13. Hunkapiller, M. W. (1985) User Bull. 14, Applied Biosystems, Inc., Foster City.
14. Ling, V., Guzzetta, A. W., Canova-Davis, E., Stults, J. T. and Hancock, W. S. (1991) Anal. Chem. 63, 2909-2915.
15. Ryu, S.-E., Truneh, A., Sweet, R. W. and Hendrickson, W. A. (1994) Structure 2, 59-74.
Isolation of Escherichia coli synthesized recombinant proteins that contain ε-N-acetyllysine
Bernard N. Violand, Michael R. Schlittler, Cory Q. Lawson, James F. Kane¹, Ned R. Siegel, Christine E. Smith, and Kevin L. Duffin
Monsanto Company, St. Louis, MO 63198 and ¹SmithKline Beecham, King of Prussia, PA 19406
I. INTRODUCTION

The formation of ε-N-acetyllysine by the acetylation of the side chain amino group of lysine residues in a polypeptide was initially discovered in calf thymus histones H3 and H4 (1-4). This altered amino acid has also been detected in other histones (5,6) and its formation has been most thoroughly investigated in histones and another class of DNA-binding proteins known as the high-mobility group proteins (7,8). Acetyl-CoA is the donor of the acetyl group which is transferred to the ε-nitrogen of lysine by an acetyltransferase in this postsynthetic reaction (1,9). ε-N-acetyllysine has never been detected in either natural Escherichia coli proteins or recombinant proteins expressed in this organism except for one article describing its presence in several tryptic peptides from a fraction of recombinant bovine somatotropin having a lower than normal isoelectric point (10). Our report utilizes amino acid sequencing, mass spectrometry and amino acid analysis to demonstrate the presence of ε-N-acetyllysine in several recombinant eukaryotic proteins synthesized in E. coli. Additionally, several mono-ε-N-acetyllysine species of recombinant porcine (rpST) and bovine somatotropins (rbST) have also been purified utilizing preparative immobilized pH gradient electrophoresis.
II. MATERIALS AND METHODS

Recombinant somatotropins were refolded from E. coli inclusion bodies as previously described (11). Monomeric rpST and rbST were purified from the refold mixture (20 grams rpST) by deionizing it to <100 microsiemens, adjusting to pH 10.8 with 2.5 N NaOH, and applying to a column (5 cm x 25 cm) of Phoenix resin equilibrated in 4.5 M urea, 0.05 M Tris-HCl, pH 10.8. The protein was eluted with a linear salt gradient consisting of 3.5 L of the equilibrating buffer and 3.5 L of a limit buffer of 4.5 M urea, 0.05 M Tris-HCl, 0.10 M NaCl, pH 10.8. Isolation of the low pI forms of rpST and rbST was accomplished by chromatography of the product from the Phoenix ion-exchange column on a Whatman DE-52 resin as previously described (12). Immobilized pH gradient electrophoresis (IPG) of rpST and rbST was performed using pH gradients in urea according to published procedures (13). Tryptic peptide mapping, isoelectric focusing, RP-HPLC, and electrospray mass spectrometry (ESMS) were performed as previously described (14). Positive fast atom bombardment mass spectrometry of peptides was performed on a Finnigan TSQ mass spectrometer equipped with an Iontech FAB source as previously described (15). The method used for determining the amount of ε-N-acetyllysine in somatotropins by amino acid analysis was similar to a previous method (4), with the one modification that norleucine was included as an internal standard.
III. RESULTS

A. Isolation of Monoacetylated rpST and rbST
Analysis of rpST by anion-exchange chromatography resolved this protein into two major peaks (Fig. 1). Isoelectric focusing (IEF) of these fractions and those from a similar large scale separation on Whatman DE-52 resin showed that the components in the second peak were very heterogeneous and were of a lower pI than the major component in rpST (Fig. 2). These species have pIs in the range of 5.5-7.0 compared to 7.4 for normal rpST. IEF of rbST (11) and immobilized pH gradient electrophoresis (IPG) of recombinant human tissue factor pathway inhibitor (16) have also demonstrated the presence of several low pI forms of these proteins expressed in E. coli. Since these low pI rpST forms accounted for approximately 35% of the total protein as analyzed in Fig. 1, an understanding of their origin was desirable.

In order to determine the chemical identity of these low pI species, preparative IPG was used to isolate sufficient quantities for characterization. Fig. 3 shows an IPG gel of the low pI rpST loaded onto the gel (same sample as lane 4, Fig. 2) and several of the individual bands purified from this sample. Each of these isolated proteins was subjected to tryptic peptide mapping. Peptide maps of normal pI rpST and the rpST from lane 4 in Fig. 3 showed several differences (Fig. 4). Two of the normal peptides, corresponding to residues Asp168-Leu-His-Lys171 and Ala172-Glu-Thr-Tyr-Leu-Arg177, were reduced in intensity and a novel peptide at a retention time of 36.8 min was present in the modified protein digest. The sequence of this altered peptide yielded Asp168-Leu-His-εNAcLys-Ala-Glu-Thr-Tyr-Leu-Arg177. These data showed that an ε-N-acetyllysine residue was present instead of Lys171. Similar peptide mapping and analyses of the other purified IPG bands demonstrated that the rpST in lane 3 contained ε-N-acetyllysine112, and the protein in lane 2 was rpST which had the peptide bond between Asn99 and Ser100 cleaved.

Figure 1. Anion-exchange HPLC separation of normal from low pI rpST. This separation was performed using two Bio-gel DEAE-5-PW columns in tandem using a salt gradient at pH 8.3.

Figure 2. Analytical IEF of normal and low pI rpST purified using DE-52 chromatography. The rpST shown in lane 2 was separated into normal (lane 3) and low pI rpST (lane 4). Lane 1 is Pharmacia IEF standards.

Figure 3. Immobilized pH gradient electrophoresis of rpST. Preparative IPG of the low pI rpST in lane 1 was used to isolate low pI species in pure form. Lane 2 is rpST with the peptide bond between Asn99 and Ser100 cleaved; lane 3 contains rpST with acetyllysine112; and lane 4 contains rpST with acetyllysine171.

[Figure 4: two chromatogram traces labeled "IPG purified low pI rpST" and "Normal rpST"; x-axis: Time (min).]

Figure 4. RP-HPLC analyses of the tryptic peptides for normal rpST and rpST containing ε-N-acetyllysine171.
B. RP-HPLC analysis of rpST

RP-HPLC analysis (Fig. 5) of rpST purified by the Phoenix resin resolved it into one major and four minor components. The first protein peak has previously been identified as rpST containing a peptide bond cleaved between Asn99 and Ser100 (same species as lane 2, Fig. 3), the second peak is rpST with isoaspartate99, the third peak is of unknown identity and the majority of the fourth peak is normal rpST (15). The last eluting component (peak 5), which accounted for 5-10% of the total protein, was isolated by RP-HPLC (Fig. 5) and analyzed by tryptic peptide mapping along with standard rpST (not shown). Sequencing of a modified peptide from this late-eluting rpST yielded a sequence identical to that of the normal peptide Gln140-Thr-Tyr-Asp-Lys144-Phe-Asp-Thr-Asn-Leu-Arg150, except that residue 144 was an ε-N-acetyllysine. Mass spectrometry analysis yielded a positive molecular ion of 1442.8 for the modified peptide compared to 1400.8 for the normal peptide, which confirmed the addition of an acetyl group to this peptide. These data from the RP-HPLC and IPG purified species showed that at least three monoacetylated species of rpST (ε-N-acetyllysine 144, 112 and 171) were formed in E. coli. Similar analyses of IPG-purified bands of rbST showed that ε-N-acetyllysine was present at residues 144, 157 and 167 in this protein.

[Figure 5: two chromatogram traces labeled "Purified rpST containing ε-N-acetyllysine144" and "RP-HPLC of rpST".]

Figure 5. RP-HPLC analyses of rpST and purified peak 5 from this rpST.
C. Electrospray Mass Spectrometry

Electrospray mass spectrometry (ESMS) was also used to substantiate the presence of ε-N-acetyllysine in rbST and rpST. ESMS (Fig. 6) analyses of normal pI (Fig. 2, lane 3) and low pI rpST (Fig. 2, lane 4) yielded the expected mass of 21,798 daltons for normal rpST, whereas the low pI rpST contained a majority of rpST which is 42 daltons higher in mass. This additional 42 daltons is the mass expected for incorporation of an acetyl group. Similar analyses (Fig. 7) of normal pI (8.3), low pI (7.3) and a very low pI (6.3) form of rbST yielded similar results and also showed that there were apparently some diacetylations in the very low pI species of rbST. These ESMS data demonstrated that approximately one-half of the low pI somatotropins contained ε-N-acetyllysine.

Figure 6. Deconvoluted ESMS spectra of normal pI rpST and low pI rpST. The calculated molecular weight of rpST is 21,798 amu.

[Figure 7: deconvoluted spectra. Low pI rbST (~1 pH unit below normal): normal mass 21,872 and +1 acetyl at 21,914. Very low pI rbST (~2 pH units below normal): normal mass, +1 acetyl at 21,914 and +2 acetyl at 21,956. x-axis: m/z.]

Figure 7. Deconvoluted ESMS spectra of normal pI rbST, low pI rbST and a very low pI rbST. The calculated molecular weight of rbST is 21,872 amu.
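The interpretation of the deconvoluted masses as mono- and diacetylated species can be sketched as a short Python calculation. This is an illustration under stated assumptions, not the authors' data processing: the 42.04 Da average mass increment for an acetyl group and the 2 Da tolerance are values we supply, and the observed masses are the peak labels recoverable from Figure 7.

```python
ACETYL_SHIFT = 42.04  # assumed average mass added per acetyl group (C2H2O), Da


def acetyl_count(observed_mass, calculated_mass, tolerance=2.0):
    """Number of acetyl groups consistent with a mass excess, or None."""
    delta = observed_mass - calculated_mass
    n = round(delta / ACETYL_SHIFT)
    if n < 0 or abs(delta - n * ACETYL_SHIFT) > tolerance:
        return None  # mass excess is not a whole number of acetyl groups
    return n


RBST_CALCULATED = 21872.0  # calculated molecular weight of rbST (Figure 7)

for observed in (21872.0, 21914.0, 21956.0):
    print(observed, acetyl_count(observed, RBST_CALCULATED))
```

The same function applied to rpST (calculated mass 21,798) assigns one acetyl group to a species 42 daltons heavier, in line with the Figure 6 result described in the text.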
D. Amino Acid Analysis for Quantitation of ε-N-acetyllysine

IPG and RP-HPLC were useful for separating several monoacetylated forms of rpST, but these methods do not yield quantitation for the total level of ε-N-acetyllysine. The total amount of ε-N-acetyllysine was determined using total enzymatic digestion of the protein followed by amino acid analysis. Normal hydrolysis with acid or base cannot be used since ε-N-acetyllysine is not stable to these hydrolysis conditions. A value of 0.42 moles of ε-N-acetyllysine per mole of protein was obtained for the low pI rpST (Fig. 2, lane 4) while the normal pI rpST (Fig. 2, lane 3) yielded non-detectable levels (<0.05 mol/mol) of this amino acid. As expected, these data confirmed that ε-N-acetyllysine is present only in the low pI components of rpST and not in the normal pI rpST. Since the majority of the acetylated rpST molecules contain only one ε-N-acetyllysine residue, as verified by ESMS, approximately 40% of the low pI rpST molecules are acetylated. The remainder of the low pI species have been shown to be primarily deamidated forms of rpST which have been previously identified as containing aspartate99 and isoaspartate99 (15).

The amount of ε-N-acetyllysine was also measured on rpST and rbST isolated directly from inclusion bodies. This was accomplished by reducing and carboxymethylating the free sulfhydryl groups in the proteins from inclusion bodies and subsequently purifying the desired somatotropin away from the E. coli proteins by RP-HPLC. These analyses yielded 0.22 and 0.27 moles of ε-N-acetyllysine per mole of rpST and rbST, respectively. These data showed that approximately 25% of the somatotropin present in the inclusion bodies contained this modified amino acid.
IV. DISCUSSION

This report describes the presence of significant amounts of ε-N-acetyllysine in rpST and rbST, eukaryotic proteins expressed in a prokaryotic system. Initial work from our laboratory has also demonstrated the presence of this modified amino acid in two other recombinant eukaryotic proteins expressed in E. coli, bovine placental lactogen and human tissue factor pathway inhibitor (16). ESMS, amino acid sequencing and amino acid analyses were utilized to demonstrate the presence of ε-N-acetyllysine in these two recombinant proteins. These data established that this modified amino acid is present in several distinct recombinant eukaryotic proteins expressed in E. coli.

The mechanism of acetylation in E. coli has not been elucidated but is being actively investigated in our laboratory. Formation of ε-N-acetyllysine in eukaryotic systems involves a posttranslational mechanism in which the acetyltransferase uses acetyl-CoA as the source of the acetyl group (4). Our attempts to demonstrate an ε-N-acetyltransferase activity in E. coli have been unsuccessful thus far when using a crude lysate of E. coli with [14C]acetyl-CoA and non-acetylated rpST as a substrate. It may be possible that acetylation of lysines occurs by a chemical mechanism with acetyl-CoA or some other metabolic intermediate providing the source of the acetyl group. It has been shown that the acetyl group from aspirin can readily be transferred to lens protein to form ε-N-acetyllysine (17). Conceivably, under the stressed conditions of expressing recombinant proteins, an acetyl donor molecule may be produced at a higher than normal concentration and cause this acetylation. Investigations into the effects of fermentation conditions on the level of ε-N-acetyllysine formation may lead to a better understanding of this event. In conclusion, this research has demonstrated that acetylation of lysines is an important modification which can occur during the expression of recombinant proteins in E. coli.
REFERENCES
1. Allfrey, V.G., Faulkner, R., and Mirsky, A.E. (1964). Proc. Natl. Acad. Sci. U.S.A. 51, 786-794.
2. Gershey, E.L., Vidali, G., and Allfrey, V.G. (1968). J. Biol. Chem. 243, 5018-5022.
3. Delange, R.J., Fanbrough, D.M., Smith, E.L., and Bonner, J. (1969). J. Biol. Chem. 244, 319-334.
4. Allfrey, V.G., Paola, E.A., and Sterner, R. (1984). Methods in Enzymol. 107, 224-240.
5. Nelson, D.A. (1982). J. Biol. Chem. 258, 1565-1568.
6. Ruiz-Carrillo, A., Wangh, L.J., and Allfrey, V.G. (1976). Arch. Biochem. Biophys. 174, 273-290.
7. Sterner, R., Vidali, G., and Allfrey, V.G. (1979). J. Biol. Chem. 254, 11577-11583.
8. Sterner, R., Vidali, G., and Allfrey, V.G. (1981). J. Biol. Chem. 256, 8892-8895.
9. Nohara, J., Takahashi, T., and Ogata, K. (1966). Biochim. Biophys. Acta 127, 282-290.
10. Harbour, G.C., Garlick, R.L., Lyse, S.B., Crow, F.W., Robins, R.H., and Hoogerheide, J.G. (1992). Techniques in Protein Chemistry III, 487-495.
11. Bogosian, G., Violand, B.N., Dorward-King, E.J., Workman, W.E., Jung, P.E., and Kane, J.F. (1989). J. Biol. Chem. 264, 531-539.
12. Wood, D.C., Salsgiver, W.J., Kasser, T.R., Lange, G.W., Rowold, E., Violand, B.N., Johnson, A., Leimgruber, R.M., Parr, G.R., Siegel, N.R., Kimack, N.M., Smith, C.E., Zobel, J.F., Ganguli, S.M., Garbow, J.R., Bild, G., and Krivi, G.G. (1989). J. Biol. Chem. 264, 14741-14747.
13. Righetti, P.R., and Gianazza, E. (1987). Methods in Biochemical Analysis 32, 215-278.
14. Violand, B.N., Schlittler, M.R., Kolodziej, E.W., Toren, P.C., Cabonce, M.A., Siegel, N.R., Duffin, K.L., Zobel, J.F., Smith, C.E., and Tou, J.S. (1992). Protein Sci. 1, 1634-1641.
15. Violand, B.N., Schlittler, M.R., Toren, P.C., and Siegel, N.R. (1990). J. Prot. Chem. 9, 109-117.
16. Leimgruber, R.L., Junger, K.D., Gustafson, M.E., and Violand, B.N. (1993). Proceeding of Electrophoresis Society, Abstract 112.
17. Rao, G.N., Lardis, M.P., and Cotlier, E. (1985). Biochem. Biophys. Res. Comm. 128, 1125-1132.
LC-MS Methods for Selective Detection of Posttranslational Modifications in Proteins: Glycosylation, Phosphorylation, Sulfation, and Acylation Mark F. Bean, Roland S. Annan, Mark E. Hemling, Mary Mentzer, Michael J. Huddleston, and Steven A. Carr Department of Physical and Structural Chemistry, SmithKline Beecham Pharmaceuticals, King of Prussia, PA 19406
I. Introduction

One means of cell regulation of protein function is achieved through control of covalent modifications of the protein. The exact role and mechanism of action of such posttranslational events as glycosylation (1), phosphorylation (2-4), sulfation, or acylation have been difficult to study, in part because of a lack of adequate means of detecting and characterizing the modifications using standard peptide sequencing tools. Mass spectrometry has become one of the methods of choice in uncovering many important posttranslational changes in proteins, for it is capable of revealing the presence and nature of modifications as well as locating which amino acid has been modified. What has greatly aided this is mass spectrometry's inherent and improving sensitivity (low pmol moving down to fmol), speed, and ability to analyze mixtures of peptides derived from proteolysis either by LCMS, by matrix-assisted laser desorption (MALDI), or by tandem MS.

We have recently described (5-8) a general approach for selective detection of a variety of posttranslational modifications during LC-ESMS analysis of protein digests. The method, which involves stepping of the collision energy during the LCMS analysis (described below), yields peptide molecular weight information in addition to indicating which chromatographic peaks are likely to contain the modified peptide(s). The technique is simple to implement and requires only a single-stage quadrupole instrument. Selective detection of posttranslational modifications is possible through collision-induced formation of low-mass fragment ions that serve as selective markers for the modifications of interest. Collision-induced fragmentation, which is turned on while scanning the lower mass range, is turned off during the remainder of the scan (e.g. m/z 400-2400). In this way peptide molecular weight information and modification-selective, low-mass marker ions are detected in the same scan. The chromatographic elution time for the modified peptide is determined by plotting the marker ion traces and observing maxima (for example, see Figs. 1 and 3). These traces are readily compared with the total-ion current (TIC) and/or UV chromatogram traces stored in the same computer file. Because column effluent is split during all but capillary column LC-ESMS analyses, >90% of the effluent is fraction-collected. Once located, modified peptides may be analyzed using a variety of other techniques (such as glycosidase digestion, tandem MS, etc.) to further characterize the structure and precise sequence location of the modification.

Our initial applications of the stepped collision energy LC-ESMS approach involved selective detection (and differentiation) of N- or O-glycosylated peptides (5,6), and detection of phosphorylated peptides (7,8,11). The method has been in routine use in our laboratory for over two years and during this time has become one of the mainstays of our work in characterizing protein modifications. Here we present some of our more recent studies on protein glycosylation and phosphorylation, and illustrate a preliminary evaluation of the stepped collision energy LC-ESMS method for selective detection of sulfated and acylated peptides in protein digests.

II. Materials and Methods

Electrospray mass spectra were recorded on a Perkin-Elmer Sciex API-III triple quadrupole mass spectrometer (Thornhill, Canada) with general MS and LC setup as previously described (5-8). Modifications of the instrument which improve sensitivity for negative-ion mode detection of phosphorylation and sulfation have also been described before (7,8). Because phosphorylated or sulfated peptides produce predominantly negatively charged marker ions while glycosylation or acylation yield positively charged species, these two classes of modification cannot both be detected in a single LCMS experiment.
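The stepped scan outlined in the Introduction amounts to making the orifice (sampling cone) potential a function of the m/z currently being scanned. The Python sketch below illustrates that idea for a positive-ion experiment. It is not instrument control code: the function names are hypothetical, the m/z 400 switch point follows the example range in the Introduction, and the +140 V and +65 V settings are illustrative values chosen to resemble a fragmenting and a non-fragmenting condition.

```python
def orifice_voltage(mz, step_mz=400.0, high_v=140.0, low_v=65.0):
    """Orifice potential (V) for a given scan position in a stepped scan."""
    # Below step_mz: high potential, so marker fragment ions are produced.
    # At or above step_mz: low potential, so intact molecular ions survive.
    return high_v if mz < step_mz else low_v


def scan_profile(start_mz, stop_mz, increment):
    """List of (m/z, orifice voltage) pairs for one scan."""
    profile = []
    mz = start_mz
    while mz <= stop_mz:
        profile.append((mz, orifice_voltage(mz)))
        mz += increment
    return profile


# Hypothetical scan from m/z 150 to 2400 in 250-unit steps for display.
for mz, volts in scan_profile(150.0, 2400.0, 250.0):
    print(mz, volts)
```

Because the polarity of the marker ions differs between the two classes of modification, a negative-ion experiment would need its own voltage pair; the sketch covers only one polarity per run, consistent with the limitation stated above.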
The column flow into the ESMS system is best kept to a few |iL/min., and it is common to split the flow from all but capillary columns so that the larger portion is carried through a UV detector to a fraction collector. For example, with a 1 mm diameter column with a flow of 40 |iL/min., effluent will typically be split so that about 5 |iL flows to the mass spectrometer and 35 ^L goes to a fraction collector. There is no loss in sensitivity of detection in such stream splitting because the ESMS acts as a concentration-dependent detector. Maximum sensitivity for a given modification can be obtained with narrower capillary columns, using UV detection together with MS selected-ion monitoring of the fragment-ion markers without scanning for molecular weight information. m . Results Production of marker fragment-ions is a function of ion velocity in the region before the first mass-resolving quadrupole (5 - 8). Some of the kinetic energy of the ions (a product of their mass and velocity) is converted to internal molecular vibrations upon collision with residual gas molecules in this region. Ion velocity
(and, therefore, the extent of collisional excitation) is controlled by modifying the voltage applied to the sampling cone at the mass spectrometer inlet. Increasing this voltage results in fragmentation to produce the marker ions of interest. Thus, by stepping this potential from a high value (fragment-production mode) to a low value (molecular ion production mode) during a scan, we obtain marker-ion and molecular-weight information in a single scan. This voltage change is illustrated by the dotted line in Figure 6.
In developing a new stepped collision energy LCMS method, we first work with model compounds in product-ion scanning tandem MS mode to ascertain whether appropriate diagnostic fragments are produced and to determine the optimum collision energy. We then spike these model compounds into mixtures of peptides or into protein digests to test for selectivity and sensitivity of detection. If the chromatographic trace for the diagnostic ions is weak in relation to the background noise, we use selected-ion monitoring for the ions of interest. This substantially increases sensitivity, albeit at the expense of molecular weight information. When sample is limited, we also use smaller column diameters (e.g., 1 mm or capillary columns), which provide enhanced detection sensitivity through more concentrated sample elution.
In the following LCMS experiments, we have optimized the sampling cone (orifice or OR) voltage for maximum yield of the desired marker fragments (lower voltages will still yield useful data, but at reduced sensitivity). The ions we commonly look for and the OR setting needed to generate them are listed in Table I. As a comparison, a standard OR voltage value for non-fragmenting, molecular-ion production might be about +65V or -115V.
Table I. Marker-Ions for Detection of Some Posttranslational Modifications.
Modification     OR setting     Marker ions, m/z (assignment)
Carbohydrates    OR = +140V     163 (Hex+), 204 (HexNAc+), 292 (NeuAc+), 366 (HexHexNAc+)
Pam3Cys          OR = +160V     114 (HN=C2H2=S+-C2H3=CH2), 239 (Pam+), 256 (Pam-NH3+), 257 (Pam-OH2+)
Phosphates       OR = -350V     63 (PO2-), 79 (PO3-)
Sulfates         OR = -250V     80 (SO3-), 96 (SO4-)
(In the printed table, the most selective marker-ion masses are set in boldface.)
A. Carbohydrate-Specific Detection
Glycoprotein gD2, a structural component of the herpes simplex 2 virion and a potential vaccine candidate, has been expressed in Chinese hamster ovary cells. The reduced and pyridylethylated protein was digested sequentially with neuraminidase and trypsin to produce a mixture of peptides and glycopeptides. We have found neuraminidase treatment to be helpful in reducing glycopeptide heterogeneity somewhat and in enhancing the MS detection of glycopeptides. The digest mixture (5.6 nmol) was injected onto a 2.1 mm i.d. C18 column for positive-ion stepped collision energy LC-ESMS (carbohydrate-selective detection). In Figure 1A and B, we have illustrated the traces for the oxonium ion fragments HexNAc+ (m/z 204) and HexHexNAc+ (m/z 366). As we have discussed previously (5), these two ions can be derived not only from terminal sugars but also by two-bond cleavages from internal sugars. They are therefore among the most general of the sugar-related ions, and when used together, coincidence of the chromatographic peaks is strong evidence of the presence of a glycopeptide. The selectivity of detection is clearly apparent when the chromatographic traces for these two ions are compared to that of the reconstructed total ion current (RTIC) trace (Fig. 1C); the glycopeptides which are so prominent in the fragment ion traces correspond to relatively insignificant peaks in the RTIC trace.
Figure 1. Stepped collision energy LC-ESMS for gD-2 glycoprotein digested with neuraminidase/trypsin illustrating selectivity of the m/z 204 (A) and 366 (B) carbohydrate-selective markers when compared to the summed trace for m/z 400-2000 (C). The mass spectrum of the glycopeptide eluting at ca. 37 minutes is shown in Figure 2.
Microheterogeneity within the glycopeptide due to differences in glycosylation has a small but noticeable influence on the elution times on C18 reverse-phase chromatography. Therefore, although each data point on the chromatogram represents a full mass spectrum, several adjacent scans must be summed in order to observe a truer representation of the glycoforms present
in the digest solution. For example, the spectrum shown in Figure 2 is the sum of ca. 15 scans of 5 s duration each. As in most electrospray mass spectra of larger peptides, the data consist of a series of spectral peaks corresponding to different ion charge states (3+ to 6+, yielding different m/z values) of the same glycopeptide. In addition to this electrospray-derived "heterogeneity", we observe true structural heterogeneity due to the presence of bi-, tri-, and tetraantennary glycopeptides, each with a fucose attached. This structural information is readily available from the spectrum given the protein sequence and knowing the expected digestion sites and the additive masses for the different sugar units (6). For example, (4 x 1714.5) - 4H gives a relative mass of 6854.0 for the most abundant glycopeptide. One of the three candidate tryptic peptides containing the requisite N-linked consensus sequence (N-X-S/T) is I238-D284 with a mass of 5084.5. The mass difference between the observed mass of the glycopeptide and the theoretical mass of the peptide alone is 1769.5, which corresponds precisely to the expected in-chain mass for a biantennary, mono-fucosylated carbohydrate.
The previous experiment will detect both N-linked and O-linked glycopeptides. These two can be distinguished by removing the N-linked oligosaccharides with PNGase F and repeating the carbohydrate-selective LC-ESMS experiment. This time only O-linked glycopeptides will be observed in the marker ion traces (6).
[Figure 2 spectrum annotations: m/z 1714.5; Mr = 6854.1; 6854.1 - 5084.5 (I238-D284) = 1769.6, the in-chain mass of Bi-ant. + Fuc]
Figure 2. Stepped collision energy spectrum for the gD-2 glycopeptide eluting at 37 min. in Fig. 1.
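The mass arithmetic used above to assign the glycoform can be written out as a short sketch. It is an illustration only: the proton mass, the average sugar residue masses, and the Hex5HexNAc4Fuc1 composition assumed for a desialylated biantennary, mono-fucosylated chain are values supplied here rather than taken from the chapter, and the function names are hypothetical.

```python
# Sketch of the charge-state and mass-difference arithmetic for Figure 2.
# Assumed constants (not from the chapter): proton mass and average
# in-chain (residue) masses of the sugar units.
PROTON_MASS = 1.00728          # Da
RESIDUE_MASS = {               # average in-chain masses, Da
    "Hex": 162.1406,
    "HexNAc": 203.1925,
    "Fuc": 146.1412,
}


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral mass M from the m/z of an (M + zH)z+ ion."""
    return charge * mz - charge * PROTON_MASS


def glycan_mass(composition: dict) -> float:
    """In-chain mass of a glycan given its sugar composition."""
    return sum(RESIDUE_MASS[sugar] * n for sugar, n in composition.items())


# Values quoted in the text: the 4+ ion at m/z 1714.5 and the
# I238-D284 tryptic peptide mass of 5084.5.
glycopeptide = neutral_mass(1714.5, 4)
difference = glycopeptide - 5084.5

# Assumed composition of a desialylated biantennary chain with one fucose.
biantennary_fuc = glycan_mass({"Hex": 5, "HexNAc": 4, "Fuc": 1})

print(f"glycopeptide mass  {glycopeptide:.1f}")
print(f"mass difference    {difference:.1f}")
print(f"Hex5HexNAc4Fuc1    {biantennary_fuc:.1f}")
```

Under these assumptions the observed difference and the calculated glycan mass agree to within about 0.2 Da, which is the kind of agreement the text relies on when assigning the biantennary, mono-fucosylated structure.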
B. Phosphate-Specific Detection
It is thought that about one third of mammalian cell proteins contain phosphate (3), and it is therefore important to have a ready means of detecting this common modification. Osteopontin, which contains an RGD sequence and binds osteoclasts, was expressed in Chinese hamster ovary cells and digested with trypsin. Ca. 150 pmol of the digest mixture was injected onto a C18 packed fused silica capillary column for negative-ion stepped collision energy LCMS (phosphopeptide-selective detection). We used the smaller column diameter in this case to consume less material during initial exploration of the results. Later, the experiment was repeated with 2 nmol of material on a 2.1 mm i.d. column to permit fraction collection of the phosphopeptides for further work. Figure 3 compares the chromatographic traces for the sum of the PO2- (m/z 63) and PO3- (m/z 79) ion currents (RIC, outline trace) and for the sum of all the ion currents (TIC, filled trace). The selectivity of the detection method is evident. We have previously shown that these two phosphate-derived ions are detected for peptides phosphorylated on tyrosine as well as serine or threonine (7); we have not yet encountered cysteine phosphorylation (9) but do not expect difficulties with that less common modification. Moreover, we have not yet had a false positive identification of a phosphopeptide using this technique, despite the fact that we are using low-mass ions for diagnostic purposes.
[Figure 3 plot: rat osteopontin tryptic digest; x-axis, minutes; phosphopeptide fractions 1-10 marked along the time axis]
Figure 3. Stepped collision energy scanning LC-ESMS of a 150 pmol osteopontin tryptic digest comparing the total-ion trace with that of the sum of the two phosphate-selective markers (m/z 63 and 79).
[Figure 4 spectrum: fragment ions b3-b12 annotated on the sequence H/E/X/E/S/S/pS/S/E/V/N; x-axis, m/z (500-1000)]
Figure 4. Low-energy collision tandem MS of the (M+2H)2+ (m/z 748) for the early-eluting phosphopeptide in Fig. 3. The asterisks indicate phosphate losses from the next higher mass bn-ion. The pS indicates the phosphorylated serine.
Having isolated the phosphopeptide-containing fractions, it becomes possible to try to locate the site of phosphorylation by tandem mass spectrometry. Figure 4 represents the low-energy tandem mass spectrum of the early-eluting phosphopeptide comprising residues 287-298 (the protein C-terminus). The sequence information obtained with low-energy collisions is readily interpreted and, although incomplete, is sufficient to locate the phosphate group precisely on the third Ser in a run of four sequential Ser residues. The spectral peaks labeled bn represent amide bond cleavages with charge retention on the peptide N-terminus. Peaks labeled with an asterisk indicate H3PO4 losses (98 Da) from the bn peaks, a common feature of phosphoserine or phosphothreonine (but not phosphotyrosine) in MS/MS mode. Additional water loss peaks are observed from peptide fragments containing multiple serine residues. For additional examples of the use of tandem mass spectrometry in the identification of phosphorylation sites, see reference 9 and citations therein.
C. Fatty Acyl-Selective Detection
Fatty acylation of proteins is a common posttranslational modification that blocks sequencing of the N-terminal amino acids by Edman techniques. Mass spectrometry can play an essential role in such sequencing, but it is not always trivial to locate the blocked N-terminal peptide. Besides acetylation, acylation of amino acids with longer-chain fatty acids (e.g., myristic acid) has been reported.
We are developing the stepped collision energy method to aid detection of acylated peptides, including N-terminal tripalmitylated-cysteine (for structure, see Figure 6 inset), a less common modification found in bacterial lipoproteins. Figures 5A and B illustrate positive-ion stepped collision energy LC-ESMS for a Lys-C digest of the milk phosphoprotein β-casein spiked with 500 pmol of a Pam3Cys-containing peptide (Pam3Cys-SSGSKKPQKPI) injected onto a 1 mm i.d. C4 column. Selectivity is reasonably good for the sum of the diagnostic ions (m/z 114, 239, 256, and 257; see Table I).
[Figure 5 plot: x-axis, time (min), 10.0-35.0]
Figure 5. Stepped collision energy scanning LC-ESMS of 500 pmol β-casein Lys-C digest spiked with the Pam3Cys peptide, showing the TIC and the summed marker-ion traces for m/z 114, 239, 256 and 257.
Figure 6 shows the spectrum for the Pam3Cys-containing peptide eluting at 30.1 min. The dotted line indicates the stepped collision energy (stepped OR voltage). Similar work with myristylated peptides spiked into protein digests (not shown) has indicated that use of the myristyl marker-ion (m/z 211) may produce false positives. Fatty acylium ions from peptides tend to be weak in scanning mode MS and may be better candidates for selected-ion monitoring. In locating acylated peptide-containing fractions it is helpful to bear in mind that such peptides tend to be strongly retained on reverse-phase columns. Pam3Cys-containing peptides, having three palmityl groups, are so hydrophobic that they dissolve only in highly organic solvents, and unusual measures are necessary to elute them (note the solvent gradient in Figure 5A).
[Figure 6 spectrum: charge states labeled 3+ (m/z 683.5), 2+ (m/z 1024.8), and 1+ (m/z 2049.5); dotted OR voltage trace stepping down to OR = +65; inset, structure of Pam3Cys; x-axis, m/z (500-2000)]
Figure 6. Positive-ion mass spectrum for the tripalmitylated peptide eluting at 30.1 min. in Fig. 5B. The structure of Pam3Cys is shown in the inset.
D. Sulfate-Specific Detection
Sulfation of specific tyrosine residues in proteins is another posttranslational modification which is difficult to detect by Edman sequencing. We have extended the method developed for negative-ion phosphopeptide markers (m/z 63 and 79) to include the sulfate markers SO3- (m/z 80) and SO4- (m/z 96). These ions have been weaker than expected considering the lability of the sulfate ester, and we have resorted to selected-ion monitoring mode without molecular weight detection to improve sensitivity. Ca. 500 pmol of a tryptic digest of reduced and pyridylethylated α2-antiplasmin was injected onto a 1 mm i.d. C4 column. Lacking a total ion chromatogram, it is advisable to acquire simultaneous UV data. The UV trace (Figure 7A) is compared with the MS trace for the SO3- (m/z 80) ion (Figure 7B). This ion, although of weak intensity, appears to be the most selective marker. The predicted sulfopeptide elutes at 26.8 min. and is clearly differentiated from surrounding peptides. We have not yet isolated the later-eluting peaks, which also seem to be sulfopeptides, but we suspect that these are the products of incomplete enzymatic digestion and elute later due to their larger size. We have also shown (data not shown) that we can distinguish phosphopeptides and sulfopeptides in a single experiment because the respective m/z 79 and m/z 80 marker ions are clearly differentiated.
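As a summary of sections A through D, the marker ions of Table I can be collected into a small lookup that reports which modification classes are consistent with a set of observed low-mass fragment ions. This is a minimal sketch and not software described by the authors: the dictionary layout, the function name `classify_markers`, and the 0.5 m/z matching tolerance are assumptions made for illustration.

```python
# Marker ions and orifice (OR) settings transcribed from Table I.
# The data structure and the matching tolerance are illustrative assumptions.
MARKER_IONS = {
    "carbohydrate": {"polarity": +1, "or_voltage": +140, "mz": (163, 204, 292, 366)},
    "Pam3Cys":      {"polarity": +1, "or_voltage": +160, "mz": (114, 239, 256, 257)},
    "phosphate":    {"polarity": -1, "or_voltage": -350, "mz": (63, 79)},
    "sulfate":      {"polarity": -1, "or_voltage": -250, "mz": (80, 96)},
}


def classify_markers(observed_mz, polarity, tol=0.5):
    """Return {modification: [matched marker m/z]} for one ion polarity.

    observed_mz: fragment-ion m/z values seen at a chromatographic peak.
    polarity:    +1 for a positive-ion run, -1 for a negative-ion run.
    tol:         matching window in m/z units (assumed value).
    """
    hits = {}
    for name, entry in MARKER_IONS.items():
        if entry["polarity"] != polarity:
            continue  # positive and negative markers need separate runs
        matched = [m for m in entry["mz"]
                   if any(abs(m - obs) <= tol for obs in observed_mz)]
        if matched:
            hits[name] = matched
    return hits


# Negative-ion examples: m/z 79 and m/z 80 fall into different classes.
print(classify_markers([63.0, 79.0], polarity=-1))
print(classify_markers([80.0], polarity=-1))
# Positive-ion example: coincident 204 and 366 point to a glycopeptide.
print(classify_markers([204.1, 366.1], polarity=+1))
```

Keeping polarity as part of each entry mirrors the point made in Materials and Methods that the negative-ion markers (phosphate, sulfate) and the positive-ion markers (carbohydrate, acyl) cannot be monitored in the same LCMS experiment.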
[Figure 7 plot: panel A, UV trace; panel B, RIC for m/z 80; x-axis, time (min), 10.0-60.0]
Figure 7. Comparison of the UV and MS (single-ion, negative-ion, monitoring for m/z 80, RIC) traces for 0.5 nmol of α2-antiplasmin tryptic digest.
Acknowledgments
We would like to thank Dr. Jörg W. Metzger of the Institut für Organische Chemie der Universität, Germany, for providing the Pam3Cys-containing peptide.
V. References
1. Cumming, D. A.; Glycobiology 1991, 115-130.
2. Hunter, T.; Karin, M.; Cell 1992, 70, 375-387.
3. Hubbard, M. J.; Cohen, P.; TIBS 1993, 18, 172-177.
4. Kemp, B. E.; Pearson, R. B.; TIBS 1990, 15, 342-346.
5. Huddleston, M. J.; Bean, M. F.; Carr, S. A.; Anal. Chem. 1993, 65, 877-884.
6. Carr, S. A.; Huddleston, M. J.; Bean, M. F.; Protein Science 1993, 2, 183-196.
7. Huddleston, M. J.; Annan, R. S.; Bean, M. F.; Carr, S. A.; J. Am. Soc. Mass Spectrom. 1993, 4, 710-717.
8. Huddleston, M. J.; Annan, R. S.; Bean, M. F.; Carr, S. A.; Techniques in Protein Chemistry 1994, V, 123-132.
9. Crabb, J. W.; Johnson, C.; West, K.; Buczylko, J.; Palczewski, K.; Hou, J.; McKeehan, K.; Kan, M.; McKeehan, W. L.; Huddleston, M. J.; Carr, S. A.; Techniques in Protein Chemistry 1993, IV, 171-178.
10. Stover, D. R.; Walsh, K. A.; Techniques in Protein Chemistry 1993, IV, 193-200.
11. Till, J. H.; Annan, R. S.; Carr, S. A.; Miller, W. T.; J. Biol. Chem. 1994, 269, 7423-7428.
Identification of Phosphorylation Sites by Edman Degradation
John D. Shannon and Jay W. Fox
Department of Microbiology and Biomolecular Research Facility, University of Virginia Medical School, Charlottesville, VA
I. Introduction
During standard sequencing conditions, butyl chloride does not extract either free phosphate or phosphotyrosine released during sequencing of phosphopeptides. In the past, methods for extracting phosphate from the sample support were available (1), but they are not convenient and by their design are not efficient. The commercial availability of solvent-resistant supports to which peptides can be covalently coupled (Sequelon™-AA, Millipore) allows the use of a greater range of extraction solvents because sample loss from its support is no longer a concern. Both 90% methanol and trifluoroacetic acid have been used to extract phosphate from Sequelon™ membranes. Phosphotyrosine may be identified chromatographically (2, 3), whereas phosphoserine and phosphothreonine can only be inferred from the levels of breakdown products (4) unless converted to a more stable derivative (2). Alternatively, mass spectrometry can be used to sequence phosphopeptides (Payne et al., 1991). Both methods require sufficient quantities of pure peptide, whereas the method described in this text requires only sufficient radioactivity in one peptide. Here we describe how we have used methods described by others and adapted them to our instrumentation. We have analyzed over thirty peptides and here we present data to show the range of results found.
II. Materials and Methods
After counting the sample, it is dried in 5 µl aliquots on a Sequelon™ aryl amine membrane (Millipore catalog no. GEN920033) in the cap of a 1.5 ml microcentrifuge tube placed in a 55 °C heating block. After counting the membrane and the tube from which the sample was taken, the peptide is coupled to the membrane with carbodiimide (7). Using a suggestion from Millipore, we prewet the membrane with 3-5 µl of acetonitrile to enable the aqueous coupling solution (5 µl), containing 50 µg of carbodiimide, to wet the membrane and couple for 20 minutes at room temperature. The membrane is then washed four times with 1 ml of 27% acetonitrile/9% trifluoroacetic acid and two times with 1 ml of 50% methanol, and then both the membrane with coupled peptide and the washes are counted. The membrane is then applied to an Applied Biosystems 470A sequenator using the cartridge inverted as suggested by Stokoe et al. (7). The cycle program used for sequencing is based on that of Meyer et al. (2) but using direct collection in an external fraction collector of ATZ amino acids in neat
trifluoroacetic acid as described by Russo et al. (8). All conversion flask functions are converted to pauses because the flask is not used; transfer with S3 functions are converted to transfer ATZ functions and transfer with argon functions are converted to ATZ purge functions. The external fraction collector is controlled by a contact closure supplied by the LC start function (disconnect the leads to the PTH analyzer). On some fraction collectors, connect the contact closure terminals to the drop counter input and set the drop counter to 1 drop. Thus the fraction collector advances after receiving a signal from the sequencer (which normally corresponds to one drop) that a fraction has been collected. The TFA-extracted material is collected in 10 x 100 mm tubes cut to fit a scintillation vial and containing 1.8 ml of 5 M NaOH to neutralize the acid and hence reduce the amount of TFA vapor; use of KOH gives a significantly higher background. All collected fractions and the membrane are counted after sequencing. The run should go a few cycles beyond the site of phosphorylation to observe a return of counts to baseline. Radioactivity was measured by Cerenkov counting, usually for 1 minute but up to 20 minutes for low-level samples. In calculating data, where two or more phosphorylation sites are found in a peptide, the extent of phosphorylation at each site is assumed to be equal. Coupling efficiency was calculated as the ratio of counts bound to the Sequelon™ membrane compared to the counts applied to the membrane. Yield, lag and preview are calculated as a percentage of the counts coupled to the Sequelon™ membrane. To prepare the sequencer for sequencing a labelled peptide, S2 is replaced with ethyl acetate containing 1 ml/liter phosphoric acid; butyl chloride is blown out of the S3 line and trifluoroacetic acid placed in the S3 position.
The contact closure leads from the PTH analyzer are disconnected from the sequencer and leads from the external fraction collector connected. Tubing takes liquid to the external fraction collector from port 2,9 of the A block immediately after the cartridge. Return to normal sequencing is a reverse of the above procedure with the following additions: after removal of TFA from the S3 position, the bottle cap assembly is rinsed with water and then methanol and allowed to dry to remove all TFA; the cartridge blocks are cleaned and tested for radioactivity; and the S2 fluid path is flushed with pure ethyl acetate. Although we cannot ascribe any problems with the instrument to the use of this procedure, Meyer et al. (2) suggest replacing the S3 valves with TFA-resistant valves; Applied Biosystems (Research News, May 1992) recommends weekly flushing of the fluid path handling phosphoric acid because of the potential that this reagent will corrode valve blocks.
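The count bookkeeping defined above (coupling efficiency as bound over applied counts; yield, lag and preview as percentages of coupled counts) can be sketched as follows. The numbers are invented for illustration, and the reading of preview as the counts in the cycle before the phosphorylation site and lag as the counts in the cycle after it is an assumption of this sketch, since the text does not spell out those definitions.

```python
from typing import Dict, List


def coupling_efficiency(counts_bound: float, counts_applied: float) -> float:
    """Percent of applied counts that remain bound to the membrane."""
    return 100.0 * counts_bound / counts_applied


def percent_of_coupled(cycle_counts: List[float], counts_coupled: float) -> List[float]:
    """Express the counts collected in each cycle as % of coupled counts."""
    return [100.0 * c / counts_coupled for c in cycle_counts]


def summarize_site(cycle_counts: List[float], site: int,
                   counts_coupled: float) -> Dict[str, float]:
    """Yield, preview and lag for a phosphorylation site (1-based cycle).

    Assumption: preview = counts one cycle before the site,
                lag     = counts one cycle after the site.
    """
    pct = percent_of_coupled(cycle_counts, counts_coupled)
    i = site - 1
    preview = pct[i - 1] if i >= 1 else 0.0
    lag = pct[i + 1] if i + 1 < len(pct) else 0.0
    return {"yield": pct[i], "preview": preview, "lag": lag}


# Hypothetical run: 2000 c.p.m. applied, 1800 c.p.m. bound after washing,
# and counts collected in cycles 1-6 with the labelled residue at cycle 4.
applied_cpm = 2000.0
bound_cpm = 1800.0
cycle_cpm = [10.0, 20.0, 90.0, 900.0, 270.0, 40.0]

efficiency = coupling_efficiency(bound_cpm, applied_cpm)
summary = summarize_site(cycle_cpm, site=4, counts_coupled=bound_cpm)
print(f"coupling efficiency: {efficiency:.0f}%")
print(summary)
```

Expressing every quantity relative to coupled counts, as the authors do, makes runs with very different starting radioactivity directly comparable.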
III. Results
Initially we used an extraction solvent (S3) of 90% methanol (7) and obtained phosphoamino acid sequence data which was confirmed by other methods (9). However, we changed the extraction solvent to trifluoroacetic acid as used by Meyer et al. (2) and Russo et al. (8) because of its greater efficiency. As seen in Fig. 1, many peptides couple at 90% or greater efficiency. The lowest coupling efficiency we have observed in over 35 couplings is 29%. In our data, there is no relation between coupling efficiency and peptide length (data not shown). Preview shows a weak association with phosphorylation position (Fig. 2), and in many cases may be zero. Lag, which is normally 25-30% for shorter peptides, appears to increase with the length of the sequence run (Fig. 3). By compiling data from numerous runs, we have attempted to estimate the efficiency of phosphopeptide sequencing (Fig. 4). Although the repetitive yields cannot be estimated accurately due to variability, using data from the longer runs, we have nevertheless estimated a repetitive yield of 85%. This estimate
uses counts in peak tubes only, ignoring the significant lag and preview. This number may be used to predict the number of counts in a cycle that may contain a phosphoamino acid. The percentage recovery of counts at a given position is roughly constant for a range of initial counts (Fig. 5). Thus there does not appear to be a loss of a fixed amount of sample during each procedure. As a guideline, we consider that there is a good chance of determining a phosphorylation position within the first ten amino acids when supplied with 1,000 c.p.m.
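A rough sketch of how the 85% repetitive yield could be used to predict peak-tube counts is given below. The geometric model (counts multiplied by the repetitive yield once per cycle, with the exponent equal to the cycle number) and the optional coupling-efficiency factor are assumptions of this example, and, like the authors' estimate, it ignores lag and preview.

```python
def expected_peak_counts(counts: float, position: int,
                         repetitive_yield: float = 0.85,
                         coupling_efficiency: float = 1.0) -> float:
    """Predicted c.p.m. in the peak tube for a label at a given cycle.

    counts:              c.p.m. supplied (or coupled, if coupling_efficiency=1)
    position:            1-based cycle number of the phosphoamino acid
    repetitive_yield:    fractional yield per cycle (0.85 from the text)
    coupling_efficiency: fraction of supplied counts bound to the membrane

    Assumed model: counts * coupling_efficiency * repetitive_yield ** position.
    """
    if position < 1:
        raise ValueError("position must be a 1-based cycle number")
    return counts * coupling_efficiency * repetitive_yield ** position


# 1,000 c.p.m. with the phosphorylated residue at position 10.
fully_coupled = expected_peak_counts(1000.0, 10)
coupled_at_90 = expected_peak_counts(1000.0, 10, coupling_efficiency=0.90)
print(f"{fully_coupled:.0f} c.p.m. expected if all counts couple")
print(f"{coupled_at_90:.0f} c.p.m. expected at 90% coupling")
```

On this model a sample of 1,000 c.p.m. would leave on the order of a couple of hundred c.p.m. in the peak tube at position 10, which is in line with the guideline quoted above.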
[Figure 1 histogram: x-axis, % coupling efficiency in 5% intervals from 20-25 to 95-100; y-axis, number of peptides]
Figure 1. Distribution of peptide coupling efficiencies. The number of peptides coupling at a particular efficiency (5% intervals) is shown.
[Figure 2 plot: x-axis, phosphorylation position (1-17); y-axis, percent preview (0-45)]
Figure 2. Percent preview as a function of phosphorylation position.
[Figure 3 plot: x-axis, phosphorylation position (1-18); y-axis, percent lag (10-80)]
Figure 3. Percent lag as a function of phosphorylation position.
[Figure 4 plot: x-axis, phosphorylation position (1-17); y-axis, percent of coupled counts in the peak tube]
Figure 4. The percent of coupled counts recovered in the peak tube is shown as a function of phosphorylation position.
[Figure 5 plot: x-axis, counts coupled to membrane (480 to 20,000); y-axis, percent recovery]
Figure 5. Percent recovery of counts at position 7 for 7 peptides is shown as a function of initial (coupled) counts.
In Fig. 6 we show a phosphopeptide whose amino acid sequence was determined in a normal run and whose phosphoamino acid sequence was determined by the methods described above. According to the published sequence, cysteine is present at position 4, but because it was not alkylated, it was not seen in this run. The absence of an amino acid at position 7 is consistent with phosphoserine, as expected from the published amino acid sequence (10).
[Figure 6 plot: x-axis, cycle number (0-14), labeled with the amino acid identified at each cycle (none at cycles 4 and 7); y-axis scale, 0-9000]
Figure 6. Release of radioactivity and amino acids during modified and normal sequencing runs of the same sample.
IV. Discussion
Many of the peptides we have analyzed were prepared by digestion of small amounts of protein with large amounts of trypsin (11), with the risk of nonspecific digestion, including digestion by pseudo-trypsin (12, 13). During reverse phase chromatography of peptides, most of the UV-absorbing material is autolysed trypsin, making isolation and identification of phosphopeptides by normal amino acid sequencing difficult. Alternatively, peptides were separated by electrophoresis-thin layer chromatography (11) and identified by autoradiography. Nevertheless, most labelled phosphorylation sites identified by Edman degradation can be matched to possible sites in proteins of known sequence, digested at known sites, and a number have been confirmed by other methods (9, 14). In our data, we occasionally see some counts released in the cycle ahead of the phosphorylation site and trailing of counts, as observed by Rossomando et al. (15). Phosphate is released from phosphoserine and phosphothreonine under alkaline conditions by β-elimination. The lability of phosphate is influenced by surrounding amino acids (16, 17). Possibly alteration of the environment of the phosphorylated residue during sequencing increases β-elimination in some peptides, causing preview. The data we have do not allow us to conclude that phosphoserine or phosphothreonine consistently give higher preview than phosphotyrosine. Based on normal amino acid sequencing, we suspect that incomplete extraction is the major cause of the observed lag. An increase in lag with the length of the run can be explained by the presence of incomplete reactions at each step, increasing the amount of peptide from which one amino acid has not been cleaved due to an incomplete earlier reaction. Most of the samples we have analyzed are unsuitable for normal amino acid sequencing because of the low amounts of peptide (<< 1 pmole) and the likelihood of multiple peptides being present.
Thus we have not attempted to use extraction with trifluoroacetic acid in combination with amino acid sequencing as done by Meyer et al. (2). In cases where amino acid sequence data is needed, we have used some sample to detect the phosphorylation site and the remainder for standard amino acid sequencing. This protocol avoids having to test and optimize a third sequencing cycle which attempts to identify amino acids and collect radioactivity. We and others have used release of 32P to identify phosphorylation sites. One group has found that this method is unreliable (4) for reasons we do not understand. Most of the samples we have sequenced are not available in sufficient quantity or purity for normal amino acid sequencing. However, several of the phosphorylation sites we have identified have been confirmed by synthesizing and phosphorylating peptides identical to those identified by our analysis and showing comigration of the synthetic and native peptides by TLC. Other sites have been confirmed by mutagenesis experiments, one by amino acid sequencing of the peptide, and most sites match expected sites based on the known amino acid sequence of the substrate and biology of the protein. Thus this method appears reliable and relatively straightforward to introduce in a protein sequencing laboratory.
Acknowledgments
The peptides used in this study were supplied by investigators at the University of Virginia. Support for the study was from the Pratt Committee.
References
1. Wang, Y., Fiol, C.J., DePaoli-Roach, A.A., Bell, A.W., Hermodson, M.A. and Roach, P.J. (1988) Anal. Biochem. 174, 537-547
2. Meyer, H.E., Hoffmann-Posorske, E., Donella-Deana, A. and Korte, H. (1991) pp. 206-224 in Methods in Enzymology vol. 201
3. Aebersold, R., Watts, J.D., Morrison, H.D. and Bures, E.J. (1991) Anal. Biochem. 199, 51-60
4. Parten, B.F., McDowell, J.H., Nawrocki, J.P. and Hargrave, P.A. (1994) pp. 159-167 in Techniques in Protein Chemistry V, J.W. Crabb ed. (Academic Press)
5. Payne, D.M., Rossomando, A.J., Martino, P., Erickson, A.K., Her, J.-H., Shabanowitz, J., Hunt, D.F., Weber, M.J. and Sturgill, T.W. (1991) EMBO Journal 10, 885-892
6. Coull, J.M., Pappin, D.J.C., Mark, J., Aebersold, R. and Koster, H. (1991) Anal. Biochem. 194, 110-120
7. Stokoe, D., Campbell, D.G., Nakielny, S., Hidaka, H., Leevers, S.J., Marshall, C. and Cohen, P. (1992) EMBO Journal 11, 3985-3994
8. Russo, G.L., Vandenberg, M.T., Yu, I.J., Bae, Y.-S., Franza, B.R. and Marshak, D.R. (1992) J. Biol. Chem. 267, 20317-20325
9. Schaller, M.D., Hildebrand, J.D., Shannon, J.D., Fox, J.W., Vines, R.R. and Parsons, J.T. (1994) Molec. Cell Biol. 14, 1680-1688
10. Haystead, T.A.J., Haystead, C.M.M., Hu, C., Lin, T.A. and Lawrence, Jr., J.C. (1994) J. Biol. Chem., in press
11. Boyle, W.J., van der Geer, P. and Hunter, T. (1991) pp. 110-149 in Methods in Enzymology vol. 201
12. Keil-Dlouha, V., Zylber, N., Imhoff, J.-M., Tong, N.-T. and Keil, B. (1971) FEBS Lett. 16, 291-295
13. Smith, R.L. and Shaw, E. (1969) J. Biol. Chem. 244, 4704-4712
14. Moyers, J.S., Linder, M.E., Shannon, J.D. and Parsons, S.J. Biochem. J., in press
15. Rossomando, A.J., Dent, P., Sturgill, T.W. and Marshak, D.R. (1994) Molec. Cell Biol. 14, 1594-1602
16. Kemp, B.E. (1980) FEBS Lett. 110, 308-312
17. Martensen, T.M. (1984) pp. 3-23 in Methods in Enzymology vol. 107
Determination of the Disulfide Bonds of Human Macrophage Chemoattractant Protein-1 Using a Gas Phase Sequencer
Ramnath Seetharam, Jeanne I. Gorman, and Shubhada M. Kamerkar
DuPont Merck Pharmaceutical Company, Glenolden, PA 19036
I. Introduction
The formation of inter- and intra-molecular disulfide bonds is an important post-translational modification in proteins. Most proteins containing multiple disulfide bonds adopt structures with unique disulfide bonding patterns. Determination of this pattern is an important step in the elucidation of the primary structure of such proteins. The disulfide bonding pattern of a protein is usually determined by identifying the disulfide bonds in disulfide-linked peptides obtained after enzymatic or chemical digestion of the intact protein. Amino acid sequencing using stepwise Edman degradation provides a method to determine the location of disulfide-linked peptides in the intact protein sequence. diPTH-cystine is released from a disulfide-bonded peptide only when the second half-cystine of a disulfide bond is cleaved. Hence, determination of the cycles which yield diPTH-cystine often provides a simple method to determine the disulfide bonding pattern of a peptide with multiple disulfide bonds (1, 2, 3). However, under the conditions usually used for PTH analysis, diPTH-cystine tends to co-elute with PTH-tyrosine. Therefore, it is often necessary to chemically modify the tyrosine residues to
be able to unambiguously identify diPTH-cystine in the presence of PTH-tyrosine (2). A simpler method would involve modification of the conditions used for PTH analysis so as to separate diPTH-cystine from PTH-tyrosine (3). We describe such a procedure, which separates diPTH-cystine from all other PTH amino acids, including PTH-tyrosine. This procedure provides a simple way to determine the sequencer cycles which yield diPTH-cystine while sequencing disulfide-bonded peptides. This approach has been used to determine the disulfide bonding pattern in a recombinant form of human Monocyte Chemoattractant Protein-1 (r hMCP-1).
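The rule stated in the Introduction, that diPTH-cystine appears only in the cycle that cleaves the second half-cystine of a bond, can be expressed as a small sketch. The two-chain peptide, its sequences, and the function names are hypothetical and serve only to illustrate the logic; they are not peptides from this study.

```python
from typing import List, Tuple

# A half-cystine is identified by (chain index, 1-based position in that chain).
HalfCystine = Tuple[int, int]
Bond = Tuple[HalfCystine, HalfCystine]


def dipth_cystine_cycle(a: HalfCystine, b: HalfCystine) -> int:
    """Cycle at which diPTH-cystine is expected for one disulfide bond.

    All chains of a disulfide-linked peptide are degraded in parallel, so
    the residue at position n of every chain is cleaved in cycle n. The
    bond is released only when its second half-cystine is cleaved, i.e. at
    the larger of the two positions.
    """
    return max(a[1], b[1])


def expected_cycles(bonds: List[Bond]) -> List[int]:
    """Sorted list of cycles expected to yield diPTH-cystine."""
    return sorted(dipth_cystine_cycle(a, b) for a, b in bonds)


# Hypothetical disulfide-linked pair of peptides (sequences invented here).
chains = ["ACGK", "LTVECR"]
bond = ((0, 2), (1, 5))  # Cys2 of chain 0 linked to Cys5 of chain 1

# Sanity check that the stated positions really are cysteines.
assert all(chains[c][p - 1] == "C" for c, p in bond)

print("diPTH-cystine expected in cycle", dipth_cystine_cycle(*bond))
```

In this invented example nothing identifiable as cystine would be seen at cycle 2, where the first half-cystine is cleaved but remains tethered to the other chain; the derivative would appear at cycle 5.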
II. Materials and Methods
Escherichia coli derived r hMCP-1 was purchased from Pepprotech (Cherry Hill, NJ) or provided by T. Handel and A. Yetter (DuPont Merck Pharmaceutical Co.). Pepsin was obtained from Boehringer Mannheim (Indianapolis, IN). Phenylthiohydantoin (PTH) amino acid kits were purchased from Applied Biosystems (Foster City, CA). HPLC grade TFA was purchased from Pierce Chemical Co. (Rockford, IL). HPLC grade acetonitrile was purchased from E.M. Scientific (Gibbstown, NJ), and water was obtained from a Nanopure system (Barnstead, Boston, MA). 4-hydroxy-α-cyanocinnamic acid was obtained from Sigma Chemical Company (St. Louis, MO).
Peptic digests were carried out in 0.1% TFA, pH 2.2, for 48 hr at 37 °C using an enzyme to substrate ratio of about 1:25. The digests were kept frozen at -70 °C while awaiting analysis. The peptic peptides were separated by reversed phase HPLC using 100-1000 pmoles of the digests.
Mass spectra were obtained on a Vestec (Houston, TX) LaserTec ResearcH laser desorption linear time-of-flight mass spectrometer with a 337 nm nitrogen laser and a 0.7 m flight tube, operated according to the manufacturer's recommendations. A Tektronix (Beaverton, OR) model TDS520 dual channel digitizing oscilloscope was used to digitize the spectra, which were then downloaded to a ZEOS 486-33 computer. The data were analyzed using GRAMS (Galactic Industries) data analysis software. Sample (0.65 µl, 1-10 pmole/µl) was mixed with twice the volume of a saturated solution of the matrix, 4-hydroxy-α-cyanocinnamic acid, in 2:1
Identification of Disulfide Bonds
127
acetonitrile/TFA (v/v) on the stainless steel probe tips and the resultant mixture was allowed to air dry. After introduction of the sample holder into the sample compartment, a vacuum of about 10"5 to 10'6 torr was achieved before begining the analysis. The built-in video camera attached to the sample compartment was used to view the sample pins and to select spots yielding the highest signal at the lowest laser power. N-terminal protein sequencing was carried out by automated Edman degradation chemistry with a Porton PI2090E gas phase sequencer (Beckman Instruments, Fullerton CA). The respective PTH derivatives of the amino acids were analyzed in an on-line fashion using a HP1090L (Hewlett Packard, Wilmington, DE). Edman degradations and PTH analyses were performed using protocols and reagents recommended by Beckman.
III. Results and Discussion

Human MCP-1 (Fig. 1) is a protein of 76 amino acids (MW 8.6 kDa). It contains four cysteines at positions 11, 12, 36 and 52, which are present as two disulfide bonds in the native form (3, 4). MCP-1 belongs to a family of proteins called CC chemokines, since they contain adjacent cysteine residues at positions 11 and 12 (5, 6). It is also related to another family of chemokines, referred to as the CXC family, in which the first two cysteine residues are separated by a single amino acid residue (5, 6). The disulfide bonding pattern of MCP-1 has not been determined. The disulfide bonds as shown in Figure 1 have been deduced (3), based on the disulfide bonding pattern determined for the CXC chemokines β-thromboglobulin (7) and IL-8 (8). In order to determine the actual disulfide bond
  1  QPDAINAPVT CCYNFTNRKI SVQRLASYRR ITSSKCPKEA  40
 41  VIFKTIVAKE ICADPKQKWV QDSMDHLDKQ TQTPKT      76

[Disulfide bonds C11-C36 and C12-C52 are drawn in the original figure.]
Figure 1. Amino acid sequence of human Monocyte Chemoattractant Protein-1 (2,3).
Ramnath Seetharam et al.
[Chromatogram: absorbance at 212 nm versus time (min).]
Figure 2. Reversed phase HPLC of the peptic digest of r hMCP-1. The separation was carried out using about 2000 pmoles of the digest on a C18 Vydac column (2.1 x 250 mm) (Separation Systems, Hesperia, CA) connected to a HP1090M HPLC (Hewlett Packard, Wilmington, DE). The column was equilibrated in Solvent A, i.e., 0.1% trifluoroacetic acid (TFA) (w/v) in water, at a flow rate of 150 µl/min. Peptides were eluted from the column at the same flow rate, using a gradient of 1%/min of Solvent B, i.e., 0.1% TFA (w/v) in acetonitrile, after a 10 min wash with Solvent A. The effluent was monitored at 212 nm. The peaks were collected manually and stored at -70 °C while awaiting analysis.
arrangement in r hMCP-1, we proceeded to isolate and analyze the disulfide bonded peptides from the protein. This was achieved by digesting the native intact protein with pepsin at pH 2, followed by RPHPLC analysis at the same pH. The low pH was chosen to minimize disulfide interchange during the digestion and analysis. A typical elution profile obtained by RPHPLC analysis of the peptic digest of r hMCP-1 is shown in Figure 2. The peaks representing different peptides were collected and sequenced, and their location in the intact MCP-1 sequence was determined. The peak eluting at about 46 minutes gave three sequences. These sequences were arranged into the sequences shown in Table 1 as peptides A, B and C, using the MCP-1 amino acid sequence shown in Figure 1. Treatment of the 46 minute peak with excess DTT at pH 8 gave rise to peaks corresponding to the individual peptides, suggesting that the 46 minute peak represented a disulfide bonded peptide. MALDI-TOF of the intact 46 minute fraction yielded a major peak with MW 4970 (data not shown). These
Table 1. Amino acid sequences obtained from RPHPLC peak collected at 46 minutes

         Peptide A         Peptide C         Peptide B
Cycle    PTH-AA   pmol     PTH-AA   pmol     PTH-AA   pmol
  1      Y        51       P        27       K        57
  2      R        24       V        54       E        54
  3      R        24       T        29       I        41
  4      I        41       C        -        C        -
  5      T        17       C        -        A        49
  6      S         8       Y        36       D        46
  7      S         8       N        26       P        21
  8      K        18       F        24       K        17
  9      C        -        T        14       Q        14
 10      P         7       N        16       -        -
data were used to determine that the peptides A, B and C shown in Table 1 represented fragments 8-20, 28-41 and 49-63 of r hMCP-1, and that the 46 minute peak is comprised of three peptides held together by two intermolecular disulfide bonds, as shown in Figure 3. As mentioned earlier, the disulfide bonds shown in Figure 1 (C11-C36 and C12-C52) are based on homology to other chemokines. The actual disulfide bonding pattern was determined by analyzing the cycles which yield diPTH-cystine during N-terminal sequencing of the 46 minute peptide. The peptide would yield diPTH-cystine in cycles 5 and 9 if the disulfide bonds were as depicted in Figure 3. In contrast, it would give rise to diPTH-cystine in cycles 4 and 9 with the only other possible disulfide bonding pattern, C11-C52 and C12-C36.
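The cycle predictions above can be reproduced with a short calculation. The following Python sketch is illustrative only: it assumes that diPTH-cystine is detected at the cycle in which the second half-cystine of a pair is cleaved (the later of the two cycles), which is the reasoning implied by the predictions in the text. The function names are hypothetical.

```python
# Peptic fragments of r hMCP-1 (first residue number -> sequence), as given in the text.
FRAGMENTS = {
    8: "PVTCCYNFTNRKI",
    28: "YRRITSSKCPKEAV",
    49: "KEICADPKQKWVQDS",
}

def cycle_of(residue_number):
    """Edman cycle at which a residue is cleaved when all fragments are sequenced together."""
    for start, seq in FRAGMENTS.items():
        if start <= residue_number < start + len(seq):
            # Sanity check: only cysteines are passed to this function.
            assert seq[residue_number - start] == "C", "residue is not a cysteine"
            return residue_number - start + 1
    raise ValueError(f"residue {residue_number} is not in any fragment")

def cystine_cycles(pairing):
    """Assumption: diPTH-cystine appears once both half-cystines have been cleaved,
    i.e. at the later of the two cycles."""
    return sorted(max(cycle_of(a), cycle_of(b)) for a, b in pairing)

homology_pattern = [(11, 36), (12, 52)]     # pattern drawn in Figures 1 and 3
alternative_pattern = [(11, 52), (12, 36)]  # the other pattern considered in the text

print(cystine_cycles(homology_pattern))     # [5, 9]
print(cystine_cycles(alternative_pattern))  # [4, 9]
```

The two patterns differ only in whether the first diPTH-cystine appears in cycle 4 or cycle 5, which is why resolving diPTH-cystine from the other PTH amino acids is sufficient to distinguish them.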
  8  PVTCCYNFTNRKI    20
 28  YRRITSSKCPKEAV   41
 49  KEICADPKQKWVQDS  63

[Disulfide bonds C11-C36 and C12-C52 are drawn between the three peptides in the original figure.]
Figure 3. Structure of the disulfide bonded peptic peptide of r hMCP-1. The structure of the peptide present in the 46 minute fraction obtained from the peptic digest of r hMCP-1 was deduced from N-terminal sequencing and MALDI-TOF mass spectrometric analysis.
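As a rough consistency check on the MALDI-TOF result (MW 4970), the expected mass of the three-peptide complex can be estimated from its sequences. The sketch below is an illustration, not the authors' calculation: the average residue masses are standard textbook values rather than values from this chapter, the result is the neutral average mass (protonation is ignored), and the function names are hypothetical.

```python
# Average residue masses (Da) from standard tables (an assumption, not from the chapter);
# only the residues that occur in the three peptides are listed.
RESIDUE_MASS = {
    "A": 71.0788, "C": 103.1388, "D": 115.0886, "E": 129.1155,
    "F": 147.1766, "I": 113.1594, "K": 128.1741, "N": 114.1038,
    "P": 97.1167, "Q": 128.1307, "R": 156.1875, "S": 87.0782,
    "T": 101.1051, "V": 99.1326, "W": 186.2132, "Y": 163.1760,
}
WATER = 18.0153
HYDROGEN = 1.00794

def peptide_mass(seq):
    """Average mass of a free linear peptide (residues plus one water)."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER

def disulfide_linked_mass(peptides, n_disulfides):
    """Each disulfide bond removes two hydrogens from the summed peptide masses."""
    return sum(peptide_mass(p) for p in peptides) - 2 * HYDROGEN * n_disulfides

peptides = ["PVTCCYNFTNRKI", "YRRITSSKCPKEAV", "KEICADPKQKWVQDS"]
mass = disulfide_linked_mass(peptides, n_disulfides=2)
print(f"calculated {mass:.1f} Da, observed 4970 Da")
```

The calculated value comes out close to 4968 Da, within a few daltons of the observed 4970, as would be expected for three peptides joined by two disulfide bonds.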
[Chromatogram: PTH amino acid standards, detector response versus time (min), 10-24 min.]
Figure 4. Separation of PTH amino acid standards on a Hewlett Packard (Wilmington, DE) 1090L HPLC using a Hewlett Packard C18 Aminoquant column. The column was equilibrated with Solvent A, i.e., 77.6 mM Na acetate, pH 4, containing 3.5% tetrahydrofuran and 0.012% triethylamine, at a flow rate of 200 µl/min. The PTH amino acids were eluted from the column at the same flow rate using increasing concentrations of Solvent B, i.e., 100% acetonitrile. The following gradient was used: 9% to 16% B in 1.4 min; 16% B to 40% B in 16.6 min; 40% B to 60% B in 4 min; 60% B to 80% B in 3 min. The diPTH-cystine was produced by loading 5 nmole cystine directly on the sequencer (10) and collecting the HPLC peak at 14.7-15.0 minutes. An aliquot of this peak was reinjected with 20 pmoles of PTH standards.
Figure 4 shows a typical PTH amino acid separation obtained using the conditions described in the legend. It is clear that diPTH-cystine is well separated from the other PTH amino acids, including PTH-Tyr, enabling us to identify diPTH-cystine while sequencing disulfide bonded peptides. We were able to use this procedure to unambiguously identify the sequencer cycles which yield diPTH-cystine during direct microsequencing of the 46 minute peak. The amount of diPTH-cystine released during different cycles while sequencing the 46 minute peak is shown in Figure 5. It is clear that most of the diPTH-cystine is released in cycle 5. The diPTH-cystine peak also increases in cycle 9.
[Plot: diPTH-cystine released at each sequencer cycle; x-axis, cycle number.]
Figure 5. diPTH-cystine release at different cycles during N-terminal sequencing of the 46 min peak. The sequencing was carried out as described in the Methods section. The PTH analysis conditions were as described in the legend to Fig. 4. Quantitative analysis of diPTH-Cys was done using the same calibration factor as for PTH-Lys (11).
A small amount of diPTH-cystine is seen in cycle 4, probably due to disulfide interchange during isolation and sequencing of the 46 minute peptide, which is common during the isolation and sequencing of disulfide bonded peptides (1, 3). These data indicate that sequencing of the 46 minute peptide releases diPTH-cystine in cycles 5 and 9, which would be expected if the disulfide bonds were as represented in Figure 1. Hence, it follows that Figure 1 represents the actual disulfide bonds present in recombinant hMCP-1, as predicted from its homology to other similar chemokines.
References

1. Marti, T., Rosselet, S.J., Titani, K., and Walsh, K.A. (1987). Biochemistry 26, 8099-8109.
2. Burman, S., Wellner, D., Chait, B., Chaudhary, T., and Breslow, E. (1989). Proc. Natl. Acad. Sci. USA 86, 429-433.
3. Haniu, M., Acklin, C., Kenney, W.C., Rohde, M.F. (1994). Int. J. Pep. Prot. Res. 43, 81-86.
4. Robinson, E.A., Yoshimura, T., Leonard, E.J., Tanaka, S., Griffin, P.R., Shabanowitz, J., Hunt, D.F. and Appella, E. (1989). Proc. Natl. Acad. Sci. USA 86, 1850-1854.
5. Furutani, Y., Nomura, H., Notake, M., Oyamada, Y., Fukui, T., Yamada, M., Larsen, C.G., Oppenheim, J.J. and Matsushima, K. (1989). Biochem. Biophys. Res. Commun. 159, 249-255.
6. Leonard, E.J., and Yoshimura, T. (1990). Immunology Today 11, 97-101.
7. Miller, M.D., and Krangel, M.S. (1992). Critical Reviews in Immunol. 121, 17-46.
8. Begg, G.S., Pepper, D.S., Chesterman, C.N. and Morgan, F.J. (1978). Biochemistry 25, 1988-1996.
9. Baldwin, E.T., Weber, I.T., St. Charles, R., Xuan, J-C., Appella, E., Yamada, M., Matsushima, K., Edwards, B.F.P., Clore, G.M., Gronenborn, A.M., and Wlodawer, A. (1991). Proc. Natl. Acad. Sci. USA 88, 502-506.
10. Paroutaud, P. (1993). Prot. Sci. 2 (Suppl. 1), 89.
11. Haniu, M., Hsieh, P., Rohde, M.F., and Kenney, W.C. (1994). Arch. Biochem. Biophys. 310, 433-439.
SECTION III Protein Sequencing and Amino Acid Analysis
ENZYMATIC DIGESTION OF PVDF-BOUND PROTEINS IN THE PRESENCE OF GLUCOPYRANOSIDE DETERGENTS: APPLICABILITY TO MASS SPECTROMETRY
Joseph Fernandez, Farzin Gharahdaghi, and Sheenah M. Mische The Rockefeller University Protein Sequencing/HHMI Biopolymer Facilities 1230 York Ave., New York, N.Y. 10021
INTRODUCTION

Amino terminal protein sequence analysis is a vital tool in understanding the functions of proteins (1). Enzymatic digestion of PVDF-bound proteins in the presence of hydrogenated Triton X-100 (RTX-100) followed by microbore HPLC purification of the peptide fragments is an efficient and sensitive procedure to obtain internal protein sequence data (2-3). Matrix assisted laser desorption ionization time of flight (MALDI-TOF) mass spectrometry is a powerful tool when used in conjunction with the above mentioned procedures. Unfortunately, detergents such as RTX-100 (MW = 631) can cause problems with MALDI-TOF as well as electrospray ionization (ESI) mass spectrometry due to formation of detergent/Tris clusters and suppression of the peptide signal, as previously reported (5-7). Lower molecular weight detergents such as glucopyranosides could be viable alternatives, since they produce lower mass signals and presumably do not form detergent complexes. Here we present a study in which RTX-100 was replaced by heptyl (MW = 278), octyl (MW = 292), nonyl (MW = 306), and decyl (MW = 322) glucopyranosides. Peptide recoveries, amino terminal sequence analysis, and MALDI-TOF mass spectrometry of the recovered peptides and the detergent containing peptide mixtures are presented.
MATERIALS AND METHODS

Enzymatic Digestion of PVDF-bound protein. Standard proteins (4 µg) were analyzed by SDS-PAGE, electroblotted to PVDF (Immobilon PSQ), and enzymatically digested as previously described (2-3). Briefly, the excised protein bands were cut into 1 x 1 mm pieces and incubated in 50 µl of 1% detergent/10% acetonitrile/100 mM Tris, pH 8.0 for 30 min. After addition of the enzyme (0.4 µg; trypsin for transferrin and endoproteinase Glu-C for soybean trypsin inhibitor (STI)), the sample was incubated at 37°C for 24 hrs. The detergents used were as follows: hydrogenated Triton X-100 (RTX-100), heptylglucopyranoside (HGP), octylglucopyranoside (OGP), nonylglucopyranoside (NGP), and decylglucopyranoside (DGP). The supernatant was removed and the peptides were extracted with a fresh 50 µl of 1% detergent/10% acetonitrile/100 mM Tris, pH 8.0 followed by 100 µl of 0.1% TFA. The pooled supernatants were analyzed on a VYDAC C18 column (2.1 x 250 mm) using a Hewlett-Packard 1090 HPLC as previously described (8). Note that when endoproteinase Glu-C was used, no acetonitrile was added to the digestion buffer.

Amino Terminal Sequence Analysis. Purified peptides were analyzed either on an Applied Biosystems 477A (peptide #2) or a Hewlett-Packard G-1000A (peptide #1) as previously described (1,9).

MALDI-TOF Mass Spectrometry. Peptide/detergent mixtures as well as purified peptides were analyzed using a Vestec LaserTec BenchTop II system (PerSeptive Biosystems). Peptides were mixed with alpha-cyano-4-hydroxycinnamic acid as previously described (4).

ESI Mass Spectrometric Analysis. Detergent containing peptide mixtures were analyzed using either a Fisons/VG Platform system or a Vestec 201 at the University of Michigan Protein and Carbohydrate Structure Facility (7).

Table I: Peptide Recovery after Enzymatic Digestion of PVDF-Bound Protein in the Presence of Various Detergents

                             Transferrin(a)                   STI(b)
Detergent                    Peptide Recovery(c) (%)   n(d)   Peptide Recovery(c) (%)   n(d)
Hydrogenated Triton X-100    60 ± 6                    6      44 ± 7                    4
Heptylglucopyranoside        0                         2      0                         4
Octylglucopyranoside         52 ± 8                    6      36 ± 7                    4
Decylglucopyranoside         64 ± 7                    5      NA(e)                     NA(e)

a. PVDF-bound transferrin (53 pmol) was digested as described in Materials and Methods.
b. PVDF-bound STI (190 pmol) was digested as described in Materials and Methods.
c. Based on the total peak height of selected peptides (8-17 peptides) compared to the same protein digested in solution as described elsewhere (8).
d. Number of protein bands digested.
e. Recovered peptide solution possessed micelles; therefore, HPLC was not performed.
Figure 1. Peptide maps of PVDF-bound soybean trypsin inhibitor (190 pmol) digested with endoproteinase Glu-C in the presence of 50 µl of A) 1% RTX-100/100 mM Tris, pH 8.0, B) 1% octylglucopyranoside/100 mM Tris, pH 8.0, and C) 1% heptylglucopyranoside/100 mM Tris, pH 8.0 as described in Materials and Methods. Digestion in the presence of 1% decylglucopyranoside was not analyzed by HPLC due to the presence of micelles.
RESULTS AND DISCUSSION

To evaluate whether heptyl, octyl, nonyl, or decyl glucopyranosides were suitable for replacing RTX-100, several lanes of PVDF-bound transferrin (53 pmol) and soybean trypsin inhibitor (STI) (190 pmol) were digested with trypsin and endoproteinase Glu-C, respectively. The calculated peptide recoveries are summarized in Table I for both proteins. Nonylglucopyranoside was omitted from this study because the persistent presence of micelles in the recovered supernatant prevented analysis by HPLC. As can be seen, there does not appear to be a significant difference between the RTX-100 and the decylglucopyranoside peptide recoveries, whereas there does seem to be a slightly lower recovery when digestion is performed in the presence of octylglucopyranoside. Endoproteinase Glu-C digestion of STI with decylglucopyranoside generated micelles in the recovered supernatant, which could not be analyzed by HPLC. Surprisingly, there was no peptide recovery when heptylglucopyranoside was used as the detergent. This could be due to the inability of the detergent either to prevent enzyme adsorption to the membrane or to elute the peptides from the membrane (2-3). Trypsin (Figure 2A-C) or endoproteinase Glu-C (Figure 1) digestion of PVDF-bound protein in the presence of RTX-100, octylglucopyranoside, or decylglucopyranoside produced reproducible peptide maps with regard to the number of peaks and their retention times. In fact, digestions in the presence of RTX-100 (Figures 1A and 2A) are identical to results previously observed (3). The only significant differences are the quantitative recovery of certain peptides in the presence of octylglucopyranoside (Figures 1B and 2B) and the failed digestion in heptylglucopyranoside (Figure 1C). Digestions of blank PVDF demonstrated that there are no additional contaminants associated with any of the glucopyranoside detergents (data not shown).
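The sample amounts quoted in the legend to Figure 2 follow from the Table I recoveries by simple arithmetic. The sketch below reproduces them under the assumption that the legend values were derived from the RTX-100 recovery for transferrin; the function name is hypothetical.

```python
def pmol_recovered(loaded_pmol, recovery_percent):
    """Peptide recovered from the membrane, given the amount loaded and percent recovery."""
    return loaded_pmol * recovery_percent / 100.0

loaded = 53.0        # pmol of PVDF-bound transferrin (Table I, footnote a)
recovery_rtx = 60.0  # percent recovery with RTX-100 (Table I)

recovered = pmol_recovered(loaded, recovery_rtx)
hplc_pmol = 0.90 * recovered            # 90% of the digest goes to HPLC
maldi_fmol = 0.005 * recovered * 1000   # 0.5% of the digest goes to MALDI-TOF, in fmol

print(f"{hplc_pmol:.1f} pmol to HPLC, {maldi_fmol:.0f} fmol to MALDI-TOF")
```

These values round to the approximately 29 pmol and approximately 150 fmol given in the legend.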
Figure 2. Peptide maps (A-C) and MALDI-TOF mass spectra (D-F) of PVDF-bound transferrin (53 pmol) digested with trypsin in the presence of 50 µl of 1% RTX-100/10% acetonitrile/100 mM Tris, pH 8.0 (A, D), 1% octylglucopyranoside/10% acetonitrile/100 mM Tris, pH 8.0 (B, E), and 1% decylglucopyranoside/10% acetonitrile/100 mM Tris, pH 8.0 (C, F) as described in Materials and Methods. Ninety percent of the digestion was analyzed by HPLC (~29 pmol based on Table I) and 0.5% (~150 fmol) was used for MALDI-TOF mass spectrometry. Peptides #1 and #2 in A-C were amino terminally sequenced (Table II) and analyzed by MALDI-TOF mass spectrometry (Figure 3).

MALDI-TOF mass spectrometric analysis was performed with approximately 150 fmole from each of the trypsin digestions of transferrin prior to HPLC and is shown in Figure 2D-F. Trypsin digestion of transferrin should yield approximately 60 peptides of four amino acids or longer, with calculated masses of 457-4650 daltons. As can be seen, there are no peptide signals in the presence of any of the detergents. However, Figure 2D demonstrates the primary problem with Triton, i.e., RTX-100/Tris complexes with a mass range of 400-1000 daltons that obscure peptides in this region. The glucopyranoside detergents show only a few low mass signals, but no peptide signal. In contrast, MALDI-TOF mass spectrometric analysis of HPLC purified peptides (150 fmole) from each of the digestions (Figure 3) shows a strong peptide signal. This indicates that detergents and/or Tris are suppressing the peptide signal. Larger quantities of peptide/detergent mixtures (approximately 1.5 pmol), lower concentrations of detergents, and substitution of ammonium bicarbonate for Tris produced similar spectra, although digestion in the presence of ammonium bicarbonate was very poor. These results are in contrast to the successful MALDI-TOF mass spectrometric analysis obtained for tryptic peptides (800 fmole) in octylglucopyranoside detergent (5). ESI mass spectrometry was performed on approximately 32 pmol of peptide mixture in decylglucopyranoside buffer without success (data not shown). PVDF-bound transferrin was also digested in lower concentrations of detergent and with ammonium bicarbonate substituted for Tris; although these conditions are optimal for ESI mass spectrometric analysis, the digestion was not successful as determined by HPLC analysis (7). Amino terminal sequence analysis of 90% of peaks #1 and #2 from Figure 2 was performed and the data are shown in Table II. MALDI-TOF mass spectrometry of approximately 150 fmole of the remaining 10% of each peak was
[Mass spectra: panels A-F, signal intensity versus mass.]
Figure 3. MALDI-TOF mass spectrometry of peaks #1 (A-C) and #2 (D-F) purified after digestion of PVDF-bound transferrin using 1% RTX-100 (A, D), octylglucopyranoside (B, E), and decylglucopyranoside (C, F) as shown in Figure 2. Approximately 0.5% of the purified peptide (~150 fmole) was mixed with alpha-cyano-4-hydroxycinnamic acid and analyzed by MALDI-TOF mass spectrometry as described in Materials and Methods. Ninety percent of the peptide was amino terminally sequenced (Table II).
Table II: Amino Terminal Sequence Analysis of Peptides Purified after Trypsin Digestion of PVDF-Bound Transferrin in the Presence of Various Detergents

                                     RTX-100        OGP(c)         DGP(c)         Calculated
Peak(a)   Sequence(b)                I.Y.(d) (pmol) I.Y.(d) (pmol) I.Y.(d) (pmol) Mass
#1A       YLGEEYVK                   13.3           10.3           10.6           1000.22
#1B       D(S)AHGFLKVPPR              3.4            2.5            2.4           1324.40
#1C       (S)VIPSDGPSVA(C)V(K)        2.4            2.4            3.8           1358.71
#2        EGGYGYTGAFR                 5.8            4.2            7.2           1283.61

a. Sequence analysis of peak #1 resulted in three peptides.
b. Amino acids in parentheses could not be identified during sequence analysis.
c. OGP = octylglucopyranoside; DGP = decylglucopyranoside.
d. Initial yield was determined by the background corrected value of the first identified amino acid.
performed with the results shown in Figure 3. Peak #1 resulted in one major sequence and two minor sequences (4:1 major:minor molar ratio) while peak #2 produced a single sequence. There was no significant difference in the initial yields of purified peptides obtained with RTX-100, octylglucopyranoside, or decylglucopyranoside detergents. The observed masses for peptides #1A, #1B, and #2 were identical to the calculated masses. However, the calculated mass for peptide #1C was not observed in Figure 3A-C, while an extra mass of about 2494.1 daltons was found. The reason for this discrepancy is unknown. It should be noted that peptides #1A and #1B produced approximately equal intensity peptide signals on MALDI-TOF mass spectrometry, yet sequence analysis revealed molar ratios of four to one.

CONCLUSIONS

In summary, octylglucopyranoside can be substituted for RTX-100 with only a slight decrease in peptide recoveries. Decylglucopyranoside may be used in place of RTX-100 when acetonitrile is added to the digestion buffer, but is not useful for endoproteinase Glu-C. Heptyl and nonyl glucopyranosides are not suitable alternatives, as no peptides were recovered with the former and micelle formation occurred with the latter. MALDI-TOF mass spectrometric analysis of less than 2 pmol of peptide/detergent mixtures, which is a realistic goal for screening peptide mixtures before HPLC purification, is not feasible at this time with the currently available instrumentation. Although glucopyranosides produce much cleaner spectra and can substitute for RTX-100 in digestion efficiency, there is still a considerable signal suppression problem at 1% concentrations. Software and/or instrument revisions that would allow the user to disable the detector during sample acquisition below a mass range of 500 are needed (B. T. Chait, personal communication). This would provide a means of addressing the signal suppression problem of detergents in MALDI-TOF mass spectrometric analysis, and is currently being addressed by Vestec/PerSeptive Biosystems. Following HPLC purification of transferrin digestions, no difference in sequencing initial yields of the recovered peptides between octylglucopyranoside, decylglucopyranoside, and RTX-100 was observed. Mass spectrometric analysis of these peptides (< 150 fmole) was used to help interpret sequence data; however, it should be emphasized that care should be taken in interpreting the data, e.g., when quantitating molar ratios of peptides within a peak. MALDI-TOF mass spectrometry of purified peptides can be a sensitive and powerful tool in a) prescreening HPLC peaks for sequence analysis to determine if a single species is present and its estimated length, b) further interpretation of multiple sequences, c) assisting in identification of the first amino acid, which is often difficult to assign, and d) possible determination of posttranslational and other amino acid modifications.

ACKNOWLEDGMENTS

Supported in part by NIH Biomedical Shared Instrumentation Grant and by funds provided by the U.S. Army and Naval Office for purchase of equipment. We would like to thank Rachel Ogorzalek Loo and Philip C. Andrews for electrospray ionization mass spectrometry of peptide mixtures. We would also like to thank Michele Kirchner and Quazi Agashakey for amino terminal sequence analysis.
REFERENCES

1. Atherton, D., Fernandez, J., DeMott, M., Andrews, L., and Mische, S.M. (1993) in Techniques in Protein Chemistry IV (Angeletti, R.H., Ed.) pp 409-418, Academic Press, New York.
2. Fernandez, J., Andrews, L., and Mische, S.M. (1994) Anal. Biochem. 218, 112-117.
3. Fernandez, J., Andrews, L., and Mische, S.M. (1994) in Techniques in Protein Chemistry V (Crabb, J., Ed.) pp 215-222, Academic Press, New York.
4. Beavis, R.C., and Chait, B.T. (1992) Org. Mass. Spectrom. 27, 156-158.
5. Vorm, O., Chait, B.T., and Roepstorff, P. (1993) Proceedings of the 41st ASMS Conference on Mass Spectrometry and Allied Topics, 621-622.
6. Schey, K., and Finch, J.W. (1993) Proceedings of the 41st ASMS Conference on Mass Spectrometry and Allied Topics, 654-655.
7. Ogorzalek Loo, R.R., Dales, N., and Andrews, P.C. (1994) Submitted to Protein Science.
8. Fernandez, J., DeMott, M., Atherton, D., and Mische, S.M. (1992) Anal. Biochem. 201, 255-264.
9. Hewlett-Packard User's Guide for the HP G1000S Protein Sequencing System.
IN GEL DIGESTION OF SDS PAGE-SEPARATED PROTEINS: OBSERVATIONS FROM INTERNAL SEQUENCING OF 25 PROTEINS
Kenneth R. Williams and Kathryn L. Stone W.M. Keck Foundation Biotechnology Resource Laboratory, Howard Hughes Medical Institute, Yale University, New Haven, CT
I. Introduction

Although numerous approaches may be taken to obtain internal amino acid sequences from SDS PAGE-separated proteins (1), in situ gel digestion is particularly attractive in that it avoids preliminary procedures such as elution or blotting that may result in significant loss of protein. In situ gel protease digestions have been carried out both in the absence (2) and presence of detergents such as SDS (3) or Tween 20 (4, 5). In all instances, the protease is passively diffused into the gel and the resulting peptides are allowed to diffuse out of the gel. Because of the simplicity of the Rosenfeld et al. (5) procedure, and because preliminary testing on "standard" proteins suggested the yield of peptides obtained from this procedure was often better than that of another approach we had been using (2), we evaluated how well the Rosenfeld et al. (5) approach worked on 25 "unknown" proteins submitted to the Keck Facility for internal sequencing. The goals of this study were to compare different approaches to deriving internal sequences from SDS PAGE purified proteins, to identify and optimize critical parameters in this procedure and, finally, to establish realistic expectations with regard to the data generated from varying amounts of unknown proteins.
II. Materials and Methods

A. Preparation of Samples for In Situ Gel Digestion

Samples were prepared by running them on either 1D (6) or 2D (7) SDS polyacrylamide gels. In general, samples were over-loaded to the maximum extent possible that allowed the separation that had to be achieved. Prior to SDS PAGE, samples were concentrated by any of the following procedures:

1. Adding the required amount of SDS gel sample buffer and then reducing the final volume in a Speedvac (note that samples containing up to 0.5 - 0.75 M salt can often be successfully fractionated by SDS PAGE).
2. Adding SDS to a final concentration of 0.05% and then dialyzing versus 1 mM NH4HCO3, 0.05% SDS followed by lyophilization in a Speedvac and reconstitution in SDS PAGE sample buffer.
3. Using an SDS polyacrylamide gel containing a funnel-shaped well that allows samples to be loaded in volumes as large as 300 µl (8).
4. TCA precipitating/acetone washing the sample after reducing its volume to 50 µl in a Speedvac (2).
Gels were stained for the minimum time needed to visualize the bands of interest (typically less than an hour) with 0.1% Coomassie Blue in 10% acetic acid, 50% methanol followed by destaining in the same solvent for a minimum of 2 hours (with shaking and several changes of solvent). The bands of interest (along with an equal volume section of gel that did not contain protein and that served as a "blank" control) were excised from the gel, placed in Eppendorf tubes and shipped on dry ice to the Keck Facility. Upon receipt, 10-15% aliquots of each sample and blank gel piece were hydrolyzed for 16 hrs at 110°C in 6 N HCl containing 0.2% phenol and quantified by amino acid analysis using a Beckman 6300 Analyzer. After subtracting the amount of protein in the blank gel piece (which typically ranges from 0-0.2 µg), the amount and density (i.e., µg/mm³) of the remaining protein were calculated (1) and a decision was then made regarding whether to proceed or to request that additional protein be provided.
B. In Situ Gel Digestion with a Modified Rosenfeld et al. (5) Procedure

1. Sample and blank gel pieces were cut into approximately 1 x 2 mm sections, placed in 1.5 ml Eppendorf tubes and covered with ~150 µl/gel band of 50% CH3CN/0.2 M ammonium carbonate, pH 8.9. After washing for 20 minutes at room temperature (RT) on a tilt table, the wash solution was removed and this step repeated once (note: if a tilt table is not available, the samples may be intermittently vortexed).
2. The volume of the gel pieces was reduced by approximately 50% by lyophilizing for ~5 min in a Speedvac.
3. A volume of 33 µg/ml modified trypsin (Promega) or lysyl endopeptidase (Wako) in buffer A (0.2 M ammonium carbonate, pH 8.9, 0.02% Tween 20) was added that was approximately equal to the original gel slice volume (in terms of mm³). The average substrate/protease (w/w) ratio used for <200 pmol samples was 4/1. To ensure complete digestion of a wide range of proteins we have used the highest possible final [protease] consistent with minimizing autolysis and with remaining above a substrate/protease (mol/mol) ratio of 1. Both the sample and blank gels were "digested" in the same manner. If necessary, additional buffer A was added to completely immerse the gel pieces.
4. The gel pieces were then incubated at 37°C for 24 hours.
5. After digestion, about 50 µl buffer B (0.2 M ammonium carbonate, pH 8.9) was added to ensure immersion of the gel pieces.
6. Sufficient 45 mM DTT (~15 µl) was added to bring the final [DTT] to ~5 mM; each sample was then heated at 50°C for 20 min to reduce disulfides.
7. Alkylation was then carried out by adding the same volume of 100 mM iodoacetic acid as was added of DTT, followed by incubating in the dark at RT for 20 min.
8. Peptides were extracted by adding 100 µl 0.1% TFA, 60% CH3CN (or more, if necessary to completely immerse the gel pieces) followed by a 30 min incubation at RT on a tilt table or with intermittent vortexing.
9. After sonicating for 5 min in a water bath sonicator, the peptide extract was removed, steps 8-9 were repeated and the two peptide containing extracts pooled.
10. The volume of all samples was reduced to 50 µl or less in a Speedvac (to lower the CH3CN concentration) and water was then added to bring the volume to 100 µl prior to microfuging the samples through a 0.22 µm Ultrafree-MC filter (Millipore).
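The reagent volumes in steps 6 and 7 follow from a simple dilution calculation. The sketch below is illustrative: the 120 µl digest volume is a hypothetical value chosen so that the result matches the ~15 µl of DTT quoted in step 6, and the function names are not from the original procedure.

```python
def stock_volume_needed(stock_mM, target_mM, sample_ul):
    """Volume of stock (µl) to add so the mixture reaches target_mM.
    Solves stock_mM * v = target_mM * (sample_ul + v) for v."""
    if target_mM >= stock_mM:
        raise ValueError("target concentration must be below the stock concentration")
    return target_mM * sample_ul / (stock_mM - target_mM)

def final_concentration(stock_mM, added_ul, total_ul):
    """Concentration of a reagent after dilution into total_ul."""
    return stock_mM * added_ul / total_ul

digest_ul = 120.0  # hypothetical liquid volume after step 5 (assumption)

dtt_ul = stock_volume_needed(stock_mM=45.0, target_mM=5.0, sample_ul=digest_ul)
iaa_ul = dtt_ul    # step 7: same volume of 100 mM iodoacetic acid as of DTT
total_ul = digest_ul + dtt_ul + iaa_ul
iaa_final_mM = final_concentration(100.0, iaa_ul, total_ul)
molar_excess = (100.0 * iaa_ul) / (45.0 * dtt_ul)  # iodoacetic acid over DTT

print(f"DTT to add: {dtt_ul:.1f} µl; final iodoacetic acid: {iaa_final_mM:.1f} mM; "
      f"molar excess over DTT: {molar_excess:.1f}-fold")
```

Because equal volumes of the two stocks are added, the iodoacetic acid is always present in about a 2.2-fold molar excess over DTT, whatever the digest volume.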
C. Reverse Phase HPLC Separation of Enzymatic Digests

Digests were injected onto an HP 1090 HPLC equipped with an Isco Model 2150 Peak Separator and a 25 cm Vydac C-18 (5 micron particle size, 300 Å pore size) column equilibrated with 98% buffer A (0.06% TFA) and 2% buffer B (0.052% TFA, 80% acetonitrile) (2, 9, 10). The peptides were then eluted with the following gradient program: 0-60 min (2-37% B), 60-90 min (37-75% B) and 90-105 min (75-98% B) and were detected by their absorbance at 210 nm. Amounts of digests in the 25-250 pmol range generally were fractionated on 2.1 mm ID columns eluted at 0.15 ml/min, while larger amounts were fractionated on 4.6 mm columns eluted at 0.5 ml/min. Fractions were collected in capless Eppendorf tubes that were positioned on the tops of 13 x 100 mm test tubes and that were capped within a few hours of their collection (to prevent evaporation of the acetonitrile). Under these conditions, several fractions have been successfully sequenced even after being stored at 4°C for several months.
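The gradient program above can be written as a table of breakpoints and interpolated to estimate the programmed solvent composition at any time. The sketch below is a minimal illustration: it gives the composition programmed at the pump and ignores system dwell volume, and the function names are hypothetical.

```python
# (time in min, percent buffer B) breakpoints of the gradient program in the text.
GRADIENT = [(0.0, 2.0), (60.0, 37.0), (90.0, 75.0), (105.0, 98.0)]
ACN_IN_B = 80.0  # percent acetonitrile in buffer B; buffer A contains none

def percent_b(t_min):
    """Programmed percent B at time t_min, by linear interpolation between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)

def percent_acetonitrile(t_min):
    """Programmed acetonitrile content of the mobile phase at time t_min."""
    return percent_b(t_min) * ACN_IN_B / 100.0

for t in (0, 30, 60, 90, 105):
    print(f"{t:>3} min: {percent_b(t):5.1f}% B, {percent_acetonitrile(t):5.1f}% CH3CN")
```

Since buffer B is only 80% acetonitrile, the acetonitrile concentration in a collected fraction is lower than its %B value, which is relevant when deciding how much a fraction must be diluted or dried before further handling.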
D. LDMS and Peptide Sequencing In general, ~6 of the most symmetrical, latest eluting HPLC peaks (not present in the blank digest) from each digest were chosen for laser desorption mass spectrometry (LDMS) with the goal of finding at least two peptides suitable for sequencing. In each case, the total volume of the peak was recorded (as determined by pulling the sample up in a 20 JLII Pipetman) and 2 x 1.5 /xl (ie., - 3 % of the average total sample volume) was added on top of 1.0 fil of acyano-4-hydroxy cinnamic acid (aCHCA) matrix solution that had just been spotted onto a new target. Mixing of the matrix was accomplished by repeatedly pulling the sample into and expelling it from a pipette. The samples were then allowed to air dry at room temperature. To avoid any possible crosscontamination, all targets were used only once. The a-CHCA matrix solution was prepared at a concentration of-10-20 mg/ml in 40% CH3CN/ 0.1% TFA and was used after vortexing and standing for a few minutes. In general, matrix solutions were stored for up to a maximum of 2-days at -20°C. The calibrants used for external calibration were gramicidine S (m/z = 1142.5) and insulin (m/z = 5734.5). Both calibrants were stored at -20°C as 10 pmol/jil stocks in either 50% CH3CN, 0.1% TFA, or 0.1% TFA. LDMS was carried out on a VG/Fisons TofSpec mass spectrometer that was operated in the +ve linear ion mode at an accelerating voltage of 25 kV and that was equipped with a nitrogen laser (337 nm) and a 0.65 m linear flight tube. This instrument accepts a 15 position sample plate with the location of the laser beam within each 1.5 mm sample position being controlled by varying the X and Y coordinates of the beam. The laser beam is typically aimed at several different spots within each position. Once a signal is observed, the fine energy is gradually lowered until the resolution is optimized. Routinely, -30 shots were averaged for each spectrum, with 3-6 spectra being acquired for each sample. 
A 2-point external calibration file was created for each sample plate using a mixture of 10 pmol each of gramicidin S and insulin. Once a representative data file was chosen for a sample position, the centroid masses were obtained and the resulting spectrum was printed. In the case of those peaks selected for Edman degradation, the appropriate fraction was loaded directly (without prior concentration) onto an Applied Biosystems Model 470 or 477 sequencer operated according to the manufacturer's recommended protocol. If 100% of the fraction was loaded, the empty tube was then rinsed with 30 µl neat trifluoroacetic acid, which was then applied to the polybrene-coated filter containing the sample. Finally, 50 pmol of an internal sequencing standard (11) was applied on top of each sample filter prior to loading it onto the instrument. Immediately following sequencing, all peptide sequences were searched via the "Blast" e-mail server operated by the National Center for Biotechnology Information (12).
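The two-point external calibration described above can be illustrated with a short sketch. It assumes the usual linear time-of-flight relation, t = a·sqrt(m/z) + b, which the text does not state explicitly; the flight times are hypothetical values chosen only for illustration and are not instrument data.

```python
import math

# Calibrant m/z values quoted in the text.
GRAMICIDIN_S_MZ = 1142.5
INSULIN_MZ = 5734.5


def fit_two_point(t1, mz1, t2, mz2):
    """Solve t = a * sqrt(m/z) + b for (a, b) from two calibrant peaks."""
    a = (t2 - t1) / (math.sqrt(mz2) - math.sqrt(mz1))
    b = t1 - a * math.sqrt(mz1)
    return a, b


def time_to_mz(t, a, b):
    """Convert a flight time to m/z using the fitted constants."""
    return ((t - b) / a) ** 2


# Hypothetical flight times in microseconds, for illustration only.
T_GRAMICIDIN_S = 10.0
T_INSULIN = 22.0

a, b = fit_two_point(T_GRAMICIDIN_S, GRAMICIDIN_S_MZ, T_INSULIN, INSULIN_MZ)
# A peak arriving between the two calibrants maps to an m/z between them.
print(f"m/z at 16.0 us: {time_to_mz(16.0, a, b):.1f}")
```

With only two calibrants the fit is exact at both calibration points, so any error appears between and beyond them; this is one reason an external calibration of this kind gives limited mass accuracy.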
In-Gel Digestion and Sequencing of 25 Proteins
III. Results
A. In Situ Gel Digestion Results from 25 "Unknown" Proteins
Table I summarizes the results from the 25 proteins that have been submitted recently to the Keck Facility for internal sequencing and that were subjected to the modified Rosenfeld et al. (5) procedure described in Materials and Methods. It is apparent that the overall success rate for these in gel digestions was very high. That is, only one of the 25 samples failed to yield numerous HPLC peptide peaks suitable for sequencing. The failed sample contained a total of 250 pmol based on amino acid analysis and was submitted at the lowest protein/gel volume density (i.e., the density of this sample was 0.017 µg/mm³ as opposed to 0.03 µg/mm³ or above for the remaining 24 samples). We suspected this digest failed due either to the substrate concentration being too low or to the difficulty inherent in working with such a large volume of gel (i.e., 2,120 mm³). That one of the latter reasons was responsible for this failure is strongly suggested by the fact that when a lesser amount (200 pmol) of this same protein was submitted at a 3.4-fold higher density (0.058 µg/mm³), the digest proceeded well and 19 residues were positively called in the first peptide that was sequenced. As a result, we recommend that proteins be isolated at the highest possible protein/gel volume ratio. Twenty-four "unknown" (and several standard) proteins have been successfully digested and sequenced at a protein/gel volume ratio of >0.03 µg/mm³; therefore, we use the latter as the minimum recommended protein/gel density (following staining). To reach this latter density typically requires that at least 1-2 µg of the protein of interest be loaded/lane in a 0.75-1 mm thick gel.
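The density recommendation above comes down to simple arithmetic, sketched here. The function names and the gel volumes are hypothetical; the 0.03 µg/mm³ threshold and the 34 pmol / 62 kD example (about 2.1 µg) are taken from this chapter.

```python
MIN_DENSITY_UG_PER_MM3 = 0.03  # minimum recommended density quoted in the text


def protein_mass_ug(pmol, mass_kd):
    """pmol x kD gives nanograms; divide by 1000 to get micrograms."""
    return pmol * mass_kd / 1000.0


def gel_density(pmol, mass_kd, gel_volume_mm3):
    """Protein density in the excised gel, in micrograms per cubic mm."""
    return protein_mass_ug(pmol, mass_kd) / gel_volume_mm3


def meets_minimum(pmol, mass_kd, gel_volume_mm3):
    """True if the band is at or above the recommended minimum density."""
    return gel_density(pmol, mass_kd, gel_volume_mm3) >= MIN_DENSITY_UG_PER_MM3


# 34 pmol of a 62 kD protein, as in Figure 1 of this chapter.
print(round(protein_mass_ug(34, 62), 1))

# Hypothetical gel volumes (mm^3), chosen only to show both outcomes.
print(meets_minimum(34, 62, 50.0))
print(meets_minimum(34, 62, 100.0))
```

The same amount of protein passes or fails the check depending only on how much gel it is spread through, which is the point of the recommendation.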
The average percent initial sequencing yield of the 46 peptides summarized in Table I was 14% as calculated from the ratio of the initial pmol peptide sequencing yield to the amount of protein that was actually digested (as determined by amino acid analysis of an aliquot of the gel piece). As suggested by the mean (and particularly the median) initial yields reported in Table I, the initial sequencing yield appears to correlate more strongly with the density as opposed to the total amount of protein in the gel. The latter again emphasizes the importance of doing everything possible to maximize the protein density in the polyacrylamide gel matrix. Since neither the initial sequencing yield nor the overall success rate appears to correlate well with the total amount of protein submitted in the gel, it appears that we have not yet reached the lower limits of this approach (in terms of the least amount of protein that is required to succeed). So far, the least amount of an unknown protein we have subjected to in gel digestion was 26 pmol (2.8 µg) of a 106 kD protein that provided 25 residues of positively called sequence from the 2 peptides sequenced. Database searching of these two sequences indicated they matched published sequences for α-actinin. HPLC profiles that resulted from the in gel tryptic digest of 34 pmol (2.1 µg) of a 62 kD protein are shown in Fig. 1. At this low level there are at least 5-6 peaks present in the blank chromatogram that are similar in size to peaks
Kenneth R. Williams and Kathryn L. Stone
Table I. Summary of results from the in gel tryptic digestion of 25 proteins

                                                            Amount of Protein Digested (pmol)
Parameter                                                   <50     51-100  101-200  >200    Total
Number of proteins digested                                 5       5       8        7       25
Average mass of protein (kD)                                70      36      65       66      60
Average amount of protein digested (pmol)                   38      84      146      322     161
Average density of protein band (µg/mm³)                    0.14    0.054   0.25     0.40    0.23
Number of peptides sequenced                                7       10      19       10      46
% Peptides sequenced that provided >6 positive residues     100     91      83       91      89
Average % initial yield*                                    25.1    6.7     13.1     15.2    14.0
Median % initial yield*                                     11.1    2.9     9.1      13.3    10.0
Average number of residues called/peptide sequenced         15.0    12.0    11.8     14.8    13.0
% "Unknown" proteins identified via database searches       60      20      38       50      46
Overall digest success rate**                               100     100     100      86      92

*Calculated from the ratio of the initial sequencing yield to the amount of protein that was digested, as judged by hydrolysis/amino acid analysis of a ~10% aliquot of the stained gel piece.
**A digest was scored as a success if at least 12 residues of positively called sequence were obtained from 1-2 peptides.
that are unique to the actual protein digest shown in the lower panel. Although the origin of the peaks in the 30-95 min region of the blank run is not known, the peaks that elute after this point are due to residual Coomassie Blue. By only selecting peaks for further analysis that elute in the 30-95 min "window" and that are unique to the protein digest, it is possible to avoid most artifact peaks that result from trypsin autolysis, Coomassie Blue and other reagents. The impact that routine peptide LDMS "screening" has on internal sequencing is reflected by the fraction of peptides that were subjected to sequencing that provided usable data. That is, with prior LDMS analysis, approximately 90% of the peptides selected for sequencing provided at least 6 residues of useful sequence, with the overall average number of residues sequenced/peptide being 13 (Table I). In a previous study, without LDMS "screening", only 67% of peptides selected for sequencing fell into this same category (13). In the latter study ~17% of the peptides proved to contain mixtures and another ~16% failed to yield any sequence whatsoever (13). Based on our experience, LDMS readily differentiates peptides from Coomassie Blue and other reagent peaks and (by comparison to a table of predicted peptide masses) detects most trypsin autolysis products (that are not present in the blank
Figure 1. Reverse phase HPLC separation of an in gel tryptic digest of 34 pmol (2.1 µg) of a 62 kD protein (lower panel). The upper panel shows the profile that resulted from incubating an equal size slice of polyacrylamide gel that did not contain protein. The 4 peaks in the 96-102 min region are due to residual Coomassie Blue. The digest and HPLC separation were carried out as described in Materials and Methods. Horizontal axis: time (min).
chromatogram). In addition, by only sequencing peptide peaks that have a major/minor LDMS peak ratio that is >10, we have found that LDMS can serve as a valuable criterion (in addition to HPLC absorbance peak shape) of peptide purity. Thus, in our laboratory, LDMS screening of peptide fractions has decreased the average number of peptides that must be sequenced/protein to obtain definitive internal sequences. Finally, the routine mass accuracy (±0.25% with external calibration without a reflectron) of our LDMS analyses is sufficiently high that it allows accurate prediction of the end of peptides sequenced by Edman degradation and sometimes allows tentative assignment of a "missing" residue based on mass comparison. Since the sensitivity of LDMS is so high (i.e., most tryptic peptides can be easily detected in the 20-500 fmol range), we routinely use an average of only ~3% of each peptide fraction for this analysis.
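The mass comparison described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it uses standard average residue masses, the ±0.25% tolerance quoted in the text, and an arbitrary illustrative peptide (LVANLPK) with a hypothetical observed mass of 754.9.

```python
# Standard average residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153
PROTON = 1.0073


def average_mh(sequence):
    """Average MH+ of an unmodified peptide given in one-letter code."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER + PROTON


def within_tolerance(observed, predicted, tol_pct=0.25):
    """True if the observed mass agrees with the prediction within tol_pct."""
    return abs(observed - predicted) <= predicted * tol_pct / 100.0


def candidate_missing_residues(observed_mh, called_sequence, tol_pct=0.25):
    """Residues whose mass could account for the gap between the observed
    mass and the mass of the residues called by Edman degradation."""
    deficit = observed_mh - average_mh(called_sequence)
    tol = observed_mh * tol_pct / 100.0
    return sorted(aa for aa, m in RESIDUE_MASS.items() if abs(m - deficit) <= tol)


OBSERVED = 754.9  # hypothetical LDMS mass for the illustrative peptide
print(within_tolerance(OBSERVED, average_mh("LVANLPK")))
print(candidate_missing_residues(OBSERVED, "LVANLP"))
```

At this tolerance, residues of similar mass are generally not distinguished from one another, which fits the text's description of such an assignment as tentative.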
B. Suggestions for Optimizing Internal Sequencing from In Situ Gel Digests
1. Quantify the amount of protein prior to digestion
Hydrolysis and ion exchange amino acid analysis of an aliquot of the gel band provides a more accurate estimate of the amount of protein than can be obtained by comparing relative Coomassie Blue staining intensities. By detecting samples that contain too little protein or that are at too low density before digestion and HPLC, the investigator can be warned that the pending digest is likely to fail and that additional material should be isolated before proceeding. Obviously, this approach is likely to increase the success rate of in gel digestions. In addition, by quantifying the amount of protein that has been digested, it is possible to calculate the overall recovery of any peptide sequenced and hence, often determine whether it was derived from the major component in a gel band. That is, if the recovery of a given peptide (in terms of pmol sequencing yield/pmol of protein digested) is above average, clearly, it must have derived from a major component in the preparation.
2. Carry out a positive and negative digest control
While the negative control (no substrate protein) is helpful in quickly identifying reagent peaks and autolysis products derived from the protease, the positive control (i.e., a gel slice containing a similar amount of a standard protein such as transferrin) is useful for continuous optimization of the procedures that are being used, for verifying the activity of the protease and for providing a "benchmark" against which the yield of unknown peptides (as judged by the average peak heights obtained from the HPLC tracing) can be quickly compared.
3. "Screen" selected peptide peaks by LDMS prior to sequencing
As indicated above, there are many advantages to LDMS analysis of peptide peaks that are candidates for Edman degradation. Since we have previously sent numerous LDMS targets for analysis by an "outside" facility, we know it is quite possible to use this approach even if a laser desorption mass spectrometer is not immediately available.
4. Routinely include an internal sequencing standard with all samples
As detailed in reference 11, this practice allows the continuous optimization/on-line monitoring of sequencer performance that is so critical to being able to routinely sequence the <5 pmol peptide (based on initial sequencing yield) amounts that might be expected from in gel digests on <50 pmol protein. In addition, when an occasional blocked peptide (or more commonly, blocked eukaryotic protein) is sequenced, the internal sequencing standard immediately verifies that the failure to obtain a sequence was not the result of an instrument malfunction. Often, the latter determination would otherwise require the sequencing of a standard protein, thus using up valuable instrument time and forestalling detection of the problem.
5. Search all peptide sequences against all available databases
The importance of this recommendation follows from the observation that currently we are identifying from 50-60% of all "unknown" proteins submitted for in gel digestion based on the search of the first peptide sequence obtained (see Table I and reference 14). Since in the majority of instances, these proteins
Table II. Summary of peptide recoveries using several internal amino acid sequencing strategies

Procedure             Ref.      # Proteins  Avg. Amt. Protein  Amounts        Avg. Seq.     Avg. %
                                Digested    Digested (pmol)    Based on:      Yield (pmol)  Recoveryᵃ
Membrane/NC           (15)      2           340ᵇ               ?              35            10
Membrane/PVDF         (16)      1           <400ᵇ              ?              24            6.1
Membrane/PVDF or NC   (17)      2V          94                 P.I. estimate  14            15
In gel/no detergent   (1)       39ᶜ         126                AAA            18            14
In gel/Tween 20       Table I   25          161                AAA            23            14

ᵃCalculated from the ratio of the initial sequencing yield/amount of protein digested or loaded onto the gel.
ᵇRefers to the amount of protein subjected to SDS PAGE; the actual amount digested would be the product of this number and the % blotting efficiency.
ᶜOnly includes digests in the 34-200 pmol range.
are abundant cellular proteins that are not thought to be responsible for the activity that is being followed (i.e., the "single" Coomassie Blue-stained band submitted for internal sequencing either proves to contain multiple components or the band that has been selected is not actually responsible for the activity being assayed), it is important to carry out this search at the earliest possible opportunity (see reference 14 for further discussion concerning precautions that can be taken to increase the probability that the candidate band is indeed the correct band and that its purity is sufficient to proceed with the digest).
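The recovery calculation recommended in suggestion 1, and reported in the last column of Table II, can be sketched as below. Taking the ratio of the tabulated averages is an assumption made here for illustration; the tabulated values may instead have been obtained by averaging per-peptide ratios. The row with an upper-bound amount (<400 pmol) is omitted because its ratio is not defined exactly.

```python
def percent_recovery(seq_yield_pmol, protein_pmol):
    """Initial sequencing yield as a percentage of the protein digested."""
    return 100.0 * seq_yield_pmol / protein_pmol


# (average sequencing yield in pmol, average amount of protein in pmol),
# taken from Table II.
TABLE_II = {
    "Membrane/NC (15)": (35, 340),
    "Membrane/PVDF or NC (17)": (14, 94),
    "In gel/no detergent (1)": (18, 126),
    "In gel/Tween 20 (Table I)": (23, 161),
}

for procedure, (seq_yield, amount) in TABLE_II.items():
    print(f"{procedure}: {percent_recovery(seq_yield, amount):.0f}%")
```

For these four rows the ratio of averages rounds to the tabulated recoveries (10, 15, 14 and 14%). For an individual peptide, comparing its recovery with such an average is the check proposed in suggestion 1 for deciding whether it derives from the major component of the band.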
IV. Discussion
Based on the data in Table I, it appears that a laboratory that can routinely sequence <5 pmol peptide (based on initial sequencing yield) can also routinely succeed in obtaining internal sequences from >25 pmol amounts of proteins that have been stained with Coomassie Blue and subjected to in gel digestion. Since some of the data in Table I were generated from a digest that was carried out by a lab that had no prior experience in digesting proteins, the Rosenfeld et al. (5) procedure that we have used can apparently be carried out with little prior expertise. Questions that remain are the "absolute" sensitivity limits of in gel digestion and the comparative efficiency of in gel versus on membrane approaches to internal sequencing. Although the data summarized in Table II (i.e., see the average % recoveries in the last column) suggest the latter two approaches provide similar results, a more definitive answer will require a larger scale, collaborative study by laboratories that are expert in each of these approaches. Since we have not found a significant correlation between the initial peptide sequencing yield and the amount of protein digested (Table I and reference 1), we believe the "sensitivity" of in gel digestion can be extended
below the ~25 pmol amounts included in Table I by using thinner gels with narrower lanes and by going to lower flow rates and narrower HPLC columns.
Acknowledgments
We especially thank the other members of the Protein Chemistry Section of the Keck Facility for carrying out the analyses described in this study: Myron Crawford, Ray DeAngelis, Mary LoPresti, Ed Papacoda, Suzy Samandar and Nancy Williams.
References
1. Williams, K.R., Kobayashi, R., Lane, W., and Tempst, P. (1993) ABRF News 4, 7-12.
2. Stone, K.L. and Williams, K.R. (1993) In A Practical Guide to Protein and Peptide Purification for Microsequencing, second edition (P. Matsudaira, ed.) 43-69.
3. Kawasaki, H., Emori, Y., and Suzuki, K. (1990) Anal. Biochem. 191, 332-336.
4. Ward, L.D., Reid, G.E., Moritz, R.L., and Simpson, R.L. (1990) J. Chrom. 519, 199-216.
5. Rosenfeld, J., Capdeville, J., Guillemot, J., and Ferrara, P. (1992) Anal. Biochem. 203, 173.
6. Laemmli, U. (1970) Nature 227, 680-685.
7. O'Farrell, P. (1975) J. Biol. Chem. 250, 4007-4021.
8. Lombard-Platet, G. and Jalinot, P. (1993) BioTechniques 15, 668-672.
9. Stone, K.L., Elliott, J.I., Peterson, G., McMurray, W., and Williams, K.R. (1990) In Methods in Enzymology (J. McCloskey, ed.) 389-412.
10. Stone, K.L., LoPresti, M.B., Crawford, J.M., DeAngelis, R., and Williams, K.R. (1991) In High-Performance Liquid Chromatography of Peptides and Proteins: Separation, Analysis, and Conformation (C. Mant and R. Hodges, eds.) 669-677.
11. Elliott, J.I., Stone, K.L., and Williams, K.R. (1993) Anal. Biochem. 211, 94-101.
12. Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. (1990) J. Mol. Biol. 215, 403-410.
13. Stone, K.L., McNulty, D.E., LoPresti, M.L., Crawford, J.M., DeAngelis, R., and Williams, K.R. (1992) In Techniques in Protein Chemistry III (R. Angeletti, ed.) 22-34.
14. Stone, K.R. and Williams, K. (1994) In Current Protocols in Protein Science (J. Coligan et al., eds.) in press.
15. Aebersold, R.H., Leavitt, R.A., Hood, L.E., and Kent, S.B. (1987) Proc. Natl. Acad. Sci. USA 84, 6970-6974.
16. Yuen, S., Chui, A., Wilson, K., and Yuan, P. (1989) BioTechniques 7, 74-82.
17. Fernandez, J., DeMott, M., Atherton, D., and Mische, S.M. (1992) Anal. Biochem. 201, 255-264.
Peptide Mapping at the 1 µg Level: In-gel vs. PVDF Digestion Techniques
Lee Anne Merewether, Christi L. Clogston, Scott D. Patterson, and Hsieng S. Lu
Amgen, Amgen Center, Thousand Oaks, CA 91320-1789
I. INTRODUCTION
A number of examples (1-6) have demonstrated the utility of the SDS-PAGE/electroblotting/sequencing strategy to determine the amino acid sequence of low level proteins. However, sequence analysis frequently fails due to a blocked N-terminus, or the determined partial sequence is not useful for oligonucleotide probe design. Either outcome hampers cloning of a novel gene. Therefore, micro-scale isolation of internal peptides from gel separated proteins for subsequent sequence analysis becomes an important task. Several procedures for generating internal peptides have been useful in yielding good internal peptide sequences (2-6). Most reported methods have been successful using 2-10 µg of purified proteins. We have selected and modified two reported enzymatic digestion methods for in-gel and PVDF-blotted proteins and compared their digestion efficiency, RP-HPLC peptide recovery, mass spectra, and sequence yield using <1 µg protein samples loaded onto the gels. We have also used the modified methods to obtain peptide sequences from two unknown proteins isolated from natural sources.
II. METHODS AND MATERIALS
A. SDS-PAGE/electroblotting and enzymatic digestion
All sample proteins were run on either 12% or 14% pre-cast Novex 1.0 mm, 10 well gels under non-reducing conditions according to Laemmli (7). Samples were immediately electroblotted, essentially quantitatively (8), to Immobilon-P PVDF membrane using a semi-dry (MilliBlot-SDE) electroblotter. After blotting, PVDF was washed briefly in HPLC water and stained with 0.05% Brilliant Blue-G Coomassie (BB-G)/20% methanol/0.5% acetic acid or Amido Black (2). The membrane was kept wet and not allowed to dry (9). Enzymatic digestion was performed as described (2) with all digestion and extraction buffer volumes reduced to 25 µL.
B. SDS-PAGE and in-gel enzymatic digestion
SDS-PAGE was performed as described above.
Gels were stained with 0.05% BB-G/20% MeOH/0.5% acetic acid and destained with 30% HPLC grade MeOH. The gel was washed in HPLC grade water overnight. The in-gel digestion was performed according to the methods described (3), a modification of two earlier reports (5,6).
C. In-solution enzymatic digestion
Sample concentrations were determined by amino acid analysis. A 25 µg sample was aliquoted and dried in vacuo. Samples were digested in the presence of 1% Triton or 0.01% Tween-20 according to the methods described (2,3) except with the addition of 2 M urea in the digestion buffer. After digestion, samples were diluted to 40 µg/mL with 0.1% TFA. A 1 µg aliquot was reserved for peptide mapping.
D. Reversed-phase peptide mapping
Protein digests were analyzed by reversed-phase chromatography on a Hewlett Packard 1090A/M HPLC equipped with a diode-array detector and ChemStation. They were analyzed (100-200 µL for PVDF and 20-30 µL total volume for in-gel digests) on a SynChrom C4 (2.1 x 50 mm) column (3) at a flow rate of 0.146 mL/minute at ambient temperature, and monitored at 215 and 280 nm. When used, the anion exchange column was placed in series before the C4 column and removed approximately 7 minutes into the isocratic portion of the gradient. Buffer A was aqueous 0.1% TFA. Buffer B was 0.1% aqueous TFA in 90% acetonitrile. The gradient used was 3%B (0.01-10 minutes), 3-18%B (10-26 minutes), 18-55%B (26-86 minutes), 55-80%B (86-100 minutes). Peptides were collected manually for further analysis.
E. N-terminal sequence analysis
Sequence analysis was performed on either an Applied Biosystems, Inc. 476A protein sequencer, a Hewlett Packard G10005A protein sequencer, or an ABI 477A protein sequencer equipped with a MiChrom microbore HPLC system (10).
F. MALDI mass spectrometric analysis
MALDI-MS was performed using a Kratos Kompact MALDI III mass spectrometer fitted with a standard 337 nm nitrogen laser, and operated in the linear mode at an accelerating voltage of 20 kV. Two sample preparation methods were used: (1) for collected peptides, 0.3 µL aliquots of sample and matrix (α-cyano-4-hydroxycinnamic acid, Biomolecular Separations, Inc.) were mixed on the probe slide and allowed to air dry; or (2) for unfractionated digests, a thin polycrystalline film was prepared according to (11) (with modifications for use on a probe slide (12)), matrix and sample aliquots were mixed (usually 0.3 µL each) on this surface and, prior to drying, rinsed twice with 2 µL of deionized water.
III. RESULTS AND DISCUSSION
A. Method Modifications
The modifications made for the PVDF method were: (a) replacing Coomassie Brilliant Blue R stain with BB-G, which gives a single, late eluting HPLC peak, easily identified by strong absorbance at ~300 and 600 nm; and (b) replacing the 2.1 x 250 mm column with a 2.1 x 50 mm column to improve peptide recovery as described (13). In our experience with Amido Black staining, we found several negative deflections in the chromatographic base line, interfering with peptide collection at very low levels (data not shown). Two modifications were made for the in-gel method: (a) elimination of the anion exchange pre-column, which was used to remove excessive SDS prior to RP-HPLC chromatography (3). In our observation, use of this pre-column resulted in complete or partial loss of certain peptides (data not shown). The anion exchange column was successfully eliminated without compromising chromatography, suggesting the gel washing steps are very efficient in removing SDS; and (b) use of a 2.1 x 50 mm column to improve peptide recovery (13).
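For reference, the RP-HPLC gradient specified in Methods (section D) can be written as a piecewise-linear function of time. The sketch below is only an illustration: the breakpoints are those given in the text, while the function names and the conversion to percent acetonitrile (which assumes Buffer A contains no acetonitrile and Buffer B contains 90%) are additions made here.

```python
# (time in minutes, percent Buffer B) breakpoints from the Methods section.
GRADIENT = [(0.01, 3.0), (10.0, 3.0), (26.0, 18.0), (86.0, 55.0), (100.0, 80.0)]


def percent_b(t_min):
    """Percent Buffer B at time t_min, by linear interpolation."""
    if t_min <= GRADIENT[0][0]:
        return GRADIENT[0][1]
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    return GRADIENT[-1][1]  # hold the final composition after the last step


def percent_acetonitrile(t_min):
    """Approximate acetonitrile content, assuming Buffer B is 90% acetonitrile."""
    return 0.9 * percent_b(t_min)


for t in (5.0, 18.0, 56.0, 100.0):
    print(t, percent_b(t), percent_acetonitrile(t))
```

A function of this kind makes it easy to convert a peptide's retention time into the approximate solvent composition at which it eluted.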
B. Comparative Analysis of Two Digestion Methods using SCF
Evaluation criteria were based on the number of peptides isolated that yielded sequence, the recovery of very hydrophobic or glycosylated peptides, and the compatibility of each method with MALDI mass spectrometric analysis.
It has been noted by us and others (4) that a wide, diffuse band gives less optimal digestion and recovery. Therefore, when excising a band or a gel slice for digestion, cutting was done conservatively. Limiting the area of PVDF or gel involved in this technique improved peptide recovery. Figures 1A and 1B illustrate peptide maps derived from endoproteinase Lys-C digestion of 1 µg non-glycosylated SCF in solution containing Triton or Tween-20, respectively. Figures 1C and 1D compare the maps derived from peptide digestion of PVDF-bound SCF and in-gel digestion after SDS-PAGE, respectively. The 1 µg SCF digests from these two methods produced peptide maps different from those observed for the 1 µg in-solution digests with the corresponding digestion buffers. However, the addition of 2 M urea to the in-solution digestion buffers was essential for protein digestion (data not shown). Table IA lists 15 SCF predicted peptide fragments following Lys-C digestion. The in-solution maps show recovery of ten of fifteen peptides. Two of the ten peptides are disulfide-linked, (1+7) and (5+13). (1+7) was recovered in all four maps. (5+13), a very large disulfide-linked peptide, was present in the in-solution peptide maps but was not detected in either the PVDF or in-gel peptide maps. Although this peptide was not detectable by sequence analysis of the isolated peptides from the PVDF and in-gel maps, MALDI-MS analysis of both unfractionated digests detected this peptide (Figure 2). Several very small peptides (2, 8, 9, 11) were not recovered in the in-solution, PVDF, or in-gel peptide maps. Three peptides, 3, 6, and 12, are consistently recovered in all four peptide maps. Several peptides, 4, 15, and (1+7), shifted to earlier retention times in both the PVDF and in-gel peptide maps. These peptides all contain one or more Met residues. MALDI-MS analysis detected a mass increase of 16 Da, suggesting oxidation of these methionines.
Figure 1. Peptide maps of Stem Cell Factor Lys-C digests. A and B: in-solution digests containing Tween-20 and Triton, respectively; C: in-gel method; D: PVDF method. Horizontal axis: time in minutes.
Table I.
A. Predicted SCF peptides derived from Lys-C digestion
Peptide No.  Fragment   MH+ (mass)  Sequence
1ᵃ           (1-14)     1634.9      MEGICRNRVTNNVK
2            (15-18)    462.5       DVTK
3            (19-25)    754.9       LVANLPK
4            (26-32)    884.1       DYMTTLK
5ᵇ           (33-63)    3493.1      YVPGMDVLPSHCWISEMVVQLSDSLTDLLDK
6            (64-79)    1787.9      FSNISEGLSNYSIIDK
7ᵃ           (80-92)    1459.8      LVNIVDDLVECVK
8            (93-97)    564.6       ENSSK
9            (98-100)   375.4       DLK
10           (101)      147.2       K
11           (102-104)  381.5       SFK
12           (105-128)  2945.4      SPEPRLFTPEEFFRIFNRSIDAFK
13ᵇ          (129-149)  2201.4      DFVVASETSDCVVSSTLSPEK
14           (150-157)  892.0       DSRVSVTK
15           (158-166)  943.2       PFMLPPVAA
14-15        (150-166)  1816.2      DSRVSVTKPFMLPPVAA

B. Peptide recovery from the PVDF method based on peak areas
Peptide No.  Run 1  Run 2  Run 3  Average  In-solution  % Recovery
3            67     79     74     73       146          50
4            164    132    107    134      208          64
6            165    201    164    177      376          47
15           90     99     89     100      141          71
(1+7)        155    199    152    169      343          49
12           278    380    370    343      764          45
(5+13)       N.R.   N.R.   N.R.   N.R.     1040         0

C. Peptide recovery from the in-gel method based on peak areas
Peptide No.  Run 1  Run 2  Run 3  Average  In-solution  % Recovery
3            77     76     81     78       136          57
4            200    219    122    180      362          50
6            176    166    191    178      273          65
15           117    99     127    114      171          67
(1+7)        99     92     107    99       249          40
12           526    456    511    498      351          142
(5+13)       N.R.   N.R.   N.R.   N.R.     836          0

ᵃ Peptides 1 and 7 are disulfide linked. ᵇ Peptides 5 and 13 are disulfide linked.
Table II. Comparison of sequencing yields* from PVDF and in-gel SCF Lys-C peptides
Peptide In-solution In-gel % Recovery Peptide In-solution PVDF % Recovery
97 45 23 51 3 38 40 3 44 4 24 49 22 67 4 36 54 26 48 6 64 27 43 6 120 9 8 15 17 11 159 15 N.R. N.R. 11 19 14-15 14-15 67 30 45 57 7 18 31 7 48 19 39 1 44 11 26 1 12 37 32 12 160 11 18 12 N.R. 27 5 N.R. 22 5 N.R. 26 13 N.R. 21 13
N.R. - Not recovered.
* Pmol amounts are given from the first or second PTH amino acid.
The two in-solution maps contain peptide 14-15 generated from incomplete cleavage. Although both maps contain this peptide, it is recovered at different levels based on peak areas and sequencing yields. This suggests that the Tween-20 and Triton in the digestion buffers give the enzyme different accessibility to this proteolytic site. Peptide 14-15 was not detected in either the PVDF or in-gel maps, indicating either that this peptide was not recovered or that the protein was sufficiently denatured for this proteolytic site to be completely accessible. SCF peptide maps generated from the PVDF and in-gel methods were run in triplicate to compare both peak areas and sequencing yields to equivalent in-solution maps. Within each method, the peptide maps showed good reproducibility of peak areas. The recovery, based on peak areas, was calculated from the average peak area for each peptide over all three runs. The average was then compared to the peak area for the corresponding peptide in the in-solution map (Tables IB and IC). Percent recovery for the PVDF peptide map ranged from 45-71%. Percent recovery for the in-gel peptide map ranged from 40-142%. However, it should be noted that the digestion pattern and resulting peptides are not consistent between the PVDF, in-gel and in-solution maps. Therefore an absolute comparison between the methods could not be made. Recoveries based on sequencing yields were calculated from PTH-amino acid values in the first or second cycle for each peptide. The values for the in-gel and PVDF methods were compared to the values obtained in-solution. The pmol recovery for the in-gel method was considerably higher for 4 out of 6 peptides. Total recovery was significantly higher for the in-gel method compared to the PVDF method (Table II). Again, it should be noted that sequencer recoveries are not quantitative, and an absolute comparison of peptide yields may not be made due to differences in enzymatic digestion.
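The peak-area recovery calculation described above (average of three runs divided by the corresponding in-solution peak area) can be reproduced from the in-gel data of Table IC. The sketch below only restates that arithmetic; the variable and function names are illustrative.

```python
from statistics import mean

# Peptide: ([peak areas for runs 1-3], in-solution peak area), from Table IC.
IN_GEL = {
    "3": ([77, 76, 81], 136),
    "4": ([200, 219, 122], 362),
    "6": ([176, 166, 191], 273),
    "15": ([117, 99, 127], 171),
    "(1+7)": ([99, 92, 107], 249),
    "12": ([526, 456, 511], 351),
}


def percent_recovery(run_areas, in_solution_area):
    """Average peak area over the runs as a percentage of the in-solution area."""
    return 100.0 * mean(run_areas) / in_solution_area


for peptide, (runs, reference) in IN_GEL.items():
    print(f"{peptide}: {percent_recovery(runs, reference):.0f}%")
```

Rounded to whole numbers, these ratios give the 40-142% range quoted for the in-gel map. A value above 100%, as for peptide 12, reflects the caveat in the text that the digestion patterns differ between methods, so the in-solution map is not an absolute reference.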
Both methods have been shown to be compatible with MALDI-MS analysis. Very small amounts of the unfractionated protein digests were used for analysis: 0.3% and 1% of the PVDF and in-gel digests, respectively. Tween-20 and Triton produced an envelope of signals that are concentrated in relatively narrow mass ranges (m/z = 1200-1800 and 600-1200, respectively), although the Tween-20 signals were of greater intensity. Both detergents limit the range of detectability. Dilution of the digest reduced both detergent noise levels significantly (data not shown). However, in our experience, although some peptides ionized more efficiently after dilution, others were severely reduced. Dilution was not necessary to obtain multiple peptide signals with both digests (Figure 2). Analysis of both the digest and the isolated peptide fractions detected methionine oxidation (Table III).

Table III. MALDI-MS analysis of collected Lys-C Stem Cell Factor (SCF) peptides
                              Observed Mass (MH+)
Peptide No.  Expected Mass    In-solution  In-solution  In-gel  PVDF
             (MH+)            Tween-20     Triton
3            754              755          N.D.         756     N.D.
4            883              886          886          N.D.    N.D.
6            1787             1786         1790         1791    1789
15           943              960          960          N.D.    961
14-15        1815             1835*        1833*        N.D.    N.D.
(1+7)        3091             3089         3097         3106*   3112*
12           2944             2949         2942         2951    2944
(5+13)       5689             5738*        5745*        N.D.    N.D.

* Mass increase is consistent with oxidation of methionine(s). N.D. = Not detected.
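The reasoning behind the methionine oxidation assignments can be sketched as a simple mass-shift test: each oxidized Met adds one oxygen (about 16 Da). The function below is a hypothetical illustration, not the authors' procedure, and the 0.2% tolerance is an assumed value chosen for the example rather than a figure from the text.

```python
OXYGEN = 15.9994  # average mass of one oxygen atom, Da


def oxidation_state(expected_mh, observed_mh, n_met, tol_pct=0.2):
    """Return the number of oxidized Met residues (0..n_met) that best
    explains the observed mass, or None if no state fits within tolerance."""
    tol = expected_mh * tol_pct / 100.0
    best = None
    for k in range(n_met + 1):
        err = abs(observed_mh - (expected_mh + k * OXYGEN))
        if err <= tol and (best is None or err < best[1]):
            best = (k, err)
    return None if best is None else best[0]


# Values from Table III; Met counts are read from the sequences in Table IA.
print(oxidation_state(943, 960, n_met=1))    # peptide 15, in-solution Tween-20
print(oxidation_state(3091, 3106, n_met=1))  # peptide (1+7), in-gel
print(oxidation_state(2944, 2949, n_met=0))  # peptide 12, contains no Met
```

Restricting the search to the number of Met residues actually present keeps the test from attributing an unrelated mass error to oxidation.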
Figure 2. MALDI-MS analysis of unfractionated SCF Lys-C digests from in-gel (upper) and PVDF (lower) methods. Note prompt fragmentation of (1+7) in lower (12). Horizontal axis: mass/charge; the (5+13) peak is labeled.
With the PVDF method, it is also important to note that Triton contaminated the HPLC system, most notably the flow cell, after multiple injections. In several instances, this was serious enough to produce many high UV-absorbing peaks and prevented collection of the in-progress map. Periodic, extensive washing of the system may help resolve this.
C. Digestion of recombinant Human Erythropoietin (EPO)
1 µg EPO maps were performed using both methods to evaluate the recovery of glycosylated peptides (Figure 3). Lys-C digestion of EPO produces nine expected peptides (Table IV). The peptide maps were comparable in terms of the number of peptides recovered, with the in-gel digestion map producing slightly greater peak areas. Both methods recovered six out of nine predicted peptides. Peptide 4+5 is a disulfide-linked peptide, while peptides 2 and 6 are large and glycosylated (2 contains an O-linked site, 6 contains an N-linked site). The O-linked peptide 2 appears to be recovered at a similar level to other peptides in the PVDF EPO map. Based on peak areas, peptide 6, a very large N-linked glycopeptide, appears to be very poorly recovered in the PVDF peptide map. The in-gel map recovered both the oxidized and unoxidized peptide as denoted by 6 and 6* (6* = oxidized Met54 determined by MALDI-MS (data not shown)). One of the three peptides not recovered in either map contained both a disulfide bond and two N-linked glycosylation sites, suggesting that very large peptides are more difficult to recover.
I
60
Hme in minutes Figure 3. Peptide maps of EPO Lys-C digests from A. PVDF and B. in-gel methods. Table IV. Predicted EPO peptides derived from Lys-C Peptide No. Sequence a LK 1 VNFYAWK 2 EAISPPDAASAAPLRTITADTFRK 3 LFRVYSNFLRGK 4 R APPRLICDSRVLERYLLEAK 5 [LYTGEACRTGDR A AVSGLRSLTTLLRALGAQK A EAENITTGCAEHCSLNENITVPDTK 6 RMEVGQQAVEVWQGLALLSEAVLRGQALLVNSSQPWEPLQLHVDK
digestion
(1 0-linked sugar) (disulfide-linked) (disulfide-linked) (disulfide-linked + 2 N-linked sugars) (1 N-linked sugar)
^ Peptides not recovered from either PVDF or in-gel method. Underlined boldfaced letters indicate N or O-linked sites.
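The predicted peptides in Table IV follow from the specificity of Lys-C, which cleaves on the C-terminal side of lysine. The sketch below is a minimal illustration of that rule, assuming complete cleavage after every Lys and ignoring missed cleavages, disulfide links, and glycosylation. The function name `lys_c_digest` is hypothetical, and the test sequence is simply three Table IV peptides joined end to end for demonstration; it is not offered as the EPO sequence.

```python
def lys_c_digest(sequence: str) -> list[str]:
    """Split a one-letter protein sequence after every lysine (K).

    Simplified model: complete cleavage, no missed sites.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue == "K":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    # Keep any C-terminal stretch that does not end in Lys.
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


# Hypothetical demonstration input built from Table IV peptides.
toy_sequence = "LFRVYSNFLRGK" + "LK" + "LYTGEACRTGDR"
for peptide in lys_c_digest(toy_sequence):
    print(len(peptide), peptide)
```

Comparing such a predicted list with the peaks actually observed is how the "six out of nine" recovery figure is arrived at; very short peptides such as LK are easily lost in the map.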
Figure 4: Peptide map of an unknown 35kD PVDF-bound protein after Lys-C digestion. Sequencing yields were in the 20 pmol range for the numbered peptides, with sequences of 7, 24, 17, and 11 amino acids obtained.
Figure 5: Peptide map of actin, a 40kD protein, after in-gel Lys-C digestion. Peptides 1, 3, and 5 were sequenced on a standard sequencer. Peptides 2 and 4 were sequenced on a modified instrument.
Lee Anne Merewether et al.
Table V. Sequencing results from in-gel Lys-C digestion of actin (<1 µg)

Peptide No.   Initial Yield (pmol)   Sequence                    Cycles
1             5                      YPIEHGIVTNXDDMEK            16
2a            4                      DLYANTVLSGGTTMYPGIADRMQK    24
3             3                      AGFAGDDAPRAVFPSIV...        17
4a            2                      AGFAGDDAPRAVFPSIVR...       18
5             7                      IWHHTFYNELRVAPEEHPVLL...    21

a Sequenced on an ABI 477A/MiChrom LC system (10).
D. Analysis of Unknown Protein Samples

The methods described above were successfully used to analyze two unknown proteins at the <1 µg level. The PVDF method was successful with 1 µg (~30 pmol) of an unknown 35kD protein (Figure 4). Sequence was obtained from four peptides, giving 59 amino acid assignments. The in-gel method was successful with <1 µg (~25 pmol) of an unknown 40kD protein (Figure 5). Sequence was obtained from five peptides, giving 96 amino acid assignments. This unknown was identified as actin based on a computer search (14) for sequence homology (Table V).

V. CONCLUSIONS
In conclusion, we describe two methods suitable for low-level (~lM
REFERENCES
1. Matsudaira, P. (1987) J. Biol. Chem. 262, 10035-10038.
2. Fernandez, J., DeMott, M., Atherton, D., and Mische, S.M. (1992) Anal. Biochem. 201, 255-264.
3. ABRF News Ideas
ENZYMATIC DIGESTION OF PROTEINS IN ZINC CHLORIDE AND PONCEAU S STAINED GELS
Sharleen Zhou* and Arie Admon†
*Howard Hughes Medical Institute, Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720
†Department of Biology, Technion, Haifa 32000, Israel
Introduction

Proteins resolved by SDS-PAGE are often digested with proteases in order to obtain peptides for sequencing. In most cases the proteins are first transferred to either nitrocellulose or PVDF membrane, where the protein bands are proteolyzed in situ (Fernandez et al. 1992 and references therein). More recently a number of procedures have been proposed for digestion of proteins in the gel (Rosenfeld et al. 1992, Kawasaki et al. 1990). Proteolysis on membranes works well after blocking the membrane with PVP-40 or hydrogenated Triton X-100 or both. Proteolysis in the gel is more difficult due to the inhibitory effect of SDS present in the gel. Staining proteins in the gel often causes irreversible fixing of the proteins, resulting in inefficient proteolysis and low recovery of peptides. In addition, some proteins do not electro-transfer well to nitrocellulose or to PVDF membrane. We describe here methods to resolve small amounts of proteins on SDS-PAGE and digest the protein in situ, followed by HPLC purification of the peptides for microsequencing. We compare digestion of proteins in solution, on PVDF membranes, and in gels stained with zinc chloride-imidazole or Ponceau S, together with alkylation by iodoacetamide or 4-vinylpyridine, for their effects on proteolysis efficiency and peptide recovery.
Materials and Methods

BSA, dithiothreitol (DTT), and hydrogenated Triton X-100 were purchased from Calbiochem. Urea was from Pierce. Iodoacetamide was from Sigma and 4-vinylpyridine was from Aldrich. Ammonium bicarbonate was from Fisher. Sequencing grade endoproteases Lys-C and trypsin were from Boehringer Mannheim.

A. Digestion in Solution

BSA was proteolyzed in solution according to Stone et al. (1989). It was denatured and reduced in 8 M urea, 0.4 M ammonium bicarbonate, and 10 mM DTT by heating at 55°C for 15 minutes. One aliquot was not alkylated, another was alkylated with 20 mM iodoacetamide for 15 minutes at room temperature, and the third was alkylated with 4-vinylpyridine: one microliter of 4-vinylpyridine was mixed with 9 µL of 2-propanol, and 2 µL of this mixture was added to 100 µL (250 µg) of protein and incubated at 37°C for 30 minutes. After adding 3 volumes of water to reduce the urea concentration to 2 M, 0.1 µg Lys-C in 100 mM Tris-HCl at pH 8.0 was added to 2.5 µg BSA in each tube and the mixture was incubated at 37°C for 16 hours.
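The quantities in this protocol can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming an approximate BSA molecular weight of 66,400 Da (a value not given in the text); the helper names are hypothetical. It converts the 2.5 µg BSA load to picomoles, confirms the urea concentration after adding 3 volumes of water, and expresses the enzyme-to-substrate ratio by weight.

```python
def micrograms_to_pmol(mass_ug: float, mw_da: float) -> float:
    """Convert a protein mass in micrograms to picomoles."""
    # (ug * 1e-6 g/ug) / (g/mol) * 1e12 pmol/mol = ug / MW * 1e6
    return mass_ug / mw_da * 1e6


def concentration_after_dilution(initial_conc: float, volumes_added: float) -> float:
    """Concentration after adding `volumes_added` volumes of diluent to one volume."""
    return initial_conc / (1.0 + volumes_added)


BSA_MW_DA = 66_400  # approximate molecular weight of BSA; an assumption

bsa_pmol = micrograms_to_pmol(2.5, BSA_MW_DA)
urea_molar = concentration_after_dilution(8.0, 3)
substrate_per_enzyme = 2.5 / 0.1  # ug BSA per ug Lys-C

print(f"BSA load: {bsa_pmol:.0f} pmol")
print(f"Urea after dilution: {urea_molar:.1f} M")
print(f"Enzyme:substrate about 1:{substrate_per_enzyme:.0f} (w/w)")
```

Under this assumed molecular weight the load works out to roughly the 35 pmol quoted for the 2.5 µg samples, and the dilution reproduces the stated 2 M urea.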
B. Digestion in Gel

Similar amounts of unproteolyzed BSA from each of the treatments above were transferred to a new tube and SDS-PAGE sample buffer was added. These three samples were run on an 8-15% gradient gel (0.75 mm thickness). For zinc chloride staining, the gel was incubated in 0.2 M imidazole for 15 minutes and then in 0.3 M ZnCl2 for a few seconds (Fernandez-Patron et al. 1992 and Ortiz et al. 1992). It was then washed in water for 10 minutes. The clear protein bands were cut out on a glass plate placed on a black background and transferred to Eppendorf tubes. In order to remove SDS, zinc ions, and imidazole, the gel pieces were washed twice for 15 minutes with 200 mM ammonium bicarbonate, 10 mM Na2EDTA at pH 8 in 50% acetonitrile at room temperature. The gel pieces were semi-dried in air and the proteins were proteolyzed essentially as described by Rosenfeld et al. (1992). To each tube, 0.1 µg Lys-C or trypsin in 50-100 µL of 100 mM Tris-HCl at pH 8.0 was added to immerse the gel pieces. After the gel expanded, two more volumes of the same buffer were added and the tubes were kept mixing at 37°C for 16 hours. For proteolysis after Ponceau S staining, the gel was run as above and stained with 0.1% Ponceau S in 1% acetic acid for a few seconds. The gel was destained with 5% acetic acid and washed with water as above, and the pink protein bands were cut out and treated in the same way as the zinc chloride stained gel.

C. Digestion on PVDF Membrane
The proteins were electroblotted to Immobilon-P membrane (Millipore) in 20 mM Tris, 150 mM glycine, and 20% methanol. The membrane was stained with 0.1% Ponceau S in 1% acetic acid for 1 minute and washed with water. The bands were excised and digested according to the method of Fernandez et al. (1992).

D. RP-HPLC and Peptide Sequencing
The peptides obtained from the different proteolysis methods were separated on a Brownlee Aquapore RP-300 column (1 x 250 mm) on an ABI 130A microbore HPLC. Solvent A was 0.1% TFA in water and solvent B was 0.085% TFA in 90% acetonitrile. A gradient of 3-55% B in 58 minutes at a flow rate of 80 µL/min was used. The peptides were monitored at 214 nm. Peptides were collected by hand and sequenced on ABI 477A and 476A Protein Sequencers using standard chemistry.

Results

In order to compare peptide recovery from proteolysis in solution, in gel, and on PVDF membrane, we digested the same amount of BSA (2.5 µg each, ~35 pmol). Figure 1 shows chromatograms of peptides from Lys-C digestion of reduced and denatured BSA without alkylation (A), or with the addition of either iodoacetamide (B) or 4-vinylpyridine (C). As was demonstrated before (Stone et al. 1989), BSA that was not alkylated was not efficiently digested, as indicated by the low intensity of the peptide peaks. On the other hand, BSA alkylated with either iodoacetamide (B) or 4-vinylpyridine (C) was digested much better, as indicated by the higher intensity and greater number of peaks. The Lys-C digestions of BSA in zinc chloride stained gel (D) and Ponceau S stained gel (E) are also shown in Figure 1. These peptide maps demonstrated that the gels stained with zinc chloride and
Figure 1. Comparison of Lys-C digestion of 35 pmol BSA (A), carboxamidomethylated BSA (B), and pyridylethylated BSA (C) in solution, and carboxamidomethylated BSA in gel with zinc chloride staining (D) and Ponceau S staining (E). [x-axis: Time (min)]
Ponceau S were digested efficiently. Compared with the in-solution digestion (B), the number of peaks in D and E was similar, and the intensities were 70% of those in B. No extra peaks resulted from the in-gel digestion. The addition of SDS or hydrogenated Triton X-100 to the proteolysis in the gel improved the recovery of peptides (data not shown). To remove SDS, a Brownlee DEAE cartridge (2.1 x 30 mm), an anion exchange precolumn, was connected before the reversed-phase column (Kawasaki et al. 1990). No peptide loss due to this precolumn was observed. Hydrogenated Triton X-100 was added to a final concentration of 0.1% to the digestion in the gel and seemed to be especially beneficial for the trypsin proteolysis. Peptides from zinc chloride stained gel were collected and sequenced to confirm that they were indeed internal BSA peptides (results not shown). Figure 2 shows the peptide maps of pyridylethylated BSA digested with Lys-C (A) and trypsin (B) on PVDF membrane. As was demonstrated before by Charbonneau (1989), recovery of peptides from Immobilon-P is much better after treatment with trypsin than with Lys-C, judged by both the intensity and the number of the peptide peaks.
Figure 2. Comparison of digestion of 35 pmol pyridylethylated BSA by Lys-C (A) and trypsin (B) on Immobilon-P. [x-axis: Time (min)]
Discussion

For in-gel digestion, we demonstrated the usefulness of the high sensitivity negative staining with zinc chloride and imidazole for detection of protein bands in gels. The negatively stained gels are white, with transparent protein bands clearly detected over a black background. This negative staining is more sensitive than Coomassie Blue and only slightly less sensitive than silver stain (Fernandez-Patron et al. 1992 and Ortiz et al. 1992). Since the stain does not bind to the proteins and the proteins are not fixed in the gel, they are more accessible for enzymatic digestion, resulting in more complete proteolysis and better recovery of peptides. Electroblotting is eliminated altogether and the chromatograms are free of dye peaks, such as those from Coomassie Blue, which could obscure peptide peaks (Kawasaki et al. 1990 and Rosenfeld et al. 1992). From the peptide recovery of the digestion on PVDF membrane, one can see that trypsin is a much better choice than Lys-C. That is understandable, since trypsin digestion is more efficient on immobilized proteins and yields higher amounts and a greater number of peptides. After Fernandez et al. (1994) demonstrated that PVP-40 was not necessary to block the membrane if RTX-100 was present in the digestion buffer, we also eliminated PVP-40. However, instead of adding the protease immediately, we first block the membrane with 1% hydrogenated Triton X-100 in digestion buffer and then add the protease in 0.1% hydrogenated Triton X-100, 10% acetonitrile, and 100 mM Tris-HCl at pH 8.0. We should stress here, as was pointed out before (Stone et al. 1989), that whichever digestion approach is used (in solution, in gel, or on membrane), reduction of cystine without subsequent alkylation results in incomplete digestion, leading to peptide loss. The difference is especially great for proteins such as BSA, which contains 6% cysteine residues.
Thus, for best peptide yield from unknown proteins, we recommend alkylation before resolving the proteins by SDS-PAGE or HPLC and before digestion. When only small amounts of proteins are available, the entire protein sample has to be digested, without prior testing of the process on a portion of the sample. The in-gel proteolysis approach described here is simple, universal and sensitive. Using this staining and proteolysis method, increasing the sensitivity of the sequence analysis and minimizing sample loss, we are able to routinely obtain useful amino acid sequences from 5-10 picomoles of proteins.
Acknowledgments

We thank Anat Aharoni for help in testing these methods and Gene Cutler for help with the Figures.
References

Charbonneau, H. (1989) in: "A Practical Guide to Protein and Peptide Purification for Microsequencing". Ed: Matsudaira, P. Academic Press.
Fernandez-Patron, C., Castellanos-Serra, L., and Rodriguez, P. (1992) BioTechniques 12, 564-573.
Fernandez, J., DeMott, M., Atherton, D., and Mische, S.M. (1992) Anal. Biochem. 201, 255-264.
Fernandez, J., Andrews, L., and Mische, S.M. (1994) Anal. Biochem. 218, 112-117.
Kawasaki, H., Emori, Y., and Suzuki, K. (1990) Anal. Biochem. 191, 332-336.
Ortiz, M.L., Calero, M., Fernandez Patron, C., Castellanos, L., and Mendez, E. (1992) FEBS Letters 296, 300-304.
Rosenfeld, J., Capdevielle, J., Guillemot, J.C., and Ferrara, P. (1992) Anal. Biochem. 203, 173-179.
Stone, K.L., LoPresti, M.B., Crawford, J.M., DeAngelis, R., and Williams, K.R. (1989) in: "A Practical Guide to Protein and Peptide Purification for Microsequencing". Ed: Matsudaira, P. Academic Press.
DIRECT COLLECTION ONTO ZITEX AND PVDF FOR EDMAN SEQUENCING: ELIMINATION OF POLYBRENE

William A. Burkhart, Mary B. Meyer, and Wanda M. Bodnar
Glaxo Research Institute, Research Triangle Park, NC 27709
Anita M. Everson
R. W. Johnson Pharmaceutical Research Institute, San Diego, CA 92121
Violeta G. Valladares and Jerome M. Bailey
Beckman Research Institute of the City of Hope, Duarte, CA 91010
I. Introduction Glass-fiber filters and polyvinylidene difluoride (PVDF) membranes have been the primary sequencing supports for gas-phase and liquid-pulse Edman sequencers. As an alternative support, the Hewlett-Packard sequencer has used a biphasic column system utilizing a hydrophobic support to which the protein/peptide is applied. For sequencing membrane-bound samples, a membrane-compatible column was recently introduced. As a C-terminal sequencing support, Bailey (1) has utilized Zitex, a porous teflon membrane. In order to prevent substantial losses prior to sequencing, techniques have been developed for performing multiple manipulations of a protein immobilized on the HP sequencing column. These have included in situ reduction and alkylation, in situ chemical (2) and proteolytic cleavages (3), and direct electroelution from polyacrylamide gels (4). Peptides generated by these methods can be isolated by capillary LC using the HP column adapter (5). Adsorptive losses can occur if peptides are collected into a vessel and then transferred to the sequencing support. These losses can be minimized by direct collection to a suitable sequencing support when performing high sensitivity separations. Although PVDF membranes have performed satisfactorily as a sequencing support for proteins, this has not always been the case with small peptide samples due to washout during sequencing (6). Polybrene has been used routinely as a carrier on many Edman sequencers to help with the sample washout problem. It requires preconditioning before application of sample and is a potential source of interference for mass spectrometric analysis. Therefore, we undertook a study to totally eliminate the use of polybrene, evaluate the use of alternative membranes as sequencing supports, and improve the performance of PVDF with peptides. MALDI-TOF-MS and Edman sequencing have become complementary techniques in protein structural characterization. 
Ideally, direct collection of peptide fractions should be to a support that
is compatible with both MALDI-TOF-MS and Edman sequencing so that the same sample can be analyzed by both techniques. This capability has been demonstrated for peptides collected directly onto Zitex (7). In addition to our evaluation of PVDF, herein we report the use of Zitex and Teflon tape for sequencing and MALDI-TOF supports, electroblotting from polyacrylamide gels, and in situ proteolytic digestion.

II. Materials and Methods

A. Sources of proteins and peptides

Recombinant phosphodiesterase (PDE) and glutathione-S-transferase were produced and purified at Glaxo. Cytochrome C was purchased from Sigma and bovine serum albumin (BSA) from Pierce. The synthetic peptides angiotensin I/II (3-7), VYIHP; mating factor alpha, WHWLQLKPGQPMY; and Tyr-bradykinin, YRPPGFSPFR, were purchased from Bachem Bioscience. RRsrc, RRLIEDAEYAARG, was from the University of Michigan Protein and Carbohydrate Structure Facility. Protein and peptide concentrations were determined by amino acid analysis using an Applied Biosystems 420A Derivatizer with 130A Separation System.

B. Polyacrylamide gel electrophoresis and electroblotting

SDS-PAGE was performed using a 4-20% Novex (San Diego, CA) mini-gel (100 x 100 x 1 mm) in Tris-glycine buffer at 125 volts for approximately 2 hours. Proteins were then electroblotted onto either PVDF, Zitex, or Teflon tape using a Hoefer TE 22 Mini Transphor unit. Transfer was carried out for 3 hours at 225 mA in 12.5 mM Tris, 96 mM glycine, and 10% methanol, pH 8.3 (8). Isopropanol or absolute ethanol was used to thoroughly wet the Zitex and Teflon tape prior to equilibration in transfer buffer. Following transfer, proteins were stained with either 0.1% amido black or Coomassie Blue R-250 in 50% methanol. Destaining was carried out in 50% methanol. Trans-Blot PVDF (0.2 micron) was purchased from Bio-Rad. Zitex G-110 (1-2 micron pore, 0.01 in. thickness) and G-104 (5-6 micron pore, 0.004 in. thickness) were from Norton Performance Plastics (Wayne, NJ).
Teflon tape was standard laboratory thread seal tape (mil-spec. T-27730A).

C. Proteolytic digestions and separation and collection of peptides

In situ digestions on the Hewlett-Packard sequencing column with Lys-C (Wako) were performed according to Burkhart (3), except that 40% acetonitrile was used in the digestion buffer instead of 20%. After digestion, the column was placed in-line on the HPLC using the G1007A column adapter. A Hypersil ODS (0.8 x 300 mm, LC Packings) column was employed at a flow rate of 17 µL/min. A linear gradient of 8 to 48%
acetonitrile in 0.1% TFA over 80 min was used. After 80 min, the gradient was stepped up linearly to 80% acetonitrile over the next 20 min. Peptides were manually collected onto Zitex membranes and dried in an oven at 75°C.

The membrane-bound proteins were reduced and pyridylethylated for 15 min at 25°C in 50 µL of 6 M guanidine-HCl (8). The membrane was washed three times with water followed by three 20% acetonitrile washes. Proteolytic digestion using 0.5-1 µg endoproteinase Lys-C (Wako) was carried out in 50 µL of 0.1 M Tris-HCl, pH 8.5, with 40% acetonitrile for 24 hours at 37°C. Following digestion, 10 µL of 10% diisopropylethylamine in 60% isopropanol was applied to the membrane, followed by 2 washes with 50 µL of 20% acetonitrile. The digestion buffer and washes were pooled, concentrated, and injected onto a Poros R1 column (320 micron x 10 mm). Peptides were eluted in 0.1% TFA with a 10-80% acetonitrile gradient (20 min). Peptides were manually collected onto Zitex and air dried. If dried in the oven, spotted peptides were dried for 10 min at 75°C. Since droplet volumes varied, direct collected peptides were dried 10 to 20 min, but not over 10 min after the HPLC eluate had evaporated.

D. Automated Edman Sequencing

Sequencing was performed on both the HP G1005S Protein Sequencing System with on-line PTH analysis and the ABI 477A Protein Sequencer with 120A PTH Analyzer. On the G1005S, 2.2 chemistry and Routine 2.2 cycles were utilized. Our column cycle used an R3 delivery time of 3.2 sec. The 477A was modified to give gas-phase delivery of R3 (TFA) during the cleavage step by raising the delivery tube above the surface of the acid and lowering the Ar delivery tube to just above the surface of the acid. The 477A MicroCartridge Filter cycles were utilized with modification for gas-phase delivery of R3. The step time was 175 sec for R3 delivery, followed by a pressurize step of 300 sec.
For the G1005S, 4 mm disks of Zitex and PVDF were used, while 11 mm disks were used on the 477A with the standard cartridge block. The membranes were washed with isopropanol before use and dried. The 4 mm disks were used with the Blott cartridge for initial trials on the 477A. The cartridge seal of Zitex was used as usual.

E. MALDI-TOF Mass Spectrometry

Samples deposited on G-104 Zitex membranes were air dried, then 2 µL of matrix (saturated α-cyano-4-hydroxycinnamic acid in 60:40 H2O:MeCN with 0.1% TFA) was added. After the membranes had dried, they were taped onto the Fisons VG Analytical sample target. Spectra were acquired with a VG TofSpec instrument in the linear mode with positive ion detection. A 337 nm N2 laser was used to desorb the samples from the membranes. Accelerator and detector voltages were set at 25 and 1.8 kV, respectively. External calibration was done using
bradykinin, renin substrate, melittin, human insulin, and cytochrome C (1061.2, 1760.0, 2848.5, 5808.7, and 12,360.1 Da, respectively).

III. Results and Discussion

The HP G1005S sequencer with the 2.2 chemistry is well suited for sequencing peptides spotted or direct collected onto Zitex and PVDF membranes, as shown in Figure 1. High initial and repetitive yields were consistently obtained, allowing sequencing through the C-terminal residue of the peptides shown. Zitex and PVDF performed equivalently with all the model peptides except Tyr-bradykinin, for which PVDF was the superior support. As unknown peptides of varying lengths have been sequenced, we have continued to see high performance of Zitex and PVDF. Although the cycles were optimized for Zitex, they were found to work equally well for PVDF. The R3 (TFA) delivery time for the cleavage step was found to be critical. Once the cycles were optimized for the sequencing column, this was the most important step to check if washout occurred when using Zitex or PVDF. The metered amount of TFA needed to be kept to a minimum so that it entered the membrane-compatible column in the gas phase. Once optimized, the delivery times remained stable for the duration of this study without further adjustment. During our initial trials, which utilized 1.3 chemistry and Routine 1.3 cycles and loads of 1 nmol, the effect of drying at 75°C on performance was more pronounced than after we switched to the 2.2 chemistry and loads of 20 pmol or less. Drying at an elevated temperature had little effect on performance with the G1005S sequencer when sequencing at the low picomole level, as shown in Figure 1. Zitex became our membrane of choice for direct collection due to its wetting properties. Because Zitex is much more difficult to wet than PVDF, the HPLC eluate beaded up into a small droplet during collection until a high percentage of organic solvent was reached in the gradient.
It was feasible to collect up to 25 µL on the 4 mm disk without difficulty. PVDF, on the other hand, began to wet with as little as 10% acetonitrile, making it necessary to use lower flow rates (4-5 µL maximum) in order to keep the eluate drop small when using this membrane for direct collection. The chromatograms in Figures 2 and 3 show examples of unknown proteins which were taken through direct electroelution, in situ digestion, and direct collection onto Zitex. Sufficient sequence data were obtained to search the PIR database, revealing the 140 kDa and 36 kDa unknowns to be novel proteins. DNA probes have been designed and cloning efforts are underway for the 140 kDa protein. The 97 kDa unknown was identified as gelation factor ABP-280 by the sequences shown in Figure 3. It was difficult to get consistent results with Zitex and PVDF on the ABI 477A sequencer. As we worked with several short peptide models and longer unknown samples, the results were quite variable. This is illustrated by the results obtained with two of the model peptides shown in Figure 4. Several factors contributed to improved performance
Figure 1. The sequencing results for the four model peptides spotted to Zitex and PVDF membranes and sequenced on the HP G1005S are shown above. The amount applied for each run was 20 picomoles, except for the RRsrc runs where 45 picomoles were used. The PTH-Ile yields were not corrected for the separation of Ile enantiomers. [x-axis: Amino Acid Residue]
including 1) use of G-104 Zitex, 2) use of the standard reaction cartridge, 3) gas-phase delivery of R3 (TFA) during cleavage, and 4) keeping ethyl acetate washes to a minimum. Drying at an elevated temperature improved performance more on the 477A than on the G1005S, especially for peptides on PVDF. The sequencing of mating factor alpha on PVDF demonstrated a dramatic improvement after drying at 75°C. An advantage of Zitex over PVDF was the ease of extraction of PTH-His and PTH-Arg from the Zitex. The most persistent problem when using Zitex on the 477A was a large lag beginning early in the run, which was due to inefficient coupling. This was seen when sequencing the RRsrc peptide and several longer peptides (results not shown), which were well retained on the Zitex but started sequencing in cycle 2 of the run after a very low yield in cycle 1. The lag was most likely due to a wetting problem with the Zitex. Whereas the G1005S uses an R2 coupling base in 60% isopropanol delivered in the liquid phase, the 477A uses gas-phase delivery of base in water. Various agents were added in a step of the reaction Begin cycle preceding coupling to aid with this problem, but while some helped reduce the lag, all caused unacceptable washout of the sample. We believe Zitex should be evaluated on the ABI 491A Procise sequencer, which has more precise control of deliveries and the option of gas- or liquid-phase delivery of R2 and R3.
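The initial and repetitive yields used above to compare supports and instruments are commonly estimated by fitting the logarithm of the PTH-amino acid yield against cycle number. The sketch below is a minimal illustration of that calculation and is not the procedure used by either sequencer's software; the function name `fit_yields`, the convention of extrapolating the initial yield to cycle 0, and the yield values (generated from a 20 pmol signal decaying 7% per cycle) are all assumptions made for the example.

```python
import math


def fit_yields(cycles, yields_pmol):
    """Least-squares fit of ln(yield) = ln(initial) + cycle * ln(repetitive).

    Returns (initial_yield, repetitive_yield). The initial yield is the
    fitted value extrapolated to cycle 0 (an assumed convention).
    """
    logs = [math.log(y) for y in yields_pmol]
    n = len(cycles)
    mean_x = sum(cycles) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in cycles)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cycles, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), math.exp(slope)


# Hypothetical PTH yields, not data from this study.
cycles = list(range(1, 11))
yields_pmol = [20.0 * 0.93 ** c for c in cycles]

initial, repetitive = fit_yields(cycles, yields_pmol)
print(f"initial yield ~ {initial:.1f} pmol, repetitive yield ~ {repetitive:.1%}")
```

A lag of the kind described for Zitex on the 477A would show up in such a fit as a cycle 1 yield far below the fitted line, so in practice the first cycle might be excluded before fitting.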
Figure 2. Chromatograms of Lys-C peptides from electroeluted 140 kDa unknown (panel A) and 36 kDa unknown (panel B). Peptides were collected onto Zitex G-110. The labeled peaks indicate the number of residues sequenced. Peptides were sequenced on the HP-G1005S, except those shown in panel A, which were sequenced on the ABI 477A. For the detector used, 10 mV equals 80 mAu.
Figure 3. Chromatogram of Lys-C peptides from an electroeluted 97 kDa protein. Peptides were collected onto Zitex G-110 and sequenced on the HP-G1005S. The sequences obtained from the numbered peaks identified this protein as gelation factor. For the detector used, 10 mV equals 80 mAu. [x-axis: Minutes]

Sequences obtained from the numbered peaks:
1   DGSCGVAYVVQEPGDYE
2   VNQPASFAVSLNGAK
3   FADQHVPGSPFSVK
4   YGGPYHIGGSP
5   YNEQHVPGSPFTARVTGDDSMRMS
6   FNEEHIPDSPFVVPVASPSGDARRLT
Figure 4. The sequencing results for two of the model peptides spotted to Zitex and PVDF and sequenced on the ABI 477A are shown above. The amount applied for each run was 20 picomoles. [Panels: Tyr-bradykinin; mating factor alpha. x-axis: Amino Acid Residue]
Having established Zitex as a suitable peptide sequencing support on the G1005S, we next evaluated its use as a MALDI-MS support, since several attempts to ablate spotted samples from PVDF were unsuccessful. In all our trials, proteins and peptides were ablated from the surface of Zitex and Teflon tape more readily than from PVDF. As demonstrated with Lys-C peptides from PDE in Table 1, it was possible to determine the mass of a direct collected peptide on Zitex by MALDI-MS and then subject that same sample to Edman sequencing on the G1005S. We found that external calibration of the peptide masses was difficult, resulting in poor mass assignments for several peptides. However, we are currently developing methods to improve mass assignments of the direct collected samples by including an internal standard. Although an alternative would be to remove an aliquot of the fraction prior to drying, our goal is to use the absolute minimal amount of sample for mass analysis, leaving the remainder for further characterization. Performing MALDI on proteins electroblotted to PVDF has presented a formidable challenge to the mass spectrometrist. We therefore investigated the use of Zitex for electroblotting from PAGE gels. Although Zitex works for electroblotting, its larger pores allow much of the sample to migrate through the membrane without being trapped. It quickly became apparent that a thinner, less porous type of Teflon membrane would be more suitable for electroblotting of proteins. Teflon tape was found to give performance in electroblotting similar to that of PVDF. Proteins were effectively trapped on the surface of the Teflon tape and did not migrate through as happened when using Zitex. Proteins electroblotted to Teflon tape were amenable to N-terminal sequence analysis and in situ digestion. We are currently investigating the feasibility of performing MALDI-TOF-MS on proteins electroblotted onto Teflon tape. Because of its ease of handling, Zitex
Table 1. Mass assignments are shown for the Lys-C peptides obtained from the in situ digest of PDE electroblotted to Teflon tape. Peptide fractions were direct collected onto G-104 Zitex and air dried. After mass analysis, the samples were then subjected to 10 cycles of Edman sequencing on the HP G1005S.

FRACTION   (M+H)+ OBS.   SEQUENCE         (M+H)+ CALC.
1          1357.6        QNDVEIPSPT...    1356.5
2          1685.0        ERERGMEISP...    1787.1
           1157.7        QQLMTQISGVK      1233.5
3          3187.4        LMHSSSLNNT...    3003.3
4          3317.1        FQFELTLDEE...    3352.5
5          4865.5        RISNSTSPfS...    4876.3
6          3780.2        VTSSGVLLLD...    3780.4
7          2725.0        SLELYRQWTD...    2821.1
           2698.7        ATYATSDFTL...    2752.5
8          7680.5        SQVGnOYIV...     7674.4
remains our choice for direct collection of peptides. However, Teflon tape promises to offer an alternative to PVDF for electroblotting with several advantages. We have already demonstrated its superiority for spotted samples when performing MALDI-TOF-MS, and expect in situ digestions of Teflon-blotted proteins to be more complete without the addition of detergents.

References
1. Bailey, J., Rusnak, M., and Shively, J. (1993) Analytical Biochemistry 212, 366-374.
2. Wagner, G., Fischer, S., Myerson, J., Miller, C., Bente, H., and Horn, M. (1990) Poster T-141, Fourth Symposium of the Protein Society.
3. Burkhart, W. (1993) In "Techniques in Protein Chemistry IV", Angeletti, R., ed., 399-406.
4. Moyer, M., Rose, D., and Burkhart, W. (1994) In "Techniques in Protein Chemistry V", Crabb, J., ed., 195-204.
5. Burkhart, W., Moyer, M., and Kassel, D. (1993) Poster M-284, Seventh Symposium of the Protein Society.
6. Murata, H., Takao, T., Anahara, S., and Shimonishi, Y. (1993) Analytical Biochemistry 210, 206-208.
7. Bodnar, W., Anderegg, R., and Moyer, M. (In press) Proceedings of the 42nd ASMS Conference on Mass Spectrometry and Allied Topics.
8. Reim, D. and Speicher, D. (1993) Analytical Biochemistry 214, 87-95.
Minimizing N-to-O Shift in Edman Sequencing

William H. Vensel† and George E. Tarr‡,1
†U. S. Dept. of Agriculture, ARS, Western Regional Research Center, Albany CA 94710 and ‡Dept. of Cardiology, Children's Hospital, Boston MA 02115
I. Introduction

Protein sequence determination is carried out most frequently by the Edman degradation. Advances in the methodology have resulted from improvements in instrumentation, but the chemistry has remained essentially unchanged since its inception. How far the sequence signal can be read depends upon cycle-to-cycle efficiency and background generation. Chemical repetitive inefficiency of 3-5% produces a general exponential decline of in-phase signal; residue-specific effects such as slow cleavage at proline also contribute. Background ("random" chain-splitting) is believed to arise by 3 mechanisms[1,3]: acid solvolysis; rearrangement at Asp and Glu residues; and N-to-O shift at Ser/Thr hydroxyl groups. The last of these, we find, is the major contributor under normal sequencing conditions[3]. As a result of the N-to-O shift (Fig. 1), in-chain Ser/Thr amino groups are exposed to coupling with phenylisothiocyanate (ΦNCS); subsequent cleavage fragments the chain at the next peptide bond. There are two distinctly
Figure 1. N-to-O Acyl Shift
Under acidic conditions the alcoholic oxygen attacks the carbonyl group, resulting in migration of the acyl group from nitrogen to oxygen. Rn = N-terminal (peptidyl, or acetyl); Rc = C-terminal; R = H (Ser), CH3 (Thr).
*Present address: PerSeptive Biosystems, Inc., 38 Sidney Street, Cambridge MA 02139
TECHNIQUES IN PROTEIN CHEMISTRY VI Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.
Figure 2. BSA: aqueous coupling. Plot of Phe over 39 cycles. No-Rx, no treatment; N, N-acylation; O, O-acylation.
different ways of trying to decrease the effect of this process on sequencing. One approach is to reverse the N-to-O shift; an attempt to minimize fragmentation by applying the coupling buffer before ΦNCS has been reported [4]. However, application of this strategy during sequencing of the wheat storage protein A-gliadin, rather than reducing background accumulation, reduced the out-of-phase signal (data not shown). The other approach is to block protein hydroxyl groups to prevent N-to-O shifting, for instance with easily introduced and stable O-acyl groups. Background accumulation during Edman degradation can be illustrated with yield plots of Phe from bovine serum albumin (BSA) over 39 cycles (Phe occurs at positions 11, 19, 27 and 36). N-acetylation (Fig. 2, N) followed by Edman degradation leads to a pattern of background similar to that of a protein not acetylated (No-Rx). O-acetylation (O) of BSA markedly reduced background, though little of the protein sequenced. BSA has a Thr at position 2 and a Ser at position 5, and the decrease in sequenceable protein is likely the result of an O-to-N shift of the acetyl group when these residues become N-terminal (Fig. 3). Although the N-to-O shift was prevented (Fig. 2, O), we failed to cause the amino group of N-terminal O-acetyl Ser/Thr to react more rapidly with ΦNCS than with the adjacent acetyl ester. Because neither of these simple strategies was successful, we examined both the kinetics of the O-to-N acyl shift and the catalysis of thiocarbamylation to determine optimal conditions. We find that the O-to-N shift can be prevented by using anhydrous coupling, by excluding amines, and/or by lowering the pH of the
Figure 3. O-to-N Acyl Shift. Rn = N-terminal (peptidyl, or acetyl); Rc = C-terminal; R = H (Ser), CH3 (Thr).
coupling reaction to pH 5. Using a tetraalkylammonium salt as catalyst for ΦNCS coupling, and blocking the N-to-O shift with an acyl group, we are able to suppress background and sequence through Ser/Thr.
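The exponential decline of in-phase signal mentioned in the Introduction can be made concrete with a small numerical sketch. It assumes a single, constant repetitive efficiency per cycle and ignores residue-specific effects and background; the function names and the 100 pmol starting amount are illustrative assumptions, not values from the experiments.

```python
import math


def in_phase_yield(initial_pmol: float, repetitive_efficiency: float, cycle: int) -> float:
    """Expected in-phase yield at a cycle, assuming a constant per-cycle efficiency."""
    if cycle < 1:
        raise ValueError("cycle numbering starts at 1")
    # Cycle 1 sees the full initial amount; each later cycle loses (1 - efficiency).
    return initial_pmol * repetitive_efficiency ** (cycle - 1)


def cycles_to_fall_below(repetitive_efficiency: float, fraction: float) -> int:
    """Smallest number of additional cycles after which the signal is <= fraction of its start."""
    if not (0.0 < repetitive_efficiency < 1.0 and 0.0 < fraction < 1.0):
        raise ValueError("efficiency and fraction must both lie strictly between 0 and 1")
    return math.ceil(math.log(fraction) / math.log(repetitive_efficiency))


if __name__ == "__main__":
    # Hypothetical 100 pmol start; 3% and 5% inefficiency as quoted in the text.
    for efficiency in (0.97, 0.95):
        print(efficiency, round(in_phase_yield(100.0, efficiency, 11), 1),
              cycles_to_fall_below(efficiency, 0.5))
```

Under these assumptions a 5% inefficiency halves the signal in roughly half as many cycles as a 3% inefficiency, which is why small changes in coupling chemistry matter for read length.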
II. Materials and Methods
Abbreviations conform to Tarr's nomenclature: F, formic acid; A, acetic acid; fA, trifluoroacetic acid; KNm, dimethylformamide; mSO, methylsulfoxide; MOH, methanol; EOH, ethanol; MCN, acetonitrile; M5KN, N-methylpyrrolidone; 6N, pyridine; N, ammonia; ENm, dimethylethylamine; M6N, N-methylpiperidine; M6NO, N-methylmorpholine; E6NO, N-ethylmorpholine; ENip, diisopropylethylamine; eN, triethylamine; fmK, hexafluoroacetone (hydrate); cit, citric acid; W, water; NaA, sodium acetate; eN+, tetraethylammonium; MNT, methylthiocarbamyl; ΦNT, phenylthiocarbamyl; MNCS, methylisothiocyanate; ΦNCS, phenylisothiocyanate; NPN-TNdab, N-aminopropyl-N'-p-dimethylaminoazobenzene thiourea.
Model substrates with the required functional groups were synthesized by standard methods. The effect of different conditions on coupling of selected isothiocyanates to the model compounds was followed with HPLC by measuring the amount of product and the remaining starting material. A similar approach was used to follow the O-to-N shift of acetyl groups within the model substrate, Ser-Ala-dimethylaminoazobenzene amide (SA-Ndab). Conditions for HPLC were: YMC AQ-C18 2 x 250 mm column; linear gradients of 0.1% fA vs 0.09% fA/MCN at 40 °C; HP1090 with diode array detector.
Sequencing was mainly carried out using a Millipore ProSequencer. BSA or human serum albumin (HSA) was covalently attached either to hydrophilized Teflon isothiocyanate (NCS) membranes (Millipore) or to glass fiber filters (Applied Biosystems) using SequeNet NCS attachment (Millipore). During Edman degradation the coupling reaction was carried out using either aqueous Millipore buffer (5% E6NO in 70/30 MOH:W) or anhydrous solvents containing 5% M6N. Anhydrous solvents tested included MCN, KNm, mSO, and M5KN. Standard instrument control cycles were used with minor modification. HSA was chosen as a routine sequencing model.
It was reasoned that the presence of Ala at positions 2 and 8 with an intervening Ser at position 5 would allow the appraisal of different sequencing conditions. N-acetylation of albumins, covalently attached to membranes, was carried out after exposure to fA vapors for 10 min at 50 °C. The membrane was dried, exposed to vapors of E6NO/acetic anhydride 3:1 (30 min at R.T.), washed with four 1-ml volumes of MOH and subjected to 40 cycles of Edman degradation. When O-acetylation was carried out in the ProSequencer during the first cleavage step, the fA syringe was disconnected from its valve port and a syringe containing freshly prepared 5% AcCl in fA was attached in its place. When the fA delivery valve opened for the programmed delivery, 60 µL of 5% AcCl/fA was manually injected. After reconnection of the fA syringe, subsequent cleavages were with fA under instrument control. Sequence analysis with the ABI 477A protein sequencer was done in an instrument equipped with
a blot cartridge. Acylation of the support-coupled protein was performed by exposing it to vapors of 5% AcCl in fA prior to placing it in the cartridge.
III. Results
A. Kinetics of O-to-N acyl migration
Our study of the migration kinetics of the acetyl group (Table I) indicated several conditions under which the O-to-N acyl shift was slowed. The O-to-N shift was pH sensitive, proceeding faster as the pH was raised (first section of Table I); however, it was also strongly promoted by water: in KNm the requirement for water was absolute, and in its absence the O-to-N shift rate was negligible (second section). Hydroxylic solvents could partly substitute for water.

Table I. O-to-N Shift in Model Substrate: O-Ac-SA-Ndab

Organic   Water (%)   Buffer     pH      Half-life (min)
Effect of pH and buffer on O-to-N shift rate:
MCN       79          eN-A       10.7    1.0
MCN       50          N-fmK      7.1     3.1
MCN       50          KPO4       6.7     16.3
MCN       50          N-fmK      6.3     6.6
MCN       50          ENm-cit    5.4     123
MCN       70          N-F        4.0     11200
Effect of water on shift rate:
KNm       10          eN-A       11      28
KNm       5           eN         >12     357
KNm       0           eN         >12     infinite
EOH       9           eN         >12     18.6
EOH       0           eN         >12     40
MOH       33          eN         >12     3.3
MOH       0           eN         >12     5.5
6N        50          -          high    28
6N        10          -          high    84
6N        0           -          high    162
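The half-lives in Table I can be translated into the fraction of O-acetyl substrate expected to rearrange during a coupling step. The sketch below assumes simple first-order kinetics for the O-to-N shift (an assumption, since the text reports only half-lives) and uses a hypothetical 20-min exposure; the function names are illustrative.

```python
import math


def rate_constant_per_min(half_life_min: float) -> float:
    """First-order rate constant k = ln(2) / t_half (assumes first-order decay)."""
    return math.log(2) / half_life_min


def fraction_shifted(half_life_min: float, minutes: float) -> float:
    """Fraction of O-acyl substrate rearranged after the given time."""
    if math.isinf(half_life_min):
        return 0.0  # an "infinite" half-life means no measurable shift
    return 1.0 - math.exp(-rate_constant_per_min(half_life_min) * minutes)


if __name__ == "__main__":
    exposure_min = 20.0  # hypothetical coupling time, not taken from the text
    # Half-lives from Table I (min): pH 10.7, pH 5.4, pH 4.0, and anhydrous KNm.
    for label, half_life in (("pH 10.7", 1.0), ("pH 5.4", 123.0),
                             ("pH 4.0", 11200.0), ("KNm, 0% water", math.inf)):
        print(label, f"{100 * fraction_shifted(half_life, exposure_min):.2f}% shifted")
```

Under this assumption the shift is essentially complete within a coupling step at high pH and becomes small only at acidic pH or in the absence of water, which is the trade-off explored in the following sections.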
Based on these results, it appeared that anhydrous coupling conditions in combination with O-acetylation should provide a means to markedly decrease background accumulation and allow the Edman degradation to continue. Results of experiments using one anhydrous coupling system (5% M6N/M5KN) with BSA are shown in Figure 4. Comparison with the aqueous results (Fig. 2) indicates that for non-acylated material (No-Rx), the lag at cycle 20 was considerably higher in the nonaqueous system (0.7 versus 0.4), and the yield of Phe at cycle 11 was approximately half. Nonaqueous coupling systems with 5% M6N or 5% E6NO as catalyst proved less effective than aqueous systems,
Figure 4. BSA: nonaqueous coupling. Plot of Phe over 39 cycles. No-Rx, no treatment; O, O-acylation.
although mSO gave the best results of the solvents tested. Recovery at cycle 11 for the acylated material (Fig. 4, O) suggests, as expected, that the N-to-O acyl shift was more suppressed under nonaqueous than under aqueous conditions. Sensitivity of the O-to-N shift to pH implied that lowering pH might be an effective way to suppress background and allow sequencing to continue. However, results for HSA (Table II) indicate that as the pH is lowered, repetitive efficiency decreases. Increasing reaction time did not help (data not shown). The implication is that as the pH is lowered, undesired side products increase.

Table II. Effect of pH on Repetitive Efficiency: human serum albumin

Buffer         Ratio            pH      pmol A @2   pmol A @8   R.E.
KNm:M6N:W      45:5:40          8.4     23          22          97
KNm:M6N:A:W    45:2.5:2.5:40    7.2     23          22          98
KNm:M6N:A:W    45:2.5:2.5:40    6.5*    21          10          84
KNm:M6N:A:W    45:1.7:3.3:40    4.5     20          8           78

*Reaction mixture pH was maintained by use of 0.1 M 2-[N-morpholino]ethanesulfonic acid adjusted to pH 6.5 with acetic acid. R.E. = repetitive efficiency (%).
B. Catalysis of thiocarbamylation
1. Tertiary amine and solvent effects
Difficulty finding conditions that would allow both efficient sequencing and suppression of the O-to-N shift led us to investigate tertiary amine catalysis of thiocarbamylation. Complete reaction of ΦNCS showed little effect of solvent on coupling. Adding 1.5 to 2.5 equivalents of acetic acid, however, decreased the fraction of the desired product (Table III).
Table III. Tertiary-amine catalysis of thiocarbamylation with ΦNCS

Coupling Mix       Ratio            R.T.      %ΦNT    Acid
EOH:M6N:ΦNCS       8:1:1            15 min    88
EOH:M6N:W:ΦNCS     7:1:1:1          20 min    88
MOH:M6N:W:ΦNCS     7:1:1:1          20 min    89
KNm:M6N:W:ΦNCS     7:1:1:1          20 min    89
KNm:M6N:ΦNCS       8:1:1            20 min    86
KNm:M6N:W:ΦNCS     8.6:1:0.4:1      20 min    88
KNm:M6N:A:ΦNCS     7.5:1:0.55:1     20 min    85      1 EQ A
KNm:M6N:A:ΦNCS     7.2:1:0.8:1      20 min    79      1.5 EQ A
KNm:M6N:A:ΦNCS     8.1:0.5:0.4:1    20 min    83      1.5 EQ A
KNm:M6N:A:ΦNCS     8:0.45:0.55:1    20 min    81      2.2 EQ A

Phenylisothiocyanate coupling to NPN-TNdab. R.T. = room temperature (ca. 23 °C).
When MNCS was used instead of ΦNCS, there was a greater tendency for the reaction to form multiple products (greater sensitivity to reaction conditions), so MNCS was used for subsequent tests. Formation of multiple products as pH was lowered is consistent with the results obtained using ΦNCS and HSA at different pHs (Table II). Examination of coupling in the presence of one equivalent each of M6N and A demonstrated that thiocarbamylation in ethanol does not follow pseudo-first-order kinetics, showing instead a long lag before the reaction mixture becomes effective. Of six solvents tested, mSO gave the fastest reaction and the best ratio of products. Various amines were tested near neutrality for thiocarbamylation catalysis (Table IV). Amine pKa varied directly with reaction rate, and inversely with the percentage of correct product. Rapid reaction in the absence of tertiary amine suggested another activation pathway for thiocarbamylation.
Table IV. Effect of amine on neutral pH catalysis of Ala-Ndab thiocarbamylation

Buffer              Ratio            % Reacted   %MNT   pKa      Amine
KNm:M6N:A:MNCS      8:0.72:0.34:1    75          37     11.1     N-methylpiperidine
KNm:M6NO:A:MNCS     8:0.66:0.34:1    13          68     7.4      N-methylmorpholine
KNm:E6NO:A:MNCS     8:0.77:0.34:1    26          31     7.7      N-ethylmorpholine
KNm:ENip:A:MNCS     8:1.04:0.34:1    76          27     11.4?    diisopropylethylamine
KNm:eN:A:MNCS       8:0.84:0.34:1    38          44     10.6     triethylamine
KNm:ENm:A:MNCS      8:0.65:0.34:1    19          61     9.9      dimethylethylamine
KNm:6N:A:MNCS       8:0.49:0.34:1    1           all    5.2      pyridine
KNm:NaA:MNCS        8:1:1            54          81              (3 M aqueous, pH 5.23)

Only the "correct" product (%MNT) was cleavable with fA. MNCS was 10% in MCN.
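The trend stated for Table IV (reaction rate rising with amine pKa, percentage of correct product falling) can be checked with a rank correlation over the seven tertiary-amine rows. This is only an illustrative sketch: it treats the "all" entry for pyridine as 100%, takes the uncertain "11.4?" at face value, omits the sodium acetate row (no pKa listed), and assumes no tied values, which holds for these data.

```python
def ranks(values):
    """Rank values 1..n in ascending order (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman rank correlation via the sum of squared rank differences."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Rows of Table IV in printed order (tertiary amines only).
pka = [11.1, 7.4, 7.7, 11.4, 10.6, 9.9, 5.2]
percent_reacted = [75, 13, 26, 76, 38, 19, 1]
percent_mnt = [37, 68, 31, 27, 44, 61, 100]  # "all" taken as 100 (assumption)

if __name__ == "__main__":
    print("pKa vs % reacted:", round(spearman_rho(pka, percent_reacted), 3))
    print("pKa vs % MNT:   ", round(spearman_rho(pka, percent_mnt), 3))
```

With these inputs the first correlation is strongly positive and the second clearly negative, in line with the statement in the text; with only seven points this is descriptive rather than a significance test.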
2. Non-tertiary amine and solvent effects
That coupling could be catalyzed without tertiary amines led to examination of non-tertiary amine catalysis of thiocarbamylation. Tests with sodium salts revealed a moderate catalysis by acetate, and a weaker effect of formate (data
Table V. Catalysis by tetraethylammonium buffers

Salt/Buffer     pH     % Reacted
KNm:eN+Cl       7.0    20
KNm:eN+PO4      7.0    36
KNm:eN+A        7.2    61
KNm:eN+SO4      8.6    72
KNm:KPO4        7.0    23
KNm:NaPO4       7.0    10
mSO:eN+SO4      8.6    97
mSO:eN+Cl       7.0    28
mSO:eN+PO4      7.0    57
mSO:eN+A        7.2    89

Coupling of MNCS to Ala-Ndab. Ratio of Salt/Buffer/MNCS 8:1:1. % Reacted was essentially all MNT product. Reaction was for 10 min at R.T. MNCS was 10% in MCN.
not shown). With eN+ as cation, reaction was strongly catalyzed by sulfate, with a weak effect of phosphate and no catalysis by carbonate or borate. mSO proved to be a better solvent for reaction than KNm (Table V). Additional tests (data not shown) gave the same rate at pH 7.1 and 8.6, i.e. no pH effect with eN+SO4. With O-acylated SA-Ndab as substrate, we found no catalytic effect of mSO/eN+SO4 on the O-to-N shift. Under actual sequencing conditions the repetitive efficiency with HSA was the same for the standard Millipore buffer as with mSO/eN+SO4 (data not shown). All subsequent tests were done with an ABI 477A equipped with a blot cartridge. Tetraethylammonium sulfate at a pH of 6.8 was prepared and dried. Weighed samples of eN+SO4 salts dissolved in mSO were used as coupling buffer. ΦNCS delivered as a 10% solution in MCN was not dried prior to buffer delivery. The results shown in Table VI indicated that about 5% water is required for efficient coupling. Acylation with vapors of 5% AcCl in fA caused a reduction in background as measured by Leu and Phe at cycle 8. Despite our expectations of O-acyl stability, based on the behavior of model substrates, the lowered efficiency with acetylated HSA suggested that the O-to-N shift had been reduced but not prevented. Separate studies of the effect of acyl group type on O-to-N shift indicated that changing from acetyl to trimethylacetyl decreased acyl shift under aqueous coupling conditions (R.E. 0.79 and 0.87, respectively) and was effective at suppressing background.
Table VI. Human serum albumin: tetraethylammonium sulfate buffer

Buffer                 R.E.    Acylation   Phe @ 8   Leu @ 8
0 W/mSO/eN+SO4         .88                 *         *
.025 W/mSO/eN+SO4      .90                 *         *
.050 W/mSO/eN+SO4      .94                 6.2       13.1
.100 W/mSO/eN+SO4      .92                 *         *
.050 W/mSO/eN+SO4      .88     Acetyl      3.5       3.9

*Background in pmol at Phe and Leu is shown only for the data obtained with 5% water in the coupling buffer. Coupling was for 3 min at 57 °C.
IV. Conclusions
During protein sequencing the number of residues over which the sequence signal can be read depends upon cycle-to-cycle efficiency and background generation. Our results show that the major contributor to background generation is an N-to-O shift at Ser/Thr hydroxyl groups. As a result of the N-to-O shift, in-chain Ser/Thr amino groups are exposed to coupling with ΦNCS. At the subsequent cleavage step the protein chain is fragmented at the next peptide bond. A blocking group on Ser/Thr can prevent N-to-O shifting, but must be stable and must not undergo O-to-N shift when Ser/Thr become N-terminal. Acyl groups are easily introduced onto protein hydroxyl groups. We have found that the O-to-N shift of such groups is pH sensitive and proceeds more rapidly as pH is raised. It is also strongly promoted by water and, to a degree, by other hydroxylic solvents. Using tertiary amines it was not possible to find conditions (pH or nonprotic solvent) in which the O-to-N shift was suppressed and Edman sequencing remained efficient. Tests revealed that thiocarbamylation does not require tertiary amines, and that efficient coupling around neutral pH can be carried out in the presence of tetraethylammonium sulfate and methylsulfoxide; with model compounds the O-to-N shift was not promoted. Unacylated HSA sequenced more efficiently under these conditions than acetylated HSA. We expect a combination of non-tertiary amine catalysis and an acylating reagent with a bulky side chain to suppress the O-to-N shift while maintaining high repetitive efficiency.
Acknowledgments
We wish to acknowledge Dr. Jim Coull for his support of this project. W. H. V. also wishes to thank Dr. Donald D. Kasarda and the Western Regional Research Center, Agricultural Research Service, of the U. S. Department of Agriculture for support.
References
1. Iwai, K. & Ando, T. 1967 Meth. Enzymol. 11, 263-282
2. Brandt, W. F., Henschen, A. & von Holt, C. 1982 In Methods in Protein Sequence Analysis (Elzinga, M., ed.), Humana Press, 101-110
3. Tarr, G. E. 1987 1st Symposium of the Protein Society, Poster 1028
4. Thomsen, J., Bucher, D., Brunfeldt, K., Nexo, E. & Olesen, H. 1976 Eur. J. Biochem. 69, 87-96
THE HYDROLYSIS PROCESS AND THE QUALITY OF AMINO ACID ANALYSIS: ABRF-94AAA COLLABORATIVE TRIAL
K. Ümit Yüksel¹, Thomas T. Andersen², Izydor Apostol³, Jay W. Fox⁴, Raymond J. Paxton⁵, and Daniel J. Strydom⁶
¹ Dept. Biochem. & Mol. Biol., U. North Texas Health Sci. Ctr. Ft. Worth, Ft. Worth, TX 76107
² Department of Biochemistry and Molecular Biology, Albany Medical College, Albany, NY 12208
³ Somatogen Inc., Boulder, CO 80301
⁴ Department of Microbiology, University of Virginia Medical School, Charlottesville, VA 22908
⁵ Department of Protein Chemistry, Immunex Corp., Seattle, WA 98101
⁶ Center for Biochem. Biophys. Sciences and Medicine, Harvard Medical School, Boston, MA 02115
I. INTRODUCTION
Amino acid analysis remains an indispensable tool in a variety of biological research and development fields, e.g. the biochemical study of proteins, quality control in biotechnology and nutrition, and clinical analyses. The classic chromatographic technologies will be, for the foreseeable future, the major quantitative tools for amino acid analyses. The techniques are deceptively difficult and there remains a need to standardize techniques for good quantitation. The Association of Biomolecular Resource Facilities (ABRF) has been addressing the identification and ultimate resolution of such difficulties by collaborative trials [1-7]. The purpose of the 1994 trial was to discriminate between hydrolysis and the chromatographic analysis as a possible major source of errors, and additionally to test cystine analyses as part of a continuing effort.
II. MATERIALS AND METHODS
A. Sample Preparation and Distribution
The 1994 ABRF amino acid analysis study consisted of two different samples, both of which were primarily bovine ribonuclease. ABRF-94AAA1 (30 µg or 2.2 nmol) was spiked with an equimolar amount of a small peptide, angiotensin, to provide a larger amount of histidine. Histidine is frequently present in such small amounts in proteins that past studies could not discriminate between methodological problems and the low amount present as the source of His errors. ABRF-94AAA2 was already hydrolyzed and, to provide an additional test, was spiked with glucosamine (2.0 equivalents) prior to hydrolysis. Commercially available ribonuclease B (30 µg or 2.2 nmol, 15 µL) and glucosamine were dissolved in water and pipetted into 6x50 mm Pyrex test tubes (Corning). The aliquots were dried and hydrolyzed as described [8]. The samples were analyzed by the authors and then distributed by mail, along with instructions and extensive references for analyses.
B. Calculations
Raw data were received by an independent collaborator, identifying marks were removed, and the anonymous results were forwarded to the 1994 ABRF Amino Acid Analysis Research Committee as a spreadsheet. Data reduction was as described [2], except that the yields of the individual amino acids were not rounded to the nearest integer value. The amount of sample (pmol) was calculated from its known content of amino acids, excluding Cys and values differing >15% from the average. The accuracy of each residue was calculated as % Error, and the overall accuracy of the composition as Average % Error, where

$$\%\ \text{Error} = 100 \times \frac{\text{experimental value} - \text{true value}}{\text{true value}}$$

$$\text{Average \% Error} = \frac{\sum \lvert \%\ \text{Error for 16 amino acids} \rvert}{16}$$

These calculations were repeated substituting the "analyzed value" for the "true value". Analyzed values are the overall averages determined by all the participants and may represent a more realistic figure than the theoretical values, which were determined by protein or nucleic acid sequencing. These values were used to compare the errors of the pre-hydrolyzed and protein samples. In analyzing data, we excluded sites with errors >3 standard deviations from the mean.
III. RESULTS AND DISCUSSION
A. Participation
A record number of 62 sites participated in this study, representing 38% participation from the 164 sites offering amino acid analysis out of 225 ABRF member laboratories. The preferred methods of analysis of ABRF-94AAA were pre-column derivatization procedures (32/62 = 52%; Table I). This year there was increased participation by sites using ninhydrin. Ninhydrin and PITC were overwhelmingly the preferred methods.
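The error measures defined under Calculations, and the exclusion of sites more than 3 standard deviations from the mean, can be sketched as follows. This is a minimal illustration rather than the committee's actual data reduction: the function names are hypothetical, the average is taken over however many residues are supplied (16 in the study), and the mean and standard deviation for the exclusion step are assumed to be computed over all sites including the outlier.

```python
from statistics import mean, stdev


def percent_error(experimental: float, true_value: float) -> float:
    """Signed % Error = 100 x (experimental - true) / true."""
    return 100.0 * (experimental - true_value) / true_value


def average_percent_error(experimental: dict, true_values: dict, exclude=("Cys",)) -> float:
    """Mean of absolute % Errors over all residues not excluded."""
    residues = [aa for aa in true_values if aa not in exclude]
    return mean(abs(percent_error(experimental[aa], true_values[aa])) for aa in residues)


def sites_within_three_sd(site_errors: dict) -> dict:
    """Keep sites whose average error lies within 3 sample SDs of the mean."""
    values = list(site_errors.values())
    center, spread = mean(values), stdev(values)
    return {site: err for site, err in site_errors.items()
            if abs(err - center) <= 3 * spread}


if __name__ == "__main__":
    # Hypothetical three-residue composition, for illustration only.
    theory = {"Ala": 12.0, "Gly": 3.0, "Cys": 8.0}
    found = {"Ala": 12.6, "Gly": 2.7, "Cys": 4.0}
    print(round(average_percent_error(found, theory), 2))
```

Note that a 3-SD rule can only exclude a site when enough sites are present; with about ten or fewer sites no single value can lie more than 3 sample standard deviations from the mean.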
B. Yield and Accuracy
1. Protein Sample (ABRF-94AAA1): The overall average yield (± std. dev.) for the unhydrolyzed sample was 2.26 ± 0.45 nmol (Fig. 1). This signifies a higher accuracy compared to the last two studies (standard deviations of 20% vs. 214% [7] and 40% [6]). The extremely broad distribution of results in last year's study [7] may be due to the fact that the sample was a short peptide and not a protein. The average error obtained by the participants in the analysis of amino acids (excluding Cys) in ABRF-94AAA1 is 10.9% (Fig. 2) and is summarized by methodology in Table II. The averages and the ranges
TABLE I. Methods used to analyze ABRF-94AAA

Method           Number of sites
Pre-column       32
  PTC            25
  OPA/FMOC       3
  AQC*           3
  DABSYL         1
Post-column      30
  Ninhydrin      27
  OPA            2
  Fluram         1

*AQC: 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate
Figure 1. The absolute yields of ABRF-94AAA1, by site.
Figure 2. The quality of the analyses of ABRF-94AAA1 (excluding cysteine), by site. Methods shown: PITC, OPA/FMOC, AQC, DABSYL, ninhydrin, Fluram, OPA.
TABLE II. Correlation of error with technique and sample load for ABRF-94AAA1

                Error (%), theoretical comp.      Error (%), analyzed comp.     Sample analyzed (µg)
Technique       average      range          n     average      range            average      range        n
Overall         10.9±3.7     4.5-19.2       58    8.4±3.9      2.7-17.5         2.6±5.2      0.06-30      59
Post-column     10.6±3.7     4.5-18.9       29    8.0±4.1      2.7-17.5         2.1±4.3      0.14-24      30
  Ninhydrin     10.8±3.6     4.5-18.9       26    8.0±3.9      2.7-17.5         2.2±4.5      0.14-24      27
  OPA           5.6±0.8      5.1-6.2        2     4.8±0.1      3.6-4.9          0.58±0.46    0.25-0.90    2
  Fluram                                    1     13.8                          2.5                       1
Pre-column      11.2±3.7     6.8-18.4       29    8.7±3.7      3.6-16.0         3.2±6.0      0.06-30      29
  PITC          11.1±3.7     6.8-18.2       22    8.3±3.4      3.6-14.9         2.5±3.4      0.06-15      22
  OPA/FMOC      11.9±4.1     7.6-15.7       3     10.3±4.1     5.6-13.0         0.35±0.39    0.11-0.80    3
  AQC           12.2±5.4     8.0-18.4       3     10.8±6.0     4.2-16.0         12±16        0.50-30      3
  DABSYL        7.2                         1     5.9                           1.0                       1
Figure 3. The absolute yields of ABRF-94AAA2, by site.
Figure 4. The quality of the analyses of ABRF-94AAA2 (excluding cysteine), by site. Methods shown: PITC, OPA/FMOC, ninhydrin, AQC, DABSYL, OPA, Fluram.
were similar for both the pre- and post-column techniques. However, the low error level of the OPA and DABSYL sites is noticeable, although perhaps not significant because of the small number of sites.
2. Pre-hydrolyzed Sample (ABRF-94AAA2): The overall average yield for the pre-hydrolyzed sample was 2.23±0.51 nmol (Fig. 3). The overall average error was 6.5% (Fig. 4). The methodological breakdown of the data showed significant differences in error (Table III).
C. Comparison of analyses of the single batch hydrolysate with analysis of individual hydrolysates
We used the compositions determined by this study to obtain the reference composition for each of the two samples, and calculated the errors in determining each amino acid using this composition. The pre-hydrolysate was more accurately analyzed than the protein, which was hydrolyzed by the participants. The overall average errors were 6.52% (Table III) and 8.36% (Table II), respectively
TABLE III. Correlation of error with technique and sample load for ABRF-94AAA2 using the "analyzed" composition

                Error (%)                          Sample analyzed (µg)
Technique       average      range         n      average      range        n
Overall         6.5±4.0      1.7-18.1      60     4.7±6.0      0.12-30      59
Post-column     4.8±3.5      1.7-15.6      29     5.3±6.1      0.50-25      29
  Ninhydrin*    4.7±2.9      2.1-14.2      26     4.8±5.0      0.50-24      26
  OPA           1.9±0.2      1.7-2.0       2      2.8±1.4      0.20-0.41    2
  Fluram        15.6                       1      25.0                      1
Pre-column      7.9±4.0      2.6-17.9      31     4.5±6.0      0.3-30       30
  PITC*         7.9±3.9      2.6-13.6      24     4.0±6.1      0.12-30      23
  OPA/FMOC      10.4±6.5     6.5-17.9      3      0.35±0.35    0.12-0.75    3
  AQC           5.5±2.6      3.9-8.5       3      4.3±4.8      0.33-9.6     3
  DABSYL        8.05                       1      2.4                       1

* p = 0.0007 by Mann-Whitney ranking test

Figure 5. Comparison of errors for the intact vs. pre-hydrolyzed sample (axis shown: % Error, pre-hydrolyzed).
(statistically significant by the Mann-Whitney ranking test, p = 0.002). This was primarily due to the ninhydrin sites being significantly more accurate in analyzing the pre-hydrolysate. The preponderance of better sites above the line of equal error (Fig. 5) reflects smaller errors in analyzing the pre-hydrolysate. This suggests that the hydrolysis procedure does indeed contribute a major part of the variation in amino acid analysis, with the caveat that, on average, a 2-fold larger amount of sample was analyzed in the case of the pre-hydrolyzed sample (Tables II and III). There were, however, some sites that still experienced difficulties in analyzing this hydrolysate. Although some of these sites then did much better at analyzing their own hydrolysates (suggesting incidental failures), others had large errors in both samples, which would point to procedural problems at those sites. This underlines a point that is not appreciated widely enough, namely that amino acid analysis is a deceptively complex process and potentially provides many sources of analytical difficulty.
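The Mann-Whitney ranking test used for these comparisons can be sketched in a few lines. The version below is an illustration under stated assumptions, not the calculation performed in the study: it uses average ranks for ties, a normal approximation for the two-sided p-value with no tie or continuity correction, and hypothetical error values; it will not reproduce the p-values quoted in the text.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of ranks start+1 .. end+1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Return (U for sample_a, two-sided p) using a normal approximation."""
    n_a, n_b = len(sample_a), len(sample_b)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)  # no tie correction
    z = (u_a - mean_u) / sd_u
    return u_a, math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Hypothetical average % errors for two groups of sites.
    pre_hydrolyzed = [2.1, 3.4, 4.0, 4.7, 5.2, 6.1]
    self_hydrolyzed = [4.5, 6.8, 7.9, 8.3, 9.6, 12.4]
    print(mann_whitney_u(pre_hydrolyzed, self_hydrolyzed))
```

For small groups an exact test is preferable to the normal approximation used here; a statistics library would normally be used in practice.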
D. Glucosamine
We added 2 equivalents of glucosamine to the protein used to prepare the pre-hydrolyzed sample to test the general capability of amino acid analysis to alert the analyst to the presence of amino sugars. A short review of amino sugar analyses was provided to participants. Only 8 sites in total identified glucosamine or an amino sugar (3 ninhydrin, 2 OPA, 2 AQC and 1 PTC site) in an amount of about 1 equivalent (2.3 nmol). Three ninhydrin and three PTC sites reported the presence of an unknown of about the correct order of magnitude. Although PTC-Ser and PTC-glucosamine may co-elute, no influence on the levels of PTC-Ser determination was discerned for those facilities doing PTC analyses.
E. Analysis of cystine
As in the past, some facilities delivered successful analyses with each of the cystine analytical methods (Table IV). Thus performic acid oxidation (PAO), still the most popular method, provided highly accurate analyses but also many bad analyses. Interestingly, direct determination as cystine was as successful as PAO, with a third of the values within 10% of the expected value. The most successful method was that using disulfide exchange, 6/9 sites being accurate within 15%. The pre-hydrolyzed sample, prepared by conventional HCl digestion, contained cystine as such, and a number of sites obtained values close to the theoretical value. The distribution of cystine values found by all the sites was very asymmetric. Clusters at -50% error and some very high values suggest that some of these errors are perhaps due to a misunderstanding of the terminology used in analysis of cystine and cysteine. It should be noted that Cys/2 (half-cystine), when reported as such, has double the molar amount that would be reported for cystine. Commercial standards usually contain equimolar amounts of half-cystine and the other amino acids or, stated in terms of cystine, half the amount.
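The factor-of-two relationship between cystine and half-cystine, and how a units mix-up produces the -50% cluster noted above, can be illustrated with a short sketch. The numbers are hypothetical (8 half-cystine residues per mole expected) and the function names are illustrative.

```python
def cystine_to_half_cystine(cystine_amount: float) -> float:
    """One cystine corresponds to two half-cystine (Cys/2) residues."""
    return 2.0 * cystine_amount


def percent_error(reported: float, expected: float) -> float:
    """Signed % Error = 100 x (reported - expected) / expected."""
    return 100.0 * (reported - expected) / expected


if __name__ == "__main__":
    expected_half_cystine = 8.0   # hypothetical residues/mole, as Cys/2
    measured_cystine = 4.0        # the same material quantified as cystine

    # Reporting the cystine figure where Cys/2 is expected gives -50%.
    print(percent_error(measured_cystine, expected_half_cystine))
    # Converting first removes the apparent error.
    print(percent_error(cystine_to_half_cystine(measured_cystine), expected_half_cystine))
    # Doubling a value that was already Cys/2 gives +100%.
    print(percent_error(cystine_to_half_cystine(8.0), expected_half_cystine))
```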
Table IV. Summary of errors in cystine analyses of ABRF-94AAA

                                                    Number of sites [% of sites]
Sample/Method                 Average Error (%)     total    <|10%|      <|15%|
ABRF-94AAA1
  Performic acid oxidation    31.6±26.7             17       5 [29%]     6 [35%]
  Disulfide exchange          13.3±12.7             9        4 [44%]     6 [67%]
  DMSO/HCl oxidation          26.0±23.9             7        3 [43%]     4 [57%]
  Direct as Cys/2             34.1±33.0             19       6 [32%]     8 [42%]
ABRF-94AAA2
  Direct as Cys/2             33.4±25.5             52       15 [29%]    16 [31%]
F. Histidine
The larger relative amounts of histidine in this study allow a better comparison of analytical capabilities for this amino acid than previous studies. The ninhydrin values were usually positively biased in the protein analysis (+4.2%), while the PTC values were negatively biased (-6.8%). Errors >30% were seen for 4 ninhydrin sites (+208, +48, +43 and -73%). These are probably due to buffer-change problems. The low PTC values may be due to heavy metal ion contamination, which can be alleviated by including EDTA in the workup and chromatography. Interestingly, there is no such bias in the hydrolysate analyses (0.12, -1.4%), suggesting that such proposed metal contamination may be introduced by some facilities during hydrolysis.
IV. SUMMARY AND CONCLUSIONS
This study was aimed primarily at discriminating between hydrolysis and all other aspects of amino acid analysis as potential sources of error. Two samples were distributed, one pre-hydrolyzed and one to be hydrolyzed by the participants. Their absolute yields were determined with very similar accuracy and standard deviations, illustrating that participants did not lose material during hydrolysis. When participants hydrolyzed their own samples, there was no difference in variation between the pre- and post-column methods (Table II). The pre-hydrolyzed sample was analyzed with less variation by both methods as compared to the protein sample, but the ninhydrin sites were significantly more accurate than the PTC sites (Table III). The average errors in determining individual amino acids, however, were significantly different for the two samples, with smaller errors in the pre-hydrolyzed sample (Fig. 5). Although on average different amounts of sample were used to analyze the two samples, hydrolysis conditions clearly appear to represent a major source of error in amino acid analysis, and laboratories must therefore pay great attention to their hydrolysis procedures. Cysteine analyses showed a continuing trend of successful use of disulfide exchange reagents. Performic acid oxidation, although still successfully used, overall fared only as well as the simple direct analysis of cystine. Direct cystine analysis of the pre-hydrolysate was as successful as that of the protein sample, suggesting that chromatography and derivatization, and not hydrolysis, are the most important factors in successful analysis of cystine by this simple methodology. The accuracy of analysis of ribonuclease by the best sites of this study (Table V) can be compared to an early analysis by ion-exchange/ninhydrin [9] on a single hydrolysate, which has as good an error (4.4%). However, that analysis used mg quantities of material as compared to the µg quantities used at present.
Finally, it is clear that excellent analyses can be and are done by any of the available technologies, with proper concern for the many pitfalls peculiar to each method.
Table V. Best analysis of ABRF-94AAA by pre- and post-column methods

                         ABRF-94AAA1                                  ABRF-94AAA2
site/technique:          #28/ninhydrin   Theor.‡   #4/PITC        Analyzed†   #41/OPA      Theor.‡   #37/PITC
                         (rank: 1)*                (rank: 5)*                 (rank: 1)*             (rank: 9)*
AA
Ala                      12.68           12        13.14          12.49       12.07        12        12.14
Arg                      5.60            5         4.51           4.04        4.09         4         4.14
Asp                      16.81           16        17.21          15.73       15.26        15        16.00
Cys                      8.68            8         12.74          7.86        6.28         8         3.59
Glu                      13.3            12        12.71          11.82       12.52        12        12.66
Gly                      3.35            3         3.25           3.52        3.50         3         3.53
His                      5.83            6         6.30           3.79        3.77         4         3.83
Ile                      2.98            4         3.65           2.05        2.16         3         2.15
Leu                      3.96            4         3.79           2.37        2.38         2         2.40
Lys                      10.83           10        11.33          10.49       10.34        10        10.68
Met                      3.84            4         5.02           3.86        3.80         4         3.97
Phe                      4.18            4         4.55           3.04        3.08         3         2.84
Pro                      5.09            5         5.79           4.54        4.50         4         4.44
Ser                      16.14           16        15.62          13.68       13.71        15        13.19
Thr                      10.48           10        10.66          9.77        9.77         10        9.30
Tyr                      8.3             8         7.9            5.73        5.63         6         5.71
Val                      11.09           11        11.64          8.72        8.68         9         8.47

Amt. hydrolyzed (µg)     10.0                      9.0                        -
Amt. analyzed (µg)       2.41                      0.90                       3.75                   12.0
Total Yield (nmol)       2.50                      2.47                       2.19                   2.17
Error (%, excl. Cys)     4.53                      6.78                       1.73                   2.55
† Residues/mole, analyzed composition, see text for details. ‡ Residues/mole of peptide. * These are the best pre- and post-column sites for the two samples; the overall rank includes the full set of sites.
Acknowledgments
We would like to thank Dr. Y. Bao (U. Virginia) for maintaining the anonymity of the participants, W. Brome (Harvard) for expert assistance in the preparation and distribution of the sample, and all the ABRF facilities who have taken part in this project. This work was supported in part by NSF grant DIR 9003100 (to J.W. Crabb) on behalf of the ABRF.
References
1. Niece, R.L., Williams, K.R., Wadsworth, C.L., Elliot, J., Stone, K.L., McMurray, W.J., Fowler, A., Atherton, D., Kutny, R., and Smith, A. (1989) in "Techniques in Protein Chemistry" (T.E. Hugli, ed.) Academic Press, San Diego, pp 89-101.
2. Crabb, J.W., Ericsson, L.H., Atherton, D., Smith, A.J. and Kutny, R. (1990) in "Current Research in Protein Chemistry" (J.J. Villafranca, ed.) Academic Press, San Diego, pp 49-61.
3. Ericsson, L.H., Atherton, D., Kutny, R., Smith, A.J. and Crabb, J.W. (1991) in "Methods of Protein Sequence Analysis 1990" (H. Jörnvall and J.-O. Höög, eds.) Birkhäuser Verlag, Basel.
4. Tarr, G.E., Paxton, R.J., Pan, Y.-C.E., Ericsson, L.H., and Crabb, J.W. (1991) in "Techniques in Protein Chemistry II" Academic Press, pp 139-150.
5. Strydom, D.J., Tarr, G.E., Pan, Y.-C.E., and Paxton, R.J. (1992) in "Techniques in Protein Chemistry III" (R.H. Angeletti, ed.) Academic Press, San Diego, pp 261-274.
6. Strydom, D.J., Andersen, T.T., Apostol, I., Fox, J.W., Paxton, R.J., and Crabb, J.W. (1993) in "Techniques in Protein Chemistry IV" (R.H. Angeletti, ed.) Academic Press, San Diego, pp 279-288.
7. Yüksel, K.Ü., Andersen, T.T., Apostol, I., Fox, J.W., Crabb, J.W., Paxton, R.J., and Strydom, D.J. (1994) in "Techniques in Protein Chemistry V" (J.W. Crabb, ed.) Academic Press, San Diego, pp 231-240.
8. Strydom, D.J., Fett, J.W., Lobb, R.R., Alderman, E.M., Bethune, J.L., Riordan, J.F., and Vallee, B.L. (1985) Biochemistry 24, 5486-5494.
9. Moore, S. & Stein, W.H. (1963) Methods Enzymol. 6, 819-831.
A New Reagent for Cleaving at Cystine Residues C. Mitchell, L. Hinman, L. Miller, and P.C. Andrews Dept. of Biochem, Univ. of Michigan, Ann Arbor, MI and American Cyanamid, Med. Research Div., Pearl River, NY
I. Introduction
Chemical cleavage reagents have proven to be important tools for generating internal protein sequence data because the general physical properties of the protein do not usually have a major effect on the success of the cleavage process. Two of the most successful chemical cleavage methods, cyanogen bromide (1) and BNPS/skatole (2), target the relatively rare residues Met and Trp. The advantage of targeting rare residues is that the fragments tend to be larger than those produced by most enzymatic digests. This allows long stretches of contiguous sequence to be determined.

Cysteine and cystine are attractive targets for chemical cleavage because they occur relatively infrequently in most proteins. The large fragments likely to result can be separated by electrophoresis and electroblotted to PVDF membranes (3). However, until now, no reagent was known that in a simple, one-step reaction efficiently cleaved at cystine or cysteine and produced fragments with free amino termini amenable to protein sequencing.

This is a preliminary report that a nucleophilic tertiary organophosphine [N,N-diethylaminopropyl-bis-(3-hydroxypropyl) phosphine], shown in Fig. 1, cleaves at cystine residues, producing sequenceable amino termini. A number of proteins have been cleaved with this reagent, the resulting fragments separated by gel electrophoresis, electroblotted to PVDF, and subjected to Edman degradation. The results suggest a high degree of specificity for cleavage after cystine, although apparent cleavage after cysteine has also been observed.
Figure 1. Structure of the protein cleavage reagent N,N-diethylaminopropyl-bis-(3-hydroxypropyl) phosphine.

TECHNIQUES IN PROTEIN CHEMISTRY VI Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.
II. Materials and Methods
N,N-diethylaminopropyl-bis-hydroxypropyl phosphine was synthesized at American Cyanamid using standard procedures (4). Ammonium bicarbonate and CAPS buffers were from Sigma. Isopropyl alcohol was from Fisher Scientific. Acetic acid and methanol were from Mallinckrodt. Acetone was purchased from Sigma. Precast tricine gels, 10X running buffer stock, and 2X sample buffer were from Novex. Coomassie Blue was purchased from BioRad. Yeast alcohol dehydrogenase, bovine β-lactoglobulin, hen egg ovalbumin, lysozyme, fetal calf serum fetuin, human urodilatin, and other proteins and peptides used in this study were purchased from Sigma. Trifluoroacetic acid, ProBlott (PVDF) membranes, and protein sequencer models 473, 477, and 470 were from Applied Biosystems. Matrix assisted laser desorption mass spectrometry was done on a Vestec-2000 LaserTec Research linear time-of-flight mass spectrometer.

Ammonium bicarbonate stock (100 mM) was made and stored at 4°C. Stock solutions of all proteins were prepared in 0.1% TFA, 50% acetonitrile and stored at -20°C until use. Aliquots of the stocks, containing 10 to 100 µg of protein, were dried in a centrifugal dryer prior to digestion. To 5 mg of N,N-diethylaminopropyl-bis-(3-hydroxypropyl) phosphine, 1 ml of 50% isopropyl alcohol/50 mM ammonium bicarbonate was added. The samples were then resuspended in 20 µl of reagent solution per 10 µg of protein, heated at 80°C for 2 hours, and dried to remove volatile components.

Precipitation of the cleavage products was accomplished by resuspending the dried samples in 50 µl of 0.1% TFA. Cold acetone (450 µl) was then added, and the samples were incubated at -20°C for 3 hours, then centrifuged, and the supernatant was removed. The pellets were redissolved in 0.1% TFA/acetonitrile and lyophilized prior to dissolution in sample buffer.
The cleavage products were resolved on 10-20% or 16% Tricine gels at 60 mA constant current until the dye marker reached the bottom, then transferred to PVDF using 10 mM CAPS pH 11/10% methanol in a semi-dry blotter with limits of 10 V and 180 mA for 1.2 hours. Membranes were stained in 0.1% Coomassie Blue/40% methanol/1% acetic acid and destained in 50% methanol/water. Sequencing was done on ABI protein sequencer models 470, 477, or 473 using Blott Cartridges and standard cycles and conditions for sequencing from PVDF membranes. Data acquisition and analysis were done using ABI 610 software.

Mass spectrometry (matrix assisted laser desorption) of the cleavage products was done by dissolving the samples in 0.1% TFA, then mixing 1 µl of sample solution with 1 µl of internal standard solution (bovine insulin at 0.5 pmole/µl) and 1 µl of saturated matrix solution (100 mM sinapinic acid). The mixture was loaded on the probe tip, air dried, and placed in the mass spectrometer.
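The digestion and precipitation volumes given in the Methods scale simply with the amount of protein. The following is a minimal sketch of that arithmetic, not part of the original protocol; the function name `digest_volumes` and the dictionary keys are hypothetical, and the fixed 50 µl/450 µl precipitation volumes are taken directly from the text.

```python
def digest_volumes(protein_ug: float) -> dict:
    """Scale the cleavage recipe in the text to a given protein amount (µg)."""
    reagent_mg_per_ml = 5.0      # 5 mg phosphine dissolved in 1 ml of solvent
    reagent_ul_per_10_ug = 20.0  # 20 µl reagent solution per 10 µg protein

    reagent_ul = reagent_ul_per_10_ug * protein_ug / 10.0
    # mg/ml is numerically equal to µg/µl
    phosphine_ug = reagent_mg_per_ml * reagent_ul

    tfa_ul = 50.0       # 0.1% TFA used to resuspend the dried digest
    acetone_ul = 450.0  # cold acetone added for precipitation
    return {
        "reagent_solution_ul": reagent_ul,
        "phosphine_ug": phosphine_ug,
        "tfa_ul": tfa_ul,
        "acetone_ul": acetone_ul,
        "acetone_fraction": acetone_ul / (tfa_ul + acetone_ul),
    }


if __name__ == "__main__":
    for amount in (10, 30, 100):
        print(amount, digest_volumes(amount))
```

For a 30 µg digest, such as the ovalbumin and fetuin digests, this gives 60 µl of reagent solution, and the precipitation step is carried out in 90% acetone by volume.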
III. Results and Discussion
SDS gel profiles of typical cleavage products for five standard proteins are shown in Figure 2. Five microliters of cystine-cleaved alcohol dehydrogenase, β-lactoglobulin, and lysozyme were loaded in each lane, representing 7 µg of undigested protein. The digests of ovalbumin and fetuin were done using 30 µg of each protein, loading 100% of the cleavage products onto the gel. The gel was electroblotted and stained using Coomassie Blue. Fetuin exhibited almost complete cleavage to smaller fragments, as did bovine serum albumin (data not shown). Other proteins exhibited varying degrees of cleavage; approximately 50% of most proteins studied usually remained undigested. The hydrophobicity of the products frequently results in precipitation during the course of the incubation unless isopropanol is present. The results are also greatly improved by a precipitation step to remove excess reagent prior to gel electrophoresis. Initial experiments suggest that the cleavage reaction is relatively independent of pH in the range 3-9 (data not shown). It also appeared to be independent of the nature of the solvent.

While all the major and minor fragment bands sequenced provided N-terminal sequence compatible with cleavage after cystine residues, the sizes of a few of the fragments appeared to be smaller than expected by SDS gel electrophoresis, suggesting that they might be C-terminally truncated. Any anomalous cleavage might also result from the relatively rigorous conditions of the reaction. Figure 3 shows a map of the β-lactoglobulin sequence along with
Figure 2. SDS gel electrophoresis of the products of partial cystine cleavage for several test proteins. A. molecular weight standards, B. yeast alcohol dehydrogenase, C. β-lactoglobulin, D. hen egg lysozyme, E. ovalbumin, F. fetal calf serum fetuin. Molecular weight standards are indicated by arrows on the left side of the gel and are: bovine serum albumin (66,300), bovine liver glutamate dehydrogenase (55,400), porcine muscle lactate dehydrogenase (36,500), bovine erythrocyte carbonic anhydrase (31,000), soybean trypsin inhibitor (21,500), hen egg lysozyme (14,400), bovine lung aprotinin (6,000), and unresolved bovine pancreatic insulin A and B chains.
Figure 3. Schematic diagram of β-lactoglobulin. The locations of Cys residues (Cys 66, Cys 106, Cys 119, Cys 121, and Cys 160) are indicated by solid bars.

Observed Size   Calculated Mass   Observed Sequence   Probable Fragment
13              13.3/13.5         LIVTQ...            1-119/1-121
12              11.9/11           LIVTQ.../AQKKI      1-106/67-162
5               4.5               LIVTQ.../LVRTP      122-160/1-66 frag.
2               NA                LIVTQ               1-66 fragment
residue numbers, observed sizes by SDS gel electrophoresis, and masses of the expected fragments. Also shown are the results of sequencing each β-lactoglobulin fragment and, when possible, the probable identity of each fragment. Two of the smaller bands provided N-terminal sequence.

Table I. Cystine Cleavage Fragments from Hen Ovalbumin and Yeast Alcohol Dehydrogenase

Ovalbumin
Observed Size (Kd)   Calculated   Sequence        Probable   Preceding
SDS-PAGE             Mass (Kd)    Observed        Fragment   Residue
32                   34.7         GTSVN...        74-385     Cys-73
30                   29.3         VKELY...        121-385    Cys-120
10                   12.3         FDVFK...        12-120     Cys-11
7                    6            FDVFK...        12-73      Cys-11
4                    4.7          PIAIMSALAM...   31-73      Cys-30
4                    5.4          GTSVNVHSSL...   74-120     Cys-73

Yeast Alcohol Dehydrogenase
Observed Size (Kd)   Calculated   Sequence        Probable   Preceding
SDS-PAGE             Mass (Kd)    Observed        Fragment   Residue
18                   19.9         AGITV...        154-347    Cys-153
12                   12.5         AGITV...        154-276    Cys-153
6                    7.8          SDVFN...        277-347    Cys-276
4                    4.8          SIPET...        1-43       N-terminus
                     5.8          HTDLH...        44-97      Cys-43
2                    NA           Mixture         NA         NA

NA, not applicable.

Table I summarizes the sequencing results from alcohol dehydrogenase and the N-terminally blocked glycoprotein ovalbumin. The probable identity of these fragments is indicated. All fragments identified for both proteins by N-terminal sequencing corresponded to cleavage after cystine. The data from ovalbumin are particularly interesting. The structure of ovalbumin is well characterized (5) and contains only one disulfide bond, between Cys 73 and Cys 120, yet sequence was obtained following Cys 11 and Cys 30. The bands for these fragments appeared more slowly than the others and were fainter in appearance. It is possible that under the conditions of the digestion, some free thiols formed disulfides and were subsequently cleaved.

It is clear from the protein cleavage data that the reaction frequently results in incomplete cleavage. This does not, however, interfere with its usefulness for structure analysis. Although a broad range of conditions was explored in an attempt to improve the cleavage efficiency for all proteins, the only parameter which had a significant effect was the presence of 50% organic solvent to prevent precipitation of the cleavage products. Some cystine residues appear to be particularly resistant to cleavage, although no particular pattern is yet obvious from the data.

Figure 4 shows the MALDI mass spectrum of a partial digest of the small peptide urodilatin, which contains a single cystine. Clearly visible in the mass spectrum are undigested peptide, undigested peptide minus one or both sulfurs, and partial cleavage fragments with the C-terminal Cys completely removed and with or without the sulfur on the remaining Cys. Reagent adducts are apparent in the higher mass shoulders adjacent to these peaks. Mass spectral examination of
Figure 4. MALDI mass spectrum of a partial cystine cleavage of urodilatin. Peak labels in the original figure include the undigested peptide minus sulfur (-S), the undigested reduced peptide, reagent adducts, and the internal standard. The sequence of urodilatin is TAPRSLRRSSCFGGRMDRIGAQSGLGCNSFRY.
the products of urodilatin cleavage and of other small model peptides indicates that successful peptide bond cleavage is accompanied by loss of the Cys residue from the C-terminus of the product peptide. It is also clear that the reaction is accompanied by loss of sulfur from the Cys residues. A small amount of the sulfur loss may be due to lanthionine formation, although control runs in the absence of reagent exhibit relatively little desulfurization.

The mechanism of disulfide reduction by phosphines is hypothesized to involve a stable intermediate containing a sulfur-phosphorus bond (6). Beta elimination would yield the phosphine sulfide and dehydroalanine. The formation of relatively stable adducts between cystine-containing peptides and the reagent was confirmed by mass spectrometry for several peptides, with the major adduct representing one reagent molecule per cystine residue. The formation of dehydroalanine would also increase the likelihood of lanthionine formation, which would be resistant to peptide bond cleavage. It is likely that dehydroalanine formation itself is responsible for the final peptide bond cleavage through migration of the double bond and subsequent hydrolysis.

IV. Conclusions
Cleavage with N,N-diethylaminopropyl-bis-hydroxypropyl phosphine appears to be quite specific for cystine or cysteine. The results from sequencing runs indicate that some nonspecific cuts might be occurring, since some fragments appear to be smaller than expected, but the existence of nonspecific cuts has yet to be confirmed. The amino-terminal sequences of all fragments isolated represented the protein amino terminus or cleavages after cystine or cysteine. Useful internal sequences and useful structural information can be gathered from this technique. However, like any digestion procedure, there is no guarantee that useful information will be obtained in every case.
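When reading a spectrum such as the urodilatin digest, it helps to have the expected average masses of the intact, reduced, and desulfurized species at hand. The sketch below computes them from the sequence given in the Figure 4 caption. It is an illustration only: the residue masses are standard average values, the names `average_mass` and `URODILATIN` are hypothetical, and treating "minus one sulfur" as a simple loss of one S atom from the disulfide form is an assumption rather than something established in the text.

```python
# Standard average residue masses in Da (amino acid minus water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153
HYDROGEN = 1.00794
SULFUR = 32.065


def average_mass(sequence: str, disulfides: int = 0) -> float:
    """Average mass of a linear peptide; each disulfide removes two H atoms."""
    residues = sum(RESIDUE_MASS[aa] for aa in sequence)
    return residues + WATER - 2 * HYDROGEN * disulfides


URODILATIN = "TAPRSLRRSSCFGGRMDRIGAQSGLGCNSFRY"

intact = average_mass(URODILATIN, disulfides=1)   # single cystine present
reduced = average_mass(URODILATIN, disulfides=0)  # both Cys as free thiols
minus_one_sulfur = intact - SULFUR                # assumption: loss of one S atom

if __name__ == "__main__":
    print(f"intact (disulfide): {intact:.1f} Da")
    print(f"reduced:            {reduced:.1f} Da")
    print(f"intact minus S:     {minus_one_sulfur:.1f} Da")
```

The reduced and disulfide forms differ by only about 2 Da, which is why sulfur loss (about 32 Da per sulfur) is far easier to distinguish on a linear time-of-flight instrument than reduction alone.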
The sequences obtained for chicken egg lysozyme (data not shown) gave amino terminal sequence for every band. Given that lysozyme is a small protein with four disulfide bonds and that all but two of the eight cysteine residues are located in the C-terminal half of the molecule, this result is not surprising.

There has also been some variability in recovery of peptide fragments from digestion to digestion. Most of this appears to be due to the hydrophobicity of the fragments, particularly if the reagent was not separated from the cleavage products by precipitation. Digests after precipitation appear to be much more soluble in SDS-PAGE sample buffer. The yield of cleavage products and the presence of species that appear to be the result of side reactions are other areas of concern. Yields of cleavage products for the proteins studied have ranged from 20% to 90%, with an average yield of about 50%. More study is needed to verify the cleavage mechanism and to optimize cleavage conditions.

Many of the difficulties with this new reagent are common to chemical cleavage reagents in general. The hydrophobicity of the products and the presence of by-products are reminiscent of cyanogen bromide cleavage reactions. Indeed, if the current limitations of the method can be overcome, this new reagent may prove to be as useful a tool for protein cleavage as cyanogen bromide has been.
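Because cleavage is partial, a digest can contain any fragment that starts at the protein N-terminus or immediately after a Cys and ends at a later Cys or at the C-terminus. The sketch below enumerates these candidates for β-lactoglobulin using the Cys positions and the chain length of 162 implied by Figure 3. It follows the residue numbering used in Figure 3 and Table I, in which a fragment is numbered through the Cys position even though the text reports that the Cys itself is lost. The function name and the 0.110 kDa average residue mass are assumptions for illustration, so the masses are rough estimates only.

```python
AVG_RESIDUE_KDA = 0.110  # assumption: rough average residue mass


def partial_cleavage_fragments(length: int, cys_positions: list[int]) -> list[tuple[int, int]]:
    """List (start, end) residue ranges possible after partial cleavage at Cys."""
    starts = [1] + [c + 1 for c in cys_positions if c < length]
    ends = sorted(set(cys_positions) | {length})
    return sorted({(s, e) for s in starts for e in ends if s <= e})


def approx_mass_kda(start: int, end: int) -> float:
    """Crude mass estimate from residue count alone."""
    return (end - start + 1) * AVG_RESIDUE_KDA


if __name__ == "__main__":
    beta_lg = partial_cleavage_fragments(162, [66, 106, 119, 121, 160])
    for start, end in beta_lg:
        print(f"{start:>3}-{end:<3}  ~{approx_mass_kda(start, end):4.1f} kDa")
```

All of the probable fragments listed in Figure 3 (for example 1-119, 67-162, and 122-160) appear in this candidate list, so an observed N-terminal sequence and an approximate gel size are usually enough to narrow the assignment to one or two ranges.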
Acknowledgments
The authors wish to acknowledge Dr. R. Ogorzalek Loo and B.-J. Shyong for the mass spectra, S. Williams for amino acid analyses, and J. Tropea for assisting with protein sequencing. J. E. Hoover is thanked for helpful discussions on catalytic mechanisms.

References
1.) Gross, E. (1967) Methods in Enzymology 11, 238-241. Academic Press, New York.
2.) Fontana, A. (1972) Methods in Enzymology 25B, 419-426. Academic Press, New York.
3.) Matsudaira, P. (1987) J. Biol. Chem. 262, 10035-10039.
4.) Hinman, L. and Miller, L. (1989) Abstract # 286 to the Protein Society Meeting (San Diego).
5.) Stein, P.E., Leslie, A.G.W., Finch, J.T., and Carrell, R.W. (1991) J. Mol. Biol. 221, 941-959.
6.) Ruegg, U.T., and Rudinger, J. (1977) Methods in Enzymology (Hirs, C.H.W. and Timasheff, S.N., Eds.) 47, 111-116. Academic Press, New York.
Protein Sequence Analysis Using Microbore PTH Separations Michael F. Rohde, Christi Clogston, Lee Anne Merewether, and Patricia Derby Dept. of Protein Structure, Amgen Thousand Oaks, CA
Kerry D. Nugent Michrom BioResources Auburn, CA
I. Introduction
The need for higher sensitivity in protein sequencing is a pervasive one. The last ten years have seen a number of significant reductions in the amount of peptide or protein required to obtain protein sequence, by as much as six orders of magnitude, from micromoles to picomoles. Despite the decrease in sample requirements, the need for higher sensitivity has not gone away. If anything, the ability to obtain data with decreased sample amounts has fueled a more rabid quest for even better performance. One of the inevitable consequences of telling a researcher that you can obtain sequence information with a quantity "x" of their protein is that they will only be able to bring you one-tenth of that amount.

Many excellent discussions have been provided for optimization of the currently available instrumentation (1-4), and aspects of these methods are employed by researchers who can obtain data in the low picomole to high femtomole range. Scattered initial reports have been made of methods that would modify either coupling or conversion chemistries to improve the detector response of the analyzed amino acid derivatives (5-8), but none of these approaches is in widespread use at this time.

One area for increases in sensitivity that has not been widely exploited is the PTH detection system. Most commercial instruments use HPLC columns with internal cross sections of 2.1 mm, allowing flow rates of 200-350 p
This report will describe the use of PTH chromatography on 1.0 mm columns, coupled to a protein sequencer. Some of the considerations, problems and solutions will be presented for this configuration.
II. Materials and Methods
A. HPLC Description
The HPLC system used for this study comes from Michrom BioResources (Auburn, CA). The system is capable of delivering reproducible gradients at the 50 µL/min flow rate used in this study. The system operates with two micro piston HPLC pumps. Prior to mixing, solvents from both pumps pass through preheaters designed to allow running the columns above ambient temperature. Initial studies found that this configuration did not provide adequate control of the separation column temperature, so a column jacket was fabricated to maintain the column at 40°C.

The standard injection valve for this HPLC is a ten port valve, configured to allow two injection loops, identified as 'front' and 'back' loops. Pump flow to the column passes through one loop or the other, dependent on whether the valve is in the 'load' or 'inject' position. For PTH separations, the 'front' loop was 100 µL and was used for sample introduction. The sample loop was coupled to a 5 cm length of 0.005 in ID PEEK tubing. This serves as a flow restrictor to provide better control over sample loop loading (11). The outlet to waste from the injector valve was fitted with a 5 cm length of 0.01 mm ID PEEK tubing to allow observation of the loop filling process. This length of tubing allows observation of droplet formation within 1 µL of the point when the sample loop is completely filled.

The 'back' loop was 250 µL and was used to wash the column at the end of each PTH separation. This was accomplished by connecting the filling port for that loop to a reservoir of acetonitrile, kept under low helium pressure such that the loop would be filled during the separation portion of the gradient. When the separation was finished, the valve switched back to the injection position, allowing the 250 µL slug of acetonitrile to pass through the column.
Also at this time, the flow rate was increased to 100 µL/min for 1 min and the gradient composition behind the acetonitrile slug was returned to initial conditions. This program is summarized in Table I.

Table I. Microbore PTH Gradient

Time (min)   %B   Flow (µL/min)   Event
0.00         4    50              Auto zero, inject
10.00        90   50
16.00        90   50
18.00        4    50
21.95        4    50
22.00        4    50
23.00        4    100
24.00        4    100
25.00        4    50
30.00        4    50

Events listed after the initial injection, in order: Inject 250 µL CH3CN; Stop data collection.
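The gradient program in Table I can be held as a small time table and queried for the solvent composition or flow at any point in the run. The sketch below is illustrative only: it assumes linear ramps between the listed time points (the usual behavior of gradient controllers, but not stated in the text), and the names `GRADIENT`, `percent_b_at`, `flow_at`, and `total_volume_ul` are hypothetical.

```python
from bisect import bisect_right

# (time in min, %B, flow in µL/min), transcribed from Table I.
GRADIENT = [
    (0.00, 4, 50), (10.00, 90, 50), (16.00, 90, 50), (18.00, 4, 50),
    (21.95, 4, 50), (22.00, 4, 50), (23.00, 4, 100), (24.00, 4, 100),
    (25.00, 4, 50), (30.00, 4, 50),
]


def _interpolate(t: float, column: int) -> float:
    """Linear interpolation of one table column at time t (assumed ramp shape)."""
    times = [row[0] for row in GRADIENT]
    if not times[0] <= t <= times[-1]:
        raise ValueError("time outside the gradient program")
    i = min(bisect_right(times, t), len(times) - 1)
    t0, t1 = times[i - 1], times[i]
    y0, y1 = GRADIENT[i - 1][column], GRADIENT[i][column]
    return y0 + (y1 - y0) * (t - t0) / (t1 - t0)


def percent_b_at(t: float) -> float:
    return _interpolate(t, 1)


def flow_at(t: float) -> float:
    return _interpolate(t, 2)


def total_volume_ul() -> float:
    """Solvent delivered per run, integrating flow with the trapezoid rule."""
    return sum(
        (t1 - t0) * (f0 + f1) / 2.0
        for (t0, _, f0), (t1, _, f1) in zip(GRADIENT, GRADIENT[1:])
    )


if __name__ == "__main__":
    print(percent_b_at(5.0), flow_at(23.5), total_volume_ul())
```

Under these assumptions, each 30 min cycle consumes on the order of 1.6 mL of mobile phase, a useful number when planning solvent reservoirs for long sequencing runs.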
Separation columns and buffer concentrate are available from Michrom. The column used for the PTH analysis is a 1 x 150 mm column packed with 3 micron particle size, 100 Å pore size Reliasil C18. The buffer system for PTH separation is an acetate buffer with ion-pairing agent; a shallow gradient running from buffered acetonitrile to buffered n-propanol was used for the chromatograms depicted here. The standard flow cell for the Michrom instrument has a path length of 2 mm. Modified flow cells of 4 mm, 5 mm, and 6 mm were evaluated to increase signal response. The 5 mm flow cell was selected as the best compromise between increased signal and a tolerable level of baseline drift due to the refractive index changes across the gradient. Data were collected with the EZ Chrom data and control system available from Michrom.
B. Sequencer Description
The protein sequencer used for this study was an Applied Biosystems Model 477A, used without the Model 120A PTH analyzer. The sequencer was fitted with an Applied Biosystems Micro cartridge, but all other hardware was as supplied by the manufacturer. Reaction and conversion cycles were modified to accommodate this configuration. Some modified solvents and reagents were used and will be discussed below. Solvents and reagents were obtained from Applied Biosystems, Burdick and Jackson (Baxter), and Hewlett-Packard. All were used without any additional purification procedures.

Solvent and reagent modifications were necessary on the protein sequencer for optimum performance in this configuration. To maximize the amount of PTH sample analyzed, the HPLC is fitted with a 100 µL loop so that two-thirds of the 150 µL reconstituted PTH from the flask would be analyzed. In order to inject this volume onto a 1 mm column, it was necessary to reduce the amount of acetonitrile in solvent S4 to 2%; also, 0.25% acetic acid was added to position PTH-Asp and PTH-Glu. At the level of 1 pmole standards, all PTH amino acids were recovered in comparable yield as judged by peak height; however, when sequencing larger sample loads, lag was occasionally noticed for the more hydrophobic amino acids. To overcome this, flask washes were done using 20% acetonitrile from the X2 position on the 477A.

PTH standards were initially made from the normal ABI stock solutions, diluted with ABI R5 acetonitrile. Dilutions were made so that a 75 µL block loop load in the program would deliver 1.5 pmoles to the flask. In this configuration, it was noted that PTH-Asp eluted as a shoulder on a very large non-PTH peak, presumably related to the DTT in R5. Furthermore, the diluted PTH standards on the instrument needed to be replaced often, due to decomposition.
In our experience, the PTH standards obtained from Hewlett-Packard were more stable on the instrument than the conventional ABI standards prepared for the conventional configuration. Thus, we began using dilutions of Hewlett-Packard PTH standards, initially diluted with R5 to 1.0 pmole in the flask. Later, it was found that using ABI solvent B eliminated any artifact peak in the region of PTH-Asp, but introduced a new artifact peak between PTH-Gln
and PTH-Gly. Use of Burdick and Jackson acetonitrile to dilute Hewlett-Packard standards eliminated these interfering peaks.

For optimal reproducibility of the PTH separations, it was usually necessary to run either a blank before the standard or two standards before analyzing the first cycle PTH-AA. In order to do this under control of the protein sequencer, it was necessary to overcome a quirk of the Applied Biosystems software controlling the 477A. Running a standard does not involve transfer from the cartridge to the flask. The 900A software would not allow two "non-transfer" cycles to be run without resetting the computer. This was solved by adding non-functional transfer commands to the beginning of the second standard cycles. One-second steps of Prep Transfer and End Transfer were added at the start of the second Begin reaction cycle, and Ready to Receive at the start of the second Begin conversion cycle. Also, since the Begin cycle has an R3 cleavage prior to coupling, this was eliminated in the Begin-2 reaction cycle. This configuration gives two cycles of pre-coupling and washes prior to commencing actual sequencing and may aid in cleaning up low level samples.

The only changes to the Sequencer Reaction Cycles were the optimization steps recommended by ABI in User Bulletin No. 56. Some sequencer conversion cycle modifications were necessary to accommodate the new HPLC; these are listed in Table II.

Table II. 477A Conversion Cycle for Michrom HPLC

Step  Function            Fxn#  Time  Comment
1     Block Flush         23    6
2     Load S4             12    8
3     Ready to Receive    13    126
4     Block Flush         23    6
5     Argon Dry           22    220
6     Load R4             3     8
7     Argon Dry           22    8
8     Pause               25    350
9     Argon Dry           22    8
10    Pause               25    350
11    Argon Dry           22    300
12    Block Flush         23    6
13    Clear Inj to Waste  14    180   Extended to dry injector loop
14    Load S4             12    9
15    Argon Dry           22    6
16    Load S4             12    9
17    Argon Dry           22    4
18    Pause               25    80
19    Argon Dry           22    6
20    Pause               25    80
21    Load Injector       15    42    Optimized for new HPLC
22    Event A             26    1     Start HPLC
23    Load S4             12    11
24    Argon Dry           22    6
25    Load S4             12    11
26    Argon Dry           22    6
27    Deliver X2          8     30    Flask wash w/ 20% CH3CN
28    Argon Dry           22    6
29    Empty               20    30
Load S4, Load R4, and Ar Dry steps were optimized for this particular instrument. Major differences were implemented as follows. Clear Injector to Waste was set to be 60 seconds more than the time needed to expel the first bubble of gas from the loop. This time is longer than for a Model 120A HPLC due to the restrictor in place in the Michrom; a total of 180 seconds of Ar delivery through the loop cleared it completely so that sample loading was extremely reproducible. The Load Injector step was optimized either by direct observation or by running standard cycles with a range of Load Injector times. The restrictor gave a window of more than 15 seconds between the point of full loading of the loop and introduction of argon at the end of the load. The normal Inject step sends a command only to the 120A, and was replaced by Event A; this output on the back of the 477A was connected to the start contacts of the Michrom. Fraction collection was not employed, and these steps were removed. In place of a flask wash from the S4 bottle, X2 was used to wash the flask. As mentioned above, X2 was 20% acetonitrile.
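The conversion cycle in Table II must fit comfortably alongside the 30 min HPLC program, so it is worth adding up the step times. The sketch below transcribes the function names and times from Table II and totals them overall and per function. It assumes the times are in seconds, which is consistent with the 180 seconds quoted for Clear Injector to Waste; the names `CONVERSION_CYCLE`, `total_seconds`, and `seconds_by_function` are hypothetical.

```python
from collections import defaultdict

# (function, time in seconds), transcribed in step order from Table II.
CONVERSION_CYCLE = [
    ("Block Flush", 6), ("Load S4", 8), ("Ready to Receive", 126),
    ("Block Flush", 6), ("Argon Dry", 220), ("Load R4", 8), ("Argon Dry", 8),
    ("Pause", 350), ("Argon Dry", 8), ("Pause", 350), ("Argon Dry", 300),
    ("Block Flush", 6), ("Clear Inj to Waste", 180), ("Load S4", 9),
    ("Argon Dry", 6), ("Load S4", 9), ("Argon Dry", 4), ("Pause", 80),
    ("Argon Dry", 6), ("Pause", 80), ("Load Injector", 42), ("Event A", 1),
    ("Load S4", 11), ("Argon Dry", 6), ("Load S4", 11), ("Argon Dry", 6),
    ("Deliver X2", 30), ("Argon Dry", 6), ("Empty", 30),
]


def total_seconds(cycle) -> int:
    """Total programmed time of the cycle."""
    return sum(seconds for _, seconds in cycle)


def seconds_by_function(cycle) -> dict:
    """Programmed time grouped by function name."""
    totals = defaultdict(int)
    for name, seconds in cycle:
        totals[name] += seconds
    return dict(totals)


if __name__ == "__main__":
    total = total_seconds(CONVERSION_CYCLE)
    print(f"{len(CONVERSION_CYCLE)} steps, {total} s ({total / 60:.1f} min)")
    for name, seconds in sorted(seconds_by_function(CONVERSION_CYCLE).items()):
        print(f"{name:<20}{seconds:>5} s")
```

On these figures the conversion cycle takes just under 32 minutes, with most of the time spent in the Pause and Argon Dry steps.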
III. Results
A. PTH Separations
All microbore sequencing runs started with two PTH standard runs as described in Methods. Separations were similar for both standards, but the
Figure 1. An aliquot of one picomole of diluted Hewlett-Packard PTH standard amino acids was delivered to the conversion flask, dried, and subjected to conversion with the modified R4 and reconstitution in the modified S4 as described in the text. Two thirds of the converted aliquot was injected onto the microbore HPLC for this analysis.
second standard more closely matched the retention times of subsequent cycles. Figure 1 is the second standard from a recent run. This represents a total of 1.0 pmole of each amino acid delivered to the flask, or 666 femtomoles on the PTH analyzer, when the fraction injected is taken into account. Since the same proportion of reconstituted PTH is analyzed whether standard or sample, all calculations will refer to the amount in the flask.

In Figure 1 we see baseline resolution of all the PTH amino acids in the standard. Roughly equivalent recoveries of all are noted, with the exception of His and Arg, which are somewhat broader in this system. Peak heights are equivalent to those found by manual injection of dilutions from a concentrated stock of PTH-AA. This indicates that no drastic losses were encountered with the use of 2% acetonitrile to redissolve the sample after R4 conversion and drying. Especially noteworthy are the recoveries of PTH-Ser and PTH-Thr, despite the lack of DTT in the modified R4 and S4. One possibility is that the presence of DTT in the X2 wash of the flask between injections is a satisfactory substitute for inclusion in S4 and R4 for sequencing at this level.
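The relationship between the amount in the flask and the amount actually injected follows directly from the loop and reconstitution volumes quoted in the text. A minimal sketch of that arithmetic is given below; the function names are hypothetical, and the default volumes are the 100 µL loop and 150 µL reconstitution volume described in Methods.

```python
def on_column_pmol(flask_pmol: float,
                   loop_ul: float = 100.0,
                   reconstitution_ul: float = 150.0) -> float:
    """Amount injected when a fixed loop samples the reconstituted flask contents."""
    return flask_pmol * loop_ul / reconstitution_ul


def standard_concentration_pmol_per_ul(delivered_pmol: float, load_ul: float) -> float:
    """Concentration of a diluted standard that delivers a given amount per loop load."""
    return delivered_pmol / load_ul


if __name__ == "__main__":
    # 1.0 pmole in the flask corresponds to about 0.667 pmole on the column.
    print(f"{on_column_pmol(1.0) * 1000:.0f} fmol on column")
    # A 75 µL load delivering 1.5 pmoles implies 0.02 pmol/µL (20 fmol/µL).
    print(f"{standard_concentration_pmol_per_ul(1.5, 75.0):.3f} pmol/µL")
```

The same two-thirds factor applies to every sample cycle, which is why amounts can be quoted consistently as the quantity in the flask.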
B. Sample Preparation Considerations
One of the consequences of being able to detect PTH-AA at low levels is the ability to detect other contaminants as well. It has been observed that what may look like a quiet baseline on a 10 pmole scale can have multiple interfering peaks at the 1 pmole or lower level (3). We have already discussed the interference of DTT in S4 and R4 on the cycles, but also find that if the normal R4 from ABI is used for sample transfer from collection tube to the filter, multiple interfering peaks will be detected in the first few cycles. Thus, HPLC grade TFA and water are used to bring the sample to 33% TFA/water just prior to loading (3). Even when this precaution is taken, it has often been observed that loading large volumes of liquid onto the filter will result in high initial backgrounds. For these reasons, most samples loaded on this sequencer are isolated on microbore and smaller (0.3-1.0 mm ID columns) HPLC systems. This minimizes sample adsorptive losses during chromatography, removes as many interfering components as possible, and provides the sample in a small volume for sequencer loading.
C. Comparative Sequencing Runs
The system described above was compared to low level sequencing on two other commercial systems in our laboratory. Several peptide samples were prepared by microbore HPLC isolation of proteolytic digests. Collected peaks were diluted with 25% TFA, and identical aliquots were loaded on the sequencer with the microbore PTH analyzer, a standard ABI 477A/120A and a Hewlett-Packard G1005 Protein Sequencer. This procedure was repeated with five or six peptide samples. Figure 2 compares the results for one peptide on the three sequencers. Each set is displayed from its own data system using the necessary scale to allow cycle to cycle comparisons. This full scale setting was 3.0 mv for the Michrom, 0.9 mv for the 120A and 0.5 mv for the Hewlett-Packard. In general, it was noted that first cycle chromatograms
Figure 2. Comparative sequence analyses for three different systems are presented in this figure. A single peptide fraction was used for all three sequencers. The peptide was collected from a preparative, microbore HPLC run. The collected fraction was brought to 33% in TFA and equal aliquots were spotted on the three sequencing systems, according to the standard procedure suggested by the respective manufacturer. Cycles four, five, and six are presented from each system, printed from the data system for each unit. Chromatograms are scaled to allow cycle to cycle comparisons for each instrument. The top panel is from the Michrom ABI 477A system, the middle panel is from another ABI 477A with its 120A PTH analyzer, and the lower panel is from a Hewlett-Packard G1005A system.
had some interfering peaks, but subsequent cycles for all three systems could make at least some assignments at this level (approximately 1 pmole). The microbore system does have a clear advantage in the ability to call low level sequences with less ambiguity and for more cycles, due to its increased signal-to-noise ratio.
IV. Discussion
What has been presented here is a method for combining 1.0 mm microbore HPLC PTH separations with routine sequencing runs for commercially available instrument systems. The use of the combined system is relatively routine. Even though it does require a bit of additional attention and care to keep running at optimal conditions, the effort is no more than would be spent keeping a conventional system in operation at the same sequencing level. In exchange for the extra effort, one is able to obtain more sequence information than would be possible with the standard systems.

Further optimization of the system would ideally include efforts to clean up the background noise coming from the sequencer and sample preparation. Efforts are already underway to provide a separation buffer that would better resolve the PTH-AA peaks. A bigger challenge may be in flattening the baseline drifts, but this is not limiting when data is examined from the data system rather than a strip chart recorder. Using this system with unknown samples, it has been possible to make assignments below 100 fmoles that were later proven to be correct.

One of the outcomes of being able to obtain sequence data at increasingly lower levels is that new challenges are presented to further minimize losses of peptide samples. As one goes to lower levels in peptide isolation and fraction collection, lower percentage recoveries seem to be the case, even with the careful use of techniques to extract the sample from the micro collection tubes. Such problems may force us to consider alternate sample preparation procedures for low level samples.
References

1. Speicher, D.W. (1989) In "Techniques in Protein Chemistry" (Hugli, T.E., ed.) 24-35.
2. Tempst, P., Link, A.J., Riviere, L.R., Fleming, M., and Elicone, C. (1990) Electrophoresis, 11: 537-553.
3. Erdjument-Bromage, H., Geromanos, S., Chodera, A., and Tempst, P. (1993) In "Techniques in Protein Chemistry IV" (Angeletti, R.H., ed.) 419-426.
4. Atherton, D., Fernandez, J., DeMott, M., Andrews, L., and Mische, S.M. (1993) In "Techniques in Protein Chemistry IV" (Angeletti, R.H., ed.) 409-418.
5. Horn, M.J., Early, S.L., and Magil, S.G. (1989) In "Techniques in Protein Chemistry" (Hugli, T.E., ed.) 51-58.
6. Aebersold, R., Bures, E.J., Namchuk, M., Goghari, M.H., Shushan, B., and Covey, T.C. (1992) Protein Science, 1: 494-503.
7. Farnsworth, V. and Steinberg, K. (1993) Analytical Chemistry 215: 190-199.
8. Stolowitz, M.L., Kim, C.-S., Marsh, S.R., and Hood, L. (1993) In "Methods in Protein Sequence Analysis" (Imahori, K. and Sakiyama, F., eds.) 37-44.
9. Blacher, R.W. and Wieser, J.H. (1993) In "Techniques in Protein Chemistry IV" (Angeletti, R.H., ed.) 427-433.
10. Moritz, R.L. and Simpson, R.J. (1992) J. Chromat. 599: 119.
11. Reim, D.F. and Speicher, D.W. (1993) Anal. Biochem. 214: 87-95.
ASSIGNMENT OF CYSTEINE AND TRYPTOPHAN RESIDUES DURING PROTEIN SEQUENCING: RESULTS OF ABRF-94SEQ

Jay Gambee¹, Philip C. Andrews², Karen DeJongh³, Greg Grant⁴, Barbara Merrill⁵, Sheenah Mische⁶, and John Rush⁷

¹Shriners Hospital for Crippled Children, Dept. of Research, Portland, OR 97201
²Protein Structure Facility, University of Michigan Medical School, Ann Arbor, MI 48019
³Dept. of Pharmacology SJ-30, University of Washington, Seattle, WA 98195 and Cell Therapeutics, Inc., Seattle, WA 98195
⁴Dept. of Molecular Biology and Pharmacology, Washington University School of Medicine, St. Louis, MO 63110
⁵Burroughs Wellcome Co., Dept. of Organic Chemistry, Research Triangle Park, NC 27709
⁶Protein Sequencing/HHMI Biopolymer Facility, The Rockefeller University, New York, NY 10021
⁷HHMI/Harvard Medical School, Dept. of Genetics, Boston, MA 02115

I. Introduction
As part of the Association of Biomolecular Resource Facilities (ABRF), the Protein Sequencing Research Committee has, for the past six years, designed test samples to enable its member facilities to better evaluate their protein sequencing capabilities (1-6). This evaluation informs both facility users and operators of sequencing improvements and capabilities available in the average facility. These studies allow ABRF members to compare their performance with others on an anonymous basis and to identify areas needing improvement. In addition, they provide instrument manufacturers with objective data regarding the performance of their equipment under realistic operating conditions.

Although the results from the last study, ABRF-93SEQ, indicated improvement in the correct identification of cysteine and tryptophan residues, the responses fell mainly into two groups: those able to identify cysteine and tryptophan consistently and those unable to identify them at all (6). Routine assignment of cysteine and tryptophan residues is necessary for completeness of protein sequencing. Cysteine assignment is important because this residue provides the major covalent cross-linkage in proteins and contributes to the biological and catalytic activity of many folded polypeptides. Accurate assignment of tryptophan residues is especially crucial when sequencing is used to design oligonucleotide probes of limited redundancy (tryptophan has a single codon and cysteine has two codons).

The primary goal of this study was to provide members with the opportunity to evaluate methods to improve the reliability of cysteine and tryptophan assignment in their own laboratories. This report describes the results of ABRF-94SEQ, based on 78 responses returned to the Committee as of May 20, 1994.
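The remark about probes of limited redundancy can be made concrete with a short calculation. The sketch below is illustrative only and is not part of the study: it multiplies the number of codons per residue in the standard genetic code to estimate how many DNA sequences could encode a short peptide. The function name and the choice of peptide windows are assumptions made for this example.

```python
# Number of codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def probe_degeneracy(peptide: str) -> int:
    """Return how many distinct coding sequences could encode the peptide."""
    total = 1
    for residue in peptide:
        total *= CODONS_PER_RESIDUE[residue]
    return total


# Residues 20-24 of the test sequence (C-F-Q-W-Q): 2 * 2 * 2 * 1 * 2
print(probe_degeneracy("CFQWQ"))
# Residues 6-10 of the test sequence (S-V-Q-W-C): 6 * 4 * 2 * 1 * 2
print(probe_degeneracy("SVQWC"))
```

A window rich in tryptophan and cysteine keeps the redundancy low, which is why a missed or wrong call at these residues is costly for probe design.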
II. Materials and Methods

A. Selection of the ABRF-94SEQ Test Sample
Human lactoferrin (MW 76,354; 692 amino acids) was chosen for ABRF-94SEQ because it contains several challenges for sequence assignment in the first 25 residues, including amino-terminal glycine (G1), a common contaminant; an arginine quadruplet (R2 R3 R4 R5), representing the difficulties often encountered in calling identical successive residues as well as the low yield for arginine during sequencing; tryptophan (W9 and W23), to assess the ability to resolve this residue from diphenylurea (7); and cysteine at residues C10 and C20 (Figure 1) (8). Since the primary goal of this study was to assess assignments of tryptophan and cysteine, the data analysis concentrated primarily on the results obtained for these two residues.

            5        10        15        20        25
NH2-G-R-R-R-R-S-V-Q-W-C-A-V-S-Q-P-E-A-T-K-C-F-Q-W-Q-R

Figure 1. Amino acid sequence for the first 25 residues of ABRF-94SEQ.

B. Protein Characterization

Human lactoferrin (> 99% pure) from ICN (cat# 150203) was reconstituted with HPLC-grade H2O and quantitated, in triplicate, by amino acid analysis on a Beckman 6300 amino acid analyzer. Laser desorption mass spectrometry (LDMS) indicated an m/z value of 80,262 (higher MW due to the presence of carbohydrate). Before recommending methods for cysteine reduction and alkylation and for tryptophan analysis, the Protein Sequencing Research Committee evaluated each one for reliability and ease of use. Committee members who routinely perform in situ derivatization of cysteine were unsuccessful using these methods with this sample, so in situ methods were described but not recommended. Fifty pmol aliquots of the reduced and alkylated protein were analyzed by sequencing. Initial yields of 47 ± 23% (average ± sample standard deviation, 5 independent determinations) were obtained.

C. Preparation and Distribution of the Test Sample
Aliquots of 50 pmol were pipetted into pre-washed 1.5 ml polypropylene microfuge tubes and vacuum dried. Several aliquots were analyzed by committee members to ensure sample quality before distribution. Two aliquots of ABRF-94SEQ were sent to each of the 258 ABRF member facilities along with instructions for sample solubilization, a short questionnaire, a data analysis worksheet, and several commonly used methods for cysteine and tryptophan analysis. Members were encouraged to try at least one of the methods described. All results were returned to an independent third party who, after removing all identifying labels, forwarded the anonymous data to the committee for analysis.

III. Results and Discussion

A. Survey Results
Of the 78 responses to this study, all facilities returned the survey questionnaire. The distribution of protein sequencing instruments was as follows: Applied Biosystems (57/78), Beckman-Porton (13/78), Hewlett-Packard (7/78), and Millipore (1/78). The average age of the instruments was 5.6 ± 3.3 years.
Most facilities (53%) reported sequencing at levels of 10-75 pmol (41/77), while others (29%) reported sequence levels of 1-10 pmol (29/77). The percentage of respondents sequencing at the 1-10 pmol level (29%) is somewhat higher than that reported in last year's study, ABRF-93SEQ (21%). PTH amino acid analyzers used in this study were Applied Biosystems (59/78), Hewlett-Packard (16/78), Beckman (2/78), and Waters (1/78). The average age of these instruments was 5.2 ± 2.6 years, and nearly all (76/78) were on-line with the sequencing system. Most users of Applied Biosystems sequencers reported using premix buffers (49/57), and several reported using additives to adjust the baseline (22/57) or to improve yields (18/57). The most common additive to Applied Biosystems solvent A was acetone (13/57), while others reported adding tryptophan (2/57) and TEA (2/57). Several facilities use dimethylphenylthiourea (DMPTU) to adjust the baseline rise (12/57) and isopropanol to improve tryptophan resolution in Applied Biosystems solvent B.

B. General Sequencing Results
The 78 respondents participating in this study reported a total of 1637 sequencing cycles (Table I). There were 1492 (91%) positive assignments, 145 (9%) tentative assignments, 132 (8%) wrong assignments, and 169 unassigned cycles. Altogether there were 1505 correct assignments. Overall accuracy was 96% for positive assignments and 55% for tentative assignments. The average response assigned 21 cycles, of which 2 were tentative and 19 correct. Only 3 instances of sequence failure were reported. One failure occurred due to a leaky ProSpin cartridge, one individual reported loss of the sample with no explanation, and one facility was unsuccessful with two attempts at in situ reduction and alkylation.

Table I. Summary of Sequence Assignments for ABRF-94SEQ (a)

Quantity                           Definition                          Value
Total Cycles                       PC+TC+PW+TW                         1637
Correct Assignments                PC+TC                               1505
Incorrect Assignments              PW+TW                               132
Positive Assignments               PC+PW                               1492
Tentative Assignments              TC+TW                               145
Unassigned Cycles                                                      169
Accuracy, Positive Assignments     PC/(PC+PW)                          0.955
Accuracy, Tentative Assignments    TC/(TC+TW)                          0.550
Average Cycles                     Total Responses/No. Responses       21
Average Tentative Assignments      Tentative Assign./No. Responses     1.9
Average Correct Assignments        Correct Assign./No. Responses       19.3

(a) Sequence assignments are positive correct (PC), tentative correct (TC), positive wrong (PW), and tentative wrong (TW).
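The definitions in Table I can be applied directly, as in the minimal sketch below. The four individual counts are not printed in the chapter; the values used here are back-calculated from the published sums and are therefore an assumption of this example, as is the function name. With these integers the positive accuracy reproduces the tabulated 0.955, while the tentative accuracy comes out near 0.552, slightly different from the 0.550 printed, presumably through rounding.

```python
# Assumed counts, back-calculated from the sums printed in Table I
# (PC+TC = 1505, PW+TW = 132, PC+PW = 1492, TC+TW = 145).
PC, TC, PW, TW = 1425, 80, 67, 65
RESPONSES = 78  # number of responses reported in the study


def summarize(pc, tc, pw, tw, responses):
    """Apply the Table I definitions to the four assignment counts."""
    total = pc + tc + pw + tw
    return {
        "total_cycles": total,
        "correct": pc + tc,
        "incorrect": pw + tw,
        "positive": pc + pw,
        "tentative": tc + tw,
        "accuracy_positive": pc / (pc + pw),
        "accuracy_tentative": tc / (tc + tw),
        "average_cycles": total / responses,
        "average_tentative": (tc + tw) / responses,
        "average_correct": (pc + tc) / responses,
    }


for name, value in summarize(PC, TC, PW, TW, RESPONSES).items():
    print(f"{name}: {value:.3f}" if isinstance(value, float) else f"{name}: {value}")
```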
Figure 2 shows the number of positive assignments for the first 25 residues in ABRF-94SEQ. The N-terminal glycine proved to be problematic, with only 63% (49/78) positive calls and only 53% (41/78) positive and correct calls. There were 13 misassignments for G1. This result is slightly worse than that obtained with ABRF-93SEQ (6), which also contained N-terminal glycine (60% PC). The arginine quadruplet (R2 R3 R4 R5) shows a slight decline in positive correct calls, due to an increase in tentative calls toward the end of the quadruplet.
Figure 2. Sequence assignments of ABRF-94SEQ. The number of positive correct (PC) and positive wrong (PW) assignments are plotted against the sequence of ABRF-94SEQ.

The two cysteine residues, C10 and C20, were the most difficult residues to assign. There were 41 positive correct calls for C10 and 29 positive correct calls for C20. There was a total of 7 misassignments for C10 and 13 for C20. The low positive correct response for C20 contrasts with the positive correct calls for the adjacent residues 19 and 21 (K19 (54 PC), C20 (29 PC), F21 (59 PC)), indicating the difficulty with cysteine identification. Other than G1 (13 incorrect assignments), C20 was the only residue often assigned incorrectly (7 positive and 6 tentative wrong), with four incorrect calls identifying this residue as lysine, probably due to lag from K19. C10 and C20 also had the highest number of calls as no residue seen and unidentified amino acid (31% for C10 and 33% for C20). The number of PC calls decreased abruptly after Q22.

Tryptophan at W9 was relatively easy to identify, with 64 positive correct assignments, whereas W23 was much more difficult (35 positive correct). The number of tentative correct calls for W23, eleven, was the highest for any residue in the sequence.

Most facilities (62/78) indicated the use of modified solvents (heptane sulfonic acid or premix buffers) known to improve the yield of arginine. Of the 31 facilities positively identifying R25, 24 used these modified solvents (Hewlett-Packard 6/24, Applied Biosystems 18/24), and 7 did not (Applied Biosystems 4/7, Beckman-Porton 3/7).

Accuracy of assignments for cysteine and tryptophan residues is shown in Table II. The accuracy of positive assignments for tryptophan in this study was 95%, a significant improvement over the previous study, ABRF-93SEQ, where the accuracy of positive assignments was 72% (6). For cysteines, the overall accuracy of positive assignments was 88% in this study and 53% for ABRF-93SEQ. These values indicate tremendous improvement over last year's study and may be attributable both to the reliability of the methods and to the fact that many facilities were probably performing alkylation for this study, whereas they did not do so for ABRF-93SEQ. For solution alkylation the accuracy of positive assignments was 94%, for the in situ method 88%, and for non-alkylated samples 50%. These results clearly show the impact that alkylation prior to sequencing has on the reliable assignment of cysteine.
Table II. Accuracy of Assignment for Cysteine and Tryptophan

Residue (alkylation)       Assignments    ABRF-94SEQ    ABRF-93SEQ (6)
Tryptophan                 Positive       95%           72%
Tryptophan                 Tentative      67%           60%
Cysteine, overall          Positive       88%           53%
Cysteine, overall          Tentative      44%           64%
Cysteine, solution         Positive       94%
Cysteine, solution         Tentative      36%
Cysteine, in situ          Positive       88%
Cysteine, in situ          Tentative      0%
Cysteine, non-alkylated    Positive       50%
Cysteine, non-alkylated    Tentative      57%

C. Tryptophan Sequencing Results
Of the 57 facilities using Applied Biosystems instruments, 35 reported that they added isopropanol (IPA) to solvent B and 21 did not. Because IPA should help separate tryptophan from diphenylurea, tryptophan analysis results are separated into Applied Biosystems users who added isopropanol (IPA+), Applied Biosystems users who did not (IPA-), and users of other instrumentation (Non-Applied Biosystems). Table III shows both the number of assignments and the percentage of correct assignments (%PC+TC) for each group. Of those using IPA, 80% correctly identified W9, as did 86% of those without IPA. Of those using other instrumentation, 95% (7/7 Hewlett-Packard users, 12/13 Beckman-Porton users, and 1/1 Millipore user) identified W9 correctly. For W23, users with IPA made 57% correct assignments, those without IPA made 52% correct calls, and users of other instruments assigned 67% correctly (Hewlett-Packard 7/7 (100%), Beckman-Porton 6/13 (46%), and Millipore 1/1 (100%)). Interestingly, no incorrect assignments for tryptophan were made on any of the 7 Hewlett-Packard instruments.

Table III. Distribution of Tryptophan Assignments for ABRF-94SEQ

The table represents assignments for tryptophan by Applied Biosystems instrument users either with isopropanol (IPA+) or without isopropanol (IPA-) in solvent B, and a Non-Applied Biosystems category representing all other instrumentation. Number of assignments is given using the classifications positive correct (PC), tentative correct (TC), positive wrong (PW), tentative wrong (TW), and other (Other), which includes no residue seen, unidentified amino acid, and no assignment. The percentage of correct assignments (%PC+TC) is indicated for each group.

Residue   Group       PC    TC    %PC+TC    PW    TW    Other    Total
W9        IPA+        27    1     80%       0     2     5        35
W9        IPA-        16    2     86%       1     0     2        21
W9        Non-ABI*    20    0     95%       0     0     1        21
W9        All                                                    77**
W23       IPA+        13    7     57%       3     3     9        35
W23       IPA-        8     3     52%       1     2     7        21
W23       Non-ABI*    13    1     67%       0     0     7        21
W23       All                                                    77**

*ABI (Applied Biosystems)
**Total of 77 represents all responses except for one Applied Biosystems user that did not specify whether IPA was used or not.
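The %PC+TC column of Table III follows from the counts in each row, and pooling the three groups gives the overall figures for each residue. The sketch below shows that arithmetic under the assumption that the counts are read from the table in the order PC, TC, PW, TW, Other; the dictionary layout and function name are choices made for this example only.

```python
# Counts per group in the order (PC, TC, PW, TW, Other), read from Table III.
TABLE_III = {
    "W9": {
        "IPA+": (27, 1, 0, 2, 5),
        "IPA-": (16, 2, 1, 0, 2),
        "Non-ABI": (20, 0, 0, 0, 1),
    },
    "W23": {
        "IPA+": (13, 7, 3, 3, 9),
        "IPA-": (8, 3, 1, 2, 7),
        "Non-ABI": (13, 1, 0, 0, 7),
    },
}


def percent_correct(counts):
    """Percentage of all assignments that were positive or tentative correct."""
    pc, tc = counts[0], counts[1]
    return 100.0 * (pc + tc) / sum(counts)


for residue, groups in TABLE_III.items():
    for group, counts in groups.items():
        print(f"{residue} {group}: n={sum(counts)}, {percent_correct(counts):.0f}% correct")
    # Pool the three groups column by column for the overall figure.
    pooled = [sum(column) for column in zip(*groups.values())]
    print(f"{residue} pooled: n={sum(pooled)}, {percent_correct(pooled):.0f}% correct")
```

The pooled values exclude the one Applied Biosystems respondent who did not state whether IPA was used, so they are based on 77 responses.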
The survey indicated that facilities not using IPA identified tryptophan by the size of the diphenylurea/tryptophan (W) peak relative to fluctuations in the DPTU peak, by noting when W is larger than DPTU, or by using a Supelco column with a modified gradient. Several facilities indicated that these methods of monitoring W and DPTU shifts are not reliable past approximately 19 residues. This may explain the relative difficulty in identifying W23 (overall 86% correct for W9 and 58% correct for W23). Another observation was the large number of wrong and other assignments for W23 by Applied Biosystems users. This may indicate a need for further improvement of tryptophan resolution on these instruments.

These results indicated that the addition of IPA to solvent B made no significant difference in the ability to identify tryptophan, as the percentages of correct calls for both groups (IPA+ and IPA-) were almost identical. However, in the survey several respondents acknowledged the added benefit of using IPA in solvent B, while others mentioned the usefulness of different columns and modified HPLC gradients. The authors, in general, have found the use of IPA in solvent B to be helpful.

D. Cysteine Sequencing Results
Of the 78 survey responses only 22 reported that they routinely alkylate samples before sequencing. However, 49 facilities alkylated ABRF-94SEQ, of which 37 used solution and 12 attempted in situ alkylation. The remaining 29 facilities sequenced without alkylation. Table IV describes several methods of reduction and alkylation that were all provided with ABRF-94SEQ. An additional method using 3-bromopropylamine was employed by one respondent. Reduction in solution was performed with dithiothreitol (DTT) (68%), 2-mercaptoethanol (29%), and tributylphosphine (TBP) (3%). In situ reduction was carried out using DTT (50%) and TBP (50%).

Table IV. Methods Provided for Disulphide Bond Reduction and Alkylation in ABRF-94SEQ

Reduction
Method   Solubilization Solution                               Reducing Agent       Reducing Conditions             Ref
A        50 µl 8 M urea, 0.4 M NH4HCO3, pH 8.0                 5 µl 45 mM DTT       15 min at 50 °C                 9
B        300 µl 6 M Gu-HCl, 0.1 M Tris, pH 8.5                 5 µl 50 mg/ml DTT    4 hr at 37 °C                   10
C        6 M Gu-HCl, 0.5 M Tris, pH 8.2, 50 mM EDTA            20 mM DTT            1 hr at 37 °C, under N2         11
D        40 µl 40% CH3CN, 0.4 M N-ethyl morpholine, pH 8.0     5 µl 4.5 mM DTT      15 min at 55 °C                 12
E        50 µl 6 M Gu-HCl, 0.25 M Tris, pH 8.5, 1 mM EDTA      2.5 µl 10% 2-ME      2 hr at RT in dark under Ar     13

Alkylation
Method   Reagent                      PTH Derivative                        Reference
A        Iodoacetate                  Carboxymethyl Cys                     10, 14, 15
B        Iodoacetamide                Carboxamidomethyl Cys                 9, 12
C        Vinyl pyridine               Pyridylethyl Cys                      13, 16, 17
D        Acrylamide                   Cys-S-propionamide                    18
E        Dimethylacrylamide           Cys-S-dimethylpropionamide            18
F        N-isopropyliodoacetamide     N-isopropylcarboxyamidomethyl Cys     19
G        3-bromopropylamine           Aminopropyl Cys                       11
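For readers adapting the reduction methods in Table IV, the working concentration of reducing agent follows from simple dilution. The sketch below is a hedged illustration, not part of the distributed protocol: it assumes the volumes are additive and takes the Method A figures (5 µl of 45 mM DTT added to 50 µl of solubilization solution). The function name is hypothetical.

```python
def final_concentration_mM(stock_mM: float, stock_ul: float, sample_ul: float) -> float:
    """Concentration after adding a stock solution to a sample, assuming additive volumes."""
    return stock_mM * stock_ul / (stock_ul + sample_ul)


# Reduction Method A: 5 µl of 45 mM DTT into 50 µl of urea/NH4HCO3 solution.
dtt_final = final_concentration_mM(stock_mM=45.0, stock_ul=5.0, sample_ul=50.0)
print(f"Final DTT concentration: {dtt_final:.1f} mM")
```

This gives roughly 4.1 mM DTT during the reduction step.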
Table V shows the total number of assignments and the percentage of correct assignments (%PC+TC) for the various alkylating methods used. The distribution of alkylating reagent usage was iodoacetate 8%, iodoacetamide 16%, vinyl pyridine 59%, acrylamide 10%, N-isopropyliodoacetamide 4%, and 3-bromopropylamine (BPA) 2%. The most popular method of alkylation was vinyl pyridine, and this group assigned 86% correct (%PC+TC) for C10 and 52% correct for C20. Vinyl pyridine is readily available in a ready-to-use liquid form, and this may account for its popularity. The elution position of PTH-pyridylethyl cysteine (PE-CYS) on an Applied Biosystems sequencer is highly sensitive to the ionic strength of the HPLC Solvent A (13), and it may elute anywhere between PTH-Tyrosine and DPTU. The reaction requires fresh reagent or derivatization may not be complete, and preview of the next amino acid during sequencing may occur due to N-terminal alkylation of the polypeptide (16). These problems may account for the larger number of "wrong" and "other" assignments for C20 by respondents who used vinyl pyridine. No respondents reported using dimethylacrylamide, and 1 respondent used 3-bromopropylamine (BPA) (11), which produced very clear cysteine derivatives. Those using acrylamide could clearly identify C10 (5/5) but had difficulty making a positive call later in the sequence for C20 (2/5). Facilities using iodoacetate or iodoacetamide either got both C10 and C20 correct (8/10), or they did not get them at all (2/10).

Table V. Distribution of Sequence Assignments for C10 and C20 in ABRF-94SEQ (a)

Residue   Alkylation Method          PC    TC    %PC+TC    PW    TW    Other    Total
C10       Iodoacetate                3     0     75%       0     0     1        4
C10       Iodoacetamide              5     0     63%       0     2     1        8
C10       Vinyl pyridine             24    1     86%       1     0     3        29
C10       Acrylamide                 5     0     100%      0     0     0        5
C10       Isopropyliodoacetamide     1     0     50%       1     0     0        2
C10       BPA                        1     0     100%      0     0     0        1
C10       None                       2     2     14%       1     2     22       29
C20       Iodoacetate                3     0     75%       0     0     1        4
C20       Iodoacetamide              5     0     63%       0     2     1        8
C20       Vinyl pyridine             15    0     52%       2     3     9        29
C20       Acrylamide                 2     2     80%       0     0     1        5
C20       Isopropyliodoacetamide     0     1     50%       1     0     0        2
C20       BPA                        1     0     100%      0     0     0        1
C20       None                       3     2     17%       4     1     19       29
(a) The number of assignments is listed for the alkylation method used, using the classifications as described in Table III.

Overall, there were 44 correct (PC+TC) assignments for C10 and 34 correct assignments for C20. Of the 44 correct calls for C10, 31 (70%) alkylated in solution, 9 (20%) were in situ, and 4 (10%) were non-alkylated samples. Of the 34 correct calls for C20, 24 (70%) alkylated in solution, 5 (15%) were in situ, and 5 (15%) were from non-alkylated samples. Alkylation of cysteine residues greatly facilitates their identification. Of the 31 cases where C10 was misidentified or no residue was called, only 6 groups had alkylated while 25 had not. After reviewing the chromatograms, the Committee noted that 13 facilities identified C10 and/or C20 as cysteines when no PTH-residue was observed. Of these, 11 made this assignment as being positive correct and 2 as being tentative correct.

E. Best Responses
Of the 78 responses, 11 groups (#10, 16, 17, 48, 51, 52, 59, 68, 71, 75, and 80) correctly identified the first 25 residues. Two of these facilities made questionable calls (#16 at C20 and Q24; and #68 at W23). Of these eleven groups, 5 used Hewlett-Packard, 5 used Applied Biosystems, and 1 used Beckman-Porton instruments. Of the facilities using Hewlett-Packard instruments (9% of all respondents), 71% are listed in the top eleven performances. Possible contributing factors to this comparatively better performance over Applied Biosystems instruments may be that the average age of the Hewlett-Packard instruments is less than that of the Applied Biosystems instruments and that PTH-tryptophan and most PTH-cysteine derivatives are well resolved on Hewlett-Packard instruments. It was observed that in situ derivatization led to difficulties in identifying sequence assignments in general. Of the facilities performing in situ alkylation, only 1 (#75) identified the first 25 residues correctly. In addition to these eleven groups, another group (#28 and 29) was able to correctly identify the sequence by combining two data sets, which was not the intent of the Committee, and group #67 called cysteine by omission (no alkylation), but correctly called the sequence.

Conclusions

After reviewing the results of previous ABRF studies (1-6), the Protein Sequencing Research Committee selected this sample (human lactoferrin) not only to assess but also to assist member facilities in identifying both tryptophan and derivatives of cysteine. ABRF-94SEQ would allow members to see which methods are used successfully in other facilities and also provide them the opportunity to assess these methods for themselves.

The most common methodology used in this study was to resuspend in 50 µl 6 M Gu-HCl, 25 mM Tris-HCl, pH 8.5, 1 mM EDTA; reduce in either DTT (final concentration of 4.1 mM) for 45 min at 50-55 °C, in the dark under N2, or 2.5 µl of 10% 2-ME for 2 hr at room temperature, in the dark and under argon; and alkylate with 1-2 µl of fresh vinylpyridine for 30 min at 37 °C. Sample cleanup was done using a variety of methods such as Hewlett-Packard sequence columns, Applied Biosystems ProSpin cartridges, rinsing of blots with H2O/MeOH, RP-HPLC, and Applied Biosystems S1, S1:S2 washes. The average HPLC injected 61 ± 18% of the PTH amino acid, and used modified solvents to improve sequencing capability.

The positive identification of W9 (82%) was a remarkable improvement over the previous study, ABRF-93SEQ, in which W2 and W7, also both early in the sequence, were positively identified 49% and 46% of the time, respectively. Tryptophan 23 (W23) was positively identified (45%) nearly as well as the early tryptophan residues in ABRF-93SEQ. Positive cysteine identification (C10 (53%) and C20 (37%)) improved over the 1993 study, where C5 was positively identified at 20%. Committee members observed that several respondents, some who had alkylated and some who had not, risked making positive assignments for cysteine when no PTH-residue or dehydroalanine/DTT adduct was apparent. If cysteine is assumed in this way, it must be noted that this is not a positive assignment. Although cysteine and tryptophan identification may be problematic, the data suggest that routine identification of these residues should be possible for the majority of facilities.
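The positive identification rates quoted in these conclusions can be reproduced from the positive correct counts given in the Results and the 78 responses. The following is a minimal sketch of that arithmetic; the function name and dictionary are assumptions of this example rather than anything used by the Committee.

```python
RESPONSES = 78  # responses returned to the Committee

# Positive correct (PC) calls per residue, as quoted in the Results section.
POSITIVE_CORRECT = {"W9": 64, "W23": 35, "C10": 41, "C20": 29}


def percent_positive(pc_calls: int, responses: int = RESPONSES) -> float:
    """Share of all responses that made a positive correct call."""
    return 100.0 * pc_calls / responses


for residue, calls in POSITIVE_CORRECT.items():
    print(f"{residue}: {calls}/{RESPONSES} = {percent_positive(calls):.0f}%")
```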
Acknowledgments

This work was partially supported by NSF grant DIR9003100 to John Crabb (W. Alton Jones Cell Science Center) on behalf of the ABRF. The contribution of all ABRF laboratories that made this study possible by analyzing the test sample and providing data is gratefully acknowledged. The assistance of Gary P. Gryan (HHMI/Harvard Medical School Computer Facility) in coordinating the data returns and ensuring the anonymity of the participating laboratories is appreciated. The assistance of the FASEB Business Office for postcard printing and mailings is acknowledged. Several members of the authors' laboratories contributed to this project, including Dawn Fitzpatrick, Rebeca Ramos, and Melissa Saylor (HHMI/Harvard Medical School) for coordinating the sample mailing and the processing of responses to the study; Resa Rorie (Burroughs Wellcome Co.) for amino acid analysis; and Anita Colvin (University of Washington), Joe Fernandez (The Rockefeller University), Bill Chesnut (Burroughs Wellcome Co.), and Melissa Saylor (HHMI/Harvard Medical School) for initial sequence analysis.

References

1. Niece, R.L., Williams, K.R., Wadsworth, C.L., Elliott, J., Stone, K.L., McMurray, W.J., Fowler, A., Atherton, D., Kutny, R., and Smith, A.J. (1989) in Techniques in Protein Chemistry, T.E. Hugli, ed., pp 89-101. Academic Press, San Diego, CA.
2. Speicher, D.W., Grant, G.A., Niece, R.L., Blacher, R.W., Fowler, A.V., and Williams, K.R. (1990) in Current Research in Protein Chemistry II, J.J. Villafranca, ed., pp 159-166. Academic Press, San Diego, CA.
3. Yüksel, K.U., Grant, G.A., Mende-Mueller, L.M., Niece, R.L., Williams, K.R., and Speicher, D.W. (1991) in Techniques in Protein Chemistry II, J.J. Villafranca, ed., pp 151-162. Academic Press, San Diego, CA.
4. Crimmins, D.L., Grant, G.A., Mende-Mueller, L.M., Niece, R., Slaughter, C., Speicher, D.W., and Yüksel, K.U. (1992) in Techniques in Protein Chemistry III, R.H. Angeletti, ed., pp 35-43. Academic Press, San Diego, CA.
5. Mische, S.M., Yüksel, K.U., Mende-Mueller, L.M., Matsudaira, P., Crimmins, D.L., and Andrews, P.C. (1993) in Techniques in Protein Chemistry IV, R.H. Angeletti, ed., pp 453-461. Academic Press, San Diego, CA.
6. Rush, J., Andrews, P.C., Crimmins, D.L., Gambee, J.E., Grant, G.A., Mische, S.M., and Speicher, D.W. (1994) in Techniques in Protein Chemistry V, J.W. Crabb, ed., pp 133-141. Academic Press, San Diego, CA.
7. Applied Biosystems User Bulletin Issue No. 59 (1993).
8. Metz-Boutigue, M.H., Jolles, J., Mazurier, J., Schoentgen, F., Legrand, D., Spik, G., Montreuil, J., and Jolles, P. (1984) Eur. J. Biochem. 145, 659-676.
9. Stone, K.L., LoPresti, M.B., Williams, N.D., Crawford, J.M., DeAngelis, R., and Williams, K.R. (1989) in Techniques in Protein Chemistry, T.E. Hugli, ed., pp 377-391. Academic Press, San Diego, CA.
10. Pan, Y-C.E., Wideman, J., Blacher, R., Chang, M., and Stein, S. (1984) J. Chromatog. 297:13-19.
11. Jue, R.A. and Hale, J.E. (1993) Anal. Biochem. 210:39-44.
12. Atherton, D., Fernandez, J., and Mische, S.M. (1993) Anal. Biochem. 212:98-105.
13. Hawke, D. and Yuan, P. (1987) Applied Biosystems User Bulletin Issue No. 28.
14. Allen, G. (1981) in Sequencing of proteins and peptides, pp 30-31. North-Holland Publishing Company, Oxford.
15. Hirs, C.H.W. (1967) in Meth. Enzymol. VI (C.H.W. Hirs, ed.), pp 199-203. Academic Press, New York.
16. Andrews, P.C. and Dixon, J.E. (1987) Anal. Biochem. 161:524-528.
17. Tempst, P., Link, A.J., Riviere, L.R., Fleming, M., and Elicone, C. (1990) Electrophoresis 11:537-553.
18. Brune, D.C. (1992) Anal. Biochem. 207:285-290.
19. Krutzch, H.C. and Inman, J.K. (1993) Anal. Biochem. 209:109-116.
Automated C-Terminal Protein Sequence Analysis Using the Hewlett-Packard G1009A C-Terminal Protein Sequencing System

Chad G. Miller, David H. Hawke, Jacqueline Tso, and Sherrell Early

Protein Chemistry Systems, Hewlett-Packard Company, Palo Alto, CA 94304

I. INTRODUCTION
Automated carboxy-terminal (C-terminal) protein sequence analysis enables the direct and unambiguous confirmation of the C-terminal sequence of native and expressed proteins, the detection and characterization of protein processing at the C-terminus, the identification of post-translational proteolytic cleavages, and partial sequence information on amino-terminally blocked protein samples. In order for C-terminal sequence analysis to be of immediate benefit, each of the 20 common amino acid residues must be detectable. Additionally, the scope of typically analyzable protein samples must span a usefully broad molecular weight range and degree of structural complexity.

The HP G1009A C-terminal protein sequencing system provides the reliable identification of the C-terminal sequence of diverse protein samples isolated from various purification protocols. The automated sequencing chemistry utilizes diphenyl phosphoroisothiocyanatidate (DPP-ITC) as the coupling reagent and trimethylsilanolate as the cleavage reagent for the efficient generation of thiohydantoin-amino acid derivatives (1). The automated HPLC analysis of the sequencing cycles is accomplished using the HP 1090M liquid chromatograph with the HP specialty PTH analytical HPLC column (2).
II. MATERIALS AND METHODS
Protein samples were applied to Zitex membranes (Norton Performance Plastics, Wayne, NJ) and inserted into C-terminal sequencer columns for sequence analysis using the HP G1009A C-terminal protein sequencing system (HP, Palo Alto, CA). The system consists of the HP G1000A C-terminal protein sequencer, HP 1090M liquid chromatograph, HP Vectra 486/66 computer with Microsoft® MS-DOS® and Windows™ environment, HP specialty C-terminal sequencing reagents and solvents, HPLC columns and solvents, and all related consumables. Protein samples were obtained from Sigma Chemical Co. (St. Louis, MO). Mouse immunoglobulin G was a generous gift of Dr. J. Bailey, City of Hope. Thiohydantoin-amino acid derivatives were prepared according to published procedures and quality-controlled by HP (3). The HP C-terminal chemical sequencing method was developed for automation using a chemistry licensed by HP from the City of Hope, Duarte, CA.
III. RESULTS AND DISCUSSION
A. Sample Application on an Inert Reaction Support The protein samples for C-terminal sequencing are conveniently applied directly to Zitex reaction membranes (1mm x 12mm) that have been pre-treated with alcohol (isopropanol). The liquid sample solutions, on occasion made homogeneous by the addition of a small volume (1-5 |Lil) of dilute aqueous trifluoroacetic acid, are applied to the wetted Zitex membrane in 5-20 |Lll volumes and allowed to dry at room temperature (10-20 minutes). The dry membrane is inserted into a C-terminal sequencer column (inert Kel-F columns fitted with inert endfrits) and installed in any one of the four sample positions on the HP G1009A sequencer. The sequencer column reactions that occur on the Zitex membrane include the chemical coupling and cyclization of the C-terminal residue and the cleavage and extraction that releases the thiohydantoin-amino acid derivative. The thiohydantoin-amino acids are extracted off the Zitex
Automated C-Terminal Sequence Analysis
221
membrane from the sequencer column to the sequencer transfer flask for preparation for HPLC injection. The sample application is compatible with diverse samples recovered in various buffers (phosphate, inorganic salts, HEPES) and solvents (HPLC fractions). Samples that have been subjected to amino-terminal sequence analysis using the HP G1005A N-terminal protein sequencing system and Zitex as a reaction support may be transferred to C-terminal sequencer columns and subjected to C-terminal sequence analysis with the HP G1009A C-terminal sequencing system.
B. C-Terminal Coupling and Cyclization Reactions
Diphenyl phosphoroisothiocyanatidate (DPP-ITC) in the presence of pyridine constitutes the new chemical coupling and cyclization reactions for the HP G1009A automated C-terminal sequence analysis (1). Prerequisite to the coupling reaction with DPP-ITC is the base activation of the protein C-terminal carboxylic acid moiety to a carboxylate species using trimethylsilanolate. The carboxylate salt of the C-terminal amino acid residue is highly reactive to the diphenyl phosphoroisothiocyanatidate coupling reagent, speculatively generating a reactive pentavalent species, which collapses to the C-terminal acylisothiocyanate. The Zitex membrane is washed with organic solvent (acetonitrile) to eliminate the excess DPP-ITC. The coupled peptidylacylisothiocyanate is treated with pyridine to effect the five-membered ring closure to the carboxy-terminal peptidylthiohydantoin. Additional treatment of peptidylprolylacylisothiocyanate with liquid anhydrous trifluoroacetic acid enables the cyclization of the C-terminal prolylisothiocyanate to the corresponding prolylthiohydantoin (4). This species is readily cleaved by treatment with 2% aqueous trifluoroacetic acid vapor (and trimethylsilanolate treatment) to yield the thiohydantoin-proline derivative.
C. Cleavage Reaction of the Peptidylthiohydantoin
The peptidylthiohydantoin (coupled and cyclized product) is subjected to chemical cleavage to the C-terminal thiohydantoin-amino acid residue and the shortened polypeptide using an alkali salt of trimethylsilanolate (KOTMS). The cleavage reagent is a highly reactive nucleophile that displaces the thiohydantoin derivative from the C-terminal acylthiohydantoin moiety. The resulting trimethylsilyl ester is rapidly
cleaved to a free C-terminal carboxylate species amenable to the next cycle of chemical coupling with DPP-ITC. The released thiohydantoin-amino acid is extracted off the Zitex membrane to the transfer flask by the cleavage solution and subsequent organic solvent (acetonitrile). The extraction solvents are evaporated and the thiohydantoin-amino acid is redissolved in the HPLC transfer solvent (dilute aqueous trifluoroacetic acid) and injected into the HP 1090M HPLC for analysis.
D. HPLC Analysis of Thiohydantoin-Amino Acid Derivatives
The HP G1009A C-terminal protein sequencing system provides automated HPLC analysis of sequencer cycles using the HP 1090M liquid chromatograph with filter photometric detection at 269 nm and the HP specialty (2.1 mm x 25 cm) reversed-phase PTH analytical HPLC column (3). A 39-minute binary gradient (Solvent A: phosphate buffer, pH 2.9; Solvent B: acetonitrile/water) utilizes an ion-pairing reagent (alkyl sulfonate), enabling highly reproducible elution times and peak resolution. A stable thiohydantoin-amino acid standard mixture is incorporated on the sequencer for on-line automated peak calibration and quantitation.
Thiohydantoin-amino acid standard mixture. The thiohydantoin-amino acid standard mixture consists of the synthetic thiohydantoin derivatives corresponding to the actual sequencing products resulting from chemical sequence analysis. In particular, the serine, threonine, cysteine, and lysine thiohydantoin derivative standards correspond to their respective sequencing degradation products. The sequencing product derivatives of serine and cysteine yield the same degradation species and, without cysteine side chain modification, permit the identification of either residue for confirmatory analysis of a known sequence. The residue assignment of cysteine for unknown sequences requires the prior chemical modification of cysteine (an S-alkylation), as is routinely done with amino-terminal sequencing methods.
The thiohydantoin-amino acid standard HPLC chromatogram (Figure 1) shows the elution times for each of the 20 common amino acid derivatives (approximately 50 pmols) including thiohydantoin-Pro (P) and the common peak designated S/C, identifying Ser and Cys residues. The relative retention time for the S-carboxymethyl derivative of cysteine is indicated by the arrow.
Figure 1. Thiohydantoin-amino acid standard chromatogram (about 50 pmoles).
E. C-Terminal Sequence Analysis of Diverse Samples
The recoveries of first-cycle residues typically result in sequencing initial yields ranging from 10% to 50% of the total amount of sample applied to the Zitex membrane. As observed for amino-terminal sequencing, sample- and residue-specific dependencies contribute to the initial thiohydantoin-amino acid recoveries. The proteins are retained on the Zitex membrane after sequencing (and resist extraction), as determined by amino acid analysis of the membrane-bound samples. Mouse immunoglobulin G (150 kDa) was applied as a phosphate buffer solution directly on a Zitex reaction membrane and subjected to C-terminal sequence analysis (Figure 2). The C-terminal cycle-1 identified the extent of protein processing at the C-terminus of the heavy chain by the detection and relative recoveries of the heavy chain Lys (K, expected full-length sequence C-terminal residue) and Gly (G) residues. The C-terminal residue of the light chain was identified as the expected Cys (C) residue. Superoxide dismutase, an N-terminally blocked protein, was applied to a Zitex reaction membrane (approximately 1 nmol) in 10 µl of 1% aqueous trifluoroacetic acid. The first three cycles of C-terminal sequence analysis resulted in the identification of Lys (K) at cycle-1, Ala (A) at cycle-2, and Ile (I) at cycle-3 (Figure 3). The chemical background remained relatively stable while the thiohydantoin background increased, attributed in part to internal cleavages, as analogously observed for amino-terminal sequencing chemistry.
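To make the initial-yield figure concrete, the following minimal Python sketch converts between the amount applied to the membrane and the cycle-1 recovery. The function names and the 1 nmol example are illustrative assumptions and are not part of the sequencer software; only the 10% to 50% range is taken from the text above.

```python
def initial_yield_percent(recovered_pmol: float, applied_pmol: float) -> float:
    """Cycle-1 recovery as a percentage of the sample applied to the membrane."""
    if applied_pmol <= 0:
        raise ValueError("applied_pmol must be positive")
    return 100.0 * recovered_pmol / applied_pmol


def expected_recovery_pmol(applied_pmol: float,
                           low_percent: float = 10.0,
                           high_percent: float = 50.0) -> tuple:
    """Range of cycle-1 recoveries (pmol) implied by a 10-50% initial yield."""
    return (applied_pmol * low_percent / 100.0,
            applied_pmol * high_percent / 100.0)


if __name__ == "__main__":
    # Hypothetical 1 nmol (1000 pmol) application, chosen only for illustration.
    low, high = expected_recovery_pmol(1000.0)
    print(f"Expected cycle-1 recovery: {low:.0f}-{high:.0f} pmol")
```

Under these assumptions a 1 nmol application would be expected to return roughly 100-500 pmol of the cycle-1 thiohydantoin, with the actual value depending on the sample and the residue.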
Figure 2. Cycle-1 of 2 samples of intact mouse IgG (about 1 nmol) detecting protein processing of the heavy chain by the identification of Lys (K) and Gly (G) residues.
Figure 3. Cycles 1-3 of 1 nmol of superoxide dismutase.
The results of C-terminal sequence analysis on a series of proline-containing protein samples identified thiohydantoin-Pro (P) at cycle-1 of each analysis (Figure 4), confirming the expected full-length sequences. The model polypeptide, polyproline, was applied to a Zitex reaction membrane and directly sequenced, as were the two additional protein systems shown. Ovalbumin and apo-transferrin (about 5 nmol) were subjected to sequence analysis and resulted in the recovery of thiohydantoin-Pro as the first-cycle residue. The sequencing chemistry uses neat trifluoroacetic acid (as described here and elsewhere, cf. ref. 4) to generate the peptidylthiohydantoin-Pro species, which is subsequently cleaved to the thiohydantoin-Pro as part of the routine sequencing cycle methodology.
Figure 4. Cycle-1 chromatograms of polyproline (10 nmol), ovalbumin (5 nmol), and apo-transferrin (5 nmol) showing the identification of the C-terminal thiohydantoin-proline residue. Initial recoveries of approximately 10-20% are found.
The results of C-terminal sequence analysis on a 1 nmol sample of bovine beta-lactoglobulin A are shown in Figure 5. The first three cycles of analysis show the identification of cycle-1 Ile (I), cycle-2 His (H), and cycle-3 Cys (C), confirming the expected full-length protein sequence. The initial recovery of cycle-1 Ile is approximately 40%. A 1 nmol sample of human serum albumin was applied to a Zitex reaction membrane, inserted into an N-terminal sequencer membrane-compatible column, and installed in the HP G1005A N-terminal protein sequencer (5). The sample was subjected to five cycles of automated N-terminal sequence analysis (cycle-1 Asp, D and cycle-2 Ala, A are shown) and the Zitex reaction membrane was then transferred to the HP G1009A C-terminal protein sequencing system (Figure 6). The first two cycles of automated C-terminal sequence analysis of the previously N-terminally sequenced sample resulted in the unambiguous C-terminal residue assignments of Leu (L) at cycle-1 and Gly (G) at cycle-2. The alternate order (C-terminal sequencing, then N-terminal sequencing) is under investigation. The integration of amino-terminal and carboxy-terminal sequence analysis on a single sample should become an invaluable procedure for the sequence determination and structural identification of protein samples.
Figure 5. Cycles 1-3 of bovine β-lactoglobulin A (1 nmol).
Figure 6. HSA (1 nmol), applied to a Zitex membrane, was Edman-sequenced on the HP G1005A N-terminal sequencer for 5 cycles and then transferred to the HP G1009A C-terminal sequencer. Cycles 1 and 2 of the Edman cycles (upper panel: cycles 1 and 2 of N-terminal sequencing with HP G1005A) are shown above cycles 1 and 2 of the C-terminal analysis (lower panel: cycles 1 and 2 of C-terminal sequencing with HP G1009A). Residues are unambiguously assigned in both cases.
IV. CONCLUSIONS
The HP G1009A C-terminal sequencing system automates an efficient, reliable, and reproducible carboxy-terminal sequencing chemistry based on the introduction of diphenyl phosphoroisothiocyanatidate as the coupling reagent, pyridine as a cyclization reagent, and trimethylsilanolate as the cleavage reagent. The strategic incorporation of trifluoroacetic acid into an extended cyclization scheme enables the sequence analysis through all of the 20 common amino
acids, including proline. The sequencing methodology does not require any pre-sequencing modifications to protect side chain residues. Direct sample application to a Zitex reaction membrane avoids covalent attachment procedures and sample manipulations. A robust HPLC method for the identification of the thiohydantoin-amino acid derivatives, in addition to an on-line thiohydantoin-amino acid standard mixture, enables the reliable detection and identification of sequencer cycle residues. The HP G1009A sequencing system provides the flexible platform on which further developments and refinements to the chemical sequencing methodology can be accomplished.
Acknowledgments
The authors dedicate this work to the memory of Dr. Marzell Herold of the Hewlett-Packard Company, Analytical Products Group, GmbH, Waldbronn, Germany.
References
1. Bailey, J.M., Nikfarjam, F., Shenoy, N.S., and Shively, J.E. (1992) Protein Science 1, 68-80.
2. HP G1009A C-Terminal Protein Sequencing System technical note (1994) TN 94-5.
3. Bailey, J.M. and Shively, J.E. (1990) Biochemistry 29, 3145-3156.
4. Bailey, J.M. (1995) in Techniques in Protein Chemistry VI, ed. Crabb, J.W. (Academic Press, San Diego) in press.
5. Miller, C.G. (1994) in Methods: A Companion to Methods in Enzymology VI, ed. Shively, J.E. (Academic Press, San Diego) in press.
Applications Using an Alkylation Method for Carboxy-terminal Protein Sequencing
MeriLisa Bozzini, Jindong Zhao, Pau-Miau Yuan, Doreen Ciolek¹, Yu-Ching Pan¹, John Horton², Daniel R. Marshak², and Victoria L. Boyd
Perkin-Elmer, Applied Biosystems Division, Foster City, CA; ¹Biotechnology Department, Hoffmann-La Roche Inc., Nutley, NJ; and ²Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
I. Introduction
Protein chemists have sought a practical, automated carboxy-terminal (C-terminal) sequencing method for many years. In the biopharmaceutical industry, C-terminal sequence information can be essential for the early characterization of recombinant proteins, as well as for routine batch analysis. Several automated chemical degradation schemes, analogous in principle to the Edman method for amino-terminal (N-terminal) sequencing, have been proposed (1,2). In 1992, we introduced a novel alkylation method for C-terminal protein sequencing (3,4). During early development of the chemistry, all C-terminal sequencing employed covalent attachment of the protein to resin or glass beads, or the use of specialized, non-commercially available membrane supports. Recently, we have optimized our sample handling protocols to employ sequencing supports that are more useful for the protein sequencing community. We have applied protein to a polyvinylidene difluoride (PVDF) membrane using centrifugation (Perkin-Elmer ProSpin Sample Preparation Cartridge) with good results at the 1 to 3 nmol level. Furthermore, we have sequenced electroblotted samples (ProBlott membrane) from SDS-PAGE at the 0.15 to 1 nmol level. The number of residues identified using this method is sequence dependent. Typically, 1 to 2 nmol of purified protein will allow sequencing of 3 to 5 (and sometimes more) cycles from the C-terminus. Sequencing of electroblotted samples, where less than 1 nmol of protein is present, has also been successful for 3 or more cycles. Although a few amino acids remain difficult to detect or sequence, our chemistry is especially useful for the verification of the expected C-terminus of recombinant proteins, and has provided information on possible modifications. Here we show specific examples in which this chemistry serves as an additional analytical tool for protein characterization.
II. Materials and Methods Phenylisocyanate (PIC) was purchased from Aldrich Chemical Company, Inc. Pre-cast gels for SDS-PAGE were purchased from Novex. All sequencing reagents, solvents and materials were supplied by the Perkin-Elmer Corporation.
A. Automated C-terminal Sequencing
C-terminal sequencing was performed using an ABI 477A Protein Sequencer, as previously described (3,4), incorporating changes in some reagent formulations (5,6). The HPLC conditions were the same as those previously described (6). Prior to sequencing, all immobilized protein samples were treated with PIC to derivatize the epsilon amino groups of the lysine residues into phenylureas (5). The PVDF membrane was placed in a microcentrifuge tube. Aliquots (3 µL) of 2.5% (v/v) PIC and 2.5% (v/v) diisopropylethylamine (DIEA), both in acetonitrile, were applied to the PVDF disk. The membrane was dried by placing the open microcentrifuge tube on a heating block at 60 °C.
B. Immobilization of Purified Proteins on PVDF by Centrifugation
For immobilization of purified recombinant proteins, approximately 2 nmol of the protein in solution was applied to a 4-mm diameter disk of PVDF membrane by centrifugation using a ProSpin Sample Preparation Cartridge. The proteins were dissolved in the following: water, 0.1% TFA in water, 20% acetonitrile in water, NaCl, Triton, PBS, or urea (total buffer/salt concentration < 100 mM). Generally, the concentration of the protein in solution was 0.5 to 1 µg/µL.
1. Preparation of two forms of recombinant human interleukin-2
Recombinant human interleukin-2 (rIL-2, previously known as T-cell growth factor) was expressed in E. coli (7). During the purification of rIL-2 by reversed-phase HPLC, another, higher molecular weight form of this protein was also isolated, known as HMW rIL-2 (8). The purification of HMW rIL-2 involved two chromatography steps and utilized a Bakerbond Carboxy-Sulfon (CS) column. Approximately 2 nmol each of rIL-2 and HMW rIL-2 were immobilized using ProSpin cartridges for C-terminal sequence analysis.
2. Preparation of Protein Kinase CKII
The cDNA clone for the β-subunit of human Protein Kinase CKII (Casein Kinase II) was expressed in E. coli using a T7 expression system. The protein was purified to apparent homogeneity by chromatography using the following columns: polyethyleneimine, phenyl-TSK (Toyo Soda Co.), and monoQ (Pharmacia). Following column purification, the proteins were subjected to SDS-PAGE. Two
Scheme 1. Perkin-Elmer C-terminal sequencing chemistry.
"Begin" cycle (1. DPCP/DIEA; 2. NH4SCN/TFA): Protein-NH-CH(R2)-CO-NH-CH(R1)-COOH is converted to the protein C-terminal thiohydantoin (side chain R1).
Step 1 (bromomethylnaphthalene; + base, - HX): the thiohydantoin sulfur is alkylated with a methylnaphthalene group.
Step 2 (NH4SCN/TFA): cleavage yields the truncated protein, now carrying a C-terminal thiohydantoin (side chain R2), and the released alkylated thiohydantoin, "ATH" (side chain R1). The truncated protein enters the next cycle at Step 1.
preparations of CKII revealed protein bands with slightly different electrophoretic mobilities. Approximately 1.5 nmol of each CKII protein was immobilized using ProSpin cartridges for subsequent C-terminal sequence analysis.
3. Preparation of proteins for electroblotting
PsaC is the apoprotein of an Fe-S protein involved in photosynthetic electron transport. The psaC gene was cloned from Synechococcus sp. PCC7002, and the PsaC protein was expressed in E. coli in large quantities and accumulated as inclusion bodies (9). The inclusion bodies were isolated and dissolved in 6 M urea in the presence of 20 mM dithiothreitol. Partial purification of PsaC was performed using gel filtration columns of Sephadex G100. Final isolation of the PsaC protein was by SDS-PAGE. Apomyoglobin (200, 500, and 1,000 pmol) (Sigma) and PsaC (150, 300, and 600 pmol) were loaded on a tricine gel (10-20%, Novex) (10). After electrophoresis, the protein bands were electroblotted onto ProBlott membrane, stained with Coomassie Brilliant Blue R (Sigma), and analyzed by C-terminal sequencing.
III. Results and Discussion
Scheme 1 outlines the Perkin-Elmer method for automated C-terminal sequencing developed in our laboratory (3-6). The benefits of this new sequencing
Table I. Alkylated Thiohydantoin Amino Acids
Reliably Called: Alanine, Arginine, Asparagine, Aspartic Acid, Glutamic Acid, Glutamine, Glycine, Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Tryptophan, Tyrosine, Valine
Difficult to Detect/Sequence: Cysteine, Serine, Threonine
Stops Sequencing: Proline
method are selective cleavage of the ATH with simultaneous derivatization of the C-terminus into a thiohydantoin; the introduction of a UV label onto the cleaved amino acid derivative; and elimination of the need to reactivate the newly exposed carboxylic acid of the truncated protein. In Table I, we list the 20 common amino acid residues, categorized according to our ability to sequence and detect them as alkylated thiohydantoin (ATH) derivatives. Although the number of cycles detected is largely dependent on the protein sequence, generally 16 of the amino acids can be reliably identified for 3 to 5 cycles starting with 1 nmol of protein. Pre-sequencing modifications of several amino acids with reactive side chains are essential for successful sequencing. For example, the epsilon amino groups of lysine residues are derivatized to phenylureas with PIC. In addition, the side chain carboxyl groups of the aspartic and glutamic acid residues are derivatized to piperidine amides using piperidine-HSCN during the activation, or "begin," cycle. Finally, cysteine, serine, and threonine remain difficult to detect and can impair sequencing, and proline stops sequencing. Practical modifications for these last four amino acids are the subject of further investigation. This study reports on the success of the alkylation chemistry in characterizing recombinant proteins. Additionally, we provide examples that include conventional supports for sequencing and sample handling protocols. The general applicability of C-terminal sequencing to the protein sequencing laboratory is assessed, particularly with regard to information on additions, truncations, and heterogeneity of proteins that are derived from gene expression.
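The categories in Table I can be read as a rough screen for how far sequencing of a known C-terminus might proceed. The Python sketch below is an illustration only: the function name and the example tetrapeptide are hypothetical, and the screen ignores the sequence-dependent effects noted above, so it should not be taken as a prediction of real yields.

```python
# Categories transcribed from Table I, using three-letter residue codes.
RELIABLY_CALLED = {
    "Ala", "Arg", "Asn", "Asp", "Glu", "Gln", "Gly", "His",
    "Ile", "Leu", "Lys", "Met", "Phe", "Trp", "Tyr", "Val",
}
DIFFICULT = {"Cys", "Ser", "Thr"}
STOPS_SEQUENCING = {"Pro"}


def predict_cycles(residues_n_to_c):
    """Label each sequencing cycle for a C-terminal stretch of residues.

    residues_n_to_c is written in the usual N-to-C order, so the last
    element is released in cycle 1. Returns (cycle, residue, category)
    tuples and stops after the first proline.
    """
    calls = []
    for cycle, residue in enumerate(reversed(residues_n_to_c), start=1):
        if residue in STOPS_SEQUENCING:
            calls.append((cycle, residue, "stops sequencing"))
            break
        if residue in DIFFICULT:
            calls.append((cycle, residue, "difficult to detect/sequence"))
        elif residue in RELIABLY_CALLED:
            calls.append((cycle, residue, "reliably called"))
        else:
            raise ValueError(f"unknown residue code: {residue}")
    return calls


if __name__ == "__main__":
    # Hypothetical C-terminus ...-Gly-Pro-Ser-Leu-COOH, for illustration only.
    for cycle, residue, category in predict_cycles(["Gly", "Pro", "Ser", "Leu"]):
        print(f"cycle {cycle}: {residue} ({category})")
```

For the hypothetical stretch above, the screen reports Leu as reliably called in cycle 1, Ser as difficult in cycle 2, and a stop at Pro in cycle 3, so Gly is never reached.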
A. C-terminal Sequencing of HMW rIL-2
During the purification of recombinant interleukin-2 (rIL-2) by reversed-phase HPLC, another form of this protein with a higher molecular weight was also purified. To fully characterize this protein, termed HMW rIL-2, all methods currently available were utilized, including: N-terminal sequencing, peptide mapping, amino acid analysis, and matrix-assisted laser desorption time-of-flight mass spectrometry. The combination of data generated from these techniques enabled the characterization of both rIL-2 and HMW rIL-2, and determined that a C-terminal modification was the cause of the observed heterogeneity. As a result, the
direct analysis of the C-terminal region was essential, because the expected sequence of the HMW rIL-2 protein appeared to have two repeating alanine residues. This was necessary because unambiguous confirmation of the double alanine was not achievable by N-terminal sequencing of the isolated C-terminal peptides. Figure 1 shows the C-terminal sequence analysis of approximately 2 nmol of purified HMW rIL-2 immobilized on a PVDF membrane by centrifugation. A 70 pmol ATH standard chromatogram, followed by the first 4 cycles of sequencing, is shown. C-terminal sequencing confirmed the expected sequence to be NH-...-Ala-Leu-Ala-Ala-COOH. As seen in the figure, the double alanine sequence at the C-terminus of the HMW rIL-2 protein was detected unambiguously, followed by the leucine in cycle 3. The signal for alanine expected in cycle 4 is slightly greater than the lag/preview observed in cycle 3. Additionally, the C-terminal sequence of rIL-2 was verified to be NH-...-Leu-Thr-COOH (data not shown).
B. C-terminal Sequencing of Protein Kinase CKII Another example where C-terminal sequencing aided in the characterization of a recombinant protein was for Protein Kinase CKII. SDS-PAGE of two different preparations of the protein revealed CKII bands with slightly different electrophoretic mobilities: one band with the expected mobility, and one suggesting a lower molecular weight protein. The two CKII proteins were found to have identical N-terminal sequences, and both reacted with antibodies to CKII beta. However, different crystal forms were obtained from each preparation. A truncation at the C-terminus of the protein seemed a possible explanation.
Figure 1. C-terminal sequencing of HMW rIL-2.
Figure 2. C-terminal sequencing analysis of a heterogeneous, truncated sample of the β-subunit of Protein Kinase CKII (panels: Standard, Residue 1, Residue 2, Residue 3).
Approximately 1.5 nmol of each of the "full length" and "truncated" CKII proteins was immobilized using ProSpin cartridges. C-terminal sequence analysis verified the expected C-terminus of the "full length" protein to be NH-...-Ile-Arg-COOH (data not shown). Figure 2 shows the C-terminal sequence analysis of the "truncated" protein. Not only was the location of the truncation determined, but the sequencing data also indicated a mixture of two protein components: a 25-amino acid truncation with the C-terminus NH-...-Tyr-Gly-Phe-COOH, and a 26-amino acid truncation with the C-terminus NH-...-Leu-Tyr-Gly-COOH. Consistent with the sequencing data, mass spectrometric analysis (PE Sciex, API 1 LCMS) supported the presence of the two protein components at their expected molecular weights, 21,960 Da and 21,814 Da, respectively.
C. C-terminal Sequencing for Electroblotted Proteins
Electrophoresis followed by electroblotting has become a very useful and established technique to prepare samples for protein sequencing. In many cases, the use of electrophoresis and electroblotting can replace a number of conventional purification steps, and can separate sequenceable material from partially purified or even crude mixtures of proteins. The typical protein capacity for most gels is less than 500 pmol. Furthermore, a significant amount of protein may not be transferred during electroblotting. During the development of this and other methods of C-terminal sequencing (1), minimum protein amounts have been approximately 1 nmol, and, as a result, C-terminal sequencing of electroblotted samples has been viewed as a particular challenge. Horse apomyoglobin and PsaC, a recombinant protein, were used as model proteins to assess the capability of the Perkin-Elmer C-terminal chemistry to sequence less than 1 nmol of electroblotted proteins. PsaC is the apoprotein of a
Figure 3. PsaC and apomyoglobin, electroblotted onto ProBlott. Molecular weight markers: myosin (250 kDa), BSA (98 kDa), glutamic dehydrogenase (64 kDa), alcohol dehydrogenase (50 kDa), carbonic anhydrase (36 kDa), myoglobin (30 kDa), lysozyme (16 kDa), aprotinin (8 kDa), and insulin (4 kDa).
Fe-S protein involved in photosynthetic electron transport, and is expressed in E. coli in inclusion bodies. The protein is conveniently isolated for sequencing by SDS-PAGE. In Figure 3, the protein bands immobilized on a ProBlott membrane are revealed after electrophoresis, transfer, and staining. Table II shows the initial yields for each of the blotted samples, and the total number of residues identified. Initial yields of the PsaC protein are roughly linear over the range of concentrations. Sequencing for myoglobin was successful, although the initial yields were not linear. For these experiments, a smaller cartridge (6 mm) was used for the sequencing reactions. This cartridge was not designed to accommodate large pieces of membrane (i.e., the myoglobin bands), which likely contributed to the inconsistencies in sequencing yields.
Table II. C-terminal Sequencing Initial Yields of SDS-PAGE-blotted Samples
Picomoles Loaded on Gel    Initial Yield (pmol), PsaC    Initial Yield (pmol), Apomyoglobin    Number of Residues Identified
150                        8.6                           -                                     3
200                        -                             9.6                                   3
300                        23.1                          -                                     4
500                        -                             44.1                                  4
600                        46.0                          -                                     4
1000                       -                             33.5                                  4
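The linearity statement for Table II can be checked with a short calculation. The sketch below, in plain Python, computes the initial yield as a percentage of the amount loaded on the gel and the squared correlation coefficient of yield against load for each protein. The variable and function names are assumptions made for this illustration, and with only three points per protein the statistic is indicative at best.

```python
# (pmol loaded on gel, initial yield in pmol), transcribed from Table II.
TABLE_II = {
    "PsaC": [(150, 8.6), (300, 23.1), (600, 46.0)],
    "Apomyoglobin": [(200, 9.6), (500, 44.1), (1000, 33.5)],
}


def percent_yields(points):
    """Initial yield as a percentage of the amount loaded on the gel."""
    return [100.0 * recovered / loaded for loaded, recovered in points]


def r_squared(points):
    """Squared Pearson correlation between amount loaded and initial yield."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy ** 2 / (sxx * syy)


if __name__ == "__main__":
    for protein, points in TABLE_II.items():
        yields = ", ".join(f"{value:.1f}%" for value in percent_yields(points))
        print(f"{protein}: yields {yields}; r^2 = {r_squared(points):.2f}")
```

Note that these percentages are relative to the amount loaded on the gel, not the amount actually transferred to the membrane, so they understate the yield of the sequencing chemistry itself.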
Figure 4. C-terminal sequencing analysis of the electroblotted PsaC protein (panels: Residue 1, Residue 2, Residue 3, Residue 4).
Figure 4 shows the C-terminal sequencing data for the electroblotted PsaC protein. Approximately 300 pmol of the protein was loaded onto the tricine gel; therefore, an estimated 150-250 pmol of protein was transferred to the ProBlott membrane for sequencing. The C-terminal sequence was confirmed to be NH-...-Gly-Leu-Ala-Tyr-COOH.
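The transfer estimate above implies a range for both the blotting efficiency and the initial yield relative to protein actually on the membrane. The following Python sketch works through that arithmetic; it assumes the 23.1 pmol initial yield listed for the 300 pmol PsaC load in Table II, and the helper name is hypothetical.

```python
def percent_of(amount_pmol, reference_pmol):
    """Express amount_pmol as a percentage of reference_pmol."""
    if reference_pmol <= 0:
        raise ValueError("reference_pmol must be positive")
    return 100.0 * amount_pmol / reference_pmol


LOADED_PMOL = 300.0                 # PsaC loaded on the tricine gel
TRANSFERRED_PMOL = (150.0, 250.0)   # estimated range on the ProBlott membrane
INITIAL_YIELD_PMOL = 23.1           # Table II entry for the 300 pmol load

# Blotting efficiency implied by the estimated transfer range.
transfer_percent = tuple(percent_of(t, LOADED_PMOL) for t in TRANSFERRED_PMOL)

# Initial yield relative to transferred protein; the larger transfer
# estimate gives the smaller percentage, so the order is reversed.
yield_percent = tuple(
    percent_of(INITIAL_YIELD_PMOL, t) for t in reversed(TRANSFERRED_PMOL)
)

if __name__ == "__main__":
    print(f"Transfer efficiency: {transfer_percent[0]:.0f}-{transfer_percent[1]:.0f}%")
    print(f"Initial yield vs. transferred: {yield_percent[0]:.1f}-{yield_percent[1]:.1f}%")
```

On these assumptions the transfer efficiency is about 50-83%, and the initial yield is roughly 9-15% of the protein estimated to be on the membrane.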
IV. Conclusions
The ability to sequence proteins from the C-terminus enhances and expands the analytical methods available for protein characterization. Although we are currently in the development stages of the chemical degradation process for C-terminal sequencing, the Perkin-Elmer method has proven utility in the protein analysis laboratory. We have demonstrated the chemistry's robustness in confirming the expected sequence of recombinant proteins. Equally valuable, this chemistry can identify post-translational modifications and sample heterogeneity. Our sequencing method has been demonstrated on proteins obtained from several independent laboratories. Furthermore, the information obtained has been complementary to and consistent with data obtained from other analytical techniques, such as peptide mapping, amino acid analysis, and mass spectrometry. In addition, the proteins were sequenced from a conventional sequencing matrix, PVDF, introduced either by centrifugation or by electroblotting from a gel. As little as 150 pmol of protein was sequenced for 3 to 5 cycles from the blotted proteins. Although some amino acid residues interfere with our ability to sequence all protein samples, many proteins have been sequenced successfully by this new C-terminal sequencing method, which continues to improve and thereby provides a useful analytical tool.
Acknowledgments
We gratefully acknowledge the following contributors for their expertise in preparing the protein samples referred to in this manuscript: Jason Kass, Mark Vandenberg, and Nicholas Chester, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, for their work on CKII. Work at Cold Spring Harbor is supported by a Cancer Research Fund of the Damon Runyon-Walter Winchell Foundation Fellowship, DRG-1193 (to J.L.H.), by grant #CB-72 from the American Cancer Society (to D.R.M.), and by grant AG10208 from the Public Health Service (to D.R.M.). John Golbeck, The University of Nebraska, Lincoln, Nebraska, for his work on PsaC. Zafeer Ahmad and Fazul Khan, Hoffmann-La Roche, Nutley, NJ, for their work on HMW rIL-2. Ling Chen, Perkin-Elmer, Applied Biosystems Division, for mass analysis. We also wish to thank Anne Marie Bozzini and Karen Felker for their editorial assistance and desktop publishing expertise.
References
1. Rangarajan, M. (1988). In "Protein/Peptide Sequence Analysis: Current Methodologies", A.S. Bhown, ed., 135-144.
2. Inglis, A.S. (1991). Anal. Biochem. 195, 183-196.
3. Boyd, V.L., Bozzini, M., Zon, G., Noble, R.L., and Mattaliano, R.J. (1992). Anal. Biochem. 206, 344-352.
4. Boyd, V.L., Bozzini, M., Zon, G., Noble, R.L., and Mattaliano, R.J. (1992). "A New Chemical Method for Protein C-terminal Sequence Analysis." Presented at the Sixth Symposium of the Protein Society, San Diego, CA.
5. Guga, P.J., Bozzini, M., DeFranco, R.J., Large, G.B., and Boyd, V.L. (1993). "C-terminal Sequence Analysis of the Amino Acid Residues with Reactive Side-Chains: Ser, Thr, Cys, Glu, Asp, His, Lys." Presented at the Seventh Symposium of the Protein Society, San Diego, CA.
6. Bozzini, M., DeFranco, R.J., Guga, P.J., Mattaliano, R.J., and Boyd, V.L. (1993). "C-terminal Sequencing Automation and Performance Assessment for the Alkylated Thiohydantoin Method." Presented at the Seventh Symposium of the Protein Society, San Diego, CA.
7. Ju, G., Collins, L., Kaffka, K.L., Tsien, W.-H., Chizzonite, R., Crowl, R., Bhatt, R., and Kilian, P.L. (1987). J. Biol. Chem. 262, 5723-5731.
8. Ahmad, Z., Ciolek, D., Pan, Y.-C.E., Michel, H., and Khan, F. (1994). J. Protein Chemistry (in press).
9. Zhao, J., Snyder, W.B., Muhlenhoff, U., Rhiel, E., Warren, P.V., Golbeck, J.H., and Bryant, D.A. (1993). Mol. Microbiol. 9, 183-194.
10. Schagger, H. and von Jagow, G. (1987). Anal. Biochem. 166, 368-379.
C-Terminal Sequence Analysis of Polypeptides Containing C-Terminal Proline
Jerome M. Bailey, Oanh Tu, Gilbert Issai, and John E. Shively
Beckman Research Institute of the City of Hope, Division of Immunology, Duarte, CA 91010
I. INTRODUCTION
The last few years have seen a renewed interest in the development of a chemical method for the sequential C-terminal sequence analysis of proteins and peptides. Such a method would be analogous and complementary to the Edman degradation commonly used for N-terminal sequence analysis (1). It would also be invaluable for the sequence analysis of proteins with naturally occurring N-terminal blocking groups, for the detection of post-translational processing at the carboxy-terminus of expressed gene products, and for assistance in the design of oligonucleotide probes for gene cloning. Although a number of methods have been described, the method known as the "thiocyanate method", first described in 1926 (2), has been the most widely studied and appears to offer the most promise due to its similarity to current methods of N-terminal sequence analysis. Work performed in our laboratory over the last several years has systematically addressed many of the problems associated with the thiocyanate chemistry. The use of sodium or potassium trimethylsilanolate for the cleavage reaction provided a method for rapid and specific hydrolysis of the derivatized C-terminal amino acid, which left the shortened peptide with a free C-terminal carboxylate ready for continued rounds of sequencing (3). The use of diphenylphosphoroisothiocyanatidate (DPP-ITC) and pyridine combined the activation and derivatization steps and
permitted the quantitative conversion of 19 of the twenty common amino acids (the exception being proline) to a thiohydantoin derivative. These improvements permitted application of the C-terminal chemistry to a wide variety of protein samples with cycle times similar to those employed for N-terminal sequence analysis (4). The introduction of Zitex (porous Teflon) as a support for protein sequencing permitted the C-terminal sequence analysis of protein samples that were non-covalently applied to the sequencing support (4,5). The inability of C-terminal proline to be derivatized to a thiohydantoin has been a major impediment to the development of a routine method for the C-terminal sequence analysis of proteins and peptides. Since the method was first described in 1926 (2), the derivatization of C-terminal proline has been problematic. While over the years a few investigators have reported the derivatization of proline, either as the free amino acid or on a peptide, to a thiohydantoin (6-8), others have been unable to obtain any experimental evidence for the formation of a thiohydantoin derivative of proline (9-12). Recently, utilizing a procedure similar to that described by Kubo et al. (6), Inglis et al. (13) have described the successful synthesis of thiohydantoin proline from N-acetylproline. This was done by the one-step reaction of acetic anhydride, acetic acid, trifluoroacetic acid, and ammonium thiocyanate with N-acetylproline. We have reproduced this synthesis and further developed it into a large-scale synthesis of TH-Proline. We also describe the development of chemistry based on the DPP-ITC/pyridine reaction which permits the efficient derivatization and hydrolysis of peptidyl C-terminal proline to a thiohydantoin, and discuss the integration of this chemistry into an automated method for the C-terminal sequence analysis of polypeptides containing C-terminal proline.
II. MATERIALS AND METHODS
Materials. Diphenyl chlorophosphate, acetic anhydride, trimethylsilylisothiocyanate, anhydrous dimethylformamide (DMF), anhydrous acetonitrile, and anhydrous pyridine were from Aldrich. Water was purified on a Millipore Milli-Q system. Sodium trimethylsilanolate was obtained from Fluka. Diphenyl phosphoroisothiocyanatidate was synthesized as described (14). All of the peptides used in this study were obtained from either Bachem or Sigma. N-Acetylproline was from Sigma.
Diisopropylethylamine (Sequanal grade), trifluoroacetic acid (Sequanal grade), and 1,3-dicyclohexylcarbodiimide (DCC) were obtained from Pierce. The carboxylic acid modified polyethylene membranes were from the Pall Corporation (Long Island, NY). Zitex G-110 was from Norton Performance Plastics (Wayne, NJ). The amino acid thiohydantoins used in this study were synthesized as described (9). The Reliasil HPLC columns used in this study were obtained from Column Engineering (Ontario, CA).

Synthesis of Thiohydantoin Proline. Acetic anhydride (100 ml), acetic acid (20 ml), and trifluoroacetic acid (10 ml) were added to N-acetylproline (500 mg). The mixture was stirred until dissolved. Trimethylsilylisothiocyanate (3 ml) was added and the mixture was stirred at 60°C for 90 min. The reaction was dried to a powder by rotary evaporation and water (50 ml) was added. This solution was again dried by rotary evaporation and water (20 ml) was added. A white powder formed. The solution was kept on ice for approximately 30 min. The powder (thiohydantoin proline) was collected by vacuum filtration. The yield was approximately 40%. The product was characterized by UV, FAB/MS, and NMR. The UV absorption spectrum had a λmax of 271 nm in methanol. FAB/MS gave the expected MH+ = 157. NMR δ 4.32 (Hα, m), 3.85 (Hδ, m), 3.43 (Hδ, m), 2.20 (Hγ and Hβ, m), 1.70 (Hβ, m).
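As a rough consistency check on the scale of this synthesis, the approximate 40% yield can be converted to a product mass. The sketch below is illustrative only: the molar masses (157.17 g/mol for N-acetylproline, C7H11NO3, and 156.20 g/mol for thiohydantoin proline, C6H8N2OS, consistent with MH+ = 157) are values assumed here from the molecular formulas, and the function name is hypothetical.

```python
# Rough yield check for the thiohydantoin proline synthesis.
# Both molar masses are assumptions computed from the molecular formulas.
MW_N_ACETYLPROLINE = 157.17  # g/mol, C7H11NO3 (assumed)
MW_TH_PROLINE = 156.20       # g/mol, C6H8N2OS (assumed; consistent with MH+ = 157)


def expected_product_mg(start_mg: float, fractional_yield: float) -> float:
    """Mass of TH-proline (mg) expected from start_mg of N-acetylproline."""
    mmol_start = start_mg / MW_N_ACETYLPROLINE
    return mmol_start * MW_TH_PROLINE * fractional_yield


if __name__ == "__main__":
    print(f"{expected_product_mg(500.0, 0.40):.0f} mg")
```

For 500 mg of starting material at 40% yield this gives roughly 200 mg of product, in line with the scale quoted in the Summary.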
Covalent Coupling of Peptides to Carboxylic Acid Modified Polyethylene. Peptides were covalently coupled to carboxylic acid modified polyethylene and quantitated as described (15).

Application of Protein Samples to Zitex. The Zitex support (2 x 10 mm) was pre-wet with isopropanol and protein samples (25 µl) dissolved in water were applied. The samples were allowed to dry before sequencing.
HPLC Separation of the Amino Acid Thiohydantoins. Reverse phase HPLC separation of the thiohydantoin amino acid derivatives (400 pmol) was performed on a C-18 (3 µm, 100 Å) Reliasil column (2.0 mm x 250 mm) on a Beckman 126 Pump Module with a Shimadzu (SPD-6A) detector (Figure 1). The column was eluted for 2 min with solvent A (0.1% trifluoroacetic acid in water), followed by a discontinuous gradient to solvent B (10% methanol, 10% water, 80% acetonitrile), at a flow rate of 0.15 ml/min at 35°C. The gradient used was as follows: 0% B for 2 min, 0-4% B over 3 min, 4-35% B over 35 min, 35-45% B over 5 min, and 45-0% B over 3 min. Absorbance was monitored at 265 nm.
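The gradient program above can be written as a table of breakpoints, which makes it easy to check the total run time (48 min) or to look up the solvent composition at a given retention time. The following is a minimal sketch, assuming that each segment is a linear ramp and that the initial 2 min hold is the same as the 2 min elution with solvent A; the names `GRADIENT` and `percent_b` are illustrative.

```python
from bisect import bisect_right

# (time in min, %B) breakpoints read from the gradient description.
# Treating every segment as a linear ramp is an assumption of this sketch.
GRADIENT = [(0.0, 0.0), (2.0, 0.0), (5.0, 4.0), (40.0, 35.0), (45.0, 45.0), (48.0, 0.0)]


def percent_b(t_min: float) -> float:
    """Linearly interpolate %B at t_min, clamped to the ends of the program."""
    times = [t for t, _ in GRADIENT]
    if t_min <= times[0]:
        return GRADIENT[0][1]
    if t_min >= times[-1]:
        return GRADIENT[-1][1]
    i = bisect_right(times, t_min) - 1  # index of the segment containing t_min
    (t0, b0), (t1, b1) = GRADIENT[i], GRADIENT[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (1.0, 5.0, 22.5, 45.0):
        print(t, percent_b(t))
```

Calling `percent_b` with a peak's retention time gives the approximate composition at which that derivative elutes, ignoring system dwell volume.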
Automation of the C-Terminal Sequencing Chemistry. The instrument used for automation of the chemistry described in this manuscript has been described previously (5).
[Figure 1 chromatogram; x-axis: Retention Time (min).]
Figure 1. HPLC Separation of the Amino Acid Thiohydantoins.
III. RESULTS AND DISCUSSION
Chemistry for the Automated C-Terminal Sequence Analysis of Proline Containing Polypeptides. When the acetic anhydride/TMS-ITC/TFA procedure used for the synthesis of TH-proline was applied in our laboratory to the tripeptide N-acetyl-Ala-Phe-Pro, thiohydantoin proline was formed in low yield (approx. 1-2% of theoretical). Recovery of the peptide products after the reaction revealed that approximately half of the starting peptide was unchanged and the remaining half had been decarboxylated at the C-terminus, thereby blocking it to C-terminal sequence analysis. This was most likely caused by the high concentration of trifluoroacetic acid, the excess of acetic anhydride present, and the high temperature (SO'^C) at which the reaction was performed. The poor reaction with C-terminal proline most likely stems from the fact that proline cannot form the oxazolinone necessary for efficient reaction with the isothiocyanate. Work in our laboratory has obviated the need for oxazolinone formation by the use of diphenyl phosphoroisothiocyanatidate and pyridine. Reaction of this reagent with C-terminal proline directly forms the acylisothiocyanate. Once the acylisothiocyanate is formed, the addition of either liquid or gas phase acid followed by water was
found to release proline as a thiohydantoin amino acid derivative. Unlike thiohydantoin formation with the other 19 naturally occurring amino acids, C-terminal proline thiohydantoin requires the addition of acid to provide a hydrogen ion for protonation of the thiohydantoin ring nitrogen. This step is necessary for stabilization of the proline thiohydantoin ring. The resulting quaternary amine containing thiohydantoin can then be readily hydrolyzed to a shortened peptide and thiohydantoin proline by introduction of water vapor or by the addition of sodium trimethylsilanolate (the reagent normally used for cleavage of peptidylthiohydantoins). The automation of this chemistry has allowed proline to be analyzed in a sequential fashion without affecting the chemical degradation of the other amino acids.
[Figure 2 reaction scheme. Labeled species and reagents, in order: Peptide; diisopropylethylamine; Peptide Mixed Anhydride; pyridine; Peptide Isothiocyanate; H2O; Thiohydantoin Proline; Shortened Peptide; sodium trimethylsilanolate; Shortened Peptide.]
Figure 2. Reaction Scheme and Postulated Intermediates for the Sequential C-Terminal Degradation of Polypeptides Which May Contain Proline.
The chemical scheme for C-terminal sequencing is shown in Figure 2. The first step involves treatment of the peptide or protein sample with diisopropylethylamine in order to convert the C-terminal carboxylic acid into a carboxylate salt. Derivatization of the C-terminal amino acid to a thiohydantoin is accomplished with diphenyl phosphoroisothiocyanatidate (liquid phase) and pyridine (gas phase). The peptide is then extensively washed with ethyl acetate and acetonitrile to remove reaction by-products. The peptide is then treated briefly with gas phase trifluoroacetic acid, followed by water vapor, in case the C-terminal residue is a proline (this treatment has no effect on residues which are not proline). The derivatized amino acid is then specifically cleaved with sodium or potassium trimethylsilanolate to generate a shortened peptide or protein which is ready for continued sequencing. In the case of a C-terminal proline which was already removed by water vapor, the silanolate treatment merely converts the C-terminal carboxylic acid group on the shortened peptide to a carboxylate. The thiohydantoin amino acid is then identified and quantitated by reverse-phase HPLC.

The proposed role of trifluoroacetic acid (TFA) is the protonation of the thiohydantoin proline ring. The addition of water or silanolate salt is required for cleavage of the TH-proline. If the temperature is raised to 80°C and the TFA step is prolonged (10 to 20 min), the acid alone can be used to cleave the TH-proline. TFA and water under the conditions used for the proline reaction at 50°C do not cleave peptidylthiohydantoins formed from the other 19 amino acids. This makes it possible to integrate the unique steps needed for proline as routine steps in the automated C-terminal sequencing program.

Examples of Automated Sequence Analysis. Peptide samples for C-terminal sequencing were covalently attached to carboxylic acid modified polyethylene prior to sequence analysis, and proteins were non-covalently applied to Zitex strips.
IV. SUMMARY
We have described a simple procedure for the large scale (200 mg) synthesis of thiohydantoin proline from N-acetyl proline and extensively characterized this analogue. The thiohydantoin derivative of proline is conveniently obtained as a white powder which is stable to long term storage. The availability of a thiohydantoin proline standard is critical for the evaluation of automated sequencing results. We have described automated chemistry which is capable
[Figure 3 chromatograms; one panel is labeled Cycle 2, with a peak marked P; x-axis: Retention Time (min).]
Figure 3. Automated C-Terminal Sequencing of the Tripeptide, LAP (15 nmol), Covalently Attached to Carboxylic Acid Modified Polyethylene. Each thiohydantoin derivative is identified by comparison to the retention time of an authentic standard. Unlabeled peaks are background produced by reaction side products.
of the C-terminal sequence analysis of polypeptides containing C-terminal proline. This chemistry has been integrated into the automated sequencing program previously used for the C-terminal sequence analysis of the other 19 amino acids without affecting performance. We have proposed a chemical mechanism for proline sequencing via the thiohydantoin route which is consistent with the experiments performed to date. The failure of previous methods to derivatize C-terminal proline may be due to the inability of proline to form an oxazolinone, a necessary step in many of the previous methods. The use of DPP-ITC/pyridine for derivatization permits the direct formation of an acylisothiocyanate at the C-terminus without the
[Figure 4 chromatograms; x-axis: Retention Time (min).]
Figure 4. Automated C-Terminal Sequencing of Ovalbumin (approx. 5-6 nmol) Non-Covalently Applied to Zitex. The expected sequence at the C-terminus is Val-Ser-Pro.
need for oxazolinone formation. Once an acylisothiocyanate is formed it can cyclize to a quaternary amine containing thiohydantoin. This thiohydantoin, if protonated with acid, is stable. If the acid step is eliminated, C-terminal proline is regenerated. The quaternary amine containing proline thiohydantoin can be readily cleaved with water vapor or alternatively with the silanolate salt normally used for the cleavage reaction.

Current Expectations for C-Terminal Sequencing. Current technology permits 1-3 cycles of automated C-terminal sequence analysis on 200 pmol to 4 nmol of non-covalently applied protein samples which contain any of the twenty common amino acids.
Work is continuing toward the goal of extending the number of cycles of sequence information which can be obtained with this automated method.
REFERENCES
1. Edman, P. (1950) Acta Chem. Scand. 4, 283-293.
2. Schlack, P., and Kumpf, W. (1926) Z. Physiol. Chem. 154, 125-170.
3. Bailey, J.M., Shenoy, N.S., Ronk, M., and Shively, J.E. (1992) Protein Science 1, 68-80.
4. Bailey, J.M., Nikfarjam, F., Shenoy, N.S., and Shively, J.E. (1992) Protein Science 1, 1622-1633.
5. Bailey, J.M., Rusnak, M., and Shively, J.E. (1993) Analytical Biochemistry 212, 366-374.
6. Kubo, H., Nakajima, T., and Tamura, Z. (1971) Chem. Pharm. Bull. 19, 210-211.
7. Inglis, A.S., Wilshire, J.F.K., Casagranda, F., and Laslett, R.L. (1989) In Methods in Protein Sequence Analysis (Wittmann-Liebold, B., Ed.) pp. 137-144. Springer-Verlag.
8. Yamashita, S., and Ishikawa, N. (1971) Proc. Hoshi. Pharm. 13, 136-138.
9. Bailey, J.M., and Shively, J.E. (1990) Biochemistry 29, 3145-3156.
10. Turner, R.A., and Schmerzler, G. (1954) Biochim. Biophys. Acta 13, 553-559.
11. Fox, S.W., Hurst, T.L., Griffith, J.F., and Underwood, O. (1955) J. Am. Chem. Soc. 77, 3119-3122.
12. Stark, G.R. (1968) Biochemistry 7, 1796-1807.
13. Inglis, A.S., and De Luca, C. (1993) In Methods in Protein Sequence Analysis (Imahori, K., Sakiyama, F., Eds.) pp. 71-78. Plenum Publishing Corp.
14. Kenner, G.W., Khorana, H.G., and Stedman, R.J. (1953) Chem. Soc. Jour. (London), 673-678.
15. Shenoy, N.S., Bailey, J.M., and Shively, J.E. (1992) Protein Science 1, 58-67.
SECTION IV Peptide and Protein Separations and Other Methods
High Sensitivity Detection of Tryptic Digests using Derivatization and Fluorescence Detection
Steven A. Cohen, Igor Mechnikov, and Patricia Young
Waters, Milford MA 01757
I. Introduction
Peptide mapping by reversed-phase HPLC continues to play an important role in the qualitative analysis of protein structure. Unfortunately, increasing the information obtained from tryptic maps often conflicts with the need to conserve precious samples. This conflict has been a strong motivating force in the development of more sensitive methods for peptide analysis that also provide additional information in comparison to standard mapping procedures. Peptide derivatization has long been studied as a means of providing higher sensitivity and/or selectivity for peptide analysis (1-3). Recent studies (4-6) with amino acid analysis utilizing derivatization with 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate (AQC) have suggested that application of this chemistry to tryptic digests could prove beneficial for high sensitivity peptide analysis. We herein describe the application of this procedure to tryptic digests of a model protein and demonstrate useful maps with subpicomole amounts of sample.
II. Experimental
Materials
Cytochrome c, TPCK-trypsin, methionine-enkephalin, and amino acid standards were obtained from Sigma Chemical Co., St. Louis MO. Trifluoroacetic acid was from Pierce Chemical Co., Rockford IL. Borate buffer and AQC (AccQ-Fluor™ reagent) for derivatization were from Waters, Milford, MA. Other chemicals were HPLC or analytical grade.
Proteolytic Digestion
A stock solution of rabbit Cytochrome c was made up in water at a concentration of 4.3 mg/ml. The concentration of the stock solution was determined by quantitative amino acid analysis of a 5 µl aliquot that was hydrolyzed and analyzed as previously described (4). The stock solution was diluted 5-fold with 0.1 M ammonium bicarbonate or 0.1 M sodium borate (both at pH 8.2) and TPCK-Trypsin in 10 mM HCl was added to give a 1:20 enzyme:protein ratio (w/w). Digestion was carried out at 37°C for 24 hours, after which the enzymatic activity was terminated by heating at 100°C for 5 min.
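The sample loads quoted later for the maps follow from these concentrations. As a minimal sketch, the function below converts a concentration and an injected volume into picomoles; the molar mass of about 12,000 g/mol for cytochrome c is an assumption made here for illustration (it is not stated in the text), as are the function and parameter names.

```python
def pmol_in_aliquot(conc_mg_per_ml: float, volume_ul: float, mw_g_per_mol: float) -> float:
    """Amount of protein (pmol) contained in an aliquot."""
    mass_ug = conc_mg_per_ml * volume_ul   # mg/ml is numerically equal to ug/ul
    return mass_ug / mw_g_per_mol * 1e6    # ug / (g/mol) = umol; x 1e6 gives pmol


if __name__ == "__main__":
    digest_conc = 4.3 / 5.0                # stock diluted 5-fold for digestion, mg/ml
    ASSUMED_MW = 12000.0                   # g/mol, assumed for cytochrome c
    print(round(pmol_in_aliquot(digest_conc, 5.0, ASSUMED_MW)))
```

With these assumptions a 5 µl injection of the digest corresponds to roughly 360 pmol, which matches the load given in the legend to Figure 1.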
Peptide Derivatization
Peptide digests were diluted with 0.2 M borate buffer, pH 8.8, containing 5 mM EDTA: 10-fold for most experiments and 100-fold for high sensitivity studies. Labeling of the amino groups was accomplished by addition of one volume of AQC (10 mM) in acetonitrile to four volumes of buffered sample. Peptide digests in ammonium bicarbonate were rendered salt-free by lyophilization, whereas digests performed in borate buffer were simply diluted prior to derivatization.
HPLC Analysis
All separations were performed on 625 or 626 LC Systems with a 715 Ultra WISP™ or 717 Plus Autosampler and a Temperature Control Module (all from Waters, Milford, MA). HPLC columns used for separations were Delta-Pak™ C18 300 Å reversed-phase columns (2.0 x 150 mm or 3.9 x 150 mm).
Separation of Underivatized Peptides
Peptides were detected with a M486 variable UV detector set at 214 nm. The eluents were A = 0.1% TFA in water and B = 0.1% TFA in acetonitrile. The gradient was from 0-45% B in 90 minutes at a flow rate of 0.18 ml/min. The column was thermostatted at 35°C.
Separation of Derivatized Peptides
Derivatized peptides were detected by fluorescence with excitation at 250 nm and emission at 395 nm using model M470 or M474 Scanning Fluorescence Detectors. Occasionally the UV detector set at 254 or 214 nm was used in series with the fluorescence detector. The A eluent was 140 mM sodium acetate, 17 mM triethylamine, pH adjusted with phosphoric acid to 4.0-6.0. Eluent B
was acetonitrile. Some separations used the TFA system described above or solutions of ammonium acetate at pH 6.06 or 5.56, or 0.1 M sodium phosphate at pH 3.0 or 3.5.
III. Results
Separation of Samples using TFA Eluents
Tryptic cleavage of Cytochrome c in borate buffer and separation of the peptides using the TFA eluent system resulted in the map shown in Figure 1a. Comparison of this result to digests performed in the more commonly used ammonium bicarbonate buffer system (7) demonstrated that trypsin activity was essentially equivalent in the two buffer systems, as there was little difference in the pattern of peptides obtained. The pattern varied significantly only for the related peptide pair T5 and T6; T6 carries an extra Lys residue at its C-terminus, ending in Lys-Lys. In the current study, one of these peptides is not present. This experiment allowed the use of the simpler
Figure 1. Separations of Cytochrome c tryptic digests. (A) 5 µl of the tryptic digest (0.86 mg/ml) was injected to give approximately 360 pmol separated on the 3.9 mm ID column. Peaks are labeled according to Reference 4. (B) A 5 µl aliquot of the 10-fold diluted digest (ca. 36 pmol) was separated on the 2 mm ID column and this same sample was derivatized with AQC and 50 µl injected. Other conditions are described in the Experimental Section.
borate buffer for routine tryptic digestion. Sample lyophilization and the risk of sample loss were thus eliminated, as the derivatizations could be carried out with the borate digest after dilution with more buffer. A comparison of the underivatized mixture with a derivatized sample (Figure 1b) was made to determine the efficiency of the derivatization with AQC. Where nothing in the derivatized map interferes at the position of an underivatized peptide, this comparison reveals any tryptic peptide for which a significant fraction was not derivatized. Examples of such peptides are T12 and T13, where it can be seen that no significant underivatized peptide remained, thus showing effective conversion to their derivatized forms. This observation is consistent with previous results for peptide derivatization (6), which have suggested that high yields are routinely obtained for small to medium size peptides (2-25 amino acids) if the reaction mixture is maintained above pH 8.2 and the reagent is present in at least a five-fold molar excess over the total amine content.
Peptide Response as a Function of pH: Model Compound Studies
A stock solution of Met-enkephalin (1.0 mg/ml) was made up in water, and diluted to give a 0.1 mg/ml solution. A 10 µl aliquot of this solution was buffered with 10 µl of borate (pH 8.8) and derivatized with 20 µl of AQC solution. Comparison of the derivatized and underivatized samples indicated that derivatization was essentially quantitative. Analyses of the derivatized peptide were carried out in the phosphate/acetate eluent system at five different pH values ranging from 4 to 6. Other eluents studied included the TFA eluents (pH ca. 2.1), phosphate buffers at pH 3.0 and 3.5, and ammonium acetate at pH 6.0. Representative chromatograms for the different eluent systems are shown in Figure 2. Retention time and peak height as a function of pH are given in Figure 3. It is worth noting that despite the large variation in mobile phase conditions, the retention of methionine enkephalin was always within a 1.5 minute window. Hence the concentration of acetonitrile at the time of elution varied by at most 1.5%. Although the fluorescence of AQC derivatives is known to increase as a function of increasing acetonitrile concentration (4), the 1.5% change is certainly too small to significantly affect the sensitivity, and the increased response can thus be attributed to increased fluorescence quantum yield as a function of the pH change. This pH sensitivity is similar to that reported for amino acids, and although the evidence is not definitive, the decreased response at acidic pH is believed to result from protonation of the quinoline ring on the derivatized peptide.
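The 1.5% figure follows from the gradient slope. As a short worked check, assuming the 0-45% B in 45 min gradient given in the legend to Figure 2, with eluent B being neat acetonitrile, a 1.5 min spread in retention corresponds to:

```latex
\Delta[\mathrm{CH_3CN}] \approx \frac{45\%}{45\ \mathrm{min}} \times 1.5\ \mathrm{min} = 1.5\%
```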
[Figure 2 chromatograms; traces labeled Ammonium Acetate pH 6.06, pH 6.06, pH 5.56, pH 5.06, and pH 4.56; x-axis: Minutes.]
Figure 2. Influence of eluent pH on retention and response. Methionine enkephalin was analyzed using eluents with pH varying from 3.5 to 6.06. Derivatization and chromatography are described in the Methods section. All gradients were from 0-45% B in 45 minutes. Eluents at pH 3.5 and 4.06 were 0.1 M sodium phosphate. Eluents at pH 4.56, 5.06, 5.56 and 6.06 were the acetate/phosphate/TEA system described in the experimental section. In addition, at pH 6.06, the peptide was analyzed with an ammonium acetate eluent.
[Figure 3 plot; Retention Time and Peak Height vs. Eluent pH; series: Phosphate Buffer and Acetate/Phosphate/TEA Buffer.]
Figure 3. Effect of eluent pH on the retention and peak height of methionine enkephalin.
Derivatized Cytochrome c Peptide Maps
A number of studies with peptide maps (8-10) have shown that increased sensitivity can be achieved through a reduction in column diameter from ca. 4 mm
to 2 or even 1 mm. However, these experiments all used UV detection, which is fundamentally different from the fluorescence detection used in this study for derivatized peptide mixtures. With UV detection, the signal is proportional to the pathlength of the detector flow cell, whereas in fluorescence detectors the response is proportional to the flow cell volume. This inherent difference creates severe limitations on flow cell design to maximize peak response while minimizing the detector-associated bandspread. The bandspread effects are, of course, magnified by any reduction in column diameter that decreases the peak volume.

Two detector configurations, with flow cell volumes of 5 and 16 µl, were used for studies on optimizing detector sensitivity and resolution. In addition, the effect of column diameter was examined by comparing 2 mm and 3.9 mm ID columns. The objective was to see if there is an optimal combination of column dimension and detector configuration that provides the highest sensitivity without seriously compromising the resolution of a complex map. Identical injections containing approximately 860 fmol of the cytochrome c digest were analyzed and the results are shown in Figure 4. The high sensitivity of the analysis is evident from the excellent response observed with this subnanomole analysis with all combinations of column and flow cell. Comparison
[Figure 4 chromatograms, panels A-D; full-scale (FS) settings of 10 mV and 200 mV appear on the panels; x-axis: Minutes.]
Figure 4. Separation of 860 fmol of Cytochrome c tryptic digest using (A) a 3.9 mm ID column with a 5 µl flow cell, (B) a 2 mm column with the 5 µl flow cell, (C) a 3.9 mm column with the 16 µl flow cell, and (D) a 2 mm column with the 16 µl flow cell. All other conditions are described in the experimental section. Peaks marked with an asterisk (*) are found in derivatization blanks. Other peaks are components in the digest that are derivatized.
of these maps to derivatization blanks indicated that three major reagent-related peaks were generated. The first two of these are 6-aminoquinoline (the reagent hydrolysis product) and derivatized ammonia, respectively, while the third component has not yet been identified. Many other components are sample-related, and are either free amino acids, which elute prior to 40 minutes on the 3.9 mm column (Figures 4A and C), or derivatized peptides. At least 13 components with retention > 40 minutes can be identified, which may be the derivatized forms of the peptides shown in Figure 1. Small polar peptides are also produced in Cytochrome c digests (e.g. residues 6-8, Gly-Gly-Lys), but are poorly retained on reversed-phase columns in underivatized analyses and likely elute at or near the column void volume. Some of these peptides may elute in the amino acid region, as their retention behavior in the derivatized form may be similar to that of amino acids.

The best resolution is obtained with the larger column and the smaller flow cell (Figure 4A). However, the sensitivity of this system is the poorest of the four configurations, and peak heights are nearly three times lower than those obtained with the same flow cell on the smaller column (Figure 4B). Maximum response is achieved with the small column and the large flow cell (Figure 4D), but the resolution is significantly less than that observed with the same column and the smaller cell. The larger flow cell can thus be used with the 3.9 mm column with only a modest loss of separation efficiency from detector-related bandspread (Figure 4C), but the smaller peak volumes that result from the reduced diameter of the 2 mm column make the larger cell less than ideal if high resolution is required. The combination of the 2 mm column and the 5 µl flow cell
[Figure 5 chromatograms; x-axis: Minutes.]
Figure 5. High sensitivity analysis of Cytochrome c tryptic digests. Samples were analyzed on the 2 mm ID column using a 5 µl flow cell. Amounts injected were (A) 860 fmol, (B) 172 fmol and (C) a derivatization blank with reagent amount equivalent to the analysis in (A) and five times the reagent injected in (B). Major peaks present in the blank are labeled with an asterisk.
(Figure 4B) provided a reasonable compromise between resolution and sensitivity. High sensitivity analyses with this system are shown in Figure 5.
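The gain in response on moving to the narrower column can be compared with a simple scaling estimate. The sketch below assumes that, for the same injected mass, column length, packing, and linear velocity, the peak concentration at the detector scales inversely with the column cross-sectional area; this idealized model and the function name are assumptions made here, and the estimate ignores the flow-cell volume effects discussed above.

```python
def concentration_gain(d_large_mm: float, d_small_mm: float) -> float:
    """Idealized peak-concentration gain when the column ID is reduced.

    Assumes equal injected mass, column length, packing and linear velocity,
    so that peak volume scales with the square of the column diameter.
    """
    return (d_large_mm / d_small_mm) ** 2


if __name__ == "__main__":
    print(f"{concentration_gain(3.9, 2.0):.2f}-fold")
```

Under these assumptions the change from a 3.9 mm to a 2 mm ID column predicts a gain of about 3.8-fold, of the same order as the nearly threefold difference in peak height reported for Figures 4A and 4B.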
IV. Conclusions
Derivatization of peptide mixtures with AQC has been shown to be an effective means of generating useful peptide maps from protein tryptic digests. Maintaining the eluent pH above 5 is important for maximal response. The choice of HPLC column and fluorescence detector flow cell can be optimized to provide subpicomole detection, but there is some compromise between operating conditions that provide the maximum sensitivity and those that provide the highest resolution. Sample derivatization is extremely simple and can be accomplished with a small fraction, typically 2-5%, of a tryptic digest prepared in sodium borate buffer. Both standard peptide maps and derivatized maps are readily generated from a single digest. This provides additional information for a complex peptide mixture without requiring significantly more sample.

Several key features of this derivatization analysis are still not fully explored. Volatile buffer systems, such as the ammonium bicarbonate that was briefly studied, will permit the facile collection of fractions for further analysis such as amino acid composition or mass spectrometry. Indeed, it should be possible to utilize on-line mass spectrometric detection with such buffers. Capillary electrophoretic analysis is also feasible for these samples (3) and may even provide subfemtomole detection.
References
1. Lunte, S. M., and Wong, O. S. (1990) Current Separations 10, 19-26.
2. Polo, M. C., Gonzalez de Llano, D., and Ramos, M. in "Food Analysis by HPLC", L. M. L. Nollet, ed., Marcel Dekker, Inc., pp 117-140.
3. Liu, J., Hsieh, Y.-Z., Wiesler, D., and Novotny, M. (1991) Anal. Chem. 63, 408-412.
4. Cohen, S. A., and Michaud, D. P. (1993) Anal. Biochem. 211, 279-287.
5. Cohen, S. A., De Antonis, K. M., and Michaud, D. P. (1993) in "Techniques in Protein Chemistry IV", R. H. Angeletti, ed., pp 289-298.
6. De Antonis, K. M., Brown, P. B., and Cohen, S. A. (1994) J. Chromatogr. 661, 279-285.
7. Young, P. M., and Wheat, T. E. (1990) J. Chromatogr. 512, 273-281.
8. Simpson, R. J., and Nice, E. C. (1987) in "Methods in Protein Sequence Analysis", K. A. Walsh, ed., Humana Press, pp 213-227.
9. Stone, K. A., LoPresti, M. B., Williams, N. D., Crawford, J. M., DeAngelis, R., and Williams, K. R. (1989) in "Techniques in Protein Chemistry IV", T. E. Hugli, ed., Academic Press, pp 377-391.
10. Burgoyne, R., Stacey, C., Young, P., Astephen, N., and Merion, M. (1989) in "Techniques in Protein Chemistry IV", T. E. Hugli, ed., Academic Press, pp 399-413.
Reagents for Rapid Reduction of Disulfide Bonds in Proteins
Rajeeva Singh
ImmunoGen, Inc., Cambridge, MA 02139
George M. Whitesides
Department of Chemistry, Harvard University, Cambridge, MA 02138
I. Introduction
Disulfide-reducing reagents are routinely used in biochemical manipulations for (i) reducing the native disulfide bonds in proteins and (ii) maintaining the essential thiol groups in proteins by preventing their oxidation to the disulfide state. Dithiothreitol (DTT) is the most popular disulfide-reducing reagent (1). DTT is, however, slow in reducing disulfides at pH 7-8. The pKa of the thiol groups in DTT is high (9.2), and therefore at pH 7 only a small fraction (~1%) of the thiol groups in DTT are present in the reactive thiolate form. We have developed several new dithiol reagents for rapid reduction of disulfide groups (2-6). These dithiol reagents reduce disulfide bonds by the mechanism of thiol-disulfide interchange (Eq 1), in which the dithiol R'(SH)2 first forms a mixed disulfide and then closes to a cyclic disulfide.

$$\mathrm{R'(SH)_2 + RSSR \;\rightleftharpoons\; R'(SH)(SSR) + RSH \;\rightleftharpoons\; \mathit{cyclo}\text{-}R'S_2 + 2\,RSH} \tag{1}$$
The design of these dithiol reagents is based on two requirements: (i) a low value of pKa (~7 to 8) of their thiol groups and (ii) a high reduction potential. The reactivity of a thiol is influenced
both by its fraction present in the thiolate form and by the nucleophilicity of the thiolate anion. A thiol group of low pKa has a significant fraction present in the reactive thiolate form, but the nucleophilicity of its thiolate anion is lower than it is for a thiol of higher pKa. The overall effect is that the apparent rate of thiol-disulfide interchange is maximum for a thiol whose pKa value is approximately equal to the pH of the solution (3). A dithiol reagent whose thiol groups have pKa values of ~7 to 8 and which has a high reduction potential is therefore expected to reduce disulfide bonds rapidly at pH 7-8. We have developed several new reagents [N,N'-dimethyl-N,N'-bis(mercaptoacetyl)hydrazine (DMH), bis(2-mercaptoethyl)sulfone (BMS) and meso-2,5-dimercapto-N,N,N',N'-tetramethyladipamide (DTA)] whose thiol groups have pKa values of ~7.8 (2-6). Based on Brønsted correlations, these reagents are expected to reduce disulfide groups at pH 7 faster than DTT by a factor of ~5 (3,5). This report focuses on the comparison of reactivities of BMS, DMH, and DTT toward disulfide groups in several proteins under nondenaturing conditions at pH 7.
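The dependence of the thiolate fraction on pKa can be illustrated with the Henderson-Hasselbalch relation. The sketch below treats each thiol as an independent monoprotic acid, which is a simplifying assumption for a dithiol; the function name is illustrative.

```python
def thiolate_fraction(pka: float, ph: float) -> float:
    """Fraction of a monoprotic thiol present as thiolate at the given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    for name, pka in (("pKa 9.2", 9.2), ("pKa 7.8", 7.8)):
        print(f"{name}: {100 * thiolate_fraction(pka, 7.0):.1f}% thiolate at pH 7")
```

At pH 7 this gives a thiolate fraction of somewhat under 1% for a pKa of 9.2 and roughly 14% for a pKa of 7.8. The thiolate fraction alone overstates the rate advantage, because the thiolate of the more acidic thiol is the weaker nucleophile, which is why the expected rate enhancement is only about fivefold.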
[Structures of BMS, DMH, DTA, and DTT.]

II. Materials and Methods
BMS and DTA are available from US Biochemical Corporation. The syntheses of BMS, DMH and DTA are straightforward from readily available materials (4-6). Papain-S-SCH3 was prepared as described before (7). Trypsinogen (bovine pancreas) and α-chymotrypsinogen A (bovine pancreas) were purchased from
Sigma. The murine monoclonal antibody anti-B4 (IgG1) was purified from hybridoma culture supernatants. BMS, DMH and DTA are solids at room temperature. We recommend that their stock solutions (~10 mM) in phosphate buffer (50 mM sodium phosphate, pH 7, 1 mM in EDTA) be prepared fresh by brief sonication to ensure complete solubilization. These solutions can be assayed for thiol groups by Ellman's assay (8).
A. Reduction of Papain-S-SCH3 Using BMS, DMH and DTT
Samples of papain-S-SCH3 (0.042 mg/mL, 1.8 µM) in deoxygenated 50 mM sodium phosphate buffer (pH 7, 2 mM in EDTA) were reduced using dithiol reagent (25 µM; BMS, DMH or DTT) at 23°C. At several time intervals (1, 6, 11, 16, and 21 min), aliquots (200 µL) of the reaction mixture were added to substrate (800 µL of 3.4 mM N-benzoyl-L-arginine-p-nitroanilide in 50 mM bis-tris buffer, pH 6.3, containing 1 mM EDTA and 5% v/v DMSO) and the rates of increase in absorbance at 410 nm were measured. The concentration of dithiol was in excess over that of papain-S-SCH3, and was therefore assumed to be constant during the course of reduction; the kinetics are therefore pseudounimolecular. For the reduction by DTT, the apparent rate constant (kapp) was calculated from the plot of -ln[{(maximum regenerated papain activity) - (regenerated papain activity)}/(maximum regenerated papain activity)] vs time, for which slope = kapp[Dithiol]. For the reductions using BMS and DMH, the regenerated papain activity was measured at 1 min in four separate experiments, and kapp was calculated using the rate equation: -ln[{(maximum regenerated papain activity) - (regenerated papain activity)}/(maximum regenerated papain activity)] = kapp[Dithiol]t.
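The single-time-point form of the rate equation can be evaluated directly. The following is a minimal sketch of that calculation; the function and parameter names are illustrative, and the numbers in the example call are hypothetical rather than measured values.

```python
import math


def k_app_single_point(activity: float, max_activity: float,
                       dithiol_molar: float, t_min: float) -> float:
    """Apparent rate constant (M^-1 min^-1) from one activity measurement.

    Solves -ln[(A_max - A) / A_max] = k_app * [Dithiol] * t for k_app,
    assuming the dithiol concentration stays constant (pseudo-first-order).
    """
    if not 0.0 <= activity < max_activity:
        raise ValueError("activity must lie in [0, max_activity)")
    unreacted_fraction = (max_activity - activity) / max_activity
    return -math.log(unreacted_fraction) / (dithiol_molar * t_min)


if __name__ == "__main__":
    # Hypothetical example: half of the maximum activity regenerated after
    # 1 min with 25 uM dithiol.
    print(k_app_single_point(0.5, 1.0, 25e-6, 1.0))
```

A single time point is useful when the reaction is too fast to follow over several intervals, at the cost of relying on one measurement per experiment.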
B. Reduction of Trypsinogen Using Dithiol
Samples of trypsinogen (5 mg/mL, 0.21 mM) in 50 mM sodium phosphate buffer (pH 7, 1 mM in EDTA) on ice
Rajeeva Singh and George M. Whitesides
(0°C) were reduced using dithiol (0.5 mM; BMS, DMH, DTT). At 10-, 20-, 30-, and 200-min time intervals, aliquots (200 µL) of the reaction mixture were purified by gel filtration and analyzed for thiol content using Ellman's assay and for protein concentration by measuring absorbance at 280 nm (2). Under these conditions a maximum of 0.6 disulfide residue was reduced per trypsinogen molecule. Assuming pseudounimolecular kinetics, the apparent rate constant (k_app) was calculated from the plot of -ln([remaining disulfide]/[maximum reducible disulfide]) vs time, for which slope = k_app [Dithiol].
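As a rough sketch of how the Ellman's assay reading translates into the quantities used in this plot, the following Python fragment converts an absorbance at 412 nm into thiols per protein molecule and then into disulfides reduced. The names are hypothetical, the extinction coefficient of 14,150 M^-1 cm^-1 is the commonly quoted value for the assay chromophore and is an assumption of this sketch, and the protein concentration is taken as an input because the 280 nm extinction coefficient of trypsinogen is not given here.

```python
EPSILON_TNB_412 = 14150.0  # M^-1 cm^-1; assumed, commonly quoted value


def thiols_per_protein(a412, protein_molar, path_cm=1.0,
                       epsilon=EPSILON_TNB_412):
    """Thiol groups per protein molecule from an Ellman's assay reading."""
    thiol_molar = a412 / (epsilon * path_cm)
    return thiol_molar / protein_molar


def disulfides_reduced(thiols):
    """Each reduced disulfide yields two thiol groups."""
    return thiols / 2.0


def fraction_remaining(disulfides_now, disulfides_max):
    """[remaining disulfide]/[maximum reducible disulfide] for the -ln plot."""
    return 1.0 - disulfides_now / disulfides_max
```

A reading that corresponds to 1.2 thiols per molecule gives 0.6 disulfide reduced, which matches the relationship stated in the text for trypsinogen.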
C. Reduction of α-Chymotrypsinogen A Using Dithiol
Samples of α-chymotrypsinogen A (6.8 mg/mL, 0.27 mM) in 50 mM sodium phosphate buffer (pH 7, 1 mM in EDTA) at room temperature were reduced using 4.8 mM dithiol. Under these reaction conditions a maximum of 0.75 disulfide residue per α-chymotrypsinogen A molecule was reduced (2). The analysis for reduction of α-chymotrypsinogen A was similar to that for trypsinogen.
D. SDS-PAGE Analysis of Reduction of Immunoglobulin by Dithiol
Samples of a murine immunoglobulin (IgG1, 6.3 mg/mL) in 50 mM sodium phosphate buffer (pH 7, 0.5 mM in EDTA) were reduced using dithiol (BMS, DMH, DTT; 4.8 mM). At several time intervals, aliquots (25 µL) of the reaction mixture were quenched using iodoacetamide (250 µL of a 0.3 M iodoacetamide solution in 50 mM sodium phosphate buffer, pH 7, 1 mM in EDTA), and analyzed by 4-12% gradient SDS-PAGE under nonreducing conditions (2).
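The size of the iodoacetamide excess in this quench can be checked with simple arithmetic. The sketch below uses hypothetical function names and counts only the thiol groups of the added dithiol (two per molecule, taken at their initial concentration as an upper limit); protein thiols are not included.

```python
def micromoles(volume_ul, conc_molar):
    """Amount in µmol: µL x mol/L = µmol."""
    return volume_ul * conc_molar


def quench_summary(aliquot_ul=25.0, dithiol_molar=4.8e-3,
                   quench_ul=250.0, iodoacetamide_molar=0.3):
    """Amounts and ratios for one quenched aliquot (defaults from the text)."""
    thiol_umol = 2 * micromoles(aliquot_ul, dithiol_molar)  # two SH per dithiol
    iaa_umol = micromoles(quench_ul, iodoacetamide_molar)
    return {
        "thiol_umol": thiol_umol,
        "iodoacetamide_umol": iaa_umol,
        "molar_excess": iaa_umol / thiol_umol,
        "dilution_factor": (aliquot_ul + quench_ul) / aliquot_ul,
    }
```

Under these assumptions the alkylating agent is present in an excess of a few hundred-fold over dithiol thiol groups, and the aliquot is diluted 11-fold on quenching.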
Table I. Comparison of Rate Constants for Reduction of Disulfide Bonds in Proteins Using Dithiol Reagents (DTT, BMS, DMH)¹

Protein                 Reduction conditions       k_DTT (M^-1 min^-1)   k_BMS/k_DTT   k_DMH/k_DTT
Trypsinogen             pH 7, 0°C                  8                     7.7           6.6
α-Chymotrypsinogen A    pH 7, 28°C; pH 7, 26°C     12; 9                 2.3           2.3
Papain-S-SCH3           pH 7, 23°C                 2700                  10            25

¹Rate constants (k) are apparent rate constants based on total dithiol concentration. The calculations of rate constants are described in the Methods section. The rate constants for trypsinogen and α-chymotrypsinogen A are from reference 2.
III. Results and Discussion
Table I shows a comparison of the apparent rate constants for the reduction of disulfide bonds in proteins using BMS, DMH and DTT. BMS and DMH reduce the disulfide bonds in proteins at pH 7 significantly faster than does DTT. The disulfide bond in trypsinogen is reduced more rapidly using BMS and DMH than using DTT by a factor of ~7 (Table I). The rate of reduction of trypsinogen by BMS is ~20% faster than by DMH (Figure 1). A maximum of 0.6 disulfide residue was reduced (i.e., 1.2 thiol residues were formed) per trypsinogen molecule under these reaction conditions. A selective cleavage of the 179-203 disulfide bond in trypsinogen has been reported under similar conditions of reduction (0.5 mM dithioerythritol, 0°C, pH 8.5; Ref. 9). The disulfide bond in α-chymotrypsinogen A is reduced about 2.3-fold faster using BMS and DMH than by DTT (Table I). A maximum of 0.75 disulfide group per α-chymotrypsinogen A molecule was reduced under the reduction conditions. The apparent rate constant for the reduction of the disulfide bond in
[Figure 1: plot against time (x-axis: Time, min; 0-40).]
Figure 1. Reduction of trypsinogen using dithiols [DTT (•), BMS (•), and DMH (▲)]. Trypsinogen (5 mg/mL, 0.21 mM) in 50 mM sodium phosphate buffer (pH 7.0, 1 mM in EDTA) was reduced using dithiol (0.5 mM) at 0°C. The curves plotted are based on the values of apparent rate constants shown in Table I.
α-chymotrypsinogen A by DTT at 26°C is similar to that for reduction of trypsinogen at 0°C (Table I). It is therefore predicted that the rate of cleavage of the disulfide bond in α-chymotrypsinogen A would be significantly slower than that for trypsinogen at the same temperature. The 191-220 disulfide bond in α-chymotrypsinogen A is reported to be less accessible than the analogous 179-203 disulfide bond in trypsinogen (9). The reactive disulfide bond in papain-S-SCH3 is reduced especially rapidly by DMH (Figure 2, Table I). The rates of reduction of papain-S-SCH3 using DMH and BMS are faster than that using DTT by factors of 25 and 10, respectively (Table I). The thiol group in papain has a low pKa (~4) and is essential for its activity. The inactive mixed disulfide of papain (papain-S-SCH3) is reactivated completely within 5 min using small concentrations of DMH and BMS (Figure 2).
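To see what these rate ratios imply for reactivation of papain, the pseudounimolecular rate law can be evaluated directly. The sketch below is an illustration only: it takes the DTT rate constant for papain-S-SCH3 (2700 M^-1 min^-1) from Table I, the factors of 10 and 25 from the text, and the 25 µM dithiol concentration from the Methods, and it assumes (as the Methods do) that the dithiol concentration stays constant. The function name is hypothetical.

```python
import math

K_DTT = 2700.0                                   # M^-1 min^-1, Table I
RATE_RATIO = {"DTT": 1.0, "BMS": 10.0, "DMH": 25.0}  # k / k_DTT from the text
DITHIOL_MOLAR = 25e-6                            # 25 µM, as in the Methods


def fraction_reactivated(reagent, t_min, dithiol_molar=DITHIOL_MOLAR):
    """Fraction of papain activity regenerated after t_min minutes,
    assuming pseudo-first-order kinetics with constant [dithiol]."""
    k_obs = K_DTT * RATE_RATIO[reagent] * dithiol_molar  # min^-1
    return 1.0 - math.exp(-k_obs * t_min)
```

Evaluated at 5 min, this model predicts nearly complete reactivation with DMH, roughly 97% with BMS, and under 30% with DTT, in line with the qualitative statement above.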
[Figure 2: plot against time (x-axis: Time, min; 5-20).]
Figure 2. Regeneration of activity of papain from papain-S-SCH3 using dithiols [DTT (•), BMS (•), and DMH (▲)]. Papain-S-SCH3 (0.042 mg/mL, 1.8 µM) in 50 mM sodium phosphate buffer (pH 7, 2 mM in EDTA) at 23°C was reduced using dithiol (25 µM). At several time intervals aliquots of reaction mixtures were added to substrate solution and the activities of papain were measured. The curves plotted are based on the values of apparent rate constants shown in Table I.
The disulfide bonds in immunoglobulin (IgG1) are reduced ~5-fold faster using DMH and BMS than using DTT (2). Murine IgG1 contains two heavy chains and two light chains; the two heavy chains are linked to each other by two disulfide bonds, and each heavy chain is linked to a light chain by a disulfide bond (10). SDS-PAGE analysis of iodoacetamide-quenched reaction mixtures of IgG1 and dithiols shows that the immunoglobulin molecule is cleaved significantly faster using DMH and BMS than using DTT (2).
IV. Conclusions
Both BMS and DMH reduce disulfide bonds in proteins at pH 7 faster than does DTT by a factor of ~5-7 in
nondenaturing conditions. Although the typical rate enhancement expected from using BMS and DMH instead of DTT is ~5 based on Brønsted correlations, variations are seen for some proteins: the relatively less accessible disulfide bond in α-chymotrypsinogen A is reduced 2.3-fold faster using BMS and DMH than using DTT; the highly reactive disulfide bond in papain-S-SCH3 is reduced faster using DMH than using DTT by a factor of 25. The values of equilibrium constants for the reduction of bis(2-hydroxyethyl) disulfide (Eq 1) for BMS, DMH and DTT are 60 M, 2 M and 180 M, respectively (4,5,11). BMS is therefore more reducing than DMH and slightly less reducing than DTT. All these dithiols (BMS, DMH, DTT) have high reduction potentials and reduce noncyclic disulfides completely. Although both BMS and DMH reduce disulfides at similar rates, we recommend the use of BMS because it is commercially available, it is odorless and it has a high reduction potential.
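The ranking of reducing strength can be put on a millivolt scale. Because the three equilibrium constants refer to the same reference disulfide, the difference in reduction potential between two dithiols follows from the standard relation ΔE = (RT/nF) ln(K1/K2) with n = 2. The sketch below applies that textbook relation; it is not a calculation taken from the chapter, the function name is hypothetical, and 25°C is assumed.

```python
import math

R = 8.314462618   # gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1
N_ELECTRONS = 2   # a dithiol/disulfide couple transfers two electrons

# Equilibrium constants (M) quoted in the text for reduction of
# bis(2-hydroxyethyl) disulfide
K_EQ = {"BMS": 60.0, "DMH": 2.0, "DTT": 180.0}


def potential_gap_mV(stronger, weaker, temp_kelvin=298.15):
    """How many mV more reducing `stronger` is than `weaker`.

    Assumption: 25 degrees C unless another temperature is passed in.
    """
    ratio = K_EQ[stronger] / K_EQ[weaker]
    return 1000.0 * R * temp_kelvin / (N_ELECTRONS * F) * math.log(ratio)
```

On this estimate DTT is about 14 mV more reducing than BMS, and BMS about 44 mV more reducing than DMH, which is consistent with describing BMS as only slightly less reducing than DTT.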
References
1. Cleland, W. W. (1964). Biochemistry 3, 480-482.
2. Singh, R., and Whitesides, G. M. (1994). Bioorg. Chem. 22, 109-115.
3. Singh, R., and Whitesides, G. M. (1993). In "Supplement S: The Chemistry of Sulphur-Containing Functional Groups" (Patai, S., and Rappoport, Z., eds.) 633-658, Wiley, London.
4. Lamoureux, G. V., and Whitesides, G. M. (1993). J. Org. Chem. 58, 633-641.
5. Singh, R., and Whitesides, G. M. (1991). J. Org. Chem. 56, 2332-2337.
6. Lees, W. J., Singh, R., and Whitesides, G. M. (1991). J. Org. Chem. 56, 7328-7331.
7. Singh, R., Blattler, W. A., and Collinson, A. R. (1993). Anal. Biochem. 213, 49-56.
8. Riddles, P. W., Blakeley, R. L., and Zerner, B. (1983). Methods Enzymol. 91, 49-60.
9. Sondack, D. L., and Light, A. (1971). J. Biol. Chem. 246, 1630-1637.
10. Edelman, G. M., and Gall, W. E. (1969). Annu. Rev. Biochem. 38, 415-466.
11. Lees, W. J., and Whitesides, G. M. (1993). J. Org. Chem. 58, 642-647.
Strategies for the Removal of Ionic and Non-Ionic Detergents From Protein and Peptide Mixtures For On- And Off-Line Liquid Chromatography Mass Spectrometry (LCMS)
Kristine M. Swiderek, Michael L. Klein, Stanley A. Hefta, and John E. Shively
Division of Immunology, Beckman Research Institute at the City of Hope, Duarte, CA 91010
I. Introduction
Ionic and non-ionic detergents are reagents widely used during the purification of proteins and peptides. For example, most membrane proteins have to be purified in the presence of detergents, and the purification of proteins by SDS-polyacrylamide gel electrophoresis (PAGE) is one of the standard procedures in many protein chemistry laboratories. Reagents such as Triton X-100, PVP-40 or PVP-360 are often added to the digestion of proteins from nitrocellulose or PVDF membrane (1,2,3) to increase the yield of digestion and the recovery of peptides. The presence of ionic detergents such as SDS is well known to destroy chromatographic resolution on reversed phase HPLC even at very low concentrations (Figure 1), making any subsequent mass spectral analysis impossible. Researchers in different laboratories have tried to improve the quality of chromatography by replacing ionic detergents with non-ionic detergents. However, the presence of many non-ionic detergents still interferes with mass spectral analysis. The ion signal deriving from the detergent often suppresses the ion signal of the peptides or protein present in the same fraction. In addition, detergents often elute over the whole gradient of the chromatography, contaminating every fraction with detergent. This problem arises whether the fractions are collected off-line and analyzed by mass spectrometry or continuously delivered into the mass spectrometer during on-line LCMS analysis. Furthermore, detergents often do not have one distinct mass which could be filtered out during the mass spectral analysis, resulting in a
wide array of masses related to the detergent. This makes it nearly impossible to decide which ion signals are related to the component of interest. Some protocols have been developed to remove different types of detergent. Dialysis is very impractical if only picomole amounts of material are available or if peptide mixtures have to be isolated. SDS can be removed by precipitation with guanidine hydrochloride (4); protein can be precipitated with trichloroacetic acid (TCA), leaving, for example, SDS in the supernatant. However, these techniques bear the risk of losing the sample, especially if only picomoles of protein or peptide are available. Here we present some strategies to remove ionic and non-ionic detergents during sample preparation at the low picomole level for micro HPLC and mass spectrometry.
II. Materials and Methods
Reversed phase HPLC was performed using fused silica columns and solvent delivery systems developed and built in our laboratory (5,6). All chromatographies were carried out on Vydac 5 µm C18 RP support with or without an SDS removal precolumn. SDS removal resin was obtained from PolyLC. Solvent A was 0.1 % trifluoroacetic acid (TFA) in water and solvent B was 0.07 % TFA, 90 % acetonitrile in water. Water was obtained from a Milli-Q system. Samples were eluted with a gradient from 2 % to 92 % solvent B in 45 minutes unless otherwise noted. As a standard, cytochrome C digested with Lys-C (CCKCD) was used. The separation of the peptides was carried out with or without SDS.
Hydrophilic interaction chromatography (HILIC) was performed on polyhydroxyethyl aspartamide (PHEA), obtained from PolyLC, packed in a PEEK column (0.53 mm ID x 3 cm) bounded by a stainless steel frit at the inlet and a glass fiber (Whatman GF/F) plug held in place by a mounted fused silica transfer line. Solvent A was 0.1% TFA; solvent B was 0.1% TFA, 90% acetonitrile. Peptide solutions were applied to the column after they were brought to 90% acetonitrile. The column was equilibrated in 100% B to a stable 214 nm absorbance; after injecting the sample and reestablishing the stable absorbance, a gradient of 100% to 0% B was applied to the column over 20 minutes, followed by 0% B for another 10 minutes.
All mass spectrometric analyses were carried out on a triple quadrupole mass spectrometer (TSQ-700) from Finnigan-MAT (San Jose, CA) equipped with an electrospray ion source (ESI) operating at atmospheric pressure. Mass spectra were recorded in the positive ion mode. The electrospray needle was operated at a differential of 3-4 kV, and the conversion dynode was set to -15 V. The drying gas was nitrogen and the temperature was set to about 200 °C. A sheath flow of 2-methoxyethanol was delivered at 2 µl/min and the nitrogen sheath gas was set at 50 psi. Samples were directly introduced into the source at a flow rate of 2 µl/min. Scans were taken continuously every three seconds.
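The two gradient programs described above are simple linear ramps, and the solvent composition at any time can be sketched as below. The function names are hypothetical, the gradient delay volume of the system is ignored, and the acetonitrile content assumes solvent A contains none and solvent B contains 90%, as stated in the text.

```python
def percent_b(t_min, start_b=2.0, end_b=92.0, duration_min=45.0):
    """Programmed %B at time t for a linear gradient (default: the
    reversed phase gradient, 2% to 92% B in 45 min)."""
    if t_min <= 0:
        return start_b
    if t_min >= duration_min:
        return end_b
    return start_b + (end_b - start_b) * t_min / duration_min


def percent_acetonitrile(b_percent, acn_in_b=90.0):
    """Acetonitrile content (% v/v) of the mixed solvent, assuming
    solvent A contains no acetonitrile."""
    return b_percent * acn_in_b / 100.0
```

For the HILIC run the same function describes the descending ramp when called as `percent_b(t, start_b=100.0, end_b=0.0, duration_min=20.0)`; the reversed phase gradient rises by 2% B per minute and ends at about 83% acetonitrile.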
III. Results and Discussion
The dramatic effect of the presence of SDS in the sample is demonstrated in Figure 1. 10 picomoles of a Cytochrome C standard digested with Lys-C (CCKCD) were injected and compared to the chromatography of the same standard in the presence of 0.1 % SDS. Immense peak broadening, loss of resolution and changes in retention times were observed. No useful mass spec data could be obtained (data not shown). The presence of even 10 times less SDS caused the same effects on the chromatography (data not shown).
Figure 1. Comparison of peptide separations from Cytochrome C Lys-C digested (CCKCD) in the absence and presence of SDS. The upper panel shows the separation of 10 picomoles of the digest on a micro capillary column (360 µm outer, 250 µm inner diameter, 200 mm length) filled with 5 µm Vydac C18 RP support. The column was prepared as described in Materials and Methods. The separation of the same sample on the same column in the presence of 0.1 % SDS is shown in the lower panel.
To reestablish the quality of the chromatography, a protocol for the removal of SDS had to be developed. Figure 2 shows a schematic drawing of how the SDS-removal column is attached to the reversed phase column. To prepare this column setup, fused silica tubing with the desired dimensions (360 µm outer, 250 µm inner diameter and 200 mm length for LC-MS analysis) was connected to a transfer tubing (180 µm outer and 50 µm inner diameter) holding a frit in place. The column was filled with reversed phase support as described in
(6). A piece of fused silica tubing of about 10 mm length (740 µm outer and 530 µm inner diameter) was then glued on top of the column and packed with the SDS-removal resin. At this point, the column was ready to use and could be connected to the chromatography system. The 50 µm inner diameter transfer line from the column introduced the sample either directly or through a UV detector to the mass spectrometer. It was important to avoid any dead volume between the two resins since this resulted in artifacts and loss of resolution during chromatography. 10 picomoles of the
[Figure 2 labels: Epoxy; SDS Removal Resin, 530 µm ID; C18 Reversed Phase, 250 µm ID; Glass Fiber, Zytex, PVDF; 50 µm ID Transfer Line to MS]
Figure 2. Schematic drawing of the SDS-removal precolumn directly connected to the C18 reversed phase HPLC column. The resins were packed in fused silica capillary tubings which were connected by epoxy glue. The length of the column was typically between 200 and 250 mm depending upon their inner dimensions.
Cytochrome C digestion mixture were applied onto this column and eluted with the standard gradient. The chromatogram in the upper panel of Figure 3 shows that the separation of the peptides is comparable to the separation using the unmodified micro capillary column (Figure 1, upper panel). There was some delay after the gradient onset due to the increased volume before the C18 reversed phase column. However, the on-line LCMS analysis of this peptide separation demonstrated clearly that the presence of the precolumn did not interfere with the mass analysis (Figure 4a). Apart from the delay at the beginning of the run, the LC-MS analyses with or without a precolumn can be compared directly (data not shown). As an example, the spectra collected over peak I were averaged and the resulting spectrum is displayed in the inset of Figure 4a. After adding SDS to a final concentration of 0.1 %, another aliquot of 10 picomoles was injected onto the column (lower panel of Figure 3). The peptides separated almost identically compared to the chromatography in the absence of SDS, indicating that the detergent was retained by the precolumn and no longer interfered with the peptide separation (compare to the lower panel of Figure 1). LC-MS analysis of the separated peptides likewise indicated that the chromatography of the sample with SDS is almost identical to the peptide separation without SDS. The spectra collected over the peak which corresponds to peak I (peak II) were averaged to compare the quality of the mass spectral data in the presence and absence of detergent (Figure 4b). There is virtually no difference from the spectrum generated from the same peak in the analysis of the sample without detergent. No adduct formation
Figure 3. Comparison of peptide separations from Cytochrome C Lys-C digested (CCKCD) in the absence and presence of SDS with SDS-removal column (x-axis: Time, min). As described in Materials and Methods, the micro capillary column (360 µm outer, 250 µm inner diameter, 200 mm length) was filled with 5 µm Vydac C18 RP support and had in addition a precolumn filled with SDS-removal resin attached to it. The upper panel shows the separation of 10 picomoles of the digest without SDS; in the lower panel the separation of the same sample on the same column in the presence of 0.1 % SDS is shown.
was observed, nor a reduction of signal intensity. Applying this technique to samples contaminated with SDS will improve chromatography as well as mass spectral analysis. However, it cannot be excluded that some peptides or proteins might interact with the precolumn and be retained in the same manner as SDS. The usefulness of the SDS removal precolumn was demonstrated using an LC-MS system on a micro capillary column with an inner diameter of 250 µm. SDS-removal precolumns can be used on micro capillary columns of larger dimensions as well. Our experience has shown that it is
[Figure 4: two base peak profiles, panels (a) and (b), plotted against scan number (x-axis: Scan, 200-800; y-axis ticks: 50, 100), with insets I and II.]
Figure 4. LC-MS analysis of Cytochrome C Lys-C digestion mixtures in the absence (a) and presence (b) of SDS. The base peak profile of the mass spectral analyses is displayed. The insets I and II show the corresponding m/z spectra acquired over peaks I and II.
important, however, to keep dead volumes between the columns as small as possible to avoid artifacts during chromatography. It is therefore recommended that the precolumn be attached directly in front of the reversed phase column using the same technique described above.
HILIC has previously been shown to select for biomolecules on the basis of relative hydrophilicity (7, 8). The work of these authors, and the PHEA resin manufacturer's instructions, indicated that peptides would be
retained by this matrix in buffers of low pH and/or acetonitrile concentrations no higher than 85%. Such chromatography was performed with standard size (4.6 x 250 mm) stainless steel columns, using small, relatively hydrophilic peptides as test samples. Using 20 mM sodium acetate (pH 5.0) in 85% acetonitrile, the test peptide ALFHGRVSWAMFPNGK (ALF) was retarded but not retained by the PHEA capillary column, as it eluted just after the injection artifact peak (data not shown). Additionally, we have been unable to get the test peptide hCMV pp65 107-114 (EPMSIYVY), which has no basic residues and an acidic residue directly
[Figure 5: three chromatograms (top, middle, bottom); x-axis: Retention time (minutes), 0-45.]
Figure 5. HILIC separation of the test peptide ALF from Triton X-100 and PVP-40. Solvents: A = 0.1% TFA in water; B = 0.1% TFA in 90% acetonitrile. Gradient: 100% B for 10 min after injection at retention time = 0; 100 to 0% B in 20 min; 0% B for 15 min. Top: 200 pmol ALF, 111 nmol Tris; middle: 200 pmol ALF, 11.4 µg RTX-100, 111 nmol Tris; bottom: 200 pmol ALF, 1.1 µg PVP-40, 111 nmol Tris. Peaks labeled a are ALF; the peak labeled p is PVP-40. Full scale absorbance for each chromatogram is 0.212 AU.
adjacent to the alpha-amino group, to stick in any buffer at pH 2.8 or pH 5. We surmise that the early-eluting peaks in chromatograms shown by others (7, 8) might be attributed not to retention but to retardation, since they used a much longer and larger column in their studies than we have used (4.6 x 250 mm vs. 0.53 x 30 mm). We were able to have ALF retained by our capillary PHEA column when it was equilibrated with 90% acetonitrile (Figure 5). It should be noted that ALF was not retained by the column when the peptide was dissolved in a non-buffered solution. Triton X-100 (middle panel) was not retained by the column under these conditions, while ALF was eluted by lowering the acetonitrile concentration. We have also found that Nonidet P-40, another nonionic detergent, is not retained by the PHEA column (data not shown); we thus expect that this column should be useful in removing any nonionic detergent from peptide samples. Since PVP-360 is insoluble in 90% acetonitrile, we could not test the ability of PHEA to separate it from peptide samples. The lower panel of Figure 5 shows the separation of the nonionic blocking agent PVP-40 from ALF by HILIC. Unlike Triton X-100, PVP-40 is retained by the column; nonetheless, we were able to separate the blocking agent from ALF by HILIC. We were interested in using HILIC as a general method for removing nonionic contaminants from a wide variety of peptides, including some which are hydrophobic. Using the PHEA capillary column, we have been unable to get a peptide which has no basic residues and an acidic residue directly adjacent to the alpha-amino group to stick in any buffer at pH 2.8 or pH 5. Thus, it appears that this chromatographic procedure is limited to peptides known to contain basic residues or non-N-terminal acidic residues.
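The size difference behind the retention-versus-retardation argument can be made concrete by comparing the internal volumes of the two column formats quoted above. The sketch below computes empty-tube volumes only, ignoring the volume occupied by the packing, and the function name is hypothetical.

```python
import math


def tube_volume_ul(inner_diameter_mm, length_mm):
    """Empty-tube volume in µL (1 mm^3 = 1 µL); packing is ignored."""
    radius = inner_diameter_mm / 2.0
    return math.pi * radius ** 2 * length_mm


standard_column = tube_volume_ul(4.6, 250.0)   # column format used by others
capillary_column = tube_volume_ul(0.53, 30.0)  # PHEA capillary column
volume_ratio = standard_column / capillary_column
```

On this basis the standard column holds roughly 4 mL against under 7 µL for the capillary, a ratio of several hundred, so a weakly interacting peptide would be expected to spend far longer in the larger column even without true retention.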
Acknowledgments
The authors would like to thank Mike T. Davis for his technical input on column preparation and liquid chromatography. This work was supported by NIH grants CA 33572 and RR 06217.
References
1. Aebersold, R. H., Leavitt, J., Saavedra, R. A., Hood, L. E., and Kent, S. B. (1987) Proc. Natl. Acad. Sci. USA 84, 6970-6974.
2. Fernandez, J., DeMott, M., Atherton, D., and Mische, S. M. (1992) Anal. Biochem. 201, 255-264.
3. Henzel, W. J., personal communication.
4. Shively, J. E. (1986) in "Methods of Protein Microcharacterization" (Shively, J. E., ed.) 41-87.
5. Lee, T. D., and Davis, M. T. (1992) Protein Science 1, 935.
6. Swiderek, K. M., Lee, T. D., and Shively, J. E. (1994) Methods in Enzymology, in press.
7. Alpert, A. J. (1990) J. Chromatogr. 499, 177-196.
8. Zhu, B.-Y., Mant, C. T., and Hodges, R. S. (1991) J. Chromatogr. 548, 13-24.
Online Preparation Of Complex Biological Samples Prior To Analysis By HPLC, LC/MS And/Or Protein Sequencing
Ken Stoney and Kerry Nugent
Michrom BioResources, Inc., Auburn CA 95603
I. Introduction
Although modern analytical techniques are very powerful for trace-level characterization of proteins and peptides in complex biological samples, most samples require some degree of preparation prior to final analysis (1,2). The reason for this is that sample matrices and dilute samples generally degrade the performance of analytical techniques. Operations such as concentration, desalting, buffer exchange and detergent removal, which can correct matrix problems, are usually performed off line; this is often time consuming and can result in loss of the analytes of interest when working at low picomole levels (3). A series of micro trap cartridge columns have been developed which address the removal of several of the most common interfering substances found in biological matrices; these cartridges can be used in conjunction with HPLC analysis, or prior to Mass Spec, AAA or Protein Sequencing Analysis. Each type of trap cartridge has a chemistry which is uniquely suited to removal of a particular interfering substance. The first of these micro trap cartridges functions in salt removal and sample concentration. This micro trap cartridge can also be used to remove buffers and salts from biological samples, or perform a buffer exchange to a system more compatible with the final analytical technique. Mass Spectrometry is extremely sensitive to nonvolatile salts, which can cause instrumental problems, interfere with ionization of samples and make data interpretation much more difficult. Protein sequencers can also have difficulties with salts, especially phosphate, which can result
in a suppressed signal. For sample concentration, this trap cartridge uses reversed-phase HPLC chemistry to concentrate aqueous samples at the head of the trap, then releases the sample in a 10 µl volume when the cartridge is eluted with strong solvent. Volume reduction can be extremely important for such techniques as microbore/capillary HPLC and protein sequencing. With small ID HPLC columns running at low flow rates (5-50 µl/min), sample loading time can be as long as the separation time, in addition to adding large loop volumes to the gradient delay volume of the system. The use of multiple injections can get around sample volume problems, but adds significantly to the time involved for either procedure.
Two different detergent removal cartridges have also been developed to help clean up biological samples prior to final analysis. The first of these removes anionic detergents such as Sodium Dodecyl Sulfate (SDS). SDS is widely used for solubilization of peptides and proteins, and is present in samples isolated from SDS-PAGE gels. With even trace levels of SDS present in the sample, chromatographic resolution and repeatability are compromised, and RP HPLC columns quickly become contaminated. Trace levels of SDS bound to peptides and proteins also make mass spectral interpretation more difficult, and excess SDS can interfere with peptide and protein ionization and make it impossible to interpret the mass spectral data, especially when the analytes of interest are present at trace levels. SDS may also interfere with AAA and protein sequence analysis, and excess levels of SDS should be removed from samples prior to final analysis if accurate data are to be expected (4,5). The SDS removal cartridge uses a strong anion exchange chemistry to retain SDS molecules while proteins and peptides are released for analysis.
The second detergent removal cartridge cleans up samples containing non ionic detergents (NID), such as Triton X-100, Tween 80, etc.
NIDs are commonly used by protein chemists to help solubilize hydrophobic proteins. Like SDS, these detergents can interfere with both the chromatography and mass spectral interpretation for samples analyzed by LC/MS. They offer an even greater challenge than SDS, since they tend to be broad range mixtures of long chain surfactants that elute from a RP HPLC column over part or the entire range where proteins elute from the column. The non ionic detergent removal cartridge uses a mixed-bed ion-exchange chemistry and a multistep procedure to isolate proteins of interest from these potential interferences.
II. Materials and Methods
Protein standards and detergents were obtained from Sigma Chemical Company, St. Louis, MO. Horse heart myoglobin from Sigma was digested with trypsin using a standard protocol from Promega. HPLC solvents were obtained from Burdick and Jackson and trifluoroacetic acid (TFA) was obtained from Pierce Chemical Company. All of the HPLC instrumentation, accessories, HPLC columns and micro trap cartridges are products of Michrom BioResources. All of the HPLC separations were done using an Ultrafast Microprotein Analyzer (UMA) from Michrom BioResources, and all of the trap cartridges were used in an injection loop on the 10-port valve built into the UMA. All of the peptide separations were run on a 1.0 x 150 mm column packed with 5 µm 300 Å Reliasil C18 (Column Engineering, Ontario, CA) at 50 µl/min, using a 20 minute gradient from 5-65% acetonitrile in water with 0.1% TFA. All of the protein separations were run on a 1.0 x 50 mm column packed with 8 µm 4000 Å PLRP-S (Polymer Labs, Amherst, MA) at 100 µl/min, using a 5 minute gradient from 5-65% acetonitrile in water with 0.1% TFA.
III. Results and Discussion
A. Sample Concentration & De-Salting
The top chromatogram in Figure 1 shows the separation of a myoglobin tryptic digest after 20 repetitive injections of 50 µl of 0.1 pmol/µl dissolved in a 2 M Urea solution. The large solvent front is due to the Urea, which is not retained on the RPLC column. If this sample were run directly into an electrospray Mass Spectrometer, the large amount of Urea would greatly disturb the ion source, and could potentially plug the interface. The separation looks good because all of the peptides were concentrated at the head of the column during the isocratic loading at 2% acetonitrile in water with 0.1% TFA. What is not shown is that the loading time for this sample was 40 minutes, with each 50 µl injection having been washed out of the loop for two minutes prior to the next loading, and this procedure repeated 20 times. In the lower chromatogram in Figure 1, the same 1000 µl sample has
Figure 1. Concentration & removal of Urea from myoglobin tryptic digest prior to microbore LC/MS using a Michrom Peptide Trap Cartridge. Sample contains 100 pmol of digest dissolved in 1000 µl of 2 M Urea. Sample is run on a 1 x 150 mm RC-18 column.
been loaded in 2 minutes onto a peptide trap cartridge built into the injection loop of the HPLC; the cartridge is then rinsed with 100 µl of initial mobile phase (5% acetonitrile in water with 0.1% TFA) to flush all of the Urea from the trap, prior to switching the sample on line to the analytical column for rapid gradient separation of the protein mixture. The large solvent front is absent from this lower trace, showing that the Urea has been selectively removed from the sample without loss of analyte. This rapid on-line desalting is accomplished by placing a Michrom protein or peptide trap cartridge in the injector loop, where samples containing salts can be loaded onto the trap; then the trap is rinsed with a volatile weak solvent (Solvent A from the HPLC) to ensure complete removal of all the salts. The proteins and peptides of interest can then be immediately stepped off the trap cartridge with a strong solvent (Solvent B from the HPLC) into a few microliters of sample volume, or injected into an HPLC flow stream for further separation and/or analysis by LC/MS. This technique allows rapid concentration of samples from 20-10,000 µl down to a 10 µl volume, and has been successfully employed to remove a wide range of salts and buffers (up to 8 M) from a variety of protein and peptide samples.
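The loading-time and volume-reduction figures quoted above follow from simple arithmetic, sketched here with hypothetical function names. The eluted concentration assumes complete recovery from the trap unless a lower recovery is supplied; that is an assumption of the sketch, not a measured value.

```python
import math


def repetitive_loading(sample_ul, loop_ul, minutes_per_injection):
    """Number of loop injections and total loading time (min) when a
    sample is loaded in loop-sized portions."""
    injections = math.ceil(sample_ul / loop_ul)
    return injections, injections * minutes_per_injection


def concentration_factor(sample_ul, elution_ul=10.0):
    """Volume reduction achieved when the trap is eluted in elution_ul."""
    return sample_ul / elution_ul


def eluted_concentration(amount_pmol, elution_ul=10.0, recovery=1.0):
    """Concentration (pmol/µl) after elution; recovery=1.0 is assumed."""
    return amount_pmol * recovery / elution_ul
```

For the 1000 µl sample, `repetitive_loading(1000, 50, 2)` reproduces the 20 injections and 40 minutes described for loop loading, while `concentration_factor(1000)` gives the 100-fold volume reduction obtained with the trap.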
Online Sample Preparation
Figure 2. On-line SDS removal from myoglobin tryptic digest prior to microbore HPLC, with/without SDS Removal Trap Cartridge. Sample contains 100 pmol of digest with/without 0.1% SDS. Sample is run on a 1 x 150 mm RC-18 column.
B. Anionic Detergent (SDS) Removal
In Figure 2, the upper trace shows the separation of 100 picomoles of a myoglobin tryptic digest injected from a standard 20 µl loop without any SDS in the sample. The middle chromatogram shows the separation of the same 100 picomole myoglobin digest, but in a 0.1% SDS solution, also injected from a standard 20 µl loop. The SDS binds to the peptides, making them all more hydrophobic and more similar in overall polarity, so that the peaks are bunched up at the end of the chromatogram, with much worse overall resolution. The bottom chromatogram shows the same sample as the middle trace (in 0.1% SDS), but for this run the loop was replaced with a micro SDS removal cartridge / loop system which was able to remove most of the SDS from the sample, such that the subsequent peptide separation was now very similar to that in the upper trace. The Michrom SDS removal cartridge can remove SDS at concentrations up to 1% (higher concentrations form micelles and trap analytes with the SDS micelle complex; these samples must be diluted below 1% prior to analysis), and can remove up to 1 mg of SDS from a single sample. When these cartridges are used in a 10-port loop
injector, the SDS can be removed while the peptides and proteins are trapped at the head of the RPLC column. The trap cartridge can then be switched out of line and back flushed (manually or automatically) with strong solvent during the HPLC run to completely remove the SDS from the cartridge without having it go through the HPLC column. This SDS cartridge also removes Coomassie Blue dye, an anionic stain commonly used in slab gel electrophoresis.
C. Non Ionic Detergent Removal
In Figure 3, the lower trace shows the separation of 200 ng of three standard proteins (insulin, lysozyme and alpha lactalbumin), without any detergent in the sample. The upper trace shows the same protein standard in a solution of 1.0% Triton X-100 detergent. Since Triton X-100 absorbs in the UV, a very large peak is seen for the detergent, but the peaks for the
Figure 3. On-line removal of Triton X-100 from proteins prior to microbore HPLC, with/without NID Removal Trap Cartridge. Sample contains 200 ng of 3 protein standard with/without Triton. Sample is run on a 1 x 50 mm PLRP-S column.
three standard proteins are completely obscured by the large detergent peak. The middle trace shows the results of running the same sample as in the upper trace (3 protein standard in 1.0% Triton X-100), but using the non ionic detergent removal protocol. The middle and bottom traces are nearly identical, showing that the majority of the detergent has been removed from the sample by the non ionic detergent removal system.
A multistep, automated procedure has been developed to selectively remove these non ionic interferences from proteins, using a non ionic detergent (NID) removal column in series with a protein trap cartridge, prior to separation on an RP HPLC column. Protein samples are loaded onto a mixed bed ion exchange precolumn (1 x 10 mm) with 10% acetonitrile in 10 mM buffer (pH 5-8 depending on the pI of the protein) with the protein trap cartridge out of line in the valve loop. The protein(s) of interest are trapped on the precolumn (up to 1 mg of total protein), while the non ionic interferences are flushed to waste with the load solvent. The protein trap cartridge is then switched in line, and the proteins are eluted from the precolumn onto the trap cartridge with 10% acetonitrile in 2 M NaCl buffer (pH 5-8). The salts are then washed out of the trap cartridge with 10% acetonitrile in 0.1% TFA. Finally, the trap cartridge is returned in line to the RP HPLC column and the proteins are eluted, detergent free, through the RP HPLC column with the appropriate gradient conditions.
IV. Conclusions
Complex biological samples often require some degree of sample preparation prior to final analysis by HPLC, LC/MS, Mass Spectrometry, AAA or Protein Sequencing due to sample matrix interferences. Although many off line procedures have been employed in the past to deal with problems of low concentration, high salt background and interferences from other additives such as detergents, these techniques can be time consuming and may result in significant loss of the analytes of interest when working at low levels. On-line sample preparation techniques, such as the trap cartridge systems described here, offer distinct advantages of minimizing prep time and maximizing sample recovery over other conventional sample preparation protocols. Use of the RP salt removal/concentration trap cartridges provides a rapid means for volume reduction, salt elimination and/or buffer exchange, since the entire procedure takes place in less than 5 minutes. With volumes
reduced from 10,000 µl down to 10 µl, samples can now be placed directly onto sequencing filters without further handling; volume reduction also avoids additional system volume in microbore HPLC. Removal of salts improves instrumental performance and data interpretation for mass spectrometry, AAA and protein sequence analysis. The SDS removal trap cartridges are able to remove up to 1 mg of SDS from a given sample. Use of these cartridges eliminates interference with reversed-phase separations and avoids the column fouling that is commonly experienced when SDS is a part of the separation. Removal of SDS from biological samples also improves the accuracy of analysis by MS, AAA or sequencing. Non-ionic detergent (NID) removal cartridges used in the multistep procedure outlined in this paper offer efficient cleanup, in addition to concentration of samples. Removal of NIDs restores reversed-phase HPLC separation of proteins and avoids problems with mass spectral data interpretation. Future work is planned with these cartridges to see if they can be extended to other contaminants such as PolyEthylene Glycol (PEG), and a trap cartridge for removal of non ionic detergents from peptide samples is also being investigated.
References
1. Atherton, D. (1989) In "Techniques in Protein Chemistry" (Hugli, T.E., ed.) 273-283.
2. Slattery, T.K. and Harkins, R.N. (1993) In "Techniques in Protein Chemistry IV" (Angeletti, R.H., ed.) 443-452.
3. Nugent, K.D. and Nugent, P.W. (1990) Biochromatography 5(3), 142-148.
4. Stone, K.L., LoPresti, M.B., Williams, N.D., Crawford, J.M., DeAngelis, R. and Williams, K.R. (1989) In "Techniques in Protein Chemistry" (Hugli, T.E., ed.) 377-391.
5. Jeno, P., Scherer, Manning-Krieg, U. and Horst, M. (1993) Anal. Biochem. 215, 292-298.
Methods For The Purification And Characterization Of Calcium-Binding Proteins From Retina
Arthur S. Polans, Krzysztof Palczewski, Wojciech A. Gorczyca, and John W. Crabb
R.S. Dow Neurological Sciences Institute, Good Samaritan Hospital, Portland, OR 97209; Depts. Ophthalmology and Pharmacology, Univ. of Washington, Seattle, WA 98195; Protein Chemistry Facility, W. Alton Jones Cell Science Center, Lake Placid, NY 12946
I. Introduction
More than 200 EF hand calcium-binding proteins have been identified; however, the function of only a few is known (1). Calmodulin regulates various enzymes including myosin light chain kinase, CaM kinase, brain adenylate cyclase and erythrocyte Ca2+-ATPase (2). Troponin C modulates troponin I and thereby the interaction of troponin/tropomyosin during muscle contraction (3). Recoverin, a photoreceptor-specific calcium-binding protein (4), appears to be involved in the termination of the transduction cascade (5), perhaps by blocking the phosphorylation of photoexcited rhodopsin (6). Recoverin also has been identified as an autoantigen in a degenerative disease of the retina known as cancer-associated retinopathy (7); several calcium-binding proteins have been implicated in neurodegenerative diseases (8) and in the development or progression of tumors. Capl, for example, is an S-100-related calcium-binding protein capable of inducing the metastatic phenotype in several human and rodent cell lines (9) and recently was identified in retinal preparations (10). This chapter outlines methods we have used for purifying and characterizing calcium-binding proteins from ocular tissues and for determining their calcium binding parameters. The procedures are adaptable to a variety of tissues and hopefully will facilitate further investigations of calcium-binding proteins.
TECHNIQUES IN PROTEIN CHEMISTRY VI Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.
Arthur S. Polans et al.
II. Methods And Materials
Several calcium-binding proteins including calmodulin, recoverin, Capl and the S-100 proteins can be readily enriched owing to their calcium-dependent binding to phenyl-Sepharose. Calcium-binding proteins like calbindin D-28K, however, do not bind to phenyl-Sepharose but can be easily purified by anion exchange chromatography as a function of the calcium concentration. Calcium-binding proteins that are more difficult to purify, like GCAP (Guanylate Cyclase Activating Protein), require several chromatography steps for isolation. A schematic of the possible purification strategies is given in Figure 1.
A. Purification of Recoverin, Calmodulin, Capl and S-100β
1. Tissue Preparation - Fifty bovine retinas are homogenized in a minimum of 60 ml of buffer (50 mM HEPES, 1 mM EDTA, 100 mM NaCl, pH 7.5, containing 1-2 µg/ml each of aprotinin, leupeptin and pepstatin) using a 1-inch Teflon pestle followed by six passes with a glass-on-glass tissue grinder. Homogenates are centrifuged (JA-17 rotor, Beckman Instruments) at 39,000 x g for 20 min at 4°C, and the supernatants are removed and adjusted to a final concentration of 2 mM CaCl2.
2. Phenyl-Sepharose Chromatography - A phenyl-Sepharose (Pharmacia Fine Chemicals, Piscataway, NJ) column (1 x 7 cm) is equilibrated at a rate of 15 ml/hr with 50 mM HEPES, 2 mM CaCl2, 100 mM NaCl, pH 7.5, for a minimum of 3 hr at 4°C prior to use. After application of an extract, the column is washed with the same buffer until the A280 returns to baseline (for larger quantities of protein this requires overnight washing). The bound proteins are eluted with 50 mM HEPES, 10 mM EDTA, 100 mM NaCl, pH 7.5, at the same flow rate, and 1 ml fractions are collected. EGTA can substitute for EDTA during the elution, and concentrations of 1 mM are effective.
3. Mono Q Chromatography - For the purification of recoverin, fractions from the phenyl-Sepharose eluate containing protein are combined and dialyzed against 10 mM 1,3-bis[tris(hydroxymethyl)methylamino]propane (BTP) buffer, pH 8.4, containing 1 mM EDTA. Aliquots are applied to a Mono Q column (HR 5 x 50 mm; Pharmacia Fine Chemicals) equilibrated with the same buffer. A linear gradient of NaCl (0-0.25 M NaCl) in the same buffer is developed over 20 min at a rate of 0.5 ml/min, and recoverin elutes at 125 mM NaCl. Approximately 1 mg of purified recoverin is obtained per fifty retinas.
4. Reversed-Phase High-Performance Liquid Chromatography (RP-HPLC) - The retinal phenyl-Sepharose eluate is loaded directly onto a 4.6 x 150 mm C4 column (Phenomenex W-Porex 5, Torrance, CA). The gradient is developed at 0.6 ml/min using buffer A (0.1% v/v trifluoroacetic acid (TFA) in water) and buffer B (80% CH3CN, 0.08% v/v TFA) with the following program: 0-45% buffer B, 5 min; 45-68% buffer B, 85 min; 68-95% buffer B, 90 min; 95-100% buffer B, 95 min. Four to six abundant calcium-binding proteins, including recoverin, calmodulin, Capl and S-100β, are purified by
Methods for Studying Calcium-Binding Proteins
Figure 1. Flow Diagram for the Purification of Calcium-binding Proteins from Retina. [Diagram: the retinal extract is applied to phenyl-Sepharose in the presence of Ca2+. Bound fraction (eluted with EGTA): Mono Q yields recoverin; OM yields Capl; RP-HPLC yields calmodulin, Capl, S-100β and the acylated forms of recoverin. Unbound fraction (+EGTA) is applied to DEAE-Sepharose: calbindin is eluted with Ca2+; material eluted with NaCl is carried through hydroxylapatite and RP-HPLC to yield GCAP.]
these procedures, and heterogeneously fatty acid modified forms of recoverin are resolved.
5. Organomercurial Chromatography - Affi-Gel 501 (BioRad Laboratories, Richmond, CA) is washed extensively with 50 mM HEPES, 10 mM EDTA, 100 mM NaCl, pH 7.5 prior to packing a 1 x 2 cm column at 4°C. The column is equilibrated in the same buffer at a flow rate of 15 ml/hr. The retinal eluate from the phenyl-Sepharose column is applied directly to the organomercurial column, and fractions are collected until the A280 returns to baseline. A second peak of protein is eluted from the column using the same buffer containing 10 µM dithiothreitol (DTT). A final peak containing purified Capl (100 µg per fifty retinas) is obtained using a buffer with 1-10 mM DTT.
B. Purification of Calbindin D-28K
Calbindin, a 28 kDa vitamin D-dependent calcium binding protein of undefined function, can be purified from bovine retinal homogenates (11). Retinas are disrupted in Tris buffer (0.01 M Tris-Cl, pH 8.0), centrifuged (10,000 x g, 20 min), and ammonium sulfate is added to the supernatant to yield 70% saturation. After stirring for 3 hr at 4°C, the solution is centrifuged, and the soluble material is precipitated overnight using 100% ammonium sulfate. After centrifugation (100,000 x g, 1 hr) the pellet is resuspended in Tris buffer containing 0.1 mM EGTA and 120 mM NaCl and applied to a DEAE-Sepharose column. The column is washed with Tris-EGTA buffer containing 90 mM NaCl, and calbindin D-28K finally is eluted using the same buffer with 0.2 mM CaCl2 in place of EGTA. Since
calbindin D-28K does not bind to phenyl-Sepharose, the flow-through from the phenyl-Sepharose column can be used as an alternative starting material for the simultaneous purification of calbindin D-28K and other calcium-binding proteins, as indicated in Figure 1.
C. Purification of GCAP
GCAP, a 23.5 kDa mediator in the calcium-sensitive regulation of guanylate cyclase, is purified from the low salt extract of retinal rod outer segment membranes (12). The extract is applied to a DEAE-Sepharose column equilibrated with 5 mM BTP buffer, pH 7.5, containing 50 mM NaCl. Elution is accomplished with a NaCl gradient (100-350 mM). Fractions containing GCAP activity are purified further by hydroxylapatite chromatography using a gradient of phosphate (0-60 mM KH2PO4) and NaCl (100-0 mM) followed by RP-HPLC (C4, W-Porex 5) using an acetonitrile gradient (30-60%) in 5 mM BTP, pH 7.5. Rod outer segments from 150 bovine retinas yield 5-20 µg of purified GCAP. Like calbindin D-28K, GCAP can be retrieved in the flow-through from a phenyl-Sepharose column, and this material can be used for the simultaneous purification of several calcium-binding proteins, as indicated in Figure 1. In addition, a different activator of bovine retinal guanylate cyclase recently was isolated by other procedures (13).
D. Characterization of Purified Calcium-Binding Proteins
1. Protein Sequencing, Peptide Synthesis, Amino Acid Analysis and Mass Spectrometry - Methods for protein modification, proteolysis, RP-HPLC peptide purification and automated Edman degradation are well documented (7,10,14) as are methods for FMOC synthesis and myristylation of synthetic peptides (15) and amino acid analysis (16).
2. Flow Dialysis - Dialysis tubing is boiled in 5% NaHCO3 and then in deionized water. Buffer, 10 mM HEPES-KOH, 100 mM KCl, pH 7.5, is treated for the removal of contaminating calcium by Chelex chromatography (BioRad Laboratories). EDTA is removed from protein samples by passage over two Sephadex G-25 PD-10 columns. Calcium is removed from protein samples dialyzed against HEPES-KCl buffer by Chelex chromatography just prior to flow dialysis. The concentration of calcium is determined in all solutions by atomic absorption spectrophotometry. Flow dialysis follows the procedures of Colowick and Womack (1969) (17) as modified by Haiech et al. (1981) (18). The dialysis cell is described by Feldman (1978) (19). A 2 ml aliquot of protein (1-2 mg/ml) is placed into an upper Teflon chamber, and both the upper and lower chamber (0.1-0.2 ml) are temperature controlled at 25°C. A peristaltic pump that perfuses the lower chamber is set at a flow rate of 1 ml/min. Non-radioactive calcium is added in steps to yield a final calcium concentration of 0-700 µM. Data are analyzed using the Hill equation and the Adair-Klotz equation.
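The two binding models named at the end of the flow dialysis procedure can be written out explicitly. The following Python sketch is not the authors' analysis code; it assumes the common textbook forms of the Hill equation and of the two-site Adair-Klotz equation with stepwise association constants, and all function and parameter names are illustrative.

```python
def hill(ca: float, n_sites: float, k_half: float, h: float) -> float:
    """Hill equation: moles of calcium bound per mole of protein.

    ca      free calcium concentration
    n_sites number of binding sites (saturation value)
    k_half  free calcium giving half-saturation (same units as ca)
    h       Hill coefficient (h = 1 means non-cooperative binding)
    """
    return n_sites * ca ** h / (k_half ** h + ca ** h)


def adair_klotz_two_site(ca: float, k1: float, k2: float) -> float:
    """Two-site Adair-Klotz equation with stepwise association constants k1, k2."""
    numerator = k1 * ca + 2.0 * k1 * k2 * ca ** 2
    denominator = 1.0 + k1 * ca + k1 * k2 * ca ** 2
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical example: two identical, independent sites with an intrinsic
    # association constant k (per uM). Statistical factors give k1 = 2k, k2 = k/2.
    k = 0.1
    for ca_um in (1.0, 10.0, 100.0, 700.0):
        print(ca_um,
              round(hill(ca_um, 2.0, 1.0 / k, 1.0), 4),
              round(adair_klotz_two_site(ca_um, 2.0 * k, k / 2.0), 4))
```

For identical, independent sites the two expressions coincide, which is why a Hill coefficient close to 1 is usually read as non-cooperative binding.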
III. Results
Calcium-binding proteins often become more hydrophobic in the presence of calcium and bind reversibly to hydrophobic matrices, such as phenyl-Sepharose. Figure 2 illustrates the enrichment of several retinal calcium-binding proteins by phenyl-Sepharose chromatography. A retinal extract was adjusted to 2 mM CaCl2 and applied to a phenyl-Sepharose column. An aliquot of the extract was subjected to SDS-polyacrylamide gel electrophoresis (lane a), and proteins which did not bind to the column are shown in lane b. Proteins that bound to the column and subsequently were eluted by the chelation of calcium with EDTA are shown in lane c. The four prominent protein bands were identified by amino acid sequence analysis as recoverin (26 kDa), calmodulin (17 kDa), Capl (12 kDa) and S-100β (9 kDa). Two additional protein bands at 23 kDa and 28 kDa also were observed routinely. The 23 kDa protein is neurocalcin, a homolog of recoverin, while the 28 kDa protein has not been identified. The identified proteins are calcium-binding proteins of the EF hand type, demonstrating the usefulness of phenyl-Sepharose chromatography for the initial enrichment of this family of proteins. Recoverin was purified further by anion exchange chromatography using a Mono Q column (Figure 2, lane d). The purification of recoverin was accomplished using a linear gradient of NaCl (0-250 mM), and recoverin eluted at approximately 125 mM NaCl.
Figure 2. Purification of Calcium-binding Proteins. A bovine retinal extract (lane a) is applied to a phenyl-Sepharose column, and after washing the column (lane b), four prominent proteins are eluted by the addition of EDTA (lane c). Recoverin is purified from the phenyl-Sepharose eluate by Mono Q chromatography (lane d). The eluate from the phenyl-Sepharose column also can be applied to an organomercurial column: flow-through (lane e), 10 µM DTT wash (lane f), and Capl is eluted with 10 mM DTT (lane g). Calbindin D-28K does not bind to phenyl-Sepharose but is eluted from a DEAE-Sepharose column by the addition of calcium (lane h). (Reproduced by permission from ref. 10). [Gel lanes a-h; molecular mass markers at 94, 67, 43, 30, 20.1 and 14.4 kDa.]
Retinal calcium-binding proteins obtained from the phenyl-Sepharose column can be separated by RP-HPLC (20). Calmodulin elutes first, followed by Capl. S-100β elutes as two distinct peaks, while recoverin is separated into a series of at least three overlapping peaks (20). Mass spectrometry revealed that these peaks correspond to the different fatty acid modified forms of recoverin. The first peak contains the C14:2 and C12:0 acylated forms. The second recoverin peak consists of the C14:1 species and the third peak C14:0. The same procedure may be used to isolate similarly modified proteins, including the photoreceptor G protein transducin. Calcium-binding proteins can differ in their cysteine content and, therefore, the number and accessibility of free sulfhydryl groups. Organomercurial binds free sulfhydryl-containing proteins by forming mercaptide bonds that can be reversed by the addition of reducing agent. When the eluate from the phenyl-Sepharose column was applied to an organomercurial column, calmodulin was detected in the flow-through (Figure 2, lane e), since calmodulin has no cysteine residues. Some recoverin and S-100β also were present in this fraction and could be eluted entirely by the addition of 10 µM dithiothreitol (DTT) (Figure 2, lane f). Although their accessibility is unknown, the elution profile correlates with the number of cysteine residues; recoverin has one cysteine residue, while S-100β has two. Capl has four cysteine residues and was eluted at higher concentrations of DTT (1-10 mM) (Figure 2, lane g). The primary structure of purified Capl was determined by Edman analysis and MS/MS (10), and two EF hands were identified within the Capl sequence; the COOH-terminal EF hand is a canonical type consisting of 12 amino acids in the calcium binding loop, while the NH2-terminal EF hand is a variant comprised of 14 amino acids.
Calbindin D-28K is an example of a calcium-binding protein of the EF hand family that is not enriched through its calcium-dependent binding to phenyl-Sepharose. Instead, it can be readily purified by binding to DEAE-Sepharose in low calcium and eluted by increased calcium (Figure 2, lane h). Our studies demonstrate that calbindin D-28K is expressed in peripheral and perifoveal cone cells of the human retina but not in parafoveal or foveal cones, and this pattern of expression parallels the degeneration of photoreceptor cells seen in some humans with rod-cone dystrophies. Molecular cloning and cDNA sequence analysis indicate that vertebrate GCAPs contain 201-205 amino acids, three EF-hand calcium binding motifs and a putative site for N-terminal myristoylation at Gly 2 (15). LC/MS and MS/MS analyses of the purified protein indicate that the N-terminus of GCAP is heterogeneously N-acylated, and synthetic peptide competition experiments demonstrate that the N-terminal 47 residues of GCAP may contain the domain that interacts with guanylate cyclase (15). Notably, GCAP maintains biological activity following RP-HPLC in acetonitrile at neutral pH; this technique should not be discounted in developing
purification methods for other biologically active proteins. GCAP is now one of the few calcium-binding proteins for which the function is known: it confers Ca2+-sensitivity to guanylate cyclase as shown in Figure 3. Once proteins have been purified, probably the most accurate means of determining the number of calcium binding sites and their affinities is by flow dialysis (see Haiech et al., 1981 for further discussion). Figure 4 shows a fractional saturation curve for the binding of calcium to purified recoverin. The curve is asymptotic at n=2, indicating that recoverin contains two calcium binding sites. An additional, lower affinity calcium binding site would cause the curve to drift upwards rather than reach an asymptote. The binding data can be analyzed using either the Hill equation or the Adair-Klotz equation. Analysis of the data reveals a Hill coefficient of 0.99, indicating that the two calcium binding sites on recoverin are independent (non-cooperative). The data, not corrected for any loss of calcium, are fit by a single binding constant, K3=0.11 x 10'^M\ roughly comparable to calmodulin.
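A Hill coefficient such as the one quoted above is commonly estimated from the slope of a Hill plot, log[Y/(1 - Y)] against log[Ca], where Y is the fractional saturation. The Python sketch below illustrates that calculation on synthetic data generated for two independent sites; the data, the dissociation constant of 10 µM and the function names are assumptions chosen for illustration and are not the measurements shown in Figure 4.

```python
import math


def hill_slope(ca_values, bound_per_protein, n_sites):
    """Least-squares slope of log10(Y / (1 - Y)) versus log10([Ca]).

    Y is the fractional saturation, bound_per_protein / n_sites.
    Points with Y <= 0 or Y >= 1 are skipped because the logit is undefined.
    """
    xs, ys = [], []
    for ca, bound in zip(ca_values, bound_per_protein):
        y = bound / n_sites
        if ca > 0 and 0.0 < y < 1.0:
            xs.append(math.log10(ca))
            ys.append(math.log10(y / (1.0 - y)))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def synthetic_independent_sites(ca_values, n_sites, kd):
    """Synthetic binding data for n identical, independent sites (hypothetical)."""
    return [n_sites * ca / (kd + ca) for ca in ca_values]


if __name__ == "__main__":
    ca_um = [1.0, 3.0, 10.0, 30.0, 100.0, 300.0, 700.0]
    bound = synthetic_independent_sites(ca_um, n_sites=2, kd=10.0)
    print(round(hill_slope(ca_um, bound, n_sites=2), 3))
```

A slope near 1 corresponds to independent sites, while positive cooperativity would give a slope greater than 1.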
IV. Conclusion Several strategies have been provided for the purification and characterization of calcium-binding proteins of the EF hand family. Members of this family often can be enriched through calcium-dependent binding to a hydrophobic matrix. Although Calbindin D-28K, for example, does not bind to phenyl-Sepharose, it can be purified through binding to
Figure 3. Calcium titration of guanylate cyclase activity with and without 400 nM purified GCAP (left panel). Reproduced in part by permission from ref. 12.
Figure 4. Calcium binding parameters for recoverin (27 µM) purified by phenyl-Sepharose and Mono Q chromatography (right panel).
DEAE-Sepharose in low calcium. In addition to these relatively soluble calcium-binding proteins, other less soluble EF hand proteins (e.g., GCAP) can be purified by relatively simple procedures. Such isolation procedures will be of increasing importance for studying signal transduction pathways that involve calcium.
Acknowledgments
This work was supported by USPHS Grants EY07089 (ASP), EY08061 and EY01730 (KP), and EY06603 and DK38639 (JWC). Further support included an award from Research to Prevent Blindness, Inc. to the Department of Ophthalmology at the University of Washington. KP is the recipient of a Jules and Doris Stein Research to Prevent Blindness Professorship.
References
1. Heizmann, C.W. and Hunziker, W. (1991) TIBS 16, 98-103.
2. Cohen, P. and Klee, C.B. (eds.) (1988) Calmodulin, Vol. 5: Molecular Aspects of Cellular Regulation, Elsevier Science Publishing Co., New York.
3. Grabarek, Z., Tao, T., and Gergely, J. (1992) J. Muscle Res. Cell Motil. 13, 383-393.
4. Dizhoor, A., Ray, S., Kumar, S., Niemi, G., Spencer, M., Brolley, D., Walsh, K., Philipov, P., Hurley, J., and Stryer, L. (1991) Science 251, 915-918.
5. Gray-Keller, M.P., Polans, A.S., Palczewski, K., and Detwiler, P.B. (1993) Neuron 10, 523-531.
6. Kawamura, S. (1993) Nature 362, 855-857.
7. Polans, A.S., Buczylko, J., Crabb, J., and Palczewski, K. (1991) J. Cell Biol. 112, 981-989.
8. Heizmann, C.W. and Braun, K. (1992) TINS 15, 259-264.
9. Ebralidze, A., Tulchinsky, E., Grigorian, M., Afanasyeva, A., Senin, V., Revazova, E., and Lukanidin, E. (1989) Genes and Development 3, 1086-1093.
10. Polans, A.S., Palczewski, K., Asson-Batres, M.A., Ohguro, H., Witkowska, D., Haley, T.L., Baizer, L., and Crabb, J.W. (1994) J. Biol. Chem. 269, 6233-6240.
11. Maruyama, K., Ebisawa, K., and Nonomura, Y. (1985) Anal. Biochem. 151, 1-6.
12. Gorczyca, W., Gray-Keller, M., Detwiler, P., and Palczewski, K. (1994) Proc. Natl. Acad. Sci. USA 91, 4014-4018.
13. Dizhoor, A., Lowe, D., Olshevskaya, E., Laura, R., and Hurley, J. (1994) Neuron 12, 1345-1352.
14. Crabb, J.W., Johnson, C.M., Carr, S.A., Armes, L.G., and Saari, J.C. (1988) J. Biol. Chem. 263, 18678-18687.
15. Palczewski, K., Subbaraya, I., Gorczyca, W., Helekar, B., Ruiz, C., Ohguro, H., Huang, J., Zhao, X., Crabb, J., Johnson, R., Walsh, K., Gray-Keller, M., Detwiler, P., and Baehr, W. (1994) Neuron 13, 1-20.
16. West, K. and Crabb, J. (1992) In "Techniques in Protein Chemistry III" (Angeletti, R.H., ed.) Academic Press, San Diego, 295-304.
17. Colowick, S.P. and Womack, F.C. (1969) J. Biol. Chem. 244, 774-777.
18. Haiech, J., Klee, C., and Demaille, J.G. (1981) Biochemistry 20, 3890-3897.
19. Feldman, K. (1978) Anal. Biochem. 88, 225-235.
20. Polans, A., Crabb, J., and Palczewski, K. (1993) In "Methods in Neurosci. 15" (Hargrave, P., ed.) Academic Press, San Diego, 248-260.
Evidence for the Presence of α-Bungarotoxin in Venom-Derived κ-Bungarotoxin
James J. Fiordalisi and Gregory A. Grant Departments of Medicine and Molecular Biology & Pharmacology Washington University School of Medicine St. Louis, Missouri 63110
I. Introduction
The venoms of certain species of poisonous snakes contain a large family of homologous peptide neurotoxins (66-76 residues) that act as acetylcholine antagonists by binding with high affinity to the nicotinic acetylcholine receptor (nAChR), preventing the proper functioning of this receptor as a ligand-gated ion channel. In addition to sequence similarities, the three-dimensional structures of several of these toxins have been determined and shown to be very similar. In spite of these similarities, the post-synaptic neurotoxins can be defined as either α-neurotoxins or κ-neurotoxins according to functional differences. The α-neurotoxins, of which over ninety have been identified, bind to the neuromuscular subtype of nAChR (Kd = 1 nM). In contrast, the more recently identified κ-neurotoxins, of which four are known, bind to certain neuronal nAChR (K
II. Materials and Methods
Preparation of toxins: α-bgt was purified from crude Bungarus multicinctus venom (Miami Serpentarium, Salt Lake City, UT) (11). Venom-derived κ-bgt was obtained from Biotoxins (St. Cloud, FL). Recombinant κ-bgt was expressed in E. coli as previously reported (8, 9). The toxin was characterized by automated sequence and compositional analyses, mass spectrometry, CD spectroscopy (10) and analytical ultracentrifugation (12).
Quail fibroblast-expressed muscle receptor assays: QT-6 quail fibroblasts expressing the mouse fetal muscle nAChR, produced as previously described (13), were grown in complete growth medium containing Earle's Balanced Salt Solution (EBSS, Sigma Chemical, St. Louis, MO), 10% (v/v) tryptose phosphate broth (TPB, GIBCO, Grand Island, NY), 5% fetal bovine serum (FBS, Hyclone, Logan, UT), 1% dimethylsulfoxide, 100 U/mL penicillin and 100 µg/mL streptomycin. Cells were grown at 37°C in a humidified 5% CO2-95% air atmosphere until confluent (2-3 days). Twenty-four hours prior to use, the cells were exposed to 3.5 mM butyrate. The binding assays were performed at room temperature and in duplicate. Cells were washed three times with 300 µL EBSS containing 10 mM HEPES, pH 7.3, supplemented with 0.2% FBS, 120 mM glucose, 100 U/mL penicillin and 100 µg/mL streptomycin (EBSS+) and then preincubated for 15 minutes with various concentrations of the toxins resuspended in 300 µL EBSS+. After preincubation, the cells were rapidly washed three times with EBSS+ and immediately exposed to 300 µL 10 nM 125I-α-bgt (prepared as previously described (14)) in EBSS+ for 5 minutes. The cells were washed three times with EBSS+ after removal of the radiolabel. Cells were freed from the plates with 300 µL 0.1 N NaOH. The plates were washed with 300 µL EBSS+, which was pooled with the NaOH wash. Receptor-bound radiolabel was counted in a TM-analytic 1191 gamma counter. Total binding of 125I-α-bgt was determined in the absence of inhibiting toxin during the preincubation and non-specific binding was determined in the presence of 1 nM unlabelled α-bgt during the preincubation.
Chick skeletal muscle receptor assays: 14 day embryonic chicks were sacrificed by decapitation. Thigh muscle tissue was removed, cleaned and homogenized in 2.3 times (w/v) chick HEPES Tyrode (CHT) buffer, pH 7.2, containing 10 mM HEPES, 150 mM NaCl, 3 mM KCl, 5 mM CaCl2, 2 mM MgCl2. The homogenate was centrifuged at 16000 x g for 10 minutes and the pellet was resuspended in the original volume of CHT. Muscle homogenate (9 µL) was preincubated for 2 hrs. in the presence of various concentrations of toxin in a total volume of 25.7 µL. Controls were identical except for the absence of inhibiting toxin in both the total binding and non-specific binding controls and the presence of 1 µM α-bgt in the non-specific control. All reactions were performed in duplicate and at room temperature. After the two hour preincubation period, 4.3 µL of 70 nM 125I-α-bgt was added to each reaction and mixed. After seven minutes, the binding reactions were essentially stopped by diluting with 300 µL cold CHT. The reactions were centrifuged for 10 minutes at 16000 x g. The pellets were washed with 300 µL of CHT, centrifuged and counted after removal of the supernatant. This procedure was performed at least three times for each recombinant construct tested. Data were analyzed by GraphPAD-InPlot software or SigmaPlot. The chick skeletal muscle assay and the quail fibroblast-expressed muscle receptor assay described earlier are fundamentally the same type of assay. Both involve comparing the abilities of recombinant κ-bgt, venom-derived κ-bgt and α-bgt to bind to nAChR populations at various concentrations and to block the
binding of 125I-α-bgt. During the pre-incubation period, the toxins to be assayed develop an equilibrium between their free and receptor-bound states that is dependent upon their affinities for the receptor. Over a range of concentrations, a particular toxin will produce a dose-response curve from which the concentration of toxin that blocks 50% of specific 125I-α-bgt binding to the muscle receptor (inhibitory concentration-50% or IC50) can be determined. The IC50s can be used to compare directly the relative affinities of the recombinant toxins.
Direct determination of dissociation constants: A second mathematical analysis can also be applied to the data in order to directly calculate the dissociation constants (Kd) for the toxins at the muscle receptor (15). Assuming a standard bimolecular reaction scheme for the binding of toxin to receptor, Colquhoun et al. derived two equations:
$$\frac{dp_B(t)}{dt} = K_B\,[B]\,\bigl[1 - p_B(t) - p_I(t)\bigr]$$

and

$$p_I(t) = \bigl[1 - p_B(t)\bigr]\,\frac{x_I}{x_I + K_I}$$

where $p_B(t)$ and $p_I(t)$ are the fractions of receptors occupied by ¹²⁵I-α-bgt and competing toxins, respectively, at time t; $x_I$ is the concentration of competing toxin; $K_B$ and $K_I$ are the dissociation equilibrium constants of ¹²⁵I-α-bgt and the competing toxin, respectively; and [B] is the concentration of ¹²⁵I-α-bgt. By combining these equations and integrating to solve for $p_B(t)$, the rate constant (k) for the binding of ¹²⁵I-α-bgt to the receptor can be derived and shown to conform to the following relationship:

$$k = \frac{K_B\,[B]}{1 + x_I / K_I}$$

If R is then defined as the ratio of k in the absence of competing toxins ($k_{ab}$) to k in the presence of competing toxins ($k_i$), it can be shown that

$$R = \frac{k_{ab}}{k_i} = \frac{x_I}{K_I} + 1$$

and that

$$R - 1 = \frac{x_I}{K_I}$$

When R - 1 is plotted on the y-axis against competing toxin concentration ($x_I$), a straight line is produced. When R - 1 = 1, $K_I = x_I$, thus yielding the $K_I$, or in this case, the Kd.
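The linear relation between R - 1 and competing toxin concentration can be illustrated with a short calculation. The following is only a minimal sketch in Python, not the authors' analysis procedure: the concentrations and the value `assumed_k_i_nM` are hypothetical illustration values, and the function name `fit_k_i` is introduced here for convenience. It fits a straight line through the origin and reads off K_I as the reciprocal of the slope, which is the concentration at which R - 1 = 1.

```python
# Sketch: recover K_I from the linear relation R - 1 = x_I / K_I.
# All numbers below are hypothetical illustration values, not data from this chapter.

def fit_k_i(concentrations, r_minus_one):
    """Fit a least-squares line through the origin and return K_I = 1 / slope."""
    sum_xy = sum(x * y for x, y in zip(concentrations, r_minus_one))
    sum_xx = sum(x * x for x in concentrations)
    slope = sum_xy / sum_xx          # slope of R - 1 versus x_I equals 1 / K_I
    return 1.0 / slope

assumed_k_i_nM = 160.0                                   # hypothetical competitor K_I
x_i_nM = [10.0, 30.0, 100.0, 300.0, 1000.0]              # hypothetical toxin concentrations
r_minus_one = [x / assumed_k_i_nM for x in x_i_nM]       # noise-free synthetic R - 1 values

k_i_est_nM = fit_k_i(x_i_nM, r_minus_one)
print(f"estimated K_I = {k_i_est_nM:.1f} nM")
```

With real measurements the R - 1 values would carry experimental scatter, so the fitted K_I would come with an uncertainty that this sketch does not estimate.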
III. Results and Discussion

The functional comparison of venom-derived and recombinant κ-bgt is predicated on the assumption that the recombinant toxin is properly folded and accurately reflects the native affinity of κ-bgt for the muscle receptor. This point
Table 1: Comparison of the ability of venom-derived (vd-) and recombinant (r-) κ-bgt to block the binding of ¹²⁵I-α-bgt to the mouse fetal muscle nAChR expressed in QT6 quail fibroblasts.

Toxin | Toxin concentration | % specific binding
vd-κ-bgt | 430 nM | 63
r-κ-bgt | 430 nM | 100
[Figure 1 plot not reproduced. Y-axis: ¹²⁵I-α-bgt binding, scale 0 to 100; x-axis: Toxin Concentration (nM), 0.1 to 10,000.]

Figure 1: Inhibition of ¹²⁵I-α-bgt binding to chick skeletal muscle by venom-derived κ-bgt ( # ), recombinant κ-bgt ( O ) and α-bgt ( •§ ).
[Figure 2 plot not reproduced. X-axis: [Toxin] (nM).]

Figure 2: Direct determination of dissociation constants (Kds) for recombinant κ-bgt ( O ), venom-derived κ-bgt ( # ) and α-bgt ( v ).
Table 2: Summary of chick skeletal muscle nAChR binding data. Muscle nAChR ((α1)2βγδ).

Toxin | IC50 (µM) | Log Kd ± SD | Kd (µM)
α-bgt | 0.0007 | -9.2 | 0.0006
vd-κ-bgt | 0.15 | -6.8 | 0.16
r-κ-bgt | 11.4 | -5.0 ± 0.19 | 10

The IC50 values for each toxin determined from the dose-response curves (Figure 1) are shown in column 1. The Log Kd and the Kd values calculated for each toxin in Figure 2 are shown in columns 3 and 4, respectively.
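The Log Kd and Kd columns of Table 2 can be cross-checked against each other. The sketch below is an illustration rather than part of the original analysis: it assumes that Log Kd is the base-10 logarithm of Kd expressed in mol/L, an assumption that is consistent with the tabulated values, and the dictionary keys are labels chosen here.

```python
# Sketch: convert the Log Kd values of Table 2 to Kd in micromolar.
# Assumption (not stated in the table): Log Kd is log10 of Kd in mol/L.

log_kd_molar = {
    "alpha-bgt": -9.2,
    "vd-kappa-bgt": -6.8,
    "r-kappa-bgt": -5.0,
}

def kd_micromolar(log10_kd_molar):
    """Return Kd in micromolar from log10(Kd in mol/L)."""
    return (10.0 ** log10_kd_molar) * 1e6

kd_uM = {toxin: kd_micromolar(value) for toxin, value in log_kd_molar.items()}

for toxin, kd in kd_uM.items():
    print(f"{toxin}: Kd = {kd:.4g} uM")
```

Under that assumption the converted values agree, after rounding, with the Kd column of the table (0.0006, 0.16 and 10 µM).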
was addressed by analytical ultracentrifugation and CD spectroscopy (9,12). The CD spectra of venom-derived and recombinant κ-bgt were identical and indicated the expected presence of β-sheet. The application of analytical ultracentrifugation to an analysis of toxin conformation is based on the unique ability of the κ-neurotoxins to form non-covalent dimers in solution (12,16). It was believed that the dimerization process would depend significantly on the toxin monomers being properly folded and, therefore, that proper folding could be inferred from dimerization. In addition to these methods of physical characterization, recombinant κ-bgt was subjected to functional analysis. Both recombinant and venom-derived κ-bgt have shown an identical ability to block the nAChR in the chick ciliary ganglion, indicating that the recombinant toxin had attained its functional conformation (10).

As shown in Table 1, the apparent affinity of recombinant κ-bgt for the mouse fetal muscle receptor expressed in quail QT-6 fibroblasts is significantly lower than that of venom-derived κ-bgt. In spite of previous reports, it is not clear from these data whether κ-bgt has any affinity for the muscle receptor, since no displacement of ¹²⁵I-α-bgt was seen at 430 nM, the Kd observed with venom-derived κ-bgt. However, due to limited toxin supply, it was not possible to address this point by assaying higher concentrations of recombinant and venom-derived κ-bgt using the fibroblast assay. The chick skeletal muscle assay, however, is performed in smaller reaction volumes and requires less toxin. Using this assay, it was possible to analyze toxin concentrations up to 10 µM. Figure 1 shows the dose-response curves produced by venom-derived and recombinant κ-bgt. The data generated with α-bgt are also shown for comparison.
This analysis shows that κ-bgt does have some affinity for the muscle nAChR (IC50 = 11.3 µM) but that it is 1-2 orders of magnitude lower than that of venom-derived κ-bgt (IC50 = 0.15 µM). An analysis of the same data by a different mathematical treatment (15) allows for the direct calculation of dissociation constants for the toxins assayed (Figure 2). As summarized in Table 2, the Kds for venom-derived and recombinant κ-bgt are significantly different (Kd = 0.16 µM and 10 µM, respectively) and are consistent with the IC50 values calculated earlier.

In Table 1, if we hypothesize α-bgt contamination of venom-derived κ-bgt of just 0.2% w/w, then the concentration of α-bgt that produced a 37% muscle receptor block would have been 0.86 nM. This is consistent with the known affinity of α-bgt for the muscle receptor (Kd < 1 nM) (Table 2). Whether or not the observed muscle receptor block is due in part to contamination by α-bgt can be answered definitively only by further purifying venom-derived κ-bgt and showing by direct analysis that the active component in the sample is α-bgt. However, based on the work presented here, it is plausible to suggest that commercially available κ-bgt might be contaminated with small but pharmacologically significant amounts of α-bgt, in the same way that commercially available α-bgt was shown often to be contaminated by κ-bgt. The potential cross-contamination of venom-derived toxins suggested by these observations should be considered in the design and interpretation of any future studies involving such toxins.
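The contamination estimate above is simple arithmetic, sketched here in Python for transparency. This is an illustration, not part of the original work: it assumes that the two toxins have similar molar masses so that a w/w fraction can be treated as a molar fraction, and the variable names are chosen here. The Kd value used for comparison is the α-bgt entry from Table 2.

```python
# Sketch: alpha-bgt concentration implied by a hypothetical 0.2% contamination.
# Assumption: similar molar masses, so 0.2% w/w is treated as 0.2% mol/mol.

total_toxin_nM = 430.0          # venom-derived kappa-bgt concentration used in Table 1
contaminant_fraction = 0.002    # hypothesized 0.2% alpha-bgt contamination

alpha_bgt_nM = total_toxin_nM * contaminant_fraction

kd_alpha_bgt_nM = 0.6           # alpha-bgt Kd from Table 2 (0.0006 uM)
ratio_to_kd = alpha_bgt_nM / kd_alpha_bgt_nM

print(f"implied alpha-bgt concentration: {alpha_bgt_nM:.2f} nM")
print(f"concentration relative to Kd: {ratio_to_kd:.2f}")
```

The implied concentration is of the same order as the Kd of α-bgt, which is the sense in which the text calls the hypothesis consistent with the known affinity.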
Acknowledgments We would like to thank Dr. Joseph Henry Steinbach and Dr. Vincent A. Chiappinelli for their assistance with the fibroblast and chick skeletal muscle binding assays. We also thank Regina Al-Rabiee, Mark Crankshaw and the staff of the Washington University Protein and Nucleic Acid Chemistry Laboratory for their technical help.
References
1. Boulter, J., J. Connolly, E. Deneris, D. Goldman, S. Heinemann, and J. Patrick (1987) Proc. Natl. Acad. Sci. USA, 84: p. 7763-7767.
2. Chiappinelli, V. A. and R. E. Zigmond (1978) Proc. Natl. Acad. Sci. USA, 75: p. 2999-3003.
3. Loring, R. H., V. A. Chiappinelli, R. E. Zigmond, and J. B. Cohen (1984) Neurosci., 11: p. 989-999.
4. Rosenthal, J. A., S. H. Hsu, D. Schneider, L. N. Gentile, N. J. Messier, C. A. Vaslet, and E. Hawrot (1994) J. Biol. Chem., 269: p. 11178-11185.
5. Boyot, P., L. Pillet, F. Ducancel, J. Boulain, O. Tremeau, and A. Menez (1990) FEBS Lett., 266: p. 87-90.
6. Ducancel, F., J.-C. Boulain, O. Tremeau, and A. Menez (1989) Protein Engineering, 3: p. 139-143.
7. Pillet, L., O. Tremeau, F. Ducancel, P. Drevet, S. Zinn-Justin, S. Pinkasfeld, J.-C. Boulain, and A. Menez (1993) J. Biol. Chem., 268: p. 909-916.
8. Fiordalisi, J. J., C. H. Fetter, A. TenHarmsel, R. Gigowski, V. A. Chiappinelli, and G. A. Grant (1991) Biochem., 30: p. 10337-10343.
9. Fiordalisi, J. J., R. L. Gigowski, and G. A. Grant (1992) Protein Expression and Purification, 2: p. 282-289.
10. Fiordalisi, J. J., R. Al-Rabiee, V. A. Chiappinelli, and G. A. Grant (1994) Biochemistry, 33: p. 3872-3877.
11. Chiappinelli, V. A. (1983) Brain Res., 277: p. 9-21.
12. Fiordalisi, J. J. and G. A. Grant (1994) Techniques in Protein Chemistry V, J. Crabb, Editor. 1994, Academic Press: p. 269-274.
13. Phillips, W. D., C. Kopta, P. Blount, P. D. Gardner, J. H. Steinbach, and J. P. Merlie (1991) Science, 251: p. 568-570.
14. Wolf, K. M., A. Ciarleglio, and V. A. Chiappinelli (1988) Brain Res., 439: p. 249-258.
15. Colquhoun, D. and H. P. Rang (1976) Molecular Pharmacology, 12: p. 519-535.
16. Chiappinelli, V. A. and K. M. Wolf (1989) Biochem., 28: p. 8543-8547.
Progress in the Development of Solvent and Chromatography Systems Appropriate for Bitopic Membrane Proteins

Song-Jae Kil, Lisa M. Oleksa, Geoffrey C. Landis, and Charles R. Sanders II

Dept. of Physiology and Biophysics, Case Western Reserve University, Cleveland, OH 44106
I. Introduction

While membrane proteins are the subject of much concern, there remains a need to develop generally applicable methods for solubilizing, purifying, and reconstituting such proteins. This is true not only for large multitopic membrane proteins, but also for smaller bitopic (single span) transmembrane peptides (c.f., ref. 1). The principal emphasis of this paper will be upon the development of methods of particular relevance to the handling of this latter class of polypeptides. This work was undertaken in our lab as a component in the continuing development of an experimental approach to membrane protein structural determination (see refs. 2-5).

Transmembrane peptides containing a single hydrophobic stretch of 15 to 25 residues plus polar residues on both sides of this stretch are notoriously difficult to solubilize and purify using solvent and chromatographic systems which have traditionally proven quite effective in studies of polypeptides with less than 50 residues. In this contribution, results are presented from semisystematic screens of solvent and chromatography systems for specific application to bitopic membrane proteins. A partial bibliography of prior work completed in this area is also presented. It is hoped that this paper may serve as a concise guide to the current status of the field (which we make no attempt to review, outside of the bibliography) and may point to promising directions for future development.

TECHNIQUES IN PROTEIN CHEMISTRY VI Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.
II. Materials and Methods

Polyamino acids were purchased from Sigma and were polydisperse, having the following average lengths: polyAla (42), polyPhe (15 or 34), polyLeu (242). Crude P16 and P24 (6) were prepared in-house or were a gift from R. Hodges and are H2N-K2GL16K2-COOH and H2N-K2GL24K2-COOH, respectively. Gramicidin D was also from Sigma. Crude DGK3M was prepared by Chiron Mimotopes and has a sequence which is similar to the third transmembrane segment of the integral membrane protein, diacylglycerol kinase (7): H2N-DLIAIAVAVTTWLILAFSHK-COOH. Crude Cor3 was prepared by Jim Elliot of the Keck Biotechnology Resource Lab at Yale University and represents the third transmembrane segment of CorA, a Mg2+ transporter (8): amide-KWSFGYPGAHFMILAGLAPYLYFKR-acetyl. Most solvents were from Fisher or Aldrich. It should be noted that at the time this study was carried out, the least expensive source of hexafluoroisopropanol we could find was Hoechst Advanced Technical Group (Summit, NJ) ($285/pound). Glass-backed thin layer chromatography (TLC) plates were from Whatman (Clifton, NJ). TLC plates were visualized by spraying or dipping in a ninhydrin solution followed by gentle heating of the plate with a heat gun (9), or by dipping into 50% sulfuric acid followed by 10 minutes of incubation at room temperature, "drying" with a heat gun, and continued incubation, during which time spots appeared and were recorded over a period of hours.
III. Results and Discussion

A. Solubility Studies: Organic Solvents

We first screened about 2000 solvent systems for their ability to solubilize polyAla, polyPhe, polyLeu, and gramicidin. Admittedly, none of these polypeptides can be regarded as particularly good models for natural bitopic membrane proteins (especially polyLeu). However, it was reasoned that solvent systems which were successful in solubilizing these peptides would likely prove effective in solubilizing more conventional membrane proteins. The following neat solvents formed the basis for the various mixtures tested.

1. Hydrogen bond donor and acceptor: water, methanol (MeOH), isopropanol (IPA), n-butanol, acetic acid (AcOH), trifluoroethanol (TFE), hexafluoroisopropanol (HFIP), ammonium hydroxide
2. Hydrogen bond donor (only): pyrrole (may accept an H-bond, but not of the conventional variety)
3. Hydrogen bond acceptor (only): tetrahydrofuran (THF), dioxane, dimethyl formamide (DMF), dimethylsulfoxide (DMSO), ethyl ether, acetonitrile, pyridine, ethyl acetate
4. Other: chloroform (CLF), dichloromethane (DCM), benzene
All of these solvents (neat) and all possible 1:1 mixtures (19 x 19) were tested for their ability to completely solubilize each of the 5 peptides at a level of 1 mg/ml. Each neat solvent and 1:1 mixture was subjected to 5 independent variations: no additives, added 1% trifluoroacetic acid (TFA), added 2% pyridine, added 1% TFA + 5% water, and added 2% pyridine + 5% water. Each solvent system was tested for its ability to solubilize the peptides both upon prolonged incubation (48 hrs) at room temperature and also when warmed on a hot water bath (2 minutes, 90 degrees). A number of less systematic tests were conducted involving mixtures including the following solvents: formic acid, TFA, N-methylpyrrolidinone (NMP), hexafluorobenzene (HFB). Solvent systems which were identified as being promising based upon the screens described above were tested for their ability to solubilize the conventional transmembrane peptides DGK3M and Cor3.

In general, it was observed that the addition of small amounts of pyridine and/or water to the solvent system did not promote solubilization of the peptides. In cases where a "neutral" solvent system solubilized a given peptide, the presence of TFA in the otherwise same solvent system generally proved equally effective. However, there were a number of cases where the presence of 1% TFA was needed to make a particular solvent system effective. The results are summarized below.

1. Gramicidin D: soluble in a majority of the solvents tested.
2. PolyAla: neat HFIP and 1:1 HFIP:(IPA, benzene, HFB, CLF or DCM). Also soluble in TFA, in 10% TFA in TFE, in CLF or DCM with 1% TFA, in 1:1 (CLF or DCM):benzene with 1% TFA (and heating), and in 2:1:1 HFB:HFIP:CLF with 0.5% TFA.
3. PolyLeu: 1:1 CLF:HFIP (with heating), 1:1 CLF:HFIP with 1% TFA, CLF with 1% TFA, neat TFA, 1:1 HFIP:benzene with 1% TFA (with heating), 1:1 DCM:HFIP with 1% TFA (with heating), 1:1 THF:ethyl ether (with heating), 1:1 ethyl ether:benzene (with heating), 1:1 (DCM or CLF):benzene (with heating), 1:1 HFIP:ethyl ether with 1% TFA, and 1:1 THF:benzene (with heating).
4. PolyPhe: CLF or DCM with 1% TFA, 1:1 HFIP:(DCM or CLF) with 1% TFA, 1:1 benzene:(CLF or DCM) with 1% TFA (with heating).
5. Crude DGK3M (of a total of 50 systems tested): HFIP, HFIP:CLF mixtures, formic acid, 1:1 formic acid:TFE, 1:1 TFA:IPA, 1:2:2 TFA:IPA:CLF.
6. Crude Cor3 (of a total of 23 systems tested): HFIP, TFE, AcOH, MeOH, DMF, DMSO, NMP, 1:1 HFIP:IPA, 1:1 TFE:CLF, 1:1 MeOH:THF, 1:1 HFIP:benzene, 1:1 HFIP:HFB, 1:1 HFIP:acetonitrile, 1:1 IPA:H2O.
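The size of the systematic organic-solvent screen described above can be tallied with a short enumeration. The sketch below is an illustration only, written in Python, with variable names chosen here. It uses the neat solvents as they are listed in the text, which number 20 even though the text refers to a 19 x 19 grid, so the exact totals should be read as approximate; either count gives a figure close to the "about 2000" quoted.

```python
# Sketch: tally the solvent screen (neat solvents, all 1:1 pairs, additive
# variations, and the two incubation conditions) from the lists in the text.
from itertools import combinations

neat_solvents = [
    # hydrogen bond donor and acceptor
    "water", "MeOH", "IPA", "n-butanol", "AcOH", "TFE", "HFIP", "ammonium hydroxide",
    # hydrogen bond donor (only)
    "pyrrole",
    # hydrogen bond acceptor (only)
    "THF", "dioxane", "DMF", "DMSO", "ethyl ether", "acetonitrile", "pyridine",
    "ethyl acetate",
    # other
    "CLF", "DCM", "benzene",
]

additive_variants = [
    "no additives",
    "1% TFA",
    "2% pyridine",
    "1% TFA + 5% water",
    "2% pyridine + 5% water",
]

conditions = ["48 h at room temperature", "2 min on a hot water bath"]

pairs = list(combinations(neat_solvents, 2))          # unordered 1:1 mixtures
base_systems = len(neat_solvents) + len(pairs)        # neat solvents plus mixtures
n_systems = base_systems * len(additive_variants)     # with additive variations
n_tests_per_peptide = n_systems * len(conditions)     # each tested cold and warm

print(len(neat_solvents), len(pairs), base_systems, n_systems, n_tests_per_peptide)
```

Counting unordered pairs avoids treating 1:1 A:B and 1:1 B:A as different mixtures, which a literal reading of a square grid would do.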
DGK3M and Cor3 appear to represent extremes: Cor3 was easy to solubilize by a variety of solvents. DGK3M was very difficult to solubilize except in either extreme solutions (high % TFA or formic acid) or in the HFIP-containing systems which were suggested by the more systematic polyamino acid studies. Without additional data, it is difficult to say whether this difference in behavior is due to capping of the N- and C-termini of Cor3 or whether the difference is more sequence-related.
B. Solubility Studies: Detergents and Denaturants

The ability of several denaturant and detergent mixtures to solubilize (1-3 mg/ml) DGK3M, polyAla, polyLeu and polyPhe at room temperature was tested: 100 mM dodecyltrimethylammonium bromide, 100 mM CHAPSO, 125 mM SDS, 100 mM Triton X-100, 100 mM lauryl maltoside, 150 mM octylglucoside, 6 M guanidine-HCl, 8 M urea, and 20% polyethylene glycol 8000. All of the detergents were at concentrations well above their critical micelle concentrations. None of the polyamino acids was found to be very soluble in any of the above mixtures. DGK3M was observed to be partially, but incompletely, soluble in the SDS, dodecyltrimethylammonium, and octylglucoside mixtures. The general difficulty of solubilizing these polypeptides in traditional denaturant and detergent solutions may reflect the inability of the solutions to overcome a kinetic barrier to disaggregation and should not necessarily be taken to imply thermodynamic instability of the fully dispersed proteins in the solutions.

C. Making Homogeneous Polypeptide-Lipid Mixtures
Not surprisingly (10,11), HFIP-containing solutions were generally the best at dissolving the polypeptides tested in this study. HFIP is also an excellent solvent for many phospholipids and has a relatively high freezing point (-4 deg. C). We found that polyAla could be dissolved (5 mg/ml) in any of the following solvents, frozen, and freeze-dried in vacuo to produce an easy to handle fluffy solid: neat HFIP, 1:1 HFIP:hexafluorobenzene, and 1:1 HFIP:(CLF or DCM). This was not the case with 1:1 HFIP:benzene, 1:1 HFIP:acetonitrile, 3:1:1 HFIP:TFA:H2O, 1:1 HFIP:IPA, 1:1 HFIP:water, 3:1:1 formic acid:HFIP:water, or 1:1 HFIP:formic acid. Lipid-peptide mixtures were next investigated. Roughly 10 mg of dimyristoylphosphatidylcholine (DMPC) and 2.5 mg of polyAla were codissolved in 0.5 ml each of various solvent mixtures to form a clear homogeneous solution, followed by freezing and freeze-drying in vacuo. Neat HFIP, 1:1 HFIP:benzene, and 1:1 HFIP:hexafluorobenzene all yielded an easy to handle, low density, white solid, while 1:1 HFIP:(CLF or DCM) produced a more glass-like solid.
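For orientation, the masses quoted for the lipid-peptide mixtures can be converted into an approximate molar ratio. The following Python sketch is an illustration under stated assumptions and is not taken from the original work: it uses a molar mass of about 677.9 g/mol for DMPC and treats polyAla as a single species of the stated average length of 42 residues (alanine residue mass about 71.08 g/mol plus one water), although the material was polydisperse.

```python
# Sketch: approximate lipid:peptide molar ratio for 10 mg DMPC + 2.5 mg polyAla.
# Assumptions: DMPC molar mass ~677.9 g/mol; polyAla treated as a 42-mer.

DMPC_MOLAR_MASS = 677.9            # g/mol, assumed value for DMPC
ALA_RESIDUE_MASS = 71.08           # g/mol per alanine residue
WATER_MASS = 18.02                 # g/mol, for the chain termini
POLYALA_AVERAGE_LENGTH = 42        # average length stated in Materials and Methods

polyala_molar_mass = POLYALA_AVERAGE_LENGTH * ALA_RESIDUE_MASS + WATER_MASS

dmpc_umol = 10.0 / DMPC_MOLAR_MASS * 1000.0        # mg / (g/mol) gives mmol; x1000 gives umol
polyala_umol = 2.5 / polyala_molar_mass * 1000.0

lipid_to_peptide = dmpc_umol / polyala_umol

print(f"polyAla molar mass ~ {polyala_molar_mass:.0f} g/mol")
print(f"lipid:peptide molar ratio ~ {lipid_to_peptide:.1f}:1")
```

Because the polyAla sample is polydisperse, the ratio is only a rough guide to the composition of the freeze-dried solid.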
D. Screening of Chromatography Systems By TLC

The results from the solubility studies were used to provide guidance for the screening of solvent systems for possible use as chromatographic eluents. One of the major problems previously encountered when doing normal or reversed phase chromatography of peptides having a high propensity to aggregate is that total post-column recovery is often very low. The aim of the screening studies described here is to identify solvent systems which will afford appropriate elution characteristics while at the same time promoting high post-column yield by preventing on- or pre-column peptide aggregation. While the TLC results presented below have yet to be extended to column chromatography, it is hoped that they may provide the foundation for future work.

The polypeptides used in screening TLC systems were P16, P24, and DGK3M. TLCs were typically run using plates about 3 inches high at room temperature. The results summarized below must be viewed as preliminary for three reasons. First, it was observed that results could be highly dependent upon temperature, which tended to be somewhat variable in our lab. Second, no effort was made to remove waters of hydration which may have adsorbed to the silica-based stationary phases of the plates (the humidity in our lab is also variable). Third, because of the expense of HFIP, only minimal volumes were made of each solvent system, a fact which will tend to magnify errors in volumetric quantitation. With these caveats in mind, the screening of solvent systems is summarized in Table I.
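The third caveat, that small preparation volumes magnify volumetric errors, can be made concrete with a short calculation. The sketch below is a hypothetical illustration in Python: the total volumes (1 ml and 10 ml) and the 5 µL dispensing error are assumed values, not figures from this work, and the function names are introduced here. The ratio used is that of a five-component system of the kind listed in Table I (7:1.5:1.5:1:0.25 CLF:HFIP:formic acid:IPA:H2O).

```python
# Sketch: how the relative error on the smallest component grows as the
# total prepared volume shrinks. Volumes and error are hypothetical.

def component_volumes_ul(parts, total_ul):
    """Split a total volume (in microliters) according to volume ratio parts."""
    scale = total_ul / sum(parts.values())
    return {name: share * scale for name, share in parts.items()}

def relative_error_percent(volume_ul, abs_error_ul):
    """Relative error (%) caused by a fixed absolute dispensing error."""
    return 100.0 * abs_error_ul / volume_ul

parts = {"CLF": 7.0, "HFIP": 1.5, "formic acid": 1.5, "IPA": 1.0, "H2O": 0.25}
assumed_error_ul = 5.0                       # hypothetical absolute dispensing error

small = component_volumes_ul(parts, 1000.0)   # hypothetical 1 ml preparation
large = component_volumes_ul(parts, 10000.0)  # hypothetical 10 ml preparation

for label, volumes in (("1 ml", small), ("10 ml", large)):
    water_ul = volumes["H2O"]
    error = relative_error_percent(water_ul, assumed_error_ul)
    print(f"{label}: water = {water_ul:.1f} uL, relative error = {error:.1f}%")
```

Under these assumptions the same absolute error is ten times larger in relative terms for the smaller preparation, which is why minor components such as water are the least well defined in the small-volume mixtures.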
Table I. Summary of Thin Layer Chromatography Results

Phase 1: Unmodified Silica Gel

Solvent System | Peptide Rfs | Additional Comments
1. HFIP:CLF = 1:1 | near 0 |
2. HFIP with 0.5% TFA and 0.5% H2O | near 0.1 |
3. HFIP with 1% TFA and 10% H2O | DGK: near 0.6; P16: near 0.52 |
4. HFIP with 10% formic acid | all near 0.9 |
5. 1:3 CLF:(HFIP with 10% formic acid) | DGK: near 0.8; P16: near 0.65 |
6. 1:6 CLF:(HFIP with 1% TFA and 10% H2O) | DGK: near 0.6; P16: near 0.45 |
7. 1:1:1 HFIP:Formic Acid:CHCl3 | NA | Phase separation occurred and silica/plate bonding agent partially dissolved.
8. 1:1 TFA:CLF, with 2% H2O | NA | Solvent phase separation occurred on plate: very uneven solvent fronts.
9. 1:3 TFA:CLF | NA | Phase separation.
10. 1:1:0.075 HFIP:CLF:Formic Acid | near 0.9 | Streaked/tailing spots.
11. 2:3 CLF:(HFIP with 10% formic acid) | near 0.7 | Phase separation. Peptides at boundary.
12. 1:4 CLF:(HFIP with 1% TFA and 10% H2O) | DGK: near 0.6; P16: near 0.45 |
13. 7:1:2:2 CLF:HFIP:Formic Acid:IPA | 0.2-0.6 | Streaky spots.
14. 8:2:2 CLF:Formic Acid:IPA | 0.1-0.5 | Somewhat streaky peptide spots.
15. 7:3:1 CLF:IPA:HFIP | near 0 |
16. 7:3:1:0.5 CLF:IPA:HFIP:TFA | near 0 |
17. 7:1:1.5 CLF:HFIP:TFA | near 0.5 | Phase separation. Peptides diffuse at front of bottom phase.
18. 7:1:2:0.25 CLF:HFIP:Formic Acid:H2O | 0.1-0.4 | Phase separation. Peptides streak in lower phase.
19. 6:2:2:1:0.25 CLF:HFIP:IPA:Formic Acid:H2O | 0.2-0.4 | Peptide spots streaky.
20. 6:2:2:2:0.25 CLF:HFIP:IPA:Formic Acid:H2O | near 0.5 | Peptide spots slightly streaky.
21. 1:2:0.5 CLF:(HFIP with 10% H2O and 1% TFA):IPA | DGK: near 0.3; P16: near 0.2 |
22. 7:1.5:1.5:1.5 CLF:HFIP:Formic Acid:IPA | near 0.5 | Peptide spots very streaky.
23. 7:1.5:1.5:1:0.25 CLF:HFIP:Formic Acid:IPA:H2O | 0.05-0.3 | Peptide spots slightly streaky.
24. 7:1.5:1.5:1:1.5 CLF:HFIP:Formic Acid:IPA:H2O | DGK: near 0.4; P16: near 0.2 | Peptide spots slightly streaking.
Phase 2: C2-Silica (Ethyl-Modified Silica Gel)

25. neat HFIP | near 1.0 |
26. 2:1:0.5 (HFIP with 10% H2O and 1% TFA):CHCl3:IPA | near 0.2 |
27. 20:1 HFIP:H2O | near 1.0 |
28. 20:1:6.7 HFIP:H2O:CLF | near 1.0 |
29. neat CLF | at 0 |
(Comment printed at this point without a clear row assignment: Phase separation. Peptides at boundary.)
30. 6:1 CLF:HFIP | at 0 |
31. 2:1 CLF:HFIP | at 0 |
32. 1:1 CLF:HFIP | DGK: near 0.8; P16: near 0.45 | Peptide spots shaped unevenly and streaked.
33. 2:1 CLF:HFIP, with 0.5% TFA | near 0.27 | May have been phase separation, spots diffuse.
34. 4:1 CLF:HFIP, with 0.5% TFA | near 0 | May have been phase separation.
Phase 3: C18-Silica (Octadecyl-Modified Silica Gel)

35. neat HFIP | near 0.65 |
36. HFIP with 10% H2O | 0-0.1 |
37. HFIP with 0.1% TFA | near 0.65 |
38. HFIP with 1% TFA | near 0.75 |
39. HFIP with 10% TFA | 0.9-1.0 |
40. HFIP with 0.5% TFA and 0.5% H2O | near 0.75 | Peptide spots uneven, some streaking.
41. [solvent system missing] | near 1.0 |
42. 1:3 TFA:CLF | near 0.2 | Solvent front uneven. Phase separation possible. Peptide spots blotched.
43. 1:1:0.075 HFIP:CLF:Formic Acid | near 1.0 | Solvent front very uneven.
44. 2:3 CLF:(HFIP with 10% formic acid) | near 1.0 |
45. 1:4 CLF:(HFIP with 1% TFA and 10% H2O) | near 0.7 | Peptide spots a little streaky and uneven.
46. 7:1:2:2 CLF:HFIP:Formic Acid:IPA | near 1.0 |
47. 8:2:2 CLF:Formic Acid:IPA | near 1.0 |
48. 7:3:1 CLF:IPA:HFIP | at 0 |
49. 7:3:1:0.5 CLF:IPA:HFIP:TFA | at 0 | Peptide streak focussed at origin but going up plate.
50. 7:1:1.5 CLF:HFIP:TFA | NA | Solvent phase separation occurred on plate.
51. 6:2:2:1:0.25 CLF:HFIP:IPA:Formic Acid:H2O | at 1.0 |
52. 1:2:0.5 CLF:(HFIP with 10% H2O and 1% TFA) | at 1.0 |
53. 1:1:1:1 HFIP:IPA:H2O:Formic Acid | near 1.0 |
54. 1:1 HFIP:H2O, with 0.5% TFA | at 0.0 |
55. HFIP:H2O with 0.5% TFA | at 0.0 |
56. 9:1 HFIP:H2O, with 0.1% TFA | near 0.05 |
57. 8:0.5:0.5:0.5 CLF:HFIP:IPA:formic acid | 0.1-0.6 |
58. 8:0.5:0.5:0.5 CLF:HFIP:IPA:formic acid, with 0.2% H2O | near 0.9 |
59. 10:0.5 HFIP:H2O, with 0.1% TFA | near 0.0 | Upward streak from spot at origin ("umbrella").
60. 8:2.0:0.5:0.5 CLF:HFIP:IPA:formic acid, with 0.2% H2O | near 1.0 | Solvent front blotched.
61. 8:2:1 CLF:HFIP:IPA, with 0.2% TFA | at 0.0 |
62. 8:2:0.5 CLF:HFIP:IPA, with 0.2% TFA | at 0.0 | Some streaking of peptide up from origin. Phase separation?
63. 8:1:0.5:0.5 CLF:HFIP:IPA:Formic Acid, with 0.2% H2O | near 0.7 | Moderate streaking of peptide spots.
64. 6:3:1.5 CLF:HFIP:IPA, with 0.2% TFA | near 0.33 | Phase separation probably occurred with peptides on the boundary. Streaking.
65. 6:3:0.75 CLF:HFIP:IPA, with 0.2% TFA | near 0.3 | Phase separation probably occurred, peptides lie on the boundary.
66. 6:3 CLF:HFIP | NA | Phase separation appears to have occurred. Some peptide runs with front of top phase, some at origin.
67. 4:4:1 CLF:HFIP:IPA, with 0.2% TFA | NA | Phase separation occurred. Peptide found at both fronts.
68. 8:3:1 CLF:HFIP:IPA | near 0.0 | Solvent phase separation occurred on plate.
(Comments printed for entries 51-68 without a clear row assignment: "Appeared to dissolve gel/plate binder. Peptide spots very blotchy." and "Peptide spots very streaky.")
69. 8:3:1 CLF:HFIP:IPA, with 0.3% H2O | near 0.0 |
70. 6:3:0.5 CLF:HFIP:IPA, with 0.2% H2O | NA |
71. 16:3:1 CLF:HFIP:IPA | near 0.0 |
72. 16:3:1 CLF:HFIP:IPA, with 0.5% formic acid | near 0.5 |
(Comments printed for entries 69-72 without a clear row assignment: "Some peptide stays at origin, some runs with solvent front." and "Some peptide left on origin, some streaking of spots.")
Acknowledgments The authors thank the NIH (GM47485) for their support of this work. G.C.L. received salary support from an NIH training grant (T32 HL07653). Part of this work was carried out while C.R.S. was an Established Investigator of the American Heart Association.
References
1. Gullick, W. J., Bottomley, A. C., Lofts, F. J., Doak, D. G., Mulvey, D., Newman, R., Crumpton, M. J., Sternberg, M. J. E., and Campbell, I. D. (1992). EMBO J. 11, 43-48. 3-D structure of the transmembrane region of the protooncogenic and oncogenic forms of the neu protein.
2. Sanders, C. R., and Landis, G. C. (1994). Facile Acquisition and Assignment of Oriented Sample NMR Spectra for Bilayer Surface-Associated Proteins. J. Am. Chem. Soc. 116, 6470-6471.
3. Sanders, C. R., and Schwonek, J. P. (1993). Simulation of Solid State NMR Parameters from Oriented Proteins: Progress in Algorithm Development for Total Structural Studies. Biophys. J. 65, 1460-1469.
4. Sanders, C. R., and Schwonek, J. P. (1993). An Approximate Model and Empirical Energy Function for Solute Interactions with a Water-Phosphatidylcholine Interface. Biophys. J. 65, 1207-1218.
5. Sanders, C. R., Hare, B. J., Howard, K., and Prestegard, J. H. Magnetically Oriented Phospholipid Micelles as a Tool for the Study of Membrane-Associated Molecules. Prog. NMR Spectr., in press.
6. Davis, J. H., Clare, D. M., Hodges, R. S., and Bloom, M. (1983). Biochem. 22, 5298-5305. Structure of a synthetic amphiphilic polypeptide and lipids in a bilayer structure.
7. Smith, R. L., O'Toole, J. F., Maguire, M. E., and Sanders, C. R. (1994). J. Bacteriol. (in press). Membrane topology of E. coli diacylglycerol kinase.
8. Smith, R. L., Banks, J. L., Snavely, M. D., and Maguire, M. E. (1993). J. Biol. Chem. 268, 14071-14080. Sequence and topology of the CorA magnesium transport systems of S. typhimurium and E. coli.
9. Kirchner, J. G. (1978). "Thin Layer Chromatography". Techniques of Chemistry Vol. XIV.
10. Kuroda, H., Chen, Y.-N., Kimura, T., and Sakakibara, S. (1992). Int. J. Pept. Prot. Res. 40, 294-299. Powerful solvent systems useful for synthesis of sparingly-soluble peptides in solution.
11. Narita, M., Honda, S., Umeyama, H., and Obana, S. (1988). Bull. Chem. Soc. Jap. 61, 281-284. The solubility of peptide intermediates in organic solvents. Solubilizing ability of hexafluoro-2-propanol.
Supplemental Bibliography

Greiner, J., Riess, J. G., and Vierling, P. (1993). In "Organofluorine Compounds in Medicinal Chemistry" (R. Filler, Y. Kobayashi, and L. M. Yagupolskii, eds.), p. 339, Elsevier, New York. Author's note: This paper reviews a variety of partially perfluorinated detergents which have been developed fairly recently. Might these prove effective for solubilizing difficult membrane polypeptides?

Indofine Chemical Company (1994). Fluorochemicals for Research and Development (catalog). Author's note: The availability of a large number of fluorinated compounds from Indofine may facilitate future developments of fluorocarbon-based solvent systems.

Iwamoto, T., Grove, A., Oblatt Montal, M., Montal, M., and Tomich, J. M. (1994). Int. J. Pept. Prot. Res. 43, 597-607. Chemical synthesis and characterization of peptides and oligomeric proteins designed to form transmembrane ion channels. Author's note: This is one of many papers dealing with the handling of amphipathic peptides such as melittin or peptides derived from the pores of ion channels. It is our experience that these peptides are generally much easier to handle by traditional methods than non-amphipathic membrane proteins. The same can be said for uncharged transmembrane peptides like gramicidin and alamethicin.

Hendrix, J. C., Halverson, K. J., Jarrett, J. T., and Lansbury, P. T. (1990). J. Org. Chem. 55, 4517-4518. A novel solvent system for solid-phase synthesis of protected peptides.

Kohl, B., and Sandermann, H. (1977). FEBS Lett. 80, 408-412. Solubilization of E. coli membrane proteins by aprotic solvents.

Lerro, K. A., Orlando, R., Zhang, H., Usherwood, P. N. R., and Nakanishi, K. (1993). Anal. Biochem. 215, 38-44. Separation of the sticky peptides from membrane proteins by HPLC in a normal-phase system. Author's note: We were unable to solubilize either DGK3M or any of the polyamino acids in the solvent systems reported in their work, suggesting their method may not be completely general.

Ono, S., Lee, S., Mihara, H., Aoyagi, H., Kato, T., and Yamasaki, N. (1990). Biochim. Biophys. Act. 1022, 237-244. Design and synthesis of basic peptides having amphipathic beta structure and their interaction with phospholipid membranes.

Perez-Gil, J., Cruz, A., and Casals, C. (1993). Biochim. Biophys. Act. 1168, 261-270. Solubility of hydrophobic surfactant proteins in organic solvent/water mixtures. Structural studies on SP-B and SP-C in aqueous organic solvents and lipids.

Smith, S. O., Jonas, R., Braiman, M., and Bormann, B. J. (1994). Biochem. (in press). Structure and orientation of the transmembrane domain of glycophorin A in lipid bilayers.

Tomich, J. M., Carson, L. W., Kanes, K. J., Vogelaar, N. J., Emerling, M. R., and Richards, J. H. (1988). Anal. Biochem. 174, 197-203. Prevention of aggregation of synthetic membrane-spanning peptides by addition of detergent.

Vorherr, T., Wrzosek, A., Chiesi, M., and Carafoli, E. (1993). Prot. Sci. 2, 339-347. Total synthesis and functional properties of the membrane-intrinsic protein phospholamban.
Rapid Separation of Proteins and Peptides using Conventional Silica-Based Supports: Identification of 2-D Gel Proteins Following In-Gel Proteolysis

Robert L. Moritz, James Eddes, Hong Ji, Gavin E. Reid, and Richard J. Simpson

Joint Protein Structure Laboratory, Ludwig Institute for Cancer Research (Melbourne) and The Walter and Eliza Hall Institute of Medical Research, Parkville, Vic. 3050 Australia
TECHNIQUES IN PROTEIN CHEMISTRY VI. Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.

I. Introduction

Reversed-phase high-performance liquid chromatography (RP-HPLC) is one of the most important and well established techniques for purifying proteins and peptides for structural analysis. Current technology, using narrow- and wide-pore size (120-300 Å) silica-based (i.e., conventional) supports, requires 30-120 min to achieve highly efficient, reproducible separation of components [1]. However, during the last several years there has been an increasing awareness of the need for faster chromatography for high volume sample throughput. In our laboratory, this impetus has stemmed from the emergence of two powerful techniques that allow the rapid identification of 1-D and 2-D gel protein spots, namely, peptide-mass fingerprinting [2,3] and amino acid composition analysis [4].

Rapid chromatographic separation of proteins and peptides at high flow velocities (~1000 cm/h; i.e., 3 ml/min on a 4.6 mm I.D. column) on silica-based supports is not a new concept. However, these experiments were performed on specifically designed supports. During the late 1980s, several approaches to rapid RP-HPLC were investigated. These studies utilised small (1-3 µm) non-porous particles [5-7] which allow rapid mass transfer of solutes into and out of the thin layer of stationary phase on the surface of the particles. Although these supports are commercially available, they have not gained general acceptance due to serious drawbacks such as low mass loading (capacity) and high back pressure. Many of the limitations of the latter supports were overcome with the development of polystyrene divinylbenzene-based polymeric supports such as the perfusive support [8]. This support has very large pore structures (>8000 Å) which transect the particle and confer two distinct advantages over non-porous supports: (i) increased surface area, and hence higher mass loading, and (ii) the minimisation of slow diffusive flow, which permits rapid partitioning, and hence resolution, of proteins and peptides within the supports over a wide range of high flow velocities (1000-9000 cm/h).

Although ultrafast (15-90 sec duration) protein and peptide separations have obvious analytical applications [9,10], the high flow velocities involved (>4000 cm/h) severely restrict their preparative application, since the peak bandwidths involved (typically 1-2 sec) present the researcher with the technical challenge of collecting fractions at these flows; a problem akin to trying to collect precision fractions from a "high-pressure garden hose". However, when these macroporous supports are used at flow velocities in the range 500-1000 cm/h, peak bandwidths of 10-20 sec are obtained with only a minimal decrease in resolution, but in a time-frame that permits manual fraction collection.

One important consideration of the macroporous supports that is often overlooked is that they do not possess the high efficiencies of conventional supports when operated at < 300 cm/h, the flow velocities deemed optimal for the latter supports. As the flow velocities are increased, the efficiency of conventional supports deteriorates while that of the macroporous supports varies little. However, at flow velocities of ~ 3500 cm/h, the chromatographic behaviour of both supports is comparable.

Here we describe a protocol for fast chromatographic analysis of proteins and peptides using conventional "wide pore" supports and standard liquid chromatographs. Using flow velocities of 500-1000 cm/h (0.3-0.6 ml/min for a 2.1 mm I.D. column) and a constant linear gradient volume of 6 ml, peptide separations can be achieved in ~ 10 min; almost an order of magnitude faster than standard chromatographic conditions. Under these conditions, highly efficient, reproducible peptide separations (0.009 RSDs) can be achieved on conventional columns. Additionally, we present our modified procedure for performing in-gel proteolytic digestion of 2-D gel protein spots for the purpose of identifying proteins by peptide-mass fingerprinting along with microsequencing.
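The relationship between volumetric flow rate and superficial linear flow velocity used throughout this chapter can be checked with a short calculation. The following is a minimal sketch, assuming only that the velocity is the flow rate divided by the empty-column cross-sectional area; the function names are illustrative and do not come from the original protocol.

```python
import math


def linear_velocity_cm_per_h(flow_ml_per_min, column_id_mm):
    """Superficial linear velocity (cm/h) = volumetric flow / column cross-section."""
    radius_cm = column_id_mm / 10.0 / 2.0          # mm -> cm, diameter -> radius
    area_cm2 = math.pi * radius_cm ** 2            # empty-tube cross-section
    return flow_ml_per_min * 60.0 / area_cm2       # 1 ml = 1 cm^3; min -> h


def gradient_time_min(gradient_volume_ml, flow_ml_per_min):
    """Duration of a gradient of fixed volume at a given flow rate."""
    return gradient_volume_ml / flow_ml_per_min


# Flow rates quoted in the text for a 2.1 mm I.D. column, with a 6 ml gradient.
for flow in (0.1, 0.3, 0.5, 0.6, 2.0):
    velocity = linear_velocity_cm_per_h(flow, 2.1)
    minutes = gradient_time_min(6.0, flow)
    print(f"{flow:.1f} ml/min -> {velocity:6.0f} cm/h, 6 ml gradient in {minutes:5.1f} min")
```

Holding the gradient volume constant, as the text does, means that run time scales inversely with flow rate, which is how the 6 ml gradient shortens to about 10 min at 0.6 ml/min.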
II. Materials and Methods

All protein standards were from Sigma, Coomassie Brilliant Blue from LKB-Pharmacia, HPLC-grade solvents from Mallinckrodt, and trifluoroacetic acid (TFA) from Pierce. Proteolytic digestion of standard proteins was performed at 37°C for 16 h in 0.2M ammonium bicarbonate using trypsin (Promega) at a 1:50 enzyme:substrate ratio. Homogeneous 10% Tris-glycine gels were from Novex. Chromatography of proteins and peptides was performed on a Hewlett-Packard liquid chromatograph (model 1090A) as described [11] using (i) Brownlee RP-300, 7 µm dimethyloctyl silica packed into a 100 mm x 2.1 mm I.D. cartridge (Applied Biosystems); (ii) Poros RII/H, 10 µm divinylbenzene crosslinked polystyrene packed into a 100 mm x 2.1 mm I.D. column (Perseptive Biosystems). Preparation of whole-cell lysates of the human colon carcinoma cell line LIM 1863, 2-D gel and SDS-PAGE analysis were performed as described [12,13]. In-gel proteolysis of gel protein spots was essentially as described [14] with the modifications shown in Table 1.

Table 1. Protocol for in-gel protein digestion

1. Run 1-D or 2-D acrylamide gel. Visualize proteins with Coomassie Blue.
   Stain gel: 50% MeOH, 10% HOAc, 0.1% CBR250 (~ 5-10 min)
   Destain: 12% MeOH, 7% HOAc, for 1-1.5 h (with ~ 3 changes)

2. In-gel digestion:
   (i) Excise stained gel band
   (ii) Wash twice (~ 200 µl 0.2M NH4HCO3 / 50% CH3CN) for 30 min at 30°C
   (iii) Dry gel band completely in Savant (~ 30 min)
   (iv) Rehydrate gel band with trypsin solution (~ 0.5-1.0 µg trypsin in 10 µl 0.2M NH4HCO3, 0.5 mM CaCl2), 15-30 min. Repeat step
   (v) Add 150 µl of digestion buffer (0.2M NH4HCO3, 0.5 mM CaCl2). Incubate at 37°C, ~ 16 h

3. Peptide extraction:
   (i) Collect digest buffer
   (ii) Add 200 µl 1% TFA, sonicate ~ 30 min (35-40°C), collect extract
   (iii) Add 200 µl 0.1% TFA / 60% CH3CN, sonicate ~ 30 min (35-40°C), collect extract
   (iv) Pool extracts, concentrate by centrifugal lyophilization to ~ 10-20 µl

4. Peptide identification: Separate peptides by either rapid microbore RP-HPLC (for Edman degradation) or rapid capillary RP-HPLC (for LC/MS/MS or peptide-mass fingerprinting)
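Peptide-mass fingerprinting, named in step 4, compares measured peptide masses with those predicted from a theoretical digest of candidate sequences. The sketch below illustrates the prediction side only. It assumes the commonly used trypsin rule (cleavage after Lys or Arg unless the next residue is Pro), no missed cleavages, and unmodified residues; the sequence is hypothetical and the function names are illustrative, not part of the authors' procedure.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide


def tryptic_digest(sequence):
    """Split a sequence after K or R, except when the next residue is P."""
    if not sequence:
        return []
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        at_end = i == len(sequence) - 1
        if residue in "KR" and not at_end and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])
    return peptides


def monoisotopic_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide (Da)."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


# Hypothetical sequence, used only to show the cleavage rule.
for peptide in tryptic_digest("GAKLRPGKPLRSYK"):
    print(f"{peptide:10s} {monoisotopic_mass(peptide):10.4f}")
```

A real search would also allow for missed cleavages and modifications (for example, of cysteine or methionine), which this sketch deliberately leaves out.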
III. Results and Discussion

Binding Capacity and Mass Transfer Kinetics

Frontal analysis chromatography, using a 1 mg/ml solution of lysozyme in aqueous 0.1% TFA, was performed at varying flow velocities with a conventional wide-pore (300 Å) column and a macroporous (8000 Å) column in order to evaluate their binding capacities as well as mass transfer kinetics. It can be seen in Fig. 1 that the total protein binding capacity (saturation level) was significantly greater (~ 3-fold) for the silica-based 300 Å support (~ 11.5 mg) when compared to the macroporous support (~ 4 mg). As expected, the protein saturation level in both cases was independent of flow velocities over the range examined. However, it should be noted that for the silica-based support, the initial binding (or breakthrough) is very much dependent on flow velocity. For example, at 347 cm/h the breakthrough occurs at protein loads > 11 mg, while at 3465 cm/h it occurs at ~ 7 mg. The lack of variation in the frontal curve shape for the macroporous support reflects minimal "stagnant mobile phase mass transfer", a feature of this support design. By contrast, for the conventional silica-based support, as the flow velocity increases the slow mass transfer kinetics attributable to the large number of stagnant pools (or inaccessible pockets) that are inherent in these supports become more pronounced. This observation has been previously reported for a wide range of other silica-based supports [15].

Fig 1. Frontal loading adsorption isotherms of conventional "wide-pore" derivatised silica (Brownlee RP-300) and macroporous divinylbenzene crosslinked polystyrene (Poros RII/H) supports. Horizontal axis: lysozyme (mg). Protein: 1 mg/ml solution in aqueous 0.1% TFA. Superficial linear flow velocities: 347, 866, 1732 and 3465 cm/h. Temperature: 25°C. (A) RP-300 2.1 mm ID cartridge. (B) Poros RII/H 2.1 mm ID column.
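The frontal-loading figures quoted above can be turned into loading times and a usable fraction of the saturation capacity. This is a rough sketch only: the flow rates of 0.2 and 2.0 ml/min are assumed equivalents of 347 and 3465 cm/h on a 2.1 mm I.D. column (derived from column geometry, not stated for this experiment), the breakthrough at 347 cm/h is taken as 11 mg although the text gives "> 11 mg", and the function names are illustrative.

```python
def loading_time_min(protein_mg, conc_mg_per_ml, flow_ml_per_min):
    """Time needed to pump a given mass of protein onto the column."""
    return protein_mg / (conc_mg_per_ml * flow_ml_per_min)


def fraction_of_saturation(breakthrough_mg, saturation_mg):
    """Share of the total (saturation) capacity used before breakthrough."""
    return breakthrough_mg / saturation_mg


SATURATION_MG = 11.5  # silica-based 300 A support, from the text

# (breakthrough load in mg, assumed flow in ml/min) for the RP-300 cartridge
cases = {"347 cm/h": (11.0, 0.2), "3465 cm/h": (7.0, 2.0)}

for label, (breakthrough_mg, flow) in cases.items():
    minutes = loading_time_min(breakthrough_mg, 1.0, flow)
    used = fraction_of_saturation(breakthrough_mg, SATURATION_MG)
    print(f"{label}: breakthrough after {minutes:.1f} min, "
          f"{used:.0%} of saturation capacity")
```

The comparison makes the trade-off explicit: at the higher velocity the column is loaded far faster, but a smaller share of its total capacity is usable before protein appears in the eluate.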
Fig 2. Rapid reversed-phase chromatography of standard proteins. Horizontal axis: retention time (min). Supports: RP-300 (panels A & B); Poros RII/H (panels C & D). Chromatographic conditions: linear 6-ml gradient of 0-100% B. Solvent A: aqueous 0.1% TFA; Solvent B: aqueous 0.1% TFA containing 60% acetonitrile. Temperature: 45°C. Chromatographic runs performed at superficial linear flow velocities of 173 cm/h (0.1 ml/min) (A, C) and 3465 cm/h (2.0 ml/min) (B, D). Proteins (5 µg): 1, ribonuclease-B; 2, chick lysozyme; 3, bovine serum albumin; 4, myoglobin; 5, carbonic anhydrase; 6, ovalbumin.
Fig 3. Rapid reversed-phase HPLC peptide mapping. Horizontal axis: retention time (min). Sample: 20 µg tryptic digest of cytochrome c. Columns: RP-300 2.1 mm ID cartridge (panels A & B); Poros RII/H 2.1 mm ID (panels C & D). Chromatographic conditions are the same as described in Fig. 2. Chromatographic runs performed at superficial linear flow velocities of 173 cm/h (0.1 ml/min) (A, C) and 3465 cm/h (2.0 ml/min) (B, D).
Effect of Flow Velocity on Resolution and Recovery

Chromatography of a mixture of standard proteins at varying flow velocities on conventional and macroporous supports is shown in Fig. 2. For both supports, it would appear that the resolution of these proteins varies little over the range of flow velocities examined. However, upon close inspection (compare Fig. 2A & B), discernible loss of resolution is evident for the silica-based support upon increasing the flow rate from 0.1 to 2.0 ml/min (i.e., 173 to 3465 cm/h). A loss of resolution is also evident for the macroporous support, but to a lesser extent (compare Fig. 2C & D). It should be noted that the seemingly lower recoveries at the higher flow velocities are due to peak broadening. Overall sample recovery, however, was comparable for both stationary phase packings (data not shown).

Fractionation of a tryptic digest of cytochrome c on conventional and macroporous supports at varying flow velocities is shown in Fig. 3, panels A-D. An inspection of the profiles at 0.1 ml/min indicates that the chromatographic efficiency of the wide-pore silica-based support exceeds that of the macroporous support; this is also evident from the greater (~ 30%) "peak capacity" (i.e., the number of peaks that can be resolved in a chromatographic separation) of the silica-based support. Surprisingly, the chromatographic efficiency of the silica-based support still exceeds that of the macroporous support at high flow rates (3500 cm/h). Additionally, the obvious selectivity differences between the two supports (compare peaks 1 & 2 in Fig. 3A & C) indicate that they could be used in series in a multidimensional peptide purification strategy.

Peptide-Mapping of Acrylamide Gel Resolved Proteins following In-Gel Proteolysis

During the last few years several internal amino acid sequencing strategies for SDS-PAGE separated proteins have been reported. For an excellent practical assessment of these methods see Williams et al. [16,17], and references therein. In an earlier report [14] we described our method, which relies on removing the SDS from the Coomassie Blue stained gel prior to in situ proteolysis and an extensive acid extraction of generated peptides. In an effort to further reduce the overall time of the procedure, we have omitted some of the TFA extraction steps. Additionally, we have replaced the initial SDS removal step with a dilute TFA / acetonitrile extraction step, as described by Fernandez et al. [18]. Using varying amounts of proteolytically digested phosphorylase b (Mr ~ 97,000) and pre-cast 10% gels (Novex), we
compared peptide recoveries from in-gel derived peptide maps with those obtained in-solution (Fig. 4). It can be seen in Fig. 4 that the peptide map profiles of the in-gel and solution digests compare favourably, even with 20 pmol (2 µg) amounts of protein. The recovery of peptides from the in-gel proteolysis, based upon absorbance at 214 nm, is ~ 80% compared to the control in-solution digests. Comparable data were obtained using standard proteins of lower Mr such as lysozyme and β-lactoglobulin (data not shown).

We are using the above-mentioned approach to obtain peptide maps of 2-D gel resolved proteins from various human colorectal cancer cell lines as part of an ongoing program to identify specific colon tumour markers [12,14]. A typical application is shown in Fig. 5 for protein #3 isolated from LIM 1863 cells. Four Coomassie blue-stained protein spots from identical gels were digested with trypsin and the digest mixture subjected to fast chromatography on a conventional 2.1 mm ID column using TFA/acetonitrile (Fig. 5A). Peptide T4 was sequenced directly while peptides T1-3 were further purified on the same column, but using 1% NaCl/acetonitrile (Fig. 5B), before sequence analysis. The partial sequence data obtained (data not shown), at 5-17 pmol levels, were used to search the available protein sequence databases and permitted the unambiguous identification of protein #3 as heat shock protein 60.
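The amounts quoted above move between molar and mass units, and the conversion is easy to verify. A minimal sketch, assuming only the relative molecular mass of ~ 97,000 given for phosphorylase b; the function names are illustrative.

```python
def pmol_to_ug(pmol, relative_mass):
    """pmol x g/mol gives picograms; divide by 1e6 for micrograms."""
    return pmol * relative_mass / 1e6


def ug_to_pmol(ug, relative_mass):
    """Inverse conversion: micrograms to picomoles."""
    return ug * 1e6 / relative_mass


PHOSPHORYLASE_B_MR = 97000  # approximate value quoted in the text

print(f"20 pmol = {pmol_to_ug(20, PHOSPHORYLASE_B_MR):.2f} ug")
# An 80% recovery of a 20 pmol load corresponds to:
print(f"recovered: {0.80 * 20:.0f} pmol")
```

The result agrees with the "20 pmol (2 µg)" figure in the text once rounded.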
Fig 5. Rapid peptide mapping of colorectal cancer cell line LIM 1863 protein #3. Horizontal axis: retention time (min). Coomassie blue stained protein #3 from 4 identical 2-D gels was digested in-gel with trypsin, as described in Materials and Methods, and chromatographed on a conventional silica-based support (Brownlee RP-300) as described in Fig. 2. (A) First chromatographic dimension: linear 6 ml gradient 0-100% B; solvent A, aqueous 0.1% TFA; solvent B, aqueous 0.1% TFA / 60% CH3CN. Flow rate, 0.5 ml/min (866 cm/h). (B) Second chromatographic dimension: the peptide fraction containing peptides T1, T2 & T3 (see collection bar) from Fig. 5A was rechromatographed on the same column but using a linear 5 ml gradient from 0-50% B; solvent A was aqueous 1% NaCl, pH 6.5, and solvent B was acetonitrile. Flow rate, 0.5 ml/min (866 cm/h).
IV. Conclusions

This report describes a rapid chromatographic method for obtaining peptide maps of 1-D and 2-D gel resolved proteins. Peptides can be recovered from acrylamide gel pieces, following in-gel proteolysis, by TFA extraction. Peptides obtained by this approach can be fractionated by rapid (~ 10 min) reversed-phase chromatography using conventional silica-based supports and standard liquid chromatographs. In conjunction with mass spectrometric peptide-mass fingerprinting technologies [2], this approach may allow a rapid expansion of 2-D gel protein databases.
References

1. Simpson, R.J., Moritz, R.L., Begg, G.S., Rubira, M.R., and Nice, E.C. (1989) Anal. Biochem. 177, 221-226.
2. Pappin, D.J., Højrup, P. and Bleasby, A.J. (1993) Current Biology 3, 327-332.
3. Mann, M., Højrup, P., and Roepstorff, P. (1993) Biol. Mass Spec. 22, 338-345.
4. Shaw, G. (1993) Proc. Natl. Acad. Sci. U.S.A. 90, 5138-5142.
5. Unger, K.K., Jilge, G., Kinkel, J.N. and Hearn, M.T.W. (1986) J. Chromatogr. 359, 61-72.
6. Kalghatgi, K. and Horvath, C. (1987) J. Chromatogr. 398, 335-339.
7. Yamasaki, Y., Kitamura, T., Nakatani, S. and Kato, Y. (1989) J. Chromatogr. 481, 391-396.
8. Afeyan, N.B., Gordon, N.F., Mazsaroff, I., Varady, L., Fulton, S.P., Yang, Y.B. and Regnier, F.E. (1990) J. Chromatogr. 519, 1-29.
9. Nugent, K.D. (1990) In "Current Research in Protein Chemistry: Techniques, Structure and Function" (Villafranca, J.J., ed.) Academic Press, pp. 233-244.
10. Fulton, S.P., Afeyan, N.B., Gordon, N.F. and Regnier, F.E. (1991) J. Chromatogr. 547, 452-456.
11. Simpson, R.J., Moritz, R.L., Rubira, M.R. and Van Snick, J. (1988) Eur. J. Biochem. 176, 187-197.
12. Ji, H., Whitehead, R.H., Reid, G.E., Moritz, R.L. and Simpson, R.J. (1994) Electrophoresis 15, 391-405.
13. Ward, L.D., Ji, H., Whitehead, R.H. and Simpson, R.J. (1990) Electrophoresis 11, 883-891.
14. Ward, L.D., Reid, G.E., Moritz, R.L. and Simpson, R.J. (1990) In "Current Research in Protein Chemistry: Techniques, Structure and Function" (Villafranca, J.J., ed.) Academic Press, pp. 179-190.
15. Snyder, L.R. and Kirkland, J.J. Introduction to Modern Liquid Chromatography. Wiley, Toronto.
16. Williams, K., Kobayashi, R., Lane, W. and Tempst, P. (1993) ABRF News 4, 7-12.
17. Stone, K. (1992) ABRF News 3, 8.
18. Fernandez, J., DeMott, M., Atherton, D. and Mische, S.M. (1992) Anal. Biochem. 201, 255-264.
SECTION V Mutagenesis and Protein Design
Studying α-Helix and β-Sheet Formation in Small Proteins Catherine K. Smith, Mary Munson, and Lynne Regan Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT
I. Introduction

The protein engineer can mutate proteins specifically with relative ease, and for this reason, protein engineering provides a swift and precise means for exploring a protein's structure and activity. However, the effects of these mutations on protein structure and stability cannot yet be predicted with comparable precision. As a result, great care must be taken in the design of experiments to examine the desired property independently of other interactions. Here, we describe the design and analysis of two experiments which use protein engineering to explore two aspects of protein structure. The first experiment investigates how α-helices pack together to form tertiary structure through redesigning the hydrophobic core of the four-helix-bundle, Rop, Figure 1a (1). The second experiment quantifies the β-sheet forming tendencies of the amino acids (2) using the IgG Fc binding domain of Streptococcal Protein G ("β1") as a model system, Figure 1b (3).

Figure 1 Ribbon diagrams of the overall protein fold of a) Rop and b) β1
II. Selection of the Model System

Careful selection of the protein to be engineered can simplify both obtaining mutant proteins and analyzing the data collected. Important considerations in this choice include high expression levels, ease of purification and high solubility in aqueous solution, criteria which both Rop and β1 satisfy. Additionally, a detailed structural and thermodynamic analysis must be feasible. Both Rop and β1 lack disulfides, cofactors and proline residues, and the thermodynamics of unfolding are approximated by a reversible two-state transition. The stability of the wild-type should be relatively high so that potentially destabilizing mutants can still be obtained and analyzed. The melting temperature of Rop is 64 °C and that of β1 is 87 °C. A detailed structural analysis is available for both proteins at both low and high resolution to facilitate the design and analysis of mutants. Functionality assays exist for both β1 and Rop: IgG binding and RNA binding, respectively. These assays provide convenient tests for the retention of global structure in redesigned variants. The secondary structure of Rop and β1 can be examined by another low resolution technique, circular dichroism (CD), because both proteins give strong characteristic spectra. Finally, the structures of β1 and Rop have been solved at high resolution by NMR (4,5) and X-ray crystallography (6,7), which simplifies similar high resolution structural determination of the engineered mutants.
III. Preliminary Modification of the Model System

No matter how ideal the model system may seem, modifications are occasionally necessary before work can begin. In the case of β1, expression of the wild-type domain yielded two species: an unprocessed form that retained the N-terminal methionine and a processed form that lacked the methionine. This incomplete processing has serious consequences: the two species differ in stability by 1.7 kcal/mol, and the heterogeneity of the N-terminus causes a duplication of many of the resonances in the ¹H NMR spectrum (4). Before initiating the β-sheet propensity studies, it was therefore necessary to "engineer" a homogeneous protein preparation. To do this, we took advantage of the substrate recognition preference of the processing enzyme, N-terminal methionine aminopeptidase. For cleavage to occur, a small amino acid is required at the second position following the N-terminal methionine (8). Substitution of a larger amino acid, Gln, for the Thr at position 2 in β1 prevented cleavage of the N-terminal methionine, yielding a homogeneous pool that retains the N-terminal methionine. All subsequent mutations were made in this "pseudo wild-type" background.

The opposite approach was taken with the expression of Rop. Gly was inserted at position 2, which resulted in the complete removal of the N-terminal methionine and again a homogeneous protein pool. However, it is sometimes possible to be too clever in modifying one's model system. The wild-type Rop protein has a six amino acid C-terminal "tail" which has been shown to be unnecessary for RNA binding and to be unstructured by both X-ray crystallography and NMR (5,7,9). We therefore truncated the protein at the end of helix 2 in order to work with only the well-defined four-helix-bundle. Although the resulting protein still bound RNA, it tended to precipitate even at low concentrations. Hence, the seemingly unimportant tail was actually critical for solubility. All subsequent mutations in the Rop protein retain this "solubility tail."
IV. Experimental Design

Careful experimental design is critical to the aim of isolating the parameter in question and minimizing other artifactual effects. Modeling the planned mutations using various software packages provides a sense of how those mutations will affect the stability or tertiary structure of the wild-type protein. However, modeling studies still have limited predictive capability, and there are often surprises when the redesigned protein is actually synthesized and characterized.
A. Repacking the Hydrophobic Core of a Four Helix Bundle

The formation of monomeric α-helices is relatively well understood; hence the challenge is to investigate how helices pack together to form tertiary structure. We have redesigned the hydrophobic core of Rop to contain a simple, repetitive sequence in order to understand more fully the requirements for packing helices in a four-helix-bundle protein. In the wild-type protein, the hydrophobic core is formed by eight "layers" of side chains perpendicular to the long axis of the bundle. As shown schematically in Figure 2, each layer is formed by two "a" and two "d" position residues, one from each of the four helices. The "a" position in one layer packs against the "d" positions in both the preceding and succeeding layers. Most of the "a" positions are occupied by small amino acids, such as Ala, while the "d" positions contain larger amino acids like Leu. In the repacked protein, Rop21, all eight layers have been simplified, each containing only Ala and Leu at the "a" and "d" positions, respectively. Analysis using the program X-PLOR (11) suggested that the new design was feasible. After minimization, the final energies of Rop21 and wild-type were comparable, and major displacements of atoms were not required to give a well-packed interior. While modeling can help to rule out seriously destabilizing designs, the need to actually make the mutations must be emphasized. This is especially important when altering hydrophobic interactions because energy functions that satisfactorily include solvation effects are still under development.
Figure 2 The residues of the helices of Rop may be conveniently specified by the "heptad repeat" nomenclature used for associating helices (10).
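The heptad nomenclature and the Ala/Leu repacking rule can be illustrated with a few lines of code. This is a sketch under stated assumptions: the placeholder sequence and the starting register are hypothetical (they are not the Rop sequence), and the helper names are invented for illustration.

```python
HEPTAD = "abcdefg"


def heptad_positions(n_residues, first_position="a"):
    """Label consecutive residues of a helix with heptad positions a-g."""
    offset = HEPTAD.index(first_position)
    return [HEPTAD[(offset + i) % 7] for i in range(n_residues)]


def repack_core(sequence, first_position="a", a_residue="A", d_residue="L"):
    """Place a_residue at every 'a' position and d_residue at every 'd' position."""
    positions = heptad_positions(len(sequence), first_position)
    repacked = []
    for residue, position in zip(sequence, positions):
        if position == "a":
            repacked.append(a_residue)
        elif position == "d":
            repacked.append(d_residue)
        else:
            repacked.append(residue)  # surface positions are left untouched
    return "".join(repacked)


# Hypothetical 14-residue helix segment; 'X' marks positions that are not changed.
segment = "X" * 14
print("".join(heptad_positions(len(segment))))
print(repack_core(segment))
```

Only the core-facing "a" and "d" positions are altered, which mirrors the design principle described above: the surface of the helices is left as in the wild-type while the interior is made uniform.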
B. β-Sheet Propensity Studies

In contrast to α-helix formation, the formation of β-sheet structure is poorly understood. Hence, we sought to determine the difference in the β-sheet forming propensities of the amino acids. Each amino acid was individually substituted into a single solvent exposed "guest" site in β1. The differences in stability between the mutants were compared and formed the basis of the ranking. To ensure that any difference in stability between the β1 variants reflects β-sheet propensities rather than artifactual interactions, the position and local environment of the guest site are of key importance. Position 53 was selected as the optimal guest site for the following reasons (Figure 3): First, its side chain extends from the face of the sheet that is opposite to the helix. Second, position 53 is found in the middle of a central β-strand, which provides the most homogeneously β-sheet environment available. Finally, the backbone amide and carbonyl groups of position 53 are hydrogen bonded to the adjacent strand. Residues that make hydrogen bonds with solvent may possess more conformational freedom, which could mask potentially small differences in conformational preferences. To attenuate interaction between the guest and neighboring sheet residues, small neutral side chains were included at positions surrounding the guest site. Computer modeling indicated that the mutations I6A and T44A would minimize host-guest interactions. Mutations at the guest site in this host environment will be referred to as β1X, where X is the one letter code for the guest residue.

Figure 3 Illustration of the guest site and surrounding host environment in the four-stranded β-sheet region of β1. Arrows indicate hydrogen bonds. The guest site is circled and highlighted in black. The nearest neighbor residues are also circled but highlighted in gray.
IV. Structural Controls

For a complete interpretation of any protein engineering results, it is important to show that the "folded state" of the proteins has not been significantly distorted from the wild-type, especially for mutations that are severely destabilizing. Preliminary tests for the creation of a folded mutant protein are overexpression and purification properties. The repacked Rop21 could be expressed and purified comparably to wild-type. β1 mutants expressed well, although overexpression of the two most destabilized β1 mutants was substantially reduced. Wild-type Rop and Rop21 had similar retention times on a gel filtration column, suggesting that the dimeric oligomerization state was not perturbed. The CD spectrum of Rop21 was similar to that of wild-type Rop.
Figure 4 Comparison of the backbone amide and α-proton chemical shifts (ppm) for the "pseudo wild-type," β1-T2Q, versus β1T, β1R and β1A. Axis: T2Q wild-type chemical shift (ppm).
Finally, the binding of Rop21 to a complex of two small target RNAs was about five times weaker than that of wild-type, but still strong enough to indicate that the overall structure of the mutant was not radically altered. NMR was used to verify the structural integrity of the β-strand regions of various β1 mutants. The characteristic sequential and long range NOEs of a β-sheet were observed. A comparison of the proton chemical shifts for β1-T2Q and various β1 mutants provided evidence that the global conformation of the mutants was unaltered (Figure 4). CD was used to characterize the two most destabilized β1 mutants, β1G and β1P, as their reduced overexpression prohibited structural characterization by NMR. Compared to β1-T2Q, the spectrum of β1P was quite perturbed, indicating that it was significantly unfolded (Figure 5). As a result, a meaningful ΔΔG for β1P could not be determined. For Rop21, we anticipate that crystallographic analysis would be the most useful high resolution technique. The large number of Ala and Leu residues in the core would result in degeneracy of the NMR spectrum and create ambiguities in the assignment. Crystallization trials are in progress.
Figure 5 Comparison of CD spectra of β1T, β1G and β1P at the guest site. Horizontal axis: wavelength (200-250 nm). Measurements were made at 10 °C.
V. Thermodynamic Characterization

Of equal importance to a structural characterization of the redesigned proteins is the determination of their stabilities. CD is a simple but sensitive probe that can be used to monitor the loss of secondary structure as a function of temperature. Requiring only small amounts of protein, CD allows a direct determination of melting temperature (Tm) and a calculation of the van't Hoff enthalpy of unfolding. Differential Scanning Calorimetry (DSC) can also be used to determine a protein's thermal denaturation curve. Because it monitors the actual absorption of heat by the sample, ΔH values can be calculated directly from integration of the unfolding transition peak. However, DSC requires significantly more protein than CD, which can be prohibitive if large amounts of pure protein are difficult to obtain.
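How a melting temperature and a van't Hoff enthalpy can be extracted from a CD melting curve is sketched below. The sketch assumes a reversible two-state transition, constant folded and unfolded baseline signals, and an enthalpy that does not vary with temperature across the transition; it is not the authors' fitting procedure. The data are synthetic, the enthalpy of 50 kcal/mol is a hypothetical input, and all function names are illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant, kcal/(mol*K)


def fraction_unfolded(signal, folded_signal, unfolded_signal):
    """Convert a CD signal into fraction unfolded using constant baselines."""
    return (signal - folded_signal) / (unfolded_signal - folded_signal)


def vant_hoff_fit(temps_c, fractions, lo=0.05, hi=0.95):
    """Least-squares line through ln K versus 1/T in the transition region.

    Returns (van't Hoff enthalpy in kcal/mol, Tm in deg C).
    """
    xs, ys = [], []
    for t_c, fu in zip(temps_c, fractions):
        if lo < fu < hi:                       # K is ill-defined near the baselines
            xs.append(1.0 / (t_c + 273.15))
            ys.append(math.log(fu / (1.0 - fu)))
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    enthalpy = -slope * R_KCAL                 # slope = -dH/R
    tm_c = -slope / intercept - 273.15         # ln K = 0 at T = Tm
    return enthalpy, tm_c


def simulate_fraction_unfolded(t_c, enthalpy_kcal, tm_c):
    """Synthetic two-state melting curve, used here only to exercise the fit."""
    t_k, tm_k = t_c + 273.15, tm_c + 273.15
    ln_k = -(enthalpy_kcal / R_KCAL) * (1.0 / t_k - 1.0 / tm_k)
    return 1.0 / (1.0 + math.exp(-ln_k))


temps = list(range(50, 80, 2))
curve = [simulate_fraction_unfolded(t, 50.0, 64.0) for t in temps]
enthalpy, tm = vant_hoff_fit(temps, curve)
print(f"van't Hoff enthalpy: {enthalpy:.1f} kcal/mol, Tm: {tm:.1f} deg C")
```

Because the fit needs points on both sides of the midpoint, it fails when the post-transition baseline cannot be reached, which is the practical reason a second technique such as DSC is useful for very stable variants.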
A. Rop

The characterization of Rop21 involved both CD and DSC to compare its thermostability with that of wild-type Rop11. Thermal denaturation monitored by CD was fully reversible for both wild-type Rop11 and repacked Rop21. The CD measurements showed that Rop21 was substantially more stable than wild-type Rop11, with a Tm that was increased by more than 20 °C. The Tm of Rop21 was so high that a post-transitional baseline could not be recorded using CD, and the van't Hoff enthalpy could not be calculated from these data. DSC measurements were therefore used to obtain a more complete description of the protein's thermodynamic characteristics. Figure 6 illustrates the unfolding transition of wild-type Rop11 and repacked Rop21. It is clear that Rop21 is substantially more stable than Rop11 and has a significant enthalpy associated with the denaturation transition, as would be expected for a well-packed, native-like protein. The calculated enthalpy and free energy of unfolding for the two proteins are summarized in Table I.
Figure 6 Sample DSC scans comparing the thermal denaturation of wild-type Rop11 and the repacked Rop21. Horizontal axis: temperature (20-120 °C). Both proteins were at a concentration of 4 mg/ml in 100 mM Na phosphate pH 7.0, 200 mM NaCl. Under these buffer conditions, the denaturation of both proteins was reversible.
Table I The melting temperatures (Tm) from the CD thermal denaturation studies are estimated from the temperature at which the slope of the first derivative of the uncorrected data was a minimum. t1/2 is the temperature of half-completion of the DSC thermal denaturation transition. ΔHcal is the calorimetric enthalpy. The DSC $\Delta\Delta G^{\circ}_{u} = \Delta G^{\circ}_{u}(\text{mutant}) - \Delta G^{\circ}_{u}(\text{wild-type})$, at the value of t1/2 for the wild-type protein (74.6°C).

Protein   Tm (°C)   t1/2 (°C)     ΔHcal (kcal/mol)   DSC ΔΔG°u (kcal/mol)
Rop11     64        74.6 ± 0.7    113.8 ± 7.6        -
Rop21     91        95.4 ± 0.5    94.3 ± 3.6         5.6
B. β1

Thermal denaturation monitored by CD was also used to determine the scale of β-sheet propensities. As shown in Figure 7, measurable differences were found between the β-sheet forming propensities of the amino acids. The thermal stabilities of the various mutants of β1 containing different guest site substitutions formed the basis of the thermodynamic scale generated. Table II shows this scale in comparison to the natural statistical distribution of amino acids occurring in β-sheet structure (12). In both cases, the best β-sheet formers tend to be the β-branched and aromatic residues. It is important to note that these trends emerge both from the averaging of different environments inherent in the statistical data and from the single solvent exposed site in this study. The range of the scale is ~2.5 kcal/mol (excluding Gly and Pro), which indicates that β-sheet propensities are an important consideration in protein stability.
Figure 7 Typical raw data from thermal denaturation of β1 mutants monitored at 222 nm at a concentration of ~0.5 mg/ml in 50 mM NaOAc pH 5.2. Horizontal axis: temperature (20-90 °C). Under these buffer conditions, the thermal denaturation of the wild-type and the mutants was shown to be reversible.
Table II. Summary of the β-sheet propensity data. Tm is the midpoint of the thermal denaturation transition. Also shown are the Pβ values for the probability of occurrence of each amino acid in β-sheet in proteins of known structure (12). A ΔΔG value is reported for each mutant that is calculated relative to ΔG for Ala: $\Delta\Delta G_{\beta 1X} = \Delta G_{\beta 1X} - \Delta G_{\beta 1A}$. This treatment assumes that, within the transition region, ΔH is independent of temperature. Accordingly, ΔΔG values are reported at a temperature that is within the transition region for all the mutants (60°C). For the standard, β1A, ΔG = -0.24 kcal/mol at 60°C. Experiments were performed in triplicate, and the results were averaged. The maximum error in the Tm is ± 0.4°C, and the error in ΔG is less than 5%.

GUEST RESIDUE    Tm (°C)    ΔΔG (kcal/mol)    Pβ
TYR              69.22      -1.63             1.31
THR              68.67      -1.36             1.33
ILE              67.78      -1.25             1.57
PHE              67.68      -1.08             1.23
TRP              65.73      -1.04             1.24
VAL              65.47      -0.94             1.64
SER              64.80      -0.87             0.94
MET              64.26      -0.90             1.01
CYS              63.99      -0.78             1.07
LEU              62.47      -0.45             1.17
ARG              62.41      -0.40             0.94
ASN              61.88      -0.52             0.66
HIS              60.96      -0.37             0.83
GLN              60.90      -0.38             1.00
LYS              60.65      -0.35             0.73
GLU              58.81      -0.23             0.51
ALA              57.05       0                0.79
ASP              50.91       0.85             0.66
GLY              45.95       1.21             0.87
PRO              <10         ND               0.62
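The quoted range of the scale (about 2.5 kcal/mol, excluding Gly and Pro) can be checked directly against the ΔΔG column of Table II. The following is a minimal sketch, assuming the values are transcribed correctly from the table; the dictionary and function names are illustrative and not part of the original work.

```python
# ΔΔG values (kcal/mol) transcribed from Table II.
# Pro is left out because its value is reported as "ND".
DDG_KCAL = {
    "Tyr": -1.63, "Thr": -1.36, "Ile": -1.25, "Phe": -1.08, "Trp": -1.04,
    "Val": -0.94, "Ser": -0.87, "Met": -0.90, "Cys": -0.78, "Leu": -0.45,
    "Arg": -0.40, "Asn": -0.52, "His": -0.37, "Gln": -0.38, "Lys": -0.35,
    "Glu": -0.23, "Ala": 0.00, "Asp": 0.85, "Gly": 1.21,
}


def scale_range(values, exclude=()):
    """Difference between the largest and smallest ΔΔG among the kept residues."""
    kept = [v for name, v in values.items() if name not in exclude]
    return max(kept) - min(kept)


range_all = scale_range(DDG_KCAL)                       # includes Gly
range_no_gly = scale_range(DDG_KCAL, exclude={"Gly"})   # Gly and Pro both excluded

print(f"range including Gly:        {range_all:.2f} kcal/mol")
print(f"range excluding Gly and Pro: {range_no_gly:.2f} kcal/mol")
```

With Gly excluded, the extremes are Tyr and Asp, which is where the ~2.5 kcal/mol figure comes from.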
VI. Conclusion
A. Repacking the Hydrophobic Core
We have created a completely repacked protein by repeating a simple heptad motif throughout the hydrophobic core of a four-helix-bundle protein. The repacked protein shows significantly increased thermostability relative to the wild-type protein. Clearly, protein engineering can "improve" on the natural design, perhaps because the natural protein has evolved to optimize its function rather than its stability. It is important to note that the repacked protein retains native-like properties: It binds RNA, exhibits a cooperative two-state unfolding transition both thermally and chemically, and shows a significant enthalpy of denaturation by DSC. These characteristics clearly distinguish it from the "molten globule"-like molecules that have resulted from other attempts at protein design. Future designs will explore the limits of the repacked core, incorporating both larger and smaller residues, as well as different packing shapes and patterns. The results from these studies will help to define the rules for the packing of four-helix-bundles.
B. β-Sheet Propensities
Initially, we thought there might not be measurable differences in the β-sheet propensities because extended β-sheet structure is much more similar to the unfolded state than the constrained α-helix. Hence, the β-sheet might be merely a "default" structure into which any amino acid may fit, and the Chou-Fasman statistical parameters (12) may therefore result from the strict conformational requirements of α-helices alone. Because β-sheets are more often found fully buried than are α-helices, the statistical distribution of the hydrophobic, β-branched residues in β-sheets may reflect only a hydrophobic requirement rather than a β-sheet forming propensity. From the β1 studies, however, it is clear that β-sheet propensities are indeed measurable. Hence, the β-sheet is not a default structure and has certain conformational requirements which must be satisfied. Even in a solvent-exposed guest site, the β-branched and aromatic residues still tend to be the best β-sheet formers. Apparently, the statistical frequencies reflect the high propensity of these amino acids for a β-sheet conformation. The range of free energies between the best and worst β-sheet formers is comparable to that found for α-helices (~1 to 2.5 kcal/mol) (14,15,16,17), indicating that β-sheet propensities are at least as important as α-helix propensities in contributing to protein stability. The range determined in this study (2.5 kcal/mol) is much larger than that determined by Kim and Berg (18) using a zinc-finger peptide host (0.2 kcal/mol). The location of their guest site in an edge strand may have accommodated poor β-sheet forming residues more readily and attenuated the range determined. The rank order of β-sheet preferences obtained here clearly shows a general correlation with the statistical values of β-sheet formation derived from surveys of proteins of known structure as well as with measurements of β-sheet propensities in other systems (18,19).
However, Chou-Fasman probability values (12) are averaged over all possible β-sheet environments, such as middle and edge positions as well as partially and fully hydrogen bonded positions. In contrast, our β-sheet propensity measurements were made in a single β-sheet environment. Hence, the lack of a perfect correlation between our experimentally measured rankings and the Chou-Fasman probabilities is not unexpected and may reflect both the influence of the statistical averaging and also the small differences between the sheet forming propensity of the amino acids in the middle range. The similarities between the different experimentally measured scales for the β-sheet propensities demonstrate that there are intrinsic differences in how readily each amino acid can be accommodated in a β-sheet. The lack of a one-to-one correspondence between the different scales (1,18,19) reflects, as would be expected, the influence of local interactions which modulate the individual ranking of the amino acids within the overall trends. Future studies will investigate the structural basis of the observed differences in β-sheet propensities and explore how these differences can be modulated by local environmental effects.
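The "general correlation" between the experimental rank order and the statistical Pβ values can be illustrated with a rank correlation coefficient. The sketch below is only an illustration, assuming the Table II values are transcribed correctly; the Spearman coefficient is our choice of statistic and is not an analysis reported by the authors. Pro is omitted because its ΔΔG was not determined.

```python
# (residue, ΔΔG in kcal/mol, Pβ) transcribed from Table II; Pro omitted (ΔΔG is "ND").
TABLE_II = [
    ("Tyr", -1.63, 1.31), ("Thr", -1.36, 1.33), ("Ile", -1.25, 1.57),
    ("Phe", -1.08, 1.23), ("Trp", -1.04, 1.24), ("Val", -0.94, 1.64),
    ("Ser", -0.87, 0.94), ("Met", -0.90, 1.01), ("Cys", -0.78, 1.07),
    ("Leu", -0.45, 1.17), ("Arg", -0.40, 0.94), ("Asn", -0.52, 0.66),
    ("His", -0.37, 0.83), ("Gln", -0.38, 1.00), ("Lys", -0.35, 0.73),
    ("Glu", -0.23, 0.51), ("Ala", 0.00, 0.79), ("Asp", 0.85, 0.66),
    ("Gly", 1.21, 0.87),
]


def average_ranks(values):
    """Ascending ranks (1-based); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    sd_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


ddg = [row[1] for row in TABLE_II]
p_beta = [row[2] for row in TABLE_II]
rho = spearman(ddg, p_beta)
print(f"Spearman rho (ΔΔG vs Pβ) = {rho:.2f}")
```

Because a more negative ΔΔG marks a better β-sheet former while a larger Pβ marks a more frequent one, agreement between the two scales shows up as a strongly negative coefficient, not a perfect one.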
Acknowledgments
We would like to thank D. Crothers, J. Marino and R. Gregorian for the development of the Rop RNA-binding assay and for helpful discussions. Thanks to E. Kim, G. Liu, A. Nagi, P. Predki, M. Klemba, S. Marino and G. Schoenhals for critical reading of the manuscript. We thank R. O'Brien and J. Sturtevant for performing the calorimetric experiments. We would like to thank J. Withka and K. Gardner for their invaluable assistance in performing the NMR experiments. L.R. is an NSF National Young Investigator. C.K.S. and M.M. are NIH predoctoral trainees. This work was supported in part by NSF grant MCB9316863 and NIH grant GM49149-01A1.
References
1. Munson, M., O'Brien, R., Sturtevant, J.M., and Regan, L. (1994) Protein Science, in press.
2. Smith, C.K., Withka, J.M., and Regan, L. (1994) Biochemistry 33, 5510-5517.
3. Fahnestock, S.R., et al. (1986) J. Bact. 167, 870-880.
4. Gronenborn, A.M., et al. (1991) Science 253, 657-661.
5. Banner, D.W., Kokkinidis, M., and Tsernoglou, D. (1987) J. Mol. Biol. 196, 657-675.
6. Derrick, J.P. and Wigley, D.B. (1992) Nature 359, 752-754.
7. Eberle, W., et al. (1991) J. Biomol. NMR 1, 71-82.
8. Boissel, J., Kasper, T.J., and Bunn, H.F. (1988) J. Biol. Chem. 263, 8443-8449.
9. Castagnoli, L., et al. (1989) EMBO J. 8, 621-629.
10. Cohen, C. and Parry, D.A.D. (1986) TIBS 11, 245-248.
11. Brünger, A.T., X-plor Manual Version 3.1 (Yale University Press, 1992).
12. Chou, P.Y. and Fasman, G.D. (1978) Adv. Enzymol. 47, 45-148.
13. Pace, C.N. (1986) Meth. Enzymol. 131, 266-280; Shortle, D., Meeker, A.K. (1986) Proteins: Struct. Funct. Genet. 1, 81-89.
14. Padmanabhan, S., et al. (1990) Nature 344, 268-270.
15. Lyu, P.C., et al. (1990) Science 250, 669-671.
16. O'Neil, K.T. and DeGrado, W.F. (1990) Science 250, 646-651.
17. Blaber, M., Zhang, X-J., and Matthews, B.W. (1993) Science 260, 1637-1640.
18. Kim, C.A. and Berg, J.M. (1993) Nature 362, 267-270.
19. Minor, D.L. and Kim, P.S. (1994) Nature 367, 660-663.
Circular Permutation of RNase T1 Through PCR Based Site-Directed Mutagenesis
Jane M. Kuo, Leisha S. Mullins, James B. Garrett, and Frank M. Raushel
Department of Chemistry, Texas A&M University, College Station, TX 77843
I. Introduction
The prediction of the three-dimensional structure of a folded protein from its primary sequence has long been of interest to biological chemists studying the relationship between protein structure and function. One approach towards a better understanding of this problem involves studying the effects that rearrangements in the primary amino acid sequence will have on the kinetics and thermodynamics of a protein-folding pathway. Goldenberg and Creighton were the first to circularly permute a protein through chemical coupling methods on purified bovine pancreatic trypsin inhibitor (BPTI) (1). The circular permutation of a protein at the genetic level was pioneered by Kirschner's laboratory, which characterized variants of phosphoribosyl anthranilate isomerase (PRAI) and dihydrofolate reductase (DHFR) (2,3). More recently, a circularly permuted variant of T4 lysozyme has also been characterized (4). Current methods for creating circularly permuted variants through alterations in the genetic sequence of a protein are laborious and involve a large number of recombinant DNA manipulations. Our laboratory has developed a much simpler procedure through the use of the polymerase chain reaction (PCR). The method involves amplifying two separate segments of a gene by PCR and combining them in reverse order in a third PCR step (Figure 1). This general method of creating circularly permuted proteins through the use of PCR technology can be applied to any cloned gene and, furthermore, can be extended to make even more complex rearrangements of protein sequences. We have applied this general technique, which requires only four oligonucleotide primers and three PCR steps, to construct a circularly permuted variant of ribonuclease T1 (RNase T1).
Figure 1. General PCR protocol for the creation of circularly permuted proteins. [Schematic not reproduced. Labels: primers cpA, cpB, linker A, and linker B; original start and stop; residues 1, x, x+1, and n; new start; new stop; linker codons. Steps: two PCR amplifications, hybridization of the fragments, then PCR and religation.]
RNase T1 provides an ideal model for investigating the effects of a circularly permuted sequence on protein folding and stability, since the folding pathway of the native oxidized form of this well characterized 104 residue α,β globular protein has already been determined using the hydrogen-deuterium amide exchange 2D NMR technique (6,7). The three-dimensional structure of RNase T1 as determined by x-ray crystallography is also known to 1.5 Å resolution, and the complete NMR spectrum has previously been assigned (8,9). The kinetics and thermodynamics of the native RNase T1, which possesses two disulfide bonds, have been well characterized (10,11). Initial characterizations of both a mutated form of RNase T1, in which the 2-10 disulfide bond has been eliminated, and a circularly permuted variant of RNase T1 suggest that the two mutant proteins fold to conformational states similar to that of the native enzyme, since both forms maintain partial activity towards the catalytic hydrolysis of RNA. Thermodynamic studies indicate that the mutant forms are less stable than the native protein.
II. Materials and Methods
A. Materials
The plasmid pMac5-8 and the Escherichia coli strain WK6 were generous gifts from Dr. C. N. Pace (Texas A&M University). All oligonucleotides used for mutagenesis and sequencing were synthesized by the Gene Technology Laboratory of the Biology Department at Texas A&M University. Restriction enzymes and Magic Miniprep and Magic PCR Prep DNA purification kits were purchased from Promega Corp. GeneClean DNA purification kits were purchased from Bio 101, Inc. Taq polymerase and GeneAmp PCR reagent kits were purchased from Perkin-Elmer-Cetus. Sequenase Version 2.0 sequencing kits were purchased from United States Biochemical Corp. Urea, hen egg lysozyme, and Type II-C ribonucleic acid core were purchased from Sigma.
B. Construction of Silent StyI Restriction Site
The gene for RNase T1 is encoded in the plasmid pMac5-8, immediately downstream from the signal peptide portion of the alkaline phosphatase gene, phoA (12), and the protein is expressed as a phoA-RNase T1 precursor which undergoes posttranslational cleavage of the leader peptide. The pMac5-8 plasmid contains a unique HindIII restriction site immediately following the termination codon for the RNase T1 gene and a unique EcoRI restriction site upstream from the promoter region. In order to facilitate the transposition of segments of the RNase T1 gene without involving other portions of the genome, a unique and silent StyI restriction site was inserted into the phoA leader sequence using the overlap extension PCR method for site-directed mutagenesis (13). The four primers used for this procedure, primer A, StyI-B, StyI-C, and primer D2, were designed such that primer A and primer D2 flank the RNase T1 gene and anneal upstream of the EcoRI site and downstream of the HindIII site, respectively. StyI-B and StyI-C are complementary oligonucleotides and contain the base changes necessary for creating the silent StyI site. Using the pMac5-8 plasmid as the template, two PCR amplifications were performed, one with primer A and StyI-B and one with primer D2 and StyI-C. The fragments generated in these first two PCR amplifications, fragments AB and CD, were combined in a third PCR step to generate a full length AD fragment containing the newly created StyI site within the phoA-RNase T1 gene flanked by the EcoRI and HindIII sites. This fragment was then cloned into the pMac5-8 plasmid between the EcoRI and HindIII sites, replacing the original phoA-RNase T1 gene and creating the new plasmid, pKW01.
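The way fragments AB and CD combine in the third PCR step can be sketched as a string operation: the two fragments share the mutated region, so the full-length product is the first fragment followed by the non-overlapping remainder of the second. This is a simplified, single-strand illustration only. The sequences below are made-up toy strings, not the primer or gene sequences used in the study, and `join_by_overlap` is a hypothetical helper.

```python
def join_by_overlap(left, right, min_overlap=4):
    """Join two top-strand sequences through their shared region.

    Finds the longest suffix of `left` that is also a prefix of `right`
    and returns the merged sequence containing the overlap only once.
    """
    longest_possible = min(len(left), len(right))
    for size in range(longest_possible, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    raise ValueError("fragments share no overlap of the required length")


# Toy fragments (hypothetical): both carry the same 8-nt "mutated" region.
fragment_ab = "ATGGCTAGCCAAGGTT"   # upstream fragment, ends in the shared region
fragment_cd = "CCAAGGTTGACTGA"     # downstream fragment, starts with the shared region

fragment_ad = join_by_overlap(fragment_ab, fragment_cd)
print(fragment_ad)
```

In the real reaction the overlap arises because the two inner primers are complementary to each other; the sketch ignores strand complementarity and polymerase extension and keeps only the bookkeeping of which sequence ends up where.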
C. Construction of (C2A, C10A) Mutant
The pKW01 plasmid was used as the template to create a Cys-2→Ala, Cys-10→Ala double mutant for the purpose of eliminating one of the disulfide bonds in RNase T1. It was possible to accomplish this in one PCR step, because the two mutation sites were close enough to be encompassed by one mutagenic primer, C2,10A, which contained the base changes necessary for replacing the 2 and 10 cysteines with alanines. Primer D2 was used as the corresponding PCR primer. The resulting fragment was cloned into the pKW01 plasmid between the StyI and HindIII sites, replacing the gene for the native RNase T1 with the gene for the (C2A, C10A) mutant. This plasmid was named pKW02.
D. Construction of Circularly Permuted cp35S1 Mutant
The segment between the StyI and HindIII sites of the pKW02 plasmid was restricted and purified for use as the template in the first two PCR steps towards the creation of a circularly permuted variant of RNase T1. The four primers required for this procedure have been designated cpA, cpB, linker A, and linker B. The sequence (5'→3') of the cpB primer consisted of a portion of the phoA leader sequence containing the newly engineered StyI site, followed by the codon for an extra alanine and the codons for Ser-35 through His-40 of RNase T1. An extra alanine was included as the first residue for the new protein to ensure the proper processing of the mature protein from the phoA fusion product. The linker B primer sequence (3'→5') consisted of the complements to the codons for Gly-97 through Thr-104, a Gly-Pro-Gly linker, and Ala-1 through Ala-2. Primers cpB and linker B were combined in the first PCR step to create a fragment consisting of the first half of the gene for the circularly permuted variant. The second PCR step, which was performed in tandem with the first, utilized primers cpA and linker A to create a fragment consisting of the second half of the gene for the circularly permuted variant of RNase T1. The sequence (5'→3') of the linker A primer consisted of the codons for Cys-103 through Thr-104, the Gly-Pro-Gly linker, and Ala-1 through Ser-8 of the (C2A, C10A) double mutant. The cpA primer sequence (3'→5') consisted of the complements to the codons for Asp-29 through Gly-34 and a termination codon, followed by a HindIII site. The two fragments created in these first two PCR steps were purified and combined in a third PCR amplification to form a full length gene for the circularly permuted variant of RNase T1. Since the two fragments shared a portion of the same sequence for the Gly-Pro-Gly linker region, they acted as both primer and template for each other.
Extra cpA and cpB primers were added to further amplify the final product. This circularly permuted gene was cloned into pKW02 between the StyI and HindIII sites, yielding the new plasmid, pKW03. The circularly permuted protein (cp35S1) encoded by the new gene consists of 108 amino acid residues, beginning with an alanine, followed by Ser-35 through Thr-104 of the original protein, the Gly-Pro-Gly linker, and Ala-1 through Gly-34 of the (C2A, C10A) double mutant. All three constructions, contained in pKW01, pKW02, and pKW03, were completely sequenced to ensure that no unwanted mutations were incorporated during the PCR amplification steps. The construction of these plasmids from pMac5-8 is illustrated in Figure 2.
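The residue order of cp35S1 described above can be written out as a small bookkeeping sketch. To avoid quoting a sequence that is not given in the text, the original chain is represented only by its residue numbers (1 to 104); the function name and its parameters are hypothetical.

```python
def circular_permute(n_residues, new_start, linker, n_term_extra=()):
    """Residue order of a circularly permuted chain.

    Original residues are represented by their position numbers.
    The new chain is: extra N-terminal residues, then new_start..n_residues,
    then the linker, then 1..new_start-1.
    """
    original = list(range(1, n_residues + 1))
    head = original[new_start - 1:]    # new_start through the old C-terminus
    tail = original[:new_start - 1]    # old N-terminus through new_start - 1
    return list(n_term_extra) + head + list(linker) + tail


# cp35S1 as described in the text: Ala, Ser-35..Thr-104, Gly-Pro-Gly, Ala-1..Gly-34.
cp35s1 = circular_permute(
    n_residues=104,
    new_start=35,
    linker=["Gly", "Pro", "Gly"],
    n_term_extra=["Ala"],
)

print(len(cp35s1))             # total number of residues in the permuted chain
print(cp35s1[:3], cp35s1[-2:])  # start and end of the new chain
```

The count works out as 1 extra alanine + 70 residues (35 to 104) + 3 linker residues + 34 residues (1 to 34), which matches the 108 residues stated for cp35S1.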
E. Enzyme Purification and Assay
The RNase T1, (C2A, C10A), and cp35S1 proteins were expressed in E. coli WK6 cells transformed with the plasmids pKW01, pKW02, and pKW03, respectively. The enzymes were purified according to the method described by Shirley and Laurents (14). Specific activities were determined by the continuous assay method described by Oshima et al. (15). Native and SDS polyacrylamide gels were run at room temperature according to the procedures described by Pace and Creighton (16).

Figure 2. Construction of plasmids pKW01, pKW02, and pKW03 from pMac5-8. [Schematic not reproduced. pMac5-8 (EcoRI, PhoA, RNase T1, HindIII) is converted by mutagenesis to pKW01 (added StyI site), by mutagenesis to pKW02 (C2A, C10A), and by circular permutation to pKW03 (cp35S1 with the Gly-Pro-Gly linker).]
F. Thermodynamic Measurements
Urea denaturation curves were determined by measuring the intrinsic fluorescence intensity (278 nm excitation and 320 nm emission) of solutions containing approximately 0.9 µM protein in sodium acetate/acetic acid, pH 5.0 buffer and increasing concentrations of urea in a temperature-regulated Perkin-Elmer MPF 44 B spectrophotometer. Solutions were incubated at 25 °C for 24 h before measurements were taken. The free energy of unfolding was calculated by the linear extrapolation method (17). The error in ΔG was ± 0.6 kcal/mol.
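The linear extrapolation method can be sketched as follows: within the transition region, a two-state equilibrium constant is computed from the signal, converted to ΔG, and ΔG is fitted to a straight line in urea concentration, ΔG = ΔG(H2O) - m[urea]. The example below is a minimal sketch on synthetic data, not the authors' analysis. The flat baselines (100 and 20 arbitrary units) and the urea points are hypothetical, and the input values of ΔG(H2O) = 10.1 kcal/mol and m = 1.6 kcal mol⁻¹ M⁻¹ are the wild-type figures reported in the Results, used here only to show that the fit recovers them.

```python
import math

R_KCAL = 1.987e-3   # gas constant, kcal mol^-1 K^-1
TEMP_K = 298.15     # 25 °C, the incubation temperature given in the text


def simulated_signal(urea, dg_water, m_value, y_folded, y_unfolded, temp_k=TEMP_K):
    """Two-state signal at a given urea concentration (flat baselines assumed)."""
    dg = dg_water - m_value * urea
    k_unfold = math.exp(-dg / (R_KCAL * temp_k))
    return (y_folded + y_unfolded * k_unfold) / (1.0 + k_unfold)


def delta_g_from_signal(y, y_folded, y_unfolded, temp_k=TEMP_K):
    """Free energy of unfolding from an observed signal in the transition region."""
    k_unfold = (y_folded - y) / (y - y_unfolded)
    return -R_KCAL * temp_k * math.log(k_unfold)


def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical baselines and urea points chosen inside the transition region.
Y_FOLDED, Y_UNFOLDED = 100.0, 20.0
urea_molar = [5.6, 5.9, 6.2, 6.5, 6.8, 7.1]

signals = [simulated_signal(c, 10.1, 1.6, Y_FOLDED, Y_UNFOLDED) for c in urea_molar]
dg_values = [delta_g_from_signal(y, Y_FOLDED, Y_UNFOLDED) for y in signals]

slope, intercept = linear_fit(urea_molar, dg_values)
dg_water = intercept        # ΔG extrapolated to zero urea, kcal/mol
m_value = -slope            # kcal mol^-1 M^-1
midpoint = dg_water / m_value

print(f"ΔG(H2O) = {dg_water:.2f} kcal/mol, m = {m_value:.2f} kcal/mol/M, "
      f"midpoint = {midpoint:.2f} M urea")
```

With real data the pre- and post-transition baselines usually slope with urea concentration and must be fitted as well; the flat baselines here keep the sketch short.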
Figure 3. Ribbon diagram for the structure of native RNase T1 (A) and the circularly permuted cp35S1 (B) as modeled with Insight II from BIOSYM. [Diagram not reproduced; the Gly-Pro-Gly linker is labeled in panel B.]

G. Mass Spectrometry
The molecular masses of the RNase T1 proteins were determined by matrix-assisted laser desorption ionization (MALDI) mass spectrometry on a home-built spectrometer constructed in the laboratory of Dr. D. H. Russell (Dept. of Chemistry, Texas A&M University). The matrix solution was α-cyano-4-hydroxycinnamic acid (15 mg/mL) in methanol. Samples were ionized in a 20 kV electric field. The instrument was calibrated with hen egg lysozyme (1 mg/mL) in 50% methanol. Samples were prepared with a final protein concentration of 1-2 mg/mL and internally standardized with hen egg lysozyme (2 mg/mL, 50% methanol).
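The calculated masses that the MALDI measurements are compared against (Table I) can be cross-checked from the sequence changes alone. The sketch below starts from the calculated native mass of 11,085 Da given in Table I and applies average residue masses. Two things are assumptions on our part: the residue masses are standard average values, not numbers from the text, and the native calculated mass is assumed to account for both disulfide bonds, so that removing the Cys-2/Cys-10 bridge returns two hydrogens.

```python
# Average residue masses in Da (standard values; an assumption, not from the text).
AVG_RESIDUE_MASS = {"Ala": 71.0788, "Cys": 103.1388, "Gly": 57.0519, "Pro": 97.1167}
HYDROGEN = 1.00794

NATIVE_CALC = 11085.0   # calculated mass of native RNase T1 from Table I


def c2a_c10a_mass(native_mass):
    """Replace two cysteines with alanines and remove one disulfide bond."""
    swap = 2 * (AVG_RESIDUE_MASS["Ala"] - AVG_RESIDUE_MASS["Cys"])
    return native_mass + swap + 2 * HYDROGEN


def cp35s1_mass(double_mutant_mass):
    """Add the extra N-terminal Ala and the Gly-Pro-Gly linker.

    Circular permutation itself only reorders residues, so the mass change
    comes from the four added residues.
    """
    added = ["Ala", "Gly", "Pro", "Gly"]
    return double_mutant_mass + sum(AVG_RESIDUE_MASS[r] for r in added)


double_mutant = c2a_c10a_mass(NATIVE_CALC)
permuted = cp35s1_mass(double_mutant)

print(f"(C2A, C10A): {double_mutant:.1f} Da")
print(f"cp35S1:      {permuted:.1f} Da")
```

Under these assumptions the two values round to the calculated masses listed in Table I for the (C2A, C10A) and cp35S1 proteins.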
H. Amino-Terminal Sequencing
The amino-terminal sequences of the (C2A, C10A) and cp35S1 mutants were determined with an Applied Biosystems 470A sequencer in the Biology Support Laboratory at Texas A&M University.
III. Results and Conclusion
The three-dimensional structure of the native RNase T1 is shown in Figure 3A. A major and a minor loop are formed due to the two disulfide bridges (Cys-6 and Cys-103, Cys-2 and Cys-10). Since it has previously been shown that breaking the disulfide bond forming the smaller loop (Cys-2 and Cys-10) results in a protein that remains folded in a conformation very similar to that of the native enzyme (16), it was decided to remove this disulfide bridge in order to introduce greater flexibility at the amino terminus so that a peptide linker of only three amino acids would be sufficient to bridge the gap between the N- and C-terminal ends of RNase T1 (18). The Gly-Pro-Gly linker was chosen because these residues have a relatively high propensity to accommodate β-turns (19). Ser-35 and Gly-34 were chosen as the new N- and C-terminal ends for the circularly permuted variant, because they are located in an exposed loop that has not been implicated in either substrate binding or catalytic activity (6). The serine at the new amino terminus is preceded by an alanine for correct polypeptide expression and processing. A computer-generated model of the circularly permuted variant of RNase T1, cp35S1, is shown in Figure 3B.

The native RNase T1, (C2A, C10A), and cp35S1 proteins were characterized by specific activity, polyacrylamide electrophoresis (SDS and native), thermodynamic stability, matrix-assisted laser desorption ionization (MALDI) mass spectrometry, and amino-terminal sequencing. Results are shown in Table I.

Table I. Comparison of Wild-type and Mutant Variants of RNase T1

Protein        Specific Activity(a)   Molecular Mass (calc.)(b)   Molecular Mass (obs.)(c)   Stability (urea, pH 5.0)   N-terminal Sequence
               (U/mg)                 (daltons)                   (daltons)                  (kcal/mol)
RNase T1       19,000                 11,085                      11,088                     10.1                       ACDYTXGSNCYS
(C2A, C10A)    17,000                 11,023                      11,004                     6.4                        AADYTXGSNAYS
cp35S1         5,400                  11,305                      11,303                     4.2                        ASNSYPHKYNNY

(a) From the continuous assay described by Oshima et al. Units are defined as 0.01 change in absorbance at 298.5 nm/min at 37 °C.
(b) Calculated molecular masses are based on average molecular weight for the individual amino acids.
(c) Represents the molecular mass in the +1 protonation state.

The (C2A, C10A) and cp35S1 mutants were 89% and 28% as active as RNase T1, respectively, in hydrolyzing RNA. Electrophoresis on 15% native gels indicated that the larger cp35S1 migrated more slowly than RNase T1 and (C2A, C10A). Urea denaturation studies indicated that the thermodynamic stabilities of (C2A, C10A) and cp35S1 were reduced by 3.7 and 5.9 kcal/mol, respectively, relative to the native RNase T1. The m values for wild-type RNase T1, (C2A, C10A), and cp35S1 were 1600, 1500, and 1650 cal mol⁻¹ M⁻¹, respectively. The observed molecular weights as determined by mass spectrometry for all three proteins were consistent with the predicted values based on amino acid composition. N-terminal protein sequencing verified the initial sequences of both the (C2A, C10A) and cp35S1 mutants. The specific activity and thermodynamic stability of the (C2A, C10A) mutant confirm that the Cys-2 to Cys-10 disulfide bond imparts thermodynamic stability but has little effect on catalytic activity. Hence this mutant was selected as the starting point for constructing a circularly permuted form of RNase T1 so that as short a linker as possible could be used to bridge the original N- and C-termini.

The activity and stability of the circularly permuted variant indicate that it adopts an overall tertiary fold very similar to that of the native protein. Therefore, transposing the first 34 residues to the C-terminus has little effect on the overall folding to the final tertiary structure. The real effect, however, may be more evident in the kinetics of the specific folding pathway. We have demonstrated the utility of a general PCR approach for the creation of circularly permuted proteins through the initial characterization of cp35S1. This circularly permuted variant of RNase T1 can be expressed and purified in a catalytically active form. Multiple circularly permuted analogs of the same protein can easily be made by varying the cpA and cpB primers. Conversely, multiple peptide linkers can be designed by varying the linker A and linker B primers. These approaches will be used to make other circularly permuted variants of RNase T1 in order to investigate the impact of specific alterations in the amino acid sequence on the protein-folding pathways.
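The relative activities and stability losses quoted in the text follow directly from the Table I entries. A minimal sketch of that arithmetic, using only the specific activities and urea stabilities from the table (the dictionary layout and key names are illustrative):

```python
# Specific activity (U/mg) and stability (kcal/mol, urea, pH 5.0) from Table I.
TABLE_I = {
    "RNase T1":    {"activity": 19000, "stability": 10.1},
    "(C2A, C10A)": {"activity": 17000, "stability": 6.4},
    "cp35S1":      {"activity": 5400,  "stability": 4.2},
}

reference = TABLE_I["RNase T1"]

summary = {}
for protein, row in TABLE_I.items():
    summary[protein] = {
        # activity as a percentage of the native enzyme
        "relative_activity_percent": 100.0 * row["activity"] / reference["activity"],
        # loss of conformational stability relative to the native enzyme
        "stability_loss_kcal": reference["stability"] - row["stability"],
    }

for protein, values in summary.items():
    print(f"{protein:12s} {values['relative_activity_percent']:6.1f} %   "
          f"{values['stability_loss_kcal']:4.1f} kcal/mol")
```

The percentages in the text correspond to these ratios with the fractional part dropped.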
References
1. Goldenberg, D. P., and Creighton, T. E. (1984) J. Mol. Biol. 179, 527-545.
2. Luger, K., Hommel, U., Herold, M., Hofsteenge, J., and Kirschner, K. (1989) Science 243, 206-243.
3. Buchwalder, A., Szadkowski, H., and Kirschner, K. (1992) Biochemistry 31, 1621-1630.
4. Zhang, T., Bertelsen, E., Benvegnu, D., and Alber, T. (1993) Biochemistry 32, 12311-12318.
5. Mullins, L. S., Wesseling, K., Kuo, J. M., Garrett, J. B., and Raushel, F. M. (1994) JACS 116, 5529-5533.
6. Pace, C. N., Heinemann, U., Hahn, U., and Saenger, W. (1991) Angew. Chem. 30, 343-454.
7. Mullins, L. S., Pace, C. N., and Raushel, F. M. (1993) Biochemistry 32, 6152-6156.
8. Martinez-Oyanedel, J., Heinemann, U., and Saenger, W. (1991) J. Mol. Biol. 222, 335-352.
9. Hoffman, E., and Ruterjans, H. (1988) Eur. J. Biochem. 111, 539-560.
10. (a) Shirley, B. A., Stanssens, P., Hahn, U., and Pace, C. N. (1992) Biochemistry 31, 725-732. (b) Pace, C. N. (1990) TIBS 15, 14-17.
11. (a) Kiefhaber, T., Quaas, R., Hahn, U., and Schmid, F. X. (1990) Biochemistry 29, 3053-3061. (b) Kiefhaber, T., Quaas, R., Hahn, U., and Schmid, F. X. (1990) Biochemistry 29, 3061-3070. (c) Kiefhaber, T., Grunert, H.-P., Hahn, U., and Schmid, F. X. (1990) Biochemistry 29, 7475-6480. (d) Kiefhaber, T., Grunert, H.-P., Hahn, U., and Schmid, F. X. (1992) Proteins: Struct. Funct. Genet. 12, 171-178. (e) Kiefhaber, T., Schmid, F. X., Willaert, K., Engelbroughs, Y., and Chaffottee, A. (1992) Protein Sci. 1, 1162-1166.
12. Quaas, R., McKeown, Y., Stanssens, P., Frank, R., Blocker, H., and Hahn, U. (1988) Eur. J. Biochem. 273, 617-622.
13. Ho, S. N., Hunt, H. D., Horton, R. M., Pullen, J. K., and Pease, L. R. (1989) Gene 11, 51-59.
14. Shirley, B. A., and Laurents, D. V. (1990) J. Biochem. Biophys. Methods 20, 181-188.
15. Oshima, T., Uenishi, N., and Imahori, K. (1976) Anal. Biochem. 71, 632-634.
16. (a) Pace, C. N., and Creighton, T. E. (1986) J. Mol. Biol. 188, 477-486. (b) Pace, C. N., Grimsley, G. R., Thomson, J. A., and Barnett, B. J. (1988) J. Biol. Chem. 263, 11820-11825.
17. Santoro, M. M., and Bolen, D. W. (1988) Biochemistry 27, 8063-8068.
18. Distances were determined by molecular modeling with Insight II from BIOSYM using the coordinates from the crystal structure determined by Martinez-Oyanedel et al. (7).
19. Creighton, T. E. (1993). In "Proteins-Structure and Molecular Principles" (W. H. Freeman and Company, New York) 201-259.
E. coli-Expressed Human Neurotrophin-3: Characterization of a C-Terminal Extended Product
John O. Hui*, Shi-Yuan Meng, Vishwanatham Katta, Larry Tsai, Michael F. Rohde, and Mitsuru Haniu*
Amgen Inc., Thousand Oaks, CA 91320
I. INTRODUCTION
In the development of proteins as therapeutic agents, it is crucial to ensure the homogeneity of the potential products. However, a survey of the current literature has indicated that overexpression of recombinant proteins in E. coli often leads to translational errors as well as post-translational modifications (1). Acetylation of the ε-amino group of lysyl residues in bovine somatotropin has been documented (2). Substitution of methionine by norleucine is another well known example (3,4). In the expression of granulocyte colony stimulating factor (G-CSF), Lu and coworkers have characterized a mistranslation product where a histidine residue has been replaced by a glutamine (5). Elimination of such problems is important to the biotechnology industry; it is thus essential to understand the biological mechanisms that are involved. We have been studying the E. coli expression of human neurotrophin-3 (NT-3), a member of the nerve growth factor (NGF) family of neurotrophic factors. The protein is believed to have value in the treatment of certain neurodegenerative diseases (6). Mature NT-3 is a polypeptide of 119 amino acid residues and, under physiological conditions, the protein functions as a tightly associated, noncovalently linked homodimer. Expression of NT-3 using UAA or UAG as the termination codon led to the production of the mature protein with the expected C-terminus (Thr at position 119). However, termination of the cloned DNA with UGA gave, in addition to the mature NT-3, another protein with an identical N-terminal sequence but approximately 1 kDa higher in molecular weight (as measured by SDS-PAGE). In this study, we report the characterization of this higher molecular weight product and show that it is due to misreading through the UGA codon by tryptophan incorporation.

* Authors to whom correspondence should be addressed.
II. MATERIALS AND METHODS
(a) Materials: NT-3 was expressed as inclusion bodies in E. coli. A detailed description of the expression will be published elsewhere (7). The protein retains the initiation methionine as the N-terminus. Iodoacetamide, ammonium bicarbonate and DTT were the products of Sigma. Sequencing grade endoproteinase Lys-C was bought from Boehringer Mannheim. Guanidine hydrochloride, urea and trifluoroacetic acid were obtained from Pierce. All other reagents were of the highest quality commercially available. HPLC grade water and solvents from Burdick and Jackson were used throughout.
(b) Reduction and S-alkylation of NT-3: Because the expressed proteins were obtained as insoluble protein aggregates, they were reduced and alkylated with iodoacetamide prior to their purification. In a typical experiment, inclusion bodies obtained from 25 ml of cells were solubilized in 100 µl of 6 M guanidine hydrochloride in 0.25 M Tris-HCl containing 1 mM EDTA at pH 8.5. DTT was added to a final concentration of 10 mM and the reduction was allowed to proceed at 45°C for 1 hr prior to carboxamidomethylation with 20 mM iodoacetamide at room temperature in the dark for 20 min. The sample was diluted with 1 ml of 0.1% TFA in water and the precipitate was removed by centrifugation at 12,000 rpm for 20 minutes. The clear supernatant was chromatographed through a Vydac C18 column (0.46 × 25 cm). Solvent A was 0.1% TFA in water and solvent B was 0.1% TFA in 90% acetonitrile. The column was washed with 2% solvent B for 2 min after injection and the proteins were eluted using a linear gradient of 2% to 55% solvent B over 1 hr. A flow rate of 0.7 ml per min was employed. Elution of the proteins was monitored using absorbance at 215 nm. The purified protein was collected manually and lyophilized.
(c) Proteolytic degradation of NT-3: To establish the identity of the higher molecular weight protein, the purified material was subjected to proteolytic fragmentation. Approximately 20 µg of the purified protein was dissolved in 25 µl of 0.4 M ammonium bicarbonate containing 8 M urea (pH 7.8). The resulting solution was diluted with 75 µl of water and endoproteinase Lys-C at 1% by weight was added. The sample was incubated at 37°C for 20 hr before it was quenched with TFA. The peptides generated were separated by chromatography through a Vydac C18 column (0.21 × 5 cm). The solvents and the gradient used were identical to those described above except the flow rate was changed to 0.25 ml per min.
(d) Protein and peptide sequence analysis: Automated Edman degradation of protein and peptide samples was performed in an Applied Biosystems sequencer (Model 476 or 477) or a Hewlett Packard G1000A sequencer. Each sequencer was fitted with an on-line HPLC analyzer for the identification of phenylthiohydantoin amino acids.
(e) Mass spectrometry: The molecular mass of the peptides or proteins was determined using a Finnigan SSQ 710 quadrupole mass spectrometer fitted with an electrospray interface. Molecular masses (daltons) and standard deviations were calculated by employing the Finnigan software.
(f) SDS-PAGE: Laemmli (8) gels (14%) were run in the presence of reducing agent and stained with Coomassie Blue for detection of proteins.

III. RESULTS AND DISCUSSION
During our study on the expression of NT-3 in E. coli, it was observed that when the cloned DNA was terminated with the codon UGA, both the mature protein and a higher molecular weight form (which amounts to approximately 40% of the expressed protein) were produced in the inclusion bodies. However, if UAA or UAG was employed as the stop codon, only mature NT-3 was obtained as the predominant product (Figure 1). It is therefore of interest to characterize the higher molecular weight material.
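The chromatographic and digestion conditions given in Materials and Methods lend themselves to a short worked calculation. The sketch below computes the solvent composition along the stated gradient (2% B held for 2 min, then 2% to 55% B over 1 hr, with solvent B containing 90% acetonitrile) and the final concentrations in the Lys-C digest after dilution. The function names are hypothetical, holding the composition at 55% B after the gradient ends is an assumption (the text does not say what follows), and volumes are treated as additive.

```python
HOLD_MIN = 2.0          # isocratic wash at the starting composition, min
GRADIENT_MIN = 60.0     # length of the linear gradient, min
START_B, END_B = 2.0, 55.0   # percent solvent B at the start and end of the gradient
ACN_FRACTION_IN_B = 0.90     # solvent B is 0.1% TFA in 90% acetonitrile


def percent_b(t_min):
    """Percent solvent B at time t (min after injection)."""
    if t_min <= HOLD_MIN:
        return START_B
    # Assumption: composition is held at END_B once the gradient is complete.
    fraction = min((t_min - HOLD_MIN) / GRADIENT_MIN, 1.0)
    return START_B + (END_B - START_B) * fraction


def percent_acetonitrile(t_min):
    """Percent acetonitrile delivered to the column at time t."""
    return ACN_FRACTION_IN_B * percent_b(t_min)


def diluted(concentration, volume_ul, added_ul):
    """Concentration after adding diluent, assuming additive volumes."""
    return concentration * volume_ul / (volume_ul + added_ul)


urea_final_molar = diluted(8.0, 25.0, 75.0)          # 8 M urea, 25 µl + 75 µl water
bicarbonate_final_molar = diluted(0.4, 25.0, 75.0)   # 0.4 M ammonium bicarbonate
lys_c_micrograms = 0.01 * 20.0                       # 1% by weight of 20 µg protein

print(f"%B at 32 min: {percent_b(32.0):.1f}  "
      f"(acetonitrile {percent_acetonitrile(32.0):.2f}%)")
print(f"digest: {urea_final_molar:.1f} M urea, "
      f"{bicarbonate_final_molar:.2f} M bicarbonate, {lys_c_micrograms:.1f} µg Lys-C")
```

The dilution step matters because it brings the urea concentration down fourfold before the protease is added.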
Figure 1. Analysis of the inclusion bodies expressing NT-3 using UAG (lane 1), UAA (lane 2) and UGA (lane 3) as the stop codon by reducing SDS-PAGE. The arrows point at 2 bands that migrate close together. The HPLC purified proteins from UAG and UAA termination are shown in lanes 4 and 5 respectively. Lane 6 is the mature NT-3 (peak I) purified from the system employing UGA as termination and lane 7 contains the higher molecular weight material (peak II). MW is the molecular weight standards (66.3, 55.4, 36.5, 31.0, 21.5, and 14.4 kDa).
Because the expressed proteins were obtained as insoluble aggregates in inclusion bodies, they were reduced and carboxamidomethylated prior to purification by RP-HPLC. Figure 2 shows that the higher molecular weight material was readily separated from the mature protein using a C18 column. Examination of the purified proteins by SDS-PAGE under reducing conditions demonstrates that peak I corresponds to mature NT-3 whereas peak II is the higher molecular weight form. Automated Edman degradation of peak II indicated that its N-terminal sequence was identical to that expected of methionyl NT-3 (data not shown). However, molecular mass determination by mass spectrometry demonstrated that the reduced and
[Figure 2: RP-HPLC chromatograms, panels (A), (B), and (C); horizontal axis, Time (min.), 10-60.]
Figure 2 RP-HPLC purification of the reduced and carboxamidomethylated protein from the system using UAG (chromatogram A), UAA (chromatogram B) and UGA (chromatogram C) as the termination codon. Peak I is the mature NT-3 and peak II is the higher molecular weight form.
S-alkylated protein gave a molecular mass of 14,749.0 Da, a value that is 648.0 Da higher than that expected of the mature protein (Figure 3). This corresponds to the incorporation of a peptide with a molecular mass of about 666.0 Da, suggesting a hexapeptide.
[Figure 3: electrospray mass spectra, two panels; horizontal axis, m/z. Legible annotations: theoretical mass, 14,099 Da; measured molecular mass, 14,749 Da.]
Figure 3 Analysis of the mature NT-3 and the higher molecular weight material by mass spectrometry.
To identify the site of incorporation, the protein was subjected to endoproteinase Lys-C digestion and the resulting peptides were separated using a narrow bore C18 column. The peptide map of the mature NT-3 protein is shown in Figure 4A and the expected C-terminal peptide I116GRT119 is highlighted in bold print. Examination of the endoproteinase Lys-C map of the higher molecular weight material (Figure 4B) demonstrates that
the peptides corresponding to the C-terminus were absent and a new peptide was generated.
[Figure 4: peptide maps, panels (A) and (B); horizontal axis, Time (min.), 10-50.]
Figure 4 Endoproteinase Lys-C map of mature NT-3 (A); peptide map of the higher molecular weight form (B). The expected C-terminal fragment I116GRT119 of the mature protein is highlighted in bold.
N-terminal analysis of the peptide gave IGRTWGSADK (measured molecular mass = 1,091.8 Da; theoretical = 1,091.2 Da). Therefore, the C-terminus of the protein has been extended with the peptide WGSADK (molecular mass = 663.3 Da). This sequence matches perfectly with the translated cloned DNA sequence up to the next in-frame termination codon (UAA) (Figure 5).
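The mass bookkeeping behind this assignment can be checked with a short calculation. The sketch below is not part of the original work: it sums standard average residue masses and adds one water for a free peptide. The names AVG_RESIDUE_MASS, peptide_mass, and extension_mass_shift are illustrative, and the values quoted in the text may follow a different convention (for example, protonated or monoisotopic masses), so offsets of about 1 Da from the reported figures are to be expected.

```python
# Average residue masses (Da) for the residues that occur in IGRTWGSADK.
AVG_RESIDUE_MASS = {
    "I": 113.1594, "G": 57.0519, "R": 156.1875, "T": 101.1051,
    "W": 186.2132, "S": 87.0782, "A": 71.0788, "D": 115.0886,
    "K": 128.1741,
}
WATER = 18.0153  # average mass of H2O (Da)


def extension_mass_shift(sequence):
    """Mass added to a protein when `sequence` is appended to its C-terminus.

    Only residue masses are summed, because the extended chain still carries
    a single terminal water.
    """
    return sum(AVG_RESIDUE_MASS[aa] for aa in sequence)


def peptide_mass(sequence):
    """Average mass of the free peptide: residue masses plus one water."""
    return extension_mass_shift(sequence) + WATER


if __name__ == "__main__":
    for peptide in ("IGRT", "WGSADK", "IGRTWGSADK"):
        print(peptide, round(peptide_mass(peptide), 1))
    # Compare with the 648.0 Da shift measured on the intact protein.
    print("shift for WGSADK:", round(extension_mass_shift("WGSADK"), 1))
```

Under these assumptions the calculated shift for a WGSADK extension lies within a few daltons of the 648.0 Da difference measured on the intact protein, which is the level of agreement the hexapeptide assignment relies on.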
-1  1                                   10
ATG TAT GCA GAA CAT AAG AGT CAC CGA GGG GAG ...
Met Tyr Ala Glu His Lys Ser His Arg Gly Glu ...

114                 119
AGA AAA ATC GGA AGA ACA TGA GGA TCC GCG GAT AAA TAA
Arg Lys Ile Gly Arg Thr
Arg Lys Ile Gly Arg Thr Trp Gly Ser Ala Asp Lys
Figure 5 A schematic representation of the cloned DNA and its translation product.
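The read-through summarized in Figure 5 can be illustrated with a small translation routine. This is a minimal sketch, not the authors' procedure: the codon table covers only the codons in the C-terminal fragment of Figure 5, and the function name translate and its readthrough argument are hypothetical.

```python
# Codons present in the C-terminal fragment of Figure 5 (DNA alphabet).
CODON_TABLE = {
    "AGA": "R", "AAA": "K", "ATC": "I", "GGA": "G", "ACA": "T",
    "TCC": "S", "GCG": "A", "GAT": "D",
}
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Fragment copied from Figure 5: Arg114 ... Thr119, TGA, downstream codons, TAA.
C_TERMINAL_DNA = "".join(
    "AGA AAA ATC GGA AGA ACA TGA GGA TCC GCG GAT AAA TAA".split()
)


def translate(dna, readthrough=None):
    """Translate `dna` in frame, one codon at a time.

    `readthrough` maps a stop codon to the residue inserted in its place
    (for example {"TGA": "W"}); any other stop codon ends translation.
    """
    readthrough = readthrough or {}
    residues = []
    for start in range(0, len(dna) - 2, 3):
        codon = dna[start:start + 3]
        if codon in STOP_CODONS:
            if codon in readthrough:
                residues.append(readthrough[codon])
                continue
            break
        residues.append(CODON_TABLE[codon])
    return "".join(residues)


if __name__ == "__main__":
    print(translate(C_TERMINAL_DNA))                # normal termination at TGA
    print(translate(C_TERMINAL_DNA, {"TGA": "W"}))  # TGA read as Trp
```

With termination at TGA the fragment ends at Thr119; with TGA read as Trp, translation continues to the in-frame TAA and appends WGSADK, the extension identified by sequencing.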
Hence, we conclude that the higher molecular weight form of NT-3 was caused by read-through of the UGA stop codon with incorporation of tryptophan. The UGA codon was well documented to be leaky in early studies of microbial molecular biology (9). Its use should be minimized in bacterial expression systems.
REFERENCES
1. Santos, M.A.S. and Tuite, M.F. (1993). Trends in Biotech. 11, 500-505.
2. Harbour, G.C., Garlick, R.L., Lyle, S.B., Crow, F.W., Robins, R.H. and Hoogerheide, J.G. (1992). Techniques in Protein Chemistry III, 487-495.
3. Lu, H.S., Tsai, L.B., Kenney, W.C. and Lai, P.H. (1988). Biochem. Biophys. Res. Commun. 156, 807-813.
4. Randhawa, Z.I., et al. (1994). Biochemistry 33, 4352-4362.
5. Lu, H.S., et al. (1993). Protein Expression and Purification 4, 465-472.
6. Arenas, E. and Persson, H. (1994). Nature 367, 368-371.
7. Meng, S-Y., Hui, J.O., Haniu, M., Tsai, L., manuscript in preparation.
8. Laemmli, U.K. (1970). Nature 227, 680-684.
9. Parker, J. (1989). Microbiological Reviews 53, 273-298.
SPECTRAL ENHANCEMENT OF RECOMBINANT PROTEINS WITH TRYPTOPHAN ANALOGS: THE SOLUBLE DOMAIN OF HUMAN TISSUE FACTOR¹
C.A. Hasselbacher, R. Rusinova, E. Rusinova, and J.B. Alexander Ross
Department of Biochemistry, Mt. Sinai School of Medicine, New York, NY 10029
I. INTRODUCTION
Tryptophan (Trp) fluorescence is widely used to study structure, dynamics, and intermolecular interactions of proteins. However, the presence of Trp in most proteins severely limits the utility of Trp fluorescence for studying one protein species in the presence of others. For this reason, extrinsic probes are often used to tag individual proteins. We have been exploring an alternative to the use of extrinsic probes by inserting Trp analogs in vivo using various Trp-auxotrophic expression systems. Trp analogs in proteins have great potential for fluorescence studies, provided they have a unique absorbance compared to Trp, allowing proteins incorporating these analogs to be selectively excited in the presence of other Trp-containing proteins. Two Trp analogs that may be useful for combining the best qualities of intrinsic and extrinsic fluorescence probes are 5-hydroxytryptophan (5-OHTrp) and 7-azatryptophan (7-ATrp). We have experimented with Trp analog labeling protocols for recombinant proteins produced in bacteria and yeast, and we have found variability in these labeled proteins both with respect to structural and functional integrity and degree of analog incorporation (1,2). We have evaluated levels of 5-OHTrp incorporation in several protein systems. In this paper, we evaluate attempts to replace the four Trps in the truncated, soluble domain of recombinant human Tissue Factor (sTF) with 5-OHTrp and 7-ATrp. This domain of native, membrane-bound tissue factor binds factor VIIa, a serine protease circulating in blood; the TF/VIIa complex activates factor X, which is a critical step in the process of blood coagulation (3). We have prepared four sTF mutants, in each of which a different Trp has been changed to either Phe or Tyr. We report the results on induction of expression of these proteins in the presence of 7-ATrp and 5-OHTrp and discuss our conclusions on the utility of this new technique.
¹Supported by NIH Grants GM-39750 and HL-29109.
II. MATERIALS AND METHODS
Expression vectors for sTF and the Trp-replacement sTF mutants that export the recombinant protein into the culture medium were prepared as described (4). To express proteins with incorporated Trp analogs, Trp-auxotrophic E. coli strains W3110TrpA88 and CY15077AEA2 (gifts of C. Yanofsky) were transformed with these plasmids. Cells were grown to mid-log phase in shaker culture in minimal media (5) with added Trp, then centrifuged and resuspended in minimal media with no Trp. After 30 min incubation to deplete the remaining Trp, either Trp, 5-OHTrp, or 7-ATrp was added (5 mg amino acid per 100 mL) and protein expression was induced with IPTG. Cells were allowed to continue to grow at room temperature for 12 h. Protein was concentrated and purified from media (4), and also purified from sonicated cells. For small preparations, not amenable to purification as above, media proteins were subjected to SDS-PAGE. Bands were digitally scanned and relative intensities were quantitated using Image-Quant software. Proteins of interest were electroeluted from single bands using preparative SDS-PAGE and characterized for analog incorporation by absorbance and fluorescence spectra. Incorporation of analog into this purified protein was measured as described (6).
III. RESULTS
A. sTF Protein Expression and Analysis. sTF is targeted for secretion by cells, with induced protein observed in the cell-conditioned media. sTF-related proteins from media normally appear as three bands when visualized on 12% polyacrylamide gels; the band with the lowest apparent molecular weight is fully processed sTF (Fig. 1). Cells incubated with 5-OHTrp or 7-ATrp express only about 30% as much media-derived sTF, as measured by densitometry, as cells incubated with Trp. In addition, the relative amount of fully processed protein is less with Trp analogs than with Trp. Analyses of total protein from media fractions after concentration and dialysis show incorporation of 7-ATrp into protein. Purification of sTF induced in the presence of 7-ATrp, however, yielded protein with no evidence of analog incorporation as determined by LINCS analysis (6). sTF expressed in the presence of 7-ATrp is largely insoluble under purification conditions, which include centrifugation of media proteins before column chromatography (Fig. 1). To determine whether 7-ATrp is incorporated into the "insoluble" fraction, we electroeluted individual protein bands from preparative SDS-PAGE and examined their fluorescence spectra. The electroeluted "insoluble" sTF fraction of 7-ATrp-incubated cells contains the analog, as can be seen by comparing Trp and 7-ATrp standards with electroeluted fractions (Fig. 2d). Therefore, the established sTF media purification protocol must select for Trp sTF produced with residual Trp in the cells. Purification of 5-OHTrp sTF from media yields protein with little analog incorporation. Variation of growth and induction conditions did not significantly improve 5-OHTrp incorporation, since this sTF expression system requires long incubation times and since exposure to 5-OHTrp is toxic to cells.
Figure 1: Proteins from media of transformed CY15077 AEA2 cells after protein expression was induced in the presence of Trp (lanes 1,2), 5-OHTrp (lanes 3,4), or 7-ATrp (lanes 5,6), visualized by SDS-PAGE. Arrows on left indicate proteins seen only after induction; sTF standard is on left. Proteins remaining in supernatant after centrifugation (20,000 x g, 15 min): lanes 1,3,5. Proteins pelleting under these conditions: lanes 2,4,6.
Western blot analysis of protein purified from media and sonicated E. coli BL-21(DE-3) cells shows that 50% of the total induced sTF can be recovered from cells (7). Trp-labeled sTF from cells has the same amino and carboxy termini as media-derived sTF, indicating that it has been secreted into the bacterial periplasm. Even greater amounts of sTF are retained in the cells with Trp-auxotrophic bacterial strains (Table I).
Total sTF produced (including media-derived protein, non-sedimenting cell protein (12,000 x g, 10 min.), and cell pellet fractions) with induction in the presence of 5-OHTrp and 7-ATrp is 64% and 138%, respectively, of that produced with Trp.
TABLE I. Relative Amounts of sTF in Media and Cell Fractions*
Amino acid    Media    Non-sedimenting cell proteins    Cell pellet
Trp           0.17     0.45                             0.38
5-OHTrp       0.15     0.80                             0.05
7-ATrp        0.23     0.38                             0.39
*As measured by densitometric analysis after SDS-PAGE.
[Figure 2: spectra, panels a-d; horizontal axis, wavelength (nm), with tick labels 260-320 and 365-500.]
Figure 2: Fluorescence excitation (a,c) and emission (b,d) spectra: a,b) Trp, 5-OHTrp, and 7-ATrp standards; c) 5-OHTrp-sTF and Trp-sTF; d) Trp-sTF, "soluble" 7-ATrp-sTF (purified) (• • •), and "insoluble" 7-ATrp-sTF. All proteins are from media. Unless otherwise indicated, proteins were electroeluted from single SDS-PAGE bands.
With 7-ATrp, more protein is secreted into the media than with 5-OHTrp or Trp. Little of the sTF expressed in the presence of 5-OHTrp is associated with the cell pellet (cell membranes or protein aggregates). We estimate that replacement of Trp by 5-OHTrp is 10-12% for sedimenting and non-sedimenting cell fractions and 20% for media-derived soluble sTF. When 7-ATrp is used, sTF obtained from the insoluble fraction of the media (see above) contains the
greatest amount of analog. Therefore, the greatest fraction of incorporated 5-OHTrp is in soluble protein from the media, and the greatest fraction of incorporated 7-ATrp is in "insoluble" media protein (Fig. 2).
B. Expression and Analysis of Trp Replacement in sTF Mutants. Table II gives the relative amounts of "soluble" proteins (see above) found in media from cells expressing the sTF mutants in the presence of Trp or Trp analogs. The solubility of sTF is maximal when Trp is incorporated. W14F produces little protein in the presence of 7-ATrp, but a large percentage of what is produced is soluble, and the fluorescence spectra indicate 7-ATrp incorporation (not shown).
TABLE II. Amounts of sTF and the Trp-Replacement sTF Mutants in Media*
Protein    Amino acid    Percent soluble(a)    Relative total(b)
sTF        Trp           0.80                  1.00
sTF        5-OHTrp       0.61                  0.22
sTF        7-ATrp        0.27                  0.34
W14F       Trp           0.91                  1.00
W14F       5-OHTrp       0.58                  0.21
W14F       7-ATrp        0.45                  0.04
W25Y       Trp           0.60                  1.00
W25Y       5-OHTrp       0.25                  0.27
W25Y       7-ATrp        0.29                  0.30
W45F       Trp           0.88                  1.00
W45F       5-OHTrp       0.74                  0.17
W45F       7-ATrp        0.09                  0.24
W158F      Trp           0.76                  1.00
W158F      5-OHTrp       0.73                  0.40
W158F      7-ATrp        0.31                  0.52
*As measured by densitometric analysis after SDS-PAGE.
(a) Percentage of total protein expressed under a particular growth condition.
(b) Total amount expressed relative to Trp.
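The two columns of Table II can be combined to compare how much soluble protein each growth condition actually delivers. The sketch below is an illustration only: it assumes that multiplying the soluble fraction by the relative total gives the soluble amount relative to that protein's total expression with Trp, a product the text does not itself report. The names TABLE_II and soluble_yield are hypothetical.

```python
# (soluble fraction, total relative to Trp) transcribed from Table II.
TABLE_II = {
    "sTF":   {"Trp": (0.80, 1.00), "5-OHTrp": (0.61, 0.22), "7-ATrp": (0.27, 0.34)},
    "W14F":  {"Trp": (0.91, 1.00), "5-OHTrp": (0.58, 0.21), "7-ATrp": (0.45, 0.04)},
    "W25Y":  {"Trp": (0.60, 1.00), "5-OHTrp": (0.25, 0.27), "7-ATrp": (0.29, 0.30)},
    "W45F":  {"Trp": (0.88, 1.00), "5-OHTrp": (0.74, 0.17), "7-ATrp": (0.09, 0.24)},
    "W158F": {"Trp": (0.76, 1.00), "5-OHTrp": (0.73, 0.40), "7-ATrp": (0.31, 0.52)},
}


def soluble_yield(protein, amino_acid):
    """Soluble amount relative to the same protein's total expression with Trp.

    Assumption: soluble fraction x relative total is a meaningful product.
    """
    soluble_fraction, relative_total = TABLE_II[protein][amino_acid]
    return soluble_fraction * relative_total


if __name__ == "__main__":
    for protein, rows in TABLE_II.items():
        for amino_acid in rows:
            print(f"{protein:6s} {amino_acid:8s} {soluble_yield(protein, amino_acid):.3f}")
```

Read this way, the high soluble fraction of 7-ATrp-labeled W14F still corresponds to a very small absolute amount of soluble protein, because its total expression is so low.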
W25Y induced in the presence of Trp and 5-OHTrp is less soluble than the corresponding native sTFs. Levels of overall expression of W25Y sTFs induced in the presence of Trp and analogs are identical to those of native sTF. Expression of analog-labeled W45F sTFs is lower than that of native sTF, and the amount of soluble 7-ATrp sTF from W45F is considerably reduced. W158F appears very similar to sTF in terms of relative soluble and total protein.
CONCLUSIONS
In vivo labeling of proteins with amino acid analogs that have spectral properties distinct from those of the intrinsic fluorophore tryptophan is a powerful new tool for elucidation of local environment and function in proteins. Among possible Trp analogs, 7-ATrp is an excellent environmental probe, as it can exhibit a large increase in quantum yield when buried in the protein interior and has an emission spectrum red-shifted by approximately 50 nm compared to Trp. The analog 5-OHTrp has the advantage of a large shift in the absorbance spectrum, allowing for its selective excitation in the presence of Trp. We have previously demonstrated the utility of this labeling technique by incorporating 5-OHTrp into the lambda repressor and the yeast pheromone a-factor (1,2). We have measured incorporation of 5-OHTrp ranging from 30 to 95% into other recombinant proteins using similar labeling protocols and the bacterial Trp-auxotrophs described above. To explore possible reasons for this variability, we have compared the expression and properties of the soluble domain of human tissue factor using these Trp-auxotroph cells in the presence of Trp, 5-OHTrp and 7-ATrp. We chose sTF for study because 1) we have previously prepared and characterized its single-Trp replacement mutants with respect to structure and activity (4); and 2) its purification is straightforward, due to targeted secretion into the cell-conditioned media.
In general, we observe that cells exposed to 5-OHTrp grow poorly and express less protein than cells induced in the presence of Trp. 5-OHTrp itself is unstable in solution at elevated temperatures, and peptide-incorporated 5-OHTrp can be photolabile (1). In addition, 5-OHTrp competes weakly with Trp for binding to the bacterial tryptophanyl-tRNA synthetase (8). Thus, efficient analog incorporation requires complete depletion of the cells' Trp reserves before induction. This would suggest that protein expression systems that achieve high yields in short incubation times would be optimal for use of this probe. The present expression system is not optimal for labeling with 5-OHTrp, since accumulation of sTF in cell-conditioned media requires many hours. 7-ATrp is less detrimental to cell growth and is a better competitor with Trp for the synthetase (8), and we observe excellent cell viability in the presence of 7-ATrp. However, as discussed below, when 7-ATrp replaces Trp in sTF, the protein sediments readily, suggesting altered structure and/or properties. A significant observation from our study of the sTF expression system is that the bacterial secretory machinery continues to function in the presence of Trp analogs. We find similar ratios of total protein produced in the media for Trp, 7-ATrp, and 5-OHTrp (Table I), and spectral evidence for analog incorporation in all media and cell fractions that we tested (Fig. 2c and 2d). Normal levels of protein expression have been equated with proper conformation of the expressed protein (9). We point out that for sTF, however, this correlation does not seem to hold. We have previously determined that sTF does not self-associate at concentrations up to 20 mg/mL (10). We have shown also that none of the Trp residues are at the surface of sTF (4). Thus, aggregation suggests alteration of sTF structure (we define the soluble media-derived fraction as the non-sedimenting fraction that remains after centrifugation at 20,000 x g for 15 min). sTF solubility decreases slightly when expressed with 5-OHTrp and decreases dramatically when expressed with 7-ATrp (Table II). This suggests that in general 7-ATrp is not tolerated as well as 5-OHTrp at one or more of the four Trp positions. However, while mutant W14F produces much less protein in the presence of 7-ATrp than does sTF or the other mutants, nearly half of the 7-ATrp-labeled W14F has solubility characteristic of Trp-sTF, far more than any of the other 7-ATrp-labeled proteins. Thus, while less total protein is expressed, more appears to be folded in the appropriate conformation.
By contrast, while levels of expression of the mutant W25Y induced in the presence of Trp and analogs are identical with those of wild-type sTF, W25Y induced in the presence of Trp and 5-OHTrp is less soluble. Thus, while more total protein is produced, less is folded in the appropriate conformation. To date most of the proteins expressed with 5-OHTrp or 7-ATrp have been prokaryotic. Those that we are aware of are listed in Table III. The prokaryotic DNA-binding proteins listed have been prepared to provide absorption spectra that can be easily resolved from that of DNA. Of these, only BirA has an enzymatic activity, and it has a Trp in its active site. It is therefore of interest that the 5-OHTrp-containing protein has no enzymatic activity. Its DNA-binding properties have not yet been tested. The other DNA-binding proteins all appear to have wild-type function. To date, there are not enough data available to make similar generalizations about eukaryotic proteins.
TABLE III. Proteins Tested for 5-OHTrp Incorporation*
Protein                                    Promoter    Trps    % 5-OHTrp    Function
λcI (λcI repressor)                        ptac        3       95           WT
CRP (cAMP regulatory protein) (2,11,12)    λPL         2       50-90        WT
σ subunit (RNA polymerase)                 T7          4       50-60        WT
α subunit (RNA polymerase)                 T7          1       90           WT
CytR(M151W) (cytidine repressor)           T7          1       30           WT
BirA (biotin repressor) (14)               ptac        7       85           ?
Oncomodulin (Y57W) (15)                    OXYPRO      1       <50          ?
sTF (human)                                ptac        4       <20          ?
*Proteins expressed in Trp-auxotrophic E. coli cells W3110TrpA88 and CY15077AEA2.
References
1. Hasselbacher, C.A., Green, R., Schwartz, G.P., Kohanski, R.A., Rusinova, E., & Ross, J.B.A. (1993) Protein Science 2, 75A.
2. Ross, J.B.A., Senear, D.F., Waxman, E., Kombo, B.B., Rusinova, E., Huang, Y.T., Laws, W.R., & Hasselbacher, C.A. (1992) Proc. Natl. Acad. Sci. USA 89, 12023-12027.
3. Bach, R. (1988) CRC Crit. Rev. Biochem. 23, 339-368.
4. Hasselbacher, C.A., Rusinova, E., Waxman, E., Rusinova, R., Kohanski, R.A., Lam, W., Guha, A., Lin, T.C., Nemerson, Y., Konigsberg, W.H., & Ross, J.B.A. Biochemistry, submitted.
5. Sambrook, J., Fritsch, E.F., & Maniatis, T. (1989) Molecular Cloning, a Laboratory Manual, Cold Spring Harbor Laboratory Press, New York.
6. Waxman, E., Rusinova, E., Hasselbacher, C.A., Schwartz, G.P., Laws, W.R., & Ross, J.B.A. (1993) Anal. Biochem. 210, 425-428.
7. Hasselbacher, C.A., Rusinova, E., Waxman, E., Lam, W., Guha, A., Rusinova, R., Nemerson, Y., & Ross, J.B.A. (1994) Proc. SPIE 2137, in press.
8. Hogue, C.W.V., & Szabo, A.G. (1993) Biophys. Chem. 48, 159-169.
9. Ruf, W., Schullek, J.R., Stone, M.J., & Edgington, T.S. (1994) Biochemistry 33, 1565-1572.
10. Ross, J.B.A., Hasselbacher, C.A., Kumosinski, T.F., King, G., Laue, T.M., Guha, A., Nemerson, Y., Konigsberg, W.H., Rusinova, E., & Waxman, E. Testing an FTIR-Consistent Model of the Soluble Domain of Human Tissue Factor. ACS Symposium Series, in press.
11. Heyduk, E. & Heyduk, T. (1993) Cell Mol. Bio. Res. 39, 401-407; and personal communication.
12. Lee, J.C., personal communication.
13. Senear, D.F., personal communication.
14. Beckett, D., personal communication.
15. Hogue, C.W.V., Rasquinha, I., Szabo, A.G., & MacManus, J.P. (1992) FEBS Lett. 310, 269-272.
Utilization of Partial Reactions, Side Reactions, and Chemical Rescue to Analyze Site-Directed Mutants of Ribulose 1,5-Bisphosphate (RuBP) Carboxylase/Oxygenase (Rubisco)
Mark R. Harpel(a), Engin H. Serpersu(b), and Fred C. Hartman(a)
(a) Protein Engineering Program, Biology Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831-8080 and (b) Department of Biochemistry, University of Tennessee, Knoxville, TN 37996-0840
I. Introduction
Rubisco (EC 4.1.1.39) performs a pivotal role in determining biomass yield (for recent reviews, see refs. 1-3). The biosynthetic reaction catalyzed by this enzyme, the carboxylation of RuBP by CO2 to form two equivalents of 3-phospho-D-glycerate (PGA) (Fig. 1, upper pathway), is limited by both slow turnover (kcat = 2-5 s⁻¹) and utilization of O2 in competition with CO2. The oxidative degradation of RuBP (4-5), which forms one equivalent each of PGA and 2-phosphoglycolate (PGyc) (Fig. 1, lower pathway), can reduce photosynthetic yields in certain plants by 50%. Although the enzyme's specificity for carboxylation is not immutable (see refs. 1-3, 6, and citations therein), the molecular bases for the discrimination between the two pathways are not entirely clear. Hence, understanding the structural and mechanistic features of Rubisco that limit its efficiency and specificity is of practical importance. In this chapter, we present three approaches to address mechanistic issues with Rubisco mutants: characterization of catalysis of partial reactions, analysis of side products, and subtle alteration of the active-site microenvironment by manipulation with exogenous reagents.
Figure 1. Reaction pathways for the carboxylation and oxygenation of RuBP as catalyzed by Rubisco.
II. General Considerations in Characterizing Rubisco Mutants
In the absence of 3D structures, a general challenge in exploiting site-directed mutants of Rubisco is to determine whether catalytic deficiencies reflect improper folding of polypeptide, failure of subunits to associate, inability to undergo requisite activation [carbamylation of active-site Lys191 followed by binding of Mg2+ (7)], failure to bind substrates, or loss of a group that participates in catalysis; only the mutants of this latter class are mechanistically revealing. Even partial retention of these properties by catalytically-impaired mutants can allay fears that detrimental consequences of amino acid substitutions at the active site are indirect effects brought about by major conformational changes¹. Due to ease of genetic manipulation, heterologous expression, and assembly, Rubisco from Rhodospirillum rubrum, a homodimer (50,500 dalton subunit), has been the target of most mutagenesis studies. Despite the difference in quaternary structure from the Rubisco found in most photosynthetic organisms, which consists of eight large (L) (53,000 dalton) and eight small (S) (14,000 dalton) subunits, species invariance of active-site residues and homologous three-dimensional structures justify extrapolation of mechanistic conclusions from the L2 protein to the L8S8 form (10, 11). The original clone of the R. rubrum Rubisco gene was expressed as a fusion protein (12), but all the mutant proteins generated in our laboratory are derived from a reconstruction that encodes authentic, wild-type enzyme (13). Mutant genes are constructed by single primer extension utilizing an appropriate single-stranded M13 vector (14) and expressed in Escherichia coli strain MV1190. Mutant enzymes are purified chromatographically to near homogeneity (15).
Expression of the mutated genes as full-length translation products that have not been proteolyzed is readily assessed by the electrophoretic mobility on denaturing gels of the mutant proteins in comparison to wild-type. Dimer formation, distinguishable from monomers by gel permeation chromatography or non-denaturing gel electrophoresis, is taken as evidence for proper folding and assembly. The reaction-intermediate analog, 2-carboxyarabinitol 1,5-bisphosphate (CABP), is used to assess whether an inactive mutant can nevertheless undergo activation chemistry and bind phosphorylated ligands. Only the carbamate form of Rubisco binds the analog in an exchange-resistant complex [t1/2 ~20 h for wild-type R. rubrum Rubisco (16)] of equimolar subunit, CO2, Mg2+, and CABP (17). This complex is readily isolated by gel filtration (18). Tight binding of each of the three ligands is dependent on the presence of the other two. Therefore, complex formation proves competence in activation chemistry and binding of phosphorylated ligands.
¹Only the inactive D193N mutant of Rubisco has been successfully crystallized and subjected to crystallographic analysis (8). Although this mutant exhibited large conformational differences relative to the wild-type enzyme, these properties may have been expected due to the mutant's deficiencies in both activation and binding of CABP (9).
III. Partial Reactions
The overall carboxylation or oxygenation of RuBP as catalyzed by Rubisco consists of discrete partial reactions illustrated in Fig. 1 (reviewed extensively in 1-3, 19). Because an active-site residue will not necessarily be involved in all catalytic steps, site-directed mutants devoid of overall activity may retain competence in one or more of the partial reactions. Independent of overall carboxylase activity, enolization of RuBP and turnover of the isolated six-carbon reaction intermediate can be assayed as distinct reactions, providing an avenue for discerning the particular step(s) preferentially facilitated by a given active-site residue. Formation of the enediol(ate) of RuBP is readily assayed on the basis of exchange of solvent protons with the C3 proton of substrate (20-22). The six-carbon intermediate of the carboxylation pathway (II in Fig. 1) can be prepared by rapid quench after mixing equimolar amounts of RuBP and the carboxylase in the presence of 14CO2 (23). Availability of this labeled intermediate allows determination of an enzyme's commitment to forward processing in the carboxylation step. Decomposition, via decarboxylation, is observed as a decrease in radioactivity that can be stabilized by borohydride, whereas forward catalysis is equated with an increase in acid-stable radioactivity.
IV. Oxygenation and Other Side Reactions
Completion of a catalytic cycle by Rubisco requires stabilization of several inherently unstable intermediates. Imperfect stabilization of these intermediates is reflected in non-productive side reactions, including several involving the enediol(ate) (I). Not only does the formation of side products provide insight into limitations of Rubisco efficiency in vivo, but perturbation of these side reactions by mutants can reveal amino acid groups crucial to intermediate stabilization. The most prominent side reaction of Rubisco is its counterproductive oxygenase activity, reflecting competition of O2 with CO2 for the enediol(ate) intermediate (I) (23). Partitioning between the two pathways ($v_c/v_o$) is defined by $v_c/v_o = \tau \cdot ([\mathrm{CO_2}]/[\mathrm{O_2}])$, where $\tau$ is $V_c K_o / V_o K_c$ (24). Because $\tau$ can be interpreted in terms of the free energy differential for carboxylated versus oxygenated transition states (25-26), it provides insight into determinants of Rubisco specificity. Many assays have been described for determining $\tau$; unfortunately, inherent limitations generally render them cumbersome for screening of enzymes and particularly ill-suited for analyzing mutant Rubiscos exhibiting low levels of activity. We have developed a simple high-resolution anion-exchange chromatographic method (27) to derive $v_c/v_o$ (and hence $\tau$) from the ratio of radioactive peak areas for PGA and PGyc generated from [1-3H]RuBP. This method also provides a complete picture of the ultimate fate of input substrate (from a single chromatographic profile), including degradation products, which impact $\tau$ but are overlooked in some widely-used assays (see Fig. 2). Any method for determining the specificity is dependent upon knowledge of gaseous substrate concentrations. Generally, [O2] is maintained constant at either ambient concentration (255 µM) or at 100% O2 saturation (1.2 mM). Free [CO2] is varied with exogenously-added NaHCO3.
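The relationship between the measured product ratio and the specificity factor can be made concrete with a short calculation. The sketch below is an illustration, not the published procedure (27). It assumes that with [1-3H]RuBP each carboxylation yields one labeled PGA and each oxygenation one labeled PGyc, so that the ratio of radioactive peak areas equals the ratio of carboxylation to oxygenation velocities directly. The function name, argument names, and the peak areas in the usage lines are hypothetical; only the ambient O2 concentration of 255 µM is taken from the text.

```python
def specificity_factor(pga_area, pgyc_area, co2_uM, o2_uM=255.0):
    """Estimate tau from radioactive peak areas and dissolved gas levels.

    Rearranges v_c/v_o = tau * [CO2]/[O2] to tau = (v_c/v_o) * [O2]/[CO2].
    Assumption: pga_area / pgyc_area equals v_c/v_o for [1-3H]RuBP.
    Concentrations are in micromolar; only their ratio matters.
    """
    if pgyc_area <= 0 or co2_uM <= 0 or o2_uM <= 0:
        raise ValueError("peak areas and concentrations must be positive")
    vc_over_vo = pga_area / pgyc_area
    return vc_over_vo * (o2_uM / co2_uM)


if __name__ == "__main__":
    # Hypothetical peak areas (arbitrary counts) and free CO2 concentration.
    tau = specificity_factor(pga_area=6000.0, pgyc_area=1000.0, co2_uM=100.0)
    print(round(tau, 2))
```

Because only the ratio of the two gas concentrations enters, any error in free [CO2] propagates proportionally into the estimate, which is why the text stresses knowledge of the gaseous substrate concentrations.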
The total concentration of all species of "CO2" can be determined spectrophotometrically by the phosphoenolpyruvate
carboxylase/malic dehydrogenase assay of O'Leary et al. (28). Decline of carboxylase activity during the course of assay, which is particularly endemic to higher-plant Rubiscos, has been denoted as "fallover". Characterization of the products of RuBP turnover under fallover conditions has shown that this process is a result of enediol(ate)-derived side reactions (see refs. 29-31 and citations therein). Misprotonation at C3 (net epimerization), which occurs once every 400 turnovers, gives rise to D-xylulose 1,5-bisphosphate (XuBP), a potent inhibitor and alternate substrate of the enzyme (32-33). This inhibitor accumulates because its utilization as substrate is exceedingly slow. Another inhibitor formed in similar amounts to XuBP during RuBP turnover by spinach Rubisco has chemical properties suggestive of 3-ketoarabinitol 1,5-bisphosphate, which would result from protonation (rather than carboxylation or oxygenation) at C2 of the enediol(ate) (net isomerization) on the same face as carboxylation. Enhanced formation of either misprotonation product by a mutant Rubisco is an indicator of compromised processing of the enediol(ate). Another potential side reaction of the enediol(ate) intermediate is formation of the dicarbonyl compound, 1-deoxy-D-glycero-2,3-pentodiulose 5-phosphate, resulting from β-elimination of the C1-phosphate due to improper stabilization and/or premature dissociation of enediol(ate) from the enzyme active site. This compound has been characterized by reduction with borohydride, oxidation with H2O2, complexation with o-phenylenediamine, and 13C-NMR (23, 34). The β-elimination product is not detected in reactions with wild-type R. rubrum Rubisco but is formed in substantial amounts with mutants in which the C1-phosphate ligands are substituted, demonstrating the required role of these amino acid side chains in stabilizing the enediol(ate) intermediate (34-35).
β-Elimination of phosphate and concomitant formation of pyruvate from the terminal aci-carbanion (VI) of PGA also occurs (36). Abstraction of a hydroxyl proton from the gem-diol carboxylated intermediate (III) promotes C2-C3 scission with liberation of PGA derived from C3, C4, and C5 of RuBP. The resulting aci-carbanion of PGA (derived from C1 and C2 of RuBP and from CO2) must undergo inversion of configuration at C2 and protonation prior to its release as the D-isomer of PGA. The status of this final step of carboxylation (protonation of the PGA carbanion) is reflected by the ratio of protonation (PGA formation) to β-elimination (pyruvate formation).

Detection of side products generated by Rubisco is accomplished by various means. XuBP (30) and pyruvate (36) are both conveniently detected spectrophotometrically by coupling to NADH oxidation with appropriate enzymes. Alternatively, our chromatographic procedure (27) gives a complete profile of all RuBP-derived products. Resolution of these compounds is enhanced by inclusion of 10 mM sodium borate, which complexes vic-diols, in elution buffers. Since our initial report of the separation of borohydride-reduced misprotonation products (27), we have observed that borate also effects complete separation of unreduced RuBP and XuBP (37). Thus, the analysis is simplified by circumventing the necessity to deduce the amounts of misprotonation-derived bisphosphate from ratios of ribitol-, arabinitol-, and xylitol-1,5-bisphosphates.
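The partitioning described above reduces to simple arithmetic once product amounts are in hand. The following is a minimal sketch in Python, assuming hypothetical product amounts (for example, radioactivity recovered in chromatographic peaks); the function names and the example amounts are illustrative and are not data from this study.

```python
def partition_ratio(pga_from_carbanion, pyruvate):
    """Ratio of protonation (PGA) to beta-elimination (pyruvate).

    Both arguments are amounts in the same units. Only the PGA molecule
    derived from C1 and C2 of RuBP and from CO2 passes through the
    carbanion, so the caller is assumed to supply that portion of PGA.
    """
    if pyruvate <= 0:
        raise ValueError("pyruvate amount must be positive")
    return pga_from_carbanion / pyruvate


def side_product_fraction(turnovers_per_event):
    """Fraction of turnovers diverted to a product formed once per N turnovers."""
    if turnovers_per_event <= 0:
        raise ValueError("turnovers_per_event must be positive")
    return 1.0 / turnovers_per_event


# Misprotonation at C3 is stated in the text to occur once every 400 turnovers.
xubp_fraction = side_product_fraction(400)

# Hypothetical amounts (arbitrary units), chosen only for illustration.
ratio = partition_ratio(pga_from_carbanion=140.0, pyruvate=1.0)

print(f"XuBP formed per turnover: {xubp_fraction:.4f}")
print(f"protonation / beta-elimination: {ratio:.0f}")
```

A lower ratio for a mutant than for wild type would, under these assumptions, point to impaired protonation of the PGA carbanion.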
V. Applications of Chemical Rescue

Site-directed mutagenesis is generally restricted to the 20 amino acids normally occurring in proteins. Thus, reliance on homologous series of compounds to establish structure-reactivity correlations, a hallmark of mechanistic studies with non-enzymic catalysts, has been lacking with enzymes. One manner in which this limitation can be partially overcome is provided by the demonstration that an enzyme, crippled because of an active-site substitution, can be rehabilitated ("rescued") by the addition of exogenous organic compounds that mimic the missing side chain (38, 39). Thus, systematic variation in the substituted side chain is limited only by the number of homologous compounds available to test. For example, the virtually inactive K258A mutant of aspartate aminotransferase is stimulated by primary amines; the degree of stimulation, after correcting for steric effects, correlates with the pKa of the amine in accordance with the Brønsted relationship (38, 39). More recently, this approach has been extended to a variety of systems, including Rubisco (see ref. 15 and citations therein).

Chemical rescue of deficient site-directed mutants can also be achieved through covalent chemical modification, thereby expanding the diversity and subtlety of structural changes that can be effected through mutagenesis. Examples include substitution of lysyl with aminoethylcysteinyl residues (net replacement of the γ-methylene group with a sulfur atom) (16, 40), substitution of glutamyl with carboxymethylcysteinyl residues (net insertion of a sulfur atom between the β- and γ-methylene groups with lengthening of the side chain by ~1 Å) (41, 42), and substitution of arginyl with homoarginyl residues (net insertion of a methylene group with lengthening of the side chain by ~1 Å) (43, 44).
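The Brønsted relationship mentioned above is a linear dependence of log k on pKa, and its slope can be estimated by ordinary least squares. The following is a minimal sketch, assuming hypothetical amines and rate constants generated from a known slope; it omits the steric (molecular volume) correction referred to in the text, and the function name and all numbers are illustrative.

```python
import math


def bronsted_fit(pka_values, rate_constants):
    """Least-squares fit of log10(k) = beta * pKa + c. Returns (beta, c)."""
    n = len(pka_values)
    if n < 2 or n != len(rate_constants):
        raise ValueError("need two or more paired observations")
    log_k = [math.log10(k) for k in rate_constants]
    mean_x = sum(pka_values) / n
    mean_y = sum(log_k) / n
    sxx = sum((x - mean_x) ** 2 for x in pka_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(pka_values, log_k))
    beta = sxy / sxx
    return beta, mean_y - beta * mean_x


# Hypothetical series of amines: pKa values and rescue rate constants
# (arbitrary units) generated from beta = 0.4 purely for illustration.
pkas = [9.0, 9.5, 10.0, 10.5, 11.0]
rates = [10 ** (0.4 * p - 5.0) for p in pkas]

beta, intercept = bronsted_fit(pkas, rates)
print(f"Bronsted coefficient beta = {beta:.2f}")
```

With real data, a volume term would be added as a second regressor before interpreting the slope.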
VI. Lys329 - A Case Study

Our studies of active-site Lys329 of R. rubrum Rubisco by site-directed mutagenesis illustrate the value of combining these methodologies. Lys329 is the apical residue of a flexible loop ("loop 6") located in the eight-stranded β/α-barrel of the C-terminal domain of this protein. In the activated enzyme with CABP bound, this loop folds over the top of the barrel and becomes immobilized, in part by electrostatic interactions between Lys329 and Glu48 of the adjacent subunit and between Lys329 and the carboxylate of the bound analog (11, 45-47). Closure of loop 6 and the NH2-terminal segment of the adjacent subunit presumably controls ligand access to the active site and mitigates dissociation of reaction intermediates from the active site.

Replacement of Lys329 by site-directed mutagenesis greatly diminished carboxylation activity (~10^4-fold reduction) and formation of a stable quaternary complex with CABP (48). However, these mutants catalyze the CO2- and Mg2+-dependent enolization of [3-3H]RuBP, indicating that their primary deficiency is not in assisting this partial reaction, but at a later step in the reaction pathway (22). Evaluation of the K329G mutant as a catalyst for the turnover of isolated six-carbon, carboxylated intermediate further localized the functional role of Lys329 (25). Despite its lack of carboxylation activity, K329G exhibited a high forward commitment to hydrolysis of this intermediate to PGA. Thus, Lys329 is needed neither for enolization nor for processing of the carboxylated intermediate; by deduction, it must be required for reaction of gaseous substrate with enediol(ate). This conclusion is entirely consistent with the location of the ε-amino group of Lys329 in the enzyme·CABP quaternary complex of wild-type enzyme as seen by crystallography (11, 45-47). Furthermore, these results demonstrate that carboxylation is not spontaneous with wild-type Rubisco, but requires direct intervention by amino-acid side chains.
Precise positioning of the amino group of Lys329 should be crucial if one of its roles is to stabilize the incipient negative charge formed in the intermediate of the gaseous substrate addition step. This supposition was validated by aminoalkylation ("covalent rescue") of the K329C mutant. Treatment of K329C with 2-bromoethylamine or 3-bromopropylamine partially restored activity, as a consequence of selective modification of the introduced thiol group (16, 25). Reduced kcat for aminoethyl- and aminopropyl-K329C (22% and 5% of wild-type, respectively) and corresponding reductions in τ (56% and 30%, respectively) emphasize the stringent requirement for placement of the amine at position 329.
Position-329 mutants are also amenable to noncovalent chemical rescue by aliphatic amines (15). For example, at 450 mM ethylamine and 2 mM RuBP, the K329A mutant exhibited about 2% of the wild-type carboxylation activity, representing ~80-fold stimulation compared to the marginal activity of K329A measured in the absence of amine. The system was saturable with respect to both amine and RuBP, and rescue was effected by various amines. Both the extent of rescue and the CO2/O2 specificity of the rescued enzyme (also reduced relative to the wild-type level) showed a steric preference for amine, emphasizing the importance of amine orientation. In addition, amine-rescued K329A formed a detectable complex with CABP. Given the mobility of loop 6, the effectiveness of an exogenous amine in stabilizing the catalytically competent conformation of the protein while concomitantly fulfilling the functionality of a lysyl side chain is rather remarkable. Presumably, loop 6 of K329A, even in the absence of amine, can adopt the conformation necessary for catalysis; but without the lysyl side chain, reaction of gaseous substrate with the enediol(ate) of RuBP cannot occur. A volume-adjusted Brønsted coefficient of ~1, derived from the rescue of K329A by various amines, is consistent with the amine being fully protonated in the rescued transition state(s).

The role of Lys329 was further defined by product analyses of K329A turnover reactions. As shown in Fig. 2A, at high [enzyme]/[RuBP], K329A can consume RuBP, but with formation of two novel side products (dicarbonyl and X). Predominant formation of PGA and PGyc (the normal Rubisco products) in the presence of amines (Fig. 2B) is consistent with rescue of activity deriving from enhanced stabilization of intermediates, effected by the amine through direct interaction and/or maintenance of the closed conformation of loop 6.
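The rescue figures quoted above for K329A (about 2% of wild-type activity after ~80-fold stimulation) imply a value for the unrescued activity. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the inputs are the approximate values stated in the text, so the result is only an order-of-magnitude estimate.

```python
def unrescued_percent(rescued_percent_of_wt, fold_stimulation):
    """Activity without amine, as a percentage of wild type."""
    if fold_stimulation <= 0:
        raise ValueError("fold_stimulation must be positive")
    return rescued_percent_of_wt / fold_stimulation


# Approximate values quoted in the text for K329A with ethylamine.
basal_percent = unrescued_percent(rescued_percent_of_wt=2.0, fold_stimulation=80.0)
fold_below_wild_type = 100.0 / basal_percent

print(f"unrescued K329A: ~{basal_percent:.3f}% of wild type")
print(f"that is, ~{fold_below_wild_type:.0f}-fold below wild type")
```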
The side product denoted "dicarbonyl" is 1-deoxy-D-glycero-2,3-pentodiulose 5-phosphate, derived from β-elimination of the C1-phosphate from the enediol(ate) intermediate. Formation of this compound supports a role of Lys329 in enediol(ate) interactions and stabilization. XuBP is also formed transiently before eventual consumption and can be quantified in non-reduced samples (not shown).
[Figure 2: panels A and B; x-axis, Time (min); labeled peaks, PGA and PGyc]

Figure 2. Product analysis of K329A turnover reactions in the absence of amine (A) or in the presence of 400 mM ethylamine (B). Other reaction constituents at pH 8 were 20 µM K329A protomer, 1 mM EDTA, 10 mM MgCl2, 415 mM bicine, 19.6 mM NaHCO3, 10% glycerol, and 250 µM [1-3H]RuBP. Reactions were quenched after 4 h by reduction with borohydride.
[Figure 3: NMR spectrum; x-axis, chemical shift (ppm), 3.7 to 4.3]

Figure 3. 1H-NMR (400 MHz) of compound X isolated from a K329A reaction mixture by chromatography on MonoQ. The chemical shifts and proton-proton coupling assignments, based on selective decoupling and 2D experiments (not shown), are consistent with the structure of 2-carboxytetritol 1,4-bisphosphate (inset). Two phosphorus resonances were observed by 31P-NMR; proton-phosphorus couplings, assigned by broad-band decoupling and 2D heteronuclear COSY experiments (not shown), are also consistent with the proposed structure.
The other side product generated by K329A provides insight into Rubisco's oxygenase intermediate; X does not contain 14C derived from 14CO2 and its formation is dependent on O2. Furthermore, periodate degradation (data not shown) and NMR analyses (Fig. 3) are consistent with this side product being 2-carboxytetritol 1,4-bisphosphate (Fig. 3, inset; stereoconfiguration unknown). It could arise by rearrangement of the peroxy adduct of the enediol(ate) (the putative, but as yet unproven, oxygenation intermediate) with elimination of H2O2. If confirmed, the proposed structure will provide the first evidence of multiple fates for the intermediate of Rubisco's oxygenase pathway. Ironically, wild-type enzyme does not normally form X, demonstrating that despite the counterproductive nature of Rubisco's oxygenase reaction, the peroxy ketone intermediate for formation of the C2-C3 cleavage products is stabilized by this enzyme.

Collectively, these studies illustrate the power of multiple approaches for the analysis of Rubisco mutants. Characterizations of position-329 mutants have not only localized the step of catalysis facilitated by Lys329 but also uncovered its role in intermediate stabilization and optimization of carboxylation selectivity.
Acknowledgment

This work was supported by USDOE under contract DE-AC05-84OR21400 with Martin Marietta Energy Systems, Inc.
References

1. Andrews, T.J. and Lorimer, G.H. (1987). in The Biochemistry of Plants (M.D. Hatch and N.K. Boardman, Eds.), Vol. 10, pp. 131-218, Academic Press, New York.
2. Hartman, F.C. and Harpel, M.R. (1993). Adv. Enzymol. 67, 1-75.
3. Hartman, F.C. and Harpel, M.R. (1994). Ann. Rev. Biochem. 63, 197-234.
4. Bowes, G., Ogren, W.L., and Hageman, R.H. (1971). Biochem. Biophys. Res. Commun. 45, 716-722.
5. Lorimer, G.H., Andrews, T.J., and Tolbert, N.E. (1973). Biochemistry 12, 18-23.
6. Spreitzer, R.J. (1993). Ann. Rev. Plant Physiol. Mol. Biol. 44, 411-434.
7. Lorimer, G.H., Badger, M.R., and Andrews, T.J. (1976). Biochemistry 15, 529-536.
8. Söderlind, E., Schneider, G., and Gutteridge, S. (1992). Eur. J. Biochem. 206, 729-735.
9. Gutteridge, S., Lorimer, G., and Pierce, J. (1988). Plant Physiol. Biochem. 26, 675-682.
10. Schneider, G., Lindqvist, Y., and Lundqvist, T. (1990). J. Mol. Biol. 211, 989-1008.
11. Knight, S., Andersson, I., and Brändén, C.-I. (1990). J. Mol. Biol. 215, 113-160.
12. Somerville, C.R. and Somerville, S.C. (1984). Mol. Gen. Genetics 193, 214-219.
13. Larimer, F.W., Machanoff, R., and Hartman, F.C. (1986). Gene 41, 113-120.
14. Zoller, M.J. and Smith, M. (1983). Methods Enzymol. 100, 468-500.
15. Harpel, M.R. and Hartman, F.C. (1994). Biochemistry 33, 5553-5561.
16. Smith, H.B. and Hartman, F.C. (1988). J. Biol. Chem. 263, 4921-4925.
17. Pierce, J., Tolbert, N.E., and Barker, R. (1980). Biochemistry 19, 934-942.
18. Miziorko, H.M. and Sealy, R.C. (1980). Biochemistry 19, 1167-1171.
19. Schloss, J.V. (1990). in The Proceedings of NATO ASI on Enzymatic and Model Carboxylation and Reduction Reactions for Carbon Dioxide Utilization (M. Aresta and J.V. Schloss, Eds.) pp. 321-345, Kluwer Academic Press, Netherlands.
20. Saver, B.G. and Knowles, J.R. (1982). Biochemistry 21, 5398-5403.
21. Sue, J.M. and Knowles, J.R. (1982). Biochemistry 21, 5404-5410.
22. Hartman, F.C. and Lee, E.H. (1989). J. Biol. Chem. 264, 11784-11789.
23. Pierce, J., Andrews, T.J., and Lorimer, G.H. (1986). J. Biol. Chem. 261, 10248-10256.
24. Laing, W.A., Ogren, W.L., and Hageman, R.H. (1974). Plant Physiol. 54, 678-685.
25. Lorimer, G.H., Chen, Y.-R., and Hartman, F.C. (1993). Biochemistry 32, 9018-9024.
26. Chen, Z. and Spreitzer, R.J. (1991). Planta 183, 597-603.
27. Harpel, M.R., Lee, E.H., and Hartman, F.C. (1993). Analyt. Biochem. 209, 367-374.
28. O'Leary, M.H., Rife, J.E., and Slater, J.D. (1981). Biochemistry 20, 7308-7314.
29. Edmondson, D.L., Badger, M.R., and Andrews, T.J. (1990). Plant Physiol. 93, 1390-1397.
30. Edmondson, D.L., Kane, H.J., and Andrews, T.J. (1990). FEBS Lett. 260, 62-66.
31. Zhu, G. and Jensen, R.G. (1991). Plant Physiol. 97, 1354-1358.
32. McCurry, S.D. and Tolbert, N.E. (1977). J. Biol. Chem. 252, 8344-8346.
33. Yokota, A. (1991). Plant Cell Physiol. 32, 755-762.
34. Larimer, F.W., Harpel, M.R., and Hartman, F.C. (1994). J. Biol. Chem. 269, 11114-11120.
35. Morell, M.K., Paul, K., O'Shea, N.J., Kane, H.J., and Andrews, T.J. (1994). J. Biol. Chem. 269, 8091-8098.
36. Andrews, T.J. and Kane, H.J. (1991). J. Biol. Chem. 266, 9447-9452.
37. Lee, E.H., Harpel, M.R., Chen, Y.-R., and Hartman, F.C. (1993). J. Biol. Chem. 268, 26583-26591.
38. Toney, M.D. and Kirsch, J.F. (1989). Science 243, 1485-1488.
39. Toney, M.D. and Kirsch, J.F. (1992). Protein Sci. 1, 107-119.
40. Planas, A. and Kirsch, J.F. (1991). Biochemistry 30, 8268-8276.
41. Lukac, M. and Collier, R.J. (1988). J. Biol. Chem. 263, 6146-6149.
42. Smith, H.B., Larimer, F.W., and Hartman, F.C. (1990). J. Biol. Chem. 265, 1243-1245.
43. Beyer, W.F., Jr., Fridovich, I., Mullenbach, G.T., and Hallewell, R. (1987). J. Biol. Chem. 262, 11182-11187.
44. Engler, D.A., Campion, S.R., Hauser, M.R., Cook, J.S., and Niyogi, S.K. (1992). J. Biol. Chem. 267, 2274-2281.
45. Andersson, I., Knight, S., Schneider, G., Lindqvist, Y., Lundqvist, T., Brändén, C.-I., and Lorimer, G.H. (1989). Nature 337, 229-234.
46. Schreuder, H.A., Knight, S., Curmi, P.M.G., Andersson, I., Cascio, D., Brändén, C.-I., and Eisenberg, D. (1993). Proc. Natl. Acad. Sci. USA 90, 9968-9972.
47. Newman, J. and Gutteridge, S. (1993). J. Biol. Chem. 268, 25876-25886.
48. Soper, T.S., Mural, R.J., Larimer, F.W., Lee, E.H., Machanoff, R., and Hartman, F.C. (1988). Protein Eng. 2, 39-44.
Probing The Roles Of Conserved Histidine Residues In β-Galactosidase (E. coli) Using Site Directed Mutagenesis And Transition State Analog Inhibition
Nathan J. Roth, Katherine Y.N. Wong, and Reuben E. Huber
Div. of Biochemistry, Dept. of Biological Sciences, University of Calgary, Calgary, Alberta, Canada T2N 1N4
I. Introduction

His residues often play important roles in the structure and function of enzymes. The imidazole side chain of His is unique in that it has a pKa near neutral and therefore can gain or lose protons with small changes in the local environment. Thus His is often found within the active site as an acid/base catalyst or plays roles in modulating conformational changes (1). His is capable of acting as a hydrogen bond acceptor or donor, and often functions directly in metal and ligand binding. In addition, it can assist catalysis by acting as a nucleophile.

β-Galactosidase from E. coli is a retaining glycosidase which catalyses the hydrolysis of β-D-galactosides. Native β-galactosidase is a tetramer, consisting of four identical monomers of 1023 residues each (2). We decided to probe the functional roles of conserved His in β-galactosidase using site directed mutagenesis followed by a quick characterization of the resultant enzymes. E. coli β-galactosidase contains 34 His residues (2). Alignment of the sequences of the related β-galactosidases and β-glucuronidases sequenced to date reveals that of these 34 His, only 3 (corresponding to His-357, His-391, and His-540 of the E. coli enzyme) are absolutely conserved (2-15). In this paper we illustrate the approach we took, in the absence of structural data, to determine the functional roles of His-357 and His-391 of β-galactosidase. The data acquired using this approach indicate that His-357 and His-391 appear to be highly important in transition state stabilization but not in ground state binding of the substrate.
II. Materials and Methods

A. Site Directed Mutagenesis

Site directed mutagenesis was carried out using a modified procedure of Kunkel's dut- ung- method (16). A 1.1 kb fragment of the lacZ gene containing the codons we wished to alter was excised from the plasmid pIP101 using the
restriction enzymes Sac I and Cla I, and then ligated into BS SK+ previously cut at the corresponding sites. The single stranded mutagenic template DNA was rescued from E. coli RZ1032 containing the subclone with the help of VCS M13 helper phage. The mutagenic primers were phosphorylated prior to annealing using T4 polynucleotide kinase. The synthesis and ligation of the second strand was carried out at 37°C for 3 hr by T4 polymerase and T4 DNA ligase. The mutagenic reaction mix was transformed into E. coli XL1-Blue. Putative mutants were screened directly by sequencing using Sequenase 2.0. The 1.1 kb fragment containing the mutation was excised and reinserted into pIP101. The pIP101 plasmid carrying the desired mutation was transformed and expressed in a lacZ- strain of E. coli. Finally, the integrity of the mutation in pIP101 was reconfirmed using a thermocycling DNA sequencing method.
B. β-Galactosidase Purification

β-Galactosidase was purified as described previously (17), except for several slight modifications. An 800 mL gradient from 0.09 M to 0.18 M NaCl was used to elute the protein from the DEAE column. The active fractions from the DEAE elution were pooled, precipitated with ammonium sulfate, and then applied to an FPLC Superose 6™ size exclusion column into the appropriate assay buffer. Protein purity was assessed by SDS-PAGE, and the enzyme concentration was determined using an extinction coefficient of 2.09 cm2/mg at 280 nm (18).
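The concentration determination above is a direct application of the Beer-Lambert law. The following is a minimal sketch, assuming a 1 cm cuvette by default; the function name, the dilution parameter, and the example absorbance are illustrative and not taken from the study. Note that cm2/mg is equivalent to mL mg-1 cm-1, so the result is in mg/mL.

```python
EXTINCTION_CM2_PER_MG = 2.09  # at 280 nm, the value quoted in the text (18)


def protein_mg_per_ml(a280, path_cm=1.0, dilution_factor=1.0):
    """Protein concentration (mg/mL) from absorbance at 280 nm.

    Beer-Lambert: A = epsilon * c * l, so c = A / (epsilon * l).
    dilution_factor corrects for any dilution made before reading.
    """
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    return a280 * dilution_factor / (EXTINCTION_CM2_PER_MG * path_cm)


# Hypothetical reading: A280 = 0.418 on a sample diluted 10-fold.
conc = protein_mg_per_ml(0.418, dilution_factor=10.0)
print(f"{conc:.2f} mg/mL")
```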
C. Determination of Kinetic and Inhibitor Constants

Kinetic assays were performed at 25°C and pH 7.0 in TES assay buffer (30 mM TES, 145 mM NaCl, 1 mM MgSO4) in a UV 2101 Shimadzu spectrophotometer. o-Nitrophenyl-β-D-galactopyranoside (ONPG) or p-nitrophenyl-β-D-galactopyranoside (PNPG) was used as the substrate. The kcat and Km were determined by least squares analysis of Eadie-Hofstee plots. Kinetic inhibitor constants were determined using the method described by Deschavanne et al. (19) and Huber and Gaunt (20).

There are two kinetically relevant steps in the β-galactosidase reaction mechanism, denoted by the rate constants k2 and k3 (Fig. 1). The first step, "galactosylation" (k2), results in galactosidic bond breakage and the formation of an enzyme-galactosyl intermediate. The second step, "degalactosylation" (k3), involves the hydrolysis of the enzyme-galactosyl intermediate and the release of galactose. The rate of "degalactosylation" (k3) is the same for any β-galactoside substrate. Therefore, if similar kcat values are obtained for different substrates, k3 is probably the rate determining step. Nucleophilic competition experiments (in the presence of 1 M methanol) also aid in determining which kinetic step is rate determining. Nucleophiles (e.g. MeOH) can compete with water to attack the galactosyl-enzyme intermediate (E·GAL) (Fig. 1). If k3 is the rate determining step and if k4 > k3, the addition of methanol should increase the rate of reaction. However, if k2 is rate determining, the addition of methanol should have no effect on the observed catalytic rate.
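The Eadie-Hofstee analysis mentioned above linearizes the Michaelis-Menten equation as v = Vmax - Km (v/[S]), so a straight-line fit of v against v/[S] gives -Km as the slope and Vmax as the intercept. The following is a minimal sketch using noise-free synthetic data; the function name and the values of Km and Vmax are illustrative assumptions, not results from this study.

```python
def eadie_hofstee_fit(substrate_mM, rates):
    """Least-squares fit of v = Vmax - Km * (v / [S]). Returns (Km, Vmax)."""
    if len(substrate_mM) != len(rates) or len(rates) < 2:
        raise ValueError("need two or more paired observations")
    x = [v / s for v, s in zip(rates, substrate_mM)]  # v/[S]
    y = list(rates)                                   # v
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, intercept


# Synthetic data from assumed Km = 0.12 mM and Vmax = 100 (arbitrary units).
true_km, true_vmax = 0.12, 100.0
conc = [0.02, 0.05, 0.1, 0.2, 0.5, 1.0]
v = [true_vmax * s / (true_km + s) for s in conc]

km, vmax = eadie_hofstee_fit(conc, v)
print(f"Km = {km:.3f} mM, Vmax = {vmax:.1f}")
```

Dividing Vmax by the molar concentration of active sites would then give kcat. Because v appears on both axes, experimental error is correlated between them, which is a known limitation of this plot.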
[Figure 1 scheme: E + GAL-OR <=> E·GAL-OR (Ks); E·GAL-OR -> E·GAL (k2); E·GAL + H2O -> E + GAL (k3); E·GAL + MeOH -> E + GAL-OMe (k4)]

Figure 1. Probable mechanism of β-galactosidase in the presence of a nucleophile (methanol). E, β-galactosidase; GAL-OR, galactoside substrate; E·GAL, galactosyl enzyme; MeOH, added nucleophile (methanol); GAL-OMe, galactosyl nucleophile product; Ks, dissociation constant for E·GAL-OR.
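The two predictions of the methanol competition experiment follow from the steady-state rate expression for the scheme in Figure 1. The following is a minimal sketch, assuming saturating substrate, that the measured rate is release of the aglycone (nitrophenol), and that k4 is treated as a pseudo-first-order constant at the methanol concentration used; the function name and all rate constants are hypothetical.

```python
def kcat_apparent(k2, k3, k4_eff=0.0):
    """Apparent kcat for aglycone release.

    At steady state the galactosyl enzyme forms at rate k2 and breaks
    down at rate (k3 + k4_eff), giving
        kcat = k2 * (k3 + k4_eff) / (k2 + k3 + k4_eff).
    """
    breakdown = k3 + k4_eff
    return k2 * breakdown / (k2 + breakdown)


# Case 1: degalactosylation (k3) rate determining; methanol raises kcat.
slow_k3_gain = kcat_apparent(1000.0, 10.0, k4_eff=20.0) / kcat_apparent(1000.0, 10.0)

# Case 2: galactosylation (k2) rate determining; methanol has almost no effect.
slow_k2_gain = kcat_apparent(1.0, 1000.0, k4_eff=2000.0) / kcat_apparent(1.0, 1000.0)

print(f"k3 limiting: kcat changes {slow_k3_gain:.2f}-fold with methanol")
print(f"k2 limiting: kcat changes {slow_k2_gain:.3f}-fold with methanol")
```

Under these assumptions, an unchanged kcat on adding methanol is the signature of rate-determining galactosylation.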
III. Results and Discussion

A. Rationale for Mutagenesis

It has been demonstrated that conserved amino acid residues are functionally more important than non conserved residues (21). Mutation data matrices show that His is most often replaced by acid and amide residues, and thus replacement of His by a residue from this group would probably be the most obvious choice (22, 23). However, replacement of His with a member of the acid or amide group to screen for a catalytic role for a His might not discriminate since acids and amides could perform a similar function to that of His. Therefore, in our initial studies, we decided instead to replace His with Phe. Phe was chosen because its side chain has a size similar to His but it lacks the metal complexing properties and hydrogen bond forming capabilities present in the imidazole side chain and also in the members of the acid and amide group of amino acids. Purification and initial screening of the Phe substituted enzymes using transition state analog inhibitors showed that the enzymes with Phe substituted for His-357 and His-391 bound transition state analogs very poorly compared to substrate analogs but Mg2+ binding was not affected. His-357 and His-391 were therefore selected for further study using site directed mutagenesis to replace the His with acid and amide residues.
B. Purification and Stability

All of the enzymes with substitutions at His-357 and His-391 precipitated at the same ammonium sulfate concentrations, and eluted from the ion exchange and gel filtration columns in volumes similar to those of wild type. This indicated that the physical properties associated with purification (aggregation, quaternary structure, charge, etc.) were not affected by the mutations. The enzymes were greater than 95% pure as analyzed by SDS-PAGE. H357F was stable for at
Table I. Kinetic constants of the wild type and substituted β-galactosidases with ONPG and PNPG as substrates

                      Wild     H357D    H357F    H357N    H391E    H391F
ONPG
  kcat (s-1)          620      7.85     15.9     63.5     1.71     0.24
  Km (mM)             0.12     0.12     0.76     0.22     6.9      0.44
  app kcat (s-1)a     1150     6.63     19.8     57.9     1.71     0.18
PNPG
  kcat (s-1)          90       1.22     2.82     0.77     0.017    0.009
  Km (mM)             0.041    0.47     0.069    0.006    3.94     0.066
  app kcat (s-1)a     90       0.90     2.33     0.65     0.013    0.007

a Values of app kcat refer to the turnover number
least 6 months when stored at 4°C, and H357N and H357D were also stable. H391F and H391E were, however, fairly unstable under the storage conditions (50 mM Tris, 1 mM MgSO4, 0.04% sodium azide) and gradually lost activity with time. Kinetic analyses of those enzymes were performed as soon as possible after purification.
C. Enzyme Characterization

Since the three dimensional structure of β-galactosidase has only recently been solved (24), there was no structural information available at the onset of this study to draw upon for potential roles of these conserved residues. Therefore, we examined the substituted enzymes in terms of the effects the substitutions had upon activity, substrate analog and transition state analog inhibition, and Mg2+ binding.

Replacement of His-357 or His-391 resulted in substantial decreases in activity when compared to the wild type enzyme (Table I). As Mg2+ is required for full activity of wild type β-galactosidase, we first checked whether the presence or absence of Mg2+ had any effect upon the activity of the substituted enzymes. The results (not shown) indicated that the Phe substituted enzymes were inactivated by EDTA to essentially the same extent as wild type enzyme, indicating that His-357 and His-391 are not Mg2+ ligands. The alkaline side of the pH profiles of the kcat values shows that, except for magnitude (the kcat values are normalized), a substitution of His-357 by Phe had very little effect on the pH profile (Figure 2). The H391F enzyme was unstable during the time of assay at pH 10.0 but the pH profile appears to be similar to wild type up to pH 9.0, suggesting that substitution of His-391 by Phe also does not affect the pH profile (except for magnitude).

Inhibitor studies of the substituted proteins provided insight as to the possible roles for the conserved His residues. The inhibitors used can be subdivided into two groups: substrate analogs and transition state analogs. The substrate analogs utilized were isopropyl-β-D-thiogalactopyranoside (IPTG), phenylethyl-β-D-thiogalactopyranoside (PETG), and lactose. IPTG and PETG contain a β-thio-galactosyl bond which cannot be hydrolyzed by β-galactosidase.
Although lactose is actually a substrate for β-galactosidase, its inhibition constant can be determined because of its relatively slow catalytic breakdown compared to the synthetic substrate, ONPG. The transition state
[Figure 2: y-axis, normalized kcat (% of maximum); x-axis, pH]

Figure 2. pH profiles of the H357F (circles), H391F (triangles), and wild type (squares) β-galactosidases. The pH profiles of the substituted enzymes were determined in pH assay buffer (30 mM TES, 50 mM Histidine, and 1 mM Mg2+) using ONPG as the substrate. The kcat values were normalized as percentages of the maximal kcat observed for each enzyme to account for the large differences in the activity between the wild type and substituted enzymes.
analogs used were L-ribose (25), γ-(1,4)-D-galactonolactone (25), and D-galactal (26). These are planar molecules which bind much more tightly to wild type β-galactosidase than does galactose, and are thought to resemble a planar galactosyl carbonium ion that is probably formed during substrate hydrolysis.

Analyses of the inhibitor constants of the substrate analogs indicated that substitutions for the conserved His residues (except H391E) had only small effects on the ability of the enzyme to bind substrate analog inhibitors and in some cases even strengthened such binding (Table II). Substitution of His-391 by Glu did affect the ability of this enzyme to bind substrate tightly, as shown by the high Km value and the increased Ki values for the substrate analogs. Overall, the results from the substrate inhibitor studies imply that His-357 and His-391 are probably not required for the proper binding of galactose in β-galactosidase

Table II. Ki constants of substrate and transition state analogs for the wild type and substituted β-galactosidases

Ki (mM)                  Wild     H357D     H357F    H357N     H391E    H391F
IPTG                     0.085    0.024     0.25     0.044     3.14     0.045
PETG                     0.0015   0.00029   0.0027   0.00058   0.046    N.D.
Lactose                  1.21     0.16      2.0      0.28      50       N.D.
L-ribose                 0.24     62        7.31     17.8      >100     14.8
D-galactal               0.016    34        0.30     5.0       N.D.     17.8
D-γ-galactonolactone     0.13     26        14       15        36       N.D.

N.D. - not determined
since, except for the substitution of Glu for His-391, the substitutions had little effect on the binding of substrate. The substitution of Glu for His-391 may cause a general active site disruption as it is in close proximity to the catalytic nucleophile, Glu-537 (24).

All of the enzymes with substituted residues at His-391 and His-357 bound the transition state analogs very poorly (Table II), suggesting that His-391 and His-357 are required for proper transition state stabilization. Although H391E β-galactosidase bound the substrate analogs about 40 times more poorly than wild type, it bound the transition state analogs more poorly still, by a full order of magnitude. The observed decreases in catalysis of the substituted enzymes may be a consequence of increased energy barriers due to the losses of transition state solvation. The effect seems to be mainly on "galactosylation" (k2). This is supported by the results of the nucleophilic competition studies, which showed that the addition of methanol to the assay did not result in an increase in the kcat. Furthermore, the kcat values for each enzyme were quite different depending upon which substrate was used. This indicates that "galactosylation" (k2) was rate determining, and shows that this step was affected much more than "degalactosylation" (k3) by the changes in solvation of the planar transition state.
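The fold changes in inhibitor binding discussed above can be put on an energy scale. The following is a minimal sketch, assuming that each Ki approximates a true dissociation constant so that the change in binding free energy is RT ln(Ki,mutant/Ki,wild type); the function name is hypothetical, the temperature is the 25°C assay temperature, and the ratios used are the round figures quoted in the text rather than individual table entries.

```python
import math

GAS_CONSTANT = 8.314  # J mol-1 K-1


def binding_energy_loss_kj(ki_ratio, temp_celsius=25.0):
    """Loss of binding free energy (kJ/mol) for a given Ki(mutant)/Ki(wild type)."""
    if ki_ratio <= 0:
        raise ValueError("ki_ratio must be positive")
    temp_kelvin = temp_celsius + 273.15
    return GAS_CONSTANT * temp_kelvin * math.log(ki_ratio) / 1000.0


# "About 40 times" weaker binding of substrate analogs (H391E).
loss_40_fold = binding_energy_loss_kj(40.0)

# Each additional order of magnitude in Ki adds the same increment.
per_decade = binding_energy_loss_kj(10.0)

print(f"40-fold weaker binding: {loss_40_fold:.1f} kJ/mol")
print(f"each further 10-fold:   {per_decade:.1f} kJ/mol")
```

On this scale, a change in Ki of two to three orders of magnitude corresponds to roughly 11 to 17 kJ/mol, which is comparable in size to the loss of a single strong hydrogen-bonding interaction.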
IV. Conclusions

The results suggest that His-357 and His-391 are required for proper transition state stabilization and may form direct interactions with a planar galactosyl transition state intermediate. The presence of active site His residues which interact with the transition state in glycosidases has previously been shown in α-amylase (28). Studies of β-galactosidase utilizing deoxy and deoxyfluoro galactosyl analogs indicate that interactions at the 3-, 4-, and 6-positions contribute approximately 16.7 kJ (4 kcal) mol-1 each to the stabilization of the transition state, while interactions at the 2-position contribute at least 33.5 kJ (8 kcal) mol-1 (27). Our findings show that His-357 and His-391 might be the residues within the active site of β-galactosidase which mediate some of these interactions. As His-357 and His-391 are conserved in both the β-galactosidase and β-glucuronidase families, it is unlikely that either of these His residues is involved in interactions with the 4-hydroxyl position, since this hydroxyl is axial in galactose but equatorial in glucuronic acid.

We cannot absolutely discount the possibility that the observed loss in transition state binding is indirectly due to minor structural aberrations in the enzyme as a result of the substitutions. However, the crystal structure of β-galactosidase, which became available near the completion of this study, shows that His-357 and His-391 are near the known active site residues and probably line the active site cavity (24). Therefore, they have the potential to form direct interactions with the substrate in the transition state form.
Acknowledgments

We would like to thank Rob Penner for his invaluable technical assistance in the latter stages of this work. We would also like to thank R.H. Jacobson and B.W. Matthews for providing us the opportunity to examine preprints of the structure of β-galactosidase. Funding for this work was provided by the Alberta Heritage Foundation for Medical Research (AHFMR)
in the form of studentships and by the Natural Sciences and Engineering Research Council of Canada (NSERC).
References

1. Richardson, J.S., and Richardson, D.C. (1989) In "The Prediction of Protein Structure and the Principles of Protein Conformation" (G.D. Fasman, ed.), 1-98.
2. Kalnins, A., Otto, K., Ruther, U., and Muller-Hill, B. (1983) EMBO J. 2, 593-597.
3. Burchhardt, G., and Bahl, H. (1991) Gene 106, 13-19.
4. Buvinger, W.E., and Riley, M. (1985) J. Bacteriol. 163, 850-857.
5. David, S., Stevens, H., van Riel, M., Simons, G., and de Vos, W.M. (1992) J. Bacteriol. 174, 4475-4481.
6. Fanning, S., Leahy, M., and Sheehan, D. (1994) Gene 141, 91-96.
7. Hancock, K.R., Rockman, E., Young, C.A., Pearce, L., Maddox, I.S., and Scott, D.B. (1991) J. Bacteriol. 173, 3084-3095.
8. Poch, O., L'Hote, H.L., Dallery, V., Debeaux, F., Fleer, R., and Sodoyer, R. (1992) Gene 118, 55-63.
9. Schmidt, B.F., Adams, R.M., Requadt, C., Power, S., and Mainzer, S.E. (1989) J. Bacteriol. 171, 625-635.
10. Schroeder, C.J., Robert, C., Lenzen, G., McKay, L.L., and Mercenier, A. (1991) J. Gen. Microbiol. 137, 369-380.
11. Stokes, H.W., Betts, P.W., and Hall, B.G. (1985) Mol. Biol. Evol. 2, 469-477.
12. Gallagher, P.M., D'Amore, M.A., Lund, S.D., and Ganschow, R.E. (1988) Genomics 1, 215-219.
13. Jefferson, R.A., Burgess, S.M., and Hirsh, D. (1986) Proc. Natl. Acad. Sci. USA 81, 414-418.
14. Oshima, A., Kyle, J.W., Miller, R.D., Hoffman, J.W., Powell, P.P., Grubb, J.H., Sly, W.S., Tropak, M., Guise, K.S., and Gravel, R.A. (1987) Proc. Natl. Acad. Sci. USA 84, 685-689.
15. Nishimura, Y., Rosenfeld, M.G., Kreibich, G., Gubler, U., Sabatini, D.D., Adesnik, M., and Andy, R. (1986) Proc. Natl. Acad. Sci. USA 83, 7292-7296.
16. Kunkel, T.A., Roberts, J.D., and Zakour, R.A. (1987) Meth. Enzymol. 154, 367-382.
17. Cupples, C.G., Miller, J.H., and Huber, R.E. (1990) J. Biol. Chem. 265, 5512-5518.
18. Wallenfels, K., and Weil, R. (1972) In "The Enzymes" (Boyer, P.D., ed.) Vol. 7, pp. 617-663.
19. Deschavanne, P.J., Viratelle, O.M., and Yon, J.M. (1978) J. Biol. Chem. 253, 833-837.
20. Huber, R.E., and Gaunt, M.T. (1982) Can. J. Biochem. 60, 608-612.
21. Poteete, A.R., Rennell, D., and Bouvier, S.E. (1992) Proteins 13, 38-40.
22. Overington, J., Donnelly, D., Johnson, M.S., Sali, A., and Blundell, T.L. (1992) Protein Science 1, 216-226.
23. Dayhoff, M.O., Schwartz, R.M., and Orcutt, B.C. (1978) In "Atlas
SECTION VI Analysis of Protein Interactions
RAPID IN VITRO ASSEMBLY OF CLASS I MAJOR HISTOCOMPATIBILITY COMPLEX
Nicholas J. Papadopoulos, James C. Sacchettini, Stanley G. Nathenson, and Ruth Hogue Angeletti
Departments of Cell Biology, Microbiology & Immunology, Developmental & Molecular Biology, and Biochemistry, Albert Einstein College of Medicine, Bronx, NY 10461
I. Introduction
The class I major histocompatibility complex (MHC) molecule consists of a highly variable heavy chain in complex with a light chain (β2-microglobulin, β2m) and peptides of approximately 8-10 amino acids in length. This complex assembles in the endoplasmic reticulum and is transported to the surface of an antigen presenting cell (APC), where it is recognized by antigen-specific cytotoxic T lymphocytes. This recognition by T cells triggers a series of cellular events, leading to lysis of the target cell (1). The source of the peptides in these complexes can be either "self," i.e., derived from endogenous proteins, or foreign, from pathogenic sources. In inbred mice there are at least 22 allelic forms of heavy chain, each of which is thought to bind peptides with a specific structural motif, e.g., peptide length and position of specific amino acid side chains. The β2-microglobulin chain is invariant. Analysis of natural MHC molecules has led to identification of specific peptide epitopes for several of these alleles. The peptide motifs of two of the best-characterized murine alleles, H-2K^b and H-2D^b, have been identified, and the three-dimensional structures of their ternary complexes have been solved (2,3).

Previous methods for in vitro formation of intact MHC complexes using recombinant proteins solubilized from bacterial inclusion bodies, together with exogenous synthetic peptides, have relied upon lengthy dialysis techniques for refolding. Yields of properly folded complex from these methods ranged from 5-15%, with most of the protein misfolded into high molecular weight aggregates. In this communication, we present a novel method for the rapid assembly of properly folded MHC ternary complexes in relatively high yields utilizing exogenous synthetic peptides and overexpressed polypeptides from inclusion bodies (4).

TECHNIQUES IN PROTEIN CHEMISTRY VI. Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.
II. Experimental
Materials: Biochemicals were obtained from Sigma Chemical Corp. unless otherwise stated. Sequanal-grade urea and the BCA protein assay were obtained from Pierce Chemical Co. All water used was distilled before passage through a Milli-Q apparatus. HPLC solvents were obtained from Burdick and Jackson.

Solubilization: Heavy chains (K^b or D^b) and light chain (β2m) were isolated as insoluble E. coli inclusion bodies. These pellets were extensively washed with water and dissolved in 8 M urea containing 10 mM Tris HCl pH 8.5. After centrifugation at 15,000 rpm in an SS35 rotor (Sorvall RC5 centrifuge) for 30 min at 4°C, the solubilized heavy chains were stored at -70°C in urea solution for up to 1 week. Urea-solubilized β2m was prepared by dialysis at 4°C versus 10 mM Tris HCl pH 8.5 for 24 hr in the presence of 100 µM reduced glutathione, followed by an additional 24 hr with buffer alone.

Peptide synthesis and purification: Peptides were synthesized by solid-phase methods using either Fmoc or t-Boc chemistry on Applied Biosystems 430A or Advanced ChemTech peptide synthesizers. All peptides were purified by reversed phase HPLC to >95% purity with a Vydac C-18 column (2.1 or 4.6 mm x 25 cm, 300 Å) on a Hewlett Packard HP-1090M instrument. Peptides were further analyzed by amino acid analysis on a Hewlett Packard AminoQuant system after acid hydrolysis, and by mass spectrometry on a Finnigan MAT90 instrument. The standard peptide used in these experiments is the vesicular stomatitis virus (VSV) nucleocapsid protein N52-59, RGYVYQGL. Other peptides used were: simian virus (SV40) T-antigen, SAINNYAQKL or AINNYAQKL, and influenza virus epitope, ASNENMETM.

Gel-Exclusion Chromatography: Incubation mixtures (50-500 µl) were injected onto a Superdex-75 column (Pharmacia, molecular weight cutoff of 100,000) equilibrated in 10 mM potassium phosphate pH 7.0 containing 150 mM NaCl, operating at a flow rate of 0.75 ml/min. Absorbance was monitored at 214 and 280 nm.
Protein Sequencing & Mass Spectrometry: Proteins were sequenced on an Applied Biosystems 477A protein sequencer. Polypeptide molecular weights were determined on a Sciex API-III electrospray ionization mass spectrometer.
III. Results
Formation and Time Course of Assembly of MHC Complexes

For a typical incubation protocol in which β2m and peptide are in excess, 225 µg (6.8 nmoles) of K^b polypeptide (5.6 mg/ml) in 8 M urea containing 10 mM Tris HCl pH 8.5 are precipitated with a final concentration of 10% trichloroacetic acid at 4°C for 10 minutes. After centrifugation at 10,000 x g for 10 minutes at 4°C, the supernatant is discarded, and the pellet is washed twice with 100% ethanol. The pellet remaining after centrifugation at 10,000 x g is allowed to air dry for 5 minutes. To this pellet is added 50 nmoles of peptide from a stock solution at 10 mg/ml in water, 265 µg (24 nmoles) of β2m (1.76 mg/ml) in 10 mM Tris HCl pH 8.5, and 3.5 µmoles of reduced glutathione. The pellet does not dissolve at this point, but is gently dispersed before adding aliquots of 1 N NaOH. The solution will clear after the addition of 8-10 µl of base, at approximately pH 9. Sufficient distilled water and 1 M Tris HCl pH 8.5 are quickly added to bring the final volume of the reaction mixture to 200 µl and the final concentration of Tris to 55 mM. This reaction mixture is incubated at room temperature on a rotating platform.

Figure 1 shows the gel filtration profiles of the reaction mixture at zero time and after 18 hr of incubation. Note the conversion of reduced glutathione (GSH) to oxidized glutathione (GSSG) during this time. The peak at 13 minutes corresponds to an Mr of 45,000 and represents properly folded ternary complex. This peak has been sequenced to yield approximately equimolar amounts of the three components (Table I). The small peak at 14.5 minutes is presumed to be β2m dimers. This region of the gel filtration profile is expanded in Figure 2, which shows more details of the time dependence of complex formation. Within 8 hours, there is a six-fold increase in the amount of complex detected. The small amount of complex seen at 0 hr is probably complex formed as the incubation mixture is neutralized, just prior to analysis. At early time points, aggregates of misfolded K^b and β2m, which would give rise to a 10 minute peak, are spun down and discarded.
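As a quick check on the stoichiometry of this protocol, the quantities above can be converted to molar amounts and concentrations. The sketch below is illustrative arithmetic only. It assumes the molecular weights measured by ESI-MS for the full-length chains (32,349 for K^b and 11,818 for β2m), so the computed amounts differ slightly from the rounded nmole values quoted in the text; all function and variable names are hypothetical.

```python
# Illustrative arithmetic only. Masses and volumes come from the protocol
# text; the molecular weights are ASSUMED to be the ESI-MS values reported
# for the full-length chains (K^b 32,349; beta2m 11,818).
MW_KB = 32349.0    # g/mol, heavy chain
MW_B2M = 11818.0   # g/mol, beta2-microglobulin


def nmol_from_ug(mass_ug, mw):
    """Micrograms divided by g/mol gives micromoles; scale to nanomoles."""
    return mass_ug / mw * 1000.0


kb_nmol = nmol_from_ug(225.0, MW_KB)
b2m_nmol = nmol_from_ug(265.0, MW_B2M)
peptide_nmol = 50.0   # stated directly in the protocol
gsh_umol = 3.5        # reduced glutathione
volume_ul = 200.0     # final reaction volume

# umol per ul equals mol per liter, so multiply by 1000 to get mM.
gsh_mM = gsh_umol / volume_ul * 1000.0
# nmol per ul equals mmol per liter (mM), so multiply by 1000 to get uM.
kb_uM = kb_nmol / volume_ul * 1000.0

print(f"K^b: {kb_nmol:.1f} nmol ({kb_uM:.0f} uM in the final mixture)")
print(f"beta2m : K^b  = {b2m_nmol / kb_nmol:.1f} : 1")
print(f"peptide : K^b = {peptide_nmol / kb_nmol:.1f} : 1")
print(f"glutathione: {gsh_mM:.1f} mM")
```

Under these assumptions the peptide is present at roughly a 7-fold molar excess and β2m at roughly a 3-fold molar excess over heavy chain, with glutathione at 17.5 mM.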
If reduced glutathione is omitted from the reaction mixture, negligible amounts of complex are formed even after 18 hr of incubation, and instead a large peak is observed at 10 min, the void volume of the column (data not shown). Figure 3 shows the two small, late-eluting peaks in expanded scale. The peak at 27 min corresponds to the VSV octapeptide, and the peak at 31.5 min to the Tris buffer from the incubation mixture. Note that in contrast to the time-dependent increase in the complex peak, the quantity of the excess peptide decreases.
Table I: Sequence Analysis of MHC Complex Molecule

Position   1     2     3        4       5       6       7       8       9     10
Observed   R,M   G,I   Y,P,Q*   V,H,K   Y,S,T   Q,L,P   G,R,Q   L,Y,I   F,Q   V
Peptide    R     G     Y        V       Y       Q       G       L       -     -
K^b        M     G     P        H       S       L       R       Y       F     V
β2m        M     I     Q        K       T       P       Q       I       Q     V

*Since the yield of arginine is low, the relative proportion of the chains is better represented by comparing the molar yield at Edman cycle 3, where each polypeptide chain has a unique sequence: Q + E = 21.6 pmoles; Y = 15.6 pmoles; P = 20.9 pmoles.
[Figure 1: gel filtration chromatogram; horizontal axis, time (minutes); GSSG peak labeled.]
Figure 1: Gel Filtration Profile of MHC Assembly at 0 Hr (solid line) and 18 Hr (dashed line). Equal aliquots (50 µg) of incubation mixtures were separated on a Superdex-75 column. The elution times of individual components are identified: MHC complex; β2m; GSSG, oxidized glutathione; GSH, reduced glutathione; VSV-8mer, RGYVYQGL; Tris salt.

[Figure 2: gel filtration chromatogram, 12-16 minute region; horizontal axis, time (minutes).]
Figure 2: Expanded Gel Filtration Profile Showing the MHC Complex Region.

[Figure 3: gel filtration chromatogram with 0-hr and 8-hr traces; horizontal axis, time (minutes); VSV peak labeled.]
Figure 3: Expanded Gel Filtration Profile Showing the Low Molecular Weight Region.

[Figure 4: HPLC profile of denatured MHC complex; horizontal axis, retention time (minutes); labeled peaks, VSV-8mer, B2M, and KB.]
Figure 4: Reversed Phase HPLC Separation of Denatured MHC Complex. The MHC complex peak from the 8 hr incubation was collected, denatured and separated on a Vydac C-18 column (2.1 mm x 25 cm, 0.2 ml/min, 1%/min increase in acetonitrile concentration).
[Figure 5, Panel A: ESI mass spectrum; relative intensity (%) versus m/z (0-1400); labeled ions at m/z 478.3 and 955.6.]
Figure 5: ESI-Mass Spectrometry of Individual Components Isolated from the Denatured MHC Ternary Complex from Figure 4. Panel A: The peak at 31.5 minutes corresponds to the VSV-8mer peptide, RGYVYQGL. The ion at m/z = 955.6 corresponds to the singly protonated species, while the ion at m/z = 478.3 is the doubly protonated peptide. Predicted monoisotopic mass = 955.5. Panel B: The peak at 43 minutes corresponds to the β2-microglobulin polypeptide. The molecular weight peaks are 11,818 (full length polypeptide); 11,687 (protein minus N-terminal methionine); 11,801 (possible dehydration); 11,860 (N-acetylated protein). Upper = mass spectrum; lower = reconstructed mass spectrum. Panel C: The peak at 49 minutes corresponds to the K^b polypeptide. The molecular weight peaks are 32,349 (full length protein); 32,393 (N-acetylated protein). Upper = mass spectrum; lower = reconstructed mass spectrum.
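The ion assignments in Panel A can be checked with a short calculation. The sketch below is a hedged illustration, not the authors' procedure: it uses standard monoisotopic residue masses, and the helper names are hypothetical. With these constants the value of 955.5 quoted in the caption appears to correspond to the singly protonated ion rather than the neutral peptide. The last lines tabulate the mass differences between the β2m species listed for Panel B.

```python
# Monoisotopic residue masses (Da) for the amino acids present in RGYVYQGL.
RESIDUE_MASS = {
    "R": 156.10111, "G": 57.02146, "Y": 163.06333,
    "V": 99.06841, "Q": 128.05858, "L": 113.08406,
}
WATER = 18.01056   # added once per peptide for the free N- and C-termini
PROTON = 1.00728   # mass of the charge carrier in positive-ion ESI


def monoisotopic_mass(sequence):
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mz(neutral_mass, charge):
    """m/z of the [M + zH]z+ ion."""
    return (neutral_mass + charge * PROTON) / charge


peptide_mass = monoisotopic_mass("RGYVYQGL")
mz_1 = mz(peptide_mass, 1)
mz_2 = mz(peptide_mass, 2)
print(f"neutral M = {peptide_mass:.2f}")
print(f"[M+H]+    = {mz_1:.2f}")
print(f"[M+2H]2+  = {mz_2:.2f}")

# Mass differences between the beta2m species reported for Panel B,
# relative to the full-length polypeptide (values from the caption).
B2M_FULL = 11818
b2m_species = {
    "minus N-terminal Met": 11687,
    "possible dehydration": 11801,
    "N-acetylated": 11860,
}
b2m_shifts = {name: mass - B2M_FULL for name, mass in b2m_species.items()}
for name, shift in b2m_shifts.items():
    print(f"{name}: {shift:+d} Da")
```

The shifts of -131, -17, and +42 Da are in line with loss of a methionine residue, loss of water, and addition of an acetyl group, respectively, within the accuracy of the reconstructed masses.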
[Figure 5, Panels B and C: mass spectra (upper) and reconstructed mass spectra (lower); vertical axes, relative intensity (%).]
Analysis of the MHC Complex Peak

The ternary complex peak from the 8 hr time point was isolated, and 8 M guanidine hydrochloride was added to a final concentration of 800 mM. This denatured complex was then separated by reversed phase HPLC as shown in Figure 4. Three prominent peaks are seen at 214 nm. When the HPLC profile at 280 nm is integrated (data not shown), the individual components are present in approximately equimolar ratio, based upon their molar extinction coefficients at 280 nm: VSV peptide, 2,560; β2m, 17,900; K^b, 73,270. Although the integrated areas of VSV peptide and β2m were in the predicted 1:1 stoichiometry, the relative amount of K^b was only 0.6. We attribute this to poor recovery of K^b as well as to difficulties in integration of a tailing peak. These peaks were also collected and analyzed by ESI-mass spectrometry as shown in Figure 5 (panels A-C). The peaks at 31.5, 42.8 and 49 minutes correspond to the VSV octapeptide, β2m and K^b, respectively.
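The conversion from integrated 280 nm peak areas to relative molar amounts can be sketched as follows. The extinction coefficients are those quoted above; the peak areas are hypothetical numbers chosen only to reproduce the reported 1 : 1 : 0.6 outcome, since the integration data themselves are not shown, and the function name is an assumption.

```python
# Molar extinction coefficients at 280 nm, as quoted in the text.
EPS_280 = {"VSV peptide": 2560.0, "b2m": 17900.0, "Kb": 73270.0}


def relative_moles(areas, eps, reference):
    """Divide each peak area by its extinction coefficient, then
    normalize to the chosen reference component."""
    moles = {name: areas[name] / eps[name] for name in areas}
    ref = moles[reference]
    return {name: value / ref for name, value in moles.items()}


# HYPOTHETICAL integrated areas (arbitrary units), for illustration only.
example_areas = {"VSV peptide": 25.6, "b2m": 179.0, "Kb": 439.62}

ratios = relative_moles(example_areas, EPS_280, reference="VSV peptide")
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

Because K^b absorbs far more strongly than the peptide, equal molar amounts give very unequal peak areas, which is why the areas must be normalized before the components are compared.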
IV. Discussion
An in vitro system has been developed for the assembly of class I major histocompatibility complexes using recombinant murine heavy (H-2K^b or H-2D^b) and light (β2-microglobulin) chains plus synthetic peptides. This method is reproducible, and capable of rapidly and efficiently forming stable ternary complexes. The redox equilibrium protocol was adapted from procedures used to fold monomers or homopolymeric proteins. Thus, it is one of few heterodimeric protein systems assembled in this manner. In experiments not shown, larger scale preparations have been used for preparation of crystals suitable for x-ray diffraction studies.

Ternary complex can be detected as early as 1 hr into the incubation, and its formation is complete by 18 hr at room temperature. Under the conditions of excess light chain and peptide used in the above experiments, essentially 100% of the starting K^b polypeptide is incorporated into correctly folded, productive complex if the reaction is allowed to proceed for 18 hr. If the amount of peptide is decreased to below 3:1 (peptide:K^b polypeptide), or if β2m is lowered to 1:1 (β2m:K^b), the efficiency of complex formation is lowered. Nonspecific, high molecular weight aggregates composed of heavy chain and β2m are observed at the void volume of the column. No complex can be isolated in the absence of exogenous peptide. Thus, "empty" MHC complex molecules appear not to be formed (5).

The presence of glutathione in the incubation mixture is essential, emphasizing that proper formation of disulfide bonds is critical during the refolding process (6). Consistent with this is our observation that there are no free sulfhydryl groups in initially solubilized heavy chain, suggesting that mispaired disulfide bonds prevent the correct folding to ternary complex. Blocking of heavy chain sulfhydryl groups by N-ethyl maleimide also interferes with complex formation (data not shown).
The adaptability of the method to analysis of multiple samples facilitates screening of peptide variants, and consequently, the study of the rules for epitope binding. Using this method we have tested a variety of peptides previously reported to be K^b and D^b restricted, and have observed that they bind to their appropriate alleles. However, two peptides reported to be only H-2D^b restricted (SAINNYAQKL, AINNYAQKL) (7) have been shown by this method to be capable of binding to the H-2K^b allele as well. The ability of these peptides to bind to both K^b and D^b suggests that the conformation of these peptides can change such that the tyrosine residue in these D^b-specific peptides can insert into the anchor position in the K^b peptide binding groove. Another D^b-restricted peptide, ASNENMETM, does not form productive K^b complex in our assay, presumably due to the lack of a proper K^b anchor residue. Similarly, the VSV-8mer used in the above experiments cannot form ternary complexes with D^b polypeptide in our assay system, because there are no suitable anchor residues to fulfill the D^b motif requirement. Because MHC ternary complexes can be formed so efficiently, this method has potential for rapid screening of peptide analogues, possibly by radioassay, or for use in equilibrium and kinetic studies.
Acknowledgments: The authors wish to thank the Laboratory for Macromolecular Analysis for efforts on this project, particularly Edward Nieves, Yuan Shi, Nguyet Le and Dr. Xuejun Tang. N.P. was supported by a National Institutes of Health training grant, CA 09173. This work was supported by grants from the National Institutes of Health to S.G.N. (AI07289, AI33184, AR42533, CA13330). The Albert Einstein College of Medicine Cancer Center and Diabetes Research and Training Center provided general support for the Laboratory for Macromolecular Analysis.
References
1. GM van Bleek and SG Nathenson (1993) in "Naturally Processed Peptides", Chemical Immunology vol. 57, A Sette (ed.), Karger Publishing, Basel, pp 1-17.
2. W Zhang, ACM Young, M Imarai and SG Nathenson (1992) Proc. Natl. Acad. Sci. USA 89, 8403-8407.
3. ACM Young, W Zhang, JC Sacchettini and SG Nathenson (1994) Cell 76, 1-20.
4. DN Garboczi, DT Hung and DC Wiley (1992) Proc. Natl. Acad. Sci. USA 89, 3429-3433.
5. Y Saito, PA Peterson and M Matsumura (1993) J. Biol. Chem. 268, 21309-21317.
6. TE Creighton (1977) J. Mol. Biol. 113, 275-293.
7. Y Tanaka, RW Anderson, WL Maloy and SS Tevethia (1989) Virology 171, 205-213.
Peptide Models of bZIP Proteins: Quantitative Analysis of DNA Affinity and Specificity
Steven J. Metallo and Alanna Schepartz
Department of Chemistry, Yale University, New Haven, Connecticut 06511
I. Introduction

Cellular phenomena such as replication, recombination, differentiation, and cell growth are controlled at the most fundamental level by transcription factors, proteins that regulate gene expression through interactions with specific DNA sequences. The sequence specificity of a given DNA binding protein may be defined as the free energy of the correct protein·DNA complex relative to the free energies of all other protein·DNA complexes. Therefore, a complete understanding of sequence specificity requires the identification of not only those factors that enhance interaction with a correct site, but also those factors that inhibit interaction with incorrect (or partially correct) sites.^

Because of the atypically simple motif they use for DNA recognition, the bZIP class of eukaryotic transcription factors^ represents an excellent system in which to study those factors that influence sequence specificity. The bZIP motif consists of a short, helical, basic segment whose residues participate in DNA contacts, a zipper segment responsible for protein dimerization,^'^ and a six residue spacer segment of variable sequence connecting the two (Fig. 1).^ X-ray crystallography data of two bZIP·DNA complexes show the protein dimer to consist of a pair of uninterrupted α-helices that interact with each other along the length of the zipper segment to form a parallel coiled coil.^ These α-helices diverge in the vicinity of the nucleic acid and interact with the major groove of the target DNA.^''^

Like several other families of eukaryotic transcription factors,^"^^ bZIP proteins exhibit "half-site spacing specificity"; that is, they distinguish between target sites based on the number of base pairs separating two half-sites.^^ For example, bZIP proteins related to Fos and Jun (AP-1 family) prefer the nine base pair AP-1 target site (ATGACTCAT), whereas those related to CREB and CRE-BP1 (CREB/ATF family) prefer the ten base pair CRE target site (ATGACGTCAT).!^ Within a B-DNA context, the additional dG:dC base pair in the CRE target site displaces the two ATGA contact surfaces by an axial translation of 3.25 Å and a twist angle of 34.5°. This geometric operation moves the base and phosphate groups of one half site by approximately 4 and 7 Å, respectively.^ Despite the presumed structural differences between the CRE and AP-1 target sites, the yeast bZIP protein GCN4^^'^'^ operates in a mode distinct from the AP-1 or CREB/ATF families and binds both sites with comparable affinity. ^^

The experiments described here were initiated to explore the molecular basis for the half-site spacing selectivity of CREB/ATF proteins. We prepared a series of seven test peptides in which the basic, spacer, and zipper segments of GCN4 were systematically replaced by analogous segments from CRE-BP1, a prototypic CREB/ATF family member exhibiting high CRE/AP-1 specificity (ΔΔG_obs > 2 kcal·mol^-1). Equilibrium dissociation constants were determined for the CRE and AP-1 complexes of each peptide homodimer. Comparisons of the measured K_app allowed us to assess in a quantitative way the relative importance of each CRE-BP1 segment in determining DNA affinity and specificity.^^
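The displacements quoted in the Introduction for the extra base pair can be rationalized with a simple screw-motion calculation: a point at distance r from the helix axis that is rotated by the twist angle and translated by the rise moves by the hypotenuse of the chord length and the rise. The sketch below is an illustration under stated assumptions, not the calculation behind the quoted figures. The radii of 4 Å for base atoms and 10 Å for phosphate groups are assumed round values for B-DNA, and the function name is hypothetical.

```python
import math


def screw_displacement(radius, rise=3.25, twist_deg=34.5):
    """Distance moved by a point at `radius` (in angstroms) from the helix
    axis after a rotation of `twist_deg` about the axis combined with a
    translation of `rise` along it."""
    chord = 2.0 * radius * math.sin(math.radians(twist_deg) / 2.0)
    return math.hypot(chord, rise)


# ASSUMED radial positions (angstroms) for B-DNA, chosen for illustration.
BASE_RADIUS = 4.0
PHOSPHATE_RADIUS = 10.0

base_shift = screw_displacement(BASE_RADIUS)
phosphate_shift = screw_displacement(PHOSPHATE_RADIUS)
print(f"base atoms:       {base_shift:.1f} angstroms")
print(f"phosphate groups: {phosphate_shift:.1f} angstroms")
```

With these assumed radii the displacements come out near 4 and 7 Å, consistent with the values quoted in the text; different choices of radius shift the numbers accordingly.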
[Figure 1: aligned sequences of peptides ggg (GCN4), ccc (CRE-BP1), ggc, gcc, gcg, cgg, and ccg, with the basic, spacer, and zipper segments marked. GCN4 (ggg): SAALKRARNTEAARRSRARKLQRMKQLEDKVEELLSKNYHLENEVARLKKLVGER.]
Figure 1. Sequences of bZIP peptides used in this study.
II. Materials and Methods

Peptides and DNA. A peptide containing the sequence Gly-Ser at its amino terminus followed by the GCN4 bZIP element (residues 228-281, labeled ggg in Fig. 1) was obtained from Professor Jon Shuman. The concentration and identity of ggg were confirmed by amino acid analysis.^^ A peptide comprising the bZIP element of CRE-BP1 (residues 354-408, labeled ccc in Fig. 1), and five chimeric peptides, ggc, gcg, gcc, cgg and ccg, were synthesized and purified by use of standard methodology.^^ Stock solutions were prepared in a buffer containing 137 mM NaCl, 2.7 mM KCl, 4.3 mM Na2HPO4, 1.4 mM KH2PO4 (pH 7.4), 1 mM EDTA, 1 mM DTT, and 0.1% NP-40. Stock solutions containing less than 10 µM peptide in the absence of NP-40 did not give reproducible results. CRE_24 and AP-1_23 were prepared as described.^^

Electrophoretic mobility shift assays. Equilibrium dissociation constants of peptide·DNA complexes were determined by use of an electrophoretic mobility shift assay.^^'^^ In a typical procedure, a peptide was diluted serially from a stock solution of known concentration into PBS binding buffer (137 mM NaCl, 2.7 mM KCl, 4.3 mM Na2HPO4, 1.4 mM KH2PO4 (pH 7.4), 1 mM EDTA, 1 mM DTT, 0.1% NP-40, 0.4 mg/mL acetylated BSA, and 5% glycerol). To the peptide solution was added labeled DNA to a final concentration of <50 pM. Binding reactions were incubated for 30 minutes at 25 °C or 4 °C and then applied to a running nondenaturing polyacrylamide gel.^^ Gels were pre-equilibrated for 30 min at 300 V and maintained at a constant temperature during electrophoresis by immersion in a circulating, temperature-controlled water bath. Samples were loaded and subjected to electrophoresis at 300 V for 1 hour at 25 °C, or at 300 V for 3 hours at 4 °C. The amounts of complexed and free DNA were quantified on a Betagen 603 Blot Analyzer (Betagen Inc., Waltham, MA). Within each experiment, the sum of the cpm contained in the bound and free bands was constant to within 10%.

A series of experiments was performed to verify that all peptides studied reached equilibrium with their DNA target sites within 30 minutes. For each peptide, the scale of a typical binding reaction was doubled, and an aliquot was loaded from each reaction after 30 minutes and after 60 minutes, generating two titration curves. In each case, the fraction DNA bound in the presence of a given peptide concentration after a 30 minute incubation was within experimental error of the fraction DNA bound after a 60 minute incubation.

To determine whether the binding equilibrium was perturbed by the gel electrophoresis experiment, the amount of radioactivity located between the free and bound bands was quantified for a representative sample of experiments. If the peptide·DNA complex were dissociating while in the gel, the formerly complexed DNA would be located in the region between the free and bound bands. There was no significant increase in the amount of radioactivity present in this region relative to lanes containing only free DNA. We conclude from this experiment that the fraction DNA bound does not change once the sample has entered the gel.

Calculations. It has not been established whether all bZIP proteins bind DNA as pre-assembled dimers in a single step or whether two protein monomers bind sequentially with dimerization occurring on the DNA.^ If dimerization precedes DNA binding, the binding reaction will be described by scheme 1:

Scheme 1

$$2\,\mathrm{U} \;\overset{K_1}{\rightleftharpoons}\; \mathrm{A_2}, \qquad \mathrm{A_2} + \mathrm{O} \;\overset{K_2}{\rightleftharpoons}\; \mathrm{A_2O}$$

Here, U represents unfolded bZIP monomer, A2 represents coiled-coil homodimer, O represents duplex DNA, and A2O represents the DNA·peptide homodimer complex. K1 represents the dissociation constant of the peptide homodimer and K2 represents the dissociation constant of the complex of this homodimer and DNA. For this case, the following equilibrium and mass conservation equations hold:

$$K_1 = \frac{[\mathrm{U}]^2}{[\mathrm{A_2}]} \tag{1}$$

$$K_2 = \frac{[\mathrm{A_2}][\mathrm{O}]}{[\mathrm{A_2O}]} \tag{2}$$

$$A_{\mathrm{total}} = [\mathrm{U}] + 2[\mathrm{A_2}] + 2[\mathrm{A_2O}] \tag{3}$$
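Equations (1) and (3) can be combined to estimate how much peptide is dimeric at a given total concentration. If the [A2O] term is neglected, substituting [A2] = [U]^2/K1 into equation (3) gives a quadratic in [U]. The sketch below solves it for an assumed K1 of 5 µM, the value cited in the text for GCN4p; the function names are hypothetical, and the calculation is an illustration, not the authors' fitting procedure.

```python
import math


def monomer_concentration(a_total, k1):
    """Positive root of (2/K1)[U]^2 + [U] - A_total = 0, i.e. equation (3)
    with [A2] = [U]^2/K1 from equation (1) and the [A2O] term neglected."""
    return k1 * (math.sqrt(1.0 + 8.0 * a_total / k1) - 1.0) / 4.0


def fraction_in_dimer(a_total, k1):
    """Fraction of all peptide chains present in the A2 homodimer."""
    return 1.0 - monomer_concentration(a_total, k1) / a_total


K1 = 5e-6  # M; ASSUMED, the value cited in the text for GCN4p

for conc in (30e-9, 200e-9, 500e-9):
    percent = 100.0 * fraction_in_dimer(conc, K1)
    print(f"{conc * 1e9:.0f} nM total peptide: {percent:.1f}% in dimer")
```

Under this assumption roughly 1%, 7%, and 15% of the peptide is dimeric at 30, 200, and 500 nM total peptide, respectively.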
Under our experimental conditions, the total peptide concentration, A_total, exceeds the total DNA concentration (<50 pM) by a factor of at least 10. As a result, the term [A2O] in equation (3) is small compared with A_total and can be ignored. Equation (3) is simplified further by the consideration that the values of K1 for the peptide homodimers studied here are also greater than A_total. The value of K1 for GCN4p, a peptide that differs from ggg by four residues at the amino terminus, is approximately 5 µM under experimental conditions almost identical to those used here;^^ we assume that the K1 values for ggg and gcg are similar based on their similar Tm values (67 °C^^ versus 66 ± 7 °C, respectively). If so, at the highest concentration of ggg or gcg used (30 nM), the concentration of homodimer A2 will be no greater than 2% of A_total and may be ignored. For ccc, gcc and ggc, we estimate K1 values greater than 100 µM and as a result, the concentration of these homodimers will be negligible at all protein concentrations studied. For cgg and ccg, we can assign values for K1 that are comparable to that of gcg based on the comparable Tm values of the three peptides (Table I). At the top of the cgg·AP-1 titration, the cgg concentration is 500 nM and approximately 15% of the peptide is in the dimer state. At the cgg concentration required for half maximal binding of the AP-1 target site (which coincides with the top of the cgg·CRE titration), the cgg concentration is 200 nM, corresponding to approximately 7% dimer. At the top of the ccg·CRE titration, the ccg concentration is 200 nM, corresponding to approximately 7% dimer. If a value for K1 of 5 µM is defined, and the data describing formation of the cgg·CRE, cgg·AP-1, and ccg·CRE complexes are fit to an equation which expresses [U] in terms of K1 and A_total,^^ the values obtained for K_app fall within the error of the values obtained by approximating [U] with A_total. Thus it is possible to ignore the term [A2] in equation (3) and closely approximate [U] with A_total. In this regime the fraction DNA bound (Θ) is given by

$$\Theta = \frac{[\mathrm{A_2O}]}{[\mathrm{A_2O}] + [\mathrm{O}]} = \frac{1}{1 + \dfrac{K_1 K_2}{[\mathrm{U}]^2}} \tag{4}$$

Substitution of A_total for [U] yields the following expression for the dependence of Θ on A_total, where K_app = K1·K2:

$$\Theta = \frac{1}{1 + \dfrac{K_{\mathrm{app}}}{A_{\mathrm{total}}^2}} \tag{5}$$

An alternate binding model involves the sequential binding of two monomers to the target DNA, as represented by scheme 2:

Scheme 2

$$\mathrm{U} + \mathrm{O} \;\overset{K_{m1}}{\rightleftharpoons}\; \mathrm{AO}, \qquad \mathrm{U} + \mathrm{AO} \;\overset{K_{m2}}{\rightleftharpoons}\; \mathrm{A_2O}$$

Here, K_m1 represents the dissociation constant for binding the first protein monomer to DNA and K_m2 represents the dissociation constant for binding the second monomer to the complex AO. In this case the relevant equilibrium and mass action equations are

$$K_{m1} = \frac{[\mathrm{U}][\mathrm{O}]}{[\mathrm{AO}]} \tag{6}$$

$$K_{m2} = \frac{[\mathrm{U}][\mathrm{AO}]}{[\mathrm{A_2O}]} \tag{7}$$

$$A_{\mathrm{total}} = [\mathrm{U}] + [\mathrm{AO}] + 2[\mathrm{A_2O}] \tag{8}$$

Once again, since [U] closely approximates A_total, the fraction DNA bound (Θ) is given by

$$\Theta = \frac{[\mathrm{A_2O}]}{[\mathrm{A_2O}] + [\mathrm{O}]} = \frac{1}{1 + \dfrac{K_{m1} K_{m2}}{[\mathrm{U}]^2}} = \frac{1}{1 + \dfrac{K_{\mathrm{app}}}{A_{\mathrm{total}}^2}} \tag{9}$$
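Because both schemes reduce to the same one-parameter expression, Θ = 1/(1 + K_app/A_total^2), a titration can be fit for K_app alone. The following is a minimal sketch of such a fit, assuming noise-free synthetic data and a golden-section search over log10(K_app). It is not the commercial curve-fitting procedure used in this work, and the function names, search bounds, and concentrations are illustrative assumptions.

```python
import math


def fraction_bound(a_total, k_app):
    """Fraction of DNA bound at total peptide concentration a_total (M)
    for an apparent dissociation constant k_app (M^2)."""
    return 1.0 / (1.0 + k_app / a_total ** 2)


def fit_k_app(concs, fractions, log_lo=-22.0, log_hi=-10.0, tol=1e-9):
    """Least-squares estimate of K_app by golden-section search over
    log10(K_app) between the ASSUMED bounds log_lo and log_hi."""
    def sse(log_k):
        k = 10.0 ** log_k
        return sum((fraction_bound(a, k) - f) ** 2
                   for a, f in zip(concs, fractions))

    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = log_lo, log_hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sse(c) < sse(d):
            b = d   # minimum lies in [a, d]
        else:
            a = c   # minimum lies in [c, b]
    return 10.0 ** ((a + b) / 2.0)


# Synthetic titration generated from a known K_app (illustration only).
K_TRUE = 4.1e-18  # M^2
concs = [0.5e-9, 1e-9, 2e-9, 4e-9, 8e-9, 16e-9]
fractions = [fraction_bound(c, K_TRUE) for c in concs]

k_fit = fit_k_app(concs, fractions)
print(f"fitted K_app = {k_fit:.2e} M^2")
print(f"half-maximal binding near {math.sqrt(k_fit) * 1e9:.1f} nM peptide")
```

One consequence of the squared concentration term is that half-maximal binding occurs at a total peptide concentration equal to the square root of K_app.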
Therefore, regardless of the pathway that the reaction follows, fitting the data to equation (5) will yield the correct K_app. Dissociation constants were estimated by nonlinear least squares analysis of the data using Kaleidagraph 3.0.2 (Abelbeck Software, Reading, PA).

Circular Dichroism Experiments. CD experiments were performed on an Aviv 62DS spectrometer with a 0.1 cm path length cell. Samples contained 50 mM potassium phosphate pH 7.0, 200 mM KCl, and 220 µM peptide monomer. Thermal stability was determined by monitoring the signal at 222 nm while increasing the temperature at a rate of 1 °C per minute. Spectra were baseline corrected but were not smoothed. For each peptide, the Tm was determined by taking the first derivative of the CD signal (Θ) with respect to the temperature (temperature in Kelvin) and finding the maximum of this function.^^ The error in the measurement of Tm was taken as the width of the dΘ/dT plot.
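The first-derivative procedure for locating Tm can be sketched numerically. The example below uses a synthetic melting curve (a logistic transition with an assumed midpoint of 339.15 K and assumed baseline values), not data from this study, and estimates the derivative by central finite differences; the function name is hypothetical.

```python
import math


def melting_temperature(temps, signal):
    """Return the temperature at which the central-difference estimate of
    d(signal)/dT is largest. End points are skipped because a central
    difference needs a neighbor on each side."""
    best_index, best_slope = None, -math.inf
    for i in range(1, len(temps) - 1):
        slope = (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_index, best_slope = i, slope
    return temps[best_index]


# Synthetic melt, 1 K steps from 278.15 K to 368.15 K (illustration only).
TM_TRUE = 339.15       # K, assumed midpoint
FOLDED, UNFOLDED = -30.0, -5.0   # assumed 222 nm signals, arbitrary units
WIDTH = 4.0            # K, assumed breadth of the transition

temps = [278.15 + i for i in range(91)]
signal = [FOLDED + (UNFOLDED - FOLDED) / (1.0 + math.exp(-(t - TM_TRUE) / WIDTH))
          for t in temps]

tm = melting_temperature(temps, signal)
print(f"estimated Tm = {tm:.2f} K ({tm - 273.15:.1f} C)")
```

The helical signal at 222 nm is negative and becomes less negative on unfolding, so the derivative is positive through the transition and its maximum marks the midpoint. With unsmoothed experimental data the maximum is broader and noisier than in this idealized curve.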
III. Results and Discussion

A. Thermal stabilities of bZIP peptides and chimeras

Circular dichroism spectroscopy was used to evaluate the thermal stabilities of the six peptides shown in Figure 1 (Table I). The measured Tm values vary between <4 °C and 68 °C, and thermal stability appears to track with the origin of the zipper segment. The three peptides containing the CRE-BP1 zipper segment each display Tm values below 25 °C, whereas the three peptides containing the GCN4 zipper segment display Tm values above 65 °C. The correlation between thermal stability and CRE-BP1 sequence seen in three different peptides and chimeras suggests that the CRE-BP1 coiled coil, like the Fos coiled coil,^-^ may be inherently unstable relative to the GCN4 coiled coil.

Table I. Tm for bZIP peptides and chimeras

Peptide   Tm (°C)
ccc       <4
gcc       <4
ggc       20 ± 5
gcg       66 ± 7
cgg       68 ± 7
ccg       65 ± 7

B. DNA Binding
Equilibrium dissociation constants were measured for the CRE_24 and AP-1_23 complexes of the seven peptides shown in Figure 1 (Table II). With the exceptions of ccc and ccg, which did not form detectable complexes with the AP-1 target site at 25 °C, all of the peptides bound the two DNA sequences tested. Calculated values for K_app are shown in Table II. Our results may be summarized as follows:

(1) Peptide ggc bound with equal affinity to the CRE and AP-1 target sites. This result indicates that although it is possible to alter the half-site spacing specificity of a bZIP peptide through changes in dimerization domain structure,^'^^ CRE-BP1 does not use this mechanism to discriminate between the CRE and AP-1 target sites.

(2) Peptides gcg and gcc displayed moderate specificity for the CRE target site (approximately 35 and 50%, respectively, of that exhibited by ccc). This implies that residues within the CRE-BP1 spacer segment are important for CRE specificity. Our results are consistent with those obtained by others, in which certain mutations within the GCN4 spacer segment were shown to increase CRE target site preference.^^'-^^

(3) Peptide cgg exhibited considerable specificity for the CRE target site, approximately 75% that exhibited by ccc. The specificity of ccg was equal to that of ccc. Taken together, these results imply that the major determinants of CRE target site specificity exhibited by CRE-BP1 reside within the basic segment.

(4) The specificity of ccc for the CRE target site was achieved at the expense of affinity; this peptide bound the CRE target site with an affinity 300 times lower than that of ggg.

Table II. Equilibrium dissociation constants of bZIP peptide·DNA complexes
           K_app (M^2)^a                                     ΔG°_obs^b             ΔΔG°_obs^c
Peptide    AP-1                     CRE                      AP-1       CRE
ggg        (2.1 ± 0.4) x 10^-18     (4.1 ± 0.6) x 10^-18     -24.1      -23.7      -0.4
ccc        N.D.                     (1.2 ± 0.1) x 10^-15     N.D.       -20.3      >2.3^e
ccc        (2.1 ± 0.3) x 10^-15^d   (3.0 ± 0.4) x 10^-17^d   -18.6^d    -20.9^d    2.3^d
ggc        (3.8 ± 0.6) x 10^-17     (4.1 ± 0.7) x 10^-17     -22.4      -22.3      -0.1
gcc        (1.5 ± 0.9) x 10^-16     (2.4 ± 0.2) x 10^-17     -21.6      -22.7      1.1
gcc        (6.5 ± 1.3) x 10^-18^d   (1.1 ± 0.1) x 10^-18^d   -21.8^d    -22.8^d    1.0^d
gcg        (2.1 ± 0.1) x 10^-17     (6.1 ± 0.5) x 10^-18     -22.7      -23.5      0.8
cgg        (3.3 ± 0.3) x 10^-14     (2.2 ± 0.3) x 10^-15     -18.4      -20.0      1.6
ccg        N.D.                     (2.5 ± 0.2) x 10^-16     N.D.       -21.2      >2.5^e
ccg        > 1 x 10^-14^d           (5.0 ± 0.9) x 10^-16^d   >-16.8^d   -19.4^d    >2.5^d

^a Determined as described in Materials and Methods. Unless noted otherwise, all values refer to data obtained at 25 °C. Each value represents the mean of at least 3 determinations (except for gcc at 4 °C) ± SEM.
^b ΔG°_obs in units of kcal·mol^-1 is equal to -RT ln(1/K_app), where T is 298 (or 277) K and R is 0.001987 kcal·mol^-1·K^-1.
^c ΔΔG°_obs is equal to ΔG°_obs(AP-1) - ΔG°_obs(CRE).
^d Determined at 4 °C.
^e Estimated based on ΔΔG°_obs determined at 4 °C and the failure to observe a ccc·AP-1 or ccg·AP-1 complex at a protein monomer concentration of 300 nM. We believe this estimate is reasonable because the ΔΔG°_obs values determined for the gcc peptide at 25 °C and 4 °C are comparable.
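The free energies in Table II follow directly from the K_app values through the relation given in the footnotes. The short sketch below reproduces a few entries as a consistency check, using the stated gas constant and temperatures; the function names are hypothetical.

```python
import math

R = 0.001987  # kcal/(mol*K), as given in the table footnote


def delta_g(k_app, temp_k=298.0):
    """Observed free energy (kcal/mol) from -RT ln(1/K_app)."""
    return -R * temp_k * math.log(1.0 / k_app)


def delta_delta_g(k_ap1, k_cre, temp_k=298.0):
    """Specificity: delta_g for the AP-1 site minus delta_g for the CRE site."""
    return delta_g(k_ap1, temp_k) - delta_g(k_cre, temp_k)


# ggg at 25 C (298 K): K_app values for the AP-1 and CRE sites.
print(f"ggg AP-1: {delta_g(2.1e-18):.1f} kcal/mol")
print(f"ggg CRE:  {delta_g(4.1e-18):.1f} kcal/mol")
print(f"ggg specificity: {delta_delta_g(2.1e-18, 4.1e-18):.1f} kcal/mol")

# ccc at 4 C (277 K): K_app values for the AP-1 and CRE sites.
print(f"ccc specificity at 4 C: "
      f"{delta_delta_g(2.1e-15, 3.0e-17, temp_k=277.0):.1f} kcal/mol")
```

Positive specificity values indicate a preference for the CRE site, so the value near -0.4 kcal/mol for ggg reflects a slight preference for AP-1, while the value near 2.3 kcal/mol for ccc reflects a strong preference for CRE.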
C. Conclusions

Using block substitutions, we determined the effects on DNA specificity and affinity of the zipper, spacer, and basic segments of GCN4 and CRE-BP1. CRE/AP-1 specificity is encoded by residues within the spacer and basic segments of the bZIP element. Of these two regions, the basic segment plays the dominant role. Our finding that the determinants of half-site spacing specificity, like the determinants of base-pair specificity, are encoded primarily within the basic segment represents a further concentration of recognition information within the short span of a bZIP recognition helix.
References
(1) Cuenoud, B.; Schepartz, A. Proc. Natl. Acad. Sci. USA 1993, 90, 1154-1159.
(2) McKnight, S. L. Sci. Am. 1991, 54-64.
(3) Landschulz, W. H.; Johnson, P. F.; McKnight, S. L. Science 1988, 240, 1759-1764.
(4) Pu, W. T.; Struhl, K. Proc. Natl. Acad. Sci. USA 1991, 88, 6901-6905.
(5) O'Shea, E. K.; Klemm, J. D.; Kim, P. S.; Alber, T. Science 1991, 254, 539-544.
(6) Konig, P.; Richmond, T. J. J. Mol. Biol. 1993, 230, 139-154.
(7) Ellenberger, T. E.; Brandl, C. J.; Struhl, K.; Harrison, S. C. Cell 1992, 71, 1223-1237.
(8) Umesono, K.; Evans, R. M. Cell 1989, 57, 1139-1146.
(9) Reece, R. J.; Ptashne, M. Science 1993, 261, 909-911.
(10) Corton, J. C.; Johnston, S. A. Nature 1989, 340, 724-727.
(11) Paolella, D. N.; Palmer, C. R.; Schepartz, A. Science 1994, 264, 1130-1133.
(12) Hai, T.; Liu, P.; Coukos, J.; Green, M. R. Genes & Development 1989, 3, 2083-2090.
(13) Penn, M. D.; Galgoci, B.; Greer, H. Proc. Natl. Acad. Sci. USA 1983, 80, 2704-2708.
(14) Hinnebusch, A. G.; Fink, G. R. Proc. Natl. Acad. Sci. USA 1983, 80, 5374-5378.
(15) Sellers, J. W.; Vincent, A. C.; Struhl, K. Mol. Cell. Biol. 1990, 10, 5077-5086.
(16) Metallo, S. J.; Schepartz, A. submitted 1994.
(17) Cuenoud, B.; Schepartz, A. Science 1993, 259, 510-513.
(18) Garner, M. M.; Revzin, A. Nucleic Acids Research 1981, 9, 3047-3060.
(19) Fried, M.; Crothers, D. M. Nucleic Acids Research 1981, 9, 6505-6525.
(20) Weiss, M. A.; Ellenberger, T.; Wobbe, C. R.; Lee, J. P.; Harrison, S. C.; Struhl, K. Nature 1990, 347, 575-578.
(21) Brown, B. M.; Sauer, R. T. Biochemistry 1993, 32, 1354-1363.
(22) Weiss, M. Biochemistry 1990, 29, 8020-8024.
(23) O'Shea, E. K.; Rutkowski, R.; Kim, P. S. Cell 1992, 68, 699-708.
(24) Kim, J.; Tzamarias, D.; Ellenberger, T.; Harrison, S. C.; Struhl, K. Proc. Natl. Acad. Sci. USA 1993, 90, 4513-4517.
(25) Johnson, P. F. Mol. Cell. Biol. 1993, 13, 6919-6930.
Applying affinity coelectrophoresis to the study of non-specific, DNA binding peptides
Michael L. Nedved and Gregory R. Moe
Department of Chemistry and Biochemistry, University of Delaware, Newark, DE 19716
I. Introduction

There are many instances where it is of interest to characterize the nucleic acid binding activity of peptides and peptide fragments of proteins. However, it is often difficult or impossible to accurately measure binding constants for nucleic acid binding peptides since the methods available were developed for proteins that bind to nucleic acids with high affinity and specificity. For example, the commonly used filter binding and gel-shift assays depend on the relatively slow kinetics of ligand dissociation in order to separate the bound and unbound species (1). This requirement is rarely met in peptide binding experiments. Alternatively, spectroscopic methods such as NMR, fluorescence, or circular dichroism spectroscopy can be used but are often complicated by a non-linear dependence of the measured spectral parameter on the extent of binding (1).

Recently, we have used affinity coelectrophoresis (ACE) (2) to characterize the non-specific, DNA binding activity of several small peptides. In order to demonstrate the uses and advantages of this method, typical results for three peptides, TPPI, Xfin-31, and clupeine Z, are summarized below. TPPI is a twenty-two amino acid peptide having an amino acid sequence similar to that of a proline repeat motif in the replication arrest protein, Tus (3). Xfin-31 (4) is the thirty-first zinc finger of the Xenopus laevis protein, Xfin (5), and is a typical example of a "classical" zinc finger. The DNA binding properties of Xfin-31 have been characterized previously using the gel-shift assay (6). Clupeine Z is a protamine isolated from salmon sperm (7), and its DNA binding activity has been characterized using spectroscopic methods (8, 9).

We show here that the ACE data can be analyzed using the theory developed by McGhee and von Hippel (10) for non-specific binding of ligands to a homogeneous lattice to obtain binding constants and cooperativity parameters. Additionally, the effect of lattice length on the estimation of these parameters is considered. Site sizes are estimated based on the DNA mobility at saturating peptide concentrations. Finally, ACE can also be used to measure the salt dependence of peptide-DNA binding. The number of cations released and the non-electrostatic component of the binding constant can then be obtained by fitting the data to the equation derived by Record et al. (11).

TECHNIQUES IN PROTEIN CHEMISTRY VI Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.
II. Materials and Methods

Xfin-31 (4) and TPPI, a twenty-two residue peptide (3), were synthesized using solid-phase peptide synthesis and purified by HPLC. The oxidized form of Xfin-31 (Xfin-31(S-S)) was made using K3Fe(CN)6 (12). Clupeine sulfate was from Sigma, and clupeine Z was purified using the method of Ando and Suzuki (13). The identity of all peptides was determined by electrospray mass spectrometry and where possible by amino terminal sequencing. Concentrations of Xfin-31, TPPI, and clupeine Z were determined spectrophotometrically. Synthetic deoxyoligonucleotides were end-filled after annealing using unlabeled nucleotide triphosphates, [α-32P]-dATP, and Sequenase® (United States Biochemical) to give a 19-mer containing the consensus binding site of Sp1 (14), a 37-mer containing the TerB termination sequence in E. coli (15), and a 51-mer containing the UV-5 sequence of the lac operon (16).

The procedure for affinity coelectrophoresis (2) was used with the following modifications. A glass plate (10.1 cm × 8.2 cm × 0.1 cm) was placed in a Plexiglas gel-casting box (inner dimensions 10.1 cm × 8.2 cm) open on two sides which were subsequently sealed with cellophane tape. High-melting agarose was used to cast a 1% (w/v) gel (10.1 cm × 8.2 cm × 0.5 cm) containing TAE buffer (40 mM Tris-acetate pH 8.3, 1 mM Na2EDTA). A continuous DNA well was made at one end of the gel approximately 0.5 cm from the peptide lanes by setting a comb (7 cm × 2.2 cm × 0.1 cm) to a depth of 0.4 cm. Using a template drawn on the bottom of the box, a control lane and eight peptide lanes (5 cm × 0.5 cm × 0.5 cm each) were cut from the gel with a scalpel. Individual 500 µL solutions containing buffer or peptide at twice the desired concentration were incubated for two minutes at 65 °C, diluted with an equal volume of 2% agarose in TAE buffer, and pipetted into the cut lanes. The gel was then cut to 7.5 cm × 8.2 cm × 0.5 cm, placed in an electrophoresis chamber, and equilibrated for fifteen minutes in TAE buffer that had been cooled to 5 °C. A labeled DNA solution (100 µL) containing TAE buffer and 15% glycerol without dyes was loaded into the DNA well and electrophoresed at 3-4 V/cm with buffer circulation until the DNA had migrated over half the length of the peptide lanes. The gel was then wrapped in plastic film and autoradiographed.

Gels run with Xfin-31 in the zinc complex form used TAE buffer without Na2EDTA (TA buffer). The salt dependence of TPPI was done using TA buffer and KCl. The data were analyzed using Sigma-Plot® (Jandel Scientific), a non-linear, least-squares, curve fitting program. Errors given are from the best fits of the data.
III. Results

In an ACE gel, the electrophoretic mobility of the DNA depends on its net charge as long as the DNA and the DNA-ligand complex are much smaller than the pore size of the matrix through which they move (2, 17). The net charge of the DNA is reduced in proportion to the amount of ligand bound. A typical autoradiogram of an ACE gel for oxidized Xfin-31 binding to a nineteen base-pair oligonucleotide is shown in Figure 1. The change in mobility as a function of ligand concentration is represented by the retardation coefficient (R), which is the ratio of the distance traveled by the DNA in the presence of peptide to the distance it travels in the absence of peptide, as measured from the center of the bands on the autoradiogram. R∞ is the DNA mobility at saturating peptide concentrations.

Figure 1. Autoradiogram of an ACE gel showing Xfin-31 (oxidized) binding to the 19-mer DNA sequence. Peptide concentrations from left are 0, 2, 3, 4, 5, 8, 10, 14, and 18 µM.

Since all of the peptides used in this study bind non-specifically to DNA, the data were analyzed using the McGhee-von Hippel model (10) where, in this case, the binding density is proportional to R. For ligands that bind to a specific sequence, the method of data analysis described by Lim et al. (2) is appropriate. For non-cooperative binding, the McGhee-von Hippel equation (10) becomes

$$\frac{R}{L} = K(1-cR)\left(\frac{1-cR}{1-(c-1)R}\right)^{c-1} \tag{1}$$
where L is the free ligand concentration, which is approximately equal to the total peptide concentration when the peptide is in excess over the DNA, K is the intrinsic association constant, and c is the unitless proportionality constant relating R to the binding density and is equal to 1/R∞. For cooperative binding, the corresponding McGhee-von Hippel equation becomes

$$\frac{R}{L} = K(1-cR)\left(\frac{(2\omega-1)(1-cR)+R-H}{2(\omega-1)(1-cR)}\right)^{c-1}\left(\frac{1-(c+1)R+H}{2(1-cR)}\right)^{2} \tag{2}$$
Figure 2. A.) A binding curve for TPPI using the 37 base-pair sequence, plotted against [TPPI] (µM). The data were fit using the binding equation of Lim et al. (2). Error bars represent the error in R based on a 1 mm error in measuring distances on the autoradiogram. B.) A Scatchard plot of the same data, which were fit using equation 1.
where $H = \sqrt{[1-(c+1)R]^{2} + 4\omega R(1-cR)}$, and ω is the cooperativity factor as defined by McGhee and von Hippel (10). As shown in Figure 2 for TPPI binding to a 37 base-pair synthetic oligonucleotide, a plot of R as a function of peptide concentration describes a simple binding isotherm characteristic of ligands binding to non-interacting sites. A Scatchard plot of the data is shown in the inset. By fitting the data to the non-cooperative equation of the McGhee-von Hippel model (equation 1) (10), the binding constant and the value of R∞ were obtained (Table I). A plot of log K
Table I. Summary of ACE Data

Peptide          Zpᵃ    M⁺ᵇ             nᶜ     ωᵈ     K (M⁻¹)ᵉ             Kω (M⁻¹)ᶠ            R∞ᵍ
Xfin-31(Zn2+)    5+     Na+ (2 mM)      3.0    61     (8.1 ± 0.7) × 10³    (4.9 ± 0.5) × 10⁵    0.96
Xfin-31(S-S)     5+     Na+ (2 mM)      3.4    3.9    (3.0 ± 0.1) × 10⁴    (1.2 ± 0.1) × 10⁵    0.84
TPPI             7+     K+ (2 mM)       4.0    1      (1.4 ± 0.1) × 10⁶    (1.4 ± 0.1) × 10⁶    1.00
clupeine Z       21+    Na+ (50 mM)     24     85     (5.3 ± 1.3) × 10⁴    (4.5 ± 1.5) × 10⁶    0.50

ᵃ Net charge on the peptide at pH 8.3.
ᵇ Mono-valent cation.
ᶜ Site size in base-pairs calculated using equation 3.
ᵈ Cooperativity factor.
ᵉ Intrinsic association constant.
ᶠ Binding constant for singly-contiguous sites.
ᵍ DNA mobility at saturating peptide concentrations.
Figure 3. Binding isotherms for Xfin-31(S-S) (closed circles) and Xfin-31(Zn2+) (open circles) using the 19 base-pair sequence, plotted against [Xfin-31] (µM). The data were fit using the binding equation of Lim et al. (2).
versus log [K+] for TPPI is linear in the range of [KCl] from 50 mM to 150 mM. Applying the theory of Record et al. (11), the maximum number of cations released can be estimated from the slope, which is 4.7 ± 0.3 for TPPI. The non-electrostatic binding constant obtained by extrapolation to 1 M KCl is 0.6 M⁻¹, indicating that the binding is almost entirely electrostatic for this ligand.

ACE data can be used to estimate the site size by using the DNA mobility at saturating peptide concentrations, R∞, and equation 3.

$$n = \frac{N Z_P}{Z_D R_\infty} \tag{3}$$
In equation 3, N is the number of base-pairs, ZP is the net charge on the peptide, and ZD is the net charge on the nucleic acid. ZD is estimated by multiplying the number of phosphates by 0.88, the theoretical constant for duplex DNA (11). In contrast to TPPI, the binding curves for Xfin-31 in the zinc complex and oxidized forms binding to the 19-mer were sigmoidal (Figure 3), and the Scatchard plots were "humped" (Figure 4). These characteristics are typical of ligands that bind cooperatively. The data in Figure 4 were fitted to equation 2, the McGhee-von Hippel equation for cooperative binding to an infinite, homogeneous lattice (10). This form of the equation includes a cooperativity factor, ω, in addition to the intrinsic association constant, K, and R∞. DNA binding by the Xfin-31 zinc complex has been characterized previously using the gel-shift assay (6). Although the value for Kω, the binding constant for singly-contiguous sites (10), determined by ACE is similar to the binding constant determined by the gel-shift assay (6), the cooperative nature of Xfin-31 in the
Figure 4. Scatchard plots of the binding data from Figure 3 for Xfin-31(S-S) (closed circles) and Xfin-31(Zn2+) (open circles). The data were fit using equation 2.
zinc complex form binding to DNA was not apparent in the gel-shift assay. The binding site sizes determined for both forms of Xfin-31 using equation 3 are nearly identical (Table I) and agree well with the three base-pair binding site observed for single zinc fingers in zinc finger-DNA co-crystal structures (18, 19).

The McGhee-von Hippel theory (10) was developed for an infinite lattice; however, it is necessary to use relatively small, finite lattices (oligonucleotides) in the ACE gel assay to avoid sieving effects. This might be expected to affect the accuracy of the binding parameters derived from fits of the ACE data to equation 2 as the lattice approaches saturation for highly-cooperative ligands with large binding site sizes (20, 21). In order to determine the effects of finite lattice length on estimates of the binding parameters, the ACE assay was used to characterize the DNA binding of clupeine Z to a 51 base-pair synthetic oligonucleotide. Clupeine Z has been shown previously to bind cooperatively to DNA using spectroscopic methods (8, 9). The binding site size determined previously for clupeine Z is ~20 base-pairs (8, 9), so that there are approximately two binding sites on the oligonucleotide used here. As shown in Figure 5, clupeine Z exhibits the "humped" Scatchard plot of cooperative ligands, confirming that ligands known to bind cooperatively to DNA exhibit cooperative binding in the ACE assay as well. Of the parameters estimated from a non-linear, least-squares fit of the clupeine Z ACE data, K and the cooperativity parameter, ω, are smaller by a factor of ~100 and ~2, respectively, than those determined spectroscopically using considerably longer DNA fragments (2, 3). These results are consistent with theoretical (20) and experimental estimates (21) of the effects of finite lattice length on binding parameters obtained from the McGhee-von Hippel equation. Therefore, for highly cooperative ligands, longer DNA fragments or alternative methods of data
Figure 5. A Scatchard plot showing the binding of clupeine Z to the 51 base-pair DNA sequence. The data were fit using equation 2.
analysis must be used to avoid underestimating K and ω (20, 21, 22). In contrast, the value of R∞ and the binding site size estimated therefrom appear to be insensitive to the lattice length, since the binding site size determined from the ACE assay (~24 base-pairs) agrees reasonably well with the site size estimated using a much longer lattice (8, 9). The binding parameters of Xfin-31 in the zinc complex and oxidized forms were unaffected due to the lower cooperativity and larger number of available binding sites on the oligonucleotide used.
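The two forms of the McGhee-von Hippel equation used in these fits (equations 1 and 2) can be written as short functions, which makes it easy to see why cooperative ligands give a "humped" Scatchard plot. The sketch below is illustrative only: the function names are hypothetical, and the expressions are valid for 0 ≤ R < 1/c. The cooperative form is undefined at exactly ω = 1, where it approaches the non-cooperative form as a limit.

```python
import math


def mvh_noncooperative(R, K, c):
    """Right-hand side of equation 1, i.e. R/L for non-cooperative binding."""
    return K * (1 - c * R) * ((1 - c * R) / (1 - (c - 1) * R)) ** (c - 1)


def mvh_cooperative(R, K, c, omega):
    """Right-hand side of equation 2, i.e. R/L for cooperative binding.

    omega must differ from 1; H is the square-root term defined in the text.
    """
    H = math.sqrt((1 - (c + 1) * R) ** 2 + 4 * omega * R * (1 - c * R))
    first = ((2 * omega - 1) * (1 - c * R) + R - H) / (2 * (omega - 1) * (1 - c * R))
    second = (1 - (c + 1) * R + H) / (2 * (1 - c * R))
    return K * (1 - c * R) * first ** (c - 1) * second ** 2


def free_ligand(R, K, c, omega=None):
    """Free ligand concentration L that gives retardation R (omega=None: equation 1)."""
    if omega is None:
        return R / mvh_noncooperative(R, K, c)
    return R / mvh_cooperative(R, K, c, omega)


if __name__ == "__main__":
    c = 1 / 0.96  # c = 1/R_inf, using R_inf for Xfin-31(Zn2+) from Table I
    for R in (0.0, 0.1, 0.2, 0.4, 0.8):
        # Scatchard ordinate (R/L) in units of K: rises above 1 before falling
        print(R, mvh_cooperative(R, 1.0, c, 61))
```

For ω > 1 the Scatchard ordinate R/L starts at K, rises, and then falls to zero as R approaches R∞, whereas the non-cooperative form decreases monotonically from K.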
IV. Conclusions

Affinity coelectrophoresis (ACE) (2) was applied to study the non-specific binding properties of a peptide (TPPI) from the replication arrest protein, Tus (3), a zinc finger peptide, Xfin-31 (4), in the oxidized and zinc complex forms, and a protamine, clupeine Z (7), using different DNA sequences. ACE data were used to construct simple binding curves and Scatchard plots, and the McGhee-von Hippel theory (10) was used to model the binding of both non-cooperatively (TPPI) and cooperatively binding ligands (Xfin-31, clupeine Z) in order to determine association constants (K) and cooperativity parameters (ω). Additionally, the number of salt contacts and the non-electrostatic component of binding were determined for the TPPI peptide by applying the theory of Record et al. (11). The binding constant determined for Xfin-31 in the zinc complex form is in good agreement with that reported using the gel-shift assay (6). Site sizes can be estimated using ACE data, and the site size determined for Xfin-31 in the zinc complex form was similar to that observed in zinc finger-DNA co-crystal structures (18, 19).
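The salt-dependence analysis amounts to a straight-line fit of log K against log [K+]. The following sketch shows the arithmetic under stated assumptions: the titration points are synthetic values generated from the reported slope and 1 M intercept (they are not the measured data), the reported slope of 4.7 is taken to be the magnitude of a negative slope, and the function name is hypothetical.

```python
import math


def fit_salt_dependence(salt_molar, k_obs):
    """Least-squares line through (log10 [M+], log10 K).

    Returns (slope, K extrapolated to 1 M salt). In the treatment of
    Record et al. the negative of the slope estimates the number of
    cations released, and K at 1 M is the non-electrostatic constant.
    """
    xs = [math.log10(s) for s in salt_molar]
    ys = [math.log10(k) for k in k_obs]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean  # log10 K at log10 [M+] = 0, i.e. 1 M
    return slope, 10 ** intercept


# Synthetic, noise-free points over the 50-150 mM range (illustration only).
salts = [0.050, 0.075, 0.100, 0.125, 0.150]
ks = [0.6 * s ** -4.7 for s in salts]

slope, k_at_1_molar = fit_salt_dependence(salts, ks)
cations_released = -slope

if __name__ == "__main__":
    print(cations_released, k_at_1_molar)
```

Extrapolating to 1 M salt sets the logarithm of the salt concentration to zero, so the intercept of the line is the logarithm of the non-electrostatic binding constant.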
Since ACE requires the use of finite lattices (small oligonucleotides) to avoid sieving effects, the binding of clupeine Z, a highly-cooperative ligand with a large site size (8, 9), to a two-site oligonucleotide was studied. ACE data for clupeine Z demonstrated the cooperative nature of the binding (8, 9), and the binding site size correlated well with data obtained using spectroscopic methods (8, 9); however, the magnitudes of both the cooperativity factor and the binding constant were underestimated, an effect which has been predicted theoretically (20) and verified experimentally (21). The binding parameters of Xfin-31 and TPPI were unaffected due to the larger number of available binding sites on the oligonucleotides used.

In summary, the data presented here for peptides binding to different sequences and lengths of DNA demonstrate that affinity coelectrophoresis (ACE) (2) can be utilized to study non-specific, peptide-DNA binding. It is a simple gel electrophoresis technique which measures equilibrium binding of small ligands whose rapid dissociation kinetics may preclude the use of other DNA binding assays. Thus, it is a valuable supplement to other techniques used to measure DNA binding.
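The site sizes listed in Table I can be reproduced from equation 3 with a short calculation. The sketch below assumes two phosphates per base-pair of the end-filled duplex when computing ZD; that count is an assumption made here because it reproduces the tabulated values, and the function name is hypothetical.

```python
def site_size(n_base_pairs, z_peptide, r_inf, charge_per_phosphate=0.88):
    """Equation 3: n = N * Zp / (Z_D * R_inf).

    Z_D is taken as 0.88 times the number of phosphates, with the number
    of phosphates assumed to be 2 per base-pair (an assumption, see text).
    """
    z_dna = charge_per_phosphate * 2 * n_base_pairs
    return n_base_pairs * z_peptide / (z_dna * r_inf)


# (peptide, N in base-pairs, net peptide charge, R_inf) taken from Table I
# and the oligonucleotide lengths given in Materials and Methods.
TABLE_I_INPUTS = [
    ("Xfin-31(Zn2+)", 19, 5, 0.96),
    ("Xfin-31(S-S)", 19, 5, 0.84),
    ("TPPI", 37, 7, 1.00),
    ("clupeine Z", 51, 21, 0.50),
]

if __name__ == "__main__":
    for name, n_bp, z_p, r_inf in TABLE_I_INPUTS:
        print(name, round(site_size(n_bp, z_p, r_inf), 1))
```

Under this assumption N cancels, so the estimated site size depends only on the peptide charge and R∞, which is consistent with the observation that the estimate is insensitive to lattice length.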
References
1. Revzin, A. (1990). In "The Biology of Nonspecific DNA-Protein Interactions" (Revzin, A., ed.), pp. 1-31. CRC Press, Boca Raton.
2. Lim, W.A., Sauer, R.T., and Lander, A.D. (1991). Methods Enzymol. 208, 196-210.
3. Nedved, M.L., Gottlieb, P.A., and Moe, G.R., submitted.
4. Lee, M.S., Gippert, G.P., Soman, K.V., Case, D.A., and Wright, P.E. (1989). Science 245, 635-637.
5. Altaba, A.R., Perry-O'Keefe, H., and Melton, D.A. (1987). EMBO J. 6, 3065-3070.
6. Lee, M.S., Gottesfeld, J.M., and Wright, P.E. (1991). FEBS Lett. 279, 289-294.
7. Ando, T., Iwai, K., Ishii, S., Azegami, M., and Nakahara, C. (1962). Biochim. Biophys. Acta 56, 628-630.
8. Willmitzer, L., and Wagner, K.G. (1980). Biophys. Struct. Mech. 6, 95-110.
9. Watanabe, F., and Schwarz, G. (1983). J. Mol. Biol. 163, 485-498.
10. McGhee, J.D., and von Hippel, P.H. (1974). J. Mol. Biol. 86, 469-489; (1976). J. Mol. Biol. 103, 679.
11. Record, M.T., Jr., Lohman, T.M., and De Haseth, P. (1976). J. Mol. Biol. 107, 145-158.
12. Mirsky, A.E., and Anson, M.L. (1936). J. Gen. Physiol. 19, 451-459.
13. Suzuki, K., and Ando, T. (1968). J. Biochem. 63, 701-708.
14. Gidoni, D., Dynan, W.S., and Tjian, R. (1984). Nature 312, 409-413.
15. Gottlieb, P.A., Wu, S., Zhang, X., Tecklenburg, M., Kuempel, P., and Hill, T.M. (1992). J. Biol. Chem. 267, 7434-7443.
16. Zhang, X., and Gottlieb, P.A. (1993). Biochemistry 32, 11374-11384.
17. Serwer, P., and Hayes, S.J. (1986). Anal. Biochem. 158, 72-78.
18. Pavletich, N.P., and Pabo, C.O. (1991). Science 252, 809-817.
19. Pavletich, N.P., and Pabo, C.O. (1993). Science 261, 1701-1707.
20. Epstein, I.R. (1978). Biophys. Chem. 8, 327-339.
21. Kowalczykowski, S.C., Paul, L.S., Lonberg, N., Newport, J.W., McSwiggen, J.A., and von Hippel, P.H. (1986). Biochemistry 25, 1226-1240.
22. Draper, D.E., and von Hippel, P.H. (1978). J. Mol. Biol. 122, 339-359.
Investigating Calmodulin-Target Sequence Interactions Using Mutant Proteins and Synthetic Target Peptides
Wendy A. Findlay, Stephen R. Martin, and Peter M. Bayley
Division of Physical Biochemistry, National Institute for Medical Research, Mill Hill, London NW7 1AA, England, U.K.
I. Introduction

Protein-protein interactions are important in the function and regulation of many biological pathways. Associations between proteins are often characterized by strong and extremely specific noncovalent interactions between complementary surfaces. One way of looking at the details of these interactions is to use peptides corresponding to the "interaction region" of one of the proteins to determine the minimum sequence needed for the interaction as well as the effect of changing individual residues. We report here on the use of this approach to study the interaction of calmodulin with a target sequence from skeletal muscle myosin light chain kinase (sk-MLCK). The strategy is to use a number of sequence variants of the peptide and site directed mutants of calmodulin.

Calmodulin is a small ubiquitous calcium binding protein which regulates a variety of enzymes in several different metabolic pathways. Calmodulin interacts with many of its target proteins with very high affinity (Kd ~ nM), usually in a calcium specific manner. It also binds target sequence peptides derived from the calmodulin binding regions of many of these proteins with affinities close to those for the intact enzymes. Many target sequences are predicted to form basic amphipathic helices and this has been proposed as a common structural motif for calmodulin binding (1). In the solution structure of a complex of calmodulin with a 26-residue target peptide derived from the sequence of sk-MLCK, the two domains of
calmodulin surround the M13 peptide, which has adopted an α-helical conformation (2). The peptide lies in a hydrophobic channel and the sidechains of Trp4 and Phe17 of the peptide appear to anchor the peptide to the two domains by fitting into the hydrophobic "pockets".

Maune et al. (3) have produced a set of single site mutants of Drosophila melanogaster calmodulin, each of which has the conserved glutamic acid residue in the 12th position of one of the four calcium binding loops mutated to either glutamine or lysine. Mutations to sites 2 or 4 effectively eliminate calcium binding to the mutated site and cause structural changes in the protein (3, 4) as well as decreasing the ability of calmodulin to activate target enzymes (5).

The availability of these mutant calmodulins and the choice of synthetic peptides derived from the sk-MLCK target sequence allows the manipulation of both components involved in the interaction. The 18-residue sk-MLCK target sequence has 3 aromatic residues: Trp4, Phe8, and Phe17. In this work, we use peptides with tryptophan in either position 4 (WFF peptide) or position 17 (FFW peptide). Since the calmodulin itself has no tryptophan residues, we can use optical spectroscopy to monitor the interaction of a tryptophan in a specific position in the target sequence with an individual domain of calmodulin. We have studied binding of the two peptides to wildtype calmodulin and to the site 2 and site 4 mutants (B2Q, B2K, B4Q, and B4K) to see how mutations which effectively eliminate calcium binding to a particular site affect the interaction of the protein with the two target sequence analogues.
II. Materials and Methods

Proteins and peptides - Drosophila melanogaster calmodulin and the various mutants expressed in E. coli were purified essentially as previously described (3). Peptides were synthesized on an Applied Biosystems 430A peptide synthesizer and purified by reverse phase HPLC on a C18 column (WFF peptide) or a C8 column (FFW peptide) and were provided with free carboxy and amino termini. All concentrations were determined spectrophotometrically using a calculated extinction coefficient of 5560 M⁻¹ cm⁻¹ at 280 nm for the peptides (2 Phe and 1 Trp) and published extinction coefficients for wildtype and the four mutant calmodulins (3).
Fluorescence and affinity measurements - Peptide in 25 mM Tris, 100 mM KCl and 1 mM CaCl2 at pH 7.5 and 30 °C was titrated with a stock solution of calmodulin in UV transmitting plastic cuvettes since the peptides appear to bind to glass. Fluorescence titration spectra were recorded using a SPEX FluoroMax fluorescence spectrometer with excitation at 280 nm and emission scanned from 310 to 390 nm. The value of fluorescence intensity at 330 nm was plotted as a function of calmodulin concentration and fitted using standard non-linear least squares methods (6) to obtain optimal values of the dissociation constant (Kd) and the maximum fluorescence enhancement (F/F0). The detection limit under our experimental conditions was 50 nM peptide and all quoted Kd values are the average of at least 3 independent determinations.

Circular dichroism spectra - Spectra were recorded on a Jasco J-600 spectropolarimeter at room temperature. Far UV CD spectra (190 to 260 nm) of 7.5 µM peptide:calmodulin complex in 25 mM Tris, 100 mM KCl and 1 mM CaCl2 were measured in a 0.1 cm path length cuvette. Near UV CD spectra (250 to 340 nm) of 20 µM peptide:calmodulin complex in the same buffer were measured in a 1 cm path length cuvette.
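The kind of analysis described for the fluorescence titrations can be sketched as follows. The model assumed here is simple 1:1 binding with the exact (quadratic) solution for the complex concentration, and a coarse grid search over Kd stands in for the non-linear least squares routine of reference 6; neither choice is specified in the text. All function names and the titration points are hypothetical, and the data are synthetic.

```python
import math


def fraction_bound(p_total, cam_total, kd):
    """Fraction of peptide in the 1:1 complex (concentrations in the same units)."""
    b = p_total + cam_total + kd
    complex_conc = (b - math.sqrt(b * b - 4.0 * p_total * cam_total)) / 2.0
    return complex_conc / p_total


def relative_fluorescence(p_total, cam_total, kd, f_max):
    """F/F0: 1 for free peptide, rising linearly with fraction bound to f_max."""
    return 1.0 + (f_max - 1.0) * fraction_bound(p_total, cam_total, kd)


def fit_kd(p_total, cam_points, f_points, kd_grid):
    """Grid search over Kd; for each trial Kd the best f_max is found by
    linear least squares, and the pair with the smallest squared error wins."""
    best = None
    for kd in kd_grid:
        fb = [fraction_bound(p_total, c, kd) for c in cam_points]
        denom = sum(x * x for x in fb)
        f_max = 1.0 + sum(x * (f - 1.0) for x, f in zip(fb, f_points)) / denom
        sse = sum((1.0 + (f_max - 1.0) * x - f) ** 2 for x, f in zip(fb, f_points))
        if best is None or sse < best[0]:
            best = (sse, kd, f_max)
    return best[1], best[2]


# Synthetic, noise-free titration of 50 nM peptide (illustration only).
peptide_nm = 50.0
cam_nm = [10.0 * i for i in range(1, 16)]            # 10 to 150 nM calmodulin
signal = [relative_fluorescence(peptide_nm, c, 1.6, 3.0) for c in cam_nm]
kd_grid = [0.1 * 1.1 ** i for i in range(80)]        # log-spaced trial Kd values, nM

kd_fit, f_max_fit = fit_kd(peptide_nm, cam_nm, signal, kd_grid)

if __name__ == "__main__":
    print(kd_fit, f_max_fit)
```

When Kd is far below the peptide concentration, the titration curve is nearly stoichiometric and changes little with Kd, which is one reason very tight binding may be reportable only as an upper limit.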
III. Results

We have used two synthetic 18-residue peptides related to the target sequence of sk-MLCK to study their interaction with calmodulin in the presence of calcium. The WFF peptide (KKRWKKNFIAVSAANRFK) corresponds to residues 577 to 594 of rabbit sk-MLCK. In the FFW peptide (KKRFKKNFIAVSAANRWK) the W4 and F17 residues have been interchanged. Upon binding to the protein the tryptophan fluorescence emission maximum for each peptide is shifted from 356 nm to 334 nm as shown in Fig. 1A, with an enhancement in fluorescence intensity at 330 nm of about 2.4-fold for the WFF peptide and 3-fold for the FFW peptide (Table I). These results indicate that the Trp residue is in a hydrophobic environment when either peptide binds to the protein. By monitoring fluorescence intensity at 330 nm while titrating either peptide with calmodulin, we determined the affinities of calmodulin for the FFW peptide (Kd = 1.6 nM) and the WFF peptide (Kd < 0.2 nM). The affinity of calmodulin for the native sequence (WFF peptide) is at least an order of magnitude higher than that for the modified sequence (FFW peptide).
We used near UV CD spectroscopy to obtain information about the steric environment of the Trp sidechain in each of the protein:peptide complexes. Figure 1B shows the near-UV CD spectra of wild-type Drosophila calmodulin in the presence of calcium alone and in (1:1) complexes with the WFF and FFW peptides. Free calmodulin shows prominent bands at 262 and 268 nm, which derive from the nine Phe residues. The signal at longer wavelengths (λ > 275 nm) derives from the single Tyr located at position 138 in the C-terminal domain. The free peptides show negligible circular dichroism in this wavelength range. The spectra of the complexes of calmodulin with the WFF or FFW peptide show clear evidence of a major contribution from the Trp residue in the peptide. Tryptophan model compounds (7) generally show two sharp bands (from ¹Lb transitions), one at 289 - 294 nm and the second some seven nanometres to shorter wavelength, which generally has the same sign. Bands corresponding to the ¹La transitions usually occur at shorter wavelengths (265-275 nm) and show little fine structure. The Δε values for these bands are expected to lie in the range ± 3 M⁻¹ cm⁻¹ (7). The changes in the near UV CD spectrum upon binding of peptide conform to the general pattern described for Trp CD (indicating that there is little contribution from the two Phe residues in the peptides), but the two spectra differ significantly in both magnitude and sign. The large negative intensity of the Trp of the bound FFW peptide clearly indicates that the indole chromophore is strongly immobilized in an asymmetric environment. Based on the solution structure of the CaM:M13
Figure 1 - A) Fluorescence spectra of WFF and FFW peptides, free and bound to wildtype calmodulin, [peptide] = 200 nM, [CaM] = 200 nM in 25 mM Tris (pH 7.5), 100 mM KCl, and 1 mM CaCl2. B) Near UV CD spectra of 20 µM wildtype calmodulin alone and in (1:1) complex with WFF and FFW peptide (Δε is per mole calmodulin). Both panels are plotted against wavelength (nm).
peptide complex (2) the Trp sidechain of the bound WFF peptide is also expected to be strongly immobilized. The weaker CD signal of the WFF Trp is diagnostic of lower asymmetry but not necessarily greater mobility. The asymmetry derives from two sources: (1) electronic interaction with other chromophores (e.g. Phe) and polarisable groups (e.g. sidechains) in the closely packed interior of the protein, and (2) electronic interaction with neighbouring peptide groups which are arrayed asymmetrically (owing to the L-chiral configuration of natural amino acids). The CD properties of a chromophoric side chain thus reflect both secondary and tertiary structure. Near UV CD is a sensitive indicator of relatively small changes in protein conformation in the vicinity of the aromatic group, as well as an indicator of different chiral environments within a protein. The very different near UV CD spectra indicate distinct chiroptical environments of the Trp residue in the two peptide complexes and are consistent with interaction of Trp4 (in peptide WFF) and Trp17 (in peptide FFW) with different domains of the protein. This would suggest that both target peptides are binding to calmodulin in the same orientation, i.e. with residue 4 interacting with the C-domain of calmodulin and residue 17 interacting with the N-domain, as was found for the homologous 26-residue M13 peptide bound to calmodulin (2).

The affinities of the two peptides for four calcium binding site mutants of calmodulin also provide important information. The B2K and B2Q calmodulins have Glu67 (in binding site 2) mutated to Lys and Gln respectively, and B4K and B4Q calmodulins have Glu140 (in binding site 4) mutated to Lys and Gln respectively. Each of these mutations effectively eliminates calcium binding to the altered site (3). As shown in Table I, the affinity of each of the mutant proteins for either peptide is at least 10-fold lower than that of wildtype calmodulin.
The B2K mutant has the highest affinity for both peptides, suggesting that it is the least altered in function. The B4K mutant has the lowest affinity for both peptides, more than 200-fold lower than wildtype calmodulin, suggesting that it is the most altered in function. Although the B2Q and B4Q mutants both have about 100-fold lower affinity for the WFF peptide than wildtype CaM, there is a 10-fold difference in their affinities for the FFW peptide. The fluorescence enhancement upon binding of the FFW peptide to the B2Q mutant is also much lower than that for any of the other proteins. These results suggest that the E67Q (B2Q) but not the E67K (B2K) mutation in site 2 of the N-domain has significantly altered the interaction with the sidechain of the residue in position 17 of the peptide. It is interesting to note that two different replacements for a single residue in the protein result in significantly different affinities.
Table I - Dissociation constants (Kd) of wildtype and four mutant calmodulins for WFF and FFW peptides and fluorescence enhancement (F/F0, at λ = 330 nm) upon complex formation

CaM     WFF peptide                 FFW peptide
        Kd (nM)         F/F0        Kd (nM)         F/F0
WT      <0.22 (1)*      2.4         1.6 (1)         3.0
B2K     5.2 (24)        2.7         19 (12)         3.1
B2Q     21 (95)         2.8         320 (200)       2.0
B4Q     21 (95)         2.5         38 (24)         2.9
B4K     48 (218)        2.2         340 (213)       2.4

* Values in brackets are relative to the wildtype Kd.
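The bracketed values in Table I are simply each Kd divided by the wildtype Kd for the same peptide. A minimal Python sketch of that arithmetic follows; the dictionary layout and the function name `relative_kd` are illustrative choices, not part of the original analysis. Because the wildtype Kd for WFF is only an upper limit (<0.22 nM), the WFF ratios computed this way are lower bounds.

```python
# Dissociation constants (nM) transcribed from Table I.
# The wildtype WFF value is an upper limit (< 0.22 nM), so the WFF
# ratios computed below are lower bounds on the true fold change.
KD_NM = {
    "WT":  {"WFF": 0.22, "FFW": 1.6},
    "B2K": {"WFF": 5.2,  "FFW": 19.0},
    "B2Q": {"WFF": 21.0, "FFW": 320.0},
    "B4Q": {"WFF": 21.0, "FFW": 38.0},
    "B4K": {"WFF": 48.0, "FFW": 340.0},
}


def relative_kd(table, reference="WT"):
    """Return each Kd divided by the reference protein's Kd for the same peptide."""
    ref = table[reference]
    return {
        protein: {peptide: kd / ref[peptide] for peptide, kd in values.items()}
        for protein, values in table.items()
    }


if __name__ == "__main__":
    for protein, ratios in relative_kd(KD_NM).items():
        print(protein, {pep: round(r, 1) for pep, r in ratios.items()})
```

Comparing the two columns of ratios for a given mutant (for example B2Q versus B4Q) makes the peptide-specific effect of the site 2 mutation discussed above easy to see.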
Of all the mutant calmodulins studied, the B4K mutant has the most altered α-helical secondary structure, based on the CD signal at 222 nm (Fig. 2A). Values obtained from current samples of the B2K and B2Q mutants show somewhat weaker far UV CD than previously reported (4), although the absolute values for a given mutant are critically dependent upon determination of protein concentration. When the WFF peptide is added, the CD (222 nm) increases for all the proteins, as shown in Fig. 2B. For the wildtype protein this increase in intensity is interpreted as deriving mainly from the bound peptide adopting an α-helical conformation (2).
[Figure 2 plots: panels A and B; horizontal axes: wavelength (nm).]
Figure 2 - Far UV CD spectra of wildtype and four mutant calmodulins A) alone and B) in (1:1) complex with WFF peptide (Δε is per mole calmodulin).
Calmodulin-Target Sequence Interactions
As shown in Fig. 2B, the far UV CD spectra of the mutant calmodulins in complex with the WFF peptide more closely resemble that of the wildtype calmodulin:WFF peptide complex than do the corresponding spectra in the absence of peptide (Fig. 2A). Quantitation of this effect shows that, in addition to induction of α-helix in the bound peptide, the mutant calmodulins have recovered at least some of the "native" helical structure.
IV. Conclusions
The study of the interaction of peptides with proteins by optical spectroscopy is greatly facilitated if the peptide contains tryptophan and the protein does not, as in the case of the calmodulin/peptide systems studied here. Even for proteins which contain one or more tryptophans this approach can still be used, provided that the spectral changes associated with binding of the peptide are sufficiently large. The use of synthetic peptides or site specific mutagenesis allows almost any residue to be replaced by tryptophan. The most conservative substitutions on the basis of size and hydrophobicity would be Phe -> Trp or Tyr -> Trp. In general, binding of the peptide to a protein will result in a shift in the tryptophan fluorescence emission maximum from approximately 355 nm (free peptide) to shorter wavelength as the tryptophan enters a more hydrophobic environment, and an overall increase in the fluorescence emission intensity. The extent of the wavelength shift gives some information about the environment of the tryptophan in the complex. More importantly, the fluorescence enhancement on binding of the peptide may be used to determine the dissociation constant (Kd) at the low protein/peptide concentrations required to study high affinity interactions. The principal near UV circular dichroism bands of tryptophan are generally distinct (at longer wavelength) from those of tyrosine and phenylalanine and also differ in characteristic band shape. 
Since near UV CD bands can be either positive or negative, the near UV CD spectrum can provide more information about the properties of aromatic residues than absorption spectroscopy. Free peptides (< 20 residues) generally have negligible near UV CD because of the conformational mobility of the chromophoric aromatic side chains. However, as shown in this work, the immobilization of an aromatic group of a peptide when bound to a protein means that the near UV CD spectrum may be used to distinguish different chiral environments for aromatic residues such as tryptophan introduced in
different positions in the peptide. Far UV circular dichroism spectra contain information about secondary structure content in proteins and peptides. Small peptides are almost invariably unstructured in aqueous solution, but can adopt regular secondary structure upon binding to a protein. In general, the secondary structure of a protein is unlikely to change dramatically upon binding to a peptide. However, as shown in this work, the binding of a target peptide to a mutated protein with an altered structure (compared with the wildtype) may partially restore the "native" structure of the protein. The use of site specific mutagenesis of a protein in identifying individual residues involved in enzyme function or interaction with ligands is well established. In the case of calmodulin, whose mode of action is dependent on its calcium sensitive interaction with target proteins, a mutant protein may retain the ability to activate its target enzymes (5). This may mask the fact that a structural change in the mutant calmodulin has actually been compensated for by the strength of the interaction with the target sequence. Such a mechanism can be identified and characterized by quantitative examination of the interaction of the mutant protein with peptides related to the calmodulin binding sequence of the target enzyme, as outlined here.
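As an illustration of how a dissociation constant can be related to the fluorescence enhancement mentioned above, the following Python sketch evaluates the standard 1:1 binding model with allowance for depletion of the free species, which matters when concentrations are comparable to Kd. It is not the fitting procedure used in this work, which is not described in this section; the function names and the example concentrations are hypothetical.

```python
import math


def bound_complex(p_total, l_total, kd):
    """Concentration of 1:1 complex from the exact quadratic solution.

    p_total: total protein, l_total: total peptide, kd: dissociation
    constant, all in the same concentration units (e.g. nM).
    """
    s = p_total + l_total + kd
    return (s - math.sqrt(s * s - 4.0 * p_total * l_total)) / 2.0


def fluorescence_ratio(p_total, l_total, kd, f_bound_ratio):
    """Predicted F/F0 for a Trp-containing peptide titrated with protein.

    f_bound_ratio is F/F0 of the fully bound peptide (the plateau value).
    """
    fraction_bound = bound_complex(p_total, l_total, kd) / l_total
    return 1.0 + (f_bound_ratio - 1.0) * fraction_bound


if __name__ == "__main__":
    # Hypothetical titration: 50 nM peptide, Kd = 1.6 nM, plateau F/F0 = 3.0.
    for protein_nm in (0.0, 10.0, 25.0, 50.0, 100.0):
        print(protein_nm, round(fluorescence_ratio(protein_nm, 50.0, 1.6, 3.0), 3))
```

In practice Kd and the plateau enhancement would be obtained by least-squares fitting of such a curve to the measured titration; the sketch only shows the forward model.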
Acknowledgments
We thank Dr. K. Beckingham and colleagues (Rice University, Texas, USA) for providing the mutant calmodulins and Peter Fletcher (N.I.M.R.) for synthesis and purification of the peptides.
References
1. O'Neil, K.T. and DeGrado, W.F. (1990) T.I.B.S. 15, 59.
2. Ikura, M., Clore, G.M., Gronenborn, A.M., Zhu, G., Klee, C.B., and Bax, A. (1992) Science 256, 632.
3. Maune, J.F., Klee, C.B., and Beckingham, K. (1992) J. Biol. Chem. 267, 5286.
4. Maune, J.F., Beckingham, K., Martin, S.R., and Bayley, P.M. (1992) Biochemistry 31, 7779.
5. Gao, Z.H., Krebs, J., VanBerkum, M.F.A., Tang, W.-J., Maune, J.F., Means, A.R., Stull, J.T., and Beckingham, K. (1993) J. Biol. Chem. 268, 20096.
6. Bevington, P.R. (1969) in "Data Reduction and Error Analysis for the Physical Sciences", McGraw-Hill, New York, USA.
7. Strickland, E.H. (1974) Crit. Rev. Biochem. 2, 113.
Interactions of Bacterial Cell-Surface Proteins with Antibodies: a Versatile Set of Protein-Protein Interactions
Gordon C.K. Roberts, Lu-Yun Lian, Igor L. Barsukov, Jeremy P. Derrick,† Koichi Kato‡ and Yoji Arata§
Biological NMR Centre and Department of Biochemistry, University of Leicester, Leicester LE1 9HN, UK, †Department of Biochemistry and Applied Molecular Biology, UMIST, Manchester, UK and ‡Faculty of Pharmaceutical Sciences, University of Tokyo, Hongo, Bunkyo-ku, Tokyo 113, Japan
I. Introduction
Some species of pathogenic bacteria, notably Streptococci and Staphylococci, have proteins on their surface which bind immunoglobulins (1). Protein A from Staph. aureus and protein G from species of Streptococci, which are widely used as immunological tools, are the most extensively studied of these antibody-binding proteins. All these proteins are large, multi-domain molecules, and in each case the antibody-binding activity has been shown to reside in small domains of 50-60 amino acid residues, often present in multiple copies in the sequence. Thus, protein A contains five highly homologous domains, each of about 60 amino acid residues (2,3), which bind to the Fc portion of immunoglobulin G (IgG), while protein G has three 55-residue IgG-binding domains (4,5). Protein G has a broader specificity than protein A for IgGs from different sources, and its IgG-binding domains are able to bind to both the Fab and the Fc portions of the antibody molecule, although their relative affinities for the two fragments vary with the class and species of IgG (6,7). The binding of protein G to mouse IgG1 is predominantly due to the interaction with the Fab region (Kd 2 μM; J.P. Derrick, unpublished), while binding to human IgG is primarily to the Fc region (8). The solution structure of the IgG-binding B domain of protein A has been determined by nmr (9,10), and shown to be a three-helix bundle. A number of solution and crystal structures have been reported for several IgG-binding domains from protein G of different strains of Streptococcus (7,11-17); the basic fold is the
same in each case, consisting of an α-helix packed against a 4-stranded antiparallel-parallel-antiparallel β-sheet. The binding of protein G to Fc fragments is competitive with respect to protein A (18,19), notwithstanding the fact that they lack sequence or structural similarity. The binding site on Fc for domain B of protein A has been identified by crystallography (20) and by NMR (21); it binds to the interface region between the CH2 and CH3 domains of Fc, the interaction involving two α-helices of the protein A domain. The crystal structure of domain II of protein G bound to an Fab fragment of mouse IgG1 (7), on the other hand, indicates that the complex involves an "edge-to-edge" interaction between the β-sheets of the protein G domain and the CH1 domain. We have now studied the interaction between domain II of protein G and both Fab and Fc in solution by heteronuclear nmr methods, allowing a direct comparison both of the binding of protein G and protein A to Fc, and of the binding of protein G to Fab and Fc.
II. Methods
Preparation of isotopically labelled protein. Domain II from protein G from Streptococcus strain G148 was expressed and purified as described previously (11,22). To prepare the [¹³C,¹⁵N]-labelled domain, the cells were grown on a defined medium containing 3 g/l of a mixture of [99% ¹³C]-glucose and [99% ¹⁵N]-amino-acids (EMBL, Heidelberg), supplemented with 1 g/l ¹⁵NH4Cl (22). IgG and its Fc fragments selectively labelled with ¹³C at the carbonyl carbon of particular amino-acids were prepared as described previously (21,23,24).
NMR spectroscopy. Samples of [¹³C,¹⁵N]-labelled protein G, complexed with either Fab or Fc, contained 0.7 mM protein G domain II in 50 mM sodium phosphate buffer, pH 6.5 (90% H2O, 10% ²H2O). Spectra were obtained at 600 MHz for ¹H at sample temperatures of 25 and 37°C. The 2D heteronuclear correlation experiments were carried out using either the HSQC or the HMQC sequence. 
The water signal was suppressed using 1 ms gradient pulses of maximum gradient strength 150 G/cm for the HSQC experiments, and low rf power irradiation for the HMQC experiment. In all these experiments, a GARP sequence was used to decouple ¹³C or ¹⁵N from the protons. Samples of selectively ¹³C-labelled Fc fragments contained 0.2-0.4 mM protein in 200 mM NaCl, 3 mM NaN3, 5 mM phosphate, pH 7.3, in a volume of 2 ml. ¹³C spectra were recorded at 100 MHz, at a sample temperature of 30°C, using a WALTZ-16 decoupling sequence.
Bacterial Cell-Surface Interactions with Antibodies
III. Results
The molecular mass of the complex between a domain of protein G and an Fab or an Fc fragment of IgG is approximately 58 kDa, so that a complete structure determination by nmr is not feasible. However, since the structure of each component of the complex is known, it is possible to obtain a medium-resolution model of the complex if the residues in each component involved in the interaction can be identified. At the simplest level this can be done by using appropriately isotope-labelled proteins to identify residues whose chemical shifts are affected by complex formation.
A. The effects of binding to antibody fragments on the nmr spectrum of protein G
Complete ¹H, ¹⁵N and almost complete ¹³C resonance assignments of domain II of protein G are available (12; L.-Y. Lian, I. L. Barsukov, J. P. Derrick and G. C. K. Roberts, unpublished work), and ¹H-¹⁵N and ¹H-¹³C heteronuclear correlation spectra thus provide a convenient means of identifying resonances affected by the binding of protein G to IgG fragments. Figure 1 compares the ¹H-¹⁵N correlation spectra of protein G domain II, alone and in its complex with an Fab fragment of mouse IgG1 (22). The resonances of rather more than a third of the residues undergo significant chemical shift changes (greater than the linewidth of the crosspeak) or are so broadened as to be undetectable. This marked line-broadening is specific to these residues, as distinct from the smaller increase in linewidth seen for all resonances due to the increase in correlation time on complex formation. This specific line-broadening arises from the fact that, as monitored by these specific resonances, the exchange between the free and bound forms of protein G domain II is at an intermediate rate on the nmr timescale. These specific changes may reflect direct protein-protein contacts and/or changes in conformation on complex formation, although the crystal structure of this complex (7) suggests that any conformational change is slight. 
The unaffected resonances, on the other hand, can be unambiguously assigned and must arise from residues that are not involved in intermolecular contacts. The unaffected backbone amide resonances in the ¹H-¹⁵N correlation spectrum are those of residues 1-11, 13, 23, 25-40, 44, 46, 47, 49-57 and 59-64. The ¹⁵N-¹H spectra give information primarily about the backbone of the protein, while ¹H-¹³C correlation spectra give more information on side-chain contacts, and changes are also observed, for example, in the methyl region of the ¹H-¹³C spectrum (22). The changes in the resonances of protein G on complex formation allow us to identify the regions of the molecule that are involved in Fab binding as the turn between strands 1 and 2 of the β-sheet, the first two-thirds of β-strand 2, and the loop following the helix, but not the helix itself. As discussed previously (22), the changes observed in the nmr spectrum on complex formation in solution
are entirely consistent with the crystal structure; it is clear that the crystal structure does correspond to the structure of the complex in solution. Similar experiments on the interaction of protein G domains with Fc fragments, however, show that in this case a different set of residues is affected by complex formation (25,26). The regions of the protein G domain which form the "contact surface" in the complexes with the Fab and Fc fragments are compared schematically in Figure 2. In the complex with Fab, the second strand of the β-sheet plays a major role, together with the loop at the C-terminal end of the helix, while in the complex with Fc, the regions most affected are the helix and the third strand of the β-sheet, but not the intervening loop. The IgG-binding domains of protein G thus interact with the Fab and Fc regions of the antibody in quite different ways.
B. Identification of residues of Fc involved in binding protein G
The Fc fragment is clearly too large a molecule for a complete analysis of its nmr spectrum to be possible, and to locate the binding site for protein G on Fc it is necessary to label Fc selectively with individual [¹³C]-amino acids and to use their ¹³C resonances as "probes". Here, we have used the resonances of the carbonyl carbons of His, Met, Trp and Tyr residues. The assignments of these resonances to individual residues, most of which were made by the ¹³C-¹⁵N double labeling method (27), have been reported previously (21,23,24,28). In this way, the behaviour of a total of 22 residues in the Fc molecule could be monitored, and of these, only Met-252, His-433, His-435 and His-436 were affected by the binding of domain II of protein G (26). 
These residues lie in the 'groove' between the CH2 and CH3 domains of Fc, indicating that this region is primarily responsible for the binding of protein G and, since all these residues are also affected by the binding of domain B of protein A, confirming that these two proteins bind to the same region of the Fc molecule. There are, however, differences between the effects of protein G and protein A which suggest that the former interacts more with residues from the CH3 domain than with those from the CH2 domain (26; see below).
C. The structures of the protein G - antibody complexes
As noted above, a crystal structure is available for the complex between domain III of protein G and the Fab fragment of mouse IgG1 (7), and the solution data from nmr are entirely consistent with this. In this complex, the β-sheets of the two molecules align in an anti-parallel manner, with intermolecular hydrogen-bonds between the second β-strand of protein G and the last strand of the CH1 domain of Fab.
[Figure 1 spectrum: cross-peaks labelled by residue; ω2 axis (ppm), 10.0 to 7.0.]
Figure 1. ¹⁵N-¹H correlation spectra of [¹³C,¹⁵N]-labelled protein G domain II in the presence and absence of Fab. Cross-peaks from domain II alone are shown in grey, and those from the complex in unshaded contours. Due to space restrictions, resonance assignments are indicated for only some of the assigned residues.
Figure 2. Comparison of the residues of the IgG-binding domains of protein G affected by binding to (a) Fab and (b) Fc fragments of IgG. Those residues whose ¹H/¹⁵N/¹³C chemical shifts and/or linewidths are altered on formation of the respective complexes are coloured in black. The domain is oriented with its N-terminus at the top.
As yet, detailed crystallographic information on the protein G - Fc complex is not available. However, the nmr data which identify residues involved in the interaction between domain II of protein G and Fc can be combined with the knowledge of the structures of the two molecules to generate a model of the complex (26). Briefly, this was done as follows. An initial structure was obtained by manually positioning the protein G domain and Fc so that the affected residues on each were facing each other. Six starting structures were then generated by changing the orientation of the protein G domain relative to the Fc molecule in 60° steps, while keeping its position constant. Each of these structures was then subjected to Monte Carlo energy minimisation. The nmr information was introduced by using a pseudo-potential which constrained those residues in either partner seen to be affected by complex formation to lie close to some (unspecified) residue(s) of the other partner. This led to a series of minimised structures, one of which, shown in Figure 3, had significantly lower energy than any of the others. This procedure depends on the assumption that there is no substantial change in the conformation of either partner on formation of the complex (no marked changes are seen in the crystal on formation of the protein G - Fab complex; 7). It is therefore only an approximate, medium-resolution model, but one which can be tested by, for example, site-directed mutagenesis, and these experiments are in progress.
IV. Conclusions
It appears that the interactions of the bacterial antibody-binding proteins with their 'target' immunoglobulins involve a very versatile set of protein-protein interactions. First, the IgG-binding domains of protein A and protein G have different structures, but bind competitively to the Fc fragment. The binding sites for both protein A and protein G lie between the CH2 and CH3 domains of Fc, overlapping extensively. 
This interaction involves two α-helices of protein A and the α-helix and one strand of the β-sheet of protein G. These represent two structural solutions to the recognition of the same region of a protein surface (29). Secondly, protein G is able to bind to Fab as well as to Fc. Although the constant domains in Fab and Fc are of course structurally related, protein G binds quite differently to them, binding edge-on to the CH1 domain of Fab but in a cleft between CH2 and CH3 of Fc. It is particularly notable that such a small domain (only 55 residues) is able to recognise specifically two quite different protein surfaces. It does this by employing an almost completely different set of residues on its surface.
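The generation of starting structures for the protein G - Fc model (six orientations of the protein G domain in 60° steps, with its position held constant) can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify the rotation axis, so the sketch assumes rotation about an axis parallel to z through the centroid of the domain, and the function names and coordinates are hypothetical. The Monte Carlo minimisation and the nmr-derived pseudo-potential are not reproduced here.

```python
import math


def centroid(coords):
    """Mean position of a list of (x, y, z) points."""
    n = float(len(coords))
    return tuple(sum(p[i] for p in coords) / n for i in range(3))


def rotate_about_z(coords, angle_deg, centre):
    """Rotate points about an axis parallel to z passing through `centre`."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    cx, cy, _ = centre
    rotated = []
    for x, y, z in coords:
        dx, dy = x - cx, y - cy
        rotated.append((cx + c * dx - s * dy, cy + s * dx + c * dy, z))
    return rotated


def starting_orientations(coords, step_deg=60):
    """Orientations at 0, step, 2*step, ... degrees; the centroid stays fixed.

    Assumption: the rotation axis (here z) is the approximate normal to the
    interface; the original text does not state which axis was used.
    """
    centre = centroid(coords)
    return [rotate_about_z(coords, k * step_deg, centre)
            for k in range(360 // step_deg)]


if __name__ == "__main__":
    # Hypothetical coordinates standing in for the protein G domain.
    domain = [(1.0, 0.0, 0.0), (0.0, 2.0, 1.0), (-1.0, -1.0, 3.0)]
    for k, structure in enumerate(starting_orientations(domain)):
        print(k * 60, [tuple(round(v, 2) for v in atom) for atom in structure])
```

Each orientation would then serve as an independent starting point for minimisation, so that the final model does not depend on the initial manual placement.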
Figure 3. A model for the complex of domain II of protein G with the Fc fragment of IgG, derived from the nmr data as described in the text. Residues of domain II and Fc whose chemical shifts and/or linewidths are affected on complexation are coloured in black.
References
1. Boyle, M. D. P. (1990) Ed: Bacterial Immunoglobulin Binding Proteins, Academic Press, San Diego.
2. Langone, J. J. (1982) Adv. Immunol. 32, 157-252.
3. Moks, T., Abrahmsen, L., Nilsson, B., Hellman, U., Sjöquist, J., & Uhlen, M. (1986) Eur. J. Biochem. 156, 637-643.
4. Guss, B., Eliasson, M., Olsson, A., Uhlen, M., Frej, A.-K., Jörnvall, H., Flock, J.I., & Lindberg, M. (1986) EMBO J. 5, 1567-1575.
5. Olsson, A., Eliasson, M., Guss, B., Nilsson, B., Hellman, U., Lindberg, M., & Uhlen, M. (1987) Eur. J. Biochem. 168, 319-324.
6. Eliasson, M., Anderson, R., Nygren, P.-A., & Uhlen, M. (1991) Mol. Immunol. 28, 1055-1061.
7. Derrick, J. P., & Wigley, D. B. (1992) Nature 359, 752-754.
8. Sjöbring, U., Björck, L., & Kastern, W. (1991) J. Biol. Chem. 266, 399-405.
9. Torigoe, H., Shimada, I., Saito, A., Sato, M., & Arata, Y. (1990) Biochemistry 29, 8787-8793.
10. Gouda, H., Torigoe, H., Saito, A., Sato, M., Arata, Y., & Shimada, I. (1992) Biochemistry 31, 9665-9672.
11. Lian, L.-Y., Yang, J.-C., Derrick, J. P., Sutcliffe, M. J., Roberts, G. C. K., Murphy, J. P., Goward, C. R., & Atkinson, T. (1991) Biochemistry 30, 5335-5340.
12. Lian, L.-Y., Derrick, J. P., Sutcliffe, M. J., Yang, J. C., & Roberts, G. C. K. (1992) J. Mol. Biol. 228, 1219-1234.
13. Derrick, J. P., & Wigley, D. B. (1994) in press.
14. Gallagher, T., Alexander, P., Bryan, P., & Gilliland, G. L. (1994) Biochemistry 33, 4721-4729.
15. Gronenborn, A. M., Filpula, D.R., Essig, N.Z., Achari, A., Whitlow, M., Wingfield, P.T., & Clore, G. M. (1991) Science 253, 657-661.
16. Orban, J., Alexander, P., & Bryan, P. (1992) Biochemistry 31, 3604-3611.
17. Achari, A., Hale, S. P., Howard, A. J., Clore, G. M., Gronenborn, A. M., Hardman, K. D., & Whitlow, M. (1992) Biochemistry 31, 10449-10457.
18. Stone, G. C., Sjöbring, U., Björck, L., Sjöquist, J., Barber, C. V., & Nardella, F. A. (1989) J. Immunol. 143, 565-570.
19. Frick, I.-M., Wikström, M., Forsén, S., Drakenberg, T., Gomi, H., Sjöbring, U., & Björck, L. (1992) Proc. Natl. Acad. Sci. USA 89, 8532-8536.
20. Deisenhofer, J. (1981) Biochemistry 20, 2361-2370.
21. Kato, K., Gouda, H., Takaha, W., Yoshino, A., Matsunaga, C., & Arata, Y. (1993) FEBS Lett. 328, 49-54.
22. Lian, L.-Y., Barsukov, I. L., Derrick, J. P., & Roberts, G. C. K. (1994) Nature Structural Biology 1, 355-357.
23. Kato, K., Matsunaga, C., Igarashi, T., Kim, H., Odaka, A., Shimada, I., & Arata, Y. (1991) Biochemistry 30, 270-278.
24. Kato, K., Matsunaga, C., Odaka, A., Yamato, S., Takaha, W., Shimada, I., & Arata, Y. (1991) Biochemistry 30, 6604-6610.
25. Gronenborn, A. M., & Clore, G. M. (1993) J. Mol. Biol. 233, 331-335.
26. Kato, K., Lian, L.-Y., Barsukov, I. L., Derrick, J. P., Kim, H., Tanaka, R., Yoshino, A., Shiraishi, M., Shimada, I., Arata, Y., & Roberts, G. C. K. (1994) Structure, submitted.
27. Kainosho, M., & Tsuji, T. (1982) Biochemistry 21, 6273-6279.
28. Kato, K., Matsunaga, C., Nishimura, Y., Waelchli, M., Kainosho, M., & Arata, Y. (1989) J. Biochem. (Tokyo) 105, 867-869.
29. Roberts, G.C.K., Lian, L.-Y., Yang, J. C., Derrick, J. P., & Sutcliffe, M. J. (1992) In: Molecular Recognition: Chemical and Biochemical Problems II, (ed., Roberts, S.M.) Royal Society of Chemistry, pp. 161-170.
Studies of Cytokine - Cytokine Receptor Interactions: Influence of Ligand Dimerization
Larry D. Ward*◊, Geoffrey J. Howlett†, Robert L. Moritz*, Annet Hammacher*, Kiyoshi Yasukawa‡, and Richard J. Simpson*
*Joint Protein Structure Laboratory, Ludwig Institute for Cancer Research (Melbourne) and The Walter and Eliza Hall Institute of Medical Research, Parkville, Vic, Australia; †Dept. of Biochemistry, University of Melbourne, Parkville, Vic, Australia; ‡Biotechnology Research Laboratory, TOSOH Corporation, Kanagawa, 252, Japan; ◊Current Address: AMRAD Laboratories, Hawthorn, Vic, Australia
I. Introduction
The recent advent of a commercially available biosensor utilising surface plasmon resonance (SPR) detection has created considerable interest with respect to its use for both the qualitative and quantitative study of protein-protein interactions [1]. Surface plasmon resonance is a phenomenon observed at the interface of two transparent media (e.g. glass and aqueous solution) when the interface is coated with a thin layer of metal, in this case gold. Monochromatic, p-polarized light is reflected from the surface and the intensity of the totally internally reflected light is monitored; an intensity dip is measured at a defined incident angle, this phenomenon being termed SPR. The angle at which SPR is observed can be utilised to monitor macromolecular interactions as it is critically dependent on the refractive index of the medium in close proximity to the metal surface. Thus if one component is immobilised on the sensor surface, the binding of other components to the immobilised reactant can be monitored in real time, as binding causes an increase in local refractive index, this change being proportional to the mass of the interactant [1]. The intrinsic sensitivity of the technique allows the study of interactions without extrinsic labelling of the interacting species at concentrations not previously accessible to conventional methodologies [1]. Indeed, the high sensitivity of detection has been utilised by a number of
groups successfully to search cellular conditioned media for ligands for orphan receptor molecules [2]. These potentially novel cytokines are present in concentrations in the range of only a few ng/ml. There has been debate in the literature, however, over its quantitative applications. That is, how do equilibrium constants calculated using SPR detection compare to those calculated between similar interactants in solution? [3]. Equilibrium constants using SPR detection are in most cases calculated from the ratio of apparent dissociation and association rate constants describing the binding of interacting species to immobilised reactant [4]. Thus to compare directly calculated equilibrium constants with those calculated in solution it is assumed that the chemical modification inherent in the immobilisation chemistry has not influenced binding characteristics, there are no steric and accessibility problems upon binding to a crowded surface, and that the locally high concentration of immobilised reactant does not influence the binding properties of the immobilised reactant. An additional factor that complicates interpretation of SPR data occurs when the partitioning species is multivalent. In solution the two sites on a bivalent antibody bind in an equivalent and independent manner. Upon binding to immobilised antigen the two sites can interact simultaneously and hence bind with greatly increased apparent affinity to what is expected from their solution behaviour. The reliance upon pseudo first order kinetic expressions based upon 1:1 stoichiometry for multivalent species such as antibodies can lead to quite erroneous estimates of equilibrium constants when using this approach [3]. Here we describe studies of the interaction of interleukin-6 (IL-6) with a soluble form of its cell surface receptor (sIL-6R). 
A procedure utilising a competition approach is presented which allows the determination of the equilibrium constant in solution, thus avoiding any potential problems associated with deviation in kinetic characteristics upon surface immobilisation. In addition, binding characteristics of stable monomeric and dimeric forms of IL-6 are presented to demonstrate both the drastic influence of solute multivalency on kinetic and equilibrium properties and the importance of auxiliary techniques such as analytical ultracentrifugation for the interpretation of SPR data.
II. Materials & Methods
Preparation of Recombinant IL-6 and Soluble Human IL-6 Receptor
Cytokine-Cytokine Receptor Interactions
Soluble IL-6R was purified as previously described (3). The molecular weight of the shIL-6R is 36370 based on amino acid composition (3). However, the CHO cell derived protein is glycosylated, as the molecular weight determined by equilibrium ultracentrifugation was 53000 (3). Two forms of IL-6 were used in this study. (1) Full length human IL-6 in pUC8 (referred to as IL-6D) was expressed in Escherichia coli (strain NM522) as a fusion protein with β-galactosidase [5]. The nine N-terminal amino acid residues of this recombinant IL-6 (Thr-Met-Ile-Thr-Asn-Ser-Arg-Gly-Ser) are derived from β-galactosidase and the polylinker of pUC8. (2) A second protein, which was also full length, was expressed as a growth hormone fusion protein before purification and cleavage with thrombin to remove the growth hormone moiety [6]. This IL-6 (referred to as IL-6M) possessed an additional alanine residue at the N-terminus. The molecular weights of IL-6D and IL-6M are 43708 and 21854 Da respectively based on amino acid composition, these values agreeing to within 0.02% with those determined by electrospray mass spectrometry.
Studies of the IL-6/sIL-6R interaction
Studies of the interaction of IL-6 and the sIL-6R were monitored by SPR detection using a BIAcore™ instrument [1] (Pharmacia, Uppsala). Immobilisation of the respective proteins to the carboxymethylated dextran matrix coating the gold sensor chip was performed using the EDC/NHS coupling chemistry as previously described for IL-6 and sIL-6R [3, 7, 8]. Regeneration of the sensor surface for IL-6 and sIL-6R was performed with 10 mM HCl for 3 min [8] and 4 M MgCl2 in 10 mM Tris-HCl buffer, pH 7.4 for 1 min respectively.
Analytical Ultracentrifugation
Molecular weight determinations were performed utilising a Beckman XL-A analytical ultracentrifuge as previously described [3, 9].
III. Results & Discussion
SPR analysis of IL-6 binding to sIL-6R
The binding of the sIL-6R to IL-6 was measured using a biosensor employing SPR detection, the IL-6D being immobilised on the sensor surface. A typical sensorgram is presented in Fig. 1A, relatively rapid association and dissociation rates being observed. Utilising non-linear
[Figure 1 plots: panels A to C; horizontal axes: Time (s).]
Figure 1. Characterization of the interaction between IL-6 and sIL-6R by means of SPR. Association and Dissociation Phases. (A) IL-6D was immobilised on the sensor chip and sIL-6R (20 nM) introduced to the sensor surface, (B) sIL-6R was immobilised and IL-6D (10 nM) passed over the sensor surface and (C) IL-6M (50 nM) was introduced to the sensor chip as a function of time of association, (a) 10 min, (b) 30 min, (c) 40 min, (d) 80 min. In all cases (A-C), the dissociation of bound ligand from the sensor surface was monitored upon replacement of free ligand with running buffer (10 mM Hepes, pH 7.4, containing 0.15 M NaCl, 3.4 mM EDTA and 0.005% (w/v) Tween-20). Arrows correspond to commencement of the dissociation phase.
regression analysis and the relations of O'Shannessy et al. [4], values of 0.45 × 10⁶ M⁻¹ s⁻¹ and 0.008 s⁻¹ were calculated for the association (ka) and dissociation (kd) rate constants respectively [10]. From the ratio of rate constants a value of 5.9 × 10⁷ M⁻¹ (dissociation constant of 17 nM) was calculated for the equilibrium constant (KA) describing the binding of receptor to immobilised IL-6D. This interaction was also investigated upon immobilising sIL-6R on the sensor surface. Assuming a 1:1 stoichiometry, identical sensorgrams and rate constants would be expected. However, drastically different binding characteristics were observed. IL-6D binding to immobilised sIL-6R (Fig. 1B) was characterised by an enhanced binding affinity, reflected in a drastically decreased off-rate constant. The kinetics of binding were complex and could not be fitted by pseudo-first-order kinetics for a single class of binding sites or by the sum of two such processes for two classes of binding sites (data not shown). This was suggestive of a complex binding mechanism. To evaluate whether this unusual behaviour was characteristic of this particular recombinant IL-6D, another recombinant IL-6 preparation, termed IL-6M, was utilised. Sensorgrams describing IL-6M binding to immobilised
sIL-6R are presented in Fig. 1C. This molecule demonstrated binding characteristics intermediate between those observed in Figs 1A and 1B, traces being presented as a function of the time of association. At low association times the bound IL-6M is predominantly rapidly dissociating, with a small proportion of slowly dissociating species being observed (curve a, Fig. 1C). As the association time increases, the proportion of slowly dissociating species increases until eventually the slowly dissociating species predominates, the dissociation phase after an association phase of 80 min resembling that of IL-6D (Fig. 1B).

Solution Molecular Weight Determinations of IL-6

In order to explain these seemingly contradictory results, the solution molecular weights of both IL-6 preparations were determined by analytical ultracentrifugation, the results of these studies being presented in Fig. 2. For the preparation expressed using the pUC8 vector (IL-6D), the sedimentation distribution is reasonably described by a single species with M = 39000, which indicates the preparation is essentially dimeric. The other preparation, IL-6M, however, had a molecular weight of 20200, indicating it was predominantly monomeric. The sIL-6R was also monomeric as judged by analytical ultracentrifugation [3]. These data were confirmed by size exclusion chromatography (Fig. 3A), IL-6D chromatographing as two peaks, the major and minor species chromatographing in positions corresponding to solution molecular weights of 40000 and 20000 respectively.
[Figure 2 plot: linear transforms for IL-6D and IL-6M; abscissa r² (cm²).]
Figure 2. Determination of the molecular sizes of recombinant IL-6 preparations by sedimentation equilibrium. Linear transform of equilibrium distributions for IL-6M and IL-6D obtained at 12000 rpm. All studies were performed in PBS buffer.
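The linear transform in Figure 2 can be converted into a molecular weight with the standard relation for an ideal single species at sedimentation equilibrium, d ln c / d(r²) = M(1 - v̄ρ)ω² / (2RT). The sketch below is only an illustration of that relation, not the authors' analysis: the partial specific volume (0.73 ml/g), solvent density (1.0 g/ml), temperature (20 °C) and the synthetic data points are assumptions; only the rotor speed (12000 rpm) and the approximate r² range are taken from the figure.

```python
import math

R_GAS = 8.314e7  # erg mol^-1 K^-1 (CGS units: r in cm, M in g/mol)


def slope_least_squares(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def molar_mass_from_slope(slope, rpm, temp_k, vbar, rho):
    """M = 2RT * slope / ((1 - vbar*rho) * omega^2) for an ideal single species."""
    omega = rpm * 2.0 * math.pi / 60.0  # angular velocity, rad/s
    return 2.0 * R_GAS * temp_k * slope / ((1.0 - vbar * rho) * omega ** 2)


# Synthetic ln(c) vs r^2 data (hypothetical, chosen to span the range of Fig. 2).
r_squared = [48.0 + 0.25 * i for i in range(13)]          # cm^2
ln_c = [-1.4 + 0.34 * (x - 48.0) for x in r_squared]      # assumed slope 0.34 cm^-2

mass = molar_mass_from_slope(
    slope_least_squares(r_squared, ln_c),
    rpm=12000,      # from the figure caption
    temp_k=293.15,  # assumed
    vbar=0.73,      # assumed partial specific volume, ml/g
    rho=1.0,        # assumed solvent density, g/ml
)
print(f"apparent molar mass: {mass:.0f} g/mol")
```

With these assumed parameters a slope of about 0.34 cm⁻² corresponds to a mass close to the 39000 quoted for IL-6D, and a slope of roughly half that would correspond to the monomer.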
[Figure 3 plot: panels A and B; abscissa elution time (min); peak positions labelled 40 kDa and 20 kDa.]
Figure 3. Size exclusion chromatography of recombinant IL-6 preparations. Chromatography was performed on a Superose 12 10/30 column developed with PBS buffer. Panel A: IL-6D (10 µg); Panel B: IL-6M (10 µg). The indicated molecular weights were determined by chromatography of standard proteins.
The chromatographic profile of IL-6M indicated that more than 95% of the injected material was monomeric and had a solution molecular weight of 20000 (Fig. 3B). The small proportion of high molecular weight material present in this preparation was predominantly dimer. The discrepant behaviour observed in Figs 1A-1C could therefore be explained in the following manner. When IL-6D is immobilised, sIL-6R binds equivalently and independently to the two sites on the immobilised IL-6D (Fig. 1A). This is supported by the observation that immobilised IL-6M binds to sIL-6R (data not shown) in an identical manner to immobilised IL-6D (Fig. 1A). When sIL-6R is immobilised, however, the two sites on the dimer fraction which predominates in the IL-6D preparation can crosslink two immobilised receptor molecules. This results in an increased apparent affinity which is characterised by a decreased off rate constant (Fig. 1B). There is also a small proportion of rapidly dissociating species, this probably being due to a combination of monomeric IL-6D binding and the binding of dimeric IL-6D through one site with immobilised receptor. The complex binding behaviour observed upon binding of IL-6M to immobilised sIL-6R (Fig. 1C) suggests that at low association times (curve a) the large excess of monomeric species in the preparation of IL-6M dominates, the majority of the dissociating species having a rapid off rate constant. At longer
association times (curve d) the majority of bound species have a slow off rate. This is due to the monomeric species in the preparation being in rapid exchange because of its fast on and off rates. The minority dimer, however, has a slow dissociation rate. Upon binding, it exchanges slowly, thus dominating the binding response with increasing equilibration times. These interpretations were supported by observations that when higher molecular weight aggregates are removed from the IL-6M preparation by size exclusion chromatography (Fig. 3B), the binding characteristics of IL-6M to immobilised sIL-6R are similar to those observed in Fig. 1A when IL-6D was immobilised. In addition, no association time dependent behaviour (as in Fig. 1C) was observed for this preparation (data not shown).

Calculation of Equilibrium Constants in Solution by Competition Studies using SPR

A manifestation of the above interpretation that both sites on the dimeric form of IL-6 are equivalent and independent would be that IL-6D and IL-6M should bind identically to sIL-6R in solution. This was investigated by competition experiments where IL-6D was immobilised as in Fig. 1A and the ability of both IL-6D and IL-6M to inhibit sIL-6R binding to immobilised IL-6D monitored. Data are presented in Fig. 4. Fig. 4A presents a Scatchard plot of results obtained by flowing a range of sIL-6R concentrations across the sensor chip and measuring the equilibrium binding response Rp. The slope signifies an intrinsic association equilibrium constant (KAX) of 2.4 × 10⁷ M⁻¹ (a dissociation constant of 42 nM), which compares favourably with the value of 17 nM calculated from the ratio of rate constants. The competition studies are summarised in Fig. 4B. Data were analysed according to Eq. (1), an expression originally derived for quantitative affinity chromatography [11] and rederived for SPR data [3].

$$ Q = \frac{K_{AX}}{(K_{AX})_S} = \frac{[C_A]}{[C_A]_S} = 1 + K_{AS}\left\{ q[C_S]_T - \left[\frac{Q-1}{Q}\right][C_A]_T \right\} \qquad (1) $$

where KAS and KAX are the intrinsic binding constants for the interaction of acceptor with ligand in solution and with immobilised ligand respectively. (KAX)S is a constitutive binding constant related to KAX but measured in the presence of a defined total concentration of competing ligand, [CS]T; q is the valence of competing ligand (one in this case) and [CA]T is the total liquid phase concentration of acceptor. [CA] and [CA]S are the free concentrations of acceptor in the absence and presence of competing ligand respectively. As shown in Fig. 4B, near identical estimates of KAS were obtained for IL-6D (4.8 × 10⁷ M⁻¹, a dissociation constant of 20 nM) and IL-6M
[Figure 4 plots: panel A, abscissa Rp (units); panel B, abscissa q[CS]T - [(Q-1)/Q][CA]T (nM).]
Figure 4. Characterization of the interaction between IL-6 and sIL-6R by means of SPR and a biosensor chip with IL-6D as immobilised affinity ligand. (A) Evaluation of the binding constant for the interaction of sIL-6R with immobilised IL-6 (KAX), Rp corresponding to the plateau response obtained at equilibrium. (B) Determination of the binding constant for the interaction between sIL-6R and IL-6 in solution (KAS) by including the latter component in the injected sample to act as a competitor. Results for IL-6M (•) and IL-6D (□) are plotted according to equation 1.
(5.2 × 10⁷ M⁻¹, a dissociation constant of 19 nM), confirming that both sites on IL-6D are equivalent and independent. In conclusion these studies demonstrate that (1) multivalent partitioning solutes have the capacity to interact simultaneously with two sites on the BIAcore instrument; (2) even a small proportion (~5%) of multimeric partitioning species can complicate binding analysis; (3) in such cases, if analysed in terms of pseudo-first-order binding equations, erroneous estimates of binding parameters will be obtained; (4) when dealing with an interaction between a multivalent and a monovalent species, the multivalent species should be immobilised to avoid multiple attachments complicating binding analysis; (5) competition studies can be performed to evaluate solution binding characteristics, and in such a system any change in binding characteristics upon immobilization of a reactant to the sensor surface does not influence the calculated solution equilibrium constants; and (6) such competition studies should be performed to validate that binding to immobilised reactant mirrors solution behaviour. This was demonstrated to be the case for the IL-6/sIL-6R interaction when monitoring a 1:1 complex,
similar calculated equilibrium constants being obtained by kinetic and competition approaches. Consequently kinetic constants evaluated using the BIAcore instrument can be used with confidence in describing this system.
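The agreement between the kinetic, Scatchard and competition estimates can be checked with a few lines of arithmetic. The sketch below converts the quoted association constants to dissociation constants and shows how the solution constant KAS falls out of Eq. (1), Q = 1 + KAS{q[CS]T - [(Q-1)/Q][CA]T}, as the slope of (Q - 1) against the bracketed term. The function names, the total acceptor concentration and the competitor concentrations are hypothetical, and the Q values are synthetic (generated from Eq. (1) itself); none of them are data from the study.

```python
import math


def dissociation_constant_nM(k_assoc_per_molar):
    """Convert an association constant (M^-1) into a dissociation constant (nM)."""
    return 1e9 / k_assoc_per_molar


def competition_abscissa(q, cs_total, big_q, ca_total):
    """Abscissa of the competition plot: q[C_S]_T - [(Q - 1)/Q][C_A]_T."""
    return q * cs_total - ((big_q - 1.0) / big_q) * ca_total


def fit_k_as(points, q, ca_total):
    """Slope through the origin of (Q - 1) vs. the abscissa, i.e. K_AS in Eq. (1)."""
    xs = [competition_abscissa(q, cs, big_q, ca_total) for cs, big_q in points]
    ys = [big_q - 1.0 for _, big_q in points]
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def simulate_q(k_as, q, cs_total, ca_total):
    """Solve Eq. (1) for Q (a quadratic in Q); used only to make synthetic data."""
    b = 1.0 + k_as * q * cs_total - k_as * ca_total
    return (b + math.sqrt(b * b + 4.0 * k_as * ca_total)) / 2.0


# Association constants quoted in the text, converted to nM dissociation constants.
for label, k_a in [("Scatchard", 2.4e7), ("IL-6D competition", 4.8e7),
                   ("IL-6M competition", 5.2e7)]:
    print(f"{label}: Kd = {dissociation_constant_nM(k_a):.1f} nM")

# Synthetic competition experiment (all concentrations in M, hypothetical values).
CA_TOTAL = 20e-9
TRUE_K_AS = 5.0e7
points = [(cs, simulate_q(TRUE_K_AS, 1, cs, CA_TOTAL))
          for cs in (25e-9, 50e-9, 100e-9, 200e-9)]
k_as_fit = fit_k_as(points, q=1, ca_total=CA_TOTAL)
print(f"fitted K_AS = {k_as_fit:.3g} M^-1")
```

Because the abscissa itself contains Q, each experimental point needs its measured Q before it can be placed on the plot; the fit is then a straight line through the origin.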
References

1. Jonsson, U., Fagerstam, L., Ivarsson, B., Johnsson, B., Karlsson, R., Lundh, K., Lofas, S., Persson, B., Roos, H., Ronnberg, I., Sjolander, S., Stenberg, R., Urbaniczky, C., Ostlin, H., and Malmqvist, M. (1991) Biotechniques 11, 620-627.
2. Bartley, T.D., Hunt, R.W., A.A., Boyle, W.J., Parker, V.P., Lindberg, R.A., Lu, H.S., Colombero, A.M., Elliott, E., Trail, G., , B., Yarden, Y., Hunter, T., and Fox, G.M. (1994) Nature 368, 558-560.
3. Ward, L.D., Howlett, G.J., Hammacher, A., Weinstock, J., Yasukawa, K., Simpson, R.J., and Winzor, D.J. (1994) Biochemistry, In press.
4. O'Shannessy, D.J., Brigham-Burke, M., Soneson, K.K., Hensley, P., and Brooks, I. (1993) Anal. Biochem. 212, 457-468.
5. Zhang, J.G., Moritz, R.L., Reid, G.E., Ward, L.D., and Simpson, R.J. (1992) Eur. J. Biochem. 207, 903-913.
6. Yasukawa, K. and Saito, T. (1990) Biotechnology Lett. 12, 419-424.
7. Ward, L.D., Hammacher, A., Chang, J., Zhang, J-G., Discolo, G., Moritz, R.L., Yasukawa, K., and Simpson, R.J. (1994) Techniques in Protein Chemistry V, 331-338.
8. Ward, L.D., Hammacher, A., Zhang, J-G., Weinstock, J., Yasukawa, K., Morton, C.J., Norton, R.S., and Simpson, R.J. (1993) Protein Sci. 2, 1472-1481.
9. Ward, L.D., Howlett, G.J., Yasukawa, K., Hammacher, A., Moritz, R.L. and Simpson, R.J. (1994) J. Biol. Chem., In press.
10. Hammacher, A., Ward, L.D., Weinstock, J., Treutlein, H., Yasukawa, K. and Simpson, R.J. (1994) Protein Sci., In press.
11. Winzor, D.J. and Jackson, C.M. (1993) in Handbook of Affinity Chromatography (Kline, T., ed.), pp. 253-298. Marcel Dekker, New York.
New High Sensitivity Sedimentation Methods: Application to the Analysis of the Assembly of Bacteriophage P22 Walter F. Stafford, III, Sen Liu, and Peter E. Prevelige, Jr. Boston Biomedical Research Institute 20 Staniford Street Boston MA 02114
I. Introduction

Analytical ultracentrifugation is the method of choice for analyzing interacting systems. Estimates of binding stoichiometry and association constants may be obtained readily either directly from sedimentation equilibrium experiments, or by boundary analysis of sedimentation velocity experiments. The marketing of a new, simple to operate analytical ultracentrifuge (Beckman XL-A, Palo Alto, CA) with on-line data acquisition capabilities guarantees a resurgence of popularity for this analytical technique. New methods for the analysis of sedimentation velocity data have extended the sensitivity of the UV scanning and Rayleigh interferometric optical systems of the analytical ultracentrifuge. Boundaries with concentrations on the order of 10-20 µg/ml now can be visualized readily with the Rayleigh optical system, allowing analysis of sedimenting systems in a concentration range previously inaccessible to the analytical ultracentrifuge. The increase in sensitivity has been achieved by a combination of analytical techniques (Stafford, 1992a; Stafford, 1992b; Stafford, 1994a) that use the time derivative of the concentration profile and of instrumental techniques (Liu and Stafford, 1992; Yphantis et al., 1994) that employ a rapid acquisition video-based Rayleigh optical system. Use of the time derivative achieves an automatic optical background correction by removing the time independent components, and the fast video system allows signal averaging so that a large increase in the signal-to-noise ratio can be achieved. Sedimenting boundaries can be represented as apparent sedimentation coefficient distribution functions, g(s*) vs. s*, where s* is the apparent sedimentation coefficient defined in equation 2, and g(s*) has units proportional to concentration per svedberg. A plot of g(s*) vs. s* is geometrically similar to the corresponding plot of dc/dr vs. r that would have been obtained with a schlieren optical system.
This chapter describes a general method for the analysis of concentration profiles obtained during sedimentation velocity experiments using the apparent distribution function.
II. Theoretical Background

The apparent sedimentation coefficient distribution function, g(s*) vs. s*, can be computed from the time derivative of the concentration profile using the following equation, in which the time derivative of the concentration profile has been corrected for the "plateau" contribution by an iterative procedure described previously (Stafford, 1992a; Stafford, 1994a):

$$ g(s^*) = \left(\frac{\partial c}{\partial t}\right)_{corr} \frac{\exp(2\omega^2 s^* t)}{c_0} \left[\frac{\omega^2 t^2}{\ln(r_m/r)}\right] \qquad (1) $$
where c is the concentration, g(s*) has units of concentration per svedberg, r is the radius in centimeters, r_m is the radius of the meniscus, ω is the angular velocity, in radians per second, of the rotor attained after acceleration, c_0 is the initial loading concentration, t is the equivalent time of sedimentation in seconds, and s*, the apparent sedimentation coefficient, is defined as

$$ s^* = \frac{1}{\omega^2 t} \ln\left(\frac{r}{r_m}\right) \qquad (2) $$

where t is computed from the following integral

$$ t = \frac{1}{\omega_f^2} \int_{t'=0}^{t'=t} \omega(t')^2 \, dt' \qquad (3) $$

where ω(t) is the angular velocity during acceleration and ω_f is the final angular velocity and also the value of ω used in equation 2. The function g(s*) is referred to as an apparent sedimentation coefficient distribution because it has not been corrected for spreading arising either from diffusion or from macromolecular interactions. It is convenient to use the unnormalized distribution function, g(s*), since it can be related directly to the local concentration in the cell. The function g(s*) is defined in the following way
[Equation 4, the defining expression for the unnormalized g(s*), is not legible in the source.] (4)
Because g(s*) can be related directly to the local concentrations in the cell, the weight average sedimentation coefficient, to within a very good approximation and in spite of spreading due to either diffusion or association, can be computed from (Stafford, 1994b)

$$ s_w = \frac{1}{c_p} \int_{s^*=0}^{s^*=s^*_p} s^* \, g(s^*) \, ds^* \qquad (5) $$

where the limits of integration refer to integration from the meniscus to the plateau region (the region centrifugal to the boundary), and where c_p is the plateau concentration and is given by

$$ c_p = \int_{s^*=0}^{s^*=s^*_p} g(s^*) \, ds^* \qquad (6) $$

A further increase in precision can be achieved by signal averaging (Stafford, 1992a, 1994a). Values of (∂c/∂t)_r can be averaged at constant values of s*, to compensate for movement of the boundary during sedimentation, over time periods that are short relative to the rate of sedimentation. For example, the video-based Rayleigh optical system currently in use on the XL-A Analytical Ultracentrifuge can acquire and process Rayleigh interferograms about every 5 seconds, so that in a rotor containing four cells the entire set of interferograms can be acquired approximately every 20 seconds, allowing each cell to be sampled three times a minute. The averaged values of (∂c/∂t)_r are then inserted into equation 4. As with other types of signal averaging, the random noise can be reduced by a factor of the square root of the number of patterns averaged. The apparent differential distribution function, g(s*) vs. s*, computed from the time derivative, is a sensitive tool for the analysis of boundary shape. It can be used to advantage in most cases in which one would use dc/dr vs. r for the analysis of homogeneous, heterogeneous and interacting systems. Because a plot of g(s*) vs. s* is very nearly geometrically similar to the corresponding plot of dc/dr vs. r, it has essentially the same information content. Another advantage of g(s*) over dc/dr is that, because of its relatively high signal-to-noise ratio, it can be used to study solutions at somewhat more than 2 orders of magnitude lower concentration than dc/dr obtained with schlieren optics.
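The definitions of s*, the plateau concentration and the weight average sedimentation coefficient can be illustrated numerically. The sketch below is a minimal example under stated assumptions, not the authors' software: s* follows the definition s* = ln(r/r_m)/(ω²t), and c_p and s_w are obtained by trapezoidal integration of a g(s*) curve. The meniscus and boundary radii and the Gaussian-shaped g(s*) are hypothetical; the rotor speed and run time are those quoted for the top panel of Figure 1.

```python
import math


def s_star(r_cm, r_meniscus_cm, rpm, t_sec):
    """Apparent sedimentation coefficient, ln(r/r_m)/(omega^2 t), in svedbergs."""
    omega = rpm * 2.0 * math.pi / 60.0            # rad/s
    s_seconds = math.log(r_cm / r_meniscus_cm) / (omega ** 2 * t_sec)
    return s_seconds / 1e-13                       # 1 svedberg = 1e-13 s


def trapezoid(x, y):
    """Trapezoidal integral of y over x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))


def plateau_and_weight_average(s_vals, g_vals):
    """Return (c_p, s_w): integral of g, and integral of s*g divided by c_p."""
    c_p = trapezoid(s_vals, g_vals)
    s_w = trapezoid(s_vals, [s * g for s, g in zip(s_vals, g_vals)]) / c_p
    return c_p, s_w


# s* at a hypothetical radius of 6.5 cm with a hypothetical meniscus at 6.0 cm.
print(f"s* = {s_star(6.5, 6.0, rpm=56000, t_sec=5348):.2f} S")

# Hypothetical single-species g(s*) curve centred at 3.8 S (arbitrary amplitude).
s_grid = [0.05 * i for i in range(161)]            # 0 to 8 S
g_demo = [math.exp(-((s - 3.8) ** 2) / (2 * 0.4 ** 2)) for s in s_grid]
c_p, s_w = plateau_and_weight_average(s_grid, g_demo)
print(f"c_p = {c_p:.3f} (arbitrary units), s_w = {s_w:.2f} S")
```

For a mixture, the same integration over the whole g(s*) pattern returns a weight average lying between the values for the individual species, which is why s_w is useful for following association.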
III. Application to Bacteriophage P22

The double stranded DNA containing bacteriophage P22 can be assembled from purified coat and scaffolding protein subunits in vitro (Prevelige et al., 1988). Nucleation of assembly requires the formation of a pentamer of coat protein, and assembly proceeds by the subsequent polymerization of monomers or small oligomers of both coat and scaffolding protein (Prevelige et al., 1993). Assembly can be inhibited by the binding of a single molecule of the hydrophobic dye 1,1'-bi(4-anilino)naphthalene-5,5'-disulfonic acid (bisANS) to the coat protein subunit. Circular dichroism and limited proteolytic digestion demonstrated that the binding of bisANS has little effect on either the secondary or tertiary structure of the coat protein subunits (Teschke et al., 1993). In order to analyze the effect of bisANS on the quaternary structure of the coat protein, sedimentation velocity experiments were performed in the presence and absence of inhibitory concentrations of bisANS. Figure 1 shows the application of the time derivative method to the effect of bisANS on the self assembly of coat protein. The coat protein in the absence of bisANS sediments with an s value of 3.8. This value is consistent with that expected for a monomer of the molecular weight of the coat protein (47,000). In the presence of inhibitory concentrations of bisANS, the subunits associate to form oligomers with an s value of 5.2. This value is consistent with that expected for a dimer. The presence of dimers of coat protein only in the presence of bisANS suggests that bisANS is driving dimerization.
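As a rough plausibility check on the monomer and dimer assignments, one can ask how s would scale if the oligomer had the same frictional ratio and partial specific volume as the monomer, in which case s is proportional to M^(2/3). This scaling is a textbook idealisation and an assumption here; it is not part of the analysis reported in this chapter, and the function name is hypothetical.

```python
def same_shape_scaled_s(s_monomer, n_subunits):
    """s of an n-mer if frictional ratio and partial specific volume are unchanged.

    With f proportional to M**(1/3) and s proportional to M/f, s scales as M**(2/3).
    """
    return s_monomer * n_subunits ** (2.0 / 3.0)


S_MONOMER = 3.8          # svedbergs, observed without bisANS
S_OBSERVED = 5.2         # svedbergs, observed with bisANS

s_dimer_ideal = same_shape_scaled_s(S_MONOMER, 2)
print(f"idealised dimer: {s_dimer_ideal:.2f} S")
print(f"observed ratio: {S_OBSERVED / S_MONOMER:.2f}, "
      f"idealised ratio: {s_dimer_ideal / S_MONOMER:.2f}")
```

Under this assumption a dimer would sediment near 6.0 S. The observed 5.2 S lies between the monomer value and this idealised dimer value, which is the direction expected if the dimer has a somewhat larger frictional ratio than the monomer.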
[Figure 1 plots: top and bottom panels; abscissa s* (svedbergs).]
Figure 1: Sedimentation velocity analysis of the interaction between bisANS and bacteriophage P22 coat protein. Sedimentation was carried out at 56,000 rpm at 20 °C in a Beckman Model E centrifuge equipped with a video-based on-line Rayleigh optical system. (TOP) Coat protein alone (c0 = 0.4 mg/ml; t = 5348 sec); (BOTTOM) Coat protein in the presence of 60 µM bisANS: (A) c0 = 0.5 mg/ml, t = 5339 sec and (B) c0 = 0.2 mg/ml, t = 5343 sec. The error bars are the standard error of the mean propagated from the averaging process.
[Figure 2 plot: left and right hand ordinates; abscissa s* (svedbergs).]
Figure 2: Sedimentation velocity analysis of the interaction between bisANS and bacteriophage P22 coat protein. Coat protein in the presence of 30 µM bisANS; c0 = 0.3 mg/ml. The right hand ordinate has units of mg ml⁻¹ svedberg⁻¹. Sedimentation was carried out at 56,000 rpm at 20 °C; t = 7304 sec in a Beckman XL-A ultracentrifuge equipped with a photoelectric scanner and video-based on-line Rayleigh optical system similar to the one installed on the Model E but using a high resolution Kodak MegaPlus 1.4 digital camera (Stafford, to be described elsewhere).
In order to determine whether both the monomer and dimer of coat protein were capable of binding bisANS, the sedimentation velocity runs were performed in a Beckman XL-A analytical ultracentrifuge equipped with both photoelectric scanner and refractive optics (Fig. 2). Absorbance at 390 nm was followed to monitor the distribution of bisANS, while Rayleigh optics were employed to monitor the distribution of the coat protein. The distribution of the bisANS is superimposable on that of the coat protein in both monomer and dimer form. Therefore, it is apparent that both monomeric and dimeric coat protein molecules are capable of binding bisANS. This result suggests that the binding of bisANS to the protein subunit induces a subtle conformational change leading to dimerization, rather than directly mediating the interaction.
IV. Conclusion

This chapter has discussed the use of the apparent sedimentation coefficient distribution function, g(s*) vs. s*, as a tool for studying both interacting and non-interacting systems, especially at low concentrations. The apparent distribution function can be computed from the time derivative of sedimentation concentration curves. The relatively high precision afforded by combining use of the time derivative with signal averaging allows the analysis of systems at total concentrations of a few micrograms per milliliter with the Rayleigh optical system or on the order of 0.01-0.02 a.u. with the photoelectric scanner system of the Beckman Instruments XL-A analytical ultracentrifuge. These methods can be applied to data obtained with other optical systems such as fluorescence to increase the sensitivity further. In the run shown in Figure 2, the ability to obtain concentration measurements in terms of both optical density and refractive index allowed us to determine that
bisANS bound to both the monomer and the dimer of the coat protein. A dual wavelength determination would not have been feasible in this case because of the high absorbance of bisANS at 280 nm. The weight average sedimentation coefficient can be estimated from the g(s*) patterns by simple integration according to equations 5 and 6, and, in general, if one knows the sedimentation coefficient of each species as well as the stoichiometry, one may obtain an accurate estimate of the equilibrium constants and standard free energies describing the system (Kegeles et al., 1967; Cann, 1970; Stafford, 1994b). Thus, the new techniques will allow the investigation of interacting systems that were previously inaccessible to analysis by analytical ultracentrifugation.
Acknowledgement This work was supported in part by NIH grant GM-47980.
References

Teschke, C.M., King, J. and Prevelige Jr., P.E. (1993) Inhibition of Viral Capsid Assembly by 1,1'-bi(4-anilino)naphthalene-5,5'-disulfonic acid (bisANS). Biochemistry 32: 10658-10665.
Prevelige Jr., P.E., Thomas, D. and King, J. (1993) Nucleation and Growth Phases in the Polymerization of Coat and Scaffolding Subunits into Icosahedral Shells. Biophys. J. 64: 824-835.
Prevelige Jr., P.E., Thomas, D. and King, J. (1988) Scaffolding Protein Regulates the Polymerization of P22 Coat Subunits into Icosahedral Shells in vitro. J. Mol. Biol. 202: 743-757.
Cann, J. R. (1970). Interacting Macromolecules. New York, Academic Press.
Kegeles, G., L. Rhodes and J. L. Bethune (1967). "Sedimentation Behavior of Chemically Reacting Systems." Proc. Natl. Acad. Sci. 58, 45-51.
Liu, S. and W. F. Stafford (1992). "A Real-Time Video-Based Rayleigh Optical System for an Analytical Ultracentrifuge Allowing Imaging of the Entire Centrifuge Cell." Biophys. J. 61, A476, #2745.
Stafford, W. F. (1992a). "Boundary Analysis in Sedimentation Transport Experiments: A Procedure for Obtaining Sedimentation Coefficient Distributions Using the Time Derivative of the Concentration Profile." Anal. Biochem. 203, 295-301.
Stafford, W. F. (1992b). "Sedimentation Boundary Analysis: An Averaging Method for Increasing the Precision of the Rayleigh Optical System by Nearly Two Orders of Magnitude." Biophys. J. 61, A476, #2746.
Stafford, W. F. (1994a). "Methods of Boundary Analysis in Sedimentation Velocity Experiments." in Numerical Computer Methods, Part B, Methods in Enzymology, 286, Eds. L. Brand and M. L. Johnson. Orlando, Academic Press.
Stafford, W. F. (1994b). "Sedimentation Boundary Analysis of Interacting Systems: Use of the Apparent Sedimentation Coefficient Distribution Function" in MODERN ANALYTICAL ULTRACENTRIFUGATION: Acquisition and Interpretation of Data for Biological and Synthetic Polymer Systems, Eds. T. M. Schuster and T. M. Laue. Boston, Birkhauser Boston, Inc.
Yphantis, D. A., W. F. Stafford, S. Liu, P. H. Olsen, J. W. Lary, D. B. Hayes, T. P. Moody, T. M. Ridgeway, D. A. Lyons and T. M. Laue (1994). "On Line Data Acquisition for the Rayleigh Interference Optical System of the Analytical Ultracentrifuge." in MODERN ANALYTICAL ULTRACENTRIFUGATION: Acquisition and Interpretation of Data for Biological and Synthetic Polymer Systems, Eds. T. M. Schuster and T. M. Laue. Boston, Birkhauser Boston, Inc.
SECTION VII Protein Conformation and Folding
CYANOGEN AS A CONFORMATIONAL PROBE Richard A. Day, Amy Hignite, and Warren E. Gooden Department of Chemistry University of Cincinnati Cincinnati, OH 45221-0172
I. INTRODUCTION

Cyanogen (ethanedinitrile, N≡C-C≡N, C2N2) is a unique protein reagent. C2N2 drives the condensation of paired groups to form covalent links in a mode similar to that of carbodiimides in peptide and amide bond formation. It differs from carbodiimides in a critical and useful manner: it only drives intra-molecular condensation of paired groups such as salt bridges and does not lead to inter-molecular condensation products (1,2). The intra-molecular changes, i.e., covalent bonds replacing paired groups such as salt bridges, have been shown to involve HIS, ARG, and LYS side-chain functional groups (3,4,5). depsi-Peptide bond formation was also seen (3). Carboxylate is the other component of each pair. Preformed paired groups may be within the same molecule (1-5) or between subunits. The subunits of Hb are rapidly covalently linked with no aggregation beyond α2β2 seen (1,6); the α,β subunits of human chorionic gonadotropin are also linked by the action of C2N2 (7). Hen egg white lysozyme has been shown to associate weakly and heterologously (8,9); C2N2 reacts rapidly with it, producing insoluble aggregates (1). This result suggests that C2N2 can "trap" an associated pair even if formed transiently with an unfavorable equilibrium, as in the case of HEW lysozyme. In principle, a suitable reagent could convert pairs of associated side-chain groups such as salt bridges into stable covalent bonds. If there are distinct and distinguishable sets of associated groups representing two or
more conformational states, then employment of the reagent can (a) be expected to interfere with transitions between conformations, (b) provide a means to identify the paired groups formed/disassociated in the transition(s), and (c) with altered conformational response affect ligand binding. A candidate reagent should react in a non-perturbing way with naturally formed associated pairs. It appears that cyanogen is perhaps the only reagent that fulfills these requirements. Ribonuclease S, of known crystal structure (10), presents an instructive case. Except in the region of residues 18-23, it is very similar to RNase A (11). C2N2 modifies known salt-bridge pairs (5). The distance between the Cα's of ALA20 and SER21 is 27 Å calculated from the X-ray diffraction based coordinates (12); however, considerable uncertainty exists relative to structural parameters for residues 18-23 (12). Are residues 20 and 21 transiently associated and susceptible to C2N2 driven condensation? The answer appears to be yes, but not as a reestablished peptide bond. Instead, it appears to be a depsi-peptide ester link. HSA exists as a single polypeptide chain of 585 amino acids. X-ray crystallographic analysis has confirmed that this chain is folded into three domains, each of which is seen to exist as two subdomains (13). HSA undergoes extensive, pH dependent, reversible conformational changes (14). The so-called acid expansion at low pH is accompanied by loss of a large fraction of α-helical structure. We report here the C2N2 treatment of RNase S, which traps a minor conformer restoring some of the properties of RNase A, and of serum albumin, where C2N2 locks in a low α-helix form and alters ligand binding.

II. Experimental Materials and Methods

A. Protein and Reagents

All chemicals were reagent grade and used without further purification. Cyanogen is a toxic gas and should be handled with care (15); it is not always available commercially.
It can be readily prepared in one step from AgCN (16).

B. Cyanogen Modification of Ribonuclease S

Essentially the procedures described (3,4) were followed.

C. Sequence Analysis of Tryptic Peptides

Sequence analyses of the protein and tryptic peptide samples were
performed by the Protein Chemistry Core facility of the Department of Pharmacology and Cell Biophysics at the University of Cincinnati, Cincinnati, Ohio.

III. Results and Discussion

A. Ribonuclease S

The covalent attachment of the S-peptide to the S-protein was determined from the relative areas of peaks corresponding to these two components on reverse-phase chromatograms. The enzymatic activity of RNase S, as measured by the method of Crook et al. (17), is modified by C2N2 treatment. There is an initial increase in Vmax and Km, seen within one minute. The heightened activity at one minute coincides with the 86% reduction in peptide 7, SER21-LYS31, from the map (compare Fig. 1b, 1c). Subsequent lowering of activity with longer C2N2 treatment can be attributed to covalent modification of active site HIS12 at a slower rate, as shown where the four HIS ring C2 protons were monitored by NMR (5). A tryptic digest and HPLC analysis of standard RNase A and RNase S using the method of McWherter et al. (18) resulted in the tryptic maps shown in Figures 1a and 1b. The order of elution and identity of the tryptides for RNase S were established to be the same as that reported by McWherter et al. (18) for RNase A (Fig. 1a), with the exception of the two peptides arising from the cleaved ALA20 to SER21 bond characteristic of RNase S. The specific residue(s) involved in the covalent modifications present within the affected tryptides were identified through a combination of amino acid analysis, sequence analysis, molecular modeling and comparison with the X-ray diffraction based structure (10,11). The areas of peptides 7 (SER21-LYS31) and 5 (ASN62-ARG85 and ASN67-ARG85) are reduced more rapidly than those of other tryptides in the map. For 7 the only candidate groups for reaction are the SER21 α-ammonium, LYS31 ε-ammonium, and SER hydroxyls. Neither X-ray crystallography nor molecular modeling places the ε-amino group in a salt bridge.
However, a brief exposure to pH 10 (5-10 min) restores full enzymatic activity and at the same time makes the C2N2 RNase S susceptible to trypsin once more; pH 10 will have no effect on an ε-carboxamido link. Re-formation of an ALA20-SER21 peptide is ruled out by the pH 10 lability of the linkage. The loss of 5 from the map (Fig. 1) is consistent with a cross-link of salt-bridge ASP121 to LYS66 as confirmed by sequence analysis. It has nothing to do with the covalent reattachment of the S-peptide to the S-protein; such
[Figure 1 chromatograms: panels (a)-(c); abscissa retention time (min).]
Figure 1. Reverse-phase chromatographic maps of tryptic digests of reduced carboxymethylated ribonuclease preparations: (a) RNase A control, (b) RNase S control and (c) RNase S treated for 1 min with C2N2. Modified and unmodified protein samples were analyzed by reverse-phase HPLC (Perkin-Elmer 250 Binary Pumping Model 235, Diode array Detector, Vydac C-18, #218TP54 Column) using the method of McWherter et al. (18). The sample digests were frozen and stored at -70 °C until analyzed.
side reactions at other pairs cause the trace in Fig. 1c not to be a replica of that in Fig. 1a. Thus, formation of the depsi-ALA20-SER21 link occurs by trapping of a minor conformational state of the RNase S. At one minute C2N2 reaction time the new protein species remains an active enzyme. We conclude that most of the sites targeted by C2N2 are consistent with the X-ray data. Facile deletion of peptide 7 from the map within one minute and restoration of trypsin resistance to RNase S through an alkali labile link are consistent with an ALA20 to SER21 depsi-peptide link formation.

B. Human Serum Albumin

Treatment of HSA with C2N2 at pH 4, 7 and 9 gave a protein (HSA-CN) with altered conformational responses to changes in pH and altered ligand binding. Representative data are shown here; all molar ellipticity ([θ], deg M⁻¹ m⁻¹) vs pH profiles have been duplicated one or more times. [θ]λ values are shown at the λ values indicated. Three sets of data are given: pH dependence of [θ] of HSA and HSA-CN (1) in the far UV, (2) in the near UV and (3) in the presence of Ca²⁺. Monitoring the [θ]227 pH dependence (Fig. 2) shows a variation from -2000 at pH 2-3 to -17,000 at pH 8-9. After C2N2 treatment at pH 7, the pH dependence of [θ]227 varies between ~0 and -8000 to -9000. While not shown, C2N2 modification at pH 4 and pH 9 gives a similar range of [θ]227 values from pH 2-9. The absolute [θ]227 values for the HSA-CN are about one half those of the HSA control. Upon storage of the HSA-CN at pH 9 for several days (with NaN3), the profile slowly reverts to that of natural HSA, with molar ellipticity values returning to those of native HSA. In the near UV the molar ellipticity [θ]292 values for the control HSA are essentially the same as reported in the first CD based study of the N-B transition of HSA (19) at pH 6-9. C2N2 treatment changes [θ]292 = -3500 at pH 9 to -2000 deg M⁻¹ m⁻¹ (not shown).
The ligand Ca²⁺ gives enhanced values of [θ]217 for HSA over the entire pH range when compared with HSA in its absence (Fig. 3). The values range from approximately -15,000 at pH 2-3 to approximately -65,000 at pH 8-9 (Fig. 3). For the HSA-CN in the presence of Ca²⁺, the values change to +8000 at pH 2 and to -3000 to -6000 at pH 7-9. This represents a 90-95% change in the molar ellipticity in the α, β band region and presumably reflects a correspondingly large change in the secondary structure. C2N2 treatment reduces the absolute values of molar ellipticities at all wavelengths and at all pH values. This is consistent with a reduction in α-helical content. The changes in the near UV spectra are suggestive of
Figure 2. Molar ellipticity [θ]227 (deg M⁻¹ m⁻¹) as a function of pH for control HSA (left; panel title: HSA (0.005 mg/ml)) and for HSA-CN (right; panel title: HSA (0.005 mg/ml) and C2N2 (pH 7)). The HSA-CN was formed at pH 7 by passing C2N2 (1 cc) through the HSA solution (1 cc) contained in a 2 cc septum-capped vial. After 1 hour at 25 °C, the C2N2 was entrained in a stream of N2 passed over the surface of the solution, followed by chromatography over Sephadex G25 and lyophilization of the protein fraction. CD spectra were taken on a Cary 61 spectropolarimeter. Fatty-acid-free HSA was acquired from Sigma (St. Louis, MO).
Figure 3. Molar ellipticity as a function of pH in the presence of Ca²⁺ (5 meq/L CaCl2) for control HSA (left; panel title: HSA (0.005 mg/ml) with CaCl2 (5 meq/L)) and for HSA-CN formed at pH 7 (right; panel title: HSA (0.005 mg/ml) and C2N2 (pH 7) with CaCl2 (5 meq/L, pH 7)); see caption to Fig. 2.
alteration in tertiary structure. In prior studies C2N2 has not been found to cause significant changes in the CD spectra of conformationally stable proteins such as carbonic anhydrase (16). In no case among our studies cited above has C2N2 treatment led to protein denaturation; in fact, it can lead to protein stabilization (6). Non-specific denaturation of HSA is unlikely for an additional reason, viz., the HSA-CN reverts to HSA at pH 9 and gives once again the CD spectrum of native HSA. All but one of the types of covalent bonds formed by the action of C2N2 are hydrolyzed slowly at pH 9, 25 °C (1,2,16). We conclude that C2N2 traps conformation(s) similar to that seen at low pH, where α-helix is dramatically reduced.

IV. Conclusions

The cyanogen-treated protein is covalently modified, "locking" the protein in one form. In RNase S the trapping of the enzymatically active, trypsin-resistant form is consistent with a finite amount of an associated ALA20-SER21 pair at any given time. The high-α conformation of HSA is locked into a low-α form by C2N2. Ca²⁺ enhances the difference in secondary structures of HSA and HSA-CN; in fact, Ca²⁺ exerts opposite effects on HSA and HSA-CN.

Note. Cyanogen treatment of human chorionic gonadotropin (hCG), an αβ heterodimer, resulted in cross-linking of a significant fraction of the α- and β-subunits. hCG in which the α-subunit is radioiodinated binds to LH receptors. Treatment of these receptor complexes with cyanogen caused the hCG to become cross-linked to the receptors, as shown by the appearance of a high molecular weight species of approximately 123 kD on SDS-polyacrylamide gels. This complex disappeared upon reduction with β-ME. Thus, the hCG β-subunit appears to be cross-linked to the LH receptors. However, the α- and β-subunits of hCG do not appear to be readily cross-linked to each other under the same conditions.
These observations are consistent with a model in which the conformation of hCG is altered during binding of the hormone to its receptors, a phenomenon that may be studied further through the use of cyanogen cross-linking (W. R. Moyle, personal communication; also see ref. 7).

Acknowledgment

This work was supported in part by USPHS Grant GM 42697.
References

1. Day, R.A., Kirley, J., Tharp, R., Ficker, D., Strange, C. and Ghenbot, G. (1989). In "Techniques in Protein Chemistry" (T. E. Hugli, ed.), p. 517. Academic Press, San Diego.
2. Day, R.A., Tharp, R.L., Madis, M.E., Wallace, J.A., Silanee, A., Hurt, P. and Mastruserio, N. (1990). Peptide Res. 3, 169.
3. Ghenbot, G., Emge, T. and Day, R.A. (1993). Biochim. Biophys. Acta 1161, 59.
4. Karagozler, A.A., Ghenbot, G. and Day, R.A. (1993). Biopolymers 33, 687.
5. Gooden, W.E., Day, R.A. and Kreishman, G.P. (1993). Proc. 1993 Miami Bio/Technology Winter Symposium 3, 15.
6. Tharp, R.L. (1987). Ph.D. Dissertation. University of Cincinnati.
7. Lin, W., Day, R.A. and Moyle, W.R. (1993). Proc. 1993 Miami Bio/Technology Winter Symposium 3, 19.
8. Sophianapoulos, A.J. (1969). J. Biol. Chem. 244, 3188.
9. Zehavi, U. and Lustig, A. (1969). Biochim. Biophys. Acta 194, 532.
10. Wyckoff, H.W., Tsernoglou, D., Hanson, A.W., Knox, J.R., Lee, B. and Richards, F.M. (1970). J. Biol. Chem. 245, 305.
11. Wlodawer, A. and Sjolin, L. (1983). Biochemistry 22, 2720.
12. Richards, F.M. and Wyckoff, H.W. (1973). In "Atlas of Molecular Structures in Biology" (D.C. Phillips and F.M. Richards, eds.), p. 9, 19. Clarendon Press, Oxford.
13. He, X.M. and Carter, D.C. (1992). Nature 358, 209; Carter, D.C. and Ho, J.X. (1994). Adv. Prot. Chem. 45, 153.
14. Zurawski, V.R. Jr. and Foster, J.F. (1974). Biochemistry 13, 3465.
15. Fassett, D.W. (1983). In "Industrial Hygiene and Toxicology", Second Edition (F.A. Patty, ed.), p. 2003. Interscience Publishers, New York.
16. Kirley, J.W., Day, R.A. and Kreishman, G.P. (1985). FEBS Lett. 193, 145.
17. Crook, E.M., Mathias, A.P. and Rabin, B.R. (1960). Biochem. J. 74, 234.
18. McWherter, C.A., Thannhauser, T.W., Fredrickson, R.A., Zogotta, M.T. and Scheraga, H.A. (1984). Anal. Biochem. 141, 523.
Evaluation of Interactions Between Residues in α-Helices by Exhaustive Conformational Search

Trevor P. Creamer¹, Rajgopal Srinivasan¹, and George D. Rose¹

Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO 63110
I. Introduction

Recently, it was hypothesized that native protein structure is specified by a stereochemical code (1). We have been evaluating this hypothesis by examining high resolution protein structures to extract recurrent patterns and identify formative interactions (2,3). Many patterns are small enough to be analyzed exhaustively by conformational search techniques. Initially, we have focused on the α-helix. Exhaustive conformational searches were performed to evaluate interactions between pairs of hydrophobic side chains in mid-helix positions and to understand the motifs adopted by glycine-terminated helices (2). Both studies are described below.
II. Exhaustive Conformational Search

Exhaustive conformational search is a simple and practical way to explore the entire conformational space available to a peptide (or molecular segment) with fewer than a dozen rotatable bonds. A search is performed by systematically varying each rotatable bond in the peptide. Rotations are made about backbone dihedrals (φ and ψ) and/or side chain torsions (χ). In our work, bond lengths and angles are held rigid. After each rotation, the molecule is checked for steric overlap. If overlap occurs, the conformer is discarded; otherwise it is
¹ Current address: Dept. Biophysics and Biophysical Chemistry, Johns Hopkins University School of Medicine, Baltimore, MD 21205
retained for later analysis.

Helix nomenclature is: ...-N"-N'-Ncap-N1-N2-N3-...-C3-C2-C1-Ccap-C'-C"-..., where residues N1 to C1 define the helix proper and have both helical i → i+4 hydrogen bonds and backbone dihedral angles, φ and ψ, with mean values of -64±7° and -41±7°, respectively (4). Residues Ncap and Ccap depart from helical φ,ψ angles, but make one additional helical hydrogen bond. Flanking residues (N', N",... and C', C",...) are non-helical.
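The search loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' program: the function names, the `build_coords` callback (a user-supplied routine that turns a tuple of torsion angles into atomic coordinates with rigid bond lengths and angles) and the `skip_pairs` argument (atom pairs such as bonded neighbors that should not be tested) are hypothetical. The 30° step and the 70% van der Waals criterion are the values quoted in Section III of this chapter.

```python
import itertools
import math


def torsion_grid(n_torsions, step_deg=30):
    """Return every combination of n_torsions angles on a regular grid (degrees)."""
    angles = range(-180, 180, step_deg)
    return itertools.product(angles, repeat=n_torsions)


def has_steric_overlap(coords, radii, skip_pairs=frozenset(), scale=0.7):
    """True if any tested atom pair is closer than scale * (r_i + r_j)."""
    for i, j in itertools.combinations(range(len(coords)), 2):
        if (i, j) in skip_pairs:
            continue  # assumption: bonded atoms are excluded from the check
        if math.dist(coords[i], coords[j]) < scale * (radii[i] + radii[j]):
            return True
    return False


def exhaustive_search(n_torsions, build_coords, radii,
                      skip_pairs=frozenset(), step_deg=30):
    """Enumerate the torsion grid and retain conformers that pass the steric check."""
    retained = []
    for torsions in torsion_grid(n_torsions, step_deg):
        coords = build_coords(torsions)  # hypothetical rigid-geometry builder
        if not has_steric_overlap(coords, radii, skip_pairs):
            retained.append(torsions)
    return retained
```

The number of grid points grows as (360/step)^n, which is why the approach is practical only for segments with roughly a dozen rotatable bonds or fewer.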
III. Side Chain Interactions Within α-Helices

Hydrophobic interactions between residue side chains in an α-helix are thought to be helix-stabilizing (5). This hypothesis was tested experimentally in isolated helical peptides (6), where stabilizing interactions were found between Leu-Tyr and Tyr-Leu pairs spaced either three or four residues apart. Using exhaustive conformational search, we evaluated the energy, entropy, and free energy of Leu-Phe and Phe-Leu pairs in a model helix, as described below.

To model side chain - side chain interactions, a peptide with the sequence CH3CO-(Ala)19-NHCH3 was used, with rigid backbone geometry (φ = -64°, ψ = -41°). The two residues of interest were substituted into middle helix positions and their side chains rotated incrementally. After each rotation, the resultant conformer was checked for steric overlap and discarded whenever any two atoms were closer than 70% of their summed van der Waals radii (values taken from the OPLS non-bonded parameters (7)). When retained, the energy of the conformer was calculated using the AMBER/OPLS forcefield (7,8) (dielectric constant of 78 and temperature of 298 K) and the solvation model of Wesson and Eisenberg (9). At the conclusion of the conformational search, the Boltzmann-averaged energy of the system was calculated using

\[ E = \sum_{i=1}^{N} E_i p_i \qquad (1) \]

where E_i is the energy of the ith conformer, p_i is its Boltzmann weighting factor and the sum is taken over all N conformers generated in the search. The Boltzmann weighting factor for the kth conformer was calculated from the partition function

\[ p_k = \frac{e^{-E_k/RT}}{\sum_{i=1}^{N} e^{-E_i/RT}} \qquad (2) \]
where R is the gas constant and T is the temperature (298 K). The conformational entropy of the two side chains was estimated as

\[ S = -R \sum_{i=1}^{N} p_i \ln p_i \qquad (3) \]

using weights from Equation 2.

Interactions between Leu-Phe and Phe-Leu pairs were analyzed using the protocol described above. Side chains were modeled at positions i (residue 8) and i+2 (residue 10), i+3 (residue 11) or i+4 (residue 12). The (i,i+2) pairs serve as a standard state because, in this position, the side chains are on opposite sides of the helix and cannot interact. Side chain torsions were rotated in increments of 30°. Results are shown in Table I.

Although the (i,i+3) and (i,i+4) Leu-Phe pairs have the same overall interaction energy, ΔE-TΔS, they differ in both energy and entropy (Table I). The most stabilizing of the pairs modeled, Phe-Leu at (i,i+4), undergoes a gain in entropy, a consequence of the fact that interactions between the two side chains cause the rotamer populations to be distributed more uniformly across their possible rotamer classes (10). The conformational entropy calculated using Equation 3 is maximal when all probabilities p_i are equal (and nonzero).

In agreement with the experimental results of Padmanabhan and Baldwin (6) on Leu-Tyr and Tyr-Leu pairs, we also find stabilizing interactions between these hydrophobic side chains. Favorable van der Waals contacts promote side chain - side chain interaction, and the solvation model (9) also biases the side chains toward hydrophobic burial. These factors are seen in the favorable energy terms, ΔE. Conversely, these same factors can lead to a loss of side chain conformational entropy, -TΔS, since the side chains typically lose conformational freedom (10), although in one case a gain in entropy is observed. We note that the energy in Table I is due to interactions between side chains and does not include the energy of helix formation. In particular, it can be entropically costly to fix residues in a helix (10), but favorable interactions can help "pay for" this cost.
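As an illustration of Equations 1 to 3, the following Python sketch computes Boltzmann weights, the weighted average energy, and the conformational entropy from a list of conformer energies. It is a minimal example under stated assumptions: the function names are hypothetical, energies are taken to be in kcal/mol, and the minimum energy is subtracted before exponentiation purely for numerical stability (this leaves the weights of Equation 2 unchanged).

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def boltzmann_weights(energies, temperature=298.0):
    """Equation 2: p_k = exp(-E_k/RT) / sum_i exp(-E_i/RT)."""
    beta = 1.0 / (R_KCAL * temperature)
    e_min = min(energies)  # shift for numerical stability; cancels in the ratio
    factors = [math.exp(-(e - e_min) * beta) for e in energies]
    z = sum(factors)
    return [f / z for f in factors]


def average_energy(energies, weights):
    """Equation 1: E = sum_i E_i p_i."""
    return sum(e * p for e, p in zip(energies, weights))


def conformational_entropy(weights):
    """Equation 3: S = -R sum_i p_i ln p_i, in kcal/(mol K)."""
    return -R_KCAL * sum(p * math.log(p) for p in weights if p > 0.0)
```

For N conformers of equal energy the weights are all 1/N and the entropy reaches its maximum, R ln N, which is the limiting case noted in the text. Multiplying the entropy difference between two searches by -T gives a -TΔS term in the same units as the energies.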
Table I. Energies and entropies (in kcal mol⁻¹) from the conformational searches for Leu-Phe and Phe-Leu pairs at spacings of (i,i+3) and (i,i+4), normalized against the same pairs at spacings of (i,i+2).

Pair      Spacing   ΔE      -TΔS    ΔE-TΔS
Leu-Phe   i,i+3     -0.32   +0.06   -0.24
Leu-Phe   i,i+4     -0.45   +0.21   -0.24
Phe-Leu   i,i+3     -0.23   +0.05   -0.18
Phe-Leu   i,i+4     -0.12   -0.18   -0.30
IV. Glycine Terminated α-Helices

A second example that illustrates the power of exhaustive conformational searching is found in the recent analysis of glycine terminated helices (Gly at C') (2). Specifically, two recurrent motifs are observed in α-helices terminated by a glycine (Figure 1). In each, the glycine residue adopts a left-handed conformation (i.e., φ > 0). In one case, termed the Schellman motif, a distinctive, doubly hydrogen bonded pattern between backbone partners is observed, consisting of 6 → 1 and 5 → 2 hydrogen bonds between the N-H at C" and C=O at C3 and between the N-H at C' and C=O at C2, respectively. In the other case, termed the αL motif, a 5 → 1 hydrogen bond between the N-H at C' and C=O at C3 is observed. A distinguishing feature of the Schellman motif is the presence of interacting hydrophobic residues at C3 and C", while in the αL motif C" is invariably a polar residue. From these observations, stereochemical rules were developed (2). Using these rules, simple visual inspection of the amino acid sequence was sufficient to distinguish the two motifs from each other, and from internal glycines that fail to terminate helices. The key feature of these motifs is that they involve interactions that are local in sequence space, i.e., within a
Figure 1. The two helix-terminating Gly motifs: the Schellman motif and the αL motif. [Figure labels: hydrophobic interaction; hydrogen bond(s); residues C3, C2, C1, C-cap.]
six residue segment. For this reason, both motifs are ideal candidates for exhaustive modeling. We define a helix stop signal to be a residue sequence for which it is energetically more favorable to terminate the helix than to continue it. Under this definition, the two motifs are helix stop signals, as can be shown by exhaustive conformational analysis of suitable peptides.
A. The Schellman Motif

For the Schellman motif, modeling was used to address three questions. For each question, the model peptide consisted of a helical fragment followed by a Schellman sequence. In detail, all conformations of the peptide CH3CO-Leu-Ala-Ala-Ala-Gly-Ile-NHCH3 were generated, subject to the constraint that backbone torsions of the subsequence ...-Leu-Ala-Ala-... were maintained in helical conformation (
1. Question 1: Do the 6 → 1, 5 → 2 hydrogen bonds force a Schellman motif?

To test this question, the working set was filtered to include only those conformations that satisfied a crude, distance-based hydrogen bonding criterion, viz., the N-H donor was constrained to be within 3.5 Å of its C=O acceptor for residues C" and C3 and residues C' and C2. This filter reduced the number of allowed structures to 157. All 157 were minimized (to a gradient of less than 0.01 kJ/mol/Å, using the AMBER/OPLS forcefield (7,8), with a dielectric of 1.0 and no nonbonded cutoffs). A set of six classes with unique backbone minima was found, each having an RMS difference of 0.25 Å or more from the others. However, only one of these six had acceptable geometry (11) for forming (C") N-H···O=C (C3) and (C') N-H···O=C (C2) hydrogen bonds, compatible with a Schellman motif. In the remaining five classes, the putative donor/acceptor pairs moved apart upon minimization and could no longer form hydrogen bonds.
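The crude distance filter used for Question 1 can be sketched as follows. This is an illustrative reading, not the authors' code: the atom labels and function names are hypothetical, each conformer is assumed to be a dictionary mapping atom labels to Cartesian coordinates in Å, and the 3.5 Å cutoff is assumed here to apply to the donor N to acceptor O distance (the text does not say which atoms were measured).

```python
import math

# Hypothetical labels: donor N at C" and C', acceptor O at C3 and C2.
SCHELLMAN_PAIRS = [("N_C''", "O_C3"), ("N_C'", "O_C2")]


def passes_distance_filter(coords, pairs=SCHELLMAN_PAIRS, cutoff=3.5):
    """True if every donor/acceptor pair lies within `cutoff` angstroms."""
    return all(math.dist(coords[d], coords[a]) <= cutoff for d, a in pairs)


def filter_working_set(conformers, pairs=SCHELLMAN_PAIRS, cutoff=3.5):
    """Keep only the conformers (dicts of label -> xyz) that pass the filter."""
    return [c for c in conformers if passes_distance_filter(c, pairs, cutoff)]
```

A filter of this kind only selects candidates; as the text notes, the geometry must still be checked after minimization, because most candidates no longer satisfy hydrogen bonding criteria once relaxed.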
2. Question 2: Does the C"-C3 hydrophobic interaction force a Schellman motif?

To test this second question, the working set was filtered to include only those conformations with interactions between the side chains of Leu (C3) and Ile (C"). A set of 6263 was found having an average of at least one hydrophobic contact per side chain atom. However, when constrained to the lower 25% of the energy range (using the contact energy scheme of Sander et al. (12)), only 121 structures remained. Using the protocol above, the very lowest energy structures were then minimized; each collapsed to a Schellman motif.

3. Question 3: Is termination of a helix in a Schellman motif energetically favored over extending the helix through C'?

To test this final question, the energy of the peptide fragment terminating in a Schellman motif was compared to the energy of the corresponding fragment when fully helical. Both have the same number of hydrogen bonds, but the conformation with the Schellman motif has substantially more favorable contacts (between the side chains of C3 and C"). Moreover, the fully helical peptide unfolds when minimized using the AMBER/OPLS forcefield (as above), while the peptide terminating in a Schellman motif remains folded.
B. The αL Motif

Similar modeling was used to address three further questions for the αL motif. Analysis was more complex in this case because the αL motif has fewer limiting constraints than the Schellman motif.

1. Question 1: Why is glycine required at the C' position?

To test this first question, all conformations of an alanyl hexapeptide with blocked termini were generated, with three residues locked into a helix (C3-C2-C1) and the remaining three (Ccap-C'-C") varied exhaustively, as above. The working set contained 12,726 sterically allowed conformations. Of these, 1638 had positive values of φ for the C' residue and 888 had distances that were consistent with an interaction between the α-carbons of C3 and C', as observed in the αL motif (i.e., < 5.4 Å), but no single conformer satisfied both constraints simultaneously. Thus, the presence of a Cβ atom at C' is sterically incompatible with the αL motif.
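The logic of this test is a pair of independent filters followed by their intersection. A minimal sketch, assuming each conformer in the working set has been reduced to two hypothetical descriptors (φ at C' in degrees and the C3 to C' α-carbon distance in Å), might look like this; the function name and data layout are illustrative only.

```python
def count_alpha_l_candidates(conformers, max_ca_distance=5.4):
    """Count conformers meeting each constraint, and both together.

    Each conformer is a (phi_c_prime_deg, ca_distance_angstrom) tuple.
    """
    positive_phi = sum(1 for phi, _ in conformers if phi > 0.0)
    close_ca = sum(1 for _, d in conformers if d < max_ca_distance)
    both = sum(1 for phi, d in conformers if phi > 0.0 and d < max_ca_distance)
    return positive_phi, close_ca, both
```

For the alanyl hexapeptide the text reports counts of 1638, 888 and 0 for the three quantities; an empty intersection, with both individual filters populated, is what identifies the two constraints as mutually exclusive for a residue bearing a Cβ atom.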
2. Question 2: Why is glycine confined to a left-handed turn conformation?

To test this second question, all conformations of an (Ala)4-Gly-Ala peptide with blocked termini were generated, with C3-C2-C1 in a helix and Ccap-C'-C" varied exhaustively, as above. The working set contained 36,042 sterically allowed conformations. In those where Gly has a negative value of φ, its C=O is sequestered between the C=O groups of C1 and C2, with small interatomic distances between the three oxygens and correspondingly large electrostatic repulsion. This unfavorable configuration is relieved in a left-handed turn conformation (i.e., φ > 0).
3. Question 3: Is termination of a helix in an αL motif energetically favored over extending the helix through C'?

Exhaustive modeling resolves both conformational questions posed about the αL motif. However, the final question (whether the αL motif is energetically favored over continuation of the helix through the Gly) cannot be answered with equivalent certainty. Comparing helix termination with helix continuation, the number of hydrogen bonds is identical and the number of hydrophobic contacts similar. The principal distinction between these two alternatives is that a glycine at the C' position in an αL motif probably sacrifices considerably less conformational entropy than it would within a helix (10,13). This conjecture is in accord with the experimental finding that no helix in a data base of 42 high-resolution protein structures was found to have an internal glycine followed consecutively by a polar residue (except within Ncap-N3, the first helical turn), unless a tightly bound ion or prosthetic group was also present (2). Apparently, such a sequence is sufficient to cause helix termination in an αL motif.
C. Summary of the Glycine Motif Modeling

Summarizing the modeling for the Schellman motif, all three of the questions posed can be answered affirmatively. A notable and automatic consequence of the motif is to position the C' glycine in a left-handed turn conformation. For the αL motif, glycine is the only sterically allowed residue at the C' position. This conclusion from modeling is confirmed experimentally. In every instance of an αL motif in a data base of protein structures, glycine is found at C', unlike the analogous situation for a Schellman motif, where a non-glycine residue is utilized occasionally (2). At the C' position, electrostatic considerations force the glycine to adopt a left-handed turn conformation. Although it cannot be shown conclusively that terminating a helix in an αL motif is of lower energy than extending the helix through C', this proposition appears to be highly likely for entropic reasons.
V. Conclusions

As demonstrated by the two studies described above, exhaustive search techniques can be used to successfully model the interactions and small motifs that are characteristic of protein structure. We have shown that small, stabilizing interactions can occur between two hydrophobic residues at appropriate spacings in the middle positions of an α-helix, in agreement with experimental findings (6). The two motifs adopted by glycine-terminated helices (2) have also been rationalized and shown to be authentic helix stop signals.

Acknowledgment

This work was supported by NIH Grant GM29458.
References

1. Lattman, E.E. and Rose, G.D., (1993) Proc. Natl. Acad. Sci., U.S.A. 90, 439.
2. Aurora, R., Srinivasan, R. and Rose, G.D., (1994) Science 264, 1126.
3. Seale, J.W., Srinivasan, R. and Rose, G.D., submitted for publication.
4. Presta, L.G. and Rose, G.D., (1988) Science 240, 1632.
5. Dill, K.A., Fiebig, K.M. and Chan, H.S., (1993) Proc. Natl. Acad. Sci., U.S.A. 90, 1942.
6. Padmanabhan, S. and Baldwin, R.L., J. Mol. Biol. (in press).
7. Jorgensen, W.L. and Tirado-Rives, J., (1988) J. Am. Chem. Soc. 110, 1657.
8. Weiner, S.J., Kollman, P.A., Case, D.A., Singh, U.C., Ghio, C., Alagona, G., Profeta Jr., S. and Weiner, P.A., (1984) J. Am. Chem. Soc. 106, 765.
9. Wesson, L. and Eisenberg, D., (1992) Protein Sci. 1, 227.
10. Creamer, T.P. and Rose, G.D., (1992) Proc. Natl. Acad. Sci., U.S.A. 89, 5937.
11. Stickle, D.F., Presta, L.G., Dill, K.A. and Rose, G.D., (1992) J. Mol. Biol. 226, 1143.
12. Sander, S., Scharf, M. and Scheider, R., (1992) in Protein Engineering (A.R. Rees and M. Sternberg, eds.), Ch. 4, Oxford University Press, Oxford.
13. Pickett, S.D. and Sternberg, M.J.E., (1993) J. Mol. Biol. 231, 825.
Design, Synthesis and Characterization of a Water-soluble β-sheet Peptide

David S. Wishart, Les H. Kondejewski, Paul D. Semchuk, Cyril M. Kay, Robert S. Hodges, and Brian D. Sykes

Protein Engineering Network of Centres of Excellence, University of Alberta, Edmonton, Alberta, Canada T6G 2S2
I. Introduction

Peptide models of α-helices have revealed much about the intrinsic and extrinsic factors that control helix formation in peptides and proteins (1-5). While considerable progress has been made in our understanding of helix formation and stabilization, the same cannot be said of the situation regarding two other important classes of secondary structure: β-sheets and β-turns. The reason for this is that there has not yet been a β-sheet model developed that is as simple to prepare and as easy to characterize as a monomeric helix. Early peptide models of β-sheets were typically based on large homooligopeptide aggregates (6), which were too poorly defined to be of any practical use in determining β-sheet propensities. More recent work involving diacylaminoepindolidione or dibenzofuran-propionate mimetics of β-sheets and β-turns (7, 8), while very promising, is also problematic because of the reliance of these models on non-peptide constituents, their difficulty in preparation, and in the case of the epindolidione model, their limited solubility.

As an alternative to the peptidomimetic approach to studying β-sheet propensities, two protein-based models have recently been proposed. One is based on the B1 domain of streptococcal protein G, which binds IgG and is termed GB1 (9, 10); the other is based on the zinc-finger peptide CP-1 (11). While the results from the zinc-finger peptide work agree quite well with published statistical β-sheet propensities, they unfortunately do not agree with the results from the GB1 work (9). The poor agreement between the two models may result from neither model being a pure β-sheet. Consequently, in both systems contributions from helices, metal ions and other β-strands may affect the measured β-sheet propensities in ways that are difficult to characterize. Given the limitations of the above systems, it is apparent that the optimal peptide model of a β-sheet (and a β-turn) should be as analogous to the monomeric helix models as possible.
In particular, the ideal β-sheet model should be small (< 20 residues), monomeric, water-soluble, pure (composed of only β-sheets and β-turns), amphipathic (to investigate sidedness), reversibly denaturable, composed of only natural amino acids, easily synthesized and easily characterized by standard spectroscopic techniques. We believe that we have developed such a peptide model. It is based on the naturally occurring cyclic peptide gramicidin S, an antibiotic produced by the bacterium Bacillus brevis (12). The schematic structure of gramicidin S as determined by X-ray and NMR studies (13, 14) is shown in Figure 1.
Figure 1. The structure of gramicidin S from Bacillus brevis. Note the well-defined β-pleated sheet structure. [Figure labels: D-Phe, D-Phe, Orn.]
Gramicidin S is a symmetric decapeptide (sequence: Val-Orn-Leu-D-Phe-Pro-Val-Orn-Leu-D-Phe-Pro) composed of two antiparallel β-strands linked by two type II' β-turns. In this report we will describe methods that we have developed to easily prepare a number of gramicidin S (β-sheet) analogs using solid phase peptide synthesis and a simple but effective cyclization protocol. We will also present data describing the effects of selected residue substitutions on the solubility and structural stability of these peptides. Furthermore, we will demonstrate how lengthened (12 residue) analogs of gramicidin S exhibit features of cold denaturation and trifluoroethanol (TFE)-induced structure formation. These results will be used to demonstrate the potential of this system for determining β-sheet and β-turn propensities of amino acids and for studying the influences of hydrophobicity, sidechain packing, charge, temperature and solvent on β-sheet formation.
II. Materials and Methods

A. Synthesis and Cyclization of Gramicidin S and Analogs

Most syntheses of gramicidin S reported to date (12) are based on tedious solution phase approaches that take up to two weeks to complete. Because of the number of analogs that we anticipated using in this study, it became essential to develop a facile, semi-automated approach. The protocol described below allows gramicidin S analogs of varying length and composition to be synthesized in high yield and purified to homogeneity in less than three days with minimal human intervention. Beginning with either Boc-Lys(Cl-Z)-PAM or Boc-Orn(Cl-Z)-PAM resin, the following 10-residue linear peptides were synthesized using an Applied Biosystems 430A automated peptide synthesizer.

Gramicidin S:  LFPVOLFPVO
Peptide 1:     LYPVKLYPVK   (D-Phe → D-Tyr)
Peptide 2:     LSPVKLSPVK   (D-Phe → D-Ser)
Peptide 3:     LNPVKLNPVK   (D-Phe → D-Asn)
Peptide 4:     LHPVKLHPVK   (D-Phe → D-His)
The following 12-residue peptides were synthesized in the same manner, starting with Boc-Val-PAM resin, but with the ε-amino group of lysine protected with Fmoc (instead of Cl-Z):

Peptide 5:  KLKFPKVKLFPV
Peptide 6:  ILKSPKVILSPV
Peptide 7:  GLKSPKVILSPV
It should be noted that substitution of lysine for ornithine has no effect on the structure or activity of the peptide (12). The blocked peptides were cleaved from the resin using anhydrous HF in the presence of anisole. All peptides were purified prior to cyclization using reversed phase (C8) HPLC with a linear AB gradient, where A = 0.05% TFA/H2O and B = 0.05% TFA/acetonitrile. Cyclizations were performed at concentrations of ~2 mg/mL in dichloromethane using 1.2 equivalents each of N,N'-dicyclohexylcarbodiimide, N-hydroxybenzotriazole and diisopropylethylamine. The progress of the cyclization reaction was monitored by both reversed phase HPLC and plasma desorption TOF mass spectrometry. For gramicidin S and Peptides 1-3, cyclizations were typically complete in 6 hours, with final overall yields ranging from 45-90%. These high yields were achieved with no indication of racemization, while using only the free peptide as the starting material. Peptides 4-7 took longer to cyclize (~24 hrs.) and the overall yields tended to be lower (5-10%). Fmoc groups on Peptides 5, 6 and 7 were removed after the cyclization step by treatment with piperidine (2 hr). Final purification was achieved using reversed phase HPLC.

B. Spectroscopic Characterization (NMR and CD)
All NMR spectra were collected on a Varian Unity 500 MHz spectrometer (¹H frequency 499.8 MHz) equipped with a 5 mm inverse detection probe. Sample concentrations were typically 1-2 mM and sample temperatures were maintained at 25 °C (unless otherwise noted). Sample pH was typically 3.5-4.0. One-dimensional ¹H data were acquired with a ¹H sweep width of 6000 Hz and an acquisition time of 2.3 seconds. The residual water signal was suppressed by presaturation. ¹H DQF-COSY, NOESY and TOCSY (15) spectra were collected and processed using standard methods. All chemical shifts were referenced relative to internal DSS. CD samples (~1 mg/mL) were prepared by dissolving the peptides in a 10 mM sodium acetate buffer (pH 5.5) and sonicating for approximately 1 minute. Insoluble material was removed by centrifugation. CD spectra were recorded at 25 °C (unless otherwise noted) on a Jasco J-500C spectropolarimeter using a 0.02 cm pathlength cell attached to a circulating water bath. CD spectra represent the average of four scans collected over a wavelength interval of 190 to 250 nm. Ellipticity is reported as mean residue ellipticity [θ], with an approximate error of ±500° at 220 nm.
III. Results

A. Substitution Effects of D-Phe on Solubility and Structure

Gramicidin S is not readily soluble in water and often precipitates in the presence of divalent counter ions (HPO4²⁻). In order to design β-sheet analogs that were more water soluble and less sensitive to salt or pH, we investigated the effect of replacing the most hydrophobic amino acid (D-Phe) in gramicidin S with a series
of polar amino acids. Analogs were synthesized with D-Tyr (Peptide 1), D-Ser (Peptide 2), D-Asn (Peptide 3) and D-His (Peptide 4) in the 4 and 4' positions (numbering according to reference 12). Three of the four peptides were found to be significantly more soluble than native gramicidin S, with Peptides 1 and 2 being soluble to >10 mg/mL and Peptide 4 being soluble to 8.5 mg/mL. In addition, all four peptides were analyzed by NMR and far UV CD spectroscopy to characterize their structure. For Peptides 1, 2 and 3, chemical shifts, coupling constants, nOe connectivities and far UV CD spectra are all consistent with a β-sheet structure similar to gramicidin S. Peptide 4, however, appears to retain very little β-sheet structure. These results are summarized in Figure 2, where the chemical shift index (16), derived from α-¹H NMR chemical shifts, is plotted for each analog.
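The criterion applied to the chemical shift index plots (a cluster of three or more positive CSI values indicates β-sheet) can be expressed as a simple run-finding routine. The sketch below is illustrative only: the function name is hypothetical, the input is assumed to be a per-residue list of CSI values (+1, 0 or -1), and runs that wrap around the ends of a cyclic peptide are not joined, which is a simplification.

```python
def sheet_segments(csi, min_run=3):
    """Return (start, end) index pairs for runs of >= min_run positive CSI values."""
    segments = []
    start = None
    for i, value in enumerate(csi):
        if value > 0:
            if start is None:
                start = i  # a new run of positive values begins here
        else:
            if start is not None and i - start >= min_run:
                segments.append((start, i - 1))
            start = None
    # close a run that extends to the last residue
    if start is not None and len(csi) - start >= min_run:
        segments.append((start, len(csi) - 1))
    return segments
```

Indices are zero-based positions in the input list, so they need to be mapped back onto the residue numbering used for each peptide.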
Figure 2. Chemical Shift Index plots of Peptides 1, 2, 3 and 4. Arrows indicate the location of β-sheets in these peptides. Clusters of three or more positive chemical shift index (CSI) values are indicative of a β-sheet. [Panel sequences recoverable from the original figure: Peptide 1, VKLYPVKLYP; Peptide 3, VKLNPVKLNP; Peptide 4, VKLHPVKLHP.]

Overall, these results suggest that it is possible to greatly increase the solubility of this β-sheet model without significantly disrupting the structure. They also suggest that some residues (His in particular) can disrupt the type II' β-turn and eliminate most of the β-sheet structure. This result also implies that it may be possible to use host-guest techniques (17) to study type II' β-turn propensities with this system.

B. Effects of Extending the β-sheet
An unexpected result concerning the 10 residue β-sheet analogs was their remarkable stability. None of the peptide models exhibited any significant structural change upon heating to 85 °C or upon addition of significant quantities of chaotropic solvents. To make these peptides more susceptible to denaturation, the β-sheet was extended by two residues. This chain extension was expected to increase the chain entropy, thereby reducing the thermal stability of the peptide. Unexpectedly, the addition of two lysines to the hydrophilic side of gramicidin S (Peptide 5) significantly reduced its β-sheet content under benign conditions. However, the addition of the structure-inducing solvent trifluoroethanol (TFE) actually enhanced the β-sheet content of this molecule (Figure 3a). While TFE is commonly used to induce helical
455
A Water-Soluble P-Sheet Peptide
Structure in peptides, we believe this represents one of the few instances where TFE has been used to induce the formation of p-sheet structure (18). A
[Figure 3 plots: CD spectra of Peptide 5. Panel A, with and without TFE; panel B, at 5 °C and 85 °C. Horizontal axis: Wavelength (nm), 190 to 250.]
Figure 3. A) CD spectrum of Peptide 5 with TFE (51% β-sheet) and without TFE (26% β-sheet). B) CD spectrum of Peptide 5 at 5 °C and 85 °C. Spectra were analyzed using the program RBOCON (R.F. Boyko, unpublished).

Another interesting feature of this extended β-sheet model can be seen in Figure 3b, where we show the effect of temperature on the CD spectrum of Peptide 5. Curiously, when the sample temperature is decreased (to 5 °C), the spectrum takes on more of a "random coil" character (only 20% β-sheet); but, when the temperature is increased to 85 °C, the spectrum exhibits significantly more β-sheet character (38% β-sheet). In other words, high temperatures induce structure and cold temperatures reduce structure. We believe that this represents an excellent example of cold denaturation (19), and it suggests that the thermodynamics of β-sheet formation may be more complex than currently appreciated.

C. Stabilizing and Destabilizing Amino Acid Substitutions
To enhance the β-sheet content of Peptide 5, two of its lysines were exchanged for isoleucines. Isoleucine is known to have a stronger β-sheet propensity than lysine (20). However, because these changes were expected to reduce the solubility of the peptide, the two phenylalanines were exchanged for serines. The resulting construct was called Peptide 6. A second construct (Peptide 7) was synthesized wherein one of the isoleucine residues was substituted with a glycine. This substitution was predicted to reduce the β-sheet content of the peptide. In Figure 4 we compare the CD spectra of Peptides 6 and 7. As expected, the spectrum for Peptide 6 has substantially more β-sheet than Peptide 7 (31% β-sheet vs. 2% β-sheet). Indeed, the CD spectrum for Peptide 7 closely resembles that of a classic random coil (21). Furthermore, as judged by the overall shape of the CD curve, the spectrum for Peptide 6 appears to have slightly more β-sheet (31%) than Peptide 5 (26%), as expected. It is also worth noting that Peptide 6, just like Peptide 5, exhibits features of cold denaturation and TFE-induced structure stabilization (data not shown). These results suggest that a model based on the sequence of Peptide 6 has many of the features required of an ideal β-sheet model.
Figure 4. CD spectra of Peptide 6 and Peptide 7 collected at 25 °C under benign (aqueous) conditions. Horizontal axis: Wavelength (nm), 190 to 250.
IV. Conclusions

This report describes our efforts at designing, synthesizing and characterizing a water-soluble β-sheet analog. We think that we have succeeded in designing a 12-residue cyclic peptide (Peptide 6) which satisfies most of the criteria required of a model β-sheet: it is small (< 20 residues), monomeric, water-soluble, pure (composed of only β-sheets and β-turns), mostly amphipathic, reversibly denaturable, composed of only natural amino acids, relatively easily synthesized and easily characterized by either CD or NMR. We plan to refine this model to enhance its amphipathicity and to improve its cyclization efficiency. Once this refinement stage is complete, we will begin to systematically investigate the influence of amino acid substitutions on both the hydrophilic and hydrophobic sides of this peptide. The resultant data will be used to extract specific β-sheet propensities for all 20 naturally occurring amino acids. In addition to this work on monomeric β-sheets, we are beginning to study dimeric β-sheets (β-sandwiches) by preparing a variety of disulfide-linked β-sheet analogs. This will allow us to investigate the influences of side chain packing and hydrophobic effects on the stabilization of "idealized" β-sandwiches. We are hopeful that these model systems will provide researchers with the detailed information they need to understand the intricacies of β-sheet and β-turn formation in natural proteins.
References

1. Padmanabhan, S., Marqusee, S., Ridgeway, T., Laue, T.M., and Baldwin, R.L. (1990) Nature 344, 268-270.
2. O'Neil, K.T., and DeGrado, W.F. (1990) Science 250, 646-651.
3. Zhou, N.E., Kay, C.M., and Hodges, R.S. (1992) J. Biol. Chem. 267, 2664-2670.
4. Hill, C.P., Anderson, D.H., Wesson, L., DeGrado, W.F., and Eisenberg, D. (1990) Science 249, 543-546.
5. Horovitz, A., Matthews, J.M., and Fersht, A.R. (1992) J. Mol. Biol. 227, 560-568.
6. Hartman, R., Schawaner, R.C., and Hermans, J. (1974) J. Mol. Biol. 175, 195-212.
7. Kemp, D.S. (1990) Trends Biotechnol. 8, 249-255.
8. Diaz, H., Tsang, K.Y., Choo, D., and Kelly, J.W. (1993) Tetrahedron 49, 3533-3545.
9. Minor, D.L., and Kim, P.S. (1994) Nature 367, 660-663.
10. Smith, K., Withka, J.M., and Regan, L. (1994) Biochemistry 33, 5510-5517.
11. Kim, C.A., and Berg, J.M. (1993) Nature 362, 267-270.
12. Izumiya, N., Kato, T., Aoyagi, H., Waki, M., and Kondo, M. (1979) "Synthetic Aspects of Biologically Active Cyclic Peptides - Gramicidin S and Tyrocidines", Kodansha Ltd., Tokyo.
13. Hull, S.E., Karlson, R., Main, P., Woolfson, M.M., and Dodson, E.J. (1978) Nature 275, 206-207.
14. Krauss, E.M., and Chan, S.I. (1982) J. Am. Chem. Soc. 104, 6953-6961.
15. Wuthrich, K. (1986) "NMR of Proteins and Nucleic Acids", J. Wiley & Sons, New York.
16. Wishart, D.S., Sykes, B.D., and Richards, F.M. (1992) Biochemistry 31, 1647-1651.
17. Scheraga, H.A. (1978) J. Pure Appl. Chem. 50, 315-324.
18. Sonnichsen, F.D., Van Eyk, J.E., Hodges, R.S., and Sykes, B.D. (1992) Biochemistry 31, 8790-8798.
19. Privalov, P.L., Griko, Y.V., Venyaminov, S.Y., and Kutyshenko, V.P. (1986) J. Mol. Biol. 190, 487-498.
20. Chou, P.Y., and Fasman, G.D. (1974) Biochemistry 13, 211-222.
21. Johnson, W.C. (1990) Proteins: Struct. Funct. Genet. 7, 205-214.
Automated Analysis of Protein Folding

Richard A. Smith, Jack Henkin, and Thomas F. Holzman

Protein Biochemistry and Thrombolytics Research, Discovery Research, Abbott Laboratories, Abbott Park, IL 60064. Correspondence is addressed at D-46Y, Discovery Research, Abbott Laboratories, Abbott Park, IL 60064.
I. Introduction

In recent years the need to obtain recombinant proteins has increased dramatically in the pharmaceutical and biotechnology-related industries. The ability to define and understand pathways for folding recombinant proteins can often be a rate-limiting step in the preparation of such proteins for use in diagnostic tests, drug screening, or for structural analysis by NMR and X-ray crystallography. These proteins are routinely obtained through heterologous over-expression in prokaryotic hosts. Unfortunately, instead of producing soluble folded protein, high-level expression often results in formation of inclusion bodies composed of partially folded, or misfolded, protein. In addition to being misfolded, proteins in inclusion bodies often have either mispaired or unpaired disulfide bonds. Since eukaryotic expression does not usually result in inclusion body formation, high-level expression in eukaryotic hosts can be pursued as a solution to this problem. However, obtaining high levels of intracellular expression, and concomitant secretion, routinely requires substantially more effort and time to produce levels equivalent to those observed in prokaryotes. It is possible that metabolic engineering of prokaryotic hosts for enhancement of expression of native proteins may lead to "expression-tailored" organisms (1), but these efforts are clearly in their infancy. Methods presently employed for obtaining correctly refolded proteins from inclusion body preparations are often all-or-none propositions. They typically consist of denaturant solubilization, in urea or guanidine, followed by dilution or dialysis (2). Recovery of native activity or structure may be aided by using additives (enzyme inhibitors, co-factors, oxidation-reduction couples, etc.), which act to stabilize the native-state protein conformation.
However, because such efforts are time-consuming and tedious, systematic examinations of solution conditions for protein folding/unfolding are rarely performed.

Figure 1. Diagram of HPLC-based protein folding system. Thin lines represent bi-directional computer communications and data capture; thick lines represent fluid/buffer flow towards fraction collector. [Diagram labels: Gradient Controller and Data Acquisition; CD Detector.]

Some years ago, in an effort to improve previous approaches (3-9), we developed an HPLC-derived method to automate analysis of protein folding and preparative recovery of refolded recombinant proteins. As we recently described (10), this approach makes analyses easier to perform, permits exploration of an array of refolding conditions, and affords reproducibility at a preparative scale. In this method, continuous-flow injection into a narrow-bore, open-channel tube is used for both equilibrium and time-resolved kinetic folding experiments (Fig. 1). With appropriate in-line detectors it is possible to measure, nearly simultaneously, changes in protein quaternary, tertiary and secondary structure in response to chemical denaturants and solvent additives. Changes in protein quaternary structure or aggregation state are monitored by either simple light-scattering in the UV, or if resources permit, use of a detector designed specifically for light-scattering analyses. Tertiary structure changes are monitored by zero-order UV protein absorbance spectra, the resulting second-derivative UV spectra, and changes in protein tryptophan fluorescence. Changes in secondary structure are monitored by CD.

Although a number of system configurations are possible, the instrumentation described here consists of a commonly available, temperature-controlled, ternary HPLC system and LC detectors with data capture and analysis software (Fig. 1). It employs a combination of narrow-bore tubing and HPLC microbore static and dynamic mixers to give a total, pre-sample mixing, dead volume of less than 1 mL and sequential volumes of individual detectors ranging from 12 to 40 µL.
In comparison to standard manual mixing techniques the system affords fast, repetitive, unattended analyses of folding/unfolding combined with high data-capture rates. For example, after several hours of preparation, a typical denaturation profile by manual mixing might comprise 20-30 samples between a folded state in buffer and an unfolded state in high denaturant, with each sample individually analyzed by UV, fluorescence and CD. Automation of a single run on the flowing sample yields several hundred to several thousand data points each for the UV, fluorescence, and CD measurements. Because the method employs standard HPLC control software, it is possible to establish with ease both reproducibility and reversibility of equilibrium folding/unfolding. When combined with various oxidant/reductant electro-chemical couples, the method can be used in both analytical and preparative modes to perform automated refolding of proteins containing disulfides. Finally, the precise control of sample and denaturant flow rates, combined with variable volume delays to detectors, permits, in principle, selective observation and manipulation of a particular kinetic folding process, independent of time of refolding.

Figure 2. Schematic depiction of potential modes of instrument operation. Signals due to folding can be observed at equilibrium (upper panel) or kinetically before equilibrium is attained (lower panel). Insets depict hypothetical spectral differences that may be observed at different times (volumes) after mixing. [Panel labels: Signals from Probes of Denaturation/Renaturation; Constant Total Protein; Denaturant Gradient; Mixing and Delay for Equilibrium; Mixing; Increasing Time and Volume After Mixing. Insets: Hypothetical Spectra of Equilibrium Folding Intermediates; Hypothetical Spectra of Kinetic Intermediates at Various Times After Mixing.]
II. Materials and Methods

HPLC SYSTEM AND CHARACTERISTICS. This technique was developed with the use of a ternary HPLC system employing three microbore-capable, stepper-motor controlled, reciprocating HPLC pumps (Fig. 1). Specifically, the experiments presented here were accomplished using the following equipment: 1) a Beckman Ternary HPLC comprising one 126 binary pump and one 116 single pump (both pumps were equipped with programmable quaternary solvent selection valves); 2) a Beckman 166 variable wavelength UV detector to monitor at 230 nm the gradient formed at the primary mixer; 3) a Beckman 168 diode array UV-Vis detector to acquire protein spectra from 200 to 400 nm (1 nm resolution and individual absorbance measurements at 250 and 278 nm); 4) a Shimadzu RF 551 HPLC fluorescence detector to record the intrinsic Trp fluorescence, and 5) a Jasco J600 spectropolarimeter to monitor changes in CD signal, typically at 222 nm. In addition, both static and dynamic microbore mixers and a modified pulse-damping system were added to the HPLC. The protein pump is primed off-line, and the protein is briefly recirculated to the reservoir to establish uniform concentrations throughout the pump/dampener system for protein delivery.

Figure 3. System characterization for shape and position of absorbance-derived signals for urea unfolding gradients alone. Upper panel depicts the average of 8 runs measured at each detector and the associated high/low statistical error limits computed at a confidence interval of 99.9%. Inset to upper panel depicts measured urea phase delay between the averaged runs. The lower panel depicts the averaged data phase-corrected. The lower panel inset depicts the residual differences in urea concentrations between the two absorbance detectors. [Panel annotations: Measured Phase Delay; Average Delay = 21.1 min at 0.2 mL/min; Signal After Delay Loop at Second UV Detector; Average of 8 scans each, with High/Low 99.9% Confidence Intervals.]
Although the priming process consumes no protein sample, the system "dead" volume up to the secondary mixer (Fig. 1) is ~1 mL. In typical use the protein flow rate is ~5-10% of the total and varies between ~5 and 40 µL/min for a ~100-120 min run. For example, a 100 minute run using a sample of protein at 0.5 mg/mL flowing at 20 µL/min consumes 1.0 mg. Thus, a single automated folding/unfolding experiment consumes significantly less protein than a manual mixing experiment in which 10-30 samples of 0.5 mL are prepared, each at 0.5 mg/mL. We routinely set the system up to perform 6-12 runs overnight to establish reproducibility. The use of the system in this fashion relies on flow accuracy and reproducibility at rates as low as 3.0 µL/min. Standard reciprocating HPLC pumps will not suffice; they lack flow-rate accuracy and often exhibit pronounced solvent pulsation. In contrast, microprocessor control, combined with a modified pulse damping system, provides peak-to-peak protein pulses of less than ~10 µA.U. at the 280 nm protein absorbance maximum. As an alternative a syringe-type pump system could, in principle, yield pulseless flow until a refill cycle occurred.

Execution of multiple folding/unfolding runs and coincident data capture is accomplished using the standard HPLC system control software provided by Beckman Instruments. The rate of simultaneous data capture from all detectors is variable from 2 Hz to ~20 Hz; we typically collect data at 2 Hz. With all detectors on-line, a single run produces about 2 MB of data stored in binary form. Data is parsed, analyzed, and plotted using a combination of programs. Beckman System Gold and Array-View software, ASYST Software from Keithley Instruments and Turbo Pascal for Windows are used for initial file conversions and parsing. Final data parsing and plotting are performed with Microsoft Excel 5.0 and Charisma 2.1 from Micrografx. After parsing and analysis, a typical single run collected at 2 Hz produces 4 to 8 MB of uncompressed data files.

Figure 4. System characterization for shape and position of absorbance-derived signals for urea folding gradients alone. Upper panel depicts the average of 5 runs measured at each detector and the associated high/low statistical error limits computed at a confidence interval of 99.9%. Inset to upper panel depicts measured urea phase delay between the averaged runs. The broad "spikes" of urea occurring after ~100 min in the upper panel are due to system re-equilibration. The lower panel depicts the averaged data phase-corrected. The lower panel inset depicts the residual differences in urea concentrations between the two absorbance detectors.
Because a series of overnight runs can consume >100 MB of disk space, we find it convenient to store intermediate and final data sets on inexpensive magneto-optical media.
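The protein-consumption comparison given in this section is simple arithmetic, and a short sketch makes it easy to repeat for other flow rates or concentrations. The function names are introduced here for illustration; the input numbers are the ones quoted in the text.

```python
def protein_consumed_mg(conc_mg_per_ml: float,
                        flow_ul_per_min: float,
                        run_min: float) -> float:
    """Protein used by one continuous-flow run (mg)."""
    volume_ml = flow_ul_per_min * run_min / 1000.0  # uL -> mL
    return conc_mg_per_ml * volume_ml


def manual_series_mg(n_samples: int,
                     sample_ml: float,
                     conc_mg_per_ml: float) -> float:
    """Protein used by a manual mixing series of n_samples tubes (mg)."""
    return n_samples * sample_ml * conc_mg_per_ml


# Values quoted in the text: 0.5 mg/mL protein, 20 uL/min, 100 min run.
automated = protein_consumed_mg(0.5, 20.0, 100.0)

# Manual series: 10 to 30 samples of 0.5 mL, each at 0.5 mg/mL.
manual_low = manual_series_mg(10, 0.5, 0.5)
manual_high = manual_series_mg(30, 0.5, 0.5)

print(f"automated run: {automated:.1f} mg")
print(f"manual series: {manual_low:.1f} to {manual_high:.1f} mg")
```

With these inputs the automated run uses 1.0 mg against 2.5 to 7.5 mg for the manual series, which is the basis for the statement that the flow experiment consumes significantly less protein.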
POTENTIAL MODES OF OPERATION FOR ANALYSIS OF FOLDING. Schematic diagrams showing the signals expected from the two possible modes of instrument operation are presented in Figure 2. In the upper panel signals are observed after a delay for the attainment of equilibrium. In equilibrium mode a denaturant gradient is formed at the primary mixer (Fig. 1) and observed after passage through delay tubing sufficiently long for the total fluid flow rate selected. If the same flow rate is used with a much shorter delay and a constant concentration of "denaturant", then the signals observed are "pre-equilibrium". This latter mode is termed "isocratic freeze-frame" and is shown in the lower panel of Fig. 2. The signals represent folding or unfolding kinetic events occurring within the time domain after protein is mixed with buffer/denaturant at the secondary mixer (Fig. 1). Although we do not present here any data for pre-equilibrium measurements, the potential utility of such measurements is of note.

Figure 5. Fraction unfolded FKBP measured by automated folding analysis. The experiment was performed by flow-injecting a constant 10% stream of FKBP in 9.31 M buffered urea into the secondary mixer (Fig. 1) and performing automated refolding using intrinsic tryptophan fluorescence (11) as described in Fig. 4. Horizontal axis: Urea (M).

Figure 6. Equilibrium unfolding of low molecular weight urokinase observed by intrinsic tryptophan fluorescence using manual mixing (upper panel) and continuous flow analysis (lower panel). Samples for manual analysis equilibrated for >1 hr before reading. The vertical arrow in the upper panel indicates fluorescence signal at 365 nm from excitation at 290 nm. Data in lower panel was measured using the same excitation and emission wavelengths. Both sets of measurements indicate two transitions in fluorescence quenching. Manual measurements were made with a Shimadzu RF-5000U spectrofluorometer. [Panel labels: Static Manual Measurements, versus Wavelength (nm); Continuous Flow Measurement, versus Urea (M), with the First Unfolding Event marked.]

If thorough mixing of protein with buffer/denaturant is attained at the secondary mixer, then a detector(s) placed immediately after the secondary mixer observes a static "freeze-frame" signal that is earlier in "time" than a detector observing a static signal at equilibrium, after a long delay (Fig. 1). This means that it is possible to make static, time-independent, spectroscopic measurements of folding events within the observable kinetic time frame between initiation of unfolding/folding at mixing and the point at which equilibrium is attained. This is accomplished simply by varying flow rate or by altering delay tubing length.
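The relationship behind "varying flow rate or altering delay tubing length" is that the time a fluid element has aged when it reaches a detector equals the delay volume divided by the total flow rate. The sketch below works this through using the delay reported for this system (about 21 min at 0.2 mL/min). The tubing-geometry helper and its example dimensions are hypothetical and are included only to show how a tubing length maps onto a delay volume.

```python
import math


def delay_volume_ml(delay_min: float, flow_ml_per_min: float) -> float:
    """Volume between mixer and detector implied by a measured delay."""
    return delay_min * flow_ml_per_min


def delay_time_min(volume_ml: float, flow_ml_per_min: float) -> float:
    """Age of the mixture when it reaches the detector."""
    return volume_ml / flow_ml_per_min


def tubing_volume_ml(length_cm: float, inner_diameter_mm: float) -> float:
    """Internal volume of a tube (hypothetical helper; 1 cm^3 = 1 mL)."""
    radius_cm = inner_diameter_mm / 20.0  # mm diameter -> cm radius
    return math.pi * radius_cm ** 2 * length_cm


# Delay reported for this system: 21.1 min at a total flow of 0.2 mL/min.
volume = delay_volume_ml(21.1, 0.2)
print(f"implied delay volume: {volume:.2f} mL")

# Same delay volume observed at the two ends of the typical flow range.
for flow in (0.2, 0.5):
    print(f"{flow} mL/min -> {delay_time_min(volume, flow):.1f} min")

# Illustrative only: 100 cm of 0.5 mm i.d. tubing.
print(f"100 cm x 0.5 mm i.d.: {tubing_volume_ml(100.0, 0.5):.3f} mL")
```

The implied volume of roughly 4.2 mL includes whatever the secondary mixer contributes, so it should be read as an effective delay volume rather than a tubing dimension.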
Figure 7. Activity measurements of low molecular weight urokinase refolded by continuous flow from 9.3 M urea to 20 mM Bis-Tris, pH 7.3 buffer and collected during run (Fig. 1). Circles indicate urea concentrations in each fraction as measured by refractive index. Squares indicate enzyme activity measured with urokinase substrate S-2444 (12). Inset shows a plot of activity versus urea concentration. Horizontal axis: Fraction Number.
Finally, the ability to perform equilibrium and pre-equilibrium measurements can be combined with other experimental protocols. The concentration of protein can be varied automatically to determine the effects of concentration on folding and to observe, through sample scattering, the optimal concentration at which to perform refolding experiments. Folding measurements can be combined with reductant/oxidant couples (reduced/oxidized DTT or glutathione) to superimpose a specific redox potential on an experimental run. For example, for the refolding of reduced and unfolded protein from urea, a gradient ratio of reduced to oxidized DTT can be co-injected with protein to enhance intra/intermolecular disulfide exchange.

SYSTEM CHARACTERIZATION AND SUITABILITY TESTS. In Figures 3 and 4 system characterization tests of unfolding and refolding gradients of urea alone are presented. The stock urea concentration was measured by refractive index (2). In both figures gradients were programmed to extend from 0-90% stock urea (9.82 M), or vice versa. In both experiments water was the remaining 10% fluid. In Figure 3 eight sequential runs of an unfolding urea gradient were collected overnight. The gradient reproducibility was so high that the computed error curves differ only slightly from the averaged data. In these unfolding runs the tubing length and secondary mixer produced a delay of ~21 min between the gradient detector (Fig. 1) and the secondary UV detector used to monitor signals from folding. This test demonstrates little or no degradation in the denaturant gradient due to passage through the secondary mixer and the tubing delay (Fig. 1). The residual differences in the phase-corrected scans (Fig. 3, lower panel inset) indicate urea concentration variations between these two points in the system are on the order of 0.1 to 0.2 M across the entire denaturation gradient.
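Phase correction of the two detector traces amounts to shifting the downstream trace back by the measured delay and then subtracting. The text does not say how the delay was estimated, so the sketch below uses one plausible approach, a brute-force search for the shift that minimizes the mean squared difference. That choice, the function names and the synthetic ramp are all assumptions made for illustration.

```python
from typing import List, Tuple


def best_lag(reference: List[float], delayed: List[float], max_lag: int) -> int:
    """Shift (in samples) of `delayed` relative to `reference` that gives
    the smallest mean squared difference over the overlapping region."""
    best, best_err = 0, float("inf")
    for lag in range(max_lag + 1):
        n = min(len(reference), len(delayed) - lag)
        if n <= 0:
            break
        err = sum((reference[i] - delayed[i + lag]) ** 2 for i in range(n)) / n
        if err < best_err:
            best, best_err = lag, err
    return best


def phase_correct(reference: List[float], delayed: List[float],
                  lag: int) -> Tuple[List[float], List[float]]:
    """Return the delayed trace aligned to the reference, plus residuals."""
    n = min(len(reference), len(delayed) - lag)
    aligned = delayed[lag:lag + n]
    residuals = [aligned[i] - reference[i] for i in range(n)]
    return aligned, residuals


# Synthetic example: a linear urea ramp from 0 to 90% of a 9.82 M stock.
top = 0.9 * 9.82
ramp = [top * i / 99 for i in range(100)]
reference = ramp + [top] * 20           # trace at the gradient detector
delayed = [0.0] * 15 + ramp + [top] * 5  # same ramp arriving 15 samples later

lag = best_lag(reference, delayed, max_lag=30)
aligned, residuals = phase_correct(reference, delayed, lag)

rate_hz = 2.0  # data capture rate quoted in the text
print(f"lag: {lag} samples = {lag / rate_hz:.1f} s")
print(f"largest residual: {max(abs(r) for r in residuals):.3f} M")
```

For real traces the residuals would not vanish as they do for this synthetic ramp; the text reports differences on the order of 0.1 to 0.2 M. At 2 Hz a delay of about 21 min corresponds to roughly 2500 samples, so `max_lag` would need to be set accordingly.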
In Figure 4 a similar analysis is performed for the formation of a refolding gradient of urea, with similar results. Taken together these data suggest that the HPLC-based system is capable of forming denaturant gradients with an acceptable degree of accuracy.

III. Results and Discussion

EXAMPLES OF SYSTEM OPERATION. Using manual mixing experiments we have shown the equilibrium folding behavior of a recombinant peptidylprolyl cis-trans isomerase, FK binding protein (FKBP), in urea (11). In that previous study second-derivative UV absorbance and intrinsic Trp fluorescence were used as probes of tertiary structure, and CD as a probe of secondary structure. The reversibility of folding was followed both by these optical probes of structure, as well as by two-dimensional ¹⁵N/¹H heteronuclear single quantum coherence (HSQC) NMR of [U-¹⁵N] FKBP. Fluorescence measurements indicated a transition midpoint at ~3.9 M urea (11). As a comparison, data for FKBP unfolding using the automated analysis is presented in Figure 5; total flow was 200 µL/min (upper three panels are replicates). The dashed lines in each panel are the pre- and post-transition least-square baselines fitted to the thin data curve. The thicker curve in each of the upper three panels is the baseline-corrected data set. The lowest panel depicts the average of the three data sets overlaid with error curves from the three data sets computed at 99.9% confidence. The dashed lines in the lowest panel indicate the urea concentration at the folding transition mid-point. Data from the automated analysis of FKBP folding demonstrates a transition mid-point at ~3.6 M urea, in good agreement with the previous results from manual mixing (11).

In Figure 6 the equilibrium denaturation in urea of low molecular weight urokinase is performed by both manual mixing and automated equilibrium denaturation and measured in both cases by changes in the intrinsic Trp fluorescence. Manual readings (~15 samples) indicate the occurrence of two transitions in fluorescence quenching. The second transition, the major one, also exhibits an accompanying red-shift, consistent with exposure of Trp residues to bulk solvent upon complete unfolding. In the sample subjected to automated folding analysis, the two intensity transitions are clearly evident as well. Although we do not present data for automation of fluorescence wavelength scans, it is possible to perform and record excitation and emission spectral scans using this system. In particular, the RF-551 fluorometer can be externally triggered and programmed for scanning. These scans can be recorded sequentially in the same data channel and then parsed into discrete spectra post-run.

As indicated in Fig. 1 the effluent from the spectrometer analyses can be collected and further analyzed, or be used for other purposes. In Figure 7 a manual analysis is presented of low molecular weight urokinase which has been automatically refolded from urea and collected after the detectors (Fig. 1). In this analysis urea concentration was also manually determined in each sample using refractive index measurements, and urokinase activity was measured using a spectrophotometric kinetic titerplate reader from Molecular Devices. It is evident that the recovery of enzyme activity closely corresponds with the first fluorescence transition observed in Figure 6. Although interpretation of these data is not yet conclusive, it is likely that the transitions observed can be attributed to the known (calorimetric) domain structure of urokinase (13).

In Figure 8 a demonstration of the use of the system for control of protein disulfide formation is presented. In this experiment low molecular weight urokinase was prepared in 9.3 M urea in a fully reduced form. The reduced UK was then injected into a gradient formed between reduced and oxidized glutathione in a constant concentration of 2.0 M urea. The protein eluting from the detectors was captured and analyzed for enzymatic activity (Fig. 8, lower panel). The data indicate recovery of activity is greatest at high GSH/GSSG ratios.

Figure 8. Renaturation of low molecular weight urokinase observed in samples collected from continuous flow refolding. UK was injected in 9.3 M urea into 2.0 M urea, 20 mM Bis-Tris, pH 7.8 buffer with a gradient from 2.5 mM reduced glutathione (GSH) to 2.5 mM oxidized (GSSG). Upper panel: Data for glutathione concentration are from direct GSSG absorbance measurements at the secondary UV detector; data for GSH were determined by DTNB titration of collected fractions. Lower panel: urokinase activity in collected fractions measured by spectrophotometric assay using S-2444 (12) in a Molecular Devices titerplate reader. Horizontal axis: Fraction Number.

In summary, we describe modifications to a standard HPLC system that will permit its use in automation of the analysis of protein folding. The system presented has a number of pertinent advantages. From a stock solution of concentrated protein, the concentrations of protein actually utilized are programmable over a broad range from <0.01 mg/mL to >10 mg/mL. The operational flow of protein sample has a low minimum flow rate of ~3 µL/min. The total fluid flow rate is adjustable, with typical flows for equilibrium runs ranging from 0.2 to 0.5 mL/min. The typical sample volume consumed during a run is ~2 mL. The typical run length, including recycling to initial conditions, is ~140 min. The fluid delay is adjustable to essentially any length desired by the user.
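The run parameters just listed fix the fluid volume used per run, and from that the time a reservoir lasts. The sketch below is a rough planning calculation under stated assumptions: the lowest typical flow (0.2 mL/min) held for the full ~140 min run, 12 runs per day (the upper end of the 6-12 overnight runs mentioned in the Methods), a US gallon as the reservoir size, and the conservative case in which a single reservoir supplies the entire flow. The function names are introduced here for illustration.

```python
ML_PER_US_GALLON = 3785.41  # assumed reservoir size


def run_volume_ml(total_flow_ml_per_min: float, run_min: float) -> float:
    """Total fluid pumped during one run (mL)."""
    return total_flow_ml_per_min * run_min


def reservoir_days(reservoir_ml: float,
                   draw_per_run_ml: float,
                   runs_per_day: int) -> float:
    """Days a reservoir lasts if every run draws `draw_per_run_ml` from it."""
    return reservoir_ml / (draw_per_run_ml * runs_per_day)


per_run = run_volume_ml(0.2, 140.0)
days = reservoir_days(ML_PER_US_GALLON, per_run, 12)

print(f"fluid per run: {per_run:.0f} mL")
print(f"one gallon at 12 runs per day: {days:.1f} days")
```

Because a gradient run splits the flow between buffer and denaturant, each reservoir is actually drawn down more slowly than this worst case, so the figure of about 11 days is best read as a lower bound.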
The aging delay for equilibrium in the system described is ~20 min. The delay tubing is commonly placed in an HPLC column heater, allowing temperature control from ambient to ~80 °C. In the equilibrium mode about 28 mL of total buffer and denaturant are consumed per run. At 12 runs per day a single gallon each of buffer and denaturant will last 10-12 days. With an integrated HPLC-triggered fraction collector and larger bore delay tubing the system can be set up to repetitively inject and collect preparative samples of unfolded protein to be refolded and analyzed. Because the preparatively folded samples of protein will only be folded and active in a certain range of denaturant concentration, this approach permits recapture of misfolded material and recycling through the system.

References
1. Bailey, J.E. (1991) Science 252, 1668-1675.
2. Pace, C.N. (1986) Methods Enzymol. 131, 266-280.
3. Adler, M. and Scheraga, H.A. (1988) Biochemistry 27, 2471-2480.
4. Endo, S., Saito, Y., and Wada, A. (1983) Anal. Biochem. 131, 108-120.
5. Saito, Y. and Wada, A. (1983) Biopolymers 22, 2105-2122.
6. Saito, Y. and Wada, A. (1983) Biopolymers 22, 2123-2132.
7. Thannhauser, T.W., McWherter, C.A., and Scheraga, H.A. (1985) Anal. Biochem. 149, 322-330.
8. Wada, A., Tachibana, H., Hayashi, H., and Saito, Y. (1980) Biochem. Biophys. Methods 2, 257-269.
9. Wada, A., Saito, Y., and Ohogushi, M. (1983) Biopolymers 22, 93-99.
10. Smith, R.A., Henkin, J., Egan, D.A., and Holzman, T.F. (1994) Protein Science 3 (Suppl. 1) 62.
11. Egan, D.A., Logan, T.M., Liang, H., Matayoshi, E., Fesik, S.W., and Holzman, T.F. (1993) Biochemistry 32, 1920-1927.
12. Marcotte, P.A. and Henkin, J. (1993) Biochim. Biophys. Acta 1161, 105-112.
13. Novokhatny, V., Medved, L., Mazar, A., Marcotte, P., Henkin, J., and Ingham, K. (1992) J. Biol. Chem. 267, 3878-3885.
Hsp70-protein complexes: Their characterization by size-exclusion HPLC

Daniel R. Palleros, Li Shi¹, Katherine Reid, and Anthony Fink

Department of Chemistry and Biochemistry, University of California, Santa Cruz, California 95064
I. Introduction

The term molecular chaperones has been coined to refer to several families of structurally unrelated proteins with a common functional property: they associate with partially unfolded proteins and assist in their in vivo translocation, folding and assembly. Among the most intensively studied molecular chaperones are the heat shock proteins of 70 kDa molecular mass, hsp70 (for reviews see refs. 1, 2). These proteins are known to bind small peptides (3) and unfolded proteins (4,5) and the evidence gathered in the last 5 years indicates that they play an important role in the prevention of protein misfolding and aggregation in vivo.

To fully understand the nature of the hsp70-protein interaction, the complexes between the chaperone and the substrate proteins must be isolated and characterized by chemical and physical methods. Ideally, these processes should be carried out with minimum change or disruption of the protein-protein interaction. Attempts to crystallize hsp70's, or their complexes, have been unsuccessful and only a 44-kDa fragment of an hsp70 has been crystallized and studied by X-ray diffraction (6).

Among the substrate proteins we have investigated are reduced, carboxymethylated α-lactalbumin, RCMLA (a permanently unfolded protein), and thermally unstable mutants of staphylococcal nuclease. The latter have the advantage that at temperatures of 30 °C or higher, where they are unfolded, they can be bound to hsp70, and then released at lower temperatures (e.g. 10 °C), where they should be in their native states. We have found that complexes between unfolded proteins and bovine brain hsp73 (a constitutive member of the hsp70 family), or human hsp72 (a protein highly inducible by heat shock and other metabolic stress), or DnaK (E. coli) are stable enough to be analyzed and isolated by size-exclusion HPLC (SEC-HPLC) on silica-based columns. This technique can be used to estimate the Stokes radius of the complexes and their components (5,7), to study hsp70 unfolding behavior (7,8), to follow the kinetics of complex formation and dissociation (4,5,9), to investigate the effects of nucleotides and ions on complex stability (4,5,9), and to isolate the complex for further spectroscopic and chemical studies (5,10); for example, SEC-HPLC in combination with SDS-PAGE is a very useful technique for the determination of complex stoichiometry. In this paper we focus on the application of SEC-HPLC to the determination of Stokes radii and the stoichiometry of hsp70-protein complexes.

¹ Present address: Department of Molecular Biology MB-2, Scripps Institute, La Jolla, CA 92037.
II. Materials and Methods

All chemicals were from the same sources as previously reported (5). NCA-SNase is a staphylococcal nuclease A mutant in which the pentapeptide Ser-Gly-Asn-Gly-Ser has been substituted for the tetrapeptide Tyr-Lys-Gly-Gln at positions 27-30 (11). SEC-HPLC was run on a Bio-SEP 3000 silica column (600 x 7.8 mm; Phenomenex, Torrance, CA) using 20 mM sodium phosphate, 200 mM KCl, pH 6.5, as the mobile phase at 22°C; flow rate was 1 mL/min; detection was by absorbance at 215 nm. SDS-PAGE was run on a Pharmacia PhastSystem™ using 8-25% polyacrylamide-gradient gels and Coomassie Blue R staining, following the protocol described in PhastSystem Development Technique File No. 200 (Pharmacia); densitometry was performed using an ISCO gel scanner (ISCO model 1312) coupled to an absorbance monitor (ISCO model UA5) and integrator (Spectraphysics, model 4270). SDS sample buffer contained 5.5% SDS, 46% glycerol, 0.01% bromophenol blue, 200 mM Tris-HCl, and 0.7% β-mercaptoethanol, pH 6.8.

E. coli DnaK stock solution (17.4 µM) was in 18 mM Tris-HCl, 45 mM NaCl, 10% glycerol, 5 mM β-mercaptoethanol, pH 7.5. For the determination of the relative response factor, k (see below), the RCMLA stock solution was 17.4 µM in 20 mM Tris-HCl, 19 mM NaCl, pH 7.3. The RCMLA stock solution for complex formation was 143 µM in 13 mM Tris-HCl, 12 mM NaCl, pH 7. NCA-SNase stock solution was 510 µM in 20 mM Tris-HCl, pH 7.1. The concentrations of DnaK, RCMLA and NCA-SNase were determined using molar extinction coefficients at 280 nm of 27000 (7), 27200 (12) and 15400 M⁻¹ cm⁻¹ (13), respectively.

The size-exclusion partition coefficient, Kd, was calculated as Kd = (Vi - Vo)/Vt, where Vi is the elution volume of the protein, Vo is the void volume (elution volume of blue dextran, 11.4 mL) and Vt is the total solvent-accessible volume (elution volume of sodium azide, 24.0 mL).
It should be noted that this definition of the partition coefficient differs from others found in the literature; the partition coefficient σ (14), often also called Kd (15), is defined as σ = (Vi - Vo)/(Vt - Vo). It follows that the two constants are related by σ = Kd/[1 - (Vo/Vt)].

In our experience (over 2000 SEC-HPLC runs with hsp70), the silica-based columns give better resolution than agarose-based columns (for example Superose 12 for FPLC from Pharmacia); however, we found that the silica-based columns have a limited lifetime when used with hsp70. After approximately 100-150 injections (20 µL; [hsp70] ≈ 5 µM) hsp70 will no longer elute from the column; this happens very suddenly, without progressive retardation in the elution volume of hsp70 as the number of runs increases. Attempts to clean the column with five volumes of 6 M guanidine hydrochloride or 20% dimethylsulfoxide or 5% acetonitrile have been unsuccessful. It should be pointed out that when the column reaches the state in which hsp70 is no longer eluted, most other proteins will still elute at their normal elution volumes; however, unfolded proteins such as RCMLA are also considerably retarded, probably due to binding to the hsp70 stuck on the column. We also observed that injection of hsp73 resulted in shorter column lifetimes than when DnaK was used.
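The two partition-coefficient definitions given in the Methods can be compared numerically. The following is a minimal sketch in Python, not part of the original work; the function names are hypothetical, and the only values taken from the text are the void volume (11.4 mL), the total solvent-accessible volume (24.0 mL) and the approximate elution volume of the DnaK-RCMLA complex (ca. 16 mL).

```python
# Column constants quoted in the Methods (mL).
V0 = 11.4  # void volume, elution volume of blue dextran
VT = 24.0  # total solvent-accessible volume, elution volume of sodium azide


def partition_kd(v_elution, v0=V0, vt=VT):
    """Partition coefficient as defined in this chapter: Kd = (Vi - Vo) / Vt."""
    return (v_elution - v0) / vt


def partition_alt(v_elution, v0=V0, vt=VT):
    """Alternative literature definition: (Vi - Vo) / (Vt - Vo)."""
    return (v_elution - v0) / (vt - v0)


def alt_from_kd(kd, v0=V0, vt=VT):
    """Convert the chapter's Kd into the alternative coefficient: Kd / [1 - (Vo/Vt)]."""
    return kd / (1.0 - v0 / vt)


if __name__ == "__main__":
    vi = 16.0  # approximate elution volume of the DnaK-RCMLA complex
    kd = partition_kd(vi)
    print(f"Kd = {kd:.3f}")
    print(f"alternative coefficient = {partition_alt(vi):.3f}")
    print(f"converted from Kd       = {alt_from_kd(kd):.3f}")
```

Both routes to the alternative coefficient give the same number, which is a quick check that the stated relationship between the two definitions is consistent.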
III. Results and Discussion

Hsp70-protein complex stoichiometry. Hsp70-protein complexes were formed by incubating hsp70 with an excess of substrate protein at 37°C. The reaction mixtures were then analyzed by SEC-HPLC; the chromatograms corresponding to mixtures of DnaK with RCMLA and NCA-SNase are shown in Fig. 1. The stoichiometry of the hsp70-RCMLA complex was determined by a combination of SEC-HPLC and SDS-PAGE for both bovine brain hsp73 and DnaK. Similar results were obtained in each case; the molar ratio of hsp70 and RCMLA in the complexes was 1:1. Only the results for DnaK are discussed here.

The complex between DnaK and RCMLA was formed by incubating equal volumes of both protein stock solutions for about 100 min at 37°C; 500 µL of the reaction mixture was injected and several 1.5-mL fractions were collected and concentrated using a Centricon 3 (cut-off 3000 Da); fraction #4 corresponded to the peak attributed to the DnaK-RCMLA complex (elution volume ca. 16 mL, see Fig. 1). The final volume of the concentrated fractions was about 150 µL, of which 10 µL was treated with SDS sample buffer, heated at 95°C for 2 min and loaded (1 µL) onto an SDS-PAGE gel. After developing with freshly made Coomassie Blue R solution, the gel was dried overnight at 37°C and then scanned with the densitometer. Only two bands, corresponding to the positions of standard RCMLA and DnaK samples, were observed for fraction #4. The intensity of the bands was determined by the integration of the corresponding densitometry peaks. In order to minimize the error, the same fraction was run on 4 different gel lanes, and each lane was scanned at least three times. The average for the ratio of the areas for DnaK (AD) and RCMLA (AR), AD/AR, was 5.32 ± 0.83.
[Figure 1: SEC-HPLC chromatograms; peaks labeled DnaK, Complexes and NCA-SNase; x-axis: elution volume (mL), 14 to 22.]
Fig. 1. Complex formation between DnaK and substrate proteins monitored by SEC-HPLC. DnaK stock solution was mixed with RCMLA or NCA-SNase stock solutions and 20 mM Tris-HCl buffer, pH 7.1, incubated for 30 min at 37°C and then analyzed by HPLC. Final concentrations were: [DnaK] = 5.5 µM, [RCMLA] = 17 µM and [NCA-SNase] = 24 µM. DnaK (5.5 µM) alone was analyzed under the same conditions. The complexes partially dissociate (20-40% depending on the conditions) during the HPLC run (5).
The relative response factor of DnaK and RCMLA to Coomassie Blue R staining was determined by SDS-PAGE analysis of samples of known concentrations of DnaK and RCMLA. Aliquots of RCMLA and DnaK stock solutions of identical concentrations and Tris-HCl buffer (20 mM; 19 mM NaCl, pH 7.3) were mixed to afford different [DnaK]:[RCMLA] molar ratios (2:1; 1:1; 1:2; 1:3); the concentration of DnaK was kept constant at 4.4 µM. These solutions were treated with SDS sample buffer and analyzed by densitometry as described above. The relative response factor for DnaK and RCMLA, k, was calculated using eq. 1:

$$\frac{[\mathrm{RCMLA}]}{[\mathrm{DnaK}]} \cdot \frac{A_D}{A_R} = k \tag{1}$$

The average from four different lanes (each one with a different [DnaK]:[RCMLA] molar ratio) gave a value for k of 5.43 ± 0.42. It should be noted that the relative response factor k reflects the relative ability of these two proteins to bind Coomassie Blue R; although the nature of the interaction between this dye and proteins is not clearly understood, the binding seems to be favored by the presence of basic residues (Lys, Arg and His). While the number of dye molecules bound to proteins varies largely from protein to protein, the number of dye molecules bound per positive charge on the protein seems to be fairly constant, ranging from 1.4 to 2.7 in a series of proteins (16). Therefore, for proteins with a similar proportion of basic amino acids, the number of Coomassie Blue R molecules bound to the protein is expected to be proportional to its molecular mass. Fortuitously, the proportion of basic amino acids in hsp73, DnaK and RCMLA is about the same (13%); therefore, the relative response factor of DnaK and RCMLA should be comparable to the ratio of their molecular masses (i.e. 69100/14700 = 4.7), which is the case, as the k value of 5.43 ± 0.42 indicates. With the knowledge of k and the ratio of the areas for the DnaK and RCMLA bands in fraction #4 as already determined (AD/AR = 5.32), the ratio of the molar concentrations of DnaK and RCMLA in the complex can be calculated, eq. 2:
$$\frac{[\mathrm{RCMLA}]}{[\mathrm{DnaK}]} = \frac{k}{A_D/A_R} = \frac{5.43}{5.32} = 1.02 \pm 0.24 \tag{2}$$
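The arithmetic behind eqs. 1 and 2 can be retraced with a short Python sketch. The function names are hypothetical. The text does not say how the ± 0.24 was obtained; adding the relative errors of k and AD/AR linearly (a worst-case estimate) is an assumption made here, and it happens to reproduce the quoted value.

```python
def response_factor(conc_rcmla, conc_dnak, area_dnak, area_rcmla):
    """Eq. 1: k = ([RCMLA]/[DnaK]) * (AD/AR), from a mixture of known composition."""
    return (conc_rcmla / conc_dnak) * (area_dnak / area_rcmla)


def molar_ratio(k, area_ratio):
    """Eq. 2: [RCMLA]/[DnaK] = k / (AD/AR), for a fraction of unknown composition."""
    return k / area_ratio


def molar_ratio_error(k, dk, area_ratio, d_area_ratio):
    """Worst-case error: relative errors added linearly (an assumption, see lead-in)."""
    return molar_ratio(k, area_ratio) * (dk / k + d_area_ratio / area_ratio)


if __name__ == "__main__":
    k, dk = 5.43, 0.42                   # relative response factor from the text
    area_ratio, d_area = 5.32, 0.83      # AD/AR measured for fraction #4
    ratio = molar_ratio(k, area_ratio)
    err = molar_ratio_error(k, dk, area_ratio, d_area)
    print(f"[RCMLA]/[DnaK] = {ratio:.2f} +/- {err:.2f}")
```

Combining the relative errors in quadrature instead would give a somewhat smaller uncertainty; either way the ratio is indistinguishable from 1:1.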
These results indicate that the stoichiometry for the DnaK-RCMLA complex is 1:1. As Fig. 1 clearly shows, no stable complexes of higher molecular mass were detected. Moreover, performing the incubation with a molar excess of DnaK over substrate protein did not result in higher molecular mass complexes.

Stokes radius determination. The Stokes radii (Rs) of DnaK, RCMLA, NCA-SNase and their complexes were determined by SEC-HPLC using a series of standard globular proteins for which Stokes radii were available (15, and references therein). It is well established that size-exclusion partition coefficients can be correlated to the molecular mass, MM, of proteins by eq. 3:

$$K_d = -A \log \mathrm{MM} + B \tag{3}$$
where A and B are empirical constants. However, for non-globular or highly asymmetric proteins, a better correlation is obtained if the hydrodynamic radius (Stokes radius, Rs) is used instead of the molecular mass (14). A plot of log Rs vs. Kd for standard proteins gave a good linear correlation (log Rs = 2.1037 - 2.1552 Kd; r = 0.990). Standard proteins (Kd; Rs in Å) were: ribonuclease A (0.376; 19.3); myoglobin (0.368; 20.2); bovine carbonic anhydrase (0.343; 23.6); ovalbumin (0.293; 31.2) and bovine serum albumin (0.258; 33.9). Using the correlation mentioned above and the Kd values listed below (in parentheses), the following Rs values (error: ± 2 Å) were obtained: DnaK (0.228): 41 Å; DnaK-RCMLA complex (0.185): 51 Å; DnaK-NCA-SNase complex (0.187): 50 Å; RCMLA (0.309): 27 Å; and NCA-SNase (0.354): 22 Å.

Our results indicate that DnaK does not behave as a globular protein in the SEC-HPLC experiments. A Stokes radius of 41 Å for DnaK is in agreement with previously published data determined by dynamic light scattering (7), and is also comparable with the Stokes radius determined for hsp73, 39 Å (17). These values are greater than predicted, however, for the Stokes radius of a globular protein of molecular mass 70 kDa. A correlation between volume (Rs³) and molecular mass for nine globular proteins is shown in Fig. 2; Rs³ = -1707 + 0.6091 MM (r = 0.989). For a globular protein of 70 kDa a Stokes radius of about 34 Å is expected. For the DnaK-protein complexes, Stokes radii of about 51 Å have been determined, which are too large for spherical-shaped complexes; a radius of 37 Å is expected for a globular protein of molecular mass 84 kDa (the molecular mass of the complexes). This abnormally large Stokes radius is in part a reflection of the non-globular character of DnaK; however, the 14 Å difference between the expected (37 Å) and observed (51 Å) Stokes radius for the complexes is much larger than the difference of 7 Å detected for free DnaK.
This disparity suggests that the substrate proteins must be substantially unfolded when bound to DnaK. This is not surprising in the case of RCMLA, because the protein is permanently unfolded regardless of the experimental conditions. This is also evidenced by its large Stokes radius (27 Å); for a globular protein of molecular mass 14700, the Rs is expected to be around 19 Å. The results with NCA-SNase are unexpected inasmuch as the free substrate protein is folded under the conditions of the SEC-HPLC analysis (18). The unfolded nature of NCA-SNase in the complex with DnaK was further investigated by fluorescence spectroscopy and far-UV circular dichroism (5).
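The Stokes radius calibration described above can be retraced from the numbers quoted in the text. The sketch below, in plain Python with hypothetical function names, fits log Rs against Kd for the five standard proteins by ordinary least squares (the fitting method is an assumption; the text reports only the resulting line), then applies the reported calibration and the Fig. 2 correlation for globular proteins.

```python
import math

# (Kd, Rs in angstroms) for the five standard proteins listed in the text.
STANDARDS = [
    (0.376, 19.3),  # ribonuclease A
    (0.368, 20.2),  # myoglobin
    (0.343, 23.6),  # bovine carbonic anhydrase
    (0.293, 31.2),  # ovalbumin
    (0.258, 33.9),  # bovine serum albumin
]


def fit_line(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrate(standards=STANDARDS):
    """Fit log10(Rs) against Kd for the standard proteins."""
    kds = [kd for kd, _ in standards]
    log_rs = [math.log10(rs) for _, rs in standards]
    return fit_line(kds, log_rs)


def stokes_radius(kd, slope=-2.1552, intercept=2.1037):
    """Rs from Kd, by default with the calibration reported in the text."""
    return 10.0 ** (intercept + slope * kd)


def globular_radius(mass_da):
    """Radius expected for a globular protein: Rs^3 = -1707 + 0.6091 MM (Fig. 2)."""
    return (-1707.0 + 0.6091 * mass_da) ** (1.0 / 3.0)


if __name__ == "__main__":
    slope, intercept = calibrate()
    print(f"fit: log Rs = {intercept:.4f} + ({slope:.4f}) Kd")
    for name, kd in [("DnaK", 0.228), ("DnaK-RCMLA", 0.185), ("RCMLA", 0.309)]:
        print(f"{name}: Rs = {stokes_radius(kd):.0f} A")
    print(f"globular 70 kDa: {globular_radius(70000):.0f} A")
```

Comparing `stokes_radius` with `globular_radius` for the same species is the comparison the text uses to argue that DnaK and its complexes are not compact spheres.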
[Figure 2: plot of Rs³ (Å³) against molecular mass (Da).]
Fig. 2. Correlation between Rs³ and molecular mass for globular proteins. In order of increasing mass the proteins are: cytochrome c, ribonuclease A, myoglobin, bovine carbonic anhydrase, β-lactoglobulin, ovalbumin, hemoglobin, bovine serum albumin and transferrin.

Early attempts to determine the stoichiometry of hsp70-protein complexes by a correlation between Kd and log MM were unsuccessful because the substrate proteins are substantially unfolded in their complexes with hsp70, and hsp70s themselves probably deviate from a spherical shape. To illustrate this point, DnaK and the DnaK-RCMLA complex behave as if they had apparent molecular masses of 93 and 156 kDa, respectively, when a correlation of log MM vs. Kd (using the same five standard proteins mentioned above plus bovine serum albumin dimer) was used to estimate molecular masses.
IV. Conclusions

SEC-HPLC on silica-based columns is a fast and versatile technique for the characterization of complexes between molecular chaperones and substrate proteins. The technique provides an invaluable tool for a rapid determination of their hydrodynamic properties and, in combination with SDS-PAGE, can be used to determine the stoichiometry of such complexes. One limitation is the fact that SEC-HPLC can be applied only to the study of stable complexes, i.e. complexes that will not dissociate significantly during the chromatographic run.
References

1. Ellis, R.J., and van der Vies, S.M. (1991) Annu. Rev. Biochem. 60, 321-347.
2. McKay, D. (1993) Advances Prot. Chem. 44, 67-97.
3. Flynn, G.C., Chappell, T.G., and Rothman, J.E. (1989) Science 245, 385-390.
4. Palleros, D.R., Welch, W.J., and Fink, A.L. (1991) Proc. Natl. Acad. Sci. U.S.A. 88, 5719-5723.
5. Palleros, D.R., Shi, L., Reid, K.L., and Fink, A.L. (1994) J. Biol. Chem. 269, 13107-13114.
6. Flaherty, K.M., DeLuca-Flaherty, C., and McKay, D.B. (1990) Nature 346, 623-628.
7. Palleros, D.R., Shi, L., Reid, K.L., and Fink, A.L. (1993) Biochemistry 32, 4314-4321.
8. Palleros, D.R., Reid, K.L., McCarty, J.S., Walker, G.C., and Fink, A.L. (1992) J. Biol. Chem. 267, 5279-5285.
9. Palleros, D.R., Reid, K.L., Shi, L., Welch, W.J., and Fink, A.L. (1993) Nature 365, 664-666.
10. Palleros, D.R., Reid, K.L., Shi, L., and Fink, A.L. (1993) FEBS Lett. 336, 124-128.
11. Hynes, T.R., Kautz, R.A., Goodman, M.A., Gill, J.F., and Fox, R.O. (1989) Nature 339, 73-76.
12. Ikeguchi, M., and Sugai, S. (1989) Int. J. Peptide Protein Res. 33, 289-297.
13. Fuchs, S., Cuatrecasas, P., and Anfinsen, C.B. (1967) J. Biol. Chem. 242, 4768-4770.
14. Ackers, G.K. (1970) Advances Prot. Chem. 24, 381-383.
15. Corbett, R.J.T., and Roche, R.S. (1984) Biochemistry 23, 1888-1894.
16. Tal, M., Silberstein, A., and Nusser, E. (1985) J. Biol. Chem. 260, 9976-9980.
17. Schlossman, D., Schmid, S.L., Braell, W.A., and Rothman, J.E. (1984) J. Cell Biol. 99, 723-733.
18. Antonino, L.C., Kautz, R.A., Nakano, T., Fox, R.O., and Fink, A.L. (1991) Proc. Natl. Acad. Sci. U.S.A. 88, 7715-7718.
Methods for Collecting and Analyzing Attenuated Total Reflectance FTIR Spectra of Proteins in Solution*
Keith A. Oberg and Anthony L. Fink
Department of Chemistry and Biochemistry, University of California, Santa Cruz 95064

I. Introduction
Attenuated Total Reflectance FTIR (ATR-FTIR†) is a method that has been applied by a number of workers for the study of protein conformation. ATR has been used for monitoring adsorption of proteins or blood components to surfaces (1,2), and for the structural analysis of proteins dried onto an IRE (thin film) (3,4). It has also been used for exploring the effects of solution conditions on the structure of proteins irreversibly adsorbed to an IRE (5,6,7), and has been shown to be useful for studying the secondary structure and ligand binding properties of membrane proteins (8,9). To date, there have been no published studies using ATR-FTIR to measure the spectra of just the protein in (bulk) solution. We have found that such solution spectra can be obtained by subtracting the contribution of denatured material irreversibly bound to the IRE surface from that of bulk and adsorbed protein. The strong interactions between polypeptides and IRE materials that immobilize proteins on IRE surfaces have a deleterious effect on the structure of these molecules. The characteristics of the adsorption process and its effects on protein structure will be discussed elsewhere (10).

* This work was supported by a grant from the National Science Foundation.
† Abbreviations used are as follows. FTIR: Fourier transform infrared spectroscopy; ATR: attenuated total reflectance; IRE: internal reflection element; SATR: solution ATR-FTIR; FSD: Fourier self-deconvolution; PLS: partial least-squares analysis; PRESS: prediction residual sum of squares from PLS; SECV: standard error of calibration values from PLS; PLS1: PLS analysis in which each component is predicted independently; PLS2: PLS analysis in which all components are predicted simultaneously.
II. Materials: Protein Solutions
Proteins were purchased in the purest form available from Sigma, Worthington or Biocell Laboratories, and used without further purification. Proteins used for the generation of PLS basis sets were chosen to represent a wide range of structural motifs. Proteins used were as follows: Sigma: Carbonic Anhydrase (C3934), Concanavalin A (C7275), Cytochrome C (C7552), Insulin (I3505), β-Lactoglobulin (L7880), Myoglobin (M1882), Papain (P4762), Trypsin Inhibitor (bovine pancreas, T0256), Ubiquitin (U6253); Worthington: Chymotrypsinogen A (5630), Lysozyme (2931), Ribonuclease A (3433); Biocell: IgG (goat anti-rabbit). To prepare protein solutions, 15-25 mg of each protein were dissolved shortly before use in sufficient 20 mM sodium phosphate, pH 7.0, to give a 30 mg/ml solution. 100 µl of the resulting solutions were diluted to 1 ml to make 3 mg/ml solutions. Samples that had visible precipitate after dissolution in buffer (carbonic anhydrase, concanavalin A, and insulin) were centrifuged for 2 minutes before use.
III. Methods for Data Collection and Structure Analysis

The contribution of water absorption in solution spectra can be more than 99% of the total signal. The technological sophistication and high sensitivity of modern FTIR instruments make it possible to extract protein spectra from solution data; however, extreme care must be exercised to ensure accuracy. Frequently FTIR data processing is not done with sufficient rigor, and hence flawed analyses have appeared in print. As a part of this work, a careful exploration of the various aspects of data collection, spectral processing, and structural analysis was performed in order to develop a reproducible and reliable protocol. The complete protocol is presented in the following sections.

A. Data collection
A modified out-of-compartment IRE holder (SPECAC) that could be configured as a 125 µl flow cell was used for this study. The IREs used here measure 72 x 10 x 6 mm; they have a 45° angle of incidence and 7 internal reflections. Interferograms were collected on a Nicolet 800 FTIR spectrometer equipped with a liquid nitrogen cooled MCT detector. For each spectrum, 1000 interferograms were co-added at 4 cm⁻¹ resolution (total collection time: 6 minutes). Duplicate data sets were collected on Ge and ZnSe IREs. For each spectrum, 500-1000 µl of a 3 or 30 mg/ml protein solution (20 mM phosphate, pH 7.0) were used. The protocol developed for the collection of solution spectra is summarized in Table I.
Table I. The general protocol for collection of bulk and adsorbed protein spectra

Step | Procedure | Spectrum | Spectral components
1 | Assemble & align flow cell | - | -
2 | Collect 3 spectra of empty flow cell | Background | Empty cell, spectrum of gasket
3 | Fill with buffer | Buffer | Water (+ buffer components)
4 | Remove buffer | - | -
5 | Load protein solution (allow to adsorb, 1-5 min) | Total | Adsorbed & bulk protein, buffer
6 | Remove protein solution | - | -
7 | Flow 2 ml buffer once through cell (≈10 sec) | Flush | Tightly adsorbed protein, buffer
8 (opt.) | Soak cell in soap 5-10 minutes; rinse with >100 volumes H2O | Wash | Irreversibly adsorbed protein, water
9 | For additional spectra, steps 5-8 were repeated | - | -

B. Pre-processing
All data manipulation was performed with Lab Calc or GRAMS/386 (Galactic Industries). Interferograms were Fourier transformed using the Mertz method (11) with medium Norton-Beer apodization. Spectra were converted to absorbance by ratioing against an appropriate background spectrum (Table I). Water vapor spectra ("vapor") were generated by ratioing two background spectra; one background was collected with the instrument open to the atmosphere, which produced strong absorption bands. The high intensity of the vapor spectrum was beneficial in that it served to minimize the increase of noise in protein spectra due to vapor subtraction. The contribution of water vapor in "buffer," "total," and "flush" spectra (Table II) was subtracted automatically by an Array Basic (Galactic Industries) program that optimized the scaling factor, s (Equation 1), using the same type of linear search described by Powell et al. (12).

$$\text{original} - s\,(\text{subtrahend}) = \text{corrected spectrum} \tag{1}$$
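The kind of linear search used to optimize s in Equation 1 can be sketched in Python. This is an illustrative reconstruction, not the authors' Array Basic program: the step-reduction factor (10), the stopping criterion, the function names and the synthetic test spectra are all assumptions, and the goodness function here is a plain residual standard deviation rather than the second-derivative measure used for the real vapor subtraction. It follows the behavior described in the caption of Figure 1: start at s = 0, step forward while the goodness parameter improves, then back up and continue with smaller steps.

```python
import math
import statistics


def optimize_scale(goodness, step=1.0, shrink=10.0, min_step=1e-4, max_iter=10000):
    """Search for the s >= 0 that minimizes |goodness(s)|.

    The search starts at s = 0 and advances by `step` while |g| improves.
    When a trial step makes |g| worse, the search backs up by one step
    (two steps behind the failed trial) and continues with a smaller step.
    `shrink` and `min_step` are assumed values, not taken from the source.
    """
    s = 0.0
    g = abs(goodness(s))
    best_s, best_g = s, g
    for _ in range(max_iter):
        if step < min_step:
            break
        s_next = s + step
        g_next = abs(goodness(s_next))
        if g_next < g:
            s, g = s_next, g_next
            if g < best_g:
                best_s, best_g = s, g
        else:
            s = max(s - step, 0.0)   # back up behind the current position
            g = abs(goodness(s))
            step /= shrink           # continue forward in smaller steps
    return best_s


def subtract(original, subtrahend, s):
    """Equation 1: corrected = original - s * subtrahend, point by point."""
    return [o - s * b for o, b in zip(original, subtrahend)]


def demo():
    """Synthetic check: a 'sample' that contains exactly 0.97 x the 'vapor' pattern."""
    wavenumbers = range(1800, 1951)
    vapor = [math.sin(0.9 * w) for w in wavenumbers]
    sample = [0.97 * v for v in vapor]

    def goodness(s):
        return statistics.pstdev(subtract(sample, vapor, s))

    return optimize_scale(goodness)


if __name__ == "__main__":
    print(f"optimized s = {demo():.4f}")
```

Because the search only needs a callable that scores a trial value of s, the same routine could be reused with a slope-difference score for liquid water subtraction.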
Briefly, optimization of s was performed iteratively. This is shown graphically in Figure 1. This approach can be used for any subtraction where a "goodness of s" parameter, g, can be defined and evaluated. At each step, g was evaluated for s_n (g_n). The subtraction was then performed using s_{n+1} = s_n + i (where i is an arbitrary increment factor), and g was evaluated again (g_{n+1}). If |g_{n+1}| ...
[Figure 1: schematic of successive trial values of the scaling factor s approaching the correct value x.]
Figure 1. Graphical representation of the scaling factor (s) optimization method used for the subtraction of water vapor and liquid water signals from protein spectra. x represents the "correct" value of s. The algorithm begins at s_1 = 0, and increases s by steps of size 1 until the goodness parameter g_{n+1} is worse than g_n. At this point, the algorithm "backs up" and starts increasing s in smaller steps.

For optimizing s in water-vapor subtraction, the second derivatives of both the vapor and sample spectra were used and g was defined as the standard deviation of the subtraction result in the region between 1950 and 1800 cm⁻¹. The optimum s was then used for subtraction of the original spectra. For liquid water, g was defined as the difference in the slopes resulting from linear extrapolations of the regions from 1990 to 1900 and 1880 to 1790 cm⁻¹ after Powell et al. (12) (region 1), or from 2242 to 2170 and 2050 to 1980 cm⁻¹ (region 2). Subtractions used to generate spectra of bulk and adsorbed protein were performed as shown in equation 1. Water vapor was subtracted before liquid water to eliminate extrapolation errors in the 1880 to 1790 cm⁻¹ region. In some cases, protein interaction with the IRE appeared to generate an unassigned peak in the 1800 to 1700 cm⁻¹ region; s optimization using region 2 provided acceptable results in such cases.

Table II. Subtractions used for obtaining spectra of different protein fractions from SATR data of > 3 mg/ml protein solutions

original | subtrahend | result | typical s range
flush | buffer | adsorbed | 0.985 - 0.998
total | buffer | combined | 0.960 - 0.998
total | flush | bulk | 0.970 - 0.999

C. Secondary structure analysis methods

1. Curve fitting

Two types of secondary structure analysis were used on the spectra collected in this study. The classic curve-fitting method (13) for analysis of the amide I band was performed in two stages. The first step in the analysis is band narrowing, which allows visualization of component bands, using derivatization and Fourier self-deconvolution (FSD) (14). The practical considerations for choosing FSD parameters have been discussed in detail by Griffiths and Pariente (14) and Kauppinen et al. (15). Since the potential for the creation of artifacts is high in FSD, the parameters used were verified in the following manner. First, FSD spectra did not have any peaks in the region adjacent to the amide I band (1700-1800 cm⁻¹). Second, they were compared with second derivative spectra. Proper choice of parameters was indicated by a match of the number and positions of peaks in these two spectra. The best parameters for FSD of ATR-FTIR spectra were found to be γ ≈ 3 and f = 0.38-0.42, where γ is a band-narrowing factor and f is a filter factor (14).

It is becoming a common practice to include bands in curve fitting if they are absent in the second derivative but resolved in the FSD. However, this completely defeats the use of the second derivative for validation of FSD parameters and should be strictly avoided. Bands observed in this manner result from a poor choice of deconvolution or differentiation parameters.

FSD spectra are frequently curve-fit to obtain an estimate of the secondary structure content of the protein being examined. This is justifiable because, in theory, Fourier self-deconvolution should not affect the relative areas of component bands. In practice, however, it was found that this assumption is not valid. The relative areas of bands at the edges of the amide I region are increased by FSD. Therefore the following procedure was used for structural analysis. The process was initiated by fitting the FSD spectrum. Initially, band parameters were chosen manually based on peaks appearing in both the second derivative and FSD spectra. After initial parameters had been entered, iteration was begun. The FSD fit is not very sensitive to variations in initial values for component peaks because bands in the FSD spectrum are well resolved. All band parameters, including positions, heights, widths, and % Lorentzian, were optimized simultaneously using a modified version of the program CURVEFIT.AB supplied with Lab Calc. The iteration process typically converged rapidly during this stage (<300 iterations). The final parameters from this fit were then stored. Next, the "raw" amide I spectrum was fit after baseline correction.
The initial parameters for fitting the raw spectrum were taken from the FSD results. Before iteration was begun, the FSD bands were widened by γ, and decreased in intensity to give the best possible agreement with the raw spectrum. Because the component bands in the amide I region are not well resolved, the fit was first optimized with peak positions fixed to avoid drift. Typically 500 iterations with peak positions fixed were sufficient to achieve good agreement between the raw and reconstructed spectra. To further optimize (and verify) the quality of the fit, all parameters, including band positions, were allowed to vary for 100-200 additional iterations.
The change in the positions of major bands for this final stage was usually less than 1 cm⁻¹. In the rare case where changes were larger, the FSD parameters were judged to be inappropriate and the entire analysis process was repeated. Finally, the secondary structure content was evaluated by calculating the relative areas of all component bands. Such quantification of structural content is based on the assumption that the extinction coefficients for all structural types are similar. The band assignments used have been adapted from other studies (13, 16, 17).

2. Multivariate statistical methods

The partial least squares methods (PLS1 and PLS2) used for analysis of secondary structure have been discussed in detail by Haaland and Thomas (18). The software package PLSPlus version 2.1G for GRAMS/386 was purchased from Galactic Industries. The PLS2 algorithm, because of its speed, was used to determine optimal parameters, after which PLS1 was used for prediction of protein secondary structure. For PLS solution basis sets, "bulk" spectra were generated as described above. Standard error of calibration values (SECV) were determined from prediction residual sum of squares (PRESS) analyses of various permutations of the amide I, II, and III bands (always including amide I) from both Ge and ZnSe spectra. After determination of the effects of different types of normalization on the results, these bands were individually normalized to an area of 100 absorbance units before PLS1 training. A PRESS analysis was also performed by removing one spectrum at a time from the basis set. This provided some measure of the consistency of the preprocessing methods and the effect of noise on the analysis (3 mg/ml spectra were included). The SECV values from this analysis were still relatively large (≈6%).
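Two of the bookkeeping steps mentioned above, normalizing a band to a fixed area and summarizing leave-one-out prediction errors, can be illustrated with a small Python sketch. It does not reproduce the PLSPlus software or the PLS regression itself. The function names are hypothetical, the trapezoidal integration is an assumption, and the formula SECV = sqrt(PRESS / n) is a commonly used definition adopted here as an assumption, since the text does not spell it out.

```python
import math


def band_area(x, y):
    """Trapezoidal area under a band; x may run in either direction."""
    return sum(
        0.5 * (y[i] + y[i + 1]) * abs(x[i + 1] - x[i]) for i in range(len(x) - 1)
    )


def normalize_band(x, y, target_area=100.0):
    """Scale the band so that its integrated area equals target_area."""
    factor = target_area / band_area(x, y)
    return [v * factor for v in y]


def press(actual, predicted):
    """Prediction residual sum of squares over left-out samples."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def secv(actual, predicted):
    """Standard error from PRESS, assumed here to be sqrt(PRESS / n)."""
    return math.sqrt(press(actual, predicted) / len(actual))


if __name__ == "__main__":
    # Toy triangular band over the amide I window; values are illustrative only.
    x = [1700.0, 1650.0, 1600.0]
    y = [0.0, 1.0, 0.0]
    print(band_area(x, normalize_band(x, y)))
    # Made-up "true" and leave-one-out predicted helix percentages.
    print(secv([10.0, 20.0], [13.0, 16.0]))
```

In a real analysis the predicted values would come from refitting the PLS model with each protein left out in turn, as described in the text.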
IV. Results and Discussion
To evaluate the usefulness of ATR-FTIR measurements for the quantitative determination of protein secondary structure, both PLS and classic curve-fitting analyses were performed. The results of these analyses are compared with values calculated from X-ray structures in Table III. The curve-fitting results reflect the inability of FTIR to resolve disordered and helical bands in ¹H2O. For the proteins where the estimate of turn is high, there was typically a band at 1660-1663 cm⁻¹ that was assigned as turn but which may reflect the presence of helix or irregular loop. In several high-β proteins, there was a band at 1648-1650 cm⁻¹ which was assigned to disordered structure. Curve fitting was only consistently accurate for determining β-sheet content in ¹H2O. Because of this, it can be concluded that PLS is the superior analysis method for this solvent.
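The curve-fitting estimate discussed above reduces, in its last step, to summing fitted band areas by structural assignment and expressing each sum as a percentage of the total amide I area. A minimal Python sketch of that step follows. The function name is hypothetical, and the band positions and areas in the example are invented for illustration; only the assignments of a 1660-1663 cm⁻¹ band to turn and a 1648-1650 cm⁻¹ band to disordered structure come from the text.

```python
def structure_content(bands):
    """Percent content per structural class from fitted band areas.

    `bands` is a list of (assignment, position_cm1, area) tuples.
    Assumes, as the text does, similar extinction coefficients for all classes.
    """
    total = sum(area for _, _, area in bands)
    content = {}
    for assignment, _, area in bands:
        content[assignment] = content.get(assignment, 0.0) + 100.0 * area / total
    return content


if __name__ == "__main__":
    # Invented example bands (assignment, position in cm^-1, fitted area).
    example = [
        ("extended", 1632.0, 30.0),
        ("disordered", 1649.0, 20.0),
        ("turn", 1661.0, 25.0),
        ("extended", 1680.0, 25.0),
    ]
    for name, percent in sorted(structure_content(example).items()):
        print(f"{name}: {percent:.1f}%")
```

The sketch makes the limitation noted in the text concrete: if a band near 1660 cm⁻¹ is labeled turn when it actually arises from helix or irregular loop, its whole area is credited to the wrong class.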
Table III. Results from a PLS2 PRESS analysis of a bulk spectra SATR basis set
% helix % extended cone. Method mg/ml X-Ray' SATR X-Ray SATR Myoglobin 30 PLS^ 88 70.0 0 -6.9^ CF' 78.7 16.7 iisuiiii --••-•pj^ --j- - - - - - - —Protein
Ciyt'c Tyso^e Papain BPfl """"*RNas^*A """"Carboiiic Anhydrase "'Ublquitin 'iChymottyp sinqgen p'Uctoglobulin ''"inTmuno^' globulin Y Conca^ navalinA
CF PLS CF 30 PLS CF 30 PLS CF 30 PLS CF *3'0 PLS CF 3Q--"pLg CF "30 """PLS CF "30 "PLS CF 30 *PLS CF 30 PLS"'" CF '30* "PLS CF 30
49
481"' 45.0 46 48"8'"" 37.6 28 liJ" 17.6 26 24"0'"" 16.4 23 i7"5*'" 23.4 [^ - yf 29.9 12 17**7"' 0 _ 12 '"•i5;9'" 0 "i •"•Yi;6"' 15.5 3 uY" 0 "3 8;4"" 15.9
11 19 29 45 46 43 33 49 51 67 65
R8 20.3 '2L7 39.0 353 29.4 318 45.0 '4i"'o 47.1 J^'g 41.6 43**2 34.7 Jye 43.0 49*^8 56.0 48"2 70.5 siJ 53.8
%tum X-Ray SATR 7 18.9 4.5 -'22 23 18 16 21 25 32 23 15 19 22
22'2" 19.8 T9"i" 23.1 277" 25.5 25*7*" 37.6 25"3"" 29.3 25'4""' 27.5 215*** 32.2 214**' 30.1 24"2*' 28.3 2I9" 33.3 18^9 " 28.5
a) X-Ray values were tabulated from data in Levitt & Greer (19) where possible. Other values were taken from the original articles.
b) PLS analysis was performed using amide I and II bands that had been normalized independently. The analysis was performed using 3 factors. All spectra of a given protein were rotated out for each step of the PRESS rotation. SECV values for this PLS analysis were α-helix: 9.6%, extended structure: 9.4%, and turn: 7.5%.
c) Curve fitting results.
d) Negative values for extended structure were not observed in all PRESS runs. Myoglobin was consistently identified as an outlier by the validation routines that are part of the PLSPlus program.
e) The S/N ratio of insulin was insufficient for curve fitting analysis.
[Figure 2: spectra plotted against wavenumber, 1750 to 1500.]

Figure 2. Comparison of 30 mg/ml carbonic anhydrase spectra collected on ZnSe (solid) and Ge (dashed) IREs. The spectra are normalized to the same height.
To keep the optimization of PLS parameters as simple as possible, only three types of structure (helix, extended, and turn) were used in constructing the basis sets. With the eventual addition of ordered and disordered helix, 3₁₀ helix, and "other" components (20), it is likely that the standard error and the level of structural detail available from the method can be improved.

PLS structural analysis of FTIR spectra of proteins in H₂O is commonly done using the amide I (1700-1600 cm⁻¹) and amide II (1600-1500 cm⁻¹) regions. The use of these regions produced the best results in this study. Inclusion of the amide III region (1350-1200 cm⁻¹) increased the error of prediction. This suggests that the amide III band shape correlates poorly with secondary structure, which contrasts with the success reported using this region in curve-fitting analyses (21,22).

SECV values improved when spectra that were poorly fit by PLS2 were removed, which suggests that the spectra of these proteins could not be modeled well by the PLS algorithm. For some runs, myoglobin and/or insulin were omitted. Since these proteins have a high α-helical content, it was expected that the remaining proteins, which all have significant amounts of β structure, would not be able to accurately predict their structures. It is noteworthy that the PLS2 algorithm frequently predicted 71±9% helix and 12±10% extended structure for myoglobin. These values agree with other published values (3, 13, 20), which is evidence that this protein has an anomalous FTIR spectrum.

Finally, it is noteworthy that protein bulk spectra collected with Ge and ZnSe were never identical (Figure 2). However, basis sets containing spectra from only a Ge IRE, only a ZnSe IRE, and both Ge and ZnSe gave essentially identical SECV values. This indicates that the differences between spectra collected using Ge and ZnSe IREs were systematic, could be modeled to some extent by the PLS algorithm, and did not result from protein interactions with the IREs.

The standard errors for the optimal analysis parameters here are similar to those obtained using PLS analysis of transmission FTIR spectra. For example, Dousseau and Pezolet (20) found average standard deviations of 11.7% for α-helix, 6.6% for extended structure (β), and 6.7% for turn + other structures using a 3 component analysis.
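The PRESS rotation described in the Table III footnotes, in which all spectra of one protein are left out at each step, can be sketched as follows. This is only an illustration and not the PLSPlus implementation: the function names are hypothetical, a trivial mean predictor stands in for the PLS model, the sample values are invented, and SECV is assumed here to be the square root of PRESS divided by the number of predictions.

```python
import math

def secv_leave_one_protein_out(samples, fit, predict):
    """Cross-validate by holding out every spectrum of one protein at a time.

    samples: list of (protein_name, spectrum, true_percent) tuples.
    fit:     callable taking a list of (spectrum, true_percent) pairs.
    predict: callable taking (model, spectrum) and returning a percentage.
    Returns the SECV, assumed here to be sqrt(PRESS / number of predictions).
    """
    proteins = sorted({name for name, _, _ in samples})
    press = 0.0  # predicted residual error sum of squares
    for held_out in proteins:
        train = [(x, y) for name, x, y in samples if name != held_out]
        test = [(x, y) for name, x, y in samples if name == held_out]
        model = fit(train)
        for x, y in test:
            press += (predict(model, x) - y) ** 2
    return math.sqrt(press / len(samples))

# Stand-in model (NOT PLS): always predict the mean of the training values.
def fit_mean(train):
    return sum(y for _, y in train) / len(train)

def predict_mean(model, spectrum):
    return model

# Invented helix percentages; protein "A" has two spectra, both held out together.
toy_samples = [("A", None, 10.0), ("A", None, 12.0), ("B", None, 20.0), ("C", None, 30.0)]
secv = secv_leave_one_protein_out(toy_samples, fit_mean, predict_mean)
print(f"SECV = {secv:.2f} %")
```

Passing the model in as a pair of callables keeps the rotation logic separate from the regression method, so a real PLS routine could be substituted without changing how proteins are held out.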
V. Conclusions
It has been shown here that ATR-FTIR spectra of proteins in solution can be collected and used successfully for secondary structural analysis. Relatively accurate bulk protein spectra can be obtained by subtracting the spectrum of protein irreversibly adsorbed to the surface of an IRE. Both curve-fitting and multivariate statistical methods can be applied to ATR-FTIR spectra for structural analysis. However, because disordered and helix bands may not be well resolved, band assignments in curve-fitting analysis can be difficult. PLS analysis, on the other hand, can determine the secondary structure of proteins from H₂O ATR-FTIR spectra with higher accuracy because it is not necessary to assign bands using this method.
References
1. Wasacz F. M., Olinger J. M., Jakobsen R. J. (1987) Biochemistry, 26(5), 1464-70.
2. Chittur K. K., Fink D. J., Leininger R. I., Hutson T. B. (1986) Journal of Colloid and Interface Science, 111(2), 419-433.
3. Goormaghtigh E., Cabiaux V., Ruysschaert J. M. (1990) European Journal of Biochemistry, 193(2), 409-20.
4. Uchida K., Harada I., Nakauchi Y., Maruyama K. (1991) FEBS Letters, 295(1-3), 35-8.
5. Jacobsen R. J., Wasacz F. M. (1987) In Proteins at Interfaces, J. L. Brash, T. A. Horbett Eds. American Chemical Society Books, Washington DC, pp. 339-361.
6. Singh B. R., Fuller M. P. (1991) Applied Spectroscopy, 45(6), 1017-1021.
7. Singh B. R., Wasacz F. M., Strand S., Jakobsen R. J., DasGupta B. R. (1990) Journal of Protein Chemistry, 9, 705-13.
8. Baenziger J. E., Miller K. W., Rothschild K. J. (1993) Biochemistry, 32(20), 5448-54.
9. Buchet R., Varga S., Seidler N. W., Molnar E., Martonosi A. (1991) Biochimica et Biophysica Acta, 1068(2), 201-16.
10. Oberg and Fink (1994) The Application of Attenuated Total Reflection FTIR to the Study of Protein Structure In Solution. Ms. in preparation.
11. Mertz L. (1965) Transformations in Optics, Wiley, New York.
12. Powell J. R., Wasacz F. M., Jacobsen R. J. (1986) Applied Spectroscopy, 40(3), 339-344.
13. Byler D. M., Susi H. (1986) Biopolymers, 25(3), 469-87.
14. Griffiths P. R., Pariente G. L. (1986) Trends in Analytical Biochemistry, 5, 209-215.
15. Kauppinen J. K., Moffatt D. J., Mantsch H. H., Cameron D. G. (1981) Applied Spectroscopy, 35(3), 271-276.
16. Surewicz W. K., Mantsch H. H. (1988) Biochimica et Biophysica Acta, 952(2), 115-30.
17. Wilder C. L., Friedrich A. D., Potts R. O., Daumy G. O., Francoeur M. L. (1992) Biochemistry, 31(1), 27-31.
18. Haaland D. M., Thomas E. V. (1988) Analytical Chemistry, 60, 1193-1221.
19. Levitt M., Greer J. (1977) Journal of Molecular Biology, 114(2), 181-239.
20. Dousseau F., Pezolet M. (1990) Biochemistry, 29(37), 8771-9.
21. Singh B. R., Fuller M. P., Schiavo G. (1990) Biophysical Chemistry, 36(2), 155-66.
22. Singh B. R., Fuller M. P., DasGupta B. R. (1991) Journal of Protein Chemistry, 10, 637-49.
SECTION VIII NMR Analysis of Peptides and Proteins
¹⁹F NMR Studies of Fluorinated Sugars Binding to the Glucose and Galactose Receptor
Linda A. Luck
Laboratory of Molecular Biophysics, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27709
I. Introduction
Protein-ligand interactions and molecular recognition are frequently defined by hydrogen bonding. These bonds are critical for conferring specificity, ensuring correctness of fit of substrates, inducing conformational changes and enhancing the stability of complexes. Carbohydrate-protein interactions present a unique system to study the role of hydrogen bonds in ligand binding due to the large number of hydroxyls in these molecules. Although x-ray crystallography has provided valuable insights into hydrogen bonding interactions, these studies are constrained by the required crystal environment. A complementary approach is nuclear magnetic resonance (NMR), which allows the biochemist to observe dynamic behavior and relate structural studies to in vivo cellular processes. The ultimate advantage of this method is the ability to study macromolecular interactions in solution, the native environment of the proteins.

¹H has been the traditional nucleus for biological studies in the past decade, and researchers have successfully utilized multidimensional NMR methods to assign spectra for proteins up to 30 kDa.¹,² Other nuclei such as ¹³C, ¹⁵N, ¹⁷O and ³¹P have been used, but their lower gyromagnetic ratios and/or natural abundances reduce sensitivity unless expensive isotopic enrichment methods are employed. An alternative nucleus for studies of biological materials is ¹⁹F.³

¹⁹F NMR is a powerful technique due to several unique features of the fluorine nucleus. This spin 1/2 nucleus presents a probe that is 0.83 times as sensitive as ¹H with a 100-fold wider frequency range. The chemical shift is influenced by van der Waals packing around the nucleus, and if the fluorine is located on an aromatic ring it is additionally sensitive to the electronic distribution of the adjacent π system. Hence this nucleus is a unique probe for environmental changes in proteins.
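As a rough check on the sensitivity figure quoted above, the short sketch below estimates the ¹⁹F sensitivity relative to ¹H and the ¹⁹F observation frequency on a nominal 500 MHz instrument. It assumes the IUPAC frequency ratio for ¹⁹F (about 94.094% of the ¹H frequency) and the usual approximation that, for equal numbers of spin 1/2 nuclei at the same field, sensitivity scales with the cube of the gyromagnetic ratio. The function names are illustrative only.

```python
# IUPAC frequency ratio of 19F, as a percentage of the 1H frequency (assumed value).
XI_19F_PERCENT = 94.094011

def larmor_mhz(proton_mhz, xi_percent):
    """Resonance frequency of a nucleus on a magnet with the given 1H frequency."""
    return proton_mhz * xi_percent / 100.0

def relative_sensitivity(xi_percent):
    """Sensitivity relative to 1H at constant field, assuming it scales as gamma cubed."""
    return (xi_percent / 100.0) ** 3

f19_mhz = larmor_mhz(500.0, XI_19F_PERCENT)     # nominal 500 MHz spectrometer
f19_sens = relative_sensitivity(XI_19F_PERCENT)
print(f"19F frequency: {f19_mhz:.1f} MHz, relative sensitivity: {f19_sens:.2f}")
```

Under these assumptions the estimate comes out close to 0.83 and to 470 MHz, consistent with the values given in the text.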
Another advantage of the fluorine nucleus is the lack of fluorine in biological materials, which eliminates background signals in the spectra. To characterize protein-ligand interactions by ¹⁹F NMR, the ligand or the protein can be labeled with fluorine to produce spectra without overwhelming complexity. Proteins expressed in E. coli and tissue culture have been labeled with fluorine by biosynthetic incorporation of fluoro analogs of tryptophan, phenylalanine and tyrosine. Conformational properties of receptor proteins⁴, transmembrane proteins⁵ and signalling proteins⁶ have been determined by this method. Fluorodeoxy sugars have been utilized to examine enzyme specificities and mechanisms of glycogen phosphorylase and phosphoglucomutase.⁷,⁸

Substitution of fluorine for the sugar hydroxyl is sterically conservative with respect to both the bond length and the van der Waals radius. The sugar hydroxyl in an optimal bonding situation can form two hydrogen bonds as a proton acceptor and one hydrogen bond as a proton donor. Fluorine in a fluorodeoxy sugar, on the other hand, can participate only as an acceptor. Measurement of the binding capacity of a series of fluorodeoxy sugars with individual substitutions for each hydroxyl provides insight into the role of each hydroxyl in the carbohydrate-protein complex.

The D-glucose and D-galactose receptor is a 33 kDa periplasmic binding protein and the first component in the distinct chemosensory and transport pathways for these sugars. X-ray crystallographic studies have provided detailed information concerning the overall structure of this receptor with the sugar in the binding pocket.⁹ The receptor consists of two domains, each with an α/β motif, engulfing a sugar molecule between the domains. The overall protein structure and residues involved in sugar binding are shown in Figure 1. Crystallographic analysis shows the β anomers of the sugars positioned identically in the cleft even though D-glucose has a binding affinity 2 times tighter than D-galactose. Although there is no direct evidence from crystallography or equilibrium dialysis, it has been inferred that both anomers bind.¹⁰ NMR methods have the unique ability to observe distinct resonances from both anomers of the sugars in solution; thus questions concerning the individual anomers can be addressed.
The methods discussed in the text illustrate how NMR can be used to investigate the role of each hydroxyl in the carbohydrate/protein complex and how to determine anomeric specificity in binding of sugars to large molecular weight proteins.

Figure 1. Structure of the E. coli D-glucose and D-galactose receptor.⁹ Shown is the (A) α-carbon backbone structure of the receptor with β-D-glucose in the sugar binding pocket and Ca(II) ion in the metal binding site. (B) illustrates the residues that are involved in binding the sugar in the pocket.
II. Materials and Methods
Receptor was overexpressed in E. coli NM303 cells by means of the pSF5 plasmid as previously described.¹¹ After harvesting, standard osmotic shock procedures were used to lyse the E. coli outer membrane to release the receptor from the periplasmic space. The supernatant liquid containing receptor was dialyzed using a 12 kDa cutoff membrane against unfolding buffer to release the bound substrates by partially unfolding the protein. This buffer contained 3.0 M guanidine HCl, 100 mM KCl, 20 mM EDTA, 10 mM Tris-HCl pH 7.1 and 0.5 mM phenylmethanesulfonyl fluoride. The receptor was renatured by 4 changes of buffer containing 100 mM KCl, 10 mM Tris-HCl pH 7.1, 0.5 mM CaCl₂. Samples were concentrated to yield 0.5-2 mM aliquots for NMR studies. Receptor purity by SDS-PAGE was 98%.

NMR samples contained 0.6 ml receptor (0.5-2.0 mM) dissolved in refolding buffer (vide supra) with 10% D₂O. One-dimensional ¹⁹F NMR spectra were obtained at 470 MHz on a General Electric GN 500 spectrometer fitted with a 5 mm ¹⁹F probe. Parameters included 16K data points, 3.0 second relaxation delay and 25 Hz line broadening for processing spectra. T₁ relaxation times were measured by the inversion recovery method. The two-dimensional ¹⁹F NOESY NMR spectrum was obtained on a Varian Unity Plus 500 using the standard Varian pulse sequence. A total of 128 experiments with a mixing time of 0.3 seconds were performed with collection of 1024 data points. Quadrature detection in the second dimension was obtained through the method of States and Haberkorn.¹² ¹³C{¹H} NMR spectra were obtained on a Varian 500 Unity Plus fitted with a 10 mm broadband probe.

Fluorodeoxy glucose compounds and ¹³C labeled sugars were purchased from Sigma Chemical Company. Fluorodeoxy galactose compounds were a generous gift from Dr. Stephen Withers, University of British Columbia.
III. One- and two-dimensional NMR Spectroscopy
Fluorinated sugars bound to receptor exhibit a downfield shift in the ¹⁹F NMR spectrum, as summarized in Table 1. These shifts correspond to a deshielding of the fluorine nucleus upon sequestering of the sugar in the binding pocket of the receptor and can reflect desolvation and/or closer van der Waals interactions of the sugar upon binding. Our data suggest that the magnitude of the shift may be correlated to the strength of the binding interaction. 2-Fluoro-2-deoxy sugars show no bound peaks upon addition to the receptor. This suggests that the hydrogen bonds made by the OH in this position are critical for interaction with the receptor and that the fluorine substitution prevents binding of the sugars. These results have been corroborated by titrating calorimetry. However, when the OH in position 4 is replaced with fluorine, analogs of both glucose and galactose exhibit binding, as shown by the large chemical shift of the bound fluorine resonance in the NMR spectrum. By inspection of the crystallographic data one would expect the interactions to be flexible about the OH4 position in the sugars, since bound water is involved in the hydrogen bonding network and it is the site for epimer recognition.
Table 1. ¹⁹F NMR Chemical Shifts for Fluorodeoxy Sugars Free and Complexed to the Receptor*

Sugar                              free δ (ppm)   bound δ (ppm)   difference Δδ (ppm)
1-fluoro-1-deoxy-α-glucose         -75.6
1-fluoro-1-deoxy-β-glucose         -69.2          -68.6           0.6

2-fluoro-2-deoxy-α-glucose         -121.8
2-fluoro-2-deoxy-β-glucose         -122.0
2-fluoro-2-deoxy-α-galactose       -130.0
2-fluoro-2-deoxy-β-galactose       -130.2

3-fluoro-3-deoxy-α-glucose         -122.8
3-fluoro-3-deoxy-β-glucose         -117.7         -109.5          9.2
3-fluoro-3-deoxy-α-galactose       -125.8         -120.0          5.8
3-fluoro-3-deoxy-β-galactose       -121.7         -112.9          8.8

4-fluoro-4-deoxy-α-glucose         -120.8         -117.9          2.9
4-fluoro-4-deoxy-β-glucose         -122.8         -117.0          5.8
4-fluoro-4-deoxy-α-galactose       -142.0         -131.6          10.4
4-fluoro-4-deoxy-β-galactose       -139.5         -127.5          12.0

* Measurements included 100 mM KCl, 10 mM Tris pH 7.1 and 10% D₂O. Temperature was 2°C. Bound samples included 1.0 mM receptor.
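The difference column of Table 1 is simply the bound shift minus the free shift, so a positive Δδ corresponds to a downfield (deshielded) bound resonance. The sketch below recomputes Δδ for the 1-fluoro-β-glucose and 4-fluoro entries using the values printed in the table; the dictionary keys and function name are illustrative only.

```python
# (free, bound) 19F chemical shifts in ppm, copied from Table 1.
shifts_ppm = {
    "1-fluoro-1-deoxy-beta-glucose": (-69.2, -68.6),
    "4-fluoro-4-deoxy-alpha-glucose": (-120.8, -117.9),
    "4-fluoro-4-deoxy-beta-glucose": (-122.8, -117.0),
    "4-fluoro-4-deoxy-alpha-galactose": (-142.0, -131.6),
    "4-fluoro-4-deoxy-beta-galactose": (-139.5, -127.5),
}

def binding_shift(free_ppm, bound_ppm):
    """Return bound minus free shift (ppm), rounded to the table's precision."""
    return round(bound_ppm - free_ppm, 1)

delta = {sugar: binding_shift(free, bound) for sugar, (free, bound) in shifts_ppm.items()}
for sugar, d in delta.items():
    print(f"{sugar:36s} {d:+5.1f} ppm")
```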
Titrating calorimetry studies have determined the K_D of 4-fluoro-4-deoxy-galactose as a mixture of anomers. Substitution at the OH4 position gave a 16-fold increase compared to D-galactose, which suggests that the fluorine substitution has not altered the binding ability to a large degree. The published dissociation rate of galactose, determined by rapid mixing stopped-flow kinetic studies, is 4.5 s⁻¹.¹³ That study assumed that both sugar anomers bind to the receptor with the same specificity. Using ¹⁹F NMR magnetization transfer methods¹⁴ we were able to obtain dissociation rates specifically for 4-fluoro-4-deoxy-β-galactose, which were 261 s⁻¹ at 25°C and 60 s⁻¹ at 2°C (manuscript in preparation). Similar methods can be used with carbon-13 labeled sugars to yield off rates for the unfluorinated species.

Figure 2 shows the titration experiments of 1-fluoro-1-deoxy-α-glucose and 1-fluoro-1-deoxy-β-glucose with receptor at 2°C. The top traces (A and B) show the ¹⁹F NMR spectra of the free sugars in refolding buffer without receptor, on the same chemical shift axis, from two separate experiments. When less than stoichiometric amounts of 1-fluoro-1-deoxy-β-glucose are added to receptor, only the bound form of the sugar is present. The middle spectrum (C) illustrates this and shows a broad resonance at -66.5 ppm, which we assigned to the β-bound form of the sugar. Upon addition of excess 1-fluoro-1-deoxy-β-glucose, a peak corresponding to the free sugar appears along with the bound peak in the spectrum (D). This "free" sugar peak is found at the same chemical shift as the free sugar peak without receptor present but is in fact in exchange with the bound form. This is demonstrated by the broadening of the linewidth of the free peak in spectrum (D). A similar experiment was performed with 1-fluoro-1-deoxy-α-glucose. No bound peak was observed throughout the titration. The data from the two experiments suggest that only the beta form of the 1-fluorodeoxy glucose binds to the receptor.

Figure 2. ¹⁹F NMR spectra of (A) 1-fluoro-1-deoxy-β-glucose and (B) 1-fluoro-1-deoxy-α-glucose at 2°C without receptor. (C) shows 1.0 mM receptor and 0.25 mM 1-fluoro-1-deoxy-β-glucose. Bottom traces show two separate experiments where (D) 2.0 mM 1-fluoro-1-deoxy-β-glucose and (E) 2.0 mM 1-fluoro-1-deoxy-α-glucose were added to 1.0 mM receptor. (Chemical shift axis: -66 to -78 ppm.)

To further investigate this anomeric preference, glucose with ¹³C enrichment of the C-1 position was titrated into the receptor. The enrichment of the carbon increases the sensitivity of the NMR experiment without altering the binding affinity. The results are illustrated in Figure 3. (A) shows the ¹³C NMR spectrum of the free sugar in refolding buffer, illustrating a 100:66 β/α ratio. Limited sugar in the presence of receptor showed only a broad peak in the ¹³C NMR spectrum (B), which we assigned to the beta bound form of the sugar. Due to the excess number of available receptor sites, all of the beta sugar was bound. This in turn disturbed the anomeric equilibrium, causing all the sugar to be converted to the beta form. When a four-fold excess of the sugar was added to the receptor, the ¹³C NMR spectrum (C) shows the "free" and
bound forms of the sugar. The broad peak at 96.5 ppm was assigned to the beta bound form by NMR magnetization transfer techniques. These multinuclear NMR experiments have given conclusive evidence that the receptor has an overwhelming preference for the β-anomers of the unfluorinated and 1-fluoro-1-deoxy glucoses.

Substitution of fluorine for OH4 on the sugars shows two forms of the bound sugar. ¹⁹F NMR spectra of the 4-fluoro-4-deoxy analogs of glucose and galactose are shown in Figure 4. (A) illustrates a two-dimensional NOE exchange spectrum of 4-fluoro-4-deoxy-galactose at 25°C. This two-dimensional technique was used to identify exchanging resonances in the one-dimensional spectrum, which is shown above the map. Off-diagonal peaks on the map unequivocally show the bound peak (-127.5 ppm) to be in exchange with the β anomer (-139.5 ppm).

Relaxation data have also been useful in identifying exchanging resonances and illustrating changes in the environment of the fluorinated sugars. T₁ values from a one-dimensional NMR inversion recovery experiment are as follows for the 4-fluoro-4-deoxy-galactose resonances in the presence of receptor at 25°C: β-bound = 0.53, β-free = 0.52 and α-free = 0.95 seconds. In the absence of receptor the free sugar anomers have T₁ values of 1.6 seconds. The relaxation time of the fluorinated sugar in the pocket is shortened due to the interactions of binding, and the data show that the T₁ values of exchanging peaks are similar.
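For readers unfamiliar with the inversion recovery method, the sketch below uses the standard single-exponential recovery curve, M(t) = M0(1 - 2 exp(-t/T₁)), to show where the signal of each resonance would pass through zero for the T₁ values quoted above. It assumes an ideal 180 degree inversion and ignores the effect of chemical exchange on the recovery; the names are illustrative, not taken from any acquisition software.

```python
import math

def inversion_recovery(t, t1, m0=1.0):
    """Longitudinal magnetization at delay t after an ideal 180 degree inversion."""
    return m0 * (1.0 - 2.0 * math.exp(-t / t1))

def null_time(t1):
    """Delay at which the recovering signal crosses zero: T1 * ln 2."""
    return t1 * math.log(2.0)

# T1 values (seconds) quoted in the text for 4-fluoro-4-deoxy-galactose at 25 degrees C.
t1_seconds = {
    "beta-bound": 0.53,
    "beta-free (with receptor)": 0.52,
    "alpha-free (with receptor)": 0.95,
    "free sugar, no receptor": 1.6,
}

null_times = {label: null_time(t1) for label, t1 in t1_seconds.items()}
for label, t_null in null_times.items():
    print(f"{label:28s} T1 = {t1_seconds[label]:.2f} s, null at {t_null:.2f} s")
```

The nearly identical null times of the β-bound and β-free resonances reflect the similar T₁ values of the exchanging pair noted in the text.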
Figure 3. (A) ¹³C{¹H} NMR spectrum of D-glucose-1-¹³C free in refolding buffer; (B) 0.5 mM receptor and 0.1 mM D-glucose-1-¹³C; and (C) 0.5 mM receptor and 2 mM D-glucose-1-¹³C. The small resonances to the right of the free α-sugar in the bottom trace appear to be natural abundance carbon signals from impurities or fold-over, which are observed after extensive signal averaging. (Chemical shift axis: 97.0 to 93.0 ppm.)
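The 100:66 β/α ratio quoted for the free sugar in Figure 3(A) can be converted into anomer fractions with a one-line calculation, sketched below. The function name is illustrative, and the calculation assumes that the peak ratio reflects the relative populations of the two anomers.

```python
def anomer_fractions(beta_area, alpha_area):
    """Convert relative peak areas into fractional populations of each anomer."""
    total = beta_area + alpha_area
    return beta_area / total, alpha_area / total

beta_fraction, alpha_fraction = anomer_fractions(100.0, 66.0)
print(f"beta: {beta_fraction:.1%}, alpha: {alpha_fraction:.1%}")
```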
Figure 4 shows receptor with the addition of excess 4-fluoro-4-deoxy-glucose at 25°C (B) and at 2°C (C) to illustrate the effect of temperature on the rate of exchange between the bound and free forms of 4-fluoro-4-deoxy-glucose. At 25°C, the exchange rate is in the intermediate range and there is a significant broadening of the resonances, whereas the spectrum at 2°C (C) is at slower exchange and shows distinct peaks for the bound and free forms of the sugar. Figure 4 (D) shows the 4-fluoro-4-deoxy-galactose spectrum, which exhibits much narrower lineshapes than the glucose analog at the same temperature, the width at half-height of the bound β peaks for galactose and glucose being 133 and 228 Hz respectively. The dissociation rate is 4 times higher for 4-fluoro-4-deoxy-β-glucose than for the galactose analog, as determined by NMR magnetization transfer studies (manuscript in preparation).
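Whether bound and free resonances appear as distinct peaks depends on how the exchange rate compares with their frequency separation. The sketch below makes that comparison for 4-fluoro-4-deoxy-β-galactose, using the 470 MHz observation frequency, the 12.0 ppm separation from Table 1, and the dissociation rates of 261 and 60 s⁻¹ quoted earlier. It relies on two textbook relations rather than on anything specific to this study: slow exchange requires the rate to be much smaller than 2π times the separation in Hz, and in that limit leaving the bound site adds roughly k/π to the bound linewidth. The function names are illustrative.

```python
import math

OBSERVE_MHZ = 470.0  # 19F observation frequency given in Materials and Methods

def ppm_to_hz(delta_ppm, observe_mhz=OBSERVE_MHZ):
    """Convert a chemical shift difference in ppm to Hz (1 ppm = observe_mhz Hz)."""
    return delta_ppm * observe_mhz

def exchange_ratio(k_per_s, delta_ppm):
    """Ratio k / (2*pi*delta_nu); values far below 1 indicate slow exchange."""
    return k_per_s / (2.0 * math.pi * ppm_to_hz(delta_ppm))

def lifetime_broadening_hz(k_per_s):
    """Extra linewidth of a site in slow exchange when it is left at rate k."""
    return k_per_s / math.pi

separation_hz = ppm_to_hz(12.0)
for temperature, k_off in (("25 C", 261.0), ("2 C", 60.0)):
    print(f"{temperature}: k/(2*pi*dv) = {exchange_ratio(k_off, 12.0):.4f}, "
          f"broadening = {lifetime_broadening_hz(k_off):.1f} Hz")
```

This estimate neglects the intrinsic linewidth of the bound sugar and the 25 Hz line broadening applied in processing, so it is not expected to reproduce the measured widths at half-height.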
Figure 4. ¹⁹F NMR spectra of (A) two-dimensional NOESY exchange of 3 mM 4-fluoro-4-deoxy-galactose and 2 mM receptor at 25°C. Diagonal peaks show the one-dimensional spectrum, which is projected on the top of the map. (B) 1.6 mM 4-fluoro-4-deoxy-glucose and 0.8 mM receptor at 25°C and (C) 2°C. (D) 1 mM 4-fluoro-4-deoxy-galactose and 2 mM receptor at 2°C.
The presented data demonstrate the utility of ¹⁹F NMR for characterizing the nature of carbohydrate/protein interactions. The well defined spectra illustrate the ability of NMR methods to provide valuable information about exchange rates, anomeric preferences and the effect of temperature on binding events in these large protein systems.
Acknowledgments
I would like to thank Drs. R. London, J. Falke and D. LeBlanc for helpful discussions. I would also like to thank Dr. Eric Toone for use of his titrating calorimeter. Grant # T32 HL 07594 supported preliminary work for this project. Lastly, I am deeply indebted to Dr. Steve Withers for initiating this project.
References
1. Fesik, S. W., and Zuiderweg, E. P. (1990) Quart. Rev. Biophys. 23, 91-131.
2. Bax, A., and Grzesiek, S. (1993) Accounts Chem. Res. 26, 131-138.
3. Gerig, J. T. (1994) Prog. NMR Spectrosc. 26, 293-370.
4. Luck, L. A., and Falke, J. J. (1991) Biochemistry 30, 4248-4256.
5. Falke, J. J., Luck, L. A., and Scherrer, J. (1992) Biophys. J. 62, 82-86.
6. Drake, S. K., Bourret, R. B., Luck, L. A., Simon, M. I., and Falke, J. J. (1993) J. Biol. Chem. 268, 13081-13088.
7. Street, I. P., Kempton, J. B., and Withers, S. G. (1992) Biochemistry 31, 9970-9978.
8. Withers, S. G., and Street, I. P. (1988) J. Am. Chem. Soc. 110, 8551-8553.
9. Vyas, N. K., Vyas, M. N., and Quiocho, F. A. (1987) Nature (London) 327, 635-638.
10. Vyas, M. N., Vyas, N. K., and Quiocho, F. A. (1994) Biochemistry 33, 4762-4768.
11. Snyder, E. E., Buoscio, F. W., and Falke, J. J. (1990) Biochemistry 29, 3937-3943.
12. States, D. J., Haberkorn, R. A., and Ruben, D. J. (1982) J. Magn. Reson. 48, 286-292.
13. Miller, D. M., Olson, J. S., and Quiocho, F. A. (1980) J. Biol. Chem. 255, 2465-2470.
14. Robinson, G., Kuchel, P. W., Chapman, B. E., Doddrell, D. M., and Irving, M. G. (1990) J. Magn. Reson. 90, 363-369.
Heteronuclear Gradient-Enhanced NMR for the Study of 20-30 kDa Proteins: Application to Human Carbonic Anhydrase II
Ronald A. Venters and Leonard D. Spicer
Departments of Biochemistry and Radiology and the Duke University NMR Center, Duke University, Durham, NC 27710
I. Introduction
Recent advances in NMR spectrometers, multi-dimensional multinuclear NMR pulse sequences, and molecular biology have made it possible to utilize NMR spectroscopy to obtain three-dimensional solution structures of proteins and other biological macromolecules at resolutions comparable to those obtained using x-ray crystallography. In addition, high-resolution NMR techniques can often be used to study dynamical features and interactions of these molecules under physiologically relevant pH and buffer conditions. Proteins as large as 30 kDa can now be successfully assigned and studied using elegant NMR pulse sequences developed over the last several years. These sequences often depend on large one-bond heteronuclear couplings and spread through-bond correlations into 2, 3, and even 4 dimensions.

For every NMR pulse sequence it is possible to define the active coherence transfer pathways present at each point in the sequence. Phase cycling is commonly used to filter out signals from unwanted coherence pathways. This is accomplished by collecting multiple scans in which the phases of selected RF pulses and the receiver are cycled in such a way that signals from the desired coherence pathway add from scan to scan, while signals from unwanted coherence transfer pathways are eliminated by subtraction when the scans are combined. Phase cycling places stringent demands on spectrometer and environmental stabilities (1), since any instrumental instabilities will lead to imperfect subtractions and, therefore, coherent t₁ noise. In a reasonably concentrated NMR sample the requirement of completing a phase cycle can lead to inefficient use of spectrometer time when the required signal-to-noise ratio and/or resolution can be obtained with shorter acquisition times. This can be especially important in 3- and 4-D heteronuclear NMR experiments involving larger proteins.

One of the most recent additions to high-resolution NMR methodology has been the use of pulsed field gradients (PFG). In most PFG systems an extra coil is added to the probe which enables application of a precisely controlled linear magnetic field gradient across the sample of interest. Gradients allow for control of coherence transfer echo selection and artifact suppression without the use of radio frequency pulse phase cycling or, in most cases, solvent presaturation (1). PFG methods were proposed early in the development of NMR (2,3,4); however, widespread use of the technology was delayed until the development of actively shielded gradient coils eliminated eddy currents in the NMR probe (5). By eliminating the need for extensive RF pulse phase cycling, gradient-enhanced NMR data can be obtained in a fraction of the time, thereby increasing the throughput on high-field instrumentation. The duration of the PFG experiment is solely dependent on the required signal-to-noise ratio and resolution and is independent of the need to complete a phase cycle. More importantly, the data obtained are essentially free of artifacts normally caused by imperfect subtraction of phase cycled experiments (1).

Additionally, the use of gradients can, in most cases, eliminate the need for solvent presaturation. In protein studies the solvent is most often H₂O. Elimination of H₂O presaturation in these samples enables the observation of protons in fast exchange with, and at the same chemical shift as, the water. Also, since presaturation diminishes the overall signal observed from all protons in the molecule of interest, eliminating its use can improve the signal-to-noise ratio dramatically.
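The way a phase cycle retains one coherence pathway and cancels others can be illustrated numerically. The sketch below uses the standard rule that shifting a pulse phase by φ changes the phase of a pathway with coherence order change Δp by -Δp·φ, with the receiver phase set to follow the desired pathway. The function name and the four-step cycle are illustrative assumptions and are not taken from any particular pulse sequence in this chapter.

```python
import cmath
import math

def cycled_amplitude(delta_p, selected_delta_p, n_steps):
    """Average signal of a pathway with order change delta_p after an n-step cycle.

    The pulse phase is stepped through 2*pi*k/n_steps and the receiver phase
    follows the pathway with order change selected_delta_p.
    """
    total = 0j
    for k in range(n_steps):
        phi = 2.0 * math.pi * k / n_steps
        pathway = cmath.exp(-1j * delta_p * phi)            # phase picked up by this pathway
        receiver = cmath.exp(1j * selected_delta_p * phi)   # receiver tracks the wanted pathway
        total += pathway * receiver
    return total / n_steps

wanted = cycled_amplitude(-1, -1, 4)     # desired pathway adds over the cycle
unwanted = cycled_amplitude(+1, -1, 4)   # pathway differing by 2 cancels
aliased = cycled_amplitude(3, -1, 4)     # pathway differing by a multiple of 4 survives
print(abs(wanted), abs(unwanted), abs(aliased))
```

The cancellation only happens once all four scans have been co-added, which is why a complete cycle is required and why any instability between scans leaves residual signal.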
As described below, it is possible to introduce linear field gradients into most of the extensively utilized 2-D and 3-D homonuclear and heteronuclear NMR experiments optimized for use with proteins and peptides (5). We have utilized gradient NMR pulse sequences to study numerous proteins, including the enzyme human carbonic anhydrase (HCA). HCA is ubiquitous in living systems, with seven different mammalian isozymes (CAI to CAVII). The HCAII isozyme is a 29 kDa monomeric zinc metalloenzyme of 259 residues which catalyzes the reversible hydration of carbon dioxide to bicarbonate and a proton with a second-order rate constant, k_cat/K_M, of 1.5 × 10⁸ M⁻¹ s⁻¹ (6). HCAII is one of the largest monomeric proteins currently being studied by NMR, making it a good system on which to demonstrate the advantages of PFG technology.
II. Instrumental and Sample Requirements
In order to utilize the gradient-enhanced NMR pulse sequences which are now appearing with increasing frequency in the literature, one must have an instrument capable of applying a precisely controlled and reproducible linear field gradient across the sample of interest. Most applications to date make use of a single gradient along the longitudinal axis (Z gradient) of the sample; however, recent studies suggest that X, Y, and Z gradients may be desirable in some cases (7). The gradient field produced must be of sufficient strength and duration to select for the desired coherence pathway or to suppress artifacts and yet must not produce eddy currents at the sample which adversely affect observation of desired signals. The typical commercial PFG system consists of a probe with an added field gradient coil, a current amplifier, an interface to the acquisition computer, and software control. Probe manufacturers have produced a wide variety of probes containing a field gradient coil, including ¹H only, broadband indirect detection, and 5 mm and 8 mm triple-resonance (¹H, ¹³C, and ¹⁵N) probes. The amplifier must be stable, have low noise, and provide an extremely reproducible gradient pulse area.

In order to realize the full power of modern gradient-enhanced NMR for the study of biological systems, one must be able to obtain millimolar concentrations of ¹³C and ¹⁵N (and in some cases ²H) enriched samples. HCAII uniformly labeled with ¹⁵N and ¹³C was obtained by growing E. coli in defined media containing 3 g/L sodium [1,2-¹³C₂, 99%] acetate as the sole carbon source and 1 g/L [¹⁵N, 99%] ammonium chloride as the sole nitrogen source (8). The activity and NMR spectra of the protein labeled by this technique are the same as those obtained from protein produced from media containing labeled glucose; however, the cost of the sodium [1,2-¹³C₂, 99%] acetate growth media is considerably less than the cost of the [¹³C₆, 99%] glucose growth media. Commonly, the sample produced for heteronuclear triple-resonance 3-D NMR experiments must be 1 mM or greater in macromolecule concentration in approximately 400-600 µL of solution in a 5 mm NMR tube. However, it should be noted that there are efforts underway in several laboratories aimed at developing probes which can accommodate larger sample volumes.
The solution used is routinely 90% H₂O with 10% D₂O added for field locking. The sample should be free of impurities, especially other proteins which may copurify and may also be labeled with ¹³C and ¹⁵N isotopes.
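To put the sample requirement in practical terms, the sketch below converts the concentration and volume range quoted above into milligrams of a 29 kDa protein such as HCAII. The function name is illustrative, and the molecular weight is taken as the nominal 29 kDa given in the text; uniformly ¹³C, ¹⁵N labeled protein is slightly heavier, which is ignored here.

```python
def protein_mass_mg(conc_mM, volume_uL, mw_kDa):
    """Mass of protein (mg) needed for a sample of given concentration and volume."""
    moles = (conc_mM * 1e-3) * (volume_uL * 1e-6)   # mol/L times L
    grams = moles * (mw_kDa * 1e3)                  # g/mol
    return grams * 1e3                              # mg

# 1 mM sample of a 29 kDa protein over the quoted 400-600 microliter range.
masses_mg = {vol: protein_mass_mg(1.0, vol, 29.0) for vol in (400.0, 500.0, 600.0)}
for vol, mass in masses_mg.items():
    print(f"{vol:.0f} uL at 1 mM: {mass:.1f} mg")
```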
III. Application of Pulsed Field Gradients
Pulsed field gradients are added to existing NMR pulse sequences in order to suppress artifacts and/or to select certain coherence transfer pathways. These two applications have different requirements and benefits.

Coherence pathway selection using gradients is accomplished by applying a series of gradient pulses positioned in the sequence so that the final pulse refocuses only the desired coherence order. Originally, PFG experiments using coherence transfer pathway selection had lower sensitivity than their phase cycled counterparts because only N- or P-type coherence pathways were collected during any single scan, but not both (9). Kay et al. (10) have introduced a method for observing pure absorption ¹H-¹⁵N heteronuclear single quantum coherence (HSQC) that does not suffer this loss and that has demonstrated sensitivities greater than what can be obtained with other gradient schemes or non-gradient experiments (Figure 1). This sensitivity-enhanced 2-D experiment correlates directly bonded ¹H-¹⁵N pairs and is an important experiment in most protein NMR studies. The sensitivity enhancement is achieved by refocusing and detecting two orthogonal in-phase proton magnetization components, which are then deconvoluted and added (11). Pulses G1 and G2 in Figure 1 are an example of a gradient pulse pair used for coherence pathway selection. G1 dephases any magnetization which is transverse when it is applied. By precisely controlling the area and sign of G2, only the ¹H-¹⁵N coherence is refocused. All other coherence pathways remain dephased and are not observed, including the usually intense signals from H₂O and protons attached to nuclei other than ¹⁵N. The PFG sensitivity-enhancement protocol can be introduced into a wide variety of ¹³C, ¹⁵N heteronuclear 3-D experiments which use the amide proton for detection (10,11).

PFG elements can also be used purely for artifact suppression in heteronuclear pulse sequences. We have modified the sensitivity-enhanced HSQC sequence to include artifact suppression gradients (Figure 1). Used in this manner, gradient pulses maintain all of the desired magnetization pathways and are applied only where the sensitivity of the experiment will not be affected (9). There are three general ways to use PFG pulses for artifact suppression. First, a gradient pulse can be placed between the final pair of 90 degree RF pulses in a standard heteronuclear INEPT transfer (pulse G4 in Figure 1). This gradient pulse destroys all magnetization which is not longitudinal after the proton pulse. Since the desired magnetization is along the Z-axis at this point, no sensitivity is lost.
Secondly, whenever a 180 degree refocusing RF pulse is applied to a single spin species, PFG pulses of equal area and sign can be applied before and after the refocusing pulse to select terms that have transverse magnetization both before and after the pulse, thereby eliminating the effects of imperfections in the RF refocusing pulse. Examples of this use of gradients are the G3, G5, and G6 pulse pairs in Figure 1. Finally, PFG pulses of equal area but opposite sign can be placed around a heteronuclear decoupling 180 degree pulse to eliminate any pulse imperfections that can create unwanted transverse magnetization (9). Using these types of gradient pulses removes the need for the phase cycling schemes previously used to accomplish the same artifact suppression. In general, artifact suppression with gradients can be done at lower gradient strength than coherence selection, placing less stringent requirements on the gradient amplifier. In practice, it is desirable to include both artifact suppression gradients and coherence pathway selection gradients in 13C and/or 15N heteronuclear 2-D and 3-D experiments when possible. This is especially true in amide proton detected experiments on proteins, where sensitivity enhancement can be used. Using the sensitivity-enhanced HSQC experiment diagrammed in Figure 1, we have examined the efficacy of gradients for artifact suppression and coherence pathway selection. In addition, we have examined the limits and optimization of gradient strength and duration for coherence order selection. The proteins used in these comparisons were HCAII and an 80 amino acid fragment of the cI lambda repressor protein. The results obtained on HCAII and the cI lambda repressor fragment were very similar; therefore, we present only the results from HCAII here.
Figure 1. Gradient-enhanced 1H-15N heteronuclear single quantum coherence pulse sequence with coherence transfer selection and artifact suppression gradients. All pulses are of phase x unless otherwise indicated.
We first compared the gradient version of the sensitivity-enhanced HSQC experiment, using all of the gradient pulses diagrammed in Figure 1, with the same experiment in which the gradient pulses were deleted and presaturation solvent suppression was added (Figure 2a and b). No RF-based solvent suppression is used in the gradient version of the experiment. The data shown are 1-D 1H cross sections through the 15N dimension at 110.9 ppm. It is clear from Figure 2 that the signal-to-noise ratio and the suppression of coherent t1 noise are better in the spectra that employ pulsed field gradients. To obtain the same signal-to-noise ratio without gradients, four times as many scans must be collected for each t1 increment, resulting in significantly longer acquisition times and less efficient use of the instrument. Even compared with the presaturation data collected with four times the scans, the artifact and solvent suppression in the gradient spectrum is superior (Figures 2a and 2c). Gradients therefore allow the collection of better data using significantly less instrument time.
Figure 2 also presents a comparison of spectra obtained with only coherence order selection or with only artifact suppression using our gradient version of the sensitivity-enhanced HSQC. We collected data in which only the G1 and G2 gradients were used, resulting in 1H-15N coherence transfer selection as described above (Figure 2d). We compared these with data collected using only the G3, G4, G5, and G6 gradients, resulting in pure artifact suppression with no coherence transfer selection (Figure 2e), and with data collected using both methods (Figure 2a). It is evident from Figure 2e that using pulsed field gradients purely for artifact suppression results in inferior solvent suppression. In addition, the signal-to-noise ratio of this data set is approximately 30% lower than that of the data sets collected using coherence pathway selection gradients. In this set of experiments, the signal-to-noise ratio is identical and optimal when the coherence pathway selection gradients G1 and G2 are used alone (Figure 2d) or when all gradient pulses G1-G6 are used (Figure 2a).
Figure 2. HSQC data on a 2.4 mM sample of 15N labeled HCAII at pH 6.8 and 30 °C. Spectra a-e are 1-D 1H cross sections through the 15N dimension at 110.9 ppm. With the exception of data set c, each 2-D data set was collected with 128 blocks and two transients per block. Gradient conditions for each data set are: a) gradients G1, G2, G3, G4, G5, and G6; b) no gradients, H2O presaturation for 2 seconds; c) same as b with eight transients per block; d) only coherence selection gradients G1 and G2; e) only artifact suppression gradients G3, G4, G5, and G6.
Finally, we examined an array of power level and duration parameters for the gradients (G1 and G2) used in coherence transfer selection. Since the gyromagnetic ratio of 1H is 9.87 times that of 15N, the gradient area of G2 should be 9.87 times smaller than the gradient area of G1 in order to refocus the 1H-15N coherence pathway. This ratio of areas can be achieved by adjusting the gradient times and/or the gradient power levels. We have collected data using a wide array of gradient powers and times for G1 and G2 and have compared the signal-to-noise ratios obtained. Figure 3 presents a graphical representation of the results. Below the line drawn in the figure, filtering of solvent and artifacts is incomplete. The line corresponds to a constant G1 gradient area of 65 G-msec/cm. It should be noted that this line, while independent of protein sample on our instrument, may vary significantly between instruments and/or probes and is most likely strongly dependent on gradient amplifier performance and eddy current suppression in the probe. It is also apparent from Figure 3 that the signal-to-noise ratio is better when higher power, shorter duration gradients are used, as long as the gradient times do not become too short for proper amplifier response.
Figure 3. Graph representing the power level (gauss/cm) and duration (msec) combinations examined for coherence transfer selection gradient G1 in the 1H-15N HSQC spectra of 2.4 mM 15N labeled HCAII. For each data point, gradient G2 is adjusted to maintain the proper ratio for coherence refocusing. The dashed line represents a constant gradient area of 65 G-msec/cm, below which artifacts are clearly observable in the spectra. The numbers beside selected points indicate relative signal-to-noise ratios for cross peaks in the 2-D HSQC spectra. Note that the signal-to-noise ratio is highest when short duration, high power gradients are used.
These results can be used directly to implement gradient versions of NH-detected 13C/15N/1H 3-D experiments for protein assignment purposes, yielding data with higher sensitivity and less coherent noise in significantly shorter instrument times than their non-gradient precursors. These experiments allow the study of proteins of 30 kDa and larger, greatly expanding the number and types of systems now accessible.
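The area bookkeeping for the G1/G2 coherence selection pair can be sketched in a few lines. The following is a minimal illustration, assuming rectangular gradient pulses (area = strength x duration); the function names are hypothetical, and the 65 G-msec/cm threshold is the value observed on the authors' instrument, which the text notes may differ on other hardware. Only magnitudes are handled here; the sign of G2 is chosen separately to select the desired pathway.

```python
# Sketch only: rectangular gradient pulses are assumed, so area = strength * duration.
GAMMA_RATIO_1H_TO_15N = 9.87  # ratio quoted in the text
MIN_G1_AREA = 65.0            # G-msec/cm; instrument-specific threshold from Figure 3


def gradient_area(strength_g_per_cm: float, duration_msec: float) -> float:
    """Area of a rectangular gradient pulse, in G-msec/cm."""
    return strength_g_per_cm * duration_msec


def g2_area_for(g1_area: float) -> float:
    """Magnitude of the G2 area needed to refocus the 1H-15N pathway after G1."""
    return g1_area / GAMMA_RATIO_1H_TO_15N


def g2_duration(g1_strength: float, g1_duration_msec: float, g2_strength: float) -> float:
    """Duration (msec) of G2 at a chosen strength that gives the required area."""
    return g2_area_for(gradient_area(g1_strength, g1_duration_msec)) / g2_strength


def g1_area_sufficient(strength_g_per_cm: float, duration_msec: float) -> bool:
    """True if the G1 area is at or above the assumed artifact-free threshold."""
    return gradient_area(strength_g_per_cm, duration_msec) >= MIN_G1_AREA
```

For example, a hypothetical G1 of 30 G/cm applied for 2.5 msec has an area of 75 G-msec/cm, which lies above the threshold, and a matching G2 at 10 G/cm would last about 0.76 msec. Whether such a short pulse is within the amplifier's response time would have to be checked on the instrument in question.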
Acknowledgments
The authors would like to thank Dr. Terrence G. Oas and Dr. Guewha S. Huang for the use of 15N labeled cI lambda repressor fragment and for many helpful discussions. The Duke University NMR Center was established with grants from the NIH, NSF, and the North Carolina Biotechnology Center, which are gratefully acknowledged. This work was supported in part by NIH research grant GM 41829.
References
1. "Pulsed Field Gradients: Theory and Practice" Keeler, J., Clowes, R.T., Davis, A.L., and Laue, E.D. (1994) Concepts in Magnetic Resonance 6.
2. Maudsley, A.A., Wokaun, A., and Ernst, R.R. (1978) Chem. Phys. Lett. 55, 9-14.
3. Barker, P., Freeman, R. (1985) Journal of Magnetic Resonance 64, 334-338.
4. Bax, A., De Jong, P.G., Mehlkopf, A.F., and Smidt, J. (1980) Chem. Phys. Lett. 69, 567-570.
5. Hurd, R.E. (1990) Journal of Magnetic Resonance 87, 422-428.
6. Silverman, D.N., & Lindskog, S. (1988) Acc. Chem. Res. 21, 30-36.
7. Warren, W.S., Richter, W., Andreotti, A.H., and Farmer, B.T. II (1993) Science 262, 2005-2009.
8. Venters, R.A., Calderone, T.L., Spicer, L.D., and Fierke, C.A. (1991) Biochemistry 30, 4491-4494.
9. Bax, A., and Pochapsky, S.S. (1992) Journal of Magnetic Resonance 99, 638-643.
10. Kay, L.E., Keifer, P., and Saarinen, T. (1992) J. Am. Chem. Soc. 114, 10663-10665.
11. Palmer, A.G. III, Cavanagh, J., Wright, P.E., and Rance, M. (1991) J. Magn. Reson. 93, 151-170.
Toward the Solution Structure of Large (>30 kDa) Proteins and Macromolecular Complexes
Cheryl H. Arrowsmith, Weontae Lee, Matthew Revington
Division of Molecular and Structural Biology, Ontario Cancer Institute and Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada M5X 1K9
and
Toshio Yamazaki and Lewis E. Kay
Protein Engineering Network of Centers of Excellence and Departments of Medical Genetics, Biochemistry and Chemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8
I. Introduction
Advances in NMR technology over the last 5 years, particularly the development of 3 and 4 dimensional (3D and 4D) heteronuclear NMR (1-4), have provided the tools to determine the solution conformations of medium sized proteins in the range 15-25 kDa. Recent applications of this technology include the structures of a 23 kDa calmodulin-peptide complex (5), IL-1β (17 kDa) (6), domain IIA of glucose permease (18 kDa) (7) and protein S (19 kDa) (8). Backbone assignments have been reported for two 28 kDa proteins (9,10) and most recently for a 38 kDa protein (Copie and Torchia, personal communication). However, a number of laboratories have pointed out the limitations of these techniques for many protein systems with molecular weights greater than about 25 kDa (10-13). The problem arises from the short transverse relaxation times (T2) of nuclei in larger molecules, which result from longer rotational correlation times and the consequent efficient dipolar relaxation between covalently bound nuclei. Rapid T2 relaxation, especially of the carbon nuclei, during evolution periods and scalar transfer steps of nD pulse sequences reduces the sensitivity of many triple resonance techniques. In addition, many of the isotope filtered pulse sequences (14-17), which are crucial for identifying intermolecular interactions, also contain extensive delays which reduce their sensitivity for larger systems. In practice, many of the currently used NMR experiments will fail for molecular weights in excess of ~30 kDa. There are at least two strategies that can be used to overcome this problem and to extend the usefulness of NMR to larger macromolecules and complexes. First, one can modify current sequences to minimize the length of the pulse sequence; second, one can improve the sensitivity of pulse schemes by increasing the T2 relaxation times of the nuclei involved through random incorporation of deuterium into the protein (12,18,19).
We discuss here our application of these strategies to the 37 kDa complex between E. coli trp repressor and trp operator. The resonances of this complex have been assigned and a structure determined using a combination of selective deuteration and a series of heteronuclear NMR experiments that did not rely on the small Cα-N couplings (11). We discuss the limitations of this method and promising recent results suggesting that the strategies mentioned above will extend the usefulness of NMR to molecular weights in the range of 30-40 kDa.
TECHNIQUES IN PROTEIN CHEMISTRY VI Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.
II. Materials and Methods
A. Protein Preparation
Uniformly 13C and 15N labelled trp repressor was isolated from E. coli strain CY15070 (20) containing the overproducing plasmid pJPR2, grown in 1-2 L of minimal media with 1 g/L 15NH4Cl and 2.5 g/L D-glucose-13C6 as the sole nitrogen and carbon sources, respectively. It typically took 12-16 hours at 37 °C for the cells to reach an OD of 0.6-0.9 at 600 nm, at which point protein production was induced with 1 mM IPTG. Cells were harvested after another 8-9 hours, when the final OD was typically double that at the time of induction. Purification was as described previously (20,21). Typical final protein yields were 10-20 mg/L of growth media.
For triply labelled 2H,13C,15N-trp repressor the bacteria were first adapted to growth in D2O as follows. A single colony from an LB agar plate was used to inoculate 12 ml of M9 media (22) with 200 µg/ml ampicillin and 33% D2O as the solvent. After overnight growth at 37 °C, the cells were plated out onto an M9 agar plate made with 33% D2O. The 33% plate was incubated at 37 °C for two days, resulting in very small colonies. A colony from the 33% plate was used to inoculate 12 ml of M9 media with 56% D2O as solvent. After overnight growth the cells were plated onto a 56% D2O M9 plate and incubated for 24 hours. For large scale purification of triply labelled trp repressor a single colony from the 56% D2O plate was used to inoculate 10 ml of M9 media containing 15NH4Cl, D-glucose-13C6 and 70% D2O (Cambridge Isotope Labs.) as solvent. After overnight growth this culture was dense and was used to inoculate 1.5 L of the same M9 media. After shaking at 37 °C for 18.5 hours the O.D. (600 nm) had reached 0.6, at which point the culture was induced with IPTG and grown for another 15 hours. The final O.D. was 1.2, yielding 3.9 g of wet cells. Purification as usual yielded 60 mg of pure triply labelled protein.
For NMR samples the protein was concentrated by ultrafiltration to a final concentration of 1-2.4 mM trp repressor monomer in 500 mM NaCl, 50 mM sodium phosphate, pH 6. The high salt concentration was necessary to prevent aggregation of the protein. The corepressors, either L-tryptophan or 5-methyl-L-tryptophan, were added at a concentration of 1.5-2 times the protein subunit concentration to form trp holorepressor. Protein-DNA complexes were prepared by adding the appropriate amount of synthetic operator DNA (23) to the above sample, followed by dialysis and concentration into a pH 6 solution of 50 mM (or less) sodium phosphate. Higher salt concentrations destabilized the protein-DNA complex and resulted in much poorer quality NMR spectra.
B. NMR Spectroscopy
For non-deuterated complexes, NMR experiments were performed on either Varian Unity600 or Varian Unity+500 spectrometers. The 600 MHz instrument was equipped with a triple resonance probe and a PTS synthesizer as a pseudo fourth channel. The 500 MHz spectrometer was a four channel instrument with a triple resonance probe with an actively shielded pulsed field gradient coil. All experiments were performed at 37 °C. The heteronuclear experiments shown in Figure 1 were performed as described in Zhang et al. (11) and Revington et al. (24). For all 3D experiments 32 transients were required for sufficient signal to noise. This necessitated the use of fewer increments and the use of linear prediction (25) to obtain sufficient resolution in one or both of the indirectly detected dimensions. An improved version of the 3D 13C F3-filtered HMQC-NOESY (5) was recorded as reported by Lee et al. (27). An HNCA experiment designed for 2H,15N,13C labelled proteins was recorded as described by Yamazaki et al. (19). This experiment was acquired on a three channel Varian Unity500 spectrometer modified to perform the 15N pulses, 15N decoupling and 2H decoupling on a single channel. Alternatively, this could be accomplished on a four channel instrument without modification.
III. Results and Discussion
The solution structure of the trp repressor-operator complex was recently determined using a combination of selective deuteration and heteronuclear NMR assignment strategies. Figure 1 shows the heteronuclear experiments used in our laboratory to assign the 15N, 13C and 1H resonances of the repressor-operator complex. We began with gradient enhanced 15N HSQC (26) and NOESY-HMQC (28) spectra of 15N labelled protein bound to natural abundance DNA and corepressor. Since only those proton and 15N resonances in the DNA-binding or ligand-binding regions showed significant changes upon binding DNA, approximately 70% of the amide N-H pairs could be assigned based on similarity with spectra of the holorepressor. Assignment of the backbone carbon resonances using conventional triple resonance techniques (2,4) was not possible due to the poor quality of spectra involving the CA-NH correlation. The NOE-based 15N assignments were confirmed and further backbone carbon and nitrogen assignments identified from the HNCO (29) and (HB)CBCACO(CA)HA (30) experiments. The HNCO is the most sensitive triple resonance experiment (2) and provides sequential connectivities through the peptide bond. The (HB)CBCACO(CA)HA relies on relatively large C-H and C-C couplings and also gives a reasonable signal on a complex of this size. By matching the carbonyl resonances from these two experiments it is possible, in principle, to connect the Cα and Cβ of each side chain with the NH resonances of the next residue in the sequence. In practice, however, there will be ambiguities which would ideally be resolved by additional triple resonance experiments. Although we were not successful in acquiring additional triple resonance experiments, the fact that approximately 50% of the Cα resonances remained the same as in the free protein allowed us to assign ~80% of the backbone and Cβ resonances from the two spectra described above. The Cα and Cβ assignments served as a starting point for analysis of the 3D 13C NOESY-HSQC (31) and HCCH-TOCSY (32). In our hands the HCCH-TOCSY gave only partial connectivities, presumably due to rapid T2 relaxation. Therefore a complete side chain assignment via scalar connectivities was not possible, and we relied heavily on NOE spectra for assignment of the side chain resonances. Using this combination of experiments it was possible to assign approximately 80% of the backbone and side chain resonances of DNA-bound trp repressor. The proton assignments were confirmed by comparison with the results of selective deuteration experiments (11).
Figure 1. NMR experiments used to assign the trp repressor-operator complex (11,24), grouped by the isotopic labelling of the protein:
15N: HSQC; 3D NOESY-HMQC; 15N-F1-filtered NOESY
15N/13C: 3D HNCO; 3D (HB)CBCACO(CA)HA; 15N/13C-F1,F2-filtered NOESY
13C: 3D HCCH-TOCSY; 3D NOESY-HSQC; 3D 13C-F3-filtered HMQC-NOESY
The success of this strategy relied, in part, on the fact that a large portion of the protein did not undergo significant changes in chemical shift upon binding to DNA. This meant that assignments for the smaller species based on the more reliable triple resonance experiments (rather than NOESYs) could be carried over to the complex. Figure 2 shows a comparison of "strips" from the HNCA of the holorepressor with those of the (HB)CBCACO(CA)HA of the complex for residues 94-106, whose alpha carbons do not change significantly upon interaction with DNA.
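The carbonyl-matching step that links the HNCO and (HB)CBCACO(CA)HA spectra can be illustrated with a short sketch. This is a minimal, hypothetical example and not the authors' software: the peak lists, the 0.15 ppm tolerance, and all names are assumptions made for illustration. Each HNCO peak carries the amide shifts of one residue and the carbonyl shift of the preceding residue, so aliphatic carbons that share that carbonyl shift are candidates for the preceding residue's Cα and Cβ.

```python
from typing import List, NamedTuple, Tuple


class HncoPeak(NamedTuple):
    label: str   # hypothetical peak identifier
    hn: float    # amide 1H shift of residue i+1 (ppm)
    n: float     # 15N shift of residue i+1 (ppm)
    co: float    # carbonyl 13C shift of residue i (ppm)


class CbcacoPeak(NamedTuple):
    label: str      # hypothetical peak identifier
    c_aliph: float  # Ca or Cb shift of residue i (ppm)
    co: float       # carbonyl 13C shift of residue i (ppm)
    ha: float       # Ha shift of residue i (ppm)


Link = Tuple[HncoPeak, List[CbcacoPeak]]


def link_by_carbonyl(hnco: List[HncoPeak], cbcaco: List[CbcacoPeak],
                     tol_ppm: float = 0.15) -> List[Link]:
    """Pair each amide peak with aliphatic peaks whose carbonyl shift agrees."""
    return [(amide, [p for p in cbcaco if abs(p.co - amide.co) <= tol_ppm])
            for amide in hnco]


def is_ambiguous(link: Link) -> bool:
    """Crude test: partners with different Ha shifts imply more than one residue."""
    _, partners = link
    return len({round(p.ha, 2) for p in partners}) > 1


# Made-up shifts, for illustration only.
example_hnco = [HncoPeak("a", 8.20, 120.1, 176.50), HncoPeak("b", 7.95, 118.3, 174.20)]
example_cbcaco = [
    CbcacoPeak("x1", 56.1, 176.48, 4.30), CbcacoPeak("x2", 30.2, 176.48, 4.30),
    CbcacoPeak("y1", 62.0, 174.25, 4.10), CbcacoPeak("z1", 58.0, 174.15, 4.55),
]
example_links = link_by_carbonyl(example_hnco, example_cbcaco)
```

In this made-up data set, amide peak "a" links cleanly to one residue's Cα/Cβ pair, whereas peak "b" matches two candidate residues with nearly degenerate carbonyl shifts. This is the kind of ambiguity that, as noted above, would ideally be resolved by additional triple resonance experiments or, as in this work, by comparison with the free protein assignments.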
Figure 2. Strips from the HNCA of trp repressor (A) and the (HB)CBCACO(CA)HA of trp repressor bound to DNA (B) showing how residues 94-106 have very similar chemical shifts. Vertical strips from each residue in (A) contain correlations to the intraresidue Cα as well as the Cα of the previous residue. Vertical strips in (B) show only intraresidue Cα correlations. Strips are labelled by residue: V94, E95, L96, R97, Q98, W99, L100, E101, E102, V103, L104, L105, K106.
In addition to assigning the resonances of the protein, we needed to assign those of the DNA and ligand in the complex and to identify intermolecular NOEs from the protein to the corepressor and DNA. The assignment of the resonances of unlabelled trp operator DNA and the corepressor, L-tryptophan, was accomplished using 2D F1/F2 isotope-filtered NOESY experiments (15). These spectra show only NOEs within the bound DNA and ligand, and can be analyzed in the conventional manner for sequence specific assignment of DNA (33). NOEs between the 13C-labelled protein and unlabelled DNA and corepressor were identified from the F3-filtered HMQC-NOESY. This spectrum proved to be extremely valuable for the structural analysis of the protein-DNA contact surface. However, all the isotope filtered experiments require "purge sequences" to eliminate magnetization from the labelled species, which requires the spins involved to spend additional time in the transverse plane. This reduces the sensitivity of these experiments relative to a normal 2D or 3D NOESY. In particular, the 3D F3-filtered HMQC-NOESY sequence as originally reported (5) gave useful signal on our complex only when it was run in the 2D mode (i.e. the carbon evolution time was eliminated). Although the carbon evolution time reduces the sensitivity relative to the 2D experiment, it is essential for the unambiguous assignment of the protein contribution to the cross peaks. Therefore, we improved upon the existing 3D pulse sequence by reducing the total time that 1H magnetization spends in the transverse plane, combining the proton evolution and 1H/13C scalar transfer times (4, 27, 34, 35). In addition, pulsed field gradients (PFG) were used to eliminate artifacts rather than extensive phase cycling. A double purging scheme with delays optimized for two different values of carbon-proton coupling was also used to ensure complete filtering of all 13C-bound protons.
Figure 3 compares a slice of the 3D F3-filtered HMQC-NOESY spectrum with that of the analogous HMQC-NOESY. It is clear from this comparison that the identification of solely intermolecular NOEs in the F3-filtered spectrum is very useful for interpreting the NOEs in the non-filtered spectrum.
Figure 3. A comparison of carbon slices from the F3-filtered HMQC-NOESY spectrum (A) with those of the 13C HMQC-NOESY (B). Intermolecular NOEs in (B) are indicated. Plane shown: 13C = 21.3 ppm; axes F1 (ppm) and F3 (ppm). Labelled cross peaks: Val58γ/Trp(H5), Thr44γ/A12(H4'), Thr81γ/Trp(H7), Thr44/Trp(H2), Thr44γ/A12(H2'), Thr44γ/A12(H3').
Although an assignment strategy such as that outlined in Figure 1 may work for certain favorable cases, it is clear that a more general assignment strategy is needed for many protein systems with molecular weights in excess of 25 kDa. Since the major problem with larger systems is rapid transverse relaxation during the pulse sequence, longer T2 relaxation times of the nuclei involved will improve the sensitivity of most NMR experiments. The major mechanism of carbon transverse relaxation is dipolar relaxation with its covalently bound protons. Therefore, we have sought to increase the carbon T2 by substitution of deuterium for hydrogen. Toward this end we have prepared ~70% 2H/15N/13C labelled trp repressor as described above. From preliminary NMR measurements it appears that the level of deuterium incorporation at the Cα position is approximately that of the deuterium level in the growth media. The incorporation of deuterium within the amino acid side chains is currently under investigation. Deuterium labelling at this level is only slightly more expensive than a protonated 15N/13C growth because high isotopic purity D2O is not necessary. Moreover, although the bacteria grew more slowly in deuterated media, the final yield of protein was as good as or better than that of other protonated preparations in our lab. Preliminary relaxation measurements indicate that this level of deuteration increases the Cα T2 relaxation times by approximately 8-fold (Yamazaki et al., submitted). This has allowed us to perform triple resonance experiments with excellent sensitivity in an effort to assign all backbone residues more systematically and thoroughly.
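The benefit of a longer T2 can be made concrete with a simple estimate. The sketch below assumes single-exponential transverse decay, exp(-t/T2), over a fixed period of transverse carbon magnetization; the delay and T2 values used in any call are hypothetical inputs, not measurements from this work. Only the approximately 8-fold increase in T2 comes from the text.

```python
import math


def surviving_fraction(delay_ms: float, t2_ms: float) -> float:
    """Fraction of transverse magnetization left after delay_ms, assuming exp(-t/T2)."""
    return math.exp(-delay_ms / t2_ms)


def sensitivity_gain(delay_ms: float, t2_protonated_ms: float,
                     fold_increase: float = 8.0) -> float:
    """Ratio of surviving signal with the lengthened T2 to that with the original T2."""
    before = surviving_fraction(delay_ms, t2_protonated_ms)
    after = surviving_fraction(delay_ms, t2_protonated_ms * fold_increase)
    return after / before
```

Under this assumption the gain equals exp((7/8) * delay / T2), so it grows rapidly as the transverse period becomes long relative to the protonated T2. This is why experiments with long carbon evolution or transfer periods stand to benefit most from deuteration.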
Figure 4. Strips from the HNCA of 70% deuterated 15N,13C labelled trp repressor bound to DNA in the presence of the corepressor 5-methyl-L-tryptophan. Each vertical strip contains correlations to the intraresidue Cα as well as the Cα of the previous residue. Vertical axis: 13C (ppm). Strips are labelled by residue: E60, L61, L62, R63, G64, E65, M66, S67, N68, R69.
Figure 4 shows strips from the HNCA spectrum of 2H/15N/13C labelled repressor bound to DNA. Owing to the excellent sensitivity and resolution of this spectrum, 99% of the expected inter- and intraresidue cross peaks were observed. Thus, a high proportion of the HN, N and Cα resonances of the DNA-bound protein could be assigned from this spectrum alone. It is likely that a complete backbone assignment will be possible with complementary triple resonance experiments on this deuterated complex. This is a promising result, indicating that fractional deuteration may enable the assignment of proteins and complexes as large as perhaps 30-40 kDa.
Acknowledgments
The protein chemistry, much of the spectroscopy and NMR resonance assignments for trp repressor were carried out in the Arrowsmith laboratory with support from the NCI of Canada and the Human Frontier Science Program. We acknowledge our very fruitful collaboration with the laboratory of O. Jardetzky on the structure of the trp repressor complex. The development and implementation of the HNCA and isotope filtered pulse sequences was carried out in the laboratory of L.E. Kay with support from NSERC and NCI of Canada. T. Yamazaki acknowledges fellowship support from the Human Frontier Science Program.
References
1. Montelione, G.T., Wagner, G. (1990) J. Magn. Reson. 87, 183-188.
2. Ikura, M., Kay, L.E., Bax, A. (1990) Biochem. 29, 4659-4667.
3. Kay, L.E., Clore, G.M., Bax, A. and Gronenborn, A.M. (1990) Science 249, 411-414.
4. Bax, A., and Grzesiek, S. (1993) Acc. Chem. Res. 26, 131-138.
5. Ikura, M., Clore, G.M., Gronenborn, A.M., Zhu, G., Klee, C.B., Bax, A. (1992) Science 256, 632-638.
6. Clore, G.M., Wingfield, P.T. and Gronenborn, A.M. (1991) Biochem. 30, 2315-2323.
7. Fairbrother, W.J., Gippert, G.P., Reizer, J., Saier, M.H. and Wright, P.E. (1992) FEBS Lett. 296, 148-152.
8. Bagby, S., Harvey, T.S., Kay, L.E., Eagle, S.G., Inouye, S., and Ikura, M. (1994) Structure 2, 107-122.
9. Remerowski, M.L., Komke, T., Groenewegen, A., Pepermans, H.A.M., Hilbers, C.W. and van de Ven, F.J.M. (1994) J. Biomol. NMR 4, 257-278.
10. Fogh, R.H., Schipper, D., Boelens, R., and Kaptein, R. (1994) J. Biomol. NMR 4, 123-128.
11. Zhang, H., Zhao, D., Revington, M., Lee, W., Jia, X., Arrowsmith, C. and Jardetzky, O. (1994) J. Mol. Biol. 238, 592-614.
12. Grzesiek, S., Anglister, J., Ren, H. and Bax, A. (1993) J. Am. Chem. Soc. 115, 4369.
13. Kushlan, D.M. and LeMaster, D.M. (1993) J. Biomol. NMR 3, 701.
14. Wider, G., Weber, C., Traber, R., Widmer, H., and Wuthrich, K. (1990) J. Am. Chem. Soc. 112, 9015-9016; Otting, G., Wuthrich, K. (1990) Q. Rev. Biophys. 23, 39-96.
15. Ikura, M., Bax, A. (1992) J. Am. Chem. Soc. 114, 2433-2440.
16. Gemmecker, G., Olejniczak, E.T., Fesik, S. (1992) J. Magn. Reson. 96, 199-204.
17. Folkers, P.J.M., Folmer, R.H.A., Konings, R.N.H., Hilbers, C.W. (1993) J. Am. Chem. Soc. 115, 3798-3799.
18. LeMaster, D.M. and Richards, F.M. (1988) Biochem. 27, 142-150.
19. Yamazaki, T., Lee, W., Revington, M., Mattiello, D.L., Dahlquist, F.W., Arrowsmith, C.H. & Kay, L.E. (1994) J. Amer. Chem. Soc. 116, 6464-6465.
20. Paluh, J.L., and Yanofsky, C. (1986) Nuc. Acids Res. 14, 7851-7861.
21. Arrowsmith, C.H., Pachter, R., Altman, R.B., Iyer, S.B. and Jardetzky, O. (1990) Biochem. 29, 6332-6341.
22. Sambrook, J., Fritsch, E.F. and Maniatis, T. (1989) Molecular Cloning, 2nd Ed. Cold Spring Harbor Press, Cold Spring Harbor, N.Y.
23. Lefevre, J.F., Lane, A.N. and Jardetzky, O. (1986) Biochem. 16, 5076-5090.
24. Revington, M.J., Lee, W. and Arrowsmith, C.H. submitted.
25. Marion, D. and Bax, A. (1989) J. Magn. Res. 84, 72-84.
26. Kay, L.E., Keifer, P., & Saarinen, T. (1992) J. Am. Chem. Soc. 114, 10663-10665.
27. Lee, W., Revington, M.J., Arrowsmith, C.H. and Kay, L.E. (1994) FEBS Lett., in press.
28. Zuiderweg, E.R.P. & Fesik, S.W. (1989) Biochemistry 28, 2387-2391.
29. Kay, L.E., Ikura, M. & Bax, A. (1991) J. Magn. Reson. 91, 84-91.
30. Kay, L.E. (1993) J. Am. Chem. Soc. 115, 2055-2057.
31. Muhandiram, D.R., Farrow, N., Xu, G.Y., Smallcombe, S.J. and Kay, L.E. (1993) J. Magn. Reson. Series B 102, 317-321.
32. Kay, L.E., Xu, G.Y., Singer, A.U., Muhandiram, D.R. and Forman-Kay, J.D. (1993) J. Magn. Reson. B 101, 333-337.
33. Wuthrich, K. (1986) NMR of Proteins and Nucleic Acids. John Wiley & Sons, N.Y.
34. Muhandiram, D.R., Xu, G.Y. and Kay, L.E. (1993) J. Biomol. NMR 3, 463-470.
35. Logan, T.M., Olejniczak, E.T., Xu, R.X. and Fesik, S.W. (1993) J. Biomol. NMR 3, 225-231.
Solution Structures of Horse Ferro- and Ferricytochrome c using 2D and 3D 1H NMR and Restrained Simulated Annealing
Phoebe X. Qi, Ernesto J. Fuentes, Robert A. Beckman¹, Deena L. Di Stefano, and A. Joshua Wand²
Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801 and Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, PA 19111
I. INTRODUCTION
The introduction and implementation of heteronuclear-based multidimensional techniques have revolutionized the protein NMR field. Large proteins (> 100 residues) are now amenable to detailed NMR studies and structure determination. These techniques, however, necessarily require a scheme by which 13C and 15N isotopes can be incorporated into the protein to yield a uniformly labeled sample. Additional complications, such as extensive covalent post-translational modifications, can seriously limit the ability to express a protein efficiently and cost effectively in isotope enriched media; the c-type cytochromes are an example of such a limitation. In the absence of an effective labeling protocol, one must therefore rely on more traditional proton homonuclear NMR methods. These include two-dimensional (1) and, more recently, three-dimensional 1H experiments (2,3). Cytochrome c has become a paradigm for protein folding and electron transfer studies because of its stability, solubility and ease of preparation. As a result, several high-resolution X-ray crystal structure models for c-type cytochromes, in both redox states, have emerged. Although only subtle structural differences between redox states have been observed in these models (4,5), a number of NMR-based structural studies suggest the presence of potentially significant structural differences (6). In an effort to clarify redox dependent structural issues, we have undertaken the determination of the solution structures of horse cytochrome c in its two redox states. No economical method for uniform 13C and 15N labeling of cytochrome c has been achieved; consequently, the use of heteronuclear NMR methods has been precluded. Here we describe the high-resolution solution structures of ferro- and ferricytochrome c determined using 1H 2D and 3D NMR spectroscopy and hybrid distance geometry-simulated annealing calculations. These detailed structural studies will provide the basis for a comprehensive re-evaluation of hypotheses concerning the fundamental nature of electron transfer processes in proteins, and also serve to illustrate that highly defined molecular models of proteins of moderate size can be determined using structural restraints derived solely from 1H NMR experiments.
¹ Present address: Biophysics Research Division, University of Michigan, Ann Arbor, MI 48109.
² Address correspondence to this author at the University of Illinois.
II. MATERIALS & METHODS
A. Sample preparation and NMR spectroscopy
The use of homonuclear NMR methods, especially those of higher dimensionality, requires relatively concentrated samples. Either 7 mM or 10 mM solutions of cytochrome c in 90% H2O/10% D2O containing 50 mM potassium phosphate (pH 5.74) were used to carry out 2D NMR experiments. Proton 3D NOESY-TOCSY spectra were obtained using 15 mM solutions. Samples were reduced with solid ascorbate and deoxygenated with nitrogen. NMR spectra were collected at 20 °C on Bruker AM-600, AMX-500 and AM-300 NMR spectrometers as described elsewhere (22,25,27,28). NMR spectra were processed and analyzed using the computer programs FTNMR and Felix (Hare Research, Bothell, WA). NOESY (7) spectra were recorded with mixing times of 30 ms, 50 ms, 70 ms, 90 ms and 110 ms at 600 MHz with 64 scans per free induction decay. Double quantum filtered COSY (8), NOESY (60 ms mixing time) and TOCSY (9) spectra were also acquired in 90% H2O/10% D2O at 500 MHz under identical experimental conditions and used to determine the ³J(HαHN) coupling constants as described by Ludvigsen et al. (10). Proton 3D NOESY-TOCSY spectra (2) were acquired at 500 MHz using samples prepared in 90% D2O/10% H2O buffer. Ferrocytochrome c spectra were acquired with eight scans per free induction decay and consisted of 220 (t1) x 200 (t2) x 512 (t3) complex points with a spectral width of 6410 Hz at 500 MHz. The spectra for ferricytochrome c were acquired under identical conditions using 192 (t1) x 192 (t2) x 512 (t3) complex points and a spectral
2D and 3D NMR Studies of Cytochrome C
width of 6579 Hz at 500 MHz. A NOESY mixing time of 100 ms was used and isotropic mixing was accomplished with 30 ms of spin-locking with an MLEV-17 pulse train.
III. RESULTS
A. Distance Restraints Derived From 2D ¹H NMR Spectra
Previously reported assignments for the ¹H NMR spectrum of horse heart ferro- (11) and ferricytochrome c (12) were used to identify crosspeaks in NOESY spectra collected with mixing times of 30 ms, 50 ms, 70 ms, 90 ms and 110 ms. Initial rates of NOE buildup were estimated with local baseline correction of crosspeak volumes, calibrated using α-helical main chain distance relationships (13), and used to generate upper bounds for initial distance restraints. All upper bound restraints involving flipping aromatic rings were increased by 2.5 Å above the value set by the initial rate of NOE buildup. A lower bound distance restraint of 1.90 Å was applied. One round of structure calculations and iterative crosspeak assignment completed the initial set of NOE restraints. A structurally effective restraint set consisting of 1133 NOE-based distance restraints for ferrocytochrome c and 909 NOE-based distance restraints for ferricytochrome c was thereby obtained from the 2D NOESY spectra.
B. Distance Restraints Derived From 3D ¹H NMR Spectra
Due to chemical shift overlap in 2D spectra, unambiguous assignment of many NOEs is generally laborious and at times impossible. Through the use of 3D NMR spectroscopy, increased resolution can be achieved by correlating the resonance frequencies of three individual spins in three independent dimensions. In the 3D ¹H NOESY-TOCSY experiment, one combines magnetization transfers due to both NOE and J coupling in one experiment to reduce chemical shift overlap problems. Heteronuclear 3D experiments, using the chemical shift dispersion of ¹³C or ¹⁵N, can also be used to edit a complex homonuclear 2D spectrum. In contrast to heteronuclear 3D experiments, the homonuclear 3D spectrum contains a larger number of cross peaks, which is a disadvantage, but provides multiple confirmations of the origin of the NOE interaction found in the homonuclear 2D spectrum.
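The conversion of initial NOE buildup rates into upper-bound restraints described for the 2D spectra can be sketched as follows. This is a minimal illustration that assumes the isolated spin-pair approximation (initial rate proportional to r⁻⁶) and a straight-line buildup through the origin; the function names, the 2.8 Å reference distance and the crosspeak volumes are hypothetical, and only the mixing times, the 2.5 Å aromatic padding and the 1.90 Å lower bound are taken from the text.

```python
def initial_rate(mixing_times_s, volumes):
    """Least-squares slope of crosspeak volume vs. mixing time (line through origin)."""
    numerator = sum(t * v for t, v in zip(mixing_times_s, volumes))
    denominator = sum(t * t for t in mixing_times_s)
    return numerator / denominator


def upper_bound(rate, ref_rate, ref_distance, flipping_aromatic=False):
    """Upper bound in angstroms from r = r_ref * (rate_ref / rate) ** (1/6)."""
    distance = ref_distance * (ref_rate / rate) ** (1.0 / 6.0)
    if flipping_aromatic:
        distance += 2.5  # padding stated in the text for flipping aromatic rings
    return distance


LOWER_BOUND = 1.90  # angstroms, as stated in the text

# Hypothetical data: mixing times from the text, invented crosspeak volumes.
mixing_times = [0.030, 0.050, 0.070, 0.090, 0.110]
reference_volumes = [64.0 * t for t in mixing_times]  # assumed 2.8 A reference pair
crosspeak_volumes = [1.0 * t for t in mixing_times]   # pair of unknown distance

ref_rate = initial_rate(mixing_times, reference_volumes)
rate = initial_rate(mixing_times, crosspeak_volumes)
print("bounds (A):", LOWER_BOUND, round(upper_bound(rate, ref_rate, 2.8), 2))
```

Because of the sixth-root dependence, even a large error in the estimated rate changes the derived bound only modestly, which is one reason loose upper bounds are a common way to encode such restraints.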
The use of the 3D ¹H NOESY-TOCSY spectrum of cytochrome c in each redox state resulted in additional NOE-based distance restraints not available using conventional 2D NOESY analysis. These distance restraints were
obtained by careful evaluation of a family of structures, refined using the initial restraint set described above, in conjunction with an analysis of the 3D ¹H NOESY-TOCSY spectrum. This analysis yielded a restraint set composed of 70 intraresidue and 237 short range, 182 medium range and 324 long range inter-residue NOE-based distance restraints for ferrocytochrome c. The restraint set for ferricytochrome c was composed of 34 intraresidue and 114 short range, 56 medium range and 144 long range inter-residue NOE-based distance restraints. The 3D ¹H NOESY-TOCSY experiment does not provide simultaneous confirmation of the origin of both NOE-correlated frequencies. Thus, although confirmation of the origin of both frequencies could be checked by examining both NOESY-TOCSY pathways (i.e., spin A NOE to spin B TOCSY to spin C and spin B NOE to spin A TOCSY to spin D), this does not guarantee that a given crosspeak is entirely due to one spin pair. Therefore initial structures were examined to provide an additional level of confidence in the assignment of a given NOE crosspeak to a given spin pair by rejecting all other possible spin pairs on gross structural grounds. To avoid issues relating to variable transfer efficiencies in the TOCSY component of the three dimensional experiment, all restraints derived from analysis of the three dimensional NOESY-TOCSY spectrum were simply encoded as corresponding to distance upper bounds of 5.0 Å (7.5 Å in the case of NOEs involving hydrogens of flipping aromatic rings and 6.0 Å in the case of methyl groups; see Dellwo & Wand (17)) and a lower bound of 1.9 Å.
C. Torsion Angle Restraints Derived From ¹H-¹H J-coupling Constants
Using established homonuclear methods for measuring ³J(HαHN) coupling constants, φ torsion restraints were obtained for cytochrome c in both redox states. Specifically, 2D DQF-COSY, TOCSY and NOESY spectra were taken under identical experimental conditions and used to determine ³J(HαHN) coupling constants.
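Turning a measured ³J(HαHN) coupling constant into a φ restraint rests on a Karplus-type relation. The sketch below is illustrative only: it assumes the commonly quoted form ³J = A cos²θ + B cos θ + C with θ = φ − 60° and coefficients A = 6.4, B = −1.4, C = 1.9 Hz; the coefficient values and the function names are assumptions made here, not values quoted in this chapter. It solves the quadratic in cos θ and returns every φ consistent with a given coupling constant.

```python
import math

# Assumed coefficients (Hz) for 3J = A*cos^2(t) + B*cos(t) + C, with t = phi - 60 deg.
A, B, C = 6.4, -1.4, 1.9


def karplus_j(phi_deg):
    """Coupling constant (Hz) predicted for a given phi (degrees)."""
    t = math.radians(phi_deg - 60.0)
    return A * math.cos(t) ** 2 + B * math.cos(t) + C


def phi_solutions(j_hz):
    """All phi values in [-180, 180) degrees consistent with a coupling constant."""
    disc = B * B - 4.0 * A * (C - j_hz)
    if disc < 0.0:
        return []  # J is below the minimum of the curve
    solutions = set()
    for sign in (1.0, -1.0):
        x = (-B + sign * math.sqrt(disc)) / (2.0 * A)  # candidate cos(theta)
        if abs(x) > 1.0 + 1e-9:
            continue  # not a valid cosine
        theta = math.degrees(math.acos(max(-1.0, min(1.0, x))))
        for t in (theta, -theta):
            phi = (t + 60.0 + 180.0) % 360.0 - 180.0  # wrap into [-180, 180)
            solutions.add(round(phi, 1))
    return sorted(solutions)


print(phi_solutions(karplus_j(-60.0)))
```

A single coupling constant generally maps onto several φ values, which is why restraints of this kind are usually encoded as ranges and combined with other structural information.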
For example, in the ferrocytochrome c spectra, 38 φ torsion restraints were obtained for those residues with well resolved crosspeaks in the fingerprint region. These torsion restraints were obtained by use of linear combinations of corresponding cross sections of antiphase and inphase spectra along the ω2 dimension (10). The ³J(HαHN) coupling constants were also estimated directly from a high-resolution DQF-COSY spectrum processed with strong resolution enhancement and corrected as described by Neuhaus et al. (14). The empirical calibration constants of Pardi et al. (15) were used to solve the Karplus equation (16). In this way, an additional 47 φ torsion angle restraints were determined for ferrocytochrome c and 58 φ torsion angle restraints for ferricytochrome c, all with an assumed precision of better than ±30°. In total, 85 and 58 φ torsion
angle restraints were used in the final structure calculations for ferro- and ferricytochrome c, respectively.
D. Definition of Hydrogen Bond Restraints
A representative family of initial structures (intermediate resolution) was used to assign definitive hydrogen bonding involving main chain atoms. This was achieved using the program Dspace (Hare Research, Bothell, WA). A metric matrix approach (18) encoded in Dspace was used to generate starting structures. The bounds matrix was created using amino acid templates as described previously (22), and employed reduced van der Waals radii for atom pair interactions that could potentially be hydrogen bonding. The bounds matrix was smoothed by exhaustive application of the triangle inequality, randomly sampled and the structures embedded in E3 space. The embedded structures were then refined with repetitive application of steepest descent least squares minimization of the sum of the squares of the violations of the structure with respect to covalent geometry and experimental restraints. At this stage of the refinement, errors in chirality were corrected by inversion. Minimization was then followed by several hundred cycles of simulated annealing using the sum of the squares of the violations as a pseudo-temperature variable (19). The geometrical criteria for hydrogen bonding employed required the amide nitrogen-carbonyl oxygen distance (dNO) to be less than 2.5 Å and the angle formed by the amide nitrogen, amide hydrogen and carbonyl oxygen to be greater than 120 degrees (20). The statistical criterion used required that the geometric criteria be satisfied in all structures in order for a given hydrogen bond to be included as a restraint. This is a very stringent statistical requirement and was employed to avoid issues raised elsewhere (22).
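The geometric and statistical criteria above can be sketched in a few lines. This is a minimal illustration, assuming each structure in the family supplies Cartesian coordinates (in Å) for the amide nitrogen, amide hydrogen and carbonyl oxygen of a candidate pair; the function names and the example coordinates are hypothetical, while the 2.5 Å and 120 degree thresholds are those stated in the text.

```python
import math


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def angle_deg(a, vertex, c):
    """Angle a-vertex-c in degrees."""
    v1 = [x - y for x, y in zip(a, vertex)]
    v2 = [x - y for x, y in zip(c, vertex)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cosine = dot / (distance(a, vertex) * distance(c, vertex))
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


def is_hbond(n, h, o, max_d_no=2.5, min_angle=120.0):
    """Geometric test: d(N,O) below max_d_no and N-H...O angle above min_angle."""
    return distance(n, o) < max_d_no and angle_deg(n, h, o) > min_angle


def hbond_in_all(structures):
    """Statistical test: the geometric criteria must hold in every structure."""
    return all(is_hbond(n, h, o) for n, h, o in structures)


# Hypothetical (N, H, O) triples for one candidate pair in two family members.
linear = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.4, 0.0, 0.0))
bent = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.5, 0.0))
print(hbond_in_all([linear, linear]), hbond_in_all([linear, bent]))
```

Requiring the test to pass in every member of the family means that a single outlying structure removes a candidate hydrogen bond, which reflects the stringency noted above.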
A total of 26 hydrogen bonds were identified in the initial structures of ferrocytochrome c, each with distance variations smaller than 0.5 Å and angle variations smaller than 40 degrees within the family. Similarly, 14 hydrogen bonds were identified in the initial structures of ferricytochrome c. These hydrogen bonds were incorporated as restraints by encoding them as simple distance bounds (dNO < 2.5 Å). Linearity was not required.
E. Stereospecific Assignments
The floating chirality technique (21,22) was used to obtain stereospecific assignments and employed the family of structures refined by simulated annealing in Dspace. In cases where available restraints on side chain torsion angles are absent or limited, the restraint set only subtly suggests the correct prochiral assignment. The floating prochirality method, when properly implemented, has the potential of being able to assign
prochirality under conditions where only subtle statistical preference is observed. Prochiral assignments for a significant fraction of the γ- and δ-methyls of valine and leucine residues, respectively, the alpha hydrogens of glycine and the β-methylene centers were made by use of a protocol involving appropriate statistical tests and investigation of conditional probabilities (22).
F. Restrained Molecular Dynamics and Inclusion of Structural Water
The final stages of structure refinement were accomplished with the program X-PLOR (23). This involved 1 picosecond molecular dynamics calculations and restrained energy minimization. The structure calculations included a total of 1960 restraints for ferrocytochrome c and 1630 for ferricytochrome c. The restraints were composed of all the 2D and 3D NOE-based distance restraints, φ torsion angle restraints, and hydrogen bond restraints. Since violations of the NOE-based distance restraints in the Dspace refined structures were relatively small and infrequent, a standard refinement protocol was employed (24). The empirical energy function of X-PLOR was applied and included terms to represent covalent geometry, hard sphere van der Waals interactions and pseudo-energy terms to represent experimental distance and torsion angle restraints. No non-bonded, attractive potentials were employed. The force constant used to scale the van der Waals repulsion term was 4 kcal mol⁻¹ Å⁻⁴. The NOE and torsion angle restraints were represented by a square well potential, and the hydrogen bond restraints were represented by a soft square well potential. The NOE, torsion angle and hydrogen bond restraints were expressed in the empirical energy function using force constants of 50 kcal mol⁻¹ Å⁻², 1000 kcal mol⁻¹ rad⁻² and 200 kcal mol⁻¹ Å⁻², respectively.
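To make the role of a square well restraint term concrete, the following sketch evaluates a generic flat-bottomed harmonic penalty for NOE distance restraints. It is a textbook form written for illustration, not a reproduction of the X-PLOR energy function; the function names and the example distances are hypothetical, while the 50 kcal mol⁻¹ Å⁻² force constant and the 1.9 Å and 5.0 Å bounds are taken from the text.

```python
def square_well_energy(d, lower, upper, k):
    """Zero inside [lower, upper]; quadratic penalty k * violation**2 outside."""
    if d > upper:
        return k * (d - upper) ** 2
    if d < lower:
        return k * (lower - d) ** 2
    return 0.0


def total_noe_energy(distances, bounds, k=50.0):
    """Sum the penalty over restraints; bounds is a list of (lower, upper) pairs."""
    return sum(square_well_energy(d, lo, hi, k)
               for d, (lo, hi) in zip(distances, bounds))


# Hypothetical model distances (A) checked against the 1.9-5.0 A bounds.
model_distances = [3.0, 5.5, 1.4]
restraint_bounds = [(1.9, 5.0)] * 3
print(total_noe_energy(model_distances, restraint_bounds))
```

Distances inside the well contribute nothing, so the term penalizes only violations and leaves the structure otherwise free to satisfy covalent geometry and the remaining restraints.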
In the final stages of refinement, water molecules were introduced into the refined model and restrained by simple distance restraints, derived from observed NOEs (25), using force constants of 200 kcal mol⁻¹ Å⁻². Structural water was detected using a straightforward modification of the selective 3D ¹H NOESY-TOCSY experiment (26). For most purposes, where the bound water is in rapid exchange with bulk solvent on the chemical shift time scale, the relevant information contained in the NOESY-TOCSY experiment is found in a single two dimensional plane at the frequency of water. With this in mind, we further simplified the NOESY-TOCSY experiment by replacing the first two nonselective pulses of the experiment with pulses that selectively excite the water resonance. This allows collapse of the three dimensional experiment into a two dimensional experiment containing the same information and also allows a higher digital resolution experiment and/or higher signal-to-noise experiment to be carried out over the same total acquisition time. A total of 34 and 20 NOEs to bound water were detected in spectra of ferro- and ferricytochrome c, respectively. The number of water
molecules determined was defined by the minimum number of water molecules required to satisfy the NOEs to bound water in a manner consistent with the refined model for each redox state. Six and five waters in ferro- and ferricytochrome c, respectively, were determined to be long lived (i.e., lifetimes greater than 300 ps) and were introduced into the refined average structure.
IV. DISCUSSION
A. Summary of the Structures
[Figure 1 image; ribbon labels: N-helix, C-helix, 50s-helix, 70s-helix]
Figure 1: Ribbon representations of the models for the solution structure of oxidized (left) and reduced (right) horse cytochrome c. Drawn with the program Molscript (29).
Ribbon representations of the refined average structures of ferro- and ferricytochrome c are shown in Figure 1. There is no doubt that the overall topology of both structures is the same as previously reported for c-type cytochromes. The refined average structures for each redox state were derived from 44 independently refined structures. The average all-residue r.m.s.d. of the main chain of each structure from the refined average structure was 0.51 Å and 0.60 Å for the reduced and oxidized proteins, respectively. The average all-residue r.m.s.d. of the heavy atoms of each structure from the refined average structure was 1.05 Å and 1.29 Å for the reduced and oxidized proteins, respectively. Though the ribbon representations of the two structures are quite similar, there are clear differences, centered on residues comprising the 50s and 70s helices and the
relative alignment of the C- and N-terminal helices. In addition, there are significant differences in the conformation of the loop comprising residues 20 to 28. These differences are manifested in the observed r.m.s.d. of 2.6 Å for the heavy backbone atoms of the two refined average structures. The significance of these structural differences with respect to the kinetic and thermodynamic issues underlying interprotein electron transfer, protein-protein recognition and the relative stability of the two redox states is currently under investigation. The complete details of the refinement of the oxidized protein and its comparison to the reduced structure are being reported elsewhere (28).
B. Conclusions
Using a comprehensive battery of ¹H-based NMR experiments including three dimensional homonuclear spectra, we have determined the solution structures of horse ferro- and ferricytochrome c to high resolution. Key to this approach was the use of advanced statistical methods to compensate for the lack of extensive torsion angle constraints in the identification of prochiral hydrogens. The use of three dimensional spectroscopy in conjunction with good working models of the structures allowed the number of NOE-based restraints that could be determined to approach the density required for high resolution structures to be obtained.
ACKNOWLEDGMENTS
The authors are grateful to D. R. Hare and R. Morrison of Hare Research for the implementation of many necessary capabilities in the program Dspace and for providing computational facilities. This work was supported by NIH grants GM-35940, CA-06927 and RR-05539, by a grant from the Commonwealth of Pennsylvania, and by a grant from the F. Ripple Foundation. RAB is the recipient of an NIH Physician Scientist Award (CA01456).
REFERENCES
1. Wüthrich, K. (1986) NMR of Proteins and Nucleic Acids, Wiley, New York.
2. Vuister, G. W., Boelens, R., & Kaptein, R. (1988) J. Magn. Reson. 80, 176-185.
3. Vuister, G.
W., Boelens, R., Padilla, A., Kleywegt, G. J., & Kaptein, R. (1990) Biochemistry 29, 1829-1839.
4. Takano, T., & Dickerson, R. E. (1981) J. Mol. Biol. 153, 95-114.
5. Berghuis, A. M., & Brayer, G. D. (1992) J. Mol. Biol. 223, 959-976.
6. Feng, Y., Roder, H., & Englander, S. W. (1990) Biochemistry 29, 3494-3504.
7. Macura, S., & Ernst, R. R. (1980) Mol. Phys. 41, 95-117.
8. Rance, M., Sørensen, O. W., Bodenhausen, G., Wagner, G., Ernst, R. R., & Wüthrich, K. (1983) Biochem. Biophys. Res. Commun. 117, 479-485.
9. Bax, A., & Davis, D. G. (1985) J. Magn. Reson. 65, 355-360.
10. Ludvigsen, S., Anderson, K. V., & Poulsen, P. M. (1991) J. Mol. Biol. 217, 731-736.
11. Wand, A. J., DiStefano, D. L., Feng, Y., Roder, H., & Englander, S. W. (1989) Biochemistry 28, 186-194.
12. Feng, Y., Roder, H., Englander, S. W., Wand, A. J., & Di Stefano, D. L. (1989) Biochemistry 28, 195-203.
13. Wand, A. J., & Nelson, S. J. (1991) Biophys. J. 59, 1101-1112.
14. Neuhaus, D., Wagner, G., Vasak, M., Kagi, H. R., & Wüthrich, K. (1985) Eur. J. Biochem. 151, 257-273.
15. Pardi, A., Billeter, M., & Wüthrich, K. (1984) J. Mol. Biol. 180, 741-751.
16. Karplus, M. (1959) J. Chem. Phys. 30, 11-15.
17. Dellwo, M. J., & Wand, A. J. (1993) J. Am. Chem. Soc. 115, 1886-1893.
18. Havel, T. F. (1991) Prog. Biophys. Molec. Biol. 56, 43-78.
19. Nerdal, W., Hare, D. R., & Reid, B. R. (1988) J. Mol. Biol. 201, 717-739.
20. Stickle, D. F., Presta, L. G., Dill, K. A., & Rose, G. D. (1992) J. Mol. Biol. 220, 1143-1159.
21. Weber, P. L., Morrison, R., & Hare, D. R. (1988) J. Mol. Biol. 204, 483-487.
22. Beckman, R. A., Litwin, S., & Wand, A. J. (1993) J. Biomolecular NMR 3, 675-700.
23. Brunger, A. T. (1992) X-PLOR version 3.1 Manual, Yale University, New Haven, CT.
24. Nilges, M., Clore, G. M., & Gronenborn, A. M. (1988) F.E.B.S. Lett. 229, 317-324.
25. Qi, P. X., Urbauer, J. L., Fuentes, E. J., Leopold, M. F., & Wand, A. J. (1994) Nature Struc. Biology 6, 378-382.
26. Otting, G., Liepinsh, E., Farmer, B. T., II, & Wüthrich, K. (1991) J. Biomolecular NMR 1, 209-215.
27. Qi, P. X., Di Stefano, D. L.,
& Wand, A. J. (1994) Biochemistry 33, 6408-6417.
28. Fuentes, E. J., Beckman, R. A., Qi, P. X., & Wand, A. J. Biochemistry, submitted.
29. Kraulis, P. J. (1991) J. Appl. Cryst. 24, 946-950.
NMR Relaxation Methods To Study Ligand-Receptor Interactions David W. Hoyt, Jian-Jun Wang, and Brian D. Sykes Protein Engineering Network of Centres of Excellence and the Department of Biochemistry, University of Alberta, Edmonton, AB, T6G 2S2, Canada
I. INTRODUCTION
The formation of ligand/receptor complexes is an area of high interest from a biological standpoint in terms of understanding signal transduction in cellular growth and communication, and in terms of developing therapeutics to effect a desired biological response. With the increasing availability of receptors in quantities suitable for biophysical study, NMR spectroscopy should prove useful in the study of the structure of these protein complexes. While X-ray crystallography can be used to study protein complexes of a wide range of molecular weights, crystals cannot always be obtained. NMR spectroscopy, although limited to complexes of less than 100 kDa, can yield kinetic and flexibility information in addition to the structure of protein complexes. Recent work on the cyclosporin A / cyclophilin complex has shown a large difference between free and bound structures of the cyclosporin A ligand when binding to its biological target, cyclophilin (1). Studies like these emphasize the importance of obtaining a bound ligand structure in complex with its target protein. For the study of tight-binding complexes between ligand and protein, the recently developed NMR techniques of isotope-editing (2) and isotope-filtering (3) are best suited for gaining structural data for the bound ligand, receptor and intermolecular interactions between the proteins. Hydrogen-exchange studies may also be used to distinguish solvent-accessible residues of tightly complexed proteins from those protected in the binding site (4). In this paper we focus on the interactions of ligand and protein which are in the fast exchange limit on the NMR time scale. These more weakly binding protein-protein or peptide-protein complexes will be especially prevalent in studying the interactions of analogs or fragments of ligands with their native receptors or soluble binding domains.
Thus, these methods can yield information on ligand-receptor complexes of reduced affinity, such as most lead compounds, which may yield critical binding site data to be used in the rational design of therapeutic analogs. NMR relaxation methods discussed herein can be applied to obtain kinetic parameters, discriminate regions of flexibility within bound ligands, and determine the structure of bound ligands. As with all NMR relaxation measurements, structure information and dynamic information are interwoven, and the goal is to develop methods to obtain structural and dynamic information separately. Two approaches will be discussed in this paper; the first focuses on dynamic information and the second on structural information. These methods
are direct NMR relaxation time (T1 and T2) measurements and transferred NOE techniques, and are applied to two model systems. Firstly, T1 and T2 relaxation studies of the ¹H NMR resonances of aromatic rings and methyl groups are used as probes of the mobility of human transforming growth factor α (TGF-α) when bound to the extracellular domain of the epidermal growth factor receptor (EGFR-ED). Secondly, 2-D ¹H NMR transferred NOE techniques were used to determine the structure of desmopressin when bound to neurophysin-II (NP-II). While the relaxation approach could also be used for the desmopressin/NP-II system, application of the transferred NOESY approach to the TGF-α/EGFR-ED system would be more difficult because significant NOEs exist for the larger TGF-α prior to the addition of the receptor.
II. Materials and Methods
A. Protein / Peptide Production
Recombinant human TGF-α was provided by Dr. R. N. Harkins (Berlex Biosciences, Inc., Richmond, CA). The methods of expression, harvesting, refolding and purification have been previously described (5 and references therein). EGFR-ED was provided by Dr. Maureen O'Connor-McCourt (Biotechnology Research Institute, Montreal, PQ). Expression and purification using this system have been described previously (5 and references therein). Desmopressin was synthesized by solid-phase peptide synthesis methods and was purified by reversed-phase HPLC (6). Pure bovine neurophysin-II (NP-II) was provided by Dr. Esther Breslow (Cornell University, Ithaca, NY).
B. NMR Spectroscopy
Samples prepared for NMR experiments were in 90% H2O/10% D2O. All 1-D ¹H NMR relaxation spectra of the TGF-α/EGFR-ED system were acquired using a Varian Unity-600 NMR spectrometer at pH 6.0 and 298 K, and all 2-D transferred NOE experiments of the desmopressin/NP-II system were performed on a Varian VXR-500 NMR spectrometer at pH 5.7 and 278 K. The H2O resonance was suppressed by low-power coherent irradiation for 2.0 s prior to each pulse train. Spectra of TGF-α in the presence of EGFR-ED were observed at seven ratios of ligand to receptor (10:1 to 0.5:1), as were spectra of TGF-α and EGFR-ED separately. The maximum concentrations of TGF-α and EGFR-ED were 2 mM and 200 µM, respectively, and all TGF-α/EGFR-ED samples were made in a buffer of 50 mM phosphate, 10 mM KCl, 1 mM EDTA, and 0.5 mM NaN3. For all experiments, the spectral width in the ¹H dimension(s) was 8000 Hz. Longitudinal relaxation times T1 were measured using the standard inversion-recovery pulse sequence (7). Transverse relaxation times T2 were measured using the Meiboom-Gill modification of the Carr-Purcell experiment (8). Time delays between 180° pulses were kept small (2τ = 0.25 ms) to minimize dephasing due to spin-spin coupling. The relaxation curves were fit to

$M_z(t) = M_z(0)\,[1 - w \exp(-t/T_1)]$ and $M_{xy}(t) = M_{xy}(0) \exp(-t/T_2)$   (1)

for T1 and T2 measurements, respectively, where w is the 180° pulse inversion factor (w = 2 for a perfect 180° pulse). Values for w were typically 1.8-1.9.
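As an illustration of how the two curves of eqn. 1 can be fit, the sketch below linearizes each expression and applies an ordinary least-squares line. It is a simplified stand-in for a full nonlinear fit: it assumes noise-free intensities and, for the T1 case, that the prefactor of eqn. 1 (called `plateau` here) has been measured separately at a long delay. All function names and the synthetic data are hypothetical.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_t2(delays, intensities):
    """T2 from ln M_xy(t) = ln M_xy(0) - t / T2."""
    slope, _ = linear_fit(delays, [math.log(m) for m in intensities])
    return -1.0 / slope


def fit_t1(delays, intensities, plateau):
    """T1 and w from ln[1 - M_z(t) / plateau] = ln w - t / T1."""
    ys = [math.log(1.0 - m / plateau) for m in intensities]
    slope, intercept = linear_fit(delays, ys)
    return -1.0 / slope, math.exp(intercept)


# Synthetic, noise-free curves generated with T1 = 0.5 s, w = 1.85, T2 = 0.05 s.
t1_delays = [0.05, 0.1, 0.2, 0.4, 0.8]
t1_data = [1.0 - 1.85 * math.exp(-t / 0.5) for t in t1_delays]
t2_delays = [0.005, 0.01, 0.02, 0.05, 0.1]
t2_data = [math.exp(-t / 0.05) for t in t2_delays]

print(fit_t1(t1_delays, t1_data, plateau=1.0), fit_t2(t2_delays, t2_data))
```

With real data the logarithm amplifies noise at long delays, so a weighted or direct nonlinear least-squares fit of eqn. 1 would usually be preferred; the linearized form is shown only because it makes the role of each parameter explicit.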
NMR Analysis of Ligand-Receptor Interactions
Spectra of 10 mM samples of desmopressin were recorded in the absence and presence of 1.0 mM NP-II. Transferred NOESY and NOESY experiments were recorded at 150 ms and 300 ms, respectively. The spectral width was 6000 Hz in both dimensions, with 1024 complex points in the t2 dimension and 512 FIDs in the t1 dimension.
C. Theory
The equilibrium between free and bound ligand (TGF-α or desmopressin) in exchange with its receptor or carrier protein (EGFR-ED or NP-II, respectively) can be written as

$L + P \rightleftharpoons LP$   (2)
Fast exchange on the NMR timescale is defined by

$4\pi^2 (\delta_F - \delta_B)^2 \tau_B^2 \ll 1; \quad \tau_B / T_{2B} \ll 1$   (3)
where $\tau_B$ ($\tau_B = 1/k_{off}$) is the lifetime of the bound ligand, $(\delta_F - \delta_B)$ is the change in chemical shift in Hz upon binding, and $T_{2B}$ is the transverse relaxation time in the bound environment. The observed properties ($P_{obs}$) such as chemical shift ($\delta_{obs}$), longitudinal and transverse relaxation rates ($1/T_{1obs}$ and $1/T_{2obs}$, respectively), and linewidth ($\Delta\nu_{obs}$) are given by

$P_{obs} = p_F P_F + p_B P_B$   (4)
where $p_F$ and $p_B$ are the fractions of free and bound ligand. If $p_F \gg p_B$, then $P_{obs}$ will be dominated by the properties of the free ligand. Specifically, for the T1 and T2 relaxation measurements eqn. 4 becomes

$R_{1obs} = 1/T_{1obs} = p_F / T_{1F} + p_B / T_{1B}$
$R_{2obs} = 1/T_{2obs} = p_F / T_{2F} + p_B / T_{2B}$   (5)
where $1/T_{1F}$ and $1/T_{2F}$ are the longitudinal and transverse ¹H NMR relaxation rates of the free ligand, and $1/T_{1B}$ and $1/T_{2B}$ are those for the bound ligand. NMR relaxation of protons in proteins is dominated by the dipole-dipole relaxation mechanism. For aromatic, geminal or methyl protons this involves protons separated by a fixed internuclear distance, and for which the internal motion of the groups is simple and well understood (9-11). The NMR relaxation of protons relaxing via the intramolecular dipole-dipole mechanism [where the contributions of internal motion are not yet considered] is given by

$R_2 = 1/T_2 = A\,\{3\tau_{cB} + 5\tau_{cB}/[1 + (\omega_0 \tau_{cB})^2] + 2\tau_{cB}/[1 + (2\omega_0 \tau_{cB})^2]\}$
$R_1 = 1/T_1 = 2A\,\{\tau_{cB}/[1 + (\omega_0 \tau_{cB})^2] + 4\tau_{cB}/[1 + (2\omega_0 \tau_{cB})^2]\}$   (6)

where $A = (3/20)(N-1)(\gamma^4 \hbar^2 / r^6)$, N is the number of equivalent spins separated by distance r, $\omega_0$ is the NMR resonance frequency, and $\tau_{cB}$ is the correlation time for the internuclear vector between the protons in the bound state. The value of A is 5.0 x 10⁹ s⁻² for three equivalent protons separated by r = 1.8 Å such as a methyl group, and equal to 3.6 x 10⁸ s⁻² for two equivalent protons separated by r = 2.5 Å such as the proton pairs on the 2,3 and 5,6 positions of the aromatic ring of Tyr sidechains. The protons in the 3 and 5
positions of Tyr are relaxed through dipole-dipole interaction with one other proton (2 or 6) whereas the 2 and 6 protons are relaxed by the 3 or 5 protons as well as the β, β′ CH2 protons. Therefore the 3,5 proton resonance is ideal to follow and will be used as an example in this paper. For these aromatic groups, internal motion around the Cβ-Cγ bond does not affect the relaxation since the axis of internal motion is colinear with the internuclear vector. The internal motions of methyl groups scale A by a factor of 1/4 and have been considered elsewhere (5, 9, 10). In contrast to the relaxation studies, where the measurements are influenced mostly by the free ligand, the TRNOESY cross-peak intensities, $a_{ij}$, are dominated by the bound ligand even when $p_F \gg p_B$, since the cross-relaxation rate between protons in the bound ligand is so much faster than in the free ligand. This is expressed in the fast exchange limit as

$a_{ij}(\tau_m) \approx -[\,p_B W_{ijB} + p_F W_{ijF}\,]\,\tau_m \approx -[\,p_B W_{ijB}\,]\,\tau_m$   (7)
where $W_{ijF}$ and $W_{ijB}$ are the cross-relaxation rates for the ij proton pair for the free and bound ligand, respectively, and $W_{ijB} \gg W_{ijF}$. This result was first presented by Clore et al. (12) and is valid in the limit of short mixing times ($\tau_m$). A full theoretical evaluation of TRNOE has been presented elsewhere (13, 14). The $W_{ijB}$ values, once obtained, yield interproton distances in the bound ligand since $W_{ijB}$ is proportional to $\langle 1/r_{ijB}^6 \rangle$. Thus, $r_{ijB}$ can be evaluated using a suitable reference distance, where $r_{ijB} = \{(W_{ref}/W_{ijB}) \times r_{ref}^6\}^{1/6}$, under the assumption that the same isotropic rotational correlation time applies to all proton internuclear vectors.
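The two steps just described, extracting a bound-state cross-relaxation rate from a cross-peak intensity via eqn. 7 and converting it to a distance with a reference pair, can be sketched as follows. The function names and the intensities are hypothetical; the bound fraction of 0.1 and the 150 ms mixing time are assumed for the example, and the 2.5 Å reference is the aromatic proton-pair distance quoted above.

```python
def cross_relaxation_rate(a_ij, p_bound, tau_m):
    """W_ijB from eqn. 7 in the short-mixing-time limit: a_ij = -p_B * W_ijB * tau_m."""
    return -a_ij / (p_bound * tau_m)


def bound_distance(w_ij, w_ref, r_ref):
    """r_ijB = ((W_ref / W_ijB) * r_ref**6) ** (1/6)."""
    return ((w_ref / w_ij) * r_ref ** 6) ** (1.0 / 6.0)


# Hypothetical normalized cross-peak intensities at an assumed 150 ms mixing time.
P_BOUND = 0.1
TAU_M = 0.150
w_ref = cross_relaxation_rate(-0.64, P_BOUND, TAU_M)  # reference pair, 2.5 A
w_ij = cross_relaxation_rate(-0.01, P_BOUND, TAU_M)   # pair of unknown distance
print(round(bound_distance(w_ij, w_ref, 2.5), 2))
```

Because only the ratio of the two rates enters the distance, the bound fraction and mixing time cancel as long as both cross peaks come from the same spectrum, which is what makes the reference-distance approach practical.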
III. Results and Discussion
Two specific relaxation approaches are discussed; the first approach involves measurement of the NMR longitudinal and transverse relaxation times (T1 and T2) whereas the second approach involves measurement of transferred NOE intensities. In the relaxation time measurement approach, the distance information is assumed known and the focus is on dynamic information. In the transferred NOESY approach, the motion is considered constant and the focus is on the structural information. These are discussed in terms of two different biological systems.
A. T1 and T2 Relaxation Experiments
Figure 1 presents T1 and T2 relaxation data for the ring protons of Y38 of TGF-α. In our approach, 1-D ¹H NMR spectra from T1 and T2 experiments are collected for the ligand (TGF-α) in the absence of receptor (panels C & D) and in the presence of EGFR-ED at various mole ratios of ligand / receptor. In panels E & F, TGF-α is present in a 10:1 ratio to EGFR-ED. In these relaxation measurements a proton resonance is followed as a function of the delay time in the experiment and one observes the exponential increase of the signal in the T1 experiments or exponential decay of the signal in the T2 experiments. The series of resulting spectra are overlaid and presented in Figure 1 (panels C, D, E & F). The marked peak (*) in panels E & F corresponds to the 3,5 protons of the aromatic sidechain of Y38. In panels A & B of Figure 1, the intensities of these protons are plotted as a function of the delay times and the data points are curve-fitted using eqn. 1 of this paper. From these equations one can derive T1
[Figure 1 image; panels A and B: intensity vs. time (s); panels C-F: overlaid spectra vs. δ (ppm), 6.70-6.40 ppm]
Figure 1. Typical T1 and T2 relaxation measurements of TGF-α. Relaxation data for the aromatic sidechain protons of residue Y38 are shown for TGF-α in the absence (panels C+D) and presence (panels A+B & E+F) of EGFR-ED. Panels A, C and E depict T1 experimental data and panels B, D and F depict T2 data. Panels A & B are plots of typical curve-fit data.
and T2 values which can be discussed in terms of relaxation rates R1 and R2, where R1 = 1/T1 and R2 = 1/T2. Relaxation rate data for the 3,5 protons of Y38 and a representative set of methyl proton resonances (5) are shown in Table I. The R1 data are not very useful in this system because spin-spin diffusion leads to a similar R1 value for all resonances, free or bound (5, 10). The R2 data are able to discriminate flexible regions within bound TGF-α by comparing the slopes of the R2 vs. pB lines and extrapolating to the bound relaxation rate R2B. pB is calculated assuming that the receptor is completely saturated with TGF-α (KD ≈ 0.1-1 µM). The relaxation rates are a reflection of the mobility or flexibility afforded the groups, and R2B can be simplified from eqn. 6 in the limit $(\omega_0 \tau_{cB})^2 \gg 1$ to

$R_{2B} = \pi \Delta\nu_B = 1.1\,\tau_{cB}^{eff}$   (for aromatics)
$R_{2B} = 3.7\,\tau_{cB}^{eff}$   (for methyl groups)   (8)

in which the correlation time for internal methyl or aromatic rotation ($\tau_{cB}^{eff}$) is in ns. The data demonstrate that the proton relaxation rates of some resonances are greatly enhanced in the presence of receptor while others (such as V1) are not. The low relaxation rates indicate residues of high mobility. Therefore, residues in which the sidechain and backbone motions are retained in the presence of receptor show little enhancement of their relaxation rates and are thus not likely to be involved in significant interactions. Conversely, greatly enhanced relaxation rates indicate proton resonances of residues in regions which encompass the ligand-binding site.
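The link between eqn. 6 and the numerical prefactors of eqn. 8 can be checked with a short calculation. The sketch below evaluates A in SI units, which requires the factor (μ0/4π)² that is implicit in the form of A given above; that factor, the physical constants, the 600 MHz field and the 10 ns correlation time are assumptions introduced for this example, and the function names are hypothetical.

```python
import math

GAMMA_H = 2.6752218744e8  # proton gyromagnetic ratio, rad s^-1 T^-1
HBAR = 1.054571817e-34    # reduced Planck constant, J s
MU0_OVER_4PI = 1.0e-7     # T m A^-1; SI factor implicit in the expression for A


def dipolar_constant(n_spins, r_angstrom):
    """A = (3/20)(N-1) gamma^4 hbar^2 / r^6, in s^-2, evaluated in SI units."""
    r = r_angstrom * 1.0e-10
    coupling = MU0_OVER_4PI * GAMMA_H ** 2 * HBAR
    return (3.0 / 20.0) * (n_spins - 1) * coupling ** 2 / r ** 6


def r2_rate(a, tau_c, omega0):
    """Transverse rate of eqn. 6 (s^-1)."""
    return a * (3.0 * tau_c
                + 5.0 * tau_c / (1.0 + (omega0 * tau_c) ** 2)
                + 2.0 * tau_c / (1.0 + (2.0 * omega0 * tau_c) ** 2))


def r1_rate(a, tau_c, omega0):
    """Longitudinal rate of eqn. 6 (s^-1)."""
    return 2.0 * a * (tau_c / (1.0 + (omega0 * tau_c) ** 2)
                      + 4.0 * tau_c / (1.0 + (2.0 * omega0 * tau_c) ** 2))


omega0 = 2.0 * math.pi * 600.0e6       # assumed 600 MHz proton frequency
a_methyl = dipolar_constant(3, 1.8)    # three equivalent protons, r = 1.8 A
a_aromatic = dipolar_constant(2, 2.5)  # two equivalent protons, r = 2.5 A
tau_c = 10.0e-9                        # hypothetical 10 ns correlation time

print(a_methyl, a_aromatic)
print(r2_rate(a_aromatic, tau_c, omega0), r1_rate(a_aromatic, tau_c, omega0))
```

In the slow-tumbling limit only the first term of R2 survives, so R2 approaches 3Aτ. With τ expressed in ns this gives a prefactor close to 1.1 for the aromatic pair, and close to 3.7 for a methyl group once A is scaled by 1/4 for internal rotation, in line with eqn. 8.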
David W. Hoyt et al.
Table I. Relaxation rates for TGF-α proton resonances† as a function of added EGFR-ED
A. Longitudinal relaxation rates (R1) [s⁻¹]
[TGF-a]/ [EGFRED] 10.0 7.5 5.0 3.5 2.0 1.0 0.5 B.
V1
pB
ooo
0.10 0.13 0.20 0.29 0.50 1.00 1.00
_-g
2.0 2.1 2.2 2.3 2.4
L24
Y38
3.0 3.3 3.7 4.4
08 1.8 2.0 2.5 2.8 3.4
L24
Y38
"To
A46
\3
L48
TZ
3.1 3.2 3.3
2.5 2.6 2.9 3.2
A46
L48
Transverse relaxation rates (R2) [s⁻¹]
[TGF-a]/ [EGFR-ED] 10.0 7.5 5.0 3.5 2.0 1.0 0.5
V1
pB CTDD 0.10 0.13 0.20 0.29 0.50 1.00 1.00
—4 6 6 6 7 8
To 23 11 40 66
12 24 28 33 38 65
33 62 72
5 25 29 50 90
† Proton resonances are the 3,5 protons of the aromatic ring sidechain for Y38 and a representative set of methyl resonances for the other residues shown.
B. TRNOE Experiments
In these experiments, we have studied the peptide desmopressin in the presence of 0.1 mole equivalents of bovine NP-II. The affinity of desmopressin for bovine NP-II is low, so that the interaction is in the NMR fast exchange limit on both the chemical shift and relaxation time scales (15). In panel B of Figure 2, the aromatic to sidechain region of the 1H NMR 2-D NOESY spectrum of free desmopressin is shown. Only four cross-peaks of desmopressin are seen in this region, which correspond to the intraresidue peaks of F3. Very few other contacts are observed, partially because the rotational correlation time for desmopressin in the free state is near the inverse of ω0, and partially because of the flexibility of the peptide (16). When NP-II was added (Figure 2A), many new transferred NOE crosspeaks developed, caused by binding of desmopressin to NP-II. The binding induces desmopressin, which is flexible in aqueous solution, to become more rigid and gives it a longer correlation time (τcB > τcF). It is also possible that the structure of desmopressin changes when bound to the carrier protein. The new crosspeaks indicate a variety of new contacts within desmopressin, which are labeled in Figure 2. In other regions of the 2-D transferred NOESY spectrum, contacts can also be seen between the peptide and the protein, which can be very useful in identifying the binding site of the ligand and interactions in the desmopressin/NP-II complex. From the full transferred NOE data (which include more than 230 crosspeaks for the 9 residue peptide), distance restraints can be derived for calculating the bound structure of desmopressin using molecular modeling.
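One common way to turn cross-peak intensities into approximate distance restraints is the isolated spin-pair approximation, in which the cross-relaxation rate scales as r⁻⁶ and a cross-peak between two protons at a known, fixed separation serves as the calibration. The sketch below illustrates only that scaling, under the additional assumption of a uniform bound correlation time. The reference distance, the intensities, and the function name are hypothetical and are not taken from the desmopressin data.

```python
def noe_distance(intensity, ref_intensity, ref_distance):
    """Distance from r = r_ref * (I_ref / I)**(1/6).

    Intensities are cross-peak volumes in the same arbitrary units.
    """
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("cross-peak intensities must be positive")
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)


REF_DISTANCE = 2.5     # angstrom; assumed calibration distance (hypothetical)
REF_INTENSITY = 100.0  # arbitrary units (hypothetical)

# A cross-peak 64-fold weaker than the reference corresponds to twice the distance.
weak_peak_distance = noe_distance(REF_INTENSITY / 64.0, REF_INTENSITY, REF_DISTANCE)
print(f"{weak_peak_distance:.2f} angstrom")
```

Because of the sixth-root dependence, large errors in intensity translate into small errors in distance, which is why such estimates are normally used as loose upper bounds rather than exact values.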
Figure 2. Transferred NOESY and NOESY spectra of desmopressin. The aromatic-sidechain region of the transferred NOESY spectrum of desmopressin in the presence of 0.1 mole equivalent NP-II (panel A) and the same spectral region of the NOESY spectrum of desmopressin in the absence of NP-II (panel B). Horizontal axis: F1 (ppm).
IV. Conclusions
In both of the NMR relaxation approaches presented herein, the parameters measured, R2ij and σij, are proportional to $\langle \tau_{cB}/r_{ij}^{6} \rangle$, where $\langle\;\rangle$ indicates an ensemble average. In the first approach one assumes the distance between particular proton pairs is constant and known, and looks for changes in local mobility. The caveat on this approach is that other proton internuclear contacts, or other sources of line broadening such as exchange broadening, can contribute to the relaxation. In the second approach one assumes the bound correlation time to be uniform within the bound ligand and looks for changes in proton-proton distances. The obvious caveat here is that some section of the ligand may still be flexible when bound, and the contribution of the protein protons to the relaxation is hard to evaluate.

Acknowledgments
The TGF-α/EGFR-ED and the desmopressin/NP-II projects were supported by the Government of Canada through the Networks of Centres of Excellence Program. DWH wishes to express his thanks for additional financial support and the provision of TGF-α samples by Berlex Biosciences, Inc., USA. DWH also thanks Dr. Maureen O'Connor-McCourt's lab for providing EGFR-ED. JJW wishes to thank Dr. Esther Breslow for providing NP-II and Paul Semchuk for peptide synthesis of desmopressin.

References
1. Weber, C., Wider, G., von Freyberg, B., Traber, R., Braun, W., Widmer, H., & Wüthrich, K. (1991). Biochemistry 30, 6563-6574.
2. Tsang, P., Rance, M., Fieser, T. M., Ostresh, J. M., Houghten, R. A., Lerner, R. A., & Wright, P. E. (1992). Biochemistry 31, 3862-3871.
3. Ikura, M. & Bax, A. (1992). J. Am. Chem. Soc. 114, 2433-2440.
4. Englander, S. W. & Mayne, L. (1992). Annu. Rev. Biophys. Biomol. Struct. 21, 243-265.
5. Hoyt, D. W., Harkins, R. N., Debanne, M. T., O'Connor-McCourt, M., & Sykes, B. D. (1994). Submitted to Biochemistry.
6. Wang, J. J., Hodges, R. S., & Sykes, B. D. (1994a). Submitted to Int. J. Pept. Protein Res.
7. Vold, R. L., Waugh, J. S., Klein, M. P., & Phelps, D. E. (1968). J. Chem. Phys. 48, 3831-3832.
8. Meiboom, S. & Gill, D. (1958). Rev. Sci. Instr. 29, 688-691.
9. Marshall, A. G., Schmidt, P. G., & Sykes, B. D. (1972). Biochemistry 11, 3875-3879.
10. Sykes, B. D., Hull, W. E., & Snyder, G. H. (1978). Biophysical J. 21, 137-146.
11. Goldman, M. (1988). Quantum Description of High-Resolution NMR in Liquids (J. S. Rowlinson, ed.). International Series of Monographs on Chemistry 15, pp. 243-248, Clarendon Press, Oxford.
12. Clore, G. M. & Gronenborn, A. M. (1982). J. Magn. Reson. 48, 402-417.
13. Landy, A. B. & Rao, B. D. N. (1990). J. Magn. Reson. 81, 371-377.
14. Campbell, A. P. & Sykes, B. D. (1991). J. Magn. Reson. 93, 77-92.
15. Wang, J. J., Hodges, R. S., Breslow, E., & Sykes, B. D. (1994b). Manuscript in preparation.
16. Sykes, B. D. (1994). In "Peptides: Chemistry, Structure and Biology" (R. S. Hodges and J. A. Smith, eds.), pp. 1099-1102, ESCOM, Leiden, The Netherlands.
SECTION IX Peptide Synthesis
Application of 2-Chlorotrityl Resin: Simultaneous Synthesis of Peptides which Differ in the C-Termini Anita L. Hong, Tin T. Le, and Tning Phan AnaSpec, Inc., San Jose, CA 95131
I. Introduction
Peptides which differ in their C-termini often exhibit different structure-activity relationships. For example, the C-terminal amide of des-pentapeptide(B26-30) insulin (1) was shown to be 100% active, in contrast to the free acid (20-30%) (2). Semisynthesis employing proteases such as trypsin, chymotrypsin, and carboxypeptidase Y has been used to modify the C-termini of these biologically active peptide hormones (3). One of the limitations of enzymatic peptide synthesis is the substrate specificity of the proteases. Tjoeng et al. (4) reported multiple peptide synthesis using a single support; a peptide mixture was obtained and the peptides were separated by HPLC. We have demonstrated that peptides which differ in their C-termini can be simultaneously synthesized in one reaction vessel by employing resins that possess different cleavage properties. The resins that we used were the weak acid labile 2-chlorotrityl resins (5-7) and the TFA cleavable Wang resins. The success of this approach was shown by the co-synthesis of (a) ACTH(4-10) with ACTH(4-11) and (b) Neuropeptide Y, a C-terminal amide peptide, with its corresponding C-terminal free acid peptide.

II. Materials and Methods
A. Reagents and Materials
Fmoc protected amino acids, 2-chlorotrityl (Clt) resins, Rink amide MBHA resin, Wang resins, HBTU and HOBt are commercially available from AnaSpec, Inc. The side chain protecting groups for the amino acids are t-butyl for Asp, Glu, Ser, Thr, Tyr; trityl for Asn, Cys, Gln, His; Pmc for Arg; and Boc for Lys. N-methylpyrrolidinone (NMP), Omnisolv grade, was purchased from VWR. Piperidine and diisopropylethylamine (DIEA) were purchased from Aldrich.
TECHNIQUES IN PROTEIN CHEMISTRY VI Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.
Anita L. Hong et al.
All syntheses were performed on the Applied Biosystems Peptide Synthesizers Model 430A or 431A. Peptides were synthesized by solid phase peptide synthesis employing Fmoc chemistry. Fmoc amino acids were activated using one equivalent of 0.45 M HBTU/HOBt solution and two equivalents of DIEA. NMP was used as the coupling medium. Synthesis was performed starting with the appropriate resin. Fmoc protecting groups were removed using 20% piperidine/NMP. Cleavage of the protected peptide from the Clt-resin was accomplished using 30% acetic acid in DCM for 3 hours at room temperature. Alternatively, cleavage of the protected peptide from the Clt-resin can be achieved by using a 1:2:7 mixture of acetic acid:trifluoroethanol:DCM for 45 minutes at room temperature. Deprotection/cleavage of the peptide from the Wang resin was performed using trifluoroacetic acid in the presence of a scavenger mixture (0.75 g phenol, 0.25 ml EDT, 0.5 ml thioanisole, 0.5 ml water and 10 ml TFA) for 2-3 hours at room temperature. Peptides were purified to >95% purity by reverse phase HPLC using C18 columns and an AB gradient from 0% B, where A is 0.1% TFA in water and B is 0.08% TFA in acetonitrile. Analytical HPLC was performed on an HP 1090 Liquid Chromatograph equipped with a diode-array UV detector. Capillary electrophoresis was performed on a Waters Quanta 4000E. The authenticity of the peptides was confirmed by molecular weight determination using the Vestec 201 electrospray quadrupole mass spectrometer or the Finnigan MAT 900 magnetic sector mass spectrometer. Amino acid analysis was performed on the Applied Biosystems Model 420H amino acid analyzer.

III. Results
A. Synthesis of ACTH(4-10) and ACTH(4-11)
ACTH(4-10), Met-Glu-His-Phe-Arg-Trp-Gly, and ACTH(4-11), Met-Glu-His-Phe-Arg-Trp-Gly-Lys, were co-synthesized using a mixture of Clt resin and Wang resin. The synthesis was started with Fmoc-Lys(Boc)-Wang resin. After the Fmoc group was removed, followed by several washing steps, Fmoc-Gly was coupled to form the Fmoc-Gly-Lys(Boc)-Wang resin.
Peptide Synthesis with 2-Chlorotrityl Resin
To begin the synthesis of the second peptide, Gly-Clt resin was added. The synthesis was continued to complete the chain assembly for both sequences. ACTH(4-10) was isolated by cleaving the resin mixture with 30% acetic acid/DCM, followed by removal of the protecting groups with the TFA/scavenger mixture. The crude peptide was purified to yield 25 mg of peptide with a purity of >95%. HPLC profiles of the crude and purified ACTH(4-10) are shown in Figures 1 and 2. ACTH(4-11) was obtained by cleaving the remaining resin with the TFA/scavenger mixture. The crude peptide obtained was purified by reverse phase HPLC to yield 83 mg of product with a purity of >95%. HPLC profiles of the crude and purified ACTH(4-11) are shown in Figures 3 and 4. The HPLC profile of a mixture of purified ACTH(4-10) and ACTH(4-11) is shown in Figure 5. Mass spectrometric analysis of ACTH(4-10) and ACTH(4-11) showed mass units of 962 (Theoretical: 962.4) and 1090.5 (Theoretical: 1090.5), respectively. Amino acid analysis showed both peptides to have the correct amino acid compositions.
Figure 1. HPLC Profile of Unpurified ACTH (4-10). Horizontal axis: Time (min).
Figure 2. HPLC Profile of Purified ACTH (4-10). Horizontal axis: Time (min).
Figure 3. HPLC Profile of Unpurified ACTH (4-11). Horizontal axis: Time (min).
Figure 4. HPLC Profile of Purified ACTH (4-11). Horizontal axis: Time (min).
Figure 5. HPLC Profile of a Mixture of Purified ACTH (4-10) and ACTH (4-11). Horizontal axis: Time (min); inset UV spectra, Wavelength (nm).
B. Synthesis of Neuropeptide Y (YPSKPDNPGEDAPAEDMARYYSALRHYINLITRQRY-amide) and its corresponding C-terminal free acid peptide
Synthesis of Neuropeptide Y (NPY) was started by first adding Rink amide MBHA resin (0.14 mmol) to the reaction vessel. One synthesis cycle was performed to load Fmoc-Tyr(O-t-butyl) onto the Rink amide MBHA resin. To synthesize the corresponding peptide with a C-terminal acid, Neuropeptide Y free acid (NPYFA), 0.07 mmol of Tyr-Clt resin was then added to the reaction vessel. The synthesis was continued to complete the chain assembly for both sequences. NPYFA was isolated by cleaving the resin mixture with 1:2:7 acetic acid:trifluoroethanol:DCM, followed by removal of the protecting groups with the TFA/scavenger mixture. HPLC profiles of the crude and purified NPYFA are shown in Figures 6 and 7. NPY was obtained by cleaving the remaining resin with the TFA/scavenger mixture. The crude peptide was purified to yield 43 mg of product with a purity of >95%. Figures 8 and 9 show the HPLC profiles of the crude and purified NPY. Analytical HPLC as well as capillary electrophoresis of a mixture of NPY and NPYFA showed two distinct peaks corresponding to the two peptides, as shown in Figures 10 and 11. Mass spectrometric analysis of NPY and NPYFA showed mass units of 4269.1 (Theoretical: 4269.08) and 4270.1 (Theoretical: 4270.07), respectively. Amino acid analysis showed the peptides to have the correct amino acid compositions. A summary of the synthesis results for the two sets of peptides is shown in Table 1.
Figure 6. HPLC Profile of Unpurified NPYFA. Horizontal axis: Time (min).
Figure 7. HPLC of Purified NPYFA. Horizontal axis: Time (min).
Figure 8. HPLC Profile of Unpurified NPY. Horizontal axis: Time (min).
Figure 9. HPLC Profile of Purified NPY. Horizontal axis: Time (min).
Figure 10. HPLC Profile of a Mixture of Purified NPY and NPYFA. Horizontal axis: Time (min); inset UV spectra, Wavelength (nm).
Figure 11. Capillary Electrophoresis Profile of a Mixture of NPY and NPYFA
Run conditions: capillary, 75 µm; date acquired, 05/21/94; detection, 185 nm; electrolyte, 50 mM NaP, pH 2.5; injection mode, hydrostatic; run voltage, 15 kV; temperature, 30.

Peak   Ret. Time (min)   Area (µV*sec)   % Area   Height (µV)   % Height   Int Type
1      13.925            18467           0.52     1598          0.84       BV
2      14.217            60421           1.70     3012          1.59       VV
3      14.858            77473           2.18     5282          2.79       VV
4      15.250            1544276         43.46    91194         48.22      VV
5      15.825            1716179         48.30    83738         44.27      VV
6      16.908            136349          3.84     4310          2.28       VB
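The percentage columns of the Figure 11 peak report follow directly from the integrated areas and heights. As a small check, assuming only that each percentage is the peak value divided by the sum over the six reported peaks, the arithmetic can be written as follows (the function name is hypothetical):

```python
def percent_of_total(values):
    """Each value as a percentage of the column total, rounded to two decimals."""
    total = sum(values)
    return [round(100.0 * v / total, 2) for v in values]


# Integrated areas (uV*sec) and heights (uV) of peaks 1-6 in the Figure 11 report
areas = [18467, 60421, 77473, 1544276, 1716179, 136349]
heights = [1598, 3012, 5282, 91194, 83738, 4310]

percent_area = percent_of_total(areas)
percent_height = percent_of_total(heights)
print(percent_area)
print(percent_height)
```

On this basis the two major peaks (4 and 5) together account for a little under 92% of the integrated area.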
Table 1. Summary of the Synthesis Results of Two Sets of Peptides which Differ in their C-Termini

Peptide       Starting Resin          Amount (mmol)   Yield (mg)   Mol. Weight Found (Theor.)
ACTH (4-10)   AA*-Chlorotrityl        0.125           83           962 (962.5)
ACTH (4-11)   Fmoc-Lys(Boc)-Wang      0.125           25           1090.5 (1090.5)
NPY           Rink Amide MBHA         0.14            43           4269.1 (4269.08)
NPYFA         AA*-Chlorotrityl        0.07            56           4270.1 (4270.07)

* Starting resins used for synthesizing ACTH(4-10) and NPYFA were Gly-Clt-resin and Tyr-Clt-resin, respectively.
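The theoretical masses quoted for the four peptides can be checked from the sequences. The sketch below sums standard monoisotopic residue masses, and is an illustration rather than the authors' procedure. Under that assumption, the quoted ACTH values appear to correspond to the protonated ion [M+H]+, while the quoted NPY and NPYFA values appear to correspond to the neutral monoisotopic mass; the function and constant names are hypothetical.

```python
# Standard monoisotopic residue masses (Da)
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056        # H2O added for the free N- and C-termini
PROTON = 1.00728        # mass of a proton, for [M+H]+
AMIDE_SHIFT = -0.98402  # C-terminal OH replaced by NH2


def monoisotopic_mass(sequence, c_terminal_amide=False):
    """Neutral monoisotopic mass of a linear, unmodified peptide."""
    mass = sum(MONO[residue] for residue in sequence) + WATER
    if c_terminal_amide:
        mass += AMIDE_SHIFT
    return mass


ACTH_4_10 = "MEHFRWG"
ACTH_4_11 = "MEHFRWGK"
NPY = "YPSKPDNPGEDAPAEDMARYYSALRHYINLITRQRY"

print(f"ACTH(4-10) [M+H]+: {monoisotopic_mass(ACTH_4_10) + PROTON:.2f}")
print(f"ACTH(4-11) [M+H]+: {monoisotopic_mass(ACTH_4_11) + PROTON:.2f}")
print(f"NPY (amide) M:     {monoisotopic_mass(NPY, c_terminal_amide=True):.2f}")
print(f"NPYFA (acid) M:    {monoisotopic_mass(NPY):.2f}")
```

The amide and free acid forms of the same sequence differ by just under 1 Da, which is why the two NPY products need a mass measurement accurate to well below 1 Da at about 4270 Da to be told apart.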
IV. Conclusion
We have demonstrated that peptides which differ in their C-termini can be simultaneously synthesized in one reaction vessel by employing resins that possess different cleavage properties. This synthesis strategy can also be used for the synthesis of multiple antigenic peptide systems, MAPS (8), and their corresponding des-lysine core sequences. Moreover, this strategy can be expanded to the synthesis of more than two peptides by employing other resins such as the HF cleavable Pam resins and the photo-labile resins.
References
1. Fischer, W. H., Saunders, D., Brandenburg, D., Wollmer, A., Zahn, H. (1985) Biol. Chem. Hoppe-Seyler 366, 521-525.
2. Gattner, H. G. (1975) Z. Physiol. Chem. Hoppe-Seyler 356, 1397-1404.
3. Widmer, F. and Johansen, J. T., In Alitalo, K., Partanen, P. and Vaheri, A. (Eds.) (1985) Synthetic Peptides in Biology and Medicine, Elsevier Science Publishers, Amsterdam, p. 79-86.
4. Tjoeng, F. S., Towery, D. S., Bulock, J. W., Whipple, D. E., Fok, K. F., Williams, M. H., Zupec, M. E., Adams, S. P. (1990) Int. J. Peptide Protein Res. 35, 141-146.
5. Barlos, K., Chatzi, O., Gatos, D., Stavropoulos, G. (1991) Int. J. Peptide Protein Res. 37, 513-520.
6. Barlos, K., Gatos, D., Kapolos, S., Poulos, C., Schafer, W., Wenqing, Y. (1991) Int. J. Peptide Protein Res. 38, 553-561.
7. Barlos, K., Gatos, D., Kutsogianni, S., Papaphotiou, G., Poulos, C., Tsegenidis, T. (1991) Int. J. Peptide Protein Res. 38, 562-568.
8. Tam, J. P. (1988) Proc. Natl. Acad. Sci. 85, 5409-5413.
Correlation of Cleavage Techniques With Side-Reactions Following Solid-Phase Peptide Synthesis
Gregg B. Fields,¹ Ruth H. Angeletti,² Lynda F. Bonewald,³ William T. Moore,⁴ Alan J. Smith,⁵ John T. Stults,⁶ and Lynn C. Williams⁷
¹Dept. Lab Medicine & Pathology, Univ. Minnesota, Minneapolis, MN 55455
²Dept. Develop. Biol. & Cancer, Albert Einstein College Med., Bronx, NY 10461
³Depts. Med. & Biochem., Univ. Texas Health Sci. Center, San Antonio, TX 78284
⁴Dept. Pathology & Lab Medicine, Univ. Pennsylvania School Med., Philadelphia, PA 19104
⁵Beckman Center, Stanford University Medical Center, Stanford, CA 94305
⁶Genentech, Inc., South San Francisco, CA 94080
⁷Norris Cancer Research Institute, University of Southern California, Los Angeles, CA 90033
I. Introduction
Solid-phase peptide synthesis is routinely used in research ranging from the elucidation of chemical mechanisms to the development of potential therapeutics. The solid-phase method was originally designed for the synthesis of a single peptide at a time, but has more recently been applied to multiple peptide synthesis and the creation of synthetic peptide libraries. The vast number of diverse products that can be created with peptide libraries has made the need for highly efficient synthetic methods especially critical. The Association of Biomolecular Resource Facilities (ABRF) Research Committee on Peptide Synthesis (PS) was formed to evaluate the quality of the synthetic methods utilized in its member laboratories for peptide synthesis. Peptide synthesis, as defined by this committee, includes the chemistries used for peptide assembly and cleavage and the methods used for characterization of the final product. Studies in 1991 and 1992 requested the synthesis of test peptides by ABRF member laboratories. Products from these syntheses were characterized by amino acid analysis (AAA), reversed-phase high-performance liquid chromatography (RP-HPLC), capillary electrophoresis (CE), and mass spectrometry. Results were somewhat unexpected, as 13% of the 94 ABRF laboratory crude samples submitted for the 1991 and 1992 studies did not contain any of the desired peptide products (1,2). The respective cleavage conditions of Fmoc and Boc solid-phase peptide synthesis were believed to be the primary source of synthetic difficulties, since most non-desired products were the result of covalent adducts, not deletions. The 1993 study focused on peptide-resin cleavage conditions, as ABRF member laboratories were supplied with a peptide-resin that was assembled by the ABRF PS Committee. Even with the preassembled peptide-resin, 20% of the 46 crude samples did not contain any of the desired product (3), further emphasizing problems in cleavage conditions.
This year a study was designed whereby problems in peptide assembly versus cleavage were evaluated by AAA, RP-HPLC, Edman degradation sequence analysis, electrospray mass spectrometry (ESMS), and matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS). The ABRF PS Committee specified a limited number of cleavage conditions to determine if certain protocols could be generally recommended. A total of 45 ABRF laboratories participated in the study by
Gregg B. Fields et al.
supplying 82 crude samples of a peptide whose sequence was designed by the ABRF PS Committee. The sequence was identical to that of the 1991 ABRF PS peptide (1), and thus would allow for direct comparison of the efficiency of methodologies used in laboratories now versus 3 years ago.
II. Materials and Methods
Participating ABRF laboratories were asked to synthesize the following peptide by the methodology most commonly used in their facility:

H-Val-Lys-Lys-Arg-Cys-Ser-Met-Trp-Ile-Ile-Pro-Thr-Asp-Asp-Glu-Ala-OH

This particular sequence, which is identical to that used for the 1991 study (1), was chosen based on the potential for side-reactions during assembly, side-chain deprotection, and cleavage (1). Recommended side-chain protecting group strategies were Arg(Tos), Asp(OBzl), Cys(Meb), Glu(OBzl), Lys(ClZ), Ser(Bzl), Thr(Bzl), and Trp(For) for Boc-based chemistry and Arg(Pmc), Asp(OtBu), Cys(Trt), Glu(OtBu), Lys(Boc), Ser(tBu), Thr(tBu), and Trp(Boc) for Fmoc-based chemistry (4). Participants were asked to choose either HF or trimethylsilyl trifluoromethanesulfonate (TMSOTf) cleavage methods following Boc chemistry. Following Fmoc chemistry, participants were asked to use either reagent K (5) or reagent B (6) for peptide-resin cleavage and to work up the product by either (i) filtering the resin and precipitating the product with ether or (ii) filtering the resin, diluting the filtrate with water and extracting with ether, and lyophilizing the product. Samples containing ~5 mg of crude product were supplied to the ABRF PS Committee in coded form via a third party to maintain participant anonymity but allow the participants to identify data sets resulting from their samples.
A. Analytical RP-HPLC
Samples were dissolved in 0.1% aqueous TFA and ~50 µg analyzed on a Perkin Elmer Series 4 HPLC using a Vydac C18 column (300 Å pore size, 4.6 x 250 mm). The linear gradient extended from 0.1% aqueous TFA to 70% acetonitrile (containing 0.09% TFA) over 33 min. The flow rate was 2 mL/min and the absorbance was monitored at 214 nm using a Perkin Elmer LC 95 detector. Samples were injected by a Perkin Elmer SS 100 autosampler. Quantitation was by a Nelson Model 1020 Data System.
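For orientation, the programmed mobile-phase composition at any time in such a linear gradient follows from simple interpolation. The sketch below assumes an idealized gradient running from 0% to 70% acetonitrile over 33 min and ignores system dwell volume and column dead time, so it gives the programmed composition rather than the composition actually reaching the column; the function name and defaults are hypothetical.

```python
def acetonitrile_percent(time_min, start=0.0, end=70.0, duration_min=33.0):
    """Programmed % acetonitrile at time_min for a linear gradient.

    Values are clamped to the start and end compositions outside the gradient.
    """
    fraction = min(max(time_min / duration_min, 0.0), 1.0)
    return start + (end - start) * fraction


# Programmed composition halfway through the gradient
halfway = acetonitrile_percent(16.5)
print(f"{halfway:.1f}% acetonitrile at 16.5 min")
```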
B. Amino Acid Analysis
Samples (~0.5 µg) were hydrolyzed for 24 h at 112 °C in 100 µL 6 N HCl, 0.2% phenol. Analysis was performed on a Beckman 6300 with a sulfated polystyrene cation-exchange column (0.4 x 25 cm). Quantitation was by a Beckman 7300.
C. Sequence Analysis
Edman degradation sequence analysis of selected samples (dissolved in 0.1% TFA-20% acetonitrile) was performed on an Applied Biosystems 477A Protein Sequencer/120A Analyzer using BioBrene Plus as described (7). In order to identify deprotection products and deletions, 800-900 pmol of sample was sequenced.
ABRF 1994 Peptide Synthesis Study
D. Mass Spectrometry
ESMS was performed with a Fisons VG Quattro outfitted with a Fisons Electrospray Source. Samples were dissolved in 1.0 mL of 50% methanol-1% acetic acid, then diluted 1:10 with 50% acetonitrile-1.0 mM ammonium acetate to give 25 pmol/µL. A 10 µL aliquot of each sample was injected into a 10 µL/min stream of 50% acetonitrile-1.0 mM ammonium acetate. Data were processed using Fisons MassLynx Software. MALDI-MS was performed with a Vestec Benchtop IIt linear time-of-flight mass spectrometer, operated in the linear mode with an N2 laser (337 nm). Samples were dissolved in 1.0 mL of 25% acetonitrile-0.1% TFA, then diluted 3:100 to give 5-10 pmol/µL. A 0.5 µL aliquot of each sample solution was added to 0.5 µL of matrix [α-cyano-4-hydroxycinnamic acid, saturated solution in 50% acetonitrile-2% TFA]. Samples were dried at ambient temperature and pressure. Each spectrum was the sum of ion intensity from 10-50 laser pulses. The mass axis was calibrated externally.
III. Results and Discussion
A total of 82 crude peptide samples were supplied for inclusion in this study. Two of the crude peptides were synthesized by Boc chemistry (2.4%) and 80 by Fmoc chemistry (98%). The fraction of peptides synthesized by Fmoc chemistry has thus increased each year of the ABRF PS study (Table I). The 2 peptide-resins assembled by Boc chemistry were cleaved with HF containing 10% anisole, 10% dimethyl sulfide, and 2% p-thiocresol. One laboratory elected to deprotect Trp(For) by treatment of the peptide-resin with 10% piperidine-DMF prior to HF cleavage; the other did not specify Trp(For) deprotection conditions. Of the 80 peptide-resins assembled by Fmoc chemistry, 49 were cleaved by reagent K (EDT-thioanisole-water-phenol-TFA, 1:2:2:2:33), 22 by reagent B (triisopropylsilane-phenol-water-TFA, 2:5:5:88), and 9 by other cleavage cocktails. Laboratories preferred to recover their products by (i) filtering the resin and precipitating the product with ether (67 samples, 84%) rather than (ii) filtering the resin, diluting the filtrate with water and extracting with ether, and lyophilizing the product (13 samples, 16%).

AAA showed 66 of 82 crude products to be compositionally correct (Trp was not quantitated). Only one product had a complete deletion, corresponding to a des(Val,Lys,Arg) peptide synthesized by Boc chemistry. Of the other 15 samples that were not compositionally correct, 13 showed at least a low Lys value, often accompanied by a low Ser, Thr, and/or Arg value. Based on sequence analysis and mass spectrometric results (see later discussion), neither the low Lys nor the other low amino acid values could be confirmed, indicating that AAA of crude peptides may sometimes suffer interference from scavengers used during peptide-resin cleavage. A similar effect was seen in the 1993 ABRF PS study (3).

Table I. Chemistry Utilized by Core Facilities for ABRF Test Peptides

Year    Fmoc (#)   Fmoc (%)   Boc (#)   Boc (%)
1991    18         50         18        50
1992    42         72         16        28
1993    34         74         12        26
1994    80         98         2         2
Table II. Characterization of 1994 ABRF Test Peptide

Desired          RP-HPLC†          ESMS†             MALDI-MS†
Product* (%)     Fmoc     Boc      Fmoc     Boc      Fmoc     Boc
<25              28       100      6.2      50       6.2      50
25-75            70       0        78       50       78       50
>75              2.5      0        16       0        16       0

*A Gln-containing peptide was the desired product for 2 Fmoc syntheses. See text for discussion.
†The total number of samples was 80 Fmoc and 2 Boc.
The RP-HPLC retention time of the apparent desired peptide was 16.32 ± 0.18 min. However, time variations were found outside this range with different samples of the same product and different sample sizes. These differences were probably attributable to the effects of scavengers on RP-HPLC, as mass spectrometric results confirmed the presence of the desired product. RP-HPLC analyses indicated a good percentage of successful Fmoc syntheses, as 72% of the products had ≥25% of the apparent desired peptide (Table II). RP-HPLC analyses of the 2 peptides synthesized by Boc chemistry indicated that neither contained >25% of the desired product. It should be noted that RP-HPLC may overestimate the percentage of non-desired product due to the high UV absorbance of scavengers and side-chain protecting group adducts. Eight peptides were subjected to Edman degradation sequence analysis. Three showed highly efficient peptide assembly, resulting in desired sequence purities of >95%. One of these two samples had a low Ala value by AAA. The AAA result was thus not consistent with that from sequence analysis. Three peptides had partial sequence deletions that included Val1 in one sample, Cys5 and/or Ser6 in one sample, and Ile9 or Ile10 in two samples. One sample had a complete deletion of Val1, Lys2, Lys3, and Arg4. These deletions were consistent with results from AAA and mass spectrometric analyses (see below). One sample contained ~12.6% of an unidentified component eluting in the Trp8 cycle. The hydrophobic nature of this component (elution time = 31.4 min) and the mass spectrometric results for the peptide (see below) are indicative of PTH-Trp containing a tBu adduct. This apparent PTH-Trp(tBu) peak was seen in several other samples, although a noted variation in retention time (31.8 - 35.3 min) suggests Trp modification at several different positions by the tBu group (8).
Assessment of product purity by ESMS and MALDI-MS showed semiquantitative agreement with RP-HPLC analyses (Table II). Figure 1A shows the analyses of a crude peptide containing >75% of the desired product as evaluated by ESMS and MALDI-MS. The molecular ions were [M + 3H]3+ = 631.3 Da and [M + 2H]2+ = 946.3 Da by ESMS and [M + H]+ = 1893.2 Da by MALDI-MS. For ESMS, samples were run at atypically high concentrations so that minor components were detectable. Further dilution of samples yielded identical results for those cases examined. In contrast, the relative abundances of peaks in a mixture detected by MALDI-MS varied with different dilutions of the sample, along with different spots on the sample target and different laser power settings. By comparison of the relative abundance of the desired molecular ions for all samples, the combined mass spectrometric techniques assigned 5 (6.2%) Fmoc-synthesized products and 1 (50%) Boc-based product as "poor" quality (<25% desired product) (Table II). In general, the Fmoc-synthesized poor quality products contained several species with masses both below (-36 Da, -18 Da) and above (+16 Da, +56 Da) the desired peptide. These species are indicative of double and single dehydrations, single oxidations, and tBu adducts (Table III). One of the Fmoc products contained several species of lower molecular mass, suggesting deletion peptides that include des(Thr), des(Ile,Thr), and
des(2Ile,Thr). The 1 Boc-synthesized poor quality product contained a variety of deletion peptides. It should be noted that 3 Fmoc-synthesized products (from 2 distinct syntheses) were quantitated as "good" (≥25% desired product) even though ESMS showed the products to be 1 Da lower than the desired peptide. In 1 synthesis, the lower mass was the result of using an Fmoc-Gln derivative instead of Fmoc-Glu(OtBu). A similar error was found in the 1992 study, where one laboratory used the wrong resin, resulting in a peptide amide instead of the desired peptide acid (2). These errors are a particular cause for concern. In our study, the analysis of multiple samples allowed for the easy identification of products that had a 1 Da deviation from the desired peptide; such distinctions may not be made when only isolated samples are analyzed. The combination of AAA, RP-HPLC, ESMS, and MALDI-MS showed 5 (6.2%) of the Fmoc samples and 1 (50%) of the Boc samples to contain poor yields (<25%) of the desired product. Two of the 6 poor yield crude products were the result of deletion peptides (1 from Fmoc syntheses, 1 from Boc syntheses). The other 4 were due to the presence of dehydrated peptides and/or peptides containing covalent adducts. Of the 29 Fmoc-synthesized peptides for which both mass spectrometric techniques detected >5% dehydration, 15 contained a +67 Da species (Table III). An example of one such product is given in Figure 1B. The +67 Da species was probably the desired peptide modified by a β-piperidide (9). Base treatment used to remove the Fmoc group can result in aspartimide formation (dehydration) from Asp(OtBu) residues; the cyclic aspartimide residue can then incorporate a β-piperidide (9). Thus, peptides containing the +67 Da adduct were not modified during cleavage. Aspartimide formation from Asp(OtBu) residues can be inhibited by adding HOBt or 2,4-dinitrophenol to the piperidine solution (9,10).
The other problem detected in Fmoc syntheses was the generation of tBu adducts during peptide-resin cleavage. The modification of peptides by tBu groups (usually via the indole side-chain of Trp) can be dependent upon the work-up of the crude peptide following cleavage.

Table III. By-Products Observed by Mass Spectrometry

Mass Difference From     Possible Product                 Crude Peptides With >5%
Desired Peptide (Da)                                      Contaminant (%)(a), Fmoc
-327                     des(2Ile,Thr)                    1.2
-315                     des(Asp,Glu,Ala)                 1.2
-214                     des(Ile,Thr)                     1.2
-200                     des(Val,Thr) or des(Glu,Ala)     2.4
-115                     des(Asp)                         1.2
-101                     des(Thr)                         2.4
-99                      des(Val)                         1.2
-71                      des(Ala)                         1.2
-36                      -2H2O                            5.0
-18                      -H2O                             35
+16                      oxidation                        20
+32                      double oxidation                 1.2
+56                      tBu                              26
+67                      β-piperidide                     19
+100                     Boc                              1.2
+212                     Mtr                              1.2

Boc column: two entries of 50 are listed; their row positions are not recoverable from the flattened layout.
(a) The number of samples analyzed was 80 Fmoc, 2 Boc. All contaminants were seen in both mass spectrometric analyses.
ABRF 1994 Peptide Synthesis Study
However, poor quality products were found whether reagent K or B was used and with either work-up procedure. Thus, although specific protocols for work-up of crude products were requested from each participating laboratory, no correlation of modifications with work-up protocols could be made. There was also no correlation between the age of reagents used and the quality of peptide products. One of the 2 Boc-synthesized products contained dehydrated peptides (Table III), which was a significant problem for Boc-synthesized peptides in prior studies (1,3). It is important to note that the detection limit for non-desired products was quite low (5%), and that most samples contained large amounts of the desired peptide in the presence of non-desired peptides (Table II).
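The deletion assignments in Table III can be cross-checked by summing residue masses. The following is a minimal sketch, not part of the study's workflow: the residue masses are standard average values, the function names are hypothetical, and the 1 Da tolerance is an assumption chosen because the table reports whole-dalton shifts.

```python
# Average residue masses (Da) for the amino acids named in Table III.
RESIDUE_MASS = {
    "Ala": 71.08, "Val": 99.13, "Thr": 101.10,
    "Ile": 113.16, "Asp": 115.09, "Glu": 129.12,
}

# Deletion assignments as listed in Table III, keyed by the tabulated shift (Da).
TABLE_III_DELETIONS = {
    -327: ["Ile", "Ile", "Thr"],
    -315: ["Asp", "Glu", "Ala"],
    -214: ["Ile", "Thr"],
    -200: ["Val", "Thr"],
    -115: ["Asp"],
    -101: ["Thr"],
    -99: ["Val"],
    -71: ["Ala"],
}


def deletion_shift(residues):
    """Mass difference (Da) expected when the listed residues are missing."""
    return -sum(RESIDUE_MASS[name] for name in residues)


def check_table(tolerance_da=1.0):
    """Map each tabulated shift to (computed shift, agrees within tolerance)."""
    report = {}
    for tabulated, residues in TABLE_III_DELETIONS.items():
        computed = deletion_shift(residues)
        report[tabulated] = (round(computed, 2),
                             abs(computed - tabulated) <= tolerance_da)
    return report


if __name__ == "__main__":
    for shift, (computed, ok) in check_table().items():
        print(shift, computed, "consistent" if ok else "check")
```

The same sum shows why -200 Da is ambiguous: des(Val,Thr) and des(Glu,Ala) differ by well under 1 Da, so mass alone cannot separate them, which is consistent with the table listing both.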
IV. Conclusions
Fmoc chemistry has continued to gain popularity since the initial ABRF PS study in 1991. This is probably due to the development of reliable automated Fmoc protocols (4) and the increased awareness of the ease of peptide-resin cleavage following Fmoc synthesis as compared with Boc chemistry. From this year's study, 76 of the 82 crude samples (93%) contained ≥25% of the desired product as estimated by ESMS. This represents a reasonable improvement over the 1991 study, where 78% of the crude products contained ≥25% of the desired material (1). Possible reasons for these improved results are any combination of (i) the greater percentage of peptides synthesized by Fmoc chemistry, where cleavage conditions are less harsh, (ii) the use of different side-chain protecting group strategies (i.e., Pmc instead of Mtr for Arg, Boc for Trp) that help reduce side-reactions during cleavage, (iii) the use of cleavage protocols designed to minimize side-reactions, and (iv) more rigor and care in laboratory techniques. The presence of a β-piperidide adduct in 19% of the Fmoc-synthesized products indicates that dehydration of Asp(OtBu) residues during Fmoc removal can be a significant problem during synthesis.
RP-HPLC using a C18 column appeared to provide an accurate estimate of the complexity of the peptide samples. The accuracy of RP-HPLC estimation of the content of the desired product was diminished by the presence of scavengers and side-chain protecting group adducts. AAA was the best technique for absolute amino acid quantitation, but was not helpful for detecting modifications of amino acid residues and was susceptible to inaccuracies, probably due to the presence of scavengers in crude peptide mixtures. Preview sequence analysis could determine residue position and appeared to be less susceptible than AAA to inaccuracies due to scavenger interference.
The mass spectrometric techniques examined here (ESMS and MALDI-MS) allowed for identification of peptide modifications, including residual protecting groups. MS as used in this study does not distinguish between products of the same mass, nor does it allow for assignment of the positions of residue deletions and/or modifications. ESMS and MALDI-MS estimated that the mean % desired product was 59%, while RP-HPLC estimated the mean % desired product was 39%. Our experience in the present and prior studies (1-3) suggests that efficient characterization of synthetic peptides is best obtained by a combination of RP-HPLC and MS, with sequencing by either Edman degradation or tandem MS being used to identify the positions of modifications and deletions. Proper peptide characterization is essential, especially in light of the lack of correlation between product quality and cleavage reagents or work-up protocols. These results suggest that laboratory technique plays an important role in peptide synthesis and hence product integrity should not be taken for granted.
Acknowledgments
We thank all of the ABRF core facilities that participated in this study and the National Science Foundation for financial support (grant DIR 9003100).
References and Notes
1. Smith, A.J., Young, J.D., Carr, S.A., Marshak, D.R., Williams, L.C., and Williams, K.R. (1992). In "Techniques in Protein Chemistry III" (Angeletti, R.H., ed.) pp. 219-229, Academic Press, Orlando, FL.
2. Fields, G.B., Carr, S.A., Marshak, D.R., Smith, A.J., Stults, J.T., Williams, L.C., Williams, K.R. and Young, J.D. (1993). In "Techniques in Protein Chemistry IV" (Angeletti, R.H., ed.) pp. 227-238, Academic Press, San Diego, CA.
3. Fields, G.B., Angeletti, R.H., Carr, S.A., Smith, A.J., Stults, J.T., Williams, L.C. and Young, J.D. (1994). In "Techniques in Protein Chemistry V" (Crabb, J.W., ed.) pp. 501-507, Academic Press, San Diego, CA.
4. Review: Fields, G.B., Tian, Z., and Barany, G. (1992). In "Synthetic Peptides: A User's Guide" (Grant, G.A., ed.) pp. 77-183, W.H. Freeman and Co., New York, NY. The reader is referred to this review for abbreviations and specific references.
5. King, D.S., Fields, C.G., and Fields, G.B. (1990) Int. J. Peptide Protein Res. 36, 255-266.
6. Solé, N.A. and Barany, G. (1992) J. Org. Chem. 57, 5399-5403.
7. Applied Biosystems, Inc. (1989) Applied Biosystems Model 477A Protein-Peptide Sequencing System Users Manual, Foster City, CA.
8. Löw, M., Kisfaludy, L., Jaeger, E., Thamm, P., Knof, S., and Wünsch, E. (1978) Hoppe-Seyler's Z. Physiol. Chem. 359, 1637-1642.
9. Dölling, R., Beyermann, M., Haenel, J., Kernchen, F., Krause, E., Franke, P., Brudel, M., and Bienert, M. (1994) J. Chem. Soc. Chem. Commun., 853-854.
10. Martinez, J. and Bodanszky, M. (1978) Int. J. Peptide Protein Res. 12, 277-283.
Protein Synthesis on a Solid Support using Fragment Condensation
Siegfried Brandtner and Christian Griesinger
Institut für Organische Chemie, Universität Frankfurt, Marie Curie Str. 11, D-60439 Frankfurt/Main, Germany
I. Introduction
There are two general approaches to the synthesis of peptides: the classical method, in which all reactions are carried out in homogeneous solution, and the solid-phase method, in which the reactions are heterogeneous ones between soluble reagents and an insoluble peptide chain that is attached to a solid support. Since its introduction by Merrifield in 1962, solid-phase peptide synthesis has been applied successfully to the preparation of a great number and variety of peptides, including proteins [1,2]. However, the synthesis of large peptides and proteins is hampered by sequence-dependent difficulties during assembly. The combination of stepwise solid-phase synthesis and fragment condensation on a solid support appears to be an attractive approach to avoiding such difficulties [3,4]. Peptides and proteins to be prepared should therefore first be divided into fragments optimised according to their solubilities. Barlos et al. introduced the acid-labile 2-chlorotrityl chloride resin [5] for the synthesis of protected peptide fragments. After cleavage from the resin by dilute acetic acid or trifluoroethanol, the fully side-chain-protected peptide fragments can be purified and analysed. The protected C-terminal fragment can be bound to the solid support and condensed with the other peptide fragments sequentially on the resin to obtain the complete protein. The purification of the target protein from impurities lacking one or more fragments should be easier than the separation of proteins that differ by one or more amino acid residues [6]. In addition, the strategy of fragment condensation is more flexible, since modifications of the amino acid sequence may affect only one fragment. Only this fragment then needs to be resynthesized, while the unchanged fragments can still be used. This approach is therefore well suited for the study of structure-activity relationships.
We chose to study the efficiency of solid-phase fragment condensation using two examples: a type III antifreeze protein (64 amino acids, 6.7 kDa) isolated from Macrozoarces americanus [7], and the complete variable region (Vβ, Dβ and Jβ regions) of the human T-cell receptor hVβ-13.1D2-1 (110 amino acids, 12.9 kDa) [8].
II. Biological Background
A. Antifreeze protein (AFP)
Animals exposed to temperatures below 0°C have developed strategies to avoid lethal freezing of their body fluids. Among these is the stabilisation of body fluids in a supercooled state by inhibiting the growth of ice crystals. Antarctic fish living in ice-laden environments achieve this by the synthesis of antifreeze proteins (AFPs) [7]. These proteins lower the freezing point of their body fluids, whilst the melting point is not affected [9]. The proteins are thought to inhibit the growth of ice crystals by adsorbing to them [9, 10]. However, the exact mechanism of this interaction has not been established to date [9, 11, 12]. Of the groups of AFPs known to occur in fish [7], a group of proteins of ≈ 60 amino acids is of special interest, since their sequence cannot be correlated with the models proposed so far for the interaction of AFPs with water. Recently, the structure of one such protein has been determined by NMR [13, 14]. Proteins of this group have so far only been accessible by isolation or, in low yield, by heterologous expression [15]. Here, we report the synthesis of one of these proteins by solid-phase peptide synthesis.
B. T-cell receptor (TCR)
Staphylococcal enterotoxins (SE), so-called superantigens, are responsible for food poisoning and shock in man and animals. They elicit T cells with enterotoxin-specific variable regions in the β-chain of the T-cell receptor (hVβ). In contrast to conventional antigens, the enterotoxin binds only to the β-chain and not to both the α- and β-chain of the T-cell receptor [16, 17]. SEC2 stimulates T cells carrying hVβ 13.1 [8]. Since there is no further information about the interaction between the superantigen and the T-cell receptor, it is of considerable interest to synthesize a part of the Vβ-chain: the structural information we hope to obtain could help to clarify the mechanism of antigen recognition.
III. Materials
2-Chlorotrityl chloride resin was obtained from CBL (Patras). Fmoc-amino acids and their derivatives were purchased from Advanced Chemtech, NovaBiochem, SNPE or Bachem (Switzerland). Solvents, AcOH, TFA and TFE were of analytical grade, purchased from Merck or Fluka, and used without further purification. TLC was performed on precoated silica gel 60 F254 aluminium plates (Merck) employing the following solvent systems: chloroform/methanol/acetic acid (90:12:2) and (85:10:5), and toluene/methanol/acetic acid (70:30:15) and (90:10:10). All HPLC runs were performed on a Merck Hitachi apparatus with an L-6200 pump, equipped with a Merck Hitachi L-3000 diode array UV detector. Columns were Nucleosil RP18 PPN (8 x 250 mm, 300 Å, 5 μm), VYDAC RP4 (4 x 250 mm, 300 Å, 5 μm) or VYDAC RP18 (20 x 250 mm, 300 Å, 5 μm). A gradient solvent system of 0.1% TFA in water / 0.1% TFA in acetonitrile, or 0.1% TFA in water / 0.1% TFA in acetonitrile / 0.1% TFA in methanol, was employed. The CE runs were performed on a Beckman P/ACE System 2100 with a standard capillary (100 μm; total length 57 cm; length to detector 50 cm), using a 10 mM phosphate buffer, pH 6.0.
IV. Results
The preparation of protected peptides and their sequential condensation were done on the 2-chlorotrityl chloride resin (Fig. 1).
[Fig. 1 scheme: SOLID PHASE PEPTIDE SYNTHESIS
Esterification: Fmoc-AA(OH), DIEA
Deprotection: 40% piperidine/DMF
Coupling: a.) Fmoc-AA(OH), TBTU, DIEA (1:1.5:1.1); b.) Fmoc-Peptide(OH), HOBt, DIC (1:10:10)
Cleavage from the resin: TFE, AcOH, DCM, 90 min, giving the fully protected peptide
Side-chain deprotection: Reagent K (TFA, EDT, thioanisole, phenol, water; 82.5 : 2.5 : 5 : 5 : 5%)]

Fig. 1: The use of the 2-chlorotrityl chloride resin for the preparation of protected peptides and proteins. Either amino acids (a) or fully protected peptide fragments (b) are condensed to the growing peptide chain on the resin.
A. Synthesis of protected Peptides
The fully protected peptides were synthesized either on a manual shaker or, after optimization of coupling conditions [18], on an automatic peptide synthesizer (ACT 200) from Advanced Chemtech, using the 2-chlorotrityl chloride resin and the Fmoc/t-butyl strategy [2]. The coupling with TBTU/DIEA [2] and the Fmoc cleavage were monitored by the ninhydrin reaction [4] and TLC. The integrity of the protected peptides was proven by 1D and 2D 1H-NMR spectroscopy [19]. The correct masses were established by FAB-MS. The complete TCR Vβ, Dβ and Jβ chain was synthesized using 9 fragments (Fig. 2), and the AFP type III protein was synthesized using 8 fragments (Fig. 3).
[Fig. 2 and Fig. 3: sequence diagrams marking the condensation sites between Fmoc-Peptide(OH) and NH2-Peptide-O-resin.]

Fig. 2 & 3: Sequence of the TCR Vβ, Dβ and Jβ chain (1-110 aa) divided into 9 fragments (Fig. 2). The complete sequence of the AFP type III [1Pyr] HPLC-6 divided into 8 fragments (Fig. 3).

We used tBu as protecting group at S, T, E and Y, Boc at K, Pmc at R, and Trt at Q, N and C. We synthesized the C-terminal fragments TCR 110-76 (Fig. 2) and AFP 64-47 (Fig. 3) using a low resin substitution of 34 μmol/g and 52 μmol/g, respectively, because a low loading reduces the tendency for intermolecular interaction between the growing peptides. Kaiser test and TLC monitoring after all coupling and Fmoc deprotection steps led to AFP 64-47 in high yield and purity. Therefore it was not necessary to cleave the peptide from the resin for purification.
The TCR 110-76 peptide, on the other hand, could not be dissolved with full side-chain protection, and because of its insolubility its purity could not be determined by mass spectrometry even after side-chain deprotection. This peptide was therefore used without purification.

B. Fragment condensation
After cleavage of the protected fragments from the resin using dilute acetic acid, or after HPLC purification using solvents containing 0.1% TFA, it is essential to remove traces of acid from the peptide before it can be used for fragment condensation. The peptide fragment is dissolved in a minimum volume of TFE or DMSO and added dropwise to 500 ml of water. The precipitated peptide is filtered and washed with water and, if possible, with diethyl ether. The acid-free fragment is dried in vacuo. The fragment condensation was performed in DMSO/DCM (10:1) using a 3- to 6-fold molar excess of protected peptide and HOBt/DIC for activation (peptide fragment/HOBt/DIC 1:10:10) [19]. The ninhydrin reaction on resin-bound peptides gives ambiguous results, probably because the N-terminal amino group is inaccessible. Therefore a small fraction of the peptide was cleaved from the resin and, after concentration in a stream of N2, examined by the ninhydrin reaction, TLC or HPLC. The Fmoc-cleavage conditions and fragment condensation times for the TCR chain are given in Table 1.

Tab. 1: The time for the complete Fmoc deprotection and the time for the condensation reaction of the fragments for the TCR peptides are given. In the left column only the product of the fragment condensation is shown. The fragmentation for the TCR peptides is given in Fig. 2.

Product                         Coupling-time (hours)
resin-O-(TCR 110-69)(Fmoc)      48
resin-O-(TCR 110-62)(Fmoc)      43
resin-O-(TCR 110-57)(Fmoc)      136
resin-O-(TCR 110-50)(Fmoc)      42
resin-O-(TCR 110-39)(Fmoc)      168
resin-O-(TCR 110-23)(Fmoc)      240
resin-O-(TCR 110-15)(Fmoc)      264
resin-O-(TCR 110-1)(Fmoc)       120
resin-O-(TCR 110-1)NH2          -

Fmoc-cleavage (time, % Pip./DMF), entries in printed order (row alignment not recoverable from the flattened layout): 3 h, 40% / 17 h, 20% / 8 h, 40%; 20 h, 40% / 8 h, 40%; 20 h, 40% / 3 h, 40%; 17 h, 15% / 3 h, 40%; 17 h, 15% / 72 h, 15% / 72 h, 15% / 24 h, 15% / 12 h, 40% / 48 h, 40%
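The stoichiometry quoted for the condensation step (a 3- to 6-fold excess of protected fragment, activated at peptide fragment/HOBt/DIC 1:10:10) can be turned into reagent amounts as sketched below. This is an illustration only: the function name and the 1 g resin batch are hypothetical, the excess is assumed to be counted relative to the resin-bound amino groups, and the loading is taken at its nominal value.

```python
def condensation_amounts(resin_mass_g, substitution_umol_per_g, excess=3.0):
    """Reagent amounts (umol) for one fragment condensation.

    Assumes the molar excess refers to resin-bound amino groups and that
    HOBt and DIC are each used at 10 equivalents relative to the fragment.
    """
    if not 3.0 <= excess <= 6.0:
        raise ValueError("the text describes a 3- to 6-fold excess")
    resin_umol = resin_mass_g * substitution_umol_per_g
    fragment_umol = excess * resin_umol
    return {
        "resin_amine_umol": resin_umol,
        "fragment_umol": fragment_umol,
        "HOBt_umol": 10 * fragment_umol,
        "DIC_umol": 10 * fragment_umol,
    }


if __name__ == "__main__":
    # Hypothetical 1 g batch at the 34 umol/g loading quoted for the TCR resin.
    for name, amount in condensation_amounts(1.0, 34.0, excess=3.0).items():
        print(f"{name}: {amount:.0f}")
```

At this loading the activating reagents are needed only in millimole quantities even at 10 equivalents, so the protected fragment itself is the limiting, costly component.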
The TCR protein was cleaved from the resin in DCM/TFE/AcOH (3:1:1) for 3 hours. To obtain the crude protein, side-chain deprotection was performed with reagent K [20] for 8 hours, with a yield of 48 mg (74%). Owing to the insolubility of the T-cell receptor fragment, we could only use SDS-PAGE to estimate its mass, which lay between 7 and 13 kDa. The Fmoc-cleavage conditions and fragment condensation times for the AFP type III protein are given in Table 2.
Tab. 2: The time for the complete Fmoc deprotection and the time for the condensation reaction of the fragments for the AFP peptides are given. In the left column only the product of the fragment condensation is shown. The fragmentation for the AFP peptides is given in Fig. 3.

Product                        Fmoc-cleavage              Coupling-time
                               (time, % Pip./DMF)         (hours)
resin-O-(AFP 64-41)(Fmoc)      3 h, 40%                   46
resin-O-(AFP 64-35)(Fmoc)      18 h, 15%                  118
resin-O-(AFP 64-30)(Fmoc)      23 h, 15%                  161
resin-O-(AFP 64-23)(Fmoc)      38 h, 15%                  62
resin-O-(AFP 64-18)(Fmoc)      49 h, 20%                  48
resin-O-(AFP 64-11)(Fmoc)      60 h, 15%                  72
resin-O-(AFP 64-1)(Fmoc)       44 h, 15%                  112
resin-O-(AFP 64-1)NH2          48 h, 15%                  -
The AFP protein was cleaved from the resin in DCM/TFE/AcOH (3:1:1) for 3 hours. To obtain the crude protein, side-chain deprotection was performed with reagent K [20] for 3 hours, with a yield of 217 mg (33%). After two purification steps performed on RP-HPLC using a VYDAC RP18 column (20 x 250 mm, 300 Å, 5 μm) and a gradient solvent system of 0.1% TFA in water / 0.1% TFA in acetonitrile / 0.1% TFA in methanol, we obtained 28 mg of purified AFP type III [1Gln] HPLC-6. The homogeneity of the purified protein was determined by HPLC (Fig. 4), CE (Fig. 5), ESI-MS (Fig. 6), SDS-PAGE, amino acid analysis, chiral amino acid analysis and activity measurement.
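The yield figures quoted for the two proteins can be related to one another with a few lines of arithmetic. This is a rough sketch under one stated assumption: the percentages are taken to be crude yields relative to the theoretical amount expected from the resin loading, which the text does not spell out. All names are hypothetical.

```python
def implied_theoretical_mg(obtained_mg, yield_percent):
    """Theoretical amount (mg) implied by an obtained mass and a % yield."""
    return obtained_mg / (yield_percent / 100.0)


def overall_yield_percent(purified_mg, crude_mg, crude_yield_percent):
    """Yield of purified material relative to the implied theoretical amount."""
    theoretical = implied_theoretical_mg(crude_mg, crude_yield_percent)
    return 100.0 * purified_mg / theoretical


if __name__ == "__main__":
    # Crude TCR: 48 mg at 74%; crude AFP: 217 mg at 33%; purified AFP: 28 mg.
    print(f"TCR theoretical: {implied_theoretical_mg(48, 74):.1f} mg")
    print(f"AFP theoretical: {implied_theoretical_mg(217, 33):.1f} mg")
    print(f"AFP recovery on purification: {100 * 28 / 217:.1f} %")
    print(f"AFP overall yield: {overall_yield_percent(28, 217, 33):.1f} %")
```

On these assumptions the two HPLC steps recover roughly one eighth of the crude AFP, so most of the loss occurs during purification rather than during assembly.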
Fig. 4 & 5: The analytical RP-HPLC chromatogram (Fig. 4) and the CE electropherogram (Fig. 5) of the purified AFP type III protein are shown. HPLC conditions: VYDAC RP-18 column; gradient from 20/20/60% (H2O/ACN/MeOH) + 0.1% TFA to 0/10/90% within 30 min, then to 0/100/0% within 35 min. The CE conditions are as described in the Materials section.
Fig. 6: The ESI-MS of the purified AFP type III protein is given. The ([M+4H]4+)/4, ([M+5H]5+)/5 and ([M+6H]6+)/6 ions are marked.
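The relation between the marked ions and the protein mass can be sketched as follows. The nominal mass of 6700 Da is an assumption taken from the 6.7 kDa quoted in the Introduction, not a measured value from Fig. 6, and the function names are hypothetical.

```python
PROTON = 1.00728  # mass of a proton in Da


def mz(mass, charge):
    """m/z of the [M + zH]z+ ion for a neutral mass M."""
    return (mass + charge * PROTON) / charge


def mass_from_adjacent(mz_high, mz_low):
    """Recover charge and neutral mass from two adjacent charge states.

    mz_high is the peak carrying charge z; mz_low is the neighbouring peak
    at lower m/z carrying charge z + 1.
    """
    charge = round((mz_low - PROTON) / (mz_high - mz_low))
    return charge, charge * (mz_high - PROTON)


if __name__ == "__main__":
    nominal_mass = 6700.0  # assumed from "6.7 kDa"; not a measured value
    for z in (4, 5, 6):
        print(f"[M+{z}H]{z}+ expected near m/z {mz(nominal_mass, z):.1f}")
    print(mass_from_adjacent(mz(nominal_mass, 4), mz(nominal_mass, 5)))
```

Because each pair of neighbouring peaks yields an independent estimate of M, agreement among the 4+/5+ and 5+/6+ pairs is a simple internal check on the mass assignment.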
The small chemical shift dispersion of the NH region in the NMR spectrum of the protein indicates a mostly unfolded structure in the presence of a small fraction of the folded protein (Fig. 7a & c), as can be seen by comparison with the 1D spectrum of the native AFP type III sequence (Fig. 7b). Whether the N-terminal Gln is responsible for unfolding is currently under investigation.
Fig. 7a-c: The 1D-NMR spectrum of the purified AFP type III protein (7a & c) and the 1D-NMR spectrum of the native AFP type III HPLC-6 (7b) isolated from Macrozoarces americanus [7] are given.
V. Conclusion
Fragment condensation on the solid support has been shown to be a feasible approach towards the chemical synthesis of proteins. A progress report on the examples of the TCR β-chain and AFP type III has been given. In the latter case a pure AFP type III peptide was obtained which, however, did not fold properly.
Acknowledgment
This work is supported by DFG under grant Gr 1211/4-1 and by the Fonds der Chemischen Industrie. We thank Dr. Zechel and Dr. Haupt, BASF, Ludwigshafen, Dr. Savelsberg, Merck, Darmstadt, and Dr. Fehlhaber, Hoechst, Frankfurt, for the recording of mass spectra; Prof. Dr. Davies, Kingston, Canada, for the measurement of the freezing activity; and Dr. Sönnichsen, Edmonton, Canada, for the 1D NMR spectrum of the native AFP type III.
References
[1] Merrifield, R.B. (1963). J. Am. Chem. Soc. 85: 3045-3052.
[2] Fields, G.B. & Noble, R.L. (1990). Int. J. Peptide Protein Res. 35: 161-214.
[3] Barlos, K., Gatos, D. & Schäfer, W. (1991). Angew. Chemie 103: 572-575.
[4] Kaiser, E.T., Mihara, H., Laforet, G.E., Kelly, J.W., Walters, L., Findeis, M.A. & Sasaki, T. (1989). Science 243: 187-192.
[5] Barlos, K., Chatzi, O., Gatos, D. & Stavropoulos, G. (1991). Int. J. Peptide Protein Res. 37: 513-520.
[6] Merrifield, R.B. (1978). Pure & Appl. Chem. 50: 643-653.
[7] Davies, P.L. & Hew, C.L. (1990). FASEB J. 4: 2460-8.
[8] Choi, Y., Herman, A., DiGiusto, D., Wade, T., Marrack, P. & Kappler, J. (1990). Nature 346: 471-3.
[9] Knight, C.A., Cheng, C.C. & DeVries, A.L. (1991). Biophys. J. 59: 409-418.
[10] Raymond, J.A. & DeVries, A.L. (1977). Proc. Natl. Acad. Sci. USA 74: 2589-93.
[11] Hew, C.L. & Yang, D.S. (1992). Eur. J. Biochem. 202: 33-42.
[12] Chou, K.C. (1992). J. Mol. Biol. 223: 509-17.
[13] Sönnichsen, F.D., Sykes, B.D., Chao, H. & Davies, P.L. (1993). Science 259: 1154-7.
[14] Chao, H., Davies, P.L., Sykes, B.D. & Sönnichsen, F.D. (1993). Protein Science 2: 1411-1428.
[15] Li, X. & Hew, C.L. (1991). Protein Eng. 4: 1003-8.
[16] Choi, Y., Kotzin, B., Herron, L., Callahan, J., Marrack, P. & Kappler, J.W. (1989). Proc. Natl. Acad. Sci. USA 86: 8941-8945.
[17] Choi, Y., Lafferty, J.A., Clements, J.R., Todd, J.K., Gelfand, E.W., Kappler, J., Marrack, P. & Kotzin, B.L. (1990). J. Exp. Med. 172: 981-4.
[18] Brandtner, S., Ihringer, S. & Griesinger, C. Poster Nr. 5, presented at the 13th APS, Edmonton, Canada, June 1993.
[19] Brandtner, S., Schleucher, J. & Griesinger, C. In Hodges, R.S. & Smith, J.A., Eds. Peptides: Chemistry, Structure and Biology (Proceedings of the 13th APS). ESCOM, Leiden (1994) 49-50.
[20] King, D.S., Fields, C.G. & Fields, G.B. (1990). Int. J. Peptide Protein Res. 36: 255-266.
Characterization of a Side Reaction Using Stepwise Detection in Peptide Synthesis with Fmoc Chemistry
Yan Yang, William V. Sweeney, Susanna Thornqvist, Klaus Schneider, Brian T. Chait and James P. Tam
Department of Chemistry, Hunter College of CUNY, New York, NY 10021
Laboratory for Mass Spectrometry and Gaseous Ion Chemistry, The Rockefeller University, New York, NY 10021
Department of Microbiology and Immunology, Vanderbilt University, A5119 MCN, Nashville, TN 37232
I. Introduction
In solid phase peptide synthesis, it is important that the repetitive steps proceed rapidly, in high yield, and with minimal side reactions to prevent the accumulation of by-products (1). Solid phase peptide synthesis almost always employs either Boc or Fmoc chemistry. Boc chemistry requires acidic conditions for deblocking, and the potential side reactions have been extensively studied. In Fmoc chemistry (2), however, repetitive basic conditions are required for deblocking, and only a few of the base-catalyzed side reactions have been characterized. We present here a method used to demonstrate that aspartimide formation, a side reaction well known in Boc chemistry (3) but thought not to occur under the conditions of Fmoc chemistry, in fact occurs with both approaches. In this method the progress of synthesis was monitored by stepwise microscale TFA cleavage in conjunction with reversed-phase HPLC and mass spectrometry (MS) for identification of products. The resins were sampled after each coupling step, and the peptide fragments generated by TFA cleavage were examined by HPLC and MS. By comparing each peptide fragment, the side reaction due to aspartimide formation was detected and eventually defined. Peptide ladder mass spectrometric analysis of a mixture of the collected peptide fragments provided further corroborative evidence for aspartimide formation.
II. Experiments and Procedures
A. Peptide Synthesis
An initial attempt to synthesize an epidermal growth factor (EGF)-like domain in human blood coagulation factor X (residues 83-130) was made at the RCMI Peptide Synthesis Facility of Hunter College using solid-phase methodology (4) on an ABI 430A synthesizer. A standard single coupling with HOBt/HBTU and deprotection with 20% piperidine were used. All amino acids and resins were purchased from Applied Biosystems Inc. (Foster City, CA). The side-chain protecting groups for Fmoc amino acids were as follows: Asp and Glu by OtBu; Ser, Thr and Tyr by t-Bu; Asn, Gln and His by Trt; Lys by Boc; and Arg by Pmc. Other related peptides derived from factor X (residues 106-130 and residues 98-130, see Figure 1) were later synthesized manually by the stepwise solid-phase method with Fmoc-Tyr(t-Bu)-HMP resin at 0.42 mmol/g substitution, where HMP resin was 4-(hydroxymethyl)phenoxymethyl-copolystyrene resin (Wang resin). Stepwise coupling of Fmoc amino acids using DCC/HOBt was performed first, followed by TBTU as a second coupling when necessary. All amino acids and resins were purchased from Bachem (Torrance, CA). Deblocking of the NH2-terminal Fmoc protecting group with 20% piperidine in DMF (vol/vol) was typically carried out for 20 min. Since Pro was the second residue from the C-terminal, 50% piperidine in DMF (vol/vol) was applied for 5 min to the dipeptide resin to minimize the formation of diketopiperazine.
B. Micro-scale TFA Cleavage
A sample of approximately 10-15 mg of resin was treated with a small portion of cleavage mixture (1 to 1.5 ml) in a sealed 20 ml scintillation vial. The cleavage mixture used for peptides containing Trt protecting groups was: 0.25 ml 1,2-ethanedithiol (EDT), 0.25 ml H2O and 9.5 ml TFA. The cleavage reaction was performed at room temperature with stirring for 1.5 hr. Crude peptides were filtered in a Büchner funnel with a fritted disc to remove the solid support, precipitated in about 6-8 ml of cold ethyl ether in a centrifuge tube (10 ml size), and then collected by centrifugation. The crude peptide was washed repeatedly by centrifugation with cold ethyl ether (at least three times) and then dissolved in water for lyophilization. A number of samples can be processed simultaneously in this manner.
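The 10 ml cleavage recipe can be expressed as volume percentages and scaled to the 1 to 1.5 ml portion used per sample. The sketch below assumes that volumes are additive on mixing; the names are hypothetical.

```python
# Cleavage mixture as stated in the text (volumes in ml).
RECIPE_ML = {"EDT": 0.25, "H2O": 0.25, "TFA": 9.5}


def volume_percent(recipe):
    """Composition in % (v/v), assuming additive volumes."""
    total = sum(recipe.values())
    return {name: 100.0 * volume / total for name, volume in recipe.items()}


def scale(recipe, target_ml):
    """Component volumes (ml) for a batch of target_ml total volume."""
    factor = target_ml / sum(recipe.values())
    return {name: volume * factor for name, volume in recipe.items()}


if __name__ == "__main__":
    print(volume_percent(RECIPE_ML))
    # One 1.5 ml portion, the upper end of the range used per resin sample.
    for name, volume in scale(RECIPE_ML, 1.5).items():
        print(f"{name}: {1000 * volume:.1f} ul")
```

Since a single 10 ml batch covers at least six samples at 1.5 ml each, preparing the stock once and dispensing portions avoids pipetting the small EDT and water volumes repeatedly.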
C. HPLC Separation
Analytical C18 reversed-phase HPLC was performed. Buffer A contained 5% acetonitrile in 0.045% TFA. Buffer B contained 60% acetonitrile in 0.037% TFA. Peptides were eluted with a 10-40% buffer B linear gradient over 30 min at 1.5 ml/min, monitored at 220 nm.
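Because both buffers contain acetonitrile, the "10-40% buffer B" gradient corresponds to a narrower range of actual acetonitrile content. The sketch below computes it from the stated buffer compositions, assuming ideal linear mixing and no dwell volume; the function names are hypothetical.

```python
ACN_IN_A = 5.0   # % acetonitrile in buffer A
ACN_IN_B = 60.0  # % acetonitrile in buffer B


def percent_b(t_min, start_b=10.0, end_b=40.0, duration_min=30.0):
    """% buffer B at time t_min for the linear gradient in the text."""
    if not 0.0 <= t_min <= duration_min:
        raise ValueError("time lies outside the gradient window")
    return start_b + (end_b - start_b) * t_min / duration_min


def percent_acetonitrile(t_min):
    """% acetonitrile in the mobile phase at time t_min."""
    fraction_b = percent_b(t_min) / 100.0
    return ACN_IN_A * (1.0 - fraction_b) + ACN_IN_B * fraction_b


if __name__ == "__main__":
    for t in (0, 15, 30):
        print(f"{t:2d} min: {percent_b(t):.1f}% B, "
              f"{percent_acetonitrile(t):.2f}% acetonitrile")
    print(f"solvent used per run: {1.5 * 30:.0f} ml")
```

On these assumptions the run spans about 10.5% to 27% acetonitrile, a slope of roughly 0.55% per minute, which helps when transferring the method to buffers of different composition.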
Figure 1. Manually synthesized sequence derived from an epidermal growth factor-like domain in blood coagulation factor X. The numbers in the parentheses show the sequence position relative to the C-terminal. The arrow indicates the location of aspartimide formation and subsequent ring opening by piperidine to form an adduct.
D. Mass Spectrometry
Individual peptide samples were analyzed in an electrospray mass spectrometer constructed at The Rockefeller University and described elsewhere (7). The peptide samples were dissolved in a mixture of water, methanol and acetic acid (20:19:1) to a concentration of 10 μM and sprayed at a voltage of 3-4 kV. Peptide ladder samples were analyzed on a matrix-assisted laser desorption time-of-flight mass spectrometer constructed at The Rockefeller University and described elsewhere (8,9). The individual peptides (9 residues to 18 residues in one vial and 19 to 32 residues in another vial) were mixed in approximately equal amounts and dissolved in water. The ladder mixtures were added to the matrix material (4-hydroxy-α-cyanocinnamic acid (4HCCA) in formic acid/water/isopropanol 1:3:2) to a final concentration of 1-5 μM for each peptide component. The complete peptide ladder, which ranged from the 9 mer to the 32 mer (except the 16 mer and 28 mer), was measured from 4HCCA in water/acetonitrile 2:1. The final concentration of each peptide component was in the range of 0.2-1 μM. Bovine insulin and substance P were used as internal calibrants.
III. Results and Discussion
A. Initial Synthetic Attempts
The initial synthesis of the EGF-like peptide carried out on a synthesizer using an Fmoc approach failed. As identified by both HPLC and MS, the resulting peptide with 48 residues was found to be a mixture containing no detectable amount of the desired peptide. Manual synthesis was then attempted. Since diketopiperazine formation at the onset of the synthesis could have been responsible for the failure of the synthesis, an attempt was made to minimize formation of diketopiperazine. The piperidine treatment time was shortened to about 7 minutes at the dipeptide resin stage using 50% piperidine in DMF. As the synthesis progressed, ninhydrin tests on the deblocked peptide resins indicated that there was a problem in the synthesis. Examination of the deprotected crude peptides produced after coupling of the 17th residue failed to yield a single major peak in the HPLC profile, indicating the presence of a mixture of products. In addition, an electrospray MS analysis of the crude peptide product did not show the presence of the desired peptide.

Figure 2. Reverse phase HPLC profiles showing the stepwise analysis of peptide samples containing the C-terminal 10 mer to 24 mer (except the 16 mer). Profiles are stacked by residue number and plotted against retention time (min).

Figure 3. Plot of the ratio of the unknown peak to the normal peptide peak against the number of residues (relative to the C-terminal): peak area in the HPLC profiles; peak height in the electrospray mass spectra (ES MS); and peak height in the peptide ladder by matrix-assisted laser desorption/ionization mass spectrometry (Ladder MS).

Table I. Electrospray mass spectrometry results of individual peptides

Peptide     Mass of expected peptides     Mass found in       ΔMass between expected
samples     Theory        Found           unknown species     and unknown peptides
11 mer      1120.3        1120.3
12 mer      1235.4        1235.3
13 mer      1306.5        1306.3          1373.3              67.0
14 mer      1419.6        1419.3          1486.5              67.2
15 mer      1520.7        1520.5          1587.2              66.7
17 mer      1741.0        1740.5          1808.3              67.8
18 mer      1897.1        1896.7          1963.7              66.7
19 mer      1968.2        1968.4          2035.9              67.9
20 mer      2071.4        2069.3          2136.0              66.7
21 mer      2158.4        2157.8          2224.5              66.7
22 mer      2332.5        2331.4          2398.5              67.1
23 mer      2431.7        2430.7          2497.8              67.1
24 mer      2530.8        2529.3          2597.6              68.3
26 mer      2732.0        2731.5          2799.8              68.3
27 mer      2860.1        2857.1          2925.0              67.9
29 mer      3118.3        3116.6          3183.8              67.2
B. Stepwise Monitoring of the Peptide Synthesis with HPLC and MS
A portion of the desired EGF-like peptide (Figure 1) was then synthesized manually. After coupling of the first 7 amino acid residues, a sample of peptide resin was removed. Thereafter a sample of peptide resin was removed subsequent to each coupling step (except after the 16th and 28th cycles). The peptide resin samples were cleaved to obtain crude peptide fragments for analysis by both HPLC and MS. A plot collecting the HPLC results from the peptide samples containing 10 mer to 24 mer is shown in Figure 2. These HPLC profiles indicate that the synthesis proceeded well initially. However, after the 13th residue from the
C-terminal a new peak appeared in the HPLC profile (shown shaded in Figure 2). This unknown peak eluted approximately 5 minutes later than the desired peptide peak. The area of the unknown peak increased with each stepwise coupling. After 10 more couplings, the area of this new HPLC peak was almost the same as that of the desired peptide peak. Figure 3 shows the ratio of the area of the unknown peak to that of the expected peptide peak calculated from HPLC, compared with the ratio of peak heights obtained from MS analysis. Mass spectrometric analysis of individual peptide samples showed that the unknown peptide had a mass 67 u larger than that of the expected peptide (see Tables I and II). This difference in mass suggests that the unknown peak was possibly caused by aspartimide formation (-18 u) and subsequent ring opening by piperidine addition (+85 u). For the crude peptide containing 12 residues, a small peak in the electrospray mass spectrum showed a mass 18 u lower than the target peptide, which indicates the loss of water due to aspartimide formation. This finding supports the possibility that a piperidine adduct was derived from nucleophilic attack at the aspartimide formed between the adjacent Asp and Asn residues.
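The mass bookkeeping behind this assignment can be sketched in a few lines. This is only an illustration: the average masses of water and piperidine are standard textbook values assumed here (they are not quoted in the chapter), and the three observed differences are the 13 mer to 15 mer entries of Table I.

```python
# Average molecular masses in u. These are standard values assumed for
# illustration; the chapter itself only quotes the rounded -18 u and +85 u.
WATER = 18.015       # H2O lost when the aspartimide ring closes
PIPERIDINE = 85.15   # C5H11N gained when piperidine opens the ring

def net_shift(steps):
    """Sum the signed mass changes (u) of a sequence of side-reaction steps."""
    return sum(steps)

aspartimide_shift = net_shift([-WATER])
piperidide_shift = net_shift([-WATER, +PIPERIDINE])

# Observed differences for the 13, 14 and 15 mer (Table I).
observed = [67.0, 67.2, 66.7]
deviations = [round(d - piperidide_shift, 2) for d in observed]

print(f"aspartimide: {aspartimide_shift:+.1f} u")
print(f"piperidine adduct: {piperidide_shift:+.1f} u")
print("observed minus predicted:", deviations)
```

The predicted net shift of roughly +67 u agrees with the tabulated differences to within a few tenths of a mass unit, which is the argument the text makes in words.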
Table II. A tabular presentation of the peptide ladder mass spectrometry. The data were obtained from two mixtures. The first mixture contained the individual peptides from the 9 mer to the 18 mer and the second mixture from the 19 mer to the 32 mer.

Peptide   Mass of expected peptide   Mass found in     ΔMass between expected
          Theory      Found          unknown species   and unknown peptides
9 mer      949.1       948.9
10 mer    1006.2      1006.1
11 mer    1120.3      1120.0
12 mer    1235.4      1234.9
13 mer    1306.5      1306.3         1374.3            68.0
14 mer    1419.6      1419.5         1486.8            67.3
15 mer    1520.7      1520.4         1587.5            67.1
17 mer    1741.0      1741.4         1808.0            66.6
18 mer    1897.1      1897.3         1964.9            66.7
19 mer    1968.2      1968.2         2035.9            67.7
20 mer    2071.4      2071.3         2138.9            67.6
21 mer    2158.4      2158.6         2226.0            67.4
22 mer    2332.5      2332.6         2399.4            66.8
23 mer    2431.7      2431.2         2499.4            68.2
24 mer    2530.8      2530.4         2598.3            67.9
26 mer    2732.0      2729.9         2799.4            69.5
27 mer    2860.1      2859.1         2927.5            68.4
29 mer    3118.3      3117.6         3185.2            67.6
30 mer    3255.5      3254.3         3323.8            69.5
31 mer    3358.6      3358.1         3425.1            67.0
32 mer    3505.8      3505.2         3572.0            66.8
MS Analysis of the Synthetic Peptide Ladder
MS analysis of synthetic peptide ladders, first introduced for the purpose of sequencing in solid phase synthesis (5,6), was used here for tracking side reactions. A sample was prepared by pooling the individual crude peptides. A ladder spectrum gives a single readout data set that can be interpreted in a straightforward manner. From the peptide ladder mass spectrum (Figure 4), it is clear that the Δ = 67 u peak first appeared at the 13th residue from the C-terminal, in the peptide bearing the sequence -Ala-Asp-Asn-. Table II shows the results obtained from this ladder spectrum for the normal peptide and the unknown species.
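As an illustration of how a ladder is read, the spacing between neighboring rungs can be matched against residue masses. The sketch below assumes that the theoretical masses of the 9 mer to 15 mer in Table II read as listed, and it uses standard average residue masses that are not taken from the chapter; the function name and tolerance are hypothetical.

```python
# Standard average residue masses in u (assumed values, not from the chapter).
RESIDUE_MASS = {
    "Gly": 57.05, "Ala": 71.08, "Thr": 101.10,
    "Leu/Ile": 113.16, "Asn": 114.10, "Asp": 115.09,
}

# Theoretical ladder masses for the 9 mer to 15 mer, as read from Table II.
ladder = [949.1, 1006.2, 1120.3, 1235.4, 1306.5, 1419.6, 1520.7]

def read_ladder(masses, tol=0.2):
    """Assign each rung-to-rung mass step to the single residue within tol."""
    calls = []
    for lighter, heavier in zip(masses, masses[1:]):
        step = heavier - lighter
        hits = [name for name, m in RESIDUE_MASS.items() if abs(step - m) <= tol]
        calls.append(hits[0] if len(hits) == 1 else "?")
    return calls

# Residues added at positions 10 to 15, counted from the C-terminal.
calls = read_ladder(ladder)
for position, residue in enumerate(calls, start=10):
    print(position, residue)
```

Under these assumptions the steps at positions 11, 12 and 13 come out as Asn, Asp and Ala, matching the -Ala-Asp-Asn- stretch at which the +67 u species first appears.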
IV. Conclusions
Stepwise analysis is an efficient and direct method to monitor the progress of side reactions in Fmoc chemistry. The analysis involves stepwise micro-scale TFA cleavage in conjunction with HPLC and MS. Micro-scale TFA cleavage can be conveniently carried out on a small amount of sample. MS analysis of peptide ladders provides a rapid method for monitoring and identification of side reactions in peptide synthesis.
Figure 4. Partial view of the matrix-assisted laser desorption mass spectrum of the synthetic peptide ladder from 9 to 32 residues (x-axis: m/z, ticks from 800 to 1600). The formation of aspartimide (loss of water, -18 u) and the piperidine adduct (+67 u) are strongly observed after the synthesis of 13 residues. The weak intensity of the peak corresponding to the 14 mer is due to the low amount of 14 mer added.
Acknowledgments
Most aspects of the peptide synthesis were performed using facilities generously provided by Dr. R. Bruce Merrifield of The Rockefeller University. This work was supported in part by US PHS grants HL41935 (J.P.T. and W.V.S.), CA 36544 (J.P.T.) and RR00862, GM38274 (B.T.C.), and by US PHS grant RR03037 to the Hunter College Synthesis and Sequence Facility.
References
1. Merrifield, B. (1984). Science 232, 341-347.
2. Carpino, L.A. and Han, G.Y. (1972). J. Org. Chem. 37, 3404-3409.
3. Tam, J.P., Rieman, M.W. and Merrifield, R.B. (1988). Peptide Res. 1, 6-18.
4. Merrifield, R.B. (1963). J. Am. Chem. Soc. 85, 2149-2154.
5. Chait, B.T., Wang, R., Beavis, R.C. and Kent, S.B.H. (1993). Science 262, 89-92.
6. Walker, S.H., Wang, R., Milton, S., Chait, B.T. and Kent, S. (1993). In "Proceedings of the 41st ASMS Conference on Mass Spectrometry and Allied Topics", San Francisco, CA, pp. 380a-380b.
7. Chowdhury, S.K., Katta, V. and Chait, B.T. (1990). Rapid Commun. Mass Spectrom. 4, 81-87.
8. Beavis, R.C. and Chait, B.T. (1989). Rapid Commun. Mass Spectrom. 3, 233-237.
9. Beavis, R.C. and Chait, B.T. (1990). Anal. Chem. 62, 1836-1840.
ERRATUM The following article, minus its second page, appeared in Techniques in Protein Chemistry V. It is repeated here in its entirety.
High Sensitivity Peptide Sequence Analysis Using In Situ Proteolysis on High Retention PVDF Membranes and a Biphasic Reaction Column Sequencer
Sandra Best, David F. Reim, Jacek Mozdzanowski, and David W. Speicher
The Protein Microchemistry Laboratory, The Wistar Institute, Philadelphia, PA
I. Introduction
Polyvinylidene difluoride (PVDF) membranes have proven to be valuable supports for the facile isolation of proteins by electroblotting from polyacrylamide gels (1,2). The chemical resistance of PVDF permits direct automated N-terminal sequence analysis of proteins, and the high binding capacity of newer generation "high retention" PVDF membranes minimizes potential losses during the electroblotting step (3,4). As a result of these advances, the determination of N-terminal sequences is fairly routine even when as little as 10 pmoles or less of a partially purified protein is loaded onto a 1D or 2D gel for final isolation. Unfortunately, at least 50% to 80% of all proteins have blocked N-termini (5,6). Therefore it is necessary either to remove the blocking group or to obtain internal sequences after proteolytic or chemical fragmentation of the protein. Since the chemical nature of the blocking group is usually unknown, it is generally more practical to obtain internal sequences. An early, reliable strategy for obtaining internal sequences involved electroblotting proteins to nitrocellulose membranes, in situ protease digestion, and subsequent isolation of peptides by reverse phase HPLC (7). This method has been employed in our laboratory for more than three years with a success rate of >95%. The two major limitations of this approach are: 1) the blotting efficiency using nitrocellulose tends to be highly variable and recovery at this step is frequently low, and 2) the multiple step nature of this procedure further reduces overall recoveries. Even when recoveries are optimized at each step of the procedure, it is usually necessary to start with at least 5 to 10 times more protein than the amount required for direct N-terminal sequence analysis. Various alternative strategies have been reported for obtaining internal sequences in high yield from proteins separated by SDS gel electrophoresis. 
A number of approaches using electroblotting to PVDF membranes instead of nitrocellulose include: blotting to PVDF/CNBr digestion/extraction/protease digestion/HPLC (8); blotting to a cationically derivatized PVDF membrane (Immobilon CD) followed by either chemical cleavage or proteolysis (9); extraction from PVDF with triton and SDS/removal of detergent with a reverse gradient/solution digestion/reverse phase HPLC (10); and cleavage on the PVDF membrane at tryptophans/re-electrophoresis and re-blotting to PVDF (11). A number of "in gel" cleavage strategies have also been explored in attempts to improve recoveries, including: partial in gel digestion with V8 protease followed by a second gel separation and electroblot onto PVDF (12); and cleavage in the gel matrix followed by reverse phase HPLC separation (13,14). Although each of the above approaches has its merits, an ideal method for obtaining internal sequences would use a minimum number of steps with high recoveries at each step and would be applicable to essentially any protein. A recent report by Fernandez et al. (15) described a modification of the original in situ nitrocellulose method (7) where hydrogenated triton was added to the digestion solution to improve peptide extraction from either nitrocellulose or PVDF. A surprising conclusion of the Fernandez et al. study was that good proteolysis and extraction of peptides could be obtained in the presence of hydrogenated triton even when newer high retention PVDF membranes such as Trans-Blot (Bio-Rad) were used. Efficient extraction from high retention membranes was unexpected since it was previously observed that CNBr cleaved peptides were more difficult to extract from the tighter binding membranes (8), and we observed similar poor extraction from these membranes even when detergents were used for extraction (data not shown). 
TECHNIQUES IN PROTEIN CHEMISTRY V. Copyright © 1994 by Academic Press, Inc. All rights of reproduction in any form reserved.
Coupling the use of high retention PVDF membranes with use of hydrogenated triton in the protease digestion buffer appeared particularly attractive since high electroblotting efficiencies of most proteins to high retention PVDF membranes such as Trans-Blot can be obtained, and unlike the original Immobilon P membrane, blotting efficiencies are minimally influenced by specific transfer conditions (3). The goals of the current study were to 1) reevaluate the relative yields of peptides from in situ protease digestion by comparing nitrocellulose, a low retention PVDF membrane (Immobilon P) and a high retention PVDF membrane (Trans-Blot or ProBlott) using the hydrogenated triton approach; 2) determine the minimal amount of standard and experimental peptides needed to obtain useful sequence information in our facility; and 3) evaluate the potential advantages of a biphasic reaction cartridge sequencer for low picomole to subpicomole sequence analysis of peptides.
II. Methods
A. SDS-PAGE and Electroblotting to PVDF
Protein samples were solubilized in Laemmli solubilizing buffer without urea, heated for 15 minutes at 37°C, and separated using SDS-PAGE as described by Laemmli (16). SDS gels including the stacking gel were allowed to stand at room temperature for 24 h to help eliminate free radicals and other reactive byproducts. Additionally, 0.1 mM thioglycolate was added to the upper buffer chamber prior to electrophoresis. Transfer of proteins to a high retention PVDF membrane was performed in a Bio-Rad Trans-Blot solid plate electrode apparatus using a 0.5X Towbin (17) buffer (96 mM glycine, 12.5 mM Tris, pH 8.3) with 10% methanol for 3 h at 250 mA constant current (3). After transfer, the membrane was thoroughly rinsed with Milli-Q water, then stained with one of three stains. Possible PVDF stains in order of preference included Amido Black (Sigma), Ponceau S (Sigma), and Coomassie Blue R-250 (Bio-Rad). The PVDF membranes were air dried, sealed in plastic bags and stored at -20°C. Gels after transfer were stained with Coomassie Blue R-250 to detect any protein that did not transfer.
B. Comparison of Membrane Types
Equal amounts (100 pmoles) of a standard protein from a single solution were applied to multiple lanes of a gel. After electrophoresis, nitrocellulose, Immobilon P, and Bio-Rad Trans-Blot membranes were placed side-by-side on a single gel so that transfer conditions were identical for each membrane type. A Trans-Blot PVDF membrane was used as a backup to detect any protein not immobilized by the primary membrane.
C. In Situ Digestion
Proteolysis was performed as described by Fernandez et al. using hydrogenated triton (15) with several modifications. Briefly, the membrane bands and tubes were blocked a single time with 0.2% PVP-40 in methanol, washed several times with Milli-Q water and then washed several times with 20% acetonitrile. Washes were performed in a sonicator bath (about 5 min per wash) and solutions were removed by vacuum aspiration. After protein bands were washed, they were cut into approximately 1 × 3 mm pieces prior to digestion at 37°C with trypsin for 24 h in a buffer containing 1% triton/10% acetonitrile/100 mM Tris, pH 8.0. In most experiments, 0.2 µg of trypsin (modified trypsin, # v511/1,2, Promega) was initially added per sample; after 4 to 6 h a second 0.2 µg aliquot of enzyme was added and digestion was continued overnight. Generally 1 to 5 protein bands were digested per reaction tube in a total of 50 µl. A maximum of about 100 mm² membrane surface area was used per reaction. Samples which were spread over a larger amount of membrane were divided into two tubes, digested separately, and pooled immediately before HPLC separation. After digestion, the sample was sonicated for 5 min and briefly centrifuged, the supernatant was removed, and digestion was stopped by adding 3.7 µl of 5% TFA, which decreases the pH to about 2.0. The membrane pieces were then rinsed with 25 µl of digestion buffer, sonicated for 5 min, and briefly centrifuged; the supernatant was removed and 1.8 µl of 5% TFA was added. A second rinse was completed using 25 µl of 0.1% TFA following the same procedure. The supernatant and washes were combined and stored at -20°C prior to reverse phase HPLC separation.
D. Reverse Phase HPLC and Fraction Collection
Samples were run on a Supelco LC-18-DB HPLC column, 2.1 mm × 25 cm. The buffers were A = 0.1% TFA in Milli-Q water and B = 0.09% TFA in 70% acetonitrile, and the gradient was: 0% B, 5 min; 0 - 10% B, 10 min; 10 - 50% B, 60 min; 50 - 100% B, 25 min; 100% B, 10 min. Fractions were collected in 1.5 ml polypropylene tubes, which were precleaned with 0.1% TFA in 50% acetonitrile, using an Isco Foxy fraction collector with peak separator. The fraction collector was enclosed in a Plexiglas chamber which was under positive nitrogen pressure to minimize airborne contamination of fractions. Fractions were capped and stored at -20°C. Immediately before loading on the sequencer, TFA (ABI) was added to fractions (25% final) to minimize peptide losses due to adsorption to the tube or pipet tip (18).
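The gradient program above is a piecewise linear function of time, which can be sketched as follows. The segment table is transcribed from the text; the function names are hypothetical, and the acetonitrile estimate assumes simple proportional mixing of buffers A and B.

```python
# (duration in min, starting %B, ending %B) for each segment of the program.
GRADIENT = [(5, 0, 0), (10, 0, 10), (60, 10, 50), (25, 50, 100), (10, 100, 100)]

def percent_b(t_min):
    """Return %B at t_min minutes after injection by linear interpolation."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for duration, start, end in GRADIENT:
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return start + fraction * (end - start)
        elapsed += duration
    raise ValueError("time is beyond the end of the gradient program")

def percent_acetonitrile(t_min):
    """Approximate % acetonitrile: buffer B is 70% acetonitrile, buffer A has none."""
    return 0.70 * percent_b(t_min)

total_run = sum(duration for duration, _, _ in GRADIENT)
print("total run time (min):", total_run)
print("%B at 45 min:", percent_b(45))
```

A peptide eluting at 45 min, for example, would come off the column at about 30% B under this reading of the program, midway through the shallow 10 to 50% B segment where most tryptic peptides are resolved.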
E. Sequencer Programs and Reagents
Sequences were analyzed using either an updated Applied Biosystems 475A sequencer as described previously (19) or a Hewlett-Packard G1005A sequencer which uses a novel biphasic reaction cartridge. The G1005A sequencer used standard reagents, solvents and programs (initially Version 1.3 and also a prototype Version 2.0) as supplied by the manufacturer. Samples were loaded onto the hydrophobic half of the biphasic cartridge after dilution to 1 ml final volume with 2% TFA (HP) using the Sample Preparation Station. Prior to loading the sample, the loading funnel was precleaned with sequential washes using Milli-Q water, methanol and 2% TFA to minimize background signals in the first sequence cycle. In most cases only 50 to 75% of the total HPLC fraction was loaded to the sequencer. Reported PTH amino acid yields have been corrected for background and the amount injected on the PTH analyzer (50 µl/75 µl) unless otherwise indicated. Initial and repetitive yields were calculated by linear regression using least squares of log (pmol yield) versus cycle number. Labile amino acids (serine, threonine and tryptophan) as well as the last two residues (C-terminal and penultimate residues) were excluded from linear regression calculations.
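The yield calculation described above can be sketched as a least-squares fit of log10(pmol) against cycle number. Several things here are assumptions rather than the authors' procedure: the initial yield is taken as the fitted line extrapolated to cycle 0, the function names are hypothetical, and the yields are synthetic numbers generated for the demonstration. Only the peptide sequence is borrowed from Fig. 3.

```python
import math

LABILE = set("STW")  # serine, threonine and tryptophan are excluded from the fit

def usable_cycles(sequence):
    """1-based cycles kept for regression: not labile, not the last two residues."""
    last_kept = len(sequence) - 2
    return [i for i, aa in enumerate(sequence, start=1)
            if aa not in LABILE and i <= last_kept]

def fit_yields(cycles, pmol):
    """Least-squares line through (cycle, log10 pmol); returns (initial, repetitive)."""
    ys = [math.log10(p) for p in pmol]
    n = len(cycles)
    mean_x = sum(cycles) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cycles, ys))
    sxx = sum((x - mean_x) ** 2 for x in cycles)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Assumption: initial yield is the extrapolation to cycle 0.
    return 10 ** intercept, 10 ** slope

# Synthetic yields (not measured data): 4 pmol starting level, 90% per cycle.
sequence = "VPADTEVVCAPPTAYIDFAR"
cycles = usable_cycles(sequence)
pmol = [4.0 * 0.90 ** c for c in cycles]
initial, repetitive = fit_yields(cycles, pmol)
print(f"initial = {initial:.2f} pmol, repetitive yield = {repetitive:.1%}")
```

Because the slope of the log plot is log10 of the per-cycle recovery, the repetitive yield is simply 10 raised to the slope; real data scatter around the line, so the fit recovers the parameters only approximately.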
III. Results and Discussion
A. Evaluation of In Situ Trypsin Digestion on PVDF Membranes Using Hydrogenated Triton
Comparative HPLC peptide maps of transferrin and myoglobin were used to evaluate relative yields of tryptic peptides from different types of blotting membranes. To ensure identical transfer conditions, replicate lanes from a single gel were transferred to side-by-side strips of nitrocellulose, Immobilon P PVDF and Trans-Blot PVDF as described in Methods. As shown in Fig. 1, our results confirm the observations of Fernandez et al. (15) that optimal yields are obtained from the high retention PVDF membranes such as Trans-Blot. In the example shown in Fig. 1, the recovery (estimated from peak heights) of most tryptic myoglobin peptides from Trans-Blot PVDF is three to five times higher than from nitrocellulose. These observed differences in yields between membrane types appear to primarily reflect differences in electroblotting recoveries rather than variations in proteolysis or peptide extraction, since the observed peptide yields on the HPLC profile roughly correlate with the blotting efficiency on the different membrane types. In addition, only minor differences in peptide recoveries from different membrane types were observed for transferrin, which is well retained by all membrane types evaluated here (data not shown).
Figure 1. HPLC peptide map comparisons (215 nm) using different membrane types. In situ tryptic digestion of replicate apomyoglobin bands (100 pmols loaded/lane) electroblotted from a single gel onto: 1, nitrocellulose; 2, Immobilon P; 3, Trans-Blot PVDF. B, peaks in a trypsin/buffer control chromatogram. P, PVP-40 peak which is variable from run to run, but more prominent on Trans-Blot membranes.
Based on these results, high retention PVDF membranes such as Trans-Blot are the preferred membranes for routine in situ protease digestions of gel purified proteins due to: their high retention of proteins during electroblotting; their improved handling characteristics compared with nitrocellulose; the high recovery of proteolytic peptides from these membranes when hydrogenated triton is used in the digestion solution; and the fact that recoveries are not adversely affected if the membranes are dried and stored at -20°C. High retention PVDF membranes have been used exclusively in our facility over the past 9 months for in situ digestion of proteins. The only disadvantage of the high retention PVDF membranes is the large PVP-40 peak which elutes late in the HPLC separation and occasionally obscures late eluting peptides. However, two alternative approaches to this problem have recently been reported. Fernandez et al. (20) have eliminated the PVP-40 blocking step and find that the hydrogenated triton provides adequate blocking of potential protein binding sites. 
Alternatively, Tempst and coworkers have observed that the PVP-40 peak can be eliminated while preserving high protein recovery, if Tween 80 is used in place of hydrogenated triton (21 and personal communication).
Figure 2. Representative internal peptide sequence. In this experiment, a single band of HSA was electroblotted to Trans-Blot PVDF (40 pmol loaded to gel), digested with trypsin, and separated by HPLC. The initial coupling was 2.8 pmoles and the repetitive yield was 88%.
B. Determination of the Minimal Protein Amount Required to Obtain Internal Sequences
There are currently no consistently reliable methods which permit accurate estimations of the amount of experimental proteins actually present on blots. Therefore, several standard proteins were run on 1D SDS-PAGE at different loads, blotted onto Trans-Blot PVDF, digested with trypsin and separated by HPLC to estimate the minimal amount of protein required to reliably obtain internal sequences. The tryptic peptide sequence shown in Fig. 2 is a low yield peptide which was obtained from a 40 pmole load of human serum albumin (HSA) onto the gel followed by blotting to a Trans-Blot membrane. The recovery on the blot was estimated at about 75% or 30 pmoles. This overall yield on the PVDF membrane is affected by both minor losses during electrophoresis and losses during electroblotting. As shown, the complete, unambiguous sequence was readily obtained on the HP G1005A sequencer with an initial coupling of about 4 pmoles (13% of the amount on the blot, 10% of the amount applied to the gel). This example represents the lower end of the typical recovery range for this method; this low recovery is probably due to the fact that the illustrated peptide is an incomplete cleavage and/or the fact that the sample was stored for several months between the HPLC separation and the sequence analysis. Since useful sequence can be routinely obtained on the HP G1005A sequencer with initial couplings in the 1 - 2 pmole range (see below) and since the example shown above is a low yield, incomplete cleavage peptide, one could predict that the lower limit of this technique (as applied here using 2.1 mm HPLC columns) might be as low as 5 to 10 pmoles of protein on the blot. However, post digestion adsorptive losses are probably not linearly reduced as sample amounts decrease and we have not been consistently successful in recovering adequate amounts of peptides from standard proteins where <20 pmoles was applied to the gel. Therefore a realistic lower limit of electroblotted protein appears to currently be about 20 pmoles. Consistent with this estimate, during the 6 month period summarized in Table I, peptide sequences were obtained for all proteins where sequences were attempted even though the most commonly available amount on the blot was estimated to be 20 - 60 pmoles (see Table I).

Table I. Summary of Experimental In Situ Digestions Over a 6 Month Period.

Proteins Digested                     Sequence Results (a)
Amt on Blot   No.   M.W.      Number of Sequences                      Initial      Residues
(pmol) (b)          (kDa)     Attempted  Assigned (c)  Assigned to     Yields (e)   Assigned (f)
                                                       End (d)
101-200        7    29-140    26         23            19              10-60        16.3 (6-33)
61-100        12    45-150    20         17            15              7-27         10.2 (5-18)
20-60         13    40-180    28         21            13              1-7          13.5 (5-28)
Totals        32    29-180    74         61            47              1-60         13.3 (5-33)

(a) Values summarize results from all sequences where assignments could be made.
(b) Estimate based on protein staining intensity on blot relative to serial dilution of standard proteins on a duplicate membrane.
(c) Number of sequences where assignments could be made. Remaining sequences were either uninterpretable mixtures or too low level.
(d) Number of sequences where assignments could be made to the C-terminal lysine or arginine.
(e) Observed initial yields in pmoles. In many cases only 50% - 75% of total sample was used for sequence analysis. The indicated amount is not corrected back to total sample amount.
(f) Average number assigned per sequence with range shown in parentheses.

Three potentially important modifications of the current method compared with the original method (15) may influence success rates at the low pmole level. The differences in the current method are: 1) trypsin modified by reductive methylation (Promega) was used, which greatly minimizes autodegradation and essentially eliminates interference from trypsin derived peptides; 2) a second aliquot of enzyme was added after 4 - 6 h; and 3) instead of attempting to maintain a constant enzyme to substrate ratio, the amount of enzyme per reaction vessel was held constant with a maximum membrane surface area per vessel of 100 mm². We observed a variable loss of protease activity when enzyme concentrations of <0.1 µg/50 µl were used, presumably due to minor adsorptive losses onto the tube and membrane even in the presence of detergents and blocking groups. The above approaches can result in a high enzyme to substrate ratio when low pmole amounts of lower molecular weight proteins are used; however, use of less enzyme is likely to result in incomplete digestion with low protein levels. The lack of autodegradation products from modified trypsin makes high enzyme to substrate ratios practical; furthermore, background HPLC peaks from the modified trypsin are highly consistent (most peaks in the Fig. 1 trypsin control arise from triton or the HPLC solvents rather than trypsin). In our laboratory, when unmodified trypsin was used, tryptic sequences were observed in about 10% of the attempted sequences from peptide
maps while no trypsin sequences have been observed when modified trypsin has been used for in situ digestions (data not shown).
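To make the enzyme to substrate point concrete, the weight ratio can be estimated from the amounts quoted in the text. This is a rough sketch: the two example proteins (20 pmol at 29 kDa and at 180 kDa) are illustrative choices taken from the ranges in Table I, the 0.4 µg of trypsin is the sum of the two 0.2 µg aliquots, and the function names are hypothetical.

```python
def protein_micrograms(pmol, mw_kda):
    """Convert an amount in pmol and a molecular weight in kDa to micrograms."""
    # pmol * (g/mol) gives picograms; 1 kDa = 1000 g/mol; 1e6 pg = 1 µg.
    return pmol * mw_kda * 1000 / 1e6

def enzyme_to_substrate(enzyme_ug, pmol, mw_kda):
    """Weight ratio of enzyme to substrate protein."""
    return enzyme_ug / protein_micrograms(pmol, mw_kda)

TRYPSIN_UG = 0.4  # two 0.2 µg aliquots per reaction, as described in Methods

small = enzyme_to_substrate(TRYPSIN_UG, pmol=20, mw_kda=29)
large = enzyme_to_substrate(TRYPSIN_UG, pmol=20, mw_kda=180)
print(f"20 pmol of a 29 kDa protein: E:S = {small:.2f} (w/w)")
print(f"20 pmol of a 180 kDa protein: E:S = {large:.2f} (w/w)")
```

With a fixed enzyme amount, the ratio for a small protein at the 20 pmol level approaches 1:1 by weight, far above the ratios typical of solution digests, which is why the absence of autolysis peptides from the modified trypsin matters.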
C. Evaluation of the HP Sequencer for High Sensitivity Peptide Sequence Analysis
The biphasic reaction column of the HP G1005A sequencer facilitates sample loading and cleanup. Direct comparisons of the biphasic column sequencer with gas phase sequencing on polybrene-coated glass filters showed comparable initial yields and comparable repetitive yields for most peptides. In general, recovery of C-terminal lysines was lower on the biphasic cartridge compared with a duplicate sample analyzed on a polybrene-coated glass filter, while C-terminal arginines were recovered in comparable yields. In many cases the first one or two cycles could not be confidently assigned on a gas phase sequencer using polybrene-coated glass filters, especially when initial yields were less than 10 pmoles. However, most initial cycles could be confidently assigned when an identical sample was run on the biphasic cartridge sequencer. The sequence shown in Fig. 3 is an example of a high sensitivity run from an unknown protein purified in a single step using 2D gel electrophoresis. The entire indicated sequence was unambiguously assigned with an initial yield of <4 pmol and a database search identified the protein as triose phosphate isomerase. Subpicomole sequencing of peptides can also be readily performed using the G1005A sequencer with the reagents, supplies and programs provided by the manufacturer. Fig. 4 shows the sequence of a 13 residue peptide from an in situ digestion of an unknown 110 kDa protein. The entire sequence, including tryptophan in the first cycle, was unambiguously assigned with the exception of the C-terminal arginine, which was a tentative assignment. A subsequent database search identified the protein as α-adducin and confirmed the accuracy of the tentative arginine. Fig. 4B illustrates the relatively flat, stable baseline of the PTH separation, good signal to noise ratio, and low background in the early cycles which make sequence assignment at the hundred fmol level possible. While similar sequence levels can be obtained on other sequencers and PTH analyzers, it should be noted that the data presented here were obtained without any special optimizations or modifications of the sequencer or PTH analyzer. Also, the multiple column feature and short reaction cartridge precycle time (about 18 min) on the G1005A sequencer minimize operator time; about 1 h is required to set up two sequence runs.
Figure 3. Sequence data for a 20 residue experimental tryptic peptide. A 27 kDa protein was purified from a whole cell extract of a human melanoma cell using high resolution 2D gels followed by electroblotting to PVDF. Spots derived from four gels were combined for in situ digestion. The estimated amount of protein present on the combined blots was about 20 pmol. The initial yield was 3.4 pmoles and the repetitive yield was 86%. Residues as labeled on the plot: V P A D T E V V C A P P T A Y I D F A R.
Figure 4. An experimental subpicomole sequence on the HP G1005A sequencer. A, linear regression plot of total residue yields (corrected for the portion injected onto the PTH analyzer). The initial yield was 900 fmoles and the repetitive yield was 92%. B, chromatograms for selected cycles indicating the residue assignments in single letter code and the actual detected quantities in fmoles (not corrected for the portion injected).
In summary, in situ protease digestion on high retention PVDF membranes in the presence of hydrogenated triton provides a high yield method for obtaining internal peptides from proteins which are electroblotted from either 1D or 2D gels. Sequences can usually be obtained from most proteins when at least 20 pmoles is present on the PVDF membrane. The major limitations with lower amounts of protein on the blot appear to be due to adsorptive losses on the HPLC column and/or in the collection tubes rather than sequencer sensitivity. The use of smaller HPLC columns and direct collection of peaks on the sequencer reaction cartridge would be expected to further improve the sensitivity of this technique. Complete unambiguous sequence assignments of tryptic peptides can usually be made on experimental peptides with initial yields ranging from several pmoles to slightly less than 1 pmole on the biphasic cartridge sequencer without special treatments or modifications (see Fig. 4).
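One way to see why initial and repetitive yield together set the readable length of a peptide is to project the signal forward cycle by cycle. The sketch below uses the Fig. 4 values (900 fmol, 92%); the simple model yield(n) = initial × r^n and the 100 fmol detection floor are assumptions made for illustration, not figures from the chapter.

```python
def predicted_yield(initial_fmol, repetitive_yield, cycle):
    """Predicted PTH yield at a given cycle under a constant repetitive yield."""
    return initial_fmol * repetitive_yield ** cycle

def cycles_above_floor(initial_fmol, repetitive_yield, floor_fmol):
    """Count consecutive cycles whose predicted yield stays at or above the floor."""
    if not 0 < repetitive_yield < 1:
        raise ValueError("repetitive yield must be between 0 and 1")
    n = 0
    while predicted_yield(initial_fmol, repetitive_yield, n + 1) >= floor_fmol:
        n += 1
    return n

# Fig. 4 values; the 100 fmol floor is a hypothetical detection limit.
readable = cycles_above_floor(900, 0.92, 100)
at_cycle_13 = predicted_yield(900, 0.92, 13)
print("cycles above floor:", readable)
print(f"predicted yield at cycle 13: {at_cycle_13:.0f} fmol")
```

Under this model a 13 residue tryptic peptide stays well above a hundred-femtomole floor for its whole length, consistent with the observation that sequencer sensitivity is not the limiting factor at these sample levels.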
Acknowledgements
This study was partially supported by grants from the NIH to D.W.S. We also thank the Protein Chemistry Systems Group at Hewlett-Packard for their support, including supplying prototype Version 2.0 programs and Version 2.0 R1 and R2 reagents.
References
1. Matsudaira, P. (1987) J. Biol. Chem. 262, 10035-10038.
2. Bauw, G., De Loose, M., Inze, D., Van Montagu, M., and Vandekerckhove, J. (1987) Proc. Natl. Acad. Sci. USA 84, 4806-4810.
3. Mozdzanowski, J. and Speicher, D.W. (1992) Anal. Biochem. 207, 11-18.
4. Reim, D. and Speicher, D.W. (1992) Anal. Biochem. 207, 19-23.
5. Brown, J.L. and Roberts, W.K. (1976) J. Biol. Chem. 251, 1009-1014.
6. Driessen, H.P., de Jong, W.W., Tesser, G.I. and Bloemendal, H. (1985) In "Critical Reviews in Biochemistry" (Fasman, G.D., ed.) Vol. 18, pp. 281-325, CRC Press, Boca Raton.
7. Aebersold, R.H., Leavitt, J., Saavedra, R.A., Hood, L.E. and Kent, S.B. (1987) Proc. Natl. Acad. Sci. USA 84, 6970-6974.
8. Stone, K.L. (1992) in Techniques in Protein Chemistry III (R.H. Angeletti, Ed.), pp. 23-34, Academic Press, San Diego, CA.
9. Aebersold, R., Patterson, S.D., Hess, D. (1992) in Techniques in Protein Chemistry III (R.H. Angeletti, Ed.), pp. 87-96, Academic Press, San Diego, CA.
10. Simpson, R.J. et al. (1989) J. Chrom. 476, 345-361.
11. Crimmins, D.L. et al. (1990) Anal. Biochem. 187, 27-38.
12. Kennedy, T.E. et al. (1988) Proc. Natl. Acad. Sci. USA 85, 7008-7012.
13. Eckerskorn, Ch. and Lottspeich, F. (1989) Chromatographia 28, 92-94.
14. Kawasaki, H. et al. (1990) Anal. Biochem. 191, 332-336.
15. Fernandez, J. et al. (1992) Anal. Biochem. 201, 255-264.
16. Laemmli, U.K. (1970) Nature 227, 680-685.
17. Towbin, H., Staehelin, T., and Gordon, J. (1979) Proc. Natl. Acad. Sci. USA 76, 4350-4354.
18. Erdjument-Bromage, H., Geromanos, S., Chodera, and Tempst, P. (1993) in Techniques in Protein Chemistry IV (R.H. Angeletti, Ed.), pp. 419-426, Academic Press, San Diego, CA.
19. Reim, D.F. and Speicher, D.W. (1993) Anal. Biochem. 214, in press.
20. Fernandez, J., Andrews, L., and Mische, S.M. (1993) Protein Science 2 (Suppl. 1), 103.
21. Erdjument-Bromage, H., Lui, M., Lacomis, L. and Tempst, P. (1993) Protein Science 2 (Suppl. 1), 104.
Index
ABRF-94SEQ, 209-216 best responses, 216 protein characterization, 210 sample preparation and distribution, 210 sequence assignments, 211-213 survey results, 210-211 test sample selection, 210 Acetylation, E. coli, 105 p-N-Acetylhexosaminidase, treatment of monoclonal antibody oligosaccharides, 71-72 e-N-Acetyllysine, 99-106 quantitation, amino acid analysis, 104-105 Acrylamide gel resolved proteins, peptide mapping, 316-318 ACTH, synthesis, 532-534 Adsorption isotherms, frontal loading, 314 Affinity coelectrophoresis, DNA binding peptides, 393-400 Albumin bovine serum, gel digestion, 163-165 human serum C-terminal protein sequencing, 225-226 cyanogen modification, 436, 439-441 Alkylation, see Carboxy-terminal protein sequencing a-helix formation in small proteins, 323-331 hydrophobic core repacking, 325, 330 model system, 324 structural controls, 326-327 thermodynamic characterization, 328-330 interactions between residues, 443-450 exhaustive conformational search, 443-444 glycine terminated, 446-450 side chain interactions, 444-445 ttL motif, 448-449 Amino acid, analysis, ABRF-94AAA collaborative trial, 185-192
  calculations, 186
  comparison of analyses of single batch and individual hydrolysates, 188-189
  cystine, 190
  glucosamine, 190
  histidine, 191
  participation, 186
  pre-hydrolyzed sample, 188
  sample preparation and distribution, 185
  yield and accuracy, 186-188
Antibodies, interactions with bacterial cell-surface proteins, 409-415
Antifreeze protein
  ESI-MS, 552-553
  Fmoc-cleavage and fragment condensation times, 552
  1D-NMR, 553
  RP-HPLC, 552
Apparent sedimentation coefficient distribution function, 428
Arthrobacter ureafaciens, treatment of monoclonal antibody oligosaccharides, 71
7-Azatryptophan, 349-356

B

Bacterial cell-surface proteins, interactions with antibodies, 409-415
Bacteriophage, P22, high sensitivity sedimentation methods, 429-431
pi, thermodynamic characterization, 329-330
β-sheet
  formation in small proteins, 323-331
    model system, 324
    propensity studies, 326
    structural controls, 326-327
    thermodynamic characterization, 328-330
  water-soluble peptide, 451-456
    D-Phe, substitution effects on solubility and structure, 453-454
    effects of extending, 454-455
    materials and methods, 452-453
    stabilizing and destabilizing amino acid substitutions, 455-456
1,1'-Bi(4-anilino)naphthalene-5,5'-disulfonic acid, interaction with bacteriophage capacity, silica-based supports, 314
Bis(2-mercaptoethyl)sulfone, 260
Bitopic membrane proteins, appropriate solvent and chromatography systems, 301-309
  detergents and denaturants, 304
  homogeneous polypeptide-lipid mixtures, 304
  materials and methods, 302
  organic solvents, 302-304
  screening, 305
Boltzmann-averaged energy, 444
α-Bungarotoxin, in venom-derived κ-bungarotoxin, 293-298
  chick skeletal muscle assay, 296-298
  dissociation constants, 297-298
  materials and methods, 294-295
κ-Bungarotoxin, affinity for mouse fetal muscle receptor, 296, 298
bZIP proteins, peptide models, 385-390
  binding model, 387-388
  DNA binding, 389-390
  electrophoretic mobility shift assays, 386-387
  equilibrium dissociation constants, 389-390
  materials and methods, 386-389
  sequences, 385-386
  thermal stabilities, 389
Calbindin D-28K, purification, 287-288
Calcium-binding proteins, see Retina, calcium-binding proteins
Calmodulin, 401-408
  dissociation constants, 405-406
  far UV CD spectra, 406-407
  fluorescence spectra, 403-404
  materials and methods, 402-403
  near UV CD spectroscopy, 404-405
  purification, 286-287
Capl, purification, 286-287
Capillary electrophoresis, neuropeptide Y profile, 537
Carbohydrates, detecting posttranslational modifications, 109-111
Carbonic anhydrase II, heteronuclear gradient-enhanced NMR, 495-501
Carboxy-terminal protein sequence analysis, 219-227
  alkylation method, 229-236
    ABI 477A Protein Sequencer, 230
    electroblotted proteins, 234-236
    immobilization of purified proteins on PVDF by centrifugation, 230-231
    Protein Kinase CKII, 233-234
    recombinant interleukin-2, 232-233
  C-terminal coupling and cyclization reactions, 221
  HSA, 225-226
  β-lactoglobulin A, 225-226
  materials and methods, 220
  mouse immunoglobulin G, 223-224
  peptidylthiohydantoin cleavage reaction, 221-222
  polypeptides containing C-terminal proline, 239-246
    chemistry, 242-244
    materials and methods, 240-242
    reaction scheme, 243-244
  polyproline, 224-225
  sample application on inert reaction support, 220-221
  superoxide dismutase, 223-224
  thiohydantoin-amino acid derivatives, HPLC analysis, 222-223
β-Casein, stepped collision energy scanning LC-ESMS, 114
Catalysis, thiocarbamylation, 181-183
Chemical rescue, deficient site-directed mutants, 360-361
Chemoattractant protein-1, amino acid sequence, 127-129
Chemokines, CXC family, 127
2-Chlorotrityl resin
  solid-phase protein synthesis, 548-549
  synthesis of peptides differing in C-termini, 531-538
    ACTH(4-10) and ACTH(4-11) synthesis, 532-534
    neuropeptide Y synthesis, 535-537
    procedure, 532
    reagents and materials, 531
Chromatographic system, automated 2-dimensional, 40
Chromosome C, Lys-C digested, peptide separations with and without SDS, 269-272
α-Chymotrypsinogen A, reduction, 262
Circular dichroism spectroscopy
  far UV, calmodulin, 406-408
  gramicidin S, 453
  near UV, calmodulin, 404-405, 407
Cleavage
  correlation with side-reactions, following solid-phase peptide synthesis, 539-545
  peptidylthiohydantoin, 221-222
Collisionally induced dissociation spectra
  doubly charged ions, 58-59
  N-terminal chymotryptic peptide from phosphorylase, 60
Column sequencer, biphasic reaction
  evaluation for high sensitivity peptide sequence analysis, 572-573
  in situ proteolysis, 565-573
Conformational probe, cyanogen as, 435-441
Conus venom, 31-37
  HPLC, 33-35
  LSIMS, 32
  MALDI-TOF MS, 31-32
  materials and methods, 32-33
  observed masses in LSIMS and MALDI mass spectra, 33-35
  peptide modification, 33
  PSD spectrum, 35-36
Cyanogen, as conformational probe, 435-441
  experimental materials and methods, 436-437
  human serum albumin, 436, 439-441
  molar ellipticity as function of pH, 439-440
  ribonuclease S, 436-439
Cysteine
  content of calcium-binding proteins, 290
  residues, assignment, ABRF-94SEQ, 209-216
Cystine
  analyses, 190
  residues
    cleavage reagents, 193-198
    fragments from hen ovalbumin and yeast alcohol dehydrogenase, 196-197
    location in β-lactoglobulin, 195-197
    urodilatin, MALDI mass spectrum, 197-198
Cytochrome c
  derivatized peptide maps, 255-258
  modified, LC/MS, 57-59
  tryptic digests, 251-258
Cytokine-cytokine receptor interactions, 417-425
Deglycosylation, human B61, 77
Denaturants, bitopic membrane proteins, 304
Desalting, biological samples, 279-280
Desmopressin, NMR relaxation, 522-523, 526-527
Detergents
  anionic, removal, 281-282
    liquid chromatography mass spectrometry, 267-274
  bitopic membrane proteins, 304
  non-ionic, removal, 282-283
    liquid chromatography mass spectrometry, 267-274
  removal cartridges, 278
N,N-Diethylaminopropyl-bis-(3-hydroxypropyl) phosphine, 193-198
  materials and methods, 194
  structure, 193
Digestion, in-gel, protocol, 313
meso-2,5-Dimercapto-N,N,N',N'-tetramethyladipamide, 260
Diplococcus pneumoniae, treatment of monoclonal antibody oligosaccharides, 71
Disulfide bonds
  determination in human macrophage chemoattractant protein-1, 125-132
  reducing reagents, 259-266
DNA, recognition, bZIP proteins, 385
DNA binding peptides, affinity coelectrophoresis, 393-400
  ACE gel autoradiogram, 394-395
  clupeine Z, Scatchard plot, 398-399
  materials and methods, 394
  TPPI binding curve, 396
  Xfin-31
    binding isotherms, 397
    Scatchard plots, 397-398
DnaK, Hsp70-protein complexes, 467-473
Drosophila melanogaster, calmodulin, 401-408
Edman degradation, 83-89
  phosphorylation site identification, 117-122
    lag as function of phosphorylation position, 118, 120
    materials and methods, 117-118
    peptide coupling efficiency, distribution, 118-119
    preview as function of phosphorylation position, 118-119
    radioactivity and amino acid release, 121
  PTH-glycoamino acids, 86-88
  solid-phase, 85
Edman sequencing
  automated, 171
  direct collection onto Zitex and PVDF, 169-176
    Lys-C peptide chromatograms, 172, 174
    Lys-C peptide mass assignments, 176
    MALDI-TOF-mass spectrometry, 171-172
    polyacrylamide gel electrophoresis and electroblotting, 170
    protein and peptide sources, 170
    proteolytic digestions, 170-171
  minimizing N-to-O shift, 177-184
    kinetics of O-to-N acyl migration, 180-181
    materials and methods, 179-180
    thiocarbamylation catalysis, 181-183
Electroblotting
  C-terminal sequencing, 234-236
  protein preparation, 231
  to PVDF, 566-567
Electrophoretic mobility shift assays, bZIP proteins, 386-387
Electrospray ionization mass spectrometry
  ε-N-acetyllysine in E. coli, 103-104
  antifreeze protein, 552-553
  Asp-N 20.3 peptide, 78
  Asp-N 20.7 peptide, 79
  Asp-N 20.9 peptide, 80
  comparison with MALDI-TOF MS and LSIMS, 21-30
  EGF-like peptide, 559-560
  PTH-glycoamino acids, 85, 88-89
  solid-phase peptide synthesis, 541-545
Endo F2, treatment of monoclonal antibody oligosaccharides, 69
Endo H, treatment of monoclonal antibody oligosaccharides, 70
Erythropoietin, recombinant, digestion, 158-159
Escherichia coli
  β-galactosidase, 365-370
  glucose and galactose receptor, 488
  macrophage chemoattractant protein-1, 126
  neurotrophin-3 expression, 341-348
  recombinant proteins, ε-N-acetyllysine in, 99-106
    electrospray mass spectrometry, 103-104
    isolation of monoacetylated somatotropins, 100-102
    material and methods, 99-100
    quantitation, amino acid analysis, 104-105
    RP-HPLC analysis of somatotropins, 102-103
  trp repressor-operator complex, 503-509
Eukaryotic transcription factors, bZIP proteins, 385-390
Factor X, synthesis, 556-557
Fatty acyls, detection of posttranslational modifications, 113-115
Ferricytochrome c
  distance restraints, 513-514
  hydrogen bond restraints, 515
  restrained molecular dynamics and inclusion of structural water, 516-517
  sample preparation and NMR spectroscopy, 512-513
  solution structure, 511-518
  stereospecific assignments, 515-516
  structures, 517-518
  torsion angle restraints, from ¹H-¹H J-coupling constants, 514-515
Ferrocytochrome c
  distance restraints, 513-514
  hydrogen bond restraints, 515
  restrained molecular dynamics and inclusion of structural water, 516-517
  sample preparation and NMR spectroscopy, 512-513
  solution structure, 511-518
  stereospecific assignments, 515-516
  structures, 517-518
  torsion angle restraints, from ¹H-¹H J-coupling constants, 514-515
Fibrinopeptide B, quantitative derivatisation, 8-10
FK binding protein, equilibrium folding behavior, 463, 465
Flow velocity, effect on resolution and recovery, 315-316
Fluorescence spectra, calmodulin, 403-404
Fourier self-deconvolution, 478-480
Fragmentation, collision-induced, 107-108
Fragment condensation, solid-phase protein synthesis, 547-554
FTIR, attenuated total reflectance, 475-483
  chemical shifts, 489-490
  collected with Ge and ZnSe, 482
  curve fitting, 478-480
  data collection, 476-477
  dissociate rates, 490
  materials and methods, 489
  multivariate statistical methods, 480-481
  protein solutions, 476
  spectral pre-processing, 477-478
  titration experiments, 490-491
  two-dimensional NOESY exchange, 492-493
β-Galactosidase
  conserved histidine residues, 365-370
  pH profiles, 368-369
  probable mechanism, 366-367
  purification, 366
  treatment of monoclonal antibody oligosaccharides, 71
Gas phase sequencer, disulfide bond determination, 125-132
  diPTH-cystine release, 130-131
  disulfide bonded peptic peptide structure, 129
  materials and methods, 126-127
  PTH amino acid separation, 130
GCAP, purification, 288
Gel digestion
  peptide mapping, 153-160
  SDS PAGE-separated proteins, 143-152
    LDMS and peptide sequencing, 146
    optimizing internal sequencing, 149-150
    procedure, 144-145
    results from "unknown" proteins, 147-149
    reverse phase HPLC separation, 145-149
    sample separation, 144
  in zinc chloride and Ponceau S, 161-166
    in gel, 162-163
    on PVDF membrane, 163
    RP-HPLC and peptide sequencing, 163
    in solution, 162
Glucosamine, analysis, 190
Glucose and galactose receptor
  ¹⁹F NMR, 487-493
  structure, 488
Glycine, α-helix terminator, 446-450
Glycoproteins, 84
  covalent attachment to Sequelon-DITC and AA, 84-85
gD2, stepped collision energy LC-ESMS, 110-111
Glycosylation
  monoclonal antibody, 65-72
  N- and O-linked sites, 83-89
Gramicidin S
  D-Phe substitution effects on solubility and structure, 453-454
  spectroscopic characterization, 453
  structure, 451-452
  synthesis and cyclization, 452-453

H
Hewlett-Packard G1009A C-terminal protein sequencing system, 219-227
High-performance liquid chromatography
  ACTH profile, 533-534
  Conus venom, 33-35
  method for protein folding analysis, 459-466
  microbore PTH separations, 202-203
  neuropeptide Y profile, 535-536
  peptide map, comparisons using different membrane types, 568-569
  reversed-phase
    antifreeze protein, 552
    denatured MHC complex, 379, 382
    digestion in zinc chloride and Ponceau S, 163
    EGF-like peptide, 558-560
    factor X, 556
    high sensitivity peptide sequencer analysis, 568
    neurotrophin-3, 345
    separation of enzymatic digests, 145-149
    silica-based supports, 311-319
    solid-phase peptide synthesis, 540, 542, 545
    thiohydantoin amino acid, 241-242
  size-exclusion, Hsp70-protein complexes characterization, 467-473
  thiohydantoin-amino acid derivatives, 222-223
  UV chromatogram, peptide mixture, 43-44
Histidine, analysis, 191
Histidine residues, conserved, β-galactosidase, 365-370
  enzyme characterization, 368-370
  kinetic and inhibitor constants, 366-367
  purification and stability, 367-368
  site directed mutagenesis, 365-366
    rationale for, 367
HMQC-NOESY, large proteins and complexes, 506-507
¹H-¹⁵N Heteronuclear single quantum coherence, gradient-enhanced, 498-499
HPAEC-PAD analysis, monoclonal antibody glycosylation, 65-72
  β-N-acetylhexosaminidase treatment, 71-72
  Endo F2 treatment, 69
  Endo H treatment, 70
  β-galactosidase treatment, 71
  monosaccharide analysis, 66-68
  neuraminidase treatment, 71
  oligosaccharide mapping, 67-68, 70
Hsp70-protein complexes, 467-473
  materials and methods, 468-469
  stoichiometry, 469-471
  Stokes radius determination, 471-473
Hydrogen bond restraints, ferro- and ferricytochrome c, 515
Hydrophobic core, repacking, 325, 330
5-Hydroxylysine, in non-collagenous proteins, 91-98
  amino acid analysis program, 92
  levels, 96
  materials and methods, 92-93
  N-terminal sequence analysis, 95
5-Hydroxytryptophan, 349-356
  proteins incorporating, 355-356
I

Immunoglobulin, reduction, SDS-PAGE analysis, 262
Immunoglobulin G, C-terminal protein sequencing, 223-224
Inclusion bodies, expressing NT-3, 343-344
Interleukin-2, recombinant
  C-terminal sequencing, 232-233
  preparation, 230
Interleukin-6
  interaction with soluble IL-6 receptor
    equilibrium constant calculation, 423-425
    studies, 418-419
    surface plasmon resonance analysis, 419-421
  recombinant, preparation, 418-419
  size exclusion chromatography, 421-423
  solution molecular weight determinations, 421-423
Jack bean enzyme, treatment of monoclonal antibody oligosaccharides, 71-72
J-coupling constants, ¹H-¹H, torsion angle restraints, 514-515
α-Lactalbumin, reduced, carboxymethylated, Hsp70-protein complexes, 467-473
Lactoferrin, ABRF-94SEQ, 210
β-Lactoglobulin
  C-terminal protein sequencing, 225-226
  Cys residue location, 195-197
Laser desorption mass spectrometry, enzymatic digests, 146
Ligand-receptor interactions, NMR relaxation methods, 521-527
  protein/peptide production, 522
  T1 and T2 relaxation experiments, 524-526
  theory, 523-524
  TRNOE experiments, 526-527
Liquid chromatography mass spectrometry
  cytochrome c, 57-59
  detergent removal strategies, 267-274
    ALF, HILIC separation, 273-274
    materials and methods, 268
    SDS-removal precolumn, 269-270
  phosphorylase b, 59-61
  selective detection of posttranslational modifications, 107-116
    carbohydrate specific detection, 109-111
    efficiency, 118, 120
    fatty acyl-selective detection, 113-115
    materials and methods, 108
    phosphate-specific detection, 112-113
    stepped collision energy, 108-111
    sulfate-specific detection, 115-116
  TIC chromatogram, peptide mixture, 44-45
Liquid secondary ion mass spectrometry
  comparison with MALDI-TOF MS and ESI, 21-30
  Conus venom, 32
Lys329, in Rubisco, 361-363

M

Major histocompatibility complex, class I, 375-383
  formation and time course of assembly, 376-379
  gel filtration profile, 377-379
  glutathione in incubation mixture, 382
  peptide source, 375
  sequence analysis, 377
  ternary complex peak, 379-382
Marine cone snails, see Conus venom
Mass spectrometry, 40-41
  peptide ladder, 560-561
  PVDF-bound proteins, enzymatic digestion, 135-142
  RNase T1, 337-338
Mass transfer kinetics, silica-based supports, 314
Matrix assisted laser desorption/ionization time-of-flight mass spectrometry, 3, 13-19
  Asp-N 20.3 peptide, 77-78
  Asp-N 20.7 peptide, 79
  Asp-N 20.9 peptide, 80
  comparison with LSIMS and ESI, 21-30
  Conus venom, 31-32
  direct collection onto Zitex and PVDF, 171-172
  from polyethylene membranes
    mass accuracy and reproducibility, 16, 18
    practical mass range, 15-17
    spectral quality, 15
  PVDF-bound proteins, 140-141
Matrix assisted laser desorption/ionization mass spectrometry
  peptide mapping, 157-158
  solid-phase peptide synthesis, 541-545
McGhee-von Hippel equation, 395-397
Methionine enkephalin, response as function of pH, 254-255
Micro-affinity anti-phosphotyrosine antibody column, 41
Micro-affinity Avidin:Biotinyl-SH2 column, 42
Microbore PTH separations, 201-208
  comparative sequencing runs, 206-208
  HPLC system, 202-203
  reproducibility, 204
  sample preparation, 206
  sequencer description, 203-205
Microcolumn affinity chromatography-capillary HPLC system, direct coupling with MS, 39-46
  enzymatic digestion of SH2, 42-43
  experimental, 40-42
  HPLC UV chromatogram for peptide mixture, 43-44
  LC/MS TIC chromatogram for peptide mixture, 44-45
  preferred peptides substrate identification, 43-44
  SH2 sequence, 43
Molecular dynamics, restrained, ferro- and ferricytochrome c, 516
Monoclonal antibody
  glycosylation, 65-72
  primary structure analysis, 21-30
    detergent effect on tryptic map, 26-27
    mass spectra of glycopeptide, 25-26
    mass spectra of tryptic peptide, 23-24, 29
      from detergent-containing digest, 26, 28, 30
    materials and methods, 22, 24
Monosaccharides
  composition, PTH-glycoamino acids, 85, 88
  monoclonal antibody, HPAEC-PAD analysis, 66-68
Mutagenesis, site-directed
  histidine residues in β-galactosidase, 365-366
  PCR based, RNase T1 circular permutation, 333-340
  ribulose 1,5-bisphosphate, 357-363
Mutants
  (C2A, C10A)
    construction, 335
    thermodynamic stability, 339
  cp35S1, circularly permuted, construction, 336-337
Neuraminidase, treatment of monoclonal antibody oligosaccharides, 71
Neuropeptide Y, synthesis, 535-537
Neurotrophin-3, E. coli-expressed, 341-348
  endoproteinase Lys-C map, 346-347
  inclusion bodies, 343-344
  materials and methods, 342-343
  RP-HPLC, 344-345
  translated cloned DNA sequence, 347-348
NOESY, transferred, desmopressin, 526-527
Nuclear magnetic resonance
  ferro- and ferricytochrome c, 512-513
  gramicidin S, 453
  heteronuclear gradient-enhanced, 20-30 kDa proteins, 495-501
    HCAII, 499-500
    instrumental and sample requirements, 496-497
    power level and duration parameters, 500-501
    pulsed field gradients, 497-501
  large proteins and macromolecular complexes, 503-509
  1-D, antifreeze protein, 553
  protein G, 410
  relaxation
    ligand-receptor interactions, 521-527
    theory, 523-524
¹³C Nuclear magnetic resonance, free sugar in refolding buffer, 491-492
¹⁹F Nuclear magnetic resonance
  advantages, 487
  sugars binding to glucose and galactose receptor, 487-493
¹H Nuclear magnetic resonance, ferro- and ferricytochrome c distance restraints, 513-514

O

Oligosaccharides, monoclonal antibody
  Endo F2 treatment, 69
  Endo H treatment, 70
  mapping, 67-68, 70
Osteopontin, stepped collision energy scanning LC-ESMS, 112-113
Ovalbumin, C-terminal sequencing, 246
Pam3Cys, positive-ion mass spectrum, 114-115
Partial least squares methods, 480-482
Peptides
  analysis, 3-10
    derivatisation with quaternary amines, 5
    ladder sequences, 7-9
    materials and methods, 4-6
    MS analysis, 5-6
    TFEITC degradation protocol, 6, 8
  differing in C-termini, synthesis, 531-538
  high sensitivity sequence analysis, 565-573
    comparison of membrane types, 567
    HP sequencer evaluation, 572-573
    in situ digestion, 567
    in situ trypsin digestion evaluation using hydrogenated triton, 568-569
    minimal protein amount determination, 570-572
    modifications of current method, 571
    RP-HPLC and fraction collection, 568
    SDS-PAGE and electroblotting, 566-567
    sequencer programs and reagents, 568
  mapping
    acrylamide gel resolved proteins, 316-318
    in-gel versus PVDF digestion techniques, 153-160
      analysis of unknown protein samples, 160
      method modifications, 154
      methods and materials, 153-154
      recombinant human erythropoietin, 158-159
      Stem Cell Factor digests, 154-158
  protected, synthesis, 549-551
  solid-phase synthesis, 539-545
    amino acid analysis, 540
    analytical RP-HPLC, 540, 542, 545
    crude peptide samples, 541-542
    Fmoc chemistry, 555-561
      EGF-like peptide, 557, 559
      HPLC separation, 556
      mass spectrometry, 557
      micro-scale TFA cleavage, 556
      peptide ladder, mass spectrometry, 560-561
      stepwise monitoring, 558-560
    mass spectrometry, 541-545
    sequence analysis, 540
  transmembrane, 301
  underivatized separation, 252-253
Peptidylthiohydantoin, cleavage reaction, 221-222
Phosphates, detecting posttranslational modifications, 112-113
Phosphopeptides, binding and elution, P-Tyr antibody micro-column, 41
Phosphorylase, N-terminal chymotryptic peptide, CID spectra, 60
Phosphorylase b
  chymotryptic digest, LC/MS, 59-60
  in-gel versus in-solution tryptic digestion, 317
  N-terminal tryptic peptide, CID spectra, 60-61
  tryptic digest, LC/MS, 60-61
Phosphorylation, site identification by Edman degradation, 117-122
Plasmids, pKW02, 335
Polyethylene membranes, 13-19
  MALDI analysis
    practical mass range, 15-17
    spectral quality, 15
  mass accuracy and reproducibility, 16, 18
  materials and methods, 14
  sample washing, 15-16, 18
Polymerase chain reaction
  protocol for creation of circularly permuted proteins, 333-334
  site-directed mutagenesis, RNase T1 circular permutation, 333-340
Polypeptide-lipid mixtures, homogeneous, 304
Polypeptides, containing C-terminal proline, C-terminal sequence analysis, 239-246
Polyproline, C-terminal protein sequencing, 224-225
Polyvinylidene difluoride, see PVDF
Ponceau S, enzymatic digestion in, 161-166
Positive-ion mass spectrum, Pam3Cys, 114-115
Post source decay spectrum, Conus venom, 36
Posttranslational modifications
  disulfide bonds, 125-132
  marker-ions for detection, 109
  selective detection, LC-MS methods, 107-116
Prediction residual sum of squares, 480-481
Proline, C-terminal, polypeptides containing, 239-246
Protease digestion, in situ, 565-573
Protein A, IgG-binding B domain, 409-410
Protein folding
  automated analysis, 459-466
    FK binding protein, equilibrium folding behavior, 463, 465
    HPLC system and characteristics, 461-462
    potential modes of operation, 460, 462-464
    system characterization and suitability tests, 461-462, 464-465
    urokinase equilibrium unfolding and refolding, 463-466
  ribbon diagrams, 323
Protein G
  complex of domain II, 414-415
  Fc residues, 412
  IgG-binding domains, 412-413
  NMR spectra, 410-413
Protein G-antibody complexes, structure, 412, 414-415
Protein Kinase CKII
  C-terminal sequencing, 233-234
  preparation, 230
Protein-protein interactions, 401
  bacterial cell-surface proteins with antibodies, 409-415
Proteins
  human B61, 75-82
    C-terminal sequence analysis, 76-77
    de-glycosylation and molecular weight, 77
    expression, 75
    N-linked glycopeptides, 77-79
    N-terminal sequence analysis, 76-77
  large, solution structures, 503-509, 511-518
    NMR spectroscopy, 504-505
    NOESY-HSQC, 506-507
    protein preparation, 504
  non-collagenous, 5-hydroxylysine in, 91-98
  N-terminally blocked, 55-62
    CID spectra
      doubly charged ions, 58-59
      tryptic peptide from phosphorylase b, 60-61
    identification of amino terminal peptide, 56
    in situ derivatization on cationic PVDF membrane, 56-57
    LC/MS
      cytochrome-C, 57-59
      phosphorylase b, 59-61
    lysine side-chains, modification, 55-56
    proteolytic digestion, 56
    re-acetylation, 56
  solid-phase synthesis, 547-554
    antifreeze protein, 548
    fragment condensation, 551-553
    materials, 548-549
    protected peptides, 549-551
    T-cell receptor, 548
Proteolysis
  in-gel, 2-D gel protein identification, 311-319
  modified protein, 56
PTH-glycoamino acids
  Edman degradation, 86-88
  electrospray ionisation mass spectrometry, 85, 88-89
  monosaccharide composition, 85, 88
Pulsed field gradients, 496-501
PVDF
  cationic membrane in situ derivatization, 56-57
  collection on, Edman sequencing, 169-176
  comparison of membrane types, 567
  digestion, peptide mapping, 153-160
  high retention, in situ proteolysis, 565-573
  immobilization of purified proteins on, 230-231
  SDS-PAGE and electroblotting to, 566-567
PVDF-bound proteins, enzymatic digestion, 135-142
  MALDI-TOF mass spectrometry, 140-141
  materials and methods, 136-137
  peptide maps, 137-139
  recovery, 136, 138
R

Rayleigh optical system, 427-429
Re-acetylation, N-terminally blocked proteins, 56
Reagents
  disulfide-reducing, 259-266
    bis(2-mercaptoethyl)sulfone, 260
    α-chymotrypsinogen A reduction, 262
    meso-2,5-dimercapto-N,N,N',N'-tetramethyladipamide, 260
    immunoglobulin reduction, SDS-PAGE analysis, 262
    papain-S-SCH3, reduction, 261, 264-265
    rate constants, 263
    trypsinogen reduction, 261-264
  ionic and non-ionic detergents, 267-274
Receptor tyrosine kinase, ECK, ligand for, 75-82
Recombinant proteins, spectral enhancement with tryptophan analogs, 349-356
  fluorescence excitation and emission spectra, 351-352
  materials and methods, 350
  expression and analysis sTF protein, 350-353
  Trp replacement, sTF mutants, 353-354
Recoverin, purification, 286-287
Restriction site, StyI, construction in RNase T1, 335
Retina, calcium-binding proteins, 285-292
  characterization, 288
  cysteine content, 290
  guanylate cyclase activity, 291
  purification, 289
    calbindin D-28K, 287-288
    GCAP, 288
    recoverin, calmodulin, capl and S-100β, 286-287
Ribonuclease S, cyanogen modification, 436-439
Ribulose 1,5-bisphosphate, carboxylation and oxygenation, 357-363
  chemical rescue applications, 360-361
  Lys329, 361-363
  oxygenation and other side reactions, 359-360
  partial reactions, 359
  reaction pathways, 357
RNase T1, circular permutation, 333-340
  amino-terminal sequencing, 338
  (C2A, C10A) mutant construction, 335
  cp35S1 mutant construction, 336-337
  enzyme purification and assay, 336-337
  mass spectrometry, 337-338
  materials, 334-335
  StyI restriction site construction, 335
  thermodynamic measurements, 337
  3-D structure, 338
  wild-type and mutant variants, 339
Rop, thermodynamic characterization, 328-329
Rubisco
  mutants, characterization, 358-359
  ribulose 1,5-bisphosphate, carboxylation and oxygenation, 357-363
  side reactions, 359-360
Samples, complex biological, online preparation, 277-284
  anionic detergent removal, 267-274, 281-282
  detergent removal cartridges, 278
  materials and methods, 279
  non ionic detergent removal, 267-274, 282-283
  sample concentration and de-salting, 279-280
S-100β, purification, 286-287
Scaling factor optimization method, 477-478
Schellman motif, 446-448
SDS gel electrophoresis, partial cystine cleavage, 195
SDS-PAGE
  polyethylene membranes as sample support, 13-19
  to PVDF, 566-567
  separated proteins, in situ gel digestion, 143-152
Sedimentation methods, high sensitivity, 427-432
  application to bacteriophage P22, 429-431
  theoretical background, 428-429
SH2 affinity column, binding and elution of SH2 ligands, 42
Side chain, interactions within α-helices, 444-445
Silica-based supports, 311-319
  binding capacity and mass transfer kinetics, 314
  flow velocity effect on resolution and recovery, 315-316
  materials and methods, 313
  peptide-mapping of acrylamide gel resolved proteins, 316-318
Size-exclusion partition coefficients, 471-472
Sodium dodecyl sulfate, removal, 278, 281-282
Solvents, organic, bitopic membrane proteins, 302-304
Staphylococci, see Bacterial cell-surface proteins
Stem Cell Factor digests, comparison of in-gel and PVDF techniques, 154-158
Stokes radius, Hsp70-protein complexes, 471-473
Streptococcus, see Bacterial cell-surface proteins, interactions with antibodies
Sugar, binding to glucose and galactose receptor, ¹⁹F NMR, 487-493
Sulfates, detection of posttranslational modifications, 115-116
Superoxide dismutase, C-terminal protein sequencing, 223-224
Surface plasmon resonance, 417-418
  analysis of IL-6 binding to sIL-6R, 419-421
  equilibrium constant calculation, 423-425
T-cell receptor, 548, 550
Thin layer chromatography, chromatography system screening, bitopic membrane proteins, 305-309
Thiocarbamylation, catalysis, 181-183
Thiohydantoin amino acid, HPLC
  derivatives, 222-223
  reverse phase separation, 241-242
Thiohydantoin proline, synthesis, 241
Tissue factor, soluble domain
  mutants, 353-354
  protein expression and analysis, 350-353
Transforming growth factor-α, NMR relaxation, 522-526
Trypsin, in situ digestion, evaluation using hydrogenated triton, 568-569
Trypsin micro-columns, rapid flow-through digestion of proteins, 41
Trypsinogen, reduction, 261-264
Tryptic digests, high sensitivity detection, 251-258
  derivatized cytochrome c peptide maps, 255-258
  HPLC analysis, 252
  materials, 251
  peptide derivatization, 252
    response as function of pH, 254-255
    underivatized separation, 252
  proteolytic digestion, 252
  separation using TFA eluents, 253-254
Tryptophan
  analogs, recombinant protein spectral enhancement, 349-356
  residues, assignment, ABRF-94SEQ, 209-214

U

Urea, unfolding and refolding gradients, 461-462, 464
Urodilatin, partial cystine cleavage, MALDI mass spectrum, 197-198
Urokinase, equilibrium unfolding and refolding, 463-466
Venom, see α-Bungarotoxin, in venom-derived κ-bungarotoxin; Conus venom
Zinc chloride, enzymatic digestion in, 161-166
Zitex, collection on, Edman sequencing, 169-176
ISBN 0-12-194712-2